One of the problems network science sets out to solve is to find important nodes. Of course, what is important depends on the context, but an applied scientist coming to network science for an answer probably has a clear idea of what it means in her study system. There is no shortage of methods in the literature. Still, when you start applying them to a specific problem, you’ll find assumptions that are either misaligned with your objective or appear out of nowhere. For your viral marketing campaign on a partially known social-media network, should you use a method known to find nodes whose deletion efficiently fragments model graphs? Obviously, there must be better ways, but still, I find myself making such recommendations.
I don’t think there is any quick & easy solution to this issue, but we can list some things applied scientists need to consider. When we (network theoreticians) namedrop applications for our new creations, we should think through whether our methods hold for all realistic answers to these questions.
What is the objective? In practical situations, it is often not to maximize a quantity of the network itself, but rather the returns from an investment (in the network). For a vaccination campaign, administering the vaccine and gathering information about the network are costs that need to be weighed against the expected reduction in disease burden. Just seeking to minimize the basic reproductive number, or similar, is to pursue a goal whose societal value is anybody’s guess. (Furthermore, vaccinating or quarantining one person will not stop an ongoing disease outbreak, so importance with respect to disease spreading is really a property of a group of people, not an individual.)
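To make the weighing in the paragraph above concrete, here is a minimal sketch, not a recommended method. Everything in it is an assumption made for illustration: the toy network, the worst-case rule that every contact transmits, targeting by degree, and the made-up values `value_per_case`, `cost_per_dose`, and `info_cost`.

```python
from collections import deque

# Hypothetical toy contact network (undirected), given as adjacency lists.
ADJ = {
    0: [1, 2, 3],
    1: [0, 4, 5],
    2: [0, 6, 7],
    3: [0, 8, 9],
    4: [1], 5: [1], 6: [2], 7: [2], 8: [3], 9: [3],
}

def expected_outbreak_size(adj, vaccinated):
    """Mean outbreak size for a uniformly random source, worst case.

    Assumes every contact transmits, so an outbreak fills the source's
    connected component among unvaccinated nodes. A vaccinated source
    causes no cases.
    """
    seen = set(vaccinated)
    total = 0
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search to measure this component.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        # Each of the `size` possible sources yields `size` cases.
        total += size * size
    return total / len(adj)

def net_return(adj, k, value_per_case=1.0, cost_per_dose=1.5, info_cost=2.0):
    """Value of averted cases minus campaign costs (all numbers made up)."""
    # Assumed targeting rule: vaccinate the k highest-degree nodes.
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))
    vaccinated = set(ranked[:k])
    baseline = expected_outbreak_size(adj, set())
    averted = baseline - expected_outbreak_size(adj, vaccinated)
    # Mapping the network is only paid for if we target anyone at all.
    cost = cost_per_dose * k + (info_cost if k > 0 else 0.0)
    return value_per_case * averted - cost

returns = {k: net_return(ADJ, k) for k in range(5)}
best_k = max(returns, key=returns.get)
print(best_k, returns)
```

With these made-up numbers, the expected outbreak size keeps shrinking as more nodes are vaccinated, but the net return peaks at a single vaccination. Change the costs and the answer changes, which is the point: the network alone does not settle it.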
What are the possible interventions? The question of what nodes are important only makes sense if there is a way to affect the nodes to reach the objective. Continuing the example of disease spreading, it is often not legally possible to enforce interventions (like travel restrictions or vaccinations) studied in network epidemiology. One can promote health behavior, but the outcome depends on the individual.
What kind of dynamics are we considering? For example, is there some feedback from the system to the interventions? In disease spreading, awareness raised by mass media, or social contagion of behavior, can affect the dynamics: disease awareness can itself spread between people and mitigate epidemics.
What initial conditions are we considering? Many methods in network science make unstated assumptions about this. For infectious diseases: are we interested in protecting against an emerging pathogen or against bioterrorism? These two scenarios are entirely different. If the epidemic outbreak we want to stop has already started, then any important node must have a high chance of getting the disease; if it is yet to start, important nodes are those that would cause large outbreaks if they were the infection source.
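The two notions of importance in the paragraph above can be told apart in a small sketch. It rests on strong simplifying assumptions of my own: a hypothetical directed contact network, every contact transmitting, and, for the "already started" case, an unknown source that is equally likely to be any node. Under these simplifications an undirected network would give both scores the same value (the component size), so the toy network is directed.

```python
from collections import deque

# Hypothetical directed contact network: EDGES[u] lists who u can infect.
EDGES = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}

def reachable(adj, start):
    """Return all nodes reachable from `start`, including `start` itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen

def reverse(adj):
    """Flip every edge, so reachability runs from targets back to sources."""
    rev = {node: [] for node in adj}
    for u, targets in adj.items():
        for v in targets:
            rev[v].append(u)
    return rev

n = len(EDGES)

# Outbreak yet to start: how large an outbreak would each node seed?
outbreak_size_as_source = {v: len(reachable(EDGES, v)) for v in EDGES}

# Outbreak already started from an unknown, uniformly random source:
# a node is infected if the source can reach it (or is the node itself).
reversed_edges = reverse(EDGES)
infection_chance = {v: len(reachable(reversed_edges, v)) / n for v in EDGES}

best_source = max(outbreak_size_as_source, key=outbreak_size_as_source.get)
most_exposed = max(infection_chance, key=infection_chance.get)
print(best_source, most_exposed)
```

In this toy network the node that seeds the largest outbreak (node 0) is the one least likely to be infected by someone else, and the most exposed node (node 4) seeds the smallest outbreak, so the ranking depends entirely on which initial condition one has in mind.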
What is the network? How accurately and cheaply can we gather network information? (For online information spreading, for example, reconstructing the relevant network is easy and precise; for disease spreading, networks are costly to reconstruct and not very precise.) Is it really the structure on which the interesting dynamics happen, or just an approximation of it? Does it change during the situation in question?
One thought on “The importance of being earnest about the importance of nodes”
I really like the idea of asking the right questions instead of asking the question “what is the most important node?”
I would add a specification to the question “What initial conditions are we considering?”. If you consider a non-equilibrium system (such as a temporal network), then the question about initial conditions is broader: it is difficult to properly define the initial conditions of the epidemiological system with respect to the state of the network.