Sci Rep. 2014 Jul 3:4:5547.
doi: 10.1038/srep05547.

Searching for superspreaders of information in real-world social media

Sen Pei et al. Sci Rep. 2014.

Abstract

A number of predictors have been suggested to detect the most influential spreaders of information in online social media across various domains such as Twitter or Facebook. In particular, degree, PageRank, k-core and other centralities have been adopted to rank the spreading capability of users in information dissemination media. So far, validation of the proposed predictors has been done by simulating the spreading dynamics rather than following real information flow in social networks. Consequently, only model-dependent, contradictory results have been achieved so far for the best predictor. Here, we address this issue directly. We search for influential spreaders by following the real spreading dynamics in a wide range of networks. We find that the widely used degree and PageRank fail in ranking users' influence. We find that the best spreaders are consistently located in the k-core across dissimilar social platforms such as Twitter, Facebook, Livejournal and scientific publishing in the American Physical Society. Furthermore, when the complete global network structure is unavailable, we find that the sum of the nearest neighbors' degrees is a reliable local proxy for a user's influence. Our analysis provides practical instructions for the optimal design of strategies for "viral" information dissemination in relevant applications.
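The two structural measures highlighted in the abstract, the k-shell index (obtained by k-core decomposition) and the sum of the nearest neighbors' degrees, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the graph is treated as undirected, and the function names and the toy edge list are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def k_shell_indices(edges):
    """Return {node: k-shell index} by iteratively peeling low-degree nodes."""
    adj = build_adjacency(edges)
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # The shell index never decreases as peeling proceeds.
        k = max(k, min(degree[n] for n in remaining))
        peel = [n for n in remaining if degree[n] <= k]
        while peel:
            n = peel.pop()
            if n not in remaining:
                continue  # already removed via a duplicate queue entry
            remaining.discard(n)
            shell[n] = k
            for m in adj[n]:
                if m in remaining:
                    degree[m] -= 1
                    if degree[m] <= k:
                        peel.append(m)
    return shell


def neighbor_degree_sum(edges):
    """Return {node: sum of the degrees of its nearest neighbors}."""
    adj = build_adjacency(edges)
    return {n: sum(len(adj[m]) for m in adj[n]) for n in adj}


# Hypothetical toy graph: a 4-clique (a, b, c, d) plus a peripheral hub h
# that is attached to a and to five leaves.
toy_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("h", "a"),
    ("h", "l1"), ("h", "l2"), ("h", "l3"), ("h", "l4"), ("h", "l5"),
]

shell = k_shell_indices(toy_edges)
ksum = neighbor_degree_sum(toy_edges)
print(shell["h"], shell["a"])  # 1 3
print(ksum["h"], ksum["a"])    # 9 15
```

In this toy graph the hub h has the largest degree (6) but sits in the outermost shell, while the clique nodes have lower degree but a higher k-shell index. The neighbor-degree sum, which needs only local information, also ranks the clique node a above the hub.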

Figures

Figure 1. Schematic illustrations for diffusion process and network structure.
(a), A schematic illustration of the two-layer structure of connectivity and diffusion. The lower layer displays the social network, while the upper layer represents the information diffusion. (b), An example of a diffusion instance starting from source node s. The influence region of s, shaded in green, contains 5 nodes. (c), The k-shell structure of the LJ social network. The kS indices increase as we move from the periphery to the center. Each node's degree is reflected by its size. Here we highlight four hubs located in the periphery of the network. This inset was created with the Lanet-vi tool (http://lanetvi.soic.indiana.edu/lanetvi.php). (d–f), The influence of the spreading process cannot be reliably predicted by degree. For the LJ network, we compare the influence area of single nodes with the same degree k = 6902 (nodes A and B) or the same index kS = 230 (nodes A and C). In the lower level of the corresponding plots, nodes' k-shell indices are marked with different colors. In the upper level, green nodes constitute the influence area, while the grey nodes are not influenced by the source node.
Figure 2. The k-shell index predicts the average influence of spreading more reliably than in-degree and PageRank.
Logarithmic values of the average size of the influence region M(kS, kin) when spreading originates in nodes with (kS, kin) are shown for LJ (a), APS (c), Facebook (e) and Twitter (g). The same analysis with PageRank is presented in (b), (d), (f), (h). In general, spreading is larger for nodes of higher kS, whereas nodes of a given kin or PageRank can result in either small or large spreading.
Figure 3. Nodes with high k-shell have larger average influence than those with high in-degree and PageRank.
(a), The standard deviation of influence s(M) for nodes within each interval for LJ. The data intervals are created by dividing the range of each measure into bins of equal width on a logarithmic scale. (b), The average influence M(f) for nodes ranking in the top f fraction by k-shell kS, in-degree kin and PageRank p for LJ data. (c), The ratio between the average influence of nodes within the top f fraction of kS and that of the other two measures. The red line marks the value of 1. The error bars in (b) and (c) represent the 95% confidence intervals obtained by bootstrap analysis.
Figure 4. k-shell can recognize influential spreaders more accurately than in-degree and PageRank.
The recognition rate r(f) for LJ (a), APS (b), Facebook (c) and Twitter (d) with k-shell kS, in-degree kin and PageRank p. For all the datasets, kS performs better than in-degree and PageRank. The error bars mark the 95% confidence intervals by bootstrap.
Figure 5. ksum predicts the average influence more reliably than in-degree and PageRank.
The index ksum outperforms in-degree in predicting the average influence of nodes with (ksum, kin) for LJ (a), APS (c), Facebook (e) and Twitter (g). A similar result is obtained for PageRank in (b), (d), (f), (h).
Figure 6. ksum has good performance in identifying influential spreaders.
The comparisons of kS with ksum and k2sum are shown for LJ (a, b), APS (c, d), Facebook (e, f) and Twitter (g, h). Error bars indicate the 95% confidence intervals. Surprisingly, ksum performs comparably to kS. With more local information, k2sum improves the performance slightly.
Figure 7. Effect of sampling methods on kS, kin and PageRank.
The snowball sampling used for the Facebook data does not dramatically change the relative ranking for kS (a), kin (b) and PageRank (c). Likewise, with the activity sampling adopted for the Twitter data, the rankings for kS (d), kin (e) and PageRank (f) are not significantly affected.
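
The Figure 3 caption describes an average influence M(f) over the nodes ranked in the top f fraction by a given measure, with 95% confidence intervals from bootstrap analysis. The following Python sketch shows one way such a quantity could be computed. It is an illustration under assumptions, not the authors' procedure: the percentile bootstrap, the resample count, the function names and the toy numbers are all hypothetical choices.

```python
import random


def top_fraction_mean(scores, influence, f):
    """Average influence of the nodes ranked in the top fraction f by `scores`."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, int(round(f * len(ranked))))
    top = ranked[:n_top]
    mean = sum(influence[n] for n in top) / n_top
    return mean, top


def bootstrap_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in values]
        means.append(sum(sample) / len(sample))
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical toy data: a ranking score (e.g. a k-shell index) and the
# measured influence (size of the influence region) for each node.
toy_scores = {"a": 5, "b": 4, "c": 3, "d": 2, "e": 1}
toy_influence = {"a": 10, "b": 6, "c": 4, "d": 1, "e": 0}

mean_top, top_nodes = top_fraction_mean(toy_scores, toy_influence, f=0.4)
ci_low, ci_high = bootstrap_ci([toy_influence[n] for n in top_nodes])
print(top_nodes, mean_top)  # ['a', 'b'] 8.0
print(ci_low, ci_high)
```

Running the same two functions with in-degree or PageRank as `scores` would give comparable M(f) values for each measure, which is the kind of comparison the caption reports.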
