Abstract
Repeated patterns observed in graph and network structures can be used for prediction in various domains, including cheminformatics, bioinformatics, political science, and sociology. In large-scale network structures such as social networks, graph-theoretical link and annotation prediction algorithms are usually not applicable due to the graph isomorphism problem, unless some form of approximation is applied. We propose a non-graph-theoretical alternative for link and annotation prediction in large networks that flattens network structures into feature vectors. We extract repeated sub-network pattern vectors for the nodes of a network and use traditional machine learning algorithms to estimate missing or unknown annotations and links in the network. Our main contributions are a novel method for extracting features from large-scale networks and an evaluation of the benefit each extraction method provides. We applied our methodology to suggest new Twitter friends. In our experiments, we observed an 11–27% improvement in prediction accuracy compared with the simple approach of suggesting friends of friends.
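To make the idea of flattening a network into per-node feature vectors more concrete, here is a minimal Python sketch. It is not the authors' extraction method: the abstract does not specify which sub-network patterns are counted, so the three features used here (degree, triangles, open two-paths) are assumptions chosen only for illustration, and the graph is treated as undirected even though Twitter's follow graph is directed. The function names `build_adjacency`, `node_features`, and `friends_of_friends` are hypothetical. The last function sketches the friends-of-friends baseline mentioned above, ranked by number of common neighbours.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build undirected adjacency sets from (u, v) pairs (a simplifying assumption)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def node_features(adj, node):
    """Illustrative pattern counts for one node (assumed features, not the paper's).

    Returns [degree, triangles through the node, open two-paths centred on the node].
    """
    nbrs = adj[node]
    degree = len(nbrs)
    # A triangle exists for every pair of neighbours that are themselves linked.
    triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    # Remaining neighbour pairs form open two-paths (wedges).
    open_wedges = degree * (degree - 1) // 2 - triangles
    return [degree, triangles, open_wedges]


def friends_of_friends(adj, node):
    """Baseline: rank non-neighbours by how many neighbours they share with node."""
    counts = defaultdict(int)
    for friend in adj[node]:
        for candidate in adj[friend]:
            if candidate != node and candidate not in adj[node]:
                counts[candidate] += 1
    # Most common neighbours first; ties broken by label for determinism.
    return sorted(counts, key=lambda c: (-counts[c], c))


# Toy network, invented for this example.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e"), ("b", "d")]
adj = build_adjacency(edges)
features = {n: node_features(adj, n) for n in sorted(adj)}
```

Vectors such as `features["c"]` could then be passed to any standard classifier to predict a missing link or annotation, while `friends_of_friends(adj, "e")` gives the baseline ranking to compare against.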
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Isikli, B., Sevilgen, F.E., Kirac, M. (2014). Link and Annotation Prediction Using Topology and Feature Structure in Large Scale Social Networks. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures, and Structures. BDAS 2014. Communications in Computer and Information Science, vol 424. Springer, Cham. https://doi.org/10.1007/978-3-319-06932-6_23
DOI: https://doi.org/10.1007/978-3-319-06932-6_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06931-9
Online ISBN: 978-3-319-06932-6
eBook Packages: Computer Science, Computer Science (R0)