Abstract
In this paper, we explore heterogeneous information networks, in which each vertex represents an entity and each edge reflects a linkage relationship between two entities. A heterogeneous information network contains vertices of several entity types, such as papers, authors and terms, and can therefore capture multiple kinds of linkage relationships among different entities. Such a network is similar to a mixed media graph (MMG). Representing a bibliographic dataset as an MMG can improve the performance of searching for relevant entities (e.g., papers). Furthermore, our academic search supports multiple-entity search: a single query returns results for several entity types, such as relevant papers, authors and conferences. Specifically, given a bibliographic dataset, we propose Global-MMG, which builds one global heterogeneous information network. When a user submits a query keyword, we perform a random walk with restart (RWR) on this network to retrieve papers or other types of entity objects. To reduce the query response time, we develop the algorithm Net-MMG (standing for NetClus-based MMG). Net-MMG first divides the heterogeneous information network into a collection of sub-networks and then performs an RWR on a set of selected relevant sub-networks. We implemented our academic search and conducted extensive experiments using the ACM Digital Library. The experimental results show that, by exploring heterogeneous information networks and RWR, both Global-MMG and Net-MMG achieve better search quality than existing academic search services. In addition, Net-MMG has a shorter query response time while still maintaining good search quality.
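As a minimal sketch of the retrieval step described above, the following Python code runs a random walk with restart from a query term on a toy heterogeneous network and ranks papers and authors from the same score vector. It is an illustration under stated assumptions, not the paper's implementation: the toy graph, the `type:name` vertex labels, the unweighted undirected edges, the restart probability of 0.15, and the helper names `random_walk_with_restart` and `top_by_type` are all hypothetical. The update used is the standard RWR iteration, in which each vertex passes a (1 - c) share of its score evenly to its neighbors and the query vertex receives the restart mass c.

```python
from collections import defaultdict


def random_walk_with_restart(edges, query, restart=0.15, tol=1e-10, max_iter=1000):
    """Return the RWR relevance score of every vertex with respect to `query`.

    Assumptions (not taken from the paper): edges are undirected and
    unweighted, and `restart` is the probability of jumping back to `query`.
    """
    # Build an undirected adjacency list over all entity types.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    if query not in neighbors:
        raise KeyError(f"query vertex {query!r} is not in the network")

    # Start with all probability mass on the query vertex.
    scores = {node: 0.0 for node in neighbors}
    scores[query] = 1.0

    for _ in range(max_iter):
        updated = {node: 0.0 for node in neighbors}
        # Each vertex spreads (1 - restart) of its mass evenly to its neighbors.
        for node, mass in scores.items():
            share = (1.0 - restart) * mass / len(neighbors[node])
            for nb in neighbors[node]:
                updated[nb] += share
        # The remaining mass restarts at the query vertex.
        updated[query] += restart

        delta = sum(abs(updated[n] - scores[n]) for n in neighbors)
        scores = updated
        if delta < tol:
            break
    return scores


def top_by_type(scores, prefix, k=3):
    """Rank vertices whose label starts with `prefix` (e.g. 'paper:')."""
    ranked = [(n, s) for n, s in scores.items() if n.startswith(prefix)]
    return sorted(ranked, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical toy bibliographic network: terms, papers and authors.
toy_edges = [
    ("term:random walk", "paper:P1"),
    ("term:random walk", "paper:P2"),
    ("term:clustering", "paper:P3"),
    ("author:A", "paper:P1"),
    ("author:A", "paper:P2"),
    ("author:B", "paper:P2"),
    ("author:B", "paper:P3"),
]

scores = random_walk_with_restart(toy_edges, "term:random walk")

# One query yields rankings for several entity types.
print(top_by_type(scores, "paper:"))
print(top_by_type(scores, "author:"))
```

Because every vertex receives a score from the same walk, ranking a different entity type only requires filtering the result by label prefix, which mirrors the multiple-entity search idea. Restricting `toy_edges` to the edges of a few selected sub-networks before calling the function would correspond loosely to the Net-MMG strategy, although how the sub-networks are built and selected is not specified in the abstract.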









Cite this article
Chiang, MF., Liou, JJ., Wang, JL. et al. Exploring heterogeneous information networks and random walk with restart for academic search. Knowl Inf Syst 36, 59–82 (2013). https://doi.org/10.1007/s10115-012-0523-8
DOI: https://doi.org/10.1007/s10115-012-0523-8