Construction Method of Academic Circle Based on Label Similarity Computation

Computer Science, 2016, Vol. 43, Issue (9): 52-56. doi: 10.11896/j.issn.1002-137X.2016.09.009

• The 3rd CCF Big Data Academic Conference, 2015 •

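FU Cheng-zhou, TANG Yong, HE Chao-bo, WANG Jin-ling and YUAN Cheng-zhe

  1. School of Computer Science, South China Normal University, Guangzhou 510631; School of Computer Science, South China Normal University, Guangzhou 510631; School of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225; School of Computer Science, South China Normal University, Guangzhou 510631; School of Computer Science, South China Normal University, Guangzhou 510631
  • Published: 2018-12-01  Online: 2018-12-01
  • Supported by:
    This work was supported by the National High Technology Research and Development Program of China (863 Program) (2013AA01A212), the National Natural Science Foundation of China (61272067, 61502180), and the Guangdong Province Public Welfare Research and Capacity Building Special Project (2015A020209178).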

Abstract: Constructing academic circles for users of a scholar-oriented social network system has important application value for promoting exchanges among scholars. Computing similarity from the attributes that scholars have in common yields academic circles whose members share similar academic fields and related research topics, which allows scholars to collaborate more closely and more frequently. This paper proposes a method that extracts academic labels representing personal characteristics from scholars' academic information, measures the weights of the different categories of labels, and then constructs academic circles through similarity computation and a clustering algorithm. Experiments on public scholar information crawled from the scholar social network platform SCHOLAT verify the reliability and practicality of the proposed method.

Key words: Social network, Label, Similarity computation, Clustering algorithm, Academic circle
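
The pipeline summarized in the abstract (category-weighted labels, pairwise similarity, then grouping) can be illustrated with a minimal sketch. The abstract does not state which similarity measure, weights, or clustering algorithm the paper uses, so everything here is an assumption made for illustration: the category names and values in `CATEGORY_WEIGHTS`, the per-category Jaccard overlap, the `threshold` parameter, and the single-link grouping that stands in for the paper's clustering step.

```python
from itertools import combinations

# Hypothetical category weights; the abstract does not give actual values.
CATEGORY_WEIGHTS = {"field": 0.5, "topic": 0.3, "keyword": 0.2}


def jaccard(a, b):
    """Jaccard overlap of two label sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_similarity(s1, s2, weights=CATEGORY_WEIGHTS):
    """Weighted sum of per-category Jaccard overlaps between two scholars."""
    return sum(
        w * jaccard(s1.get(cat, set()), s2.get(cat, set()))
        for cat, w in weights.items()
    )


def build_circles(scholars, threshold=0.3):
    """Single-link grouping: link any pair whose similarity reaches threshold.

    This is a simple stand-in for a clustering algorithm, not the paper's method.
    """
    parent = {name: name for name in scholars}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(scholars, 2):
        if label_similarity(scholars[a], scholars[b]) >= threshold:
            parent[find(a)] = find(b)

    circles = {}
    for name in scholars:
        circles.setdefault(find(name), set()).add(name)
    return sorted(circles.values(), key=lambda c: sorted(c))


# Toy data, invented for the example.
scholars = {
    "A": {"field": {"data mining"}, "topic": {"clustering"},
          "keyword": {"k-means"}},
    "B": {"field": {"data mining"}, "topic": {"clustering", "recommendation"},
          "keyword": {"social network"}},
    "C": {"field": {"computer vision"}, "topic": {"segmentation"},
          "keyword": {"cnn"}},
}
circles = build_circles(scholars)
```

With this toy data, scholars A and B share a field and one topic and end up in the same circle, while C has no labels in common with either and forms a circle of its own. Raising a category's weight makes agreement in that category count for more when circles are formed.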

