Abstract
In traditional bibliometric analysis, author keywords (AKs) play a critical role in areas such as information query, co-word analysis, and the capture of topic terms. In past decades, most relevant studies have focused on weighting methods for AKs to find specialty or discriminating terms for a topic; however, very few have addressed role differentiation for AKs within a specific topic or in the context of a topic query. Furthermore, both traditional co-word analysis and the latest semantic modeling methods still face challenges in accurately classifying and ranking the keywords or terms for a specific research topic. As a complement to prior research, this article proposes a novel analytical framework based on role differentiation of AKs and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). A case study on additive manufacturing is conducted to verify the proposed framework.
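For readers unfamiliar with the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), the following is a minimal Python sketch of the standard procedure: normalize a decision matrix, weight it, and score each alternative by its relative closeness to an ideal solution. It is not the article's implementation. The function name `topsis`, the two criteria, the weights, and the term labels are hypothetical placeholders chosen for illustration; the criteria actually used in the proposed framework are not stated in this section.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns) with standard TOPSIS.

    matrix  : list of rows, one per alternative, with numeric criterion values
    weights : one weight per criterion
    benefit : one bool per criterion, True if larger values are better
    Returns one closeness score in [0, 1] per alternative (higher is better).
    """
    n_cols = len(weights)
    # Step 1: vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_cols)]
        for row in matrix
    ]
    # Step 2: build the ideal and anti-ideal solutions column by column.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    # Step 3: relative closeness = distance to anti-ideal / total distance.
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores


if __name__ == "__main__":
    # Hypothetical toy values, not data from the article:
    # column 0 = occurrence count (benefit), column 1 = a cost-type indicator.
    terms = ["term_a", "term_b", "term_c"]
    matrix = [[12, 0.30], [7, 0.10], [20, 0.55]]
    scores = topsis(matrix, weights=[0.6, 0.4], benefit=[True, False])
    for term, score in sorted(zip(terms, scores), key=lambda p: p[1], reverse=True):
        print(term, round(score, 3))
```

Sorting by the closeness score yields a ranking of the candidate terms. The weights and the benefit or cost direction of each criterion are modeling choices that would have to come from the framework itself.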
Notes
The programs were written in C# using Microsoft Visual Studio Community 2015.
Acknowledgements
The authors thank all of the experts who took part in the email survey. This material is based on work supported by the National Natural Science Foundation of China (No. 71673088), the Foundation of Guangdong Soft Science (No. 2017A070706003), and the Foundation of China Scholarship Council (No. 201606155066).
Cite this article
Li, M. Classifying and ranking topic terms based on a novel approach: role differentiation of author keywords. Scientometrics 116, 77–100 (2018). https://doi.org/10.1007/s11192-018-2741-7
DOI: https://doi.org/10.1007/s11192-018-2741-7