Abstract
The Internet is an extremely useful resource for education and research, yet broken links remain a common problem on the web. A page that another page points to may have disappeared permanently or moved to a different location. Broken links arise for many reasons: the target website is permanently unavailable, the target page has been removed, the target page has been changed or altered, or the link itself contains a misspelling. A broken link nevertheless carries a lot of information, such as its URL, its anchor text, the text surrounding the anchor, and the content of the page that contains it. Each of these sources of information is valuable for recovering candidate pages that are relevant to the broken link. When a query built from terms extracted from these sources is submitted, the system returns a ranked list of highly relevant candidate pages. In this paper, we explore the proximity (position) relationship between the terms of the anchor text and the terms of the full text in order to extract good and bad terms with a Naïve Bayes classification model. This approach provides distinct terms for querying multiple broken links, and it also improves performance, because closely related terms indicate relevance.
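To make the idea of labelling candidate query terms as good or bad more concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier with Laplace smoothing. It is not the authors' implementation: the feature names (`source`, `proximity`), the distance thresholds in `proximity_bucket`, and the tiny training set are all assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def proximity_bucket(term_pos, anchor_pos):
    """Bucket the token distance between a term and the anchor text.

    The thresholds (2 and 10 tokens) are hypothetical, not from the paper.
    """
    distance = abs(term_pos - anchor_pos)
    if distance <= 2:
        return "near"
    if distance <= 10:
        return "mid"
    return "far"


class TermNaiveBayes:
    """Categorical Naive Bayes over dict-valued features, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        # (label, feature name) -> Counter of observed feature values
        self.feature_counts = defaultdict(Counter)
        # feature name -> set of all values seen during training
        self.feature_values = defaultdict(set)

    def fit(self, samples):
        for features, label in samples:
            self.class_counts[label] += 1
            for name, value in features.items():
                self.feature_counts[(label, name)][value] += 1
                self.feature_values[name].add(value)
        return self

    def log_scores(self, features):
        """Return the unnormalised log posterior for each class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            for name, value in features.items():
                count = self.feature_counts[(label, name)][value]
                k = len(self.feature_values[name])
                score += math.log((count + self.alpha) / (n + self.alpha * k))
            scores[label] = score
        return scores

    def predict(self, features):
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Hypothetical labelled terms: where the term came from and how close it
# sits to the anchor text, with a hand-assigned good/bad label.
TRAINING = [
    ({"source": "anchor", "proximity": "near"}, "good"),
    ({"source": "anchor", "proximity": "near"}, "good"),
    ({"source": "context", "proximity": "near"}, "good"),
    ({"source": "context", "proximity": "mid"}, "good"),
    ({"source": "page", "proximity": "far"}, "bad"),
    ({"source": "page", "proximity": "far"}, "bad"),
    ({"source": "context", "proximity": "far"}, "bad"),
    ({"source": "page", "proximity": "mid"}, "bad"),
]

model = TermNaiveBayes().fit(TRAINING)

if __name__ == "__main__":
    candidate = {"source": "context", "proximity": proximity_bucket(7, 5)}
    print(candidate, "->", model.predict(candidate))
```

In a setup like this, terms predicted as good would be kept for the query sent to the search engine and terms predicted as bad would be discarded. A real system would need labelled data derived from actual link recovery results.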
Acknowledgments
I would like to express my sincerest appreciation to my supervisor, Dr. Shariq Bashir, for his direction, assistance, and guidance. I sincerely thank him for his vigorous support and for his inspirational and technical advice in the research area. I am very thankful to him from the core of my heart for his help at the final stage, as he enabled me to develop an understanding of the subject. He has taught me, both consciously and unconsciously, how good experimental work is carried out. He was always there whenever help was needed with the experimental work (software support, the dataset, and its understanding) or with the step-by-step execution and completion of the paper.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Khan, F.N., Ali, A., Hussain, I., Sarwar, N., Rafique, H. (2019). Repairing Broken Links Using Naive Bayes Classifier. In: Bajwa, I., Kamareddine, F., Costa, A. (eds) Intelligent Technologies and Applications. INTAP 2018. Communications in Computer and Information Science, vol 932. Springer, Singapore. https://doi.org/10.1007/978-981-13-6052-7_40
DOI: https://doi.org/10.1007/978-981-13-6052-7_40
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6051-0
Online ISBN: 978-981-13-6052-7
eBook Packages: Computer Science (R0)