{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,3,31]],"date-time":"2022-03-31T19:38:04Z","timestamp":1648755484907},"reference-count":31,"publisher":"Emerald","issue":"2","license":[{"start":{"date-parts":[[2015,6,15]],"date-time":"2015-06-15T00:00:00Z","timestamp":1434326400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,6,15]]},"abstract":"Purpose \u2013 This paper aims to present a machine learning approach for solving the problem of Web spam detection. Based on an adoption of the ant colony optimization (ACO), three algorithms are proposed to construct rule-based classifiers to distinguish between non-spam and spam hosts. Moreover, the paper also proposes an adaptive learning technique to enhance the spam detection performance.\n\nDesign\/methodology\/approach \u2013 The Trust-ACO algorithm is designed to let an ant start from a non-spam seed, and afterwards, decide to walk through paths in the host graph. Trails (i.e. trust paths) discovered by ants are then interpreted and compiled to non-spam classification rules. Similarly, the Distrust-ACO algorithm is designed to generate spam classification ones. The last Combine-ACO algorithm aims to accumulate rules given from the former algorithms. Moreover, an adaptive learning technique is introduced to let ants walk with longer (or shorter) steps by rewarding them when they find desirable paths or penalizing them otherwise.\n\nFindings \u2013 Experiments are conducted on two publicly available WEBSPAM-UK2006 and WEBSPAM-UK2007 datasets. 
The results show that the proposed algorithms outperform well-known rule-based classification baselines. Especially, the proposed adaptive learning technique helps improving the AUC scores up to 0.899 and 0.784 on the former and the latter datasets, respectively.\n\nOriginality\/value \u2013 To the best of our knowledge, this is the first comprehensive study that adopts the ACO learning approach to solve the problem of Web spam detection. In addition, we have improved the traditional ACO by using the adaptive learning technique.","DOI":"10.1108\/ijwis-12-2014-0047","type":"journal-article","created":{"date-parts":[[2015,6,3]],"date-time":"2015-06-03T17:21:20Z","timestamp":1433352080000},"page":"142-161","source":"Crossref","is-referenced-by-count":5,"title":["Web spam detection using trust and distrust-based ant colony optimization learning"],"prefix":"10.1108","volume":"11","author":[{"given":"Bundit","family":"Manaskasemsak","sequence":"first","affiliation":[]},{"given":"Arnon","family":"Rungsawang","sequence":"additional","affiliation":[]}],"member":"140","reference":[{"key":"key2020122304272495400_b1","doi-asserted-by":"crossref","unstructured":"Araujo, L.\n and \n Martinez-Romo, J.\n (2010), \u201cWeb spam detection: new classification features based on qualified link analysis and language models\u201d, \n IEEE Transactions on Information Forensics and Security\n , Vol. 5 No. 3, pp. 
581-590.","DOI":"10.1109\/TIFS.2010.2050767"},{"key":"key2020122304272495400_b2","unstructured":"Baeza-Yates, R.A.\n and \n Ribeiro-Neto, B.A.\n (1999), \n Modern Information Retrieval\n , Addison Wesley."},{"key":"key2020122304272495400_b3","unstructured":"Becchetti, L.\n , \n Castillo, C.\n , \n Donato, D.\n , \n Leonardi, S.\n and \n Baeza-Yates, R.A.\n (2006), \u201cLink-based characterization and detection of web spam\u201d, Proceedings of the 2nd International Workshop on Adversarial Information Retrieval on the Web, Seattle, WA, pp. 1-8."},{"key":"key2020122304272495400_b4","unstructured":"Becchetti, L.\n , \n Castillo, C.\n , \n Donato, D.\n , \n Leonardi, S.\n and \n Baeza-Yates, R.A.\n (2008), \u201cWeb spam detection: link-based and content-based techniques\u201d, \n The European Integrated Project Dynamically Evolving, Large Scale Information System (DELIS): Proceedings of the Final Workshop\n , Vol. 222, pp. 99-113."},{"key":"key2020122304272495400_b5","doi-asserted-by":"crossref","unstructured":"Castillo, C.\n , \n Donato, D.\n , \n Becchetti, L.\n , \n Boldi, P.\n , \n Leonardi, S.\n , \n Santini, M.\n and \n Vigna, S.\n (2006), \u201cA reference collection for web spam\u201d, \n ACM SIGIR Forum\n , Vol. 40 No. 2, pp. 11-24.","DOI":"10.1145\/1189702.1189703"},{"key":"key2020122304272495400_b6","doi-asserted-by":"crossref","unstructured":"Castillo, C.\n , \n Donato, D.\n , \n Gionis, A.\n , \n Murdock, V.\n and \n Silvestri, F.\n (2007), \u201cKnow your neighbors: web spam detection using the web topology\u201d, Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY, pp. 423-430.","DOI":"10.1145\/1277741.1277814"},{"key":"key2020122304272495400_b7","doi-asserted-by":"crossref","unstructured":"Castillo, C.\n , \n Chellapilla, K.\n and \n Davison, B.D.\n (2008), \u201cAdversarial information retrieval on the web (AIRWEB 2007)\u201d, \n ACM SIGIR Forum\n , Vol. 42 No. 1, pp. 
68-72.","DOI":"10.1145\/1394251.1394267"},{"key":"key2020122304272495400_b8","doi-asserted-by":"crossref","unstructured":"Dong, C.\n and \n Zhou, B.\n (2012), \u201cEffectively detecting content spam on the web using topical diversity measures\u201d, Proceedings of the IEEE\/WIC\/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology, Vol. 1, Washington, DC, pp. 266-273.","DOI":"10.1109\/WI-IAT.2012.98"},{"key":"key2020122304272495400_b11","doi-asserted-by":"crossref","unstructured":"Dorigo, M.\n , \n Di Caro, G.\n and \n Gambardella, L.M.\n (1999), \u201cAnt algorithms for discrete optimization\u201d, \n Artificial Life\n , Vol. 5 No. 2, pp. 137-172.","DOI":"10.1162\/106454699568728"},{"key":"key2020122304272495400_b9","doi-asserted-by":"crossref","unstructured":"Dorigo, M.\n and \n Gambardella, L.M.\n (1997), \u201cAnt colony system: a cooperative learning approach to the traveling salesman problem\u201d, \n IEEE Transactions on Evolutionary Computation\n , Vol. 1 No. 1, pp. 53-66.","DOI":"10.1109\/4235.585892"},{"key":"key2020122304272495400_b10","doi-asserted-by":"crossref","unstructured":"Dorigo, M.\n , \n Maniezzo, V.\n and \n Colorni, A.\n (1996), \u201cAnt system: optimization by a colony of cooperating agents\u201d, \n IEEE Transactions on System, Man, and Cybernetics\n , Vol. 26 No. 1, pp. 29-41.","DOI":"10.1109\/3477.484436"},{"key":"key2020122304272495400_b12","doi-asserted-by":"crossref","unstructured":"Erd\u00e9lyi, M.\n , \n Bencz\u00far, A.A.\n , \n Masan\u00e9s, J.\n and \n Sikl\u00f3si, D.\n (2009), \u201cWeb spam filtering in internet archives\u201d, Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web, New York, NY, pp. 17-20.","DOI":"10.1145\/1531914.1531918"},{"key":"key2020122304272495400_b13","doi-asserted-by":"crossref","unstructured":"Fawcett, T.\n (2006), \u201cAn introduction to roc analysis\u201d, \n Pattern Recognition Letters\n , Vol. 27 No. 8, pp. 
861-874.","DOI":"10.1016\/j.patrec.2005.10.010"},{"key":"key2020122304272495400_b14","unstructured":"Fayyad, U.M.\n and \n Irani, K.B.\n (1993), \u201cMulti-interval discretization of continuous-valued attributes for classification learning\u201d, Proceedings of the 13th International Joint Conference on Artificial Intelligence, Chamb\u00e9ry, pp. 1022-1027."},{"key":"key2020122304272495400_b15","doi-asserted-by":"crossref","unstructured":"Fetterly, D.\n , \n Manasse, M.\n and \n Najork, M.\n (2004), \u201cSpam, damn spam, and statistics: using statistical analysis to locate spam web pages\u201d, Proceedings of the 17th International Workshop on the Web and Databases, Paris, pp. 1-6.","DOI":"10.1145\/1017074.1017077"},{"key":"key2020122304272495400_b16","doi-asserted-by":"crossref","unstructured":"Goh, K.L.\n , \n Singh, A.K.\n and \n Lim, K.H.\n (2013), \u201cMultilayer perceptrons neural network based web spam detection application\u201d, Proceedings of the IEEE China Summit \n\t\t\t\t\t&\n\t\t\t\t International Conference on Signal and Information Processing, Beijing, pp. 636-640.","DOI":"10.1109\/ChinaSIP.2013.6625419"},{"key":"key2020122304272495400_b17","unstructured":"Gy\u00f6ngyi, Z.\n and \n Garcia-Molina, H.\n (2005), \u201cWeb spam taxonomy\u201d, Proceedings of the 1st International Workshop on Adversarial Information Retrieval on the Web, Chiba, pp. 39-47."},{"key":"key2020122304272495400_b18","unstructured":"Gy\u00f6ngyi, Z.\n , \n Garcia-Molina, H.\n and \n Pedersen, J.\n (2004), \u201cCombating web spam with trustrank\u201d, Proceedings of the 13th International Conference on Very Large Data Bases, Stanford, CA, pp. 576-587."},{"key":"key2020122304272495400_b19","doi-asserted-by":"crossref","unstructured":"Hall, M.\n , \n Frank, E.\n , \n Holmes, G.\n , \n Pfahringer, B.\n , \n Reutemann, P.\n and \n Witten, I.H.\n (2009), \u201cThe weka data mining software: an update\u201d, \n ACM SIGKDD Explorations Newsletter\n , Vol. 11 No. 1, pp. 
10-18.","DOI":"10.1145\/1656274.1656278"},{"key":"key2020122304272495400_b20","doi-asserted-by":"crossref","unstructured":"Kleinberg, J.M.\n (1999), \u201cAuthoritative sources in a hyperlinked environment\u201d, \n Journal of the ACM\n , Vol. 46 No. 5, pp. 604-632.","DOI":"10.1145\/324133.324140"},{"key":"key2020122304272495400_b21","unstructured":"Krishnan, V.\n and \n Raj, R.\n (2006), \u201cWeb spam detection with anti-trust rank\u201d, Proceedings of the 2nd International Workshop on Adversarial Information Retrieval on the Web, Seattle, WA, pp. 37-40."},{"key":"key2020122304272495400_b22","doi-asserted-by":"crossref","unstructured":"Liu, Y.\n , \n Gao, B.\n , \n Liu, T.Y.\n , \n Zhan, Y.\n , \n Ma, Z.\n , \n He, S.\n and \n Li, H.\n (2008a), \u201cBrowserank: letting web users vote for page importance\u201d, Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY, pp. 451-458.","DOI":"10.1145\/1390334.1390412"},{"key":"key2020122304272495400_b23","doi-asserted-by":"crossref","unstructured":"Liu, Y.\n , \n Zhang, M.\n , \n Ma, S.\n and \n Ru, L.\n (2008b), \u201cUser behavior oriented web spam detection\u201d, Proceedings of the 17th International Conference on World Wide Web, New York, NY, pp. 1039-1040.","DOI":"10.1145\/1367497.1367645"},{"key":"key2020122304272495400_b24","doi-asserted-by":"crossref","unstructured":"Luckner, M.\n , \n Gad, M.\n and \n Sobkowiak, P.\n (2014), \u201cStable web spam detection using features based on lexical items\u201d, \n Computers & Security\n , Vol. 46, pp. 
79-93.","DOI":"10.1016\/j.cose.2014.07.006"},{"key":"key2020122304272495400_b25","doi-asserted-by":"crossref","unstructured":"Manaskasemsak, B.\n , \n Jiarpakdee, J.\n and \n Rungsawang, A.\n (2014), \u201cAdaptive learning ant colony optimization for web spam detection\u201d, Proceedings of the 14th International Conference on Computational Science and Its Applications, Guimar\u00e3es, 30 June-3 July, pp. 642-653.","DOI":"10.1007\/978-3-319-09153-2_48"},{"key":"key2020122304272495400_b26","doi-asserted-by":"crossref","unstructured":"Ntoulas, A.\n , \n Najork, M.\n , \n Manasse, M.\n and \n Fetterly, D.\n (2006), \u201cDetecting spam web pages through content analysis\u201d, Proceedings of the 15th International Conference on World Wide Web, New York, NY, pp. 83-92.","DOI":"10.1145\/1135777.1135794"},{"key":"key2020122304272495400_b27","unstructured":"Page, L.\n , \n Brin, S.\n , \n Motwani, R.\n and \n Winograd, T.\n (1999), \u201cThe Pagerank citation ranking: bringing order to the web\u201d, Technical report, Stanford Digital Libraries."},{"key":"key2020122304272495400_b28","doi-asserted-by":"crossref","unstructured":"St\u00fctzle, T.\n and \n Hoos, H.H.\n (2000), \u201cMax-min ant system\u201d, \n Future Generation Computer Systems\n , Vol. 16 No. 9, pp. 889-914.","DOI":"10.1016\/S0167-739X(00)00043-1"},{"key":"key2020122304272495400_b29","doi-asserted-by":"crossref","unstructured":"Taweesiriwate, A.\n , \n Manaskasemsak, B.\n and \n Rungsawang, A.\n (2012), \u201cWeb spam detection using link-based ant colony optimization\u201d, Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications, Fukuoka, pp. 868-873.","DOI":"10.1109\/AINA.2012.118"},{"key":"key2020122304272495400_b30","doi-asserted-by":"crossref","unstructured":"Wu, B.\n and \n Davison, B.D.\n (2005), \u201cIdentifying link farm spam pages\u201d, Special Interest Tracks and Posters of the 14th International Conference on World Wide Web, New York, NY, pp. 
820-829.","DOI":"10.1145\/1062745.1062762"},{"key":"key2020122304272495400_b31","unstructured":"Wu, B.\n , \n Goel, V.\n and \n Davison, B.D.\n (2006), \u201cPropagating trust and distrust to demote web spam\u201d, Proceedings of the Workshop on Models of Trust for the Web, Edinburgh."}],"container-title":["International Journal of Web Information Systems"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/www.emeraldinsight.com\/doi\/full-xml\/10.1108\/IJWIS-12-2014-0047","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-12-2014-0047\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-12-2014-0047\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,12,23]],"date-time":"2020-12-23T04:27:48Z","timestamp":1608697668000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-12-2014-0047\/full\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,6,15]]},"references-count":31,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2015,6,15]]}},"alternative-id":["10.1108\/IJWIS-12-2014-0047"],"URL":"https:\/\/doi.org\/10.1108\/ijwis-12-2014-0047","relation":{},"ISSN":["1744-0084"],"issn-type":[{"value":"1744-0084","type":"print"}],"subject":[],"published":{"date-parts":[[2015,6,15]]}}}