Abstract
The Web has evolved so that site administrators are no longer the only ones who generate content: users can also express their feelings and opinions. This has had a negative side effect, since user-generated content is sometimes inappropriate, and it is frequently authored by troll users who deliberately seek controversy. In this paper we propose a new method to detect trolling comments in social news websites. To this end, we extract a combination of statistical, syntactic and opinion features from the user comments. Since the troll phenomenon is quite common on the Web, we propose a novel experimental setup for our anomaly detection method: troll comments are taken as the base model (normal behaviour, or ‘normality’). We evaluate our approach with data from ‘Menéame’, a popular Spanish social news site, showing that our method can obtain high detection rates whilst minimising the labelling effort.
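The experimental setup described in the abstract, in which troll comments form the model of normality and everything far from them is flagged as anomalous, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain term-frequency vectors as a stand-in for the statistical, syntactic and opinion features, a nearest-neighbour cosine distance as the anomaly score, and a hand-picked threshold. All function names and the example comments are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector of a comment (assumed stand-in for the paper's features)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors; 1.0 if either is empty."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def anomaly_score(comment, base_model):
    """Distance from the comment to its nearest neighbour in the base model."""
    vec = tf_vector(comment)
    return min(cosine_distance(vec, base_vec) for base_vec in base_model)


def is_anomalous(comment, base_model, threshold):
    """A comment deviates from 'normality' when its score exceeds the threshold."""
    return anomaly_score(comment, base_model) > threshold


# Hypothetical labelled troll comments: only this class needs labelling.
troll_comments = [
    "you are all idiots",
    "idiots everywhere on this site",
]
base_model = [tf_vector(c) for c in troll_comments]

THRESHOLD = 0.5  # assumed value; in practice it would be tuned on held-out data

for comment in ["you are idiots", "thanks for the detailed source link"]:
    label = "non-troll (anomalous)" if is_anomalous(comment, base_model, THRESHOLD) else "troll (normal)"
    print(f"{comment!r}: {label}")
```

Because the base model is built from a single class, only troll comments need to be labelled, which is the sense in which such a setup keeps the labelling effort low.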
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
de-la-Peña-Sordo, J., Pastor-López, I., Ugarte-Pedrero, X., Santos, I., Bringas, P.G. (2014). Anomalous User Comment Detection in Social News Websites. In: de la Puerta, J., et al. International Joint Conference SOCO’14-CISIS’14-ICEUTE’14. Advances in Intelligent Systems and Computing, vol 299. Springer, Cham. https://doi.org/10.1007/978-3-319-07995-0_51
DOI: https://doi.org/10.1007/978-3-319-07995-0_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07994-3
Online ISBN: 978-3-319-07995-0
eBook Packages: Engineering, Engineering (R0)