RGF-Bot: A Novel Feature Selection Method to Identify Malicious Bot Accounts on Social Networking Sites Using Machine Learning

Chanti, S.; Chithralekha, T.

doi:10.1007/s42979-023-02263-5

Original Research
Published: 03 November 2023

Volume 4, article number 843, (2023)
SN Computer Science

Abstract

A bot is an automated program used for malicious activities on Social Networking Sites (SNS) such as Twitter, including posting fake news, spreading malware, and commenting on or liking tweets. This paper proposes a novel machine-learning-based feature selection method to identify malicious bot accounts on SNS. The aim is to identify bot accounts using a minimal set of features while maintaining the same or higher accuracy. First, standard datasets from the Twitter platform were downloaded and pre-processed: Dataset 1 with 29 features and Dataset 2 with 30 features. Existing feature selection methods, namely Variance Score (VS), Random Forest Importance (RFI), and Gradient Boost Importance (GBI), were applied to rank the features. The proposed Recursive Grouping of Features (RGF) method was then applied to the VS-, RFI-, and GBI-ranked feature sets to obtain Minimal Feature Sets (MFSs), each containing fewer features than the full feature set. The classification algorithms considered were applied to the VS-, RFI-, and GBI-ranked MFSs to find the best-performing classifier and the best feature ranking method. Decision trees were found to be the best classification algorithm on the VS-ranked MFSs. Using the first MFS alone, the proposed RGF method matched the accuracy obtained with all features on Dataset 1 and improved on it on Dataset 2.
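
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how RGF forms its groups, so the grouping below (cumulative top-k prefixes of the ranked list) is only an assumption made for illustration, not the authors' procedure. The function names, the feature names, and the toy data are likewise hypothetical. Only the Variance Score ranking is shown; the RFI and GBI rankings would need a trained ensemble model.

```python
from statistics import pvariance


def variance_rank(rows, feature_names):
    """Rank feature names by population variance, highest first."""
    columns = list(zip(*rows))  # transpose: one tuple of values per feature
    scores = {
        name: pvariance(column)
        for name, column in zip(feature_names, columns)
    }
    return sorted(feature_names, key=lambda name: scores[name], reverse=True)


def prefix_groups(ranked, step=1):
    """Assumed grouping (not taken from the paper): cumulative top-k prefixes.

    Only groups strictly smaller than the full feature set are returned,
    matching the abstract's requirement that an MFS has fewer features
    than the total.
    """
    return [ranked[:k] for k in range(step, len(ranked), step)]


# Hypothetical account features: one row per account.
feature_names = ["followers_count", "ratio", "verified"]
rows = [
    (10, 0.1, 1),
    (200, 0.2, 1),
    (35, 0.1, 0),
    (400, 0.3, 1),
]

ranked = variance_rank(rows, feature_names)
candidates = prefix_groups(ranked)
print(ranked)
print(candidates)
```

Each candidate group would then be passed to a classifier, such as a decision tree, and its accuracy compared with the accuracy obtained on all features. Raw variance depends on the scale of each feature, so a ranking like this is usually computed after the features have been normalised.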

Data Availability Statement

Not applicable.

Code Availability

Not applicable.

Funding

Not applicable.

Author information

Authors and Affiliations

Department of Banking Technology, Pondicherry University, Kalapet, Puducherry, Puducherry, 605014, India
S. Chanti & T. Chithralekha
Department of Computer Science, Pondicherry University, Kalapet, Puducherry, Puducherry, 605014, India
T. Chithralekha

Corresponding author

Correspondence to S. Chanti.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

Not applicable.

Consent to participate

The authors declare no consent to participate through Virtual mode.

Consent for publication

The authors declare no consent for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advances in Internet Research and Engineering 2023” guest edited by Sudarsan S D, Mohit Sethi and Balaji Rajendran.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chanti, S., Chithralekha, T. RGF-Bot: A Novel Feature Selection Method to Identify Malicious Bot Accounts on Social Networking Sites Using Machine Learning. SN COMPUT. SCI. 4, 843 (2023). https://doi.org/10.1007/s42979-023-02263-5

Received: 24 June 2023
Accepted: 19 August 2023
Published: 03 November 2023
DOI: https://doi.org/10.1007/s42979-023-02263-5
