Abstract
Information systems depend on security mechanisms to detect and respond to cyber-attacks. One of the most frequent attacks is the Distributed Denial of Service (DDoS) attack: it impairs system performance and, in the worst case, causes prolonged downtime that prevents business processes from running normally. To detect this attack, several supervised Machine Learning (ML) algorithms have been developed, and companies use them to protect their servers. A key stage in these algorithms is feature pre-processing, in which input data features are assessed and selected to obtain the best results in the subsequent stages of a supervised ML pipeline. This article proposes an innovative approach to feature selection: the use of Visibility Graphs (VGs) to select features for supervised ML algorithms that detect DDoS attacks. The results show that VGs can be implemented quickly and can compete with other feature selection methods, as they require few computational resources and offer satisfactory results, at least in our example based on the early detection of DDoS attacks. The size of the processed data appears to be the main implementation constraint of this novel feature selection method.
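As a rough illustration of the underlying construct, the sketch below builds a natural visibility graph from a one-dimensional series, using the standard visibility criterion from the time-series literature: two samples are linked when every sample between them lies strictly below the straight line joining them. The function names `natural_visibility_edges` and `mean_degree` are hypothetical, and using the mean degree as a per-feature summary is an assumption made here for illustration; the abstract does not state which graph property the authors use to select features.

```python
from itertools import combinations


def natural_visibility_edges(series):
    """Return the edge set of the natural visibility graph of a sequence.

    Nodes are the sample indices 0..len(series)-1.
    """
    edges = set()
    for a, b in combinations(range(len(series)), 2):
        ya, yb = series[a], series[b]
        # a and b see each other if every sample strictly between them
        # lies below the straight line joining (a, ya) and (b, yb).
        if all(
            series[c] < yb + (ya - yb) * (b - c) / (b - a)
            for c in range(a + 1, b)
        ):
            edges.add((a, b))
    return edges


def mean_degree(series):
    """Average node degree of the visibility graph (illustrative score only)."""
    n = len(series)
    return 2 * len(natural_visibility_edges(series)) / n if n else 0.0
```

This brute-force version checks every pair of samples and every sample between them, so its cost grows roughly cubically with the series length in the worst case, which is consistent with the abstract's remark that data size is the main implementation constraint.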
Acknowledgements
This work was partially supported by the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project “Cybers SeC IP” (NORTE-01-0145-FEDER-000044).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lopes, J., Partida, A., Pinto, P., Pinto, A. (2024). On the Use of VGs for Feature Selection in Supervised Machine Learning - A Use Case to Detect Distributed DoS Attacks. In: Pereira, A.I., Mendes, A., Fernandes, F.P., Pacheco, M.F., Coelho, J.P., Lima, J. (eds) Optimization, Learning Algorithms and Applications. OL2A 2023. Communications in Computer and Information Science, vol 1981. Springer, Cham. https://doi.org/10.1007/978-3-031-53025-8_19
DOI: https://doi.org/10.1007/978-3-031-53025-8_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53024-1
Online ISBN: 978-3-031-53025-8
eBook Packages: Computer Science, Computer Science (R0)