Abstract
Intrusion detection has become an open challenge in modern ICT systems, driven by the ever-growing need to secure present-day networks. Various machine learning methods have been proposed to detect and prevent network intrusions. Many approaches, tuned and tested on public datasets, capitalize on well-known classifiers, which often reach detection accuracy close to 1. However, these results strongly depend on the training data, which may not be representative of real production environments and ever-evolving attacks. This paper is an initial exploration of this problem. After training a detector on a public intrusion detection dataset, we test it against held-out data not used for training and against additional data gathered by emulating attacks in a controlled network. The experiments focus on Denial of Service attacks and are based on the CICIDS2017 dataset. Overall, the results confirm that performance obtained on synthetic datasets may not generalize in practice.
Notes
- 13. http://it.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages.
Acknowledgment
Andrea Del Vecchio gratefully acknowledges support by the “Orio Carlini” 2020 GARR Consortium Fellowship.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Catillo, M., Del Vecchio, A., Pecchia, A., Villano, U. (2021). A Critique on the Use of Machine Learning on Public Datasets for Intrusion Detection. In: Paiva, A.C.R., Cavalli, A.R., Ventura Martins, P., Pérez-Castillo, R. (eds) Quality of Information and Communications Technology. QUATIC 2021. Communications in Computer and Information Science, vol 1439. Springer, Cham. https://doi.org/10.1007/978-3-030-85347-1_19
DOI: https://doi.org/10.1007/978-3-030-85347-1_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85346-4
Online ISBN: 978-3-030-85347-1
eBook Packages: Computer Science, Computer Science (R0)