Assessing the Impact of Batch-Based Data Aggregation Techniques for Feature Engineering on Machine Learning-Based Network IDSs
Abstract

Communication networks and systems are continuously threatened by a wide variety of cybersecurity attacks, driven by new malware that targets vulnerabilities in both old and new systems. Intrusion Detection Systems (IDSs) and, specifically, Network IDSs (NIDSs) therefore rely on robust methods and techniques to detect and classify security attacks. An important part of the assessment of NIDSs is the Feature Engineering (FE) process, in which raw datasets are transformed into derived ones by transforming both features and observations. In this work, the ff4ml framework, which includes the Feature as a Counter (FaaC) FE approach, is used to transform raw features into new ones that are counters of the originals. The FaaC approach aggregates raw observations by time intervals, which limits its use to network datasets containing timestamps. This work proposes a batch-based aggregation technique that allows FaaC to be applied to timestamp-less datasets, and analyzes its impact on the performance of Machine Learning (ML)-based NIDSs in comparison with timestamp-based aggregation approaches.
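The idea of batch-based aggregation can be illustrated with a minimal Python sketch. It is not the ff4ml or FaaC implementation: the function name `batch_counters`, the `variables` mapping, the feature names, and the fixed batch size are all hypothetical choices made for illustration. The sketch assumes that observations arrive in a meaningful order and that each derived feature counts how often a given raw value occurs within a batch of consecutive observations, so that no timestamp is needed.

```python
from collections import Counter
from typing import Dict, List, Sequence


def batch_counters(
    rows: Sequence[Dict[str, str]],
    batch_size: int,
    variables: Dict[str, List[str]],
) -> List[Dict[str, int]]:
    """Aggregate consecutive rows into fixed-size batches of counters.

    `variables` maps each raw feature name to the values to be counted
    (a hypothetical configuration, not the ff4ml format).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")

    derived = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]  # last batch may be shorter
        # Count every (feature, value) pair observed in this batch.
        counts = Counter(
            (feature, row[feature])
            for row in batch
            for feature in variables
            if feature in row
        )
        # One derived observation per batch: one counter per tracked value.
        derived.append({
            f"{feature}_{value}": counts[(feature, value)]
            for feature, values in variables.items()
            for value in values
        })
    return derived


# Toy raw observations without timestamps (illustrative values only).
rows = [
    {"protocol": "tcp", "dst_port": "80"},
    {"protocol": "tcp", "dst_port": "443"},
    {"protocol": "udp", "dst_port": "53"},
    {"protocol": "tcp", "dst_port": "80"},
    {"protocol": "udp", "dst_port": "53"},
]
variables = {"protocol": ["tcp", "udp"], "dst_port": ["80", "443", "53"]}

derived = batch_counters(rows, batch_size=3, variables=variables)
for observation in derived:
    print(observation)
```

A timestamp-based variant would differ only in how the batches are formed: rows would be grouped by the time interval they fall into rather than by their position in the dataset.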

Acknowledgments

This work was supported by the Spanish Ministerio de Ciencia, Innovación y Universidades and the ERDF under contracts RTI2018-100754-B-I00 (iSUN) and RTI2018-098160-B-I00 (DEEPAPFORE), ERDF under project FEDER-UCA18-108393 (OPTIMALE), and Junta de Andalucía and ERDF (GENIUS – P18-2399).

Corresponding author

Correspondence to Roberto Magán-Carrión.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Magán-Carrión, R., Urda, D., Díaz-Cano, I., Dorronsoro, B. (2022). Assessing the Impact of Batch-Based Data Aggregation Techniques for Feature Engineering on Machine Learning-Based Network IDSs. In: Gude Prego, J.J., de la Puerta, J.G., García Bringas, P., Quintián, H., Corchado, E. (eds) 14th International Conference on Computational Intelligence in Security for Information Systems and 12th International Conference on European Transnational Educational (CISIS 2021 and ICEUTE 2021). CISIS - ICEUTE 2021. Advances in Intelligent Systems and Computing, vol 1400. Springer, Cham. https://doi.org/10.1007/978-3-030-87872-6_12
