Abstract
Several frameworks covering cyber security education and professional development have been introduced to guide learners, educators and professionals through the different knowledge areas of the field. One of the most important of these frameworks is the Cyber Security Body of Knowledge (CyBOK). In this paper, we apply the BERTopic topic modeling technique to CyBOK, aiming to identify the most relevant topics in each CyBOK knowledge area in an automated way. Our results indicate that it is possible to find a meaningful topic model describing CyBOK and thus suggest that related techniques could be applied to texts to identify their main themes.
This research was supported by the Ministerio de Ciencia, Innovación y Universidades (Grant No. PID2019-111429RB-C21), by the Region of Madrid grant CYNAMON-CM (P2018/TCS-4566), co-financed by European Structural Funds ESF and FEDER, and the Excellence Program EPUC3M17. The opinions, findings, conclusions, or recommendations expressed are those of the authors and do not necessarily reflect those of any of the funders.
Notes
- 5. E.g. line 3 on page 4 of the Cryptography chapter is literally ‘[3, c8–c9, App B] [4, c1–c5]’.
- 7. BERTopic recommends a value between 10 and 20. As we obtained similar results with both values, we chose the smaller one to preserve topic model quality.
- 8. This parameter establishes the minimum number of documents that a cluster must contain to be recognized as a topic. The lower this value, the higher the number of topics. The default value is 10, which produced 107 topics, whereas a value of 20 produced around 50 topics, which we found more compact and easier to analyze.
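The effect described in note 8 can be illustrated with a toy sketch. It assumes the parameter in question is BERTopic's `min_topic_size` (the note does not name it), and it simplifies heavily: in the real pipeline the threshold shapes the clustering itself, whereas here it is applied as a filter over fixed, hypothetical cluster labels. The helper `count_topics` and the label data are invented for illustration and are not taken from the paper or from CyBOK.

```python
from collections import Counter


def count_topics(cluster_labels, min_topic_size):
    """Count clusters containing at least `min_topic_size` documents.

    Hypothetical helper: a post-hoc filter that only approximates how a
    minimum cluster size reduces the number of topics.
    """
    sizes = Counter(cluster_labels)
    return sum(1 for size in sizes.values() if size >= min_topic_size)


# Hypothetical cluster assignment for 100 documents (not CyBOK data).
labels = [0] * 45 + [1] * 25 + [2] * 18 + [3] * 12

for threshold in (10, 20):
    # A higher threshold leaves fewer clusters that qualify as topics.
    print(threshold, count_topics(labels, threshold))
```

Raising the threshold from 10 to 20 removes the two smallest clusters in this toy data, which mirrors the direction of the trade-off reported in the note: fewer, larger topics that are easier to inspect.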
Copyright information
© 2022 IFIP International Federation for Information Processing
About this paper
Cite this paper
González-Tablas, A.I., Rashed, M. (2022). Exploring CyBOK with Topic Modeling Techniques. In: Clarke, N., Furnell, S. (eds) Human Aspects of Information Security and Assurance. HAISA 2022. IFIP Advances in Information and Communication Technology, vol 658. Springer, Cham. https://doi.org/10.1007/978-3-031-12172-2_5
DOI: https://doi.org/10.1007/978-3-031-12172-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-12171-5
Online ISBN: 978-3-031-12172-2
eBook Packages: Computer Science (R0)