Abstract
Malware poses a serious threat to system and network users. Malware identification is an essential task in detecting and preventing infection of computer systems, protecting them from potential information loss and system compromise. Commonly, 25 malware families are recognized. Traditional malware detection and anti-virus systems fail to classify new variants of unknown malware into their corresponding families. With the development of malicious code engineering, it has become possible to understand malware variants and their features, even for new samples that exhibit variability and polymorphism. Existing detection methods can hardly detect such variants, yet analyzing and detecting large-scale malware samples efficiently is important in the cyber security field. Hence, this paper proposes an accurate malware family classification model built with a contemporary deep learning technique. Malware family recognition is formulated as a multi-class classification task, and a solution is obtained using representation learning based on the binary array of malware executable files. Six malware families are considered for building the models. The feature dataset of 690 instances is applied to a deep neural network to build the classifier. In experiments on this dataset of 6 malware families and 690 malware files, the trained model achieves an accuracy of over 86.8% in discriminating among malware families. The techniques provide better results for classifying malware into families.
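The abstract does not say how the binary array of an executable is prepared for the network. As a minimal sketch only, and not the authors' pipeline, the following Python shows one common way to turn raw file bytes into a fixed-length numeric vector. The function name `bytes_to_fixed_vector`, the default length of 1024, the zero-padding, and the scaling to [0, 1] are all assumptions made for illustration.

```python
from typing import List


def bytes_to_fixed_vector(data: bytes, length: int = 1024) -> List[float]:
    """Map raw executable bytes to a fixed-length vector scaled to [0, 1].

    Hypothetical helper: files longer than `length` are truncated and
    shorter files are zero-padded, so every sample has the same size.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    clipped = data[:length]                           # truncate long files
    padded = clipped + bytes(length - len(clipped))   # zero-pad short files
    return [b / 255.0 for b in padded]                # scale each byte to [0, 1]


if __name__ == "__main__":
    # A few made-up bytes standing in for the start of an executable file.
    sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0xFF])
    print(bytes_to_fixed_vector(sample, length=8))
```

Vectors of this kind could then be fed to a classifier with one output per family (six in the abstract's setting). The fixed length is a design choice that trades information from large files for a uniform input size.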
Copyright information
© 2020 IFIP International Federation for Information Processing
About this paper
Cite this paper
Gayathri, T., Vijaya, M.S. (2020). Malware Family Classification Model Using User Defined Features and Representation Learning. In: Chandrabose, A., Furbach, U., Ghosh, A., Kumar M., A. (eds) Computational Intelligence in Data Science. ICCIDS 2020. IFIP Advances in Information and Communication Technology, vol 578. Springer, Cham. https://doi.org/10.1007/978-3-030-63467-4_14
DOI: https://doi.org/10.1007/978-3-030-63467-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63466-7
Online ISBN: 978-3-030-63467-4
eBook Packages: Computer Science, Computer Science (R0)