Abstract
Deep learning-based malware classification has gained momentum recently. However, deep learning models are vulnerable to adversarial perturbation attacks, especially when applied in network security applications. Deep neural network (DNN)-based malware classifiers that consume whole bit sequences are also vulnerable, despite their satisfactory performance and reduced feature-engineering effort. This paper therefore proposes a DNN-based malware classifier that operates on the raw bit sequences of Windows programs. We then propose two adversarial attacks that target our trained DNNs to generate adversarial malware. We also propose a defensive mechanism that treats perturbations as noise added to the bit sequences. In this mechanism, a generative adversarial network (GAN)-based model is designed to filter out the perturbation noise, and the adversarial samples with the highest probability of fooling the target DNNs are chosen for adversarial training. The experiments show that the GAN with filter-based model produces the highest-quality adversarial samples at medium cost. The evasion ratio under the GAN with filter-based model is as high as 50.64% on average. When GAN-based adversarial samples are incorporated into training, the enhanced DNN achieves a satisfactory accuracy of 90.20% while the evasion ratio stays below 9.47%. The GAN thus helps secure the DNN-based malware classifier with negligible performance degradation compared with the original DNN. The evasion ratio is markedly reduced under powerful adversarial attacks, including \({\textit{FGSM}}^r\) and \({\textit{FGSM}}^k\).
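The abstract reports results as an evasion ratio but does not define it here. As a minimal sketch, assuming the common definition (the fraction of adversarial malware samples that the classifier fails to flag as malicious), it could be computed as follows. The function name `evasion_ratio`, the label convention (1 = malicious) and the sample predictions are illustrative assumptions, not taken from the paper.

```python
def evasion_ratio(predictions, malicious_label=1):
    """Fraction of adversarial malware samples NOT classified as malicious.

    Assumption: every entry in `predictions` is the classifier's output for
    a sample that is truly malicious (an adversarial malware sample).
    """
    if not predictions:
        raise ValueError("predictions must be non-empty")
    # A sample evades detection when the predicted label is not the malicious one.
    evaded = sum(1 for p in predictions if p != malicious_label)
    return evaded / len(predictions)


# Hypothetical classifier outputs on eight adversarial malware samples.
preds = [1, 0, 1, 1, 0, 1, 1, 1]
print(f"evasion ratio: {evasion_ratio(preds):.2%}")
```

Under this assumed definition, a lower ratio after adversarial training means fewer adversarial samples slip past the enhanced classifier.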
Acknowledgements
This work has been supported by the Open Foundation of Key Laboratory in Software Engineering of Yunnan Province under Grant Nos. 2020SE401, 2020SE306 and 2020SE305.
About this article
Cite this article
Zhang, Y., Li, H., Zheng, Y. et al. Enhanced DNNs for malware classification with GAN-based adversarial training. J Comput Virol Hack Tech 17, 153–163 (2021). https://doi.org/10.1007/s11416-021-00378-y
DOI: https://doi.org/10.1007/s11416-021-00378-y