Tarallo: Evading Behavioral Malware Detectors in the Problem Space

  • Conference paper
Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA 2024)

Abstract

Machine learning algorithms can effectively classify malware based on its dynamic behavior, but they are susceptible to adversarial attacks. Existing attacks, however, often fail to find a solution that is effective in both the feature space and the problem space. This issue arises because they do not address the intrinsically nondeterministic nature of malware: executing the same sample multiple times may yield significantly different behaviors. Hence, the perturbations computed for one observed behavior may be ineffective for others observed in subsequent executions. In this paper, we show how an attacker can increase their chance of success by leveraging a new and more efficient feature space algorithm for sequential data, which we have named Position Sensitive - Fast Gradient Sign Method, and by adopting two problem space strategies specifically tailored to address nondeterminism. We implement our novel algorithm and attack strategies in Tarallo, an end-to-end adversarial framework that significantly outperforms previous works in both white-box and black-box scenarios. Our preliminary analysis, conducted in a sandboxed environment against two Recurrent Neural Network (RNN)-based malware detectors, shows that Tarallo achieves a success rate of up to 99% on both feature and problem space attacks while significantly reducing the number of modifications required for misclassification.
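
For readers unfamiliar with the Fast Gradient Sign Method that the proposed algorithm takes its name from, the following is a minimal sketch of the standard, position-agnostic FGSM step (Goodfellow et al.), x_adv = x + epsilon * sign(dLoss/dx). It is not the Position Sensitive variant introduced in the paper, whose details are not given in this abstract. The logistic model, its weights, the sample, and the value of epsilon are all hypothetical, and the features are continuous, whereas a behavioral detector operates on discrete API call sequences.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability that feature vector x is malicious under a toy logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def fgsm(weights, bias, x, y, epsilon):
    """One standard FGSM step: x + epsilon * sign(dLoss/dx).

    For binary cross-entropy on a logistic model, dLoss/dx_i = (p - y) * w_i,
    so moving along the sign of the gradient increases the loss on label y.
    """
    p = predict(weights, bias, x)
    grad = [(p - y) * w for w in weights]
    sign = [(g > 0) - (g < 0) for g in grad]  # -1, 0, or +1 per feature
    return [xi + epsilon * s for xi, s in zip(x, sign)]

# Hypothetical toy model and sample (assumptions, not taken from the paper).
weights, bias = [2.0, -1.0, 0.5], -0.5
x = [1.0, 0.2, 0.4]  # sample whose true label is malicious (y = 1)
x_adv = fgsm(weights, bias, x, y=1, epsilon=0.5)

score_before = predict(weights, bias, x)
score_after = predict(weights, bias, x_adv)
print(score_before, score_after)
```

In this toy setting the perturbed sample receives a lower malicious score than the original. The sketch also hints at the problem the abstract raises: the perturbation is computed for one specific input, so if a later execution of the same sample produces a different behavior, the same perturbation carries no guarantee of remaining effective.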

Notes

  1. https://github.com/necst/Tarallo

  2. The list of parameters can be found at: https://github.com/necst/Tarallo/blob/main/ChainFramework/config/config_api_args.py

Acknowledgements

This study was carried out within the MICS (Made in Italy - Circular and Sustainable) Extended Partnership and received funding from Next-Generation EU (Italian PNRR - M4 C2, Invest 1.3 - D.D. 1551.11-10-2022, PE00000004). CUP MICS D43C22003120001. Mario D’Onghia acknowledges support from TIM S.p.A. through the PhD scholarship.

Author information

Corresponding author

Correspondence to Gabriele Digregorio.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Digregorio, G. et al. (2024). Tarallo: Evading Behavioral Malware Detectors in the Problem Space. In: Maggi, F., Egele, M., Payer, M., Carminati, M. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2024. Lecture Notes in Computer Science, vol 14828. Springer, Cham. https://doi.org/10.1007/978-3-031-64171-8_7

  • DOI: https://doi.org/10.1007/978-3-031-64171-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-64170-1

  • Online ISBN: 978-3-031-64171-8

  • eBook Packages: Computer Science, Computer Science (R0)
