Tarallo: Evading Behavioral Malware Detectors in the Problem Space

Digregorio, Gabriele; Maccarrone, Salvatore; D’Onghia, Mario; Gallo, Luigi; Carminati, Michele; Polino, Mario; Zanero, Stefano

doi:10.1007/978-3-031-64171-8_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14828)

Included in the following conference series:

International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment

Abstract

Machine learning algorithms can effectively classify malware based on its dynamic behavior, but they are susceptible to adversarial attacks. Existing attacks, however, often fail to find a solution that is effective in both the feature space and the problem space. This failure stems from not addressing the intrinsically nondeterministic nature of malware: executing the same sample multiple times may yield significantly different behaviors. Hence, perturbations computed for one observed behavior may be ineffective for the behaviors observed in subsequent executions. In this paper, we show how an attacker can increase their chance of success by leveraging a new and more efficient feature space algorithm for sequential data, which we have named Position Sensitive - Fast Gradient Sign Method, and by adopting two problem space strategies specifically tailored to address nondeterminism. We implement our novel algorithm and attack strategies in Tarallo, an end-to-end adversarial framework that significantly outperforms previous works in both white-box and black-box scenarios. Our preliminary analysis, conducted in a sandboxed environment against two Recurrent Neural Network (RNN)-based malware detectors, shows that Tarallo achieves a success rate of up to 99% on both feature space and problem space attacks while significantly reducing the number of modifications required for misclassification.
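To make the nondeterminism claim concrete, the following is a minimal sketch of how one might quantify the variation between API-call traces recorded from repeated executions of a single sample. It is an illustration only and not part of Tarallo: the traces are made-up data, the helper names are hypothetical, and the similarity measure (Python's standard `difflib.SequenceMatcher`) is an assumption chosen for simplicity rather than the metric used in the paper.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical API-call traces from three runs of the same sample.
# These sequences are invented for illustration, not taken from the paper.
traces = [
    ["NtCreateFile", "NtWriteFile", "NtClose", "RegOpenKeyExW", "NtDelayExecution"],
    ["NtCreateFile", "NtClose", "RegOpenKeyExW", "NtDelayExecution", "NtDelayExecution"],
    ["RegOpenKeyExW", "NtCreateFile", "NtWriteFile", "NtWriteFile", "NtClose"],
]


def trace_similarity(a, b):
    """Return a similarity ratio in [0, 1]; 1.0 means identical sequences."""
    # autojunk=False disables difflib's heuristic that ignores very
    # frequent elements, which only matters for long sequences.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def pairwise_similarities(runs):
    """Map each pair of run indices (i, j), i < j, to its similarity."""
    return {
        (i, j): trace_similarity(runs[i], runs[j])
        for i, j in combinations(range(len(runs)), 2)
    }


sims = pairwise_similarities(traces)
for (i, j), score in sorted(sims.items()):
    print(f"run {i} vs run {j}: similarity {score:.2f}")
```

A low similarity between runs suggests that a perturbation tied to specific positions in one trace may land elsewhere, or nowhere, in the next execution, which is the difficulty the abstract describes.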

Notes

1. https://github.com/necst/Tarallo
2. The list of parameters can be found at: https://github.com/necst/Tarallo/blob/main/ChainFramework/config/config_api_args.py

Acknowledgements

This study was carried out within the MICS (Made in Italy - Circular and Sustainable) Extended Partnership and received funding from Next-Generation EU (Italian PNRR - M4 C2, Invest 1.3 - D.D. 1551.11-10-2022, PE00000004). CUP MICS D43C22003120001. Mario D’Onghia acknowledges support from TIM S.p.A. through the PhD scholarship.

Author information

Authors and Affiliations

Politecnico di Milano, Milan, Italy
Gabriele Digregorio, Salvatore Maccarrone, Mario D’Onghia, Michele Carminati, Mario Polino & Stefano Zanero
Cybersecurity Lab, TIM S.p.A., Turin, Italy
Luigi Gallo

Corresponding author

Correspondence to Gabriele Digregorio.

Editor information

Editors and Affiliations

AWS, San Diego, CA, USA
Federico Maggi
Boston University, Boston, MA, USA
Manuel Egele
EPFL, Lausanne, Switzerland
Mathias Payer
Politecnico di Milano, Milan, Italy
Michele Carminati

About this paper

Cite this paper

Digregorio, G. et al. (2024). Tarallo: Evading Behavioral Malware Detectors in the Problem Space. In: Maggi, F., Egele, M., Payer, M., Carminati, M. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2024. Lecture Notes in Computer Science, vol 14828. Springer, Cham. https://doi.org/10.1007/978-3-031-64171-8_7

DOI: https://doi.org/10.1007/978-3-031-64171-8_7
Published: 09 July 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-64170-1
Online ISBN: 978-3-031-64171-8
eBook Packages: Computer Science, Computer Science (R0)
