Workshop on Binary Analysis Research (BAR) 2024 Program
Friday, 1 March
-
Dr. Barton P. Miller (Vilas Distinguished Achievement Professor at The University of Wisconsin-Madison)
Patching binary code dates back to some of the earliest computer systems. Binary code patching allows modification of a program without access to its source code, obviating the need to recompile, re-link, and, in the dynamic case, re-execute. In the early days, it was a bold technique used by serious programmers to avoid the long recompile/reassemble and link steps. Code patching required an intimate knowledge of the instruction set and its binary representation. Great advances have been made in simplifying the use of code patching, making it less error prone and more flexible.

"Binary rewriters" were a great advance in the technology for modifying a binary before its execution. Early tools, such as OM, EEL, and Vulcan, enabled the building of tools for tracing, simulation, testing, and sandboxing. Moving beyond static patching, we developed "dynamic instrumentation", the ability to patch code into a running program. Dynamic instrumentation provided the ability to adapt the code to the immediate need and to dynamically control overhead costs. We applied this technology to both user programs and operating system kernels, producing the Dyninst and Kerninst toolkits. This technology formed the foundation of the Paradyn Performance Tools.

Dynamic code patching continued to get more aggressive. We developed "self-propelled instrumentation", which inserts instrumentation code that propagates itself along the program's control flow as the program executes. At its best, this technique can provide very low overhead, detailed instrumentation in support of fault isolation and identification of intermittent performance issues. More recently, we have addressed a wide variety of issues related to binary code patching, including analyzing and patching defensive and obfuscated malware, parallelizing the binary code parsing process to quickly patch huge (GB+) binaries, and efficiently analyzing and instrumenting GPU binaries.

Key to both static and dynamic patching are the interfaces. There is a difficult balance between providing an interface that abstracts the details of the code, often using control- and data-flow graphs and instruction categories, and an interface that exposes the details of the instruction set. Our primary interface is based on editing the control flow graph, using an editing algebra that is closed under valid control flow graphs. In this talk, I will discuss the development of code patching over the years, with examples from the various technologies (including our tools), and present results from our latest work in self-propelled instrumentation. I will also discuss interface abstractions and our work toward the goal of multi-platform interfaces and tools.
Bio: Barton Miller is the Vilas Distinguished Achievement Professor at UW-Madison. Miller is a co-PI on the Trusted CI NSF Cybersecurity Center of Excellence, where he leads the software assurance effort. His research interests include software security, in-depth vulnerability assessment, binary and malicious code analysis and instrumentation, extreme scale systems, and parallel and distributed program measurement and debugging. In 1988, Miller founded the field of fuzz random software testing, which is the foundation of many security and software engineering disciplines. In 1992, Miller (working with his then-student Prof. Jeffrey Hollingsworth) founded the field of dynamic binary code instrumentation and coined the term “dynamic instrumentation”. Miller is a Fellow of the ACM and recently won the Jean-Claude Laprie Award in dependable computing for his work on fuzz testing. Miller was the chair of the Institute for Defense Analyses Center for Computing Sciences Program Review Committee, a member of the U.S. National Nuclear Security Administration Los Alamos and Lawrence Livermore National Labs Cyber Security Review Committee (POFMR), and a member of the Los Alamos National Laboratory Computing, Communications and Networking Division Review Committee; he has served on the U.S. Secret Service Electronic Crimes Task Force (Chicago Area) and is currently an advisor to the Wisconsin National Guard Cyber Prevention Team.
-
Lambang Akbar (National University of Singapore), Yuancheng Jiang (National University of Singapore), Roland H.C. Yap (National University of Singapore), Zhenkai Liang (National University of Singapore), Zhuohao Liu (National University of Singapore)
Disassemblers play a crucial role in reverse engineering, malware analysis, binary analysis, malware detection, binary-level security mechanisms, etc. It is well known that, in general, disassembly is an undecidable problem, so errors in a disassembler should be expected. In applications where disassembly of a binary is only the first step, any disassembly errors will impact the correctness or effectiveness of tasks such as static binary instrumentation, binary hardening, binary CFI, automated code repair, etc. As such, determining what errors may lie in the disassembly of a given binary would help in determining to what extent such applications are affected by disassembly errors. Existing works have highlighted limitations and errors in existing disassemblers, but they largely rely on practical implementation without specific guarantees. In this initial work, we investigate an alternative and complementary approach, where the error evaluation has soundness guarantees. There are intrinsic tradeoffs when trying to determine the ground truth of disassembly given its theoretical undecidability: essentially, one can choose between soundness and completeness. In this work, we focus on exploring the soundness direction. We propose TraceDis, which uses dynamic execution to find disassembly errors, and evaluate whether TraceDis can answer the following questions: (i) can TraceDis find errors consistent with existing studies that evaluate disassemblers using approaches without guarantees; (ii) can (new) interesting errors be found; (iii) can errors in non-C/C++ binaries be found; and (iv) can errors in closed-source binaries be found? The experiments show that TraceDis finds errors in all these cases. We believe that this preliminary evaluation taking a soundness-based approach shows promise. It can also complement and be an alternative to existing evaluation techniques.
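The soundness argument is easy to see in miniature: every program counter value observed at runtime is, by construction, the start of a real instruction, so any executed address that static disassembly did not decode as an instruction start is a genuine error. A minimal sketch of this comparison, with hypothetical helper names (the actual TraceDis pipeline is considerably more involved):

    def sound_disassembly_errors(executed_pcs, static_insn_starts):
        """Report executed addresses the static disassembler missed.

        Every executed PC is ground truth for an instruction boundary,
        so each address returned here is a sound witness of a
        disassembly error; an empty result proves nothing, since the
        approach makes no completeness claim.
        """
        return sorted(set(executed_pcs) - set(static_insn_starts))

    # Toy usage: the disassembler decoded 0x401000 and 0x401004, but
    # the trace shows 0x401002 executing (e.g., data misclassified
    # as code shifted the decoded instruction stream).
    errs = sound_disassembly_errors(
        executed_pcs={0x401000, 0x401002},
        static_insn_starts={0x401000, 0x401004},
    )
    print([hex(a) for a in errs])  # ['0x401002']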
-
Florian Hofhammer (EPFL), Marcel Busch (EPFL), Qinying Wang (EPFL and Zhejiang University), Manuel Egele (Boston University), Mathias Payer (EPFL)
Dynamic analysis of microcontroller-based embedded firmware remains challenging. The general lack of source code availability for Commercial-off-the-shelf (COTS) firmware prevents powerful source-based instrumentation and prohibits compiling the firmware into an executable directly runnable by an analyst. Analyzing firmware binaries requires either acquisition and configuration of custom hardware, or configuration of extensive software stacks built around emulators. In both cases, dynamic analysis is limited in functionality by complex debugging and instrumentation interfaces and in performance by low execution speeds on Microcontroller Units (MCUs) and Instruction Set Architecture (ISA) translation overheads in emulators.
SURGEON provides a performant, flexible, and accurate rehosting approach for dynamic analysis of embedded firmware. We introduce transplantation, which transforms binary embedded firmware into a Linux user space process executing natively on compatible high-performance systems through static binary rewriting. In addition to the achieved performance improvements, SURGEON scales horizontally through process instantiation and provides the flexibility to apply existing dynamic analysis tooling for user space processes without requiring adaptations to firmware-specific use cases. SURGEON’s key use cases include debugging binary firmware with off-the-shelf tooling for user space processes and fuzz testing.
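One intuition behind rehosting of this kind: firmware accesses to memory-mapped peripheral registers cannot execute natively in a user space process and must be redirected to host-side handlers. The sketch below is only a conceptual model of that redirection, not SURGEON's mechanism (which statically rewrites the binary); the register address and peripheral behavior are invented:

    # Conceptual model of peripheral redirection during rehosting.
    UART_STATUS = 0x4000_0004  # hypothetical MMIO register: "TX ready" flag

    def uart_status_read() -> int:
        return 0x1  # always report the transmitter as ready

    mmio_read_handlers = {UART_STATUS: uart_status_read}

    def mmio_read(addr: int) -> int:
        """Stand-in for a rewritten firmware load: instead of touching
        real hardware, the access is served by a host-side handler."""
        handler = mmio_read_handlers.get(addr)
        return handler() if handler else 0

    assert mmio_read(UART_STATUS) == 1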
-
Ioannis Angelakopoulos (Boston University), Gianluca Stringhini (Boston University), Manuel Egele (Boston University)
Re-hosting Internet of Things (IoT) firmware can oftentimes be a tedious process, especially when analysts have to intervene in the analysis to ensure further progress. When it comes to Linux-based firmware, one crucial problem that current re-hosting systems face is that the configuration of the custom kernels used by these systems significantly deviates from the configuration of the IoT kernel modules used in firmware images. As a consequence, kernel artifacts, such as the memory layout of data structures, might differ between the custom kernels and the IoT kernel modules. To analyze the IoT kernel modules within these kernels, the analyst often has to invest a significant amount of engineering effort and time to align the offending data structures within the custom kernels. In this paper, we present FirmDiff, an automated binary diffing framework that enables analysts to effectively detect and align the misaligned data structures between the custom kernels produced by the FirmSolo re-hosting framework and the Linux kernel modules in IoT firmware. The goal of FirmDiff is to improve the configuration of FirmSolo’s kernels to closely approximate the configuration of the IoT kernels in the firmware images, such that the IoT kernel modules can be analyzed without errors. We evaluate FirmDiff on a dataset of 10 firmware images with 148 IoT kernel modules that crash during re-hosting with FirmSolo. Using FirmDiff’s findings, we identify 37 misaligned data structures in FirmSolo’s kernels for these images. After aligning the layout of 35 of these data structures, FirmSolo’s refined kernels successfully load 28 previously crashing kernel modules.
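To make the misalignment concrete: the same kernel structure can have different member offsets under different kernel configurations (for instance, when a config option adds or removes a field), so a vendor module compiled against one layout reads garbage when loaded into a kernel built with another. A toy sketch of the layout-diffing idea, with invented structure names and offsets (the real FirmDiff works on binary kernels and modules, not hand-written tables):

    # Member offsets (in bytes) for the same structure in two kernel
    # builds; names and numbers are invented for illustration.
    rehost_kernel = {"net_device_stats": {"rx_packets": 0, "tx_packets": 8}}
    vendor_module = {"net_device_stats": {"rx_packets": 0, "tx_packets": 16}}

    def diff_layouts(a, b):
        """Yield (struct, member, offset_a, offset_b) for members whose
        offsets disagree between the two layouts."""
        for struct in a.keys() & b.keys():
            for member in a[struct].keys() & b[struct].keys():
                if a[struct][member] != b[struct][member]:
                    yield struct, member, a[struct][member], b[struct][member]

    for s, m, off_a, off_b in diff_layouts(rehost_kernel, vendor_module):
        print(f"{s}.{m}: kernel places it at {off_a}, module expects {off_b}")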
-
Alexander Balgavy (Independent), Marius Muench (University of Birmingham)
Embedded devices are a pervasive and at times invisible part of our lives. Due to this pervasiveness, security vulnerabilities may have severe consequences, particularly because many embedded devices are deployed in sensitive applications, such as the industrial, automotive, and medical sectors. Linux-based firmware has already been the subject of extensive research; however, a considerable share of embedded devices does not run Linux. Since current literature mostly focuses on Linux-based firmware, the ecosystem of non-Linux firmware is not well understood.
Therefore, in this paper, we aim to fill this gap in research with FIRMLINE, a pipeline suitable for a large-scale study of non-Linux-based firmware. Using this pipeline, we analyze 21,755 samples, obtained from previous studies and new sources. As part of a security assessment, we also investigate the presence of operating systems and memory protections for a subset of 756 non-Linux ARM samples and find that the majority do not make use of either. Our work will allow for further research on non-Linux firmware, such as refining generic analysis techniques or investigating the OS and deployed security facilities of such firmware in more detail.
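A first-cut OS identification pass in such a pipeline can be as simple as signature matching over the raw firmware image. A hedged sketch of that step (the marker strings are illustrative only; the actual FIRMLINE pipeline combines richer signatures with architecture and structural analysis):

    import pathlib

    # Illustrative marker strings only; real signatures would be
    # more robust than a substring match over the whole blob.
    OS_MARKERS = {
        b"FreeRTOS": "FreeRTOS",
        b"Zephyr": "Zephyr",
        b"ThreadX": "ThreadX",
        b"VxWorks": "VxWorks",
    }

    def guess_os(sample: pathlib.Path) -> str:
        blob = sample.read_bytes()
        for marker, name in OS_MARKERS.items():
            if marker in blob:
                return name
        return "unknown (possibly bare-metal)"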
-
Peter Lafosse (Owner and Co-Founder of Vector 35 Inc.)
Binary analysis serves as a foundational technique for a wide array of cybersecurity tasks, including vulnerability identification and malware analysis. While binary analysis techniques have evolved to become incredibly powerful, they are inherently bounded by the limitations of what can be inferred solely from the binary data within a file. This talk aims to provide an in-depth exploration of both the capabilities and the constraints of binary analysis, dissecting the fundamental goals that drive its usage and exploring potential solutions to these constraints. We will investigate the common objectives of binary analysis, such as code understanding, bug hunting, and threat analysis, and evaluate how these goals often remain unmet when confined to mere binary introspection. This talk will argue that while binary analysis is indispensable, it cannot be the sole methodology employed for a comprehensive solution. The presentation will advocate for the integration of external data sources, contextual information, runtime behavior, and, most importantly, machine learning and large language models as essential components for enriching the output of binary analysis tools. By fusing binary data with external inputs, we can transcend the inherent limitations and offer a more nuanced, accurate, and actionable analysis for our users.
Bio: Peter LaFosse is an industry veteran. He started his journey 18 years ago at SI Government Solutions (later acquired by Raytheon), finding and exploiting software vulnerabilities, writing tools to the same effect, and running teams. He is a recipient of a coveted DEFCON Black Badge for Capture the Flag, where he was the offensive team leader. Having served his time as a defense contractor and being dissatisfied with the tools available for software reverse engineering, he co-founded Vector 35 with his business partners with the aim of building the next generation of decompiler. Eight years later, Binary Ninja stands as one of the most highly regarded products in the industry.
-
Romain Malmain (EURECOM), Andrea Fioraldi (EURECOM), Aurelien Francillon (EURECOM)
Despite QEMU’s popularity for binary-only fuzzing, the fuzzing community faces challenges like the proliferation of hard-to-maintain QEMU forks and the lack of an up-to-date, flexible framework well-integrated with advanced fuzzing engines. This leads to a gap in emulation-based fuzzing tools that are both maintainable and fuzzing-oriented.
To cope with this, we present LIBAFL QEMU, a library written in Rust that provides an interface for fuzzing-based emulation by wrapping QEMU, in both system and user mode. We focus on addressing the limitations of existing QEMU forks used in fuzzing by offering a well-integrated, maintainable, and up-to-date solution. In this paper, we detail the design, implementation, and practical challenges of LIBAFL QEMU, including its APIs and fuzzing capabilities, and we showcase the library’s use in two case studies: fuzzing an Android library and a Windows kernel driver.
We compare the fuzzers written for these two targets with the state of the art: AFL++ QEMU mode for the Android library, and KAFL for the Windows driver. For the former, we show that LIBAFL QEMU outperforms AFL++ QEMU mode in both speed and coverage. For the latter, despite KAFL being built on hardware-assisted virtualization rather than emulation, we show that we can run complex targets such as Windows and still reach comparable performance, with the overhead expected of a software emulator.
-
Arnau Gàmez-Montolio (City, University of London; Activision Research), Enric Florit (Universitat de Barcelona), Martin Brain (City, University of London), Jacob M. Howe (City, University of London)
Polynomials over fixed-width binary numbers (bytes, Z/2^wZ, bit-vectors, etc.) appear widely in computer science, including obfuscation and reverse engineering, program analysis, automated theorem proving, verification, error-correcting codes, and cryptography. As some fixed-width binary numbers do not have reciprocals, these polynomials behave differently from those normally studied in mathematics. In particular, polynomial equality is harder to determine; polynomials having different coefficients is not sufficient to show that they always compute different values. Determining polynomial equality is a fundamental building block for most symbolic algorithms. For larger widths or multivariate polynomials, checking all inputs is computationally infeasible. This paper presents a study of the mathematical structure of null polynomials (those that evaluate to 0 for all inputs) and uses this to develop efficient algorithms to reduce polynomials to a normalized form. Polynomials in such normalized form are equal if and only if their coefficients are equal. This is a key building block for more mathematically sophisticated approaches to a wide range of fundamental problems.
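A small worked example of a null polynomial makes the phenomenon concrete: over Z/2^8Z, the product x(x+1) is always even, so 128·x·(x+1) = 128x^2 + 128x is divisible by 256 for every x. Its coefficients are nonzero, yet it computes the same function as the zero polynomial. A brute-force check for w = 8 (feasible only because the width is tiny):

    # Null polynomial over Z/2^8Z: 128*x**2 + 128*x has nonzero
    # coefficients but evaluates to 0 for every 8-bit input, so
    # coefficient inequality does not imply functional inequality.
    W = 8
    MOD = 1 << W  # 256

    def p(x: int) -> int:
        return (128 * x * x + 128 * x) % MOD

    assert all(p(x) == 0 for x in range(MOD))
    print("128x^2 + 128x is a null polynomial mod 2^8")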
-
Chloe Fortuna (STR), JT Paasch (STR), Sam Lasser (Draper), Philip Zucker (Draper), Chris Casinghino (Jane Street), Cody Roux (AWS)
Modifying a binary program without access to the original source code is an error-prone task. In many cases, the modified binary must be tested or otherwise validated to ensure that the change had its intended effect and no others—a process that can be labor-intensive. This paper presents CBAT, an automated tool for verifying the correctness of binary transformations. CBAT’s approach to this task is based on a differential program analysis that checks a relative correctness property over the original and modified versions of a function. CBAT applies this analysis to the binary domain by implementing it as an extension to the BAP binary analysis toolkit. We highlight several features of CBAT that contribute to the tool’s efficiency and to the interpretability of its output. We evaluate CBAT’s performance by using the tool to verify modifications to three collections of functions taken from real-world binaries.
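As a toy illustration of the differential idea (not CBAT's actual interface, which operates on BAP-lifted binary code): model the original and modified computations over bit-vectors and ask a solver whether any input distinguishes them. A sketch using the z3 Python bindings, with invented example functions standing in for the two binary versions:

    # pip install z3-solver
    from z3 import BitVec, Not, Solver, sat

    x = BitVec("x", 32)
    original = x * 2   # original function: multiply by two
    patched = x << 1   # patched function: strength-reduced shift

    s = Solver()
    s.add(Not(original == patched))  # search for a distinguishing input
    if s.check() == sat:
        print("versions differ, counterexample:", s.model())
    else:
        print("no input distinguishes the two versions")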
-
Yun Zhang (Hunan University), Yuling Liu (Hunan University), Ge Cheng (Xiangtan University), Bo Ou (Hunan University)
In the field of computer security, binary code similarity detection is crucial for identifying malicious software, copyright infringement, and software vulnerabilities. However, obfuscation techniques not only change the structure and features of the code but also effectively conceal its potentially malicious behaviors or infringing nature, thereby increasing the complexity of detection. Although methods based on graph neural networks have become the leading technology for code similarity detection, due to their effective processing and representation of code structures, they have limitations in dealing with obfuscated function matching, especially in scenarios involving control flow obfuscation. This paper proposes a method based on Graph Transformers aimed at improving the accuracy and efficiency of obfuscation-resilient binary code similarity detection. Our method utilizes Transformers to extract global information and employs three different encodings to capture the relative importance or influence of nodes in the control flow graph (CFG), the relative position between nodes, and the hierarchical relationships within the CFG. This method demonstrates significant adaptability to various obfuscation techniques and exhibits enhanced robustness and scalability when processing large datasets.
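One common way to realize a relative-position encoding of this kind, in the spirit of Graphormer-style graph Transformers, is to add a learned bias indexed by the shortest-path distance between CFG nodes to the attention scores; whether this matches the paper's exact construction is an assumption, and all names below are illustrative:

    import torch

    def cfg_attention(x, spd, w_q, w_k, w_v, dist_bias):
        """Single-head attention over CFG node features where each pair's
        score is shifted by a learned bias indexed by their shortest-path
        distance (a relative-position encoding for graphs)."""
        q, k, v = x @ w_q, x @ w_k, x @ w_v
        scores = (q @ k.T) / (k.shape[-1] ** 0.5)
        scores = scores + dist_bias[spd]  # [n, n] bias from distance table
        return torch.softmax(scores, dim=-1) @ v

    # Toy usage: 4 CFG nodes, 8-dim features, distances clipped below 5.
    n, d = 4, 8
    x = torch.randn(n, d)
    spd = torch.randint(0, 5, (n, n))      # pairwise shortest-path distances
    w = [torch.randn(d, d) for _ in range(3)]
    out = cfg_attention(x, spd, *w, dist_bias=torch.randn(5))
    print(out.shape)  # torch.Size([4, 8])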