Total Recall? How Good are Static Call Graphs Really? - Companion Artifact

There is a newer version of the record available.

Published March 31, 2024 | Version 1.0.0
Software Open

  • 1. Technische Universität Darmstadt
  • 2. ATHENE
  • 3. Technische Universität Dortmund
  • 4. hessian.AI

Description

This artifact holds a pipeline that captures a dynamic callgraph for a JVM program and a given set of inputs (input corpus). This dynamic callgraph can then be used as a baseline to compute precision and recall of a defined set of static callgraphs.

To obtain a high-quality dynamic baseline, the pipeline provides four techniques for creating a suitable input corpus:

  1. Base Seed Corpus: Pre-existing input corpora found online, used without any modification.
  2. Seed Corpus: Manual additions to the Base Seed Corpus, derived from inspecting the coverage values.
  3. Fuzzing: A coverage-guided fuzzer (Jazzer) generates program inputs from scratch.
  4. Fuzzing Seed: Jazzer generates new inputs using the Seed Corpus as a starting point. This combines all of the aforementioned techniques, and we found it best suited for producing high-quality dynamic callgraphs.

The pipeline evaluates precision and recall for the following fixed set of static callgraphs:

  • OPAL: CHA, RTA, 0-CFA
  • WALA: CHA, RTA, 0-CFA
  • Soot: CHA
  • Doop: 0-CFA

Numerical values for precision and recall are computed for every static callgraph and every project. We further include scripts that visualize those values for our set of four programs.
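
As an illustration of the comparison described above, the following minimal sketch computes precision and recall for one static callgraph against a dynamic baseline. It assumes that both graphs are represented as sets of (caller, callee) edges and uses the standard set-based definitions; the function name, the edge representation, and the example method names are hypothetical and are not taken from the artifact, whose actual definitions (including the bounds discussed in the supplementary material) may differ.

```python
from typing import Iterable, Tuple

# Assumption: an edge is a (caller, callee) pair of method signatures.
Edge = Tuple[str, str]


def precision_recall(static_edges: Iterable[Edge],
                     dynamic_edges: Iterable[Edge]) -> Tuple[float, float]:
    """Compare a static callgraph against a dynamic baseline.

    precision = |static ∩ dynamic| / |static|
    recall    = |static ∩ dynamic| / |dynamic|
    """
    static_set = set(static_edges)
    dynamic_set = set(dynamic_edges)
    confirmed = static_set & dynamic_set  # edges present in both graphs

    # Guard against empty graphs to avoid division by zero.
    precision = len(confirmed) / len(static_set) if static_set else 0.0
    recall = len(confirmed) / len(dynamic_set) if dynamic_set else 0.0
    return precision, recall


# Hypothetical example data, not taken from the evaluated programs.
static_graph = {
    ("A.main", "B.foo"),
    ("A.main", "B.bar"),
    ("B.foo", "C.baz"),
    ("B.bar", "C.qux"),
}
dynamic_graph = {
    ("A.main", "B.foo"),
    ("B.foo", "C.baz"),
    ("B.foo", "D.reflective"),  # observed at runtime, missed statically
}

precision, recall = precision_recall(static_graph, dynamic_graph)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the dynamic baseline only contains calls observed for the given input corpus, a static edge missing from the baseline is not necessarily infeasible, which is why the quality of the corpus matters for interpreting these numbers.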
 
The artifact consists of three archives:
 
  1. total_recall_paper_supplementary.zip: Holds supplementary material for our paper, including proofs of bounds on precision and recall, as well as additional visualizations.
  2. total_recall_artifact.zip: Holds the implementation of our pipeline and most of the data generated for our evaluation. A detailed description of how to use this artifact can be found in the enclosed README.md file.
  3. total_recall_artifact_supplementary.zip: Holds supplementary data for our artifact. This may be helpful if you do not have access to the computing resources required to compute static callgraphs. Installation instructions can be found in the enclosed README.md file.

Files

Files (1.1 GB)

Checksum                              Size
md5:1708bb8481102a93bd795978a9c00beb  163.5 MB
md5:a249bd4d837f67310bfe377d0efa3911  916.6 MB
md5:3b8d2c4e659bfb4b4cd64e7f95dad0f8  3.7 MB
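
The MD5 checksums listed above can be used to check that a downloaded archive is intact. The following is a minimal sketch using Python's standard hashlib module; the helper names are hypothetical, and because this listing does not show which checksum belongs to which archive, the path and checksum passed in are left to the reader.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (hypothetical local path; pair it with the matching checksum from the record):
# ok = matches_checksum("downloads/archive.zip", "md5:<checksum from the listing>")
# print("checksum OK" if ok else "checksum mismatch, download again")
```

Reading in chunks keeps memory use small even for the largest archive, which is close to 1 GB.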

Additional details

Related works

Is supplement to
Conference paper: 10.1145/3650212.3652114 (DOI)

Software

Programming language
Java, Python