This is our attempt at providing a simple means of interacting with the CERT Vulnerability Data Archive. At this time we have no plans to provide any full-fledged applications. Instead, we offer a simple VulDb() Python class that can load the vulnerability data contained in the archive. It's up to you to decide how to use it. Anyone with basic Python skills should be able to make use of this package.
Get virtualenv if you don't already have it:
$ pip install virtualenv
Get the Vulnerability Data Archive Tools:
git clone https://github.com/CERTCC-Vulnerability-Analysis/Vulnerability-Data-Archive-Tools.git
We'll assume that you now have the code in ./Vulnerability-Data-Archive-Tools.
Change into the directory with the code:
cd ./Vulnerability-Data-Archive-Tools/src
Create a virtualenv:
$ cd Vulnerability-Data-Archive-Tools
$ virtualenv vuldata_demo_env
New python executable in vuldata_demo_env/bin/python
Installing setuptools, pip, wheel...done.
Activate it:
$ . vuldata_demo_env/bin/activate
Install requirements:
(vuldata_demo_env)$ pip install -r requirements.txt
Install the CERT Vulnerability Data Archive Tools package:
(vuldata_demo_env)$ python setup.py install
You can clone the Vulnerability Data Archive using git:
git clone https://github.com/CERTCC-Vulnerability-Analysis/Vulnerability-Data-Archive.git
Or just download and unzip this:
https://github.com/CERTCC-Vulnerability-Analysis/Vulnerability-Data-Archive/archive/master.zip
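If you would rather script the download-and-unzip step, here is a minimal sketch using only the Python standard library. The function names extract_zip_bytes and fetch_archive are made up for this illustration and are not part of the tools package; the only thing taken from this README is the archive URL.

```python
import io
import zipfile

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

# URL copied from the README text above.
ARCHIVE_URL = ("https://github.com/CERTCC-Vulnerability-Analysis/"
               "Vulnerability-Data-Archive/archive/master.zip")


def extract_zip_bytes(zip_bytes, dest_dir):
    """Unpack an in-memory zip archive into dest_dir; return the member names."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


def fetch_archive(dest_dir, url=ARCHIVE_URL):
    """Download the zip from url and unpack it into dest_dir (needs network)."""
    response = urlopen(url)
    try:
        return extract_zip_bytes(response.read(), dest_dir)
    finally:
        response.close()

# Hypothetical usage (not run here because it needs network access):
#     fetch_archive(".")
```

GitHub archive zips normally unpack into a top-level directory named after the repository and branch, so the data would likely end up under Vulnerability-Data-Archive-master/data rather than Vulnerability-Data-Archive/data. Adjust the path you pass to the demo accordingly.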
Skip this section if you got the data from GitHub as described immediately above.
If you happen to have the JSON files from https://www.cert.org/download/vul_data_archive/ and would rather use those, put the CERT_vul_reports.json and CERT_vendor_records.json files in a directory and run:
cert_vuldata_splitexport --inpath=path/to/data --outpath=path/to/output
Then you should be able to run the demo below on that data. We recommend using the data from GitHub, though.
Try the demo:
$ cert_vuldata_demo --help
usage: cert_vuldata_demo [-h] datapath
positional arguments:
  datapath    path to CERT Vulnerability Data Archive
optional arguments:
  -h, --help  show this help message and exit
Ok, so you have to point it at the data:
(vuldata_demo_env)$ cert_vuldata_demo Vulnerability-Data-Archive/data
Watch it go. The demo prints a series of reports to show what you can do with this data:
- a count of the vul records read
- a list of all records for which "Google" appears as an affected vendor
- a count of vulnerability reports created by year
- a breakdown of how many CVE IDs are associated with each vulnerability report
Abridged sample output:
66316
### Google Vuls by Date ###
2002-02-08 VU#864643 SSL 3.0 and TLS 1.0 allow chosen plaintext attack in CBC modes
2005-12-27 VU#181038 Microsoft Windows Metafile handler SETABORTPROC GDI Escape vulnerability
...
2015-10-13 VU#943167 Voice over LTE implementations contain multiple vulnerabilities
2016-01-19 VU#916896 Oracle Outside In 8.5.2 contains multiple stack buffer overflows
Year, NumVulReportsCreated
1998, 639
1999, 765
...
2015, 188
2016, 23
NumCVEs, VulCount
0, 44848
1, 19863
...
63, 1
66, 1
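The last two reports are simple tallies, and you can build similar ones yourself. The sketch below is not the demo's actual code. It assumes each record is a plain dict, and the field names "DateCreated" and "CVEIDs" are guesses made for illustration, so check a real record for the actual keys. The sample records are invented placeholders, not data from the archive.

```python
from collections import Counter

# Invented placeholder records; IDs, dates, and CVE values are not real data.
SAMPLE_RECORDS = {
    "VU#000001": {"DateCreated": "2002-02-08", "CVEIDs": []},
    "VU#000002": {"DateCreated": "2002-11-30", "CVEIDs": ["CVE-0000-0001"]},
    "VU#000003": {"DateCreated": "2016-01-19", "CVEIDs": []},
}


def reports_per_year(records, date_key="DateCreated"):
    """Count records by the four-character year prefix of an ISO-style date."""
    counts = Counter()
    for record in records.values():
        date = record.get(date_key)
        if date:
            counts[date[:4]] += 1
    return counts


def cve_count_histogram(records, cve_key="CVEIDs"):
    """Count how many records carry 0, 1, 2, ... CVE IDs."""
    counts = Counter()
    for record in records.values():
        counts[len(record.get(cve_key) or [])] += 1
    return counts


print("Year, NumVulReportsCreated")
for year, total in sorted(reports_per_year(SAMPLE_RECORDS).items()):
    print("%s, %d" % (year, total))

print("NumCVEs, VulCount")
for num_cves, total in sorted(cve_count_histogram(SAMPLE_RECORDS).items()):
    print("%d, %d" % (num_cves, total))
```

Both functions only call values() on their argument, so any dict-like collection of records should work in place of the sample.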
This is where your creativity comes in. Want to see all the vuls we cataloged with prime numbered CVEs? How about counting how many times we've said "We are currently unaware of a practical solution to this problem."? You've got ideas for analysis and questions to answer. That's why you're reading this. So to get started, you can follow the examples in cert_vuldata/demo.py, or just start with something like:
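from cert_vuldata.vuldb import VulDb
vulrecords = VulDb('Vulnerability-Data-Archive/data')
vulrecords.load()
# a VulDb object is basically a dict
# (iteritems() is the Python 2 dict method; plain Python 3 dicts use items() instead)
for vu_id, record in vulrecords.iteritems():
    do_something_with(record)  # placeholder: substitute your own analysis function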
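As one possible do_something_with, here is a sketch of the phrase-counting idea mentioned above. It makes no assumption about field names: it walks every string inside each record, however deeply nested. The helper names iter_strings and count_records_mentioning are invented for this example, and the sample records are placeholders rather than archive data.

```python
try:
    string_types = (basestring,)  # Python 2
except NameError:
    string_types = (str,)  # Python 3

PHRASE = "We are currently unaware of a practical solution to this problem."


def iter_strings(value):
    """Yield every string found in a nested structure of dicts and lists."""
    if isinstance(value, string_types):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            for text in iter_strings(item):
                yield text
    elif isinstance(value, (list, tuple)):
        for item in value:
            for text in iter_strings(item):
                yield text


def count_records_mentioning(records, phrase=PHRASE):
    """Return the number of records with at least one string containing phrase."""
    return sum(
        1 for record in records.values()
        if any(phrase in text for text in iter_strings(record))
    )


# Invented placeholder records with arbitrary field names.
sample = {
    "VU#000001": {"Resolution": PHRASE},
    "VU#000002": {"Notes": [{"Text": "Apply an update. " + PHRASE}]},
    "VU#000003": {"Resolution": "Apply an update."},
}
print(count_records_mentioning(sample))
```

If a loaded VulDb object really does behave like a dict, passing it in place of sample should work, though that is untested here.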
Did you find something interesting in the data? Did you come up with some cool way of slicing it or remixing it and you want to share? You can tweet us @certcc. Or send mail to cert@cert.org.
If you find a problem with the data or the tools, please create an issue report in the appropriate repository.
Please be aware that we offer no formal support. However, we may respond to questions and feedback sent to cert@cert.org with the tag INFO#365908 in the subject.