CGC Corpus
24 Apr 2017 | CGC, CTF
Today, we released a browsable Cyber Grand Challenge corpus platform. Similar to the DARPA IDS dataset released in 1998 and 1999, we’re hoping this corpus can help inform cybersecurity research for the next decade. The corpus provides a human-browsable interface to the happenings of DARPA’s Cyber Grand Challenge, in both the qualifying event (CQE) and the final event (CFE).
A handful of unique components make up the corpus and could be of interest:
- A CQE and CFE scoring breakdown.
- A per-team index of competitor submissions.
- Challenges broken out by author-identified CWE for CQE and CFE, and by CFE difficulty rating.
- Submissions from competitors for CQE and CFE broken out by challenge.
- A unified source tree for the applications that make up the challenge.
- 3D renderings of POVs executing in CFE using HAXXIS, the CGC data visualization platform. Example: ForAllSecure scoring against Disekt in Round 25 on CROMU_00095.
- Execution taint traces, taken from the CGC forensics platform, that follow the memory involved in proving a vulnerability for every successful POV. Example: Shellphish targeted TECHx running CROMU_00098 in Round 47 (the CrackAddr bug).
A browsable disassembly of each submission is in progress. The corpus will be updated with that information as we build the data.
For more information about the corpus, please see the CGC Corpus Paper. The corpus is available as a git repo and is also browsable on our website. If you have any comments or questions, please email us.