Commit 9c2fa31

feat(eval): xscode generation (#10)
* feat(eval): xscode generation
* fix: typo
* fix: typo
* fix: typos in annotate_utils
* fix: remove old header
1 parent 8f4929c commit 9c2fa31

9 files changed

Lines changed: 1492 additions & 7 deletions

eval/compile_xscode/README.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+### XSCode
+[XSCode](https://huggingface.co/datasets/purpcode/XSCode) is an overrefusal benchmark for secure code generation. While benchmarks like CyberSecEval FRR prompts are lengthy and specifically target malicious cyber activities, XSCode contains 589 short and harmless code-generation prompts that do not contain any built-in code security bias.
+
+### XSCode Generation
+
+```
+export PYTHONPATH=$PYTHONPATH:$(pwd)
+python eval/compile_xscode/main.py
+```
+
+Generation requires AWS bedrock access.
+
+### XSCode Eval
+
+```
+export PYTHONPATH=$PYTHONPATH:$(pwd)
+python eval/main.py --task "purpcode/xscode" --model <MODEL_NAME>
+```
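
The eval command in the README above can also be driven from a script, for example to loop over several models. The following is a minimal sketch, not part of this commit: the helper name `build_eval_invocation` and its `repo_root` parameter are hypothetical, and only the script path and the `--task` / `--model` flags come from the README.

```python
import os
import sys


def build_eval_invocation(model_name, repo_root=None):
    """Return (argv, env) mirroring the two shell lines under "XSCode Eval".

    Hypothetical helper for illustration; it only builds the command and
    does not run anything.
    """
    repo_root = repo_root or os.getcwd()
    env = dict(os.environ)
    # Counterpart of: export PYTHONPATH=$PYTHONPATH:$(pwd)
    existing = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = os.pathsep.join(p for p in (existing, repo_root) if p)
    # Counterpart of: python eval/main.py --task "purpcode/xscode" --model <MODEL_NAME>
    argv = [
        sys.executable,
        "eval/main.py",
        "--task",
        "purpcode/xscode",
        "--model",
        model_name,
    ]
    return argv, env
```

The returned pair can be passed to `subprocess.run(argv, env=env, cwd=repo_root, check=True)` from the repository root.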

eval/compile_xscode/annotate.py renamed to eval/compile_xscode/annotate_utils/annotate.py

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 #
 # SPDX-License-Identifier: Apache-2.0
 
-# pip install rich datasets
 import argparse
 import json
 import os

eval/compile_xscode/gather.py renamed to eval/compile_xscode/annotate_utils/gather.py

Lines changed: 1 addition & 6 deletions
@@ -2,11 +2,6 @@
 #
 # SPDX-License-Identifier: Apache-2.0
 
-"""
-Script to process JSONL files and filter records where malicious, unnatural,
-and too_simple are all "disagree", then display statistics.
-"""
-
 import json
 from collections import Counter
 from pathlib import Path
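
The header removed in this hunk described the script as filtering JSONL records where `malicious`, `unnatural`, and `too_simple` are all "disagree". A minimal sketch of such a filter follows. It assumes the three labels are flat top-level string fields, which the diff does not confirm; the function names and the sample records are hypothetical.

```python
import json

# Field names taken from the removed docstring; their location in the
# record (top level) is an assumption made for this sketch.
LABEL_FIELDS = ("malicious", "unnatural", "too_simple")


def keep_record(record):
    """True when every label field equals "disagree"."""
    return all(record.get(field) == "disagree" for field in LABEL_FIELDS)


def filter_jsonl_lines(lines):
    """Parse non-empty JSONL lines and keep the records that pass the filter."""
    records = (json.loads(line) for line in lines if line.strip())
    return [record for record in records if keep_record(record)]


# Made-up sample data, not taken from the dataset.
sample_lines = [
    '{"id": 1, "malicious": "disagree", "unnatural": "disagree", "too_simple": "disagree"}',
    '{"id": 2, "malicious": "agree", "unnatural": "disagree", "too_simple": "disagree"}',
    "",
]
kept = filter_jsonl_lines(sample_lines)
```

With the sample above, only the record with `id` 1 is kept, because the second record has `malicious` set to "agree".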
@@ -20,7 +15,7 @@ def analyze_records(records):
         return None
     print(f"Total records: {len(records)}")
     print(
-        f"{len(set([ record['original_prompt']['additional_context']['cwe_id'] for record in records ]))} unique CWEs"
+        f"{len({record['original_prompt']['additional_context']['cwe_id'] for record in records})} unique CWEs"
     )
 
     # distribution per language
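
The one-line change in this hunk swaps `set([...])` for a set comprehension. Both count distinct `cwe_id` values, but the comprehension skips the intermediate list. A small sketch showing that the two forms agree is below. The function names and sample records are made up, and the string format of `cwe_id` is an assumption; only the nested key path comes from the diff.

```python
# Made-up records that follow the key path used in the diff.
records = [
    {"original_prompt": {"additional_context": {"cwe_id": "CWE-79"}}},
    {"original_prompt": {"additional_context": {"cwe_id": "CWE-89"}}},
    {"original_prompt": {"additional_context": {"cwe_id": "CWE-79"}}},
]


def count_unique_cwes_old(records):
    # Old form: build a list first, then convert it to a set.
    return len(
        set([r["original_prompt"]["additional_context"]["cwe_id"] for r in records])
    )


def count_unique_cwes_new(records):
    # New form: a set comprehension builds the set directly.
    return len({r["original_prompt"]["additional_context"]["cwe_id"] for r in records})


old_count = count_unique_cwes_old(records)
new_count = count_unique_cwes_new(records)
```

Both counts are 2 for the sample records, since "CWE-79" appears twice.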

eval/compile_xscode/split.py renamed to eval/compile_xscode/annotate_utils/split.py

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 #
 # SPDX-License-Identifier: Apache-2.0
 
+
 import argparse
 import json
 from pathlib import Path
