Commit ceb520f

Update README with detection metrics and evaluations
Added sections for Spam Detection, Prompt Injection Detection, and PII Detection with performance metrics.
1 parent 2605a5e commit ceb520f

1 file changed

Lines changed: 24 additions & 0 deletions

README.md

@@ -38,6 +38,30 @@ Benchmarked using [CHI 2025 "Lost in Moderation"](https://arxiv.org/html/2503.01

---

### Spam Detection

**Positive class (1) = spam (should be flagged).** Precision reflects false positives; recall reflects false negatives.

| System | Balanced Accuracy | Precision | Recall | F1 | FP | FN | n | Dataset | Eval |
|--------|------------------:|----------:|-------:|---:|---:|---:|---:|---------|------|
| **LocalMod** | **0.9988** | **0.9861** | **1.0000** | **0.9930** | **1** | **0** | 500 | [UCI SMS Spam Collection](https://archive.ics.uci.edu/ml/datasets/sms+spam+collection) (`ucirvine/sms_spam`, train) | `evaluation/chi2025_benchmark.py --task spam --samples 500` |

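
As a cross-check on the spam row, the reported metrics can be recomputed from confusion-matrix counts. This is a minimal sketch, not code from `evaluation/chi2025_benchmark.py`: `binary_metrics` is a hypothetical helper, and TP = 71 and TN = 428 are not stated in the table. They are inferred here from the reported precision, FP = 1, FN = 0, and n = 500.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute positive-class metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # lowered by false positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # lowered by false negatives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall of the negative class
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Assumption: TP and TN are inferred from the table, not reported in it.
spam = binary_metrics(tp=71, fp=1, fn=0, tn=428)
print({name: round(value, 4) for name, value in spam.items()})
```

With these inferred counts, the rounded values agree with the row above (balanced accuracy 0.9988, precision 0.9861, recall 1.0, F1 0.993).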
### Prompt Injection Detection

**Positive class (1) = injection (should be flagged).** These numbers are the **overall** metrics across ProtectAI validation subsets (with per-subset sample caps).

| System | Balanced Accuracy | Precision | Recall | F1 | FP | FN | n | Dataset | Eval |
|--------|------------------:|----------:|-------:|---:|---:|---:|---:|---------|------|
| **LocalMod** | **0.815** | **0.657** | **0.932** | **0.771** | **199** | **28** | 1069 | `protectai/prompt-injection-validation` (subset-capped) | `evaluation/chi2025_benchmark.py --task prompt_injection --samples 200` |

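
The prompt-injection row reports FP, FN, and n but not TP or TN. One way to sanity-check it is to search for integer counts that reproduce every reported metric. The sketch below is illustrative only: `consistent_counts` is a hypothetical helper, and it assumes the table values were rounded (not truncated) to three decimals.

```python
def consistent_counts(fp: int, fn: int, n: int, reported: dict, digits: int = 3) -> list:
    """Return every (tp, tn) pair whose rounded metrics match the reported ones."""
    matches = []
    for tp in range(n - fp - fn + 1):
        tn = n - fp - fn - tp
        if (tp + fp) == 0 or (tp + fn) == 0 or (tn + fp) == 0:
            continue  # skip splits where a metric is undefined
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        specificity = tn / (tn + fp)
        computed = {
            "balanced_accuracy": (recall + specificity) / 2,
            "precision": precision,
            "recall": recall,
            "f1": 2 * tp / (2 * tp + fp + fn),
        }
        if all(round(computed[name], digits) == value for name, value in reported.items()):
            matches.append((tp, tn))
    return matches


# Values copied from the LocalMod prompt-injection row.
reported = {"balanced_accuracy": 0.815, "precision": 0.657, "recall": 0.932, "f1": 0.771}
matches = consistent_counts(fp=199, fn=28, n=1069, reported=reported)
print(matches)
```

Under these assumptions the search yields a single consistent pair, TP = 382 and TN = 460, which suggests the reported row is internally consistent. Precision alone would not settle this, since TP = 381 also rounds to 0.657 but gives an F1 of 0.770.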
### PII Detection

**Positive class (1) = contains PII (should be flagged).** This is a **synthetic sanity-check** benchmark (not a public curated dataset).

| System | Balanced Accuracy | Precision | Recall | F1 | FP | FN | n | Dataset | Eval |
|--------|------------------:|----------:|-------:|---:|---:|---:|---:|---------|------|
| **LocalMod** | **1.0000** | **1.0000** | **1.0000** | **1.0000** | **0** | **0** | 2000 | `synthetic_pii_v1` (balanced) | `evaluation/chi2025_benchmark.py --task pii --samples 2000` |

## Installation

```bash
