feat(eval): WAF evaluation lab + M6-01 evaluation plan #225
Merged
Closes #133 (M6-01 evaluation plan document). Adds a reproducible thesis evaluation lab with real target applications (OWASP Juice Shop, DVWA, WordPress) behind guard-proxy, automated test scenarios, a dedicated Makefile, and the methodology document required before experiments run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| 33587452 | Triggered | Generic Password | 8706635 | benchmarks/lab/docker-compose.targets.yml |
| 33587451 | Triggered | Generic Password | 8706635 | benchmarks/lab/docker-compose.targets.yml |
| 33587448 | Triggered | Generic Password | 8706635 | benchmarks/lab/docker-compose.targets.yml |
| 33587453 | Triggered | Generic Password | 8706635 | benchmarks/lab/docker-compose.targets.yml |
| 33587450 | Triggered | Generic Password | 8706635 | benchmarks/lab/.env.example |
| 33587449 | Triggered | Generic Password | 8706635 | benchmarks/lab/.env.example |
| 33587454 | Triggered | Generic Password | 8706635 | benchmarks/lab/.env.example |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace your secrets and store them safely, following best practices for secret storage.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection as a pre-commit hook to catch secrets before they leave your machine and to ease remediation.
Closes #133 (M6-01 evaluation plan document). Adds a reproducible thesis evaluation lab with real target applications (OWASP Juice Shop, DVWA, WordPress) behind guard-proxy, automated test scenarios, a dedicated Makefile, and runner scripts. The evaluation plan document lives in the thesis repo (dsw-latex-thesis).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix root Makefile passing empty RUN_ID=/TARGET_VHOST= to benchmarks/Makefile, overriding ?= defaults and creating run-/ directories (use $(if ...) guards)
- Fix setup-lab.sh ensure_policy: heredoc+here-string conflict caused Python to receive JSON as program source; pass response via POLICY_RESPONSE env var instead
- Fix run-zap.sh and run-nuclei.sh: export OUT_DIR (and TARGET_VHOST) before Python heredoc so os.environ lookups find the correct path, not "."
- Fix run-zap.sh: inject Host: header via ZAP replacer -config flags instead of ignored -e env var; replace two-step hook/fallback with single reliable run
- Fix benchmarks/Makefile results target: strip run- prefix when reading latest dir name to avoid constructing run-run-<id>/results.csv path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
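Two of the fixes above concern how a shell script hands data to an embedded Python snippet: when the heredoc already supplies the program source on stdin, the data has to arrive through exported environment variables. A minimal Python sketch of the receiving side follows. The variable names `POLICY_RESPONSE` and `OUT_DIR` and the `"."` fallback come from the commit message; the function names are hypothetical and this is not the code in the PR. The last helper mirrors the `run-` prefix fix, written in Python rather than Make purely for illustration.

```python
import json
import os


def read_policy_response(environ=os.environ):
    """Parse the JSON the shell stored in POLICY_RESPONSE; None if unset or empty."""
    raw = environ.get("POLICY_RESPONSE", "")
    return json.loads(raw) if raw else None


def resolve_out_dir(environ=os.environ):
    """Return OUT_DIR, falling back to "." when the shell did not export it."""
    return environ.get("OUT_DIR", ".")


def run_id_from_dir(dir_name):
    """Strip one leading "run-" so a caller can rebuild run-<id>/results.csv."""
    prefix = "run-"
    return dir_name[len(prefix):] if dir_name.startswith(prefix) else dir_name
```

Passing the environment as a parameter keeps the helpers easy to exercise with a plain dict, without touching the real process environment.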
The field is not yet implemented in the backend. Drop it from both the baseline and PL2 policy JSON bodies in setup-lab.sh and remove the now-unused LAB_POLICY_OUTBOUND_THRESHOLD variables (already commented out in .env.example).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes #133 (M6-01 evaluation plan document). Defines WAF evaluation methodology: scenarios (go-ftw CRS corpus, ZAP, Nuclei, wrk load), metrics (TPR/FPR, p50/p95/p99 latency, RPS degradation, memory), success criteria, hardware spec (Proxmox LXC), and threats to validity. Lab source: benchmarks/lab/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
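The metrics named in the plan can be made concrete with a short Python sketch. TPR = TP / (TP + FN) and FPR = FP / (FP + TN) are the standard definitions. The nearest-rank percentile and the relative RPS drop are each one common choice, assumed here for illustration, since the commit message does not say which estimators the plan uses. All function names are hypothetical.

```python
import math


def rate(hits, misses):
    """hits / (hits + misses), or 0.0 when there are no samples."""
    total = hits + misses
    return hits / total if total else 0.0


def tpr(true_pos, false_neg):
    """Share of attack requests that were blocked."""
    return rate(true_pos, false_neg)


def fpr(false_pos, true_neg):
    """Share of benign requests that were blocked."""
    return rate(false_pos, true_neg)


def percentile(samples, p):
    """Nearest-rank percentile: the sample at rank ceil(p * n / 100)."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def rps_degradation(baseline_rps, waf_rps):
    """Relative throughput loss versus the baseline (0.2 means 20% fewer RPS)."""
    return (baseline_rps - waf_rps) / baseline_rps
```

For latency, `percentile(latencies_ms, 50)`, `percentile(latencies_ms, 95)`, and `percentile(latencies_ms, 99)` would give the p50/p95/p99 values, where `latencies_ms` is a hypothetical list of per-request latencies.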
Summary
- `benchmarks/lab/` with Docker Compose overlay, setup/teardown scripts, scenario configs, and runner scripts
- `benchmarks/Makefile` + root `Makefile` pass-throughs (`make eval-up`, `make eval-all`, …)
- Evaluation plan document (`evaluation-plan.md`) lives in the thesis repo (`dsw-latex-thesis`), committed there separately; `thesis/` is gitignored here

What's in `benchmarks/lab/`

- `docker-compose.targets.yml`
- `.env.example`
- `setup-lab.sh`: `ensure_vhost` from `setup-demo.sh`
- `teardown-lab.sh`: `down` / `down -v`
- `scenarios/crs-ftw/config.yaml`
- `scenarios/zap/`
- `scenarios/nuclei/nuclei.yaml`
- `scenarios/load/benign-mix.lua`
- `runners/lib.sh`
- `runners/run-ftw.sh`
- `runners/run-zap.sh`
- `runners/run-nuclei.sh`
- `runners/run-load.sh`
- `runners/collect-metrics.sh`: `summary.json` → `results.csv` + `report.json`

Test plan
- `cp deploy/demo/.env.example deploy/demo/.env && cp benchmarks/lab/.env.example benchmarks/lab/.env && git submodule update --init --recursive`
- `make eval-up`: all 5 vhosts healthy (demo-app, demo-api, juice.local, dvwa.local, wp.local)
- `curl -si -H 'Host: juice.local' 'http://127.0.0.1:8080/?q=1+UNION+SELECT+1--'` → `HTTP/1.1 403`
- `make eval-ftw`: go-ftw run completes, `benchmarks/results/run-*/ftw/summary.json` created
- `make eval-all`: all scenarios run, `results.csv` produced
- `make eval-results`: table prints without error
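
The `curl` smoke test in the plan can also be scripted. The sketch below is a Python equivalent that sends the same request with an explicit `Host` header and returns the status code. The function name `probe_status` and its defaults are assumptions for illustration, not part of this PR, and the expected 403 only holds when the lab is up and the proxy is blocking.

```python
import http.client


def probe_status(vhost, path, address="127.0.0.1", port=8080, timeout=5.0):
    """GET `path` from address:port with Host set to `vhost`; return the status code."""
    conn = http.client.HTTPConnection(address, port, timeout=timeout)
    try:
        # Supplying Host explicitly stops http.client from adding its own Host header,
        # so the proxy routes the request by virtual host, as curl -H 'Host: ...' does.
        conn.request("GET", path, headers={"Host": vhost})
        return conn.getresponse().status
    finally:
        conn.close()
```

With the lab running, `probe_status("juice.local", "/?q=1+UNION+SELECT+1--")` would be expected to return 403, matching the `HTTP/1.1 403` line in the test plan.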