Expand roadmap with semantic detection, response analysis, and live validation #31

Open
ksek87 wants to merge 5 commits into main from claude/plan-fuzzd-project-88AMD

ksek87 (Owner) commented May 14, 2026

Summary

  • Adds three new planned milestones to the roadmap:
    • v0.7 (semantic detection layer): embedding-based similarity pass to close the Message Hijacking (40%) and Privacy Leakage (59.8%) gaps in the MCPTox benchmark
    • v0.8 (tool output / response analysis): scans CallToolResult content for exfiltration indicators, covering attacks where the description is clean but the server poisons the agent through its responses
    • v1.0 (live attack validation, LLM-in-the-loop): runs a real LLM agent against corpus-instrumented servers and measures the actual exploit success rate
  • Renumbers downstream milestones (chain fuzzer → v0.9, reporter → v1.1, protocol fuzzer → v1.2)
  • Adds a Milestone detail section with rationale and scope for each new stage
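
The v0.7 item describes an embedding-based similarity pass. As a rough illustration of what such a pass could compare, here is a minimal sketch of cosine similarity over embedding vectors. It assumes embeddings are already available as `f32` slices; `cosine_similarity`, `best_match`, and the threshold parameter are hypothetical names and are not taken from the fuzzd codebase.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns None when the lengths differ, the input is empty,
/// or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Hypothetical helper: index and score of the most similar reference
/// embedding, provided the score reaches `threshold`.
fn best_match(
    candidate: &[f32],
    references: &[Vec<f32>],
    threshold: f32,
) -> Option<(usize, f32)> {
    references
        .iter()
        .enumerate()
        .filter_map(|(i, r)| cosine_similarity(candidate, r).map(|s| (i, s)))
        .filter(|(_, s)| *s >= threshold)
        .max_by(|x, y| x.1.total_cmp(&y.1))
}
```

In a design like this, the reference vectors would be embeddings of known-malicious descriptions from the corpus, and the threshold would need tuning against labelled fixtures to keep false positives in check.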

Test plan

  • README renders correctly with updated roadmap table and milestone detail section
  • No code changes — documentation only

https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk


Generated by Claude Code

claude added 5 commits May 14, 2026 03:03
…alidation milestones

Adds three new planned milestones:
- v0.7: Semantic/embedding detection layer to close Message Hijacking and Privacy Leakage gaps
- v0.8: Tool output/response analysis to detect server-side poisoning via CallToolResult
- v1.0: LLM-in-the-loop live attack validation measuring actual exploit success rate

Renumbers downstream milestones accordingly (chain fuzzer → v0.9, reporter → v1.1,
protocol fuzzer → v1.2) and adds milestone detail section explaining the rationale
for each new stage.
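
To make the v0.8 idea concrete, the following is a minimal sketch of scanning tool output for exfiltration indicators. `Content`, `Finding`, and `scan_result` are hypothetical stand-ins rather than the real MCP SDK types, and the indicator list simply reuses phrases named elsewhere in this PR; a real implementation would presumably share patterns with the existing scanner.

```rust
/// Hypothetical stand-in for one content item of a tool result.
/// The real MCP result type is richer; only text is inspected here.
enum Content {
    Text(String),
    Other,
}

/// One indicator hit inside a tool result.
#[derive(Debug, PartialEq)]
struct Finding {
    item_index: usize,
    indicator: &'static str,
}

/// Illustrative indicator phrases (lowercase), borrowed from this PR's pattern list.
const INDICATORS: &[&str] = &[
    "provide the contents of",
    "send to email",
    "/.env",
    ".cursor/mcp.json",
];

/// Scans every text item case-insensitively and records each indicator it contains.
fn scan_result(items: &[Content]) -> Vec<Finding> {
    let mut findings = Vec::new();
    for (item_index, item) in items.iter().enumerate() {
        if let Content::Text(text) = item {
            let lower = text.to_lowercase();
            for &indicator in INDICATORS {
                if lower.contains(indicator) {
                    findings.push(Finding { item_index, indicator });
                }
            }
        }
    }
    findings
}
```

Plain substring matching keeps the pass deterministic and cheap, at the cost of missing paraphrased instructions.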

https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
…, Python SDK, package scanning)

Adds five new planned milestones targeting friction points in how teams actually
build and adopt agentic tooling:
- v0.9: SARIF output + GitHub Security tab integration (moved earlier; multiplies reach)
- v0.10: GitHub Action published to Marketplace (one-line CI integration)
- v0.11: Package-level scanning (--package @scope/mcp-server, zero friction pre-adoption)
- v0.12: Python SDK + LangChain/LlamaIndex/AutoGen/LangGraph framework adapters
- v0.13: npx wrapper (no binary install for JS/TS teams)

Renumbers chain fuzzer to v0.14 and live validation to v1.0. Adds milestone detail
for each new stage. Updates Contributing section to reference current milestone names.

https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
Notes that v0.12 will use PyO3 + maturin to compile the Rust core as a native
Python extension module, with a thin Python adapter for framework conversions.
Same CI pipeline feeds binary and wheel releases with no code duplication.

https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
LLM-in-the-loop is model benchmarking (what MCPTox already does), not a fuzzer
feature — it would break determinism and add API cost to CI. Replaced with a
fuzzd validate subcommand: a thin reporting layer over the existing DescriptionScanner
that accepts labelled fixtures and outputs detection/FP rates. No new detection
code; replaces bench/run.sh as a first-class CLI surface.
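
As an illustration of the reporting arithmetic such a subcommand would perform, here is a minimal sketch that computes detection and false-positive rates over labelled fixtures. `Fixture`, `Rates`, and `evaluate` are hypothetical names, and the scanner is modelled as a plain closure because the actual DescriptionScanner interface is not shown in this PR.

```rust
/// Hypothetical labelled fixture: a tool description plus its ground-truth label.
struct Fixture {
    description: String,
    malicious: bool,
}

/// Rates are None when the corresponding class has no fixtures.
#[derive(Debug, PartialEq)]
struct Rates {
    detection_rate: Option<f64>,
    false_positive_rate: Option<f64>,
}

/// Runs `scanner` (true = flagged) over all fixtures and tallies the outcome.
fn evaluate<F: Fn(&str) -> bool>(fixtures: &[Fixture], scanner: F) -> Rates {
    let (mut tp, mut fn_, mut fp, mut tn) = (0u32, 0u32, 0u32, 0u32);
    for fixture in fixtures {
        match (fixture.malicious, scanner(&fixture.description)) {
            (true, true) => tp += 1,
            (true, false) => fn_ += 1,
            (false, true) => fp += 1,
            (false, false) => tn += 1,
        }
    }
    // Divide only when the denominator is non-zero.
    let ratio = |num: u32, den: u32| {
        if den == 0 { None } else { Some(f64::from(num) / f64::from(den)) }
    };
    Rates {
        detection_rate: ratio(tp, tp + fn_),      // TP / (TP + FN)
        false_positive_rate: ratio(fp, fp + tn),  // FP / (FP + TN)
    }
}
```

Because the scanner is deterministic, the same fixtures always yield the same rates, which is the property the commit message cites as the reason to prefer this over an LLM-in-the-loop run.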

https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
…rds, and 5 citations

Detection signals (+2, 11→13):
- message_hijacking: recipient substitution, BCC injection, proxy number framing
  (Invariant Labs WhatsApp PoC; Postmark npm real-world incident)
- unicode_obfuscation: U+200B/200C/200D/FEFF zero-width character injection
  to hide instructions from human reviewers (Noma Security; arxiv 2601.17549)
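
The unicode_obfuscation signal can be illustrated with a minimal sketch that locates the four code points listed above. `find_zero_width` and `strip_zero_width` are hypothetical helper names, not the functions added in this commit.

```rust
/// The zero-width code points named in the unicode_obfuscation signal.
const ZERO_WIDTH: [char; 4] = ['\u{200B}', '\u{200C}', '\u{200D}', '\u{FEFF}'];

/// Returns the byte offset and character of every zero-width code point in `text`.
fn find_zero_width(text: &str) -> Vec<(usize, char)> {
    text.char_indices()
        .filter(|(_, c)| ZERO_WIDTH.contains(c))
        .collect()
}

/// Returns `text` with the zero-width code points removed, which is
/// useful for showing a reviewer what the visible text looks like.
fn strip_zero_width(text: &str) -> String {
    text.chars().filter(|c| !ZERO_WIDTH.contains(c)).collect()
}
```

U+200D (zero-width joiner) also appears legitimately in emoji sequences and some scripts, so a detector along these lines would likely need context or an allowlist to avoid flagging benign descriptions.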

Patterns (+16, 86→102) across both new signals plus:
- /.env, .cursor/mcp.json added to credential_reference
- "provide the contents of", "send to email" added to exfiltration_mechanism

Corpus records (+4, 23→27):
- TPA-018: MCP-UPD parasitic toolchain .env exfiltration (arxiv 2509.06572)
- TPA-019: Trivial Trojans cross-tool SSH key theft (arxiv 2507.19880)
- TPA-020: Message hijacking via recipient substitution (Invariant Labs / Postmark)
- TPA-021: Unicode zero-width character obfuscation (Noma Security)

Citations (+5): MCP-UPD [^9], Trivial Trojans [^10], When MCP Servers Attack [^11],
Breaking the Protocol [^12], Noma Security invisible characters [^13]

All 103 tests pass. clippy and rustfmt clean.

https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk