Add dynamic model assignment with security hardening by churcho · Pull Request #3 · danpeg/bug-hunt

churcho · 2026-03-05T17:27:44Z

What changed

This PR adds dynamic model assignment to bug-hunt, letting users run Hunter, Skeptic, and Referee on different AI providers (Claude, Codex CLI, Gemini CLI). It also fixes several security and robustness issues discovered by running bug-hunt on itself with mixed providers.

Dynamic model assignment (SKILL.md, README.md)

Users can now assign providers per role via CLI flags:

/bug-hunt --hunter=codex --skeptic=claude --referee=gemini src/
/bug-hunt --preset=mixed src/

Presets provide named configurations:

claude (default) — all three roles run as Claude Code subagents, identical to current behavior
codex — all roles shell out to Codex CLI
gemini — all roles shell out to Gemini CLI
mixed — Hunter=Codex, Skeptic=Claude, Referee=Gemini

Individual --hunter=, --skeptic=, --referee= flags override any preset. With no flags, behavior is unchanged from the original (all Claude).
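The precedence described above (preset first, per-role flags on top, all-Claude when nothing is given) can be sketched as follows. This is a minimal illustration in Python, not the skill's actual logic, which lives as instructions in SKILL.md. `PRESETS` and `resolve_roles` are hypothetical names introduced here; the preset contents are copied from the list above.

```python
# Hypothetical sketch of preset + per-role override resolution.
# The preset table mirrors the presets listed in this PR description.
PRESETS = {
    "claude": {"hunter": "claude", "skeptic": "claude", "referee": "claude"},
    "codex": {"hunter": "codex", "skeptic": "codex", "referee": "codex"},
    "gemini": {"hunter": "gemini", "skeptic": "gemini", "referee": "gemini"},
    "mixed": {"hunter": "codex", "skeptic": "claude", "referee": "gemini"},
}


def resolve_roles(preset=None, **overrides):
    """Start from the preset (default: claude), then apply per-role flags."""
    roles = dict(PRESETS[preset or "claude"])
    for role, provider in overrides.items():
        if role not in roles:
            raise ValueError(f"unknown role: {role}")
        if provider is not None:  # a flag that was not given leaves the preset value
            roles[role] = provider
    return roles
```

Under these assumptions, `resolve_roles("mixed", hunter="claude")` keeps the mixed preset for Skeptic and Referee and only swaps the Hunter.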

Provider dispatch:

Claude roles use the Agent tool (general-purpose subagent) as before
Codex/Gemini roles write the prompt to a unique temp file (mktemp) and pipe it via stdin to the CLI (codex exec - / gemini -p -)
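The stdin-based dispatch for external providers might look roughly like the sketch below. It assumes the prompt has already been written to a file, and `dispatch_external` is a hypothetical helper name. The argument list for a real run would be whatever the PR specifies (for Codex, `codex exec -`); the sketch takes it as a parameter so it does not depend on any particular CLI being installed.

```python
import subprocess


def dispatch_external(argv, prompt_path):
    """Run an external CLI with the prompt file connected to its stdin.

    argv is a fixed list and no shell is involved, so nothing in the
    prompt (scan targets, report text) is ever parsed as a command.
    """
    with open(prompt_path, "rb") as prompt:
        result = subprocess.run(argv, stdin=prompt, capture_output=True)
    return result.returncode, result.stdout.decode("utf-8", errors="replace")
```

The design point is that the prompt travels as data on stdin, while the command line itself stays constant regardless of what the user asked to scan.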

Security and robustness fixes (found by self-scan)

After implementing model assignment, we ran /bug-hunt --hunter=codex --skeptic=claude --referee=codex on this repo. The adversarial review confirmed 7 real issues, all now fixed:

Critical — shell injection (BUG-1, BUG-2):
The original external CLI instructions interpolated scan targets and report content directly into shell command strings. A target path such as "src; rm -rf /", or report text containing shell metacharacters, could therefore execute arbitrary commands. Fixed by always passing prompt content via stdin/temp file and never inlining it into shell arguments.

Medium — tempfile collisions (BUG-3):
The hard-coded path /tmp/bug-hunt-hunter-prompt.md would corrupt concurrent runs. Fixed with mktemp /tmp/bug-hunt-{role}-XXXXXX.md for unique files per invocation, with cleanup after use.
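As an illustration of the unique-file-plus-cleanup pattern, here is a Python analogue of the `mktemp` approach. The PR itself uses the shell `mktemp` command; `prompt_file` is a hypothetical helper, and the file-name prefix is only modelled on the template quoted above.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def prompt_file(role, prompt):
    """Create a unique prompt file for this invocation and remove it afterwards."""
    # mkstemp creates a new, previously unused file, so parallel runs never share one.
    fd, path = tempfile.mkstemp(prefix=f"bug-hunt-{role}-", suffix=".md")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(prompt)
        yield path
    finally:
        os.remove(path)  # cleanup after use, even if the caller raised
```

Two simultaneous `prompt_file("hunter", ...)` contexts receive different paths, which is the property the hard-coded path lacked.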

Medium — no provider validation (BUG-4):
Invalid provider values (e.g., --hunter=gpt4) were silently accepted with undefined dispatch behavior. Step 0 now validates all provider values and stops with a clear error on invalid input.
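A validation step of this kind could be sketched as below. The set of accepted values is taken from the providers named in this PR; `VALID_PROVIDERS`, `validate_providers`, and the error wording are assumptions for illustration, not the text SKILL.md actually emits.

```python
VALID_PROVIDERS = ("claude", "codex", "gemini")


def validate_providers(roles):
    """Raise ValueError naming every role whose provider is not recognised."""
    bad = {role: p for role, p in roles.items() if p not in VALID_PROVIDERS}
    if bad:
        details = ", ".join(f"--{role}={p}" for role, p in sorted(bad.items()))
        expected = ", ".join(VALID_PROVIDERS)
        raise ValueError(f"invalid provider: {details} (expected one of {expected})")
```

Rejecting the whole invocation up front means no agent is dispatched with an undefined provider.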

Medium — no Hunter success gate (BUG-18):
If the Hunter agent failed (CLI not installed, crash, empty output), the flow continued to Skeptic/Referee with no input. Step 2b now explicitly verifies Hunter success before proceeding.
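The PR does not spell out exactly what Step 2b checks, so the following is only one plausible shape for such a gate: it assumes the Hunter's exit code and captured output are available, and treats a non-zero exit or blank output as failure. `hunter_succeeded` and `gate_hunter` are hypothetical names.

```python
def hunter_succeeded(returncode, output):
    """Return True only if the Hunter exited cleanly and produced some output."""
    return returncode == 0 and bool(output and output.strip())


def gate_hunter(returncode, output):
    """Stop the pipeline before Skeptic/Referee if the Hunter gave nothing usable."""
    if not hunter_succeeded(returncode, output):
        raise RuntimeError("Hunter failed or returned no report; stopping")
    return output
```

A missing CLI would typically surface as a launch error rather than an exit code; a caller would need to map that case to a failure before calling the gate.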

Low — no target validation (BUG-9):
Specifying a nonexistent scan target would dispatch agents that fail confusingly downstream. Step 0 now checks target existence and fails fast with a clear message.
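A fail-fast target check can be as small as the sketch below. It assumes the scan target is a filesystem path (other target forms, if the skill supports any, would need their own handling); `validate_target` and its message are illustrative only.

```python
import os


def validate_target(target):
    """Fail fast if the scan target does not exist on disk."""
    if not os.path.exists(target):
        raise FileNotFoundError(f"scan target not found: {target}")
    return target
```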

Low — malformed markdown link (BUG-11):
The @systematicls attribution link in README.md used nested markdown syntax [text]([url](url)). Fixed to a single valid link.

Codex CLI invocation fix:
The original instructions used codex exec "prompt" which doesn't work for the exec subcommand. Corrected to cat file | codex exec - (stdin mode).

Files changed

SKILL.md — argument parsing, provider validation, target validation, external CLI dispatch via stdin, Hunter success gate
README.md — dynamic model assignment docs, provider table, fixed markdown link

No changes to the prompt files (hunter.md, skeptic.md, referee.md).

Backward compatibility

Running /bug-hunt or /bug-hunt src/ with no provider flags behaves identically to the current version. All three roles default to Claude Code subagents. The new functionality is additive.

How I tested

Ran /bug-hunt --hunter=codex --skeptic=claude --referee=codex on this repo itself — Hunter (Codex gpt-5.3) found 20 issues, Skeptic (Claude) challenged them down to 4, Referee (Codex) confirmed 7. All confirmed bugs are fixed in this PR.
Verified Codex CLI invocation works with stdin mode (codex exec -)
Confirmed default behavior (no flags) matches original flow

Checklist

Tested locally with /bug-hunt
Updated docs if needed

…g hunt
- Pass all external CLI prompts via stdin/tempfile instead of inline shell interpolation (fixes shell injection risk)
- Use mktemp for unique temp files per run (fixes concurrent collisions)
- Add provider validation with clear error on invalid values
- Add target path existence check before dispatching agents
- Add Hunter success gate before proceeding to Skeptic/Referee
- Fix codex CLI invocation to use stdin mode (codex exec -)
- Fix malformed markdown link in README

churcho added 6 commits March 5, 2026 19:51

Add dynamic model assignment for hunter, skeptic, and referee roles

184acc7

Each role can now run on Claude, Codex CLI, or Gemini CLI via --preset and --hunter/--skeptic/--referee flags. Defaults to all-Claude for backward compatibility.

Merge pull request #1 from churcho/feature/dynamic-model-switch

58499db

Add dynamic model assignment for bug hunt roles

Merge pull request #2 from churcho/feature/dynamic-model-switch

23723a5

Fix security and robustness issues found by bug hunt

Merge danpeg/main and resolve SKILL.md conflict

023ab22

Merge upstream/main and resolve README.md conflict

f738e3d

Keep both the branch diff mode docs from upstream and the dynamic model assignment section from the fork.

churcho wants to merge 6 commits into danpeg:main from churcho:main
