Stop guessing. Start debugging.
A Claude Code skill that brings hypothesis-driven, instrumented debugging to the terminal. Instead of guessing and patching, Claude Code will systematically reproduce your bug, add tagged logs, analyze runtime behavior, fix the root cause, and write a regression test.
AI coding assistants guess at fixes. They see an error, pattern-match to a solution, and hope it works. When it doesn't, they guess again. This is expensive, slow, and produces fragile code.
Debug Mode forces a disciplined loop:
Hypothesize → Reproduce → Instrument → Analyze → Reset → Fix → Verify
Every fix is backed by evidence. Every bug gets a regression test. No guessing.
| Phase | What Happens |
|---|---|
| 0. Safety | Create git branch + stash uncommitted work |
| 1. Discover | Scan project for test infra, scripts, recent changes |
| 2. Hypothesize | Generate 3-5 ranked theories with evidence criteria |
| 3. Reproduce | Write repro script/test and commit it |
| 4. Instrument | Add [DEBUG-MODE] tagged logs (dirtying the working tree) |
| 5. Analyze | Run reproduction, capture logs to file, grep intelligently |
| 6. Reset | git restore . — perfectly removes all debug logs |
| 7. Fix | Implement fix at root cause (not symptom) |
| 8. Verify | Red-to-green check using the committed repro |
This is the key insight. LLMs are bad at surgically removing debug logs — they leave hanging brackets, break indentation, and corrupt syntax. Debug Mode solves this:
- Write reproduction test/script
- Commit it (
git commit -m "chore: add failing reproduction") - Add
[DEBUG-MODE]logs everywhere (dirty the working tree) - Analyze the logs
git restore .— instantly removes ALL instrumentation, perfectly- Your committed repro test survives. Your code is clean. Fix the bug.
No manual log deletion. No broken syntax. No cleanup mistakes.
Claude Code tries these automatically before asking you to do anything:
| Priority | Strategy | When to Use |
|---|---|---|
| 1 | Standalone script | Fastest — bypasses test framework config |
| 2 | Failing unit test | Gold standard when framework is easy to use |
| 3 | curl / HTTP | API bugs |
| 4 | Loop 50x | Race conditions, flaky bugs |
| 5 | Manual | Last resort only |
All debug logs follow a strict format:
[DEBUG-MODE][H1-stale-cache][getUserProfile] cache_key=user:123 hit=true age=3600s
[DEBUG-MODE][H2-race-condition][updateUser] lock_acquired=false retry=1
[DEBUG-MODE]prefix makesgit restore .cleanup bulletproof[H<N>-name]lets you filter by hypothesis:grep '\[H1' debug.log[function]tells you where you are
AI agents hang when they start servers without detaching:
# WRONG — hangs the terminal forever
npm run dev
# RIGHT — detached with PID tracking
nohup npm run dev > server.log 2>&1 &
SERVER_PID=$!
sleep 3
# ... do work ...
kill $SERVER_PIDAI agents blow their context window on massive log dumps:
# WRONG — 5MB of logs floods context
node repro.js
# RIGHT — capture to file, query surgically
node repro.js > debug.log 2>&1
grep '[DEBUG-MODE]' debug.log
grep -C 2 'Exception' debug.log | head -n 50Every debug session ends with proof:
# 1. Run repro — MUST PASS (green)
node repro.js
# 2. Stash the fix
git stash push -m "temp-verify"
# 3. Run repro — MUST FAIL (red)
node repro.js
# 4. Restore
git stash popIf the test passes without the fix, it's a bad test. Rewrite it.
mkdir -p ~/.claude/skills/debug-mode && \
curl -sL https://raw.githubusercontent.com/franzenzenhofer/debug-mode-skill/main/SKILL.md \
-o ~/.claude/skills/debug-mode/SKILL.md- Copy
SKILL.mdto~/.claude/skills/debug-mode/SKILL.md - That's it. Claude Code auto-discovers skills in
~/.claude/skills/.
Start Claude Code and describe a bug. You should see:
"Entering Debug Mode. I will hypothesize, instrument, reproduce, analyze, fix, and verify."
| Bug Type | Best Strategy |
|---|---|
| Logic error | Standalone script or failing test |
| API response wrong | curl |
| Race condition | Loop 50x |
| State corruption | Repro script |
| Auth/session | Repro script |
| Caching | Failing test |
| Flaky test | Loop 50x |
| CORS/headers | curl -v |
Debug instrumentation patterns for:
- JavaScript/TypeScript —
console.error('[DEBUG-MODE]...') - Python —
print('[DEBUG-MODE]...', file=sys.stderr) - Go —
fmt.Fprintf(os.Stderr, "[DEBUG-MODE]...") - Ruby —
$stderr.puts "[DEBUG-MODE]..." - PHP —
error_log("[DEBUG-MODE]...")
All use stderr so debug output doesn't interfere with application output.
These cannot be bypassed:
- No fixes without reproduction first
- No guessing — instrument and observe
- Never hang the terminal — detach background processes
- Protect your context — logs to file, then grep
- Reset before fixing —
git restore ., never manual edits
MIT