fix(autoname): make Haiku auto-rename reliable and diagnosable by drn · Pull Request #806 · drn/argus

drn · 2026-06-24T21:23:53Z

Recent task auto-renames failed inconsistently, leaving tasks on their
regex slug. Root causes (confirmed against live ~/.argus logs):

- claude -p writes RUNTIME errors (budget exceeded / usage-rate limit /
  overload) to STDOUT and exits non-zero with EMPTY stderr. The prior fix
  folded only ExitError.Stderr, so every real failure logged as a bare,
  undiagnosable "exit status 1".
- The --max-budget-usd 0.01 cap was tuned to a stale ~$0.0002/call
  estimate; a live call measures ~1235 input + 111 output tokens
  ($0.0034), leaving only ~3x headroom.
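The headroom figures can be checked with a few lines of arithmetic. This is only an illustrative sketch: `headroom` is a hypothetical helper, not something in namegen.go, and the per-call cost is the $0.0034 measurement quoted above.

```go
package main

import "fmt"

// headroom reports how many calls of the given cost fit under a budget cap.
// Hypothetical helper, used only to reproduce the ratios quoted in this PR.
func headroom(capUSD, costUSD float64) float64 {
	return capUSD / costUSD
}

func main() {
	const measuredCost = 0.0034 // USD per call, as measured in this PR

	fmt.Printf("cap 0.01: %.1fx\n", headroom(0.01, measuredCost)) // prints "cap 0.01: 2.9x"
	fmt.Printf("cap 0.05: %.1fx\n", headroom(0.05, measuredCost)) // prints "cap 0.05: 14.7x"
}
```

At roughly 3x, a prompt about three times longer than the measured one is enough to cross the old cap; the new cap leaves roughly 15x.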

Changes (internal/llm/namegen.go):

- wrapRunError folds stdout first (the runtime-error channel), then
  stderr when both present, scrubbing control chars so an untrusted
  reason cannot forge a second log line (log-injection defense).
- --max-budget-usd 0.01 -> 0.05 (~15x measured cost); comments corrected.
- Retry the CLI once on a non-zero exit; validation failures terminal;
  keeps the first (richest) reason on exhaustion.
- DefaultTimeout 30s -> 45s.

Routed through openspec (auto-naming), archived in-PR. Adds 7 tests;
gotchas/misc.md updated.

Co-Authored-By: Claude <noreply@anthropic.com>

Recent Haiku auto-renames failed inconsistently, leaving tasks on their regex slug. Root causes (both confirmed against live ~/.argus logs):

- claude -p writes RUNTIME errors (budget exceeded / usage-rate limit / overload) to STDOUT and exits non-zero with an EMPTY stderr. The prior fix folded only ExitError.Stderr, so every real failure logged as a bare, undiagnosable "exit status 1".
- The --max-budget-usd 0.01 cap was tuned to a stale "~$0.0002/call" estimate; a live call measures ~1235 input + ~111 output tokens (~$0.0034), leaving only ~3x headroom. A longer pasted prompt crossed it → "Error: Exceeded USD budget" → exit 1 → slug kept.

Fixes in internal/llm/namegen.go:

- wrapRunError folds stdout FIRST (the runtime-error channel), then stderr, so failures are diagnosable.
- Raise --max-budget-usd 0.01 → 0.05 (~15x measured cost); correct the stale cost comments.
- Retry the CLI once with short backoff on a non-zero exit (transient overload/limit); validation failures are NOT retried.
- Raise DefaultTimeout 30s → 45s (a signal: killed was seen at 30s).

Adds 4 tests (stdout-fold regression, retry-then-succeed, retry-exhausted, budget-flag pin). Routed through openspec (auto-naming) and archived in-PR; gotchas/misc.md updated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…lassify

Address code-review findings on the retry/diagnostics path:

- Log-injection (WARNING): the folded claude stdout reason is untrusted (may echo prompt text) and flowed unescaped into uxlog (%s, newline-terminated) and slog's TextHandler (unquoted). A newline could forge a second physical log line (a fake "[autoname] renamed" record). New scrubReason maps CR/LF/tab/ANSI-ESC/control runes to spaces before folding.
- wrapRunError now folds BOTH stdout and stderr when both are present (was: stdout-only, dropping a concurrent flag-parse stderr).
- Retry exhaustion kept the LAST attempt's error while the comment claimed "first": a rich attempt-0 reason (budget) was lost when a retry died bare. Now keeps firstErr; comment corrected; shared-deadline behavior documented honestly.
- generateNameOnce returns (name, retryable bool, err), which is clearer than the implicit-mutual-exclusion three-error-return.
- Dedup double string(out) conversion; tests use testutil.Error.

Adds tests: control-char scrub (log-injection guard), keep-first-reason on exhaustion, scrubReason table.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

github-actions · 2026-06-24T21:28:37Z

Merging this branch will not change overall coverage

Impacted Packages	Coverage Δ	🤖
github.com/drn/argus/internal/llm	94.12% (ø)

Coverage by file

Changed files (no unit tests)

Changed File	Coverage Δ	Total	Covered	Missed	🤖
github.com/drn/argus/internal/llm/namegen.go	94.12% (ø)	68	64	4

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

github.com/drn/argus/internal/llm/namegen_test.go

drn and others added 2 commits June 24, 2026 14:22

drn merged commit dfb0f06 into master Jun 24, 2026

drn deleted the argus/Recent-task-renames-aren-t branch June 25, 2026 00:50
