Environment
- rtk version: 0.40.0 (~/.local/bin/rtk.exe)
- OS / shell: Windows 11 Pro (10.0.26200), Git Bash
- Integration: Claude Code global hook: ~/.claude/settings.json PreToolUse matcher "Bash" → command: "rtk hook claude". Every Bash tool call is rewritten (git status → rtk git status) and its STDOUT is filtered before it reaches the LLM.
Summary
When rtk is used as a Claude Code PreToolUse: Bash → rtk hook claude filter, the STDOUT returned to the agent is intermittently not faithful to the real command output. We observed duplicated lines and spurious markup (text resembling the agent harness's own result-delimiter tags) injected into otherwise-correct output. The command's on-disk side effects are always correct; only the returned text diverges. For an autonomous AI agent that treats returned STDOUT as ground truth, even rare infidelity is high-impact, because the agent can act on wrong facts (merge/push/report decisions).
Honesty / scope note (so you can prioritize accurately)
We initially attributed a broader, more dramatic set of failures (invented PR numbers, fake push/commit confirmations) to rtk. On controlled re-testing we found a meaningful fraction of those were our own tooling errors, not rtk:
- co-batching a "write a summary file" tool-call together with the read tool-calls whose results it was summarizing — capturing placeholder values before the results returned;
- malformed PowerShell-under-bash one-liners erroring (command not found, unexpected EOF).
Deterministic probes (printf KNOWN, and git rev-parse --short HEAD cross-checked against .git/refs/heads/<branch>) came back clean and internally consistent. So this report is scoped to the residual, genuine behavior only: intermittent, hard-to-reproduce STDOUT infidelity (duplicated lines / spurious delimiter-like markup) under batching/load. We could not produce a 100%-deterministic repro.
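The SHA cross-check described above could be scripted roughly as follows. This is a minimal sketch, not the exact probe we ran: the function name short_sha_matches_ref is hypothetical, and it assumes the branch has a loose ref file under .git/refs/heads/ (if the ref has been packed into .git/packed-refs the file will not exist and the read raises FileNotFoundError).

```python
from pathlib import Path


def short_sha_matches_ref(repo: Path, branch: str, short_sha: str) -> bool:
    """Return True if short_sha is a prefix of the SHA in the loose ref file.

    Hypothetical helper: short_sha is the text returned through the hook for
    `git rev-parse --short HEAD`; the ref file is read directly from disk,
    so it never passes through the filter.
    """
    ref_file = repo / ".git" / "refs" / "heads" / branch
    full_sha = ref_file.read_text().strip()
    candidate = short_sha.strip()
    # An empty string is a prefix of everything, so reject it explicitly.
    return bool(candidate) and full_sha.startswith(candidate)
```

A False result means the returned text and the on-disk ref disagree, which is the signal we were probing for; in our deterministic runs the two always agreed.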
Repro (best-effort, intermittent)
- Windows 11 + Git Bash, rtk 0.40.0, Claude Code with PreToolUse: Bash → rtk hook claude installed.
- Run several git/gh commands through the hook in quick succession / batched (e.g. git log --grep=..., gh pr view <n> --json ..., git branch --contains <sha>).
- Compare each command's hook-returned STDOUT against ground truth captured two independent ways: (a) redirect to a file and read the file, (b) read the underlying source (e.g. .git/refs/heads/<branch> for SHAs).
- Intermittently the hook-returned STDOUT shows duplicated lines / injected markup, while the redirect-to-file output and the underlying source agree with each other.
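The comparison step above can be automated with a small line-level check. The sketch below is illustrative only: diff_fidelity is a hypothetical helper, it compares line multisets (so it ignores ordering), and the sample strings are invented placeholders, not captured rtk output.

```python
from collections import Counter


def diff_fidelity(returned: str, truth: str) -> dict:
    """Compare hook-returned STDOUT against ground-truth output, line by line."""
    returned_counts = Counter(returned.splitlines())
    truth_counts = Counter(truth.splitlines())

    # Counter subtraction keeps only positive counts.
    extra = returned_counts - truth_counts    # surplus lines in the returned text
    missing = truth_counts - returned_counts  # lines the returned text lost

    return {
        # Real lines that show up more often than they should.
        "duplicated": sorted(line for line in extra if line in truth_counts),
        # Lines that never occur in the ground truth (e.g. spurious markup).
        "injected": sorted(line for line in extra if line not in truth_counts),
        "missing": sorted(missing),
    }


if __name__ == "__main__":
    # Placeholder data for illustration; not real command output.
    truth = "line one\nline two\n"
    returned = "line one\nline one\n<hypothetical-tag>\nline two\n"
    print(diff_fidelity(returned, truth))
```

Here truth would be the contents of the redirect-to-file capture and returned the text the agent actually received. Note that a filter which legitimately summarizes output will also show "missing" lines, so only the "duplicated" and "injected" buckets correspond to the behavior reported here.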
Expected: filtering/compression is lossless with respect to facts and structure; if output is truncated it is clearly marked as truncated — never rewritten, duplicated, or augmented with markup.
Actual: returned STDOUT intermittently contains duplicated lines and spurious markup not present in the real output.
Impact: AI coding agents may act on altered output. It also interacts badly with downstream hooks that substring-match command output — e.g. a merge-gate hook that scans codex output for command not found / ENOENT can be falsely tripped by injected text.
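To make the downstream-hook interaction concrete, here is a minimal sketch of a substring-matching gate. It is a hypothetical stand-in for our merge-gate hook, not its actual code, and the sample strings are invented for illustration.

```python
# Markers the (hypothetical) gate treats as evidence of a failed command.
ERROR_MARKERS = ("command not found", "ENOENT")


def gate_trips(output: str) -> bool:
    """Return True if any error marker occurs anywhere in the output."""
    return any(marker in output for marker in ERROR_MARKERS)


if __name__ == "__main__":
    real = "review complete: no issues\n"
    # Simulates text that was not in the real output but reached the gate.
    altered = real + "bash: foo: command not found\n"
    print(gate_trips(real), gate_trips(altered))
```

Because the gate only checks for the presence of a substring, any injected line containing a marker blocks the merge even though the underlying command succeeded.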
Workaround
- Per-command: prefix with rtk run <cmd> (raw, no filtering, no tracking) or rtk proxy <cmd> (raw, tracks usage).
- Systemic: remove the PreToolUse: Bash → rtk hook claude hook (loses token savings). We found no master "filter off" config switch or env var, only [hooks] exclude_commands / transparent_prefixes in config.toml for per-command exclusion.
Questions for maintainers
- Is there a supported way to make the hook filter lossless (never rewrite/duplicate, only summarize + mark truncation)?
- Is there a global disable-filtering flag (env var or config) that keeps the hook installed but passes STDOUT through raw?
- Are there known-good / known-bad versions around 0.40.0 for output fidelity?