Skip to content

fix(tools): cap bash stdout/stderr capture to prevent OOM#492

Closed
euxaristia wants to merge 2 commits into
Gitlawb:mainfrom
euxaristia:fix/bash-output-oom-cap
Closed

fix(tools): cap bash stdout/stderr capture to prevent OOM#492
euxaristia wants to merge 2 commits into
Gitlawb:mainfrom
euxaristia:fix/bash-output-oom-cap

Conversation

@euxaristia

@euxaristia euxaristia commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Summary

The bash tool captured command stdout/stderr into an unbounded bytes.Buffer with no size limit. Large output (cat on a huge file, a gh pr with thousands of comments, find /, a verbose test run, etc.) can OOM the agent process, with no backpressure, and the process can be killed before the command even finishes.

Closes #469

Changes

  • Add maxBashOutputBytes (16 MiB) and a cappedWriter type in internal/tools/bash.go. cappedWriter wraps a bytes.Buffer, stops accepting bytes once the cap is hit, and sets a truncated flag instead of growing further.
  • bashTool.run now uses cappedWriter for both stdout and stderr, and surfaces a [zero] output truncated at 16 MiB / [zero] stderr truncated at 16 MiB notice in the tool output when the cap is hit.
  • cappedWriter intentionally does not embed *bytes.Buffer. exec.Cmd drives non-*os.File Stdout/Stderr writers through io.Copy, which prefers io.ReaderFrom over Write when the destination implements it. Embedding *bytes.Buffer would promote its ReadFrom method and let io.Copy bypass the cap entirely, streaming unbounded output straight into the buffer. Keeping the buffer as a named (non-embedded) field avoids that trap.

Test plan

  • Added TestBashToolCapsLargeStdoutAndReportsTruncation: runs a real subprocess writing 20 MiB of stdout through the bash tool and confirms the output is bounded and carries the truncation notice.
  • Added TestCappedWriterStopsGrowingAfterMax and TestCappedWriterCapturesPartialWriteThatCrossesMax: direct unit tests of the write-capping logic.
  • Added TestCappedWriterEnforcesCapThroughIOCopy: a regression test that drives cappedWriter through io.Copy (the same path exec.Cmd uses) to lock in the io.ReaderFrom fix described above.
  • go build ./internal/tools/...
  • go vet ./internal/tools/...
  • go test ./internal/tools/... -count=1: all relevant tests pass. Three pre-existing, location-sensitive failures (TestScopedToolsAllowReadOnlyRootsWithoutWrite, TestRegistryAppliesSandboxBeforeToolExecution, TestRegistrySandboxGatesPathAliasKeys) reproduce identically on unmodified upstream/main code in this same checkout path and are unrelated to this change.

This was split out of #468 at reviewer request (Vasanthdev2004 asked for the bundled fixes to be split into independent PRs).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Large command stdout/stderr is now safely capped to prevent excessive memory usage.
    • When output exceeds the limit, it now includes clear truncation notices (including the configured MiB amount).
    • Bash command results continue to format normally while handling oversized output more reliably.
  • Tests
    • Expanded coverage for stdout truncation behavior and boundary conditions to prevent regressions.

The bash tool buffered command stdout/stderr into an unbounded
bytes.Buffer, so large output (cat on a huge file, a verbose test run,
find /, etc.) could OOM the agent process with no backpressure.

Add cappedWriter, a bounded writer that stops growing once stdout or
stderr hits 16 MiB and surfaces a truncation notice in the tool output
instead of buffering everything.

Note: cappedWriter deliberately does not embed *bytes.Buffer. exec.Cmd
drives non-*os.File Stdout/Stderr writers through io.Copy, which prefers
io.ReaderFrom over Write when the destination implements it. Embedding
*bytes.Buffer promotes its ReadFrom method, which would let io.Copy bypass
the cap check entirely and stream unbounded output straight into the
buffer. Keeping the buffer as a named field avoids that.

Closes Gitlawb#469. Split out of Gitlawb#468 at reviewer request.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jul 4, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 60cbc010-3911-4321-8d80-c9ff1de0ce94

📥 Commits

Reviewing files that changed from the base of the PR and between 9d3418e and c59c9d2.

📒 Files selected for processing (2)
  • internal/tools/bash.go
  • internal/tools/bash_tool_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/tools/bash_tool_test.go
  • internal/tools/bash.go

Walkthrough

Bash command output is now capped at 16 MiB per stream, truncation is tracked for stdout and stderr, and rendered output includes explicit truncation markers. Tests cover large-output behavior and capped-writer boundary cases.

Changes

Bash Output Capping

Layer / File(s) Summary
cappedWriter implementation
internal/tools/bash.go
Adds a bytes.Buffer-backed writer enforcing a 16 MiB cap and recording truncation when writes exceed the limit.
Wire cappedWriter into bash execution
internal/tools/bash.go
Uses capped writers for command.Stdout and command.Stderr, then threads truncation flags into success and shell-hint output formatting.
Tests for capping and truncation
internal/tools/bash_tool_test.go
Adds a large-output helper and tests for truncation notices, bounded output size, boundary-crossing writes, and io.Copy enforcement.

Estimated code review effort: 2 (Simple) | ~15 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant bashTool
  participant cappedWriter
  participant Formatter

  Caller->>bashTool: run(command)
  bashTool->>cappedWriter: write stdout/stderr bytes
  cappedWriter-->>bashTool: buffered output + truncated flags
  bashTool->>Formatter: formatBashOutputWithTruncation(...)
  Formatter-->>Caller: output with truncation markers
Loading

Possibly related PRs

  • Gitlawb/zero#79: Also changes internal/tools/bash.go around bash execution and output handling, so it likely touches the same review surface.
  • Suggested reviewers: anandh8x, Vasanthdev2004
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title is concise and accurately summarizes the main change: capping bash stdout/stderr capture to avoid OOMs.
Linked Issues check ✅ Passed The PR implements the requested bash output cap, truncation messaging, and tests for large and boundary cases.
Out of Scope Changes check ✅ Passed The changes stay focused on bash output buffering and its tests, with no clear unrelated or extra-scope modifications.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
internal/tools/bash.go (1)

358-392: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Truncation notice hardcodes "16 MiB" instead of deriving it from maxBashOutputBytes.

If maxBashOutputBytes is ever changed, this message silently goes stale and misleads users about the actual cap.

♻️ Derive the message from the constant
 	if stdoutTruncated {
-		parts = append(parts, "[zero] output truncated at 16 MiB")
+		parts = append(parts, fmt.Sprintf("[zero] output truncated at %d MiB", maxBashOutputBytes/(1024*1024)))
 	}
 	if stderrTruncated {
-		parts = append(parts, "[zero] stderr truncated at 16 MiB")
+		parts = append(parts, fmt.Sprintf("[zero] stderr truncated at %d MiB", maxBashOutputBytes/(1024*1024)))
 	}

Note: the test in bash_tool_test.go asserts the literal string "output truncated at 16 MiB" (Line 318), so this would need updating alongside the fix (or the test can assert %d MiB format).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/tools/bash.go` around lines 358 - 392, The truncation messages in
formatBashOutputWithTruncation are hardcoded to “16 MiB” instead of using
maxBashOutputBytes. Update the truncation notice generation to derive the
displayed limit from the existing constant, and keep the message construction
inside formatBashOutputWithTruncation consistent for both stdoutTruncated and
stderrTruncated. Also update the bash_tool_test.go expectation that currently
asserts the literal string so it matches the new dynamically derived value.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@internal/tools/bash.go`:
- Around line 358-392: The truncation messages in formatBashOutputWithTruncation
are hardcoded to “16 MiB” instead of using maxBashOutputBytes. Update the
truncation notice generation to derive the displayed limit from the existing
constant, and keep the message construction inside
formatBashOutputWithTruncation consistent for both stdoutTruncated and
stderrTruncated. Also update the bash_tool_test.go expectation that currently
asserts the literal string so it matches the new dynamically derived value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: da028965-8362-41ae-8db5-eccac76ad2a4

📥 Commits

Reviewing files that changed from the base of the PR and between f401c66 and 9d3418e.

📒 Files selected for processing (2)
  • internal/tools/bash.go
  • internal/tools/bash_tool_test.go

euxaristia added a commit to euxaristia/zero that referenced this pull request Jul 4, 2026
Vasanthdev2004 asked for this PR to be split since it bundled several
independent static-analysis fixes. Remove the specialist depth cap, bash
output OOM cap, and persistence error logging: each now has its own PR
(Gitlawb#491, Gitlawb#492, Gitlawb#493).

Also drop the "cat" addition to the Windows POSIX-utility detection in
shell_runtime.go. PR Gitlawb#476 already covers cat detection comprehensively
under the MSYS/sandbox angle, so keeping it here would duplicate that
work.

What remains: the Windows cmd.exe quoting-guidance rewrite, the
Windows-specific interactive-command suggestions (steering away from
Unix head/tail/ps toward native or PowerShell alternatives), the
clipboard PowerShell -Command fix, and the CI test-slack change in
exec_command_test.go.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>

@jatmn jatmn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I do not see any actionable issues from my review.

@Vasanthdev2004

@Vasanthdev2004 Vasanthdev2004 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting this out — it's exactly the focused PR I asked for on #468, and the code is clean. The cap holds on every write path, returning len(p), nil so the child keeps draining instead of taking a SIGPIPE is the right call, there's no race between the two streams, and the 20 MiB subprocess test actually exercises it. The trick of not embedding *bytes.Buffer so io.Copy's ReaderFrom path can't bypass the cap is sharp, and the regression test locking it in is the right instinct.

The snag isn't the code — it's that #484 ("polish v2", item #3) also caps bash stdout/stderr and rewrites the same lines of bash.go, so the two collide and can't both land as-is. And they're really fixing two different halves:

  • #484 caps what reaches the model — 96 KB head+tail per stream, keeps the tail where build/test errors live, sets Result.Truncated, and plugs into the same output-budget infra the read/search tools use. But it truncates after buffering into an unbounded buffer, so it does nothing for memory: cat 10GB.log still OOMs before it ever truncates.
  • This PR caps the capture itself, which is the only thing that actually prevents the OOM in #469 — but 16 MiB head-only is a non-guard for context (≈4M tokens, past any window) and drops the tail where the errors are.

So they're complementary layers, not competitors, and the clean end state is both. What I'd like: keep the part that's genuinely unique here — the cappedWriter capture guard and its ReaderFrom regression test — but reshape it into a pure OOM backstop under #484's budget. Concretely: drop the cap from 16 MiB to a few MiB (a memory-only ceiling, high enough it basically never trips in normal use), delete the standalone [zero] output truncated at 16 MiB notice and the truncated-bool threading, feed the capped buffer into #484's budgetBashOutput, and let a memory-cap trip also flip Result.Truncated. Then capture is OOM-safe and the model-facing truncation is #484's head+tail. Easiest to rebase onto #484 once its shape settles.

One non-blocking note on the rationale (code's fine, just the explanation): on the current Go toolchain *os.File does implement io.WriterTo (since 1.22), so "the pipe source has no WriteTo" isn't quite the mechanism — the ReaderFrom branch is reached one io.Copy layer deeper (os.File.WriteTogenericWriteTo, which hides WriteTo). Outcome's identical and your non-embedded-field call is still exactly right; just don't document the reason as "*os.File lacks WriteTo."

To be clear, the overlap is on me for approving #484's bash budget while you were splitting #468 — not a knock on your work. If you'd rather not wait on #484, tell me and I'll help sort the sequencing.

The truncation message hardcoded "16 MiB" instead of deriving it from
the maxBashOutputBytes constant, so it would silently go stale if the
cap ever changed. Format it from the constant instead, and update the
test assertion to match dynamically rather than asserting the literal
string.
@euxaristia

Copy link
Copy Markdown
Contributor Author

Thanks for the detailed breakdown. Agreed these are complementary layers, and reshaping this into a pure OOM backstop under #484's budget makes sense.

I'd rather not restructure this against a moving target, so I'll wait for #484 to settle (or merge) before reshaping the cap and dropping the standalone truncation notice/threading here. Will follow up once that lands.

@Vasanthdev2004

Copy link
Copy Markdown
Collaborator

Circling back now that the sequencing resolved — closing this as superseded by #484.

When I last reviewed, this PR's unique value was the cappedWriter capture guard: the real OOM backstop for #469, which #484's budget didn't cover because it truncated after buffering. That gap is closed now — #484's latest commit ("bound bash output capture in memory, not just model-visible text") caps the capture itself, so it does both halves: the 96 KB head+tail model-facing cap and the memory ceiling that prevents the OOM. I re-checked that path on #484 and the capture is genuinely bounded (a few hundred KB per stream), so #469 gets fixed there.

That leaves this one a conflicting duplicate — it collides with #484 on the same bash.go lines (showing CONFLICTING), and the only commit since my review was the notice-string cleanup rather than the reshape-onto-#484 I'd suggested. No reason to have you rebase a duplicate onto an approved PR, so I'm closing it.

Not a knock on the work — it was clean, the ReaderFrom-bypass guard was the sharp bit, and it's a good part of why #484 ended up hardening the in-memory bound. #469 lands either way once #484 merges. If #484 stalls for any reason, reopening this is one click. Thanks for splitting it out when I asked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bash tool: unbounded stdout/stderr buffering can OOM the agent

3 participants