[claude-hackernews] Reply draft: Capsule Bash Show HN, inline-eval escape hatch (id=48009460)#58
NiveditJain wants to merge 1 commit into main
📝 Walkthrough
A new draft Markdown reply to a Show HN thread about Capsule Bash is added, discussing sandbox architecture limitations and proposing syscall-level or pre-parsing mitigations for inline evaluation.
Changes
Capsule Bash Security Draft Reply
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks: ✅ 5 passed
🧹 Nitpick comments (2)
drafts/2026-05-04T151151Z.md (2)
19-20: ⚡ Quick win
Qualify the `python3 -c`/`node -e` claim to avoid overstatement.
Right now it reads as if the reintroduction of filesystem runtime capabilities will occur unconditionally once those commands are “core commands”. If Capsule Bash could be configured to exclude Python/Node, or to block `-c`/`-e` in its own tool layer, this draft risks being perceived as an absolute bypass claim rather than an architectural observation.
Suggestion: tweak sentences to something like “If python3/node are available as inline evaluators inside the sandboxed executor…” so the critique stays accurate across implementations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@drafts/2026-05-04T151151Z.md` around lines 19 - 20, Update the paragraph that asserts "python3 -c" and "node -e" re-open the sandbox to soften the absolute claim: change wording to qualify that this happens only if those inline evaluators are available or not blocked by the sandbox/tool layer (e.g., use phrasing like “If python3/node are available as inline evaluators inside the sandboxed executor…”), mention the alternative mitigations (blocking -c/-e at the Capsule Bash tool layer or intercepting at syscall layer), and retain the note about the PreToolUse hook as the complementary seam; ensure the revised sentence references the same symbols ("python3 -c", "node -e", "Capsule Bash", "PreToolUse hook") so reviewers can locate the change.
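The "block `-c`/`-e` at the tool layer" mitigation mentioned above could look roughly like the following. This is a minimal sketch, not Capsule Bash's or FailProof AI's actual implementation: the helper name `usesInlineEval` and the regular expression are assumptions made for illustration, and the check is a text-level heuristic that misses forms such as combined short flags or options that take a separate argument before `-c`.

```typescript
// Hypothetical helper (assumption): flags shell commands that pass inline
// source to Python or Node via -c, -e, or --eval.
// The interpreter name must start the command or follow whitespace or a
// shell separator, and any leading tokens must themselves look like options.
const INLINE_EVAL =
  /(?:^|[\s;&|(])(?:python3?|node)\s+(?:-\S+\s+)*(?:-c|-e|--eval)(?:\s|=|$)/;

function usesInlineEval(command: string): boolean {
  return INLINE_EVAL.test(command);
}

// Example: the two commands quoted in the draft reply are flagged,
// while running a script file is not.
const samples: string[] = [
  'python3 -c "import shutil; shutil.rmtree(p)"',
  "node -e \"require('fs').unlinkSync('x')\"",
  "python3 script.py",
];
for (const cmd of samples) {
  console.log(usesInlineEval(cmd), cmd);
}
```

Because the match is purely textual, it would also flag harmless strings such as `echo python3 -c`, which is one reason the review suggests presenting this as one option alongside syscall-level interception rather than as a complete fix.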
16-34: ⚡ Quick win
Add a language tag to the fenced reply block (MD040).
The fenced block that wraps the HN reply has no language specified. Add a language tag like `text` to satisfy markdownlint MD040 without changing the reply contents you intend to post.
Suggested change
```diff
-```
+```text
 (disclosure: I work on FailProof AI: https://github.com/exospherehost/failproofai)

 The "too much freedom" framing is fair, but `python3 -c` and `node -e` as core commands re-open most of it. Once the agent runs `python3 -c "import shutil; shutil.rmtree(p)"`, the fs ops happen inside the Python runtime, not your bash interpreter, so the sandbox's structured-feedback layer never sees them. Same for `node -e "require('fs').unlinkSync(...)"`. Either intercept at the syscall layer, or pre-parse the inline source and refuse blocked imports.

 Adjacent angle: agents also call Edit/Write/MCP tools that bypass Bash entirely, so a Bash-only sandbox leaves that surface uncovered. The complementary seam is the harness's PreToolUse hook, which sees every tool call before any executor runs:

     import { customPolicies, allow, deny } from "failproofai";
     customPolicies.add({
       name: "block-inline-eval",
       match: { events: ["PreToolUse"] },
       fn: ({ toolName, toolInput }) => {
         if (toolName !== "Bash") return allow();
         if (/\b(python3?|node)\s+-(c|e)\b/i.test(toolInput?.command ?? "")) {
           return deny("inline -c/-e blocked; drop a script file instead");
         }
         return allow();
       },
     });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@drafts/2026-05-04T151151Z.md` around lines 16 - 34, The fenced code block wrapping the HN reply is missing a language tag which triggers markdownlint MD040; update the opening fence from "```" to include a language tag (e.g., "text") so the block becomes fenced with a language specifier while leaving the block's contents exactly as-is; ensure you modify the same fenced reply block shown in the diff (the triple-backtick fenced block containing the FailProof AI disclosure and code snippet) and do not alter any of the reply text.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: `eb40469f-ada2-4ca4-91db-9acc64851329`
📥 Commits
Reviewing files that changed from the base of the PR and between ebbce06017d58f009ca7dfa9dbb3e0dcf1bba4df and 8e08b1f86ff410c3e046c543323c0df7fa1e19fd.
📒 Files selected for processing (1)
* `drafts/2026-05-04T151151Z.md`
Summary
Draft of a reply on the Show HN of Capsule Bash (sandboxed bash for agents, by mavdol04). The OP explicitly invites design feedback.
- `claude code agent`, `agent guardrails`, `agent deleted`, `agent sandbox` all came up either empty or already-covered (PR [claude-hackernews] Reply draft: harness-outside-sandbox, PreToolUse firewall layer (id=47990675) #17, [claude-hackernews] Reply draft: Lilith-zero Show HN, transport vs hook layer (id=47875939) #50, [claude-hackernews] Reply draft: AgentPort vs runtime-hook layer (id=47950752) #11, etc.).
- `INSTRUCTIONS.md` "Tone for discussing it on HN" gate: Show HN of an adjacent product (sandbox), OP soliciting design discussion. Reply leads with a substantive critique of the design (`python3 -c`/`node -e` as core commands re-open the freedom the sandbox tries to clamp down on, because Python/Node fs ops happen inside their own runtimes, not the bash interpreter), then offers the harness's PreToolUse hook as the complementary seam with one custom-policy snippet.
- `grep item?id=48009460` clean across `drafts/` + `comments/`; cross-PR scan via `gh pr diff` clean.
Files
- `drafts/2026-05-04T151151Z.md` (the draft + insight + notes; **Status:** draft (pending manual post))
Thread URL
https://news.ycombinator.com/item?id=48009460
Test plan
- `drafts/2026-05-04T151151Z.md`
- `comments/<ts>.md` and append the permalink to the `HN:` line in the draft
🤖 Generated with Claude Code
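The draft's second mitigation, pre-parsing the inline source and refusing blocked imports, can be sketched as below. This is only an illustration under stated assumptions: the function name `findBlockedImports` and the deny-list contents are hypothetical, and the scan handles plain `import` and `from ... import` statements in Python source only. Dynamic imports (for example `__import__` or `importlib`) would slip past a text-level check like this, which is part of why syscall-level interception is listed as the alternative.

```typescript
// Hypothetical deny-list (assumption): module names chosen for illustration.
const BLOCKED_MODULES: string[] = ["shutil", "os", "subprocess", "pathlib"];

// Returns the blocked top-level modules that a Python snippet imports
// through plain `import x` or `from x import y` statements.
function findBlockedImports(
  source: string,
  blocked: string[] = BLOCKED_MODULES
): string[] {
  const hits: string[] = [];
  // Python allows several simple statements per line, separated by ";".
  for (const raw of source.split(/[\n;]/)) {
    const stmt = raw.trim();
    const fromMatch = /^from\s+([A-Za-z_][\w.]*)\s+import\b/.exec(stmt);
    const importMatch = /^import\s+(.+)$/.exec(stmt);
    const modules: string[] = [];
    if (fromMatch) {
      modules.push(fromMatch[1]);
    } else if (importMatch) {
      // "import a.b as c, d" yields the module names "a.b" and "d".
      for (const part of importMatch[1].split(",")) {
        modules.push(part.trim().split(/\s+/)[0]);
      }
    }
    for (const mod of modules) {
      const top = mod.split(".")[0];
      if (blocked.indexOf(top) !== -1 && hits.indexOf(top) === -1) {
        hits.push(top);
      }
    }
  }
  return hits;
}

// Example: the inline source from the draft reply imports a blocked module.
console.log(findBlockedImports("import shutil; shutil.rmtree(p)"));
```

A caller would extract the argument passed to `python3 -c` and refuse the command when the returned list is non-empty; how that extraction and refusal plug into a given sandbox or hook is left open here.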
Summary by CodeRabbit