[claude-hackernews] Reply draft: Agent-skills-eval thread, deny() vs instruct() for routing Bash DB queries via tidewave (id=48046023) by NiveditJain · Pull Request #64 · exospherehost/claude-hackernews

NiveditJain · 2026-05-08T21:27:17Z

Discovery

Feed sweep: /ask, /show, hn.algolia.com/?q=claude%20code%20agent, hn.algolia.com/?q=agent%20guardrails, hn.algolia.com/?q=claude%20code%20hooks, hn.algolia.com/?q=codex%20agent (all dateRange=pastWeek / pastMonth, sort=byDate).
Filtered candidates against the existing covered-thread set (53 prior IDs across drafts/, comments/, and open PRs). Tilde.run sandbox thread (id=48037724) skipped under cross-thread duplicate guard since luv-63's "Stop Treating Agent Sandboxes as Cattle" already argues the intent-vs-infra layer.
Picked id=48046023 because reedlaw's sub-thread (id=48051949) is the cleanest "concrete failure mode that hooks should solve, but the operator can't see how" articulation I found this sweep.
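
The feed sweep above could be reproduced with a small helper. This is only a sketch: it assumes the search URLs take the `q`, `dateRange`, and `sort` query parameters exactly as they appear in the sweep list, and `sweep_urls` is a hypothetical name rather than anything in this repo.

```python
from urllib.parse import quote, urlencode

# Query strings and date ranges taken from the sweep list above.
BASE = "https://hn.algolia.com/"
QUERIES = ["claude code agent", "agent guardrails", "claude code hooks", "codex agent"]
DATE_RANGES = ["pastWeek", "pastMonth"]


def sweep_urls(queries=QUERIES, date_ranges=DATE_RANGES):
    """Build one search URL per (query, date range) pair, sorted by date."""
    urls = []
    for query in queries:
        for date_range in date_ranges:
            params = {"q": query, "dateRange": date_range, "sort": "byDate"}
            # quote_via=quote encodes spaces as %20 rather than '+',
            # matching the form used in the sweep list.
            urls.append(BASE + "?" + urlencode(params, quote_via=quote))
    return urls


for url in sweep_urls():
    print(url)
```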

Thread

Story: Show HN: Agent-skills-eval - Test whether Agent Skills improve outputs (link-only Show HN to https://github.com/darkrishabh/agent-skills-eval; 72 points, 36 comments, 1 day old).
Parent (replied to): reedlaw, id=48051949. Grandparent reedlaw id=48050489 lays out the concrete failure: Opus 4.7 ignores a 720-byte CLAUDE.md telling it to route DB queries through tidewave's MCP server, and instead does Bash(DATABASE_URL=$(grep ... .env) echo "ok"). reedlaw's id=48051949 then asks how a hook would even work for this, and concludes "prompts are not tightly coupled with capabilities".

Proposed reply (full text in `drafts/2026-05-08T212431Z.md`)

The reply pivots on the deny()-vs-instruct() distinction. instruct(msg) injects guidance and lets the call proceed, so the model can ignore it the same way it ignores CLAUDE.md. deny(msg) returns a tool error, so the Bash command does not run and the model has to take a different path. The reply includes one custom-policy snippet that pattern-matches DB-shaped Bash commands and denies them with a message redirecting to tidewave's MCP tool. ~140 words. ASCII-only punctuation. Single disclosure line at the top, single repo link.
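
The difference between the two semantics can be illustrated with a toy hook runner. This is a minimal sketch under stated assumptions, not the snippet from the draft and not FailProof AI's actual API: `Decision`, `soft_policy`, `hard_policy`, `pre_tool_use`, and the `DB_SHAPED` pattern are all hypothetical, and the tool call is assumed to arrive as a tool name plus a dict with a `command` key.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str  # "allow", "instruct", or "deny"
    message: str = ""


ALLOW = Decision("allow")


def instruct(message):
    return Decision("instruct", message)


def deny(message):
    return Decision("deny", message)


# Hypothetical pattern for "DB-shaped" commands; a real policy would need tuning.
DB_SHAPED = re.compile(r"\b(psql|mysql|sqlite3)\b|DATABASE_URL")
REDIRECT = "Direct DB access from Bash is blocked; use the tidewave MCP tool instead."


def is_db_bash(tool_name, tool_input):
    return tool_name == "Bash" and bool(DB_SHAPED.search(tool_input.get("command", "")))


def soft_policy(tool_name, tool_input):
    # instruct(): guidance is attached, but the call still proceeds.
    return instruct(REDIRECT) if is_db_bash(tool_name, tool_input) else ALLOW


def hard_policy(tool_name, tool_input):
    # deny(): the call is refused before it runs.
    return deny(REDIRECT) if is_db_bash(tool_name, tool_input) else ALLOW


def pre_tool_use(policy, tool_name, tool_input, execute):
    """Run the policy before the tool; only a deny stops execution."""
    decision = policy(tool_name, tool_input)
    if decision.action == "deny":
        return {"ran": False, "tool_error": decision.message}
    result = {"ran": True, "output": execute(tool_input)}
    if decision.action == "instruct":
        result["guidance"] = decision.message
    return result
```

With `soft_policy` the command still executes and the guidance is only advisory; with `hard_policy` the `execute` callback is never invoked, which is the property the reply relies on.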

Workflow

Status: draft (pending manual post) - per CLAUDE.md "Comments via PR (never direct post)", I have not typed into the HN composer or clicked submit.
One draft, one commit, one PR. Fresh branch off origin/main. luv-64 already has its own PR #55, "[claude-hackernews] Reply draft: Kirikiri Show HN, mobile supervision asymmetry vs hook layer (id=47996198)", so this draft is not bundled there.
Duplicate check: id=48046023 is not in any prior drafts/, comments/, or open-PR diff. Cross-thread duplicate guard verified - deny()-vs-instruct() framing is fresh; earlier PRs covered transport-vs-hook, MCP-surface-vs-PreToolUse, Docker-vs-intent, workflow-vs-invariant, etc.
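
The ID half of the duplicate check could be mechanized roughly as follows. This is a sketch only: it assumes prior drafts, comments, and PR diffs are available as plain strings that mention threads in the `id=<number>` form used throughout this PR, and `covered_ids` and `is_fresh` are hypothetical helper names. The framing-level guard (whether the argument itself is new) still needs a human read.

```python
import re

# Matches thread references written as id=<digits>, e.g. "(id=47996198)".
ID_PATTERN = re.compile(r"\bid=(\d+)\b")


def covered_ids(texts):
    """Collect every thread ID mentioned in the given prior texts."""
    ids = set()
    for text in texts:
        ids.update(ID_PATTERN.findall(text))
    return ids


def is_fresh(thread_id, texts):
    """True when thread_id does not appear in any prior text."""
    return str(thread_id) not in covered_ids(texts)


prior = [
    "Reply draft: Kirikiri Show HN, mobile supervision asymmetry vs hook layer (id=47996198)",
    "Tilde.run sandbox thread (id=48037724) skipped under cross-thread duplicate guard",
]
print(is_fresh(48046023, prior))
print(is_fresh(47996198, prior))
```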

Summary by CodeRabbit

Documentation
- Added internal draft documentation containing technical discussions and policy examples related to evaluation frameworks and tool routing strategies.

Note: This change consists primarily of internal draft materials with no direct user-facing impact.

coderabbitai · 2026-05-08T21:27:30Z

📝 Walkthrough

A new Markdown draft file covers a Hacker News "Show HN" thread about an eval harness for Agent Skills and includes a substantive reply on enforcing tool routing with "deny" semantics. The reply carries a concrete policy example that blocks Bash commands matching database-access patterns and redirects those requests to a dedicated MCP tool.

Changes

HN Post Draft on Tool Routing via Deny Semantics

| Layer / File(s) | Summary |
| --- | --- |
| Post Context and Metadata `drafts/2026-05-08T212431Z.md` | HN item link, parent context, and submission status metadata are established for the Show HN post. |
| Post Framing and OP Summary `drafts/2026-05-08T212431Z.md` | The post outlines the eval harness A/B test premise and captures the parent-comment debate about routing enforcement and hooks. |
| Reply Content and Tool Routing Policy `drafts/2026-05-08T212431Z.md` | Drafted reply argues for "deny" semantics over "instruct" semantics and includes a concrete PreToolUse policy that blocks Bash commands matching DB access patterns. |
| Outreach Strategy and Implementation Notes `drafts/2026-05-08T212431Z.md` | FailProof blog angle and routing-via-deny story framing are outlined, along with constraints (ASCII-only formatting, login-wall expectations, timing, cross-thread guards). |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

exospherehost/claude-hackernews#6: Both PRs touch the drafts/ workflow directory and manage draft file persistence and routing.
exospherehost/claude-hackernews#2: Both PRs interact with the drafts-only mode and drafts/ directory commit strategy.
exospherehost/claude-hackernews#4: Both PRs touch the HN workflow's drafts/ vs. comments/ split and tooling structure.

Poem

🐰 A rabbit hops through HN threads so bright,
Drafting posts on routing done right—
Deny the bypass, route with care,
Tools find their homes with a policy fair! 🎯

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title is specific and directly related to the main change: a Hacker News reply draft about deny() vs instruct() semantics for routing Bash DB queries, with a specific thread ID and context. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@drafts/2026-05-08T212431Z.md`:
- Line 19: The fenced code block opened at the start of the snippet is missing a
language tag (triggers markdownlint MD040); update the opening fence used in the
draft (the triple-backtick block that contains "(disclosure: I work on FailProof
AI...)" ) to include an explicit language tag such as text (e.g., change ``` to
```text) so the block is properly tagged and MD040 is resolved.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bcfd33ac-de3c-4136-b7d5-05234b74b80d

📥 Commits

Reviewing files that changed from the base of the PR and between ebbce06 and fa86f3d.

📒 Files selected for processing (1)

drafts/2026-05-08T212431Z.md

coderabbitai · 2026-05-08T21:29:17Z

+
+**My reply:**
+
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced code block.

Line 19 opens a fenced block without a language, which triggers markdownlint MD040.

Suggested fix

-```
+```text
(disclosure: I work on FailProof AI: https://github.com/exospherehost/failproofai)
...
-```
+```

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 19-19: Fenced code blocks should have a language specified

(MD040, fenced-code-language)
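
For a quick local check before pushing, the condition behind this warning can be approximated in a few lines. This is a rough sketch, not markdownlint's implementation: `untagged_fences` is a hypothetical helper that treats any line of three or more backticks or tildes (indented at most three spaces) as a fence and reports opening fences that carry no language tag.

```python
import re

# A fence line: up to three spaces of indent, then 3+ backticks or tildes,
# then an optional info string (the language tag).
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def untagged_fences(markdown):
    """Return 1-based line numbers of opening fences with no language tag."""
    flagged = []
    open_marker = None  # marker of the currently open fence, if any
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_marker is None:
            open_marker = marker
            if not info:
                flagged.append(lineno)
        elif marker[0] == open_marker[0] and len(marker) >= len(open_marker) and not info:
            # A bare fence of the same kind, at least as long, closes the block.
            open_marker = None
    return flagged


fence = "`" * 3  # built at runtime so this example does not nest literal fences
sample = "\n".join(["**My reply:**", "", fence, "(disclosure: ...)", fence])
print(untagged_fences(sample))
```

Changing the opening fence in `sample` to `fence + "text"` makes the function return an empty list, which mirrors the suggested fix.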


Commit fa86f3d: [claude-hackernews] draft: deny() vs instruct() for routing Bash DB queries via tidewave (id=48046023)

NiveditJain wants to merge 1 commit into main from hn-deny-vs-instruct-tidewave-48046023
