From 58dfb658dd04e938381c89d8cd105c0aeb64947c Mon Sep 17 00:00:00 2001
From: Albert Mavashev
Date: Thu, 14 May 2026 10:38:29 -0400
Subject: [PATCH] skill(blog): codex prompt should discover paths, not trust examples
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two edits to .claude/skills/blog/SKILL.md and
.agents/skills/blog/SKILL.md (kept in sync):

1. Phase 4 source-code audit step — instruct the auditor to discover
   the repo layout first via `gh api repos///git/trees/
   main?recursive=1` before assuming any path. Names the example-paths-
   are-stale failure mode explicitly (`langchain_runcycles/...` vs
   `src/langchain_runcycles/...`).

2. Phase 9a codex prompt template — replace the prescribed
   `gh api repos///contents/...` example (which codex typically cannot
   shell to under read-only sandbox) with explicit instructions to
   discover paths and tell codex its read-only sandbox blocks `gh`
   calls, so it should route through its GitHub connector.

Triggered by codex's open question in the v0.2.0 LangChain runcycles
review (PR #644): "Should the source-path references in the review
prompt be updated from `src/langchain_runcycles/...` to
`langchain_runcycles/...`?" — same class of fragility the skill's own
source-audit clause was meant to catch.
---
 .agents/skills/blog/SKILL.md | 17 +++++++++++------
 .claude/skills/blog/SKILL.md | 17 +++++++++++------
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/.agents/skills/blog/SKILL.md b/.agents/skills/blog/SKILL.md
index dd801bd..aea312b 100644
--- a/.agents/skills/blog/SKILL.md
+++ b/.agents/skills/blog/SKILL.md
@@ -52,7 +52,7 @@ Spawn agents in parallel — each covers a subset of the eight dimensions, joint
 11. **Link verification (dim 3):** Check every internal link resolves to an existing .md file; flag any link in a list rather than contextual prose; check link count is in the 5–8 range for pillar posts.
 12. **Fact-check on prose (dims 1, 2):** Verify all prose claims, dollar figures, version numbers, and named features against source posts, upstream READMEs, and release notes; flag overclaims, hype phrases, unverifiable absolutes.
-13. **Source-code audit (dims 1, 5) — when the post contains code or makes code-level claims:** Fetch the actual source files from the referenced upstream repo (e.g. `gh api repos///contents/ --jq .content | base64 -d`) and verify, per claim:
+13. **Source-code audit (dims 1, 5) — when the post contains code or makes code-level claims:** Fetch the actual source files from the referenced upstream repo and verify, per claim. Discover the repo layout first (`gh api repos///git/trees/main?recursive=1 --jq '.tree[] | select(.path | endswith(\".java\") or endswith(\".py\") or ...) | .path'`) before assuming any path — example paths in this skill or in prior prompts may be stale against the current package layout (e.g., `langchain_runcycles/...` vs `src/langchain_runcycles/...`). Then fetch and verify, per claim:
     - **Operator order in reactive/async code** — e.g. `doOnError` attached before vs. after `concatWith` changes whether commit-Mono failures trigger upstream cleanup. Reactor / RxJava / Project Reactor claims are particularly easy to get wrong from prose alone.
     - **Method signatures and return types** — does `chatClient.prompt().stream()` actually return `Flux`, or a stream-spec on which `.chatResponse()` yields it?
     - **Field names, action labels, header names** — anything quoted in backticks should be searchable in source.
@@ -110,11 +110,16 @@ codex exec --sandbox read-only --cd --skip-git-repo-check \
    before critique). No dimension is skippable.
 3. Explicitly say NOT to edit files (read-only sandbox enforces this anyway).
 4. NAME THE UPSTREAM SOURCE REPOS and tell codex to fetch and read the
-   relevant source files (e.g. via 'gh api repos///contents/...
-   --jq .content | base64 -d') before judging code-level claims. Give
-   example file paths if known. Tell codex explicitly to verify operator
-   order in any reactive/async pseudocode, method signatures of framework
-   abstractions cited, error/release paths, fluent-builder requirements,
+   relevant source files before judging code-level claims. **Do not pin
+   file paths in the prompt** — codex's read-only sandbox typically
+   blocks shell `gh` calls, so codex will route through its GitHub
+   connector, and example paths in this template (or in your prompt) may
+   be stale against the current package layout. Instruct codex to
+   **discover the repo layout first** (list the tree), then locate
+   relevant files by name pattern. Tell codex explicitly to verify
+   operator order in any reactive/async pseudocode, method signatures of
+   framework abstractions cited, error/release paths, fluent-builder
+   requirements,
    type aliases against their actual source definitions, and any quoted
    identifier (field, action label, header) against the actual source. Do
    not trust the post's own pseudocode as ground truth.
 5. Ask for output bucketed by FACTUAL / OVERCLAIM / CROSS-LINKS / SEO /
diff --git a/.claude/skills/blog/SKILL.md b/.claude/skills/blog/SKILL.md
index dd801bd..aea312b 100644
--- a/.claude/skills/blog/SKILL.md
+++ b/.claude/skills/blog/SKILL.md
@@ -52,7 +52,7 @@ Spawn agents in parallel — each covers a subset of the eight dimensions, joint
 11. **Link verification (dim 3):** Check every internal link resolves to an existing .md file; flag any link in a list rather than contextual prose; check link count is in the 5–8 range for pillar posts.
 12. **Fact-check on prose (dims 1, 2):** Verify all prose claims, dollar figures, version numbers, and named features against source posts, upstream READMEs, and release notes; flag overclaims, hype phrases, unverifiable absolutes.
-13. **Source-code audit (dims 1, 5) — when the post contains code or makes code-level claims:** Fetch the actual source files from the referenced upstream repo (e.g. `gh api repos///contents/ --jq .content | base64 -d`) and verify, per claim:
+13. **Source-code audit (dims 1, 5) — when the post contains code or makes code-level claims:** Fetch the actual source files from the referenced upstream repo and verify, per claim. Discover the repo layout first (`gh api repos///git/trees/main?recursive=1 --jq '.tree[] | select(.path | endswith(\".java\") or endswith(\".py\") or ...) | .path'`) before assuming any path — example paths in this skill or in prior prompts may be stale against the current package layout (e.g., `langchain_runcycles/...` vs `src/langchain_runcycles/...`). Then fetch and verify, per claim:
     - **Operator order in reactive/async code** — e.g. `doOnError` attached before vs. after `concatWith` changes whether commit-Mono failures trigger upstream cleanup. Reactor / RxJava / Project Reactor claims are particularly easy to get wrong from prose alone.
     - **Method signatures and return types** — does `chatClient.prompt().stream()` actually return `Flux`, or a stream-spec on which `.chatResponse()` yields it?
     - **Field names, action labels, header names** — anything quoted in backticks should be searchable in source.
@@ -110,11 +110,16 @@ codex exec --sandbox read-only --cd --skip-git-repo-check \
    before critique). No dimension is skippable.
 3. Explicitly say NOT to edit files (read-only sandbox enforces this anyway).
 4. NAME THE UPSTREAM SOURCE REPOS and tell codex to fetch and read the
-   relevant source files (e.g. via 'gh api repos///contents/...
-   --jq .content | base64 -d') before judging code-level claims. Give
-   example file paths if known. Tell codex explicitly to verify operator
-   order in any reactive/async pseudocode, method signatures of framework
-   abstractions cited, error/release paths, fluent-builder requirements,
+   relevant source files before judging code-level claims. **Do not pin
+   file paths in the prompt** — codex's read-only sandbox typically
+   blocks shell `gh` calls, so codex will route through its GitHub
+   connector, and example paths in this template (or in your prompt) may
+   be stale against the current package layout. Instruct codex to
+   **discover the repo layout first** (list the tree), then locate
+   relevant files by name pattern. Tell codex explicitly to verify
+   operator order in any reactive/async pseudocode, method signatures of
+   framework abstractions cited, error/release paths, fluent-builder
+   requirements,
    type aliases against their actual source definitions, and any quoted
    identifier (field, action label, header) against the actual source. Do
    not trust the post's own pseudocode as ground truth.
 5. Ask for output bucketed by FACTUAL / OVERCLAIM / CROSS-LINKS / SEO /