v1.34.2.0 fix wave: /codex review on CLI 0.130+, /investigate learnings, /sync-gbrain on Supabase (3 community-reported bugs)#1478
Open
garrytan wants to merge 5 commits into
Open
v1.34.2.0 fix wave: /codex review on CLI 0.130+, /investigate learnings, /sync-gbrain on Supabase (3 community-reported bugs)#1478garrytan wants to merge 5 commits into
garrytan wants to merge 5 commits into
Conversation
The /investigate skill instructed agents to log learnings with type:"investigation", but bin/gstack-learnings-log:22 rejected anything not in [pattern, pitfall, preference, architecture, tool, operational]. Every investigation run exited 1 to stderr and the learning was dropped, silently to the user. Fix: add 'investigation' to ALLOWED_TYPES. Regression test: round-trips a learning with type:"investigation" and asserts exit 0 + file write; second test reads investigate/SKILL.md.tmpl and asserts it emits the literal type:"investigation" string, guarding the template/validator contract at both ends. Fixes #1423. Reported by diogolealassis.
… doctor exit
freshDetectEngineTier() in lib/gstack-memory-helpers.ts returned engine:
"unknown" for every Supabase user on gbrain ≥0.25. Two stacking bugs:
1. execSync("gbrain doctor --json --fast 2>/dev/null") threw on non-zero
exit. gbrain doctor exits 1 whenever health_score < 100, which is
essentially every fresh install due to resolver_health warnings. The
JSON output never reached the parser.
2. gbrain ≥0.25 shipped schema_version:2 doctor output that dropped the
top-level 'engine' field entirely.
Result: every /sync-gbrain on Supabase logged 'engine=unknown' and skipped
all sync stages silently.
Fix:
- Replace execSync with execFileSync (no shell, no bash-specific 2>/dev/null
redirect; portable to Windows).
- Recover stdout from the thrown error object so non-zero exits still parse.
- Fall back to reading gbrain's config.json (respecting GBRAIN_HOME env var,
defaulting to ~/.gbrain/config.json) when doctor output doesn't surface
an engine field.
- Add logGbrainError() helper that appends one-line JSONL to
~/.gstack/.gbrain-errors.jsonl on parse failure, so future regressions
leave a forensic trail.
The "supabase" tier here means "remote postgres" in practice — gbrain
config uses engine:"postgres" for both real Supabase and any other
remote postgres (e.g. local-postgres-for-testing). Downstream sync code
treats them identically, so the label compression is intentional and
documented inline.
Regression test: existing detectEngineTier suite now isolates HOME +
GBRAIN_HOME + PATH to temp dirs (closes a flake source where the prior
tests would read whatever was on the reviewer's machine). New test
forces gbrain off PATH, writes a synthetic config.json with
engine:"postgres", asserts detectEngineTier() returns
engine:"supabase".
Fixes #1415. Patch shape contributed by Shiv @shivasymbl (tested on
gstack v1.31.0.0 + gbrain v0.31.3 + Supabase).
Codex CLI 0.130.0 made [PROMPT] and --base <BRANCH> mutually exclusive at argv level. Step 2A of codex/SKILL.md.tmpl had always passed both (the filesystem boundary prefix as the prompt argument + the base branch), so every /codex review call died with: error: the argument '[PROMPT]' cannot be used with '--base <BRANCH>' Fix: split Step 2A into two paths. Default (no custom user instructions): bare 'codex review --base <base>'. Codex's review prompt is internally diff-scoped, so the model focuses on the changes against base. The filesystem boundary prefix is dropped here because Codex 0.130 has no documented system-prompt config key (probed -c 'system_prompt="..."' against 0.130 — the flag is silently accepted but the value isn't applied). Skill files under .claude/ and agents/ are public, so this is a token-efficiency concern, not a safety one. Custom instructions (/codex review <focus>): route through codex exec with the diff written to a tempfile, inlined into the prompt between explicit DIFF_START / DIFF_END markers. The boundary is preserved here because codex exec isn't auto-scoped to the diff. The DIFF_START/END delimiters tell the model where data ends and instructions resume, which materially reduces prompt-injection hijack rates when the diff contains adversarial content. Note on bash semantics: codex's earlier review flagged the exec route as "command injection via $_DIFF interpolation." That framing is wrong — bash parameter expansion does not re-evaluate $(...) or backticks inside the expanded value, so a diff containing $(rm -rf /) is plain string data to codex exec. The real risk is prompt injection (model-side, not shell-side), which the DIFF_START/END pattern mitigates. Regression tests in test/codex-hardening.test.ts assert across BOTH codex/SKILL.md.tmpl AND the generated codex/SKILL.md: 1. No 'codex review' invocation line combines a quoted-string OR variable positional argument with --base. 2. Step 2A still contains either bare 'codex review --base' OR 'codex exec' (guards against accidental deletion of both fix paths). Fixes #1428. Reported by Stashub.
Two test files were timing out at the default 5s on developer machines, both pre-existing on origin/main but unrelated to this branch's bug fixes: - test/gstack-artifacts-init.test.ts: 13 tests spawning real subprocesses via fake gh/glab/git shims in PATH. bun's fork+exec overhead pushed these past 5s consistently. Added a local test-wrapper that aliases test() with a 30s timeout (matches the brain-sync.test.ts pattern already in the repo). - test/gstack-next-version.test.ts: one integration smoke test that spawns 'bun run ./bin/gstack-next-version' and parses the resulting JSON. The subprocess does a 'gh pr list' against the live GitHub API to enumerate claimed version slots. Network latency makes 5s tight; raised this single test to 30s. No production code changed. The tests already passed deterministically once given enough wall-clock time.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
E2E Evals: ✅ PASS3/3 tests passed | $1.20 total cost | 12 parallel runners
12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three filed bugs land together. All three were silent-or-loud regressions on user flows; each ships with a regression test that locks the fix in.
Fixes:
/codex reviewbroken on Codex CLI ≥ 0.130.0 (fcbc217f). The CLI made[PROMPT]and--base <branch>mutually exclusive at argv level, so every/codex reviewcall died witherror: the argument '[PROMPT]' cannot be used with '--base <BRANCH>'. Reported byStashub. Fixed via dual-path: barecodex review --basefor default,codex exec+ tempfile prompt with DIFF_START/DIFF_END delimiters for/codex review <focus>./investigatelearnings silently dropped (f94aef73). The skill emittedtype:"investigation"butbin/gstack-learnings-log:22rejected the type, exiting 1 to stderr. Reported bydiogolealassis. One-line fix: add'investigation'toALLOWED_TYPES./sync-gbrainreturnedengine: unknownon every Supabase install (9ae98e9d). Two stacking bugs:execSyncthrew ongbrain doctor's non-zero exit, and gbrain ≥ 0.25 dropped the top-levelenginefield. Patch contributed byShiv @shivasymbl(tested on gstack v1.31.0.0 + gbrain v0.31.3 + Supabase). Switched toexecFileSync, recover stdout from the thrown error, fall back togbrainconfig.json (respectingGBRAIN_HOME), addedlogGbrainErrorfor forensic JSONL.Test infrastructure (
0ef4ff80): raised timeouts for two pre-existing slow integration tests (gstack-artifacts-init× 16,gstack-next-version× 1). Both were timing out at the default 5s on developer machines due to subprocess fork+exec overhead; same flake reproduced onorigin/mainbefore this branch. Matches thebrain-sync.test.tspattern already in the repo (aliastestto a 30s-timeout wrapper).Triage: GitHub issue #1412 (PM skill proposal) got a polite decline comment explaining the proposed shape is enterprise-PM coordination ceremony that gstack is explicitly trying to compress. Suggested the proposer publish it as a separate skill collection.
Test Coverage
bun testexit 0, all green (full suite, including the previously-timing-out integration tests).Pre-Landing Review
No structural issues. Diff has no SQL, no LLM trust boundary, no auth surface. Plan already cleared
/plan-eng-review(CLEAR) and/codexplan-review (12 findings, all 12 incorporated).Plan Completion
Plan at
~/.claude/plans/system-instruction-you-are-working-floating-mccarthy.md. All planned items DONE: Fix 1 (codex template, both bare + exec paths), Fix 2 (learnings allowlist), Fix 3 (gbrain detect with full codex-review hardening: execFileSync, GBRAIN_HOME, error logging), 3 regression tests, defensive-c system_promptprobe (confirmed Codex 0.130 doesn't expose the key, fell through to documented bare-path-without-boundary). PM proposal #1412 triaged separately as planned (comment posted).Verification Results
All three manual smoke-tests passed using isolated temp
HOME/GSTACK_HOME/GBRAIN_HOME(the destructive verification steps from the original plan were rewritten to never touch the developer's real state — codex review finding #11).TODOS
No items completed in this PR. TODOS.md unchanged.
Test plan
bun testpasses (exit 0, 0 failures)codex review --base mainargv-parses withouterror: the argument '[PROMPT]' cannot be used with '--base <BRANCH>'gstack-learnings-logacceptstype:"investigation"and writes to learnings filedetectEngineTier()resolves toengine:"supabase"against synthetic~/.gbrain/config.jsonwith nogbrainon PATH (proves the config.json fallback path)## GSTACK REVIEW REPORTin plan file)🤖 Generated with Claude Code
Need help on this PR? Tag
@codesmithwith what you need.