v1.34.2.0 fix wave: /codex review on CLI 0.130+, /investigate learnings, /sync-gbrain on Supabase (3 community-reported bugs) by garrytan · Pull Request #1478 · garrytan/gstack

garrytan · 2026-05-13T18:24:29Z

Summary

Three filed bugs land together. All three were silent-or-loud regressions on user flows; each ships with a regression test that locks the fix in.

Fixes:

/codex review broken on Codex CLI ≥ 0.130.0 — passes [PROMPT] + --base together, which the CLI now rejects #1428 — /codex review broken on Codex CLI ≥ 0.130.0 (fcbc217f). The CLI made [PROMPT] and --base <branch> mutually exclusive at argv level, so every /codex review call died with error: the argument '[PROMPT]' cannot be used with '--base <BRANCH>'. Reported by Stashub. Fixed via dual-path: bare codex review --base for default, codex exec + tempfile prompt with DIFF_START/DIFF_END delimiters for /codex review <focus>.
/investigate skill logs type:"investigation" but gstack-learnings-log silently rejects it #1423 — /investigate learnings silently dropped (f94aef73). The skill emitted type:"investigation" but bin/gstack-learnings-log:22 rejected the type, exiting 1 to stderr. Reported by diogolealassis. One-line fix: add 'investigation' to ALLOWED_TYPES.
fix(gstack-gbrain-sync): engine always "unknown" when gbrain ≥0.25 is unhealthy — silently breaks /sync-gbrain for all Supabase users #1415 — /sync-gbrain returned engine: unknown on every Supabase install (9ae98e9d). Two stacking bugs: execSync threw on gbrain doctor's non-zero exit, and gbrain ≥ 0.25 dropped the top-level engine field. Patch contributed by Shiv @shivasymbl (tested on gstack v1.31.0.0 + gbrain v0.31.3 + Supabase). Switched to execFileSync, recover stdout from the thrown error, fall back to gbrain config.json (respecting GBRAIN_HOME), added logGbrainError for forensic JSONL.

Test infrastructure (0ef4ff80): raised timeouts for two pre-existing slow integration tests (gstack-artifacts-init × 16, gstack-next-version × 1). Both were timing out at the default 5s on developer machines due to subprocess fork+exec overhead; same flake reproduced on origin/main before this branch. Matches the brain-sync.test.ts pattern already in the repo (alias test to a 30s-timeout wrapper).

Triage: GitHub issue #1412 (PM skill proposal) got a polite decline comment explaining the proposed shape is enterprise-PM coordination ceremony that gstack is explicitly trying to compress. Suggested the proposer publish it as a separate skill collection.

Test Coverage

[+] bin/gstack-learnings-log               [★★★ TESTED]  test/learnings.test.ts:106-115 (round-trip + caller-contract)
[+] lib/gstack-memory-helpers.ts           [★★★ TESTED]  test/gstack-memory-helpers.test.ts:316-330 (schema_v2 fallback)
[+] codex/SKILL.md.tmpl Step 2A            [★★★ TESTED]  test/codex-hardening.test.ts:368-426 (template + generated SKILL.md)
[+] test/gstack-artifacts-init.test.ts     [INFRA]       12-line timeout-wrapper, no production code touched
[+] test/gstack-next-version.test.ts       [INFRA]       single-test timeout=30000, no production code touched

COVERAGE: 3/3 production-bug paths tested (100%)  |  QUALITY: ★★★:3

bun test exit 0, all green (full suite, including the previously-timing-out integration tests).

Pre-Landing Review

No structural issues. Diff has no SQL, no LLM trust boundary, no auth surface. Plan already cleared /plan-eng-review (CLEAR) and /codex plan-review (12 findings, all 12 incorporated).

Plan Completion

Plan at ~/.claude/plans/system-instruction-you-are-working-floating-mccarthy.md. All planned items DONE: Fix 1 (codex template, both bare + exec paths), Fix 2 (learnings allowlist), Fix 3 (gbrain detect with full codex-review hardening: execFileSync, GBRAIN_HOME, error logging), 3 regression tests, defensive -c system_prompt probe (confirmed Codex 0.130 doesn't expose the key, fell through to documented bare-path-without-boundary). PM proposal #1412 triaged separately as planned (comment posted).

Verification Results

All three manual smoke-tests passed using isolated temp HOME/GSTACK_HOME/GBRAIN_HOME (the destructive verification steps from the original plan were rewritten to never touch the developer's real state — codex review finding #11).

TODOS

No items completed in this PR. TODOS.md unchanged.

Test plan

bun test passes (exit 0, 0 failures)
codex review --base main argv-parses without error: the argument '[PROMPT]' cannot be used with '--base <BRANCH>'
gstack-learnings-log accepts type:"investigation" and writes to learnings file
detectEngineTier() resolves to engine:"supabase" against synthetic ~/.gbrain/config.json with no gbrain on PATH (proves the config.json fallback path)
Codex plan-review with all 12 findings incorporated (see ## GSTACK REVIEW REPORT in plan file)

🤖 Generated with Claude Code

^{Need help on this PR? Tag @codesmith with what you need.}

Let Codesmith autofix CI failures and bot reviews

The /investigate skill instructed agents to log learnings with type:"investigation", but bin/gstack-learnings-log:22 rejected anything not in [pattern, pitfall, preference, architecture, tool, operational]. Every investigation run exited 1 to stderr and the learning was dropped, silently to the user. Fix: add 'investigation' to ALLOWED_TYPES. Regression test: round-trips a learning with type:"investigation" and asserts exit 0 + file write; second test reads investigate/SKILL.md.tmpl and asserts it emits the literal type:"investigation" string, guarding the template/validator contract at both ends. Fixes #1423. Reported by diogolealassis.

@shivasymbl

… doctor exit freshDetectEngineTier() in lib/gstack-memory-helpers.ts returned engine: "unknown" for every Supabase user on gbrain ≥0.25. Two stacking bugs: 1. execSync("gbrain doctor --json --fast 2>/dev/null") threw on non-zero exit. gbrain doctor exits 1 whenever health_score < 100, which is essentially every fresh install due to resolver_health warnings. The JSON output never reached the parser. 2. gbrain ≥0.25 shipped schema_version:2 doctor output that dropped the top-level 'engine' field entirely. Result: every /sync-gbrain on Supabase logged 'engine=unknown' and skipped all sync stages silently. Fix: - Replace execSync with execFileSync (no shell, no bash-specific 2>/dev/null redirect; portable to Windows). - Recover stdout from the thrown error object so non-zero exits still parse. - Fall back to reading gbrain's config.json (respecting GBRAIN_HOME env var, defaulting to ~/.gbrain/config.json) when doctor output doesn't surface an engine field. - Add logGbrainError() helper that appends one-line JSONL to ~/.gstack/.gbrain-errors.jsonl on parse failure, so future regressions leave a forensic trail. The "supabase" tier here means "remote postgres" in practice — gbrain config uses engine:"postgres" for both real Supabase and any other remote postgres (e.g. local-postgres-for-testing). Downstream sync code treats them identically, so the label compression is intentional and documented inline. Regression test: existing detectEngineTier suite now isolates HOME + GBRAIN_HOME + PATH to temp dirs (closes a flake source where the prior tests would read whatever was on the reviewer's machine). New test forces gbrain off PATH, writes a synthetic config.json with engine:"postgres", asserts detectEngineTier() returns engine:"supabase". Fixes #1415. Patch shape contributed by Shiv @shivasymbl (tested on gstack v1.31.0.0 + gbrain v0.31.3 + Supabase).

Codex CLI 0.130.0 made [PROMPT] and --base <BRANCH> mutually exclusive at argv level. Step 2A of codex/SKILL.md.tmpl had always passed both (the filesystem boundary prefix as the prompt argument + the base branch), so every /codex review call died with: error: the argument '[PROMPT]' cannot be used with '--base <BRANCH>' Fix: split Step 2A into two paths. Default (no custom user instructions): bare 'codex review --base <base>'. Codex's review prompt is internally diff-scoped, so the model focuses on the changes against base. The filesystem boundary prefix is dropped here because Codex 0.130 has no documented system-prompt config key (probed -c 'system_prompt="..."' against 0.130 — the flag is silently accepted but the value isn't applied). Skill files under .claude/ and agents/ are public, so this is a token-efficiency concern, not a safety one. Custom instructions (/codex review <focus>): route through codex exec with the diff written to a tempfile, inlined into the prompt between explicit DIFF_START / DIFF_END markers. The boundary is preserved here because codex exec isn't auto-scoped to the diff. The DIFF_START/END delimiters tell the model where data ends and instructions resume, which materially reduces prompt-injection hijack rates when the diff contains adversarial content. Note on bash semantics: codex's earlier review flagged the exec route as "command injection via $_DIFF interpolation." That framing is wrong — bash parameter expansion does not re-evaluate $(...) or backticks inside the expanded value, so a diff containing $(rm -rf /) is plain string data to codex exec. The real risk is prompt injection (model-side, not shell-side), which the DIFF_START/END pattern mitigates. Regression tests in test/codex-hardening.test.ts assert across BOTH codex/SKILL.md.tmpl AND the generated codex/SKILL.md: 1. No 'codex review' invocation line combines a quoted-string OR variable positional argument with --base. 2. Step 2A still contains either bare 'codex review --base' OR 'codex exec' (guards against accidental deletion of both fix paths). Fixes #1428. Reported by Stashub.

Two test files were timing out at the default 5s on developer machines, both pre-existing on origin/main but unrelated to this branch's bug fixes: - test/gstack-artifacts-init.test.ts: 13 tests spawning real subprocesses via fake gh/glab/git shims in PATH. bun's fork+exec overhead pushed these past 5s consistently. Added a local test-wrapper that aliases test() with a 30s timeout (matches the brain-sync.test.ts pattern already in the repo). - test/gstack-next-version.test.ts: one integration smoke test that spawns 'bun run ./bin/gstack-next-version' and parses the resulting JSON. The subprocess does a 'gh pr list' against the live GitHub API to enumerate claimed version slots. Network latency makes 5s tight; raised this single test to 30s. No production code changed. The tests already passed deterministically once given enough wall-clock time.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions · 2026-05-13T18:36:39Z

E2E Evals: ✅ PASS

3/3 tests passed | $1.20 total cost | 12 parallel runners

Suite	Result	Status	Cost
e2e-browse	1/1	✅	$0.04
e2e-review	1/1	✅	$0.58
e2e-review	1/1	✅	$0.58

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

garrytan and others added 5 commits May 13, 2026 11:21

chore: bump version and changelog (v1.34.2.0)

0fe03ec

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.34.2.0 fix wave: /codex review on CLI 0.130+, /investigate learnings, /sync-gbrain on Supabase (3 community-reported bugs)#1478

v1.34.2.0 fix wave: /codex review on CLI 0.130+, /investigate learnings, /sync-gbrain on Supabase (3 community-reported bugs)#1478
garrytan wants to merge 5 commits into
mainfrom
garrytan/macau-v2

garrytan commented May 13, 2026 •

edited by blacksmith-sh Bot

Loading

Uh oh!

github-actions Bot commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

garrytan commented May 13, 2026 • edited by blacksmith-sh Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Coverage

Pre-Landing Review

Plan Completion

Verification Results

TODOS

Test plan

Uh oh!

github-actions Bot commented May 13, 2026

E2E Evals: ✅ PASS

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

garrytan commented May 13, 2026 •

edited by blacksmith-sh Bot

Loading