
fix: add mechanical migration and dependency verification gates to cataractae#503

Merged
MichielDean merged 6 commits into main from fix/cataractae-migration-gates on May 11, 2026

Conversation

@MichielDean
Owner

Summary

After the LLMem Go rewrite shipped with a vec0 migration bug, a missing viant/sqlite-vec dependency import, 60s test timeouts when Ollama was unavailable, and broken CLI backward compatibility, I analyzed why Cistern's pipeline missed all of these.

Root cause: the pipeline checks code against briefs, but never checks briefs against reality. No cataractae had visibility into the existing system's data, CLI interface, or dependency manifest.

This PR adds mechanical gates that force specific verification actions:

Architect INSTRUCTIONS.md — Migration Surface Analysis (mandatory for rewrites):

  • Enumerate every migration scenario (existing DB, config, CLI, hooks)
  • Specific verification commands for each scenario
  • Dependency verification: brief says X, is X in go.mod?
  • Breaking change matrix with callers
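
The dependency check ("brief says X, is X in go.mod?") can be mechanized. Below is a minimal sketch, assuming the brief's dependencies are available as a list of full module paths; `requiredModules` and `missingDeps` are hypothetical helper names, not functions from this PR, and the parser only handles plain `require` directives.

```go
package main

import (
	"bufio"
	"strings"
)

// requiredModules returns the set of module paths named in the require
// directives of a go.mod file's contents. It handles both the single-line
// form ("require path version") and the parenthesized block form.
// Hypothetical helper for illustration only.
func requiredModules(goMod string) map[string]bool {
	mods := make(map[string]bool)
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments such as "// indirect".
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		switch {
		case len(fields) == 0:
			continue
		case inBlock && fields[0] == ")":
			inBlock = false
		case inBlock:
			mods[fields[0]] = true
		case fields[0] == "require" && len(fields) >= 2 && fields[1] == "(":
			inBlock = true
		case fields[0] == "require" && len(fields) >= 3:
			mods[fields[1]] = true
		}
	}
	return mods
}

// missingDeps lists the brief's dependencies that go.mod does not require,
// preserving the order in which the brief named them.
func missingDeps(briefDeps []string, goMod string) []string {
	have := requiredModules(goMod)
	var missing []string
	for _, dep := range briefDeps {
		if !have[dep] {
			missing = append(missing, dep)
		}
	}
	return missing
}
```

A gate built on this would fail the architect step whenever `missingDeps` returns a non-empty slice. Checking go.mod only proves the module is declared; confirming that the package is actually imported in source is a separate check.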

Reviewer INSTRUCTIONS.md — Three new rubric checks:

  • Migration compatibility: does the code handle opening databases from the previous version?
  • Dependency verification: cross-reference brief dependencies against dependency file
  • Test timeout discipline: 30s max for I/O tests, no indefinite hangs
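
One way to enforce the timeout rule in Go tests is to run every I/O call under a bounded context. The following is a sketch, not code from this PR: `runWithDeadline`, `isTimeout`, and `unresponsive` are hypothetical names, and the 30s ceiling is taken from the rubric above.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// maxIOTestTimeout mirrors the 30s ceiling from the reviewer rubric.
const maxIOTestTimeout = 30 * time.Second

// runWithDeadline runs op under a context that expires after limit (capped
// at maxIOTestTimeout) and returns as soon as op finishes or the deadline
// passes, whichever comes first.
func runWithDeadline(limit time.Duration, op func(ctx context.Context) error) error {
	if limit > maxIOTestTimeout {
		limit = maxIOTestTimeout
	}
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() { done <- op(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// isTimeout reports whether err came from an expired deadline.
func isTimeout(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// unresponsive simulates a dependency that never answers (for example an
// unavailable Ollama endpoint): it blocks until the context is cancelled.
func unresponsive(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}
```

A test that calls `runWithDeadline(5*time.Second, probe)` can then skip or fail quickly when `isTimeout(err)` is true, instead of hanging for a full minute.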

QA INSTRUCTIONS.md — Three new test requirements:

  • Migration compatibility test: must test against pre-existing data, not only fresh DBs
  • Dependency verification
  • Deploy-and-verify gate: for rewrites, MUST run core commands against existing data store

Lobsterdog Contributors added 6 commits on May 9, 2026 22:27

The previous ct filter implementation used a direct-exec approach
(opencode run --format json) which doesn't work because opencode run
requires an existing session ID and doesn't produce output on stdout.

The new implementation spawns opencode in a tmux session, mirroring
how cataractae work:

1. Create temp workdir with CONTEXT.md + AGENTS.md + agent identity
2. Spawn opencode run in tmux with --dangerously-skip-permissions --agent filter
3. Use pipe-pane to capture PTY output to a log file
4. Wait for session exit (poll with 10min timeout)
5. Read log, strip ANSI, extract agent response
6. Clean up tmux session and temp files

Key changes:
- filter.go: Replaced callFilterAgent with filterAgentTmux/filterAgentResume
- filter_agent.go: New file with tmux-based spawning logic
- preset.go: Removed FormatArgs (was --format json, doesn't work with opencode)
- dashboard_web.go: Updated filterResume to use invokeFilterNew
- filter_test.go: Kept structural tests, callFilterAgent now returns error

Also unset OPENCODE_SERVER_* env vars in tmux sessions to prevent
'session not found' errors (same fix as cataractae).

Replaces the incomplete FormatArgs fix (PR #500) with a proper
architectural solution.

…lter

The PTY log approach for extracting agent responses was fundamentally
broken — pipe-pane captures raw terminal output mixed with TUI chrome,
and the heuristic extraction (cleanANSI + isTUICrome) consistently
truncated responses or captured TUI artifacts.

Solution: instruct the filter agent to write its response to RESPONSE.md
in the workdir. After the tmux session exits, read the file directly.
No ANSI stripping, no extraction heuristics, no truncated paragraphs.

Changes:
- filterAgentsMD() now instructs the agent to write RESPONSE.md
- filterAgentTmux() reads RESPONSE.md instead of PTY log
- filterAgentResume() uses tmux display-message to find workdir,
  then reads RESPONSE.md instead of polling PTY log for size changes
- Removed cleanANSI(), extractFilterResponse(), isTUICrome() functions
- Removed pipe-pane log setup and PTY parsing logic
- Added tmuxSessionWorkdir() helper for --resume path
- Added dropVec0Objects() to prevent 'no such module: vec0' errors
- Updated fakeagent to write RESPONSE.md when --agent filter is present
- Updated tests: removed deprecated callFilterAgent tests, updated
  FormatArgs test for empty (removed) format args, simplified
  integration tests

The root cause of all previous failures was OPENCODE_SERVER_* env vars
causing 'Session not found' errors. Unsetting those vars allows
opencode run to work correctly as a direct subprocess.

opencode run --format json produces NDJSON output with:
- type:'text' events containing the agent's response
- sessionID for --resume support
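
Parsing that output could look like the sketch below. Only the `type` and `sessionID` fields are named in this commit message; the top-level `text` field is an assumption about where the response body lives (the real payload may be nested differently), and `parseNDJSON` is a hypothetical name rather than the function in filter_agent.go.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// event models only the fields this sketch reads.
type event struct {
	Type      string `json:"type"`
	SessionID string `json:"sessionID"`
	Text      string `json:"text"` // assumption: actual field location may differ
}

// parseNDJSON reads one JSON object per line, concatenates the text of every
// "text" event, and returns the last non-empty session ID it saw.
func parseNDJSON(out []byte) (string, string, error) {
	var response strings.Builder
	sessionID := ""

	sc := bufio.NewScanner(bytes.NewReader(out))
	sc.Buffer(make([]byte, 0, 64*1024), 1<<20) // tolerate lines up to 1 MiB
	for n := 1; sc.Scan(); n++ {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue // skip blank lines between events
		}
		var ev event
		if err := json.Unmarshal(line, &ev); err != nil {
			return "", "", fmt.Errorf("line %d: %w", n, err)
		}
		if ev.SessionID != "" {
			sessionID = ev.SessionID
		}
		if ev.Type == "text" {
			response.WriteString(ev.Text)
		}
	}
	if err := sc.Err(); err != nil {
		return "", "", err
	}
	return response.String(), sessionID, nil
}
```

The `[]byte` input matches what `exec.Cmd.Output` returns, so the subprocess's stdout can be passed in directly.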

This is dramatically simpler and more reliable than the tmux approach:
- No PTY log parsing, no ANSI stripping
- No RESPONSE.md file redirect with timing issues
- No tmux session management
- Direct JSON parsing of stdout, just like the original design intended

The tmux approach was a workaround for 'Session not found' errors that
turned out to be caused by OPENCODE_SERVER_* vars leaking into the
tmux session. Now that we know the root cause, the direct-exec approach
works correctly.

Changes:
- filter_agent.go: complete rewrite from tmux to direct-exec
  - filterAgentRun() runs opencode as subprocess, parses NDJSON
  - filterAgentRunResume() uses -s flag for session continuation
  - buildFilterRunCommand() constructs args and env
  - Env vars OPENCODE_SERVER_*, OPENCODE_PID, OPENCODE are unset
  - --format json flag captures output programmatically
  - Removed tmux, pipe-pane, RESPONSE.md, cleanANSI, extractFilterResponse
  - Removed isSessionAlive, shellQuote, homeDir, minimalTmuxEnv
- filter.go: updated invokeFilterNew/Resume to use new functions
  - callFilterAgent deprecated stub now references filterAgentRun
- filter_test.go: updated for new architecture
  - Replaced tmux-based tests with direct-exec tests
  - Added unsetEnvPrefix, buildFilterRunCommand tests
  - Removed ResponseFileName test (no longer applicable)
- fakeagent: added --agent filter detection for test mode
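
The env scrubbing described above can be sketched as a pure function over a `KEY=value` slice. This is an illustration under assumptions: `scrubEnv` and `dropName` are hypothetical names, and the signature is not claimed to match the `unsetEnvPrefix` helper mentioned in the tests.

```go
package main

import "strings"

// scrubEnv returns a copy of env (entries in "KEY=value" form) without any
// variable whose name starts with one of prefixes or equals one of exact.
func scrubEnv(env, prefixes, exact []string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		name, _, _ := strings.Cut(kv, "=")
		if dropName(name, prefixes, exact) {
			continue
		}
		out = append(out, kv)
	}
	return out
}

// dropName reports whether a variable name matches an exact name or a prefix.
// Exact names are compared in full, so removing "OPENCODE" does not remove
// unrelated variables that merely begin with that word.
func dropName(name string, prefixes, exact []string) bool {
	for _, e := range exact {
		if name == e {
			return true
		}
	}
	for _, p := range prefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}
```

Usage would be along the lines of `cmd.Env = scrubEnv(os.Environ(), []string{"OPENCODE_SERVER_"}, []string{"OPENCODE_PID", "OPENCODE"})` before starting the subprocess.
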
…attern

- Remove stale --file flag reference (flag was removed)
- Replace tmux wrapper pattern with direct ct droplet add --filter
- Note that ct filter now uses opencode run --format json

Codifies the env var pollution root cause and fixes that were discovered
through the PR #500 → #502 → direct-exec arc. Includes:
- Session not found: OPENCODE_SERVER_* env var pollution
- Empty response: missing --dangerously-skip-permissions or invalid model
- Timeout: CT_FILTER_TIMEOUT env var
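
Resolving the timeout from `CT_FILTER_TIMEOUT` might look like the sketch below. Two things here are assumptions, not facts from this PR: that the value is a Go duration string such as `90s`, and that the fallback is 10 minutes (borrowed from the poll timeout mentioned in the first commit). `filterTimeout` is a hypothetical name.

```go
package main

import "time"

// defaultFilterTimeout is an assumed fallback, not a documented default.
const defaultFilterTimeout = 10 * time.Minute

// filterTimeout resolves the filter timeout through a lookup function with
// the same shape as os.LookupEnv, so it can be tested without touching the
// process environment. Unset, unparsable, or non-positive values fall back
// to defaultFilterTimeout.
func filterTimeout(lookup func(key string) (string, bool)) time.Duration {
	raw, ok := lookup("CT_FILTER_TIMEOUT")
	if !ok {
		return defaultFilterTimeout
	}
	d, err := time.ParseDuration(raw) // assumption: value is a duration string
	if err != nil || d <= 0 {
		return defaultFilterTimeout
	}
	return d
}
```

In production code the call would be `filterTimeout(os.LookupEnv)`.
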
…taractae

Post-incident fix after LLMem Go rewrite shipped with vec0 migration
bug, missing dependency import, test timeouts, and CLI flag incompat.

Architect INSTRUCTIONS.md:
- Add Migration Surface Analysis section (mandatory for rewrites)
- Require enumeration of every migration scenario (existing DBs, config,
  CLI invocations, plugins/hooks)
- Require specific verification commands for each scenario
- Require dependency verification: brief dependency → go.mod/package.json
- Require breaking change matrix with callers

Reviewer INSTRUCTIONS.md:
- Add migration compatibility check to migration_safety rubric
- Add dependency verification check (brief mentions X, is X imported?)
- Add test timeout discipline check (30s max for I/O tests)

QA INSTRUCTIONS.md:
- Add migration compatibility test requirement for rewrites
- Add dependency verification check
- Add test timeout discipline check
- Add deploy-and-verify gate for rewrites: MUST run core commands
  against existing data, not just test suite

MichielDean merged commit 19605b1 into main on May 11, 2026
2 of 3 checks passed
MichielDean deleted the fix/cataractae-migration-gates branch on May 11, 2026 04:05
MichielDean added a commit that referenced this pull request on May 11, 2026

## Summary

Building on PR #503 (mechanical migration/dependency gates), this adds a
**droplet reality check** to every cataractae in the pipeline.

The core problem: every cataractae treated the architect's brief as the
source of truth. But the brief is a *translation* of the droplet, and
translations lose information — especially implicit requirements like
"existing data must work" or "the plugin interface must stay the same."

This PR adds a "Droplet Reality Check" section to all six cataractae:

- **Architect**: Extract implicit requirements from the droplet before
writing the brief. If the droplet says "rewrite," include migration
compatibility even if the droplet doesn't spell it out.
- **Implementer**: Compare the brief against the droplet before coding.
If the brief omits something the droplet asked for, file an issue BEFORE
implementing.
- **Reviewer**: Flag gaps between the droplet and implementation even if
the brief was followed correctly. The brief can be wrong.
- **QA**: Test against what the droplet asked for, not just what tests
cover. Verify CLI compatibility, data compatibility, dependency imports.
- **Security**: Check security surface from migration paths in rewrites.
- **Docs**: Document breaking changes and migration paths for rewrites.

The key insight: "follows the brief" and "delivers what was requested"
are different metrics. The pipeline was only measuring the first one.
Now every cataractae measures both.

Combined with #503's mechanical gates (migration compatibility test,
dependency cross-reference, deploy-and-verify, 30s test timeout), this
should prevent the class of bugs where the pipeline faithfully
implements the wrong requirements.

---------

Co-authored-by: Lobsterdog Contributors <noreply@lobsterdog.dev>