fix: add mechanical migration and dependency verification gates to cataractae by MichielDean · Pull Request #503 · MichielDean/Cistern

MichielDean · 2026-05-11T04:04:38Z

Summary

After the LLMem Go rewrite shipped with a vec0 migration bug, missing viant/sqlite-vec dependency import, 60s test timeouts on unavailable Ollama, and broken CLI backward compatibility, I analyzed why Cistern's pipeline missed all of these.

Root cause: the pipeline checks code against briefs, but never checks briefs against reality. No cataractae had visibility into the existing system's data, CLI interface, or dependency manifest.

This PR adds mechanical gates that force specific verification actions:

Architect INSTRUCTIONS.md — Migration Surface Analysis (mandatory for rewrites):

- Enumerate every migration scenario (existing DB, config, CLI, hooks)
- Specific verification commands for each scenario
- Dependency verification: brief says X, is X in go.mod?
- Breaking change matrix with callers
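
The "brief says X, is X in go.mod?" check can be sketched mechanically. The following is a minimal illustration, not the gate's actual implementation: the function names `requiredModules` and `missingDeps` and the `example.com/...` module paths are hypothetical, and the parser only understands gofmt-style `require` directives.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// requiredModules returns the module paths named in the require directives
// of a go.mod file. It handles the single-line form and the parenthesised
// block form. Hypothetical helper, written for this illustration only.
func requiredModules(goMod string) map[string]bool {
	mods := make(map[string]bool)
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments such as "// indirect".
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		switch {
		case len(fields) == 0:
			continue
		case inBlock && fields[0] == ")":
			inBlock = false
		case inBlock:
			mods[fields[0]] = true
		case fields[0] == "require" && len(fields) >= 2 && fields[1] == "(":
			inBlock = true
		case fields[0] == "require" && len(fields) >= 3:
			mods[fields[1]] = true
		}
	}
	return mods
}

// missingDeps lists the dependencies named in the brief that go.mod
// does not require.
func missingDeps(briefDeps []string, goMod string) []string {
	have := requiredModules(goMod)
	var missing []string
	for _, dep := range briefDeps {
		if !have[dep] {
			missing = append(missing, dep)
		}
	}
	return missing
}

func main() {
	// Made-up go.mod content and module paths, for demonstration only.
	goMod := "module example.com/app\n\ngo 1.22\n\nrequire (\n\texample.com/sqlite v1.0.0\n)\n"
	brief := []string{"example.com/sqlite", "example.com/vecstore"}
	fmt.Println(missingDeps(brief, goMod))
}
```

This only checks the manifest. A module can be listed in go.mod and still never be imported by the code, so the reviewer check ("is X imported?") remains a separate step.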

Reviewer INSTRUCTIONS.md — Three new rubric checks:

- Migration compatibility: does the code handle opening databases from the previous version?
- Dependency verification: cross-reference brief dependencies against dependency file
- Test timeout discipline: 30s max for I/O tests, no indefinite hangs

QA INSTRUCTIONS.md — Three new test requirements:

- Migration compatibility test: must test against pre-existing data, not only fresh DBs
- Dependency verification
- Deploy-and-verify gate: for rewrites, MUST run core commands against existing data store

The previous ct filter implementation used a direct-exec approach (opencode run --format json), which doesn't work because opencode run requires an existing session ID and doesn't produce output on stdout. The new implementation spawns opencode in a tmux session, mirroring how cataractae work:

1. Create temp workdir with CONTEXT.md + AGENTS.md + agent identity
2. Spawn opencode run in tmux with --dangerously-skip-permissions --agent filter
3. Use pipe-pane to capture PTY output to a log file
4. Wait for session exit (poll with 10min timeout)
5. Read log, strip ANSI, extract agent response
6. Clean up tmux session and temp files

Key changes:

- filter.go: Replaced callFilterAgent with filterAgentTmux/filterAgentResume
- filter_agent.go: New file with tmux-based spawning logic
- preset.go: Removed FormatArgs (was --format json, doesn't work with opencode)
- dashboard_web.go: Updated filterResume to use invokeFilterNew
- filter_test.go: Kept structural tests, callFilterAgent now returns error

Also unset OPENCODE_SERVER_* env vars in tmux sessions to prevent 'session not found' errors (same fix as cataractae). Replaces the incomplete FormatArgs fix (PR #500) with a proper architectural solution.

…lter

The PTY log approach for extracting agent responses was fundamentally broken: pipe-pane captures raw terminal output mixed with TUI chrome, and the heuristic extraction (cleanANSI + isTUICrome) consistently truncated responses or captured TUI artifacts.

Solution: instruct the filter agent to write its response to RESPONSE.md in the workdir. After the tmux session exits, read the file directly. No ANSI stripping, no extraction heuristics, no truncated paragraphs.

Changes:

- filterAgentsMD() now instructs the agent to write RESPONSE.md
- filterAgentTmux() reads RESPONSE.md instead of PTY log
- filterAgentResume() uses tmux display-message to find workdir, then reads RESPONSE.md instead of polling PTY log for size changes
- Removed cleanANSI(), extractFilterResponse(), isTUICrome() functions
- Removed pipe-pane log setup and PTY parsing logic
- Added tmuxSessionWorkdir() helper for --resume path
- Added dropVec0Objects() to prevent 'no such module: vec0' errors
- Updated fakeagent to write RESPONSE.md when --agent filter is present
- Updated tests: removed deprecated callFilterAgent tests, updated FormatArgs test for empty (removed) format args, simplified integration tests

The root cause of all previous failures was OPENCODE_SERVER_* env vars causing 'Session not found' errors. Unsetting those vars allows opencode run to work correctly as a direct subprocess.

opencode run --format json produces NDJSON output with:

- type:'text' events containing the agent's response
- sessionID for --resume support

This is dramatically simpler and more reliable than the tmux approach:

- No PTY log parsing, no ANSI stripping
- No RESPONSE.md file redirect with timing issues
- No tmux session management
- Direct JSON parsing of stdout, just like the original design intended

The tmux approach was a workaround for 'Session not found' errors that turned out to be caused by OPENCODE_SERVER_* vars leaking into the tmux session. Now that we know the root cause, the direct-exec approach works correctly.

Changes:

- filter_agent.go: complete rewrite from tmux to direct-exec
  - filterAgentRun() runs opencode as subprocess, parses NDJSON
  - filterAgentRunResume() uses -s flag for session continuation
  - buildFilterRunCommand() constructs args and env
  - Env vars OPENCODE_SERVER_*, OPENCODE_PID, OPENCODE are unset
  - --format json flag captures output programmatically
  - Removed tmux, pipe-pane, RESPONSE.md, cleanANSI, extractFilterResponse
  - Removed isSessionAlive, shellQuote, homeDir, minimalTmuxEnv
- filter.go: updated invokeFilterNew/Resume to use new functions
  - callFilterAgent deprecated stub now references filterAgentRun
- filter_test.go: updated for new architecture
  - Replaced tmux-based tests with direct-exec tests
  - Added unsetEnvPrefix, buildFilterRunCommand tests
  - Removed ResponseFileName test (no longer applicable)
- fakeagent: added --agent filter detection for test mode

…attern

- Remove stale --file flag reference (flag was removed)
- Replace tmux wrapper pattern with direct ct droplet add --filter
- Note that ct filter now uses opencode run --format json

Codifies the env var pollution root cause and fixes that were discovered through the PR #500 → #502 → direct-exec arc. Includes:

- Session not found: OPENCODE_SERVER_* env var pollution
- Empty response: missing --dangerously-skip-permissions or invalid model
- Timeout: CT_FILTER_TIMEOUT env var
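
The fix for the "Session not found" case amounts to filtering the child process environment. The sketch below shows the idea under assumptions: `dropEnvPrefix` is a hypothetical helper (the PR's own `unsetEnvPrefix` may have a different signature), and the command is only constructed, never run.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// dropEnvPrefix returns a copy of env without any "KEY=value" entry whose
// KEY starts with prefix. Hypothetical helper for illustration.
func dropEnvPrefix(env []string, prefix string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		key, _, _ := strings.Cut(kv, "=")
		if strings.HasPrefix(key, prefix) {
			continue
		}
		out = append(out, kv)
	}
	return out
}

func main() {
	env := dropEnvPrefix(os.Environ(), "OPENCODE_SERVER_")

	// Build, but do not execute, the subprocess. Setting cmd.Env replaces
	// the inherited environment, so the child sees no OPENCODE_SERVER_* vars.
	cmd := exec.Command("opencode", "run", "--format", "json")
	cmd.Env = env
	fmt.Printf("would run %v with %d env entries\n", cmd.Args, len(cmd.Env))
}
```

Matching on the key rather than the whole entry avoids dropping a variable whose value happens to contain the prefix.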

…taractae

Post-incident fix after LLMem Go rewrite shipped with vec0 migration bug, missing dependency import, test timeouts, and CLI flag incompat.

Architect INSTRUCTIONS.md:

- Add Migration Surface Analysis section (mandatory for rewrites)
- Require enumeration of every migration scenario (existing DBs, config, CLI invocations, plugins/hooks)
- Require specific verification commands for each scenario
- Require dependency verification: brief dependency → go.mod/package.json
- Require breaking change matrix with callers

Reviewer INSTRUCTIONS.md:

- Add migration compatibility check to migration_safety rubric
- Add dependency verification check (brief mentions X, is X imported?)
- Add test timeout discipline check (30s max for I/O tests)

QA INSTRUCTIONS.md:

- Add migration compatibility test requirement for rewrites
- Add dependency verification check
- Add test timeout discipline check
- Add deploy-and-verify gate for rewrites: MUST run core commands against existing data, not just test suite

## Summary

Building on PR #503 (mechanical migration/dependency gates), this adds a **droplet reality check** to every cataractae in the pipeline.

The core problem: every cataractae treated the architect's brief as the source of truth. But the brief is a *translation* of the droplet, and translations lose information, especially implicit requirements like "existing data must work" or "the plugin interface must stay the same."

This PR adds a "Droplet Reality Check" section to all six cataractae:

- **Architect**: Extract implicit requirements from the droplet before writing the brief. If the droplet says "rewrite," include migration compatibility even if the droplet doesn't spell it out.
- **Implementer**: Compare the brief against the droplet before coding. If the brief omits something the droplet asked for, file an issue BEFORE implementing.
- **Reviewer**: Flag gaps between the droplet and implementation even if the brief was followed correctly. The brief can be wrong.
- **QA**: Test against what the droplet asked for, not just what tests cover. Verify CLI compatibility, data compatibility, dependency imports.
- **Security**: Check security surface from migration paths in rewrites.
- **Docs**: Document breaking changes and migration paths for rewrites.

The key insight: "follows the brief" and "delivers what was requested" are different metrics. The pipeline was only measuring the first one. Now every cataractae measures both.

Combined with #503's mechanical gates (migration compatibility test, dependency cross-reference, deploy-and-verify, 30s test timeout), this should prevent the class of bugs where the pipeline faithfully implements the wrong requirements.

---------

Co-authored-by: Lobsterdog Contributors <noreply@lobsterdog.dev>

Lobsterdog Contributors added 6 commits May 9, 2026 22:27

docs: update skill docs for direct-exec filter, remove tmux wrapper p…

0b1bb57


MichielDean merged commit 19605b1 into main May 11, 2026
2 of 3 checks passed

MichielDean deleted the fix/cataractae-migration-gates branch May 11, 2026 04:05

MichielDean mentioned this pull request May 11, 2026

fix: add droplet reality check to all cataractae #504

Merged

MichielDean mentioned this pull request May 11, 2026

fix: replace ct filter with opencode run --format json (direct-exec) #502

Closed

MichielDean merged 6 commits into main from fix/cataractae-migration-gates
