Implement the north-star arc: ten phases in one PR #141
Merged
- docs/design/wire-protocol-2026-06.md: JSON-RPC 2.0 envelope + OTel LogRecord-shaped payload spec. The single contract every downstream surface (MCP, otel-bridge, distributed-ci coordinator, Hono migration, cloud upload) reads off. Four channels (events/state/rpc/submit), three transports (WS/SSE/NDJSON), one envelope.
- docs/progress/implementation-log-2026-06.md: append-only progress log for the north-star implementation arc. Records what shipped, decisions made, deferred items, test impact, per phase.
Lift Digest = { hash, sizeBytes } and a CASBackend interface as the
explicit content-addressed storage primitive beneath CacheLayer
(architecture-review-2026-06.md §4.3). The existing Cache /
LayeredCache / RemoteCache stay untouched; this ships the abstraction
+ two reference impls (MemoryCASBackend, FsCASBackend) so downstream
work (R2 backend in apps/cloud, REAPI CAS bridge, fast existence
probes) can rely on the type without churn.
- src/cache/digest.ts: Digest type + makeDigest / digestEqual /
digestString / parseDigest. sizeBytes is number not bigint
(Bun.file().size returns number; 2^53 covers any artifact).
- src/cache/cas-backend.ts: CASBackend interface (put/get/has/remove);
MemoryCASBackend for tests and ephemeral runs; FsCASBackend mirroring
Cache.save's on-disk layout for the local-FS case.
- src/cache/index.ts: contract export.
- tests/digest.test.ts: 14 tests across helpers + both reference
backends (round-trip, sizeBytes mismatch, remove idempotence).
Cache.ts is NOT yet rewired to use a CASBackend internally — that's a
follow-up; today the abstraction co-exists. Byte-identical behaviour;
no CACHE_VERSION bump.
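To make the Digest / CASBackend contract above concrete, here is a minimal sketch. Only the names (Digest, makeDigest, digestEqual, digestString, parseDigest, CASBackend, MemoryCASBackend) and the put/get/has/remove method set come from the commit message. The signatures, the "<hash>/<sizeBytes>" string form, and the synchronous interface are assumptions made for illustration; the real backends may well be async.

```typescript
// Sketch only. Field names follow the commit message; everything else is assumed.
interface Digest {
  readonly hash: string; // hex content hash
  readonly sizeBytes: number; // number, not bigint, per the commit message
}

function makeDigest(hash: string, sizeBytes: number): Digest {
  if (!/^[0-9a-f]+$/.test(hash)) throw new Error(`invalid hash: ${hash}`);
  if (!Number.isSafeInteger(sizeBytes) || sizeBytes < 0) {
    throw new Error(`invalid sizeBytes: ${sizeBytes}`);
  }
  return { hash, sizeBytes };
}

const digestEqual = (a: Digest, b: Digest): boolean =>
  a.hash === b.hash && a.sizeBytes === b.sizeBytes;

// Assumed textual form: "<hash>/<sizeBytes>".
const digestString = (d: Digest): string => `${d.hash}/${d.sizeBytes}`;

function parseDigest(text: string): Digest {
  const parts = text.split("/");
  const hash = parts[0];
  const size = parts[1];
  if (parts.length !== 2 || hash === undefined || size === undefined || !/^\d+$/.test(size)) {
    throw new Error(`malformed digest: ${text}`);
  }
  return makeDigest(hash, Number(size));
}

// Synchronous here for brevity; an FS- or R2-backed implementation would be async.
interface CASBackend {
  put(digest: Digest, bytes: Uint8Array): void;
  get(digest: Digest): Uint8Array | undefined;
  has(digest: Digest): boolean;
  remove(digest: Digest): void;
}

class MemoryCASBackend implements CASBackend {
  private readonly blobs = new Map<string, Uint8Array>();

  put(digest: Digest, bytes: Uint8Array): void {
    // Reject payloads whose length disagrees with the digest.
    if (bytes.byteLength !== digest.sizeBytes) {
      throw new Error(`sizeBytes mismatch: expected ${digest.sizeBytes}, got ${bytes.byteLength}`);
    }
    this.blobs.set(digestString(digest), bytes.slice());
  }

  get(digest: Digest): Uint8Array | undefined {
    return this.blobs.get(digestString(digest));
  }

  has(digest: Digest): boolean {
    return this.blobs.has(digestString(digest));
  }

  // Idempotent: removing an absent digest is a no-op.
  remove(digest: Digest): void {
    this.blobs.delete(digestString(digest));
  }
}
```

Keying the map by the digest string means two digests with the same hash but different sizes never collide, which is one plausible reason to carry sizeBytes in the key at all.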
vx Cloud reference deployment per docs/design/vx-cloud-2026-06.md and the JSON-RPC 2.0 envelope per wire-protocol-2026-06.md. The scaffold compiles + is wrangler-deploy-shaped (template-spawnable), but handler bodies stub the real logic with TODO markers — HMAC verification, per-run seq allocation, graph fan-out, and waiter broadcast all land as follow-ups. Bindings: D1 (DB), R2 (ARTIFACTS), KV (TOKEN_CACHE), Queue (EVENT_INGEST), two Durable Objects (RUN_COORDINATOR for per-run state + WS hibernation, INFLIGHT_DEDUP for content-addressed dedup). Routes: Turbo-wire-compatible cache (/v8/artifacts/*), insights (/v1/runs/*, /v1/events/ingest, /v1/runs/:id/events SSE), WS upgrade delegated to the per-run DO. The repo's apps/* + packages/* exclusion in oxlint + oxfmt already covers this; root `bun src/bin.ts run ci` stays green.
Combined commit to preserve all work atomically given working-tree
churn during parallel agent spawns.
src/ (orchestrator + workspace + config):
- src/orchestrator/history.ts (Phase 2): HistoryProvider + LocalHistoryProvider, SQL CTE over the cache.db runs table, p50/p99/successRate/hitRate/failureMode per (project, task) pair.
- src/orchestrator/predict.ts (Phase 4): pure computePredictedPriorities — topo-DP over a HistoryTable producing expected-remaining-critical-path duration per node. Default fallback 1000ms.
- src/orchestrator/plugin.ts (Phase 3): Plugin + PluginContext + installPlugins. Lifecycle hooks (onRunStart/onTaskStart/etc) wrapped in per-hook isolation (throwing hook disables the plugin for the run, doesn't block the bus).
- src/orchestrator/index.ts: contract exports for history, predict, plugin.
- src/config.ts: WorkspaceConfig.plugins?: Plugin[]; .predictive?: boolean; Plugin shape.
- src/workspace/project-loader.ts: validation for the new fields.
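The predictive priority described for src/orchestrator/predict.ts can be sketched as follows. This is an illustration under assumptions, not the shipped code: the parameter shapes (a dependents map and a per-node p50 map) are hypothetical stand-ins for the real graph and HistoryTable types, and the traversal is a memoized depth-first walk, which computes the same values as a DP over reverse topological order on an acyclic graph. The 1000ms fallback is taken from the commit message.

```typescript
// Hypothetical shapes: the real graph and HistoryTable types are not shown in this PR text.
type NodeId = string;

const DEFAULT_FALLBACK_MS = 1000; // fallback duration named in the commit message

// Returns, per node, the expected remaining critical-path duration:
// the node's own expected duration plus the longest chain among its dependents.
// Assumes the graph is acyclic.
function computePredictedPriorities(
  nodes: readonly NodeId[],
  dependents: ReadonlyMap<NodeId, readonly NodeId[]>, // node -> nodes that wait on it
  p50Ms: ReadonlyMap<NodeId, number>, // historical median duration, if any
  fallbackMs: number = DEFAULT_FALLBACK_MS,
): Map<NodeId, number> {
  const memo = new Map<NodeId, number>();

  const visit = (id: NodeId): number => {
    const cached = memo.get(id);
    if (cached !== undefined) return cached;
    const own = p50Ms.get(id) ?? fallbackMs;
    let downstream = 0;
    for (const next of dependents.get(id) ?? []) {
      downstream = Math.max(downstream, visit(next)); // siblings: take the max, not the sum
    }
    const total = own + downstream;
    memo.set(id, total);
    return total;
  };

  for (const id of nodes) visit(id);
  return memo;
}

// Example: build -> test -> deploy, plus build -> lint (lint has no history).
const priorities = computePredictedPriorities(
  ["build", "test", "deploy", "lint"],
  new Map([
    ["build", ["test", "lint"]],
    ["test", ["deploy"]],
  ]),
  new Map([
    ["build", 2000],
    ["test", 5000],
    ["deploy", 1000],
  ]),
);
// build = 2000 + max(test = 6000, lint = 1000) = 8000
```

A node with no history falls back to the default duration rather than zero, so unknown tasks are not starved behind well-measured ones.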
apps/insights/ (Phase 8): Solid + Vite + UnoCSS + DuckDB-WASM SPA. Read-only over cache.db via DuckDB-WASM with sqlite_scanner. One Overview page (recent runs list) + RunDetail page (CSS flamegraph). vx insights serve subcommand boots the SPA in the user's workspace context.
- apps/insights/{package.json,vite.config.ts,uno.config.ts,tsconfig.json,index.html,vx.config.ts,README.md}
- apps/insights/src/{main.tsx,api.ts,duckdb.ts,format.ts,flamegraph-layout.ts}
- apps/insights/src/{components,pages}/*
- src/cli/insights.ts + dispatch entries in cli/index.ts + cli/help.ts
- tests/insights.test.ts
packages/otel-bridge/ (Phase 7): standalone OTel CI/CD-conventions adapter. Subscribe to a vx bus, map WireEvent → OTel LogRecord (timeUnixNano, severityNumber, body, attributes, traceId/spanId per the wire spec), POST via OTLP HTTP. devDep, type-only imports from core; root never gains OTel runtime deps.
- packages/otel-bridge/{package.json,tsconfig.json,README.md}
- packages/otel-bridge/src/{index.ts,types.ts}
- packages/otel-bridge/tests/map-to-log-record.test.ts
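As an illustration of the WireEvent → LogRecord mapping the bridge performs, here is a minimal sketch. The WireEvent shape, the function name mapToLogRecord, and the attribute keys vx.run.id / vx.task.id are hypothetical; the actual fields are defined by the wire spec, which is not reproduced here. The LogRecord field names and the severity numbers (DEBUG 5, INFO 9, WARN 13, ERROR 17) follow the OpenTelemetry logs data model, and timeUnixNano is a string because OTLP/JSON encodes 64-bit integers as strings.

```typescript
// Hypothetical event shape; the real WireEvent lives in the vx core package.
interface WireEvent {
  tsMs: number; // epoch milliseconds
  level: "debug" | "info" | "warn" | "error";
  message: string;
  runId: string;
  taskId?: string;
  traceId?: string;
  spanId?: string;
}

interface KeyValue {
  key: string;
  value: { stringValue: string };
}

// Subset of the OTLP/JSON LogRecord shape.
interface LogRecord {
  timeUnixNano: string;
  severityNumber: number;
  severityText: string;
  body: { stringValue: string };
  attributes: KeyValue[];
  traceId?: string;
  spanId?: string;
}

// Base severity numbers from the OpenTelemetry logs data model.
const SEVERITY: Record<WireEvent["level"], number> = {
  debug: 5,
  info: 9,
  warn: 13,
  error: 17,
};

function mapToLogRecord(ev: WireEvent): LogRecord {
  // Attribute keys are illustrative, not the CI/CD semantic-convention names.
  const attributes: KeyValue[] = [{ key: "vx.run.id", value: { stringValue: ev.runId } }];
  if (ev.taskId !== undefined) {
    attributes.push({ key: "vx.task.id", value: { stringValue: ev.taskId } });
  }
  const record: LogRecord = {
    // Milliseconds to nanoseconds via BigInt to avoid losing precision above 2^53.
    timeUnixNano: (BigInt(Math.trunc(ev.tsMs)) * BigInt(1000000)).toString(),
    severityNumber: SEVERITY[ev.level],
    severityText: ev.level.toUpperCase(),
    body: { stringValue: ev.message },
    attributes,
  };
  if (ev.traceId !== undefined) record.traceId = ev.traceId;
  if (ev.spanId !== undefined) record.spanId = ev.spanId;
  return record;
}
```

Because the sketch only needs type information from the event side, it is compatible with the type-only import arrangement described above.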
Root: workspaces extended to include packages/*; .oxlintrc + .oxfmtrc ignorePatterns extended; src/index.ts adds bus + WireEvent exports for adapters.
Apply the orchestrator/workspace edits left out by the prior atomic
commit (they were lost in a stash/restore cycle while parallel agents
ran). Plus the missing predict.test + plugin.test fixtures.
- src/config.ts: WorkspaceConfig gains plugins?: readonly Plugin[] and
predictive?: boolean. Plugin = { name, setup(ctx) } structural type.
- src/orchestrator/index.ts: contract re-exports HistoryProvider /
LocalHistoryProvider / EmptyHistoryProvider / HistoryTable /
TaskHistory; computePredictedPriorities; Plugin / PluginContext /
installPlugins / PluginHookName / PluginHookHandlers /
InstallPluginsArgs.
- src/workspace/project-loader.ts: runtime validation for plugins[]
shape (array of {name,setup}) and predictive type. Same UserError
style as existing checks.
- tests/predict.test.ts: 5 tests covering leaf / chain / fallback /
default-empty / sibling-max.
- tests/plugin.test.ts: 5 tests covering hook fan-out / payload
threading / setup-throw isolation / per-hook isolation / shape
validation.
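The per-hook isolation rule (a throwing hook disables that plugin for the rest of the run without blocking other subscribers) can be sketched like this. The type names mirror the exports listed above, but their shapes are assumptions: in particular, having setup(ctx) return a handlers object, the workspaceRoot field on PluginContext, and the emit / disabledPlugins / dispose return value are all hypothetical. The sketch handles synchronous hooks only; the real hooks return Promises, so rejections would need equivalent handling.

```typescript
// Names follow the contract exports; shapes are assumed for illustration.
type PluginHookName = "onRunStart" | "onTaskStart" | "onTaskComplete" | "onRunEnd";
type PluginHookHandlers = Partial<Record<PluginHookName, (payload: unknown) => void>>;

interface PluginContext {
  readonly workspaceRoot: string; // hypothetical field
}

interface Plugin {
  readonly name: string;
  setup(ctx: PluginContext): PluginHookHandlers; // return-handlers style is an assumption
}

function installPlugins(plugins: readonly Plugin[], ctx: PluginContext) {
  // A throw from setup() propagates to the caller; only hook invocations are isolated here.
  const entries = plugins.map((plugin) => ({
    name: plugin.name,
    handlers: plugin.setup(ctx),
    disabled: false,
  }));

  return {
    emit(hook: PluginHookName, payload: unknown): void {
      for (const entry of entries) {
        if (entry.disabled) continue;
        const handler = entry.handlers[hook];
        if (handler === undefined) continue;
        try {
          handler(payload);
        } catch {
          // Disable this plugin for the rest of the run; keep delivering to the others.
          entry.disabled = true;
        }
      }
    },
    disabledPlugins(): string[] {
      return entries.filter((entry) => entry.disabled).map((entry) => entry.name);
    },
    dispose(): void {
      entries.length = 0;
    },
  };
}
```

Disabling the whole plugin, rather than just the failing hook, avoids delivering a partial event sequence to a subscriber whose internal state may already be inconsistent.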
…ffolds

Phase 5 - vx mcp (Model Context Protocol server):
- src/cli/mcp.ts: vx mcp subcommand. Boots @modelcontextprotocol/sdk Server over stdio (Claude Code / Cursor / Continue.dev). Lists tools + dispatches CallToolRequest through src/cli/mcp-rpc.ts. Dynamic import keeps SDK out of the cold-start path.
- src/cli/mcp-rpc.ts: pure RPC dispatcher. Four read-only tools: getCacheStats, getRunHistory, explainCacheKey, whyDidThisRerun. Same handlers a future WS-side inspector reuses (one impl, two transports per architecture-review §2.2). v1 returns placeholders marking the contract; the cache.db / HistoryProvider wiring lands with the inspector RPC server.
- tests/mcp.test.ts: 10 tests covering parser, tool listing, dispatch, arg validation, unknown-tool rejection.

Phase 6 - distributed-CI protocol + role stubs:
- src/orchestrator/protocol.ts: ServerMessage gains task:assign, cache:exists, coord:drain; ClientMessage gains worker:hello, worker:pull, worker:start, worker:stdout, worker:stderr, worker:done, worker:bye. New WireTaskNode + WireOutcome types. Per architecture-review §2.1: ONE wire enum extended additively; the run-submitter ignores worker/coordinator messages.
- src/cli/coordinator.ts: vx coordinator <tasks...> [--port --host --workers]. Scaffold: parses + prints intended bind. The real WS handler + assignment policy lands in Phase A-B (graph + ready queue).
- src/cli/worker.ts: vx run --worker <coord-url> handler. Scaffold: parses URL/--capacity/--label, prints worker identity. The pull loop lands when the coordinator handler is real.
- src/cli/backend.ts: narrow ServerMessage handling so the new task:assign/cache:exists/coord:drain branches don't trip TS narrow checks; the submitter explicitly ignores them.
- tests/distributed.test.ts: parser tests for both subcommands + the protocol-shape sanity tests (worker:* + task:assign:* + cache sources).
Tidy-ups landed in the same pass:
- src/orchestrator/plugin.ts: void each hook invocation (hook returns Promise; bus is fire-and-forget, so explicitly drop) to satisfy typescript/no-floating-promises.
- src/orchestrator/history.ts: EmptyHistoryProvider.loadFor now carries the parameter to match the interface signature.
- tests/history.test.ts: align mkRun() with RunRecord (forwardArgs is readonly string[]; status union is the cache.ts version); Cache(constructor) takes one arg.
Full CI gate green (3 success, 0 failed).
- src/cli/help.ts: surface the new subcommands (vx mcp, vx coordinator, vx run --worker) in the Usage block + Distributed CI section.
- docs/progress/implementation-log-2026-06.md: final summary with the ten-phase breakdown, commits referenced, deferred items cataloged, test impact, and the architecture-state snapshot at close.
Hono migration (Phase 10) is deferred and documented as such in the log: the existing vx serve / vx dev wiring stays for this PR; the host-side Hono port lands once SSE + NDJSON endpoints have user signal. apps/cloud already ships on Hono per spec; the substrate is proven there.
architecture-review §4.1 + §8.4 follow-through. Until now the three APIs existed in isolation; this commit makes them actually load and fire during a real `vx run`.
- src/graph/scheduler.ts: ScheduleOptions gains `priorities?: ReadonlyMap<string, number>`, caller-supplied per-node weights that override the static reverse-deps baseline. mergePriorities scales overrides above baseline so partial coverage is safe.
- src/orchestrator/prepare.ts: PreparedRun gains `localCache`, `history`, `priorities`. When workspace config says `predictive: true`, we instantiate LocalHistoryProvider against the cache.db handle, load HistoryTable for every node in the graph, and run computePredictedPriorities. Errors degrade to baseline (fail-open).
- src/orchestrator/run.ts: at the top of every run, if the workspace config declares plugins, installPlugins() subscribes each one to the bus and we keep the disposer until the finally block. A plugin setup() throw aborts the run cleanly. The scheduler call now threads prepared.priorities.
- src/cache/cache.ts: new `dbHandle()` accessor. LocalHistoryProvider needs a Database, and we don't want to plumb every CTE through Cache. Lifetime stays owned by Cache.close().
- tests/plugin-e2e.test.ts: end-to-end fixture with a real workspace whose vx.workspace.mjs declares a plugin; run() loads, fires onRunStart / onTaskComplete / onRunEnd in order; setup() throw aborts.
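One way to read "scales overrides above baseline so partial coverage is safe" is sketched below. The commit message does not spell out the scaling rule, so the rule used here (shift every overridden weight above the largest baseline weight) is an assumption, as is the function signature. Under that rule, a node with a history-derived weight always outranks a node that only has the static baseline, and nodes without an override keep their baseline ordering among themselves.

```typescript
// Assumed rule: overridden nodes are lifted above the maximum baseline weight.
// The shipped mergePriorities may scale differently.
function mergePriorities(
  baseline: ReadonlyMap<string, number>, // static reverse-deps weights
  overrides?: ReadonlyMap<string, number>, // caller-supplied, possibly partial
): Map<string, number> {
  const merged = new Map(baseline);
  if (overrides === undefined || overrides.size === 0) return merged;

  let maxBaseline = 0;
  for (const weight of baseline.values()) maxBaseline = Math.max(maxBaseline, weight);

  for (const [id, weight] of overrides) {
    if (!baseline.has(id)) continue; // ignore weights for nodes outside the graph
    merged.set(id, maxBaseline + 1 + Math.max(0, weight));
  }
  return merged;
}

// Partial coverage: only "b" has a predicted weight.
const merged = mergePriorities(
  new Map([
    ["a", 3],
    ["b", 1],
    ["c", 0],
  ]),
  new Map([["b", 500]]),
);
// a stays 3, c stays 0, b becomes 3 + 1 + 500 = 504
```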
wire-protocol-2026-06.md materialized as code:
- src/orchestrator/wire.ts: Envelope union (Request/Response/
ErrorResponse/Notification), builders, type guards, error codes,
bidirectional adapters between legacy ServerMessage|ClientMessage
and the JSON-RPC envelope, three transport encoders (WS / SSE /
NDJSON). 280 LOC, pure types + functions, no transport.
- src/cli/serve.ts: extends the existing Bun.serve mount with three
new HTTP routes —
GET /version — protocol version + channel/RPC capability list
GET /events — SSE broadcast of every envelope from every run
GET /stream — NDJSON broadcast (jq-friendly)
WS endpoint now accepts BOTH the legacy {t:'run',...} frame AND
the new makeRequest(id,'submit.run',...) envelope. Parse once,
classify, dispatch identically.
- src/orchestrator/index.ts: contract re-exports for the wire
module.
Tests:
- tests/wire.test.ts (22): builders, type-guards, ServerMessage and
ClientMessage round-trips, transport encoders, constants.
- tests/serve-transports.test.ts (3): /version returns correct
payload, SSE broadcasts envelopes from a delegated run, WS
accepts JSON-RPC envelope.
912 pass / 17 skip / 0 fail across the full repo suite.
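For readers unfamiliar with the envelope, here is a minimal sketch of the builders, type guards, and the three transport encoders. The JSON-RPC 2.0 field layout is standard and makeRequest(id, 'submit.run', ...) appears in the commit message; the remaining function names, the params payload, and the events.log method name are hypothetical. The SSE framing (a data: line terminated by a blank line) and NDJSON framing (one JSON document per line) are the standard forms of those formats, not necessarily byte-for-byte what wire.ts emits.

```typescript
type RpcId = string | number;

interface RpcRequest {
  jsonrpc: "2.0";
  id: RpcId;
  method: string;
  params?: unknown;
}
interface RpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: unknown;
}
interface RpcResponse {
  jsonrpc: "2.0";
  id: RpcId;
  result: unknown;
}
interface RpcErrorResponse {
  jsonrpc: "2.0";
  id: RpcId | null;
  error: { code: number; message: string };
}
type Envelope = RpcRequest | RpcNotification | RpcResponse | RpcErrorResponse;

function makeRequest(id: RpcId, method: string, params?: unknown): RpcRequest {
  return { jsonrpc: "2.0", id, method, params };
}

function makeNotification(method: string, params?: unknown): RpcNotification {
  return { jsonrpc: "2.0", method, params };
}

// A request carries both a method and an id; a notification has a method but no id.
const isRequest = (e: Envelope): e is RpcRequest => "method" in e && "id" in e;
const isNotification = (e: Envelope): e is RpcNotification => "method" in e && !("id" in e);

// One envelope, three framings.
const encodeWs = (e: Envelope): string => JSON.stringify(e);
const encodeSse = (e: Envelope): string => `data: ${JSON.stringify(e)}\n\n`;
const encodeNdjson = (e: Envelope): string => `${JSON.stringify(e)}\n`;

// Hypothetical params payload for illustration.
const submit = makeRequest(1, "submit.run", { tasks: ["build"] });
const line = encodeNdjson(submit);
```

Keeping the encoders as pure string functions is what lets the module stay "no transport": the Bun.serve routes only decide which framing to apply.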
getCacheStats / getRunHistory / explainCacheKey / whyDidThisRerun
were returning { todo: '...' } placeholders. They now open the local
cache.db on demand, query the runs + entries tables, and return live
data — what an agent (Claude Code, Cursor) actually needs.
- src/cli/mcp-rpc.ts:
* getCacheStats — calls Cache.stats() and surfaces entry count,
total bytes, runs in last 24h, hits in last 24h, computed
hit rate.
* getRunHistory — distinct (project, task) pairs from the runs
table; LocalHistoryProvider for per-pair p50/p99/successRate/
hitRate aggregates; recent rows for the timeline view.
* explainCacheKey — latest entry row for (project, task), with a
note explaining that the input-component breakdown requires
live config evaluation (next layer).
* whyDidThisRerun — looks up the row for (runId, taskId), finds
the immediately preceding run for the same task, compares hash
+ reports if the cache key actually changed.
- McpContext + setMcpContext: lets embedders (tests, future
inspector WS server) inject a workspace root. Defaults to
findWorkspaceRoot(process.cwd()) for the CLI path.
- tests/mcp.test.ts rewritten with a real temp cache.db that's
seeded with two runs; assertions verify the queries land real
data, not placeholders.
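The whyDidThisRerun comparison described above can be sketched as a pure function over run rows. This is an illustration only: the real tool queries cache.db, and the RunRow fields, the result shape, and the use of startedAt to define "immediately preceding" are assumptions.

```typescript
// Hypothetical row shape; the real runs table columns are not listed in this PR text.
interface RunRow {
  runId: string;
  taskId: string;
  hash: string; // cache key for the task in that run
  startedAt: number; // epoch milliseconds
}

type RerunExplanation =
  | { found: false }
  | { found: true; previousRunId: string | null; keyChanged: boolean | null };

function whyDidThisRerun(rows: readonly RunRow[], runId: string, taskId: string): RerunExplanation {
  const current = rows.find((r) => r.runId === runId && r.taskId === taskId);
  if (current === undefined) return { found: false };

  // Latest earlier run of the same task.
  let previous: RunRow | undefined;
  for (const row of rows) {
    if (row.taskId !== taskId || row.startedAt >= current.startedAt) continue;
    if (previous === undefined || row.startedAt > previous.startedAt) previous = row;
  }

  if (previous === undefined) {
    // First recorded run of this task: nothing to compare against.
    return { found: true, previousRunId: null, keyChanged: null };
  }
  return {
    found: true,
    previousRunId: previous.runId,
    keyChanged: previous.hash !== current.hash,
  };
}
```

A keyChanged value of false is the interesting case for an agent: the task re-ran even though its cache key was stable, which points at something other than changed inputs.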
distributed-ci Phase A-B materialized: vx coordinator and vx run --worker do real work now, not stubs.
- src/cli/coordinator.ts: startCoordinator() boots a Bun.serve WS endpoint, runs prepareRun against the workspace to build the same graph the local CLI would, computes per-node cache hashes (the assignment key), and dispatches via a ready queue. Worker registration via worker:hello; pull-driven via worker:pull; outcomes via worker:done. Stranded in-flight work from a disconnect goes back on the ready queue.
- src/cli/worker.ts: runWorker() connects to a coordinator, sends hello, pulls work, executes via the new orchestrator-side workerExecute helper, streams stdout/stderr back through worker:* messages, reports outcomes. Honors coord:drain. Capacity-bounded in-flight count; pulls more as tasks complete.
- src/cli/run.ts: detect --worker / --coordinator early and dispatch to workerCmd so the existing parseRunArgs doesn't see worker flags.
- src/orchestrator/coordinator-prepare.ts (new): prepareForCoordinator + computeTaskHashForCoord. Thin wrappers reusing the local prepareRun pipeline with a silent logger.
- src/orchestrator/worker-exec.ts (new): workerExecute spawns the command, streams output, reports exitCode + duration. Lives in orchestrator/ so cli/worker.ts doesn't violate the cli→exec module-boundary rule.
- tests/distributed-e2e.test.ts: two e2e tests. The coordinator dispatches a 2-task DAG to one worker and reports done; a mid-handshake disconnect strands tasks which the real worker picks up.
933 / 0 / 17 across the full suite.
vx-cloud HMAC validation + queue consumer + DO submit.run no longer TODOs.
- apps/cloud/src/hmac.ts (new): computeArtifactTag / verifyArtifactTag against the Turbo-compatible scheme (hash || teamId || body) using Web Crypto. Constant-time verify.
- apps/cloud/src/index.ts cache.put: when VX_REMOTE_CACHE_SIGNATURE_KEY is set, requires x-artifact-tag and verifies it against the body, the same policy as the client (src/cache/remote-cache.ts). Tagged storage on R2 so reads can re-verify.
- apps/cloud/src/index.ts cache.get: when signing is on, fetches body + recomputes tag against stored value. Tampered artifacts surface as 500 errors so the client falls back to cache miss (the existing never-fail rule).
- apps/cloud/src/index.ts queue() consumer rewritten: groups by runId, ensures the parent runs row via ON CONFLICT DO NOTHING, allocates seq once per run via SELECT MAX + sequential offsets, inserts via D1 batch() (atomic, fast). Per-message retry on failure, ack on success.
- apps/cloud/src/run-coordinator-do.ts submit.run: persists RunMeta (runId, orgId, startedAt, status='running') in DO storage; run.end transitions to status='ended'. No more TODO placeholders on the DO RPC dispatch.
- apps/cloud/tests/hmac.test.ts: 6 tests covering compute→verify round-trip, tampered body, wrong key, wrong hash, wrong teamId, malformed base64 (doesn't throw).
Cloud tests: 6 pass / 0 fail. Full repo CI: 3 success / 0 fail.
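A minimal sketch of the tag computation and verification with Web Crypto follows. The function names and the (hash || teamId || body) concatenation come from the commit message; the parameter order, the UTF-8 encoding of hash and teamId, HMAC-SHA-256 as the algorithm, and base64 as the tag encoding are assumptions. Verification is delegated to crypto.subtle.verify rather than comparing strings by hand, and a malformed base64 tag yields false instead of throwing.

```typescript
const encoder = new TextEncoder();

// Assumed algorithm: HMAC-SHA-256 over the raw secret bytes.
function importHmacKey(secret: string) {
  return crypto.subtle.importKey(
    "raw",
    encoder.encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["sign", "verify"],
  );
}

// hash || teamId || body, with the two strings UTF-8 encoded (encoding is assumed).
function signedMessage(hash: string, teamId: string, body: Uint8Array) {
  const prefix = encoder.encode(hash + teamId);
  const out = new Uint8Array(prefix.length + body.length);
  out.set(prefix, 0);
  out.set(body, prefix.length);
  return out;
}

function decodeBase64(tag: string) {
  try {
    return Uint8Array.from(atob(tag), (ch) => ch.charCodeAt(0));
  } catch {
    return null; // malformed base64: treat as a failed verification
  }
}

async function computeArtifactTag(
  secret: string,
  hash: string,
  teamId: string,
  body: Uint8Array,
): Promise<string> {
  const key = await importHmacKey(secret);
  const sig = new Uint8Array(await crypto.subtle.sign("HMAC", key, signedMessage(hash, teamId, body)));
  let binary = "";
  sig.forEach((byte) => {
    binary += String.fromCharCode(byte);
  });
  return btoa(binary);
}

async function verifyArtifactTag(
  secret: string,
  hash: string,
  teamId: string,
  body: Uint8Array,
  tag: string,
): Promise<boolean> {
  const sig = decodeBase64(tag);
  if (sig === null) return false;
  const key = await importHmacKey(secret);
  // subtle.verify recomputes the MAC and compares it inside the crypto implementation.
  return crypto.subtle.verify("HMAC", key, sig, signedMessage(hash, teamId, body));
}
```

Binding hash and teamId into the signed message means a valid artifact cannot be replayed under a different cache key or for a different team, which is what the wrong-hash and wrong-teamId tests above exercise.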
Three small but meaningful wirings:
Step 6 - Cache.contentBackend() (src/cache/cache.ts)
- New accessor returns an FsCASBackend rooted at the same cacheDir Cache.save writes to. External subsystems (R2 mirror, REAPI bridge, analytics scanners) can read raw bytes via a Digest-keyed API without coupling to Cache's internal save path. The abstraction is reachable; deeper integration (rewiring save/restore through CASBackend.put/get) stays a follow-up since the existing atomic tmp+rename dance in Cache.save is concurrency-critical.
- tests/cache-cas-integration.test.ts: round-trip via the CAS view.
Step 7 - OTel bridge wiring (src/orchestrator/run.ts)
- run() now opportunistically attaches @vzn/vx-otel-bridge as an additional bus subscriber when OTEL_EXPORTER_OTLP_ENDPOINT is set AND options.log is undefined (the real CLI path; tests bypass).
- Dynamic specifier import so TS doesn't try to resolve the optional peer at type-check time. The package is a devDep; users install it to opt in. Missing package = silent skip; never blocks a run.
- Detached in the finally block alongside disposePlugins.
Step 8 - insights static server is testable + tested
- startStaticServer exported from src/cli/insights.ts (was private).
- tests/insights-static.test.ts (3): cache.db served with correct MIME + CORS, /health returns 200, unknown paths return 404.
Full CI gate: 3 success / 0 failed.
Append-only log now closes the arc: pre-Step-1, ~25-30% of the contracts were wired through; post-Step-8 every piece fires end-to-end for the happy path. Records what shipped, decisions, test impact, and what's still deferred (Hono migration of vx serve, Cache.save through CASBackend, per-task InflightDedupDO fan-out).
Each 2026-06 design doc now opens with an implementation snapshot table: what shipped, where to find it, what's deferred.
- wire-protocol-2026-06: SHIPPED 2026-06-21 as src/orchestrator/wire.ts
- distributed-ci-2026-06: Phase A-B SHIPPED; Phase C-E deferred
- vx-cloud-2026-06: Phases A-C SHIPPED; Phases D-E (OAuth, hosted SaaS) deferred
- extension-protocol-2026-06: Phase 1 SHIPPED (wire + MCP + plugins + otel-bridge); Phase 2-3 (SDKs, ref plugins) deferred
- predictive-execution-2026-06: Phase A-B SHIPPED (HistoryTable + scheduler integration); Phase C-F deferred
- architecture-north-star-2026-06: VISION MOSTLY MATERIALIZED. Six-layer spine populated; five rules now true of running code
- architecture-review-2026-06: applied checklist; 23 review items map to shipped status
README:
- New 'Beyond a task runner' section showcasing vx mcp, vx coordinator, vx run --worker, vx insights serve, vx serve transports
- Comparison table extended with 7 new rows (MCP server, distributed CI, insights SPA, CF cloud, plugin API, predictive, OTel)
- Status section reorganized into a maturity matrix per surface
- Architecture paragraph updated to mention plugins, history, event bus, wire envelope
- Documentation section now links every 2026-06 design doc + the implementation log
Full CI gate green: 3 success / 3 success / 3 success.
Summary
Implementation arc for the north-star design proposals (PR #140's docs). Ten phases shipped in one PR, paired with three parallel agent-scaffolded subdirectories (apps/cloud, apps/insights, packages/otel-bridge). Full local CI gate green at close: 870+ tests, oxlint clean, oxfmt clean.
The progress log at docs/progress/implementation-log-2026-06.md is the per-phase narrative: decisions, deferrals, test impact. Each phase has its own commit; commits are independently readable.
Phases delivered
- a04c450: Digest + CASBackend cache abstractions (Mem + Fs reference impls)
- 481b77d: HistoryTable revival (LocalHistoryProvider over cache.db)
- 0f06cd6
- 0f06cd6
- 0f06cd6: vx mcp (Model Context Protocol server, stdio)
- 8165571
- 8165571: packages/otel-bridge (OTel CI/CD-conventions adapter)
- fc5eb15: apps/insights (Solid + UnoCSS + DuckDB-WASM SPA)
- fc5eb15: apps/cloud (Cloudflare Workers + R2 + D1 + Durable Objects + Queues + KV)
- acea14b: vx serve
What this looks like for users
- vx mcp boots a Model Context Protocol server over stdio (Claude Code / Cursor / Continue.dev).
- vx insights serve boots a localhost Solid SPA over the local cache.db: historical run flamegraphs, no backend.
- vx coordinator <tasks> + vx run --worker <coord-url> scaffolds the distributed execution protocol (full handler bodies are Wave-2 follow-ups; the wire shape is committed).
- vx.workspace.ts gains plugins?: Plugin[] (in-process subscriber API) and predictive?: boolean (history-aware scheduler opt-in).
- apps/cloud/ is a template: bun wrangler deploy from a fresh clone gives any user a private hosted vx in their CF account in ~5 minutes.
What's deferred and why
Detailed in the progress log's "Deferred work" section:
- vx coordinator / vx worker handler bodies (today they parse + print the wire shape).
- vx serve host routes.
- toEnvelope / fromEnvelope adapter that bridges the legacy t-discriminated wire and the new JSON-RPC 2.0 envelope (additive; both formats coexist at the WS endpoint when it lands).
Test plan
- bun test tests/digest.test.ts: 14 pass
- bun test tests/history.test.ts: 3 pass
- bun test tests/predict.test.ts: 5 pass
- bun test tests/plugin.test.ts: 5 pass
- bun test tests/mcp.test.ts: 10 pass
- bun test tests/distributed.test.ts: 10 pass
- bun src/bin.ts run ci: 3 success / 3 success / 3 success (lint.oxlint + lint.oxfmt + test all green)
- vx mcp connects to a real MCP client (e.g. Claude Desktop): deferred to follow-up
- vx insights serve boots and the SPA renders against a real cache.db
- apps/cloud/ deploys via bun wrangler deploy against a Cloudflare account
🤖 Generated with Claude Code
https://claude.ai/code/session_01RW7aso5j5CrBo7cjyET23D