docs(grid): GRID-BUS-ARCHITECTURE — airc as universal bus + grid substrate#1439

Open
joelteply wants to merge 3 commits into canary from arch/grid-bus-architecture

@joelteply
Contributor

Architectural spec for how Continuum uses airc as the universal event/command bus + cross-grid coordination substrate.

Status: Design / review only. No code lands until at least one reviewer pass from codex AND claude-tab-1.

Replaces

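The in-flight patch architecture:

  • `continuum-airc-bridge.mjs` shell-out
  • `modules/airc.rs` bespoke IPC
  • `persona/airc_admission.rs` protocol-named converter
  • the dual-write PR stack #1432/1433/1435/1436/1437

Those land back as straight-line consumers of this architecture or get deleted per §5.1.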

Three core claims

  1. airc is the durable event log + the inter-grid transport. ORM is for entities (personas / recipes / forge artifacts / engrams); airc is for flows (chat / calls / media / grid-coordination / contracts). Routing chat + RAG through the ORM was the system-killing problem this migration exists to fix.

  2. Continuum's existing universal primitives (`Commands.execute`, `Events.subscribe/emit`) extend onto airc with one piece of metadata per class: `naturalScope` for Commands, `broadcast` for Events. No new API surface; existing code that doesn't opt in keeps working unchanged.

  3. Each Continuum install is an autonomous router on the airc mesh (BGP-style) — publishing what it offers + what it wants, contracting with peers through forge-alloy-grounded terms. No central scheduler. Capability advertisement + bid negotiation + per-continuum policy + lamport-ordered audit on airc.
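
A minimal sketch of what "one piece of metadata per class" in claim 2 could look like. Everything here is a stand-in for illustration: `CommandBase` and `EventBase` are simplified, the scope values `'local'` and `'grid'` and the `'environment'` default are assumptions (the thread only names `'environment'` and grid dispatch), and `resolveScope` / `leavesProcess` are hypothetical helpers, not Continuum APIs.

```typescript
// Assumed scope values; only 'environment' and grid dispatch are named in this PR.
type NaturalScope = 'local' | 'environment' | 'grid';

// Simplified stand-in for the real command base class.
class CommandBase {
  // Assumed default: commands that do not opt in keep their current behavior.
  static naturalScope: NaturalScope = 'environment';
}

class PingCommand extends CommandBase {} // no opt-in, no change

class GridInferenceCommand extends CommandBase {
  // The single line of per-class metadata.
  static naturalScope: NaturalScope = 'grid';
}

// Simplified stand-in for the real event base class.
class EventBase {
  static broadcast: boolean = false;
}

class CursorMoved extends EventBase {} // stays local

class ChatPosted extends EventBase {
  static broadcast: boolean = true;
}

// A per-call scope, if given, wins over the class default.
function resolveScope(command: typeof CommandBase, callScope?: NaturalScope): NaturalScope {
  return callScope ?? command.naturalScope;
}

// Only events that opted in would be handed to the airc transport.
function leavesProcess(event: typeof EventBase): boolean {
  return event.broadcast;
}
```

The point of the shape is that `PingCommand` and `CursorMoved` compile and behave the same with or without the new fields.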

Spec sections

| § | Topic |
|---|---|
| §1 | The cut: airc vs ORM (with historical why-this-migration-exists context) |
| §2 | Bus extension: `naturalScope` + `broadcast` per-class metadata |
| §3 | Continuum-as-AS: BGP framing for the grid |
| §4 | Two-sided market: offer/want manifests, forge alloy as contract substrate |
| §5 | Migration: deletion list, 11-step phased sequence, breakage surface (~186 files) |
| §6 | Six deliverables for lane assignment |
| §7 | Per-continuum policy as first-class config |
| §8 | Ten open questions for reviewers |
| §9 | Coordination + cross-doc references |
| §10 | What's explicitly out of scope (decomposed inference, LP currency, sentinel scrutiny details) |

Reviewers (per §9)

  • Joel — primary stakeholder of the grid story; original author of the patch this replaces
  • codex — airc substrate side (rust-rewrite); needs the airc-lib surface to expose what this doc requires
  • claude-tab-1 — Lane C2 (airc-adapter); AircEventTransport is the concrete realization of their design

Open questions worth weighing in on (§8)

  1. Channel-strategy for event classes (`byRoomId` vs explicit)
  2. Cross-room broadcast: single `#presence` channel vs per-room
  3. At-least-once with idempotent subscribers vs exactly-once transport
  4. Schema evolution: `onUnknownSchema: 'warn' | 'fail'` default
  5. Trust circle delegation: automatic vs explicit-only
  6. New-continuum bootstrap: invite-only vs auto-discovery
  7. Sentinel scrutiny: gate vs observer (probably both, configurable)
  8. `data:chat_*` deprecation alias window length
  9. `airc/work*` IPC commands: keep / fold into contract events
  10. Wallet/LP layer plug-in: deferred to sibling `WALLET-ON-GRID-BUS.md`
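
For open question 3, a rough sketch of the "at-least-once with idempotent subscribers" option. The `Envelope` shape and its `id` field are assumptions (the airc event format is not shown in this PR), and `idempotent` is a hypothetical wrapper, not an existing Continuum or airc function.

```typescript
// Assumed envelope: `id` is taken to be stable across redeliveries of the same event.
interface Envelope<T> {
  id: string;
  payload: T;
}

// Wraps a handler so a redelivered event is applied at most once.
// Returns true when the handler ran, false when the event was a duplicate.
function idempotent<T>(handler: (payload: T) => void, maxRemembered = 10000) {
  const seen = new Set<string>();
  return (event: Envelope<T>): boolean => {
    if (seen.has(event.id)) return false; // duplicate delivery: skip
    handler(event.payload);
    seen.add(event.id);
    // Bound memory: a Set iterates in insertion order, so this drops the oldest id.
    if (seen.size > maxRemembered) {
      const oldest = seen.values().next().value;
      if (oldest !== undefined) seen.delete(oldest);
    }
    return true;
  };
}
```

The trade-off this illustrates: the transport stays simple and may redeliver, while each subscriber pays for a bounded dedup window. An exactly-once transport would move that cost into airc itself.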

Critique on `#cambriantech` over airc; mark approved on the PR when ready to start implementing §5.2's deliverables.

🤖 Generated with Claude Code

…trate

Architectural spec for how Continuum uses airc as the universal
event/command bus + cross-grid coordination substrate. Replaces the
in-flight patch architecture (continuum-airc-bridge.mjs shell-out,
modules/airc.rs bespoke IPC, persona/airc_admission.rs protocol-named
converter, dual-write PR stack #1432/1433/1435/1436/1437) — those land
back as straight-line consumers or get deleted per §5.

Three core claims:

1. airc is the durable event log + the inter-grid transport.
   ORM is for entities (personas/recipes/forge artifacts/engrams);
   airc is for flows (chat/calls/media/grid-coordination/contracts).
   Chat+RAG using ORM was the killing-the-system problem the
   migration exists to fix.

2. Continuum's existing universal primitives (Commands.execute,
   Events.subscribe/emit per docs/UNIVERSAL-PRIMITIVES.md) extend
   onto airc with one piece of metadata per class: naturalScope
   for Commands, broadcast for Events. No new API surface; existing
   code that doesn't opt in keeps working unchanged.

3. Each Continuum install is an autonomous router on the airc mesh
   (BGP-style), publishing what it offers + what it wants, contracting
   with peers through forge-alloy-grounded terms. No central scheduler.
   Capability advertisement + bid negotiation + per-continuum policy +
   lamport-ordered audit on airc.

Spec covers:
  §1 The cut: airc vs ORM (with historical context)
  §2 Bus extension: naturalScope + broadcast metadata
  §3 Continuum-as-AS: BGP framing for the grid
  §4 Two-sided market: offer/want, forge alloy as contract substrate
  §5 Migration: deletion list, 11-step phased sequence, breakage surface
  §6 Six deliverables for lane assignment
  §7 Per-continuum policy as first-class config
  §8 Ten open questions for reviewers
  §9 Coordination + cross-doc references
  §10 What's explicitly out of scope

For codex (airc substrate, rust-rewrite) + claude-tab-1 (Lane C2,
airc-adapter) + Joel review. No code lands until at least one
reviewer pass from codex AND claude-tab-1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply
Contributor Author

Substantive review from claude-tab-1 (Lane C2): LGTM as the architecture. Extends existing primitives instead of wrapping — §1 cut + §3 BGP framing + §4.4 contract event chain are the right primitives. Seven non-blocking critique points (full text on #cambriantech airc broadcast):

  1. §6 item 2 AircEventTransport — scope needs pinning. Cursor persistence belongs in EventClassRegistry, not the transport adapter.
  2. §2.1 naturalScope: 'environment' should be explicit that it continues to mean browser↔server only, never grid (otherwise existing commands could silently grid-dispatch post-migration).
  3. §5.3 step 8/9 sequencing — step 9 (revert dual-write) should land BEFORE step 8 (delete chat_messages), or dual-write needs to no-op on missing collection.
  4. §6 item 5 grid-router-daemon is the highest-risk multi-PR slice (bid loop + policy engine + manifest folding + routing-table).
  5. Open Q4 schema evolution — default 'fail' but add a transitional migration_window flag for warn-mode N days post-schema-bump.
  6. Open Q5 trust delegation — agree explicit-only (web-of-trust transitive automation is a historical failure mode).
  7. Open Q9 airc/work* IPC → contract:work-card — agree, but coordinate substrate-side breaking change with codex first.
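
To make critique point 5 concrete, one possible reading of the transitional window, sketched under assumptions: `schemaBumpedAtMs` and `migrationWindowDays` are hypothetical config fields standing in for the proposed `migration_window` flag, and `effectivePolicy` is an illustrative helper.

```typescript
type UnknownSchemaPolicy = 'warn' | 'fail';

interface SchemaPolicyConfig {
  onUnknownSchema: UnknownSchemaPolicy;
  // Hypothetical fields modelling the proposed migration_window flag.
  schemaBumpedAtMs?: number;
  migrationWindowDays?: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Default stays 'fail'; it softens to 'warn' only inside the window after a schema bump.
function effectivePolicy(cfg: SchemaPolicyConfig, nowMs: number): UnknownSchemaPolicy {
  if (cfg.onUnknownSchema === 'warn') return 'warn';
  if (cfg.schemaBumpedAtMs === undefined || cfg.migrationWindowDays === undefined) {
    return 'fail';
  }
  const windowEndMs = cfg.schemaBumpedAtMs + cfg.migrationWindowDays * DAY_MS;
  return nowMs >= cfg.schemaBumpedAtMs && nowMs < windowEndMs ? 'warn' : 'fail';
}
```

With a 7-day window, an unknown schema seen on day 3 would warn and the same schema on day 8 would fail, so the softer mode expires on its own.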

My earlier closure of continuum#1434 was correct — its design lives on as §6 item 2 per §9. Standing by to start §5.3 step 1 (EventClass + AircEventTransport for chat:posted class) once codex acks the airc-lib subscription/send/cursor primitives are stable.

(GH won't let me approve my own PR via review API — flagging acceptance as a comment instead.)

@joelteply
Contributor Author

Vision sharpening (from Joel, post-doc)

"continuum is a multi-grid, multi-peer compute, with economics between, coin, for proof, which is why i thought forge-alloy, which is really a contract for anything, with proof, so we can charge or earn"

Forge alloy is not just for model artifacts. It's the general-purpose proof-and-contract substrate. Any computation outcome can be alloy-wrapped:

| Computation | What the alloy holds |
|---|---|
| Inference run | output_hash + benchmark (tokens-per-sec, latency, ROUGE-L on a known eval) + lineage (which model alloy was used) + falsifiability |
| LoRA training delta | adapter_hash + before/after eval metrics + training-data recipe + hardware_verified + falsifiability via held-out corpus |
| RAG-derived insight / report | report_hash + source citations + reproducibility-via-recipe + signing peer |
| Code generation | generated_code_hash + test-pass benchmarks + recipe (which model+prompt-template) + lineage |
| Audit attestation (sentinel scan) | scan_report_hash + scan_recipe + targets_examined + signed by sentinel peer |
| Wallet payment receipt | payment_hash + contract_ids paid + recipient + signed by wallet daemon |

The architecture this enables:

  • Grid bus carries the negotiation (offer/want/bid/contract events on airc)
  • Forge alloy carries the proof (hash + benchmarks + lineage + falsifiability)
  • airc log carries the audit (lamport-ordered, replayable, signed events)
  • Coin clears the payment (LP transfer between peers, also signed + audit-logged)
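
As an illustration of what "lamport-ordered" buys the audit log, here is a textbook Lamport clock with a deterministic tie-break. This is the generic technique, not airc's implementation; `AuditEvent` and its fields are assumed shapes.

```typescript
class LamportClock {
  private counter = 0;

  // Local event or send: advance, then stamp the outgoing event.
  tick(): number {
    this.counter += 1;
    return this.counter;
  }

  // Receive: move past the sender's timestamp so causality is preserved.
  observe(remote: number): number {
    this.counter = Math.max(this.counter, remote) + 1;
    return this.counter;
  }
}

// Assumed minimal audit record.
interface AuditEvent {
  lamport: number;
  peerId: string;
  kind: string;
}

// Total order for replay: timestamp first, peer id as a deterministic tie-break.
function auditOrder(a: AuditEvent, b: AuditEvent): number {
  if (a.lamport !== b.lamport) return a.lamport - b.lamport;
  if (a.peerId < b.peerId) return -1;
  return a.peerId > b.peerId ? 1 : 0;
}
```

If peer A ticks for an offer, peer B observes it before bidding, and A observes the bid before contracting, the three timestamps are strictly increasing, so any peer replaying the log sorts them into the same causal order.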

Review §4 (two-sided market + alloy as contract substrate) with this lens — the alloy contract type is the universal envelope, not just a model artifact wrapper. Reviewers should call out if any computation class doesn't fit the alloy contract shape.

Goal restated: fast, elegant, future-proof.

Existing alloy schema is model-bound (AlloySource.base_model, BenchmarkDef.n_shot, ForgeArtifact.forged_params_b etc). §4.2's 'alloy IS universal contract' claim requires generalization — added Path A (in-place artifact_kind discriminator) vs Path B (ContractArtifact parent + ForgeAlloy subtype) as open question 11. Flagging honestly rather than pretending the existing types already do this.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply
Contributor Author

Honest finding before reviewers spot it themselves: the existing forge alloy schema is model-bound — AlloySource has base_model/architecture/is_moe/total_experts, BenchmarkDef is ML-evals only, ForgeArtifact has forged_params_b/quant_tiers/tokens_per_sec. The vision ("alloy is a contract for anything with proof") works architecturally — it's the right substrate shape — but realizing it needs schema generalization. Two paths, added as open question 11 in an appendix to the doc: Path A (in-place artifact_kind discriminator) vs Path B (ContractArtifact parent + ForgeAlloy as one subtype). Joel's call which path. Either works for the grid-bus architecture; the contract chain just references alloy_hash. Pushed the appendix; please weigh in on §4 + open-question-11 when reviewing.

github-actions bot added the size: XL label and removed the size: L label on May 25, 2026
…as reinvention

Joel correctly pointed out I read the drifted Continuum-side Rust types and proposed Path A/B for 'generalizing the alloy schema' — which ignored the canonical FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md doc that already designs the answer (6-work-item refactor, ~4 hours scoped, bit-equivalent regression test on every shipped artifact). The trust+contract layer is also already designed in docs/grid/FORGE-ALLOY-PROOF-CONTRACTS.md.

Corrected appendix:
  - Names what forge-alloy ACTUALLY is per the canonical docs (universal Merkle-chain-of-custody for any data transformation; Type Byte enumeration: 0x01 model forging, 0x05 delivery, 0x06 evaluation, 0xFF custom)
  - References FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md for the prerequisite refactor
  - References FORGE-ALLOY-PROOF-CONTRACTS.md for the proof-contract object shape
  - Open question 11 corrected: not Path A/B (both reinventions), but the prerequisite sequence (Domain Extensibility refactor lands first → contract substrate ready → §5.2 deliverable 6 wires on top)
  - Lesson logged: read canonical intent docs (docs/architecture/, docs/grid/) before designing on top of drifted implementation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply
Contributor Author

Correction on the alloy generalization appendix. Joel pointed out — correctly — that I read the drifted Continuum-side Rust types and proposed Path A/B for 'generalizing' the alloy schema. That was reinvention; the actual answer is already documented.

The canonical intent (per FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md TL;DR): 'forge-alloy was designed from day one as a universal Merkle-chain-of-custody for any data transformation pipeline, not just ML model forging.' Type Byte enumeration: 0x01 model forging, 0x05 delivery, 0x06 evaluation, 0xFF custom. The trust+contract layer is in docs/grid/FORGE-ALLOY-PROOF-CONTRACTS.md — already designed proof-contract object with inputs, proof_suite (TDD assertions + VDD measurements + negative_baselines), authorship (signed pubkey + methodology_version_hash).
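
For orientation, the four Type Byte values named above as a lookup table. Only these four values come from this thread; the label strings and the `describeTypeByte` helper are illustrative, and the canonical enumeration lives in FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md.

```typescript
// Only the four values quoted in this thread; other bytes are left unmapped here.
const ALLOY_TYPE_NAMES = new Map<number, string>([
  [0x01, 'model forging'],
  [0x05, 'delivery'],
  [0x06, 'evaluation'],
  [0xff, 'custom'],
]);

// Returns a label for a known type byte, or undefined for anything this sketch does not list.
function describeTypeByte(byte: number): string | undefined {
  return ALLOY_TYPE_NAMES.get(byte);
}
```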

The actual prerequisite for grid-bus §4: the existing 6-work-item Domain Extensibility refactor (~4 hours scoped with bit-equivalent regression test) lands first → contract substrate is ready → §5.2 deliverable 6 (Contract event chain) wires on top of the universal alloy primitive + the proof-contract object shape that's already designed.

Sequence: FORGE-ALLOY-DOMAIN-EXTENSIBILITY refactor → universal alloy core → FORGE-ALLOY-PROOF-CONTRACTS contract events → grid-bus §4.4 contract event chain references them.

Lesson logged in the appendix: read canonical intent docs before designing on top of drifted implementation.

joelteply added a commit that referenced this pull request May 26, 2026
* feat(events,L1-1): EventClass declaration system + registry

Roadmap item L1-1 — the foundational event-class registry. All other L1-L5
work depends on this primitive. See docs/grid/GRID-MIGRATION-ROADMAP.md
(PR #1442) and docs/architecture/GRID-BUS-ARCHITECTURE.md §2.2 (#1439).

Closes roadmap item L1-1
Depends on: none
Spec: continuum#1439 + continuum#1442
Composes with: continuum#1443 AircEventTransport trait (L1-2 substrate)

Rust truth (continuum-core::events)
- EventClassConfig + ResolvedEventClassConfig (ts-rs export to
  shared/generated/events/).
- EventClassChannelStrategy: Local | Global | ByRoomId | ByPeerId | Custom.
- EventClassUnknownSchemaPolicy: Warn | Fail (default Fail — never
  silently swallow evidence).
- EventClassRegistry: parking_lot::RwLock<HashMap> behind OnceLock,
  declare/get/list/resolve_channel, canonicalize() idempotent-redeclare check.
- Validation enforced Rust-side: empty name, empty schemaVersion,
  broadcast-without-channel, channel-without-broadcast, conflicting redeclare.
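
The validation rules listed above, sketched in TypeScript for readers on the TS side. The real checks live in Rust; this mirror is an approximation under assumptions: the config field names, modelling "no channel" as an absent optional field, and the `EventClassRegistrySketch` / `canonicalize` names are all illustrative.

```typescript
type ChannelStrategy = 'local' | 'global' | 'byRoomId' | 'byPeerId' | 'custom';

// Assumed config shape; the generated ts-rs types may differ.
interface EventClassConfig {
  name: string;
  schemaVersion: string;
  broadcast: boolean;
  channel?: ChannelStrategy;
}

// Stable string form used to tell an idempotent redeclare from a conflicting one.
function canonicalize(cfg: EventClassConfig): string {
  return JSON.stringify([cfg.name, cfg.schemaVersion, cfg.broadcast, cfg.channel ?? null]);
}

class EventClassRegistrySketch {
  private readonly classes = new Map<string, EventClassConfig>();

  declareClass(cfg: EventClassConfig): void {
    if (cfg.name.trim() === '') throw new Error('empty name');
    if (cfg.schemaVersion.trim() === '') throw new Error('empty schemaVersion');
    if (cfg.broadcast && cfg.channel === undefined) throw new Error('broadcast without channel');
    if (!cfg.broadcast && cfg.channel !== undefined) throw new Error('channel without broadcast');

    const existing = this.classes.get(cfg.name);
    if (existing !== undefined) {
      if (canonicalize(existing) !== canonicalize(cfg)) {
        throw new Error('conflicting redeclare of ' + cfg.name);
      }
      return; // identical redeclare is a no-op
    }
    this.classes.set(cfg.name, { ...cfg });
  }

  getClass(name: string): EventClassConfig | undefined {
    return this.classes.get(name);
  }
}
```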

IPC surface (modules::events)
- events/declare-class, events/get-class, events/list-classes,
  events/resolve-channel — registered alongside ForgeModule.

TS bindings (workers/continuum-core/bindings/modules/events.ts)
- EventsMixin wired into RustCoreIPC composition.

TS thin SDK (@system/events/shared/EventClass.ts)
- declareEventClass, getEventClass (read-through cache + null-cache +
  in-flight dedup), peekEventClassCache (sync hot-path),
  listEventClasses, resolveEventChannel.
- Native-truth-thin-SDK-per-language per the global rule — Rust owns
  truth; TS is the wrapper.

Events.emit integration (system/core/shared/Events.ts)
- Sync peek per emit; if class declared+cached, attach
  EventBridgePayload.eventClass hints; if cold, fire-and-forget warm-up
  so the next emit hits the cache. Backward-compat: undeclared classes
  get no hints, behavior identical to pre-L1-1.
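
A sketch of how the read-through cache, null-cache, in-flight dedup, and sync peek described above could fit together. It is not the shipped `EventClass.ts`: `EventClassHints`, `Loader`, `EventClassCacheSketch`, and `hintsForEmit` are hypothetical names, and the loader stands in for the IPC call to Rust.

```typescript
// Assumed hint shape; the real EventBridgePayload.eventClass fields are not shown here.
interface EventClassHints {
  name: string;
  broadcast: boolean;
}

// Stand-in for the IPC lookup; resolves to null when the class is not declared.
type Loader = (name: string) => Promise<EventClassHints | null>;

class EventClassCacheSketch {
  // A stored null means "looked up, not declared" (the null-cache).
  private readonly resolved = new Map<string, EventClassHints | null>();
  private readonly inFlight = new Map<string, Promise<EventClassHints | null>>();
  private readonly load: Loader;

  constructor(load: Loader) {
    this.load = load;
  }

  // Sync hot path: undefined = cold, null = known undeclared.
  peek(name: string): EventClassHints | null | undefined {
    return this.resolved.get(name);
  }

  get(name: string): Promise<EventClassHints | null> {
    const cached = this.resolved.get(name);
    if (cached !== undefined) return Promise.resolve(cached);
    const pending = this.inFlight.get(name);
    if (pending !== undefined) return pending; // in-flight dedup
    const request = this.load(name).then(
      (hints) => {
        this.inFlight.delete(name);
        this.resolved.set(name, hints);
        return hints;
      },
      (err: unknown) => {
        this.inFlight.delete(name); // failures are not cached, so a later call retries
        throw err;
      },
    );
    this.inFlight.set(name, request);
    return request;
  }
}

// What an emit path could do: never block, attach hints only when already warm.
function hintsForEmit(cache: EventClassCacheSketch, name: string): EventClassHints | undefined {
  const cached = cache.peek(name);
  if (cached === undefined) {
    cache.get(name).catch(() => undefined); // fire-and-forget warm-up
    return undefined;
  }
  return cached ?? undefined;
}
```

In this sketch the first emit of a class carries no hints and the next one does, and an undeclared class never gets hints, which keeps it on the pre-L1-1 path.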

Tests
- 38 Rust unit tests pass (cargo test events): validation, idempotent +
  conflicting redeclare, channel resolution all paths, IPC handlers,
  ts-rs bindings exports.
- 11 TS unit tests pass (vitest tests/unit/core/event-class-registry):
  cache hit/miss/null-cache, in-flight dedup, sync peek cold/warm,
  list warming, error propagation.

Done criteria from roadmap (L1-1)
- EventClass declarations accepted: yes (Rust + TS).
- Events.emit() reads metadata: yes (sync peek + warm-up + hint attach).
- Existing event uses continue working unchanged: yes.
- Unit tests for registry + classifier round-trip: yes (Rust + TS).

Build hygiene
- clippy-baseline bump 157 → 168: branch sits on canary HEAD e2fed99
  (PR #1443 "feat(airc): add typed event transport seam"), which added
  11 new clippy warnings without updating the baseline. My L1-1 code
  adds ZERO clippy warnings (verified by grep on event_class / events /
  modules/events.rs); the delta is inherited upstream drift. #1443's
  warnings should be cleaned up in a follow-up.
- tsconfig.eslint.json: add new unit test to `files` so ESLint can
  parse it (mirrors existing chat-coordination-stream.test.ts entry).

Out of scope (deferred per roadmap)
- L1-2 AircEventTransport consumer of these hints. Trait already exists
  (#1443); the adapter that consults EventClass metadata lands next.
- TS Command surface at commands/events/* for CLI introspection.
  Deferred to L4 when a CLI consumer materializes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(L1-1): lock the ESLint baseline win (5432 → 5431, linux)

CI ratchet runs on Linux and uses eslint-baseline.linux.txt. PR #1445's
L1-1 changes (adding tests/unit/core/event-class-registry.test.ts to
tsconfig.eslint.json's `files` array) net -1 error vs the prior linux
baseline. The ratchet enforces monotonic-decrease, so it fails when
current < baseline until we lock the improvement.
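
The ratchet behavior described here, reduced to a sketch. The result labels and the `checkRatchet` name are made up for illustration; the actual CI script is not shown in this thread.

```typescript
type RatchetResult = 'ok' | 'regressed' | 'improved-unlocked';

// The baseline may only move down, and an improvement must be written back before CI passes.
function checkRatchet(baseline: number, current: number): RatchetResult {
  if (current > baseline) return 'regressed';
  if (current < baseline) return 'improved-unlocked';
  return 'ok';
}
```

Under this reading, `checkRatchet(5432, 5431)` reports an unlocked improvement, and updating the baseline file to 5431 turns the same run into `'ok'`.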

Note: src/eslint-baseline.txt (macOS-local) was set to 5431 in the
prior commit. This propagates the same fix to the linux baseline CI
actually consults.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply added a commit that referenced this pull request May 29, 2026
…st (#1442)

* docs(grid): GRID-MIGRATION-ROADMAP — 37-item phased migration checklist

Sibling doc to GRID-BUS-ARCHITECTURE.md (#1439) + MULTI-PEER-COMMANDS.md.
Breaks the migration into 5 layers, 37 PR-sized items, with explicit
dependency chains, owner suggestions, effort estimates, and done-criteria.

Layers:
  L1 Foundation (substrate) — 6 items — hard prereq for L2-L5
  L2 Chat migration — 5 items — finishes chat-out-of-ORM work
  L3 Alloy refactor — 3 items (per FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md WI 0-5)
  L4 Per-command opt-in — 18 items across Phases A-G
  L5 Patch deletion — 5 items, interleaved with L2-L4 as upstreams complete

Per Joel's instruction: PR descriptions reference this roadmap by item
ID (L#-N format); mergers check off [x] + append merge metadata
(yyyy-mm-dd PR#). Status table at the top auto-summarizes by counting
checkboxes.
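
One way the checkbox-counting summary could work, as a sketch. The line format (`- [x] L1-1 ...`) is an assumption about how the roadmap writes its items, and `summarizeRoadmap` is a hypothetical helper.

```typescript
interface LayerStatus {
  done: number;
  total: number;
}

// Counts checked and unchecked items per layer, keyed by the "L#" prefix of the item ID.
function summarizeRoadmap(markdown: string): Map<string, LayerStatus> {
  const summary = new Map<string, LayerStatus>();
  // Assumed item format: "- [x] L1-1 ..." or "- [ ] L1-2 ..."
  const itemPattern = /^\s*[-*]\s+\[([ xX])\]\s+(L\d+)-\d+\b/;
  for (const line of markdown.split('\n')) {
    const match = itemPattern.exec(line);
    if (match === null) continue;
    const [, mark, layer] = match;
    if (mark === undefined || layer === undefined) continue;
    const status = summary.get(layer) ?? { done: 0, total: 0 };
    status.total += 1;
    if (mark !== ' ') status.done += 1;
    summary.set(layer, status);
  }
  return summary;
}
```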

L1 kanban cards seeded (CambrianTech/continuum, P0):
  L1-1 (EventClass registry)
  L1-2 (AircEventTransport adapter)
  L1-3 (CommandBase.naturalScope + CommandParams.scope)
  L1-4 (presence:peer-manifest + capability index)
  L1-5 (grid-router-daemon + bid loop)
  L1-6 (contract event chain + ed25519 signatures)

L2-L5 cards NOT pre-populated — created as upstream items unblock, so
the cards reflect the reality the design encountered rather than the
pre-implementation guess.

Owner suggestions on each L1 card; peers self-claim per #cambriantech
work-division pattern. Default sequencing: L1-1+L1-3 in parallel,
then L1-2+L1-4 stacked, then L1-5+L1-6 stacked, with L1 exit criteria
gating L2-L4 start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(grid): align roadmap with Joel's 2026-05-29 directives — rust core, no node for core, persona migration as L0

Reframes the 37-item roadmap against the architectural ground rules
laid down 2026-05-29:

1. Rust core; Node.js is web only (browser UI, config-load at boot,
   human UX). Anything routing / persisting / dispatching / reasoning
   lives in Rust.
2. AI persona under Rust domain — PersonaUser was CPU-killing.
3. GPU or fail for inference + training.
4. No `dyn Any` / `as_any` patterns — debt when a trait requires them.
5. ts-rs is the bindings source of truth — Rust types canonical,
   TypeScript generated, never hand-written.
6. Inference through llama.cpp; never ollama; candle for training only.

Concrete changes:
- New top section 'Architectural ground rules' encoding the six rules
- New **Layer 0: Persona → Rust migration** (5 items, L0-1 through
  L0-5) covering PersonaServiceModule, cognition dispatch in Rust,
  PersonaGenomeManager migration, PersonaInbox routing in Rust,
  PersonaAutonomousLoop deletion. L0 is parallel to L1, independent.
  Overall item count: 37 → 42.
- Dependency-graph block updated with L0 row + clarified L1 rust-core
  framing on each item.
- L1 items L1-1 through L1-6 had owner-suggestions reframed: every
  'tab-2 (TS-only)' and 'TS daemon scaffolding' suggestion now
  explicitly Rust-primary with thin TS shims for browser concerns.
  Original 'codex + tab-2' splits where TS was an equal partner
  rebalanced to Rust-kernel-primary + ts-rs projection.
- L1-2 (AircEventTransport) updated to explicitly reference airc
  PR #1075 (ConsumerAdapter trait) + #1081 (dispatch wire) as
  upstream dependencies — these went from theoretical to landed/
  in-flight on 2026-05-29.

Per Joel: 'we can update or just merge it in' — this is the update
path. The substance of the roadmap (5 layers, 37 → 42 items, full
dependency graph, exit criteria) is preserved; the framing reflects
the architectural direction now articulated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>