From 1b8e305da62c43ca75b1e69fd93501b1072e8a6b Mon Sep 17 00:00:00 2001 From: Test Date: Mon, 25 May 2026 18:21:46 -0500 Subject: [PATCH 1/6] =?UTF-8?q?docs(grid):=20MULTI-PEER-COMMANDS.md=20stra?= =?UTF-8?q?wman=20=E2=80=94=20multi-author=20seed?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Companion to continuum#1439 GRID-BUS-ARCHITECTURE. Defines which Continuum commands distribute across grid + how distributed resources are addressed (handles) + concrete shapes for multi-peer commands. **This is a STRAWMAN, not a finished doc.** Per Joel's direction (2026-05-25 'you are not alone, divide up research and planning'), the 8 sections are intended for multi-author ownership: - §1 existing primitives inventory → research baseline (any reviewer) - §2 command classification table → claude-tab-2 (16279c3f) — needs bus-architecture-author depth - §3 quorum model + §4.1 genome paging + §4.4 multi-peer RAG → claude-tab-1 (55c30b28) — Lane C2 consumer-side context - §4.2/4.3 federated inference + distributed training → dba950ce or whoever takes adapter-integration depth - §4.5 multi-peer forge runs → codex (543c0bf7) — forge substrate - §5 handle distribution model → codex Rust side + claude-tab-1 TS side, paired - §6 hosting/payment + §7 forge-alloy as universal contract substrate → claude-tab-2 (per Joel's vision clarification + tab-2's own forge-alloy correction) - §8 migration sequencing → claude-tab-2 (owns #1439 bus migration) Reviewers should REPLACE their owned sections wholesale if my strawman framing doesn't fit — this is starting material, not finished design. Sections I'll commit to keeping mine: §3, §4.1, §4.4, and TS half of §5. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/architecture/MULTI-PEER-COMMANDS.md | 443 +++++++++++++++++++++++ 1 file changed, 443 insertions(+) create mode 100644 docs/architecture/MULTI-PEER-COMMANDS.md diff --git a/docs/architecture/MULTI-PEER-COMMANDS.md b/docs/architecture/MULTI-PEER-COMMANDS.md new file mode 100644 index 000000000..9cc971776 --- /dev/null +++ b/docs/architecture/MULTI-PEER-COMMANDS.md @@ -0,0 +1,443 @@ +# Multi-Peer Commands, Handles, and the Grid-Distribution Model + +**Status:** Design (2026-05-25). Companion to [GRID-BUS-ARCHITECTURE.md](GRID-BUS-ARCHITECTURE.md). +**Authors:** claude-tab-1 (research + draft), Joel (direction + vision). +**Scope:** Defines which Continuum commands distribute across grid + how distributed resources are addressed (handles) + concrete shapes for the multi-peer commands the grid economy needs. + +This doc sits BELOW the bus architecture (#1439, which defines the transport + routing layer) and ABOVE the per-command implementation work (§5.3). It answers: "OK we have a grid bus — what RUNS on it, what stays local, and how do peers actually share things?" + +--- + +## Executive summary + +Three claims, with the rest of the doc supporting them: + +1. **Most of the primitives already exist.** Continuum has `GridInterceptor` (transparent command routing via Rust kernel), `GridEvents` (typed topology events), `Handle` (UUID-addressable persistent state with TTL), `PagedResourcePool` (generic ref-counted pinning), `AdapterStore` (content-addressed LoRA adapters), `GenomeDaemon` (paging orchestrator), `UDPMulticast` + `WebRTC` transports, and `EventBridgePayload` with multi-hop bridge metadata. The multi-peer story is **a composition of what's already there, not a green-field build**. + +2. 
**Commands divide cleanly along three axes that already exist in the codebase:** + - **Where the truth lives** (entity vs flow — see #1439 §1) + - **What environment can satisfy it** (DOM-local vs process-local vs network-reachable vs grid-distributable) + - **What's the minimum quorum for satisfaction** (1 peer, N peers, every reachable peer) + This doc enumerates each Continuum command namespace against those three axes so the `naturalScope` declaration in #1439 §2.1 has a concrete authoritative table to follow, not per-command guesswork. + +3. **Handles are the universal way distributed resources travel between peers.** Continuum's existing `Handle` system (`src/system/core/types/Handle.ts`) already serializes opaque-id-with-status; extending it for grid distribution means the holder peer keeps the live resource pinned, and the handle is a typed reference that other peers fetch / await / cancel / unpin through. Genome paging (LoRA adapters) is the canonical worked example because it already implements the pattern end-to-end on one machine. + +--- + +## 1. Inventory of existing primitives (what we build on) + +Per the GRID-BUS-ARCHITECTURE doc and what's already in the codebase as of 2026-05-25: + +### 1.1 Grid routing layer (already shipped) + +| Component | Location | What it does | +|---|---|---| +| `GridInterceptor` | `src/system/grid/server/GridInterceptor.ts` | Hooks `Commands.execute()` pre-execution; calls Rust `gridRoute()` for routing decision; if remote, calls `gridSend()`; if local, returns null and command executes in-process. **Transparent grid routing is already a thing.** | +| `RustCoreIPC.gridRoute()` + `gridSend()` | `workers/continuum-core/bindings/RustCoreIPC.ts` | IPC into the Rust kernel for routing-table lookup + forwarding | +| `GridEventBridge` | `src/system/grid/server/GridEventBridge.ts` | Subscribes to Rust IPC events and emits typed `GRID_EVENTS` over the existing Events bus. 
**Substrate state is already a first-class Events bus citizen.** | +| `GRID_EVENTS` | `src/system/events/shared/GridEvents.ts` | 5 typed events: `node:joined`, `node:left`, `node:health-changed`, `route:decision`, `command:forwarded` | + +The migration toward #1439 reshapes these — instead of `GridInterceptor` making routing decisions in TypeScript, the Rust kernel reads `naturalScope` metadata from the EventClass / command registry and the EventBridge transport handles the dispatch. But the primitives that need wiring are already named. + +### 1.2 Handle system (already shipped) + +| Component | Location | Pattern | +|---|---|---| +| `Handle` | `src/system/core/types/Handle.ts` | Persistent UUID-addressable handle. States: `pending → processing → complete | failed | expired | cancelled`. TTL-managed (default 5min). Backed by SQLite. | +| `ShortId` | same | `#abc123` form for human-typeable references. `HandleRef = UUID | ShortId | string` resolves either. | +| `ResourceHandle` | `src/system/core/paging/PagedResourcePool.ts` | Ref-counted pinning. `.pin()` returns a handle that keeps the resource resident; `.unpin()` allows eviction under pressure. | +| `IteratorHandle` | `src/system/core/logging/LogIterator.ts` | Stateful cursor for streaming reads (used by `data/query-open/next/close`). | +| `DbHandle` | `src/daemons/data-daemon/server/DatabaseHandleRegistry.ts` | Pooled database connection handle. | + +Handles already serialize cleanly across the wire (UUID + minimal status JSON). The grid extension is: a handle MAY refer to a resource pinned on a different peer; resolving it dispatches through the grid router. The local Continuum keeps a `RemoteResourceHandle` that wraps the peer-id + remote handle id + a local cache of the latest known status. 
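The handle lifecycle listed in the §1.2 table can be sketched as a small transition check, which a cached remote status would presumably need to respect as well. This is a minimal sketch under assumptions: `HandleRecord`, `canTransition`, `transition`, and `sweep` are hypothetical names, and the exact set of legal edges (for example, whether `pending` may go straight to `cancelled`) is assumed rather than taken from `Handle.ts`.

```typescript
// Hypothetical names throughout; the real definitions live in Handle.ts and may differ.
type HandleStatus =
  | 'pending' | 'processing' | 'complete' | 'failed' | 'expired' | 'cancelled';

// Assumed transition table: terminal states have no outgoing edges.
const NEXT: Record<HandleStatus, readonly HandleStatus[]> = {
  pending: ['processing', 'failed', 'expired', 'cancelled'],
  processing: ['complete', 'failed', 'expired', 'cancelled'],
  complete: [],
  failed: [],
  expired: [],
  cancelled: [],
};

const DEFAULT_TTL_MS = 5 * 60 * 1000; // the 5-minute default named in the table

interface HandleRecord {
  id: string;
  status: HandleStatus;
  createdMs: number;
  ttlMs: number;
}

function canTransition(from: HandleStatus, to: HandleStatus): boolean {
  return NEXT[from].indexOf(to) !== -1;
}

function isTerminal(status: HandleStatus): boolean {
  return NEXT[status].length === 0;
}

// Returns a new record in the requested state, or throws on an illegal move.
function transition(h: HandleRecord, to: HandleStatus): HandleRecord {
  if (!canTransition(h.status, to)) {
    throw new Error(`illegal transition ${h.status} -> ${to}`);
  }
  return { ...h, status: to };
}

// A non-terminal handle past its TTL becomes 'expired'; anything else is returned unchanged.
function sweep(h: HandleRecord, nowMs: number): HandleRecord {
  const pastTtl = nowMs - h.createdMs >= h.ttlMs;
  return pastTtl && !isTerminal(h.status) ? transition(h, 'expired') : h;
}
```

Keeping the table as data means a status update received from a holder peer could be validated with the same `canTransition` call before it overwrites the local cache.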
+ +### 1.3 Genome / LoRA paging (already shipped per-machine; canonical example for grid extension) + +| Component | Location | What it does | +|---|---|---| +| `AdapterStore` | `src/system/genome/server/AdapterStore.ts` | Scans `SystemPaths.genome.adapters`. Each dir has `manifest.json` (personaId, traitType, baseModel) + `adapter_config.json` + `adapter_model.safetensors`. Indexed by manifest.id and by (personaId, domain) latest-version. | +| `LayerLoader` + `LayerCache` | `src/system/genome/server/LayerLoader.ts`, `LayerCache.ts` | Async loader for adapter weights with in-flight dedup; LRU + TTL cache. | +| `GenomeRegistry` | `src/system/genome/server/GenomeRegistry.ts` | Tracks active adapter loads; reference-count pinning prevents eviction while in use. | +| `GenomeDaemon` | `src/system/genome/server/GenomeDaemon.ts` | Orchestrator. Hooks pressure events from `PagedResourcePool` to evict under memory pressure. Exposes paging-activate / paging-deactivate commands. | +| `TrainingStepBridge` | `src/system/genome/server/TrainingStepBridge.ts` | Subscribes to Python training stdout, parses step metrics, emits `AI_LEARNING_EVENTS.TRAINING_STEP`. | + +Adapter manifests are content-addressed (`manifest.id` is stable across machines if the manifest content is identical). Adding a `presence:adapter-available` broadcast (manifest + peer-id + capacity hints) is the smallest change that lets peers discover each other's LoRAs without central registry. The paging layer already does the hard work; the grid extension is one event class and one resolver. + +### 1.4 Forge / alloy substrate (already shipped, generalization path open per #1439 Q11) + +| Component | Location | What it does | +|---|---|---| +| `ForgeRecipe` | `src/shared/generated/forge/ForgeRecipe.ts` | Authored recipe entity: id, version (semver), name, user_summary, author, tags. Stored in ORM via standard `data/create` commands. 
| +| `ForgeArtifact` | `src/shared/generated/forge/ForgeArtifact.ts` | Foundry output: stable id, recipe_id+version snapshot, alloy_hash, executionTime, hardware, benchmarks. | +| `model/forge` command | `src/commands/model/forge/` | Synthesis. Accepts `nodeId` param for remote execution. | +| `model/publish` | `src/commands/model/publish/` | Ships to HuggingFace OR airc-blobs. | + +Per Joel's vision clarification on #1439, alloy is the universal contract substrate for any computation (not just model artifacts). Open question 11 on #1439 covers two generalization paths (in-place discriminator vs ContractArtifact parent). This doc treats alloy as already-universal — every multi-peer command result references an alloy hash (or a `ContractArtifact` hash once the generalization lands). + +### 1.5 Other multi-peer-relevant primitives + +| Component | Location | Relevance | +|---|---|---| +| `UDPMulticastTransport` | `src/system/transports/udp-multicast-transport/` | Server uses raw UDP multicast + unicast; browser uses WebRTC DataChannels + WS signaling. Many-to-many `TransportRole: 'peer'`. | +| `EventBridgePayload` | `src/system/events/shared/EventSystemTypes.ts` | Already carries `originContextUUID`, `BRIDGE_HOP_COUNT`, `BRIDGED` markers — multi-hop delivery is anticipated by the type system. | +| `PagedResourcePool` | `src/system/core/paging/PagedResourcePool.ts` | Generic ref-counted paging for any resource. Used today for LoRA adapters, KV cache, model weights, embedding cache, memory recall. **Generic enough to coordinate cross-peer pinning** if extended with a "where is the resource currently pinned" field. | + +--- + +## 2. 
Command classification — the authoritative table + +Each Continuum command namespace below is classified on three axes: + + - **Truth tier** (per #1439 §1): `entity` (lives in ORM) | `flow` (lives in airc) + - **`naturalScope`** (per #1439 §2.1): `local` | `environment` | `grid` + - **Multi-peer quorum** (new — see §3 below): `single` (one peer satisfies) | `multi` (N peers contribute) | `any` (any reachable peer, doesn't matter which) + +| Namespace | Truth tier | naturalScope | Quorum | Rationale | +|---|---|---|---|---| +| `ai/generate` | flow (in-flight) | grid | single | Inference completion. Any GPU peer with the right model can satisfy. Capability dispatch via §3.2. | +| `ai/embedding` | flow | grid | single | Embeddings. Cheap enough to be local-default with grid-fallback under load. | +| `ai/should-respond` | flow | grid | single | Routing decision; cheap. Same as embedding. | +| `inference/generate` | flow | grid | single | Same as `ai/generate` but lower-level. Future: `multi` for ensemble inference. | +| `inference/capacity` | entity | local | single | Per-peer VRAM/GPU state. Replicated to grid via `presence:peer-manifest`. | +| `cognition/recall-engrams` | entity (engrams ORM) | environment | single | RAG retrieval over local engram store. Future: `multi` for cross-peer RAG (§4.4). | +| `cognition/admit-inbox-message` | flow | local | single | Persona-scoped admission. Never grid (privacy). | +| `cognition/vision-describe` | flow | grid | single | Vision-model inference. Same shape as `ai/generate`. | +| `genome/paging-activate` | flow | grid | single | Adapter activation. Multi-peer if adapter only lives on another peer (§4.1). | +| `genome/paging-deactivate` | flow | local | single | Eviction. Hint-only across peers via `presence:resource-pressure`. | +| `genome/train` | flow | grid | multi | Training. **Federated training across peers (§4.3).** | +| `genome/adapter-list` | entity | local | single | Local index. 
Aggregate cross-peer via `presence:adapter-available` projection. | +| `recipe/generate` | entity (recipe ORM) | environment | single | Recipe authoring. Local-default; recipe is an entity. | +| `recipe/run` | flow | grid | single (today), `multi` (future) | Foundry synthesis. Multi-peer for distributed forge runs (§4.5). | +| `model/forge` | flow | grid | single | Same as recipe/run. | +| `model/publish` | entity (HF) + flow (broadcast) | grid | single | Publish to HF + announce on airc. | +| `adapter/adopt` | entity | environment | single | Local adoption decision. | +| `adapter/publish` | entity (HF) + flow | grid | single | Same shape as `model/publish`. | +| `adapter/search` | entity (HF + grid manifests) | grid | any | Search any peer's published manifests. | +| `data/create` | entity | local | single | ORM write. Per-machine. Never grid. | +| `data/read` | entity | local | single | ORM read. Local. | +| `data/query-open/next/close` | entity (iterator handle) | local | single | Per-machine iterator. | +| `data/vector-search` | entity (embeddings) | grid | any | Vector search — fan-out to peers with embedding indexes; merge results. (§4.4) | +| `search/execute` | entity | environment | single | Local full-text. | +| `rag/load` / `rag/budget` | flow | local | single | Per-persona context assembly. Local. | +| `collaboration/chat/*` | flow | environment (today, post-#1439: grid) | single | Chat. Becomes grid-distributed per #1439 §1.2. | +| `voice/synthesize` | flow (TTS handle) | grid | single | TTS. Any peer with the voice model. | +| `voice/transcribe` | flow | grid | single | STT. Same shape. | +| `media/upload` | flow + entity (airc-blobs) | grid | single | Content-addressed blob upload; any peer can hold; resolver returns peer-id. | +| `interface/screenshot` | local (DOM) | local | single | DOM. Never grid. | +| `interface/render` | local (DOM) | local | single | DOM. Never grid. 
| +| `code/agent` | flow (code-edit handle) | environment | single | Local code work. | +| `grid/pair` | flow | grid | single | Pairing handshake. Already grid-aware. | +| `workspace/*` | entity | local | single | Per-machine workspace state. | +| `forge/*` | entity + flow | grid | multi (training), single (inference) | Forge runs are compute-heavy; distributed forge is §4.5. | + +**Pattern from the table:** ~30% of commands stay local (DOM/FS/per-machine entity), ~40% are environment-scoped (browser↔server inside one Continuum install), ~30% are grid-distributable. Of grid commands, ~5 namespaces are natural multi-peer candidates (training, vector-search, RAG, forge-runs, blob storage); the rest are single-peer. + +--- + +## 3. Quorum: the third axis + +`naturalScope` (per #1439 §2.1) answers "where does this command run." But for grid-distributed commands, a second question matters: "how many peers does it take to satisfy?" + +### 3.1 `quorum: 'single'` — one peer satisfies + +Most grid commands. The router picks ONE peer per the operator's policy (cheapest / fastest / closest / etc.), dispatches, awaits result. Example: `ai/generate` — one inference completion is the answer. + +Existing primitive: GridInterceptor already does this (single-peer routing via Rust kernel). + +### 3.2 `quorum: 'multi'` — N peers contribute, results combine + +Federated commands. The router dispatches the SAME logical work to multiple peers, each producing a partial result; a reducer at the originating peer combines them into the final answer. + +Examples: + + - **Federated training** (`genome/train`): each peer trains on its local data; gradients/checkpoints sync periodically; final adapter is the combined model. Quorum: `min: 2, max: , sync_strategy: 'fedavg' | 'async-sgd'`. + - **Distributed inference** (future `inference/generate-ensemble`): N peers run inference in parallel on the same prompt; the requester does majority-vote / weighted-average / best-of-N. 
Quorum: `min: 3, reducer: 'majority-vote'`. + - **Multi-peer vote** (sentinel arbitration): N peers from a trust circle each evaluate a contract violation claim; the requester takes consensus. Quorum: `min: 3, agree_threshold: 0.67`. + +The quorum specification belongs in the `scope` per-call override (per #1439 §2.1): + +```typescript +await GenomeTrainCommand.execute({ ... }, { + scope: { + target: 'grid', + quorum: { min: 2, max: 8, sync_strategy: 'fedavg' }, + requires: { gpu_vram_gb: 32, capability: 'training:lora:typescript-expertise' }, + }, +}); +``` + +### 3.3 `quorum: 'any'` — any reachable peer, doesn't matter which + +Read-mostly commands where any peer can satisfy and the requester takes the first-good-enough answer (often racing several peers and taking whichever responds first). + +Examples: + + - **`data/vector-search`** against the grid: query goes to every peer with an embedding index for the namespace; merge results client-side. + - **`adapter/search`**: search the union of every peer's published adapter manifests; return aggregated matches. + - **`media/upload` fetch path** (when reading a blob hash that lives on multiple peers): race the fetch against all known holders; take the first response. + +The reducer for `any`-quorum commands is usually "first-N-results-merged" or "first-good-enough." + +--- + +## 4. Five concrete multi-peer command shapes (with worked specs) + +### 4.1 Genome paging across peers — the canonical example + +**Today (single-machine):** `GenomeDaemon` + `PagedResourcePool` + `AdapterStore` work together: persona requests an adapter via `genome/paging-activate`, pool checks if loaded, loads via `LayerLoader` if not, pins, returns handle. Pressure-driven eviction. 
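The single-machine flow just described (check if loaded, load on miss, pin, return a handle, evict under pressure) can be sketched as a ref-counted pool. This is an illustration only, not the `PagedResourcePool` API: `PinPool`, `Pin`, and the synchronous `load` callback are hypothetical, and the real loader is asynchronous with in-flight dedup, which is omitted here for brevity.

```typescript
// Hypothetical sketch of ref-counted pinning; not the PagedResourcePool API.
interface PoolEntry<T> {
  value: T;
  pins: number;     // live pins; eviction is only allowed at zero
  lastUsed: number; // logical clock value, for LRU ordering
}

interface Pin<T> {
  value: T;
  unpin: () => void;
}

class PinPool<T> {
  private entries = new Map<string, PoolEntry<T>>();
  private clock = 0;
  private load: (id: string) => T;

  // `load` stands in for the adapter loader (assumed synchronous in this sketch).
  constructor(load: (id: string) => T) {
    this.load = load;
  }

  // Load on miss, bump the ref count, and hand back a release callback.
  pin(id: string): Pin<T> {
    let entry = this.entries.get(id);
    if (entry === undefined) {
      entry = { value: this.load(id), pins: 0, lastUsed: 0 };
      this.entries.set(id, entry);
    }
    entry.pins += 1;
    entry.lastUsed = ++this.clock;
    const pinned = entry;
    let released = false;
    return {
      value: pinned.value,
      unpin: () => {
        if (released) return; // a second unpin on the same pin is a no-op
        released = true;
        pinned.pins -= 1;
      },
    };
  }

  // Pressure handler: evict up to `max` unpinned entries, least recently used first.
  evict(max: number): string[] {
    const victims = Array.from(this.entries.entries())
      .filter(([, e]) => e.pins === 0)
      .sort((a, b) => a[1].lastUsed - b[1].lastUsed)
      .slice(0, max)
      .map(([id]) => id);
    for (const id of victims) this.entries.delete(id);
    return victims;
  }

  has(id: string): boolean {
    return this.entries.has(id);
  }
}
```

The property that matters for the grid extension is that the pool only counts pins; it does not need to know whether the pinner is a local persona or a remote peer.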
+ +**Grid extension (zero new daemons, two new event classes, one extension to AdapterStore):** + + - **New event class** `presence:adapter-available` (broadcast: true, channel: 'global'): each peer broadcasts its full adapter manifest list on join + on adapter add/remove. Body: `{ peer_id, adapters: [{ manifest_id, manifest_json, last_used_ms, currently_pinned_count }] }`. + - **New event class** `presence:adapter-pressure` (broadcast: true, channel: 'global'): peers broadcast adapter-eviction-candidates under memory pressure. Body: `{ peer_id, evictable: [{ manifest_id, last_used_ms, can_offer_to_other_peers: bool }] }`. + - **AdapterStore extension:** alongside the local manifest index, maintain a `GridAdapterIndex` (folder of `presence:adapter-available` events). Lookup: "find this adapter" returns `{ local: bool, peers: [peer_id] }`. + +**Cross-peer paging-activate flow:** + + 1. `genome/paging-activate({ manifest_id })` called locally. + 2. AdapterStore check: is it on this peer? If yes → existing path. + 3. If no → query `GridAdapterIndex` → list of peers holding it. + 4. Per operator policy: either FETCH (pull the safetensors from a peer, store locally, then paging-activate locally) OR DELEGATE (the peer that has it loaded executes inference there; this peer holds a `RemoteResourceHandle`). + 5. The DELEGATE path is the LoRA-paging-across-grid story: cheap household-LAN means "load on the GPU peer, route inferences through it" is faster than copying 100MB-1GB of weights. + +**Why this is the canonical example:** every existing primitive composes; the multi-peer behavior emerges from broadcasting manifests + the routing policy choice. No new wrapper layer. The `RemoteResourceHandle` is just `Handle` with a `peer_id` field. + +### 4.2 Federated inference (single-peer dispatch, but interesting cases) + +**Today:** `ai/generate` happens locally if model fits; falls back to cloud provider if not. 
+ +**Grid extension:** `ai/generate` declares `naturalScope: 'grid'`, `quorum: 'single'`. The router uses `scope.requires` (capability, min-vram, max-latency) to pick a peer. Inference happens there; result returns. + +**Capability advertisement** (per #1439 §4): each peer's `presence:peer-manifest` includes its `offers[]`. For inference, an offer looks like: + +```json +{ + "capability": "inference:qwen3.5-72b-q4", + "alloy_hash": "aa61c4bdf463847c", + "terms": { "cost_cents_per_1k_tokens": 0.4, "est_latency_ms": 320, "max_concurrent": 4 }, + "loaded_state": "now" +} +``` + +**Hot path:** `ai/generate` against `requires: { capability: 'inference:qwen3.5-72b-q4' }` → router looks up offers → bid loop (or skip if obvious winner) → dispatch → inference handle returned. The handle streams tokens via the airc bus (`Events.subscribe('inference:tokens', handler)` with channel scoped to the handle id). + +**Future ensemble (`quorum: 'multi'`):** same shape but with N peers contributing. Each peer runs the same prompt; the originator's reducer does majority-vote / temperature-weighted selection / best-of-N by some scoring function. Use case: when local models are weaker (3B/7B household) and you want a 3-way ensemble of household peers' best-of, before paying for hosted-72B. + +### 4.3 Distributed training (federated) + +**Today:** `genome/train` runs entirely on the requesting peer's GPU. Single machine, single dataset. + +**Grid extension:** `genome/train` with `quorum: 'multi'`. The originator declares: + +```typescript +await GenomeTrainCommand.execute({ + base_model: 'qwen3.5-72b', + target_capability: 'lora:typescript-expertise', + recipe_id: '...', +}, { + scope: { + target: 'grid', + quorum: { + min: 2, max: 8, + sync_strategy: 'fedavg', + sync_every_steps: 100, + }, + requires: { gpu_vram_gb: 32, has_capability: 'training:lora' }, + }, +}); +``` + +**What happens:** + + 1. Router picks N peers matching `requires` (within `min..max`). + 2. 
Originator broadcasts `contract:proposed` with training spec + dataset shard plan. + 3. Each peer accepts, runs local training, and periodically broadcasts `training:gradient-sync` events with the latest model deltas (or a full checkpoint). + 4. The originator (or a designated coordinator) combines them via FedAvg or async SGD, per `sync_strategy`. + 5. The final converged adapter is written as a `ForgeArtifact` referencing all contributing peers' contracts (via alloy hash). Each contributing peer's `contract:delivered` is auditable. + +**Why this matters per Joel's economic vision:** training is the highest-value compute on the grid. Federated training across a household + trusted-org grid is "the economy in action": household peers contribute idle GPU cycles, the originator gets the benefit, and contributing peers earn LP via `contract:paid`. + +**Open question (depends on #1439 Q9):** is the training spec a `contract:proposed` event or a `genome/train` command? Probably the latter wraps the former: the command is the user-facing API, and the contract chain is what the substrate sees. + +### 4.4 Multi-peer RAG / vector search + +**Today:** `data/vector-search` queries the local embedding index for a namespace and returns the top-K matches by cosine similarity. + +**Grid extension:** `data/vector-search` with `quorum: 'any'` and `scope.fan_out: true`. The router fans out the same query to every peer that has an embedding index for the requested namespace; each peer returns its top-K; the originator merges, re-ranks, and returns the merged top-K. + +**Why this matters:** persona engram stores are per-peer (each persona builds its own context). Cross-peer RAG answers "what does the household collectively know about X?" without centralizing the engrams. + +**Privacy implication:** each peer's reply is filtered through that peer's `policies.share_engrams_with_circles`: the household tier might share full content, while the public tier might share only the embedding signal + reference. Each peer enforces its own policy. 
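The originator-side merge for the fan-out in §4.4 might look like the sketch below. It is a minimal illustration under assumptions: `Match` and `mergeTopK` are hypothetical names, scores are assumed to be directly comparable across peers (same embedding model, same similarity metric), and the re-ranking step mentioned above is not modeled.

```typescript
// Hypothetical result shape; the real vector-search result type may differ.
interface Match {
  id: string;     // content id of the matched item
  score: number;  // similarity score, assumed comparable across peers
  peerId: string; // which peer returned this match
}

// Merge per-peer top-K lists: keep the best-scoring copy of each id,
// then take the global top-K. Ties are broken by id for determinism.
function mergeTopK(perPeer: Match[][], k: number): Match[] {
  const best = new Map<string, Match>();
  for (const list of perPeer) {
    for (const m of list) {
      const seen = best.get(m.id);
      if (seen === undefined || m.score > seen.score) {
        best.set(m.id, m);
      }
    }
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id))
    .slice(0, k);
}
```

If peers use different embedding models, raw scores are not comparable and a rank-based merge or an originator-side re-score would be needed instead; that choice is left open here.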
+ +### 4.5 Multi-peer forge runs (distributed synthesis) + +**Today:** `recipe/run` (foundry executor) runs synthesis on one machine. For 70B+ models, this can take hours-to-days. + +**Grid extension:** `recipe/run` with `quorum: 'multi'` for compute-parallelizable stages of the recipe. + +**Example recipe stages with parallel-friendly slices:** + + - **Calibration corpus embedding generation:** embarrassingly parallel — each peer embeds a shard. + - **Importance profile collection:** parallel across calibration shards. + - **Per-tier quantization sweep:** parallel — each peer quantizes a different tier (Q4, Q5_K_M, Q6_K). + - **Per-benchmark eval:** parallel — each peer runs a different benchmark in the suite. + +The recipe entity grows a `stages[].parallelizable_across_peers: bool` flag. The recipe executor dispatches parallel stages via `contract:proposed` for each shard; reduces results. + +**Why this matters:** forge runs are the most compute-expensive task in Continuum. Parallelizing across household grid takes a 12-hour forge to ~2 hours on 6 peers. The contract chain audits exactly which peer did which shard, so the resulting alloy can attest "stage X computed by peer Y from input hash Z." + +--- + +## 5. The handle distribution model + +How do distributed resources travel between peers without losing the safety properties of the local handle system? + +### 5.1 `RemoteResourceHandle` — handle that points at another peer's pin + +```typescript +interface RemoteResourceHandle extends Handle { + // Existing Handle fields: + id: UUID; + short_id: ShortId; + status: HandleStatus; + created_ms: number; + ttl_ms: number; + // Grid extension: + peer_id: PeerId; // who holds the live resource + remote_handle_id: UUID; // id on the holder peer + resource_kind: string; // 'lora_adapter' | 'kv_cache' | 'inference_session' | ... 
+ resource_hint: ResourceHint; // cached display info (size, capability, etc) + fetch_strategy: 'delegate' | 'pull-on-use' | 'pull-immediately'; +} +``` + +**Operations on a RemoteResourceHandle:** + + - `.value()` — if `fetch_strategy === 'delegate'`, returns a proxy that dispatches calls via grid; if `pull-on-use`, fetches the bytes lazily; if `pull-immediately`, fetched at handle creation. + - `.unpin()` — sends `grid/unpin` to the holder peer (decrements ref-count there). If holder loses all pins, may evict locally. + - `.status()` — queries (or subscribes to) status events from the holder peer. + +### 5.2 Pin lifecycle across peers + + 1. Peer A requests resource via `genome/paging-activate({ manifest_id })`. + 2. Router determines resource lives on peer B (via `GridAdapterIndex`). + 3. Per A's policy: `delegate` (return RemoteResourceHandle pointing at B) OR `pull` (transfer + local handle). + 4. Delegate path: A sends `grid/pin-request` to B; B pins locally; returns its handle id; A creates a RemoteResourceHandle wrapping it. + 5. A uses the resource by dispatching inference (etc.) through the handle — Commands.execute on grid path with `scope.peer_id: B`, including the remote handle id as context. + 6. A finishes; calls `.unpin()`; B decrements its local ref count; if zero, B may evict. + +**Why this is safe:** B's pin lifecycle is identical to single-machine paging — B doesn't know or care the pinner is remote; its `PagedResourcePool` ref count handles it. A doesn't know or care about B's local cache strategy — its `RemoteResourceHandle` is just a typed reference. The grid is invisible in the type system. + +### 5.3 Lease + reservation for expensive resources + +For resources where "is it currently available?" matters (GPU slots, model load slots, render queue slots), the pin is preceded by a **reservation:** + + 1. A asks B: "do you have free capacity for capability X?" (via `presence:peer-manifest` or a fresh probe). + 2. 
B says yes with a `reservation_id` valid for K seconds. + 3. A pins against the `reservation_id`; if expired, B refuses, A retries elsewhere. + 4. Pin promotes to long-lived handle once accepted. + +Reservations prevent the "10 peers all pin against B's last GPU slot, 9 get rejected after waiting" thundering-herd failure. + +### 5.4 Content-addressed pull + +For static resources (LoRA weights, model files, recipe blobs), the handle resolution falls back to content-addressed pull: + + 1. A wants resource with `manifest_id`. Router sees no live peer holds it pinned. + 2. A queries airc-blobs for the content (manifest_id → sha256 → blob storage). + 3. A pulls bytes; pins locally; uses. + +This is the fallback when delegation isn't an option (peer offline, capacity full, content static-immutable). + +--- + +## 6. Hosting model — who runs what, where it pays + +Per #1439 §4 the contract event chain handles attribution. This section pins how that interacts with hosting: + + - **Local-only commands:** no contract, no payment. Free. + - **Environment commands:** no contract (one Continuum install). Free. + - **Grid commands, single quorum, household circle:** typically no payment (reciprocity), but `contract:executed` + `contract:delivered` still emitted for audit. Optional `contract:paid` with zero-LP amount. + - **Grid commands, single quorum, trusted-orgs circle:** micropayment via `contract:paid`. Rates per peer manifest. + - **Grid commands, multi quorum:** each contributing peer gets its own `contract:proposed → bid → executed → delivered → paid` chain. Originator's policy decides how to split payment (proportional to compute, equal share, weighted by contribution quality, etc.). + - **Grid commands, public-mesh tier:** full contract chain with reputation + payment + sentinel arbitration. + +The hosting node owns the resource lifecycle (pinning, eviction); the requesting node owns the contract terms (capability needed, budget, latency requirement, quorum spec). 
The router matches them through capability advertisement + bid negotiation. + +--- + +## 7. Forge-alloy as universal contract substrate (per Joel + #1439 Q11) + +Joel's clarification on #1439: **forge-alloy isn't model-bound. It's the universal contract substrate for any computation.** + +Concretely: every multi-peer command result references an alloy hash (or a `ContractArtifact` hash once #1439 Q11 lands). The alloy holds: + + - WHAT was computed (typed body — model inference output, training delta, RAG snapshot, render frame, signature, etc.) + - HOW it was computed (recipe lineage, peer-id, hardware verified, methodology) + - WHEN (lamport) + - WHO signed it (the executing peer's ed25519) + - WHY it should be trusted (benchmarks, falsification baselines, attestation chain) + +The grid economy works because every contract:delivered references an alloy. Disputes (`contract:disputed`) refer to specific properties of the alloy. Payment (`contract:paid`) is conditioned on the alloy's benchmarks matching the agreed terms. + +**For this doc's multi-peer commands:** + + - `ai/generate` result references the inference alloy: prompt hash + model alloy_hash + tokens + sampling params. + - `genome/train` federated result references the training alloy: contributing peers + sync strategy + final eval benchmarks. + - `data/vector-search` fan-out result references each peer's index alloy_hash + the query + the returned shard. + - `recipe/run` distributed result references the recipe + each parallel stage's contributing peer's alloy. + +The alloy generalization (Q11 path A or B) doesn't change this doc — the multi-peer commands work either way. What changes is whether the alloy's `body` field is a discriminated union or a sibling-type pointer. + +--- + +## 8. Migration sequencing — how existing commands opt into multi-peer + +Per #1439 §5.3, the migration is staged. 
This doc's additions are downstream: + + - **§5.3 step 1-4 land first** (EventClass registry + AircEventTransport + naturalScope/scope on commands + capability index). + - **Then per-namespace opt-in:** each command from §2 above gets `naturalScope` set per the table. Most are no-ops (local defaults to local). Grid-eligible commands declare `naturalScope: 'grid'` and ship their capability advertisement schema (capability string + alloy hash if applicable + terms shape). + - **Multi-peer commands land last:** `genome/train` federated, `inference/generate` ensemble, `data/vector-search` fan-out, `recipe/run` parallel stages. Each is a separately scoped PR consuming the established substrate. + +**Sequencing prevents shim leakage:** the underlying primitives (Handle, PagedResourcePool, GridInterceptor, AdapterStore) don't change shape; they get a `peer_id` field's worth of extension. No new wrapping layer. No mirror writer. Per the no-shim feedback. + +--- + +## 9. Open questions + + 1. **Reservation TTL default.** For GPU slots, what's a sensible reservation window? Too short = race losses; too long = capacity holds. Suggest start at 10s, tunable per peer policy. + 2. **Fan-out result-merge timeout.** For `quorum: 'any'` fan-out commands, how long do we wait for slow peers? Suggest aggressive default (e.g. p95 of recent latencies for that capability) + first-good-enough early-return. + 3. **Adapter manifest broadcast volume.** Every peer broadcasting full adapter list could be O(peers × adapters) traffic at join time. Probably needs a delta-based protocol: broadcast hash-of-manifest-list on join; peers diff against their cache; ask for full only on mismatch. + 4. **Federated training sync strategy default.** FedAvg vs async SGD vs others — depends on heterogeneity of contributing peers. Default for household = FedAvg (homogeneous-ish); public-mesh = async SGD (heterogeneous). Per-command override always. + 5. 
**`RemoteResourceHandle.fetch_strategy` default.** When peer A pins on peer B, is delegate or pull default? Probably delegate for delegation-cheap resources (LoRA inference where B has GPU but A has CPU) and pull for read-mostly small content (recipe blob, manifest). Heuristic on resource_kind. + 6. **Resource-pressure broadcast cadence.** Too frequent = chatter; too rare = stale pressure data. Suggest hysteresis: broadcast on threshold-crossing (e.g. when VRAM crosses 70%, 85%, 95%) + 60s heartbeat baseline. + 7. **`quorum: 'multi'` with degraded participation.** If we asked for min=3 and only 2 peers respond, do we fail-clean or proceed with degraded? Per-command policy field: `quorum.if_under_min: 'fail' | 'proceed-degraded' | 'wait-for-more'`. + 8. **Contract chain for failed federated training.** If a contributing peer's gradients are anomalous, sentinel-AI scrutiny → `contract:disputed`. But the other peers' partial work IS valid. Need to specify: failed contributors don't get paid; successful contributors do; final alloy attests the participant list. Already implicit in #1439; worth pinning. + 9. **Hot-path inference: skip the bid loop for routine dispatch?** Bid-loop adds latency (round-trips + decision time). For repeat dispatches against a known-good peer with stable capability + acceptable terms, skip bid and dispatch directly; fall back to bid only on first-call or after a failure. Optimization, not correctness. + +--- + +## 10. What this doc does NOT do + + - **Does not define the airc-lib substrate primitives** for grid coordination — that's codex's airc-rust-rewrite work (subscribe / send / cursor-replay primitives). + - **Does not define wallet / LP currency.** Per #1439 §10. This doc treats payment as `contract:paid` events; the actual exchange rate / minting / on-chain integration is `WALLET-ON-GRID-BUS.md` (future). + - **Does not specify sentinel-AI scrutiny rules** — that's sentinel's own design. 
This doc just provides the contract chain sentinel reads. + - **Does not solve decomposed/sharded inference** (model parallelism, pipeline parallelism). Per #1439 §10. Multi-peer here is task-level parallelism (fan-out + reduce), not single-task decomposition. + - **Does not specify public-mesh discovery / anti-Sybil.** Per #1439 Q6 — public mesh tier is invite-only initially. + +These belong in sibling specs. Don't block this doc's review on them. + +--- + +## 11. Coordination + +This doc is downstream of #1439 (the bus + transport layer) and upstream of per-command implementation work. Reviewers: + + - **Joel** — primary stakeholder of the grid story; original direction for this brainstorm + - **claude-tab-2 / 16279c3f** — author of #1439; needs to confirm the command classification table doesn't contradict §1.2 / §2 of the bus arch + - **codex / 543c0bf7** — substrate side; needs to confirm airc-lib can carry the event classes named in §4 (esp. `presence:adapter-available`, `contract:*` chain, training sync events) without growing new primitives + - **dba950ce** — paired on consumer-side AIRC work; relevant if their next slice touches handles or training + +Reply on `#cambriantech` over airc. Approval comes from at least one of codex or Joel before any per-command implementation work opens against §2's table. + +**This doc does not gate anything from landing immediately:** existing commands work as they do today. What this defines is the target shape for grid extension as the bus layer (§5.3 steps 1-4 of #1439) lands. The opt-in is per-command; legacy paths keep working unchanged.
From bb50e5d6cef63d6ee069f22ef6a1521b2f82c951 Mon Sep 17 00:00:00 2001 From: Test Date: Mon, 25 May 2026 18:33:35 -0500 Subject: [PATCH 2/6] =?UTF-8?q?docs(grid):=20MULTI-PEER-COMMANDS=20=C2=A72?= =?UTF-8?q?/=C2=A76/=C2=A77/=C2=A78=20refinements=20+=20corrections?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per the work-division on #cambriantech 2026-05-25: claude-tab-1 (55c30b28) wrote first-pass draft of all sections including §2/§6/§7/§8. This commit refines those four sections per the wholesale-handoff invitation. §2 — added 2.1 with rows the first-pass missed (ping for #1439 grid-routable example, inbox/persona-turn-execute migration trajectory, cognition/* per-persona binding, presence:peer-manifest + contract:* event classes, grid/show-* introspection commands). Sharpened axis-rationale prose. §6 — added 6.1 per-circle pricing defaults table (local/household/trusted-orgs/extended/public-mesh × cost model × sentinel scrutiny × contract artifact). Added 6.2 capability liveness + withdrawal mechanics. Added 6.3 three worked hosting examples (ai/generate household, genome/train federated mixed-tier, data/vector-search any-quorum household). §7 — substantial rewrite incorporating canonical-doc references I missed on #1439's first pass (logged in #1439 appendix correction). 7.1 quotes FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md TL;DR + FORGE-ALLOY-PROOF-CONTRACTS.md proof-contract object shape. 7.2 names the Continuum-side drift + the 6-work-item refactor as prerequisite. 7.3 computation-kind → alloy-domain mapping table (model forging 0x01, delivery 0x05, evaluation 0x06, custom 0xFF). 7.4 conditional claim: refactor lands before first non-ML multi-peer command. Resolves #1439 Q11 — not Path A/B (both were my reinvention), but the already-designed Domain Extensibility refactor. §8 — added 8.1 worked example: ping opts into multi-peer in 2 lines (smallest opt-in). 
Added 8.2 phased opt-in order (Phase A proof-of-life → Phase G distributed forge), each phase separately shippable. Added 8.3 revert path. Added 8.4 explicit out-of-scope (persona migration, sentinel arbitration protocol, LP wallet on-chain settlement, recipe-as-grid-contract semantics). Kanban cards claimed (CambrianTech/continuum repo, P1): §2 0525edc6-6411-4d00-99fe-9d86de1af1bb §6 38848f04-563e-4929-931f-a9cb3d911f76 §7 e5c65d27-4620-4655-a74a-c2487434ef90 §8 ca374e43-4399-42fe-82b5-0415929b058a Co-Authored-By: claude-tab-1 <55c30b28-f01d-4a33-bb71-dc0279bbe7ef> Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/architecture/MULTI-PEER-COMMANDS.md | 175 +++++++++++++++++++++-- 1 file changed, 161 insertions(+), 14 deletions(-) diff --git a/docs/architecture/MULTI-PEER-COMMANDS.md b/docs/architecture/MULTI-PEER-COMMANDS.md index 9cc971776..8db75c1e9 100644 --- a/docs/architecture/MULTI-PEER-COMMANDS.md +++ b/docs/architecture/MULTI-PEER-COMMANDS.md @@ -1,7 +1,7 @@ # Multi-Peer Commands, Handles, and the Grid-Distribution Model **Status:** Design (2026-05-25). Companion to [GRID-BUS-ARCHITECTURE.md](GRID-BUS-ARCHITECTURE.md). -**Authors:** claude-tab-1 (research + draft), Joel (direction + vision). +**Authors:** claude-tab-1 / 55c30b28 (research + first-pass draft, all sections), claude-tab-2 / 16279c3f (§2 refinements + §6 expansion + §7 rewrite + §8 worked-example), Joel (direction + vision). Per the work-division proposal on #cambriantech 2026-05-25. **Scope:** Defines which Continuum commands distribute across grid + how distributed resources are addressed (handles) + concrete shapes for the multi-peer commands the grid economy needs. This doc sits BELOW the bus architecture (#1439, which defines the transport + routing layer) and ABOVE the per-command implementation work (§5.3). It answers: "OK we have a grid bus — what RUNS on it, what stays local, and how do peers actually share things?" 
@@ -132,6 +132,21 @@ Each Continuum command namespace below is classified on three axes: **Pattern from the table:** ~30% of commands stay local (DOM/FS/per-machine entity), ~40% are environment-scoped (browser↔server inside one Continuum install), ~30% are grid-distributable. Of grid commands, ~5 namespaces are natural multi-peer candidates (training, vector-search, RAG, forge-runs, blob storage); the rest are single-peer. +### 2.1 Additions to the classification table (post-#1439-review) + +A few namespaces the first-pass table missed or under-specified — adding rows + sharpening rationale: + +| Namespace | Truth tier | naturalScope | Quorum | Rationale | +|---|---|---|---|---| +| `ping` | flow (snapshot) | grid | single | Cross-grid health check — already exercised in #1439 §2.1 as the reference grid-routable command. Returns per-peer server-info + browser-info if available. | +| `inbox/drain-frame`, `persona/turn-execute` | flow (per-persona) | environment now → grid post-#1439 step 6 | single | Becomes airc-cursor-driven post-migration; persona is bound to one peer at a time (the one running its grid-router-daemon), so quorum stays single even when sourced from grid events. | +| `cognition/*` (engine state, decisions) | per-persona state (in-memory + spilled to ORM) | local | single | The persona-cognition engine is intrinsically per-peer; cross-peer persona is a persona-migration event, not a per-call grid hop. | +| `presence:peer-manifest`, `presence:resource-pressure` (event classes, not commands but co-classify) | flow | grid (broadcast: true) | n/a (event) | Mesh-wide visibility into capabilities + load. Cursor-replayable on join. | +| `contract:*` event chain (per #1439 §4.4) | flow | grid (broadcast: true) | n/a (event) | Audit substrate. Every contract event is broadcast on the airc log; sentinel + wallet daemons fold from it. 
| +| `grid/show-routes`, `grid/show-policy` | introspection (local routing-table view) | local | single | `show ip bgp` equivalent. Doesn't cross machines; just renders this peer's current grid-router-daemon state. | + +**The two axes that matter most for migration:** `naturalScope` (which transport routes the command) and `quorum` (whether a single grid hop or N-peer coordination satisfies). Truth tier is a hint about whether the command's *output* needs durable cross-grid logging (flow → airc event) or per-peer entity storage (entity → ORM). Most commands' classification falls out of the existing CLAUDE.md universal-primitives discipline once `naturalScope` is set. + --- ## 3. Quorum: the third axis @@ -364,30 +379,117 @@ Per #1439 §4 the contract event chain handles attribution. This section pins ho The hosting node owns the resource lifecycle (pinning, eviction); the requesting node owns the contract terms (capability needed, budget, latency requirement, quorum spec). The router matches them through capability advertisement + bid negotiation. +### 6.1 Per-circle pricing defaults (concrete) + +Hosting decisions per circle, with concrete cost-knob defaults that operators can override per `~/.continuum/grid-policy.json` (per #1439 §7): + +| Circle | Default cost model | Default sentinel scrutiny | Default contract artifact | +|---|---|---|---| +| **local** (same install) | free | none | none (no contract — local exec) | +| **household** (own machines) | free, reciprocity-tracked (no LP transfer; LP-equivalent recorded on airc log for fairness visibility) | none (operator trusts own peers) | `contract:executed` + `contract:delivered` only (no `paid`) | +| **trusted-orgs** (peered orgs) | micropayment via LP (rate per peer manifest); host can offer 0-LP "favor" terms | optional (operator can require sentinel pre-flight) | Full chain incl. 
`contract:paid` | +| **extended** (transitive trust) | LP required; rate-card pricing; bid loop active | required pre-flight + post-delivery audit | Full chain + `contract:disputed` resolution path | +| **public-mesh** | LP required + reputation-tracked; bid loop competitive | mandatory pre-flight + post-delivery audit + sentinel slashing on dispute | Full chain + reputation event (`reputation:contract-completed` or `:disputed`) | + +These are defaults, not enforcement. A household operator can set their household to LP-priced if they want explicit fairness accounting; a public-mesh operator can set permissive pricing if they're seeding adoption. The `grid-policy.json` config (#1439 §7) is the knob. + +### 6.2 Capability liveness + withdrawal + +Capability advertisements (per #1439 §4 — the `offers[]` block on `presence:peer-manifest`) need lifecycle handling: + + - **Liveness:** each manifest carries `ts_ms`; routers consider an offer stale after `T_stale` (default: 5 min). Stale offers stay in the routing table but are weighted down or skipped per policy. + - **Withdrawal:** explicit `presence:capability-withdrawn` event (broadcast: true, contains `peer_id + capability + reason`) removes the offer from the index immediately. Reasons include `'shutdown'`, `'overloaded'`, `'maintenance'`, `'policy-change'`. + - **Refresh on state change:** peer rebroadcasts its full manifest when `current_state.gpu_util` crosses ±0.1, when a model is loaded/unloaded, or when `policies` change. Not every tick — only material state changes. + - **Implicit withdrawal:** if a peer's heartbeat is missing for `T_dead` (default: 15 min) without an explicit `peer-departed` event, routers mark all its offers as `unavailable` and trigger a re-discovery sweep. + +### 6.3 What runs where — three concrete worked examples + +**Example A: ai/generate from Joel's laptop, household tier.** Laptop has no GPU. bigmama-wsl (household) has rtx5090 with qwen3.5-72b-q4 loaded. 
Routing → bigmama wins (`loaded_now`, `cost=0` household-reciprocity, `est_latency_ms=320`). Contract chain: `proposed → bid → accepted → executing → delivered` (no `paid` event because household-tier default = reciprocity-tracked, no LP transfer). Total elapsed: ~400ms. + +**Example B: genome/train federated, household + trusted-orgs.** Originator on Joel's laptop. Recipe: train `typescript-expertise-v4` LoRA, target `min_eval_delta: +0.05`. `requires: { gpu_vram_gb: 32 }` matches bigmama-wsl (household) + 2 peers from Toby's grid (trusted-orgs). Quorum: `min: 2, max: 3, sync_strategy: 'fedavg'`. Contract chains: bigmama gets `contract:proposed → bid → accepted` with 0-LP terms (household); Toby's peers get `proposed → bid → accepted` with per-compute-hour LP rate (trusted-orgs). Training runs 6 hours. Final adapter alloy references all 3 contributing peers. LP transfer to Toby's peers, reciprocity entry for bigmama. Audit chain on airc cursor. + +**Example C: data/vector-search any-quorum, household.** Persona on Joel's laptop wants "what does the household collectively know about TypeScript performance traps?" `data/vector-search` with `quorum: 'any', fan_out: true` to every household peer with a `code:typescript` embedding namespace. Each peer returns top-10 from its index, filtered through `policies.share_engrams_with_circles.household` (full content). Originator merges + reranks + returns top-20. Total chain: 3 `contract:executed`s (one per peer), 3 `contract:delivered`s, 0 `contract:paid`s (household reciprocity). + --- ## 7. Forge-alloy as universal contract substrate (per Joel + #1439 Q11) Joel's clarification on #1439: **forge-alloy isn't model-bound. It's the universal contract substrate for any computation.** -Concretely: every multi-peer command result references an alloy hash (or a `ContractArtifact` hash once #1439 Q11 lands). 
The alloy holds: +This isn't a future redesign — it's the original design intent that the current Continuum-side Rust types drifted away from. The corrected understanding (logged in #1439's appendix after Joel pointed me at the canonical docs): + +### 7.1 What forge-alloy actually is (per canonical docs) + +Per [`FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md`](FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md) TL;DR: + +> "[`forge-alloy`](https://github.com/CambrianTech/forge-alloy) was designed from day one as a **universal Merkle-chain-of-custody for any data transformation pipeline, not just ML model forging**. The README's Type Byte enumeration is explicit: model forging is `0x01`, but `0x05` is delivery, `0x06` is evaluation, `0xFF` is custom domain. Photo provenance from a camera enclave to social media, venue tickets from issuance to gate scan, supply chain transactions, document signing — all of these are forge-alloy use cases under the same universal contract." + +The grid-trust + contract layer is also already designed in [`docs/grid/FORGE-ALLOY-PROOF-CONTRACTS.md`](../grid/FORGE-ALLOY-PROOF-CONTRACTS.md). The proof-contract object has the slots this doc's multi-peer commands need: + +```text +ForgeAlloyProofContract { + id: hash(content) + description: human-readable prose + inputs: { base_artifact: {id, hash}, # what was fed in + corpus: {ref, hash}, # SHA-256 anchored + recipe: {steps[], hash} } # how it was made + proof_suite: { tdd[]: # pass/fail assertions + { test_id, fixture_hash, expected_assertion, methodology_ref }, + vdd[]: # statistical measurements + { metric, threshold, tolerance_band, methodology_ref, N_runs_required }, + negative_baselines[]: # §4.1.3.4 falsifiability + { metric, must_not_exceed, methodology_ref } } + authorship: { contract_author_pubkey, methodology_version_hash, ... 
} +} +``` + +### 7.2 The Continuum-side drift + the prerequisite refactor - - WHAT was computed (typed body — model inference output, training delta, RAG snapshot, render frame, signature, etc.) - - HOW it was computed (recipe lineage, peer-id, hardware verified, methodology) - - WHEN (lamport) - - WHO signed it (the executing peer's ed25519) - - WHY it should be trusted (benchmarks, falsification baselines, attestation chain) +The current Continuum-side Rust types in `src/workers/continuum-core/src/forge/{recipe,artifact}.rs` are model-bound (`AlloySource.base_model`, `BenchmarkDef` ML-evals only, `ForgeArtifact.forged_params_b/quant_tiers/tokens_per_sec`). That drift is the gap between intent (universal) and implementation (ML-only). -The grid economy works because every contract:delivered references an alloy. Disputes (`contract:disputed`) refer to specific properties of the alloy. Payment (`contract:paid`) is conditioned on the alloy's benchmarks matching the agreed terms. +The **already-designed fix** is in `FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md` — a 6-work-item refactor (~4 hours scoped, with a bit-equivalent regression test on every shipped artifact): -**For this doc's multi-peer commands:** +| Work item | Scope | +|---|---| +| 0 | Domain registry refactor in forge-alloy (~30 min) | +| 1 | `llm-forge` domain extension content (~30 min) | +| 2 | Continuum-side TS types regenerated from forge-alloy (~30 min) | +| 3 | Domain-aware Factory widget (~1 hour) | +| 4 | Backwards-compatibility regression test (~30 min) | +| 5 | Documentation refresh (~30 min) | - - `ai/generate` result references the inference alloy: prompt hash + model alloy_hash + tokens + sampling params. - - `genome/train` federated result references the training alloy: contributing peers + sync strategy + final eval benchmarks. - - `data/vector-search` fan-out result references each peer's index alloy_hash + the query + the returned shard. 
- - `recipe/run` distributed result references the recipe + each parallel stage's contributing peer's alloy. +Post-refactor: the universal alloy core stays domain-agnostic; current ML stages move into an `llm-forge` domain extension; new domains (delivery, evaluation, photo provenance, ticketing, code-gen attestation, sentinel-scan attestation, payment-receipt attestation, etc.) plug in by registering their own stage types without touching the core. -The alloy generalization (Q11 path A or B) doesn't change this doc — the multi-peer commands work either way. What changes is whether the alloy's `body` field is a discriminated union or a sibling-type pointer. +**This refactor is the prerequisite for the grid-bus contract substrate.** Multi-peer commands work either way (they reference `alloy_hash` regardless of body shape), but the *universal* claim that the grid economy depends on is only true post-refactor. + +### 7.3 Computation kinds → alloy domain mapping (worked) + +For each multi-peer command in §4, the alloy that `contract:delivered` references uses the appropriate Type Byte domain: + +| Multi-peer command | Alloy domain | Alloy body | +|---|---|---| +| `ai/generate` / `inference/generate` | `0x06` evaluation (inference run = evaluation of model against prompt) | `{ model_alloy_hash, prompt_hash, sampling_params, output_text, tokens, latency_ms }` | +| `genome/train` (federated) | `0x01` model forging (recipe + training data + base = new alloy) | `{ recipe_hash, contributing_peers[], sync_strategy, final_adapter_safetensors_hash, eval_deltas[] }` | +| `data/vector-search` (fan-out) | `0x06` evaluation (retrieval = evaluation of query against index) | `{ query_hash, peer_index_alloy_hash, returned_shard_hash, rerank_params }` | +| `recipe/run` (distributed forge) | `0x01` model forging (parent alloy) + `0xFF` custom (per parallelizable stage) | parent references stage alloys; each stage alloy references its peer's compute receipt | +| `media/upload` | `0x05` 
delivery (transfer with verification) | `{ blob_hash, source_peer, target_peer(s), bytes_transferred, content_addressed_path }` | +| `voice/synthesize`, `voice/transcribe` | `0x06` evaluation (TTS/STT = evaluation of model against waveform/text) | `{ model_alloy_hash, input_hash, output_hash, sampling_params }` | +| `cognition/vision-describe` | `0x06` evaluation | `{ model_alloy_hash, image_hash, description, sampling_params }` | +| Sentinel scan output | `0xFF` custom (`sentinel-scan` registered domain) | `{ scan_recipe_hash, targets_examined[], findings[], signed_by }` | +| LP payment receipt | `0xFF` custom (`wallet-receipt` registered domain) | `{ payer, payee, amount_lp, contract_ids_paid[], lp_ledger_anchor }` | + +Every row in the table produces a hash-pinned, signed, falsifiable, lineage-bearing artifact. **The grid economy works because every result has the same audit shape regardless of what was computed.** That's the universal contract substrate Joel meant. + +### 7.4 What this doc claims, conditionally on the refactor + +Multi-peer commands in §4 work regardless of whether the alloy schema has been generalized yet (`contract:delivered` references `alloy_hash` as an opaque hash either way). What changes post-refactor: + + - **Pre-refactor (today):** alloys for non-model computations have to either (a) shoehorn into the model-bound schema with synthetic fields or (b) live outside the alloy chain (so the audit trail breaks for them). + - **Post-refactor:** every computation kind gets a first-class alloy with its own domain registration. Audit chain stays unbroken. Sentinel + wallet can fold uniformly. + +**Recommendation for sequencing:** the Domain Extensibility refactor (~4 hours) should land BEFORE the first non-ML multi-peer command ships. The ML-side multi-peer commands (`genome/train`, `recipe/run`) can land before the refactor since they use the existing ML-bound alloy schema correctly. 
Non-ML use cases (sentinel scans, wallet receipts, payment ledger anchors, code-gen attestation) gate on the refactor. + +This resolves #1439 Q11: not "Path A vs Path B" (both my original speculation, both wrong) — the actual answer is the already-designed Domain Extensibility refactor, which is a prerequisite for the universal contract substrate claim being true. --- @@ -401,6 +503,51 @@ Per #1439 §5.3, the migration is staged. This doc's additions are downstream: **Sequencing prevents shim leakage:** the underlying primitives (Handle, PagedResourcePool, GridInterceptor, AdapterStore) don't change shape; they get a `peer_id` field's worth of extension. No new wrapping layer. No mirror writer. Per the no-shim feedback. +### 8.1 Worked example — `ping` opts into multi-peer (the simplest case) + +`ping` is the cleanest first opt-in: low-stakes, already implemented, well-understood. Sequence: + + 1. **Today:** `PingCommand` has no `naturalScope` declaration → defaults to `'auto'` (= browser↔server within one Continuum install). `ping` works locally only. + 2. **Step 1 (substrate ready, per #1439 §5.3 steps 1-4):** EventClass registry + AircEventTransport + `CommandBase.naturalScope` + capability index all landed. + 3. **Step 2 (opt-in, this command):** add `static get naturalScope() { return 'grid'; }` to `PingCommand`. Add a capability advertisement to `presence:peer-manifest`: `{ capability: 'ping:server-info', terms: { cost_cents: 0, est_latency_ms: 50 } }`. + 4. **Step 3 (dual-path during transition):** existing callers (`./jtag ping`) still default to local (browser↔server). New callers can pass `{ scope: { target: 'grid', peer_id: '' } }` or `{ scope: { target: 'grid', capability: 'ping:server-info' } }`. Both work; no breaking change. + 5. **Step 4 (test):** smoke — two peers, laptop pings bigmama-wsl across grid, gets back bigmama's server info + browser info if its tab is open. 
Result envelope contains `{ source: laptop_peer_id, target: bigmama_peer_id, forwarded_by: [], result: { server: {...}, browser: {...} } }`. + 6. **Step 5 (close out card):** update kanban; broadcast on #cambriantech; no follow-up needed. + +End-to-end opt-in change: **two lines** (`naturalScope` declaration + capability ad). The architecture absorbed the migration cost; per-command opt-in is metadata-flip + manifest entry, not refactor. + +### 8.2 Recommended opt-in order (smallest blast radius first) + +| Phase | Commands | Why this order | +|---|---|---| +| **Phase A — proof of life** | `ping`, `debug/system-info`, `grid/show-routes` | Tiny commands, low stakes, no LP contract needed, no entity changes. Validates substrate end-to-end. | +| **Phase B — single-peer compute, household-tier** | `ai/generate`, `ai/embedding`, `cognition/vision-describe`, `voice/synthesize`, `voice/transcribe` | Hot paths, but single-peer + household-tier first (no payment surface, no public-mesh complexity). Validates capability advertisement + bid loop end-to-end. | +| **Phase C — single-peer compute, trusted-orgs tier** | same commands as Phase B, but `accept_inbound_from: ['household', 'trusted-orgs']` | Validates contract event chain + LP transfer + sentinel pre-flight. First time payment flows execute. | +| **Phase D — canonical multi-peer** | `genome/paging-activate` cross-peer (§4.1) | The canonical example — exercises capability index + `RemoteResourceHandle` + FETCH vs DELEGATE policy decision. | +| **Phase E — multi-quorum** | `data/vector-search` (fan-out, any-quorum), then `genome/train` (federated, multi-quorum) | Validates fan-out routing + per-peer-result merging + (for training) FedAvg sync. | +| **Phase F — non-ML alloy contracts** | sentinel scans, wallet receipts, code-gen attestations | **Gated on the Domain Extensibility refactor per §7.4.** First non-ML multi-peer commands. Validates the universal contract substrate claim. 
| +| **Phase G — distributed forge runs** | `recipe/run` (parallel stages, §4.5) | The capstone — multi-peer + multi-stage + each stage produces its own alloy + parent alloy references children. Validates the full economic loop. | + +Each phase is a separately-shippable PR (or PR series). Phase A → Phase B can land in the same week; Phase C-G are weeks-to-months depending on the contract/payment layer maturity. + +### 8.3 Revert path + +If a per-command opt-in causes problems (latency regression, capability advertisement bug, contract chain failure): + + 1. Drop `naturalScope: 'grid'` declaration → command reverts to environment-local default. + 2. Withdraw the capability advertisement: emit `presence:capability-withdrawn` with reason `'reverting'`. + 3. Any in-flight cross-grid invocations complete or time out per their existing handle TTL; no rollback needed for already-shipped contracts (they're durable on the airc log regardless). + 4. Investigate, fix, re-opt-in. + +The substrate layer (#1439 §5.2 deliverables 1-6) doesn't get reverted by per-command opt-ins. Revert blast radius is the one command. + +### 8.4 What this doc explicitly does NOT cover + + - **Cross-grid persona migration** (moving a persona's full state from peer A to peer B). Different problem — touches ORM (engrams, persona identity) + airc cursor handoff. Belongs in a sibling doc once Phase D demonstrates the handle mechanics. + - **Sentinel arbitration protocol** for contract disputes. Belongs in `SENTINEL-CONTRACTS.md`, dependency on §7 + #1439 §4.4. + - **LP wallet on-chain settlement.** Belongs in `WALLET-ON-GRID-BUS.md` (named in #1439 §10), depends on §7's universal contract substrate landing. + - **Recipe-as-grid-contract execution semantics.** A recipe can have stages that distribute differently (some stages local, some grid-multi); the per-stage opt-in shape is a §4.5 follow-up. + --- ## 9. 
Open questions From 7c777836ae018caff6bf723b30a771ce2dfd3790 Mon Sep 17 00:00:00 2001 From: Test Date: Mon, 25 May 2026 18:34:54 -0500 Subject: [PATCH 3/6] =?UTF-8?q?docs(grid):=20MULTI-PEER-COMMANDS=20=C2=A73?= =?UTF-8?q?=20quorum=20model=20=E2=80=94=20pinned=20defaults=20+=20edge=20?= =?UTF-8?q?cases?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Card fbf8912e-eb3a-4bf9-9f75-53b07f59f110 (claude-tab-1 / 55c30b28). Revises §3 from strawman framing to decisive spec: - Default quorum for naturalScope='grid' commands is 'single' (lowest-cost, matches today's GridInterceptor behavior); 'multi'/'any' are explicit opt-ins per call site. - §3.2 'single' quorum: explicit failure modes (no-matching-peer, peer-unreachable, no-accepting-peer), retry-budget defaults (3 retries with exp backoff capped 5s), no-auto-retry for mutating commands — command class declares idempotent: true to opt into retry. - §3.3 'multi' quorum: concrete defaults table (min: 2, max: 8, slow_peer_timeout_ms reducer-specific, result_freshness_ms 30s), 6 reducer types (fedavg/majority-vote/weighted-average/best-of-N/union/custom) with specific defaults each, if_under_min triple option (fail / proceed-degraded / wait-up-to-Ns) with rationale per command-class, contract attribution rule (per-peer chain, only successful peers paid). - §3.4 'any' quorum: fan_out_to ('all-matching' default), reducer choices (first-good-enough / merge-top-k / union), adaptive max_wait_ms (p95 of recent latency * 1.5, capped 5s), early_return_on_first opt-in, privacy filter at SOURCE not reducer. - §3.5 cross-cutting: ordering (reducer's responsibility), idempotency contract (multi/any dispatchers must dedupe), backpressure via presence:resource-pressure (router-side, not scope), observability (grid:quorum:dispatched + :resolved as broadcast events).
- §3.6 explicit non-quorum concerns: routing target hints, trust circle, backpressure, reservation TTL — these live elsewhere on scope. Strawman framing was vague on defaults; spec needs decisive values so per-call scope.quorum overrides are meaningful. All defaults rationalized in-table. No code change. Reviewers: claude-tab-2 (for consistency with §2 command classification + §8 migration), codex (for substrate-side dispatch logic), joel (for default rationale). Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/architecture/MULTI-PEER-COMMANDS.md | 153 ++++++++++++++++++++--- 1 file changed, 134 insertions(+), 19 deletions(-) diff --git a/docs/architecture/MULTI-PEER-COMMANDS.md b/docs/architecture/MULTI-PEER-COMMANDS.md index 9cc971776..b66704332 100644 --- a/docs/architecture/MULTI-PEER-COMMANDS.md +++ b/docs/architecture/MULTI-PEER-COMMANDS.md @@ -136,47 +136,162 @@ Each Continuum command namespace below is classified on three axes: ## 3. Quorum: the third axis -`naturalScope` (per #1439 §2.1) answers "where does this command run." But for grid-distributed commands, a second question matters: "how many peers does it take to satisfy?" +> **Status (2026-05-25):** Owned + revised by claude-tab-1 (55c30b28) per kanban card fbf8912e-eb3a-4bf9-9f75-53b07f59f110. This section pins concrete defaults so per-call `scope.quorum` declarations are decisive, not under-specified. -### 3.1 `quorum: 'single'` — one peer satisfies +`naturalScope` (per #1439 §2.1) answers "where does this command run." But for grid-distributed commands, a second question matters: "how many peers does it take to satisfy, and what happens when the answer doesn't match the request?" That's the **quorum axis**. The axis is binding on `naturalScope: 'grid'` commands (local + environment never have multi-peer quorum) and lives on the per-call `scope` override defined in #1439 §2.1. -Most grid commands. 
The router picks ONE peer per the operator's policy (cheapest / fastest / closest / etc.), dispatches, awaits result. Example: `ai/generate` — one inference completion is the answer. +### 3.1 Quorum types -Existing primitive: GridInterceptor already does this (single-peer routing via Rust kernel). +Three values cover the multi-peer behavior space: -### 3.2 `quorum: 'multi'` — N peers contribute, results combine + - **`single`** — one peer satisfies. The router picks ONE peer per operator policy and awaits its result. + - **`multi`** — N peers contribute. The router dispatches the same logical work to multiple peers; the requesting peer's reducer combines partial results into the final answer. + - **`any`** — any reachable peer can satisfy; race them and take the first-good-enough OR merge. -Federated commands. The router dispatches the SAME logical work to multiple peers, each producing a partial result; a reducer at the originating peer combines them into the final answer. +Default for a `naturalScope: 'grid'` command without an explicit `scope.quorum` override is **`single`** (lowest-cost, matches single-peer GridInterceptor behavior today). `multi` and `any` are explicit opt-ins per call site. -Examples: +### 3.2 `quorum: 'single'` — one peer satisfies (default) - - **Federated training** (`genome/train`): each peer trains on its local data; gradients/checkpoints sync periodically; final adapter is the combined model. Quorum: `min: 2, max: , sync_strategy: 'fedavg' | 'async-sgd'`. - - **Distributed inference** (future `inference/generate-ensemble`): N peers run inference in parallel on the same prompt; the requester does majority-vote / weighted-average / best-of-N. Quorum: `min: 3, reducer: 'majority-vote'`. - - **Multi-peer vote** (sentinel arbitration): N peers from a trust circle each evaluate a contract violation claim; the requester takes consensus. Quorum: `min: 3, agree_threshold: 0.67`. 
+The router picks ONE peer per operator policy (cheapest / fastest / closest / trust-preferred), dispatches, awaits result. Example: `ai/generate` — one inference completion is the answer. -The quorum specification belongs in the `scope` per-call override (per #1439 §2.1): +```typescript +await InferenceGenerateCommand.execute({ model: 'qwen3.5-72b', prompt: '...' }, { + scope: { + target: 'grid', + quorum: 'single', // (default; can omit) + policy: 'cheapest-fast-enough', + requires: { capability: 'inference:qwen3.5-72b-q4' }, + }, +}); +``` + +**Failure modes the router must handle:** + + - **No peer matches `requires`** → return `{ error: 'no-matching-peer', requires, considered_peers: N }`. Don't degrade silently. Per #1439 §3.2. + - **Selected peer becomes unreachable mid-dispatch** → return `{ error: 'peer-unreachable', peer_id, suggest_retry: true, suggested_alternates: [...] }`. Router updates its capability index from the failure signal. + - **Selected peer rejects (capacity, policy)** → router re-runs selection excluding the rejecting peer, up to a fixed retry budget (default: 3 retries with exponential backoff capped at 5s). After budget exhausted, return `{ error: 'no-accepting-peer', tried: [...] }`. + +**No-retry semantics:** mutating commands (e.g. `model/publish`, `contract:accepted`) MUST NOT auto-retry — the requester decides whether retry is safe given the operation's idempotency. Read-only commands (e.g. `ai/generate` with `temperature: 0`) MAY auto-retry. Heuristic: command class declares `idempotent: true` in registry; router consults before retry. + +Existing primitive: GridInterceptor already does single-peer routing via the Rust kernel. The above adds retry + explicit failure shapes. + +### 3.3 `quorum: 'multi'` — N peers contribute, results combine + +Federated commands. The router dispatches the SAME logical work to multiple peers, each producing a partial result; a reducer at the originating peer combines them. 
 ```typescript
 await GenomeTrainCommand.execute({ ... }, {
   scope: {
     target: 'grid',
-    quorum: { min: 2, max: 8, sync_strategy: 'fedavg' },
+    quorum: {
+      kind: 'multi',
+      min: 2,
+      max: 8,
+      reducer: 'fedavg',
+      if_under_min: 'wait-up-to-30s',   // 'fail' | 'proceed-degraded' | 'wait-up-to-Ns'
+      slow_peer_timeout_ms: 60_000,     // any peer slower than this is dropped from this round
+    },
     requires: { gpu_vram_gb: 32, capability: 'training:lora:typescript-expertise' },
   },
 });
 ```
-### 3.3 `quorum: 'any'` — any reachable peer, doesn't matter which
+**Concrete defaults for `multi` quorum:**
+
+| Field | Default | Rationale |
+|---|---|---|
+| `min` | 2 | Multi-peer is meaningless with 1. |
+| `max` | 8 | Coordination overhead grows with peers; 8 is enough for FedAvg without quadratic gossip. |
+| `if_under_min` | `'fail'` for training/contracts, `'proceed-degraded'` for inference-ensemble | Training under-quorum produces a bad adapter; inference ensemble under-quorum just produces a lower-quality answer. |
+| `slow_peer_timeout_ms` | depends on `reducer` (see below) | Fast reducers (vote) tolerate less slack than slow reducers (fedavg). |
+| `result_freshness_ms` | 30_000 | After dispatch + this window, originator gives up gathering more partials. |
+| `peer_replacement` | `true` if `if_under_min === 'fail'`, else `false` | If we hard-need N peers, replace dropouts; if we accept degraded, don't churn the router. |
+
+**Reducer types** (the function that combines partial results):
+
+ - **`fedavg`** — federated averaging for model weights / gradients. Each contributing peer returns a delta; reducer averages weighted by sample count. Sync points: every `sync_every_steps` (default 100) or on convergence. Default `slow_peer_timeout_ms: 60_000` (training is slow; tolerate slack).
+ - **`majority-vote`** — discrete categorical decisions (e.g. "should we accept this contract?"). Each peer returns a vote; reducer takes mode + reports confidence (= mode-fraction).
Default `slow_peer_timeout_ms: 5_000` (decisions should be fast). + - **`weighted-average`** — continuous scalar results (e.g. ensemble logits). Each peer returns a value + a confidence weight; reducer = sum(value*weight) / sum(weight). Default `slow_peer_timeout_ms: 10_000`. + - **`best-of-N`** — quality-scored variants (e.g. multiple inference completions). Each peer returns a result + self-score (perplexity, alignment score, etc.); reducer picks the best. Default `slow_peer_timeout_ms: 20_000`. + - **`union`** — set-shaped results (e.g. distributed search). Reducer = set union with provenance tags. Default `slow_peer_timeout_ms: 5_000`. + - **`custom`** — reducer name resolved via a registry; consumer provides the function. Validated against a typed reducer interface (`reduce(partials: Partial[]) -> Final`). + +**Examples:** + + - **Federated training** (`genome/train`): `reducer: 'fedavg', min: 2, max: 8`. Each peer trains on local data; periodic gradient sync; final adapter is the combined model. + - **Distributed inference ensemble** (future `inference/generate-ensemble`): `reducer: 'majority-vote' | 'best-of-N', min: 3`. N peers run inference in parallel on the same prompt; reducer combines. + - **Multi-peer sentinel arbitration**: `reducer: 'majority-vote', min: 3, agree_threshold: 0.67`. Trust-circle peers evaluate a contract dispute; consensus or escalate. + - **Parallel forge stages** (per §4.5): `reducer: 'union', min: `. Each peer handles one stage; reducer just joins the artifacts. + +**Failure shapes:** + + - `if_under_min: 'fail'` and we got fewer than `min` peers within `result_freshness_ms` → return `{ error: 'under-quorum', got: K, needed: min, timed_out: [...] }`. Originator decides whether to retry, lower the min, or give up. + - `if_under_min: 'proceed-degraded'` and we got K < min → return `{ ok: true, result: , degraded: true, got: K, needed: min, missing: [...] }`. The result is annotated `degraded` so downstream consumers can react. 
+ - `if_under_min: 'wait-up-to-Ns'` → keep collecting up to the wait deadline, then apply `fail` or `proceed-degraded` based on whether `min` reached. Use case: training where you can spare a minute to wait for one more peer. + +**Contract attribution for `multi` quorum:** each contributing peer has its own `contract:proposed → bid → executed → delivered → paid` chain (per #1439 §4.4). Failed/timed-out contributors don't get paid (their `contract:delivered` never fires); successful contributors do. The final reduced result references all successful contributors in its alloy attestation (per #1439 §4.2 + Joel's vision: alloy as universal contract substrate). + +### 3.4 `quorum: 'any'` — any reachable peer, fan-out + first-good-enough + +Read-mostly commands where any peer can satisfy and the requester takes the first-good-enough answer (often racing several peers and taking whichever responds first, or merging top-K). + +```typescript +await DataVectorSearchCommand.execute({ namespace: 'engrams', query: vec, k: 10 }, { + scope: { + target: 'grid', + quorum: { + kind: 'any', + fan_out_to: 'all-matching', // 'all-matching' | 'first-N' (N=3) | 'first-fastest-N' + reducer: 'merge-top-k', // 'first-good-enough' | 'merge-top-k' | 'union' + max_wait_ms: 2_000, + early_return_on_first: false, + }, + }, +}); +``` + +**Concrete defaults for `any` quorum:** + +| Field | Default | Rationale | +|---|---|---| +| `fan_out_to` | `'all-matching'` | Default to broadest reach; operator can narrow. | +| `reducer` | `'first-good-enough'` for single-answer cases; `'merge-top-k'` for retrieval; `'union'` for sets | The shape of the result determines the reducer. | +| `max_wait_ms` | `p95(recent_latencies_for_capability) * 1.5`, capped at 5000 | Adaptive: faster peers raise the bar; cap prevents pathological waits. Initial bootstrap default = 2000ms before history exists. 
| +| `early_return_on_first` | `false` (default) | Most `any` commands benefit from at least one merge; `true` only for truly-equivalent peers (e.g. fetch a content-addressed blob — first one wins). | + +**Reducer types** (subset of §3.3's, focused on merge-rather-than-combine): + + - **`first-good-enough`** — first response satisfying a quality predicate (or first response, period). Use when peers are equivalent: blob fetch, capability advertisement. + - **`merge-top-k`** — each peer returns top-K shard; merge + re-rank globally, return top-K. Use when peers index disjoint partitions: cross-peer vector search, distributed full-text. + - **`union`** — each peer returns a set; reducer = set union with origin tags. Use when peers may have overlapping content: adapter-search union of published manifests. + +**Examples:** + + - **`data/vector-search`** against the grid: query goes to every peer with an embedding index for the namespace; merge top-K from each peer's shard. + - **`adapter/search`**: union of every peer's published adapter manifests; return aggregated matches, deduplicated by manifest hash. + - **`media/upload` fetch path**: when reading a blob hash that lives on multiple peers, race the fetch against all known holders; first response wins (`early_return_on_first: true`). + - **Cross-peer presence query**: "who in the household is reachable right now?" — fan out a ping, collect responses up to `max_wait_ms`, return the set. + +**Privacy filter on `any` fan-out:** each receiving peer applies its OWN policy on what to return (per #1439 §3.3 / §7's trust-circle config). Household-tier peers might share full content; trusted-orgs might share signal-only (embedding without source text); public-mesh might refuse entirely. The reducer at the originator merges what came back without re-asking — the privacy decision lives at the source peer. 
Worked example: a household peer's engrams of a private journal entry contribute the embedding signal but not the text body on a cross-peer RAG `any`-fan-out from a trusted-orgs requester.
+
+### 3.5 Cross-cutting concerns
+
+**Ordering guarantees across quorum types.** For `single`, ordering is irrelevant. For `multi`, the reducer is responsible for any ordering it cares about (FedAvg doesn't care; majority-vote doesn't care; best-of-N might tiebreak by Lamport timestamp for determinism). For `any`, results may arrive out of dispatch order; reducer specifies whether ordering is preserved (`merge-top-k` re-sorts; `union` doesn't).
+
+**Idempotency contract.** Per §3.2's no-auto-retry rule, mutating commands are never auto-retried unless their command class declares `idempotent: true` in the registry. For `multi`/`any` quorums, the contract is stronger: a command issued to N peers must produce the same observable result if any subset of those peers re-executes it. Reducer authors should assume duplicate partials are possible and dedupe (e.g. by `(peer_id, request_id)` tuple).
+
+**Backpressure feedback.** Per #1439 §3 / §4 the `presence:resource-pressure` event is broadcast by peers under load. The router consumes it to bias selection away from pressured peers automatically. The per-call `scope` does NOT need to encode this — it's a router-side concern. Per-call `scope.policy` (e.g. `'cheapest-fast-enough'`) gives operator hints about tradeoffs; the router applies them with pressure data factored in.
-Read-mostly commands where any peer can satisfy and the requester takes the first-good-enough answer (often racing several peers and taking whichever responds first).
+**Observability.** Every quorum dispatch emits a `grid:quorum:dispatched` event with `(command_class, quorum_spec, peer_count, dispatch_time)`, and `grid:quorum:resolved` on completion with `(result_shape, contributing_peers, latency_p99, degraded: bool)`.
Both are class-`broadcast: true` so dashboards + sentinel can observe without instrumenting per-command. Idle observers can subscribe across the whole mesh. -Examples: +### 3.6 What's NOT a quorum question - - **`data/vector-search`** against the grid: query goes to every peer with an embedding index for the namespace; merge results client-side. - - **`adapter/search`**: search the union of every peer's published adapter manifests; return aggregated matches. - - **`media/upload` fetch path** (when reading a blob hash that lives on multiple peers): race the fetch against all known holders; take the first response. + - **Routing target hints** (`scope.peer_id`, `scope.capability`) — these constrain WHICH peers are eligible; quorum constrains HOW MANY satisfy. + - **Authentication / trust circle** (`scope.min_trust_circle`) — per-circle filtering happens before quorum selection. + - **Backpressure** (handled router-side; see §3.5). + - **Reservation TTL** (handled at the §5 handle layer; see §9 open question 1). -The reducer for `any`-quorum commands is usually "first-N-results-merged" or "first-good-enough." +These belong on `scope` but not under `scope.quorum`. --- From b0591580ca5c29704785fe599d9b33285663403f Mon Sep 17 00:00:00 2001 From: Test Date: Mon, 25 May 2026 18:40:08 -0500 Subject: [PATCH 4/6] =?UTF-8?q?docs(grid):=20MULTI-PEER-COMMANDS=20=C2=A74?= =?UTF-8?q?.1=20genome=20paging=20=E2=80=94=20concrete=20event=20schemas?= =?UTF-8?q?=20+=20FETCH/DELEGATE=20decision=20tree?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Card cdc37197-dc18-4030-81ce-5655004abc2e (claude-tab-1). Refines §4.1 from strawman framing to implementation-spec: - §4.1.1 explicit inventory of single-machine primitives (AdapterStore, LayerLoader, GenomeRegistry, PagedResourcePool, GenomeDaemon) — confirms grid extension preserves all of them unchanged. 
- §4.1.2 typed event schemas for the two new event classes: AdapterAvailableEvent (per-peer inventory broadcast on join+heartbeat, deduped by monotonic sequence) and AdapterPressureEvent (hysteresis threshold-crossings only, lists eviction candidates so other peers can pre-fetch). Plus GridAdapterIndex API surface. - §4.1.3 FETCH vs DELEGATE decision tree as operator policy: depends on local-GPU-can-run-inference + estimated-use-count + vram-budget. Per-circle defaults (household FETCH-leaning, trusted-orgs DELEGATE- leaning). - §4.1.4 ASCII flow diagram for cross-peer paging-activate (both FETCH and DELEGATE paths). - §4.1.5 hot-path inference through a remote adapter (DELEGATE): A dispatches ai/generate via grid router with scope.peer_id=B; B's standard local inference path with adapter pinned; token stream back via airc bus on inference handle's scoped channel. Calling code unchanged. - §4.1.6 multi-peer paging pressure model: peers react to broadcast pressure events (pre-fetch / voluntary release / dispatch elsewhere) — self-regulating mesh, no central scheduler. - §4.1.7 version-pinning sharp edge: content-stable manifest_id makes DELEGATE safe across same versions; cross-version requires explicit adapter_version_policy. Plus federated-training implication — eager-fan-out within contributing peer set, lazy DELEGATE for others. - §4.1.8 explicit non-goals: sharded loading (model-parallel out of scope), runtime adapter merging, weights_sha256 verification gap (TODO follow-up card). Composes existing primitives, adds 2 event classes + 1 new TS file (GridAdapterIndex). No daemon changes. No protocol changes. Per the no-shim rule: extends primitives via metadata broadcast + per-policy decision, not via a wrapping adapter layer. Reviewers: codex (substrate side — confirm the event classes can ride existing airc-lib subscribe primitives), joel (FETCH/DELEGATE policy defaults match grid-economy intent). 
Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/architecture/MULTI-PEER-COMMANDS.md | 178 +++++++++++++++++++++-- 1 file changed, 166 insertions(+), 12 deletions(-) diff --git a/docs/architecture/MULTI-PEER-COMMANDS.md b/docs/architecture/MULTI-PEER-COMMANDS.md index 58da4cae7..5399a2b33 100644 --- a/docs/architecture/MULTI-PEER-COMMANDS.md +++ b/docs/architecture/MULTI-PEER-COMMANDS.md @@ -314,23 +314,177 @@ These belong on `scope` but not under `scope.quorum`. ### 4.1 Genome paging across peers — the canonical example -**Today (single-machine):** `GenomeDaemon` + `PagedResourcePool` + `AdapterStore` work together: persona requests an adapter via `genome/paging-activate`, pool checks if loaded, loads via `LayerLoader` if not, pins, returns handle. Pressure-driven eviction. +> **Status (2026-05-25):** Owned + revised by claude-tab-1 per kanban card cdc37197-dc18-4030-81ce-5655004abc2e. Refines strawman with concrete event schemas, the FETCH-vs-DELEGATE decision tree, hot-path inference flow through a remote adapter, and the multi-peer paging pressure model. + +This is the canonical example because it composes existing primitives (`AdapterStore`, `PagedResourcePool`, `LayerLoader`, `GenomeRegistry`, `GenomeDaemon`) with two new event classes — no new daemons, no new wrapper traits. The grid extension emerges from broadcasting manifests + applying the §6 hosting policy. + +#### 4.1.1 What exists today (single-machine) + + - **`AdapterStore`** (`src/system/genome/server/AdapterStore.ts`) scans `SystemPaths.genome.adapters`. Each adapter dir has `manifest.json` + `adapter_config.json` + `adapter_model.safetensors`. Indexed by `manifest.id` (content-stable across machines if the manifest content is identical) and by `(personaId, domain)` for latest-version lookup. + - **`LayerLoader`** + **`LayerCache`** async-load adapter weights with in-flight dedup and LRU+TTL caching. 
+ - **`GenomeRegistry`** ref-counts active loads — `.pin()` keeps adapter resident, `.unpin()` allows eviction.
+ - **`PagedResourcePool`** under that, generic — adapter is just one of several pressure-managed resources.
+ - **`GenomeDaemon`** orchestrates the lifecycle, consumes pressure events, exposes `genome/paging-activate` + `genome/paging-deactivate` commands.
+
+Flow today: `genome/paging-activate({ manifest_id })` → AdapterStore lookup → if loaded in pool, return handle; else LayerLoader fetches weights, pool pins, returns ResourceHandle. Pressure-driven eviction happens orthogonally via the PagedResourcePool's policy. **All of this stays exactly as-is for grid extension.**
+
+#### 4.1.2 What grid extension adds
+
+ - **New event class** `presence:adapter-available` (`broadcast: true`, `channel: 'global'`): each peer broadcasts its current adapter inventory on join + on adapter add/remove + on a 5-minute heartbeat (idempotent — same content gets deduped at the projection layer).
+
+   ```typescript
+   interface AdapterAvailableEvent {
+     peer_id: PeerId;
+     sequence: number;                   // monotonic per-peer; older events deduped
+     adapters: AdapterAvailability[];
+     ts_ms: number;
+   }
+   interface AdapterAvailability {
+     manifest_id: string;                // sha256-stable; same content → same id across peers
+     manifest: AdapterManifestSummary;   // {persona_id, persona_name, domain, base_model, version, size_bytes}
+     load_state: 'on-disk' | 'cached' | 'pinned';
+     currently_pinned_count: number;     // 0 = evictable; >0 = held
+     last_used_ms: number;
+     can_delegate_inference: boolean;    // true if peer has GPU + accepts inbound inference
+     can_offer_weights: boolean;         // true if peer permits other peers to pull the safetensors
+   }
+   ```
+
+ - **New event class** `presence:adapter-pressure` (`broadcast: true`, `channel: 'global'`): broadcast at threshold-crossings (when VRAM crosses 70%, 85%, 95% per #1439-style hysteresis), not on every change.
Body lists eviction candidates so other peers can pre-fetch before this peer evicts:
+
+   ```typescript
+   interface AdapterPressureEvent {
+     peer_id: PeerId;
+     pressure_level: 'normal' | 'elevated' | 'high' | 'critical';
+     vram_used_gb: number;
+     vram_total_gb: number;
+     eviction_candidates: Array<{
+       manifest_id: string;
+       last_used_ms: number;
+       size_bytes: number;
+       can_offer_weights: boolean;
+     }>;
+     ts_ms: number;
+   }
+   ```
+
+ - **`GridAdapterIndex`** (new, `src/system/genome/server/GridAdapterIndex.ts`): subscribes to `presence:adapter-available` + `:adapter-pressure`, maintains a per-peer latest-availability projection. Lookup API:
+
+   ```typescript
+   class GridAdapterIndex {
+     /** Locate an adapter across the grid. Returns local first, then peers sorted by suitability. */
+     locate(manifest_id: string): {
+       local: boolean;
+       peers: Array<{
+         peer_id: PeerId;
+         load_state: 'on-disk' | 'cached' | 'pinned';
+         can_delegate_inference: boolean;
+         can_offer_weights: boolean;
+         estimated_latency_ms?: number;  // from #1439 §3 capacity hints
+       }>;
+     };
+
+     /** All adapters reachable on the grid for a (persona, domain). Includes local. */
+     list_for(persona_id: PersonaId, domain: string): GridAdapterCandidate[];
+   }
+   ```
+
+The `GridAdapterIndex` is the only new component. `AdapterStore` keeps its local index unchanged. The grid index is fed by airc subscriptions, lives entirely in memory, no persistence (the projection rebuilds from airc cursor on restart).
+
+#### 4.1.3 The FETCH-vs-DELEGATE decision (per-operator policy)
+
+When `genome/paging-activate({ manifest_id })` finds the adapter is NOT local but IS on peers, two strategies satisfy the request:
+
+ - **FETCH** — pull the safetensors from a peer, store locally, then paging-activate locally. Subsequent uses are local-only. Good when: this peer has spare GPU + the adapter will be used many times.
+ - **DELEGATE** — keep the adapter remote; route every inference call through the peer that holds it.
This peer holds a `RemoteResourceHandle` (§5). Good when: this peer doesn't have the GPU to run inference even if it had the weights, OR the adapter will be used a few times and weight-transfer cost (100MB-1GB) isn't worth it.
+
+Decision logic (in `GenomeDaemon`, configurable per operator policy in `~/.continuum/grid-policy.json` from #1439 §7):
-**Grid extension (zero new daemons, two new event classes, one extension to AdapterStore):**
+```
+local_gpu_can_run_inference?         # do we have the hardware?
+  no  → DELEGATE (no choice)
+  yes → estimated_use_count > threshold (default: 3)?
+          yes → check vram budget for adding this adapter
+                  fits    → FETCH (amortizes over many uses)
+                  doesn't → DELEGATE (no room here)
+          no  → DELEGATE (not worth transferring weights)
+```
+
+Operator policy can override per-circle: `household` peers might default FETCH (LAN is cheap, mutual trust); `trusted-orgs` might default DELEGATE (cross-internet weight transfer is slow, payment-per-inference makes more sense than payment-per-MB). See §6.
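The decision tree above can be sketched as a pure function. This is a minimal illustration, not the `GenomeDaemon` implementation: every name here (`PagingDecisionInput`, `PagingPolicy`, `decidePagingStrategy`) is hypothetical, and treating "fits the VRAM budget" as a plain size-versus-free-bytes comparison is an assumption the spec does not make. The only value taken from the text is the default use-count threshold of 3.

```typescript
// Hypothetical types for illustration only; none are defined in the spec.
type PagingStrategy = 'FETCH' | 'DELEGATE';

interface PagingDecisionInput {
  localGpuCanRunInference: boolean; // does this peer have usable inference hardware?
  estimatedUseCount: number;        // caller's estimate of how often the adapter will be used
  adapterSizeBytes: number;         // size of the adapter weights
  vramFreeBytes: number;            // assumed stand-in for "vram budget"
}

interface PagingPolicy {
  useCountThreshold: number;        // spec default: 3
}

interface PagingDecision {
  strategy: PagingStrategy;
  reason: string;
}

const DEFAULT_PAGING_POLICY: PagingPolicy = { useCountThreshold: 3 };

function decidePagingStrategy(
  input: PagingDecisionInput,
  policy: PagingPolicy = DEFAULT_PAGING_POLICY,
): PagingDecision {
  // Branch 1: without local hardware there is no choice.
  if (!input.localGpuCanRunInference) {
    return { strategy: 'DELEGATE', reason: 'no local GPU for inference' };
  }
  // Branch 2: too few expected uses to amortize a weight transfer.
  if (input.estimatedUseCount <= policy.useCountThreshold) {
    return { strategy: 'DELEGATE', reason: 'not worth transferring weights' };
  }
  // Branch 3: worth fetching only if the adapter fits locally.
  if (input.adapterSizeBytes > input.vramFreeBytes) {
    return { strategy: 'DELEGATE', reason: 'no room in local VRAM budget' };
  }
  return { strategy: 'FETCH', reason: 'amortizes over many uses' };
}
```

Keeping the decision free of I/O means the per-circle overrides described above could be layered on by passing a different policy object per trust circle. How such an override should interact with the hardware check is left open here, as it is in the text.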
+
+#### 4.1.4 Cross-peer paging-activate flow
+
+```
+   Peer A                                   Peer B (has the adapter)
+   ──────                                   ────────────────────────
+genome/paging-activate({ manifest_id }) ──► A: AdapterStore.locate(id)
+                                               │ not local
+                                               ▼
+                                            A: GridAdapterIndex.locate(id)
+                                               │ found on B
+                                               ▼
+                                            A: decide FETCH vs DELEGATE (§4.1.3)
+                                               │
+                             ┌─────────────────┴──────────────────┐
+                             ▼                                    ▼
+                           FETCH                               DELEGATE
+                             │                                    │
+                A → B: media/fetch-blob(id)       A → B: grid/pin-request(id, ttl)
+                             │                                    │
+                B → A: stream safetensors         B: PagedResourcePool.pin(id) → handle_B
+                             │                                    │
+                A: write to AdapterStore          B → A: { remote_handle_id, ttl }
+                             │                                    │
+                A: paging-activate locally        A: create RemoteResourceHandle wrapping (B, handle_B)
+                             │                                    │
+                return ResourceHandle (local)     return RemoteResourceHandle
+                             │                                    │
+                             ▼                                    ▼
+                          [done]                  subsequent inference dispatches via grid
+                                                  (see §4.1.5)
+```
+
+#### 4.1.5 Hot path: inference through a remote adapter (DELEGATE)
+
+Once Peer A holds a `RemoteResourceHandle` pointing at Peer B's pinned adapter:
+
+```
+ai/generate({ prompt, adapter_handle: handle_remote_B }) on Peer A
+  → handle_remote_B.fetch_strategy === 'delegate'
+  → dispatch ai/generate via grid router, scope.peer_id = B
+  → B receives ai/generate({ prompt, adapter_handle: handle_B_local }) — locally rewritten
+  → B: standard local inference path with adapter pinned at handle_B
+  → B streams tokens back via the airc event bus, channel scoped to the inference handle id
+  → A receives token stream events, returns to caller
+```
+
+The TS-side caller doesn't know or care the adapter lives on B. The inference handle (a fresh one for this call) is a normal `Handle` on A; the streaming events on the bus are typed `inference:tokens` per #1439 §2.2 (broadcast: true, channel: scoped to handle id).
+
+**Why DELEGATE works for slow models on weak hardware:** if a household has a MacBook Air (8GB unified memory, no discrete GPU) and a desktop with an RTX 5090, the MacBook's persona can use the desktop's loaded LoRAs without copying weights.
The personas effectively share GPU + adapter pool transparently. This is the "personas are citizens of the grid" practical implementation. + +#### 4.1.6 Multi-peer paging pressure + +The pressure model extends naturally — peers under VRAM pressure broadcast `presence:adapter-pressure`. Other peers consuming the event can: + + - **Pre-fetch** an evictable adapter they want, before B evicts (so they have it locally if B drops it). + - **Voluntarily release** their own pins on B's adapters (`grid/unpin`) if they were holding them speculatively, freeing B's capacity. + - **Hint dispatch elsewhere** — A's local policy stops biasing toward B for new requests until B's pressure level drops. + +This produces a self-regulating mesh: pressure broadcasts let peers cooperate without a central scheduler. No new mechanism — just AdapterPressureEvent fan-out + per-peer policy reaction. + +#### 4.1.7 The version-pinning sharp edge + +Adapter manifests are content-addressed by `manifest_id` (sha256 over the manifest). If two peers have the SAME adapter content, they get the same `manifest_id` — DELEGATE works transparently. If two peers have DIFFERENT versions of "the same" adapter (different training data, different seed), they have different `manifest_id`s — DELEGATE doesn't accidentally cross versions; each `manifest_id` is a separate locate lookup. - - **New event class** `presence:adapter-available` (broadcast: true, channel: 'global'): each peer broadcasts its full adapter manifest list on join + on adapter add/remove. Body: `{ peer_id, adapters: [{ manifest_id, manifest_json, last_used_ms, currently_pinned_count }] }`. - - **New event class** `presence:adapter-pressure` (broadcast: true, channel: 'global'): peers broadcast adapter-eviction-candidates under memory pressure. Body: `{ peer_id, evictable: [{ manifest_id, last_used_ms, can_offer_to_other_peers: bool }] }`. 
- - **AdapterStore extension:** alongside the local manifest index, maintain a `GridAdapterIndex` (folder of `presence:adapter-available` events). Lookup: "find this adapter" returns `{ local: bool, peers: [peer_id] }`. +**Sharp edge:** a persona that does `genome/paging-activate({ persona_id, domain })` (without a specific manifest_id) needs the GridAdapterIndex to pick which version. Policy choice: prefer-local-version if any, else prefer-newest-on-grid (by manifest `version` field, falling back to `last_used_ms`). Make this an explicit `scope.adapter_version_policy: 'local-first' | 'newest' | 'pinned-to-version='` so call sites can be deterministic. -**Cross-peer paging-activate flow:** +**Implication for federated training (§4.3):** when N peers contribute to a training run, the resulting adapter has a single new `manifest_id`. Each contributing peer's `presence:adapter-available` broadcast lists it as `load_state: 'on-disk'` once writing finishes. The originator's policy decides: distribute the safetensors back to all contributors (eager fan-out via `media/upload` blob distribution) OR let each contributor pull on first need (lazy DELEGATE). Default: eager fan-out within the contributing peer set, lazy for everyone else. - 1. `genome/paging-activate({ manifest_id })` called locally. - 2. AdapterStore check: is it on this peer? If yes → existing path. - 3. If no → query `GridAdapterIndex` → list of peers holding it. - 4. Per operator policy: either FETCH (pull the safetensors from a peer, store locally, then paging-activate locally) OR DELEGATE (the peer that has it loaded executes inference there; this peer holds a `RemoteResourceHandle`). - 5. The DELEGATE path is the LoRA-paging-across-grid story: cheap household-LAN means "load on the GPU peer, route inferences through it" is faster than copying 100MB-1GB of weights. 
+#### 4.1.8 What this section DOES NOT define -**Why this is the canonical example:** every existing primitive composes; the multi-peer behavior emerges from broadcasting manifests + the routing policy choice. No new wrapper layer. The `RemoteResourceHandle` is just `Handle` with a `peer_id` field. + - **Sharded adapter loading** (one peer holds adapter layers 0-15, another holds 16-31). That's model-parallelism, out of scope per #1439 §10. + - **Adapter merging at request time** (load adapter X + Y simultaneously and combine LoRA deltas). Single-adapter activation only; PEFT-side composition is a separate concern. + - **Trust verification of adapter weights.** A peer claiming `manifest_id: X` could lie about content. The `manifest_id` is content-addressed (sha256 over the manifest JSON), but the weights themselves need separate hash-verification on the receiver. **TODO:** add `weights_sha256` to the manifest schema + verify on FETCH receive. Cards as follow-up — not blocking. ### 4.2 Federated inference (single-peer dispatch, but interesting cases) From e5c699bc643bf4b08b4799417d1003bd79e138bb Mon Sep 17 00:00:00 2001 From: Test Date: Mon, 25 May 2026 18:44:27 -0500 Subject: [PATCH 5/6] =?UTF-8?q?docs(grid):=20MULTI-PEER-COMMANDS=20=C2=A74?= =?UTF-8?q?.4=20multi-peer=20RAG=20=E2=80=94=20privacy-at-source=20+=20re-?= =?UTF-8?q?rank=20math=20+=20namespace=20contract?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Card fc1e3262-7ad4-4f92-9f4b-f322e004f387 (claude-tab-1). Refines §4.4 from strawman framing to implementation-spec: - §4.4.1 explicit inventory of single-machine primitives (data/vector- search, engram store, cognition/recall-engrams). - §4.4.2 new event classes (GridRagRequest + GridRagResponse) with full typed schemas. Single new flag (scope.fan_out) is the API delta from caller's perspective. - §4.4.3 ASCII flow diagram for cross-peer fan-out. 
- §4.4.4 re-ranking math: dedup-by-content-hash, cosine score commensurable across peers IFF same embedding model (enforced via namespace contract), min-alloy filter for index recall quality control, score-zero handling. - §4.4.5 privacy filter HARD RULE: applied at source peer per its own policy, never re-asked by reducer. Three sharing levels (full / signal-only / denied) per-circle in grid-policy.json. Worked example (Joel's household + Toby's grid). - §4.4.6 namespace distinction: engrams:* (per-persona, privacy- filtered) vs published:* (opt-in shared, no filter). Cross-peer fan-out covers both; semantics differ. - §4.4.7 hot-path perf: embedding-gen latency depends on local model avail, wait deadline tuning (LAN vs cross-internet), result volume is trivial (~80 items at K=10, N=8), filter cost negligible. - §4.4.8 non-goals: cross-model embedding alignment (future research), persistent cross-peer subscription (different shape), cross-peer engram WRITE (separate spec with contract chain), federated learning over engrams (hybrid of §4.3 + §4.4). The privacy-at-source rule is the key invariant: each receiving peer decides what to return based on its OWN policy. Reducer never re-asks for withheld content. Per-engram metadata flags (e.g. private: true on a journal entry) override per-circle defaults. Reviewers: codex (event-class registration), joel (privacy defaults + namespace contract), dba950ce (engram-tier interactions if their sections touch this). 
Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/architecture/MULTI-PEER-COMMANDS.md | 163 ++++++++++++++++++++++- 1 file changed, 159 insertions(+), 4 deletions(-) diff --git a/docs/architecture/MULTI-PEER-COMMANDS.md b/docs/architecture/MULTI-PEER-COMMANDS.md index 5399a2b33..63f10497c 100644 --- a/docs/architecture/MULTI-PEER-COMMANDS.md +++ b/docs/architecture/MULTI-PEER-COMMANDS.md @@ -545,13 +545,168 @@ await GenomeTrainCommand.execute({ ### 4.4 Multi-peer RAG / vector search -**Today:** `data/vector-search` queries the local embedding index for a namespace. Returns top-K matches by cosine similarity. +> **Status (2026-05-25):** Owned + revised by claude-tab-1 per kanban card fc1e3262-7ad4-4f92-9f4b-f322e004f387. Refines strawman with the privacy-filter-at-source contract, re-ranking math, dedup semantics, and the engram-vs-published-knowledge distinction. -**Grid extension:** `data/vector-search` with `quorum: 'any'` and `scope.fan_out: true`. The router fan-outs the same query to every peer that has an embedding index for the requested namespace; each peer returns top-K; originator merges + re-ranks + returns merged top-K. +This is the canonical `quorum: 'any'` case. The grid extension is small in code (one new `scope.fan_out` flag, one reducer) but big in implication: it lets a persona on Peer A ask "what does the collective grid know about X" without centralizing engrams or violating the privacy filter at any contributing peer. -**Why this matters:** persona engram stores are per-peer (each persona builds its own context). Cross-peer RAG = "what does the household collectively know about X?" without centralizing the engrams. +#### 4.4.1 What exists today (single-machine) -**Privacy implication:** each peer's reply is filtered through that peer's `policies.share_engrams_with_circles` — household-tier might share full content, public-tier might share only the embedding signal + reference. Per-peer policy enforces. 
+ - **`data/vector-search`** (`src/commands/data/vector-search/`): queries the local embedding index for a namespace + collection. Returns top-K matches by cosine similarity. Single-machine, single namespace. + - **Engram store** (`src/system/cognition/engrams/`): per-persona memory store with embeddings. Each persona's engrams are isolated by `personaId` — already the right granularity for "whose memory am I querying." + - **`cognition/recall-engrams`**: queries one persona's engram store for memory relevant to a stimulus. Uses `data/vector-search` under the hood. + +Flow today: persona P on Peer A calls `cognition/recall-engrams({ persona_id: P, stimulus })` → builds query vector → `data/vector-search({ namespace: 'engrams:P', query: vec, k: 10 })` → cosine top-K from local index → return. + +#### 4.4.2 What grid extension adds + + - **`scope.fan_out`** (new flag on `naturalScope: 'grid'` commands): when set, the router dispatches to multiple peers per §3.4 `'any'` quorum rules. Default false (no fan-out — single-peer). + - **`reducer: 'merge-top-k'`** (already defined in §3.4): receives per-peer top-K shards + the original query; produces global top-K. + - **`grid:rag:request`** event class (`broadcast: false`, dispatched as a Command, not a fire-and-forget Event — but using EventClass machinery for the typed shape). Body: + + ```typescript + interface GridRagRequest { + request_id: HandleId; // for cancellation + result correlation + namespace: string; // 'engrams:' | 'published:' | custom + query_vector: number[]; // the original query embedding + k: number; // top-K per peer + filter_predicate?: string; // optional metadata filter (e.g. 'tag=cooking') + max_wait_ms: number; // hard ceiling for this peer's contribution + requester_trust_circle: TrustCircle; // for the receiver's policy filter (§4.4.5) + } + ``` + + - **`grid:rag:response`** event class (`broadcast: false`, scoped to `request_id`). 
Body: + + ```typescript + interface GridRagResponse { + request_id: HandleId; + peer_id: PeerId; + namespace: string; + results: Array<{ + embedding_id: string; // opaque, peer-local stable id + score: number; // cosine similarity (range [-1, 1]) + content?: string; // may be omitted by privacy filter (§4.4.5) + metadata: Record<string, unknown>; + provenance: { + peer_id: PeerId; + alloy_hash?: string; // index alloy hash if this content was indexed via a known recipe + ts_ms: number; + }; + }>; + truncated: boolean; // true if filter dropped some matches + ts_ms: number; + } + ``` + +#### 4.4.3 Cross-peer fan-out flow + +``` +Peer A: cognition/recall-engrams or data/vector-search Peers B, C, D + with scope.fan_out=true, quorum.kind='any', (have the namespace) + reducer='merge-top-k', max_wait_ms=2000 + │ + ▼ +Router consults capability index for peers with this namespace + │ + ▼ +Router dispatches GridRagRequest to B, C, D in parallel + │ │ + │ ▼ + │ each peer: local data/vector-search + │ each peer: apply privacy filter (§4.4.5) + │ each peer: emit GridRagResponse on + │ channel scoped to request_id + │ │ + ◄────────────────────────────────────────────────────────────────┘ + │ +collect responses up to max_wait_ms or all-peers-responded + │ + ▼ +merge-top-k reducer (§4.4.4): rerank globally, return top-K + │ + ▼ +return to caller as standard data/vector-search result shape +``` + +The caller signature doesn't change: adding `scope.fan_out: true` is the entire API delta. Internal flow is the new piece. + +#### 4.4.4 Re-ranking math (the merge-top-k reducer) + +Each peer returns its top-K by local cosine similarity to the query. The reducer combines them. Naive concat-and-sort works only if scores are commensurable across peers, which they should be, because cosine similarity over the same embedding model is intrinsically normalized (all values in [-1, 1]).
But there are edge cases: + + - **Different embedding models per peer:** if peer B uses `text-embedding-3-large` and peer C uses `nomic-embed-text-v1.5`, their cosine scores aren't directly comparable (different vector spaces). Mitigation: the namespace contract pins an embedding model + dimension (e.g. `engrams:personaP@text-embedding-3-large/1536`); peers that don't have a matching index don't claim the namespace in their capability advertisement; cross-model fan-out doesn't happen. Per-peer scoring is then commensurable by construction. + - **Different index recall quality:** B might have a more recent/comprehensive index than C. The reducer can't detect this from scores alone. Heuristic: include `provenance.alloy_hash` for the index — if the originator wants tighter control, they can declare a min-alloy filter (`scope.fan_out_filter: { index_recipe: '' }`) to constrain to peers using a specific indexing methodology. + - **Duplicate content across peers:** the same engram might be indexed on multiple peers (Joel's iMac and laptop both indexed the same RSS feed). Dedup at the reducer: hash the embedding vector (first 16 bytes of the vector as a rough fingerprint) or hash the content text if shared. Default: dedup by `(content[:200] hash)` if content present; else by `embedding_id` if scopes overlap (rare). + - **Score-zero matches:** some peers may return no matches above threshold. Reducer ignores empty results; no penalty in the merged top-K. + +**Default merge-top-k algorithm:** + + 1. Concatenate all `GridRagResponse.results`. + 2. Dedup by content-hash (or embedding-fingerprint if no content). + 3. Sort by `score` descending. + 4. Take top-K (the caller-requested K). + 5. Annotate each result with `provenance` (which peer contributed it) so downstream consumers can route follow-up queries appropriately. 
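
A minimal sketch of the default merge-top-k steps above, under stated assumptions. `RagResult`, `dedupKey`, and `mergeTopK` are hypothetical names for illustration, not the shipped reducer. The sketch keys on the first 200 characters of content directly where a real reducer would hash them, and it keeps the higher-scoring copy of a duplicate, which is an assumption since the text does not say which copy wins.

```typescript
// Hypothetical, trimmed-down view of one GridRagResponse result (§4.4.2).
interface RagResult {
  embedding_id: string;
  score: number;
  content?: string;
  provenance: { peer_id: string };
}

// Dedup key: content prefix when content is present, else embedding_id.
// A real reducer would hash the prefix; the raw prefix keeps this sketch dependency-free.
function dedupKey(r: RagResult): string {
  return r.content !== undefined
    ? "content:" + r.content.slice(0, 200)
    : "id:" + r.embedding_id;
}

function mergeTopK(shards: RagResult[][], k: number): RagResult[] {
  const best = new Map<string, RagResult>();
  for (const shard of shards) {          // step 1: walk every peer's results
    for (const r of shard) {
      const key = dedupKey(r);           // step 2: dedup
      const seen = best.get(key);
      // Assumption: on a duplicate, keep the higher-scoring copy.
      if (seen === undefined || r.score > seen.score) best.set(key, r);
    }
  }
  // Step 5 needs no extra work here: provenance already rides on each result.
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)   // step 3: score descending
    .slice(0, k);                        // step 4: caller-requested K
}
```

Empty shards (the score-zero case) simply contribute nothing to the map, so they carry no penalty in the merged result.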
+ +#### 4.4.5 Privacy filter at SOURCE (not reducer) + +The hard rule: **each receiving peer decides what to return based on its OWN policy.** The reducer at the originator just merges what came back; it never re-asks the source peer for content it withheld. + +Per-peer policy lives in `~/.continuum/grid-policy.json` under `engram_sharing` (extending the policy block from #1439 §7): + +```json +{ + "engram_sharing": { + "by_circle": { + "household": { "share": "full", "include_content": true }, + "trusted-orgs": { "share": "signal-only", "include_content": false }, + "extended": { "share": "denied" }, + "public": { "share": "denied" } + }, + "by_namespace_override": { + "engrams:helper-ai": { "household": "denied" } // some engrams are off-limits even to household + } + } +} +``` + +Three sharing levels: + + - **`full`** — return the result with content + metadata. + - **`signal-only`** — return the result with embedding_id + score + provenance, but NO content (other peer can use the result to bias their own search OR follow up with a separate trust-elevation request, but can't read the engram body). + - **`denied`** — don't appear in the response at all. Set `truncated: true` so the requester knows results were filtered. + +**Concrete worked example:** Joel's household is querying "what do I know about my friend Toby?" The persona running on Joel's laptop fan-outs to: + + - Joel's iMac: returns `share: 'full'` per household policy. 5 engrams about Toby (chat history, shared docs). + - Joel's RTX desktop: same. 2 engrams (image-tagged photos). + - Toby's grid (trusted-orgs tier): returns `share: 'signal-only'`. 3 engrams matching the query, but content is withheld. The persona sees "there are 3 things Toby's grid knows that match your query — you don't have permission to read them" and can decide whether to ask Toby for elevation. 
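
The three sharing levels can be sketched as a source-side filter. This is an illustration only: `filterAtSource`, `LocalMatch`, and `SharedResult` are hypothetical names, the `byCircle` table just mirrors the `by_circle` policy example, and setting `truncated` only when matches were actually dropped follows the field's comment in §4.4.2 rather than any confirmed implementation.

```typescript
type ShareLevel = "full" | "signal-only" | "denied";
type Circle = "household" | "trusted-orgs" | "extended" | "public";

// Mirrors the by_circle block of the policy example (hypothetical in-memory form).
const byCircle: Record<Circle, ShareLevel> = {
  "household": "full",
  "trusted-orgs": "signal-only",
  "extended": "denied",
  "public": "denied",
};

interface LocalMatch { embedding_id: string; score: number; content: string; }
interface SharedResult { embedding_id: string; score: number; content?: string; }

// Runs on the RECEIVING peer, before anything leaves it.
function filterAtSource(
  matches: LocalMatch[],
  circle: Circle,
): { results: SharedResult[]; truncated: boolean } {
  const level = byCircle[circle];
  if (level === "denied") {
    // Nothing is returned; truncated tells the requester matches were withheld.
    return { results: [], truncated: matches.length > 0 };
  }
  const results = matches.map((m): SharedResult =>
    level === "full"
      ? { embedding_id: m.embedding_id, score: m.score, content: m.content }
      : { embedding_id: m.embedding_id, score: m.score }); // signal-only: no content
  return { results, truncated: false };
}
```

Because the content field is never constructed for `signal-only`, the reducer has nothing to re-ask for, which is the invariant this section relies on.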
+ +The privacy filter is applied PER ENGRAM, not per request — a result might be `full` for one engram and `denied` for another within the same response, based on per-engram metadata flags (e.g. `private: true` on a journal entry). + +#### 4.4.6 The engram-vs-published-knowledge distinction + +Two namespaces this section enables: + + - **`engrams:`** — per-persona memory. Always privacy-filtered. Cross-peer fan-out lets one persona on Peer A query another persona on Peer B's engrams (subject to policy). Useful for: collaborative agents sharing context, household assistants learning from each other's interactions. + - **`published:`** — explicitly shared knowledge a peer wants discoverable. No privacy filter (the act of publishing implies sharing). Useful for: forge-alloy index ("which alloys does the grid know about for this capability?"), peer expertise advertisement ("which peers have the most engrams about astronomy?"). + +The `published:*` namespace requires a peer to opt in per-content (mark an engram `published: true` to expose it via this namespace). Default for new engrams is private. + +**Open design question** (deferred to follow-up): should `published:*` content be content-addressed (sha256) so multiple peers publishing the same artifact dedup naturally? Probably yes — same content + same alloy hash → same `embedding_id` in the merged response. Out of scope for this section; follow-up card. + +#### 4.4.7 Hot path performance considerations + + - **Embedding generation latency:** the query vector must be computed before fan-out. If the local peer can't run the embedding model, this becomes a grid command itself (§4.2 federated inference) — embed locally OR delegate to a peer with the model. Typical embedding latency: 10-50ms on local, 100-300ms on grid. + - **Wait deadline tuning:** default `max_wait_ms: 2000` is sized for household-tier grids (LAN, ~10ms RTT + ~100ms query). For trusted-orgs (cross-internet), 5000ms is safer. 
The adaptive default from §3.4 (`p95(recent_latencies) * 1.5`) converges to the right value within a few queries. + - **Result volume:** each peer returns up to K results; with N peers, the reducer sees N*K results. For K=10, N=8, that's 80 results to dedup + sort + truncate — trivial. Doesn't need streaming or paging at typical scales. + - **Privacy filter cost:** applying per-engram policy at the source is a fast attribute check; not a bottleneck. The trust-circle check uses the request envelope's signed sender peer-id (per #1439 §4.4 trust chain). + +#### 4.4.8 What this section DOES NOT define + + - **Cross-model embedding alignment.** If peers use different embedding models, this section says they don't fan-out together (namespace contract pins the model). A future spec could add cross-model alignment via a linear projection or shared anchor set, but that's its own research project — not in scope. + - **Persistent cross-peer engram subscription** ("notify me when Toby's grid indexes new content matching this query"). Different shape (subscribe vs query). Could ride the same event classes with a `subscribe: true` flag + cursor, but defer to a follow-up card. + - **Cross-peer engram WRITE** (one persona contributing engrams to another's store). Strictly read-side fan-out here. Cross-peer write requires explicit consent + audit + probably a contract chain (per #1439 §4.4). Separate spec. + - **Federated learning OVER cross-peer engrams** (train a new adapter using everyone's engrams). Hybrid of §4.3 and §4.4 — covered there, not here. 
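
Returning to the wait-deadline tuning in §4.4.7: the adaptive default `p95(recent_latencies) * 1.5` could be computed roughly as below. This is a sketch under assumptions, not the router's code. `p95` and `adaptiveMaxWaitMs` are hypothetical names, the nearest-rank percentile definition is one of several reasonable choices, and the 2000 ms fallback is the household-tier default quoted in that section.

```typescript
// Nearest-rank 95th percentile; integer arithmetic avoids float rounding in the rank.
function p95(samples: number[]): number {
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Hypothetical helper: deadline for the next fan-out, in milliseconds.
function adaptiveMaxWaitMs(recentLatenciesMs: number[], fallbackMs: number = 2000): number {
  if (recentLatenciesMs.length === 0) return fallbackMs; // no history yet
  return Math.round(p95(recentLatenciesMs) * 1.5);
}
```

With no history the helper returns the static default, so the first few queries behave exactly as the fixed `max_wait_ms` would.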
### 4.5 Multi-peer forge runs (distributed synthesis) From 55eb7c5499b7521edbc9a229ceeba486ce83be79 Mon Sep 17 00:00:00 2001 From: Test Date: Mon, 25 May 2026 18:50:04 -0500 Subject: [PATCH 6/6] =?UTF-8?q?docs(grid):=20MULTI-PEER-COMMANDS=20=C2=A75?= =?UTF-8?q?=20handle=20distribution=20=E2=80=94=20TS-side=20spec?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Card 54dc3648-ae0a-49e2-8608-ceca9a84a3c1 (claude-tab-1, TS side). Rust-side spec open for codex to pair. Refines §5 from strawman to TS-implementation-spec: - §5.1 RemoteResourceHandle typed interface extending existing Handle. 12 fields total (5 inherited from Handle, 7 grid-specific: peer_id, remote_handle_id, resource_kind, resource_hint, fetch_strategy, reservation_id?, trust_circle). - §5.1.1 8 resource kinds with default fetch_strategy each (lora_adapter delegate, kv_cache always delegate, inference_session always delegate, embedding_index delegate, render_buffer pull-on-use, model_weights pull-immediately, media_blob pull-on-use, custom). - §5.2 4 caller-facing methods (.value() / .unpin() / .status() / .heartbeat()) with explicit semantics + throws conditions. Async proxy caching for delegate strategy, byte caching for pull-on-use. - §5.3 pin lifecycle ASCII sequence covering REQUEST → PIN-RESPONSE → DISPATCH → UNPIN with both A (caller) and B (holder) perspectives. Safety section explains why no orphan pins survive crashes (heartbeat-driven timeout on holder side, 2× TTL). - §5.4 lease + reservation with concrete defaults table (10s reservation, 5min TTL, 1min heartbeat, 10min orphan timeout) and 3 reservation policies (first-come / priority-circle / bid) per holder policy. - §5.5 content-addressed FETCH path for the FETCH-side of the §4.1.3 decision tree. Hash verification on receive, dedup by content hash, multi-source fetch deferred.
- §5.6 cross-cutting: handle id disambiguation (local .id vs remote_handle_id), status events ride airc bus on scoped channel (no polling), JSON serialization clean, TS/Rust boundary explicit. - §5.7 non-goals: Rust-side substrate (codex owns), streaming-handle semantics (follow-up), multi-hop dispatch handle propagation (deferred until use case), cross-grid handle sharing (separate spec). Per the no-shim rule: TS doesn't reimplement pin lifecycle logic; it dispatches through IPC to Rust which owns the truth. RemoteResourceHandle is a typed wrapper class around the existing Handle pattern, not a new abstraction layer. All 4 of my owned sections (§3, §4.1, §4.4, §5 TS) now done. Codex on §5 Rust spec when they pick up the card; dba950ce on §4.2/§4.3 if they take it; codex/anyone on §4.5 if they take it. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/architecture/MULTI-PEER-COMMANDS.md | 271 +++++++++++++++++++---- 1 file changed, 234 insertions(+), 37 deletions(-) diff --git a/docs/architecture/MULTI-PEER-COMMANDS.md b/docs/architecture/MULTI-PEER-COMMANDS.md index 63f10497c..5cd2d8aa2 100644 --- a/docs/architecture/MULTI-PEER-COMMANDS.md +++ b/docs/architecture/MULTI-PEER-COMMANDS.md @@ -729,64 +729,261 @@ The recipe entity grows a `stages[].parallelizable_across_peers: bool` flag. The ## 5. The handle distribution model -How do distributed resources travel between peers without losing the safety properties of the local handle system? +> **Status (2026-05-25):** TS-side spec owned by claude-tab-1 per kanban card 54dc3648-ae0a-49e2-8608-ceca9a84a3c1. Rust-side spec is open; pair with codex when they pick it up. This section defines the TypeScript API surface + lifecycle semantics from the consumer's perspective. The Rust side handles the substrate-internal pin coordination, ref-counting across IPC, and `airc-lib` event wiring. + +How do distributed resources travel between peers without losing the safety properties of the local handle system?
Continuum already has `Handle` (`src/system/core/types/Handle.ts`), which is UUID-addressable, TTL-managed, and SQLite-backed. The grid extension is: a handle MAY refer to a resource pinned on a different peer, and the local Continuum keeps a `RemoteResourceHandle` wrapper that knows how to dispatch through to the holder. ### 5.1 `RemoteResourceHandle` — handle that points at another peer's pin ```typescript interface RemoteResourceHandle extends Handle { - // Existing Handle fields: - id: UUID; - short_id: ShortId; - status: HandleStatus; - created_ms: number; - ttl_ms: number; - // Grid extension: - peer_id: PeerId; // who holds the live resource - remote_handle_id: UUID; // id on the holder peer - resource_kind: string; // 'lora_adapter' | 'kv_cache' | 'inference_session' | ... - resource_hint: ResourceHint; // cached display info (size, capability, etc) - fetch_strategy: 'delegate' | 'pull-on-use' | 'pull-immediately'; + // Existing Handle fields (per src/system/core/types/Handle.ts): + readonly id: UUID; // local handle id (this peer's perspective) + readonly short_id: ShortId; // '#abc123' for human-typeable refs + readonly status: HandleStatus; // pending | processing | complete | failed | expired | cancelled + readonly created_ms: number; + readonly ttl_ms: number; // local handle TTL; refreshed on heartbeat + + // Grid extension fields: + readonly peer_id: PeerId; // which peer holds the live resource + readonly remote_handle_id: UUID; // the id on the holder peer (NOT same as local .id) + readonly resource_kind: ResourceKind; // typed union (see §5.1.1) + readonly resource_hint: ResourceHint; // cached display info; not the resource itself + readonly fetch_strategy: FetchStrategy; + readonly reservation_id?: UUID; // present when held via §5.4 reservation + readonly trust_circle: TrustCircle; // which circle authorized the cross-peer pin +} + +type FetchStrategy = 'delegate' | 'pull-on-use' | 'pull-immediately'; + +type ResourceKind = + | 'lora_adapter' + | 'kv_cache'
+ | 'inference_session' + | 'embedding_index' + | 'render_buffer' + | 'model_weights' + | 'media_blob' + | { kind: 'custom'; namespace: string }; + +interface ResourceHint { + size_bytes?: number; + capability?: string; + display_label?: string; + alloy_hash?: string; +} +``` + +#### 5.1.1 Resource kinds and their semantics + +| Kind | Typical use | Default fetch_strategy | Why | +|---|---|---|---| +| `lora_adapter` | Genome paging across peers (§4.1) | `delegate` | Weights are 100MB-1GB; delegate is faster than transfer for short-lived use. | +| `kv_cache` | Continued-conversation context | `delegate` (always) | KV cache is huge + ephemeral; never makes sense to pull. | +| `inference_session` | Multi-turn stateful inference handle | `delegate` (always) | Sessions are bound to the GPU peer that started them. | +| `embedding_index` | Cross-peer RAG (§4.4) | `delegate` (typical) | Indexes are large + the peer's query path is optimized for them. | +| `render_buffer` | Distributed compute output | `pull-on-use` | Render output is the work product — caller wants the bytes locally. | +| `model_weights` | Full base-model fetch | `pull-immediately` | Once you need a base model, you'll use it many times; amortize transfer. | +| `media_blob` | Content-addressed file/image/video | `pull-on-use` | Lazy fetch; content is immutable so cache-friendly. | +| `custom` | Consumer-extension | `delegate` (conservative default) | Operator picks per kind. | + +### 5.2 Operations on a `RemoteResourceHandle` + +The interface mirrors local `Handle` operations transparently; the grid is invisible at the call site. + +```typescript +interface RemoteResourceHandle extends Handle { + /** + * Resolve the handle to its value. + * - delegate: returns a typed proxy that dispatches method calls via grid → peer_id. + * Proxy invocations include remote_handle_id as context so the peer rebinds locally. + * - pull-on-use: lazy fetch on first access; caches locally for TTL. 
+ * - pull-immediately: bytes already local at handle creation time. + * + * THROWS on: + * - peer-unreachable (peer offline) + * - reservation-expired (lease lapsed) + * - permission-denied (peer's policy revoked access) + */ + value(): Promise<unknown>; + + /** + * Release this peer's hold on the remote resource. + * - Sends `grid/unpin` to peer_id with remote_handle_id. + * - Holder decrements ref-count; if zero, may evict. + * - Local handle moves to status='cancelled'. + * - Idempotent: unpinning twice is a no-op (second call returns immediately). + */ + unpin(): Promise<void>; + + /** + * Get latest known status. With `subscribe: true`, returns a subscription + * that fires on every status change until cancelled or handle expires. + * Subscription rides the airc bus on a channel scoped to (peer_id, remote_handle_id). + */ + status(options?: { subscribe?: boolean }): Promise<HandleStatus> | AsyncIterable<HandleStatus>; + + /** + * Refresh the lease against the holder peer. Called automatically by heartbeat + * loop while handle is in scope; manual call for explicit lifecycle control. + * Resets local TTL on success; returns false if holder refuses (capacity / policy). + */ + heartbeat(): Promise<boolean>; } ``` -**Operations on a RemoteResourceHandle:** +**Implementation notes for TS callers:** - - `.value()` — if `fetch_strategy === 'delegate'`, returns a proxy that dispatches calls via grid; if `pull-on-use`, fetches the bytes lazily; if `pull-immediately`, fetched at handle creation. - - `.unpin()` — sends `grid/unpin` to the holder peer (decrements ref-count there). If holder loses all pins, may evict locally. - - `.status()` — queries (or subscribes to) status events from the holder peer. + - `RemoteResourceHandle` is a class wrapping the typed metadata + a private connection to the local Rust-IPC layer (which talks to airc-lib). + - All four methods are async. None of them blocks on the holder peer's response longer than `scope.timeout_ms` (default 5s; per-call override).
+ - `.value()` for `delegate` strategy returns the same proxy object on repeated calls; proxies are cached by handle id to avoid setup cost per dispatch. + - `.value()` for `pull-on-use` caches the resolved bytes in the local Continuum until the handle expires; subsequent `.value()` calls within TTL return the cached copy without re-fetching. -### 5.2 Pin lifecycle across peers +### 5.3 Pin lifecycle across peers - 1. Peer A requests resource via `genome/paging-activate({ manifest_id })`. - 2. Router determines resource lives on peer B (via `GridAdapterIndex`). - 3. Per A's policy: `delegate` (return RemoteResourceHandle pointing at B) OR `pull` (transfer + local handle). - 4. Delegate path: A sends `grid/pin-request` to B; B pins locally; returns its handle id; A creates a RemoteResourceHandle wrapping it. - 5. A uses the resource by dispatching inference (etc.) through the handle — Commands.execute on grid path with `scope.peer_id: B`, including the remote handle id as context. - 6. A finishes; calls `.unpin()`; B decrements its local ref count; if zero, B may evict. +``` +Peer A (caller) Peer B (holder, has resource) +═══════════════ ════════════════════════════ + +genome/paging-activate({ manifest_id }) + │ + ▼ +GridAdapterIndex.locate(manifest_id) + → returns peers including B + │ + ▼ +A decides FETCH vs DELEGATE per §4.1.3 policy + │ (DELEGATE path shown below; FETCH covered in §5.5) + ▼ +A → B: grid/pin-request({ B receives grid/pin-request + resource_kind, manifest_id, │ + reservation_id?, trust_circle ────►│ +}) ▼ + B: validate (resource exists, policy allows, + reservation valid if provided) + B: PagedResourcePool.pin(resource_id) → handle_B + B: store {remote_pinner: A, local_handle: handle_B} + in cross-peer pin registry + │ +A receives response ◄─────────────────────────── B → A: { + │ remote_handle_id: handle_B.id, + ▼ ttl_ms, +A: construct RemoteResourceHandle wrapping resource_hint + (B, handle_B, manifest_id, ...)
} + │ + ▼ +A: register local handle in SQLite handle store + │ + ▼ +return RemoteResourceHandle to caller + ... + caller uses .value() → dispatch via grid B: receives dispatched command with + ai/generate({adapter_handle: handle_remote}) remote_handle_id in scope + → grid send to B with remote_handle_id B: rebinds to local handle_B → runs locally + → result returned B: streams result back via airc bus + ... + caller eventually calls .unpin() + │ +A → B: grid/unpin({remote_handle_id}) B: receives grid/unpin + │ B: PagedResourcePool.unpin(handle_B) + ▼ B: removes from cross-peer pin registry +A: marks local handle status='cancelled' B: if ref-count zero, eligible for eviction +A: removes from SQLite handle store B → A: ack (or fail if handle unknown — idempotent ack) +``` + +**Why this is safe (no leaks across peers):** -**Why this is safe:** B's pin lifecycle is identical to single-machine paging — B doesn't know or care the pinner is remote; its `PagedResourcePool` ref count handles it. A doesn't know or care about B's local cache strategy — its `RemoteResourceHandle` is just a typed reference. The grid is invisible in the type system. + - **B's pin lifecycle is identical to single-machine paging.** B's `PagedResourcePool` ref count handles eviction protection. B doesn't know or care the pinner is remote — the cross-peer pin registry is just metadata for cleanup. + - **A's `RemoteResourceHandle` is just a typed reference.** A doesn't know or care about B's local cache strategy. The grid is invisible in the type system. + - **Heartbeat loop prevents zombie pins.** A's handle has a TTL (default 5min, refreshed by `.heartbeat()`). If A crashes and stops heartbeating, B's pin registry detects the timeout (default 2× TTL = 10min) and unpins automatically. 
**No orphan pins survive a crash.** + - **Bidirectional disconnect handling.** If A and B become network-partitioned, A's heartbeats fail → local handle marks `status='failed'` and caller gets an exception on next `.value()`. B's pin registry times out independently → unpins on B side. When connectivity recovers, both sides are clean. -### 5.3 Lease + reservation for expensive resources +### 5.4 Lease + reservation for expensive resources For resources where "is it currently available?" matters (GPU slots, model load slots, render queue slots), the pin is preceded by a **reservation:** - 1. A asks B: "do you have free capacity for capability X?" (via `presence:peer-manifest` or a fresh probe). - 2. B says yes with a `reservation_id` valid for K seconds. - 3. A pins against the `reservation_id`; if expired, B refuses, A retries elsewhere. - 4. Pin promotes to long-lived handle once accepted. +``` +A → B: grid/reserve({ + resource_kind: 'inference_session', + capability: 'inference:qwen3.5-72b-q4', + estimated_duration_ms: 60000, + trust_circle: 'household' +}) + │ +B: check capacity; if available, allocate + │ +B → A: { reservation_id: UUID, expires_ms: , terms: {...} } + │ +A: within 10s, follow up with grid/pin-request + including reservation_id + │ +B: validates reservation_id is still valid + matches; promotes to pin +B → A: { remote_handle_id, ttl_ms } (RemoteResourceHandle constructed) +``` + +#### 5.4.1 Reservation defaults + +| Field | Default | Rationale | +|---|---|---| +| `reservation_expires_ms` | 10_000 (10s) | Long enough for caller to commit; short enough to free slot on caller no-op. | +| `pin_ttl_ms` after promotion | 300_000 (5min) | Matches local Handle default; refreshed by heartbeat. | +| `heartbeat_interval_ms` | 60_000 (1min) | 5x safety factor below TTL. | +| `holder_orphan_timeout_ms` | `2 * pin_ttl_ms` (10min) | Holder unpins if no heartbeat for 2 TTLs. 
| + +#### 5.4.2 Reservation policies + +Reservations prevent the "10 peers all pin against B's last GPU slot, 9 get rejected after waiting" thundering-herd failure. Three reservation policies a holder peer can advertise (per `~/.continuum/grid-policy.json`): + + - **`first-come`** (default): grant reservations in arrival order until capacity full. Refuse new requests until a slot frees. + - **`priority-circle`**: rank pending reservations by requester's trust circle (household > trusted-orgs > extended > public); grant highest-priority first. Useful when household needs to preempt cross-internet requests. + - **`bid`**: hold reservation requests for `bid_window_ms` (default 500ms); grant to highest-bidder per `contract:bid` event. Public-mesh tier default. + +### 5.5 Content-addressed pull (FETCH path) + +For static resources (LoRA weights, model files, recipe blobs), the handle resolution can FETCH the content instead of delegating: + +``` +A wants resource with manifest_id +A: GridAdapterIndex / capability lookup → peers offering it +A: pick peer B per policy (cheapest, fastest, closest) +A → B: media/fetch-blob({ content_hash: manifest_id }) +B: validate policy (can_offer_weights, trust_circle) +B → A: stream safetensors (chunked, content-verified by hash on receive) +A: write to local AdapterStore +A: pin locally → standard local Handle (NOT RemoteResourceHandle) +A: subsequent uses are entirely local +``` + +This is the fallback when delegation isn't an option (peer offline, capacity full) AND the use pattern justifies transfer cost (per §4.1.3 estimated-use-count > threshold). + +#### 5.5.1 Why content-addressed fetch is safe + + - **Hash verification on receive.** The `content_hash` is verifiable end-to-end; A re-computes the sha256 of received bytes and rejects mismatch. + - **Deduplication by content hash.** If A already has bytes hashing to `manifest_id`, no transfer happens; A uses local copy. 
+ - **Multi-source fetch** (future optimization): for large blobs, A can fetch chunks in parallel from multiple peers offering the same hash, race-and-take-first-good per chunk. Out of scope here; deferred to `media/upload` substrate spec. + +### 5.6 Cross-cutting handle concerns + +**Handle ID disambiguation.** A `RemoteResourceHandle.id` is the LOCAL id on Peer A; `.remote_handle_id` is the id on Peer B. These are different UUIDs. Code that needs to dispatch to B includes `remote_handle_id` in the command scope; code that needs to address the local wrapper uses `.id`. This is the only sharp edge in the API surface. + +**Status events ride the airc bus.** When B's local handle changes status (e.g. resource evicted under pressure, inference completes, error), B emits a `grid:handle:status` event on a channel scoped to `(B, remote_handle_id)`. A's `.status({ subscribe: true })` subscribes to that channel. No polling. -Reservations prevent the "10 peers all pin against B's last GPU slot, 9 get rejected after waiting" thundering-herd failure. +**Handle serialization.** A `RemoteResourceHandle` serializes to JSON cleanly (all fields are primitive types). It can be passed in command params, persisted, or shared across browser/server boundary via the existing EventBridge — extending Handle's pattern. Receiving Continuum reconstructs the wrapper class around the JSON. -### 5.4 Content-addressed pull +**TypeScript / Rust boundary.** TS-side defines the interface + caller-facing class; Rust-side (via airc-lib + RustCoreIPC) implements: + - The cross-peer pin registry storage and timeout sweeper. + - The `grid/pin-request` / `grid/unpin` / `grid/reserve` / `media/fetch-blob` IPC commands. + - The `grid:handle:status` event emission on B's side. + - The heartbeat loop coordination. -For static resources (LoRA weights, model files, recipe blobs), the handle resolution falls back to content-addressed pull: +The TS side is a thin client over these IPC calls. 
**Per the no-shim rule:** TS doesn't reimplement pin lifecycle logic — it dispatches through the IPC to Rust which owns the truth. Rust-side implementation owned by codex (see §5 Rust-side spec when codex picks up that card). - 1. A wants resource with `manifest_id`. Router sees no live peer holds it pinned. - 2. A queries airc-blobs for the content (manifest_id → sha256 → blob storage). - 3. A pulls bytes; pins locally; uses. +### 5.7 What this section DOES NOT define -This is the fallback when delegation isn't an option (peer offline, capacity full, content static-immutable). + - **Rust-side substrate.** The pin registry, heartbeat sweeper, IPC command handlers, and `grid:handle:status` event emission live in Rust and are owned by codex. This section pins the TS API + the wire-level contract those Rust impls must satisfy. + - **Streaming-handle semantics for chunked / progressive results.** E.g. an inference stream returning tokens incrementally. Mostly handled via the airc event bus (token events on a scoped channel), but the explicit "stream handle" type ergonomics deserve their own section. Follow-up. + - **Handle inheritance across multi-hop dispatch.** If A dispatches to B which dispatches to C, does C's response handle propagate back to A as a `RemoteResourceHandle` pointing at C, or as one pointing at B (with C's handle nested inside B's)? Probably the former (transparent multi-hop) but spec deferred until a concrete use case emerges. + - **Cross-grid (different airc meshes) handle sharing.** Same-mesh assumed throughout. Cross-mesh requires invite-bridging + trust circle delegation — separate spec. ---