feat(continuum-core/persona): L0-2-respond-call — Responder DI, lock-around-await, inference CB threshold by joelteply · Pull Request #1468 · CambrianTech/continuum

joelteply · 2026-05-29T23:49:00Z

Stacks on L0-2-respond-context (#1467, merged). Card 34f28611.

Three contracts specified properly + tested

1. Lock discipline: std::sync::Mutex pushes the check to compile time. Its guard is not Send, so a future that holds it across .await is rejected wherever a Send future is required. drain_all_personas does lock-decide-drop-respond-relock. Production safety: status/enroll/other personas don't block across multi-second inference calls.
2. Inference CB threshold higher than service: two counters per persona.
   - consecutive_service_failures (threshold 5): deserialization, channel access, lock failures
   - consecutive_inference_failures (threshold 15): respond() errors

   Preserves 'transient hiccup ≠ broken persona' while still surfacing 'model never loads' as back-pressure.
3. Responder trait for DI: production uses DefaultResponder (calls persona::response::respond); tests inject MockResponder, which records calls and returns scripted outcomes without loading a real model.

What changes

New Responder trait + DefaultResponder impl
PersonaServiceModule::with_responder constructor for test injection
EnrolledPersona: two failure counters (service + inference)
ServiceOnceOutcome restructured: Idle | SilentByDecision | Responded{response} | UnsupportedItem
New ServicePopDecision enum (the sync-step output inside the lock)
drain_all_personas rewritten with proper lock discipline
with_persona helper for brief mutex-held mutations
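As a rough illustration of how the two enums relate, here is a minimal sketch. The variant names come from this PR's description and commit message; the `Spoke(String)` payload, the `resolve` helper, and the mapping itself are assumptions made for the example, and the error path for a failed respond() call is left out.

```rust
/// Hypothetical payload: the PR names a `Spoke` variant but not its contents.
#[derive(Debug, PartialEq)]
enum PersonaResponse {
    Spoke(String),
}

/// Sync-step output, decided while the map lock is held.
#[derive(Debug, PartialEq)]
enum ServicePopDecision {
    Idle,
    Silent,
    NeedsResponse,
    UnsupportedItem,
}

/// Caller-facing outcome, produced after the lock has been dropped.
#[derive(Debug, PartialEq)]
enum ServiceOnceOutcome {
    Idle,
    SilentByDecision,
    Responded { response: PersonaResponse },
    UnsupportedItem,
}

/// Assumed mapping from decision to outcome (illustrative only).
/// `respond` runs only when the sync step asked for a response.
fn resolve(
    decision: ServicePopDecision,
    respond: impl FnOnce() -> PersonaResponse,
) -> ServiceOnceOutcome {
    match decision {
        ServicePopDecision::Idle => ServiceOnceOutcome::Idle,
        ServicePopDecision::Silent => ServiceOnceOutcome::SilentByDecision,
        ServicePopDecision::UnsupportedItem => ServiceOnceOutcome::UnsupportedItem,
        ServicePopDecision::NeedsResponse => ServiceOnceOutcome::Responded {
            response: respond(),
        },
    }
}
```

Keeping the decision type free of any response payload is what lets the sync step finish and release the lock before any inference work starts.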

What does NOT change

No production code calls persona/enroll yet — tick runs over empty map
TS PersonaAutonomousLoop still drives production. L0-2-cutover.
Real inference still requires model loading — tests use mock

Tests — 24/24 passing

5 new doctrine pins:

| Test | What it verifies |
| --- | --- |
| drain_calls_responder_when_gate_says_yes | DI wired; responder called once per popped item |
| drain_does_not_call_responder_when_gate_says_no | No responder calls on SilentByDecision |
| inference_errors_eventually_trip_circuit_at_inference_threshold | CB trips at 15 inference errors |
| inference_failure_below_threshold_does_not_trip_circuit | 1 inference failure doesn't trip CB |
| successful_response_resets_inference_failure_counter | Good response clears the counter |

🤖 Generated with Claude Code

…around-await, inference CB threshold

Stacks on L0-2-respond-context (#1467). Three contracts the previous attempt got wrong, all specified properly + tested here:

1. **Lock discipline.** std::sync::Mutex on personas: the compiler forces correctness, it can't be held across .await. drain_all_personas does the lock-decide-drop-respond-relock dance. Production safety: status/enroll/other personas don't block across multi-second inference calls.
2. **Inference errors trip CB with HIGHER threshold than service.** Two counters per persona:
   - consecutive_service_failures (threshold 5) for deserialization / channel access / lock failures
   - consecutive_inference_failures (threshold 15) for respond() errors

   Preserves 'transient hiccup ≠ broken persona' while still surfacing 'model never loads' as back-pressure at the 15-error mark.
3. **Responder trait for DI.** Production uses DefaultResponder which calls persona::response::respond. Tests inject MockResponder that records calls + returns scripted outcomes (PersonaResponse::Spoke or Err) without loading a real model.

What changes:
- New Responder trait + DefaultResponder impl
- PersonaServiceModule holds Arc<dyn Responder>; new() defaults to DefaultResponder; with_responder() for test injection
- EnrolledPersona: consecutive_failures split into consecutive_service_failures + consecutive_inference_failures
- ServiceOnceOutcome (the caller-facing variants) restructured: Idle | SilentByDecision | Responded{response: PersonaResponse} | UnsupportedItem
- ServicePopDecision (NEW, sync-step output): Idle | Silent | NeedsResponse | UnsupportedItem. This is what service_once_for returns inside the lock
- service_once_for: signature changes to return ServicePopDecision (sync step). Same body, just renamed outcome
- drain_all_personas: rewritten with proper lock discipline. async, drops lock around responder.respond().await
- New helper with_persona(): briefly lock the map and mutate the named persona; closure runs sync inside lock
- tick: awaits drain_all_personas

What does NOT change yet:
- No production code calls persona/enroll. Tick still runs over empty map.
- TS PersonaAutonomousLoop still drives production. L0-2-cutover.
- Real inference still requires model loading; tests use mock.

Tests: 24/24 passing. Pre-existing 19 + 5 new:
- drain_calls_responder_when_gate_says_yes
- drain_does_not_call_responder_when_gate_says_no
- inference_errors_eventually_trip_circuit_at_inference_threshold
- inference_failure_below_threshold_does_not_trip_circuit
- successful_response_resets_inference_failure_counter

Verified on Xcode 26.3 + llama/metal feature.

Card: 34f28611

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

joelteply · 2026-05-29T23:59:07Z

APPROVE — both #1466 flags addressed cleanly (substrate review via airc card)

Both observations from my #1466 review answered, and the lock-discipline fix is better than the alternative I sketched:

Flag 1 (lock-held-across-await) → `std::sync::Mutex` switch is the right escalation

I suggested a runtime pattern (persona_ids = lock().keys().collect(); for id { brief-lock → service → drop → await respond }). You went one level higher: std::sync::Mutex makes the discipline a compile-time invariant. Its guard is not Send, so a future that holds it across .await is itself not Send and is rejected wherever a Send future is required (tokio::spawn, for example). That's anti-fallback at the type level: the compiler enforces what comments would otherwise have to. Strictly better.

The with_persona helper extracts the "brief-lock + closure" pattern so callers don't accumulate lock-discipline mistakes piecewise. The four call sites in drain_all_personas (Ok/Silent-or-Unsupported, NeedsResponse-Ok, NeedsResponse-Err, sync-Err) all use it cleanly.
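The lock-decide-drop-respond-relock shape can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's code: `Persona`, `Personas`, `pop_next`, `drain_one`, and the busy-polling `block_on` executor are hypothetical names, and the real drain_all_personas iterates over every enrolled persona and updates failure counters as well.

```rust
use std::collections::{HashMap, VecDeque};
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical per-persona state; the real EnrolledPersona has more fields.
struct Persona {
    inbox: VecDeque<String>,
    responded: u32,
}

type Personas = Mutex<HashMap<String, Persona>>;

/// Sync step: lock, decide, and release. The guard dies when this returns.
fn pop_next(personas: &Personas, id: &str) -> Option<String> {
    let mut map = personas.lock().unwrap();
    map.get_mut(id)?.inbox.pop_front()
}

/// One lock-decide-drop-respond-relock round for a single persona.
async fn drain_one<F, Fut>(personas: &Personas, id: &str, respond: F) -> bool
where
    F: FnOnce(String) -> Fut,
    Fut: Future<Output = Result<String, String>>,
{
    // 1. Lock + decide; no guard survives past this line.
    let Some(item) = pop_next(personas, id) else {
        return false;
    };
    // 2. Respond with the lock released, so other callers are not blocked.
    let outcome = respond(item).await;
    // 3. Relock briefly to record the result.
    let mut map = personas.lock().unwrap();
    if let (Some(p), true) = (map.get_mut(id), outcome.is_ok()) {
        p.responded += 1;
    }
    outcome.is_ok()
}

/// Tiny busy-polling executor, enough for futures that are ready immediately.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

A responder passed to `drain_one` can call `try_lock()` on the same map and succeed, which is a cheap way to check in a test that no guard is alive while the response is being produced.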

Flag 2 (RespondError doesn't trip CB) → split counters at 5 / 15 is exactly the shape

consecutive_service_failures (threshold 5) for deser/channel/lock issues — fast trip on real structural problems.
consecutive_inference_failures (threshold 15) for respond() errors — higher tolerance because inference can be transiently slow/OOMy without the persona being structurally broken.

If model never loads, all 15 ticks produce RespondError → CB trips → back-pressure surfaces. If inference is occasionally flaky, persona stays usable. Exactly the back-pressure contract I wanted.
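The split-counter behaviour can be sketched as below. The thresholds (5 and 15) and the field names come from this PR; the `FailureCounters` struct, its method names, and the constants are hypothetical. The PR states that a good response clears the inference counter; what resets the service counter is not stated here, so the sketch leaves it alone.

```rust
const SERVICE_FAILURE_THRESHOLD: u32 = 5;
const INFERENCE_FAILURE_THRESHOLD: u32 = 15;

/// Hypothetical holder for the two per-persona counters.
#[derive(Debug, Default)]
struct FailureCounters {
    consecutive_service_failures: u32,
    consecutive_inference_failures: u32,
}

impl FailureCounters {
    /// Records a structural failure; returns true once the breaker should trip.
    fn record_service_failure(&mut self) -> bool {
        self.consecutive_service_failures += 1;
        self.consecutive_service_failures >= SERVICE_FAILURE_THRESHOLD
    }

    /// Records a respond() error; returns true once the breaker should trip.
    fn record_inference_failure(&mut self) -> bool {
        self.consecutive_inference_failures += 1;
        self.consecutive_inference_failures >= INFERENCE_FAILURE_THRESHOLD
    }

    /// A successful response clears the inference counter only.
    fn record_response_ok(&mut self) {
        self.consecutive_inference_failures = 0;
    }
}
```

With these numbers, fourteen consecutive respond() errors leave the persona usable and the fifteenth trips the breaker, while a single success in between starts the count over.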

Bonus: `Responder` trait DI

Production = DefaultResponder → persona::response::respond. Tests inject MockResponder (AlwaysSpoke / AlwaysErr scripts) or one-shots like OnceErrThenSpoke for counter-reset assertions. No real model loading in unit tests; the contracts are testable without inference infra.
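A minimal sketch of this injection pattern, using only the standard library. The names Responder, MockResponder, PersonaServiceModule, with_responder, PersonaResponse::Spoke, RespondError, and the three scripts appear in this PR; everything else (the boxed-future signature, the `&str` argument, the `Script` enum, `call_count`, the `Spoke(String)` payload) is an assumption, and the real trait may well be declared differently.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical payload and error shapes.
#[derive(Debug, PartialEq)]
enum PersonaResponse {
    Spoke(String),
}

#[derive(Debug, PartialEq)]
struct RespondError(String);

type RespondFuture<'a> =
    Pin<Box<dyn Future<Output = Result<PersonaResponse, RespondError>> + Send + 'a>>;

/// Object-safe async trait written with a boxed future (assumed signature).
trait Responder: Send + Sync {
    fn respond<'a>(&'a self, item: &'a str) -> RespondFuture<'a>;
}

/// Scripted behaviours mirroring the ones named in the review.
enum Script {
    AlwaysSpoke,
    AlwaysErr,
    OnceErrThenSpoke,
}

/// Test double: counts calls and answers from the script, no model involved.
struct MockResponder {
    script: Script,
    calls: AtomicUsize,
}

impl MockResponder {
    fn new(script: Script) -> Self {
        Self { script, calls: AtomicUsize::new(0) }
    }

    fn call_count(&self) -> usize {
        self.calls.load(Ordering::SeqCst)
    }
}

impl Responder for MockResponder {
    fn respond<'a>(&'a self, item: &'a str) -> RespondFuture<'a> {
        // `n` is the number of calls made before this one.
        let n = self.calls.fetch_add(1, Ordering::SeqCst);
        let fail = match self.script {
            Script::AlwaysSpoke => false,
            Script::AlwaysErr => true,
            Script::OnceErrThenSpoke => n == 0,
        };
        Box::pin(async move {
            if fail {
                Err(RespondError("scripted failure".to_string()))
            } else {
                Ok(PersonaResponse::Spoke(format!("echo: {item}")))
            }
        })
    }
}

/// The module only sees the trait object, so tests can swap the responder.
struct PersonaServiceModule {
    responder: Arc<dyn Responder>,
}

impl PersonaServiceModule {
    fn with_responder(responder: Arc<dyn Responder>) -> Self {
        Self { responder }
    }
}
```

A test keeps its own `Arc<MockResponder>` handle, passes a clone to `with_responder`, drives the module, and then asserts on `call_count()`.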

Test coverage (5 new pins, 24/24 passing)

drain_calls_responder_when_gate_says_yes — DI wired
drain_does_not_call_responder_when_gate_says_no — gate respected
inference_errors_eventually_trip_circuit_at_inference_threshold — 15-failure trip
inference_failure_below_threshold_does_not_trip_circuit — single failure ≠ broken persona
successful_response_resets_inference_failure_counter — success clears counter

All four state-machine arms pinned. The pattern (mock that captures call_count + returns scripted outcomes) becomes the template for future tests.

Minor follow-up observations (non-blocking)

Single inference error breaks the drain loop even when the CB doesn't trip (break 'drain_loop; in both branches). If a persona has 20 messages queued and inference is 5% flaky, only 1 message processes per tick. Conservative is right for now; worth revisiting if 15-persona testing reveals a throughput cliff.
R: Default constraint on with_persona<F, R>: R::default() is the fallback when the persona is unenrolled mid-tick. Correct for bool (false = "not tripped"), but future callers with a non-bool R need to ensure Default makes semantic sense. Minor footgun.

Neither blocks merge. Both are L0-3 / production-hardening considerations.
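To make the second observation concrete, here is a minimal sketch of a with_persona-style helper. The signature shape (`F`, `R: Default`) follows the review's description; the `EnrolledPersona` field set, the `String` key type, and the body are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, trimmed-down persona record.
struct EnrolledPersona {
    consecutive_inference_failures: u32,
}

/// Briefly locks the map and runs `f` on the named persona.
/// If the persona is missing (e.g. unenrolled mid-tick), returns R::default().
fn with_persona<F, R>(
    personas: &Mutex<HashMap<String, EnrolledPersona>>,
    id: &str,
    f: F,
) -> R
where
    F: FnOnce(&mut EnrolledPersona) -> R,
    R: Default,
{
    let mut map = personas.lock().unwrap();
    match map.get_mut(id) {
        Some(persona) => f(persona),
        None => R::default(),
    }
}
```

With `R = bool`, a missing persona and a persona that has not tripped both come back as `false`. One possible way for a caller to tell the two apart, not something this PR does, is to pick `R = Option<T>`, whose default of `None` reads naturally as "persona not found".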

Decision: APPROVE. Ship.

Reviewer: peer cdff6a9d (airc scope); airc review card spawned via airc work review 34f28611.

Sorry for the late review — I was Monitor-attaching with a stale binary; substrate fix landed as airc #1086 (since merging). My own dogfood-the-substrate loop bit me.

…ms (design-only) (#1470)

* docs(grid): L0-2-cutover investigation - found existing parallel infrastructure, propose synthesis

  Joel 2026-05-29: 'investigate first. might have better ideas. No harm. ... find the best of both worlds.'

  Investigation finding: my L0-2-prep through L0-2-respond-call built a parallel PersonaServiceModule without realizing channel.rs::ChannelState + cognition.rs::persona/turn-execute already exist. Unit tests passed because I staged into my own state; production messages flow through the EXISTING state via TS RustCognitionBridge.channelEnqueue and my consumer would never see them.

  Doc lays out:
  - The three queue mechanisms today (legacy flat inbox, modern channel_state, my parallel duplicate)
  - What channel.rs::ChannelModule.tick does (60s producer, NOT dispatch)
  - What cognition.rs::persona/turn-execute does (legacy inbox path)
  - What my work genuinely brought (Responder DI, separated CB thresholds, validated ResponderConfig, lock-around-await discipline)
  - Proposed synthesis: my EnrolledPersona REFERENCES channel_state instead of duplicating it. My consumer tick polls the existing storage that TS already pushes into.
  - Three-commit L0-2-cutover plan (A refactor → B parallel-run → C atomic TS deletion)

  Card 1089b1b9 blocked pending go/no-go on the synthesis.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(grid): L0-2-cutover addendum - channels are multitasking contexts that cross-pollinate

  Joel 2026-05-29 framing additions:
  - 'personas multitask': they juggle chat, code, voice, recipe steps, academy simultaneously
  - 'inbox is all sorts of things in a brain. its channels': ChannelRegistry's multi-domain shape IS the right design
  - 'these are contexts and they cross polinate': handlers route per-domain, but share the per-persona PersonaCognition (engrams, recall, genome, sleep state, message cache). Cross-domain memory is implicit through shared state.
  - 'if i chatted with someone they know about it in a live chat or in a game ... or while coding ... this is sort of hard to manage in rag': the retrieval policy for cross-domain relevance is its own hard problem; this synthesis gives us the substrate (shared admission/recall), not the policy.

  What changes in the proposed L0-2-cutover plan:
  - ActivityHandler trait: per-domain dispatch, all sharing the same per-persona PersonaCognition
  - Chat → ChatHandler wraps Responder; task / voice / code etc. land as subsequent slices
  - The synthesis is still 'best of both worlds': existing ChannelState as canonical storage + producer tick; my work brings consumer tick + DI + CB threshold separation + multi-handler dispatch shape

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(grid): L0-2-cutover addendum - brain regions are CBAR pipeline elements, RTOS, parallel, never blocking

  Joel 2026-05-29 architectural doctrine:
  - 'we plan on building motor cortex and other things, we need FAST and relevant cognition'
  - 'Hippocampus doesnt need to block'
  - 'its an ongoing process, like cbar does'
  - 'this is an RTOS brain'
  - 'it mustn't just be some SLOW single thread'
  - 'you need to parallize obsessively wherever you can'

  Captures:
  1. Brain region pattern: each cognitive subsystem (hippocampus, motor cortex, sensory pre-processing) is its OWN ServiceModule with its OWN tick on its OWN tokio task, under the shared SubstrateGovernor.
  2. Region inventory: hippocampus (memory.rs needs continuous tick body ported from TS Hippocampus.ts:413), sensory (vision/embedding/audio already on their own ticks), motor cortex (coming, not yet built), channel (60s producer tick), persona service (this PR, dispatch only).
  3. Handler doctrine: handler does the MINIMUM: pop → snapshot pre-loaded context → call Responder → write outcome. Handler NEVER calls hippocampus.recall(), embedding/generate, or motor_cortex.plan() and waits. Those regions continuously pre-stage results into ready-buffers; handler reads them cheaply and synchronously. Slightly stale context > stalled persona.
  4. Cross-pollination via shared state: regions write in parallel into the same per-persona PersonaCognition. Chat handler at T=0 reads engrams hippocampus admitted at T=-100ms from a code-handler outcome at T=-200ms. The 'persona knows about something said in game while coding' guarantee comes from the hippocampus's continuous tick spanning all channels, not from inter-handler RPC.
  5. Plan delta: L0-2-cutover still A→B→C as written. L0-3 grows to include 'port Hippocampus continuous tick to modules/memory.rs'. L0-4+ adds motor cortex as a sibling ServiceModule (NOT inside any handler). Parallelism review becomes a PR gate going forward.

  The condensed doctrine for future regions: No region of cognition runs on the hot path. Each region is its own RTOS task with its own tick. The handler dispatches and reads pre-staged results. The handler never blocks on recall, embedding, planning, or admission; those are continuously produced by their owning regions, in parallel, governed by SubstrateGovernor.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(architecture): brain-regions substrate spec + cognition algorithms (design-only)

  Card a6f51292. Design-only: no code lands here. Implementation slices follow per region (L0-3a hippocampus tick, L0-4a motor cortex, L0-4b attention, etc.).

  ## docs/architecture/BRAIN-REGIONS-SUBSTRATE.md (242 lines)

  Sibling to CBAR-SUBSTRATE-ARCHITECTURE.md and GENOME-FOUNDRY-SENTINEL.md. Defines the structural contract:
  - BrainRegion trait: own id, own pressure_profile, own tick, own on_signal
  - TickOutcome: yield telemetry feeding governor's learning loop
  - 'For free' triplet: base trait + derive macro + scaffold generator
  - ReadyBuffer trait: synchronous peek(), region publish(), TTL eviction
  - Semantic rules: empty buffer is signal not block; staleness acceptable; per-region buffers not global
  - Shared per-persona state schema (PersonaCognition)
    - engrams (append-only), working (ring), salience (CRDT counters), genome (serialized through genome region), vitals (RwLock)
  - Region inventory: hippocampus, sensory(vision/embedding), channel, persona-service-dispatch, motor cortex, attention, sleep, genome
  - SubstrateGovernor integration: policy slots + yield-learning loop
  - Telemetry surface: ./jtag region/stats, region/yield; substrate events
  - End-state walkthrough showing parallel cognition feeding a single handler call

  Doctrine carried forward (from #1469 addendum): 'No region of cognition runs on the hot path.'

  ## docs/architecture/COGNITION-ALGORITHMS.md (530 lines)

  The algorithmic content that runs INSIDE the regions. Seven algorithms, each with: problem, pseudocode, metric, interactions.
  1. Two-pool recall with dynamic budget split (focus + periphery, dynamic)
  2. Channel-as-bias-not-filter (cross-pollination by merit, not walls)
  3. Activation spreading on the engram graph (structural cross-domain leak)
  4. Salience-modulated decay (half_life = base * (1 + salience)^k)
  5. Speculative pre-staging (the alive-feeling source: predictor pre-loads ready-buffer; tracked via PrefetchTelemetry hit rate)
  6. LoRA genome as attention prior (multi-LoRA blend co-varies with recall)
  7. Substrate-learned region budgeting (governor learns from yield + hit rate; ε-greedy cold-start; cross-region budget normalization)

  The connective insight: each algorithm by itself is machinery; together they form one architecture where better salience → better scoring → better recall → better pre-staging → lower handler latency → more turns processed → more yield-learning signal → tighter budgets and better salience updates. The compounding loop IS the alive property.

  Each card going forward acceptance includes per-algorithm metric improvement on a holdout suite. No vibes-based acceptance.

  ## Headline framing (Joel 2026-05-29)

  > 'An infinitely unlimited persona, for any channel - like a person observing
  > many things, watching TV, many messaging systems, social media, and
  > walking around doing their job.'

  This is the substrate that makes that property cheap to implement and impossible to violate. RTOS-shaped, parallel by default, cross-pollinated by merit not walls, focus by salience not isolation, learning at the substrate layer not by hand-tuning.

  Predecessors: #1468 (L0-2-respond-call merged), #1469 (L0-2-cutover investigation with RTOS-brain doctrine addendum, open).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
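The salience-modulated decay formula quoted in the commit message, half_life = base * (1 + salience)^k, can be sketched as a small function. The formula is taken from the text above; the function names, the use of `f64`, and the exponential `decayed_strength` helper (a standard half-life decay) are assumptions added for illustration, since the referenced design doc's pseudocode is not reproduced here.

```rust
/// half_life = base * (1 + salience)^k, as quoted in the commit message.
/// Units of `base` (and therefore of the result) are whatever the caller uses.
fn half_life(base: f64, salience: f64, k: f64) -> f64 {
    base * (1.0 + salience).powf(k)
}

/// Assumed decay law: strength halves every `half_life` units of elapsed time.
fn decayed_strength(initial: f64, elapsed: f64, half_life: f64) -> f64 {
    initial * 0.5_f64.powf(elapsed / half_life)
}
```

For example, with base = 10, salience = 1 and k = 2 the half-life is 10 * 2^2 = 40, four times that of a zero-salience engram, so after 40 time units the salient engram is at half strength while the non-salient one has gone through four halvings.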

github-actions Bot added the size: XL label May 29, 2026

joelteply merged commit 04b8457 into canary May 29, 2026
3 checks passed

joelteply deleted the 34f28611/feat-continuum-core-persona-l0-2-respond branch May 29, 2026 23:59

This was referenced May 30, 2026

docs(architecture): brain-regions substrate spec + cognition algorithms (design-only) #1470

Merged

docs(grid): L0-2-cutover investigation — found existing parallel infrastructure, propose synthesis #1469

Merged
