diff --git a/docs/grid/L0-PERSONA-COGNITION-E2E-PLAN.md b/docs/grid/L0-PERSONA-COGNITION-E2E-PLAN.md new file mode 100644 index 000000000..b843c6fd4 --- /dev/null +++ b/docs/grid/L0-PERSONA-COGNITION-E2E-PLAN.md @@ -0,0 +1,138 @@ +# L0 Plan — E2E Persona Cognition in Rust Alone + +**Status:** plan, refines [GRID-MIGRATION-ROADMAP](GRID-MIGRATION-ROADMAP.md) L0 layer. +**Predecessor:** [L0-2-DISPATCH-SLICING.md](L0-2-DISPATCH-SLICING.md) — proposed L0-2 as 3 sub-slices a/b/c. +**Priority:** Joel 2026-05-29: *"would take careful planning to migrate. I would get e2e persona cognition first, within RUST alone."* + +## What "E2E persona cognition in Rust alone" means concretely + +A persona receives a message → evaluates → optionally responds. Every step happens **inside the Rust runtime** with **no TS in the cognition path**. + +The boundaries that may legitimately stay TS (because they're form-specific): + +- Message INGRESS — the source that delivers a chat message to the persona. Today: TS receives airc events; eventually: airc embed in Rust directly. **Transitional acceptable**: TS receives → puts message into Rust channel. +- Message EGRESS — the path that publishes a generated response. Today: TS `chat/send` command publishes to airc. **Transitional acceptable**: Rust dispatches the `chat/send` command via the universal `CommandExecutor` (which routes through the TS bridge socket until airc embed lands). + +What is **not** acceptable as TS: + +- Decision logic (should-respond, priority, evaluation gates) +- Cognition state (PersonaCognition, sleep state, rate limiter, message cache) +- Response generation orchestration (prompt assembly, model selection, inference dispatch) +- Loop / tick cadence (the autonomous service loop) +- Genome paging / LoRA activation logic +- Inbox routing +- Admission gate / dedup / engram creation + +## Today's state (audit, 2026-05-29) + +### Rust side (already exists in continuum-core/src/persona/) + +- `PersonaCognition` (unified.rs) — container for all per-persona cognitive state. Has `new(persona_id, persona_name, rag_engine)` constructor + `with_budget` variant. +- `PersonaCognitionEngine` — `fast_path_decision`, `enqueue_message`, `state`, `update_state`, `mark_message_evaluated`. +- `full_evaluate` (evaluator/mod.rs:195) — unified pre-response gate (response_cap → mention → rate_limit → sleep_mode → directed_mention → fast_path). +- `respond` (response.rs:197) — async response generation. Takes `RespondInput`, returns `Result`. +- `channel_registry::service_cycle()` — pops next item from the per-persona channel queue, respects priority + state gating. +- `PersonaServiceModule` (L0-1, merged in #1457) — singleton ServiceModule, `persona/status` works, `persona/enroll` returns the L0-2-not-wired error, tick is no-op. +- `airc_admission.rs` — converts a signed airc envelope into an `AdmissionCandidate` for persona memory. + +### TS side (still drives the loop today) + +- `PersonaAutonomousLoop.ts` (~349 LOC after #1459 doctrine cleanup) — `runServiceLoop`, `serviceInbox`, `handleItem`. Drives every persona's tick. Calls into Rust `serviceCycleFull` to get items, dispatches via `evaluateAndPossiblyRespondWithCognition`. +- `PersonaMessageEvaluator.ts` (~974 LOC) — `evaluateAndPossiblyRespondWithCognition`. Calls `rustCognition.fullEvaluate()` then coordinates with the chat coordinator, builds RAG, calls `respondToMessage`. +- `PersonaResponseGenerator.ts` (~904 LOC after #1459 cleanup) — orchestrates the response pipeline: prompt assembly, model selection, inference, tool execution, response posting. +- `PersonaUser.ts` (~2160 LOC after #1459 cleanup) — receives airc events, routes to the inbox, kicks off autonomous loop, hosts the cognition bridge. +- The cognition path from "received chat" → "posted response" crosses TS↔Rust boundary at least 4–6 times. + +## Sequencing + +Five sub-slices, each shippable with no silent-drop window, each leaves the tree green. + +### L0-2-prep — PersonaSlot extension, enroll opens (no dispatch yet) + +**Adds Rust:** +- `PersonaSlot { persona_id, display_name, cognition: PersonaCognition, circuit_open_until_ms, consecutive_failures }` in `service_module.rs` +- `PersonaServiceModule.personas: Mutex>` +- `enroll(persona_id, display_name, rag_engine)` constructs the slot +- `persona/enroll` command opens (no longer returns L0-2-not-wired error) +- `persona/status` reports enrolled list with persona_id + display_name +- tick remains no-op (no dispatch yet — *but enrollment is now real*, so when L0-2-dispatch lands the slot exists) + +**Tests Rust:** 6 — enroll constructs, enroll idempotency, status reflects enrolled list, two distinct personas, unknown command, tick still no-op. + +**TS:** none touched. + +**Why this is safe to ship alone:** enrolling a persona changes no behavior — TS PersonaAutonomousLoop is still driving everything. The Rust enrollment is *latent* until L0-2-dispatch wires it. + +**Net:** ~150 LOC Rust added, 0 TS deleted. Foundation for the next slice. + +### L0-2-dispatch — `service_once_for` wired, exercised in tests only + +**Adds Rust:** +- `service_once_for(slot)` — pops via `channel_registry::service_cycle` from the slot's cognition channels; dispatches through `full_evaluate`; if `should_respond`, calls `respond()`; emits a structured `persona/responded` event with the generated text + correlation id. +- `tick` iterates enrolled slots, calls `service_once_for`, manages per-slot circuit breaker (5 consecutive failures → 30s cooldown), respects max-drain-per-tick (20 items). +- Bookmark advance via Drop guard on the dispatch handle so it ALWAYS advances (success path AND error path) — matches the existing TS structural-progress invariant. + +**Tests Rust:** 10 — empty inbox no-op, single message dispatch, full_evaluate-says-no path, full_evaluate-says-yes path, respond-error path, circuit breaker trips on N consecutive errors, cooldown timer, drain bound respected, two enrolled personas dispatch independently, bookmark advances on error. + +**TS:** STILL untouched. The TS PersonaAutonomousLoop is still the production driver. The Rust dispatch is exercised in unit tests but no production callsite invokes `PersonaServiceModule.tick` yet. + +**Why this is safe:** the Rust dispatch is fully self-contained; no production path calls it. TS continues unchanged. + +**Net:** ~300 LOC Rust + 250 LOC tests. 0 TS deleted. + +### L0-2-cutover — atomic switch + TS PersonaAutonomousLoop deletion + +**This slice is the cliff.** All TS-side dispatch dies; Rust takes over. + +**Adds Rust:** +- `PersonaServiceModule.tick` becomes the production loop. Registered via the runtime's normal module-tick scheduler at module init. +- Response posting: `service_once_for` dispatches `Commands.execute("chat/send", {...})` via the universal CommandExecutor. The TS side handles publish until airc embed lands; the Rust side is the orchestrator. + +**Removes TS:** +- `PersonaAutonomousLoop.ts` — entire file, 349 LOC. +- `PersonaUser.startAutonomousServicing()` — replaced with a call to register the persona with the Rust ServiceModule via `persona/enroll`. +- `PersonaUser.stopAutonomousServicing()` — replaced with `persona/unenroll` (new mirror command). +- Callsites in `autonomous-learning-e2e.test.ts` — update or delete tests for the TS loop. + +**Verification (gate):** +- 15-persona scenario in general room: every persona receives messages, evaluates, responds (or stays silent based on cognition's decision). +- No ghost retries (bookmark advances correctly). +- No duplicate dispatch (TS loop is gone; only Rust dispatches). +- Circuit breaker observably trips if a persona's cognition keeps erroring. + +**Net:** ~50 LOC Rust + ~400 LOC TS deleted. Net -350 LOC, but the value is the architectural cutover. + +### L0-3 — Genome / LoRA paging moves to Rust (PersonaGenomeManager.ts deletion) + +Out-of-scope details for now; sketched in [LORA-GENOME-PAGING.md](../personas/LORA-GENOME-PAGING.md). After L0-2-cutover, the TS PersonaGenomeManager has no Rust caller; deletion is mechanical. + +### L0-4 — Inbox routing moves to Rust (PersonaInbox.ts deletion) + +The Rust `channel_registry` already exists. After L0-2-cutover the TS `PersonaInbox` is the only remaining TS-side queue; its routing logic moves to Rust subscribers on airc room events. + +### L0-5 — Final `PersonaUser.ts` cull + +After L0-2 + L0-3 + L0-4 land, the remaining methods on PersonaUser.ts are mostly form-glue: receive airc events, route to Rust, expose RAG bridges for the response generator. Most of the 2160 LOC is then dead. Final cull. + +## Dependencies + blockers + +- **Not blocked by airc#1075.** L0-2-prep through L0-2-cutover use the universal CommandExecutor's existing TS-route branch for response posting. No airc embed needed yet. +- **Not blocked by e51ab14e.** That blocks the chat-flow migration (PR #1462 scope). E2E persona cognition in Rust does not require machine-singular daemon — the existing TS bridge for airc-event-ingress + chat-send-egress works. +- **Blocked by knowing the rag_engine source.** L0-2-prep needs a way to obtain `Arc` at enroll time. Open question: does the runtime's `ModuleContext` already plumb a shared RagEngine, or does PersonaServiceModule construct one? Need to investigate before writing L0-2-prep. + +## Pre-implementation investigation + +Before writing L0-2-prep code: + +1. Confirm how `Arc` is shared today. Is there a runtime-managed singleton? Per-persona? Constructed lazily? +2. Confirm how `channel_registry` items get populated today. Who writes to it, and does that path need to change for the Rust loop to drain it? +3. Confirm `Commands.execute` is reachable from inside a Rust ServiceModule. The `command_executor.rs` exists; ServiceModule needs to dispatch through it. +4. Identify the existing test fixtures for `PersonaCognition`. If there's a mock RagEngine or test harness, L0-2-prep tests can reuse it. + +I'll do those four checks before opening the L0-2-prep implementation PR. + +## What this plan is NOT + +- Not a contract negotiation — sub-slice boundaries may shift as the implementation reveals the shape. +- Not a substitute for actually shipping. The plan exists so the slices are reviewable and the cutover gate (L0-2-cutover) doesn't surprise anyone. +- Not a deletion of [L0-2-DISPATCH-SLICING.md](L0-2-DISPATCH-SLICING.md). That doc captured the slicing rationale; this one refines the slicing with the post-#1459 doctrine + Joel's "e2e in Rust alone first" priority. diff --git a/src/workers/continuum-core/src/persona/service_module.rs b/src/workers/continuum-core/src/persona/service_module.rs index a4390f422..500cc6111 100644 --- a/src/workers/continuum-core/src/persona/service_module.rs +++ b/src/workers/continuum-core/src/persona/service_module.rs @@ -1,40 +1,138 @@ //! `PersonaServiceModule` — singleton Rust `ServiceModule` for persona -//! work. **L0-1 minimum unit** of [GRID-MIGRATION-ROADMAP]. +//! work. //! -//! ## Scope discipline +//! ## L0-2-prep scope //! -//! L0-1 ships only what L0-1 needs: a registered module that responds -//! to `persona/status`. Enrollment, cognition dispatch, channel -//! ownership, and the circuit breaker all live with the layers that -//! wire them to real work (L0-2..L0-4), shipped alongside deletion of -//! their TS counterparts in the same PRs. +//! Builds on L0-1's minimum unit (#1457): the slot machinery and +//! `enroll` now open. Each enrolled persona gets a `PersonaSlot` that +//! carries its `PersonaCognition` (the per-persona container for engine +//! + inbox + rate_limiter + sleep_state + adapter_registry + genome + +//! classifier + caches + admission state from `persona::unified`). //! -//! No fallbacks here. Calling `persona/enroll` returns a loud error -//! until L0-2 wires cognition dispatch. +//! `tick` is still a no-op in this slice. The TS `PersonaAutonomousLoop` +//! continues to drive the production loop. Wiring `service_once_for` to +//! actually dispatch through `full_evaluate` + `respond` lands in +//! L0-2-dispatch, gated against the slot machinery proven here. +//! +//! See [docs/grid/L0-PERSONA-COGNITION-E2E-PLAN.md] for the full +//! sequencing. use std::any::Any; +use std::collections::HashMap; +use std::sync::{Arc, Mutex}; use std::time::Duration; use async_trait::async_trait; use serde_json::{json, Value}; +use uuid::Uuid; +use crate::persona::unified::PersonaCognition; +use crate::rag::RagEngine; use crate::runtime::service_module::{CommandResult, ModuleConfig, ModulePriority, ServiceModule}; use crate::runtime::ModuleContext; +/// Per-persona state inside the singleton service module. One slot per +/// enrolled persona; the slot owns the persona's cognition container +/// and the per-slot circuit-breaker bookkeeping. +/// +/// L0-2-prep: cognition is carried; circuit breaker fields are +/// declared but not yet exercised (no dispatch happens in this slice). +/// L0-2-dispatch will read + update them inside `service_once_for`. +pub struct PersonaSlot { + pub persona_id: Uuid, + pub display_name: String, + pub cognition: PersonaCognition, + /// Unix-ms timestamp at which the per-persona circuit re-closes. + /// 0 means the circuit is currently closed (healthy). + pub circuit_open_until_ms: u64, + /// Consecutive `service_once_for` failures since the last success. + /// Trips the circuit at `CIRCUIT_BREAKER_MAX_CONSECUTIVE_FAILURES`. + pub consecutive_failures: u32, +} + +impl PersonaSlot { + fn new(persona_id: Uuid, display_name: String, cognition: PersonaCognition) -> Self { + Self { + persona_id, + display_name, + cognition, + circuit_open_until_ms: 0, + consecutive_failures: 0, + } + } +} + /// Singleton owning persona work in-process. Replaces the TS /// `PersonaAutonomousLoop`; the deletion of `PersonaAutonomousLoop.ts` -/// lands with L0-2 once cognition dispatch is wired here. -pub struct PersonaServiceModule; +/// lands with L0-2-cutover. +pub struct PersonaServiceModule { + /// Per-persona state, keyed by persona_id. One mutex over the whole + /// map — for the 15-persona load this is fine. If a future profile + /// ever shows contention here, split into per-slot `Mutex` + /// inside a dashmap or similar. + personas: Mutex>, + /// Shared `RagEngine` used to construct each persona's cognition. + /// Held at module level so all personas share a single retrieval + /// substrate (corpora, indexes, caches). + rag_engine: Arc, +} impl PersonaServiceModule { - pub fn new() -> Self { - Self + pub fn new(rag_engine: Arc) -> Self { + Self { + personas: Mutex::new(HashMap::new()), + rag_engine, + } } -} -impl Default for PersonaServiceModule { - fn default() -> Self { - Self::new() + /// Enroll a persona. Constructs a `PersonaCognition` for it under the + /// module's shared `RagEngine`, stores the slot. Idempotent: enrolling + /// the same id with a different display name updates the name; the + /// existing cognition + circuit-breaker state are preserved (do NOT + /// reset cognition state silently — that would be a fallback). + pub fn enroll(&self, persona_id: Uuid, display_name: impl Into) -> Result<(), String> { + let display_name = display_name.into(); + let mut personas = self + .personas + .lock() + .map_err(|_| "personas lock poisoned".to_string())?; + if let Some(slot) = personas.get_mut(&persona_id) { + slot.display_name = display_name; + return Ok(()); + } + let cognition = PersonaCognition::new( + persona_id, + display_name.clone(), + Arc::clone(&self.rag_engine), + ); + personas.insert( + persona_id, + PersonaSlot::new(persona_id, display_name, cognition), + ); + Ok(()) + } + + /// Number of currently enrolled personas. Cheap; used by status. + pub fn enrolled_count(&self) -> Result { + let personas = self + .personas + .lock() + .map_err(|_| "personas lock poisoned".to_string())?; + Ok(personas.len()) + } + + /// Returns a snapshot of enrolled persona ids + display names, used + /// by status. Allocates; for hot-path observers, iterate the map + /// directly via your own lock. + pub fn enrolled_snapshot(&self) -> Result, String> { + let personas = self + .personas + .lock() + .map_err(|_| "personas lock poisoned".to_string())?; + Ok(personas + .values() + .map(|s| (s.persona_id, s.display_name.clone())) + .collect()) } } @@ -59,26 +157,50 @@ impl ServiceModule for PersonaServiceModule { async fn handle_command( &self, command: &str, - _params: Value, + params: Value, ) -> Result { match command { - "persona/status" => Ok(CommandResult::Json(json!({ - "module": "persona", - "enrolled": 0, - "scope": "L0-1: status-only; enroll wired in L0-2", - }))), - "persona/enroll" => Err( - "persona/enroll requires cognition dispatch (L0-2 — card 7a45a15f); \ - not yet wired" - .to_string(), - ), + "persona/status" => { + let snapshot = self.enrolled_snapshot()?; + let entries: Vec = snapshot + .into_iter() + .map(|(id, name)| json!({"persona_id": id.to_string(), "display_name": name})) + .collect(); + Ok(CommandResult::Json(json!({ + "module": "persona", + "enrolled": entries.len(), + "personas": entries, + "scope": "L0-2-prep: enroll opens; dispatch wiring lands in L0-2-dispatch", + }))) + } + "persona/enroll" => { + let persona_id_str = params + .get("persona_id") + .and_then(Value::as_str) + .ok_or_else(|| "persona/enroll requires persona_id (string)".to_string())?; + let persona_id = Uuid::parse_str(persona_id_str) + .map_err(|e| format!("persona/enroll: invalid persona_id uuid: {e}"))?; + let display_name = params + .get("display_name") + .and_then(Value::as_str) + .ok_or_else(|| "persona/enroll requires display_name (string)".to_string())? + .to_string(); + self.enroll(persona_id, display_name)?; + Ok(CommandResult::Json(json!({ + "enrolled": persona_id.to_string(), + "total": self.enrolled_count()?, + }))) + } other => Err(format!("unknown persona command: {other}")), } } async fn tick(&self) -> Result<(), String> { - // L0-1: no personas to service. L0-2 wires the per-persona - // `channel_registry::service_cycle()` dispatch here. + // L0-2-prep: enrollment is real, but no dispatch yet. The TS + // PersonaAutonomousLoop continues to drive production. The Rust + // dispatch lands in L0-2-dispatch with `service_once_for` and is + // exercised in unit tests before being made the production + // driver in L0-2-cutover. Ok(()) } @@ -91,9 +213,13 @@ impl ServiceModule for PersonaServiceModule { mod tests { use super::*; + fn fresh_module() -> PersonaServiceModule { + PersonaServiceModule::new(Arc::new(RagEngine::new())) + } + #[test] fn config_declares_persona_prefix_and_high_priority() { - let m = PersonaServiceModule::new(); + let m = fresh_module(); let cfg = m.config(); assert_eq!(cfg.name, "persona"); assert_eq!(cfg.priority, ModulePriority::High); @@ -102,8 +228,8 @@ mod tests { } #[tokio::test] - async fn status_command_succeeds_and_reports_l0_1_scope() { - let m = PersonaServiceModule::new(); + async fn status_with_no_enrollments_reports_zero_and_prep_scope() { + let m = fresh_module(); let result = m .handle_command("persona/status", Value::Null) .await @@ -113,39 +239,122 @@ mod tests { }; assert_eq!(v["module"], "persona"); assert_eq!(v["enrolled"], 0); - assert!(v["scope"].as_str().unwrap().contains("L0-1")); + assert_eq!(v["personas"].as_array().unwrap().len(), 0); + assert!(v["scope"].as_str().unwrap().contains("L0-2-prep")); } #[tokio::test] - async fn enroll_command_fails_loud_until_l0_2_card_7a45a15f() { - let m = PersonaServiceModule::new(); + async fn enroll_constructs_slot_and_status_reflects_it() { + let m = fresh_module(); + let persona_id = Uuid::new_v4(); + let result = m + .handle_command( + "persona/enroll", + json!({"persona_id": persona_id.to_string(), "display_name": "Helper"}), + ) + .await + .expect("enroll succeeds with valid params"); + let CommandResult::Json(enroll_result) = result else { + panic!("expected Json result") + }; + assert_eq!(enroll_result["enrolled"], persona_id.to_string()); + assert_eq!(enroll_result["total"], 1); + + let status = m + .handle_command("persona/status", Value::Null) + .await + .expect("status succeeds"); + let CommandResult::Json(s) = status else { + panic!("expected Json result") + }; + assert_eq!(s["enrolled"], 1); + let personas = s["personas"].as_array().unwrap(); + assert_eq!(personas.len(), 1); + assert_eq!(personas[0]["persona_id"], persona_id.to_string()); + assert_eq!(personas[0]["display_name"], "Helper"); + } + + #[tokio::test] + async fn enroll_is_idempotent_and_updates_display_name() { + let m = fresh_module(); + let persona_id = Uuid::new_v4(); + m.enroll(persona_id, "First").expect("first enroll"); + m.enroll(persona_id, "Second").expect("second enroll"); + assert_eq!(m.enrolled_count().unwrap(), 1); + let snapshot = m.enrolled_snapshot().unwrap(); + assert_eq!(snapshot.len(), 1); + assert_eq!(snapshot[0].1, "Second"); + } + + #[tokio::test] + async fn enroll_two_distinct_personas_keeps_both() { + let m = fresh_module(); + let a = Uuid::new_v4(); + let b = Uuid::new_v4(); + m.enroll(a, "Alpha").expect("enroll alpha"); + m.enroll(b, "Beta").expect("enroll beta"); + assert_eq!(m.enrolled_count().unwrap(), 2); + } + + #[tokio::test] + async fn enroll_missing_persona_id_fails_loud() { + let m = fresh_module(); + let err = m + .handle_command("persona/enroll", json!({"display_name": "Helper"})) + .await + .expect_err("enroll without persona_id must fail"); + assert!(err.contains("persona_id"), "error names the missing param: {err}"); + } + + #[tokio::test] + async fn enroll_missing_display_name_fails_loud() { + let m = fresh_module(); let err = m - .handle_command("persona/enroll", json!({"persona_id": "x"})) + .handle_command( + "persona/enroll", + json!({"persona_id": Uuid::new_v4().to_string()}), + ) .await - .expect_err("enroll must fail loud — no fallback semantics"); + .expect_err("enroll without display_name must fail"); assert!( - err.contains("L0-2"), - "error must name the gating layer; got: {err}" + err.contains("display_name"), + "error names the missing param: {err}" ); + } + + #[tokio::test] + async fn enroll_invalid_uuid_fails_loud() { + let m = fresh_module(); + let err = m + .handle_command( + "persona/enroll", + json!({"persona_id": "not-a-uuid", "display_name": "X"}), + ) + .await + .expect_err("enroll with invalid uuid must fail"); assert!( - err.contains("7a45a15f"), - "error must name the gating card so it's grep-able; got: {err}" + err.contains("uuid") || err.contains("invalid"), + "error names the parse failure: {err}" ); } #[tokio::test] async fn unknown_command_returns_clear_error() { - let m = PersonaServiceModule::new(); + let m = fresh_module(); let err = m .handle_command("persona/teleport", Value::Null) .await - .expect_err("unknown commands must error, not fall back"); + .expect_err("unknown commands must error"); assert!(err.contains("persona/teleport"), "error names the command"); } #[tokio::test] - async fn tick_succeeds_quietly_with_no_enrolled_personas() { - let m = PersonaServiceModule::new(); - m.tick().await.expect("empty tick succeeds"); + async fn tick_is_no_op_in_prep_slice() { + let m = fresh_module(); + let persona_id = Uuid::new_v4(); + m.enroll(persona_id, "Helper").expect("enroll"); + // tick should not error and should not affect enrolled state + m.tick().await.expect("tick succeeds"); + assert_eq!(m.enrolled_count().unwrap(), 1); } }