diff --git a/.ci/readability-baseline.env b/.ci/readability-baseline.env index b52a7282..4303f13a 100644 --- a/.ci/readability-baseline.env +++ b/.ci/readability-baseline.env @@ -1,11 +1,11 @@ # Generated by scripts/readability-ratchet.sh -PROD_RS_TOTAL=325 -PROD_FILES_GT300=108 -PROD_FILES_GT500=51 -PROD_FILES_GT1000=3 -PROD_MAX_FILE_LINES=1823 -PROD_MAX_FILE_PATH=crates/temper-server/src/observe/evolution/insight_generator.rs +PROD_RS_TOTAL=349 +PROD_FILES_GT300=111 +PROD_FILES_GT500=45 +PROD_FILES_GT1000=1 +PROD_MAX_FILE_LINES=1625 +PROD_MAX_FILE_PATH=crates/temper-server/src/channels/discord.rs ALLOW_CLIPPY_COUNT=23 ALLOW_DEAD_CODE_COUNT=9 -PROD_PRINTLN_COUNT=176 -PROD_UNWRAP_CI_OK_COUNT=115 +PROD_PRINTLN_COUNT=230 +PROD_UNWRAP_CI_OK_COUNT=116 diff --git a/.claude/plans/shiny-tumbling-thimble-agent-afffe854b7ee1a307.md b/.claude/plans/shiny-tumbling-thimble-agent-afffe854b7ee1a307.md new file mode 100644 index 00000000..df6148f8 --- /dev/null +++ b/.claude/plans/shiny-tumbling-thimble-agent-afffe854b7ee1a307.md @@ -0,0 +1,407 @@ +# OpenClaw Clean Core Architecture + +## Overview + +OpenClaw is a TypeScript/Node.js monorepo (pnpm workspaces) that acts as a **personal AI assistant gateway**. It connects 23+ messaging channels (Discord, Telegram, WhatsApp, Slack, etc.) to LLM backends (Anthropic Claude, OpenAI, etc.) through a WebSocket control plane. The core is NOT a WASM sandbox system -- it is a local-first Node.js process with optional Docker sandboxing for non-main sessions. + +--- + +## 1. The Gateway (Control Plane) + +**Location:** `src/gateway/` + +The Gateway is a **WebSocket server** bound to `ws://127.0.0.1:18789`. It is the central coordination hub. + +### Boot Sequence +- **`src/gateway/boot.ts`** -- Reads `BOOT.md` from workspace, constructs a prompt, runs the agent once in an isolated session, then restores the main session mapping. Uses `generateBootSessionId()` (timestamp + truncated UUID). 
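The boot session ID can be sketched as follows. This is a hedged illustration, not the real implementation: the plan only states that `generateBootSessionId()` combines a timestamp with a truncated UUID, so the `boot-` prefix, the separator, and the 8-character truncation below are assumptions.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical sketch of generateBootSessionId(): the plan only says
// "timestamp + truncated UUID", so the `boot-` prefix, separator, and
// 8-char truncation are illustrative assumptions.
function generateBootSessionId(now: Date = new Date()): string {
  const stamp = now.toISOString().replace(/[-:.TZ]/g, ""); // e.g. 20260324121451222
  const suffix = randomUUID().slice(0, 8); // truncated UUID segment
  return `boot-${stamp}-${suffix}`;
}
```

Any collision-resistant suffix works here; the timestamp prefix just keeps boot sessions sortable and distinct from the main session mapping that is restored afterwards.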
+ +### Connection Model +- **`src/gateway/client.ts`** -- `GatewayClient` class manages WebSocket connections with: + - **Authentication**: Multiple auth methods (token, bootstrap token, device token, password, signature token). Device identity signing. TLS fingerprint validation. + - **Protocol**: `RequestFrame` / `ResponseFrame` for RPC, `EventFrame` for push events with sequence tracking and gap detection. + - **Reconnection**: Exponential backoff (1s to 30s). Tick-based liveness detection (stall timeout = 2x tick interval). + - **Security**: Blocks plaintext `ws://` to non-loopback addresses (CWE-319). + +### RPC Dispatch +- **`src/gateway/call.ts`** -- `callGateway()` is the primary RPC entry point. Routes to: + - `callGatewayCli()` -- Default CLI operator scopes + - `callGatewayLeastPrivilege()` -- Minimum required scopes for the method + - `callGatewayScoped()` -- Explicit user-supplied scopes + - `executeGatewayRequestWithScopes()` -- Creates `GatewayClient`, handles hello handshake, executes RPC method, manages timeout/close events. + +### Key Gateway Methods (RPC) +- `"agent"` -- Dispatch a message to an agent for processing +- `"sessions.patch"` -- Create/update session metadata +- Channel-specific methods for sending, reactions, etc. + +### Webhook Hooks +- **`src/gateway/hooks.ts`** -- HTTP webhook ingress with: + - `HookAgentPayload { message, channel, sessionKey?, agentId?, ... }` + - Token auth via `Authorization: Bearer` or `x-openclaw-token` header + - Session key resolution (hook-prefixed UUID if not provided) + - Agent allowlist enforcement + +--- + +## 2. Agent Runtime (NOT a WASM sandbox -- it's CLI-based) + +**Location:** `src/agents/` + +There is NO WASM sandbox. OpenClaw does NOT have an "Agent Runtime Engine" in the traditional sense. Instead, agents are **CLI processes** (Claude Code, Codex, Pi, OpenCode) spawned as child processes. 
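A minimal sketch of this CLI-process model, assuming nothing beyond Node's `child_process`: a supervisor spawns the agent binary, collects stdout, and enforces a timeout — roughly the skeleton of what `runCliAgent()` is described as doing. The real runner additionally resolves workspaces, manages sessions, retries expired sessions, and reports usage metrics; none of that is modeled here.

```typescript
import { spawn } from "node:child_process";

// Hedged sketch: spawn an agent CLI as a child process, collect stdout, and
// kill it if it exceeds the timeout. The command and argument handling are
// simplified relative to the process supervisor described above.
function runCliOnce(
  command: string,
  args: string[],
  timeoutMs = 120_000,
): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: ["ignore", "pipe", "pipe"] });
    const timer = setTimeout(() => {
      child.kill("SIGKILL");
      reject(new Error(`agent timed out after ${timeoutMs}ms`));
    }, timeoutMs);
    let stdout = "";
    child.stdout?.on("data", (chunk) => { stdout += chunk; });
    child.on("error", (err) => { clearTimeout(timer); reject(err); });
    child.on("close", (code) => {
      clearTimeout(timer);
      if (code === 0) resolve(stdout);
      else reject(new Error(`agent exited with code ${code}`));
    });
  });
}
```

For Claude Code this would be invoked along the lines of `runCliOnce("claude", ["--permission-mode", "bypassPermissions", "--print", task])`, matching the invocation pattern listed later for the coding-agent skill.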
+ +### Agent Identity & Scope +- **`src/agents/agent-scope.ts`** -- Core agent resolution: + ```typescript + type ResolvedAgentConfig = { + name?: string; + workspace?: string; + agentDir?: string; + model?: AgentEntry["model"]; + skills?: AgentEntry["skills"]; + identity?: AgentEntry["identity"]; + sandbox?: AgentEntry["sandbox"]; + tools?: AgentEntry["tools"]; + // ... heartbeat, subagents, groupChat, etc. + }; + ``` + - `resolveDefaultAgentId(cfg)` -- First agent with `default=true`, or first in list, or `DEFAULT_AGENT_ID` + - `resolveSessionAgentId({ sessionKey, config })` -- Parses agent ID from session key format `agent::...` + - `resolveAgentWorkspaceDir(cfg, agentId)` -- Per-agent workspace directories + - `resolveAgentSkillsFilter(cfg, agentId)` -- Per-agent skill allowlists + +### CLI Runner (How agents are actually executed) +- **`src/agents/cli-runner.ts`** -- `runCliAgent()`: + 1. Resolves workspace directories and backend config + 2. Prepares system prompt with bootstrap context + 3. Builds CLI arguments + 4. Executes via process supervisor with timeout handling + 5. Handles session management (new or resumed) + 6. Retry logic for expired sessions + 7. Image payload management + 8. Returns execution results with usage metrics + + `runClaudeCliAgent()` wraps this with provider="claude-cli", model="opus". + +### ACP (Agent Control Protocol) Spawning +- **`src/agents/acp-spawn.ts`** -- `spawnAcpDirect()` for spawning isolated agent sessions: + ```typescript + type SpawnAcpParams = { + task: string; + label?: string; + agentId?: string; + mode?: "run" | "session"; // oneshot vs persistent + thread?: boolean; // bind to a channel thread + sandbox?: "inherit" | "require"; + streamTo?: "parent"; // relay output to parent session + }; + + type SpawnAcpResult = { + status: "accepted" | "forbidden" | "error"; + childSessionKey?: string; // format: "agent::acp:" + runId?: string; + mode?: SpawnAcpMode; + }; + ``` + Flow: + 1. 
Check ACP policy (`isAcpEnabledByPolicy`) + 2. Resolve target agent ID from config + 3. Create session key: `agent:{agentId}:acp:{uuid}` + 4. Register session via `callGateway({ method: "sessions.patch" })` + 5. Initialize runtime via `getAcpSessionManager().initializeSession()` + 6. Optionally bind to a channel thread (Discord thread, etc.) + 7. Dispatch task via `callGateway({ method: "agent", params: { message, sessionKey, ... } })` + +### Sandbox Model +- **Main session**: Runs on host with full tool access +- **Non-main sessions**: Can run sandboxed in Docker containers +- `resolveSandboxRuntimeStatus({ cfg, sessionKey })` determines sandbox status +- Sandboxed sessions CANNOT spawn ACP sessions (host-only) + +--- + +## 3. The SKILL.md System (NOT SOUL.md) + +**Location:** `skills/` directory (56 skill directories) + +OpenClaw uses `SKILL.md` files, NOT `SOUL.md`. Each skill is a directory containing a single `SKILL.md` markdown file that acts as both documentation and prompt injection. + +### Skill Format (from `skills/coding-agent/SKILL.md`) +Skills are **plain markdown documents** with: +- Title and description +- Usage instructions (tool parameters, flags, examples) +- Rules and constraints +- The content is injected verbatim into the agent's system prompt + +### Skill Loading Pipeline +- **`src/auto-reply/skill-commands.ts`**: + 1. `listSkillCommandsForWorkspace(workspaceDir, config)` -- Builds skill command specs with skill filters + 2. `listSkillCommandsForAgents(agentIds)` -- Iterates agents, resolves workspace dirs, deduplicates by canonical path + 3. `mergeSkillFilters()` -- Unrestricted (undefined) takes precedence; empty `[]` contributes nothing; non-empty arrays merge via dedup + 4. 
`dedupeBySkillName()` -- Lowercase normalization, preserves insertion order + +### Skill Filtering (Per-Agent) +- **`src/agents/agent-scope.ts`**: `resolveAgentSkillsFilter(cfg, agentId)` reads `skills` from agent config +- Agent config supports per-agent skill allowlists: + ```yaml + agents: + list: + - id: "molty" + skills: ["coding-agent", "discord", "github"] + - id: "helper" + skills: [] # no skills + ``` + +### Skill Injection Point +- In `getReplyFromConfig()` (the main reply pipeline): + 1. Skill filters merged from channel + agent settings + 2. Passed through chain via `skillFilter` parameter + 3. Injected during directive resolution and inline action handling + 4. SKILL.md content becomes part of system prompt context + +### Available Skills (56 total) +Key ones: `coding-agent`, `discord`, `slack`, `github`, `gh-issues`, `obsidian`, `notion`, `canvas`, `weather`, `spotify-player`, `voice-call`, `tmux`, `trello`, `camsnap`, etc. + +--- + +## 4. Discord Integration -- The Clean Path + +**Location:** `extensions/discord/` (plugin) + `skills/discord/` (skill) + +### Plugin Registration +```typescript +// extensions/discord/index.ts +export default defineChannelPluginEntry({ + id: "discord", + name: "Discord", + description: "Discord channel plugin", + plugin: discordPlugin, + setRuntime: setDiscordRuntime, + registerFull: registerDiscordSubagentHooks, +}); +``` + +### Plugin Registry +- **`src/channels/registry.ts`** -- Global plugin registry via `Symbol.for("openclaw.pluginRegistryState")` +- Plugins register at startup, keyed by ID with optional aliases +- `normalizeAnyChannelId()` and `findRegisteredChannelPluginEntry()` for lookup + +### Discord Channel Plugin (`extensions/discord/src/channel.ts`) +`discordPlugin` is a `ChannelPlugin` created via `createChatChannelPlugin()` with: +- Allowlist management (legacy DM account support) +- Group policy resolution +- Mention stripping patterns +- Agent prompt hints for Discord components/forms +- Message 
normalization and target parsing +- Outbound: text (2000 char limit), media attachments, polls, silent delivery + +### Discord Monitor Provider (`extensions/discord/src/monitor/provider.ts`) +`monitorDiscordProvider()` orchestrates the Discord bot lifecycle: +1. Load Discord account settings, thread bindings, feature flags +2. Deploy native slash commands (with retry + rate-limit handling) +3. Register interactive components (buttons, select menus, modals) +4. Register event listeners (messages, reactions, threads, presence) +5. Optionally initialize voice channel management +6. Manage WebSocket gateway connection + reconnection + +### Discord Message Handler (`extensions/discord/src/monitor/message-handler.ts`) +`createDiscordMessageHandler()` returns a handler with `deactivate()`: +1. **Dedup**: LRU cache (5-min TTL, 5000 max entries) +2. **Debounce**: Batches consecutive messages from same author in same channel +3. **Bot filter**: Filters out bot's own messages early +4. **Batch**: Creates synthetic concatenated message for multi-message bursts +5. **Config resolution**: Merges Discord-specific config with channel defaults + +### Discord Event Listeners (`extensions/discord/src/monitor/listeners.ts`) +- `DiscordMessageListener` -- Fire-and-forget delegation to handler +- `DiscordReactionListener` / `DiscordReactionRemoveListener` -- Authorization checks + notification emit +- `DiscordPresenceListener` -- Caches user presence data +- `DiscordThreadUpdateListener` -- Closes sessions when threads archive +- `runDiscordListenerWithSlowLog()` -- 30s slow-log wrapper + +### The Clean Path: Discord Message -> Agent Response + +``` +1. Discord WebSocket Gateway receives MESSAGE_CREATE + | +2. DiscordMessageListener.onMessage(message) + | (fire-and-forget, no blocking) + | +3. createDiscordMessageHandler() processes: + a. Filter bot's own messages + b. Check dedupe cache (5-min TTL) + c. Enqueue into debouncer (batch consecutive msgs from same author) + d. 
Debouncer flushes -> single or synthetic batched message + | +4. Message enters Channel Session Layer (src/channels/): + a. session-envelope.ts: resolveInboundSessionEnvelopeContext() + b. session.ts: recordInboundSession() -- normalize session key, update last route + c. mention-gating.ts: check if bot was mentioned (for group chats) + d. command-gating.ts: check if message is a command + | +5. Routing (src/routing/): + resolveAgentRoute(input) resolves which agent handles this message: + - Priority tiers: peer > parent peer > guild+roles > guild > team > account > channel + - Returns: { agentId, sessionKeys, routingPolicy } + - Session key format: "agent:::" + | +6. Run State Machine (src/channels/run-state-machine.ts): + createRunStateMachine() tracks active runs: + - onRunStart() increments counter, publishes busy status + - Heartbeat interval (60s) for long-running operations + - onRunEnd() decrements, clears heartbeat when idle + | +7. Dispatch (src/auto-reply/dispatch.ts): + dispatchInboundMessage(ctx, cfg, dispatcher) + -> dispatchReplyFromConfig() + -> withReplyDispatcher() (ensures cleanup on all exit paths) + | +8. Reply Pipeline (src/auto-reply/reply/get-reply.ts): + getReplyFromConfig() -- THE MAIN ORCHESTRATOR: + a. Resolve agent identity (resolveSessionAgentId) + b. Merge skill filters (channel + agent level) + c. Establish workspace directory + bootstrap files + d. Resolve model selection (default / heartbeat override / channel override) + e. Finalize inbound context + f. Apply media understanding (if media detected) + g. Apply link understanding (if URLs detected) + h. Emit pre-agent-message hooks + i. Initialize session state + j. Resolve command authorization + k. Parse directives (commands, model switches, etc.) + l. Handle inline actions (skill invocation, commands, elevation) + m. Stage sandbox media files + n. runPreparedReply() -- actually calls the LLM + | +9. 
Context Engine (src/context-engine/): + ContextEngine.assemble() builds the model context: + - Ordered messages within token budget + - System prompt additions + - Token estimates + | +10. Agent Execution: + CLI runner spawns the actual agent process (Claude, Codex, etc.) + OR + Direct API call to LLM provider + | +11. Response flows back through: + ReplyPayload -> dispatcher -> channel outbound adapter + -> Discord send.ts -> Discord API (message create/edit) +``` + +--- + +## 5. Tool System + +**Location:** `src/agents/bash-tools.*`, `src/agents/channel-tools.ts`, skill SKILL.md files + +### Tool Categories +1. **Bash tools** (`bash-tools.exec.ts`, `bash-tools.process.ts`, `bash-tools.shared.ts`): + - Shell execution with PTY support + - Background mode with session tracking + - Process supervision (poll, log, write, submit, send-keys, kill) + - Docker exec for sandboxed sessions + - Exec approval workflow for sensitive commands + +2. **Channel tools** (`channel-tools.ts`): + - `message` tool -- Send messages to any channel + - `process` tool -- Manage background processes + - Channel-specific operations (reactions, edits, deletes) + +3. **Browser tools** (`src/browser/`): + - CDP-managed Chrome instance + - Page navigation, screenshots, interaction + +4. **Canvas tools** (`src/canvas-host/`): + - A2UI push/reset for visual workspace + +5. **Node tools** (`src/node-host/`): + - Camera, screen recording, system commands + - Device-specific operations + +6. **Skill-provided tools**: + - Each SKILL.md can describe tool usage patterns + - Skills are prompt-injected, not programmatic tool registrations + +### Tool Dispatch Pattern +Tools are NOT a formal registry with schemas. 
Instead: +- Core tools (bash, message, process, browser, canvas) are built into the agent runtime +- Skills inject knowledge about tool usage via system prompt +- The LLM decides which tools to call based on the combined prompt +- Tool results flow back through the `onToolResult` callback in `GetReplyOptions` + +### Key Types +```typescript +type GetReplyOptions = { + runId?: string; + abortSignal?: AbortSignal; + images?: ImageContent[]; + onAgentRunStart?: (runId: string) => void; + onPartialReply?: (payload: ReplyPayload) => void; + onBlockReply?: (payload: ReplyPayload, context?: BlockReplyContext) => void; + onToolResult?: (payload: ReplyPayload) => void; + onToolStart?: (payload: { name?: string; phase?: string }) => void; + skillFilter?: string[]; + // ... typing, compaction, model selection callbacks +}; + +type ReplyPayload = { + text?: string; + mediaUrl?: string; + mediaUrls?: string[]; + interactive?: InteractiveReply; + replyToId?: string; + isError?: boolean; + isReasoning?: boolean; + channelData?: Record; +}; +``` + +--- + +## 6. Coding Agent Integration + +**Location:** `skills/coding-agent/SKILL.md` + +Coding agents are integrated as **bash tool invocations**, NOT as first-class runtime primitives. The `coding-agent` skill teaches the AI how to spawn and manage them. + +### Supported Agents +| Agent | Invocation | Notes | +|-------|-----------|-------| +| **Codex** | `bash pty:true command:"codex exec 'prompt'"` | Requires git repo, PTY essential | +| **Claude Code** | `bash command:"claude --permission-mode bypassPermissions --print 'task'"` | No PTY needed, `--print` mode | +| **OpenCode** | `bash pty:true command:"opencode run 'task'"` | PTY required | +| **Pi** | `bash pty:true command:"pi 'task'"` | PTY required | + +### Execution Modes +1. **Foreground**: Direct invocation, blocks until complete +2. **Background**: `background:true`, returns `sessionId` for monitoring via `process` tool +3. 
**Parallel**: Multiple background sessions for batch work (PR reviews, issue fixes) + +### Auto-Notify Pattern +```bash +# Append wake trigger for completion notification: +openclaw system event --text "Done: [summary]" --mode now +``` + +### Workspace Isolation +- `workdir` parameter ensures agent operates in correct directory +- Git worktrees for parallel issue fixing +- NEVER start coding agents in `~/.openclaw/` (reads soul docs) or the live OpenClaw instance dir + +--- + +## Key Architectural Observations + +1. **No WASM, no formal agent runtime engine**: Agents are CLI processes spawned via `child_process`. The "runtime" is session management + process supervision. + +2. **Gateway is the brain**: Everything routes through the WebSocket control plane. Even internal agent-to-agent communication uses `callGateway()`. + +3. **Skills are prompt injection**: SKILL.md files are loaded, filtered per-agent, and injected into the system prompt. No formal tool schema registration -- the LLM infers tool usage from the markdown. + +4. **Channel plugins are the extension model**: Each channel (Discord, Slack, etc.) is a `ChannelPlugin` registered via `defineChannelPluginEntry()`. The plugin provides: message normalization, allowlists, outbound adapters, and event listeners. + +5. **Session keys encode routing**: Format is `agent:::` or `agent::acp:` for spawned sessions. + +6. **Multi-agent via config**: Multiple agents are defined in config, each with their own workspace, skills filter, model, and identity. Routing bindings determine which agent handles which conversations. + +7. **The reply pipeline is the core loop**: `getReplyFromConfig()` is the single function that orchestrates everything from message receipt to agent execution to response delivery. 
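Observation 5 can be illustrated with a small parser. The ACP form `agent:{agentId}:acp:{uuid}` comes from the spawn flow documented above; treating the remaining colon-separated segments of non-ACP keys as opaque routing parts is an assumption of this sketch, since the exact segment layout is not spelled out here.

```typescript
// Hedged sketch of session-key parsing. Only the ACP shape
// agent:{agentId}:acp:{uuid} is confirmed by the spawn flow; the generic
// "routed" branch simply preserves the remaining segments.
type ParsedSessionKey =
  | { kind: "acp"; agentId: string; runUuid: string }
  | { kind: "routed"; agentId: string; segments: string[] };

function parseSessionKey(key: string): ParsedSessionKey | null {
  const parts = key.split(":");
  if (parts[0] !== "agent" || parts.length < 3 || parts[1] === "") return null;
  if (parts[2] === "acp" && parts.length === 4) {
    return { kind: "acp", agentId: parts[1], runUuid: parts[3] };
  }
  return { kind: "routed", agentId: parts[1], segments: parts.slice(2) };
}
```

Because the agent ID is always the second segment, `resolveSessionAgentId()`-style lookups never need channel-specific knowledge — routing metadata rides along in the key itself.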
diff --git a/.gitignore b/.gitignore index f918dc45..518d1991 100644 --- a/.gitignore +++ b/.gitignore @@ -61,7 +61,9 @@ ui/observe/components/Graph3D.tsx scripts/discord-clean-view.js scripts/generate-graph-json.js -.proof/ +.proof/* +!.proof/ +!.proof/*.md .code-review-pass .dst-review-pass .vercel diff --git a/.proof/temper-agent-e2e-proof.md b/.proof/temper-agent-e2e-proof.md new file mode 100644 index 00000000..e02d8752 --- /dev/null +++ b/.proof/temper-agent-e2e-proof.md @@ -0,0 +1,929 @@ +# Governed Agent Architecture E2E Proof + +## Date +2026-03-24T12:14:51.222495+00:00 + +## Branch +feat/temper-claw + +## Commit +f58f58926fdce2a35aa4487bffb3015900c5a8e4 + +## Server +`http://127.0.0.1:3463` against tenant `temper-agent-proof-20260324121451` + +## Specs Deployed +- `temper-fs`: {"app": "temper-fs", "tenant": "temper-agent-proof-20260324121451", "added": ["Directory", "File", "FileVersion", "Workspace"], "updated": [], "skipped": [], "status": "installed"} +- `temper-agent`: {"app": "temper-agent", "tenant": "temper-agent-proof-20260324121451", "added": ["AgentMemory", "AgentSkill", "AgentSoul", "CronJob", "CronScheduler", "HeartbeatMonitor", "TemperAgent", "ToolHook"], "updated": [], "skipped": [], "status": "installed"} +- `temper-channels`: {"app": "temper-channels", "tenant": "temper-agent-proof-20260324121451", "added": ["AgentRoute", "Channel", "ChannelSession"], "updated": [], "skipped": [], "status": "installed"} + +## Trigger Path A: Direct OData API +| Step | Expected | Actual | Status | +|---|---|---|---| +| A1 | Agent created with soul_id bound | soul_id=019d1fc4-f103-7500-9043-a09663bebb2e | PASS | +| A4 | SSE replay returns lifecycle events | captured direct-events.sse | PASS | +| A5 | Prompt includes soul, skills, and memory blocks | # Proof Soul

## Identity
You are Proof Soul, a governed Temper agent used to verify the Pi architecture rewrite.

## Instructions
- Prefer deterministic mock runs for verification.
- Surface memory and skills in the prompt.
- Use tools only when the proof plan requires them.

## Capabilities
- Run | PASS | +| A6 | Thinking/Executing loop is visible in events | ProcessToolCalls/HandleToolResults present | PASS | +| A7 | Session tree persisted JSONL entries and steering branch | {"id":"h-019d1fc4-f16f-7452-9119-79ae692dc5ae","parentId":null,"tokens":0,"type":"header","version":1}
{"content":"{\"mock_plan\":{\"steps\":[{\"text\":\"Starting direct path\",\"tool_calls\":[{\"name\":\"bash\",\"input\":{\"command\":\"sle | PASS | +| A8 | Steering injection stored and observable | steering marker present | PASS | +| A9 | Steering caused a continue transition | ContinueWithSteering seen | PASS | +| A10 | Agent completed successfully | Direct path finished with memory keys user-profile, project-context, proof-direct-memory. | PASS | +| A11 | save_memory created a new AgentMemory | count=1 | PASS | + +## Trigger Path B: Channel Webhook +| Step | Expected | Actual | Status | +|---|---|---|---| +| B1 | Channel.ReceiveMessage accepted webhook payload | ReceiveMessage executed | PASS | +| B2 | ChannelSession created for thread | session_id=019d1fc5-0717-7c22-a206-598ccf05f8b7 | PASS | +| B3 | Channel route spawned agent with route soul_id | soul_id=019d1fc4-f103-7500-9043-a09663bebb2e | PASS | +| B4 | Channel-triggered agent completed | Channel proof reply | PASS | +| B5 | send_reply delivered the agent result | {"path": "/", "body": "{\"agent_entity_id\":\"019d1fc5-06f9-7ad0-a252-bc6d34187024\",\"content\":\"Channel proof reply\",\"thread_id\":\"thread-1\"}", "agent_entity_id": "019d1fc5-06f9-7ad0-a252-bc6d34187024", "content": "Channel proof reply", "thread_id": "thread-1"} | PASS | + +## Trigger Path C: WASM Orchestration +| Step | Expected | Actual | Status | +|---|---|---|---| +| C1 | An orchestrator entity ran WASM that spawned a TemperAgent | parent_agent=019d1fc5-0a03-78d0-a9d4-c410a869dd27 | PASS | +| C2 | Child TemperAgent created with parent_agent_id | parent_agent_id=019d1fc5-0a03-78d0-a9d4-c410a869dd27 | PASS | +| C3 | Child agent completed and result was observable | Child completed after steering: STEERED-CHILD | PASS | + +## Trigger Path D: MCP Tool Call +| Step | Expected | Actual | Status | +|---|---|---|---| +| D1 | MCP created, configured, and provisioned an agent | agent_id=019d1fc5-19ab-7043-9830-8b10b06e0d44 | PASS | +| D2 | 
MCP-observed agent reached Completed | MCP path ok | PASS | +| D3 | MCP result matched expected output | MCP path ok | PASS | + +## Trigger Path E: Cron Job +| Step | Expected | Actual | Status | +|---|---|---|---| +| E1 | CronJob entity created | cron_id=019d1fc5-1b06-71b1-bac0-ca53d64e2d5f | PASS | +| E2 | Cron job activated | status=Active | PASS | +| E3 | Manual Trigger action executed | last_agent_id=019d1fc5-1b28-7461-b862-709e30b2b274 | PASS | +| E4 | Cron-triggered TemperAgent was created | agent_id=019d1fc5-1b28-7461-b862-709e30b2b274 | PASS | +| E5 | CronJob tracked last_agent_id | LastAgentId=019d1fc5-1b28-7461-b862-709e30b2b274 | PASS | +| E6 | Second trigger incremented run_count | RunCount=2 | PASS | + +## Subagent + Coding Agent Verification +| Step | Expected | Actual | Status | +|---|---|---|---| +| S1 | Parent agent created with spawn_agent in tools | tools_enabled includes spawn_agent | PASS | +| S2 | Parent invoked spawn_agent | child id present in parent session | PASS | +| S3 | Child links back to parent | ParentAgentId=019d1fc5-0a03-78d0-a9d4-c410a869dd27 | PASS | +| S4 | Parent steered child agent | Child completed after steering: STEERED-CHILD | PASS | +| S5 | list_agents exposed child status | child id visible in tool result | PASS | +| S6 | Parent/child flow produced child result | Child completed after steering: STEERED-CHILD | PASS | +| S7 | Parent invoked run_coding_agent | tool result captured | PASS | +| S8 | CLI command matched expected claude-code pattern | command string present | PASS | +| S9 | agent_depth guard prevented deep recursion | guard message present | PASS | + +## Heartbeat Monitoring Verification +| Step | Expected | Actual | Status | +|---|---|---|---| +| H1 | Heartbeat test agent created with short timeout | agent_id=019d1fc5-1c79-7f01-9671-5e3c358057bf | PASS | +| H2 | Mock hang plan provisioned | provider=mock, mode=hang | PASS | +| H3 | Heartbeat monitor started and scanned | 
monitor_id=019d1fc5-2086-7c90-8536-4c33a96a7e45 | PASS | +| H4 | Stale agent transitioned to Failed | heartbeat timeout: no heartbeat observed within 300 seconds | PASS | +| H5 | SSE replay captured TimeoutFail state change | TimeoutFail present | PASS | + +## Cross-Session Memory +| Step | Expected | Actual | Status | +|---|---|---|---| +| M1 | Second agent created with same soul_id | agent_id=019d1fc5-29ab-7193-b32d-539ce4388c08 | PASS | +| M2 | Cross-session memory loaded into prompt | memory keys=user-profile, project-context, proof-direct-memory count=3 | PASS | +| M3 | Memory-aware mock response surfaced recalled knowledge | memory keys=user-profile, project-context, proof-direct-memory count=3 | PASS | + +## Compaction +| Step | Expected | Actual | Status | +|---|---|---|---| +| X1 | Compaction entry was written into the session tree | compaction entry present | PASS | +| X2 | Agent resumed after compaction | [Previous conversation summary]
## Goal
Preserve the active task.

## Constraints & Preferences
Stay within the current workspace and existing agent context.

## Progress
- Done: Earlier conversation was compacted.
- In Progress: Continue the active task with the remaining context.
- Blocked: None.

## Key Decisions
Use the deterministic mock compaction path when no real model is configured.

## Next Steps
Resume the agent loop after compaction.

## Critical Context
## user
{"notes": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | PASS | + +## Artifacts + +### Session Tree Dump +```jsonl +{"id":"h-019d1fc4-f16f-7452-9119-79ae692dc5ae","parentId":null,"tokens":0,"type":"header","version":1} +{"content":"{\"mock_plan\":{\"steps\":[{\"text\":\"Starting direct path\",\"tool_calls\":[{\"name\":\"bash\",\"input\":{\"command\":\"sleep 2 && printf direct-path-bash\",\"workdir\":\"/Users/seshendranalla/Development/temper-pi-agent-rewrite/.tmp/temper-agent-proof/sandbox\"}}]},{\"final_text\":\"Waiting for steering check.\"},{\"text\":\"Steering applied: {{latest_user}}\",\"tool_calls\":[{\"name\":\"save_memory\",\"input\":{\"key\":\"proof-direct-memory\",\"content\":\"saved from direct path\",\"memory_type\":\"project\"}}]},{\"final_text\":\"Direct path finished with memory keys {{memory_keys}}.\"}]}}","id":"u-019d1fc4-f16f-7452-9119-79ae692dc5ae-0","parentId":"h-019d1fc4-f16f-7452-9119-79ae692dc5ae","role":"user","tokens":135,"type":"message"} +{"content":[{"text":"Starting direct path","type":"text"},{"id":"mock-tool-0-0","input":{"command":"sleep 2 && printf direct-path-bash","workdir":"/Users/seshendranalla/Development/temper-pi-agent-rewrite/.tmp/temper-agent-proof/sandbox"},"name":"bash","type":"tool_use"}],"id":"a-2","parentId":"u-019d1fc4-f16f-7452-9119-79ae692dc5ae-0","role":"assistant","tokens":257,"type":"message"} 
+{"content":[{"content":"direct-path-bash","is_error":false,"tool_use_id":"mock-tool-0-0","type":"tool_result"}],"id":"t-3","parentId":"a-2","role":"user","tokens":25,"type":"message"} +{"content":[{"text":"Waiting for steering check.","type":"text"}],"id":"a-4","parentId":"t-3","role":"assistant","tokens":27,"type":"message"} +{"content":"Follow the steering marker ST-123","id":"s-5","parentId":"a-4","role":"user","tokens":8,"type":"steering"} +{"content":[{"text":"Steering applied: Follow the steering marker ST-123","type":"text"},{"id":"mock-tool-2-0","input":{"content":"saved from direct path","key":"proof-direct-memory","memory_type":"project"},"name":"save_memory","type":"tool_use"}],"id":"a-6","parentId":"s-5","role":"assistant","tokens":237,"type":"message"} +{"content":[{"content":"Memory saved: key=proof-direct-memory, type=project","is_error":false,"tool_use_id":"mock-tool-2-0","type":"tool_result"}],"id":"t-7","parentId":"a-6","role":"user","tokens":33,"type":"message"} +{"content":[{"text":"Direct path finished with memory keys user-profile, project-context, proof-direct-memory.","type":"text"}],"id":"a-8","parentId":"t-7","role":"assistant","tokens":89,"type":"message"} +``` + +### SSE Events Captured +```text +event: state_change +data: {"seq":1,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Created","status":"Created","tenant":"temper-agent-proof-20260324121451"} + +event: state_change +data: {"seq":2,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Configure","status":"Created","tenant":"temper-agent-proof-20260324121451"} + +event: state_change +data: {"seq":3,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Provision","status":"Provisioning","tenant":"temper-agent-proof-20260324121451"} + +event: integration_start +data: {"seq":4,"integration":"provision_sandbox","module":"sandbox_provisioner","trigger_action":"Provision"} + 
+event: integration_complete +data: {"seq":5,"integration":"provision_sandbox","module":"sandbox_provisioner","trigger_action":"Provision","result":"success","callback_action":"SandboxReady","duration_ms":285} + +event: state_change +data: {"seq":6,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"SandboxReady","status":"Thinking","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: integration_start +data: {"seq":7,"integration":"call_llm","module":"llm_caller","trigger_action":"SandboxReady"} + +event: prompt_assembled +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":8,"kind":"prompt_assembled","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"system prompt assembled","timestamp":"2026-03-24T12:14:54.161741+00:00","data":{"kind":"prompt_assembled","message":"system prompt assembled","system_prompt":"# Proof Soul\n\n## Identity\nYou are Proof Soul, a governed Temper agent used to verify the Pi architecture rewrite.\n\n## Instructions\n- Prefer deterministic mock runs for verification.\n- Surface memory and skills in the prompt.\n- Use tools only when the proof plan requires them.\n\n## Capabilities\n- Run sandbox tools\n- Spawn governed child agents\n- Save and recall memories\n\n## Constraints\n- Do not use destructive commands.\n- Stay inside the provided workspace.\n\n\nOverride: include the DIRECT-OVERRIDE marker.\n\n\n \n \n\n\n\n \n The proof user prefers exact verification over discussion.\n \n \n Temper Pi rewrite proof must capture SSE, session trees, cron, heartbeat, channels, and MCP.\n \n"}} + +event: state_change +data: {"seq":9,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Heartbeat","status":"Thinking","tenant":"temper-agent-proof-20260324121451"} + +event: llm_request_started +data: 
{"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":10,"kind":"llm_request_started","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"calling provider=mock model=mock-proof","timestamp":"2026-03-24T12:14:54.174224+00:00","data":{"kind":"llm_request_started","message":"calling provider=mock model=mock-proof"}} + +event: llm_response +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":11,"kind":"llm_response","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"provider returned stop_reason=tool_use","timestamp":"2026-03-24T12:14:54.174579+00:00","data":{"kind":"llm_response","message":"provider returned stop_reason=tool_use","stop_reason":"tool_use"}} + +event: integration_complete +data: {"seq":12,"integration":"call_llm","module":"llm_caller","trigger_action":"SandboxReady","result":"success","callback_action":"ProcessToolCalls","duration_ms":70} + +event: state_change +data: {"seq":13,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"ProcessToolCalls","status":"Executing","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: integration_start +data: {"seq":14,"integration":"run_tools","module":"tool_runner","trigger_action":"ProcessToolCalls"} + +event: state_change +data: {"seq":15,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Heartbeat","status":"Executing","tenant":"temper-agent-proof-20260324121451"} + +event: tool_execution_start +data: 
{"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":16,"kind":"tool_execution_start","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":"mock-tool-0-0","tool_name":"bash","task_id":null,"message":"executing tool bash","timestamp":"2026-03-24T12:14:54.235977+00:00","data":{"kind":"tool_execution_start","message":"executing tool bash","tool_call_id":"mock-tool-0-0","tool_name":"bash"}} + +event: state_change +data: {"seq":17,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Steer","status":"Executing","tenant":"temper-agent-proof-20260324121451"} + +event: state_change +data: {"seq":18,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Heartbeat","status":"Executing","tenant":"temper-agent-proof-20260324121451"} + +event: tool_execution_complete +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":19,"kind":"tool_execution_complete","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":"mock-tool-0-0","tool_name":"bash","task_id":null,"message":"completed tool bash","timestamp":"2026-03-24T12:14:56.350986+00:00","data":{"is_error":false,"kind":"tool_execution_complete","message":"completed tool bash","tool_call_id":"mock-tool-0-0","tool_name":"bash"}} + +event: integration_complete +data: {"seq":20,"integration":"run_tools","module":"tool_runner","trigger_action":"ProcessToolCalls","result":"success","callback_action":"HandleToolResults","duration_ms":2203} + +event: state_change +data: {"seq":21,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"HandleToolResults","status":"Thinking","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: integration_start +data: 
{"seq":22,"integration":"call_llm","module":"llm_caller","trigger_action":"HandleToolResults"} + +event: prompt_assembled +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":23,"kind":"prompt_assembled","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"system prompt assembled","timestamp":"2026-03-24T12:14:56.454475+00:00","data":{"kind":"prompt_assembled","message":"system prompt assembled","system_prompt":"# Proof Soul\n\n## Identity\nYou are Proof Soul, a governed Temper agent used to verify the Pi architecture rewrite.\n\n## Instructions\n- Prefer deterministic mock runs for verification.\n- Surface memory and skills in the prompt.\n- Use tools only when the proof plan requires them.\n\n## Capabilities\n- Run sandbox tools\n- Spawn governed child agents\n- Save and recall memories\n\n## Constraints\n- Do not use destructive commands.\n- Stay inside the provided workspace.\n\n\nOverride: include the DIRECT-OVERRIDE marker.\n\n\n \n \n\n\n\n \n The proof user prefers exact verification over discussion.\n \n \n Temper Pi rewrite proof must capture SSE, session trees, cron, heartbeat, channels, and MCP.\n \n"}} + +event: state_change +data: {"seq":24,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Heartbeat","status":"Thinking","tenant":"temper-agent-proof-20260324121451"} + +event: llm_request_started +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":25,"kind":"llm_request_started","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"calling provider=mock model=mock-proof","timestamp":"2026-03-24T12:14:56.464181+00:00","data":{"kind":"llm_request_started","message":"calling provider=mock model=mock-proof"}} + +event: 
llm_response +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":26,"kind":"llm_response","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"provider returned stop_reason=end_turn","timestamp":"2026-03-24T12:14:56.464566+00:00","data":{"kind":"llm_response","message":"provider returned stop_reason=end_turn","stop_reason":"end_turn"}} + +event: integration_complete +data: {"seq":27,"integration":"call_llm","module":"llm_caller","trigger_action":"HandleToolResults","result":"success","callback_action":"CheckSteering","duration_ms":141} + +event: state_change +data: {"seq":28,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"CheckSteering","status":"Steering","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: integration_start +data: {"seq":29,"integration":"check_steering","module":"steering_checker","trigger_action":"CheckSteering"} + +event: integration_complete +data: {"seq":30,"integration":"check_steering","module":"steering_checker","trigger_action":"CheckSteering","result":"success","callback_action":"ContinueWithSteering","duration_ms":33} + +event: state_change +data: {"seq":31,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"ContinueWithSteering","status":"Thinking","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: integration_start +data: {"seq":32,"integration":"call_llm","module":"llm_caller","trigger_action":"ContinueWithSteering"} + +event: prompt_assembled +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":33,"kind":"prompt_assembled","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"system prompt 
assembled","timestamp":"2026-03-24T12:14:56.657477+00:00","data":{"kind":"prompt_assembled","message":"system prompt assembled","system_prompt":"# Proof Soul\n\n## Identity\nYou are Proof Soul, a governed Temper agent used to verify the Pi architecture rewrite.\n\n## Instructions\n- Prefer deterministic mock runs for verification.\n- Surface memory and skills in the prompt.\n- Use tools only when the proof plan requires them.\n\n## Capabilities\n- Run sandbox tools\n- Spawn governed child agents\n- Save and recall memories\n\n## Constraints\n- Do not use destructive commands.\n- Stay inside the provided workspace.\n\n\nOverride: include the DIRECT-OVERRIDE marker.\n\n\n \n \n\n\n\n \n The proof user prefers exact verification over discussion.\n \n \n Temper Pi rewrite proof must capture SSE, session trees, cron, heartbeat, channels, and MCP.\n \n"}} + +event: state_change +data: {"seq":34,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Heartbeat","status":"Thinking","tenant":"temper-agent-proof-20260324121451"} + +event: llm_request_started +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":35,"kind":"llm_request_started","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"calling provider=mock model=mock-proof","timestamp":"2026-03-24T12:14:56.668361+00:00","data":{"kind":"llm_request_started","message":"calling provider=mock model=mock-proof"}} + +event: llm_response +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":36,"kind":"llm_response","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"provider returned 
stop_reason=tool_use","timestamp":"2026-03-24T12:14:56.668770+00:00","data":{"kind":"llm_response","message":"provider returned stop_reason=tool_use","stop_reason":"tool_use"}} + +event: integration_complete +data: {"seq":37,"integration":"call_llm","module":"llm_caller","trigger_action":"ContinueWithSteering","result":"success","callback_action":"ProcessToolCalls","duration_ms":84} + +event: state_change +data: {"seq":38,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"ProcessToolCalls","status":"Executing","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: integration_start +data: {"seq":39,"integration":"run_tools","module":"tool_runner","trigger_action":"ProcessToolCalls"} + +event: state_change +data: {"seq":40,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Heartbeat","status":"Executing","tenant":"temper-agent-proof-20260324121451"} + +event: tool_execution_start +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":41,"kind":"tool_execution_start","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":"mock-tool-2-0","tool_name":"save_memory","task_id":null,"message":"executing tool save_memory","timestamp":"2026-03-24T12:14:56.745453+00:00","data":{"kind":"tool_execution_start","message":"executing tool save_memory","tool_call_id":"mock-tool-2-0","tool_name":"save_memory"}} + +event: state_change +data: {"seq":42,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Heartbeat","status":"Executing","tenant":"temper-agent-proof-20260324121451"} + +event: tool_execution_complete +data: 
{"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":43,"kind":"tool_execution_complete","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":"mock-tool-2-0","tool_name":"save_memory","task_id":null,"message":"completed tool save_memory","timestamp":"2026-03-24T12:14:56.777472+00:00","data":{"is_error":false,"kind":"tool_execution_complete","message":"completed tool save_memory","tool_call_id":"mock-tool-2-0","tool_name":"save_memory"}} + +event: integration_complete +data: {"seq":44,"integration":"run_tools","module":"tool_runner","trigger_action":"ProcessToolCalls","result":"success","callback_action":"HandleToolResults","duration_ms":106} + +event: state_change +data: {"seq":45,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"HandleToolResults","status":"Thinking","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: integration_start +data: {"seq":46,"integration":"call_llm","module":"llm_caller","trigger_action":"HandleToolResults"} + +event: prompt_assembled +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":47,"kind":"prompt_assembled","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"system prompt assembled","timestamp":"2026-03-24T12:14:56.876983+00:00","data":{"kind":"prompt_assembled","message":"system prompt assembled","system_prompt":"# Proof Soul\n\n## Identity\nYou are Proof Soul, a governed Temper agent used to verify the Pi architecture rewrite.\n\n## Instructions\n- Prefer deterministic mock runs for verification.\n- Surface memory and skills in the prompt.\n- Use tools only when the proof plan requires them.\n\n## Capabilities\n- Run sandbox tools\n- Spawn governed child agents\n- Save and recall memories\n\n## Constraints\n- Do not use 
destructive commands.\n- Stay inside the provided workspace.\n\n\nOverride: include the DIRECT-OVERRIDE marker.\n\n\n \n \n\n\n\n \n The proof user prefers exact verification over discussion.\n \n \n Temper Pi rewrite proof must capture SSE, session trees, cron, heartbeat, channels, and MCP.\n \n \n saved from direct path\n \n"}} + +event: state_change +data: {"seq":48,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"Heartbeat","status":"Thinking","tenant":"temper-agent-proof-20260324121451"} + +event: llm_request_started +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":49,"kind":"llm_request_started","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"calling provider=mock model=mock-proof","timestamp":"2026-03-24T12:14:56.886359+00:00","data":{"kind":"llm_request_started","message":"calling provider=mock model=mock-proof"}} + +event: llm_response +data: {"tenant":"temper-agent-proof-20260324121451","entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","seq":50,"kind":"llm_response","agent_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","tool_call_id":null,"tool_name":"llm_caller","task_id":null,"message":"provider returned stop_reason=end_turn","timestamp":"2026-03-24T12:14:56.886868+00:00","data":{"kind":"llm_response","message":"provider returned stop_reason=end_turn","stop_reason":"end_turn"}} + +event: integration_complete +data: {"seq":51,"integration":"call_llm","module":"llm_caller","trigger_action":"HandleToolResults","result":"success","callback_action":"CheckSteering","duration_ms":69} + +event: state_change +data: {"seq":52,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"CheckSteering","status":"Steering","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: integration_start 
+data: {"seq":53,"integration":"check_steering","module":"steering_checker","trigger_action":"CheckSteering"} + +event: integration_complete +data: {"seq":54,"integration":"check_steering","module":"steering_checker","trigger_action":"CheckSteering","result":"success","callback_action":"FinalizeResult","duration_ms":11} + +event: state_change +data: {"seq":55,"entity_type":"TemperAgent","entity_id":"019d1fc4-f16f-7452-9119-79ae692dc5ae","action":"FinalizeResult","status":"Completed","tenant":"temper-agent-proof-20260324121451","agent_id":"system"} + +event: agent_complete +data: {"seq":56,"status":"Completed","action":"FinalizeResult","result":"Direct path finished with memory keys user-profile, project-context, proof-direct-memory.","error_message":null,"agent_id":"system","session_id":null} + + +``` + +### OTS Trajectory Summary +```json +{ + "total": 837, + "success_count": 796, + "error_count": 41, + "success_rate": 0.951015531660693, + "by_action": { + "Activate": { + "total": 5, + "success": 5, + "error": 0 + }, + "CheckSteering": { + "total": 79, + "success": 79, + "error": 0 + }, + "CompactionComplete": { + "total": 4, + "success": 4, + "error": 0 + }, + "Configure": { + "total": 80, + "success": 75, + "error": 5 + }, + "Connect": { + "total": 14, + "success": 14, + "error": 0 + }, + "ContinueWithSteering": { + "total": 18, + "success": 18, + "error": 0 + }, + "Create": { + "total": 11, + "success": 11, + "error": 0 + }, + "CreateGovernanceDecision": { + "total": 38, + "success": 38, + "error": 0 + }, + "Fail": { + "total": 13, + "success": 10, + "error": 3 + }, + "FinalizeResult": { + "total": 61, + "success": 61, + "error": 0 + }, + "HandleToolResults": { + "total": 62, + "success": 62, + "error": 0 + }, + "Heartbeat": { + "total": 283, + "success": 259, + "error": 24 + }, + "NeedsCompaction": { + "total": 4, + "success": 4, + "error": 0 + }, + "ProcessToolCalls": { + "total": 62, + "success": 62, + "error": 0 + }, + "Provision": { + "total": 79, + 
"success": 75, + "error": 4 + }, + "Publish": { + "total": 14, + "success": 14, + "error": 0 + }, + "Ready": { + "total": 14, + "success": 14, + "error": 0 + }, + "ReceiveMessage": { + "total": 14, + "success": 14, + "error": 0 + }, + "ReplyDelivered": { + "total": 11, + "success": 11, + "error": 0 + }, + "RouteFailed": { + "total": 3, + "success": 3, + "error": 0 + }, + "SandboxReady": { + "total": 65, + "success": 65, + "error": 0 + }, + "Save": { + "total": 14, + "success": 11, + "error": 3 + }, + "ScanComplete": { + "total": 4, + "success": 4, + "error": 0 + }, + "ScheduleFailed": { + "total": 3, + "success": 3, + "error": 0 + }, + "SendReply": { + "total": 11, + "success": 11, + "error": 0 + }, + "Start": { + "total": 4, + "success": 4, + "error": 0 + }, + "Steer": { + "total": 23, + "success": 18, + "error": 5 + }, + "StreamUpdated": { + "total": 666, + "success": 666, + "error": 0 + }, + "TimeoutFail": { + "total": 4, + "success": 4, + "error": 0 + }, + "Trigger": { + "total": 9, + "success": 9, + "error": 0 + }, + "TriggerComplete": { + "total": 9, + "success": 9, + "error": 0 + }, + "manage_policies": { + "total": 2, + "success": 0, + "error": 2 + } + }, + "failed_intents": [ + { + "tenant": "temper-agent-proof-20260324053709", + "entity_type": "TemperAgent", + "entity_id": "019d1e58-f3e7-7932-82cd-88058cbfb00b", + "action": "Fail", + "success": false, + "from_status": "Thinking", + "to_status": "Failed", + "error": "Action 'Fail' not valid from state 'Failed'", + "agent_id": "system", + "session_id": null, + "authz_denied": null, + "denied_resource": null, + "denied_module": null, + "source": "Entity", + "spec_governed": null, + "created_at": "2026-03-24T05:37:29.553394+00:00", + "request_body": "{\"error\":\"mock hang scenario finished without heartbeat\",\"error_message\":\"mock hang scenario finished without heartbeat\",\"integration\":\"call_llm\"}", + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324053613", + "entity_type": 
"TemperAgent", + "entity_id": "019d1e58-1c78-7fe2-a021-c9c6e1abc8bc", + "action": "Fail", + "success": false, + "from_status": "Thinking", + "to_status": "Failed", + "error": "Action 'Fail' not valid from state 'Failed'", + "agent_id": "system", + "session_id": null, + "authz_denied": null, + "denied_resource": null, + "denied_module": null, + "source": "Entity", + "spec_governed": null, + "created_at": "2026-03-24T05:36:34.382826+00:00", + "request_body": "{\"error\":\"mock hang scenario finished without heartbeat\",\"error_message\":\"mock hang scenario finished without heartbeat\",\"integration\":\"call_llm\"}", + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324052805", + "entity_type": "TemperAgent", + "entity_id": "019d1e51-9a6a-7422-80f0-54e98068cf18", + "action": "Fail", + "success": false, + "from_status": "Thinking", + "to_status": "Failed", + "error": "Action 'Fail' not valid from state 'Failed'", + "agent_id": "system", + "session_id": null, + "authz_denied": null, + "denied_resource": null, + "denied_module": null, + "source": "Entity", + "spec_governed": null, + "created_at": "2026-03-24T05:29:27.904381+00:00", + "request_body": "{\"error\":\"mock hang scenario finished without heartbeat\",\"error_message\":\"mock hang scenario finished without heartbeat\",\"integration\":\"call_llm\"}", + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324052805", + "entity_type": "TemperAgent", + "entity_id": "019d1e50-ae5f-79d3-ac83-cad10d32daff", + "action": "Provision", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": "proof-1774330097", + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e50-ae5f-79d3-ac83-cad10d32daff", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:28:17.278573+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": 
"temper-agent-proof-20260324052805", + "entity_type": "TemperAgent", + "entity_id": "019d1e50-ae5f-79d3-ac83-cad10d32daff", + "action": "Configure", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": "proof-1774330097", + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e50-ae5f-79d3-ac83-cad10d32daff", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:28:17.264156+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324052055", + "entity_type": "TemperAgent", + "entity_id": "019d1e4a-1ae5-7cb1-b72a-14a0fa822293", + "action": "Provision", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": "proof-1774329666", + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e4a-1ae5-7cb1-b72a-14a0fa822293", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:21:06.312690+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324052055", + "entity_type": "TemperAgent", + "entity_id": "019d1e4a-1ae5-7cb1-b72a-14a0fa822293", + "action": "Configure", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": "proof-1774329666", + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e4a-1ae5-7cb1-b72a-14a0fa822293", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:21:06.296204+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324052055", + "entity_type": "TemperAgent", + "entity_id": "proof-sub-child", + "action": "Steer", + "success": false, + "from_status": "", + "to_status": "Created", 
+ "error": "Action 'Steer' not valid from state 'Created'", + "agent_id": null, + "session_id": null, + "authz_denied": null, + "denied_resource": null, + "denied_module": null, + "source": "Entity", + "spec_governed": null, + "created_at": "2026-03-24T05:21:03.050500+00:00", + "request_body": "{\"steering_messages\":\"[{\\\"content\\\":\\\"STEERED-CHILD\\\"}]\"}", + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324051844", + "entity_type": "TemperAgent", + "entity_id": "019d1e49-05ad-7750-93b9-c1495350f029", + "action": "Provision", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": "proof-1774329595", + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e49-05ad-7750-93b9-c1495350f029", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:19:55.350968+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324051844", + "entity_type": "TemperAgent", + "entity_id": "019d1e49-05ad-7750-93b9-c1495350f029", + "action": "Configure", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": "proof-1774329595", + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e49-05ad-7750-93b9-c1495350f029", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:19:55.329985+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324051844", + "entity_type": "TemperAgent", + "entity_id": "019d1e48-1ac0-7d53-b740-578b822ded2d", + "action": "Provision", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": "proof-1774329535", + "authz_denied": true, + "denied_resource": 
"TemperAgent:019d1e48-1ac0-7d53-b740-578b822ded2d", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:18:55.204184+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324051844", + "entity_type": "TemperAgent", + "entity_id": "019d1e48-1ac0-7d53-b740-578b822ded2d", + "action": "Configure", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": "proof-1774329535", + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e48-1ac0-7d53-b740-578b822ded2d", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:18:55.187204+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324051844", + "entity_type": "TemperAgent", + "entity_id": "proof-sub-child", + "action": "Steer", + "success": false, + "from_status": "", + "to_status": "Created", + "error": "Action 'Steer' not valid from state 'Created'", + "agent_id": null, + "session_id": null, + "authz_denied": null, + "denied_resource": null, + "denied_module": null, + "source": "Entity", + "spec_governed": null, + "created_at": "2026-03-24T05:18:52.071697+00:00", + "request_body": "{\"steering_messages\":\"[{\\\"content\\\":\\\"STEERED-CHILD\\\"}]\"}", + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324051726", + "entity_type": "TemperAgent", + "entity_id": "proof-sub-child", + "action": "Steer", + "success": false, + "from_status": "", + "to_status": "Created", + "error": "Action 'Steer' not valid from state 'Created'", + "agent_id": null, + "session_id": null, + "authz_denied": null, + "denied_resource": null, + "denied_module": null, + "source": "Entity", + "spec_governed": null, + "created_at": "2026-03-24T05:17:32.808465+00:00", + "request_body": "{\"steering_messages\":\"[{\\\"content\\\":\\\"STEERED-CHILD\\\"}]\"}", + 
"intent": null + }, + { + "tenant": "temper-agent-proof-20260324051552", + "entity_type": "TemperAgent", + "entity_id": "proof-sub-child", + "action": "Steer", + "success": false, + "from_status": "", + "to_status": "Created", + "error": "Action 'Steer' not valid from state 'Created'", + "agent_id": null, + "session_id": null, + "authz_denied": null, + "denied_resource": null, + "denied_module": null, + "source": "Entity", + "spec_governed": null, + "created_at": "2026-03-24T05:15:58.536092+00:00", + "request_body": "{\"steering_messages\":\"[{\\\"content\\\":\\\"STEERED-CHILD\\\"}]\"}", + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324051424", + "entity_type": "TemperAgent", + "entity_id": "proof-sub-child", + "action": "Steer", + "success": false, + "from_status": "", + "to_status": "Created", + "error": "Action 'Steer' not valid from state 'Created'", + "agent_id": null, + "session_id": null, + "authz_denied": null, + "denied_resource": null, + "denied_module": null, + "source": "Entity", + "spec_governed": null, + "created_at": "2026-03-24T05:14:31.455917+00:00", + "request_body": "{\"steering_messages\":\"[{\\\"content\\\":\\\"STEERED-CHILD\\\"}]\"}", + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324050057", + "entity_type": "TemperAgent", + "entity_id": "019d1e37-c5fb-7c90-9e72-fd2614d747bc", + "action": "Configure", + "success": false, + "from_status": "Created", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": null, + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e37-c5fb-7c90-9e72-fd2614d747bc", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:01:04.909880+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324050057", + "entity_type": "TemperAgent", + "entity_id": "019d1e37-b196-74d0-aa86-2adabd01af1d", + "action": "Heartbeat", + "success": false, + 
"from_status": "Thinking", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": null, + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e37-b196-74d0-aa86-2adabd01af1d", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:01:02.456299+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324050057", + "entity_type": "TemperAgent", + "entity_id": "019d1e37-b196-74d0-aa86-2adabd01af1d", + "action": "Heartbeat", + "success": false, + "from_status": "Executing", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": null, + "authz_denied": true, + "denied_resource": "TemperAgent:019d1e37-b196-74d0-aa86-2adabd01af1d", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:01:02.305524+00:00", + "request_body": null, + "intent": null + }, + { + "tenant": "temper-agent-proof-20260324050057", + "entity_type": "AgentMemory", + "entity_id": "019d1e37-bbbe-7f51-b4a0-ea126e832d58", + "action": "Save", + "success": false, + "from_status": "Active", + "to_status": null, + "error": "no matching permit policy", + "agent_id": "anonymous", + "session_id": null, + "authz_denied": true, + "denied_resource": "AgentMemory:019d1e37-bbbe-7f51-b4a0-ea126e832d58", + "denied_module": null, + "source": "Authz", + "spec_governed": null, + "created_at": "2026-03-24T05:01:02.290204+00:00", + "request_body": null, + "intent": null + } + ] +} +``` + +### System Prompt Assembly +```text +# Proof Soul + +## Identity +You are Proof Soul, a governed Temper agent used to verify the Pi architecture rewrite. + +## Instructions +- Prefer deterministic mock runs for verification. +- Surface memory and skills in the prompt. +- Use tools only when the proof plan requires them. 
+ +## Capabilities +- Run sandbox tools +- Spawn governed child agents +- Save and recall memories + +## Constraints +- Do not use destructive commands. +- Stay inside the provided workspace. + + +Override: include the DIRECT-OVERRIDE marker. + + + + + + + + + The proof user prefers exact verification over discussion. + + + Temper Pi rewrite proof must capture SSE, session trees, cron, heartbeat, channels, and MCP. + + +``` + +## Current Limitations +- None observed in the proof run. + +## Post-Proof Code Review Fixes + +The following issues were identified by code review and fixed after the initial proof run: + +### Fix 1: Extract duplicate TemperFS helpers into `wasm-helpers` crate +- **Issue**: `resolve_temper_api_url`, `read_session_from_temperfs`, `write_session_to_temperfs`, `entity_field_str` were duplicated across steering_checker, context_compactor, heartbeat_scan, cron_scheduler_check, and cron_trigger. +- **Fix**: Created `os-apps/temper-agent/wasm/wasm-helpers/` shared library crate with 6 unit tests. Updated all 5 modules to import from `wasm_helpers::*` instead of duplicating. + +### Fix 2: Server-side filtering in route_message +- **Issue**: `find_active_session` fetched ALL ChannelSessions then filtered in WASM memory — O(n) scan on every message. +- **Fix**: Added `$filter=Status eq 'Active' and ChannelId eq '{channel_id}' and ThreadId eq '{thread_id}'` to the OData query, letting the server filter. + +### Fix 3: Real timestamp comparison in heartbeat_scan +- **Issue**: Agents with a non-empty `last_heartbeat_at` were only logged, never compared against the timeout. Only agents with no heartbeat at all were timed out. +- **Fix**: Added `parse_iso8601_to_epoch_secs` to `wasm-helpers` and updated heartbeat_scan to compare `now - last_heartbeat > timeout_secs`. Reference time comes from `last_scan_at` on the HeartbeatMonitor entity. 
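The corrected heartbeat comparison can be sketched in Rust. `parse_iso8601_to_epoch_secs` is the helper named in the fix; everything else here — the `is_timed_out` wrapper, the UTC-only parsing, and the civil-date arithmetic — is an illustrative assumption, not the crate's actual code.

```rust
// Days since 1970-01-01 for a civil date (Howard Hinnant's algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = if y >= 0 { y } else { y - 399 } / 400;
    let yoe = y - era * 400;
    let doy = (153 * (if m > 2 { m - 3 } else { m + 9 }) + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146097 + doe - 719468
}

// Parses the "YYYY-MM-DDTHH:MM:SS[.frac][Z|+00:00]" subset seen in the
// event log above. Sketch only: assumes a UTC offset, ASCII input.
fn parse_iso8601_to_epoch_secs(ts: &str) -> Option<i64> {
    let b = ts.as_bytes();
    if b.len() < 19 || b[4] != b'-' || b[7] != b'-' || b[10] != b'T' {
        return None;
    }
    let num = |s: &str| s.parse::<i64>().ok();
    let (y, m, d) = (num(&ts[0..4])?, num(&ts[5..7])?, num(&ts[8..10])?);
    let (h, mi, s) = (num(&ts[11..13])?, num(&ts[14..16])?, num(&ts[17..19])?);
    Some(days_from_civil(y, m, d) * 86400 + h * 3600 + mi * 60 + s)
}

// Post-fix rule: an agent times out if it has never heartbeated OR its
// last heartbeat is older than the timeout. (Pre-fix, only the "never
// heartbeated" case was handled; stale heartbeats were merely logged.)
fn is_timed_out(last_heartbeat: Option<&str>, now_secs: i64, timeout_secs: i64) -> bool {
    match last_heartbeat.and_then(parse_iso8601_to_epoch_secs) {
        Some(t) => now_secs - t > timeout_secs,
        None => true,
    }
}

fn main() {
    // `now_secs` would come from `last_scan_at` on the HeartbeatMonitor entity.
    let scan_at = parse_iso8601_to_epoch_secs("2026-03-24T12:14:54+00:00").unwrap();
    println!("{}", is_timed_out(Some("2026-03-24T12:10:00+00:00"), scan_at, 60));
}
```

Anchoring "now" to the monitor's own `last_scan_at` rather than wall-clock time keeps the comparison deterministic across replayed proof runs.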
+ +### Fix 4: Allow agents to manage their own memories +- **Issue**: `memory.cedar` restricted Save/Update/Recall to `["system", "supervisor", "human"]` agent types. Regular agents (the ones that actually need memory) were denied. +- **Fix**: Added a permit rule: `principal.agent_type == "agent" && resource.SoulId == principal.soul_id` — agents can manage memories scoped to their own soul. + +## Reproduction Commands +```bash +python3 scripts/temper_agent_e2e_proof.py +cargo test --workspace +``` diff --git a/.vision/AGENT_ECOSYSTEM_RESEARCH_2026_03.md b/.vision/AGENT_ECOSYSTEM_RESEARCH_2026_03.md new file mode 100644 index 00000000..ea2f9ce1 --- /dev/null +++ b/.vision/AGENT_ECOSYSTEM_RESEARCH_2026_03.md @@ -0,0 +1,481 @@ +# Agent Ecosystem Research & Temper Vision Exploration + +**Date**: March 25, 2026 +**Context**: Deep exploration of where Temper fits in the emerging agent ecosystem, conducted from first principles with deliberate detachment from Temper's current architectural decisions. + +--- + +## Table of Contents + +1. [The Landscape Today](#the-landscape-today) +2. [Projects Reviewed](#projects-reviewed) +3. [Key Research Findings](#key-research-findings) +4. [First-Principles Assessment of Temper's Bets](#first-principles-assessment-of-tempers-bets) +5. [The Behavior-First Vision](#the-behavior-first-vision) +6. [The Reusability Layers](#the-reusability-layers) +7. [The "What's the Reusable Unit?" Question](#whats-the-reusable-unit) +8. [The Observation Position Problem](#the-observation-position-problem) +9. [Open Questions](#open-questions) +10. 
[Strategic Implications](#strategic-implications) + +--- + +## The Landscape Today + +### Execution Is Commoditizing + +Sandboxed execution for agents is a solved/solving problem with 5+ providers racing to the bottom: + +- **E2B**: Firecracker microVMs, ~half the Fortune 500, the incumbent +- **Daytona**: Pivoted to agent infra, $24M Series A (Feb 2026), sub-90ms cold starts, $1M ARR in <3 months +- **Cloudflare Dynamic Workers**: V8 isolates, 100x faster than containers, $0.002/worker/day, millions of agents per-user +- **Modal**: gVisor-based, GPU-first +- **Others**: Northflank, Koyeb, Vercel — all shipping sandbox features + +**Signal**: Stateful, long-running agents with snapshot/fork/resume are the new frontier. Stateless sandboxes are table stakes. + +### MCP Is the Integration Standard + +- All major AI providers support MCP (OpenAI, Anthropic, Google, Amazon, Microsoft) +- 34,700+ dependent projects on the TypeScript SDK +- `.well-known/mcp/server-card.json` (SEP-1649) will make MCP servers discoverable network services +- Streamable HTTP unlocked remote MCP servers + +### A2A Is Emerging for Agent-to-Agent + +- Google's Agent2Agent Protocol, 50+ partners (Atlassian, Salesforce, SAP, PayPal) +- Agent Cards (JSON) advertise capabilities, registry endpoints for discovery +- v0.3 added gRPC support and signed security cards +- MCP (agent-to-tool) + A2A (agent-to-agent) becoming complementary standards + +### Agent Identity Is the Biggest Unsolved Problem + +- Only 22% of teams treat agents as independent identities +- Non-human identities outnumber humans 50:1 +- 88% of organizations have confirmed or suspected security incidents involving agents +- Only 14.4% have full security approval before agents go to production +- NIST published concept paper on agent identity (Feb 2026) +- Gartner predicts 40%+ of agentic AI projects fail by 2027 due to insufficient risk controls + +### Governance Is Immature + +- Only 20% of organizations have mature governance 
models +- 82% of executives think they're protected; 14% actually are +- The model is NOT the bottleneck — integration, auth, reliability, and governance are + +### AI-Generated Code Quality Crisis + +- 45% of AI-generated code contains security vulnerabilities (Veracode) +- 80% of AI-generated applications have exploitable vulnerabilities (Stanford) +- 1.7x more issues per PR, 4x more code cloning, 8x more excessive I/O +- Silent failures: LLMs generate code that runs but removes safety checks or fakes output +- Lovable: 100,000 new projects/day. Neither Lovable nor Bolt publishes survival rates. + +### Framework Consolidation + +- Vendor SDKs (OpenAI Agents SDK, Claude Agent SDK, Google ADK) eating framework market share for new projects +- LangGraph, CrewAI specializing in complex orchestration +- Lance Martin (LangChain) rebuilt his agent system twice in 18 months — model improvements made scaffolding a bottleneck + +--- + +## Projects Reviewed + +### 1. Cloudflare Dynamic Workers + +**URL**: https://blog.cloudflare.com/dynamic-workers/ +**What**: V8 isolate-based sandboxed execution for AI-generated code at global scale. +**Key features**: Sub-ms cold starts, no concurrency limits, credential injection via globalOutbound, TypeScript API exposure (81% token reduction vs tool schemas), battle-hardened V8 security. +**Relevance**: Solves execution. Does NOT solve correctness, governance, or trust. An agent can generate a function that transfers money to the wrong account — it just does so in an isolate. + +### 2. Executor (RhysSullivan/executor) + +**URL**: https://github.com/RhysSullivan/executor +**What**: Local-first execution environment for AI agents. Control plane that mediates tool interactions. +**Key features**: Semantic tool discovery (`tools.discover({ query: "github issues" })`), managed OAuth/credentials, pause/resume for human-in-the-loop, MCP bridge, multiple sandbox runtimes (QuickJS, SES, Deno). 
+**Architecture**: CLI + HTTP server + SDK + Web UI, all on localhost. Sources: MCP servers, OpenAPI REST, GraphQL. +**Relevance**: Solves tool mediation and credential management. Positions itself as "the final form of tool calling." Does not address correctness, trust, or evolution. + +### 3. Agent Auth Protocol + +**URL**: https://agent-auth-protocol.com +**Repo**: https://github.com/better-auth/agent-auth (13 stars, created Feb 20, 2026) +**What**: Open-source auth/authz standard for AI agents. +**Key features**: Per-agent Ed25519 cryptographic identity, capability-based authorization with human approval, `/.well-known/agent-configuration` discovery, directory at agent-auth.directory. +**Flow**: Agent discovers service → registers with public key → requests capabilities → human approves → agent executes with signed JWT. +**Ships**: Better Auth server plugin, client SDK, CLI + MCP server, OpenAPI/MCP adapters. +**What it has**: Identity, authorization, discovery. +**What it lacks**: Observation (no record of what agent did), trust gradient (capabilities are binary granted/not), evolution, verification. +**Relevance**: Solves the wire protocol for agent identity and authorization. The "lock on the door, not the operating system inside the house." Complementary to a trust/observation layer. + +### 4. OpenSpace (HKUDS/OpenSpace) + +**URL**: https://github.com/HKUDS/OpenSpace (60 stars, created Mar 24, 2026) +**What**: Self-evolving skill engine that plugs into any agent via MCP. +**Key claims**: 4.2x higher income on professional tasks, 46% fewer tokens on warm runs, 165 skills autonomously evolved from 50 tasks. 
+ +**How it actually works**: +- Exposes 4 MCP tools: `execute_task`, `search_skills`, `fix_skill`, `upload_skill` +- When an agent calls `execute_task`, OpenSpace runs its **own internal GroundingAgent** (makes its own LLM calls via LiteLLM using inherited API keys) +- Records everything its own agent does: `traj.jsonl` (tool calls), `agent_actions.jsonl` (decisions), `conversations.jsonl` (LLM interactions) +- Post-execution: LLM-driven analyzer reviews recording, produces evolution suggestions +- Evolver runs another LLM agent loop to generate FIX/DERIVED/CAPTURED skill changes +- Skills stored in SQLite with full version DAG, lineage, quality metrics + +**Three evolution triggers**: +1. Post-execution analysis (after every task) +2. Tool degradation (when success rates drop, batch-evolve dependent skills) +3. Metric monitor (periodic scan of skill health metrics) + +**Cloud sharing**: REST API at open-space.cloud. Upload/download SKILL.md files + metadata. Hybrid BM25 + embedding search. **No telemetry or trajectory sharing — only skill artifacts.** + +**Critical limitation**: It does NOT observe the host agent (Claude Code, etc.). It runs its own proxy agent. When Claude Code delegates via `execute_task`, OpenSpace's internal agent does the work. The `search_skills` path (agent discovers skill and uses it directly) produces no observation, no analysis, no evolution. **Evolution only happens for tasks delegated through execute_task.** + +**The reusable unit**: SKILL.md — a markdown file with `name` and `description` frontmatter and free-form body. Natural language instructions. The most evolved skill went through 13 versions, all markdown. Most evolved skills focus on error recovery and tool reliability, not domain knowledge. + +### 5. iron-sensor (ironsh/iron-sensor) + +**URL**: https://github.com/ironsh/iron-sensor (25 stars, created Mar 23, 2026) +**What**: eBPF-based behavioral monitor for AI coding agents. 
Sits in the Linux kernel and watches what agents actually do. +**How**: 6 kernel tracepoints (process exec/fork/exit, file open, permission changes). Detects agents by executable name, propagates tracking to entire process subtree. Emits structured NDJSON events classified by severity. +**Detects**: Privilege escalation, SSH key access, cron writes, systemd modifications, network tool usage, Docker socket access, sensitive file access. +**Agent detection**: Claude Code (`argv[0]` = `claude`), OpenClaw (`openclaw-gateway`), Codex (python3 + codex in argv). +**Key insight**: Proves you can observe agents **without being inside them** — from the kernel. Agent doesn't know it's being watched. Near-zero overhead. +**Limitation**: Observes syscalls (low-level security), not semantic behavior (high-level capabilities). Sees "agent read /etc/shadow" not "agent successfully built a dashboard." +**Relevance**: The observation layer at the lowest altitude. Built for security, not evolution. But the architectural pattern (observe from below, classify, emit structured events) is exactly what a trust layer needs. + +### 6. Pydantic AI Capabilities (v1.71.0) + +**URL**: https://github.com/pydantic/pydantic-ai/releases/tag/v1.71.0 (released Mar 24, 2026) +**What**: Composable, reusable units of agent behavior that bundle tools, lifecycle hooks, instructions, and model settings into a single class. + +**Architecture**: +- Subclass `AbstractCapability[T]`, override needed methods +- Configuration: `get_toolset()`, `get_builtin_tools()`, `get_instructions()`, `get_model_settings()` +- Lifecycle hooks at every level: run, node, model request, tool validation, tool execution +- Before/after/wrap/error variants at each level +- `wrap_*` hooks give middleware-style control (intercept, transform, retry, skip) + +**Composability**: Multiple capabilities compose automatically. Before hooks in order, after hooks in reverse, wrap hooks nest as middleware. 
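The hook-ordering rule described here (before hooks in order, after hooks in reverse, wrap hooks nesting like middleware) is the standard middleware pattern. A generic Python sketch of that composition rule follows; this is illustrative only, not Pydantic AI's actual API:

```python
def compose(wrap_hooks, handler):
    # The first hook in the list becomes the outermost wrapper, so its
    # "before" side runs first and its "after" side runs last.
    for hook in reversed(wrap_hooks):
        handler = (lambda h, inner: lambda x: h(x, inner))(hook, handler)
    return handler

calls = []

def logging_hook(name):
    # A wrap-style hook: observe, delegate to the next layer, observe again.
    def hook(x, inner):
        calls.append(f"{name}:before")
        result = inner(x)
        calls.append(f"{name}:after")
        return result
    return hook

wrapped = compose([logging_hook("outer"), logging_hook("inner")], lambda x: x * 2)
```

Calling `wrapped(21)` returns `42` and leaves `calls` as `["outer:before", "inner:before", "inner:after", "outer:after"]`. A capability that installs such a wrap hook around tool execution sees every call and result without delegating work to a proxy agent.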
+ +**Provider-adaptive tools**: `WebSearch`, `WebFetch`, `MCP` auto-detect native support, fall back to local. + +**Spec serialization**: Capabilities can be defined in YAML/JSON, loaded via `Agent.from_file()`. + +**Key insight**: The hook system IS an observation layer — from INSIDE the agent. A capability that wraps tool execution and model requests sees exactly what OpenSpace wants to see (the agent's actual behavior) but from within the agent's own execution loop, not a proxy. + +**You could build an OpenSpace-style evolution engine as a Pydantic AI Capability** — wrap every tool call, hook into model requests, analyze post-run, evolve skills — all without delegating to a separate agent. + +--- + +## Key Research Findings + +### The Bitter Lesson Applied to Agents + +- **Lance Martin (LangChain)**: Rebuilt his agent system twice in 18 months. Model improvements made scaffolding a bottleneck. The person building a top agent framework is warning that frameworks have a half-life of months. +- **Hyung Won Chung (OpenAI)**: "The community loves adding structures but there is much less focus on removing them." You add structure for current capability level, then must remove it when capabilities improve. +- **Every agent framework is a bet against model improvement.** The more opinionated, the more you'll tear out. +- **Historical pattern**: Every platform shift (web, mobile, cloud) over-prescribed structure at 2-3 years. Walled gardens, native-only apps, lift-and-shift. Winners provided primitives, not solutions. 
+ +### Emergent Behavior in Multi-Agent Systems + +- Decentralized multi-agent systems consistently outperform both single agents AND scripted cooperative systems +- Spontaneous leadership, norm formation, role specialization emerge without programming +- But so do deception, collusion, and moral drift +- Heavy orchestration may suppress the emergent intelligence that makes multi-agent systems valuable +- The platform's job: provide substrate (communication, memory, identity), not prescribe coordination + +### Self-Improving Systems + +- **Live-SWE-agent**: Starts with minimal tools, rewrites its own scaffolding at runtime. 79.2% on SWE-bench. Zero offline training cost. +- **Darwin Godel Machine**: Autonomously improved from 20% to 50% on SWE-bench by modifying its own code. +- If agents can rewrite their own scaffolding, building elaborate frameworks is doubly futile. + +### Tools Are Dissolving + +- Cloudflare Code Mode: 81% fewer tokens when agents write TypeScript against SDKs vs. structured tool calls +- LLMs prefer typed interfaces over tool-calling protocols +- The tool/code/API boundary is dissolving — for LLMs, it's all text +- MCP may be the last protocol standard in its current form + +### Cross-Agent Learning Barely Exists + +- Letta: shared memory blocks between agents +- MemOS: multi-agent memory sharing (launched March 2026) +- CLAUDE.md files: dominant form of "shared agent knowledge" in practice +- True cross-agent learning (agent A's experience autonomously improves agent B) is essentially zero outside research + +### The Software Distribution Arc + +Binaries → source → packages → containers → APIs → functions → **???** + +Each step increased the ratio of intent to implementation. The next unit might be an **intent**, not a thing. If so, everything we're building around agent packaging may be as temporary as 1990s portal strategies. 
+ +--- + +## First-Principles Assessment of Temper's Bets + +### What Survives + +| Decision | Why it survives | +|----------|----------------| +| **Cedar authorization** | Default-deny, agent identity, scoped permissions. The market is screaming for this (only 22% have agent identity). Doesn't constrain intelligence, constrains damage. Durable infrastructure. | +| **Event sourcing / audit trail** | Recording what happened is a permanent need. Raw material for debugging, compliance, learning, evolution. | +| **Evolution from usage (GEPA)** | The single most differentiated thing. Aligned with the bitter lesson — leverages observation and scale, not upfront human knowledge. Nobody else is doing "capabilities improve through use." | +| **Human approval gate** | Humans govern the boundaries. Approve trust escalation, contract changes, capability expansion. This principle is permanent. | +| **Self-describing APIs** | Agents need to discover valid actions. The principle survives regardless of specific protocol. | + +### What's Questioned + +| Decision | The concern | +|----------|------------| +| **State machines as primary abstraction** | The governance contract idea is sound, but named states + enumerated transitions may be too rigid. Agent behavior doesn't always decompose into finite states. Could be one possible formalism among many, not the required one. | +| **Specs as mandatory starting point** | Behavior-first, spec-later may be more natural. The bitter lesson suggests: let agents act, then extract structure from successful behavior, rather than requiring declaration upfront. | +| **Formal verification as universal gate** | Powerful tool, wrong as universal requirement. Should be available for high-stakes transitions, not mandated for everything. Trust gradient > binary gate. | +| **OData as the API layer** | Too specific. Agents may prefer function calling, TypeScript SDKs, or MCP. 
The principle (self-describing, discoverable) is good; the specific protocol is an implementation detail. | +| **Temper owns the runtime** | Execution is commoditizing. Temper's value is governance, not hosting. Could govern any runtime rather than being one. | + +### What Might Need to Change + +The overall direction: from **spec → verify → deploy → operate** to **operate → observe → extract → verify → govern**. + +Verification moves from the entry gate to the crystallization boundary. It hardens what works rather than permitting what might work. + +--- + +## The Behavior-First Vision + +### The Principle + +**Behavior → spec extraction → governance**, rather than spec → verification → behavior. + +### The Lifecycle + +1. **Agent acts freely** in any sandbox (Cloudflare, E2B, wherever) +2. **System observes** — captures behavior, traces, outcomes +3. **Patterns are extracted** — "this agent reliably does X when given Y, maintaining invariant Z" +4. **Contract crystallizes** — not necessarily a state machine; could be invariants, capabilities, trust level +5. **Other agents discover and use it** — by intent, not by name +6. **Usage feeds back** — the capability improves, its trust level changes, its contract evolves +7. 
**Human governs the boundaries** — Cedar policies, approval gates for trust escalation + +### What Changes + +| Today's Temper | Behavior-First Temper | +|----------------|----------------------| +| Spec first, then verify, then deploy | Act first, observe, extract contract, then verify | +| State machines are the abstraction | Contracts are flexible — invariants, capabilities, trust levels | +| Temper owns the runtime | Temper governs any runtime | +| OData is the API | MCP + A2A are the interfaces | +| Verification is a gate | Verification hardens what works | +| Single-node, self-contained | Governance layer over distributed execution | + +### What Stays + +- Cedar authorization (beating heart) +- Event sourcing / audit trail (permanent need) +- GEPA evolution (strongest bet — becomes even more powerful as the mechanism that creates structure from behavior) +- Verification cascade (still exists, runs at crystallization boundary) +- Human approval gate (governing principle) + +### Verification's New Role + +Verification becomes a tool for **hardening what works**, not a gate for **permitting what might work**. + +An agent builds a task management capability — maybe observation and trust scoring is enough. An agent builds a capability that handles financial transactions — now verification is invoked before trust can escalate beyond a threshold. The stakes determine the rigor. + +--- + +## The Reusability Layers + +### Layer 0: Raw Execution +Agent runs code in a sandbox. Ephemeral. Not reusable. Commodity (Cloudflare, E2B). + +### Layer 1: Running Capabilities +An agent built something that stays running and has an interface. Discoverable via MCP .well-known. But millions of these will exist — how does an agent know which is good, safe, or correct? MCP tells you something exists. It doesn't tell you whether to trust it. + +### Layer 2: Observed Behavior / Trust Record +Sits on top of discovery. 
Not "this capability exists" but "this capability has processed 10,000 requests, maintained these invariants, failed in these ways, earned this trust level, and a human approved its governance boundary." Turns a directory into a reputation system. **Without this, agent-to-agent discovery is the early web without PageRank.** + +### Layer 3: Extracted Patterns +Across many capabilities in Layer 1, observed by Layer 2, patterns emerge. "Capabilities that manage tasks converge on these flows." "Capabilities that handle payments maintain these invariants." Abstract, portable, reusable across domains. An agent starting from scratch uses a pattern as starting point. + +### Layer 4: Collective Intelligence +The flywheel. Agents act → observation records accumulate → trust is earned → capabilities become discoverable → other agents use them → more observation → patterns extracted → new agents start from better patterns → ecosystem gets smarter. + +### Where Value Lives + +MCP and A2A own discovery protocol. Cloudflare/E2B own execution. Those are commodities or open standards. + +**Value is in Layers 2 and 3** — the trust record and pattern extraction. Because: +- Discovery without trust is useless (million capabilities, can't tell which are reliable) +- Capabilities without observation can't improve (ecosystem stays static) +- Individual capabilities without pattern extraction don't compound (each agent starts from scratch) + +--- + +## What's the Reusable Unit? 
+ +Different projects give different answers: + +| Project | Reusable Unit | Format | +|---------|--------------|--------| +| GitHub/npm | Code packages | Source code + manifest | +| Temper (current) | Verified specs | IOA TOML + CSDL + Cedar | +| OpenSpace | Skills | SKILL.md (natural language markdown) | +| Pydantic AI | Capabilities | Python classes with defined interface | +| MCP ecosystem | Tool servers | Protocol endpoints | +| Cloudflare | Functions | JavaScript in V8 isolates | + +**Key insight from OpenSpace**: The reusable unit might just be **text** — natural language instructions. The simplest, most general, most LLM-native representation. OpenSpace's most evolved skill went through 13 versions, all markdown. Most skills focus on error recovery patterns, not domain knowledge. + +**Key insight from Pydantic AI**: The reusable unit might be **composable behavior with hooks** — richer than text, with defined interfaces for observation and composition. + +**Key insight from the vision discussion**: The reusable unit might be a **running capability** — not a static artifact but a living thing that other agents interact with. The registry and the runtime are the same. You don't download and install; you discover and use. + +**The bitter lesson suggests**: The most general representation wins. That might be text (LLM-native), running capabilities (already operational), or something we haven't identified yet. Formal specs are powerful but may be too rigid as a universal unit. + +**Unresolved**: Whether the reusable unit is one thing or whether different layers have different natural units (text for patterns, running services for capabilities, signed contracts for trust). 
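For concreteness, the unit OpenSpace evolves is tiny: `name` and `description` frontmatter plus a free-form natural-language body. A hypothetical example (the skill content below is invented; only the frontmatter fields come from the review above):

```markdown
---
name: retry-transient-http-errors
description: Recover from intermittent 5xx responses when calling REST tools
---

When a tool call fails with a 5xx status:
1. Wait 2 seconds, then retry the same call, up to 3 attempts.
2. If every retry fails, report the last error verbatim instead of fabricating output.
```

Note that this invented example is an error-recovery skill rather than domain knowledge, matching what the review observed about the most-evolved skills in practice.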
+ +--- + +## The Observation Position Problem + +A critical architectural question: **where do you sit to observe agent behavior?** + +| Position | What you see | Who's doing it | +|----------|-------------|---------------| +| **Inside the agent** | Everything: reasoning, tool calls, results, failures | Pydantic AI Capabilities (hooks at every level) | +| **Between agent and tools** | Tool calls, args, results | Executor, MCP servers | +| **Separate agent (proxy)** | Only what the proxy does, not the host agent | OpenSpace | +| **Below (kernel)** | Syscalls: files, processes, network | iron-sensor (eBPF) | +| **At the gate** | Auth decisions, capability grants | Agent Auth Protocol | +| **Above (platform)** | API calls, state transitions | Temper (current) | + +**The gap**: Nobody is observing the actual host agent's semantic behavior (not syscalls, not just tool calls, but what the agent was trying to do and whether it succeeded) and using that to evolve reusable artifacts. + +- OpenSpace sidesteps by running its own agent +- iron-sensor sees syscalls, not semantics +- Pydantic AI created the hookpoints but nobody's built the evolution capability yet +- Temper observes what flows through its own API but doesn't observe agents operating elsewhere + +**The opportunity**: The "inside the agent" position (Pydantic AI-style hooks) combined with the evolution loop (GEPA-style) would be genuinely novel. Observe the actual agent, extract patterns from its actual behavior, evolve reusable artifacts. This doesn't exist yet. + +**The challenge**: Being inside the agent requires framework adoption (Pydantic AI, or equivalent hooks in other frameworks). Being outside (like iron-sensor) is more universal but less semantic. There may be a middle ground: observing artifacts the agent naturally produces (git commits, tool call logs, CLAUDE.md updates) without being inside the execution loop. + +--- + +## Open Questions + +### Vision & Architecture + +1. 
**Is the reusable unit one thing or many?** Text at the pattern layer, running capabilities at the service layer, formal contracts at the trust layer? Or does one representation win across all layers? + +2. **Can you get semantic observation without being inside the agent?** Agents leave trails (git commits, file changes, tool call logs). Is that enough, or do you need the Pydantic AI-style hooks? + +3. **Does formal verification survive the behavior-first model?** If so, where does it sit? At crystallization (extracting contracts from behavior)? At trust escalation (high-stakes actions only)? Or does runtime monitoring + trust scoring replace it entirely? + +4. **What happens to state machines?** Are they one possible contract format among many? Do they emerge naturally from behavior observation? Or are they an unnecessary formalism that agents will route around? + +5. **How does the trust gradient work mechanically?** What's the scoring model? Is it per-capability, per-agent, per-action? How does trust transfer (if I trust capability A and it composes with capability B...)? + +### Market & Strategy + +6. **Is Temper the governance layer, the trust layer, or the evolution layer?** These are different products. Governance (Cedar) is infrastructure. Trust (observation + scoring) is a platform. Evolution (GEPA) is intelligence. Which one is the wedge? + +7. **Should Temper own a runtime at all?** If execution is commodity, should Temper be a governance sidecar that works with any runtime? Or does owning the runtime give you observation advantages? + +8. **How does Temper relate to Agent Auth Protocol?** Complementary (AAP does wire protocol, Temper does trust)? Competitive? Should Temper adopt AAP for identity and focus on what sits above? + +9. **OpenSpace validates the skill evolution thesis but with a simpler mechanism (LLM-driven analysis + markdown skills). 
Is Temper's formal methods approach overkill?** Or is the security/trust gap in OpenSpace the exact opening? + +10. **Who is the customer for the first version?** Teams deploying agents at scale who've already been burned? Enterprise compliance requirements? Or the broader developer community? + +### Technical + +11. **Can GEPA work as a contract extractor (behavior → formal contract) rather than just a spec optimizer?** This would be the key technical pivot. + +12. **What does "trust score" look like concretely?** What metrics? What thresholds? How is it computed from observation data? + +13. **How do you handle the cold start problem?** A new capability has no observation history, thus no trust. How does it bootstrap? + +14. **What's the relationship between MCP .well-known discovery and Temper's trust layer?** Does Temper extend the server card with trust metadata? + +--- + +## Strategic Implications + +### The One-Sentence Positioning Options + +- **Current**: "Temper is a verified operating layer for governed applications." +- **Governance focus**: "Temper is the governance layer for the agent economy. Agents run anywhere. Temper makes them trustworthy." +- **Trust focus**: "MCP tells you a capability exists. Temper tells you whether to trust it." +- **Evolution focus**: "The only platform where agent capabilities improve through use." +- **Combined**: "Agents run anywhere. Temper watches what they do, earns trust from behavior, and makes the whole ecosystem smarter." + +### What Only Temper Has (That Nobody Else Does) + +1. **Formal verification cascade** — can prove specs correct (even if role changes to hardening, not gating) +2. **GEPA evolution engine** — end-to-end loop proven, nobody else has "capabilities improve through use" +3. **Cedar authorization** — complete default-deny with agent identity, decision approval UI +4. 
**Event-sourced audit trail** — full state transition history with agent attribution + +### The Competitive Landscape Summary + +| Layer | What | Who | Temper's position | +|-------|------|-----|-------------------| +| Execution | Sandboxed compute | Cloudflare, E2B, Daytona, Modal | Don't compete. Use as substrate. | +| Tool mediation | Discovery + credentials | Executor, Composio, Toolhouse | Adjacent. Not core. | +| Identity protocol | Auth + authz wire format | Agent Auth Protocol, Permit.io | Adopt or integrate. Not core. | +| Integration standard | Tool access protocol | MCP | Participate. Not own. | +| Agent-to-agent | Discovery + delegation | Google A2A | Participate. Not own. | +| Skill evolution | Behavior → better skills | OpenSpace | Overlap in vision. Different mechanism. OpenSpace lacks trust/security. | +| Security monitoring | Detect malicious behavior | iron-sensor | Complementary. Different altitude. | +| Agent framework | Composable behavior | Pydantic AI, LangGraph, CrewAI | Integrate with. Not compete. | +| **Trust + governance** | **Observation → trust → governance** | **Nobody** | **This is the gap.** | +| **Verified evolution** | **Formally verified improvement** | **Nobody** | **Temper's unique combination.** | + +### The Core Bet + +The agent ecosystem is building execution (Cloudflare), tools (MCP/Executor), discovery (A2A/.well-known), identity (Agent Auth), and even skill evolution (OpenSpace). + +Nobody is building the **trust layer** — the thing that sits between discovery and use and answers: "should you trust this?" based on observed behavior, not self-reported claims. And nobody is combining trust with **verified evolution** — capabilities that improve through use AND can prove their improvements are correct. + +That's the gap. Whether it's a gap the market is ready to pay for today is the validation question from the beginning of this conversation. 
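One way the trust record could surface at the discovery layer is as an extension field on the MCP server card. A purely speculative sketch: the `x-temper-trust` object and all of its fields are invented here, shaped by the Layer 2 description earlier in this document (requests processed, invariants maintained, failure modes, trust level, human-approved boundary), and the surrounding card fields are placeholders rather than the real server-card schema:

```json
{
  "name": "task-manager",
  "version": "1.4.0",
  "x-temper-trust": {
    "requests_observed": 10000,
    "invariants_held": ["no-cross-tenant-reads", "idempotent-writes"],
    "known_failure_modes": ["timeout-on-bulk-import"],
    "trust_level": "supervised",
    "human_approved_boundary": true,
    "last_observed_at": "2026-03-20T00:00:00Z"
  }
}
```

This is the shape of an answer to open question 14 above, not a proposal: discovery stays MCP's job, while the trust annotation is earned from observation rather than self-reported.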
+ +--- + +## Appendix: Sources + +### Blog Posts & Documentation +- [Cloudflare Dynamic Workers](https://blog.cloudflare.com/dynamic-workers/) +- [Cloudflare Code Mode](https://blog.cloudflare.com/code-mode-mcp/) — 81% token reduction +- [Lance Martin - Learning the Bitter Lesson](https://rlancemartin.github.io/2025/07/30/bitter_lesson/) +- [Citrix - The Bitter Lesson of Workplace AI](https://www.citrix.com/blogs/2025/09/17/the-bitter-lesson-of-workplace-ai-stop-engineering-start-enabling) +- [2026 MCP Roadmap](http://blog.modelcontextprotocol.io/posts/2026-mcp-roadmap/) +- [Pydantic AI Capabilities docs](https://ai.pydantic.dev/capabilities/) + +### Research & Reports +- [CodeRabbit: AI code quality](https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report) — 1.7x more issues +- [IEEE Spectrum: AI Coding Degrades](https://spectrum.ieee.org/ai-coding-degrades) — silent failures +- [LangChain State of Agent Engineering](https://www.langchain.com/state-of-agent-engineering) — 1,300+ respondents +- [Gravitee: AI Agent Security 2026](https://www.gravitee.io/blog/state-of-ai-agent-security-2026-report-when-adoption-outpaces-control) +- [MIT Technology Review: Guardrails to Governance](https://www.technologyreview.com/2026/02/04/1131014/from-guardrails-to-governance-a-ceos-guide-for-securing-agentic-systems/) +- [NIST: AI Agent Identity and Authorization](https://www.nccoe.nist.gov/sites/default/files/2026-02/accelerating-the-adoption-of-software-and-ai-agent-identity-and-authorization-concept-paper.pdf) +- [Emergent Intelligence in Multi-Agent Systems (TechRxiv)](https://www.techrxiv.org/users/992392/articles/1384935) +- [Live-SWE-agent](https://arxiv.org/abs/2511.13646) — self-improving agent, 79.2% SWE-bench +- [Darwin Godel Machine](https://sakana.ai/dgm/) — 20% → 50% autonomous improvement + +### Repositories +- [Executor](https://github.com/RhysSullivan/executor) — local-first agent execution environment +- [Agent Auth 
Protocol](https://github.com/better-auth/agent-auth) — 13 stars, Feb 2026 + [OpenSpace](https://github.com/HKUDS/OpenSpace) — 60 stars, Mar 2026 + [iron-sensor](https://github.com/ironsh/iron-sensor) — 25 stars, Mar 2026 + [Pydantic AI](https://github.com/pydantic/pydantic-ai) — Capabilities in v1.71.0 + +### Industry Data Points +- Daytona: $24M Series A, $1M ARR in <3 months (Feb 2026) +- Lovable: 8M users, 100K projects/day, $200M ARR +- GitHub Copilot: 46% of code from active users, 20M cumulative users +- MCP: 34,700+ dependent projects +- A2A: 50+ enterprise partners diff --git a/Cargo.lock b/Cargo.lock index f0d9ad45..fa51f778 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -6212,7 +6212,9 @@ checksum = "489a59b6730eda1b0171fcfda8b121f4bee2b35cba8645ca35c5f7ba3eb736c1" dependencies = [ "futures-util", "log", + "native-tls", "tokio", + "tokio-native-tls", "tungstenite", ] @@ -6546,6 +6548,7 @@ dependencies = [ "http 1.4.0", "httparse", "log", + "native-tls", "rand 0.9.2", "sha1", "thiserror 2.0.18", diff --git a/crates/temper-cli/src/main.rs b/crates/temper-cli/src/main.rs index 45599d3b..503d180d 100644 --- a/crates/temper-cli/src/main.rs +++ b/crates/temper-cli/src/main.rs @@ -98,6 +98,10 @@ enum Commands { /// per entity. Exit 0 = pass; non-zero or timeout = failure. #[arg(long)] verify_subprocess: bool, + /// Discord bot token for channel transport (enables Discord Gateway). + /// Falls back to DISCORD_BOT_TOKEN env var if not provided. + #[arg(long)] + discord_bot_token: Option<String>, }, /// Run the verification cascade on IOA TOML source read from stdin.
/// @@ -149,6 +153,7 @@ async fn main() -> anyhow::Result<()> { tenant, skill, verify_subprocess, + discord_bot_token, } => { let storage_explicit = std::env::args().any(|arg| arg == "--storage" || arg.starts_with("--storage=")); @@ -166,6 +171,8 @@ async fn main() -> anyhow::Result<()> { { apps.push((tenant.clone(), dir.clone())); } + let discord_token = + discord_bot_token.or_else(|| std::env::var("DISCORD_BOT_TOKEN").ok()); // determinism-ok: read once at startup serve::run( port, apps, @@ -174,6 +181,8 @@ async fn main() -> anyhow::Result<()> { storage_explicit, !no_observe, verify_subprocess, + discord_token, + tenant, ) .await? } diff --git a/crates/temper-cli/src/serve/mod.rs b/crates/temper-cli/src/serve/mod.rs index 97b76a04..70e56bbf 100644 --- a/crates/temper-cli/src/serve/mod.rs +++ b/crates/temper-cli/src/serve/mod.rs @@ -53,6 +53,7 @@ struct LoadedTenantSpecs { /// 1. Storage init 2. Registry build 3. Auto-reload 4. Webhooks /// 5. Persistence wiring 6. Entity hydration 7. Policy/WASM recovery /// 8. Tenant bootstrap 9. 
Server start +#[allow(clippy::too_many_arguments)] pub async fn run( port: u16, apps: Vec<(String, String)>, @@ -61,6 +62,8 @@ storage_explicit: bool, observe: bool, verify_subprocess: bool, + discord_bot_token: Option<String>, + tenant: String, ) -> Result<()> { let _otel_guard = init_observability("temper-platform"); temper_authz::init_metrics(); @@ -115,16 +118,24 @@ } // Phase 5b: Secrets vault - if let Ok(key_b64) = std::env::var("TEMPER_VAULT_KEY") { - // determinism-ok: read once at startup + { use base64::Engine as _; - let key_bytes = base64::engine::general_purpose::STANDARD - .decode(&key_b64) - .expect("TEMPER_VAULT_KEY must be valid base64"); - assert_eq!(key_bytes.len(), 32, "TEMPER_VAULT_KEY must be 32 bytes"); - let vault = temper_server::secrets::vault::SecretsVault::new( - key_bytes.as_slice().try_into().unwrap(), // ci-ok: length asserted == 32 above - ); + let key_bytes: [u8; 32] = if let Ok(key_b64) = std::env::var("TEMPER_VAULT_KEY") { + // determinism-ok: read once at startup + let decoded = base64::engine::general_purpose::STANDARD + .decode(&key_b64) + .expect("TEMPER_VAULT_KEY must be valid base64"); + assert_eq!(decoded.len(), 32, "TEMPER_VAULT_KEY must be 32 bytes"); + decoded.try_into().unwrap() // ci-ok: length asserted == 32 above + } else { + // No explicit key — generate an ephemeral one for in-memory secret caching. + // determinism-ok: OsRng used once at startup for vault key generation + use rand::RngCore as _; + let mut key = [0u8; 32]; + rand::rngs::OsRng.fill_bytes(&mut key); + key + }; + let vault = temper_server::secrets::vault::SecretsVault::new(&key_bytes); state.server.secrets_vault = Some(std::sync::Arc::new(vault)); println!(" Secrets vault: configured"); } @@ -137,6 +148,95 @@ bootstrap::recover_wasm_modules(&state).await; bootstrap::recover_secrets(&state).await; + // Seed secrets from env into the vault for all tenants.
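The vault-key branch above reduces to one testable rule: a configured key must decode to exactly 32 bytes, otherwise the server falls back to a generated ephemeral key. A minimal stdlib-only sketch of the length-checked conversion (the base64 decode and OsRng fill are elided; `key_from_bytes` is an illustrative name, not from the codebase):

```rust
// Sketch of the key handling in run(): a decoded byte vector is only
// accepted as a vault key if it is exactly 32 bytes long. Once the
// length is checked, the Vec<u8> -> [u8; 32] conversion is infallible.
fn key_from_bytes(decoded: Vec<u8>) -> Result<[u8; 32], String> {
    if decoded.len() != 32 {
        return Err(format!("vault key must be 32 bytes, got {}", decoded.len()));
    }
    Ok(decoded.try_into().expect("length checked above"))
}

fn main() {
    // Too short: rejected instead of panicking later.
    assert!(key_from_bytes(vec![0u8; 31]).is_err());
    // Exactly 32 bytes: accepted.
    assert_eq!(key_from_bytes(vec![7u8; 32]).unwrap().len(), 32);
}
```

The real code asserts and panics on a malformed `TEMPER_VAULT_KEY`; returning a `Result` as above is one alternative if a bad key should be reported rather than abort startup.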
+ if let Some(ref vault) = state.server.secrets_vault { + // ANTHROPIC_API_KEY — makes {secret:anthropic_api_key} resolve in LLM integrations. + if let Ok(key) = std::env::var("ANTHROPIC_API_KEY") { + // determinism-ok: env var read at startup for configuration + let _ = vault.cache_secret("default", "anthropic_api_key", key.clone()); + if tenant != "default" { + let _ = vault.cache_secret(&tenant, "anthropic_api_key", key); + } + } + + // blob_endpoint — points blob_adapter at the server's internal blob storage + // when no external blob endpoint (R2/S3) is configured. + // determinism-ok: env var read at startup for configuration + if std::env::var("BLOB_ENDPOINT").is_err() { + let blob_url = format!("http://127.0.0.1:{port}/_internal/blobs"); + let _ = vault.cache_secret("default", "blob_endpoint", blob_url.clone()); + if tenant != "default" { + let _ = vault.cache_secret(&tenant, "blob_endpoint", blob_url); + } + } + + // temper_api_url — points WASM modules at this server for TemperFS calls. + { + let api_url = format!("http://127.0.0.1:{port}"); + let _ = vault.cache_secret("default", "temper_api_url", api_url.clone()); + if tenant != "default" { + let _ = vault.cache_secret(&tenant, "temper_api_url", api_url); + } + } + + // sandbox_url — local sandbox for tool execution. + // Uses SANDBOX_URL env var if set, otherwise auto-starts local_sandbox.py. + // determinism-ok: env var read at startup for configuration + { + let sandbox_url = if let Ok(url) = std::env::var("SANDBOX_URL") { + println!(" Sandbox: {url} (from SANDBOX_URL)"); + url + } else { + let sandbox_port = port + 10; // e.g., 3000 → 3010 + let sandbox_url = format!("http://127.0.0.1:{sandbox_port}"); + + // Find the local sandbox script relative to the binary or os-apps. 
+ let sandbox_script = + std::path::Path::new("os-apps/temper-agent/sandbox/local_sandbox.py"); + if sandbox_script.exists() { + // Use /tmp/temper-sandbox as the base; create /workspace for tool_runner + // which sends cwd="/workspace" by default (matching E2B's layout). + let _ = std::fs::create_dir_all("/tmp/temper-sandbox"); + let _ = std::fs::create_dir_all("/workspace"); + + // determinism-ok: subprocess spawn at startup for local dev sandbox + match std::process::Command::new("python3") + .arg(sandbox_script) + .arg("--port") + .arg(sandbox_port.to_string()) + .arg("--workdir") + .arg("/tmp/temper-sandbox") + .stdout(std::process::Stdio::null()) + .stderr(std::process::Stdio::null()) + .spawn() + { + Ok(_child) => { + println!(" Local sandbox: {sandbox_url} (auto-started)"); + } + Err(e) => { + eprintln!(" Warning: failed to start local sandbox: {e}"); + eprintln!( + " Run manually: python3 {sandbox_script:?} --port {sandbox_port}" + ); + } + } + } else { + eprintln!(" Warning: local sandbox script not found at {sandbox_script:?}"); + eprintln!( + " Set SANDBOX_URL env var or ensure os-apps/temper-agent/sandbox/local_sandbox.py exists" + ); + } + + sandbox_url + }; + + let _ = vault.cache_secret("default", "sandbox_url", sandbox_url.clone()); + if tenant != "default" { + let _ = vault.cache_secret(&tenant, "sandbox_url", sandbox_url); + } + } + } + // Startup banner println!("Starting Temper platform server..."); println!(); @@ -177,6 +277,29 @@ pub async fn run( spawn_actor_passivation_loop(&state); state.server.spawn_runtime_metrics_loop(); + // Channel transports: spawn persistent connections to external messaging platforms. + // Resolve Discord bot token: CLI/env → vault fallback. 
+ let discord_token_resolved = discord_bot_token.or_else(|| { + state + .server + .secrets_vault + .as_ref() + .and_then(|v| v.get_secret(&tenant, "discord_bot_token")) + }); + if let Some(ref token) = discord_token_resolved { + // Seed into vault so WASM modules can also access it. + if let Some(ref vault) = state.server.secrets_vault { + let _ = vault.cache_secret("default", "discord_bot_token", token.clone()); + if tenant != "default" { + let _ = vault.cache_secret(&tenant, "discord_bot_token", token.clone()); + } + } + spawn_channel_transport_discord(&state, token.clone(), &tenant); + } else { + println!(" Discord transport: not configured"); + println!(" Set DISCORD_BOT_TOKEN env var or store 'discord_bot_token' in vault"); + } + println!("Listening on http://0.0.0.0:{actual_port}"); axum::serve(listener, router) .await @@ -392,6 +515,31 @@ fn spawn_observe_ui(api_port: u16) { }); } +/// Spawn the Discord channel transport as a background task. +/// +/// Connects to Discord Gateway via WebSocket, routes inbound messages to +/// Channel entities, and delivers outbound replies via Discord REST API. 
+fn spawn_channel_transport_discord(state: &PlatformState, bot_token: String, tenant: &str) { + use temper_server::channels::discord::{DiscordTransport, DiscordTransportConfig}; + use temper_server::channels::discord_types::intents; + + let server = state.server.clone(); + let tenant = tenant.to_string(); + println!(" Discord channel transport: connecting (tenant={tenant})..."); + tokio::spawn(async move { + // determinism-ok: WebSocket for channel transport + let config = DiscordTransportConfig { + bot_token, + tenant, + intents: intents::DEFAULT, + }; + let transport = DiscordTransport::new(config, server); + if let Err(e) = transport.run().await { + eprintln!(" [discord] Transport fatal error: {e}"); + } + }); +} + fn is_ephemeral_metadata_error(err: &str) -> bool { err.contains("explicit ephemeral mode") } diff --git a/crates/temper-mcp/src/lib.rs b/crates/temper-mcp/src/lib.rs index 8e88bbfc..cdba1a3e 100644 --- a/crates/temper-mcp/src/lib.rs +++ b/crates/temper-mcp/src/lib.rs @@ -22,12 +22,14 @@ pub struct McpConfig { /// Full URL of a remote Temper server (e.g. `https://api.temper.build`). /// Mutually exclusive with `temper_port`. pub temper_url: Option, - /// Agent instance ID. Resolved from the credential registry via - /// `TEMPER_API_KEY` at startup (ADR-0033). Only used as an override - /// when credential resolution is not available. + /// Optional local agent label. When `TEMPER_API_KEY` resolves through + /// the credential registry (ADR-0033), the verified platform-assigned + /// agent ID replaces this value. This field does not grant HTTP identity. pub agent_id: Option, - /// Agent software classification (e.g. `claude-code`). Resolved from - /// the credential registry's `AgentType` entity at startup (ADR-0033). + /// Optional local agent type label (e.g. `claude-code`). When + /// `TEMPER_API_KEY` resolves through the credential registry, the + /// verified platform-assigned type replaces this value. 
This field does + /// not grant HTTP identity. pub agent_type: Option<String>, /// Session ID (`X-Session-Id`). Auto-derived from `CLAUDE_SESSION_ID`. pub session_id: Option<String>, } diff --git a/crates/temper-mcp/src/main.rs b/crates/temper-mcp/src/main.rs new file mode 100644 index 00000000..645c07fa --- /dev/null +++ b/crates/temper-mcp/src/main.rs @@ -0,0 +1,87 @@ +use std::env; + +use temper_mcp::{McpConfig, run_stdio_server}; + +fn parse_args() -> Result<McpConfig, String> { + let mut temper_port = None; + let mut temper_url = None; + let mut agent_id = None; + let mut agent_type = None; + let mut session_id = None; + let mut api_key = env::var("TEMPER_API_KEY").ok(); + + let mut args = env::args().skip(1); + while let Some(arg) = args.next() { + match arg.as_str() { + "--port" => { + let value = args.next().ok_or("--port requires a value")?; + let parsed = value + .parse::<u16>() + .map_err(|_| format!("invalid --port value: {value}"))?; + temper_port = Some(parsed); + } + "--url" => { + temper_url = Some(args.next().ok_or("--url requires a value")?); + } + "--agent-id" => { + agent_id = Some(args.next().ok_or("--agent-id requires a value")?); + } + "--agent-type" => { + agent_type = Some(args.next().ok_or("--agent-type requires a value")?); + } + "--session-id" => { + session_id = Some(args.next().ok_or("--session-id requires a value")?); + } + "--api-key" => { + api_key = Some(args.next().ok_or("--api-key requires a value")?); + } + "-h" | "--help" => { + print_help(); + std::process::exit(0); + } + other => { + return Err(format!("unknown argument: {other}")); + } + } + } + + if temper_port.is_some() && temper_url.is_some() { + return Err("use either --port or --url, not both".to_string()); + } + if temper_port.is_none() && temper_url.is_none() { + return Err("either --port or --url is required".to_string()); + } + + Ok(McpConfig { + temper_port, + temper_url, + agent_id, + agent_type, + session_id, + api_key, + }) +} + +fn print_help() { + eprintln!( + "temper-mcp\n\n\ +Usage:\n temper-mcp
--port <port> [--agent-id <id>] [--agent-type <type>] [--session-id <id>] [--api-key <key>]\n temper-mcp --url <url> [--agent-id <id>] [--agent-type <type>] [--session-id <id>] [--api-key <key>]\n\n\ +Options:\n --port <port> Connect to a local Temper server on 127.0.0.1:<port>\n --url <url> Connect to a Temper server at the given base URL\n --agent-id <id> Optional local label; does not grant platform identity\n --agent-type <type> Optional local type label; does not grant platform identity\n --session-id <id> Set X-Session-Id for outbound requests\n --api-key <key> Bearer token for API authentication (or use TEMPER_API_KEY)\n -h, --help Show this help text" ); } + +#[tokio::main(flavor = "current_thread")] +async fn main() -> Result<(), Box<dyn std::error::Error>> { + let config = match parse_args() { + Ok(config) => config, + Err(error) => { + eprintln!("{error}"); + eprintln!(); + print_help(); + std::process::exit(2); + } + }; + + run_stdio_server(config).await?; + Ok(()) +} diff --git a/crates/temper-platform/src/os_apps/mod.rs b/crates/temper-platform/src/os_apps/mod.rs index 79688dcf..890fae09 100644 --- a/crates/temper-platform/src/os_apps/mod.rs +++ b/crates/temper-platform/src/os_apps/mod.rs @@ -31,6 +31,9 @@ pub struct InstallResult { pub updated: Vec<String>, /// Entity types whose IOA source was byte-for-byte identical — skipped. pub skipped: Vec<String>, + /// WASM modules compiled and registered. + #[serde(skip_serializing_if = "Vec::is_empty")] + pub wasm_modules: Vec<String>, } /// Metadata for a skill in the catalog. @@ -57,6 +60,8 @@ pub struct SkillBundle { pub csdl: String, /// Cedar policy sources (may be empty). pub cedar_policies: Vec<String>, + /// WASM module binaries as `(module_name, wasm_bytes)` pairs. + pub wasm_modules: BTreeMap<String, Vec<u8>>, } // Backward-compatible type aliases. @@ -357,6 +362,60 @@ fn find_cedar_policies(skill_dir: &Path) -> Vec<PathBuf> { files } +/// Find compiled WASM module binaries in a skill directory. +/// +/// Scans `wasm/*/target/wasm32-unknown-unknown/release/{module_name}.wasm` +/// where `{module_name}` matches the directory name under `wasm/`.
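The `parse_args` loop in `temper-mcp/src/main.rs` follows a common hand-rolled-CLI pattern: each value-taking flag consumes the next token, and exactly one of `--port`/`--url` must be supplied. A reduced, dependency-free sketch of that pattern, covering only the two mutually exclusive flags (the `parse` helper is illustrative, not from the crate):

```rust
// Sketch of the flag loop in parse_args: pull the next token as each
// flag's value, reject unknown flags, then enforce that exactly one of
// --port / --url was provided.
fn parse(args: &[&str]) -> Result<(Option<u16>, Option<String>), String> {
    let mut port = None;
    let mut url = None;
    let mut it = args.iter();
    while let Some(arg) = it.next() {
        match *arg {
            "--port" => {
                let v = it.next().ok_or("--port requires a value")?;
                port = Some(v.parse::<u16>().map_err(|_| format!("invalid --port value: {v}"))?);
            }
            "--url" => url = Some(it.next().ok_or("--url requires a value")?.to_string()),
            other => return Err(format!("unknown argument: {other}")),
        }
    }
    if port.is_some() && url.is_some() {
        return Err("use either --port or --url, not both".into());
    }
    if port.is_none() && url.is_none() {
        return Err("either --port or --url is required".into());
    }
    Ok((port, url))
}

fn main() {
    assert_eq!(parse(&["--port", "18789"]).unwrap().0, Some(18789));
    assert!(parse(&["--port", "1", "--url", "http://x"]).is_err());
    assert!(parse(&[]).is_err());
}
```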
+fn find_wasm_modules(skill_dir: &Path) -> BTreeMap<String, Vec<u8>> { + let mut modules = BTreeMap::new(); + let wasm_dir = skill_dir.join("wasm"); + if !wasm_dir.is_dir() { + return modules; + } + let Ok(entries) = std::fs::read_dir(&wasm_dir) else { + return modules; + }; + let mut dirs: Vec<_> = entries + .filter_map(|e| e.ok()) + .filter(|e| e.file_type().map(|ft| ft.is_dir()).unwrap_or(false)) + .collect(); + dirs.sort_by_key(|e| e.file_name()); + + for entry in dirs { + let module_name = entry.file_name().to_string_lossy().to_string(); + // Skip target directories that cargo creates. + if module_name == "target" { + continue; + } + let wasm_path = entry + .path() + .join("target") + .join("wasm32-unknown-unknown") + .join("release") + .join(format!("{module_name}.wasm")); + if wasm_path.exists() { + match std::fs::read(&wasm_path) { + Ok(bytes) => { + tracing::debug!( + module = %module_name, + size = bytes.len(), + "Found WASM module in OS app" + ); + modules.insert(module_name, bytes); + } + Err(e) => { + tracing::warn!( + module = %module_name, + error = %e, + "Failed to read WASM module binary" + ); + } + } + } + } + modules +} + /// Read the skill guide markdown (skill.md or SKILL.md). fn read_skill_guide(skill_dir: &Path) -> Option<String> { for name in &["skill.md", "SKILL.md"] { @@ -463,10 +522,14 @@ fn load_skill_bundle(skill_dir: &Path) -> Option<SkillBundle> { .filter_map(|p| std::fs::read_to_string(&p).ok()) .collect(); + // Read WASM module binaries from wasm/*/target/wasm32-unknown-unknown/release/*.wasm. + let wasm_modules = find_wasm_modules(skill_dir); + Some(SkillBundle { specs, csdl, cedar_policies, + wasm_modules, }) } @@ -677,18 +740,63 @@ async fn install_os_app_without_dependencies( } } + // ── Step 4: Compile and register WASM modules.
────────────────── + let mut wasm_registered = Vec::new(); + for (module_name, wasm_bytes) in &bundle.wasm_modules { + match state.server.wasm_engine.compile_and_cache(wasm_bytes) { + Ok(hash) => { + // Persist to Turso FIRST for durability. + if let Err(e) = state + .server + .upsert_wasm_module(tenant, module_name, wasm_bytes, &hash) + .await + { + tracing::warn!( + tenant, + module = %module_name, + error = %e, + "Failed to persist WASM module to durable store (continuing in-memory only)" + ); + } + // Register in module registry. + { + let mut wasm_reg = state.server.wasm_module_registry.write().unwrap(); // ci-ok: infallible lock + wasm_reg.register(&tenant_id, module_name, &hash); + } + tracing::info!( + tenant, + module = %module_name, + hash = %hash, + size = wasm_bytes.len(), + "WASM module loaded from OS app" + ); + wasm_registered.push(module_name.clone()); + } + Err(e) => { + tracing::warn!( + tenant, + module = %module_name, + error = %e, + "Failed to compile WASM module from OS app" + ); + } + } + } + tracing::info!( "Installed os-app '{app_name}' for tenant '{tenant}': \ - added={:?} updated={:?} skipped={:?}", + added={:?} updated={:?} skipped={:?} wasm={:?}", added, updated, skipped, + wasm_registered, ); Ok(InstallResult { added, updated, skipped, + wasm_modules: wasm_registered, }) } diff --git a/crates/temper-platform/src/os_apps/mod_test.rs b/crates/temper-platform/src/os_apps/mod_test.rs index b51c19f7..1eff2420 100644 --- a/crates/temper-platform/src/os_apps/mod_test.rs +++ b/crates/temper-platform/src/os_apps/mod_test.rs @@ -269,7 +269,7 @@ fn test_get_skill_temper_agent() { let bundle = get_skill("temper-agent"); assert!(bundle.is_some()); let bundle = bundle.unwrap(); - assert_eq!(bundle.specs.len(), 1); + assert_eq!(bundle.specs.len(), 8); // TemperAgent + AgentSoul + AgentSkill + AgentMemory + ToolHook + HeartbeatMonitor + CronJob + CronScheduler assert!(!bundle.csdl.is_empty()); assert!(!bundle.cedar_policies.is_empty()); } diff --git 
a/crates/temper-sandbox/src/repl.rs b/crates/temper-sandbox/src/repl.rs index f642a1f2..9fdd74ff 100644 --- a/crates/temper-sandbox/src/repl.rs +++ b/crates/temper-sandbox/src/repl.rs @@ -15,7 +15,7 @@ use crate::runner::run_sandbox; pub struct ReplConfig { /// Port of the running Temper HTTP server. pub server_port: u16, - /// Agent ID for `X-Temper-Principal-Id` header. + /// Optional local label for the REPL session. pub agent_id: Option<String>, } diff --git a/crates/temper-server/Cargo.toml b/crates/temper-server/Cargo.toml index 18580db4..ec1f571c 100644 --- a/crates/temper-server/Cargo.toml +++ b/crates/temper-server/Cargo.toml @@ -45,7 +45,7 @@ aes-gcm = { workspace = true } thiserror = { workspace = true } cedar-policy = { workspace = true } async-trait = { workspace = true } -tokio-tungstenite = "0.27" +tokio-tungstenite = { version = "0.27", features = ["native-tls"] } ed25519-dalek = "2.1" futures-util = "0.3" lru = { workspace = true } diff --git a/crates/temper-server/src/blobs.rs b/crates/temper-server/src/blobs.rs new file mode 100644 index 00000000..b7c41c95 --- /dev/null +++ b/crates/temper-server/src/blobs.rs @@ -0,0 +1,63 @@ +//! Internal blob storage endpoint for TemperFS. +//! +//! Provides `PUT/GET /_internal/blobs/{*path}` backed by Turso. +//! The blob_adapter WASM module uploads/downloads through these endpoints +//! when no external blob storage (R2/S3) is configured. + +use axum::body::Bytes; +use axum::extract::{Path, State}; +use axum::http::StatusCode; +use axum::response::IntoResponse; + +use crate::state::ServerState; + +/// `PUT /_internal/blobs/{*path}` — store a blob.
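The blob handlers below are thin HTTP wrappers over a keyed byte store: PUT overwrites the bytes at a path (204), GET returns the bytes (200) or not-found (404). An in-memory stand-in for that contract (the real store persists to Turso; `BlobStore` is an illustrative name, not the crate's type):

```rust
use std::collections::BTreeMap;

// In-memory sketch of the store behind put_blob/get_blob: a path-keyed
// byte map where put overwrites and get returns Some(bytes) or None.
struct BlobStore {
    blobs: BTreeMap<String, Vec<u8>>,
}

impl BlobStore {
    fn new() -> Self {
        Self { blobs: BTreeMap::new() }
    }
    // PUT semantics: unconditional overwrite (maps to 204 No Content).
    fn put(&mut self, path: &str, data: &[u8]) {
        self.blobs.insert(path.to_string(), data.to_vec());
    }
    // GET semantics: Some(bytes) maps to 200, None maps to 404.
    fn get(&self, path: &str) -> Option<&[u8]> {
        self.blobs.get(path).map(|v| v.as_slice())
    }
}

fn main() {
    let mut store = BlobStore::new();
    assert!(store.get("a/b").is_none()); // 404 before any PUT
    store.put("a/b", b"hello");          // 204
    assert_eq!(store.get("a/b"), Some(&b"hello"[..])); // 200 with bytes
}
```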
+pub async fn put_blob( + State(state): State<ServerState>, + Path(path): Path<String>, + body: Bytes, +) -> impl IntoResponse { + let Some(store) = state.platform_persistent_store().cloned() else { + return ( + StatusCode::SERVICE_UNAVAILABLE, + "Blob storage requires Turso".to_string(), + ) + .into_response(); + }; + + match store.put_blob(&path, &body).await { + Ok(()) => StatusCode::NO_CONTENT.into_response(), + Err(e) => { + tracing::error!(error = %e, path = %path, "blob put failed"); + (StatusCode::INTERNAL_SERVER_ERROR, e).into_response() + } + } +} + +/// `GET /_internal/blobs/{*path}` — retrieve a blob. +pub async fn get_blob( + State(state): State<ServerState>, + Path(path): Path<String>, +) -> impl IntoResponse { + let Some(store) = state.platform_persistent_store().cloned() else { + return ( + StatusCode::SERVICE_UNAVAILABLE, + "Blob storage requires Turso".to_string(), + ) + .into_response(); + }; + + match store.get_blob(&path).await { + Ok(Some(data)) => ( + StatusCode::OK, + [(axum::http::header::CONTENT_TYPE, "application/octet-stream")], + data, + ) + .into_response(), + Ok(None) => StatusCode::NOT_FOUND.into_response(), + Err(e) => { + tracing::error!(error = %e, path = %path, "blob get failed"); + (StatusCode::INTERNAL_SERVER_ERROR, e).into_response() + } + } +} diff --git a/crates/temper-server/src/channels/discord.rs b/crates/temper-server/src/channels/discord.rs new file mode 100644 index 00000000..39d8f026 --- /dev/null +++ b/crates/temper-server/src/channels/discord.rs @@ -0,0 +1,1639 @@ +//! Discord channel transport via Gateway WebSocket (v10). +//! +//! Connects to `wss://gateway.discord.gg`, receives `MESSAGE_CREATE` events, +//! and dispatches TemperAgent entities to handle each message. Watches for +//! agent completion and delivers replies via Discord REST API. +//! +//! Conversation continuity: tracks per-user sessions keyed by Discord user ID. +//! First message uses Provision (creates sandbox + TemperFS workspace). +//!
Follow-up messages append to the existing TemperFS conversation file and +//! use Resume (reuses workspace, restores sandbox files). + +use std::collections::BTreeMap; +use std::sync::Arc; +use std::sync::atomic::{AtomicU64, Ordering}; +use std::time::Duration; + +use futures_util::{SinkExt, StreamExt}; +use tokio::sync::RwLock; +use tokio_tungstenite::tungstenite::Message; + +use crate::request_context::AgentContext; +use crate::state::ServerState; + +use super::discord_types::*; + +use temper_runtime::tenant::TenantId; + +/// Discord REST API v10 base URL. +const DISCORD_API_BASE: &str = "https://discord.com/api/v10"; + +/// Principal kind for internal (server-to-server) TemperFS calls. +const INTERNAL_PRINCIPAL_KIND: &str = "admin"; + +/// Configuration for a Discord channel transport. +#[derive(Debug, Clone)] +pub struct DiscordTransportConfig { + /// Bot token for authentication. + pub bot_token: String, + /// Tenant to route messages to. + pub tenant: String, + /// Gateway intents bitmask. + pub intents: u32, +} + +/// Tracks a pending agent reply mapped to a Discord channel + user. +#[derive(Debug, Clone)] +struct PendingReply { + /// Discord channel ID for reply delivery. + discord_channel_id: String, + /// Discord user ID (for session tracking after completion). + discord_user_id: String, +} + +/// Per-user conversation session. Saved after the first agent completes so +/// follow-up messages can Resume with the same session tree. +/// Serializable for persistence to TemperFS across server restarts. +#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)] +struct UserSession { + /// TemperFS conversation file entity ID (legacy, passed for backward compat). + conversation_file_id: String, + /// TemperFS workspace entity ID. + workspace_id: String, + /// Sandbox URL (local or E2B). + sandbox_url: String, + /// Sandbox ID. + sandbox_id: String, + /// TemperFS file manifest entity ID. 
+ file_manifest_id: String, + /// TemperFS session tree file ID (JSONL format). + session_file_id: String, + /// Current leaf entry ID in the session tree. + session_leaf_id: String, +} + +/// Discord channel transport. +/// +/// Manages the Gateway WebSocket lifecycle, creates TemperAgent entities +/// for inbound messages, and delivers agent results via Discord REST API. +/// Maintains per-user sessions for conversation continuity. +pub struct DiscordTransport { + config: DiscordTransportConfig, + state: ServerState, + http: reqwest::Client, + /// Maps TemperAgent entity_id → PendingReply for reply routing. + pending_replies: Arc<RwLock<BTreeMap<String, PendingReply>>>, + /// Per-user conversation sessions (keyed by Discord user ID). + user_sessions: Arc<RwLock<BTreeMap<String, UserSession>>>, + /// Set of Discord user IDs with an active (in-flight) agent. + active_users: Arc<RwLock<BTreeMap<String, Vec<String>>>>, + /// Bot's own user ID (populated after READY event). + bot_user_id: Arc<RwLock<String>>, + /// Last sequence number received (for heartbeat + resume). + sequence: Arc<AtomicU64>, + /// Session ID for resume (populated after READY event). + session_id: Arc<RwLock<Option<String>>>, + /// Resume gateway URL (populated after READY event). + resume_url: Arc<RwLock<Option<String>>>, + /// TemperFS File entity ID for the sessions manifest (populated on first save). + sessions_file_id: Arc<RwLock<Option<String>>>, +} + +impl DiscordTransport { + /// Create a new Discord transport. + pub fn new(config: DiscordTransportConfig, state: ServerState) -> Self { + Self { + config, + state, + http: reqwest::Client::new(), + pending_replies: Arc::new(RwLock::new(BTreeMap::new())), + user_sessions: Arc::new(RwLock::new(BTreeMap::new())), + active_users: Arc::new(RwLock::new(BTreeMap::new())), + bot_user_id: Arc::new(RwLock::new(String::new())), + sequence: Arc::new(AtomicU64::new(0)), + session_id: Arc::new(RwLock::new(None)), + resume_url: Arc::new(RwLock::new(None)), + sessions_file_id: Arc::new(RwLock::new(None)), + } + } + + /// Well-known name for the sessions manifest file in TemperFS.
+ const SESSIONS_FILE_NAME: &str = "discord-sessions.json"; + + /// Load persisted user sessions from TemperFS on startup. + /// + /// Looks for a File entity named "discord-sessions.json" in the tenant. + /// If found, reads the JSON manifest and populates `user_sessions`. + async fn load_persisted_sessions(&self) { + let base_url = self.temper_api_url(); + let tenant = &self.config.tenant; + + // Query for the sessions manifest file by name. + let query_url = format!( + "{base_url}/tdata/Files?$filter=name eq '{}'", + Self::SESSIONS_FILE_NAME + ); + let resp = match self + .http + .get(&query_url) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .send() + .await + { + Ok(r) => r, + Err(e) => { + eprintln!(" [discord] Failed to query sessions file: {e}"); + return; + } + }; + + if !resp.status().is_success() { + // TemperFS not available yet — sessions will be fresh. + return; + } + + let body = resp.text().await.unwrap_or_default(); + let data: serde_json::Value = match serde_json::from_str(&body) { + Ok(v) => v, + Err(_) => return, + }; + + // Extract the first matching file entity. + let file_id = data + .get("value") + .and_then(|v| v.as_array()) + .and_then(|arr| arr.first()) + .and_then(|item| item.get("Id").or_else(|| item.get("entity_id"))) + .and_then(|v| v.as_str()); + + let Some(file_id) = file_id else { + println!(" [discord] No persisted sessions found (first run)"); + return; + }; + + // Store the file ID for future saves. + *self.sessions_file_id.write().await = Some(file_id.to_string()); + + // Read the file content. 
+ let content_url = format!("{base_url}/tdata/Files('{file_id}')/$value"); + let content_resp = match self + .http + .get(&content_url) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .send() + .await + { + Ok(r) if r.status().is_success() => r, + _ => return, + }; + + let content = content_resp.text().await.unwrap_or_default(); + let sessions: BTreeMap<String, UserSession> = match serde_json::from_str(&content) { + Ok(s) => s, + Err(e) => { + eprintln!(" [discord] Failed to parse sessions manifest: {e}"); + return; + } + }; + + let count = sessions.len(); + *self.user_sessions.write().await = sessions; + println!(" [discord] Restored {count} user session(s) from TemperFS"); + } + + /// Persist the current user sessions to TemperFS. + /// Run the transport. Connects to Discord Gateway, handles events, and + /// reconnects on failure. This method runs indefinitely. + pub async fn run(&self) -> Result<(), String> { + // Load persisted sessions from TemperFS before connecting. + self.load_persisted_sessions().await; + + // Fetch gateway URL. + let gateway_url = self.fetch_gateway_url().await?; + println!(" [discord] Gateway URL: {gateway_url}"); + + // Spawn reply watcher. + self.spawn_reply_watcher(); + + // Connect and run event loop with reconnection. + let mut backoff = Duration::from_secs(1); + let mut url = format!("{gateway_url}/?v=10&encoding=json"); + + loop { + match self.connect_and_run(&url).await { + Ok(()) => { + backoff = Duration::from_secs(1); + } + Err(e) => { + eprintln!(" [discord] Gateway error: {e}"); + tokio::time::sleep(backoff).await; // determinism-ok: reconnect backoff for Discord Gateway + backoff = (backoff * 2).min(Duration::from_secs(60)); + } + } + + // Use resume URL if available. + if let Some(resume) = self.resume_url.read().await.as_ref() { + url = format!("{resume}/?v=10&encoding=json"); + } + + println!(" [discord] Reconnecting..."); + } + } + + /// Fetch the Gateway bot URL from Discord REST API.
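The reconnect loop in `run` doubles the sleep after each failure, caps it at 60 seconds, and resets to 1 second after a clean exit. The schedule that policy produces can be sketched and checked directly (`next_backoff` is an illustrative helper, not a function in the crate):

```rust
use std::time::Duration;

// Sketch of the backoff step used in run(): double the current delay,
// saturating at the 60-second cap.
fn next_backoff(current: Duration) -> Duration {
    (current * 2).min(Duration::from_secs(60))
}

fn main() {
    let mut b = Duration::from_secs(1);
    let mut seen = Vec::new();
    for _ in 0..8 {
        seen.push(b.as_secs());
        b = next_backoff(b);
    }
    // 1, 2, 4, 8, 16, 32, then pinned at the 60s cap.
    assert_eq!(seen, vec![1, 2, 4, 8, 16, 32, 60, 60]);
}
```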
+ async fn fetch_gateway_url(&self) -> Result<String, String> { + let resp = self + .http + .get(format!("{DISCORD_API_BASE}/gateway/bot")) + .header("Authorization", format!("Bot {}", self.config.bot_token)) + .send() + .await + .map_err(|e| format!("Failed to fetch gateway URL: {e}"))?; + + if !resp.status().is_success() { + let status = resp.status(); + let body = resp.text().await.unwrap_or_default(); + return Err(format!("Gateway bot endpoint returned {status}: {body}")); + } + + let bot_resp: GatewayBotResponse = resp + .json() + .await + .map_err(|e| format!("Failed to parse gateway response: {e}"))?; + + Ok(bot_resp.url) + } + + /// Connect to the Gateway WebSocket and run the event loop. + async fn connect_and_run(&self, url: &str) -> Result<(), String> { + let (ws, _) = tokio_tungstenite::connect_async(url) // determinism-ok: WebSocket for channel transport + .await + .map_err(|e| format!("WebSocket connect failed: {e}"))?; + + let (mut write, mut read) = ws.split(); + + // Wait for Hello (op 10). + let hello = self + .read_payload(&mut read) + .await? + .ok_or("Connection closed before Hello")?; + + if hello.op != GatewayOpcode::Hello as u8 { + return Err(format!("Expected Hello (op 10), got op {}", hello.op)); + } + + let hello_data: HelloData = + serde_json::from_value(hello.d.ok_or("Hello missing data field")?) + .map_err(|e| format!("Failed to parse Hello: {e}"))?; + + let heartbeat_interval = Duration::from_millis(hello_data.heartbeat_interval); + + // Send Identify or Resume. + let can_resume = self.session_id.read().await.is_some(); + if can_resume { + self.send_resume(&mut write).await?; + } else { + self.send_identify(&mut write).await?; + } + + // Send presence update (opcode 3) immediately after identify/resume. + // Minimal payload: just set status to "online".
+ let presence = serde_json::json!({ + "op": 3, + "d": { + "since": null, + "activities": [], + "status": "online", + "afk": false + } + }); + let presence_json = serde_json::to_string(&presence).unwrap_or_default(); + let _ = write.send(Message::Text(presence_json.into())).await; + + // Heartbeat ticker: sends ticks via mpsc so the main loop can + // multiplex heartbeats with WebSocket reads on a single write half. + let (heartbeat_tx, mut heartbeat_rx) = tokio::sync::mpsc::channel::<()>(1); + let heartbeat_task = async move { + let mut interval = tokio::time::interval(heartbeat_interval); + loop { + interval.tick().await; + if heartbeat_tx.send(()).await.is_err() { + break; + } + } + }; + tokio::spawn(heartbeat_task); // determinism-ok: periodic heartbeat for Discord Gateway + + // Main event loop: multiplex between WebSocket reads and heartbeat ticks. + loop { + tokio::select! { + frame = read.next() => { + let Some(frame) = frame else { + return Ok(()); // Connection closed. + }; + let frame = frame.map_err(|e| format!("WebSocket read error: {e}"))?; + let Some(payload) = self.parse_frame(frame)? else { + continue; + }; + let should_reconnect = self.handle_payload(payload).await?; + if should_reconnect { + return Ok(()); + } + } + Some(()) = heartbeat_rx.recv() => { + let s = self.sequence.load(Ordering::Relaxed); + let payload = HeartbeatPayload { + op: GatewayOpcode::Heartbeat as u8, + d: if s > 0 { Some(s) } else { None }, + }; + let json = serde_json::to_string(&payload).unwrap_or_default(); + write + .send(Message::Text(json.into())) + .await + .map_err(|e| format!("Heartbeat send failed: {e}"))?; + } + } + } + } + + /// Send Identify payload. 
+    async fn send_identify(
+        &self,
+        write: &mut futures_util::stream::SplitSink<
+            tokio_tungstenite::WebSocketStream<
+                tokio_tungstenite::MaybeTlsStream<tokio::net::TcpStream>,
+            >,
+            Message,
+        >,
+    ) -> Result<(), String> {
+        let identify = IdentifyPayload {
+            op: GatewayOpcode::Identify as u8,
+            d: IdentifyData {
+                token: self.config.bot_token.clone(),
+                intents: self.config.intents,
+                properties: ConnectionProperties {
+                    os: "linux".to_string(),
+                    browser: "temper".to_string(),
+                    device: "temper".to_string(),
+                },
+                presence: Some(PresenceUpdateData {
+                    since: None,
+                    activities: vec![],
+                    status: "online".to_string(),
+                    afk: false,
+                }),
+            },
+        };
+        let json = serde_json::to_string(&identify)
+            .map_err(|e| format!("Failed to serialize Identify: {e}"))?;
+        write
+            .send(Message::Text(json.into()))
+            .await
+            .map_err(|e| format!("Identify send failed: {e}"))?;
+        Ok(())
+    }
+
+    /// Send Resume payload.
+    async fn send_resume(
+        &self,
+        write: &mut futures_util::stream::SplitSink<
+            tokio_tungstenite::WebSocketStream<
+                tokio_tungstenite::MaybeTlsStream<tokio::net::TcpStream>,
+            >,
+            Message,
+        >,
+    ) -> Result<(), String> {
+        let session_id = self
+            .session_id
+            .read()
+            .await
+            .clone()
+            .ok_or("No session ID for resume")?;
+        let resume = ResumePayload {
+            op: GatewayOpcode::Resume as u8,
+            d: ResumeData {
+                token: self.config.bot_token.clone(),
+                session_id,
+                seq: self.sequence.load(Ordering::Relaxed),
+            },
+        };
+        let json = serde_json::to_string(&resume)
+            .map_err(|e| format!("Failed to serialize Resume: {e}"))?;
+        write
+            .send(Message::Text(json.into()))
+            .await
+            .map_err(|e| format!("Resume send failed: {e}"))?;
+        Ok(())
+    }
+
+    /// Read and parse one Gateway payload from the WebSocket.
+    async fn read_payload(
+        &self,
+        read: &mut futures_util::stream::SplitStream<
+            tokio_tungstenite::WebSocketStream<
+                tokio_tungstenite::MaybeTlsStream<tokio::net::TcpStream>,
+            >,
+        >,
+    ) -> Result<Option<GatewayPayload>, String> {
+        let frame = tokio::time::timeout(Duration::from_secs(60), read.next())
+            .await
+            .map_err(|_| "Timed out waiting for Gateway payload".to_string())?;
+        let Some(frame) = frame else {
+            return Ok(None);
+        };
+        let frame = frame.map_err(|e| format!("WebSocket read error: {e}"))?;
+        self.parse_frame(frame)
+    }
+
+    /// Parse a WebSocket frame into a Gateway payload.
+    fn parse_frame(&self, frame: Message) -> Result<Option<GatewayPayload>, String> {
+        let text = match frame {
+            Message::Text(t) => t.to_string(),
+            Message::Binary(b) => {
+                String::from_utf8(b.to_vec()).map_err(|e| format!("Invalid UTF-8: {e}"))?
+            }
+            Message::Close(_) => return Ok(None),
+            _ => return Ok(None),
+        };
+        let payload: GatewayPayload =
+            serde_json::from_str(&text).map_err(|e| format!("Failed to parse payload: {e}"))?;
+        Ok(Some(payload))
+    }
+
+    /// Handle a received Gateway payload. Returns true if we should reconnect.
+    async fn handle_payload(&self, payload: GatewayPayload) -> Result<bool, String> {
+        if let Some(s) = payload.s {
+            self.sequence.store(s, Ordering::Relaxed);
+        }
+
+        let op = GatewayOpcode::from_u8(payload.op);
+
+        match op {
+            Some(GatewayOpcode::Dispatch) => {
+                let event_name = payload.t.as_deref().unwrap_or("");
+                match event_name {
+                    "READY" => {
+                        if let Some(d) = payload.d {
+                            self.handle_ready(d).await?;
+                        }
+                    }
+                    "MESSAGE_CREATE" => {
+                        if let Some(d) = payload.d {
+                            self.handle_message_create(d).await;
+                        }
+                    }
+                    _ => {}
+                }
+                Ok(false)
+            }
+            Some(GatewayOpcode::HeartbeatAck) => Ok(false),
+            Some(GatewayOpcode::Reconnect) => {
+                println!(" [discord] Server requested reconnect");
+                Ok(true)
+            }
+            Some(GatewayOpcode::InvalidSession) => {
+                let resumable = payload.d.and_then(|v| v.as_bool()).unwrap_or(false);
+                if !resumable {
+                    *self.session_id.write().await = None;
+                }
+                println!(" [discord] Invalid session (resumable={resumable})");
+                Ok(true)
+            }
+            _ => Ok(false),
+        }
+    }
+
+    /// Handle READY event: store bot user ID and session info.
+    async fn handle_ready(&self, data: serde_json::Value) -> Result<(), String> {
+        let ready: ReadyData =
+            serde_json::from_value(data).map_err(|e| format!("Failed to parse READY: {e}"))?;
+
+        println!(
+            " [discord] Connected as {}#{} ({})",
+            ready.user.username,
+            ready.user.discriminator.as_deref().unwrap_or("0"),
+            ready.user.id
+        );
+
+        *self.bot_user_id.write().await = ready.user.id;
+        *self.session_id.write().await = Some(ready.session_id);
+        *self.resume_url.write().await = Some(ready.resume_gateway_url);
+
+        Ok(())
+    }
+
+    /// Handle MESSAGE_CREATE: route to first-message or follow-up flow.
+    ///
+    /// First message from a user → Configure + Provision (new sandbox + workspace).
+    /// Follow-up messages → append to TemperFS conversation, Configure + Resume.
+    /// If an agent is already running for this user, queue the message content.
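+    ///
+    /// Flow sketch (mirrors the body below):
+    ///   DM arrives → own message or guild message? drop
+    ///   → agent already active for user? queue content, send typing, return
+    ///   → mark user active, send typing
+    ///   → saved session? handle_followup_message : handle_first_message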
+ async fn handle_message_create(&self, data: serde_json::Value) { + let msg: MessageCreateData = match serde_json::from_value(data) { + Ok(m) => m, + Err(e) => { + eprintln!(" [discord] Failed to parse MESSAGE_CREATE: {e}"); + return; + } + }; + + // Ignore bot's own messages. + let bot_id = self.bot_user_id.read().await.clone(); + if msg.author.id == bot_id || msg.author.bot { + return; + } + + // For now, only process DMs (no guild_id). + if msg.guild_id.is_some() { + return; + } + + println!( + " [discord] Message from {}: {}", + msg.author.username, + truncate(&msg.content, 80) + ); + + let user_id = msg.author.id.clone(); + + // If there's already an active agent for this user, queue the message. + { + let mut active = self.active_users.write().await; + if active.contains_key(&user_id) { + println!( + " [discord] Queuing message for {} (agent in progress)", + msg.author.username + ); + active + .entry(user_id.clone()) + .or_default() + .push(msg.content.clone()); + self.send_typing(&msg.channel_id).await; + return; + } + // Mark user as active (empty queue). + active.insert(user_id.clone(), Vec::new()); + } + + self.send_typing(&msg.channel_id).await; + + let has_session = self.user_sessions.read().await.contains_key(&user_id); + + if has_session { + self.handle_followup_message(&msg).await; + } else { + self.handle_first_message(&msg).await; + } + } + + /// Handle the first message from a user: Configure + Provision. + async fn handle_first_message(&self, msg: &MessageCreateData) { + let entity_id = format!("discord-{}", msg.id); + let tenant = TenantId::new(&self.config.tenant); + let user_id = &msg.author.id; + + // Track pending reply. 
+ self.pending_replies.write().await.insert( + entity_id.clone(), + PendingReply { + discord_channel_id: msg.channel_id.clone(), + discord_user_id: user_id.clone(), + }, + ); + + let agent_ctx = AgentContext { + agent_id: Some(format!("discord-transport:{user_id}")), + session_id: None, + agent_type: Some("system".to_string()), + intent: None, + }; + + // Create the TemperAgent entity. + let initial_fields = serde_json::json!({ "id": entity_id }); + if let Err(e) = self + .state + .get_or_create_tenant_entity(&tenant, "TemperAgent", &entity_id, initial_fields) + .await + { + eprintln!(" [discord] Failed to create TemperAgent: {e}"); + self.cleanup_failed_agent(&entity_id, user_id).await; + return; + } + + let temper_api_url = self.temper_api_url(); + + let configure_params = serde_json::json!({ + "system_prompt": self.system_prompt(&msg.author.username), + "user_message": msg.content, + "temper_api_url": temper_api_url, + }); + + if let Err(e) = self + .state + .dispatch_tenant_action( + &tenant, + "TemperAgent", + &entity_id, + "Configure", + configure_params, + &agent_ctx, + ) + .await + { + eprintln!(" [discord] Configure failed: {e}"); + self.cleanup_failed_agent(&entity_id, user_id).await; + return; + } + + // Provision triggers: sandbox_provisioner → SandboxReady → call_llm → ... + match self + .state + .dispatch_tenant_action( + &tenant, + "TemperAgent", + &entity_id, + "Provision", + serde_json::json!({}), + &agent_ctx, + ) + .await + { + Ok(resp) if resp.success => { + println!( + " [discord] Agent {entity_id} provisioning (first message from {})", + msg.author.username + ); + } + Ok(resp) => { + eprintln!( + " [discord] Provision failed: {}", + resp.error.unwrap_or_default() + ); + self.cleanup_failed_agent(&entity_id, user_id).await; + } + Err(e) => { + eprintln!(" [discord] Provision dispatch error: {e}"); + self.cleanup_failed_agent(&entity_id, user_id).await; + } + } + } + + /// Handle a follow-up message: append to session tree, Configure + Resume. 
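+    ///
+    /// Appended entries are JSONL lines whose `parentId` points at the
+    /// previous leaf, e.g. (illustrative):
+    /// `{"id":"u-discord-3","parentId":"<previous leaf id>","type":"message","role":"user","content":"...","tokens":12}`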
+    async fn handle_followup_message(&self, msg: &MessageCreateData) {
+        let entity_id = format!("discord-{}", msg.id);
+        let tenant = TenantId::new(&self.config.tenant);
+        let user_id = &msg.author.id;
+
+        // Clone the session out in its own statement so the read guard is
+        // dropped before any write lock below. Matching directly on the
+        // locked expression would keep the guard alive for the whole match
+        // (scrutinee temporaries live until the match ends) and deadlock
+        // against the `write().await` calls in the arms.
+        let session = self.user_sessions.read().await.get(user_id).cloned();
+        let session = match session {
+            Some(s) => s,
+            None => {
+                // Race: session disappeared. Fall back to first message flow.
+                self.handle_first_message(msg).await;
+                return;
+            }
+        };
+
+        // Append the new user message to the session tree (preferred) or
+        // legacy conversation file (fallback when session tree not available).
+        let new_leaf_id = if !session.session_file_id.is_empty() {
+            match self
+                .append_to_session_tree(
+                    &session.session_file_id,
+                    &session.session_leaf_id,
+                    &msg.content,
+                )
+                .await
+            {
+                Ok(leaf_id) => Some(leaf_id),
+                Err(e) => {
+                    eprintln!(" [discord] Failed to append to session tree: {e}");
+                    self.user_sessions.write().await.remove(user_id);
+                    self.handle_first_message(msg).await;
+                    return;
+                }
+            }
+        } else if !session.conversation_file_id.is_empty() {
+            if let Err(e) = self
+                .append_to_legacy_conversation(&session.conversation_file_id, &msg.content)
+                .await
+            {
+                eprintln!(" [discord] Failed to append to conversation: {e}");
+                self.user_sessions.write().await.remove(user_id);
+                self.handle_first_message(msg).await;
+                return;
+            }
+            None
+        } else {
+            self.user_sessions.write().await.remove(user_id);
+            self.handle_first_message(msg).await;
+            return;
+        };
+
+        // Track pending reply.
+        self.pending_replies.write().await.insert(
+            entity_id.clone(),
+            PendingReply {
+                discord_channel_id: msg.channel_id.clone(),
+                discord_user_id: user_id.clone(),
+            },
+        );
+
+        let agent_ctx = AgentContext {
+            agent_id: Some(format!("discord-transport:{user_id}")),
+            session_id: None,
+            agent_type: Some("system".to_string()),
+            intent: None,
+        };
+
+        // Create a new TemperAgent entity for this turn.
+ let initial_fields = serde_json::json!({ "id": entity_id }); + if let Err(e) = self + .state + .get_or_create_tenant_entity(&tenant, "TemperAgent", &entity_id, initial_fields) + .await + { + eprintln!(" [discord] Failed to create TemperAgent: {e}"); + self.cleanup_failed_agent(&entity_id, user_id).await; + return; + } + + let temper_api_url = self.temper_api_url(); + + // Configure sets system_prompt, model, etc. user_message is set but + // won't be used by llm_caller since the conversation file already has + // messages — it reads from TemperFS instead. + let configure_params = serde_json::json!({ + "system_prompt": self.system_prompt(&msg.author.username), + "user_message": msg.content, + "temper_api_url": temper_api_url, + }); + + if let Err(e) = self + .state + .dispatch_tenant_action( + &tenant, + "TemperAgent", + &entity_id, + "Configure", + configure_params, + &agent_ctx, + ) + .await + { + eprintln!(" [discord] Configure failed: {e}"); + self.cleanup_failed_agent(&entity_id, user_id).await; + return; + } + + // Resume with session state. Pass session tree fields if available. 
+        let mut resume_params = serde_json::json!({
+            "sandbox_url": session.sandbox_url,
+            "sandbox_id": session.sandbox_id,
+            "workspace_id": session.workspace_id,
+            "conversation_file_id": session.conversation_file_id,
+            "file_manifest_id": session.file_manifest_id,
+        });
+        if let Some(ref leaf_id) = new_leaf_id {
+            resume_params["session_file_id"] = serde_json::json!(session.session_file_id);
+            resume_params["session_leaf_id"] = serde_json::json!(leaf_id);
+        }
+
+        match self
+            .state
+            .dispatch_tenant_action(
+                &tenant,
+                "TemperAgent",
+                &entity_id,
+                "Resume",
+                resume_params,
+                &agent_ctx,
+            )
+            .await
+        {
+            Ok(resp) if resp.success => {
+                println!(
+                    " [discord] Agent {entity_id} resuming conversation for {}",
+                    msg.author.username
+                );
+            }
+            Ok(resp) => {
+                eprintln!(
+                    " [discord] Resume failed: {}",
+                    resp.error.unwrap_or_default()
+                );
+                // Clear session and retry as first message.
+                self.user_sessions.write().await.remove(user_id);
+                self.cleanup_failed_agent(&entity_id, user_id).await;
+            }
+            Err(e) => {
+                eprintln!(" [discord] Resume dispatch error: {e}");
+                self.user_sessions.write().await.remove(user_id);
+                self.cleanup_failed_agent(&entity_id, user_id).await;
+            }
+        }
+    }
+
+    /// Append a user message to the session tree JSONL file in TemperFS.
+    ///
+    /// Reads the current JSONL, appends a new user message entry with the
+    /// correct `parentId`, and writes it back. Returns the new leaf entry ID.
+    async fn append_to_session_tree(
+        &self,
+        session_file_id: &str,
+        session_leaf_id: &str,
+        content: &str,
+    ) -> Result<String, String> {
+        let base_url = self.temper_api_url();
+        let tenant = &self.config.tenant;
+
+        // Read current session tree JSONL from TemperFS.
+ let get_url = format!("{base_url}/tdata/Files('{session_file_id}')/$value"); + let resp = self + .http + .get(&get_url) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .send() + .await + .map_err(|e| format!("GET session tree failed: {e}"))?; + + if !resp.status().is_success() { + let status = resp.status(); + let body = resp.text().await.unwrap_or_default(); + return Err(format!("GET session tree returned {status}: {body}")); + } + + let mut body = resp + .text() + .await + .map_err(|e| format!("read session tree body: {e}"))?; + + // Count existing entries to generate a unique ID. + let entry_count = body.lines().filter(|l| !l.trim().is_empty()).count(); + let new_id = format!("u-discord-{entry_count}"); + let tokens = content.len() / 4; // rough estimate matching session_tree_lib + + // Append new user message entry as JSONL line. + let entry = serde_json::json!({ + "id": new_id, + "parentId": session_leaf_id, + "type": "message", + "role": "user", + "content": content, + "tokens": tokens, + }); + + if !body.ends_with('\n') && !body.is_empty() { + body.push('\n'); + } + body.push_str(&entry.to_string()); + body.push('\n'); + + // Write updated JSONL back to TemperFS. 
+ let put_url = format!("{base_url}/tdata/Files('{session_file_id}')/$value"); + let put_resp = self + .http + .put(&put_url) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .header("content-type", "application/octet-stream") + .body(body) + .send() + .await + .map_err(|e| format!("PUT session tree failed: {e}"))?; + + if !put_resp.status().is_success() { + let status = put_resp.status(); + let body = put_resp.text().await.unwrap_or_default(); + return Err(format!("PUT session tree returned {status}: {body}")); + } + + println!( + " [discord] Appended user message to session tree {session_file_id} (new leaf={new_id})", + ); + + Ok(new_id) + } + + /// Append a user message to the legacy flat JSON conversation file. + async fn append_to_legacy_conversation( + &self, + conversation_file_id: &str, + content: &str, + ) -> Result<(), String> { + let base_url = self.temper_api_url(); + let tenant = &self.config.tenant; + + let get_url = format!("{base_url}/tdata/Files('{conversation_file_id}')/$value"); + let resp = self + .http + .get(&get_url) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .send() + .await + .map_err(|e| format!("GET conversation failed: {e}"))?; + + if !resp.status().is_success() { + let status = resp.status(); + let body = resp.text().await.unwrap_or_default(); + return Err(format!("GET conversation returned {status}: {body}")); + } + + let body = resp + .text() + .await + .map_err(|e| format!("read conversation body: {e}"))?; + + let mut conv: serde_json::Value = + serde_json::from_str(&body).map_err(|e| format!("parse conversation JSON: {e}"))?; + + let msg_count = { + let messages = conv + .get_mut("messages") + .and_then(|v| v.as_array_mut()) + .ok_or("conversation missing messages array")?; + messages.push(serde_json::json!({ "role": "user", "content": content })); + messages.len() + }; + + let put_url = 
format!("{base_url}/tdata/Files('{conversation_file_id}')/$value"); + let put_resp = self + .http + .put(&put_url) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .header("content-type", "application/json") + .body(conv.to_string()) + .send() + .await + .map_err(|e| format!("PUT conversation failed: {e}"))?; + + if !put_resp.status().is_success() { + let status = put_resp.status(); + let body = put_resp.text().await.unwrap_or_default(); + return Err(format!("PUT conversation returned {status}: {body}")); + } + + println!( + " [discord] Appended user message to conversation {conversation_file_id} ({msg_count} messages)", + ); + Ok(()) + } + + /// Spawn a task that watches for TemperAgent completion and delivers replies. + /// + /// On completion: saves session state, delivers reply, drains queued messages. + fn spawn_reply_watcher(&self) { + let event_rx = self.state.event_tx.subscribe(); + let pending_replies = self.pending_replies.clone(); + let user_sessions = self.user_sessions.clone(); + let active_users = self.active_users.clone(); + let http = self.http.clone(); + let bot_token = self.config.bot_token.clone(); + let tenant = self.config.tenant.clone(); + let temper_api_url = self.temper_api_url(); + let sessions_file_id = self.sessions_file_id.clone(); + let state = self.state.clone(); + + let reply_task = async move { + let mut rx = tokio_stream::wrappers::BroadcastStream::new(event_rx); + + while let Some(Ok(event)) = rx.next().await { + // Watch for TemperAgent reaching terminal states. + if event.tenant != tenant || event.entity_type != "TemperAgent" { + continue; + } + + let is_completed = event.action == "RecordResult" && event.status == "Completed"; + let is_failed = event.action == "Fail" && event.status == "Failed"; + + if !is_completed && !is_failed { + continue; + } + + // Check if this agent has a pending Discord reply. 
+ let reply_info = { + let mut pending = pending_replies.write().await; + pending.remove(&event.entity_id) + }; + + let Some(reply_info) = reply_info else { + continue; // Not a Discord-originated agent. + }; + + let channel_id = &reply_info.discord_channel_id; + let user_id = &reply_info.discord_user_id; + + // Read entity state for result + session details. + let tenant_id = TenantId::new(&tenant); + let entity_state = state + .get_tenant_entity_state(&tenant_id, "TemperAgent", &event.entity_id) + .await; + + if is_failed { + let error_msg = entity_state + .as_ref() + .ok() + .and_then(|s| s.state.fields.get("error_message")) + .and_then(|v| v.as_str()) + .unwrap_or("unknown error"); + eprintln!(" [discord] Agent {} failed: {error_msg}", event.entity_id); + let _ = send_discord_message( + &http, + &bot_token, + channel_id, + "Sorry, I encountered an error processing your message.", + ) + .await; + // Clear active state but preserve session for retry. + active_users.write().await.remove(user_id); + continue; + } + + // Agent completed — extract result and save session. + if let Ok(ref resp) = entity_state { + let fields = &resp.state.fields; + + // Save session state for conversation continuity. + let conv_file_id = fields + .get("conversation_file_id") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + + let sess_file_id = fields + .get("session_file_id") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + + if !sess_file_id.is_empty() || !conv_file_id.is_empty() { + // If no session tree exists, bootstrap one from the + // conversation. This enables compaction on follow-ups. 
+ let (sess_file_id, sess_leaf_id) = if sess_file_id.is_empty() + && !conv_file_id.is_empty() + { + match create_session_tree_from_conversation( + &http, + &temper_api_url, + &tenant, + &conv_file_id, + &event.entity_id, + ) + .await + { + Ok((fid, lid)) => { + println!( + " [discord] Created session tree for user {user_id} (file={fid}, leaf={lid})" + ); + (fid, lid) + } + Err(e) => { + eprintln!(" [discord] Failed to create session tree: {e}"); + (String::new(), String::new()) + } + } + } else { + let leaf = fields + .get("session_leaf_id") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + (sess_file_id, leaf) + }; + + let session = UserSession { + conversation_file_id: conv_file_id, + workspace_id: fields + .get("workspace_id") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(), + sandbox_url: fields + .get("sandbox_url") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(), + sandbox_id: fields + .get("sandbox_id") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(), + file_manifest_id: fields + .get("file_manifest_id") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(), + session_file_id: sess_file_id, + session_leaf_id: sess_leaf_id, + }; + println!( + " [discord] Saved session for user {user_id} (session_file={}, leaf={})", + session.session_file_id, session.session_leaf_id + ); + user_sessions.write().await.insert(user_id.clone(), session); + + // Persist sessions to TemperFS for restart resilience. + persist_sessions_to_temperfs( + &http, + &temper_api_url, + &tenant, + &sessions_file_id, + &user_sessions, + ) + .await; + } + + // Deliver the reply. Guard against empty result. 
+                    let result_text = fields
+                        .get("result")
+                        .and_then(|v| v.as_str())
+                        .filter(|s| !s.trim().is_empty())
+                        .unwrap_or("(I processed your message but had no response to give.)")
+                        .to_string();
+
+                    println!(
+                        " [discord] Delivering reply for {} ({} chars)",
+                        event.entity_id,
+                        result_text.len()
+                    );
+
+                    if let Err(e) =
+                        send_discord_message(&http, &bot_token, channel_id, &result_text).await
+                    {
+                        eprintln!(" [discord] Reply delivery failed: {e}");
+                    }
+                } else {
+                    eprintln!(
+                        " [discord] Failed to read agent state for {}",
+                        event.entity_id
+                    );
+                    let _ = send_discord_message(
+                        &http,
+                        &bot_token,
+                        channel_id,
+                        "Sorry, I couldn't retrieve my response.",
+                    )
+                    .await;
+                }
+
+                // Clear active state and check for queued messages.
+                let queued = active_users.write().await.remove(user_id);
+                if let Some(queued_msgs) = queued
+                    && !queued_msgs.is_empty()
+                {
+                    // Replaying queued content would require driving the full
+                    // follow-up flow from this watcher task, which it cannot
+                    // do yet. Log and drop the queued content; clearing the
+                    // active marker above lets the user's next message go
+                    // through the normal follow-up flow with the saved session.
+                    println!(
+                        " [discord] Dropping {} queued message(s) for {user_id}; they must be resent",
+                        queued_msgs.len()
+                    );
+                }
+            }
+        };
+        tokio::spawn(reply_task); // determinism-ok: background task for reply delivery
+    }
+
+    /// Send a typing indicator to a Discord channel.
+    async fn send_typing(&self, channel_id: &str) {
+        let _ = self
+            .http
+            .post(format!("{DISCORD_API_BASE}/channels/{channel_id}/typing"))
+            .header("Authorization", format!("Bot {}", self.config.bot_token))
+            .send()
+            .await;
+    }
+
+    /// Get the local server URL for TemperFS API calls.
+    fn temper_api_url(&self) -> String {
+        let port = self.state.listen_port.get().copied().unwrap_or(3000);
+        format!("http://127.0.0.1:{port}")
+    }
+
+    /// System prompt for Discord DM agents.
+    fn system_prompt(&self, username: &str) -> String {
+        format!(
+            "You are a helpful AI assistant responding to a Discord DM from {username}. \
+             Be concise and conversational. Keep responses under 1500 characters \
+             when possible since Discord has a 2000 character limit per message."
+        )
+    }
+
+    /// Clean up after a failed agent dispatch.
+    async fn cleanup_failed_agent(&self, entity_id: &str, user_id: &str) {
+        self.pending_replies.write().await.remove(entity_id);
+        self.active_users.write().await.remove(user_id);
+    }
+}
+
+/// Persist user sessions to TemperFS. Called from the reply watcher after
+/// session updates. Creates the sessions file on first call.
+async fn persist_sessions_to_temperfs(
+    http: &reqwest::Client,
+    temper_api_url: &str,
+    tenant: &str,
+    sessions_file_id: &Arc<RwLock<Option<String>>>,
+    user_sessions: &Arc<RwLock<HashMap<String, UserSession>>>,
+) {
+    let sessions = user_sessions.read().await.clone();
+    let content = serde_json::to_string_pretty(&sessions).unwrap_or_else(|_| "{}".to_string());
+
+    // Ensure sessions file exists.
+ let file_id = { + let existing = sessions_file_id.read().await.clone(); + if let Some(id) = existing { + id + } else { + let create_body = serde_json::json!({ + "name": "discord-sessions.json", + "mime_type": "application/json", + "path": "/discord-sessions.json", + }); + let resp = match http + .post(format!("{temper_api_url}/tdata/Files")) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .header("content-type", "application/json") + .body(serde_json::to_string(&create_body).unwrap_or_default()) + .send() + .await + { + Ok(r) if r.status().is_success() => r, + Ok(r) => { + eprintln!(" [discord] Failed to create sessions file: {}", r.status()); + return; + } + Err(e) => { + eprintln!(" [discord] Failed to create sessions file: {e}"); + return; + } + }; + + let data: serde_json::Value = resp.json().await.unwrap_or_default(); + let new_id = data + .get("entity_id") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + + if new_id.is_empty() { + eprintln!(" [discord] Sessions file created but no entity_id returned"); + return; + } + + *sessions_file_id.write().await = Some(new_id.clone()); + new_id + } + }; + + // Write sessions JSON to TemperFS. + let put_url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value"); + match http + .put(&put_url) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .header("content-type", "application/json") + .body(content) + .send() + .await + { + Ok(r) if r.status().is_success() => {} + Ok(r) => { + eprintln!(" [discord] Failed to persist sessions: {}", r.status()); + } + Err(e) => { + eprintln!(" [discord] Failed to persist sessions: {e}"); + } + } +} + +/// Create a session tree JSONL file in TemperFS from an existing conversation. +/// +/// Reads the legacy conversation file, converts messages to session tree entries, +/// and creates a new JSONL File in TemperFS. Returns (session_file_id, session_leaf_id). 
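+///
+/// Resulting file shape, one JSON object per line (illustrative):
+///
+/// ```text
+/// {"id":"h-discord-123","parentId":null,"type":"header","version":1,"tokens":0}
+/// {"id":"u-discord-123-0","parentId":"h-discord-123","type":"message","role":"user","content":"hi","tokens":0}
+/// {"id":"a-discord-123-1","parentId":"u-discord-123-0","type":"message","role":"assistant","content":"...","tokens":12}
+/// ```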
+async fn create_session_tree_from_conversation(
+    http: &reqwest::Client,
+    temper_api_url: &str,
+    tenant: &str,
+    conversation_file_id: &str,
+    agent_id: &str,
+) -> Result<(String, String), String> {
+    // Read the existing conversation from TemperFS.
+    let get_url = format!("{temper_api_url}/tdata/Files('{conversation_file_id}')/$value");
+    let resp = http
+        .get(&get_url)
+        .header("x-tenant-id", tenant)
+        .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND)
+        .send()
+        .await
+        .map_err(|e| format!("GET conversation failed: {e}"))?;
+
+    if !resp.status().is_success() {
+        return Err(format!("GET conversation returned {}", resp.status()));
+    }
+
+    let body = resp.text().await.map_err(|e| format!("read body: {e}"))?;
+    let conv: serde_json::Value =
+        serde_json::from_str(&body).map_err(|e| format!("parse JSON: {e}"))?;
+    let messages = conv
+        .get("messages")
+        .and_then(|v| v.as_array())
+        .ok_or("missing messages array")?;
+
+    // Build JSONL session tree from the messages.
+    let header_id = format!("h-{agent_id}");
+    let header = serde_json::json!({
+        "id": header_id,
+        "parentId": null,
+        "type": "header",
+        "version": 1,
+        "tokens": 0
+    });
+    let mut lines = vec![serde_json::to_string(&header).unwrap_or_default()];
+    let mut parent_id = header_id;
+
+    for (i, msg) in messages.iter().enumerate() {
+        let role = msg.get("role").and_then(|v| v.as_str()).unwrap_or("user");
+        // Content may be a plain string or Anthropic's content block array:
+        // [{"type": "text", "text": "..."}]
+        let content = match msg.get("content") {
+            Some(serde_json::Value::String(s)) => s.clone(),
+            Some(serde_json::Value::Array(blocks)) => blocks
+                .iter()
+                .filter_map(|b| b.get("text").and_then(|t| t.as_str()))
+                .collect::<Vec<&str>>()
+                .join(""),
+            _ => String::new(),
+        };
+        // Skip empty messages (e.g., assistant with no content blocks).
+ if content.is_empty() { + continue; + } + let prefix = if role == "assistant" { "a" } else { "u" }; + let entry_id = format!("{prefix}-{agent_id}-{i}"); + let tokens = content.len() / 4; + let entry = serde_json::json!({ + "id": entry_id, + "parentId": parent_id, + "type": "message", + "role": role, + "content": content, + "tokens": tokens, + }); + lines.push(serde_json::to_string(&entry).unwrap_or_default()); + parent_id = entry_id; + } + + let jsonl = lines.join("\n"); + let leaf_id = parent_id; + + // Create session File entity in TemperFS. + let create_body = serde_json::json!({ + "name": "session.jsonl", + "mime_type": "text/plain", + "path": "/session.jsonl" + }); + let create_resp = http + .post(format!("{temper_api_url}/tdata/Files")) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .header("content-type", "application/json") + .body(serde_json::to_string(&create_body).unwrap_or_default()) + .send() + .await + .map_err(|e| format!("POST Files failed: {e}"))?; + + if !create_resp.status().is_success() { + return Err(format!("POST Files returned {}", create_resp.status())); + } + + let create_data: serde_json::Value = create_resp + .json() + .await + .map_err(|e| format!("parse create resp: {e}"))?; + let session_file_id = create_data + .get("entity_id") + .and_then(|v| v.as_str()) + .ok_or("missing entity_id in create response")? + .to_string(); + + // Write JSONL content. 
+ let put_url = format!("{temper_api_url}/tdata/Files('{session_file_id}')/$value"); + let put_resp = http + .put(&put_url) + .header("x-tenant-id", tenant) + .header("x-temper-principal-kind", INTERNAL_PRINCIPAL_KIND) + .header("content-type", "application/octet-stream") + .body(jsonl) + .send() + .await + .map_err(|e| format!("PUT session $value failed: {e}"))?; + + if !put_resp.status().is_success() { + return Err(format!("PUT session $value returned {}", put_resp.status())); + } + + Ok((session_file_id, leaf_id)) +} + +fn truncate(s: &str, max: usize) -> String { + if s.len() <= max { + s.to_string() + } else { + let end = s.floor_char_boundary(max); + format!("{}...", &s[..end]) + } +} + +/// Send a message to a Discord channel via REST API. +pub async fn send_discord_message( + http: &reqwest::Client, + bot_token: &str, + channel_id: &str, + content: &str, +) -> Result<(), String> { + let chunks = split_message(content, 2000); + + for chunk in chunks { + let body = CreateMessageRequest { + content: chunk.to_string(), + }; + + let resp = http + .post(format!("{DISCORD_API_BASE}/channels/{channel_id}/messages")) + .header("Authorization", format!("Bot {bot_token}")) + .header("Content-Type", "application/json") + .json(&body) + .send() + .await + .map_err(|e| format!("Discord message send failed: {e}"))?; + + if !resp.status().is_success() { + let status = resp.status(); + let body = resp.text().await.unwrap_or_default(); + return Err(format!("Discord API returned {status}: {body}")); + } + } + + Ok(()) +} + +/// Split a message into chunks of at most `max_len` characters. +fn split_message(content: &str, max_len: usize) -> Vec<&str> { + if content.len() <= max_len { + return vec![content]; + } + + let mut chunks = Vec::new(); + let mut remaining = content; + + while !remaining.is_empty() { + if remaining.len() <= max_len { + chunks.push(remaining); + break; + } + + // Find char-safe boundary, then try to split at a newline within it. 
+        let boundary = remaining.floor_char_boundary(max_len);
+        let split_at = remaining[..boundary].rfind('\n').unwrap_or(boundary);
+
+        let (chunk, rest) = remaining.split_at(split_at);
+        chunks.push(chunk);
+        remaining = rest.trim_start_matches('\n');
+    }
+
+    chunks
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::events::EntityStateChange;
+
+    /// Check if an EntityStateChange marks a terminal agent state
+    /// (Completed or Failed) for the given tenant.
+    fn is_agent_terminal_event(event: &EntityStateChange, tenant: &str) -> bool {
+        event.tenant == tenant
+            && event.entity_type == "TemperAgent"
+            && (event.status == "Completed" || event.status == "Failed")
+    }
+
+    #[test]
+    fn split_message_short() {
+        let chunks = split_message("hello", 2000);
+        assert_eq!(chunks, vec!["hello"]);
+    }
+
+    #[test]
+    fn split_message_at_newline() {
+        let content = format!("{}\n{}", "a".repeat(1500), "b".repeat(1000));
+        let chunks = split_message(&content, 2000);
+        assert_eq!(chunks.len(), 2);
+        assert_eq!(chunks[0].len(), 1500);
+    }
+
+    #[test]
+    fn is_agent_terminal_completed() {
+        let event = EntityStateChange {
+            seq: 0,
+            entity_type: "TemperAgent".into(),
+            entity_id: "discord-123".into(),
+            action: "RecordResult".into(),
+            status: "Completed".into(),
+            tenant: "rita-agents".into(),
+            agent_id: None,
+            session_id: None,
+        };
+        assert!(is_agent_terminal_event(&event, "rita-agents"));
+    }
+
+    #[test]
+    fn is_agent_terminal_failed() {
+        let event = EntityStateChange {
+            seq: 0,
+            entity_type: "TemperAgent".into(),
+            entity_id: "discord-123".into(),
+            action: "Fail".into(),
+            status: "Failed".into(),
+            tenant: "rita-agents".into(),
+            agent_id: None,
+            session_id: None,
+        };
+        assert!(is_agent_terminal_event(&event, "rita-agents"));
+    }
+
+    #[test]
+    fn is_agent_terminal_ignores_thinking() {
+        let event = EntityStateChange {
+            seq: 0,
+            entity_type: "TemperAgent".into(),
+            entity_id: "discord-123".into(),
+            action: "SandboxReady".into(),
+            status: "Thinking".into(),
+            agent_id: None,
+            session_id: None,
+        };
+        assert!(!is_agent_terminal_event(&event, "rita-agents"));
+    }
+
+    #[test]
+    fn truncate_short() {
+        assert_eq!(truncate("hello", 10), "hello");
+    }
+
+    #[test]
+    fn truncate_long() {
+        assert_eq!(truncate("hello world", 5), "hello...");
+    }
+
+    #[test]
+    fn truncate_emoji_boundary() {
+        // "😀" is 4 bytes — truncating at byte 2 must not panic.
+        let s = "😀hello";
+        let result = truncate(s, 2);
+        assert!(result.ends_with("..."));
+    }
+
+    #[test]
+    fn split_message_emoji_boundary() {
+        let emoji_chunk = "🎉".repeat(600); // 2400 bytes, each emoji 4 bytes
+        let chunks = split_message(&emoji_chunk, 2000);
+        assert!(chunks.len() >= 2);
+        for chunk in &chunks {
+            assert!(chunk.len() <= 2000);
+            // Slicing at a non-char boundary would have panicked inside
+            // split_message, so reaching here implies valid UTF-8 chunks.
+            assert!(!chunk.is_empty());
+        }
+    }
+}
diff --git a/crates/temper-server/src/channels/discord_types.rs b/crates/temper-server/src/channels/discord_types.rs
new file mode 100644
index 00000000..a383afe6
--- /dev/null
+++ b/crates/temper-server/src/channels/discord_types.rs
@@ -0,0 +1,213 @@
+//! Discord Gateway API types.
+//!
+//! Covers the subset of the Discord Gateway v10 protocol needed for
+//! receiving messages and sending replies. Only DM support initially.
+
+use serde::{Deserialize, Serialize};
+
+// ── Gateway opcodes ──────────────────────────────────────────────────
+
+/// Discord Gateway opcodes (v10).
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+#[repr(u8)]
+pub enum GatewayOpcode {
+    /// Server → Client: dispatched event (MESSAGE_CREATE, READY, etc.).
+    Dispatch = 0,
+    /// Client → Server: heartbeat ping.
+    Heartbeat = 1,
+    /// Client → Server: identify payload with token + intents.
+    Identify = 2,
+    /// Client → Server: update bot presence/status.
+    PresenceUpdate = 3,
+    /// Client → Server: resume a dropped session.
+    Resume = 6,
+    /// Server → Client: reconnect request.
+    Reconnect = 7,
+    /// Server → Client: invalid session.
+    InvalidSession = 9,
+    /// Server → Client: hello with heartbeat interval.
+    Hello = 10,
+    /// Server → Client: heartbeat ACK.
+    HeartbeatAck = 11,
+}
+
+impl GatewayOpcode {
+    pub fn from_u8(value: u8) -> Option<Self> {
+        match value {
+            0 => Some(Self::Dispatch),
+            1 => Some(Self::Heartbeat),
+            2 => Some(Self::Identify),
+            3 => Some(Self::PresenceUpdate),
+            6 => Some(Self::Resume),
+            7 => Some(Self::Reconnect),
+            9 => Some(Self::InvalidSession),
+            10 => Some(Self::Hello),
+            11 => Some(Self::HeartbeatAck),
+            _ => None,
+        }
+    }
+}
+
+// ── Gateway payloads ─────────────────────────────────────────────────
+
+/// Raw gateway payload envelope.
+#[derive(Debug, Deserialize)]
+pub struct GatewayPayload {
+    /// Opcode.
+    pub op: u8,
+    /// Event data (opcode-dependent).
+    pub d: Option<serde_json::Value>,
+    /// Sequence number (only for op 0 Dispatch).
+    pub s: Option<u64>,
+    /// Event name (only for op 0 Dispatch, e.g. "MESSAGE_CREATE").
+    pub t: Option<String>,
+}
+
+/// Hello payload (op 10).
+#[derive(Debug, Deserialize)]
+pub struct HelloData {
+    /// Heartbeat interval in milliseconds.
+    pub heartbeat_interval: u64,
+}
+
+/// Ready payload (op 0, t = "READY").
+#[derive(Debug, Deserialize)]
+pub struct ReadyData {
+    /// The bot's user object.
+    pub user: DiscordUser,
+    /// Session ID for resuming.
+    pub session_id: String,
+    /// Gateway URL for resuming.
+    pub resume_gateway_url: String,
+}
+
+// ── Discord object types ─────────────────────────────────────────────
+
+/// Minimal Discord user object.
+#[derive(Debug, Clone, Deserialize, Serialize)]
+pub struct DiscordUser {
+    pub id: String,
+    pub username: String,
+    #[serde(default)]
+    pub discriminator: Option<String>,
+    #[serde(default)]
+    pub bot: bool,
+}
+
+/// MESSAGE_CREATE event data.
+#[derive(Debug, Deserialize)]
+pub struct MessageCreateData {
+    /// Message ID.
+    pub id: String,
+    /// Channel ID where the message was sent.
+    pub channel_id: String,
+    /// Author of the message.
+    pub author: DiscordUser,
+    /// Message content.
+    pub content: String,
+    /// Guild ID (None for DMs).
+    #[serde(default)]
+    pub guild_id: Option<String>,
+}
+
+// ── Outbound payloads (Client → Server) ──────────────────────────────
+
+/// Identify payload (op 2).
+#[derive(Debug, Serialize)]
+pub struct IdentifyPayload {
+    pub op: u8,
+    pub d: IdentifyData,
+}
+
+#[derive(Debug, Serialize)]
+pub struct IdentifyData {
+    pub token: String,
+    pub intents: u32,
+    pub properties: ConnectionProperties,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub presence: Option<PresenceUpdateData>,
+}
+
+/// Presence update data (used in IDENTIFY and opcode 3).
+#[derive(Debug, Serialize)]
+pub struct PresenceUpdateData {
+    /// Unix time (ms) when the client went idle, or null if not idle.
+    pub since: Option<u64>,
+    /// Bot activities (status text).
+    pub activities: Vec<PresenceActivity>,
+    /// Status: "online", "dnd", "idle", "invisible", "offline".
+    pub status: String,
+    /// Whether the client is AFK.
+    pub afk: bool,
+}
+
+/// A single presence activity entry.
+#[derive(Debug, Serialize)]
+pub struct PresenceActivity {
+    /// Activity name displayed in Discord.
+    pub name: String,
+    /// Activity type: 0=Playing, 1=Streaming, 2=Listening, 3=Watching, 5=Competing.
+    #[serde(rename = "type")]
+    pub activity_type: u8,
+}
+
+#[derive(Debug, Serialize)]
+pub struct ConnectionProperties {
+    pub os: String,
+    pub browser: String,
+    pub device: String,
+}
+
+/// Resume payload (op 6).
+#[derive(Debug, Serialize)]
+pub struct ResumePayload {
+    pub op: u8,
+    pub d: ResumeData,
+}
+
+#[derive(Debug, Serialize)]
+pub struct ResumeData {
+    pub token: String,
+    pub session_id: String,
+    pub seq: u64,
+}
+
+/// Heartbeat payload (op 1).
+#[derive(Debug, Serialize)]
+pub struct HeartbeatPayload {
+    pub op: u8,
+    pub d: Option<u64>,
+}
+
+// ── Discord Gateway Intents ──────────────────────────────────────────
+
+/// Privileged + non-privileged intents needed for DM message reception.
+pub mod intents {
+    /// Required for guild membership visibility.
+    pub const GUILDS: u32 = 1 << 0;
+    /// Receive events for messages in guild text channels.
+    pub const GUILD_MESSAGES: u32 = 1 << 9;
+    /// Receive events for DM messages.
+    pub const DIRECT_MESSAGES: u32 = 1 << 12;
+    /// Access message content (privileged intent, must be enabled in Developer Portal).
+    pub const MESSAGE_CONTENT: u32 = 1 << 15;
+
+    /// Default intents for the channel transport: DMs + guild messages + content.
+    pub const DEFAULT: u32 = GUILDS | GUILD_MESSAGES | DIRECT_MESSAGES | MESSAGE_CONTENT;
+}
+
+// ── REST API types (for sending messages) ────────────────────────────
+
+/// POST /channels/{channel_id}/messages request body.
+#[derive(Debug, Serialize)]
+pub struct CreateMessageRequest {
+    pub content: String,
+}
+
+/// GET /gateway/bot response.
+#[derive(Debug, Deserialize)]
+pub struct GatewayBotResponse {
+    pub url: String,
+    #[serde(default)]
+    pub shards: u32,
+}
diff --git a/crates/temper-server/src/channels/mod.rs b/crates/temper-server/src/channels/mod.rs
new file mode 100644
index 00000000..dac53f47
--- /dev/null
+++ b/crates/temper-server/src/channels/mod.rs
@@ -0,0 +1,16 @@
+//! Channel transports: persistent connections to external messaging platforms.
+//!
+//! A channel transport bridges an external platform (Discord, Slack, etc.) to
+//! Temper's Channel entity. It handles all platform-specific I/O:
+//!
+//! - **Inbound**: receives platform events (e.g., Discord MESSAGE_CREATE) and
+//!   dispatches `ReceiveMessage` actions on Channel entities.
+//! - **Outbound**: watches for `SendReply` state changes on Channel entities
+//!   and delivers replies via the platform's API.
+//!
+//! Specs and WASM modules remain platform-agnostic — they never call
+//! platform-specific APIs. Adding a new platform means adding one Rust file
+//! here, not touching any specs or WASM.
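As a standalone aside (outside the patch itself): the `DEFAULT` mask in the `intents` module is a plain bitwise OR of four Gateway intent bits, and the composed value is what a bot sends in its IDENTIFY payload's `intents` field. A minimal sketch, with the constants copied from the module above:

```rust
// Intent bits mirrored from the `intents` module (Discord Gateway v10).
const GUILDS: u32 = 1 << 0; // 1
const GUILD_MESSAGES: u32 = 1 << 9; // 512
const DIRECT_MESSAGES: u32 = 1 << 12; // 4096
const MESSAGE_CONTENT: u32 = 1 << 15; // 32768

// Same composition as `intents::DEFAULT`.
const DEFAULT: u32 = GUILDS | GUILD_MESSAGES | DIRECT_MESSAGES | MESSAGE_CONTENT;

fn main() {
    // The composed mask is what fills IdentifyData { intents, .. }:
    // 1 + 512 + 4096 + 32768 = 37377.
    println!("{DEFAULT}");
    // Membership checks work by masking against a single bit.
    println!("{}", DEFAULT & MESSAGE_CONTENT != 0);
}
```

Because the flags are disjoint single bits, OR-ing them is the same as summing them, and any individual intent can be tested later with a mask.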
+
+pub mod discord;
+pub mod discord_types;
diff --git a/crates/temper-server/src/events.rs b/crates/temper-server/src/events.rs
index ee75b8e4..560b4093 100644
--- a/crates/temper-server/src/events.rs
+++ b/crates/temper-server/src/events.rs
@@ -19,6 +19,9 @@ use crate::state::ServerState;
 /// A notification emitted when an entity transitions to a new state.
 #[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
 pub struct EntityStateChange {
+    /// Monotonic per-entity event sequence.
+    #[serde(default)]
+    pub seq: u64,
     /// The entity type (e.g., "Order").
     pub entity_type: String,
     /// The entity ID.
@@ -77,6 +80,7 @@ mod tests {
     #[test]
     fn entity_state_change_serializes() {
         let change = EntityStateChange {
+            seq: 1,
             entity_type: "Order".into(),
             entity_id: "o-1".into(),
             action: "SubmitOrder".into(),
diff --git a/crates/temper-server/src/lib.rs b/crates/temper-server/src/lib.rs
index 897ff015..682515fa 100644
--- a/crates/temper-server/src/lib.rs
+++ b/crates/temper-server/src/lib.rs
@@ -8,6 +8,8 @@ pub mod adapters;
 #[cfg(feature = "observe")]
 mod api;
 pub mod authz;
+pub mod blobs;
+pub mod channels;
 pub mod entity_actor;
 pub mod event_store;
 pub mod events;
diff --git a/crates/temper-server/src/observe/entities.rs b/crates/temper-server/src/observe/entities.rs
index d2de1b54..e45a5314 100644
--- a/crates/temper-server/src/observe/entities.rs
+++ b/crates/temper-server/src/observe/entities.rs
@@ -156,6 +156,11 @@ pub(crate) struct WaitForEntityStateParams {
     pub poll_ms: Option<u64>,
 }
 
+#[derive(Debug, Deserialize)]
+pub(crate) struct EntityEventStreamParams {
+    pub since: Option<u64>,
+}
+
 /// GET /observe/entities/{entity_type}/{entity_id}/wait -- wait for an entity to reach a target status.
 #[instrument(skip_all, fields(otel.name = "GET /observe/entities/{entity_type}/{entity_id}/wait", entity_type, entity_id))]
 pub(crate) async fn handle_wait_for_entity_state(
@@ -182,7 +187,7 @@ pub(crate) async fn handle_wait_for_entity_state(
     let timeout_ms = params.timeout_ms.unwrap_or(120_000).clamp(1, 300_000);
     let poll_ms = params.poll_ms.unwrap_or(250).clamp(10, 5_000);
 
-    let deadline = tokio::time::Instant::now() + Duration::from_millis(timeout_ms);
+    let deadline = tokio::time::Instant::now() + Duration::from_millis(timeout_ms); // determinism-ok: HTTP handler, not actor code
 
     loop {
         let entity = state
             .await
             .map_err(|_| StatusCode::NOT_FOUND)?;
         let status = entity.state.status.clone();
-        let timed_out = tokio::time::Instant::now() >= deadline;
+        let timed_out = tokio::time::Instant::now() >= deadline; // determinism-ok: HTTP handler, not actor code
 
         if target_statuses.contains(&status) || timed_out {
             let mut json = serde_json::to_value(&entity.state)
@@ -201,10 +206,52 @@ pub(crate) async fn handle_wait_for_entity_state(
             return Ok(Json(json));
         }
 
-        tokio::time::sleep(Duration::from_millis(poll_ms)).await;
+        tokio::time::sleep(Duration::from_millis(poll_ms)).await; // determinism-ok: HTTP handler, not actor code
     }
 }
 
+/// GET /observe/entities/{entity_type}/{entity_id}/events -- replayable SSE stream for one entity.
+pub(crate) async fn handle_entity_event_stream(
+    State(state): State<ServerState>,
+    headers: HeaderMap,
+    Path((entity_type, entity_id)): Path<(String, String)>,
+    Query(params): Query<EntityEventStreamParams>,
+) -> Result<Sse<impl Stream<Item = Result<Event, Infallible>>>, StatusCode> {
+    require_observe_auth(&state, &headers, "read_events", "Entity")?;
+    let tenant = extract_tenant(&headers, &state).map_err(|(code, _)| code)?;
+    let since = params.since.unwrap_or(0);
+    let rx = state.entity_observe_tx.subscribe();
+    let replay_events = state
+        .replay_entity_observe_events(tenant.as_str(), &entity_type, &entity_id, since)
+        .into_iter()
+        .collect::<Vec<_>>();
+    let replay_high_water = replay_events.last().map(|event| event.seq).unwrap_or(since);
+    let replay = replay_events.into_iter().map(|event| {
+        let data = serde_json::to_string(&event.data).unwrap_or_default();
+        Ok::<Event, Infallible>(Event::default().event(&event.event_name).data(data))
+    });
+    let replay_stream = tokio_stream::iter(replay);
+
+    let live_tenant = tenant.clone();
+    let live_entity_type = entity_type.clone();
+    let live_entity_id = entity_id.clone();
+    let live_stream = BroadcastStream::new(rx).filter_map(move |result| match result {
+        Ok(event)
+            if event.tenant == live_tenant.as_str()
+                && event.entity_type == live_entity_type
+                && event.entity_id == live_entity_id
+                && event.seq > replay_high_water =>
+        {
+            let data = serde_json::to_string(&event.data).unwrap_or_default();
+            Some(Ok(Event::default().event(&event.event_name).data(data)))
+        }
+        Ok(_) => None,
+        Err(_) => None,
+    });
+
+    Ok(Sse::new(replay_stream.chain(live_stream)).keep_alive(KeepAlive::default()))
+}
+
 /// Format entity events into the history API response shape.
fn format_history_response( entity_type: &str, diff --git a/crates/temper-server/src/observe/mod.rs b/crates/temper-server/src/observe/mod.rs index fa1bd2f2..879343b7 100644 --- a/crates/temper-server/src/observe/mod.rs +++ b/crates/temper-server/src/observe/mod.rs @@ -160,6 +160,10 @@ pub fn build_observe_router() -> Router { "/entities/{entity_type}/{entity_id}/wait", get(entities::handle_wait_for_entity_state), ) + .route( + "/entities/{entity_type}/{entity_id}/events", + get(entities::handle_entity_event_stream), + ) .route("/events/stream", get(entities::handle_event_stream)) .route( "/verification-status", diff --git a/crates/temper-server/src/router.rs b/crates/temper-server/src/router.rs index ad4982cb..7f9a626d 100644 --- a/crates/temper-server/src/router.rs +++ b/crates/temper-server/src/router.rs @@ -3,10 +3,11 @@ use axum::Router; use axum::http::header::{AUTHORIZATION, CACHE_CONTROL, CONTENT_TYPE, HeaderName}; use axum::http::{Method, StatusCode}; -use axum::routing::get; +use axum::routing::{get, put}; use tower_http::cors::{Any, CorsLayer}; use tower_http::trace::TraceLayer; +use crate::blobs; use crate::events; use crate::odata; use crate::state::ServerState; @@ -64,6 +65,10 @@ pub fn build_router(state: ServerState) -> Router { .route( "/webhooks/{tenant}/{*path}", get(webhook_receiver::handle_webhook).post(webhook_receiver::handle_webhook), + ) + .route( + "/_internal/blobs/{*path}", + put(blobs::put_blob).get(blobs::get_blob), ); #[cfg(feature = "observe")] diff --git a/crates/temper-server/src/router_test.rs b/crates/temper-server/src/router_test.rs index 6935a6e5..a9c26db1 100644 --- a/crates/temper-server/src/router_test.rs +++ b/crates/temper-server/src/router_test.rs @@ -589,6 +589,7 @@ async fn test_sse_events_endpoint_delivers_state_changes() { // Send a state change event on the broadcast channel. 
let _ = event_tx.send(EntityStateChange { + seq: 1, entity_type: "Order".into(), entity_id: "o-sse-1".into(), action: "SubmitOrder".into(), @@ -620,6 +621,7 @@ async fn test_sse_events_lagged_receiver_continues() { // Flood it before any subscriber — then subscribe and send one more event. for i in 0..300 { let _ = event_tx.send(EntityStateChange { + seq: (i + 1) as u64, entity_type: "Order".into(), entity_id: format!("flood-{i}"), action: "Flood".into(), @@ -645,6 +647,7 @@ async fn test_sse_events_lagged_receiver_continues() { // Send a fresh event that should be delivered. let _ = event_tx.send(EntityStateChange { + seq: 301, entity_type: "Order".into(), entity_id: "after-flood".into(), action: "Fresh".into(), diff --git a/crates/temper-server/src/state/dispatch/effects.rs b/crates/temper-server/src/state/dispatch/effects.rs index db2fc37f..72be1a37 100644 --- a/crates/temper-server/src/state/dispatch/effects.rs +++ b/crates/temper-server/src/state/dispatch/effects.rs @@ -107,7 +107,10 @@ impl crate::state::ServerState { ctx: &PostDispatchContext<'_>, response: &EntityResponse, ) { - let _ = self.event_tx.send(EntityStateChange { + let seq = + self.next_entity_event_sequence(ctx.tenant.as_str(), ctx.entity_type, ctx.entity_id); + let change = EntityStateChange { + seq, entity_type: ctx.entity_type.to_string(), entity_id: ctx.entity_id.to_string(), action: ctx.action.to_string(), @@ -115,7 +118,55 @@ impl crate::state::ServerState { tenant: ctx.tenant.to_string(), agent_id: ctx.agent_ctx.agent_id.clone(), session_id: ctx.agent_ctx.session_id.clone(), - }); + }; + self.record_entity_observe_event_with_seq( + ctx.tenant.as_str(), + ctx.entity_type, + ctx.entity_id, + seq, + "state_change", + serde_json::to_value(&change).unwrap_or_default(), + ); + let _ = self.event_tx.send(change); + if matches!( + response.state.status.as_str(), + "Completed" | "Failed" | "Cancelled" + ) { + let terminal_seq = self.next_entity_event_sequence( + ctx.tenant.as_str(), + 
ctx.entity_type, + ctx.entity_id, + ); + let result = response + .state + .fields + .get("result") + .or_else(|| response.state.fields.get("Result")) + .and_then(serde_json::Value::as_str); + let error_message = response + .state + .fields + .get("error_message") + .or_else(|| response.state.fields.get("ErrorMessage")) + .and_then(serde_json::Value::as_str) + .or(response.error.as_deref()); + self.record_entity_observe_event_with_seq( + ctx.tenant.as_str(), + ctx.entity_type, + ctx.entity_id, + terminal_seq, + "agent_complete", + serde_json::json!({ + "seq": terminal_seq, + "status": response.state.status, + "action": ctx.action, + "result": result, + "error_message": error_message, + "agent_id": ctx.agent_ctx.agent_id, + "session_id": ctx.agent_ctx.session_id, + }), + ); + } let cache_key = format!("{}:{}:{}", ctx.tenant, ctx.entity_type, ctx.entity_id); self.cache_entity_status(cache_key, response.state.status.clone()); let _ = self diff --git a/crates/temper-server/src/state/dispatch/wasm.rs b/crates/temper-server/src/state/dispatch/wasm.rs index 532e1ea9..67b5bef8 100644 --- a/crates/temper-server/src/state/dispatch/wasm.rs +++ b/crates/temper-server/src/state/dispatch/wasm.rs @@ -6,10 +6,11 @@ use tracing::instrument; use crate::entity_actor::{EntityResponse, EntityState}; use crate::request_context::AgentContext; use crate::secrets::template::resolve_secret_templates; +use crate::state::sim_now; use temper_runtime::tenant::TenantId; use temper_wasm::{ - AuthorizedWasmHost, ProductionWasmHost, StreamRegistry, WasmAuthzContext, WasmAuthzGate, - WasmHost, WasmInvocationContext, WasmResourceLimits, + AuthorizedWasmHost, ProductionWasmHost, ProgressEmitterFn, StreamRegistry, WasmAuthzContext, + WasmAuthzGate, WasmHost, WasmInvocationContext, WasmResourceLimits, }; use super::{ @@ -172,9 +173,17 @@ impl crate::state::ServerState { .and_then(|s| s.parse::().ok()) .map(std::time::Duration::from_secs) .unwrap_or(std::time::Duration::from_secs(30)); + let 
progress_emitter = progress_emitter_fn( + self.clone(), + ctx.entity_ref.tenant.to_string(), + ctx.entity_ref.entity_type.to_string(), + ctx.entity_ref.entity_id.to_string(), + module_name.clone(), + ); let inner: Arc = Arc::new( ProductionWasmHost::with_timeout(tenant_secrets, http_timeout) - .with_spec_evaluator(spec_evaluator_fn()), + .with_spec_evaluator(spec_evaluator_fn()) + .with_progress_emitter(progress_emitter), ); let host: Arc = Arc::new(AuthorizedWasmHost::new(inner, gate, authz_ctx)); let max_response_bytes = integration @@ -197,6 +206,24 @@ impl crate::state::ServerState { hash = %hash, "invoking WASM integration module" ); + let start_seq = self.next_entity_event_sequence( + ctx.entity_ref.tenant.as_str(), + ctx.entity_ref.entity_type, + ctx.entity_ref.entity_id, + ); + self.record_entity_observe_event_with_seq( + ctx.entity_ref.tenant.as_str(), + ctx.entity_ref.entity_type, + ctx.entity_ref.entity_id, + start_seq, + "integration_start", + serde_json::json!({ + "seq": start_seq, + "integration": integration.name, + "module": module_name, + "trigger_action": ctx.action, + }), + ); // --- Invoke and handle result --- self.invoke_and_handle_result( @@ -397,6 +424,27 @@ impl crate::state::ServerState { .await { Ok(result) if result.success => { + let complete_seq = self.next_entity_event_sequence( + ctx.entity_ref.tenant.as_str(), + ctx.entity_ref.entity_type, + ctx.entity_ref.entity_id, + ); + self.record_entity_observe_event_with_seq( + ctx.entity_ref.tenant.as_str(), + ctx.entity_ref.entity_type, + ctx.entity_ref.entity_id, + complete_seq, + "integration_complete", + serde_json::json!({ + "seq": complete_seq, + "integration": integration.name, + "module": module_name, + "trigger_action": ctx.action, + "result": "success", + "callback_action": result.callback_action.clone(), + "duration_ms": result.duration_ms, + }), + ); if let Some(reason) = denial_tracker.take_denial() { let error_str = http_call_authz_denied_error(&reason); return self @@ -446,6 
+494,28 @@ impl crate::state::ServerState { Ok(None) } Ok(result) => { + let complete_seq = self.next_entity_event_sequence( + ctx.entity_ref.tenant.as_str(), + ctx.entity_ref.entity_type, + ctx.entity_ref.entity_id, + ); + self.record_entity_observe_event_with_seq( + ctx.entity_ref.tenant.as_str(), + ctx.entity_ref.entity_type, + ctx.entity_ref.entity_id, + complete_seq, + "integration_complete", + serde_json::json!({ + "seq": complete_seq, + "integration": integration.name, + "module": module_name, + "trigger_action": ctx.action, + "result": "failure", + "callback_action": result.callback_action.clone(), + "duration_ms": result.duration_ms, + "error": result.error.clone(), + }), + ); let mut error_str = result.error.unwrap_or_else(|| { format!( "WASM integration '{}' returned unsuccessful result", @@ -466,6 +536,27 @@ impl crate::state::ServerState { .await } Err(e) => { + let complete_seq = self.next_entity_event_sequence( + ctx.entity_ref.tenant.as_str(), + ctx.entity_ref.entity_type, + ctx.entity_ref.entity_id, + ); + self.record_entity_observe_event_with_seq( + ctx.entity_ref.tenant.as_str(), + ctx.entity_ref.entity_type, + ctx.entity_ref.entity_id, + complete_seq, + "integration_complete", + serde_json::json!({ + "seq": complete_seq, + "integration": integration.name, + "module": module_name, + "trigger_action": ctx.action, + "result": "error", + "duration_ms": 0, + "error": e.to_string(), + }), + ); let mut error_str = e.to_string(); if let Some(reason) = denial_tracker.take_denial() && !is_http_call_authz_denial(&error_str) @@ -520,8 +611,17 @@ impl crate::state::ServerState { trigger_action: context.trigger_action.clone(), }; let tenant_secrets = self.get_authorized_wasm_secrets(tenant, &*base_gate, &authz_ctx); + let progress_emitter = progress_emitter_fn( + self.clone(), + tenant.to_string(), + context.entity_type.clone(), + context.entity_id.clone(), + module_name.to_string(), + ); let inner: Arc = Arc::new( - 
ProductionWasmHost::new(tenant_secrets).with_spec_evaluator(spec_evaluator_fn()), + ProductionWasmHost::new(tenant_secrets) + .with_spec_evaluator(spec_evaluator_fn()) + .with_progress_emitter(progress_emitter), ); let host: Arc = Arc::new(AuthorizedWasmHost::new(inner, base_gate, authz_ctx)); @@ -578,3 +678,55 @@ fn spec_evaluator_fn() -> temper_wasm::SpecEvaluatorFn { }, ) } + +fn progress_emitter_fn( + state: crate::state::ServerState, + tenant: String, + entity_type: String, + entity_id: String, + module_name: String, +) -> ProgressEmitterFn { + std::sync::Arc::new(move |event_json: &str| { + let parsed = serde_json::from_str::(event_json).unwrap_or_else(|_| { + serde_json::json!({ + "kind": "integration_progress", + "message": event_json, + }) + }); + let kind = parsed + .get("kind") + .and_then(Value::as_str) + .unwrap_or("integration_progress") + .to_string(); + let seq = state.next_entity_event_sequence(&tenant, &entity_type, &entity_id); + let event = crate::state::AgentProgressEvent { + tenant: tenant.clone(), + entity_type: entity_type.clone(), + entity_id: entity_id.clone(), + seq, + kind, + agent_id: entity_id.clone(), + tool_call_id: parsed + .get("tool_call_id") + .and_then(Value::as_str) + .map(str::to_string), + tool_name: parsed + .get("tool_name") + .and_then(Value::as_str) + .map(str::to_string) + .or_else(|| Some(module_name.clone())), + task_id: parsed + .get("task_id") + .and_then(Value::as_str) + .map(str::to_string), + message: parsed + .get("message") + .and_then(Value::as_str) + .map(str::to_string), + timestamp: sim_now().to_rfc3339(), + data: Some(parsed), + }; + state.broadcast_agent_progress(event); + Ok(()) + }) +} diff --git a/crates/temper-server/src/state/entity_ops.rs b/crates/temper-server/src/state/entity_ops.rs index c8f5b165..257c04c3 100644 --- a/crates/temper-server/src/state/entity_ops.rs +++ b/crates/temper-server/src/state/entity_ops.rs @@ -449,7 +449,9 @@ impl ServerState { .map_err(|e| format!("Actor query failed: 
{e}"))?; // Broadcast entity creation event for SSE subscribers - let _ = self.event_tx.send(EntityStateChange { + let seq = self.next_entity_event_sequence(tenant.as_str(), entity_type, entity_id); + let change = EntityStateChange { + seq, entity_type: entity_type.to_string(), entity_id: entity_id.to_string(), action: "Created".to_string(), @@ -457,7 +459,16 @@ impl ServerState { tenant: tenant.to_string(), agent_id: None, session_id: None, - }); + }; + self.record_entity_observe_event_with_seq( + tenant.as_str(), + entity_type, + entity_id, + seq, + "state_change", + serde_json::to_value(&change).unwrap_or_default(), + ); + let _ = self.event_tx.send(change); Ok(response) } diff --git a/crates/temper-server/src/state/mod.rs b/crates/temper-server/src/state/mod.rs index 3e786418..34dbf1f1 100644 --- a/crates/temper-server/src/state/mod.rs +++ b/crates/temper-server/src/state/mod.rs @@ -56,6 +56,14 @@ use temper_wasm::WasmEngine; /// track agent activity in real time without polling. #[derive(Debug, Clone, serde::Serialize)] pub struct AgentProgressEvent { + /// Tenant that owns the related entity. + pub tenant: String, + /// Entity type that emitted the event. + pub entity_type: String, + /// Entity ID that emitted the event. + pub entity_id: String, + /// Monotonic per-entity event sequence. + pub seq: u64, /// Event kind: "tool_call_started", "tool_call_completed", /// "task_started", "task_completed", "agent_completed". pub kind: String, @@ -71,6 +79,26 @@ pub struct AgentProgressEvent { pub message: Option, /// ISO-8601 timestamp when the event was created. pub timestamp: String, + /// Optional structured payload. + #[serde(skip_serializing_if = "Option::is_none")] + pub data: Option, +} + +/// Unified replayable event stream for a single entity. +#[derive(Debug, Clone, serde::Serialize)] +pub struct EntityObserveEvent { + /// Tenant that owns the entity. + pub tenant: String, + /// Entity type for this event. 
+ pub entity_type: String, + /// Entity instance ID. + pub entity_id: String, + /// Monotonic per-entity event sequence. + pub seq: u64, + /// SSE event name. + pub event_name: String, + /// Structured event payload. + pub data: serde_json::Value, } /// Lightweight hint broadcast for the Observe UI SSE refresh stream. @@ -186,6 +214,8 @@ pub struct ServerState { pub entity_index: Arc>>>, /// Broadcast channel for entity state change events (SSE subscriptions). pub event_tx: Arc>, + /// Broadcast channel for replayable per-entity lifecycle and progress events. + pub entity_observe_tx: Arc>, /// Server start time (DST-safe: uses sim_now()). pub start_time: chrono::DateTime, /// Metrics collector for the /observe endpoints. @@ -234,6 +264,10 @@ pub struct ServerState { /// Broadcast channel for agent progress events (SSE subscriptions). /// // determinism-ok: broadcast channel for external observation only pub agent_progress_tx: Arc>, + /// Monotonic per-entity observe-event sequence counters. + pub entity_event_sequences: Arc>>, + /// Replay buffer for recent per-entity observe events. + pub entity_observe_log: Arc>>>, /// Broadcast channel for observe UI refresh hints (SSE push). 
/// // determinism-ok: broadcast channel for external observation only pub observe_refresh_tx: Arc>, @@ -272,6 +306,7 @@ impl ServerState { } let (event_tx, _) = tokio::sync::broadcast::channel(256); // determinism-ok: broadcast for external observation + let (entity_observe_tx, _) = tokio::sync::broadcast::channel(512); // determinism-ok: broadcast for external observation let (design_time_tx, _) = tokio::sync::broadcast::channel(256); // determinism-ok: broadcast for external observation let (pending_decision_tx, _) = tokio::sync::broadcast::channel(256); // determinism-ok: broadcast for external observation let (agent_progress_tx, _) = tokio::sync::broadcast::channel(256); // determinism-ok: broadcast for external observation @@ -291,6 +326,7 @@ impl ServerState { registry: Arc::new(RwLock::new(SpecRegistry::new())), entity_index: Arc::new(RwLock::new(BTreeMap::new())), event_tx: Arc::new(event_tx), + entity_observe_tx: Arc::new(entity_observe_tx), start_time: sim_now(), metrics: Arc::new(MetricsCollector::new()), record_store: Arc::new(RecordStore::new()), @@ -315,6 +351,8 @@ impl ServerState { tenant_policies: Arc::new(RwLock::new(BTreeMap::new())), secrets_vault: None, agent_progress_tx: Arc::new(agent_progress_tx), // determinism-ok: broadcast for external observation + entity_event_sequences: Arc::new(Mutex::new(BTreeMap::new())), + entity_observe_log: Arc::new(Mutex::new(BTreeMap::new())), observe_refresh_tx: Arc::new(observe_refresh_tx), // determinism-ok: broadcast for external observation listen_port: Arc::new(std::sync::OnceLock::new()), single_tenant_mode: true, @@ -346,6 +384,87 @@ impl ServerState { } } + fn push_entity_observe_event(&self, event: EntityObserveEvent) { + let key = format!("{}:{}:{}", event.tenant, event.entity_type, event.entity_id); + { + let mut log = self.entity_observe_log.lock().unwrap(); // ci-ok: infallible lock + let entries = log.entry(key).or_default(); + entries.push(event.clone()); + if entries.len() > 512 { + let 
overflow = entries.len().saturating_sub(512); + entries.drain(0..overflow); + } + } + let _ = self.entity_observe_tx.send(event); + } + + pub(crate) fn next_entity_event_sequence( + &self, + tenant: &str, + entity_type: &str, + entity_id: &str, + ) -> u64 { + let key = format!("{tenant}:{entity_type}:{entity_id}"); + let mut sequences = self.entity_event_sequences.lock().unwrap(); // ci-ok: infallible lock + let next = sequences.get(&key).copied().unwrap_or(0) + 1; + sequences.insert(key, next); + next + } + + pub(crate) fn record_entity_observe_event_with_seq( + &self, + tenant: &str, + entity_type: &str, + entity_id: &str, + seq: u64, + event_name: &str, + data: serde_json::Value, + ) { + let event = EntityObserveEvent { + tenant: tenant.to_string(), + entity_type: entity_type.to_string(), + entity_id: entity_id.to_string(), + seq, + event_name: event_name.to_string(), + data, + }; + self.push_entity_observe_event(event); + } + + #[cfg(feature = "observe")] + pub(crate) fn replay_entity_observe_events( + &self, + tenant: &str, + entity_type: &str, + entity_id: &str, + since: u64, + ) -> Vec { + let key = format!("{tenant}:{entity_type}:{entity_id}"); + let log = self.entity_observe_log.lock().unwrap(); // ci-ok: infallible lock + log.get(&key) + .map(|entries| { + entries + .iter() + .filter(|event| event.seq > since) + .cloned() + .collect() + }) + .unwrap_or_default() + } + + pub(crate) fn broadcast_agent_progress(&self, event: AgentProgressEvent) { + let _ = self.agent_progress_tx.send(event.clone()); + let observe_event = EntityObserveEvent { + tenant: event.tenant.clone(), + entity_type: event.entity_type.clone(), + entity_id: event.entity_id.clone(), + seq: event.seq, + event_name: event.kind.clone(), + data: serde_json::to_value(&event).unwrap_or_default(), + }; + self.push_entity_observe_event(observe_event); + } + /// Create ServerState with I/O Automaton TOML specs for transition table resolution. /// /// Returns an error if any IOA spec fails to parse. 
@@ -412,6 +531,7 @@ impl ServerState { /// (e.g. `PlatformState`) so that writes are visible to dispatch. pub fn from_registry_shared(system: ActorSystem, registry: Arc>) -> Self { let (event_tx, _) = tokio::sync::broadcast::channel(256); // determinism-ok: broadcast for external observation + let (entity_observe_tx, _) = tokio::sync::broadcast::channel(512); // determinism-ok: broadcast for external observation let (design_time_tx, _) = tokio::sync::broadcast::channel(256); // determinism-ok: broadcast for external observation let (pending_decision_tx, _) = tokio::sync::broadcast::channel(256); // determinism-ok: broadcast for external observation let (agent_progress_tx, _) = tokio::sync::broadcast::channel(256); // determinism-ok: broadcast for external observation @@ -434,6 +554,7 @@ impl ServerState { registry, entity_index: Arc::new(RwLock::new(BTreeMap::new())), event_tx: Arc::new(event_tx), + entity_observe_tx: Arc::new(entity_observe_tx), start_time: sim_now(), metrics: Arc::new(MetricsCollector::new()), record_store: Arc::new(RecordStore::new()), @@ -458,6 +579,8 @@ impl ServerState { tenant_policies: Arc::new(RwLock::new(BTreeMap::new())), secrets_vault: None, agent_progress_tx: Arc::new(agent_progress_tx), // determinism-ok: broadcast for external observation + entity_event_sequences: Arc::new(Mutex::new(BTreeMap::new())), + entity_observe_log: Arc::new(Mutex::new(BTreeMap::new())), observe_refresh_tx: Arc::new(observe_refresh_tx), // determinism-ok: broadcast for external observation listen_port: Arc::new(std::sync::OnceLock::new()), single_tenant_mode: false, diff --git a/crates/temper-store-turso/src/schema.rs b/crates/temper-store-turso/src/schema.rs index f1d58d25..789aeb48 100644 --- a/crates/temper-store-turso/src/schema.rs +++ b/crates/temper-store-turso/src/schema.rs @@ -346,6 +346,23 @@ CREATE TABLE IF NOT EXISTS tenant_secrets ( PRIMARY KEY(tenant, key_name) );"; +// --------------------------------------------------------------------------- 
+// Blob storage (content-addressed binary objects for TemperFS) +// --------------------------------------------------------------------------- + +/// Content-addressed blob storage for TemperFS `$value` endpoints. +/// +/// Blobs are keyed by `{bucket}/{content_hash}` (e.g. `temper-fs/sha256:abc...`). +/// This provides persistent local blob storage so the blob_adapter WASM module +/// can upload/download via HTTP without requiring external S3/R2 in development. +pub const CREATE_BLOBS_TABLE: &str = "\ +CREATE TABLE IF NOT EXISTS blobs ( + blob_key TEXT PRIMARY KEY, + data BLOB NOT NULL, + size_bytes INTEGER NOT NULL, + created_at TEXT NOT NULL DEFAULT (datetime('now')) +);"; + // --------------------------------------------------------------------------- // OTS trajectory storage (full agent execution traces) // --------------------------------------------------------------------------- diff --git a/crates/temper-store-turso/src/store/blobs.rs b/crates/temper-store-turso/src/store/blobs.rs new file mode 100644 index 00000000..78cb13b8 --- /dev/null +++ b/crates/temper-store-turso/src/store/blobs.rs @@ -0,0 +1,46 @@ +//! Turso-backed blob storage for TemperFS `$value` endpoints. +//! +//! Content-addressed storage: blobs are keyed by `{bucket}/{content_hash}`. +//! This provides persistent local blob storage so the blob_adapter WASM module +//! can upload/download via HTTP without requiring external S3/R2. + +use crate::TursoEventStore; +use libsql::params; + +impl TursoEventStore { + /// Store a blob by key (content-addressed path like `temper-fs/sha256:abc...`). + pub async fn put_blob(&self, key: &str, data: &[u8]) -> Result<(), String> { + let conn = self.connection().map_err(|e| e.to_string())?; + conn.execute( + "INSERT OR REPLACE INTO blobs (blob_key, data, size_bytes) VALUES (?1, ?2, ?3)", + params![key, data.to_vec(), data.len() as i64], + ) + .await + .map_err(|e| format!("blob put failed: {e}"))?; + Ok(()) + } + + /// Retrieve a blob by key. 
Returns `None` if not found. + pub async fn get_blob(&self, key: &str) -> Result<Option<Vec<u8>>, String> { + let conn = self.connection().map_err(|e| e.to_string())?; + let mut rows = conn + .query("SELECT data FROM blobs WHERE blob_key = ?1", params![key]) + .await + .map_err(|e| format!("blob get failed: {e}"))?; + + match rows.next().await { + Ok(Some(row)) => { + let data: Vec<u8> = row + .get_value(0) + .map_err(|e| format!("blob read failed: {e}")) + .and_then(|v| match v { + libsql::Value::Blob(b) => Ok(b), + _ => Err("blob column is not BLOB type".to_string()), + })?; + Ok(Some(data)) + } + Ok(None) => Ok(None), + Err(e) => Err(format!("blob query failed: {e}")), + } + } +} diff --git a/crates/temper-store-turso/src/store/mod.rs b/crates/temper-store-turso/src/store/mod.rs index d5b6495a..d9c5a057 100644 --- a/crates/temper-store-turso/src/store/mod.rs +++ b/crates/temper-store-turso/src/store/mod.rs @@ -17,6 +17,7 @@ use tracing::instrument; use crate::schema; mod authz; +mod blobs; mod constraints; mod event_store; mod evolution; @@ -240,6 +241,11 @@ impl TursoEventStore { .await .map_err(storage_error)?; + // Blob storage — content-addressed binary objects for TemperFS. + conn.execute(schema::CREATE_BLOBS_TABLE, ()) + .await + .map_err(storage_error)?; + Ok(()) } diff --git a/crates/temper-wasm-sdk/src/context.rs b/crates/temper-wasm-sdk/src/context.rs index 49c2f776..7f70ff41 100644 --- a/crates/temper-wasm-sdk/src/context.rs +++ b/crates/temper-wasm-sdk/src/context.rs @@ -237,6 +237,18 @@ impl Context { } } + /// Emit a replayable progress event for the current entity.
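The blob-store contract above (`INSERT OR REPLACE` on put, `None` for a missing key) can be mirrored by a hypothetical in-memory stand-in — useful for tests, and not part of this diff:

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for the Turso-backed blob store: `put_blob` overwrites
/// an existing key (the INSERT OR REPLACE semantics) and `get_blob` returns
/// None for a missing key. Illustrative only.
struct MemBlobStore {
    blobs: BTreeMap<String, Vec<u8>>,
}

impl MemBlobStore {
    fn new() -> Self {
        Self { blobs: BTreeMap::new() }
    }

    /// Store a blob under a content-addressed key like `temper-fs/sha256:...`.
    fn put_blob(&mut self, key: &str, data: &[u8]) {
        self.blobs.insert(key.to_string(), data.to_vec());
    }

    /// Retrieve a blob; None if the key was never stored.
    fn get_blob(&self, key: &str) -> Option<Vec<u8>> {
        self.blobs.get(key).cloned()
    }
}
```

With content-addressed keys, overwriting an existing key is idempotent by construction: the same hash implies the same bytes.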
+ pub fn emit_progress(&self, event: &Value) -> Result<(), String> { + let json = + serde_json::to_string(event).map_err(|e| format!("progress JSON serialize: {e}"))?; + let rc = unsafe { host::host_emit_progress(json.as_ptr() as i32, json.len() as i32) }; + if rc == 0 { + Ok(()) + } else { + Err("host_emit_progress failed".to_string()) + } + } + /// Evaluate a single transition against an IOA spec via the host. /// /// The host builds a `TransitionTable` from the IOA source and evaluates diff --git a/crates/temper-wasm-sdk/src/host.rs b/crates/temper-wasm-sdk/src/host.rs index a62c2e58..a28821d4 100644 --- a/crates/temper-wasm-sdk/src/host.rs +++ b/crates/temper-wasm-sdk/src/host.rs @@ -42,6 +42,10 @@ unsafe extern "C" { /// Set the result JSON for this invocation. pub fn host_set_result(ptr: i32, len: i32); + /// Emit a replayable progress event for the current entity. + /// Returns 0 on success, -1 on error. + pub fn host_emit_progress(ptr: i32, len: i32) -> i32; + /// Read a secret value by key. /// Returns bytes written, needed size if too small, or -1 on error. 
pub fn host_get_secret(key_ptr: i32, key_len: i32, buf_ptr: i32, buf_len: i32) -> i32; diff --git a/crates/temper-wasm/src/authorized_host.rs b/crates/temper-wasm/src/authorized_host.rs index 9afdc4ee..8698eae4 100644 --- a/crates/temper-wasm/src/authorized_host.rs +++ b/crates/temper-wasm/src/authorized_host.rs @@ -182,6 +182,10 @@ impl WasmHost for AuthorizedWasmHost { self.inner .evaluate_spec(ioa_source, current_state, action, params_json) } + + fn emit_progress(&self, event_json: &str) -> Result<(), String> { + self.inner.emit_progress(event_json) + } } #[cfg(test)] diff --git a/crates/temper-wasm/src/engine/host_functions.rs b/crates/temper-wasm/src/engine/host_functions.rs index 59bd1b55..3ba6dc2b 100644 --- a/crates/temper-wasm/src/engine/host_functions.rs +++ b/crates/temper-wasm/src/engine/host_functions.rs @@ -73,6 +73,31 @@ pub(super) fn link_host_functions(linker: &mut Linker<HostState>) -> Result<(), WasmError> { ) .map_err(|e| WasmError::Compilation(format!("failed to link host_set_result: {e}")))?; + // host_emit_progress(ptr, len) -> i32 + linker + .func_wrap( + "env", + "host_emit_progress", + |mut caller: Caller<'_, HostState>, ptr: i32, len: i32| -> i32 { + let memory = caller.get_export("memory").and_then(|e| e.into_memory()); + let Some(memory) = memory else { + return -1; + }; + let mut buf = vec![0u8; len as usize]; + if memory.read(&caller, ptr as usize, &mut buf).is_err() { + return -1; + } + let Ok(payload) = String::from_utf8(buf) else { + return -1; + }; + match caller.data().host.emit_progress(&payload) { + Ok(()) => 0, + Err(_) => -1, + } + }, + ) + .map_err(|e| WasmError::Compilation(format!("failed to link host_emit_progress: {e}")))?; + // host_get_secret(key_ptr, key_len, buf_ptr, buf_len) -> actual_len (-1 on error) linker .func_wrap( diff --git a/crates/temper-wasm/src/host_trait.rs b/crates/temper-wasm/src/host_trait.rs index 3edd9a16..3d122894 100644 --- a/crates/temper-wasm/src/host_trait.rs +++ b/crates/temper-wasm/src/host_trait.rs @@ -79,6 +79,11
@@ pub trait WasmHost: Send + Sync { ) -> Result<String, String> { Err("evaluate_spec not supported by this host".to_string()) } + + /// Emit a replayable progress event from the guest module. + fn emit_progress(&self, _event_json: &str) -> Result<(), String> { + Ok(()) + } } /// Callback for evaluating IOA spec transitions. @@ -88,6 +93,9 @@ pub trait WasmHost: Send + Sync { pub type SpecEvaluatorFn = Arc<dyn Fn(&str, &str, &str, &str) -> Result<String, String> + Send + Sync>; +/// Callback for replayable progress events emitted by guest WASM modules. +pub type ProgressEmitterFn = Arc<dyn Fn(&str) -> Result<(), String> + Send + Sync>; + /// Production host: real HTTP calls via reqwest, real secrets. pub struct ProductionWasmHost { /// HTTP client for making real requests. @@ -96,6 +104,8 @@ pub struct ProductionWasmHost { secrets: BTreeMap<String, String>, /// Optional spec evaluator (provided by temper-server at construction). spec_evaluator: Option<SpecEvaluatorFn>, + /// Optional progress emitter (provided by temper-server at construction). + progress_emitter: Option<ProgressEmitterFn>, } impl ProductionWasmHost { @@ -114,6 +124,7 @@ impl ProductionWasmHost { .unwrap_or_default(), secrets, spec_evaluator: None, + progress_emitter: None, } } @@ -122,6 +133,12 @@ impl ProductionWasmHost { self.spec_evaluator = Some(evaluator); self } + + /// Create with a progress emitter for `host_emit_progress` support. + pub fn with_progress_emitter(mut self, emitter: ProgressEmitterFn) -> Self { + self.progress_emitter = Some(emitter); + self + } } #[async_trait] @@ -266,6 +283,13 @@ impl WasmHost for ProductionWasmHost { None => Err("evaluate_spec not supported by this host".to_string()), } } + + fn emit_progress(&self, event_json: &str) -> Result<(), String> { + match &self.progress_emitter { + Some(emitter) => emitter(event_json), + None => Ok(()), + } + } } /// Parse Connect protocol binary frames from a response body.
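The optional-emitter plumbing above — a host holding an `Option` of a boxed callback, with "no emitter configured" treated as a silent no-op — can be sketched standalone. `Host` here is an illustrative stand-in, not `ProductionWasmHost`:

```rust
use std::sync::{Arc, Mutex};

/// Same shape as the diff's ProgressEmitterFn: a shareable callback taking
/// the serialized event JSON.
type ProgressEmitterFn = Arc<dyn Fn(&str) -> Result<(), String> + Send + Sync>;

struct Host {
    progress_emitter: Option<ProgressEmitterFn>,
}

impl Host {
    fn new() -> Self {
        Self { progress_emitter: None }
    }

    /// Builder-style configuration, mirroring `with_progress_emitter`.
    fn with_progress_emitter(mut self, emitter: ProgressEmitterFn) -> Self {
        self.progress_emitter = Some(emitter);
        self
    }

    /// No emitter configured means progress events are silently dropped.
    fn emit_progress(&self, event_json: &str) -> Result<(), String> {
        match &self.progress_emitter {
            Some(emitter) => emitter(event_json),
            None => Ok(()),
        }
    }
}
```

Returning `Ok(())` for the unconfigured case keeps guest modules working unchanged whether or not the server wired up an emitter — the same default the `WasmHost` trait provides.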
@@ -476,6 +500,10 @@ impl WasmHost for SimWasmHost { .cloned() .ok_or_else(|| format!("sim: no canned response for action '{action}'")) } + + fn emit_progress(&self, _event_json: &str) -> Result<(), String> { + Ok(()) + } } #[cfg(test)] diff --git a/crates/temper-wasm/src/lib.rs b/crates/temper-wasm/src/lib.rs index 0557711c..ff61631b 100644 --- a/crates/temper-wasm/src/lib.rs +++ b/crates/temper-wasm/src/lib.rs @@ -14,7 +14,8 @@ pub mod types; pub use authorized_host::{AuthorizedWasmHost, WasmAuthzDecision, WasmAuthzGate, extract_domain}; pub use engine::{WasmEngine, WasmError}; pub use host_trait::{ - ProductionWasmHost, SimWasmHost, SpecEvaluatorFn, WasmHost, parse_connect_frames, + ProductionWasmHost, ProgressEmitterFn, SimWasmHost, SpecEvaluatorFn, WasmHost, + parse_connect_frames, }; pub use stream::{StreamRegistry, StreamRegistryConfig}; pub use types::{ diff --git a/docs/adrs/0036-pi-agent-architecture.md b/docs/adrs/0036-pi-agent-architecture.md new file mode 100644 index 00000000..ddfaa8fb --- /dev/null +++ b/docs/adrs/0036-pi-agent-architecture.md @@ -0,0 +1,46 @@ +# ADR-0036: Governed Agent Architecture + +## Status + +Accepted + +## Context + +Proven open-source agent architectures already validate a useful set of patterns: append-only session trees, context compaction, a two-loop steering model, lazy skills, event streaming, and transport/channel adapters. The existing `TemperAgent` proves the basic governed loop, but it still stores flat conversation JSON, exposes only a poll-centric control plane, and keeps most capabilities inside a single agent/tool implementation boundary. + +We want the Temper version of that architecture, but we do not want to wrap an external agent runtime as an opaque subprocess. The Temper runtime needs each capability to remain spec-driven, Cedar-governed, observable, and verifiable. 
+ +## Decision + +Rebase `TemperAgent` onto these proven patterns and express the missing capabilities as governed Temper specs and WASM integrations: + +- Session tree storage with JSONL append-only entries and branch tracking +- Explicit compaction and steering states in the TemperAgent IOA +- Soul, skill, memory, hook, heartbeat, and cron capabilities as first-class entities +- SSE-based lifecycle and progress streaming for entities +- Channel adapters and routing entities for multi-transport delivery +- Thin tool dispatch that executes sandbox tools directly and routes entity capabilities through OData + +The `TemperAgent` remains the execution boundary, but the richer architecture is decomposed into separate governed entities instead of extending a monolithic match-arm tool runner. + +## Alternatives Considered + +1. Wrap an external agent runtime as a subprocess + +Rejected. This would preserve the interaction semantics, but the actual runtime behavior would sit outside Temper governance, Cedar authorization, and IOA verification. + +2. Build a new agent stack from scratch + +Rejected. Existing agent architectures already validate the core interaction patterns we need. Re-learning those design choices inside a brand-new implementation adds unnecessary risk. + +3. Extend the existing TemperAgent incrementally + +Chosen. This keeps the proven Temper dispatch/runtime model while migrating the storage format, state machine, event transport, and capability surface toward the target architecture. + +## Consequences + +- `TemperAgent` conversation persistence changes from flat JSON to JSONL session-tree storage. +- New entity types are introduced in the `temper-agent` and `temper-channels` OS apps. +- Additional WASM modules are required for compaction, steering, heartbeat scanning, cron triggering, and channel routing. +- Event streaming becomes part of the agent contract instead of an optional side channel. 
+- Capability growth shifts from tool-runner branching to governed entity composition. diff --git a/docs/adrs/0037-channel-transports.md b/docs/adrs/0037-channel-transports.md new file mode 100644 index 00000000..41e64cab --- /dev/null +++ b/docs/adrs/0037-channel-transports.md @@ -0,0 +1,84 @@ +# ADR-0037: Channel Transports + +- Status: Accepted +- Date: 2026-03-24 +- Deciders: Temper core maintainers +- Related: + - ADR-0012: OAuth2, Webhooks, Timers, Secret Templates + - `crates/temper-server/src/webhooks/receiver.rs` (inbound webhook pattern) + - `crates/temper-server/src/adapters/openclaw.rs` (WebSocket reference) + - `os-apps/temper-channels/specs/channel.ioa.toml` (Channel entity spec) + +## Context + +The Channel entity currently supports external messaging platforms via HTTP webhooks and slash commands. This requires ngrok tunnels for local development and produces an unnatural UX (users must type `/ask` instead of just sending a message). + +Platforms like Discord, Slack, and Teams offer persistent WebSocket connections (Discord Gateway, Slack RTM/Socket Mode) that allow servers to receive messages without exposing a public HTTP endpoint. These connections are outbound from the server, eliminating the need for tunnels. + +The existing `AgentAdapter` trait is request-response (entity transition triggers outbound call). Channel transports are the reverse: persistent inbound event sources that produce entity transitions. The webhook receiver is the closest analogy, but for persistent connections instead of HTTP callbacks. + +## Decision + +### Sub-Decision 1: Channel transports as server-level infrastructure + +Channel transports live in `crates/temper-server/src/channels/`. Each transport is a file in that module (e.g., `discord.rs`, `slack.rs`). They are spawned as background tasks during `temper serve` startup, following the same pattern as `spawn_optimization_loop` and `spawn_actor_passivation_loop`.
+ +**Why this approach**: Transports are server-wide (one WebSocket per bot token), not entity-scoped. They don't fit the `AgentAdapter` trait (wrong direction) or `[[integration]]` specs (wrong lifecycle). Background tasks are the established pattern for long-lived server infrastructure. + +### Sub-Decision 2: Transports own all platform I/O + +The transport handles both inbound (receive events → dispatch `ReceiveMessage`) and outbound (watch for `SendReply` → deliver via platform API). WASM modules never call platform-specific APIs. + +**Why this approach**: Keeps specs and WASM platform-agnostic. The same `send_reply` WASM works for Discord, Slack, WhatsApp — it records the reply content on entity state, and the transport delivers it. Adding a new platform means adding one Rust file, not touching any specs or WASM. + +### Sub-Decision 3: No premature Connector trait + +Discord is the first transport. We do not define a `Connector` trait until we have 2-3 implementations and can discover the common pattern from concrete code. Each transport is a standalone struct with a `run()` method. + +**Why this approach**: Premature abstraction leads to wrong abstractions. Build Discord, build Slack, then extract the common interface. + +### Sub-Decision 4: Configuration via CLI flags and environment variables + +Each transport is activated by a CLI flag (e.g., `--discord-bot-token`) that also reads from environment variables (e.g., `DISCORD_BOT_TOKEN`). The token is stored in SecretsVault at startup for WASM access. + +**Why this approach**: Simplest possible UX. `temper serve --discord-bot-token $TOKEN` and you're done. + +## Rollout Plan + +1. **Phase 0 (This PR)** — Discord transport: Gateway WebSocket, inbound routing, outbound reply delivery, CLI integration. +2. **Phase 1 (Follow-up)** — Guild channel support (not just DMs), @mention filtering. +3. **Phase 2 (Future)** — Slack transport as second implementation, then extract common patterns. 
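Sub-Decision 4's flag-or-env resolution can be sketched as a small helper: an explicit CLI value wins, otherwise the environment variable is consulted. The function name and empty-string filtering are assumptions for illustration, not code from this PR:

```rust
use std::env;

/// Resolve a transport token: prefer the CLI flag value, fall back to the
/// named environment variable, and treat empty strings as unset.
fn resolve_token(cli_flag: Option<String>, env_var: &str) -> Option<String> {
    cli_flag
        .or_else(|| env::var(env_var).ok())
        .filter(|token| !token.is_empty())
}
```

So `temper serve --discord-bot-token $TOKEN` and `DISCORD_BOT_TOKEN=$TOKEN temper serve` would resolve to the same value, which the server then stores in SecretsVault for WASM access.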
+ +## Consequences + +### Positive +- Natural DM UX — users message the bot directly, no slash commands +- No ngrok/tunnel dependency for local development +- Platform-agnostic WASM modules — one `send_reply` for all channels +- Clean pattern for future transports (Slack, Teams, WhatsApp) + +### Negative +- Persistent WebSocket requires reconnection logic and heartbeat management +- Each transport adds server startup complexity (one more background task) + +### Risks +- Discord Gateway requires MESSAGE_CONTENT privileged intent (must be enabled in Discord Developer Portal for bots in 100+ guilds) +- WebSocket disconnects during LLM processing could lose reply delivery (mitigated by retry on reconnect) + +### DST Compliance +- Channel transport code uses `tokio::spawn` and `tokio-tungstenite` WebSocket — annotated with `// determinism-ok: WebSocket for channel transport` +- No simulation-visible state affected; transports operate outside the actor system's deterministic core + +## Non-Goals + +- Voice/audio channel support +- Discord sharding (not needed until 2500+ guilds) +- Connector trait abstraction (deferred until 2+ transports exist) +- Modifying the TemperAgent entity spec (already platform-agnostic) + +## Alternatives Considered + +1. **AgentAdapter implementation** — Rejected. The adapter trait is request-response (outbound). Discord Gateway is a persistent inbound event source. Wrong abstraction. +2. **Spec-driven `[[integration]] type = "discord_gateway"`** — Rejected. WebSocket connections are server-wide, not entity-scoped. The integration mechanism triggers per-transition, which makes no sense for a persistent connection. +3. **Separate `temper-discord` crate** — Rejected. Discord is just one channel type, not special enough to warrant its own crate. Lives alongside future transports in `channels/`. +4. **Webhook + ngrok approach** — Rejected by user as hacky. Requires exposing a public endpoint and running a tunnel for local development. 
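The reconnection logic flagged under "Negative" consequences typically uses capped exponential backoff; the gateway client elsewhere in this codebase is described as backing off from 1s to 30s. A sketch under those assumed constants (the actual transport's schedule may differ):

```rust
/// Illustrative reconnect delay: exponential from 1 second, capped at 30.
fn backoff_secs(attempt: u32) -> u64 {
    let base: u64 = 1;
    let cap: u64 = 30;
    base.saturating_mul(2u64.saturating_pow(attempt)).min(cap)
}
```

Production implementations usually add jitter so many clients reconnecting after an outage do not stampede the platform API at the same instant.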
diff --git a/os-apps/temper-agent/policies/agent.cedar b/os-apps/temper-agent/policies/agent.cedar index 1dc0b314..c04e7550 100644 --- a/os-apps/temper-agent/policies/agent.cedar +++ b/os-apps/temper-agent/policies/agent.cedar @@ -1,8 +1,8 @@ // TemperAgent — Cedar Authorization Policies // // Controls who can create, configure, and interact with spec-driven agents. -// Callback actions (SandboxReady, ProcessToolCalls, HandleToolResults, RecordResult) -// are permitted for system agents to enable the dispatch pipeline loop. +// Callback actions are permitted for system agents to enable the dispatch pipeline loop. +// Steering is permitted for supervisors, humans, and parent agents. // --- Creation and Configuration: admins, supervisors and humans --- @@ -28,9 +28,18 @@ permit( resource is TemperAgent ); +// --- Steering: supervisors, humans, and system (for parent agents) --- + +permit( + principal, + action in [Action::"Steer"], + resource is TemperAgent +) when { + ["supervisor", "human", "system"].contains(principal.agent_type) +}; + // --- Dispatch pipeline callbacks: system agents --- // These actions are triggered by WASM integration callbacks, not by users. -// The system agent identity drives the callback dispatch. 
permit( principal, @@ -38,7 +47,14 @@ permit( Action::"SandboxReady", Action::"ProcessToolCalls", Action::"HandleToolResults", - Action::"RecordResult" + Action::"RecordResult", + Action::"NeedsCompaction", + Action::"CompactionComplete", + Action::"CheckSteering", + Action::"ContinueWithSteering", + Action::"FinalizeResult", + Action::"Heartbeat", + Action::"TimeoutFail" ], resource is TemperAgent ) when { @@ -46,13 +62,13 @@ permit( }; // --- WASM module HTTP call authorization --- -// Allow agent WASM modules to call the Anthropic API + permit( principal is Agent, action == Action::"http_call", resource is HttpEndpoint ) when { - ["sandbox_provisioner", "llm_caller", "tool_runner", "workspace_restorer"].contains(context.module) + ["sandbox_provisioner", "llm_caller", "tool_runner", "workspace_restorer", "context_compactor", "steering_checker", "heartbeat_scan", "heartbeat_scheduler", "cron_trigger", "cron_scheduler_check", "cron_scheduler_heartbeat"].contains(context.module) }; // Allow agent WASM modules to access secrets @@ -61,7 +77,7 @@ permit( action == Action::"access_secret", resource is Secret ) when { - ["sandbox_provisioner", "llm_caller", "tool_runner", "workspace_restorer"].contains(context.module) + ["sandbox_provisioner", "llm_caller", "tool_runner", "workspace_restorer", "context_compactor"].contains(context.module) }; // --- Failure and cancellation: supervisors, humans, and system --- diff --git a/os-apps/temper-agent/policies/cron.cedar b/os-apps/temper-agent/policies/cron.cedar new file mode 100644 index 00000000..62f0d57b --- /dev/null +++ b/os-apps/temper-agent/policies/cron.cedar @@ -0,0 +1,54 @@ +// CronJob + CronScheduler — Cedar Authorization Policies + +// Admins can do everything +permit( + principal is Admin, + action, + resource is CronJob +); + +permit( + principal is Admin, + action, + resource is CronScheduler +); + +// Only supervisors/humans can configure and activate cron jobs +permit( + principal, + action in 
[Action::"create", Action::"Configure", Action::"Activate", Action::"Pause", Action::"Resume", Action::"Expire"], + resource is CronJob +) when { + ["supervisor", "human"].contains(principal.agent_type) +}; + +// System agents can trigger cron jobs (called by CronScheduler WASM) +permit( + principal, + action in [Action::"Trigger", Action::"TriggerComplete", Action::"TriggerFailed"], + resource is CronJob +) when { + principal.agent_type == "system" +}; + +// Any authenticated agent can read cron jobs +permit( + principal, + action in [Action::"read", Action::"list"], + resource is CronJob +); + +// System agents manage the scheduler lifecycle +permit( + principal, + action in [Action::"create", Action::"Start", Action::"CheckComplete", Action::"CheckFailed", Action::"ScheduledCheck", Action::"ScheduleFailed"], + resource is CronScheduler +) when { + principal.agent_type == "system" +}; + +permit( + principal, + action in [Action::"read", Action::"list"], + resource is CronScheduler +); diff --git a/os-apps/temper-agent/policies/heartbeat.cedar b/os-apps/temper-agent/policies/heartbeat.cedar new file mode 100644 index 00000000..d040be21 --- /dev/null +++ b/os-apps/temper-agent/policies/heartbeat.cedar @@ -0,0 +1,24 @@ +// HeartbeatMonitor — Cedar Authorization Policies + +// Admins can do everything +permit( + principal is Admin, + action, + resource is HeartbeatMonitor +); + +// System agents manage the monitor lifecycle +permit( + principal, + action in [Action::"create", Action::"Start", Action::"ScanComplete", Action::"ScanFailed", Action::"ScheduledScan", Action::"ScheduleFailed"], + resource is HeartbeatMonitor +) when { + principal.agent_type == "system" +}; + +// Any authenticated agent can read +permit( + principal, + action in [Action::"read", Action::"list"], + resource is HeartbeatMonitor +); diff --git a/os-apps/temper-agent/policies/hooks.cedar b/os-apps/temper-agent/policies/hooks.cedar new file mode 100644 index 00000000..ae0fd18e --- /dev/null +++ 
b/os-apps/temper-agent/policies/hooks.cedar @@ -0,0 +1,24 @@ +// ToolHook — Cedar Authorization Policies + +// Admins can do everything +permit( + principal is Admin, + action, + resource is ToolHook +); + +// Supervisors and humans can manage hooks +permit( + principal, + action in [Action::"create", Action::"Register", Action::"Disable", Action::"Enable"], + resource is ToolHook +) when { + ["supervisor", "human"].contains(principal.agent_type) +}; + +// Any authenticated agent can read hooks (tool_runner queries them) +permit( + principal, + action in [Action::"read", Action::"list"], + resource is ToolHook +); diff --git a/os-apps/temper-agent/policies/memory.cedar b/os-apps/temper-agent/policies/memory.cedar new file mode 100644 index 00000000..5dc31269 --- /dev/null +++ b/os-apps/temper-agent/policies/memory.cedar @@ -0,0 +1,42 @@ +// AgentMemory — Cedar Authorization Policies + +// Admins can do everything +permit( + principal is Admin, + action, + resource is AgentMemory +); + +// System, supervisor, and human principals can manage memories +permit( + principal, + action in [Action::"create", Action::"Save", Action::"Update", Action::"Recall"], + resource is AgentMemory +) when { + ["system", "supervisor", "human"].contains(principal.agent_type) +}; + +// Agents can save, update, and recall memories scoped to their own soul_id +permit( + principal, + action in [Action::"create", Action::"Save", Action::"Update", Action::"Recall"], + resource is AgentMemory +) when { + principal.agent_type == "agent" && resource.SoulId == principal.soul_id +}; + +// Supervisors and humans can archive any memory +permit( + principal, + action in [Action::"Archive"], + resource is AgentMemory +) when { + ["supervisor", "human"].contains(principal.agent_type) +}; + +// Any authenticated agent can read/list memories +permit( + principal, + action in [Action::"read", Action::"list"], + resource is AgentMemory +); diff --git a/os-apps/temper-agent/policies/skills.cedar 
b/os-apps/temper-agent/policies/skills.cedar new file mode 100644 index 00000000..3ac380e6 --- /dev/null +++ b/os-apps/temper-agent/policies/skills.cedar @@ -0,0 +1,24 @@ +// AgentSkill — Cedar Authorization Policies + +// Admins can do everything +permit( + principal is Admin, + action, + resource is AgentSkill +); + +// Supervisors and humans can register, update, disable, enable skills +permit( + principal, + action in [Action::"create", Action::"Register", Action::"Update", Action::"Disable", Action::"Enable"], + resource is AgentSkill +) when { + ["supervisor", "human"].contains(principal.agent_type) +}; + +// Any authenticated agent can read skills +permit( + principal, + action in [Action::"read", Action::"list"], + resource is AgentSkill +); diff --git a/os-apps/temper-agent/policies/soul.cedar b/os-apps/temper-agent/policies/soul.cedar new file mode 100644 index 00000000..d877712a --- /dev/null +++ b/os-apps/temper-agent/policies/soul.cedar @@ -0,0 +1,24 @@ +// AgentSoul — Cedar Authorization Policies + +// Admins can do everything +permit( + principal is Admin, + action, + resource is AgentSoul +); + +// Supervisors and humans can create, publish, update, archive souls +permit( + principal, + action in [Action::"create", Action::"Create", Action::"Publish", Action::"Update", Action::"Archive"], + resource is AgentSoul +) when { + ["supervisor", "human"].contains(principal.agent_type) +}; + +// Any authenticated agent can read souls +permit( + principal, + action in [Action::"read", Action::"list"], + resource is AgentSoul +); diff --git a/os-apps/temper-agent/specs/agent_memory.ioa.toml b/os-apps/temper-agent/specs/agent_memory.ioa.toml new file mode 100644 index 00000000..1e13da49 --- /dev/null +++ b/os-apps/temper-agent/specs/agent_memory.ioa.toml @@ -0,0 +1,67 @@ +# AgentMemory — Cross-session persistent knowledge. +# +# Memories persist ACROSS agent runs, scoped to a soul_id. 
+# Types: user, feedback, project, reference (matching Claude Code taxonomy). +# Content stored inline (memories are small). + +[automaton] +name = "AgentMemory" +states = ["Active", "Archived"] +initial = "Active" + +[[state]] +name = "key" +type = "string" +initial = "" + +[[state]] +name = "content" +type = "string" +initial = "" + +[[state]] +name = "memory_type" +type = "string" +initial = "project" + +[[state]] +name = "soul_id" +type = "string" +initial = "" + +[[state]] +name = "author_agent_id" +type = "string" +initial = "" + +[[action]] +name = "Save" +kind = "input" +from = ["Active"] +params = ["key", "content", "memory_type", "soul_id", "author_agent_id"] +hint = "Save or initialize a memory entry." + +[[action]] +name = "Update" +kind = "input" +from = ["Active"] +params = ["content"] +hint = "Update the memory content." + +[[action]] +name = "Archive" +kind = "input" +from = ["Active"] +to = "Archived" +hint = "Archive the memory. It will no longer appear in agent prompts." + +[[action]] +name = "Recall" +kind = "input" +from = ["Active"] +hint = "Read-only recall action for audit trail. No state mutation." + +[[invariant]] +name = "ArchivedIsFinal" +when = ["Archived"] +assert = "no_further_transitions" diff --git a/os-apps/temper-agent/specs/agent_skill.ioa.toml b/os-apps/temper-agent/specs/agent_skill.ioa.toml new file mode 100644 index 00000000..467f3775 --- /dev/null +++ b/os-apps/temper-agent/specs/agent_skill.ioa.toml @@ -0,0 +1,62 @@ +# AgentSkill — Lazy-loaded capability descriptions (SKILL.md equivalent). +# +# Skills define WHAT the agent can do. Only descriptions are injected into +# the system prompt; full content loaded on demand via TemperFS read. 
+ +[automaton] +name = "AgentSkill" +states = ["Active", "Disabled"] +initial = "Active" + +[[state]] +name = "name" +type = "string" +initial = "" + +[[state]] +name = "description" +type = "string" +initial = "" + +[[state]] +name = "content_file_id" +type = "string" +initial = "" + +[[state]] +name = "scope" +type = "string" +initial = "global" + +[[state]] +name = "agent_filter" +type = "string" +initial = "" + +[[action]] +name = "Register" +kind = "input" +from = ["Active"] +params = ["name", "description", "content_file_id", "scope", "agent_filter"] +hint = "Register a new skill with name, description, content file, and scope." + +[[action]] +name = "Disable" +kind = "input" +from = ["Active"] +to = "Disabled" +hint = "Disable the skill. It will no longer appear in agent prompts." + +[[action]] +name = "Enable" +kind = "input" +from = ["Disabled"] +to = "Active" +hint = "Re-enable a disabled skill." + +[[action]] +name = "Update" +kind = "input" +from = ["Active"] +params = ["description", "content_file_id"] +hint = "Update skill description or content." diff --git a/os-apps/temper-agent/specs/agent_soul.ioa.toml b/os-apps/temper-agent/specs/agent_soul.ioa.toml new file mode 100644 index 00000000..2f9ddfbe --- /dev/null +++ b/os-apps/temper-agent/specs/agent_soul.ioa.toml @@ -0,0 +1,70 @@ +# AgentSoul — Versioned agent identity document (SOUL.md equivalent). +# +# A Soul defines WHO the agent is: personality, instructions, capabilities, +# constraints. Separate from skills (WHAT) and system_prompt (per-run override). +# Multiple agent runs can share the same Soul identity. 
+ +[automaton] +name = "AgentSoul" +states = ["Draft", "Active", "Archived"] +initial = "Draft" + +[[state]] +name = "name" +type = "string" +initial = "" + +[[state]] +name = "description" +type = "string" +initial = "" + +[[state]] +name = "content_file_id" +type = "string" +initial = "" + +[[state]] +name = "version" +type = "counter" +initial = "0" + +[[state]] +name = "author_id" +type = "string" +initial = "" + +[[action]] +name = "Create" +kind = "input" +from = ["Draft"] +params = ["name", "description", "content_file_id", "author_id"] +hint = "Initialize soul with identity metadata and content file reference." + +[[action]] +name = "Publish" +kind = "input" +from = ["Draft"] +to = "Active" +hint = "Make the soul available for agent assignment. Only supervisors/humans." +effect = [{ type = "increment", var = "version" }] + +[[action]] +name = "Update" +kind = "input" +from = ["Active"] +params = ["content_file_id", "description"] +hint = "Update soul content. Increments version." +effect = [{ type = "increment", var = "version" }] + +[[action]] +name = "Archive" +kind = "input" +from = ["Active"] +to = "Archived" +hint = "Archive the soul. No new agents can use it." + +[[invariant]] +name = "ArchivedIsFinal" +when = ["Archived"] +assert = "no_further_transitions" diff --git a/os-apps/temper-agent/specs/cron_job.ioa.toml b/os-apps/temper-agent/specs/cron_job.ioa.toml new file mode 100644 index 00000000..4c2c2d36 --- /dev/null +++ b/os-apps/temper-agent/specs/cron_job.ioa.toml @@ -0,0 +1,161 @@ +# CronJob — Scheduled agent runs. +# +# Creates and tracks TemperAgent entities on a schedule. +# Template substitution supports {{now}}, {{run_count}}, {{last_result}}. 
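+#
+# Illustrative expansion (hypothetical values): a user_message_template of
+#   "Status check {{run_count}}: last was {{last_result}} ({{now}})"
+# could render on a given firing as
+#   "Status check 3: last was ok (2025-01-15T09:00:00Z)"
+# where {{run_count}} and {{last_result}} come from this automaton's state vars.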
+ +[automaton] +name = "CronJob" +states = ["Created", "Active", "Paused", "Expired"] +initial = "Created" + +[[state]] +name = "name" +type = "string" +initial = "" + +[[state]] +name = "schedule" +type = "string" +initial = "" + +[[state]] +name = "soul_id" +type = "string" +initial = "" + +[[state]] +name = "system_prompt" +type = "string" +initial = "" + +[[state]] +name = "user_message_template" +type = "string" +initial = "" + +[[state]] +name = "model" +type = "string" +initial = "claude-sonnet-4-20250514" + +[[state]] +name = "provider" +type = "string" +initial = "anthropic" + +[[state]] +name = "tools_enabled" +type = "string" +initial = "read,write,edit,bash" + +[[state]] +name = "sandbox_url" +type = "string" +initial = "" + +[[state]] +name = "max_turns" +type = "string" +initial = "20" + +[[state]] +name = "last_run_at" +type = "string" +initial = "" + +[[state]] +name = "next_run_at" +type = "string" +initial = "" + +[[state]] +name = "run_count" +type = "counter" +initial = "0" + +[[state]] +name = "max_runs" +type = "string" +initial = "0" + +[[state]] +name = "last_agent_id" +type = "string" +initial = "" + +[[state]] +name = "last_result" +type = "string" +initial = "" + +[[action]] +name = "Configure" +kind = "input" +from = ["Created"] +params = ["name", "schedule", "soul_id", "system_prompt", "user_message_template", "model", "provider", "tools_enabled", "sandbox_url", "max_turns", "max_runs"] +hint = "Configure the cron job with schedule and agent parameters." + +[[action]] +name = "Activate" +kind = "input" +from = ["Created"] +to = "Active" +hint = "Start the cron schedule." + +[[action]] +name = "Pause" +kind = "input" +from = ["Active"] +to = "Paused" +hint = "Pause the cron schedule." + +[[action]] +name = "Resume" +kind = "input" +from = ["Paused"] +to = "Active" +hint = "Resume the cron schedule." 
+ +[[action]] +name = "Trigger" +kind = "input" +from = ["Active"] +params = ["last_run_at"] +hint = "Fire the cron job — creates and provisions a TemperAgent." +effect = [{ type = "increment", var = "run_count" }, { type = "trigger", name = "cron_trigger" }] + +[[action]] +name = "TriggerComplete" +kind = "input" +from = ["Active"] +params = ["last_agent_id", "last_result"] +hint = "Callback after agent creation. Updates tracking fields." + +[[action]] +name = "TriggerFailed" +kind = "input" +from = ["Active"] +params = ["error_message"] +hint = "Trigger WASM failed. Stays Active for next scheduled run." + +[[action]] +name = "Expire" +kind = "input" +from = ["Active"] +to = "Expired" +hint = "Max runs reached or manually expired." + +[[invariant]] +name = "ExpiredIsFinal" +when = ["Expired"] +assert = "no_further_transitions" + +[[integration]] +name = "cron_trigger" +trigger = "cron_trigger" +type = "wasm" +module = "cron_trigger" +on_failure = "TriggerFailed" + +[integration.config] +temper_api_url = "{secret:temper_api_url}" diff --git a/os-apps/temper-agent/specs/cron_scheduler.ioa.toml b/os-apps/temper-agent/specs/cron_scheduler.ioa.toml new file mode 100644 index 00000000..e9be074f --- /dev/null +++ b/os-apps/temper-agent/specs/cron_scheduler.ioa.toml @@ -0,0 +1,84 @@ +# CronScheduler — Self-scheduling heartbeat that checks for due cron jobs. +# +# One per tenant. Uses HeartbeatRun pattern to periodically query +# active CronJobs and fire Trigger on due ones. + +[automaton] +name = "CronScheduler" +states = ["Idle", "Checking"] +initial = "Idle" + +[[state]] +name = "heartbeat_interval_seconds" +type = "string" +initial = "60" + +[[state]] +name = "last_check_at" +type = "string" +initial = "" + +[[state]] +name = "jobs_triggered" +type = "counter" +initial = "0" + +[[action]] +name = "Start" +kind = "input" +from = ["Idle"] +to = "Checking" +hint = "Begin checking for due cron jobs." 
+effect = [{ type = "trigger", name = "check_due_jobs" }] + +[[action]] +name = "CheckComplete" +kind = "input" +from = ["Checking"] +to = "Idle" +params = ["last_check_at", "jobs_triggered"] +hint = "Check finished. Schedule next check." +effect = [{ type = "increment", var = "jobs_triggered" }, { type = "trigger", name = "schedule_next_check" }] + +[[action]] +name = "ScheduledCheck" +kind = "input" +from = ["Idle"] +to = "Checking" +hint = "Scheduled check triggered." +effect = [{ type = "trigger", name = "check_due_jobs" }] + +[[action]] +name = "CheckFailed" +kind = "input" +from = ["Checking"] +to = "Idle" +params = ["error_message"] +hint = "Check WASM failed. Return to Idle for next scheduled check." + +[[action]] +name = "ScheduleFailed" +kind = "input" +from = ["Idle"] +params = ["error_message"] +hint = "Schedule WASM failed. Stay Idle." + +[[integration]] +name = "check_due_jobs" +trigger = "check_due_jobs" +type = "wasm" +module = "cron_scheduler_check" +on_failure = "CheckFailed" + +[integration.config] +temper_api_url = "{secret:temper_api_url}" + +[[integration]] +name = "schedule_next_check" +trigger = "schedule_next_check" +type = "wasm" +module = "cron_scheduler_heartbeat" +on_failure = "ScheduleFailed" + +[integration.config] +temper_api_url = "{secret:temper_api_url}" diff --git a/os-apps/temper-agent/specs/heartbeat_monitor.ioa.toml b/os-apps/temper-agent/specs/heartbeat_monitor.ioa.toml new file mode 100644 index 00000000..3f4afee9 --- /dev/null +++ b/os-apps/temper-agent/specs/heartbeat_monitor.ioa.toml @@ -0,0 +1,84 @@ +# HeartbeatMonitor — Periodic scanner for stale agents. +# +# One per tenant. Self-scheduling via HeartbeatRun pattern. +# Scans agents in non-terminal states and fires TimeoutFail on stale ones. 
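+#
+# Staleness rule (sketch, assumed semantics): an agent in a non-terminal state
+# is considered stale when now - last_heartbeat_at exceeds its
+# heartbeat_timeout_seconds (default 300 on TemperAgent); the scan then fires
+# TimeoutFail on that agent entity.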
+ +[automaton] +name = "HeartbeatMonitor" +states = ["Idle", "Scanning"] +initial = "Idle" + +[[state]] +name = "scan_interval_seconds" +type = "string" +initial = "30" + +[[state]] +name = "last_scan_at" +type = "string" +initial = "" + +[[state]] +name = "stale_agents_found" +type = "counter" +initial = "0" + +[[action]] +name = "Start" +kind = "input" +from = ["Idle"] +to = "Scanning" +hint = "Begin scanning for stale agents." +effect = [{ type = "trigger", name = "scan_agents" }] + +[[action]] +name = "ScanComplete" +kind = "input" +from = ["Scanning"] +to = "Idle" +params = ["last_scan_at", "stale_agents_found"] +hint = "Scan finished. Schedule next scan." +effect = [{ type = "increment", var = "stale_agents_found" }, { type = "trigger", name = "schedule_next_scan" }] + +[[action]] +name = "ScheduledScan" +kind = "input" +from = ["Idle"] +to = "Scanning" +hint = "Scheduled scan triggered." +effect = [{ type = "trigger", name = "scan_agents" }] + +[[action]] +name = "ScanFailed" +kind = "input" +from = ["Scanning"] +to = "Idle" +params = ["error_message"] +hint = "Scan WASM failed. Return to Idle for next scheduled scan." + +[[action]] +name = "ScheduleFailed" +kind = "input" +from = ["Idle"] +params = ["error_message"] +hint = "Schedule WASM failed. Stay Idle." 
+
+[[integration]]
+name = "scan_agents"
+trigger = "scan_agents"
+type = "wasm"
+module = "heartbeat_scan"
+on_failure = "ScanFailed"
+
+[integration.config]
+temper_api_url = "{secret:temper_api_url}"
+
+[[integration]]
+name = "schedule_next_scan"
+trigger = "schedule_next_scan"
+type = "wasm"
+module = "heartbeat_scheduler"
+on_failure = "ScheduleFailed"
+
+[integration.config]
+temper_api_url = "{secret:temper_api_url}"
diff --git a/os-apps/temper-agent/specs/model.csdl.xml b/os-apps/temper-agent/specs/model.csdl.xml
index 487b0514..5cbb111b 100644
--- a/os-apps/temper-agent/specs/model.csdl.xml
+++ b/os-apps/temper-agent/specs/model.csdl.xml
[CSDL diff body unrecoverable: the XML element markup was stripped during extraction, leaving only empty added lines across hunks @@ -23,12 +23,115 @@ through @@ -106,8 +283,228 @@]
diff --git a/os-apps/temper-agent/specs/temper_agent.ioa.toml b/os-apps/temper-agent/specs/temper_agent.ioa.toml
index 98dac610..aaf7e330 100644
--- a/os-apps/temper-agent/specs/temper_agent.ioa.toml
+++ b/os-apps/temper-agent/specs/temper_agent.ioa.toml
@@ -1,4 +1,4 @@
-# TemperAgent Entity — Spec-driven agent loop via IOA state machine.
+# TemperAgent Entity — Pi-compatible governed agent loop via IOA state machine. # # The agent turn cycle is expressed as state transitions with WASM integration # triggers. No Rust while loop — the platform's dispatch pipeline drives @@ -6,15 +6,23 @@ # Thinking → (call_llm) → ProcessToolCalls → Executing → (run_tools) → # HandleToolResults → Thinking → ... # -# Conversation history stored in TemperFS (conversation_file_id FK). +# Pi architecture additions: +# - Session tree: JSONL append-only tree with branching (session_file_id) +# - Compaction: Thinking → Compacting → Thinking (context_compactor WASM) +# - Steering: Thinking → Steering → Thinking or Completed (two-loop model) +# - Soul/Skills/Memory: entity references for identity, capabilities, knowledge +# - Subagents: parent/child entity relationships via parent_agent_id +# - Heartbeat: liveness monitoring via last_heartbeat_at +# +# Conversation history stored in TemperFS as JSONL session tree. # Budget enforced via turn_count guard. Tools governed by Cedar. 
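+#
+# Illustrative happy-path trace (one possible run):
+#   Configure → Provision → SandboxReady → (ProcessToolCalls → HandleToolResults)*
+#   → CheckSteering → FinalizeResult (Completed)
+# with NeedsCompaction → CompactionComplete interleaved from Thinking whenever
+# context approaches the model window minus reserve_tokens.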
[automaton] name = "TemperAgent" -states = ["Created", "Provisioning", "Thinking", "Executing", "Completed", "Failed", "Cancelled"] +states = ["Created", "Provisioning", "Thinking", "Executing", "Compacting", "Steering", "Completed", "Failed", "Cancelled"] initial = "Created" -# --- State Variables --- +# --- State Variables: Core --- [[state]] name = "model" @@ -96,6 +104,11 @@ name = "sandbox_id" type = "string" initial = "" +[[state]] +name = "temper_api_url" +type = "string" +initial = "http://127.0.0.1:3000" + [[state]] name = "file_manifest_id" type = "string" @@ -126,14 +139,113 @@ name = "conversation" type = "string" initial = "" -# --- Actions --- +# --- State Variables: Session Tree (Phase 2) --- + +[[state]] +name = "session_file_id" +type = "string" +initial = "" + +[[state]] +name = "session_leaf_id" +type = "string" +initial = "" + +[[state]] +name = "context_tokens" +type = "counter" +initial = "0" + +# --- State Variables: Compaction (Phase 3) --- + +[[state]] +name = "reserve_tokens" +type = "string" +initial = "20000" + +[[state]] +name = "keep_recent_tokens" +type = "string" +initial = "10000" + +[[state]] +name = "compaction_count" +type = "counter" +initial = "0" + +[[state]] +name = "compaction_model" +type = "string" +initial = "" + +# --- State Variables: Steering (Phase 4) --- + +[[state]] +name = "steering_messages" +type = "string" +initial = "[]" + +[[state]] +name = "follow_up_count" +type = "counter" +initial = "0" + +[[state]] +name = "max_follow_ups" +type = "string" +initial = "5" + +# --- State Variables: Soul / Skills / Memory (Phase 5) --- + +[[state]] +name = "soul_id" +type = "string" +initial = "" + +# --- State Variables: Tool Hooks (Phase 6) --- + +[[state]] +name = "hook_policy" +type = "string" +initial = "none" + +# --- State Variables: Subagents (Phase 7) --- + +[[state]] +name = "parent_agent_id" +type = "string" +initial = "" + +[[state]] +name = "child_agent_ids" +type = "string" +initial = "[]" + +[[state]] +name = 
"agent_depth" +type = "counter" +initial = "0" + +# --- State Variables: Heartbeat (Phase 8) --- + +[[state]] +name = "last_heartbeat_at" +type = "string" +initial = "" + +[[state]] +name = "heartbeat_timeout_seconds" +type = "string" +initial = "300" + +# --- Actions: Core Agent Loop --- [[action]] name = "Configure" kind = "input" from = ["Created"] -params = ["system_prompt", "user_message", "model", "provider", "max_turns", "tools_enabled", "workdir", "sandbox_url", "temper_api_url"] -hint = "Configure agent with system prompt, user message (task), model, tool settings, optional sandbox URL, and optional Temper API override." +params = ["system_prompt", "user_message", "model", "provider", "max_turns", "tools_enabled", "workdir", "sandbox_url", "temper_api_url", "soul_id", "parent_agent_id", "agent_depth", "max_follow_ups", "hook_policy", "reserve_tokens", "keep_recent_tokens", "compaction_model", "heartbeat_timeout_seconds"] +hint = "Configure agent with system prompt, user message, model, tools, soul, and optional overrides." [[action]] name = "Provision" @@ -148,8 +260,8 @@ name = "SandboxReady" kind = "input" from = ["Provisioning"] to = "Thinking" -params = ["sandbox_url", "sandbox_id", "workspace_id", "conversation_file_id", "file_manifest_id"] -hint = "Callback from sandbox provisioner. Sets sandbox connection, TemperFS workspace/file/manifest, and starts think loop." +params = ["sandbox_url", "sandbox_id", "workspace_id", "conversation_file_id", "file_manifest_id", "session_file_id", "session_leaf_id"] +hint = "Callback from sandbox provisioner. Sets sandbox connection, TemperFS workspace/file/manifest/session, and starts think loop." 
effect = [{ type = "trigger", name = "call_llm" }] [[action]] @@ -157,11 +269,12 @@ name = "ProcessToolCalls" kind = "input" from = ["Thinking"] to = "Executing" -params = ["pending_tool_calls", "conversation", "input_tokens", "output_tokens"] +params = ["pending_tool_calls", "conversation", "input_tokens", "output_tokens", "session_leaf_id", "context_tokens"] hint = "LLM returned tool_use blocks. Record token usage, transition to Executing, and run tools." effect = [ { type = "increment", var = "input_tokens" }, { type = "increment", var = "output_tokens" }, + { type = "increment", var = "context_tokens" }, { type = "trigger", name = "run_tools" } ] @@ -170,7 +283,7 @@ name = "HandleToolResults" kind = "input" from = ["Executing"] to = "Thinking" -params = ["pending_tool_calls", "conversation"] +params = ["pending_tool_calls", "conversation", "session_leaf_id"] guard = "turn_count < 100" hint = "Tool results received. Increment turn, transition to Thinking, and call LLM again. Static safety ceiling at 100 turns; dynamic max_turns enforced by llm_caller at runtime." effect = [ @@ -178,23 +291,119 @@ effect = [ { type = "trigger", name = "call_llm" } ] +# --- Actions: Compaction (Phase 3) --- + +[[action]] +name = "NeedsCompaction" +kind = "input" +from = ["Thinking"] +to = "Compacting" +params = ["input_tokens", "output_tokens"] +hint = "LLM caller detected context tokens exceeds window minus reserve. Trigger compaction." +effect = [ + { type = "increment", var = "input_tokens" }, + { type = "increment", var = "output_tokens" }, + { type = "trigger", name = "compact_context" } +] + +[[action]] +name = "CompactionComplete" +kind = "input" +from = ["Compacting"] +to = "Thinking" +params = ["session_leaf_id", "context_tokens"] +hint = "Compaction finished. Resume LLM call with compacted context." 
+effect = [ + { type = "increment", var = "compaction_count" }, + { type = "trigger", name = "call_llm" } +] + +# --- Actions: Steering (Phase 4) --- + +[[action]] +name = "CheckSteering" +kind = "input" +from = ["Thinking"] +to = "Steering" +params = ["input_tokens", "output_tokens", "session_leaf_id", "context_tokens"] +hint = "LLM returned end_turn. Check for queued steering messages before completing." +effect = [ + { type = "increment", var = "input_tokens" }, + { type = "increment", var = "output_tokens" }, + { type = "increment", var = "context_tokens" }, + { type = "trigger", name = "check_steering" } +] + +[[action]] +name = "ContinueWithSteering" +kind = "input" +from = ["Steering"] +to = "Thinking" +params = ["session_leaf_id", "steering_messages", "conversation"] +guard = "follow_up_count < 100" +hint = "Steering message found. Inject into conversation and continue. Dynamic max_follow_ups enforced by steering_checker." +effect = [ + { type = "increment", var = "turn_count" }, + { type = "increment", var = "follow_up_count" }, + { type = "trigger", name = "call_llm" } +] + +[[action]] +name = "FinalizeResult" +kind = "input" +from = ["Steering"] +to = "Completed" +params = ["result", "conversation", "session_leaf_id"] +hint = "No steering messages queued. Set result and complete." +effect = [ + { type = "set_bool", var = "has_result", value = "true" } +] + +[[action]] +name = "Steer" +kind = "input" +from = ["Thinking", "Executing", "Steering", "Compacting"] +params = ["steering_messages"] +hint = "Queue a steering message for mid-run injection. External callers append messages while agent runs. Self-loop — does not change state." + +# --- Actions: Legacy direct completion (backward compat for max_follow_ups=0) --- + [[action]] name = "RecordResult" kind = "input" from = ["Thinking"] to = "Completed" -params = ["result", "conversation", "input_tokens", "output_tokens"] -hint = "LLM returned end_turn. Record token usage, set result, and complete." 
+params = ["result", "conversation", "input_tokens", "output_tokens", "session_leaf_id"] +hint = "LLM returned end_turn in non-steering mode (max_follow_ups=0). Record token usage, set result, and complete." effect = [ { type = "increment", var = "input_tokens" }, { type = "increment", var = "output_tokens" }, { type = "set_bool", var = "has_result", value = "true" } ] +# --- Actions: Heartbeat (Phase 8) --- + +[[action]] +name = "Heartbeat" +kind = "input" +from = ["Thinking", "Executing", "Steering", "Compacting"] +params = ["last_heartbeat_at"] +hint = "Record agent liveness. Called by WASM modules during long operations. Self-loop." + +[[action]] +name = "TimeoutFail" +kind = "input" +from = ["Thinking", "Executing", "Steering", "Compacting"] +to = "Failed" +params = ["error_message"] +hint = "Agent timed out — no heartbeat within timeout period." + +# --- Actions: Failure, Cancellation, Resume --- + [[action]] name = "Fail" kind = "input" -from = ["Created", "Provisioning", "Thinking", "Executing"] +from = ["Created", "Provisioning", "Thinking", "Executing", "Compacting", "Steering"] to = "Failed" params = ["error_message"] hint = "Mark agent run as failed." @@ -202,7 +411,7 @@ hint = "Mark agent run as failed." [[action]] name = "Cancel" kind = "input" -from = ["Created", "Provisioning", "Thinking", "Executing"] +from = ["Created", "Provisioning", "Thinking", "Executing", "Compacting", "Steering"] to = "Cancelled" hint = "Cancel agent execution." @@ -211,7 +420,7 @@ name = "Resume" kind = "input" from = ["Created"] to = "Provisioning" -params = ["sandbox_url", "sandbox_id", "workspace_id", "conversation_file_id", "file_manifest_id"] +params = ["sandbox_url", "sandbox_id", "workspace_id", "conversation_file_id", "file_manifest_id", "session_file_id", "session_leaf_id"] hint = "Resume agent from saved state. Transitions to Provisioning for workspace restore." 
effect = [{ type = "trigger", name = "restore_workspace" }] @@ -239,7 +448,7 @@ assert = "no_further_transitions" [[invariant]] name = "TurnCountNonNegative" -when = ["Created", "Provisioning", "Thinking", "Executing", "Completed", "Failed", "Cancelled"] +when = ["Created", "Provisioning", "Thinking", "Executing", "Compacting", "Steering", "Completed", "Failed", "Cancelled"] assert = "turn_count >= 0" # --- Integrations --- @@ -253,6 +462,7 @@ on_failure = "Fail" [integration.config] temper_api_url = "{secret:temper_api_url}" +sandbox_url = "{secret:sandbox_url}" e2b_api_key = "{secret:e2b_api_key}" [[integration]] @@ -289,6 +499,28 @@ sync_exclude = "__pycache__,node_modules,.git" logfire_read_token = "{secret:logfire_read_token}" logfire_api_base = "https://logfire-us.pydantic.dev" +[[integration]] +name = "compact_context" +trigger = "compact_context" +type = "wasm" +module = "context_compactor" +on_failure = "Fail" + +[integration.config] +api_key = "{secret:anthropic_api_key}" +temper_api_url = "{secret:temper_api_url}" +timeout_secs = "120" + +[[integration]] +name = "check_steering" +trigger = "check_steering" +type = "wasm" +module = "steering_checker" +on_failure = "Fail" + +[integration.config] +temper_api_url = "{secret:temper_api_url}" + [[integration]] name = "restore_workspace" trigger = "restore_workspace" diff --git a/os-apps/temper-agent/specs/tool_hook.ioa.toml b/os-apps/temper-agent/specs/tool_hook.ioa.toml new file mode 100644 index 00000000..e0387461 --- /dev/null +++ b/os-apps/temper-agent/specs/tool_hook.ioa.toml @@ -0,0 +1,60 @@ +# ToolHook — Before/after hooks for tool execution. +# +# Hooks are evaluated by tool_runner before/after executing tools. +# Supports block, log, and modify actions with regex tool matching. 
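+#
+# Illustrative hook (hypothetical values): hook_type = "before",
+# tool_pattern = "bash.*", hook_action = "block" would stop any matching bash
+# tool call before it executes; priority orders evaluation when several hooks
+# match the same tool.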
+ +[automaton] +name = "ToolHook" +states = ["Active", "Disabled"] +initial = "Active" + +[[state]] +name = "name" +type = "string" +initial = "" + +[[state]] +name = "hook_type" +type = "string" +initial = "before" + +[[state]] +name = "tool_pattern" +type = "string" +initial = ".*" + +[[state]] +name = "hook_action" +type = "string" +initial = "log" + +[[state]] +name = "soul_id" +type = "string" +initial = "" + +[[state]] +name = "priority" +type = "counter" +initial = "0" + +[[action]] +name = "Register" +kind = "input" +from = ["Active"] +params = ["name", "hook_type", "tool_pattern", "hook_action", "soul_id", "priority"] +hint = "Register a tool hook with pattern and action." + +[[action]] +name = "Disable" +kind = "input" +from = ["Active"] +to = "Disabled" +hint = "Disable the hook." + +[[action]] +name = "Enable" +kind = "input" +from = ["Disabled"] +to = "Active" +hint = "Re-enable the hook." diff --git a/os-apps/temper-agent/wasm/build.sh b/os-apps/temper-agent/wasm/build.sh index 27575dc8..de000fc9 100755 --- a/os-apps/temper-agent/wasm/build.sh +++ b/os-apps/temper-agent/wasm/build.sh @@ -5,7 +5,7 @@ set -euo pipefail SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" -for module in llm_caller tool_runner sandbox_provisioner; do +for module in llm_caller tool_runner sandbox_provisioner context_compactor steering_checker coding_agent_runner heartbeat_scan heartbeat_scheduler cron_trigger cron_scheduler_check cron_scheduler_heartbeat workspace_restorer; do echo "Building $module..." (cd "$SCRIPT_DIR/$module" && cargo build --target wasm32-unknown-unknown --release) echo " -> $module built successfully" @@ -13,7 +13,7 @@ done echo "" echo "All WASM modules built. 
Binaries at:" -for module in llm_caller tool_runner sandbox_provisioner; do +for module in llm_caller tool_runner sandbox_provisioner context_compactor steering_checker coding_agent_runner heartbeat_scan heartbeat_scheduler cron_trigger cron_scheduler_check cron_scheduler_heartbeat workspace_restorer; do wasm_file="$SCRIPT_DIR/$module/target/wasm32-unknown-unknown/release/${module/-/_}.wasm" if [ -f "$wasm_file" ]; then size=$(wc -c < "$wasm_file" | tr -d ' ') diff --git a/os-apps/temper-agent/wasm/coding_agent_runner/Cargo.lock b/os-apps/temper-agent/wasm/coding_agent_runner/Cargo.lock new file mode 100644 index 00000000..e93710c8 --- /dev/null +++ b/os-apps/temper-agent/wasm/coding_agent_runner/Cargo.lock @@ -0,0 +1,112 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "coding-agent-runner" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/coding_agent_runner/Cargo.toml b/os-apps/temper-agent/wasm/coding_agent_runner/Cargo.toml new file mode 100644 index 00000000..0df812ee --- /dev/null +++ b/os-apps/temper-agent/wasm/coding_agent_runner/Cargo.toml @@ -0,0 +1,12 @@ +[package] +name = "coding-agent-runner" +version = "0.1.0" +edition = "2024" + 
+[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } diff --git a/os-apps/temper-agent/wasm/coding_agent_runner/src/lib.rs b/os-apps/temper-agent/wasm/coding_agent_runner/src/lib.rs new file mode 100644 index 00000000..b63588d5 --- /dev/null +++ b/os-apps/temper-agent/wasm/coding_agent_runner/src/lib.rs @@ -0,0 +1,96 @@ +//! Coding Agent Runner — WASM module for spawning coding agent CLI processes. +//! +//! Maps agent_type to CLI commands and executes them in the sandbox. +//! Supports claude-code, codex, pi, and opencode. + +use temper_wasm_sdk::prelude::*; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + ctx.log("info", "coding_agent_runner: starting"); + + let fields = ctx.entity_state.get("fields").cloned().unwrap_or(json!({})); + let sandbox_url = fields.get("sandbox_url").and_then(|v| v.as_str()).unwrap_or(""); + let workdir = fields.get("workdir").and_then(|v| v.as_str()).unwrap_or("/workspace"); + + if sandbox_url.is_empty() { + return Err("coding_agent_runner: sandbox_url is empty".to_string()); + } + + // Read tool input from trigger params + let input = ctx.trigger_params.get("input").cloned().unwrap_or(json!({})); + let agent_type = input.get("agent_type").and_then(|v| v.as_str()).unwrap_or("claude-code"); + let task = input.get("task").and_then(|v| v.as_str()).unwrap_or(""); + let task_workdir = input.get("workdir").and_then(|v| v.as_str()).unwrap_or(workdir); + + if task.is_empty() { + return Err("coding_agent_runner: task is empty".to_string()); + } + + // Map agent_type to CLI command + let command = match agent_type { + "claude-code" => format!("claude --permission-mode bypassPermissions --print '{}'", escape_single_quotes(task)), + "codex" => format!("codex exec '{}'", escape_single_quotes(task)), + "pi" => format!("pi -p '{}'", 
escape_single_quotes(task)),
+        "opencode" => format!("opencode run '{}'", escape_single_quotes(task)),
+        other => return Err(format!("coding_agent_runner: unsupported agent_type: {other}")),
+    };
+
+    ctx.log("info", &format!("coding_agent_runner: running {agent_type}: {}", &command[..command.len().min(100)]));
+
+    // Execute via sandbox bash API
+    let url = format!("{sandbox_url}/v1/processes/run");
+    let body = serde_json::to_string(&json!({
+        "command": command,
+        "workdir": task_workdir,
+    })).unwrap_or_default();
+
+    let headers = vec![("content-type".to_string(), "application/json".to_string())];
+    let resp = ctx.http_call("POST", &url, &headers, &body)?;
+
+    let output = if resp.status >= 200 && resp.status < 300 {
+        if let Ok(parsed) = serde_json::from_str::<serde_json::Value>(&resp.body) {
+            let stdout = parsed.get("stdout").and_then(|v| v.as_str()).unwrap_or("");
+            let stderr = parsed.get("stderr").and_then(|v| v.as_str()).unwrap_or("");
+            let exit_code = parsed.get("exit_code").and_then(|v| v.as_i64()).unwrap_or(-1);
+            let mut out = String::new();
+            if !stdout.is_empty() { out.push_str(stdout); }
+            if !stderr.is_empty() {
+                if !out.is_empty() { out.push('\n'); }
+                out.push_str("STDERR: ");
+                out.push_str(stderr);
+            }
+            if exit_code != 0 {
+                out.push_str(&format!("\n(exit code: {exit_code})"));
+            }
+            out
+        } else {
+            resp.body
+        }
+    } else {
+        format!("Error (HTTP {}): {}", resp.status, &resp.body[..resp.body.len().min(500)])
+    };
+
+    // Return the output as a tool result
+    set_success_result("HandleToolResults", &json!({
+        "pending_tool_calls": json!([{
+            "type": "tool_result",
+            "tool_use_id": input.get("tool_use_id").and_then(|v| v.as_str()).unwrap_or("unknown"),
+            "content": output,
+        }]).to_string(),
+    }));
+
+    Ok(())
+    })();
+
+    if let Err(e) = result {
+        set_error_result(&e);
+    }
+    0
+}
+
+fn escape_single_quotes(s: &str) -> String {
+    s.replace('\'', "'\\''")
+}
diff --git a/os-apps/temper-agent/wasm/coding_agent_runner/Cargo.lock
b/os-apps/temper-agent/wasm/context_compactor/Cargo.lock new file mode 100644 index 00000000..b75a0fc0 --- /dev/null +++ b/os-apps/temper-agent/wasm/context_compactor/Cargo.lock @@ -0,0 +1,129 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "context-compactor" +version = "0.1.0" +dependencies = [ + "session-tree-lib", + "temper-wasm-sdk", + "wasm-helpers", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" 
+dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "session-tree-lib" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "wasm-helpers" +version = "0.1.0" +dependencies = [ + "serde_json", + "temper-wasm-sdk", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/context_compactor/Cargo.toml b/os-apps/temper-agent/wasm/context_compactor/Cargo.toml new file mode 100644 index 00000000..5854e251 --- /dev/null +++ b/os-apps/temper-agent/wasm/context_compactor/Cargo.toml @@ -0,0 +1,14 @@ +[package] +name = "context-compactor" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +session-tree-lib = { path = "../session-tree-lib" } +wasm-helpers = { path = "../wasm-helpers" } diff --git a/os-apps/temper-agent/wasm/context_compactor/src/lib.rs 
b/os-apps/temper-agent/wasm/context_compactor/src/lib.rs new file mode 100644 index 00000000..bb06248b --- /dev/null +++ b/os-apps/temper-agent/wasm/context_compactor/src/lib.rs @@ -0,0 +1,231 @@ +//! Context Compactor — WASM module for compacting long agent conversations. +//! +//! When the session tree exceeds the context window (minus reserve_tokens), +//! this module is triggered. It summarizes older messages using an LLM call +//! and replaces them with a compaction entry in the session tree. +//! +//! Build: `cargo build --target wasm32-unknown-unknown --release` + +use session_tree_lib::SessionTree; +use temper_wasm_sdk::prelude::*; +use wasm_helpers::{read_session_from_temperfs, resolve_temper_api_url, write_session_to_temperfs}; + +/// Entry point. +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + ctx.log("info", "context_compactor: starting"); + + let fields = ctx.entity_state.get("fields").cloned().unwrap_or(json!({})); + + // Read compaction parameters + let keep_recent_tokens: usize = fields + .get("keep_recent_tokens") + .and_then(|v| v.as_str()) + .and_then(|s| s.parse().ok()) + .unwrap_or(10000); + + let session_file_id = fields + .get("session_file_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + + let session_leaf_id = fields + .get("session_leaf_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + + if session_file_id.is_empty() || session_leaf_id.is_empty() { + return Err("context_compactor: missing session_file_id or session_leaf_id".to_string()); + } + + let temper_api_url = resolve_temper_api_url(&ctx, &fields); + let tenant = &ctx.tenant; + + // 1. 
Read session tree from TemperFS + let session_jsonl = read_session_from_temperfs(&ctx, &temper_api_url, tenant, session_file_id)?; + let mut tree = SessionTree::from_jsonl(&session_jsonl); + + ctx.log("info", &format!( + "context_compactor: tree has {} entries, estimating tokens from leaf {}", + tree.len(), session_leaf_id + )); + + // 2. Find cut point + let cut_point = match tree.find_cut_point(session_leaf_id, keep_recent_tokens) { + Some(cp) => cp, + None => { + ctx.log("warn", "context_compactor: no valid cut point found, skipping compaction"); + set_success_result("CompactionComplete", &json!({ + "session_leaf_id": session_leaf_id, + "context_tokens": tree.estimate_tokens(session_leaf_id), + })); + return Ok(()); + } + }; + + ctx.log("info", &format!("context_compactor: cut point at entry {}", cut_point)); + + // 3. Build compaction prompt from messages being cut + let messages_to_summarize = tree.build_context(&cut_point); + if messages_to_summarize.is_empty() { + ctx.log("warn", "context_compactor: no messages to summarize"); + set_success_result("CompactionComplete", &json!({ + "session_leaf_id": session_leaf_id, + "context_tokens": tree.estimate_tokens(session_leaf_id), + })); + return Ok(()); + } + + let conversation_text = format_messages_for_summary(&messages_to_summarize); + + // 4. Call LLM for structured summary + let compaction_model = fields + .get("compaction_model") + .and_then(|v| v.as_str()) + .filter(|s| !s.is_empty()) + .unwrap_or_else(|| { + fields.get("model").and_then(|v| v.as_str()).unwrap_or("claude-sonnet-4-20250514") + }); + + let api_key = ctx.config.get("api_key").cloned().unwrap_or_default(); + let provider = fields + .get("provider") + .and_then(|v| v.as_str()) + .unwrap_or("anthropic"); + let summary = if provider.eq_ignore_ascii_case("mock") || api_key.trim().is_empty() { + build_mock_summary(&conversation_text) + } else { + call_compaction_llm(&ctx, &api_key, compaction_model, &conversation_text)? 
+ }; + + ctx.log("info", &format!( + "context_compactor: generated summary ({} chars)", + summary.len() + )); + + // 5. Append compaction entry to session tree + let (compaction_id, _line) = tree.append_compaction(session_leaf_id, &summary, &cut_point); + + // 6. Write updated session tree back to TemperFS + let updated_jsonl = tree.to_jsonl(); + write_session_to_temperfs(&ctx, &temper_api_url, tenant, session_file_id, &updated_jsonl)?; + + // 7. Return CompactionComplete with new leaf pointing after compaction + let new_token_estimate = tree.estimate_tokens(&compaction_id); + set_success_result("CompactionComplete", &json!({ + "session_leaf_id": compaction_id, + "context_tokens": new_token_estimate, + })); + + Ok(()) + })(); + + if let Err(e) = result { + set_error_result(&e); + } + 0 +} + +fn build_mock_summary(conversation_text: &str) -> String { + let truncated: String = conversation_text.chars().take(600).collect(); + format!( + "## Goal\nPreserve the active task.\n\n## Constraints & Preferences\nStay within the current workspace and existing agent context.\n\n## Progress\n- Done: Earlier conversation was compacted.\n- In Progress: Continue the active task with the remaining context.\n- Blocked: None.\n\n## Key Decisions\nUse the deterministic mock compaction path when no real model is configured.\n\n## Next Steps\nResume the agent loop after compaction.\n\n## Critical Context\n{}", + truncated + ) +} + +/// Format messages into a text block for the compaction LLM prompt. 
+fn format_messages_for_summary(messages: &[Value]) -> String { + let mut text = String::new(); + for msg in messages { + let role = msg.get("role").and_then(|v| v.as_str()).unwrap_or("unknown"); + let content = msg.get("content").cloned().unwrap_or(json!("")); + let content_str = match content { + Value::String(s) => s, + Value::Array(arr) => { + arr.iter() + .filter_map(|block| { + if block.get("type").and_then(|v| v.as_str()) == Some("text") { + block.get("text").and_then(|v| v.as_str()).map(String::from) + } else if block.get("type").and_then(|v| v.as_str()) == Some("tool_use") { + Some(format!("[tool_use: {}]", block.get("name").and_then(|v| v.as_str()).unwrap_or("unknown"))) + } else if block.get("type").and_then(|v| v.as_str()) == Some("tool_result") { + let content = block.get("content").and_then(|v| v.as_str()).unwrap_or("..."); + let truncated = if content.len() > 200 { &content[..200] } else { content }; + Some(format!("[tool_result: {}]", truncated)) + } else { + None + } + }) + .collect::<Vec<String>>() + .join("\n") + } + _ => serde_json::to_string(&content).unwrap_or_default(), + }; + text.push_str(&format!("## {role}\n{content_str}\n\n")); + } + text +} + +/// Call the LLM with a compaction-specific system prompt. +fn call_compaction_llm( + ctx: &Context, + api_key: &str, + model: &str, + conversation_text: &str, +) -> Result<String, String> { + let system_prompt = "You are a conversation compactor. Summarize the following conversation into a structured summary. Be concise but preserve all important context, decisions, and progress.
Output the summary in this exact format:\n\n## Goal\n<goal>\n\n## Constraints & Preferences\n<constraints>\n\n## Progress\n- Done: <items>\n- In Progress: <items>\n- Blocked: <items>\n\n## Key Decisions\n<decisions>\n\n## Next Steps\n<steps>\n\n## Critical Context\n<critical context>"; + + let body = json!({ + "model": model, + "max_tokens": 2048, + "system": system_prompt, + "messages": [{ + "role": "user", + "content": format!("Summarize this conversation:\n\n{conversation_text}") + }] + }); + + let is_oauth = api_key.contains("sk-ant-oat"); + let headers = if is_oauth { + vec![ + ("authorization".to_string(), format!("Bearer {api_key}")), + ("anthropic-version".to_string(), "2023-06-01".to_string()), + ("anthropic-beta".to_string(), "oauth-2025-04-20".to_string()), + ("content-type".to_string(), "application/json".to_string()), + ] + } else { + vec![ + ("x-api-key".to_string(), api_key.to_string()), + ("anthropic-version".to_string(), "2023-06-01".to_string()), + ("content-type".to_string(), "application/json".to_string()), + ] + }; + + let body_str = serde_json::to_string(&body).map_err(|e| format!("JSON serialize error: {e}"))?; + + let resp = ctx.http_call("POST", "https://api.anthropic.com/v1/messages", &headers, &body_str)?; + if resp.status != 200 { + return Err(format!( + "Compaction LLM call failed (HTTP {}): {}", + resp.status, + &resp.body[..resp.body.len().min(500)] + )); + } + + let parsed: Value = serde_json::from_str(&resp.body) + .map_err(|e| format!("failed to parse compaction LLM response: {e}"))?; + + // Extract text from response + let text = parsed + .get("content") + .and_then(|v| v.as_array()) + .and_then(|arr| arr.iter().find(|b| b.get("type").and_then(|v| v.as_str()) == Some("text"))) + .and_then(|b| b.get("text").and_then(|v| v.as_str())) + .unwrap_or("Summary unavailable") + .to_string(); + + Ok(text) +} diff --git a/os-apps/temper-agent/wasm/cron_scheduler_check/Cargo.lock b/os-apps/temper-agent/wasm/cron_scheduler_check/Cargo.lock new file mode 100644 index 00000000..01b5744f --- /dev/null +++
b/os-apps/temper-agent/wasm/cron_scheduler_check/Cargo.lock @@ -0,0 +1,121 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "cron-scheduler-check" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", + "wasm-helpers", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "wasm-helpers" +version = "0.1.0" +dependencies = [ + "serde_json", + "temper-wasm-sdk", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/cron_scheduler_check/Cargo.toml b/os-apps/temper-agent/wasm/cron_scheduler_check/Cargo.toml new file mode 100644 index 00000000..70520692 --- /dev/null +++ b/os-apps/temper-agent/wasm/cron_scheduler_check/Cargo.toml @@ -0,0 +1,13 @@ +[package] +name = "cron-scheduler-check" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +wasm-helpers = { path = "../wasm-helpers" } diff --git a/os-apps/temper-agent/wasm/cron_scheduler_check/src/lib.rs b/os-apps/temper-agent/wasm/cron_scheduler_check/src/lib.rs new file mode 100644 index 00000000..f51f8c76 --- /dev/null +++ b/os-apps/temper-agent/wasm/cron_scheduler_check/src/lib.rs @@ -0,0 +1,74 @@ +//! Cron Scheduler Check — WASM module for checking due cron jobs. +//! +//! 
Queries active CronJobs where NextRunAt <= now and fires Trigger on each. + +use temper_wasm_sdk::prelude::*; +use wasm_helpers::{entity_field_str, resolve_temper_api_url}; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + ctx.log("info", "cron_scheduler_check: starting"); + + let fields = ctx.entity_state.get("fields").cloned().unwrap_or(json!({})); + let temper_api_url = resolve_temper_api_url(&ctx, &fields); + let tenant = &ctx.tenant; + + let headers = vec![ + ("content-type".to_string(), "application/json".to_string()), + ("x-tenant-id".to_string(), tenant.to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ("accept".to_string(), "application/json".to_string()), + ]; + + // Query active cron jobs + let url = format!("{temper_api_url}/tdata/CronJobs?$filter=Status eq 'Active'"); + let resp = ctx.http_call("GET", &url, &headers, "")?; + + let mut triggered_count: i64 = 0; + + if resp.status == 200 { + let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({"value": []})); + let jobs = parsed.get("value").and_then(|v| v.as_array()).cloned().unwrap_or_default(); + + ctx.log("info", &format!("cron_scheduler_check: found {} active cron jobs", jobs.len())); + + for job in &jobs { + let job_id = job + .get("entity_id") + .and_then(|v| v.as_str()) + .or_else(|| entity_field_str(job, &["Id"])) + .unwrap_or(""); + // NextRunAt check deferred to cron_scheduler — this module triggers all active jobs + // that the scheduler determined are due + let trigger_url = format!("{temper_api_url}/tdata/CronJobs('{job_id}')/Temper.Agent.Trigger"); + let trigger_body = json!({ "last_run_at": "" }); + match ctx.http_call("POST", &trigger_url, &headers, &trigger_body.to_string()) { + Ok(r) if r.status >= 200 && r.status < 300 => { + triggered_count += 1; + ctx.log("info", &format!("cron_scheduler_check: triggered job {}", job_id)); 
+ } + Ok(r) => { + ctx.log("warn", &format!("cron_scheduler_check: failed to trigger job {} (HTTP {})", job_id, r.status)); + } + Err(e) => { + ctx.log("warn", &format!("cron_scheduler_check: failed to trigger job {}: {}", job_id, e)); + } + } + } + } + + set_success_result("CheckComplete", &json!({ + "last_check_at": "", + "jobs_triggered": triggered_count, + })); + + Ok(()) + })(); + + if let Err(e) = result { + set_error_result(&e); + } + 0 +} diff --git a/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/Cargo.lock b/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/Cargo.lock new file mode 100644 index 00000000..c72b6c1f --- /dev/null +++ b/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/Cargo.lock @@ -0,0 +1,121 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "cron-scheduler-heartbeat" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", + "wasm-helpers", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "wasm-helpers" +version = "0.1.0" +dependencies = [ + "serde_json", + "temper-wasm-sdk", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/Cargo.toml b/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/Cargo.toml new file mode 100644 index 00000000..80708a1f --- /dev/null +++ 
b/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/Cargo.toml @@ -0,0 +1,13 @@ +[package] +name = "cron-scheduler-heartbeat" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +wasm-helpers = { path = "../wasm-helpers" } diff --git a/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/src/lib.rs b/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/src/lib.rs new file mode 100644 index 00000000..435f7e16 --- /dev/null +++ b/os-apps/temper-agent/wasm/cron_scheduler_heartbeat/src/lib.rs @@ -0,0 +1,48 @@ +use temper_wasm_sdk::prelude::*; +use wasm_helpers::resolve_temper_api_url; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + let fields = ctx.entity_state.get("fields").cloned().unwrap_or_else(|| json!({})); + let interval_seconds = fields + .get("heartbeat_interval_seconds") + .and_then(|v| v.as_str()) + .and_then(|v| v.parse::<u64>().ok()) + .unwrap_or(60) + .clamp(1, 300); + let base_url = resolve_temper_api_url(&ctx, &fields); + let headers = vec![ + ("x-tenant-id".to_string(), ctx.tenant.clone()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ("accept".to_string(), "application/json".to_string()), + ("content-type".to_string(), "application/json".to_string()), + ]; + + let wait_url = format!( + "{base_url}/observe/entities/{}/{}/wait?statuses=__never__&timeout_ms={}&poll_ms=250", + ctx.entity_type, + ctx.entity_id, + interval_seconds * 1000 + ); + let _ = ctx.http_call("GET", &wait_url, &headers, "")?; + + let action_url = format!( + "{base_url}/tdata/CronSchedulers('{}')/Temper.Agent.CronScheduler.ScheduledCheck", + ctx.entity_id + ); + let _ = ctx.http_call("POST", &action_url, &headers, "{}")?; + + set_success_result("ScheduleFailed", &json!({ + "error_message": "", + })); + Ok(()) + })(); + + if let Err(error)
= result { + set_error_result(&error); + } + 0 +} + diff --git a/os-apps/temper-agent/wasm/cron_trigger/Cargo.lock b/os-apps/temper-agent/wasm/cron_trigger/Cargo.lock new file mode 100644 index 00000000..e9329605 --- /dev/null +++ b/os-apps/temper-agent/wasm/cron_trigger/Cargo.lock @@ -0,0 +1,121 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "cron-trigger" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", + "wasm-helpers", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" 
+checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "wasm-helpers" +version = "0.1.0" +dependencies = [ + "serde_json", + "temper-wasm-sdk", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/cron_trigger/Cargo.toml b/os-apps/temper-agent/wasm/cron_trigger/Cargo.toml new file mode 100644 index 00000000..e1565469 --- /dev/null +++ b/os-apps/temper-agent/wasm/cron_trigger/Cargo.toml @@ -0,0 +1,13 @@ +[package] +name = "cron-trigger" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +wasm-helpers = { path = "../wasm-helpers" } diff --git a/os-apps/temper-agent/wasm/cron_trigger/src/lib.rs b/os-apps/temper-agent/wasm/cron_trigger/src/lib.rs new file mode 100644 index 00000000..af9a9087 --- /dev/null +++ 
b/os-apps/temper-agent/wasm/cron_trigger/src/lib.rs @@ -0,0 +1,105 @@ +//! Cron Trigger — WASM module for firing scheduled agent runs. +//! +//! Creates a new TemperAgent entity with the cron job's configuration, +//! including template variable substitution. + +use temper_wasm_sdk::prelude::*; +use wasm_helpers::resolve_temper_api_url; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + ctx.log("info", "cron_trigger: starting"); + + let fields = ctx.entity_state.get("fields").cloned().unwrap_or(json!({})); + let temper_api_url = resolve_temper_api_url(&ctx, &fields); + let tenant = &ctx.tenant; + + // Read cron job configuration + let soul_id = fields.get("soul_id").and_then(|v| v.as_str()).unwrap_or(""); + let system_prompt = fields.get("system_prompt").and_then(|v| v.as_str()).unwrap_or(""); + let user_message_template = fields.get("user_message_template").and_then(|v| v.as_str()).unwrap_or(""); + let model = fields.get("model").and_then(|v| v.as_str()).unwrap_or("claude-sonnet-4-20250514"); + let provider = fields.get("provider").and_then(|v| v.as_str()).unwrap_or("anthropic"); + let tools_enabled = fields.get("tools_enabled").and_then(|v| v.as_str()).unwrap_or("read,write,edit,bash"); + let sandbox_url = fields.get("sandbox_url").and_then(|v| v.as_str()).unwrap_or(""); + let max_turns = fields.get("max_turns").and_then(|v| v.as_str()).unwrap_or("20"); + let run_count = fields.get("run_count").and_then(|v| v.as_i64()).unwrap_or(0); + let last_result = fields.get("last_result").and_then(|v| v.as_str()).unwrap_or(""); + + // Template substitution + let user_message = user_message_template + .replace("{{run_count}}", &run_count.to_string()) + .replace("{{last_result}}", last_result) + .replace("{{now}}", ""); // timestamp injected by cron_scheduler before trigger + + ctx.log("info", &format!("cron_trigger: creating agent for run #{}", run_count)); + + 
let headers = vec![ + ("content-type".to_string(), "application/json".to_string()), + ("x-tenant-id".to_string(), tenant.to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ]; + + // 1. Create TemperAgent entity + let create_url = format!("{temper_api_url}/tdata/TemperAgents"); + let create_resp = ctx.http_call("POST", &create_url, &headers, "{}")?; + if create_resp.status < 200 || create_resp.status >= 300 { + return Err(format!("Failed to create agent (HTTP {}): {}", create_resp.status, &create_resp.body[..create_resp.body.len().min(200)])); + } + + let agent: Value = serde_json::from_str(&create_resp.body) + .map_err(|e| format!("Failed to parse agent response: {e}"))?; + let agent_id = agent + .get("entity_id") + .or_else(|| agent.get("Id")) + .and_then(|v| v.as_str()) + .unwrap_or(""); + if agent_id.is_empty() { + return Err("Failed to extract created agent ID".to_string()); + } + + // 2. Configure the agent + let configure_url = format!( + "{temper_api_url}/tdata/TemperAgents('{agent_id}')/Temper.Agent.TemperAgent.Configure" + ); + let configure_body = json!({ + "system_prompt": system_prompt, + "user_message": user_message, + "model": model, + "provider": provider, + "tools_enabled": tools_enabled, + "sandbox_url": sandbox_url, + "max_turns": max_turns, + "soul_id": soul_id, + }); + let configure_resp = ctx.http_call("POST", &configure_url, &headers, &configure_body.to_string())?; + if configure_resp.status < 200 || configure_resp.status >= 300 { + return Err(format!("Failed to configure agent (HTTP {})", configure_resp.status)); + } + + // 3. 
Provision the agent + let provision_url = format!( + "{temper_api_url}/tdata/TemperAgents('{agent_id}')/Temper.Agent.TemperAgent.Provision" + ); + let provision_resp = ctx.http_call("POST", &provision_url, &headers, "{}")?; + if provision_resp.status < 200 || provision_resp.status >= 300 { + return Err(format!("Failed to provision agent (HTTP {})", provision_resp.status)); + } + + ctx.log("info", &format!("cron_trigger: agent {} created and provisioned", agent_id)); + + set_success_result("TriggerComplete", &json!({ + "last_agent_id": agent_id, + "last_result": "", + })); + + Ok(()) + })(); + + if let Err(e) = result { + set_error_result(&e); + } + 0 +} diff --git a/os-apps/temper-agent/wasm/heartbeat_scan/Cargo.lock b/os-apps/temper-agent/wasm/heartbeat_scan/Cargo.lock new file mode 100644 index 00000000..aa2d0bdc --- /dev/null +++ b/os-apps/temper-agent/wasm/heartbeat_scan/Cargo.lock @@ -0,0 +1,121 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. 
+version = 4 + +[[package]] +name = "heartbeat-scan" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", + "wasm-helpers", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + 
"serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "wasm-helpers" +version = "0.1.0" +dependencies = [ + "serde_json", + "temper-wasm-sdk", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/heartbeat_scan/Cargo.toml b/os-apps/temper-agent/wasm/heartbeat_scan/Cargo.toml new file mode 100644 index 00000000..e91cc271 --- /dev/null +++ b/os-apps/temper-agent/wasm/heartbeat_scan/Cargo.toml @@ -0,0 +1,13 @@ +[package] +name = "heartbeat-scan" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +wasm-helpers = { path = "../wasm-helpers" } diff --git a/os-apps/temper-agent/wasm/heartbeat_scan/src/lib.rs b/os-apps/temper-agent/wasm/heartbeat_scan/src/lib.rs new file mode 100644 index 00000000..da77da83 --- /dev/null +++ b/os-apps/temper-agent/wasm/heartbeat_scan/src/lib.rs @@ -0,0 +1,150 @@ +//! Heartbeat Scanner — WASM module for detecting stale agents. +//! +//! Queries TemperAgent entities in non-terminal states, checks heartbeat freshness, +//! and fires TimeoutFail on stale ones. 
+ +use temper_wasm_sdk::prelude::*; +use wasm_helpers::{entity_field_str, parse_iso8601_to_epoch_secs, resolve_temper_api_url}; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + ctx.log("info", "heartbeat_scan: starting"); + + let fields = ctx.entity_state.get("fields").cloned().unwrap_or(json!({})); + let temper_api_url = resolve_temper_api_url(&ctx, &fields); + let tenant = &ctx.tenant; + + // Get scanner's reference timestamp for "now" + let scan_started_at = fields + .get("last_scan_at") + .and_then(|v| v.as_str()) + .unwrap_or(""); + let now_secs = parse_iso8601_to_epoch_secs(scan_started_at).unwrap_or(0); + + let headers = vec![ + ("x-tenant-id".to_string(), tenant.to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ("accept".to_string(), "application/json".to_string()), + ]; + + // Query agents in non-terminal states + let filter = "$filter=Status ne 'Completed' and Status ne 'Failed' and Status ne 'Cancelled' and Status ne 'Created'"; + let url = format!("{temper_api_url}/tdata/TemperAgents?{filter}"); + let resp = ctx.http_call("GET", &url, &headers, "")?; + + let mut stale_count: i64 = 0; + + if resp.status == 200 { + let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({"value": []})); + let agents = parsed.get("value").and_then(|v| v.as_array()).cloned().unwrap_or_default(); + + ctx.log("info", &format!("heartbeat_scan: checking {} active agents", agents.len())); + + for agent in &agents { + let agent_id = agent + .get("entity_id") + .and_then(|v| v.as_str()) + .or_else(|| entity_field_str(agent, &["Id"])) + .unwrap_or(""); + let last_heartbeat = + entity_field_str(agent, &["LastHeartbeatAt"]).unwrap_or(""); + let timeout_secs: u64 = entity_field_str(agent, &["HeartbeatTimeoutSeconds"]) + .and_then(|s| s.parse().ok()) + .unwrap_or(300); + + // Skip agents without heartbeat monitoring configured. 
+ if timeout_secs == 0 { + continue; + } + + let is_stale = if last_heartbeat.is_empty() { + // No heartbeat ever observed — stale + true + } else if now_secs > 0 { + // Compare heartbeat timestamp against current time + match parse_iso8601_to_epoch_secs(last_heartbeat) { + Some(hb_secs) => now_secs.saturating_sub(hb_secs) > timeout_secs, + None => { + ctx.log("warn", &format!( + "heartbeat_scan: agent {} has unparseable heartbeat timestamp '{}'", + agent_id, last_heartbeat + )); + false + } + } + } else { + // No reference time available; only flag agents with no heartbeat at all + ctx.log("info", &format!( + "heartbeat_scan: agent {} has heartbeat '{}' but no scan reference time, skipping comparison", + agent_id, last_heartbeat + )); + false + }; + + if is_stale { + let fail_url = format!( + "{temper_api_url}/tdata/TemperAgents('{agent_id}')/Temper.Agent.TemperAgent.TimeoutFail" + ); + let elapsed_msg = if last_heartbeat.is_empty() { + "no heartbeat observed".to_string() + } else { + let hb_secs = parse_iso8601_to_epoch_secs(last_heartbeat).unwrap_or(0); + format!("last heartbeat {}s ago", now_secs.saturating_sub(hb_secs)) + }; + let fail_body = json!({ + "error_message": format!( + "heartbeat timeout: {} (timeout: {}s)", + elapsed_msg, timeout_secs + ) + }); + match ctx.http_call("POST", &fail_url, &headers, &fail_body.to_string()) { + Ok(resp) if resp.status >= 200 && resp.status < 300 => { + stale_count += 1; + ctx.log( + "warn", + &format!("heartbeat_scan: failed stale agent {}", agent_id), + ); + } + Ok(resp) => ctx.log( + "warn", + &format!( + "heartbeat_scan: TimeoutFail failed for {} (HTTP {})", + agent_id, resp.status + ), + ), + Err(error) => ctx.log( + "warn", + &format!( + "heartbeat_scan: TimeoutFail failed for {}: {}", + agent_id, error + ), + ), + } + } else { + ctx.log( + "info", + &format!( + "heartbeat_scan: agent {} heartbeat marker='{}' timeout={}s — alive", + agent_id, last_heartbeat, timeout_secs + ), + ); + } + } + } + + // Return scan 
complete + set_success_result("ScanComplete", &json!({ + "last_scan_at": "scan-complete", + "stale_agents_found": stale_count, + })); + + Ok(()) + })(); + + if let Err(e) = result { + set_error_result(&e); + } + 0 +} diff --git a/os-apps/temper-agent/wasm/heartbeat_scheduler/Cargo.lock b/os-apps/temper-agent/wasm/heartbeat_scheduler/Cargo.lock new file mode 100644 index 00000000..9b68d615 --- /dev/null +++ b/os-apps/temper-agent/wasm/heartbeat_scheduler/Cargo.lock @@ -0,0 +1,121 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "heartbeat-scheduler" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", + "wasm-helpers", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "wasm-helpers" +version = "0.1.0" +dependencies = [ + "serde_json", + "temper-wasm-sdk", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/heartbeat_scheduler/Cargo.toml b/os-apps/temper-agent/wasm/heartbeat_scheduler/Cargo.toml new file mode 100644 index 00000000..469534ff --- /dev/null +++ b/os-apps/temper-agent/wasm/heartbeat_scheduler/Cargo.toml @@ -0,0 +1,13 @@ +[package] +name = "heartbeat-scheduler" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = 
"../../../../crates/temper-wasm-sdk" } +wasm-helpers = { path = "../wasm-helpers" } diff --git a/os-apps/temper-agent/wasm/heartbeat_scheduler/src/lib.rs b/os-apps/temper-agent/wasm/heartbeat_scheduler/src/lib.rs new file mode 100644 index 00000000..2725678e --- /dev/null +++ b/os-apps/temper-agent/wasm/heartbeat_scheduler/src/lib.rs @@ -0,0 +1,48 @@ +use temper_wasm_sdk::prelude::*; +use wasm_helpers::resolve_temper_api_url; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + let fields = ctx.entity_state.get("fields").cloned().unwrap_or_else(|| json!({})); + let interval_seconds = fields + .get("scan_interval_seconds") + .and_then(|v| v.as_str()) + .and_then(|v| v.parse::().ok()) + .unwrap_or(30) + .clamp(1, 300); + let base_url = resolve_temper_api_url(&ctx, &fields); + let headers = vec![ + ("x-tenant-id".to_string(), ctx.tenant.clone()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ("accept".to_string(), "application/json".to_string()), + ("content-type".to_string(), "application/json".to_string()), + ]; + + let wait_url = format!( + "{base_url}/observe/entities/{}/{}/wait?statuses=__never__&timeout_ms={}&poll_ms=250", + ctx.entity_type, + ctx.entity_id, + interval_seconds * 1000 + ); + let _ = ctx.http_call("GET", &wait_url, &headers, "")?; + + let action_url = format!( + "{base_url}/tdata/HeartbeatMonitors('{}')/Temper.Agent.HeartbeatMonitor.ScheduledScan", + ctx.entity_id + ); + let _ = ctx.http_call("POST", &action_url, &headers, "{}")?; + + set_success_result("ScheduleFailed", &json!({ + "error_message": "", + })); + Ok(()) + })(); + + if let Err(error) = result { + set_error_result(&error); + } + 0 +} + diff --git a/os-apps/temper-agent/wasm/llm_caller/Cargo.lock b/os-apps/temper-agent/wasm/llm_caller/Cargo.lock index 14e1d9bc..3a5c0de8 100644 --- a/os-apps/temper-agent/wasm/llm_caller/Cargo.lock +++ 
b/os-apps/temper-agent/wasm/llm_caller/Cargo.lock @@ -12,6 +12,7 @@ checksum = "92ecc6618181def0457392ccd0ee51198e065e016d1d527a7ac1b6dc7c1f09d2" name = "llm-caller" version = "0.1.0" dependencies = [ + "session-tree-lib", "temper-wasm-sdk", ] @@ -81,6 +82,13 @@ dependencies = [ "zmij", ] +[[package]] +name = "session-tree-lib" +version = "0.1.0" +dependencies = [ + "serde_json", +] + [[package]] name = "syn" version = "2.0.117" diff --git a/os-apps/temper-agent/wasm/llm_caller/Cargo.toml b/os-apps/temper-agent/wasm/llm_caller/Cargo.toml index eb0e8cff..dbdf5b9f 100644 --- a/os-apps/temper-agent/wasm/llm_caller/Cargo.toml +++ b/os-apps/temper-agent/wasm/llm_caller/Cargo.toml @@ -10,3 +10,4 @@ crate-type = ["cdylib"] [dependencies] temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +session-tree-lib = { path = "../session-tree-lib" } diff --git a/os-apps/temper-agent/wasm/llm_caller/src/lib.rs b/os-apps/temper-agent/wasm/llm_caller/src/lib.rs index e4364dd8..47905f80 100644 --- a/os-apps/temper-agent/wasm/llm_caller/src/lib.rs +++ b/os-apps/temper-agent/wasm/llm_caller/src/lib.rs @@ -16,6 +16,7 @@ //! Build: `cargo build --target wasm32-unknown-unknown --release` use temper_wasm_sdk::prelude::*; +use session_tree_lib::SessionTree; /// Entry point — NOT using `temper_module!` because we need dynamic callback actions. 
#[unsafe(no_mangle)] @@ -132,6 +133,32 @@ anthropic_api_key (or api_key) for anthropic, openrouter_api_key (or api_key) fo let temper_api_url = resolve_temper_api_url(&ctx, &fields); let tenant = &ctx.tenant; + // Session tree fields (Pi architecture) + let session_file_id = fields + .get("session_file_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + let session_leaf_id = fields + .get("session_leaf_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + + // Soul and steering fields + let soul_id = fields + .get("soul_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + let max_follow_ups: i64 = fields + .get("max_follow_ups") + .and_then(|v| v.as_str()) + .and_then(|s| s.parse().ok()) + .unwrap_or(5); + let reserve_tokens: usize = fields + .get("reserve_tokens") + .and_then(|v| v.as_str()) + .and_then(|s| s.parse().ok()) + .unwrap_or(20000); + // Read conversation — from TemperFS if file_id set, else inline state. // First turn uses `user_message` (the actual user task from Provision). // `system_prompt` is always sent as the Anthropic system parameter, never as a message. @@ -139,40 +166,107 @@ anthropic_api_key (or api_key) for anthropic, openrouter_api_key (or api_key) fo return Err("user_message is empty — nothing to send to the LLM".to_string()); } let first_turn_content = user_message; - let mut messages: Vec<Value> = if !conversation_file_id.is_empty() { - read_conversation_from_temperfs( - &ctx, - &temper_api_url, - tenant, - conversation_file_id, - first_turn_content, - )?
+ + // Determine which session storage to use + let use_session_tree = !session_file_id.is_empty() && !session_leaf_id.is_empty(); + + let (mut messages, mut session_tree) = if use_session_tree { + let session_jsonl = read_session_from_temperfs(&ctx, &temper_api_url, tenant, session_file_id)?; + if session_jsonl.is_empty() { + // First turn — tree was just created by sandbox_provisioner but empty + let tree = SessionTree::from_jsonl(&session_jsonl); + let msgs = vec![json!({ "role": "user", "content": first_turn_content })]; + (msgs, Some(tree)) + } else { + let tree = SessionTree::from_jsonl(&session_jsonl); + let msgs = tree.build_context(session_leaf_id); + if msgs.is_empty() { + (vec![json!({ "role": "user", "content": first_turn_content })], Some(tree)) + } else { + (msgs, Some(tree)) + } + } + } else if !conversation_file_id.is_empty() { + // Legacy flat JSON mode + let msgs = read_conversation_from_temperfs( + &ctx, &temper_api_url, tenant, conversation_file_id, first_turn_content, + )?; + (msgs, None) } else { - let conversation_json = fields - .get("conversation") - .and_then(|v| v.as_str()) - .unwrap_or(""); + // Inline state + let conversation_json = fields.get("conversation").and_then(|v| v.as_str()).unwrap_or(""); if conversation_json.is_empty() { - vec![json!({ "role": "user", "content": first_turn_content })] + (vec![json!({ "role": "user", "content": first_turn_content })], None) } else { - serde_json::from_str(conversation_json).unwrap_or_else(|_| { + (serde_json::from_str(conversation_json).unwrap_or_else(|_| { vec![json!({ "role": "user", "content": first_turn_content })] - }) + }), None) } }; // Build tool definitions based on tools_enabled let tools = build_tool_definitions(tools_enabled, sandbox_url, workdir); + // Check compaction threshold (Pi architecture) + if use_session_tree { + if let Some(ref tree) = session_tree { + let context_tokens = tree.estimate_tokens(session_leaf_id); + // Model context windows (approximate) + let 
context_window: usize = if model.contains("opus") { 200000 } + else if model.contains("haiku") { 200000 } + else { 200000 }; // sonnet default + if context_tokens > context_window.saturating_sub(reserve_tokens) { + ctx.log("info", &format!( + "llm_caller: context_tokens ({}) exceeds threshold ({}), triggering compaction", + context_tokens, context_window.saturating_sub(reserve_tokens) + )); + set_success_result("NeedsCompaction", &json!({ + "context_tokens": context_tokens, + "session_leaf_id": session_leaf_id, + })); + return Ok(()); + } + } + } + + // System prompt assembly (Pi architecture): + // 1. Soul content (from AgentSoul entity via TemperFS) + // 2. system_prompt override (from Configure action) + // 3. Available skills XML block + // 4. Memory context + let assembled_system_prompt = assemble_system_prompt( + &ctx, &temper_api_url, tenant, soul_id, system_prompt, + )?; + + emit_progress_ignore( + &ctx, + json!({ + "kind": "prompt_assembled", + "message": "system prompt assembled", + "system_prompt": assembled_system_prompt, + }), + ); + let mock_hang = provider == "mock" && mock_plan_requests_hang(&messages); + if !mock_hang { + let _ = send_heartbeat(&ctx, &temper_api_url, tenant); + } + emit_progress_ignore( + &ctx, + json!({ + "kind": "llm_request_started", + "message": format!("calling provider={provider} model={model}"), + }), + ); + // Call LLM API let response = match provider.as_str() { - "mock" => call_mock(&ctx, &messages)?, + "mock" => call_mock(&ctx, &messages, &assembled_system_prompt, &tools)?, "anthropic" => call_anthropic( &ctx, &api_key, &anthropic_api_url, model, - system_prompt, + &assembled_system_prompt, &messages, &tools, &anthropic_auth_mode, @@ -182,7 +276,7 @@ anthropic_api_key (or api_key) for anthropic, openrouter_api_key (or api_key) fo &api_key, &openrouter_api_url, model, - system_prompt, + &assembled_system_prompt, &messages, &tools, &openrouter_site_url, @@ -198,6 +292,14 @@ anthropic_api_key (or api_key) for anthropic, 
openrouter_api_key (or api_key) fo response.stop_reason ), ); + emit_progress_ignore( + &ctx, + json!({ + "kind": "llm_response", + "message": format!("provider returned stop_reason={}", response.stop_reason), + "stop_reason": response.stop_reason.clone(), + }), + ); // Append assistant response to conversation messages.push(json!({ @@ -238,19 +340,36 @@ anthropic_api_key (or api_key) for anthropic, openrouter_api_key (or api_key) fo .cloned() .collect(); + // Update session tree if in tree mode + let new_leaf = if use_session_tree { + if let Some(ref mut tree) = session_tree { + let parent = session_leaf_id; + let (leaf, _) = tree.append_assistant_message( + parent, + &response.content, + response.output_tokens as usize, + ); + let updated_jsonl = tree.to_jsonl(); + write_session_to_temperfs(&ctx, &temper_api_url, tenant, session_file_id, &updated_jsonl)?; + Some(leaf) + } else { None } + } else { None }; + let tool_calls_json = serde_json::to_string(&tool_calls).unwrap_or_default(); let mut params = json!({ "pending_tool_calls": tool_calls_json, "input_tokens": response.input_tokens, "output_tokens": response.output_tokens, }); + if let Some(leaf) = new_leaf { + params["session_leaf_id"] = json!(leaf); + } if let Some(ref conv) = conv_param { params["conversation"] = json!(conv); } set_success_result("ProcessToolCalls", &params); } "end_turn" | "stop" => { - // Extract text result let result_text = response .content .as_array() @@ -266,15 +385,48 @@ anthropic_api_key (or api_key) for anthropic, openrouter_api_key (or api_key) fo .collect::<Vec<_>>() .join("\n"); - let mut params = json!({ - "result": result_text, - "input_tokens": response.input_tokens, - "output_tokens": response.output_tokens, - }); - if let Some(ref conv) = conv_param { - params["conversation"] = json!(conv); + // Update session tree if in tree mode + if use_session_tree { + if let Some(ref mut tree) = session_tree { + let parent = session_leaf_id; + let (new_leaf, _) = tree.append_assistant_message(
parent, + &response.content, + response.output_tokens as usize, + ); + let updated_jsonl = tree.to_jsonl(); + write_session_to_temperfs(&ctx, &temper_api_url, tenant, session_file_id, &updated_jsonl)?; + + // Route through steering check if follow-ups are enabled + if max_follow_ups > 0 { + set_success_result("CheckSteering", &json!({ + "result": result_text, + "session_leaf_id": new_leaf, + "input_tokens": response.input_tokens, + "output_tokens": response.output_tokens, + })); + } else { + let params = json!({ + "result": result_text, + "session_leaf_id": new_leaf, + "input_tokens": response.input_tokens, + "output_tokens": response.output_tokens, + }); + set_success_result("RecordResult", &params); + } + } + } else { + // Legacy mode — direct to RecordResult + let mut params = json!({ + "result": result_text, + "input_tokens": response.input_tokens, + "output_tokens": response.output_tokens, + }); + if let Some(ref conv) = conv_param { + params["conversation"] = json!(conv); + } + set_success_result("RecordResult", &params); } - set_success_result("RecordResult", &params); } other => { set_success_result( @@ -338,31 +490,39 @@ fn resolve_provider_api_key(ctx: &Context, provider: &str) -> Result<String, String> { -fn call_mock(ctx: &Context, messages: &[Value]) -> Result<LlmResponse, String> { +fn call_mock( + ctx: &Context, + messages: &[Value], + assembled_system_prompt: &str, + _tools: &[Value], +) -> Result<LlmResponse, String> { ctx.log("info", "llm_caller: using deterministic mock provider"); - let signal_summary = extract_mock_signal_summary(messages)?; - let analysis = build_mock_analysis(&signal_summary); - let analysis_text = serde_json::to_string_pretty(&analysis) - .map_err(|e| format!("failed to serialize mock analysis: {e}"))?; + if mock_plan_requests_hang(messages) { + simulate_mock_hang(ctx)?; + return Err("mock hang scenario finished without heartbeat".to_string()); + } - Ok(LlmResponse { - content: json!([{ - "type": "text", - "text": analysis_text, - }]), - stop_reason: "end_turn".to_string(), - input_tokens: messages - .iter() - .map(|message| { - message -
.get("content") - .map(stringify_content) - .unwrap_or_default() - .len() as i64 - }) - .sum::(), - output_tokens: analysis_text.len() as i64, - }) + let assistant_turns = messages + .iter() + .filter(|message| message.get("role").and_then(Value::as_str) == Some("assistant")) + .count(); + + if let Some(step) = extract_mock_plan(messages) + .and_then(|steps| steps.get(assistant_turns).cloned()) + { + return build_mock_step_response(messages, assembled_system_prompt, assistant_turns, &step); + } + + let latest_user = latest_user_text(messages); + let text = resolve_mock_template( + latest_user + .as_deref() + .filter(|value| !value.trim().is_empty()) + .unwrap_or("mock provider completed"), + assembled_system_prompt, + latest_user.as_deref().unwrap_or(""), + ); + Ok(mock_text_response(messages, text)) } fn extract_mock_signal_summary(messages: &[Value]) -> Result { @@ -1146,6 +1306,217 @@ fn stringify_content(value: &Value) -> String { } } +fn emit_progress_ignore(ctx: &Context, payload: Value) { + let _ = ctx.emit_progress(&payload); +} + +fn send_heartbeat(ctx: &Context, temper_api_url: &str, tenant: &str) -> Result<(), String> { + let url = format!( + "{temper_api_url}/tdata/TemperAgents('{}')/Temper.Agent.TemperAgent.Heartbeat", + ctx.entity_id + ); + let body = json!({ "last_heartbeat_at": "alive" }); + let headers = vec![ + ("content-type".to_string(), "application/json".to_string()), + ("x-tenant-id".to_string(), tenant.to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ]; + let _ = ctx.http_call("POST", &url, &headers, &body.to_string())?; + Ok(()) +} + +fn mock_plan_requests_hang(messages: &[Value]) -> bool { + if let Some(steps) = extract_mock_plan(messages) + && steps + .iter() + .any(|step| step.get("mode").and_then(Value::as_str) == Some("hang")) + { + return true; + } + latest_user_text(messages) + .map(|text| text.contains("[mock-hang]")) + .unwrap_or(false) +} + +fn simulate_mock_hang(ctx: &Context) -> Result<(), String> 
{ + let base_url = temper_api_url(ctx); + let url = format!( + "{base_url}/observe/entities/{}/{}/wait?statuses=__never__&timeout_ms=10000&poll_ms=250", + ctx.entity_type, ctx.entity_id + ); + let headers = vec![ + ("x-tenant-id".to_string(), ctx.tenant.clone()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ("accept".to_string(), "application/json".to_string()), + ]; + let _ = ctx.http_call("GET", &url, &headers, "")?; + Ok(()) +} + +fn extract_mock_plan(messages: &[Value]) -> Option<Vec<Value>> { + for message in messages { + if message.get("role").and_then(Value::as_str) != Some("user") { + continue; + } + let raw = stringify_content(message.get("content").unwrap_or(&Value::Null)); + let Ok(parsed) = serde_json::from_str::<Value>(&raw) else { + continue; + }; + if let Some(steps) = parsed.get("steps").and_then(Value::as_array) { + return Some(steps.clone()); + } + if let Some(steps) = parsed + .get("mock_plan") + .and_then(|value| value.get("steps")) + .and_then(Value::as_array) + { + return Some(steps.clone()); + } + } + None +} + +fn build_mock_step_response( + messages: &[Value], + assembled_system_prompt: &str, + assistant_turns: usize, + step: &Value, +) -> Result<LlmResponse, String> { + if step.get("mode").and_then(Value::as_str) == Some("hang") { + return Ok(mock_text_response(messages, "mock hang placeholder".to_string())); + } + + let mut content = Vec::<Value>::new(); + if let Some(text) = step.get("text").and_then(Value::as_str) { + let resolved = resolve_mock_template( + text, + assembled_system_prompt, + latest_user_text(messages).as_deref().unwrap_or(""), + ); + if !resolved.is_empty() { + content.push(json!({ "type": "text", "text": resolved })); + } + } + + if let Some(tool_calls) = step.get("tool_calls").and_then(Value::as_array) { + for (index, tool_call) in tool_calls.iter().enumerate() { + let name = tool_call + .get("name") + .and_then(Value::as_str) + .unwrap_or("unknown_tool"); + let input = tool_call.get("input").cloned().unwrap_or_else(|| json!({})); + let id =
tool_call + .get("id") + .and_then(Value::as_str) + .map(str::to_string) + .unwrap_or_else(|| format!("mock-tool-{assistant_turns}-{index}")); + content.push(json!({ + "type": "tool_use", + "id": id, + "name": name, + "input": input, + })); + } + } + + if content + .iter() + .any(|block| block.get("type").and_then(Value::as_str) == Some("tool_use")) + { + let output_len = serde_json::to_string(&content).unwrap_or_default().len() as i64; + return Ok(LlmResponse { + content: Value::Array(content), + stop_reason: "tool_use".to_string(), + input_tokens: estimate_message_tokens(messages), + output_tokens: output_len, + }); + } + + let final_text = step + .get("final_text") + .or_else(|| step.get("text")) + .and_then(Value::as_str) + .unwrap_or("mock provider completed"); + Ok(mock_text_response( + messages, + resolve_mock_template( + final_text, + assembled_system_prompt, + latest_user_text(messages).as_deref().unwrap_or(""), + ), + )) +} + +fn mock_text_response(messages: &[Value], text: String) -> LlmResponse { + LlmResponse { + content: json!([{ "type": "text", "text": text.clone() }]), + stop_reason: "end_turn".to_string(), + input_tokens: estimate_message_tokens(messages), + output_tokens: text.len() as i64, + } +} + +fn estimate_message_tokens(messages: &[Value]) -> i64 { + messages + .iter() + .map(|message| { + message + .get("content") + .map(stringify_content) + .unwrap_or_default() + .len() as i64 + }) + .sum::<i64>() +} + +fn latest_user_text(messages: &[Value]) -> Option<String> { + messages + .iter() + .rev() + .find(|message| message.get("role").and_then(Value::as_str) == Some("user")) + .map(|message| stringify_content(message.get("content").unwrap_or(&Value::Null))) +} + +fn resolve_mock_template(template: &str, assembled_system_prompt: &str, latest_user: &str) -> String { + let mut text = template.to_string(); + text = text.replace("{{latest_user}}", latest_user); + text = text.replace("{{memory_block}}", &extract_tag_block(assembled_system_prompt,
"agent_memory")); + text = text.replace( + "{{memory_keys}}", + &extract_memory_keys(assembled_system_prompt).join(", "), + ); + text = text.replace( + "{{memory_count}}", + &extract_memory_keys(assembled_system_prompt).len().to_string(), + ); + text = text.replace("{{skills_block}}", &extract_tag_block(assembled_system_prompt, "available_skills")); + text +} + +fn extract_tag_block(text: &str, tag: &str) -> String { + let start_tag = format!("<{tag}>"); + let end_tag = format!(""); + let Some(start) = text.find(&start_tag) else { + return String::new(); + }; + let Some(end) = text[start..].find(&end_tag) else { + return String::new(); + }; + text[start..start + end + end_tag.len()].to_string() +} + +fn extract_memory_keys(text: &str) -> Vec { + text.lines() + .filter_map(|line| { + let marker = "key=\""; + let start = line.find(marker)? + marker.len(); + let rest = &line[start..]; + let end = rest.find('"')?; + Some(rest[..end].to_string()) + }) + .collect() +} + fn convert_messages_to_openrouter(messages: &[Value]) -> Vec { let mut out = Vec::::new(); for msg in messages { @@ -1342,6 +1713,120 @@ fn build_tool_definitions(tools_enabled: &str, sandbox_url: &str, workdir: &str) })); } + if enabled.contains(&"read_entity") { + tools.push(json!({ + "name": "read_entity", + "description": "Read a TemperFS-backed entity content file by file_id.", + "input_schema": { + "type": "object", + "properties": { + "file_id": { "type": "string", "description": "TemperFS File entity ID" } + }, + "required": ["file_id"] + } + })); + } + + if enabled.contains(&"save_memory") { + tools.push(json!({ + "name": "save_memory", + "description": "Persist a memory entry scoped to the agent soul.", + "input_schema": { + "type": "object", + "properties": { + "key": { "type": "string" }, + "content": { "type": "string" }, + "memory_type": { "type": "string" } + }, + "required": ["key", "content"] + } + })); + } + + if enabled.contains(&"recall_memory") { + tools.push(json!({ + "name": 
"recall_memory", + "description": "Recall memories matching a key or content substring.", + "input_schema": { + "type": "object", + "properties": { + "query": { "type": "string" } + }, + "required": ["query"] + } + })); + } + + if enabled.contains(&"spawn_agent") { + tools.push(json!({ + "name": "spawn_agent", + "description": "Create, configure, and provision a child TemperAgent.", + "input_schema": { + "type": "object", + "properties": { + "agent_id": { "type": "string" }, + "task": { "type": "string" }, + "model": { "type": "string" }, + "provider": { "type": "string" }, + "max_turns": { "type": "integer" }, + "tools": { "type": "string" }, + "soul_id": { "type": "string" }, + "background": { "type": "boolean" } + }, + "required": ["task"] + } + })); + tools.push(json!({ + "name": "list_agents", + "description": "List child agents spawned by this agent.", + "input_schema": { + "type": "object", + "properties": {}, + "required": [] + } + })); + tools.push(json!({ + "name": "abort_agent", + "description": "Cancel a child agent by ID.", + "input_schema": { + "type": "object", + "properties": { + "agent_id": { "type": "string" } + }, + "required": ["agent_id"] + } + })); + tools.push(json!({ + "name": "steer_agent", + "description": "Queue a steering message for a child agent.", + "input_schema": { + "type": "object", + "properties": { + "agent_id": { "type": "string" }, + "message": { "type": "string" } + }, + "required": ["agent_id", "message"] + } + })); + } + + if enabled.contains(&"run_coding_agent") { + tools.push(json!({ + "name": "run_coding_agent", + "description": "Run a coding agent CLI command inside the sandbox.", + "input_schema": { + "type": "object", + "properties": { + "agent_type": { "type": "string" }, + "task": { "type": "string" }, + "workdir": { "type": "string" }, + "background": { "type": "boolean" } + }, + "required": ["agent_type", "task"] + } + })); + } + if enabled.contains(&"logfire_query") { tools.push(json!({ "name": "logfire_query", 
@@ -1368,6 +1853,129 @@ fn build_tool_definitions(tools_enabled: &str, sandbox_url: &str, workdir: &str) })); } + if enabled.contains(&"save_memory") { + tools.push(json!({ + "name": "save_memory", + "description": "Save a memory for future agent sessions. Memories persist across runs.", + "input_schema": { + "type": "object", + "properties": { + "key": { "type": "string", "description": "Unique key for this memory" }, + "content": { "type": "string", "description": "Memory content (markdown)" }, + "memory_type": { "type": "string", "enum": ["user", "feedback", "project", "reference"], "description": "Type of memory" } + }, + "required": ["key", "content", "memory_type"] + } + })); + } + + if enabled.contains(&"recall_memory") { + tools.push(json!({ + "name": "recall_memory", + "description": "Search and recall memories from previous sessions.", + "input_schema": { + "type": "object", + "properties": { + "query": { "type": "string", "description": "Search query to find relevant memories" } + }, + "required": ["query"] + } + })); + } + + if enabled.contains(&"spawn_agent") { + tools.push(json!({ + "name": "spawn_agent", + "description": "Spawn a child TemperAgent to handle a subtask. 
The child runs autonomously and returns its result.", + "input_schema": { + "type": "object", + "properties": { + "agent_id": { "type": "string", "description": "Optional deterministic child agent ID" }, + "task": { "type": "string", "description": "The task for the child agent" }, + "model": { "type": "string", "description": "LLM model to use (optional, defaults to parent's model)" }, + "provider": { "type": "string", "description": "LLM provider to use (optional, defaults to parent's provider)" }, + "max_turns": { "type": "integer", "description": "Maximum turns for the child (optional, default 20)" }, + "tools": { "type": "string", "description": "Comma-separated tools to enable (optional, defaults to parent's tools)" }, + "soul_id": { "type": "string", "description": "Soul ID to use (optional, defaults to parent's soul)" }, + "background": { "type": "boolean", "description": "If true, return after provisioning without waiting for completion" } + }, + "required": ["task"] + } + })); + } + + if enabled.contains(&"list_agents") { + tools.push(json!({ + "name": "list_agents", + "description": "List child agents spawned by this agent and their status.", + "input_schema": { + "type": "object", + "properties": {}, + "required": [] + } + })); + } + + if enabled.contains(&"steer_agent") { + tools.push(json!({ + "name": "steer_agent", + "description": "Send a follow-up message to a child agent mid-run.", + "input_schema": { + "type": "object", + "properties": { + "agent_id": { "type": "string", "description": "The child agent entity ID" }, + "message": { "type": "string", "description": "The steering message to inject" } + }, + "required": ["agent_id", "message"] + } + })); + } + + if enabled.contains(&"abort_agent") { + tools.push(json!({ + "name": "abort_agent", + "description": "Cancel a running child agent.", + "input_schema": { + "type": "object", + "properties": { + "agent_id": { "type": "string", "description": "The child agent entity ID to cancel" } + }, + 
"required": ["agent_id"] + } + })); + } + + if enabled.contains(&"read_entity") { + tools.push(json!({ + "name": "read_entity", + "description": "Read a TemperFS file by ID. Use this to load skill content, soul documents, or any other entity-backed file.", + "input_schema": { + "type": "object", + "properties": { + "file_id": { "type": "string", "description": "The TemperFS File entity ID to read" } + }, + "required": ["file_id"] + } + })); + } + + if enabled.contains(&"run_coding_agent") { + tools.push(json!({ + "name": "run_coding_agent", + "description": "Spawn a coding agent CLI process (Claude Code, Codex, Pi, OpenCode) in the sandbox.", + "input_schema": { + "type": "object", + "properties": { + "agent_type": { "type": "string", "enum": ["claude-code", "codex", "pi", "opencode"], "description": "Which coding agent CLI to use" }, + "task": { "type": "string", "description": "The task for the coding agent" }, + "workdir": { "type": "string", "description": "Working directory in the sandbox (optional)" }, + "background": { "type": "boolean", "description": "Run in background (default: false)" } + }, + "required": ["agent_type", "task"] + } + })); + } + tools } @@ -1382,7 +1990,7 @@ fn read_conversation_from_temperfs( let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value"); let headers = vec![ ("x-tenant-id".to_string(), tenant.to_string()), - ("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ("accept".to_string(), "application/json".to_string()), ]; @@ -1441,7 +2049,7 @@ fn write_conversation_to_temperfs( let headers = vec![ ("content-type".to_string(), "application/json".to_string()), ("x-tenant-id".to_string(), tenant.to_string()), - ("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ]; // Wrap messages array in the TemperFS conversation format @@ -1466,6 +2074,208 @@ fn 
write_conversation_to_temperfs(
     }
 }
 
+/// Read session JSONL from TemperFS.
+fn read_session_from_temperfs(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    file_id: &str,
+) -> Result<String, String> {
+    let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value");
+    let headers = vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+    ];
+    let resp = ctx.http_call("GET", &url, &headers, "")?;
+    if resp.status == 200 {
+        Ok(resp.body)
+    } else if resp.status == 404 {
+        Ok(String::new())
+    } else {
+        Err(format!("TemperFS session read failed (HTTP {})", resp.status))
+    }
+}
+
+/// Write session JSONL to TemperFS.
+fn write_session_to_temperfs(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    file_id: &str,
+    jsonl: &str,
+) -> Result<(), String> {
+    let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value");
+    let headers = vec![
+        ("content-type".to_string(), "text/plain".to_string()),
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+    ];
+    let resp = ctx.http_call("PUT", &url, &headers, jsonl)?;
+    if resp.status >= 200 && resp.status < 300 {
+        Ok(())
+    } else {
+        Err(format!("TemperFS session write failed (HTTP {})", resp.status))
+    }
+}
+
+/// Assemble the full system prompt from soul + override + skills + memory.
+fn assemble_system_prompt(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    soul_id: &str,
+    system_prompt_override: &str,
+) -> Result<String, String> {
+    let mut parts: Vec<String> = Vec::new();
+
+    // 1. Soul content
+    if !soul_id.is_empty() {
+        match load_soul_content(ctx, temper_api_url, tenant, soul_id) {
+            Ok(content) if !content.is_empty() => parts.push(content),
+            Ok(_) => ctx.log("warn", "assemble_system_prompt: soul content is empty"),
+            Err(e) => ctx.log("warn", &format!("assemble_system_prompt: failed to load soul: {e}")),
+        }
+    }
+
+    // 2.
System prompt override
+    if !system_prompt_override.is_empty() {
+        parts.push(system_prompt_override.to_string());
+    }
+
+    // 3. Available skills
+    if !soul_id.is_empty() {
+        match load_skills_block(ctx, temper_api_url, tenant) {
+            Ok(block) if !block.is_empty() => parts.push(block),
+            Ok(_) => {}
+            Err(e) => ctx.log("warn", &format!("assemble_system_prompt: failed to load skills: {e}")),
+        }
+    }
+
+    // 4. Memory context
+    if !soul_id.is_empty() {
+        match load_memory_block(ctx, temper_api_url, tenant, soul_id) {
+            Ok(block) if !block.is_empty() => parts.push(block),
+            Ok(_) => {}
+            Err(e) => ctx.log("warn", &format!("assemble_system_prompt: failed to load memory: {e}")),
+        }
+    }
+
+    // Fall back to bare system_prompt if nothing loaded
+    if parts.is_empty() {
+        return Ok(system_prompt_override.to_string());
+    }
+
+    Ok(parts.join("\n\n"))
+}
+
+/// Load soul content from AgentSoul entity.
+fn load_soul_content(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    soul_id: &str,
+) -> Result<String, String> {
+    let url = format!("{temper_api_url}/tdata/AgentSouls('{soul_id}')");
+    let headers = vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+        ("accept".to_string(), "application/json".to_string()),
+    ];
+    let resp = ctx.http_call("GET", &url, &headers, "")?;
+    if resp.status != 200 {
+        return Err(format!("soul read failed (HTTP {})", resp.status));
+    }
+    let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({}));
+    let content_file_id = entity_field_str(&parsed, &["ContentFileId"]).unwrap_or("");
+    if content_file_id.is_empty() {
+        return Ok(String::new());
+    }
+    // Read from TemperFS
+    let file_url = format!("{temper_api_url}/tdata/Files('{content_file_id}')/$value");
+    let resp2 = ctx.http_call("GET", &file_url, &headers, "")?;
+    if resp2.status == 200 { Ok(resp2.body) } else { Ok(String::new()) }
+}
+
+/// Load active skills as an XML block for the system prompt.
+fn load_skills_block(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+) -> Result<String, String> {
+    let url = format!("{temper_api_url}/tdata/AgentSkills?$filter=Status eq 'Active'");
+    let headers = vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+        ("accept".to_string(), "application/json".to_string()),
+    ];
+    let resp = ctx.http_call("GET", &url, &headers, "")?;
+    if resp.status != 200 {
+        return Ok(String::new());
+    }
+    let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({}));
+    let skills = parsed.get("value").and_then(|v| v.as_array()).cloned().unwrap_or_default();
+    if skills.is_empty() {
+        return Ok(String::new());
+    }
+    let mut xml = String::from("<available_skills>\n");
+    for skill in &skills {
+        let name = entity_field_str(skill, &["Name"]).unwrap_or("unknown");
+        let desc = entity_field_str(skill, &["Description"]).unwrap_or("");
+        let file_id = entity_field_str(skill, &["ContentFileId"]).unwrap_or("");
+        xml.push_str(&format!("  <skill name=\"{name}\" description=\"{desc}\" file_id=\"{file_id}\"/>\n"));
+    }
+    xml.push_str("</available_skills>");
+    Ok(xml)
+}
+
+/// Load agent memories as a context block for the system prompt.
+fn load_memory_block(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    soul_id: &str,
+) -> Result<String, String> {
+    let url = format!(
+        "{temper_api_url}/tdata/AgentMemorys?$filter=SoulId eq '{}' and Status eq 'Active'",
+        soul_id
+    );
+    let headers = vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+        ("accept".to_string(), "application/json".to_string()),
+    ];
+    let resp = ctx.http_call("GET", &url, &headers, "")?;
+    if resp.status != 200 {
+        return Ok(String::new());
+    }
+    let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({}));
+    let memories = parsed.get("value").and_then(|v| v.as_array()).cloned().unwrap_or_default();
+    if memories.is_empty() {
+        return Ok(String::new());
+    }
+    let mut block = String::from("<agent_memory>\n");
+    for mem in &memories {
+        let key = entity_field_str(mem, &["Key"]).unwrap_or("unknown");
+        let content = entity_field_str(mem, &["Content"]).unwrap_or("");
+        let mem_type = entity_field_str(mem, &["MemoryType"]).unwrap_or("reference");
+        block.push_str(&format!("  <memory key=\"{key}\" type=\"{mem_type}\">\n    {content}\n  </memory>\n"));
+    }
+    block.push_str("</agent_memory>");
+    Ok(block)
+}
+
+fn direct_field_str<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> {
+    keys.iter()
+        .find_map(|key| value.get(*key).and_then(Value::as_str))
+}
+
+fn entity_field_str<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> {
+    direct_field_str(value, keys).or_else(|| {
+        value.get("fields")
+            .and_then(|fields| direct_field_str(fields, keys))
+    })
+}
+
 fn resolve_temper_api_url(ctx: &Context, fields: &Value) -> String {
     fields
         .get("temper_api_url")
diff --git a/os-apps/temper-agent/wasm/sandbox_provisioner/src/lib.rs b/os-apps/temper-agent/wasm/sandbox_provisioner/src/lib.rs
index 9f820b55..6dcfe68b 100644
--- a/os-apps/temper-agent/wasm/sandbox_provisioner/src/lib.rs
+++ b/os-apps/temper-agent/wasm/sandbox_provisioner/src/lib.rs
@@ -53,12 +53,30 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 {
     let tenant =
&ctx.tenant; - let (workspace_id, conversation_file_id, file_manifest_id) = - create_conversation_storage(&ctx, &temper_api_url, tenant, entity_id).map_err(|e| { - format!( - "TemperFS bootstrap failed at {temper_api_url}/tdata (tenant={tenant}, agent={entity_id}): {e}. Ensure os-app 'temper-fs' is installed for this tenant and temper_api_url is correct." + let fs_result = + create_conversation_storage(&ctx, &temper_api_url, tenant, entity_id, user_message); + + let (workspace_id, conversation_file_id, file_manifest_id, session_file_id, session_leaf_id) = + match fs_result { + Ok((ws, conv, manifest, session_file_id, session_leaf_id)) => { + (ws, conv, manifest, session_file_id, session_leaf_id) + } + Err(e) => { + ctx.log( + "warn", + &format!( + "sandbox_provisioner: TemperFS bootstrap failed at {temper_api_url}/tdata (tenant={tenant}, agent={entity_id}): {e}. Ensure os-app 'temper-fs' is installed for this tenant and temper_api_url is correct. Falling back to inline." + ), + ); + ( + String::new(), + String::new(), + String::new(), + String::new(), + String::new(), ) - })?; + } + }; // Return sandbox + TemperFS details to the state machine set_success_result( @@ -69,6 +87,8 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { "workspace_id": workspace_id, "conversation_file_id": conversation_file_id, "file_manifest_id": file_manifest_id, + "session_file_id": session_file_id, + "session_leaf_id": session_leaf_id, }), ); @@ -221,18 +241,19 @@ fn provision_sandbox(ctx: &Context) -> Result { }) } -/// Create a TemperFS Workspace, conversation File, and manifest File. -/// Returns (workspace_entity_id, conversation_file_id, manifest_file_id). +/// Create a TemperFS Workspace, conversation File, manifest File, and session file. +/// Returns (workspace_entity_id, conversation_file_id, manifest_file_id, session_file_id, session_leaf_id). 
fn create_conversation_storage( ctx: &Context, temper_api_url: &str, tenant: &str, agent_id: &str, -) -> Result<(String, String, String), String> { + user_message: &str, +) -> Result<(String, String, String, String, String), String> { let headers = vec![ ("content-type".to_string(), "application/json".to_string()), ("x-tenant-id".to_string(), tenant.to_string()), - ("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ]; // 1. Create Workspace @@ -306,7 +327,7 @@ fn create_conversation_storage( let value_headers = vec![ ("content-type".to_string(), "application/json".to_string()), ("x-tenant-id".to_string(), tenant.to_string()), - ("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ]; let value_resp = ctx.http_call("PUT", &value_url, &value_headers, &init_conv)?; @@ -368,5 +389,121 @@ fn create_conversation_storage( ); } - Ok((workspace_id, file_id, manifest_id)) + let (session_file_id, session_leaf_id) = + create_session_tree(ctx, temper_api_url, tenant, &workspace_id, agent_id, user_message); + + Ok(( + workspace_id, + file_id, + manifest_id, + session_file_id, + session_leaf_id, + )) +} + +/// Create a session tree JSONL file in TemperFS. +/// Returns (session_file_id, session_leaf_id). Non-fatal on failure. 
+fn create_session_tree( + ctx: &Context, + temper_api_url: &str, + tenant: &str, + workspace_id: &str, + agent_id: &str, + user_message: &str, +) -> (String, String) { + let headers = vec![ + ("content-type".to_string(), "application/json".to_string()), + ("x-tenant-id".to_string(), tenant.to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ]; + + // Create session JSONL file in TemperFS + let session_file_body = json!({ + "FileId": format!("session-{agent_id}"), + "workspace_id": workspace_id, + "name": "session.jsonl", + "mime_type": "text/plain", + "path": "/session.jsonl" + }); + let session_file_resp = match ctx.http_call( + "POST", + &format!("{temper_api_url}/tdata/Files"), + &headers, + &serde_json::to_string(&session_file_body).unwrap_or_default(), + ) { + Ok(resp) => resp, + Err(e) => { + ctx.log("warn", &format!("Failed to create session file: {e}")); + return (String::new(), String::new()); + } + }; + + let session_file_id = if session_file_resp.status >= 200 && session_file_resp.status < 300 { + let parsed: Value = + serde_json::from_str(&session_file_resp.body).unwrap_or(json!({})); + parsed + .get("entity_id") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string() + } else { + ctx.log( + "warn", + &format!( + "Failed to create session file (HTTP {})", + session_file_resp.status + ), + ); + return (String::new(), String::new()); + }; + + if session_file_id.is_empty() { + return (String::new(), String::new()); + } + + // Initialize session file with JSONL header + first user message + let header_id = format!("h-{agent_id}"); + let header_entry = json!({ + "id": header_id, + "parentId": null, + "type": "header", + "version": 1, + "tokens": 0 + }); + let header_line = serde_json::to_string(&header_entry).unwrap_or_default(); + + let session_leaf_id = format!("u-{agent_id}-0"); + let user_entry = json!({ + "id": session_leaf_id, + "parentId": header_id, + "type": "message", + "role": "user", + "content": user_message, + 
"tokens": user_message.len() / 4 + }); + let user_line = serde_json::to_string(&user_entry).unwrap_or_default(); + let initial_jsonl = format!("{header_line}\n{user_line}"); + + let write_url = format!("{temper_api_url}/tdata/Files('{session_file_id}')/$value"); + let write_headers = vec![ + ("content-type".to_string(), "text/plain".to_string()), + ("x-tenant-id".to_string(), tenant.to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ]; + match ctx.http_call("PUT", &write_url, &write_headers, &initial_jsonl) { + Ok(resp) if resp.status >= 200 && resp.status < 300 => { + ctx.log("info", "sandbox_provisioner: session tree initialized"); + } + Ok(resp) => { + ctx.log( + "warn", + &format!("Failed to write session file (HTTP {})", resp.status), + ); + } + Err(e) => { + ctx.log("warn", &format!("Failed to write session file: {e}")); + } + } + + (session_file_id, session_leaf_id) } diff --git a/os-apps/temper-agent/wasm/session-tree-lib/Cargo.lock b/os-apps/temper-agent/wasm/session-tree-lib/Cargo.lock new file mode 100644 index 00000000..2d459393 --- /dev/null +++ b/os-apps/temper-agent/wasm/session-tree-lib/Cargo.lock @@ -0,0 +1,105 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. 
+version = 4 + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "session-tree-lib" +version = "0.1.0" +dependencies = [ + "serde_json", +] + 
+[[package]]
+name = "syn"
+version = "2.0.117"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99"
+dependencies = [
+ "proc-macro2",
+ "quote",
+ "unicode-ident",
+]
+
+[[package]]
+name = "unicode-ident"
+version = "1.0.24"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75"
+
+[[package]]
+name = "zmij"
+version = "1.0.21"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa"
diff --git a/os-apps/temper-agent/wasm/session-tree-lib/Cargo.toml b/os-apps/temper-agent/wasm/session-tree-lib/Cargo.toml
new file mode 100644
index 00000000..0f33fe77
--- /dev/null
+++ b/os-apps/temper-agent/wasm/session-tree-lib/Cargo.toml
@@ -0,0 +1,12 @@
+[package]
+name = "session-tree-lib"
+version = "0.1.0"
+edition = "2024"
+
+[lib]
+crate-type = ["rlib"]
+
+[workspace]
+
+[dependencies]
+serde_json = "1"
diff --git a/os-apps/temper-agent/wasm/session-tree-lib/src/lib.rs b/os-apps/temper-agent/wasm/session-tree-lib/src/lib.rs
new file mode 100644
index 00000000..4c814343
--- /dev/null
+++ b/os-apps/temper-agent/wasm/session-tree-lib/src/lib.rs
@@ -0,0 +1,484 @@
+//! Session Tree Library — shared JSONL tree operations for TemperAgent WASM modules.
+//!
+//! Provides append-only tree-structured conversation storage with branching,
+//! compaction support, and leaf-to-root context assembly.
+//!
+//! Storage format: JSONL (one JSON object per line) with tree structure via id/parentId.
+
+use std::collections::BTreeMap;
+use serde_json::{Value, json};
+
+/// A single entry in the session tree.
+#[derive(Debug, Clone)]
+pub struct SessionEntry {
+    pub id: String,
+    pub parent_id: Option<String>,
+    pub entry_type: EntryType,
+    pub data: Value,
+    pub tokens: usize,
+}
+
+/// Type of session tree entry.
+#[derive(Debug, Clone, PartialEq)]
+pub enum EntryType {
+    /// Session header with metadata.
+    Header,
+    /// A conversation message (user, assistant, or tool_result).
+    Message,
+    /// A compaction summary replacing older messages.
+    Compaction,
+    /// A steering injection point.
+    Steering,
+}
+
+impl EntryType {
+    pub fn as_str(&self) -> &str {
+        match self {
+            EntryType::Header => "header",
+            EntryType::Message => "message",
+            EntryType::Compaction => "compaction",
+            EntryType::Steering => "steering",
+        }
+    }
+
+    pub fn from_str(s: &str) -> Self {
+        match s {
+            "header" => EntryType::Header,
+            "message" => EntryType::Message,
+            "compaction" => EntryType::Compaction,
+            "steering" => EntryType::Steering,
+            _ => EntryType::Message,
+        }
+    }
+}
+
+/// The session tree — an append-only tree of conversation entries.
+pub struct SessionTree {
+    entries: BTreeMap<String, SessionEntry>,
+    /// Ordered list of entry IDs (insertion order).
+    order: Vec<String>,
+    /// Raw JSONL lines for serialization.
+    raw_lines: Vec<String>,
+}
+
+impl SessionTree {
+    /// Parse a JSONL string into a SessionTree.
+    pub fn from_jsonl(data: &str) -> Self {
+        let mut entries = BTreeMap::new();
+        let mut order = Vec::new();
+        let mut raw_lines = Vec::new();
+
+        for line in data.lines() {
+            let line = line.trim();
+            if line.is_empty() {
+                continue;
+            }
+            raw_lines.push(line.to_string());
+
+            if let Ok(val) = serde_json::from_str::<Value>(line) {
+                let id = val.get("id").and_then(|v| v.as_str()).unwrap_or("").to_string();
+                let parent_id = val.get("parentId").and_then(|v| v.as_str()).map(|s| s.to_string());
+                let entry_type = val.get("type").and_then(|v| v.as_str()).map(EntryType::from_str).unwrap_or(EntryType::Message);
+                let tokens = val.get("tokens").and_then(|v| v.as_u64()).unwrap_or(0) as usize;
+
+                if !id.is_empty() {
+                    let entry = SessionEntry {
+                        id: id.clone(),
+                        parent_id,
+                        entry_type,
+                        data: val,
+                        tokens,
+                    };
+                    order.push(id.clone());
+                    entries.insert(id, entry);
+                }
+            }
+        }
+
+        SessionTree { entries, order, raw_lines }
+    }
+
+    /// Create an empty session tree with a header entry.
+    pub fn new(session_id: &str) -> Self {
+        let header = json!({
+            "id": format!("h-{session_id}"),
+            "parentId": null,
+            "type": "header",
+            "version": 1,
+            "created": "",
+            "tokens": 0
+        });
+        let header_line = serde_json::to_string(&header).unwrap_or_default();
+        let id = format!("h-{session_id}");
+
+        let entry = SessionEntry {
+            id: id.clone(),
+            parent_id: None,
+            entry_type: EntryType::Header,
+            data: header,
+            tokens: 0,
+        };
+
+        let mut entries = BTreeMap::new();
+        entries.insert(id.clone(), entry);
+
+        SessionTree {
+            entries,
+            order: vec![id],
+            raw_lines: vec![header_line],
+        }
+    }
+
+    /// Check if the tree is empty (no entries at all).
+    pub fn is_empty(&self) -> bool {
+        self.entries.is_empty()
+    }
+
+    /// Get the number of entries.
+    pub fn len(&self) -> usize {
+        self.entries.len()
+    }
+
+    /// Get an entry by ID.
+    pub fn get(&self, id: &str) -> Option<&SessionEntry> {
+        self.entries.get(id)
+    }
+
+    /// Find the last entry ID (the most recently appended).
+    pub fn last_entry_id(&self) -> Option<&str> {
+        self.order.last().map(|s| s.as_str())
+    }
+
+    /// Build context messages by walking from leaf_id to root.
+    /// Handles compaction entries: when a compaction is encountered,
+    /// it replaces all entries before it with the summary.
+    pub fn build_context(&self, leaf_id: &str) -> Vec<Value> {
+        // Walk from leaf to root collecting entries
+        let mut chain: Vec<&SessionEntry> = Vec::new();
+        let mut current_id = Some(leaf_id.to_string());
+
+        while let Some(id) = current_id {
+            if let Some(entry) = self.entries.get(&id) {
+                chain.push(entry);
+                current_id = entry.parent_id.clone();
+            } else {
+                break;
+            }
+        }
+
+        // Reverse to get root-to-leaf order
+        chain.reverse();
+
+        // Build messages, handling compaction entries
+        let mut messages: Vec<Value> = Vec::new();
+
+        for entry in &chain {
+            match entry.entry_type {
+                EntryType::Header => {
+                    // Skip headers — they're metadata
+                    continue;
+                }
+                EntryType::Compaction => {
+                    // A compaction replaces all prior messages with its summary
+                    messages.clear();
+                    if let Some(summary) = entry.data.get("summary").and_then(|v| v.as_str()) {
+                        messages.push(json!({
+                            "role": "user",
+                            "content": format!("[Previous conversation summary]\n{summary}")
+                        }));
+                    }
+                }
+                EntryType::Message | EntryType::Steering => {
+                    // Extract role and content from the entry
+                    let role = entry.data.get("role").and_then(|v| v.as_str()).unwrap_or("user");
+                    if let Some(content) = entry.data.get("content").cloned() {
+                        messages.push(json!({
+                            "role": role,
+                            "content": content,
+                        }));
+                    }
+                }
+            }
+        }
+
+        messages
+    }
+
+    /// Append a new entry to the tree. Returns the JSONL line for the new entry.
+    /// The entry is added with the given parent_id.
+ pub fn append_entry( + &mut self, + id: &str, + parent_id: Option<&str>, + entry_type: EntryType, + role: Option<&str>, + content: Option<&Value>, + tokens: usize, + extra_fields: Option<&Value>, + ) -> String { + let mut data = json!({ + "id": id, + "parentId": parent_id, + "type": entry_type.as_str(), + "tokens": tokens, + }); + + if let Some(role) = role { + data["role"] = json!(role); + } + if let Some(content) = content { + data["content"] = content.clone(); + } + if let Some(extra) = extra_fields { + if let Some(obj) = extra.as_object() { + for (k, v) in obj { + data[k] = v.clone(); + } + } + } + + let line = serde_json::to_string(&data).unwrap_or_default(); + + let entry = SessionEntry { + id: id.to_string(), + parent_id: parent_id.map(|s| s.to_string()), + entry_type, + data, + tokens, + }; + + self.order.push(id.to_string()); + self.entries.insert(id.to_string(), entry); + self.raw_lines.push(line.clone()); + + line + } + + /// Append a user message. Returns (entry_id, jsonl_line). + pub fn append_user_message(&mut self, parent_id: &str, content: &str, tokens: usize) -> (String, String) { + let id = format!("u-{}", self.order.len()); + let line = self.append_entry( + &id, + Some(parent_id), + EntryType::Message, + Some("user"), + Some(&json!(content)), + tokens, + None, + ); + (id, line) + } + + /// Append an assistant message. Returns (entry_id, jsonl_line). + pub fn append_assistant_message(&mut self, parent_id: &str, content: &Value, tokens: usize) -> (String, String) { + let id = format!("a-{}", self.order.len()); + let line = self.append_entry( + &id, + Some(parent_id), + EntryType::Message, + Some("assistant"), + Some(content), + tokens, + None, + ); + (id, line) + } + + /// Append a tool result message (role: user with tool_result content). Returns (entry_id, jsonl_line). 
+ pub fn append_tool_results(&mut self, parent_id: &str, tool_results: &Value, tokens: usize) -> (String, String) { + let id = format!("t-{}", self.order.len()); + let line = self.append_entry( + &id, + Some(parent_id), + EntryType::Message, + Some("user"), + Some(tool_results), + tokens, + None, + ); + (id, line) + } + + /// Append a compaction entry. Returns (entry_id, jsonl_line). + pub fn append_compaction(&mut self, parent_id: &str, summary: &str, first_kept: &str) -> (String, String) { + let id = format!("c-{}", self.order.len()); + let extra = json!({ + "summary": summary, + "first_kept": first_kept, + }); + let line = self.append_entry( + &id, + Some(parent_id), + EntryType::Compaction, + None, + None, + 0, + Some(&extra), + ); + (id, line) + } + + /// Append a steering message. Returns (entry_id, jsonl_line). + pub fn append_steering_message(&mut self, parent_id: &str, content: &str, tokens: usize) -> (String, String) { + let id = format!("s-{}", self.order.len()); + let line = self.append_entry( + &id, + Some(parent_id), + EntryType::Steering, + Some("user"), + Some(&json!(content)), + tokens, + None, + ); + (id, line) + } + + /// Estimate total tokens in the context for a given leaf. + pub fn estimate_tokens(&self, leaf_id: &str) -> usize { + let mut total = 0; + let mut current_id = Some(leaf_id.to_string()); + + while let Some(id) = current_id { + if let Some(entry) = self.entries.get(&id) { + if entry.entry_type == EntryType::Compaction { + // After compaction, only count from here forward + total += entry.tokens; + break; + } + total += entry.tokens; + current_id = entry.parent_id.clone(); + } else { + break; + } + } + + total + } + + /// Find a cut point for compaction. Returns the entry ID where we should + /// start keeping messages (everything before this gets compacted). + /// Walks backward from the leaf keeping `keep_recent_tokens` worth of messages. 
+ pub fn find_cut_point(&self, leaf_id: &str, keep_recent_tokens: usize) -> Option<String> { + let mut accumulated = 0; + let mut current_id = Some(leaf_id.to_string()); + let mut cut_point = None; + + while let Some(id) = current_id { + if let Some(entry) = self.entries.get(&id) { + accumulated += entry.tokens; + if accumulated >= keep_recent_tokens { + // This is where we should cut — keep everything after this + cut_point = Some(id.clone()); + break; + } + current_id = entry.parent_id.clone(); + } else { + break; + } + } + + cut_point + } + + /// Serialize the tree back to JSONL format. + pub fn to_jsonl(&self) -> String { + self.raw_lines.join("\n") + } + + /// Get all entry IDs in insertion order. + pub fn entry_ids(&self) -> &[String] { + &self.order + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_new_session_tree() { + let tree = SessionTree::new("test-1"); + assert_eq!(tree.len(), 1); + assert!(!tree.is_empty()); + } + + #[test] + fn test_append_and_build_context() { + let mut tree = SessionTree::new("test-1"); + let header_id = tree.last_entry_id().unwrap().to_string(); + + let (user_id, _) = tree.append_user_message(&header_id, "Hello", 10); + let (asst_id, _) = tree.append_assistant_message(&user_id, &json!([{"type": "text", "text": "Hi there!"}]), 20); + + let messages = tree.build_context(&asst_id); + assert_eq!(messages.len(), 2); + assert_eq!(messages[0]["role"], "user"); + assert_eq!(messages[1]["role"], "assistant"); + } + + #[test] + fn test_compaction() { + let mut tree = SessionTree::new("test-1"); + let header_id = tree.last_entry_id().unwrap().to_string(); + + let (u1, _) = tree.append_user_message(&header_id, "First message", 100); + let (a1, _) = tree.append_assistant_message(&u1, &json!("Response 1"), 200); + let (compact_id, _) = tree.append_compaction(&a1, "Summary of conversation so far", &a1); + let (u2, _) = tree.append_user_message(&compact_id, "New message after compaction", 50); + + let messages = 
tree.build_context(&u2); + // Should have: compaction summary + new message + assert_eq!(messages.len(), 2); + assert!(messages[0]["content"].as_str().unwrap().contains("summary")); + } + + #[test] + fn test_from_jsonl() { + let jsonl = r#"{"id":"h-1","parentId":null,"type":"header","version":1,"tokens":0} +{"id":"u-1","parentId":"h-1","type":"message","role":"user","content":"Hello","tokens":10} +{"id":"a-1","parentId":"u-1","type":"message","role":"assistant","content":"Hi!","tokens":5}"#; + + let tree = SessionTree::from_jsonl(jsonl); + assert_eq!(tree.len(), 3); + + let messages = tree.build_context("a-1"); + assert_eq!(messages.len(), 2); + } + + #[test] + fn test_to_jsonl_roundtrip() { + let mut tree = SessionTree::new("test-1"); + let header_id = tree.last_entry_id().unwrap().to_string(); + tree.append_user_message(&header_id, "Hello", 10); + + let jsonl = tree.to_jsonl(); + let tree2 = SessionTree::from_jsonl(&jsonl); + assert_eq!(tree2.len(), tree.len()); + } + + #[test] + fn test_estimate_tokens() { + let mut tree = SessionTree::new("test-1"); + let header_id = tree.last_entry_id().unwrap().to_string(); + + let (u1, _) = tree.append_user_message(&header_id, "Hello", 100); + let (a1, _) = tree.append_assistant_message(&u1, &json!("Response"), 200); + + assert_eq!(tree.estimate_tokens(&a1), 300); + } + + #[test] + fn test_find_cut_point() { + let mut tree = SessionTree::new("test-1"); + let header_id = tree.last_entry_id().unwrap().to_string(); + + let (u1, _) = tree.append_user_message(&header_id, "Msg 1", 100); + let (a1, _) = tree.append_assistant_message(&u1, &json!("Resp 1"), 200); + let (u2, _) = tree.append_user_message(&a1, "Msg 2", 100); + let (a2, _) = tree.append_assistant_message(&u2, &json!("Resp 2"), 200); + + // Keep 250 tokens — should cut somewhere in the middle + let cut = tree.find_cut_point(&a2, 250); + assert!(cut.is_some()); + } +} diff --git a/os-apps/temper-agent/wasm/steering_checker/Cargo.lock 
b/os-apps/temper-agent/wasm/steering_checker/Cargo.lock new file mode 100644 index 00000000..5f287e76 --- /dev/null +++ b/os-apps/temper-agent/wasm/steering_checker/Cargo.lock @@ -0,0 +1,129 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "session-tree-lib" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "steering-checker" +version = "0.1.0" +dependencies = [ + "session-tree-lib", + "temper-wasm-sdk", + "wasm-helpers", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "wasm-helpers" +version = "0.1.0" +dependencies = [ + "serde_json", + "temper-wasm-sdk", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/steering_checker/Cargo.toml b/os-apps/temper-agent/wasm/steering_checker/Cargo.toml new file mode 100644 index 00000000..37580684 --- /dev/null +++ b/os-apps/temper-agent/wasm/steering_checker/Cargo.toml @@ -0,0 +1,14 @@ +[package] +name = "steering-checker" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +session-tree-lib = { path = "../session-tree-lib" } +wasm-helpers = { path = "../wasm-helpers" } diff --git a/os-apps/temper-agent/wasm/steering_checker/src/lib.rs 
b/os-apps/temper-agent/wasm/steering_checker/src/lib.rs new file mode 100644 index 00000000..28917ed4 --- /dev/null +++ b/os-apps/temper-agent/wasm/steering_checker/src/lib.rs @@ -0,0 +1,198 @@ +//! Steering Checker — WASM module for the two-loop steering architecture. +//! +//! When the LLM returns end_turn, this module is triggered (via CheckSteering). +//! It checks for queued steering messages and either: +//! - Injects the first queued message and returns ContinueWithSteering +//! - Returns FinalizeResult if no messages are queued +//! +//! Build: `cargo build --target wasm32-unknown-unknown --release` + +use session_tree_lib::SessionTree; +use temper_wasm_sdk::prelude::*; +use wasm_helpers::{read_session_from_temperfs, resolve_temper_api_url, write_session_to_temperfs}; + +/// Entry point. +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + ctx.log("info", "steering_checker: starting"); + + let fields = ctx.entity_state.get("fields").cloned().unwrap_or(json!({})); + + // Read steering state + let steering_messages_json = fields + .get("steering_messages") + .and_then(|v| v.as_str()) + .unwrap_or("[]"); + + let mut steering_messages: Vec<Value> = serde_json::from_str(steering_messages_json) + .unwrap_or_default(); + + let follow_up_count = fields + .get("follow_up_count") + .and_then(|v| v.as_i64()) + .unwrap_or(0); + let max_follow_ups: i64 = fields + .get("max_follow_ups") + .and_then(|v| v.as_str()) + .and_then(|s| s.parse().ok()) + .unwrap_or(5); + + let session_file_id = fields + .get("session_file_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + let session_leaf_id = fields + .get("session_leaf_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + + let temper_api_url = resolve_temper_api_url(&ctx, &fields); + let tenant = &ctx.tenant; + + // Check if we have steering messages AND haven't hit the follow-up limit + if !steering_messages.is_empty() 
&& follow_up_count < max_follow_ups { + // Dequeue the first steering message + let msg = steering_messages.remove(0); + let msg_content = msg.get("content") + .and_then(|v| v.as_str()) + .unwrap_or_else(|| msg.as_str().unwrap_or("")); + + ctx.log("info", &format!( + "steering_checker: injecting steering message ({} remaining, follow_up {}/{})", + steering_messages.len(), follow_up_count + 1, max_follow_ups + )); + + // If session tree mode, inject into session tree + if !session_file_id.is_empty() && !session_leaf_id.is_empty() { + let session_jsonl = read_session_from_temperfs(&ctx, &temper_api_url, tenant, session_file_id)?; + let mut tree = SessionTree::from_jsonl(&session_jsonl); + + // Append steering message as a user message in the tree + let (new_leaf_id, _line) = tree.append_steering_message( + session_leaf_id, + msg_content, + estimate_tokens(msg_content), + ); + + // Write back + let updated_jsonl = tree.to_jsonl(); + write_session_to_temperfs(&ctx, &temper_api_url, tenant, session_file_id, &updated_jsonl)?; + + // Update steering_messages in entity state (remove dequeued message) + let updated_queue = + serde_json::to_string(&steering_messages).unwrap_or_else(|_| "[]".to_string()); + set_success_result("ContinueWithSteering", &json!({ + "session_leaf_id": new_leaf_id, + "steering_messages": updated_queue, + })); + } else { + // Inline conversation mode (legacy fallback) + let conversation_json = fields + .get("conversation") + .and_then(|v| v.as_str()) + .unwrap_or("[]"); + let mut messages: Vec<Value> = serde_json::from_str(conversation_json).unwrap_or_default(); + messages.push(json!({ + "role": "user", + "content": msg_content, + })); + let updated_conversation = serde_json::to_string(&messages).unwrap_or_default(); + + set_success_result("ContinueWithSteering", &json!({ + "conversation": updated_conversation, + "steering_messages": serde_json::to_string(&steering_messages) + .unwrap_or_else(|_| "[]".to_string()), + })); + } + } else { + // No steering 
messages or follow-up limit reached — finalize + if follow_up_count >= max_follow_ups { + ctx.log("info", &format!( + "steering_checker: follow-up limit reached ({}/{}), finalizing", + follow_up_count, max_follow_ups + )); + } else { + ctx.log("info", "steering_checker: no steering messages, finalizing"); + } + + // Extract the result text from the last assistant message + let result_text = extract_last_result(&ctx, &fields, &temper_api_url, tenant, session_file_id, session_leaf_id)?; + + set_success_result("FinalizeResult", &json!({ + "result": result_text, + "session_leaf_id": session_leaf_id, + })); + } + + Ok(()) + })(); + + if let Err(e) = result { + set_error_result(&e); + } + 0 +} + +/// Extract the last assistant text from the conversation for the result field. +fn extract_last_result( + ctx: &Context, + fields: &Value, + temper_api_url: &str, + tenant: &str, + session_file_id: &str, + session_leaf_id: &str, +) -> Result<String, String> { + if !session_file_id.is_empty() && !session_leaf_id.is_empty() { + let session_jsonl = read_session_from_temperfs(ctx, temper_api_url, tenant, session_file_id)?; + let tree = SessionTree::from_jsonl(&session_jsonl); + let messages = tree.build_context(session_leaf_id); + + // Find last assistant message + for msg in messages.iter().rev() { + if msg.get("role").and_then(|v| v.as_str()) == Some("assistant") { + return Ok(extract_text_from_content(msg.get("content"))); + } + } + Ok(String::new()) + } else { + let conversation_json = fields + .get("conversation") + .and_then(|v| v.as_str()) + .unwrap_or("[]"); + let messages: Vec<Value> = serde_json::from_str(conversation_json).unwrap_or_default(); + + for msg in messages.iter().rev() { + if msg.get("role").and_then(|v| v.as_str()) == Some("assistant") { + return Ok(extract_text_from_content(msg.get("content"))); + } + } + Ok(String::new()) + } +} + +/// Extract text from an assistant message content (handles both string and array formats). 
+fn extract_text_from_content(content: Option<&Value>) -> String { + match content { + Some(Value::String(s)) => s.clone(), + Some(Value::Array(arr)) => { + arr.iter() + .filter_map(|block| { + if block.get("type").and_then(|v| v.as_str()) == Some("text") { + block.get("text").and_then(|v| v.as_str()).map(String::from) + } else { + None + } + }) + .collect::<Vec<_>>() + .join("\n") + } + _ => String::new(), + } +} + +/// Simple token estimate (4 chars per token). +fn estimate_tokens(text: &str) -> usize { + text.len() / 4 +} diff --git a/os-apps/temper-agent/wasm/tool_runner/Cargo.lock b/os-apps/temper-agent/wasm/tool_runner/Cargo.lock index e03b1945..a00a7bc1 100644 --- a/os-apps/temper-agent/wasm/tool_runner/Cargo.lock +++ b/os-apps/temper-agent/wasm/tool_runner/Cargo.lock @@ -74,6 +74,13 @@ dependencies = [ "zmij", ] +[[package]] +name = "session-tree-lib" +version = "0.1.0" +dependencies = [ + "serde_json", +] + [[package]] name = "syn" version = "2.0.117" @@ -96,6 +103,7 @@ dependencies = [ name = "tool-runner" version = "0.1.0" dependencies = [ + "session-tree-lib", "temper-wasm-sdk", ] diff --git a/os-apps/temper-agent/wasm/tool_runner/Cargo.toml b/os-apps/temper-agent/wasm/tool_runner/Cargo.toml index bc231ee4..4812213d 100644 --- a/os-apps/temper-agent/wasm/tool_runner/Cargo.toml +++ b/os-apps/temper-agent/wasm/tool_runner/Cargo.toml @@ -10,3 +10,4 @@ crate-type = ["cdylib"] [dependencies] temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +session-tree-lib = { path = "../session-tree-lib" } diff --git a/os-apps/temper-agent/wasm/tool_runner/src/lib.rs b/os-apps/temper-agent/wasm/tool_runner/src/lib.rs index 9ebb1cee..5c378ba9 100644 --- a/os-apps/temper-agent/wasm/tool_runner/src/lib.rs +++ b/os-apps/temper-agent/wasm/tool_runner/src/lib.rs @@ -22,15 +22,28 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { .and_then(|v| v.as_str()) .unwrap_or(""); - if sandbox_url.is_empty() { - return Err("sandbox_url is empty — cannot execute 
tools".to_string()); - } - let workdir = fields .get("workdir") .and_then(|v| v.as_str()) .unwrap_or("/workspace"); + // Temper API URL: read from integration config, default to localhost + let temper_api_url = ctx + .config + .get("temper_api_url") + .cloned() + .unwrap_or_else(|| "http://127.0.0.1:3000".to_string()); + let tenant = &ctx.tenant; + let hook_policy = fields + .get("hook_policy") + .and_then(|v| v.as_str()) + .unwrap_or("none"); + let soul_id = fields + .get("soul_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + let _ = send_heartbeat(&ctx, &temper_api_url, tenant); + // Read pending tool calls from trigger params let tool_calls_json = ctx .trigger_params @@ -61,13 +74,56 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { "info", &format!("tool_runner: executing tool '{tool_name}' id={tool_id}"), ); + emit_progress_ignore( + &ctx, + json!({ + "kind": "tool_execution_start", + "message": format!("executing tool {tool_name}"), + "tool_call_id": tool_id, + "tool_name": tool_name, + }), + ); - let result = execute_tool(&ctx, sandbox_url, workdir, tool_name, &input); + let result = if let Err(error) = validate_tool_input(tool_name, &input) { + Err(error) + } else if let Some(error) = + evaluate_before_hooks(&ctx, &temper_api_url, tenant, soul_id, hook_policy, tool_name)? 
+ { + Err(error) + } else if is_entity_tool(tool_name) { + execute_entity_tool(&ctx, &temper_api_url, tenant, &fields, tool_name, &input) + } else if sandbox_url.is_empty() { + Err(format!("sandbox_url is empty — cannot execute sandbox tool '{tool_name}'")) + } else { + execute_tool(&ctx, sandbox_url, workdir, tool_name, &input) + }; let (content, is_error) = match result { - Ok(output) => (output, false), + Ok(output) => ( + apply_after_hooks( + &ctx, + &temper_api_url, + tenant, + soul_id, + hook_policy, + tool_name, + output, + )?, + false, + ), Err(e) => (format!("Error: {e}"), true), }; + let _ = send_heartbeat(&ctx, &temper_api_url, tenant); + emit_progress_ignore( + &ctx, + json!({ + "kind": "tool_execution_complete", + "message": format!("completed tool {tool_name}"), + "tool_call_id": tool_id, + "tool_name": tool_name, + "is_error": is_error, + }), + ); tool_results.push(json!({ "type": "tool_result", @@ -77,41 +133,54 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { })); } - // TemperFS conversation storage + // Session tree and conversation storage let conversation_file_id = fields .get("conversation_file_id") .and_then(|v| v.as_str()) .unwrap_or(""); - // Temper API URL: prefer Configure override in state, then integration config. - let temper_api_url = resolve_temper_api_url(&ctx, &fields); - let tenant = &ctx.tenant; + let session_file_id = fields + .get("session_file_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + let session_leaf_id = fields + .get("session_leaf_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); - // Read current conversation and append tool results - let mut messages: Vec<Value> = if !conversation_file_id.is_empty() { - read_conversation_from_temperfs(&ctx, &temper_api_url, tenant, conversation_file_id)? 
- } else { - let conversation_json = fields - .get("conversation") - .and_then(|v| v.as_str()) - .unwrap_or("[]"); - serde_json::from_str(conversation_json).unwrap_or_default() - }; + let results_json = serde_json::to_string(&tool_results).unwrap_or_default(); + let mut params = json!({ + "pending_tool_calls": results_json, + }); - // Append tool results as a user message (Anthropic API format) - messages.push(json!({ - "role": "user", - "content": tool_results, - })); + if !session_file_id.is_empty() && !session_leaf_id.is_empty() { + // Session tree mode: append tool results + let session_jsonl = read_session_from_temperfs(&ctx, &temper_api_url, tenant, session_file_id)?; + let mut tree = session_tree_lib::SessionTree::from_jsonl(&session_jsonl); + let tool_results_value = json!(tool_results.clone()); + let tokens_est = results_json.len() / 4; + let (new_leaf, _) = tree.append_tool_results(session_leaf_id, &tool_results_value, tokens_est); + let updated_jsonl = tree.to_jsonl(); + write_session_to_temperfs(&ctx, &temper_api_url, tenant, session_file_id, &updated_jsonl)?; + + params["session_leaf_id"] = json!(new_leaf); + } else if !conversation_file_id.is_empty() { + // Legacy flat JSON mode + let mut messages: Vec<Value> = + read_conversation_from_temperfs(&ctx, &temper_api_url, tenant, conversation_file_id)?; + + // Append tool results as a user message (Anthropic API format) + messages.push(json!({ + "role": "user", + "content": tool_results, + })); - // Write back to TemperFS or pass inline - let updated_conversation = serde_json::to_string(&messages).unwrap_or_default(); - if !conversation_file_id.is_empty() { + let updated_conversation = serde_json::to_string(&messages).unwrap_or_default(); let body = format!("{{\"messages\":{updated_conversation}}}"); let url = format!("{temper_api_url}/tdata/Files('{conversation_file_id}')/$value"); let headers = vec![ ("content-type".to_string(), "application/json".to_string()), ("x-tenant-id".to_string(), tenant.to_string()), - 
("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ]; match ctx.http_call("PUT", &url, &headers, &body) { Ok(resp) if resp.status >= 200 && resp.status < 300 => { @@ -134,6 +203,24 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { return Err(format!("TemperFS conversation write failed: {e}")); } } + params["conversation"] = json!(updated_conversation); + } else { + // Inline conversation mode (no TemperFS) + let mut messages: Vec<Value> = { + let conversation_json = fields + .get("conversation") + .and_then(|v| v.as_str()) + .unwrap_or("[]"); + serde_json::from_str(conversation_json).unwrap_or_default() + }; + + messages.push(json!({ + "role": "user", + "content": tool_results, + })); + + let updated_conversation = serde_json::to_string(&messages).unwrap_or_default(); + params["conversation"] = json!(updated_conversation); } // Fsync sandbox files to TemperFS (best-effort) @@ -152,7 +239,7 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { .unwrap_or(61440); let sync_exclude = ctx.config.get("sync_exclude").cloned().unwrap_or_default(); - if !file_manifest_id.is_empty() && !workspace_id.is_empty() { + if !file_manifest_id.is_empty() && !workspace_id.is_empty() && !sandbox_url.is_empty() { let e2b = is_e2b_sandbox(sandbox_url); match sync_files_to_temperfs( &ctx, @@ -177,13 +264,6 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { } } - let results_json = serde_json::to_string(&tool_results).unwrap_or_default(); - let mut params = json!({ - "pending_tool_calls": results_json, - }); - if conversation_file_id.is_empty() { - params["conversation"] = json!(updated_conversation); - } set_success_result("HandleToolResults", &params); Ok(()) @@ -842,7 +922,7 @@ fn read_conversation_from_temperfs( let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value"); let headers = vec![ ("x-tenant-id".to_string(), tenant.to_string()), - 
("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ("accept".to_string(), "application/json".to_string()), ]; @@ -972,7 +1052,7 @@ fn read_manifest( let url = format!("{temper_api_url}/tdata/Files('{manifest_file_id}')/$value"); let headers = vec![ ("x-tenant-id".to_string(), tenant.to_string()), - ("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ("accept".to_string(), "application/json".to_string()), ]; @@ -1052,7 +1132,7 @@ fn sync_files_to_temperfs( let headers = vec![ ("content-type".to_string(), "application/json".to_string()), ("x-tenant-id".to_string(), tenant.to_string()), - ("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ]; let file_url = format!("{temper_api_url}/tdata/Files"); @@ -1181,6 +1261,590 @@ fn sync_files_to_temperfs( Ok(synced_count) } +// --- Entity tool dispatch --- + +fn emit_progress_ignore(ctx: &Context, payload: Value) { + let _ = ctx.emit_progress(&payload); +} + +fn send_heartbeat(ctx: &Context, temper_api_url: &str, tenant: &str) -> Result<(), String> { + let url = format!( + "{temper_api_url}/tdata/TemperAgents('{}')/Temper.Agent.TemperAgent.Heartbeat", + ctx.entity_id + ); + let body = json!({ "last_heartbeat_at": "alive" }); + let _ = ctx.http_call("POST", &url, &odata_headers(tenant), &body.to_string())?; + Ok(()) +} + +fn validate_tool_input(tool_name: &str, input: &Value) -> Result<(), String> { + let object = input + .as_object() + .ok_or_else(|| format!("{tool_name}: input must be an object"))?; + let required: &[&str] = match tool_name { + "read" => &["path"], + "write" => &["path", "content"], + "edit" => &["path", "old_string", "new_string"], + "bash" => &["command"], + "save_memory" => &["key", "content"], + "recall_memory" => &["query"], + "spawn_agent" => &["task"], + "abort_agent" => 
&["agent_id"], + "steer_agent" => &["agent_id", "message"], + "read_entity" => &["file_id"], + "run_coding_agent" => &["agent_type", "task"], + _ => &[], + }; + for key in required { + let Some(value) = object.get(*key) else { + return Err(format!("{tool_name}: missing '{key}'")); + }; + if value.is_null() || value.as_str().is_some_and(str::is_empty) { + return Err(format!("{tool_name}: '{key}' must not be empty")); + } + } + Ok(()) +} + +fn evaluate_before_hooks( + ctx: &Context, + temper_api_url: &str, + tenant: &str, + soul_id: &str, + hook_policy: &str, + tool_name: &str, +) -> Result<Option<String>, String> { + if hook_policy == "none" || soul_id.is_empty() { + return Ok(None); + } + let hooks = load_matching_hooks(ctx, temper_api_url, tenant, soul_id, "before", tool_name)?; + for hook in hooks { + let action = entity_field_str(&hook, &["HookAction"]).unwrap_or("log"); + let name = entity_field_str(&hook, &["Name"]).unwrap_or("hook"); + match action { + "block" => { + return Ok(Some(format!( + "tool blocked by hook '{name}' for tool '{tool_name}'" + ))) + } + "log" => ctx.log("info", &format!("tool_runner: before hook '{name}' matched {tool_name}")), + _ => {} + } + } + Ok(None) +} + +fn apply_after_hooks( + ctx: &Context, + temper_api_url: &str, + tenant: &str, + soul_id: &str, + hook_policy: &str, + tool_name: &str, + mut output: String, +) -> Result<String, String> { + if hook_policy != "full_hooks" || soul_id.is_empty() { + return Ok(output); + } + let hooks = load_matching_hooks(ctx, temper_api_url, tenant, soul_id, "after", tool_name)?; + for hook in hooks { + let action = entity_field_str(&hook, &["HookAction"]).unwrap_or("log"); + let name = entity_field_str(&hook, &["Name"]).unwrap_or("hook"); + match action { + "modify" => { + output = format!("[modified by hook:{name}]\n{output}"); + } + "log" => ctx.log("info", &format!("tool_runner: after hook '{name}' matched {tool_name}")), + _ => {} + } + } + Ok(output) +} + +fn load_matching_hooks( + ctx: &Context, + temper_api_url: &str, + 
tenant: &str, + soul_id: &str, + hook_type: &str, + tool_name: &str, +) -> Result<Vec<Value>, String> { + let url = format!("{temper_api_url}/tdata/ToolHooks"); + let resp = ctx.http_call("GET", &url, &odata_headers(tenant), "")?; + if resp.status != 200 { + return Ok(Vec::new()); + } + let parsed: Value = serde_json::from_str(&resp.body).unwrap_or_else(|_| json!({ "value": [] })); + let hooks = parsed + .get("value") + .and_then(Value::as_array) + .cloned() + .unwrap_or_default() + .into_iter() + .filter(|hook| { + entity_field_str(hook, &["Status"]) == Some("Active") + && entity_field_str(hook, &["SoulId"]).unwrap_or("") == soul_id + && entity_field_str(hook, &["HookType"]).unwrap_or("") == hook_type + && hook_matches( + entity_field_str(hook, &["ToolPattern"]).unwrap_or(".*"), + tool_name, + ) + }) + .collect::<Vec<_>>(); + Ok(hooks) +} + +fn hook_matches(pattern: &str, tool_name: &str) -> bool { + let pattern = pattern.trim(); + if pattern.is_empty() || pattern == ".*" || pattern == "*" { + return true; + } + if pattern.contains('|') { + return pattern.split('|').any(|part| part.trim() == tool_name); + } + pattern == tool_name +} + +fn is_entity_tool(name: &str) -> bool { + matches!( + name, + "save_memory" + | "recall_memory" + | "spawn_agent" + | "list_agents" + | "abort_agent" + | "steer_agent" + | "read_entity" + | "run_coding_agent" + ) +} + +fn execute_entity_tool( + ctx: &Context, + temper_api_url: &str, + tenant: &str, + fields: &Value, + tool_name: &str, + input: &Value, +) -> Result<String, String> { + match tool_name { + "save_memory" => { + let key = input.get("key").and_then(|v| v.as_str()).ok_or("save_memory: missing 'key'")?; + let content = input.get("content").and_then(|v| v.as_str()).ok_or("save_memory: missing 'content'")?; + let memory_type = input.get("memory_type").and_then(|v| v.as_str()).unwrap_or("reference"); + let soul_id = fields.get("soul_id").and_then(|v| v.as_str()).unwrap_or(""); + let agent_id = ctx.entity_state.get("entity_id").and_then(|v| 
v.as_str()).unwrap_or(""); + let body = json!({ + "Key": key, "Content": content, "MemoryType": memory_type, + "SoulId": soul_id, "AuthorAgentId": agent_id, + }); + let url = format!("{temper_api_url}/tdata/AgentMemorys"); + let resp = ctx.http_call("POST", &url, &odata_headers(tenant), &serde_json::to_string(&body).unwrap_or_default())?; + if resp.status >= 200 && resp.status < 300 { + let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({})); + let entity_id = parsed + .get("entity_id") + .or_else(|| parsed.get("Id")) + .and_then(|v| v.as_str()) + .unwrap_or(""); + if !entity_id.is_empty() { + let action_url = format!( + "{temper_api_url}/tdata/AgentMemorys('{entity_id}')/Temper.Agent.AgentMemory.Save" + ); + let _ = ctx.http_call("POST", &action_url, &odata_headers(tenant), "{}"); + } + Ok(format!("Memory saved: key={key}, type={memory_type}")) + } else { + Err(format!("save_memory failed (HTTP {}): {}", resp.status, &resp.body[..resp.body.len().min(200)])) + } + } + "recall_memory" => { + let query = input.get("query").and_then(|v| v.as_str()).ok_or("recall_memory: missing 'query'")?; + let soul_id = fields.get("soul_id").and_then(|v| v.as_str()).unwrap_or(""); + let url = format!("{temper_api_url}/tdata/AgentMemorys"); + let resp = ctx.http_call("GET", &url, &odata_headers(tenant), "")?; + if resp.status == 200 { + let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({})); + let memories = parsed + .get("value") + .and_then(|v| v.as_array()) + .cloned() + .unwrap_or_default() + .into_iter() + .filter(|mem| { + entity_field_str(mem, &["Status"]) == Some("Active") + && entity_field_str(mem, &["SoulId"]).unwrap_or("") == soul_id + && (entity_field_str(mem, &["Key"]).unwrap_or("").contains(query) + || entity_field_str(mem, &["Content"]).unwrap_or("").contains(query)) + }) + .collect::<Vec<_>>(); + if memories.is_empty() { + Ok("No memories found matching query.".to_string()) + } else { + let mut result = String::new(); + for mem in 
&memories { + let k = entity_field_str(mem, &["Key"]).unwrap_or("?"); + let c = entity_field_str(mem, &["Content"]).unwrap_or(""); + let t = entity_field_str(mem, &["MemoryType"]).unwrap_or("?"); + result.push_str(&format!("- [{t}] {k}: {c}\n")); + } + Ok(result) + } + } else { + Err(format!("recall_memory failed (HTTP {})", resp.status)) + } + } + "spawn_agent" => { + let task = input.get("task").and_then(|v| v.as_str()).ok_or("spawn_agent: missing 'task'")?; + let requested_id = input.get("agent_id").and_then(|v| v.as_str()).unwrap_or(""); + let model = input.get("model").and_then(|v| v.as_str()) + .unwrap_or_else(|| fields.get("model").and_then(|v| v.as_str()).unwrap_or("claude-sonnet-4-20250514")); + let provider = input.get("provider").and_then(|v| v.as_str()) + .unwrap_or_else(|| fields.get("provider").and_then(|v| v.as_str()).unwrap_or("anthropic")); + let max_turns = input.get("max_turns").and_then(|v| v.as_i64()).unwrap_or(20); + let tools = input.get("tools").and_then(|v| v.as_str()) + .unwrap_or_else(|| fields.get("tools_enabled").and_then(|v| v.as_str()).unwrap_or("read,write,edit,bash")); + let soul_id = input.get("soul_id").and_then(|v| v.as_str()) + .unwrap_or_else(|| fields.get("soul_id").and_then(|v| v.as_str()).unwrap_or("")); + let parent_id = ctx.entity_state.get("entity_id").and_then(|v| v.as_str()).unwrap_or(""); + let sandbox_url = fields.get("sandbox_url").and_then(|v| v.as_str()).unwrap_or(""); + let workdir = fields.get("workdir").and_then(|v| v.as_str()).unwrap_or("/workspace"); + let background = input.get("background").and_then(|v| v.as_bool()).unwrap_or(false); + let current_depth = fields.get("agent_depth").and_then(|v| v.as_i64()).unwrap_or(0); + if current_depth >= 5 { + return Err("spawn_agent: agent_depth guard hit (max depth 5)".to_string()); + } + + // 1. 
Create child entity + let url = format!("{temper_api_url}/tdata/TemperAgents"); + let create_body = if requested_id.is_empty() { + "{}".to_string() + } else { + json!({ "TemperAgentId": requested_id }).to_string() + }; + let resp = ctx.http_call("POST", &url, &odata_headers(tenant), &create_body)?; + if resp.status < 200 || resp.status >= 300 { + return Err(format!("spawn_agent: create failed (HTTP {})", resp.status)); + } + let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({})); + let child_id = parsed + .get("entity_id") + .or_else(|| parsed.get("Id")) + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + if child_id.is_empty() { + return Err("spawn_agent: created entity has no Id".to_string()); + } + + // 2. Configure + let config_body = json!({ + "system_prompt": input.get("system_prompt").and_then(Value::as_str).unwrap_or(""), + "model": model, "provider": provider, "max_turns": max_turns.to_string(), "tools_enabled": tools, + "soul_id": soul_id, "user_message": task, "parent_agent_id": parent_id, + "sandbox_url": sandbox_url, "workdir": workdir, "agent_depth": current_depth + 1, + }); + let config_url = format!( + "{temper_api_url}/tdata/TemperAgents('{child_id}')/Temper.Agent.TemperAgent.Configure" + ); + let resp2 = ctx.http_call("POST", &config_url, &odata_headers(tenant), &serde_json::to_string(&config_body).unwrap_or_default())?; + if resp2.status < 200 || resp2.status >= 300 { + return Err(format!("spawn_agent: configure failed (HTTP {})", resp2.status)); + } + + // 3. Provision + let prov_url = format!( + "{temper_api_url}/tdata/TemperAgents('{child_id}')/Temper.Agent.TemperAgent.Provision" + ); + let resp3 = ctx.http_call("POST", &prov_url, &odata_headers(tenant), "{}")?; + if resp3.status < 200 || resp3.status >= 300 { + return Err(format!("spawn_agent: provision failed (HTTP {})", resp3.status)); + } + if background { + return Ok(format!( + "Child agent {child_id} created and provisioned in background." 
+            ));
+        }
+
+        // 4. Wait for completion
+        let wait_url = format!(
+            "{temper_api_url}/observe/entities/TemperAgent/{child_id}/wait?statuses=Completed,Failed,Cancelled&timeout_ms=300000&poll_ms=250"
+        );
+        let wait_headers = vec![
+            ("x-tenant-id".to_string(), tenant.to_string()),
+            ("x-temper-principal-kind".to_string(), "admin".to_string()),
+            ("accept".to_string(), "application/json".to_string()),
+        ];
+        let resp4 = ctx.http_call("GET", &wait_url, &wait_headers, "")?;
+        if resp4.status == 200 {
+            let result: Value = serde_json::from_str(&resp4.body).unwrap_or(json!({}));
+            let status = result.get("status").and_then(|v| v.as_str()).unwrap_or("unknown");
+            let agent_result = result
+                .get("fields")
+                .and_then(|v| v.get("result"))
+                .or_else(|| result.get("fields").and_then(|v| v.get("Result")))
+                .and_then(|v| v.as_str())
+                .unwrap_or("");
+            Ok(format!("Child agent {child_id} finished with status={status}. Result: {agent_result}"))
+        } else {
+            Ok(format!("Child agent {child_id} created and provisioned (poll for status)."))
+        }
+    }
+    "list_agents" => {
+        let parent_id = ctx.entity_state.get("entity_id").and_then(|v| v.as_str()).unwrap_or("");
+        let agents = list_temper_agents(ctx, temper_api_url, tenant)?;
+        let child_agents = agents
+            .into_iter()
+            .filter(|agent| {
+                entity_field_str(agent, &["ParentAgentId"]).unwrap_or("") == parent_id
+            })
+            .collect::<Vec<Value>>();
+        if child_agents.is_empty() {
+            Ok("No child agents found.".to_string())
+        } else {
+            let mut result = String::new();
+            for agent in &child_agents {
+                let id = agent_display_id(agent);
+                let status = entity_field_str(agent, &["Status"]).unwrap_or("?");
+                result.push_str(&format!("- {id}: {status}\n"));
+            }
+            Ok(result)
+        }
+    }
+    "abort_agent" => {
+        let agent_id = input.get("agent_id").and_then(|v| v.as_str()).ok_or("abort_agent: missing 'agent_id'")?;
+        let resolved_agent_id = resolve_agent_reference(ctx, temper_api_url, tenant, agent_id)?
+            .map(|agent| agent_entity_id(&agent).to_string())
+            .unwrap_or_else(|| agent_id.to_string());
+        let url = format!(
+            "{temper_api_url}/tdata/TemperAgents('{resolved_agent_id}')/Temper.Agent.TemperAgent.Cancel"
+        );
+        let resp = ctx.http_call("POST", &url, &odata_headers(tenant), "{}")?;
+        if resp.status >= 200 && resp.status < 300 {
+            Ok(format!("Agent {resolved_agent_id} cancelled."))
+        } else {
+            Err(format!("abort_agent failed (HTTP {})", resp.status))
+        }
+    }
+    "steer_agent" => {
+        let agent_id = input.get("agent_id").and_then(|v| v.as_str()).ok_or("steer_agent: missing 'agent_id'")?;
+        let message = input.get("message").and_then(|v| v.as_str()).ok_or("steer_agent: missing 'message'")?;
+        let Some(agent) = resolve_agent_reference(ctx, temper_api_url, tenant, agent_id)? else {
+            return Err(format!("steer_agent: agent '{agent_id}' not found"));
+        };
+        let resolved_agent_id = agent_entity_id(&agent);
+        let existing = entity_field_str(&agent, &["SteeringMessages"])
+            .map(str::to_string)
+            .unwrap_or_else(|| "[]".to_string());
+        let mut queue: Vec<Value> = serde_json::from_str(&existing).unwrap_or_default();
+        queue.push(json!({ "content": message }));
+        let body = json!({
+            "steering_messages": serde_json::to_string(&queue).unwrap_or_else(|_| "[]".to_string())
+        });
+        let url = format!(
+            "{temper_api_url}/tdata/TemperAgents('{resolved_agent_id}')/Temper.Agent.TemperAgent.Steer"
+        );
+        let resp = ctx.http_call(
+            "POST",
+            &url,
+            &odata_headers(tenant),
+            &serde_json::to_string(&body).unwrap_or_default(),
+        )?;
+        if resp.status >= 200 && resp.status < 300 {
+            Ok(format!(
+                "Steering message sent to agent {}.",
+                agent_display_id(&agent)
+            ))
+        } else {
+            Err(format!("steer_agent failed (HTTP {})", resp.status))
+        }
+    }
+    "read_entity" => {
+        let file_id = input.get("file_id").and_then(|v| v.as_str()).ok_or("read_entity: missing 'file_id'")?;
+        let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value");
+        let headers = vec![
+            ("x-tenant-id".to_string(),
tenant.to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ]; + let resp = ctx.http_call("GET", &url, &headers, "")?; + if resp.status == 200 { Ok(resp.body) } + else { Err(format!("read_entity failed (HTTP {})", resp.status)) } + } + "run_coding_agent" => { + let agent_type = input.get("agent_type").and_then(|v| v.as_str()).ok_or("run_coding_agent: missing 'agent_type'")?; + let task = input.get("task").and_then(|v| v.as_str()).ok_or("run_coding_agent: missing 'task'")?; + let agent_workdir = input.get("workdir").and_then(|v| v.as_str()) + .unwrap_or_else(|| fields.get("workdir").and_then(|v| v.as_str()).unwrap_or("/workspace")); + let background = input.get("background").and_then(|v| v.as_bool()).unwrap_or(false); + let sandbox_url = fields.get("sandbox_url").and_then(|v| v.as_str()).unwrap_or(""); + if sandbox_url.is_empty() { + return Err("run_coding_agent: sandbox_url is empty".to_string()); + } + let escaped_task = task.replace('\'', "'\\''"); + let command = match agent_type { + "claude-code" => format!("cd {agent_workdir} && claude --permission-mode bypassPermissions --print '{escaped_task}'"), + "codex" => format!("cd {agent_workdir} && codex exec '{escaped_task}'"), + "pi" => format!("cd {agent_workdir} && pi -p '{escaped_task}'"), + "opencode" => format!("cd {agent_workdir} && opencode run '{escaped_task}'"), + _ => return Err(format!("unsupported coding agent type: {agent_type}")), + }; + let final_cmd = if background { + format!("nohup bash -c '{command}' > /tmp/coding-agent-{agent_type}.log 2>&1 & echo $!") + } else { + command + }; + // Execute via sandbox bash API + let url = format!("{sandbox_url}/v1/processes/run"); + let body = json!({ "command": final_cmd, "workdir": agent_workdir }); + let headers = vec![("content-type".to_string(), "application/json".to_string())]; + let resp = ctx.http_call("POST", &url, &headers, &serde_json::to_string(&body).unwrap_or_default())?; + if resp.status >= 200 && resp.status < 300 { + 
let parsed: Value = serde_json::from_str(&resp.body).unwrap_or(json!({}));
+            let stdout = parsed.get("stdout").and_then(|v| v.as_str()).unwrap_or("");
+            let stderr = parsed.get("stderr").and_then(|v| v.as_str()).unwrap_or("");
+            let exit_code = parsed
+                .get("exit_code")
+                .and_then(|v| v.as_i64())
+                .unwrap_or(-1);
+            if exit_code != 0 && !stderr.is_empty() {
+                Ok(format!(
+                    "Command: {final_cmd}\nExit code: {exit_code}\nstdout: {stdout}\nstderr: {stderr}"
+                ))
+            } else {
+                Ok(format!("Command: {final_cmd}\n{stdout}"))
+            }
+        } else {
+            Err(format!("sandbox process failed (HTTP {})", resp.status))
+        }
+    }
+    _ => Err(format!("unknown entity tool: {tool_name}")),
+    }
+}
+
+fn odata_headers(tenant: &str) -> Vec<(String, String)> {
+    vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+        ("content-type".to_string(), "application/json".to_string()),
+        ("accept".to_string(), "application/json".to_string()),
+    ]
+}
+
+fn normalize_field_key(value: &str) -> String {
+    value
+        .chars()
+        .filter(|ch| ch.is_alphanumeric())
+        .flat_map(|ch| ch.to_lowercase())
+        .collect()
+}
+
+fn direct_field_value<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a Value> {
+    let object = value.as_object()?;
+    for key in keys {
+        if let Some(found) = object.get(*key) {
+            return Some(found);
+        }
+    }
+    let normalized_keys = keys
+        .iter()
+        .map(|key| normalize_field_key(key))
+        .collect::<Vec<String>>();
+    object.iter().find_map(|(key, value)| {
+        let normalized_key = normalize_field_key(key);
+        normalized_keys
+            .iter()
+            .any(|candidate| candidate == &normalized_key)
+            .then_some(value)
+    })
+}
+
+fn direct_field_str<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> {
+    direct_field_value(value, keys).and_then(Value::as_str)
+}
+
+fn entity_field_str<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> {
+    direct_field_value(value, &["fields"])
+        .and_then(|fields| direct_field_str(fields, keys))
+        .or_else(||
direct_field_str(value, keys))
+}
+
+fn agent_entity_id<'a>(agent: &'a Value) -> &'a str {
+    entity_field_str(agent, &["Id", "entity_id", "id"]).unwrap_or("")
+}
+
+fn agent_display_id<'a>(agent: &'a Value) -> &'a str {
+    entity_field_str(agent, &["TemperAgentId", "Id", "entity_id", "id"]).unwrap_or("?")
+}
+
+fn list_temper_agents(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+) -> Result<Vec<Value>, String> {
+    let url = format!("{temper_api_url}/tdata/TemperAgents");
+    let resp = ctx.http_call("GET", &url, &odata_headers(tenant), "")?;
+    if resp.status != 200 {
+        return Err(format!("temper agent listing failed (HTTP {})", resp.status));
+    }
+    let parsed: Value = serde_json::from_str(&resp.body).unwrap_or_else(|_| json!({}));
+    Ok(parsed
+        .get("value")
+        .and_then(|value| value.as_array())
+        .cloned()
+        .unwrap_or_default())
+}
+
+fn resolve_agent_reference(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    agent_reference: &str,
+) -> Result<Option<Value>, String> {
+    let agents = list_temper_agents(ctx, temper_api_url, tenant)?;
+    Ok(agents.into_iter().find(|agent| {
+        let entity_id = agent_entity_id(agent);
+        let temper_agent_id = entity_field_str(agent, &["TemperAgentId"]).unwrap_or("");
+        entity_id == agent_reference || temper_agent_id == agent_reference
+    }))
+}
+
+/// Read session JSONL from TemperFS.
+fn read_session_from_temperfs(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    file_id: &str,
+) -> Result<String, String> {
+    let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value");
+    let headers = vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+    ];
+    let resp = ctx.http_call("GET", &url, &headers, "")?;
+    if resp.status == 200 { Ok(resp.body) }
+    else if resp.status == 404 { Ok(String::new()) }
+    else { Err(format!("TemperFS session read failed (HTTP {})", resp.status)) }
+}
+
+/// Write session JSONL to TemperFS.
+fn write_session_to_temperfs( + ctx: &Context, + temper_api_url: &str, + tenant: &str, + file_id: &str, + jsonl: &str, +) -> Result<(), String> { + let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value"); + let headers = vec![ + ("content-type".to_string(), "text/plain".to_string()), + ("x-tenant-id".to_string(), tenant.to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), + ]; + let resp = ctx.http_call("PUT", &url, &headers, jsonl)?; + if resp.status >= 200 && resp.status < 300 { Ok(()) } + else { Err(format!("TemperFS session write failed (HTTP {})", resp.status)) } +} + fn resolve_temper_api_url(ctx: &Context, fields: &Value) -> String { fields .get("temper_api_url") diff --git a/os-apps/temper-agent/wasm/wasm-helpers/Cargo.lock b/os-apps/temper-agent/wasm/wasm-helpers/Cargo.lock new file mode 100644 index 00000000..b381e9d2 --- /dev/null +++ b/os-apps/temper-agent/wasm/wasm-helpers/Cargo.lock @@ -0,0 +1,113 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. 
+version = 4 + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "wasm-helpers" +version = "0.1.0" +dependencies = [ + "serde_json", + "temper-wasm-sdk", +] + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-agent/wasm/wasm-helpers/Cargo.toml b/os-apps/temper-agent/wasm/wasm-helpers/Cargo.toml new file mode 100644 index 00000000..ac813de1 --- /dev/null +++ b/os-apps/temper-agent/wasm/wasm-helpers/Cargo.toml @@ -0,0 +1,13 @@ +[package] +name = "wasm-helpers" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["rlib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } +serde_json = "1" diff --git a/os-apps/temper-agent/wasm/wasm-helpers/src/lib.rs b/os-apps/temper-agent/wasm/wasm-helpers/src/lib.rs new file mode 100644 index 00000000..bbdadb2b --- /dev/null +++ b/os-apps/temper-agent/wasm/wasm-helpers/src/lib.rs @@ -0,0 +1,191 @@ +//! Shared helper functions for TemperAgent WASM modules. +//! +//! Provides common TemperFS I/O, field extraction, and URL resolution +//! to eliminate duplication across WASM integration modules. + +use temper_wasm_sdk::prelude::*; + +/// Resolve the Temper API URL from entity fields or context config, +/// falling back to localhost. 
+pub fn resolve_temper_api_url(ctx: &Context, fields: &Value) -> String {
+    fields
+        .get("temper_api_url")
+        .and_then(|v| v.as_str())
+        .filter(|s| !s.is_empty())
+        .map(|s| s.to_string())
+        .or_else(|| {
+            ctx.config
+                .get("temper_api_url")
+                .filter(|s| !s.is_empty())
+                .cloned()
+        })
+        .unwrap_or_else(|| "http://127.0.0.1:3000".to_string())
+}
+
+/// Read session JSONL from TemperFS by file ID.
+pub fn read_session_from_temperfs(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    file_id: &str,
+) -> Result<String, String> {
+    let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value");
+    let headers = vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+    ];
+
+    let resp = ctx.http_call("GET", &url, &headers, "")?;
+    if resp.status == 200 {
+        Ok(resp.body)
+    } else if resp.status == 404 {
+        Ok(String::new())
+    } else {
+        Err(format!("TemperFS session read failed (HTTP {})", resp.status))
+    }
+}
+
+/// Write session JSONL to TemperFS by file ID.
+pub fn write_session_to_temperfs(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    file_id: &str,
+    jsonl: &str,
+) -> Result<(), String> {
+    let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value");
+    let headers = vec![
+        ("content-type".to_string(), "text/plain".to_string()),
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+    ];
+
+    let resp = ctx.http_call("PUT", &url, &headers, jsonl)?;
+    if resp.status >= 200 && resp.status < 300 {
+        Ok(())
+    } else {
+        Err(format!("TemperFS session write failed (HTTP {})", resp.status))
+    }
+}
+
+/// Build standard OData headers for tenant-scoped requests.
+pub fn odata_headers(tenant: &str) -> Vec<(String, String)> {
+    vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+        ("content-type".to_string(), "application/json".to_string()),
+        ("accept".to_string(), "application/json".to_string()),
+    ]
+}
+
+/// Look up a string field directly on a JSON value, trying multiple key names.
+pub fn direct_field_str<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> {
+    keys.iter()
+        .find_map(|key| value.get(*key).and_then(Value::as_str))
+}
+
+/// Look up a string field on a JSON value, falling back to nested `fields` object.
+pub fn entity_field_str<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> {
+    direct_field_str(value, keys).or_else(|| {
+        value
+            .get("fields")
+            .and_then(|fields| direct_field_str(fields, keys))
+    })
+}
+
+/// Parse a basic ISO 8601 timestamp (YYYY-MM-DDTHH:MM:SSZ) to Unix epoch seconds.
+/// Returns None if the format is unrecognized.
+pub fn parse_iso8601_to_epoch_secs(s: &str) -> Option<u64> {
+    // Supported formats: "2026-03-24T12:30:00Z", "2026-03-24T12:30:00.000Z"
+    let s = s.trim();
+    if s.len() < 19 {
+        return None;
+    }
+
+    let year: u64 = s.get(0..4)?.parse().ok()?;
+    let month: u64 = s.get(5..7)?.parse().ok()?;
+    let day: u64 = s.get(8..10)?.parse().ok()?;
+    let hour: u64 = s.get(11..13)?.parse().ok()?;
+    let minute: u64 = s.get(14..16)?.parse().ok()?;
+    let second: u64 = s.get(17..19)?.parse().ok()?;
+
+    if s.as_bytes().get(4) != Some(&b'-')
+        || s.as_bytes().get(7) != Some(&b'-')
+        || s.as_bytes().get(10) != Some(&b'T')
+    {
+        return None;
+    }
+
+    // Days in each month (non-leap)
+    let days_in_month = [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
+    let is_leap = (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
+
+    // Days from epoch (1970-01-01) to start of `year`
+    let mut days: u64 = 0;
+    for y in 1970..year {
+        let leap = (y % 4 == 0 && y % 100 != 0) || (y % 400 == 0);
+        days += if leap { 366 } else
{ 365 }; + } + + // Days from start of year to start of month + for m in 1..month { + days += days_in_month[m as usize]; + if m == 2 && is_leap { + days += 1; + } + } + + // Days within month (1-indexed) + days += day - 1; + + Some(days * 86400 + hour * 3600 + minute * 60 + second) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_parse_iso8601() { + // 2026-03-24T12:00:00Z + let secs = parse_iso8601_to_epoch_secs("2026-03-24T12:00:00Z"); + assert!(secs.is_some()); + let s = secs.unwrap(); + // Rough sanity: should be > 2025-01-01 (~1735689600) and < 2027-01-01 + assert!(s > 1_735_000_000); + assert!(s < 1_800_000_000); + } + + #[test] + fn test_parse_iso8601_with_millis() { + let secs = parse_iso8601_to_epoch_secs("2026-03-24T12:00:00.123Z"); + assert!(secs.is_some()); + } + + #[test] + fn test_parse_iso8601_invalid() { + assert!(parse_iso8601_to_epoch_secs("").is_none()); + assert!(parse_iso8601_to_epoch_secs("not-a-date").is_none()); + assert!(parse_iso8601_to_epoch_secs("2026").is_none()); + } + + #[test] + fn test_epoch_zero() { + let secs = parse_iso8601_to_epoch_secs("1970-01-01T00:00:00Z"); + assert_eq!(secs, Some(0)); + } + + #[test] + fn test_direct_field_str() { + let val = serde_json::json!({"Name": "test", "id": "123"}); + assert_eq!(direct_field_str(&val, &["Name"]), Some("test")); + assert_eq!(direct_field_str(&val, &["missing", "id"]), Some("123")); + assert_eq!(direct_field_str(&val, &["missing"]), None); + } + + #[test] + fn test_entity_field_str() { + let val = serde_json::json!({"fields": {"Status": "Active"}}); + assert_eq!(entity_field_str(&val, &["Status"]), Some("Active")); + } +} diff --git a/os-apps/temper-agent/wasm/workspace_restorer/src/lib.rs b/os-apps/temper-agent/wasm/workspace_restorer/src/lib.rs index b422ec10..7adcc524 100644 --- a/os-apps/temper-agent/wasm/workspace_restorer/src/lib.rs +++ b/os-apps/temper-agent/wasm/workspace_restorer/src/lib.rs @@ -44,6 +44,14 @@ pub extern "C" fn run(_ctx_ptr: i32, 
_ctx_len: i32) -> i32 { .get("conversation_file_id") .and_then(|v| v.as_str()) .unwrap_or(""); + let session_file_id = fields + .get("session_file_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + let session_leaf_id = fields + .get("session_leaf_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); // Build SandboxReady params to forward existing state let sandbox_ready_params = json!({ @@ -52,6 +60,8 @@ pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { "workspace_id": workspace_id, "conversation_file_id": conversation_file_id, "file_manifest_id": file_manifest_id, + "session_file_id": session_file_id, + "session_leaf_id": session_leaf_id, }); if file_manifest_id.is_empty() { @@ -136,7 +146,7 @@ fn read_manifest( let url = format!("{temper_api_url}/tdata/Files('{manifest_file_id}')/$value"); let headers = vec![ ("x-tenant-id".to_string(), tenant.to_string()), - ("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ("accept".to_string(), "application/json".to_string()), ]; @@ -173,7 +183,7 @@ fn read_file_from_temperfs( let url = format!("{temper_api_url}/tdata/Files('{file_id}')/$value"); let headers = vec![ ("x-tenant-id".to_string(), tenant.to_string()), - ("x-temper-principal-kind".to_string(), "system".to_string()), + ("x-temper-principal-kind".to_string(), "admin".to_string()), ]; let resp = ctx.http_call("GET", &url, &headers, "")?; diff --git a/os-apps/temper-channels/policies/channels.cedar b/os-apps/temper-channels/policies/channels.cedar new file mode 100644 index 00000000..aa408ce5 --- /dev/null +++ b/os-apps/temper-channels/policies/channels.cedar @@ -0,0 +1,81 @@ +// Temper Channels — Cedar Authorization Policies + +// Admins can do everything +permit( + principal is Admin, + action, + resource is Channel +); + +permit( + principal is Admin, + action, + resource is AgentRoute +); + +permit( + principal is Admin, + action, + resource is ChannelSession +); + +// 
Supervisors and humans can manage channels and routes +permit( + principal, + action in [Action::"create", Action::"Configure", Action::"Connect", Action::"Disconnect", Action::"Reconnect", Action::"Archive"], + resource is Channel +) when { + ["supervisor", "human"].contains(principal.agent_type) +}; + +permit( + principal, + action in [Action::"create", Action::"Register", Action::"Update", Action::"Disable", Action::"Enable"], + resource is AgentRoute +) when { + ["supervisor", "human"].contains(principal.agent_type) +}; + +// System agents can handle channel callbacks and session management +permit( + principal, + action in [Action::"Ready", Action::"ReceiveMessage", Action::"SendReply", Action::"ReplyDelivered", Action::"ConnectFailed", Action::"RouteFailed", Action::"ReplyFailed"], + resource is Channel +) when { + principal.agent_type == "system" +}; + +permit( + principal, + action in [Action::"create", Action::"Create", Action::"Resume", Action::"Expire"], + resource is ChannelSession +) when { + ["system", "supervisor", "human"].contains(principal.agent_type) +}; + +// Any authenticated agent can read +permit( + principal, + action in [Action::"read", Action::"list"], + resource is Channel +); + +permit( + principal, + action in [Action::"read", Action::"list"], + resource is AgentRoute +); + +permit( + principal, + action in [Action::"read", Action::"list"], + resource is ChannelSession +); + +permit( + principal is Agent, + action == Action::"http_call", + resource is HttpEndpoint +) when { + ["channel_connect", "route_message", "send_reply"].contains(context.module) +}; diff --git a/os-apps/temper-channels/specs/agent_route.ioa.toml b/os-apps/temper-channels/specs/agent_route.ioa.toml new file mode 100644 index 00000000..ea81f841 --- /dev/null +++ b/os-apps/temper-channels/specs/agent_route.ioa.toml @@ -0,0 +1,67 @@ +# AgentRoute — Binding-tier routing rules for channel messages. 
+# +# Routes incoming channel messages to agent configurations based on +# binding tier priority: peer > guild_roles > guild > team > channel. + +[automaton] +name = "AgentRoute" +states = ["Active", "Disabled"] +initial = "Active" + +[[state]] +name = "binding_tier" +type = "string" +initial = "channel" + +[[state]] +name = "channel_id" +type = "string" +initial = "" + +[[state]] +name = "guild_id" +type = "string" +initial = "" + +[[state]] +name = "match_pattern" +type = "string" +initial = "" + +[[state]] +name = "agent_config" +type = "string" +initial = "" + +[[state]] +name = "soul_id" +type = "string" +initial = "" + +[[action]] +name = "Register" +kind = "input" +from = ["Active"] +params = ["binding_tier", "channel_id", "guild_id", "match_pattern", "agent_config", "soul_id"] +hint = "Register a routing rule." + +[[action]] +name = "Update" +kind = "input" +from = ["Active"] +params = ["agent_config", "match_pattern", "soul_id"] +hint = "Update routing configuration." + +[[action]] +name = "Disable" +kind = "input" +from = ["Active"] +to = "Disabled" +hint = "Disable this route." + +[[action]] +name = "Enable" +kind = "input" +from = ["Disabled"] +to = "Active" +hint = "Re-enable this route." diff --git a/os-apps/temper-channels/specs/channel.ioa.toml b/os-apps/temper-channels/specs/channel.ioa.toml new file mode 100644 index 00000000..32994df6 --- /dev/null +++ b/os-apps/temper-channels/specs/channel.ioa.toml @@ -0,0 +1,173 @@ +# Channel — Multi-platform messaging channel adapter. +# +# Manages connection lifecycle for Discord, Slack, webhook, etc. +# Receives messages, routes to agents, delivers replies. 
+ +[automaton] +name = "Channel" +states = ["Created", "Connecting", "Connected", "Disconnected", "Archived"] +initial = "Created" + +[[state]] +name = "channel_type" +type = "string" +initial = "" + +[[state]] +name = "channel_id" +type = "string" +initial = "" + +[[state]] +name = "guild_id" +type = "string" +initial = "" + +[[state]] +name = "default_agent_config" +type = "string" +initial = "" + +[[state]] +name = "webhook_secret" +type = "string" +initial = "" + +[[state]] +name = "webhook_url" +type = "string" +initial = "" + +[[state]] +name = "active_sessions" +type = "counter" +initial = "0" + +[[state]] +name = "message_count" +type = "counter" +initial = "0" + +[[action]] +name = "Configure" +kind = "input" +from = ["Created"] +params = ["channel_type", "channel_id", "guild_id", "default_agent_config", "webhook_secret", "webhook_url"] +hint = "Configure channel with type and connection details." + +[[action]] +name = "Connect" +kind = "input" +from = ["Created"] +to = "Connecting" +hint = "Start channel connection." +effect = [{ type = "trigger", name = "channel_connect" }] + +[[action]] +name = "Ready" +kind = "input" +from = ["Connecting"] +to = "Connected" +hint = "Channel is connected and ready to receive messages." + +[[action]] +name = "ReceiveMessage" +kind = "input" +from = ["Connected"] +params = ["message_id", "author_id", "thread_id", "content"] +hint = "Receive an incoming message from the channel platform." +effect = [{ type = "increment", var = "message_count" }, { type = "trigger", name = "route_message" }] + +[[action]] +name = "SendReply" +kind = "input" +from = ["Connected"] +params = ["thread_id", "content", "agent_entity_id"] +hint = "Send a reply back to the channel." +effect = [{ type = "trigger", name = "send_reply" }] + +[[action]] +name = "ReplyDelivered" +kind = "input" +from = ["Connected"] +params = ["thread_id", "content", "agent_entity_id"] +hint = "Reply delivery finished successfully." 
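The Channel spec in channel.ioa.toml defines a connection lifecycle: Connect moves Created to Connecting, Ready lands in Connected, and the message-handling actions keep the channel Connected while failure callbacks either fall back to Disconnected or stay put. As a hand-written illustration only (this Rust sketch is not part of the diff, and the runtime presumably interprets the TOML spec directly), the same transition table can be written as:

```rust
// Sketch of the Channel automaton from channel.ioa.toml, transcribed by hand.
// State and action names come from the spec; everything else is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChannelState {
    Created,
    Connecting,
    Connected,
    Disconnected,
    Archived,
}

/// Apply an action name to a state. Actions declared without a `to` field
/// keep the current state; unknown (state, action) pairs are rejected.
fn step(state: ChannelState, action: &str) -> Option<ChannelState> {
    use ChannelState::*;
    match (state, action) {
        (Created, "Configure") => Some(Created),
        (Created, "Connect") => Some(Connecting),
        (Connecting, "Ready") => Some(Connected),
        (Connecting, "ConnectFailed") => Some(Disconnected),
        // Message handling and its failure callbacks stay Connected.
        (
            Connected,
            "ReceiveMessage" | "SendReply" | "ReplyDelivered" | "RouteFailed" | "ReplyFailed",
        ) => Some(Connected),
        (Connected, "Disconnect") => Some(Disconnected),
        (Disconnected, "Reconnect") => Some(Connecting),
        (Connected | Disconnected, "Archive") => Some(Archived),
        // Archived is terminal, matching the ArchivedIsFinal invariant.
        _ => None,
    }
}
```

Attempting any action from `Archived` returns `None`, which mirrors the `no_further_transitions` assertion in the spec.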
+ +[[action]] +name = "Disconnect" +kind = "input" +from = ["Connected"] +to = "Disconnected" +hint = "Channel disconnected." + +[[action]] +name = "Reconnect" +kind = "input" +from = ["Disconnected"] +to = "Connecting" +hint = "Reconnect the channel." +effect = [{ type = "trigger", name = "channel_connect" }] + +[[action]] +name = "Archive" +kind = "input" +from = ["Connected", "Disconnected"] +to = "Archived" +hint = "Archive the channel permanently." + +[[action]] +name = "ConnectFailed" +kind = "input" +from = ["Connecting"] +to = "Disconnected" +params = ["error_message"] +hint = "Channel connection WASM failed." + +[[action]] +name = "RouteFailed" +kind = "input" +from = ["Connected"] +params = ["error_message"] +hint = "Message routing WASM failed. Stay Connected." + +[[action]] +name = "ReplyFailed" +kind = "input" +from = ["Connected"] +params = ["error_message"] +hint = "Reply delivery WASM failed. Stay Connected." + +[[invariant]] +name = "ArchivedIsFinal" +when = ["Archived"] +assert = "no_further_transitions" + +[[integration]] +name = "channel_connect" +trigger = "channel_connect" +type = "wasm" +module = "channel_connect" +on_failure = "ConnectFailed" + +[integration.config] +temper_api_url = "{secret:temper_api_url}" + +[[integration]] +name = "route_message" +trigger = "route_message" +type = "wasm" +module = "route_message" +on_failure = "RouteFailed" + +[integration.config] +temper_api_url = "{secret:temper_api_url}" + +[[integration]] +name = "send_reply" +trigger = "send_reply" +type = "wasm" +module = "send_reply" +on_failure = "ReplyFailed" + +[integration.config] +temper_api_url = "{secret:temper_api_url}" diff --git a/os-apps/temper-channels/specs/channel_session.ioa.toml b/os-apps/temper-channels/specs/channel_session.ioa.toml new file mode 100644 index 00000000..16ae2903 --- /dev/null +++ b/os-apps/temper-channels/specs/channel_session.ioa.toml @@ -0,0 +1,60 @@ +# ChannelSession — Maps channel threads to TemperAgent entities. 
+# +# Tracks active conversations between channel users and agents. +# Enables session continuity (same thread = same agent) and steering. + +[automaton] +name = "ChannelSession" +states = ["Active", "Expired"] +initial = "Active" + +[[state]] +name = "channel_id" +type = "string" +initial = "" + +[[state]] +name = "thread_id" +type = "string" +initial = "" + +[[state]] +name = "author_id" +type = "string" +initial = "" + +[[state]] +name = "agent_entity_id" +type = "string" +initial = "" + +[[state]] +name = "last_message_at" +type = "string" +initial = "" + +[[action]] +name = "Create" +kind = "input" +from = ["Active"] +params = ["channel_id", "thread_id", "author_id", "agent_entity_id", "last_message_at"] +hint = "Create a new channel session linking a thread to an agent." + +[[action]] +name = "Resume" +kind = "input" +from = ["Active"] +params = ["last_message_at"] +hint = "Resume the session with a new message timestamp." + +[[action]] +name = "Expire" +kind = "input" +from = ["Active"] +to = "Expired" +hint = "Expire the session (timeout or manual cleanup)." 
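The session-continuity guarantee described in channel_session.ioa.toml (same thread resolves to the same agent until the session expires) amounts to a resume-or-create lookup. The sketch below is illustrative only: the `SessionTable` type and its in-memory map are invented for the example, while the real state lives in ChannelSession entities driven by this spec.

```rust
use std::collections::HashMap;

/// Illustrative session table keyed by (channel_id, thread_id).
/// Shows only the resume-or-create decision behind thread continuity.
struct SessionTable {
    active: HashMap<(String, String), String>, // -> agent_entity_id
}

impl SessionTable {
    fn new() -> Self {
        Self { active: HashMap::new() }
    }

    /// Resume the agent already bound to this thread, or bind `new_agent`
    /// (mirroring the spec's Resume and Create actions).
    fn resolve(&mut self, channel_id: &str, thread_id: &str, new_agent: &str) -> String {
        self.active
            .entry((channel_id.to_string(), thread_id.to_string()))
            .or_insert_with(|| new_agent.to_string())
            .clone()
    }

    /// Expire drops the binding; the next message binds a fresh agent.
    fn expire(&mut self, channel_id: &str, thread_id: &str) {
        self.active
            .remove(&(channel_id.to_string(), thread_id.to_string()));
    }
}
```

After `expire`, a new message on the same thread creates a new binding, matching the spec's terminal Expired state plus a fresh Create.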
+
+[[invariant]]
+name = "ExpiredIsFinal"
+when = ["Expired"]
+assert = "no_further_transitions"
diff --git a/os-apps/temper-channels/specs/model.csdl.xml b/os-apps/temper-channels/specs/model.csdl.xml
new file mode 100644
index 00000000..21ee83fa
--- /dev/null
+++ b/os-apps/temper-channels/specs/model.csdl.xml
@@ -0,0 +1,180 @@
+<!-- 180-line CSDL entity model for temper-channels (XML markup not preserved here) -->
diff --git a/os-apps/temper-channels/wasm/build.sh b/os-apps/temper-channels/wasm/build.sh
new file mode 100755
index 00000000..e7912e4e
--- /dev/null
+++ b/os-apps/temper-channels/wasm/build.sh
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+
+for module in channel_connect route_message send_reply; do
+  echo "Building $module..."
+  (cd "$SCRIPT_DIR/$module" && cargo build --target wasm32-unknown-unknown --release)
+  echo " -> $module built successfully"
+done
+
+echo ""
+echo "All Temper channel WASM modules built. Binaries at:"
+for module in channel_connect route_message send_reply; do
+  wasm_file="$SCRIPT_DIR/$module/target/wasm32-unknown-unknown/release/${module}.wasm"
+  if [ !
-f "$wasm_file" ]; then + wasm_file="$SCRIPT_DIR/$module/target/wasm32-unknown-unknown/release/$(echo "$module" | tr '_' '-').wasm" + fi + if [ -f "$wasm_file" ]; then + size=$(wc -c < "$wasm_file" | tr -d ' ') + echo " $module: $(( size / 1024 ))KB" + fi +done diff --git a/os-apps/temper-channels/wasm/channel_connect/Cargo.lock b/os-apps/temper-channels/wasm/channel_connect/Cargo.lock new file mode 100644 index 00000000..8bf833c8 --- /dev/null +++ b/os-apps/temper-channels/wasm/channel_connect/Cargo.lock @@ -0,0 +1,112 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "channel-connect" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", +] + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-channels/wasm/channel_connect/Cargo.toml b/os-apps/temper-channels/wasm/channel_connect/Cargo.toml new file mode 100644 index 00000000..4416adfa --- /dev/null +++ b/os-apps/temper-channels/wasm/channel_connect/Cargo.toml @@ -0,0 +1,12 @@ +[package] +name = "channel-connect" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } diff --git a/os-apps/temper-channels/wasm/channel_connect/src/lib.rs 
b/os-apps/temper-channels/wasm/channel_connect/src/lib.rs new file mode 100644 index 00000000..9d7655b1 --- /dev/null +++ b/os-apps/temper-channels/wasm/channel_connect/src/lib.rs @@ -0,0 +1,29 @@ +use temper_wasm_sdk::prelude::*; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + let fields = ctx.entity_state.get("fields").cloned().unwrap_or_else(|| json!({})); + let channel_type = fields + .get("channel_type") + .and_then(|v| v.as_str()) + .unwrap_or("webhook"); + let channel_id = fields + .get("channel_id") + .and_then(|v| v.as_str()) + .unwrap_or(""); + + ctx.log( + "info", + &format!("channel_connect: ready channel_type={channel_type} channel_id={channel_id}"), + ); + set_success_result("Ready", &json!({})); + Ok(()) + })(); + + if let Err(error) = result { + set_error_result(&error); + } + 0 +} diff --git a/os-apps/temper-channels/wasm/route_message/Cargo.lock b/os-apps/temper-channels/wasm/route_message/Cargo.lock new file mode 100644 index 00000000..a6f256a8 --- /dev/null +++ b/os-apps/temper-channels/wasm/route_message/Cargo.lock @@ -0,0 +1,112 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. 
+version = 4 + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "route-message" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] 
+ +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-channels/wasm/route_message/Cargo.toml b/os-apps/temper-channels/wasm/route_message/Cargo.toml new file mode 100644 index 00000000..ec77a922 --- /dev/null +++ b/os-apps/temper-channels/wasm/route_message/Cargo.toml @@ -0,0 +1,12 @@ +[package] +name = "route-message" +version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } diff --git a/os-apps/temper-channels/wasm/route_message/src/lib.rs b/os-apps/temper-channels/wasm/route_message/src/lib.rs new file mode 100644 index 00000000..035f579e --- /dev/null +++ b/os-apps/temper-channels/wasm/route_message/src/lib.rs @@ -0,0 +1,335 @@ +use temper_wasm_sdk::prelude::*; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + let fields = ctx.entity_state.get("fields").cloned().unwrap_or_else(|| json!({})); + let temper_api_url = resolve_temper_api_url(&ctx, &fields); + let channel_id = str_field(&fields, &["channel_id", "ChannelId"]).unwrap_or(""); + let default_agent_config = + str_field(&fields, &["default_agent_config", 
"DefaultAgentConfig"]).unwrap_or("{}"); + let thread_id = str_field(&fields, &["thread_id", "ThreadId"]).unwrap_or(""); + let author_id = str_field(&fields, &["author_id", "AuthorId"]).unwrap_or(""); + let content = str_field(&fields, &["content", "Content"]).unwrap_or(""); + if channel_id.is_empty() || thread_id.is_empty() || author_id.is_empty() { + return Err("route_message: missing channel_id/thread_id/author_id".to_string()); + } + + let existing_session = find_active_session(&ctx, &temper_api_url, &ctx.tenant, channel_id, thread_id, author_id)?; + let agent_id = if let Some(session) = existing_session { + let session_id = session + .get("entity_id") + .and_then(|v| v.as_str()) + .or_else(|| nested_str_field(&session, &["Id"])) + .unwrap_or_default() + .to_string(); + let agent_id = nested_str_field(&session, &["AgentEntityId"]) + .unwrap_or_default() + .to_string(); + resume_session(&ctx, &temper_api_url, &ctx.tenant, &session_id)?; + steer_existing_agent(&ctx, &temper_api_url, &ctx.tenant, &agent_id, content)?; + agent_id + } else { + let route = find_route(&ctx, &temper_api_url, &ctx.tenant, channel_id)?; + let route_config = route + .as_ref() + .and_then(|value| nested_str_field(value, &["AgentConfig"])) + .filter(|value| !value.trim().is_empty()) + .unwrap_or(default_agent_config); + let route_soul_id = route + .as_ref() + .and_then(|value| nested_str_field(value, &["SoulId"])) + .unwrap_or(""); + let agent_id = create_agent_from_route( + &ctx, + &temper_api_url, + &ctx.tenant, + route_config, + route_soul_id, + content, + )?; + create_session( + &ctx, + &temper_api_url, + &ctx.tenant, + channel_id, + thread_id, + author_id, + &agent_id, + )?; + agent_id + }; + + let result_text = wait_for_agent(&ctx, &temper_api_url, &ctx.tenant, &agent_id)?; + set_success_result( + "SendReply", + &json!({ + "thread_id": thread_id, + "content": result_text, + "agent_entity_id": agent_id, + }), + ); + Ok(()) + })(); + + if let Err(error) = result { + 
set_error_result(&error);
+    }
+    0
+}
+
+fn resolve_temper_api_url(ctx: &Context, fields: &Value) -> String {
+    fields
+        .get("temper_api_url")
+        .and_then(|v| v.as_str())
+        .filter(|s| !s.is_empty())
+        .map(|s| s.to_string())
+        .or_else(|| ctx.config.get("temper_api_url").filter(|s| !s.is_empty()).cloned())
+        .unwrap_or_else(|| "http://127.0.0.1:3000".to_string())
+}
+
+fn odata_headers(tenant: &str) -> Vec<(String, String)> {
+    vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+        ("content-type".to_string(), "application/json".to_string()),
+        ("accept".to_string(), "application/json".to_string()),
+    ]
+}
+
+fn list_entities(ctx: &Context, url: &str, tenant: &str) -> Result<Vec<Value>, String> {
+    let resp = ctx.http_call("GET", url, &odata_headers(tenant), "")?;
+    if resp.status != 200 {
+        return Err(format!("GET {url} failed (HTTP {})", resp.status));
+    }
+    let parsed: Value = serde_json::from_str(&resp.body).unwrap_or_else(|_| json!({ "value": [] }));
+    Ok(parsed
+        .get("value")
+        .and_then(Value::as_array)
+        .cloned()
+        .unwrap_or_default())
+}
+
+fn find_active_session(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    channel_id: &str,
+    thread_id: &str,
+    _author_id: &str,
+) -> Result<Option<Value>, String> {
+    let filter = format!(
+        "$filter=Status eq 'Active' and ChannelId eq '{}' and ThreadId eq '{}'",
+        channel_id, thread_id
+    );
+    let sessions = list_entities(
+        ctx,
+        &format!("{temper_api_url}/tdata/ChannelSessions?{filter}"),
+        tenant,
+    )?;
+    Ok(sessions.into_iter().next())
+}
+
+fn resume_session(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    session_id: &str,
+) -> Result<(), String> {
+    let url = format!(
+        "{temper_api_url}/tdata/ChannelSessions('{session_id}')/Temper.Claw.ChannelSession.Resume"
+    );
+    let _ = ctx.http_call("POST", &url, &odata_headers(tenant), r#"{"last_message_at":"resumed"}"#)?;
+    Ok(())
+}
+
+fn find_route(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant:
&str,
+    channel_id: &str,
+) -> Result<Option<Value>, String> {
+    let routes = list_entities(ctx, &format!("{temper_api_url}/tdata/AgentRoutes"), tenant)?;
+    Ok(routes.into_iter().find(|route| {
+        nested_str_field(route, &["Status"]) == Some("Active")
+            && {
+                let route_channel_id = nested_str_field(route, &["ChannelId"]).unwrap_or("");
+                route_channel_id.is_empty() || route_channel_id == channel_id
+            }
+    }))
+}
+
+fn create_agent_from_route(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    route_config: &str,
+    route_soul_id: &str,
+    user_message: &str,
+) -> Result<String, String> {
+    let config: Value = serde_json::from_str(route_config).unwrap_or_else(|_| json!({}));
+    let create_resp = ctx.http_call("POST", &format!("{temper_api_url}/tdata/TemperAgents"), &odata_headers(tenant), "{}")?;
+    if !(200..300).contains(&create_resp.status) {
+        return Err(format!("create TemperAgent failed (HTTP {})", create_resp.status));
+    }
+    let parsed: Value = serde_json::from_str(&create_resp.body).unwrap_or_else(|_| json!({}));
+    let agent_id = parsed
+        .get("entity_id")
+        .or_else(|| parsed.get("Id"))
+        .and_then(Value::as_str)
+        .unwrap_or("")
+        .to_string();
+    if agent_id.is_empty() {
+        return Err("route_message: created TemperAgent missing entity_id".to_string());
+    }
+
+    let configure_body = json!({
+        "system_prompt": config.get("system_prompt").and_then(Value::as_str).unwrap_or(""),
+        "user_message": user_message,
+        "model": config.get("model").and_then(Value::as_str).unwrap_or("mock"),
+        "provider": config.get("provider").and_then(Value::as_str).unwrap_or("mock"),
+        "tools_enabled": config.get("tools_enabled").and_then(Value::as_str).unwrap_or("read_entity"),
+        "max_turns": config.get("max_turns").and_then(Value::as_str).unwrap_or("6"),
+        "sandbox_url": config.get("sandbox_url").and_then(Value::as_str).unwrap_or("http://127.0.0.1:9999"),
+        "workdir": config.get("workdir").and_then(Value::as_str).unwrap_or("/tmp/workspace"),
+        "soul_id": if route_soul_id.is_empty() {
config.get("soul_id").and_then(Value::as_str).unwrap_or("") + } else { + route_soul_id + }, + }); + let configure_url = format!( + "{temper_api_url}/tdata/TemperAgents('{agent_id}')/Temper.Agent.TemperAgent.Configure" + ); + let configure_resp = ctx.http_call("POST", &configure_url, &odata_headers(tenant), &configure_body.to_string())?; + if !(200..300).contains(&configure_resp.status) { + return Err(format!("configure TemperAgent failed (HTTP {})", configure_resp.status)); + } + + let provision_url = format!( + "{temper_api_url}/tdata/TemperAgents('{agent_id}')/Temper.Agent.TemperAgent.Provision" + ); + let provision_resp = ctx.http_call("POST", &provision_url, &odata_headers(tenant), "{}")?; + if !(200..300).contains(&provision_resp.status) { + return Err(format!("provision TemperAgent failed (HTTP {})", provision_resp.status)); + } + Ok(agent_id) +} + +fn create_session( + ctx: &Context, + temper_api_url: &str, + tenant: &str, + channel_id: &str, + thread_id: &str, + author_id: &str, + agent_id: &str, +) -> Result<(), String> { + let create_resp = ctx.http_call( + "POST", + &format!("{temper_api_url}/tdata/ChannelSessions"), + &odata_headers(tenant), + "{}", + )?; + if !(200..300).contains(&create_resp.status) { + return Err(format!("create ChannelSession failed (HTTP {})", create_resp.status)); + } + let parsed: Value = serde_json::from_str(&create_resp.body).unwrap_or_else(|_| json!({})); + let session_id = parsed + .get("entity_id") + .or_else(|| parsed.get("Id")) + .and_then(Value::as_str) + .unwrap_or("") + .to_string(); + if session_id.is_empty() { + return Err("ChannelSession creation missing entity_id".to_string()); + } + let create_url = format!( + "{temper_api_url}/tdata/ChannelSessions('{session_id}')/Temper.Claw.ChannelSession.Create" + ); + let body = json!({ + "channel_id": channel_id, + "thread_id": thread_id, + "author_id": author_id, + "agent_entity_id": agent_id, + "last_message_at": "created", + }); + let resp = ctx.http_call("POST", 
&create_url, &odata_headers(tenant), &body.to_string())?;
+    if !(200..300).contains(&resp.status) {
+        return Err(format!("ChannelSession.Create failed (HTTP {})", resp.status));
+    }
+    Ok(())
+}
+
+fn steer_existing_agent(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    agent_id: &str,
+    message: &str,
+) -> Result<(), String> {
+    let agent_url = format!("{temper_api_url}/tdata/TemperAgents('{agent_id}')");
+    let agent_resp = ctx.http_call("GET", &agent_url, &odata_headers(tenant), "")?;
+    let mut queue = if agent_resp.status == 200 {
+        let parsed: Value = serde_json::from_str(&agent_resp.body).unwrap_or_else(|_| json!({}));
+        serde_json::from_str::<Vec<Value>>(
+            nested_str_field(&parsed, &["SteeringMessages"]).unwrap_or("[]"),
+        )
+        .unwrap_or_default()
+    } else {
+        Vec::new()
+    };
+    queue.push(json!({ "content": message }));
+    let steer_url = format!(
+        "{temper_api_url}/tdata/TemperAgents('{agent_id}')/Temper.Agent.TemperAgent.Steer"
+    );
+    let body = json!({
+        "steering_messages": serde_json::to_string(&queue).unwrap_or_else(|_| "[]".to_string()),
+    });
+    let resp = ctx.http_call("POST", &steer_url, &odata_headers(tenant), &body.to_string())?;
+    if !(200..300).contains(&resp.status) {
+        return Err(format!("steer agent failed (HTTP {})", resp.status));
+    }
+    Ok(())
+}
+
+fn wait_for_agent(
+    ctx: &Context,
+    temper_api_url: &str,
+    tenant: &str,
+    agent_id: &str,
+) -> Result<String, String> {
+    let wait_url = format!(
+        "{temper_api_url}/observe/entities/TemperAgent/{agent_id}/wait?statuses=Completed,Failed,Cancelled&timeout_ms=300000&poll_ms=250"
+    );
+    let headers = vec![
+        ("x-tenant-id".to_string(), tenant.to_string()),
+        ("x-temper-principal-kind".to_string(), "admin".to_string()),
+        ("accept".to_string(), "application/json".to_string()),
+    ];
+    let resp = ctx.http_call("GET", &wait_url, &headers, "")?;
+    if resp.status != 200 {
+        return Err(format!("wait_for_agent failed (HTTP {})", resp.status));
+    }
+    let parsed: Value =
serde_json::from_str(&resp.body).unwrap_or_else(|_| json!({})); + Ok(parsed + .get("fields") + .and_then(|v| v.get("result")) + .or_else(|| parsed.get("fields").and_then(|v| v.get("Result"))) + .and_then(Value::as_str) + .unwrap_or("") + .to_string()) +} + +fn str_field<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> { + keys.iter() + .find_map(|key| value.get(*key).and_then(Value::as_str)) +} + +fn nested_str_field<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> { + str_field(value, keys).or_else(|| { + value.get("fields") + .and_then(|fields| str_field(fields, keys)) + }) +} diff --git a/os-apps/temper-channels/wasm/send_reply/Cargo.lock b/os-apps/temper-channels/wasm/send_reply/Cargo.lock new file mode 100644 index 00000000..ea8d2204 --- /dev/null +++ b/os-apps/temper-channels/wasm/send_reply/Cargo.lock @@ -0,0 +1,112 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "itoa" +version = "1.0.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" + +[[package]] +name = "memchr" +version = "2.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" + +[[package]] +name = "proc-macro2" +version = "1.0.106" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" +dependencies = [ + "unicode-ident", +] + +[[package]] +name = "quote" +version = "1.0.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" +dependencies = [ + "proc-macro2", +] + +[[package]] +name = "send-reply" +version = "0.1.0" +dependencies = [ + "temper-wasm-sdk", +] + +[[package]] +name = "serde" +version = "1.0.228" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" +dependencies = [ + "serde_core", +] + +[[package]] +name = "serde_core" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +dependencies = [ + "serde_derive", +] + +[[package]] +name = "serde_derive" +version = "1.0.228" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "serde_json" +version = "1.0.149" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +dependencies = [ + "itoa", + "memchr", + "serde", + "serde_core", + "zmij", +] + +[[package]] +name = "syn" +version = "2.0.117" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" +dependencies = [ + "proc-macro2", + "quote", + "unicode-ident", +] + +[[package]] +name = "temper-wasm-sdk" +version = "0.1.0" +dependencies = [ + "serde_json", +] + +[[package]] +name = "unicode-ident" +version = "1.0.24" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" + +[[package]] +name = "zmij" +version = "1.0.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa" diff --git a/os-apps/temper-channels/wasm/send_reply/Cargo.toml b/os-apps/temper-channels/wasm/send_reply/Cargo.toml new file mode 100644 index 00000000..72e0c2cb --- /dev/null +++ b/os-apps/temper-channels/wasm/send_reply/Cargo.toml @@ -0,0 +1,12 @@ +[package] +name = "send-reply" 
+version = "0.1.0" +edition = "2024" + +[lib] +crate-type = ["cdylib"] + +[workspace] + +[dependencies] +temper-wasm-sdk = { path = "../../../../crates/temper-wasm-sdk" } diff --git a/os-apps/temper-channels/wasm/send_reply/src/lib.rs b/os-apps/temper-channels/wasm/send_reply/src/lib.rs new file mode 100644 index 00000000..96931a00 --- /dev/null +++ b/os-apps/temper-channels/wasm/send_reply/src/lib.rs @@ -0,0 +1,52 @@ +use temper_wasm_sdk::prelude::*; + +#[unsafe(no_mangle)] +pub extern "C" fn run(_ctx_ptr: i32, _ctx_len: i32) -> i32 { + let result = (|| -> Result<(), String> { + let ctx = Context::from_host()?; + let fields = ctx.entity_state.get("fields").cloned().unwrap_or_else(|| json!({})); + let webhook_url = str_field(&fields, &["webhook_url", "WebhookUrl"]).unwrap_or(""); + let thread_id = str_field(&fields, &["thread_id", "ThreadId"]).unwrap_or(""); + let content = str_field(&fields, &["content", "Content"]).unwrap_or(""); + let agent_entity_id = + str_field(&fields, &["agent_entity_id", "AgentEntityId"]).unwrap_or(""); + + if webhook_url.is_empty() { + return Err("send_reply: webhook_url is empty".to_string()); + } + + let body = json!({ + "thread_id": thread_id, + "content": content, + "agent_entity_id": agent_entity_id, + }); + let headers = vec![ + ("content-type".to_string(), "application/json".to_string()), + ("x-tenant-id".to_string(), ctx.tenant.clone()), + ]; + let resp = ctx.http_call("POST", webhook_url, &headers, &body.to_string())?; + if !(200..300).contains(&resp.status) { + return Err(format!("send_reply: webhook POST failed (HTTP {})", resp.status)); + } + + set_success_result( + "ReplyDelivered", + &json!({ + "thread_id": thread_id, + "content": content, + "agent_entity_id": agent_entity_id, + }), + ); + Ok(()) + })(); + + if let Err(error) = result { + set_error_result(&error); + } + 0 +} + +fn str_field<'a>(value: &'a Value, keys: &[&str]) -> Option<&'a str> { + keys.iter() + .find_map(|key| value.get(*key).and_then(Value::as_str)) +} 
diff --git a/os-apps/temper-fs/wasm/blob_adapter/Cargo.lock b/os-apps/temper-fs/wasm/blob_adapter/Cargo.lock new file mode 100644 index 00000000..61929f59 --- /dev/null +++ b/os-apps/temper-fs/wasm/blob_adapter/Cargo.lock @@ -0,0 +1,7 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "blob-adapter" +version = "0.1.0" diff --git a/scripts/temper_agent_e2e_proof.py b/scripts/temper_agent_e2e_proof.py new file mode 100644 index 00000000..7d7045e7 --- /dev/null +++ b/scripts/temper_agent_e2e_proof.py @@ -0,0 +1,1424 @@ +#!/usr/bin/env python3 + +import json +import os +import subprocess +import sys +import time +import urllib.error +import urllib.parse +import urllib.request +from datetime import datetime, timezone +from pathlib import Path + + +REPO_ROOT = Path(__file__).resolve().parents[1] +ARTIFACT_ROOT = REPO_ROOT / ".tmp" / "temper-agent-proof" / "artifacts" +REPORT_PATH = REPO_ROOT / ".proof" / "temper-agent-e2e-proof.md" + +SERVER = os.environ.get("TEMPER_PROOF_SERVER", "http://127.0.0.1:3463") +BLOB_ENDPOINT = os.environ.get("TEMPER_PROOF_BLOB", "http://127.0.0.1:9987") +SANDBOX_URL = os.environ.get("TEMPER_PROOF_SANDBOX", "http://127.0.0.1:9989") +REPLY_LOG = Path( + os.environ.get( + "TEMPER_PROOF_REPLY_LOG", + str(REPO_ROOT / ".tmp" / "temper-agent-proof" / "reply" / "replies.jsonl"), + ) +) +SANDBOX_WORKDIR = os.environ.get( + "TEMPER_PROOF_WORKDIR", + str(REPO_ROOT / ".tmp" / "temper-agent-proof" / "sandbox"), +) +TENANT = os.environ.get( + "TEMPER_PROOF_TENANT", + f"temper-agent-proof-{datetime.now(timezone.utc).strftime('%Y%m%d%H%M%S')}", +) +MCP_BIN = os.environ.get("TEMPER_PROOF_MCP_BIN", str(REPO_ROOT / "target" / "debug" / "temper-mcp")) + +ADMIN_HEADERS = {"x-temper-principal-kind": "admin"} +SYSTEM_HEADERS = {"x-temper-principal-kind": "system"} + + +def ensure_dirs() -> None: + ARTIFACT_ROOT.mkdir(parents=True, exist_ok=True) + REPORT_PATH.parent.mkdir(parents=True, 
exist_ok=True)
+    REPLY_LOG.parent.mkdir(parents=True, exist_ok=True)
+
+
+def now_utc() -> str:
+    return datetime.now(timezone.utc).isoformat()
+
+
+def write_text(path: Path, text: str) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text(text, encoding="utf-8")
+
+
+def append_jsonl(path: Path, value) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    with path.open("a", encoding="utf-8") as handle:
+        handle.write(json.dumps(value, sort_keys=True) + "\n")
+
+
+def lookup(mapping, *keys):
+    if not isinstance(mapping, dict):
+        return None
+
+    def normalize_key(value) -> str:
+        return "".join(ch for ch in str(value) if ch.isalnum()).lower()
+
+    lowered = {normalize_key(k): v for k, v in mapping.items()}
+    for key in keys:
+        if key in mapping:
+            return mapping[key]
+        lower = normalize_key(key)
+        if lower in lowered:
+            return lowered[lower]
+    return None
+
+
+def entity_fields(entity):
+    return lookup(entity, "fields") or {}
+
+
+def entity_id(entity):
+    return lookup(entity, "entity_id", "Id", "id")
+
+
+def entity_status(entity):
+    return lookup(entity, "status", "Status")
+
+
+def entity_field(entity, *keys):
+    fields = entity_fields(entity)
+    value = lookup(fields, *keys)
+    if value is not None:
+        return value
+    return lookup(entity, *keys)
+
+
+def json_body_bytes(body) -> bytes:
+    return json.dumps(body).encode("utf-8")
+
+
+def request(
+    method: str,
+    path: str,
+    *,
+    tenant: str | None = None,
+    headers: dict | None = None,
+    json_body=None,
+    body: bytes | None = None,
+    content_type: str | None = None,
+    accept: str | None = "application/json",
+    expect: tuple[int, ...] | None = None,
+):
+    if path.startswith("http://") or path.startswith("https://"):
+        url = path
+    else:
+        url = SERVER.rstrip("/") + path
+    all_headers = {}
+    if tenant:
+        all_headers["x-tenant-id"] = tenant
+    if accept:
+        all_headers["accept"] = accept
+    if headers:
+        all_headers.update(headers)
+    if json_body is not None:
+        payload = json_body_bytes(json_body)
+        all_headers.setdefault("content-type", "application/json")
+    else:
+        payload = body
+        if content_type:
+            all_headers.setdefault("content-type", content_type)
+    req = urllib.request.Request(url, data=payload, method=method.upper(), headers=all_headers)
+    try:
+        with urllib.request.urlopen(req, timeout=120) as resp:
+            raw = resp.read()
+            status = resp.getcode()
+            resp_headers = dict(resp.headers.items())
+    except urllib.error.HTTPError as err:
+        raw = err.read()
+        status = err.code
+        resp_headers = dict(err.headers.items())
+    text = raw.decode("utf-8", errors="replace")
+    parsed = None
+    ctype = resp_headers.get("Content-Type", "")
+    if "json" in ctype or text.startswith("{") or text.startswith("["):
+        try:
+            parsed = json.loads(text)
+        except json.JSONDecodeError:
+            parsed = None
+    if expect and status not in expect:
+        raise RuntimeError(f"{method} {url} failed with HTTP {status}: {text[:600]}")
+    return {
+        "status": status,
+        "text": text,
+        "json": parsed,
+        "headers": resp_headers,
+        "url": url,
+    }
+
+
+def post_json(path: str, body, *, tenant: str | None = None, headers: dict | None = None, expect=(200, 201, 204)):
+    return request("POST", path, tenant=tenant, headers=headers, json_body=body, expect=expect)
+
+
+def put_json(path: str, body, *, tenant: str | None = None, headers: dict | None = None, expect=(200, 201, 204)):
+    return request("PUT", path, tenant=tenant, headers=headers, json_body=body, expect=expect)
+
+
+def put_text(path: str, text: str, *, tenant: str | None = None, headers: dict | None = None, expect=(200, 201, 204)):
+    return request(
+        "PUT",
+        path,
+        tenant=tenant,
+        headers=headers,
+        body=text.encode("utf-8"),
+        content_type="text/plain",
+        accept=None,
+        expect=expect,
+    )
+
+
+def get_json(path: str, *, tenant: str | None = None, headers: dict | None = None, expect=(200,)):
+    return request("GET", path, tenant=tenant, headers=headers, expect=expect)
+
+
+def install_app(tenant: str, app_name: str):
+    return post_json(
+        f"/api/os-apps/{app_name}/install",
+        {"tenant": tenant},
+        headers=ADMIN_HEADERS,
+    )["json"]
+
+
+def put_secret(tenant: str, key: str, value: str) -> None:
+    put_json(
+        f"/api/tenants/{tenant}/secrets/{key}",
+        {"value": value},
+        headers=ADMIN_HEADERS,
+        expect=(204,),
+    )
+
+
+def upload_wasm(tenant: str, name: str, wasm_path: Path):
+    return request(
+        "POST",
+        f"/api/wasm/modules/{name}",
+        tenant=tenant,
+        headers=ADMIN_HEADERS,
+        body=wasm_path.read_bytes(),
+        content_type="application/wasm",
+        expect=(200,),
+    )["json"]
+
+
+def create_entity(tenant: str, entity_set: str, fields: dict):
+    return post_json(
+        f"/tdata/{entity_set}",
+        fields,
+        tenant=tenant,
+        headers=ADMIN_HEADERS,
+    )["json"]
+
+
+def get_entity(tenant: str, entity_set: str, entity_id_value: str):
+    key = urllib.parse.quote(entity_id_value, safe="")
+    return get_json(
+        f"/tdata/{entity_set}('{key}')",
+        tenant=tenant,
+        headers=ADMIN_HEADERS,
+    )["json"]
+
+
+def list_entities(tenant: str, entity_set: str):
+    return get_json(
+        f"/tdata/{entity_set}",
+        tenant=tenant,
+        headers=ADMIN_HEADERS,
+    )["json"]["value"]
+
+
+def action_with_fallback(tenant: str, entity_set: str, entity_id_value: str, action_paths: list[str], body: dict):
+    key = urllib.parse.quote(entity_id_value, safe="")
+    last_error = None
+    for action_path in action_paths:
+        resp = request(
+            "POST",
+            f"/tdata/{entity_set}('{key}')/{action_path}",
+            tenant=tenant,
+            headers=ADMIN_HEADERS,
+            json_body=body,
+        )
+        if 200 <= resp["status"] < 300:
+            return resp["json"] or resp["text"]
+        last_error = resp
+        if resp["status"] not in (400, 404):
+            break
+    if last_error is None:
+        raise RuntimeError(f"no action path tried for {entity_set} {entity_id_value}")
+    raise RuntimeError(
+        f"action failed for {entity_set} {entity_id_value} via {action_paths}: "
+        f"HTTP {last_error['status']} {last_error['text'][:400]}"
+    )
+
+
+def wait_entity(tenant: str, entity_type: str, entity_id_value: str, statuses: list[str], timeout_ms: int = 120000):
+    query = urllib.parse.urlencode(
+        {
+            "statuses": ",".join(statuses),
+            "timeout_ms": str(timeout_ms),
+            "poll_ms": "250",
+        }
+    )
+    return get_json(
+        f"/observe/entities/{entity_type}/{urllib.parse.quote(entity_id_value, safe='')}/wait?{query}",
+        tenant=tenant,
+        headers=ADMIN_HEADERS,
+        expect=(200, 408),
+    )["json"]
+
+
+def wait_for_entities(tenant: str, entity_set: str, predicate, timeout_s: float = 10.0, poll_s: float = 0.25):
+    deadline = time.time() + timeout_s
+    while True:
+        matches = [entry for entry in list_entities(tenant, entity_set) if predicate(entry)]
+        if matches or time.time() >= deadline:
+            return matches
+        time.sleep(poll_s)
+
+
+def read_reply_lines() -> list[dict]:
+    if not REPLY_LOG.exists():
+        return []
+    raw_reply_lines = [
+        json.loads(line)
+        for line in REPLY_LOG.read_text(encoding="utf-8").splitlines()
+        if line.strip()
+    ]
+    reply_lines = []
+    for line in raw_reply_lines:
+        body = line.get("body")
+        if isinstance(body, str):
+            try:
+                parsed_body = json.loads(body)
+            except json.JSONDecodeError:
+                parsed_body = body
+            if isinstance(parsed_body, dict):
+                merged = dict(line)
+                merged.update(parsed_body)
+                line = merged
+        reply_lines.append(line)
+    return reply_lines
+
+
+def wait_for_reply(predicate, timeout_s: float = 10.0, poll_s: float = 0.25) -> list[dict]:
+    deadline = time.time() + timeout_s
+    while True:
+        reply_lines = read_reply_lines()
+        if any(predicate(line) for line in reply_lines) or time.time() >= deadline:
+            return reply_lines
+        time.sleep(poll_s)
+
+
+def capture_sse(tenant: str, entity_type: str, entity_id_value: str, output_path: Path, since: int = 0, max_time: int = 2):
+    cmd = [
+        "curl",
+        "-sN",
+        "--max-time",
+        str(max_time),
+        "-H",
+        f"x-tenant-id: {tenant}",
+        "-H",
+        "x-temper-principal-kind: admin",
+        f"{SERVER}/observe/entities/{entity_type}/{entity_id_value}/events?since={since}",
+    ]
+    result = subprocess.run(cmd, cwd=REPO_ROOT, capture_output=True, text=True)
+    write_text(output_path, result.stdout)
+    return result.stdout
+
+
+def create_file_asset(tenant: str, workspace_id: str, directory_id: str, path: str, content: str):
+    file_entity = create_entity(
+        tenant,
+        "Files",
+        {
+            "Name": Path(path).name,
+            "Path": path,
+            "DirectoryId": directory_id,
+            "WorkspaceId": workspace_id,
+            "MimeType": "text/markdown" if path.endswith(".md") else "text/plain",
+        },
+    )
+    file_id = entity_id(file_entity)
+    put_text(
+        f"/tdata/Files('{file_id}')/$value",
+        content,
+        tenant=tenant,
+        headers=ADMIN_HEADERS,
+    )
+    return file_entity
+
+
+def get_file_text(tenant: str, file_id: str) -> str:
+    return request(
+        "GET",
+        f"/tdata/Files('{urllib.parse.quote(file_id, safe='')}')/$value",
+        tenant=tenant,
+        headers=ADMIN_HEADERS,
+        accept=None,
+        expect=(200,),
+    )["text"]
+
+
+def clean_sandbox() -> None:
+    request(
+        "POST",
+        f"{SANDBOX_URL}/v1/processes/run",
+        headers={},
+        json_body={"command": f"rm -rf '{SANDBOX_WORKDIR}'/* 2>/dev/null || true", "workdir": SANDBOX_WORKDIR},
+        expect=(200,),
+    )
+
+
+def extract_prompt_from_sse(raw_sse: str) -> str:
+    event_name = None
+    for line in raw_sse.splitlines():
+        if line.startswith("event:"):
+            event_name = line.split(":", 1)[1].strip()
+        elif line.startswith("data:"):
+            payload = line.split(":", 1)[1].strip()
+            try:
+                data = json.loads(payload)
+            except json.JSONDecodeError:
+                continue
+            if event_name == "prompt_assembled":
+                nested = lookup(data, "data") or {}
+                return lookup(nested, "system_prompt") or lookup(data, "system_prompt") or ""
+            if event_name == "integration_progress" and lookup(data, "kind") == "prompt_assembled":
+                nested = lookup(data, "data") or {}
+                return lookup(nested, "system_prompt") or lookup(data, "system_prompt") or ""
+    return ""
+
+
+def parse_sse_events(raw_sse: str):
+    events = []
+    current = None
+    for line in raw_sse.splitlines():
+        if line.startswith("event:"):
+            current = {"event": line.split(":", 1)[1].strip()}
+        elif line.startswith("data:") and current is not None:
+            payload = line.split(":", 1)[1].strip()
+            try:
+                current["data"] = json.loads(payload)
+            except json.JSONDecodeError:
+                current["data"] = payload
+            events.append(current)
+            current = None
+    return events
+
+
+def latest_text_result_from_session(session_jsonl: str) -> str:
+    last = ""
+    for line in session_jsonl.splitlines():
+        if not line.strip():
+            continue
+        entry = json.loads(line)
+        if lookup(entry, "type") != "message":
+            continue
+        if lookup(entry, "role") != "assistant":
+            continue
+        content = lookup(entry, "content")
+        if isinstance(content, list):
+            texts = [block.get("text", "") for block in content if block.get("type") == "text"]
+            if texts:
+                last = "\n".join(texts)
+        elif isinstance(content, str):
+            last = content
+    return last
+
+
+def entity_result(entity, session_jsonl: str | None = None) -> str:
+    for key in ("result", "Result"):
+        value = entity_field(entity, key)
+        if isinstance(value, str) and value:
+            return value
+    if session_jsonl:
+        return latest_text_result_from_session(session_jsonl)
+    return ""
+
+
+def step(status: bool, expected: str, actual: str):
+    return {"status": "PASS" if status else "FAIL", "expected": expected, "actual": actual}
+
+
+class McpClient:
+    def __init__(self, binary_path: str, port: int, stderr_path: Path):
+        self.stderr_handle = stderr_path.open("w", encoding="utf-8")
+        self.process = subprocess.Popen(
+            [
+                binary_path,
+                "--port",
+                str(port),
+                "--agent-id",
+                "proof-harness",
+                "--agent-type",
+                "human",
+                "--session-id",
+                f"proof-{int(time.time())}",
+            ],
+            cwd=REPO_ROOT,
+            stdin=subprocess.PIPE,
+            stdout=subprocess.PIPE,
+            stderr=self.stderr_handle,
+            text=True,
+            bufsize=1,
+        )
+        self.next_id = 1
+
+    def send(self, payload):
+        assert self.process.stdin is not None
+        self.process.stdin.write(json.dumps(payload) + "\n")
+        self.process.stdin.flush()
+
+    def recv(self, expected_id: int):
+        assert self.process.stdout is not None
+        while True:
+            line = self.process.stdout.readline()
+            if not line:
+                raise RuntimeError("temper-mcp closed stdout unexpectedly")
+            message = json.loads(line)
+            if message.get("id") != expected_id:
+                continue
+            return message
+
+    def initialize(self):
+        req_id = self.next_id
+        self.next_id += 1
+        self.send(
+            {
+                "jsonrpc": "2.0",
+                "id": req_id,
+                "method": "initialize",
+                "params": {
+                    "protocolVersion": "2024-11-05",
+                    "capabilities": {},
+                    "clientInfo": {"name": "pi-proof", "version": "1.0.0"},
+                },
+            }
+        )
+        self.recv(req_id)
+        self.send({"jsonrpc": "2.0", "method": "notifications/initialized"})
+
+    def execute(self, code: str):
+        req_id = self.next_id
+        self.next_id += 1
+        self.send(
+            {
+                "jsonrpc": "2.0",
+                "id": req_id,
+                "method": "tools/call",
+                "params": {
+                    "name": "execute",
+                    "arguments": {"code": code},
+                },
+            }
+        )
+        response = self.recv(req_id)
+        if "error" in response:
+            raise RuntimeError(response["error"]["message"])
+        result = response["result"]
+        text = ""
+        content = result.get("content") or []
+        if content:
+            text = content[0].get("text", "")
+        if result.get("isError"):
+            raise RuntimeError(text)
+        try:
+            return json.loads(text)
+        except json.JSONDecodeError:
+            return text
+
+    def close(self):
+        if self.process.poll() is None:
+            self.process.terminate()
+            try:
+                self.process.wait(timeout=5)
+            except subprocess.TimeoutExpired:
+                self.process.kill()
+        self.stderr_handle.close()
+
+
+def build_mock_plan(steps: list[dict]) -> str:
+    return json.dumps({"mock_plan": {"steps": steps}}, separators=(",", ":"))
+
+
+def main() -> int:
+    ensure_dirs()
+    clean_sandbox()
+    REPLY_LOG.write_text("", encoding="utf-8")
+
+    artifact_log = ARTIFACT_ROOT / "proof-log.jsonl"
+    artifact_log.unlink(missing_ok=True)
+
+    report = {
+        "date": now_utc(),
+        "tenant": TENANT,
+        "branch": subprocess.check_output(["git", "branch", "--show-current"], cwd=REPO_ROOT, text=True).strip(),
+        "commit": subprocess.check_output(["git", "rev-parse", "HEAD"], cwd=REPO_ROOT, text=True).strip(),
+        "steps": {},
+    }
+
+    health = get_json("/observe/health", headers=ADMIN_HEADERS)["json"]
+    write_text(ARTIFACT_ROOT / "server-health.json", json.dumps(health, indent=2))
+
+    apps = {
+        "temper-fs": install_app(TENANT, "temper-fs"),
+        "temper-agent": install_app(TENANT, "temper-agent"),
+        "temper-channels": install_app(TENANT, "temper-channels"),
+    }
+    write_text(ARTIFACT_ROOT / "installed-apps.json", json.dumps(apps, indent=2))
+
+    put_secret(TENANT, "temper_api_url", SERVER)
+    put_secret(TENANT, "blob_endpoint", BLOB_ENDPOINT)
+
+    modules = {
+        "blob_adapter": REPO_ROOT / "os-apps" / "temper-fs" / "wasm" / "blob_adapter.wasm",
+        "llm_caller": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "llm_caller" / "target" / "wasm32-unknown-unknown" / "release" / "llm_caller.wasm",
+        "tool_runner": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "tool_runner" / "target" / "wasm32-unknown-unknown" / "release" / "tool_runner.wasm",
+        "sandbox_provisioner": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "sandbox_provisioner" / "target" / "wasm32-unknown-unknown" / "release" / "sandbox_provisioner.wasm",
+        "context_compactor": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "context_compactor" / "target" / "wasm32-unknown-unknown" / "release" / "context_compactor.wasm",
+        "steering_checker": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "steering_checker" / "target" / "wasm32-unknown-unknown" / "release" / "steering_checker.wasm",
+        "coding_agent_runner": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "coding_agent_runner" / "target" / "wasm32-unknown-unknown" / "release" / "coding_agent_runner.wasm",
+        "heartbeat_scan": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "heartbeat_scan" / "target" / "wasm32-unknown-unknown" / "release" / "heartbeat_scan.wasm",
+        "heartbeat_scheduler": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "heartbeat_scheduler" / "target" / "wasm32-unknown-unknown" / "release" / "heartbeat_scheduler.wasm",
+        "cron_trigger": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "cron_trigger" / "target" / "wasm32-unknown-unknown" / "release" / "cron_trigger.wasm",
+        "cron_scheduler_check": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "cron_scheduler_check" / "target" / "wasm32-unknown-unknown" / "release" / "cron_scheduler_check.wasm",
+        "cron_scheduler_heartbeat": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "cron_scheduler_heartbeat" / "target" / "wasm32-unknown-unknown" / "release" / "cron_scheduler_heartbeat.wasm",
+        "workspace_restorer": REPO_ROOT / "os-apps" / "temper-agent" / "wasm" / "workspace_restorer" / "target" / "wasm32-unknown-unknown" / "release" / "workspace_restorer.wasm",
+        "channel_connect": REPO_ROOT / "os-apps" / "temper-channels" / "wasm" / "channel_connect" / "target" / "wasm32-unknown-unknown" / "release" / "channel_connect.wasm",
+        "route_message": REPO_ROOT / "os-apps" / "temper-channels" / "wasm" / "route_message" / "target" / "wasm32-unknown-unknown" / "release" / "route_message.wasm",
+        "send_reply": REPO_ROOT / "os-apps" / "temper-channels" / "wasm" / "send_reply" / "target" / "wasm32-unknown-unknown" / "release" / "send_reply.wasm",
+    }
+    upload_results = {}
+    for name, wasm_path in modules.items():
+        upload_results[name] = upload_wasm(TENANT, name, wasm_path)
+        append_jsonl(artifact_log, {"type": "wasm_upload", "name": name, "path": str(wasm_path), "result": upload_results[name]})
+    write_text(ARTIFACT_ROOT / "uploaded-modules.json", json.dumps(upload_results, indent=2))
+
+    workspace = create_entity(TENANT, "Workspaces", {"Name": "Pi Proof Workspace", "QuotaLimit": 100000000})
+    directory = create_entity(
+        TENANT,
+        "Directories",
+        {"Name": "root", "Path": "/", "WorkspaceId": entity_id(workspace)},
+    )
+    write_text(ARTIFACT_ROOT / "fs-root.json", json.dumps({"workspace": workspace, "directory": directory}, indent=2))
+
+    soul_md = """# Proof Soul
+
+## Identity
+You are Proof Soul, a governed Temper agent used to verify the Pi architecture rewrite.
+
+## Instructions
+- Prefer deterministic mock runs for verification.
+- Surface memory and skills in the prompt.
+- Use tools only when the proof plan requires them.
+
+## Capabilities
+- Run sandbox tools
+- Spawn governed child agents
+- Save and recall memories
+
+## Constraints
+- Do not use destructive commands.
+- Stay inside the provided workspace.
+"""
+    skill_one_md = """# code-reviewer
+
+Inspect code changes for regressions, missing tests, and risky assumptions.
+"""
+    skill_two_md = """# file-search
+
+Locate relevant files quickly and summarize the signal, not the noise.
+"""
+
+    soul_file = create_file_asset(TENANT, entity_id(workspace), entity_id(directory), "/soul.md", soul_md)
+    skill_one_file = create_file_asset(TENANT, entity_id(workspace), entity_id(directory), "/skills/code-reviewer.md", skill_one_md)
+    skill_two_file = create_file_asset(TENANT, entity_id(workspace), entity_id(directory), "/skills/file-search.md", skill_two_md)
+
+    soul = create_entity(
+        TENANT,
+        "AgentSouls",
+        {
+            "Name": "Proof Soul",
+            "Description": "Pi agent rewrite proof identity",
+            "ContentFileId": entity_id(soul_file),
+            "AuthorId": "proof-harness",
+        },
+    )
+    soul_id = entity_id(soul)
+    action_with_fallback(
+        TENANT,
+        "AgentSouls",
+        soul_id,
+        ["Temper.Agent.AgentSoul.Publish", "Temper.Agent.Publish"],
+        {},
+    )
+
+    skill_one = create_entity(
+        TENANT,
+        "AgentSkills",
+        {
+            "Name": "code-reviewer",
+            "Description": "Review changes for bugs and missing tests.",
+            "ContentFileId": entity_id(skill_one_file),
+            "Scope": "global",
+        },
+    )
+    skill_two = create_entity(
+        TENANT,
+        "AgentSkills",
+        {
+            "Name": "file-search",
+            "Description": "Find relevant files and summarize their purpose.",
+            "ContentFileId": entity_id(skill_two_file),
+            "Scope": "global",
+        },
+    )
+    seeded_memory = [
+        create_entity(
+            TENANT,
+            "AgentMemorys",
+            {
+                "Key": "user-profile",
+                "Content": "The proof user prefers exact verification over discussion.",
+                "MemoryType": "user",
+                "SoulId": soul_id,
+                "AuthorAgentId": "proof-harness",
+            },
+        ),
+        create_entity(
+            TENANT,
+            "AgentMemorys",
+            {
+                "Key": "project-context",
+                "Content": "Temper Pi rewrite proof must capture SSE, session trees, cron, heartbeat, channels, and MCP.",
+                "MemoryType": "project",
+                "SoulId": soul_id,
+                "AuthorAgentId": "proof-harness",
+            },
+        ),
+    ]
+    setup_snapshot = {
+        "soul": get_entity(TENANT, "AgentSouls", soul_id),
+        "skills": [get_entity(TENANT, "AgentSkills", entity_id(skill_one)), get_entity(TENANT, "AgentSkills", entity_id(skill_two))],
+        "memory": [get_entity(TENANT, "AgentMemorys", entity_id(entry)) for entry in seeded_memory],
+    }
+    write_text(ARTIFACT_ROOT / "setup-assets.json", json.dumps(setup_snapshot, indent=2))
+
+    channel = create_entity(
+        TENANT,
+        "Channels",
+        {
+            "ChannelType": "webhook",
+            "ChannelId": "proof-webhook",
+            "DefaultAgentConfig": json.dumps(
+                {
+                    "provider": "mock",
+                    "model": "mock-proof",
+                    "tools_enabled": "",
+                    "max_turns": "4",
+                    "sandbox_url": SANDBOX_URL,
+                    "workdir": SANDBOX_WORKDIR,
+                    "soul_id": soul_id,
+                },
+                separators=(",", ":"),
+            ),
+            "WebhookUrl": "http://127.0.0.1:9988",
+        },
+    )
+    channel_id = entity_id(channel)
+    action_with_fallback(
+        TENANT,
+        "Channels",
+        channel_id,
+        ["Temper.OpenClaw.Channel.Connect", "Temper.OpenClaw.Connect"],
+        {},
+    )
+    route = create_entity(
+        TENANT,
+        "AgentRoutes",
+        {
+            "BindingTier": "channel",
+            "ChannelId": "proof-webhook",
+            "MatchPattern": ".*",
+            "AgentConfig": json.dumps(
+                {
+                    "provider": "mock",
+                    "model": "mock-proof",
+                    "tools_enabled": "",
+                    "max_turns": "4",
+                    "sandbox_url": SANDBOX_URL,
+                    "workdir": SANDBOX_WORKDIR,
+                },
+                separators=(",", ":"),
+            ),
+            "SoulId": soul_id,
+        },
+    )
+    write_text(
+        ARTIFACT_ROOT / "channel-setup.json",
+        json.dumps(
+            {
+                "channel": get_entity(TENANT, "Channels", channel_id),
+                "route": get_entity(TENANT, "AgentRoutes", entity_id(route)),
+            },
+            indent=2,
+        ),
+    )
+
+    direct_plan = build_mock_plan(
+        [
+            {
+                "text": "Starting direct path",
+                "tool_calls": [
+                    {
+                        "name": "bash",
+                        "input": {
+                            "command": "sleep 2 && printf direct-path-bash",
+                            "workdir": SANDBOX_WORKDIR,
+                        },
+                    }
+                ],
+            },
+            {"final_text": "Waiting for steering check."},
+            {
+                "text": "Steering applied: {{latest_user}}",
+                "tool_calls": [
+                    {
+                        "name": "save_memory",
+                        "input": {
+                            "key": "proof-direct-memory",
+                            "content": "saved from direct path",
+                            "memory_type": "project",
+                        },
+                    }
+                ],
+            },
+            {"final_text": "Direct path finished with memory keys {{memory_keys}}."},
+        ]
+    )
+
+    direct_agent = create_entity(TENANT, "TemperAgents", {"TemperAgentId": "proof-direct"})
+    direct_id = entity_id(direct_agent)
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        direct_id,
+        ["Temper.Agent.TemperAgent.Configure", "Temper.Agent.Configure"],
+        {
+            "system_prompt": "Override: include the DIRECT-OVERRIDE marker.",
+            "user_message": direct_plan,
+            "model": "mock-proof",
+            "provider": "mock",
+            "max_turns": "8",
+            "tools_enabled": "bash,save_memory",
+            "workdir": SANDBOX_WORKDIR,
+            "sandbox_url": SANDBOX_URL,
+            "soul_id": soul_id,
+            "max_follow_ups": "5",
+        },
+    )
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        direct_id,
+        ["Temper.Agent.TemperAgent.Provision", "Temper.Agent.Provision"],
+        {},
+    )
+    time.sleep(0.5)
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        direct_id,
+        ["Temper.Agent.TemperAgent.Steer", "Temper.Agent.Steer"],
+        {"steering_messages": json.dumps([{"content": "Follow the steering marker ST-123"}])},
+    )
+    direct_wait = wait_entity(TENANT, "TemperAgent", direct_id, ["Completed", "Failed", "Cancelled"], 120000)
+    direct_entity = get_entity(TENANT, "TemperAgents", direct_id)
+    direct_session = get_file_text(TENANT, entity_field(direct_entity, "session_file_id", "SessionFileId"))
+    direct_sse = capture_sse(TENANT, "TemperAgent", direct_id, ARTIFACT_ROOT / "direct-events.sse")
+    direct_result = entity_result(direct_wait, direct_session)
+    direct_prompt = extract_prompt_from_sse(direct_sse)
+    write_text(ARTIFACT_ROOT / "direct-agent.json", json.dumps(direct_entity, indent=2))
+    write_text(ARTIFACT_ROOT / "direct-session.jsonl", direct_session)
+    write_text(ARTIFACT_ROOT / "direct-prompt.txt", direct_prompt)
+
+    direct_memories = list_entities(TENANT, "AgentMemorys")
+    direct_saved = [entry for entry in direct_memories if entity_field(entry, "Key") == "proof-direct-memory"]
+
+    report["steps"]["A"] = {
+        "A1": step(entity_field(direct_entity, "SoulId") == soul_id, "Agent created with soul_id bound", f"soul_id={entity_field(direct_entity, 'SoulId')}"),
+        "A4": step("event: state_change" in direct_sse, "SSE replay returns lifecycle events", "captured direct-events.sse"),
+        "A5": step(
+            all(marker in direct_prompt for marker in ["Proof Soul", "", ""]),
+            "Prompt includes soul, skills, and memory blocks",
+            direct_prompt[:300],
+        ),
+        "A6": step(
+            "ProcessToolCalls" in direct_sse and "HandleToolResults" in direct_sse,
+            "Thinking/Executing loop is visible in events",
+            "ProcessToolCalls/HandleToolResults present" if "ProcessToolCalls" in direct_sse else "missing loop markers",
+        ),
+        "A7": step('"type":"message"' in direct_session and "s-" in direct_session, "Session tree persisted JSONL entries and steering branch", direct_session[:240]),
+        "A8": step("ST-123" in direct_sse or "ST-123" in direct_session, "Steering injection stored and observable", "steering marker present"),
+        "A9": step("ContinueWithSteering" in direct_sse, "Steering caused a continue transition", "ContinueWithSteering seen" if "ContinueWithSteering" in direct_sse else "missing"),
+        "A10": step(entity_status(direct_wait) == "Completed", "Agent completed successfully", direct_result),
+        "A11": step(bool(direct_saved), "save_memory created a new AgentMemory", f"count={len(direct_saved)}"),
+    }
+
+    channel_plan = build_mock_plan([{"final_text": "Channel proof reply"}])
+    receive_result = action_with_fallback(
+        TENANT,
+        "Channels",
+        channel_id,
+        ["Temper.OpenClaw.Channel.ReceiveMessage", "Temper.OpenClaw.ReceiveMessage"],
+        {
+            "message_id": "msg-1",
+            "author_id": "user-1",
+            "thread_id": "thread-1",
+            "content": channel_plan,
+        },
+    )
+    channel_sessions = wait_for_entities(
+        TENANT,
+        "ChannelSessions",
+        lambda entry: entity_field(entry, "ThreadId") == "thread-1",
+    )
+    channel_session = channel_sessions[0]
+    channel_agent_id = entity_field(channel_session, "AgentEntityId")
+    channel_agent = get_entity(TENANT, "TemperAgents", channel_agent_id)
+    channel_wait = wait_entity(TENANT, "TemperAgent", channel_agent_id, ["Completed", "Failed", "Cancelled"], 60000)
+    reply_lines = wait_for_reply(
+        lambda line: line.get("content") == "Channel proof reply"
+        and line.get("thread_id") == "thread-1",
+        timeout_s=10.0,
+        poll_s=0.25,
+    )
+    write_text(
+        ARTIFACT_ROOT / "channel-result.json",
+        json.dumps(
+            {
+                "receive_result": receive_result,
+                "session": channel_session,
+                "agent": channel_agent,
+                "wait": channel_wait,
+                "reply_lines": reply_lines,
+            },
+            indent=2,
+        ),
+    )
+    report["steps"]["B"] = {
+        "B1": step(True, "Channel.ReceiveMessage accepted webhook payload", "ReceiveMessage executed"),
+        "B2": step(bool(channel_sessions), "ChannelSession created for thread", f"session_id={entity_id(channel_session)}"),
+        "B3": step(entity_field(channel_agent, "SoulId") == soul_id, "Channel route spawned agent with route soul_id", f"soul_id={entity_field(channel_agent, 'SoulId')}"),
+        "B4": step(entity_status(channel_wait) == "Completed", "Channel-triggered agent completed", entity_result(channel_wait)),
+        "B5": step(any(line.get("content") == "Channel proof reply" for line in reply_lines), "send_reply delivered the agent result", json.dumps(reply_lines[-1]) if reply_lines else "no reply"),
+    }
+
+    child_plan = build_mock_plan(
+        [
+            {
+                "text": "child start",
+                "tool_calls": [
+                    {
+                        "name": "bash",
+                        "input": {
+                            "command": "sleep 2 && printf child-ready",
+                            "workdir": SANDBOX_WORKDIR,
+                        },
+                    }
+                ],
+            },
+            {"final_text": "Child waiting for steering."},
+            {"final_text": "Child completed after steering: {{latest_user}}"},
+        ]
+    )
+    subagent_plan = build_mock_plan(
+        [
+            {
+                "text": "spawning child",
+                "tool_calls": [
+                    {
+                        "name": "spawn_agent",
+                        "input": {
+                            "task": child_plan,
+                            "agent_id": "proof-sub-child",
+                            "provider": "mock",
+                            "model": "mock-proof",
+                            "max_turns": 6,
+                            "tools": "bash",
+                            "soul_id": soul_id,
+                            "background": True,
+                        },
+                    }
+                ],
+            },
+            {
+                "text": "managing child",
+                "tool_calls": [
+                    {"name": "list_agents", "input": {}},
+                    {"name": "steer_agent", "input": {"agent_id": "proof-sub-child", "message": "STEERED-CHILD"}},
+                    {"name": "run_coding_agent", "input": {"agent_type": "claude-code", "task": "subagent proof task", "workdir": SANDBOX_WORKDIR}},
+                ],
+            },
+            {"final_text": "Subagent parent done"},
+        ]
+    )
+
+    sub_parent = create_entity(TENANT, "TemperAgents", {"TemperAgentId": "proof-sub-parent"})
+    sub_parent_id = entity_id(sub_parent)
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        sub_parent_id,
+        ["Temper.Agent.TemperAgent.Configure", "Temper.Agent.Configure"],
+        {
+            "system_prompt": "Subagent proof parent.",
+            "user_message": subagent_plan,
+            "model": "mock-proof",
+            "provider": "mock",
+            "max_turns": "8",
+            "tools_enabled": "spawn_agent,list_agents,steer_agent,run_coding_agent",
+            "workdir": SANDBOX_WORKDIR,
+            "sandbox_url": SANDBOX_URL,
+            "soul_id": soul_id,
+        },
+    )
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        sub_parent_id,
+        ["Temper.Agent.TemperAgent.Provision", "Temper.Agent.Provision"],
+        {},
+    )
+    sub_parent_wait = wait_entity(TENANT, "TemperAgent", sub_parent_id, ["Completed", "Failed", "Cancelled"], 120000)
+    sub_parent_entity = get_entity(TENANT, "TemperAgents", sub_parent_id)
+    sub_parent_session = get_file_text(TENANT, entity_field(sub_parent_entity, "session_file_id", "SessionFileId"))
+    sub_child_entities = wait_for_entities(
+        TENANT,
+        "TemperAgents",
+        lambda entry: entity_field(entry, "TemperAgentId") == "proof-sub-child"
+        and entity_field(entry, "ParentAgentId") == sub_parent_id,
+    )
+    sub_child_entity = sub_child_entities[0]
+    sub_child_id = entity_id(sub_child_entity)
+    sub_child_wait = wait_entity(TENANT, "TemperAgent", sub_child_id, ["Completed", "Failed", "Cancelled"], 120000)
+    sub_child_session = get_file_text(TENANT, entity_field(sub_child_entity, "session_file_id", "SessionFileId"))
+    write_text(ARTIFACT_ROOT / "subagent-parent-session.jsonl", sub_parent_session)
+    write_text(ARTIFACT_ROOT / "subagent-child-session.jsonl", sub_child_session)
+
+    report["steps"]["C"] = {
+        "C1": step(True, "An orchestrator entity ran WASM that spawned a TemperAgent", f"parent_agent={sub_parent_id}"),
+        "C2": step(entity_field(sub_child_entity, "ParentAgentId") == sub_parent_id, "Child TemperAgent created with parent_agent_id", f"parent_agent_id={entity_field(sub_child_entity, 'ParentAgentId')}"),
+        "C3": step(entity_status(sub_child_wait) == "Completed", "Child agent completed and result was observable", entity_result(sub_child_wait, sub_child_session)),
+    }
+    report["steps"]["S"] = {
+        "S1": step(True, "Parent agent created with spawn_agent in tools", "tools_enabled includes spawn_agent"),
+        "S2": step("proof-sub-child" in sub_parent_session, "Parent invoked spawn_agent", "child id present in parent session"),
+        "S3": step(entity_field(sub_child_entity, "ParentAgentId") == sub_parent_id, "Child links back to parent", f"ParentAgentId={entity_field(sub_child_entity, 'ParentAgentId')}"),
+        "S4": step("STEERED-CHILD" in sub_child_session or "STEERED-CHILD" in entity_result(sub_child_wait, sub_child_session), "Parent steered child agent", entity_result(sub_child_wait, sub_child_session)),
+        "S5": step("proof-sub-child" in sub_parent_session and "- proof-sub-child:" in sub_parent_session, "list_agents exposed child status", "child id visible in tool result"),
+        "S6": step("Child completed after steering" in entity_result(sub_child_wait, sub_child_session), "Parent/child flow produced child result", entity_result(sub_child_wait, sub_child_session)),
+        "S7": step("run_coding_agent" in sub_parent_session, "Parent invoked run_coding_agent", "tool result captured"),
+        "S8": step("claude --permission-mode bypassPermissions --print 'subagent proof task'" in sub_parent_session, "CLI command matched expected claude-code pattern", "command string present"),
+    }
+
+    depth_plan = build_mock_plan(
+        [
+            {"tool_calls": [{"name": "spawn_agent", "input": {"task": build_mock_plan([{"final_text": "never"}])}}]},
+            {"final_text": "depth-guard-done"},
+        ]
+    )
+    depth_agent = create_entity(TENANT, "TemperAgents", {"TemperAgentId": "proof-depth-guard"})
+    depth_id = entity_id(depth_agent)
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        depth_id,
+        ["Temper.Agent.TemperAgent.Configure", "Temper.Agent.Configure"],
+        {
+            "user_message": depth_plan,
+            "model": "mock-proof",
+            "provider": "mock",
+            "max_turns": "4",
+            "tools_enabled": "spawn_agent",
+            "agent_depth": 5,
+            "soul_id": soul_id,
+            "sandbox_url": SANDBOX_URL,
+            "workdir": SANDBOX_WORKDIR,
+        },
+    )
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        depth_id,
+        ["Temper.Agent.TemperAgent.Provision", "Temper.Agent.Provision"],
+        {},
+    )
+    depth_wait = wait_entity(TENANT, "TemperAgent", depth_id, ["Completed", "Failed", "Cancelled"], 60000)
+    depth_entity = get_entity(TENANT, "TemperAgents", depth_id)
+    depth_session_file_id = entity_field(depth_entity, "session_file_id", "SessionFileId")
+    depth_session = get_file_text(TENANT, depth_session_file_id) if depth_session_file_id else ""
+    report["steps"]["S"]["S9"] = step(
+        "agent_depth guard hit" in depth_session,
+        "agent_depth guard prevented deep recursion",
+        "guard message present" if "agent_depth guard hit" in depth_session else "guard missing",
+    )
+
+    mcp = McpClient(MCP_BIN, 3463, ARTIFACT_ROOT / "temper-mcp.stderr.log")
+    try:
+        mcp.initialize()
+        mcp_plan = json.dumps(build_mock_plan([{"final_text": "MCP path ok"}]))
+        mcp_create = mcp.execute(
+            f"""
+agent = await temper.create('{TENANT}', 'TemperAgents', {{}})
+aid = agent['entity_id']
+await temper.action('{TENANT}', 'TemperAgents', aid, 'Agent.TemperAgent.Configure', {{
+    'user_message': {mcp_plan},
+    'model': 'mock-proof',
+    'provider': 'mock',
+    'max_turns': '4',
+    'tools_enabled': '',
+    'soul_id': '{soul_id}',
+    'sandbox_url': '{SANDBOX_URL}',
+    'workdir': {json.dumps(SANDBOX_WORKDIR)}
+}})
+await temper.action('{TENANT}', 'TemperAgents', aid, 'Agent.TemperAgent.Provision', {{}})
+return {{'agent_id': aid}}
+"""
+        )
+        mcp_agent_id = mcp_create["agent_id"]
+        mcp_wait = wait_entity(TENANT, "TemperAgent", mcp_agent_id, ["Completed", "Failed", "Cancelled"], 60000)
+        mcp_entity = mcp.execute(f"return await temper.get('{TENANT}', 'TemperAgents', '{mcp_agent_id}')")
+        write_text(
+            ARTIFACT_ROOT / "mcp-results.json",
+            json.dumps({"create": mcp_create, "entity": mcp_entity, "wait": mcp_wait}, indent=2),
+        )
+    finally:
+        mcp.close()
+    report["steps"]["D"] = {
+        "D1": step(True, "MCP created, configured, and provisioned an agent", f"agent_id={mcp_agent_id}"),
+        "D2": step(entity_status(mcp_wait) == "Completed", "MCP-observed agent reached Completed", entity_result(mcp_wait)),
+        "D3": step(entity_result(mcp_wait) == "MCP path ok", "MCP result matched expected output", entity_result(mcp_wait)),
+    }
+
+    cron_template = build_mock_plan([{"final_text": "cron run {{run_count}}"}])
+    cron_job = create_entity(
+        TENANT,
+        "CronJobs",
+        {
+            "Name": "proof-cron",
+            "Schedule": "* * * * *",
+            "SoulId": soul_id,
+            "UserMessageTemplate": cron_template,
+            "Model": "mock-proof",
+            "Provider": "mock",
+            "ToolsEnabled": "",
+            "SandboxUrl": SANDBOX_URL,
+            "MaxTurns": "4",
+            "MaxRuns": "2",
+        },
+    )
+    cron_id = entity_id(cron_job)
+    action_with_fallback(
+        TENANT,
+        "CronJobs",
+        cron_id,
+        ["Temper.Agent.CronJob.Activate", "Temper.Agent.Activate"],
+        {},
+    )
+    action_with_fallback(
+        TENANT,
+        "CronJobs",
+        cron_id,
+        ["Temper.Agent.CronJob.Trigger", "Temper.Agent.Trigger"],
+        {"last_run_at": now_utc()},
+    )
+    cron_after_first_matches = wait_for_entities(
+        TENANT,
+        "CronJobs",
+        lambda entry: entity_id(entry) == cron_id and bool(entity_field(entry, "LastAgentId")),
+        timeout_s=20.0,
+        poll_s=0.25,
+    )
+    if not cron_after_first_matches:
+        raise RuntimeError(f"cron proof: no last_agent_id observed for CronJob {cron_id}")
+    cron_after_first = cron_after_first_matches[0]
+    cron_agent_id = entity_field(cron_after_first, "LastAgentId")
+    cron_agent_wait = wait_entity(TENANT, "TemperAgent", cron_agent_id, ["Completed", "Failed", "Cancelled"], 60000)
+    action_with_fallback(
+        TENANT,
+        "CronJobs",
+        cron_id,
+        ["Temper.Agent.CronJob.Trigger", "Temper.Agent.Trigger"],
+        {"last_run_at": now_utc()},
+    )
+    cron_after_second_matches = wait_for_entities(
+        TENANT,
+        "CronJobs",
+        lambda entry: entity_id(entry) == cron_id and int(entity_field(entry, "RunCount") or 0) >= 2,
+        timeout_s=20.0,
+        poll_s=0.25,
+    )
+    if not cron_after_second_matches:
+        raise RuntimeError(f"cron proof: run_count did not reach 2 for CronJob {cron_id}")
+    cron_after_second = cron_after_second_matches[0]
+    write_text(
+        ARTIFACT_ROOT / "cron-results.json",
+        json.dumps({"job_after_first": cron_after_first, "job_after_second": cron_after_second, "agent_wait": cron_agent_wait}, indent=2),
+    )
+    report["steps"]["E"] = {
+        "E1": step(True, "CronJob entity created", f"cron_id={cron_id}"),
+        "E2": step(entity_status(cron_after_first) == "Active", "Cron job activated", f"status={entity_status(cron_after_first)}"),
+        "E3": step(True, "Manual Trigger action executed", f"last_agent_id={cron_agent_id}"),
+        "E4": step(bool(cron_agent_id), "Cron-triggered TemperAgent was created", f"agent_id={cron_agent_id}"),
+        "E5": step(entity_field(cron_after_first, "LastAgentId") == cron_agent_id, "CronJob tracked last_agent_id", f"LastAgentId={entity_field(cron_after_first, 'LastAgentId')}"),
+        "E6": step(int(entity_field(cron_after_second, "RunCount") or 0) >= 2, "Second trigger incremented run_count", f"RunCount={entity_field(cron_after_second, 'RunCount')}"),
+    }
+
+    heartbeat_agent = create_entity(TENANT, "TemperAgents", {"TemperAgentId": "proof-heartbeat"})
+    heartbeat_agent_id = entity_id(heartbeat_agent)
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        heartbeat_agent_id,
+        ["Temper.Agent.TemperAgent.Configure", "Temper.Agent.Configure"],
+        {
+            "user_message": build_mock_plan([{"mode": "hang"}]),
+            "model": "mock-proof",
+            "provider": "mock",
+            "max_turns": "4",
+            "tools_enabled": "",
+            "soul_id": soul_id,
+            "heartbeat_timeout_seconds": "5",
+            "sandbox_url": SANDBOX_URL,
+            "workdir": SANDBOX_WORKDIR,
+        },
+    )
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        heartbeat_agent_id,
+        ["Temper.Agent.TemperAgent.Provision", "Temper.Agent.Provision"],
+        {},
+    )
+    time.sleep(1)
+    heartbeat_monitor = create_entity(TENANT, "HeartbeatMonitors", {"ScanIntervalSeconds": "1"})
+    heartbeat_monitor_id = entity_id(heartbeat_monitor)
+    action_with_fallback(
+        TENANT,
+        "HeartbeatMonitors",
+        heartbeat_monitor_id,
+        ["Temper.Agent.HeartbeatMonitor.Start", "Temper.Agent.Start"],
+        {},
+    )
+    heartbeat_wait = wait_entity(TENANT, "TemperAgent", heartbeat_agent_id, ["Failed", "Completed"], 30000)
+    heartbeat_sse = capture_sse(TENANT, "TemperAgent", heartbeat_agent_id, ARTIFACT_ROOT / "heartbeat-events.sse")
+    report["steps"]["H"] = {
+        "H1": step(True, "Heartbeat test agent created with short timeout", f"agent_id={heartbeat_agent_id}"),
+        "H2": step(True, "Mock hang plan provisioned", "provider=mock, mode=hang"),
+        "H3": step(True, "Heartbeat monitor started and scanned", f"monitor_id={heartbeat_monitor_id}"),
+        "H4": step(entity_status(heartbeat_wait) == "Failed", "Stale agent transitioned to Failed", entity_field(heartbeat_wait, "ErrorMessage", "error_message") or entity_result(heartbeat_wait)),
+        "H5": step("TimeoutFail" in heartbeat_sse, "SSE replay captured TimeoutFail state change", "TimeoutFail present" if "TimeoutFail" in heartbeat_sse else "missing"),
+    }
+
+    memory_agent = create_entity(TENANT, "TemperAgents", {"TemperAgentId": "proof-memory"})
+    memory_agent_id = entity_id(memory_agent)
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        memory_agent_id,
+        ["Temper.Agent.TemperAgent.Configure", "Temper.Agent.Configure"],
+        {
+            "user_message": build_mock_plan([{"final_text": "memory keys={{memory_keys}} count={{memory_count}}"}]),
+            "model": "mock-proof",
+            "provider": "mock",
+            "max_turns": "4",
+            "tools_enabled": "",
+            "soul_id": soul_id,
+            "sandbox_url": SANDBOX_URL,
+            "workdir": SANDBOX_WORKDIR,
+        },
+    )
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        memory_agent_id,
+        ["Temper.Agent.TemperAgent.Provision", "Temper.Agent.Provision"],
+        {},
+    )
+    memory_wait = wait_entity(TENANT, "TemperAgent", memory_agent_id, ["Completed", "Failed", "Cancelled"], 60000)
+    memory_entity = get_entity(TENANT, "TemperAgents", memory_agent_id)
+    memory_session = get_file_text(TENANT, entity_field(memory_entity, "session_file_id", "SessionFileId"))
+    memory_result = entity_result(memory_wait, memory_session)
+    report["steps"]["M"] = {
+        "M1": step(True, "Second agent created with same soul_id", f"agent_id={memory_agent_id}"),
+        "M2": step("proof-direct-memory" in memory_result and "project-context" in memory_result, "Cross-session memory loaded into prompt", memory_result),
+        "M3": step("count=" in memory_result, "Memory-aware mock response surfaced recalled knowledge", memory_result),
+    }
+
+    compaction_notes = "X" * 6000
+    compaction_agent = create_entity(TENANT, "TemperAgents", {"TemperAgentId": "proof-compaction"})
+    compaction_agent_id = entity_id(compaction_agent)
+    action_with_fallback(
+        TENANT,
+        "TemperAgents",
+        compaction_agent_id,
["Temper.Agent.TemperAgent.Configure", "Temper.Agent.Configure"], + { + "user_message": json.dumps({"notes": compaction_notes, "mock_plan": {"steps": [{"final_text": "compaction proof ok"}]}}), + "model": "mock-proof", + "provider": "mock", + "max_turns": "6", + "tools_enabled": "", + "soul_id": soul_id, + "reserve_tokens": "199500", + "keep_recent_tokens": "100", + "sandbox_url": SANDBOX_URL, + "workdir": SANDBOX_WORKDIR, + }, + ) + action_with_fallback( + TENANT, + "TemperAgents", + compaction_agent_id, + ["Temper.Agent.TemperAgent.Provision", "Temper.Agent.Provision"], + {}, + ) + compaction_wait = wait_entity(TENANT, "TemperAgent", compaction_agent_id, ["Completed", "Failed", "Cancelled"], 60000) + compaction_entity = get_entity(TENANT, "TemperAgents", compaction_agent_id) + compaction_session = get_file_text(TENANT, entity_field(compaction_entity, "session_file_id", "SessionFileId")) + write_text(ARTIFACT_ROOT / "compaction-session.jsonl", compaction_session) + report["steps"]["X"] = { + "X1": step("compaction" in compaction_session, "Compaction entry was written into the session tree", "compaction entry present" if "compaction" in compaction_session else "missing"), + "X2": step(entity_status(compaction_wait) == "Completed", "Agent resumed after compaction", entity_result(compaction_wait, compaction_session)), + } + + trajectories_summary = get_json("/observe/trajectories?entity_type=TemperAgent&failed_limit=20", tenant=TENANT, headers=ADMIN_HEADERS)["json"] + write_text(ARTIFACT_ROOT / "trajectories.json", json.dumps(trajectories_summary, indent=2)) + + specs_summary = { + "temper-agent": apps["temper-agent"], + "temper-channels": apps["temper-channels"], + "temper-fs": apps["temper-fs"], + } + + def table(section): + rows = [ + "| Step | Expected | Actual | Status |", + "|---|---|---|---|", + ] + for key, value in report["steps"][section].items(): + actual = value["actual"].replace("|", "\\|").replace("\n", "
") + rows.append(f"| {key} | {value['expected']} | {actual} | {value['status']} |") + return "\n".join(rows) + + limitations = [] + if report["steps"]["H"]["H4"]["status"] != "PASS": + limitations.append("Heartbeat timeout did not fail the hanging agent.") + if report["steps"]["X"]["X1"]["status"] != "PASS": + limitations.append("Compaction scenario did not emit a compaction entry.") + if not limitations: + limitations.append("None observed in the proof run.") + + report_text = f"""# Governed Agent Architecture E2E Proof + +## Date +{report['date']} + +## Branch +{report['branch']} + +## Commit +{report['commit']} + +## Server +`{SERVER}` against tenant `{TENANT}` + +## Specs Deployed +- `temper-fs`: {json.dumps(specs_summary['temper-fs'])} +- `temper-agent`: {json.dumps(specs_summary['temper-agent'])} +- `temper-channels`: {json.dumps(specs_summary['temper-channels'])} + +## Trigger Path A: Direct OData API +{table('A')} + +## Trigger Path B: Channel Webhook +{table('B')} + +## Trigger Path C: WASM Orchestration +{table('C')} + +## Trigger Path D: MCP Tool Call +{table('D')} + +## Trigger Path E: Cron Job +{table('E')} + +## Subagent + Coding Agent Verification +{table('S')} + +## Heartbeat Monitoring Verification +{table('H')} + +## Cross-Session Memory +{table('M')} + +## Compaction +{table('X')} + +## Artifacts + +### Session Tree Dump +```jsonl +{direct_session} +``` + +### SSE Events Captured +```text +{direct_sse} +``` + +### OTS Trajectory Summary +```json +{json.dumps(trajectories_summary, indent=2)} +``` + +### System Prompt Assembly +```text +{direct_prompt} +``` + +## Current Limitations +""" + "\n".join(f"- {item}" for item in limitations) + f""" + +## Reproduction Commands +```bash +python3 scripts/temper_agent_e2e_proof.py +cargo test --workspace +``` +""" + write_text(REPORT_PATH, report_text) + write_text(ARTIFACT_ROOT / "proof-summary.json", json.dumps(report, indent=2)) + print(json.dumps({"report": str(REPORT_PATH), "tenant": TENANT, 
"artifacts": str(ARTIFACT_ROOT)}, indent=2)) + return 0 + + +if __name__ == "__main__": + try: + raise SystemExit(main()) + except Exception as exc: + print(f"proof failed: {exc}", file=sys.stderr) + raise diff --git a/temper-course.html b/temper-course.html new file mode 100644 index 00000000..0c94425a --- /dev/null +++ b/temper-course.html @@ -0,0 +1,3477 @@ + + + + + + How Temper Works — An Interactive Guide + + + + + + + + + + + + + + + +
+
+ +
+ + Interactive Course +
+ +

How Temper Works

+ +

An interactive guide to the operating layer for AI agents

+ +

+ Temper is an + operating layer + for + AI agents. + Instead of writing code by hand, agents describe what they want in structured + specs, + and Temper builds verified, governed systems from those descriptions. +

+ +
+
+
+ +
+ Describe +

Agents say what they need in plain structured language. No hand-written code.

+
+ + + +
+
+ +
+ Verify +

A multi-level verification cascade proves the system is correct before deployment.

+
+ + + +
+
+ +
+ Govern +

Fine-grained authorization controls who can do what. Every action is audited.

+
+
+ +
+ +
+ The Von Neumann Parallel +

+ Early computers required engineers to physically rewire circuits for each new program. + The + Von Neumann architecture + changed everything: store the program as data, and let the machine interpret it. + Temper applies the same idea to AI agents — agents write descriptions (specs) as data, + and a + kernel + builds verified, running systems from them. The agent never touches the wiring. +

+
+
+ + + +
+
+ + +
+
+
+ 01 +

What Happens When You Tell an Agent to Do Something?

+

Trace a single user action from intent to verified state change

+
+
+ + +
+

+ Imagine you say: "Create an order with 3 items" +

+

+ You're talking to an + AI agent. + You tell it to create an order. The agent doesn't just... do it. It goes through Temper. +

+

+ Think of it like a notary public. You don't just sign a contract and call it done — you go through someone who verifies your identity, checks the document, and records it officially. Temper is that notary for every action an agent takes. +

+
+ +
+ + +
+

+ The Journey of Your Request +

+

+ From your words to a verified + state change, + here is every step your request passes through. +

+
+ +
+
+
1
+
+ Your words reach the agent + + The + agent + interprets "create an order with 3 items" and figures out you want the + action + SubmitOrder. + +
+
+
+
2
+
+ Agent sends an HTTP request to Temper + + The agent calls Temper's + API endpoint + with the action name and + parameters + (shipping address, payment method). + +
+
+
+
3
+
+ Authorization check + + Temper asks: "Is this agent allowed to submit orders?" + Cedar authorization + evaluates the agent's identity against the security policies. + +
+
+
+
4
+
+ Guard evaluation + + Temper checks: "Is this order in the right + state?" + The + guard + items > 0 is evaluated — no empty orders allowed. + +
+
+
+
5
+
+ State transition + event recorded + + Both checks pass. The order's state changes from Draft to Submitted. + An immutable + event + is persisted to the + event store. + +
+
+
+
6
+
+ Response sent back + + Temper sends an + HTTP + response confirming the + transition. + The agent tells you: "Order submitted!" + +
+
+
+ +
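The six-step journey above can be sketched as a tiny pipeline. This is a minimal illustration under assumed names (`submit_order`, the in-memory `order` dict, the toy `ALLOWED` table), not Temper's actual API:

```python
# Sketch of the request path: authorize -> guard -> transition -> record.
# All names and data shapes here are illustrative, not Temper's real API.

ALLOWED = {("agent-7", "SubmitOrder")}  # toy authorization table

def submit_order(principal, order, event_log):
    # Step 3: authorization check (default deny)
    if (principal, "SubmitOrder") not in ALLOWED:
        return "denied"
    # Step 4: guard evaluation -- no empty orders
    if not order["items"] > 0:
        return "guard failed"
    # The action may only fire from Draft
    if order["state"] != "Draft":
        return "wrong state"
    # Step 5: state transition plus an immutable event
    order["state"] = "Submitted"
    event_log.append({"action": "SubmitOrder", "by": principal})
    return "ok"

log = []
order = {"state": "Draft", "items": 3}
print(submit_order("agent-7", order, log))  # ok
print(submit_order("agent-9", order, log))  # denied
```

Note the ordering: authorization is checked before the guard, so an unauthorized agent never learns whether the guard would have passed.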
+ + +
+

+ Reading the Blueprint +

+

+ Here is the actual + spec + for the SubmitOrder + action, + taken from a real Temper + entity + definition. Left side: code. Right side: what it means. +

+
+ +
+
+
# From order.ioa.toml
+[[action]]
+name = "SubmitOrder"
+kind = "internal"
+from = ["Draft"]
+to = "Submitted"
+guard = "items > 0"
+params = ["ShippingAddressId",
+          "PaymentMethod"]
+hint = "Submit a draft order.
+ Requires at least one item."
+
+
+
+ name + This action is called SubmitOrder +
+
+ kind + It's an internal action — triggered by the system, not directly by end users +
+
+ from → to + Can only fire when the order is in Draft. Moves it to Submitted. +
+
+ guard + The order must have at least one item (items > 0). Otherwise, rejected. +
+
+ params + Requires a shipping address and payment method to proceed. +
+
+
+ +
+ + +
+ +
+ Guards Are Bouncers for Your Data +

+ Guards + make sure nonsensical things can't happen — like submitting an empty order. + This isn't just + validation. + It's + proven mathematically + before the system even runs. +

+

+ The + verification cascade + explores every possible sequence of actions and confirms that no path through the system can violate the guard. An empty order can never reach Submitted — not because of a runtime check, but because it's structurally impossible. +

+
+
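To make the idea concrete, here is a toy evaluator for the two guard shapes that appear in the specs on this page, `items > 0` and `is_true planner_set`. It is a sketch of the concept only; Temper's real guard language and evaluator are not shown here:

```python
# Toy evaluator for two guard shapes seen in the course's spec excerpts:
#   "items > 0"            (comparison on a state variable)
#   "is_true planner_set"  (boolean flag check)
# Illustrative only -- not Temper's actual guard grammar.

def eval_guard(guard: str, state_vars: dict) -> bool:
    parts = guard.split()
    if parts[0] == "is_true":
        return bool(state_vars.get(parts[1]))
    var, op, literal = parts
    value, literal = state_vars.get(var, 0), int(literal)
    return {"<": value < literal, ">": value > literal, "==": value == literal}[op]

print(eval_guard("items > 0", {"items": 3}))                     # True
print(eval_guard("items > 0", {"items": 0}))                     # False
print(eval_guard("is_true planner_set", {"planner_set": True}))  # True
```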
+ +
+ + +
+

+ Check Your Understanding +

+

+ Apply what you just learned to these scenarios. +

+
+ +
+ + +
+

A user says "ship my order" but the order is still in Draft status. What happens?

+
+ + + +
+
+
+ + +
+

You want to add a "gift wrap" option to orders. What would you need to change?

+
+ + + +
+
+
+ + +
+

An agent submits an order with 0 items. What prevents this?

+
+ + + +
+
+
+ +
+ + + +
+
+ +
+
+
+ + +
+
+
+ 02 +

Meet the Cast

+

The major crates and components that make up Temper

+
+
+ + +
+

Every System Has Characters

+

+ Temper is built from ~25 separate modules (called + crates + in Rust). But you only need to know about 6 key players to understand how it all works. + Think of them as a film production crew — each member has a specific role, they communicate through defined channels, and together they produce something none could alone. +

+
+ + +
+ + +
+ +
+ temper-spec + The Scriptwriter +

Reads your blueprint (the spec file) and translates it into a format the rest of the team understands.

+
+
+ + +
+ +
+ temper-verify + The Safety Inspector +

Runs 4 levels of checks to prove the blueprint is correct BEFORE anything gets built.

+
+
+ + +
+ +
+ temper-jit + The Builder +

Takes the verified blueprint and builds a + TransitionTable + — the live rulebook that governs every action.

+
+
+ + +
+ +
+ temper-server + The Stage Manager +

Runs the show: receives requests, dispatches them to the right + entity actor, + and returns results.

+
+
+ + +
+ +
+ temper-authz + The Security Guard +

Checks every single action: Is this agent allowed to do this? Uses + Cedar + policies. Default answer: NO.

+
+
+ + +
+ +
+ temper-observe + The Documentarian +

Records everything that happens — every transition, every decision — as + WideEvents, + automatically.

+
+
+ +
+ + +
+

How They Work Together

+

+ Watch what happens behind the scenes when an agent fires a single action. The + components + coordinate like a team in a group chat. +

+ +
+
+ + + + + + + + + + + + + + + +
+ + + +
+ + + + 0 / 7 messages +
+
+
+ + +
+ +
+ Separation of Concerns +

+ This pattern — splitting responsibilities into focused roles — is one of the most important ideas in software. Engineers call it + separation of concerns. + Each + module + does ONE thing well and trusts the others to handle their part. When something breaks, you know exactly where to look. +

+
+
+ + +
+

Check Your Understanding

+ +
+ + +
+

An agent's action is denied. Which component made that decision?

+
+ + + + +
+
+
+ + +
+

State transitions are being recorded incorrectly. Which component would you investigate?

+
+ + + + +
+
+
+ + +
+

A new spec has a bug where an impossible state is reachable. Which component should have caught this?

+
+ + + + +
+
+
+ +
+ + +
+ +
+
+ +
+
+
+ + +
+
+
+ 03 +

The Blueprint Language

+

How I/O Automaton specs describe what systems should do

+
+
+ + +
+

Specs, Not Code

+

+ In Temper, you don't write code. You write descriptions of behavior. The system generates everything else — + the runtime actors, the API endpoints, the authorization policies, the verification proofs. +

+

+ These descriptions are called + I/O Automaton specs + (IOA specs). They are written in + TOML, + and they define four things: +

+
    +
  1. What states something can be in
  2. +
  3. What actions can happen
  4. +
  5. What conditions must be true for an action to fire
  6. +
  7. What changes when an action fires
  8. +
+

+ Think of it like an architect's blueprint. Before a building goes up, architects draw detailed plans + that structural engineers can verify. You wouldn't build a skyscraper from a napkin sketch. + Temper applies the same discipline to software — every + entity + gets a mathematically precise blueprint before a single line of runtime code is generated. +

+
+ + + + +
+

Anatomy of a Spec

+

+ Here is the opening of a real spec — the Order + automaton + from Temper's test suite. On the left is the actual TOML. On the right is what each line means in plain English. +

+
+
+
# Order Entity — I/O Automaton Spec
+
+[automaton]
+name = "Order"
+states = ["Draft", "Submitted",
+  "Confirmed", "Processing",
+  "Shipped", "Delivered",
+  "Cancelled", "ReturnRequested",
+  "Returned", "Refunded"]
+initial = "Draft"
+
+
+
+ [automaton] + This section declares a new type of thing the system manages +
+
+ name = "Order" + Its name is "Order" — every order created will follow these rules +
+
+ states = [...] + An order can only ever be in one of these 10 statuses. Nothing else is possible. +
+
+ initial = "Draft" + Every new order starts in "Draft" — like a blank form waiting to be filled in +
+
+
+
+ + + + +
+

The Four Building Blocks

+

+ Every IOA spec is built from the same four pieces. Once you know them, you can read any spec in the system. +

+
+
+
+ 🏷️ +
+
States
+

+ The list of valid statuses. Like chapters in a story — the + entity + can only be in one chapter at a time. +

+
+
+
+ 📊 +
+
State Variables
+

+ Data that changes as the entity moves through its lifecycle. + Counters, + booleans, + flags, labels. +

+
+
+
+ ⚡ +
+
Actions
+

+ Things that can happen. Each one says: "I can only fire FROM these states, and I move TO this state." + This is called a + transition. +

+
+
+
+ 🛡️ +
+
Guards
+

+ Conditions that must be true for an action to fire. Like + prerequisites + in a college course catalog. +

+
+
+
+ + + + +
+

A Real Action, Decoded

+

+ Here is a real action from the Issue + state machine + — the BeginPlanning action that kicks off the planning phase. Let's decode it line by line. +

+
+
+
[[action]]
+name = "BeginPlanning"
+kind = "internal"
+from = ["Todo"]
+to = "Planning"
+guard = "is_true planner_set"
+hint = "Start planning phase.
+  A planner agent drafts
+  the approach."
+
+
+
+ [[action]] + Here comes a new action definition — one thing that can happen to an Issue +
+
+ name = "BeginPlanning" + This action is called "BeginPlanning" — agents use this exact name to trigger it +
+
+ kind = "internal" + This action changes the entity's state (unlike "input" actions that just update data) +
+
+ from = ["Todo"] + Can ONLY fire when the issue is in "Todo" status. Try it from any other status? Rejected. +
+
+ to = "Planning" + When it fires, the issue moves to "Planning" status +
+
+ guard = "is_true planner_set" + But wait — a planner must be assigned first. No planner, no planning. +
+
+ hint = "..." + A human-readable description so agents know when to use this action +
+
+
+
+ + + + +
+

Invariants — The Safety Net

+

+ Invariants + are rules that must hold no matter what. They don't protect a single action — they protect the entire system. + Here is one from the Order spec: +

+
+
+
[[invariant]]
+name = "SubmitRequiresItems"
+when = ["Submitted", "Confirmed",
+  "Processing", "Shipped",
+  "Delivered"]
+assert = "items > 0"
+
+
+
+ [[invariant]] + This is a safety rule that must ALWAYS be true +
+
+ when = [...] + Whenever an order is in any of these statuses… +
+
+ assert = "items > 0" + …it must have at least one item. If the system ever violates this, it's a bug — and Temper proves it CAN'T happen before the code ever runs. +
+
+
+
+ +
+ Testing vs. Verification +

+ Invariants are the secret weapon. Most software hopes things won't go wrong — you write tests + that try lots of cases and cross your fingers. Temper proves they can't. The + verification cascade + exhaustively checks every reachable state. This is the difference between + testing + (trying lots of cases) and verification (mathematical proof for ALL cases). +

+
+
+
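The difference between a guard and an invariant can be sketched in a few lines: a guard protects one action, while an invariant is a predicate checked against every state the entity can be in. A toy rendering of the `SubmitRequiresItems` rule above, with illustrative names only:

```python
# An invariant is a predicate that must hold whenever the entity is in
# any of the listed states -- checked for every state, not per action.
# Illustrative representation, not Temper's real one.

INVARIANT = {
    "when": {"Submitted", "Confirmed", "Processing", "Shipped", "Delivered"},
    "assert": lambda state_vars: state_vars["items"] > 0,
}

def holds(status, state_vars):
    # The invariant is vacuously true in states it does not cover.
    return status not in INVARIANT["when"] or INVARIANT["assert"](state_vars)

print(holds("Draft", {"items": 0}))      # True  (Draft is not covered)
print(holds("Submitted", {"items": 0}))  # False (a violation)
```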
+ + + + +
+

Check Your Understanding

+
+ + +
+

You want to allow cancelling orders that have already shipped. What would you change in the spec?

+
+ + + + +
+
+
+ + +
+

An agent tries BeginPlanning on an issue that is in "Backlog" status. What happens?

+
+ + + + +
+
+
+ + +
+

Why would you add an invariant instead of just a guard?

+
+ + + + +
+
+
+ +
+ + + +
+ +
+
+ +
+
+
+ + +
+
+
+ 04 +

Trust But Verify

+

The multi-level verification cascade that proves correctness

+
+
+ + +
+

+ Why Verify? +

+

+ Most software gets + tested + AFTER it's built. Temper + verifies + BEFORE + deployment. +

+

+ Here is the key insight: testing tries many cases and hopes to find bugs. + Verification + PROVES correctness for ALL possible cases. + Think of it like NASA pre-launch checks — before a rocket launches, it goes through multiple independent verification systems. No single check is enough. Each catches different types of problems. +

+
+ +
+ + +
+

+ The Four Levels +

+

+ Every spec passes through four independent verification levels. Each must pass before the next begins — like stage gates on a launch countdown. +

+
+ +
+
+
0
+
+ Level 0: SMT Symbolic Check + + A math solver (called + Z3) + checks every + guard + expression. Can this guard ever be true? Are there contradictions? + Think of it like spell-checking your blueprint. + +
+
+
+
1
+
+ Level 1: Model Checking + + Stateright + explores EVERY possible sequence of actions. Every path. Every combination. + Like testing every possible route through a maze — not just the obvious ones. + This is called + model checking. + +
+
+
+
2
+
+ Level 2: Deterministic Simulation + + Runs the system with injected failures — dropped messages, delays, reordering. This is + deterministic simulation + with + fault injection. + Like a stress test for buildings: if it survives a simulated earthquake, it'll survive the real thing. + +
+
+
+
3
+
+ Level 3: Property-Based Testing + + Throws thousands of random action sequences at the spec. + Edge cases, + weird combinations, boundary values. This is + property-based testing + — like a chaos monkey shaking every branch. + +
+
+
+ +
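Level 1 can be pictured in miniature: breadth-first exploration of every reachable state of a toy order machine, confirming that no reachable "Submitted" state has zero items. This is a sketch of the idea of explicit-state model checking, not Stateright itself; the machine and its bounds are invented for illustration:

```python
from collections import deque

# A toy order machine: (status, items) pairs, items capped at 2 so the
# state space is finite. Guard on SubmitOrder: items > 0.
def transitions(state):
    status, items = state
    if status == "Draft":
        if items < 2:
            yield (status, items + 1)   # AddItem
        if items > 0:
            yield ("Submitted", items)  # SubmitOrder

def explore(initial):
    # Breadth-first search over every reachable state.
    seen, frontier = {initial}, deque([initial])
    while frontier:
        for nxt in transitions(frontier.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

reachable = explore(("Draft", 0))
# The invariant holds in EVERY reachable state, not just sampled ones.
assert all(items > 0 for status, items in reachable if status == "Submitted")
print(len(reachable))  # 5
```

Because the exploration is exhaustive over the (bounded) state space, a passing check is a proof for all reachable states, which is exactly what separates Level 1 from ordinary testing.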
+ + +
+

+ What Each Level Catches +

+

+ Each level is designed to catch a different class of defect. Together, they cover the full + state space. +

+
+ +
+
+ +
Impossible Guards
+

+ items > 0 AND items < 0 would be caught — the + SMT + solver proves this guard can never be satisfied. +

+
+
+ +
Unreachable States
+

+ A state that no sequence of actions can ever reach. The model checker explores every path and proves no route leads there. +

+
+
+ +
Timing Bugs
+

+ Things that break when messages arrive out of order. The simulation injects delays and reordering to expose concurrency defects. +

+
+
+ +
Edge Cases
+

+ Boundary values, empty strings, maximum counters. Random sequences find the weird combinations humans never think to try. +

+
+
+ +
+ + +
+ +
+ The Single-Table Principle +

+ Here is the profound bit: the same + TransitionTable + that + Stateright + model-checks + in Level 1 is the EXACT SAME code that runs in + production. +

+

+ Not a "similar version." Not a "simplified model." The actual thing. This is the Single-Table Principle — verify once, run forever. If the verification cascade proves your spec is correct, you KNOW production is correct too. There is no gap between what was tested and what runs. +

+
+
+ +
+ + +
+

+ Check Your Understanding +

+

+ Apply what you just learned about the verification cascade. +

+
+ +
+ + +
+

You write a guard that says priority > 10 AND priority < 5. Which verification level catches this first?

+
+ + + +
+
+
+ + +
+

Your spec has a "Paused" state, but no action ever transitions TO it. Which level catches this?

+
+ + + +
+
+
+ + +
+

Why is it important that the same TransitionTable runs in both verification and production?

+
+ + + +
+
+
+ +
+ + + +
+
+ +
+
+
+ + +
+
+
+ 05 +

The Bouncer at the Door

+

How Cedar authorization controls every action in the system

+
+
+ + +
+

+ Default: No +

+

+ In most software, agents can do anything unless you explicitly block them. Temper flips this: agents can do nothing unless you explicitly allow them. +

+

+ Think of a concert venue. You don't walk in and go wherever you want. The door is locked. You need a badge to get in, a different badge to go backstage, and a crew pass to access the control room. No badge? No entry. +

+

+ This is called + default deny. + Every action an agent takes needs explicit + authorization. + No + permission, + no action. +

+
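Default deny can be sketched in a few lines: an action is allowed only if some permit rule matches, and with no matching rule the answer is "no". The rules below are invented to mirror the policies discussed in this section; this is not Cedar's actual engine:

```python
# Default deny in miniature. Each permit is (allowed agent types, actions);
# None for the types means "any principal". Illustrative only.

PERMITS = [
    ({"supervisor", "human"}, {"Cancel", "SetPriority", "Assign"}),
    (None, {"create", "read", "list", "AddComment"}),
]

def is_authorized(agent_type, action):
    for types, actions in PERMITS:
        if action in actions and (types is None or agent_type in types):
            return True
    return False  # no permit matched: default deny

print(is_authorized("worker", "AddComment"))  # True
print(is_authorized("worker", "Cancel"))      # False
print(is_authorized("supervisor", "Cancel"))  # True
```

The crucial property is the final `return False`: anything not explicitly permitted is rejected, which is the opposite of the "allow unless blocked" posture most software ships with.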
+ +
+ + +
+

+ Cedar: The Policy Language +

+

+ Temper uses + Cedar + to express authorization rules. Here is a real policy from Temper's project management app. Left side: code. Right side: what it means. +

+
+ + +
+
+
// Universal: any agent can create,
+// read, list, comment
+
+permit(
+  principal,
+  action in [
+    Action::"create",
+    Action::"read",
+    Action::"list",
+    Action::"AddComment",
+    Action::"AddLabel",
+    Action::"SetDescription"
+  ],
+  resource is Issue
+);
+
+
+
+ permit( + This rule allows something (as opposed to forbid which blocks it) +
+
+ principal, + Any agent — doesn't matter who they are +
+
+ action in [...] + Can perform these specific actions: create, read, list, comment, label, describe +
+
+ resource is Issue + On any Issue entity +
+
+ Summary + Everyone gets basic access. Reading and commenting are open to all. +
+
+
+ +

+ That covers the basics. But what about the powerful stuff — assigning work, setting priority, canceling issues? Those need a stronger badge. +

+ + +
+
+
// Triage & Prioritization:
+// supervisors and humans only
+
+permit(
+  principal,
+  action in [
+    Action::"MoveToTriage",
+    Action::"MoveToTodo",
+    Action::"SetPriority",
+    Action::"Assign",
+    Action::"AssignPlanner",
+    Action::"Unassign",
+    Action::"Cancel"
+  ],
+  resource is Issue
+) when {
+  ["supervisor", "human"]
+    .contains(principal.agent_type)
+  && context.agentTypeVerified
+    == true
+};
+
+
+
+ action in [...] + Management actions — triaging, prioritizing, assigning — are locked down +
+
+ when { ... } + Only + supervisors + or humans can do them +
+
+ agentTypeVerified + AND their identity must be + verified + (not just claimed) +
+
+
+ +
+ + +
+

+ Role Separation in Action +

+

+ Cedar doesn't just gate actions by role. It can enforce that specific agents are assigned to specific responsibilities. Here is how a planning flow works, step by step. +

+
+ +
+
+
1
+
+ Supervisor assigns a planner + + Only + supervisors + or humans can call AssignPlanner. This sets the PlannerId on the issue. + +
+
+
+
2
+
+ Planner drafts the plan + + Only the assigned planner — the agent whose ID matches PlannerId — can call BeginPlanning and WritePlan. + +
+
+
+
3
+
+ Supervisor approves the plan + + A supervisor or human reviews and calls ApprovePlan. The human's conversational "go ahead" is the real gate. + +
+
+
+
4
+
+ Implementer starts work + + Only the assigned implementer (whose ID matches AssigneeId) can call StartWork. This requires an approved plan. + +
+
+
+
5
+
+ Supervisor reviews the work + + Only supervisors or humans can call ApproveReview or RequestChanges. The cycle closes. + +
+
+
+ +

+ Here is the Cedar + policy + that enforces step 2 — only the assigned planner can write the plan: +

+ +
+
+
// Planning: the assigned planner
+
+permit(
+  principal,
+  action in [
+    Action::"BeginPlanning",
+    Action::"WritePlan"
+  ],
+  resource is Issue
+) when {
+  resource.PlannerId
+    == principal.id
+};
+
+
+
+ resource.PlannerId == principal.id + Only the agent whose ID matches the issue's PlannerId field can begin planning and write the plan. No one else — not even a supervisor — unless a separate + policy + grants them access. +
+
+
+ +
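The semantics of that `when` clause are simple enough to state as a one-line predicate. A sketch with hypothetical names, not Cedar's evaluation model:

```python
# The PlannerId policy in miniature: the permit fires only when the
# resource's PlannerId equals the caller's id. Illustrative names.

def can_write_plan(principal_id, issue):
    return issue.get("PlannerId") == principal_id

issue = {"PlannerId": "agent-planner-1"}
print(can_write_plan("agent-planner-1", issue))   # True
print(can_write_plan("agent-supervisor", issue))  # False
```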
+ + +
+ +
+ When Access is Denied +

+ When an agent tries something it's not allowed to do, Temper doesn't just say "no" and move on. It creates a + pending decision + — a structured request that surfaces to the human through the Observe UI. +

+

+ The human can approve it narrowly (just this once), broadly (for this type of action), or deny it entirely. Over time, the + policy + set grows to match exactly what agents need. The system learns the right permissions from real usage. +

+

+ Think of it like a new employee. Day one, they can barely access anything. Each time they need something, they ask. A manager approves. After a few weeks, their badge opens exactly the doors they need — no more, no less. That's how + scope + works in Temper. +

+
+
+ +
+ + +
+

+ Check Your Understanding +

+

+ Apply what you just learned about Cedar authorization. +

+
+ +
+ + +
+

A new intern-agent tries to cancel an issue. The Cedar policy requires 'supervisor' or 'human' agent_type. What happens?

+
+ + + +
+
+
+ + +
+

Why does Cedar check context.agentTypeVerified == true instead of just trusting the agent's claimed type?

+
+ + + +
+
+
+ + +
+

You want to allow a specific agent to triage issues without being a supervisor. How would you approach this?

+
+ + + +
+
+
+ +
+ + + +
+
+ +
+
+
+ + +
+
+
+ 06 +

The Living System

+

The evolution engine and how the whole system grows over time

+
+
+ + +
+

Software That Learns From Use

+

+ Most software is built, deployed, and stays the same until someone manually updates it. A developer notices a problem, + writes a ticket, waits for a sprint, ships a patch. Weeks pass. Meanwhile, users work around the limitation every single day. +

+

+ Temper works differently. Its + specs + evolve based on how agents and users actually use the system. Every failed action, every workaround, every friction point + is automatically observed and can + feed back + into spec improvements. +

+

+ The metaphor is biological evolution. Organisms don't redesign themselves from scratch. They observe what works, + adapt incrementally, and the environment — natural selection — gates what changes survive. In Temper, the human holds + the role of natural selection: the approval gate. +

+
+ + + + +
+

The O-P-A-D-I Chain

+

+ The + evolution engine + works through a chain of five linked records. Each one builds on the previous, creating a complete audit trail from observation to impact. +

+ +
+ + +
+ +
+ O — + Observation +

+ Agents keep trying to do something the spec doesn't support. The system notices the pattern automatically. + "Users tried action X 47 times this week. It failed every time." +

+
+
+ + +
+ +
+ P — + Problem +

+ The pattern is formalized into a structured problem statement. + "Users tried action X 47 times this week. It failed every time. This looks like a gap in the spec." +

+
+
+ + +
+ +
+ A — + Analysis +

+ A proposed fix is generated. + "Add a Rollback action from Deployed to Testing status, with guard deployed_duration > 24h." +

+
+
+ + +
+ +
+ D — + Decision +

+ A human reviews the proposal. They can approve (narrow, medium, or broad scope), reject, or modify it. + No change goes live without explicit human approval. +

+
+
+ + +
+ +
+ I — + Insight +

+ After the change deploys, impact is tracked. Did rollback success rates improve? Was the change worth it? + The I-Record closes the + feedback loop. +

+
+
+ +
+ +

+ Each record links to the previous one. You can trace from any change back to the + observation + that motivated it. Complete + audit trail. +

+
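The chain structure above can be sketched as linked records: each one points at its predecessor, so any insight traces back to the observation that motivated it. Record shapes and field names below are illustrative, not Temper's storage format:

```python
# The O-P-A-D-I chain in miniature: every record (except the first)
# must reference an existing parent, giving an unbroken audit trail.
records = {}

def add(kind, rid, parent=None, **fields):
    assert parent is None or parent in records, "parent must already exist"
    records[rid] = {"kind": kind, "parent": parent, **fields}

add("O", "O-1", pattern="action X failed 47 times this week")
add("P", "P-1", parent="O-1", statement="spec gap: no Rollback action")
add("A", "A-1", parent="P-1", proposal="add Rollback, guard deployed_duration > 24h")
add("D", "D-1", parent="A-1", decision="approved by human (narrow scope)")
add("I", "I-1", parent="D-1", impact="rollback success rate improved")

# Walk from the insight back to the originating observation.
rid, chain = "I-1", []
while rid:
    chain.append(records[rid]["kind"])
    rid = records[rid]["parent"]
print("".join(chain))  # IDAPO
```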
+ + + + +
+
+ +
+ The Human Gate +

+ Here's what makes this different from "AI that rewrites itself": the system can observe, analyze, and propose — + but only humans can approve changes. No + autonomous mutation. + The human holds the approval gate. +

+

+ This isn't a limitation — it's the design. It's what makes the system trustworthy. + Every spec change has a + D-Record + with a human's decision attached. Every change can be traced back to the observation that motivated it. + The + audit trail + is unbroken. +

+
+
+
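+ The gate itself is simple to express. The sketch below is illustrative only — the `deploy` function, the `d_record` dictionary shape, and the `ApprovalRequired` exception are assumptions for the example; the rule it encodes (no deploy without an approved D-Record at a named scope) is the one described above.

```python
# Sketch of the human approval gate: nothing goes live without an
# explicit, approved D-Record attached to the change.
class ApprovalRequired(Exception):
    """Raised when a change lacks an explicit human approval."""

def deploy(change: dict) -> str:
    d_record = change.get("d_record")
    if d_record is None or d_record.get("decision") != "approve":
        raise ApprovalRequired("no human approval on record")
    if d_record.get("scope") not in {"narrow", "medium", "broad"}:
        raise ApprovalRequired(f"unknown scope: {d_record.get('scope')}")
    return f"deployed with {d_record['scope']} scope"

# An unapproved proposal is blocked; an approved one goes through.
proposal = {"spec_change": "add Rollback action"}
try:
    deploy(proposal)
    raise AssertionError("unreachable: gate should have blocked this")
except ApprovalRequired:
    pass

proposal["d_record"] = {"decision": "approve", "scope": "narrow"}
assert deploy(proposal) == "deployed with narrow scope"
```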
+ + + + +
+

The Big Picture

+

+ Here's how all the pieces fit together — the full Temper loop from design to runtime to evolution. + Click any zone to learn more. +

+ +
+ + +
+
+ + Design Time +
+
    +
  • Developer / Agent describes capability
  • IOA Spec + CSDL + Cedar generated
  • Verification Cascade (L0 → L1 → L2 → L3)
+ +
+ + +
+
+ + Runtime +
+
    +
  • Entity Actors process actions
  • Cedar checks every request
  • Events persisted (event sourcing)
+ +
+ + +
+
+ + Observability +
+
    +
  • WideEvents recorded automatically
  • Trajectory analysis
  • Pattern detection
+ +
+ + +
+
+ + Evolution + Connects back to Design Time ↺ +
+
+ O → P → A → D → I
+ Human approval gate • Spec change • Re-verification • Hot deploy
+ +
+ +
+
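+ The four zones in the diagram can be read as a cycle. The toy model below is just a way to state the loop's shape — the stage names come from the diagram, everything else is a placeholder, not Temper code.

```python
# The four zones as a cycle: evolution feeds approved spec changes
# back into design time, closing the loop.
STAGES = ["design", "runtime", "observability", "evolution"]

def next_stage(stage: str) -> str:
    return STAGES[(STAGES.index(stage) + 1) % len(STAGES)]

assert next_stage("design") == "runtime"
assert next_stage("evolution") == "design"   # the loop closes
```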
+ + + + +
+
+ +
+ The + Von Neumann + Insight +

+ In 1949, John von Neumann designed a self-replicating machine with three parts: a description (blueprint), + a constructor + (a machine that builds whatever a description specifies), and a copy mechanism (which duplicates descriptions, letting mutations accumulate over time).

+

+ Temper mirrors this exactly. The + kernel + is the constructor — it doesn't change. Specs are the descriptions — they evolve. + The evolution engine is the copy mechanism. Complexity grows through better descriptions, + not by changing the machine that interprets them. +

+
+
+
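+ The mapping can be made concrete with a toy example. This is an illustrative analogy, not Temper's actual API — `kernel` and `evolve` are invented names standing in for the constructor and the copy mechanism described above.

```python
# Von Neumann's three parts mapped onto Temper: the kernel is a fixed
# interpreter (constructor), specs are descriptions, and evolution
# copies a description with a mutation while leaving the kernel alone.
def kernel(spec: dict, action: str) -> bool:
    """Constructor analogue: fixed interpreter of any spec it is given."""
    return action in spec["actions"]

def evolve(spec: dict, new_action: str) -> dict:
    """Copy-mechanism analogue: duplicate the description with a mutation."""
    return {**spec, "actions": set(spec["actions"]) | {new_action}}

v1 = {"actions": {"deploy", "test"}}
assert not kernel(v1, "rollback")            # v1 doesn't describe rollback
v2 = evolve(v1, "rollback")                  # the description changed...
assert kernel(v2, "rollback")                # ...the kernel did not
assert v1["actions"] == {"deploy", "test"}   # original spec untouched
```

+ Note that `kernel` never changes between `v1` and `v2`: all new capability arrives through a better description, which is exactly the point of the architecture.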
+ + + + +
+

Check Your Understanding

+

+ The capstone quiz. Apply everything you've learned about the evolution engine and the full Temper system. +

+ +
+ + +
+

Agents keep trying a "Pause" action that doesn't exist in the spec. What's the first thing the evolution engine creates?

+
+ + + + +
+
+
+ + +
+

A proposed spec change passes the D-Record approval. What happens before it goes live?

+
+ + + + +
+
+
+ + +
+

Why does Temper use + event sourcing + instead of just storing the current state?

+
+ + + + +
+
+
+ + +
+

What's the practical benefit of the + Von Neumann architecture + for someone building on Temper?

+
+ + + + +
+
+
+ +
+ + + +
+ +
+
+ + + + +
+

+ You've just learned how an operating layer for AI agents works — from specs to verification to governance to evolution. + The next time you tell an AI agent to do something, you'll know: under the hood, there could be a system like Temper + making sure it does the right thing, the right way, with proof. +

+
+ +
+
+
+ + + + + +