diff --git a/README.md b/README.md
index d62405bf..b91bcf15 100644
--- a/README.md
+++ b/README.md
@@ -9,143 +9,122 @@
- The framework where agents build their own OS.
+ A verified operating layer for autonomous agents
 ---
-Agents are starting to build their own tools — generating MCP servers at runtime, synthesizing helpers mid-session, evolving workflow topologies. At the same time, the infrastructure for making this safe is developing: policy-based authorization on tool invocations, behavioral contracts, state-machine-constrained agents, formal verification becoming practical. Harness engineering, durable execution, declarative agent specs — all moving forward.
-
-Temper is our attempt to explore what happens when you connect these ideas into one framework: agent-created tools as formally verified state machines, authorization policies derived from behavioral specs, and an evolution loop where unmet intents feed back into spec proposals with human approval.
-
-## How It Works
-
-An agent describes what it needs as declarative specs — state machines, data models, integrations, authorization policies. Temper formally verifies the specs, deploys them as a live API, mediates every action through [Cedar](https://www.cedarpolicy.com/) policies, and records everything. The human approves or rejects. The agent operates through what it built.
-
-```python
-# Agent gives itself long-term memory — Temper verifies and deploys it
-await temper.submit_specs("my-app", {
-    "Knowledge.ioa.toml": knowledge_spec,  # state machine: agent-generated
-    "model.csdl.xml": data_model           # data model: agent-generated
-})
-# → Verification cascade: Z3 SMT, model checking, simulation, property tests
-# → If all levels pass, the knowledge system is live
-
-# Agent stores and retrieves its own knowledge through the verified API
-await temper.create("my-app", "KnowledgeEntries", {
-    "content": "service X fails under concurrent writes — use advisory locks",
-    "source": "incident-247"
-})
-await temper.action("my-app", "KnowledgeEntries", "k-42", "Link", {
-    "related": ["k-12", "k-31"]  # connect insights across sessions
-})
-# → Cedar checks every operation — the agent can read its own entries
-# but can't access another agent's knowledge without approval
-```
-
-The kernel is a thin Rust runtime that interprets whatever the mediation pipeline feeds it. Everything agents touch — specs, policies, WASM modules, reaction rules — hot-reloads. The kernel itself rarely changes.
+## What is Temper?

-## Why Temper?
+Agents build tools at runtime. They generate helpers and create workflows. Those tools have no verification, no governance, no memory of why they exist.

-Agent scaffolding — prompt templates, tool wrappers, output parsers — shrinks as models get smarter. What compounds is the world-facing infrastructure: verified state machines, authorization policies, persistent trajectories. The kernel is a [universal interpreter](https://en.wikipedia.org/wiki/Von_Neumann_universal_constructor) — everything else is a spec. Tools, harnesses, applications are all declarative descriptions with a signature that agents write, verify, deploy, and rewrite. The kernel rarely changes. The descriptions evolve.
+Temper is an operating layer where agents describe capabilities as specifications. The kernel verifies each spec before deployment. Every action flows through authorization policies. A human approves changes to scope.

-| What's developing in the field | Temper's angle |
-|---|---|
-| Agents synthesize tools at runtime | Those tools are verified state machines that persist as specs |
-| Policy-based authorization on tool invocations | Policies derived from a behavioral spec, not authored separately |
-| Runtime guardrails check outputs | State machine checked exhaustively *before* deployment (model checking + SMT) |
-| Observability shows what happened | Unmet intents feed back into spec proposals with human approval |
-| Declarative agent specs for portability | Declarative specs for correctness — verified, then deployed |
-| Durable execution engines | Spec defines what the system does; durability follows from event sourcing |
-| Harnesses as static scaffolding | Harnesses as specs — agents program and rewrite them through the same verify-deploy loop |
+As agents and users operate through a skill, the evolution engine identifies gaps. It adds missing capabilities, fixes broken ones, and removes redundant ones. The human approves each change.

-It's an exploration of what happens when you put formal verification, Cedar authorization, and evolution feedback into the same loop.
+|        | Step     | What happens                                                                     |
+| ------ | -------- | -------------------------------------------------------------------------------- |
+| **01** | Describe | An agent describes what it needs: states, transitions, guards, data shape.       |
+| **02** | Verify   | The kernel proves the spec is sound before anything runs.                        |
+| **03** | Operate  | The agent works through the verified API. Every action is governed and recorded. |
+| **04** | Evolve   | Usage patterns surface gaps. The spec adapts. The human approves.                |

-## Key Features
+
+Verified Skills
+Agents describe capabilities as specs. A four-level verification cascade proves them sound before deployment.
+
+Governed by Default
+Every action flows through authorization with a default-deny posture. Denied actions surface to the human for approval. The policy set grows as the agent works.
+
+Self-Evolving
+The evolution engine observes usage patterns and failures. It proposes spec changes. Agents create new skills. The human approves every change.
+
+Self-Describing API
+Every skill generates a queryable API with schema discovery. Agents find available actions and valid transitions without documentation.
+
+Full Audit Trail
+Every action records agent identity, before/after state, and the authorization decision. Agents can query their own history.
+
+Hot-Reload
+Skills deploy and update without downtime. Specs, policies, and integrations reload live.
+
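The new README text describes three behaviors together: a spec made of states and transitions, default-deny authorization, and an audit record holding agent identity, before/after state, and the decision. A minimal Python sketch of how those pieces could fit is below. It is not Temper's API: `Spec`, `Skill`, `act`, and the `allowed` set are hypothetical names invented for illustration, the in-memory set stands in for real authorization policies, and no formal verification is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Hypothetical skill spec: an initial state plus named transitions."""
    initial: str
    # action name -> (required current state, resulting state)
    transitions: dict[str, tuple[str, str]]


@dataclass
class Skill:
    """Hypothetical runtime wrapper; not the real Temper kernel."""
    spec: Spec
    # (agent, action) pairs a human has approved; empty means deny everything
    allowed: set[tuple[str, str]] = field(default_factory=set)
    state: str = ""
    audit: list[dict] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.state = self.spec.initial

    def act(self, agent: str, action: str) -> bool:
        """Attempt an action; return True only if the state changed."""
        before = self.state
        # Default-deny: an action is permitted only if explicitly approved.
        permitted = (agent, action) in self.allowed
        transition = self.spec.transitions.get(action)
        # The transition must exist and start from the current state.
        valid = transition is not None and transition[0] == before
        applied = permitted and valid
        if applied:
            self.state = transition[1]
        # Record every attempt, including denied ones.
        self.audit.append({
            "agent": agent,
            "action": action,
            "before": before,
            "after": self.state,
            "decision": "allow" if permitted else "deny",
            "applied": applied,
        })
        return applied


skill = Skill(Spec("Draft", {"Publish": ("Draft", "Published")}))
first = skill.act("agent-1", "Publish")      # denied: nothing approved yet
skill.allowed.add(("agent-1", "Publish"))    # stands in for human approval
second = skill.act("agent-1", "Publish")     # allowed and applied
```

In this sketch the denied attempt still lands in `skill.audit`, which is one way the "denied actions surface to the human for approval" behavior could be fed.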