Agents that remember, forge their own tools, and survive long-running sessions. Persistent cognitive memory, optional HEXACO personality, multi-agent orchestration, and one dispatch interface across 11 LLM providers. Apache-2.0.
AgentOS is an open-source TypeScript framework for AI agents that remember, adapt, and write their own tools.
- Top open-source memory benchmarks: 85.6% on LongMemEval-S at $0.0090/correct (gpt-4o), and 70.2% on LongMemEval-M, the only open-source library above 65% on M with reproducible methodology.
- Runtime tool forging. An agent writes a TypeScript function with a Zod schema, an LLM judge approves it, and it runs in a hardened node:vm sandbox before joining the catalog for the rest of the session.
- Persistent cognitive memory with 8 neuroscience-backed mechanisms, including Ebbinghaus decay, retrieval-induced forgetting, reconsolidation, and source-confidence decay.
- Optional HEXACO personality, 6 orchestration strategies, guardrails, and voice across 11 LLM providers; 100+ extensions and 88 skills auto-load at startup.
Runtime tool forging + multi-agent collaboration. Reproduce with node examples/emergent-hierarchical-spawning.mjs.
npm install @framers/agentos

import { agent } from '@framers/agentos';
const tutor = agent({
provider: 'anthropic', // resolves to claude-sonnet-4-6 (provider default)
// model: 'claude-opus-4-8', // pin a specific model to override the default
instructions: 'You are a patient CS tutor.',
personality: { openness: 0.9, conscientiousness: 0.95 },
memory: { types: ['episodic', 'semantic'], working: { enabled: true } },
});
// Provider auto-detected from env when `provider` is omitted.
const session = tutor.session('student-1');
await session.send('Explain recursion with an analogy.');
await session.send('Can you expand on that?'); // remembers context

Full quickstart * Examples cookbook * API reference
Three things accumulate across a session and compose into behavior: memory (what was said, decided, retrieved), the tool surface (which grows when an agent forges a tool the judge approves), and an optional HEXACO personality vector that biases retrieval, routing, and decisions. Each is configurable and observable.
Runtime tool forging. When no tool covers a sub-task, the agent writes a TypeScript function with a Zod schema; a separate LLM judge approves it; it runs in a hardened node:vm sandbox (5s wall clock, no eval/require/process), then joins a discoverable index for the rest of the session. First forge costs full tokens; reuse costs tens. Promoted tools export as SKILL.md skills. Emergent capabilities ->
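The sandbox constraints described above (a wall-clock limit, no eval, no require or process) can be approximated with Node's built-in node:vm module. The following is a minimal sketch under stated assumptions, not AgentOS's implementation: `runForgedTool` and its parameters are hypothetical names, and the Zod validation and judge approval steps are omitted. Node's own documentation cautions that node:vm by itself is not a security mechanism, so read this only as an illustration of the timeout and no-eval constraints.

```typescript
import * as vm from 'node:vm';

// Hypothetical helper (not an AgentOS API): run a forged tool's source in a fresh context.
export function runForgedTool(source: string, input: unknown, timeoutMs = 5000): unknown {
  // The context exposes only `input`; `require` and `process` are never injected.
  const context = vm.createContext(
    { input },
    // Disallow eval() and new Function() inside the context.
    { codeGeneration: { strings: false, wasm: false } },
  );
  // `source` is expected to be a function expression, e.g. "(x) => x.a + x.b".
  const script = new vm.Script(`(${source})(input)`);
  // Synchronous code that exceeds the wall-clock budget is interrupted with an error.
  return script.runInContext(context, { timeout: timeoutMs });
}

const sum = runForgedTool('(x) => x.a + x.b', { a: 2, b: 3 });
console.log(sum); // 5
```

A tool that calls eval() throws inside the context, and `typeof process` evaluates to 'undefined' there because the global was never handed over.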
HEXACO personality (optional). Off by default; the runtime behaves identically without it. When supplied, the kernel weights retrieval, specialist routing, and tool selection by trait values, so the same prompt and tools yield measurably different decision sequences. It lives in the kernel, not the prompt, so it persists under context pressure. HEXACO docs ->
Soul files. Identity, voice, hard limits, and HEXACO scores can live in a SOUL.md workspace. Its memory/ directory is a markdown wiki (an index.md catalog plus entities/, concepts/, log/ pages with [[wikilinks]]) that is the agent's long-term memory: markdown is the source of truth, the vector/graph index is rebuilt from it, and souledAgent() wires it end to end. Soul Files ->
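Because markdown is the source of truth, a link graph can in principle be rebuilt from the pages alone. The sketch below shows one way that could work; `extractWikilinks` and `buildLinkGraph` are hypothetical names, and support for a `[[target|label]]` form is an assumption borrowed from common wiki conventions rather than something the Soul Files docs are known to specify.

```typescript
// Hypothetical helper: collect unique [[wikilink]] targets from one markdown page.
export function extractWikilinks(markdown: string): string[] {
  const targets = new Set<string>();
  // Matches [[target]] and [[target|label]]; the label part is ignored.
  const pattern = /\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    targets.add(match[1].trim());
  }
  return Array.from(targets);
}

// Hypothetical index builder: page name -> pages it links to.
export function buildLinkGraph(pages: Record<string, string>): Map<string, string[]> {
  const graph = new Map<string, string[]>();
  for (const [name, body] of Object.entries(pages)) {
    graph.set(name, extractWikilinks(body));
  }
  return graph;
}

const graph = buildLinkGraph({
  'index.md': 'Catalog: [[entities/aria]] and [[concepts/recursion|recursion]].',
  'entities/aria.md': 'Aria teaches [[concepts/recursion]].',
});
console.log(graph.get('index.md'));
```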
import { souledAgent } from '@framers/agentos';
const aria = await souledAgent({ provider: 'anthropic', soul: '~/.agentos/agents/aria' });

gpt-4o reader, gpt-4o-2024-08-06 judge, full N=500, single-CLI reproduction with bootstrap 95% CIs and per-benchmark judge-FPR probes.
- LongMemEval-S: 85.6% at $0.0090/correct, 3,558 ms p50: +1.4 points over Mastra OM gpt-4o (84.23%), 0.4 behind Emergence.ai's closed-source 86%. The highest publicly reproducible open-source number at gpt-4o.
- LongMemEval-M: 70.2% (1.5M-token haystacks, 500 sessions): the only open-source library above 65% on M with reproducible methodology.
Full leaderboard -> * Transparency audit -> * LongMemEval paper (Wu et al., ICLR 2025)
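For readers unfamiliar with bootstrap confidence intervals, here is a minimal percentile-bootstrap sketch over per-case 0/1 outcomes. It is an illustration only, not the code in @framers/agentos-bench: the function names, the resample count, and the seeded PRNG are assumptions, and the input is synthetic (428 of 500 cases marked correct, which is simply 85.6% of N=500), not the benchmark's actual per-case results.

```typescript
// Small deterministic PRNG (mulberry32) so the sketch is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Percentile bootstrap 95% CI for accuracy over per-case outcomes (1 = correct, 0 = wrong).
export function bootstrapAccuracyCI(
  outcomes: number[],
  resamples = 2000,
  seed = 1,
): { mean: number; lo: number; hi: number } {
  const rand = mulberry32(seed);
  const n = outcomes.length;
  const mean = outcomes.reduce((s, x) => s + x, 0) / n;
  const means: number[] = [];
  for (let r = 0; r < resamples; r++) {
    // Resample n cases with replacement and record the accuracy of the resample.
    let correct = 0;
    for (let i = 0; i < n; i++) correct += outcomes[Math.floor(rand() * n)];
    means.push(correct / n);
  }
  means.sort((x, y) => x - y);
  return {
    mean,
    lo: means[Math.floor(0.025 * resamples)],
    hi: means[Math.ceil(0.975 * resamples) - 1],
  };
}

// Synthetic input: 428 correct out of 500.
const synthetic = Array.from({ length: 500 }, (_, i) => (i < 428 ? 1 : 0));
console.log(bootstrapAccuracyCI(synthetic));
```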
| vs. | AgentOS differentiator |
|---|---|
| LangChain / LangGraph | Cognitive memory (8 neuroscience-backed mechanisms), HEXACO personality, runtime tool forging |
| Vercel AI SDK | Multi-agent teams (6 strategies), 7 vector backends, guardrails, voice/telephony |
| CrewAI / Mastra | Unified orchestration (DAGs + graphs + missions), personality-driven routing, published reproducible numbers on LongMemEval-S (85.6%) and LongMemEval-M (70.2%) with full methodology disclosure |
| Category | Highlights |
|---|---|
| LLM Providers | 11 (9 API-key + 2 local CLI): OpenAI, Anthropic, Gemini, Groq, Ollama, OpenRouter, Together, Mistral, xAI, Claude CLI, Gemini CLI. Plus image/video/audio generation providers. |
| Cognitive Memory | 8 mechanisms: reconsolidation, retrieval-induced forgetting, involuntary recall, FOK, gist extraction, schema encoding, source decay, emotion regulation |
| HEXACO Personality | 6 traits modulate memory, retrieval bias, response style |
| RAG Pipeline | 7 vector backends * 4 retrieval strategies * GraphRAG * HyDE * Cohere rerank-v3.5 |
| Multi-Agent Teams | 6 coordination strategies * shared memory * inter-agent messaging * HITL gates |
| Orchestration | workflow() DAGs * AgentGraph cycles * mission() goal-driven planning * checkpointing |
| Guardrails | 5 security tiers * 6 packs (PII, ML classifiers, topicality, code safety, grounding, content policy) |
| Emergent Capabilities | Runtime tool forging * 4 self-improvement tools * tiered promotion * skill export |
| Voice & Telephony | ElevenLabs, Deepgram, Whisper * Twilio, Telnyx, Plivo |
| Channels | 37 platform adapters (Telegram, Discord, Slack, WhatsApp, webchat, ...) |
| Observability | OpenTelemetry * usage ledger * cost guard * circuit breaker |
import { agency } from '@framers/agentos';
const team = agency({
strategy: 'graph',
agents: {
researcher: { provider: 'anthropic', instructions: 'Find relevant facts.' }, // -> claude-sonnet-4-6
writer: { provider: 'openai', instructions: 'Summarize clearly.', dependsOn: ['researcher'] }, // -> gpt-4o
reviewer: { provider: 'gemini', instructions: 'Check accuracy.', dependsOn: ['writer'] }, // -> gemini-2.5-flash
},
});
const result = await team.generate('Compare TCP vs UDP for game networking.');

Strategies: sequential, parallel, debate, review-loop, hierarchical, graph. With hierarchical + emergent: { enabled: true }, the manager forges new sub-agents at runtime. Multi-agent docs ->
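The dependsOn edges in the graph example imply an execution order. The sketch below derives one with a Kahn-style topological sort, grouping agents into waves whose dependencies are already satisfied. It is a hypothetical illustration of the idea, not AgentOS's scheduler; `executionWaves` and `AgentSpec` are names invented for this example.

```typescript
type AgentSpec = { dependsOn?: string[] };

// Hypothetical helper: group agents into waves; every agent in a wave has all deps completed.
export function executionWaves(agents: Record<string, AgentSpec>): string[][] {
  const names = new Set(Object.keys(agents));
  const remaining = new Map<string, Set<string>>();
  for (const [name, spec] of Object.entries(agents)) {
    for (const dep of spec.dependsOn ?? []) {
      if (!names.has(dep)) throw new Error(`Unknown dependency "${dep}" for "${name}"`);
    }
    remaining.set(name, new Set(spec.dependsOn ?? []));
  }
  const waves: string[][] = [];
  while (remaining.size > 0) {
    // Agents with no unmet dependencies can run now (in parallel, if the runtime allows).
    const ready = Array.from(remaining.entries())
      .filter(([, deps]) => deps.size === 0)
      .map(([name]) => name);
    if (ready.length === 0) throw new Error('Cycle detected in dependsOn edges');
    for (const name of ready) remaining.delete(name);
    remaining.forEach((deps) => ready.forEach((name) => deps.delete(name)));
    waves.push(ready);
  }
  return waves;
}

const waves = executionWaves({
  researcher: {},
  writer: { dependsOn: ['researcher'] },
  reviewer: { dependsOn: ['writer'] },
});
console.log(waves); // [['researcher'], ['writer'], ['reviewer']]
```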
| Package | Role |
|---|---|
| @framers/agentos | Core runtime: agents, cognitive memory, orchestration, guardrails, voice, 11 LLM providers. Apache-2.0. |
| @framers/agentos-extensions | 100+ first-party extensions: channel adapters, tool packs, integrations, guardrail packs. |
| @framers/agentos-extensions-registry | Discovery + auto-loader for the extensions catalog. |
| @framers/agentos-skills | 88 curated SKILL.md skills. |
| @framers/agentos-skills-registry | Discovery + auto-loader for skills; where promoted forged tools land. |
| @framers/agentos-bench | Open benchmark harness: bootstrap 95% CIs, judge-FPR probes, per-case run JSONs. MIT. |
| @framers/sql-storage-adapter | Cross-platform SQL persistence: SQLite, Postgres, IndexedDB, Capacitor SQLite. |
| paracosm | AI agent swarm simulation on AgentOS. Live demo. |
| wunderland | Batteries-included CLI + daemon over the AgentOS registries (preview). Apache-2.0. |
Extensions and skills auto-load at startup. Extensions architecture ->
Credentials resolve through three layers, highest priority first: an inline apiKey on the call, then a module-level setDefaultProvider() at boot, then environment-variable auto-detection (OPENAI_API_KEY, ANTHROPIC_API_KEY, and the rest, resolved in priority order and reorderable with setProviderPriority([...])). Comma-separated keys auto-rotate on quota.
Full credential resolution + default models per provider ->
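The precedence rule and key rotation described above can be sketched as follows. Every name here (`resolveCredential`, `KeyRotator`, the option shapes) is hypothetical and exists only to make the ordering concrete; the real resolution logic lives behind setDefaultProvider() and setProviderPriority(), and wrapping around to the first key after the last one is an assumption.

```typescript
// All names are hypothetical; this mirrors the described precedence, not AgentOS internals.
type Source = 'inline' | 'module-default' | 'env';

interface ResolvedCredential {
  provider: string;
  keys: string[]; // more than one entry when the value was comma-separated
  source: Source;
}

interface ResolveOptions {
  inline?: { provider: string; apiKey: string };        // layer 1: passed on the call
  moduleDefault?: { provider: string; apiKey: string }; // layer 2: set once at boot
  env: Record<string, string | undefined>;              // layer 3: environment variables
  envPriority: Array<{ provider: string; envVar: string }>;
}

const splitKeys = (raw: string): string[] =>
  raw.split(',').map((k) => k.trim()).filter((k) => k.length > 0);

export function resolveCredential(opts: ResolveOptions): ResolvedCredential | undefined {
  if (opts.inline) {
    return { provider: opts.inline.provider, keys: splitKeys(opts.inline.apiKey), source: 'inline' };
  }
  if (opts.moduleDefault) {
    return { provider: opts.moduleDefault.provider, keys: splitKeys(opts.moduleDefault.apiKey), source: 'module-default' };
  }
  // First provider in priority order whose variable holds at least one key wins.
  for (const { provider, envVar } of opts.envPriority) {
    const keys = splitKeys(opts.env[envVar] ?? '');
    if (keys.length > 0) return { provider, keys, source: 'env' };
  }
  return undefined;
}

export class KeyRotator {
  private readonly keys: string[];
  private index = 0;
  constructor(keys: string[]) {
    if (keys.length === 0) throw new Error('At least one key is required');
    this.keys = keys;
  }
  current(): string {
    return this.keys[this.index];
  }
  // Call when the active key hits a quota error; wraps around after the last key (assumption).
  rotate(): string {
    this.index = (this.index + 1) % this.keys.length;
    return this.current();
  }
}
```

Passing the environment in as a plain record, rather than reading it globally, keeps the sketch testable and makes the precedence explicit at the call site.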
- agent(): lightweight stateful agent. Prompts, sessions, personality, hooks, tools, memory.
- agency(): multi-agent teams + full runtime. Emergent tooling, guardrails, RAG, voice, channels, HITL.
- generateText()/streamText()/generateObject()/generateImage()/generateVideo()/generateMusic()/performOCR()/embedText(): low-level multi-modal helpers with native tool calling.
- workflow()/AgentGraph/mission(): three orchestration authoring APIs over one graph runtime.
Provider fallback is an explicit opt-in via agent({ fallbackProviders: [...] }); the runtime never silently retries against a different provider unless you configure a chain.
Full API reference -> * High-Level API guide ->
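The opt-in fallback behaviour can be pictured as a chain that only exists when the caller supplies one. This is a sketch of the pattern, not the runtime's code: `callWithFallback` and the `call` callback are hypothetical, and only the fallbackProviders option name comes from the text above.

```typescript
type ProviderCall<T> = (provider: string) => Promise<T>;

// Hypothetical helper: try the primary provider, then each configured fallback in order.
export async function callWithFallback<T>(
  primary: string,
  call: ProviderCall<T>,
  fallbackProviders: string[] = [], // empty by default: no silent retry elsewhere
): Promise<{ provider: string; value: T }> {
  const chain = [primary, ...fallbackProviders];
  let lastError: unknown;
  for (const provider of chain) {
    try {
      return { provider, value: await call(provider) };
    } catch (err) {
      lastError = err; // remember the failure and move to the next provider, if any
    }
  }
  // With no fallbacks configured, this is simply the primary provider's error.
  throw lastError;
}
```

With the default empty list, a failure on the primary provider surfaces immediately to the caller, which matches the "never silently retries" guarantee.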
- Benchmarks: benchmark tables, 95% confidence intervals, methodology audit
- Architecture: system design, layer breakdown
- Cognitive Memory: 8 mechanisms with 30+ APA citations
- RAG Configuration: vector stores, embeddings, sources
- Guardrails: 5 tiers, 6 packs
- Voice Pipeline: TTS, STT, telephony
- Blog: engineering posts, benchmark publications, transparency audits
- Discord * GitHub Issues * Wilds.ai (AI game worlds powered by AgentOS)
git clone https://github.com/framerslab/agentos.git && cd agentos
pnpm install && pnpm build && pnpm test

We use Conventional Commits. Project guides:
| Guide | What |
|---|---|
| Contributing | Dev setup, PR checklist, commit conventions, contribution licensing |
| Adding an LLM provider | Provider interface, acceptance checklist, vendor-neutrality policy |
| Maintainers | Who reviews and merges changes |
| Code of Conduct | Community standards |
| Security Policy | Reporting vulnerabilities privately |
| Support | Where to get help |
| Sponsors | Funding and the vendor-neutral placement policy |
AgentOS is Apache-2.0 and free. We integrate any quality provider on technical merit, and partners and sponsors are featured in the README and docs, labeled as such. Companies engage through partner startup programs, sponsorship, or a provider integration. See SPONSORS.md.
| Partner | Type | Provides | Since |
|---|---|---|---|
| | Startup Program | Speech-to-text + text-to-speech credits, go-to-market | 2026 |
| Track | What it is | Where |
|---|---|---|
| Sponsor | Fund development. Disclosed logo placement + release-notes credit. | SPONSORS.md |
| Provider integration | Ship your model or API as a supported provider. Free, on technical merit. | Provider guide |
Interested? Email team@frame.dev.