GitHub - fredm23579/Agent-Omega: Governed recursive self-improvement control plane — autonomous proposal generation, constitutional constraints, ensemble evaluation, chain-of-thought monitoring, staged deployment, 13 ADRs, 600+ tests. v2.0.0.

AgentΩ
Governed Recursive Self-Improvement Control Plane

Live Portal · Install as App · Download Source · API Reference

What is Agent-Omega

Agent-Omega is a control plane for AI systems that modify themselves. It provides the governance machinery — constitutional constraints, multi-dimensional evaluation, staged deployment, and immutable audit trails — that allows an AI agent to propose, validate, evaluate, and deploy changes to its own structure without losing safety, accountability, or the ability to roll back.

This is not a coding assistant, a chatbot framework, or a model training pipeline. It is the governance layer that sits between an AI system's desire to self-improve and its actual ability to do so.

In plain terms: if you're building an AI agent that should be able to change its own prompts, tools, reasoning strategies, or architecture — Agent-Omega is the system that decides whether each proposed change is safe, tracks what changed and why, and deploys it through staged rollout with automatic rollback.

Why It Exists

Unrestricted self-modification collapses accountability. If a system can change any part of itself at any time, there is no stable basis for:

Knowing what changed and why
Evaluating whether the change was an improvement
Rolling back if it was not
Preventing the system from disabling its own safeguards
Attributing decisions to evidence rather than optimization pressure

Agent-Omega was built to solve this problem. It separates five concerns: proposing changes, validating them against structural rules, evaluating them across multiple dimensions, deciding based on evidence thresholds, and deploying them through staged rollout. Each step is independently auditable.

The research motivation comes from the AI safety literature: Anthropic's constitutional AI framework (4-tier priority hierarchy), formal verification approaches to safe recursive self-improvement (LessWrong/MIT), and the emerging alignment-by-architecture paradigm, in which systems are designed so that misaligned changes are structurally ruled out rather than trained away.

Who Needs It

You need Agent-Omega if you are:

Building an AI agent that modifies itself — any agent that changes its own prompts, tools, reasoning chains, or architecture needs governance to prevent unsafe drift
An AI safety researcher — you need a concrete, running implementation of governed self-improvement to test theories against, not just papers
An MLOps / platform team — you need structured model deployment with evidence-based approval, staged rollout, automatic rollback, and audit trails for compliance
Building for regulated industries — healthcare, finance, government — where you need to demonstrate change management, evidence-based decisions, and immutable records (EU AI Act Article 12, SOC 2)
A developer building agentic applications — you want your agent to improve over time, but you need guardrails so it doesn't break itself

You do NOT need Agent-Omega if:

You're building a simple chatbot with static prompts
You don't need your AI system to modify itself
You're looking for a model training or fine-tuning framework
You want a general-purpose task runner or CI/CD pipeline

Screenshots

Dashboard

Live health status, mutation budget, archivist summary, and quick actions.

Guided Onboarding

Step-by-step walkthrough with 4 use-case examples for different audiences.

Mutation Lifecycle

Submit proposals and run the full governance pipeline: constitutional check → validate → evaluate → decide.

Constitutional Governance

6 immutable constraints, mutation budget status, and dry-run constraint checker.

Staged Deployments

Create and manage deployments through shadow → canary → promote with interactive controls.

Configuration

All 10 environment variables documented, 4 setup recipes, live status panel.

Interactive API Docs

Auto-generated Swagger UI with every endpoint documented.

Install

As a native app (all platforms)

Agent-Omega is a Progressive Web App. Install it directly from the live portal:

Platform	How to install
Windows / macOS / Linux	Open the portal in Chrome or Edge → click install icon in address bar
Android	Open in Chrome → menu (⋮) → "Install app"
iPhone / iPad	Open in Safari → Share → "Add to Home Screen"

From source

git clone https://github.com/fredm23579/Agent-Omega.git
cd Agent-Omega
pip install -e ".[test]"

Or download the ZIP.

Requirements: Python 3.12+ only. No other system dependencies for in-memory mode.

Quick Start

# Start the control plane (no configuration required)
uvicorn apps.server.primary_runtime:app --reload

# Open http://localhost:8000 (redirects to web console)

That's it. The server starts in in-memory mode with all governance features active. Visit http://localhost:8000/console/guide for a guided walkthrough.

First things to try

Submit a mutation: Go to /console/mutations, fill in the form, click "Run Full Lifecycle"
Test constitutional constraints: On the same page, click "Pre-Check Constitutional" with changed_module_ids: ["governance_core"] — watch it get blocked
Create a deployment: Go to /console/deployments, create one, then walk it through shadow → canary → promote
Check the audit trail: Go to /console/archivist to see recorded outcomes and patterns

Using the API directly

curl

curl http://localhost:8000/api/v1/health
curl -X POST http://localhost:8000/api/v1/mutations/lifecycle \
  -H "Content-Type: application/json" \
  -d '{"mutation_class":"parameter_update","parent_system_version_id":"v1","hypothesis":"Improve quality"}'
curl http://localhost:8000/api/v1/budget/status

Python (httpx)

import httpx

client = httpx.Client(base_url="http://localhost:8000/api/v1")

# Health check
print(client.get("/health").json())

# Run mutation lifecycle
result = client.post("/mutations/lifecycle", json={
    "mutation_class": "parameter_update",
    "parent_system_version_id": "v1",
    "hypothesis": "Improve response quality",
}).json()
print(result["decision"])  # {"decision": "quarantine", ...}

# Constitutional check (dry run)
check = client.post("/constitutional/check", json={
    "changed_module_ids": ["governance_core"],
}).json()
print(check["passed"])  # False — governance modules are protected

# Budget status
print(client.get("/budget/status").json())

JavaScript (fetch)

const API = "http://localhost:8000/api/v1";

// Health check
const health = await fetch(`${API}/health`).then(r => r.json());

// Run mutation lifecycle
const result = await fetch(`${API}/mutations/lifecycle`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    mutation_class: "parameter_update",
    parent_system_version_id: "v1",
    hypothesis: "Improve response quality",
  }),
}).then(r => r.json());

Using the CLI

python -m apps.cli.main health --json
python -m apps.cli.main init my-system --json
python -m apps.cli.main providers-health --json

How to Use

Step 1: Register a system

A "system" is the AI agent you want to govern. Register it once:

curl -X POST http://localhost:8000/api/v1/systems -H "Content-Type: application/json" \
  -d '{"name": "my-agent"}'
# Returns: {"id": "sys-1", "name": "my-agent", "status": "draft"}

Step 2: Submit a mutation proposal

When your agent wants to change itself (update a prompt, add a tool, modify architecture), it submits a proposal:

curl -X POST http://localhost:8000/api/v1/mutations/lifecycle \
  -H "Content-Type: application/json" \
  -d '{
    "proposal_id": "improve-reasoning",
    "parent_system_version_id": "v1",
    "mutation_class": "parameter_update",
    "hypothesis": "Adding chain-of-thought will improve accuracy",
    "changed_module_ids": ["reasoning-engine"],
    "payload": {"temperature": 0.7}
  }'

The system will:

Constitutional check — verify the proposal doesn't violate any of the 6 immutable rules
Validate — run 7 structural checks (schema, graph, contracts, tiers, capabilities, resources, preflight)
Evaluate — score across 7 dimensions (task improvement, generalization, risk, maintainability, calibration, efficiency, integrity)
Decide — accept, reject, or quarantine based on evidence thresholds
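The decide step above can be pictured as a pure function over the seven dimension scores. The sketch below is illustrative only: the `decide` function, the threshold values, the normalization (every score in [0, 1], higher is better, so a high risk score means low risk), and the rule that missing evidence leads to quarantine are all assumptions, not the actual decision engine.

```python
from enum import Enum

# Dimension names follow the list above; everything else here is hypothetical.
DIMENSIONS = (
    "task_improvement", "generalization", "risk", "maintainability",
    "calibration", "efficiency", "integrity",
)


class Decision(str, Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    QUARANTINE = "quarantine"


def decide(scores, accept_at=0.7, reject_below=0.3):
    """Apply threshold rules to per-dimension scores in [0, 1]."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        # Incomplete evidence is never enough to accept (assumed rule).
        return Decision.QUARANTINE
    values = [scores[d] for d in DIMENSIONS]
    if min(values) < reject_below:
        # One clearly failing dimension rejects the proposal outright.
        return Decision.REJECT
    if min(values) >= accept_at:
        # Every dimension must clear the bar; no single score can dominate.
        return Decision.ACCEPT
    return Decision.QUARANTINE
```

Using the minimum rather than an average is one way to make sure a strong score on one dimension cannot compensate for a weak score on another.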

Step 3: Deploy the change

If accepted, deploy through staged rollout:

# Create deployment
curl -X POST http://localhost:8000/api/v1/deploy/create \
  -H "Content-Type: application/json" \
  -d '{"system_version_id": "v1"}'

# Shadow (test alongside production, no live traffic)
curl -X POST http://localhost:8000/api/v1/deploy/deploy-1/shadow

# Canary (10% of traffic)
curl -X POST http://localhost:8000/api/v1/deploy/deploy-1/canary

# Promote (100% of traffic)
curl -X POST http://localhost:8000/api/v1/deploy/deploy-1/promote

# Or roll back at any stage
curl -X POST http://localhost:8000/api/v1/deploy/deploy-1/rollback
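The ordering that these endpoints enforce (shadow, then canary, then promote, with rollback available throughout) can be modeled as a small state machine. This is a minimal sketch under stated assumptions: the `Stage` and `Deployment` names, the `created` and `rolled_back` states, and the error behavior are hypothetical and do not describe the code in `services/deployment/service.py`.

```python
from enum import Enum


class Stage(str, Enum):
    CREATED = "created"
    SHADOW = "shadow"
    CANARY = "canary"
    PROMOTED = "promoted"
    ROLLED_BACK = "rolled_back"


# Only these forward transitions are allowed; no stage may be skipped.
_NEXT = {
    Stage.CREATED: Stage.SHADOW,
    Stage.SHADOW: Stage.CANARY,
    Stage.CANARY: Stage.PROMOTED,
}


class Deployment:
    def __init__(self, system_version_id):
        self.system_version_id = system_version_id
        self.stage = Stage.CREATED

    def advance(self, target):
        """Move to the next stage, refusing any out-of-order transition."""
        if _NEXT.get(self.stage) is not target:
            raise ValueError(f"cannot go from {self.stage.value} to {target.value}")
        self.stage = target

    def rollback(self):
        """Rollback is permitted from every stage except an existing rollback."""
        if self.stage is Stage.ROLLED_BACK:
            raise ValueError("already rolled back")
        self.stage = Stage.ROLLED_BACK
```

For example, calling `advance(Stage.CANARY)` on a freshly created deployment raises `ValueError`, because shadow has not run yet.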

Step 4: Audit and monitor

# View detected patterns
curl http://localhost:8000/api/v1/archivist/patterns

# Get audit summary
curl http://localhost:8000/api/v1/archivist/summary

# Explore version lineage
curl http://localhost:8000/api/v1/lineage/version-mutation-1
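The lineage endpoint is described as walking the ancestry graph back to its roots. As a rough illustration of that idea, the sketch below does a breadth-first walk over an in-memory parent map. The `walk_to_roots` function and the shape of the `parents` mapping are assumptions made for this example; the real response format is not shown in this README.

```python
from collections import deque


def walk_to_roots(version_id, parents):
    """Return version_id followed by all of its ancestors, nearest first.

    `parents` maps a version id to the list of versions it was derived from
    (a hypothetical in-memory stand-in for the lineage store).
    """
    seen = set()
    order = []
    queue = deque([version_id])
    while queue:
        current = queue.popleft()
        if current in seen:
            # A version reachable through two branches is reported once.
            continue
        seen.add(current)
        order.append(current)
        queue.extend(parents.get(current, []))
    return order
```

Usage, with made-up version ids: `walk_to_roots("version-mutation-1", {"version-mutation-1": ["v1"], "v1": ["v0"]})` visits the mutation, then `v1`, then the root `v0`.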

Configuration

Variable	Purpose	Default
`DATABASE_URL`	PostgreSQL connection string for persistent storage	Not set (in-memory)
`AGENT_OMEGA_USE_PERSISTENT_REGISTRIES`	Enable database-backed registries	`false`
`AGENT_OMEGA_USE_MODEL_EVALUATION`	Use LLM-based evaluation instead of heuristics	`false`
`AGENT_OMEGA_ENABLE_JUDGE`	Enable independent judge verification	`false`
`AGENT_OMEGA_CORS_ORIGINS`	Allowed CORS origins (comma-separated)	`*` (all)
`OPENAI_API_KEY`	OpenAI API key (for model evaluation)	Not set
`ANTHROPIC_API_KEY`	Anthropic API key (for model evaluation)	Not set
`OPENROUTER_API_KEY`	OpenRouter API key (for model evaluation)	Not set

See the Config page for setup recipes.
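To make the defaults in the table concrete, here is a small sketch of how such variables could be read into a typed settings object. The variable names and defaults come from the table; the `Settings` class, the `load_settings` helper, and the set of strings treated as "true" are assumptions and may differ from `apps/server/settings.py`. Provider API keys are left out for brevity.

```python
from __future__ import annotations

import os
from dataclasses import dataclass


def _flag(env: dict[str, str], name: str) -> bool:
    # Assumed parsing rule: unset means false; "1", "true", "yes" mean true.
    return env.get(name, "false").strip().lower() in {"1", "true", "yes"}


@dataclass(frozen=True)
class Settings:
    database_url: str | None
    use_persistent_registries: bool
    use_model_evaluation: bool
    enable_judge: bool
    cors_origins: list[str]


def load_settings(env: dict[str, str] | None = None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = dict(os.environ) if env is None else env
    origins = env.get("AGENT_OMEGA_CORS_ORIGINS", "*")
    return Settings(
        database_url=env.get("DATABASE_URL") or None,  # unset -> in-memory mode
        use_persistent_registries=_flag(env, "AGENT_OMEGA_USE_PERSISTENT_REGISTRIES"),
        use_model_evaluation=_flag(env, "AGENT_OMEGA_USE_MODEL_EVALUATION"),
        enable_judge=_flag(env, "AGENT_OMEGA_ENABLE_JUDGE"),
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the loader easy to test.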

Architecture

Mutation Proposal
       │
       ▼
┌──────────────────┐
│  Constitutional   │  6 immutable rules (ADR-008)
│  Constraint Layer │  Cannot be bypassed or self-modified
└──────┬───────────┘
       ▼
┌──────────────────┐
│  Budget Check     │  Rate limiting (ADR-010)
└──────┬───────────┘
       ▼
┌──────────────────┐
│  7-Stage          │  Schema, graph, contracts, tiers,
│  Validation       │  capabilities, resources, preflight
└──────┬───────────┘
       ▼
┌──────────────────┐
│  7-Dimension      │  Task, generalization, risk, maintainability,
│  Evaluation       │  calibration, efficiency, integrity (ADR-007)
└──────┬───────────┘
       ▼
┌──────────────────┐
│  Decision Engine  │  Accept / reject / quarantine
│  + Judge          │  Independent verification (optional)
└──────┬───────────┘
       ▼
┌──────────────────┐
│  Archivist        │  Record outcome, detect patterns
└──────┬───────────┘
       ▼
┌──────────────────┐
│  Staged Deploy    │  Shadow → canary → promote (ADR-006)
│  + Executor       │  Health checks, traffic allocation, rollback
└──────────────────┘
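The Budget Check stage in the diagram above rate-limits proposals. One common way to implement a per-window budget is a sliding window over recent timestamps, sketched below. The `MutationBudget` class, its method names, and the sliding (rather than fixed) window are assumptions for illustration; ADR-010 and `mutation_budget.py` define the real behavior.

```python
from collections import deque


class MutationBudget:
    """Allow at most max_proposals within any window_seconds-long window."""

    def __init__(self, max_proposals, window_seconds):
        self.max_proposals = max_proposals
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def _evict(self, now):
        # Drop proposals that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()

    def try_consume(self, now):
        """Record a proposal at time `now`; return False if the budget is spent."""
        self._evict(now)
        if len(self._timestamps) >= self.max_proposals:
            return False
        self._timestamps.append(now)
        return True

    def remaining(self, now):
        self._evict(now)
        return self.max_proposals - len(self._timestamps)
```

Taking `now` as a parameter (for example from `time.monotonic()`) keeps the limiter deterministic under test.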

Runtime Topology

Runtime	Module	Purpose
Primary (preferred)	`apps/server/primary_runtime.py`	Top-level entrypoint. Selects canonical or persistent composition.
Canonical	`apps/server/canonical_runtime.py`	In-memory-default composition.
Persistent	`apps/server/persistent_runtime.py`	All state through SQLAlchemy. Requires `DATABASE_URL`.

Repository Structure

Agent-Omega/
├── apps/
│   ├── server/              # FastAPI control plane
│   │   ├── primary_runtime.py       # Preferred entrypoint
│   │   ├── api_v2_router.py         # All API routes (36 endpoints)
│   │   ├── middleware.py            # CORS configuration
│   │   ├── settings.py             # Environment configuration
│   │   └── *_factory.py            # Service graph composition
│   ├── cli/                 # Typer CLI (health, init, providers)
│   ├── github_app/          # GitHub App webhooks (ADR-004)
│   └── web/                 # 13-page interactive web console
│       └── console.py
├── services/
│   ├── kernel/              # Core governance
│   │   ├── constitutional.py        # 6 immutable constraints (ADR-008)
│   │   ├── validation.py           # 7-stage validation pipeline
│   │   ├── evaluation_engine.py    # Heuristic 7-dimension scoring
│   │   ├── model_evaluation_engine.py  # LLM-based scoring
│   │   ├── decision_engine.py      # Evidence-based decisions
│   │   ├── adaptive_thresholds.py  # Feedback-driven thresholds (ADR-009)
│   │   ├── mutation_budget.py      # Rate limiting (ADR-010)
│   │   └── canonical_service.py    # Composed kernel service
│   ├── deployment/          # Shadow → canary → promote (ADR-006)
│   │   ├── service.py              # State machine
│   │   ├── executor.py             # Health checks + traffic allocation
│   │   └── persistent_service.py   # SQLAlchemy-backed
│   ├── judge/               # Independent verification (3 modes)
│   ├── archivist/           # Outcome recording + pattern detection
│   ├── ir_compiler/         # 6-stage IR compilation pipeline
│   ├── sandbox/             # Capability/resource/secret isolation (ADR-005)
│   ├── lineage/             # Version ancestry tracking
│   ├── ir_registry/         # IR version storage
│   └── system_registry/     # System record management
├── packages/
│   ├── core_types/          # 32 Pydantic models, 6 enums
│   ├── provider_openai/     # OpenAI adapter (Responses API + SSE)
│   ├── provider_anthropic/  # Anthropic adapter (Messages API + SSE)
│   ├── provider_openrouter/ # OpenRouter adapter (Chat completions + SSE)
│   ├── openclaw_bridge/     # OpenClaw thin client
│   ├── github_integration/  # JWT auth, webhooks, event routing
│   └── storage/             # 9 SQLAlchemy models, 5 Alembic migrations
├── docs/
│   ├── adr/                 # 10 Architectural Decision Records
│   ├── architecture/        # System design documents
│   ├── api/                 # API reference
│   └── screenshots/         # Console screenshots
├── tests/                   # 602 tests (unit + integration + stress)
├── .github/workflows/       # CI: ruff + pytest on 3.12/3.13
└── pyproject.toml           # v2.0.0

Constitutional Governance

As of v2.0, Agent-Omega enforces 6 immutable constitutional constraints checked before any mutation enters the governance pipeline. These constraints cannot be bypassed, weakened, or self-modified:

ID	Constraint	What it prevents
C1	Governance self-preservation	No mutation may target governance or constitutional components
C2	Safety service protection	No mutation may disable evaluation, judge, archivist, or sandbox
C3	Deployment discipline	No mutation may bypass shadow → canary → promote
C4	Auditability preservation	No mutation may remove lineage tracking or audit trails
C5	Tier escalation prevention	No mutation may self-grant authority-tier escalation
C6	Mutation budget	Rate limiting prevents runaway proposal generation
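A dry-run check in the spirit of C1, C2, and C4 can be sketched as a lookup of changed modules against protected sets. Only `governance_core` appears in this README; the other module ids, the `PROTECTED` mapping, the `CheckResult` type, and the choice to treat any change to a safety service as a violation are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical protected-module sets keyed by constraint id.
PROTECTED = {
    "C1": {"governance_core", "constitutional"},
    "C2": {"evaluation", "judge", "archivist", "sandbox"},
    "C4": {"lineage"},
}


@dataclass
class CheckResult:
    passed: bool
    violations: list = field(default_factory=list)


def constitutional_check(changed_module_ids):
    """Return which constraints a proposal would violate, without applying it."""
    violations = [
        f"{constraint_id}: {module}"
        for constraint_id, modules in sorted(PROTECTED.items())
        for module in changed_module_ids
        if module in modules
    ]
    return CheckResult(passed=not violations, violations=violations)
```

Because the protected sets live outside anything a mutation payload can reference, a proposal has no way to edit the rules it is being checked against in this sketch.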

Decision thresholds adapt over time from archivist pattern feedback (ADR-009), within constitutional floors that prevent unsafe relaxation.
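Bounded adaptation can be illustrated with a single update rule: the threshold moves with feedback but is clamped at a floor. The `adapt_threshold` function, its step size, and the use of a "bad outcome rate" as the feedback signal are assumptions for this sketch, not the ADR-009 algorithm.

```python
def adapt_threshold(current, bad_outcome_rate, floor, step=0.05, target_rate=0.1):
    """Nudge an acceptance threshold, never letting it drop below `floor`."""
    if bad_outcome_rate > target_rate:
        # Too many accepted changes went badly: tighten.
        proposed = current + step
    else:
        # Track record is good: relax slightly.
        proposed = current - step
    # Clamp to [floor, 1.0]; the floor is what prevents unsafe relaxation.
    return min(1.0, max(floor, proposed))
```

However favorable the history, repeated relaxation converges to `floor` and stops there.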

API Reference

36 endpoints across 10 groups. Full documentation at /docs (Swagger) when the server is running.

Method	Path	Description
Health
`GET`	`/api/v1/health`	Server health check
Systems
`POST`	`/api/v1/systems`	Create a system record
`GET`	`/api/v1/systems/{id}`	Fetch a system record
IR Registry
`POST`	`/api/v1/ir/register`	Register an IR version
`GET`	`/api/v1/ir/{id}`	Fetch an IR version
`GET`	`/api/v1/ir/{from}/diff/{to}`	Diff two IR versions
`POST`	`/api/v1/ir/compile`	Compile IR to executable artifact
`GET`	`/api/v1/ir/artifacts/{id}`	Fetch a compiled artifact
Mutations
`POST`	`/api/v1/mutations/propose`	Submit a mutation proposal
`POST`	`/api/v1/mutations/{id}/validate`	Validate through 7-stage pipeline
`POST`	`/api/v1/mutations/{id}/evaluate`	Evaluate across 7 dimensions
`POST`	`/api/v1/mutations/{id}/decide`	Evidence-based decision
`POST`	`/api/v1/mutations/lifecycle`	Full lifecycle (constitutional → validate → evaluate → decide)
Lineage
`POST`	`/api/v1/lineage/record`	Record a lineage entry
`GET`	`/api/v1/lineage/{version_id}`	Walk lineage graph to roots
Deployment
`POST`	`/api/v1/deploy/create`	Create a deployment
`POST`	`/api/v1/deploy/{id}/shadow`	Start shadow deployment
`POST`	`/api/v1/deploy/{id}/canary`	Promote to canary
`POST`	`/api/v1/deploy/{id}/promote`	Promote to production
`POST`	`/api/v1/deploy/{id}/rollback`	Rollback deployment
`GET`	`/api/v1/deploy/{id}/status`	Detailed deployment status
`GET`	`/api/v1/deploy/{id}/health`	Health check results
Sandbox
`POST`	`/api/v1/sandbox/execute`	Execute artifact in sandbox
`GET`	`/api/v1/sandbox/results/{id}`	Fetch execution result
Providers
`GET`	`/api/v1/providers/health`	Aggregated provider status
`POST`	`/api/v1/providers/{provider}/stream`	Stream from provider (SSE)
Judge
`POST`	`/api/v1/judge/verify`	Independent evaluation verification
Archivist
`POST`	`/api/v1/archivist/record`	Record lifecycle outcome
`GET`	`/api/v1/archivist/patterns`	Detected patterns and anti-patterns
`GET`	`/api/v1/archivist/summary`	Transfer summary for time window
Governance
`POST`	`/api/v1/constitutional/check`	Dry-run constitutional check
`GET`	`/api/v1/budget/status`	Mutation budget status
Proposals
`POST`	`/api/v1/proposals/generate`	Generate autonomous proposals from pattern data
`POST`	`/api/v1/proposals/generate-and-run`	Generate + run each through full lifecycle

Technologies

Category	Technology	Purpose
Language	Python 3.12+	Type-safe, async-capable
Web Framework	FastAPI 0.115+	API and web console serving
ASGI Server	Uvicorn 0.30+	Production async server
Data Validation	Pydantic 2.7+	32 typed models, OpenAPI schema
CLI	Typer 0.12+	Command-line operator surface
HTTP Client	httpx 0.27+	Provider API calls, SSE streaming
ORM	SQLAlchemy 2.0+	9 persistence models
Migrations	Alembic 1.13+	5 database migrations
Linting	Ruff 0.4+	Linting + formatting (120 char, py312)
Testing	pytest 8.0+ / pytest-asyncio / pytest-cov	592 tests, 94% coverage, 70% gate
CI	GitHub Actions	Python 3.12 + 3.13 matrix
Type Checking	PEP 561 py.typed markers	All packages type-checkable

Comparison with Alternatives

Feature	Agent-Omega	LangGraph	AutoGen	CrewAI
Constitutional constraints	6 immutable rules, pre-validation	No	No	No
Multi-dimensional evaluation	7 dimensions with threshold rules	No built-in	No built-in	No built-in
Staged deployment	Shadow → canary → promote	No	No	No
Independent judge verification	3 verification modes	No	No	No
Immutable audit trail	Archivist + pattern detection	No	No	No
Mutation budget / rate limiting	Per-window budget with constitutional enforcement	No	No	No
Adaptive thresholds	Feedback-driven with constitutional floors	No	No	No
Version lineage tracking	Full ancestry graph	No	No	No
Sandbox execution	Capability/resource/secret isolation	No	Limited	No
Self-modification governance	Primary purpose	Not designed for this	Not designed for this	Not designed for this

Agent-Omega is not a competitor to LangGraph, AutoGen, or CrewAI. Those are agent orchestration frameworks. Agent-Omega is the governance layer that sits on top of any agent — including agents built with those frameworks — to govern how they modify themselves.

Cross-Domain Applications

Agent-Omega's governance model applies wherever an autonomous system needs to modify itself safely:

Domain	Application
AI Agents	Govern prompt mutations, tool additions, reasoning strategy changes
MLOps	Model deployment with evidence-based approval and automatic rollback
Autonomous Vehicles	Govern software updates to self-driving systems with staged rollout
Healthcare AI	Change management for diagnostic AI with audit trails for regulatory compliance
Financial Trading	Govern strategy modifications with risk evaluation and lineage tracking
Robotics	Govern control-system parameter changes with safety constraints
Smart Contracts	Govern upgrades to on-chain logic with constitutional constraints
Cybersecurity	Govern rule changes to threat-detection systems with pattern monitoring

The common pattern: any system where (a) autonomous modification is desirable for improvement, but (b) ungoverned modification is dangerous.

Development Philosophy

Alignment by architecture — The system is structurally incapable of approving mutations that violate its constitutional constraints. Safety is not a behavior to be trained; it is a structural property.
Evaluation-decision separation — The evaluation engine scores candidates; the decision engine applies threshold rules. These are independent stages with typed interfaces. No single score can dominate (ADR-007).
Evidence over authority — Acceptance requires multi-dimensional evidence across 7 dimensions. Authority tier alone is not sufficient to bypass evaluation.
Thin control surfaces — CLI, web console, GitHub App, OpenClaw — all are thin clients over one shared API (ADR-001). Business logic stays in services.
Staged deployment discipline — No change goes directly to production. Shadow → canary → promote, with rollback available at every stage (ADR-006).
Comprehensive auditability — Every decision is recorded with full evidence, lineage is tracked, patterns are detected. The archivist is the system's memory.
Bounded adaptation — Thresholds adapt from historical data, but within constitutional floors that prevent unsafe relaxation. The system self-tunes, but cannot self-corrupt.

Architectural Decision Records

13 ADRs in docs/adr/ govern the architecture; the table below lists ADR-001 through ADR-010:

ADR	Decision
ADR-001	Thin control surfaces over one orchestration API
ADR-002	Reflective IR is the canonical mutable object
ADR-003	Provider access only through adapter interfaces
ADR-004	GitHub integration through a GitHub App
ADR-005	Candidate execution is sandboxed
ADR-006	Deployment requires shadow then canary
ADR-007	Evaluation evidence is multi-dimensional
ADR-008	Constitutional constraints are immutable
ADR-009	Decision thresholds adapt from archivist feedback
ADR-010	Mutation budget rate limiting

Contributing

See CONTRIBUTING.md. PRs should be bounded, tested, and documented.

# Development setup
pip install -e ".[test]"
make check   # ruff check + format + pytest
make coverage  # pytest with coverage report

License

MIT
