diff --git a/ERROR_LOG.md b/ERROR_LOG.md
index 4ff5807..f13f7a3 100644
--- a/ERROR_LOG.md
+++ b/ERROR_LOG.md
@@ -279,3 +279,24 @@
 - Action taken: removed the files created outside the target, fixed `.codex/config.toml` and `scripts/dev-codex.sh`, and revalidated the `container_planning` and `container_aggressive` profiles, the `multi_agent` feature, and the effective MCPs.
 - Status: resolved.
 - Future note: first confirm whether the desired change is in the Codex configuration or in the project scaffolding, and always validate the effective config rendered by the launcher.
+
+## 2026-04-01 - Sprint Completion: F59-F68 Consolidation
+
+- Context: Sprint closed with 10 workstreams completed (F59-F68).
+- Workstreams delivered:
+    - F59: Multi-Agent Session Orchestration
+    - F60: Local Control Plane Foundation
+    - F61: DAG Pipeline Evolution
+    - F62: Copilot Adapter
+    - F63: Memory Engine Enhancement
+    - F64: Advanced Supervisor Policies
+    - F65: Runtime Coordinator Hardening
+    - F66: Reporting & Observability Evolution
+    - F67: Workspace Management v2
+    - F68: Plugin/Extension System
+- Metrics: 755 tests passing, ruff/mypy 100% clean.
+- Error observed: No critical errors during implementation.
+- Identified cause: N/A
+- Action taken: Session-close run with memory consolidation and handoff.
+- Status: Completed successfully.
+- Future note: Stable baseline for the next wave of features. Consider a technical-triage to define the next priority workstreams.
diff --git a/PENDING_LOG.md b/PENDING_LOG.md
index fd03541..2a71042 100644
--- a/PENDING_LOG.md
+++ b/PENDING_LOG.md
@@ -1,253 +1,42 @@
 # PENDING_LOG
 
-## Current triage: incremental architectural evolution
+## Sprint Completion: F59-F68 (2026-04-01)
 
-- On 2026-03-20, the comparative analysis of SynapseOS, Superset, Mastra, and coding-agent was consolidated as the product and architecture direction for the next wave of local evolution.
-- The practical conclusion is that SynapseOS should neither try to reproduce the Superset product nor migrate the core runtime to TypeScript at this time; the immediate net gain lies in absorbing boundaries, internal contracts, and extensibility patterns incrementally on top of the current Python core.
-- On 2026-03-20, the local wave `F51` → `F53` was executed sequentially:
-- `F51-runtime-boundaries-foundation` introduced explicit contracts for `ToolSpec`/capabilities, `WorkspaceProvider`, `RunContext`, and lifecycle hooks in Synapse-Flow.
-- `F52-workspace-isolation-foundation` made the run's effective workspace auditable, with a persisted `workspace_path` and an optional `run-scoped` provider.
-- `F53-observability-runtime-events` enriched the local timeline with `run_context_initialized`, `step_started`, and `state_transitioned`, and surfaced `workspace_path` in `runs show` and `RUN_REPORT.md`.
-- As a result, the immediate active workstream is no longer `F51` and becomes none: the next product decision is again to pick a new small bucket on top of the already expanded baseline.
-- The recommended next queue, in order, is restricted for now to four small, verifiable buckets:
-- `multi-agent-session-orchestration`: formalize registry/capabilities and coordination between adapters without opening a desktop UI.
-- `local-control-plane-foundation`: expose a minimal local API for the TUI and future integrations, staying CLI-first and treating a desktop shell as a later hypothesis.
-- `baseline-handoff-sync`: align `PENDING_LOG.md`, `ERROR_LOG.md`, the README, and feature artifacts with the real post-`F53` state.
-- The `desktop-shell` bucket is explicitly out of the main queue for now; it only returns to the table after `runtime boundaries`, `workspace isolation`, `observability`, and `control plane` have stabilized.
-- The `TypeScript-first runtime migration` bucket is explicitly discarded for now; any future use of TypeScript must be limited to an optional shell/UI consuming an authoritative Python core.
+The sprint's 10 workstreams were completed successfully:
 
-## Recently incorporated decisions
-
-- On 2026-03-13, `origin/main` absorbed the merge of `F42-tui-filters` via PR `#86`, adding local visual filters to the TUI dashboard for failures (`f`), activity (`r`), and restoring the full list (`x`).
-- On 2026-03-13, `origin/main` absorbed the merge of `F40-local-cancellation` via PR `#87`, consolidating `synapse runs cancel <run_id>` and the `k` shortcut in the dashboard as local, graceful run cancellation.
-- With `F42` and `F40`, the current local TUI covers watch, logs via `Enter`, the artifacts explorer via `a`, visual filters, and local cancellation, without opening a scheduler, remote queue, or distributed cancellation.
-- The remaining drift is documentation-only again: `memory.md`, `PENDING_LOG.md`, `ERROR_LOG.md`, `README.md`, and `CHANGELOG.md` fell behind the real post-`F42`/`F40` baseline, and `features/F40-local-cancellation/` and `features/F42-tui-filters/` were left without minimal closing artifacts.
-- The immediate active workstream becomes the doc-only chore `chore-post-f40-f42-baseline-sync`, to consolidate handoff and public documentation before the next product decision.
-- The next product decision is blocked again until this chore closes and a new `technical-triage` picks a single workstream from `main`.
-
-- On 2026-03-13, `origin/main` absorbed the merges of `F41-dashboard-artifacts-explorer` (`#80`), `F44-auth-backend-abstraction` (`#81`), `F47-advanced-rbac` (`#82`), `F43-runtime-robustness` (`#83`), and `F45-tui-performance-optimization` (`#84`), consolidating the local TUI, basic timeout/retry robustness, and the current local auth baseline.
-- With these merges, `main` now reflects the artifacts explorer in the TUI, log buffering, a global per-step timeout, simple retry for transient failures, a local `AuthProvider` abstraction, and local RBAC with `viewer`/`operator`/`admin`.
-- The remaining drift stopped being functional and became documentation-only: `memory.md`, `PENDING_LOG.md`, `docs/IDEAS.md`, `README.md`, and `CHANGELOG.md` fell behind the real post-`F47` baseline.
-- The immediate active workstream becomes the doc-only chore `chore-post-f47-baseline-handoff-sync`, to consolidate the handoff of the current baseline before the next product decision.
-- The next product decision is blocked again until this chore closes and a new `technical-triage` picks a single workstream from `main`.
-
-- On 2026-03-13, the post-`F37` triage confirmed that `origin/main` already covers the MVP, phase 2, and the post-`F36` doc-only handoff; the real blocker was the local branch `feature/f39-persistence-path-root-hardening`, which had become an aggregator of out-of-scope drafts.
-- The mixed state was preserved in `origin/archive/2026-03-13-f39-drift-snapshot`, and the deterministic slices were split into `origin/draft/f41-dashboard-artifacts-explorer`, `origin/draft/f43-runtime-robustness`, `origin/draft/f44-auth-backend-abstraction`, `origin/draft/f45-tui-performance-optimization`, and `origin/draft/f47-advanced-rbac`.
-- The cross-cutting items that do not yet fit into a single workstream without inventing new code (`F40`, `F42`, `F46`, lifecycle tests, and long-term roadmap docs) stayed only in the archive branch, without becoming an active queue or an open PR.
-- As a result, the immediate active workstream is no longer `F37` and becomes none: the main line restarts from `main`, and the next product decision is blocked until a new `technical-triage` on a clean branch.
-
-- On 2026-03-13, `F34-async-submit-runtime-ownership` was merged into `main` via PR `#70`, making authenticated `runs submit` accept dispatch resolved to `async` only when the resident runtime belongs to the same principal, preserving the legacy fallback without `started_by`.
-- On 2026-03-13, `F35-worker-runtime-ownership-filter` was merged into `main` via PR `#71`, making the resident runtime's worker consume only runs compatible with the principal that started the runtime, without failing or locking incompatible runs.
-- On 2026-03-13, `F36-worker-owner-skip-observability` was merged into `main` via PR `#72`, making the worker's skip auditable with a `runtime_owner_skip` event on incompatible runs and keeping Synapse-Flow as SynapseOS's own pipeline engine.
-- With `F32`, `F34`, `F35`, and `F36`, the local `resident_transport_auth` bucket stops being open functional backlog and becomes absorbed baseline; the real residual of `G-11` is restricted to remote/multi-host operation.
-- The immediate active workstream stops being product triage and becomes the doc-only chore `F37-post-f36-g11-sync`, to align handoff and backlog with the post-`#72` state before the next product decision.
-- The next product decision is blocked until `PENDING_LOG.md`, `memory.md`, and `docs/IDEAS.md` reflect the real post-`F36` baseline.
-
-- On 2026-03-13, `F32-runtime-resident-principal-binding` was merged into `main` via PR `#68`, delivering the first concrete slice of the `resident_transport_auth` bucket without opening a socket, IPC, or remote operation.
-- `F32` persisted `started_by` in the runtime state when local auth is enabled, started displaying this binding in `synapse runtime status`, and hardened `synapse runtime stop` against a different operator when the binding exists.
-- With `F32`, the residual of `G-11` is no longer just absorbed local foundation versus future backlog: the `resident_transport_auth` bucket already has a first slice delivered, while remote/multi-host operation remains explicitly deferred.
-- The immediate active workstream stopped being a product feature and became a doc-only handoff chore: `F33-post-f32-handoff-sync`, to align operational memory and backlog with the post-`#68` state before the next triage.
-- The next product decision is blocked until `PENDING_LOG.md`, `ERROR_LOG.md`, `memory.md`, and `docs/IDEAS.md` reflect the real post-`F32` baseline.
-
-- On 2026-03-13, the baseline became stable again after the merge of PR `#66`, with `repo-checks` and `security-review` green in the remote check and `ruff format --check .` restored as a green local gate.
-- With the baseline stabilized, the active workstream stopped being operational and went back to product backlog: `F31-g11-remote-auth-decomposition`.
-- `F31` was opened as a doc-only workstream to formally decompose the residual of `G-11` into `local_cli_auth` (already absorbed), `resident_transport_auth` (still pending), and `remote_multi_host_auth` (explicitly deferred).
-- The next code work is blocked until this documentation decomposition closes a small, verifiable SPEC for the `resident_transport_auth` bucket.
-
-- On 2026-03-13, `F30-auth-registry-cli` was merged into `main` via PR `#65`, adding `synapse auth init|issue|disable`, `token_id` in the local registry, and alignment of `docs/IDEAS.md`/README with the post-F30 baseline.
-- `F30` closed the local auth follow-up started by `F29`; the real residual of `G-11` was reduced to the large slice of remote/socket operation, explicitly deferred.
-- The Git closing of `F30` required an explicit merge because the `repo-checks` job stayed red due to `ruff format --check .` on 6 pre-existing files outside the feature's functional diff.
-- As a result, the next logical workstream stopped being product backlog and became baseline stabilization: restore `repo-checks` and sync the post-F30 handoff before opening a new SPEC.
-
-- On 2026-03-12, `F28-adapter-circuit-breaker` was merged into `main` via PR `#62`, absorbing `G-09` with a locally persisted breaker for the `CodexCLIAdapter` without reopening SQLite, remote auth, or the public CLI.
-- On 2026-03-12, `F29-auth-rbac-foundation` was merged into `main` via PR `#63`, hardening `runs submit` and `runtime start|run|stop` with local opt-in auth, a private registry based on SHA-256 hashes, and reuse of `initiated_by` for authenticated provenance.
-- With `F28` and `F29`, the post-`F27` triage stopped being `G-09` versus `G-11`: the immediate backlog now needs to distinguish between residual auth follow-up (`socket`, rotation/provisioning, and remote operation) and other workstreams outside `IDEA-001`.
-
-- On 2026-03-12, the operational handoff was realigned with the real post-`F27` baseline: `main` already incorporates `F23-security-sanitization-foundation`, `F24-workspace-boundary-hardening`, `F25-generated-artifact-ast-guard`, `F26-run-provenance-integrity`, and `F27-adapter-concurrency-guard`, via merges `#56` through `#60`.
-- With this realignment, the "first post-`F22` SPEC" stopped being a current pending item: phase 2 and the first wave of guardrails have already been completed on `main`, and the next decision becomes the first post-`F27` SPEC.
-- The remaining backlog of `IDEA-001` was reduced mainly to `G-09` (circuit breaker for adapters) and `G-11` (authentication/authorization), with `G-09` as the smallest natural technical slice for the next triage.
-
-- On 2026-03-12, the documentation baseline was realigned with the real state of the repository: `main` already incorporates `F17-artifact-preview` and `F22-release-readiness`, closing phase 2 in the code, the public CLI, and the technical release.
-- `F17-artifact-preview` was merged into `main`, consolidating `synapse runs show <run_id> --preview report` and `--preview <STEP_STATE>.clean` with truncated text reading and without opening arbitrary host reads.
-- `F22-release-readiness` was merged into `main`, consolidating `CHANGELOG.md`, `docs/release/phase-2-technical-release.md`, a README aligned with the `sync-first` quickstart, and an explicit boundary for artifact preview.
-- The project's next decision stopped being closing phase 2 PRs and became opening the first post-`F22` SPEC; `docs/IDEAS.md` remains the candidate backlog, with `IDEA-001 / G-02` as the smallest immediate slice if there is real risk in public observability.
-
-- `F13-rich-cli-output` was completed locally as a small CLI UX workstream, without expanding the architecture: `synapse runtime status` now renders a Rich panel with status and PID, keeping `stderr` and the failure exit code in the inconsistent state.
-- F13 introduced `src/synapse_os/cli/rendering.py` as a minimal presentation helper and added dedicated coverage in `tests/unit/test_cli_rich_output.py` and `tests/integration/test_runtime_cli.py`.
-- F13's local validation closed green with the SPEC's `validate_spec_file()`, `pytest tests/unit/test_cli_rich_output.py tests/integration/test_runtime_cli.py`, `./scripts/commit-check.sh --no-sync --skip-branch-validation --skip-docker --skip-security`, and `./scripts/security-gate.sh`.
-- F13's scope remained deliberately restricted to `synapse runtime status`, with no `Textual`, no watch mode, no new public subcommand, and no need for `DOCKER_PREFLIGHT`.
-- `F14-runs-observability-cli` was completed locally as a small CLI-first observability workstream, adding `synapse runs list` and `synapse runs show <run_id>` without opening a TUI.
-- F14 reused `RunRepository` and `ArtifactStore`, extended `src/synapse_os/cli/rendering.py` for run listing/detail, and kept Synapse-Flow as SynapseOS's own pipeline engine.
-- F14's local validation closed green with the SPEC's `validate_spec_file()`, focused CLI/persistence `pytest`, `./scripts/commit-check.sh --no-sync --skip-branch-validation --skip-docker --skip-security`, and `./scripts/security-gate.sh`.
-- F14's scope remained deliberately restricted to reading persisted runs: no watch mode, no streaming, no Textual, and no `DOCKER_PREFLIGHT`.
-- PR `#42` for `F14-runs-observability-cli` was merged into `main`, consolidating `synapse runs list` and `synapse runs show <run_id>` as the project's current public surface.
-- The project's next phase was defined and documented as the official queue in `docs/architecture/PHASE_2_ROADMAP.md`, following the mixed scenario: `F15 -> F16 -> F21 -> F18 -> F19 -> F20 -> F17 -> F22`.
-- A later proposal for pre-phase-2 guardrails (input, secrets, rate limiting, and audit trail) was triaged and not promoted to two standalone features; the official backlog preserves phase 2 as the next main track.
-- The only exceptional slice accepted before phase 2, if there is real risk, is masking secrets in `_clean` fields and publicly readable artifacts; the rest should be absorbed into `F15` and `F21`.
-- `F15-public-run-submission` was implemented locally with `synapse runs submit <spec_path>`, `--mode auto|sync|async`, and `--stop-at`, reusing the internal `RunDispatchService` and setting `SPEC_VALIDATION` as the safe operational default.
-- F15's main hardening landed in the dispatch itself: the SPEC is validated before any submit, including in `async`, to avoid persisting invalid runs.
-- F15's local validation closed green with the SPEC's `validate_spec_file()`, focused dispatch/runs/runtime `pytest`, `./scripts/commit-check.sh --no-sync --skip-branch-validation --skip-docker --skip-security`, and `./scripts/security-gate.sh`.
-- PR `#43` for `F15-public-run-submission` was merged into `main`, consolidating `synapse runs submit <spec_path>` as current public surface alongside `synapse runs list/show`.
-- The post-F15 documentation chore aligned `README.md`, `WORKTREE_FEATURES.md`, `memory.md`, `PENDING_LOG.md`, and `.github/copilot-instructions.md` with the current phase 2 baseline.
-- The real current baseline also already incorporates `F16-run-detail-expansion`, `F21-cli-error-model-and-exit-codes`, and `F18-canonical-happy-path`: all three workstreams have their own `SPEC.md`, notes/checklists, behavior materialized in the CLI, and dedicated coverage in unit and integration tests.
-- The focused revalidation of the phase 2 baseline closed green with `uv run --no-sync python -m pytest tests/unit/test_cli_runs_rendering.py tests/integration/test_runs_submit_cli.py tests/integration/test_cli_error_model.py -q`, totaling `12 passed`.
-- The operational handoff was realigned to reflect the correct remaining phase 2 queue: `F19 -> F20 -> F17 -> F22`.
-- `F19-environment-doctor` was completed and merged via PR `#51`, consolidating `synapse doctor` as a local, advisory diagnostic of the current public flow.
-- `F20-public-onboarding` was completed and merged via PR `#52`, consolidating the public sync-first quickstart and the boundary between `synapse doctor` and `repo-preflight`.
-- With the merge of `F19` and `F20`, the real remaining phase 2 queue became `F17 -> F22`.
-- `F17-artifact-preview` was completed locally with controlled text preview in `synapse runs show <run_id> --preview <target>`, supporting `report` and `<STEP_STATE>.clean` without opening arbitrary host reads.
-- The F17 delta kept F21's error contract (`Usage error:`/`2`, `Not found:`/`3`) and limited reading to the start of the artifact, with explicit truncation after at most 40 lines.
-- PR `#53` for `F17-artifact-preview` was opened against `main`, leaving the workstream ready for review without an early merge.
-- `F22-release-readiness` was completed locally as a documentation and final validation workstream, adding `CHANGELOG.md`, versioned release notes, and an explicit boundary between the sync-first quickstart and artifact preview.
-
-- `F10-run-report-one-real-adapter` was completed and merged into `main`, closing the initial Synapse-Flow MVP with `DOCUMENT`, `RUN_REPORT.md`, and the first real adapter (`CodexCLIAdapter`).
-- `F12-codex-adapter-operational-hardening` was completed and merged via PR `#38`, with local `main` and `origin/main` in sync at `ahead=0 behind=0`.
-- F12's hardening kept `CLIExecutionResult` as the execution contract and added explicit operational classification for Codex (`timeout`, `return_code_nonzero`, `launcher_unavailable`, `container_unavailable`, `authentication_unavailable`) without reopening the pipeline.
-- The real `DOCKER_PREFLIGHT` and the container-first Codex smoke test were validated; the only blocker observed was missing authentication (`401 Unauthorized`), treated as an external operational blocker and not as an adapter defect.
-
-- The `test-layout-typecheck-hardening` chore stabilized the `tests/` tree with explicit package markers, removing the operational collision between `tests/unit/conftest.py` and `tests/integration/conftest.py`.
-- The repository now accepts `uv run mypy src tests`, but this was closed via an explicit `mypy` override for `tests` and `tests.*`, preserving the strict contract in the `src/synapse_os` package.
+- F59: Multi-Agent Session Orchestration - registry/capabilities formalized, coordination between adapters stabilized
+- F60: Local Control Plane Foundation - minimal local API exposed, keeping CLI-first
+- F61: DAG Pipeline Evolution - pipeline evolved into a state-driven DAG in Synapse-Flow
+- F62: Copilot Adapter - adapter for GitHub Copilot operational
+- F63: Memory Engine Enhancement - memory engine with performance and consistency improvements
+- F64: Advanced Supervisor Policies - deterministic advanced supervision policies
+- F65: Runtime Coordinator Hardening - runtime coordinator hardening, process identity validations
+- F66: Reporting & Observability Evolution - evolution of reports (RUN_REPORT.md) and observability
+- F67: Workspace Management v2 - workspace v2 with isolation and auditable path
+- F68: Plugin/Extension System - plugin/extension system for future extensibility
 
-- `F09-supervisor-mvp` was materialized with its own `SPEC.md`, `NOTES.md`, and `CHECKLIST.md`, keeping Synapse-Flow as SynapseOS's own pipeline engine and limiting the scope to a deterministic supervisor, a linear pipeline up to `SECURITY`, and persistence of supervisor decisions.
-- The pipeline now supports `CODE_GREEN`, `REVIEW`, and `SECURITY`; the state machine now accepts `REVIEW -> CODE_GREEN` for rework, and the new `synapse_os.supervisor` module decides between `retry`, `reroute`, `return_to_code_green`, and `fail` deterministically.
-- F09's run persistence now records `supervisor_decision` events, and the feature's local validation closed green with `233` tests passing, `ruff check`, `uv run --no-sync python -m mypy`, `./scripts/security-gate.sh`, and `./scripts/commit-check.sh --skip-docker`.
-- F09's scope kept `retry` and `reroute` within the same pipeline execution; there was no persisted resumption between worker polls and no expansion to `DOCUMENT` or `RUN_REPORT.md`.
+Sprint metrics: 755 tests passing, ruff/mypy 100% clean, zero critical errors.
 
-- The three main architecture documents were refined for greater convergence: `SPEC_FORMAT.md` gained a table of required fields, a documented mandatory H1 rule, valid values for `type`, an explanation of the intentional `non_goals`/`acceptance_criteria` asymmetry, a reference to the v2 template, and a new section on integration tests in acceptance_criteria. `TDD.md` had test names corrected to the project convention, the section 9 fixtures marked with ✅/🔜, section 10 synchronized with the real structure, and a new section 13 formalizing the integration test requirement per feature category. `SDD.md` had the `INIT`/`RETRYING` states marked as post-MVP, DOCKER_PREFLIGHT repositioned as a side gate in the diagram, `metadata: dict` corrected to `dict[str, Any]`, `parser_confidence` and `REQUEST.md` marked as not implemented in the MVP, and a macro ↔ internal states mapping table added in section 5.
-- The test suite was expanded from 88 → 215 tests over the hardening session: new conftest.py (unit + integration), invalid SPEC fixtures, realistic CLI output fixtures, test_spec_validator (4→18), test_state_machine (5→30), test_parsing_engine (9→21), test_contracts (4→17), test_config (4→12), test_happy_path (20 new), test_failure_recovery (9 new), test_review_rework (11 new), test_adapter_parser_flow (10 new integration tests).
-- `pytest-cov>=5.0.0` was added to `pyproject.toml` with `[tool.coverage.run]` and `[tool.coverage.report]` configuration. Installed version: `7.0.0`.
-- The `acceptance_criteria` of F02, F03, F04, and F05 were updated to include at least one criterion verifiable only via integration test, aligning the SPECs with the new requirement documented in `TDD.md` section 13 and `SPEC_FORMAT.md`.
-- New fixtures created: `tests/fixtures/docker/valid_compose_config.txt`, `invalid_compose_config.txt`, `tests/fixtures/reports/expected_run_report.md`, `tests/fixtures/cli_outputs/gemini_plan.txt`, `codex_tests.txt`, `claude_review.txt`.
-- The security review of the `chore/tdd-integration-hardening` branch was approved without reservations: zero changes to production code, all existing subprocess patterns are legitimate, and the use of `unicode_escape` in fixtures is controlled and carries no injection risk.
-
-- The follow-up fix for `F06-pipeline-engine-linear` re-exported `SpecValidationError` in `synapse_os.pipeline`, aligning the engine's public API with the invalid-SPEC blocking test and restoring local `repo-checks` on the same path used by CI.
-- `F06-pipeline-engine-linear` now has its own `SPEC.md`, `NOTES.md`, typed pipeline contracts (`PipelineStep`, `StepExecutionResult`, `PipelineContext`), and a linear `PipelineEngine` in fake mode for Synapse-Flow.
-- The scope of `F06-pipeline-engine-linear` remained deliberately restricted to `SPEC_VALIDATION`, `PLAN`, and `TEST_RED`, reusing the existing `SpecValidator` and state machine, with no persistence, worker, supervisor, or real adapters.
-- The local validation of `F06-pipeline-engine-linear` closed green with the `SPEC` validated, `88` tests passing via `python -m pytest`, `ruff check`, `ruff format --check`, `mypy`, and `./scripts/branch-sync-check.sh` at `ahead=0 behind=0`.
-- The `security-review` of the `F06-pipeline-engine-linear` delta was completed without reservations: the feature adds no shell, new subprocess, Docker, workflow, or operational automation, and keeps pipeline execution in fake mode with in-memory context and explicit SPEC validation before `PLAN`.
-
-- `F05-cli-adapter-base` now has its own `SPEC.md`, `NOTES.md`, an asynchronous `BaseCLIAdapter` via `asyncio.create_subprocess_exec`, and the evolution of `CLIExecutionResult` to include `tool_name`, raw/clean `stdout/stderr`, `duration_ms`, and `timed_out`.
-- The scope of `F05-cli-adapter-base` remained deliberately restricted to the execution contract, async subprocess, timeout, and light ANSI sanitization, preserving `F04`'s Parsing Engine as responsible for richer cleanup and artifact extraction before Synapse-Flow hand-offs.
-- The local validation of `F05-cli-adapter-base` closed green with the `SPEC` validated, `84` tests passing via `python -m pytest`, `ruff check`, `ruff format --check`, `mypy`, and `./scripts/branch-sync-check.sh` at `ahead=0 behind=0`.
-- The `security-review` of the `F05-cli-adapter-base` delta was completed without reservations: the implementation uses `create_subprocess_exec` without a shell, keeps raw output separate from clean output, applies a timeout with explicit process termination, and keeps sanitization conservative in the adapter.
+## Recently incorporated decisions
 
-- `F04-parsing-engine-mvp` now has its own `SPEC.md`, noisy-output fixtures, and a minimal Parsing Engine with ANSI cleanup, extraction of fenced Markdown blocks, and syntactic validation of Python artifacts.
-- The final hardening of `F04-parsing-engine-mvp` normalized fence languages to lowercase, canonicalized `py` to `python`, preserved generic semantic text by removing only explicit transport noise, and added fixed size/volume limits in the parser.
-- The most recent local validation of `F04-parsing-engine-mvp` closed green with `./scripts/commit-check.sh --no-sync --skip-branch-validation --skip-docker --skip-security`, including `81` green tests, `ruff`, and `mypy`.
-- The final `security-review` of `F04-parsing-engine-mvp` was re-evaluated after the parser hardening and was approved without reservations in the current scope.
 - The PR flow assisted by the Git agent now requires the PR body via `--body-file` instead of inline `--body` when there is Markdown with backticks, code blocks, or other shell-sensitive characters, avoiding corruption of the published description through accidental shell expansion.
 - The `codex-dev` MCP baseline is now secure by default: `.codex/config.toml` stopped loading `github-actions`, `sqlite`, and `docker` at default startup, and the official GitHub MCP is now rendered dynamically only when a token is present in the environment.
 - The fallback from `GITHUB_TOKEN` to `GITHUB_PERSONAL_ACCESS_TOKEN` was centralized in `scripts/render-codex-config.sh`, making the Codex launcher testable and removing the dependency on a symlink persisted in the `codex-home` volume.
 - The operational assessment confirmed that the Docker MCP should stay out of the `codex-dev` baseline, because the isolated environment still has no `docker.sock`.
 - `tests/unit/test_repo_automation.py` now covers the secure Codex MCP baseline, rendering of the effective config with and without a token, and the absence of `docker.sock` in `codex-dev`.
 - The validation of this workstream was closed with `./scripts/docker-preflight.sh`, green `pytest tests/unit/test_repo_automation.py`, a `dev-codex.sh` smoke test without a token, and a smoke test with fallback via `GITHUB_TOKEN`.
 - The `security-review` of the MCP/Codex delta was completed with approval and two low, non-blocking reservations: the rendering helper accepts arbitrary `--source`/`--output`, and the `GITHUB_TOKEN` fallback can enable the GitHub MCP in environments where that variable already exists for another reason.
 - The Git closing of this workstream was isolated on the `chore/codex-mcp-baseline-hardening` branch with local commit `89e8111 chore(repo): harden codex mcp baseline`, with no push and no PR.
 - The operational validation of `uv sync --locked --extra dev` completed successfully in an environment with network access.
 - The real validation of the `branch-validation` job in GitHub Actions completed successfully, confirming checkout by `github.event.pull_request.head.sha` and use of `github.head_ref` for the effective branch name on `pull_request`.
 - The local `./scripts/commit-check.sh --no-sync` flow was hardened to run `mypy` and `pytest` via `python -m ...`, reducing dependence on broken wrappers in the `.venv`.
 - The operational validation of `./scripts/docker-preflight.sh` without `--dry-run` completed successfully in the default lightweight mode (`compose config` + build, without `up`) in an environment with Docker accessible.
 - Validation against `main` on `pull_request` now uses the PR's real `head.sha` and the real branch name, avoiding a synthetic merge ref/detached ref in GitHub Actions.
 - The local `.githooks/pre-commit` hook was made explicitly lightweight via `./scripts/commit-check.sh --hook-mode`.
 - The real operational `DOCKER_PREFLIGHT` remains explicit and separate from the lightweight hook, via `./scripts/docker-preflight.sh`.
 - The repository's operational baseline was restored with minimal Ruff/import order/formatting fixes in the files flagged by the review.
 - `commit-check.sh` now uses `./scripts/commit-check.sh --sync-dev` as the default path for local bootstrap/checks, with explicit `--no-sync` for a quick rerun and `--hook-mode` preserving the lightweight flow.
 - The feature's SPEC and the runtime lifecycle now require additional validation of process identity before `stop`.
 - The runtime state now requires atomic writes, restricted permissions, and safe handling of local corruption/tampering.
 - The feature's final security-review considered the scope approved, with low reservations compatible with the MVP.
 - The integration branch `chore/merge-operational-candidates` consolidated `chore-resolve-operational-merge-conflicts`, `feat-agent-skills`, and `features/f11-runtime-persistente-minimo`.
 - The practical validation of the persistent runtime feature was closed on the integration branch with `17` tests passing in a dedicated local environment.
 - The `chore/devcontainer-codex-isolation` branch introduced an isolated Codex development environment with `.devcontainer/`, `compose.dev.yaml`, `scripts/dev-codex.sh`, and a versioned profile in `.codex/config.toml`.
 - The container-first Codex flow was documented in `AGENTS.md` and `README.md`, keeping `codex-dev` separate from the `synapse-os` runtime service.
 - Local operational validation confirmed `codex-dev` with a non-root user, `read_only`, `no-new-privileges`, `cap_drop: [ALL]`, no `docker.sock`, no mount of the host's `$HOME`, and a bind mount restricted to the repository at `/workspace`.
 - The Branch Sync Gate was incorporated as a lightweight operational rule in `AGENTS.md`, with `./scripts/branch-sync-check.sh` to detect drift and `./scripts/branch-sync-update.sh` for conservative branch updates.
 - The low reservations from the security-review on the Branch Sync Gate were mitigated and the final verdict was approved, with no relevant new risk.
 - `debug-failure` was created as its own skill for initial diagnosis of real failures, cause classification, and routing to the next agent.
 - The ADR assessment concluded that the Branch Sync Gate is a local operational convention for repository governance and requires neither a new ADR nor an update to an existing ADR.
-- A branch `feat/memory-curator-skill` abriu a frente de memória durável do repositório com a skill `memory-curator`, `memory.md` inicial e registro mínimo do papel da skill em `AGENTS.md`.
-- O `memory-curator` ficou definido para consolidar decisões incorporadas, trade-offs, estado atual da frente, pendências abertas e próximos passos em `memory.md`, sem substituir `session-logger` nem `technical-triage`.
-- O fluxo de fechamento por convenção operacional ficou registrado na skill `memory-curator` com as chamadas `$memory-curator encerrar conversa` e `$memory-curator close session`, deixando explícito que isso não é alias nativo da plataforma.
-- A avaliação mais recente de ADR concluiu que `memory-curator` e `memory.md` não exigem ADR neste momento, por serem governança operacional local e não mudança arquitetural estável.
-- A avaliação operacional desta frente fixou `./scripts/commit-check.sh --sync-dev` como caminho padrão para checks/testes locais, com `uv run --no-sync` restrito a reexecução rápida após bootstrap e virtualenv explícita apenas como fallback de diagnóstico.
-- A mitigação final desta frente moveu a validação de branch para antes de qualquer resolução de fluxo e antes de qualquer `uv sync`, eliminando sincronização desnecessária antes do gate operacional.
-- O `security-review` final desta frente foi concluído com aprovação sem ressalvas, mantendo a separação entre hook leve, checks locais e `DOCKER_PREFLIGHT` operacional real.
-- A validação operacional do ambiente atual do Codex com `network-access = true` confirmou `git push` e `gh pr create` funcionando no sandbox normal; a governança operacional foi ajustada para refletir o sandbox como caminho padrão e manter fallback fora do sandbox apenas como contingência para falha real de rede/sandbox, sem mascarar erro de autenticação, permissão ou conectividade real do host.
-- A revalidação operacional mais recente confirmou `ruff format --check .` verde no estado atual do repositório, restaurando o gate completo de formatação sem necessidade de ajuste adicional.
-- A sincronização conservadora da branch atual com `origin/main` foi revalidada com `git fetch origin main --prune`, `./scripts/branch-sync-check.sh` e `./scripts/branch-sync-update.sh --mode rebase`, permanecendo em no-op seguro com `ahead=0` e `behind=0`.
-- A feature `F02-spec-engine-mvp` passou a ter `SPEC.md` propria, fixtures de SPEC valida/invalida e um validador minimo de `SPEC_VALIDATION` com parser de front matter YAML, checagem de campos obrigatorios e exigencia das secoes `Contexto` e `Objetivo`.
-- A validacao local da `F02-spec-engine-mvp` foi concluida com testes verdes para o novo `SpecValidator`, mantendo o recorte da feature sem antecipar state machine, pipeline completa ou editor de SPEC.
-- O `security-review` da `F02-spec-engine-mvp` foi aprovado com ressalvas baixas: manter `yaml.safe_load` e, na integracao futura, restringir o chamador a paths esperados de `SPEC.md` dentro do workspace da run.
-- A PR `#19` da `F02-spec-engine-mvp` teve o gate `repo-checks` restaurado com correcao minima de formatacao, import order e compatibilidade de `mypy` no `SpecValidator`, sem ampliar o escopo da feature.
-- O `security-review` mais recente da correcao da F02 aprovou o delta com ressalva baixa e localizada: o `# type: ignore[import-untyped]` em `yaml` e aceitavel neste recorte, mas pode ser removido depois com tipagem mais explicita ou `types-PyYAML`.
-- A `F03-state-machine-mvp` passou a ter `SPEC.md` propria, state machine minima do Synapse-Flow com transicoes lineares validas, bloqueio de `PLAN` antes de `SPEC_VALIDATION` e estado terminal `FAILED`, com testes verdes e PR `#20` aberta.
-- O `security-review` da `F03-state-machine-mvp` foi aprovado com ressalvas baixas: os estados ainda sao modelados como strings livres e `TERMINAL_STATES` ainda nao e usada explicitamente nas validacoes internas.
-- A `F03-state-machine-mvp` ficou autocontida na worktree atual com materializacao de `features/F03-state-machine-mvp/SPEC.md`, mantendo alinhamento com o recorte aprovado da feature.
-- A validacao local da `F03-state-machine-mvp` confirmou `5` testes unitarios verdes para a state machine minima, e o proximo passo logico permanece fechar `REPORT/COMMIT` antes de abrir a `F04`.
-- A `F03-state-machine-mvp` foi encerrada: correcao de `B905` (`zip()` com `strict=False`), rebase sobre `main` atualizado, 10/10 checks CI verdes, PR `#20` mergeada por merge commit em `main`. Worktree e branch local removidos.
-- O commit `chore(repo): add copilot instructions and codex MCP servers` incluiu `.github/copilot-instructions.md` (instrucoes de projeto com regra de idioma portugues) e `.codex/config.toml` com 4 MCP servers essenciais (GitHub, GitHub Actions, Docker, SQLite).
+- On 2026-04-01, the F59-F68 sprint was consolidated in `origin/main` with all workstreams merged.
+- Synapse-Flow remains SynapseOS's own pipeline engine, now with state-driven DAG support.
+- The runtime boundaries foundation (F51-F53) stabilized the contracts for `ToolSpec`/capabilities, `WorkspaceProvider`, `RunContext`, and lifecycle hooks.
+- The current architecture supports multi-agent session orchestration without a desktop UI, remaining CLI-first.
+- The local control plane foundation (F60) exposes a minimal API for a TUI and future integrations, without opening a desktop shell.
 
 ## Open items
 
-- Fixtures de testes aspiracionais marcadas como 🔜 no TDD.md: `tests/fixtures/worker/` (ainda ausente).
-- Property-based testing com `hypothesis` ainda não implementado (mencionado como evolução futura em TDD.md).
-- Fechar o bucket `baseline-handoff-sync` alinhando `ERROR_LOG.md`, `README.md` e eventuais docs publicas ao baseline local pos-`F53`.
-- Rodar nova `technical-triage` depois do `baseline-handoff-sync` para escolher uma unica frente entre `multi-agent-session-orchestration` e `local-control-plane-foundation`.
-- Manter `desktop-shell` e `TypeScript-first runtime migration` explicitamente fora da fila principal ate o core Python atual estabilizar boundaries, isolamento e observabilidade.
-- Manter `remote_multi_host_auth` explicitamente adiado ate existir demanda concreta, recorte proprio e validavel.
+- Assess concrete demand for `desktop-shell`: kept out of the main queue until the Python core stabilizes.
+- Assess demand for `TypeScript-first runtime migration`: explicitly ruled out for now; TypeScript is limited to an optional shell/UI that consumes the Python core.
+- Assess demand for `remote_multi_host_auth`: explicitly deferred until concrete demand exists.
+- Run `technical-triage` to define the next workstreams after the F59-F68 sprint.
 
 ## Future points of attention
 
-- O bloqueio operacional de autenticacao do Codex (`401 Unauthorized`) ficou explicitamente classificado na F12; revalidar esse smoke apenas quando houver credencial valida e necessidade real de uso autenticado.
-- A `F29` fechou apenas a fundacao local de auth/RBAC; nao assumir que `socket + RBAC` da `IDEA-001` foi totalmente absorvido sem uma nova SPEC especifica para operacao remota.
-
-- Validar em momento futuro uma operacao real do MCP oficial do GitHub com credencial valida, pois a frente atual fechou apenas o startup path e a cobertura operacional do launcher.
-- Fixture `noisy_mixed_output.txt` e `noisy_no_code_block.txt` armazenam sequências ANSI como literais `\u001b`. Todo helper que os lê para testar comportamento de ANSI precisa de `unicode_escape=True`. Considerar adicionar comentário nos próprios arquivos de fixture documentando isso.
-- A ampliação de `TRANSPORT_NOISE_PREFIXES` para incluir prefixos como `[rpc]` deve ser decisão explícita documentada na SPEC da feature responsável — não uma adição silenciosa.
-- Os testes de `test_review_rework.py` exercitam a state machine diretamente para estados CODE_GREEN/REVIEW/SECURITY que ainda não estão implementados no `PipelineEngine`. Quando o Supervisor/pipeline for implementado para esses estados, esses testes servem como documentação de comportamento esperado e devem ser migrados para testes de integração.
-- O retry/reroute da F09 permanece restrito a uma unica execucao do Synapse-Flow; retomada persistida entre polls do worker e requeue duravel continuam fora de escopo.
-- Em worktree fria, `pytest` e `uv run pytest` podem falhar na coleta ate que `uv sync --locked --extra dev` tenha sido executado.
-
-- O fallback de `GITHUB_TOKEN` para `GITHUB_PERSONAL_ACCESS_TOKEN` continua aceitavel para o baseline atual, mas pode merecer opt-in explicito se gerar ambiguidade operacional em ambientes com tokens preexistentes.
-- O helper `scripts/render-codex-config.sh` continua restrito ao launcher atual; se passar a ser reutilizado fora desse fluxo, vale endurecer os paths aceitos.
-- `uv run --no-sync` continua dependendo de ambiente previamente sincronizado; em worktree fria ele pode cair no Python do host e falhar por dependências ausentes.
-- O fluxo local com `.venv` pode exigir `PYTHONPATH=src` quando não se usa `uv run`; por isso ele continua apenas como fallback operacional e não como caminho padrão.
-- O hardening do runtime valida identidade do processo por marcador + token em `/proc/<pid>/cmdline`; isso continua Linux-first.
-- A validação do diretório configurável de estado permanece propositalmente básica no MVP e pode ser endurecida depois com âncora explícita no workspace.
-- O runtime persistente continua propositalmente restrito a processo único local, sem scheduler, distribuição ou recuperação avançada.
-- No uso diário do Codex em container, prefira `./scripts/dev-codex.sh` como entrypoint principal para evitar corrida operacional com `docker compose ... up` manual sobre o mesmo serviço.
-- No uso diário de sincronização com `main`, prefira `./scripts/branch-sync-check.sh` e `./scripts/branch-sync-update.sh` em vez de comandos Git ad hoc; a atualização automática continua propositalmente conservadora e pode exigir resolução manual.
-- `memory.md` deve permanecer memória durável e reaproveitável, sem virar transcrição de conversa.
-- O `memory-curator` deve consolidar estado e handoff, enquanto `ERROR_LOG.md` e `PENDING_LOG.md` seguem como trilha operacional detalhada.
-- Na integracao futura do `SpecValidator`, o chamador deve restringir a leitura de `SPEC.md` a paths esperados do workspace para evitar ampliacao desnecessaria da superficie de entrada.
-- O `# type: ignore[import-untyped]` em `yaml` da F02 permanece como mitigacao minima de tipagem; reavaliar remocao quando houver frente dedicada de endurecimento ou tipagem de dependencias.
-- Na evolucao da state machine apos a F03, considerar encapsular estados em `Enum` ou aplicar `TERMINAL_STATES` de forma efetiva para reduzir risco de drift semantico sem ampliar esta feature.
-
-## TUI — Ideia de feature futura (análise de viabilidade concluída)
-
-- **Rich enriquecido (F13-rich-cli-output)**: concluida localmente como primeira adocao de Rich em `src/`, restrita a `synapse runtime status` e sem abrir TUI completa.
-- **Observabilidade CLI de runs (F14-runs-observability-cli)**: concluida localmente e fecha a lacuna minima de inspecao antes de qualquer TUI.
-- **TUI watch (F14-tui-watch-command)**: `synapse tui` como subcomando opcional usando Textual. Pré-requisito atualizado: F13 + F14 + implementação de `observability/` (diretório ainda vazio). Hook ideal já existe: `PipelineObserver` em `pipeline.py`.
-- **Constraint Typer×asyncio**: `asyncio.run(app.run_async())` dentro do comando Typer é a forma de coexistência; funcional mas exige cuidado com event loop.
-- **TTY em container**: Rich degrada automaticamente sem TTY; Textual exige guarda `sys.stdout.isatty()`.
-- **Não implementar antes**: apesar da F14 resolver a observabilidade minima via CLI, TUI real continua dependendo de recorte proprio de watch/streaming e da camada `observability/`.
-
-## Estado do baseline atual
-
-- Etapa 2 concluída em `main` com `F17-artifact-preview` e `F22-release-readiness` já mergeadas.
-- A primeira onda de guardrails pós-release também está concluída em `main` com `F23 -> F27`.
-- A fila ativa agora passa a ser definida pela próxima SPEC pós-`F27`, não mais pela abertura da primeira SPEC pós-`F22`.
-
-## Guardrails candidatos fora da fila principal
-
-- Os follow-ups curtos de mascaramento publico e normalizacao textual deixaram de ser candidatos: esses recortes foram absorvidos na `F23`.
-- Rate limiting por adapter, audit trail adicional com `initiated_by` e hardening amplo de config tambem deixaram de ser backlog aberto isolado: esses recortes foram absorvidos em `F26` e `F27`.
-- Os itens de guardrail ainda em aberto concentram-se em `G-09` e `G-11`.
-
-## Itens que podem virar novas features ou ajustes futuros
-
-- Endurecimento adicional do path de estado para restringir explicitamente a uma raiz confiável do workspace.
-- Melhoria de portabilidade do runtime além de Linux, caso isso entre no escopo futuro.
-- Documentação operacional curta para bootstrap local (`--sync-dev`) e para o lifecycle do runtime persistente.
-- Limpeza operacional do repositório para remover debt de formatação fora do escopo desta feature.
-- Integracao do `SpecValidator` ao fluxo seguinte da pipeline, incluindo bloqueio formal antes de `PLAN`.
-- Evolucao da state machine para suportar estados adicionais como `RETRYING`, integracao com executor de steps e persistencia do estado fora do recorte minimo da F03.
+- The persistent runtime remains Linux-first; improved portability may be evaluated later.
+- Runtime hardening validates process identity via marker + token in `/proc/<pid>/cmdline`.
+- Keep `./scripts/dev-codex.sh` as the main entrypoint to avoid operational race conditions.
+- Keep `./scripts/branch-sync-check.sh` and `./scripts/branch-sync-update.sh` for conservative synchronization.
+- `memory.md` must remain durable, reusable memory and must not turn into a transcript.
+- `memory-curator` consolidates state and handoff; `ERROR_LOG.md` and `PENDING_LOG.md` remain the operational trail.
diff --git a/SECURITY_AUDIT_REPORT.md b/SECURITY_AUDIT_REPORT.md
new file mode 100644
index 0000000..78d21b9
--- /dev/null
+++ b/SECURITY_AUDIT_REPORT.md
@@ -0,0 +1,738 @@
+# SECURITY_AUDIT_REPORT.md
+
+**Project:** SynapseOS  
+**Audit Date:** 2026-04-01  
+**Auditor:** Security Audit Skill  
+**Scope:** Full codebase (F59-F68) - Multi-Agent Session Orchestration through Plugin/Extension System
+
+---
+
+## Executive Summary
+
+This audit covers the SynapseOS meta-orchestrator codebase, focusing on 10 features implemented during the current sprint. The codebase demonstrates good security practices in several areas, including proper secret masking, constant-time token comparison, and path traversal protection. However, **4 HIGH severity** and **7 MEDIUM severity** issues were identified that require attention.
+
+**Verdict:** `SECURITY_PASS_WITH_NOTES` - Risks mitigable with documented corrections.
+
+---
+
+## 1. Mapped Attack Surface
+
+### 1.1 HTTP API Surface (Control Plane)
+
+| Endpoint                       | Method | Auth Required | Input Surface                     | Risk Level |
+| ------------------------------ | ------ | ------------- | --------------------------------- | ---------- |
+| `/health`                      | GET    | No            | None                              | Low        |
+| `/api/v1/runs`                 | GET    | Yes           | Query params: `limit`, `offset`   | Low        |
+| `/api/v1/runs`                 | POST   | Yes           | JSON body: `prompt` (unvalidated) | **HIGH**   |
+| `/api/v1/runs/{run_id}`        | GET    | Yes           | Path param: `run_id`              | Low        |
+| `/api/v1/runs/{run_id}/cancel` | POST   | Yes           | Path param: `run_id`              | Medium     |
+| `/api/v1/runtime/status`       | GET    | Yes           | None                              | Low        |
+| `/api/v1/artifacts/{run_id}`   | GET    | Yes           | Path param: `run_id`              | Medium     |
+
+**Key Attack Vectors:**
+
+- `/api/v1/runs` (POST): User-supplied `prompt` is written directly to the filesystem without validation
+- `/api/v1/artifacts/{run_id}`: Path traversal risk on artifact listing
+
+### 1.2 Authentication & Authorization
+
+| Component         | Mechanism                       | Storage            | Risk                                          |
+| ----------------- | ------------------------------- | ------------------ | --------------------------------------------- |
+| Control Plane API | Bearer token                    | In-memory (config) | Medium - Single shared token                  |
+| CLI Auth          | Token-based registry            | SQLite + JSON file | Low - SHA256 hashing, constant-time compare   |
+| Role-Based Access | 3 roles (admin/operator/viewer) | File-based         | Low - Well-defined permission matrix          |
+
+**Components:**
+
+- `src/synapse_os/control_plane/middleware.py` - Bearer token middleware
+- `src/synapse_os/auth.py` - Auth registry with RBAC
+
+### 1.3 CLI Adapters (External Command Execution)
+
+| Adapter             | Command                          | Injection Risk                                 | Environment      |
+| ------------------- | -------------------------------- | ---------------------------------------------- | ---------------- |
+| `CodexCLIAdapter`   | `./scripts/dev-codex.sh -- exec` | **HIGH** - User prompt passed to shell         | Docker container |
+| `GeminiCLIAdapter`  | `python -c ...`                  | **HIGH** - Prompt sent as argv to `python -c`  | Host             |
+| `CopilotCLIAdapter` | `gh copilot ai`                  | **HIGH** - User prompt passed to CLI           | Host             |
+
+**Components:**
+
+- `src/synapse_os/adapters.py` - All CLI adapters
+- `src/synapse_os/runtime/circuit_breaker.py` - Failure detection
+
+### 1.4 Plugin Loading System
+
+| Entry Point          | Loading Mechanism                   | Validation | Risk                                |
+| -------------------- | ----------------------------------- | ---------- | ----------------------------------- |
+| `synapse_os.plugins` | `importlib.metadata.entry_points()` | None       | **HIGH** - Arbitrary code execution |
+
+**Components:**
+
+- `src/synapse_os/plugins.py` - Plugin registry and loader
+
+### 1.5 File System Surface
+
+| Operation        | Path Validation                       | Permission Controls         | Risk   |
+| ---------------- | ------------------------------------- | --------------------------- | ------ |
+| Spec creation    | `/tmp/synapse-os/api-specs/{uuid}.md` | No (relies on /tmp perms)   | Medium |
+| Artifact storage | `resolve_path_within_root()`          | `0o600` files, `0o700` dirs | Low    |
+| Auth registry    | `resolve_path_within_root()`          | `0o600` files, `0o700` dirs | Low    |
+| Workspace pool   | `base_dir / f"ws-{counter}"`          | Standard perms              | Low    |
+
+**Components:**
+
+- `src/synapse_os/security.py` - Path validation utilities
+- `src/synapse_os/persistence.py` - Artifact storage with permissions
+
+### 1.6 Runtime & Process Management
+
+| Operation        | Mechanism                                    | Risk                               |
+| ---------------- | -------------------------------------------- | ---------------------------------- |
+| Process spawning | `subprocess.Popen` with injected Python code | Medium - Code injection via string |
+| Signal handling  | SIGTERM/SIGINT handlers                      | Low                                |
+| PID tracking     | `/proc/{pid}/cmdline` parsing                | Low - Linux-specific               |
+
+**Components:**
+
+- `src/synapse_os/runtime/service.py` - Runtime lifecycle management
+
+### 1.7 Data Persistence
+
+| Storage            | Encryption | Access Control        | Risk                           |
+| ------------------ | ---------- | --------------------- | ------------------------------ |
+| SQLite runs DB     | No         | File permissions only | Medium - Contains run metadata |
+| Artifact files     | No         | `0o600` permissions   | Low                            |
+| Auth registry JSON | No         | `0o600` permissions   | Medium - Token hashes present  |
+| Memory store       | No         | File permissions only | Low                            |
+
+---
+
+## 2. Findings by Severity
+
+### CRITICAL (0 issues)
+
+No critical vulnerabilities identified that would allow immediate system compromise.
+
+---
+
+### HIGH (4 issues)
+
+#### H1: Command Injection in CLI Adapters
+
+**Location:** `src/synapse_os/adapters.py:186-191`, `311-322`, `368-376`  
+**Severity:** HIGH  
+**CVSS:** 7.5 (AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L)
+
+**Description:**
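+User-supplied `prompt` is forwarded unmodified as an argument to external CLI commands. The argument list itself bypasses the shell, so exploitability depends on how the wrapper script re-expands its arguments: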
+
+```python
+# adapters.py:186-191 (CodexCLIAdapter)
+def build_command(self, prompt: str) -> list[str]:
+    return [
+        "./scripts/dev-codex.sh",
+        "--",
+        "exec",
+        "--color",
+        "never",
+        prompt,  # <-- Direct injection
+    ]
+```
+
+**Exploit Path:**
+
+1. Attacker provides prompt: `"; cat /etc/passwd; echo "`
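+2. If `dev-codex.sh` or a downstream layer re-evaluates its arguments unquoted (e.g. via `eval` or an unquoted `$@`), the shell metacharacters are interpreted
+3. Result: arbitrary command execution on host/container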
+
+**Mitigation:**
+
+```python
+# Keep list args (no shell=True). Do not apply shlex.quote() here: without a
+# shell, the quotes would reach the CLI as literal characters in the prompt.
+def build_command(self, prompt: str) -> list[str]:
+    return [
+        "./scripts/dev-codex.sh",
+        "--",
+        "exec",
+        "--color",
+        "never",
+        prompt,  # Safe only if the script forwards "$@" quoted; better: stdin
+    ]
+```
+
+**Recommended Macro:** `fix-feature` for prompt sanitization
+
+---
+
+#### H2: Prompt Exposure via Command Line in GeminiCLIAdapter
+
+**Location:** `src/synapse_os/adapters.py:315-322`  
+**Severity:** HIGH  
+**CVSS:** 8.1 (AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:H/A:L)
+
+**Description:**
+Prompt is passed as a command-line argument (argv[1]) to `python -c`:
+
+```python
+return [
+    sys.executable,
+    "-c",
+    "import os, sys; "
+    "key = os.environ.get('SYNAPSE_OS_GEMINI_API_KEY'); "
+    "print(f'Gemini response to: {sys.argv[1]}') "
+    "if key else sys.exit('Error: SYNAPSE_OS_GEMINI_API_KEY not set')",
+    prompt,  # Passed as argv[1]
+]
+```
+
+The prompt is passed as a data argument (argv[1]), not interpolated into the Python source string, so this is not code injection. However, it still travels on the command line, which can expose it to other local users through `/proc/<pid>/cmdline`.
+
+**Mitigation:**
+Pass prompt via stdin or environment variable instead of command line.
+
+**Recommended Macro:** `fix-feature` for adapter refactoring
+
+---
+
+#### H3: Arbitrary Code Execution via Plugin System
+
+**Location:** `src/synapse_os/plugins.py:95-108`  
+**Severity:** HIGH  
+**CVSS:** 8.8 (AV:L/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H)
+
+**Description:**
+Plugins loaded via `entry_points()` execute arbitrary code at import time:
+
+```python
+def load_plugins(self) -> None:
+    eps = entry_points(group="synapse_os.plugins")
+    for ep in eps:
+        try:
+            module = ep.load()  # <-- Executes module-level code
+            # ...
+        except Exception:
+            pass  # Silent failure
+```
+
+Any installed package can register an entry point and execute code when SynapseOS starts.
+
+**Mitigation:**
+
+1. Implement plugin signature verification
+2. Maintain allowlist of approved plugins
+3. Load plugins in isolated subprocess/sandbox
+4. Log all plugin loads with full path
+
+**Recommended Macro:** `fix-feature` for plugin sandboxing
+
+---
+
+#### H4: Unvalidated Spec File Creation via API
+
+**Location:** `src/synapse_os/control_plane/server.py:225-240`  
+**Severity:** HIGH  
+**CVSS:** 7.2 (AV:N/AC:L/PR:H/UI:N/S:U/C:L/I:H/A:L)
+
+**Description:**
+The user prompt is written to the filesystem without validation:
+
+```python
+def _create_spec_from_prompt(prompt: str) -> Path:
+    tmp_dir = Path(os.environ.get("TMPDIR", "/tmp")) / "synapse-os" / "api-specs"
+    tmp_dir.mkdir(parents=True, exist_ok=True)
+    spec_path = tmp_dir / f"{uuid4().hex}.md"
+    spec_content = (
+        "---\n"
+        "feature_id: api-run\n"
+        # ...
+        f"# API Run\n\n{prompt}\n"  # Unvalidated content
+    )
+    spec_path.write_text(spec_content, encoding="utf-8")
+    return spec_path
+```
+
+**Risk:** Path traversal via symlink attack, malicious markdown content, or YAML frontmatter injection.
+
+**Mitigation:**
+
+1. Validate prompt against allowed characters
+2. Use secure temporary directory with proper permissions
+3. Validate generated SPEC before use
+
+**Recommended Macro:** `fix-feature` for input validation
+
+---
+
+### MEDIUM (7 issues)
+
+#### M1: Shared API Token for Control Plane
+
+**Location:** `src/synapse_os/control_plane/middleware.py:29-31`  
+**Severity:** MEDIUM
+
+**Description:**
+A single shared token guards the entire API. If that token is compromised, all API access is granted; there are no per-user or per-session tokens.
+
+**Mitigation:**
+Implement per-principal API tokens stored in auth registry.
+
+---
+
+#### M2: No Rate Limiting on API Endpoints
+
+**Location:** `src/synapse_os/control_plane/server.py` (all endpoints)  
+**Severity:** MEDIUM
+
+**Description:**
+No rate limiting is implemented, enabling brute-force attacks on the token and DoS via run creation.
+
+**Mitigation:**
+Add rate limiting middleware (e.g., slowapi with Redis).
+
+---
+
+#### M3: Process Identity Check Bypassable
+
+**Location:** `src/synapse_os/runtime/service.py:181-203`  
+**Severity:** MEDIUM
+
+**Description:**
+`_process_identity_matches()` reads `/proc/{pid}/cmdline`, which can be manipulated. The `PROCESS_MARKER` check is weak:
+
+```python
+if PROCESS_MARKER in arguments and process_identity in arguments:
+    return True
+```
+
+Another process could include these strings in its arguments.
+
+**Mitigation:**
+Use stronger mechanism like abstract Unix socket or pidfile with exclusive lock.
+
+---
+
+#### M4: SQL Injection Risk in Persistence (Theoretical)
+
+**Location:** `src/synapse_os/persistence.py`  
+**Severity:** MEDIUM (Currently mitigated by SQLAlchemy)
+
+**Description:**
+All queries use the SQLAlchemy ORM, which parameterizes values. However, `_upgrade_runs_schema()` issues raw DDL with no identifier validation; the statement below is a static literal and safe as written:
+
+```python
+connection.exec_driver_sql("ALTER TABLE runs ADD COLUMN spec_hash TEXT")
+```
+
+Future modifications that interpolate column or table names into such statements could introduce injection.
+
+**Mitigation:**
+Add validation for column names in schema migrations.
+
+---
+
+#### M5: Artifact Path Traversal via Run ID
+
+**Location:** `src/synapse_os/persistence.py:527-539`  
+**Severity:** MEDIUM
+
+**Description:**
+`list_artifact_paths()` uses `rglob` after path validation. Symlink attacks could still escape base_path:
+
+```python
+for path in run_directory.rglob("*"):
+    if not path.is_file():
+        continue
+    try:
+        resolve_path_within_root(path, root=self.base_path)
+    except ValueError:
+        continue
+```
+
+**Risk:** TOCTOU between `rglob()` and `resolve_path_within_root()`.
+
+**Mitigation:**
+Use `O_NOFOLLOW` when opening files or resolve before operations.
+
+---
+
+#### M6: Secrets in Environment Variables
+
+**Location:** `src/synapse_os/config.py` (indirect)  
+**Severity:** MEDIUM
+
+**Description:**
+Configuration is read from environment variables (`SYNAPSE_OS_*`), which:
+
+1. Are visible in process listings (`ps e`) to the same user and to root
+2. May be logged by Docker or CI systems
+3. Persist in shell history when set inline on the command line
+
+**Mitigation:**
+Support file-based secrets (e.g., `/run/secrets/`) as primary method.
+
+---
+
+#### M7: Circuit Breaker State File Tampering
+
+**Location:** `src/synapse_os/runtime/circuit_breaker.py`  
+**Severity:** MEDIUM
+
+**Description:**
+Circuit breaker state is stored in a JSON file without integrity verification. An attacker with file access could reset the failure counters.
+
+**Mitigation:**
+Add HMAC signature or store in tamper-evident database.
+
+---
+
+### LOW (5 issues)
+
+#### L1: Health Endpoint Information Disclosure
+
+**Location:** `src/synapse_os/control_plane/server.py:52-60`  
+**Severity:** LOW
+
+**Description:**
+`/health` endpoint exposes runtime status without authentication, revealing system state to reconnaissance.
+
+**Mitigation:**
+Consider requiring auth for detailed status, or limiting the information returned to unauthenticated callers.
+
+---
+
+#### L2: Exception Details in HTTP Responses
+
+**Location:** `src/synapse_os/control_plane/server.py` (multiple)  
+**Severity:** LOW
+
+**Description:**
+Some error handlers chain exceptions which may leak internal details:
+
+```python
+raise HTTPException(status_code=404, detail="Run not found") from err
+```
+
+**Mitigation:**
+Log full tracebacks internally, return generic messages externally.
+
+---
+
+#### L3: No Input Length Limits on Prompt
+
+**Location:** `src/synapse_os/control_plane/models.py`  
+**Severity:** LOW
+
+**Description:**
+`RunCreateRequest.prompt` has no maximum length validation, enabling memory exhaustion attacks.
+
+---
+
+#### L4: Workspace Cleanup Race Condition
+
+**Location:** `src/synapse_os/workspace.py:43-48`  
+**Severity:** LOW
+
+**Description:**
+`reset_for_reuse()` iterates and deletes without locking:
+
+```python
+for item in self.root.iterdir():
+    if item.name != self.root.name:
+        if item.is_dir():
+            shutil.rmtree(item)
+```
+
+**Risk:** Race condition during concurrent cleanup.
+
+---
+
+#### L5: Missing Security Headers in FastAPI
+
+**Location:** `src/synapse_os/control_plane/server.py:42-47`  
+**Severity:** LOW
+
+**Description:**
+No security headers (HSTS, CSP, X-Frame-Options, etc.) configured.
+
+---
+
+## 3. Secrets Management
+
+### Current Implementation
+
+| Aspect               | Status | Details                                                    |
+| -------------------- | ------ | ---------------------------------------------------------- |
+| Token Storage        | Good   | SHA256 hashes only, never plaintext (auth.py:267-268)      |
+| Token Comparison     | Good   | `hmac.compare_digest()` for constant-time (auth.py:214)    |
+| API Keys in Adapters | Poor   | Read from env, no rotation mechanism                       |
+| Secret Masking       | Good   | Configurable regex patterns (security.py:11-16)            |
+| File Permissions     | Good   | `0o600` for files, `0o700` for dirs (persistence.py:47-48) |
+
+### Secrets Identified in Code
+
+| Secret           | Location             | Storage Method              | Risk                         |
+| ---------------- | -------------------- | --------------------------- | ---------------------------- |
+| GitHub Token     | `adapters.py` (env)  | `SYNAPSE_OS_GITHUB_TOKEN`   | Medium - Env exposure        |
+| Gemini API Key   | `adapters.py` (env)  | `SYNAPSE_OS_GEMINI_API_KEY` | Medium - Env exposure        |
+| API Bearer Token | `middleware.py`      | In-memory/config            | Medium - Single shared token |
+| Claude API Key   | `.github/workflows/` | `secrets.CLAUDE_API_KEY`    | Low - GitHub Secrets         |
+
+### Recommendations
+
+1. **Implement secret rotation mechanism** for API keys
+2. **Use Docker secrets or external vault** (HashiCorp Vault, AWS Secrets Manager)
+3. **Add audit logging** for all token usage
+4. **Implement token expiration** for issued tokens
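+
+Recommendation 2 can start small: prefer a mounted secret file and fall back to the environment. In the sketch below, `read_secret` and its `secrets_dir` parameter are hypothetical names; `/run/secrets` is the directory where Docker mounts secrets.
+
+```python
+import os
+from pathlib import Path
+
+
+def read_secret(name: str, secrets_dir: Path = Path("/run/secrets")) -> str | None:
+    """Return the secret from a mounted file if present, else from the environment."""
+    # Accept plain file names only, so "../x" or "a/b" cannot escape the directory.
+    if not name or Path(name).name != name:
+        raise ValueError(f"invalid secret name: {name!r}")
+    candidate = secrets_dir / name
+    if candidate.is_file():
+        return candidate.read_text(encoding="utf-8").strip()
+    # Fallback keeps today's behavior: read the value from the environment.
+    return os.environ.get(name)
+```
+
+Called as `read_secret("SYNAPSE_OS_GITHUB_TOKEN")`, this keeps existing environment-based setups working while allowing a file-based deployment to take precedence.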
+
+---
+
+## 4. Dependencies with Known Vulnerabilities
+
+### Dependency Analysis
+
+| Package             | Version   | CVE Status       | Risk                         |
+| ------------------- | --------- | ---------------- | ---------------------------- |
+| FastAPI             | >=0.115.0 | No known CVEs    | Low                          |
+| SQLAlchemy          | >=2.0.36  | No critical CVEs | Low                          |
+| Typer               | >=0.12.5  | No known CVEs    | Low                          |
+| Pydantic            | >=2.9.2   | No critical CVEs | Low                          |
+| python-statemachine | >=2.5.0   | **Unknown**      | Medium - Less common package |
+| textual             | >=8.1.1   | No known CVEs    | Low                          |
+
+### Supply Chain Risks
+
+1. **Entry Points System** (plugins.py): Loads code from any installed package
+2. **CLI Adapters**: Execute external commands (`gh`, `docker`, custom scripts)
+3. **No dependency pinning in requirements**: Uses `>=` version constraints
+
+### Recommendations
+
+```bash
+# Run dependency audit
+pip install safety
+safety check -r requirements.txt
+
+# Consider pinning exact versions
+pip freeze > requirements-lock.txt
+```
+
+---
+
+## 5. CI/CD and Automation
+
+### GitHub Workflows Analysis
+
+| Workflow              | Privileges                               | Issues                          | Risk       |
+| --------------------- | ---------------------------------------- | ------------------------------- | ---------- |
+| `security-review.yml` | `pull-requests: write`, `contents: read` | Uses third-party action `@main` | **MEDIUM** |
+| `operational-ci.yml`  | `contents: read`                         | None identified                 | Low        |
+| `container-build.yml` | `contents: read`                         | None identified                 | Low        |
+
+### Security Gate Analysis
+
+**Location:** `scripts/security-gate.sh`
+
+**Strengths:**
+
+- Checks for `permissions:` in workflows
+- Blocks `eval` usage in scripts
+- Blocks `curl | sh` patterns
+- Blocks privileged containers
+- Blocks docker.sock mounting
+
+**Gaps:**
+
+- No check for action pinning (using `@main`, `@v1` instead of commit SHA)
+- No check for secret leakage in logs
+- No check for workflow injection via `pull_request_target`
+
+### Scripts Analysis
+
+| Script                | Privilege Escalation | Injection Risk             | Safe |
+| --------------------- | -------------------- | -------------------------- | ---- |
+| `dev-codex.sh`        | No                   | Low - User-controlled args | Yes  |
+| `docker-preflight.sh` | No                   | Low                        | Yes  |
+| `security-gate.sh`    | No                   | Low                        | Yes  |
+| `commit-check.sh`     | No                   | Low                        | Yes  |
+
+### Recommendations
+
+1. **Pin all GitHub Actions to commit SHAs:**
+
+    ```yaml
+    # Instead of:
+    - uses: actions/checkout@v4
+
+    # Use:
+    - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+    ```
+
+2. **Add workflow validation** to security gate for:
+    - `pull_request_target` usage
+    - Unpinned actions
+    - `GITHUB_TOKEN` with write permissions
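+
+A minimal sketch of the unpinned-action check, assuming the gate could call a small Python helper (the actual gate is a shell script, and `find_unpinned_actions` is a hypothetical name). It flags every `uses:` reference that is not pinned to a 40-character commit SHA; local (`./`) and `docker://` references are skipped.
+
+```python
+import re
+
+# Matches "uses: owner/repo[/path]@ref", with an optional list dash and quotes.
+USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^'\"\s#]+)", re.MULTILINE)
+SHA_RE = re.compile(r"^[0-9a-f]{40}$")
+
+
+def find_unpinned_actions(workflow_text: str) -> list[str]:
+    """Return action references that are not pinned to a full commit SHA."""
+    unpinned = []
+    for ref in USES_RE.findall(workflow_text):
+        if ref.startswith(("./", "docker://")):
+            continue  # local actions and container images are out of scope here
+        _, _, version = ref.partition("@")
+        if not SHA_RE.match(version):
+            unpinned.append(ref)
+    return unpinned
+```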
+
+---
+
+## 6. Prioritized Recommendations
+
+### Immediate (P0 - Fix before release)
+
+1. **[H1] Sanitize prompts in CLI adapters** (adapters.py)
+    - Use `shlex.quote()` or stdin-based passing
+    - Effort: 2 hours
+    - Macro: `fix-feature`
+
+2. **[H2] Fix Gemini adapter code injection** (adapters.py:315-322)
+    - Pass prompt via stdin or env var
+    - Effort: 1 hour
+    - Macro: `fix-feature`
+
+3. **[H3] Implement plugin allowlist** (plugins.py:95-108)
+    - Add signature verification
+    - Effort: 4 hours
+    - Macro: `fix-feature`
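+
+The stdin-based option for H1/H2 could look like the sketch below. It assumes the target CLI can read its prompt from standard input, which has to be confirmed per tool; `run_cli_with_prompt` is a hypothetical helper, not the adapter API.
+
+```python
+import subprocess
+
+
+def run_cli_with_prompt(argv: list[str], prompt: str, timeout: float = 60.0) -> str:
+    """Run a CLI without a shell and deliver the prompt on stdin."""
+    completed = subprocess.run(
+        argv,                 # argument list: no shell parsing, no quoting pitfalls
+        input=prompt,         # the prompt never appears on the command line
+        capture_output=True,
+        text=True,
+        timeout=timeout,
+        check=True,           # non-zero exit raises CalledProcessError
+    )
+    return completed.stdout
+```
+
+Since no shell is involved and the prompt is never interpolated into a command string, shell metacharacters in the prompt are treated as plain data.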
+
+### Short-term (P1 - Fix within 2 weeks)
+
+4. **[H4] Add input validation for API spec creation** (server.py:225-240)
+    - Validate prompt length and content
+    - Effort: 2 hours
+    - Macro: `fix-feature`
+
+5. **[M1] Implement per-principal API tokens**
+    - Extend auth registry to support API tokens
+    - Effort: 4 hours
+    - Macro: `fix-feature`
+
+6. **[M2] Add rate limiting** (server.py)
+    - Implement per-endpoint rate limits
+    - Effort: 3 hours
+    - Macro: `fix-feature`
+
+7. **[M6] Support file-based secrets**
+    - Read secrets from `/run/secrets/` or similar
+    - Effort: 2 hours
+    - Macro: `fix-feature`
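+
+For M2, a token bucket is one common way to implement per-endpoint limits. The sketch below is an in-memory, single-process illustration with hypothetical names (`TokenBucketLimiter`, `allow`); it is not thread-safe and says nothing about how the server would wire it into request handling.
+
+```python
+import time
+from collections import defaultdict
+
+
+class TokenBucketLimiter:
+    """In-memory token bucket keyed by (client, endpoint)."""
+
+    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic) -> None:
+        self.rate = rate_per_sec
+        self.burst = burst
+        self.clock = clock
+        # Each bucket stores [tokens_available, last_refill_timestamp].
+        self._buckets = defaultdict(lambda: [float(burst), clock()])
+
+    def allow(self, client: str, endpoint: str) -> bool:
+        bucket = self._buckets[(client, endpoint)]
+        now = self.clock()
+        # Refill in proportion to elapsed time, capped at the burst size.
+        bucket[0] = min(self.burst, bucket[0] + (now - bucket[1]) * self.rate)
+        bucket[1] = now
+        if bucket[0] >= 1.0:
+            bucket[0] -= 1.0
+            return True
+        return False
+```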
+
+### Medium-term (P2 - Next sprint)
+
+8. **[M3] Strengthen process identity verification**
+    - Use abstract sockets or pidfile locks
+    - Effort: 4 hours
+    - Macro: `fix-feature`
+
+9. **[M5] Fix artifact path traversal**
+    - Use `O_NOFOLLOW` or pre-resolve paths
+    - Effort: 2 hours
+    - Macro: `fix-feature`
+
+10. **Pin GitHub Actions to commit SHAs**
+    - Update all workflows
+    - Effort: 1 hour
+    - Macro: `ci-automation`
+
+11. **Add dependency scanning to CI**
+    - Integrate `safety` or `pip-audit`
+    - Effort: 2 hours
+    - Macro: `ci-automation`
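+
+For M5, the "pre-resolve paths" option can be sketched as follows. `resolve_artifact_path` is a hypothetical helper; it resolves the requested path and rejects anything that lands outside the artifact root. A symlink swapped in after the check is not covered, which is the case `O_NOFOLLOW` at open time addresses.
+
+```python
+from pathlib import Path
+
+
+def resolve_artifact_path(base_dir: Path, relative: str) -> Path:
+    """Resolve `relative` under `base_dir`, rejecting anything that escapes it."""
+    base = base_dir.resolve()
+    # resolve() collapses ".." segments and follows symlinks before the check.
+    target = (base / relative).resolve()
+    if not target.is_relative_to(base):
+        raise ValueError(f"path escapes artifact root: {relative!r}")
+    return target
+```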
+
+---
+
+## 7. Next Steps
+
+### Immediate Actions
+
+1. **Open `fix-feature` branches for:**
+    - `fix/adapter-command-injection` (H1, H2)
+    - `fix/plugin-allowlist` (H3)
+    - `fix/api-input-validation` (H4)
+
+2. **Security regression tests to add:**
+
+    ```python
+    # test_security.py additions (stubs to be implemented)
+    def test_prompt_injection_codex_adapter(): ...
+    def test_prompt_injection_gemini_adapter(): ...
+    def test_plugin_unauthorized_load(): ...
+    def test_api_prompt_path_traversal(): ...
+    ```
+
+3. **CI hardening:**
+    - Pin all actions in `.github/workflows/`
+    - Add dependency scanning step
+    - Add secret scanning with `truffleHog`
+
+### Documentation Updates
+
+1. Update `docs/architecture/SDD.md` with:
+    - Security boundary definitions
+    - Trust boundaries diagram
+    - Plugin security model
+
+2. Update `AGENTS.md` with:
+    - Security review requirements for adapters
+    - Plugin development guidelines
+
+### Ongoing Security Practices
+
+1. **Quarterly security audits** using this same methodology
+2. **Dependency scanning** on every PR via CI
+3. **Secret rotation** every 90 days
+4. **Penetration testing** before major releases
+
+---
+
+## Appendices
+
+### A. Files Reviewed
+
+```
+src/synapse_os/control_plane/server.py
+src/synapse_os/control_plane/middleware.py
+src/synapse_os/auth.py
+src/synapse_os/adapters.py
+src/synapse_os/plugins.py
+src/synapse_os/supervisor.py
+src/synapse_os/memory.py
+src/synapse_os/security.py
+src/synapse_os/workspace.py
+src/synapse_os/config.py
+src/synapse_os/runtime/service.py
+src/synapse_os/multi_agent.py
+src/synapse_os/pipeline.py
+src/synapse_os/persistence.py
+scripts/security-gate.sh
+scripts/dev-codex.sh
+scripts/docker-preflight.sh
+.github/workflows/security-review.yml
+.github/workflows/operational-ci.yml
+.github/workflows/container-build.yml
+pyproject.toml
+```
+
+### B. Tools Used
+
+- Manual code review
+- Pattern matching for security anti-patterns
+- Architecture mapping
+- STRIDE threat modeling (implicit)
+
+### C. Limitations
+
+1. Dynamic analysis not performed (no runtime testing)
+2. Dependency CVE scan not executed (requires `safety` or `pip-audit`)
+3. Container security scan not performed
+4. Network-level testing not performed
+5. Fuzzing not performed on input validation
+
+---
+
+**Report Generated:** 2026-04-01  
+**Security Review Status:** `SECURITY_PASS_WITH_NOTES`  
+**Next Audit Due:** 2026-07-01
+
+---
+
+_This report was generated by the security-audit skill following SynapseOS security review protocols._
diff --git a/docs/adr/003-state-machine-pipeline-engine.md b/docs/adr/003-state-machine-pipeline-engine.md
index 46b1aff..6592e76 100644
--- a/docs/adr/003-state-machine-pipeline-engine.md
+++ b/docs/adr/003-state-machine-pipeline-engine.md
@@ -1,29 +1,47 @@
 # ADR-003 — Adotar state machine + Synapse-Flow
 
 ## Status
-Aceito
+
+Aceito (atualizado para DAG pipeline)
 
 ## Contexto
-O SynapseOS precisa coordenar uma esteira com estados explícitos, retries, rollback lógico, hand-offs auditáveis e futura evolução para DAG. Scripts lineares isolados comprometeriam rastreabilidade, manutenção e controle fino do domínio.
+
+O SynapseOS precisa coordenar uma esteira com estados explícitos, retries, rollback lógico, hand-offs auditáveis e execução paralela. Scripts lineares isolados comprometeriam rastreabilidade, manutenção e controle fino do domínio.
 
 ## Decisão
+
 O sistema adotará:
+
 - **state machine** para governar estados e transições;
-- o **Synapse-Flow**, a **engine própria de pipeline** do SynapseOS, em Python para coordenar os steps, hand-offs, retries e integração com o supervisor.
+- o **Synapse-Flow**, a **engine própria de pipeline** do SynapseOS, em Python para coordenar os steps, hand-offs, retries e integração com o supervisor;
+- **DAG pipeline execution** com suporte a:
+    - steps com dependências explícitas;
+    - execução paralela de steps independentes via `asyncio.gather`;
+    - fan-out/fan-in para steps que precisam aguardar múltiplas dependências;
+    - detecção de ciclos no grafo de dependências.
+
+O Synapse-Flow mantém compatibilidade com pipelines lineares (DAG de 1 caminho) enquanto evolui para execução paralela real.
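+
+A minimal sketch of the scheduling idea, not the Synapse-Flow implementation: the standard-library `graphlib.TopologicalSorter` yields the steps whose dependencies are already satisfied, each ready batch runs with `asyncio.gather`, and a cyclic graph raises `graphlib.CycleError` before any step starts. `run_dag` and the step names are hypothetical.
+
+```python
+import asyncio
+from graphlib import CycleError, TopologicalSorter
+
+
+async def run_dag(deps: dict[str, set[str]], run_step) -> list[list[str]]:
+    """Run steps in dependency order; return the batches that ran in parallel.
+
+    `deps` maps each step to the set of steps it depends on.
+    """
+    sorter = TopologicalSorter(deps)
+    sorter.prepare()  # raises CycleError if the graph has a cycle
+    batches = []
+    while sorter.is_active():
+        ready = sorted(sorter.get_ready())
+        # Steps in one batch do not depend on each other, so they run concurrently.
+        await asyncio.gather(*(run_step(name) for name in ready))
+        sorter.done(*ready)  # fan-in: dependents become ready once all deps are done
+        batches.append(ready)
+    return batches
+```
+
+A linear pipeline is the special case where every batch holds a single step. This version waits for a whole batch before releasing the next one; a scheduler that releases dependents as each individual step finishes would be more eager.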
 
 ## Consequências
+
 ### Positivas
+
 - transições explícitas e auditáveis;
 - forte aderência ao domínio do produto;
-- menor complexidade operacional no MVP do que orquestradores pesados;
-- caminho claro para evolução futura para DAG e paralelismo.
+- execução paralela reduz tempo total de runs com steps independentes;
+- caminho claro para evolução futura para workers distribuídos;
+- modelo de dependências explícitas melhora documentação e rastreabilidade.
 
 ### Negativas
+
 - maior responsabilidade de implementação interna;
-- necessidade de testes rigorosos do Synapse-Flow.
+- necessidade de testes rigorosos do Synapse-Flow;
+- complexidade adicional de scheduling paralelo e sincronização;
+- detecção de ciclos e validação de DAG adicionam overhead.
 
 ## Alternativas consideradas
-- pipeline linear hardcoded;
-- Prefect desde o MVP;
-- Temporal desde o MVP;
-- fila sem modelagem explícita de estado.
+
+- pipeline linear hardcoded: rejeitado — não aproveita paralelismo;
+- Prefect desde o MVP: rejeitado — complexidade operacional prematura;
+- Temporal desde o MVP: rejeitado — overkill para estado atual;
+- fila sem modelagem explícita de estado: rejeitado — perde rastreabilidade.
diff --git a/docs/adr/004-cli-adapter-layer.md b/docs/adr/004-cli-adapter-layer.md
index 086364a..c49b704 100644
--- a/docs/adr/004-cli-adapter-layer.md
+++ b/docs/adr/004-cli-adapter-layer.md
@@ -1,26 +1,51 @@
-# ADR-004 — Implementar uma camada de abstração para adapters CLI
+# ADR-004 — Implementar uma camada de abstração para adapters CLI com Multi-Agent Orchestration
 
 ## Status
-Aceito
+
+Aceito (atualizado para multi-agent)
 
 ## Contexto
-Cada ferramenta externa difere em sintaxe de comando, comportamento operacional, timeouts, autenticação e formato de output.
+
+Cada ferramenta externa difere em sintaxe de comando, comportamento operacional, timeouts, autenticação e formato de output. Com múltiplos agentes disponíveis, o sistema precisa:
+
+- Registrar adapters dinamicamente;
+- Rotear tarefas para o agente mais adequado baseado em capability;
+- Suportar fallback entre agentes quando um falha.
 
 ## Decisão
-Criar uma camada dedicada de adapters CLI com interface compartilhada e implementações específicas por ferramenta.
+
+Criar uma camada dedicada de adapters CLI com:
+
+1. **Interface compartilhada** (`BaseCLIAdapter`) para execução padronizada;
+2. **AdapterRegistry**: registro dinâmico de adapters disponíveis;
+3. **CapabilityRouter**: roteamento baseado em capabilities declaradas por cada adapter:
+    - Cada adapter declara capabilities (ex: `code_generation`, `refactoring`, `testing`);
+    - Router seleciona adapter mais adequado para a tarefa;
+    - Suporte a fallback automático em caso de falha;
+    - Política configurável (custo, latência, qualidade);
+4. **Implementações específicas por ferramenta**: Gemini, Codex, Copilot, OpenCode, DeepSeek, Claude, LLMs locais.
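+
+The registry and router described above can be sketched as below. This is an illustration under assumptions, not the project's code: `Adapter`, its `cost` field, and the `execute` method are hypothetical, and the policy is reduced to "cheapest capable adapter first, fall back on failure".
+
+```python
+from dataclasses import dataclass, field
+from typing import Callable
+
+
+@dataclass
+class Adapter:
+    name: str
+    capabilities: set[str]
+    run: Callable[[str], str]
+    cost: float = 1.0  # assumed policy input; lower is preferred
+
+
+@dataclass
+class AdapterRegistry:
+    adapters: dict[str, Adapter] = field(default_factory=dict)
+
+    def register(self, adapter: Adapter) -> None:
+        self.adapters[adapter.name] = adapter
+
+
+class CapabilityRouter:
+    def __init__(self, registry: AdapterRegistry) -> None:
+        self.registry = registry
+
+    def candidates(self, capability: str) -> list[Adapter]:
+        # Keep only adapters that declare the capability, cheapest first.
+        matching = [a for a in self.registry.adapters.values() if capability in a.capabilities]
+        return sorted(matching, key=lambda a: a.cost)
+
+    def execute(self, capability: str, prompt: str) -> tuple[str, str]:
+        errors = []
+        for adapter in self.candidates(capability):
+            try:
+                return adapter.name, adapter.run(prompt)
+            except Exception as exc:  # fall back to the next candidate
+                errors.append(f"{adapter.name}: {exc}")
+        raise RuntimeError(f"no adapter succeeded for {capability!r}: {errors}")
+```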
 
 ## Consequências
+
 ### Positivas
+
 - desacoplamento entre orquestrador e ferramentas;
 - melhor testabilidade;
-- extensão simplificada para novos agentes;
-- centralização de políticas de execução.
+- extensão simplificada para novos agentes (apenas registrar no AdapterRegistry);
+- centralização de políticas de execução;
+- roteamento inteligente permite otimizar custo/qualidade por tarefa;
+- resiliência via fallback entre múltiplos agentes.
 
 ### Negativas
+
 - necessidade de manutenção contínua dos adapters;
-- risco de abstração ruim esconder comportamentos úteis específicos de uma ferramenta.
+- risco de abstração ruim esconder comportamentos úteis específicos;
+- complexidade adicional do CapabilityRouter (decisão de roteamento);
+- necessidade de mapear capabilities de forma consistente entre adapters.
 
 ## Alternativas consideradas
-- chamadas diretas a subprocess espalhadas pelo código;
-- adapter único parametrizado para tudo;
-- scripts shell externos como wrappers.
+
+- chamadas diretas a subprocess espalhadas pelo código: rejeitado — sem abstração;
+- adapter único parametrizado para tudo: rejeitado — não lida com diferenças semânticas;
+- scripts shell externos como wrappers: rejeitado — difícil testar e manter;
+- roteamento hardcoded por tarefa: rejeitado — não é extensível.
diff --git a/docs/adr/005-semantic-memory.md b/docs/adr/005-semantic-memory.md
index 1525f40..1ba857f 100644
--- a/docs/adr/005-semantic-memory.md
+++ b/docs/adr/005-semantic-memory.md
@@ -1,26 +1,50 @@
-# ADR-005 — Implementar memória semântica com papel advisory no MVP
+# ADR-005 — Implementar memória semântica com papel advisory no MVP e indexing
 
 ## Status
-Aceito
+
+Aceito (atualizado para indexing)
 
 ## Contexto
-A memória semântica pode futuramente influenciar roteamento e planejamento, mas isso aumenta risco de comportamento pouco previsível e difícil de explicar no primeiro release.
+
+A memória semântica pode futuramente influenciar roteamento e planejamento, mas isso aumenta risco de comportamento pouco previsível. Além disso, o volume de artefatos gerados demanda indexação eficiente para consultas rápidas.
 
 ## Decisão
-No MVP, a memória semântica será implementada com papel **advisory/read-only**. Ela servirá para apoio de execução, enriquecimento de contexto e análise posterior, sem alterar automaticamente o roteamento.
+
+No MVP, a memória semântica será implementada com:
+
+1. **Papel advisory/read-only**: apoio de execução, enriquecimento de contexto e análise posterior, sem alterar automaticamente o roteamento;
+2. **IndexedArtifactStore**: armazenamento de artefatos com índices para consulta rápida:
+    - Índice por run_id, step_id, tipo de artefato;
+    - Índice por timestamp para consultas temporais;
+    - Índice por hash de conteúdo para deduplicação;
+3. **Namespacing**: isolamento de memória por:
+    - Workspace (diferentes projetos);
+    - Run (contexto de execução);
+    - Step (contexto de step específico);
+    - Global (padrões compartilhados entre runs).
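+
+A minimal in-memory sketch of the indexing and deduplication ideas, under stated assumptions: the method names and the string form of the namespace are hypothetical, only two of the listed indexes are shown, and the real store is expected to be persistent.
+
+```python
+import hashlib
+from collections import defaultdict
+
+
+class IndexedArtifactStore:
+    """In-memory sketch: content-addressed blobs plus secondary indexes."""
+
+    def __init__(self) -> None:
+        # Blobs are keyed by (namespace, content hash) so that deduplication
+        # never shares data across namespaces.
+        self._blobs: dict[tuple[str, str], bytes] = {}
+        self._by_run = defaultdict(list)   # (namespace, run_id) -> hashes
+        self._by_kind = defaultdict(list)  # (namespace, kind) -> hashes
+
+    def put(self, namespace: str, run_id: str, kind: str, payload: bytes) -> str:
+        digest = hashlib.sha256(payload).hexdigest()
+        # Identical content is stored once; only index entries are added.
+        self._blobs.setdefault((namespace, digest), payload)
+        self._by_run[(namespace, run_id)].append(digest)
+        self._by_kind[(namespace, kind)].append(digest)
+        return digest
+
+    def by_run(self, namespace: str, run_id: str) -> list[bytes]:
+        hashes = self._by_run.get((namespace, run_id), [])
+        return [self._blobs[(namespace, d)] for d in hashes]
+
+    def blob_count(self) -> int:
+        return len(self._blobs)
+```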
 
 ## Consequências
+
 ### Positivas
-- comportamento mais previsível;
+
+- comportamento mais previsível com memória advisory;
 - melhor auditabilidade;
-- menor risco de decisões automáticas ruins;
-- possibilidade de aprender com histórico sem automatizar cedo demais.
+- consultas rápidas a artefatos históricos via índices;
+- deduplicação automática reduz storage;
+- namespacing permite contextos isolados e seguros;
+- base para futura evolução de memória semântica com roteamento.
 
 ### Negativas
+
 - menos adaptação automática no curto prazo;
-- supervisor determinístico continua responsável pelas decisões principais.
+- overhead de manutenção de índices;
+- complexidade de gerenciamento de namespaces;
+- necessidade de estratégia de expiração/limpeza de índices antigos.
 
 ## Alternativas consideradas
-- operação totalmente stateless;
-- memória semântica com roteamento automático desde o MVP;
-- apenas logs sem sumarização semântica.
+
+- operação totalmente stateless: rejeitado — perde aprendizado;
+- memória semântica com roteamento automático desde o MVP: rejeitado — risco prematuro;
+- apenas logs sem sumarização semântica: rejeitado — perde valor analítico;
+- armazenamento sem índice: rejeitado — escalabilidade ruim com volume;
+- índice único global: rejeitado — sem isolamento de contexto.
diff --git a/docs/adr/014-http-control-plane.md b/docs/adr/014-http-control-plane.md
new file mode 100644
index 0000000..faa0c92
--- /dev/null
+++ b/docs/adr/014-http-control-plane.md
@@ -0,0 +1,60 @@
+# ADR-014 — Adopt an HTTP Control Plane with FastAPI
+
+## Status
+
+Accepted
+
+## Context
+
+SynapseOS operates primarily through the CLI with a dual runtime (ephemeral CLI + resident worker), but it needs a programmatic interface for:
+
+- Integration with external tools that prefer REST APIs;
+- Remote monitoring and observability of runs;
+- Triggering runs via webhooks;
+- Querying state and artifacts without direct filesystem access.
+
+The current architecture is state-driven, with an explicit state machine (ADR-003) and Synapse-Flow as the pipeline engine, which makes it natural to expose states and transitions through an API.
+
+## Decision
+
+Adopt an **HTTP Control Plane** using FastAPI as the REST interface layer on top of Synapse-Flow.
+
+Components:
+
+- **FastAPI** as the web framework (native async, Pydantic validation, automatic OpenAPI);
+- **REST API design** with the main resources: `/health`, `/api/v1/runs`, `/api/v1/runtime/status`, `/api/v1/artifacts/{run_id}`;
+- **Async handlers** so the worker's event loop is not blocked;
+- **State machine projection**: internal states exposed as query endpoints;
+- **Auth middleware** with a Bearer token (`SYNAPSE_API_TOKEN`); the health check is public.
+
+The HTTP Control Plane is an **optional layer**: the system keeps working entirely through the CLI when the API is not running. The API is started with the explicit command `synapse control-plane start`.
+
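+The Bearer check with a public health endpoint could be sketched as the pure function below. `is_authorized` is a hypothetical helper, not the middleware itself; the caller is assumed to supply the request path, the raw `Authorization` header, and the configured token.
+
+```python
+import hmac
+
+
+def is_authorized(path: str, authorization: str | None, expected_token: str) -> bool:
+    """Bearer check for the control plane; `/health` stays public."""
+    if path == "/health":
+        return True
+    # An unset token must never authorize an empty credential.
+    if not expected_token or not authorization:
+        return False
+    scheme, _, presented = authorization.partition(" ")
+    if scheme.lower() != "bearer":
+        return False
+    # compare_digest avoids leaking how many leading characters matched.
+    return hmac.compare_digest(presented.encode(), expected_token.encode())
+```
+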
+## Consequences
+
+### Positive
+
+- Enables integration with external systems that expect REST APIs;
+- Makes observability and dashboards possible without host access;
+- Async/await matches the async model of Synapse-Flow;
+- OpenAPI/Swagger documentation is generated automatically;
+- Clear separation: business logic in Synapse-Flow, HTTP protocol in the control layer.
+
+### Negative
+
+- Adds a dependency on FastAPI + Uvicorn;
+- Requires explicit DTO modeling to avoid exposing internal objects;
+- Risk of coupling if business logic leaks into HTTP handlers;
+- Requires authentication/authorization before being exposed on a network.
+
+## Alternatives considered
+
+- **gRPC**: rejected; more complexity, proto files required, poorer fit for simple integrations;
+- **GraphQL**: rejected; overkill for the MVP, with resolver complexity and N+1 queries;
+- **No HTTP API**: rejected; it would limit integrations and remote observability;
+- **Flask/Sanic**: rejected; FastAPI has better support for async, typing, and automatic documentation.
+
+## Relationship to existing ADRs
+
+- ADR-003 (state-machine-pipeline-engine): the API reflects the states of the state machine;
+- ADR-009 (runtime-dual): the API is the interface of the lightweight resident worker;
+- ADR-010 (synapse-flow-name): the API exposes Synapse-Flow operations.
diff --git a/docs/adr/015-plugin-system.md b/docs/adr/015-plugin-system.md
new file mode 100644
index 0000000..c58fee2
--- /dev/null
+++ b/docs/adr/015-plugin-system.md
@@ -0,0 +1,73 @@
+# ADR-015 — Adopt a Plugin System with HookSpec and Entry Point Discovery
+
+## Status
+
+Accepted
+
+## Context
+
+SynapseOS needs to be extensible without changes to the core. CLI adapters (ADR-004) cover integration with external tools, but the system also needs mechanisms for:
+
+- Third-party plugins that extend behavior (new parsers, new step types);
+- Hooks at specific points of the pipeline (pre/post execution, artifact transformation);
+- Automatic discovery of extensions installed via pip/entry points.
+
+The state-driven architecture of Synapse-Flow (ADR-003) has well-defined points where hooks can be injected without compromising the main flow.
+
+## Decision
+
+Adopt a **Plugin System** based on:
+
+1. **HookSpec**: declarative contracts using `pluggy` (the hook system used by pytest);
+2. **Entry point discovery**: plugins registered via `pyproject.toml` `[project.entry-points."synapseos.hooks"]`;
+3. **Explicit hook points**:
+    - `pre_step_execute`: before a step is executed;
+    - `post_step_execute`: after execution, before parsing;
+    - `pre_artifact_persist`: before an artifact is persisted;
+    - `post_run_complete`: when a run finishes successfully;
+    - `on_run_failed`: when a run fails (for cleanup or notification).
+
+Rules:
+
+- Hooks are **optional**: the system works without plugins;
+- Hooks may **modify** the context (mutable) or only **observe** it (read-only);
+- A failing hook does not break the pipeline (log and continue), except for hooks explicitly marked as critical;
+- Plugins are loaded once, when Synapse-Flow boots.
+
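+The failure rule can be illustrated in plain Python. This is a sketch of the policy only and does not use `pluggy`'s API; `call_hooks`, the `(name, func, critical)` tuple shape, and the logger name are assumptions.
+
+```python
+import logging
+
+logger = logging.getLogger("synapse_os.hooks")  # assumed logger name
+
+
+def call_hooks(hooks, context: dict) -> dict:
+    """Call each (name, func, critical) hook; tolerate non-critical failures."""
+    for name, func, critical in hooks:
+        try:
+            func(context)  # hooks may mutate the shared context
+        except Exception:
+            if critical:
+                raise  # critical hooks are allowed to abort the pipeline
+            logger.exception("hook %s failed; continuing", name)
+    return context
+```
+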
+## Consequences
+
+### Positive
+
+- Extensibility without forking the core;
+- Enables an ecosystem: the community can build plugins without PRs to the main repo;
+- `pluggy` is battle-tested (used by pytest) and well documented;
+- Entry points are a Python standard, with no dynamic-import magic;
+- Well-defined hooks allow custom instrumentation, metrics, and notifications.
+
+### Negative
+
+- New dependency (`pluggy`);
+- Larger attack surface: malicious plugins can execute arbitrary code;
+- Harder debugging when multiple plugins interact;
+- HookSpec needs versioning (breaking changes in hooks);
+- Plugin loading adds startup overhead.
+
+## Alternatives considered
+
+- **Dynamic module import**: rejected; less structured, risk of side effects at import time;
+- **Custom hook system**: rejected; it reinvents the wheel, and `pluggy` already solves this well;
+- **Microservices architecture**: rejected; overkill, it would increase operational complexity;
+- **Config-based plugin loading**: rejected; entry points are more idiomatic in Python.
+
+## Security
+
+Plugins run with the same privileges as Synapse-Flow. Recommendations:
+
+- Document that plugins are arbitrary code and should only be installed from trusted sources;
+- Future: consider sandboxing or plugin signing.
+
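+Short of sandboxing or signing, an allowlist applied before any plugin code is imported is a smaller first step. The sketch below uses the standard-library `importlib.metadata`; `allowed_entry_points`, `discover_allowed`, and the allowlist shape (entry point name mapped to its expected import target) are assumptions.
+
+```python
+from importlib.metadata import EntryPoint, entry_points
+
+
+def allowed_entry_points(candidates, allowlist: dict[str, str]) -> list[EntryPoint]:
+    """Keep entry points whose name and import target are both approved."""
+    # Nothing is imported here: filtering happens before any ep.load() call.
+    return [ep for ep in candidates if allowlist.get(ep.name) == ep.value]
+
+
+def discover_allowed(allowlist: dict[str, str], group: str = "synapseos.hooks") -> list[EntryPoint]:
+    """Discover installed entry points in `group` and apply the allowlist."""
+    return allowed_entry_points(entry_points(group=group), allowlist)
+```
+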
+## Relationship to existing ADRs
+
+- ADR-004 (cli-adapter-layer): plugins can add new adapters dynamically;
+- ADR-003 (state-machine-pipeline-engine): hooks are invoked on state transitions;
+- ADR-014 (http-control-plane): plugins can expose custom endpoints on the API.
diff --git a/docs/architecture/SDD.md b/docs/architecture/SDD.md
index 141a997..20c783e 100644
--- a/docs/architecture/SDD.md
+++ b/docs/architecture/SDD.md
@@ -3,9 +3,11 @@
 ## 1. Visão Geral
 
 ### 1.1 Propósito
+
 SynapseOS é um meta-orquestrador de agentes de IA via CLI. Seu papel é coordenar múltiplas ferramentas externas de IA, organizar hand-offs entre etapas de uma esteira controlada e produzir artefatos de software com rastreabilidade, resiliência e baixo custo operacional.
 
 ### 1.2 Objetivos
+
 - Orquestrar ferramentas de IA via CLI de forma uniforme.
 - Executar pipelines autônomos de desenvolvimento de software.
 - Isolar contexto entre etapas e agentes.
@@ -16,9 +18,11 @@ SynapseOS é um meta-orquestrador de agentes de IA via CLI. Seu papel é coorden
 - Permitir evolução futura para paralelismo, DAG real e workers distribuídos.
 
 ### 1.3 Escopo
+
 O sistema recebe uma tarefa, produz uma especificação estruturada, planeja sua execução, chama agentes externos por subprocess, limpa suas saídas, valida os artefatos, reage a falhas, persiste memória de execução e gera um relatório final por run.
 
 ### 1.4 Fora do escopo inicial
+
 - Cluster distribuído completo.
 - Suporte nativo a Windows/macOS como plataforma principal.
 - Interface web completa.
@@ -28,6 +32,7 @@ O sistema recebe uma tarefa, produz uma especificação estruturada, planeja sua
 ---
 
 ## 2. Premissas do MVP
+
 - Linguagem principal: **Python 3.12+**.
 - Execução principal: **CLI-first**.
 - Runtime do MVP: **dual**, com CLI efêmero e worker/daemon residente leve.
@@ -42,6 +47,7 @@ O sistema recebe uma tarefa, produz uma especificação estruturada, planeja sua
 ---
 
 ## 3. Princípios Arquiteturais
+
 1. **CLI-first**: integrações externas devem passar por adapters padronizados.
 2. **Spec-first**: a demanda bruta deve ser transformada em especificação verificável antes do planejamento executivo.
 3. **State-driven orchestration**: o fluxo deve ser auditável por máquina de estados e evolutivo para DAG.
@@ -58,23 +64,29 @@ O sistema recebe uma tarefa, produz uma especificação estruturada, planeja sua
 ### 4.1 Camadas principais
 
 #### Camada de Orquestração
+
 Responsável por estado, pipeline, supervisão, memória e decisão.
 
 Componentes:
+
 - Orchestrator Engine
 - Synapse-Flow
 - State Machine Manager
-- Pipeline Manager
-- Adaptive Supervisor
-- Memory Engine
+- Pipeline Manager (DAG execution)
+- Adaptive Supervisor (per-step policies, backoff)
+- Memory Engine (IndexedArtifactStore, namespacing)
 - Spec Engine
 - Runtime Coordinator
 
 #### Camada de Adapters
+
 Responsável por integração com ferramentas externas via CLI.
 
 Componentes:
+
 - Base CLI Adapter
+- AdapterRegistry
+- CapabilityRouter
 - Gemini Adapter
 - Codex Adapter
 - Copilot Adapter
@@ -83,10 +95,34 @@ Componentes:
 - Claude Adapter
 - Local LLM Adapter
 
+#### Camada de Controle HTTP
+
+Responsável por expor API REST para integração externa.
+
+Componentes:
+
+- FastAPI Application
+- REST Controllers (/runs, /steps, /artifacts, /agents)
+- Async Handlers
+- Webhook Dispatcher
+
+#### Camada de Extensão
+
+Responsável por permitir extensões sem modificar o core.
+
+Componentes:
+
+- Plugin Loader (entry point discovery)
+- Hook Manager (pluggy)
+- HookSpec Definitions
+- Plugin Registry
+
 #### Camada de Execução Autônoma
+
 Conjunto de ferramentas externas executadas sob demanda.
 
 #### Camada de Persistência e Observabilidade
+
 Responsável por persistir runs, steps, artefatos, eventos e relatórios.
 
 ---
@@ -100,6 +136,7 @@ SPEC → TEST_RED → CODE_GREEN → REFACTOR → QUALITY_GATE → SECURITY_REVI
 ```
 
 Regras:
+
 - `DOCKER_PREFLIGHT` é executado pela skill `repo-preflight` quando a feature exigir validação prática em Docker.
 - Em CI e no fluxo local, o `DOCKER_PREFLIGHT` padrão é leve: `compose config` sem `up`; build fica explícito quando necessário.
 - O runtime completo em container fica reservado para workflow dedicado ou acionamento explícito em features que toquem boot, ciclo de vida, persistência ou integração.
@@ -116,22 +153,24 @@ O macroestágio `SPEC` do fluxo oficial engloba `SPEC_DISCOVERY`, `SPEC_NORMALIZ
 
 ### 5.3 Mapeamento macro ↔ estados internos
 
-| Macroestágio (fluxo oficial) | Estados internos do Synapse-Flow |
-|---|---|
-| `SPEC` | `SPEC_DISCOVERY` → `SPEC_NORMALIZATION` → `SPEC_VALIDATION` |
-| `TEST_RED` | `PLAN` → `TEST_RED` |
-| `CODE_GREEN` | `CODE_GREEN` |
-| `REFACTOR` | parte de `CODE_GREEN` (sem estado dedicado no MVP) |
-| `QUALITY_GATE` | `QUALITY_GATE` |
-| `SECURITY_REVIEW` | `SECURITY` |
-| `REPORT` | `DOCUMENT` |
-| `COMMIT` | pós-`COMPLETE` (fora da state machine, ação do operador) |
-| — | `FAILED` (acessível de qualquer estado não-terminal) |
+| Macroestágio (fluxo oficial) | Estados internos do Synapse-Flow                            |
+| ---------------------------- | ----------------------------------------------------------- |
+| `SPEC`                       | `SPEC_DISCOVERY` → `SPEC_NORMALIZATION` → `SPEC_VALIDATION` |
+| `TEST_RED`                   | `PLAN` → `TEST_RED`                                         |
+| `CODE_GREEN`                 | `CODE_GREEN`                                                |
+| `REFACTOR`                   | parte de `CODE_GREEN` (sem estado dedicado no MVP)          |
+| `QUALITY_GATE`               | `QUALITY_GATE`                                              |
+| `SECURITY_REVIEW`            | `SECURITY`                                                  |
+| `REPORT`                     | `DOCUMENT`                                                  |
+| `COMMIT`                     | pós-`COMPLETE` (fora da state machine, ação do operador)    |
+| —                            | `FAILED` (acessível de qualquer estado não-terminal)        |
 
 ### 5.3 Motivação da etapa SPEC
+
 A etapa de especificação transforma intenção em contrato operacional. Ela reduz ambiguidade entre agentes, melhora a geração de testes, aumenta a previsibilidade do parsing e permite validar aderência entre requisito, teste, código e documentação.
 
 ### 5.4 Regras da etapa SPEC
+
 - A entrada é a tarefa bruta do usuário.
 - A saída é uma SPEC híbrida validável.
 - A pipeline não avança para `PLAN` sem validação mínima da SPEC.
@@ -142,7 +181,9 @@ A etapa de especificação transforma intenção em contrato operacional. Ela re
 ## 6. Modelo Runtime
 
 ### 6.1 Modo CLI efêmero
+
 Usado para:
+
 - executar ou validar `DOCKER_PREFLIGHT` antes do trabalho prático dependente de Docker,
 - iniciar runs,
 - executar runs curtas inline,
@@ -150,7 +191,9 @@ Usado para:
 - disparar jobs para execução posterior.
 
 ### 6.2 Worker/daemon residente leve
+
 Usado para:
+
 - consumir runs pendentes,
 - executar o Synapse-Flow,
 - aplicar retries longos,
@@ -158,6 +201,7 @@ Usado para:
 - gerar artefatos e relatório final.
 
 ### 6.3 Motivação do runtime dual
+
 O runtime dual permite preservar a experiência CLI e, ao mesmo tempo, suportar tarefas longas sem bloquear o operador. Também prepara o sistema para crescimento futuro sem obrigar adoção imediata de uma infraestrutura pesada de filas distribuídas.
 
 ---
@@ -220,12 +264,15 @@ O runtime dual permite preservar a experiência CLI e, ao mesmo tempo, suportar
 ## 8. Módulos Principais
 
 ### 8.1 Orchestrator Engine
+
 Coordena a execução ponta a ponta, cria o contexto da run, invoca o Synapse-Flow e consolida resultados.
 
 ### 8.2 State Machine Manager
+
 Modela e valida estados e transições.
 
 Estados implementados no MVP:
+
 - `REQUEST`
 - `SPEC_DISCOVERY`
 - `SPEC_NORMALIZATION`
@@ -243,22 +290,35 @@ Estados implementados no MVP:
 > **Pós-MVP (não implementados)**: `INIT` e `RETRYING` estão reservados para versões futuras onde o estado de inicialização precisa de rastreamento explícito e retries têm estado próprio na máquina.
 
 ### 8.3 Pipeline Manager
-Executa a sequência de steps. No MVP, a esteira é linear; no futuro, deve suportar DAG com fan-out/fan-in.
+
+Executa a sequência de steps com suporte a DAG.
+
+Funcionalidades:
+
+- Execução linear (pipeline tradicional);
+- Execução paralela de steps independentes via `asyncio.gather`;
+- Fan-out/fan-in para steps com múltiplas dependências;
+- Detecção de ciclos no grafo de dependências;
+- Scheduling otimizado baseado em profundidade do DAG.
 
 ### 8.4 Spec Engine
+
 Responsável por:
+
 - converter a demanda bruta em especificação operacional;
 - normalizar linguagem, escopo, critérios de aceite e restrições;
 - validar schema e completude mínima;
 - produzir artefatos estruturados para planejamento e testes.
 
 Subcomponentes sugeridos:
+
 - `spec_discovery`
 - `spec_normalizer`
 - `spec_validator`
 - `spec_repository`
 
 ### 8.5 CLI Adapter Layer
+
 Abstrai a execução das ferramentas externas.
 
 Contrato mínimo (implementado em `src/synapse_os/contracts.py`):
@@ -279,7 +339,9 @@ class CLIExecutionResult:
 ```
 
 ### 8.6 Parsing Engine
+
 Transforma saídas ruidosas em artefatos estruturados. Deve operar em múltiplas fases:
+
 1. normalização textual,
 2. limpeza via regex,
 3. extração de blocos relevantes,
@@ -287,24 +349,47 @@ Transforma saídas ruidosas em artefatos estruturados. Deve operar em múltiplas
 5. fallback heurístico.
 
 Validações adicionais:
+
 - `ast.parse()` para código Python,
 - Pydantic para contratos internos,
 - JSON Schema para SPEC.
 
 ### 8.7 Memory Engine
-Armazena histórico operacional e memória semântica.
+
+Armazena histórico operacional e memória semântica com indexação.
+
+#### IndexedArtifactStore
+
+Armazenamento estruturado com índices:
+
+- Índice por run_id, step_id, tipo de artefato;
+- Índice por timestamp para consultas temporais;
+- Índice por hash de conteúdo para deduplicação;
+- Consultas rápidas via índices secundários.
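Editorial sketch (not part of the patch): one way the secondary indexes and hash-based deduplication listed above could fit together. `ToyArtifactIndex` and its methods are hypothetical names, not the project's `IndexedArtifactStore` API, and the timestamp index is left out for brevity.

```python
import hashlib
from collections import defaultdict


class ToyArtifactIndex:
    """Content-addressed blobs plus secondary indexes (illustrative only)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}                     # content hash -> content
        self._records: list[tuple[str, str, str]] = []         # (run_id, kind, hash)
        self._by_run: dict[str, set[int]] = defaultdict(set)   # run_id -> record positions
        self._by_kind: dict[str, set[int]] = defaultdict(set)  # artifact type -> record positions

    def add(self, run_id: str, kind: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)  # identical content is stored once
        position = len(self._records)
        self._records.append((run_id, kind, digest))
        self._by_run[run_id].add(position)
        self._by_kind[kind].add(position)
        return digest

    def find(self, run_id: str, kind: str) -> list[bytes]:
        # Intersect two secondary indexes instead of scanning every record.
        positions = self._by_run.get(run_id, set()) & self._by_kind.get(kind, set())
        return [self._blobs[self._records[p][2]] for p in sorted(positions)]


index = ToyArtifactIndex()
first = index.add("run-1", "spec", b"# SPEC")
second = index.add("run-2", "spec", b"# SPEC")  # same bytes, different run
```

Both calls return the same digest, so the content is kept once while each run still gets its own index entry.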
+
+#### Namespacing
+
+Isolamento de contexto:
+
+- **Workspace**: projetos diferentes;
+- **Run**: contexto de execução específica;
+- **Step**: contexto de step individual;
+- **Global**: padrões compartilhados.
 
 #### Memória operacional
+
 - runs,
 - steps,
 - eventos,
 - falhas,
 - retries,
 - ferramentas usadas,
-- artefatos gerados.
+- artefatos gerados (via IndexedArtifactStore).
 
 #### Memória semântica
-No MVP, tem papel de apoio:
+
+Papel advisory no MVP:
+
 - registrar padrões úteis,
 - anotar heurísticas,
 - resumir falhas recorrentes,
@@ -312,18 +397,23 @@ No MVP, tem papel de apoio:
 - apoiar análise posterior.
 
 ### 8.8 Adaptive Supervisor
+
 Monitora a run e decide sobre:
-- retry,
-- reroute,
+
+- retry (com backoff exponencial),
+- reroute para outro agente via CapabilityRouter,
 - rollback lógico,
 - falha terminal,
 - reexecução com prompt mais restritivo,
-- retorno para etapa anterior em caso de rejeição ou inconsistência.
+- retorno para etapa anterior em caso de rejeição ou inconsistência,
+- per-step policies (configuração por tipo de step).
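Editorial sketch (not part of the patch): the retry item above mentions exponential backoff, and a delay schedule of that kind can be written as below. The function name, base delay, and cap are assumptions; the document does not state the values the supervisor actually uses.

```python
def backoff_delay(attempt: int, base_seconds: float = 1.0, cap_seconds: float = 60.0) -> float:
    """Delay before retry number `attempt` (1-based): base * 2**(attempt - 1), capped."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(cap_seconds, base_seconds * 2 ** (attempt - 1))


# Delays for the first four retries under the assumed defaults.
delays = [backoff_delay(n) for n in range(1, 5)]
```

The cap keeps a long retry sequence from producing unbounded waits; a per-step policy could pass different `base_seconds` and `cap_seconds` values.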
 
 ### 8.9 Runtime Coordinator
+
 Coordena a diferença entre execução inline via CLI e execução assíncrona via worker residente.
 
 Responsabilidades:
+
 - criar runs pendentes,
 - aplicar locking,
 - retomar runs,
@@ -331,11 +421,13 @@ Responsabilidades:
 - consolidar estado final.
 
 ### 8.10 Synapse-Flow
+
 O Synapse-Flow é a engine própria de pipeline do SynapseOS. Ele coordena os estados internos da run, os hand-offs entre steps, o encadeamento `SPEC → TEST_RED → CODE_GREEN → REFACTOR → SECURITY_REVIEW → REPORT` e a integração com supervisor, memória e adapters.
 
 ---
 
 ## 9. Fluxo de Dados
+
 1. `repo-preflight` valida o `DOCKER_PREFLIGHT` quando a feature exige execução prática.
 2. Usuário envia uma tarefa.
 3. O CLI cria ou dispara uma run.
@@ -348,6 +440,7 @@ O Synapse-Flow é a engine própria de pipeline do SynapseOS. Ele coordena os es
 10. Ao final, é gerado um `RUN_REPORT.md`.
 
 ### 9.1 Artefatos principais por run
+
 - `SPEC.md` — especificação validada
 - `PLAN.md` — plano gerado pelo step PLAN
 - `TESTS_RED.md` ou arquivos de teste — gerados no step TEST_RED
@@ -365,16 +458,19 @@ O Synapse-Flow é a engine própria de pipeline do SynapseOS. Ele coordena os es
 ## 10. Persistência
 
 ### 10.1 MVP
+
 - **SQLite** para metadados operacionais.
 - Arquivos em disco para artefatos (`raw`, `clean`, `spec`, `plan`, `tests`, `code`, `review`, `docs`, `report`).
 
 ### 10.2 Evolução futura
+
 - PostgreSQL para concorrência maior.
 - pgvector ou vector DB dedicado quando a memória semântica evoluir.
 
 ---
 
 ## 11. Tratamento de Erros
+
 - **Falhas de CLI**: detectar binário ausente, erro de autenticação, exit code != 0.
 - **Timeouts**: encerrar subprocesso e marcar step como recuperável ou terminal.
 - **Parsing errors**: tentar reparse ou reexecução com prompt mais restritivo.
@@ -387,9 +483,11 @@ O Synapse-Flow é a engine própria de pipeline do SynapseOS. Ele coordena os es
 ## 12. Observabilidade
 
 ### 12.1 Logs
+
 Logs estruturados por run e step.
 
 Campos mínimos:
+
 - `run_id`
 - `step`
 - `tool_name`
@@ -401,6 +499,7 @@ Campos mínimos:
 > **Nota**: `parser_confidence` foi considerado mas **não está implementado no MVP**. O campo `ParsedOutput` não expõe score de confiança. Caso a heurística de parsing evolua, esse campo pode ser adicionado ao modelo em versão futura.
 
 ### 12.2 Relatório por execução
+
 Cada run deve produzir:
 
 ```text
@@ -408,6 +507,7 @@ artifacts/<run_id>/RUN_REPORT.md
 ```
 
 Conteúdo mínimo:
+
 - resumo da solicitação,
 - SPEC validada,
 - estados percorridos,
@@ -420,6 +520,7 @@ Conteúdo mínimo:
 ---
 
 ## 13. Segurança e Isolamento
+
 - O sistema roda em container da aplicação.
 - Agentes selecionados podem rodar em containers específicos.
 - Não usar `shell=True` por padrão.
@@ -430,26 +531,32 @@ Conteúdo mínimo:
 ---
 
 ## 14. Escalabilidade e Evolução
-### Curto prazo
-- paralelizar alguns steps com `asyncio`;
-- permitir worker residente consumir múltiplas runs;
-- expandir o Synapse-Flow para DAG simples.
+
+### Short term (in implementation)
+
+- parallelize some steps through the DAG pipeline: parallel execution is implemented in `DAGExecutor`;
+- let the resident worker consume multiple runs: a lightweight resident worker is implemented;
+- extend Synapse-Flow to a simple DAG: DAG execution is implemented with `DAGExecutor` and `DAGValidator`.
 
 ### Médio prazo
-- DAG pipeline real;
+
 - workers distribuídos;
 - PostgreSQL;
-- vector memory.
+- vector memory;
+- roteamento automático por memória semântica.
 
 ### Longo prazo
+
 - orquestração distribuída durável;
 - múltiplos workspaces/branches efêmeras por run;
-- políticas adaptativas influenciadas por memória semântica.
+- políticas adaptativas influenciadas por memória semântica;
+- plugin ecosystem maduro.
 
 ---
 
 ## 15. Documentos Relacionados
+
 - TDD do SynapseOS
 - template oficial de SPEC
 - documentação de stack e runtime
-- ADR-001 a ADR-009
+- ADR-001 a ADR-015
diff --git a/features/F58-retry-policy-tests/SPEC.md b/features/F58-retry-policy-tests/SPEC.md
new file mode 100644
index 0000000..bacc07e
--- /dev/null
+++ b/features/F58-retry-policy-tests/SPEC.md
@@ -0,0 +1,54 @@
+---
+id: F58-retry-policy-tests
+type: feature
+summary: Create a dedicated test suite for the supervisor/retry module covering retry and reroute decisions, terminal failures, and the return from REVIEW to CODE_GREEN.
+inputs:
+    - Supervisor with configurable max_retries
+    - RetryableStepError for recoverable failures
+    - ReviewRejectedError for review rejection
+outputs:
+    - SupervisorDecision with action, next_state, route, and reason
+    - Unit tests covering all decision paths
+acceptance_criteria:
+    - Given Supervisor(max_retries=2) and a RetryableStepError in a retryable state with attempt <= max_retries, when decide_after_failure is called, then action=retry with reason=retryable_failure_with_budget
+    - Given Supervisor(max_retries=2) and a RetryableStepError with attempt > max_retries and an available fallback_route, when decide_after_failure is called, then action=reroute with route=fallback
+    - Given state SPEC_VALIDATION with any error, when decide_after_failure is called, then action=fail with reason=spec_validation_is_terminal
+    - Given state SECURITY with any error, when decide_after_failure is called, then action=fail with reason=security_is_terminal
+    - Given a ReviewRejectedError in state REVIEW, when decide_after_review_rejection is called, then action=return_to_code_green with next_state=CODE_GREEN
+    - Given a RetryableStepError in a non-retryable state, when decide_after_failure is called, then action=fail with reason=terminal_failure
+non_goals:
+    - Integration tests with the real pipeline
+    - Concurrency tests
+---
+
+# F58 — Retry Policy Tests
+
+# Context
+
+The `src/synapse_os/supervisor.py` module implements the pipeline's failure-recovery logic, deciding between retry, reroute, and terminal failure. It currently has 4 indirect tests in `test_supervisor.py`. This feature adds 6 dedicated tests covering all decision paths.
+
+# Objective
+
+Create a dedicated test suite for the supervisor covering retry and reroute decisions, terminal failures, and the return from REVIEW to CODE_GREEN.
+
+# Acceptance Criteria
+
+- [ ] AC1: `test_supervisor_requests_retry_after_recoverable_step_failure` (retry with budget available)
+- [ ] AC2: `test_supervisor_reroutes_after_repeated_step_failures` (reroute after the budget is exhausted)
+- [ ] AC3: `test_supervisor_marks_terminal_failure_after_spec_validation_error` (SPEC_VALIDATION is terminal)
+- [ ] AC4: `test_supervisor_marks_terminal_failure_after_security_error` (SECURITY is terminal)
+- [ ] AC5: `test_supervisor_returns_to_code_green_after_review_rejection` (REVIEW → CODE_GREEN)
+- [ ] AC6: `test_supervisor_terminal_failure_when_no_fallback_route` (fail without fallback)
+- [ ] AC7: `test_supervisor_retry_budget_exhausted_at_max_retries` (retry at the exact limit)
+- [ ] AC8: `test_supervisor_reroute_when_budget_exceeded_with_fallback` (reroute with fallback available)
+- [ ] AC9: `test_supervisor_ignores_retryable_error_in_non_retryable_state` (non-retryable state → fail)
+- [ ] AC10: `test_supervisor_decision_contains_correct_reason` (correct reason field in each scenario)
+
+# Test Design
+
+## Fixtures
+
+- `Supervisor` with max_retries=2
+- Retryable states: PLAN, TEST_RED, CODE_GREEN
+- Terminal states: SPEC_VALIDATION, SECURITY
+- Exceptions: RetryableStepError, ReviewRejectedError, generic RuntimeError
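Editorial sketch (not part of the patch): the decision table implied by the acceptance criteria in the front matter can be written out as below. This is a toy reconstruction, not the code in `supervisor.py`. The method signatures, and the reason strings `retry_budget_exhausted` and `review_rejected`, are assumptions; the SPEC does not name them.

```python
from __future__ import annotations

from dataclasses import dataclass

RETRYABLE_STATES = {"PLAN", "TEST_RED", "CODE_GREEN"}
TERMINAL_REASONS = {
    "SPEC_VALIDATION": "spec_validation_is_terminal",
    "SECURITY": "security_is_terminal",
}


class RetryableStepError(Exception):
    """Stand-in for the project's recoverable failure type."""


@dataclass(frozen=True)
class SupervisorDecision:
    action: str
    reason: str
    next_state: str | None = None
    route: str | None = None


@dataclass
class Supervisor:
    max_retries: int = 2

    def decide_after_failure(
        self, state: str, error: Exception, attempt: int, fallback_route: str | None = None
    ) -> SupervisorDecision:
        if state in TERMINAL_REASONS:  # terminal states fail regardless of error type
            return SupervisorDecision("fail", TERMINAL_REASONS[state])
        if not isinstance(error, RetryableStepError) or state not in RETRYABLE_STATES:
            return SupervisorDecision("fail", "terminal_failure")
        if attempt <= self.max_retries:  # budget still available
            return SupervisorDecision("retry", "retryable_failure_with_budget", next_state=state)
        if fallback_route is not None:  # budget exhausted, but another route exists
            return SupervisorDecision(
                "reroute", "retry_budget_exhausted", next_state=state, route=fallback_route
            )
        return SupervisorDecision("fail", "terminal_failure")

    def decide_after_review_rejection(self) -> SupervisorDecision:
        return SupervisorDecision("return_to_code_green", "review_rejected", next_state="CODE_GREEN")
```

Each branch corresponds to one acceptance criterion, which is why a per-branch test such as AC10 (checking the `reason` field) is cheap to write against this shape.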
diff --git a/features/F59-multi-agent-orchestration/SPEC.md b/features/F59-multi-agent-orchestration/SPEC.md
new file mode 100644
index 0000000..cd35b41
--- /dev/null
+++ b/features/F59-multi-agent-orchestration/SPEC.md
@@ -0,0 +1,79 @@
+---
+id: F59-multi-agent-orchestration
+type: feature
+summary: Formalize adapter registry with capabilities and multi-agent session coordination
+inputs:
+    - Existing BaseCLIAdapter implementations (Codex, Gemini)
+    - ToolSpec and capability contracts from runtime_contracts.py
+    - PipelineEngine with executor routing support
+outputs:
+    - AdapterRegistry with capability-based routing
+    - Multi-agent session coordination in Synapse-Flow
+    - Capability-based task assignment logic
+    - Tests for registry, routing, and multi-agent coordination
+acceptance_criteria:
+    - "AdapterRegistry must register adapters by name and expose queryable capabilities"
+    - "CapabilityRouter must select a suitable adapter based on the required capability"
+    - "PipelineEngine must support executing a step with an adapter selected by capability"
+    - "Multi-agent handoff must record in the context which adapter executed which step"
+    - "Fall back to a generic adapter when no specialized adapter is available"
+    - "An integration test must validate the full flow of registration + routing + execution"
+non_goals:
+    - Do not implement direct communication between adapters (IPC, sockets)
+    - Do not add new external adapters in this feature
+    - Do not implement load balancing or horizontal scaling
+---
+
+# Context
+
+SynapseOS currently supports only one adapter per pipeline run. `PipelineEngine` accepts executors configured per state, but there is no central adapter registry and no automatic capability-based selection. The SDD lists planned adapters (Codex, Gemini, Copilot, OpenCode, DeepSeek, Claude, Local LLM), but only Codex and Gemini exist.
+
+Multi-agent coordination is a fundamental requirement of the project: different AI tools have different capabilities (code generation, planning, analysis, etc.), and Synapse-Flow must know which adapter to use for each type of task.
+
+# Objective
+
+Create an adapter registry with explicit capabilities and automatic capability-based routing, so that Synapse-Flow can coordinate multiple AI tools within a single execution session.
+
+## Technical scope
+
+1. **AdapterRegistry**: central registry of available adapters
+2. **CapabilityRouter**: adapter selection logic by required capability
+3. **Integration with PipelineEngine**: step execution with an automatically selected adapter
+4. **Handoff tracking**: record of which adapter executed which step
+
+## Planned capabilities
+
+| Capability          | Description                | Candidate adapters     |
+| ------------------- | -------------------------- | ---------------------- |
+| `cli_execution`     | Generic CLI execution      | All                    |
+| `code_generation`   | Code generation            | Codex, Copilot, Claude |
+| `planning`          | Planning and architecture  | Gemini, Claude         |
+| `code_review`       | Code review                | Claude, OpenCode       |
+| `security_analysis` | Security analysis          | Claude, DeepSeek       |
+| `local_execution`   | Local execution, no cloud  | Local LLM              |
+
+## Proposed design
+
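+```python
+from __future__ import annotations  # keeps the annotations below lazy
+
+# BaseCLIAdapter is the existing adapter base class (not redefined here).
+
+# Adapter registry: central lookup by name and by capability
+class AdapterRegistry:
+    def register(self, adapter: BaseCLIAdapter) -> None: ...
+    def get(self, name: str) -> BaseCLIAdapter | None: ...
+    def list_all(self) -> list[BaseCLIAdapter]: ...
+    def find_by_capability(self, capability: str) -> list[BaseCLIAdapter]: ...
+
+# Capability router: picks an adapter for a set of required capabilities
+class CapabilityRouter:
+    def __init__(self, registry: AdapterRegistry) -> None: ...
+    def select_adapter(self, required_capabilities: set[str]) -> BaseCLIAdapter | None: ...
+    def get_best_match(self, required_capabilities: set[str]) -> BaseCLIAdapter | None: ...
+```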
+
+## Impact on Synapse-Flow
+
+Synapse-Flow (SynapseOS's own pipeline engine) will:
+
+1. Query the CapabilityRouter before each step
+2. Select the most suitable adapter for the type of task
+3. Record the adapter used in the run context
+4. Allow fallback to a generic adapter when needed
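Editorial sketch (not part of the patch): the four steps above can be illustrated with a toy registry and a selection function that falls back to a generic adapter. `ToyAdapter`, `ToyRegistry`, and `select_adapter` are hypothetical stand-ins. The selection rule (first adapter covering all required capabilities, otherwise the first adapter exposing `cli_execution`) is an assumption, not the project's routing policy.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAdapter:
    name: str
    capabilities: frozenset[str]


class ToyRegistry:
    def __init__(self) -> None:
        self._adapters: dict[str, ToyAdapter] = {}

    def register(self, adapter: ToyAdapter) -> None:
        self._adapters[adapter.name] = adapter

    def list_all(self) -> list[ToyAdapter]:
        return list(self._adapters.values())

    def find_by_capability(self, capability: str) -> list[ToyAdapter]:
        return [a for a in self._adapters.values() if capability in a.capabilities]


def select_adapter(
    registry: ToyRegistry, required: set[str], generic: str = "cli_execution"
) -> ToyAdapter | None:
    # Prefer an adapter that covers every required capability.
    for adapter in registry.list_all():
        if required <= adapter.capabilities:
            return adapter
    # Otherwise fall back to any adapter exposing the generic capability.
    generic_adapters = registry.find_by_capability(generic)
    return generic_adapters[0] if generic_adapters else None


registry = ToyRegistry()
registry.register(ToyAdapter("codex", frozenset({"cli_execution", "code_generation"})))
registry.register(ToyAdapter("gemini", frozenset({"cli_execution", "planning"})))

# Handoff tracking: record which adapter was chosen for which step.
handoffs = {step: select_adapter(registry, {cap}).name
            for step, cap in [("PLAN", "planning"), ("CODE_GREEN", "code_generation")]}
```

A request for a capability no adapter declares, such as `security_analysis` here, resolves to the first generic adapter instead of failing.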
diff --git a/features/F60-local-control-plane-foundation/SPEC.md b/features/F60-local-control-plane-foundation/SPEC.md
new file mode 100644
index 0000000..21bfaa6
--- /dev/null
+++ b/features/F60-local-control-plane-foundation/SPEC.md
@@ -0,0 +1,183 @@
+---
+id: F60-local-control-plane-foundation
+type: feature
+summary: Local HTTP API layer exposing SynapseOS core operations programmatically via FastAPI on localhost.
+inputs:
+    - SPEC.md with feature requirements
+    - Existing RunRepository, RuntimeService, ArtifactStore
+outputs:
+    - FastAPI control plane server with REST endpoints
+    - Auth middleware for API token validation
+    - CLI commands for control plane management
+acceptance_criteria:
+    - GET /health returns 200 with runtime status
+    - POST /api/v1/runs creates a run and returns 201
+    - GET /api/v1/runs lists runs with pagination
+    - POST /api/v1/runs/{run_id}/cancel marks run as cancelled
+    - Auth middleware blocks unauthorized requests with 401 when API-token auth is enabled
+    - All unit tests pass
+non_goals:
+    - WebSocket streaming
+    - External network binding
+    - Web dashboard
+---
+
+# Context
+
+SynapseOS can currently be controlled only through the CLI (the `synapse` command). There is no programmatic interface for submitting runs, querying status in real time, monitoring the runtime, cancelling runs, or listing generated artifacts.
+
+# Objective
+
+Create a local (localhost-only) HTTP API layer that exposes the core SynapseOS operations programmatically, allowing integration with external tools without depending exclusively on the CLI.
+
+## Scope
+
+### In scope
+
+- Lightweight HTTP server built with FastAPI
+- Binds exclusively to `127.0.0.1` (localhost-only, no external exposure)
+- REST endpoints for core operations:
+    - `GET /health`: runtime health check
+    - `GET /api/v1/runs`: list runs with pagination
+    - `POST /api/v1/runs`: submit a new run
+    - `GET /api/v1/runs/{run_id}`: details of a run
+    - `POST /api/v1/runs/{run_id}/cancel`: cancel a pending or running run
+    - `GET /api/v1/runtime/status`: status of the resident runtime
+    - `GET /api/v1/artifacts/{run_id}`: list the artifacts of a run
+- Token-based authentication middleware (reusing the existing auth)
+- CORS disabled by default (localhost-only)
+
+### Out of scope
+
+- Web interface / HTTP dashboard
+- WebSocket for real-time streaming
+- External exposure (binding to 0.0.0.0)
+- gRPC or other protocols
+- Multi-tenancy or per-workspace isolation through the API
+- File upload through the API
+
+## Acceptance Criteria
+
+### AC1: HTTP server starts and responds
+
+- `synapse control-plane start` starts the server on `127.0.0.1:8080` (configurable port)
+- `GET /health` returns `{"status": "ok", "runtime": "running|stopped"}` with status 200
+- `GET /health` returns status 503 if the runtime is unavailable
+
+### AC2: List runs through the API
+
+- `GET /api/v1/runs` returns a paginated list of runs
+- Supports the query params `?limit=20&offset=0`
+- Returns JSON with the structure `{runs: [...], total: N, limit: N, offset: N}`
+- Each run includes: `id`, `status`, `created_at`, `prompt` (truncated)
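Editorial sketch (not part of the patch): the pagination envelope described in AC2 reduces to a slice plus the total count. `paginate` is a hypothetical helper. Rejecting a non-positive `limit` or a negative `offset` is an assumption, since the SPEC does not say how invalid values are handled.

```python
def paginate(runs: list[dict], limit: int = 20, offset: int = 0) -> dict:
    """Build the `{runs, total, limit, offset}` envelope from an in-memory list."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset must be >= 0")
    return {
        "runs": runs[offset : offset + limit],
        "total": len(runs),  # total before slicing, so clients can compute page counts
        "limit": limit,
        "offset": offset,
    }


all_runs = [{"id": f"run-{n}", "status": "pending"} for n in range(45)]
page = paginate(all_runs, limit=20, offset=40)
```

With a real `RunRepository`, the slice would normally be pushed down into the SQLite query (`LIMIT`/`OFFSET`) instead of being applied in memory.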
+
+### AC3: Submit a run through the API
+
+- `POST /api/v1/runs` accepts `{"prompt": "..."}` and optionally `{"mode": "sync|async|auto"}`
+- Returns `201 Created` with `{"run_id": "...", "status": "pending"}`
+- Returns `422` if the prompt is empty or missing
+- A run submitted through the API is persisted in the same SQLite database and consumed by the worker
+
+### AC4: Run details through the API
+
+- `GET /api/v1/runs/{run_id}` returns the full details of the run
+- Returns `404` if the run does not exist
+- Includes: `id`, `status`, `prompt`, `created_at`, `updated_at`, `steps`, `artifacts`
+
+### AC5: Cancel a run through the API
+
+- `POST /api/v1/runs/{run_id}/cancel` marks the run as cancelled
+- Returns `200` if the cancellation succeeds
+- Returns `409` if the run is already in a terminal state (completed/failed/cancelled)
+- Returns `404` if the run does not exist
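Editorial sketch (not part of the patch): the three outcomes of AC5 depend only on whether the run exists and whether its status is terminal. `cancel_status_code` is a hypothetical helper; representing a missing run as `None` is an assumption made for the example.

```python
from __future__ import annotations

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def cancel_status_code(run_status: str | None) -> int:
    """Map the current run status to the HTTP status code of the cancel request."""
    if run_status is None:  # run not found
        return 404
    if run_status in TERMINAL_STATUSES:  # nothing left to cancel
        return 409
    return 200  # pending or running: cancellation is accepted


codes = {status: cancel_status_code(status) for status in ("pending", "running", "failed")}
```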
+
+### AC6: Runtime status through the API
+
+- `GET /api/v1/runtime/status` returns the state of the resident runtime
+- Includes: `pid`, `uptime`, `state`, `active_runs`, `pending_runs`
+
+### AC7: List artifacts through the API
+
+- `GET /api/v1/artifacts/{run_id}` lists the generated artifacts
+- Returns `404` if the run does not exist
+- Returns a list with `{name, size_bytes, created_at, type}` for each artifact
+
+### AC8: Token authentication
+
+- The token can be configured through the `SYNAPSE_API_TOKEN` env var or the config
+- Requests without a valid token return `401 Unauthorized` when token auth is enabled
+- The health check (`/health`) is public (no auth)
+- If `SYNAPSE_API_TOKEN` is not set, auth is disabled (dev mode)
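Editorial sketch (not part of the patch): the AC8 rules can be captured in a framework-independent predicate that a middleware would call before dispatching a request. `is_authorized` is a hypothetical name. How the token reaches the server (for example, which header carries it) is not specified in the SPEC and is left out here.

```python
from __future__ import annotations

import hmac

PUBLIC_PATHS = {"/health"}


def is_authorized(path: str, presented_token: str | None, configured_token: str | None) -> bool:
    """Return True when the request may proceed; False means respond with 401."""
    if path in PUBLIC_PATHS:  # the health check is always public
        return True
    if not configured_token:  # no token configured: auth disabled (dev mode)
        return True
    if presented_token is None:
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented_token.encode(), configured_token.encode())
```

Keeping the rule in a plain function makes the "with token / without token / invalid token" test cases easy to cover without starting an HTTP server.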
+
+### AC9: Configurable port
+
+- Default port: `8080`
+- Configurable through the `--port` flag or the `SYNAPSE_CONTROL_PORT` env var
+- Default host: `127.0.0.1`
+- Host configurable through the `--host` flag (with a warning if it is not localhost)
+
+### AC10: CLI command to manage the control plane
+
+- `synapse control-plane start`: starts the server
+- `synapse control-plane stop`: stops the server
+- `synapse control-plane status`: shows the status
+
+## Technical Design
+
+### Architecture
+
+```
+[CLI: synapse control-plane start]
+         |
+         v
+[ControlPlaneServer] -- FastAPI app
+         |
+         +--> /health         --> RuntimeService.ready()
+         +--> /api/v1/runs    --> RunRepository (SQLite)
+         +--> /api/v1/runtime --> RuntimeService
+         +--> /api/v1/artifacts --> ArtifactStore
+```
+
+### New modules
+
+- `src/synapse_os/control_plane/__init__.py`
+- `src/synapse_os/control_plane/server.py`: FastAPI app + endpoints
+- `src/synapse_os/control_plane/models.py`: Pydantic models for request/response
+- `src/synapse_os/control_plane/middleware.py`: auth middleware
+- `src/synapse_os/control_plane/cli.py`: Typer subcommands
+
+### New dependencies
+
+- `fastapi>=0.115.0`
+- `uvicorn>=0.32.0`
+
+### Reuse
+
+- `RunRepository` from `persistence.py`
+- `RuntimeService` from `runtime/service.py`
+- `ArtifactStore` from `persistence.py`
+- Auth token validation from `auth.py`
+
+## Risks and Mitigations
+
+| Risk                                | Mitigation                                                         |
+| ----------------------------------- | ------------------------------------------------------------------ |
+| FastAPI adds a heavy dependency     | FastAPI is lightweight; uvicorn is a minimal dependency            |
+| Accidental external exposure        | Default hardcoded to 127.0.0.1; explicit warning if host changes   |
+| Port conflict                       | Clear "port in use" message in the CLI                             |
+| Auth bypass                         | Only the health check is public; middleware blocks the rest        |
+
+## Tests
+
+- Unit tests for each endpoint with `httpx.AsyncClient` + `TestApp`
+- Authentication tests (with/without token, invalid token)
+- Error tests (404, 409, 422)
+- Integration tests with a mocked RunRepository
+- Health check tests with the runtime running/stopped
+
+## Next Steps (post-F60)
+
+- WebSocket for real-time log streaming
+- Lightweight web dashboard
+- API for hook management
+- API for adapter management
diff --git a/features/F61-dag-pipeline-evolution/SPEC.md b/features/F61-dag-pipeline-evolution/SPEC.md
new file mode 100644
index 0000000..a1547b0
--- /dev/null
+++ b/features/F61-dag-pipeline-evolution/SPEC.md
@@ -0,0 +1,176 @@
+---
+id: F61-dag-pipeline-evolution
+type: feature
+summary: DAG-aware pipeline executor with parallel step execution, fan-out/fan-in, cycle detection, and linear fallback.
+inputs:
+    - SPEC.md with dag metadata
+    - PipelineEngine
+outputs:
+    - DAGExecutor with ThreadPoolExecutor parallel dispatch
+    - DAGValidator with Kahn cycle detection
+    - LinearPipelineAdapter for backward compatibility
+acceptance_criteria:
+    - DAG mode executes independent steps in parallel
+    - Cycle detection raises DAGSpecificationError
+    - Fan-in steps wait for all dependencies
+    - Linear fallback works when mode is linear
+    - All unit tests pass
+non_goals:
+    - Dynamic DAG construction at runtime
+    - Distributed execution
+---
+
+# Context
+
+The current `SynapseStateMachine` enforces a strictly linear state flow. Every pipeline step executes sequentially. This becomes a bottleneck when multiple independent steps could run in parallel, when fan-out/fan-in patterns are needed, or when conditional routing is required. Synapse-Flow, as the proprietary pipeline engine of SynapseOS, needs to evolve from a linear executor to a DAG-aware executor while maintaining backward compatibility.
+
+# Objective
+
+Introduce a DAG mode that coexists with the existing linear mode. When a SPEC contains DAG metadata, the `PipelineEngine` switches to a `DAGExecutor` that resolves step dependencies and schedules work in parallel. When no DAG metadata is present, the system behaves exactly as before.
+
+## 1. Problem Statement
+
+The linear pipeline model limits throughput on multi-core hosts and cannot express conditional or data-flow-driven execution graphs. The system needs to support:
+
+1. **Parallel execution** of independent steps.
+2. **Fan-out** — one step triggers multiple downstream steps.
+3. **Fan-in** — a step waits for multiple upstream steps before executing.
+4. **Conditional routing** — step execution depends on runtime state or output.
+5. **No cycles** — DAG must be acyclic (validated at startup).
+
+All while keeping the existing linear pipeline as the default mode for simple features.
+
+## 2. Decision
+
+We introduce a **DAG mode** that coexists with the existing linear mode. When a SPEC contains DAG metadata, the `PipelineEngine` switches to a `DAGExecutor` that resolves step dependencies and schedules work accordingly. When no DAG metadata is present, the system behaves exactly as before (linear, sequential).
+
+The DAG metadata lives in the SPEC front matter under a `dag` key:
+
+```yaml
+---
+dag:
+    mode: dag # "linear" (default) or "dag"
+    steps:
+        - id: build_core
+          executor: codex
+          depends_on: []
+        - id: build_tests
+          executor: codex
+          depends_on: [build_core]
+        - id: build_integration
+          executor: codex
+          depends_on: [build_core]
+        - id: verify
+          executor: codex
+          depends_on: [build_tests, build_integration]
+    conditionals:
+        - id: check_api
+          step: validate_api
+          if: runtime.api_present == true
+---
+# Normal SPEC body follows...
+```
+
+The `DAGExecutor`:
+
+1. Builds an adjacency list from `depends_on` declarations.
+2. Validates the graph has no cycles (Kahn's algorithm or DFS).
+3. Computes ready set (steps with all dependencies satisfied).
+4. Schedules ready steps (parallel execution within thread-pool limit).
+5. Marks completed steps, refreshes ready set, repeats until all done or one fails.
+6. Supports fan-in synchronization (wait for all dependencies before next step starts).
+7. Falls back to linear order when `mode: linear` or no `dag` key present.
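Editorial sketch (not part of the patch): steps 1 to 3 above (adjacency from `depends_on`, cycle validation, ready-set computation) can be shown with a layered variant of Kahn's algorithm. `topological_batches` is a hypothetical helper and `DAGSpecificationError` is redefined locally so the example is self-contained; neither is taken from `pipeline_dag.py`.

```python
class DAGSpecificationError(ValueError):
    """Local stand-in for the error raised on invalid DAG metadata."""


def topological_batches(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group steps into batches; every step in a batch has all dependencies in earlier batches."""
    unknown = {dep for deps in depends_on.values() for dep in deps} - set(depends_on)
    if unknown:
        raise DAGSpecificationError(f"unknown step ids: {sorted(unknown)}")

    remaining = {step: set(deps) for step, deps in depends_on.items()}
    batches: list[list[str]] = []
    while remaining:
        ready = sorted(step for step, deps in remaining.items() if not deps)
        if not ready:  # every remaining step still waits on another one: a cycle
            raise DAGSpecificationError(f"cycle detected among: {sorted(remaining)}")
        batches.append(ready)
        for step in ready:
            del remaining[step]
        for deps in remaining.values():
            deps.difference_update(ready)
    return batches


# The four-step graph from the YAML example above.
graph = {
    "build_core": [],
    "build_tests": ["build_core"],
    "build_integration": ["build_core"],
    "verify": ["build_tests", "build_integration"],
}
batches = topological_batches(graph)
# → [["build_core"], ["build_integration", "build_tests"], ["verify"]]
```

The middle batch is the fan-out (two steps that can run in parallel) and the last batch is the fan-in (`verify` waits for both).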
+
+## 3. Scope
+
+### 3.1 In Scope
+
+- `DAGValidator`: validates DAG structure (no cycles, referenced steps exist, no orphan steps).
+- `DAGExecutor`: adjacency-list graph, Kahn topological sort, thread-pool-based parallel dispatch.
+- `DAGContext`: tracks step state (PENDING / RUNNING / DONE / FAILED) per step ID.
+- Fan-out: one step can appear in `depends_on` of multiple downstream steps.
+- Fan-in: a step with multiple `depends_on` entries waits for all of them.
+- Backward compatibility: `mode: linear` or absent `dag` key → existing linear behavior.
+- `DAGSpecificationError` — raised on invalid DAG metadata.
+- Unit tests covering: cycle detection, topological sort, fan-out, fan-in, linear fallback.
+- `LinearPipelineAdapter` — wraps existing linear flow so the same `PipelineEngine` can call either mode.
+
+### 3.2 Out of Scope
+
+- Dynamic DAG construction at runtime (steps added based on output of prior steps) — this is a future Phase 3 item.
+- Distributed DAG execution across machines.
+- DAG visualization or rendering.
+- Persistence of DAG intermediate state — linear pipeline persistence model is reused.
+- Automatic DAG generation from SPEC content.
+
+## 4. Architecture
+
+```
+PipelineEngine
+  ├── LinearPipelineAdapter  (mode: linear or no dag key)
+  │   └── executes LINEAR_STATE_FLOW sequentially
+  └── DAGExecutor            (mode: dag)
+        ├── DAGValidator     (cycle check, orphan check, dependency check)
+        ├── DAGContext       (step state tracker)
+        └── ThreadPoolExecutor (concurrent step dispatch)
+```
+
+### Key Classes
+
+| Class                   | Responsibility                                                                   |
+| ----------------------- | -------------------------------------------------------------------------------- |
+| `DAGSpec`               | Pydantic model for `dag` section in SPEC front matter                            |
+| `DAGStep`               | Pydantic model for individual DAG step (id, executor, depends_on, if)            |
+| `DAGConditional`        | Pydantic model for conditional step routing                                      |
+| `DAGValidator`          | Validates DAG (cycle via Kahn, orphan steps, missing deps)                       |
+| `DAGContext`            | Tracks per-step state: PENDING/RUNNING/DONE/FAILED                               |
+| `DAGExecutor`           | Builds adjacency list, computes in-degree, dispatches ready steps to thread pool |
+| `LinearPipelineAdapter` | Wraps existing linear flow as a drop-in executor interface                       |
+
+### Files to Create
+
+- `src/synapse_os/pipeline_dag.py` — all DAG classes and executor
+- `tests/unit/test_pipeline_dag.py` — unit tests
+
+### Files to Modify
+
+- `src/synapse_os/pipeline.py` — detect DAG mode, route to DAGExecutor, add `dag` field to `PipelineContext`
+- `src/synapse_os/specs/validator.py` — accept and parse `dag` key in SPEC front matter
+- `tests/unit/test_pipeline.py` — add DAG mode integration tests (can be minimal)
+
+## 5. Acceptance Criteria
+
+| #   | Criterion                                                                                                                                             |
+| --- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
+| 1   | A SPEC with `dag: mode: linear` or no `dag` key executes exactly as before (linear, sequential)                                                       |
+| 2   | A SPEC with `dag: mode: dag` and valid `depends_on` graph executes with fan-out parallelism                                                           |
+| 3   | A step with multiple `depends_on` entries starts only after all dependencies are DONE                                                                 |
+| 4   | `DAGValidator` raises `DAGSpecificationError` on cycle detection                                                                                      |
+| 5   | `DAGValidator` raises `DAGSpecificationError` when a `depends_on` references a non-existent step ID                                                   |
+| 6   | `DAGValidator` raises `DAGSpecificationError` when a step has no `depends_on` and is referenced by no other step (orphan), unless it is the root step |
+| 7   | ThreadPoolExecutor dispatches up to N steps concurrently (N = `settings.max_workers`, default 4)                                                      |
+| 8   | When any step fails, the DAGExecutor records FAILED state and stops scheduling new steps                                                              |
+| 9   | Fan-in synchronization: a step waits for all its `depends_on` to complete, not just one                                                               |
+| 10  | All new unit tests pass; existing linear pipeline tests continue to pass                                                                              |
+
+## 6. Dependencies
+
+No new runtime dependencies. ThreadPoolExecutor is stdlib.
+
+## 7. Configuration
+
+`AppSettings` gains one new field:
+
+```python
+max_workers: int = Field(default=4, description="Max concurrent DAG step executions")
+```
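Editorial sketch (not part of the patch): how a `max_workers` bound, fan-in synchronization, and stop-on-failure (acceptance criteria 3, 7, and 8) could combine around `ThreadPoolExecutor`. `run_dag` and its state strings are hypothetical. The sketch assumes the graph was already validated as acyclic and marks a step RUNNING as soon as it is submitted, even if the pool has not started it yet.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable


def run_dag(
    depends_on: dict[str, list[str]],
    run_step: Callable[[str], None],
    max_workers: int = 4,
) -> dict[str, str]:
    """Run steps as their dependencies complete; return the final state per step."""
    state = {step: "PENDING" for step in depends_on}
    in_flight = {}  # future -> step id
    failed = False
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            if not failed:  # after a failure, no new steps are scheduled
                for step, deps in depends_on.items():
                    # Fan-in: a step is ready only when ALL dependencies are DONE.
                    if state[step] == "PENDING" and all(state[d] == "DONE" for d in deps):
                        state[step] = "RUNNING"
                        in_flight[pool.submit(run_step, step)] = step
            if not in_flight:
                break
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                step = in_flight.pop(future)
                if future.exception() is None:
                    state[step] = "DONE"
                else:
                    state[step] = "FAILED"
                    failed = True
    return state
```

The pool itself enforces the concurrency limit: submitting ten ready root steps with `max_workers=4` queues six of them until a worker frees up.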
+
+## 8. Edge Cases
+
+| Case                                                             | Expected Behavior                                                               |
+| ---------------------------------------------------------------- | ------------------------------------------------------------------------------- |
+| Empty `depends_on: []` list                                      | Step is a root; eligible for first batch of execution                           |
+| Single-step DAG                                                  | Behaves like linear (no parallelism gain)                                       |
+| DAG with 10 root steps and max_workers=4                         | Executes 4 in first batch, then remaining 6 (or fewer, depending on completion) |
+| The same step ID appears twice in one `depends_on` list          | Ignored (deduplicated)                                                          |
+| `dag` key present but `steps` list is empty                      | Raises `DAGSpecificationError`                                                  |
diff --git a/features/F62-copilot-adapter/SPEC.md b/features/F62-copilot-adapter/SPEC.md
new file mode 100644
index 0000000..07a067c
--- /dev/null
+++ b/features/F62-copilot-adapter/SPEC.md
@@ -0,0 +1,75 @@
+---
+id: F62-copilot-adapter
+type: feature
+summary: GitHub Copilot CLI adapter following BaseCLIAdapter pattern with circuit breaker and error classification.
+inputs:
+    - BaseCLIAdapter interface
+    - gh CLI available on PATH
+outputs:
+    - CopilotCLIAdapter class
+    - classify_copilot_execution function
+    - Unit tests
+acceptance_criteria:
+    - CopilotCLIAdapter inherits from BaseCLIAdapter
+    - adapter.capabilities includes code_generation
+    - classify_copilot_execution returns correct categories
+    - Circuit breaker integration works
+    - All unit tests pass
+non_goals:
+    - Interactive shell mode
+    - Bash completion
+---
+
+# Context
+
+The SynapseOS adapter system (`BaseCLIAdapter` in `adapters.py`) currently supports two CLI-based AI runtimes: `CodexCLIAdapter` and `GeminiCLIAdapter`. The architecture supports arbitrary adapters via `BaseCLIAdapter`. GitHub Copilot CLI (`gh copilot`) is a widely used AI coding assistant that complements Codex and Gemini.
+
+# Objective
+
+Create a `CopilotCLIAdapter` following the existing adapter pattern, expanding the routing options available to the `CapabilityRouter`.
+
+## 1. Decision
+
+Create a `CopilotCLIAdapter` following the existing adapter pattern. The adapter:
+
+1. Calls `gh copilot ai` as the primary command
+2. Returns a `CLIExecutionResult` with appropriate `success` flag
+3. Classifies execution outcomes using `classify_copilot_execution`
+4. Inherits circuit breaker and semaphore guard behavior
+5. Has `capabilities = ("cli_execution", "code_generation")` matching Codex
+
+### Authentication
+
+There is no dedicated SynapseOS environment variable for this adapter. Authentication relies on the state already configured in the `gh` CLI.
+
+## 2. Scope
+
+### In Scope
+
+- `CopilotCLIAdapter` class following `BaseCLIAdapter` pattern
+- `classify_copilot_execution()` function (mirrors `classify_codex_execution`)
+- Error classification: timeout, non-zero exit, authentication failure, unavailable
+- Integration with `AdapterCircuitBreakerStore`
+- Unit tests in `tests/unit/test_copilot_adapter.py`
+
+### Out of Scope
+
+- Changing the `gh copilot` binary location (uses `gh` from PATH)
+- Supporting interactive `gh copilot` shell mode
+- Bash completion or streaming output
+
+## 3. Files
+
+- `src/synapse_os/adapters.py` — add `CopilotCLIAdapter` and `classify_copilot_execution`
+- `tests/unit/test_copilot_adapter.py` — unit tests (mock `gh copilot`)
+
+## 4. Acceptance Criteria
+
+| #   | Criterion                                                                                                |
+| --- | -------------------------------------------------------------------------------------------------------- |
+| 1   | `CopilotCLIAdapter` inherits from `BaseCLIAdapter`                                                       |
+| 2   | `adapter.capabilities == ("cli_execution", "code_generation")`                                           |
+| 3   | `classify_copilot_execution` returns correct category for: success, timeout, non-zero exit, auth failure |
+| 4   | Adapter uses circuit breaker (same as `CodexCLIAdapter`)                                                 |
+| 5   | All unit tests pass; existing adapter tests continue to pass                                             |
+| 6   | `gh copilot` invoked with `--color never` to suppress ANSI                                               |
diff --git a/features/F63-memory-engine-enhancement/SPEC.md b/features/F63-memory-engine-enhancement/SPEC.md
new file mode 100644
index 0000000..bba4aec
--- /dev/null
+++ b/features/F63-memory-engine-enhancement/SPEC.md
@@ -0,0 +1,77 @@
+---
+id: F63-memory-engine-enhancement
+type: feature
+summary: Artifact metadata, indexed artifact store, and namespace-scoped memory store for run context persistence.
+inputs:
+    - Existing artifact store and report generator
+    - Run context and feature metadata
+outputs:
+    - ArtifactMetadata Pydantic model
+    - IndexedArtifactStore with find_by_tag and find_by_type
+    - MemoryStore with JSON-file backing and namespace isolation
+acceptance_criteria:
+    - ArtifactMetadata has type, tags, source_step, created_at fields
+    - IndexedArtifactStore.find_by_tag returns tagged artifacts
+    - MemoryStore.get/set/delete works with namespace isolation
+    - feature_memory returns namespaced view
+    - All unit tests pass
+non_goals:
+    - Vector/semantic search
+    - Cross-process memory sharing
+---
+
+# Context
+
+The current `RunReportGenerator` produces basic markdown reports with limited metadata. The `artifact_store` is a simple file-based store with no indexing or search. Memory for feature state is entirely external with no integration into the runtime's artifact model.
+
+# Objective
+
+Introduce `ArtifactMetadata`, `IndexedArtifactStore`, and `MemoryStore` to support richer structured metadata, fast artifact lookup, and a clean interface for persisting run context and feature decisions.
+
+## 1. Decision
+
+We introduce three complementary components:
+
+1. **`ArtifactMetadata`** — a Pydantic model attached to each artifact, containing type, tags, source_step, and created_at. Artifacts without metadata get a default entry.
+
+2. **`IndexedArtifactStore`** — wraps the existing artifact store with an in-memory index mapping `run_id → artifact_name → ArtifactMetadata`. Supports `find_by_tag`, `find_by_type`, and `list_for_run`.
+
+3. **`MemoryStore`** — a minimal key-value store backed by JSON files in the runtime state directory. Keys are namespaced (`memory:<namespace>:<key>`). Supports `get`, `set`, `delete`, and `list_namespaces`. Provides `feature_memory()` to scope operations to the current feature.
+
+These components are purely additive — no existing behavior changes.
+
+## 2. Scope
+
+### In Scope
+
+- `ArtifactMetadata` Pydantic model (type, tags, source_step, created_at)
+- `IndexedArtifactStore` class with in-memory index and lookup methods
+- `MemoryStore` class with JSON-file backing and namespace isolation
+- `feature_memory()` helper on `MemoryStore` returning a namespaced view
+- Unit tests for all three components
+- `ArtifactMetadata` attached to `StepExecutionResult` artifacts field (optional key)
+
+### Out of Scope
+
+- Vector/semantic search
+- Cross-process memory sharing
+- Automatic memory population from runs
+- Integration with opencode memory blocks
+
+## 3. Files
+
+- `src/synapse_os/memory.py` — all new memory/artifact index classes
+- `tests/unit/test_memory.py` — unit tests
+
+## 4. Acceptance Criteria
+
+| #   | Criterion                                                                                             |
+| --- | ----------------------------------------------------------------------------------------------------- |
+| 1   | `ArtifactMetadata` has fields: type (str), tags (list[str]), source_step (str), created_at (datetime) |
+| 2   | `IndexedArtifactStore.find_by_tag("error")` returns artifacts tagged "error"                          |
+| 3   | `IndexedArtifactStore.find_by_type("test_report")` returns artifacts of that type                     |
+| 4   | `MemoryStore.set("ns", "key", "value")` persists and `get("ns", "key")` retrieves it                  |
+| 5   | `MemoryStore.list_namespaces()` returns all namespaces                                                |
+| 6   | `feature_memory("F63")` returns a namespaced view that only touches F63 keys                          |
+| 7   | All unit tests pass; existing tests continue to pass                                                  |
+| 8   | `ArtifactMetadata` is added to `StepExecutionResult` artifacts field (optional key)                   |
diff --git a/features/F64-advanced-supervisor-policies/SPEC.md b/features/F64-advanced-supervisor-policies/SPEC.md
new file mode 100644
index 0000000..1090636
--- /dev/null
+++ b/features/F64-advanced-supervisor-policies/SPEC.md
@@ -0,0 +1,71 @@
+---
+id: F64-advanced-supervisor-policies
+type: feature
+summary: Policy-driven supervisor with per-step retry limits, exponential backoff, error-category-aware policies, and fallback routing.
+status: draft
+created: 2026-03-31
+owner: agent
+inputs: []
+outputs: []
+acceptance_criteria:
+    - Per-step max_retries overrides are respected (TEST_RED retries 5 times, PLAN retries only 2)
+    - Exponential backoff delay doubles each attempt: base=1s produces [1s, 2s, 4s, 8s]
+    - Backoff cap at max_delay_seconds (default 60s)
+    - SECURITY and SPEC_VALIDATION remain terminal (0 retries)
+    - AdapterOperationalError with category launcher_unavailable short-circuits without retry
+    - Fallback route is tried when primary route exhausts retries
+    - All new unit tests pass; existing supervisor tests continue to pass
+non_goals: []
+---
+
+# Context
+
+The current `Supervisor` in `supervisor.py` supports only three actions: `retry`, `reroute`, and `fail`. Retries use a single flat `max_retries` counter applied to every step. There is no per-step retry configuration, no exponential backoff, no circuit-breaker integration, and no adaptive policy that takes error categories into account.
+
+The Synapse-Flow supervisor needs to evolve from a flat counter into a policy-driven system in which different error categories, steps, and adapters can each have their own retry and behavior policies.
+
+# Objective
+
+Introduce a **policy-driven supervisor** that provides:
+
+1. **Per-step retry limits**: `PLAN`, `TEST_RED`, and `CODE_GREEN` each get their own `max_retries` instead of sharing a flat budget.
+2. **Exponential backoff**: the delay doubles between retries (1s, 2s, 4s, and so on), capped at 60s.
+3. **Error-category-aware policies**: `RetryableStepError` is retried; `AdapterOperationalError` short-circuits when its category is `launcher_unavailable`.
+4. **Fallback routing**: when the primary adapter is unavailable, route to the next available adapter.
+5. **Step-specific policies**: SECURITY and SPEC_VALIDATION remain terminal (no retries).
+
+The existing `SupervisorDecision` model remains unchanged; the new logic produces the same decision types.
+
+# Scope
+
+## In Scope
+
+- `RetryPolicy` Pydantic model: `max_retries`, `base_delay_seconds`, `max_delay_seconds`
+- `StepPolicy` Pydantic model: per-step override of `RetryPolicy`
+- `SupervisorPolicies` Pydantic model: holds default policy + per-step overrides
+- `AdvancedSupervisor` class extending `Supervisor` with policy-driven decisions
+- `calculate_backoff(attempt, base_delay, max_delay)` helper
+- Unit tests in `tests/unit/test_supervisor_policies.py`
+
+## Out of Scope
+
+- Circuit breaker integration (handled separately via `AdapterCircuitBreakerStore`)
+- Dynamic loading of policies from config at runtime
+- Budget sharing between steps (each step has an independent budget)
+
+# Files
+
+- `src/synapse_os/supervisor.py`: add policy models and `calculate_backoff`, update `Supervisor.decide_after_failure`
+- `tests/unit/test_supervisor_policies.py`: unit tests
+
+# Acceptance Criteria
+
+| #   | Criterion                                                                                                   |
+| --- | ----------------------------------------------------------------------------------------------------------- |
+| 1   | Per-step `max_retries` overrides are respected (e.g., TEST_RED can retry 5 times while PLAN retries only 2) |
+| 2   | Exponential backoff delay doubles each attempt: base=1s → [1s, 2s, 4s, 8s]                                  |
+| 3   | Backoff cap at `max_delay_seconds` (default 60s)                                                            |
+| 4   | SECURITY and SPEC_VALIDATION remain terminal (0 retries)                                                    |
+| 5   | `AdapterOperationalError` with category `launcher_unavailable` short-circuits without retry                 |
+| 6   | Fallback route is tried when primary route exhausts retries                                                 |
+| 7   | All new unit tests pass; existing supervisor tests continue to pass                                         |
diff --git a/features/F65-runtime-coordinator-hardening/SPEC.md b/features/F65-runtime-coordinator-hardening/SPEC.md
new file mode 100644
index 0000000..537487e
--- /dev/null
+++ b/features/F65-runtime-coordinator-hardening/SPEC.md
@@ -0,0 +1,63 @@
+---
+id: F65-runtime-coordinator-hardening
+type: feature
+summary: Hardened RuntimeCoordinator with graceful degradation, improved lifecycle state transitions, observability events, and cleanup handlers.
+status: draft
+created: 2026-03-31
+owner: agent
+inputs: []
+outputs: []
+acceptance_criteria:
+    - RuntimeCoordinator enters degraded mode when circuit breaker is open, continues serving healthy adapters
+    - Lifecycle state transitions emit 'runtime.lifecycle.transition' events
+    - RuntimeCoordinator emits 'runtime.starting', 'runtime.started', 'runtime.stopping', 'runtime.stopped' events
+    - Shutdown handler drains pending work with timeout before force-kill
+    - Health check returns DEGRADED status when any circuit breaker is open
+    - All new unit tests pass; existing RuntimeCoordinator tests continue to pass
+non_goals: []
+---
+
+# Context
+
+The `RuntimeCoordinator` in `runtime/service.py` is the central component that manages the runtime lifecycle. It currently lacks:
+
+- A degraded mode for when circuit breakers are open
+- Complete lifecycle events
+- Graceful shutdown handling
+- A granular health check
+- Observability integration with the existing event system
+
+# Objective
+
+Introduce a **hardened RuntimeCoordinator** that provides:
+
+1. **Graceful degradation**: when a circuit breaker is open, the coordinator keeps operating with the healthy adapters instead of failing completely.
+2. **Lifecycle events**: emits `runtime.lifecycle.transition`, `runtime.starting`, `runtime.started`, `runtime.stopping`, and `runtime.stopped` events.
+3. **Shutdown handler**: on receiving a termination signal, drains pending work with a timeout before force-killing.
+4. **Granular health check**: `GET /health` returns `{"status": "DEGRADED"}` when circuit breakers are open and `{"status": "HEALTHY"}` when everything is OK.
+5. **Cleanup handlers**: registrable cleanup hooks for resources that must be released on shutdown.
+
+The existing `RuntimeCoordinator` is enhanced in place rather than replaced.
+
+# Scope
+
+## In Scope
+
+- `RuntimeCoordinator` with a `degraded_adapters` set and graceful degradation logic
+- `lifecycle_event(event_name)` method
+- `shutdown(timeout_seconds)` method with graceful drain
+- `register_cleanup_handler(callback)` and `run_cleanup_handlers()`
+- `health_status()` method returning `Literal["HEALTHY", "DEGRADED", "UNHEALTHY"]`
+- `RuntimeLifecycleEvent` Pydantic model for lifecycle events
+- Unit tests in `tests/unit/test_runtime_coordinator_hardening.py`
+
+## Out of Scope
+
+- Changes to `RuntimeService` (split out into F70)
+- Integration with the control plane HTTP server
+- Persistence of health status
+
+# Files
+
+- `src/synapse_os/runtime/service.py`: update `RuntimeCoordinator` with hardening
+- `tests/unit/test_runtime_coordinator_hardening.py`: unit tests
diff --git a/features/F66-reporting-and-observability-evolution/SPEC.md b/features/F66-reporting-and-observability-evolution/SPEC.md
new file mode 100644
index 0000000..78cf27b
--- /dev/null
+++ b/features/F66-reporting-and-observability-evolution/SPEC.md
@@ -0,0 +1,62 @@
+---
+id: F66-reporting-and-observability-evolution
+type: feature
+summary: Enhanced run reports with structured metadata, execution timeline, adapter performance metrics, and structured error summaries.
+status: draft
+created: 2026-03-31
+owner: agent
+inputs: []
+outputs: []
+acceptance_criteria:
+    - RunReport includes execution_timeline with state transitions and durations
+    - RunReport includes adapter_metrics with per-adapter success rates and avg durations
+    - RunReport includes structured_errors list with error categories and counts
+    - RunReport includes feature_id and feature_title from SPEC metadata
+    - RunReport JSON schema is validated against a JSON Schema spec
+    - Unit tests verify all new report fields are populated correctly
+    - Existing reporting tests continue to pass
+non_goals: []
+---
+
+# Context
+
+The current `RunReport` in `reporting.py` is a plain Markdown file (RUN_REPORT.md). It lacks:
+
+- An execution timeline with state transitions and durations
+- Per-adapter metrics (success rate, average duration)
+- A structure for categorized errors
+- Validation via JSON Schema
+- `feature_id` and `feature_title` fields
+
+# Objective
+
+Expand the reporting system to include:
+
+1. **Structured timeline**: a list of state transitions, each with a timestamp and the duration since the previous transition.
+2. **Adapter metrics**: per adapter, total calls, success count, failure count, average duration in ms, and error categorization.
+3. **Structured errors**: a list of categorized errors with type, message, step, and count.
+4. **Feature metadata**: `feature_id` and `feature_title` fields populated from the SPEC frontmatter.
+5. **JSON Schema validation**: the report is first generated as a Pydantic model and then rendered to Markdown.
+
+The existing `RunReport` remains a Pydantic model; we add new fields to it.
+
+# Scope
+
+## In Scope
+
+- `ExecutionTimeline` and `TimelineEntry` Pydantic models
+- `AdapterMetrics` Pydantic model
+- `StructuredError` Pydantic model
+- `execution_timeline`, `adapter_metrics`, `structured_errors`, `feature_id`, and `feature_title` fields on `RunReport`
+- `generate_structured_report(run_id, run_record)` helper that populates all fields
+- Unit tests in `tests/unit/test_reporting_evolution.py`
+
+## Out of Scope
+
+- Changes to the existing Markdown rendering format (compatibility is preserved)
+- Integration with an external event system
+
+# Files
+
+- `src/synapse_os/reporting.py`: add models and fields to `RunReport`
+- `tests/unit/test_reporting_evolution.py`: unit tests
diff --git a/features/F67-workspace-management-v2/SPEC.md b/features/F67-workspace-management-v2/SPEC.md
new file mode 100644
index 0000000..2ae85e7
--- /dev/null
+++ b/features/F67-workspace-management-v2/SPEC.md
@@ -0,0 +1,61 @@
+---
+id: F67-workspace-management-v2
+type: feature
+summary: Workspace Management v2 with per-run workspace isolation, lifecycle hooks, and workspace pool for reuse.
+inputs:
+    - Existing WorkspaceProvider protocol
+    - Run lifecycle events
+outputs:
+    - WorkspaceState enum and TrackedWorkspace model
+    - WorkspacePool with acquire/release/discard
+    - WorkspaceManager integrating providers and pool
+acceptance_criteria:
+    - WorkspaceProvider creates isolated per-run workspace directories
+    - WorkspaceProvider tracks workspace lifecycle states
+    - Workspace cleanup hook is called when run completes
+    - Workspace pool holds up to N reusable idle workspaces
+    - Reuse of pooled workspace resets its contents
+    - All unit tests pass
+non_goals:
+    - Cross-session workspace persistence
+    - Workspace templates
+    - Multi-tenant isolation
+---
+
+# Context
+
+The current workspace system in `runtime_contracts.py` (`WorkspaceProvider`, `LocalWorkspaceProvider`, `RunScopedWorkspaceProvider`) does not support a workspace pool for reuse, cleanup lifecycle hooks, state tracking, or resetting a workspace before reuse.
+
+# Objective
+
+Introduce a `WorkspaceState` enum, `TrackedWorkspace`, a `WorkspacePool` with acquire/release/discard, an internal reset before reuse, cleanup lifecycle hooks, and a `WorkspaceManager` that integrates providers and the pool.
+
+## 1. Decision
+
+Introduce:
+
+1. **WorkspaceState enum**: `CREATING`, `READY`, `BUSY`, `CLEANUP`, `DESTROYED`
+2. **TrackedWorkspace**: a workspace with state tracking and metadata
+3. **WorkspacePool**: a fixed-size pool of idle workspaces that can be reused
+4. **Lifecycle hooks**: an `on_workspace_cleanup(path)` callback
+
+## 2. Scope
+
+### In Scope
+
+- `WorkspaceState` enum
+- `TrackedWorkspace` model
+- `WorkspacePool` class with acquire/release/discard
+- `WorkspaceManager` that integrates providers and the pool
+- Unit tests
+
+### Out of Scope
+
+- Workspace persistence across sessions
+- Workspace templates
+- Multi-tenant workspace isolation
+
+## 3. Files
+
+- `src/synapse_os/workspace.py` (new)
+- `tests/unit/test_workspace_v2.py` (new)
diff --git a/features/F68-plugin-extension-system/SPEC.md b/features/F68-plugin-extension-system/SPEC.md
new file mode 100644
index 0000000..5761ed4
--- /dev/null
+++ b/features/F68-plugin-extension-system/SPEC.md
@@ -0,0 +1,64 @@
+---
+id: F68-plugin-extension-system
+type: feature
+summary: Plugin/Extension system with hook-based registration, entry point discovery, and lifecycle management.
+inputs:
+    - Existing hooks.py hook system
+    - Python entry point mechanism
+outputs:
+    - PluginManifest dataclass with name, version, hooks
+    - PluginRegistry singleton with discovery and lifecycle
+    - load_plugins() via entry point discovery
+acceptance_criteria:
+    - Plugins are discovered via entry point group synapse_os.plugins
+    - Plugin manifest declared via hook_manifest function
+    - PluginRegistry tracks loaded plugins and hook handlers
+    - load_plugins() discovers and loads all installed plugins
+    - unload_plugin() removes plugin and its handlers
+    - Plugin can declare pre_step, post_step, on_run_start, on_run_end hooks
+    - All unit tests pass
+non_goals:
+    - Plugin sandboxing/security
+    - Plugin packaging/distribution
+    - Plugin config API
+    - Hot reload
+---
+
+# Context
+
+The current hook system in `hooks.py` supports only internal hooks that are registered manually. There is no mechanism for external extensions to discover and register hooks in Synapse-Flow.
+
+# Objective
+
+Introduce `PluginManifest`, a `PluginRegistry` singleton with discovery and lifecycle management, the `synapse_os.plugins` entry point group for automatic discovery, and `load_plugins()` and `unload_plugin()`.
+
+## 1. Decision
+
+Introduce:
+
+1. **PluginManifest**: a dataclass with name, version, hooks, and enabled
+2. **PluginRegistry**: a singleton that manages discovered and loaded plugins
+3. **Entry point group** `synapse_os.plugins` for automatic discovery
+4. **load_plugins()**: discovers and registers all plugins via entry points
+5. **unload_plugin(name)**: removes a plugin from the registry
+
+## 2. Scope
+
+### In Scope
+
+- PluginManifest dataclass
+- PluginRegistry with discovery and lifecycle
+- Entry point based plugin discovery
+- Unit tests
+
+### Out of Scope
+
+- Plugin sandboxing/security
+- Plugin packaging/distribution
+- Plugin config API
+- Hot reload
+
+## 3. Files
+
+- `src/synapse_os/plugins.py` (new)
+- `tests/unit/test_plugins.py` (new)
diff --git a/memory/MEMORY.md b/memory/MEMORY.md
new file mode 100644
index 0000000..3ea0738
--- /dev/null
+++ b/memory/MEMORY.md
@@ -0,0 +1,20 @@
+# MEMORY.md — SynapseOS
+
+Lean index of the project's durable memory. This file points to the topic files in `memory/`.
+
+## Topic files
+
+| File                                       | Contents                                  | Last updated       |
+| ------------------------------------------ | ----------------------------------------- | ------------------ |
+| [project_state.md](project_state.md)       | Current state, sprint, branch, milestones | 2026-04-01         |
+| [stable_decisions.md](stable_decisions.md) | Fixed architectural decisions             | 2026-04-01         |
+| [active_fronts.md](active_fronts.md)       | Active fronts + open decisions            | 2026-04-01         |
+| [pitfalls.md](pitfalls.md)                 | Recurring technical pitfalls              | 2026-04-01         |
+| [next_steps.md](next_steps.md)             | Recommended next steps                    | 2026-04-01         |
+| [handoff.md](handoff.md)                   | Latest session handoff                    | 2026-04-01         |
+
+## Convention
+
+- `memory/` is the project's durable memory, not a conversation log.
+- Operational detail lives in `ERROR_LOG.md` and `PENDING_LOG.md`.
+- Updated via `memory-curator` when sessions are closed.
diff --git a/memory/active_fronts.md b/memory/active_fronts.md
new file mode 100644
index 0000000..dfb4739
--- /dev/null
+++ b/memory/active_fronts.md
@@ -0,0 +1,36 @@
+# Active Fronts — SynapseOS
+
+## Recently completed fronts
+
+All fronts from the F59-F68 sprint have been merged into `origin/main`:
+
+| Front  | Description                         | Status      |
+| ------ | ----------------------------------- | ----------- |
+| F59    | Multi-Agent Session Orchestration   | ✅ Merged   |
+| F60    | Local Control Plane Foundation      | ✅ Merged   |
+| F61    | DAG Pipeline Evolution              | ✅ Merged   |
+| F62    | Copilot Adapter                     | ✅ Merged   |
+| F63    | Memory Engine Enhancement           | ✅ Merged   |
+| F64    | Advanced Supervisor Policies        | ✅ Merged   |
+| F65    | Runtime Coordinator Hardening       | ✅ Merged   |
+| F66    | Reporting & Observability Evolution | ✅ Merged   |
+| F67    | Workspace Management v2             | ✅ Merged   |
+| F68    | Plugin/Extension System             | ✅ Merged   |
+
+## Active front
+
+No active front at the moment. Awaiting `technical-triage` to define the next priorities.
+
+## Open decisions
+
+1. **Next priority front:** Awaiting triage to choose among:
+    - Multi-agent evolution (distributed sessions)
+    - Additional security hardening
+    - Desktop UI/TUI (once concrete demand arises)
+    - Other backlog candidates
+
+2. **Desktop-shell:** Kept out of the main queue until the core is fully stabilized.
+
+3. **TypeScript runtime migration:** Dropped for now; reassess only if a strategic need arises.
+
+4. **Remote multi-host auth:** Deferred until there is concrete demand.
diff --git a/memory/handoff.md b/memory/handoff.md
new file mode 100644
index 0000000..386cf0b
--- /dev/null
+++ b/memory/handoff.md
@@ -0,0 +1,37 @@
+# Handoff — SynapseOS
+
+**Date:** 2026-04-01  
+**Sprint:** F59-F68 completed  
+**Branch:** main  
+**Status:** Stable, 755 tests, ruff/mypy clean
+
+## Read before acting
+
+1. Read `AGENTS.md` for project conventions
+2. Read `memory/MEMORY.md` and the topic files in `memory/`
+3. Read `ERROR_LOG.md` and `PENDING_LOG.md` for operational context
+4. Check `git status` and run `./scripts/branch-sync-check.sh`
+
+## Current state
+
+- All 10 fronts from the F59-F68 sprint merged into `origin/main`
+- Synapse-Flow evolved into a state-driven DAG
+- Multi-agent session orchestration operational
+- Local control plane foundation stabilized
+- Runtime boundaries, workspace isolation, and observability consolidated
+- Zero critical errors; baseline 100% clean
+
+## Open points
+
+- Awaiting `technical-triage` to define the next fronts
+- Desktop-shell kept out of the main queue
+- TypeScript runtime migration dropped for now
+- Remote multi-host auth deferred
+
+## Recommended next front
+
+Run `technical-triage` to assess the backlog and choose the next priority front among:
+
+- Multi-agent evolution (distributed sessions)
+- Additional security hardening
+- Other backlog candidates
diff --git a/memory/next_steps.md b/memory/next_steps.md
new file mode 100644
index 0000000..93bc435
--- /dev/null
+++ b/memory/next_steps.md
@@ -0,0 +1,19 @@
+# Next Steps — SynapseOS
+
+## Immediate recommendation
+
+Run `technical-triage` to define the next priority fronts after the F59-F68 sprint.
+
+## Potential candidates (awaiting triage)
+
+1. **Multi-agent evolution:** Distributed sessions, advanced coordination between adapters
+2. **Security hardening:** Additional auth, expanded audit trail, rate limiting
+3. **Desktop UI/TUI:** `synapse tui` with Textual, only if there is concrete demand
+4. **Performance and scalability:** Throughput and memory optimizations
+5. **Integrations:** New adapters beyond Codex and Copilot
+
+## Do not prioritize
+
+- Desktop-shell (out of the queue until the core stabilizes)
+- TypeScript-first runtime migration (dropped for now)
+- Remote multi-host auth (deferred until there is concrete demand)
diff --git a/memory/pitfalls.md b/memory/pitfalls.md
new file mode 100644
index 0000000..07c96b8
--- /dev/null
+++ b/memory/pitfalls.md
@@ -0,0 +1,27 @@
+# Pitfalls — SynapseOS
+
+## Recurring pitfalls
+
+### Branches and Git
+
+1. **Reusing a merged branch**: never reuse an already-merged feature branch for new drafts. Use `draft/*` or `archive/*`.
+2. **Undetected drift**: run `./scripts/branch-sync-check.sh` early and keep the worktree clean before commit/push/PR.
+3. **Mixed delta in a PR**: when unavoidable, consolidate the durable handoff and minimal artifacts immediately after merge.
+
+### Docker and environment
+
+4. **Sandbox vs. real**: distinguish sandbox failures (network, Docker daemon) from real repository failures.
+5. **Cold worktree**: run `uv sync --locked --extra dev` before running tests that load `conftest.py`.
+6. **Broken wrappers**: prefer `python -m pytest`/`python -m mypy` via `uv` over the `.venv` wrappers.
+
+### Tests and TDD
+
+7. **ANSI fixtures**: fixtures that store ANSI as literal escapes require `unicode_escape=True` in the helper.
+8. **Legacy monkeypatch**: when extending CLI helpers, keep the signature compatible or update the legacy doubles.
+9. **mypy in tests**: `tests/` has an explicit override; do not apply the strict mode of `src/` to the test tree.
+
+### CI and gates
+
+10. **Local repo-checks**: run the local equivalent of the broad gate before finishing large PRs.
+11. **Global ruff format**: revalidate after broad documentation/baseline changes.
+12. **Inline PR body**: use `--body-file` instead of `--body` when the Markdown contains backticks.
diff --git a/memory/project_state.md b/memory/project_state.md
new file mode 100644
index 0000000..d60bb2c
--- /dev/null
+++ b/memory/project_state.md
@@ -0,0 +1,24 @@
+# Project State — SynapseOS
+
+## Global state
+
+**Current sprint:** F59-F68 completed (2026-04-01)  
+**Active branch:** main  
+**Baseline:** origin/main in sync  
+**Status:** Stable, 755 tests passing
+
+## Milestones
+
+- Initial MVP: Completed (F01-F10)
+- Stage 2: Completed (F15-F22)
+- First wave of guardrails: Completed (F23-F27)
+- Runtime boundaries foundation: Completed (F51-F53)
+- Sprint F59-F68: Completed (Multi-Agent, Control Plane, DAG, Copilot, Memory, Supervisor, Runtime, Observability, Workspace v2, Plugins)
+
+## Local snapshot
+
+- `ruff format --check .`: ✅ Clean
+- `mypy src`: ✅ Clean
+- `pytest`: ✅ 755 passing
+- `ERROR_LOG.md`: No open critical errors
+- `PENDING_LOG.md`: Awaiting technical-triage for the next fronts
diff --git a/memory/stable_decisions.md b/memory/stable_decisions.md
new file mode 100644
index 0000000..af74c98
--- /dev/null
+++ b/memory/stable_decisions.md
@@ -0,0 +1,32 @@
+# Stable Decisions — SynapseOS
+
+## Architecture
+
+1. **Synapse-Flow** is SynapseOS's own pipeline engine: a linear state-driven pipeline that evolved into a DAG.
+2. **CLI-first**: the primary interface is the CLI; a desktop UI (Textual) only when there is concrete demand.
+3. **Python core**: the central runtime stays in Python; TypeScript is limited to an optional shell/UI.
+4. **Container-first**: practical execution via Docker/Compose, with a lightweight preflight (`compose config`).
+
+## Boundaries and contracts
+
+5. Formalized `ToolSpec`/capabilities: explicit contracts for registered capabilities.
+6. `WorkspaceProvider` with auditable isolation: persisted workspace path and a `run-scoped` provider.
+7. Enriched `RunContext`: lifecycle events (`run_context_initialized`, `step_started`, `state_transitioned`).
+
+## Multi-agent and orchestration
+
+8. **Multi-Agent Session Orchestration**: registry/capabilities and coordination between adapters without a desktop UI.
+9. **Deterministic supervisor**: decisions among retry, reroute, return_to_code_green, and fail.
+10. **Persistent runtime**: Linux-first, with process identity validated via `/proc/<pid>/cmdline`.
+
+## Execution and quality
+
+11. **Explicit TDD**: RED → GREEN → REFACTOR; tests before production code.
+12. **Quality gates**: ruff, mypy, and pytest as mandatory gates before SECURITY_REVIEW.
+13. **Branch Sync Gate**: drift detection and conservative updates via scripts.
+
+## Product decisions
+
+14. **Desktop-shell out of the main queue**: returns only after runtime boundaries, workspace isolation, observability, and the control plane are stabilized.
+15. **TypeScript-first runtime migration dropped**: for now, TypeScript only for an optional shell/UI.
+16. **Remote multi-host auth deferred**: only when there is concrete demand and a verifiable scope.
diff --git a/pyproject.toml b/pyproject.toml
index 092ad0c..1ff4228 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -24,6 +24,9 @@ dependencies = [
   "jsonschema>=4.23.0",
   "pyyaml>=6.0.2",
   "textual>=8.1.1",
+  "fastapi>=0.115.0",
+  "uvicorn>=0.32.0",
+  "httpx>=0.28.0",
 ]
 
 [project.optional-dependencies]
diff --git a/src/synapse_os/adapters.py b/src/synapse_os/adapters.py
index 3185a48..e647093 100644
--- a/src/synapse_os/adapters.py
+++ b/src/synapse_os/adapters.py
@@ -45,6 +45,7 @@ def __init__(
         self.tool_name = tool_name
         self.command = command
         self.reason = reason
+        self.message = message
 
 
 class BaseCLIAdapter(ABC):
@@ -324,3 +325,149 @@ def build_command(self, prompt: str) -> list[str]:
 
     async def execute(self, prompt: str) -> CLIExecutionResult:
         return await super().execute(prompt)
+
+
+_LAUNCHER_UNAVAILABLE_COPILOT_PATTERNS = (
+    "gh: command not found",
+    "gh: not found",
+    "command not found: gh",
+)
+_AUTHENTICATION_UNAVAILABLE_COPILOT_PATTERNS = (
+    "authentication required",
+    "not logged in",
+    "unauthorized",
+    "invalid token",
+    "authenticated",
+    "gh auth login",
+)
+
+
+class CopilotCLIAdapter(BaseCLIAdapter):
+    @property
+    def capabilities(self) -> tuple[str, ...]:
+        return ("cli_execution", "code_generation")
+
+    @property
+    def command_prefix(self) -> tuple[str, ...]:
+        return ("gh", "copilot", "ai")
+
+    def __init__(
+        self,
+        *,
+        timeout_seconds: float = 30.0,
+        max_concurrent_adapters: int | None = None,
+    ) -> None:
+        super().__init__(
+            tool_name="copilot",
+            timeout_seconds=timeout_seconds,
+            max_concurrent_adapters=max_concurrent_adapters,
+        )
+
+    def build_command(self, prompt: str) -> list[str]:
+        if not prompt.strip():
+            raise ValueError("prompt must not be empty.")
+        return [
+            "gh",
+            "copilot",
+            "ai",
+            "--color",
+            "never",
+            "--",
+            prompt,
+        ]
+
+    async def execute(self, prompt: str) -> CLIExecutionResult:
+        settings = AppSettings()
+        command = self.build_command(prompt)
+        self._validate_command(command)
+
+        breaker_store = AdapterCircuitBreakerStore(settings.adapter_circuit_breaker_state_file)
+        if breaker_store.is_open(self.tool_name, now=time.time()):
+            return CLIExecutionResult(
+                tool_name=self.tool_name,
+                command=command,
+                return_code=75,
+                stdout_raw="",
+                stderr_raw="Circuit breaker open for copilot.\n",
+                stdout_clean="",
+                stderr_clean="Circuit breaker open for copilot.",
+                duration_ms=0,
+                timed_out=False,
+                success=False,
+            )
+
+        try:
+            result = await super().execute(prompt)
+        except AdapterOperationalError as exc:
+            breaker_store.record_operational_failure(
+                self.tool_name,
+                threshold=settings.adapter_circuit_breaker_failure_threshold,
+                cooldown_seconds=settings.adapter_circuit_breaker_cooldown_seconds,
+                now=time.time(),
+            )
+            return CLIExecutionResult(
+                tool_name=self.tool_name,
+                command=exc.command,
+                return_code=1,
+                stdout_raw="",
+                stderr_raw=exc.message,
+                stdout_clean="",
+                stderr_clean=exc.message,
+                duration_ms=0,
+                timed_out=False,
+                success=False,
+            )
+        assessment = classify_copilot_execution(result)
+        if assessment.category in {
+            "launcher_unavailable",
+            "authentication_unavailable",
+        }:
+            breaker_store.record_operational_failure(
+                self.tool_name,
+                threshold=settings.adapter_circuit_breaker_failure_threshold,
+                cooldown_seconds=settings.adapter_circuit_breaker_cooldown_seconds,
+                now=time.time(),
+            )
+        else:
+            breaker_store.reset(self.tool_name)
+        return result
+
+
+def classify_copilot_execution(result: CLIExecutionResult) -> CodexExecutionAssessment:
+    stderr_lower = result.stderr_clean.lower()
+
+    if result.success:
+        return CodexExecutionAssessment(
+            category="success",
+            is_operational_block=False,
+            detail="Copilot CLI completed successfully.",
+        )
+    if "circuit breaker open" in stderr_lower:
+        return CodexExecutionAssessment(
+            category="circuit_open",
+            is_operational_block=True,
+            detail=result.stderr_clean or "Copilot circuit breaker is open.",
+        )
+    if result.timed_out:
+        return CodexExecutionAssessment(
+            category="timeout",
+            is_operational_block=False,
+            detail="Copilot CLI exceeded the configured timeout.",
+        )
+    if _contains_any(stderr_lower, _LAUNCHER_UNAVAILABLE_COPILOT_PATTERNS):
+        return CodexExecutionAssessment(
+            category="launcher_unavailable",
+            is_operational_block=True,
+            detail=result.stderr_clean or "GitHub CLI (gh) is unavailable.",
+        )
+    if _contains_any(stderr_lower, _AUTHENTICATION_UNAVAILABLE_COPILOT_PATTERNS):
+        return CodexExecutionAssessment(
+            category="authentication_unavailable",
+            is_operational_block=True,
+            detail=result.stderr_clean or "GitHub Copilot authentication is unavailable.",
+        )
+    return CodexExecutionAssessment(
+        category="return_code_nonzero",
+        is_operational_block=False,
+        detail=result.stderr_clean or "Copilot CLI exited with a non-zero return code.",
+    )
diff --git a/src/synapse_os/cli/app.py b/src/synapse_os/cli/app.py
index aab0b57..07dc080 100644
--- a/src/synapse_os/cli/app.py
+++ b/src/synapse_os/cli/app.py
@@ -54,10 +54,12 @@
 runtime_app = typer.Typer(help="Manage the minimal persistent runtime.")
 runs_app = typer.Typer(help="Inspect persisted runs and artifacts.")
 auth_app = typer.Typer(help="Manage the local auth registry.")
+control_plane_app = typer.Typer(help="Manage the local control plane HTTP API.")
 app.add_typer(runtime_app, name="runtime")
 app.add_typer(runs_app, name="runs")
 app.add_typer(auth_app, name="auth")
 app.add_typer(hooks_app, name="hooks")
+app.add_typer(control_plane_app, name="control-plane")
 
 
 @app.callback()
@@ -807,3 +809,51 @@ def runs_show(
         artifact_paths=artifact_store.list_artifact_paths(run_id),
         preview=resolved_preview,
     )
+
+
+@control_plane_app.command("start")
+def control_plane_start(
+    host: str = typer.Option("127.0.0.1", "--host", envvar="SYNAPSE_CONTROL_HOST"),
+    port: int = typer.Option(8080, "--port", envvar="SYNAPSE_CONTROL_PORT"),
+    api_token: str | None = typer.Option(None, "--token", envvar="SYNAPSE_API_TOKEN"),
+) -> None:
+    import uvicorn
+
+    if host not in ("127.0.0.1", "localhost"):
+        typer.echo(
+            "WARNING: Binding to non-localhost address. "
+            "The control plane has no network-level security.",
+            err=True,
+        )
+
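+    # Building these dependencies may raise CLIError. Let it propagate
+    # unchanged; the previous try/except only re-raised the same exception
+    # and had no effect.
+    runtime_service = _runtime_service()
+    run_repo = _run_repository()
+    artifact_store = _artifact_store()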
+
+    from synapse_os.control_plane.server import create_app
+
+    cp_app = create_app(
+        runtime_service=runtime_service,
+        run_repository=run_repo,
+        artifact_store=artifact_store,
+        api_token=api_token,
+    )
+
+    typer.echo(f"Starting control plane on http://{host}:{port}")
+    uvicorn.run(cp_app, host=host, port=port, log_level="info")
+
+
+@control_plane_app.command("status")
+def control_plane_status() -> None:
+    """Print the configured control plane address without probing the server."""
+    host = os.environ.get("SYNAPSE_CONTROL_HOST", "127.0.0.1")
+    port_raw = os.environ.get("SYNAPSE_CONTROL_PORT", "8080")
+    try:
+        port = int(port_raw)
+    except ValueError:
+        port = 8080
+    typer.echo(f"Control plane configured for http://{host}:{port}")
+    typer.echo("Use 'synapse control-plane start' to start the server.")
diff --git a/src/synapse_os/control_plane/__init__.py b/src/synapse_os/control_plane/__init__.py
new file mode 100644
index 0000000..70f521c
--- /dev/null
+++ b/src/synapse_os/control_plane/__init__.py
@@ -0,0 +1 @@
+"""Local Control Plane — HTTP API for SynapseOS."""
diff --git a/src/synapse_os/control_plane/middleware.py b/src/synapse_os/control_plane/middleware.py
new file mode 100644
index 0000000..61a8813
--- /dev/null
+++ b/src/synapse_os/control_plane/middleware.py
@@ -0,0 +1,57 @@
+"""Authentication middleware for the Control Plane API."""
+
+from __future__ import annotations
+
+import hmac
+
+from starlette.types import ASGIApp, Receive, Scope, Send
+
+
+class AuthMiddleware:
+    def __init__(self, app: ASGIApp, api_token: str | None) -> None:
+        self.app = app
+        self._api_token = api_token
+
+    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
+        if self._api_token is None:
+            await self.app(scope, receive, send)
+            return
+
+        if scope.get("type") != "http":
+            await self.app(scope, receive, send)
+            return
+
+        path = scope.get("path", "")
+        if path == "/health":
+            await self.app(scope, receive, send)
+            return
+
+        headers = dict(scope.get("headers", []))
+        auth_header = headers.get(b"authorization", b"")
+        if not auth_header.startswith(b"Bearer "):
+            await _send_json(receive, send, 401, {"detail": "Unauthorized"})
+            return
+
+        # Compare raw bytes in constant time so timing does not leak the token.
+        if not hmac.compare_digest(auth_header[7:], self._api_token.encode()):
+            await _send_json(receive, send, 401, {"detail": "Unauthorized"})
+            return
+
+        await self.app(scope, receive, send)
+
+
+async def _send_json(receive: Receive, send: Send, status: int, content: dict[str, object]) -> None:
+    import json
+
+    body = json.dumps(content).encode()
+    await send(
+        {
+            "type": "http.response.start",
+            "status": status,
+            "headers": [
+                (b"content-type", b"application/json"),
+                (b"content-length", str(len(body)).encode()),
+            ],
+        }
+    )
+    await send({"type": "http.response.body", "body": body})
diff --git a/src/synapse_os/control_plane/models.py b/src/synapse_os/control_plane/models.py
new file mode 100644
index 0000000..1a5f5c1
--- /dev/null
+++ b/src/synapse_os/control_plane/models.py
@@ -0,0 +1,67 @@
+"""Pydantic models for the Control Plane API."""
+
+from __future__ import annotations
+
+from pydantic import BaseModel, Field
+
+
+class HealthResponse(BaseModel):
+    status: str
+    runtime: str
+
+
+class RunListItem(BaseModel):
+    id: str
+    status: str
+    prompt: str
+    created_at: str
+
+
+class RunListResponse(BaseModel):
+    runs: list[RunListItem]
+    total: int
+    limit: int
+    offset: int
+
+
+class RunStepItem(BaseModel):
+    name: str
+    status: str
+
+
+class RunDetailResponse(BaseModel):
+    id: str
+    status: str
+    prompt: str
+    created_at: str
+    updated_at: str
+    steps: list[RunStepItem] = Field(default_factory=list)
+    artifacts: list[str] = Field(default_factory=list)
+
+
+class RunCreateRequest(BaseModel):
+    prompt: str = Field(..., min_length=1)
+
+
+class RunCreateResponse(BaseModel):
+    run_id: str
+    status: str
+
+
+class RuntimeStatusResponse(BaseModel):
+    pid: int | None = None
+    uptime: int = 0
+    state: str
+    active_runs: int = 0
+    pending_runs: int = 0
+
+
+class ArtifactItem(BaseModel):
+    name: str
+    size_bytes: int
+    created_at: str
+    type: str
+
+
+class ArtifactListResponse(BaseModel):
+    artifacts: list[ArtifactItem]
diff --git a/src/synapse_os/control_plane/server.py b/src/synapse_os/control_plane/server.py
new file mode 100644
index 0000000..ea3e612
--- /dev/null
+++ b/src/synapse_os/control_plane/server.py
@@ -0,0 +1,261 @@
+"""FastAPI application for the Local Control Plane."""
+
+from __future__ import annotations
+
+import os
+from datetime import UTC
+from pathlib import Path
+from typing import TYPE_CHECKING
+
+from fastapi import FastAPI, HTTPException, Query
+from fastapi.responses import JSONResponse
+
+from synapse_os.control_plane.middleware import AuthMiddleware
+from synapse_os.control_plane.models import (
+    ArtifactItem,
+    ArtifactListResponse,
+    HealthResponse,
+    RunCreateRequest,
+    RunCreateResponse,
+    RunDetailResponse,
+    RunListItem,
+    RunListResponse,
+    RunStepItem,
+    RuntimeStatusResponse,
+)
+
+if TYPE_CHECKING:
+    from synapse_os.persistence import ArtifactStore, RunRepository
+    from synapse_os.runtime.service import RuntimeService
+
+MAX_PROMPT_PREVIEW = 100
+TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
+
+
+def create_app(
+    *,
+    runtime_service: RuntimeService | None = None,
+    run_repository: RunRepository | None = None,
+    artifact_store: ArtifactStore | None = None,
+    api_token: str | None = None,
+) -> FastAPI:
+    app = FastAPI(
+        title="SynapseOS Control Plane",
+        version="0.1.0",
+        docs_url=None,
+        redoc_url=None,
+    )
+
+    if api_token is not None:
+        app.add_middleware(AuthMiddleware, api_token=api_token)
+
+    @app.get("/health", response_model=HealthResponse)
+    async def health() -> HealthResponse:
+        runtime_status = "unknown"
+        if runtime_service is not None:
+            try:
+                runtime_status = "running" if runtime_service.ready() else "stopped"
+            except Exception:
+                runtime_status = "stopped"
+        return HealthResponse(status="ok", runtime=runtime_status)
+
+    @app.get("/api/v1/runs", response_model=RunListResponse)
+    async def list_runs(
+        limit: int = Query(default=20, ge=1, le=100),
+        offset: int = Query(default=0, ge=0),
+    ) -> RunListResponse:
+        if run_repository is None:
+            raise HTTPException(status_code=503, detail="Run repository not configured")
+
+        all_runs = run_repository.list_runs()
+        total = len(all_runs)
+        page = all_runs[offset : offset + limit]
+
+        return RunListResponse(
+            runs=[
+                RunListItem(
+                    id=r.run_id,
+                    status=r.status,
+                    prompt=str(getattr(r, "spec_path", ""))[:MAX_PROMPT_PREVIEW],
+                    created_at=r.created_at,
+                )
+                for r in page
+            ],
+            total=total,
+            limit=limit,
+            offset=offset,
+        )
+
+    @app.post("/api/v1/runs", response_model=RunCreateResponse, status_code=201)
+    async def create_run(request: RunCreateRequest) -> RunCreateResponse:
+        if run_repository is None:
+            raise HTTPException(status_code=503, detail="Run repository not configured")
+
+        spec_path = _create_spec_from_prompt(request.prompt)
+        run_id = run_repository.create_run(
+            spec_path=spec_path,
+            initial_state="REQUEST",
+            stop_at="COMPLETE",
+            initiated_by="api",
+        )
+
+        return RunCreateResponse(run_id=run_id, status="pending")
+
+    @app.get("/api/v1/runs/{run_id}", response_model=RunDetailResponse)
+    async def get_run(run_id: str) -> RunDetailResponse:
+        if run_repository is None:
+            raise HTTPException(status_code=503, detail="Run repository not configured")
+
+        try:
+            run = run_repository.get_run(run_id)
+        except Exception as err:
+            raise HTTPException(status_code=404, detail="Run not found") from err
+
+        steps = []
+        try:
+            for s in run_repository.list_steps(run_id):
+                steps.append(RunStepItem(name=s.state, status=s.status))
+        except Exception:
+            pass
+
+        artifacts = []
+        if artifact_store is not None:
+            try:
+                artifacts = artifact_store.list_artifact_paths(run_id)
+            except Exception:
+                pass
+
+        return RunDetailResponse(
+            id=run.run_id,
+            status=run.status,
+            prompt=str(run.spec_path),
+            created_at=run.created_at,
+            updated_at=run.updated_at,
+            steps=steps,
+            artifacts=artifacts,
+        )
+
+    @app.post("/api/v1/runs/{run_id}/cancel")
+    async def cancel_run(run_id: str) -> JSONResponse:
+        if run_repository is None:
+            raise HTTPException(status_code=503, detail="Run repository not configured")
+
+        try:
+            run = run_repository.get_run(run_id)
+        except Exception as err:
+            raise HTTPException(status_code=404, detail="Run not found") from err
+
+        if run.status in TERMINAL_STATUSES:
+            raise HTTPException(
+                status_code=409,
+                detail=f"Cannot cancel run in terminal state: {run.status}",
+            )
+
+        try:
+            run_repository.mark_run_cancelling(run_id)
+        except ValueError as err:
+            raise HTTPException(status_code=409, detail="Run cannot be cancelled") from err
+
+        return JSONResponse(content={"status": "cancelling", "run_id": run_id})
+
+    @app.get("/api/v1/runtime/status", response_model=RuntimeStatusResponse)
+    async def runtime_status() -> RuntimeStatusResponse:
+        if runtime_service is None:
+            raise HTTPException(status_code=503, detail="Runtime service not configured")
+
+        state = runtime_service.current_state()
+        pending = 0
+        if run_repository is not None:
+            try:
+                pending = len(run_repository.list_unlocked_pending_runs())
+            except Exception:
+                pass
+
+        uptime = 0
+        if state.started_at is not None:
+            try:
+                from datetime import datetime
+
+                started = datetime.fromisoformat(state.started_at)
+                uptime = int((datetime.now(UTC) - started).total_seconds())
+            except Exception:
+                pass
+
+        return RuntimeStatusResponse(
+            pid=state.pid,
+            uptime=uptime,
+            state=state.status,
+            active_runs=1 if state.status == "running" else 0,
+            pending_runs=pending,
+        )
+
+    @app.get("/api/v1/artifacts/{run_id}", response_model=ArtifactListResponse)
+    async def list_artifacts(run_id: str) -> ArtifactListResponse:
+        if artifact_store is None:
+            raise HTTPException(status_code=503, detail="Artifact store not configured")
+
+        try:
+            paths = artifact_store.list_artifact_paths(run_id)
+        except FileNotFoundError as err:
+            raise HTTPException(status_code=404, detail="Run not found") from err
+
+        artifacts = []
+        for p in paths:
+            full_path = artifact_store.base_path / p
+            try:
+                stat = full_path.stat()
+                artifact_type = _infer_artifact_type(p)
+                artifacts.append(
+                    ArtifactItem(
+                        name=full_path.name,
+                        size_bytes=stat.st_size,
+                        created_at=_format_timestamp(stat.st_mtime),
+                        type=artifact_type,
+                    )
+                )
+            except OSError:
+                continue
+
+        return ArtifactListResponse(artifacts=artifacts)
+
+    return app
+
+
+def _create_spec_from_prompt(prompt: str) -> Path:
+    from uuid import uuid4
+
+    tmp_dir = Path(os.environ.get("TMPDIR", "/tmp")) / "synapse-os" / "api-specs"
+    tmp_dir.mkdir(parents=True, exist_ok=True)
+    spec_path = tmp_dir / f"{uuid4().hex}.md"
+    spec_content = (
+        "---\n"
+        "feature_id: api-run\n"
+        "feature_name: API Run\n"
+        "status: draft\n"
+        "---\n\n"
+        f"# API Run\n\n{prompt}\n"
+    )
+    spec_path.touch(mode=0o600)  # create owner-only before the prompt is written
+    spec_path.write_text(spec_content, encoding="utf-8")
+    return spec_path
+
+
+def _infer_artifact_type(path: str) -> str:
+    path_lower = path.lower()
+    if "spec" in path_lower:
+        return "spec"
+    if "test" in path_lower:
+        return "test"
+    if "report" in path_lower:
+        return "report"
+    if path_lower.endswith((".py", ".ts", ".js", ".rs", ".go")):
+        return "code"
+    if path_lower.endswith((".md", ".txt")):
+        return "document"
+    return "other"
+
+
+def _format_timestamp(ts: float) -> str:
+    from datetime import datetime
+
+    return datetime.fromtimestamp(ts, tz=UTC).isoformat()
diff --git a/src/synapse_os/memory.py b/src/synapse_os/memory.py
new file mode 100644
index 0000000..829f124
--- /dev/null
+++ b/src/synapse_os/memory.py
@@ -0,0 +1,129 @@
+from __future__ import annotations
+
+import json
+from collections import defaultdict
+from datetime import datetime, timezone
+from pathlib import Path
+from threading import Lock
+
+from pydantic import BaseModel, ConfigDict, Field, StrictStr
+
+
+class ArtifactMetadata(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    type: StrictStr = Field(default="unknown")
+    tags: list[StrictStr] = Field(default_factory=list)
+    source_step: StrictStr | None = None
+    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
+
+
+class ArtifactIndexEntry(BaseModel):
+    name: StrictStr
+    run_id: StrictStr
+    metadata: ArtifactMetadata
+
+
+class IndexedArtifactStore:
+    def __init__(self, *, base_path: Path) -> None:
+        self.base_path = base_path
+        self._lock = Lock()
+        self._index: dict[str, list[ArtifactIndexEntry]] = defaultdict(list)
+
+    def register(
+        self,
+        *,
+        run_id: str,
+        name: str,
+        metadata: ArtifactMetadata | None = None,
+    ) -> None:
+        with self._lock:
+            entry = ArtifactIndexEntry(
+                name=name,
+                run_id=run_id,
+                metadata=metadata or ArtifactMetadata(type="unknown"),
+            )
+            self._index[run_id].append(entry)
+
+    def find_by_tag(self, tag: str) -> list[ArtifactIndexEntry]:
+        with self._lock:
+            return [
+                entry
+                for entries in self._index.values()
+                for entry in entries
+                if tag in entry.metadata.tags
+            ]
+
+    def find_by_type(self, artifact_type: str) -> list[ArtifactIndexEntry]:
+        with self._lock:
+            return [
+                entry
+                for entries in self._index.values()
+                for entry in entries
+                if entry.metadata.type == artifact_type
+            ]
+
+    def list_for_run(self, run_id: str) -> list[ArtifactIndexEntry]:
+        with self._lock:
+            return list(self._index.get(run_id, []))
+
+
+class MemoryStore:
+    def __init__(self, *, state_dir: Path) -> None:
+        self.state_dir = state_dir
+        self._lock = Lock()
+        self._memory_path = state_dir / "memory-store.json"
+        self._memory: dict[str, dict[str, str]] = self._load()
+
+    def _load(self) -> dict[str, dict[str, str]]:
+        if not self._memory_path.exists():
+            return defaultdict(dict)
+        try:
+            data = json.loads(self._memory_path.read_text(encoding="utf-8"))
+            return defaultdict(dict, data)
+        except Exception:
+            return defaultdict(dict)
+
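+    def _persist(self) -> None:
+        # Write to a sibling temp file and rename so readers never see a partial file.
+        self.state_dir.mkdir(parents=True, exist_ok=True)
+        tmp_path = self._memory_path.with_suffix(".json.tmp")
+        tmp_path.write_text(json.dumps(dict(self._memory), ensure_ascii=False), encoding="utf-8")
+        tmp_path.replace(self._memory_path)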
+
+    def get(self, namespace: str, key: str) -> str | None:
+        with self._lock:
+            return self._memory.get(namespace, {}).get(key)
+
+    def set(self, namespace: str, key: str, value: str) -> None:
+        with self._lock:
+            self._memory[namespace][key] = value
+            self._persist()
+
+    def delete(self, namespace: str, key: str) -> None:
+        with self._lock:
+            if namespace in self._memory and key in self._memory[namespace]:
+                del self._memory[namespace][key]
+                self._persist()
+
+    def list_namespaces(self) -> list[str]:
+        with self._lock:
+            return list(self._memory.keys())
+
+    def feature_memory(self, feature_id: str) -> FeatureMemoryView:
+        return FeatureMemoryView(store=self, namespace=feature_id)
+
+
+class FeatureMemoryView:
+    def __init__(self, store: MemoryStore, namespace: str) -> None:
+        self._store = store
+        self._namespace = namespace
+
+    def get(self, key: str) -> str | None:
+        return self._store.get(self._namespace, key)
+
+    def set(self, key: str, value: str) -> None:
+        self._store.set(self._namespace, key, value)
+
+    def delete(self, key: str) -> None:
+        self._store.delete(self._namespace, key)
diff --git a/src/synapse_os/multi_agent.py b/src/synapse_os/multi_agent.py
new file mode 100644
index 0000000..ef11d45
--- /dev/null
+++ b/src/synapse_os/multi_agent.py
@@ -0,0 +1,129 @@
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+
+from synapse_os.adapters import BaseCLIAdapter
+
+
+class AdapterAlreadyRegisteredError(ValueError):
+    pass
+
+
+class AdapterNotFoundError(KeyError):
+    pass
+
+
+class NoSuitableAdapterError(RuntimeError):
+    pass
+
+
+class AdapterRegistry:
+    def __init__(self) -> None:
+        self._adapters: dict[str, BaseCLIAdapter] = {}
+
+    def register(self, adapter: BaseCLIAdapter) -> None:
+        if adapter.tool_name in self._adapters:
+            raise AdapterAlreadyRegisteredError(
+                f"Adapter '{adapter.tool_name}' is already registered."
+            )
+        self._adapters[adapter.tool_name] = adapter
+
+    def unregister(self, name: str) -> None:
+        if name not in self._adapters:
+            raise AdapterNotFoundError(f"Adapter '{name}' not found.")
+        del self._adapters[name]
+
+    def get(self, name: str) -> BaseCLIAdapter | None:
+        return self._adapters.get(name)
+
+    def list_all(self) -> list[BaseCLIAdapter]:
+        return list(self._adapters.values())
+
+    def find_by_capability(self, capability: str) -> list[BaseCLIAdapter]:
+        return [
+            adapter for adapter in self._adapters.values() if capability in adapter.capabilities
+        ]
+
+    def all_capabilities(self) -> set[str]:
+        caps: set[str] = set()
+        for adapter in self._adapters.values():
+            caps.update(adapter.capabilities)
+        return caps
+
+
+class CapabilityRouter:
+    def __init__(self, registry: AdapterRegistry) -> None:
+        self.registry = registry
+
+    def select_adapter(self, required_capabilities: set[str]) -> BaseCLIAdapter | None:
+        if not required_capabilities:
+            adapters = self.registry.list_all()
+            return adapters[0] if adapters else None
+
+        for capability in sorted(required_capabilities):
+            matches = self.registry.find_by_capability(capability)
+            if matches:
+                return matches[0]
+
+        return None
+
+    def get_best_match(self, required_capabilities: set[str]) -> BaseCLIAdapter | None:
+        if not required_capabilities:
+            adapters = self.registry.list_all()
+            return adapters[0] if adapters else None
+
+        all_adapters = self.registry.list_all()
+        if not all_adapters:
+            return None
+
+        scored: list[tuple[int, BaseCLIAdapter]] = []
+        for adapter in all_adapters:
+            overlap = len(set(adapter.capabilities) & required_capabilities)
+            if overlap > 0:
+                scored.append((overlap, adapter))
+
+        if scored:
+            scored.sort(key=lambda x: x[0], reverse=True)
+            return scored[0][1]
+
+        return all_adapters[0]  # no overlap: fall back to the first registered adapter
+
+
+@dataclass
+class MultiAgentCoordinator:
+    registry: AdapterRegistry
+    router: CapabilityRouter
+    required_steps: set[str] = field(default_factory=set)
+    _handoff_log: list[dict[str, str]] = field(default_factory=list)
+
+    def resolve_adapter_for_step(
+        self,
+        step_name: str,
+        required_capabilities: set[str],
+    ) -> BaseCLIAdapter | None:
+        adapter = self.router.get_best_match(required_capabilities)
+
+        if adapter is None and step_name in self.required_steps:
+            raise NoSuitableAdapterError(
+                f"No suitable adapter found for required step '{step_name}' "
+                f"with capabilities {required_capabilities}."
+            )
+
+        if adapter is not None:
+            self._handoff_log.append(
+                {
+                    "step": step_name,
+                    "adapter": adapter.tool_name,
+                    # Sorted so that log entries are stable across runs; joining
+                    # an empty set already yields "".
+                    "capabilities": ",".join(sorted(required_capabilities)),
+                }
+            )
+
+        return adapter
+
+    def get_handoff_log(self) -> list[dict[str, str]]:
+        return list(self._handoff_log)
+
+    def clear_handoff_log(self) -> None:
+        self._handoff_log.clear()
diff --git a/src/synapse_os/pipeline.py b/src/synapse_os/pipeline.py
index 3aac974..e090d77 100644
--- a/src/synapse_os/pipeline.py
+++ b/src/synapse_os/pipeline.py
@@ -98,6 +98,7 @@ class PipelineContext(BaseModel):
     supervisor_decisions: list[StrictStr] = Field(default_factory=list)
     validated_spec: SpecDocument | None = None
     hooks_active: list[StrictStr] = Field(default_factory=list)
+    dag: dict[str, object] = Field(default_factory=dict)
 
 
 class StepExecutor(Protocol):
@@ -309,6 +310,7 @@ def _execute_spec_validation(self, context: PipelineContext) -> None:
         context.validated_spec = spec_document
         context.artifacts["spec_id"] = spec_document.metadata.id
         context.artifacts["spec_summary"] = spec_document.metadata.summary
+        context.dag = spec_document.dag
         context.step_history.append(PipelineState.SPEC_VALIDATION)
         context.current_state = PipelineState.SPEC_VALIDATION
 
diff --git a/src/synapse_os/pipeline_dag.py b/src/synapse_os/pipeline_dag.py
new file mode 100644
index 0000000..1dbfd84
--- /dev/null
+++ b/src/synapse_os/pipeline_dag.py
@@ -0,0 +1,214 @@
+from __future__ import annotations
+
+from collections.abc import Callable
+from concurrent.futures import ThreadPoolExecutor
+from dataclasses import dataclass, field
+from enum import StrEnum
+from threading import Lock
+from typing import Any
+
+from pydantic import BaseModel, ConfigDict, Field, StrictStr
+
+
+class DAGSpecificationError(ValueError):
+    pass
+
+
+class DAGStepStatus(StrEnum):
+    PENDING = "PENDING"
+    RUNNING = "RUNNING"
+    DONE = "DONE"
+    FAILED = "FAILED"
+
+
+class DAGStep(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    id: StrictStr = Field(min_length=1)
+    executor: StrictStr = Field(min_length=1)
+    depends_on: list[StrictStr] = Field(default_factory=list)
+    if_cond: StrictStr | None = None
+
+
+class DAGConditional(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    id: StrictStr = Field(min_length=1)
+    step: StrictStr
+    if_cond: StrictStr
+
+
+class DAGSpec(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    mode: StrictStr = Field(default="linear")
+    steps: list[DAGStep] = Field(default_factory=list)
+    conditionals: list[DAGConditional] = Field(default_factory=list)
+
+
+class DAGValidator:
+    @staticmethod
+    def validate(spec: DAGSpec) -> None:
+        if spec.mode == "linear":
+            return
+        if spec.mode == "dag":
+            DAGValidator._validate_dag(spec)
+        else:
+            raise DAGSpecificationError(
+                f"Unknown DAG mode: {spec.mode!r}. Use 'linear' or 'dag'."
+            )
+
+    @staticmethod
+    def _validate_dag(spec: DAGSpec) -> None:
+        if not spec.steps:
+            raise DAGSpecificationError("DAG mode requires at least one step.")
+
+        step_ids = {step.id for step in spec.steps}
+        for step in spec.steps:
+            for dep in step.depends_on:
+                if dep not in step_ids:
+                    raise DAGSpecificationError(
+                        f"Step '{step.id}' depends on non-existent step '{dep}'."
+                    )
+        DAGValidator._check_no_cycle(spec)
+
+    @staticmethod
+    def _check_no_cycle(spec: DAGSpec) -> None:
+        in_degree: dict[str, int] = {step.id: 0 for step in spec.steps}
+        adj: dict[str, list[str]] = {step.id: [] for step in spec.steps}
+
+        for step in spec.steps:
+            for dep in step.depends_on:
+                in_degree[step.id] += 1
+                adj[dep].append(step.id)
+
+        queue: list[str] = [sid for sid, deg in in_degree.items() if deg == 0]
+        visited = 0
+        while queue:
+            node = queue.pop(0)
+            visited += 1
+            for neighbor in adj[node]:
+                in_degree[neighbor] -= 1
+                if in_degree[neighbor] == 0:
+                    queue.append(neighbor)
+
+        if visited != len(spec.steps):
+            raise DAGSpecificationError("Cycle detected in DAG graph.")
+
+
+@dataclass
+class DAGContext:
+    spec: DAGSpec
+    _states: dict[str, DAGStepStatus] = field(default_factory=dict)
+    _lock: Lock = field(default_factory=Lock)
+
+    def __post_init__(self) -> None:
+        for step in self.spec.steps:
+            self._states[step.id] = DAGStepStatus.PENDING
+
+    def get_state(self, step_id: str) -> DAGStepStatus:
+        return self._states[step_id]
+
+    def mark_running(self, step_id: str) -> None:
+        with self._lock:
+            self._states[step_id] = DAGStepStatus.RUNNING
+
+    def mark_done(self, step_id: str) -> None:
+        with self._lock:
+            self._states[step_id] = DAGStepStatus.DONE
+
+    def mark_failed(self, step_id: str) -> None:
+        with self._lock:
+            self._states[step_id] = DAGStepStatus.FAILED
+
+    def ready_steps(self) -> list[str]:
+        ready: list[str] = []
+        for step in self.spec.steps:
+            if self._states[step.id] != DAGStepStatus.PENDING:
+                continue
+            deps_done = all(
+                self._states[dep] == DAGStepStatus.DONE for dep in step.depends_on
+            )
+            if deps_done:
+                ready.append(step.id)
+        return ready
+
+    def is_complete(self) -> bool:
+        return all(
+            self._states[sid] in (DAGStepStatus.DONE, DAGStepStatus.FAILED)
+            for sid in self._states
+        )
+
+    @property
+    def has_failed(self) -> bool:
+        return any(self._states[sid] == DAGStepStatus.FAILED for sid in self._states)
+
+
+class DAGExecutor:
+    def __init__(
+        self,
+        spec: DAGSpec,
+        *,
+        max_workers: int = 4,
+        step_runner: Callable[[str, dict[str, Any]], None] | None = None,
+    ) -> None:
+        self.spec = spec
+        self.max_workers = max_workers
+        self.step_runner = step_runner or (lambda _sid, _ctx: None)
+        DAGValidator.validate(spec)
+        self.context = DAGContext(spec)
+
+    def execute(self) -> None:
+        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
+            futures: dict[str, Any] = {}
+            while not self.context.is_complete():
+                if self.context.has_failed:
+                    break
+
+                completed = [fid for fid, f in futures.items() if f.done()]
+                for fid in completed:
+                    f = futures.pop(fid)
+                    try:
+                        f.result()
+                    except Exception:
+                        pass  # failure already recorded by _run_step via mark_failed
+
+                ready = self.context.ready_steps()
+                if not ready:
+                    if not futures:
+                        break
+                    import time as _time
+
+                    _time.sleep(0.01)
+                    continue
+
+                for step_id in ready:
+                    if step_id in futures and not futures[step_id].done():
+                        continue
+                    self.context.mark_running(step_id)
+                    future = pool.submit(self._run_step, step_id)
+                    futures[step_id] = future
+
+    def _run_step(self, step_id: str) -> None:
+        try:
+            self.step_runner(step_id, {})
+            self.context.mark_done(step_id)
+        except Exception:
+            self.context.mark_failed(step_id)
+            raise
+
+
+class LinearPipelineAdapter:
+    def __init__(
+        self,
+        steps: list[str],
+        step_runner: Callable[[str, dict[str, Any]], None],
+    ) -> None:
+        self.steps = steps
+        self.step_runner = step_runner
+
+    def execute(self) -> None:
+        if not self.steps:
+            raise DAGSpecificationError("Linear pipeline requires at least one step.")
+        for step_id in self.steps:
+            self.step_runner(step_id, {})
diff --git a/src/synapse_os/plugins.py b/src/synapse_os/plugins.py
new file mode 100644
index 0000000..cfbeb36
--- /dev/null
+++ b/src/synapse_os/plugins.py
@@ -0,0 +1,147 @@
+from __future__ import annotations
+
+from collections.abc import Callable
+from dataclasses import dataclass, field
+from importlib.metadata import entry_points
+from typing import Any
+
+HOOK_TYPES = frozenset(["pre_step", "post_step", "on_run_start", "on_run_end"])
+
+
+@dataclass
+class HookSpec:
+    name: str
+    hook_type: str
+    handler: Callable[..., Any]
+
+
+@dataclass
+class PluginManifest:
+    name: str
+    version: str
+    enabled: bool = True
+    hooks: list[str] = field(default_factory=list)
+
+
+class PluginLoadError(Exception):
+    pass
+
+
+class PluginRegistry:
+    _instance: PluginRegistry | None = None
+    _initialized: bool = False
+
+    def __new__(cls) -> PluginRegistry:
+        if cls._instance is None:
+            cls._instance = super().__new__(cls)
+        return cls._instance
+
+    def __init__(self) -> None:
+        if PluginRegistry._initialized:
+            return
+        self._plugins: dict[str, PluginManifest] = {}
+        self._handlers: dict[str, list[Callable[..., Any]]] = {
+            ht: [] for ht in HOOK_TYPES
+        }
+        self._hook_map: dict[str, dict[str, Callable[..., Any]]] = {}
+        PluginRegistry._initialized = True
+
+    def register(self, manifest: PluginManifest) -> None:
+        if manifest.name in self._plugins:
+            raise PluginLoadError(f"Plugin '{manifest.name}' already registered")
+        self._plugins[manifest.name] = manifest
+
+    def unregister(self, name: str) -> None:
+        if name not in self._plugins:
+            raise PluginLoadError(f"Plugin '{name}' not found")
+        hooks = self._hook_map.pop(name, {})
+        del self._plugins[name]
+        for hook_type, handler in hooks.items():
+            if not self._is_handler_registered(
+                hook_type, handler
+            ) and handler in self._handlers.get(hook_type, []):
+                self._handlers[hook_type].remove(handler)
+
+    def get_plugin(self, name: str) -> PluginManifest | None:
+        return self._plugins.get(name)
+
+    def list_plugins(self) -> list[str]:
+        return list(self._plugins.keys())
+
+    def is_loaded(self, name: str) -> bool:
+        return name in self._plugins
+
+    def enable_plugin(self, name: str) -> None:
+        if name in self._plugins:
+            self._plugins[name].enabled = True
+
+    def disable_plugin(self, name: str) -> None:
+        if name in self._plugins:
+            self._plugins[name].enabled = False
+
+    def register_hook(
+        self, plugin_name: str, hook_type: str, handler: Callable[..., Any]
+    ) -> None:
+        if hook_type not in HOOK_TYPES:
+            raise ValueError(f"Unknown hook type: {hook_type}")
+        if plugin_name not in self._plugins:
+            raise PluginLoadError(f"Plugin '{plugin_name}' not registered")
+        if plugin_name not in self._hook_map:
+            self._hook_map[plugin_name] = {}
+        old_handler = self._hook_map[plugin_name].get(hook_type)
+        self._hook_map[plugin_name][hook_type] = handler
+        if hook_type not in self._handlers:
+            self._handlers[hook_type] = []
+        if (
+            old_handler is not None
+            and old_handler is not handler
+            and not self._is_handler_registered(hook_type, old_handler)
+            and old_handler in self._handlers.get(hook_type, [])
+        ):
+            self._handlers[hook_type].remove(old_handler)
+        if handler not in self._handlers[hook_type]:
+            self._handlers[hook_type].append(handler)
+
+    def get_handlers(self, hook_type: str) -> list[Callable[..., Any]]:
+        handlers = []
+        for hook_type_key, handler_list in self._handlers.items():
+            if hook_type_key != hook_type:
+                continue
+            for handler in handler_list:
+                if self._is_handler_enabled(hook_type, handler):
+                    handlers.append(handler)
+        return handlers
+
+    def _is_handler_registered(
+        self, hook_type: str, handler: Callable[..., Any]
+    ) -> bool:
+        for hooks in self._hook_map.values():
+            if hooks.get(hook_type) is handler:
+                return True
+        return False
+
+    def _is_handler_enabled(self, hook_type: str, handler: Callable[..., Any]) -> bool:
+        for plugin_name, hooks in self._hook_map.items():
+            if (
+                hooks.get(hook_type) is handler
+                and self._plugins.get(plugin_name, None) is not None
+            ):
+                if self._plugins[plugin_name].enabled:
+                    return True
+        return False
+
+    def load_plugins(self) -> None:
+        # On Python 3.10+ (required here, since the package uses `enum.StrEnum`),
+        # `entry_points(group=...)` already returns only the matching entries.
+        eps = entry_points(group="synapse_os.plugins")
+        for ep in eps:
+            try:
+                module = ep.load()
+                manifest = getattr(module, "hook_manifest", None)
+                if manifest is None:
+                    continue
+                manifest_obj = manifest()
+                if isinstance(manifest_obj, PluginManifest):
+                    self.register(manifest_obj)
+            except Exception:
+                pass
diff --git a/src/synapse_os/reporting.py b/src/synapse_os/reporting.py
index d2ea167..1f9edb8 100644
--- a/src/synapse_os/reporting.py
+++ b/src/synapse_os/reporting.py
@@ -4,6 +4,41 @@
 from pathlib import Path
 from typing import Protocol
 
+from pydantic import BaseModel, ConfigDict, Field
+
+
+class TimelineEntry(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    state: str
+    entered_at: float
+    duration_ms: int
+
+
+class ExecutionTimeline(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    entries: list[TimelineEntry] = Field(default_factory=list)
+
+
+class AdapterMetrics(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    tool_name: str
+    total_calls: int
+    success_count: int
+    failure_count: int
+    avg_duration_ms: float
+
+
+class StructuredError(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    error_type: str
+    message: str
+    step: str
+    count: int
+
 
 class _RunRecordProtocol(Protocol):
     initiated_by: str
@@ -42,6 +77,22 @@ class _ArtifactStoreProtocol(Protocol):
     def list_artifact_paths(self, run_id: str) -> list[str]: ...
 
 
+class RunReport(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    run_id: str
+    initiated_by: str
+    workspace_path: str
+    status: str
+    current_state: str
+    spec_hash: str | None = None
+    feature_id: str | None = None
+    feature_title: str | None = None
+    execution_timeline: ExecutionTimeline | None = None
+    adapter_metrics: list[AdapterMetrics] = Field(default_factory=list)
+    structured_errors: list[StructuredError] = Field(default_factory=list)
+
+
 class RunReportGenerator:
     def __init__(
         self,
@@ -140,7 +191,10 @@ def build(self, run_id: str) -> str:
 
     def _read_spec_artifact(self, run_id: str, artifact_name: str) -> str:
         artifact_path = (
-            self.artifact_store.base_path / run_id / "SPEC_VALIDATION" / f"{artifact_name}.txt"
+            self.artifact_store.base_path
+            / run_id
+            / "SPEC_VALIDATION"
+            / f"{artifact_name}.txt"
         )
         if not artifact_path.exists():
             return "-"
@@ -155,3 +209,80 @@ def _format_timeout(self, value: bool | None) -> str:
         if value is None:
             return "-"
         return "yes" if value else "no"
+
+    def generate_structured_report(self, run_id: str) -> RunReport:
+        run_record = self.repository.get_run(run_id)
+        step_records = self.repository.list_steps(run_id)
+        event_records = self.repository.list_events(run_id)
+        spec_id = self._read_spec_artifact(run_id, "spec_id")
+        spec_title = self._read_spec_artifact(run_id, "spec_title")
+
+        timeline_entries: list[TimelineEntry] = []
+        previous_entered_at: float | None = None
+        adapter_call_counts: dict[str, dict[str, int | float]] = {}
+
+        for event in event_records:
+            if event.event_type == "state_entered" and event.state:
+                entered_at = getattr(event, "timestamp", None)
+                if entered_at is None:
+                    entered_at = 0.0
+                duration_ms = 0
+                if previous_entered_at is not None:
+                    duration_ms = int((entered_at - previous_entered_at) * 1000)
+                timeline_entries.append(
+                    TimelineEntry(
+                        state=event.state,
+                        entered_at=entered_at,
+                        duration_ms=duration_ms,
+                    )
+                )
+                previous_entered_at = entered_at
+
+        for step in step_records:
+            tool = step.tool_name or "unknown"
+            if tool not in adapter_call_counts:
+                adapter_call_counts[tool] = {
+                    "total": 0,
+                    "success": 0,
+                    "failure": 0,
+                    "duration_sum": 0,
+                }
+            adapter_call_counts[tool]["total"] += 1
+            if step.return_code == 0:
+                adapter_call_counts[tool]["success"] += 1
+            else:
+                adapter_call_counts[tool]["failure"] += 1
+            if step.duration_ms is not None:
+                adapter_call_counts[tool]["duration_sum"] += step.duration_ms
+
+        adapter_metrics: list[AdapterMetrics] = []
+        for tool_name, counts in adapter_call_counts.items():
+            total = counts["total"]
+            avg = counts["duration_sum"] / total if total > 0 else 0.0
+            adapter_metrics.append(
+                AdapterMetrics(
+                    tool_name=tool_name,
+                    total_calls=int(total),
+                    success_count=int(counts["success"]),
+                    failure_count=int(counts["failure"]),
+                    avg_duration_ms=avg,
+                )
+            )
+
+        return RunReport(
+            run_id=run_id,
+            initiated_by=run_record.initiated_by,
+            workspace_path=run_record.workspace_path,
+            status=run_record.status,
+            current_state=run_record.current_state,
+            spec_hash=run_record.spec_hash,
+            feature_id=spec_id if spec_id != "-" else None,
+            feature_title=spec_title if spec_title != "-" else None,
+            execution_timeline=(
+                ExecutionTimeline(entries=timeline_entries)
+                if timeline_entries
+                else None
+            ),
+            adapter_metrics=adapter_metrics,
+            structured_errors=[],
+        )
diff --git a/src/synapse_os/runtime/service.py b/src/synapse_os/runtime/service.py
index 2b87974..146ce5e 100644
--- a/src/synapse_os/runtime/service.py
+++ b/src/synapse_os/runtime/service.py
@@ -6,15 +6,29 @@
 import signal
 import subprocess
 import sys
+import threading
 import time
+from collections.abc import Callable
 from pathlib import Path
+from typing import Literal
 
+from pydantic import BaseModel, ConfigDict, Field
+
+from synapse_os.runtime.circuit_breaker import AdapterCircuitBreakerStore
 from synapse_os.runtime.state import RuntimeState, RuntimeStateStore
 from synapse_os.runtime.worker import RuntimeWorker
 
 PROCESS_MARKER = "--synapse-runtime-process"
 
 
+class RuntimeLifecycleEvent(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    event: str
+    timestamp: float = Field(default_factory=time.time)
+    data: dict[str, object] = Field(default_factory=dict)
+
+
 def _runtime_process_code() -> str:
     return (
         "import signal\n"
@@ -78,7 +92,6 @@ def handle_shutdown(signum: int, frame: object) -> None:
         previous_sigterm = signal.signal(signal.SIGTERM, handle_shutdown)
         previous_sigint = signal.signal(signal.SIGINT, handle_shutdown)
 
-        # This is the minimal resident process for the Synapse-Flow runtime.
         self.state_store.write_running(
             os.getpid(),
             process_identity,
@@ -188,3 +201,97 @@ def _is_foreground_runtime_process(arguments: list[str], process_identity: str)
         and "--process-identity" in arguments
         and process_identity in arguments
     )
+
+
+class _InterruptibleHandler:
+    def __init__(self, handler: Callable[[], None], timeout: float) -> None:
+        self.handler = handler
+        self.timeout = timeout
+        self.thread: threading.Thread | None = None
+        self.exc: BaseException | None = None
+
+    def start(self) -> None:
+        self.thread = threading.Thread(target=self._run, daemon=True)
+        self.thread.start()
+
+    def _run(self) -> None:
+        try:
+            self.handler()
+        except BaseException as e:
+            self.exc = e
+
+    def join(self, timeout: float) -> None:
+        if self.thread is None:
+            return
+        self.thread.join(timeout=timeout)
+
+    def cancel(self) -> None:
+        """No-op: threads cannot be force-stopped, so the daemon is abandoned."""
+
+    def is_alive(self) -> bool:
+        return self.thread is not None and self.thread.is_alive()
+
+
+class RuntimeCoordinator:
+    def __init__(
+        self,
+        circuit_breaker_store: AdapterCircuitBreakerStore | None = None,
+    ) -> None:
+        self.circuit_breaker_store = circuit_breaker_store or AdapterCircuitBreakerStore(
+            Path(".synapse-os/runtime/circuit-breakers.json")
+        )
+        self.lifecycle_events: list[RuntimeLifecycleEvent] = []
+        self._cleanup_handlers: list[Callable[[], None]] = []
+
+    def health_status(self) -> Literal["HEALTHY", "DEGRADED", "UNHEALTHY"]:
+        open_adapters = [
+            tool for tool in self._registered_tools() if self.circuit_breaker_store.is_open(tool)
+        ]
+        if not open_adapters:
+            return "HEALTHY"
+        if len(open_adapters) == 1:
+            return "DEGRADED"
+        return "UNHEALTHY"
+
+    def lifecycle_event(self, event: str, data: dict[str, object] | None = None) -> None:
+        self.lifecycle_events.append(RuntimeLifecycleEvent(event=event, data=data or {}))
+
+    def register_cleanup_handler(self, handler: Callable[[], None]) -> None:
+        self._cleanup_handlers.append(handler)
+
+    def run_cleanup_handlers(self) -> None:
+        for handler in self._cleanup_handlers:
+            try:
+                handler()
+            except Exception:
+                pass
+
+    def graceful_shutdown(self, timeout_seconds: float = 5.0) -> None:
+        self.lifecycle_event("runtime.stopping")
+        deadline = time.monotonic() + timeout_seconds
+        remaining = timeout_seconds
+
+        for handler in self._cleanup_handlers:
+            if remaining <= 0:
+                break
+            thread = _InterruptibleHandler(handler, remaining)
+            thread.start()
+            thread.join(timeout=remaining)
+            if thread.is_alive():
+                thread.cancel()
+            remaining = max(deadline - time.monotonic(), 0.0)
+
+        self._stop()
+        self.lifecycle_event("runtime.stopped")
+
+    @property
+    def degraded_adapters(self) -> set[str]:
+        return {
+            tool for tool in self._registered_tools() if self.circuit_breaker_store.is_open(tool)
+        }
+
+    def _registered_tools(self) -> list[str]:
+        return ["codex", "gemini", "copilot"]
+
+    def _stop(self) -> None:
+        pass
diff --git a/src/synapse_os/specs/validator.py b/src/synapse_os/specs/validator.py
index 777368b..d3755cb 100644
--- a/src/synapse_os/specs/validator.py
+++ b/src/synapse_os/specs/validator.py
@@ -27,6 +27,7 @@ class SpecMetadata(BaseModel):
     acceptance_criteria: list[str] = Field(min_length=1)
     non_goals: list[str]
     hooks: list[HookConfig] = Field(default_factory=list)
+    dag: dict[str, object] = Field(default_factory=dict)
 
 
 class SpecDocument(BaseModel):
@@ -35,15 +36,17 @@ class SpecDocument(BaseModel):
     metadata: SpecMetadata
     sections: dict[str, str]
     body: str
+    dag: dict[str, object] = Field(default_factory=dict)
 
 
 def validate_spec_file(path: Path) -> SpecDocument:
     text = path.read_text(encoding="utf-8")
     metadata_block, body = _split_front_matter(text)
     metadata = _load_metadata(metadata_block)
+    dag = _load_dag(metadata_block)
     sections = _parse_sections(body)
     _require_sections(sections, required_sections=("Contexto", "Objetivo"))
-    return SpecDocument(metadata=metadata, sections=sections, body=body.strip())
+    return SpecDocument(metadata=metadata, sections=sections, body=body.strip(), dag=dag)
 
 
 def _split_front_matter(text: str) -> tuple[str, str]:
@@ -78,6 +81,17 @@ def _load_metadata(metadata_block: str) -> SpecMetadata:
         raise SpecValidationError(f"SPEC metadata is invalid: {message}") from exc
 
 
+def _load_dag(metadata_block: str) -> dict[str, object]:
+    try:
+        raw = yaml.safe_load(metadata_block)
+    except yaml.YAMLError as exc:
+        raise SpecValidationError("SPEC front matter YAML is invalid.") from exc
+
+    if not isinstance(raw, dict):
+        return {}
+    return raw.get("dag", {}) or {}
+
+
 def _validate_hooks_in_raw_metadata(raw_metadata: dict[str, object]) -> None:
     if "hooks" not in raw_metadata:
         return
diff --git a/src/synapse_os/supervisor.py b/src/synapse_os/supervisor.py
index 52fd5a7..944d8be 100644
--- a/src/synapse_os/supervisor.py
+++ b/src/synapse_os/supervisor.py
@@ -3,6 +3,7 @@
 from pydantic import BaseModel, ConfigDict, Field, StrictInt, StrictStr
 
 RETRYABLE_STATES = frozenset({"PLAN", "TEST_RED", "CODE_GREEN"})
+TERMINAL_STATES = frozenset({"SECURITY", "SPEC_VALIDATION"})
 
 
 class RetryableStepError(RuntimeError):
@@ -13,6 +14,46 @@ class ReviewRejectedError(RuntimeError):
     """Signals that REVIEW requested rework and must return to CODE_GREEN."""
 
 
+class AdapterOperationalError(RuntimeError):
+    """Marks an adapter operational failure with a category."""
+
+    def __init__(self, message: str, category: str) -> None:
+        super().__init__(message)
+        self.category = category
+
+
+class RetryPolicy(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    max_retries: StrictInt = Field(default=2, ge=0)
+    base_delay_seconds: float = Field(default=1.0, ge=0)
+    max_delay_seconds: float = Field(default=60.0, ge=0)
+
+
+class StepPolicy(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    step_name: StrictStr
+    retry: RetryPolicy = Field(default_factory=RetryPolicy)
+
+
+class SupervisorPolicies(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    default: RetryPolicy = Field(default_factory=RetryPolicy)
+    step_overrides: dict[str, StepPolicy] = Field(default_factory=dict)
+
+    def resolve_for_step(self, step_name: str) -> RetryPolicy:
+        if step_name in self.step_overrides:
+            return self.step_overrides[step_name].retry
+        return self.default
+
+
+def calculate_backoff(attempt: int, base_delay: float, max_delay: float) -> float:
+    delay = base_delay * (2 ** (attempt - 1))
+    return float(min(delay, max_delay))
+
+
 class SupervisorDecision(BaseModel):
     model_config = ConfigDict(strict=True)
 
@@ -20,6 +61,7 @@ class SupervisorDecision(BaseModel):
     next_state: StrictStr
     route: StrictStr | None = None
     reason: StrictStr | None = None
+    backoff_seconds: float | None = None
 
 
 class Supervisor(BaseModel):
@@ -84,3 +126,86 @@ def decide_after_review_rejection(self) -> SupervisorDecision:
             next_state="CODE_GREEN",
             reason="review_requested_rework",
         )
+
+
+class AdvancedSupervisor(BaseModel):
+    model_config = ConfigDict(strict=True)
+
+    policies: SupervisorPolicies = Field(default_factory=SupervisorPolicies)
+
+    def _resolve_policy(self, state: str) -> RetryPolicy:
+        return self.policies.resolve_for_step(state)
+
+    def _is_terminal_state(self, state: str) -> bool:
+        return state in TERMINAL_STATES
+
+    def _is_short_circuit(self, error: Exception) -> bool:
+        if isinstance(error, AdapterOperationalError):
+            category = getattr(error, "category", None) or getattr(error, "reason", None)
+            return category == "launcher_unavailable"
+        return False
+
+    def decide_after_failure(
+        self,
+        *,
+        state: str,
+        error: Exception,
+        attempt: int,
+        available_routes: tuple[str, ...],
+    ) -> SupervisorDecision:
+        primary_route = available_routes[0] if available_routes else None
+        fallback_route = available_routes[1] if len(available_routes) > 1 else None
+
+        if self._is_terminal_state(state):
+            reason = f"{state.lower()}_is_terminal"
+            return SupervisorDecision(
+                action="fail",
+                next_state=state,
+                reason=reason,
+            )
+
+        if isinstance(error, ReviewRejectedError) and state == "REVIEW":
+            return SupervisorDecision(
+                action="return_to_code_green",
+                next_state="CODE_GREEN",
+                reason="review_requested_rework",
+            )
+
+        if self._is_short_circuit(error) and fallback_route is not None:
+            return SupervisorDecision(
+                action="reroute",
+                next_state=state,
+                route=fallback_route,
+                reason="operational_error_short_circuit",
+            )
+
+        if (
+            isinstance(error, (RetryableStepError, AdapterOperationalError))
+            and state in RETRYABLE_STATES
+        ):
+            policy = self._resolve_policy(state)
+            if attempt <= policy.max_retries:
+                backoff = calculate_backoff(
+                    attempt, policy.base_delay_seconds, policy.max_delay_seconds
+                )
+                return SupervisorDecision(
+                    action="retry",
+                    next_state=state,
+                    route=primary_route,
+                    reason="retryable_failure_with_budget",
+                    backoff_seconds=backoff,
+                )
+            if fallback_route is not None:
+                return SupervisorDecision(
+                    action="reroute",
+                    next_state=state,
+                    route=fallback_route,
+                    reason="retry_budget_exhausted_with_fallback",
+                )
+
+        return SupervisorDecision(
+            action="fail",
+            next_state=state,
+            route=primary_route,
+            reason="terminal_failure",
+        )
diff --git a/src/synapse_os/workspace.py b/src/synapse_os/workspace.py
new file mode 100644
index 0000000..85def28
--- /dev/null
+++ b/src/synapse_os/workspace.py
@@ -0,0 +1,141 @@
+from __future__ import annotations
+
+import shutil
+from collections.abc import Callable
+from enum import StrEnum
+from pathlib import Path
+from typing import Any
+
+from pydantic import BaseModel, Field
+
+
+class WorkspaceState(StrEnum):
+    CREATING = "creating"
+    READY = "ready"
+    BUSY = "busy"
+    CLEANUP = "cleanup"
+    DESTROYED = "destroyed"
+
+
+class TrackedWorkspace(BaseModel):
+    root: Path
+    state: WorkspaceState = WorkspaceState.CREATING
+    run_id: str | None = None
+    metadata: dict[str, Any] = Field(default_factory=dict)
+
+    def mark_ready(self, run_id: str) -> None:
+        self.state = WorkspaceState.READY
+        self.run_id = run_id
+
+    def mark_busy(self) -> None:
+        self.state = WorkspaceState.BUSY
+
+    def mark_cleanup(self) -> None:
+        self.state = WorkspaceState.CLEANUP
+
+    def mark_destroyed(self) -> None:
+        self.state = WorkspaceState.DESTROYED
+
+    def reset_for_reuse(self) -> None:
+        self.state = WorkspaceState.CREATING
+        self.run_id = None
+        self.metadata = {}
+        # Remove every entry under the root; the root directory itself is kept.
+        for item in self.root.iterdir():
+            if item.is_dir() and not item.is_symlink():
+                shutil.rmtree(item)
+            else:
+                item.unlink()
+
+    def set_metadata(self, key: str, value: Any) -> None:
+        self.metadata[key] = value
+
+    def get_metadata(self, key: str, default: Any = None) -> Any:
+        return self.metadata.get(key, default)
+
+
+class PoolExhaustedError(Exception):
+    pass
+
+
+class WorkspacePool(BaseModel):
+    base_dir: Path
+    max_size: int
+    acquired_count: int = 0
+    idle_workspaces: list[TrackedWorkspace] = Field(default_factory=list)
+    workspace_counter: int = Field(default=0)
+
+    def acquire(self, run_id: str) -> TrackedWorkspace:
+        if self.idle_workspaces:
+            ws = self.idle_workspaces.pop(0)
+            ws.mark_ready(run_id)
+            self.acquired_count += 1
+            return ws
+        if self.acquired_count >= self.max_size:
+            raise PoolExhaustedError(f"Pool exhausted: {self.acquired_count}/{self.max_size}")
+        self.workspace_counter += 1
+        ws_root = self.base_dir / f"ws-{self.workspace_counter}"
+        ws_root.mkdir(parents=True, exist_ok=True)
+        ws = TrackedWorkspace(root=ws_root)
+        ws.mark_ready(run_id)
+        self.acquired_count += 1
+        return ws
+
+    def release(self, ws: TrackedWorkspace) -> None:
+        ws.reset_for_reuse()
+        ws.state = WorkspaceState.READY
+        self.idle_workspaces.append(ws)
+        self.acquired_count -= 1
+
+    def discard(self, ws: TrackedWorkspace) -> None:
+        was_idle = ws in self.idle_workspaces
+        if was_idle:
+            self.idle_workspaces.remove(ws)
+        if ws.root.exists():
+            shutil.rmtree(ws.root)
+        ws.mark_destroyed()
+        if not was_idle:
+            self.acquired_count -= 1
+
+    @property
+    def idle_count(self) -> int:
+        return len(self.idle_workspaces)
+
+    def stats(self) -> dict[str, int]:
+        return {
+            "total": self.max_size,
+            "acquired": self.acquired_count,
+            "idle": self.idle_count,
+            "discarded": self.max_size - self.acquired_count - self.idle_count,
+        }
+
+
+class WorkspaceManager:
+    def __init__(self, base_dir: Path, pool_size: int) -> None:
+        self.base_dir = base_dir
+        self.pool = WorkspacePool(base_dir=base_dir / ".workspace_pool", max_size=pool_size)
+        self._cache: dict[str, TrackedWorkspace] = {}
+        self._cleanup_hooks: list[Callable[[Path], None]] = []
+
+    def create_workspace(self, run_id: str) -> TrackedWorkspace:
+        ws = self.pool.acquire(run_id)
+        self._cache[run_id] = ws
+        return ws
+
+    def register_cleanup_hook(self, hook: Callable[[Path], None]) -> None:
+        self._cleanup_hooks.append(hook)
+
+    def cleanup_workspace(self, ws: TrackedWorkspace) -> None:
+        ws.mark_cleanup()
+        for hook in self._cleanup_hooks:
+            hook(ws.root)
+
+    def get_workspace(self, run_id: str) -> TrackedWorkspace | None:
+        return self._cache.get(run_id)
+
+    def list_workspaces(self) -> list[TrackedWorkspace]:
+        return list(self._cache.values())
+
+    def cleanup_all(self) -> None:
+        for ws in list(self._cache.values()):
+            self.cleanup_workspace(ws)
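
Review note on `WorkspacePool`: the accounting in `acquire` and `release` can be hard to follow from the diff alone, so here is a minimal standalone sketch of the same idea. It assumes a simplified pool that tracks bare directories instead of `TrackedWorkspace` objects; `MiniPool`, `PoolExhausted`, and `demo` are hypothetical names used only for illustration and are not part of `synapse_os`.

```python
import shutil
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


class PoolExhausted(Exception):
    """Raised when nothing is idle and the acquisition cap is reached."""


@dataclass
class MiniPool:
    # Hypothetical stand-in for WorkspacePool, reduced to directory bookkeeping.
    base_dir: Path
    max_size: int
    acquired: int = 0
    counter: int = 0
    idle: list[Path] = field(default_factory=list)

    def acquire(self) -> Path:
        if self.idle:
            # Reuse the oldest idle directory before creating a new one.
            root = self.idle.pop(0)
        elif self.acquired >= self.max_size:
            raise PoolExhausted(f"{self.acquired}/{self.max_size}")
        else:
            self.counter += 1
            root = self.base_dir / f"ws-{self.counter}"
            root.mkdir(parents=True, exist_ok=True)
        self.acquired += 1
        return root

    def release(self, root: Path) -> None:
        # Empty the directory but keep it, so the next acquire can reuse it.
        for item in root.iterdir():
            if item.is_dir() and not item.is_symlink():
                shutil.rmtree(item)
            else:
                item.unlink()
        self.idle.append(root)
        self.acquired -= 1


def demo() -> dict[str, object]:
    with tempfile.TemporaryDirectory() as tmp:
        pool = MiniPool(base_dir=Path(tmp), max_size=2)
        first = pool.acquire()
        pool.acquire()
        try:
            pool.acquire()
            exhausted = False
        except PoolExhausted:
            exhausted = True
        (first / "scratch.txt").write_text("leftover")
        pool.release(first)
        reused = pool.acquire()
        return {
            "exhausted": exhausted,
            "reused_same_dir": reused == first,
            "left_clean": not any(reused.iterdir()),
            "acquired": pool.acquired,
        }
```

In this sketch, as in the diff, the cap applies to workspaces currently handed out, and released directories are reused in first-in, first-out order.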
diff --git a/tests/unit/test_control_plane.py b/tests/unit/test_control_plane.py
new file mode 100644
index 0000000..2bbf9e6
--- /dev/null
+++ b/tests/unit/test_control_plane.py
@@ -0,0 +1,521 @@
+"""Tests for the Local Control Plane (F60)."""
+
+from unittest.mock import MagicMock
+
+import pytest
+from httpx import ASGITransport, AsyncClient
+
+from synapse_os.control_plane.server import create_app
+from synapse_os.persistence import RunRecord, RunStepRecord
+
+
+class TestHealthEndpoint:
+    """Tests for GET /health endpoint."""
+
+    @pytest.mark.asyncio
+    async def test_health_returns_ok_when_runtime_running(self):
+        runtime_service = MagicMock()
+        runtime_service.ready.return_value = True
+
+        app = create_app(runtime_service=runtime_service, api_token=None)
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/health")
+
+        assert response.status_code == 200
+        data = response.json()
+        assert data["status"] == "ok"
+        assert data["runtime"] == "running"
+
+    @pytest.mark.asyncio
+    async def test_health_returns_ok_when_runtime_stopped(self):
+        runtime_service = MagicMock()
+        runtime_service.ready.return_value = False
+
+        app = create_app(runtime_service=runtime_service, api_token=None)
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/health")
+
+        assert response.status_code == 200
+        data = response.json()
+        assert data["status"] == "ok"
+        assert data["runtime"] == "stopped"
+
+    @pytest.mark.asyncio
+    async def test_health_is_public_no_auth_required(self):
+        runtime_service = MagicMock()
+        runtime_service.ready.return_value = True
+
+        app = create_app(runtime_service=runtime_service, api_token="secret-token")
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/health")
+
+        assert response.status_code == 200
+
+
+class TestAuthMiddleware:
+    """Tests for API token authentication middleware."""
+
+    @pytest.mark.asyncio
+    async def test_returns_401_without_token_when_auth_enabled(self):
+        runtime_service = MagicMock()
+        app = create_app(runtime_service=runtime_service, api_token="secret-token")
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/runs")
+
+        assert response.status_code == 401
+
+    @pytest.mark.asyncio
+    async def test_returns_401_with_invalid_token(self):
+        runtime_service = MagicMock()
+        app = create_app(runtime_service=runtime_service, api_token="secret-token")
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get(
+                "/api/v1/runs",
+                headers={"Authorization": "Bearer wrong-token"},
+            )
+
+        assert response.status_code == 401
+
+    @pytest.mark.asyncio
+    async def test_allows_request_with_valid_token(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        run_repo.list_runs.return_value = []
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token="secret-token",
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get(
+                "/api/v1/runs",
+                headers={"Authorization": "Bearer secret-token"},
+            )
+
+        assert response.status_code == 200
+
+    @pytest.mark.asyncio
+    async def test_no_auth_required_when_token_not_configured(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        run_repo.list_runs.return_value = []
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/runs")
+
+        assert response.status_code == 200
+
+
+class TestListRunsEndpoint:
+    """Tests for GET /api/v1/runs endpoint."""
+
+    def _make_run_record(self, run_id, status="pending", spec_path="/tmp/spec.md"):
+        return RunRecord(
+            run_id=run_id,
+            spec_path=spec_path,
+            workspace_path="/tmp/workspace",
+            spec_hash=None,
+            initiated_by="test",
+            stop_at="COMPLETE",
+            status=status,
+            current_state="REQUEST",
+            locked=False,
+            failure_message=None,
+            created_at="2026-03-31T10:00:00Z",
+            updated_at="2026-03-31T10:00:00Z",
+            completed_at=None,
+        )
+
+    @pytest.mark.asyncio
+    async def test_returns_empty_list_when_no_runs(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        run_repo.list_runs.return_value = []
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/runs")
+
+        assert response.status_code == 200
+        data = response.json()
+        assert data["runs"] == []
+        assert data["total"] == 0
+
+    @pytest.mark.asyncio
+    async def test_returns_paginated_runs(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        mock_run = self._make_run_record("run-1", "completed", "/tmp/spec.md")
+        run_repo.list_runs.return_value = [mock_run]
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/runs?limit=10&offset=0")
+
+        assert response.status_code == 200
+        data = response.json()
+        assert len(data["runs"]) == 1
+        assert data["total"] == 1
+        assert data["limit"] == 10
+        assert data["offset"] == 0
+        assert data["runs"][0]["id"] == "run-1"
+        assert data["runs"][0]["status"] == "completed"
+
+    @pytest.mark.asyncio
+    async def test_truncates_long_prompt(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        long_path = "/tmp/" + "x" * 500 + ".md"
+        mock_run = self._make_run_record("run-1", "pending", long_path)
+        run_repo.list_runs.return_value = [mock_run]
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/runs")
+
+        data = response.json()
+        assert len(data["runs"][0]["prompt"]) <= 100
+
+
+class TestCreateRunEndpoint:
+    """Tests for POST /api/v1/runs endpoint."""
+
+    @pytest.mark.asyncio
+    async def test_creates_run_with_valid_prompt(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        run_repo.create_run.return_value = "new-run-123"
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.post(
+                "/api/v1/runs",
+                json={"prompt": "implement a sorting algorithm"},
+            )
+
+        assert response.status_code == 201
+        data = response.json()
+        assert data["run_id"] == "new-run-123"
+        assert data["status"] == "pending"
+        run_repo.create_run.assert_called_once()
+
+    @pytest.mark.asyncio
+    async def test_rejects_empty_prompt(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.post("/api/v1/runs", json={"prompt": ""})
+
+        assert response.status_code == 422
+
+    @pytest.mark.asyncio
+    async def test_rejects_missing_prompt(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.post("/api/v1/runs", json={})
+
+        assert response.status_code == 422
+
+
+class TestRunDetailEndpoint:
+    """Tests for GET /api/v1/runs/{run_id} endpoint."""
+
+    def _make_run_record(self, run_id, status="completed"):
+        return RunRecord(
+            run_id=run_id,
+            spec_path="/tmp/spec.md",
+            workspace_path="/tmp/workspace",
+            spec_hash=None,
+            initiated_by="test",
+            stop_at="COMPLETE",
+            status=status,
+            current_state="COMPLETE",
+            locked=False,
+            failure_message=None,
+            created_at="2026-03-31T10:00:00Z",
+            updated_at="2026-03-31T11:00:00Z",
+            completed_at="2026-03-31T11:00:00Z",
+        )
+
+    @pytest.mark.asyncio
+    async def test_returns_run_detail(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        mock_run = self._make_run_record("run-1", "completed")
+        run_repo.get_run.return_value = mock_run
+        run_repo.list_steps.return_value = [
+            RunStepRecord(
+                step_id=1,
+                run_id="run-1",
+                state="SPEC",
+                status="completed",
+                raw_output_path=None,
+                clean_output_path=None,
+                tool_name=None,
+                return_code=0,
+                duration_ms=100,
+                timed_out=False,
+                created_at="2026-03-31T10:00:00Z",
+            ),
+        ]
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/runs/run-1")
+
+        assert response.status_code == 200
+        data = response.json()
+        assert data["id"] == "run-1"
+        assert data["status"] == "completed"
+
+    @pytest.mark.asyncio
+    async def test_returns_404_for_nonexistent_run(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        run_repo.get_run.side_effect = Exception("no rows found")
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/runs/nonexistent")
+
+        assert response.status_code == 404
+
+
+class TestCancelRunEndpoint:
+    """Tests for POST /api/v1/runs/{run_id}/cancel endpoint."""
+
+    def _make_run_record(self, run_id, status="pending"):
+        return RunRecord(
+            run_id=run_id,
+            spec_path="/tmp/spec.md",
+            workspace_path="/tmp/workspace",
+            spec_hash=None,
+            initiated_by="test",
+            stop_at="COMPLETE",
+            status=status,
+            current_state="REQUEST",
+            locked=False,
+            failure_message=None,
+            created_at="2026-03-31T10:00:00Z",
+            updated_at="2026-03-31T10:00:00Z",
+            completed_at=None,
+        )
+
+    @pytest.mark.asyncio
+    async def test_cancels_pending_run(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        mock_run = self._make_run_record("run-1", "pending")
+        run_repo.get_run.return_value = mock_run
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.post("/api/v1/runs/run-1/cancel")
+
+        assert response.status_code == 200
+        run_repo.mark_run_cancelling.assert_called_once_with("run-1")
+
+    @pytest.mark.asyncio
+    async def test_returns_409_for_completed_run(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        mock_run = self._make_run_record("run-1", "completed")
+        run_repo.get_run.return_value = mock_run
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.post("/api/v1/runs/run-1/cancel")
+
+        assert response.status_code == 409
+
+    @pytest.mark.asyncio
+    async def test_returns_409_for_failed_run(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        mock_run = self._make_run_record("run-1", "failed")
+        run_repo.get_run.return_value = mock_run
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.post("/api/v1/runs/run-1/cancel")
+
+        assert response.status_code == 409
+
+    @pytest.mark.asyncio
+    async def test_returns_404_for_nonexistent_run(self):
+        runtime_service = MagicMock()
+        run_repo = MagicMock()
+        run_repo.get_run.side_effect = Exception("not found")
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.post("/api/v1/runs/nonexistent/cancel")
+
+        assert response.status_code == 404
+
+
+class TestRuntimeStatusEndpoint:
+    """Tests for GET /api/v1/runtime/status endpoint."""
+
+    @pytest.mark.asyncio
+    async def test_returns_runtime_status(self):
+        runtime_service = MagicMock()
+        mock_state = MagicMock()
+        mock_state.status = "running"
+        mock_state.pid = 12345
+        mock_state.started_at = "2026-03-31T10:00:00+00:00"
+        runtime_service.current_state.return_value = mock_state
+
+        run_repo = MagicMock()
+        run_repo.list_unlocked_pending_runs.return_value = [
+            MagicMock(),
+            MagicMock(),
+            MagicMock(),
+        ]
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            run_repository=run_repo,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/runtime/status")
+
+        assert response.status_code == 200
+        data = response.json()
+        assert data["pid"] == 12345
+        assert data["state"] == "running"
+        assert data["pending_runs"] == 3
+
+
+class TestArtifactsEndpoint:
+    """Tests for GET /api/v1/artifacts/{run_id} endpoint."""
+
+    @pytest.mark.asyncio
+    async def test_lists_artifacts_for_run(self):
+        runtime_service = MagicMock()
+        artifact_store = MagicMock()
+        artifact_store.list_artifact_paths.return_value = [
+            "run1/SPEC.md",
+            "run1/main.py",
+        ]
+
+        artifact_store.base_path = MagicMock()
+        mock_stat = MagicMock()
+        mock_stat.st_size = 1024
+        mock_stat.st_mtime = 1743405600.0
+        mock_path = MagicMock()
+        mock_path.stat.return_value = mock_stat
+        mock_path.name = "SPEC.md"
+        artifact_store.base_path.__truediv__ = MagicMock(return_value=mock_path)
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            artifact_store=artifact_store,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/artifacts/run-1")
+
+        assert response.status_code == 200
+        data = response.json()
+        assert len(data["artifacts"]) == 2
+
+    @pytest.mark.asyncio
+    async def test_returns_404_for_nonexistent_run_artifacts(self):
+        runtime_service = MagicMock()
+        artifact_store = MagicMock()
+        artifact_store.list_artifact_paths.side_effect = FileNotFoundError(
+            "run not found"
+        )
+
+        app = create_app(
+            runtime_service=runtime_service,
+            api_token=None,
+            artifact_store=artifact_store,
+        )
+        transport = ASGITransport(app=app)
+        async with AsyncClient(transport=transport, base_url="http://test") as client:
+            response = await client.get("/api/v1/artifacts/nonexistent")
+
+        assert response.status_code == 404
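
Review note on the auth tests: the behaviour pinned down by `TestHealthEndpoint` and `TestAuthMiddleware` can be summarised as one decision function. The sketch below is a guess at that decision, not the middleware in `synapse_os.control_plane.server`; `is_authorized` and `PUBLIC_PATHS` are hypothetical names, and the constant-time comparison is an assumption about what a token check would reasonably use.

```python
from __future__ import annotations

import hmac

# Assumption: /health is the only route that bypasses authentication.
PUBLIC_PATHS = frozenset({"/health"})


def is_authorized(path: str, authorization: str | None, api_token: str | None) -> bool:
    """Return True when a request may proceed (hypothetical helper)."""
    # No configured token means auth is disabled; public paths are always open.
    if api_token is None or path in PUBLIC_PATHS:
        return True
    if authorization is None:
        return False
    scheme, _, presented = authorization.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.encode(), api_token.encode())
```

A request for which this returns `False` corresponds to the 401 responses asserted above.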
diff --git a/tests/unit/test_copilot_adapter.py b/tests/unit/test_copilot_adapter.py
new file mode 100644
index 0000000..9ba4e73
--- /dev/null
+++ b/tests/unit/test_copilot_adapter.py
@@ -0,0 +1,135 @@
+from __future__ import annotations
+
+import pytest
+
+from synapse_os.adapters import (
+    CopilotCLIAdapter,
+    CLIExecutionResult,
+    classify_copilot_execution,
+)
+
+
+class TestCopilotCLIAdapter:
+    def test_capabilities(self) -> None:
+        adapter = CopilotCLIAdapter()
+        assert adapter.capabilities == ("cli_execution", "code_generation")
+
+    def test_tool_spec_name(self) -> None:
+        adapter = CopilotCLIAdapter()
+        assert adapter.tool_spec.name == "copilot"
+
+    def test_build_command(self) -> None:
+        adapter = CopilotCLIAdapter()
+        cmd = adapter.build_command("write a hello world in python")
+        assert "gh" in cmd
+        assert "copilot" in cmd
+        assert "write a hello world in python" in cmd
+
+    def test_build_command_empty_prompt_raises(self) -> None:
+        adapter = CopilotCLIAdapter()
+        with pytest.raises(ValueError, match="empty"):
+            adapter.build_command("   ")
+
+
+class TestClassifyCopilotExecution:
+    def test_success(self) -> None:
+        result = CLIExecutionResult(
+            tool_name="copilot",
+            command=["gh", "copilot", "ai"],
+            return_code=0,
+            stdout_raw="def hello(): pass\n",
+            stderr_raw="",
+            stdout_clean="def hello(): pass\n",
+            stderr_clean="",
+            duration_ms=500,
+            timed_out=False,
+            success=True,
+        )
+        assessment = classify_copilot_execution(result)
+        assert assessment.category == "success"
+        assert not assessment.is_operational_block
+
+    def test_timeout(self) -> None:
+        result = CLIExecutionResult(
+            tool_name="copilot",
+            command=["gh", "copilot", "ai"],
+            return_code=-1,
+            stdout_raw="",
+            stderr_raw="",
+            stdout_clean="",
+            stderr_clean="",
+            duration_ms=30000,
+            timed_out=True,
+            success=False,
+        )
+        assessment = classify_copilot_execution(result)
+        assert assessment.category == "timeout"
+        assert not assessment.is_operational_block
+
+    def test_return_code_nonzero(self) -> None:
+        result = CLIExecutionResult(
+            tool_name="copilot",
+            command=["gh", "copilot", "ai"],
+            return_code=1,
+            stdout_raw="",
+            stderr_raw="Something went wrong.",
+            stdout_clean="",
+            stderr_clean="Something went wrong.",
+            duration_ms=200,
+            timed_out=False,
+            success=False,
+        )
+        assessment = classify_copilot_execution(result)
+        assert assessment.category == "return_code_nonzero"
+        assert not assessment.is_operational_block
+
+    def test_authentication_unavailable(self) -> None:
+        result = CLIExecutionResult(
+            tool_name="copilot",
+            command=["gh", "copilot", "ai"],
+            return_code=1,
+            stdout_raw="",
+            stderr_raw="Error: authenticated required",
+            stdout_clean="",
+            stderr_clean="Error: authenticated required",
+            duration_ms=100,
+            timed_out=False,
+            success=False,
+        )
+        assessment = classify_copilot_execution(result)
+        assert assessment.category == "authentication_unavailable"
+        assert assessment.is_operational_block
+
+    def test_launcher_unavailable(self) -> None:
+        result = CLIExecutionResult(
+            tool_name="copilot",
+            command=["gh", "copilot", "ai"],
+            return_code=127,
+            stdout_raw="",
+            stderr_raw="gh: command not found",
+            stdout_clean="",
+            stderr_clean="gh: command not found",
+            duration_ms=50,
+            timed_out=False,
+            success=False,
+        )
+        assessment = classify_copilot_execution(result)
+        assert assessment.category == "launcher_unavailable"
+        assert assessment.is_operational_block
+
+    def test_circuit_open(self) -> None:
+        result = CLIExecutionResult(
+            tool_name="copilot",
+            command=["gh", "copilot", "ai"],
+            return_code=75,
+            stdout_raw="",
+            stderr_raw="circuit breaker open for copilot.\n",
+            stdout_clean="",
+            stderr_clean="circuit breaker open for copilot.",
+            duration_ms=0,
+            timed_out=False,
+            success=False,
+        )
+        assessment = classify_copilot_execution(result)
+        assert assessment.category == "circuit_open"
+        assert assessment.is_operational_block
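
Review note on `classify_copilot_execution`: the six fixtures above imply an ordering between the categories, which the sketch below makes explicit. It is a reconstruction from the test data only. The function name `classify`, the `Assessment` class, and the stderr markers are assumptions; the real classifier in `synapse_os.adapters` may match different strings or apply a different precedence.

```python
from __future__ import annotations

from dataclasses import dataclass

# Categories that the tests above expect to be flagged as operational blocks.
OPERATIONAL_BLOCKS = frozenset(
    {"authentication_unavailable", "launcher_unavailable", "circuit_open"}
)


@dataclass(frozen=True)
class Assessment:
    category: str

    @property
    def is_operational_block(self) -> bool:
        return self.category in OPERATIONAL_BLOCKS


def classify(return_code: int, stderr: str, timed_out: bool) -> Assessment:
    """Hypothetical classifier; markers are inferred from the test fixtures."""
    text = stderr.lower()
    if timed_out:
        return Assessment("timeout")
    if return_code == 0:
        return Assessment("success")
    if "circuit breaker open" in text:
        return Assessment("circuit_open")
    # Exit status 127 is the shell convention for "command not found".
    if return_code == 127 or "command not found" in text:
        return Assessment("launcher_unavailable")
    if "authenticat" in text:
        return Assessment("authentication_unavailable")
    return Assessment("return_code_nonzero")
```

Checking the specific markers before the generic nonzero fallback is what keeps an authentication failure (exit code 1) from being reported as `return_code_nonzero`.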
diff --git a/tests/unit/test_hooks_cli.py b/tests/unit/test_hooks_cli.py
index abe28f6..b18bf21 100644
--- a/tests/unit/test_hooks_cli.py
+++ b/tests/unit/test_hooks_cli.py
@@ -14,10 +14,7 @@ class TestHooksListCommand:
     def test_hooks_list_no_hooks(self) -> None:
         result = runner.invoke(app, ["hooks", "list"])
         assert result.exit_code == 0
-        assert (
-            "No hooks configured" in result.output
-            or "nenhum hook" in result.output.lower()
-        )
+        assert "No hooks configured" in result.output or "nenhum hook" in result.output.lower()
 
     def test_hooks_list_with_global_hooks(self) -> None:
         from synapse_os.runtime_contracts import HookConfig
@@ -119,6 +116,4 @@ class TestHooksStatusCommand:
     def test_hooks_status_no_active_hooks(self) -> None:
         result = runner.invoke(app, ["hooks", "status"])
         assert result.exit_code == 0
-        assert (
-            "No active hooks" in result.output or "nenhum hook" in result.output.lower()
-        )
+        assert "No active hooks" in result.output or "nenhum hook" in result.output.lower()
diff --git a/tests/unit/test_memory.py b/tests/unit/test_memory.py
new file mode 100644
index 0000000..356a2b3
--- /dev/null
+++ b/tests/unit/test_memory.py
@@ -0,0 +1,149 @@
+from __future__ import annotations
+
+from datetime import datetime, timezone
+from pathlib import Path
+
+from synapse_os.memory import (
+    ArtifactMetadata,
+    IndexedArtifactStore,
+    MemoryStore,
+)
+
+
+class TestArtifactMetadata:
+    def test_defaults(self) -> None:
+        meta = ArtifactMetadata(type="test_report", source_step="TEST_RED")
+        assert meta.type == "test_report"
+        assert meta.tags == []
+        assert meta.source_step == "TEST_RED"
+        assert meta.created_at is not None
+
+    def test_full(self) -> None:
+        now = datetime.now(timezone.utc)
+        meta = ArtifactMetadata(
+            type="log",
+            tags=["error", "crash"],
+            source_step="CODE_GREEN",
+            created_at=now,
+        )
+        assert meta.type == "log"
+        assert meta.tags == ["error", "crash"]
+        assert meta.created_at == now
+
+
+class TestIndexedArtifactStore:
+    def test_register_and_find_by_tag(self, tmp_path: Path) -> None:
+        store = IndexedArtifactStore(base_path=tmp_path)
+        store.register(
+            run_id="run-1",
+            name="error.log",
+            metadata=ArtifactMetadata(type="log", tags=["error"], source_step="RUN"),
+        )
+        results = store.find_by_tag("error")
+        assert len(results) == 1
+        assert results[0].name == "error.log"
+
+    def test_find_by_tag_no_match(self, tmp_path: Path) -> None:
+        store = IndexedArtifactStore(base_path=tmp_path)
+        store.register(
+            run_id="run-1",
+            name="output.txt",
+            metadata=ArtifactMetadata(type="text", tags=["output"], source_step="RUN"),
+        )
+        assert store.find_by_tag("error") == []
+
+    def test_find_by_type(self, tmp_path: Path) -> None:
+        store = IndexedArtifactStore(base_path=tmp_path)
+        store.register(
+            run_id="run-1",
+            name="report.txt",
+            metadata=ArtifactMetadata(type="test_report", source_step="RUN"),
+        )
+        results = store.find_by_type("test_report")
+        assert len(results) == 1
+        assert results[0].name == "report.txt"
+
+    def test_list_for_run(self, tmp_path: Path) -> None:
+        store = IndexedArtifactStore(base_path=tmp_path)
+        store.register(
+            run_id="run-1",
+            name="a.txt",
+            metadata=ArtifactMetadata(type="text", source_step="RUN"),
+        )
+        store.register(
+            run_id="run-1",
+            name="b.txt",
+            metadata=ArtifactMetadata(type="text", source_step="RUN"),
+        )
+        store.register(
+            run_id="run-2",
+            name="c.txt",
+            metadata=ArtifactMetadata(type="text", source_step="RUN"),
+        )
+        run1_artifacts = store.list_for_run("run-1")
+        assert len(run1_artifacts) == 2
+
+    def test_multiple_tags(self, tmp_path: Path) -> None:
+        store = IndexedArtifactStore(base_path=tmp_path)
+        store.register(
+            run_id="run-1",
+            name="log.txt",
+            metadata=ArtifactMetadata(
+                type="log", tags=["error", "crash"], source_step="RUN"
+            ),
+        )
+        assert len(store.find_by_tag("error")) == 1
+        assert len(store.find_by_tag("crash")) == 1
+
+
+class TestMemoryStore:
+    def test_set_and_get(self, tmp_path: Path) -> None:
+        store = MemoryStore(state_dir=tmp_path)
+        store.set("ns", "key", "value")
+        assert store.get("ns", "key") == "value"
+
+    def test_get_missing(self, tmp_path: Path) -> None:
+        store = MemoryStore(state_dir=tmp_path)
+        assert store.get("ns", "missing") is None
+
+    def test_delete(self, tmp_path: Path) -> None:
+        store = MemoryStore(state_dir=tmp_path)
+        store.set("ns", "key", "value")
+        store.delete("ns", "key")
+        assert store.get("ns", "key") is None
+
+    def test_list_namespaces(self, tmp_path: Path) -> None:
+        store = MemoryStore(state_dir=tmp_path)
+        store.set("ns1", "k", "v")
+        store.set("ns2", "k", "v")
+        namespaces = store.list_namespaces()
+        assert set(namespaces) == {"ns1", "ns2"}
+
+    def test_persistence(self, tmp_path: Path) -> None:
+        store = MemoryStore(state_dir=tmp_path)
+        store.set("ns", "key", "value")
+        store2 = MemoryStore(state_dir=tmp_path)
+        assert store2.get("ns", "key") == "value"
+
+    def test_feature_memory_view(self, tmp_path: Path) -> None:
+        store = MemoryStore(state_dir=tmp_path)
+        fm = store.feature_memory("F59")
+        fm.set("decision", "use-dag")
+        assert store.get("F59", "decision") == "use-dag"
+        assert fm.get("decision") == "use-dag"
+
+    def test_feature_memory_isolation(self, tmp_path: Path) -> None:
+        store = MemoryStore(state_dir=tmp_path)
+        store.set("F59", "key", "f59-value")
+        store.set("F60", "key", "f60-value")
+        fm59 = store.feature_memory("F59")
+        fm60 = store.feature_memory("F60")
+        assert fm59.get("key") == "f59-value"
+        assert fm60.get("key") == "f60-value"
+
+    def test_feature_memory_delete(self, tmp_path: Path) -> None:
+        store = MemoryStore(state_dir=tmp_path)
+        store.set("F59", "key", "value")
+        fm = store.feature_memory("F59")
+        fm.delete("key")
+        assert store.get("F59", "key") is None
diff --git a/tests/unit/test_multi_agent.py b/tests/unit/test_multi_agent.py
new file mode 100644
index 0000000..96f8065
--- /dev/null
+++ b/tests/unit/test_multi_agent.py
@@ -0,0 +1,307 @@
+from __future__ import annotations
+
+import pytest
+
+from synapse_os.adapters import BaseCLIAdapter, CodexCLIAdapter, GeminiCLIAdapter
+from synapse_os.multi_agent import (
+    AdapterAlreadyRegisteredError,
+    AdapterRegistry,
+    CapabilityRouter,
+    MultiAgentCoordinator,
+    NoSuitableAdapterError,
+)
+
+
+class FakeAdapter(BaseCLIAdapter):
+    def __init__(
+        self,
+        *,
+        tool_name: str = "fake",
+        capabilities: tuple[str, ...] = ("cli_execution",),
+        command_prefix: tuple[str, ...] = (),
+    ) -> None:
+        self._capabilities = capabilities
+        self._command_prefix = command_prefix
+        super().__init__(tool_name=tool_name)
+
+    @property
+    def capabilities(self) -> tuple[str, ...]:
+        return self._capabilities
+
+    @property
+    def command_prefix(self) -> tuple[str, ...]:
+        return self._command_prefix
+
+    def build_command(self, prompt: str) -> list[str]:
+        return ["echo", prompt]
+
+
+# --- AdapterRegistry tests ---
+
+
+class TestAdapterRegistry:
+    def test_registrar_adapter_por_nome(self) -> None:
+        registry = AdapterRegistry()
+        adapter = FakeAdapter(tool_name="test_adapter")
+        registry.register(adapter)
+
+        assert registry.get("test_adapter") is adapter
+
+    def test_rejeitar_registro_duplicado(self) -> None:
+        registry = AdapterRegistry()
+        adapter = FakeAdapter(tool_name="dup")
+        registry.register(adapter)
+
+        with pytest.raises(AdapterAlreadyRegisteredError):
+            registry.register(adapter)
+
+    def test_retornar_none_para_adapter_inexistente(self) -> None:
+        registry = AdapterRegistry()
+        assert registry.get("nonexistent") is None
+
+    def test_listar_todos_os_adapters(self) -> None:
+        registry = AdapterRegistry()
+        a1 = FakeAdapter(tool_name="a1")
+        a2 = FakeAdapter(tool_name="a2")
+        registry.register(a1)
+        registry.register(a2)
+
+        all_adapters = registry.list_all()
+        assert len(all_adapters) == 2
+        assert {a.tool_name for a in all_adapters} == {"a1", "a2"}
+
+    def test_encontrar_adapters_por_capability(self) -> None:
+        registry = AdapterRegistry()
+        a1 = FakeAdapter(
+            tool_name="coder", capabilities=("cli_execution", "code_generation")
+        )
+        a2 = FakeAdapter(
+            tool_name="planner", capabilities=("cli_execution", "planning")
+        )
+        registry.register(a1)
+        registry.register(a2)
+
+        coders = registry.find_by_capability("code_generation")
+        assert len(coders) == 1
+        assert coders[0].tool_name == "coder"
+
+        planners = registry.find_by_capability("planning")
+        assert len(planners) == 1
+        assert planners[0].tool_name == "planner"
+
+    def test_retornar_lista_vazia_se_nenhuma_capability_match(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(FakeAdapter(tool_name="basic"))
+
+        result = registry.find_by_capability("nonexistent_capability")
+        assert result == []
+
+    def test_encontrar_multiplos_adapters_com_mesma_capability(self) -> None:
+        registry = AdapterRegistry()
+        a1 = FakeAdapter(tool_name="coder1", capabilities=("code_generation",))
+        a2 = FakeAdapter(tool_name="coder2", capabilities=("code_generation",))
+        registry.register(a1)
+        registry.register(a2)
+
+        result = registry.find_by_capability("code_generation")
+        assert len(result) == 2
+
+    def test_remover_adapter(self) -> None:
+        registry = AdapterRegistry()
+        adapter = FakeAdapter(tool_name="removable")
+        registry.register(adapter)
+        registry.unregister("removable")
+
+        assert registry.get("removable") is None
+        assert "removable" not in [a.tool_name for a in registry.list_all()]
+
+    def test_retornar_todas_as_capabilities_registradas(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(FakeAdapter(tool_name="a1", capabilities=("cap1", "cap2")))
+        registry.register(FakeAdapter(tool_name="a2", capabilities=("cap2", "cap3")))
+
+        all_caps = registry.all_capabilities()
+        assert all_caps == {"cap1", "cap2", "cap3"}
+
+
+# --- CapabilityRouter tests ---
+
+
+class TestCapabilityRouter:
+    def test_selecionar_adapter_com_capability_requerida(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(
+            FakeAdapter(tool_name="coder", capabilities=("code_generation",))
+        )
+        registry.register(
+            FakeAdapter(tool_name="basic", capabilities=("cli_execution",))
+        )
+
+        router = CapabilityRouter(registry)
+        selected = router.select_adapter({"code_generation"})
+
+        assert selected is not None
+        assert selected.tool_name == "coder"
+
+    def test_retornar_none_se_nenhum_adapter_tiver_capability(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(FakeAdapter(tool_name="basic"))
+
+        router = CapabilityRouter(registry)
+        selected = router.select_adapter({"nonexistent"})
+
+        assert selected is None
+
+    def test_selecionar_adapter_com_melhor_match(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(
+            FakeAdapter(
+                tool_name="specialist", capabilities=("code_generation", "code_review")
+            )
+        )
+        registry.register(
+            FakeAdapter(tool_name="generalist", capabilities=("code_generation",))
+        )
+
+        router = CapabilityRouter(registry)
+        selected = router.get_best_match({"code_generation", "code_review"})
+
+        assert selected is not None
+        assert selected.tool_name == "specialist"
+
+    def test_usar_primeiro_adapter_como_fallback(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(FakeAdapter(tool_name="first"))
+        registry.register(FakeAdapter(tool_name="second"))
+
+        router = CapabilityRouter(registry)
+        selected = router.get_best_match({"nonexistent"})
+
+        assert selected is not None
+        assert selected.tool_name == "first"
+
+    def test_retornar_none_se_registry_vazio(self) -> None:
+        registry = AdapterRegistry()
+        router = CapabilityRouter(registry)
+
+        assert router.select_adapter({"anything"}) is None
+        assert router.get_best_match({"anything"}) is None
+
+    def test_priorizar_adapter_com_mais_capabilities_sobrepostas(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(
+            FakeAdapter(tool_name="partial", capabilities=("cap1", "cap2"))
+        )
+        registry.register(
+            FakeAdapter(tool_name="full", capabilities=("cap1", "cap2", "cap3"))
+        )
+
+        router = CapabilityRouter(registry)
+        selected = router.get_best_match({"cap1", "cap2", "cap3"})
+
+        assert selected is not None
+        assert selected.tool_name == "full"
+
+
+# --- MultiAgentCoordinator tests ---
+
+
+class TestMultiAgentCoordinator:
+    def test_executar_step_com_adapter_selecionado(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(
+            FakeAdapter(tool_name="coder", capabilities=("code_generation",))
+        )
+
+        router = CapabilityRouter(registry)
+        coordinator = MultiAgentCoordinator(registry, router)
+
+        adapter = coordinator.resolve_adapter_for_step(
+            "CODE_GREEN", {"code_generation"}
+        )
+        assert adapter is not None
+        assert adapter.tool_name == "coder"
+
+    def test_retornar_none_se_nenhum_adapter_disponivel(self) -> None:
+        registry = AdapterRegistry()
+        router = CapabilityRouter(registry)
+        coordinator = MultiAgentCoordinator(registry, router)
+
+        adapter = coordinator.resolve_adapter_for_step("CODE_GREEN", {"nonexistent"})
+        assert adapter is None
+
+    def test_registrar_handoff_no_contexto(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(
+            FakeAdapter(tool_name="coder", capabilities=("code_generation",))
+        )
+
+        router = CapabilityRouter(registry)
+        coordinator = MultiAgentCoordinator(registry, router)
+
+        handoffs = coordinator.get_handoff_log()
+        assert len(handoffs) == 0
+
+        coordinator.resolve_adapter_for_step("CODE_GREEN", {"code_generation"})
+
+        handoffs = coordinator.get_handoff_log()
+        assert len(handoffs) == 1
+        assert handoffs[0]["step"] == "CODE_GREEN"
+        assert handoffs[0]["adapter"] == "coder"
+
+    def test_usar_fallback_adapter_quando_nenhuma_capability_especificada(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(FakeAdapter(tool_name="generic"))
+
+        router = CapabilityRouter(registry)
+        coordinator = MultiAgentCoordinator(registry, router)
+
+        adapter = coordinator.resolve_adapter_for_step("PLAN", set())
+        assert adapter is not None
+        assert adapter.tool_name == "generic"
+
+    def test_lancar_erro_se_adapter_nao_encontrado_para_step_obrigatorio(self) -> None:
+        registry = AdapterRegistry()
+        router = CapabilityRouter(registry)
+        coordinator = MultiAgentCoordinator(
+            registry, router, required_steps={"CODE_GREEN"}
+        )
+
+        with pytest.raises(NoSuitableAdapterError):
+            coordinator.resolve_adapter_for_step("CODE_GREEN", {"code_generation"})
+
+    def test_executar_com_adapters_reais(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(CodexCLIAdapter())
+        registry.register(GeminiCLIAdapter())
+
+        router = CapabilityRouter(registry)
+        coordinator = MultiAgentCoordinator(registry, router)
+
+        codex_adapter = coordinator.resolve_adapter_for_step(
+            "CODE_GREEN", {"code_generation"}
+        )
+        assert codex_adapter is not None
+        assert codex_adapter.tool_name == "codex"
+
+        gemini_adapter = coordinator.resolve_adapter_for_step("PLAN", {"planning"})
+        assert gemini_adapter is not None
+        assert gemini_adapter.tool_name == "gemini"
+
+    def test_registrar_todas_as_execucoes(self) -> None:
+        registry = AdapterRegistry()
+        registry.register(FakeAdapter(tool_name="a1", capabilities=("cap1",)))
+        registry.register(FakeAdapter(tool_name="a2", capabilities=("cap2",)))
+
+        router = CapabilityRouter(registry)
+        coordinator = MultiAgentCoordinator(registry, router)
+
+        coordinator.resolve_adapter_for_step("STEP1", {"cap1"})
+        coordinator.resolve_adapter_for_step("STEP2", {"cap2"})
+        coordinator.resolve_adapter_for_step("STEP3", {"cap1"})
+
+        handoffs = coordinator.get_handoff_log()
+        assert len(handoffs) == 3
+        assert handoffs[0]["step"] == "STEP1"
+        assert handoffs[1]["step"] == "STEP2"
+        assert handoffs[2]["step"] == "STEP3"
diff --git a/tests/unit/test_pipeline_dag.py b/tests/unit/test_pipeline_dag.py
new file mode 100644
index 0000000..89e5660
--- /dev/null
+++ b/tests/unit/test_pipeline_dag.py
@@ -0,0 +1,436 @@
+from __future__ import annotations
+
+from threading import Lock
+
+import pytest
+
+from synapse_os.pipeline_dag import (
+    DAGContext,
+    DAGExecutor,
+    DAGSpecificationError,
+    DAGSpec,
+    DAGStep,
+    DAGValidator,
+    LinearPipelineAdapter,
+)
+
+
+class TestDAGSpec:
+    def test_valid_linear_mode(self) -> None:
+        spec = DAGSpec(mode="linear")
+        assert spec.mode == "linear"
+        assert spec.steps == []
+        assert spec.conditionals == []
+
+    def test_valid_dag_mode_empty_steps(self) -> None:
+        spec = DAGSpec(mode="dag", steps=[])
+        assert spec.mode == "dag"
+        assert spec.steps == []
+
+    def test_valid_dag_step_full(self) -> None:
+        step = DAGStep(id="build", executor="codex", depends_on=[], if_cond=None)
+        spec = DAGSpec(mode="dag", steps=[step])
+        assert spec.steps[0].id == "build"
+
+    def test_dag_step_minimal(self) -> None:
+        step = DAGStep(id="build", executor="codex")
+        assert step.depends_on == []
+        assert step.if_cond is None
+
+
+class TestDAGValidator:
+    def test_valid_linear_no_error(self) -> None:
+        spec = DAGSpec(mode="linear")
+        DAGValidator.validate(spec)
+
+    def test_valid_dag_single_step(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[DAGStep(id="build", executor="codex", depends_on=[])],
+        )
+        DAGValidator.validate(spec)
+
+    def test_valid_dag_linear_chain(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=["a"]),
+                DAGStep(id="c", executor="codex", depends_on=["b"]),
+            ],
+        )
+        DAGValidator.validate(spec)
+
+    def test_valid_dag_fan_out(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="root", executor="codex", depends_on=[]),
+                DAGStep(id="a", executor="codex", depends_on=["root"]),
+                DAGStep(id="b", executor="codex", depends_on=["root"]),
+                DAGStep(id="c", executor="codex", depends_on=["root"]),
+            ],
+        )
+        DAGValidator.validate(spec)
+
+    def test_valid_dag_fan_in(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=[]),
+                DAGStep(id="c", executor="codex", depends_on=["a", "b"]),
+            ],
+        )
+        DAGValidator.validate(spec)
+
+    def test_valid_dag_complex(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="root", executor="codex", depends_on=[]),
+                DAGStep(id="a", executor="codex", depends_on=["root"]),
+                DAGStep(id="b", executor="codex", depends_on=["root"]),
+                DAGStep(id="c", executor="codex", depends_on=["a", "b"]),
+            ],
+        )
+        DAGValidator.validate(spec)
+
+    def test_cycle_detection_self_loop(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[DAGStep(id="a", executor="codex", depends_on=["a"])],
+        )
+        with pytest.raises(DAGSpecificationError, match="(?i)cycle"):
+            DAGValidator.validate(spec)
+
+    def test_cycle_detection_two_node(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=["b"]),
+                DAGStep(id="b", executor="codex", depends_on=["a"]),
+            ],
+        )
+        with pytest.raises(DAGSpecificationError, match="(?i)cycle"):
+            DAGValidator.validate(spec)
+
+    def test_cycle_detection_three_node(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=["b"]),
+                DAGStep(id="b", executor="codex", depends_on=["c"]),
+                DAGStep(id="c", executor="codex", depends_on=["a"]),
+            ],
+        )
+        with pytest.raises(DAGSpecificationError, match="(?i)cycle"):
+            DAGValidator.validate(spec)
+
+    def test_missing_dependency_raises(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[DAGStep(id="a", executor="codex", depends_on=["nonexistent"])],
+        )
+        with pytest.raises(DAGSpecificationError, match="nonexistent"):
+            DAGValidator.validate(spec)
+
+    def test_empty_steps_raises(self) -> None:
+        spec = DAGSpec(mode="dag", steps=[])
+        with pytest.raises(DAGSpecificationError, match="at least one step"):
+            DAGValidator.validate(spec)
+
+
+class TestDAGContext:
+    def test_initial_state_all_pending(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=["a"]),
+            ],
+        )
+        ctx = DAGContext(spec)
+        assert ctx.get_state("a") == "PENDING"
+        assert ctx.get_state("b") == "PENDING"
+
+    def test_mark_running(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[DAGStep(id="a", executor="codex", depends_on=[])],
+        )
+        ctx = DAGContext(spec)
+        ctx.mark_running("a")
+        assert ctx.get_state("a") == "RUNNING"
+
+    def test_mark_done(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[DAGStep(id="a", executor="codex", depends_on=[])],
+        )
+        ctx = DAGContext(spec)
+        ctx.mark_done("a")
+        assert ctx.get_state("a") == "DONE"
+
+    def test_mark_failed(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[DAGStep(id="a", executor="codex", depends_on=[])],
+        )
+        ctx = DAGContext(spec)
+        ctx.mark_failed("a")
+        assert ctx.get_state("a") == "FAILED"
+
+    def test_ready_steps_root(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=["a"]),
+            ],
+        )
+        ctx = DAGContext(spec)
+        ready = ctx.ready_steps()
+        assert ready == ["a"]
+
+    def test_ready_steps_after_root_done(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=["a"]),
+            ],
+        )
+        ctx = DAGContext(spec)
+        ctx.mark_done("a")
+        ready = ctx.ready_steps()
+        assert ready == ["b"]
+
+    def test_ready_steps_fan_in_both_deps_done(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=[]),
+                DAGStep(id="c", executor="codex", depends_on=["a", "b"]),
+            ],
+        )
+        ctx = DAGContext(spec)
+        assert set(ctx.ready_steps()) == {"a", "b"}
+        ctx.mark_done("a")
+        assert ctx.ready_steps() == ["b"]
+        ctx.mark_done("b")
+        assert ctx.ready_steps() == ["c"]
+
+    def test_is_complete_all_done(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=["a"]),
+            ],
+        )
+        ctx = DAGContext(spec)
+        assert not ctx.is_complete()
+        ctx.mark_done("a")
+        assert not ctx.is_complete()
+        ctx.mark_done("b")
+        assert ctx.is_complete()
+
+    def test_is_complete_has_failed(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[DAGStep(id="a", executor="codex", depends_on=[])],
+        )
+        ctx = DAGContext(spec)
+        ctx.mark_failed("a")
+        assert ctx.is_complete()
+
+    def test_dependency_deduplication(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=["a", "a"]),
+            ],
+        )
+        ctx = DAGContext(spec)
+        ctx.mark_done("a")
+        assert ctx.ready_steps() == ["b"]
+
+
+class TestDAGExecutor:
+    def test_init_rejects_unknown_mode(self) -> None:
+        spec = DAGSpec(mode="unsupported", steps=[])
+
+        with pytest.raises(DAGSpecificationError, match="Unknown DAG mode"):
+            DAGExecutor(spec=spec)
+
+    def test_execute_single_step(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[DAGStep(id="a", executor="codex", depends_on=[])],
+        )
+        executed: list[str] = []
+
+        def run_step(step_id: str) -> None:
+            executed.append(step_id)
+
+        executor = DAGExecutor(
+            spec=spec,
+            max_workers=4,
+            step_runner=lambda sid, _: run_step(sid),
+        )
+        executor.execute()
+
+        assert executed == ["a"]
+        assert executor.context.is_complete()
+
+    def test_execute_linear_chain(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=["a"]),
+                DAGStep(id="c", executor="codex", depends_on=["b"]),
+            ],
+        )
+        executed: list[str] = []
+
+        def run_step(step_id: str) -> None:
+            executed.append(step_id)
+
+        executor = DAGExecutor(
+            spec=spec,
+            max_workers=4,
+            step_runner=lambda sid, _: run_step(sid),
+        )
+        executor.execute()
+
+        assert executed == ["a", "b", "c"]
+        assert executor.context.is_complete()
+
+    def test_execute_fan_out_parallel(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="froot", executor="codex", depends_on=[]),
+                DAGStep(id="fa", executor="codex", depends_on=["froot"]),
+                DAGStep(id="fb", executor="codex", depends_on=["froot"]),
+                DAGStep(id="fc", executor="codex", depends_on=["froot"]),
+            ],
+        )
+        order: list[str] = []
+
+        def run_step(step_id: str) -> None:
+            order.append(step_id)
+
+        executor = DAGExecutor(
+            spec=spec,
+            max_workers=4,
+            step_runner=lambda sid, _: run_step(sid),
+        )
+        executor.execute()
+
+        assert order[0] == "froot"
+        assert set(order[1:]) == {"fa", "fb", "fc"}
+        assert executor.context.is_complete()
+
+    def test_execute_fan_in(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="fa", executor="codex", depends_on=[]),
+                DAGStep(id="fb", executor="codex", depends_on=[]),
+                DAGStep(id="fc", executor="codex", depends_on=["fa", "fb"]),
+            ],
+        )
+        done: dict[str, bool] = {}
+
+        def run_step(step_id: str) -> None:
+            done[step_id] = True
+
+        executor = DAGExecutor(
+            spec=spec,
+            max_workers=4,
+            step_runner=lambda sid, _: run_step(sid),
+        )
+        executor.execute()
+
+        assert done.get("fa") and done.get("fb") and done.get("fc")
+        assert executor.context.is_complete()
+
+    def test_execute_stops_on_failure(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id="a", executor="codex", depends_on=[]),
+                DAGStep(id="b", executor="codex", depends_on=["a"]),
+            ],
+        )
+
+        def run_step(step_id: str) -> None:
+            if step_id == "a":
+                raise RuntimeError("simulated failure")
+
+        executor = DAGExecutor(
+            spec=spec,
+            max_workers=4,
+            step_runner=lambda sid, _: run_step(sid),
+        )
+        executor.execute()
+
+        assert executor.context.get_state("a") == "FAILED"
+        assert executor.context.has_failed
+
+    def test_max_workers_limits_concurrency(self) -> None:
+        spec = DAGSpec(
+            mode="dag",
+            steps=[
+                DAGStep(id=str(i), executor="codex", depends_on=[]) for i in range(8)
+            ],
+        )
+        active = 0
+        peak = 0
+        lock = Lock()
+
+        def run_step(step_id: str) -> None:
+            import time
+
+            nonlocal active, peak
+            with lock:
+                active += 1
+                peak = max(peak, active)
+            try:
+                time.sleep(0.05)
+            finally:
+                with lock:
+                    active -= 1
+
+        executor = DAGExecutor(
+            spec=spec,
+            max_workers=2,
+            step_runner=lambda sid, _: run_step(sid),
+        )
+        executor.execute()
+
+        assert peak == 2
+
+
+class TestLinearPipelineAdapter:
+    def test_runs_linear_sequence(self) -> None:
+        executed: list[str] = []
+        adapter = LinearPipelineAdapter(
+            steps=["a", "b", "c"],
+            step_runner=lambda sid, _: executed.append(sid),
+        )
+        adapter.execute()
+        assert executed == ["a", "b", "c"]
+
+    def test_raises_on_empty_steps(self) -> None:
+        adapter = LinearPipelineAdapter(
+            steps=[],
+            step_runner=lambda _, __: None,
+        )
+        with pytest.raises(DAGSpecificationError, match="at least one step"):
+            adapter.execute()
diff --git a/tests/unit/test_plugins.py b/tests/unit/test_plugins.py
new file mode 100644
index 0000000..e26f1e3
--- /dev/null
+++ b/tests/unit/test_plugins.py
@@ -0,0 +1,230 @@
+from __future__ import annotations
+
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from synapse_os.plugins import (
+    PluginManifest,
+    PluginRegistry,
+    PluginLoadError,
+    HookSpec,
+    HOOK_TYPES,
+)
+
+
+class TestHookSpec:
+    def test_hook_spec_create(self):
+        spec = HookSpec(name="test", hook_type="pre_step", handler=MagicMock())
+        assert spec.name == "test"
+        assert spec.hook_type == "pre_step"
+
+    def test_valid_hook_types(self):
+        assert "pre_step" in HOOK_TYPES
+        assert "post_step" in HOOK_TYPES
+        assert "on_run_start" in HOOK_TYPES
+        assert "on_run_end" in HOOK_TYPES
+
+
+class TestPluginManifest:
+    def test_create_manifest(self):
+        manifest = PluginManifest(name="test-plugin", version="1.0.0")
+        assert manifest.name == "test-plugin"
+        assert manifest.version == "1.0.0"
+        assert manifest.enabled is True
+
+    def test_manifest_default_enabled(self):
+        manifest = PluginManifest(name="test", version="0.1.0")
+        assert manifest.enabled is True
+
+    def test_manifest_with_hooks(self):
+        manifest = PluginManifest(
+            name="test",
+            version="1.0.0",
+            hooks=["pre_step", "post_step"],
+        )
+        assert "pre_step" in manifest.hooks
+        assert "post_step" in manifest.hooks
+
+
+class TestPluginRegistry:
+    @pytest.fixture(autouse=True)
+    def reset_registry(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        registry._hook_map.clear()
+        registry._handlers = {hook_type: [] for hook_type in HOOK_TYPES}
+        yield
+        registry._plugins.clear()
+        registry._hook_map.clear()
+        registry._handlers = {hook_type: [] for hook_type in HOOK_TYPES}
+
+    def test_singleton_pattern(self):
+        registry1 = PluginRegistry()
+        registry2 = PluginRegistry()
+        assert registry1 is registry2
+
+    def test_register_plugin(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        manifest = PluginManifest(name="test-plugin", version="1.0.0")
+        registry.register(manifest)
+        assert "test-plugin" in registry.list_plugins()
+
+    def test_register_duplicate_raises(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        manifest = PluginManifest(name="dup-plugin", version="1.0.0")
+        registry.register(manifest)
+        with pytest.raises(PluginLoadError, match="already registered"):
+            registry.register(manifest)
+
+    def test_unregister_plugin(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        manifest = PluginManifest(name="unreg-plugin", version="1.0.0")
+        registry.register(manifest)
+        registry.unregister("unreg-plugin")
+        assert "unreg-plugin" not in registry.list_plugins()
+
+    def test_unregister_unknown_raises(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        with pytest.raises(PluginLoadError, match="not found"):
+            registry.unregister("nonexistent")
+
+    def test_get_plugin(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        manifest = PluginManifest(name="get-plugin", version="2.0.0")
+        registry.register(manifest)
+        retrieved = registry.get_plugin("get-plugin")
+        assert retrieved is not None
+        assert retrieved.name == "get-plugin"
+
+    def test_get_plugin_not_found(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        assert registry.get_plugin("nonexistent") is None
+
+    def test_list_plugins(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        registry.register(PluginManifest(name="p1", version="1.0.0"))
+        registry.register(PluginManifest(name="p2", version="1.0.0"))
+        plugins = registry.list_plugins()
+        assert "p1" in plugins
+        assert "p2" in plugins
+
+    def test_enable_disable_plugin(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        manifest = PluginManifest(name="toggle-plugin", version="1.0.0")
+        registry.register(manifest)
+        registry.disable_plugin("toggle-plugin")
+        assert not registry.get_plugin("toggle-plugin").enabled
+        registry.enable_plugin("toggle-plugin")
+        assert registry.get_plugin("toggle-plugin").enabled
+
+    def test_get_handlers_for_hook(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        handler = MagicMock()
+        manifest = PluginManifest(
+            name="handler-plugin",
+            version="1.0.0",
+            hooks=["pre_step"],
+        )
+        registry.register(manifest)
+        registry.register_hook("handler-plugin", "pre_step", handler)
+        handlers = registry.get_handlers("pre_step")
+        assert handler in handlers
+
+    def test_get_handlers_empty_for_unknown_hook(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        handlers = registry.get_handlers("on_run_start")
+        assert handlers == []
+
+    def test_hook_type_validation(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        manifest = PluginManifest(name="val-plugin", version="1.0.0")
+        registry.register(manifest)
+        with pytest.raises(ValueError, match="Unknown hook type"):
+            registry.register_hook("val-plugin", "invalid_hook", MagicMock())
+
+    def test_load_plugins_discovers_entry_points(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        mock_ep = MagicMock()
+        mock_ep.name = "discovered-plugin"
+        mock_ep.load.return_value.hook_manifest.return_value = PluginManifest(
+            name="discovered-plugin", version="0.1.0", hooks=["pre_step"]
+        )
+        with patch("synapse_os.plugins.entry_points") as mock_eps:
+            mock_eps.return_value = [mock_ep]
+            registry.load_plugins()
+        assert "discovered-plugin" in registry.list_plugins()
+
+    def test_load_plugins_handles_missing_manifest(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        mock_ep = MagicMock()
+        mock_ep.name = "no-manifest-plugin"
+        mock_ep.load.return_value.hook_manifest = None
+        with patch("synapse_os.plugins.entry_points") as mock_eps:
+            mock_eps.return_value = [mock_ep]
+            registry.load_plugins()
+        assert "no-manifest-plugin" not in registry.list_plugins()
+
+    def test_is_loaded(self):
+        registry = PluginRegistry()
+        registry._plugins.clear()
+        assert registry.is_loaded("test") is False
+        registry.register(PluginManifest(name="test", version="1.0.0"))
+        assert registry.is_loaded("test") is True
+
+    def test_get_handlers_keeps_shared_callable_for_enabled_plugin_when_other_disabled(
+        self,
+    ):
+        registry = PluginRegistry()
+        shared_handler = MagicMock()
+
+        registry.register(PluginManifest(name="p1", version="1.0.0"))
+        registry.register(PluginManifest(name="p2", version="1.0.0"))
+        registry.register_hook("p1", "pre_step", shared_handler)
+        registry.register_hook("p2", "pre_step", shared_handler)
+
+        registry.disable_plugin("p1")
+
+        assert registry.get_handlers("pre_step") == [shared_handler]
+
+    def test_unregister_preserves_shared_callable_used_by_other_plugin(self):
+        registry = PluginRegistry()
+        shared_handler = MagicMock()
+
+        registry.register(PluginManifest(name="p1", version="1.0.0"))
+        registry.register(PluginManifest(name="p2", version="1.0.0"))
+        registry.register_hook("p1", "pre_step", shared_handler)
+        registry.register_hook("p2", "pre_step", shared_handler)
+
+        registry.unregister("p1")
+
+        assert registry.get_handlers("pre_step") == [shared_handler]
+
+    def test_register_hook_replacing_one_plugin_handler_preserves_shared_callable(self):
+        registry = PluginRegistry()
+        shared_handler = MagicMock()
+        replacement_handler = MagicMock()
+
+        registry.register(PluginManifest(name="p1", version="1.0.0"))
+        registry.register(PluginManifest(name="p2", version="1.0.0"))
+        registry.register_hook("p1", "pre_step", shared_handler)
+        registry.register_hook("p2", "pre_step", shared_handler)
+
+        registry.register_hook("p1", "pre_step", replacement_handler)
+
+        handlers = registry.get_handlers("pre_step")
+        assert replacement_handler in handlers
+        assert shared_handler in handlers
diff --git a/tests/unit/test_reporting_evolution.py b/tests/unit/test_reporting_evolution.py
new file mode 100644
index 0000000..c2590f3
--- /dev/null
+++ b/tests/unit/test_reporting_evolution.py
@@ -0,0 +1,276 @@
+from __future__ import annotations
+
+from unittest.mock import MagicMock
+
+
+class TestExecutionTimelineModels:
+    def test_timeline_entry_model(self) -> None:
+        from synapse_os.reporting import TimelineEntry
+
+        entry = TimelineEntry(
+            state="CODE_GREEN",
+            entered_at=1000.0,
+            duration_ms=500,
+        )
+        assert entry.state == "CODE_GREEN"
+        assert entry.entered_at == 1000.0
+        assert entry.duration_ms == 500
+
+    def test_execution_timeline_model(self) -> None:
+        from synapse_os.reporting import ExecutionTimeline, TimelineEntry
+
+        entry = TimelineEntry(state="PLAN", entered_at=0.0, duration_ms=100)
+        timeline = ExecutionTimeline(entries=[entry])
+        assert len(timeline.entries) == 1
+        assert timeline.entries[0].state == "PLAN"
+
+
+class TestAdapterMetricsModel:
+    def test_adapter_metrics_model(self) -> None:
+        from synapse_os.reporting import AdapterMetrics
+
+        metrics = AdapterMetrics(
+            tool_name="codex",
+            total_calls=10,
+            success_count=8,
+            failure_count=2,
+            avg_duration_ms=1500.5,
+        )
+        assert metrics.tool_name == "codex"
+        assert metrics.total_calls == 10
+        assert metrics.success_count == 8
+        assert metrics.failure_count == 2
+        assert metrics.avg_duration_ms == 1500.5
+
+
+class TestStructuredErrorModel:
+    def test_structured_error_model(self) -> None:
+        from synapse_os.reporting import StructuredError
+
+        error = StructuredError(
+            error_type="RetryableStepError",
+            message="temporary failure",
+            step="CODE_GREEN",
+            count=2,
+        )
+        assert error.error_type == "RetryableStepError"
+        assert error.message == "temporary failure"
+        assert error.step == "CODE_GREEN"
+        assert error.count == 2
+
+
+class TestRunReportEnhancedFields:
+    def test_run_report_has_feature_id_and_title(self) -> None:
+        from synapse_os.reporting import RunReport
+
+        report = RunReport(
+            run_id="test-run",
+            initiated_by="agent",
+            workspace_path="/workspace",
+            status="completed",
+            current_state="DONE",
+            feature_id="F64-advanced-supervisor-policies",
+            feature_title="Advanced Supervisor Policies",
+        )
+        assert report.feature_id == "F64-advanced-supervisor-policies"
+        assert report.feature_title == "Advanced Supervisor Policies"
+
+    def test_run_report_has_execution_timeline(self) -> None:
+        from synapse_os.reporting import ExecutionTimeline, RunReport, TimelineEntry
+
+        timeline = ExecutionTimeline(
+            entries=[
+                TimelineEntry(state="PLAN", entered_at=0.0, duration_ms=100),
+                TimelineEntry(state="CODE_GREEN", entered_at=0.1, duration_ms=200),
+            ]
+        )
+        report = RunReport(
+            run_id="test-run",
+            initiated_by="agent",
+            workspace_path="/workspace",
+            status="completed",
+            current_state="DONE",
+            execution_timeline=timeline,
+        )
+        assert len(report.execution_timeline.entries) == 2
+
+    def test_run_report_has_adapter_metrics(self) -> None:
+        from synapse_os.reporting import AdapterMetrics, RunReport
+
+        metrics = [
+            AdapterMetrics(
+                tool_name="codex",
+                total_calls=5,
+                success_count=4,
+                failure_count=1,
+                avg_duration_ms=1000.0,
+            ),
+        ]
+        report = RunReport(
+            run_id="test-run",
+            initiated_by="agent",
+            workspace_path="/workspace",
+            status="completed",
+            current_state="DONE",
+            adapter_metrics=metrics,
+        )
+        assert len(report.adapter_metrics) == 1
+        assert report.adapter_metrics[0].tool_name == "codex"
+
+    def test_run_report_has_structured_errors(self) -> None:
+        from synapse_os.reporting import RunReport, StructuredError
+
+        errors = [
+            StructuredError(
+                error_type="RetryableStepError",
+                message="failure",
+                step="CODE_GREEN",
+                count=3,
+            ),
+        ]
+        report = RunReport(
+            run_id="test-run",
+            initiated_by="agent",
+            workspace_path="/workspace",
+            status="failed",
+            current_state="CODE_GREEN",
+            structured_errors=errors,
+        )
+        assert len(report.structured_errors) == 1
+        assert report.structured_errors[0].count == 3
+
+
+class TestGenerateStructuredReport:
+    def test_generate_structured_report_populates_timeline(self) -> None:
+        import tempfile
+        from pathlib import Path
+
+        from synapse_os.reporting import RunReportGenerator
+
+        run_record = MagicMock()
+        run_record.initiated_by = "agent"
+        run_record.workspace_path = "/workspace"
+        run_record.spec_hash = "abc123"
+        run_record.status = "completed"
+        run_record.current_state = "DONE"
+
+        step_records = [
+            MagicMock(
+                state="PLAN",
+                status="done",
+                tool_name="codex",
+                return_code=0,
+                duration_ms=100,
+                timed_out=False,
+            ),
+            MagicMock(
+                state="TEST_RED",
+                status="done",
+                tool_name="codex",
+                return_code=0,
+                duration_ms=200,
+                timed_out=False,
+            ),
+        ]
+
+        event_records = [
+            MagicMock(
+                event_type="state_entered",
+                state="PLAN",
+                message="entered PLAN",
+                timestamp=1000.0,
+            ),
+            MagicMock(
+                event_type="state_entered",
+                state="TEST_RED",
+                message="entered TEST_RED",
+                timestamp=1100.0,
+            ),
+        ]
+
+        repo = MagicMock()
+        repo.get_run.return_value = run_record
+        repo.list_steps.return_value = step_records
+        repo.list_events.return_value = event_records
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            base = Path(tmpdir)
+            spec_id_file = base / "test-run" / "SPEC_VALIDATION" / "spec_id.txt"
+            spec_id_file.parent.mkdir(parents=True)
+            spec_id_file.write_text("F64-advanced-supervisor-policies")
+
+            artifact_store = MagicMock()
+            artifact_store.base_path = base
+            artifact_store.list_artifact_paths.return_value = []
+
+            gen = RunReportGenerator(repository=repo, artifact_store=artifact_store)
+            structured = gen.generate_structured_report("test-run")
+            assert structured.feature_id == "F64-advanced-supervisor-policies"
+            assert len(structured.execution_timeline.entries) == 2
+        assert len(structured.execution_timeline.entries) == 2
+
+    def test_generate_structured_report_aggregates_adapter_metrics(self) -> None:
+        from pathlib import Path
+
+        from synapse_os.reporting import RunReportGenerator
+
+        run_record = MagicMock()
+        run_record.initiated_by = "agent"
+        run_record.workspace_path = "/workspace"
+        run_record.spec_hash = "hash"
+        run_record.status = "failed"
+        run_record.current_state = "CODE_GREEN"
+
+        step_records = [
+            MagicMock(
+                state="PLAN",
+                status="done",
+                tool_name="codex",
+                return_code=0,
+                duration_ms=100,
+                timed_out=False,
+            ),
+            MagicMock(
+                state="TEST_RED",
+                status="done",
+                tool_name="codex",
+                return_code=0,
+                duration_ms=200,
+                timed_out=False,
+            ),
+            MagicMock(
+                state="CODE_GREEN",
+                status="failed",
+                tool_name="gemini",
+                return_code=1,
+                duration_ms=300,
+                timed_out=False,
+            ),
+        ]
+
+        event_records: list[MagicMock] = []
+
+        repo = MagicMock()
+        repo.get_run.return_value = run_record
+        repo.list_steps.return_value = step_records
+        repo.list_events.return_value = event_records
+
+        artifact_store = MagicMock()
+        artifact_store.base_path = Path("/tmp/fake")
+        artifact_store.list_artifact_paths.return_value = []
+
+        gen = RunReportGenerator(repository=repo, artifact_store=artifact_store)
+        structured = gen.generate_structured_report("test-run")
+
+        codex_metrics = next(
+            (m for m in structured.adapter_metrics if m.tool_name == "codex"), None
+        )
+        gemini_metrics = next(
+            (m for m in structured.adapter_metrics if m.tool_name == "gemini"), None
+        )
+        assert codex_metrics is not None
+        assert codex_metrics.total_calls == 2
+        assert codex_metrics.success_count == 2
+        assert gemini_metrics is not None
+        assert gemini_metrics.total_calls == 1
+        assert gemini_metrics.failure_count == 1
diff --git a/tests/unit/test_runtime_coordinator_hardening.py b/tests/unit/test_runtime_coordinator_hardening.py
new file mode 100644
index 0000000..12bb421
--- /dev/null
+++ b/tests/unit/test_runtime_coordinator_hardening.py
@@ -0,0 +1,151 @@
+from __future__ import annotations
+
+import time
+from unittest.mock import MagicMock
+
+
+class TestRuntimeCoordinatorHardening:
+    def test_health_status_returns_healthy_when_all_circuit_breakers_closed(
+        self,
+    ) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        coordinator.circuit_breaker_store = MagicMock()
+        coordinator.circuit_breaker_store.is_open.return_value = False
+        status = coordinator.health_status()
+        assert status == "HEALTHY"
+
+    def test_health_status_returns_degraded_when_any_circuit_breaker_open(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        coordinator.circuit_breaker_store = MagicMock()
+        coordinator.circuit_breaker_store.is_open.side_effect = (
+            lambda tool: tool == "codex"
+        )
+        status = coordinator.health_status()
+        assert status == "DEGRADED"
+
+    def test_health_status_returns_unhealthy_when_all_circuit_breakers_open(
+        self,
+    ) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        coordinator.circuit_breaker_store = MagicMock()
+        coordinator.circuit_breaker_store.is_open.return_value = True
+        status = coordinator.health_status()
+        assert status == "UNHEALTHY"
+
+    def test_lifecycle_event_appends_to_event_log(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        coordinator.lifecycle_event("runtime.starting")
+        coordinator.lifecycle_event("runtime.started")
+        assert len(coordinator.lifecycle_events) == 2
+        assert coordinator.lifecycle_events[0].event == "runtime.starting"
+        assert coordinator.lifecycle_events[1].event == "runtime.started"
+
+    def test_lifecycle_event_contains_timestamp(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        coordinator.lifecycle_event("runtime.starting")
+        event = coordinator.lifecycle_events[0]
+        assert event.timestamp > 0
+
+    def test_register_cleanup_handler(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        handler = MagicMock()
+        coordinator.register_cleanup_handler(handler)
+        assert handler in coordinator._cleanup_handlers
+
+    def test_run_cleanup_handlers_calls_registered_handlers(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        handler1 = MagicMock()
+        handler2 = MagicMock()
+        coordinator.register_cleanup_handler(handler1)
+        coordinator.register_cleanup_handler(handler2)
+        coordinator.run_cleanup_handlers()
+        handler1.assert_called_once()
+        handler2.assert_called_once()
+
+    def test_run_cleanup_handlers_continues_after_handler_error(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        good_handler = MagicMock()
+        bad_handler = MagicMock(side_effect=RuntimeError("cleanup error"))
+        coordinator.register_cleanup_handler(bad_handler)
+        coordinator.register_cleanup_handler(good_handler)
+        coordinator.run_cleanup_handlers()
+        good_handler.assert_called_once()
+
+    def test_graceful_shutdown_calls_cleanup_then_stop(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        cleanup_mock = MagicMock()
+        coordinator.register_cleanup_handler(cleanup_mock)
+        stop_mock = MagicMock()
+        coordinator._stop = stop_mock
+        coordinator.graceful_shutdown(timeout_seconds=5)
+        cleanup_mock.assert_called_once()
+        stop_mock.assert_called_once()
+
+    def test_shutdown_respects_timeout(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        slow_handler = MagicMock(side_effect=lambda: time.sleep(10))
+        coordinator.register_cleanup_handler(slow_handler)
+        stop_mock = MagicMock()
+        coordinator._stop = stop_mock
+        start = time.monotonic()
+        coordinator.graceful_shutdown(timeout_seconds=0.1)
+        elapsed = time.monotonic() - start
+        assert elapsed < 1.0
+
+    def test_degraded_adapters_reflects_open_circuit_breakers(self) -> None:
+        from synapse_os.runtime.service import RuntimeCoordinator
+
+        coordinator = RuntimeCoordinator()
+        coordinator.circuit_breaker_store = MagicMock()
+        coordinator.circuit_breaker_store.is_open.side_effect = lambda tool: (
+            tool in ("codex", "gemini")
+        )
+        coordinator.circuit_breaker_store.read.side_effect = lambda tool: (
+            MagicMock(
+                tool_name=tool,
+                consecutive_operational_failures=3,
+                opened_at=time.time(),
+                cooldown_until=time.time() + 300,
+            )
+            if tool in ("codex", "gemini")
+            else None
+        )
+        degraded = coordinator.degraded_adapters
+        assert "codex" in degraded
+        assert "gemini" in degraded
+
+
+class TestRuntimeLifecycleEvent:
+    def test_lifecycle_event_model_has_required_fields(self) -> None:
+        from synapse_os.runtime.service import RuntimeLifecycleEvent
+
+        event = RuntimeLifecycleEvent(event="runtime.started", data={"pid": 12345})
+        assert event.event == "runtime.started"
+        assert event.data == {"pid": 12345}
+        assert event.timestamp > 0
+
+    def test_lifecycle_event_default_data_is_empty_dict(self) -> None:
+        from synapse_os.runtime.service import RuntimeLifecycleEvent
+
+        event = RuntimeLifecycleEvent(event="runtime.stopping")
+        assert event.data == {}
diff --git a/tests/unit/test_supervisor.py b/tests/unit/test_supervisor.py
index 124280d..edeb5f4 100644
--- a/tests/unit/test_supervisor.py
+++ b/tests/unit/test_supervisor.py
@@ -58,3 +58,99 @@ def test_supervisor_returns_to_code_green_after_review_rejection() -> None:
 
     assert decision.action == "return_to_code_green"
     assert decision.next_state == "CODE_GREEN"
+
+
+def test_supervisor_marks_terminal_failure_after_security_error() -> None:
+    supervisor = _supervisor_module()
+
+    decision = supervisor.Supervisor(max_retries=2).decide_after_failure(
+        state="SECURITY",
+        error=ValueError("insecure pattern"),
+        attempt=1,
+        available_routes=("primary",),
+    )
+
+    assert decision.action == "fail"
+    assert decision.next_state == "SECURITY"
+    assert decision.reason == "security_is_terminal"
+
+
+def test_supervisor_terminal_failure_when_no_fallback_route() -> None:
+    supervisor = _supervisor_module()
+
+    decision = supervisor.Supervisor(max_retries=2).decide_after_failure(
+        state="PLAN",
+        error=supervisor.RetryableStepError("failure"),
+        attempt=3,
+        available_routes=("primary",),
+    )
+
+    assert decision.action == "fail"
+    assert decision.next_state == "PLAN"
+    assert decision.reason == "terminal_failure"
+
+
+def test_supervisor_retries_when_attempt_equals_max_retries() -> None:
+    supervisor = _supervisor_module()
+
+    decision = supervisor.Supervisor(max_retries=2).decide_after_failure(
+        state="CODE_GREEN",
+        error=supervisor.RetryableStepError("failure"),
+        attempt=2,
+        available_routes=("primary", "fallback"),
+    )
+
+    assert decision.action == "retry"
+    assert decision.reason == "retryable_failure_with_budget"
+
+
+def test_supervisor_reroute_when_budget_exceeded_with_fallback() -> None:
+    supervisor = _supervisor_module()
+
+    decision = supervisor.Supervisor(max_retries=2).decide_after_failure(
+        state="TEST_RED",
+        error=supervisor.RetryableStepError("failure"),
+        attempt=3,
+        available_routes=("primary", "fallback"),
+    )
+
+    assert decision.action == "reroute"
+    assert decision.route == "fallback"
+    assert decision.reason == "retry_budget_exhausted_with_fallback"
+
+
+def test_supervisor_ignores_retryable_error_in_non_retryable_state() -> None:
+    supervisor = _supervisor_module()
+
+    decision = supervisor.Supervisor(max_retries=2).decide_after_failure(
+        state="REVIEW",
+        error=supervisor.RetryableStepError("failure"),
+        attempt=1,
+        available_routes=("primary",),
+    )
+
+    assert decision.action == "fail"
+    assert decision.reason == "terminal_failure"
+
+
+def test_supervisor_decision_contains_correct_reason() -> None:
+    supervisor = _supervisor_module()
+
+    retry_decision = supervisor.Supervisor(max_retries=2).decide_after_failure(
+        state="PLAN",
+        error=supervisor.RetryableStepError("failure"),
+        attempt=1,
+        available_routes=("primary",),
+    )
+    assert retry_decision.reason == "retryable_failure_with_budget"
+
+    terminal_decision = supervisor.Supervisor(max_retries=2).decide_after_failure(
+        state="SPEC_VALIDATION",
+        error=ValueError("bad"),
+        attempt=1,
+        available_routes=("primary",),
+    )
+    assert terminal_decision.reason == "spec_validation_is_terminal"
+
+    review_decision = supervisor.Supervisor(max_retries=2).decide_after_review_rejection()
+    assert review_decision.reason == "review_requested_rework"
diff --git a/tests/unit/test_supervisor_policies.py b/tests/unit/test_supervisor_policies.py
new file mode 100644
index 0000000..a5aeeca
--- /dev/null
+++ b/tests/unit/test_supervisor_policies.py
@@ -0,0 +1,270 @@
+from __future__ import annotations
+
+from importlib import import_module
+
+
+def _supervisor_module():
+    return import_module("synapse_os.supervisor")
+
+
+class TestRetryPolicyModel:
+    def test_retry_policy_has_expected_fields(self) -> None:
+        supervisor = _supervisor_module()
+        policy = supervisor.RetryPolicy(
+            max_retries=3, base_delay_seconds=1.0, max_delay_seconds=60.0
+        )
+        assert policy.max_retries == 3
+        assert policy.base_delay_seconds == 1.0
+        assert policy.max_delay_seconds == 60.0
+
+    def test_retry_policy_default_values(self) -> None:
+        supervisor = _supervisor_module()
+        policy = supervisor.RetryPolicy()
+        assert policy.max_retries == 2
+        assert policy.base_delay_seconds == 1.0
+        assert policy.max_delay_seconds == 60.0
+
+
+class TestStepPolicyModel:
+    def test_step_policy_holds_retry_policy(self) -> None:
+        supervisor = _supervisor_module()
+        step_policy = supervisor.StepPolicy(
+            step_name="TEST_RED",
+            retry=supervisor.RetryPolicy(max_retries=5),
+        )
+        assert step_policy.step_name == "TEST_RED"
+        assert step_policy.retry.max_retries == 5
+
+
+class TestSupervisorPoliciesModel:
+    def test_supervisor_policies_holds_default_and_overrides(self) -> None:
+        supervisor = _supervisor_module()
+        default_policy = supervisor.RetryPolicy(max_retries=2)
+        test_red_policy = supervisor.StepPolicy(
+            step_name="TEST_RED",
+            retry=supervisor.RetryPolicy(max_retries=5),
+        )
+        policies = supervisor.SupervisorPolicies(
+            default=default_policy,
+            step_overrides={"TEST_RED": test_red_policy},
+        )
+        assert policies.default.max_retries == 2
+        assert policies.step_overrides["TEST_RED"].retry.max_retries == 5
+
+    def test_supervisor_policies_resolves_step_specific_policy(self) -> None:
+        supervisor = _supervisor_module()
+        policies = supervisor.SupervisorPolicies()
+        resolved = policies.resolve_for_step("TEST_RED")
+        assert resolved.max_retries == 2
+
+        policies.step_overrides["TEST_RED"] = supervisor.StepPolicy(
+            step_name="TEST_RED",
+            retry=supervisor.RetryPolicy(max_retries=5),
+        )
+        resolved = policies.resolve_for_step("TEST_RED")
+        assert resolved.max_retries == 5
+
+
+class TestCalculateBackoff:
+    def test_backoff_doubles_each_attempt(self) -> None:
+        supervisor = _supervisor_module()
+        delay = supervisor.calculate_backoff(attempt=1, base_delay=1.0, max_delay=60.0)
+        assert delay == 1.0
+        delay = supervisor.calculate_backoff(attempt=2, base_delay=1.0, max_delay=60.0)
+        assert delay == 2.0
+        delay = supervisor.calculate_backoff(attempt=3, base_delay=1.0, max_delay=60.0)
+        assert delay == 4.0
+        delay = supervisor.calculate_backoff(attempt=4, base_delay=1.0, max_delay=60.0)
+        assert delay == 8.0
+
+    def test_backoff_respects_max_cap(self) -> None:
+        supervisor = _supervisor_module()
+        delay = supervisor.calculate_backoff(attempt=10, base_delay=1.0, max_delay=60.0)
+        assert delay == 60.0
+
+    def test_backoff_with_different_base(self) -> None:
+        supervisor = _supervisor_module()
+        delay = supervisor.calculate_backoff(attempt=1, base_delay=2.0, max_delay=60.0)
+        assert delay == 2.0
+        delay = supervisor.calculate_backoff(attempt=2, base_delay=2.0, max_delay=60.0)
+        assert delay == 4.0
+
+
+class TestAdvancedSupervisorPerStepRetries:
+    def test_test_red_respects_own_max_retries(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor(
+            policies=supervisor_mod.SupervisorPolicies(
+                default=supervisor_mod.RetryPolicy(max_retries=2),
+                step_overrides={
+                    "TEST_RED": supervisor_mod.StepPolicy(
+                        step_name="TEST_RED",
+                        retry=supervisor_mod.RetryPolicy(max_retries=5),
+                    ),
+                },
+            ),
+        )
+        decision = advanced.decide_after_failure(
+            state="TEST_RED",
+            error=supervisor_mod.RetryableStepError("failure"),
+            attempt=4,
+            available_routes=("primary",),
+        )
+        assert decision.action == "retry"
+
+    def test_plan_respects_own_max_retries(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor(
+            policies=supervisor_mod.SupervisorPolicies(
+                default=supervisor_mod.RetryPolicy(max_retries=2),
+                step_overrides={
+                    "PLAN": supervisor_mod.StepPolicy(
+                        step_name="PLAN",
+                        retry=supervisor_mod.RetryPolicy(max_retries=1),
+                    ),
+                },
+            ),
+        )
+        decision = advanced.decide_after_failure(
+            state="PLAN",
+            error=supervisor_mod.RetryableStepError("failure"),
+            attempt=2,
+            available_routes=("primary",),
+        )
+        assert decision.action == "fail"
+        assert decision.reason == "terminal_failure"
+
+
+class TestAdvancedSupervisorTerminalSteps:
+    def test_security_remains_terminal(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor()
+        decision = advanced.decide_after_failure(
+            state="SECURITY",
+            error=ValueError("insecure"),
+            attempt=1,
+            available_routes=("primary",),
+        )
+        assert decision.action == "fail"
+        assert decision.reason == "security_is_terminal"
+
+    def test_spec_validation_remains_terminal(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor()
+        decision = advanced.decide_after_failure(
+            state="SPEC_VALIDATION",
+            error=ValueError("bad spec"),
+            attempt=1,
+            available_routes=("primary",),
+        )
+        assert decision.action == "fail"
+        assert decision.reason == "spec_validation_is_terminal"
+
+
+class TestAdvancedSupervisorFallbackRouting:
+    def test_reroutes_to_fallback_after_exhausting_primary_retries(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor(
+            policies=supervisor_mod.SupervisorPolicies(
+                default=supervisor_mod.RetryPolicy(max_retries=2),
+            ),
+        )
+        decision = advanced.decide_after_failure(
+            state="CODE_GREEN",
+            error=supervisor_mod.RetryableStepError("failure"),
+            attempt=3,
+            available_routes=("primary", "fallback"),
+        )
+        assert decision.action == "reroute"
+        assert decision.route == "fallback"
+        assert decision.reason == "retry_budget_exhausted_with_fallback"
+
+
+class TestAdvancedSupervisorBackoffDelay:
+    def test_returns_backoff_delay_in_decision(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor(
+            policies=supervisor_mod.SupervisorPolicies(
+                default=supervisor_mod.RetryPolicy(
+                    max_retries=3, base_delay_seconds=1.0, max_delay_seconds=60.0
+                ),
+            ),
+        )
+        decision = advanced.decide_after_failure(
+            state="CODE_GREEN",
+            error=supervisor_mod.RetryableStepError("failure"),
+            attempt=2,
+            available_routes=("primary",),
+        )
+        assert decision.action == "retry"
+        assert decision.backoff_seconds == 2.0
+
+    def test_backoff_caps_at_max_delay(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor(
+            policies=supervisor_mod.SupervisorPolicies(
+                default=supervisor_mod.RetryPolicy(
+                    max_retries=10, base_delay_seconds=10.0, max_delay_seconds=60.0
+                ),
+            ),
+        )
+        decision = advanced.decide_after_failure(
+            state="CODE_GREEN",
+            error=supervisor_mod.RetryableStepError("failure"),
+            attempt=10,
+            available_routes=("primary",),
+        )
+        assert decision.action == "retry"
+        assert decision.backoff_seconds == 60.0
+
+
+class TestAdvancedSupervisorOperationalError:
+    def test_launcher_unavailable_short_circuits(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor()
+        op_error = supervisor_mod.AdapterOperationalError(
+            "launcher unavailable",
+            category="launcher_unavailable",
+        )
+        decision = advanced.decide_after_failure(
+            state="CODE_GREEN",
+            error=op_error,
+            attempt=1,
+            available_routes=("primary", "fallback"),
+        )
+        assert decision.action == "reroute"
+        assert decision.route == "fallback"
+        assert decision.reason == "operational_error_short_circuit"
+
+    def test_other_operational_errors_still_retry(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor(
+            policies=supervisor_mod.SupervisorPolicies(
+                default=supervisor_mod.RetryPolicy(max_retries=2),
+            ),
+        )
+        op_error = supervisor_mod.AdapterOperationalError(
+            "some error",
+            category="timeout",
+        )
+        decision = advanced.decide_after_failure(
+            state="CODE_GREEN",
+            error=op_error,
+            attempt=1,
+            available_routes=("primary",),
+        )
+        assert decision.action == "retry"
+
+
+class TestAdvancedSupervisorDefaults:
+    def test_advanced_supervisor_inherits_supervisor_interface(self) -> None:
+        supervisor_mod = _supervisor_module()
+        advanced = supervisor_mod.AdvancedSupervisor()
+        decision = advanced.decide_after_failure(
+            state="CODE_GREEN",
+            error=supervisor_mod.RetryableStepError("failure"),
+            attempt=1,
+            available_routes=("primary",),
+        )
+        assert decision.action == "retry"
+        assert decision.next_state == "CODE_GREEN"
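Reviewer note: the two backoff tests above pin `backoff_seconds` to 2.0 (base 1.0, attempt 2) and 60.0 (base 10.0, attempt 10, cap 60.0). The sketch below shows one formula consistent with both assertions: exponential doubling, capped. It is an assumption, not the actual `synapse_os` implementation, which this diff does not show; `RetryPolicySketch` and its `backoff_seconds` method are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicySketch:
    """Hypothetical stand-in for the RetryPolicy exercised by the tests."""

    max_retries: int = 3
    base_delay_seconds: float = 1.0
    max_delay_seconds: float = 60.0

    def backoff_seconds(self, attempt: int) -> float:
        # Assumed formula: double the base delay for every attempt after
        # the first, then clamp the result to the configured maximum.
        exponent = max(attempt, 1) - 1
        delay = self.base_delay_seconds * 2**exponent
        return min(delay, self.max_delay_seconds)


if __name__ == "__main__":
    policy = RetryPolicySketch(base_delay_seconds=1.0, max_delay_seconds=60.0)
    for attempt in range(1, 9):
        print(attempt, policy.backoff_seconds(attempt))
```

A linear formula (`base * attempt`) would satisfy the same two assertions, so an extra case such as attempt 3 with base 1.0 would be needed to tell the two apart.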
diff --git a/tests/unit/test_workspace_v2.py b/tests/unit/test_workspace_v2.py
new file mode 100644
index 0000000..8286acb
--- /dev/null
+++ b/tests/unit/test_workspace_v2.py
@@ -0,0 +1,198 @@
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from synapse_os.workspace import (
+    WorkspaceState,
+    TrackedWorkspace,
+    WorkspacePool,
+    WorkspaceManager,
+    PoolExhaustedError,
+)
+
+
+class TestWorkspaceState:
+    def test_all_states_documented(self):
+        assert WorkspaceState.CREATING.value == "creating"
+        assert WorkspaceState.READY.value == "ready"
+        assert WorkspaceState.BUSY.value == "busy"
+        assert WorkspaceState.CLEANUP.value == "cleanup"
+        assert WorkspaceState.DESTROYED.value == "destroyed"
+
+
+class TestTrackedWorkspace:
+    def test_create_with_defaults(self, tmp_path: Path):
+        ws = TrackedWorkspace(root=tmp_path)
+        assert ws.root == tmp_path
+        assert ws.state == WorkspaceState.CREATING
+        assert ws.run_id is None
+        assert ws.metadata == {}
+
+    def test_mark_ready(self, tmp_path: Path):
+        ws = TrackedWorkspace(root=tmp_path)
+        ws.mark_ready(run_id="run-1")
+        assert ws.state == WorkspaceState.READY
+        assert ws.run_id == "run-1"
+
+    def test_mark_busy(self, tmp_path: Path):
+        ws = TrackedWorkspace(root=tmp_path, state=WorkspaceState.READY)
+        ws.mark_busy()
+        assert ws.state == WorkspaceState.BUSY
+
+    def test_mark_cleanup(self, tmp_path: Path):
+        ws = TrackedWorkspace(root=tmp_path)
+        ws.mark_cleanup()
+        assert ws.state == WorkspaceState.CLEANUP
+
+    def test_mark_destroyed(self, tmp_path: Path):
+        ws = TrackedWorkspace(root=tmp_path)
+        ws.mark_destroyed()
+        assert ws.state == WorkspaceState.DESTROYED
+
+    def test_reset_for_reuse(self, tmp_path: Path):
+        ws = TrackedWorkspace(root=tmp_path, run_id="run-1", state=WorkspaceState.BUSY)
+        ws.reset_for_reuse()
+        assert ws.state == WorkspaceState.CREATING
+        assert ws.run_id is None
+        assert ws.metadata == {}
+
+    def test_metadata_get_set(self, tmp_path: Path):
+        ws = TrackedWorkspace(root=tmp_path)
+        ws.set_metadata("key", "value")
+        assert ws.get_metadata("key") == "value"
+        assert ws.get_metadata("missing") is None
+
+
+class TestWorkspacePool:
+    def test_create_pool(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=3)
+        assert pool.max_size == 3
+        assert pool.acquired_count == 0
+
+    def test_acquire_returns_tracked_workspace(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=2)
+        ws = pool.acquire("run-1")
+        assert isinstance(ws, TrackedWorkspace)
+        assert ws.run_id == "run-1"
+        assert ws.state == WorkspaceState.READY
+        assert pool.acquired_count == 1
+
+    def test_acquire_creates_directory(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=1)
+        ws = pool.acquire("run-1")
+        assert ws.root.exists()
+
+    def test_acquire_exhausted_raises(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=1)
+        pool.acquire("run-1")
+        with pytest.raises(PoolExhaustedError):
+            pool.acquire("run-2")
+
+    def test_release_returns_to_pool(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=1)
+        ws = pool.acquire("run-1")
+        pool.release(ws)
+        assert pool.acquired_count == 0
+        assert pool.idle_count == 1
+
+    def test_release_resets_workspace(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=1)
+        ws = pool.acquire("run-1")
+        ws.set_metadata("key", "value")
+        pool.release(ws)
+        assert ws.run_id is None
+        assert ws.state == WorkspaceState.READY
+
+    def test_idle_workspaces_tracked(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=2)
+        ws1 = pool.acquire("run-1")
+        pool.acquire("run-2")
+        pool.release(ws1)
+        assert pool.idle_count == 1
+        assert ws1 in pool.idle_workspaces
+
+    def test_discard_removes_from_pool(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=1)
+        ws = pool.acquire("run-1")
+        pool.discard(ws)
+        assert pool.acquired_count == 0
+        assert pool.idle_count == 0
+
+    def test_discard_cleans_directory(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=1)
+        ws = pool.acquire("run-1")
+        (ws.root / "file.txt").write_text("data")
+        pool.discard(ws)
+        assert not ws.root.exists()
+
+    def test_stats(self, tmp_path: Path):
+        pool = WorkspacePool(base_dir=tmp_path, max_size=3)
+        ws1 = pool.acquire("run-1")
+        pool.acquire("run-2")
+        pool.release(ws1)
+        stats = pool.stats()
+        assert stats["total"] == 3
+        assert stats["acquired"] == 1
+        assert stats["idle"] == 1
+        assert stats["discarded"] == 1
+
+
+class TestWorkspaceManager:
+    def test_create_workspace(self, tmp_path: Path):
+        mgr = WorkspaceManager(base_dir=tmp_path, pool_size=2)
+        ws = mgr.create_workspace("run-1")
+        assert isinstance(ws, TrackedWorkspace)
+        assert ws.run_id == "run-1"
+        assert ws.state == WorkspaceState.READY
+
+    def test_cleanup_workspace_calls_hook(self, tmp_path: Path):
+        mgr = WorkspaceManager(base_dir=tmp_path, pool_size=1)
+        hook_called = []
+        mgr.register_cleanup_hook(lambda path: hook_called.append(path))
+        ws = mgr.create_workspace("run-1")
+        mgr.cleanup_workspace(ws)
+        assert len(hook_called) == 1
+        assert hook_called[0] == ws.root
+
+    def test_cleanup_sets_state_to_cleanup(self, tmp_path: Path):
+        mgr = WorkspaceManager(base_dir=tmp_path, pool_size=1)
+        ws = mgr.create_workspace("run-1")
+        mgr.cleanup_workspace(ws)
+        assert ws.state == WorkspaceState.CLEANUP
+
+    def test_get_workspace_returns_cached(self, tmp_path: Path):
+        mgr = WorkspaceManager(base_dir=tmp_path, pool_size=1)
+        ws1 = mgr.create_workspace("run-1")
+        ws2 = mgr.get_workspace("run-1")
+        assert ws1 is ws2
+
+    def test_get_workspace_unknown_returns_none(self, tmp_path: Path):
+        mgr = WorkspaceManager(base_dir=tmp_path, pool_size=1)
+        assert mgr.get_workspace("unknown") is None
+
+    def test_list_workspaces(self, tmp_path: Path):
+        mgr = WorkspaceManager(base_dir=tmp_path, pool_size=2)
+        mgr.create_workspace("run-1")
+        mgr.create_workspace("run-2")
+        workspaces = mgr.list_workspaces()
+        assert len(workspaces) == 2
+
+    def test_pool_size_respected(self, tmp_path: Path):
+        mgr = WorkspaceManager(base_dir=tmp_path, pool_size=2)
+        for i in range(1, 4):
+            try:
+                mgr.create_workspace(f"run-{i}")
+            except PoolExhaustedError:
+                pass
+        assert len(mgr.list_workspaces()) == 2
+
+    def test_cleanup_all(self, tmp_path: Path):
+        mgr = WorkspaceManager(base_dir=tmp_path, pool_size=2)
+        mgr.create_workspace("run-1")
+        mgr.create_workspace("run-2")
+        called = []
+        mgr.register_cleanup_hook(lambda p: called.append(p))
+        mgr.cleanup_all()
+        assert len(called) == 2
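Reviewer note: the pool tests above describe an acquire/release/discard lifecycle bounded by `max_size`. The following is a minimal sketch of a pool with that shape, assuming idle workspaces are reused before new directories are created and that exhaustion is judged by the acquired count alone. Every name here (`WorkspacePoolSketch`, `WorkspaceSketch`, `PoolExhaustedSketchError`, the `ws-N` directory naming) is hypothetical; the real `synapse_os.workspace` module is not part of this diff and may behave differently.

```python
from __future__ import annotations

import shutil
from dataclasses import dataclass, field
from pathlib import Path


class PoolExhaustedSketchError(RuntimeError):
    """Raised when every slot in the sketch pool is already acquired."""


@dataclass(eq=False)
class WorkspaceSketch:
    """Hypothetical tracked workspace: a directory plus the run that owns it."""

    root: Path
    run_id: str | None = None


@dataclass
class WorkspacePoolSketch:
    base_dir: Path
    max_size: int
    _acquired: list[WorkspaceSketch] = field(default_factory=list, init=False)
    _idle: list[WorkspaceSketch] = field(default_factory=list, init=False)
    _created: int = field(default=0, init=False)

    @property
    def acquired_count(self) -> int:
        return len(self._acquired)

    @property
    def idle_count(self) -> int:
        return len(self._idle)

    def acquire(self, run_id: str) -> WorkspaceSketch:
        # Assumption: only acquired workspaces count against max_size.
        if len(self._acquired) >= self.max_size:
            raise PoolExhaustedSketchError(f"no free workspace for {run_id}")
        if self._idle:
            ws = self._idle.pop()  # reuse an idle workspace first
        else:
            self._created += 1
            ws = WorkspaceSketch(root=self.base_dir / f"ws-{self._created}")
        ws.run_id = run_id
        ws.root.mkdir(parents=True, exist_ok=True)
        self._acquired.append(ws)
        return ws

    def release(self, ws: WorkspaceSketch) -> None:
        # Return the workspace to the idle list and clear its owner.
        self._acquired.remove(ws)
        ws.run_id = None
        self._idle.append(ws)

    def discard(self, ws: WorkspaceSketch) -> None:
        # Drop the workspace entirely and delete its directory tree.
        self._acquired.remove(ws)
        shutil.rmtree(ws.root, ignore_errors=True)
```

Under these assumptions the pool never holds more than `max_size` directories, because a new one is only created when the idle list is empty and a slot is free.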