hyperpolymath · hyperpolymath · Apr 21, 2026 · Apr 21, 2026 · Apr 21, 2026 · Apr 21, 2026
@@ -10,6 +10,70 @@ The following files in `.machine_readable/` contain structured project metadata:
 - `.machine_readable/6a2/PLAYBOOK.a2ml` - Operational runbook
 - `.machine_readable/bot_directives/*.a2ml` - Per-bot permission and scope rules
 
+## Canonical Roadmap
+
+See [`docs/ROADMAP.md`](../docs/ROADMAP.md) for the single source of truth on
+where ECHIDNA is going: the 8‑stage map, row‑by‑row endpoint targets,
+current sprint (S1–S5), and agent‑tier guidance.  Record decisions
+there, not in chat.
+
+## Supervised Multi-Agent Execution
+
+When a supervising agent (Opus) plans work that will be implemented by
+a focused executor (Sonnet) or a mechanical bulk agent (Haiku), the
+supervisor MUST run a **Haiku scout pass** immediately before spawning
+the implementation agent, unless the scope is strictly single‑file
+and under ten lines of change.
+
+### Scout pattern
+
+**Purpose**: clear obvious cruft (unused imports, stale comments,
+dead stubs, broken references) so the implementation agent arrives
+to clean ground and does not burn tokens on trivial lint.
+
+**When to run**: once per upcoming step, after the step's scope is
+locked and before the implementation agent is launched.
+
+**Prompt template** (copy this verbatim, fill in the file list):
+
+```
+Haiku scout, scope = <exact file list for the step>
+
+1. Confirm each file compiles in isolation (cargo check --lib / julia -e).
+2. Scan for: unused imports, dead code, `todo!()/FIXME/XXX`,
+   orphan type references, stale doc comments referring to
+   renamed symbols, broken `mod.rs` exports.
+3. Trivially fix:
+   - unused imports (remove)
+   - obvious typos in comments
+   - unused `_var` renames
+4. DO NOT fix — FLAG to me:
+   - any TODO with semantic content (may be load-bearing)
+   - anything requiring judgement about intent
+   - any cross-module change
+5. Report: <20 lines. Green-light or list of blockers.
+```
+
+**Caveats**:
+
+- Scout is **not** a full audit.  Full audits remain the periodic
+  cross‑module sweep (cf. the Agent A + Agent B pattern in session
+  history); scout is narrow prep for an imminent step.
+- Scout must **not over‑fix**.  TODO/FIXME comments sometimes
+  contain the spec for the upcoming work — flag, don't delete.
+- Scout is **idempotent**.  Re‑running on unchanged code should
+  produce the same report.
+- If scout flags a blocker, the supervisor decides whether to
+  resolve it inline, expand the step's scope, or defer.  The
+  implementation agent does not launch until the supervisor
+  green‑lights.
+
+**Commit convention** (when the scout applied trivial fixes):
+
+```
+chore(<scope>): scout-pass trivial cleanup ahead of <step id>
+```
+
 ### CRITICAL: .scm metadata files are DEPRECATED
 
 **All `.scm` state/metadata files have been replaced by `.a2ml`.**

@@ -193,3 +193,6 @@ deps/
 .cache/
 build/
 dist/
+
+# Subagent worktrees (isolation mode)
+.claude/worktrees/
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: PMPL-1.0-or-later
+
+[package]
+name = "typed-wasm"
+version = "0.1.0"
+edition = "2021"
+authors = ["Jonathan D.A. Jewell <j.d.a.jewell@open.ac.uk>"]
+license = "PMPL-1.0-or-later"
+description = "TypedWasm prover oracle engine for ECHIDNA — standalone domain types and engine"
+repository = "https://github.com/hyperpolymath/echidna"
+
+[dependencies]
+anyhow = "1"
+async-trait = "0.1"
+tokio = { version = "1", features = ["full"] }
+serde = { version = "1", features = ["derive"] }
+serde_json = "1"
@@ -0,0 +1,192 @@
+<!-- SPDX-License-Identifier: PMPL-1.0-or-later -->
+<!-- SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell <j.d.a.jewell@open.ac.uk> -->
+
+# ECHIDNA Roadmap
+
+**Status**: canonical • supersedes the aspirational parts of
+[`FUTURE_DEVELOPMENT_ROADMAP.md`](./FUTURE_DEVELOPMENT_ROADMAP.md).
+**Last revised**: 2026‑04‑20
+**Scope**: the shortest honest path from today's repo to the endpoint vision.
+
+This document is the single source of truth for where ECHIDNA is going.
+The domain‑specific plans under [`handover/`](./handover/) and
+[`design/SPARK_ADOPTION_PLAN.md`](./design/SPARK_ADOPTION_PLAN.md) are
+still authoritative for their sub‑areas; they fit *inside* this
+roadmap as implementations of individual stages.
+
+## 1. Endpoint
+
+ECHIDNA, when complete, is the **reasoning substrate of the
+hyperpolymath ecosystem** — not just a proof tool.  It is:
+
+* a unified interface to every mature theorem prover, SMT solver,
+  model checker, and type checker in public use;
+* an ML layer (GNN + Transformer over Flux.jl) that actually learns
+  premise selection, tactic synthesis, and strategy routing well
+  enough to beat hand‑curated choices on held‑out theorems;
+* a persistent E‑R layer (Verisim) that stores every proof, tactic
+  application, and version as a content‑addressed row and closes a
+  real learning loop over historical outcomes;
+* a parallel dispatch layer (Chapel) that runs portfolio solves
+  across CPU cores and dispatches specialised work to coprocessors
+  (GPU / tensor / vector / FPGA / QPU);
+* a high‑speed IPC fabric (Cap'n Proto) between the Rust core and
+  the Julia ML sidecar and between the backend and the frontend;
+* a frontend in AffineScript + Deno (TEA architecture) with i18n via
+  the LOL locale repo;
+* a 16‑endpoint Zig ABI that external consumers (Coursera labs,
+  other projects) link against, with an SPDX‑pinned, Guix‑built,
+  Chainguard‑based container managed by Stapeln and hardened by the
+  Vordr / Selur / Svalinn / Cerro‑Torre surround;
+* itself formally verified — the trust kernel's critical invariants
+  are proved in SPARK and in Idris2 / Agda.
+
+From a user's seat: **indistinguishable from magic for easy
+obligations, honest about ambiguity for hard ones, never silently
+wrong.**
+
+## 2. Stage map
+
+```
+Stage 1  Train the model at all             ML layer can receive signal
+    1a  extractors emit premises            [done 2026‑04‑20, 22 of 50 ✓]
+    1b  align_premises joins correctly      [done 2026‑04‑19 ✓]
+    1c  residual extractor gaps fixed       [in progress, 28 to go]
+    1d  vocab covers the surface            [255 K; target ~1 M]
+
+Stage 2  Signal becomes useful              MRR moves off 0.66 baseline
+    2a  run training at scale               hardware task
+    2b  online vocab growth                 new feature
+    2c  re‑baseline and publish metrics     run gate
+
+Stage 3  Learning loop closes               model improves from outcomes
+    3a  Verisim read paths                  cross‑prover queries by goal_hash
+    3b  hypatia strategy loop wired         `mv_prover_success_by_class` reads
+    3c  Proof / TacticApplication / ProofVersion emission wired
+                                            [defined 2026‑04‑19 ✓, emission pending]
+
+Stage 4  Interaction layer honest           every declared ProverKind works
+    4a  typed_wasm → crates/typed_wasm      extract to standalone crate
+    4b  40 TypeChecker variant dispatch     Sigma‑route via the crate
+    4c  tactic synthesis template           37 suggest_tactics stubs replaced
+    4d  search_theorems template            49 stubs replaced, query Verisim
+
+Stage 5  Distributed & fast                 scale + specialised hardware
+    5a  Cap'n Proto IPC                     Rust ↔ Julia, :8090
+    5b  Chapel full integration             dispatch.rs picks by config
+    5c  coprocessor abstraction             trait + tensor/vector/QPU/FPGA
+    5d  Tier‑4 live‑CI provisioning         19 backends, Containerfile path
+
+Stage 6  Cross‑prover semantics             translation actually works
+    6a  OpenTheory bridge real              bidirectional term translation
+    6b  Dedukti bridge real                 λΠ modulo as lingua franca
+    6c  mathematical identity arbitration   canonicalise across representations
+    6d  Coursera delivery                   Zig ABI exposes capability
+
+Stage 7  Sovereign tooling surround         the rest of the ecosystem
+    7a  Stapeln container lifecycle
+    7b  Vordr entry gate
+    7c  Selur scheduler
+    7d  Svalinn trust boundary
+    7e  Cerro‑Torre observability
+    7f  AffineScript‑TEA frontend           src/rescript/, ≥33 modules
+    7g  LOL i18n                            locale/ + t!() macro
+    7h  Zig ABI 16‑endpoint surface         src/zig/echidna_abi.zig
+
+Stage 8  Self‑verified                      ECHIDNA proves ECHIDNA
+    8a  Idris2 meta‑proofs extend           from 32 modules to full kernel
+    8b  Agda meta‑proofs extend             from 12 modules
+    8c  SPARK‑verified trust kernel         crates/echidna-core-spark/
+    8d  panic‑attack hardened               proptest + AFL++ nightly
+```
+
+## 3. Endpoint target — row by row
+
+| Claim | Today | End‑state target |
+|---|---|---|
+| "Every important solver" | 59 backends; 37 `suggest_tactics` stubs; 49 `search_theorems` stubs | **65 backends, every one with real `suggest_tactics` (GNN‑ranked top‑k) and `search_theorems` (queries Verisim by `goal_hash`) — no `Ok(vec![])` returns anywhere** |
+| "Vocab at 2.5 M" | 255 K | **~1 M canonical tokens** after Mathport + Iris + VST + Flyspeck + HoTT absorption, with **online growth** adding tokens during training |
+| "Chapel fully supported" | 2 files, POC only | **`dispatch.rs` picks Chapel‑parallel dispatch by config**; runtime init + cancellation + error propagation wired; ≥1 OoM speedup on portfolio solves |
+| "Cap'n Proto serialisation" | 0 `.capnp` files | **`crates/echidna-wire/`** contains schemas for ProofState / Goal / Tactic / EmbeddingRequest / RankingResponse; IPC on :8090 is Cap'n Proto; JSON retained only as debug fallback |
+| "Vordr / Selur / Svalinn / Cerro‑Torre / Stapeln" | Not present | Each named as a versioned dependency in Cargo.toml or the container definition, wired into its role |
+| "AffineScript‑TEA frontend" | Not present (10 `.res` files stub) | **`src/rescript/`** holds ≥33 AffineScript‑TEA modules, persistent Model → Msg → Update loop, talks to core over Cap'n Proto WebSocket |
+| "LOL i18n" | Not present | **`locale/`** with LOL‑sourced translations; `t!()` macro for every user‑facing string |
+| "Rust/SPARK base stable" | Rust only | **`crates/echidna-core-spark/`** holds the Ada/SPARK‑verified kernel (axiom scanning, trust‑level computation); proved free of runtime errors; called from Rust via C ABI |
+| "16‑endpoint Zig ABI" | 4 shared libs, endpoint count unverified | **`src/zig/echidna_abi.zig`** exports exactly 16 functions (`echidna_prove`, `echidna_apply_tactic`, `echidna_rank_premises`, `echidna_verify_certificate`, `echidna_post_octad`, `echidna_query_identity`, …) with a versioned header |
+| "100 K+ per prover" | Uneven; long tail <20 K | **Top 15 provers ≥ 100 K records each**; next 20 ≥ 25 K each; bottom 10 synthetically augmented to ≥ 10 K; `stats_UNIFIED.json` reports the distribution publicly |
+| "Panic‑attacked" | No fuzz infra | **`benches/panic_attack/`** with proptest + AFL++ harnesses covering every parser, FFI boundary, IPC schema; nightly CI ≥ 1 h; zero known panic paths |
+| "Prover count" | 59 real, 105 declared | **`ProverKind` = 65 variants, every one factory‑dispatched**; CLAUDE.md states the canonical count; no orphans |
+
+## 4. Current sprint — "the smallest set that flips half the table"
+
+Five concrete commitments.  Landed together inside ~6 weeks they move
+"Every important solver", "Vocab", "Prover count", half of "100 K+",
+and start the self‑learning loop closure.
+
+| # | Stage | Commitment | Agent tier |
+|---|---|---|---|
+| S1 | 1c | Finish the extractor premise‑emission wave (22 done; 28 remaining) | **Sonnet** (in flight) |
+| S2 | 2a / 2c | Run training on the fixed corpus; record new MRR; commit `metrics_baseline.jsonl` | **Hardware / you** |
+| S3 | 4a / 4b | Extract `typed_wasm` to `crates/typed_wasm/`; route 40 TypeChecker variants through Sigma parameters | **Opus‑design + Sonnet‑impl** |
+| S4 | 3a / 3b | Wire Verisim cross‑prover read paths (`goal_hash` queries + `mv_prover_success_by_class` + hypatia loop) | **Opus‑design + Sonnet‑impl** |
+| S5 | 4c / 4d (pilot) | Tactic synthesis template for 5 high‑value provers (coq, lean, agda, isabelle, z3) using the GNN | **Opus‑design + Sonnet‑impl** |
+
+After S1–S5 land, the roadmap's next sprint takes up Stage 5 (IPC +
+Chapel + Tier‑4 CI) and Stage 8 begins in parallel.
+
+## 5. Agent‑tier guidance
+
+**Opus** (supervisor): architecture, schema design, cross‑module
+contracts, merge review.  Touches: IPC protocol design, Verisim
+schema additions, Chapel FFI shape, coprocessor trait, SPARK kernel
+surface.
+
+**Sonnet** (executor): focused implementation against a clear spec,
+3–20 files, moderate complexity.  Touches: per‑prover tactic
+synthesis, Verisim read client, typed_wasm crate extraction,
+adapter code, worker logic.
+
+**Haiku** (bulk): template‑driven mechanical edits, CI YAML,
+deletes, repetitive stubs.  Touches: Tier‑4 provisioning shell
+commands, extractor boilerplate, per‑prover factory branches that
+follow a mould, localisation key additions.
+
+## 6. Inheritance from existing planning docs
+
+* **[`docs/FUTURE_DEVELOPMENT_ROADMAP.md`](FUTURE_DEVELOPMENT_ROADMAP.md)**
+  — 2026‑01‑29 vision document.  Its chapters on RL for tactic
+  search (§1.1), active learning (§1.2), and distributed dispatch
+  map onto Stages 3c / 4c and 5b/5c of this roadmap.  Retain as
+  background reading; this document is the source of truth for
+  sequencing.
+* **[`docs/handover/PRODUCTION-WIRING-PLAN.md`](handover/PRODUCTION-WIRING-PLAN.md)**
+  — L1/L2/L3 sprint doc.  L1 = Stage 5a.  L2 = Stage 5b.  L3 Wave‑1/2
+  are complete (in Stage 4 surface); Wave‑3 is partially complete in
+  this branch (tamarin/proverif/metamath/twelf/ortools); Wave‑4 is
+  Stage 5d.
+* **[`docs/design/SPARK_ADOPTION_PLAN.md`](design/SPARK_ADOPTION_PLAN.md)**
+  — design for the SPARK‑verified kernel.  Executes Stage 8c.
+* **[`docs/handover/L1-CAPNPROTO-PROMPT.md`](handover/L1-CAPNPROTO-PROMPT.md)**
+  — implementation brief for Stage 5a.  Use as the Sonnet prompt
+  when S6 begins.
+* **[`docs/handover/L2-CHAPEL-PROMPT.md`](handover/L2-CHAPEL-PROMPT.md)**
+  — implementation brief for Stage 5b.
+* **[`docs/handover/L3-LIVE-PROVER-CI-PROMPT.md`](handover/L3-LIVE-PROVER-CI-PROMPT.md)**
+  — implementation brief for Stage 5d.
+* **[`docs/handover/STATE.md`](handover/STATE.md)** and
+  **[`.machine_readable/6a2/STATE.a2ml`](../.machine_readable/6a2/STATE.a2ml)**
+  — running log of session‑by‑session progress.  Continue to use as
+  running state; sync highlights to §2 of this document at each
+  sprint boundary.
+
+## 7. How to update this roadmap
+
+* When a stage sub‑item lands, mark its line `[done YYYY‑MM‑DD ✓]`
+  in §2 and update the corresponding row of §3.
+* When a new endpoint target emerges, add a row to §3, extend §2's
+  stage map if it doesn't already fit, and add it to the sprint
+  plan in §4 only once it's next‑up.
+* Don't remove rows from §3 when they land — keep the history
+  visible, the "Today" column becomes "Done" and the "End‑state"
+  column stays as the invariant.
+* Record non‑trivial decisions in this doc, not in chat.
@@ -43,6 +43,14 @@ function run_extract()
                 "goal"=>"$(splitext(f)[2][2:end])_circuit size=$(sz)",
                 "kind"=>splitext(f)[2][2:end],
                 "context"=>Any[]))
+            # Emit premises: identifiers from circuit filename stem
+            stem = splitext(basename(f))[1]
+            for hm in eachmatch(r"\b([a-zA-Z_][a-zA-Z0-9_]{1,40})\b", stem)
+                h = strip(hm.captures[1])
+                !isempty(h) && length(h) < 50 && push!(pm, Dict{String,Any}(
+                    "proof_id"=>id, "premise"=>h,
+                    "prover"=>"ABC", "theorem"=>basename(f), "source"=>"abc"))
+            end
             id += 1
         catch e; println("Warning: $f: $e"); end
     end

@@ -26,6 +26,7 @@ const REPO_ROOT = dirname(dirname(abspath(@__FILE__)))
 const EXTERNAL_DIR = joinpath(REPO_ROOT, "external_corpora", "acl2")
 const OUTPUT_DIR = joinpath(REPO_ROOT, "training_data")
 const OUTPUT_FILE = joinpath(OUTPUT_DIR, "proof_states_acl2.jsonl")
+const PREMISES_FILE = joinpath(OUTPUT_DIR, "premises_acl2.jsonl")
 const STATS_FILE = joinpath(OUTPUT_DIR, "stats_acl2.json")
 const START_ID = 92000
 
@@ -297,8 +298,17 @@ function run()::Tuple{Int,Int}
     end
     println("  Generated $added unique synthetic proofs")
 
+    # ACL2 premise patterns: :use hints, :in-theory enables, defthm names
+    acl2_hyp_patterns = [
+        r":use\s+\(?:?instance\s+([a-zA-Z0-9_-]+)",
+        r":use\s+\(([a-zA-Z0-9_-]+)",
+        r":in-theory\s+\(enable\s+([a-zA-Z0-9_\s-]+)\)",
+        r"\(defthm\s+([a-zA-Z0-9_-]+)",
+    ]
+
     current_id = START_ID
     output_records = Dict{String,Any}[]
+    premises = Dict{String,Any}[]
     for entry in all_entries
         record = Dict{String,Any}(
             "id" => current_id,
@@ -310,6 +320,25 @@ function run()::Tuple{Int,Int}
             "source" => get(entry, "source", "acl2"),
         )
         push!(output_records, record)
+        # Emit premise records from hints/tactic_proof
+        proof_text = get(entry, "tactic_proof", "") * " " * string(get(entry, "hints", ""))
+        thm_name = entry["theorem"]
+        for hyp_pattern in acl2_hyp_patterns
+            for hyp_match in eachmatch(hyp_pattern, proof_text)
+                hyps = [strip(h) for h in split(hyp_match.captures[1], ' ')]
+                for hyp in hyps
+                    if !isempty(hyp) && length(hyp) < 50
+                        push!(premises, Dict{String,Any}(
+                            "proof_id" => current_id,
+                            "premise" => String(hyp),
+                            "prover" => "ACL2",
+                            "theorem" => thm_name,
+                            "source" => get(entry, "source", "acl2"),
+                        ))
+                    end
+                end
+            end
+        end
         current_id += 1
     end
 
@@ -318,6 +347,11 @@ function run()::Tuple{Int,Int}
             println(fh, JSON3.write(rec))
         end
     end
+    open(PREMISES_FILE, "w") do fh
+        for p in premises
+            println(fh, JSON3.write(p))
+        end
+    end
 
     stats = Dict{String,Any}(
         "prover" => "ACL2",

@@ -48,6 +48,13 @@ function extract_acl2s_proofs()
                         "theorem" => name, "goal" => statement,
                         "context" => Any[],
                     ))
+                    # Emit premises: ACL2s identifiers from theorem statement
+                    for hm in eachmatch(r"\b([a-zA-Z][a-zA-Z0-9_\-]{1,40})\b", statement)
+                        h = strip(hm.captures[1])
+                        !isempty(h) && length(h) < 50 && push!(premises, Dict{String,Any}(
+                            "proof_id"=>current_id, "premise"=>h,
+                            "prover"=>"ACL2s", "theorem"=>name, "source"=>"acl2s"))
+                    end
                     current_id += 1
                 end
             end
-Original file line number
+Diff line change
@@ Expand Up / @@ -193,3 +193,6 @@ deps/ @@
     .cache/
     build/
     dist/
+    # Subagent worktrees (isolation mode)
+    .claude/worktrees/