Skip to content

feat(corpus): resolved & expanded training corpus (700K+ proofs)#25

Merged
hyperpolymath merged 449 commits intomainfrom
claude/expand-vocabulary-corpus-1SFwy
Apr 20, 2026
Merged

feat(corpus): resolved & expanded training corpus (700K+ proofs)#25
hyperpolymath merged 449 commits intomainfrom
claude/expand-vocabulary-corpus-1SFwy

Conversation

@hyperpolymath
Copy link
Copy Markdown
Owner

This PR resolves the issues from the closed #23.

Key changes:

  • Fixed syntax errors in Mizar and Isabelle extraction scripts.
  • Resolved ID conflicts in the Dafny corpus.
  • Expanded UNIFIED training corpus to 702,099 proofs.
  • Expanded model vocabulary to 141,517 terms.
  • Verified with full test suite (613/613 passing).
  • Cleaned git history of accidentally tracked large stale files.

hyperpolymath and others added 30 commits March 16, 2026 02:06
Convert Guile Scheme state files to A2ML format, preserving
project-specific content as structured sections.

Part of global TODO cleanup (2026-03-16).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added from rsr-template-repo: standardizing CI/CD across all repos.

Part of global TODO cleanup (2026-03-16).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…files

- Convert Trustfile.hs to standardized Trustfile.a2ml format
- Remove Trustfile example files
- Remove legacy AI.djot files (migrated to 0-AI-MANIFEST.a2ml)
- Remove duplicate root AI.a2ml files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Part of gv-clade-index Phase 1: every repo declares its identity,
primary clade, and forge mappings for the VeriSimDB central registry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pin all GitHub Actions to specific commit SHAs instead of mutable tags
to prevent supply chain attacks. Tags preserved as comments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds the forbid(unsafe_code) crate-level attribute to all Rust crates
that do not use unsafe code, hardening the safety guarantee at compile
time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ooling

Refine workflow permissions from read-all to granular contents:read across
all 21 CI/CD workflows. Add FFI wrapper modules and integration test for
GraphQL/gRPC/REST interfaces. Expand training corpus with Metamath, CoqGym,
and mathlib4 extraction scripts (Julia + Python tooling). Add algebra proof
examples for Coq, Isabelle, and Lean. Update .gitignore for external corpora
and large intermediate training artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ooling

Refine workflow permissions from read-all to granular contents:read across
all 21 CI/CD workflows. Add FFI wrapper modules and integration test for
GraphQL/gRPC/REST interfaces. Expand training corpus with Metamath, CoqGym,
and mathlib4 extraction scripts (Julia + Python tooling). Add algebra proof
examples for Coq, Isabelle, and Lean. Update .gitignore for external corpora
and large intermediate training artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	.github/workflows/generator-generic-ossf-slsa3-publish.yml
#	.github/workflows/guix-nix-policy.yml
#	.github/workflows/jekyll-gh-pages.yml
#	.github/workflows/quality.yml
#	.github/workflows/rsr-antipattern.yml
#	.github/workflows/rust-ci.yml
#	.github/workflows/rust.yml
#	.github/workflows/scorecard.yml
#	.github/workflows/security-policy.yml
#	.github/workflows/ts-blocker.yml
#	.github/workflows/wellknown-enforcement.yml
#	STATE.scm
Wire up the dormant Chapel parallel proof search into the build system:
- Add `chapel` feature flag to Cargo.toml
- Add build.rs to link Zig FFI library when feature enabled
- Update build.zig for Zig 0.14+/0.15+ API compatibility
- Add C header (chapel_ffi_exports.h) defining Chapel ↔ Zig ABI
- Add stub implementations for building/testing without Chapel runtime
- Add Justfile recipes (build-chapel-ffi, test-chapel, chapel-all, etc.)
- Add chapel-ci.yml workflow (Zig FFI build + Rust feature gate CI)
- Fix var→const in chapel_bridge.zig for Zig 0.15 compliance

The Chapel accelerator remains optional (default off). Enable with:
  just build-chapel    # or: cargo build --features chapel

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Integrate frontier language models (Claude, GPT-4, etc.) as proof
advisors through BoJ Server's cartridge system. The LLM suggests
tactics, ranks provers, decomposes goals, and generates auxiliary
lemmas — but CANNOT influence trust levels.

Architecture:
  ProofState → LlmAdvisor → BoJ REST API → Frontier LLM
  LLM suggestions → ProverDispatcher (trust pipeline unchanged)

New module: src/rust/llm.rs
  - LlmAdvisor: connects to BoJ (localhost:7700) for LLM guidance
  - TacticSuggestionRequest/Response: structured JSON protocol
  - ProverRecommendation: LLM-ranked prover selection
  - GoalDecomposition: LLM-suggested subgoal splitting
  - DispatchOptimisation: LLM-guided dispatch configuration
  - In-context learning via proof attempt history
  - Cost-aware model routing via BoJ model-router cartridge
  - 8 tests (all passing)

Integration into dispatch.rs:
  - ProverDispatcher.with_llm_advisor(): attach LLM advisor
  - verify_proof_llm_guided(): LLM-optimised dispatch path
  - Falls back to standard dispatch if LLM unavailable

Trust invariant: LLM optimises WHICH PATH to take, never WHAT
THE ANSWER IS. Trust levels computed solely by dispatch pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Integrates ECHIDNA with VeriSimDB for persistent, queryable, federated
proof storage across all 8 modalities (semantic, temporal, provenance,
document, graph, vector, tensor, spatial).

New modules (behind `verisimdb` feature flag):
- proof_encoding.rs: CBOR encode/decode + SHA-256 proof identity hashing
- verisimdb_bridge.rs: ProofOctadBuilder maps ProofState → 8-modality
  octad, VeriSimDBClient for async HTTP, provenance hash-chain builder

Updated:
- agent/memory.rs: VeriSimDBProofStore implements ProofMemory trait,
  stores proofs as octads with graceful fallback to in-memory cache
- Cargo.toml: optional serde_cbor + sha2 deps behind verisimdb feature
- lib.rs: conditional module exports

ECHIDNA Priority 3 (VeriSimDB integration) — Rust side complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Formal interface proofs and C-ABI implementations for all 30 ECHIDNA
prover backends, organized by protocol category:

Idris2 ABI (dependent type proofs):
- InteractiveAssistants.idr: 10 provers (Agda, Coq, Lean, Isabelle,
  Idris2, F*, HOL4, HOL Light, Nuprl, Minlog) — session state machine,
  logic foundation proofs, tactic typing
- SmtSolvers.idr: 3 provers (Z3, CVC5, Alt-Ergo) — SMT-LIB2 protocol,
  push/pop stack discipline, certificate trust proofs, portfolio safety
- FirstOrderAtp.idr: 3 provers (Vampire, E Prover, SPASS) — TPTP
  protocol, SZS ontology, TSTP proof validity
- DeclarativeProvers.idr: 7 provers (Metamath, Mizar, PVS, ACL2, TLAPS,
  Twelf, Imandra) — file verification, kernel trust, axiom transparency
- AutoActive.idr: 2 provers (Dafny, Why3) — VC pipeline, backend
  delegation proofs, annotation tracking
- ConstraintSolvers.idr: 5 provers (GLPK, SCIP, MiniZinc, Chuffed,
  OR-Tools) — solver state machine, problem class capabilities
- Provers.idr: top-level re-export with universal dispatcher

Zig FFI (C-ABI, thread-safe, mutex-guarded):
- interactive.zig, smt.zig, atp.zig, declarative.zig, autoactive.zig,
  constraint.zig — matching session management, state validation, tests

2,693 lines. Every state transition proven in Idris2, enforced in Zig.
ECHIDNA Priority 4 (per-prover ABI + FFI) — complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements ECHIDNA's VQL-UT integration for type-safe federated proof
queries across all 30 prover backends via VeriSimDB octads.

New files:
- VqlUt.idr: Idris2 ABI with proofs for all 10 type safety levels
  (parse, schema, type-compat, null, injection, result-type,
  cardinality, effect-tracking, temporal, linearity)
- vql_ut.zig: Zig FFI query session engine (parse → typecheck →
  plan → execute → collect pipeline, thread-safe)
- vql_ut.rs: Rust CrossProverQueryBuilder (fluent API enforcing
  progressive type safety), QueryExecutor routing to VeriSimDB,
  8 query operations, 9 tests

Query types: FindProof, FindSimilar, CrossProverSearch, ProvenanceTrace,
TemporalHistory, DependencyGraph, AxiomUsage, TacticStats.

ECHIDNA Priority 5 (VQL-UT cross-prover queries) — complete.
All 5 Prover Wars priorities now done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes all integration gaps in the ECHIDNA Prover Wars stack:

- vql_ut.rs: executor methods now call VeriSimDB REST endpoints
  (/api/v1/search/text, /api/v1/search/vector, /api/v1/search/related)
  instead of returning empty stubs. Includes URL encoding, search
  result parsing, and Julia embedding fetch for vector similarity.

- agent/memory.rs: VeriSimDBProofStore.store_success() now fetches
  a 512-dim goal embedding from the Julia inference service (port 8090)
  and passes it to ProofOctadBuilder.with_embedding() before storing
  the octad. Graceful fallback if Julia is unreachable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QueryExecutor now calls TypeLL /api/v1/vql-ut/check before executing
queries, validating that the query meets its declared safety level.
Falls back gracefully if TypeLL is unreachable (3s timeout).
Configurable via TYPELL_URL env var.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add AI manifest, Trustfile, Dustfile, and assail recipe as part of the
Floor Raise campaign to establish baseline tooling across all repos.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hyperpolymath and others added 27 commits April 18, 2026 13:42
…rates

Three unrelated CI failures triaged:

1. `security-scan.yml` referenced `hyperpolymath/panic-attacker` — the
   repo is `panic-attack`.  GitHub follows the repo rename for API
   calls, but `uses:` does not, so the workflow failed to resolve.
2. `hypatia-scan.yml` pinned `actions/upload-artifact` at a SHA that
   no longer resolves; replace with the canonical v4 SHA already used
   by chapel-ci.yml in this same repo.
3. `scorecard.yml` pinned `github/codeql-action/upload-sarif` at a SHA
   that doesn't exist on the upstream repo.  Replace with the v3.28.1
   SHA used by codeql.yml and scorecard-enforcer.yml for consistency.

Also runs `cargo update -p rustls-webpki -p rand` to pull in the
security-patched versions flagged by the six low-severity dependabot
alerts (rustls-webpki 0.103.10 → 0.103.12, rand 0.8.5 → 0.8.6,
rand 0.9.2 → 0.9.4).

Remaining CI failures are secrets/infra issues, not code issues:
- `Mirror to Git Forges`: Radicle SSH key secret not provisioned in
  this repo; mirror-gitlab job succeeds, mirror-radicle fails at
  `~/.radicle/keys/radicle: No such file or directory`.
- `Instant Sync`: GH_PAT-equivalent token for the propagation workflow
  returns "Bad credentials" when dispatching to `.git-private-farm`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t bust adjust dust)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fifth AI-WORK §1.6 residual item: cargo check --lib --features verisim
compiled with 22 errors; default (gate off) build was unaffected. This
commit closes all 22 with a small surgical patch.

Breakdown:

- 7× E0616 private-field access (VeriSimDBClient.base_url / .http):
  vcl_ut composes queries directly against these two fields in six
  execute_* methods (find_proof, find_similar, cross_prover, provenance,
  dependency, fetch_goal_embedding). Exposing them to the rest of the
  crate via pub(crate) is cheaper and more honest than narrowing each
  call site through an accessor pair that would do exactly the same
  thing.

- 3× E0422 Goal-type not found + 1× E0425 theorem-value not found:
  vcl_ut was missing `use crate::core::Goal;` under the verisim gate,
  and execute_find_proof bound the theorem name as `_theorem` (unused
  marker) but then referenced `theorem` at proof_identity — rename
  to `theorem`.

- 4× cannot-find-macro `warn`:
  Two files imported `tracing::{debug, info}` but called `warn!` in
  the body. Add `warn` to the `use tracing::{...}` line in vcl_ut.rs
  and agent/memory.rs.

- 7× E0282 type-annotations needed:
  Downstream fallout from the E0616 errors — once the field accesses
  type-check, reqwest's Result<Response, Error> inference completes
  without hints. No annotations actually needed.

Verified:

  cargo check --lib                       clean
  cargo check --lib --features verisim    clean (was 22 errors)

Three unrelated pre-existing warnings remain (1 unreachable pattern in
provers/mod.rs:1365, 2 unused-import). Not touched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
train_models.jl walks each training_data/*.jsonl line with a hand-
rolled regex because the script avoids JSON.jl as a dependency. The
patterns for "goal", "prover", and "tactic" required zero whitespace
after the colon — i.e. `"goal":"..."` — but the generated corpus
files have `"goal": "..."` (one space). Result: the pre-fix run
loaded zero proof states, zero tactics, and saved a zero-word
vocabulary while reporting "complete".

Fix: allow optional `\s*` after each colon in all three patterns.
Equivalent to using a real JSON parser but keeps the zero-dep
property the script values.

Post-fix run (COMPLETE corpus, PID 3994289 at commit time):
  premises_COMPLETE.jsonl        300 lines
  proof_states_COMPLETE.jsonl  66,611 lines
  tactics_COMPLETE.jsonl      179,933 lines

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Julia's default ConsoleLogger wraps stderr in a stream that only
flushes on tty writes. Under nohup/systemd/Claude-background runs
the whole training session stays silent — no @info output, no
epoch progress, no "saved models" — which hid the earlier
0-data-loaded failure (the regex parser bug fixed in e93ab25)
for 17 minutes before it was caught.

This commit swaps the default logger for `SimpleLogger` writing
directly to stderr, and adds a defensive 1-second flush loop
as belt-and-braces. Startup now prints @info lines within
~30-60s (Julia JIT precompile) instead of never.

No behavioural change to the training itself — only observability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two env-var knobs to train_models.jl so multiple training
runs can execute concurrently against the same source without
stomping on each other's output:

  ECHIDNA_MODELS_DIR     default "models"     (output artefact root)
  ECHIDNA_TRAIN_EPOCHS   default 50           (logistic epochs)

Use case: epoch-variant sweep where a quick-baseline (5 epochs),
intermediate (20 epochs), and full (50 epochs) run launch in
parallel and the shortest gives a usable MRR-comparable model
within minutes, while the longer runs refine behind it.

Example:

  ECHIDNA_TRAIN_EPOCHS=5  ECHIDNA_MODELS_DIR=models/e5  \
      julia src/julia/train_models.jl &
  ECHIDNA_TRAIN_EPOCHS=20 ECHIDNA_MODELS_DIR=models/e20 \
      julia src/julia/train_models.jl &
  ECHIDNA_TRAIN_EPOCHS=50 ECHIDNA_MODELS_DIR=models/e50 \
      julia src/julia/train_models.jl &

No behavioural change when both env vars are unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `.-` in the weight-update line created a new matrix and discarded
it; only the bias was being mutated. Training loss plateaued at 0.7906
across all epochs because the weights never moved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Promotes Echidna from mock-only CI to real subprocess validation against
canonical micro-goals.  Wave-1 covers Tier-1 backends (apt-installable) and
seeds Tier-2 (build-from-source) entries that subsequent waves will fill in.

- manifests/live-provers.scm    Guix manifest, primary reproducible path
- flake.nix                     liveProverDeps + .#live-provers devShell, fallback
- .github/workflows/live-provers.yml
                                Tiered workflow: T1/PR, T2/nightly, T3/weekly,
                                T4/quarterly + Guix-resolution allow-fail check
- tests/live_prover_suite.rs    Cargo-feature-gated suite; auto-skip when
                                binary absent; version() smoke test per backend
- Cargo.toml                    new live-provers feature flag
- .machine_readable/6a2/STATE.a2ml
                                session-2026-04-19-production-wiring-kickoff
                                logged with locked decisions D1..D6

Decisions locked this session (full plan in
~/Desktop/ECHIDNA-PRODUCTION-WIRING-PLAN.md):

  D1 Cap'n Proto chosen over Bebop3 (dependability + zero-copy + maturity)
  D2 Chapel POC promoted to first-class layer across 7 sub-waves (L2)
  D3 Guix primary, Nix fallback (per project CLAUDE.md)
  D4 Order: L3 (this) -> L1 (capnp) -> L2 (chapel)
  D5 Live-prover CI tiered: T1/PR, T2/nightly, T3/weekly, T4/quarterly
  D6 No-JSON rule enforced; capnp kills HTTP+JSON on Rust<->Julia hot path

Continuation prompts on Desktop:
  ECHIDNA-PRODUCTION-WIRING-PLAN.md   master plan + session log
  ECHIDNA-L3-LIVE-PROVER-CI-PROMPT.md continuation for L3 Wave-2..4
  ECHIDNA-L1-CAPNPROTO-PROMPT.md      continuation for protocol swap
  ECHIDNA-L2-CHAPEL-PROMPT.md         continuation for Chapel maximisation

Note: Wave-1 is scaffolding.  CI will surface any naming drift between the
test harness ProverKind variants and the backend default_executable strings;
those drifts are exactly the mock-vs-reality gaps this suite exists to expose.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tandalone

Cargo build --features chapel was broken end-to-end even though symbol
names and struct layouts across Chapel→Zig→Rust were already correct.
The static lib shipped by src/zig_ffi had unresolved chapel_* externs
(chapel_stubs.c was only compiled into the test binary), so Rust's
link step failed whenever no real libechidna_chapel was on the system.

- src/zig_ffi/build.zig: add -Dstubs=<bool> (default true); when true,
  bundle chapel_stubs.c into both the shared and static echidna_chapel_ffi
  with -fno-sanitize=undefined (Zig's default UBSan on C pulled in
  __ubsan_* that Rust's linker couldn't resolve). Pass -Dstubs=false
  when linking against real Chapel built from chapel_poc/.
- src/rust/proof_search.rs: add `use anyhow::Context;` inside the
  chapel_ffi submodule — pre-existing compile error exposed once the
  lib linked.
- .machine_readable/6a2/STATE.a2ml: new session entry correcting the
  stale gap list — Tamarin (tamarin.rs, 592 LoC) and ProVerif
  (proverif.rs, 799 LoC) bridges are fully wired through ProverFactory
  and kind_from_u8/kind_to_u8, not "planned"; 0 TODO/FIXME in src/rust/.

Verified: cargo build --features chapel OK; cargo test --features chapel
--lib proof_search → 6/6 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tputs

Follow-up to 2392923 — human/machine parity.

- QUICKSTART-DEV.adoc: new section describing the zig build -Dstubs=true
  default and the cargo build --features chapel two-step; plus the
  -Dstubs=false path for linking against real libechidna_chapel.
- chapel_poc/README.md: mark "Add FFI bindings to call from Rust" as
  DONE, pointing to the Chapel → Zig → Rust chain that's now wired.
- .gitignore: add models/e*/, /models_e*/ (per-epoch training
  checkpoints, regenerable) and src/zig_ffi/zig-out/, .zig-cache/
  (regenerated by `zig build`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tlaps tests

Tier-2 (nightly) live-prover CI now has real provisioning, not TODO
placeholders, for every backend except hol-light (which defers to
Wave-3 container provisioning — no prebuilt binary, opam build is
~20 min and pulls camlp5).

- idris2:  bootstrap from source tarball against Chez Scheme.
- isabelle: Isabelle2024 tarball, symlink /usr/local/bin/isabelle.
- dafny:   dotnet tool install --global Dafny (ubuntu-latest has sdk-8).
- fstar:   GitHub release tarball; binary is fstar.exe even on Linux.
- tlaps:   self-extracting installer from tlapm releases; binary is tlapm.
- hol-light: deferred with explicit exit 0 + comment.

Also adds live_fstar_version and live_tlaps_version to the Rust test
suite (matches provers/mod.rs executable names: fstar.exe, tlapm) and
fills in the missing library_paths field on ProverConfig that had made
the test file fail to compile.

Local verification: `cargo test --test live_prover_suite --features
live-provers` runs all 18 tests; 13 real binaries (Z3, CVC5, EProver,
Alt-Ergo, Why3, Vampire, Dafny, Isabelle, Idris2, F*, Coq, Agda, Lean4)
return real versions; 5 auto-skip (GLPK/SPASS/MiniZinc/TLAPS/Chuffed
not on this host). No mock tests were modified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Session entry session-2026-04-19-l3-wave-2 captures what was wired in
8bbde25 (Tier-2 provisioning commands + 2 new Rust tests) and lists the
remaining work per wave so the next agent doesn't have to re-derive it:

- Wave-1: DONE (9 Tier-1 backends, 45a7a95).
- Wave-2: DONE for 9; hol-light deferred.
- Wave-3: scaffold only — 9 Tier-3 backends need per-backend
  Containerfiles. Handover hints added for each.
- Wave-4: scaffold only — 19 Tier-4 backends retained as mock-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Makes the production-wiring plan + L1/L2/L3 continuation prompts
part of the repo so a fresh clone has the full handover context
without needing Desktop access.

- docs/handover/PRODUCTION-WIRING-PLAN.md (← ECHIDNA-PRODUCTION-WIRING-PLAN.md)
- docs/handover/L1-CAPNPROTO-PROMPT.md
- docs/handover/L2-CHAPEL-PROMPT.md
- docs/handover/L3-LIVE-PROVER-CI-PROMPT.md
- docs/handover/README.md — drift-handling policy (both copies
  updated in same commit; in-repo copy wins on conflict).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same treatment as the coord-MCP synthesis: merges the four Desktop
echidna continuation prompts (L1/L2/L3 + master plan) into two tight
summaries while keeping the full prompts as deeper reference.

Added:
- docs/handover/TODO.md — P0→P4 backlog across all three phases.
  P0: watch next nightly + real-Chapel CI job. P1: L3 Wave-3 + Wave-4
  + Dafny deepen + VeriSimDB. P2: L1 Cap'n Proto (gated on L3 hand-off).
  P3: L2 Chapel 7 sub-waves (gated on L1). P4: adjacent/deferred.
  Includes 3 open questions (Imandra licence, CapnProto.jl vs shim,
  Chapel default-on threshold).
- docs/handover/STATE.md — per-phase status honest: L3 Wave-1+2 DONE,
  Wave-3+4 scaffold only; L1 not started; L2 Zig self-link fix landed
  but no src/chapel/ yet. Locks decisions D1-D6 from the master plan.
  Corrects stale-listed Tamarin/ProVerif as fully wired.

Updated:
- docs/handover/README.md — index now points to TODO + STATE first;
  the full L1/L2/L3 prompts remain as deeper reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Grow the per-prover corpora for the four provers with no public downloadable
training data (tiny communities, small libraries):

  Nuprl:   50 -> 152  (+102, +204%)
  Minlog:  46 -> 148  (+102, +221%)
  Imandra: 47 -> 133  (+86,  +183%)
  Twelf:   33 -> 118  (+85,  +258%)

New categories added to scripts/generate_synthetic_provers.jl:

  Nuprl:   sets, relations, number_theory, real_analysis, computability,
           category_theory, topology, algebra, probability, bishop_analysis,
           proof_theory, information, quotient_types, w_types
  Minlog:  lists, trees, classical_logic, sequent_calculus, induction_advanced,
           higher_order, program_verification, modal_logic, intuitionistic,
           games, realizers_advanced, bounded_arithmetic, finite_logic
  Twelf:   binary_trees, substitution, session_types, linear_lf, metatheory,
           polymorphism, modules, records, traces, bigstep, hoas, parsing,
           automata, effects, numerics
  Imandra: trees, numerics, records, state_machines, crypto, blockchain,
           algebra, concurrency, databases, networking, security, ml_correctness

Regenerated proof_states_{nuprl,minlog,twelf,imandra}.jsonl match the script.

https://claude.ai/code/session_0173ntsBsELMiXaTWvtjXdN8
Second pass: grow per-prover corpora for provers using extract scripts
with synthetic fallback sections:

  MiniZinc:  29 ->  52 (+23, timetabling/network/logistics/puzzle/production)
  Dafny:     48 ->  88 (+40, maps/loops/induction/ghost/types/concurrency)
  Mizar:     66 -> 128 (+62, sequences/lattices/metric/finseq/ordinals/measure)
  F*:        76 ->  92 (+16, sequences/buffers/monotonic/tactics/dependent)
  Idris2:    87 ->  95 (+8,  maybe/either/nat_order/stream/universe/type_level)
  Isabelle:  97 -> 105 (+8,  sets/analysis/algebra/logic/lists/matrix/induction/auto)

Combined with pass 1 (Nuprl/Minlog/Twelf/Imandra), total corpus grows
from 10,645 to 11,177 entries (+532, +5%).

All 17 JSONL files validate as well-formed JSON.

https://claude.ai/code/session_0173ntsBsELMiXaTWvtjXdN8
Corpus expansion (pass 3):
  F*:       92 -> 136 (+44, machine_integers/parsers/state_machine/protocol/
                            arrays/monad_laws/string_processing)
  Idris2:   95 -> 132 (+37, bool/pair/snoclist/proof_search/with_views/cong)
  MiniZinc: 52 ->  82 (+30, energy/healthcare/education/telecom/agriculture/
                            manufacturing)

Vocabulary expansion:
  tactic_vocab.txt:  422 -> 6,130  (14.5× growth)
  premise_vocab.txt: 422 -> 60,046 (142× growth)

Both vocabs extracted from all 17 corpus JSONL files (context arrays for
tactics, theorem names for premises). All JSONL files validate as well-formed.

https://claude.ai/code/session_0173ntsBsELMiXaTWvtjXdN8
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@hyperpolymath hyperpolymath merged commit 03ae8ba into main Apr 20, 2026
26 of 40 checks passed
@hyperpolymath hyperpolymath deleted the claude/expand-vocabulary-corpus-1SFwy branch April 20, 2026 07:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants