Merge main into main2 by keyboardDrummer · Pull Request #1211 · strata-org/Strata

keyboardDrummer · 2026-05-22T08:41:44Z

Warning: This repository will shortly undergo a split into several separate repositories. If you're creating a PR that crosses the boundaries between these repositories, you may want to hold off until the split is complete or be prepared to rework your PR into multiple PRs once the split is complete.

The code that will be moved includes:

Strata/DDM/*
Strata/Languages/Boole/*
Strata/Languages/Python/* along with Tools/Python/*
Tools/BoogieToStrata

Add a pull request description template that warns about the upcoming repository split. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

The `StrataTestMain` test driver spawns `lean` on files under `StrataTestExtra/`, which is not declared as a `lean_lib`. Some of those files (e.g. `DDM/Integration/Java/TestGen.lean`) import modules from the `Strata` library that are not in `StrataTest`'s transitive closure, most notably `Strata.DDM.Integration.Java`. Before this change, `lake test` on a clean `.lake` would only build the closure reachable from `StrataTest` and then fail at subprocess time with: error: object file '.../Strata/DDM/Integration/Java.olean' of module Strata.DDM.Integration.Java does not exist Running `lake build` first masked the bug because the `Strata` library is in `defaultTargets`. Add `Strata` to the `needs` list of `StrataTestMain` so Lake builds the full `Strata` library before the test driver runs, regardless of whether `lake build` was run beforehand. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

…ove unlabeled exit from Core (#1141) *Issue #, if available:* Closes #372 *Description of changes:* Formalizing scoped variable initialization: - Adds scoped environment semantics to blocks: step_block_done now projects the store through the parent store via projectStore, discarding block-local variables on exit. This applies to both StepStmt (deterministic) and StepKleene (nondeterministic). - Adds PostWF helper definition in Specification.lean for postcondition stability under projectStore. - Adds .block constructor to KleeneStmt and corresponding step_block/step_block_body/step_block_done to StepKleene, mirroring the deterministic block semantics. - Changes StmtToKleeneStmt (.block _ bss _) to produce .block b (wrapping in a Kleene block), so the Kleene translation preserves scoping. Removing unlabeled `exit` command (`exit;`, not `exit lbl;`): - The unlabeled `exit` command doesn't have clear meaning when it is inside `while`. In `while cond { ... exit; ... }`, is `exit` equivalent to `continue` or `break` in C/Python? No translation to Core was introducing the unlabeled `exit`, and the small-step semantics wasn't clear about the meaning of `exit` inside a loop. - The type checker of Core (Strata/Languages/Core/StatementType.lean) fails when an unlabeled `exit` appears inside the body of a while loop / if statement (unless it is wrapped by another nested block), but the DDM syntax of core allows it, so from user's perspective this gap is kind of surprising. - This patch removes the unlabeled `exit` case because it can always be simulated by a labeled block + labeled exit. On top of this, this also renames `touchedVars` to `modifiedOrDefinedVars` for clarity, and instead makes `touchedVars` all vars that are read + modified + defined. ### How to review? - Strata/DL/Imperative/StmtSemantics.lean has the most important update: the `.block` constructor now has the input store which will be used to 'project' the output store to the variables that have been defined before. Also, any block-exiting small steps like `step_block_done` will do the projection, to define variables which were defined inside the inner scope. - Correspondingly, the Kleene language was updated to add a notion of block and scoping to its syntax and semantics, otherwise DetToKleene doesn't prove. (Strata/DL/Imperative/KleeneStmt.lean, Strata/DL/Imperative/KleeneStmtSemantics.lean) - StrataTest/DL/Imperative/StepStmtTest.lean has a few new tests that show scoping works well. - DetToKleeneCorrect.lean and ProcBodyVerifyCorrect.lean shows that the top-level statements are not touched after this update. - Strata/DL/Imperative/HasVars.lean has the `touchedVars` update and Strata/Languages/Core/StatementSemantics.lean has some additional well-formedness about evaluator extension that are useful. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…1109) translateSort already handles .prim .string, but translateTerm had no arm for .prim (.string _): any SMT string literal produced by the encoder would fall through to the catch-all and raise 'unsupported term'. Add the missing arm plus the two string operations already supported in Denote.lean so that the two translators agree on the end-to-end-supported subset. Specifically, add arms for: - .prim (.string s) -> (mkString, toExpr s) - .app .str_length [s] _ -> (mkInt, Int.ofNat s.length) - .app .str_concat as _ -> (mkString, leftAssocOp mkStringAppend as) mkStringAppend uses instHAppendOfAppend + instAppendString, which is the same instance chain Lean elaborates inferInstance to for HAppend String String String. Regression tests in StrataTest/DL/SMT/TranslateTests.lean cover all three arms; each fails on origin/main (catch-all throw observable via #guard_msgs mismatch) and passes with this change. I additionally verified under a Meta.check-enabled harness that the produced Expr type-checks in the kernel, so the (fst = mkString/mkInt, snd = ...) pairs are internally consistent. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>

…1140) Fixes #1139 Introduces a `Provenance` type that fully replaces `FileRange` and `SourceRange` as the canonical way to store source locations in metadata. This structurally eliminates the "SourceRange without file name" problem. The `Provenance` type has two constructors: - `Provenance.loc uri range` — a real source location (always requires both URI and range) - `Provenance.synthesized origin` — a node created programmatically, with a `SynthesizedOrigin` inductive type A `SynthesizedOrigin` inductive enforces that only canonical origins are used (`smtEncode`, `nondetIte`, `laurelParse`, `laurel`, `laurelToCore`, `structuredToUnstructured`), preventing typos and duplicates at the type level. Key changes: - The `.fileRange` variant is removed from `MetaDataElem.Value` — all metadata values now use `.provenance` exclusively - `MetaData.ofSourceRange` emits only a provenance element - `getProvenance` is the single source of truth for reading source locations from metadata - `setCallSiteFileRange` works directly with `Provenance` instead of roundtripping through `FileRange` - `getFileRange` delegates to `getProvenance` for extraction - `FileRange.unknown` and `SourceRange.none` eliminated from metadata construction - SMT DDM translator uses `smtAnn` combinator to reduce annotation boilerplate - SARIF output uses `getFileRange` (which reads provenance) `FileRange` remains as a utility struct for extraction and formatting (e.g., in `DiagnosticModel`), but is no longer a metadata value type. Existing tests pass unchanged. ## Follow-up - Migrate the B3, Boole, and Python grammar ASTs from using `SourceRange` as their annotation type parameter to `Provenance`. This would allow combining multiple files at the AST level and enable proper provenance tracking through translation passes.

…seArrayTheory option. (#1145) *Description of changes:* **Bug:** When `useArrayTheory` is enabled, the SMT encoder correctly converts `Map` types to `Array` in variable declarations and function signatures, but it did *not* perform this conversion for fields inside ADT (algebraic data type) constructor declarations. This caused a type mismatch: a datatype field would be declared with type `Map` while the rest of the encoding used `Array`. **Fix:** Thread the `useArrayTheory` flag through the datatype emission pipeline: - `lMonoTyToTermType` now accepts `useArrayTheory` and converts `Map` to `Array` when enabled. - `datatypeConstructorsToSMT` passes the flag to `lMonoTyToTermType`. - `SMT.Context.emitDatatypes` accepts and forwards the flag. - `encodeCore` in `Verifier.lean` passes `options.useArrayTheory` to `emitDatatypes`. **Test:** Added a unit test in `SMTEncoderDatatypeTest.lean` that verifies a datatype with a `Map`-typed field emits `(Array Int Int)` when `useArrayTheory=true` and `(Map Int Int)` when `useArrayTheory=false`. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>

…structs (#1165) Fixes #1158 Supersedes #1036 ## Summary Fixes several issues with `Core.formatProgram` that prevented round-trip parsing: 1. **`inline function` formatted without space** — The `inline` grammar op now includes a trailing space, preventing `inlinefunction` concatenation. The formatter also now emits the `inline` attribute when present on functions. 2. **Quantifier variable names** — The formatter now uses the `prettyName` field from `LExpr.quant` instead of generating `__qN` names, preserving the original variable names through formatting. 3. **Overflow predicates** — Added grammar entries, formatter handling, parser entries, and Factory operations for all bitvector overflow predicates (`SNegOverflow`, `UNegOverflow`, `SAddOverflow`, `SSubOverflow`, `SMulOverflow`, `SDivOverflow`, `UAddOverflow`, `USubOverflow`, `UMulOverflow`). `SNegOverflow` and `UNegOverflow` use distinct grammar entries (`Bv.SNegOverflow` / `Bv.UNegOverflow`) to preserve their different semantics through roundtrip. The translate direction correctly dispatches overflow ops by type parameter (bv1/bv8/bv16/bv32/bv64), logging an error for unsupported widths. 4. **Array assignment (`lhsArray`)** — Implemented `m[k] := v` translation in both directions: the parser decomposes nested LHS into base identifier + indices and builds `map_update` expressions; the formatter detects the `map_update(var, idx, val)` pattern and produces `lhsArray` syntax. 5. **`Sequence.empty`** — Added grammar entry with explicit type annotation syntax (`Sequence.empty<int>()`), plus translate and format handling, resolving the 0-ary polymorphic function limitation for this operation. Includes solver verification tests (cherry-picked from #1036) covering basic usage and various element types. 6. **Datatype roundtrip** — Verified that datatype formatting roundtrips correctly (the extra-parentheses issue noted earlier is resolved). 7. **Roundtrip test infrastructure** — Added `RoundtripTest.lean` that verifies parse → format → re-parse → re-format produces stable output for types, functions, procedures, inline functions, parameterized type arguments, datatypes, array assignments, and `Sequence.empty`. 8. **`getLFuncCall` reuse** — Refactored `lappToExpr` and `decomposeMapUpdate` to use `getLFuncCall` for decomposing nested applications, reducing code duplication with the existing utility. ## Testing - All existing tests pass (with expected output updates for the formatting improvements) - New roundtrip tests verify types, functions, procedures, inline functions, parameterized type arguments, datatypes, lhsArray, and Sequence.empty - Solver verification tests for Sequence.empty (from #1036) pass ## Remaining limitations - Bitvector operations with widths beyond 1/8/16/32/64 log an error and fall back to bv64 (requires grammar-level type entries per width) --------- Co-authored-by: Aaron Tomb <aarotomb@amazon.com>

…coding (#935) Fixes #917 ## Summary Introduces an abstract solver interface (`AbstractSolver τ σ m`) that decouples term construction and session operations from the SMT-LIB string encoding pipeline. The interface is parameterized by a term type `τ`, sort type `σ`, and monad `m` (with `[Monad m] [MonadExceptOf IO.Error m]` constraints), so a non-SMT-LIB backend (e.g., cvc5 FFI) can be plugged in by implementing `AbstractSolver` without touching the encoding logic. All methods throw `IO.Error` directly (no `Except String` layer). A new incremental backend communicates with a live solver process via stdin/stdout. It is opt-in via `--incremental` (batch remains the default). ## What the interface provides All operations are monadic (`m`), throwing on error: - **Configuration**: `setLogic`, `setOption`, `comment` - **Sort construction**: `boolSort`, `intSort`, `realSort`, `stringSort`, `regexSort`, `bitvecSort`, `arraySort`, `constrSort`, `termTypeToSort` - **Term construction**: `mkPrim`, `mkBool`, `mkAnd`, `mkOr`, `mkNot`, `mkImplies`, `mkEq`, `mkIte`, `mkAdd`, `mkSub`, `mkMul`, `mkDiv`, `mkMod`, `mkNeg`, `mkAbs`, `mkLt`, `mkLe`, `mkGt`, `mkGe`, `mkSelect`, `mkStore`, `mkApp`, `mkAppOp`, `mkForall`, `mkExists` - **Declarations**: `declareNew`, `declareFun`, `defineFun`, `declareSort`, `declareDatatype`, `declareDatatypes` - **Session**: `assert`, `checkSat`, `checkSatAssuming`, `getUnsatAssumptions`, `getModel`, `getValue`, `reset`, `close` - **Model inspection**: `termToSMTLibString` Terms (`τ`) are opaque handles whose meaning is backend-specific. ## Other changes - **Generic encoder** — `AbstractEncoder.encodeTerm` parameterized over `τ`/`σ`, caches handles in `AbstractEncoderState τ`. - **Generic incremental discharge** — `Imperative.SMT.dischargeObligationIncremental` in `SMTUtils.lean` is parameterized over `PureExpr`, so any Imperative-level language can use the incremental solver. - **Pluggable solver backends** — `CoreSMTSolver` and `MkDischargeFn` threaded through `Core.verifyProgram`, `Core.verify`, and `Strata.verify`. - **Strict solver compatibility** — `encodeDeclarationsAbstract` skips `declareSort` for datatype names and validates datatype fields. - **`--incremental`** flag to opt in to the incremental backend. ## Testing All tests pass with both batch (default) and incremental modes. ## Follow-ups - [ ] Implement with Lean FFI for cvc5

The Core grammar declared command_datatypes as op command_datatypes (datatypes : NewlineSepBy DatatypeDecl) : Command => datatypes ";\\n"; Without @[nonempty] on the field, NewlineSepBy compiles to a zero-or-more parser, so the trailing ";\\n" production alone matched any stray ";" in the input. This is particularly easy to trip accidentally: command_fndef's surface syntax has no trailing semicolon, so a user writing "};" at the end of a function body (by analogy with procedures, whose grammar does end in ";") silently produced a phantom command_datatypes op with zero datatypes. The phantom later tripped an explicit assertion in translateDatatypes: "Datatype block must contain at least one datatype" which calls panic!, producing a large backtrace at gen_smt_vcs time with no useful source location. Mark the datatypes field @[nonempty]. The stray ";" now surfaces as a parse error at the point it actually appears, and the original repro elaborates cleanly. Add a regression test (StrataTest/Languages/Core/Tests/Issue1146Test.lean) that pins the canonical form: a program mixing a datatype and a function can run through gen_smt_vcs without panicking. Fixes: #1146 By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> Co-authored-by: Juneyoung Lee <136006969+aqjune-aws@users.noreply.github.com>

#1179) Fixes #1163 Moves all library code from `StrataMain.lean` into a new `StrataMainLib` module, leaving `StrataMain.lean` as a thin executable entry point (`import StrataMainLib`). This makes the library importable from tests and other tools without pulling in the executable entry point.

…sition (#1116) When a StaticCall to a multi-output procedure appears in expression position (not as the RHS of an Assign), emit an error diagnostic: the extra outputs would be silently discarded, which is a semantic bug (e.g., error channels lost). The diagnostic is suppressed when the call is the direct RHS of an assignment (where the target count check already validates arity). Includes a test in ResolutionKindTests demonstrating the diagnostic. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

…1110) Both helpers' error message already said 'expected at least two arguments', but the destructuring `let a :: as := as | throw ...` only rejected the empty list: a singleton like `.app .add [x]` or `.app .bvadd [x]` silently returned just `x`. This diverged from `Denote.leftAssoc`, which uses `let t₁ :: t₂ :: ts := ts | none` and returns `none` for fewer than two operands. Tighten the pattern to `a :: b :: as := as | throw ...` and seed the `foldl` with `mkApp2 op a b` so the helper's semantics now match the existing error message and `Denote.leftAssoc`. Empty-operand lists continue to throw with the same message as before. This affects every variadic caller: `.app .and`, `.app .or`, `.app .add`, `.app .sub`, `.app .mul`, `.app .div`, `.app .mod`, plus `.app .bvadd`, `.app .bvsub`, `.app .bvmul`, `.app .bvand`, `.app .bvor`, `.app .bvxor`. The Strata Core SMT encoder does not produce singleton applications of any of these ops (ripgrep'd across `SMTEncoder.lean`, `Factory.lean`, and `B3/Verifier/Expression.lean`: every construction passes at least two operands), so well-formed queries are unaffected. Zero-operand placeholders like `Term.app Op.and [] .bool` that appear as "unreachable" fallbacks in `SMTEncoder.lean` were already rejected by `translateTerm` before this change, and remain rejected. Regression tests in StrataTest/DL/SMT/TranslateTests.lean cover: - singleton `.app .add`, `.app .and`, `.app .bvadd`, `.app .bvand` (one per helper+typeclass combination), - empty-operand `.app .add` (unchanged behaviour), - ternary `.app .add` producing `1 + 2 + 3 = 6` (preserves left associativity past the new 2-operand seed). Each singleton-rejection test fails on the previous commit (the term silently translates to its sole operand) and passes here. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>

The ANF encoder extracts duplicated subexpressions into fresh variables, but the existing single-pass implementation can leave large duplicated sub-subexpressions behind. Root cause: `removeSubsumed` drops candidate duplicates that are subexpressions of other (larger) candidate duplicates, to avoid creating redundant variable declarations. But this means that if only the outer expression appears at the top level, the inner dupes are hidden inside the lifted var declaration and never extracted. Example (from PyAnalyzeLaurel benchmark check_storage_costs): Original: assert Any_to_bool(Any_get(response, "Datapoints")) ... After partial evaluation, Any_to_bool inlines its 7-branch body, each branch referencing the argument. With Any_get also inlined as an `ite (is-DictStrAny response) (DictStrAny_get ...) (List_get ...)`, the Any_get expression ends up duplicated 62 times inside a single assert. Old ANF output: var $__anf.0 : bool := <9KB body with 62 duplicates of Any_get>; assert $__anf.0 New ANF output (after iteration): var $__anf.3 : Any := Any_get(response, "Datapoints"); var $__anf.0 : bool := <body using $__anf.3>; assert $__anf.0 Effect: VC file size for VCs on one benchmark drops from 32KB to 11KB (~65% reduction), and another benchmark now completes verification at 36s where it previously hit a 60s timeout. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

…te actions (#1137) The same ~15-line blocks for installing cvc5, installing z3, and restoring the lake cache were duplicated across ci.yml's three jobs (build_and_test_lean, check_pending_python, build_python), cbmc.yml, and python-fuzz.yml. Robin asked for this to be consolidated as part of the #984 review (flagged non-blocking there, handled here). Introduce three composite actions under .github/actions/: - install-cvc5: downloads the static cvc5 release zip, makes the binary available either on $GITHUB_PATH ('path', default) or in /usr/local/bin ('system'). Version defaults to 1.2.1. - install-z3: same shape for z3, version defaults to 4.15.2. This fixes a latent bug: cbmc.yml and build_and_test_lean used different aarch64 archive names ('arm64-glibc-2.34' vs. 'arm64-win'); the consolidated action uses the correct glibc-2.34 one for both. - restore-lake-cache: wraps actions/cache/restore@v5 with the established key pattern lake-<os>-<arch>-<toolchain>-<manifest>-<all .st>-<sha> plus the three fallback keys dropping each trailing component. Exposes a 'fail-on-cache-miss' input for jobs that depend on a cache saved for the same SHA by build_and_test_lean. Rewire ci.yml and cbmc.yml to use the new actions: - ci.yml: build_and_test_lean: install-cvc5, install-z3, restore-lake-cache. check_pending_python: install-cvc5 (system), restore-lake-cache (fail-on-cache-miss: true). build_python: install-cvc5 (system), restore-lake-cache (fail-on-cache-miss: true). (The pip-based z3 install here is deliberately left as-is — it is a different install flow.) - cbmc.yml: install-cvc5, install-z3, restore-lake-cache (fail-on-cache-miss: true). Net change: -108 lines of duplicated shell, +17 lines of action usage across the two workflow files, with the reusable definitions ready to be adopted by python-fuzz.yml and any future workflows. actionlint reports no new warnings (the pre-existing save-always warning on cbmc.yml line 22 is unrelated). By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

Blocker for Laurel Seq<T>/Array<T> in #1073. ## Bounds preconditions for `Sequence.select` / `update` / `take` / `drop` Following the `Int.SafeDiv` pattern: | Op | Precondition | |---------------------|-------------------------------------| | `Sequence.select` | `0 <= i && i < Sequence.length(s)` | | `Sequence.update` | `0 <= i && i < Sequence.length(s)` | | `Sequence.take` | `0 <= n && n <= Sequence.length(s)` | | `Sequence.drop` | `0 <= n && n <= Sequence.length(s)` | `Sequence.length` / `empty` / `append` / `build` / `contains` remain total. `PrecondElim` picks the obligations up automatically at call sites in imperative code (via `transformStmt`) and in pure positions like `requires` / `ensures` / quantifier bodies / function bodies (via the synthetic `$$wf` procedures). Obligations carry `propertyType = "outOfBoundsAccess"` (new `MetaData` constant, mirroring `divisionByZero`), flow through a new `PropertyType.outOfBoundsAccess` enum variant and the three metadata-to-PropertyType conversion sites, and render as `"out-of-bounds-access"` in SARIF output. ### Incidental fix While wiring the SARIF classification, I noticed `propertyTypeToClassification` in `SarifOutput.lean` was pre-existing dead code: `vcResultToSarifResult` never set `properties.propertyType`, so every obligation defaulted to `"assert"` in SARIF output. This PR wires it up, so `divisionByZero` and `arithmeticOverflow` obligations now also classify correctly in SARIF alongside the new `outOfBoundsAccess`. ## Testing - **`StrataTest/Transform/PrecondElim.lean` Test 10** — `collectPrecondAsserts` attaches `outOfBoundsAccess` metadata for all four partial ops plus a nested `Sequence.select(Sequence.update(...))` call. Mirrors the pattern in `OverflowCheckTest.lean`. Complemented by per-op pretty-print snapshots (`#guard_msgs`) that assert the exact obligation body string — these catch regressions that preserve count and metadata tag but corrupt the obligation body (e.g. passing `.Le` instead of `.Lt` to `mkSeqBoundsPrecond` at a call site, changing the bound-variable name, or swapping the lower/upper bound inside `mkSeqBoundsPrecond`). - **`StrataTest/Languages/Core/Tests/SarifOutputTests.lean`** — property-type classification tests covering all five `PropertyType` variants. Collateral updates to existing tests reflect the new obligations (`Seq.lean`) and updated `requires` on Sequence function signatures (`ProgramEvalTests.lean`). Note: the bounds obligations in `Seq.lean` appear as `true && 0 < length(...)` — the partial evaluator simplifies `0 <= 0` to `true` but does not further simplify `true && X` to `X`. This is a pre-existing evaluator gap, not newly introduced by this PR; the SMT solver discharges the obligation trivially. ## Known downstream impact PR #1073 (Laurel Seq<T>/Array<T>) emits `Sequence.select` / `update` / `take` / `drop` calls from its translation. Its `T18_Sequences` test uses `#guard_msgs(drop info, error)` on a diagnostics-only pipeline, so the test assertions should still pass syntactically — but individual sequence-manipulation programs now require bounds guards to fully verify. PR #1073 has been adjusted accordingly.

## Summary The DDM type checker uses unification to infer implicit type arguments, but this fails in some cases — notably when template-generated accessors with unresolved tvar return types are composed with polymorphic functions. This PR adds a per-dialect option to bypass type checking entirely, filling implicit type parameter slots with anonymous type placeholders instead. ## How to use In a dialect definition, add `dialect_option typecheck off;` to disable type checking for all programs using that dialect: ``` #dialect dialect MyDialect; dialect_option typecheck off; type Foo; fn bar (A : Type, x : A) : A => x; ... #end ``` Programs using the dialect will skip `inferType` and `unifyTypes` for expression arguments. Variable name resolution and global context population still work normally — only type inference is bypassed. ## Test changes `StrataTest/DDM/TypecheckSkip.lean` defines an inline dialect with parameterized types, polymorphic functions, and perField accessor templates. It demonstrates that the accessor-into-polymorphic-fn pattern (from issue #650 / PR #734) produces type errors with typecheck on, and elaborates successfully with typecheck off. ## Details - **AST.lean**: Add `TypeExprF.skip` (anonymous type placeholder `.tvar loc ""`), `Dialect.typecheck : Bool := true`, refactor `Program.globalContext` into explicit `computeGlobalContext` function - **Elab/Core.lean**: Add `ElabContext.typecheck`, skip `inferType`/`unifyTypes` in `elabSyntaxArg`, fill unfilled type param slots in `runSyntaxElaborator` post-pass, skip pre-registration in `elabOperation`, fix `applyNArgs` to handle tvar types - **Elab/DeclM.lean**: Add `DeclContext.typecheck` - **Elab/DialectM.lean**: Add `elabSetOptionCommand` handler dispatching on `"typecheck"` option - **BuiltinDialects/StrataDDL.lean**: Add `setOptionCommand` with syntax `dialect_option <name> <value> ;` - **Elab.lean**: Thread `Dialect.typecheck` into `DeclContext` in `elabProgramRest` - **Ion.lean, Python/Specs/DDM.lean**: Update `Program` construction for `globalContext` refactor By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…etrics JSONL (#1093) ## Motivation The pipeline previously had no unified infrastructure for diagnostics or timing. Warnings were handled ad-hoc (some printed to stderr, some silently dropped), phase timing was manual and inconsistent, and the only structured output (`--warning-summary`) wrote a single JSON blob at the end — useless if the process was killed by a timeout. There was also no distinction between fatal and non-fatal errors, so callers couldn't tell whether a result was trustworthy. This PR introduces `PipelineContext`/`PipelineM` as a shared backbone that gives every pipeline phase consistent behavior: categorized diagnostics, nested phase timing, fatal-error abort, and streaming metrics output that survives timeouts. ## Summary - Introduce `PipelineContext` and `PipelineM` for structured diagnostic accumulation and phase timing across the entire pipeline - Replace ad-hoc `profileStep`/`startPhase` with `withPhase` combinator that handles nesting, timing, and profile output via `try`/`finally` - Add `withRepeatedPhase` for iteration-level timing (preprocess, smtEncode, solver) that accumulates count+duration and reports as `[profile]` lines - Replace `--warning-summary` (single JSON blob) with `--metrics <file>` (streaming JSONL, flushed per-record for timeout resilience) - Add `--verbose` program dumps: prints intermediate Laurel and Core programs with `---- BEGIN/END ----` delimiters for programmatic parsing - Refactor module resolution: fatal errors on invalid/missing modules, unified `resolveModules` for dispatch, auto-resolved, and explicit modules - Fix `pyResolveOverloads` to print errors from pipeline context on failure - Split `Messages.lean` into data types (`Messages.lean`) and runtime infrastructure (`Context.lean`) - Move Python-specific `MessageKind` constants to `Languages/Python/Specs/MessageKind.lean` ## `PipelineContext` and `PipelineM` `PipelineContext` is a structure carrying immutable config (`outputMode`, `profilePipeline`) plus mutable state as individual `IO.Ref` fields: - `messagesRef` — accumulated `PipelineMessage` diagnostics - `toolErrorsRef` / `userCodeErrorsRef` — bucketed by impact for fast access - `currentPhaseRef` — current position in the phase hierarchy (managed via `push`/`pop`) - `repeatedDepthRef` — nesting depth of `withRepeatedPhase` scopes (when > 0, `withPhase` aggregates silently) - `phaseStateRef` — per-phase scoped state bundling repeated-phase aggregation and message counts (saved/restored atomically on phase entry/exit) - `metricsHandle` — optional file handle for streaming JSONL metrics Because all state is in `IO.Ref`s, any monad with `BaseIO` access can use pipeline capabilities by passing a `PipelineContext` value directly — `withPhase` is polymorphic over the monad (`[MonadLiftT BaseIO m]`). `PipelineM` is defined as: ```lean abbrev PipelineM := ReaderT PipelineContext (EIO Unit) ``` The `EIO Unit` base ensures that fatal errors (configuration errors, internal errors, user code issues) are impossible to silently ignore — they throw `()`, aborting the pipeline. Non-fatal diagnostics (warnings, known limitations) are added to `messagesRef` and execution continues. Callers that want to attempt recovery (e.g., to gather more diagnostics) can catch the exception, but should rethrow to avoid proceeding with invalid state. The top-level runner catches the exception and still returns all accumulated messages — so the caller always gets complete diagnostics regardless of how the pipeline terminated. ## Phase tracking state machine The phase system operates in two modes controlled by `repeatedDepthRef`: - **Mode N** (normal, depth = 0): `withPhase` prints `[start]`/`[end]` in profile mode and emits JSONL timing metrics. - **Mode R** (repeated, depth > 0): `withPhase` silently aggregates timing into `phaseStateRef.repeatedPhases` — no print, no individual metrics. `withRepeatedPhase` increments `repeatedDepthRef` on entry and decrements on exit. `withPhase` never changes the depth — it inherits the mode and behaves accordingly. Key invariants: - `currentPhaseRef` always reflects the innermost active scope's full path - `phaseStateRef` is scoped: saved on entry, restored on exit (no leakage) - In mode R, all timing flows through `phaseStateRef.repeatedPhases` only ## `--metrics <file>` (replaces `--warning-summary`) Writes streaming JSONL (one JSON object per line, flushed immediately after each record). If the process is killed mid-pipeline, all records should remain. Record types: - `timing` — phase start/end in ms, with optional count for aggregated phases - `diagnostic` — pipeline messages with phase, file, category, impact, location - `outcome` — result/exit_code/total_ms, written last (absence = killed) Example output: ```jsonl {"type":"timing","phase":"pythonAndSpecToLaurel","start_ms":0,"end_ms":97} {"type":"diagnostic","phase":"pythonAndSpecToLaurel.resolveAndBuildPrelude","file":"specs/builtins.ion","category":"unsupportedUnion","impact":"knownLimitation","message":"unsupported overload form for 'dict.update'"} {"type":"timing","phase":"verification.vcDischarge.solver","start_ms":0,"end_ms":1102,"count":12} {"type":"outcome","result":"verified","exit_code":0,"total_ms":1830} ``` ## `--verbose` program dumps When `outputMode == .verbose`, the pipeline prints intermediate programs to stdout with delimiters for programmatic parsing: ``` ---- BEGIN Laurel Program ---- <full Laurel program text> ---- END Laurel Program ---- ---- BEGIN Core Program ---- <full Core program text> ---- END Core Program ---- ``` ## Module resolution refactoring - `resolveModuleEntry` now takes a parsed `ModuleName` (not a string) - Single `resolveModules` handles dispatch, auto-resolved, and explicit modules - `invalidModuleName` changed from `internalWarning` to `configurationError` (fatal) - Merged `missingDispatchModule` + `missingExplicitPySpec` into `missingPySpecModule` - Eliminated duplicate message scenario where both invalid-name and missing-module errors fired for the same input ## `pyResolveOverloads` error handling When `readDispatchOverloads` encounters a fatal error (e.g., nonexistent file), the command now reads `pctx.messagesRef` and prints all accumulated messages to stderr before exiting with a non-zero code — instead of silently yielding empty results. ## Profile output format `--profile` emits structured `[start]`/`[end]`/`[profile]` lines to stdout: ``` [start] pythonAndSpecToLaurel (time: 0ms) [start] readPythonIon (time: 0ms) [end] readPythonIon (time: 2ms) [start] resolveAndBuildPrelude (time: 2ms) [end] resolveAndBuildPrelude (time: 97ms) [warnings] resolveAndBuildPrelude: 2 unsupportedUnion [end] pythonAndSpecToLaurel (time: 97ms) [start] verification (time: 105ms) [start] vcDischarge (time: 320ms) [profile] preprocess (×12, total: 45ms, avg: 3ms) [profile] solver (×12, total: 1102ms, avg: 91ms) [end] vcDischarge (time: 1185ms) [end] verification (time: 1185ms) ``` ## Design: impact drives behavior Each `MessageKind` has a `MessageImpact` that determines pipeline behavior: | Impact | Fatal? | Meaning | |--------|--------|---------| | `configurationError` | yes | Invalid arguments or unreadable on-disk pyspecs | | `internalError` | yes | Unexpected failure preventing output | | `userCodeIssue` | yes | Issue in the user's source code | | `internalWarning` | no | Unexpected condition that didn't prevent output | | `knownLimitation` | no | Documented gap in modeling | Note: `emitMessageAndAbort` is the function that aborts — callers choose whether to abort. A caller may use `emitMessage` (continues) and check `isFatal` at phase boundaries for custom abort strategies. ## Not addressed: `[statistics]` output The `[statistics]` lines (transform counters like `Evaluator.simulatedStmts`, `FilterProcedures.erasedProcedures`) are orthogonal to phase timing. They currently bypass `PipelineContext` — printed via bare `IO.println` in `Core.verify` and `StrataMainLib.lean` — so they don't respect phase nesting or indentation. Integrating statistics into the phase system is future work. ## Key files changed - `Strata/Pipeline/Messages.lean` — pure data types: `Phase`, `MessageImpact`, `MessageKind`, `PipelineMessage` - `Strata/Pipeline/Context.lean` — runtime: `PipelineContext`, `PipelineM`, `withPhase`, `withRepeatedPhase`, `emitMetric`, `OutputMode` - `Strata/Languages/Python/Specs/MessageKind.lean` — Python-specific `MessageKind` constants - `Strata/Pipeline/PyAnalyzeLaurel.lean` — top-level pipeline orchestration, `--verbose` program dumps, `metricsHandle` threading - `Strata/Languages/Core/Verifier.lean` — replaced manual timing with `withRepeatedPhase` - `Strata/Languages/Laurel/LaurelCompilationPipeline.lean` — replaced `profileStep` with `withPhase` - `Strata/Languages/Python/PySpecPipeline.lean` — fatal abort via throw, module resolution refactor, diagnostic metric emission - `StrataMainLib.lean` — `--metrics` flag, outcome record emission, error printing in `pyResolveOverloads` - `StrataTest/Pipeline/PhaseTimingTest.lean` — phase timing integration tests --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com> Co-authored-by: June Lee <lebjuney@amazon.com>

**Warning:** This repository will shortly undergo a split into several separate repositories. If you're creating a PR that crosses the boundaries between these repositories, you may want to hold off until the split is complete or be prepared to rework your PR into multiple PRs once the split is complete. The code that will be moved includes: - Strata/DDM/* - Strata/Languages/Boole/* - Strata/Languages/Python/* along with Tools/Python/* - Tools/BoogieToStrata

## Summary `PythonIdent.pythonModule` was a raw `String`, allowing empty module names to propagate silently through the system and cause downstream failures (see #1143). This PR changes `pythonModule` to use `ModuleName` — which now guarantees at least one non-empty component — and adopts it more widely through the PySpec pipeline, eliminating this class of bugs at the type level. ## Approach Rather than adding runtime checks at each use site, `ModuleName` gains a strengthened invariant via a private constructor enforcing non-empty components. `PythonIdent.ofComponent` uses `decide` to verify non-emptiness at compile time for literal module names (used in `Decls.lean`). Runtime parsing goes through `ModuleName.ofString?` which returns `Option`. ## Key changes - **PythonIdent.lean**: `PythonIdent.pythonModule` changed from `String` to `ModuleName`. New `ofComponent` constructor with compile-time proof for single-component literals. `ModuleName` API extended: `back`, `parent (n)`, `push`, `append`/`++`, `toString (sep)` - **PythonIdent.lean**: New `ModuleName.ofRelativePath` provides a single canonical function for deriving a `ModuleName` from a `.py` file path relative to a search root, handling both regular files and `__init__.py` packages. This eliminates redundant ad-hoc implementations scattered across `discoverModules`, `RelativeImportTest`, and `findInPath` — each of which had slightly different `__init__.py` handling, creating inconsistency and latent bugs (e.g., `pySpecToLaurelCommand` silently failed for package modules) - **Specs.lean**: `findInPath` simplified to return `(FilePath × Bool)` instead of `(FilePath × Array ModuleComponent)` — the `Bool` is `isInit`, and callers use `ModuleOfPath.modulePrefix` to derive the package prefix. Deleted unused `ofFile` function - **Specs/DDM.lean**: Converted `fromDDM` functions to monadic `FromDDM` (`Except (SourceRange × String)`) for proper error propagation instead of panics - **Specs/ToLaurel.lean**: `signaturesToLaurel` takes `ModuleName` instead of `String` prefix. Overload extraction factored into dedicated `ExtractM` monad (no dependency on `ToLaurelContext`) - **PySpecPipeline.lean**: `buildPySpecLaurel` takes `Array (ModuleName × String)` instead of `Array (String × String)` - **Specs.lean**: Import resolution uses `ModuleName` throughout; `translateFile` requires a `ModuleName` (no longer optional) - **Specs/Decls.lean**: Builtin idents use `ofComponent` with compile-time proof ### Unrelated cleanup - Deleted `Strata.Python.containsSubstr` helper and replaced all call sites with `String.contains` (which accepts substring patterns in Lean 4.29) By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…1149) *Issue #, if available:* #1148 *Description of changes:* 1. Namespace collision (issue 1 in #1148): Constants that share a name with a procedure are now prefixed with __const_ in the Strata output. The rename is applied consistently in declarations and references. 2. Recursive synonym resolution (Fixes issue 2 in #1148): DealiasTypeExpr now recurses on its result, resolving synonym chains like ref := i64 := int fully to the base type. Previously it resolved only one level, causing panics on comparison/arithmetic operators applied to multi-level type synonyms. 3. SMACK assert encoding (issue 3 in #1148): Procedures named assert_.* now get a requires (arg != 0) precondition emitted. Call elimination generates VCs at each call site, making SMACK assertions verifiable. 4. InferModifies: SMACK-generated Boogie often omits explicit `modifies` clauses on procedures that mutate globals. Setting `InferModifies = true` in BoogieToStrata has two effects: (1) it runs `ModSetCollector.CollectModifies(program)` to populate empty modifies clauses, and (2) through `CheckModifies` in Boogie's resolution context, it suppresses typechecking of modifies clauses. Without this, `ResolveAndTypecheck` would reject SMACK programs that omit modifies clauses. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: David Deng <htd@amazon.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fixes #1045 ## Summary Adds a `--parallel N` flag that runs up to N solver instances concurrently when verifying proof obligations. Without the flag (or with `--parallel 1`), behavior is unchanged (sequential). ## Problem Verification of programs with many obligations is bottlenecked by sequential solver invocations. Each obligation spawns a separate solver process, waits for the result, then moves to the next. ## Solution When `--parallel N` is specified (N > 1), the verification pipeline splits into two phases: 1. **Sequential preprocessing** (fast): determine checks, preprocess obligations, encode to SMT terms. Obligations resolved by the evaluator are handled immediately. 2. **Parallel solver dispatch** (slow): obligations that need the solver are placed in a shared queue. N worker tasks (on dedicated threads) continuously pull from the queue — when a solver finishes, it immediately picks up the next unsolved obligation. Results are collected in original obligation order so output is deterministic. The worker pool design avoids the "wait for slowest in batch" bottleneck: if one obligation takes 10s and others take 1s, the fast-finishing workers immediately start on the next obligation instead of idling. `stopOnFirstError` is supported via a shared flag: on failure, workers stop claiming new jobs. Already-running jobs complete naturally; skipped jobs leave their placeholder results in place. Both the incremental and batch solver paths are safe for parallel use: the incremental backend spawns independent solver processes, and the batch path uses atomic `modifyGet` for filename counter generation. **Pluggable discharge function**: The full public API (`Strata.verify`, `Core.verify`, `verifySingleEnv`, `mkDefaultCoreSMTSolver`) accepts a `mkDischarge : MkDischargeFn` parameter (defaulting to `mkDischargeFn`). External solvers (e.g. using the AbstractSolver API) can provide their own discharge function factory. ## Performance Benchmark: 30 obligations across 2 programs, z3 4.12.2, avg over 3 runs: | Mode | Time | Speedup | |------|------|---------| | `--parallel 1` (sequential) | 921ms | baseline | | `--parallel 2` | 491ms | **1.88x** | | `--parallel 3` | 405ms | **2.27x** | | `--parallel 4` | 356ms | **2.59x** | Sweet spot is `--parallel 3-4`. The 1→2 jump gives the biggest single improvement (1.88x), then diminishing returns as solver spawn overhead and sequential preprocessing become the bottleneck. ## Testing All tests that pass without `--parallel` also pass with it. The sequential path (`--parallel 1`, the default) is unchanged.

When translateExpr produces an expression containing a nested multi-output procedure call (e.g., PSub(base, timedelta_func(...))), the new extractMultiOutputCalls pass in translateAssign rewrites it into a preceding multi-target assignment and a variable reference: var $mo_N := ...; assign $mo_N, maybe_except := timedelta_func(...); delta := PSub(base, $mo_N) This prevents multi-output calls from appearing in expression position, which would silently discard the error channel. Key design choices: - Counter seeded from source byte offset for globally unique temp variable names - Branch-local multi-output calls are wrapped in Block nodes inside their respective IfThenElse branches (not hoisted unconditionally) - Generated wrapper statements propagate source metadata from the original call - Block-flattening in the lifter uses cons-based prepend for correct evaluation order - containsBareAssignment skips Blocks so generated wrappers pass through assert/assume handlers - extractMultiOutputCalls uses structural recursion (attach) rather than partial Tested: existing tests pass, builds cleanly. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> Co-authored-by: Mikaël Mayer <MikaelMayer@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Mikael Mayer <mimayere@amazon.com>

### Changes - Fix scoping bug in HeapParam pass where `Declare` targets inside blocks created by `heapTransformExpr` were invisible to subsequent statements - Fix small transparency bug in ConstrainedTypeElim - Factor out `heapVarName` / `heapInVarName` constants to avoid duplicated magic strings across passes - Add consistency check to LaurelCompilationPipeline to detect bugs in passes - Move intermediate program output to `.lake/build/intermediatePrograms/` ### Testing - Existing tests pass, new consistency check added By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Michael Tautschnig <mt@debian.org> Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com> Co-authored-by: keyboardDrummer-bot <keyboarddrummer.bot@gmail.com> Co-authored-by: Mikael Mayer <mimayere@amazon.com>

…1134) Holes in arguments to datatype destructors (e.g. `Any..as_ListAny!(<?>)`) and testers (e.g. `Any..isfrom_str(<?>)`) were left untyped after `InferHoleTypes`, producing "could not infer type" diagnostics and downstream `Unknown`-typed `$hole_N` functions. The fix surfaces the parent datatype through the semantic model so `InferHoleTypes` can recover parameter types via `uniqueId`-based model lookups (per the Laurel steering doc, post-resolution passes should rely on model lookups, not textual names). Key changes: - New `ResolvedNode.datatypeDestructor` variant carrying the parent type name and field parameter, with corresponding updates to `ResolvedNodeKind`, `.kind`, `.getType`, and `SemanticModel.isFunction`. - Datatype scope registration chains `defineNameCheckDup` calls so safe/unsafe destructors and tester/constructor names share a single `uniqueId`. - `buildRefToDef` registers each constructor parameter as `.datatypeDestructor`. - `calleeParamTypes` matches `.datatypeConstructor` and `.datatypeDestructor` directly, synthesizing the correct `.UserDefined` input type. - `ResolvedNode.getType` now uses explicit pattern matches instead of a catch-all, so new constructors trigger compile errors rather than silently falling through. - `SemanticModel.refToDef` is now `private`; external code uses `SemanticModel.get?` (returning `Option ResolvedNode`) or `SemanticModel.get` to look up nodes by `Identifier`, preventing arbitrary `Nat` key access. Tests: three new `#guard_msgs` cases for holes inside destructors/testers, plus updated expected diagnostics in `ResolutionKindTests`. Existing tests pass. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> Co-authored-by: Mikael Mayer <mimayere@amazon.com>

Migrate 241 of 282 non-Boole StrataTest files to be Lean modules, completing the module migration started in #982. - **Added `module` keyword** to 241 test files. The Boole test files are excluded because `gen_smt_vcs` produces proof terms that fail kernel type-checking when the file is a module. - **Added `meta import`** for Strata/StrataTest dependencies where `#eval`, `#guard_msgs`, or `#guard` need native implementations from the imported module. - **Added `meta import all`** where proofs use `simp`, `rfl`, or `native_decide` on definitions whose bodies must be visible across the module boundary. - **Added `import all`** where non-meta definitions (types, instances) need access to definition bodies from other StrataTest modules. - **Added `meta section` wrappers** around test code that exercises native implementations, since `#eval`/`#guard_msgs`/`#guard` require meta context to call into compiled code from another module. - **Added `import Strata.DDM.Integration.Lean.HashCommands`** (non-meta) in files using `#strata`/`#dialect` syntax, since the syntax extension must be available at parse time regardless of meta context. - **Updated byte offsets** in `#guard_msgs` docstrings where expected output includes position-dependent labels (e.g., invariant names derived from source positions shifted by added import lines). By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

The `mkDischarge` parameter was accepted by `Core.verifyProgram` but not exposed through `PyAnalyzeConfig` or the CLI `verify` command, so callers could not supply a custom solver discharge function. This PR: - Adds `mkDischarge` to `PyAnalyzeConfig` (defaulting to `Core.mkDischargeFn`) and forwards it to the verification call. - Adds `mkDischarge` as a parameter to `verifyCommand` and passes it through to the `verify` call. - Adds `mkDischarge` as a parameter to `pyAnalyzeLaurelCommand` (same pattern), so downstream executables can override the discharge function without redefining the command. Tested: existing tests pass, build succeeds. Fixes #1199

## Summary Run `lake shake` to minimize the import graph across all 293 Strata modules, reducing average transitive Strata imports by 32% (56 → 38) and total transitive imports (including Lean/Std) by 40% (1379 → 831). Add CI linting to ensure `Strata.lean` transitively imports every module, and add an import statistics script for ongoing measurement. ## Changes - **Import lint driver (`Scripts/CheckImports.lean`)**: Verifies that all `.lean` files under `Strata/` are transitively imported by `Strata.lean`. Registered as `lintDriver` in `lakefile.toml` so `lake lint` runs it automatically. Supports `-- noimport: Strata.Foo` comments in `Strata.lean` to allowlist intentionally excluded modules. - **Import statistics script (`Scripts/ImportStats.lean`)**: Computes per-module transitive import counts (both total and Strata-only), reporting average, median, min, max, top/bottom lists, and histograms. Run via `lake exe ImportStats`. - **`lake shake Strata --fix`**: Applied shake's suggestions to remove unnecessary imports across 225 files (net -185 lines). This is the bulk of the change. - **Aggregator modules use `-- shake: keep`**: Pure aggregator modules (`CBMC.lean`, `GOTO.lean`, `B3.lean`, `Dyn.lean`, `Python.lean`, `Imperative.lean`, `Lambda.lean`, `DDM.lean`, `LExprType.lean`, `B3/Verifier.lean`) intentionally re-export their submodules and are annotated with `-- shake: keep` so `lake shake --fix` won't strip them. - **DDM elaboration imports use `-- shake: keep`**: Files using `#dialect`, `#strata`, `#strata_gen`, and related macros have elaboration-time import dependencies that shake cannot detect. These are annotated with `-- shake: keep` to prevent breakage. - **Renamed aggregator files**: `Dyn/Dyn.lean` → `Dyn.lean`, `Python/Python.lean` → `Python.lean`, `Imperative/Imperative.lean` → `Imperative.lean` — the `Foo/Foo.lean` pattern is unnecessary now that these are proper `module` files. - **Orphaned modules added to `Strata.lean`**: When shake stripped re-exports from real code files (e.g. `ProgramWF.lean` no longer re-exports `StatementWF`), the orphaned modules were added directly to `Strata.lean` to maintain full coverage. ## Results | Metric | Before | After | Change | |---|---|---|---| | Avg transitive imports (Strata-only) | 56.2 | 38.2 | -32% | | Median transitive imports (Strata-only) | 38 | 22 | -42% | | Avg transitive imports (all) | 1379 | 831 | -40% | | Total import edges (Strata-only) | 16,455 | 11,194 | -32% | | Clean build time | 3m 02s | 2m 58s | -2% | By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

# Move Boole dialect to standalone package Extract the Boole dialect into its own `StrataBoole` Lake package with a dependency on the main `Strata` package. This enables faster incremental builds for Boole-specific work and cleaner separation of concerns. ## Changes - Move `Strata/Languages/Boole/` source files to `StrataBoole/StrataBoole/` - Move Boole test files to `StrataBoole/StrataBooleTest/` - Create `StrataBoole/lakefile.toml` with `Strata` as a dependency - Add `StrataBoole/StrataBooleTest.lean` root module importing all test files - Fix `MetaVerifier.lean` to add required meta imports (`Lean.Meta.Eval`, `Lean.Meta.Tactic.Rewrite`, `Lean.Meta.Tactic.Unfold`) and define `genSMTVCsBoole` as a standalone function for the tactic infrastructure - Update all test imports from `Strata.Languages.Boole.*` to `StrataBoole.*` - Update `#guard_msgs` expected output to reflect shifted source positions - Add CI workflow step for building the `StrataBoole` package - Remove Boole-specific code from the main `Strata` package (`Strata.lean`, `MetaVerifier.lean`, `StrataMainLib.lean`)

atomb and others added 27 commits May 14, 2026 21:44

Add PR template warning about repository split (#1166)

0fcdc25

Add a pull request description template that warns about the upcoming repository split. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

github-actions Bot added Git conflicts Laurel Python labels May 22, 2026

github-actions Bot added Core GOTO dependencies Pull requests that update a dependency file github_actions Pull requests that update GitHub Actions code SMT labels May 22, 2026

github-merge-queue Bot deployed to github-pages May 22, 2026 16:08 Active

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge main into main2#1211

Merge main into main2#1211
keyboardDrummer wants to merge 28 commits into
main2from
main

keyboardDrummer commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

Conversation

keyboardDrummer commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants