Skip to content

Merge main into main2#1211

Draft
keyboardDrummer wants to merge 28 commits into
main2from
main
Draft

Merge main into main2#1211
keyboardDrummer wants to merge 28 commits into
main2from
main

Conversation

@keyboardDrummer
Copy link
Copy Markdown
Contributor

Warning: This repository will shortly undergo a split into several separate repositories. If you're creating a PR that crosses the boundaries between these repositories, you may want to hold off until the split is complete or be prepared to rework your PR into multiple PRs once the split is complete.

The code that will be moved includes:

  • Strata/DDM/*
  • Strata/Languages/Boole/*
  • Strata/Languages/Python/* along with Tools/Python/*
  • Tools/BoogieToStrata

atomb and others added 27 commits May 14, 2026 21:44
Add a pull request description template that warns about the upcoming
repository split.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
The `StrataTestMain` test driver spawns `lean` on files under
`StrataTestExtra/`, which is not declared as a `lean_lib`. Some of those
files (e.g. `DDM/Integration/Java/TestGen.lean`) import modules from the
`Strata` library that are not in `StrataTest`'s transitive closure, most
notably `Strata.DDM.Integration.Java`.

Before this change, `lake test` on a clean `.lake` would only build the
closure reachable from `StrataTest` and then fail at subprocess time
with:

  error: object file '.../Strata/DDM/Integration/Java.olean' of module
  Strata.DDM.Integration.Java does not exist

Running `lake build` first masked the bug because the `Strata` library
is in `defaultTargets`.

Add `Strata` to the `needs` list of `StrataTestMain` so Lake builds the
full `Strata` library before the test driver runs, regardless of whether
`lake build` was run beforehand.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
…ove unlabeled exit from Core (#1141)

*Issue #, if available:*

Closes #372

*Description of changes:*

Formalizing scoped variable initialization:

- Adds scoped environment semantics to blocks: step_block_done now
projects the store through the parent store via projectStore, discarding
block-local variables on exit. This applies to both StepStmt
(deterministic) and StepKleene
  (nondeterministic).
- Adds PostWF helper definition in Specification.lean for postcondition
stability under projectStore.
- Adds .block constructor to KleeneStmt and corresponding
step_block/step_block_body/step_block_done to StepKleene, mirroring the
deterministic block semantics.
- Changes StmtToKleeneStmt (.block _ bss _) to produce .block b
(wrapping in a Kleene block), so the Kleene translation preserves
scoping.

Removing unlabeled `exit` command (`exit;`, not `exit lbl;`):

- The unlabeled `exit` command doesn't have clear meaning when it is
inside `while`. In `while cond { ... exit; ... }`, is `exit` equivalent
to `continue` or `break` in C/Python? No translation to Core was
introducing the unlabeled `exit`, and the small-step semantics wasn't
clear about the meaning of `exit` inside a loop.
- The type checker of Core (Strata/Languages/Core/StatementType.lean)
fails when an unlabeled `exit` appears inside the body of a while loop /
if statement (unless it is wrapped by another nested block), but the DDM
syntax of core allows it, so from user's perspective this gap is kind of
surprising.
- This patch removes the unlabeled `exit` case because it can always be
simulated by a labeled block + labeled exit.

On top of this, this also renames `touchedVars` to
`modifiedOrDefinedVars` for clarity, and instead makes `touchedVars` all
vars that are read + modified + defined.

### How to review?

- Strata/DL/Imperative/StmtSemantics.lean has the most important update:
the `.block` constructor now has the input store which will be used to
'project' the output store to the variables that have been defined
before. Also, any block-exiting small steps like `step_block_done` will
do the projection, to define variables which were defined inside the
inner scope.
- Correspondingly, the Kleene language was updated to add a notion of
block and scoping to its syntax and semantics, otherwise DetToKleene
doesn't prove. (Strata/DL/Imperative/KleeneStmt.lean,
Strata/DL/Imperative/KleeneStmtSemantics.lean)
- StrataTest/DL/Imperative/StepStmtTest.lean has a few new tests that
show scoping works well.
- DetToKleeneCorrect.lean and ProcBodyVerifyCorrect.lean shows that the
top-level statements are not touched after this update.
- Strata/DL/Imperative/HasVars.lean has the `touchedVars` update and
Strata/Languages/Core/StatementSemantics.lean has some additional
well-formedness about evaluator extension that are useful.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
…1109)

translateSort already handles .prim .string, but translateTerm had no
arm for .prim (.string _): any SMT string literal produced by the
encoder would fall through to the catch-all and raise 'unsupported
term'. Add the missing arm plus the two string operations already
supported in Denote.lean so that the two translators agree on the
end-to-end-supported subset.

Specifically, add arms for:
- .prim (.string s) -> (mkString, toExpr s)
- .app .str_length [s] _ -> (mkInt, Int.ofNat s.length)
- .app .str_concat as _ -> (mkString, leftAssocOp mkStringAppend as)

mkStringAppend uses instHAppendOfAppend + instAppendString, which is the
same instance chain Lean elaborates inferInstance to for HAppend String
String String.

Regression tests in StrataTest/DL/SMT/TranslateTests.lean cover all
three arms; each fails on origin/main (catch-all throw observable via
#guard_msgs mismatch) and passes with this change. I additionally
verified under a Meta.check-enabled harness that the produced Expr
type-checks in the kernel, so the (fst = mkString/mkInt, snd = ...)
pairs are internally consistent.


By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
…1140)

Fixes #1139

Introduces a `Provenance` type that fully replaces `FileRange` and
`SourceRange` as the canonical way to store source locations in
metadata. This structurally eliminates the "SourceRange without file
name" problem.

The `Provenance` type has two constructors:
- `Provenance.loc uri range` — a real source location (always requires
both URI and range)
- `Provenance.synthesized origin` — a node created programmatically,
with a `SynthesizedOrigin` inductive type

A `SynthesizedOrigin` inductive enforces that only canonical origins are
used (`smtEncode`, `nondetIte`, `laurelParse`, `laurel`, `laurelToCore`,
`structuredToUnstructured`), preventing typos and duplicates at the type
level.

Key changes:
- The `.fileRange` variant is removed from `MetaDataElem.Value` — all
metadata values now use `.provenance` exclusively
- `MetaData.ofSourceRange` emits only a provenance element
- `getProvenance` is the single source of truth for reading source
locations from metadata
- `setCallSiteFileRange` works directly with `Provenance` instead of
roundtripping through `FileRange`
- `getFileRange` delegates to `getProvenance` for extraction
- `FileRange.unknown` and `SourceRange.none` eliminated from metadata
construction
- SMT DDM translator uses `smtAnn` combinator to reduce annotation
boilerplate
- SARIF output uses `getFileRange` (which reads provenance)

`FileRange` remains as a utility struct for extraction and formatting
(e.g., in `DiagnosticModel`), but is no longer a metadata value type.

Existing tests pass unchanged.

## Follow-up

- Migrate the B3, Boole, and Python grammar ASTs from using
`SourceRange` as their annotation type parameter to `Provenance`. This
would allow combining multiple files at the AST level and enable proper
provenance tracking through translation passes.
…seArrayTheory option. (#1145)

*Description of changes:*

**Bug:** When `useArrayTheory` is enabled, the SMT encoder correctly
converts `Map` types to `Array` in variable declarations and function
signatures, but it did *not* perform this conversion for fields inside
ADT (algebraic data type) constructor declarations. This caused a type
mismatch: a datatype field would be declared with type `Map` while the
rest of the encoding used `Array`.

**Fix:** Thread the `useArrayTheory` flag through the datatype emission
pipeline:
- `lMonoTyToTermType` now accepts `useArrayTheory` and converts `Map` to
`Array` when enabled.
- `datatypeConstructorsToSMT` passes the flag to `lMonoTyToTermType`.
- `SMT.Context.emitDatatypes` accepts and forwards the flag.
- `encodeCore` in `Verifier.lean` passes `options.useArrayTheory` to
`emitDatatypes`.

**Test:** Added a unit test in `SMTEncoderDatatypeTest.lean` that
verifies a datatype with a `Map`-typed field emits `(Array Int Int)`
when `useArrayTheory=true` and `(Map Int Int)` when
`useArrayTheory=false`.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
…structs (#1165)

Fixes #1158
Supersedes #1036

## Summary

Fixes several issues with `Core.formatProgram` that prevented round-trip
parsing:

1. **`inline function` formatted without space** — The `inline` grammar
op now includes a trailing space, preventing `inlinefunction`
concatenation. The formatter also now emits the `inline` attribute when
present on functions.

2. **Quantifier variable names** — The formatter now uses the
`prettyName` field from `LExpr.quant` instead of generating `__qN`
names, preserving the original variable names through formatting.

3. **Overflow predicates** — Added grammar entries, formatter handling,
parser entries, and Factory operations for all bitvector overflow
predicates (`SNegOverflow`, `UNegOverflow`, `SAddOverflow`,
`SSubOverflow`, `SMulOverflow`, `SDivOverflow`, `UAddOverflow`,
`USubOverflow`, `UMulOverflow`). `SNegOverflow` and `UNegOverflow` use
distinct grammar entries (`Bv.SNegOverflow` / `Bv.UNegOverflow`) to
preserve their different semantics through roundtrip. The translate
direction correctly dispatches overflow ops by type parameter
(bv1/bv8/bv16/bv32/bv64), logging an error for unsupported widths.

4. **Array assignment (`lhsArray`)** — Implemented `m[k] := v`
translation in both directions: the parser decomposes nested LHS into
base identifier + indices and builds `map_update` expressions; the
formatter detects the `map_update(var, idx, val)` pattern and produces
`lhsArray` syntax.

5. **`Sequence.empty`** — Added grammar entry with explicit type
annotation syntax (`Sequence.empty<int>()`), plus translate and format
handling, resolving the 0-ary polymorphic function limitation for this
operation. Includes solver verification tests (cherry-picked from #1036)
covering basic usage and various element types.

6. **Datatype roundtrip** — Verified that datatype formatting roundtrips
correctly (the extra-parentheses issue noted earlier is resolved).

7. **Roundtrip test infrastructure** — Added `RoundtripTest.lean` that
verifies parse → format → re-parse → re-format produces stable output
for types, functions, procedures, inline functions, parameterized type
arguments, datatypes, array assignments, and `Sequence.empty`.

8. **`getLFuncCall` reuse** — Refactored `lappToExpr` and
`decomposeMapUpdate` to use `getLFuncCall` for decomposing nested
applications, reducing code duplication with the existing utility.

## Testing

- All existing tests pass (with expected output updates for the
formatting improvements)
- New roundtrip tests verify types, functions, procedures, inline
functions, parameterized type arguments, datatypes, lhsArray, and
Sequence.empty
- Solver verification tests for Sequence.empty (from #1036) pass

## Remaining limitations

- Bitvector operations with widths beyond 1/8/16/32/64 log an error and
fall back to bv64 (requires grammar-level type entries per width)

---------

Co-authored-by: Aaron Tomb <aarotomb@amazon.com>
…coding (#935)

Fixes #917

## Summary

Introduces an abstract solver interface (`AbstractSolver τ σ m`) that
decouples term construction and session operations from the SMT-LIB
string encoding pipeline. The interface is parameterized by a term type
`τ`, sort type `σ`, and monad `m` (with `[Monad m] [MonadExceptOf
IO.Error m]` constraints), so a non-SMT-LIB backend (e.g., cvc5 FFI) can
be plugged in by implementing `AbstractSolver` without touching the
encoding logic. All methods throw `IO.Error` directly (no `Except
String` layer).

A new incremental backend communicates with a live solver process via
stdin/stdout. It is opt-in via `--incremental` (batch remains the
default).

## What the interface provides

All operations are monadic (`m`), throwing on error:

- **Configuration**: `setLogic`, `setOption`, `comment`
- **Sort construction**: `boolSort`, `intSort`, `realSort`,
`stringSort`, `regexSort`, `bitvecSort`, `arraySort`, `constrSort`,
`termTypeToSort`
- **Term construction**: `mkPrim`, `mkBool`, `mkAnd`, `mkOr`, `mkNot`,
`mkImplies`, `mkEq`, `mkIte`, `mkAdd`, `mkSub`, `mkMul`, `mkDiv`,
`mkMod`, `mkNeg`, `mkAbs`, `mkLt`, `mkLe`, `mkGt`, `mkGe`, `mkSelect`,
`mkStore`, `mkApp`, `mkAppOp`, `mkForall`, `mkExists`
- **Declarations**: `declareNew`, `declareFun`, `defineFun`,
`declareSort`, `declareDatatype`, `declareDatatypes`
- **Session**: `assert`, `checkSat`, `checkSatAssuming`,
`getUnsatAssumptions`, `getModel`, `getValue`, `reset`, `close`
- **Model inspection**: `termToSMTLibString`

Terms (`τ`) are opaque handles whose meaning is backend-specific.

## Other changes

- **Generic encoder** — `AbstractEncoder.encodeTerm` parameterized over
`τ`/`σ`, caches handles in `AbstractEncoderState τ`.
- **Generic incremental discharge** —
`Imperative.SMT.dischargeObligationIncremental` in `SMTUtils.lean` is
parameterized over `PureExpr`, so any Imperative-level language can use
the incremental solver.
- **Pluggable solver backends** — `CoreSMTSolver` and `MkDischargeFn`
threaded through `Core.verifyProgram`, `Core.verify`, and
`Strata.verify`.
- **Strict solver compatibility** — `encodeDeclarationsAbstract` skips
`declareSort` for datatype names and validates datatype fields.
- **`--incremental`** flag to opt in to the incremental backend.

## Testing

All tests pass with both batch (default) and incremental modes.

## Follow-ups

- [ ] Implement with Lean FFI for cvc5
The Core grammar declared command_datatypes as

op command_datatypes (datatypes : NewlineSepBy DatatypeDecl) : Command
=>
    datatypes ";\\n";

Without @[nonempty] on the field, NewlineSepBy compiles to a
zero-or-more parser, so the trailing ";\\n" production alone matched any
stray ";" in the input. This is particularly easy to trip accidentally:
command_fndef's surface syntax has no trailing semicolon, so a user
writing "};" at the end of a function body (by analogy with procedures,
whose grammar does end in ";") silently produced a phantom
command_datatypes op with zero datatypes.

The phantom later tripped an explicit assertion in translateDatatypes:

  "Datatype block must contain at least one datatype"

which calls panic!, producing a large backtrace at gen_smt_vcs time with
no useful source location.

Mark the datatypes field @[nonempty]. The stray ";" now surfaces as a
parse error at the point it actually appears, and the original repro
elaborates cleanly.

Add a regression test
(StrataTest/Languages/Core/Tests/Issue1146Test.lean) that pins the
canonical form: a program mixing a datatype and a function can run
through gen_smt_vcs without panicking.

Fixes: #1146

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: Juneyoung Lee <136006969+aqjune-aws@users.noreply.github.com>
#1179)

Fixes #1163

Moves all library code from `StrataMain.lean` into a new `StrataMainLib`
module, leaving `StrataMain.lean` as a thin executable entry point
(`import StrataMainLib`). This makes the library importable from tests
and other tools without pulling in the executable entry point.
…sition (#1116)

When a StaticCall to a multi-output procedure appears in expression
position (not as the RHS of an Assign), emit an error diagnostic: the
extra outputs would be silently discarded, which is a semantic bug
(e.g., error channels lost).

The diagnostic is suppressed when the call is the direct RHS of an
assignment (where the target count check already validates arity).

Includes a test in ResolutionKindTests demonstrating the diagnostic.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
…1110)

Both helpers' error message already said 'expected at least two
arguments', but the destructuring `let a :: as := as | throw ...` only
rejected the empty list: a singleton like `.app .add [x]` or `.app
.bvadd [x]` silently returned just `x`. This diverged from
`Denote.leftAssoc`, which uses `let t₁ :: t₂ :: ts := ts | none` and
returns `none` for fewer than two operands.

Tighten the pattern to `a :: b :: as := as | throw ...` and seed the
`foldl` with `mkApp2 op a b` so the helper's semantics now match the
existing error message and `Denote.leftAssoc`. Empty-operand lists
continue to throw with the same message as before.

This affects every variadic caller: `.app .and`, `.app .or`, `.app
.add`, `.app .sub`, `.app .mul`, `.app .div`, `.app .mod`, plus `.app
.bvadd`, `.app .bvsub`, `.app .bvmul`, `.app .bvand`, `.app .bvor`,
`.app .bvxor`. The Strata Core SMT encoder does not produce singleton
applications of any of these ops (ripgrep'd across `SMTEncoder.lean`,
`Factory.lean`, and `B3/Verifier/Expression.lean`: every construction
passes at least two operands), so well-formed queries are unaffected.
Zero-operand placeholders like `Term.app Op.and [] .bool` that appear as
"unreachable" fallbacks in `SMTEncoder.lean` were already rejected by
`translateTerm` before this change, and remain rejected.

Regression tests in StrataTest/DL/SMT/TranslateTests.lean cover:
- singleton `.app .add`, `.app .and`, `.app .bvadd`, `.app .bvand` (one
per helper+typeclass combination),
- empty-operand `.app .add` (unchanged behaviour),
- ternary `.app .add` producing `1 + 2 + 3 = 6` (preserves left
associativity past the new 2-operand seed).

Each singleton-rejection test fails on the previous commit (the term
silently translates to its sole operand) and passes here.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
The ANF encoder extracts duplicated subexpressions into fresh variables,
but the existing single-pass implementation can leave large duplicated
sub-subexpressions behind.

Root cause: `removeSubsumed` drops candidate duplicates that are
subexpressions of other (larger) candidate duplicates, to avoid creating
redundant variable declarations. But this means that if only the outer
expression appears at the top level, the inner dupes are hidden inside
the lifted var declaration and never extracted.

Example (from PyAnalyzeLaurel benchmark check_storage_costs):

  Original: assert Any_to_bool(Any_get(response, "Datapoints")) ...

  After partial evaluation, Any_to_bool inlines its 7-branch body,
  each branch referencing the argument. With Any_get also inlined
  as an `ite (is-DictStrAny response) (DictStrAny_get ...)
  (List_get ...)`, the Any_get expression ends up duplicated 62
  times inside a single assert.

  Old ANF output:
    var $__anf.0 : bool := <9KB body with 62 duplicates of Any_get>;
    assert $__anf.0

  New ANF output (after iteration):
    var $__anf.3 : Any := Any_get(response, "Datapoints");
    var $__anf.0 : bool := <body using $__anf.3>;
    assert $__anf.0

Effect: VC file size for VCs on one benchmark drops from 32KB to 11KB
(~65% reduction), and another benchmark now completes verification at
36s where it previously hit a 60s timeout.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
…te actions (#1137)

The same ~15-line blocks for installing cvc5, installing z3, and
restoring the lake cache were duplicated across ci.yml's three jobs
(build_and_test_lean, check_pending_python, build_python), cbmc.yml, and
python-fuzz.yml. Robin asked for this to be consolidated as part of the
#984 review (flagged non-blocking there, handled here).

Introduce three composite actions under .github/actions/:

- install-cvc5: downloads the static cvc5 release zip, makes the binary
available either on $GITHUB_PATH ('path', default) or in /usr/local/bin
('system'). Version defaults to 1.2.1.

- install-z3: same shape for z3, version defaults to 4.15.2. This fixes
a latent bug: cbmc.yml and build_and_test_lean used different aarch64
archive names ('arm64-glibc-2.34' vs. 'arm64-win'); the consolidated
action uses the correct glibc-2.34 one for both.

- restore-lake-cache: wraps actions/cache/restore@v5 with the
established key pattern lake-<os>-<arch>-<toolchain>-<manifest>-<all
.st>-<sha> plus the three fallback keys dropping each trailing
component. Exposes a 'fail-on-cache-miss' input for jobs that depend on
a cache saved for the same SHA by build_and_test_lean.

Rewire ci.yml and cbmc.yml to use the new actions:

- ci.yml: build_and_test_lean: install-cvc5, install-z3,
restore-lake-cache. check_pending_python: install-cvc5 (system),
restore-lake-cache (fail-on-cache-miss: true).
    build_python:        install-cvc5 (system), restore-lake-cache
      (fail-on-cache-miss: true).
      (The pip-based z3 install here is deliberately left as-is —
      it is a different install flow.)
- cbmc.yml:
install-cvc5, install-z3, restore-lake-cache (fail-on-cache-miss: true).

Net change: -108 lines of duplicated shell, +17 lines of action usage
across the two workflow files, with the reusable definitions ready to be
adopted by python-fuzz.yml and any future workflows.

actionlint reports no new warnings (the pre-existing save-always warning
on cbmc.yml line 22 is unrelated).

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Blocker for Laurel Seq<T>/Array<T> in #1073.

## Bounds preconditions for `Sequence.select` / `update` / `take` /
`drop`

Following the `Int.SafeDiv` pattern:

| Op                  | Precondition                        |
|---------------------|-------------------------------------|
| `Sequence.select`   | `0 <= i && i < Sequence.length(s)`  |
| `Sequence.update`   | `0 <= i && i < Sequence.length(s)`  |
| `Sequence.take`     | `0 <= n && n <= Sequence.length(s)` |
| `Sequence.drop`     | `0 <= n && n <= Sequence.length(s)` |

`Sequence.length` / `empty` / `append` / `build` / `contains` remain
total.

`PrecondElim` picks the obligations up automatically at call sites in
imperative code (via `transformStmt`) and in pure positions like
`requires` / `ensures` / quantifier bodies / function bodies (via the
synthetic `$$wf` procedures).

Obligations carry `propertyType = "outOfBoundsAccess"` (new `MetaData`
constant, mirroring `divisionByZero`), flow through a new
`PropertyType.outOfBoundsAccess` enum variant and the three
metadata-to-PropertyType conversion sites, and render as
`"out-of-bounds-access"` in SARIF output.

### Incidental fix

While wiring the SARIF classification, I noticed
`propertyTypeToClassification` in `SarifOutput.lean` was pre-existing
dead code: `vcResultToSarifResult` never set `properties.propertyType`,
so every obligation defaulted to `"assert"` in SARIF output. This PR
wires it up, so `divisionByZero` and `arithmeticOverflow` obligations
now also classify correctly in SARIF alongside the new
`outOfBoundsAccess`.

## Testing

- **`StrataTest/Transform/PrecondElim.lean` Test 10** —
`collectPrecondAsserts` attaches `outOfBoundsAccess` metadata for all
four partial ops plus a nested `Sequence.select(Sequence.update(...))`
call. Mirrors the pattern in `OverflowCheckTest.lean`. Complemented by
per-op pretty-print snapshots (`#guard_msgs`) that assert the exact
obligation body string — these catch regressions that preserve count and
metadata tag but corrupt the obligation body (e.g. passing `.Le` instead
of `.Lt` to `mkSeqBoundsPrecond` at a call site, changing the
bound-variable name, or swapping the lower/upper bound inside
`mkSeqBoundsPrecond`).
- **`StrataTest/Languages/Core/Tests/SarifOutputTests.lean`** —
property-type classification tests covering all five `PropertyType`
variants.

Collateral updates to existing tests reflect the new obligations
(`Seq.lean`) and updated `requires` on Sequence function signatures
(`ProgramEvalTests.lean`). Note: the bounds obligations in `Seq.lean`
appear as `true && 0 < length(...)` — the partial evaluator simplifies
`0 <= 0` to `true` but does not further simplify `true && X` to `X`.
This is a pre-existing evaluator gap, not newly introduced by this PR;
the SMT solver discharges the obligation trivially.

## Known downstream impact

PR #1073 (Laurel Seq<T>/Array<T>) emits `Sequence.select` / `update` /
`take` / `drop` calls from its translation. Its `T18_Sequences` test
uses `#guard_msgs(drop info, error)` on a diagnostics-only pipeline, so
the test assertions should still pass syntactically — but individual
sequence-manipulation programs now require bounds guards to fully
verify. PR #1073 has been adjusted accordingly.
## Summary

The DDM type checker uses unification to infer implicit type arguments,
but this fails in some cases — notably when template-generated accessors
with unresolved tvar return types are composed with polymorphic
functions. This PR adds a per-dialect option to bypass type checking
entirely, filling implicit type parameter slots with anonymous type
placeholders instead.

## How to use

In a dialect definition, add `dialect_option typecheck off;` to disable
type checking for all programs using that dialect:

```
#dialect
dialect MyDialect;
dialect_option typecheck off;

type Foo;
fn bar (A : Type, x : A) : A => x;
...
#end
```

Programs using the dialect will skip `inferType` and `unifyTypes` for
expression arguments. Variable name resolution and global context
population still work normally — only type inference is bypassed.

## Test changes

`StrataTest/DDM/TypecheckSkip.lean` defines an inline dialect with
parameterized types, polymorphic functions, and perField accessor
templates. It demonstrates that the accessor-into-polymorphic-fn pattern
(from issue #650 / PR #734) produces type errors with typecheck on, and
elaborates successfully with typecheck off.

## Details

- **AST.lean**: Add `TypeExprF.skip` (anonymous type placeholder `.tvar
loc ""`), `Dialect.typecheck : Bool := true`, refactor
`Program.globalContext` into explicit `computeGlobalContext` function
- **Elab/Core.lean**: Add `ElabContext.typecheck`, skip
`inferType`/`unifyTypes` in `elabSyntaxArg`, fill unfilled type param
slots in `runSyntaxElaborator` post-pass, skip pre-registration in
`elabOperation`, fix `applyNArgs` to handle tvar types
- **Elab/DeclM.lean**: Add `DeclContext.typecheck`
- **Elab/DialectM.lean**: Add `elabSetOptionCommand` handler dispatching
on `"typecheck"` option
- **BuiltinDialects/StrataDDL.lean**: Add `setOptionCommand` with syntax
`dialect_option <name> <value> ;`
- **Elab.lean**: Thread `Dialect.typecheck` into `DeclContext` in
`elabProgramRest`
- **Ion.lean, Python/Specs/DDM.lean**: Update `Program` construction for
`globalContext` refactor

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…etrics JSONL (#1093)

## Motivation

The pipeline previously had no unified infrastructure for diagnostics or
timing. Warnings were handled ad-hoc (some printed to stderr, some
silently
dropped), phase timing was manual and inconsistent, and the only
structured
output (`--warning-summary`) wrote a single JSON blob at the end —
useless
if the process was killed by a timeout. There was also no distinction
between
fatal and non-fatal errors, so callers couldn't tell whether a result
was
trustworthy.

This PR introduces `PipelineContext`/`PipelineM` as a shared backbone
that
gives every pipeline phase consistent behavior: categorized diagnostics,
nested phase timing, fatal-error abort, and streaming metrics output
that
survives timeouts.

## Summary

- Introduce `PipelineContext` and `PipelineM` for structured diagnostic
  accumulation and phase timing across the entire pipeline
- Replace ad-hoc `profileStep`/`startPhase` with `withPhase` combinator
that
  handles nesting, timing, and profile output via `try`/`finally`
- Add `withRepeatedPhase` for iteration-level timing (preprocess,
smtEncode,
solver) that accumulates count+duration and reports as `[profile]` lines
- Replace `--warning-summary` (single JSON blob) with `--metrics <file>`
  (streaming JSONL, flushed per-record for timeout resilience)
- Add `--verbose` program dumps: prints intermediate Laurel and Core
programs
  with `---- BEGIN/END ----` delimiters for programmatic parsing
- Refactor module resolution: fatal errors on invalid/missing modules,
unified `resolveModules` for dispatch, auto-resolved, and explicit
modules
- Fix `pyResolveOverloads` to print errors from pipeline context on
failure
- Split `Messages.lean` into data types (`Messages.lean`) and runtime
  infrastructure (`Context.lean`)
- Move Python-specific `MessageKind` constants to
  `Languages/Python/Specs/MessageKind.lean`

## `PipelineContext` and `PipelineM`

`PipelineContext` is a structure carrying immutable config
(`outputMode`,
`profilePipeline`) plus mutable state as individual `IO.Ref` fields:

- `messagesRef` — accumulated `PipelineMessage` diagnostics
- `toolErrorsRef` / `userCodeErrorsRef` — bucketed by impact for fast
access
- `currentPhaseRef` — current position in the phase hierarchy (managed
via
  `push`/`pop`)
- `repeatedDepthRef` — nesting depth of `withRepeatedPhase` scopes (when
> 0,
  `withPhase` aggregates silently)
- `phaseStateRef` — per-phase scoped state bundling repeated-phase
aggregation
  and message counts (saved/restored atomically on phase entry/exit)
- `metricsHandle` — optional file handle for streaming JSONL metrics

Because all state is in `IO.Ref`s, any monad with `BaseIO` access can
use
pipeline capabilities by passing a `PipelineContext` value directly —
`withPhase` is polymorphic over the monad (`[MonadLiftT BaseIO m]`).

`PipelineM` is defined as:

```lean
abbrev PipelineM := ReaderT PipelineContext (EIO Unit)
```

The `EIO Unit` base ensures that fatal errors (configuration errors,
internal errors, user code issues) are impossible to silently ignore —
they
throw `()`, aborting the pipeline. Non-fatal diagnostics (warnings,
known
limitations) are added to `messagesRef` and execution continues. Callers
that want to attempt recovery (e.g., to gather more diagnostics) can
catch
the exception, but should rethrow to avoid proceeding with invalid
state.

The top-level runner catches the exception and still returns all
accumulated
messages — so the caller always gets complete diagnostics regardless of
how
the pipeline terminated.

## Phase tracking state machine

The phase system operates in two modes controlled by `repeatedDepthRef`:
- **Mode N** (normal, depth = 0): `withPhase` prints `[start]`/`[end]`
in
  profile mode and emits JSONL timing metrics.
- **Mode R** (repeated, depth > 0): `withPhase` silently aggregates
timing
  into `phaseStateRef.repeatedPhases` — no print, no individual metrics.

`withRepeatedPhase` increments `repeatedDepthRef` on entry and
decrements
on exit. `withPhase` never changes the depth — it inherits the mode and
behaves accordingly.

Key invariants:
- `currentPhaseRef` always reflects the innermost active scope's full
path
- `phaseStateRef` is scoped: saved on entry, restored on exit (no
leakage)
- In mode R, all timing flows through `phaseStateRef.repeatedPhases`
only

## `--metrics <file>` (replaces `--warning-summary`)

Writes streaming JSONL (one JSON object per line, flushed immediately
after
each record). If the process is killed mid-pipeline, all records should
remain.

Record types:
- `timing` — phase start/end in ms, with optional count for aggregated
phases
- `diagnostic` — pipeline messages with phase, file, category, impact,
location
- `outcome` — result/exit_code/total_ms, written last (absence = killed)

Example output:
```jsonl
{"type":"timing","phase":"pythonAndSpecToLaurel","start_ms":0,"end_ms":97}
{"type":"diagnostic","phase":"pythonAndSpecToLaurel.resolveAndBuildPrelude","file":"specs/builtins.ion","category":"unsupportedUnion","impact":"knownLimitation","message":"unsupported overload form for 'dict.update'"}
{"type":"timing","phase":"verification.vcDischarge.solver","start_ms":0,"end_ms":1102,"count":12}
{"type":"outcome","result":"verified","exit_code":0,"total_ms":1830}
```

## `--verbose` program dumps

When `outputMode == .verbose`, the pipeline prints intermediate programs
to
stdout with delimiters for programmatic parsing:

```
---- BEGIN Laurel Program ----
<full Laurel program text>
---- END Laurel Program ----
---- BEGIN Core Program ----
<full Core program text>
---- END Core Program ----
```

## Module resolution refactoring

- `resolveModuleEntry` now takes a parsed `ModuleName` (not a string)
- Single `resolveModules` handles dispatch, auto-resolved, and explicit
modules
- `invalidModuleName` changed from `internalWarning` to
`configurationError` (fatal)
- Merged `missingDispatchModule` + `missingExplicitPySpec` into
`missingPySpecModule`
- Eliminated duplicate message scenario where both invalid-name and
missing-module
  errors fired for the same input

## `pyResolveOverloads` error handling

When `readDispatchOverloads` encounters a fatal error (e.g., nonexistent
file),
the command now reads `pctx.messagesRef` and prints all accumulated
messages
to stderr before exiting with a non-zero code — instead of silently
yielding
empty results.

## Profile output format

`--profile` emits structured `[start]`/`[end]`/`[profile]` lines to
stdout:

```
[start] pythonAndSpecToLaurel (time: 0ms)
  [start] readPythonIon (time: 0ms)
  [end] readPythonIon (time: 2ms)
  [start] resolveAndBuildPrelude (time: 2ms)
  [end] resolveAndBuildPrelude (time: 97ms)
  [warnings] resolveAndBuildPrelude: 2 unsupportedUnion
[end] pythonAndSpecToLaurel (time: 97ms)
[start] verification (time: 105ms)
  [start] vcDischarge (time: 320ms)
    [profile] preprocess (×12, total: 45ms, avg: 3ms)
    [profile] solver (×12, total: 1102ms, avg: 91ms)
  [end] vcDischarge (time: 1185ms)
[end] verification (time: 1185ms)
```

## Design: impact drives behavior

Each `MessageKind` has a `MessageImpact` that determines pipeline
behavior:

| Impact | Fatal? | Meaning |
|--------|--------|---------|
| `configurationError` | yes | Invalid arguments or unreadable on-disk
pyspecs |
| `internalError` | yes | Unexpected failure preventing output |
| `userCodeIssue` | yes | Issue in the user's source code |
| `internalWarning` | no | Unexpected condition that didn't prevent
output |
| `knownLimitation` | no | Documented gap in modeling |

Note: `emitMessageAndAbort` is the function that aborts — callers choose
whether to abort. A caller may use `emitMessage` (continues) and check
`isFatal` at phase boundaries for custom abort strategies.

## Not addressed: `[statistics]` output

The `[statistics]` lines (transform counters like
`Evaluator.simulatedStmts`,
`FilterProcedures.erasedProcedures`) are orthogonal to phase timing.
They
currently bypass `PipelineContext` — printed via bare `IO.println` in
`Core.verify` and `StrataMainLib.lean` — so they don't respect phase
nesting
or indentation. Integrating statistics into the phase system is future
work.

## Key files changed

- `Strata/Pipeline/Messages.lean` — pure data types: `Phase`,
`MessageImpact`, `MessageKind`, `PipelineMessage`
- `Strata/Pipeline/Context.lean` — runtime: `PipelineContext`,
`PipelineM`, `withPhase`, `withRepeatedPhase`, `emitMetric`,
`OutputMode`
- `Strata/Languages/Python/Specs/MessageKind.lean` — Python-specific
`MessageKind` constants
- `Strata/Pipeline/PyAnalyzeLaurel.lean` — top-level pipeline
orchestration, `--verbose` program dumps, `metricsHandle` threading
- `Strata/Languages/Core/Verifier.lean` — replaced manual timing with
`withRepeatedPhase`
- `Strata/Languages/Laurel/LaurelCompilationPipeline.lean` — replaced
`profileStep` with `withPhase`
- `Strata/Languages/Python/PySpecPipeline.lean` — fatal abort via throw,
module resolution refactor, diagnostic metric emission
- `StrataMainLib.lean` — `--metrics` flag, outcome record emission,
error printing in `pyResolveOverloads`
- `StrataTest/Pipeline/PhaseTimingTest.lean` — phase timing integration
tests

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
Co-authored-by: June Lee <lebjuney@amazon.com>
**Warning:** This repository will shortly undergo a split into several
separate repositories. If you're creating a PR that crosses the
boundaries between these repositories, you may want to hold off until
the split is complete or be prepared to rework your PR into multiple PRs
once the split is complete.

The code that will be moved includes:
- Strata/DDM/*
- Strata/Languages/Boole/*
- Strata/Languages/Python/* along with Tools/Python/*
- Tools/BoogieToStrata
## Summary

`PythonIdent.pythonModule` was a raw `String`, allowing empty module
names to propagate silently through the system and cause downstream
failures (see #1143). This PR changes `pythonModule` to use `ModuleName`
— which now guarantees at least one non-empty component — and adopts it
more widely through the PySpec pipeline, eliminating this class of bugs
at the type level.

## Approach

Rather than adding runtime checks at each use site, `ModuleName` gains a
strengthened invariant via a private constructor enforcing non-empty
components. `PythonIdent.ofComponent` uses `decide` to verify
non-emptiness at compile time for literal module names (used in
`Decls.lean`). Runtime parsing goes through `ModuleName.ofString?` which
returns `Option`.

## Key changes

- **PythonIdent.lean**: `PythonIdent.pythonModule` changed from `String`
to `ModuleName`. New `ofComponent` constructor with compile-time proof
for single-component literals. `ModuleName` API extended: `back`,
`parent (n)`, `push`, `append`/`++`, `toString (sep)`
- **PythonIdent.lean**: New `ModuleName.ofRelativePath` provides a
single canonical function for deriving a `ModuleName` from a `.py` file
path relative to a search root, handling both regular files and
`__init__.py` packages. This eliminates redundant ad-hoc implementations
scattered across `discoverModules`, `RelativeImportTest`, and
`findInPath` — each of which had slightly different `__init__.py`
handling, creating inconsistency and latent bugs (e.g.,
`pySpecToLaurelCommand` silently failed for package modules)
- **Specs.lean**: `findInPath` simplified to return `(FilePath × Bool)`
instead of `(FilePath × Array ModuleComponent)` — the `Bool` is
`isInit`, and callers use `ModuleOfPath.modulePrefix` to derive the
package prefix. Deleted unused `ofFile` function
- **Specs/DDM.lean**: Converted `fromDDM` functions to monadic `FromDDM`
(`Except (SourceRange × String)`) for proper error propagation instead
of panics
- **Specs/ToLaurel.lean**: `signaturesToLaurel` takes `ModuleName`
instead of `String` prefix. Overload extraction factored into dedicated
`ExtractM` monad (no dependency on `ToLaurelContext`)
- **PySpecPipeline.lean**: `buildPySpecLaurel` takes `Array (ModuleName
× String)` instead of `Array (String × String)`
- **Specs.lean**: Import resolution uses `ModuleName` throughout;
`translateFile` requires a `ModuleName` (no longer optional)
- **Specs/Decls.lean**: Builtin idents use `ofComponent` with
compile-time proof

### Unrelated cleanup

- Deleted `Strata.Python.containsSubstr` helper and replaced all call
sites with `String.contains` (which accepts substring patterns in Lean
4.29)

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…1149)

*Issue #, if available:* #1148

*Description of changes:*

1. Namespace collision (issue 1 in #1148): Constants that share a name
with a procedure are now prefixed with __const_ in the Strata output.
The rename is applied consistently in declarations and references.

2. Recursive synonym resolution (Fixes issue 2 in #1148):
DealiasTypeExpr now recurses on its result, resolving synonym chains
like ref := i64 := int fully to the base type. Previously it resolved
only one level, causing panics on comparison/arithmetic operators
applied to multi-level type synonyms.

3. SMACK assert encoding (issue 3 in #1148): Procedures named assert_.*
now get a requires (arg != 0) precondition emitted. Call elimination
generates VCs at each call site, making SMACK assertions verifiable.

4. InferModifies: SMACK-generated Boogie often omits explicit `modifies`
clauses on procedures that mutate globals. Setting `InferModifies =
true` in BoogieToStrata has two effects: (1) it runs
`ModSetCollector.CollectModifies(program)` to populate empty modifies
clauses, and (2) through `CheckModifies` in Boogie's resolution context,
it suppresses typechecking of modifies clauses. Without this,
`ResolveAndTypecheck` would reject SMACK programs that omit modifies
clauses.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: David Deng <htd@amazon.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes #1045

## Summary

Adds a `--parallel N` flag that runs up to N solver instances
concurrently when verifying proof obligations. Without the flag (or with
`--parallel 1`), behavior is unchanged (sequential).

## Problem

Verification of programs with many obligations is bottlenecked by
sequential solver invocations. Each obligation spawns a separate solver
process, waits for the result, then moves to the next.

## Solution

When `--parallel N` is specified (N > 1), the verification pipeline
splits into two phases:

1. **Sequential preprocessing** (fast): determine checks, preprocess
obligations, encode to SMT terms. Obligations resolved by the evaluator
are handled immediately.
2. **Parallel solver dispatch** (slow): obligations that need the solver
are placed in a shared queue. N worker tasks (on dedicated threads)
continuously pull from the queue — when a solver finishes, it
immediately picks up the next unsolved obligation. Results are collected
in original obligation order so output is deterministic.

The worker pool design avoids the "wait for slowest in batch"
bottleneck: if one obligation takes 10s and others take 1s, the
fast-finishing workers immediately start on the next obligation instead
of idling.

`stopOnFirstError` is supported via a shared flag: on failure, workers
stop claiming new jobs. Already-running jobs complete naturally; skipped
jobs leave their placeholder results in place.

Both the incremental and batch solver paths are safe for parallel use:
the incremental backend spawns independent solver processes, and the
batch path uses atomic `modifyGet` for filename counter generation.

**Pluggable discharge function**: The full public API (`Strata.verify`,
`Core.verify`, `verifySingleEnv`, `mkDefaultCoreSMTSolver`) accepts a
`mkDischarge : MkDischargeFn` parameter (defaulting to `mkDischargeFn`).
External solvers (e.g. using the AbstractSolver API) can provide their
own discharge function factory.

## Performance

Benchmark: 30 obligations across 2 programs, z3 4.12.2, avg over 3 runs:

| Mode | Time | Speedup |
|------|------|---------|
| `--parallel 1` (sequential) | 921ms | baseline |
| `--parallel 2` | 491ms | **1.88x** |
| `--parallel 3` | 405ms | **2.27x** |
| `--parallel 4` | 356ms | **2.59x** |

Sweet spot is `--parallel 3-4`. The 1→2 jump gives the biggest single
improvement (1.88x), then diminishing returns as solver spawn overhead
and sequential preprocessing become the bottleneck.

## Testing

All tests that pass without `--parallel` also pass with it. The
sequential path (`--parallel 1`, the default) is unchanged.
When translateExpr produces an expression containing a nested
multi-output procedure call (e.g., PSub(base, timedelta_func(...))), the
new extractMultiOutputCalls pass in translateAssign rewrites it into a
preceding multi-target assignment and a variable reference:

  var $mo_N := ...; assign $mo_N, maybe_except := timedelta_func(...);
  delta := PSub(base, $mo_N)

This prevents multi-output calls from appearing in expression position,
which would silently discard the error channel.

Key design choices:
- Counter seeded from source byte offset for globally unique temp
variable names
- Branch-local multi-output calls are wrapped in Block nodes inside
their respective IfThenElse branches (not hoisted unconditionally)
- Generated wrapper statements propagate source metadata from the
original call
- Block-flattening in the lifter uses cons-based prepend for correct
evaluation order
- containsBareAssignment skips Blocks so generated wrappers pass through
assert/assume handlers
- extractMultiOutputCalls uses structural recursion (attach) rather than
partial

Tested: existing tests pass, builds cleanly.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: Mikaël Mayer <MikaelMayer@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mikael Mayer <mimayere@amazon.com>
### Changes
- Fix scoping bug in HeapParam pass where `Declare` targets inside
blocks created by `heapTransformExpr` were invisible to subsequent
statements
- Fix small transparency bug in ConstrainedTypeElim
- Factor out `heapVarName` / `heapInVarName` constants to avoid
duplicated magic strings across passes
- Add consistency check to LaurelCompilationPipeline to detect bugs in
passes
- Move intermediate program output to
`.lake/build/intermediatePrograms/`

### Testing
- Existing tests pass, new consistency check added

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Michael Tautschnig <mt@debian.org>
Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
Co-authored-by: keyboardDrummer-bot <keyboarddrummer.bot@gmail.com>
Co-authored-by: Mikael Mayer <mimayere@amazon.com>
…1134)

Holes in arguments to datatype destructors (e.g.
`Any..as_ListAny!(<?>)`) and testers (e.g. `Any..isfrom_str(<?>)`) were
left untyped after `InferHoleTypes`, producing "could not infer type"
diagnostics and downstream `Unknown`-typed `$hole_N` functions.

The fix surfaces the parent datatype through the semantic model so
`InferHoleTypes` can recover parameter types via `uniqueId`-based model
lookups (per the Laurel steering doc, post-resolution passes should rely
on model lookups, not textual names).

Key changes:

- New `ResolvedNode.datatypeDestructor` variant carrying the parent type
name and field parameter, with corresponding updates to
`ResolvedNodeKind`, `.kind`, `.getType`, and `SemanticModel.isFunction`.
- Datatype scope registration chains `defineNameCheckDup` calls so
safe/unsafe destructors and tester/constructor names share a single
`uniqueId`.
- `buildRefToDef` registers each constructor parameter as
`.datatypeDestructor`.
- `calleeParamTypes` matches `.datatypeConstructor` and
`.datatypeDestructor` directly, synthesizing the correct `.UserDefined`
input type.
- `ResolvedNode.getType` now uses explicit pattern matches instead of a
catch-all, so new constructors trigger compile errors rather than
silently falling through.
- `SemanticModel.refToDef` is now `private`; external code uses
`SemanticModel.get?` (returning `Option ResolvedNode`) or
`SemanticModel.get` to look up nodes by `Identifier`, preventing
arbitrary `Nat` key access.

Tests: three new `#guard_msgs` cases for holes inside
destructors/testers, plus updated expected diagnostics in
`ResolutionKindTests`. Existing tests pass.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: Mikael Mayer <mimayere@amazon.com>
Migrate 241 of 282 non-Boole StrataTest files to be Lean modules,
completing the module migration started in #982.

- **Added `module` keyword** to 241 test files. The Boole test files are
excluded because `gen_smt_vcs` produces proof terms that fail kernel
type-checking when the file is a module.

- **Added `meta import`** for Strata/StrataTest dependencies where
`#eval`, `#guard_msgs`, or `#guard` need native implementations from the
imported module.

- **Added `meta import all`** where proofs use `simp`, `rfl`, or
`native_decide` on definitions whose bodies must be visible across the
module boundary.

- **Added `import all`** where non-meta definitions (types, instances)
need access to definition bodies from other StrataTest modules.

- **Added `meta section` wrappers** around test code that exercises
native implementations, since `#eval`/`#guard_msgs`/`#guard` require
meta context to call into compiled code from another module.

- **Added `import Strata.DDM.Integration.Lean.HashCommands`** (non-meta)
in files using `#strata`/`#dialect` syntax, since the syntax extension
must be available at parse time regardless of meta context.

- **Updated byte offsets** in `#guard_msgs` docstrings where expected
output includes position-dependent labels (e.g., invariant names derived
from source positions shifted by added import lines).

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The `mkDischarge` parameter was accepted by `Core.verifyProgram` but not
exposed through `PyAnalyzeConfig` or the CLI `verify` command, so
callers could not supply a custom solver discharge function.

This PR:
- Adds `mkDischarge` to `PyAnalyzeConfig` (defaulting to
`Core.mkDischargeFn`) and forwards it to the verification call.
- Adds `mkDischarge` as a parameter to `verifyCommand` and passes it
through to the `verify` call.
- Adds `mkDischarge` as a parameter to `pyAnalyzeLaurelCommand` (same
pattern), so downstream executables can override the discharge function
without redefining the command.

Tested: existing tests pass, build succeeds.

Fixes #1199
## Summary

Run `lake shake` to minimize the import graph across all 293 Strata
modules,
reducing average transitive Strata imports by 32% (56 → 38) and total
transitive
imports (including Lean/Std) by 40% (1379 → 831). Add CI linting to
ensure
`Strata.lean` transitively imports every module, and add an import
statistics
script for ongoing measurement.

## Changes

- **Import lint driver (`Scripts/CheckImports.lean`)**: Verifies that
all `.lean`
files under `Strata/` are transitively imported by `Strata.lean`.
Registered as
`lintDriver` in `lakefile.toml` so `lake lint` runs it automatically.
Supports
`-- noimport: Strata.Foo` comments in `Strata.lean` to allowlist
intentionally
  excluded modules.

- **Import statistics script (`Scripts/ImportStats.lean`)**: Computes
per-module
transitive import counts (both total and Strata-only), reporting
average,
median, min, max, top/bottom lists, and histograms. Run via `lake exe
ImportStats`.

- **`lake shake Strata --fix`**: Applied shake's suggestions to remove
unnecessary
imports across 225 files (net -185 lines). This is the bulk of the
change.

- **Aggregator modules use `-- shake: keep`**: Pure aggregator modules
  (`CBMC.lean`, `GOTO.lean`, `B3.lean`, `Dyn.lean`, `Python.lean`,
  `Imperative.lean`, `Lambda.lean`, `DDM.lean`, `LExprType.lean`,
`B3/Verifier.lean`) intentionally re-export their submodules and are
annotated
  with `-- shake: keep` so `lake shake --fix` won't strip them.

- **DDM elaboration imports use `-- shake: keep`**: Files using
`#dialect`,
`#strata`, `#strata_gen`, and related macros have elaboration-time
import
  dependencies that shake cannot detect. These are annotated with
  `-- shake: keep` to prevent breakage.

- **Renamed aggregator files**: `Dyn/Dyn.lean` → `Dyn.lean`,
  `Python/Python.lean` → `Python.lean`, `Imperative/Imperative.lean` →
`Imperative.lean` — the `Foo/Foo.lean` pattern is unnecessary now that
these
  are proper `module` files.

- **Orphaned modules added to `Strata.lean`**: When shake stripped
re-exports
from real code files (e.g. `ProgramWF.lean` no longer re-exports
`StatementWF`),
the orphaned modules were added directly to `Strata.lean` to maintain
full
  coverage.

## Results

| Metric | Before | After | Change |
|---|---|---|---|
| Avg transitive imports (Strata-only) | 56.2 | 38.2 | -32% |
| Median transitive imports (Strata-only) | 38 | 22 | -42% |
| Avg transitive imports (all) | 1379 | 831 | -40% |
| Total import edges (Strata-only) | 16,455 | 11,194 | -32% |
| Clean build time | 3m 02s | 2m 58s | -2% |

By submitting this pull request, I confirm that you can use, modify,
copy, and
redistribute this contribution, under the terms of your choice.
@github-actions github-actions Bot added Core GOTO dependencies Pull requests that update a dependency file github_actions Pull requests that update GitHub Actions code SMT labels May 22, 2026
# Move Boole dialect to standalone package

Extract the Boole dialect into its own `StrataBoole` Lake package with a
dependency on the main `Strata` package. This enables faster incremental
builds for Boole-specific work and cleaner separation of concerns.

## Changes

- Move `Strata/Languages/Boole/` source files to
`StrataBoole/StrataBoole/`
- Move Boole test files to `StrataBoole/StrataBooleTest/`
- Create `StrataBoole/lakefile.toml` with `Strata` as a dependency
- Add `StrataBoole/StrataBooleTest.lean` root module importing all test
files
- Fix `MetaVerifier.lean` to add required meta imports
(`Lean.Meta.Eval`, `Lean.Meta.Tactic.Rewrite`,
`Lean.Meta.Tactic.Unfold`) and define `genSMTVCsBoole` as a standalone
function for the tactic infrastructure
- Update all test imports from `Strata.Languages.Boole.*` to
`StrataBoole.*`
- Update `#guard_msgs` expected output to reflect shifted source
positions
- Add CI workflow step for building the `StrataBoole` package
- Remove Boole-specific code from the main `Strata` package
(`Strata.lean`, `MetaVerifier.lean`, `StrataMainLib.lean`)
@github-merge-queue github-merge-queue Bot deployed to github-pages May 22, 2026 16:08 Active
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Core dependencies Pull requests that update a dependency file Git conflicts github_actions Pull requests that update GitHub Actions code GOTO Laurel Python SMT

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants