Skip to content

Formatting and debugging improvements#1115

Open
keyboardDrummer-bot wants to merge 38 commits into
main2from
formatting-and-debugging-improvements
Open

Formatting and debugging improvements#1115
keyboardDrummer-bot wants to merge 38 commits into
main2from
formatting-and-debugging-improvements

Conversation

@keyboardDrummer-bot
Copy link
Copy Markdown
Collaborator

Summary

Extracts the formatting and debugging improvements from #34 into a standalone PR.

Formatting improvements

  • Block formatting: Changes block output from single-line { stmt1; stmt2 } to vertical layout with indent(2):
    {
      stmt1;
      stmt2
    }
    
  • Semicolon separator: Uses newlines instead of spaces between semicolon-separated items in the formatter.

Debugging improvements

  • Better diagnostic reporting: Replaces the boolean coreProgramHasSuperfluousErrors with a coreDiagnostics : List DiagnosticModel that records why the Core program was suppressed. When no other diagnostics explain the suppression, these are surfaced to the user.
  • Informative invalidCoreType: Adds source and reason parameters so each suppression site provides context about what went wrong.
  • Intermediate file output: Adds processLaurelFileKeepIntermediates test helper that writes pipeline intermediate files to Build/ for debugging.
  • .gitignore: Adds Build/ directory.

Test updates

  • All expected outputs updated to match the new block formatting.
  • Minor #guard_msgs whitespace fixes.
  • Some test inputs updated to use opaque keyword where needed for the new formatting to apply correctly.

Formatting improvements:
- Change block formatting to use indent(2) with newlines for vertical layout
  instead of single-line '{ ... }' format
- Update semicolon separator to use newlines instead of spaces

Debugging improvements:
- Replace boolean 'coreProgramHasSuperfluousErrors' with 'coreDiagnostics' list
  that records why the Core program was suppressed, enabling better error surfacing
- Add source location and reason parameters to 'invalidCoreType' for more
  informative diagnostics
- Surface core diagnostics when no other diagnostics explain program suppression
- Add 'processLaurelFileKeepIntermediates' test helper for writing intermediate
  pipeline files to Build/ directory
- Add Build/ to .gitignore

Test updates:
- Update all expected outputs to match new block formatting
- Add 'opaque' keyword to test procedures that need it for the new formatting
- Fix #guard_msgs whitespace formatting
Diagnostics with FileRange.unknown are not actionable for users and
can arise from pre-existing resolution limitations (e.g., variables
in multi-assign with field targets losing their uniqueId). Filter
these out when surfacing suppression reasons to avoid spurious errors.
The CodeBuild source location was hardcoded to strata-org/Strata.git,
which fails when the commit only exists in keyboardDrummer/Strata.
Use github.server_url/github.repository so it resolves to the correct
repo dynamically.
The CodeBuild project has credentials for strata-org/Strata only.
On forks like keyboardDrummer/Strata, the job cannot access the source.
Add an 'if' condition to skip the benchmark on non-upstream repos,
and restore the hardcoded source URL that CodeBuild expects.
…gging-improvements

# Conflicts:
#	Strata/Languages/Laurel/LaurelToCoreTranslator.lean
@github-actions github-actions Bot added Laurel Python github_actions Pull requests that update GitHub Actions code labels May 5, 2026
@keyboardDrummer keyboardDrummer marked this pull request as ready for review May 5, 2026 16:23
@keyboardDrummer keyboardDrummer requested a review from a team May 5, 2026 16:23
@keyboardDrummer keyboardDrummer enabled auto-merge May 5, 2026 16:23
Comment thread .github/workflows/ci.yml Outdated
@github-actions github-actions Bot removed the github_actions Pull requests that update GitHub Actions code label May 5, 2026
@keyboardDrummer
Copy link
Copy Markdown
Contributor

@keyboardDrummer-bot please resolve the build failures

Copy link
Copy Markdown
Contributor

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. Scope. The PR title is accurate but understates what's inside. It contains:

  • Formatter change in Strata/DDM/Format.lean:456 — flips SepFormat.semicolon from "; " to ";\n". This is a DDM-wide change that affects any dialect using SemicolonSepBy. I searched the repo (grep -rn SemicolonSepBy Strata/Languages --include='*.st') and only Laurel currently uses it, so in practice the blast radius is the Laurel tests you're already updating — but the commit history doesn't flag this as a cross-cutting change, and anyone adding a new dialect that uses SemicolonSepBy later will inherit the vertical layout without necessarily expecting it. Worth a one-line PR-body note or a comment in Format.lean.
  • Laurel grammar change at LaurelGrammar.st:109–110 — the block and labelledBlock ops use "{\n " indent(2, stmts) "\n}" instead of the single-line form. This is the Laurel-local wrapper that, combined with (1), gives the vertical { \n … \n} shape.
  • Diagnostic-reporting refactor in LaurelToCoreTranslator.lean and LaurelCompilationPipeline.leanBool → List DiagnosticModel. Touches a different concern from the formatter change.
  • Test helper processLaurelFileKeepIntermediates in TestExamples.lean, plus .gitignore entry for Build/.
  • ~250 lines of test expected-output updates across 8+ files.

Ideally this would have been three PRs (formatter, diagnostic refactor, test helper). Given it's already cut, a structured commit log and PR body note would help reviewers separate the concerns — they're independently reviewable.

2. Silent-suppression edge case. At LaurelCompilationPipeline.lean:218:

    if translateState.coreDiagnostics.length > 0 && allDiagnostics.isEmpty then
      let locatedDiags := translateState.coreDiagnostics.filter (·.fileRange != FileRange.unknown)
      allDiagnostics := allDiagnostics ++ locatedDiags

    let coreProgramOption :=
      if !translateState.coreDiagnostics.isEmpty then none else coreProgramOption

If translateState.coreDiagnostics is non-empty but every entry has FileRange.unknown, then:

  • locatedDiags = [], so allDiagnostics stays empty.
  • coreProgramOption := none because coreDiagnostics.isEmpty = false.
  • User sees: no diagnostics, but the translation is suppressed.

The intended rationale ("those without one are not actionable for the user") is understandable for locations, but the diagnostics still carry a message field that's strictly more useful than silence. Before this PR the behaviour was "silent suppression on coreProgramHasSuperfluousErrors = true", so this isn't a regression — it just misses an opportunity that the new design could take.

Two shapes of fix:

  • Don't filter. allDiagnostics := allDiagnostics ++ translateState.coreDiagnostics. The user gets the message text without a location; still better than silence, and the StrataBug type tag (which every invalidCoreType entry sets) already flags these as pipeline issues rather than user errors.
  • Fallback message. Keep the filter; if locatedDiags is empty but coreDiagnostics is non-empty, synthesise a single summary diagnostic of the form "Translation suppressed with {n} internal errors without source locations: {message1}; {message2}; …".

Inline suggestion on line 215–223.

3. Duplicate test helper. processLaurelFileKeepIntermediates (plus buildDir plus the Build/ .gitignore entry) is added verbatim by PR #1113 as well. Whichever lands first, the other will have a merge conflict. Worth coordinating with the #1113 author so only one PR lands the helper (probably this one, since the PR description ties it to the diagnostic-reporting debugging work).

4. invalidCoreType default reason is noisy. LaurelToCoreTranslator.lean:78:

private def invalidCoreType (source : Option FileRange := none) (reason : String := "Type could not be translated to Core (resolution error placeholder)") : TranslateM LMonoTy := do

Every call site now passes an explicit source and reason — the only caller left using the default is… none, I think; every invalidCoreType call site in this diff passes both arguments. If that's true, dropping the defaults makes the contract explicit and eliminates a class of "I forgot to say why" callers. Low-priority stylistic nit.

5. Redundant diagnostic write in the last arm of translateType. Lines 111–113:

  | _ => do
    emitDiagnostic (diagnosticFromSource ty.source "cannot translate type to Core: not supported yet" DiagnosticType.StrataBug)
    invalidCoreType ty.source s!"cannot translate type to Core: not supported yet"

The emitDiagnostic call adds the message to state.diagnostics; the invalidCoreType call then also constructs and appends an identical message to state.coreDiagnostics. Since allDiagnostics is now non-empty (the emitDiagnostic entry), the guard at line 215 skips over the coreDiagnostics entry — so the duplicate is dead state, not user-visible duplication, but it's carrying a second copy of the same DiagnosticModel through the pipeline. Either drop the emitDiagnostic (and rely on the fallback surfacing at line 218) or have invalidCoreType only append to coreDiagnostics without constructing a fresh diagnostic when the caller already emitted one. Small cleanup; not blocking.

Comment thread Strata/Languages/Laurel/LaurelCompilationPipeline.lean
Comment on lines +39 to +46
/-- Project-root-relative path to the `Build/` directory for intermediate files.
Resolved from the current working directory so it works on any machine. -/
def buildDir : IO String := do
let cwd ← IO.currentDir
return s!"{cwd}/Build/"

def processLaurelFileKeepIntermediates (input : InputContext) : IO (Array Diagnostic) := do
let dir ← buildDir
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate helper with PR #1113. The buildDir + processLaurelFileKeepIntermediates pair, and the .gitignore Build/ entry, are added verbatim by PR #1113 as well. Whichever of the two PRs lands first, the other gets a merge conflict on three files.

Secondary note: processLaurelFileKeepIntermediates isn't called from any test in this diff — it's infrastructure for ad-hoc debugging. That's fine, but infrastructure without a first caller tends to rot. If you have a specific debugging scenario the helper is intended for, consider wiring a small example test that actually uses it (asserting that Build/ gets populated, or just smoke-testing one pass's intermediate file).

Or, if the helper is strictly for local debugging and not intended to run in CI, mark it @[inline]-ish with a comment (-- For local debugging; never invoked in CI) and leave the .gitignore hint alone. Either way, some form of "why this exists" annotation would help the next maintainer who sees an unused public def.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keyboardDrummer-bot can you add the comment?

@keyboardDrummer
Copy link
Copy Markdown
Contributor

keyboardDrummer commented May 5, 2026

1. Scope. The PR title is accurate but understates what's inside. It contains:

Ideally this would have been three PRs (formatter, diagnostic refactor, test helper). Given it's already cut, a structured commit log and PR body note would help reviewers separate the concerns — they're independently reviewable.

Are you asking me to change my approach to cutting PRs? I'm not used to a PR like this, which is of fairly manageable size and complexity, to get such feedback.

2. Silent-suppression edge case. At LaurelCompilationPipeline.lean:218:

Let's remove the filter. It makes no sense to me :) @keyboardDrummer-bot can you do that?

3. Duplicate test helper. processLaurelFileKeepIntermediates (plus buildDir plus the Build/ .gitignore entry) is added verbatim by PR #1113 as well. Whichever lands first, the other will have a merge conflict. Worth coordinating with the #1113 author so only one PR lands the helper (probably this one, since the PR description ties it to the diagnostic-reporting debugging work).

I don't think there would be a merge conflict, since the code is the same, and if there was, it would be trivially resolved by a bot. Do you really think coordination here (with the other author also being myself) is necessary?

4. invalidCoreType default reason is noisy. LaurelToCoreTranslator.lean:78:

private def invalidCoreType (source : Option FileRange := none) (reason : String := "Type could not be translated to Core (resolution error placeholder)") : TranslateM LMonoTy := do

Every call site now passes an explicit source and reason — the only caller left using the default is… none, I think; every invalidCoreType call site in this diff passes both arguments. If that's true, dropping the defaults makes the contract explicit and eliminates a class of "I forgot to say why" callers. Low-priority stylistic nit.

Okay, @keyboardDrummer-bot please remove the default.

5. Redundant diagnostic write in the last arm of translateType. Lines 111–113:

  | _ => do
    emitDiagnostic (diagnosticFromSource ty.source "cannot translate type to Core: not supported yet" DiagnosticType.StrataBug)
    invalidCoreType ty.source s!"cannot translate type to Core: not supported yet"

@keyboardDrummer-bot please remove the redundant write.

…t emitDiagnostic

- Remove the FileRange.unknown filter in LaurelCompilationPipeline: all
  coreDiagnostics are now surfaced when no other diagnostics explain
  the suppression.
- Remove default parameters from invalidCoreType, making source and
  reason explicit at every call site.
- Remove redundant emitDiagnostic in the catch-all arm of translateType:
  invalidCoreType already records the diagnostic via coreDiagnostics.
Diagnostics without a known source location are not actionable for the
user. Restore the filter that was removed in 0b6cee4 to fix the
T1_MutableFields test failure.
Copy link
Copy Markdown
Contributor

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Silent-suppression edge case in LaurelCompilationPipeline.lean (my inline on lines 215–223; thread now marked resolved on GitHub). 0b6cee46 took my option (a) — dropping the filter — then b30607ea reverted it with message "Diagnostics without a known source location are not actionable for the user. Restore the filter to fix the T1_MutableFields test failure." So:

  • The root concern (unlocated coreDiagnostics get silently dropped, the refactor's "we now have the info" promise is partially broken) still holds in the current code.
  • The test failure the revert cites reveals something I hadn't noticed: unlocated coreDiagnostics are being produced during the normal verification of T1_MutableFields, not only in the "shouldn't happen" paths. That's itself a symptom of upstream diagnostic labeling being sloppier than the filter's rationale assumes.
  • Option (b) from my earlier inline — keep the filter but synthesise a summary diagnostic when locatedDiags is empty AND coreDiagnostics is non-empty — wasn't tried. It wouldn't hit the test-failure that killed option (a), because in T1_MutableFields's case the located arm is active alongside any unlocated entries, so the fallback never fires. See the renewed inline.

The b30607ea revert commit message suggests unlocated coreDiagnostics fire during normal operation. If the T1_MutableFields test hits the surfacing path (which it would if allDiagnostics.isEmpty and any coreDiagnostics are present), then some upstream invalidCoreType call is recording a DiagnosticType.StrataBug diagnostic without a source location as part of a successful T1 compile. That's not something the filter can fix — it's a mislabelling at the source (an invalidCoreType call that shouldn't be firing, or firing without a source when one is available). Worth tracking as a follow-up: grep invalidCoreType Strata/Languages/Laurel/LaurelToCoreTranslator.lean against the T1 fixture to find which call site is the culprit.

No test or proof coverage changes from my earlier review — the suggestions I made there still apply, and the regression test I asked for would have caught exactly the case that drove the revert: a program whose suppression reason is a StrataBug without source location should emit something to the user, not silence.

Comment on lines +215 to 223
if translateState.coreDiagnostics.length > 0 && allDiagnostics.isEmpty then
-- The program was suppressed but no diagnostics explain why — report the core diagnostics
-- that have a known source location (those without one are not actionable for the user).
let locatedDiags := translateState.coreDiagnostics.filter (·.fileRange != FileRange.unknown)
allDiagnostics := allDiagnostics ++ locatedDiags

let coreProgramOption :=
if translateState.coreProgramHasSuperfluousErrors then none else coreProgramOption
if !translateState.coreDiagnostics.isEmpty then none else coreProgramOption
return (coreProgramOption, allDiagnostics, program, stats)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renewed concern. Thread #2 on this block was marked resolved but the revert at b30607ea restored the silent-suppression edge case I originally flagged. To recap:

  • 0b6cee46 took my option (a): drop the filter, surface all coreDiagnostics. The T1_MutableFields test failed.
  • b30607ea reverted to the original filter, citing that test failure.
  • The silent-suppression case is therefore still present in the current HEAD.

The revert message notes "Diagnostics without a known source location are not actionable for the user." — fair, but the conclusion-from-test-failure isn't "drop them silently", it's "the current flood of unlocated diagnostics during T1's compile is surprising in the first place". Two independent follow-ups are worth doing:

In this PR — try option (b) from my earlier inline. Keep the filter, but when all the coreDiagnostics are unlocated, synthesise a single summary diagnostic so the user isn't left with silence:

Suggested change
if translateState.coreDiagnostics.length > 0 && allDiagnostics.isEmpty then
-- The program was suppressed but no diagnostics explain why — report the core diagnostics
-- that have a known source location (those without one are not actionable for the user).
let locatedDiags := translateState.coreDiagnostics.filter (·.fileRange != FileRange.unknown)
allDiagnostics := allDiagnostics ++ locatedDiags
let coreProgramOption :=
if translateState.coreProgramHasSuperfluousErrors then none else coreProgramOption
if !translateState.coreDiagnostics.isEmpty then none else coreProgramOption
return (coreProgramOption, allDiagnostics, program, stats)
if translateState.coreDiagnostics.length > 0 && allDiagnostics.isEmpty then
-- Prefer located diagnostics; fall back to a single synthetic summary when
-- every coreDiagnostic is unlocated, so the user always sees *some*
-- explanation when translation is suppressed.
let locatedDiags := translateState.coreDiagnostics.filter (·.fileRange != FileRange.unknown)
if !locatedDiags.isEmpty then
allDiagnostics := allDiagnostics ++ locatedDiags
else
let msgs := String.intercalate "; " (translateState.coreDiagnostics.map (·.message))
let summary : DiagnosticModel := {
fileRange := FileRange.unknown,
message := s!"Translation suppressed: {translateState.coreDiagnostics.length} internal errors without source locations — {msgs}",
type := DiagnosticType.StrataBug }
allDiagnostics := allDiagnostics ++ [summary]
let coreProgramOption :=
if !translateState.coreDiagnostics.isEmpty then none else coreProgramOption
return (coreProgramOption, allDiagnostics, program, stats)

This avoids the T1_MutableFields regression: T1 hits the "located diagnostics exist" branch, so the fallback never fires. It only fires in the genuinely-silent case my original review called out.

As a follow-up issue — investigate why T1 ever produces unlocated StrataBug diagnostics. If T1 is a known-good fixture, any diagnostic produced during its compile is a labeling bug (either a missed source range on an invalidCoreType call, or a pass that shouldn't be firing at all). The grep invalidCoreType Strata/Languages/Laurel/LaurelToCoreTranslator.lean sites are the place to start — each should either have a definite ty.source or a comment explaining why it's unsourceable. This is out of scope for this PR but worth filing.

Not blocking the merge — the current behaviour matches pre-PR silent-suppression semantics — but the option (b) fallback is a one-shape-fits-both improvement that's cheap and localised to this block.

@keyboardDrummer
Copy link
Copy Markdown
Contributor

This PR depends on #1113 being fixed before it can remove the filter on empty source locations

atomb and others added 3 commits May 14, 2026 21:44
Add a pull request description template that warns about the upcoming
repository split.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
The `StrataTestMain` test driver spawns `lean` on files under
`StrataTestExtra/`, which is not declared as a `lean_lib`. Some of those
files (e.g. `DDM/Integration/Java/TestGen.lean`) import modules from the
`Strata` library that are not in `StrataTest`'s transitive closure, most
notably `Strata.DDM.Integration.Java`.

Before this change, `lake test` on a clean `.lake` would only build the
closure reachable from `StrataTest` and then fail at subprocess time
with:

  error: object file '.../Strata/DDM/Integration/Java.olean' of module
  Strata.DDM.Integration.Java does not exist

Running `lake build` first masked the bug because the `Strata` library
is in `defaultTargets`.

Add `Strata` to the `needs` list of `StrataTestMain` so Lake builds the
full `Strata` library before the test driver runs, regardless of whether
`lake build` was run beforehand.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
…ove unlabeled exit from Core (#1141)

*Issue #, if available:*

Closes #372

*Description of changes:*

Formalizing scoped variable initialization:

- Adds scoped environment semantics to blocks: step_block_done now
projects the store through the parent store via projectStore, discarding
block-local variables on exit. This applies to both StepStmt
(deterministic) and StepKleene
  (nondeterministic).
- Adds PostWF helper definition in Specification.lean for postcondition
stability under projectStore.
- Adds .block constructor to KleeneStmt and corresponding
step_block/step_block_body/step_block_done to StepKleene, mirroring the
deterministic block semantics.
- Changes StmtToKleeneStmt (.block _ bss _) to produce .block b
(wrapping in a Kleene block), so the Kleene translation preserves
scoping.

Removing unlabeled `exit` command (`exit;`, not `exit lbl;`):

- The unlabeled `exit` command doesn't have clear meaning when it is
inside `while`. In `while cond { ... exit; ... }`, is `exit` equivalent
to `continue` or `break` in C/Python? No translation to Core was
introducing the unlabeled `exit`, and the small-step semantics wasn't
clear about the meaning of `exit` inside a loop.
- The type checker of Core (Strata/Languages/Core/StatementType.lean)
fails when an unlabeled `exit` appears inside the body of a while loop /
if statement (unless it is wrapped by another nested block), but the DDM
syntax of core allows it, so from user's perspective this gap is kind of
surprising.
- This patch removes the unlabeled `exit` case because it can always be
simulated by a labeled block + labeled exit.

On top of this, this also renames `touchedVars` to
`modifiedOrDefinedVars` for clarity, and instead makes `touchedVars` all
vars that are read + modified + defined.

### How to review?

- Strata/DL/Imperative/StmtSemantics.lean has the most important update:
the `.block` constructor now has the input store which will be used to
'project' the output store to the variables that have been defined
before. Also, any block-exiting small steps like `step_block_done` will
do the projection, to define variables which were defined inside the
inner scope.
- Correspondingly, the Kleene language was updated to add a notion of
block and scoping to its syntax and semantics, otherwise DetToKleene
doesn't prove. (Strata/DL/Imperative/KleeneStmt.lean,
Strata/DL/Imperative/KleeneStmtSemantics.lean)
- StrataTest/DL/Imperative/StepStmtTest.lean has a few new tests that
show scoping works well.
- DetToKleeneCorrect.lean and ProcBodyVerifyCorrect.lean shows that the
top-level statements are not touched after this update.
- Strata/DL/Imperative/HasVars.lean has the `touchedVars` update and
Strata/Languages/Core/StatementSemantics.lean has some additional
well-formedness about evaluator extension that are useful.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
@keyboardDrummer keyboardDrummer changed the base branch from main to main2 May 18, 2026 08:01
thanhnguyen-aws and others added 5 commits May 18, 2026 15:38
…seArrayTheory option. (#1145)

*Description of changes:*

**Bug:** When `useArrayTheory` is enabled, the SMT encoder correctly
converts `Map` types to `Array` in variable declarations and function
signatures, but it did *not* perform this conversion for fields inside
ADT (algebraic data type) constructor declarations. This caused a type
mismatch: a datatype field would be declared with type `Map` while the
rest of the encoding used `Array`.

**Fix:** Thread the `useArrayTheory` flag through the datatype emission
pipeline:
- `lMonoTyToTermType` now accepts `useArrayTheory` and converts `Map` to
`Array` when enabled.
- `datatypeConstructorsToSMT` passes the flag to `lMonoTyToTermType`.
- `SMT.Context.emitDatatypes` accepts and forwards the flag.
- `encodeCore` in `Verifier.lean` passes `options.useArrayTheory` to
`emitDatatypes`.

**Test:** Added a unit test in `SMTEncoderDatatypeTest.lean` that
verifies a datatype with a `Map`-typed field emits `(Array Int Int)`
when `useArrayTheory=true` and `(Map Int Int)` when
`useArrayTheory=false`.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
…structs (#1165)

Fixes #1158
Supersedes #1036

## Summary

Fixes several issues with `Core.formatProgram` that prevented round-trip
parsing:

1. **`inline function` formatted without space** — The `inline` grammar
op now includes a trailing space, preventing `inlinefunction`
concatenation. The formatter also now emits the `inline` attribute when
present on functions.

2. **Quantifier variable names** — The formatter now uses the
`prettyName` field from `LExpr.quant` instead of generating `__qN`
names, preserving the original variable names through formatting.

3. **Overflow predicates** — Added grammar entries, formatter handling,
parser entries, and Factory operations for all bitvector overflow
predicates (`SNegOverflow`, `UNegOverflow`, `SAddOverflow`,
`SSubOverflow`, `SMulOverflow`, `SDivOverflow`, `UAddOverflow`,
`USubOverflow`, `UMulOverflow`). `SNegOverflow` and `UNegOverflow` use
distinct grammar entries (`Bv.SNegOverflow` / `Bv.UNegOverflow`) to
preserve their different semantics through roundtrip. The translate
direction correctly dispatches overflow ops by type parameter
(bv1/bv8/bv16/bv32/bv64), logging an error for unsupported widths.

4. **Array assignment (`lhsArray`)** — Implemented `m[k] := v`
translation in both directions: the parser decomposes nested LHS into
base identifier + indices and builds `map_update` expressions; the
formatter detects the `map_update(var, idx, val)` pattern and produces
`lhsArray` syntax.

5. **`Sequence.empty`** — Added grammar entry with explicit type
annotation syntax (`Sequence.empty<int>()`), plus translate and format
handling, resolving the 0-ary polymorphic function limitation for this
operation. Includes solver verification tests (cherry-picked from #1036)
covering basic usage and various element types.

6. **Datatype roundtrip** — Verified that datatype formatting roundtrips
correctly (the extra-parentheses issue noted earlier is resolved).

7. **Roundtrip test infrastructure** — Added `RoundtripTest.lean` that
verifies parse → format → re-parse → re-format produces stable output
for types, functions, procedures, inline functions, parameterized type
arguments, datatypes, array assignments, and `Sequence.empty`.

8. **`getLFuncCall` reuse** — Refactored `lappToExpr` and
`decomposeMapUpdate` to use `getLFuncCall` for decomposing nested
applications, reducing code duplication with the existing utility.

## Testing

- All existing tests pass (with expected output updates for the
formatting improvements)
- New roundtrip tests verify types, functions, procedures, inline
functions, parameterized type arguments, datatypes, lhsArray, and
Sequence.empty
- Solver verification tests for Sequence.empty (from #1036) pass

## Remaining limitations

- Bitvector operations with widths beyond 1/8/16/32/64 log an error and
fall back to bv64 (requires grammar-level type entries per width)

---------

Co-authored-by: Aaron Tomb <aarotomb@amazon.com>
…coding (#935)

Fixes #917

## Summary

Introduces an abstract solver interface (`AbstractSolver τ σ m`) that
decouples term construction and session operations from the SMT-LIB
string encoding pipeline. The interface is parameterized by a term type
`τ`, sort type `σ`, and monad `m` (with `[Monad m] [MonadExceptOf
IO.Error m]` constraints), so a non-SMT-LIB backend (e.g., cvc5 FFI) can
be plugged in by implementing `AbstractSolver` without touching the
encoding logic. All methods throw `IO.Error` directly (no `Except
String` layer).

A new incremental backend communicates with a live solver process via
stdin/stdout. It is opt-in via `--incremental` (batch remains the
default).

## What the interface provides

All operations are monadic (`m`), throwing on error:

- **Configuration**: `setLogic`, `setOption`, `comment`
- **Sort construction**: `boolSort`, `intSort`, `realSort`,
`stringSort`, `regexSort`, `bitvecSort`, `arraySort`, `constrSort`,
`termTypeToSort`
- **Term construction**: `mkPrim`, `mkBool`, `mkAnd`, `mkOr`, `mkNot`,
`mkImplies`, `mkEq`, `mkIte`, `mkAdd`, `mkSub`, `mkMul`, `mkDiv`,
`mkMod`, `mkNeg`, `mkAbs`, `mkLt`, `mkLe`, `mkGt`, `mkGe`, `mkSelect`,
`mkStore`, `mkApp`, `mkAppOp`, `mkForall`, `mkExists`
- **Declarations**: `declareNew`, `declareFun`, `defineFun`,
`declareSort`, `declareDatatype`, `declareDatatypes`
- **Session**: `assert`, `checkSat`, `checkSatAssuming`,
`getUnsatAssumptions`, `getModel`, `getValue`, `reset`, `close`
- **Model inspection**: `termToSMTLibString`

Terms (`τ`) are opaque handles whose meaning is backend-specific.

## Other changes

- **Generic encoder** — `AbstractEncoder.encodeTerm` parameterized over
`τ`/`σ`, caches handles in `AbstractEncoderState τ`.
- **Generic incremental discharge** —
`Imperative.SMT.dischargeObligationIncremental` in `SMTUtils.lean` is
parameterized over `PureExpr`, so any Imperative-level language can use
the incremental solver.
- **Pluggable solver backends** — `CoreSMTSolver` and `MkDischargeFn`
threaded through `Core.verifyProgram`, `Core.verify`, and
`Strata.verify`.
- **Strict solver compatibility** — `encodeDeclarationsAbstract` skips
`declareSort` for datatype names and validates datatype fields.
- **`--incremental`** flag to opt in to the incremental backend.

## Testing

All tests pass with both batch (default) and incremental modes.

## Follow-ups

- [ ] Implement with Lean FFI for cvc5
The Core grammar declared command_datatypes as

op command_datatypes (datatypes : NewlineSepBy DatatypeDecl) : Command
=>
    datatypes ";\\n";

Without @[nonempty] on the field, NewlineSepBy compiles to a
zero-or-more parser, so the trailing ";\\n" production alone matched any
stray ";" in the input. This is particularly easy to trip accidentally:
command_fndef's surface syntax has no trailing semicolon, so a user
writing "};" at the end of a function body (by analogy with procedures,
whose grammar does end in ";") silently produced a phantom
command_datatypes op with zero datatypes.

The phantom later tripped an explicit assertion in translateDatatypes:

  "Datatype block must contain at least one datatype"

which calls panic!, producing a large backtrace at gen_smt_vcs time with
no useful source location.

Mark the datatypes field @[nonempty]. The stray ";" now surfaces as a
parse error at the point it actually appears, and the original repro
elaborates cleanly.

Add a regression test
(StrataTest/Languages/Core/Tests/Issue1146Test.lean) that pins the
canonical form: a program mixing a datatype and a function can run
through gen_smt_vcs without panicking.

Fixes: #1146

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: Juneyoung Lee <136006969+aqjune-aws@users.noreply.github.com>
#1179)

Fixes #1163

Moves all library code from `StrataMain.lean` into a new `StrataMainLib`
module, leaving `StrataMain.lean` as a thin executable entry point
(`import StrataMainLib`). This makes the library importable from tests
and other tools without pulling in the executable entry point.
tautschnig and others added 17 commits May 19, 2026 08:46
…sition (#1116)

When a StaticCall to a multi-output procedure appears in expression
position (not as the RHS of an Assign), emit an error diagnostic: the
extra outputs would be silently discarded, which is a semantic bug
(e.g., error channels lost).

The diagnostic is suppressed when the call is the direct RHS of an
assignment (where the target count check already validates arity).

Includes a test in ResolutionKindTests demonstrating the diagnostic.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
…1110)

Both helpers' error message already said 'expected at least two
arguments', but the destructuring `let a :: as := as | throw ...` only
rejected the empty list: a singleton like `.app .add [x]` or `.app
.bvadd [x]` silently returned just `x`. This diverged from
`Denote.leftAssoc`, which uses `let t₁ :: t₂ :: ts := ts | none` and
returns `none` for fewer than two operands.

Tighten the pattern to `a :: b :: as := as | throw ...` and seed the
`foldl` with `mkApp2 op a b` so the helper's semantics now match the
existing error message and `Denote.leftAssoc`. Empty-operand lists
continue to throw with the same message as before.

This affects every variadic caller: `.app .and`, `.app .or`, `.app
.add`, `.app .sub`, `.app .mul`, `.app .div`, `.app .mod`, plus `.app
.bvadd`, `.app .bvsub`, `.app .bvmul`, `.app .bvand`, `.app .bvor`,
`.app .bvxor`. The Strata Core SMT encoder does not produce singleton
applications of any of these ops (ripgrep'd across `SMTEncoder.lean`,
`Factory.lean`, and `B3/Verifier/Expression.lean`: every construction
passes at least two operands), so well-formed queries are unaffected.
Zero-operand placeholders like `Term.app Op.and [] .bool` that appear as
"unreachable" fallbacks in `SMTEncoder.lean` were already rejected by
`translateTerm` before this change, and remain rejected.

Regression tests in StrataTest/DL/SMT/TranslateTests.lean cover:
- singleton `.app .add`, `.app .and`, `.app .bvadd`, `.app .bvand` (one
per helper+typeclass combination),
- empty-operand `.app .add` (unchanged behaviour),
- ternary `.app .add` producing `1 + 2 + 3 = 6` (preserves left
associativity past the new 2-operand seed).

Each singleton-rejection test fails on the previous commit (the term
silently translates to its sole operand) and passes here.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
The ANF encoder extracts duplicated subexpressions into fresh variables,
but the existing single-pass implementation can leave large duplicated
sub-subexpressions behind.

Root cause: `removeSubsumed` drops candidate duplicates that are
subexpressions of other (larger) candidate duplicates, to avoid creating
redundant variable declarations. But this means that if only the outer
expression appears at the top level, the inner dupes are hidden inside
the lifted var declaration and never extracted.

Example (from PyAnalyzeLaurel benchmark check_storage_costs):

  Original: assert Any_to_bool(Any_get(response, "Datapoints")) ...

  After partial evaluation, Any_to_bool inlines its 7-branch body,
  each branch referencing the argument. With Any_get also inlined
  as an `ite (is-DictStrAny response) (DictStrAny_get ...)
  (List_get ...)`, the Any_get expression ends up duplicated 62
  times inside a single assert.

  Old ANF output:
    var $__anf.0 : bool := <9KB body with 62 duplicates of Any_get>;
    assert $__anf.0

  New ANF output (after iteration):
    var $__anf.3 : Any := Any_get(response, "Datapoints");
    var $__anf.0 : bool := <body using $__anf.3>;
    assert $__anf.0

Effect: VC file size for VCs on one benchmark drops from 32KB to 11KB
(~65% reduction), and another benchmark now completes verification at
36s where it previously hit a 60s timeout.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
…te actions (#1137)

The same ~15-line blocks for installing cvc5, installing z3, and
restoring the lake cache were duplicated across ci.yml's three jobs
(build_and_test_lean, check_pending_python, build_python), cbmc.yml, and
python-fuzz.yml. Robin asked for this to be consolidated as part of the
#984 review (flagged non-blocking there, handled here).

Introduce three composite actions under .github/actions/:

- install-cvc5: downloads the static cvc5 release zip, makes the binary
available either on $GITHUB_PATH ('path', default) or in /usr/local/bin
('system'). Version defaults to 1.2.1.

- install-z3: same shape for z3, version defaults to 4.15.2. This fixes
a latent bug: cbmc.yml and build_and_test_lean used different aarch64
archive names ('arm64-glibc-2.34' vs. 'arm64-win'); the consolidated
action uses the correct glibc-2.34 one for both.

- restore-lake-cache: wraps actions/cache/restore@v5 with the
established key pattern lake-<os>-<arch>-<toolchain>-<manifest>-<all
.st>-<sha> plus the three fallback keys dropping each trailing
component. Exposes a 'fail-on-cache-miss' input for jobs that depend on
a cache saved for the same SHA by build_and_test_lean.

Rewire ci.yml and cbmc.yml to use the new actions:

- ci.yml: build_and_test_lean: install-cvc5, install-z3,
restore-lake-cache. check_pending_python: install-cvc5 (system),
restore-lake-cache (fail-on-cache-miss: true).
    build_python:        install-cvc5 (system), restore-lake-cache
      (fail-on-cache-miss: true).
      (The pip-based z3 install here is deliberately left as-is —
      it is a different install flow.)
- cbmc.yml:
install-cvc5, install-z3, restore-lake-cache (fail-on-cache-miss: true).

Net change: -108 lines of duplicated shell, +17 lines of action usage
across the two workflow files, with the reusable definitions ready to be
adopted by python-fuzz.yml and any future workflows.

actionlint reports no new warnings (the pre-existing save-always warning
on cbmc.yml line 22 is unrelated).

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Blocker for Laurel Seq<T>/Array<T> in #1073.

## Bounds preconditions for `Sequence.select` / `update` / `take` /
`drop`

Following the `Int.SafeDiv` pattern:

| Op                  | Precondition                        |
|---------------------|-------------------------------------|
| `Sequence.select`   | `0 <= i && i < Sequence.length(s)`  |
| `Sequence.update`   | `0 <= i && i < Sequence.length(s)`  |
| `Sequence.take`     | `0 <= n && n <= Sequence.length(s)` |
| `Sequence.drop`     | `0 <= n && n <= Sequence.length(s)` |

`Sequence.length` / `empty` / `append` / `build` / `contains` remain
total.

`PrecondElim` picks the obligations up automatically at call sites in
imperative code (via `transformStmt`) and in pure positions like
`requires` / `ensures` / quantifier bodies / function bodies (via the
synthetic `$$wf` procedures).

Obligations carry `propertyType = "outOfBoundsAccess"` (new `MetaData`
constant, mirroring `divisionByZero`), flow through a new
`PropertyType.outOfBoundsAccess` enum variant and the three
metadata-to-PropertyType conversion sites, and render as
`"out-of-bounds-access"` in SARIF output.

### Incidental fix

While wiring the SARIF classification, I noticed
`propertyTypeToClassification` in `SarifOutput.lean` was pre-existing
dead code: `vcResultToSarifResult` never set `properties.propertyType`,
so every obligation defaulted to `"assert"` in SARIF output. This PR
wires it up, so `divisionByZero` and `arithmeticOverflow` obligations
now also classify correctly in SARIF alongside the new
`outOfBoundsAccess`.

## Testing

- **`StrataTest/Transform/PrecondElim.lean` Test 10** —
`collectPrecondAsserts` attaches `outOfBoundsAccess` metadata for all
four partial ops plus a nested `Sequence.select(Sequence.update(...))`
call. Mirrors the pattern in `OverflowCheckTest.lean`. Complemented by
per-op pretty-print snapshots (`#guard_msgs`) that assert the exact
obligation body string — these catch regressions that preserve count and
metadata tag but corrupt the obligation body (e.g. passing `.Le` instead
of `.Lt` to `mkSeqBoundsPrecond` at a call site, changing the
bound-variable name, or swapping the lower/upper bound inside
`mkSeqBoundsPrecond`).
- **`StrataTest/Languages/Core/Tests/SarifOutputTests.lean`** —
property-type classification tests covering all five `PropertyType`
variants.

Collateral updates to existing tests reflect the new obligations
(`Seq.lean`) and updated `requires` on Sequence function signatures
(`ProgramEvalTests.lean`). Note: the bounds obligations in `Seq.lean`
appear as `true && 0 < length(...)` — the partial evaluator simplifies
`0 <= 0` to `true` but does not further simplify `true && X` to `X`.
This is a pre-existing evaluator gap, not newly introduced by this PR;
the SMT solver discharges the obligation trivially.

## Known downstream impact

PR #1073 (Laurel Seq<T>/Array<T>) emits `Sequence.select` / `update` /
`take` / `drop` calls from its translation. Its `T18_Sequences` test
uses `#guard_msgs(drop info, error)` on a diagnostics-only pipeline, so
the test assertions should still pass syntactically — but individual
sequence-manipulation programs now require bounds guards to fully
verify. PR #1073 has been adjusted accordingly.
## Summary

The DDM type checker uses unification to infer implicit type arguments,
but this fails in some cases — notably when template-generated accessors
with unresolved tvar return types are composed with polymorphic
functions. This PR adds a per-dialect option to bypass type checking
entirely, filling implicit type parameter slots with anonymous type
placeholders instead.

## How to use

In a dialect definition, add `dialect_option typecheck off;` to disable
type checking for all programs using that dialect:

```
#dialect
dialect MyDialect;
dialect_option typecheck off;

type Foo;
fn bar (A : Type, x : A) : A => x;
...
#end
```

Programs using the dialect will skip `inferType` and `unifyTypes` for
expression arguments. Variable name resolution and global context
population still work normally — only type inference is bypassed.

## Test changes

`StrataTest/DDM/TypecheckSkip.lean` defines an inline dialect with
parameterized types, polymorphic functions, and perField accessor
templates. It demonstrates that the accessor-into-polymorphic-fn pattern
(from issue #650 / PR #734) produces type errors with typecheck on, and
elaborates successfully with typecheck off.

## Details

- **AST.lean**: Add `TypeExprF.skip` (anonymous type placeholder `.tvar
loc ""`), `Dialect.typecheck : Bool := true`, refactor
`Program.globalContext` into explicit `computeGlobalContext` function
- **Elab/Core.lean**: Add `ElabContext.typecheck`, skip
`inferType`/`unifyTypes` in `elabSyntaxArg`, fill unfilled type param
slots in `runSyntaxElaborator` post-pass, skip pre-registration in
`elabOperation`, fix `applyNArgs` to handle tvar types
- **Elab/DeclM.lean**: Add `DeclContext.typecheck`
- **Elab/DialectM.lean**: Add `elabSetOptionCommand` handler dispatching
on `"typecheck"` option
- **BuiltinDialects/StrataDDL.lean**: Add `setOptionCommand` with syntax
`dialect_option <name> <value> ;`
- **Elab.lean**: Thread `Dialect.typecheck` into `DeclContext` in
`elabProgramRest`
- **Ion.lean, Python/Specs/DDM.lean**: Update `Program` construction for
`globalContext` refactor

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…etrics JSONL (#1093)

## Motivation

The pipeline previously had no unified infrastructure for diagnostics or
timing. Warnings were handled ad-hoc (some printed to stderr, some
silently
dropped), phase timing was manual and inconsistent, and the only
structured
output (`--warning-summary`) wrote a single JSON blob at the end —
useless
if the process was killed by a timeout. There was also no distinction
between
fatal and non-fatal errors, so callers couldn't tell whether a result
was
trustworthy.

This PR introduces `PipelineContext`/`PipelineM` as a shared backbone
that
gives every pipeline phase consistent behavior: categorized diagnostics,
nested phase timing, fatal-error abort, and streaming metrics output
that
survives timeouts.

## Summary

- Introduce `PipelineContext` and `PipelineM` for structured diagnostic
  accumulation and phase timing across the entire pipeline
- Replace ad-hoc `profileStep`/`startPhase` with `withPhase` combinator
that
  handles nesting, timing, and profile output via `try`/`finally`
- Add `withRepeatedPhase` for iteration-level timing (preprocess,
smtEncode,
solver) that accumulates count+duration and reports as `[profile]` lines
- Replace `--warning-summary` (single JSON blob) with `--metrics <file>`
  (streaming JSONL, flushed per-record for timeout resilience)
- Add `--verbose` program dumps: prints intermediate Laurel and Core
programs
  with `---- BEGIN/END ----` delimiters for programmatic parsing
- Refactor module resolution: fatal errors on invalid/missing modules,
unified `resolveModules` for dispatch, auto-resolved, and explicit
modules
- Fix `pyResolveOverloads` to print errors from pipeline context on
failure
- Split `Messages.lean` into data types (`Messages.lean`) and runtime
  infrastructure (`Context.lean`)
- Move Python-specific `MessageKind` constants to
  `Languages/Python/Specs/MessageKind.lean`

## `PipelineContext` and `PipelineM`

`PipelineContext` is a structure carrying immutable config
(`outputMode`,
`profilePipeline`) plus mutable state as individual `IO.Ref` fields:

- `messagesRef` — accumulated `PipelineMessage` diagnostics
- `toolErrorsRef` / `userCodeErrorsRef` — bucketed by impact for fast
access
- `currentPhaseRef` — current position in the phase hierarchy (managed
via
  `push`/`pop`)
- `repeatedDepthRef` — nesting depth of `withRepeatedPhase` scopes (when
> 0,
  `withPhase` aggregates silently)
- `phaseStateRef` — per-phase scoped state bundling repeated-phase
aggregation
  and message counts (saved/restored atomically on phase entry/exit)
- `metricsHandle` — optional file handle for streaming JSONL metrics

Because all state is in `IO.Ref`s, any monad with `BaseIO` access can
use
pipeline capabilities by passing a `PipelineContext` value directly —
`withPhase` is polymorphic over the monad (`[MonadLiftT BaseIO m]`).

`PipelineM` is defined as:

```lean
abbrev PipelineM := ReaderT PipelineContext (EIO Unit)
```

The `EIO Unit` base ensures that fatal errors (configuration errors,
internal errors, user code issues) are impossible to silently ignore —
they
throw `()`, aborting the pipeline. Non-fatal diagnostics (warnings,
known
limitations) are added to `messagesRef` and execution continues. Callers
that want to attempt recovery (e.g., to gather more diagnostics) can
catch
the exception, but should rethrow to avoid proceeding with invalid
state.

The top-level runner catches the exception and still returns all
accumulated
messages — so the caller always gets complete diagnostics regardless of
how
the pipeline terminated.

## Phase tracking state machine

The phase system operates in two modes controlled by `repeatedDepthRef`:
- **Mode N** (normal, depth = 0): `withPhase` prints `[start]`/`[end]`
in
  profile mode and emits JSONL timing metrics.
- **Mode R** (repeated, depth > 0): `withPhase` silently aggregates
timing
  into `phaseStateRef.repeatedPhases` — no print, no individual metrics.

`withRepeatedPhase` increments `repeatedDepthRef` on entry and
decrements
on exit. `withPhase` never changes the depth — it inherits the mode and
behaves accordingly.

Key invariants:
- `currentPhaseRef` always reflects the innermost active scope's full
path
- `phaseStateRef` is scoped: saved on entry, restored on exit (no
leakage)
- In mode R, all timing flows through `phaseStateRef.repeatedPhases`
only

## `--metrics <file>` (replaces `--warning-summary`)

Writes streaming JSONL (one JSON object per line, flushed immediately
after
each record). If the process is killed mid-pipeline, all records should
remain.

Record types:
- `timing` — phase start/end in ms, with optional count for aggregated
phases
- `diagnostic` — pipeline messages with phase, file, category, impact,
location
- `outcome` — result/exit_code/total_ms, written last (absence = killed)

Example output:
```jsonl
{"type":"timing","phase":"pythonAndSpecToLaurel","start_ms":0,"end_ms":97}
{"type":"diagnostic","phase":"pythonAndSpecToLaurel.resolveAndBuildPrelude","file":"specs/builtins.ion","category":"unsupportedUnion","impact":"knownLimitation","message":"unsupported overload form for 'dict.update'"}
{"type":"timing","phase":"verification.vcDischarge.solver","start_ms":0,"end_ms":1102,"count":12}
{"type":"outcome","result":"verified","exit_code":0,"total_ms":1830}
```

## `--verbose` program dumps

When `outputMode == .verbose`, the pipeline prints intermediate programs
to
stdout with delimiters for programmatic parsing:

```
---- BEGIN Laurel Program ----
<full Laurel program text>
---- END Laurel Program ----
---- BEGIN Core Program ----
<full Core program text>
---- END Core Program ----
```

## Module resolution refactoring

- `resolveModuleEntry` now takes a parsed `ModuleName` (not a string)
- Single `resolveModules` handles dispatch, auto-resolved, and explicit
modules
- `invalidModuleName` changed from `internalWarning` to
`configurationError` (fatal)
- Merged `missingDispatchModule` + `missingExplicitPySpec` into
`missingPySpecModule`
- Eliminated duplicate message scenario where both invalid-name and
missing-module
  errors fired for the same input

## `pyResolveOverloads` error handling

When `readDispatchOverloads` encounters a fatal error (e.g., nonexistent
file),
the command now reads `pctx.messagesRef` and prints all accumulated
messages
to stderr before exiting with a non-zero code — instead of silently
yielding
empty results.

## Profile output format

`--profile` emits structured `[start]`/`[end]`/`[profile]` lines to
stdout:

```
[start] pythonAndSpecToLaurel (time: 0ms)
  [start] readPythonIon (time: 0ms)
  [end] readPythonIon (time: 2ms)
  [start] resolveAndBuildPrelude (time: 2ms)
  [end] resolveAndBuildPrelude (time: 97ms)
  [warnings] resolveAndBuildPrelude: 2 unsupportedUnion
[end] pythonAndSpecToLaurel (time: 97ms)
[start] verification (time: 105ms)
  [start] vcDischarge (time: 320ms)
    [profile] preprocess (×12, total: 45ms, avg: 3ms)
    [profile] solver (×12, total: 1102ms, avg: 91ms)
  [end] vcDischarge (time: 1185ms)
[end] verification (time: 1185ms)
```

## Design: impact drives behavior

Each `MessageKind` has a `MessageImpact` that determines pipeline
behavior:

| Impact | Fatal? | Meaning |
|--------|--------|---------|
| `configurationError` | yes | Invalid arguments or unreadable on-disk
pyspecs |
| `internalError` | yes | Unexpected failure preventing output |
| `userCodeIssue` | yes | Issue in the user's source code |
| `internalWarning` | no | Unexpected condition that didn't prevent
output |
| `knownLimitation` | no | Documented gap in modeling |

Note: `emitMessageAndAbort` is the function that aborts — callers choose
whether to abort. A caller may use `emitMessage` (continues) and check
`isFatal` at phase boundaries for custom abort strategies.

## Not addressed: `[statistics]` output

The `[statistics]` lines (transform counters like
`Evaluator.simulatedStmts`,
`FilterProcedures.erasedProcedures`) are orthogonal to phase timing.
They
currently bypass `PipelineContext` — printed via bare `IO.println` in
`Core.verify` and `StrataMainLib.lean` — so they don't respect phase
nesting
or indentation. Integrating statistics into the phase system is future
work.

## Key files changed

- `Strata/Pipeline/Messages.lean` — pure data types: `Phase`,
`MessageImpact`, `MessageKind`, `PipelineMessage`
- `Strata/Pipeline/Context.lean` — runtime: `PipelineContext`,
`PipelineM`, `withPhase`, `withRepeatedPhase`, `emitMetric`,
`OutputMode`
- `Strata/Languages/Python/Specs/MessageKind.lean` — Python-specific
`MessageKind` constants
- `Strata/Pipeline/PyAnalyzeLaurel.lean` — top-level pipeline
orchestration, `--verbose` program dumps, `metricsHandle` threading
- `Strata/Languages/Core/Verifier.lean` — replaced manual timing with
`withRepeatedPhase`
- `Strata/Languages/Laurel/LaurelCompilationPipeline.lean` — replaced
`profileStep` with `withPhase`
- `Strata/Languages/Python/PySpecPipeline.lean` — fatal abort via throw,
module resolution refactor, diagnostic metric emission
- `StrataMainLib.lean` — `--metrics` flag, outcome record emission,
error printing in `pyResolveOverloads`
- `StrataTest/Pipeline/PhaseTimingTest.lean` — phase timing integration
tests

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
Co-authored-by: June Lee <lebjuney@amazon.com>
**Warning:** This repository will shortly undergo a split into several
separate repositories. If you're creating a PR that crosses the
boundaries between these repositories, you may want to hold off until
the split is complete or be prepared to rework your PR into multiple PRs
once the split is complete.

The code that will be moved includes:
- Strata/DDM/*
- Strata/Languages/Boole/*
- Strata/Languages/Python/* along with Tools/Python/*
- Tools/BoogieToStrata
## Summary

`PythonIdent.pythonModule` was a raw `String`, allowing empty module
names to propagate silently through the system and cause downstream
failures (see #1143). This PR changes `pythonModule` to use `ModuleName`
— which now guarantees at least one non-empty component — and adopts it
more widely through the PySpec pipeline, eliminating this class of bugs
at the type level.

## Approach

Rather than adding runtime checks at each use site, `ModuleName` gains a
strengthened invariant via a private constructor enforcing non-empty
components. `PythonIdent.ofComponent` uses `decide` to verify
non-emptiness at compile time for literal module names (used in
`Decls.lean`). Runtime parsing goes through `ModuleName.ofString?` which
returns `Option`.

## Key changes

- **PythonIdent.lean**: `PythonIdent.pythonModule` changed from `String`
to `ModuleName`. New `ofComponent` constructor with compile-time proof
for single-component literals. `ModuleName` API extended: `back`,
`parent (n)`, `push`, `append`/`++`, `toString (sep)`
- **PythonIdent.lean**: New `ModuleName.ofRelativePath` provides a
single canonical function for deriving a `ModuleName` from a `.py` file
path relative to a search root, handling both regular files and
`__init__.py` packages. This eliminates redundant ad-hoc implementations
scattered across `discoverModules`, `RelativeImportTest`, and
`findInPath` — each of which had slightly different `__init__.py`
handling, creating inconsistency and latent bugs (e.g.,
`pySpecToLaurelCommand` silently failed for package modules)
- **Specs.lean**: `findInPath` simplified to return `(FilePath × Bool)`
instead of `(FilePath × Array ModuleComponent)` — the `Bool` is
`isInit`, and callers use `ModuleOfPath.modulePrefix` to derive the
package prefix. Deleted unused `ofFile` function
- **Specs/DDM.lean**: Converted `fromDDM` functions to monadic `FromDDM`
(`Except (SourceRange × String)`) for proper error propagation instead
of panics
- **Specs/ToLaurel.lean**: `signaturesToLaurel` takes `ModuleName`
instead of `String` prefix. Overload extraction factored into dedicated
`ExtractM` monad (no dependency on `ToLaurelContext`)
- **PySpecPipeline.lean**: `buildPySpecLaurel` takes `Array (ModuleName
× String)` instead of `Array (String × String)`
- **Specs.lean**: Import resolution uses `ModuleName` throughout;
`translateFile` requires a `ModuleName` (no longer optional)
- **Specs/Decls.lean**: Builtin idents use `ofComponent` with
compile-time proof

### Unrelated cleanup

- Deleted `Strata.Python.containsSubstr` helper and replaced all call
sites with `String.contains` (which accepts substring patterns in Lean
4.29)

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…1149)

*Issue #, if available:* #1148

*Description of changes:*

1. Namespace collision (issue 1 in #1148): Constants that share a name
with a procedure are now prefixed with __const_ in the Strata output.
The rename is applied consistently in declarations and references.

2. Recursive synonym resolution (Fixes issue 2 in #1148):
DealiasTypeExpr now recurses on its result, resolving synonym chains
like ref := i64 := int fully to the base type. Previously it resolved
only one level, causing panics on comparison/arithmetic operators
applied to multi-level type synonyms.

3. SMACK assert encoding (issue 3 in #1148): Procedures named assert_.*
now get a requires (arg != 0) precondition emitted. Call elimination
generates VCs at each call site, making SMACK assertions verifiable.

4. InferModifies: SMACK-generated Boogie often omits explicit `modifies`
clauses on procedures that mutate globals. Setting `InferModifies =
true` in BoogieToStrata has two effects: (1) it runs
`ModSetCollector.CollectModifies(program)` to populate empty modifies
clauses, and (2) through `CheckModifies` in Boogie's resolution context,
it suppresses typechecking of modifies clauses. Without this,
`ResolveAndTypecheck` would reject SMACK programs that omit modifies
clauses.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: David Deng <htd@amazon.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes #1045

## Summary

Adds a `--parallel N` flag that runs up to N solver instances
concurrently when verifying proof obligations. Without the flag (or with
`--parallel 1`), behavior is unchanged (sequential).

## Problem

Verification of programs with many obligations is bottlenecked by
sequential solver invocations. Each obligation spawns a separate solver
process, waits for the result, then moves to the next.

## Solution

When `--parallel N` is specified (N > 1), the verification pipeline
splits into two phases:

1. **Sequential preprocessing** (fast): determine checks, preprocess
obligations, encode to SMT terms. Obligations resolved by the evaluator
are handled immediately.
2. **Parallel solver dispatch** (slow): obligations that need the solver
are placed in a shared queue. N worker tasks (on dedicated threads)
continuously pull from the queue — when a solver finishes, it
immediately picks up the next unsolved obligation. Results are collected
in original obligation order so output is deterministic.

The worker pool design avoids the "wait for slowest in batch"
bottleneck: if one obligation takes 10s and others take 1s, the
fast-finishing workers immediately start on the next obligation instead
of idling.

`stopOnFirstError` is supported via a shared flag: on failure, workers
stop claiming new jobs. Already-running jobs complete naturally; skipped
jobs leave their placeholder results in place.

Both the incremental and batch solver paths are safe for parallel use:
the incremental backend spawns independent solver processes, and the
batch path uses atomic `modifyGet` for filename counter generation.

**Pluggable discharge function**: The full public API (`Strata.verify`,
`Core.verify`, `verifySingleEnv`, `mkDefaultCoreSMTSolver`) accepts a
`mkDischarge : MkDischargeFn` parameter (defaulting to `mkDischargeFn`).
External solvers (e.g. using the AbstractSolver API) can provide their
own discharge function factory.

## Performance

Benchmark: 30 obligations across 2 programs, z3 4.12.2, avg over 3 runs:

| Mode | Time | Speedup |
|------|------|---------|
| `--parallel 1` (sequential) | 921ms | baseline |
| `--parallel 2` | 491ms | **1.88x** |
| `--parallel 3` | 405ms | **2.27x** |
| `--parallel 4` | 356ms | **2.59x** |

Sweet spot is `--parallel 3-4`. The 1→2 jump gives the biggest single
improvement (1.88x), then diminishing returns as solver spawn overhead
and sequential preprocessing become the bottleneck.

## Testing

All tests that pass without `--parallel` also pass with it. The
sequential path (`--parallel 1`, the default) is unchanged.
When translateExpr produces an expression containing a nested
multi-output procedure call (e.g., PSub(base, timedelta_func(...))), the
new extractMultiOutputCalls pass in translateAssign rewrites it into a
preceding multi-target assignment and a variable reference:

  var $mo_N := ...; assign $mo_N, maybe_except := timedelta_func(...);
  delta := PSub(base, $mo_N)

This prevents multi-output calls from appearing in expression position,
which would silently discard the error channel.

Key design choices:
- Counter seeded from source byte offset for globally unique temp
variable names
- Branch-local multi-output calls are wrapped in Block nodes inside
their respective IfThenElse branches (not hoisted unconditionally)
- Generated wrapper statements propagate source metadata from the
original call
- Block-flattening in the lifter uses cons-based prepend for correct
evaluation order
- containsBareAssignment skips Blocks so generated wrappers pass through
assert/assume handlers
- extractMultiOutputCalls uses structural recursion (attach) rather than
partial

Tested: existing tests pass, builds cleanly.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: Mikaël Mayer <MikaelMayer@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mikael Mayer <mimayere@amazon.com>
### Changes
- Fix scoping bug in HeapParam pass where `Declare` targets inside
blocks created by `heapTransformExpr` were invisible to subsequent
statements
- Fix small transparency bug in ConstrainedTypeElim
- Factor out `heapVarName` / `heapInVarName` constants to avoid
duplicated magic strings across passes
- Add consistency check to LaurelCompilationPipeline to detect bugs in
passes
- Move intermediate program output to
`.lake/build/intermediatePrograms/`

### Testing
- Existing tests pass, new consistency check added

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Michael Tautschnig <mt@debian.org>
Co-authored-by: keyboardDrummer-bot <keyboardDrummer-bot@users.noreply.github.com>
Co-authored-by: keyboardDrummer-bot <keyboarddrummer.bot@gmail.com>
Co-authored-by: Mikael Mayer <mimayere@amazon.com>
…1134)

Holes in arguments to datatype destructors (e.g.
`Any..as_ListAny!(<?>)`) and testers (e.g. `Any..isfrom_str(<?>)`) were
left untyped after `InferHoleTypes`, producing "could not infer type"
diagnostics and downstream `Unknown`-typed `$hole_N` functions.

The fix surfaces the parent datatype through the semantic model so
`InferHoleTypes` can recover parameter types via `uniqueId`-based model
lookups (per the Laurel steering doc, post-resolution passes should rely
on model lookups, not textual names).

Key changes:

- New `ResolvedNode.datatypeDestructor` variant carrying the parent type
name and field parameter, with corresponding updates to
`ResolvedNodeKind`, `.kind`, `.getType`, and `SemanticModel.isFunction`.
- Datatype scope registration chains `defineNameCheckDup` calls so
safe/unsafe destructors and tester/constructor names share a single
`uniqueId`.
- `buildRefToDef` registers each constructor parameter as
`.datatypeDestructor`.
- `calleeParamTypes` matches `.datatypeConstructor` and
`.datatypeDestructor` directly, synthesizing the correct `.UserDefined`
input type.
- `ResolvedNode.getType` now uses explicit pattern matches instead of a
catch-all, so new constructors trigger compile errors rather than
silently falling through.
- `SemanticModel.refToDef` is now `private`; external code uses
`SemanticModel.get?` (returning `Option ResolvedNode`) or
`SemanticModel.get` to look up nodes by `Identifier`, preventing
arbitrary `Nat` key access.

Tests: three new `#guard_msgs` cases for holes inside
destructors/testers, plus updated expected diagnostics in
`ResolutionKindTests`. Existing tests pass.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: Mikael Mayer <mimayere@amazon.com>
Migrate 241 of 282 non-Boole StrataTest files to be Lean modules,
completing the module migration started in #982.

- **Added `module` keyword** to 241 test files. The Boole test files are
excluded because `gen_smt_vcs` produces proof terms that fail kernel
type-checking when the file is a module.

- **Added `meta import`** for Strata/StrataTest dependencies where
`#eval`, `#guard_msgs`, or `#guard` need native implementations from the
imported module.

- **Added `meta import all`** where proofs use `simp`, `rfl`, or
`native_decide` on definitions whose bodies must be visible across the
module boundary.

- **Added `import all`** where non-meta definitions (types, instances)
need access to definition bodies from other StrataTest modules.

- **Added `meta section` wrappers** around test code that exercises
native implementations, since `#eval`/`#guard_msgs`/`#guard` require
meta context to call into compiled code from another module.

- **Added `import Strata.DDM.Integration.Lean.HashCommands`** (non-meta)
in files using `#strata`/`#dialect` syntax, since the syntax extension
must be available at parse time regardless of meta context.

- **Updated byte offsets** in `#guard_msgs` docstrings where expected
output includes position-dependent labels (e.g., invariant names derived
from source positions shifted by added import lines).

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The `mkDischarge` parameter was accepted by `Core.verifyProgram` but not
exposed through `PyAnalyzeConfig` or the CLI `verify` command, so
callers could not supply a custom solver discharge function.

This PR:
- Adds `mkDischarge` to `PyAnalyzeConfig` (defaulting to
`Core.mkDischargeFn`) and forwards it to the verification call.
- Adds `mkDischarge` as a parameter to `verifyCommand` and passes it
through to the `verify` call.
- Adds `mkDischarge` as a parameter to `pyAnalyzeLaurelCommand` (same
pattern), so downstream executables can override the discharge function
without redefining the command.

Tested: existing tests pass, build succeeds.

Fixes #1199
@keyboardDrummer keyboardDrummer changed the base branch from main2 to main May 21, 2026 12:36
@github-actions github-actions Bot added the Core label May 21, 2026
@keyboardDrummer keyboardDrummer changed the base branch from main to main2 May 21, 2026 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants