Add Lean 4 backend [AI-assisted]#40

Open
septract wants to merge 52 commits into rems-project:master from septract:mdd/lean-backend

@septract septract commented Mar 7, 2026

Summary

Add a new backend targeting Lean 4, allowing Lem definitions to be exported as Lean 4 code. This brings the set of supported targets to: OCaml, Coq, HOL4, Isabelle/HOL, Lean 4, LaTeX, and HTML.

The backend is structurally modelled on the Coq backend, adapted for Lean 4 syntax and semantics. Tested against Lean 4.28.0 via the Lake build system.

What's included

  • src/lean_backend.ml (~2400 lines) — main backend translating Lem AST to Lean 4 syntax
  • lean-lib/LemLib Lean 4 runtime library (~720 lines): sets, maps, comparisons, numeric utilities, BitVec operations for machine words
  • library/lean_constants — Lean 4 reserved words and typeclass names
  • 368 Lean target reps across 28 library files (vs 309 for Coq) — every Coq target rep has a Lean equivalent, plus extras for Lean-specific features
  • Supporting changes to 21 shared source files (all guarded behind Target_lean checks — see "Shared code safety" below)
  • Documentation in doc/manual/backend_lean.md and updates to README.md
  • make lean-libs target generating Lean library files from Lem's standard library
  • scripts/lean_coverage.sh — bisect_ppx coverage script used to identify and close untested codepaths (~87% line coverage on lean_backend.ml)

Test suite

  • 12 backend tests: Types, Pats, Pats3, Classes2, Classes3, Exps, Coq_test, Coq_exps_test, Record_test, Op, Let_rec, Indreln2 — all generate valid Lean and compile via Lake (57 jobs)
  • 44 comprehensive tests with 424 runtime assertions covering: arithmetic, strings, lists, sets, maps, pattern matching, type definitions, type classes, mutual recursion, vectors, machine words (BitVec), cross-module imports, and more
  • 2 real-world examples:
    • examples/cpp/ — C++ concurrency model (Cmm.lean, ~1930 generated lines, 34 Lake jobs, 0 errors)
    • examples/ppcmem-model/ — PowerPC memory model (10 .lem files, 10/10 compile, 43 Lake jobs, 0 errors)

Notable design decisions

  • Whitespace-sensitive output: Lem's block formatting disabled for Lean; explicit spaces used instead
  • UTF-8 output: Meta_utf8 variant in output.ml for correct encoding of non-ASCII characters such as × and →
  • Constructor scoping: export TypeName (Ctor1 Ctor2 ...) after each inductive — Lean's open is file-local, so export is needed for importers to see constructors
  • Termination: recursive functions default to partial def unless a declare {lean} termination_argument = automatic annotation is present; 10 library functions were annotated total, and 3 more redirect to total LemLib wrappers. Only 2 genuinely partial defs remain (unfoldr, leastFixedPointUnbounded)
  • deriving BEq, Ord: auto-derived for simple types (non-mutual, no function-typed constructor args); sorry-based instances as fallback for mutual types (Lean's deriving doesn't support mutual inductives)
  • Machine words: mword maps to Lean's BitVec with 36 operations implemented in LemLib; int32/int64 use distinct newtype wrappers (LemInt32/LemInt64)
  • Type/value namespace unification: Lean shares a single namespace (unlike Lem/Coq/OCaml). rename_top_level.ml seeds constant renaming with type names to handle collisions. Cross-module names included via full env.t_env scan
  • Typeclass name avoidance: lean_constants includes Add, Sub, Mul, etc. to prevent clashes with Lean stdlib
  • Propositional equality in indreln: == (BEq) converted to = (Prop) in inductive relation antecedents, handling both Lem AST decomposition paths
  • n+k pattern rejection: is_lean_pattern_match in patterns.ml triggers guard-based desugaring instead of unsupported n+k patterns
  • Tab sanitization: Lean 4 forbids tabs; all whitespace tokens sanitized automatically
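
As a rough illustration of the constructor-scoping and deriving decisions above, the following hand-written sketch shows the shape such output could take. The type `Colour` and the function `isRed` are hypothetical and are not taken from the generated files.

```lean
-- Hypothetical type standing in for a Lem-generated inductive.
-- The explicit `: Type` mirrors the annotation the backend is described as emitting.
inductive Colour : Type where
  | Red : Colour
  | Green : Colour
  | Blue : Colour
  deriving BEq, Ord

-- `open Colour` would only affect this file; `export` also makes the
-- unqualified constructor names visible to files that import this one.
export Colour (Red Green Blue)

def isRed (c : Colour) : Bool :=
  match c with
  | Red => true
  | _ => false

#eval isRed Red  -- true
```

Because `export` creates aliases rather than a file-local opening, a module that imports this file can refer to `Red` without writing `Colour.Red`.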

Shared code safety

Changes to shared source files are guarded to only affect the Lean backend:

  • typed_ast_syntax.ml: class path collection in used_types — only consumed by Lean-specific code in rename_top_level.ml
  • rename_top_level.ml: Tc_class renaming returns early for non-Lean targets
  • target_trans.ml: class names added to avoid set only when target is Target_lean
  • output.ml: block token type kept as Kwd (preserves Coq/HOL/Isabelle spacing); UTF-8 encoding fix (of_string) is universal but correct
  • target_binding.ml: e_env fallback only fires when primary lookup fails (low risk, all targets)

Verified: all 6 non-Lean backends (OCaml, Coq, HOL, Isabelle, HTML, LaTeX) produce byte-for-byte identical output on all test files compared to master. Zero regressions.

Known limitations (won't fix)

  • 2 genuinely partial defs in the generated library: unfoldr (user-supplied termination), leastFixedPointUnbounded (iterates until fixpoint). Correctly partial.
  • sorry-based instances for mutual types: Lean deriving can't handle mutual inductives. Documented with /- mutual type -/ comments.
  • Duplicate Nat/Int instances in Num.lean: nat/natural both map to Nat, int/integer both to Int. Inherent to Lem's design, same in all backends.
  • Overlapping BEq instances: from Eq0, SetType, MapKeyType — three paths to BEq a. Lem typeclass design tradeoff.
  • O(n²) set operations: list-based representation, same as Coq backend.
  • coq_backend_skips.lem: non-positive inductive — fundamental Lean restriction.
  • Generated output spacing: double spaces, extra parens — inherited from Lem's output pipeline, affects all backends.

Test plan

  • make -C src — compiler builds cleanly
  • 12/12 backend tests generate valid Lean and lake build succeeds (57 Lake jobs)
  • 44/44 comprehensive tests pass with 424 runtime assertions verified
  • lake build in lean-lib/ — LemLib compiles (33 Lake jobs)
  • make lean-libs — all library files generated successfully
  • examples/cpp/Cmm.lean generates and compiles (34 Lake jobs)
  • examples/ppcmem-model/ — 10/10 files generate and compile (43 Lake jobs)
  • All 6 non-Lean backends produce byte-for-byte identical output vs master (60 file comparisons across OCaml, Coq, HOL, Isabelle, HTML, LaTeX — 0 differences)

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

septract and others added 17 commits March 6, 2026 10:06
Add a new backend targeting Lean 4, enabling lem to generate .lean files
from semantic definitions. This follows the same architecture as the
existing Coq backend (custom LeanBackend module via functor).

New files:
- src/lean_backend.ml: Main backend (~1600 lines)
- lean-lib/LemLib.lean: Runtime support library
- library/lean_constants: Lean 4 reserved words

Target registration across: ast.ml, target.ml, target.mli, parser.mly,
main.ml (-lean flag), target_trans.ml, backend.ml, backend.mli,
process_file.ml (.lean output with /- -/ comments and import LemLib).

Library declarations (declare lean target_rep) added to all standard
library files: basic_classes, bool, maybe, num, list, set, map, string,
either, relation, sorting, word, machine_word, tuple, function,
set_extra, map_extra, set_helpers, assert_extra.

Key Lean 4 adaptations:
- Bool/List/Nat capitalized types via target_rep type declarations
- Constructor patterns use dot notation (.Red, .some, etc.)
- Comments converted from (* *) to /- -/
- Records use structure/where syntax
- Match arms use | pat => expr (no end keyword)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Generate Lean 4 'instance : Inhabited T' for each type definition,
mirroring Coq's 'Definition T_default' generation. This ensures
default values are available for all user-defined types.
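
A minimal sketch of what such an instance could look like, assuming a hypothetical record type `Point`; the exact text the backend emits may differ.

```lean
-- Hypothetical user-defined type.
structure Point where
  x : Nat
  y : Nat

-- Default value built from the defaults of the field types,
-- playing the role of Coq's `Definition T_default`.
instance : Inhabited Point where
  default := { x := default, y := default }

#eval (default : Point).x  -- 0
```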

Also raise proper errors for Typ_with_sort in pat_typ and typ,
matching Coq's behavior instead of silently passing through.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Lean 4 to all documentation: README, manual (introduction,
invocation, backends, language grammar, backend linking, typeclasses),
and the Ott grammar definition. Create new backend_lean.md manual page.

Add missing declare lean target_rep entries for nth (list_extra.lem),
ord and chr (string_extra.lem) to complete library parity with Coq.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add lean to target lists for wordFromInteger, wordFromNumeral,
wordToHex in machine_word.lem. Add lean target_rep for choose and
exclude lean from choose lemmas/asserts in set_extra.lem. Exclude
lean from THE_spec lemma in function_extra.lem. Add lean-libs target
to library/Makefile and leantests target to tests/backends/Makefile.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n backend

- Add Meta_utf8 variant to output.ml to preserve UTF-8 bytes (×, →, etc.)
  instead of double-encoding through of_latin1
- Fix to_rope_help_block to use of_string for Format output, preventing
  double-encoding of Unicode characters in block-formatted output
- Add flatten_newlines utility to collapse newlines in output trees
- Disable block formatting for Lean backend (Lean 4 is whitespace-sensitive)
- Replace break_hint_space with explicit spaces in App, Infix, If, Fun, Case
- Add 'open TypeName' after inductive types for constructor scoping
- Add 'open ClassName' after class definitions for method scoping
- Remove dot-prefix on constructors in expression/pattern position
- Fix pattern constructor argument spacing (concat emp -> concat space)
- Fix 'let' keyword spacing (letx -> let x)
- Expand LemLib with set/map operations, ordering, and utility functions
- Add Lake project setup for lean-lib (lakefile.lean, lean-toolchain)
- Add lean-test Lake project for end-to-end compilation testing
- Expand lean_constants with ~30 missing Lean 4 reserved words
- Add lean-libs target and Lean paths to Makefile install/distrib/clean

5 of 7 test files now compile: Types, Classes2, Classes3, Pats, Pats3.
Remaining issues: Exps (set BEq instances), Coq_test (mutual inductives
with varying parameters), Classes3 (target-specific code leaking).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Filter target-specific class methods from Lean output: class bodies
  now skip methods annotated for other backends ({hol}, {coq}, etc.)
- Filter corresponding instance methods when class method is not
  target-visible for Lean
- Add 'deriving BEq' to inductive types and structures when all
  constructor/field types support it (no function-typed args)
- Skip 'deriving BEq' for mutual blocks to avoid cross-reference issues
- Handle mutual inductives with heterogeneous parameter counts by
  converting parameters to indices (Type 1 universe), with implicit
  bindings in constructors
- Use sorry for Inhabited defaults of mutual recursive types
- Export SetType.setElemCompare in Pervasives_extra for bare usage

All 7 test files now compile: Types, Classes2, Classes3, Pats, Pats3,
Exps, Coq_test (previously 5/7).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace all assert false in lean_backend.ml with descriptive error messages
- Add Typ_backend handling in typ and indreln_typ (previously unreachable crash)
- Fix sort_by_ordering bug: .EQ => false (mergeSort expects strict <, not <=)
- Add @[inline] to 19 trivial wrapper functions in LemLib.lean
- Add module-level documentation to LemLib.lean, Pervasives_extra.lean, lakefile.lean
- Add 13 missing Lean 4 keywords to lean_constants
- Fix flatten_newlines to recurse into Core nodes in output.ml
- Add setEqualBy doc comment noting sorted-input precondition

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix lean_backend.ml assert/lemma/theorem emission:
  - assert -> #eval with Bool check (runtime verification)
  - lemma/theorem -> by decide (proof-time verification)
- Create tests/comprehensive/ directory structure
- Add Makefile, run_tests.sh, lakefile.lean, expected_failures.txt
- Add Pervasives_extra.lean stub for test compilation
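
The two verification styles can be sketched as follows. The function `double` and the theorem name are hypothetical, and the exact form of the emitted `#eval` check is an assumption.

```lean
def double (n : Nat) : Nat := n + n

-- Lem `assert`: a Bool check evaluated when the file is compiled.
#eval double 3 == 6  -- true

-- Lem `lemma`/`theorem`: a proposition discharged by `decide`,
-- so a false statement fails the build instead of printing `false`.
theorem double_three : double 3 = 6 := by decide
```

The `#eval` form only reports a value, while the `by decide` form is checked by the kernel, which is why the commit distinguishes runtime from proof-time verification.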

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tions

Test suite covers: let bindings, function patterns, pattern matching edge
cases, type features, constructors, expression edge cases, higher-order
functions, either/maybe types, set/map operations, comprehensions, modules,
type classes, inductive relations, mutual recursion, do notation, target-
specific declarations, infix operators, scope/shadowing, strings/chars,
numeric formats, assertions/lemmas, records, reserved words, comments, and
stress testing.

Backend changes:
- lean_backend.ml: assert -> #eval Bool check (runtime verification),
  lemma/theorem -> by decide (proof-time verification)
- process_file.ml: auxiliary .lean files now import their main module

Results: 21/25 test files compile and pass Lake build, with 103 #eval
assertions verifying runtime correctness. 4 files are expected failures
(comprehensions, inductive relations, sets/maps, stress - all due to
missing BEq instances or syntax issues).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix indreln Prop ascription: remove ': Prop' from constructor antecedents
  that confused Lean's elaborator for subsequent type references
- Add 'export SetType (setElemCompare)' to comprehensive Pervasives_extra
- Enable all 4 formerly-expected-failure tests in lakefile (comprehensions,
  indreln, sets_maps, stress_large) — all now compile and pass
- Clear expected_failures.txt (no remaining failures)
- Track lake-manifest.json for comprehensive test project
- Expand .gitignore: Lean build artifacts, generated .lean files in
  tests/backends/ and tests/comprehensive/, .claude/, _opam/, lean-lib/.lake/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ation

- Fix typ dropping Typ_app type arguments (e.g. List Nat → List)
- Fix setFromList/setFromListBy reversed output order (foldl → foldr)
- Change Comp_binding/Setcomp silent comments to proper errors
- Change pattern catch-alls from silent comments to proper errors
- Add lean-libs to Makefile libs_phase_2 target
- Add Pats3 to backends leantests target with build rule
- Add fmapUnion and fmapElements to LemLib
- Add test_typ_args.lem regression test (4 assertions, all pass)

All tests pass: 57/57 comprehensive jobs, 19/19 backend jobs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…c type vars

- Native do-notation: remove pipeline desugaring, emit Lean 4 do blocks
  with proper indentation and whitespace handling
- Vector literals: render L_vector as prefix+bits (e.g. 0b1010)
- Vector patterns: P_vector renders as list patterns with .toList on
  match expression; P_vectorC raises clear error (no backend supports it)
- Numeric type variables: fix sorry/errors in class definitions, instance
  declarations, and type class constraints — all now emit (n : Nat)
- Default values: use 'default' instead of 'sorry' for Typ_wild/Typ_var
- LemLib: add lowercase 'vector' type alias for Lean's Vector
- New test: test_vectors.lem (vector expressions + pattern matching)
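
For orientation, a native Lean 4 do block of the kind described above might look like this. `safeDiv` and `divTwice` are hypothetical names, and the surrounding parentheses are an assumption about layout rather than a copy of backend output.

```lean
def safeDiv (a b : Nat) : Option Nat :=
  if b == 0 then none else some (a / b)

-- The do block is wrapped in parentheses so that its indentation
-- does not interact with whatever expression surrounds it.
def divTwice (a b c : Nat) : Option Nat :=
  (do
    let x ← safeDiv a b
    let y ← safeDiv x c
    pure (x + y))

#eval divTwice 100 5 2  -- some 30
#eval divTwice 1 0 2    -- none
```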

All tests pass: 27/27 comprehensive (59 lake jobs), 19/19 backends.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…build system

Phase 1 — Backend bugs (lean_backend.ml):
- Fix string literal escaping: escape \, \n, \t, \0, \r (not just quotes)
- Fix let_type_variables: Nvar gets {n : Nat} not {n : Type}
- Fix Indreln: emit removal comment when not targeted for Lean
- Fix VectorSub: correct skips'' whitespace typo
- Fix indreln_typ: space for multi-arg types (ts <> [] not ts = 1)
- Fix theorem: explicit space after keyword
- Fix assert names: escape through lean_string_escape
- Fix Do handler: wrap in (do ...) parens for indentation isolation
- Fix Typ_app/Typ_backend: conditional space for zero-arg types
- Fix P_cons: parenthesize in fun_pattern context
- Fix default_value for Typ_var: use sorry (avoids missing Inhabited)

Phase 2 — LemLib fixes (LemLib.lean):
- Fix setEqualBy: order-independent mutual subset check
- Fix setCompareBy: sort both lists before comparing
- Fix setCase: 4th arg is plain value, not function (matches Lem sig)
- Fix chooseAndSplit: partition by comparison, not just head/tail
- Fix fmapEqualBy: key param from LemOrdering to Bool
- Add apply, integerSqrt, rationalNumerator/Denominator, realSqrt/Floor/Ceiling, intAbs, listGet?/listGet!
- Add DecidableEq to LemOrdering
- Fix gen_pow_aux: total with termination_by/decreasing_by
- Fix sort_by_ordering: stable (.EQ => true)

Phase 3 — Build system:
- Add Classes2, Classes3, Coq_test to leantests Makefile
- Add nomatch, nofun, infix/infixl/infixr, prefix, postfix to lean_constants
- Fix README: Lean 4.28.0 (not 4.x)

New regression test: test_audit_regressions.lem (string escaping, cons
patterns, set equality — 6 assertions).

All 28 comprehensive tests pass, 19 backend jobs compile.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ces, add lean-libs

Fix Bool.<-> resolution error that blocked `make lean-libs`:
- Add e_env fallback to search_module_suffix in target_binding.ml
  When typeclass resolution macros synthesize definitions with narrow
  local environments (missing imported modules in m_env), fall back to
  looking up module paths directly in the global e_env registry
- Fix orderingEqual target_rep: `decide` is wrong (expects Prop),
  use infix `==` since LemOrdering derives BEq
- Revert Comp_binding/Setcomp to comment output (matches Coq backend)

Improve Inhabited instance generation (lean_backend.ml):
- Use `default` for type variables in Inhabited context (not sorry)
- For mutual types, find safe constructors whose args don't reference
  other mutual types, reducing sorry usage
- Collect type/class namespace opens for auxiliary file generation

Add lean-lib generated library files (58 files from make lean-libs)
Add pairEqual and maybeEqualBy to LemLib.lean
Add runtime assertions to 6 existing test files
Add test_cross_module.lem regression test (9 assertions)

29/29 comprehensive tests pass, 19/19 backend jobs pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… tests

- Add is_lean_pattern_match in patterns.ml that rejects P_num_add,
  triggering guard-based desugaring instead of invalid Lean 4 syntax
- Add 14 int32/int64 bitwise functions to LemLib with two's complement
  conversion (int32Lnot/Lor/Lxor/Land/Lsl/Lsr/Asr, same for int64)
- Add missing library functions: naturalOfString, integerDiv_t,
  integerRem_t, integerRem_f, THE with target_reps in .lem files
- Fix type/value namespace collision: rename_top_level.ml seeds constant
  renaming with type names for Lean so functions avoid type names
- Fix self-referential Inhabited: generate_default_values detects
  recursive types without base cases and uses sorry
- Add Add/Sub/Mul/Div/Mod/Neg/Pow/Min/Max/Abs/Append to lean_constants
  to avoid ambiguity with Lean stdlib type classes
- Expand backend tests: Record_test, Op, Let_rec, Indreln2 (11 total)
- Fix test .lem files: add type annotations for Num.Numeral resolution,
  convert tabs to spaces in let_rec.lem

All 11 backend tests and 29 comprehensive tests pass (90 Lake jobs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Expand backend_lean.md: add auxiliary files, recursive definitions,
  inductive relations, BEq derivation, automatic renaming sections
- Add Lake project example to compilation instructions
- Fix incorrect claim about constructor dot notation (uses open TypeName)
- Document Inhabited sorry behavior for recursive types without base cases
- Add -auxiliary_level auto mention, matching HOL4/Isabelle docs
- Fix introduction.md Lean version: 4.x -> 4.28.0
- Fix README.md Lean library entry to match other backends format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove duplicate unreachable P_record match arms in fun_pattern and
def_pattern. Replace silent 'Internal Lem error' comment strings with
proper exceptions that surface errors to users. Simplify
generate_inhabited_instance by removing dead None branch (single types
now always pass through mutual-aware path). Standardize error message
format to 'Lean backend: ...' prefix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@septract septract marked this pull request as draft March 7, 2026 07:14
septract and others added 12 commits March 6, 2026 23:48
Move generated library files under LemLib/ namespace so imports become
`import LemLib.Pervasives` instead of bare `import Pervasives`, avoiding
conflicts with Lean stdlib modules (Bool, List, String, etc.).

Fix class name collisions (Eq, Ord) with Lean stdlib by making the
renaming pipeline handle class types. Previously, class definitions were
skipped in add_def_aux_entities (TODO comment), so names in lean_constants
like Eq never triggered renaming. Now Eq -> Eq0, Ord -> Ord0 at all output
sites: class defs, constraints, and instance declarations.

Key changes:
- backend_common.ml: LemLib. prefix for library modules; class_path_to_name
- process_file.ml: dot-to-path conversion for Lean output files
- lean_backend.ml: strip LemLib. prefix from open stmts; use class_path_to_name
- types.ml/mli: type_defs_lookup_tc, type_defs_update_class
- typed_ast_syntax.ml: collect class paths and methods in add_def_aux_entities
- rename_top_level.ml: rename_type handles both Tc_type and Tc_class
- target_trans.ml: add_used_entities_to_avoid_names handles Tc_class
- lean_constants: add Ord
- Pervasives_extra stub moved to lean-lib/LemLib/; test stubs removed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…reps

Key changes to make generated library files compile:

- Import ordering: collect imports in a ref, emit all at file top before
  any other content (Lean requires imports before non-import statements)
- Import-open: suppress 'open' for LemLib.* modules (generated files have
  no namespaces; import alone brings definitions into scope)
- Class exports: use 'export ClassName (methods)' instead of 'open' after
  class definitions so methods are visible to importing files. Filter out
  names that clash with Lean globals (max, min, compare).
- Instance constraints: use inst_constraints from type system (fully
  qualified paths) instead of parsing unqualified Idents from Cs_list AST
- BEq bridges: emit 'instance [Eq0 a] : BEq a' after Eq class def, and
  'instance [SetType a] : BEq a' after SetType class def, so == works
  wherever these classes are in scope
- Target reps: Ord0.compare for compare method, intAbs for integer abs
  functions, != for unsafe_structural_inequality
- Remove pairEqual/maybeEqualBy from LemLib.lean (now in generated code)
- Pervasives_extra stub: remove namespace wrapper (matches generated style)
- lakefile: use submodules glob for full library discovery
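
The BEq bridge idea can be sketched in isolation. The class name `Eq0` comes from the notes above; its field name `isEqual` and the type `Flag` are assumptions made for this example.

```lean
-- Stand-in for the renamed Lem equality class.
class Eq0 (a : Type) where
  isEqual : a → a → Bool

-- Bridge: any type with an Eq0 instance also gets Lean's BEq,
-- so `==` works wherever Eq0 is in scope.
instance {a : Type} [Eq0 a] : BEq a where
  beq := Eq0.isEqual

inductive Flag : Type where
  | On : Flag
  | Off : Flag

instance : Eq0 Flag where
  isEqual x y :=
    match x, y with
    | .On, .On => true
    | .Off, .Off => true
    | _, _ => false

#eval Flag.On == Flag.Off  -- false
```

A bridge like this applies to every type, so it can overlap with other BEq instances, which matches the overlapping-instances limitation listed in the PR description.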

28 of 61 library modules now build successfully (up from 0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…g, cpp support

- Add explicit `: Type` annotation on all non-indexed inductives to prevent
  Lean auto-inferring Prop (Sort 0) for single-constructor mutual types
- Fix multi-clause mutual function naming to use const_ref_to_name (avoids
  definition/reference name mismatch e.g. test44 vs test440)
- Generate SetType/Eq0/Ord0 instances for all inductive types; skip for
  Type 1 (heterogeneous mutual blocks) since those classes require Type
- Auto-import LemLib.Pervasives_extra when Pervasives is imported, for
  bridge instances (NumAdd -> Add, etc.)
- Include transitive namespace opens in auxiliary files (Lean open is
  file-local, not exported to importers)
- Add MapKeyType compare method to BEq bridge derivation
- Add isInequal target_rep (!=) for basic_classes
- Add One/Zero to lean_constants to avoid stdlib collisions
- Replace removed List.get?/List.get! with listGetOpt/listGetBang wrappers
- Add Ord instance for Prod, set_tc, boolListFromNatural to LemLib runtime
- Update Pervasives_extra stub with Lem numeric class bridges

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ;lean to 13 target group annotations in cmm.lem so the Lean backend
can generate output from the same source file used by other backends.
Changes are purely additive and don't affect Coq/OCaml/Isabelle output.

Also add Lake project files (lakefile.lean, lean-toolchain, lake-manifest)
and .gitignore .lake/ globally instead of per-directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…pes, constants

Backend fixes for real-world Lem projects (ppcmem-model, cpp):
- Sanitize tab characters in all generated output (Lean 4 forbids tabs)
- Use 'export Type (Ctor1 Ctor2)' instead of 'open Type' for inductives,
  so constructors are visible in importing files
- Parenthesize match/if/let/fun via shared needs_parens helper, applied
  consistently in function args, if-conditions, and case arm bodies
- Fix indreln type signatures to apply target reps (e.g. set -> List)
- Resolve wildcards in fun_pattern P_typ to concrete types
- Handle unit literal in fun_pattern as (_ : Unit)
- Extract is_library_module predicate for OpenImportTarget
- Expand lean_constants from 129 to 262 entries covering all Init types,
  typeclasses, and common functions (id, flip, cast, guard, etc.)
- Add gen_lean_constants.lean script for regenerating the list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- natLnot: panic instead of returning 0 (bitwise NOT is undefined for Nat)
- naturalOfString: panic on invalid input instead of returning 0
- THE: panic instead of returning none (Hilbert choice not computable)
- rationalNumerator/Denominator: panic (rationals not supported)
- realSqrt/Floor/Ceiling: panic (reals not supported)
- Add Nat bitwise ops (natLand, natLor, natLxor, natLsl, natLsr, natAsr)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…compilation

- Class method constants: emit @method (Type) _ for bare class methods
  so Lean can resolve implicit type parameters (fixes Machine_word.lean)
- Standalone BEq instances without [Inhabited] constraint, separate from
  Ord which requires Inhabited for sorry bodies
- Termination annotations: use try_termination_proof (like Coq/Isabelle)
  to emit def instead of partial def when termination is provable
- Multi-discriminant match: decompose tuple scrutinees for termination
  checker visibility (match l1, l2 with instead of match (l1, l2) with)
- Library namespace qualification: push namespace before processing so
  auxiliary file opens get qualified names (Lem_Basic_classes.Eq0)
- Bridge instances moved to LemLib/Bridges.lean (survives make lean-libs)
- Auto-import LemLib.Bridges for non-library modules
- Makefile cleanup: remove auxiliary files after lean-libs generation
- Target reps: genlist, last, nat bitwise ops, int32 bitwise ops
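
To make the termination points concrete, here is a small hand-written sketch (hypothetical functions, not generated code) contrasting a total `def` that uses a multi-discriminant match with a `partial def` fallback.

```lean
-- Total: matching on `l1, l2` separately (rather than on the tuple
-- `(l1, l2)`) lets Lean see the structural recursion on the lists.
def zipSum (l1 l2 : List Nat) : List Nat :=
  match l1, l2 with
  | x :: xs, y :: ys => (x + y) :: zipSum xs ys
  | _, _ => []

-- Fallback: no termination proof is attempted, at the cost of the
-- definition being opaque to later proofs.
partial def iterateUntilFixed (f : Nat → Nat) (x : Nat) : Nat :=
  let y := f x
  if y == x then x else iterateUntilFixed f y

#eval zipSum [1, 2, 3] [10, 20, 30]            -- [11, 22, 33]
#eval iterateUntilFixed (fun n => n / 2) 40    -- 0
```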

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Cross-module name collision: rename_top_level.ml includes ALL env type
  names in Lean constant-avoid set, not just local ones (fixes thread_trans
  type/indreln collision across modules)
- Record literal type ascription: add (({ ... } : Type)) annotation using
  exp_to_typ so Lean can resolve record types without context
- setChoose replaces sorry target rep: Set_extra.choose now maps to a real
  function in LemLib instead of bare sorry (which can't be applied as fn)
- Propositional equality in indreln: lean_prop_equality flag makes isEqual
  output = (Eq) instead of == (BEq) in antecedents; functions lack BEq
- Indreln renamed name output: uses constant_descr_to_name instead of raw
  AST name, so renames like thread_trans -> thread_trans0 are reflected
- deriving BEq, Ord: simple types (non-mutual, no fn-typed args) use Lean's
  deriving instead of sorry-based instances; adds [BEq a] [Ord a] constraints
  on downstream SetType/Eq0/Ord0 instances for parameterized types
- Dynamic library namespace list: replaces hardcoded core_lib_ns with
  computation from module environment (e_env), detecting library modules by
  Coq rename presence
- String.mk -> String.ofList: fixes deprecation warning in string.lem

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix bug in type_def_indexed: types with 0 params in a heterogeneous
mutual block were emitted as Type instead of Type 1, causing a Lean
universe mismatch error. All types in such blocks now consistently
use Type 1.

New test files:
- test_case_arm_nesting.lem: match/if/let/fun in case arms, as function
  args, in if-conditions, in list/tuple constructors (42 assertions)
- test_termination.lem: declare termination_argument, multi-discriminant
  match with 2 and 3 scrutinees, partial def fallback (15 assertions)

Enhanced existing tests:
- test_pattern_edge_cases.lem: n+k patterns (fib, pred, classify),
  unit in tuple/let patterns (13 new assertions)
- test_indreln.lem: inequality, nested fn application, ordering, and
  multi-rule relations in antecedents (4 new relations)
- test_mutual_recursion.lem: heterogeneous param counts (caught the
  Type 1 bug), 3-way mutual recursion
- test_audit_regressions.lem: tabs in comments, type/record defs

Total: 31 tests, 231 assertions, all passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Honest accounting of every gap: 942 sorry stubs in Machine_word,
wrong floating-point types, missing overflow semantics, 18 partial
defs, incomplete target rep coverage, indreln != edge case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…annotations

Propositional equality in indreln antecedents now handles both the Infix
AST path (direct = / <> syntax) and the App AST path (Lem's <>
decomposition to not(isEqual x y)). Extracted check_beq_target_rep
helper to share logic between both cases. Added regression tests using
(nat -> nat) types which lack BEq and would fail without the fix.
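
The need for propositional equality can be seen in a small hand-written relation over function types; `SameFn` is a hypothetical name and the layout is not copied from backend output.

```lean
-- `Nat → Nat` has no BEq instance, so an antecedent written with `==`
-- would not elaborate; propositional `=` is always available.
inductive SameFn : (Nat → Nat) → (Nat → Nat) → Prop where
  | refl (f g : Nat → Nat) : f = g → SameFn f g

example : SameFn id id := SameFn.refl id id rfl
```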

Added {lean}-scoped termination annotations for 10 structurally recursive
library functions (map_tr, count_map, splitAtAcc, mapMaybe, mapiAux,
catMaybes, init, stringFromListAux, concat, integerOfStringHelper),
reducing partial def count from 18 to 8.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…FixedPoint

Convert 5 more partial defs to total:
- LemLib.lean: boolListFromNatural (n/2 division), bitSeqBinopAux (dual-list recursion)
- LemLib.lean: lemStringFromNatHelper, lemStringFromNaturalHelper (n/10 division)
- LemLib.lean: lemLeastFixedPoint (bounded countdown)

Add Lean-only target reps in string_extra.lem and set.lem to route
generated code through the total LemLib implementations. All changes
are inherently Lean-scoped (declare lean target_rep / hand-written Lean).

Add TODO #7: audit all pre-existing unscoped termination annotations
from upstream to verify they don't affect other backends.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
septract and others added 21 commits March 9, 2026 12:03
stringCompare in string_extra.lem always returned EQ (marked XXX: broken).
Added lean-specific inline: let inline {lean} stringCompare = defaultCompare.
This fixes stringLess, stringLessEq, stringGreater, stringGreaterEq, and
the Ord0 String instance.

Added 5 string comparison test assertions to prevent regression.

Updated TODO.md based on audit:
- #2 (numeric sorry stubs): non-issue, all inside block comments
- #8 (missing target reps): resolved, Lean has 288 vs Coq's 260

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Traced try_termination_proof through backend.ml — unscoped annotations
are intentionally universal (affect Coq, HOL, Isabelle, Lean). Pre-existing
upstream annotations have worked for years. Our branch additions are all
{lean} scoped. No changes needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously these 4 types silently mapped to Int, producing semantically
wrong results (e.g., rationalFromFrac 1 3 = 0 via integer division).
Now they map to LemRational/LemReal/LemFloat64/LemFloat32 — opaque
structure types in LemLib.lean where every operation panics with a clear
error message. This ensures misuse is caught immediately at runtime.
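
A sketch of the panicking-placeholder pattern, using a hypothetical type `DemoRational` rather than the actual LemLib definitions, whose fields and messages are not shown here:

```lean
-- Placeholder type: it carries no numeric data at all.
structure DemoRational where
  unused : Unit
  deriving Inhabited

-- Every operation fails loudly instead of computing a wrong answer.
-- `panic!` needs the Inhabited instance derived above.
instance : Add DemoRational where
  add _ _ := panic! "DemoRational: addition is not supported"

instance : Mul DemoRational where
  mul _ _ := panic! "DemoRational: multiplication is not supported"
```

Code that mentions the type still type-checks, but any attempt to evaluate an operation reports the panic message rather than silently producing an integer result.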

Changes:
- LemLib.lean: 4 new types with full panicking instances (Add, Sub, Mul,
  Div, Neg, HPow, BEq, Ord, Min, Max, OfNat, Inhabited) + 15 wrapper
  functions for target reps that can't use infix operators
- library/num.lem: 4 type target reps, 14 function target reps updated,
  2 new target reps for rationalFromFrac/realFromFrac (Lean-only changes)
- Also reduces duplicate Int typeclass instances (partial fix for TODO #5)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously int32 and int64 both mapped to bare Int, causing duplicate
typeclass instances with int/integer (all four were Int). Now they are
structure wrappers (LemInt32/LemInt64) with forwarding instances for
arithmetic, comparison, conversion, and bitwise operations. Same
semantics as Coq's Z mapping but type-safe.
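
A minimal sketch of such a wrapper, under the assumption that it simply forwards to `Int`. `Int32Wrapper` and `int32WrapperToNat` are hypothetical names; the real `LemInt32` and `lemInt32ToNat` may differ in detail.

```lean
-- A distinct type, so its instances cannot collide with those for `Int`.
structure Int32Wrapper where
  val : Int
  deriving BEq, Repr

-- Forwarding instances: unbounded `Int` arithmetic, no wrap-around.
instance : Add Int32Wrapper := ⟨fun a b => ⟨a.val + b.val⟩⟩
instance : Ord Int32Wrapper := ⟨fun a b => compare a.val b.val⟩

-- Conversion helper that unwraps before calling `Int.toNat`.
def int32WrapperToNat (x : Int32Wrapper) : Nat := x.val.toNat

#eval int32WrapperToNat (⟨40⟩ + ⟨2⟩)  -- 42
```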

Updated ppcmem bitwiseCompatibility.lem shift target reps to use
lemInt32ToNat instead of Int.toNat (which expects bare Int).

All tests pass: 31 comprehensive, 11 backend, ppcmem (43 jobs),
cpp (34 jobs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Map Lem's mword phantom type to Lean 4's BitVec via TYR_subst. The type
declaration `mword 'a = BitVec (@Size.size 'a _)` replaces 942 sorry
stubs with real BitVec operations.
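
The shape of this mapping can be sketched as below. The `Size` class, the `Ty8` tag type, and `MWord` are hypothetical stand-ins defined inside a namespace for illustration; they are not the generated definitions.

```lean
namespace MwordSketch

-- Hypothetical stand-in for Lem's `Size` class: a type-level bit width.
class Size (α : Type) where
  size : Nat

-- Zero-constructor type used only as a width tag.
inductive Ty8 : Type

instance : Size Ty8 where
  size := 8

-- The phantom parameter selects the width of the underlying `BitVec`.
abbrev MWord (α : Type) [inst : Size α] : Type :=
  BitVec (@Size.size α inst)

-- Operations reuse the `BitVec` instances through the abbreviation.
def wordAdd {α : Type} [Size α] (x y : MWord α) : MWord α := x + y

end MwordSketch

-- Plain `BitVec` arithmetic wraps modulo 2^8.
#eval (BitVec.ofNat 8 255 + BitVec.ofNat 8 1).toNat  -- 0
```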

Key changes:
- library/machine_word.lem: Add Lean target reps for all 36 mword
  operations (arithmetic, bitwise, shifts, rotates, comparisons, bit
  access, width ops, concat/extract/update, hex, bitlist conversion)
- lean-lib/LemLib.lean: 30 thin wrapper functions bridging Lem calling
  conventions to Lean 4 BitVec API
- src/lean_backend.ml: TYR_subst constraint propagation — walks Lem
  types to discover implicit [Size a] constraints that TYR_subst
  introduces; deferred abbrev emission for forward references; shared
  helpers for constraint extraction and formatting
- tests/comprehensive/test_mword.lem: 57 assert-based tests covering
  all operations, verified at runtime during lake build (not just
  type-checking)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Opaque types like ty1..ty4096 and itself are zero-constructor inductives
that exist only to carry type-level information (bit widths via Size).
They are uninhabitable by design. Generating sorry-based Inhabited/BEq/Ord
instances for them was unsound and produced 942 compiler warnings.

Filter out Te_opaque types in generate_default_values and
generate_default_values_mutual before instance generation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Unify fun_pattern/def_pattern into single pattern ~style function
  with FunParam | MatchArm discriminator (~80 lines saved)
- Extract tnvar_to_string/tnvar_to_variable helpers (6+ call sites)
- Rename shadowed variables in clauses function
- Fix mutual indreln: add mutual/end wrapping, per-relation inductive keyword
- Add make lean-tests target (full 6-stage test suite)
- Add coq_exps_test to backend tests (12/12 now)
- New comprehensive tests: test_class_instance_constraints,
  test_pattern_complex, test_mutual_indreln,
  test_set_comprehension_advanced (36 total, 251+ assertions)
- Remove TODO.md from tracking, add to .gitignore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- backend_lean.md: LemLib namespace, termination_argument override,
  mutual indreln, machine words section, BEq+Ord derivation,
  export vs open, cross-module renaming
- own_lem_files.md: add Lean to termination_argument example
- Makefile: update test count comment (36 tests, 251+ assertions)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…straints

Lem's default_instance declarations were silently dropped for the Lean
backend because def_trans.ml wrapped them in Comment nodes for all targets.

Fix: pass default instances through to the Lean backend (def_trans.ml),
emit them as instance (priority := low) (lean_backend.ml), and add
Lean-native typeclass constraints that the method bodies require:
- Eq0 default gets [BEq a] (body uses ==)
- SetType default gets [Ord a] (body uses defaultCompare)

The other two defaults (OrdMaxMin, MapKeyType) already carry sufficient
Lem-level constraints. This extends the extra_constraints_for_tyr_subst
pattern for function target rep constraints in default instances.
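
A minimal sketch of a low-priority default instance with an extra Lean-native constraint. The class below is a simplified, hypothetical version of `Eq0`, and the parity-based `Nat` instance exists only to make the override observable.

```lean
class Eq0 (α : Type) where
  isEqual : α → α → Bool

-- Default instance: applies to any type with `BEq`, but at low priority.
instance (priority := low) defaultEq0 {α : Type} [BEq α] : Eq0 α where
  isEqual x y := x == y

-- A specific instance at normal priority is preferred over the default.
instance natEq0 : Eq0 Nat where
  isEqual x y := x % 2 == y % 2

#eval Eq0.isEqual (1 : Nat) 3  -- true  (specific instance: same parity)
#eval Eq0.isEqual "a" "b"      -- false (default instance via `BEq String`)
```

Without the `[BEq α]` binder the default body would not elaborate, since `==` has no instance for an arbitrary `α`.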

Also adds 5 new comprehensive test files (41 total, all passing).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix VectorAcc/VectorSub missing space between expression and index
- Fix \0 string escape to \x00 (Lean 4 doesn't support \0)
- Add coverage script (scripts/lean_coverage.sh) using bisect_ppx
- Add tests: polymorphic multi-clause function, module rename,
  direct isInequal in indreln, vector access, string escapes,
  target-filtered definitions (rec, indreln, val, lemma)
- Coverage: 82.56% on lean_backend.ml (1946/2357)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bug #20: Let_def path now emits one def per bound name using local let
destructuring (def x : T := let PAT := EXPR; x) instead of invalid
def PATTERN := EXPR syntax.
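
For a Lem definition that binds a pair pattern, the emitted shape would look roughly like this sketch. The names `sourcePair`, `x`, and `y` are hypothetical, and the actual generated output may differ in formatting.

```lean
def sourcePair : Nat × String := (1, "one")

-- One `def` per bound name, each destructuring locally.
def x : Nat := let (a, _) := sourcePair; a
def y : String := let (_, b) := sourcePair; b

#eval x  -- 1
#eval y  -- "one"
```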

Bug #21: Polymorphic indreln premises now include explicit type parameters
via lean_indreln_params ref. Self-references in antecedents get the
type params injected (e.g., poly_mem a xs x).
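
A hand-written sketch of a polymorphic membership relation shows why the parameter must be repeated. The constructor names and argument order are assumptions, not the generated output.

```lean
-- With `a` declared as an explicit parameter, every self-reference
-- inside a constructor must be written `poly_mem a ...`.
inductive poly_mem (a : Type) : List a → a → Prop where
  | here (x : a) (xs : List a) : poly_mem a (x :: xs) x
  | there (x y : a) (xs : List a) :
      poly_mem a xs x → poly_mem a (y :: xs) x

example : poly_mem Nat [1, 2] 2 :=
  poly_mem.there (a := Nat) 2 1 [2] (poly_mem.here 2 [])
```

Writing the premise as `poly_mem xs x` would fail to type-check, because `xs` would be read as the type parameter.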

New test: test_let_def_destructuring.lem (8 assertions: pairs, triples,
nested tuples). Extended test_indreln.lem with poly_mem and isInequal
cases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Vector.slice using Vector.ofFn + Array.extract/getD. Lem's v.[i..j]
syntax now fully works end-to-end. Test: 2 new runtime-verified assertions
in test_vectors.lem confirm slice correctness.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New indreln tests exercise previously uncovered paths:
- apply_pred: function-typed argument (nat -> bool) triggers indreln_typ
  Typ_fn Bool→Prop conversion
- pair_rel: tuple-typed index exercises indreln_typ Typ_tup path

lean_backend.ml coverage: 84.04% → 84.25% (2017 → 2022 points)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove 4 dead/unreachable code paths identified by bisect_ppx coverage:
- typ function (~30 lines): entirely unused, never called
- typ_ident_to_output: only caller was dead typ
- lean_function_application_to_output: never called wrapper
- Let_fun branch in let_body: pattern compilation eliminates before backend
- Te_variant in generate_default_value_texp: replaced with explicit
  unreachable guard (caller handles Te_variant before dispatch)

Coverage improves from 84.25% to 86.63% (same 2022 covered spans,
66 fewer total spans). Net -41 lines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Te_opaque, Te_abbrev, and Te_record are never reached in tyexp:
def dispatches abbreviations and records to dedicated handlers, and
type_def_variant handles Te_opaque before calling tyexp.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace fragile List.nth with pattern matching on args list (line 1174).
Replace deprecated Hashtbl.find/Not_found with Hashtbl.find_opt (line 822).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ate gitignore

- Remove root-level test_lean.lem / test_lean2.lem (development scratch files)
- Commit examples/ppcmem-model/{lakefile.lean, lean-toolchain, lake-manifest.json}
  (needed for lake build, matching cpp and lean-test which are already committed)
- Add gitignore entries for: generated lean-lib/LemLib/*.lean,
  generated examples/ppcmem-model/*.lean, _build/, main.native,
  coverage-report/, .coverage-switch/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests that types, constructors, polymorphic types, and constants
defined in one user .lem file can be imported and used by another.
Exercises: auxiliary file transitive opens, skip-open for user modules,
cross-module name collision handling, dynamic library namespace list.
5 runtime-verified assertions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In non-library (user) modules, library namespace opens were emitted
twice: inline by OpenImportTarget during body processing, and again by
transitive_opens in the preamble. Skip inline opens for library imports
in user modules since transitive_opens handles them for both main and
auxiliary files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- rename_top_level.ml: Tc_class renaming only for Lean target; other
  backends skip class entries (preserves pre-existing behavior)
- target_trans.ml: class names added to avoid set only for Lean target
- output.ml: revert block token type from Meta_utf8 back to Kwd,
  preserving spacing semantics for Coq/HOL/Isabelle while keeping
  the UTF-8 encoding fix (of_latin1 -> of_string)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace silent emp catch-all in def with explicit Declaration/Lemma arms
- Replace unreachable OpenImportTarget emp with error guard
- Rename shadowed variable skips -> skips_out in P_wild pattern handler
- Change setChoose empty-set case from sorry to panic! (consistent with
  rest of LemLib's error handling style)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@septract septract changed the title Add Lean 4 backend Add Lean 4 backend [AI-assisted] Mar 10, 2026
@septract septract marked this pull request as ready for review March 10, 2026 22:04
septract and others added 2 commits March 10, 2026 15:30
Section headers and design rationale for: expression rendering,
type definitions, instance generation, import/namespace management,
indreln clauses, multi-clause grouping, and the target_trans pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The script only tests the Lean backend, so the name should reflect that.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>