Add Lean 4 backend [AI-assisted] #40
Open
septract wants to merge 52 commits into rems-project:master
Add a new backend targeting Lean 4, enabling lem to generate .lean files from semantic definitions. This follows the same architecture as the existing Coq backend (custom LeanBackend module via functor).
New files:
- src/lean_backend.ml: Main backend (~1600 lines)
- lean-lib/LemLib.lean: Runtime support library
- library/lean_constants: Lean 4 reserved words
Target registration across: ast.ml, target.ml, target.mli, parser.mly, main.ml (-lean flag), target_trans.ml, backend.ml, backend.mli, process_file.ml (.lean output with /- -/ comments and import LemLib).
Library declarations (declare lean target_rep) added to all standard library files: basic_classes, bool, maybe, num, list, set, map, string, either, relation, sorting, word, machine_word, tuple, function, set_extra, map_extra, set_helpers, assert_extra.
Key Lean 4 adaptations:
- Bool/List/Nat capitalized types via target_rep type declarations
- Constructor patterns use dot notation (.Red, .some, etc.)
- Comments converted from (* *) to /- -/
- Records use structure/where syntax
- Match arms use | pat => expr (no end keyword)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
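The adaptations listed in this commit can be pictured with a small hand-written Lean 4 sketch. It is not actual backend output: the names Colour, Point, isRed and origin are hypothetical, and the Lem source they would come from is assumed.

```lean
-- Hypothetical Lem source: type colour = Red | Green
inductive Colour where
  | Red
  | Green

-- Lem records map to `structure ... where`.
structure Point where
  x : Nat
  y : Nat

/- Comments use the Lean block form instead of Lem's (* ... *).
   Match arms use `| pat => expr`, with no closing `end`. -/
def isRed (c : Colour) : Bool :=
  match c with
  | .Red => true
  | .Green => false

def origin : Point := { x := 0, y := 0 }
```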
Generate Lean 4 'instance : Inhabited T' for each type definition, mirroring Coq's 'Definition T_default' generation. This ensures default values are available for all user-defined types. Also raise proper errors for Typ_with_sort in pat_typ and typ, matching Coq's behavior instead of silently passing through. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
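A minimal sketch of what an `instance : Inhabited T` declaration looks like for a simple variant type. The type Colour and the choice of Red as the default are assumptions for illustration; the commit does not say which constructor the backend picks.

```lean
inductive Colour where
  | Red
  | Green

-- Plays the role of Coq's `Definition T_default`.
instance : Inhabited Colour where
  default := Colour.Red

-- `default` now resolves for Colour wherever a fallback value is needed.
def fallback : Colour := default
```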
Add Lean 4 to all documentation: README, manual (introduction, invocation, backends, language grammar, backend linking, typeclasses), and the Ott grammar definition. Create new backend_lean.md manual page. Add missing declare lean target_rep entries for nth (list_extra.lem), ord and chr (string_extra.lem) to complete library parity with Coq. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add lean to target lists for wordFromInteger, wordFromNumeral, wordToHex in machine_word.lem. Add lean target_rep for choose and exclude lean from choose lemmas/asserts in set_extra.lem. Exclude lean from THE_spec lemma in function_extra.lem. Add lean-libs target to library/Makefile and leantests target to tests/backends/Makefile. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n backend
- Add Meta_utf8 variant to output.ml to preserve UTF-8 bytes (×, →, etc.) instead of double-encoding through of_latin1
- Fix to_rope_help_block to use of_string for Format output, preventing double-encoding of Unicode characters in block-formatted output
- Add flatten_newlines utility to collapse newlines in output trees
- Disable block formatting for Lean backend (Lean 4 is whitespace-sensitive)
- Replace break_hint_space with explicit spaces in App, Infix, If, Fun, Case
- Add 'open TypeName' after inductive types for constructor scoping
- Add 'open ClassName' after class definitions for method scoping
- Remove dot-prefix on constructors in expression/pattern position
- Fix pattern constructor argument spacing (concat emp -> concat space)
- Fix 'let' keyword spacing (letx -> let x)
- Expand LemLib with set/map operations, ordering, and utility functions
- Add Lake project setup for lean-lib (lakefile.lean, lean-toolchain)
- Add lean-test Lake project for end-to-end compilation testing
- Expand lean_constants with ~30 missing Lean 4 reserved words
- Add lean-libs target and Lean paths to Makefile install/distrib/clean
5 of 7 test files now compile: Types, Classes2, Classes3, Pats, Pats3. Remaining issues: Exps (set BEq instances), Coq_test (mutual inductives with varying parameters), Classes3 (target-specific code leaking).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Filter target-specific class methods from Lean output: class bodies
now skip methods annotated for other backends ({hol}, {coq}, etc.)
- Filter corresponding instance methods when class method is not
target-visible for Lean
- Add 'deriving BEq' to inductive types and structures when all
constructor/field types support it (no function-typed args)
- Skip 'deriving BEq' for mutual blocks to avoid cross-reference issues
- Handle mutual inductives with heterogeneous parameter counts by
converting parameters to indices (Type 1 universe), with implicit
bindings in constructors
- Use sorry for Inhabited defaults of mutual recursive types
- Export SetType.setElemCompare in Pervasives_extra for bare usage
All 7 test files now compile: Types, Classes2, Classes3, Pats, Pats3,
Exps, Coq_test (previously 5/7).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace all assert false in lean_backend.ml with descriptive error messages
- Add Typ_backend handling in typ and indreln_typ (previously unreachable crash)
- Fix sort_by_ordering bug: .EQ => false (mergeSort expects strict <, not <=)
- Add @[inline] to 19 trivial wrapper functions in LemLib.lean
- Add module-level documentation to LemLib.lean, Pervasives_extra.lean, lakefile.lean
- Add 13 missing Lean 4 keywords to lean_constants
- Fix flatten_newlines to recurse into Core nodes in output.ml
- Add setEqualBy doc comment noting sorted-input precondition
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix lean_backend.ml assert/lemma/theorem emission:
  - assert -> #eval with Bool check (runtime verification)
  - lemma/theorem -> by decide (proof-time verification)
- Create tests/comprehensive/ directory structure
- Add Makefile, run_tests.sh, lakefile.lean, expected_failures.txt
- Add Pervasives_extra.lean stub for test compilation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
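As a rough illustration of the two emission styles, here is a hand-written sketch. The function double and the theorem name double_two are hypothetical, and the exact text the backend emits may differ.

```lean
def double (n : Nat) : Nat := n + n

-- Lem `assert`: a Bool expression evaluated with #eval while the file is built.
#eval double 2 == 4

-- Lem `lemma`/`theorem`: a decidable proposition discharged by `decide`.
theorem double_two : double 2 = 4 := by decide
```

`by decide` only applies when the statement is decidable and small enough to evaluate; the commit does not say what is emitted otherwise.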
…tions
Test suite covers: let bindings, function patterns, pattern matching edge cases, type features, constructors, expression edge cases, higher-order functions, either/maybe types, set/map operations, comprehensions, modules, type classes, inductive relations, mutual recursion, do notation, target-specific declarations, infix operators, scope/shadowing, strings/chars, numeric formats, assertions/lemmas, records, reserved words, comments, and stress testing.
Backend changes:
- lean_backend.ml: assert -> #eval Bool check (runtime verification), lemma/theorem -> by decide (proof-time verification)
- process_file.ml: auxiliary .lean files now import their main module
Results: 21/25 test files compile and pass Lake build, with 103 #eval assertions verifying runtime correctness. 4 files are expected failures (comprehensions, inductive relations, sets/maps, stress), all due to missing BEq instances or syntax issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix indreln Prop ascription: remove ': Prop' from constructor antecedents that confused Lean's elaborator for subsequent type references
- Add 'export SetType (setElemCompare)' to comprehensive Pervasives_extra
- Enable all 4 formerly-expected-failure tests in lakefile (comprehensions, indreln, sets_maps, stress_large); all now compile and pass
- Clear expected_failures.txt (no remaining failures)
- Track lake-manifest.json for comprehensive test project
- Expand .gitignore: Lean build artifacts, generated .lean files in tests/backends/ and tests/comprehensive/, .claude/, _opam/, lean-lib/.lake/
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ation
- Fix typ dropping Typ_app type arguments (e.g. List Nat → List)
- Fix setFromList/setFromListBy reversed output order (foldl → foldr)
- Change Comp_binding/Setcomp silent comments to proper errors
- Change pattern catch-alls from silent comments to proper errors
- Add lean-libs to Makefile libs_phase_2 target
- Add Pats3 to backends leantests target with build rule
- Add fmapUnion and fmapElements to LemLib
- Add test_typ_args.lem regression test (4 assertions, all pass)
All tests pass: 57/57 comprehensive jobs, 19/19 backend jobs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…c type vars
- Native do-notation: remove pipeline desugaring, emit Lean 4 do blocks with proper indentation and whitespace handling
- Vector literals: render L_vector as prefix+bits (e.g. 0b1010)
- Vector patterns: P_vector renders as list patterns with .toList on match expression; P_vectorC raises clear error (no backend supports it)
- Numeric type variables: fix sorry/errors in class definitions, instance declarations, and type class constraints; all now emit (n : Nat)
- Default values: use 'default' instead of 'sorry' for Typ_wild/Typ_var
- LemLib: add lowercase 'vector' type alias for Lean's Vector
- New test: test_vectors.lem (vector expressions + pattern matching)
All tests pass: 27/27 comprehensive (59 lake jobs), 19/19 backends.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…build system
Phase 1 — Backend bugs (lean_backend.ml):
- Fix string literal escaping: escape \, \n, \t, \0, \r (not just quotes)
- Fix let_type_variables: Nvar gets {n : Nat} not {n : Type}
- Fix Indreln: emit removal comment when not targeted for Lean
- Fix VectorSub: correct skips'' whitespace typo
- Fix indreln_typ: space for multi-arg types (ts <> [] not ts = 1)
- Fix theorem: explicit space after keyword
- Fix assert names: escape through lean_string_escape
- Fix Do handler: wrap in (do ...) parens for indentation isolation
- Fix Typ_app/Typ_backend: conditional space for zero-arg types
- Fix P_cons: parenthesize in fun_pattern context
- Fix default_value for Typ_var: use sorry (avoids missing Inhabited)
Phase 2 — LemLib fixes (LemLib.lean):
- Fix setEqualBy: order-independent mutual subset check
- Fix setCompareBy: sort both lists before comparing
- Fix setCase: 4th arg is plain value, not function (matches Lem sig)
- Fix chooseAndSplit: partition by comparison, not just head/tail
- Fix fmapEqualBy: key param from LemOrdering to Bool
- Add apply, integerSqrt, rationalNumerator/Denominator, realSqrt/Floor/Ceiling, intAbs, listGet?/listGet!
- Add DecidableEq to LemOrdering
- Fix gen_pow_aux: total with termination_by/decreasing_by
- Fix sort_by_ordering: stable (.EQ => true)
Phase 3 — Build system:
- Add Classes2, Classes3, Coq_test to leantests Makefile
- Add nomatch, nofun, infix/infixl/infixr, prefix, postfix to lean_constants
- Fix README: Lean 4.28.0 (not 4.x)
New regression test: test_audit_regressions.lem (string escaping, cons
patterns, set equality — 6 assertions).
All 28 comprehensive tests pass, 19 backend jobs compile.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
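The "order-independent mutual subset check" named in Phase 2 can be sketched as follows. This is an illustration only: setEqualBySketch is a hypothetical name, and the Bool-valued equality parameter is an assumption; the real setEqualBy in LemLib.lean may take a different comparison argument.

```lean
-- Two lists denote the same set when each is a subset of the other,
-- regardless of element order or duplicates.
def setEqualBySketch {α : Type} (eq : α → α → Bool) (xs ys : List α) : Bool :=
  xs.all (fun x => ys.any (fun y => eq x y)) &&
  ys.all (fun y => xs.any (fun x => eq y x))

-- Same elements in a different order compare equal.
#eval setEqualBySketch (fun (a b : Nat) => a == b) [1, 2, 3] [3, 1, 2]
```

The check is quadratic in the list lengths; it trades speed for not needing a sorted-input precondition.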
…ces, add lean-libs
Fix Bool.<-> resolution error that blocked `make lean-libs`:
- Add e_env fallback to search_module_suffix in target_binding.ml. When typeclass resolution macros synthesize definitions with narrow local environments (missing imported modules in m_env), fall back to looking up module paths directly in the global e_env registry
- Fix orderingEqual target_rep: `decide` is wrong (expects Prop), use infix `==` since LemOrdering derives BEq
- Revert Comp_binding/Setcomp to comment output (matches Coq backend)
Improve Inhabited instance generation (lean_backend.ml):
- Use `default` for type variables in Inhabited context (not sorry)
- For mutual types, find safe constructors whose args don't reference other mutual types, reducing sorry usage
- Collect type/class namespace opens for auxiliary file generation
Add lean-lib generated library files (58 files from make lean-libs)
Add pairEqual and maybeEqualBy to LemLib.lean
Add runtime assertions to 6 existing test files
Add test_cross_module.lem regression test (9 assertions)
29/29 comprehensive tests pass, 19/19 backend jobs pass
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… tests
- Add is_lean_pattern_match in patterns.ml that rejects P_num_add, triggering guard-based desugaring instead of invalid Lean 4 syntax
- Add 14 int32/int64 bitwise functions to LemLib with two's complement conversion (int32Lnot/Lor/Lxor/Land/Lsl/Lsr/Asr, same for int64)
- Add missing library functions: naturalOfString, integerDiv_t, integerRem_t, integerRem_f, THE with target_reps in .lem files
- Fix type/value namespace collision: rename_top_level.ml seeds constant renaming with type names for Lean so functions avoid type names
- Fix self-referential Inhabited: generate_default_values detects recursive types without base cases and uses sorry
- Add Add/Sub/Mul/Div/Mod/Neg/Pow/Min/Max/Abs/Append to lean_constants to avoid ambiguity with Lean stdlib type classes
- Expand backend tests: Record_test, Op, Let_rec, Indreln2 (11 total)
- Fix test .lem files: add type annotations for Num.Numeral resolution, convert tabs to spaces in let_rec.lem
All 11 backend tests and 29 comprehensive tests pass (90 Lake jobs).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Expand backend_lean.md: add auxiliary files, recursive definitions, inductive relations, BEq derivation, automatic renaming sections
- Add Lake project example to compilation instructions
- Fix incorrect claim about constructor dot notation (uses open TypeName)
- Document Inhabited sorry behavior for recursive types without base cases
- Add -auxiliary_level auto mention, matching HOL4/Isabelle docs
- Fix introduction.md Lean version: 4.x -> 4.28.0
- Fix README.md Lean library entry to match other backends format
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove duplicate unreachable P_record match arms in fun_pattern and def_pattern. Replace silent 'Internal Lem error' comment strings with proper exceptions that surface errors to users. Simplify generate_inhabited_instance by removing dead None branch (single types now always pass through mutual-aware path). Standardize error message format to 'Lean backend: ...' prefix. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move generated library files under LemLib/ namespace so imports become `import LemLib.Pervasives` instead of bare `import Pervasives`, avoiding conflicts with Lean stdlib modules (Bool, List, String, etc.).
Fix class name collisions (Eq, Ord) with Lean stdlib by making the renaming pipeline handle class types. Previously, class definitions were skipped in add_def_aux_entities (TODO comment), so names in lean_constants like Eq never triggered renaming. Now Eq -> Eq0, Ord -> Ord0 at all output sites: class defs, constraints, and instance declarations.
Key changes:
- backend_common.ml: LemLib. prefix for library modules; class_path_to_name
- process_file.ml: dot-to-path conversion for Lean output files
- lean_backend.ml: strip LemLib. prefix from open stmts; use class_path_to_name
- types.ml/mli: type_defs_lookup_tc, type_defs_update_class
- typed_ast_syntax.ml: collect class paths and methods in add_def_aux_entities
- rename_top_level.ml: rename_type handles both Tc_type and Tc_class
- target_trans.ml: add_used_entities_to_avoid_names handles Tc_class
- lean_constants: add Ord
- Pervasives_extra stub moved to lean-lib/LemLib/; test stubs removed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…reps
Key changes to make generated library files compile:
- Import ordering: collect imports in a ref, emit all at file top before any other content (Lean requires imports before non-import statements)
- Import-open: suppress 'open' for LemLib.* modules (generated files have no namespaces; import alone brings definitions into scope)
- Class exports: use 'export ClassName (methods)' instead of 'open' after class definitions so methods are visible to importing files. Filter out names that clash with Lean globals (max, min, compare).
- Instance constraints: use inst_constraints from type system (fully qualified paths) instead of parsing unqualified Idents from Cs_list AST
- BEq bridges: emit 'instance [Eq0 a] : BEq a' after Eq class def, and 'instance [SetType a] : BEq a' after SetType class def, so == works wherever these classes are in scope
- Target reps: Ord0.compare for compare method, intAbs for integer abs functions, != for unsafe_structural_inequality
- Remove pairEqual/maybeEqualBy from LemLib.lean (now in generated code)
- Pervasives_extra stub: remove namespace wrapper (matches generated style)
- lakefile: use submodules glob for full library discovery
28 of 61 library modules now build successfully (up from 0).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
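A minimal sketch of the BEq bridge idea, so that `==` works for any type with an Eq0 instance. The method name isEqual and the type Colour are assumptions for illustration; the generated class may declare different or additional methods.

```lean
-- Stand-in for the renamed Lem Eq class (method name assumed).
class Eq0 (a : Type) where
  isEqual : a → a → Bool

-- Bridge: any Eq0 instance also provides Lean's BEq.
instance {a : Type} [Eq0 a] : BEq a where
  beq := Eq0.isEqual

inductive Colour where
  | Red
  | Green

instance : Eq0 Colour where
  isEqual x y :=
    match x, y with
    | .Red, .Red => true
    | .Green, .Green => true
    | _, _ => false

-- `==` resolves through the bridge instance.
#eval Colour.Red == Colour.Green
```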
…g, cpp support
- Add explicit `: Type` annotation on all non-indexed inductives to prevent Lean auto-inferring Prop (Sort 0) for single-constructor mutual types
- Fix multi-clause mutual function naming to use const_ref_to_name (avoids definition/reference name mismatch e.g. test44 vs test440)
- Generate SetType/Eq0/Ord0 instances for all inductive types; skip for Type 1 (heterogeneous mutual blocks) since those classes require Type
- Auto-import LemLib.Pervasives_extra when Pervasives is imported, for bridge instances (NumAdd -> Add, etc.)
- Include transitive namespace opens in auxiliary files (Lean open is file-local, not exported to importers)
- Add MapKeyType compare method to BEq bridge derivation
- Add isInequal target_rep (!=) for basic_classes
- Add One/Zero to lean_constants to avoid stdlib collisions
- Replace removed List.get?/List.get! with listGetOpt/listGetBang wrappers
- Add Ord instance for Prod, set_tc, boolListFromNatural to LemLib runtime
- Update Pervasives_extra stub with Lem numeric class bridges
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ;lean to 13 target group annotations in cmm.lem so the Lean backend can generate output from the same source file used by other backends. Changes are purely additive and don't affect Coq/OCaml/Isabelle output. Also add Lake project files (lakefile.lean, lean-toolchain, lake-manifest) and .gitignore .lake/ globally instead of per-directory. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…pes, constants
Backend fixes for real-world Lem projects (ppcmem-model, cpp):
- Sanitize tab characters in all generated output (Lean 4 forbids tabs)
- Use 'export Type (Ctor1 Ctor2)' instead of 'open Type' for inductives, so constructors are visible in importing files
- Parenthesize match/if/let/fun via shared needs_parens helper, applied consistently in function args, if-conditions, and case arm bodies
- Fix indreln type signatures to apply target reps (e.g. set -> List)
- Resolve wildcards in fun_pattern P_typ to concrete types
- Handle unit literal in fun_pattern as (_ : Unit)
- Extract is_library_module predicate for OpenImportTarget
- Expand lean_constants from 129 to 262 entries covering all Init types, typeclasses, and common functions (id, flip, cast, guard, etc.)
- Add gen_lean_constants.lean script for regenerating the list
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
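The `export Type (Ctor1 Ctor2)` form can be illustrated with a short sketch; Colour, Red, Green and favourite are hypothetical names. Unlike `open`, which only affects the current file, `export` creates aliases that importing files also see.

```lean
inductive Colour where
  | Red
  | Green

-- Make the constructors usable unqualified, here and in files that import this one.
export Colour (Red Green)

def favourite : Colour := Red
```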
- natLnot: panic instead of returning 0 (NOT undefined for Nat)
- naturalOfString: panic on invalid input instead of returning 0
- THE: panic instead of returning none (Hilbert choice not computable)
- rationalNumerator/Denominator: panic (rationals not supported)
- realSqrt/Floor/Ceiling: panic (reals not supported)
- Add Nat bitwise ops (natLand, natLor, natLxor, natLsl, natLsr, natAsr)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
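A sketch of the "panic instead of returning 0" pattern, using the core `String.toNat?` parser. The name naturalOfStringSketch and the message text are hypothetical; the LemLib implementation may parse differently.

```lean
-- Fail loudly on bad input instead of silently producing 0.
def naturalOfStringSketch (s : String) : Nat :=
  match s.toNat? with
  | some n => n
  | none => panic! s!"naturalOfString: invalid input '{s}'"
```

`panic!` needs an Inhabited instance for the result type, which Nat already has.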
…compilation
- Class method constants: emit @method (Type) _ for bare class methods so Lean can resolve implicit type parameters (fixes Machine_word.lean)
- Standalone BEq instances without [Inhabited] constraint, separate from Ord which requires Inhabited for sorry bodies
- Termination annotations: use try_termination_proof (like Coq/Isabelle) to emit def instead of partial def when termination is provable
- Multi-discriminant match: decompose tuple scrutinees for termination checker visibility (match l1, l2 with instead of match (l1, l2) with)
- Library namespace qualification: push namespace before processing so auxiliary file opens get qualified names (Lem_Basic_classes.Eq0)
- Bridge instances moved to LemLib/Bridges.lean (survives make lean-libs)
- Auto-import LemLib.Bridges for non-library modules
- Makefile cleanup: remove auxiliary files after lean-libs generation
- Target reps: genlist, last, nat bitwise ops, int32 bitwise ops
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
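The multi-discriminant form mentioned above looks like this in a hand-written sketch; addPairwise is a hypothetical function. Matching on `l1, l2` directly lets Lean see that the recursive call is on sub-terms of the arguments.

```lean
-- `match l1, l2 with` rather than `match (l1, l2) with`.
def addPairwise (l1 l2 : List Nat) : List Nat :=
  match l1, l2 with
  | x :: xs, y :: ys => (x + y) :: addPairwise xs ys
  | _, _ => []
```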
- Cross-module name collision: rename_top_level.ml includes ALL env type
names in Lean constant-avoid set, not just local ones (fixes thread_trans
type/indreln collision across modules)
- Record literal type ascription: add (({ ... } : Type)) annotation using
exp_to_typ so Lean can resolve record types without context
- setChoose replaces sorry target rep: Set_extra.choose now maps to a real
function in LemLib instead of bare sorry (which can't be applied as fn)
- Propositional equality in indreln: lean_prop_equality flag makes isEqual
output = (Eq) instead of == (BEq) in antecedents; functions lack BEq
- Indreln renamed name output: uses constant_descr_to_name instead of raw
AST name, so renames like thread_trans -> thread_trans0 are reflected
- deriving BEq, Ord: simple types (non-mutual, no fn-typed args) use Lean's
deriving instead of sorry-based instances; adds [BEq a] [Ord a] constraints
on downstream SetType/Eq0/Ord0 instances for parameterized types
- Dynamic library namespace list: replaces hardcoded core_lib_ns with
computation from module environment (e_env), detecting library modules by
Coq rename presence
- String.mk -> String.ofList: fixes deprecation warning in string.lem
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix bug in type_def_indexed: types with 0 params in a heterogeneous mutual block were emitted as Type instead of Type 1, causing a Lean universe mismatch error. All types in such blocks now consistently use Type 1.
New test files:
- test_case_arm_nesting.lem: match/if/let/fun in case arms, as function args, in if-conditions, in list/tuple constructors (42 assertions)
- test_termination.lem: declare termination_argument, multi-discriminant match with 2 and 3 scrutinees, partial def fallback (15 assertions)
Enhanced existing tests:
- test_pattern_edge_cases.lem: n+k patterns (fib, pred, classify), unit in tuple/let patterns (13 new assertions)
- test_indreln.lem: inequality, nested fn application, ordering, and multi-rule relations in antecedents (4 new relations)
- test_mutual_recursion.lem: heterogeneous param counts (caught the Type 1 bug), 3-way mutual recursion
- test_audit_regressions.lem: tabs in comments, type/record defs
Total: 31 tests, 231 assertions, all passing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Honest accounting of every gap: 942 sorry stubs in Machine_word, wrong floating-point types, missing overflow semantics, 18 partial defs, incomplete target rep coverage, indreln != edge case.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…annotations
Propositional equality in indreln antecedents now handles both the Infix
AST path (direct = / <> syntax) and the App AST path (Lem's <>
decomposition to not(isEqual x y)). Extracted check_beq_target_rep
helper to share logic between both cases. Added regression tests using
(nat -> nat) types which lack BEq and would fail without the fix.
Added {lean}-scoped termination annotations for 10 structurally recursive
library functions (map_tr, count_map, splitAtAcc, mapMaybe, mapiAux,
catMaybes, init, stringFromListAux, concat, integerOfStringHelper),
reducing partial def count from 18 to 8.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…FixedPoint
Convert 5 more partial defs to total:
- LemLib.lean: boolListFromNatural (n/2 division), bitSeqBinopAux (dual-list recursion)
- LemLib.lean: lemStringFromNatHelper, lemStringFromNaturalHelper (n/10 division)
- LemLib.lean: lemLeastFixedPoint (bounded countdown)
Add Lean-only target reps in string_extra.lem and set.lem to route generated code through the total LemLib implementations. All changes are inherently Lean-scoped (declare lean target_rep / hand-written Lean).
Add TODO #7: audit all pre-existing unscoped termination annotations from upstream to verify they don't affect other backends.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
stringCompare in string_extra.lem always returned EQ (marked XXX: broken).
Added lean-specific inline: let inline {lean} stringCompare = defaultCompare.
This fixes stringLess, stringLessEq, stringGreater, stringGreaterEq, and
the Ord0 String instance.
Added 5 string comparison test assertions to prevent regression.
Updated TODO.md based on audit:
- #2 (numeric sorry stubs): non-issue, all inside block comments
- #8 (missing target reps): resolved, Lean has 288 vs Coq's 260
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Traced try_termination_proof through backend.ml — unscoped annotations
are intentionally universal (affect Coq, HOL, Isabelle, Lean). Pre-existing
upstream annotations have worked for years. Our branch additions are all
{lean} scoped. No changes needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously these 4 types silently mapped to Int, producing semantically wrong results (e.g., rationalFromFrac 1 3 = 0 via integer division). Now they map to LemRational/LemReal/LemFloat64/LemFloat32, opaque structure types in LemLib.lean where every operation panics with a clear error message. This ensures misuse is caught immediately at runtime.
Changes:
- LemLib.lean: 4 new types with full panicking instances (Add, Sub, Mul, Div, Neg, HPow, BEq, Ord, Min, Max, OfNat, Inhabited) + 15 wrapper functions for target reps that can't use infix operators
- library/num.lem: 4 type target reps, 14 function target reps updated, 2 new target reps for rationalFromFrac/realFromFrac (Lean-only changes)
- Also reduces duplicate Int typeclass instances (partial fix for TODO #5)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously int32 and int64 both mapped to bare Int, causing duplicate typeclass instances with int/integer (all four were Int). Now they are structure wrappers (LemInt32/LemInt64) with forwarding instances for arithmetic, comparison, conversion, and bitwise operations. Same semantics as Coq's Z mapping but type-safe. Updated ppcmem bitwiseCompatibility.lem shift target reps to use lemInt32ToNat instead of Int.toNat (which expects bare Int). All tests pass: 31 comprehensive, 11 backend, ppcmem (43 jobs), cpp (34 jobs). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
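A minimal sketch of a structure wrapper with forwarding instances. Only Add is shown; the field name val and the body of lemInt32ToNat are assumptions, since the commit names the type and the conversion function but not their definitions.

```lean
-- A distinct type around Int, so instances for LemInt32 no longer
-- collide with those for int/integer.
structure LemInt32 where
  val : Int
  deriving BEq, Inhabited

-- Forwarding instance: delegate to Int, with no overflow wrapping
-- (matching the unbounded semantics described for Coq's Z mapping).
instance : Add LemInt32 where
  add x y := ⟨x.val + y.val⟩

-- Conversion helper used where bare Int.toNat no longer type-checks.
def lemInt32ToNat (x : LemInt32) : Nat := x.val.toNat
```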
Map Lem's mword phantom type to Lean 4's BitVec via TYR_subst. The type declaration `mword 'a = BitVec (@Size.size 'a _)` replaces 942 sorry stubs with real BitVec operations.
Key changes:
- library/machine_word.lem: Add Lean target reps for all 36 mword operations (arithmetic, bitwise, shifts, rotates, comparisons, bit access, width ops, concat/extract/update, hex, bitlist conversion)
- lean-lib/LemLib.lean: 30 thin wrapper functions bridging Lem calling conventions to Lean 4 BitVec API
- src/lean_backend.ml: TYR_subst constraint propagation, which walks Lem types to discover implicit [Size a] constraints that TYR_subst introduces; deferred abbrev emission for forward references; shared helpers for constraint extraction and formatting
- tests/comprehensive/test_mword.lem: 57 assert-based tests covering all operations, verified at runtime during lake build (not just type-checking)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
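For a feel of what "thin wrapper functions" over BitVec might look like, here is a sketch at a fixed width of 8 bits. Word8, wordAdd and wordLsl are hypothetical names; the real wrappers are width-polymorphic via the Size class and are not shown in this commit message.

```lean
abbrev Word8 := BitVec 8

-- Arithmetic wraps modulo 2^8, as expected for machine words.
def wordAdd (x y : Word8) : Word8 := x + y

-- Left shift by a natural-number amount.
def wordLsl (x : Word8) (n : Nat) : Word8 := x <<< n

#eval wordAdd 0xFF#8 1#8
#eval wordLsl 0x01#8 3
```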
Opaque types like ty1..ty4096 and itself are zero-constructor inductives that exist only to carry type-level information (bit widths via Size). They are uninhabitable by design. Generating sorry-based Inhabited/BEq/Ord instances for them was unsound and produced 942 compiler warnings. Filter out Te_opaque types in generate_default_values and generate_default_values_mutual before instance generation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Unify fun_pattern/def_pattern into single pattern ~style function with FunParam | MatchArm discriminator (~80 lines saved)
- Extract tnvar_to_string/tnvar_to_variable helpers (6+ call sites)
- Rename shadowed variables in clauses function
- Fix mutual indreln: add mutual/end wrapping, per-relation inductive keyword
- Add make lean-tests target (full 6-stage test suite)
- Add coq_exps_test to backend tests (12/12 now)
- New comprehensive tests: test_class_instance_constraints, test_pattern_complex, test_mutual_indreln, test_set_comprehension_advanced (36 total, 251+ assertions)
- Remove TODO.md from tracking, add to .gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- backend_lean.md: LemLib namespace, termination_argument override, mutual indreln, machine words section, BEq+Ord derivation, export vs open, cross-module renaming
- own_lem_files.md: add Lean to termination_argument example
- Makefile: update test count comment (36 tests, 251+ assertions)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…straints
Lem's default_instance declarations were silently dropped for the Lean backend because def_trans.ml wrapped them in Comment nodes for all targets.
Fix: pass default instances through to the Lean backend (def_trans.ml), emit them as instance (priority := low) (lean_backend.ml), and add Lean-native typeclass constraints that the method bodies require:
- Eq0 default gets [BEq a] (body uses ==)
- SetType default gets [Ord a] (body uses defaultCompare)
The other two defaults (OrdMaxMin, MapKeyType) already carry sufficient Lem-level constraints. This extends the extra_constraints_for_tyr_subst pattern for function target rep constraints in default instances.
Also adds 5 new comprehensive test files (41 total, all passing).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
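A sketch of a low-priority default instance carrying the extra [BEq a] constraint. The class here is a simplified stand-in with an assumed isEqual method; the generated Eq0 class may have more methods.

```lean
class Eq0 (a : Type) where
  isEqual : a → a → Bool

-- Default instance: used only when no more specific Eq0 instance applies.
-- The body uses ==, so it needs [BEq a].
instance (priority := low) {a : Type} [BEq a] : Eq0 a where
  isEqual x y := x == y

-- Nat has BEq, so the default instance is found.
#eval Eq0.isEqual (3 : Nat) 3
```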
- Fix VectorAcc/VectorSub missing space between expression and index
- Fix \0 string escape to \x00 (Lean 4 doesn't support \0)
- Add coverage script (scripts/lean_coverage.sh) using bisect_ppx
- Add tests: polymorphic multi-clause function, module rename, direct isInequal in indreln, vector access, string escapes, target-filtered definitions (rec, indreln, val, lemma)
- Coverage: 82.56% on lean_backend.ml (1946/2357)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bug #20: Let_def path now emits one def per bound name using local let destructuring (def x : T := let PAT := EXPR; x) instead of invalid def PATTERN := EXPR syntax.
Bug #21: Polymorphic indreln premises now include explicit type parameters via lean_indreln_params ref. Self-references in antecedents get the type params injected (e.g., poly_mem a xs x).
New test: test_let_def_destructuring.lem (8 assertions: pairs, triples, nested tuples). Extended test_indreln.lem with poly_mem and isInequal cases.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
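The one-def-per-bound-name shape can be sketched for a hypothetical Lem definition `let (x, y) = source`. The names source, x and y are illustrative; this mirrors the `def x : T := let PAT := EXPR; x` form quoted above rather than actual backend output.

```lean
def source : Nat × Nat := (1, 2)

-- One top-level def per name bound by the pattern; each re-destructures
-- the right-hand side locally and returns its own component.
def x : Nat := let (x, _) := source; x
def y : Nat := let (_, y) := source; y
```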
Add Vector.slice using Vector.ofFn + Array.extract/getD. Lem's v.[i..j] syntax now fully works end-to-end. Test: 2 new runtime-verified assertions in test_vectors.lem confirm slice correctness. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New indreln tests exercise previously uncovered paths:
- apply_pred: function-typed argument (nat -> bool) triggers indreln_typ Typ_fn Bool→Prop conversion
- pair_rel: tuple-typed index exercises indreln_typ Typ_tup path
lean_backend.ml coverage: 84.04% → 84.25% (2017 → 2022 points)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove 4 dead/unreachable code paths identified by bisect_ppx coverage, and guard a fifth:
- typ function (~30 lines): entirely unused, never called
- typ_ident_to_output: only caller was dead typ
- lean_function_application_to_output: never called wrapper
- Let_fun branch in let_body: pattern compilation eliminates before backend
- Te_variant in generate_default_value_texp: replaced with explicit unreachable guard (caller handles Te_variant before dispatch)
Coverage improves from 84.25% to 86.63% (same 2022 covered spans, 66 fewer total spans). Net -41 lines.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Te_opaque, Te_abbrev, and Te_record are never reached in tyexp: def dispatches abbreviations and records to dedicated handlers, and type_def_variant handles Te_opaque before calling tyexp. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace fragile List.nth with pattern matching on args list (line 1174). Replace deprecated Hashtbl.find/Not_found with Hashtbl.find_opt (line 822). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ate gitignore
- Remove root-level test_lean.lem / test_lean2.lem (development scratch files)
- Commit examples/ppcmem-model/{lakefile.lean, lean-toolchain, lake-manifest.json}
(needed for lake build, matching cpp and lean-test which are already committed)
- Add gitignore entries for: generated lean-lib/LemLib/*.lean,
generated examples/ppcmem-model/*.lean, _build/, main.native,
coverage-report/, .coverage-switch/
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests that types, constructors, polymorphic types, and constants defined in one user .lem file can be imported and used by another. Exercises: auxiliary file transitive opens, skip-open for user modules, cross-module name collision handling, dynamic library namespace list. 5 runtime-verified assertions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In non-library (user) modules, library namespace opens were emitted twice: inline by OpenImportTarget during body processing, and again by transitive_opens in the preamble. Skip inline opens for library imports in user modules since transitive_opens handles them for both main and auxiliary files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- rename_top_level.ml: Tc_class renaming only for Lean target; other backends skip class entries (preserves pre-existing behavior)
- target_trans.ml: class names added to avoid set only for Lean target
- output.ml: revert block token type from Meta_utf8 back to Kwd, preserving spacing semantics for Coq/HOL/Isabelle while keeping the UTF-8 encoding fix (of_latin1 -> of_string)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace silent emp catch-all in def with explicit Declaration/Lemma arms
- Replace unreachable OpenImportTarget emp with error guard
- Rename shadowed variable skips -> skips_out in P_wild pattern handler
- Change setChoose empty-set case from sorry to panic! (consistent with rest of LemLib's error handling style)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
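To illustrate the difference the setChoose change makes, here is a minimal sketch using a list as a stand-in for a set. chooseSketch is a hypothetical function, not LemLib's actual setChoose; the representation and error message are assumptions.

```lean
-- Hypothetical stand-in for a "choose an element" operation.
-- `panic!` requires `Inhabited α`: after reporting the message it falls
-- back to the default value, so the definition is complete.
-- Using `sorry` here would instead leave an unproved hole in the library.
def chooseSketch {α : Type} [Inhabited α] (xs : List α) : α :=
  match xs with
  | x :: _ => x
  | [] => panic! "chooseSketch: empty set"

#eval chooseSketch [3, 1, 2]  -- 3
```

The Inhabited constraint fits the instance : Inhabited T generation mentioned in the earlier commits, since user-defined types then always have a default available.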
Section headers and design rationale for: expression rendering, type definitions, instance generation, import/namespace management, indreln clauses, multi-clause grouping, and the target_trans pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The script only tests the Lean backend, so the name should reflect that. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Add a new backend targeting Lean 4, allowing Lem definitions to be exported as Lean 4 code. This brings the set of supported targets to: OCaml, Coq, HOL4, Isabelle/HOL, Lean 4, LaTeX, and HTML.
The backend is structurally modelled on the Coq backend, adapted for Lean 4 syntax and semantics. Tested against Lean 4.28.0 via the Lake build system.
What's included
- src/lean_backend.ml (~2400 lines): main backend translating Lem AST to Lean 4 syntax
- lean-lib/: LemLib Lean 4 runtime library (~720 lines): sets, maps, comparisons, numeric utilities, BitVec operations for machine words
- library/lean_constants: Lean 4 reserved words and typeclass names
- Target_lean checks, see "Shared code safety" below)
- doc/manual/backend_lean.md and updates to README.md
- make lean-libs target generating Lean library files from Lem's standard library
- scripts/lean_coverage.sh: bisect_ppx coverage script used to identify and close untested codepaths (~87% line coverage on lean_backend.ml)
Test suite
- BitVec), cross-module imports, and more
- examples/cpp/: C++ concurrency model (Cmm.lean, ~1930 generated lines, 34 Lake jobs, 0 errors)
- examples/ppcmem-model/: PowerPC memory model (10 .lem files, 10/10 compile, 43 Lake jobs, 0 errors)
Notable design decisions
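As a small illustration of two decisions in this section (exporting constructors after each inductive, and deriving BEq and Ord for simple types), here is a hand-written sketch. The type Colour and the helper isRed are hypothetical, and the exact text the backend emits may differ.

```lean
-- Hypothetical simple type: non-mutual, no function-typed constructor
-- arguments, so BEq and Ord can be derived automatically.
inductive Colour where
  | Red
  | Green
  | Blue
  deriving BEq, Ord

-- `open Colour` would only affect the current file; `export` makes the
-- short constructor names visible to modules that import this one too.
export Colour (Red Green Blue)

-- Constructors can now be written without the `Colour.` prefix.
def isRed (c : Colour) : Bool := c == Red

#eval isRed Red            -- true
#eval compare Green Blue   -- uses the derived Ord instance
```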
- Meta_utf8 variant in output.ml for correct encoding of →, ×, ∀, ∃
- export TypeName (Ctor1 Ctor2 ...) after each inductive: Lean's open is file-local, so export is needed for importers to see constructors
- partial def unless a declare {lean} termination_argument = automatic annotation is present; 10 library functions were annotated total, and 3 more redirect to total LemLib wrappers. Only 2 genuinely partial defs remain (unfoldr, leastFixedPointUnbounded)
- deriving BEq, Ord: auto-derived for simple types (non-mutual, no function-typed constructor args); sorry-based instances as fallback for mutual types (Lean's deriving doesn't support mutual inductives)
- mword maps to Lean's BitVec with 36 operations implemented in LemLib; int32/int64 use distinct newtype wrappers (LemInt32/LemInt64)
- rename_top_level.ml seeds constant renaming with type names to handle collisions. Cross-module names included via full env.t_env scan
- lean_constants includes Add, Sub, Mul, etc. to prevent clashes with Lean stdlib
- == (BEq) converted to = (Prop) in inductive relation antecedents, handling both Lem AST decomposition paths
- is_lean_pattern_match in patterns.ml triggers guard-based desugaring instead of unsupported n+k patterns
Shared code safety
Changes to shared source files are guarded to only affect the Lean backend:
- typed_ast_syntax.ml: class path collection in used_types, only consumed by Lean-specific code in rename_top_level.ml
- rename_top_level.ml: Tc_class renaming returns early for non-Lean targets
- target_trans.ml: class names added to avoid set only when target is Target_lean
- output.ml: block token type kept as Kwd (preserves Coq/HOL/Isabelle spacing); UTF-8 encoding fix (of_string) is universal but correct
- target_binding.ml: e_env fallback only fires when primary lookup fails (low risk, all targets)
Verified: all 6 non-Lean backends (OCaml, Coq, HOL, Isabelle, HTML, LaTeX) produce byte-for-byte identical output on all test files compared to master. Zero regressions.
Known limitations (won't fix)
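For context on why a function like unfoldr stays partial, here is a minimal sketch of an unfold in Lean 4. unfoldrSketch is a hypothetical name and the signature is an assumption modelled on the usual list unfold, not a copy of the generated library code.

```lean
-- Termination depends entirely on the caller-supplied `step` function:
-- if it never returns `none`, the recursion never ends. Lean therefore
-- cannot accept this as a total `def` without extra evidence.
partial def unfoldrSketch {α σ : Type}
    (step : σ → Option (α × σ)) (seed : σ) : List α :=
  match step seed with
  | none => []
  | some (x, next) => x :: unfoldrSketch step next

-- A step function that stops once the counter reaches 3.
#eval unfoldrSketch (fun (n : Nat) => if n < 3 then some (n, n + 1) else none) 0
-- [0, 1, 2]
```

A partial def is opaque to the kernel, so it can be run but not unfolded in proofs. That is the price of not requiring a termination argument.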
- partial def in generated library: unfoldr (user-supplied termination), leastFixedPointUnbounded (iterates until fixpoint). Correctly partial.
- sorry-based instances for mutual types: Lean deriving can't handle mutual inductives. Documented with /- mutual type -/ comments.
- nat/natural both map to Nat, int/integer both to Int. Inherent to Lem's design, same in all backends.
- BEq instances: from Eq0, SetType, MapKeyType (three paths to BEq a). Lem typeclass design tradeoff.
- coq_backend_skips.lem: non-positive inductive, a fundamental Lean restriction.
Test plan
- make -C src: compiler builds cleanly
- lake build succeeds (57 Lake jobs)
- lake build in lean-lib/: LemLib compiles (33 Lake jobs)
- make lean-libs: all library files generated successfully
- examples/cpp/: Cmm.lean generates and compiles (34 Lake jobs)
- examples/ppcmem-model/: 10/10 files generate and compile (43 Lake jobs)
🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>