feat: add Malli function schema bridge for verified functions#2
Conversation
Introduce ansatz.malli namespace that automatically converts CIC type signatures from a/defn into Malli function schemas (e.g. Nat becomes [:and :int [:>= 0]], (List Bool) becomes [:sequential :boolean]). When *malli?* is bound to true, a/defn registers the generated schema with Malli's registry so that mi/instrument! can add runtime input/output validation — giving both compile-time proof and runtime contract checking on the same function. Malli is an optional dependency loaded lazily via requiring-resolve; the defn macro degrades gracefully if it is absent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Thanks for this — the direction is right and the effort is clear. Before this can merge I want to lay out the design precisely, because I think the integration can be significantly cleaner and more valuable. I'm leaving this open for you to iterate on.
Core constraint: zero changes to core.clj
The defn macro is the heart of the system. Remove *malli?*, remove the requiring-resolve call inside the macro, and remove the docstring amendment. Everything Malli-related must live entirely within ansatz.malli.
Registration should be provided by a wrapper macro in ansatz.malli itself:
;; Users opt in explicitly — no core coupling
(defmacro defn [fn-name params ret-type & body]
`(do
(a/defn ~fn-name ~params ~ret-type ~@body)
(register-schema! '~(ns-name *ns*) '~fn-name)))Or bulk instrumentation after the fact:
(am/instrument-all! @a/ansatz-env)Either way, the user makes an explicit choice to bring Malli in. The core stays clean.
Dependency
{:local/root "../malli"} cannot be merged — it breaks anyone who doesn't have Malli checked out next to Ansatz. Use a proper coordinate under a dedicated :malli alias (not :test), so it's clearly optional:
:malli {:extra-paths ["src"]
:extra-deps {metosin/malli {:mvn/version "0.16.4"}}}The type mapping — this is the hard part
This is where the current PR needs the most work, and it needs to be thought through carefully because the mapping is not straightforward. Here is a complete design spec.
Ground truth: what actually reaches the Clojure runtime
The key insight is that the schema must reflect what compute-arity says about the function, not the raw CIC type. The existing compute-arity function (core.clj:114) is your oracle:
(loop [t fn-type explicit 0 erased 0 in-prefix true]
(if (e/forall? t)
(let [bi (e/forall-info t)]
(if (= :default bi)
(recur (e/forall-body t) (inc explicit) erased false)
(recur (e/forall-body t) explicit ...)))
{:arity explicit :erased erased}))Only :default binder parameters survive to the Clojure runtime. Everything else — :implicit type parameters, :inst-implicit type class instances, universe parameters — is either synthesized at elaboration time or erased during compilation. Your fn-schema walker must use exactly this same binder-info logic, not just walk all foralls.
The current PR's fn-schema already does this correctly — but it's important to understand why, so the rest of the mapping is done consistently.
Category 1 — Clean scalar mappings
These are unambiguous. The only note is that the information loss on Nat should be documented explicitly:
| CIC type | Malli schema | Note |
|---|---|---|
Nat |
[:and :int [:>= 0]] |
Lean Nat is unbounded; Clojure int is bounded to Long/MAX_VALUE. Intentionally lossy — document this. |
Int |
:int |
|
Bool |
:boolean |
|
String |
:string |
|
Float / Real |
:double |
|
Unit |
:nil |
Compiles to nil at runtime |
Category 2 — Structural types (recursive mapping, information loss expected)
These are expressible but the mapping loses any dependent constraints. This is fine and expected — document it.
| CIC type | Malli schema |
|---|---|
List α |
[:sequential (type->malli α)] |
Array α |
[:vector (type->malli α)] |
Option α |
[:maybe (type->malli α)] |
Prod α β |
[:tuple (type->malli α) (type->malli β)] |
Sum α β |
[:or (type->malli α) (type->malli β)] |
User-defined inductives and structures are the hard case. At runtime:
-
Structures (
a/structure) compile to Clojuredefrecordinstances. The schema should be[:map [:field1 schema1] [:field2 schema2] ...]. To generate this, look up the structure instructure-registryand read the field types from the constructor's CIC type in the env. The field types are themselves CIC types that need recursive mapping. -
Enumerations (inductives with all nullary constructors, e.g.
(a/inductive Color [] (red) (green) (blue))) compile to integer indices (0, 1, 2, ...) at runtime. The schema should be[:enum 0 1 2]— though keyword variants might be a richer alternative worth discussing. -
General inductives (mixed nullary + n-ary constructors) compile to
[constructor-index & fields]tagged vectors. There is no clean Malli schema for these;:anywith a warning is the right default until a richer[:multi ...]representation is worked out.
The implication: type->malli needs access to the Ansatz environment and structure-registry, not just the type expression. Thread these through the opts map you already have.
Category 3 — Erased at runtime (never include in schema)
These never appear in the compiled Clojure function. If encountered while walking the forall telescope, they must be skipped silently (exactly as compute-arity does):
- Proof parameters: any
:defaultparameter whose type isProp-sorted. The current PR misses this case. A function like(a/defn safe-div [n :- Nat, m :- Nat, h :- (< 0 m)] Nat ...)hashwith:defaultbinder-info, but its type is a proposition. At runtimehreceives an opaque proof term — it has no meaningful Malli schema. Detect viae/sort?on the parameter type, and check the sort level isProp(level zero). Skip these parameters. - Type class instances:
:inst-implicit— already handled correctly. - Universe parameters:
:implicitwhere the type isSort _— already handled.
Category 4 — Dependent / indexed types (the hard problem — be honest about limitations)
Malli has no concept of dependent types. Do not try to solve this now. The correct approach for each case:
-
Vector α n(fixed-length): The indexnis type-level — it is not passed to the compiled function at runtime. The runtime value is just a sequential collection. Schema:[:sequential (type->malli α)]. The length constraint is guaranteed by the kernel; it does not need runtime re-checking. -
Fin n(bounded natural): The boundnlives in the type, not in the runtime value. Schema:[:and :int [:>= 0]]. -
Subtype P/{x : α // P x}: At runtime, only the witnessx : αis present — the proof ofP xis erased. Schema:(type->malli α). -
Matrix m n α,ZMod n, etc.::anywith a logged warning.
The general rule: for dependent types, produce the schema for the underlying carrier type and document that the dependent constraint is enforced by the kernel, not at runtime. This is correct semantics, not a limitation — the proof system already guarantees the constraint.
Category 5 — Mathlib abstract algebra (:any for now)
GroupHom, RingHom, LinearMap, Polynomial, ENNReal, etc. are opaque mathematical structures with no meaningful Malli schema. Return :any with a warning. The right extensibility mechanism for the future is a user-registered dispatch table:
(am/register-type-mapping! "Finset"
(fn [args opts] [:set (type->malli (first args) opts)]))Design this hook now even if the built-in table is small.
What to do at unknown boundaries
Never silently return :any. Return :any and emit a warning at registration time so the user knows their schema is approximate:
(println "WARN ansatz.malli: no schema for" type-name "— using :any")Edit: This should probably have a dynamic var to turn warnings on or off.
Priority: what to build first
1. Generative testing — this is the unique value. No verified-function system has this. malli.generator can generate random inputs from CIC types and property-test the compiled function against them. This tests the ansatz->clj compilation step, which is the least formally verified part of the pipeline:
(am/check-verified 'double {:iterations 1000})
;; => nil if passes, {:schema ... :input [...] :output ...} on failureImplement using malli.generator/function-checker. This should be the centrepiece.
2. Boundary instrumentation. am/instrument! / am/unstrument! for guarding verified functions against unverified callers. Off by default.
3. Schema inspection. am/fn-schema-for as a REPL convenience. Already mostly there.
Test fixes
-
test-malli-flag-auto-registercallsdefine-verified(the function) then manually callsregister-from-defn!— it never exercises the macro auto-registration path it claims to test. Rewrite against the wrapper macro once*malli?*is removed from core. -
The test fixture that resets
ansatz-envshould useenv/fork+try/finallyto avoid contaminating other tests. Seetest-isort-preservationincore_test.cljas the model. -
Tests for the type mapping should cover all five categories above, including explicit tests for unknown-boundary behaviour (
:any+ warning logged).
Happy to answer questions on any of this. The idea is solid — I just want the implementation to be as rigorous as the rest of the system.
Introduce ansatz.malli namespace that automatically converts CIC type signatures from a/defn into Malli function schemas (e.g. Nat becomes [:and :int [:>= 0]], (List Bool) becomes [:sequential :boolean]).
When malli? is bound to true, a/defn registers the generated schema with Malli's registry so that mi/instrument! can add runtime input/output validation — giving both compile-time proof and runtime contract checking on the same function. Malli is an optional dependency loaded lazily via requiring-resolve; the defn macro degrades gracefully if it is absent.