From e84fda2ad1049b8547ff25523f123f525a89fe68 Mon Sep 17 00:00:00 2001 From: Mikael Mayer Date: Mon, 19 Jan 2026 09:19:53 +0100 Subject: [PATCH 1/2] Add Performant Ordering Variable Numbering specification This spec proposes migrating Strata's variable representation to Performant Ordering - a unified variable numbering scheme using globally unique natural numbers. Key changes: - All variables (lambda-bound and imperative) use unique Nat identifiers - Eliminates De Bruijn index shifting, weakening, and lifting - Replaces old() construct with dual identifiers in modifies clauses - Introduces NextFree structure to track fresh variable generation - Leverages existing HasVars typeclass infrastructure Files: - requirements.md: 13 requirements with acceptance criteria - design.md: Complete technical design and architecture - tasks.md: Detailed implementation plan across 6 phases --- .../performant-ordering-variables/design.md | 912 +++++++++ .../requirements.md | 256 +++ .../specification.md | 1684 +++++++++++++++++ .../performant-ordering-variables/tasks.md | 445 +++++ 4 files changed, 3297 insertions(+) create mode 100644 .kiro/specs/performant-ordering-variables/design.md create mode 100644 .kiro/specs/performant-ordering-variables/requirements.md create mode 100644 .kiro/specs/performant-ordering-variables/specification.md create mode 100644 .kiro/specs/performant-ordering-variables/tasks.md diff --git a/.kiro/specs/performant-ordering-variables/design.md b/.kiro/specs/performant-ordering-variables/design.md new file mode 100644 index 000000000..4811fae77 --- /dev/null +++ b/.kiro/specs/performant-ordering-variables/design.md @@ -0,0 +1,912 @@ +# Performant Ordering Variable Numbering - Design + +## Architecture + +### High-Level Overview + +``` +┌─────────────────────────────────────────────────────────────┐ +│ NextFree Structure │ +│ ┌────────────────────────────────────────────────────────┐ │ +│ │ structure NextFree (α : Type) where │ │ +│ │ content : α │ │ +│ │ nextFree : Nat │ │ +│ └────────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ + │ + ├─────────────────────────────────┐ + │ │ + ▼ ▼ +┌──────────────────────────────────────┐ ┌──────────────────────────────────┐ +│ Lambda Calculus Layer │ │ Imperative Layer │ +│ ┌────────────────────────────────┐ │ │ ┌────────────────────────────┐ │ +│ │ LExpr: │ │ │ │ Cmd: │ │ +│ │ bvar (index: Nat) │ │ │ │ init (name: String) │ │ +│ │ abs (param: Nat) (body) │ │ │ │ set (var: Nat) (expr) │ │ +│ │ app (fn) (arg) │ │ │ │ havoc (var: Nat) │ │ +│ │ fvar (index: Nat) │ │ │ └────────────────────────────┘ │ +│ └────────────────────────────────┘ │ └──────────────────────────────────┘ +└──────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Fresh Variable Monad │ +│ ┌────────────────────────────────────────────────────────┐ │ +│ │ def FreshM (α : Type) : Type := StateM Nat α │ │ +│ │ │ │ +│ │ def freshVar : FreshM Nat := do │ │ +│ │ let n ← get │ │ +│ │ set (n + 1) │ │ +│ │ return n │ │ +│ └────────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ +``` + +### Key Architectural Principles + +1. **Unified Numbering**: All variables (bvar, fvar, parameters, locals) share the same numbering space +2. **Global Uniqueness**: Each variable number is used exactly once across the entire program +3. **Monotonic Counter**: nextFree only increases, never decreases +4. **Stable Identity**: Variable numbers never change during transformations +5. **Display Separation**: Variable numbers (for semantics) are separate from display names (for humans) + +### Component Relationships + +``` +NextFree + ├── content (AST with variable numbers) + └── nextFree (fresh variable counter) + +VarContext (for display/debugging) + └── Map Nat (String × Type) -- number → (display name, type) + +FreshM Monad (for transformations) + └── StateM Nat -- threads nextFree through operations + +Evaluation Environment + └── Map Nat Value -- number → runtime value + +Type Environment + └── Map Nat Type -- number → type +``` + +--- + +## Design Details + +### 1. NextFree Structure + +All program entry points (expressions, statements, procedures, etc.) are wrapped in a `NextFree` structure: + +```lean +-- Generic typeclass for types that contain variables +-- This abstracts over HasVarsPure and HasVarsImp +class HasVars (α : Type) where + vars : α → List Nat + +structure NextFree (α : Type) [HasVars α] where + content : α + nextFree : Nat + +namespace NextFree + +def empty (content : α) : NextFree α := + ⟨content, 0⟩ + +def withFreshCounter (content : α) (nextFree : Nat) : NextFree α := + ⟨content, nextFree⟩ + +-- Primary invariant: all variables in content are below nextFree +def wellFormed (p : NextFree α) : Prop := + ∀ v ∈ HasVars.vars p.content, v < p.nextFree + +-- Alternative formulation: nextFree and above are not in the variable list +def wellFormed' (p : NextFree α) : Prop := + ∀ v ≥ p.nextFree, v ∉ HasVars.vars p.content + +-- These are equivalent +theorem wellFormed_iff_wellFormed' (p : NextFree α) : + p.wellFormed ↔ p.wellFormed' := by + constructor + · intro h v hge + by_contra hin + have : v < p.nextFree := h v hin + omega + · intro h v hin + by_contra hge + have : v ∉ HasVars.vars p.content := h v (Nat.le_of_not_lt hge) + contradiction + +end NextFree +``` + +**Integration with Existing HasVars Typeclasses**: + +The existing typeclasses in `Strata/DL/Imperative/HasVars.lean` are: +- `HasVarsPure P α` - provides `getVars : α → List P.Ident` +- `HasVarsImp P α` - provides `definedVars`, `modifiedVars`, `touchedVars : α → List P.Ident` +- `HasVarsTrans P α PT` - for procedure-aware variable lookup + +Since `P.Ident` will become `Nat` as part of this migration, we add a generic `HasVars` typeclass that abstracts over both pure and imperative cases: + +```lean +-- Generic HasVars typeclass (add to Strata/DL/Imperative/HasVars.lean or new file) +class HasVars (α : Type) where + vars : α → List Nat + +-- Default instance for types with HasVarsPure +instance [HasVarsPure P α] : HasVars α where + vars := HasVarsPure.getVars + +-- Default instance for types with HasVarsImp +instance [HasVarsImp P α] : HasVars α where + vars := HasVarsImp.touchedVars +``` + +This way, `NextFree` works uniformly for both expressions (via `HasVarsPure`) and imperative constructs (via `HasVarsImp`) without needing to know which one is being used. + +### 2. Fresh Variable Monad + +```lean +def FreshM (α : Type) : Type := StateM Nat α + +namespace FreshM + +def freshVar : FreshM Nat := do + let n ← get + set (n + 1) + return n + +def freshVars (count : Nat) : FreshM (List Nat) := do + let mut vars := [] + for _ in [0:count] do + vars := (← freshVar) :: vars + return vars.reverse + +def run (m : FreshM α) (initialNextFree : Nat) : α × Nat := + m.run initialNextFree + +def runProgram (m : FreshM α) (p : NextFree β) : NextFree α := + let (content, nextFree) := m.run p.nextFree + ⟨content, nextFree⟩ + +end FreshM +``` + +**Key Lemma**: +```lean +theorem freshVar_is_fresh (s : Nat) : + let (n, s') := freshVar.run s + n = s ∧ s' = s + 1 ∧ n < s' +``` + +### 3. Lambda Calculus with Performant Ordering + +#### LExpr Definition + +```lean +inductive LExpr (T : LExprParamsT) : Type where + | bvar (m : T.base.Metadata) (index : Nat) (ty : Option T.TypeType) + | fvar (m : T.base.Metadata) (index : Nat) (ty : Option T.TypeType) + | abs (m : T.base.Metadata) (param : Nat) (name : String) (body : LExpr T) (ty : Option T.TypeType) + | app (m : T.base.Metadata) (fn : LExpr T) (arg : LExpr T) (ty : Option T.TypeType) + | quant (m : T.base.Metadata) (kind : QuantKind) (var : Nat) (name : String) + (body : LExpr T) (ty : Option T.TypeType) + -- ... other constructors +``` + +**Key Changes**: +- `bvar` uses only `Nat` for unique identity (no string stored) +- `fvar` uses only `Nat` for unique identity (no string stored) +- `abs` stores parameter number + `name` for the parameter name +- `quant` stores variable number + `name` for the bound variable + +**Design Principle**: +1. **Variable references** (`bvar`, `fvar`) store only the semantic identity (Nat) +2. **Variable declarations** (`abs`, `quant`) store both identity (Nat) and display name (String) +3. **VarContext** provides display names for variable references during pretty printing + +#### Substitution (Simplified) + +```lean +def substitute (e : LExpr T) (var : Nat) (replacement : LExpr T) : LExpr T := + match e with + | .bvar m index ty => if index == var then replacement else e + | .fvar m index ty => if index == var then replacement else e + | .abs m param name body ty => + .abs m param name (substitute body var replacement) ty + | .app m fn arg ty => + .app m (substitute fn var replacement) (substitute arg var replacement) ty + | .quant m kind qvar name body ty => + .quant m kind qvar name (substitute body var replacement) ty + -- ... other cases +``` + +**No index shifting needed!** Simple replacement throughout. + +#### Beta Reduction (Simplified) + +```lean +def betaReduce (e : LExpr T) : LExpr T := + match e with + | .app _ (.abs _ param body _) arg _ => + substitute body param arg + | _ => e +``` + +**No index adjustment!** Just substitute the parameter number with the argument. + +#### Alpha Conversion (Simplified) + +```lean +def alphaConvert (abs : LExpr T) : FreshM (LExpr T) := + match abs with + | .abs m param name body ty => do + let newParam ← freshVar + let newBody := substitute body param (.bvar m newParam ty) + return .abs m newParam name newBody ty + | _ => return abs +``` + +**No index adjustment!** Just generate a fresh number and substitute. + +### 4. Imperative Layer with Performant Ordering + +#### Cmd Definition + +```lean +inductive Cmd (P : PureExpr) : Type where + | init (hint : String) (var : Nat) (ty : P.Ty) (e : Option P.Expr) (md : MetaData P := .empty) + | set (var : Nat) (e : P.Expr) (md : MetaData P := .empty) + | havoc (var : Nat) (md : MetaData P := .empty) + | assert (label : String) (b : P.Expr) (md : MetaData P := .empty) + | assume (label : String) (b : P.Expr) (md : MetaData P := .empty) +``` + +**Key Points**: +- `init` stores both hint (String) for display AND var (Nat) for semantic identity +- `init` has optional expression: `some e` for initialized, `none` for havoc +- `set` and `havoc` reference variables by number only (hint is in VarContext) +- Variable numbers are assigned when `init` is executed +- **All constructors have `md : MetaData P` parameter** with default `.empty` + +**Rationale for Optional Init**: +- `init "x" 42 ty (some e)` - declares variable 42 with initial value e +- `init "x" 42 ty none` - declares variable 42 with havoc/uninitialized value +- Eliminates pattern of `init` followed by immediate `havoc` +- More concise and clearer intent + +**Design Principle**: +- Declaration nodes (`init`, `abs`, `quant`) store hints directly +- Reference nodes (`bvar`, `fvar`, `set`, `havoc`) use numbers only +- VarContext maintains the mapping for display purposes +- Metadata is preserved throughout all operations + +#### Variable Context for Display + +```lean +structure VarContext (P : PureExpr) where + vars : Map Nat (String × P.Ty) -- number → (display name, type) + declOrder : List Nat -- variables in declaration order + +namespace VarContext + +def empty : VarContext P := ⟨Map.empty, []⟩ + +def insert (ctx : VarContext P) (num : Nat) (name : String) (ty : P.Ty) : VarContext P := + ⟨ctx.vars.insert num (name, ty), ctx.declOrder ++ [num]⟩ + +def lookup (ctx : VarContext P) (num : Nat) : Option (String × P.Ty) := + ctx.vars.find? num + +def lookupName (ctx : VarContext P) (num : Nat) : Option String := + (ctx.lookup num).map (·.1) + +def lookupType (ctx : VarContext P) (num : Nat) : Option P.Ty := + (ctx.lookup num).map (·.2) + +-- Get unambiguous display name for a variable +-- If multiple variables share the same name, use @N suffix where N is the shadowing count +def getDisplayName (ctx : VarContext P) (varNum : Nat) : String := + match ctx.lookupName varNum with + | none => s!"@{varNum}" -- Unknown variable, show raw number with @ + | some name => + if name.isEmpty then s!"@{varNum}" + else + -- Find position of this variable in declaration order + match ctx.declOrder.indexOf varNum with + | none => s!"@{varNum}" -- Not in declaration order (shouldn't happen) + | some pos => + -- Count how many variables with the same name were declared AFTER this one + let laterVarsWithSameName := + ctx.declOrder.drop (pos + 1) + |>.filter (fun v => ctx.lookupName v == some name) + |>.length + + if laterVarsWithSameName == 0 then + name -- No shadowing, use plain name + else + s!"{name}@{laterVarsWithSameName}" -- Shadowed, add suffix + +-- Initialize VarContext with global variables +def fromGlobals (globals : List (String × P.Ty)) : FreshM (VarContext P) := do + let mut ctx := empty + for (name, ty) in globals do + let varNum ← freshVar + ctx := ctx.insert varNum name ty + return ctx + +end VarContext +``` + +**Shadowing Example**: + +```lean +-- Source code with shadowing: +init x := 3 -- Gets number 10 +init x := 2 -- Gets number 11 (shadows 10) +assert x < (previous x) -- 11 < 10 + +-- VarContext state: +vars: {10 → ("x", int), 11 → ("x", int)} +declOrder: [10, 11] + +-- Display names: +getDisplayName ctx 10 = "x@1" -- 1 variable named x declared after this +getDisplayName ctx 11 = "x" -- 0 variables named x declared after this (most recent) + +-- Pretty printed output: +assert x < x@1 +``` + +**Key Points**: +- `declOrder` tracks variables in the order they were declared +- `getDisplayName` counts how many variables with the same name were declared AFTER the target variable +- The most recent variable with a given name gets the plain name (no suffix) +- Earlier variables get `@N` suffix where N is the shadowing count +- This makes it clear which variable is which in the presence of shadowing + +#### Evaluation with Performant Ordering + +```lean +def Cmd.eval [EC : EvalContext P S] (σ : S) (c : Cmd P) : Cmd P × S := + match c with + | .init name ty (some e) md => do + let varNum := EC.nextVar σ + let e := EC.eval σ e + let σ := EC.addVar σ varNum ty e + let σ := EC.incrementNextVar σ + return (.init name ty (some e) md, σ) + + | .init name ty none md => do + let varNum := EC.nextVar σ + let (e, σ) := EC.genFreeVar σ ty + let σ := EC.addVar σ varNum ty e + let σ := EC.incrementNextVar σ + return (.init name ty none md, σ) + + | .set varNum e md => + match EC.lookupVar σ varNum with + | none => return (c, EC.updateError σ (.VarNotFound varNum)) + | some (ty, _) => + let e := EC.eval σ e + let σ := EC.updateVar σ varNum ty e + return (.set varNum e md, σ) + + | .havoc varNum md => + match EC.lookupVar σ varNum with + | none => return (c, EC.updateError σ (.VarNotFound varNum)) + | some (ty, _) => + let (e, σ) := EC.genFreeVar σ ty + let σ := EC.updateVar σ varNum ty e + return (.havoc varNum md, σ) +``` + +### 5. Pretty Printing + +```lean +def formatVar (ctx : VarContext P) (varNum : Nat) : String := + ctx.getDisplayName varNum + +def formatCmd (ctx : VarContext P) (c : Cmd P) : String := + match c with + | .init name ty (some e) _ => + s!"init ({name} : {ty}) := {formatExpr ctx e}" + | .init name ty none _ => + s!"init ({name} : {ty})" + | .set varNum e _ => + s!"{formatVar ctx varNum} := {formatExpr ctx e}" + | .havoc varNum _ => + s!"havoc {formatVar ctx varNum}" + | .assert label b _ => + s!"assert {label}: {formatExpr ctx b}" + | .assume label b _ => + s!"assume {label}: {formatExpr ctx b}" +``` + +**Example with Shadowing**: + +```lean +-- AST: +init "x" 10 int (some 3) +init "x" 11 int (some 2) +assert "check" (11 < 10) + +-- Pretty printed (with VarContext): +init (x : int) := 3 +init (x : int) := 2 +assert check: x < x@1 + +-- The most recent 'x' (variable 11) displays as 'x' +-- The earlier 'x' (variable 10) displays as 'x@1' (1 shadowing variable after it) +``` + +### 6. SMT Encoding + +```lean +structure SMTContext where + varNames : Map Nat String -- variable number → unique SMT name + nameCounters : Map String Nat -- base name → occurrence count + +namespace SMTContext + +def empty : SMTContext := ⟨Map.empty, Map.empty⟩ + +def addVar (ctx : SMTContext) (varNum : Nat) (baseName : String) : SMTContext × String := + let counter := ctx.nameCounters.findD baseName 0 + let smtName := if counter == 0 then baseName else s!"{baseName}@{counter}" + let ctx' := { + varNames := ctx.varNames.insert varNum smtName, + nameCounters := ctx.nameCounters.insert baseName (counter + 1) + } + (ctx', smtName) + +def lookupVar (ctx : SMTContext) (varNum : Nat) : Option String := + ctx.varNames.find? varNum + +end SMTContext +``` + +### 7. Key Invariants + +#### Primary Invariant: All Variables Below nextFree + +```lean +def NextFree.wellFormed (p : NextFree α) : Prop := + ∀ v ∈ HasVars.vars p.content, v < p.nextFree +``` + +**Theorem**: All well-formed programs satisfy this invariant. + +**Alternative Formulation**: No variable at or above nextFree appears in content + +```lean +def NextFree.wellFormed' (p : NextFree α) : Prop := + ∀ v ≥ p.nextFree, v ∉ HasVars.vars p.content +``` + +**Theorem**: These formulations are equivalent. + +#### Well-formedness Invariant: All Uses Have Declarations + +```lean +def NextFree.allUsesHaveDecls [HasVarsImp P α] (p : NextFree α) : Prop := + ∀ use ∈ HasVarsImp.modifiedVars p.content, + use ∈ HasVarsImp.definedVars p.content +``` + +**Theorem**: Type checking ensures this invariant. + +### 8. Handling Source-Level Shadowing + +**Problem**: Source languages (Boogie, C_Simp) allow shadowing, but Performant Ordering makes it impossible. + +**Solution**: During parsing/translation, assign fresh numbers to all variables, even if they have the same source name. + +#### Example: Boogie Source with Shadowing + +```boogie +procedure Example() { + var x: int; + x := 1; + { + var x: int; // Shadows outer x + x := 2; // Refers to inner x + } + assert x == 1; // Refers to outer x +} +``` + +#### Translation to Performant Ordering + +```lean +-- During translation, maintain a scope stack +-- Outer scope: x → 42 +-- Inner scope: x → 73 + +init "x" 42 int (some 1) -- Outer x gets number 42 +set 42 (const 1) +-- Enter inner scope +init "x" 73 int (some 2) -- Inner x gets number 73 (fresh!) +set 73 (const 2) -- Refers to 73 +-- Exit inner scope +assert "outer_x" (42 == 1) -- Refers to 42 +``` + +#### Scope Stack During Translation + +```lean +structure ScopeStack where + scopes : List (Map String Nat) -- Stack of name → number mappings + +namespace ScopeStack + +def empty : ScopeStack := ⟨[Map.empty]⟩ + +def pushScope (s : ScopeStack) : ScopeStack := + ⟨Map.empty :: s.scopes⟩ + +def popScope (s : ScopeStack) : ScopeStack := + ⟨s.scopes.tail⟩ + +def lookup (s : ScopeStack) (name : String) : Option Nat := + -- Search from innermost to outermost scope + s.scopes.findSome? (fun scope => scope.find? name) + +def insert (s : ScopeStack) (name : String) (num : Nat) : ScopeStack := + match s.scopes with + | [] => ⟨[Map.empty.insert name num]⟩ + | scope :: rest => ⟨scope.insert name num :: rest⟩ + +end ScopeStack +``` + +#### Translation Algorithm + +```lean +def translateStmt (stmt : BoogieStmt) : FreshM (ScopeStack → Cmd × ScopeStack) := + match stmt with + | .varDecl name ty init => + fun scopes => do + let varNum ← freshVar + let scopes' := scopes.insert name varNum + let cmd := .init name varNum ty (translateExpr init scopes) + return (cmd, scopes') + + | .assign name expr => + fun scopes => + match scopes.lookup name with + | none => error s!"Variable {name} not found" + | some varNum => + let cmd := .set varNum (translateExpr expr scopes) + return (cmd, scopes) + + | .block stmts => + fun scopes => do + let scopes' := scopes.pushScope + let (cmds, scopes'') ← translateStmts stmts scopes' + let scopes''' := scopes''.popScope + return (.block cmds, scopes''') +``` + +**Key Insight**: Source-level shadowing is resolved during translation by assigning different numbers to variables with the same name. The Performant Ordering representation has no shadowing - each variable has a unique number. + +### 9. Names vs Indices Design Principle + +**Core Principle**: Commands use indices for implementation, specifications use names for user interface. + +#### When to Use Names + +1. **Variable declarations** (`init` hint parameter) - For display and debugging +2. **Procedure specifications** (`modifies` clauses) - Users write names +3. **Error messages** - Display names to users +4. **Pretty printing** - Show names to users +5. **Source code** - Users write names + +#### When to Use Indices + +1. **Variable references** (`set`, `havoc`, `bvar`, `fvar`) - For semantics +2. **modifiedVars return value** - Returns list of indices +3. **Evaluation** - Look up values by index +4. **Type checking** - Look up types by index +5. **Transformations** - Track variables by index + +#### Resolution Pattern + +When specifications (names) need to be validated against implementation (indices): + +```lean +-- Pattern 1: Validate procedure modifies clause +def validateModifies (proc : Procedure) (body : Cmd) (ctx : VarContext) : Bool := + let modifiedIndices := body.modifiedVars + let modifiedNames := modifiedIndices.filterMap (ctx.lookupName ·) + modifiedNames.all (fun name => name ∈ proc.modifies ∨ name ∈ proc.outputs) + +-- Pattern 2: Generate havoc statements for loop elimination +def generateLoopHavocs (body : Cmd) (ctx : VarContext) : FreshM (List Cmd) := do + let modifiedIndices := body.modifiedVars + return modifiedIndices.map (fun idx => .havoc idx) + +-- Pattern 3: Error messages with names +def formatError (err : Error) (ctx : VarContext) : String := + match err with + | .VarNotFound idx => + match ctx.lookupName idx with + | some name => s!"Variable '{name}' (#{idx}) not found" + | none => s!"Variable #{idx} not found" +``` + +### 10. modifiedVars and definedVars + +**Critical Feature**: Track which variables are modified by commands. + +#### modifiedVars Implementation + +```lean +def Cmd.modifiedVars (c : Cmd P) : List Nat := + match c with + | .init _ _ _ _ _ => [] + | .set var _ _ => [var] + | .havoc var _ => [var] + | .assert _ _ _ => [] + | .assume _ _ _ => [] + +def Cmds.modifiedVars (cs : List (Cmd P)) : List Nat := + cs.flatMap Cmd.modifiedVars +``` + +**Key Points**: +- Returns **indices**, not names +- `init` returns empty list (declares, doesn't modify) +- `set` and `havoc` return the modified variable's index +- Consumers resolve indices to names when needed + +#### definedVars Implementation + +```lean +def Cmd.definedVars (c : Cmd P) : List String := + match c with + | .init hint _ _ _ _ => [hint] + | .set _ _ _ => [] + | .havoc _ _ => [] + | .assert _ _ _ => [] + | .assume _ _ _ => [] +``` + +**Key Points**: +- Returns **names** (hints), not indices +- Only `init` defines new variables +- Used for tracking variable declarations + +### 11. Global Variables and VarContext Initialization + +**Problem**: Global variables need to be accessible in all procedures. + +**Solution**: Initialize VarContext with globals before translating procedures. + +#### Global Variable Initialization + +```lean +structure VarContext (P : PureExpr) where + vars : Map Nat (String × P.Ty) -- number → (hint, type) + +namespace VarContext + +def empty : VarContext P := ⟨Map.empty⟩ + +def insert (ctx : VarContext P) (num : Nat) (hint : String) (ty : P.Ty) : VarContext P := + ⟨ctx.vars.insert num (hint, ty)⟩ + +def lookup (ctx : VarContext P) (num : Nat) : Option (String × P.Ty) := + ctx.vars.find? num + +def lookupName (ctx : VarContext P) (num : Nat) : Option String := + (ctx.lookup num).map (·.1) + +def lookupType (ctx : VarContext P) (num : Nat) : Option P.Ty := + (ctx.lookup num).map (·.2) + +-- Initialize VarContext with global variables +def fromGlobals (globals : List (String × P.Ty)) : FreshM (VarContext P) := do + let mut ctx := empty + for (name, ty) in globals do + let varNum ← freshVar + ctx := ctx.insert varNum name ty + return ctx + +end VarContext +``` + +#### Program Translation with Globals + +```lean +def translateProgram (prog : BoogieProgram) : FreshM StrataProgram := do + -- Step 1: Assign numbers to globals + let globalVars := prog.globals.map (fun g => (g.name, g.type)) + let ctx ← VarContext.fromGlobals globalVars + + -- Step 2: Translate procedures with global context + let procedures ← prog.procedures.mapM (translateProcedure ctx) + + return { + globals := globalVars, + procedures := procedures + } +``` + +**Key Insight**: Globals get fresh numbers first, then procedures can reference them by those numbers. + +### 12. Elimination of old() Construct + +**Problem**: The `old()` construct in postconditions creates a special case for referencing pre-state values, which is inconsistent with the uniform variable numbering approach. + +**Solution**: Each variable in the `modifies` clause carries two identifiers - one for the current value and one for the old (pre-state) value. The semantics and VarContext use these identifiers to distinguish between current and old values. + +**Important Note**: Currently, `old()` is only used for global variables in postconditions. Input parameters do not use `old()` since they are immutable in procedures. + +#### Current Approach (with old()) + +```boogie +var g: int; + +procedure Increment() + modifies g; + ensures g == old(g) + 1; +{ + g := g + 1; +} +``` + +#### New Approach (with dual identifiers) + +```lean +-- Modifies clause carries both current and old identifiers +structure ModifiesEntry where + name : String -- Display name (e.g., "g") + currentVar : Nat -- Variable number for current value + oldVar : Nat -- Variable number for old/pre-state value + +-- During procedure translation +def translateProcedure (proc : BoogieProcedure) (globals : List (String × Nat × Type)) : FreshM Procedure := do + -- For each modified global, generate both current and old identifiers + let modifiesEntries ← proc.modifies.mapM (fun globalName => do + match globals.find? (fun (name, _, _) => name == globalName) with + | some (name, currentVar, ty) => + let oldVar ← freshVar + return ModifiesEntry.mk name currentVar oldVar + | none => error s!"Global {globalName} not found") + + -- Translate postconditions using old identifiers + let ensures := proc.ensures.map (fun e => + replaceOldReferences e modifiesEntries) + + return { + inputs := proc.inputs, + outputs := proc.outputs, + modifies := modifiesEntries, -- Carries both current and old identifiers + requires := proc.requires, + ensures := ensures, + body := translateBody proc.body + } + +-- Replace old(g) with reference to old identifier +def replaceOldReferences (expr : Expr) (modifiesEntries : List ModifiesEntry) : Expr := + match expr with + | .old (.fvar m currentVar ty) => + -- Find the old identifier for this global + match modifiesEntries.find? (fun entry => entry.currentVar == currentVar) with + | some entry => .fvar m entry.oldVar ty + | none => expr -- Not a modified global, keep as-is + | _ => expr.map (replaceOldReferences · modifiesEntries) + +-- VarContext tracks both current and old names +def VarContext.insertModifiesEntry (ctx : VarContext) (entry : ModifiesEntry) (ty : Type) : VarContext := + ctx + |> .insert entry.currentVar entry.name ty + |> .insert entry.oldVar ("old " ++ entry.name) ty +``` + +#### Example Translation + +**Boogie source:** +```boogie +var g: int; + +procedure Increment() + modifies g; + ensures g == old(g) + 1; +{ + g := g + 1; +} +``` + +**Translated to Performant Ordering:** +```lean +-- Assume global g was assigned number N during global initialization +-- Then the old value identifier gets a fresh number M (where M > N) + +procedure Increment + inputs: [] + outputs: [] + modifies: [ + { name: "g", currentVar: N, oldVar: M } + ] + requires: [] + ensures: [N == M + 1] -- g == old g + 1 (using old identifier M) + body: + set N (fvar N + 1) -- g := g + 1 (body only uses current identifier N) +``` + +**VarContext entries:** +```lean +N → ("g", int) +M → ("old g", int) +``` + +**Important**: The procedure body only references variable N. Variable M is only used in postconditions where `old(g)` appeared in the source. The actual numbers N and M depend on when variables were assigned during program initialization - M is simply a fresh number generated when processing the modifies clause. + +**Key Points**: +- `old(g)` is replaced with a reference to variable 11 (the old identifier) +- The procedure body only uses variable 10 (the current value) for normal operations +- Variable 11 is ONLY used when translating `old()` expressions in contracts +- No explicit init statements or pre-state copies in the procedure +- The semantics layer is responsible for capturing pre-state values at procedure entry +- Each modified global has two identifiers in the modifies clause +- VarContext uses "old " prefix for old identifiers +- No transformation from procedure to procedure - just identifier assignment + +#### Semantics Handling + +The semantics layer handles the dual identifiers: + +```lean +-- At procedure entry, the semantics captures pre-state values +def Procedure.eval (proc : Procedure) (state : State) : State := + -- Step 1: Capture old values for modified globals + let state := proc.modifies.foldl (fun s entry => + match s.lookup entry.currentVar with + | some value => s.insert entry.oldVar value + | none => s + ) state + + -- Step 2: Execute procedure body + let state := proc.body.eval state + + -- Step 3: Check postconditions (can reference both current and old) + ... +``` + +#### Benefits + +1. **No transformation**: Procedures don't need to be transformed - just identifier assignment +2. **Uniformity**: All variable references use the same mechanism +3. **Simplicity**: No special case for `old()` in the AST after translation +4. **Clarity**: Pre-state values are explicit identifiers, not implicit constructs +5. **Semantic responsibility**: The semantics layer handles pre-state capture, not the translation +6. **Future-proof**: Supports "old " prefix convention for potential future syntax + +#### Implementation Strategy + +1. During procedure translation, identify all modified globals (from `modifies` clause) +2. For each modified global, generate a fresh identifier for the old value +3. Store both current and old identifiers in the `modifies` clause +4. Replace all `old()` references with old identifier references +5. Update VarContext to use "old " prefix for old identifiers +6. Update semantics to capture pre-state values at procedure entry +7. Remove `old()` constructor from expression AST + +--- + +## Key Differences from De Bruijn Indices + +| Aspect | De Bruijn Indices | Performant Ordering | +|--------|-------------------|---------------------| +| **Variable Identity** | Relative (distance from binder) | Absolute (unique number) | +| **Substitution** | Requires index shifting | Simple replacement | +| **Beta Reduction** | Requires index adjustment | Simple substitution | +| **Alpha Conversion** | Requires index renormalization | Generate fresh number | +| **Shadowing** | Possible (same relative index) | Impossible (unique numbers) | +| **Transformation** | Variables change identity | Variables stable | +| **Proof Complexity** | High (shifting lemmas) | Low (freshness lemmas) | +| **Canonical Form** | Yes (unique representation) | No (alpha-equivalent terms differ) | +| **Strata Fit** | Poor (transformation-heavy) | Excellent (transformation-friendly) | diff --git a/.kiro/specs/performant-ordering-variables/requirements.md b/.kiro/specs/performant-ordering-variables/requirements.md new file mode 100644 index 000000000..e2d79654e --- /dev/null +++ b/.kiro/specs/performant-ordering-variables/requirements.md @@ -0,0 +1,256 @@ +# Performant Ordering Variable Numbering - Requirements + +## Executive Summary + +This specification describes the migration of Strata's variable representation from the current system to **Performant Ordering** - a unified variable numbering scheme using globally unique natural numbers. This change affects both lambda calculus (bound variables) and imperative layer (free variables), replacing all relative indexing schemes with a single, transformation-friendly approach. + +**Key Change:** All variables (lambda-bound and imperative-declared) are identified by unique natural numbers assigned from a global counter, with no relative counting or context-dependent indexing. + +## Current State + +Strata currently uses different variable representation schemes across its layers: +- **Lambda calculus**: Uses De Bruijn indices for bound variables (bvar) - relative counting from binding site +- **Imperative layer**: Uses string-based identifiers for free variables (fvar) +- **Mixed approach**: Leads to complex index shifting, weakening, and lifting operations + +## Problems with Current Approach + +1. **Complexity**: Index shifting logic is error-prone and difficult to prove correct +2. **Transformation friction**: Variables change identity during transformations +3. **Dual reasoning**: Different proof strategies for lambda-bound vs imperative variables +4. **Boundary cases**: Complex interactions between bvar and fvar +5. **Canonicity overhead**: De Bruijn requires normalization, but Strata prioritizes semantic equivalence + +## Proposed Solution: Performant Ordering + +**Core Concept**: All variables are identified by globally unique natural numbers assigned from a monotonically increasing counter. + +**Benefits**: +- **Simplicity**: No index shifting, no relative counting +- **Stability**: Variables maintain identity across all transformations +- **Unified reasoning**: Single proof strategy for all variables +- **Transformation-friendly**: Fresh variable generation is trivial and local +- **Strata-optimized**: Perfect for multi-stage transformation pipelines + +**Trade-off Accepted**: Loss of canonical representation (alpha-equivalent terms may differ syntactically), but semantic equivalence is what matters for verification. + +## Glossary + +- **Performant Ordering**: Variable numbering scheme using globally unique natural numbers +- **nextFree**: Counter representing the next available fresh variable number +- **NextFree α**: Generic structure containing content of type α and a nextFree counter +- **FreshM**: State monad for generating fresh variable numbers +- **bvar**: Bound variable (lambda-bound, quantifier-bound) - now uses Performant Ordering +- **fvar**: Free variable (imperative-declared) - now uses Performant Ordering +- **LExpr**: Lambda expressions in the Lambda dialect +- **Cmd**: Commands in the Imperative dialect +- **VarContext**: Mapping from variable numbers to metadata (name, type) for display/debugging +- **Unified numbering**: Single numbering space shared by both bvar and fvar + +--- + +## Requirements + +### Requirement 1: Unified Variable Representation + +**User Story**: As a language designer, I want all variables to use a single numbering scheme, so that there's no distinction between lambda-bound and imperative variable numbering. + +#### Acceptance Criteria + +1.1. WHEN a variable is represented in the AST, THE System SHALL use a natural number as its unique identifier + +1.2. WHEN a lambda parameter is declared, THE System SHALL assign it a unique number from nextFree + +1.3. WHEN an imperative variable is declared, THE System SHALL assign it a unique number from nextFree + +1.4. WHEN a quantifier binds a variable, THE System SHALL assign it a unique number from nextFree + +1.5. WHERE variables are used, THE System SHALL reference them by their assigned number + +### Requirement 2: NextFree Structure with Fresh Counter + +**User Story**: As a compiler developer, I want programs to track the next available variable number, so that fresh variable generation is guaranteed to be unique. + +#### Acceptance Criteria + +2.1. WHEN a NextFree is defined, THE System SHALL include a nextFree field of type Nat + +2.2. WHEN a NextFree is created, THE System SHALL initialize nextFree to 0 (or to the next available number if continuing from a previous context) + +2.3. WHEN a fresh variable is needed, THE System SHALL use the current nextFree value and increment it + +2.4. WHERE multiple types of variables exist (lambda-bound, imperative, quantifier-bound), THE System SHALL use the same nextFree counter for all of them + +2.5. WHERE all variables in a program, THE System SHALL maintain the invariant that all variable numbers are strictly less than nextFree + +### Requirement 3: Fresh Variable Generation + +**User Story**: As a transformation developer, I want to generate fresh variables using a state monad, so that uniqueness is guaranteed by construction. + +#### Acceptance Criteria + +3.1. WHEN generating a fresh variable, THE System SHALL provide a FreshM monad that returns the current nextFree and increments it + +3.2. WHEN multiple fresh variables are needed, THE System SHALL thread the state through all generations + +3.3. WHEN a transformation completes, THE System SHALL return an updated NextFree with the new nextFree value + +3.4. WHERE fresh variable generation occurs, THE System SHALL guarantee that the returned number was not previously used in the program + +### Requirement 4: Lambda Calculus Operations + +**User Story**: As a lambda calculus implementer, I want substitution and beta reduction to work without index shifting, so that operations are simpler and proofs are easier. + +#### Acceptance Criteria + +4.1. WHEN performing beta reduction (λ x₄₂. body) arg, THE System SHALL substitute all occurrences of variable 42 in body with arg, without any index adjustment + +4.2. WHEN performing substitution body[x₄₂ := arg], THE System SHALL replace all occurrences of variable 42 with arg, without shifting any other variables + +4.3. WHEN performing alpha conversion, THE System SHALL generate a fresh variable number and replace the old parameter number with the new one, without any index adjustment + +4.4. WHERE lambda expressions are evaluated, THE System SHALL NOT perform any index shifting, weakening, or lifting operations + +### Requirement 5: Imperative Operations + +**User Story**: As an imperative language implementer, I want variable declarations and assignments to use unique numbers, so that shadowing is impossible and variable tracking is unambiguous. + +#### Acceptance Criteria + +5.1. WHEN an init command declares a variable, THE System SHALL assign it a unique number from nextFree + +5.2. WHEN a set command modifies a variable, THE System SHALL reference it by its unique number + +5.3. WHEN a havoc command randomizes a variable, THE System SHALL reference it by its unique number + +5.4. WHERE multiple variables with the same display name exist, THE System SHALL distinguish them by their unique numbers + +5.5. WHEN inlining procedures, THE System SHALL generate fresh numbers for all parameters and locals + +### Requirement 6: Pretty Printing and Display + +**User Story**: As a developer debugging programs, I want variable numbers to be resolved to unambiguous readable names, so that output is human-friendly and clear even with shadowing. + +#### Acceptance Criteria + +6.1. WHEN displaying a variable, THE System SHALL resolve its number to a display name using VarContext + +6.2. WHEN a variable number cannot be resolved, THE System SHALL display it as @N (e.g., @42) + +6.3. WHEN multiple variables share the same display name, THE System SHALL disambiguate using @N suffix where N is the shadowing count + +6.4. WHERE a variable is the most recent with its name, THE System SHALL display it with the plain name (no suffix) + +6.5. WHERE a variable is shadowed by N later variables with the same name, THE System SHALL display it as name@N + +6.6. WHEN formatting expressions, THE System SHALL use VarContext to make output readable and unambiguous + +6.7. WHERE VarContext tracks variables, THE System SHALL maintain declaration order to compute shadowing counts + +### Requirement 7: Scope and Shadowing + +**User Story**: As a language designer, I want shadowing to be impossible by construction, so that variable references are always unambiguous. + +#### Acceptance Criteria + +7.1. WHEN a new variable is declared, THE System SHALL assign it a unique number that has never been used + +7.2. WHERE two variables have the same display name, THE System SHALL distinguish them by their unique numbers + +7.3. WHEN a variable goes out of scope, THE System SHALL NOT reuse its number + +7.4. WHERE variable lookup occurs, THE System SHALL use the unique number, not the display name + +### Requirement 8: Transformation Correctness + +**User Story**: As a verification engineer, I want transformations to preserve program semantics, so that verification results are trustworthy. + +#### Acceptance Criteria + +8.1. WHEN a transformation generates fresh variables, THE System SHALL prove that the fresh numbers are not in the original program + +8.2. WHEN a transformation preserves variables, THE System SHALL prove that variable numbers remain unchanged + +8.3. WHEN a transformation performs substitution, THE System SHALL prove that semantics are preserved + +8.4. WHERE transformations compose, THE System SHALL prove that the composition preserves semantics + +8.5. WHEN a transformation completes, THE System SHALL prove that all variables in the result are below the new nextFree + +### Requirement 9: Type System Integration + +**User Story**: As a type system maintainer, I want type checking to work with variable numbers, so that type safety is preserved. + +#### Acceptance Criteria + +9.1. WHEN type checking a variable reference, THE System SHALL look up its type using the variable number + +9.2. WHEN a variable is declared, THE System SHALL associate its number with its type + +9.3. WHEN type checking lambda expressions, THE System SHALL track parameter types by their numbers + +9.4. WHERE type environments are maintained, THE System SHALL map variable numbers to types + +### Requirement 10: Evaluation and Semantics + +**User Story**: As a semantics implementer, I want evaluation to work with variable numbers, so that runtime behavior is correct. + +#### Acceptance Criteria + +10.1. WHEN evaluating a variable reference, THE System SHALL look up its value using the variable number + +10.2. WHEN a variable is assigned, THE System SHALL update the value at its number + +10.3. WHEN evaluating lambda application, THE System SHALL substitute using variable numbers without shifting + +10.4. WHERE evaluation environments are maintained, THE System SHALL map variable numbers to values + +### Requirement 11: SMT Backend Integration + +**User Story**: As a verification backend developer, I want SMT encoding to generate unique variable names, so that verification conditions are correct. + +#### Acceptance Criteria + +11.1. WHEN encoding a variable to SMT, THE System SHALL generate a unique SMT identifier + +11.2. WHEN multiple variables share a display name, THE System SHALL use @N suffixes in SMT (e.g., x, x@1, x@2) + +11.3. WHEN encoding expressions, THE System SHALL resolve variable numbers to SMT identifiers + +11.4. WHERE SMT variable names are generated, THE System SHALL maintain a mapping from variable numbers to SMT identifiers + +### Requirement 12: Elimination of old() Construct + +**User Story**: As a verification engineer, I want to reference pre-state values of global variables using explicit variable identifiers instead of the `old()` construct, so that variable semantics are uniform and clear. + +#### Acceptance Criteria + +12.1. WHEN a procedure postcondition needs to reference a pre-state value of a global variable, THE System SHALL use an explicit variable identifier for the old value + +12.2. WHEN translating procedures with modifies clauses, THE System SHALL generate fresh identifiers for old values of modified globals + +12.3. WHERE the `old()` construct previously existed for globals, THE System SHALL replace it with old identifier references + +12.4. WHEN evaluating postconditions, THE System SHALL look up pre-state values using old identifiers + +12.5. WHERE procedure contracts are specified, THE System SHALL NOT use the `old()` construct for global variables + +12.6. WHEN a global variable is not in the modifies clause, THE System SHALL NOT generate an old identifier for it + +12.7. WHERE the modifies clause is defined, THE System SHALL store both current and old identifiers for each modified global + +12.8. WHEN displaying old identifiers, THE System SHALL use "old " prefix (e.g., "old g") + +### Requirement 13: Compilation and Testing + +**User Story**: As a project maintainer, I want the refactored code to compile and pass all tests, so that the system remains functional. + +#### Acceptance Criteria + +13.1. WHEN all changes are complete, THE System SHALL compile without errors using `lake build` + +13.2. WHEN tests are run, THE System SHALL pass all existing tests using `lake test` + +13.3. WHERE proofs are affected, THE System SHALL restore all proofs to working state + +13.4. WHEN the migration is complete, THE System SHALL have no remaining `sorry` placeholders diff --git a/.kiro/specs/performant-ordering-variables/specification.md b/.kiro/specs/performant-ordering-variables/specification.md new file mode 100644 index 000000000..4d247c918 --- /dev/null +++ b/.kiro/specs/performant-ordering-variables/specification.md @@ -0,0 +1,1684 @@ +# Performant Ordering Variable Numbering Specification + +## Executive Summary + +This specification describes the migration of Strata's variable representation from the current system to **Performant Ordering** - a unified variable numbering scheme using globally unique natural numbers. This change affects both lambda calculus (bound variables) and imperative layer (free variables), replacing all relative indexing schemes with a single, transformation-friendly approach. + +**Key Change:** All variables (lambda-bound and imperative-declared) are identified by unique natural numbers assigned from a global counter, with no relative counting or context-dependent indexing. + +## Table of Contents + +1. [Introduction](#introduction) +2. [Glossary](#glossary) +3. [Requirements](#requirements) +4. [Architecture](#architecture) +5. [Design](#design) +6. [Implementation Strategy](#implementation-strategy) +7. [Testing Strategy](#testing-strategy) +8. [Migration Plan](#migration-plan) + +--- + +## Introduction + +### Current State + +Strata currently uses different variable representation schemes across its layers: +- **Lambda calculus**: Uses De Bruijn indices for bound variables (bvar) - relative counting from binding site +- **Imperative layer**: Uses string-based identifiers for free variables (fvar) +- **Mixed approach**: Leads to complex index shifting, weakening, and lifting operations + +### Problems with Current Approach + +1. **Complexity**: Index shifting logic is error-prone and difficult to prove correct +2. **Transformation friction**: Variables change identity during transformations +3. **Dual reasoning**: Different proof strategies for lambda-bound vs imperative variables +4. **Boundary cases**: Complex interactions between bvar and fvar +5. **Canonicity overhead**: De Bruijn requires normalization, but Strata prioritizes semantic equivalence + +### Proposed Solution: Performant Ordering + +**Core Concept**: All variables are identified by globally unique natural numbers assigned from a monotonically increasing counter. + + +**Benefits**: +- **Simplicity**: No index shifting, no relative counting +- **Stability**: Variables maintain identity across all transformations +- **Unified reasoning**: Single proof strategy for all variables +- **Transformation-friendly**: Fresh variable generation is trivial and local +- **Strata-optimized**: Perfect for multi-stage transformation pipelines + +**Trade-off Accepted**: Loss of canonical representation (alpha-equivalent terms may differ syntactically), but semantic equivalence is what matters for verification. + +--- + +## Glossary + +- **Performant Ordering**: Variable numbering scheme using globally unique natural numbers +- **nextFree**: Counter representing the next available fresh variable number +- **Program α**: Generic program structure containing content of type α and a nextFree counter +- **FreshM**: State monad for generating fresh variable numbers +- **bvar**: Bound variable (lambda-bound, quantifier-bound) - now uses Performant Ordering +- **fvar**: Free variable (imperative-declared) - now uses Performant Ordering +- **LExpr**: Lambda expressions in the Lambda dialect +- **Cmd**: Commands in the Imperative dialect +- **VarContext**: Mapping from variable numbers to metadata (name, type) for display/debugging +- **Unified numbering**: Single numbering space shared by both bvar and fvar + +--- + +## Requirements + +### Requirement 1: Unified Variable Representation + +**User Story**: As a language designer, I want all variables to use a single numbering scheme, so that there's no distinction between lambda-bound and imperative variable numbering. + +#### Acceptance Criteria + +1.1. WHEN a variable is represented in the AST, THE System SHALL use a natural number as its unique identifier + +1.2. WHEN a lambda parameter is declared, THE System SHALL assign it a unique number from nextFree + +1.3. WHEN an imperative variable is declared, THE System SHALL assign it a unique number from nextFree + +1.4. WHEN a quantifier binds a variable, THE System SHALL assign it a unique number from nextFree + +1.5. WHERE variables are used, THE System SHALL reference them by their assigned number + + +### Requirement 2: Program Structure with Fresh Counter + +**User Story**: As a compiler developer, I want programs to track the next available variable number, so that fresh variable generation is guaranteed to be unique. + +#### Acceptance Criteria + +2.1. WHEN a Program is defined, THE System SHALL include a nextFree field of type Nat + +2.2. WHEN a Program is created, THE System SHALL initialize nextFree to 0 (or to the next available number if continuing from a previous context) + +2.3. WHEN a fresh variable is needed, THE System SHALL use the current nextFree value and increment it + +2.4. WHERE multiple types of variables exist (lambda-bound, imperative, quantifier-bound), THE System SHALL use the same nextFree counter for all of them + +2.4. WHERE all variables in a program, THE System SHALL maintain the invariant that all variable numbers are strictly less than nextFree + +### Requirement 3: Fresh Variable Generation + +**User Story**: As a transformation developer, I want to generate fresh variables using a state monad, so that uniqueness is guaranteed by construction. + +#### Acceptance Criteria + +3.1. WHEN generating a fresh variable, THE System SHALL provide a FreshM monad that returns the current nextFree and increments it + +3.2. WHEN multiple fresh variables are needed, THE System SHALL thread the state through all generations + +3.3. WHEN a transformation completes, THE System SHALL return an updated Program with the new nextFree value + +3.4. WHERE fresh variable generation occurs, THE System SHALL guarantee that the returned number was not previously used in the program + +### Requirement 4: Lambda Calculus Operations + +**User Story**: As a lambda calculus implementer, I want substitution and beta reduction to work without index shifting, so that operations are simpler and proofs are easier. + +#### Acceptance Criteria + +4.1. WHEN performing beta reduction (λ x₄₂. body) arg, THE System SHALL substitute all occurrences of variable 42 in body with arg, without any index adjustment + +4.2. WHEN performing substitution body[x₄₂ := arg], THE System SHALL replace all occurrences of variable 42 with arg, without shifting any other variables + +4.3. WHEN performing alpha conversion, THE System SHALL generate a fresh variable number and replace the old parameter number with the new one, without any index adjustment + +4.4. WHERE lambda expressions are evaluated, THE System SHALL NOT perform any index shifting, weakening, or lifting operations + + +### Requirement 5: Imperative Operations + +**User Story**: As an imperative language implementer, I want variable declarations and assignments to use unique numbers, so that shadowing is impossible and variable tracking is unambiguous. + +#### Acceptance Criteria + +5.1. WHEN an init command declares a variable, THE System SHALL assign it a unique number from nextFree + +5.2. WHEN a set command modifies a variable, THE System SHALL reference it by its unique number + +5.3. WHEN a havoc command randomizes a variable, THE System SHALL reference it by its unique number + +5.4. WHERE multiple variables with the same display name exist, THE System SHALL distinguish them by their unique numbers + +5.5. WHEN inlining procedures, THE System SHALL generate fresh numbers for all parameters and locals + +### Requirement 6: Pretty Printing and Display + +**User Story**: As a developer debugging programs, I want variable numbers to be resolved to readable names, so that output is human-friendly. + +#### Acceptance Criteria + +6.1. WHEN displaying a variable, THE System SHALL resolve its number to a display name using VarContext + +6.2. WHEN a variable number cannot be resolved, THE System SHALL display it as %N (e.g., %42) + +6.3. WHEN multiple variables share the same display name, THE System SHALL disambiguate using @N suffix (e.g., x, x@1, x@2) + +6.4. WHERE shadowing does not occur, THE System SHALL display variables with their plain names (no suffix) + +6.5. WHEN formatting expressions, THE System SHALL use VarContext to make output readable + +### Requirement 7: Scope and Shadowing + +**User Story**: As a language designer, I want shadowing to be impossible by construction, so that variable references are always unambiguous. + +#### Acceptance Criteria + +7.1. WHEN a new variable is declared, THE System SHALL assign it a unique number that has never been used + +7.2. WHERE two variables have the same display name, THE System SHALL distinguish them by their unique numbers + +7.3. WHEN a variable goes out of scope, THE System SHALL NOT reuse its number + +7.4. WHERE variable lookup occurs, THE System SHALL use the unique number, not the display name + + +### Requirement 8: Transformation Correctness + +**User Story**: As a verification engineer, I want transformations to preserve program semantics, so that verification results are trustworthy. + +#### Acceptance Criteria + +8.1. WHEN a transformation generates fresh variables, THE System SHALL prove that the fresh numbers are not in the original program + +8.2. WHEN a transformation preserves variables, THE System SHALL prove that variable numbers remain unchanged + +8.3. WHEN a transformation performs substitution, THE System SHALL prove that semantics are preserved + +8.4. WHERE transformations compose, THE System SHALL prove that the composition preserves semantics + +8.5. WHEN a transformation completes, THE System SHALL prove that all variables in the result are below the new nextFree + +### Requirement 9: Type System Integration + +**User Story**: As a type system maintainer, I want type checking to work with variable numbers, so that type safety is preserved. + +#### Acceptance Criteria + +9.1. WHEN type checking a variable reference, THE System SHALL look up its type using the variable number + +9.2. WHEN a variable is declared, THE System SHALL associate its number with its type + +9.3. WHEN type checking lambda expressions, THE System SHALL track parameter types by their numbers + +9.4. WHERE type environments are maintained, THE System SHALL map variable numbers to types + +### Requirement 10: Evaluation and Semantics + +**User Story**: As a semantics implementer, I want evaluation to work with variable numbers, so that runtime behavior is correct. + +#### Acceptance Criteria + +10.1. WHEN evaluating a variable reference, THE System SHALL look up its value using the variable number + +10.2. WHEN a variable is assigned, THE System SHALL update the value at its number + +10.3. WHEN evaluating lambda application, THE System SHALL substitute using variable numbers without shifting + +10.4. WHERE evaluation environments are maintained, THE System SHALL map variable numbers to values + + +### Requirement 11: SMT Backend Integration + +**User Story**: As a verification backend developer, I want SMT encoding to generate unique variable names, so that verification conditions are correct. + +#### Acceptance Criteria + +11.1. WHEN encoding a variable to SMT, THE System SHALL generate a unique SMT identifier + +11.2. WHEN multiple variables share a display name, THE System SHALL use @N suffixes in SMT (e.g., x, x@1, x@2) + +11.3. WHEN encoding expressions, THE System SHALL resolve variable numbers to SMT identifiers + +11.4. WHERE SMT variable names are generated, THE System SHALL maintain a mapping from variable numbers to SMT identifiers + +### Requirement 12: Compilation and Testing + +**User Story**: As a project maintainer, I want the refactored code to compile and pass all tests, so that the system remains functional. + +#### Acceptance Criteria + +12.1. WHEN all changes are complete, THE System SHALL compile without errors using `lake build` + +12.2. WHEN tests are run, THE System SHALL pass all existing tests using `lake test` + +12.3. WHERE proofs are affected, THE System SHALL restore all proofs to working state + +12.4. WHEN the migration is complete, THE System SHALL have no remaining `sorry` placeholders + +--- + +## Architecture + +### High-Level Overview + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Program Structure │ +│ ┌────────────────────────────────────────────────────────┐ │ +│ │ structure Program (α : Type) where │ │ +│ │ content : α │ │ +│ │ nextFree : Nat │ │ +│ └────────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ + │ + ├─────────────────────────────────┐ + │ │ + ▼ ▼ +┌──────────────────────────────────────┐ ┌──────────────────────────────────┐ +│ Lambda Calculus Layer │ │ Imperative Layer │ +│ ┌────────────────────────────────┐ │ │ ┌────────────────────────────┐ │ +│ │ LExpr: │ │ │ │ Cmd: │ │ +│ │ bvar (index: Nat) │ │ │ │ init (name: String) │ │ +│ │ abs (param: Nat) (body) │ │ │ │ set (var: Nat) (expr) │ │ +│ │ app (fn) (arg) │ │ │ │ havoc (var: Nat) │ │ +│ │ fvar (index: Nat) │ │ │ └────────────────────────────┘ │ +│ └────────────────────────────────┘ │ └──────────────────────────────────┘ +└──────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Fresh Variable Monad │ +│ ┌────────────────────────────────────────────────────────┐ │ +│ │ def FreshM (α : Type) : Type := StateM Nat α │ │ +│ │ │ │ +│ │ def freshVar : FreshM Nat := do │ │ +│ │ let n ← get │ │ +│ │ set (n + 1) │ │ +│ │ return n │ │ +│ └────────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ +``` + + +### Key Architectural Principles + +1. **Unified Numbering**: All variables (bvar, fvar, parameters, locals) share the same numbering space +2. **Global Uniqueness**: Each variable number is used exactly once across the entire program +3. **Monotonic Counter**: nextFree only increases, never decreases +4. **Stable Identity**: Variable numbers never change during transformations +5. **Display Separation**: Variable numbers (for semantics) are separate from display names (for humans) + +### Component Relationships + +``` +Program + ├── content (AST with variable numbers) + └── nextFree (fresh variable counter) + +VarContext (for display/debugging) + └── Map Nat (String × Type) -- number → (display name, type) + +FreshM Monad (for transformations) + └── StateM Nat -- threads nextFree through operations + +Evaluation Environment + └── Map Nat Value -- number → runtime value + +Type Environment + └── Map Nat Type -- number → type +``` + +--- + +## Design + +### 1. Program Structure + +All program entry points (expressions, statements, procedures, etc.) are wrapped in a `Program` structure: + +```lean +structure Program (α : Type) where + content : α + nextFree : Nat + +namespace Program + +def empty (content : α) : Program α := + ⟨content, 0⟩ + +def withFreshCounter (content : α) (nextFree : Nat) : Program α := + ⟨content, nextFree⟩ + +-- Primary invariant: all variables in content are below nextFree +def wellFormed (p : Program α) : Prop := + ∀ v ∈ allVars p.content, v < p.nextFree + +end Program +``` + + +### 2. Fresh Variable Monad + +```lean +def FreshM (α : Type) : Type := StateM Nat α + +namespace FreshM + +def freshVar : FreshM Nat := do + let n ← get + set (n + 1) + return n + +def freshVars (count : Nat) : FreshM (List Nat) := do + let mut vars := [] + for _ in [0:count] do + vars := (← freshVar) :: vars + return vars.reverse + +def run (m : FreshM α) (initialNextFree : Nat) : α × Nat := + m.run initialNextFree + +def runProgram (m : FreshM α) (p : Program β) : Program α := + let (content, nextFree) := m.run p.nextFree + ⟨content, nextFree⟩ + +end FreshM +``` + +**Key Lemma**: +```lean +theorem freshVar_is_fresh (s : Nat) : + let (n, s') := freshVar.run s + n = s ∧ s' = s + 1 ∧ n < s' +``` + +### 3. Lambda Calculus with Performant Ordering + +#### LExpr Definition + +```lean +inductive LExpr (T : LExprParamsT) : Type where + | bvar (m : T.base.Metadata) (index : Nat) (hint : String) (ty : Option T.TypeType) + | fvar (m : T.base.Metadata) (index : Nat) (hint : String) (ty : Option T.TypeType) + | abs (m : T.base.Metadata) (param : Nat) (hint : String) (body : LExpr T) (ty : Option T.TypeType) + | app (m : T.base.Metadata) (fn : LExpr T) (arg : LExpr T) (ty : Option T.TypeType) + | quant (m : T.base.Metadata) (kind : QuantKind) (vars : List (Nat × String)) + (body : LExpr T) (ty : Option T.TypeType) + -- ... other constructors +``` + +**Key Changes**: +- `bvar` uses `Nat` for unique identity + `String` hint for display +- `fvar` uses `Nat` for unique identity + `String` hint for display +- `abs` stores parameter number + hint for the parameter name +- `quant` stores list of (number, hint) pairs for bound variables + +**Design Principle**: Every variable node stores both: +1. **Semantic identity** (Nat) - used for all operations (substitution, lookup, etc.) +2. **Display hint** (String) - used only for pretty printing and debugging + + +#### Substitution (Simplified) + +```lean +def substitute (e : LExpr T) (var : Nat) (replacement : LExpr T) : LExpr T := + match e with + | .bvar m index ty => if index == var then replacement else e + | .fvar m index ty => if index == var then replacement else e + | .abs m param body ty => + .abs m param (substitute body var replacement) ty + | .app m fn arg ty => + .app m (substitute fn var replacement) (substitute arg var replacement) ty + | .quant m kind vars body ty => + .quant m kind vars (substitute body var replacement) ty + -- ... other cases +``` + +**No index shifting needed!** Simple replacement throughout. + +#### Beta Reduction (Simplified) + +```lean +def betaReduce (e : LExpr T) : LExpr T := + match e with + | .app _ (.abs _ param body _) arg _ => + substitute body param arg + | _ => e +``` + +**No index adjustment!** Just substitute the parameter number with the argument. + +#### Alpha Conversion (Simplified) + +```lean +def alphaConvert (abs : LExpr T) : FreshM (LExpr T) := + match abs with + | .abs m param body ty => do + let newParam ← freshVar + let newBody := substitute body param (.bvar m newParam ty) + return .abs m newParam newBody ty + | _ => return abs +``` + +**No index adjustment!** Just generate a fresh number and substitute. + +### 4. Imperative Layer with Performant Ordering + +#### Cmd Definition + +```lean +inductive Cmd (P : PureExpr) : Type where + | init (hint : String) (var : Nat) (ty : P.Ty) (e : Option P.Expr) (md : MetaData P := .empty) + | set (var : Nat) (e : P.Expr) (md : MetaData P := .empty) + | havoc (var : Nat) (md : MetaData P := .empty) + | assert (label : String) (b : P.Expr) (md : MetaData P := .empty) + | assume (label : String) (b : P.Expr) (md : MetaData P := .empty) +``` + +**Key Points**: +- `init` stores both hint (String) for display AND var (Nat) for semantic identity +- `init` has optional expression: `some e` for initialized, `none` for havoc +- `set` and `havoc` reference variables by number only (hint is in VarContext) +- Variable numbers are assigned when `init` is executed +- **All constructors have `md : MetaData P` parameter** with default `.empty` + +**Rationale for Optional Init**: +- `init "x" 42 ty (some e)` - declares variable 42 with initial value e +- `init "x" 42 ty none` - declares variable 42 with havoc/uninitialized value +- Eliminates pattern of `init` followed by immediate `havoc` +- More concise and clearer intent + +**Design Principle**: +- Declaration nodes (`init`, `abs`, `quant`) store hints directly +- Reference nodes (`bvar`, `fvar`, `set`, `havoc`) use numbers only +- VarContext maintains the mapping for display purposes +- Metadata is preserved throughout all operations + + +#### Variable Context for Display + +```lean +structure VarContext (P : PureExpr) where + vars : Map Nat (String × P.Ty) -- number → (display name, type) + +namespace VarContext + +def empty : VarContext P := ⟨Map.empty⟩ + +def insert (ctx : VarContext P) (num : Nat) (name : String) (ty : P.Ty) : VarContext P := + ⟨ctx.vars.insert num (name, ty)⟩ + +def lookup (ctx : VarContext P) (num : Nat) : Option (String × P.Ty) := + ctx.vars.find? num + +def lookupName (ctx : VarContext P) (num : Nat) : Option String := + (ctx.lookup num).map (·.1) + +def lookupType (ctx : VarContext P) (num : Nat) : Option P.Ty := + (ctx.lookup num).map (·.2) + +end VarContext +``` + +#### Evaluation with Performant Ordering + +```lean +def Cmd.eval [EC : EvalContext P S] (σ : S) (c : Cmd P) : Cmd P × S := + match c with + | .init name ty (some e) md => do + let varNum := EC.nextVar σ + let e := EC.eval σ e + let σ := EC.addVar σ varNum ty e + let σ := EC.incrementNextVar σ + return (.init name ty (some e) md, σ) + + | .init name ty none md => do + let varNum := EC.nextVar σ + let (e, σ) := EC.genFreeVar σ ty + let σ := EC.addVar σ varNum ty e + let σ := EC.incrementNextVar σ + return (.init name ty none md, σ) + + | .set varNum e md => + match EC.lookupVar σ varNum with + | none => return (c, EC.updateError σ (.VarNotFound varNum)) + | some (ty, _) => + let e := EC.eval σ e + let σ := EC.updateVar σ varNum ty e + return (.set varNum e md, σ) + + | .havoc varNum md => + match EC.lookupVar σ varNum with + | none => return (c, EC.updateError σ (.VarNotFound varNum)) + | some (ty, _) => + let (e, σ) := EC.genFreeVar σ ty + let σ := EC.updateVar σ varNum ty e + return (.havoc varNum md, σ) +``` + + +### 5. Pretty Printing + +```lean +def formatVar (ctx : VarContext P) (varNum : Nat) : String := + match ctx.lookupName varNum with + | none => s!"%{varNum}" + | some name => + if name.isEmpty then s!"%{varNum}" + else + -- Count how many other variables share this name + let sameNameVars := ctx.vars.toList.filter (fun (_, (n, _)) => n == name) + if sameNameVars.length == 1 then name + else + -- Find position of this var among same-named vars + let sorted := sameNameVars.map (·.1) |>.sort + let pos := sorted.indexOf varNum + s!"{name}@{pos}" + +def formatCmd (ctx : VarContext P) (c : Cmd P) : String := + match c with + | .init name ty (some e) _ => + s!"init ({name} : {ty}) := {formatExpr ctx e}" + | .init name ty none _ => + s!"init ({name} : {ty})" + | .set varNum e _ => + s!"{formatVar ctx varNum} := {formatExpr ctx e}" + | .havoc varNum _ => + s!"havoc {formatVar ctx varNum}" + | .assert label b _ => + s!"assert {label}: {formatExpr ctx b}" + | .assume label b _ => + s!"assume {label}: {formatExpr ctx b}" +``` + +### 6. SMT Encoding + +```lean +structure SMTContext where + varNames : Map Nat String -- variable number → unique SMT name + nameCounters : Map String Nat -- base name → occurrence count + +namespace SMTContext + +def empty : SMTContext := ⟨Map.empty, Map.empty⟩ + +def addVar (ctx : SMTContext) (varNum : Nat) (baseName : String) : SMTContext × String := + let counter := ctx.nameCounters.findD baseName 0 + let smtName := if counter == 0 then baseName else s!"{baseName}@{counter}" + let ctx' := { + varNames := ctx.varNames.insert varNum smtName, + nameCounters := ctx.nameCounters.insert baseName (counter + 1) + } + (ctx', smtName) + +def lookupVar (ctx : SMTContext) (varNum : Nat) : Option String := + ctx.varNames.find? varNum + +end SMTContext +``` + + +### 7. Key Invariants + +#### Primary Invariant: All Variables Below nextFree + +```lean +def Program.allVarsBelow (p : Program α) : Prop := + ∀ v ∈ allVars p.content, v < p.nextFree +``` + +**Theorem**: All well-formed programs satisfy this invariant. + +#### Uniqueness Invariant: All Variables Distinct + +```lean +def Program.allVarsUnique (p : Program α) : Prop := + ∀ v₁ v₂ ∈ allVars p.content, v₁ ≠ v₂ → v₁ ≠ v₂ +``` + +**Theorem**: Fresh variable generation preserves uniqueness. + +#### Well-formedness Invariant: All Uses Have Declarations + +```lean +def Program.allUsesHaveDecls (p : Program α) : Prop := + ∀ use ∈ varUses p.content, ∃ decl ∈ varDecls p.content, use = decl +``` + +**Theorem**: Type checking ensures this invariant. + +### 8. Handling Source-Level Shadowing + +**Problem**: Source languages (Boogie, C_Simp) allow shadowing, but Performant Ordering makes it impossible. + +**Solution**: During parsing/translation, assign fresh numbers to all variables, even if they have the same source name. + +#### Example: Boogie Source with Shadowing + +```boogie +procedure Example() { + var x: int; + x := 1; + { + var x: int; // Shadows outer x + x := 2; // Refers to inner x + } + assert x == 1; // Refers to outer x +} +``` + +#### Translation to Performant Ordering + +```lean +-- During translation, maintain a scope stack +-- Outer scope: x → 42 +-- Inner scope: x → 73 + +init "x" 42 int (some 1) -- Outer x gets number 42 +set 42 (const 1) +-- Enter inner scope +init "x" 73 int (some 2) -- Inner x gets number 73 (fresh!) +set 73 (const 2) -- Refers to 73 +-- Exit inner scope +assert "outer_x" (42 == 1) -- Refers to 42 +``` + +#### Scope Stack During Translation + +```lean +structure ScopeStack where + scopes : List (Map String Nat) -- Stack of name → number mappings + +namespace ScopeStack + +def empty : ScopeStack := ⟨[Map.empty]⟩ + +def pushScope (s : ScopeStack) : ScopeStack := + ⟨Map.empty :: s.scopes⟩ + +def popScope (s : ScopeStack) : ScopeStack := + ⟨s.scopes.tail⟩ + +def lookup (s : ScopeStack) (name : String) : Option Nat := + -- Search from innermost to outermost scope + s.scopes.findSome? (fun scope => scope.find? name) + +def insert (s : ScopeStack) (name : String) (num : Nat) : ScopeStack := + match s.scopes with + | [] => ⟨[Map.empty.insert name num]⟩ + | scope :: rest => ⟨scope.insert name num :: rest⟩ + +end ScopeStack +``` + +#### Translation Algorithm + +```lean +def translateStmt (stmt : BoogieStmt) : FreshM (ScopeStack → Cmd × ScopeStack) := + match stmt with + | .varDecl name ty init => + fun scopes => do + let varNum ← freshVar + let scopes' := scopes.insert name varNum + let cmd := .init name varNum ty (translateExpr init scopes) + return (cmd, scopes') + + | .assign name expr => + fun scopes => + match scopes.lookup name with + | none => error s!"Variable {name} not found" + | some varNum => + let cmd := .set varNum (translateExpr expr scopes) + return (cmd, scopes) + + | .block stmts => + fun scopes => do + let scopes' := scopes.pushScope + let (cmds, scopes'') ← translateStmts stmts scopes' + let scopes''' := scopes''.popScope + return (.block cmds, scopes''') +``` + +**Key Insight**: Source-level shadowing is resolved during translation by assigning different numbers to variables with the same name. The Performant Ordering representation has no shadowing - each variable has a unique number. + +#### User-Facing Syntax for Referring to Shadowed Variables + +**Problem**: How does a user refer to a specific variable when multiple have the same hint? + +**Solution**: Use disambiguation syntax in source code (if needed): + +```boogie +// Option 1: Explicit numbering (for debugging/advanced use) +var x: int; // Gets number 42 +x := 1; +{ + var x: int; // Gets number 73 + x := 2; + x@42 := 3; // Explicitly refer to outer x (rare, for debugging) +} + +// Option 2: Standard scoping (normal case) +// Users write normal code, translator handles numbering +var x: int; +x := 1; +{ + var x: int; + x := 2; // Translator knows this is the inner x +} +assert x == 1; // Translator knows this is the outer x +``` + +**Implementation Note**: The `@N` syntax is optional and primarily for: +1. Debugging output (showing which variable is which) +2. Advanced users who want explicit control +3. Generated code that needs to be precise + +Normal users write standard scoped code, and the translator handles the numbering automatically. + +### 9. Names vs Indices Design Principle + +**Core Principle**: Commands use indices for implementation, specifications use names for user interface. + +#### When to Use Names + +1. **Variable declarations** (`init` hint parameter) - For display and debugging +2. **Procedure specifications** (`modifies` clauses) - Users write names +3. **Error messages** - Display names to users +4. **Pretty printing** - Show names to users +5. **Source code** - Users write names + +#### When to Use Indices + +1. **Variable references** (`set`, `havoc`, `bvar`, `fvar`) - For semantics +2. **modifiedVars return value** - Returns list of indices +3. **Evaluation** - Look up values by index +4. **Type checking** - Look up types by index +5. **Transformations** - Track variables by index + +#### Resolution Pattern + +When specifications (names) need to interact with implementation (indices): + +```lean +-- Pattern 1: Validate procedure modifies clause +def validateModifies (proc : Procedure) (body : Cmd) (ctx : VarContext) : Bool := + let modifiedIndices := body.modifiedVars + let modifiedNames := modifiedIndices.filterMap (ctx.lookupName ·) + modifiedNames.all (fun name => name ∈ proc.modifies ∨ name ∈ proc.outputs) + +-- Pattern 2: Generate havoc statements for loop elimination +def generateLoopHavocs (body : Cmd) (ctx : VarContext) : FreshM (List Cmd) := do + let modifiedIndices := body.modifiedVars + return modifiedIndices.map (fun idx => .havoc idx) + +-- Pattern 3: Error messages with names +def formatError (err : Error) (ctx : VarContext) : String := + match err with + | .VarNotFound idx => + match ctx.lookupName idx with + | some name => s!"Variable '{name}' (#{idx}) not found" + | none => s!"Variable #{idx} not found" +``` + +### 10. modifiedVars and definedVars + +**Critical Feature**: Track which variables are modified by commands. + +#### modifiedVars Implementation + +```lean +def Cmd.modifiedVars (c : Cmd P) : List Nat := + match c with + | .init _ _ _ _ _ => [] + | .set var _ _ => [var] + | .havoc var _ => [var] + | .assert _ _ _ => [] + | .assume _ _ _ => [] + +def Cmds.modifiedVars (cs : List (Cmd P)) : List Nat := + cs.flatMap Cmd.modifiedVars +``` + +**Key Points**: +- Returns **indices**, not names +- `init` returns empty list (declares, doesn't modify) +- `set` and `havoc` return the modified variable's index +- Consumers resolve indices to names when needed + +#### definedVars Implementation + +```lean +def Cmd.definedVars (c : Cmd P) : List String := + match c with + | .init hint _ _ _ _ => [hint] + | .set _ _ _ => [] + | .havoc _ _ => [] + | .assert _ _ _ => [] + | .assume _ _ _ => [] +``` + +**Key Points**: +- Returns **names** (hints), not indices +- Only `init` defines new variables +- Used for tracking variable declarations + +#### Use Cases + +1. **Loop Elimination**: +```lean +def eliminateLoop (loop : Loop) (ctx : VarContext) : FreshM Cmd := do + let modifiedIndices := loop.body.modifiedVars + let havocs := modifiedIndices.map (fun idx => .havoc idx) + -- Generate havoc statements for all modified variables + return .block havocs +``` + +2. **Procedure Validation**: +```lean +def validateProcedure (proc : Procedure) (ctx : VarContext) : Bool := + let modifiedIndices := proc.body.modifiedVars + let modifiedNames := modifiedIndices.filterMap (ctx.lookupName ·) + -- Check that all modified variables are in modifies clause or outputs + modifiedNames.all (fun name => + name ∈ proc.modifies ∨ name ∈ proc.outputs) +``` + +3. **Definedness Checking**: +```lean +def isDefinedOver (cmd : Cmd) (store : Store) : Bool := + let modifiedIndices := cmd.modifiedVars + modifiedIndices.all (fun idx => store.contains idx) +``` + +### 11. Global Variables and VarContext Initialization + +**Problem**: Global variables need to be accessible in all procedures. + +**Solution**: Initialize VarContext with globals before translating procedures. + +#### Global Variable Initialization + +```lean +structure VarContext (P : PureExpr) where + vars : Map Nat (String × P.Ty) -- number → (hint, type) + +namespace VarContext + +def empty : VarContext P := ⟨Map.empty⟩ + +def insert (ctx : VarContext P) (num : Nat) (hint : String) (ty : P.Ty) : VarContext P := + ⟨ctx.vars.insert num (hint, ty)⟩ + +def lookup (ctx : VarContext P) (num : Nat) : Option (String × P.Ty) := + ctx.vars.find? num + +def lookupName (ctx : VarContext P) (num : Nat) : Option String := + (ctx.lookup num).map (·.1) + +def lookupType (ctx : VarContext P) (num : Nat) : Option P.Ty := + (ctx.lookup num).map (·.2) + +-- Initialize VarContext with global variables +def fromGlobals (globals : List (String × P.Ty)) : FreshM (VarContext P) := do + let mut ctx := empty + for (name, ty) in globals do + let varNum ← freshVar + ctx := ctx.insert varNum name ty + return ctx + +end VarContext +``` + +#### Program Translation with Globals + +```lean +def translateProgram (prog : BoogieProgram) : FreshM StrataProgram := do + -- Step 1: Assign numbers to globals + let globalVars := prog.globals.map (fun g => (g.name, g.type)) + let ctx ← VarContext.fromGlobals globalVars + + -- Step 2: Translate procedures with global context + let procedures ← prog.procedures.mapM (translateProcedure ctx) + + return { + globals := globalVars, + procedures := procedures + } +``` + +**Key Insight**: Globals get fresh numbers first, then procedures can reference them by those numbers. + +--- + +## Implementation Strategy + +**Implementation Order**: We will implement changes to the **Lambda Calculus layer first**, then the Imperative layer, then translations and backends. This order is chosen because: +1. Lambda calculus is the foundation - both bvar and fvar changes happen here +2. Imperative layer depends on fvar being updated +3. Translations depend on both layers being complete + +### Phase 1: Core Infrastructure (Foundation) + +**Goal**: Establish the basic structures and monad without breaking existing code. + +**Tasks**: +1. Define `Program α` structure +2. Implement `FreshM` monad +3. Prove basic lemmas about `freshVar` +4. Create `VarContext` structure +5. No changes to existing AST yet + +**Success Criteria**: New code compiles, existing code unchanged. + +### Phase 2: Lambda Calculus Migration (FIRST IMPLEMENTATION PHASE) + +**Goal**: Update LExpr to use Performant Ordering. + +**Why First**: Lambda calculus is the foundation. Both `bvar` (lambda-bound variables) and `fvar` (imperative variables) are defined in LExpr, so we must update this layer before the imperative layer can use the new fvar representation. + +**Tasks**: +1. Change `LExpr.bvar` to use `Nat` + hint (remove De Bruijn relative indexing) +2. Change `LExpr.fvar` to use `Nat` + hint (remove `Identifier`) +3. Update `LExpr.abs` to store parameter number + hint +4. Update `LExpr.quant` to store bound variable numbers + hints +5. Remove all index shifting functions +6. Simplify substitution (no shifting) +7. Simplify beta reduction (no adjustment) +8. Update evaluation to use number-based lookup +9. Update type checking to use number-based lookup + +**Success Criteria**: Lambda calculus layer compiles with simplified operations. + +### Phase 3: Imperative Layer Migration (SECOND IMPLEMENTATION PHASE) + +**Goal**: Update Cmd to use Performant Ordering. + +**Why Second**: Imperative layer uses `fvar` from LExpr, so it must be updated after the lambda calculus layer is complete. + +**Tasks**: +1. Update `Cmd.init` to store hint + var number + optional expression +2. Update `Cmd.set` to use `Nat` variable number +3. Update `Cmd.havoc` to use `Nat` variable number +4. Update `Cmd.modifiedVars` to return `List Nat` +5. Update evaluation to track nextFree +6. Update type checking to use number-based lookup +7. Update semantics to use number-based operations +8. Implement pretty printing with VarContext + +**Success Criteria**: Imperative layer compiles with number-based variables. + +### Phase 4: Translation and Backend Updates (THIRD IMPLEMENTATION PHASE) + +**Goal**: Update all language frontends and backends. + +**Why Third**: Translations depend on both lambda calculus and imperative layers being complete. + +**Tasks**: +1. Update Boogie translation to assign fresh numbers +2. Update C_Simp translation to assign fresh numbers +3. Update SMT encoding to use SMTContext +4. Update all transformations to use FreshM +5. Update loop elimination to use fresh numbers +6. Update procedure inlining to use fresh numbers + +**Success Criteria**: All translations and backends compile. + +### Phase 5: Proof Restoration (FINAL PHASE) + +**Goal**: Restore all proofs to working state. + +**Tasks**: +1. Remove all De Bruijn shifting lemmas +2. Prove freshness lemmas for FreshM +3. Prove substitution correctness (simplified) +4. Prove transformation correctness +5. Prove invariant preservation +6. Update all semantic proofs + +**Success Criteria**: All proofs compile, no `sorry` remains. + +### Phase 6: Testing and Validation + +**Goal**: Ensure all tests pass. + +**Tasks**: +1. Run `lake build` - must succeed +2. Run `lake test` - must pass all tests +3. Update test expectations if needed +4. Verify examples still work +5. Performance testing + +**Success Criteria**: Full test suite passes. + +--- + +## Testing Strategy + +### Unit Testing + +**Approach**: Update existing tests to use Performant Ordering. + +**Test Categories**: +1. **Fresh variable generation**: Verify uniqueness +2. **Substitution**: Verify correctness without shifting +3. **Beta reduction**: Verify correctness without adjustment +4. **Evaluation**: Verify number-based lookup works +5. **Type checking**: Verify number-based type lookup works +6. **Pretty printing**: Verify display names are correct + + +### Property-Based Testing + +**Properties to Test**: + +1. **Freshness Property**: + ```lean + ∀ (p : Program α) (m : FreshM β), + let (_, nextFree') := m.run p.nextFree + ∀ v ∈ allVars p.content, v < nextFree' + ``` + +2. **Uniqueness Property**: + ```lean + ∀ (p : Program α), + p.allVarsUnique → (transform p).allVarsUnique + ``` + +3. **Substitution Property**: + ```lean + ∀ (e : LExpr) (v : Nat) (arg : LExpr), + semantics (substitute e v arg) = semantics e [v ↦ semantics arg] + ``` + +4. **Beta Reduction Property**: + ```lean + ∀ (param : Nat) (body arg : LExpr), + semantics (betaReduce (.app (.abs param body) arg)) = + semantics (substitute body param arg) + ``` + +### Integration Testing + +**Test Scenarios**: +1. Parse Boogie program → Translate → Verify +2. Complex transformations (call elimination, loop elimination) +3. Multi-stage transformation pipelines +4. SMT encoding and solving + +### Regression Testing + +**Approach**: All existing tests must pass with same semantics. + +**Test Files to Update**: +- `StrataTest/DL/Lambda/*.lean` +- `StrataTest/DL/Imperative/*.lean` +- `StrataTest/Languages/Boogie/*.lean` +- `StrataTest/Transform/*.lean` + +--- + +## Migration Plan + +### Compiler-Driven Approach + +**Principle**: Make breaking changes, then fix what the compiler complains about. + +### Step 1: Create Infrastructure (No Breaking Changes) + +1. Create `Strata/Core/Program.lean` with `Program α` structure +2. Create `Strata/Core/FreshM.lean` with monad and lemmas +3. Create `Strata/Core/VarContext.lean` with display context +4. Verify these compile independently + +### Step 2: Break Lambda Calculus (Controlled Break) + +1. Update `LExpr.bvar` to use `Nat` +2. Update `LExpr.fvar` to use `Nat` +3. Update `LExpr.abs` to store parameter number +4. Run `lake build` to identify all broken files +5. Create `broken-files-lambda.md` to track them + + +### Step 3: Fix Lambda Calculus Files + +For each broken file: +1. Remove index shifting logic +2. Simplify substitution to direct replacement +3. Update evaluation to use number lookup +4. Update type checking to use number lookup +5. If proofs break, add `sorry --TODO: Restore proof` and document in `broken-proofs.md` +6. Continue until `lake build` succeeds for lambda layer + +### Step 4: Break Imperative Layer (Controlled Break) + +1. Update `Cmd.set` to use `Nat` +2. Update `Cmd.havoc` to use `Nat` +3. Run `lake build` to identify broken files +4. Create `broken-files-imperative.md` to track them + +### Step 5: Fix Imperative Layer Files + +For each broken file: +1. Update evaluation to track nextFree +2. Update type checking to use number lookup +3. Update semantics to use number operations +4. If proofs break, add `sorry --TODO: Restore proof` and document +5. Continue until `lake build` succeeds for imperative layer + +### Step 6: Update Translations and Backends + +1. Update Boogie translation to use FreshM +2. Update C_Simp translation to use FreshM +3. Update SMT encoding to use SMTContext +4. Update all transformations to use FreshM +5. Fix compilation errors as they arise + +### Step 7: Restore Proofs + +1. Review `broken-proofs.md` +2. For each broken proof: + - Remove `sorry` + - Prove freshness using FreshM lemmas + - Prove substitution correctness (simpler than before) + - Prove transformation correctness +3. Continue until no `sorry` remains + +### Step 8: Test and Validate + +1. Run `lake build` - must succeed +2. Run `lake test` - fix any failures +3. Update test expectations if needed +4. Verify all examples work +5. Final validation + +--- + +## Success Criteria + +### Compilation + +- [ ] `lake build` succeeds with no errors +- [ ] No compilation warnings related to variable numbering +- [ ] All modules compile successfully + +### Testing + +- [ ] `lake test` passes all tests +- [ ] No test failures +- [ ] Test coverage maintained or improved + + +### Proofs + +- [ ] No `sorry` placeholders remain +- [ ] All freshness lemmas proven +- [ ] All substitution lemmas proven +- [ ] All transformation correctness proofs restored +- [ ] All invariant preservation proofs complete + +### Code Quality + +- [ ] All De Bruijn index shifting code removed +- [ ] All De Bruijn weakening code removed +- [ ] All De Bruijn lifting code removed +- [ ] Code is simpler and more maintainable +- [ ] Documentation updated + +### Functionality + +- [ ] All examples work correctly +- [ ] Boogie translation works +- [ ] C_Simp translation works +- [ ] SMT encoding works +- [ ] Verification produces correct results + +--- + +## Appendix A: Key Differences from De Bruijn Indices + +| Aspect | De Bruijn Indices | Performant Ordering | +|--------|-------------------|---------------------| +| **Variable Identity** | Relative (distance from binder) | Absolute (unique number) | +| **Substitution** | Requires index shifting | Simple replacement | +| **Beta Reduction** | Requires index adjustment | Simple substitution | +| **Alpha Conversion** | Requires index renormalization | Generate fresh number | +| **Shadowing** | Possible (same relative index) | Impossible (unique numbers) | +| **Transformation** | Variables change identity | Variables stable | +| **Proof Complexity** | High (shifting lemmas) | Low (freshness lemmas) | +| **Canonical Form** | Yes (unique representation) | No (alpha-equivalent terms differ) | +| **Strata Fit** | Poor (transformation-heavy) | Excellent (transformation-friendly) | + +--- + +## Appendix B: Example Transformations + +### Example 1: Beta Reduction + +**De Bruijn Approach**: +```lean +-- (λ. body[0]) arg +-- Need to shift arg's free variables up by 1 +-- Then substitute 0 with shifted arg +-- Then shift result's free variables down by 1 +betaReduce (.app (.abs body) arg) = + let shiftedArg := shift arg 1 + let substituted := subst body 0 shiftedArg + shift substituted (-1) +``` + +**Performant Ordering Approach**: +```lean +-- (λ x₄₂. body) arg +-- Just substitute 42 with arg, no shifting! +betaReduce (.app (.abs 42 body) arg) = + substitute body 42 arg +``` + + +### Example 2: Procedure Inlining + +**De Bruijn Approach**: +```lean +-- Inline: call y := f(x) +-- Need to: +-- 1. Shift all indices in f's body +-- 2. Substitute parameters with shifted arguments +-- 3. Shift result back +-- 4. Handle variable capture carefully +inlineProcedure (call y f [x]) = + let body := f.body + let shiftedBody := shiftIndices body (currentDepth) + let substituted := substParams shiftedBody f.params [x] + let result := shiftIndices substituted (-currentDepth) + result +``` + +**Performant Ordering Approach**: +```lean +-- Inline: call y := f(x) +-- Just generate fresh numbers for f's locals and substitute! +inlineProcedure (call y f [x]) : FreshM Stmt := do + let freshLocals ← freshVars f.locals.length + let body := f.body + let substituted := substParams body f.params [x] + let renamed := renameLocals substituted f.locals freshLocals + return renamed +``` + +### Example 3: Loop Elimination + +**De Bruijn Approach**: +```lean +-- Eliminate: while guard { body } +-- Need to track which indices are modified +-- Shift indices when generating havoc statements +-- Complex index management +eliminateLoop (while guard body) = + let modifiedIndices := getModifiedIndices body + let shiftedIndices := shiftIndices modifiedIndices (currentDepth) + let havocs := generateHavocs shiftedIndices + -- ... complex index management +``` + +**Performant Ordering Approach**: +```lean +-- Eliminate: while guard { body } +-- Modified variables have stable numbers, just havoc them! +eliminateLoop (while guard body) = + let modifiedVars := getModifiedVars body -- Returns List Nat + let havocs := modifiedVars.map (fun v => .havoc v) + -- ... simple, no index management needed +``` + +--- + +## Appendix C: Proof Strategy Examples + +### Freshness Proof Pattern + +```lean +theorem transform_preserves_freshness (p : Program α) : + p.allVarsBelow → + (transform p).allVarsBelow := by + intro h + unfold transform + -- Show that FreshM generates variables >= p.nextFree + have fresh : ∀ v ∈ newVars, v >= p.nextFree := by + apply freshVar_generates_above + -- Show that old variables are unchanged + have old : ∀ v ∈ oldVars, v < p.nextFree := by + apply h + -- Combine to show all variables below new nextFree + omega +``` + + +### Substitution Correctness Proof Pattern + +```lean +theorem substitute_preserves_semantics (e : LExpr) (v : Nat) (arg : LExpr) : + semantics (substitute e v arg) = semantics e [v ↦ semantics arg] := by + induction e with + | bvar m index ty => + simp [substitute] + split + · -- index == v, substitution happens + simp [semantics] + · -- index != v, no substitution + simp [semantics] + | abs m param body ty ih => + simp [substitute, semantics] + -- No index shifting needed! + apply ih + | app m fn arg ty ih_fn ih_arg => + simp [substitute, semantics] + rw [ih_fn, ih_arg] + -- ... other cases, all straightforward +``` + +**Key Insight**: No shifting lemmas needed! Each case is simple. + +### Uniqueness Preservation Proof Pattern + +```lean +theorem transform_preserves_uniqueness (p : Program α) : + p.allVarsUnique → + (transform p).allVarsUnique := by + intro h + unfold transform + -- Show that fresh variables are distinct from old variables + have fresh_distinct : ∀ v₁ ∈ newVars, ∀ v₂ ∈ oldVars, v₁ ≠ v₂ := by + intros v₁ h₁ v₂ h₂ + have : v₁ >= p.nextFree := freshVar_generates_above h₁ + have : v₂ < p.nextFree := h h₂ + omega + -- Show that fresh variables are distinct from each other + have fresh_unique : ∀ v₁ v₂ ∈ newVars, v₁ ≠ v₂ → v₁ ≠ v₂ := by + apply freshVars_generates_unique + -- Combine + intros v₁ v₂ h_in₁ h_in₂ h_neq + cases h_in₁ <;> cases h_in₂ + · apply h -- both old + · apply fresh_distinct -- one old, one new + · apply fresh_distinct.symm -- one new, one old + · apply fresh_unique -- both new +``` + +--- + +## Appendix D: File-by-File Migration Checklist + +### Core Files + +- [ ] `Strata/Core/Program.lean` - Create new +- [ ] `Strata/Core/FreshM.lean` - Create new +- [ ] `Strata/Core/VarContext.lean` - Create new + +### Lambda Calculus Layer + +- [ ] `Strata/DL/Lambda/LExpr.lean` - Update bvar, fvar, abs, quant +- [ ] `Strata/DL/Lambda/LExprEval.lean` - Remove shifting, simplify substitution +- [ ] `Strata/DL/Lambda/LExprType.lean` - Update type checking +- [ ] `Strata/DL/Lambda/LExprWF.lean` - Remove shifting lemmas +- [ ] `Strata/DL/Lambda/Semantics.lean` - Update semantics +- [ ] `Strata/DL/Lambda/Scopes.lean` - Update scope handling + + +### Imperative Layer + +- [ ] `Strata/DL/Imperative/Cmd.lean` - Update set, havoc +- [ ] `Strata/DL/Imperative/CmdEval.lean` - Update evaluation +- [ ] `Strata/DL/Imperative/CmdType.lean` - Update type checking +- [ ] `Strata/DL/Imperative/CmdSemantics.lean` - Update semantics +- [ ] `Strata/DL/Imperative/Stmt.lean` - Update statement handling +- [ ] `Strata/DL/Imperative/StmtSemantics.lean` - Update statement semantics +- [ ] `Strata/DL/Imperative/SemanticsProps.lean` - Update semantic properties + +### Boogie Language + +- [ ] `Strata/Languages/Boogie/Statement.lean` - Update translation +- [ ] `Strata/Languages/Boogie/Procedure.lean` - Update procedure handling +- [ ] `Strata/Languages/Boogie/Program.lean` - Update program structure +- [ ] `Strata/Languages/Boogie/Verifier.lean` - Update verification +- [ ] `Strata/Languages/Boogie/SMTEncoder.lean` - Update SMT encoding +- [ ] `Strata/Languages/Boogie/BoogieGen.lean` - Update code generation + +### Transformations + +- [ ] `Strata/Transform/CallElim.lean` - Use FreshM +- [ ] `Strata/Transform/CallElimCorrect.lean` - Update proofs +- [ ] `Strata/Transform/LoopElim.lean` - Use FreshM +- [ ] `Strata/Transform/DetToNondet.lean` - Use FreshM +- [ ] `Strata/Transform/DetToNondetCorrect.lean` - Update proofs +- [ ] `Strata/Transform/ProcedureInlining.lean` - Use FreshM + +### Tests + +- [ ] `StrataTest/DL/Lambda/*.lean` - Update all lambda tests +- [ ] `StrataTest/DL/Imperative/*.lean` - Update all imperative tests +- [ ] `StrataTest/Languages/Boogie/*.lean` - Update all Boogie tests +- [ ] `StrataTest/Transform/*.lean` - Update all transformation tests + +--- + +## Appendix E: Common Pitfalls and Solutions + +### Pitfall 1: Forgetting to Thread nextFree + +**Problem**: Generating multiple fresh variables without threading state. + +**Wrong**: +```lean +def transform (e : Expr) : Expr := + let v1 := freshVar -- Wrong! No state threading + let v2 := freshVar -- Wrong! Will get same number + ... +``` + +**Right**: +```lean +def transform (e : Expr) : FreshM Expr := do + let v1 ← freshVar -- Correct! State threaded + let v2 ← freshVar -- Correct! Gets next number + ... +``` + + +### Pitfall 2: Mixing Variable Numbers and Display Names + +**Problem**: Using display names for semantic operations. + +**Wrong**: +```lean +def lookupVar (env : Env) (name : String) : Option Value := + env.find? name -- Wrong! Names are for display only +``` + +**Right**: +```lean +def lookupVar (env : Env) (varNum : Nat) : Option Value := + env.find? varNum -- Correct! Use numbers for semantics +``` + +### Pitfall 3: Not Updating nextFree After Fresh Generation + +**Problem**: Generating fresh variable but not incrementing counter. + +**Wrong**: +```lean +def addVar (p : Program α) : Program α := + let varNum := p.nextFree + ⟨addVarToContent p.content varNum, p.nextFree⟩ -- Wrong! nextFree not updated +``` + +**Right**: +```lean +def addVar (p : Program α) : Program α := + let varNum := p.nextFree + ⟨addVarToContent p.content varNum, p.nextFree + 1⟩ -- Correct! +``` + +### Pitfall 4: Trying to Shift Indices + +**Problem**: Old habits from De Bruijn indices. + +**Wrong**: +```lean +def substitute (e : Expr) (v : Nat) (arg : Expr) : Expr := + match e with + | .abs param body => + let shiftedArg := shift arg 1 -- Wrong! No shifting needed + .abs param (substitute body v shiftedArg) +``` + +**Right**: +```lean +def substitute (e : Expr) (v : Nat) (arg : Expr) : Expr := + match e with + | .abs param body => + .abs param (substitute body v arg) -- Correct! Just recurse +``` + +--- + +## Appendix F: Performance Considerations + +### Variable Lookup + +**Concern**: Map lookup by number vs. by name. + +**Analysis**: +- Number lookup: O(log n) with balanced tree +- Name lookup: O(log n) with balanced tree +- **No performance difference** + +**Benefit**: Numbers are smaller (Nat vs String), better cache locality. + +### Fresh Variable Generation + +**Concern**: Incrementing counter vs. generating unique names. + +**Analysis**: +- Counter increment: O(1) +- Unique name generation: O(n) to check uniqueness +- **Performant Ordering is faster** + + +### Substitution + +**Concern**: Traversing entire AST for substitution. + +**Analysis**: +- De Bruijn: O(n) traversal + O(n) shifting = O(n) +- Performant Ordering: O(n) traversal = O(n) +- **Performant Ordering is faster** (no shifting overhead) + +### Memory Usage + +**Concern**: Storing numbers vs. storing names. + +**Analysis**: +- Numbers: 8 bytes per variable (Nat) +- Names: 8+ bytes per variable (String pointer + data) +- **Performant Ordering uses less memory** + +**Conclusion**: Performant Ordering is equal or better in all performance aspects. + +--- + +## Appendix G: Frequently Asked Questions + +### Q1: Why not use De Bruijn indices? + +**A**: De Bruijn indices are great for canonical representation, but Strata is transformation-heavy. Variables changing identity during transformations makes proofs complex. Performant Ordering keeps variables stable, simplifying everything. + +### Q2: What about alpha-equivalence? + +**A**: We lose syntactic canonicity, but we don't need it. Strata cares about semantic equivalence, not syntactic uniqueness. For verification, semantic equivalence is what matters. + +### Q3: How do we handle variable capture? + +**A**: Variable capture is impossible by construction! All variables have unique numbers, so there's nothing to capture. This is a huge simplification. + +### Q4: What about pretty printing? + +**A**: We maintain a VarContext that maps numbers to display names. Pretty printing resolves numbers to names. Users see readable names, but semantics use numbers. + +### Q5: How do we handle globals? + +**A**: Globals are just variables that get assigned numbers first. Start with nextFree = 0, assign globals numbers 0, 1, 2, ... using freshVar, then continue with nextFree at N (where N is the number of globals). All subsequent variables (locals, parameters, lambda-bound) get numbers >= N using the same counter. + +### Q6: What about procedure parameters? + +**A**: When declaring a procedure, assign fresh numbers to all parameters. When calling a procedure, substitute arguments for parameter numbers. When inlining, generate fresh numbers for all locals. + +### Q7: How do we handle quantifiers? + +**A**: Quantifiers store the list of bound variable numbers. When evaluating, bind those numbers in the environment. No relative indexing needed. + +### Q8: What about performance? + +**A**: Performant Ordering is equal or better in all aspects: lookup speed, generation speed, memory usage, and substitution speed. + + +--- + +## Appendix H: References and Related Work + +### Performant Ordering in Other Systems + +- **SSA Form**: Static Single Assignment uses unique variable numbers +- **ANF**: Administrative Normal Form uses unique names +- **CPS**: Continuation-Passing Style uses unique names +- **LLVM IR**: Uses numbered registers (%0, %1, %2, ...) + +### Why Strata Needs Performant Ordering + +1. **Multi-stage transformations**: Boogie → Imperative → SMT → Solver +2. **Transformation-heavy**: Call elimination, loop elimination, inlining +3. **Verification focus**: Semantic equivalence matters, not syntactic canonicity +4. **Proof engineering**: Simpler proofs = more maintainable system + +### Academic References + +- Pierce, B. C. (2002). *Types and Programming Languages*. MIT Press. + - Chapter on De Bruijn indices and their limitations +- Appel, A. W. (1998). *Modern Compiler Implementation in ML*. Cambridge University Press. + - Chapter on SSA form and unique naming +- Leroy, X. (2009). "Formal verification of a realistic compiler". *Communications of the ACM*. + - CompCert uses unique names for intermediate representations + +--- + +## Conclusion + +This specification describes the migration of Strata to **Performant Ordering** - a unified variable numbering scheme using globally unique natural numbers. This change simplifies the entire system: + +- **Lambda calculus**: No more index shifting, weakening, or lifting +- **Imperative layer**: Unambiguous variable references +- **Transformations**: Variables maintain stable identity +- **Proofs**: Simpler freshness lemmas replace complex shifting lemmas +- **Strata-optimized**: Perfect fit for transformation-heavy verification pipelines + +The migration follows a compiler-driven approach: make breaking changes, fix what breaks, restore proofs, validate tests. The result is a simpler, more maintainable, and more transformation-friendly system. + +**Next Steps**: Begin implementation with Phase 1 (Core Infrastructure). + +--- + +## Document History + +| Version | Date | Author | Changes | +|---------|------|--------|---------| +| 1.0 | 2026-01-19 | Kiro AI | Initial specification | + +--- + +## Approval + +This specification requires review and approval before implementation begins. + +- [ ] Technical Lead Review +- [ ] Architecture Review +- [ ] Implementation Team Review +- [ ] Approved for Implementation + +--- + +*End of Specification* diff --git a/.kiro/specs/performant-ordering-variables/tasks.md b/.kiro/specs/performant-ordering-variables/tasks.md new file mode 100644 index 000000000..e7bdba368 --- /dev/null +++ b/.kiro/specs/performant-ordering-variables/tasks.md @@ -0,0 +1,445 @@ +# Performant Ordering Variable Numbering - Implementation Tasks + +## Implementation Strategy + +**Implementation Order**: We will implement changes to the **Lambda Calculus layer first**, then the Imperative layer, then translations and backends. This order is chosen because: +1. Lambda calculus is the foundation - both bvar and fvar changes happen here +2. Imperative layer depends on fvar being updated +3. Translations depend on both layers being complete + +**Compiler-Driven Approach**: Make breaking changes, then fix what the compiler complains about. + +--- + +## Phase 1: Core Infrastructure (Foundation) + +**Goal**: Establish the basic structures and monad without breaking existing code. + +- [ ] 1.1 Create `Strata/Core/NextFree.lean` with `NextFree α` structure + - [ ] 1.1.1 Define `NextFree` structure with `content` and `nextFree` fields and `HasVars α` constraint + - [ ] 1.1.2 Implement `NextFree.empty` constructor + - [ ] 1.1.3 Implement `NextFree.withFreshCounter` constructor + - [ ] 1.1.4 Define `NextFree.wellFormed` invariant using `HasVars.vars` + - [ ] 1.1.5 Define `NextFree.wellFormed'` alternative formulation + - [ ] 1.1.6 Prove `wellFormed_iff_wellFormed'` equivalence + - [ ] 1.1.7 Add basic helper functions + +- [ ] 1.2 Add `HasVars` typeclass to `Strata/DL/Imperative/HasVars.lean` + - [ ] 1.2.1 Review existing `HasVarsPure` and `HasVarsImp` typeclasses + - [ ] 1.2.2 Define generic `HasVars` typeclass with `vars : α → List Nat` + - [ ] 1.2.3 Add default instance for types with `HasVarsPure` (uses `getVars`) + - [ ] 1.2.4 Add default instance for types with `HasVarsImp` (uses `touchedVars`) + - [ ] 1.2.5 Verify compilation + +- [ ] 1.2 Create `Strata/Core/FreshM.lean` with monad implementation + - [ ] 1.2.1 Define `FreshM` as `StateM Nat` + - [ ] 1.2.2 Implement `freshVar` function + - [ ] 1.2.3 Implement `freshVars` function for multiple variables + - [ ] 1.2.4 Implement `FreshM.run` function + - [ ] 1.2.5 Implement `FreshM.runProgram` function + - [ ] 1.2.6 Prove `freshVar_is_fresh` lemma + +- [ ] 1.3 Create `Strata/Core/VarContext.lean` with display context + - [ ] 1.3.1 Define `VarContext` structure + - [ ] 1.3.2 Implement `VarContext.empty` + - [ ] 1.3.3 Implement `VarContext.insert` + - [ ] 1.3.4 Implement `VarContext.lookup` + - [ ] 1.3.5 Implement `VarContext.lookupName` + - [ ] 1.3.6 Implement `VarContext.lookupType` + - [ ] 1.3.7 Implement `VarContext.fromGlobals` + +- [ ] 1.4 Verify infrastructure compiles independently + - [ ] 1.4.1 Run `lake build` on new files + - [ ] 1.4.2 Fix any compilation errors + - [ ] 1.4.3 Ensure no existing code is broken + +**Success Criteria**: New code compiles, existing code unchanged. + +--- + +## Phase 2: Lambda Calculus Migration (FIRST IMPLEMENTATION PHASE) + +**Goal**: Update LExpr to use Performant Ordering. + +**Why First**: Lambda calculus is the foundation. Both `bvar` (lambda-bound variables) and `fvar` (imperative variables) are defined in LExpr, so we must update this layer before the imperative layer can use the new fvar representation. + +- [ ] 2.1 Update `Strata/DL/Lambda/LExpr.lean` AST definition + - [ ] 2.1.1 Change `LExpr.bvar` to use only `Nat` index (remove hint string) + - [ ] 2.1.2 Change `LExpr.fvar` to use only `Nat` index (remove hint string and `Identifier`) + - [ ] 2.1.3 Update `LExpr.abs` to store parameter `Nat` + `name : String` + - [ ] 2.1.4 Update `LExpr.quant` to store `var : Nat` + `name : String` for the bound variable + - [ ] 2.1.5 Run `lake build` to identify broken files + - [ ] 2.1.6 Create `broken-files-lambda.md` to track compilation errors + +- [ ] 2.2 Remove index shifting functions + - [ ] 2.2.1 Identify all index shifting functions in Lambda layer + - [ ] 2.2.2 Remove `shift`, `weaken`, `lift` functions + - [ ] 2.2.3 Remove all calls to these functions + - [ ] 2.2.4 Document removed functions in migration notes + +- [ ] 2.3 Simplify substitution in `Strata/DL/Lambda/LExpr.lean` + - [ ] 2.3.1 Update `substitute` to use direct replacement (no shifting) + - [ ] 2.3.2 Remove index adjustment logic + - [ ] 2.3.3 Test substitution with simple examples + - [ ] 2.3.4 Verify compilation + +- [ ] 2.4 Simplify beta reduction + - [ ] 2.4.1 Update `betaReduce` to use simple substitution + - [ ] 2.4.2 Remove index adjustment logic + - [ ] 2.4.3 Test beta reduction with simple examples + - [ ] 2.4.4 Verify compilation + +- [ ] 2.5 Update alpha conversion + - [ ] 2.5.1 Implement `alphaConvert` using `FreshM` + - [ ] 2.5.2 Generate fresh variable numbers + - [ ] 2.5.3 Remove index adjustment logic + - [ ] 2.5.4 Test alpha conversion + - [ ] 2.5.5 Verify compilation + +- [ ] 2.6 Update evaluation in `Strata/DL/Lambda/LExprEval.lean` + - [ ] 2.6.1 Change environment to use `Map Nat Value` + - [ ] 2.6.2 Update variable lookup to use numbers + - [ ] 2.6.3 Remove index-based lookup logic + - [ ] 2.6.4 Update all evaluation cases + - [ ] 2.6.5 Verify compilation + +- [ ] 2.7 Update type checking in `Strata/DL/Lambda/LExprType*.lean` + - [ ] 2.7.1 Change type environment to use `Map Nat Type` + - [ ] 2.7.2 Update type lookup to use numbers + - [ ] 2.7.3 Update all type checking cases + - [ ] 2.7.4 Verify compilation + +- [ ] 2.8 Update well-formedness in `Strata/DL/Lambda/LExprWF.lean` + - [ ] 2.8.1 Remove De Bruijn well-formedness checks + - [ ] 2.8.2 Add Performant Ordering well-formedness checks + - [ ] 2.8.3 Update all well-formedness lemmas + - [ ] 2.8.4 Add `sorry` for broken proofs with TODO comments + - [ ] 2.8.5 Document broken proofs in `broken-proofs.md` + +- [ ] 2.9 Update semantics in `Strata/DL/Lambda/Semantics.lean` + - [ ] 2.9.1 Update semantic definitions to use numbers + - [ ] 2.9.2 Remove index-based semantic rules + - [ ] 2.9.3 Update all semantic lemmas + - [ ] 2.9.4 Add `sorry` for broken proofs with TODO comments + - [ ] 2.9.5 Verify compilation + +- [ ] 2.10 Update scope handling in `Strata/DL/Lambda/Scopes.lean` + - [ ] 2.10.1 Implement `ScopeStack` structure + - [ ] 2.10.2 Implement scope operations (push, pop, lookup, insert) + - [ ] 2.10.3 Update scope-related functions + - [ ] 2.10.4 Verify compilation + +- [ ] 2.11 Fix all broken Lambda layer files + - [ ] 2.11.1 Review `broken-files-lambda.md` + - [ ] 2.11.2 Fix each file systematically + - [ ] 2.11.3 Run `lake build` after each fix + - [ ] 2.11.4 Continue until Lambda layer compiles + +**Success Criteria**: Lambda calculus layer compiles with simplified operations. + +--- + +## Phase 3: Imperative Layer Migration (SECOND IMPLEMENTATION PHASE) + +**Goal**: Update Cmd to use Performant Ordering. + +**Why Second**: Imperative layer uses `fvar` from LExpr, so it must be updated after the lambda calculus layer is complete. + +- [ ] 3.1 Update `Strata/DL/Imperative/Cmd.lean` definition + - [ ] 3.1.1 Update `Cmd.init` to store `hint : String`, `var : Nat`, `ty : P.Ty`, `e : Option P.Expr` + - [ ] 3.1.2 Update `Cmd.set` to use `var : Nat` + - [ ] 3.1.3 Update `Cmd.havoc` to use `var : Nat` + - [ ] 3.1.4 Ensure all constructors have `md : MetaData P` parameter + - [ ] 3.1.5 Run `lake build` to identify broken files + - [ ] 3.1.6 Create `broken-files-imperative.md` to track compilation errors + +- [ ] 3.2 Update `Cmd.modifiedVars` implementation + - [ ] 3.2.1 Change return type to `List Nat` + - [ ] 3.2.2 Update `init` case to return empty list + - [ ] 3.2.3 Update `set` case to return `[var]` + - [ ] 3.2.4 Update `havoc` case to return `[var]` + - [ ] 3.2.5 Verify compilation + +- [ ] 3.3 Implement `Cmd.definedVars` + - [ ] 3.3.1 Define function with return type `List String` + - [ ] 3.3.2 Implement `init` case to return `[hint]` + - [ ] 3.3.3 Implement other cases to return empty list + - [ ] 3.3.4 Verify compilation + +- [ ] 3.4 Update evaluation in `Strata/DL/Imperative/CmdEval.lean` + - [ ] 3.4.1 Update `EvalContext` to track `nextVar` + - [ ] 3.4.2 Implement `incrementNextVar` function + - [ ] 3.4.3 Update `Cmd.eval` for `init` with `some e` + - [ ] 3.4.4 Update `Cmd.eval` for `init` with `none` + - [ ] 3.4.5 Update `Cmd.eval` for `set` + - [ ] 3.4.6 Update `Cmd.eval` for `havoc` + - [ ] 3.4.7 Verify compilation + +- [ ] 3.5 Update type checking in `Strata/DL/Imperative/CmdType.lean` + - [ ] 3.5.1 Change type environment to use `Map Nat Type` + - [ ] 3.5.2 Update type lookup to use numbers + - [ ] 3.5.3 Update all type checking cases + - [ ] 3.5.4 Verify compilation + +- [ ] 3.6 Update semantics in `Strata/DL/Imperative/CmdSemantics.lean` + - [ ] 3.6.1 Update semantic definitions to use numbers + - [ ] 3.6.2 Remove name-based semantic rules + - [ ] 3.6.3 Update all semantic lemmas + - [ ] 3.6.4 Add `sorry` for broken proofs with TODO comments + - [ ] 3.6.5 Verify compilation + +- [ ] 3.7 Update statement handling in `Strata/DL/Imperative/Stmt.lean` + - [ ] 3.7.1 Update statement operations to use numbers + - [ ] 3.7.2 Update all statement cases + - [ ] 3.7.3 Verify compilation + +- [ ] 3.8 Update statement semantics in `Strata/DL/Imperative/StmtSemantics.lean` + - [ ] 3.8.1 Update semantic definitions to use numbers + - [ ] 3.8.2 Update all semantic cases + - [ ] 3.8.3 Add `sorry` for broken proofs with TODO comments + - [ ] 3.8.4 Verify compilation + +- [ ] 3.9 Update semantic properties in `Strata/DL/Imperative/SemanticsProps.lean` + - [ ] 3.9.1 Update property definitions to use numbers + - [ ] 3.9.2 Update all property lemmas + - [ ] 3.9.3 Add `sorry` for broken proofs with TODO comments + - [ ] 3.9.4 Verify compilation + +- [ ] 3.10 Implement pretty printing + - [ ] 3.10.1 Implement `formatVar` function + - [ ] 3.10.2 Implement `formatCmd` function + - [ ] 3.10.3 Implement `formatExpr` function + - [ ] 3.10.4 Handle disambiguation with @N suffix + - [ ] 3.10.5 Verify compilation + +- [ ] 3.11 Fix all broken Imperative layer files + - [ ] 3.11.1 Review `broken-files-imperative.md` + - [ ] 3.11.2 Fix each file systematically + - [ ] 3.11.3 Run `lake build` after each fix + - [ ] 3.11.4 Continue until Imperative layer compiles + +**Success Criteria**: Imperative layer compiles with number-based variables. + +--- + +## Phase 4: Translation and Backend Updates (THIRD IMPLEMENTATION PHASE) + +**Goal**: Update all language frontends and backends. + +**Why Third**: Translations depend on both lambda calculus and imperative layers being complete. + +- [ ] 4.1 Update Strata Core language in `Strata/Languages/Core/` + - [ ] 4.1.1 Update `Expressions.lean` to remove `old()` constructor + - [ ] 4.1.2 Update `Statement.lean` to use Performant Ordering + - [ ] 4.1.3 Update `Procedure.lean` to use Performant Ordering + - [ ] 4.1.4 Add `preStateCopies` field to Procedure structure + - [ ] 4.1.5 Update `Program.lean` to use Performant Ordering + - [ ] 4.1.6 Update `Verifier.lean` to use FreshM + - [ ] 4.1.7 Update `SMTEncoder.lean` to use SMTContext + - [ ] 4.1.8 Verify compilation + +- [ ] 4.2 Implement SMT encoding with SMTContext + - [ ] 4.2.1 Create `SMTContext` structure + - [ ] 4.2.2 Implement `SMTContext.empty` + - [ ] 4.2.3 Implement `SMTContext.addVar` + - [ ] 4.2.4 Implement `SMTContext.lookupVar` + - [ ] 4.2.5 Update SMT encoding to use SMTContext + - [ ] 4.2.6 Verify compilation + +- [ ] 4.3 Update C_Simp translation in `Strata/Languages/C_Simp/` + - [ ] 4.3.1 Update translation to use FreshM + - [ ] 4.3.2 Implement scope stack for shadowing + - [ ] 4.3.3 Assign fresh numbers to all variables + - [ ] 4.3.4 Update all translation functions + - [ ] 4.3.5 Verify compilation + +- [ ] 4.4 Implement old() elimination in procedure translation + - [ ] 4.4.1 Identify modified globals from procedure's modifies clause + - [ ] 4.4.2 Generate fresh variables for pre-state copies of modified globals only + - [ ] 4.4.3 Insert initialization statements at procedure entry to copy globals + - [ ] 4.4.4 Implement `replaceOldReferences` function for global variables + - [ ] 4.4.5 Replace all `old(global)` references with pre-state variable references + - [ ] 4.4.6 Update procedure structure to include `preStateCopies` + - [ ] 4.4.7 Verify compilation + +- [ ] 4.5 Update transformations to use FreshM + - [ ] 4.5.1 Update `CallElim.lean` to use FreshM + - [ ] 4.5.2 Update `LoopElim.lean` to use FreshM + - [ ] 4.5.3 Update `DetToNondet.lean` to use FreshM + - [ ] 4.5.4 Update `ProcedureInlining.lean` to use FreshM + - [ ] 4.5.5 Verify compilation + +- [ ] 4.6 Update loop elimination + - [ ] 4.6.1 Update to generate havoc statements using variable numbers + - [ ] 4.6.2 Use `modifiedVars` to get list of indices + - [ ] 4.6.3 Generate fresh numbers for loop-related variables + - [ ] 4.6.4 Verify compilation + +- [ ] 4.7 Update procedure inlining + - [ ] 4.6.1 Generate fresh numbers for all parameters + - [ ] 4.6.2 Generate fresh numbers for all locals + - [ ] 4.6.3 Update substitution to use numbers + - [ ] 4.6.4 Verify compilation + +- [ ] 4.7 Update all other language frontends + - [ ] 4.7.1 Update Dyn translation if needed + - [ ] 4.7.2 Update Laurel translation if needed + - [ ] 4.7.3 Update Python translation if needed + - [ ] 4.7.4 Verify compilation + +- [ ] 4.8 Update CBMC backend in `Strata/Backends/CBMC/` + - [ ] 4.8.1 Update to use variable numbers + - [ ] 4.8.2 Update GOTO translation + - [ ] 4.8.3 Verify compilation + +**Success Criteria**: All translations and backends compile. + +--- + +## Phase 5: Proof Restoration (FINAL PHASE) + +**Goal**: Restore all proofs to working state. + +- [ ] 5.1 Remove De Bruijn shifting lemmas + - [ ] 5.1.1 Identify all shifting lemmas + - [ ] 5.1.2 Remove lemmas that are no longer needed + - [ ] 5.1.3 Document removed lemmas + +- [ ] 5.2 Prove freshness lemmas for FreshM + - [ ] 5.2.1 Prove `freshVar_is_fresh` + - [ ] 5.2.2 Prove `freshVars_generates_unique` + - [ ] 5.2.3 Prove `freshVar_generates_above` + - [ ] 5.2.4 Prove other freshness properties + +- [ ] 5.3 Prove substitution correctness + - [ ] 5.3.1 Prove `substitute_preserves_semantics` + - [ ] 5.3.2 Prove substitution is simpler than De Bruijn version + - [ ] 5.3.3 Prove all substitution lemmas + +- [ ] 5.4 Prove transformation correctness + - [ ] 5.4.1 Update `CallElimCorrect.lean` proofs + - [ ] 5.4.2 Update `DetToNondetCorrect.lean` proofs + - [ ] 5.4.3 Prove other transformation correctness lemmas + +- [ ] 5.5 Prove invariant preservation + - [ ] 5.5.1 Prove `transform_preserves_freshness` + - [ ] 5.5.2 Prove `transform_preserves_uniqueness` + - [ ] 5.5.3 Prove all invariant preservation lemmas + +- [ ] 5.6 Update all semantic proofs + - [ ] 5.6.1 Review `broken-proofs.md` + - [ ] 5.6.2 Fix each proof systematically + - [ ] 5.6.3 Remove all `sorry` placeholders + - [ ] 5.6.4 Verify all proofs compile + +- [ ] 5.7 Final proof verification + - [ ] 5.7.1 Run `lake build` to ensure all proofs compile + - [ ] 5.7.2 Search for remaining `sorry` placeholders + - [ ] 5.7.3 Fix any remaining proof issues + - [ ] 5.7.4 Verify no `sorry` remains + +**Success Criteria**: All proofs compile, no `sorry` remains. + +--- + +## Phase 6: Testing and Validation + +**Goal**: Ensure all tests pass. + +- [ ] 6.1 Update Lambda layer tests in `StrataTest/DL/Lambda/` + - [ ] 6.1.1 Update `Lambda.lean` tests + - [ ] 6.1.2 Update `LExprEvalTests.lean` tests + - [ ] 6.1.3 Update `LExprTTests.lean` tests + - [ ] 6.1.4 Update `TypeFactoryTests.lean` tests + - [ ] 6.1.5 Run tests and fix failures + +- [ ] 6.2 Update Imperative layer tests in `StrataTest/DL/Imperative/` + - [ ] 6.2.1 Update all test files + - [ ] 6.2.2 Update test expectations if needed + - [ ] 6.2.3 Run tests and fix failures + +- [ ] 6.3 Update Core language tests in `StrataTest/Languages/Core/` + - [ ] 6.3.1 Update all test files + - [ ] 6.3.2 Update test expectations if needed + - [ ] 6.3.3 Run tests and fix failures + +- [ ] 6.4 Update transformation tests in `StrataTest/Transform/` + - [ ] 6.4.1 Update `CallElim.lean` tests + - [ ] 6.4.2 Update `DetToNondet.lean` tests + - [ ] 6.4.3 Update `ProcedureInlining.lean` tests + - [ ] 6.4.4 Run tests and fix failures + +- [ ] 6.5 Update all other tests + - [ ] 6.5.1 Update C_Simp tests if needed + - [ ] 6.5.2 Update Dyn tests if needed + - [ ] 6.5.3 Update CBMC tests if needed + - [ ] 6.5.4 Run tests and fix failures + +- [ ] 6.6 Run full test suite + - [ ] 6.6.1 Run `lake test` + - [ ] 6.6.2 Fix any test failures + - [ ] 6.6.3 Verify all tests pass + +- [ ] 6.7 Verify examples work + - [ ] 6.7.1 Run all examples in `Examples/` + - [ ] 6.7.2 Verify output is correct + - [ ] 6.7.3 Fix any example issues + +- [ ] 6.8 Performance testing + - [ ] 6.8.1 Measure compilation time + - [ ] 6.8.2 Measure test execution time + - [ ] 6.8.3 Compare with baseline + - [ ] 6.8.4 Document performance results + +- [ ] 6.9 Final validation + - [ ] 6.9.1 Run `lake build` - must succeed + - [ ] 6.9.2 Run `lake test` - must pass all tests + - [ ] 6.9.3 Verify no `sorry` remains + - [ ] 6.9.4 Verify all examples work + - [ ] 6.9.5 Document completion + +**Success Criteria**: Full test suite passes, all examples work. + +--- + +## Success Criteria Summary + +### Compilation +- [ ] `lake build` succeeds with no errors +- [ ] No compilation warnings related to variable numbering +- [ ] All modules compile successfully + +### Testing +- [ ] `lake test` passes all tests +- [ ] No test failures +- [ ] Test coverage maintained or improved + +### Proofs +- [ ] No `sorry` placeholders remain +- [ ] All freshness lemmas proven +- [ ] All substitution lemmas proven +- [ ] All transformation correctness proofs restored +- [ ] All invariant preservation proofs complete + +### Code Quality +- [ ] All De Bruijn index shifting code removed +- [ ] All De Bruijn weakening code removed +- [ ] All De Bruijn lifting code removed +- [ ] Code is simpler and more maintainable +- [ ] Documentation updated + +### Functionality +- [ ] All examples work correctly +- [ ] All translations work +- [ ] SMT encoding works +- [ ] Verification produces correct results + +--- + +## Notes + +- **Incremental Approach**: Fix one file at a time, verify compilation after each fix +- **Track Progress**: Maintain `broken-files-lambda.md`, `broken-files-imperative.md`, and `broken-proofs.md` +- **Proof Strategy**: Add `sorry` with TODO comments for broken proofs, restore them in Phase 5 +- **Testing**: Run tests frequently to catch regressions early +- **Documentation**: Update documentation as changes are made From acb6e8f021154ba4e732f5b3bce28c164df61a1c Mon Sep 17 00:00:00 2001 From: Mikael Mayer Date: Mon, 19 Jan 2026 09:26:56 +0100 Subject: [PATCH 2/2] Remove redundant specification.md file The content is already split into requirements.md, design.md, and tasks.md --- .../specification.md | 1684 ----------------- 1 file changed, 1684 deletions(-) delete mode 100644 .kiro/specs/performant-ordering-variables/specification.md diff --git a/.kiro/specs/performant-ordering-variables/specification.md b/.kiro/specs/performant-ordering-variables/specification.md deleted file mode 100644 index 4d247c918..000000000 --- a/.kiro/specs/performant-ordering-variables/specification.md +++ /dev/null @@ -1,1684 +0,0 @@ -# Performant Ordering Variable Numbering Specification - -## Executive Summary - -This specification describes the migration of Strata's variable representation from the current system to **Performant Ordering** - a unified variable numbering scheme using globally unique natural numbers. This change affects both lambda calculus (bound variables) and imperative layer (free variables), replacing all relative indexing schemes with a single, transformation-friendly approach. - -**Key Change:** All variables (lambda-bound and imperative-declared) are identified by unique natural numbers assigned from a global counter, with no relative counting or context-dependent indexing. - -## Table of Contents - -1. [Introduction](#introduction) -2. [Glossary](#glossary) -3. [Requirements](#requirements) -4. [Architecture](#architecture) -5. [Design](#design) -6. [Implementation Strategy](#implementation-strategy) -7. [Testing Strategy](#testing-strategy) -8. [Migration Plan](#migration-plan) - ---- - -## Introduction - -### Current State - -Strata currently uses different variable representation schemes across its layers: -- **Lambda calculus**: Uses De Bruijn indices for bound variables (bvar) - relative counting from binding site -- **Imperative layer**: Uses string-based identifiers for free variables (fvar) -- **Mixed approach**: Leads to complex index shifting, weakening, and lifting operations - -### Problems with Current Approach - -1. **Complexity**: Index shifting logic is error-prone and difficult to prove correct -2. **Transformation friction**: Variables change identity during transformations -3. **Dual reasoning**: Different proof strategies for lambda-bound vs imperative variables -4. **Boundary cases**: Complex interactions between bvar and fvar -5. **Canonicity overhead**: De Bruijn requires normalization, but Strata prioritizes semantic equivalence - -### Proposed Solution: Performant Ordering - -**Core Concept**: All variables are identified by globally unique natural numbers assigned from a monotonically increasing counter. - - -**Benefits**: -- **Simplicity**: No index shifting, no relative counting -- **Stability**: Variables maintain identity across all transformations -- **Unified reasoning**: Single proof strategy for all variables -- **Transformation-friendly**: Fresh variable generation is trivial and local -- **Strata-optimized**: Perfect for multi-stage transformation pipelines - -**Trade-off Accepted**: Loss of canonical representation (alpha-equivalent terms may differ syntactically), but semantic equivalence is what matters for verification. - ---- - -## Glossary - -- **Performant Ordering**: Variable numbering scheme using globally unique natural numbers -- **nextFree**: Counter representing the next available fresh variable number -- **Program α**: Generic program structure containing content of type α and a nextFree counter -- **FreshM**: State monad for generating fresh variable numbers -- **bvar**: Bound variable (lambda-bound, quantifier-bound) - now uses Performant Ordering -- **fvar**: Free variable (imperative-declared) - now uses Performant Ordering -- **LExpr**: Lambda expressions in the Lambda dialect -- **Cmd**: Commands in the Imperative dialect -- **VarContext**: Mapping from variable numbers to metadata (name, type) for display/debugging -- **Unified numbering**: Single numbering space shared by both bvar and fvar - ---- - -## Requirements - -### Requirement 1: Unified Variable Representation - -**User Story**: As a language designer, I want all variables to use a single numbering scheme, so that there's no distinction between lambda-bound and imperative variable numbering. - -#### Acceptance Criteria - -1.1. WHEN a variable is represented in the AST, THE System SHALL use a natural number as its unique identifier - -1.2. WHEN a lambda parameter is declared, THE System SHALL assign it a unique number from nextFree - -1.3. WHEN an imperative variable is declared, THE System SHALL assign it a unique number from nextFree - -1.4. WHEN a quantifier binds a variable, THE System SHALL assign it a unique number from nextFree - -1.5. WHERE variables are used, THE System SHALL reference them by their assigned number - - -### Requirement 2: Program Structure with Fresh Counter - -**User Story**: As a compiler developer, I want programs to track the next available variable number, so that fresh variable generation is guaranteed to be unique. - -#### Acceptance Criteria - -2.1. WHEN a Program is defined, THE System SHALL include a nextFree field of type Nat - -2.2. WHEN a Program is created, THE System SHALL initialize nextFree to 0 (or to the next available number if continuing from a previous context) - -2.3. WHEN a fresh variable is needed, THE System SHALL use the current nextFree value and increment it - -2.4. WHERE multiple types of variables exist (lambda-bound, imperative, quantifier-bound), THE System SHALL use the same nextFree counter for all of them - -2.4. WHERE all variables in a program, THE System SHALL maintain the invariant that all variable numbers are strictly less than nextFree - -### Requirement 3: Fresh Variable Generation - -**User Story**: As a transformation developer, I want to generate fresh variables using a state monad, so that uniqueness is guaranteed by construction. - -#### Acceptance Criteria - -3.1. WHEN generating a fresh variable, THE System SHALL provide a FreshM monad that returns the current nextFree and increments it - -3.2. WHEN multiple fresh variables are needed, THE System SHALL thread the state through all generations - -3.3. WHEN a transformation completes, THE System SHALL return an updated Program with the new nextFree value - -3.4. WHERE fresh variable generation occurs, THE System SHALL guarantee that the returned number was not previously used in the program - -### Requirement 4: Lambda Calculus Operations - -**User Story**: As a lambda calculus implementer, I want substitution and beta reduction to work without index shifting, so that operations are simpler and proofs are easier. - -#### Acceptance Criteria - -4.1. WHEN performing beta reduction (λ x₄₂. body) arg, THE System SHALL substitute all occurrences of variable 42 in body with arg, without any index adjustment - -4.2. WHEN performing substitution body[x₄₂ := arg], THE System SHALL replace all occurrences of variable 42 with arg, without shifting any other variables - -4.3. WHEN performing alpha conversion, THE System SHALL generate a fresh variable number and replace the old parameter number with the new one, without any index adjustment - -4.4. WHERE lambda expressions are evaluated, THE System SHALL NOT perform any index shifting, weakening, or lifting operations - - -### Requirement 5: Imperative Operations - -**User Story**: As an imperative language implementer, I want variable declarations and assignments to use unique numbers, so that shadowing is impossible and variable tracking is unambiguous. - -#### Acceptance Criteria - -5.1. WHEN an init command declares a variable, THE System SHALL assign it a unique number from nextFree - -5.2. WHEN a set command modifies a variable, THE System SHALL reference it by its unique number - -5.3. WHEN a havoc command randomizes a variable, THE System SHALL reference it by its unique number - -5.4. WHERE multiple variables with the same display name exist, THE System SHALL distinguish them by their unique numbers - -5.5. WHEN inlining procedures, THE System SHALL generate fresh numbers for all parameters and locals - -### Requirement 6: Pretty Printing and Display - -**User Story**: As a developer debugging programs, I want variable numbers to be resolved to readable names, so that output is human-friendly. - -#### Acceptance Criteria - -6.1. WHEN displaying a variable, THE System SHALL resolve its number to a display name using VarContext - -6.2. WHEN a variable number cannot be resolved, THE System SHALL display it as %N (e.g., %42) - -6.3. WHEN multiple variables share the same display name, THE System SHALL disambiguate using @N suffix (e.g., x, x@1, x@2) - -6.4. WHERE shadowing does not occur, THE System SHALL display variables with their plain names (no suffix) - -6.5. WHEN formatting expressions, THE System SHALL use VarContext to make output readable - -### Requirement 7: Scope and Shadowing - -**User Story**: As a language designer, I want shadowing to be impossible by construction, so that variable references are always unambiguous. - -#### Acceptance Criteria - -7.1. WHEN a new variable is declared, THE System SHALL assign it a unique number that has never been used - -7.2. WHERE two variables have the same display name, THE System SHALL distinguish them by their unique numbers - -7.3. WHEN a variable goes out of scope, THE System SHALL NOT reuse its number - -7.4. WHERE variable lookup occurs, THE System SHALL use the unique number, not the display name - - -### Requirement 8: Transformation Correctness - -**User Story**: As a verification engineer, I want transformations to preserve program semantics, so that verification results are trustworthy. - -#### Acceptance Criteria - -8.1. WHEN a transformation generates fresh variables, THE System SHALL prove that the fresh numbers are not in the original program - -8.2. WHEN a transformation preserves variables, THE System SHALL prove that variable numbers remain unchanged - -8.3. WHEN a transformation performs substitution, THE System SHALL prove that semantics are preserved - -8.4. WHERE transformations compose, THE System SHALL prove that the composition preserves semantics - -8.5. WHEN a transformation completes, THE System SHALL prove that all variables in the result are below the new nextFree - -### Requirement 9: Type System Integration - -**User Story**: As a type system maintainer, I want type checking to work with variable numbers, so that type safety is preserved. - -#### Acceptance Criteria - -9.1. WHEN type checking a variable reference, THE System SHALL look up its type using the variable number - -9.2. WHEN a variable is declared, THE System SHALL associate its number with its type - -9.3. WHEN type checking lambda expressions, THE System SHALL track parameter types by their numbers - -9.4. WHERE type environments are maintained, THE System SHALL map variable numbers to types - -### Requirement 10: Evaluation and Semantics - -**User Story**: As a semantics implementer, I want evaluation to work with variable numbers, so that runtime behavior is correct. - -#### Acceptance Criteria - -10.1. WHEN evaluating a variable reference, THE System SHALL look up its value using the variable number - -10.2. WHEN a variable is assigned, THE System SHALL update the value at its number - -10.3. WHEN evaluating lambda application, THE System SHALL substitute using variable numbers without shifting - -10.4. WHERE evaluation environments are maintained, THE System SHALL map variable numbers to values - - -### Requirement 11: SMT Backend Integration - -**User Story**: As a verification backend developer, I want SMT encoding to generate unique variable names, so that verification conditions are correct. - -#### Acceptance Criteria - -11.1. WHEN encoding a variable to SMT, THE System SHALL generate a unique SMT identifier - -11.2. WHEN multiple variables share a display name, THE System SHALL use @N suffixes in SMT (e.g., x, x@1, x@2) - -11.3. WHEN encoding expressions, THE System SHALL resolve variable numbers to SMT identifiers - -11.4. WHERE SMT variable names are generated, THE System SHALL maintain a mapping from variable numbers to SMT identifiers - -### Requirement 12: Compilation and Testing - -**User Story**: As a project maintainer, I want the refactored code to compile and pass all tests, so that the system remains functional. - -#### Acceptance Criteria - -12.1. WHEN all changes are complete, THE System SHALL compile without errors using `lake build` - -12.2. WHEN tests are run, THE System SHALL pass all existing tests using `lake test` - -12.3. WHERE proofs are affected, THE System SHALL restore all proofs to working state - -12.4. WHEN the migration is complete, THE System SHALL have no remaining `sorry` placeholders - ---- - -## Architecture - -### High-Level Overview - -``` -┌─────────────────────────────────────────────────────────────┐ -│ Program Structure │ -│ ┌────────────────────────────────────────────────────────┐ │ -│ │ structure Program (α : Type) where │ │ -│ │ content : α │ │ -│ │ nextFree : Nat │ │ -│ └────────────────────────────────────────────────────────┘ │ -└─────────────────────────────────────────────────────────────┘ - │ - ├─────────────────────────────────┐ - │ │ - ▼ ▼ -┌──────────────────────────────────────┐ ┌──────────────────────────────────┐ -│ Lambda Calculus Layer │ │ Imperative Layer │ -│ ┌────────────────────────────────┐ │ │ ┌────────────────────────────┐ │ -│ │ LExpr: │ │ │ │ Cmd: │ │ -│ │ bvar (index: Nat) │ │ │ │ init (name: String) │ │ -│ │ abs (param: Nat) (body) │ │ │ │ set (var: Nat) (expr) │ │ -│ │ app (fn) (arg) │ │ │ │ havoc (var: Nat) │ │ -│ │ fvar (index: Nat) │ │ │ └────────────────────────────┘ │ -│ └────────────────────────────────┘ │ └──────────────────────────────────┘ -└──────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ Fresh Variable Monad │ -│ ┌────────────────────────────────────────────────────────┐ │ -│ │ def FreshM (α : Type) : Type := StateM Nat α │ │ -│ │ │ │ -│ │ def freshVar : FreshM Nat := do │ │ -│ │ let n ← get │ │ -│ │ set (n + 1) │ │ -│ │ return n │ │ -│ └────────────────────────────────────────────────────────┘ │ -└─────────────────────────────────────────────────────────────┘ -``` - - -### Key Architectural Principles - -1. **Unified Numbering**: All variables (bvar, fvar, parameters, locals) share the same numbering space -2. **Global Uniqueness**: Each variable number is used exactly once across the entire program -3. **Monotonic Counter**: nextFree only increases, never decreases -4. **Stable Identity**: Variable numbers never change during transformations -5. **Display Separation**: Variable numbers (for semantics) are separate from display names (for humans) - -### Component Relationships - -``` -Program - ├── content (AST with variable numbers) - └── nextFree (fresh variable counter) - -VarContext (for display/debugging) - └── Map Nat (String × Type) -- number → (display name, type) - -FreshM Monad (for transformations) - └── StateM Nat -- threads nextFree through operations - -Evaluation Environment - └── Map Nat Value -- number → runtime value - -Type Environment - └── Map Nat Type -- number → type -``` - ---- - -## Design - -### 1. Program Structure - -All program entry points (expressions, statements, procedures, etc.) are wrapped in a `Program` structure: - -```lean -structure Program (α : Type) where - content : α - nextFree : Nat - -namespace Program - -def empty (content : α) : Program α := - ⟨content, 0⟩ - -def withFreshCounter (content : α) (nextFree : Nat) : Program α := - ⟨content, nextFree⟩ - --- Primary invariant: all variables in content are below nextFree -def wellFormed (p : Program α) : Prop := - ∀ v ∈ allVars p.content, v < p.nextFree - -end Program -``` - - -### 2. Fresh Variable Monad - -```lean -def FreshM (α : Type) : Type := StateM Nat α - -namespace FreshM - -def freshVar : FreshM Nat := do - let n ← get - set (n + 1) - return n - -def freshVars (count : Nat) : FreshM (List Nat) := do - let mut vars := [] - for _ in [0:count] do - vars := (← freshVar) :: vars - return vars.reverse - -def run (m : FreshM α) (initialNextFree : Nat) : α × Nat := - m.run initialNextFree - -def runProgram (m : FreshM α) (p : Program β) : Program α := - let (content, nextFree) := m.run p.nextFree - ⟨content, nextFree⟩ - -end FreshM -``` - -**Key Lemma**: -```lean -theorem freshVar_is_fresh (s : Nat) : - let (n, s') := freshVar.run s - n = s ∧ s' = s + 1 ∧ n < s' -``` - -### 3. Lambda Calculus with Performant Ordering - -#### LExpr Definition - -```lean -inductive LExpr (T : LExprParamsT) : Type where - | bvar (m : T.base.Metadata) (index : Nat) (hint : String) (ty : Option T.TypeType) - | fvar (m : T.base.Metadata) (index : Nat) (hint : String) (ty : Option T.TypeType) - | abs (m : T.base.Metadata) (param : Nat) (hint : String) (body : LExpr T) (ty : Option T.TypeType) - | app (m : T.base.Metadata) (fn : LExpr T) (arg : LExpr T) (ty : Option T.TypeType) - | quant (m : T.base.Metadata) (kind : QuantKind) (vars : List (Nat × String)) - (body : LExpr T) (ty : Option T.TypeType) - -- ... other constructors -``` - -**Key Changes**: -- `bvar` uses `Nat` for unique identity + `String` hint for display -- `fvar` uses `Nat` for unique identity + `String` hint for display -- `abs` stores parameter number + hint for the parameter name -- `quant` stores list of (number, hint) pairs for bound variables - -**Design Principle**: Every variable node stores both: -1. **Semantic identity** (Nat) - used for all operations (substitution, lookup, etc.) -2. **Display hint** (String) - used only for pretty printing and debugging - - -#### Substitution (Simplified) - -```lean -def substitute (e : LExpr T) (var : Nat) (replacement : LExpr T) : LExpr T := - match e with - | .bvar m index ty => if index == var then replacement else e - | .fvar m index ty => if index == var then replacement else e - | .abs m param body ty => - .abs m param (substitute body var replacement) ty - | .app m fn arg ty => - .app m (substitute fn var replacement) (substitute arg var replacement) ty - | .quant m kind vars body ty => - .quant m kind vars (substitute body var replacement) ty - -- ... other cases -``` - -**No index shifting needed!** Simple replacement throughout. - -#### Beta Reduction (Simplified) - -```lean -def betaReduce (e : LExpr T) : LExpr T := - match e with - | .app _ (.abs _ param body _) arg _ => - substitute body param arg - | _ => e -``` - -**No index adjustment!** Just substitute the parameter number with the argument. - -#### Alpha Conversion (Simplified) - -```lean -def alphaConvert (abs : LExpr T) : FreshM (LExpr T) := - match abs with - | .abs m param body ty => do - let newParam ← freshVar - let newBody := substitute body param (.bvar m newParam ty) - return .abs m newParam newBody ty - | _ => return abs -``` - -**No index adjustment!** Just generate a fresh number and substitute. - -### 4. Imperative Layer with Performant Ordering - -#### Cmd Definition - -```lean -inductive Cmd (P : PureExpr) : Type where - | init (hint : String) (var : Nat) (ty : P.Ty) (e : Option P.Expr) (md : MetaData P := .empty) - | set (var : Nat) (e : P.Expr) (md : MetaData P := .empty) - | havoc (var : Nat) (md : MetaData P := .empty) - | assert (label : String) (b : P.Expr) (md : MetaData P := .empty) - | assume (label : String) (b : P.Expr) (md : MetaData P := .empty) -``` - -**Key Points**: -- `init` stores both hint (String) for display AND var (Nat) for semantic identity -- `init` has optional expression: `some e` for initialized, `none` for havoc -- `set` and `havoc` reference variables by number only (hint is in VarContext) -- Variable numbers are assigned when `init` is executed -- **All constructors have `md : MetaData P` parameter** with default `.empty` - -**Rationale for Optional Init**: -- `init "x" 42 ty (some e)` - declares variable 42 with initial value e -- `init "x" 42 ty none` - declares variable 42 with havoc/uninitialized value -- Eliminates pattern of `init` followed by immediate `havoc` -- More concise and clearer intent - -**Design Principle**: -- Declaration nodes (`init`, `abs`, `quant`) store hints directly -- Reference nodes (`bvar`, `fvar`, `set`, `havoc`) use numbers only -- VarContext maintains the mapping for display purposes -- Metadata is preserved throughout all operations - - -#### Variable Context for Display - -```lean -structure VarContext (P : PureExpr) where - vars : Map Nat (String × P.Ty) -- number → (display name, type) - -namespace VarContext - -def empty : VarContext P := ⟨Map.empty⟩ - -def insert (ctx : VarContext P) (num : Nat) (name : String) (ty : P.Ty) : VarContext P := - ⟨ctx.vars.insert num (name, ty)⟩ - -def lookup (ctx : VarContext P) (num : Nat) : Option (String × P.Ty) := - ctx.vars.find? num - -def lookupName (ctx : VarContext P) (num : Nat) : Option String := - (ctx.lookup num).map (·.1) - -def lookupType (ctx : VarContext P) (num : Nat) : Option P.Ty := - (ctx.lookup num).map (·.2) - -end VarContext -``` - -#### Evaluation with Performant Ordering - -```lean -def Cmd.eval [EC : EvalContext P S] (σ : S) (c : Cmd P) : Cmd P × S := - match c with - | .init name ty (some e) md => do - let varNum := EC.nextVar σ - let e := EC.eval σ e - let σ := EC.addVar σ varNum ty e - let σ := EC.incrementNextVar σ - return (.init name ty (some e) md, σ) - - | .init name ty none md => do - let varNum := EC.nextVar σ - let (e, σ) := EC.genFreeVar σ ty - let σ := EC.addVar σ varNum ty e - let σ := EC.incrementNextVar σ - return (.init name ty none md, σ) - - | .set varNum e md => - match EC.lookupVar σ varNum with - | none => return (c, EC.updateError σ (.VarNotFound varNum)) - | some (ty, _) => - let e := EC.eval σ e - let σ := EC.updateVar σ varNum ty e - return (.set varNum e md, σ) - - | .havoc varNum md => - match EC.lookupVar σ varNum with - | none => return (c, EC.updateError σ (.VarNotFound varNum)) - | some (ty, _) => - let (e, σ) := EC.genFreeVar σ ty - let σ := EC.updateVar σ varNum ty e - return (.havoc varNum md, σ) -``` - - -### 5. Pretty Printing - -```lean -def formatVar (ctx : VarContext P) (varNum : Nat) : String := - match ctx.lookupName varNum with - | none => s!"%{varNum}" - | some name => - if name.isEmpty then s!"%{varNum}" - else - -- Count how many other variables share this name - let sameNameVars := ctx.vars.toList.filter (fun (_, (n, _)) => n == name) - if sameNameVars.length == 1 then name - else - -- Find position of this var among same-named vars - let sorted := sameNameVars.map (·.1) |>.sort - let pos := sorted.indexOf varNum - s!"{name}@{pos}" - -def formatCmd (ctx : VarContext P) (c : Cmd P) : String := - match c with - | .init name ty (some e) _ => - s!"init ({name} : {ty}) := {formatExpr ctx e}" - | .init name ty none _ => - s!"init ({name} : {ty})" - | .set varNum e _ => - s!"{formatVar ctx varNum} := {formatExpr ctx e}" - | .havoc varNum _ => - s!"havoc {formatVar ctx varNum}" - | .assert label b _ => - s!"assert {label}: {formatExpr ctx b}" - | .assume label b _ => - s!"assume {label}: {formatExpr ctx b}" -``` - -### 6. SMT Encoding - -```lean -structure SMTContext where - varNames : Map Nat String -- variable number → unique SMT name - nameCounters : Map String Nat -- base name → occurrence count - -namespace SMTContext - -def empty : SMTContext := ⟨Map.empty, Map.empty⟩ - -def addVar (ctx : SMTContext) (varNum : Nat) (baseName : String) : SMTContext × String := - let counter := ctx.nameCounters.findD baseName 0 - let smtName := if counter == 0 then baseName else s!"{baseName}@{counter}" - let ctx' := { - varNames := ctx.varNames.insert varNum smtName, - nameCounters := ctx.nameCounters.insert baseName (counter + 1) - } - (ctx', smtName) - -def lookupVar (ctx : SMTContext) (varNum : Nat) : Option String := - ctx.varNames.find? varNum - -end SMTContext -``` - - -### 7. Key Invariants - -#### Primary Invariant: All Variables Below nextFree - -```lean -def Program.allVarsBelow (p : Program α) : Prop := - ∀ v ∈ allVars p.content, v < p.nextFree -``` - -**Theorem**: All well-formed programs satisfy this invariant. - -#### Uniqueness Invariant: All Variables Distinct - -```lean -def Program.allVarsUnique (p : Program α) : Prop := - ∀ v₁ v₂ ∈ allVars p.content, v₁ ≠ v₂ → v₁ ≠ v₂ -``` - -**Theorem**: Fresh variable generation preserves uniqueness. - -#### Well-formedness Invariant: All Uses Have Declarations - -```lean -def Program.allUsesHaveDecls (p : Program α) : Prop := - ∀ use ∈ varUses p.content, ∃ decl ∈ varDecls p.content, use = decl -``` - -**Theorem**: Type checking ensures this invariant. - -### 8. Handling Source-Level Shadowing - -**Problem**: Source languages (Boogie, C_Simp) allow shadowing, but Performant Ordering makes it impossible. - -**Solution**: During parsing/translation, assign fresh numbers to all variables, even if they have the same source name. - -#### Example: Boogie Source with Shadowing - -```boogie -procedure Example() { - var x: int; - x := 1; - { - var x: int; // Shadows outer x - x := 2; // Refers to inner x - } - assert x == 1; // Refers to outer x -} -``` - -#### Translation to Performant Ordering - -```lean --- During translation, maintain a scope stack --- Outer scope: x → 42 --- Inner scope: x → 73 - -init "x" 42 int (some 1) -- Outer x gets number 42 -set 42 (const 1) --- Enter inner scope -init "x" 73 int (some 2) -- Inner x gets number 73 (fresh!) -set 73 (const 2) -- Refers to 73 --- Exit inner scope -assert "outer_x" (42 == 1) -- Refers to 42 -``` - -#### Scope Stack During Translation - -```lean -structure ScopeStack where - scopes : List (Map String Nat) -- Stack of name → number mappings - -namespace ScopeStack - -def empty : ScopeStack := ⟨[Map.empty]⟩ - -def pushScope (s : ScopeStack) : ScopeStack := - ⟨Map.empty :: s.scopes⟩ - -def popScope (s : ScopeStack) : ScopeStack := - ⟨s.scopes.tail⟩ - -def lookup (s : ScopeStack) (name : String) : Option Nat := - -- Search from innermost to outermost scope - s.scopes.findSome? (fun scope => scope.find? name) - -def insert (s : ScopeStack) (name : String) (num : Nat) : ScopeStack := - match s.scopes with - | [] => ⟨[Map.empty.insert name num]⟩ - | scope :: rest => ⟨scope.insert name num :: rest⟩ - -end ScopeStack -``` - -#### Translation Algorithm - -```lean -def translateStmt (stmt : BoogieStmt) : FreshM (ScopeStack → Cmd × ScopeStack) := - match stmt with - | .varDecl name ty init => - fun scopes => do - let varNum ← freshVar - let scopes' := scopes.insert name varNum - let cmd := .init name varNum ty (translateExpr init scopes) - return (cmd, scopes') - - | .assign name expr => - fun scopes => - match scopes.lookup name with - | none => error s!"Variable {name} not found" - | some varNum => - let cmd := .set varNum (translateExpr expr scopes) - return (cmd, scopes) - - | .block stmts => - fun scopes => do - let scopes' := scopes.pushScope - let (cmds, scopes'') ← translateStmts stmts scopes' - let scopes''' := scopes''.popScope - return (.block cmds, scopes''') -``` - -**Key Insight**: Source-level shadowing is resolved during translation by assigning different numbers to variables with the same name. The Performant Ordering representation has no shadowing - each variable has a unique number. - -#### User-Facing Syntax for Referring to Shadowed Variables - -**Problem**: How does a user refer to a specific variable when multiple have the same hint? - -**Solution**: Use disambiguation syntax in source code (if needed): - -```boogie -// Option 1: Explicit numbering (for debugging/advanced use) -var x: int; // Gets number 42 -x := 1; -{ - var x: int; // Gets number 73 - x := 2; - x@42 := 3; // Explicitly refer to outer x (rare, for debugging) -} - -// Option 2: Standard scoping (normal case) -// Users write normal code, translator handles numbering -var x: int; -x := 1; -{ - var x: int; - x := 2; // Translator knows this is the inner x -} -assert x == 1; // Translator knows this is the outer x -``` - -**Implementation Note**: The `@N` syntax is optional and primarily for: -1. Debugging output (showing which variable is which) -2. Advanced users who want explicit control -3. Generated code that needs to be precise - -Normal users write standard scoped code, and the translator handles the numbering automatically. - -### 9. Names vs Indices Design Principle - -**Core Principle**: Commands use indices for implementation, specifications use names for user interface. - -#### When to Use Names - -1. **Variable declarations** (`init` hint parameter) - For display and debugging -2. **Procedure specifications** (`modifies` clauses) - Users write names -3. **Error messages** - Display names to users -4. **Pretty printing** - Show names to users -5. **Source code** - Users write names - -#### When to Use Indices - -1. **Variable references** (`set`, `havoc`, `bvar`, `fvar`) - For semantics -2. **modifiedVars return value** - Returns list of indices -3. **Evaluation** - Look up values by index -4. **Type checking** - Look up types by index -5. **Transformations** - Track variables by index - -#### Resolution Pattern - -When specifications (names) need to interact with implementation (indices): - -```lean --- Pattern 1: Validate procedure modifies clause -def validateModifies (proc : Procedure) (body : Cmd) (ctx : VarContext) : Bool := - let modifiedIndices := body.modifiedVars - let modifiedNames := modifiedIndices.filterMap (ctx.lookupName ·) - modifiedNames.all (fun name => name ∈ proc.modifies ∨ name ∈ proc.outputs) - --- Pattern 2: Generate havoc statements for loop elimination -def generateLoopHavocs (body : Cmd) (ctx : VarContext) : FreshM (List Cmd) := do - let modifiedIndices := body.modifiedVars - return modifiedIndices.map (fun idx => .havoc idx) - --- Pattern 3: Error messages with names -def formatError (err : Error) (ctx : VarContext) : String := - match err with - | .VarNotFound idx => - match ctx.lookupName idx with - | some name => s!"Variable '{name}' (#{idx}) not found" - | none => s!"Variable #{idx} not found" -``` - -### 10. modifiedVars and definedVars - -**Critical Feature**: Track which variables are modified by commands. - -#### modifiedVars Implementation - -```lean -def Cmd.modifiedVars (c : Cmd P) : List Nat := - match c with - | .init _ _ _ _ _ => [] - | .set var _ _ => [var] - | .havoc var _ => [var] - | .assert _ _ _ => [] - | .assume _ _ _ => [] - -def Cmds.modifiedVars (cs : List (Cmd P)) : List Nat := - cs.flatMap Cmd.modifiedVars -``` - -**Key Points**: -- Returns **indices**, not names -- `init` returns empty list (declares, doesn't modify) -- `set` and `havoc` return the modified variable's index -- Consumers resolve indices to names when needed - -#### definedVars Implementation - -```lean -def Cmd.definedVars (c : Cmd P) : List String := - match c with - | .init hint _ _ _ _ => [hint] - | .set _ _ _ => [] - | .havoc _ _ => [] - | .assert _ _ _ => [] - | .assume _ _ _ => [] -``` - -**Key Points**: -- Returns **names** (hints), not indices -- Only `init` defines new variables -- Used for tracking variable declarations - -#### Use Cases - -1. **Loop Elimination**: -```lean -def eliminateLoop (loop : Loop) (ctx : VarContext) : FreshM Cmd := do - let modifiedIndices := loop.body.modifiedVars - let havocs := modifiedIndices.map (fun idx => .havoc idx) - -- Generate havoc statements for all modified variables - return .block havocs -``` - -2. **Procedure Validation**: -```lean -def validateProcedure (proc : Procedure) (ctx : VarContext) : Bool := - let modifiedIndices := proc.body.modifiedVars - let modifiedNames := modifiedIndices.filterMap (ctx.lookupName ·) - -- Check that all modified variables are in modifies clause or outputs - modifiedNames.all (fun name => - name ∈ proc.modifies ∨ name ∈ proc.outputs) -``` - -3. **Definedness Checking**: -```lean -def isDefinedOver (cmd : Cmd) (store : Store) : Bool := - let modifiedIndices := cmd.modifiedVars - modifiedIndices.all (fun idx => store.contains idx) -``` - -### 11. Global Variables and VarContext Initialization - -**Problem**: Global variables need to be accessible in all procedures. - -**Solution**: Initialize VarContext with globals before translating procedures. - -#### Global Variable Initialization - -```lean -structure VarContext (P : PureExpr) where - vars : Map Nat (String × P.Ty) -- number → (hint, type) - -namespace VarContext - -def empty : VarContext P := ⟨Map.empty⟩ - -def insert (ctx : VarContext P) (num : Nat) (hint : String) (ty : P.Ty) : VarContext P := - ⟨ctx.vars.insert num (hint, ty)⟩ - -def lookup (ctx : VarContext P) (num : Nat) : Option (String × P.Ty) := - ctx.vars.find? num - -def lookupName (ctx : VarContext P) (num : Nat) : Option String := - (ctx.lookup num).map (·.1) - -def lookupType (ctx : VarContext P) (num : Nat) : Option P.Ty := - (ctx.lookup num).map (·.2) - --- Initialize VarContext with global variables -def fromGlobals (globals : List (String × P.Ty)) : FreshM (VarContext P) := do - let mut ctx := empty - for (name, ty) in globals do - let varNum ← freshVar - ctx := ctx.insert varNum name ty - return ctx - -end VarContext -``` - -#### Program Translation with Globals - -```lean -def translateProgram (prog : BoogieProgram) : FreshM StrataProgram := do - -- Step 1: Assign numbers to globals - let globalVars := prog.globals.map (fun g => (g.name, g.type)) - let ctx ← VarContext.fromGlobals globalVars - - -- Step 2: Translate procedures with global context - let procedures ← prog.procedures.mapM (translateProcedure ctx) - - return { - globals := globalVars, - procedures := procedures - } -``` - -**Key Insight**: Globals get fresh numbers first, then procedures can reference them by those numbers. - ---- - -## Implementation Strategy - -**Implementation Order**: We will implement changes to the **Lambda Calculus layer first**, then the Imperative layer, then translations and backends. This order is chosen because: -1. Lambda calculus is the foundation - both bvar and fvar changes happen here -2. Imperative layer depends on fvar being updated -3. Translations depend on both layers being complete - -### Phase 1: Core Infrastructure (Foundation) - -**Goal**: Establish the basic structures and monad without breaking existing code. - -**Tasks**: -1. Define `Program α` structure -2. Implement `FreshM` monad -3. Prove basic lemmas about `freshVar` -4. Create `VarContext` structure -5. No changes to existing AST yet - -**Success Criteria**: New code compiles, existing code unchanged. - -### Phase 2: Lambda Calculus Migration (FIRST IMPLEMENTATION PHASE) - -**Goal**: Update LExpr to use Performant Ordering. - -**Why First**: Lambda calculus is the foundation. Both `bvar` (lambda-bound variables) and `fvar` (imperative variables) are defined in LExpr, so we must update this layer before the imperative layer can use the new fvar representation. - -**Tasks**: -1. Change `LExpr.bvar` to use `Nat` + hint (remove De Bruijn relative indexing) -2. Change `LExpr.fvar` to use `Nat` + hint (remove `Identifier`) -3. Update `LExpr.abs` to store parameter number + hint -4. Update `LExpr.quant` to store bound variable numbers + hints -5. Remove all index shifting functions -6. Simplify substitution (no shifting) -7. Simplify beta reduction (no adjustment) -8. Update evaluation to use number-based lookup -9. Update type checking to use number-based lookup - -**Success Criteria**: Lambda calculus layer compiles with simplified operations. - -### Phase 3: Imperative Layer Migration (SECOND IMPLEMENTATION PHASE) - -**Goal**: Update Cmd to use Performant Ordering. - -**Why Second**: Imperative layer uses `fvar` from LExpr, so it must be updated after the lambda calculus layer is complete. - -**Tasks**: -1. Update `Cmd.init` to store hint + var number + optional expression -2. Update `Cmd.set` to use `Nat` variable number -3. Update `Cmd.havoc` to use `Nat` variable number -4. Update `Cmd.modifiedVars` to return `List Nat` -5. Update evaluation to track nextFree -6. Update type checking to use number-based lookup -7. Update semantics to use number-based operations -8. Implement pretty printing with VarContext - -**Success Criteria**: Imperative layer compiles with number-based variables. - -### Phase 4: Translation and Backend Updates (THIRD IMPLEMENTATION PHASE) - -**Goal**: Update all language frontends and backends. - -**Why Third**: Translations depend on both lambda calculus and imperative layers being complete. - -**Tasks**: -1. Update Boogie translation to assign fresh numbers -2. Update C_Simp translation to assign fresh numbers -3. Update SMT encoding to use SMTContext -4. Update all transformations to use FreshM -5. Update loop elimination to use fresh numbers -6. Update procedure inlining to use fresh numbers - -**Success Criteria**: All translations and backends compile. - -### Phase 5: Proof Restoration (FINAL PHASE) - -**Goal**: Restore all proofs to working state. - -**Tasks**: -1. Remove all De Bruijn shifting lemmas -2. Prove freshness lemmas for FreshM -3. Prove substitution correctness (simplified) -4. Prove transformation correctness -5. Prove invariant preservation -6. Update all semantic proofs - -**Success Criteria**: All proofs compile, no `sorry` remains. - -### Phase 6: Testing and Validation - -**Goal**: Ensure all tests pass. - -**Tasks**: -1. Run `lake build` - must succeed -2. Run `lake test` - must pass all tests -3. Update test expectations if needed -4. Verify examples still work -5. Performance testing - -**Success Criteria**: Full test suite passes. - ---- - -## Testing Strategy - -### Unit Testing - -**Approach**: Update existing tests to use Performant Ordering. - -**Test Categories**: -1. **Fresh variable generation**: Verify uniqueness -2. **Substitution**: Verify correctness without shifting -3. **Beta reduction**: Verify correctness without adjustment -4. **Evaluation**: Verify number-based lookup works -5. **Type checking**: Verify number-based type lookup works -6. **Pretty printing**: Verify display names are correct - - -### Property-Based Testing - -**Properties to Test**: - -1. **Freshness Property**: - ```lean - ∀ (p : Program α) (m : FreshM β), - let (_, nextFree') := m.run p.nextFree - ∀ v ∈ allVars p.content, v < nextFree' - ``` - -2. **Uniqueness Property**: - ```lean - ∀ (p : Program α), - p.allVarsUnique → (transform p).allVarsUnique - ``` - -3. **Substitution Property**: - ```lean - ∀ (e : LExpr) (v : Nat) (arg : LExpr), - semantics (substitute e v arg) = semantics e [v ↦ semantics arg] - ``` - -4. **Beta Reduction Property**: - ```lean - ∀ (param : Nat) (body arg : LExpr), - semantics (betaReduce (.app (.abs param body) arg)) = - semantics (substitute body param arg) - ``` - -### Integration Testing - -**Test Scenarios**: -1. Parse Boogie program → Translate → Verify -2. Complex transformations (call elimination, loop elimination) -3. Multi-stage transformation pipelines -4. SMT encoding and solving - -### Regression Testing - -**Approach**: All existing tests must pass with same semantics. - -**Test Files to Update**: -- `StrataTest/DL/Lambda/*.lean` -- `StrataTest/DL/Imperative/*.lean` -- `StrataTest/Languages/Boogie/*.lean` -- `StrataTest/Transform/*.lean` - ---- - -## Migration Plan - -### Compiler-Driven Approach - -**Principle**: Make breaking changes, then fix what the compiler complains about. - -### Step 1: Create Infrastructure (No Breaking Changes) - -1. Create `Strata/Core/Program.lean` with `Program α` structure -2. Create `Strata/Core/FreshM.lean` with monad and lemmas -3. Create `Strata/Core/VarContext.lean` with display context -4. Verify these compile independently - -### Step 2: Break Lambda Calculus (Controlled Break) - -1. Update `LExpr.bvar` to use `Nat` -2. Update `LExpr.fvar` to use `Nat` -3. Update `LExpr.abs` to store parameter number -4. Run `lake build` to identify all broken files -5. Create `broken-files-lambda.md` to track them - - -### Step 3: Fix Lambda Calculus Files - -For each broken file: -1. Remove index shifting logic -2. Simplify substitution to direct replacement -3. Update evaluation to use number lookup -4. Update type checking to use number lookup -5. If proofs break, add `sorry --TODO: Restore proof` and document in `broken-proofs.md` -6. Continue until `lake build` succeeds for lambda layer - -### Step 4: Break Imperative Layer (Controlled Break) - -1. Update `Cmd.set` to use `Nat` -2. Update `Cmd.havoc` to use `Nat` -3. Run `lake build` to identify broken files -4. Create `broken-files-imperative.md` to track them - -### Step 5: Fix Imperative Layer Files - -For each broken file: -1. Update evaluation to track nextFree -2. Update type checking to use number lookup -3. Update semantics to use number operations -4. If proofs break, add `sorry --TODO: Restore proof` and document -5. Continue until `lake build` succeeds for imperative layer - -### Step 6: Update Translations and Backends - -1. Update Boogie translation to use FreshM -2. Update C_Simp translation to use FreshM -3. Update SMT encoding to use SMTContext -4. Update all transformations to use FreshM -5. Fix compilation errors as they arise - -### Step 7: Restore Proofs - -1. Review `broken-proofs.md` -2. For each broken proof: - - Remove `sorry` - - Prove freshness using FreshM lemmas - - Prove substitution correctness (simpler than before) - - Prove transformation correctness -3. Continue until no `sorry` remains - -### Step 8: Test and Validate - -1. Run `lake build` - must succeed -2. Run `lake test` - fix any failures -3. Update test expectations if needed -4. Verify all examples work -5. Final validation - ---- - -## Success Criteria - -### Compilation - -- [ ] `lake build` succeeds with no errors -- [ ] No compilation warnings related to variable numbering -- [ ] All modules compile successfully - -### Testing - -- [ ] `lake test` passes all tests -- [ ] No test failures -- [ ] Test coverage maintained or improved - - -### Proofs - -- [ ] No `sorry` placeholders remain -- [ ] All freshness lemmas proven -- [ ] All substitution lemmas proven -- [ ] All transformation correctness proofs restored -- [ ] All invariant preservation proofs complete - -### Code Quality - -- [ ] All De Bruijn index shifting code removed -- [ ] All De Bruijn weakening code removed -- [ ] All De Bruijn lifting code removed -- [ ] Code is simpler and more maintainable -- [ ] Documentation updated - -### Functionality - -- [ ] All examples work correctly -- [ ] Boogie translation works -- [ ] C_Simp translation works -- [ ] SMT encoding works -- [ ] Verification produces correct results - ---- - -## Appendix A: Key Differences from De Bruijn Indices - -| Aspect | De Bruijn Indices | Performant Ordering | -|--------|-------------------|---------------------| -| **Variable Identity** | Relative (distance from binder) | Absolute (unique number) | -| **Substitution** | Requires index shifting | Simple replacement | -| **Beta Reduction** | Requires index adjustment | Simple substitution | -| **Alpha Conversion** | Requires index renormalization | Generate fresh number | -| **Shadowing** | Possible (same relative index) | Impossible (unique numbers) | -| **Transformation** | Variables change identity | Variables stable | -| **Proof Complexity** | High (shifting lemmas) | Low (freshness lemmas) | -| **Canonical Form** | Yes (unique representation) | No (alpha-equivalent terms differ) | -| **Strata Fit** | Poor (transformation-heavy) | Excellent (transformation-friendly) | - ---- - -## Appendix B: Example Transformations - -### Example 1: Beta Reduction - -**De Bruijn Approach**: -```lean --- (λ. body[0]) arg --- Need to shift arg's free variables up by 1 --- Then substitute 0 with shifted arg --- Then shift result's free variables down by 1 -betaReduce (.app (.abs body) arg) = - let shiftedArg := shift arg 1 - let substituted := subst body 0 shiftedArg - shift substituted (-1) -``` - -**Performant Ordering Approach**: -```lean --- (λ x₄₂. body) arg --- Just substitute 42 with arg, no shifting! -betaReduce (.app (.abs 42 body) arg) = - substitute body 42 arg -``` - - -### Example 2: Procedure Inlining - -**De Bruijn Approach**: -```lean --- Inline: call y := f(x) --- Need to: --- 1. Shift all indices in f's body --- 2. Substitute parameters with shifted arguments --- 3. Shift result back --- 4. Handle variable capture carefully -inlineProcedure (call y f [x]) = - let body := f.body - let shiftedBody := shiftIndices body (currentDepth) - let substituted := substParams shiftedBody f.params [x] - let result := shiftIndices substituted (-currentDepth) - result -``` - -**Performant Ordering Approach**: -```lean --- Inline: call y := f(x) --- Just generate fresh numbers for f's locals and substitute! -inlineProcedure (call y f [x]) : FreshM Stmt := do - let freshLocals ← freshVars f.locals.length - let body := f.body - let substituted := substParams body f.params [x] - let renamed := renameLocals substituted f.locals freshLocals - return renamed -``` - -### Example 3: Loop Elimination - -**De Bruijn Approach**: -```lean --- Eliminate: while guard { body } --- Need to track which indices are modified --- Shift indices when generating havoc statements --- Complex index management -eliminateLoop (while guard body) = - let modifiedIndices := getModifiedIndices body - let shiftedIndices := shiftIndices modifiedIndices (currentDepth) - let havocs := generateHavocs shiftedIndices - -- ... complex index management -``` - -**Performant Ordering Approach**: -```lean --- Eliminate: while guard { body } --- Modified variables have stable numbers, just havoc them! -eliminateLoop (while guard body) = - let modifiedVars := getModifiedVars body -- Returns List Nat - let havocs := modifiedVars.map (fun v => .havoc v) - -- ... simple, no index management needed -``` - ---- - -## Appendix C: Proof Strategy Examples - -### Freshness Proof Pattern - -```lean -theorem transform_preserves_freshness (p : Program α) : - p.allVarsBelow → - (transform p).allVarsBelow := by - intro h - unfold transform - -- Show that FreshM generates variables >= p.nextFree - have fresh : ∀ v ∈ newVars, v >= p.nextFree := by - apply freshVar_generates_above - -- Show that old variables are unchanged - have old : ∀ v ∈ oldVars, v < p.nextFree := by - apply h - -- Combine to show all variables below new nextFree - omega -``` - - -### Substitution Correctness Proof Pattern - -```lean -theorem substitute_preserves_semantics (e : LExpr) (v : Nat) (arg : LExpr) : - semantics (substitute e v arg) = semantics e [v ↦ semantics arg] := by - induction e with - | bvar m index ty => - simp [substitute] - split - · -- index == v, substitution happens - simp [semantics] - · -- index != v, no substitution - simp [semantics] - | abs m param body ty ih => - simp [substitute, semantics] - -- No index shifting needed! - apply ih - | app m fn arg ty ih_fn ih_arg => - simp [substitute, semantics] - rw [ih_fn, ih_arg] - -- ... other cases, all straightforward -``` - -**Key Insight**: No shifting lemmas needed! Each case is simple. - -### Uniqueness Preservation Proof Pattern - -```lean -theorem transform_preserves_uniqueness (p : Program α) : - p.allVarsUnique → - (transform p).allVarsUnique := by - intro h - unfold transform - -- Show that fresh variables are distinct from old variables - have fresh_distinct : ∀ v₁ ∈ newVars, ∀ v₂ ∈ oldVars, v₁ ≠ v₂ := by - intros v₁ h₁ v₂ h₂ - have : v₁ >= p.nextFree := freshVar_generates_above h₁ - have : v₂ < p.nextFree := h h₂ - omega - -- Show that fresh variables are distinct from each other - have fresh_unique : ∀ v₁ v₂ ∈ newVars, v₁ ≠ v₂ → v₁ ≠ v₂ := by - apply freshVars_generates_unique - -- Combine - intros v₁ v₂ h_in₁ h_in₂ h_neq - cases h_in₁ <;> cases h_in₂ - · apply h -- both old - · apply fresh_distinct -- one old, one new - · apply fresh_distinct.symm -- one new, one old - · apply fresh_unique -- both new -``` - ---- - -## Appendix D: File-by-File Migration Checklist - -### Core Files - -- [ ] `Strata/Core/Program.lean` - Create new -- [ ] `Strata/Core/FreshM.lean` - Create new -- [ ] `Strata/Core/VarContext.lean` - Create new - -### Lambda Calculus Layer - -- [ ] `Strata/DL/Lambda/LExpr.lean` - Update bvar, fvar, abs, quant -- [ ] `Strata/DL/Lambda/LExprEval.lean` - Remove shifting, simplify substitution -- [ ] `Strata/DL/Lambda/LExprType.lean` - Update type checking -- [ ] `Strata/DL/Lambda/LExprWF.lean` - Remove shifting lemmas -- [ ] `Strata/DL/Lambda/Semantics.lean` - Update semantics -- [ ] `Strata/DL/Lambda/Scopes.lean` - Update scope handling - - -### Imperative Layer - -- [ ] `Strata/DL/Imperative/Cmd.lean` - Update set, havoc -- [ ] `Strata/DL/Imperative/CmdEval.lean` - Update evaluation -- [ ] `Strata/DL/Imperative/CmdType.lean` - Update type checking -- [ ] `Strata/DL/Imperative/CmdSemantics.lean` - Update semantics -- [ ] `Strata/DL/Imperative/Stmt.lean` - Update statement handling -- [ ] `Strata/DL/Imperative/StmtSemantics.lean` - Update statement semantics -- [ ] `Strata/DL/Imperative/SemanticsProps.lean` - Update semantic properties - -### Boogie Language - -- [ ] `Strata/Languages/Boogie/Statement.lean` - Update translation -- [ ] `Strata/Languages/Boogie/Procedure.lean` - Update procedure handling -- [ ] `Strata/Languages/Boogie/Program.lean` - Update program structure -- [ ] `Strata/Languages/Boogie/Verifier.lean` - Update verification -- [ ] `Strata/Languages/Boogie/SMTEncoder.lean` - Update SMT encoding -- [ ] `Strata/Languages/Boogie/BoogieGen.lean` - Update code generation - -### Transformations - -- [ ] `Strata/Transform/CallElim.lean` - Use FreshM -- [ ] `Strata/Transform/CallElimCorrect.lean` - Update proofs -- [ ] `Strata/Transform/LoopElim.lean` - Use FreshM -- [ ] `Strata/Transform/DetToNondet.lean` - Use FreshM -- [ ] `Strata/Transform/DetToNondetCorrect.lean` - Update proofs -- [ ] `Strata/Transform/ProcedureInlining.lean` - Use FreshM - -### Tests - -- [ ] `StrataTest/DL/Lambda/*.lean` - Update all lambda tests -- [ ] `StrataTest/DL/Imperative/*.lean` - Update all imperative tests -- [ ] `StrataTest/Languages/Boogie/*.lean` - Update all Boogie tests -- [ ] `StrataTest/Transform/*.lean` - Update all transformation tests - ---- - -## Appendix E: Common Pitfalls and Solutions - -### Pitfall 1: Forgetting to Thread nextFree - -**Problem**: Generating multiple fresh variables without threading state. - -**Wrong**: -```lean -def transform (e : Expr) : Expr := - let v1 := freshVar -- Wrong! No state threading - let v2 := freshVar -- Wrong! Will get same number - ... -``` - -**Right**: -```lean -def transform (e : Expr) : FreshM Expr := do - let v1 ← freshVar -- Correct! State threaded - let v2 ← freshVar -- Correct! Gets next number - ... -``` - - -### Pitfall 2: Mixing Variable Numbers and Display Names - -**Problem**: Using display names for semantic operations. - -**Wrong**: -```lean -def lookupVar (env : Env) (name : String) : Option Value := - env.find? name -- Wrong! Names are for display only -``` - -**Right**: -```lean -def lookupVar (env : Env) (varNum : Nat) : Option Value := - env.find? varNum -- Correct! Use numbers for semantics -``` - -### Pitfall 3: Not Updating nextFree After Fresh Generation - -**Problem**: Generating fresh variable but not incrementing counter. - -**Wrong**: -```lean -def addVar (p : Program α) : Program α := - let varNum := p.nextFree - ⟨addVarToContent p.content varNum, p.nextFree⟩ -- Wrong! nextFree not updated -``` - -**Right**: -```lean -def addVar (p : Program α) : Program α := - let varNum := p.nextFree - ⟨addVarToContent p.content varNum, p.nextFree + 1⟩ -- Correct! -``` - -### Pitfall 4: Trying to Shift Indices - -**Problem**: Old habits from De Bruijn indices. - -**Wrong**: -```lean -def substitute (e : Expr) (v : Nat) (arg : Expr) : Expr := - match e with - | .abs param body => - let shiftedArg := shift arg 1 -- Wrong! No shifting needed - .abs param (substitute body v shiftedArg) -``` - -**Right**: -```lean -def substitute (e : Expr) (v : Nat) (arg : Expr) : Expr := - match e with - | .abs param body => - .abs param (substitute body v arg) -- Correct! Just recurse -``` - ---- - -## Appendix F: Performance Considerations - -### Variable Lookup - -**Concern**: Map lookup by number vs. by name. - -**Analysis**: -- Number lookup: O(log n) with balanced tree -- Name lookup: O(log n) with balanced tree -- **No performance difference** - -**Benefit**: Numbers are smaller (Nat vs String), better cache locality. - -### Fresh Variable Generation - -**Concern**: Incrementing counter vs. generating unique names. - -**Analysis**: -- Counter increment: O(1) -- Unique name generation: O(n) to check uniqueness -- **Performant Ordering is faster** - - -### Substitution - -**Concern**: Traversing entire AST for substitution. - -**Analysis**: -- De Bruijn: O(n) traversal + O(n) shifting = O(n) -- Performant Ordering: O(n) traversal = O(n) -- **Performant Ordering is faster** (no shifting overhead) - -### Memory Usage - -**Concern**: Storing numbers vs. storing names. - -**Analysis**: -- Numbers: 8 bytes per variable (Nat) -- Names: 8+ bytes per variable (String pointer + data) -- **Performant Ordering uses less memory** - -**Conclusion**: Performant Ordering is equal or better in all performance aspects. - ---- - -## Appendix G: Frequently Asked Questions - -### Q1: Why not use De Bruijn indices? - -**A**: De Bruijn indices are great for canonical representation, but Strata is transformation-heavy. Variables changing identity during transformations makes proofs complex. Performant Ordering keeps variables stable, simplifying everything. - -### Q2: What about alpha-equivalence? - -**A**: We lose syntactic canonicity, but we don't need it. Strata cares about semantic equivalence, not syntactic uniqueness. For verification, semantic equivalence is what matters. - -### Q3: How do we handle variable capture? - -**A**: Variable capture is impossible by construction! All variables have unique numbers, so there's nothing to capture. This is a huge simplification. - -### Q4: What about pretty printing? - -**A**: We maintain a VarContext that maps numbers to display names. Pretty printing resolves numbers to names. Users see readable names, but semantics use numbers. - -### Q5: How do we handle globals? - -**A**: Globals are just variables that get assigned numbers first. Start with nextFree = 0, assign globals numbers 0, 1, 2, ... using freshVar, then continue with nextFree at N (where N is the number of globals). All subsequent variables (locals, parameters, lambda-bound) get numbers >= N using the same counter. - -### Q6: What about procedure parameters? - -**A**: When declaring a procedure, assign fresh numbers to all parameters. When calling a procedure, substitute arguments for parameter numbers. When inlining, generate fresh numbers for all locals. - -### Q7: How do we handle quantifiers? - -**A**: Quantifiers store the list of bound variable numbers. When evaluating, bind those numbers in the environment. No relative indexing needed. - -### Q8: What about performance? - -**A**: Performant Ordering is equal or better in all aspects: lookup speed, generation speed, memory usage, and substitution speed. - - ---- - -## Appendix H: References and Related Work - -### Performant Ordering in Other Systems - -- **SSA Form**: Static Single Assignment uses unique variable numbers -- **ANF**: Administrative Normal Form uses unique names -- **CPS**: Continuation-Passing Style uses unique names -- **LLVM IR**: Uses numbered registers (%0, %1, %2, ...) - -### Why Strata Needs Performant Ordering - -1. **Multi-stage transformations**: Boogie → Imperative → SMT → Solver -2. **Transformation-heavy**: Call elimination, loop elimination, inlining -3. **Verification focus**: Semantic equivalence matters, not syntactic canonicity -4. **Proof engineering**: Simpler proofs = more maintainable system - -### Academic References - -- Pierce, B. C. (2002). *Types and Programming Languages*. MIT Press. - - Chapter on De Bruijn indices and their limitations -- Appel, A. W. (1998). *Modern Compiler Implementation in ML*. Cambridge University Press. - - Chapter on SSA form and unique naming -- Leroy, X. (2009). "Formal verification of a realistic compiler". *Communications of the ACM*. - - CompCert uses unique names for intermediate representations - ---- - -## Conclusion - -This specification describes the migration of Strata to **Performant Ordering** - a unified variable numbering scheme using globally unique natural numbers. This change simplifies the entire system: - -- **Lambda calculus**: No more index shifting, weakening, or lifting -- **Imperative layer**: Unambiguous variable references -- **Transformations**: Variables maintain stable identity -- **Proofs**: Simpler freshness lemmas replace complex shifting lemmas -- **Strata-optimized**: Perfect fit for transformation-heavy verification pipelines - -The migration follows a compiler-driven approach: make breaking changes, fix what breaks, restore proofs, validate tests. The result is a simpler, more maintainable, and more transformation-friendly system. - -**Next Steps**: Begin implementation with Phase 1 (Core Infrastructure). - ---- - -## Document History - -| Version | Date | Author | Changes | -|---------|------|--------|---------| -| 1.0 | 2026-01-19 | Kiro AI | Initial specification | - ---- - -## Approval - -This specification requires review and approval before implementation begins. - -- [ ] Technical Lead Review -- [ ] Architecture Review -- [ ] Implementation Team Review -- [ ] Approved for Implementation - ---- - -*End of Specification*