Skip to content

Decompose symbolic evaluation into independent phases (constant propagation + tree-of-if * restructuring) #1194

@MikaelMayer

Description

@MikaelMayer

Current pipeline

Code references below are pinned to commit 7b72423c4e.

The symbolic evaluation phase (Core.toCoreProofObligationProgram, registered as symbolicEvalPhase) does several things at once:

  1. Walks statements, unconditionally inlining every variable's value at every use site (LExprEval.lean:265).
  2. Evaluates expressions where possible.
  3. Restructures the program into a tree whose leaves are sequences of assume/assert/cover/init and whose only branching is non-deterministic if * — exactly the shape ObligationExtraction.extractObligations expects (no if * { } else { } is ever followed by another statement, so the result is a true tree to verify, not a DAG).

Because step 1 inlines aggressively, step 3 produces lots of duplication. The anfEncoderPipelinePhase runs after to detect duplicated subexpressions and re-extract them under fresh $__anf.N names (pipeline order).

Problem

The ANF phase is expensive and exists primarily to clean up the inlining damage from symbolic eval. Most of that inlining never enabled simplification — the variable's expression just got copied into every use site, and ANF then re-extracts it.

Proposed decomposition

Split the current symbolic eval into two independent phases, plus a companion:

  1. Constant/value-propagation phase: walks statements like symbolic execution does, tracking what each variable currently denotes (or marking it as unknown after havoc, after a loop, or when an if-then-else updates it non-uniformly). Inlines a variable's expression at a use site only if doing so enables simplification (full or partial evaluation). Otherwise the variable reference is preserved.

  2. Verification-tree restructuring phase: takes a (possibly already-simplified) Core program and produces the top-level if * tree shape that ObligationExtraction expects — branches contain only assume/assert/cover/init, and no if * { } else { } is ever followed by another statement, ensuring the result is a tree (not a DAG). This phase should mimic what symbolic evaluation does today, including the path-capping mechanism (see pathCap in Options.lean:194 and Env.merge / findCondPairs in StatementEval.lean) that bounds the number of branches by merging them.

  3. Dead-code elimination phase: removes init declarations whose variable no longer appears anywhere after propagation. No such phase exists today.

Benefits

  • ANF can mostly be avoided: when constant propagation only inlines what simplifies, there is far less duplication for ANF to clean up. ANF stays available for genuine common subexpressions (e.g., the same expression written twice in source, or arising from procedure inlining).
  • Better fit for incremental SMT: with the verification-tree shape produced as an explicit, reusable phase, the verifier can later walk the tree and dispatch --incremental solver calls along branches (push/pop along shared prefixes) instead of materialising independent proof obligations via extractObligations. This complements Make --incremental the default, deprecate or remove the batch mode #1187 (making --incremental the default).
  • Composable for other analyses: constant propagation and dead-code elimination are useful outside the SMT pipeline. Today they're entangled with obligation generation.

Out of scope

  • Changing Imperative command semantics or the Lambda evaluator's contract.
  • Removing the ANF phase.
  • Implementing the incremental-solver walk itself (this issue only enables it).

Related issues

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions