Skip to content

Asymmetric translator handling of nondet init produces a precision mismatch with CBMC #1186

@PROgram52bc

Description

@PROgram52bc

Issue: asymmetric translator handling of nondet init produces a precision mismatch with CBMC

Status: present on origin/main (HEAD: 53a3df53).

Affects: Strata Core programs with uninitialized var declarations that get read before any subsequent assignment. CBMC flags an uninitialized-read diagnostic that the source semantics treats as a defined value — a false positive, not an unsoundness.

Concrete end-to-end reproducer

program Core;

procedure nondet_init_read (out r : int)
spec { ensures (r == r); }
{
  var x : int;
  r := x;
  assert [tautology]: (r == r);
};

var x : int; parses to Core.Statement.init id tp .nondet md.

Run:

lake exe StrataToCBMC <input.core.st> > out.json
symtab2gb out.json -o out.gb
cbmc out.gb --uninitialized-check

Expected (source semantics): assert passes; no diagnostic. Observed (CBMC on translator output): uninitialized-read diagnostic on x.

Framing

This is not a "semantic mismatch" claim between two formal semantics — GOTO has no formal Lean-checked semantics in the repo. The issue is:

  1. An asymmetry inside the translator. Cmd.toGotoInstructions handles .init _ _ .nondet _ differently from .set _ .nondet _ even though the user-level intent ("a non-deterministically chosen value") is the same.
  2. A precision mismatch against CBMC's documented runtime behavior. The GOTO output, when run by CBMC, doesn't fulfill the post-state binding that the source semantics promises — CBMC sees the variable as unbound and flags reads as undefined, while the source semantics treats the variable as bound. CBMC is more conservative than the source: it rejects (or warns on) programs the source accepts. This is a false-positive surface, not unsoundness — but it does mean a user reasoning about Strata source is surprised by CBMC's diagnostic.

Where the asymmetry lives

Cmd.toGotoInstructions lines 99-127 emits different numbers of instructions for the two .init flavors:

Source command Emitted instructions
.init v ty (.det e) md DECL v_expr, ASSIGN v_expr := e_expr (lines 111-120)
.init v ty .nondet md DECL v_expr only (lines 121-125)

For comparison, the .set _ _ arms (lines 126-178):

Source command Emitted instructions
.set v (.det e) md ASSIGN v_expr := e_expr
.set v .nondet md ASSIGN v_expr := side_effect Nondet (lines 162-178)

The asymmetry: for .set, both flavors emit one ASSIGN — the .nondet case uses a side_effect Nondet GOTO expression (CBMC's primitive for "produce an arbitrary value"). For .init, only the .det flavor produces an ASSIGN; the .nondet flavor stops at DECL.

The natural symmetric handling — and what this issue proposes — is for .init v ty .nondet to emit DECL plus an ASSIGN with a side_effect Nondet RHS, mirroring the .set v .nondet pattern.

Why this matters: CBMC's DECL doesn't bind a value

CBMC's documentation of its goto_program_instruction_typet enum (see CBMC repo's goto_program.h) describes DECL as a scope marker: it declares that a variable is in scope but does not bind a value. The variable is uninitialized until a subsequent ASSIGN writes to it. CBMC's symbolic execution will, depending on options, flag reads of uninitialized variables as undefined behavior or treat them as non-deterministic — either way, the GOTO program is more conservative than the source.

By contrast, Strata's EvalCmd.eval_init_unconstrained (CmdSemantics.lean lines 298-302) requires InitState P σ x v σ', meaning the post-state σ' does bind x to some value v. The source promises that immediately after init x : T := nondet, the variable x has a defined value.

After init x : T := nondet What you get
Source semantics x is bound to some value
CBMC view of the GOTO output x is in scope but unbound; reads are undefined

CBMC rejects (or warns on) programs the source accepts, producing false positives. The mismatch is a precision regression, not unsoundness.

Proposed fix

In Strata/DL/Imperative/ToCProverGOTO.lean, change the .init v ty .nondet md arm to mirror the .set v .nondet arm:

    | .nondet =>
      let nondet_expr : Expr :=
        { id := .side_effect .Nondet,
          sourceLoc := srcLoc,
          type := gty }
      let assign_inst :=
        { type := .ASSIGN, locationNum := (trans.nextLoc + 1),
          sourceLoc := srcLoc,
          code := Code.assign v_expr nondet_expr }
      return { trans with
                instructions := trans.instructions.append #[decl_inst, assign_inst],
                nextLoc := trans.nextLoc + 2,
                T := T }

Effect: the new GOTO program now performs the binding immediately via the side_effect Nondet RHS, matching the source's promise.

After the fix, the four cases of nondet treatment are symmetric:

Source command Emitted instructions
.init v ty (.det e) md DECL, ASSIGN v := e_goto
.init v ty .nondet md DECL, ASSIGN v := side_effect Nondet
.set v (.det e) md ASSIGN v := e_goto
.set v .nondet md ASSIGN v := side_effect Nondet

init adds DECL on top of set's emission. Det/nondet differs only in the RHS shape. Symmetric on both axes.

Test plan

  1. Add a test case to StrataTest/Backends/CBMC/GOTO/E2E_CFGPipeline.lean exercising var x : int; r := x; and asserting:
    • The emitted GOTO has DECL + ASSIGN-with-Nondet for the var x : int; declaration.
    • The instruction count matches 2 + whatever the assignment emits.
  2. Run the reproducer above through CBMC and confirm: before the fix, CBMC flags an uninitialized read; after the fix, no diagnostic.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions