Issue: asymmetric translator handling of nondet init produces a precision mismatch with CBMC
Status: present on origin/main (HEAD: 53a3df53).
Affects: Strata Core programs with uninitialized var declarations that get read before any subsequent assignment. CBMC flags an uninitialized-read diagnostic that the source semantics treats as a defined value — a false positive, not an unsoundness.
Concrete end-to-end reproducer
program Core;
procedure nondet_init_read (out r : int)
spec { ensures (r == r); }
{
var x : int;
r := x;
assert [tautology]: (r == r);
};
var x : int; parses to Core.Statement.init id tp .nondet md.
Run:
lake exe StrataToCBMC <input.core.st> > out.json
symtab2gb out.json -o out.gb
cbmc out.gb --uninitialized-check
Expected (source semantics): assert passes; no diagnostic. Observed (CBMC on translator output): uninitialized-read diagnostic on x.
Framing
This is not a "semantic mismatch" claim between two formal semantics — GOTO has no formal Lean-checked semantics in the repo. The issue is:
- An asymmetry inside the translator.
Cmd.toGotoInstructions handles .init _ _ .nondet _ differently from .set _ .nondet _ even though the user-level intent ("a non-deterministically chosen value") is the same.
- A precision mismatch against CBMC's documented runtime behavior. The GOTO output, when run by CBMC, doesn't fulfill the post-state binding that the source semantics promises — CBMC sees the variable as unbound and flags reads as undefined, while the source semantics treats the variable as bound. CBMC is more conservative than the source: it rejects (or warns on) programs the source accepts. This is a false-positive surface, not unsoundness — but it does mean a user reasoning about Strata source is surprised by CBMC's diagnostic.
Where the asymmetry lives
Cmd.toGotoInstructions lines 99-127 emits different numbers of instructions for the two .init flavors:
| Source command |
Emitted instructions |
.init v ty (.det e) md |
DECL v_expr, ASSIGN v_expr := e_expr (lines 111-120) |
.init v ty .nondet md |
DECL v_expr only (lines 121-125) |
For comparison, the .set _ _ arms (lines 126-178):
| Source command |
Emitted instructions |
.set v (.det e) md |
ASSIGN v_expr := e_expr |
.set v .nondet md |
ASSIGN v_expr := side_effect Nondet (lines 162-178) |
The asymmetry: for .set, both flavors emit one ASSIGN — the .nondet case uses a side_effect Nondet GOTO expression (CBMC's primitive for "produce an arbitrary value"). For .init, only the .det flavor produces an ASSIGN; the .nondet flavor stops at DECL.
The natural symmetric handling — and what this issue proposes — is for .init v ty .nondet to emit DECL plus an ASSIGN with a side_effect Nondet RHS, mirroring the .set v .nondet pattern.
Why this matters: CBMC's DECL doesn't bind a value
CBMC's documentation of its goto_program_instruction_typet enum (see CBMC repo's goto_program.h) describes DECL as a scope marker: it declares that a variable is in scope but does not bind a value. The variable is uninitialized until a subsequent ASSIGN writes to it. CBMC's symbolic execution will, depending on options, flag reads of uninitialized variables as undefined behavior or treat them as non-deterministic — either way, the GOTO program is more conservative than the source.
By contrast, Strata's EvalCmd.eval_init_unconstrained (CmdSemantics.lean lines 298-302) requires InitState P σ x v σ', meaning the post-state σ' does bind x to some value v. The source promises that immediately after init x : T := nondet, the variable x has a defined value.
After init x : T := nondet |
What you get |
| Source semantics |
x is bound to some value |
| CBMC view of the GOTO output |
x is in scope but unbound; reads are undefined |
CBMC rejects (or warns on) programs the source accepts, producing false positives. The mismatch is a precision regression, not unsoundness.
Proposed fix
In Strata/DL/Imperative/ToCProverGOTO.lean, change the .init v ty .nondet md arm to mirror the .set v .nondet arm:
| .nondet =>
let nondet_expr : Expr :=
{ id := .side_effect .Nondet,
sourceLoc := srcLoc,
type := gty }
let assign_inst :=
{ type := .ASSIGN, locationNum := (trans.nextLoc + 1),
sourceLoc := srcLoc,
code := Code.assign v_expr nondet_expr }
return { trans with
instructions := trans.instructions.append #[decl_inst, assign_inst],
nextLoc := trans.nextLoc + 2,
T := T }
Effect: the new GOTO program now performs the binding immediately via the side_effect Nondet RHS, matching the source's promise.
After the fix, the four cases of nondet treatment are symmetric:
| Source command |
Emitted instructions |
.init v ty (.det e) md |
DECL, ASSIGN v := e_goto |
.init v ty .nondet md |
DECL, ASSIGN v := side_effect Nondet |
.set v (.det e) md |
ASSIGN v := e_goto |
.set v .nondet md |
ASSIGN v := side_effect Nondet |
init adds DECL on top of set's emission. Det/nondet differs only in the RHS shape. Symmetric on both axes.
Test plan
- Add a test case to
StrataTest/Backends/CBMC/GOTO/E2E_CFGPipeline.lean exercising var x : int; r := x; and asserting:
- The emitted GOTO has DECL + ASSIGN-with-Nondet for the
var x : int; declaration.
- The instruction count matches
2 + whatever the assignment emits.
- Run the reproducer above through CBMC and confirm: before the fix, CBMC flags an uninitialized read; after the fix, no diagnostic.
Issue: asymmetric translator handling of nondet
initproduces a precision mismatch with CBMCStatus: present on
origin/main(HEAD:53a3df53).Affects: Strata Core programs with uninitialized
vardeclarations that get read before any subsequent assignment. CBMC flags an uninitialized-read diagnostic that the source semantics treats as a defined value — a false positive, not an unsoundness.Concrete end-to-end reproducer
var x : int;parses toCore.Statement.init id tp .nondet md.Run:
Expected (source semantics): assert passes; no diagnostic. Observed (CBMC on translator output): uninitialized-read diagnostic on
x.Framing
This is not a "semantic mismatch" claim between two formal semantics — GOTO has no formal Lean-checked semantics in the repo. The issue is:
Cmd.toGotoInstructionshandles.init _ _ .nondet _differently from.set _ .nondet _even though the user-level intent ("a non-deterministically chosen value") is the same.Where the asymmetry lives
Cmd.toGotoInstructionslines 99-127 emits different numbers of instructions for the two.initflavors:.init v ty (.det e) mdv_expr, ASSIGNv_expr := e_expr(lines 111-120).init v ty .nondet mdv_expronly (lines 121-125)For comparison, the
.set _ _arms (lines 126-178):.set v (.det e) mdv_expr := e_expr.set v .nondet mdv_expr := side_effect Nondet(lines 162-178)The asymmetry: for
.set, both flavors emit one ASSIGN — the.nondetcase uses aside_effect NondetGOTO expression (CBMC's primitive for "produce an arbitrary value"). For.init, only the.detflavor produces an ASSIGN; the.nondetflavor stops at DECL.The natural symmetric handling — and what this issue proposes — is for
.init v ty .nondetto emit DECL plus an ASSIGN with a side_effect Nondet RHS, mirroring the.set v .nondetpattern.Why this matters: CBMC's DECL doesn't bind a value
CBMC's documentation of its
goto_program_instruction_typetenum (see CBMC repo'sgoto_program.h) describesDECLas a scope marker: it declares that a variable is in scope but does not bind a value. The variable is uninitialized until a subsequentASSIGNwrites to it. CBMC's symbolic execution will, depending on options, flag reads of uninitialized variables as undefined behavior or treat them as non-deterministic — either way, the GOTO program is more conservative than the source.By contrast, Strata's
EvalCmd.eval_init_unconstrained(CmdSemantics.leanlines 298-302) requiresInitState P σ x v σ', meaning the post-stateσ'does bindxto some valuev. The source promises that immediately afterinit x : T := nondet, the variablexhas a defined value.init x : T := nondetxis bound to some valuexis in scope but unbound; reads are undefinedCBMC rejects (or warns on) programs the source accepts, producing false positives. The mismatch is a precision regression, not unsoundness.
Proposed fix
In
Strata/DL/Imperative/ToCProverGOTO.lean, change the.init v ty .nondet mdarm to mirror the.set v .nondetarm:| .nondet => let nondet_expr : Expr := { id := .side_effect .Nondet, sourceLoc := srcLoc, type := gty } let assign_inst := { type := .ASSIGN, locationNum := (trans.nextLoc + 1), sourceLoc := srcLoc, code := Code.assign v_expr nondet_expr } return { trans with instructions := trans.instructions.append #[decl_inst, assign_inst], nextLoc := trans.nextLoc + 2, T := T }Effect: the new GOTO program now performs the binding immediately via the side_effect Nondet RHS, matching the source's promise.
After the fix, the four cases of nondet treatment are symmetric:
.init v ty (.det e) mdv := e_goto.init v ty .nondet mdv := side_effect Nondet.set v (.det e) mdv := e_goto.set v .nondet mdv := side_effect Nondetinitadds DECL on top ofset's emission. Det/nondet differs only in the RHS shape. Symmetric on both axes.Test plan
StrataTest/Backends/CBMC/GOTO/E2E_CFGPipeline.leanexercisingvar x : int; r := x;and asserting:var x : int;declaration.2 +whatever the assignment emits.