Skip to content

Comments

Integrate Isabelle/C to provide a framework for reasoning about C code using AutoCorrode#151

Open
DominicPM wants to merge 37 commits intoawslabs:mainfrom
DominicPM:isabelle-c-integration
Open

Integrate Isabelle/C to provide a framework for reasoning about C code using AutoCorrode#151
DominicPM wants to merge 37 commits intoawslabs:mainfrom
DominicPM:isabelle-c-integration

Conversation

@DominicPM
Copy link
Contributor

Uses Isabelle/C as an alternative frontend parser for C, introducing Micro C in analogy to Micro Rust. Provide initial parsing support plus translations for arithmetic, logical, conditional, and looping expressions, as well as support for arrays, structs, and pointers.

Several examples are included, based on new WP rules for Micro C constructs.

Closes #63.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Register the Isabelle_C AFP component alongside Word_Lib in the Makefile.
Users must place the Isabelle_C AFP entry in dependencies/afp/Isabelle_C/.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Imports Isabelle/C (C_Main) and Shallow_Micro_Rust. Contains an empty
Micro_C_Syntax theory that will host the C parser integration.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Depends on Shallow_Micro_Rust, Shallow_Separation_Logic, and
Byte_Level_Encoding. Will host C numeric types, abort types, pointer
types, and memory operations.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Depends on Micro_Rust_Interfaces_Core, Separation_Lenses, and
Shallow_Micro_C. Will host C block references, stack frame management,
and deallocation.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add Micro_C_Examples session depending on Micro_C_Interfaces and Crush.
Update top-level ROOT and AutoCorrode.thy to include all four new
Micro C sessions in the umbrella build.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Parse a minimal main() and a swap() function via the Isabelle/C parser
to confirm the C11 front-end loads and produces C_Ast output.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_abort datatype with constructors for C undefined behaviors:
NullPointerDereference, BufferOverflow, SignedOverflow, DivisionByZero,
UseAfterFree. Provide convenience abbreviations using CustomAbort.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add type aliases for C integer types (c_int, c_uint, c_char, c_long, etc.)
using Isabelle word and sword types. Define signed arithmetic operations
that abort with SignedOverflow on undefined behavior, unsigned division
that aborts on division by zero, and signed comparison operations.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_sizeof overloaded constant mapping C types to byte sizes via
overloading instances for each C type. Prove basic sizeof lemmas and
connection to byte-level encoding lengths.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_null as the zero-address gref, is_null_nat predicate, and
c_null_guard combinator that aborts with NullPointerDereference
when a null pointer is accessed.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Adds C_To_Core_Translation.thy with ML structures for:
- C_Ast_Utils: AST walking utilities for Isabelle/C's C_Ast types
- C_Trans_Ctxt: translation context tracking variable bindings
- C_Term_Build: term construction for core monad expressions
- C_Translate: statement/expression translation (integer literals,
  return statements, compound blocks; unsupported constructs error)
- C_Def_Gen: definition generation via Local_Theory.define
- micro_c_translate command: parses C source via Isabelle/C and
  generates core monad definitions

Smoke tests verify end-to-end: C source -> parsed AST -> core monad
definitions (c_f for void return, c_g for return 42).

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Extend the ML translation backend to handle:
- Local variable declarations via Ref::new (store_reference_const)
- Variable reads via bind/deep_compose1/call/store_dereference_const
- Variable writes via bind2/deep_compose2/call/store_update_const
- Binary arithmetic (+, -, *, /, %) via bindlift2 with HOL operators
- Comparison operators (<, <=, >, >=, ==) with swapped operands for
  > and >= to reuse less/less_eq
- Logical operators (&&, ||) via bindlift2 with conj/disj
- Statement sequencing with variable bindings via nested bind/lambda

Fixes open C_Ast shadowing Isabelle's Const/Free/dummyT by saving them
before the open. Uses concrete int type for integer literals. Imports
Core_Syntax for overloaded store constants.

Smoke test: void h() { int x = 5; x = x + 1; } translates to
  FunctionBody (let x = Ref::new ⟨↑5⟩; ⋆↑x ← bindlift2 (+) (⇧⋆↑x) (↑1))

Note: definitions using store_reference_const have 2 phantom TYPE params
for ref-internal types, resolved by adhoc_overloading in concrete contexts
(same pattern as Micro_Rust_Examples/Showcase.thy).

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add if/else translation using two_armed_conditional from Bool_Type.
Implement function parameter extraction from C AST declarators and
distinguish parameters (by-value, accessed via literal) from local
variables (mutable references, accessed via dereference). Parameters
are wrapped in lambdas in the generated definition.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Adds mk_funcall to C_Term_Build using funcall0-funcall10 from
Core_Expression. The CCall0 handler resolves the callee via
Proof_Context.read_const to get the fully qualified constant name,
enabling cross-function references. C_Trans_Ctxt now carries the
proof context, and process_translation_unit interleaves translation
with definition so later functions can reference earlier ones.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Recognizes the standard for-loop pattern:
  for (int i = start; i < bound; i++) body
and translates it to:
  raw_for_loop (map of_nat [start..<bound]) (λi. body)

Loop bounds can be integer literals or function parameters.
Non-standard loop forms fall through to an unsupported error.
Adds Rust_Iterator import for raw_for_loop definition.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Define opaque constants c_while_stub, c_goto_stub, and c_unsupported
for C constructs that cannot be translated (while, goto, switch,
continue, break). These have no WP rules, so symbolic execution
silently gets stuck — users can see from the constant names which
constructs need attention. Labels are handled by ignoring the label
and translating the labeled statement.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add C_Memory_Operations.thy with c_ptr_add for C pointer arithmetic
(address + n * stride), c_ptr_add_typed convenience abbreviation using
c_sizeof, and c_ptr_diff for pointer subtraction. Includes basic
lemmas: ptr_add_zero, ptr_add_address, ptr_add_add.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_ptr_at for computing the gref of array element i at base + i *
stride. Array dereference and update use the existing locale-provided
dereference_by_value_raw_fun and update_raw_fun on the computed gref.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add mk_deref and mk_ptr_write to C_Term_Build for pointer dereference
(*p) and pointer assignment (*p = v). Handle CUnary0(CIndOp0, ...) and
CAssign0 with dereference LHS in the expression translator. Smoke test
with a C swap function that correctly translates to monadic pointer
operations.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Parse C swap function via micro_c_translate, define a separation-logic
contract specifying that two disjoint references exchange values, and
verify it fully automatically with crush_boot + crush_base. This is the
first complete pipeline: C source -> Isabelle/C parser -> core monad
definition -> separation-logic contract -> automated proof.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add two pure-function verification examples exercising conditionals
and return: c_max and c_abs_val, both proved via crush_boot + crush_base.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Provides weakest-precondition rules for the overflow-checked C arithmetic
operations (c_signed_add, c_signed_sub, c_signed_mul, c_signed_div,
c_signed_mod) and comparison operations (c_signed_less, c_signed_le,
c_signed_eq). Each operation has:
- A [micro_rust_wp_simps] equational rule (requires ucincl)
- A [micro_rust_wp_intros] introduction rule (no ucincl, reduces to
  wp_literalI under the no-overflow assumption)

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Switch the ML translation backend to produce c_int (32 sword) literals
and use the monadic overflow-checked operations (c_signed_add,
c_signed_sub, c_signed_mul, c_signed_div, c_signed_mod) instead of
plain HOL arithmetic. Comparison operators now use c_signed_less,
c_signed_le, c_signed_eq. Logical operators (&&, ||) remain pure
(bindlift2 with conj/disj).

Key changes:
- mk_literal_int now produces c_int (32 sword) literals
- translate_binop returns Monadic/Pure tagged operators
- CBinary0 dispatches to mk_bind2 for monadic ops, mk_bindlift2 for pure
- Micro_C_Parsing_Frontend/ROOT gains Shallow_Micro_C dependency

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Update the verification examples to work with the new c_int (32 sword)
translation output:
- Locale uses c_int_prism instead of int_prism
- Swap contract uses c_int for reference values
- Max contract postcondition uses signed comparison (sint b < sint a)
- Abs contract adds no-overflow precondition (-(2^31) < sint x) and
  uses word_of_int for the negation result
- Import C_Arithmetic_Rules for WP rules on c_signed_less/c_signed_sub

All three examples (swap, max, abs_val) verify end-to-end.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Extend the ML translation backend to handle C struct field access
(p->field) and struct field assignment (p->field = val). Struct
definitions in C source are automatically parsed and registered.
Parameter struct types are extracted from function signatures.

Naming convention: struct "foo" field "bar" maps to Isabelle
accessor "c_foo_bar" and updater "update_c_foo_bar", matching
the names generated by datatype_record.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
- Add c_point datatype_record and smoke test for struct field
  translation in C_To_Core_Translation.thy
- Add verified swap_coords example in Simple_C_Functions.thy that
  parses C struct code, defines a separation-logic contract, and
  proves correctness using crush_boot + crush_base

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Translate C array reads (arr[idx]) by dereferencing the whole list ref
and indexing with List.nth, resolving dereference_fun from locale context
to avoid adhoc overloading ambiguity.  Array writes (arr[idx] = val) use
focus_nth for focused update via update_fun.

Adds c_idx_to_nat (a proper constant wrapping unat), mk_unat and
mk_focus_nth term builders, CIndex0 read/write cases in translate_expr,
and smoke tests for read_at / write_at.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Parse read_at and write_at via micro_c_translate, define a separation
logic contract for read_at, and verify it with crush_boot + crush_base.
Arrays are modeled as references to c_int list; the contract requires a
valid index and guarantees the returned value equals vs ! (c_idx_to_nat idx).

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Define c_unsigned_add/sub/mul (wrapping) and c_unsigned_less/le/eq
(comparison) for unsigned word types, with corresponding WP simp and
intro rules. Unsigned arithmetic wraps naturally via Isabelle's word
arithmetic; no overflow checks needed.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_numeric_type ML datatype (CInt, CUInt, CChar, CSChar, CShort,
CUShort, CLong, CULong) with type specifier parsing from C AST
declarations. Track types through the translation context and dispatch
to c_signed_* vs c_unsigned_* operations based on signedness. Function
parameters now carry explicit HOL types for correct type inference.

Includes smoke tests for unsigned int, long, and char translation
verifying correct operation dispatch.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_uint_verification_ctx locale with verified u_add (wrapping
addition) and u_max (unsigned comparison) functions, demonstrating
end-to-end verification of unsigned C code through the full pipeline.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Split the monolithic Simple_C_Functions.thy into three thematic files:
- Simple_C_Functions.thy: scalar verification (swap, max, abs_val, u_add, u_max)
- C_Struct_Examples.thy: struct verification (swap_coords with c_point)
- C_Array_Examples.thy: array verification (read_at, write_at)

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Translate a C for-loop that fills the first n elements of an array with
a given value. Define list_fill_prefix helper with step lemma, then prove
the contract using wp_raw_for_loop_framedI' with an existential invariant
tracking the partially-filled array.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Define 12 operations (bitwise AND/OR/XOR/NOT for signed/unsigned,
plus shift left/right for signed/unsigned) with correct C UB
semantics: bitwise ops never abort, shifts abort on out-of-range
shift amount, signed shifts additionally check for negative operand
and overflow. Add 24 WP rules (simp + intro) for proof automation.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Handle 5 new binary operators (& | ^ << >>) and 2 new unary
operators (~ bitwise complement, - unary minus) in the C-to-core
ML translation. All dispatch signed/unsigned based on operand type.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
New C_Bitwise_Examples.thy with 7 verified examples:
- Smoke tests: unsigned AND, NOT, shift left
- Semantic proofs: mask_low_byte (result < 256), set_bit (bit is set),
  clear_bit (bit is cleared) — composing multiple bitwise ops
- Signed AND (no UB precondition)

Fix integer literal translation to use polymorphic types (dummyT)
so literals work in both signed and unsigned contexts. Add C usual
arithmetic conversion: unsigned wins when mixed with signed literals.
Also add unary minus (-x) translation as (0 - x).

Closes awslabs#63.

Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create C frontend based on Isabelle/C

1 participant