Integrate Isabelle/C to provide a framework for reasoning about C code using AutoCorrode#151
Open
DominicPM wants to merge 37 commits intoawslabs:mainfrom
Open
Integrate Isabelle/C to provide a framework for reasoning about C code using AutoCorrode#151DominicPM wants to merge 37 commits intoawslabs:mainfrom
DominicPM wants to merge 37 commits intoawslabs:mainfrom
Conversation
Register the Isabelle_C AFP component alongside Word_Lib in the Makefile. Users must place the Isabelle_C AFP entry in dependencies/afp/Isabelle_C/. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Imports Isabelle/C (C_Main) and Shallow_Micro_Rust. Contains an empty Micro_C_Syntax theory that will host the C parser integration. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Depends on Shallow_Micro_Rust, Shallow_Separation_Logic, and Byte_Level_Encoding. Will host C numeric types, abort types, pointer types, and memory operations. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Depends on Micro_Rust_Interfaces_Core, Separation_Lenses, and Shallow_Micro_C. Will host C block references, stack frame management, and deallocation. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add Micro_C_Examples session depending on Micro_C_Interfaces and Crush. Update top-level ROOT and AutoCorrode.thy to include all four new Micro C sessions in the umbrella build. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Parse a minimal main() and a swap() function via the Isabelle/C parser to confirm the C11 front-end loads and produces C_Ast output. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_abort datatype with constructors for C undefined behaviors: NullPointerDereference, BufferOverflow, SignedOverflow, DivisionByZero, UseAfterFree. Provide convenience abbreviations using CustomAbort. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add type aliases for C integer types (c_int, c_uint, c_char, c_long, etc.) using Isabelle word and sword types. Define signed arithmetic operations that abort with SignedOverflow on undefined behavior, unsigned division that aborts on division by zero, and signed comparison operations. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_sizeof overloaded constant mapping C types to byte sizes via overloading instances for each C type. Prove basic sizeof lemmas and connection to byte-level encoding lengths. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_null as the zero-address gref, is_null_nat predicate, and c_null_guard combinator that aborts with NullPointerDereference when a null pointer is accessed. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Adds C_To_Core_Translation.thy with ML structures for: - C_Ast_Utils: AST walking utilities for Isabelle/C's C_Ast types - C_Trans_Ctxt: translation context tracking variable bindings - C_Term_Build: term construction for core monad expressions - C_Translate: statement/expression translation (integer literals, return statements, compound blocks; unsupported constructs error) - C_Def_Gen: definition generation via Local_Theory.define - micro_c_translate command: parses C source via Isabelle/C and generates core monad definitions Smoke tests verify end-to-end: C source -> parsed AST -> core monad definitions (c_f for void return, c_g for return 42). Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Extend the ML translation backend to handle:
- Local variable declarations via Ref::new (store_reference_const)
- Variable reads via bind/deep_compose1/call/store_dereference_const
- Variable writes via bind2/deep_compose2/call/store_update_const
- Binary arithmetic (+, -, *, /, %) via bindlift2 with HOL operators
- Comparison operators (<, <=, >, >=, ==) with swapped operands for
> and >= to reuse less/less_eq
- Logical operators (&&, ||) via bindlift2 with conj/disj
- Statement sequencing with variable bindings via nested bind/lambda
Fixes open C_Ast shadowing Isabelle's Const/Free/dummyT by saving them
before the open. Uses concrete int type for integer literals. Imports
Core_Syntax for overloaded store constants.
Smoke test: void h() { int x = 5; x = x + 1; } translates to
FunctionBody (let x = Ref::new ⟨↑5⟩; ⋆↑x ← bindlift2 (+) (⇧⋆↑x) (↑1))
Note: definitions using store_reference_const have 2 phantom TYPE params
for ref-internal types, resolved by adhoc_overloading in concrete contexts
(same pattern as Micro_Rust_Examples/Showcase.thy).
Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add if/else translation using two_armed_conditional from Bool_Type. Implement function parameter extraction from C AST declarators and distinguish parameters (by-value, accessed via literal) from local variables (mutable references, accessed via dereference). Parameters are wrapped in lambdas in the generated definition. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Adds mk_funcall to C_Term_Build using funcall0-funcall10 from Core_Expression. The CCall0 handler resolves the callee via Proof_Context.read_const to get the fully qualified constant name, enabling cross-function references. C_Trans_Ctxt now carries the proof context, and process_translation_unit interleaves translation with definition so later functions can reference earlier ones. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Recognizes the standard for-loop pattern: for (int i = start; i < bound; i++) body and translates it to: raw_for_loop (map of_nat [start..<bound]) (λi. body) Loop bounds can be integer literals or function parameters. Non-standard loop forms fall through to an unsupported error. Adds Rust_Iterator import for raw_for_loop definition. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Define opaque constants c_while_stub, c_goto_stub, and c_unsupported for C constructs that cannot be translated (while, goto, switch, continue, break). These have no WP rules, so symbolic execution silently gets stuck — users can see from the constant names which constructs need attention. Labels are handled by ignoring the label and translating the labeled statement. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add C_Memory_Operations.thy with c_ptr_add for C pointer arithmetic (address + n * stride), c_ptr_add_typed convenience abbreviation using c_sizeof, and c_ptr_diff for pointer subtraction. Includes basic lemmas: ptr_add_zero, ptr_add_address, ptr_add_add. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_ptr_at for computing the gref of array element i at base + i * stride. Array dereference and update use the existing locale-provided dereference_by_value_raw_fun and update_raw_fun on the computed gref. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add mk_deref and mk_ptr_write to C_Term_Build for pointer dereference (*p) and pointer assignment (*p = v). Handle CUnary0(CIndOp0, ...) and CAssign0 with dereference LHS in the expression translator. Smoke test with a C swap function that correctly translates to monadic pointer operations. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Parse C swap function via micro_c_translate, define a separation-logic contract specifying that two disjoint references exchange values, and verify it fully automatically with crush_boot + crush_base. This is the first complete pipeline: C source -> Isabelle/C parser -> core monad definition -> separation-logic contract -> automated proof. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add two pure-function verification examples exercising conditionals and return: c_max and c_abs_val, both proved via crush_boot + crush_base. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Provides weakest-precondition rules for the overflow-checked C arithmetic operations (c_signed_add, c_signed_sub, c_signed_mul, c_signed_div, c_signed_mod) and comparison operations (c_signed_less, c_signed_le, c_signed_eq). Each operation has: - A [micro_rust_wp_simps] equational rule (requires ucincl) - A [micro_rust_wp_intros] introduction rule (no ucincl, reduces to wp_literalI under the no-overflow assumption) Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Switch the ML translation backend to produce c_int (32 sword) literals and use the monadic overflow-checked operations (c_signed_add, c_signed_sub, c_signed_mul, c_signed_div, c_signed_mod) instead of plain HOL arithmetic. Comparison operators now use c_signed_less, c_signed_le, c_signed_eq. Logical operators (&&, ||) remain pure (bindlift2 with conj/disj). Key changes: - mk_literal_int now produces c_int (32 sword) literals - translate_binop returns Monadic/Pure tagged operators - CBinary0 dispatches to mk_bind2 for monadic ops, mk_bindlift2 for pure - Micro_C_Parsing_Frontend/ROOT gains Shallow_Micro_C dependency Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Update the verification examples to work with the new c_int (32 sword) translation output: - Locale uses c_int_prism instead of int_prism - Swap contract uses c_int for reference values - Max contract postcondition uses signed comparison (sint b < sint a) - Abs contract adds no-overflow precondition (-(2^31) < sint x) and uses word_of_int for the negation result - Import C_Arithmetic_Rules for WP rules on c_signed_less/c_signed_sub All three examples (swap, max, abs_val) verify end-to-end. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Extend the ML translation backend to handle C struct field access (p->field) and struct field assignment (p->field = val). Struct definitions in C source are automatically parsed and registered. Parameter struct types are extracted from function signatures. Naming convention: struct "foo" field "bar" maps to Isabelle accessor "c_foo_bar" and updater "update_c_foo_bar", matching the names generated by datatype_record. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
- Add c_point datatype_record and smoke test for struct field translation in C_To_Core_Translation.thy - Add verified swap_coords example in Simple_C_Functions.thy that parses C struct code, defines a separation-logic contract, and proves correctness using crush_boot + crush_base Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Translate C array reads (arr[idx]) by dereferencing the whole list ref and indexing with List.nth, resolving dereference_fun from locale context to avoid adhoc overloading ambiguity. Array writes (arr[idx] = val) use focus_nth for focused update via update_fun. Adds c_idx_to_nat (a proper constant wrapping unat), mk_unat and mk_focus_nth term builders, CIndex0 read/write cases in translate_expr, and smoke tests for read_at / write_at. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Parse read_at and write_at via micro_c_translate, define a separation logic contract for read_at, and verify it with crush_boot + crush_base. Arrays are modeled as references to c_int list; the contract requires a valid index and guarantees the returned value equals vs ! (c_idx_to_nat idx). Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Define c_unsigned_add/sub/mul (wrapping) and c_unsigned_less/le/eq (comparison) for unsigned word types, with corresponding WP simp and intro rules. Unsigned arithmetic wraps naturally via Isabelle's word arithmetic; no overflow checks needed. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_numeric_type ML datatype (CInt, CUInt, CChar, CSChar, CShort, CUShort, CLong, CULong) with type specifier parsing from C AST declarations. Track types through the translation context and dispatch to c_signed_* vs c_unsigned_* operations based on signedness. Function parameters now carry explicit HOL types for correct type inference. Includes smoke tests for unsigned int, long, and char translation verifying correct operation dispatch. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Add c_uint_verification_ctx locale with verified u_add (wrapping addition) and u_max (unsigned comparison) functions, demonstrating end-to-end verification of unsigned C code through the full pipeline. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Split the monolithic Simple_C_Functions.thy into three thematic files: - Simple_C_Functions.thy: scalar verification (swap, max, abs_val, u_add, u_max) - C_Struct_Examples.thy: struct verification (swap_coords with c_point) - C_Array_Examples.thy: array verification (read_at, write_at) Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Translate a C for-loop that fills the first n elements of an array with a given value. Define list_fill_prefix helper with step lemma, then prove the contract using wp_raw_for_loop_framedI' with an existential invariant tracking the partially-filled array. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Define 12 operations (bitwise AND/OR/XOR/NOT for signed/unsigned, plus shift left/right for signed/unsigned) with correct C UB semantics: bitwise ops never abort, shifts abort on out-of-range shift amount, signed shifts additionally check for negative operand and overflow. Add 24 WP rules (simp + intro) for proof automation. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
Handle 5 new binary operators (& | ^ << >>) and 2 new unary operators (~ bitwise complement, - unary minus) in the C-to-core ML translation. All dispatch signed/unsigned based on operand type. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
New C_Bitwise_Examples.thy with 7 verified examples: - Smoke tests: unsigned AND, NOT, shift left - Semantic proofs: mask_low_byte (result < 256), set_bit (bit is set), clear_bit (bit is cleared) — composing multiple bitwise ops - Signed AND (no UB precondition) Fix integer literal translation to use polymorphic types (dummyT) so literals work in both signed and unsigned contexts. Add C usual arithmetic conversion: unsigned wins when mixed with signed literals. Also add unary minus (-x) translation as (0 - x). Closes awslabs#63. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Uses Isabelle/C as an alternative frontend parser for C, introducing Micro C in analogy to Micro Rust. Provide initial parsing support plus translations for arithmetic, logical, conditional, and looping expressions, as well as support for arrays, structs, and pointers.
Several examples are included, based on new WP rules for Micro C constructs.
Closes #63.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.