Skip to content

Commit 62e16df

Browse files
feat(codegen): #225 PR2 — WasmGC CPS transform, Async base case (ADR-013) (#233)
Transparent async on the WasmGC backend for the recognised base case: an Async fn whose body is `let r = <async-call>; <cont>` with no live-local capture beyond the result binder. The author writes no thenableThen/closure plumbing — the backend synthesises it. lib/codegen.ml: - eff_expr_has_async/fn_is_async: trigger iff Async in fd_eff. Bare effects parse as EffVar (not EffCon — only Throws[E] is EffCon); both spellings checked. - detect_async_base_case: recognises the expression-let and the block-statement (StmtLet :: rest) desugarings; conservative — any unrecognised shape or extra live-local capture falls back to the pre-existing synchronous lowering verbatim (zero behaviour change). - gen_async_base_case: reifies <cont> as a zero-arg continuation via the existing #199 ExprLambda path (binder auto-captured into the [fnId@0,envPtr@4] env — no new closure code); once-resumption guard module-global traps on a second entry (ADR-013 obligation 1). - wired into gen_function, gated on fn_is_async + detector + thenableThen resolvable; else the old path, untouched. Two root-cause fixes in the #199 closure path, which until now had only ever been static-verified (PR1's skeleton was pure pass-through); PR2 is its first real end-to-end wasm exercise and surfaced both: - export the function table as __indirect_function_table (the host ABI dispatches through it; was unexported). Guarded on a non-empty table so closure-free modules are byte-identical. - closure_code used LocalTee where the surrounding code assumes LocalSet, leaving a dangling pointer (two stack values); corrected. stdlib/Http.affine: + pub extern fn thenableThen (the wasm-path continuation-registration primitive the transform targets, mirroring the proven #205 Vscode.thenableThen). Conforms to typed-wasm ADR-005 / spec/async-convergence-abi.adoc as amended (accessor model, (env_ptr)->i32) — see that spec's Conformance — AffineScript section. typed Response reader is PR3 (strict ADR-013 PR2 scope). New wasm e2e tests/codegen/http_cps_base.{affine,mjs}: host re-enters the continuation, reads the settled scalar via a minimal accessor (=200), fires exactly once, second entry traps. Full tools/run_codegen_wasm_tests.sh green; dune test --force 258 green. Refs #225 #160
1 parent 819db54 commit 62e16df

4 files changed

Lines changed: 373 additions & 4 deletions

File tree

lib/codegen.ml

Lines changed: 187 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -636,7 +636,15 @@ let rec gen_expr (ctx : context) (expr : expr) : (context * instr list) result =
636636
let (ctx_final2, closure_alloc) = gen_heap_alloc ctx_with_lambda closure_size in
637637
let (ctx_final3, closure_idx) = alloc_local ctx_final2 "__closure" in
638638

639-
let closure_code = closure_alloc @ [LocalTee closure_idx] @ [
639+
(* `LocalSet`, not `LocalTee`: the alloc'd pointer must be consumed
640+
into the local, NOT left on the stack — every subsequent use
641+
re-fetches it via `LocalGet closure_idx`, and the comments below
642+
("Stack is now [closure_idx, env_ptr]") assume an empty stack
643+
here. With `LocalTee` the dangling pointer made `closure_code`
644+
leave TWO values; this never failed before only because the #199
645+
closure path had never been validated by a real wasm engine
646+
end-to-end (static-only until #225 PR2). *)
647+
let closure_code = closure_alloc @ [LocalSet closure_idx] @ [
640648
(* Store function ID at offset 0 *)
641649
LocalGet closure_idx;
642650
I32Const (Int32.of_int lambda_id);
@@ -1747,6 +1755,140 @@ and gen_stmt (ctx : context) (stmt : stmt) : (context * instr list) result =
17471755
end
17481756

17491757
(** Generate code for a function *)
1758+
(* ── #225 PR2: selective CPS transform for the WasmGC Async backend ──
1759+
ADR-013 (docs/specs/async-on-wasm-cps.adoc). PR2 scope = the base
1760+
case only: an `Async` function whose body is `let r = <async-call>;
1761+
<cont>` with NO live-local capture across the split. Live-local
1762+
capture + Async→Async chaining + the typed Response reader are PR3.
1763+
Detection is deliberately conservative: any shape it does not
1764+
recognise falls back to the pre-existing synchronous lowering, so
1765+
PR2 is strictly additive (no behaviour change for unrecognised
1766+
Async fns — same as today, where codegen ignores fd_eff entirely). *)
1767+
1768+
(** ADR-013 obligation 2 (effect-row fidelity): the transform triggers
1769+
iff `Async ∈ fd_eff`. Pure recursive walk of the effect row. *)
1770+
let rec eff_expr_has_async (e : effect_expr) : bool =
1771+
match e with
1772+
(* A bare effect name (`Async`, `Net`, …) parses as [EffVar] — only the
1773+
parametric form (`Throws[E]`) is [EffCon] (parser.mly effect_term).
1774+
Both spellings must be checked. *)
1775+
| EffVar id -> id.name = "Async"
1776+
| EffCon (id, _) -> id.name = "Async"
1777+
| EffUnion (a, b) -> eff_expr_has_async a || eff_expr_has_async b
1778+
1779+
let fn_is_async (fd : fn_decl) : bool =
1780+
match fd.fd_eff with
1781+
| None -> false
1782+
| Some e -> eff_expr_has_async e
1783+
1784+
(** Single binder name of a trivial `let` pattern, if any. The PR2 base
1785+
case only recognises `let <var> = <async-call>; <cont>`. *)
1786+
let simple_pat_name (p : pattern) : string option =
1787+
match p with
1788+
| PatVar id -> Some id.name
1789+
| _ -> None
1790+
1791+
(** PR2 base-case recogniser. Given an `Async` function's parameter
1792+
names and normalised body, returns [Some (binder, async_call, cont)]
1793+
iff the body is exactly `let binder = <call-expr>; <cont>` and the
1794+
only variables [cont] can reference across the async split are
1795+
[binder] itself or the function parameters (zero live-local
1796+
capture — ADR-013 PR2 scope; capture is PR3). Otherwise [None] ⇒
1797+
caller keeps the existing synchronous lowering. Conservative: a
1798+
[cont] that references a top-level helper/global is also rejected
1799+
here (its name is free relative to []), which is safe — it merely
1800+
defers more shapes to PR3 rather than risking an unsound split. *)
1801+
let detect_async_base_case ~(globals : string list) (params : string list)
1802+
(body : expr) : (string * expr * expr) option =
1803+
(* Unwrap a trivial single-expression block (`{ e }`) so a block-bodied
1804+
`Async` fn whose sole content is `let r = <call>; cont` is still
1805+
recognised; any non-trivial block falls through to [None]. *)
1806+
let rec unwrap = function
1807+
| ExprBlock { blk_stmts = []; blk_expr = Some e } -> unwrap e
1808+
| e -> e
1809+
in
1810+
(* Live-local capture check (ADR-013 PR2 scope). A name free in [cont]
1811+
is a live-local capture unless it is the async result binder, a
1812+
function parameter (re-supplied, not captured), or a top-level
1813+
function/const/global (resolved via func_indices, not the closure
1814+
env). The lone binder capture itself is handled by the proven #199
1815+
ExprLambda path; any *other* live local ⇒ defer to PR3. *)
1816+
let accept binder (call : expr) (cont : expr) =
1817+
match call with
1818+
| ExprApp _ ->
1819+
let allowed = binder :: params @ globals in
1820+
let escaping =
1821+
List.filter (fun v -> not (List.mem v allowed))
1822+
(dedup (find_free_vars [] cont))
1823+
in
1824+
if escaping = [] then Some (binder, call, cont) else None
1825+
| _ -> None
1826+
in
1827+
match unwrap body with
1828+
(* Expression-form let: `let r = <call>; cont`. *)
1829+
| ExprLet lb ->
1830+
begin match simple_pat_name lb.el_pat, lb.el_body with
1831+
| Some binder, Some cont -> accept binder lb.el_value cont
1832+
| _ -> None
1833+
end
1834+
(* Block-statement form: `{ let r = <call>; <rest...> }` — the parser's
1835+
normal desugaring of a block-bodied fn. The continuation is the
1836+
remainder of the block (trailing stmts + tail expr). *)
1837+
| ExprBlock { blk_stmts = StmtLet sl :: rest; blk_expr } ->
1838+
begin match simple_pat_name sl.sl_pat with
1839+
| Some binder ->
1840+
let cont = ExprBlock { blk_stmts = rest; blk_expr } in
1841+
accept binder sl.sl_value cont
1842+
| None -> None
1843+
end
1844+
| _ -> None
1845+
1846+
(** ADR-013 PR2 transform. Lowers a recognised base-case `Async` fn
1847+
body `let binder = <async-call>; <cont>` to:
1848+
1. the async call (yields a `Thenable` handle), bound to [binder];
1849+
2. [cont] reified as a zero-arg continuation via the EXISTING #199
1850+
ExprLambda path — [binder] is the sole live local so it is
1851+
auto-captured into the `[fnId@0,envPtr@4]` env (no new closure
1852+
code); a once-resumption trap is prepended to its body;
1853+
3. `thenableThen(handle, <closure>)`; the fn returns the result.
1854+
Pure/non-recognised fns never reach here (caller gates on
1855+
[fn_is_async] + [detect_async_base_case]); behaviour is unchanged
1856+
for them, exactly as before this slice. *)
1857+
let gen_async_base_case (ctx : context) (binder : string)
1858+
(async_call : expr) (cont : expr) (thenable_then_idx : int)
1859+
: (context * instr list) result =
1860+
let* (ctx1, call_code) = gen_expr ctx async_call in
1861+
let (ctx2, binder_idx) = alloc_local ctx1 binder in
1862+
(* Once-resumption guard global (ADR-013 obligation 1): a second
1863+
continuation entry traps. Defence-in-depth over the host's
1864+
single-fire (`thenableThen` settles a Promise exactly once). *)
1865+
let fired_gidx = List.length ctx2.globals in
1866+
let fired_global =
1867+
{ g_type = I32; g_mutable = true; g_init = [I32Const 0l] } in
1868+
let ctx3 = { ctx2 with globals = ctx2.globals @ [fired_global] } in
1869+
let cont_lambda =
1870+
ExprLambda { elam_params = []; elam_ret_ty = None; elam_body = cont } in
1871+
let* (ctx4, closure_code) = gen_expr ctx3 cont_lambda in
1872+
let guard =
1873+
[ GlobalGet fired_gidx;
1874+
If (BtEmpty, [Unreachable], []);
1875+
I32Const 1l; GlobalSet fired_gidx ] in
1876+
let ctx5 =
1877+
match List.rev ctx4.lambda_funcs with
1878+
| last :: rest ->
1879+
let patched = { last with f_body = guard @ last.f_body } in
1880+
{ ctx4 with lambda_funcs = List.rev (patched :: rest) }
1881+
| [] -> ctx4 (* unreachable: ExprLambda always lifts exactly one *)
1882+
in
1883+
let body =
1884+
call_code
1885+
@ [ LocalSet binder_idx ]
1886+
@ [ LocalGet binder_idx ] (* arg 1: the Thenable handle *)
1887+
@ closure_code (* arg 2: [fnId,envPtr] continuation *)
1888+
@ [ Call thenable_then_idx ]
1889+
in
1890+
Ok (ctx5, body)
1891+
17501892
let gen_function (ctx : context) (fd : fn_decl) : (context * func) result =
17511893
(* Create fresh context for function scope, but preserve lambda_funcs and next_lambda_id *)
17521894
let fn_ctx = { ctx with locals = []; next_local = 0; loop_depth = 0 } in
@@ -1777,7 +1919,32 @@ let gen_function (ctx : context) (fd : fn_decl) : (context * func) result =
17771919
| FnBlock blk -> ExprBlock blk
17781920
| FnExpr e -> e
17791921
in
1780-
let* (ctx_final, body_code) = gen_expr ctx_with_params body_expr in
1922+
(* ADR-013 PR2: an `Async` fn whose body is the recognised base-case
1923+
shape is lowered via the CPS transform; everything else keeps the
1924+
existing synchronous lowering verbatim (no behaviour change). The
1925+
transform also requires `thenableThen` to be resolvable as an
1926+
import; if it is not in scope we fall back rather than fail. *)
1927+
let async_base =
1928+
if fn_is_async fd then
1929+
match detect_async_base_case
1930+
~globals:(List.map fst ctx_with_params.func_indices)
1931+
(List.map (fun p -> p.p_name.name) fd.fd_params)
1932+
body_expr with
1933+
| Some (binder, call, cont) ->
1934+
begin match List.assoc_opt "thenableThen"
1935+
ctx_with_params.func_indices with
1936+
| Some tt -> Some (binder, call, cont, tt)
1937+
| None -> None
1938+
end
1939+
| None -> None
1940+
else None
1941+
in
1942+
let* (ctx_final, body_code) =
1943+
match async_base with
1944+
| Some (binder, call, cont, tt_idx) ->
1945+
gen_async_base_case ctx_with_params binder call cont tt_idx
1946+
| None -> gen_expr ctx_with_params body_expr
1947+
in
17811948

17821949
(* Compute additional locals (beyond parameters) *)
17831950
let local_count = ctx_final.next_local - param_count in
@@ -2105,8 +2272,24 @@ let generate_module ?loader (prog : program) : wasm_module result =
21052272
([], [])
21062273
in
21072274

2108-
(* Add memory export *)
2109-
let exports_with_mem = { e_name = "memory"; e_desc = ExportMemory 0 } :: ctx'.exports in
2275+
(* Add memory export, and — when the unit lifts any closure — the
2276+
function table under the name the #199 host ABI dispatches through
2277+
(`inst.exports.__indirect_function_table`, see wrapHandler in
2278+
packages/affine-vscode/mod.js). Before #225 PR2 no end-to-end
2279+
closure dispatch was ever exercised in wasm (PR1's skeleton was
2280+
pure pass-through), so this export was missing though the #199
2281+
marshalling code assumed it; the CPS continuation makes it load-
2282+
bearing. Guarded on a non-empty table so closure-free modules are
2283+
byte-for-byte unchanged. *)
2284+
let table_export =
2285+
if tables <> [] then
2286+
[{ e_name = "__indirect_function_table"; e_desc = ExportTable 0 }]
2287+
else []
2288+
in
2289+
let exports_with_mem =
2290+
({ e_name = "memory"; e_desc = ExportMemory 0 } :: ctx'.exports)
2291+
@ table_export
2292+
in
21102293

21112294
(* Stage 2: Build [affinescript.ownership] custom section from collected annotations *)
21122295
let ownership_payload = build_ownership_section ctx'.ownership_annots in

stdlib/Http.affine

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -115,3 +115,16 @@ pub extern type Thenable;
115115
pub extern fn http_request_thenable(url: String,
116116
method: String,
117117
body: String) -> Thenable / { Net, Async };
118+
119+
/// Register a guest continuation to run once, after `t` settles (issue
120+
/// #225 ADR-013, mirroring the proven #205 `Vscode.thenableThen` shape
121+
/// but on the Http wasm-path module). The host invokes `on_settle`
122+
/// exactly once via the #199 closure-pointer ABI; the returned `Int`
123+
/// is an opaque disposable token. The transparent CPS transform
124+
/// (ADR-013) emits calls to this — source authors never write it; they
125+
/// write `fetch`/`get`/`post` and the WasmGC backend lowers onto this
126+
/// verified primitive. PR 2 base case: single async boundary, no
127+
/// live-local capture; the continuation is fired at most once (host
128+
/// single-fire + a guest-side defensive trap on re-entry).
129+
pub extern fn thenableThen(t: Thenable,
130+
on_settle: fn(Unit) -> Int) -> Int / { Async };

tests/codegen/http_cps_base.affine

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
// SPDX-License-Identifier: PMPL-1.0-or-later
2+
// issue #225 PR 2 — WasmGC CPS transform, base case (ADR-013).
3+
//
4+
// This is the FIRST end-to-end exercise of the transparent async
5+
// transform. The source is the recognised base-case shape — a single
6+
// async boundary then a continuation with NO live-local capture beyond
7+
// the async result binder:
8+
//
9+
// let r = <async-call>;
10+
// <cont using only r>
11+
//
12+
// The author writes no `thenableThen`/closure plumbing. The WasmGC
13+
// backend detects `Async ∈ fd_eff` + this shape and lowers it to:
14+
// emit the call (Thenable handle), reify the continuation as a #199
15+
// closure that captures `r`, and `thenableThen(handle, closure)` with
16+
// a once-resumption trap on the continuation. The host re-enters the
17+
// continuation after settlement; it reads the settled value via the
18+
// MINIMAL scalar accessor below (NOT the typed Response reader — that
19+
// is PR 3, per ADR-013 §Delivery-plan).
20+
//
21+
// `thenableThen` is in the `use` list so it resolves as an import even
22+
// though the author never calls it directly; auto-injecting that
23+
// import is part of the broader transparent-surface work (PR 3+). PR 2
24+
// proves the mechanism.
25+
26+
use Http::{Thenable, http_request_thenable, thenableThen};
27+
28+
// Minimal host accessor: returns the settled HTTP status as a scalar,
29+
// keyed by the Thenable handle. Deliberately NOT a typed `Response`
30+
// decoder (ADR-013 defers structured/`headers` decode to #161/PR 3) —
31+
// just enough to prove the continuation ran and observed the right
32+
// settled value.
33+
extern fn httpThenableStatus(t: Thenable) -> Int;
34+
35+
pub fn launch() -> Int / { Net, Async } {
36+
let r = http_request_thenable("https://example.test/ok", "GET", "");
37+
httpThenableStatus(r)
38+
}

0 commit comments

Comments
 (0)