Skip to content

Commit bf28bcd

Browse files
feat(codegen): #225 PR3a — WasmGC CPS transform, multi-var live-local capture (#236)
Lifts PR2's zero-capture restriction. An Async fn body may now be <simple pre-lets>; let binder = <async-prim call>; <cont>, where cont references prelude-bound live locals alongside the async result binder. The prelude runs synchronously; the continuation is reified via the existing #199 ExprLambda path, which env-captures binder + every referenced pre-local (the #199 path already marshals N captures — no new closure code). Async boundary identification is structural-conservative (owner-chosen, affinescript#234 tracks the effect-threaded generalisation): the boundary is the let whose RHS calls a recognised async primitive (http_request_thenable). This also TIGHTENS PR2 detection (was: any ExprApp) — strictly safer. Single boundary only; a recognised primitive appearing in pre values or cont ⇒ fall back (Async→Async chaining is PR3c, affinescript#... PR3c task). pre is restricted to simple lets so the captured set is well-defined; any other shape ⇒ the unchanged synchronous lowering (zero regression). ADR-013 obligation 1 (affine/linear capture used exactly once; double-resumption impossible) is discharged by composition, not new static machinery: borrow-check runs on this straight-line AST BEFORE the transform (verified pipeline order in bin/main.ml), and the PR2 once-resumption trap guarantees cont executes exactly once. Documented in the recogniser docstring. New e2e tests/codegen/http_cps_capture.{affine,mjs}: a prelude local tag=7 captured across the split; continuation sees addTag(7, 200) ⇒ 7200, fires once, second entry traps. Full tools/run_codegen_wasm_tests.sh green (incl. PR2 base + PR1 skeleton, no regression); dune test --force 258 green. Conforms to typed-wasm ADR-005 (accessor model). Refs #225 #160
1 parent 62e16df commit bf28bcd

3 files changed

Lines changed: 249 additions & 54 deletions

File tree

lib/codegen.ml

Lines changed: 114 additions & 54 deletions
Original file line numberDiff line numberDiff line change
@@ -1788,76 +1788,135 @@ let simple_pat_name (p : pattern) : string option =
17881788
| PatVar id -> Some id.name
17891789
| _ -> None
17901790

1791-
(** PR2 base-case recogniser. Given an `Async` function's parameter
1792-
names and normalised body, returns [Some (binder, async_call, cont)]
1793-
iff the body is exactly `let binder = <call-expr>; <cont>` and the
1794-
only variables [cont] can reference across the async split are
1795-
[binder] itself or the function parameters (zero live-local
1796-
capture — ADR-013 PR2 scope; capture is PR3). Otherwise [None] ⇒
1797-
caller keeps the existing synchronous lowering. Conservative: a
1798-
[cont] that references a top-level helper/global is also rejected
1799-
here (its name is free relative to []), which is safe — it merely
1800-
defers more shapes to PR3 rather than risking an unsound split. *)
1791+
(** Recognised async-primitive calls (ADR-013 #225 PR3a, owner-chosen
1792+
structural-conservative async-boundary identification — see
1793+
affinescript#234 for the effect-threaded generalisation). The async
1794+
boundary is a `let` whose RHS is a call to one of these names. Extend
1795+
this list as wasm-path async stdlib primitives are added. *)
1796+
let async_primitives = [ "http_request_thenable" ]
1797+
1798+
let is_async_prim_call (e : expr) : bool =
1799+
match e with
1800+
| ExprApp (ExprVar id, _) -> List.mem id.name async_primitives
1801+
| _ -> false
1802+
1803+
(** Any recognised async-primitive name occurring free in [e]. Reuses
1804+
[find_free_vars] (which traverses call heads), so this catches an
1805+
async primitive anywhere in a sub-expression — used to enforce the
1806+
single-boundary rule for PR3a (Async→Async chaining is PR3c). *)
1807+
let mentions_async_prim (e : expr) : bool =
1808+
let fv = find_free_vars [] e in
1809+
List.exists (fun p -> List.mem p fv) async_primitives
1810+
1811+
(** Async-fn body recogniser (ADR-013 #225). Returns
1812+
[Some (pre, binder, async_call, cont)] iff the body is, after
1813+
trivial-block unwrapping, a sequence of zero or more simple
1814+
`let`-bindings ([pre]) followed by `let binder = <async-prim call>`
1815+
then a continuation [cont] (the remaining stmts + tail expr), where:
1816+
1817+
- PR2: [pre] = [] (zero live-local capture).
1818+
- PR3a: [pre] may bind live locals; [cont] may reference them — they
1819+
are captured into the continuation env by the proven #199
1820+
ExprLambda path (which already marshals N captures). [pre] is
1821+
restricted to simple `let`s so the captured set is well-defined.
1822+
1823+
Single boundary only: no recognised async primitive may appear in
1824+
[pre] values or in [cont] (chaining = PR3c). Capture soundness: every
1825+
free name in [cont] must be the binder, a param, a top-level
1826+
func/const/global, or a [pre]-bound local — anything else ⇒ [None]
1827+
(fall back to the unchanged synchronous lowering, zero regression).
1828+
The affine/linear single-use obligation (ADR-013 obl. 1) is
1829+
discharged by composition: borrow-check runs on this straight-line
1830+
AST before the transform, and the once-resumption trap guarantees
1831+
[cont] executes exactly once — so no new static machinery here. *)
18011832
let detect_async_base_case ~(globals : string list) (params : string list)
1802-
(body : expr) : (string * expr * expr) option =
1803-
(* Unwrap a trivial single-expression block (`{ e }`) so a block-bodied
1804-
`Async` fn whose sole content is `let r = <call>; cont` is still
1805-
recognised; any non-trivial block falls through to [None]. *)
1833+
(body : expr)
1834+
: (stmt list * string * expr * expr) option =
18061835
let rec unwrap = function
18071836
| ExprBlock { blk_stmts = []; blk_expr = Some e } -> unwrap e
18081837
| e -> e
18091838
in
1810-
(* Live-local capture check (ADR-013 PR2 scope). A name free in [cont]
1811-
is a live-local capture unless it is the async result binder, a
1812-
function parameter (re-supplied, not captured), or a top-level
1813-
function/const/global (resolved via func_indices, not the closure
1814-
env). The lone binder capture itself is handled by the proven #199
1815-
ExprLambda path; any *other* live local ⇒ defer to PR3. *)
1816-
let accept binder (call : expr) (cont : expr) =
1817-
match call with
1818-
| ExprApp _ ->
1819-
let allowed = binder :: params @ globals in
1820-
let escaping =
1821-
List.filter (fun v -> not (List.mem v allowed))
1822-
(dedup (find_free_vars [] cont))
1823-
in
1824-
if escaping = [] then Some (binder, call, cont) else None
1839+
let pre_bound_name = function
1840+
| StmtLet sl -> simple_pat_name sl.sl_pat
18251841
| _ -> None
18261842
in
1843+
let accept pre binder (call : expr) (cont : expr) =
1844+
if not (is_async_prim_call call) then None
1845+
else begin
1846+
(* [pre] must be only simple `let`s (well-defined capture set). *)
1847+
let pre_names = List.map pre_bound_name pre in
1848+
if List.exists (fun n -> n = None) pre_names then None
1849+
else
1850+
let pre_bound = List.filter_map (fun x -> x) pre_names in
1851+
(* Single boundary: no async primitive in pre values or cont. *)
1852+
let pre_vals = List.filter_map
1853+
(function StmtLet sl -> Some sl.sl_value | _ -> None) pre in
1854+
if mentions_async_prim cont
1855+
|| List.exists mentions_async_prim pre_vals then None
1856+
else
1857+
let allowed = binder :: params @ globals @ pre_bound in
1858+
let escaping =
1859+
List.filter (fun v -> not (List.mem v allowed))
1860+
(dedup (find_free_vars [] cont))
1861+
in
1862+
if escaping = [] then Some (pre, binder, call, cont) else None
1863+
end
1864+
in
1865+
(* Split a stmt list at the first `let b = <async-prim call>`. *)
1866+
let rec split_at_boundary acc = function
1867+
| StmtLet sl :: rest when is_async_prim_call sl.sl_value ->
1868+
begin match simple_pat_name sl.sl_pat with
1869+
| Some binder -> Some (List.rev acc, binder, sl.sl_value, rest)
1870+
| None -> None
1871+
end
1872+
| s :: rest -> split_at_boundary (s :: acc) rest
1873+
| [] -> None
1874+
in
18271875
match unwrap body with
1828-
(* Expression-form let: `let r = <call>; cont`. *)
1876+
(* Expression-form let: `let r = <async-prim call>; cont` (no pre). *)
18291877
| ExprLet lb ->
18301878
begin match simple_pat_name lb.el_pat, lb.el_body with
1831-
| Some binder, Some cont -> accept binder lb.el_value cont
1879+
| Some binder, Some cont -> accept [] binder lb.el_value cont
18321880
| _ -> None
18331881
end
1834-
(* Block-statement form: `{ let r = <call>; <rest...> }` — the parser's
1835-
normal desugaring of a block-bodied fn. The continuation is the
1836-
remainder of the block (trailing stmts + tail expr). *)
1837-
| ExprBlock { blk_stmts = StmtLet sl :: rest; blk_expr } ->
1838-
begin match simple_pat_name sl.sl_pat with
1839-
| Some binder ->
1840-
let cont = ExprBlock { blk_stmts = rest; blk_expr } in
1841-
accept binder sl.sl_value cont
1882+
(* Block form: optional simple `let`s, then the async boundary, then
1883+
the continuation (remaining stmts + tail expr). *)
1884+
| ExprBlock { blk_stmts; blk_expr } ->
1885+
begin match split_at_boundary [] blk_stmts with
1886+
| Some (pre, binder, call, post) ->
1887+
let cont = ExprBlock { blk_stmts = post; blk_expr } in
1888+
accept pre binder call cont
18421889
| None -> None
18431890
end
18441891
| _ -> None
18451892

1846-
(** ADR-013 PR2 transform. Lowers a recognised base-case `Async` fn
1847-
body `let binder = <async-call>; <cont>` to:
1848-
1. the async call (yields a `Thenable` handle), bound to [binder];
1849-
2. [cont] reified as a zero-arg continuation via the EXISTING #199
1850-
ExprLambda path — [binder] is the sole live local so it is
1851-
auto-captured into the `[fnId@0,envPtr@4]` env (no new closure
1893+
(** ADR-013 #225 transform. Lowers a recognised `Async` fn body
1894+
`<pre simple-lets>; let binder = <async-call>; <cont>` to:
1895+
1. [pre] generated synchronously (its locals become live locals,
1896+
in scope for capture);
1897+
2. the async call (yields a `Thenable` handle), bound to [binder];
1898+
3. [cont] reified as a zero-arg continuation via the EXISTING #199
1899+
ExprLambda path — [binder] and every [pre]-bound local that
1900+
[cont] references are auto-captured into the `[fnId@0,envPtr@4]`
1901+
env (the #199 path already marshals N captures; no new closure
18521902
code); a once-resumption trap is prepended to its body;
1853-
3. `thenableThen(handle, <closure>)`; the fn returns the result.
1903+
4. `thenableThen(handle, <closure>)`; the fn returns the result.
18541904
Pure/non-recognised fns never reach here (caller gates on
18551905
[fn_is_async] + [detect_async_base_case]); behaviour is unchanged
1856-
for them, exactly as before this slice. *)
1857-
let gen_async_base_case (ctx : context) (binder : string)
1906+
for them, exactly as before. *)
1907+
let gen_async_base_case (ctx : context) (pre : stmt list) (binder : string)
18581908
(async_call : expr) (cont : expr) (thenable_then_idx : int)
18591909
: (context * instr list) result =
1860-
let* (ctx1, call_code) = gen_expr ctx async_call in
1910+
(* [pre] runs synchronously before the async call; folding [gen_stmt]
1911+
allocates its locals into the context so the continuation's #199
1912+
capture pass (find_free_vars ∩ ctx.locals) picks them up. Simple
1913+
`let`s are stack-neutral, so [pre_code] leaves the stack empty. *)
1914+
let* (ctx_pre, pre_code) =
1915+
List.fold_left (fun acc st ->
1916+
let* (c, code) = acc in
1917+
let* (c', s) = gen_stmt c st in
1918+
Ok (c', code @ s)) (Ok (ctx, [])) pre in
1919+
let* (ctx1, call_code) = gen_expr ctx_pre async_call in
18611920
let (ctx2, binder_idx) = alloc_local ctx1 binder in
18621921
(* Once-resumption guard global (ADR-013 obligation 1): a second
18631922
continuation entry traps. Defence-in-depth over the host's
@@ -1881,7 +1940,8 @@ let gen_async_base_case (ctx : context) (binder : string)
18811940
| [] -> ctx4 (* unreachable: ExprLambda always lifts exactly one *)
18821941
in
18831942
let body =
1884-
call_code
1943+
pre_code (* synchronous prelude; stack-neutral *)
1944+
@ call_code
18851945
@ [ LocalSet binder_idx ]
18861946
@ [ LocalGet binder_idx ] (* arg 1: the Thenable handle *)
18871947
@ closure_code (* arg 2: [fnId,envPtr] continuation *)
@@ -1930,19 +1990,19 @@ let gen_function (ctx : context) (fd : fn_decl) : (context * func) result =
19301990
~globals:(List.map fst ctx_with_params.func_indices)
19311991
(List.map (fun p -> p.p_name.name) fd.fd_params)
19321992
body_expr with
1933-
| Some (binder, call, cont) ->
1993+
| Some (pre, binder, call, cont) ->
19341994
begin match List.assoc_opt "thenableThen"
19351995
ctx_with_params.func_indices with
1936-
| Some tt -> Some (binder, call, cont, tt)
1996+
| Some tt -> Some (pre, binder, call, cont, tt)
19371997
| None -> None
19381998
end
19391999
| None -> None
19402000
else None
19412001
in
19422002
let* (ctx_final, body_code) =
19432003
match async_base with
1944-
| Some (binder, call, cont, tt_idx) ->
1945-
gen_async_base_case ctx_with_params binder call cont tt_idx
2004+
| Some (pre, binder, call, cont, tt_idx) ->
2005+
gen_async_base_case ctx_with_params pre binder call cont tt_idx
19462006
| None -> gen_expr ctx_with_params body_expr
19472007
in
19482008

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
// SPDX-License-Identifier: PMPL-1.0-or-later
2+
// issue #225 PR 3a — WasmGC CPS transform, multi-var live-local capture.
3+
//
4+
// PR2 proved the zero-capture base case. PR3a lifts that: a `let` bound
5+
// BEFORE the async boundary (`tag`) is a live local the continuation
6+
// references alongside the async result binder (`r`). The transform
7+
// runs the prelude synchronously, then reifies the continuation via the
8+
// existing #199 closure path, which env-captures BOTH `tag` and `r`
9+
// (the #199 path already marshals N captures — no new closure code).
10+
//
11+
// Structural-conservative boundary (owner-chosen, affinescript#234 for
12+
// the effect-threaded generalisation): the async boundary is the `let`
13+
// whose RHS calls a recognised async primitive (http_request_thenable).
14+
// Single boundary only here (Async->Async chaining is PR3c).
15+
//
16+
// `addTag(tag, status)` is a minimal host accessor that proves BOTH the
17+
// captured prelude local AND the settled value reached the continuation.
18+
19+
use Http::{Thenable, http_request_thenable, thenableThen};
20+
21+
extern fn httpThenableStatus(t: Thenable) -> Int;
22+
extern fn addTag(base: Int, status: Int) -> Int;
23+
24+
pub fn launch() -> Int / { Net, Async } {
25+
let tag = 7;
26+
let r = http_request_thenable("https://example.test/ok", "GET", "");
27+
addTag(tag, httpThenableStatus(r))
28+
}
Lines changed: 107 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,107 @@
1+
// SPDX-License-Identifier: PMPL-1.0-or-later
2+
// issue #225 PR 3a — wasm e2e for multi-var live-local capture.
3+
//
4+
// Proves the continuation, when the host re-enters it after settlement,
5+
// sees BOTH the captured prelude local `tag` (=7) and the settled HTTP
6+
// status (=200): addTag(7, 200) -> 7200. Same #199 wrapHandler dispatch
7+
// + #205 Thenable convention as the PR2 base-case test.
8+
import assert from 'node:assert/strict';
9+
import { readFile } from 'node:fs/promises';
10+
11+
globalThis.fetch = async (url, init) => ({
12+
status: url.includes('/missing') ? 404 : 200,
13+
headers: { forEach: (cb) => cb('text/plain', 'content-type') },
14+
text: async () => `ok:${init && init.method}`,
15+
});
16+
17+
let inst = null;
18+
const _handles = new Map();
19+
const _results = new Map();
20+
let _next = 1;
21+
let contFired = 0;
22+
let contReturn = null;
23+
let addTagCalls = [];
24+
let savedCb = null;
25+
26+
function readString(ptr) {
27+
const dv = new DataView(inst.exports.memory.buffer);
28+
const len = dv.getUint32(ptr, true);
29+
const bytes = new Uint8Array(inst.exports.memory.buffer, ptr + 4, len);
30+
return new TextDecoder('utf-8').decode(bytes);
31+
}
32+
33+
function wrapHandler(closurePtr) {
34+
return () => {
35+
const tbl = inst.exports.__indirect_function_table;
36+
const dv = new DataView(inst.exports.memory.buffer);
37+
const fnId = dv.getInt32(closurePtr, true);
38+
const envPtr = dv.getInt32(closurePtr + 4, true);
39+
const fn = tbl.get(fnId);
40+
const args = [envPtr];
41+
while (args.length < fn.length) args.push(0);
42+
return fn(...args);
43+
};
44+
}
45+
46+
const imports = {
47+
wasi_snapshot_preview1: { fd_write: () => 0 },
48+
env: {
49+
httpThenableStatus: (tHandle) => {
50+
const v = _results.get(tHandle);
51+
return v && typeof v.status === 'number' ? v.status : -1;
52+
},
53+
// Proves the continuation received the captured prelude local AND
54+
// the settled value: distinct, order-sensitive encoding.
55+
addTag: (base, status) => {
56+
addTagCalls.push([base, status]);
57+
return base * 1000 + status;
58+
},
59+
},
60+
Http: {
61+
http_request_thenable: (urlPtr, methodPtr, bodyPtr) => {
62+
const url = readString(urlPtr);
63+
const method = readString(methodPtr);
64+
const body = readString(bodyPtr);
65+
const h = _next++;
66+
const p = globalThis
67+
.fetch(url, { method, body: body || undefined })
68+
.then(async (r) => ({ status: r.status, body: await r.text() }))
69+
.catch((e) => ({ __error: String(e) }));
70+
_handles.set(h, p);
71+
return h;
72+
},
73+
thenableThen: (tHandle, onSettlePtr) => {
74+
const cb = wrapHandler(onSettlePtr);
75+
savedCb = cb;
76+
Promise.resolve(_handles.get(tHandle)).then((v) => {
77+
_results.set(tHandle, v);
78+
contFired += 1;
79+
contReturn = cb();
80+
});
81+
return 1;
82+
},
83+
},
84+
};
85+
86+
const buf = await readFile('./tests/codegen/http_cps_capture.wasm');
87+
const m = await WebAssembly.instantiate(buf, imports);
88+
inst = m.instance;
89+
90+
const disposable = inst.exports.launch();
91+
assert.ok(Number.isInteger(disposable), 'launch() returns synchronously');
92+
assert.equal(contFired, 0, 'continuation deferred until settlement');
93+
94+
await new Promise((res) => setTimeout(res, 0));
95+
await Promise.resolve();
96+
97+
assert.equal(contFired, 1, 'continuation fired exactly once');
98+
assert.deepEqual(addTagCalls, [[7, 200]],
99+
'continuation saw captured prelude local tag=7 AND settled status=200');
100+
assert.equal(contReturn, 7200, 'addTag(7,200) result returned by the continuation');
101+
102+
assert.throws(() => savedCb(),
103+
(e) => e instanceof WebAssembly.RuntimeError,
104+
'second continuation entry traps (once-resumption guard)');
105+
assert.equal(contFired, 1, 'trapped re-entry did not re-run continuation');
106+
107+
console.log('test_http_cps_capture.mjs OK');

0 commit comments

Comments
 (0)