AffineScript: Settled Design Decisions

This document records design decisions that are fully settled. Each entry gives the decision, the reasoning, and a pointer to the full ADR in .machine_readable/6a2/META.a2ml.

These are not open questions. Do not reopen them without amending the ADR.

Effect Invocation Syntax: Direct Call (ADR-008)

Effect operations are called with plain call syntax. No perform keyword.

fn fetch_user(id: Int) -> User / Http, Async {
  let resp = Http.get("/users/" ++ show(id))
  Async.await(resp.json())
}

The effects are declared once — in the return type (/ Http, Async). The call sites look like ordinary function calls. The type signature is the contract.

perform at every call site is noise that contradicts the ergonomics goal. Face parsers (Python-face, JS-face) can map async/await naturally to the Async effect via desugaring — no perform concept needed.

Conformance Suite Is Authoritative (ADR-009)

The spec is the source of truth. The parser must conform to the spec.

Standing rule: "Language scope lives in a written thesis. Thesis authoritative, code must conform. If disagreement, code is wrong."

As of 2026-04-11, 8/12 valid conformance tests fail to parse. Required fixes:

  1. PascalCase type names: Int, String, Bool, Option, and user-defined types are PascalCase. The parser’s ident rule (lowercase only) must be split: ident for values, type_ident (uppercase-leading) for types.

  2. Enum declaration syntax — parser must match the spec’s enum syntax.

  3. Effect op type parameters — parser must support type parameters on effect operations.

Target: 12/12 conformance tests passing. Once passing, any future parser change that breaks a conformance test is a bug, not a spec disagreement.
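The ident / type_ident split in fix 1 can be illustrated with a small classifier. This is a Python sketch, not the compiler's lexer or Menhir grammar; the regular expressions and the name `classify_ident` are assumptions made for illustration, and the real rules may admit a different character set.

```python
import re

# Hypothetical token shapes: values lead with a lowercase letter or
# underscore, types lead with an uppercase letter.
VALUE_IDENT = re.compile(r"[a-z_][A-Za-z0-9_]*\Z")
TYPE_IDENT = re.compile(r"[A-Z][A-Za-z0-9_]*\Z")


def classify_ident(name: str) -> str:
    """Return 'type_ident' or 'ident'; raise ValueError for anything else."""
    if TYPE_IDENT.match(name):
        return "type_ident"
    if VALUE_IDENT.match(name):
        return "ident"
    raise ValueError(f"not an identifier: {name!r}")


for name in ["Int", "Option", "fetch_user", "resp"]:
    print(name, "->", classify_ident(name))
```

With the two rules disjoint on the first character, a name such as Int can no longer be rejected by a lowercase-only ident rule.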

QTT Surface Syntax: @-Annotations Primary (ADR-007)

The surface forms for quantity annotations are:

@linear        — used exactly once            (:1 sugar also accepted)
@erased        — compile-time only, not at runtime  (:0 sugar also accepted)
@unrestricted  — any number of uses           (:ω sugar also accepted)
                 (default — omitting the annotation means @unrestricted)

@linear/@erased/@unrestricted are the primary forms used in tutorials, error messages, IDE tooling, and generated code. The :1/:0/:ω forms are accepted as sugar (preserved for QTT paper compatibility and for users porting from Idris2/Granule) but are never the canonical form.
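The relationship between the sugar and the canonical forms can be sketched as a normalisation step followed by a use-count check. This is an illustrative Python sketch only: the function names are hypothetical, and the real checker works on typed terms rather than on a bare count of runtime uses.

```python
# Canonical names and sugar come from the list above; everything else
# (function names, the use-count model) is an assumption for illustration.
SUGAR_TO_CANONICAL = {":1": "@linear", ":0": "@erased", ":ω": "@unrestricted"}
CANONICAL = set(SUGAR_TO_CANONICAL.values())


def canonical_quantity(annotation=None):
    """Map a surface annotation (or its absence) to the canonical @-form."""
    if annotation is None:
        return "@unrestricted"  # omitting the annotation is the default
    if annotation in CANONICAL:
        return annotation
    if annotation in SUGAR_TO_CANONICAL:
        return SUGAR_TO_CANONICAL[annotation]
    raise ValueError(f"unknown quantity annotation: {annotation!r}")


def runtime_uses_ok(quantity, runtime_uses):
    """Check a runtime use count against the stated meaning of a quantity."""
    if quantity == "@linear":
        return runtime_uses == 1  # used exactly once
    if quantity == "@erased":
        return runtime_uses == 0  # compile-time only
    return True  # @unrestricted: any number of uses
```

Tooling that always prints the result of `canonical_quantity` never shows the sugar to the user, which matches the rule that the @-forms are the canonical surface.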

Type parameters on generic functions are implicitly erased — no annotation needed:

fn get_name[..r](obj: { name: String, ..r }) -> String = obj.name
//          ^^^  — implicitly @erased (compile-time only)

The own/ref/mut ownership keywords on function parameters are surface syntax for common combinations:

own T   →  @linear T       (caller transfers ownership)
ref T   →  @unrestricted T (read-only borrowed view)
mut T   →  @linear T + mutation rights (exclusive mutable access)

Face-Aware Error Formatting Is a First-Class Toolchain Concern (ADR-010)

When a developer writes Python-face AffineScript and hits a type error, the error must be expressed in Python-face terms — not raw AffineScript canonical syntax.

Architecture:

compiler (canonical error representation, face-agnostic)
      ↓
face-aware error formatter   ← separate layer, swappable
      ↓
terminal / IDE / LSP

The compiler itself has no face knowledge. The formatter receives (error, active-face) and produces the display string. Each face ships an error vocabulary mapping (e.g. Python-face: "affine binding" → "single-use variable").

This is a design commitment, not an immediate implementation task. The compiler’s error representation must be designed with formatability in mind from the start — errors must carry enough structured information for the formatter to reconstruct face-appropriate messages.
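A minimal sketch of the formatter layer, under stated assumptions: the error record fields, the error code, and the message template are all hypothetical; only the (error, active-face) input and the Python-face vocabulary entry come from the text above. It is written in Python for brevity, not in the compiler's implementation language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompilerError:
    """Face-agnostic error: structured fields only, no display text."""
    code: str      # hypothetical error code
    concept: str   # canonical term, e.g. "affine binding"
    name: str      # the offending identifier
    line: int


# Per-face vocabulary mappings. Only the Python-face entry is from the
# document; the "canonical" face simply maps nothing.
FACE_VOCABULARY = {
    "canonical": {},
    "python": {"affine binding": "single-use variable"},
}


def format_error(error: CompilerError, active_face: str) -> str:
    """Render the error in the vocabulary of the active face."""
    vocabulary = FACE_VOCABULARY.get(active_face, {})
    term = vocabulary.get(error.concept, error.concept)
    return f"line {error.line}: {term} '{error.name}' used more than once [{error.code}]"


err = CompilerError(code="E-EXAMPLE", concept="affine binding", name="conn", line=3)
print(format_error(err, "canonical"))
print(format_error(err, "python"))
```

Because the compiler emits only the structured record, swapping the face changes the wording without touching the compiler.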

typed-wasm: Dual Role (typed-wasm ADR-004)

The typed-wasm repository has two distinct, compatible roles:

  1. Primary: TypeLL 10-level type safety for WebAssembly linear memory. WASM linear memory as a schemaless database with schemas (regions) and type-safe access operations, verified through 10 progressive levels.

  2. Secondary: Aggregate library — the convergence infrastructure for AffineScript and Ephapax. Both languages compile to typed WasmGC and need agreed binary conventions (type layout, ABI, stdlib types, linking). The Idris2 ABI infrastructure that proves the 10-level claims also proves the cross-language layout contracts.

Neither role is subordinate. They are kept distinct in documentation and in src/abi/ (separate Idris2 modules, clear naming).

See typed-wasm/docs/architecture/AGGREGATE-LIBRARY-VISION.adoc for the aggregate library design.

Stdlib Namespace Model: Real Modules (ADR-011)

The standard library uses real modules with qualified paths, not a flat de-duplicated prelude.

Every stdlib/*.affine file declares module <Name>; and cross-file use is explicit (use option::{Option, Some, None}; / Result::unwrap). There is exactly one canonical definition per name, owned by its module. A minimal prelude module may re-export the universally needed names (Option, Result, Some/None/Ok/Err), but these are re-exports, never independent redefinitions.

This resolves the prelude vs option vs result signature-divergent duplicates (prelude.map(arr, f) vs option.map(f, opt)) by single ownership, and makes the AOT pipeline exercise real cross-module resolution — the actual objective of issue #128. It is consistent with the machinery the compiler (module_loader.ml, module/use/:: grammar) and the newer stdlib files (Core, Deno, Vscode, …) already use.
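The "re-exports, never independent redefinitions" rule can be modelled as an export table in which every entry points at exactly one owning definition. The sketch below is a Python model with hypothetical names (`Stdlib`, `define`, `re_export`, `resolve`); it does not describe module_loader.ml.

```python
class Stdlib:
    """Toy export table: each exported name resolves to one owning definition."""

    def __init__(self):
        self._exports = {}  # (module, name) -> (owning module, name)

    def define(self, module, name):
        key = (module, name)
        if key in self._exports:
            raise ValueError(f"{module}::{name} is already exported")
        self._exports[key] = key  # a definition owns itself

    def re_export(self, module, name, owner):
        target = (owner, name)
        if self._exports.get(target) != target:
            raise ValueError(f"{owner}::{name} is not a canonical definition")
        key = (module, name)
        if key in self._exports:
            raise ValueError(f"{module}::{name} is already exported")
        self._exports[key] = target  # points at the owner, defines nothing new

    def resolve(self, module, name):
        return self._exports[(module, name)]


lib = Stdlib()
lib.define("option", "Option")
lib.re_export("prelude", "Option", owner="option")
print(lib.resolve("prelude", "Option"))
```

Resolving through the prelude lands on the option module's definition, so a prelude copy with a diverging signature has nowhere to live.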

Settles issue #132; gates #133/#135/#137/#138. Full ADR in .machine_readable/6a2/META.a2ml (ADR-011).

Grammar Changes Are Correctness Assertions; Parser-Conflict Disclosure (ADR-012)

Principle. AffineScript’s grammar is changed only to make it state something true about the language’s semantics. The grammar is never contorted to lower a cosmetic build metric (e.g. a parser-generator conflict count). Correctness is the target; tooling noise is downstream of it, not the other way round.

Applications of the principle.

  • { } record syntax (issue #218, #215 families C+D). { in expression position is unconditionally a block; record/struct construction uses the { … } sigil. Justification is semantic: it removes the block-vs-record ambiguity and the struct-literal-in-if/match-scrutinee hazard by construction. The fall in conflict count was a consequence, never the rationale.

  • return/resume are diverging prefix expressions (issue #215 family B). They are parsed at statement-expression top level, not as expr_primary. Justification is semantic: return e has type Never — it never yields a value to an enclosing operator, so (return a) + b is dead code wearing an expression’s costume. Hoisting them makes the grammar assert "control flow diverges and greedily owns the rest of the computation; it is not an operand." (return a) + b now requires explicit parentheses — a feature (post-divergence dead code is made visible), in the same spirit as affine typing and explicit effect rows. This change is not motivated by, and does not reduce, the conflict count.

Residual LALR conflicts: won’t-fix, on correctness grounds. After the above, the parser still emits ~68 shift/reduce and a small number of reduce/reduce notices (chiefly the inherent expression-cascade ambiguity and the block trailing-expression-vs-statement boundary, state 401). These are inherent LALR(1) artefacts that Menhir resolves correctly — the full just test gate (257 cases, incl. AOT, golden, e2e) proves every accepted parse is the intended one. Eliminating them would require systemic precedence/left-factoring surgery across the core expression grammar: high blast radius (it can shift the parse of every existing program), for a payoff that is cosmetic only. That is exactly the contortion this ADR forbids. They are therefore intentionally left in place and tracked as a documented won’t-fix on issue #215.

Disclosure policy (masking is not hiding). Because a wall of "conflicts arbitrarily resolved" warnings makes a correct toolchain look broken — particularly damaging when correctness is the product — the default just build masks these specific benign notices. It does not pretend they are absent: every build that masks them prints the masked count, states that the parser parses correctly, references this decision, and gives the exact command (just build-loud, or AFFINESCRIPT_SHOW_MENHIR_NOISE=1, or plain dune build) to show the full raw output. Nothing is suppressed at the parser-generator level (that would itself be risky change for a cosmetic end); the underlying build is byte-for-byte unchanged and fully transparent on demand.
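The "masking is not hiding" rule can be sketched as an output filter that always reports what it removed. This is a Python illustration, not the project's just recipe: the notice pattern is an assumed approximation of the generator's wording, and the summary text and function name are hypothetical. Only the environment variable and the just build-loud command come from the paragraph above.

```python
import os
import re

# Assumed wording of the benign notices; the real Menhir output may differ.
CONFLICT_NOTICE = re.compile(
    r"conflicts? (?:was|were) arbitrarily resolved"
    r"|states? ha(?:s|ve) (?:shift|reduce)/reduce conflicts"
)


def mask_conflict_noise(lines, show_noise=None):
    """Drop matching notices, but always disclose how many were dropped."""
    lines = list(lines)
    if show_noise is None:
        show_noise = os.environ.get("AFFINESCRIPT_SHOW_MENHIR_NOISE") == "1"
    if show_noise:
        return lines  # raw output, untouched
    kept = [line for line in lines if not CONFLICT_NOTICE.search(line)]
    masked = len(lines) - len(kept)
    if masked:
        kept.append(
            f"note: {masked} benign parser-conflict notice(s) masked (ADR-012); "
            "the parser parses correctly; run `just build-loud` for raw output"
        )
    return kept
```

The filter works on the build's printed output only, so the generated parser is the same whether or not the notices are shown.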

Settles the disposition of issue #215 residual families. Full ADR in .machine_readable/6a2/META.a2ml (ADR-012).

Async on WasmGC: Transparent CPS Continuation Transform (ADR-013)

stdlib/Http.affine exposes one portable surface, fetch(req) → Response / { Net, Async }. The Deno-ESM backend lowers it to a direct await (#226). The WasmGC backend cannot — extern calls are synchronous, the boundary is i32-only, and the estate’s async-over-wasm mechanism (#205) is callback-shaped. Issue #225 (Option 2, owner-chosen) requires one byte-identical source surface on both targets: the wasm backend hides the continuation plumbing.

Decision: on the WasmGC backend only, functions whose effect row includes Async are compiled via a selective CPS transform. Each async boundary splits the body; the post-split code becomes a generated continuation captured via the existing #199 closure ABI and registered via the existing #205 thenableThen host→guest re-entry. The enclosing Async function returns a Thenable handle, so it composes transparently up the call chain. Pure / non-Async code is untouched; this is not JSPI and not a general continuation feature — new orchestration over three proven primitives only.
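The shape of the split can be shown with a toy model. The Python sketch below is not the WasmGC codegen and does not use the #199 or #205 ABIs: `Thenable`, `fetch_thenable`, `describe_cps`, and `run_host` are hypothetical stand-ins. The direct-style body being transformed is `resp = fetch(req); return "got " + resp`.

```python
class Thenable:
    """Toy single-assignment handle with callback registration."""

    def __init__(self):
        self._done = False
        self._value = None
        self._callbacks = []

    def then(self, callback):
        if self._done:
            callback(self._value)
        else:
            self._callbacks.append(callback)

    def resolve(self, value):
        if self._done:
            raise RuntimeError("already resolved")  # no double resumption
        self._done, self._value = True, value
        callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback(value)


def fetch_thenable(req, host_queue):
    """Stand-in async primitive: the host resolves the handle later."""
    handle = Thenable()
    host_queue.append((handle, f"response to {req}"))
    return handle


def describe_cps(req, host_queue):
    result = Thenable()                        # the function returns a handle
    pending = fetch_thenable(req, host_queue)  # code before the boundary

    def continuation(resp):                    # code after the boundary
        result.resolve("got " + resp)

    pending.then(continuation)                 # host re-enters here later
    return result


def run_host(host_queue):
    """Simulate the host completing outstanding requests."""
    while host_queue:
        handle, value = host_queue.pop(0)
        handle.resolve(value)
```

A caller that itself has Async in its effect row would apply the same split around its call to `describe_cps`, which is how the handle composes up the call chain.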

Binding correctness obligations: capturing an affine/linear value in a continuation counts as that value's single use under the borrow checker, and double resumption must be impossible; the transform triggers iff Async ∈ fd_eff; Response reconstruction is a typed reader with no silent lossiness (general decode deferred to #161); the 258-case gate stays green at every commit, together with a wasm e2e parity test.

The Thenable-handle + thenableThen continuation protocol is the agreed async ABI for the typed-wasm convergence layer; Ephapax is a co-stakeholder (typed-wasm ADR-004). Delivered as 4 incremental, gated PRs. Full design in docs/specs/async-on-wasm-cps.adoc; full ADR in .machine_readable/6a2/META.a2ml (ADR-013).

Module-Qualified Type/Effect Path Separator (ADR-014)

The type/effect grammar had no module-qualified path production, so a qualified reference like Externs.Res was unrepresentable in any type or effect position (parse error at the .). An estate audit (compiler-as-oracle) found this the single dominant fault: 525 of ~1177 .affine files fail to parse, qualified-path the leading cause. ADR-011 already settled real modules with qualified paths; this consequence was simply unspeakable. The separator was the escalated, owner-decided question (module_path uses .; use/value paths use ::; the corpus uses . for types).

Decision: accept both . and :: (mixed permitted); Pkg::Type is canonical (consistent with ADR-011 value paths); . is tolerated and normalised to ::. The parser folds a qualified name into one canonical ::-joined ident, so resolve/typecheck/codegen need no change and the formatter prints the canonical form for free. Conflict-neutral by construction (it only adds ./:: lookahead where no reduce action existed — exactly the parse-error void): measured Menhir conflict states unchanged at 21 S/R + 1 R/R, item counts unchanged at 68 S/R / 7 R/R. The grammar PR unblocks parsing; most estate parse failures clear with zero consumer churn (genuine ReScript-surface residue is #229).
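The folding step can be sketched as a small normaliser. This is a Python illustration of the rule, not the Menhir production; the function name and the segment character set are assumptions.

```python
import re

SEPARATOR = re.compile(r"::|\.")
SEGMENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")  # assumed segment shape


def normalise_qualified_name(path: str) -> str:
    """Accept '.' and '::' (mixed permitted) and emit the canonical '::' form."""
    segments = SEPARATOR.split(path)
    for segment in segments:
        if not SEGMENT.match(segment):
            raise ValueError(f"bad path segment {segment!r} in {path!r}")
    return "::".join(segments)


for spelling in ["Externs.Res", "Externs::Res", "Pkg.Sub::Type", "Int"]:
    print(spelling, "->", normalise_qualified_name(spelling))
```

Because every spelling collapses to one ::-joined name before later passes see it, resolve, typecheck, and codegen only ever handle the canonical form.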

This decision is settled; do not reopen without amending the ADR. Full ADR in .machine_readable/6a2/META.a2ml (ADR-014); ledger CORE-03 in docs/TECH-DEBT.adoc.

WASI Preview2 / WASM Component-Model Migration (ADR-015)

lib/wasi_runtime.ml emits only wasi_snapshot_preview1.fd_write to stdout. No files, sockets, environment, clock, or argv — so there is no server-side runtime (INT-06 blocked) and no real host I/O. INT-03 (#180) is the substrate fix. The approach was an escalated one-way-door fork (2026-05-19 AskUserQuestion): (a) expand the preview1 surface, (b) preview1 + an external preview1→preview2 adapter, or (c) a full re-target to the WebAssembly Component Model with WASI 0.2. The owner chose (c) — explicitly the largest, one-way, highest-blast-radius path.

Decision: re-target the wasm/wasm-gc output to the WebAssembly Component Model with WASI 0.2 WIT worlds (wit/affinescript.wit, world affinescript:cli/run). Staged and non-breaking per slice; the legacy core-wasm + preview1 stdout path remains the default until the final slice, so the migration is reversible while in progress even though the end-state is one-way. The affinescript.ownership custom section is a multi-producer ABI (shared with hyperpolymath/ephapax; the typed-wasm contract carrier) and must survive verbatim onto the component’s embedded core module — its format must not change here; any change is coordinated in hyperpolymath/typed-wasm, never made unilaterally for this migration. Only the wasm target re-points; the 22+ source-to-source targets are unaffected.
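The "must survive verbatim" requirement on the affinescript.ownership section lends itself to a byte-level check. The Python sketch below reads custom sections from a core WebAssembly module (8-byte header, then sections of id, LEB128 size, contents; custom sections have id 0 and begin with a length-prefixed name). It assumes the embedded core module's bytes have already been extracted from the component, and the function names are hypothetical.

```python
def read_u32_leb(data: bytes, pos: int):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def custom_section(module: bytes, wanted: str):
    """Return the payload of the first custom section named `wanted`, or None."""
    if module[:4] != b"\x00asm":
        raise ValueError("not a wasm binary")
    pos = 8  # skip magic and version
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_u32_leb(module, pos + 1)
        end = pos + size
        if section_id == 0:  # custom section
            name_len, name_start = read_u32_leb(module, pos)
            name_end = name_start + name_len
            if module[name_start:name_end].decode("utf-8") == wanted:
                return module[name_end:end]
        pos = end
    return None


def ownership_section_survives(before: bytes, after: bytes) -> bool:
    """True only if both modules carry the section with identical bytes."""
    name = "affinescript.ownership"
    original = custom_section(before, name)
    return original is not None and original == custom_section(after, name)
```

A check of this kind is one way the S3 "ownership-section survival asserted" step could be expressed; the actual assertion used by the project is not specified here.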

Staged plan (ledger INT-03; each row is one gated PR): S1 (this) ADR + WIT world + plan + tooling prerequisite + roadmap truthing, no codegen change; S2 toolchain provisioning (wasm-tools, wasm-component-ld, wac), ABSENT today and a hard gate on S3+; S3 componentize on-ramp (codegen still emits core wasm; post-step wraps it via the standard preview1→preview2 adapter; ownership-section survival asserted; wasmtime component-run smoke); S4 native wasi:clocks/environment/argv; S5 wasi:filesystem (unblocks INT-06); S6 wasi:sockets, then flip the default wasm target to component and demote preview1 to a legacy target.

This decision is settled; do not reopen without amending the ADR. Full ADR in .machine_readable/6a2/META.a2ml (ADR-015); ledger INT-03 in docs/TECH-DEBT.adoc; roadmap in docs/ECOSYSTEM.adoc.

Effect-Threaded Async-Boundary Detection (ADR-016)

The WasmGC CPS transform (ADR-013) detects the async boundary structurally — a let whose RHS calls a primitive in a hardcoded set async_primitives = ["http_request_thenable"]. That was the contained #225 PR3a scope; it is brittle (every new async primitive needs a codegen edit) and blind to user-defined Async functions, blocking the fully transparent ADR-013 surface for user async code. #234 generalises it: the boundary is any call whose effect row ⊇ Async.

The AST carries no location/node id on ExprApp, and the escalated one-way-door fork (AskUserQuestion 2026-05-19) was decided in favour of a typecheck→codegen side-table, not AST annotation.

Decision: thread per-call effect rows via a side-table keyed by a deterministic shared call-site numbering (a single traversal in a new lib/effect_sites.ml, called identically by typecheck and codegen, so keys cannot drift; no AST shape change). Typecheck.check_program populates ordinal → effect_row (declared and inferred) and returns it; bin/main.ml threads it into codegen (source-to-source backends ignore it); the CPS boundary predicate becomes a table lookup with the existing structural recogniser as the sound table-miss fallback.
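The key property is that both passes number call sites with the same traversal, so an ordinal means the same node in each. The Python sketch below models that with a tuple-based toy AST; the node shapes and function names are hypothetical, and only the fallback set `http_request_thenable` comes from ADR-016's description of the structural recogniser.

```python
def call_sites(expr):
    """Yield every application node in one fixed pre-order traversal."""
    kind = expr[0]
    if kind == "app":                 # ("app", callee, [args])
        yield expr
        for arg in expr[2]:
            yield from call_sites(arg)
    elif kind == "let":               # ("let", name, rhs, body)
        yield from call_sites(expr[2])
        yield from call_sites(expr[3])
    # ("var", name) and ("lit", value) contain no calls


def typecheck(program, effects_of):
    """Build the side-table: call-site ordinal -> effect row."""
    return {
        ordinal: effects_of.get(node[1], frozenset())
        for ordinal, node in enumerate(call_sites(program))
    }


ASYNC_PRIMITIVES = {"http_request_thenable"}  # structural fallback set


def is_async_boundary(ordinal, node, table):
    row = table.get(ordinal)
    if row is None:                   # table miss: structural recogniser
        return node[1] in ASYNC_PRIMITIVES
    return "Async" in row


def async_boundaries(program, table):
    """Codegen side: re-run the same numbering and consult the table."""
    return [
        node[1]
        for ordinal, node in enumerate(call_sites(program))
        if is_async_boundary(ordinal, node, table)
    ]
```

Since `typecheck` and `async_boundaries` both enumerate `call_sites`, the keys cannot drift unless the shared traversal itself changes.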

Staged (ledger #234): S1 this ADR + plan (no code); S2 effect_sites.ml + typecheck builds/returns the table (built, unused — gate-neutral); S3 pipeline threads it, codegen switches to the table with structural fallback, new e2e proving a user-defined Async fn triggers the transform; S4 retire the hardcoded set (fallback kept for table-miss only).

This decision is settled; do not reopen without amending the ADR. Full ADR in .machine_readable/6a2/META.a2ml (ADR-016); ledger #234 / CORE-02 in docs/TECH-DEBT.adoc.

ReScript Block-Module Disposition: One Module Per File (ADR-017)

ReScript module Name { … } block modules have no AffineScript block form: the grammar (parser.mly:130-134) allows only a single optional module Path; header, before imports, so module A { } is a parse error. ADR-011 already settled "real modules": the file is the module. The estate ReScript→AffineScript ports (#229) carry block modules. A single block module per file is mechanical (hoist the header, drop the braces, dedent; already in the #229 canonical map and verified to parse), but multiple block modules in one file (e.g. standards/lol/…/OpenCyc.affine: module Config { } module Concepts { } module Types { } …, 14 estate occurrences) had no clean target. Escalated language-side as ESC-04 (#262), following the same bidirectional-evidence discipline as ADR-014 / #228.

Decision: one module per file — split, do not nest. Each ReScript module X { body } becomes its own X.affine whose first declaration is module X;, body dedented, use imports after the header. A file with N block-modules is split into N files. The grammar is not extended with a block/nested-module form: that would contradict ADR-011 (file = module) and is a major, conflict-risky grammar change for cosmetic gain — exactly the contortion ADR-012 forbids. No language/compiler change is needed (the file-header form already parses); this ADR settles the porting doctrine + the #229 canonical-map structural rule.
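The split rule is mechanical enough to sketch. The Python function below is a hypothetical porting helper, not part of the compiler or the #229 tooling: it finds top-level `module X { body }` blocks by brace matching and emits one `X.affine` per block with a `module X;` header and a dedented body. It ignores braces inside strings or comments, leaves nested block modules inside their parent's body, and does not add use imports.

```python
import re
import textwrap

BLOCK_HEADER = re.compile(r"\bmodule\s+([A-Z]\w*)\s*\{")


def split_block_modules(source: str) -> dict:
    """Map 'X.affine' -> file text for each top-level block module."""
    files = {}
    pos = 0
    while True:
        header = BLOCK_HEADER.search(source, pos)
        if header is None:
            return files
        name = header.group(1)
        depth = 1
        i = header.end()
        while depth > 0:              # scan to the matching close brace
            if i >= len(source):
                raise ValueError(f"unclosed block module {name}")
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
            i += 1
        body = textwrap.dedent(source[header.end():i - 1]).strip("\n")
        files[f"{name}.affine"] = f"module {name};\n\n{body}\n"
        pos = i
```

A file with N top-level block modules yields N entries, matching the "split, do not nest" rule; making the resulting module paths line up with the loader is the separate INT-02 concern.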

Adjacent, explicitly NOT in scope here: a split file still hits Resolve.UndefinedModule until the repo’s module-path↔file-layout matches the loader — that is cross-module graph coherence (INT-02 loader-bridge territory), tracked in RESCRIPT-ELIMINATION.adoc Tier-4 and the INT-02 ledger, never conflated with this disposition.

This decision is settled; do not reopen without amending the ADR. Full ADR in .machine_readable/6a2/META.a2ml (ADR-017); #229 canonical map in docs/RESCRIPT-ELIMINATION.adoc; escalation issue #262.

No Raw/FFI Escape: Typed extern Is the Only Host Bridge (ADR-018)

ReScript %%raw("<host source>") / %raw injects arbitrary untyped host source. AffineScript’s only host bridge is extern fn / extern type (grammar parser.mly: extern_fn_decl / extern_type_decl, FnExtern body) — typed, host-supplied, no body, no arbitrary-source escape. 14 estate %%raw occurrences (#229 Tier-3, ESC-01 #245) had no clean target. Escalated language-side — the bidirectional-evidence discipline of ADR-014 / #228, and the affine spirit of explicit, typed boundaries.

Decision: no raw escape — there is no %%raw analogue, by design. The typed extern fn / extern type declaration is the sole FFI surface. Every estate %%raw ports to a typed extern declaration whose signature states the host contract the raw blob assumed implicitly; the host implementation moves to the embedder/runtime shim. AffineScript will not gain an untyped intrinsic/extern raw block: an arbitrary-source hole defeats affine/effect tracking at the very boundary where the guarantees matter most — exactly what the type-and-effect discipline (and ADR-012) exists to prevent. This is a doctrine decision: extern already exists, so there is no compiler change; it settles the #229 canonical-map target and the language’s FFI stance.

Consequence: a %%raw that encodes genuine logic (not just a host call) is a design smell surfaced by the port — it must be re-expressed as real AffineScript plus a typed extern for any true host primitive, not smuggled through. %%raw-bearing #229 files (idaptik Main/StartupError, parts of burble) port under this doctrine; per-file execution is the #229 per-repo work.

This decision is settled; do not reopen without amending the ADR. Full ADR in .machine_readable/6a2/META.a2ml (ADR-018); #229 canonical map in docs/RESCRIPT-ELIMINATION.adoc; escalation issue #245.

Compiler Distribution: GitHub Releases Binaries + Thin Deno/JSR Shim (ADR-019)

INT-04 (#181) is "publish compiler + runtime". The runtime JS packages are JSR-publishable and shipped (#261); the compiler is a native OCaml binary — not a JSR/npm package — so its distribution was an escalated one-way-door fork (issue #260, AskUserQuestion 2026-05-19) over Releases-binaries / Guix-Nix / JSR-shim / combination. The owner chose the combination.

Decision: Releases-canonical, dual-channel. The existing release.yml (v* tags) is extended to build per-platform compiler binaries + a SHA256SUMS manifest attached to the GitHub Release (the single source of truth — Guix/Nix and any npm tail are additive fetch-derivations over it later, not separate producers). A thin Deno/JSR package @hyperpolymath/affinescript is the ergonomic front door: it downloads the host-platform binary from the pinned Release, verifies it against the SHA256SUMS checksum embedded in that shim version, caches and execs it (HTTPS-only, no secrets, one version+checksum pinned per shim release — no floating fetch). affinescript-lsp (INT-10) consumes the shim, which is what unblocks INT-10. An npm tail is deferred (only if an npm-native consumer needs it, mirroring the affine-vscode exception).
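The verify-before-exec step of the shim can be sketched as follows. This is a Python illustration rather than the Deno/JSR shim itself; it assumes the SHA256SUMS manifest uses the common `sha256sum` line format (`<hex digest>  <filename>`), and the function names and the artefact name used below are hypothetical.

```python
import hashlib
import hmac


def parse_sha256sums(manifest: str) -> dict:
    """Parse sha256sum-style lines into {filename: lowercase hex digest}."""
    sums = {}
    for line in manifest.splitlines():
        if not line.strip():
            continue
        digest, filename = line.split(maxsplit=1)
        sums[filename.lstrip("*")] = digest.lower()  # '*' marks binary mode
    return sums


def verify_binary(data: bytes, filename: str, pinned: dict) -> None:
    """Raise unless `data` matches the checksum pinned for `filename`."""
    if filename not in pinned:
        raise ValueError(f"no pinned checksum for {filename}")
    actual = hashlib.sha256(data).hexdigest()
    if not hmac.compare_digest(actual, pinned[filename]):
        raise ValueError(f"checksum mismatch for {filename}")


# Hypothetical artefact name; a real shim would embed the pinned table
# for its own release rather than compute it at run time.
binary = b"pretend compiler binary"
pinned = parse_sha256sums(
    hashlib.sha256(binary).hexdigest() + "  affinescript-example-platform\n"
)
verify_binary(binary, "affinescript-example-platform", pinned)
print("checksum ok")
```

Pinning the digest inside each shim release means a replaced or tampered Release asset fails verification instead of being cached and executed.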

Staged (ledger #260 / INT-10): S1 this ADR + plan + file INT-10 (no code); S2 release.yml per-platform binary + SHA256SUMS matrix; S3 the shim package (download + checksum-verify + cache + exec) + tests, publish owner-gated via the existing manual JSR workflow; S4 wire INT-10 affinescript-lsp onto the shim.

This decision is settled; do not reopen without amending the ADR. Full ADR in .machine_readable/6a2/META.a2ml (ADR-019); ledger #260 / INT-10 in docs/TECH-DEBT.adoc.