docs(runtime): master spec for Willow App Runtime (handoff draft)#636

Draft
intendednull wants to merge 7 commits into main from claude/wasm-plugin-system-WyY1p

@intendednull
Owner

Status

Draft / handoff — not intended to merge into main as-is.

This branch captures a brainstorming arc that started as "WASM plugin system" and reframed into a full P2P app runtime for Willow. The runtime is large enough that it likely deserves its own project; this PR exists so the design state is reviewable and transferable in one place.

What's here

Two new files under docs/specs/2026-04-27-willow-runtime/:

  • README.md (~675 lines) — master spec, high-level only. Outlines the runtime model, component profiles, determinism story, crypto/capability boundaries, ABI commitments, MVP shape, and a list of planned child specs.
  • research-notes-distributed-maintenance.md (~157 lines) — research bibliography and open questions for the distributed-maintenance / participation-enforcement problem. Defers a master-spec section pending a literature read.

No code changes. No crate changes. Docs only.

Core reframe

Willow apps are bundles of WASM components, not "plugins on top of a chat client." Each app declares components with one of four runtime profiles:

| Profile | Purpose | Determinism requirement |
| --- | --- | --- |
| state-apply | Materialize event into state; authority verdict | Strict: pure function of (prior state, event) plus a deterministic helper set |
| state-propose | Build candidate event from intent | Loose: kernel re-checks via state-apply in dry-run mode |
| interaction | UI / user-facing surface | Non-deterministic OK; runs per-peer |
| behavior | Bots, bridges, automations | Non-deterministic OK; identity per (peer, instance) |

The kernel brokers all I/O via capabilities. Apps cannot touch keys, network, or storage directly.
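
To make the profile table concrete, here is a minimal Rust sketch of how a kernel might model the four profiles and their determinism requirements. The type and method names (`Profile`, `Determinism`, `determinism`) are hypothetical and not taken from the spec; only the four profile names and their requirements come from the table.

```rust
/// The four runtime profiles named in the profile table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    StateApply,
    StatePropose,
    Interaction,
    Behavior,
}

/// How strictly a component of a given profile is constrained.
/// (Hypothetical naming; the spec only describes these levels in prose.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Determinism {
    /// Pure function of (prior state, event) plus the deterministic helper set.
    Strict,
    /// Output is re-checked by the kernel via state-apply in dry-run mode.
    RecheckedByKernel,
    /// Non-deterministic behavior is acceptable.
    Unconstrained,
}

impl Profile {
    /// Map each profile to the requirement listed in the table.
    pub fn determinism(self) -> Determinism {
        match self {
            Profile::StateApply => Determinism::Strict,
            Profile::StatePropose => Determinism::RecheckedByKernel,
            Profile::Interaction | Profile::Behavior => Determinism::Unconstrained,
        }
    }
}
```

A kernel could consult `Profile::determinism` at instantiation time to decide which host imports to bind, though the spec leaves enforcement details to a child spec.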

Key design decisions captured

  • Component Model + WIT as the interface surface (with Extism as a possible v1).
  • Browser viability via jco transpile + sync-ABI submit-and-poll for async surfaces.
  • Pre-check soundness — pre-check is mechanically the same WASM function as apply, called by kernel in dry-run mode. Not a convention.
  • Cross-peer convergence of state-apply outputs requires an app-exported canonical state-digest() (raw memory hashes diverge across allocators).
  • Key custody stays in kernel. Apps see host.seal / host.open / host.install-key / host.mls / host.verify-payload-mac. install-key returns () — whether a peer can decrypt is an interaction-side query (host.can-open).
  • UI is an app. Default substrate Leptos; Dioxus 0.7 as v2 candidate; Bevy ruled out as primary substrate but kept as a far-future GPU surface escape hatch. Custom-pixel UIs (whiteboard, code editor, 3D voice room) ship as sandboxed iframes with a postMessage protocol.
  • Peer-symmetric, not client/server. Worker nodes are peers running the same component bundle, just with different profile-instantiation choices.
  • Lazy loading by hash; components instantiated on first use, cached, and content-addressed via iroh-blobs.
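
The state-digest() decision can be illustrated with a small Rust sketch. It assumes a hypothetical `AppState` holding a key-value map, and it uses 64-bit FNV-1a purely as a stand-in hash; the spec does not name a hash function here and a real digest would presumably use a cryptographic one. The point is only that the digest is computed over a canonical (sorted, length-prefixed) encoding rather than over in-memory layout.

```rust
use std::collections::HashMap;

/// 64-bit FNV-1a. A stand-in for illustration only, not a cryptographic hash.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hypothetical app state: a map whose iteration order is arbitrary.
pub struct AppState {
    pub entries: HashMap<String, u64>,
}

impl AppState {
    /// Encode entries in sorted key order with length prefixes, then hash.
    /// Two peers with the same logical state get the same digest regardless
    /// of insertion order or allocator behavior.
    pub fn state_digest(&self) -> u64 {
        let mut sorted: Vec<(&String, &u64)> = self.entries.iter().collect();
        sorted.sort(); // keys are unique, so this orders by key
        let mut buf = Vec::new();
        buf.extend_from_slice(&(sorted.len() as u64).to_le_bytes());
        for (k, v) in sorted {
            buf.extend_from_slice(&(k.len() as u64).to_le_bytes());
            buf.extend_from_slice(k.as_bytes());
            buf.extend_from_slice(&v.to_le_bytes());
        }
        fnv1a(&buf)
    }
}
```

Two states built by inserting the same entries in a different order produce the same digest, which is the convergence property raw memory hashing cannot give.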

Open / deferred

The biggest open problem is distributed maintenance: scaling an app's maintenance work (persister, snapshot provider, sync provider, replay buffer) across peers without letting custom clients free-ride via Sybil identities. The research notes file enumerates the relevant prior art (BAR Gossip, EigenTrust, SybilGuard, Whanau, BitTorrent choking, Holochain DHT-responsibility, Filecoin storage proofs, IPFS Bitswap) and flags Willow's existing permission/invite trust graph as a unique advantage worth exploiting.

Other deferred items, all listed at the bottom of README.md:

  • Stable public discovery channel for relay topic rotation
  • Specific WIT interfaces per profile
  • Capability install UX
  • Distribution / signing / versioning
  • App SDK ergonomics
  • Determinism enforcement details
  • Worker-as-untrusted-WASM
  • Crypto boundaries (concrete API)
  • Runtime/actor coexistence
  • MVP demo app + chat-server migration

Why draft, why not merge

  • The spec is brainstorming-grade and hasn't been pressure-tested against an implementation prototype.
  • The runtime is plausibly a separate project. Merging into willow would imply commitment we haven't made.
  • Several load-bearing decisions (participation enforcement, behavior-identity custody under multi-device, snapshot canonicalization across implementation revs) are explicitly deferred and could shift assumptions in the master spec.

Review history (for context)

The two files went through four review passes plus a cold-read pass plus a coverage audit before this PR. Notable fixes:

  • host.install-key originally returned a peer-local boolean — kernel can't enforce "MUST NOT branch on return"; replaced with () plus host.can-open on the interaction side.
  • Pre-check originally documented as "by convention via WIT-exposed function" — corrected to "mechanically the same WASM function as apply, dry-run mode."
  • Snapshot determinism originally relied on raw-memory hashing — corrected to require an app-exported state-digest().
  • "No host imports = proof of determinism" phrasing dropped after the helper-set was added; replaced with "imports must be pure functions of inputs."
  • Custom-pixel UI surfaces and lazy-loading semantics were added in the audit pass.
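
The "imports must be pure functions of inputs" rule could be checked mechanically against a component's declared imports. The Rust sketch below is one possible shape, not the spec's mechanism: the helper names are the ones listed in the commit notes for this PR (using the later name `verify-payload-mac`), while `disallowed_imports` and the `wall-clock-now` import in the usage note are hypothetical.

```rust
/// Deterministic helpers the commit notes list as importable by state-apply.
/// (`verify-payload-mac` is the later name for `verify-seal-envelope`.)
const STATE_APPLY_HELPERS: [&str; 5] = [
    "verify-signature",
    "verify-payload-mac",
    "hash",
    "install-key",
    "now-hlc-from-event",
];

/// Return every declared import that is not in the deterministic helper set.
/// An empty result means the component only imports allowed helpers.
pub fn disallowed_imports<'a>(declared: &[&'a str]) -> Vec<&'a str> {
    declared
        .iter()
        .copied()
        .filter(|name| !STATE_APPLY_HELPERS.contains(name))
        .collect()
}
```

For example, `disallowed_imports(&["hash", "wall-clock-now"])` would flag the hypothetical `wall-clock-now` import, since a wall clock is not a pure function of the event being applied.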

Test plan

N/A — docs-only PR. No code paths exercised.

https://claude.ai/code/session_01Cu88fzF2LcU5udP11jpSrz


Generated by Claude Code

claude added 7 commits April 27, 2026 08:31
Reframes the WASM-plugin discussion as a small kernel hosting typed,
capability-mediated, content-addressed P2P apps. Chat becomes one app among
many, the UI becomes one app among many, and workers become commodity peer
hosts. Captures the agreed framing as of 2026-04-27 and seeds a directory
for child specs to refine sub-systems incrementally.

This is exploratory and long-horizon; nothing here is committed code.

Pass-1 review surfaced ten major coherence issues. All applied:

- Split state component into apply (deterministic) and propose
  (non-deterministic, originating peer only); update profile table.
- Add "Crypto and key custody" section placing keys, sealing, MLS on
  the kernel side via host.seal/host.open/host.mls bound to opaque
  handles. Components hold handles; kernel custodies bytes.
- Add "Runtime and actors" section: each component instance is owned
  by exactly one actor; runtime is a typed-sandboxing layer above the
  existing concurrency model, not a replacement.
- Soften "UI is an app": the default UI app is privileged with a
  broad platform surface; ui:* is an interaction contract, not a
  portable substrate.
- Tighten ABI option B: Extism is not a WIT subset; cross-component
  composition in v1 is kernel-brokered RPC by opaque ID; migration
  to Component Model is a real refactor for app authors.
- Add "worker trust model shifts" paragraph: workers move from
  trusted-Rust to untrusted-WASM execution host; DoS, fuel, fair-share
  are load-bearing.
- Acknowledge that the per-app permission system delegates pre-check to
  app code; responsibility for a precise pre-check shifts to app authors,
  and the chat-server migration must address it directly.
- Promote behavior identity from open question to constraint:
  kernel-custodied keypair, app-defined permission grants. Weaken
  MVP item 6 to "observe and log" only; emit-events is post-MVP.
- Soften view-model coarse-grain claim: per-surface, version-tagged,
  paged for large lists; diff strategy for child spec.
- Fix dual-target compilation framing: kernel keeps dual-target
  discipline; app code is built once to wasm.

Also adds three new child specs to the planned list (worker as
untrusted-WASM host, crypto boundaries, actor coexistence).

Pass-2 review surfaced eight major issues. Three must-fix were
soundness-level: state-apply needs deterministic crypto helpers (the
absence-of-imports proof was too strong); behavior identity needed a
concrete per-(peer, instance) shape; key-distribution apply had no
import path to install received keys. All applied:

- Replace "no imports = proof" with "imports must be pure functions of
  inputs." Enumerate the deterministic helper set apply may import:
  verify-signature, verify-seal-envelope, hash, install-key,
  now-hlc-from-event. Update profile table and capability model.
- Fix the crypto section's contradiction by adding the apply-side
  helpers (verify-seal-envelope, install-key) to the typed crypto
  host imports list.
- Commit to behavior identity being per-(peer, behavior-instance) with
  no kernel-mediated cross-peer migration. Apps that need stable bot
  identity define an in-band registration event mapping the per-peer
  keypair to an app-level role.
- Add the constraint that an app's pre-check and apply MUST share the
  authority decision (typically the same WIT-exposed function) — drift
  is a soundness bug the kernel cannot detect. Add open question for
  pre-check failure-mode handling on the originating peer.
- Tighten "actor owns one component instance" to make the per-peer
  scope explicit; the runtime makes no claim about cross-peer actor
  topology.
- Soften kernel dual-target claim: subsystems like MLS engine and
  persistent key storage may need platform-specific backends behind a
  stable trait; cataloguing belongs in the crypto child spec.
- Reconcile relay role with topic-ID rotation: app-defined rotations
  must publish post-rotation IDs on a stable, public discovery channel
  the relay can subscribe to; relays remain transport-only.
- Soften "What changes about Willow" framing to responsibility-level
  rather than file-tree level, matching the spec's stated scope.

Pass-3 caught two contradictions pass-2 introduced and one soundness
gap I missed in pass-2's helper-set design.

Contradictions resolved:
- The "no host imports = no non-determinism" constraint bullet still
  said the old thing while the determinism section was reframed in
  pass-2. Aligned both: deterministic-by-construction, only
  non-deterministic imports are forbidden.
- The "stable public discovery channel" requirement for relay topic-ID
  rotation directly contradicted the epoch-rotation spec's whole point
  (future topic IDs unpredictable to non-members). Weakened to a
  child-spec deferral with a hint at the likely member-announced shape.

Soundness:
- host.install-key returns success/failure based on local key custody,
  which is observable. Apps could branch on the return and diverge the
  snapshot. Added an explicit MUST: the return is informational about
  local capability only and apps MUST NOT incorporate it into any state
  included in the cross-peer snapshot hash. Cross-peer state-hash gossip
  is the conformance check.

Wording and placement:
- Pre-check now explicitly placed under the state-apply runtime profile,
  same deterministic helper set, same fuel posture. This is what makes
  the pre-check/apply shared-decision MUST mechanically possible.
- "App-level pre-check in state-apply" sentence in the crypto section
  was collapsing pre-check into apply; rephrased to match pass-2's
  careful separation.
- Renamed verify-seal-envelope to verify-payload-mac to avoid collision
  with the rejected seal-gift-wrap design. Clarified the authenticity
  property (key possession, not author identity), and that MLS
  application messages do not flow through the DAG so are not what
  apply verifies.
- "Nothing to leak" overclaim softened to acknowledge resource
  consumption (handle namespace, key-store, fuel) is bounded by the
  worker child spec.

Open-question list expanded with four decisions the spec is now
silently assuming: pre-check fuel budget, handle namespace ownership,
snapshot portability across component-version upgrades, multi-peer
behavior coordination (leader election / dedup). Added relay-and-
topic-rotation as a planned child spec.

Cold-read pass (no briefing on prior fixes) caught two soundness
issues the prior passes missed and several real should-fixes.

Soundness fixes:
- host.install-key now returns (), not success/failure. The kernel
  records the (handle, blob) pair on every peer; whether THIS peer can
  unwrap is kernel-local, never visible to apply. Eliminates the
  peer-local-return carve-out entirely; apply is bit-identical across
  peers by construction. Interaction profile asks separately
  (host.can-open / host.open error). Stronger than the
  "MUST-NOT-branch-on-return" rule pass-3 added.
- Pre-check is now mechanically the same WASM function as apply's
  authority verdict — kernel calls it in dry-run mode against a
  scratch post-state. "Shared decision" is no longer a convention
  app authors must follow; it's a structural property of the WIT
  contract because pre-check and apply are the same export.
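
The "same export, dry-run mode" property can be sketched in Rust as follows. This is a simplified model under stated assumptions: `State`, `Event`, `Verdict`, and the membership rule inside `apply` are hypothetical stand-ins, and a real kernel would call a WASM export rather than a native function. What the sketch shows is that `pre_check` contains no authority logic of its own; it runs `apply` against a scratch copy and discards the result.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct State {
    pub members: Vec<String>,
    pub log: Vec<String>,
}

pub struct Event {
    pub author: String,
    pub body: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Accept,
    Reject,
}

/// The single app export: materializes the event and returns the verdict.
/// (The membership rule is a placeholder authority decision.)
pub fn apply(state: &mut State, event: &Event) -> Verdict {
    if !state.members.contains(&event.author) {
        return Verdict::Reject;
    }
    state.log.push(event.body.clone());
    Verdict::Accept
}

/// Kernel-side pre-check: call the same `apply` on a scratch post-state,
/// then drop the scratch copy so the real state is untouched.
pub fn pre_check(state: &State, event: &Event) -> Verdict {
    let mut scratch = state.clone();
    apply(&mut scratch, event)
}
```

Because `pre_check` delegates entirely to `apply`, the two cannot drift apart, which is the structural guarantee the cold-read pass was after.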

Master-level commitments added:
- Pre-check fails closed: panic, fuel exhaustion, trap, or unbounded
  loop within budget rejects the user action and does not sign.
  Failing open is forbidden because rejected events accumulate in the
  per-author DAG. Adversarial-app self-DoS is detectable and
  recoverable.
- Sync ABI uses submit-and-poll for inherently-async surfaces:
  components call sync host functions returning request-tokens; the
  kernel re-enters the component via on-completion handlers when ops
  finish. Ergonomic cost flagged.
- Snapshot convergence is via app-exported canonical state-digest(),
  not raw linear-memory hashing. Canonical encoding (postcard with
  sorted collections per existing precedent) is for the determinism
  child spec.
- ui:* calls that proxy privileged platform surfaces (clipboard,
  file pickers, navigation, push) are capability-checked PER CALL
  against the calling component's manifest, not just at import-binding.
  Prevents composed components from social-engineering the broadly-
  privileged UI app.
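
The submit-and-poll commitment above can be sketched in Rust. Every name here (`Kernel`, `RequestToken`, `submit_fetch`, `drive`, `Component`, `Recorder`) is hypothetical, and the "I/O" is faked by echoing the key's bytes; the sketch only shows the control flow of a sync call returning a token and the kernel later re-entering the component through a completion handler.

```rust
use std::collections::VecDeque;

/// Opaque handle returned by a synchronous host call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RequestToken(pub u64);

/// What a component exports so the kernel can re-enter it on completion.
pub trait Component {
    fn on_completion(&mut self, token: RequestToken, result: Vec<u8>);
}

pub struct Kernel {
    next_token: u64,
    pending: VecDeque<(RequestToken, Vec<u8>)>,
}

impl Kernel {
    pub fn new() -> Self {
        Kernel { next_token: 0, pending: VecDeque::new() }
    }

    /// Synchronous host function: queue the operation, return a token at once.
    pub fn submit_fetch(&mut self, key: &str) -> RequestToken {
        let token = RequestToken(self.next_token);
        self.next_token += 1;
        // Stand-in for real async I/O: the "result" is just the key's bytes.
        self.pending.push_back((token, key.as_bytes().to_vec()));
        token
    }

    /// Deliver finished operations by re-entering the component.
    pub fn drive(&mut self, component: &mut dyn Component) {
        while let Some((token, result)) = self.pending.pop_front() {
            component.on_completion(token, result);
        }
    }
}

/// Example component that records each completion it receives.
pub struct Recorder {
    pub completed: Vec<(RequestToken, Vec<u8>)>,
}

impl Component for Recorder {
    fn on_completion(&mut self, token: RequestToken, result: Vec<u8>) {
        self.completed.push((token, result));
    }
}
```

The ergonomic cost the commit message flags is visible even here: the component has to correlate tokens with its own pending intents instead of awaiting a result inline.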

Cross-spec links:
- Behavior identity custody is structurally the same problem as
  multi-device user identity; flagged as something that should share
  one kernel mechanism, not be invented twice.
- The in-flight epoch-rotation work (2026-04-24) needs to land in the
  new shape: relay no longer told "this is a rotation, here's the next
  topic id" by app code.

Open questions added: worker capability advertisement (parallel to
existing relay capability document).
…tion

Captures the reframe from earlier conversation: maintenance is not a
separate work-tracking concept but a fourth class of components
alongside state/interaction/behavior. Persister, snapshot provider,
sync provider, replay buffer all become components in an app's bundle.
Scaling = more peers running an app = more maintenance capacity,
automatically.

The participation/free-rider problem under Sybil is load-bearing for
what the master spec will eventually commit to here, and is research-
heavy. Captures pointers into prior art across:

- Free-rider quantification (Adar/Huberman)
- Tit-for-tat reciprocity (BitTorrent choking, Bitswap)
- Reputation aggregation (EigenTrust, BarterCast)
- BAR (Byzantine/Altruistic/Rational) game-theoretic frame
- Sybil resistance without proof-of-work (SybilGuard, Whanau)
- Storage proofs (Filecoin, Storj)
- Holochain validator selection / DHT responsibility
- IPFS pinning economics

Notes the unique advantage Willow has — the existing permission/invite
system gives us a social trust graph for free, which generic P2P
systems had to bootstrap.

Master-spec section deferred until the next session can read the
relevant prior art and pick a model (likely hybrid).

Light audit against the conversation thread surfaced two referenced
ideas missing from the master spec. Both added at the appropriate
section, kept brief.

- Custom-pixel UI escape hatch: surfaces like whiteboard, code editor,
  3D voice room, network-graph visualizer aren't in the ui:* contract.
  On web they're sandboxed iframes the default UI app embeds, with
  postMessage as a kernel-mediated capability. Bevy slots in here as
  a far-future GPU surface plugin once its web tooling matures
  (~2027-2028), not as a default-UI replacement. TUI/MCP hosts render
  unavailable rather than fall back.
- Lazy loading: components are loaded on first use and hash-cached. A
  user in five apps doesn't instantiate all five interaction components
  at startup. State components materialize on subscribe so events can
  be applied; other profiles load on demand. Worker-computed snapshots
  cover the warm-up window.
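
A minimal Rust sketch of the lazy-loading behavior described above, assuming a hypothetical in-memory `ComponentCache` keyed by content hash. Fetching bytes from the blob store and real WASM instantiation are out of scope; `Instance` is a placeholder and the `instantiations` counter exists only to make the caching observable.

```rust
use std::collections::HashMap;

/// Content hash of a component bundle (placeholder for a real blob hash).
pub type ComponentHash = [u8; 32];

/// Placeholder for an instantiated component.
pub struct Instance {
    pub hash: ComponentHash,
}

pub struct ComponentCache {
    instances: HashMap<ComponentHash, Instance>,
    /// Counts how many times instantiation actually ran.
    pub instantiations: usize,
}

impl ComponentCache {
    pub fn new() -> Self {
        ComponentCache { instances: HashMap::new(), instantiations: 0 }
    }

    /// Instantiate on first use; later calls with the same hash hit the cache.
    pub fn get_or_instantiate(&mut self, hash: ComponentHash) -> &Instance {
        if !self.instances.contains_key(&hash) {
            self.instantiations += 1;
            self.instances.insert(hash, Instance { hash });
        }
        &self.instances[&hash]
    }

    /// Number of components currently loaded.
    pub fn loaded(&self) -> usize {
        self.instances.len()
    }
}
```

Nothing is instantiated until `get_or_instantiate` is first called for a hash, so a user in five apps who opens one of them pays for one interaction component, not five.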