docs(v0.4): add Phase 6 implementation blueprint #189
Merged
Phases 5a/5b/5c (memfd + MAP_SHARED) are done. The next chunk is
mode="live" — actually exercising WpBranch from branch_sandbox. That
work splits into 4 PRs (6.1-6.4) which share several non-obvious
decisions worth pinning down before any of them starts:
- FC needs a SnapshotType::VmstateOnly variant. Already vendoring
FC for MAP_SHARED; cheaper than the tee approach and unblocks
every downstream PR.
- memory.bin written by --live is byte-identical to --full — same
dense layout, no sparse holes. Restore path is untouched.
(USER-API doc's open Q5 will be flipped from "not compatible"
to "compatible" by PR 6.4.)
- Two pieces of plumbing precede the live path itself:
Vm::memfd_handle() getter + Vm::snapshot_vmstate_only wrapper.
- In-flight branch tracking for wait:false stays in-memory only
for v0.4 (daemon restart marks them failed; persistence is v0.5+).
The doc also captures the open questions worth re-evaluating during
implementation (concurrent --live on same parent, RAII pause guard,
sync-vs-async copy thread, live-of-live chaining).
Companion to DESIGN-v0.4.md, DESIGN-v0.4-USER-API.md, and the
RESOLVED Phase 3 spike.
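
The in-memory tracking decision for wait:false could look roughly like the sketch below. All type and method names (`InFlightBranches`, `BranchStatus`, `mark_orphans_failed`) are hypothetical, and how the daemon detects an incomplete branch on startup is an assumption that the blueprint may settle differently.

```rust
use std::collections::HashMap;

/// Lifecycle of a branch started with `wait: false` (illustrative names).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BranchStatus {
    InFlight,
    Ready,
    Failed(String),
}

/// In-memory only: the map is lost when the daemon exits, which is the
/// trade-off accepted for v0.4.
#[derive(Default)]
pub struct InFlightBranches {
    by_tag: HashMap<String, BranchStatus>,
}

impl InFlightBranches {
    /// Record that a background branch for `tag` has started.
    pub fn start(&mut self, tag: &str) {
        self.by_tag.insert(tag.to_string(), BranchStatus::InFlight);
    }

    /// Record the outcome once the background copy finishes.
    pub fn finish(&mut self, tag: &str, result: Result<(), String>) {
        let status = match result {
            Ok(()) => BranchStatus::Ready,
            Err(e) => BranchStatus::Failed(e),
        };
        self.by_tag.insert(tag.to_string(), status);
    }

    pub fn status(&self, tag: &str) -> Option<&BranchStatus> {
        self.by_tag.get(tag)
    }

    /// Startup hook. `incomplete_tags` are tags the daemon found in an
    /// unfinished state; the detection mechanism is assumed, not specified.
    pub fn mark_orphans_failed<I: IntoIterator<Item = String>>(&mut self, incomplete_tags: I) {
        for tag in incomplete_tags {
            self.by_tag.insert(
                tag,
                BranchStatus::Failed("daemon restarted while branch was in flight".to_string()),
            );
        }
    }
}
```

A status endpoint would then only need to read `status(tag)`; persistence (v0.5+) would replace the `HashMap` without changing that read path.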
UFFDIO_REGISTER is per-process. The controller can't WP-arm FC's guest memory from outside; the registration has to happen inside FC. Pattern: FC creates the uffd and issues UFFDIO_REGISTER (WP), sends the fd to the controller via SCM_RIGHTS, and the controller arms WP and handles events.

This adds a Phase 6.1.5 PR (FC patch for a POST /uffd/wp endpoint) in front of the original 6.2. The total estimate moves from ~2 weeks to ~2.5 weeks. 6.3 and 6.4 are unchanged in shape; 6.3's WpBranch::begin will take an externally-registered uffd instead of creating one.

Discovered while reading wp_snapshot.rs's begin() signature against KVM's real architecture (KVM runs in FC's process, not the controller's). The Phase 2 PoC works because it had KVM and the handler in the same process; that simplification doesn't survive contact with forkd's controller/FC split.
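
The division of labour described above can be written down as a small checklist in code. This is only a model of the ordering and of which process performs each step; it makes no real userfaultfd or SCM_RIGHTS calls, and every name in it (`Proc`, `Step`, `HANDOFF`, `check_handoff`) is hypothetical.

```rust
/// Which process performs a step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Proc {
    Firecracker,
    Controller,
}

/// The steps of the write-protect handoff, in the order the text gives them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    CreateUffd,
    RegisterWp,      // UFFDIO_REGISTER in write-protect mode
    SendFdScmRights, // fd passed over a Unix socket
    ArmWp,
    HandleEvents,
}

/// Expected sequence: creation and registration stay inside FC, because the
/// registration applies to the address space of the process that makes it.
pub const HANDOFF: [(Proc, Step); 5] = [
    (Proc::Firecracker, Step::CreateUffd),
    (Proc::Firecracker, Step::RegisterWp),
    (Proc::Firecracker, Step::SendFdScmRights),
    (Proc::Controller, Step::ArmWp),
    (Proc::Controller, Step::HandleEvents),
];

/// Compare an observed trace against the expected handoff, step by step.
pub fn check_handoff(trace: &[(Proc, Step)]) -> Result<(), String> {
    if trace.len() != HANDOFF.len() {
        return Err(format!("expected {} steps, got {}", HANDOFF.len(), trace.len()));
    }
    for (i, (got, want)) in trace.iter().zip(HANDOFF.iter()).enumerate() {
        if got != want {
            return Err(format!("step {}: expected {:?}, got {:?}", i, want, got));
        }
    }
    Ok(())
}
```

A trace in which the controller performs `RegisterWp` itself, which is the pre-correction shape, is rejected at step 1.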
All discovered during self-review post-scope-correction. None are
runtime-impacting; the goal is to make the blueprint actually
match what the next four PRs will look like.
Highlights:
1. Integration sketch (line 90) was still in the pre-correction
shape — calling WpBranch::begin(memfd, ...) which would arm WP
in the controller's process and miss every guest write. Rewrote
to show request_wp_uffd + begin_with_external_uffd, the
constructor the correction implies, and added a PauseGuard
bracket. This was the most consequential bug — anyone reading
the sketch to implement 6.3 would have shipped broken code.
2. The main `## PR breakdown` table only listed 4 PRs while the
scope-correction inset said 5. Added a 6.1.5 row and rewrote
6.2's Done-when to capture the SCM_RIGHTS receiver scope that
the correction added.
3. Phase 6.1's patch shape claimed "2-3 files including
api_server/request/snapshot.rs in Optional form" — what
actually shipped touched 5 files (rpc_interface.rs +
api_server/mod.rs + vstate/vm.rs needed exhaustive-match
branches) and kept mem_file_path as PathBuf. Corrected so
future rebases follow what the patch actually does.
4. The ~50 LOC estimate for 6.1.5 ignored the seccomp filter
update and the UFFDIO_REGISTER EINVAL we hit on first contact.
Rewrote to ~100 LOC + seccomp work + a half-day diagnosis
budget; documented the specific failure modes future
implementers will hit.
5. The "USER-API open Q #5" reference was pointing at the wrong
section. Q5 there is about Python SDK wait=False semantics;
the backward-compatibility claim lives in §Backward
compatibility. Fixed both the in-text reference and the 6.4
Done-when so the engineer edits the right section.
6. Open question #2 was framed against the old "begin after
pause" flow. Rewrote it to target the actual remaining
vulnerable region — snapshot_vmstate_only between pause and
resume.
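
The PauseGuard bracket from highlight 1, covering the pause-to-resume region named in highlight 6, might be shaped like the RAII sketch below. The `VmControl` trait, `with_paused` helper, and `FakeVm` test double are hypothetical stand-ins; the real `Vm` API and error type are not shown in this PR.

```rust
/// Minimal control surface the guard needs (hypothetical trait).
pub trait VmControl {
    fn pause(&mut self) -> Result<(), String>;
    fn resume(&mut self) -> Result<(), String>;
}

/// Pauses the VM on construction and resumes it when dropped.
pub struct PauseGuard<'a, V: VmControl> {
    vm: &'a mut V,
}

impl<'a, V: VmControl> PauseGuard<'a, V> {
    pub fn new(vm: &'a mut V) -> Result<Self, String> {
        vm.pause()?;
        Ok(Self { vm })
    }

    /// Access the paused VM for the work done inside the bracket.
    pub fn vm(&mut self) -> &mut V {
        self.vm
    }
}

impl<V: VmControl> Drop for PauseGuard<'_, V> {
    fn drop(&mut self) {
        // Drop cannot return an error; a real implementation would log it.
        let _ = self.vm.resume();
    }
}

/// Run `f` (for example the vmstate-only snapshot) with the VM paused.
/// The guard is dropped when this function returns, whether `f` succeeded
/// or failed, and also if `f` panics and the stack unwinds.
pub fn with_paused<V: VmControl, T>(
    vm: &mut V,
    f: impl FnOnce(&mut V) -> Result<T, String>,
) -> Result<T, String> {
    let mut guard = PauseGuard::new(vm)?;
    let out = f(guard.vm());
    out
}

/// Test double that only tracks the paused flag.
#[derive(Default)]
pub struct FakeVm {
    pub paused: bool,
}

impl VmControl for FakeVm {
    fn pause(&mut self) -> Result<(), String> {
        self.paused = true;
        Ok(())
    }
    fn resume(&mut self) -> Result<(), String> {
        self.paused = false;
        Ok(())
    }
}
```

The point of the guard is that a failed snapshot cannot leave the guest paused; whether a resume failure should be surfaced some other way is left open here.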
Self-review used the same xhigh-effort code-review pass that
caught the three stale-reference bugs in #188.
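
Highlight 3's point about exhaustive-match branches is easy to see in miniature. The types below are simplified stand-ins, not the vendored FC definitions, and the assumption that `VmstateOnly` skips writing the memory file while `mem_file_path` stays a plain `PathBuf` is an illustration of the described shape rather than a statement of what the patch does.

```rust
use std::path::{Path, PathBuf};

/// Simplified stand-in for the snapshot kind, with the new variant added.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SnapshotType {
    Diff,
    Full,
    VmstateOnly,
}

/// Simplified stand-in for the request parameters. `mem_file_path` stays a
/// non-optional `PathBuf`, matching the shape highlight 3 describes.
pub struct CreateSnapshotParams {
    pub snapshot_type: SnapshotType,
    pub snapshot_path: PathBuf,
    pub mem_file_path: PathBuf,
}

/// No wildcard arm: adding a variant makes every match like this one fail to
/// compile until it is handled, which is why more files needed touching.
pub fn writes_memory_file(kind: SnapshotType) -> bool {
    match kind {
        SnapshotType::Diff | SnapshotType::Full => true,
        // Assumption: the vmstate-only path leaves memory to the caller.
        SnapshotType::VmstateOnly => false,
    }
}

/// Files this request would produce under the assumption above.
pub fn files_written(params: &CreateSnapshotParams) -> Vec<&Path> {
    let mut out = vec![params.snapshot_path.as_path()];
    if writes_memory_file(params.snapshot_type) {
        out.push(params.mem_file_path.as_path());
    }
    out
}
```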
Summary
Phases 5a/5b/5c (memfd + MAP_SHARED) are done as of 2026-05-29. The next chunk, Phase 6 (`mode="live"`), is ~2 weeks of work split into 4 incremental PRs (6.1–6.4). This doc pins down the non-obvious decisions before any of them starts so each PR can be reviewed in isolation.

Key decisions captured:
- `SnapshotType::VmstateOnly` variant. Cheaper than the tee approach (which would double per-BRANCH disk I/O and defeat the whole point of v0.4); same vendor strategy as MAP_SHARED. Adds one more commit to `forkd-v0.4-mem-backend-shared-v1.12`.
- `memory.bin` written by `--live` is byte-identical to `--full`. Restore path is untouched; `format_version` doesn't bump. (Updates `DESIGN-v0.4-USER-API.md` open Q5 from "not compatible" to "compatible"; that PR is 6.4.)
- `Vm::snapshot_vmstate_only` wrapper.
- `Vm::memfd_handle()` getter + region geometry.
- `mode="live"` in `branch_sandbox`.
- `wait: false` + in-flight branch tracking + status field on `GET /v1/images/<tag>`.

Companion to `DESIGN-v0.4.md`, `DESIGN-v0.4-USER-API.md`, and the now-RESOLVED Phase 3 spike.

Test plan
🤖 Generated with Claude Code