emulator: replace docker save/nuke/reload with in-place prune #1329
Closed
BilalG1 wants to merge 3 commits into
After flattening, reclaim intermediate layers with `docker rmi` + `docker image prune -af` rather than round-tripping the final image through a tar and wiping /var/lib/docker. The round-trip cost ~15 min under same-arch TCG on the arm64 runner because every byte of the image is read, written to tar, then read and written back. The in-place approach relies on the drive's `discard=on,detect-zeroes=unmap` + fstrim to return freed clusters to the qcow2, which also makes the zero-fill `dd` unnecessary.
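A minimal sketch of the discard path the description relies on. The helper name `drive_arg`, the image filename, and the `format=qcow2` option are assumptions for illustration; only `discard=on,detect-zeroes=unmap` comes from the description.

```shell
#!/bin/sh
# Hypothetical helper: build a QEMU -drive argument that passes guest
# discards through to the qcow2 so freed clusters can be deallocated.
drive_arg() {
  image="$1"   # path to the qcow2 (placeholder name)
  printf 'file=%s,format=qcow2,discard=on,detect-zeroes=unmap' "$image"
}

# Print the flag as it might appear on a qemu-system-* command line.
echo "-drive $(drive_arg emulator.qcow2)"

# Inside the guest, after data has been deleted, trimming the mounted
# filesystems is what actually issues the discards:
#   fstrim -av
```

With discards passed through, freed blocks are reported to the host directly, so there is no need to zero-fill free space before shrinking the image.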
With -a, docker image prune removes every image that isn't referenced by a running or stopped container. At this point in provisioning the flatten container has been rm'd and stack.service is only enabled (not started), so the freshly-tagged stack-local-emulator image has zero container refs and was getting nuked. The VM then booted cleanly but stack.service failed to `docker run` the image on startup, producing a green systemd log with no services reachable on their ports — the symptom we saw in the amd64 run. Drop -a so we only prune dangling (untagged) images. The explicit rmi of the fat + slim intermediates still leaves them dangling, so they still get reclaimed.
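The reclaim sequence after this fix could look roughly like the sketch below. The `run` wrapper, the `DRY_RUN` switch, the function name, and the `example-fat`/`example-slim` image names are hypothetical; the actual provisioning script may differ.

```shell
#!/bin/sh
# Sketch of the in-place reclaim. Image names are placeholders, not the PR's.
set -eu

# Print commands instead of executing them when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

reclaim_layers() {
  # Remove the intermediate images by tag; per the commit message, whatever
  # is left behind untagged is dangling and is picked up by the prune.
  run docker rmi "$@"
  # No -a: only dangling (untagged) images are pruned, so a tagged image
  # with no container references survives.
  run docker image prune -f
  # Hand the freed blocks back to the virtual disk.
  run fstrim -av
}

# Dry run: prints the three commands without touching Docker.
DRY_RUN=1 reclaim_layers example-fat:latest example-slim:latest
```

Running with `DRY_RUN=1` first is a cheap way to confirm that no `-a` has crept back into the prune before executing the sequence inside the VM.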
The Start/Verify/Stop emulator steps boot the freshly-built qcow2 and wait for all services — including the Next.js backend — to respond on their ports. Under same-arch TCG on ubuntu-24.04-arm there's no KVM, so the backend can't come up within any reasonable window (this is the same reason the build-time smoke test is already skipped on arm64). Today the step just burns the 53-minute EMULATOR_READY_TIMEOUT and fails. Gate those three steps to amd64. The build step still fully produces and validates the arm64 image; it just doesn't try to run it under emulation. The amd64 job continues to prove the image's service stack end-to-end, and the arm64 artifact is trusted to be equivalent since real arm64 hardware has KVM.
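The commit applies this gate to the workflow steps themselves. An equivalent check written as a shell guard might look like the sketch below, where the function name `should_boot_test` and the `ARCH` variable are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical guard: succeed only for architectures where the boot test
# is expected to finish (amd64, where the runner is assumed to have KVM).
should_boot_test() {
  case "$1" in
    amd64|x86_64) return 0 ;;
    *) return 1 ;;
  esac
}

# ARCH is an assumed override; fall back to the machine's own architecture.
arch="${ARCH:-$(uname -m)}"
if should_boot_test "$arch"; then
  echo "boot-test: run on $arch"
else
  echo "boot-test: skipped on $arch"
fi
```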