Decompress function runtime tarballs once when loading #127
PR crossplane#24 added support for gzipped function runtime image tarballs by streaming each one through gzip.NewReader directly into go-containerregistry's tarball.Image, writing no temporary files. go-containerregistry calls the tarball.Opener it's given once per layer, plus once each for the manifest and config. Because the gzip opener re-opened and re-decompressed the whole file from the start on every call, loading a single image decompressed it once per layer.
Nix's dockerTools emits one layer per store path, so a typical function image has ~50 layers and was fully gunzipped ~54 times. Computing the image digest then re-reads every layer again. With functions built concurrently, a project with a dozen multi-arch functions spent over ten minutes pegging every core in this loop.
This change decompresses each gzipped tarball once into a temporary file and serves every opener call from that plain tar, turning ~54 full decompressions per image into one.
The temporary files back the returned images lazily, so they must outlive Build; the builder now creates them under a per-build temporary directory and exposes a Close method that removes the directory once the caller has finished consuming the images. NewBuilder returns the concrete *realBuilder so callers can defer Close, and the build, run, and render entry points do so after they have written, sideloaded, or loaded the images.
On a project with twelve functions built for amd64 and arm64, loading all twenty-four images drops from over ten minutes to roughly eighty seconds.
Signed-off-by: Nic Cope <nicc@rk0n.org>
🧹 Nitpick comments (1)
internal/project/build_test.go (1)
605-670: ⚡ Quick win
Please keep the new gzip runtime test table-driven for consistency.
Could we reshape TestLoadRuntimeImageGzip into an args/want table (even with one initial case) and add a brief reason field, to match the repository's test conventions and keep future case expansion straightforward? Thanks for the solid coverage here.
As per coding guidelines, **/*_test.go: "Enforce table-driven test structure: ... args/want pattern ... proper test case naming and reason fields."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/project/build_test.go` around lines 605 - 670, refactor TestLoadRuntimeImageGzip into a table-driven test structure to match repository conventions. Create a slice of test case structs with fields for args (containing the tarball type and architecture), want (the expected outcome), and reason (explaining what the test validates). Then wrap the existing test logic in a loop that iterates through the test cases, extracting the appropriate values from each case's args and want fields. Even though there is currently only one test case, this table-driven structure will provide consistency with the codebase's testing patterns and make it straightforward to add additional test cases in the future.
Source: Coding guidelines
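For illustration, the args/want/reason layout requested in this comment could look roughly like the sketch below. Everything in it is hypothetical: detectFormat is an invented function under test, not code from this PR, and the case names are placeholders. It is written as a plain program so that it runs standalone; in a real _test.go file the loop body would sit inside t.Run(name, ...) and report mismatches with t.Errorf.

```go
package main

import "fmt"

// detectFormat is a hypothetical function under test. It reports "gzip" when
// the header starts with the gzip magic bytes 0x1f 0x8b, and "tar" otherwise.
func detectFormat(header []byte) string {
	if len(header) >= 2 && header[0] == 0x1f && header[1] == 0x8b {
		return "gzip"
	}
	return "tar"
}

// args holds the inputs for one case.
type args struct {
	header []byte
}

// want holds the expected outputs for one case.
type want struct {
	format string
}

// cases is keyed by case name. Each case carries a reason that explains what
// it validates, so a failure message is self-describing.
var cases = map[string]struct {
	reason string
	args   args
	want   want
}{
	"GzipHeader": {
		reason: "A header starting with the gzip magic bytes should be detected as gzip.",
		args:   args{header: []byte{0x1f, 0x8b, 0x08}},
		want:   want{format: "gzip"},
	},
	"PlainHeader": {
		reason: "Any other header should be treated as a plain tar.",
		args:   args{header: []byte("ustar")},
		want:   want{format: "tar"},
	},
}

// runCases evaluates every case and returns one message per failure.
func runCases() []string {
	var failures []string
	for name, tc := range cases {
		if got := detectFormat(tc.args.header); got != tc.want.format {
			failures = append(failures,
				fmt.Sprintf("%s: %s: want %q, got %q", name, tc.reason, tc.want.format, got))
		}
	}
	return failures
}

func main() {
	fmt.Println("failures:", len(runCases()))
}
```

Starting with a single entry in the table costs little, and a later case only needs a new map entry rather than a new test function.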
📒 Files selected for processing (6)
- cmd/crossplane/project/build.go
- cmd/crossplane/project/run.go
- cmd/crossplane/render/op/cmd.go
- cmd/crossplane/render/xr/cmd.go
- internal/project/build.go
- internal/project/build_test.go
adamwg left a comment
One comment inline, but this lgtm overall.
Co-authored-by: Adam Wolfe Gordon <awg+github@xvx.ca>
Signed-off-by: Nic Cope <nicc@rk0n.org>
Description of your changes
#24 added support for gzipped function runtime image tarballs by streaming each one through gzip.NewReader directly into go-containerregistry's tarball.Image. go-containerregistry calls the tarball.Opener it's given once per layer, plus once each for the manifest and config. Because the gzip opener re-opened and re-decompressed the whole file from the start on every call, loading a single image decompressed it once per layer.
My project uses Nix's dockerTools, which emits one layer per store path, so a typical function image has ~50 layers and was fully gunzipped ~54 times. Computing the image digest then re-reads every layer again. With functions built concurrently, a project with a dozen multi-arch functions spent over ten minutes pegging every core in this loop.
This PR decompresses each gzipped tarball once into a temporary file and serves every opener call from that plain tar. The temporary files back the returned images lazily, so they must outlive Build; the builder now creates them under a per-build temporary directory and exposes a Close method that removes the directory once the caller has finished consuming the images. NewBuilder returns the concrete *realBuilder so callers can defer Close, and the build, run, and render entry points do so after they have written, sideloaded, or loaded the images.
On a project with twelve functions built for amd64 and arm64, loading all twenty-four images drops from over ten minutes to roughly eighty seconds.
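As a rough illustration of the decompress-once idea, the sketch below gunzips a file into a temporary directory a single time and returns an opener that reopens the resulting plain file on every call. It is not the PR's code: decompressOnce and demo are hypothetical names, the local Opener type only mirrors the func() (io.ReadCloser, error) shape of go-containerregistry's tarball.Opener so that the example needs nothing beyond the standard library, and the payload is a placeholder string rather than a real image tarball.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// Opener has the same shape as go-containerregistry's tarball.Opener: every
// call must return a fresh reader positioned at the start of the tar.
type Opener func() (io.ReadCloser, error)

// decompressOnce (hypothetical name) gunzips src into a new file under dir
// exactly once and returns an Opener that reopens that plain file.
func decompressOnce(dir, src string) (Opener, error) {
	in, err := os.Open(src)
	if err != nil {
		return nil, err
	}
	defer in.Close()

	zr, err := gzip.NewReader(in)
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	out, err := os.CreateTemp(dir, "runtime-*.tar")
	if err != nil {
		return nil, err
	}
	if _, err := io.Copy(out, zr); err != nil {
		out.Close()
		return nil, err
	}
	if err := out.Close(); err != nil {
		return nil, err
	}
	path := out.Name()
	return func() (io.ReadCloser, error) { return os.Open(path) }, nil
}

// demo gzips a small payload, decompresses it once, then opens the result n
// times, standing in for the per-layer, manifest, and config opener calls.
func demo(n int) (string, error) {
	dir, err := os.MkdirTemp("", "function-build-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir) // runs only after every read below has finished

	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte("layer data")); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil {
		return "", err
	}
	src := filepath.Join(dir, "image.tar.gz")
	if err := os.WriteFile(src, buf.Bytes(), 0o600); err != nil {
		return "", err
	}

	open, err := decompressOnce(dir, src)
	if err != nil {
		return "", err
	}
	var last string
	for i := 0; i < n; i++ {
		rc, err := open()
		if err != nil {
			return "", err
		}
		data, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return "", err
		}
		last = string(data)
	}
	return last, nil
}

func main() {
	fmt.Println(demo(54))
}
```

In this sketch the gzip stream is read once, inside decompressOnce, and each later open() is an ordinary file open, so the cost of additional opener calls no longer scales with decompression work.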
This follows on from #21 and #24, which introduced pre-built and gzipped function runtime tarball support respectively.
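The lifecycle constraint described in this PR (temporary files must outlive Build and are removed by Close) can be sketched as follows. This is an assumption-laden illustration, not the real *realBuilder: the builder type, its fields, the point at which the temporary directory is created, and the placeholder file contents are all invented for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// builder is a hypothetical stand-in for the PR's *realBuilder.
type builder struct {
	tmpDir string // set by Build, removed by Close
}

// Build writes a placeholder file under a per-build temporary directory and
// returns its path. The caller reads the file only after Build has returned.
func (b *builder) Build() (string, error) {
	dir, err := os.MkdirTemp("", "function-build-*")
	if err != nil {
		return "", err
	}
	b.tmpDir = dir
	path := filepath.Join(dir, "runtime.tar")
	if err := os.WriteFile(path, []byte("plain tar bytes"), 0o600); err != nil {
		return "", err
	}
	return path, nil
}

// Close removes the temporary directory. It is a no-op before Build.
func (b *builder) Close() error {
	if b.tmpDir == "" {
		return nil
	}
	return os.RemoveAll(b.tmpDir)
}

// load builds and then reads the file, deferring Close so the temporary
// directory survives until the read has finished.
func load() (data, path string, err error) {
	b := &builder{}
	defer b.Close() // best-effort cleanup; the error is ignored in this sketch
	path, err = b.Build()
	if err != nil {
		return "", "", err
	}
	raw, err := os.ReadFile(path)
	if err != nil {
		return "", "", err
	}
	return string(raw), path, nil
}

// demo returns the data read and whether the file is gone after load returns.
func demo() (string, bool, error) {
	data, path, err := load()
	if err != nil {
		return "", false, err
	}
	_, statErr := os.Stat(path)
	return data, os.IsNotExist(statErr), nil
}

func main() {
	fmt.Println(demo())
}
```

Deferring Close in the caller, rather than cleaning up inside Build, is what lets the lazily read file still exist when it is consumed.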
Fixes #
I have:
- Run ./nix.sh flake check to ensure this PR is ready for review.
- Linked a PR or a docs tracking issue to document this change.
- Added backport release-x.y labels to auto-backport this PR.