[DX-3383] Optimize Docker Caching#1487

Merged
erikburt merged 23 commits into main from moreDockerCache
Mar 30, 2026

@kalverra kalverra commented Mar 25, 2026

/chainlink PR: smartcontractkit/chainlink#21705

Helps reduce docker build times from ~5m to ~2m30s

@kalverra kalverra marked this pull request as ready for review March 26, 2026 16:38
@kalverra kalverra requested a review from a team as a code owner March 26, 2026 16:38
@kalverra kalverra requested a review from chainchad March 26, 2026 16:39
chainchad previously approved these changes Mar 26, 2026
@kalverra kalverra marked this pull request as draft March 26, 2026 18:33
@kalverra kalverra marked this pull request as ready for review March 26, 2026 19:13
@kalverra kalverra requested a review from chainchad March 26, 2026 19:13
chainchad previously approved these changes Mar 26, 2026

@erikburt erikburt left a comment

These changes look good, but they hinge on the other PR which I left comments on. So just going to leave a comment for now.

@kalverra kalverra requested review from chainchad and erikburt March 27, 2026 16:01
chainchad previously approved these changes Mar 27, 2026
@kalverra kalverra enabled auto-merge (squash) March 27, 2026 17:45
@kalverra kalverra disabled auto-merge March 27, 2026 17:45
@kalverra kalverra requested a review from chainchad March 30, 2026 15:52

@erikburt erikburt left a comment

Some small concerns

Comment on lines +181 to +199
- name: Compute remote plugin cache key
  id: plugin-cache
  shell: bash
  run: |
    HASH=$(cat \
      plugins/plugins.public.yaml \
      plugins/plugins.private.yaml \
      plugins/plugins.testing.yaml \
      plugins/scripts/* \
      | sha256sum | cut -d' ' -f1)
    echo "key=remote-plugins-${HASH}" >> "$GITHUB_OUTPUT"
    mkdir -p .plugin-cache

- name: Restore cached remote plugin binaries
  id: plugin-cache-restore
  uses: actions/cache/restore@v5
  with:
    key: ${{ steps.plugin-cache.outputs.key }}
    path: .plugin-cache/

Contributor

This produces better results than the native docker layer cache?

Because we build a normal image and a plugins image in parallel, they will both try and write to the cache, and only 1 will actually succeed.

Contributor Author

> This produces better results than the native docker layer cache?

Docker layer caching can't help here because build-remote-plugins inherits from deps-base (which includes go.mod). Any dependency bump invalidates the parent layer, cascading to a full plugin rebuild (~160s), even when the plugin manifests are unchanged. The actions/cache key is based solely on plugin manifests + scripts, so it only invalidates when plugins actually change.
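
To illustrate the point about key scope, here is a minimal sketch that runs the same hashing pipeline against a throwaway directory. The `compute_key` helper name and all file contents are hypothetical placeholders; only the manifest paths and the `remote-plugins-` prefix come from the step under review.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: hash only the plugin manifests and scripts,
# using the same pipeline as the step under review.
compute_key() {
  local root="$1" hash
  hash=$(cd "$root" && cat \
    plugins/plugins.public.yaml \
    plugins/plugins.private.yaml \
    plugins/plugins.testing.yaml \
    plugins/scripts/* \
    | sha256sum | cut -d' ' -f1)
  echo "remote-plugins-${hash}"
}

# Throwaway layout with placeholder contents (assumed, not real data).
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/plugins/scripts"
echo "plugins: {}" > "$demo/plugins/plugins.public.yaml"
echo "plugins: {}" > "$demo/plugins/plugins.private.yaml"
echo "plugins: {}" > "$demo/plugins/plugins.testing.yaml"
echo "#!/bin/sh" > "$demo/plugins/scripts/install.sh"
echo "module example" > "$demo/go.mod"

key_before=$(compute_key "$demo")

# A dependency bump touches go.mod only, so the key stays the same.
echo "require example.com/dep v1.2.3" >> "$demo/go.mod"
key_after_gomod=$(compute_key "$demo")

# A manifest edit changes the key.
echo "# new plugin entry" >> "$demo/plugins/plugins.public.yaml"
key_after_manifest=$(compute_key "$demo")

echo "before:         $key_before"
echo "after go.mod:   $key_after_gomod"
echo "after manifest: $key_after_manifest"
```

The first two keys match and the third differs, which is the behavior the Docker layer cache cannot provide while `build-remote-plugins` sits on top of `deps-base`.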

Contributor Author

> they will both try and write to the cache, and only 1 will actually succeed.

Right, we end up wasting a little time and money on one of the runners each time there's a cache miss, and I think that's maybe the best solution.

I'm changing the PR to include a save-remote-plugin-cache input so that only one runner saves the plugin cache, but the solution seems messier than the problem it solves to me. Not sure on your take.
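
The `save-remote-plugin-cache` idea amounts to a gate in front of the save step. A minimal sketch of that gate as a Bash predicate, assuming the input arrives as the string `true` or `false` and that the restore step's `cache-hit` output is `true` only on an exact key match. The function name `should_save_plugin_cache` is hypothetical, and the action's real wiring may differ.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical predicate: succeed only when this runner was asked to save
# and the restore step did not already find an exact match for the key.
#   $1: value of the save-remote-plugin-cache input ("true" or "false")
#   $2: cache-hit output of the restore step ("true", "false", or empty)
should_save_plugin_cache() {
  local save_input="$1" cache_hit="$2"
  [ "$save_input" = "true" ] && [ "$cache_hit" != "true" ]
}

# Example: the designated runner on a cache miss.
if should_save_plugin_cache "true" "false"; then
  echo "this runner saves .plugin-cache/"
fi

# Example: the other parallel runner skips the save and avoids the write race.
if ! should_save_plugin_cache "false" "false"; then
  echo "this runner skips the save"
fi
```

With one designated saver, the parallel build no longer attempts a save that is bound to lose, at the cost of one more input to keep track of.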

Contributor

> Docker layer caching can't help here because build-remote-plugins inherits from deps-base (which includes go.mod). Any dependency bump invalidates the parent layer, cascading to a full plugin rebuild (~160s), even when the plugin manifests are unchanged. The actions/cache key is based solely on plugin manifests + scripts, so it only invalidates when plugins actually change.

That makes sense. I would like to point out that this sacrifices correctness for speed, because build-remote-plugins technically depends on more than just the files that feed the hash key.

Other dependencies:

  • GNUMakefile
  • The version of loopinstall (technically in the go.sum)
  • The docker build args (ARG CL_INSTALL_PRIVATE_PLUGINS=true, ARG CL_INSTALL_TESTING_PLUGINS=false)
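
One way to close that gap would be to fold the extra inputs into the key material. The sketch below is an assumption-laden illustration, not the action's code: the `compute_strict_key` name, the `grep 'loopinstall'` pattern for picking lines out of `go.sum`, and all file contents are hypothetical, and the makefile name is spelled as in the list above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: hash the manifests and scripts plus the extra inputs.
#   $1: repo root
#   $2: value of CL_INSTALL_PRIVATE_PLUGINS
#   $3: value of CL_INSTALL_TESTING_PLUGINS
compute_strict_key() {
  local root="$1" private="$2" testing="$3" hash
  hash=$(
    cd "$root" || exit 1
    {
      cat plugins/plugins.public.yaml \
          plugins/plugins.private.yaml \
          plugins/plugins.testing.yaml \
          plugins/scripts/* \
          GNUMakefile
      # Only the go.sum lines that mention loopinstall (pattern is an assumption).
      grep 'loopinstall' go.sum || true
      # Build args become part of the key material.
      printf 'CL_INSTALL_PRIVATE_PLUGINS=%s\n' "$private"
      printf 'CL_INSTALL_TESTING_PLUGINS=%s\n' "$testing"
    } | sha256sum | cut -d' ' -f1
  )
  echo "remote-plugins-${hash}"
}

# Throwaway layout with placeholder contents (assumed, not real data).
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/plugins/scripts"
for name in public private testing; do
  echo "plugins: {}" > "$demo/plugins/plugins.$name.yaml"
done
echo "#!/bin/sh" > "$demo/plugins/scripts/install.sh"
echo "all:" > "$demo/GNUMakefile"
echo "example.com/loopinstall v0.0.1 h1:placeholder=" > "$demo/go.sum"

key_base=$(compute_strict_key "$demo" true false)

# Flipping a build arg changes the key.
key_flag=$(compute_strict_key "$demo" true true)

# An unrelated go.sum entry leaves the key alone.
echo "example.com/other v1.0.0 h1:placeholder=" >> "$demo/go.sum"
key_unrelated=$(compute_strict_key "$demo" true false)

# A loopinstall version bump changes the key.
echo "example.com/loopinstall v0.0.2 h1:placeholder=" >> "$demo/go.sum"
key_bump=$(compute_strict_key "$demo" true false)
```

The trade-off is a longer input list inside a shared action, so the key still drifts whenever the build gains an input that nobody adds here.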

@erikburt erikburt Mar 30, 2026

This also makes this action tied to the contents of /chainlink even though it is used in more repositories: https://github.com/search?q=org%3Asmartcontractkit+%22ctf-build-image%22+%28NOT+repo%3Asmartcontractkit%2Fchainlink%29+%28NOT+repo%3Asmartcontractkit%2F.github%29&type=code

Edit: I guess this isn't a big deal, because other usages still check out /chainlink beforehand. This isn't used to build non-CL images. The drift is still an issue though.

Contributor

Even if we were to bump this to v2, it still leaves the action open to drift, i.e. if a new plugins file is added, someone will have to remember to add that path to this action.
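
One possible mitigation for the new-manifest case is to glob the manifests instead of listing them. This is a sketch under assumptions: it presumes every manifest keeps the `plugins/plugins.*.yaml` naming pattern, and the `compute_glob_key` helper and the `plugins.experimental.yaml` file are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: hash every manifest matching the naming pattern,
# so a newly added manifest is picked up without editing the action.
compute_glob_key() {
  local root="$1" hash
  hash=$(
    cd "$root" || exit 1
    # Pin the locale so glob expansion order is stable across runners.
    LC_ALL=C
    cat plugins/plugins.*.yaml plugins/scripts/* | sha256sum | cut -d' ' -f1
  )
  echo "remote-plugins-${hash}"
}

# Throwaway layout with placeholder contents (assumed, not real data).
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/plugins/scripts"
for name in public private testing; do
  echo "plugins: {}" > "$demo/plugins/plugins.$name.yaml"
done
echo "#!/bin/sh" > "$demo/plugins/scripts/install.sh"

key_three=$(compute_glob_key "$demo")

# A new manifest changes the key with no change to the hashing step.
echo "plugins: {}" > "$demo/plugins/plugins.experimental.yaml"
key_four=$(compute_glob_key "$demo")

echo "three manifests: $key_three"
echo "four manifests:  $key_four"
```

This covers new manifest files only. Inputs outside `plugins/` would still need to be added by hand.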

@kalverra kalverra Mar 30, 2026

This is a good point. I wonder how much the hit on performance actually is.

Using just docker cache: 1f735a1

Testing in /chainlink:

@kalverra kalverra requested a review from erikburt March 30, 2026 19:31
- docker-save-cache:
-   ${{ github.event_name == 'schedule' || github.event_name == 'push' }}
+ docker-save-cache: ${{ github.event_name == 'schedule' ||
+   github.event_name == 'push' || github.event_name == 'pull_request' }} # TODO: Remove pull_request after testing

Contributor

Can remove in follow-up

@erikburt erikburt enabled auto-merge (squash) March 30, 2026 21:22
@erikburt erikburt merged commit e853abf into main Mar 30, 2026
18 checks passed
@erikburt erikburt deleted the moreDockerCache branch March 30, 2026 21:23