refactor(ci): split docker-publish into native-arch matrix + manifest merge #40

Merged: strausmann merged 1 commit into main from refactor/docker-publish-native-arm64-runners on May 10, 2026

@strausmann
Owner

Motivation

Current pipeline (after #34 / #37 / #38 / #39) builds linux/amd64 and linux/arm64 in one buildx invocation on an x86_64 runner. arm64 goes through QEMU emulation, which for the Python backend (Pillow + brother_ql with C extensions) costs ~3 minutes per leg. End-to-end run: ~3:30.

GitHub has offered free native arm64 runners (ubuntu-24.04-arm) for public repos since 2025-01. This PR uses them.

Architecture

Two-phase pipeline; both phases use matrix parallelism:

Phase 1 — build (4 jobs in parallel, each on a NATIVE runner)

  ┌───────────────────────────┐  ┌───────────────────────────┐
  │ backend / amd64           │  │ backend / arm64           │
  │ runs-on: ubuntu-24.04     │  │ runs-on: ubuntu-24.04-arm │
  │ buildx --platform amd64   │  │ buildx --platform arm64   │
  │ push by digest, no tag    │  │ push by digest, no tag    │
  │ upload digest as artifact │  │ upload digest as artifact │
  └───────────────────────────┘  └───────────────────────────┘
  ┌───────────────────────────┐  ┌───────────────────────────┐
  │ frontend / amd64          │  │ frontend / arm64          │
  │ … same shape …            │  │ … same shape …            │
  └───────────────────────────┘  └───────────────────────────┘

Phase 2 — merge (2 jobs in parallel)

  ┌───────────────────────────────────────────────┐
  │ merge (backend)                               │
  │ download digests-backend-amd64 + -arm64       │
  │ docker buildx imagetools create               │
  │   -t backend:0.2.0 -t backend:0.2 -t … etc    │
  │   --annotation index:org.opencontainers.…     │
  │   backend@sha256:DIGEST_AMD64                 │
  │   backend@sha256:DIGEST_ARM64                 │
  │ verify multi-arch                             │
  └───────────────────────────────────────────────┘
  ┌───────────────────────────────────────────────┐
  │ merge (frontend) … same shape …               │
  └───────────────────────────────────────────────┘

The merge phase doesn't re-build anything — it just composes a manifest list pointing at the digests the build phase produced. End-to-end run time is now dominated by the slowest single-platform build, not the sum.
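The merge step described above can be sketched as a small helper that assembles the argument list for `docker buildx imagetools create`. This is only an illustration: the function name, the placeholder digests, and the example annotation value are assumptions, not taken from the workflow file.

```python
def imagetools_create_args(image, tags, digests, index_annotations):
    """Build the argv for `docker buildx imagetools create` (illustrative)."""
    args = ["docker", "buildx", "imagetools", "create"]
    # One -t flag per tag that should point at the new manifest list.
    for tag in tags:
        args += ["-t", f"{image}:{tag}"]
    # Index-level annotations are passed through unchanged.
    for annotation in index_annotations:
        args += ["--annotation", annotation]
    # Sources are referenced by digest, so nothing is rebuilt here.
    args += [f"{image}@{digest}" for digest in digests]
    return args


if __name__ == "__main__":
    argv = imagetools_create_args(
        image="ghcr.io/strausmann/label-printer-hub-backend",
        tags=["0.2.0", "0.2"],
        # Placeholder digests, not real image digests.
        digests=["sha256:" + "a" * 64, "sha256:" + "b" * 64],
        index_annotations=["index:org.opencontainers.image.description=example"],
    )
    print(" ".join(argv))
```

Keeping the command as a list rather than a single string avoids shell-quoting problems when an annotation value contains spaces.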

What stays the same

  • Image names: ghcr.io/strausmann/label-printer-hub-{backend,frontend}
  • Tag scheme: 1.0.0, 1.0, 1, latest (stable) / full version only (pre-release)
  • Both platforms: linux/amd64 + linux/arm64
  • OCI labels on per-platform manifests (title, description, version, revision, …)
  • OCI annotations on the multi-arch index (description for GHCR's package UI; see #39, fix(ci): emit GHCR package description as index annotation)
  • Build-args VERSION / REVISION / BUILD_DATE flowing into Dockerfile ARGs → OCI labels + runtime ENV
  • Verify step still asserts both architectures are present on every published tag
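The tag scheme in the list above (stable releases get `1.0.0`, `1.0`, `1`, `latest`; pre-releases get the full version only) can be written out as a short function. This is a sketch that follows the bullet literally; the function name and the stable-version pattern are assumptions, and the workflow itself may derive tags differently (for example, how it treats a `0` major tag is not stated in this PR).

```python
import re

# A version counts as "stable" here if it is exactly MAJOR.MINOR.PATCH.
_STABLE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def tags_for_version(version):
    """Return the image tags for a release version (illustrative)."""
    match = _STABLE.match(version)
    if match is None:
        # Pre-release or anything else: publish the full version only.
        return [version]
    major, minor, _patch = match.groups()
    return [version, f"{major}.{minor}", major, "latest"]


if __name__ == "__main__":
    print(tags_for_version("1.0.0"))
    print(tags_for_version("1.1.0-rc.1"))
```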

What's new

  • Per-(service, arch) GHA cache scope (was: per-service only)
  • imagetools create step that composes the manifest list and applies tags + index annotations in one go
  • Phase-1 jobs push by digest (push-by-digest=true,name-canonical=true) instead of by tag
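As a rough illustration of the two new phase-1 settings, the helpers below compose a per-(service, arch) cache scope and the push-by-digest `outputs` string. The scope format `service-arch` and both function names are hypothetical. The image name is lowercased because registry repository names must be lowercase.

```python
def cache_scope(service, arch):
    """Cache scope keyed by service and architecture (hypothetical format)."""
    return f"{service}-{arch}"


def build_outputs(registry, repository, service):
    """Compose a buildx `outputs` value that pushes by digest, without a tag."""
    name = f"{registry}/{repository}-{service}".lower()
    return (
        f"type=image,name={name},"
        "push-by-digest=true,name-canonical=true,push=true"
    )


if __name__ == "__main__":
    print(cache_scope("backend", "arm64"))
    print(build_outputs("ghcr.io", "strausmann/label-printer-hub", "backend"))
```

With one scope per (service, arch) pair, an arm64 build never overwrites the cache entries an amd64 build relies on.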

Expected speedup

| Phase | Before | After |
| --- | --- | --- |
| Backend build (amd64+arm64 QEMU) | ~3:00 | ~0:30 (amd64 native) ‖ ~0:45 (arm64 native) |
| Frontend build (amd64+arm64 QEMU) | ~0:30 | ~0:15 ‖ ~0:25 |
| Verify | ~0:05 | ~0:05 |
| Total | ~3:30 | ~1:30 |

(Numbers are best-effort — first run will repopulate caches per arch.)
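To make the "slowest build dominates, not the sum" point concrete, the sketch below computes a critical path from the "After" column: the maximum over the four parallel build legs plus the verify step. The helper names are made up for illustration. The itemized rows give a lower figure than the table's ~1:30 total; the PR does not itemize the difference.

```python
def to_seconds(mmss):
    """Convert an 'M:SS' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def critical_path(parallel_legs, serial_steps):
    """Slowest parallel leg plus every step that has to run afterwards."""
    slowest = max(to_seconds(leg) for leg in parallel_legs)
    return slowest + sum(to_seconds(step) for step in serial_steps)


if __name__ == "__main__":
    builds_after = ["0:30", "0:45", "0:15", "0:25"]  # "After" column, build rows
    verify = ["0:05"]
    print(critical_path(builds_after, verify))               # parallel legs
    print(sum(to_seconds(t) for t in builds_after + verify))  # if run serially
```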

Test plan

  • CI green on the PR (docker-publish is not triggered by PR events; it runs only on release / workflow_dispatch).
  • Merge.
  • gh workflow run docker-publish.yml -f tag=0.2.0 --ref main and time it.
  • Confirm ghcr.io/strausmann/label-printer-hub-{backend,frontend}:0.2.0 is still multi-arch (amd64 + arm64).
  • Confirm /healthz of pulled image still reports version=0.2.0, revision=….
  • Confirm GHCR package page shows description (from #39, fix(ci): emit GHCR package description as index annotation).
  • If something is off, revert this PR — old workflow is in the history.
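The multi-arch check in the test plan can be approximated offline. The sketch below assumes the input is the JSON printed by `docker buildx imagetools inspect --raw <ref>` for an image index; the function name and the sample document are illustrative and not copied from the workflow's verify step.

```python
import json

REQUIRED = {("linux", "amd64"), ("linux", "arm64")}


def missing_platforms(index_json, required=REQUIRED):
    """Return the required (os, architecture) pairs absent from an image index."""
    index = json.loads(index_json)
    present = {
        (m["platform"]["os"], m["platform"]["architecture"])
        for m in index.get("manifests", [])
        if "platform" in m
    }
    return set(required) - present


if __name__ == "__main__":
    sample = json.dumps({
        "schemaVersion": 2,
        "manifests": [
            {"platform": {"os": "linux", "architecture": "amd64"}},
            {"platform": {"os": "linux", "architecture": "arm64"}},
        ],
    })
    missing = missing_platforms(sample)
    print("ok" if not missing else f"missing: {sorted(missing)}")
```

An empty result means both architectures are present; anything else names the platform that failed to publish.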

… merge

The previous workflow built linux/amd64 + linux/arm64 in one buildx
invocation on an x86_64 runner, which forced arm64 through QEMU. For
the Python backend (Pillow, brother_ql with C extensions) that took
~3 minutes per leg; the whole publish ran ~3:30 end-to-end.

GitHub has shipped free native arm64 runners (`ubuntu-24.04-arm`) for
public repos since 2025-01. This refactor exploits them.

Phase 1 — build (4 jobs in parallel):
  matrix = service × platform → backend/amd64, backend/arm64,
  frontend/amd64, frontend/arm64. Each leg runs on a NATIVE runner,
  builds its single-platform image, and pushes it to the registry by
  digest (no tag). The digest is exported as a per-leg artifact.

Phase 2 — merge (2 jobs in parallel):
  matrix = service → backend, frontend. Each job downloads its two
  digest artifacts and calls `docker buildx imagetools create` to
  compose a multi-arch manifest list pointing at the per-platform
  digests, applying every tag (1.0.0, 1.0, 1, latest) and the
  index-level annotations in one shot. No re-build, no re-push of
  layers — just manifest assembly (~5 seconds).
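One common way to hand digests from the build legs to the merge job is to store each digest as an empty file named after its hex value, then list the directory on the other side. The sketch below shows that round trip; whether this workflow uses exactly this layout is an assumption, and both function names are hypothetical.

```python
import tempfile
from pathlib import Path


def export_digest(directory, digest):
    """Record a digest as an empty file named after its hex part."""
    directory.mkdir(parents=True, exist_ok=True)
    (directory / digest.removeprefix("sha256:")).touch()


def collect_refs(directory, image):
    """Turn the downloaded digest files back into image@sha256:... refs."""
    return [f"{image}@sha256:{path.name}" for path in sorted(directory.iterdir())]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        digests = Path(tmp) / "digests"
        # Placeholder digests standing in for the amd64 and arm64 legs.
        export_digest(digests, "sha256:" + "a" * 64)
        export_digest(digests, "sha256:" + "b" * 64)
        for ref in collect_refs(digests, "ghcr.io/strausmann/label-printer-hub-backend"):
            print(ref)
```

Because only file names carry information, merging several artifacts into one directory cannot produce conflicting file contents.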

Expected end-to-end run time: ~1:30 instead of ~3:30 (slowest single-
platform build dominates, not the sum).

Side benefits:
- Per-(service, arch) cache scope so amd64 and arm64 don't trash each
  other's caches.
- Image-index annotations are emitted explicitly via the merge step,
  filtered to the `index:` prefix so the build phase's per-platform
  manifests are not double-annotated.
- The verify step is unchanged and still asserts both architectures
  are present on every published tag.
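The `index:` filtering mentioned in the side benefits can be pictured as follows. The sketch assumes the annotations arrive as a newline-separated string in which each entry carries a level prefix such as `manifest:` or `index:`; the function name and the sample values are illustrative.

```python
def index_annotations(raw):
    """Keep only the `index:`-level entries from a multi-line annotation string."""
    lines = (line.strip() for line in raw.splitlines())
    return [line for line in lines if line.startswith("index:")]


if __name__ == "__main__":
    raw = "\n".join([
        "manifest:org.opencontainers.image.description=example",
        "index:org.opencontainers.image.description=example",
    ])
    for annotation in index_annotations(raw):
        print(annotation)
```

Only the surviving entries would be passed to the merge step, so the per-platform manifests keep whatever the build phase already gave them.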

No image, label, annotation, or tag scheme is changing — the only
visible difference is faster runs.
Copilot AI review requested due to automatic review settings May 10, 2026 20:35
@gemini-code-assist

Note

Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported.

Copilot AI left a comment

Pull request overview

Refactors the docker-publish GitHub Actions workflow into a two-phase pipeline that builds linux/amd64 and linux/arm64 on native runners, then merges the per-arch digests into a multi-arch manifest list via docker buildx imagetools create to avoid QEMU overhead.

Changes:

  • Split publishing into a phase-1 (service × platform) native build matrix that pushes by digest and uploads digest artifacts.
  • Add a phase-2 per-service merge job that composes and tags a multi-arch index from the collected digests and applies index annotations.
  • Adjust cache scoping to be per (service, arch).

Comment on lines 136 to 144
context: ./${{ matrix.service }}
file: ./${{ matrix.service }}/Dockerfile
platforms: linux/amd64,linux/arm64
push: true
tags: ${{ steps.meta.outputs.tags }}
platforms: ${{ matrix.platform }}
labels: ${{ steps.meta.outputs.labels }}
# Pass through the annotations metadata-action computed (with both
# manifest:* and index:* levels, see DOCKER_METADATA_ANNOTATIONS_LEVELS
# at the workflow root). Without this the package shows up on GHCR
# with "No description provided".
annotations: ${{ steps.meta.outputs.annotations }}
# Build-args flow into the Dockerfile's ARG VERSION / REVISION /
# BUILD_DATE, which the Dockerfile then bakes into both OCI image
# labels (via LABEL) and runtime ENV vars (HUB_VERSION, …) so the
# running app can surface them through /healthz.
# `push-by-digest=true` skips the tag write and returns digest in
# `steps.build.outputs.digest`. `name-canonical=true` makes the
# registry reference the image by its canonical name.
outputs: type=image,name=${{ env.REGISTRY_GHCR }}/${{ github.repository }}-${{ matrix.service }},push-by-digest=true,name-canonical=true,push=true
build-args: |
Comment on lines +264 to +267
env:
DIGEST_REPO: ${{ env.REGISTRY_GHCR }}/${{ github.repository }}-${{ matrix.service }}
TAGS: ${{ steps.meta.outputs.tags }}
ANNOTATIONS: ${{ steps.meta.outputs.annotations }}
Comment on lines +179 to +185
name: Merge manifest (${{ matrix.service }})
runs-on: ubuntu-24.04
needs: build
strategy:
fail-fast: false
matrix:
service: [backend, frontend]
Comment on lines +193 to +194
merge-multiple: true

@strausmann strausmann merged commit 8cd824d into main May 10, 2026
13 checks passed
github-actions Bot pushed a commit that referenced this pull request May 11, 2026
## <small>0.2.1 (2026-05-11)</small>

* fix(ci): emit GHCR package description as index annotation (#39) ([12c6b6c](12c6b6c)), closes [#39](#39)
* fix(ci): lowercase image ref before push-by-digest (#41) ([9dd954e](9dd954e)), closes [#41](#41)
* fix(ci): repair docker-publish.yml startup failure (#37) ([fb7cb59](fb7cb59)), closes [#37](#37)
* fix(ci): repair Verify multi-arch manifest step + drop fail-fast (#38) ([5d2ff7d](5d2ff7d)), closes [#38](#38)
* refactor(ci): split docker-publish into native-arch matrix + manifest merge (#40) ([8cd824d](8cd824d)), closes [#40](#40)
* chore(deps): bump github.com/go-chi/chi/v5 from 5.1.0 to 5.2.2 in /frontend (#36) ([a5971b9](a5971b9)), closes [#36](#36)

[skip ci]