fix(ci): repair Verify multi-arch manifest step + drop fail-fast by strausmann · Pull Request #38 · strausmann/Label-Printer-Hub

strausmann · 2026-05-10T20:18:09Z

Summary

After #37 fixed the workflow-startup failure, the run for tag 0.2.0 ran further but exposed two more bugs:

The Verify multi-arch manifest step interpolated the newline-separated steps.meta.outputs.tags directly into a bash for loop, which injected literal newlines into the script body and broke the parser.
fail-fast: true cancelled the still-building backend half-way through when the frontend's Verify step crashed. Result: frontend:0.2.0 was pushed, backend:0.2.0 is missing.

What's in this PR

Pipe steps.meta.outputs.tags through env: and iterate with while IFS= read -r.
Switch the matrix to fail-fast: false so a verify-step failure on one service does not destroy the other.

State of GHCR right now

✅ ghcr.io/strausmann/label-printer-hub-frontend:0.2.0 (+ :0.2, :0, :latest) — multi-arch amd64+arm64
❌ ghcr.io/strausmann/label-printer-hub-backend:0.2.0 — missing (build was cancelled)

After this merges, a manual workflow_dispatch tag=0.2.0 will publish the backend half and re-verify the frontend half.

Test plan

Merge PR.
gh workflow run docker-publish.yml -f tag=0.2.0 --ref main.
Confirm backend:0.2.0 exists as multi-arch on GHCR.
Confirm frontend Verify step now passes.

First container-buildable code for the frontend. Mirrors the backend's shape (chi router with /healthz, multi-stage Dockerfile, OCI labels, non-root UID 1000, configurable PORT, multi-arch buildable) so the docker-publish workflow can push both images side-by-side at the next release. Go skeleton (cmd/server/main.go): - chi router with RequestID + RealIP + Logger + Recoverer middleware - /healthz returns the same JSON shape as the backend: status, version, revision, build_date, repository — so orchestrator probe configs work for both containers uniformly - BuildInfo populated from HUB_* env vars baked in by Dockerfile; safe defaults when running uncontained (local dev, unit tests) - Graceful shutdown on SIGTERM/SIGINT with 15s context timeout - ReadHeaderTimeout / ReadTimeout / WriteTimeout / IdleTimeout all set to sane defaults (mitigates slowloris-style attacks) Tests (8 unit tests): - /healthz: 200, correct JSON shape, Content-Type, no auth required - /healthz body does not leak secret-ish substrings (parity with backend test_does_not_expose_secrets) - envDefault: fallback on unset / empty, env value when set Dockerfile (multi-stage): - Stage 1 (builder, golang:1.23-alpine): go mod download cached separately, then CGO_ENABLED=0 static binary with -trimpath -ldflags '-s -w' for smallest reproducible output - Stage 2 (runtime, alpine:3.20): tini + curl (HEALTHCHECK) + ca-certificates (HTTPS to backend for proxied requests later) + non-root UID 1000 user matching the backend container - ARGs VERSION / REVISION / BUILD_DATE flow into OCI labels (12 of them, identical schema to backend) AND runtime ENV vars (HUB_VERSION, HUB_REVISION, HUB_BUILD_DATE, HUB_REPO_URL) so the running app surfaces them through /healthz - Shell-form HEALTHCHECK expands ${PORT:-8080} - ENTRYPOINT tini -- /usr/local/bin/server (no shell wrapper needed; Go binary reads $PORT itself) .dockerignore minimal: ignores build output, IDE, secrets, .git, node_modules (Tailwind toolchain lands later). frontend/README.md: subdirectory README pointing at the repo root, documenting env vars and the local dev loop. Verified locally with --build-arg + docker run + curl /healthz on default port (8080) and custom port (PORT=7777). This is intentionally a SKELETON — Tailwind, HTMX, PWA manifest, service worker, OpenAPI-generated backend client, and the actual UI routes land in follow-up PRs once the backend exposes real endpoints. The point of this PR is to make 'docker compose up' work end-to-end with both containers from the next release. Refs: ADR 0001 (two-container), ADR 0003 (Go + Tailwind + HTMX + PWA), ADR 0007 (multi-arch tag scheme)

Go's testing package panics when a test uses both t.Parallel() and t.Setenv (or t.Chdir, or cryptotest.SetGlobalRandom). The setenv calls mutate process-wide state and can't safely interleave with parallel tests. The two affected tests in main_test.go set FRONTEND_TEST_KEY and FRONTEND_EMPTY_KEY via t.Setenv — they now run serially, while the six other tests still use t.Parallel(). Caught by the Go CI job on PR #35. Adding this pattern to docs/learnings/code-review-patterns.md in a follow-up commit if it likely recurs — Go contributors may not know the t.Setenv/t.Parallel incompatibility off the top of their head.

…imeout Addresses Gemini + Copilot review findings on PR #35: - Cache HUB_* env vars once at startup into `buildInfo` instead of reading os.Getenv on every /healthz request (Gemini): the values are baked into the image and never change at runtime, so per-request syscalls were waste. - Replace chi's `middleware.Logger` with a small custom slog-based request logger (Gemini): chi's logger writes through the stdlib `log` package and bypasses our slog handler, so request lines would not honour the configured log level/format/destination. - Register signal.Notify BEFORE launching the shutdown goroutine (Gemini): a SIGTERM arriving in the scheduling window between `go func` and the channel registration would otherwise terminate the process by default instead of triggering graceful shutdown. - Set `WriteTimeout: 0` with explanatory comment (Copilot): the frontend will proxy Server-Sent Events, and any non-zero WriteTimeout would tear down long-lived SSE responses mid-stream. Per-route timeouts will be applied to non-SSE routes when they are added. Tests: - New TestLoadBuildInfo_AppliesEnvOverrides verifies the startup-cache path. - New TestLoadBuildInfo_UsesDefaultsWhenUnset verifies the fallback path. - Existing tests now call initBuildInfoForTests so /healthz sees populated values (main() is what loads the cache in production, and that does not run during `go test`).

`go test -race` flagged a race on the global `buildInfo` var: every healthz test calls `initBuildInfoForTests`, and with `t.Parallel()` those calls run concurrently — multiple goroutines wrote to the same variable. Wrapping the write in `sync.Once` makes the initialization race-free: the first caller populates `buildInfo`, every subsequent caller is a no-op. Test correctness is preserved because `loadBuildInfo()` is pure with respect to the env it reads — once it has produced a value, calling it again with the same env would produce the same value.

The workflow has been failing with "workflow file issue" (0 jobs, 0s) on every push, release, and workflow_dispatch trigger since the initial commit — no container images have ever been published. Two root causes: 1. `matrix.service` was referenced in a job-level `if:` expression: if: hashFiles(format('{0}/Dockerfile', matrix.service)) != '' The matrix context is not available before the matrix expands, and the job-level `if:` evaluates BEFORE expansion. GitHub treats this as a startup failure and reports zero jobs. Replaced with a step-level `test -f` check after checkout, which has matrix.service available. 2. `secrets.*` was referenced in a step-level `if:` expression: if: secrets.DOCKERHUB_USERNAME != '' && secrets.DOCKERHUB_TOKEN != '' Since the 2024 Actions hardening, `secrets` is no longer a recognised named-value in step `if:` conditions ("Unrecognized named-value: 'secrets'"). Surfacing the secrets into the step's env first and then conditioning on env.* is the documented workaround. After these two fixes the workflow parses cleanly, the matrix expands into [backend, frontend], and both images can be built and pushed to GHCR (Docker Hub is still optional and gated on the env-var check).

The previous docker-publish run (25638615417) pushed the frontend image to GHCR correctly, but the Verify step crashed and the fail-fast=true strategy cancelled the still-building backend half-way through. Net result: frontend:0.2.0 published, backend:0.2.0 missing. Two bugs: 1. The Verify step interpolated `${{ steps.meta.outputs.tags }}` directly into the bash `for` loop. The metadata-action emits tags newline-separated, so the substitution injected literal newlines into the script body — bash then parsed the second tag as a syntax error. Fix: pass the tag list through the step `env:` and iterate with `while IFS= read -r`, which is safe for newline-separated data. 2. `fail-fast: true` made the frontend's Verify-step failure cancel the backend's still-running build. The two images are independent — a failure on one must not destroy the other. Switched to `fail-fast: false` so a partial publish is still useful and the user only has to re-run the failing matrix leg. The Build & push step itself worked correctly: frontend:0.2.0, frontend:0.2, frontend:0, frontend:latest exist on GHCR as multi-arch (amd64 + arm64) manifests. After this fix the backend half will catch up on the next workflow_dispatch with tag=0.2.0.

gemini-code-assist · 2026-05-10T20:18:18Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses critical CI failures encountered during the 0.2.0 release process by refining how multi-arch manifests are verified and preventing premature workflow cancellation. Additionally, it introduces the foundational Go service for the frontend, providing a deployable baseline that includes structured logging, graceful shutdown, and OCI-compliant image labeling.

Highlights

CI Workflow Improvements: Updated the 'Verify multi-arch manifest' step to safely handle newline-separated tags using a while-read loop and disabled fail-fast in the matrix to ensure independent service verification.
Frontend Skeleton Implementation: Added a new Go-based frontend service skeleton, including a Dockerfile, basic health check endpoint, and unit tests to support the project's two-container architecture.

Ignored Files

Ignored by pattern: .github/workflows/** (1)
- .github/workflows/docker-publish.yml

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

Copilot

Pull request overview

This PR aims to make the docker-publish.yml workflow reliably publish and verify multi-arch images without one matrix leg cancelling the other, and also adds a new buildable frontend/ Go service + container so the workflow has an actual frontend image to build/push.

Changes:

Fix the “Verify multi-arch manifest” step by iterating newline-separated tags safely via an env var + while read.
Disable matrix fail-fast to avoid cancelling the other service’s publish leg on a verify-step failure.
Add an initial frontend/ Go module, healthcheck endpoint, Dockerfile, and basic tests to produce a deployable frontend image.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`.github/workflows/docker-publish.yml`	Makes tag verification robust and prevents matrix cancellation via `fail-fast: false`.
`frontend/README.md`	Documents local dev, configuration, and container image details for the new frontend service.
`frontend/go.mod`	Introduces the frontend Go module and dependencies (chi).
`frontend/go.sum`	Adds checksum entries for the new Go dependency set.
`frontend/Dockerfile`	Adds a multi-stage build for a non-root, healthcheckable frontend container image.
`frontend/cmd/server/main.go`	Implements a minimal chi server with `/healthz`, structured request logging, and graceful shutdown.
`frontend/cmd/server/main_test.go`	Adds tests for `/healthz`, env defaulting, and build-info loading behavior.
`frontend/.dockerignore`	Reduces build context and prevents accidental inclusion of local/dev/secrets artifacts.

    strategy:
-      fail-fast: true   # if backend fails, don't push frontend mismatched
+      # Each image stands on its own — a verify-step crash on one service
+      # must not cancel the other half-built. We tolerate the rare case
+      # where one image gets a tag and the other doesn't; the user can
+      # re-run the failing matrix leg from the Actions UI.
+      fail-fast: false
      matrix:
        service: [backend, frontend]


+LABEL org.opencontainers.image.title="label-printer-hub-frontend" \
+      org.opencontainers.image.description="Self-hosted label printer hub for Brother PT/QL series — frontend container (Go + Tailwind + HTMX + PWA)" \
+      org.opencontainers.image.url="https://github.com/strausmann/label-printer-hub" \
+      org.opencontainers.image.source="https://github.com/strausmann/label-printer-hub" \
+      org.opencontainers.image.documentation="https://github.com/strausmann/label-printer-hub/tree/main/docs" \
+      org.opencontainers.image.authors="Björn Strausmann <strausmannservices@googlemail.com>" \
+      org.opencontainers.image.vendor="strausmann (independent open-source project; not affiliated with Brother Industries, Ltd.)" \
+      org.opencontainers.image.licenses="MIT" \


gemini-code-assist

Code Review

This pull request establishes the foundational skeleton for the Go-based frontend, including a multi-stage Dockerfile, project documentation, and a basic web server using the chi router. The implementation features a /healthz endpoint, structured logging with slog, and graceful shutdown handling, supported by a suite of unit tests. Review feedback identified a privacy violation regarding hardcoded PII in the Dockerfile labels and recommended reordering the middleware to ensure the request logger captures events even when a panic occurs.

gemini-code-assist · 2026-05-10T20:23:59Z

+      org.opencontainers.image.url="https://github.com/strausmann/label-printer-hub" \
+      org.opencontainers.image.source="https://github.com/strausmann/label-printer-hub" \
+      org.opencontainers.image.documentation="https://github.com/strausmann/label-printer-hub/tree/main/docs" \
+      org.opencontainers.image.authors="Björn Strausmann <strausmannservices@googlemail.com>" \


This line contains the maintainer's real name and email address, which constitutes Personally Identifiable Information (PII). According to the repository's privacy policy, PII should not be hardcoded in the repository to ensure the maintainer's privacy and prevent network deduction.

org.opencontainers.image.authors="Maintainer <maintainer@example.invalid>" \

References

Privacy violations. Flag any hardcoded LAN IPs, real hostnames, real domains, real tokens, or PII. The maintainer's network must not be deducible from this repository. ^(link)

gemini-code-assist · 2026-05-10T20:23:59Z

+	r.Use(middleware.Recoverer)
+	r.Use(slogRequestLogger)


The slogRequestLogger middleware is currently placed after middleware.Recoverer. This means that if a panic occurs in a handler, the Recoverer will catch it and return a 500 response, but the slogRequestLogger will be bypassed and won't log the request or the resulting status code. Moving the logger above the recoverer ensures that all requests, including those that result in a panic, are logged with their final status code. This also ensures the middleware uses the slog logging framework consistently as required.

Suggested change

r.Use(middleware.Recoverer)

r.Use(slogRequestLogger)

r.Use(slogRequestLogger)

r.Use(middleware.Recoverer)

References

Ensure HTTP middleware uses the same logging framework (e.g., slog) as the rest of the application to maintain consistent log formatting and configuration.

## <small>0.2.1 (2026-05-11)</small> * fix(ci): emit GHCR package description as index annotation (#39) ([12c6b6c](12c6b6c)), closes [#39](#39) * fix(ci): lowercase image ref before push-by-digest (#41) ([9dd954e](9dd954e)), closes [#41](#41) * fix(ci): repair docker-publish.yml startup failure (#37) ([fb7cb59](fb7cb59)), closes [#37](#37) * fix(ci): repair Verify multi-arch manifest step + drop fail-fast (#38) ([5d2ff7d](5d2ff7d)), closes [#38](#38) * refactor(ci): split docker-publish into native-arch matrix + manifest merge (#40) ([8cd824d](8cd824d)), closes [#40](#40) * chore(deps): bump github.com/go-chi/chi/v5 from 5.1.0 to 5.2.2 in /frontend (#36) ([a5971b9](a5971b9)), closes [#36](#36) [skip ci]

strausmann added 6 commits May 10, 2026 19:47

Copilot AI review requested due to automatic review settings May 10, 2026 20:18

Copilot started reviewing on behalf of strausmann May 10, 2026 20:18 View session

strausmann merged commit 5d2ff7d into main May 10, 2026
11 checks passed

Copilot AI reviewed May 10, 2026

View reviewed changes

gemini-code-assist Bot reviewed May 10, 2026

View reviewed changes

strausmann mentioned this pull request May 10, 2026

refactor(ci): split docker-publish into native-arch matrix + manifest merge #40

Merged

7 tasks

strausmann deleted the fix/docker-publish-verify-step branch May 15, 2026 22:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(ci): repair Verify multi-arch manifest step + drop fail-fast#38

fix(ci): repair Verify multi-arch manifest step + drop fail-fast#38
strausmann merged 6 commits into
mainfrom
fix/docker-publish-verify-step

strausmann commented May 10, 2026

Uh oh!

gemini-code-assist Bot commented May 10, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 10, 2026

Uh oh!

gemini-code-assist Bot May 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

strausmann commented May 10, 2026

Summary

What's in this PR

State of GHCR right now

Test plan

Uh oh!

gemini-code-assist Bot commented May 10, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants