fix(ci): repair docker-publish.yml startup failure #37

Merged
strausmann merged 5 commits into main from fix/docker-publish-workflow on May 10, 2026

@strausmann
Owner

Summary

The docker-publish.yml workflow has been silently failing on every trigger since the initial commit — no container images have ever been published to GHCR. v0.1.0 had no images; v0.2.0 (released a few minutes ago) also has no images.

Every run shows up as 0 seconds, conclusion=failure, 0 jobs with the message "This run likely failed because of a workflow file issue."

Root cause

Two issues in the workflow file:

1. matrix.service in job-level if:

jobs:
  publish:
    if: hashFiles(format('{0}/Dockerfile', matrix.service)) != ''
    strategy:
      matrix:
        service: [backend, frontend]

The matrix context is not available in a job-level if:, because GitHub evaluates that condition before it expands the matrix. hashFiles() is likewise only usable inside steps. GitHub treats the invalid expression as a startup failure and reports zero jobs.

Fix: replaced with a step-level test -f "${{ matrix.service }}/Dockerfile" check, which has access to matrix.service.

2. secrets.* in step-level if:

- name: Log in to Docker Hub
  if: secrets.DOCKERHUB_USERNAME != '' && secrets.DOCKERHUB_TOKEN != ''

The secrets context is not available in step-level if: conditions, so the expression fails validation (the error is "Unrecognized named-value: 'secrets'").

Fix: surface the secrets into the step's env: first, then condition on env.* which is allowed.

Verification

  • actionlint passes locally.
  • Once merged, semantic-release will not trigger a new release (no feat:/fix: commit touches the package source code), so publishing for v0.2.0 has to be re-triggered, either by re-publishing the existing release or by running workflow_dispatch with tag=0.2.0.

Test plan

  • Merge this PR.
  • Trigger docker-publish.yml via workflow_dispatch with tag=0.2.0.
  • Confirm ghcr.io/strausmann/label-printer-hub-backend:0.2.0 and …-frontend:0.2.0 exist as multi-arch (linux/amd64 + linux/arm64) manifests.
  • Confirm OCI labels and /healthz build-info reflect VERSION=0.2.0.

First container-buildable code for the frontend. Mirrors the backend's
shape (chi router with /healthz, multi-stage Dockerfile, OCI labels,
non-root UID 1000, configurable PORT, multi-arch buildable) so the
docker-publish workflow can push both images side-by-side at the next
release.

Go skeleton (cmd/server/main.go):
- chi router with RequestID + RealIP + Logger + Recoverer middleware
- /healthz returns the same JSON shape as the backend: status, version,
  revision, build_date, repository — so orchestrator probe configs
  work for both containers uniformly
- BuildInfo populated from HUB_* env vars baked in by Dockerfile;
  safe defaults when running uncontained (local dev, unit tests)
- Graceful shutdown on SIGTERM/SIGINT with 15s context timeout
- ReadHeaderTimeout / ReadTimeout / WriteTimeout / IdleTimeout all
  set to sane defaults (mitigates slowloris-style attacks)
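
The /healthz behaviour described in the list above can be sketched roughly as follows. This is a minimal illustration using only net/http rather than the chi router the commit uses; the struct name, JSON tags, default values, the `envDefault` signature, and the `probe` helper are assumptions, not code taken from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
)

// BuildInfo mirrors the /healthz fields listed above. The struct name and
// JSON tags are assumptions for illustration.
type BuildInfo struct {
	Status     string `json:"status"`
	Version    string `json:"version"`
	Revision   string `json:"revision"`
	BuildDate  string `json:"build_date"`
	Repository string `json:"repository"`
}

// envDefault returns the value of key, or fallback when it is unset or empty.
func envDefault(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

// loadBuildInfo reads the HUB_* variables that the Dockerfile bakes in.
// The fallback strings are hypothetical "safe defaults" for local runs.
func loadBuildInfo() BuildInfo {
	return BuildInfo{
		Status:     "ok",
		Version:    envDefault("HUB_VERSION", "dev"),
		Revision:   envDefault("HUB_REVISION", "unknown"),
		BuildDate:  envDefault("HUB_BUILD_DATE", "unknown"),
		Repository: envDefault("HUB_REPO_URL", "unknown"),
	}
}

// healthzHandler serves a fixed BuildInfo as JSON with an implicit 200.
func healthzHandler(info BuildInfo) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(info)
	}
}

// probe calls the handler in-process and returns status code and body.
func probe(info BuildInfo) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
	healthzHandler(info).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe(loadBuildInfo())
	fmt.Printf("%d %s", code, body)
}
```

Passing the loaded BuildInfo into the handler, instead of reading the environment inside it, keeps the handler trivially testable without touching process-wide state.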

Tests (8 unit tests):
- /healthz: 200, correct JSON shape, Content-Type, no auth required
- /healthz body does not leak secret-ish substrings (parity with
  backend test_does_not_expose_secrets)
- envDefault: fallback on unset / empty, env value when set
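
As an illustration of the "does not leak secret-ish substrings" check, a test helper could look roughly like the sketch below. The deny-list and the `leakedMarkers` name are hypothetical; the PR does not show which substrings the real test looks for.

```go
package main

import (
	"fmt"
	"strings"
)

// secretMarkers is a hypothetical deny-list; the real test's list is not
// shown in this PR.
var secretMarkers = []string{"password", "token", "secret", "api_key"}

// leakedMarkers returns every marker found in body, case-insensitively.
func leakedMarkers(body string) []string {
	lower := strings.ToLower(body)
	var found []string
	for _, m := range secretMarkers {
		if strings.Contains(lower, m) {
			found = append(found, m)
		}
	}
	return found
}

func main() {
	body := `{"status":"ok","version":"0.2.0"}`
	fmt.Println(len(leakedMarkers(body)) == 0)
}
```

A unit test would feed the recorded /healthz body into such a helper and fail if the returned slice is non-empty.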

Dockerfile (multi-stage):
- Stage 1 (builder, golang:1.23-alpine): go mod download cached
  separately, then CGO_ENABLED=0 static binary with -trimpath -ldflags
  '-s -w' for smallest reproducible output
- Stage 2 (runtime, alpine:3.20): tini + curl (HEALTHCHECK) +
  ca-certificates (HTTPS to backend for proxied requests later) +
  non-root UID 1000 user matching the backend container
- ARGs VERSION / REVISION / BUILD_DATE flow into OCI labels (12 of them,
  identical schema to backend) AND runtime ENV vars (HUB_VERSION,
  HUB_REVISION, HUB_BUILD_DATE, HUB_REPO_URL) so the running app
  surfaces them through /healthz
- Shell-form HEALTHCHECK expands ${PORT:-8080}
- ENTRYPOINT tini -- /usr/local/bin/server (no shell wrapper needed;
  Go binary reads $PORT itself)

.dockerignore minimal: ignores build output, IDE, secrets, .git,
node_modules (Tailwind toolchain lands later).

frontend/README.md: subdirectory README pointing at the repo root,
documenting env vars and the local dev loop.

Verified locally with --build-arg + docker run + curl /healthz on
default port (8080) and custom port (PORT=7777).

This is intentionally a SKELETON — Tailwind, HTMX, PWA manifest,
service worker, OpenAPI-generated backend client, and the actual UI
routes land in follow-up PRs once the backend exposes real endpoints.
The point of this PR is to make 'docker compose up' work end-to-end
with both containers from the next release.

Refs: ADR 0001 (two-container), ADR 0003 (Go + Tailwind + HTMX + PWA),
      ADR 0007 (multi-arch tag scheme)

Go's testing package panics when a test uses both t.Parallel() and
t.Setenv (or t.Chdir, or cryptotest.SetGlobalRandom). The setenv calls
mutate process-wide state and can't safely interleave with parallel
tests.

The two affected tests in main_test.go set FRONTEND_TEST_KEY and
FRONTEND_EMPTY_KEY via t.Setenv — they now run serially, while the
six other tests still use t.Parallel().

Caught by the Go CI job on PR #35.

Adding this pattern to docs/learnings/code-review-patterns.md in a
follow-up commit if it likely recurs — Go contributors may not know
the t.Setenv/t.Parallel incompatibility off the top of their head.
…imeout

Addresses Gemini + Copilot review findings on PR #35:

- Cache HUB_* env vars once at startup into `buildInfo` instead of reading
  os.Getenv on every /healthz request (Gemini): the values are baked into
  the image and never change at runtime, so per-request syscalls were waste.

- Replace chi's `middleware.Logger` with a small custom slog-based
  request logger (Gemini): chi's logger writes through the stdlib `log`
  package and bypasses our slog handler, so request lines would not honour
  the configured log level/format/destination.

- Register signal.Notify BEFORE launching the shutdown goroutine
  (Gemini): a SIGTERM arriving in the scheduling window between `go func`
  and the channel registration would otherwise terminate the process by
  default instead of triggering graceful shutdown.

- Set `WriteTimeout: 0` with explanatory comment (Copilot): the frontend
  will proxy Server-Sent Events, and any non-zero WriteTimeout would tear
  down long-lived SSE responses mid-stream. Per-route timeouts will be
  applied to non-SSE routes when they are added.
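
The signal-ordering and WriteTimeout points above can be sketched as follows. This is a simplified illustration, not the PR's code: it waits in a `select` instead of a separate shutdown goroutine, the timeout durations are assumptions, and `demoEarlySignal` queues a synthetic SIGTERM so that the example terminates on its own.

```go
package main

import (
	"context"
	"errors"
	"log/slog"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// newServer builds an http.Server with the timeout layout described above.
// The concrete durations are assumptions, not values taken from the PR.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 5 * time.Second,
		ReadTimeout:       15 * time.Second,
		WriteTimeout:      0, // disabled so long-lived SSE responses are not cut off
		IdleTimeout:       60 * time.Second,
	}
}

// run serves until a signal arrives on sigCh, then shuts down gracefully.
func run(srv *http.Server, sigCh <-chan os.Signal, grace time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case err := <-errCh:
		return err // the listener failed before any signal arrived
	case <-sigCh:
	}
	ctx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return err
	}
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demoEarlySignal registers the signal channel first, then starts serving.
func demoEarlySignal() error {
	// Registration happens before any goroutine is launched, so an early
	// SIGTERM is queued on the channel instead of killing the process.
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGTERM, syscall.SIGINT)
	defer signal.Stop(sigCh)

	// Demo only: simulate a signal that arrived before the server started.
	sigCh <- syscall.SIGTERM

	srv := newServer("127.0.0.1:0", http.NotFoundHandler())
	return run(srv, sigCh, 15*time.Second)
}

func main() {
	if err := demoEarlySignal(); err != nil {
		slog.Error("server stopped", "err", err)
		os.Exit(1)
	}
	slog.Info("server shut down cleanly")
}
```

In a real main the synthetic send would be dropped and `run` would simply block until the orchestrator delivers SIGTERM.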

Tests:
- New TestLoadBuildInfo_AppliesEnvOverrides verifies the startup-cache path.
- New TestLoadBuildInfo_UsesDefaultsWhenUnset verifies the fallback path.
- Existing tests now call initBuildInfoForTests so /healthz sees populated
  values (main() is what loads the cache in production, and that does not
  run during `go test`).

`go test -race` flagged a race on the global `buildInfo` var: every
healthz test calls `initBuildInfoForTests`, and with `t.Parallel()` those
calls run concurrently — multiple goroutines wrote to the same variable.

Wrapping the write in `sync.Once` makes the initialization race-free:
the first caller populates `buildInfo`, every subsequent caller is a
no-op. Test correctness is preserved because `loadBuildInfo()` is pure
with respect to the env it reads — once it has produced a value, calling
it again with the same env would produce the same value.
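
A minimal sketch of the `sync.Once` pattern described above, assuming a package-level `buildInfo` variable. The `initBuildInfo` and `initConcurrently` names, the stand-in loader, and the `loadCalls` counter are illustrative and not taken from the commit.

```go
package main

import (
	"fmt"
	"sync"
)

// BuildInfo is reduced to one field for this sketch.
type BuildInfo struct{ Version string }

var (
	buildInfo     BuildInfo
	buildInfoOnce sync.Once
	loadCalls     int // counts loader runs; only written inside the Once
)

// loadBuildInfo stands in for the real loader, which reads HUB_* env vars.
func loadBuildInfo() BuildInfo {
	loadCalls++
	return BuildInfo{Version: "dev"}
}

// initBuildInfo populates the global exactly once, however many goroutines
// call it; later callers return after the first write has completed.
func initBuildInfo() {
	buildInfoOnce.Do(func() { buildInfo = loadBuildInfo() })
}

// initConcurrently calls initBuildInfo from n goroutines and waits for all.
func initConcurrently(n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			initBuildInfo()
		}()
	}
	wg.Wait()
}

func main() {
	initConcurrently(8)
	fmt.Println(buildInfo.Version, loadCalls) // prints "dev 1"
}
```

Because `Once.Do` only returns after the first call's function has finished, every caller observes the fully written `buildInfo`, which is what removes the race reported by `go test -race`.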
The workflow has been failing with "workflow file issue" (0 jobs, 0s) on
every push, release, and workflow_dispatch trigger since the initial
commit — no container images have ever been published.

Two root causes:

1. `matrix.service` was referenced in a job-level `if:` expression:

     if: hashFiles(format('{0}/Dockerfile', matrix.service)) != ''

   The matrix context is not available before the matrix expands, and the
   job-level `if:` evaluates BEFORE expansion. GitHub treats this as a
   startup failure and reports zero jobs. Replaced with a step-level
   `test -f` check after checkout, which has matrix.service available.

2. `secrets.*` was referenced in a step-level `if:` expression:

     if: secrets.DOCKERHUB_USERNAME != '' && secrets.DOCKERHUB_TOKEN != ''

   The `secrets` context is not available in step `if:` conditions
   ("Unrecognized named-value: 'secrets'"). Surfacing the secrets into
   the step's env first and then conditioning on env.* is the documented
   workaround.

After these two fixes the workflow parses cleanly, the matrix expands
into [backend, frontend], and both images can be built and pushed to
GHCR (Docker Hub is still optional and gated on the env-var check).
Copilot AI review requested due to automatic review settings May 10, 2026 20:10
@gemini-code-assist

Summary of Changes

This pull request addresses critical failures in the CI pipeline that prevented container image publication. Additionally, it introduces the foundational Go service for the frontend, establishing the necessary structure for future UI and API proxying development.

Highlights

  • CI Workflow Fixes: Resolved startup failures in docker-publish.yml by moving conditional logic from the job-level to step-level and updating secret handling to comply with current GitHub Actions security hardening.
  • Frontend Skeleton Implementation: Added a new Go-based frontend service skeleton, including a Dockerfile, basic health check endpoint, and unit tests to support the project's two-container architecture.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/docker-publish.yml

@strausmann merged commit fb7cb59 into main on May 10, 2026
11 checks passed

@gemini-code-assist Bot left a comment

Code Review

This pull request establishes the frontend skeleton for the Label Printer Hub, introducing a Go web server with a chi router, a /healthz endpoint, and a multi-stage Dockerfile. Feedback suggests replacing the default chi Recoverer middleware with a custom implementation that uses slog to ensure consistent structured logging across the application.

r := chi.NewRouter()
r.Use(middleware.RequestID)
r.Use(middleware.RealIP)
r.Use(middleware.Recoverer)

medium

The chi middleware.Recoverer uses the standard library's log package to output stack traces, which bypasses the structured slog logger configured for the rest of the application. This leads to inconsistent log formatting and destination (e.g., if slog is configured to output JSON to a file, panics will still go to stderr as plain text). Consider implementing a custom recoverer middleware that leverages slog.Error to maintain consistency.

References
  1. Ensure HTTP middleware uses the same logging framework (e.g., slog) as the rest of the application to maintain consistent log formatting and configuration.
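
A custom recoverer along the lines the review suggests might look like the sketch below. It is a simplified stand-in written against net/http only, not the implementation adopted in the repository; the `slogRecoverer` and `demoPanic` names and the log attribute keys are assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"runtime/debug"
)

// slogRecoverer recovers from handler panics, logs them through the given
// slog.Logger, and answers 500. It is a simplified stand-in for chi's
// Recoverer middleware.
func slogRecoverer(logger *slog.Logger) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			defer func() {
				rec := recover()
				if rec == nil {
					return
				}
				if rec == http.ErrAbortHandler {
					panic(rec) // let net/http abort the response as documented
				}
				logger.Error("panic recovered",
					"panic", fmt.Sprint(rec),
					"method", r.Method,
					"path", r.URL.Path,
					"stack", string(debug.Stack()),
				)
				w.WriteHeader(http.StatusInternalServerError)
			}()
			next.ServeHTTP(w, r)
		})
	}
}

// demoPanic runs a panicking handler behind the middleware and returns the
// response status together with the captured log output.
func demoPanic() (int, string) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	h := slogRecoverer(logger)(http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		panic("boom")
	}))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/healthz", nil))
	return rec.Code, buf.String()
}

func main() {
	code, logged := demoPanic()
	fmt.Println(code, len(logged) > 0)
}
```

Since the middleware has the standard `func(http.Handler) http.Handler` shape, it could be registered with `r.Use(...)` on a chi router in place of `middleware.Recoverer`.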

Copilot AI left a comment

Pull request overview

Repairs the Docker image publishing workflow so it no longer fails at workflow startup, and introduces a minimal frontend/ Go service + container baseline so releases can publish both backend and frontend images to GHCR (and optionally Docker Hub).

Changes:

  • Fixes .github/workflows/docker-publish.yml startup failure by removing a job-level if: that referenced matrix.*, and adjusts Docker Hub login gating to avoid secrets.* in step-level if:.
  • Adds a buildable frontend/ Go module with a /healthz endpoint that mirrors the backend’s build-info shape.
  • Adds a multi-stage frontend/Dockerfile (non-root, OCI labels, healthcheck) plus supporting README and .dockerignore.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| frontend/README.md | Documents the frontend skeleton, local dev, and container behavior. |
| frontend/go.mod | Defines the new frontend Go module and dependency on chi. |
| frontend/go.sum | Locks module checksums for the new Go dependency. |
| frontend/Dockerfile | Adds a multi-stage, non-root frontend image with /healthz healthcheck and OCI labels. |
| frontend/cmd/server/main.go | Implements the minimal Go HTTP server, /healthz, and graceful shutdown. |
| frontend/cmd/server/main_test.go | Adds tests for /healthz, env-default behavior, and build-info loading. |
| frontend/.dockerignore | Keeps the frontend Docker build context minimal and avoids leaking local artifacts/secrets. |
| .github/workflows/docker-publish.yml | Fixes workflow startup failure and hardening-related conditional logic. |

Comment on lines +31 to +33
- name: Verify Dockerfile exists for ${{ matrix.service }}
  run: test -f "${{ matrix.service }}/Dockerfile"

Comment on lines +28 to +33
func TestHealthz_ReturnsOK(t *testing.T) {
	t.Parallel()
	initBuildInfoForTests(t)
	r := newRouter()
	req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
	w := httptest.NewRecorder()
github-actions Bot pushed a commit that referenced this pull request May 11, 2026
## <small>0.2.1 (2026-05-11)</small>

* fix(ci): emit GHCR package description as index annotation (#39) ([12c6b6c](12c6b6c)), closes [#39](#39)
* fix(ci): lowercase image ref before push-by-digest (#41) ([9dd954e](9dd954e)), closes [#41](#41)
* fix(ci): repair docker-publish.yml startup failure (#37) ([fb7cb59](fb7cb59)), closes [#37](#37)
* fix(ci): repair Verify multi-arch manifest step + drop fail-fast (#38) ([5d2ff7d](5d2ff7d)), closes [#38](#38)
* refactor(ci): split docker-publish into native-arch matrix + manifest merge (#40) ([8cd824d](8cd824d)), closes [#40](#40)
* chore(deps): bump github.com/go-chi/chi/v5 from 5.1.0 to 5.2.2 in /frontend (#36) ([a5971b9](a5971b9)), closes [#36](#36)

[skip ci]