Use snapshot-cloudflare.debian.org CDN mirror (origin fallback) to end mirror-halting CI #392
…allback (the pinned snapshot.debian.org origin is chronically overloaded and has been halting Container smoke + Validate for hours; snapshot-cloudflare.debian.org is Debian's Cloudflare-fronted mirror of the same service and was verified to serve byte-identical content - the pinned openssl .deb hashes to the exact committed SHA256 - and the apt Release path; both Dockerfiles now bootstrap TLS by alternating cloudflare/origin per retry attempt (still SHA256-verified) and point apt at the CDN, with the origin kept as the bootstrap fallback and in the egress allowlist; DEBIAN_SNAPSHOT date and every pin are unchanged so the build stays reproducible) (pinned-inputs/hadolint/shellcheck/go test/manifest green locally, container build confirms in CI; build-reliability fix touching the egress allowlist - reviewer please scrutinize the allowlist addition)
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: efaa286111
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…) (P2a: apt sources listed only the CDN, so a transient CDN outage during apt-get update/install would still halt the build despite the bootstrap fallback - append snapshot.debian.org as a second apt source in both Dockerfiles so apt has true mirror redundancy, same signed/hash-verified content; P2b: operators denying snapshot.debian.org:443 to block Debian egress would silently regain it via the CDN since deny_endpoints match exact host:port - document in injection-policy.md that both mirror endpoints must be denied) (hadolint/pinned-inputs/markdownlint/manifest green; supporting resilience + egress-policy doc for the mirror PR)
Fixed both: (P2a) both Dockerfiles now list snapshot.debian.org as a second apt source after the CDN, so apt has genuine mirror redundancy: a transient CDN outage during apt-get falls through to the origin (identical signed, hash-verified content). (P2b) documented in injection-policy.md that the automatic Debian snapshot egress now uses two endpoints and, since deny_endpoints match exact host:port, blocking it requires denying both snapshot-cloudflare.debian.org:443 and snapshot.debian.org:443. hadolint/pinned-inputs/markdownlint/manifest green.
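To illustrate the P2b point, here is a minimal sketch of exact host:port matching. The `DENY_ENDPOINTS` variable and the `is_denied` helper are hypothetical names chosen for this example; the real deny_endpoints format and matcher are not shown in this thread, so treat this as an assumption about the behaviour described above rather than the workcell implementation.

```sh
#!/bin/sh
# Hypothetical deny list: only the origin is denied (assumed format: space-separated host:port).
DENY_ENDPOINTS="snapshot.debian.org:443"

# is_denied HOST:PORT
# Returns 0 only when the endpoint is string-equal to a deny entry (no suffix or wildcard matching).
is_denied() {
    for entry in $DENY_ENDPOINTS; do
        if [ "$entry" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# With only the origin denied, the CDN endpoint is still reachable.
for endpoint in snapshot.debian.org:443 snapshot-cloudflare.debian.org:443; do
    if is_denied "$endpoint"; then
        echo "$endpoint denied"
    else
        echo "$endpoint allowed"
    fi
done
```

Under this assumption, an operator who wants to block Debian snapshot egress entirely has to list both endpoints in the deny set.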
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 91e6e8a6bd
… (P2) (listing cloudflare and origin as concurrent apt sources meant every apt-get update contacted both, so a slow/unreachable origin - the exact failure being avoided - would fail the update and stall the retry budget on every attempt; replace the static dual-source list with a set_snapshot_sources helper and call it per retry attempt (odd=cloudflare, even=origin) in both Dockerfiles, so each attempt uses a single mirror and a down mirror only fails its own attempt before falling through to the other) (hadolint/pinned-inputs/manifest green, Container smoke already passed the CDN build; supporting the mirror-resilience fix)
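The per-attempt selection described in this commit (odd attempts use the CDN, even attempts use the origin) can be sketched roughly as follows. This is not the PR's set_snapshot_sources helper, whose body is not shown here; `pick_mirror`, `fetch_with_fallback`, and `fake_fetch` are hypothetical names, and the fetch step is a stand-in that makes no network calls.

```sh
#!/bin/sh
CDN_MIRROR="snapshot-cloudflare.debian.org"
ORIGIN_MIRROR="snapshot.debian.org"

# pick_mirror ATTEMPT
# Prints the single mirror to use for a 1-based attempt number.
pick_mirror() {
    if [ $(( $1 % 2 )) -eq 1 ]; then
        echo "$CDN_MIRROR"
    else
        echo "$ORIGIN_MIRROR"
    fi
}

# fetch_with_fallback MAX_ATTEMPTS FETCH_CMD
# Runs FETCH_CMD against exactly one mirror per attempt and stops at the first success.
fetch_with_fallback() {
    max=$1
    fetch=$2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        mirror=$(pick_mirror "$attempt")
        if "$fetch" "$mirror"; then
            echo "attempt $attempt succeeded via $mirror"
            return 0
        fi
        echo "attempt $attempt failed via $mirror" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Stand-in for the real fetch: pretend the CDN is down and the origin is up.
fake_fetch() {
    [ "$1" = "$ORIGIN_MIRROR" ]
}

fetch_with_fallback 4 fake_fetch
```

Because each attempt touches only one mirror, an unreachable mirror costs a single attempt instead of failing every attempt, which is the failure mode the concurrent dual-source list had.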
Fixed: replaced the concurrent dual-source apt list with a per-attempt mirror selection. A …
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: af6e0c0cde
Codex Review: Didn't find any major issues. Delightful!
Pushing back on this one, with reasoning. The build-time per-attempt fallback is safe because I control the retry loop. Runtime in-session |
Problem
The container build pins snapshot.debian.org as its sole Debian snapshot source. That origin is chronically overloaded and has been halting Container smoke and Validate repository for hours (repeated exit-38 / fetch failures), blocking every PR merge.

Fix
Switch the primary snapshot source to snapshot-cloudflare.debian.org, Debian's Cloudflare-CDN-fronted mirror of the same snapshot service, keeping snapshot.debian.org as a fallback.

Verified reproducibility-safe (from this host):
- openssl_3.5.5-1~deb13u1_arm64.deb fetched from the CDN hashes to the exact committed SHA256 (92dfcdc2…).
- The CDN serves the apt dists/trixie/Release path (HTTP 200).
- The DEBIAN_SNAPSHOT date and all SHA256 pins are unchanged.

What changed (5 files, 49 lines)
- runtime/container/Dockerfile + tools/validator/Dockerfile: the TLS bootstrap fetch now alternates snapshot-cloudflare.debian.org / snapshot.debian.org per retry attempt (still SHA256-verified), and apt points at the CDN. Origin remains the automatic bootstrap fallback.
- scripts/workcell: adds snapshot-cloudflare.debian.org:443 to the egress allowlist (both bootstrap_endpoints and the ephemeral-container allow set). This is an addition alongside the existing origin, not a weakening.
- scripts/verify-invariants.sh: enforces the new allowlist entry.
- control-plane-manifest.json: regenerated for the scripts/workcell change.

Validation
Local: check-pinned-inputs.sh, hadolint (both Dockerfiles), shellcheck (both scripts), go test ./internal/metadatautil/... ./cmd/workcell-citools/..., and the control-plane manifest are all green. The container-build lanes in this PR's own CI exercise the CDN end-to-end.

Security note for reviewers: please scrutinize the egress-allowlist addition. It adds one verified Debian CDN hostname; the default-deny posture and all denies are unchanged.
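For reviewers who want a concrete picture of the "still SHA256-verified" step, a minimal sketch of a pinned-hash check follows. It assumes GNU coreutils `sha256sum` is available; `verify_sha256` is a hypothetical helper name, and the demo uses a throwaway temp file in place of the real .deb and its committed pin.

```sh
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 only if FILE hashes to EXPECTED_HEX, regardless of which mirror served it.
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}

# Demo with a throwaway file; "pinned" stands in for the committed SHA256.
tmp=$(mktemp)
printf 'example payload\n' > "$tmp"
pinned=$(sha256sum "$tmp" | cut -d ' ' -f 1)

if verify_sha256 "$tmp" "$pinned"; then
    echo "hash matches pin"
fi

# Any change to the bytes makes the check fail, so the download would be rejected.
printf 'tampered\n' >> "$tmp"
if ! verify_sha256 "$tmp" "$pinned"; then
    echo "hash mismatch: reject download"
fi
rm -f "$tmp"
```

A check of this shape is what makes switching mirrors safe for reproducibility: the pin constrains the bytes, not the host that delivered them.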
🤖 Generated with Claude Code