Add Gentoo distfiles proxy support with cache exclude feature #125
Merged
+587
−9
6 commits
d1da36e openspec: Add gentoo-distfiles-proxy change spec (ganto)
6d5b8e8 Add Gentoo distfiles proxy support with cache exclude feature (ganto)
ed1d3e9 github-workflow: Add Gentoo to e2e test matrix (ganto)
d39d9cf test: Fix error handling in e2e assertNotCached (ganto)
e6ff9d6 test: Restore default slog logger after validateConfig tests (ganto)
c5f5fdb openspec: Archive gentoo-distfiles-proxy change spec (ganto)
2 changes: 2 additions & 0 deletions
openspec/changes/archive/2026-04-06-gentoo-distfiles-proxy/.openspec.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| schema: spec-driven | ||
| created: 2026-04-06 |
71 changes: 71 additions & 0 deletions
openspec/changes/archive/2026-04-06-gentoo-distfiles-proxy/design.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| ## Context | ||
|
|
||
| pkgproxy routes requests by stripping the first URL path segment as the repository name, then proxying the remainder to configured upstream mirrors. Cache candidacy is currently decided solely by file suffix (`IsCacheCandidate` in `cache.go`). Gentoo distfiles are content-addressed, permanent blobs with heterogeneous file extensions — the suffix model alone cannot represent "cache everything except a few metadata files". | ||
|
|
||
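The first-segment routing convention described above can be sketched in Go as follows. This is a minimal illustration only: `splitRepoPath` is a hypothetical helper, not pkgproxy's actual function, and the `ab` directory and file name in the example are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRepoPath is a hypothetical helper (not taken from pkgproxy).
// It treats the first URL path segment as the repository name and
// returns the remainder as the path to request from an upstream mirror.
func splitRepoPath(urlPath string) (string, string, bool) {
	trimmed := strings.TrimPrefix(urlPath, "/")
	repo, rest, found := strings.Cut(trimmed, "/")
	if !found || repo == "" || rest == "" {
		// No repository segment, or nothing left to proxy.
		return "", "", false
	}
	return repo, "/" + rest, true
}

func main() {
	// Placeholder path following the distfiles/<xx>/<filename> layout.
	repo, rest, ok := splitRepoPath("/gentoo/distfiles/ab/example-1.0.tar.gz")
	fmt.Println(repo, rest, ok) // gentoo /distfiles/ab/example-1.0.tar.gz true
}
```

Because the repository name is just the first segment, a Gentoo request needs no special routing: everything after `/gentoo` is passed through unchanged.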
| ## Goals / Non-Goals | ||
|
|
||
| **Goals:** | ||
| - Cache all Gentoo distfiles by default with a minimal exclude list for mirror-specific metadata. | ||
| - Introduce an `exclude` field that works independently of `"*"`, so operators can also exclude oversized individual files from any repo (e.g. `verylarge.rpm`). | ||
| - No changes to the proxy routing or transport layers — Gentoo fits the existing first-segment routing convention. | ||
|
|
||
| **Non-Goals:** | ||
| - Computing or validating the BLAKE2B path prefix — pkgproxy is a transparent proxy; path correctness is portage's responsibility. | ||
| - Caching `layout.conf` — excluded by default in the Gentoo config entry; no special-case code needed. | ||
| - Supporting `mirror://gentoo/` pseudo-URI scheme in ebuilds — handled transparently when portage resolves it to a real URL. | ||
|
|
||
| ## Decisions | ||
|
|
||
| ### 1. `"*"` wildcard in `suffixes` means "cache all" | ||
|
|
||
| **Decision:** A literal `"*"` entry in the `suffixes` list makes every proxied file a cache candidate, subject to `exclude` filtering. | ||
|
|
||
| **Alternatives considered:** | ||
| - `cache_all: true` boolean flag — adds a new top-level field and duplicates semantics already expressible via `suffixes`. | ||
| - Empty `suffixes` list means cache all — inverts current behavior (empty = cache nothing) and is surprising. | ||
| - **Chosen:** `suffixes: ["*"]` is explicit and additive, and needs no new validation rule; `validateConfig` only logs a warning for the redundant-suffix edge case. | ||
|
|
||
| **Edge case:** If `suffixes` contains both `"*"` and explicit entries (e.g. `["*", ".rpm"]`), the explicit entries are redundant. The config is accepted but `validateConfig` logs a warning naming the repository and the redundant suffixes. `IsCacheCandidate` treats this identically to `["*"]` alone. | ||
|
|
||
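A rough Go sketch of the redundant-suffix warning follows. The function names, the log message, and the attribute keys are assumptions made for illustration; the source only states that `validateConfig` logs a warning naming the repository and the redundant suffixes.

```go
package main

import (
	"log/slog"
	"slices"
)

// redundantSuffixes returns the explicit entries made redundant by a
// "*" wildcard in the same list, or nil when no wildcard is present.
// Hypothetical helper, not pkgproxy's actual code.
func redundantSuffixes(suffixes []string) []string {
	if !slices.Contains(suffixes, "*") {
		return nil
	}
	var redundant []string
	for _, s := range suffixes {
		if s != "*" {
			redundant = append(redundant, s)
		}
	}
	return redundant
}

// warnRedundantSuffixes logs a warning but never rejects the config.
// The message text and attribute keys are assumptions.
func warnRedundantSuffixes(logger *slog.Logger, repo string, suffixes []string) {
	if r := redundantSuffixes(suffixes); len(r) > 0 {
		logger.Warn("wildcard suffix makes explicit suffixes redundant",
			"repository", repo, "suffixes", r)
	}
}

func main() {
	warnRedundantSuffixes(slog.Default(), "gentoo", []string{"*", ".rpm"})
}
```

Passing the logger as a parameter is a choice made for this sketch so it can be exercised without touching the process-wide default logger.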
| ### 2. `exclude` matches both exact filenames and suffixes | ||
|
|
||
| **Decision:** Each entry in `exclude` is tested against the filename as an exact match first, then as a suffix. This covers: | ||
| - Exact files: `layout.conf`, `timestamp.mirmon`, `timestamp.dev-local` | ||
| - Suffix-based: `.sig`, `.asc` if an operator wanted to exclude signatures | ||
|
|
||
| **Alternatives considered:** | ||
| - Separate `exclude_names` and `exclude_suffixes` fields — more explicit but adds config verbosity for a simple feature. | ||
| - Glob/regex patterns — more powerful but over-engineered for current needs; can be added later. | ||
|
|
||
| ### 3. `exclude` is valid without `"*"` in suffixes | ||
|
|
||
| **Decision:** The `exclude` field is always applied, regardless of whether `"*"` is present. When no `"*"` is present, it acts as an override on top of suffix matching — useful for excluding a specific large file from an otherwise suffix-matched repo. | ||
|
|
||
| **Implementation:** `IsCacheCandidate` runs exclude check before suffix check. If any exclude entry matches, return false immediately. | ||
|
|
||
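Decisions 1 to 3 can be sketched together in Go. This is an illustration under assumptions, not the code in `cache.go`: the method signature, the simplified `CacheConfig` struct, and the skipping of empty exclude entries are all hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"slices"
	"strings"
)

// CacheConfig is a simplified stand-in for the real type; only the
// two fields discussed in this design are modeled.
type CacheConfig struct {
	Suffixes []string
	Exclude  []string
}

// IsCacheCandidate applies the order described above: exclude check
// first, then the "*" wildcard, then plain suffix matching.
// The signature is an assumption for this sketch.
func (c CacheConfig) IsCacheCandidate(urlPath string) bool {
	name := path.Base(urlPath)
	for _, e := range c.Exclude {
		if e == "" {
			continue // assumption: ignore empty entries, which would match everything
		}
		// Exact filename match first, then suffix match.
		if name == e || strings.HasSuffix(name, e) {
			return false
		}
	}
	if slices.Contains(c.Suffixes, "*") {
		return true
	}
	for _, s := range c.Suffixes {
		if strings.HasSuffix(name, s) {
			return true
		}
	}
	return false
}

func main() {
	gentoo := CacheConfig{
		Suffixes: []string{"*"},
		Exclude:  []string{"layout.conf", "timestamp.mirmon", "timestamp.dev-local"},
	}
	fmt.Println(gentoo.IsCacheCandidate("/distfiles/ab/example-1.0.crate")) // true
	fmt.Println(gentoo.IsCacheCandidate("/distfiles/layout.conf"))          // false
}
```

One consequence of matching each entry as a suffix as well as an exact name: in this sketch an entry such as `verylarge.rpm` also excludes any file whose name merely ends in `verylarge.rpm`.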
| ### 4. Gentoo config uses init7 + Adfinis as primary Swiss mirrors | ||
|
|
||
| **Decision:** `mirror.init7.net` first, `pkg.adfinis-on-exoscale.ch` second, `distfiles.gentoo.org` as authoritative fallback. | ||
|
|
||
| ### 5. E2e test bootstraps portage snapshot and uses emerge --fetchonly | ||
|
|
||
| **Decision:** Use `gentoo/stage3:latest`. The test script downloads `portage-latest.tar.xz` directly from `distfiles.gentoo.org` (bypassing the proxy — bootstrap only), unpacks it into `/var/db/repos/gentoo`, sets `GENTOO_MIRRORS` to pkgproxy, then runs `emerge --fetchonly app-text/tree`. This exercises the real portage fetch path including BLAKE2B path resolution. | ||
|
|
||
| **Alternatives considered:** | ||
| - Raw `wget` of a known distfile URL — simpler and faster, but doesn't validate that portage's mirror resolution works end-to-end through pkgproxy. | ||
|
|
||
| The test verifies: | ||
| 1. `emerge --fetchonly app-text/tree` exits successfully with `GENTOO_MIRRORS` pointing at pkgproxy. | ||
| 2. The tree source archive is cached on disk under `gentoo/distfiles/`. | ||
| 3. `wget` of `distfiles/layout.conf` through the proxy succeeds but the file is NOT written to cache. | ||
|
|
||
| ## Risks / Trade-offs | ||
|
|
||
| - **`"*"` caches everything including unexpected content** → Mitigated by the `exclude` list; operators can tune it. | ||
| - **Gentoo distfiles are large** → Cache disk usage is unbounded; this is an existing property of pkgproxy (no eviction). No change needed. | ||
| - **`portage-latest.tar.xz` snapshot download adds ~300 MB to each e2e test run** → Acceptable; Gentoo e2e tests are run manually on request, not in automated CI. | ||
| - **Mirror availability** → `distfiles.gentoo.org` as authoritative fallback ensures correctness. | ||
|
|
||
| ## Open Questions | ||
|
|
||
| None — design is fully resolved by this document. |
30 changes: 30 additions & 0 deletions
openspec/changes/archive/2026-04-06-gentoo-distfiles-proxy/proposal.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| ## Why | ||
|
|
||
| pkgproxy supports caching for RPM, DEB, and Arch-based distros but not Gentoo. Gentoo users who build many packages fetch large source tarballs (distfiles) repeatedly across machines; a local caching proxy reduces bandwidth and improves build times. | ||
|
|
||
| ## What Changes | ||
|
|
||
| - Add `exclude` field to the `Repository` config type: a list of filenames or suffixes that are **never** cached, even when `suffixes` contains `"*"`. | ||
| - Add `"*"` wildcard support to the existing `suffixes` field: when present, all proxied files are cache candidates except those matching `exclude` entries. | ||
| - Add a `gentoo` repository entry to `configs/pkgproxy.yaml` using Swiss mirrors (init7, Adfinis/Exoscale) with `suffixes: ["*"]` and `exclude` covering mirror-specific metadata files. | ||
| - Add a Gentoo e2e test (`TestGentoo`) that fetches a distfile via the proxy from a `gentoo/stage3` container and asserts it is cached. | ||
|
|
||
| ## Capabilities | ||
|
|
||
| ### New Capabilities | ||
|
|
||
| - `gentoo-distfiles`: Proxy and cache Gentoo distfiles from configurable upstream mirrors, honoring the BLAKE2B hash-based directory layout (`distfiles/<xx>/<filename>`). | ||
| - `cache-exclude`: Per-repository `exclude` list that prevents specific filenames or suffixes from being cached, complementing the existing `suffixes` include list and enabling the `"*"` wildcard use case. | ||
|
|
||
| ### Modified Capabilities | ||
|
|
||
| - `e2e-multi-distro`: Gentoo is added as a supported distro with a corresponding e2e test. | ||
|
|
||
| ## Impact | ||
|
|
||
| - `pkg/pkgproxy/repository.go`: Add `Exclude []string` field to `Repository` struct; `validateConfig` needs no required-field check for it because the field is optional. | ||
| - `pkg/cache/cache.go`: Update `CacheConfig` to carry the exclude list; update `IsCacheCandidate` to handle `"*"` wildcard and exclude matching. | ||
| - `configs/pkgproxy.yaml`: Add `gentoo` repository entry. | ||
| - `test/e2e/e2e_test.go`: Add `TestGentoo`. | ||
| - `README.md` and landing page: Add Gentoo `make.conf` snippet. | ||
| - `CHANGELOG.md`: Document new features. |
45 changes: 45 additions & 0 deletions
...c/changes/archive/2026-04-06-gentoo-distfiles-proxy/specs/cache-exclude/spec.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| ## ADDED Requirements | ||
|
|
||
| ### Requirement: Wildcard suffix caches all files | ||
| When the `suffixes` list for a repository contains `"*"`, the cache SHALL treat every proxied file as a cache candidate, subject to the `exclude` list. | ||
|
|
||
| #### Scenario: File with uncommon extension is cached under wildcard repo | ||
| - **WHEN** a request is made for a file with an extension not in any explicit suffix list (e.g. `.crate`) under a repo with `suffixes: ["*"]` | ||
| - **THEN** `IsCacheCandidate` returns true | ||
|
|
||
| #### Scenario: Wildcard does not affect repos without it | ||
| - **WHEN** a request is made for a file under a repo whose `suffixes` list does not contain `"*"` | ||
| - **THEN** `IsCacheCandidate` applies the existing suffix-match logic unchanged | ||
|
|
||
| ### Requirement: Exclude list prevents specific files from being cached | ||
| A repository MAY define an `exclude` list. Each entry is matched against the request filename as an exact name first, then as a suffix. If any entry matches, the file SHALL NOT be cached regardless of `suffixes`. | ||
|
|
||
| #### Scenario: Exact filename match prevents caching | ||
| - **WHEN** a request is made for a file whose name exactly matches an `exclude` entry (e.g. `layout.conf`) | ||
| - **THEN** `IsCacheCandidate` returns false | ||
|
|
||
| #### Scenario: Suffix match prevents caching | ||
| - **WHEN** a request is made for a file whose name ends with an `exclude` entry (e.g. `.sig`) | ||
| - **THEN** `IsCacheCandidate` returns false | ||
|
|
||
| #### Scenario: Non-matching file is not excluded | ||
| - **WHEN** a request is made for a file that does not match any `exclude` entry | ||
| - **THEN** the `exclude` list has no effect on the cache candidacy decision | ||
|
|
||
| #### Scenario: Exclude applies without wildcard suffix | ||
| - **WHEN** a repository has explicit suffixes (no `"*"`) and an `exclude` list, and a request is made for a file that matches both a suffix and an exclude entry | ||
| - **THEN** `IsCacheCandidate` returns false (exclude takes precedence) | ||
|
|
||
| ### Requirement: Explicit suffixes alongside wildcard are redundant but valid | ||
| When the `suffixes` list contains both `"*"` and explicit suffix entries, the configuration SHALL be accepted. pkgproxy SHALL log a warning identifying the repository and the redundant entries. Cache behavior is identical to having only `"*"`. | ||
|
|
||
| #### Scenario: Mixed wildcard and explicit suffixes triggers a warning | ||
| - **WHEN** pkgproxy loads a repository config whose `suffixes` list contains `"*"` and at least one other entry | ||
| - **THEN** the repository is accepted without error, a warning is logged naming the repository and the redundant suffixes, and `IsCacheCandidate` behaves as if only `"*"` were present | ||
|
|
||
| ### Requirement: Exclude field is optional | ||
| The `exclude` field in a repository config SHALL be optional. Repositories without it SHALL behave identically to the current behavior. | ||
|
|
||
| #### Scenario: Repository without exclude field | ||
| - **WHEN** pkgproxy loads a repository config with no `exclude` key | ||
| - **THEN** the repository is accepted without error and cache behavior is unchanged |
17 changes: 17 additions & 0 deletions
...hanges/archive/2026-04-06-gentoo-distfiles-proxy/specs/e2e-multi-distro/spec.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| ## ADDED Requirements | ||
|
|
||
| ### Requirement: Gentoo e2e test | ||
| The test suite SHALL include a Gentoo test function `TestGentoo` using a `docker.io/gentoo/stage3:latest` container. The test script SHALL: | ||
| 1. Download the latest portage ebuild snapshot directly from `https://distfiles.gentoo.org/snapshots/portage-latest.tar.xz` (bypassing the proxy — this is bootstrap, not a distfile fetch). | ||
| 2. Unpack the snapshot into `/var/db/repos/gentoo` inside the container. | ||
| 3. Configure `GENTOO_MIRRORS` in `/etc/portage/make.conf` to point at the pkgproxy `gentoo` repository. | ||
| 4. Run `emerge --fetchonly app-text/tree` to fetch the `tree` package sources through the proxy. | ||
| 5. Fetch `http://<proxy>/gentoo/distfiles/layout.conf` via `wget` to exercise the excluded (proxied but not cached) path. | ||
|
|
||
| #### Scenario: emerge --fetchonly proxies and caches tree distfiles | ||
| - **WHEN** the Gentoo container runs `emerge --fetchonly app-text/tree` with `GENTOO_MIRRORS` pointing at pkgproxy | ||
| - **THEN** the command exits successfully and the tree source archive exists in the pkgproxy cache under the `gentoo/` subdirectory | ||
|
|
||
| #### Scenario: layout.conf is proxied but not cached | ||
| - **WHEN** the Gentoo container fetches `http://<proxy>/gentoo/distfiles/layout.conf` via `wget` | ||
| - **THEN** the request returns HTTP 200 and `layout.conf` does NOT exist in the pkgproxy cache under the `gentoo/` subdirectory |
27 changes: 27 additions & 0 deletions
...hanges/archive/2026-04-06-gentoo-distfiles-proxy/specs/gentoo-distfiles/spec.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| ## ADDED Requirements | ||
|
|
||
| ### Requirement: Gentoo distfiles repository config entry | ||
| The `configs/pkgproxy.yaml` SHALL include a `gentoo` repository configured with `suffixes: ["*"]`, an `exclude` list covering mirror-specific metadata files (`layout.conf`, `timestamp.mirmon`, `timestamp.dev-local`), and at least two Swiss HTTPS mirrors plus `distfiles.gentoo.org` as authoritative fallback. | ||
|
|
||
| #### Scenario: Gentoo distfiles repository is configured | ||
| - **WHEN** pkgproxy loads its configuration | ||
| - **THEN** the `gentoo` repository is available with at least one upstream mirror | ||
|
|
||
| #### Scenario: layout.conf is not cached | ||
| - **WHEN** a client fetches `<proxy>/gentoo/distfiles/layout.conf` | ||
| - **THEN** pkgproxy proxies the file upstream but does not write it to the local cache | ||
|
|
||
| #### Scenario: Distfile fetched via emerge --fetchonly is proxied and cached | ||
| - **WHEN** portage runs `emerge --fetchonly app-text/tree` with `GENTOO_MIRRORS` pointing at pkgproxy | ||
| - **THEN** pkgproxy proxies the distfile from the upstream mirror and saves it to the local cache under `gentoo/distfiles/<xx>/<filename>` | ||
|
|
||
| #### Scenario: Cached distfile is served from disk on subsequent request | ||
| - **WHEN** portage fetches the same distfile a second time | ||
| - **THEN** pkgproxy serves the file from the local cache without contacting the upstream mirror | ||
|
|
||
| ### Requirement: make.conf snippet in README and landing page | ||
| The README.md and HTTP landing page SHALL include a Gentoo `make.conf` snippet showing how to configure `GENTOO_MIRRORS` to point at the proxy. | ||
|
|
||
| #### Scenario: Gentoo configuration snippet is present | ||
| - **WHEN** a user views the README or the pkgproxy landing page | ||
| - **THEN** a `make.conf` snippet with `GENTOO_MIRRORS="http://<proxy>/gentoo"` is visible |
28 changes: 28 additions & 0 deletions
openspec/changes/archive/2026-04-06-gentoo-distfiles-proxy/tasks.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| ## 1. Cache exclude feature | ||
|
|
||
| - [x] 1.1 Add `Exclude []string` field to `Repository` struct in `pkg/pkgproxy/repository.go`; in `validateConfig`, if a repository's `suffixes` list contains `"*"` alongside other entries, log a `slog.Warn` naming the repository and the redundant suffixes | ||
| - [x] 1.2 Add `Exclude []string` field to `CacheConfig` in `pkg/cache/cache.go` | ||
| - [x] 1.3 Pass `Exclude` from `Repository` into `CacheConfig` when constructing upstreams in `proxy.go` | ||
| - [x] 1.4 Update `IsCacheCandidate` in `cache.go` to: run exclude check first (exact name + suffix), then handle `"*"` wildcard, then existing suffix logic | ||
| - [x] 1.5 Add unit tests for `IsCacheCandidate` covering: wildcard match, exclude exact name, exclude suffix, exclude overrides wildcard, exclude overrides explicit suffix, no exclude field | ||
| - [x] 1.6 Add unit test for `validateConfig` covering: wildcard with redundant explicit suffixes emits a warning and returns no error | ||
|
|
||
| ## 2. Gentoo repository config | ||
|
|
||
| - [x] 2.1 Add `gentoo` entry to `configs/pkgproxy.yaml` with `suffixes: ["*"]`, `exclude: [layout.conf, timestamp.mirmon, timestamp.dev-local]`, and mirrors: `mirror.init7.net`, `pkg.adfinis-on-exoscale.ch`, `distfiles.gentoo.org` | ||
|
|
||
| ## 3. E2e test | ||
|
|
||
| - [x] 3.1 Add `assertNotCached` helper to `test/e2e/e2e_test.go` that asserts no file matching a given name exists anywhere under a cache subdirectory | ||
| - [x] 3.2 Write `test/e2e/test-gentoo.sh` shell script that: downloads `portage-latest.tar.xz` directly from `distfiles.gentoo.org`, unpacks it to `/var/db/repos/gentoo`, sets `GENTOO_MIRRORS` in `make.conf` to point at pkgproxy, runs `emerge --fetchonly app-text/tree`, then fetches `distfiles/layout.conf` via `wget` through the proxy | ||
| - [x] 3.3 Add `TestGentoo` to `test/e2e/e2e_test.go` using `docker.io/gentoo/stage3:latest`, mounting the script, asserting tree source archive is cached under `gentoo/distfiles/`, and asserting `layout.conf` is NOT cached using `assertNotCached` | ||
|
|
||
| ## 3b. Makefile | ||
|
|
||
| - [x] 3b.1 Add `gentoo → TestGentoo` mapping to the `distroToTest` macro in `Makefile` so `make e2e DISTRO=gentoo` works; add `gentoo` to the error message's list of valid values | ||
|
|
||
| ## 4. Documentation | ||
|
|
||
| - [x] 4.1 Add Gentoo `make.conf` snippet to `README.md` | ||
| - [x] 4.2 Add Gentoo `make.conf` snippet to the HTTP landing page (`pkg/pkgproxy/landing.go` or template) | ||
| - [x] 4.3 Update `CHANGELOG.md` `[Unreleased]` section with new features |