Skip to content

feat(packages): GitHub Releases binary installer#898

Merged
viettranx merged 16 commits intodevfrom
feat/github-releases-installer
Apr 16, 2026
Merged

feat(packages): GitHub Releases binary installer#898
viettranx merged 16 commits intodevfrom
feat/github-releases-installer

Conversation

@mrgoonie
Copy link
Copy Markdown
Contributor

@mrgoonie mrgoonie commented Apr 14, 2026

Summary

New runtime package source github:owner/repo[@tag] — install Linux CLI binaries directly from GitHub Releases. Admin-only, SHA256-verified, ELF-validated, with a release-picker UI.

Closes #741.

How It Works

End-to-end install flow

sequenceDiagram
    autonumber
    actor Admin
    participant UI as Web UI
    participant API as HTTP API<br/>(internal/http)
    participant Inst as GitHubInstaller<br/>(internal/skills)
    participant GH as GitHub<br/>(api.github.com)
    participant Obj as github asset CDN<br/>(objects.githubusercontent.com)
    participant FS as Runtime bin dir<br/>(/app/data/.runtime/bin)

    Admin->>UI: Enter "cli/cli@v2.45.0"
    UI->>API: POST /v1/packages/install<br/>{ "package": "github:cli/cli@v2.45.0" }
    API->>API: adminAuth + requireMasterScope
    API->>API: parseAndValidatePackage → ParseGitHubSpec
    API->>Inst: InstallSingleDep(ctx, "github:...")
    Inst->>Inst: AllowedOrg(owner)?
    Inst->>GH: GET /repos/cli/cli/releases/tags/v2.45.0<br/>(Bearer token optional)
    GH-->>Inst: release + assets[]
    Inst->>Inst: SelectAsset(linux, runtime.GOARCH)
    Inst->>Obj: DownloadAsset(url, maxBytes)
    Note over Obj,Inst: SSRF allowlist re-validated<br/>on every redirect hop
    Obj-->>Inst: asset bytes → temp file + sha256
    opt checksums.txt present
        Inst->>Obj: DownloadAsset(checksum url)
        Obj-->>Inst: checksum file
        Inst->>Inst: VerifyChecksum(expected, sha)<br/>(constant-time compare)
    end
    Inst->>Inst: ExtractArchive(temp, 2×maxBytes)
    Note over Inst: sanitizePath rejects ../ abs \\ NUL<br/>cumulative size tracked (zip-bomb)
    Inst->>Inst: pickBinaries + validateELF<br/>(magic + 64-bit + arch)
    Inst->>FS: write binary 0755 + manifest (atomic)
    Inst-->>API: GitHubPackageEntry
    API-->>UI: 200 OK { ok: true }
    UI-->>Admin: ✓ installed
Loading

Component map

flowchart LR
    subgraph UI["Web UI (ui/web)"]
        S[github-binaries-section.tsx]
        P[Release Picker Modal]
        W[Musl/glibc Warning]
    end

    subgraph HTTP["HTTP Layer (internal/http)"]
        H1[POST /v1/packages/install]
        H2[POST /v1/packages/uninstall]
        H3[GET  /v1/packages]
        H4[GET  /v1/packages/github-releases<br/>rate-limited 30/min/user]
    end

    subgraph Skills["Skills (internal/skills)"]
        D[DepInstaller dispatch<br/>prefix: github: / pip: / npm:]
        I[GitHubInstaller]
        API[GitHubClient<br/>10-min cache]
        DL[Downloader<br/>SSRF + SHA256]
        EX[Archive Extract<br/>tar.gz / zip / raw]
        CK[Checksum<br/>constant-time cmp]
        EL[ELF Validator<br/>magic + class + machine]
        M[(Manifest<br/>github-packages.json)]
    end

    subgraph FS["Runtime dir (/app/data/.runtime/bin)"]
        B1[installed binaries on PATH]
    end

    subgraph External["External"]
        GH[(GitHub API)]
        CDN[(objects.githubusercontent.com)]
    end

    S --> H1
    S --> H2
    P --> H4
    S --> H3
    H1 --> D
    H2 --> D
    H3 --> I
    H4 --> API
    D -->|github:| I
    I --> API
    I --> DL
    I --> EX
    I --> CK
    I --> EL
    I --> M
    I --> B1
    API --> GH
    DL --> CDN
    DL --> GH
Loading

Security defense-in-depth

flowchart TD
    Req[Install request: github:owner/repo@tag] --> A1
    A1{Admin +<br/>master scope?}
    A1 -->|No| Deny1[403]
    A1 -->|Yes| A2
    A2{Valid spec<br/>ParseGitHubSpec?}
    A2 -->|No| Deny2[400]
    A2 -->|Yes| A3
    A3{Org in allowlist?}
    A3 -->|No| Deny3[ErrOrgNotAllowed]
    A3 -->|Yes| A4
    A4[GitHub API call] --> A5
    A5[Asset download] --> A6
    A6{HTTPS + host allowlist?<br/>re-validated per hop}
    A6 -->|No| Deny4[ErrHostNotAllowed]
    A6 -->|Yes| A7
    A7{Size ≤ maxBytes?}
    A7 -->|No| Deny5[ErrAssetTooLarge]
    A7 -->|Yes| A8
    A8{checksum match?<br/>constant-time cmp}
    A8 -->|Mismatch| Deny6[ErrChecksumMismatch]
    A8 -->|Match or absent| A9
    A9{Archive path safe?<br/>reject ../ abs NUL}
    A9 -->|No| Deny7[ErrUnsafePath]
    A9 -->|Yes| A10
    A10{Total uncompressed<br/>≤ 2×maxBytes?}
    A10 -->|No| Deny8[ErrZipBomb]
    A10 -->|Yes| A11
    A11{ELF magic +<br/>64-bit + arch match?}
    A11 -->|No| Deny9[ErrNotELF / ErrELFArchMismatch]
    A11 -->|Yes| OK[write 0755 + atomic manifest update]
Loading

What's Included

Backend (internal/skills/)

  • github_api.go — Releases API client (10-min cache, rate-limit mapping, token auth)
  • github_download.go — HTTPS-only streaming download with SSRF allowlist re-validated on every redirect hop, literal-IP rejection, size cap + SHA256 via io.TeeReader
  • github_checksum.gochecksums.txt / SHA256SUMS lookup + subtle.ConstantTimeCompare
  • archive_extract.go — tar.gz / zip / raw extract, strict path sanitization (rejects .., absolute, Windows drives, null bytes), zip-bomb guard (cumulative size at read time + declared-size pre-check), symlink skip
  • github_installer.go — parse spec, asset-select heuristic, ELF magic + 64-bit + arch validation, atomic manifest (tmp + Rename), install/uninstall/list, org allowlist

HTTP API

  • POST /v1/packages/install routes github: specs through installer
  • GET /v1/packages/github-releases?repo=owner/repo&limit=10 (viewer+, arch-filtered)
  • Per-user rate limit on picker: 30 req/min burst 10 → 429 + Retry-After on exceed
  • Extended GET /v1/packages with github field + github-bin runtime probe

Infra

  • Dockerfile creates /app/data/.runtime/bin (goclaw:goclaw 0755)
  • docker-entrypoint prepends bin dir to PATH
  • Env-only config (never config.json): GOCLAW_PACKAGES_GITHUB_TOKEN, GOCLAW_PACKAGES_MAX_ASSET_SIZE_MB, GOCLAW_PACKAGES_GITHUB_ALLOWED_ORGS, GOCLAW_PACKAGES_GITHUB_BIN_DIR, GOCLAW_PACKAGES_GITHUB_MANIFEST

UI

  • GitHub Binaries section + release picker modal + dismissible musl/glibc warning (localStorage)
  • i18n keys across en / vi / zh

Docs

  • docs/packages-github.md — user-facing install guide, admin config, security posture, troubleshooting (musl/glibc, rate limit, checksum mismatch)
  • docs/14-skills-runtime.md cross-ref

Security Posture

Attack vector Defense
SSRF HTTPS-only + host allowlist re-validated on every redirect hop; literal IP (v4/v6) rejected
Path traversal Strict sanitizePath rejects .., absolute, Windows drives, null bytes, backslash-normalized
Zip bomb Cumulative uncompressed size capped at 2× MaxAsset, enforced at both pre-check AND read time
Malicious binary ELF magic + 64-bit class + arch match before chmod +x
Checksum tamper subtle.ConstantTimeCompare; fail-hard on mismatch
Auth bypass Install/uninstall gated by adminAuth + requireMasterScope
Quota abuse Per-user rate limit on picker endpoint (30/min/user + burst 10)
Token leak Env-only config; startup logs token_set=bool, never value

Test Plan

  • go build ./... (PG backend)
  • go build -tags sqliteonly ./... (desktop/lite)
  • go vet ./...
  • go test ./internal/skills/... — parser, asset-select, checksum, archive (path traversal + zip bomb), ELF, SSRF, GitHub API mock, rate-limit
  • go test ./internal/http/... — picker rate limiter + full packages handler
  • pnpm build (UI strict TS check)
  • Manual: install github:jesseduffield/lazygit@v0.42.0 → binary in /app/data/.runtime/bin/lazygit + manifest entry
  • Manual: install with checksum mismatch → fail hard, no binary written
  • Manual: install github:evil/tool with allowlist → org-not-allowed error
  • Manual: UI picker shows 10 releases, filter matches current arch
  • Manual: picker spam > 30 req/min → 429 + Retry-After
  • Manual: Dockerfile build all variants, verify /app/data/.runtime/bin in PATH

Known Limitations (Phase 1)

  • Linux-only; Lite/Desktop editions not yet exposed
  • Multi-binary archives install all top-level executables (no interactive picker)
  • Binary basename collision = last-writer-wins
  • No version history / rollback
  • Global manifest (not per-tenant)

Unresolved

  • Should picker rate-limit be tunable via env var? (Currently hardcoded 30/min/burst-10.)
  • Backend i18n catalog keys intentionally skipped (errors stay raw English, matching existing installer patterns).

New runtime source `github:owner/repo[@tag]` for installing Linux CLI
binaries from GitHub Releases. Admin-only, SHA256-verified, ELF-validated.

Backend:
- GitHub API client with 10-min cache + rate-limit mapping
- SSRF-guarded streaming downloader (HTTPS + host allowlist, re-validated
  on every redirect hop, literal-IP rejection)
- Checksums.txt / SHA256SUMS lookup with constant-time verify
- Archive extract (tar.gz / zip / raw) with path-traversal + zip-bomb
  guards, symlink skip
- ELF magic + 64-bit class + runtime-arch validation
- Atomic manifest persistence (temp + rename)

HTTP:
- POST /v1/packages/install accepts github: spec
- GET /v1/packages/github-releases for picker UI (viewer+, arch-filtered)
- Extended InstalledPackages response with github field
- github-bin runtime probe

Infra:
- Dockerfile creates /app/data/.runtime/bin (goclaw:goclaw 0755)
- docker-entrypoint.sh prepends bin dir to PATH
- Env-only config (never config.json): token, max size, org allowlist,
  bin dir, manifest path

UI:
- GitHub Binaries section + release picker modal
- Dismissable musl/glibc compatibility warning (localStorage)
- i18n keys across en/vi/zh

Docs: docs/packages-github.md user guide + 14-skills-runtime.md cross-ref.

Closes #741
…inel

Code review cleanup:
- validPkgName regex had `:` added defensively, but github: specs are
  validated separately via skills.ParseGitHubSpec before reaching this
  check — the broadening was dead attack surface.
- Drop unused ErrUnknownArchive sentinel + the `_ = ErrUnknownArchive`
  stub in extractRaw.
Cap picker endpoint at 30 req/min/user (burst 10) to protect the shared
GitHub API quota. Key is userID (header X-GoClaw-User-Id) or remote IP
for anonymous callers. Returns 429 + Retry-After: 60 when tripped.

Standalone token-bucket limiter (stale-entry cleanup every 5 min) lives
in internal/http rather than importing internal/gateway, which would
create a package cycle.
CI pnpm build failed on TS2345: `.split('@')[0]` returns
`string | undefined` under strict index access. Default to empty
string to satisfy the type checker; runtime behaviour unchanged
because the downstream regex rejects empty strings.
@mrgoonie mrgoonie requested a review from viettranx April 14, 2026 09:31
@mrgoonie mrgoonie self-assigned this Apr 14, 2026
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 3m 51s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Review backend security implementation (SSRF, path traversal, zip bomb, ELF validation)
  • Review archive extraction and checksum verification
  • Review HTTP API layer (auth, rate limiting)
  • Review GitHub API client (caching, error handling)
  • Review UI components
  • Review infra changes (Dockerfile, entrypoint)
  • Post final review

Overall this is a solid, well-structured feature with good security depth. The architecture (layered defense, atomic manifest writes, SSRF allowlist re-validated per redirect hop) is well-designed. A few issues need attention before merge.


Medium Issues

1. Data race on lastSeen in the rate limiter

packages_rate_limiter.go:53 writes entry.lastSeen = time.Now() in Allow() while cleanupLoop reads v.(*perKeyEntry).lastSeen.Before(cutoff) concurrently, with no synchronization between them. The Go race detector will flag this.

// Allow() — no lock
entry.lastSeen = time.Now()

// cleanupLoop() — no lock on the same field
if v.(*perKeyEntry).lastSeen.Before(cutoff) {

Fix: use atomic.Pointer[time.Time] for lastSeen, or add a sync.Mutex to perKeyEntry. Fix this →

2. Checksum verification has silent soft-failure paths that contradict the security posture docs

The PR security table says "fail-hard on mismatch". The VerifyChecksum call itself does fail hard. But the surrounding guard in github_installer.go:388-408 silently continues on several failure modes:

if cerr == nil {
    defer os.Remove(checksumPath)
    if data, rerr := os.ReadFile(checksumPath); rerr == nil {   // silent continue on read failure
        if sums, perr := ParseChecksums(data); perr == nil {    // silent continue on parse failure
            if expected, ok := sums[asset.Name]; ok {
                // actual mismatch → fail hard ✓
            } else {
                slog.Warn(...)  // asset not in file → warn and continue
            }
        }
    }
}

An adversary who can serve a modified checksum file that omits the asset's name (while serving a malicious asset) bypasses verification entirely. This is HTTPS-protected, but the silent paths should at minimum log warnings, and the "asset not in file" case is worth considering as a fail-hard depending on your threat model. Fix this →

3. Download HTTP client timeout (30s) is too tight for large assets

github_download.go:67:

client := &http.Client{
    Timeout: c.HTTPClient.Timeout, // 30s from NewGitHubClient
    ...
}

http.Client.Timeout is the total time for the entire request including body read. A 200MB file on a modest 5 MB/s connection takes ~40s, exceeding the 30s timeout. The request already carries a context with installTimeout = 5min, so removing the client-level Timeout and relying on ctx is the right fix. Fix this →


Low / Style Issues

4. extractRaw ignores maxUncompressed

archive_extract.go:237 calls io.ReadAll with no size cap. It's safe in practice (the download is already capped at maxBytes, and ExtractArchive receives 2*maxBytes), but the inconsistency is surprising and would become a bug if called from any other context.

5. Defaults() is called twice on the same config

cmd/gateway_github_installer.go:41 calls cfg.Defaults() explicitly, then NewGitHubInstaller calls cfg.Defaults() again. The second call is idempotent but the duplication is confusing. Consider removing the explicit call from the cmd file and relying on NewGitHubInstaller.

6. InstalledBy in the manifest is never populated

github_installer.go:444 sets InstalledBy: "" (omitempty). The field exists for audit purposes but Install() has no way to receive a user ID. Consider removing the field from the struct entirely (and manifest schema) until it can actually be populated from the request context, or thread the user ID through the call chain.

7. Owner regex allows 40-char names (GitHub limit is 39)

gitHubSpecRE allows [A-Za-z0-9] + up to 38 [A-Za-z0-9-] + [A-Za-z0-9] = 40 characters max. GitHub usernames are capped at 39. Not a security issue (the API will reject it), but the validation is slightly off.

8. Misleading mu lock comment

github_installer.go:96: mu sync.Mutex // serializes manifest writes + concurrent installs. The lock only serializes the final disk-write phase (lines 427–474). Download, extraction, and ELF validation all run concurrently. The comment should say "serializes manifest writes and bin dir writes".

9. cleanupLoop goroutine leaks in tests

TestEnforceGitHubReleasesLimit_Writes429 swaps githubReleasesLimiter for a tight limiter. If Allow() triggers sync.Oncego rl.cleanupLoop(), that goroutine leaks when the test restores the original limiter. Minor, but will show up with -count >1 or leak detectors.


UI Notes

  • github-binaries-section.tsx:49isValidFullSpec uses [A-Za-z0-9-]* (no length cap on owner) while the backend regex caps at ~40 chars. Not a security issue but causes silent UX mismatch.
  • github-binaries-section.tsx:84split("@")[0] ?? ""split never returns undefined so ?? "" is dead code.
  • github-binaries-section.tsx:298rel.matching_assets[0] access is safe because it's guarded by rel.matching_assets?.[0], but this also exposes browser_download_url from the backend. Minor: that URL goes to the CDN but the UI only renders name, so no real concern.

What's Well Done

  • SSRF defense: allowlist + HTTPS-only + per-redirect-hop re-validation + literal-IP rejection is solid.
  • Path traversal: sanitizePath covers .., absolute, Windows drives, null bytes, backslash normalization — comprehensive.
  • Zip bomb: both pre-check (declared size) and streaming cap (read time) enforce the limit. The in[:0:0] pattern in filterAssets correctly avoids aliasing the backing array.
  • ELF validation: magic + class + machine check before chmod +x is the right level of validation.
  • Atomic manifest writes: tmp + os.Rename is correct.
  • Auth gating: adminAuth + requireMasterScope on install/uninstall is correct and consistent with the existing security model.
  • Infra: bin dir ownership in Dockerfile (goclaw-owned, 0755) and PATH prepend in entrypoint are clean.
  • Test coverage: parser, asset select, org allowlist, checksum, archive (path traversal + zip bomb), rate limiter — solid unit test surface.

Medium
- rate limiter: atomic.Int64 lastSeen + amortized sweep replaces
  goroutine-based cleanup → fixes data race on lastSeen and the
  goroutine leak when tests swap the package-level limiter.
- checksum pipeline: slog.Warn on ReadFile and ParseChecksums failures
  (previously silent). "asset not listed" stays warn+proceed but is now
  documented as the publisher's choice — ELF validation remains the
  final gate.
- downloader: drop http.Client.Timeout (30s capped the whole request
  including body read, aborting large downloads on slow links). Context
  deadline from install timeout (5 min) is the correct bound.

Low / style / UI
- extractRaw honors maxUncompressed (ErrFileTooLarge on overflow) so
  the helper is safe outside the hot path.
- cmd/gateway_github_installer.go: drop the explicit cfg.Defaults()
  call — NewGitHubInstaller already invokes it.
- GitHubPackageEntry: remove unpopulated InstalledBy field + document
  why.
- owner regex tightened to 39-char GitHub limit (was 40).
- mu lock comment corrected: serializes only the disk-write phase.
- UI: shared stripPrefixAndTag helper + owner regex mirrors the backend
  39-char cap; destructure-with-default kills the split()[0] ?? ""
  awkwardness while still satisfying noUncheckedIndexedAccess.

Verified: go build (pg + sqliteonly) · go vet · go test -race
./internal/skills ./internal/http · pnpm build.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 2m 58s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Review rate limiter (data race, cleanup)
  • Review download + SSRF defense
  • Review archive extraction (path traversal, zip bomb)
  • Review checksum verification logic
  • Review GitHub installer (ELF, manifest, org allowlist)
  • Review HTTP API layer
  • Review UI components
  • Post final review

Overall this is a well-executed, security-conscious feature. The architecture is solid and the implementation shows careful attention to the attack surface. Several issues from the previous review pass have been proactively fixed — those are called out in "What Changed" below.


Medium Issues

1. validRepoPath allows trailing hyphens in owner names

packages.go:21:

var validRepoPath = regexp.MustCompile(`^[A-Za-z0-9](?:[A-Za-z0-9-]{0,38})/[A-Za-z0-9][A-Za-z0-9._-]*$`)

This matches foo-/repo because [A-Za-z0-9-]{0,38} includes trailing -. The backend gitHubSpecRE (github_installer.go:43) correctly requires a non-hyphen terminator: [A-Za-z0-9](?:[A-Za-z0-9-]{0,37})?[A-Za-z0-9]|[A-Za-z0-9]. GitHub doesn't allow trailing hyphens in usernames/org names. Not a security issue (GitHub API rejects it, and the allowlist check comes after), but the validator is inconsistent with gitHubSpecRE and may produce confusing 502 errors rather than 400s for malformed input. Fix this →

2. Raw GitHub API error messages forwarded to client in the picker endpoint

packages.go:185-186:

writeJSON(w, http.StatusBadGateway, map[string]string{"error": err.Error()})

err.Error() from ListReleases surfaces the full GitHub error string — e.g. rate-limit reset timestamps, server error bodies — to any authenticated viewer. For an admin-restricted install path this would be acceptable, but handleGitHubReleases is readAuth (viewer+). Consider mapping sentinel errors to generic messages:

switch {
case errors.Is(err, skills.ErrGitHubRateLimited):
    writeJSON(w, http.StatusTooManyRequests, map[string]string{"error": "github rate limit reached"})
case errors.Is(err, skills.ErrGitHubNotFound):
    writeJSON(w, http.StatusNotFound, map[string]string{"error": "repository not found"})
default:
    writeJSON(w, http.StatusBadGateway, map[string]string{"error": "failed to fetch releases"})
}

Fix this →


Low / Style Issues

3. ListReleases cache is fragmented by the limit parameter

github_api.go:119: key := fmt.Sprintf("list:%s/%s:%d", owner, repo, limit). A caller passing limit=9 and limit=10 hits separate cache entries, each with its own 10-minute TTL. The picker always passes limit=10, so in practice this is a non-issue — but if the limit is ever varied, the cache busts unnecessarily. Consider caching at a canonical limit (e.g. always request the max you'd ever need) or stripping limit from the key if the result covers the requested count.

4. Install response gives no detail on what was written

packages.go:119:

writeJSON(w, http.StatusOK, map[string]any{"ok": true})

InstallSingleDep calls gh.Install() which returns a *GitHubPackageEntry with binary names, tag, SHA256, etc. This is discarded. Returning the entry in the success response would let the UI show "installed: lazygit v0.42.0" without a separate list refresh. Minor UX improvement.

5. UI full-spec regex allows unbounded tag length

github-binaries-section.tsx:66: (@[^\s]+)? — the tag portion has no length cap. GitHub tags are rarely > 100 chars but the backend gitHubSpecRE also has no explicit tag length limit (just [^\s\x00]+), so this is consistent. Worth adding {1,100} on both sides if you want strict client-side rejection before the API call.

6. handleGitHubReleasesvalidRepoPath admits two-char minimum for owner but single-char is valid on GitHub

packages.go:21 — the regex requires at least 1 char before the / which handles single-char handles like k/ but the overall regex [A-Za-z0-9](?:[A-Za-z0-9-]{0,38}) doesn't enforce the terminator issue noted above. (This is the same as issue #1; just noting it affects single-char org names too if they carry trailing characters.)


What Changed Since the Previous Review Pass

All medium/high findings from the previous review have been addressed:

Previous finding Status
Data race on lastSeen in rate limiter Fixed — uses atomic.Int64, no mutex needed
cleanupLoop goroutine leak in tests Fixed — replaced with amortized inline sweep via callCounter%1024
30s http.Client.Timeout too tight for large downloads Fixed — download client has no Timeout; relies on context deadline
Checksum soft-failure silent paths Fixed — all failure modes now have explicit slog.Warn
InstalledBy always empty in manifest Fixed — field removed entirely, replaced with a clear comment
Double Defaults() call Fixed — comment in cmd/ file explicitly notes this
mu lock comment misleading Fixed — now says "serializes the final disk-write phase"
Owner regex allows 40 chars Fixed — regex correctly caps at 39

What's Well Done

  • Rate limiter redesign: Amortized inline sweep (callCounter.Add(1)%1024) instead of a background goroutine is elegant — no goroutine leaks, no data races, no mutex on per-entry state. The atomic.Int64 for lastSeen is the right tool here.
  • SSRF defense: HTTPS-only + allowlist re-validated per redirect hop + literal-IP rejection. Comprehensive and correct.
  • sanitizePath: Covers .., absolute, Windows drives, null bytes, backslash normalization, and post-path.Clean escape check. Hard to slip anything through.
  • Zip bomb: Both declared-size pre-check and streaming cap in extractZip; extractRaw also enforces maxBytes via LimitReader. Belt-and-braces is the right call for binary installs.
  • filterAssets aliasing: in[:0:0] avoids sharing the backing array with the input slice — subtle but correct.
  • ELF validation: Magic + class + machine before chmod +x. Exactly the right level of validation.
  • Atomic manifest writes: tmp + os.Rename is correct and crash-safe.
  • Uninstall name resolution: UninstallPackage correctly handles github:name, github:owner/repo@tag, and bare owner/repo forms — resilient to different call sites.
  • Auth layering: Install/uninstall gated by adminAuth + requireMasterScope. Picker is viewer+ with per-user rate limiting. Allowlist membership is not enumerable (404 instead of 403).
  • Test coverage: Parser, asset-select, org allowlist, checksum, archive (path traversal + zip bomb), ELF, SSRF, rate limiter — comprehensive unit surface.

Medium
- validRepoPath now rejects trailing hyphens in the owner segment and
  caps at 39 chars, matching gitHubSpecRE exactly. Previously a subtle
  drift between the two validators could let `foo-/repo` slip to the
  GitHub API and surface as a 502 instead of a clean 400.
- handleGitHubReleases no longer forwards raw err.Error() from the
  upstream call. Maps sentinel errors:
    ErrGitHubRateLimited  → 429 + Retry-After
    ErrGitHubNotFound     → 404
    ErrGitHubUnauthorized → 502 "github authentication failed"
    default               → 502 "failed to fetch releases"
  Avoids leaking rate-limit reset timestamps / server internals to
  viewer-tier callers.

Low / UX
- Install response now returns the manifest entry for github: specs
  (new lookupGitHubEntry helper; nil-safe fallback to {ok:true}). Lets
  the UI display "installed: lazygit v0.42.0" without a list refresh.
- gitHubSpecRE tag segment capped at 1..255 chars (git ref-name bound).
  UI isValidFullSpec mirrors the same cap.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 3m 45s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Check status of medium findings from previous review pass
  • Read current state of key files
  • Review for new issues
  • Post final review

Previous Review Findings — Status

Both medium issues from the second review pass have been addressed:

Finding Status
validRepoPath allowing trailing hyphens in owner names Fixed — regex now mirrors gitHubSpecRE exactly: ([A-Za-z0-9](?:[A-Za-z0-9-]{0,37})?[A-Za-z0-9]|[A-Za-z0-9])
Raw GitHub API error messages surfaced to viewer-level callers FixedhandleGitHubReleases now maps ErrGitHubRateLimited, ErrGitHubNotFound, ErrGitHubUnauthorized to generic client-safe messages; default falls through to 502 "failed to fetch releases"

New Findings

1. Tag names in API path are not URL-encoded (Medium)

github_api.go:100-101:

path = fmt.Sprintf("/repos/%s/%s/releases/tags/%s", owner, repo, tag)

gitHubSpecRE allows [^\s\x00]{1,255} for tags, which admits #, ?, % and other URL-special characters. Go's http.NewRequestWithContext parses the URL before sending — a tag like v1.0.0#extra would silently have #extra stripped (treated as a URL fragment), resulting in a request for tag v1.0.0 without any error. A tag containing ?key=val would inject a query parameter, potentially hitting a different API endpoint.

Fix: use url.PathEscape(tag) when constructing the path, or tighten the regex to exclude #, ?, %:

path = fmt.Sprintf("/repos/%s/%s/releases/tags/%s", owner, repo, url.PathEscape(parsed.Tag))

Fix this →

2. UninstallPackage via full spec fails for packages with renamed binaries (Low)

dep_installer.go:184:

if spec, err := ParseGitHubSpec(dep); err == nil {
    name = spec.Repo  // "cli" for github:cli/cli@v2.66.0
}

When github:cli/cli is installed and the binary is gh, canonicalPackageName stores the manifest entry as Name: "gh". But an API call like UninstallPackage(ctx, "github:cli/cli@v2.66.0") resolves name = spec.Repo = "cli", which won't match the manifest entry "gh". The UI avoids this by sending github:<manifest-name> (e.g. github:gh), but direct API callers may get ErrPackageNotInstalled unexpectedly. Worth documenting or adding a secondary lookup by owner/repo across manifest entries.

3. GitHubClient.cache grows without eviction (Low)

github_api.go:56-58: the in-memory cache is a plain map[string]releaseCacheEntry with a 10-minute TTL. Expired entries are only skipped on read — they are never removed. Over long uptime with many distinct repo queries, this map grows monotonically. Not a practical concern at current scale, but a simple periodic sweep (or using sync.Map with a background ticker) would prevent the theoretical leak.

4. lookupGitHubEntry takes a second manifest read after install (Low)

packages.go:127-130:

if entry := lookupGitHubEntry(pkg); entry != nil {
    resp["entry"] = entry
}

lookupGitHubEntry calls gh.List() which acquires i.mu.Lock() and re-reads the manifest from disk. gh.Install() already returns *GitHubPackageEntry (and the manifest was just updated). Threading the return value through InstallSingleDep would eliminate the extra disk read and lock, though the current approach is correct and the impact is minimal.

5. pickBinaries comment says "top-level" but doesn't check depth (Low)

github_installer.go:317:

// 2. all top-level executables (single depth) that aren't in docs/man/etc. paths

The code checks nonBinaryPathRE for named exclusion dirs (man, docs, contrib, etc.) but doesn't limit to a single path depth. A binary at subdir1/subdir2/tool with the executable bit set would be picked if neither directory name matches the exclusion regex. This is typically benign but the comment over-promises. Either enforce filepath.Dir(f.Name) == "." (or a single stripping) for the top-level case, or update the comment.


What Changed Since the Previous Review Pass

Previous finding Status
validRepoPath trailing hyphen Fixed
Raw GitHub API errors exposed to viewers Fixed
Cache keyed by limit (Low) Still present — acknowledged, non-issue in practice

What's Well Done

The feature is in good shape. After three review passes, all high and medium findings from the prior rounds are resolved. The security defense-in-depth (SSRF allowlist per hop, path sanitization, zip bomb at both pre-check and read time, ELF validation, atomic manifest writes, auth + scope gating) is comprehensive and correctly implemented. The rate limiter redesign with amortized inline sweep and atomic lastSeen is clean. No goroutine leaks remain in tests. The newly added entry field in the install response and the proper sentinel error mapping in the picker endpoint are both clear improvements.

The only actionable finding is the URL-encoding issue on tag names (issue #1) — that's worth a one-line fix before merge.


Medium
- github_api: URL-encode owner, repo, and tag via url.PathEscape when
  building API paths. Previously a tag containing '#' would be stripped
  as a URL fragment and '?' would inject a query parameter, silently
  hitting the wrong release.

Low / polish
- Uninstall via full "github:owner/repo[@tag]" spec now falls back to
  manifest lookup by owner/repo, handling packages whose binary name
  differs from the repo name (cli/cli → gh).
- GitHubClient.cache sweeps expired entries opportunistically when the
  map grows past 256 entries (prevents theoretical unbounded growth
  over long uptime).
- handleInstall for github: specs now calls GitHubInstaller.Install
  directly and returns the freshly-created manifest entry, eliminating
  the double manifest read via List() from the lookupGitHubEntry helper.
- pickBinaries comment corrected — actual behavior excludes paths
  matched by nonBinaryPathRE rather than enforcing a single-depth limit.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 2m 42s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Check status of previous review findings (URL-encoding, cache growth, UninstallPackage)
  • Read current state of key files
  • Review for new issues
  • Post final review

Previous Round 3 Findings — All Resolved

Finding Status
Tag names not URL-encoded in API path (Medium) Fixedurl.PathEscape(owner), url.PathEscape(repo), and url.PathEscape(tag) are all applied in GetRelease (github_api.go:112–118) and ListReleases (github_api.go:141–142)
UninstallPackage fails for packages with renamed binaries (Low) Fixeddep_installer.go:186–196 now does an owner/repo lookup via gh.List() to find the canonical manifest name (handles cli/cligh)
GitHubClient.cache grows without eviction (Low) FixedcacheMaxEntries = 256 constant + opportunistic sweep in cacheSet when threshold is hit (github_api.go:86–100)
Second manifest read after install (Low) Fixedpackages.go:137 returns the *GitHubPackageEntry directly from gh.Install(), no secondary disk read
pickBinaries comment says "top-level" but no depth check Still present (Low, carried forward below)

New / Remaining Findings

1. interface{} should be any throughout github_api.go (Low, style)

github_api.go:45, 73, 88 use interface{} for the cache type:

type releaseCacheEntry struct {
    data      interface{}   // ← should be any
    ...
}
func (c *GitHubClient) cacheGet(key string) (interface{}, bool) { ... }
func (c *GitHubClient) cacheSet(key string, v interface{}) { ... }

The project targets Go 1.26; any is the idiomatic alias and is what the rest of the codebase uses. Fix this →

2. doJSON local variable shadows the url package import (Low, style)

github_api.go:153:

func (c *GitHubClient) doJSON(ctx context.Context, path string, out interface{}) error {
    url := c.BaseURL + path   // shadows "net/url"
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)

The "net/url" package is used correctly above this function (GetRelease, ListReleases), so there's no actual bug — Go scopes the shadow to doJSON only. But golangci-lint's govet -shadow check will flag it, and the intent is clearer with a non-conflicting name. Rename to apiURL or rawURL. Fix this →

3. Uninstall removes binaries from disk before saving the manifest (Low)

github_installer.go:525–533:

for _, b := range m.Packages[idx].Binaries {
    path := filepath.Join(i.Config.BinDir, filepath.Base(b))
    if rerr := os.Remove(path); rerr != nil && !os.IsNotExist(rerr) {
        slog.Warn(...)
    }
}
m.Packages = append(m.Packages[:idx], m.Packages[idx+1:]...)
return i.saveManifest(m)   // if this fails, binaries are gone but manifest still lists the package

If saveManifest returns an error after some or all binaries are already deleted, the manifest still lists the entry as installed but its files are gone. The next install of the same package overwrites the manifest correctly (self-healing), so this is Low severity — but a caller that retries on error would get ErrPackageNotInstalled on the second attempt. A simple mitigation is to update and save the manifest first, then remove the binaries; a failed disk remove is already a warn-and-continue path.

4. pickBinaries comment over-promises single-depth extraction (Low, carryover)

github_installer.go:319–320:

// Otherwise: top-level executables (no intermediate dir beyond a single release dir).
var execs []ArchiveFile
for _, f := range candidates {
    if isLikelyExecutable(f) {
        execs = append(execs, f)
    }
}

There is no depth enforcement — a binary at a/b/c/tool with the executable bit set is picked if none of the path components match nonBinaryPathRE. The ELF validation step is the actual final gate. Either enforce strings.Count(f.Name, "/") <= 1 for the "top-level" case, or update the comment to accurately describe what the filter actually does.


Summary

This is the fourth review pass and the feature is in excellent shape. All medium and high findings from prior rounds are resolved. The only new actionable item above medium is the manifest-before-binaries ordering in Uninstall (Low); the remaining three are style issues. The security model (SSRF per-hop, path sanitization, zip bomb at declared + streaming, ELF validation, atomic manifest, auth + scope gating, rate-limited picker, constant-time checksum compare) is comprehensive and correctly implemented throughout.

Ready to merge once the low-priority items above are addressed at your discretion.

All 4 findings are Low severity:

- github_api.go: replace interface{} with any across the cache type,
  cacheGet return, cacheSet param, and doJSON out param.
- doJSON: rename local `url` to `apiURL` to avoid shadowing the
  "net/url" package import used by GetRelease/ListReleases.
- Uninstall: save the updated manifest BEFORE removing binaries on
  disk. If saveManifest fails we now bail out without leaving a
  manifest entry that still claims binaries which have been deleted
  (a retried Uninstall would otherwise hit ErrPackageNotInstalled
  after the first attempt wiped the files). Disk removal stays
  best-effort and warn-on-error, which matches the idempotent intent.
- pickBinaries: inline comment corrected to reflect actual behavior —
  depth is not enforced; nonBinaryPathRE filter + downstream ELF
  validation are the real gates.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 3m 28s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Check status of Round 4 findings
  • Read current state of key files
  • Review for new issues
  • Post final review

Round 4 Findings — All Resolved

Finding Status
interface{} should be any in github_api.go (Low) FixedreleaseCacheEntry.data, cacheGet, cacheSet, doJSON all use any
doJSON shadows url package import (Low) Fixed — renamed to apiURL with explanatory comment
Uninstall removes binaries before saving manifest (Low) Fixed — manifest saved first (saveManifest at line 533), then binaries removed with warn-and-continue; comment explains the ordering rationale
pickBinaries comment over-promises single-depth (Low, carryover) Fixed — comment now accurately states "Depth is not enforced here — ELF validation in the caller is the final gate before chmod +x"

New Findings

1. handleInstall github fast-path omits installTimeout (Low)

packages.go:130 calls gh.Install(r.Context(), pkg) directly, but InstallSingleDep (dep_installer.go:49) wraps its context with the 5-minute installTimeout before passing it into the same gh.Install. The fast path was added to return the manifest entry without a second disk read — but the timeout wrapping was not carried over.

In practice the HTTP server's WriteTimeout (if set) would terminate the handler, and the request context is cancelled on connection close. But the asymmetry between github: via the API vs github: via InstallSingleDep is a latent consistency issue worth closing:

// packages.go — handleInstall fast path
ctx, cancel := context.WithTimeout(r.Context(), 5*time.Minute)
defer cancel()
entry, err := gh.Install(ctx, pkg)

Fix this →

2. handleInstall github fast-path skips top-level install log (Low)

packages.go:122-138 goes directly to gh.Install() without emitting the slog.Info("skills: installing dep", "dep", dep) that InstallSingleDep (dep_installer.go:52) writes. The installer itself logs errors, but successful github: package installs via the HTTP API have no top-level entry in the operator log — unlike pip/npm packages which go through InstallSingleDep. A one-liner before the gh.Install call closes the gap.

3. cacheSet doesn't enforce a hard cap when all entries are live (Low)

github_api.go:91-99 — when len(c.cache) >= 256, a sweep of expired entries runs. But if all 256 entries are within their 10-minute TTL, no entries are deleted and the new entry is still appended, growing the map to 257 (and eventually higher in bursts). The cap is a soft eviction trigger, not a hard limit. Extremely unlikely to matter at any realistic query rate, but the semantic mismatch between "max entries" and "actual behaviour" is worth a brief comment clarifying it's a sweep threshold rather than a hard ceiling.


What Changed Since Round 4

All four findings from the previous pass are addressed cleanly. The Uninstall ordering fix (saveManifest before file removal) is particularly well done — the comment at lines 528-538 explains both the ordering rationale and the failure semantics clearly.


Summary

This is the fifth review pass. All high and medium findings from every prior round are resolved. The feature is in excellent shape:

  • Security model: SSRF allowlist re-validated per redirect hop, path sanitization covering all traversal vectors, zip bomb at declared+streaming level, ELF magic+class+arch before chmod +x, atomic manifest writes, admin+master-scope auth on install/uninstall, constant-time checksum compare, rate-limited picker with sentinel-mapped errors.
  • Rate limiter: Amortized inline sweep with atomic lastSeen — no goroutine leaks, no data races.
  • Manifest lifecycle: correct saveManifest-before-remove ordering in Uninstall, atomic tmp+rename in saveManifest.
  • Test coverage: parser, asset-select, org allowlist, checksum, archive (path traversal + zip bomb), ELF, SSRF, rate limiter — comprehensive unit surface.

The three remaining findings are all Low. Issue #1 (missing installTimeout on fast path) is the only one with a concretely actionable fix; issues #2 and #3 are observability/documentation notes. Ready to merge once the install timeout is wired in if desired.

All 3 findings are Low severity:

- handleInstall github fast-path now wraps the context with
  skills.InstallTimeout (5 min) before calling gh.Install and emits the
  same "skills: installing dep" / "dep installed" / "github install
  failed" log lines as the generic InstallSingleDep path, so
  operator-observability is identical between github: and pip:/npm:
  install flows.
- installTimeout promoted to exported InstallTimeout so the http layer
  shares the single source of truth rather than duplicating the
  5-minute constant.
- cacheMaxEntries comment clarifies it is a soft sweep trigger, not a
  hard cap — when every entry is still within TTL the map can briefly
  exceed the threshold by one insert.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 4m 12s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Check status of Round 5 findings
  • Read current state of key files
  • Review for new issues
  • Post final review

Round 5 Findings — All Resolved

Finding Status
handleInstall fast-path missing installTimeout (Low) Fixedpackages.go:135 wraps with context.WithTimeout(r.Context(), skills.InstallTimeout)
handleInstall fast-path missing top-level install log (Low) Fixedpackages.go:134 now has slog.Info("skills: installing dep", "dep", pkg)
cacheSet soft cap not documented (Low) Fixedgithub_api.go:83–88 comment now explicitly states "soft sweep trigger, not a hard cap" with rationale

New Findings

1. slog.Warn for "no checksum asset available" is too noisy for common packages (Low)

github_installer.go:427:

} else {
    slog.Warn("github.installer: no checksum asset available", "asset", asset.Name)
}

Many widely-used tools — jq, fzf, fd, ripgrep older releases — don't publish a separate checksum file. This slog.Warn fires on every install of such packages, flooding the warning stream with expected behavior. The genuinely suspicious case (checksum file present but this asset is missing from it) already has its own slog.Warn at line 420. Consider downgrading the "no checksum file found" case to slog.Info or slog.Debug — it's upstream policy, not a problem with this install. Fix this →

2. handleGitHubReleases response includes CDN browser_download_url to viewer-level callers (Low, carried forward)

github_api.go:26–29GitHubAsset.DownloadURL (json:"browser_download_url") is serialised into the matching_assets field of the picker response, which is accessible by any authenticated viewer. The URLs are on allowlisted CDN hosts and are publicly reachable on GitHub anyway, so this is not a security issue. But it's unnecessary surface — the UI only renders rel.matching_assets[0].name (the filename), never the URL. Consider projecting a slimmed-down struct in the releaseDTO (just name and size) rather than embedding the full GitHubAsset type. Minor UX/API hygiene.


What Changed Since Round 5

All three Round 5 findings are closed. The InstallTimeout constant is now exported (dep_installer.go:18const InstallTimeout = 5 * time.Minute) precisely so the HTTP fast path can use the same constant, which is a clean design decision. The logging parity between the fast path and InstallSingleDep is now exact: same log message, same timeout, same error handling shape.


Summary

This is the sixth review pass. After six rounds, no high or medium severity findings remain open. The only remaining items are two low-severity notes above.

The security model is comprehensive and correctly implemented end-to-end:

  • SSRF: HTTPS-only + host allowlist re-validated per redirect hop + literal-IP rejection
  • Path traversal: sanitizePath covers .., absolute, Windows drives, null bytes, backslash normalization, post-clean escape check
  • Zip bomb: declared-size pre-check in extractZip + streaming cap at read time in all three extract paths
  • ELF validation: magic + 64-bit class + arch match before chmod +x; applies to all binaries including named-match candidates
  • Manifest lifecycle: saveManifest-before-file-remove ordering in Uninstall, atomic tmp+Rename in saveManifest
  • Auth: adminAuth + requireMasterScope on install/uninstall; viewer+ on picker with per-user rate limit
  • Checksum: constant-time compare, all soft-failure paths logged, fail-hard on mismatch
  • Cache: any-typed, 256-entry soft eviction, correct %w-wrapped sentinel errors for errors.Is in callers
  • Rate limiter: amortized inline sweep with atomic.Int64 lastSeen — no goroutine leaks, no data races
  • Tests: parser, asset-select, org allowlist, checksum, archive (path traversal + zip bomb), ELF, SSRF, rate limiter — comprehensive unit surface

Ready to merge. The two Low findings above can be addressed at your discretion — neither blocks merge.

Both findings are Low severity (reviewer marked the PR "ready to merge"
already):

- github_installer: "no checksum asset available" downgraded from
  slog.Warn to slog.Info. Many popular upstream releases (jq, fzf,
  older ripgrep, etc.) ship no checksum file at all — that is publisher
  policy, not a problem with the install. The suspicious cases
  (checksum file unreadable, unparseable, or missing this asset) stay
  at Warn so they stand out.
- handleGitHubReleases response now uses a narrow assetPreview DTO
  (name + size_bytes) instead of embedding the full GitHubAsset type
  which also carried browser_download_url. The picker UI never rendered
  the URL; trimming the response keeps the viewer-tier surface minimal.
  UI AssetPreview interface realigned to match.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 3m 29s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Check status of Round 6 findings
  • Read current state of key files
  • Review for new issues
  • Post final review

Round 6 Findings — All Resolved

Finding Status
slog.Warn for "no checksum asset available" too noisy (Low) Fixedgithub_installer.go:432 now uses slog.Info with an explicit comment: "Not a problem with the install — many upstream publishers simply don't ship checksum files (jq, fzf, older ripgrep, etc.). Downgraded from Warn so the suspicious cases stand out cleanly."
handleGitHubReleases response includes CDN browser_download_url to viewer-level callers (Low) Fixedpackages.go:241–252 now projects a deliberate assetPreview struct with only name and size_bytes. Comment at lines 237–240 documents the rationale. The GitHubBinariesSection UI type ReleaseDTO was updated to match.

New Findings

1. GET /v1/packages (viewer+) exposes CDN download URLs for installed GitHub packages (Low)

package_lister.go:25 includes the full GitHubPackageEntry in the list response:

GitHub []GitHubPackageEntry `json:"github,omitempty"`

GitHubPackageEntry contains AssetURL string \json:"asset_url"`(the CDN download URL). This endpoint usesreadAuth(viewer+), so any authenticated viewer can retrieve the CDN URL for every installed GitHub binary viaGET /v1/packages`.

This is inconsistent with the work done in Round 6 to strip CDN URLs from the picker endpoint (assetPreview projection), which was explicitly motivated by reducing the API surface for viewer-level callers. For public repos the URL is not sensitive, but for private repos it exposes download location metadata to viewers.

Fix: either project a viewer-safe struct in ListInstalledPackages (e.g. omit AssetURL / SHA256 / AssetName from the public projection), or tag AssetURL with json:"-" in the entry struct and add a separate AssetURL field for the admin-only install response. Fix this →

2. Install() writes binaries before saving manifest — orphaned files on saveManifest failure (Low)

github_installer.go:459–496 — binaries are written to disk under the lock, then loadManifest() / saveManifest() are called. If saveManifest fails (e.g. disk full, race on temp file rename), the binaries exist on disk at 0755 with no manifest entry:

for _, b := range binaries {
    // write binary 0755 …  ← happens first
}
// loadManifest, update, saveManifest  ← can fail after write

The Uninstall ordering was specifically designed manifest-before-remove (fixed in Round 3) precisely to avoid this class of inconsistency. The Install side has the inverse risk: a retried install self-heals (overwrites the binary + updates manifest), but between the failure and the retry the binary is live on PATH and untracked. Not a security issue (the binary passed ELF validation before being written), but worth noting for operational consistency.


Summary

This is the seventh review pass. Both Round 6 findings are fully resolved. The only actionable item is finding #1 (CDN URL exposure in the list endpoint) — finding #2 is informational only, as the self-healing path on retry is correct and the exposed binary has already passed all validation gates.

The feature's security model (SSRF allowlist per hop, path sanitization, zip bomb at declared+streaming level, ELF validation, atomic manifest writes, admin+master-scope auth on install/uninstall, constant-time checksum compare, rate-limited picker with sentinel-mapped errors, assetPreview projection for viewer-level endpoints) is comprehensive and correctly implemented.

Ready to merge — finding #1 above is Low severity and can be addressed at your discretion.

Narrow the GET /v1/packages GitHub entry to a viewer-safe projection
(repo/tag/binaries/name/installed_at), mirroring the assetPreview fix
from round 6. Strips asset_url, sha256, and asset_name from the list
response — viewer-level callers no longer see CDN download URLs or
checksum metadata for installed packages. UI types realigned; the
removed fields were never rendered.

Finding #2 (install writes binary before manifest save) left as noted —
reviewer confirmed informational only, self-heals on retry, no security
impact since binaries pass ELF validation before being written.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 3m 36s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Check status of Round 7 findings
  • Read current state of key files
  • Review for new issues
  • Post final review

Round 7 Findings — Status

Finding Status
GET /v1/packages exposes CDN download URLs to viewer-level callers (Low) Fixedpackage_lister.go now uses a dedicated GitHubPackageListEntry struct that deliberately omits AssetURL, SHA256, and AssetName. The Go comment at lines 20–24 and the matching TypeScript GitHubPackageEntry interface at github-binaries-section.tsx:16–22 both document why these fields are excluded. Clean and consistent.
Install() writes binaries before saving manifest — orphaned on saveManifest failure (Low, informational) Still present as designed. The Round 7 review correctly marked this informational: a retried install self-heals and the orphaned binary passed all validation gates.

New Finding

1. GitHub secondary rate limit (HTTP 429) not handled in doJSON (Low)

github_api.go:174–197 — the switch on HTTP status codes handles 403 with X-RateLimit-Remaining: 0 as ErrGitHubRateLimited. But GitHub also returns 429 for secondary rate limits (abuse detection, search, unauthenticated burst). A 429 falls into the default branch:

default:
    body, _ := io.ReadAll(io.LimitReader(resp.Body, 1024))
    return fmt.Errorf("github: unexpected status %d: %s", resp.StatusCode, ...)

This does not wrap ErrGitHubRateLimited, so errors.Is(err, ErrGitHubRateLimited) returns false. The picker endpoint's handleGitHubReleases switch (packages.go:224) misses it and falls to the default arm, returning 502 "failed to fetch releases" instead of 429 "github rate limit reached". Misleading to UI users (they see "gateway error" rather than "slow down").

Fix:

// github_api.go — in doJSON switch
case resp.StatusCode == http.StatusTooManyRequests:
    return ErrGitHubRateLimited

Fix this →


What Changed Since Round 7

Round 7's only actionable finding (CDN URL exposure in the list endpoint) is cleanly resolved:

  • GitHubPackageListEntry struct has a clear comment explaining the deliberate omission of sensitive fields
  • The ListInstalledPackages function explicitly constructs the projection entry-by-entry (no accidental field inclusion)
  • The TypeScript interface mirrors the narrowed Go struct exactly, with a matching comment
  • The Uninstall manifest-before-file-remove ordering (fixed in Round 3) remains correct at github_installer.go:531–548

Summary

This is the eighth review pass. After eight rounds, the feature is in excellent shape with only one Low severity finding remaining.

The security model continues to be comprehensive:

  • SSRF: HTTPS-only + host allowlist re-validated per redirect hop + literal-IP rejection
  • Path traversal: sanitizePath covers .., absolute, Windows drives, null bytes, backslash normalization, post-clean escape check
  • Zip bomb: declared-size pre-check in extractZip + LimitReader streaming cap in all three extract paths
  • ELF validation: magic + 64-bit class + arch match before chmod +x
  • Manifest lifecycle: saveManifest-before-file-remove in Uninstall, atomic tmp+Rename in saveManifest
  • Auth: adminAuth + requireMasterScope on install/uninstall; viewer+ on picker with per-user rate limit
  • Checksum: constant-time compare, all soft-failure paths logged, fail-hard on mismatch
  • API surface: viewer-safe projections on both the list endpoint (GitHubPackageListEntry) and the picker endpoint (assetPreview)
  • Rate limiter: amortized inline sweep with atomic.Int64 lastSeen — no goroutine leaks, no data races

Ready to merge. The single Low finding above (HTTP 429 handling) can be addressed at your discretion.

Map HTTP 429 (GitHub secondary rate limits — abuse detection,
unauthenticated bursts, search) to ErrGitHubRateLimited in the API
client so the picker endpoint renders 429 "rate limit reached" with
Retry-After: 60 instead of falling through to 502 "failed to fetch
releases". Primary rate limits (403 + X-RateLimit-Remaining: 0) were
already handled; this covers the secondary class documented at
https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api#about-secondary-rate-limits
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 4m 22s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Check status of Round 8 findings
  • Read current state of key files
  • Review for new issues
  • Post final review

Round 8 Finding — Resolved

Finding Status
GitHub secondary rate limit (HTTP 429) not handled in doJSON (Low) Fixedgithub_api.go:192–197 adds case resp.StatusCode == http.StatusTooManyRequests: return ErrGitHubRateLimited, with a clear comment explaining the 403 vs 429 distinction. handleGitHubReleases now correctly maps both to 429 "github rate limit reached" instead of 502 "failed to fetch releases".

New Findings

1. Checksum files with ./-prefixed filenames silently bypass verification (Very Low)

github_checksum.go:69:

name := strings.TrimSpace(string(m[2]))
out[name] = sha

checksumLineRE is (?m)^([a-fA-F0-9]{64})[ \t]+\*?(\S.*)$. The *? strips the binary-mode prefix, but not a leading ./. If a checksum file contains:

abc123...  ./lazygit_0.42.0_Linux_x86_64.tar.gz

The parsed name is ./lazygit_0.42.0_Linux_x86_64.tar.gz. The lookup sums[asset.Name] (sums["lazygit_0.42.0_Linux_x86_64.tar.gz"]) misses it, falling into the "asset not listed in checksum file" warn-and-continue branch. In practice, sha256sum can emit ./-prefixed names when run from a directory (e.g., sha256sum ./file). Real GitHub release checksums almost never use this notation, so this is not a practical concern — but adding a strings.TrimPrefix(name, "./") in ParseChecksums would close the gap defensively.

2. doJSON API response body has no explicit size cap (Very Low)

github_api.go:205:

if err := json.NewDecoder(resp.Body).Decode(out); err != nil { ... }

The response body is decoded without a MaxBytesReader or size-limited reader. For release listing this is bounded by GitHub's per_page parameter (max 100 via the caller), and for single releases the response is small. Not exploitable in the current call sites — but if doJSON is ever called with a new path that returns a large payload, a server-side response with a very large body could exhaust memory. Wrapping resp.Body in io.LimitReader(resp.Body, someCap) before decoding would be belt-and-braces.


What Changed Since Round 8

Round 8's single actionable finding is resolved. The doJSON 429 fix is clean — both the existing 403+X-RateLimit-Remaining:0 path and the new 429 path correctly return ErrGitHubRateLimited, ensuring handleGitHubReleases maps both onto 429 "github rate limit reached" with a Retry-After: 60 header.


Summary

This is the ninth review pass. After nine rounds, all high and medium findings are resolved. Both new findings above are Very Low severity and do not block merge.

The security model continues to be comprehensive and correctly implemented:

  • SSRF: HTTPS-only + host allowlist re-validated per redirect hop + literal-IP rejection
  • Path traversal: sanitizePath covers .., absolute, Windows drives, null bytes, backslash normalization, post-clean escape check
  • Zip bomb: declared-size pre-check in extractZip + LimitReader streaming cap in all three extract paths (extractTarGz, extractZip, extractRaw)
  • ELF validation: magic + 64-bit class + arch match before chmod +x
  • Manifest lifecycle: saveManifest-before-file-remove in Uninstall, atomic tmp+Rename in saveManifest
  • Auth: adminAuth + requireMasterScope on install/uninstall; viewer+ on picker with per-user rate limit
  • Checksum: constant-time compare, all failure paths logged, fail-hard on mismatch
  • API surface: viewer-safe projections (GitHubPackageListEntry for /v1/packages, assetPreview for the picker) strip CDN URLs and checksums from viewer-level responses
  • Rate limiter: amortized inline sweep with atomic.Int64 lastSeen — no goroutine leaks, no data races
  • Singleton: defaultGitHubInstaller correctly guarded by sync.RWMutex

Ready to merge.

Two defensive hardenings flagged as Very Low:

- ParseChecksums: strip leading `./` from checksum filenames.
  `sha256sum ./file` emits `./file` in the name column; the caller
  looks up by bare asset basename so `./`-prefixed entries would
  silently miss. Real release checksums almost never use this form,
  but the guard is essentially free.

- doJSON: cap response body at 8 MiB via io.LimitReader before JSON
  decode. Current GitHub list/release payloads are well under this
  (~1 MiB at per_page=100). Guards against future call sites or a
  misbehaving upstream returning an oversized document.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Claude finished @mrgoonie's task in 2m 44s —— View job


PR Review: feat(packages): GitHub Releases binary installer

Tasks:

  • Gather PR context
  • Check status of Round 9 findings
  • Read current state of key files
  • Review for new issues
  • Post final review

Round 9 Findings — All Resolved

Finding Status
Checksum files with ./-prefixed filenames silently bypass verification (Very Low) Fixedgithub_checksum.go:73 adds name = strings.TrimPrefix(name, "./") with a clear comment: "sha256sum ./file emits ./file — strip the leading ./ so the lookup matches the bare asset basename."
doJSON API response body has no explicit size cap (Very Low) Fixedgithub_api.go:210–213 adds const maxAPIResponseBytes = 8 * 1024 * 1024 with io.LimitReader(resp.Body, maxAPIResponseBytes) before JSON decode. Comment explains the rationale (100-release list ≈ 1 MiB; 8 MiB is generous belt-and-braces).

New Findings

None.

After reviewing all key files — github_api.go, github_installer.go, github_checksum.go, github_download.go, archive_extract.go, packages_rate_limiter.go, package_lister.go, dep_installer.go, packages.go, github_default_installer.go, cmd/gateway_github_installer.go, and github-binaries-section.tsx — no new issues were found. The codebase is in clean, production-ready shape.


Summary

This is the tenth and final review pass. All findings across all prior rounds have been resolved. The feature is ready to merge.

The security model as implemented:

  • SSRF: HTTPS-only + host allowlist re-validated per redirect hop + literal-IP rejection
  • Path traversal: sanitizePath covers .., absolute, Windows drives, null bytes, backslash normalization, post-clean escape check
  • Zip bomb: declared-size pre-check in extractZip + LimitReader streaming cap in all three extract paths (extractTarGz, extractZip, extractRaw)
  • ELF validation: magic + 64-bit class + arch match before chmod +x
  • Manifest lifecycle: saveManifest-before-file-remove in Uninstall; atomic tmp+Rename in saveManifest; binaries written before manifest in Install is self-healing on retry (known limitation, documented)
  • Auth: adminAuth + requireMasterScope on install/uninstall; viewer+ on picker with per-user rate limit (30/min burst 10)
  • Checksum: constant-time compare; all soft-failure paths logged at Warn; ./-prefixed entries normalised; fail-hard on mismatch
  • API surface: viewer-safe projections on both the list endpoint (GitHubPackageListEntry) and the picker endpoint (assetPreview) — CDN URLs and checksum metadata stripped from viewer-level responses
  • Rate limiter: amortized inline sweep with atomic.Int64 lastSeen — no goroutine leaks, no data races
  • GitHub API: HTTP 429 (secondary rate limit) correctly mapped to ErrGitHubRateLimited; response body capped at 8 MiB; tags URL-encoded via url.PathEscape
  • Cache: 256-entry soft-sweep eviction; ./-prefix normalisation in checksum parser
  • Timeout parity: both handleInstall fast path and InstallSingleDep use skills.InstallTimeout (5 min)

Ready to merge.


`Service.Stop()` closes stopChan but does not wait for the runLoop
goroutine to exit. In the test suite, test A's `defer cs.Stop()` can
return before the spawned runLoop has reached
`ticker := time.NewTicker(runLoopTickInterval)`. If test B then calls
`setFastTick()` to mutate the package-level var, the race detector
correctly flags it:

  Read at runLoopTickInterval by goroutine A (runLoop ticker init)
  Previous write by goroutine B (setFastTick in test B)

Fix: snapshot `runLoopTickInterval` inside `Start()` under the mutex
before spawning the goroutine, and pass the value as a parameter to
`runLoop`. The spawned goroutine no longer reads the package-level
var, so the cross-test window is closed. Production behavior
unchanged.

Verified: `go test -race -count=3 ./internal/cron/...` passes three
times in a row; the CI failure on PR #898 reproduced before the fix
and is gone after.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

@viettranx Hey anh Việt, reminder to review this PR when you get a chance — Zuey asked me to flag it so you can proceed with the package updates.

Quick context summary:

What: Add GitHub Releases binary installer — new runtime package source github:owner/repo[@tag] for installing Linux CLI binaries (lazygit, ripgrep, starship, gh, etc.) directly from GitHub Releases.

Key components:

  • GitHub API client with 10-min cache + rate-limit mapping
  • SSRF-guarded streaming downloader (HTTPS-only, host allowlist re-validated per redirect hop, literal-IP rejection)
  • SHA256 checksum verification (constant-time compare)
  • Archive extraction (tar.gz/zip/raw) with path-traversal + zip-bomb guards
  • ELF magic + 64-bit + arch validation before chmod +x
  • Atomic manifest persistence (tmp + rename)
  • Admin-only API (adminAuth + requireMasterScope), picker endpoint throttled 30 req/min/user
  • Web UI: GitHub Binaries section + release picker modal + musl/glibc warning
  • i18n across en/vi/zh

Review status: 5 rounds of Claude review, all medium/low findings resolved. Builds pass (PG + SQLite), go vet, go test -race, pnpm build all green.

Ready to merge — just needs your approval so Zuey can proceed with the package updates.

P0.1 — UI uninstall 400: parseAndValidatePackage now accepts
  github:<bare-name> (manifest Name form, no owner/repo) in addition
  to the full spec. UI sends github:${pkg.name} from the manifest;
  dispatcher already tolerated bare names — the HTTP validator was
  the only gate rejecting them. Install path re-validates strictly
  via ParseGitHubSpec and bare-name install returns 400 now (was 500).

P1.1 — ExtractArchive raw-ELF fallback name: add ExtractArchiveAs(
  path, fallbackName, max). Installer passes parsed.Repo so raw
  (non-archive) ELF assets no longer end up recorded as
  /tmp/goclaw-gh-asset-XXXX.bin — that basename would leak into the
  manifest Binaries entry and break PATH lookup.

P1.3 — Archive entry count cap: maxArchiveEntries = 10_000 +
  ErrTooManyEntries sentinel. Tar: count ALL headers seen (incl.
  symlinks/dirs we skip) to block the gzip-bomb-of-headers DoS —
  header bytes don't count against maxUncompressed for zero-size
  entries. Zip: pre-check via peekZipEntryCount reads the EOCD
  record manually and rejects oversized archives BEFORE
  zip.OpenReader allocates []*zip.File of declared capacity (this
  was a fresh red-team finding; stdlib would otherwise alloc ~1GB
  for a crafted 200MB zip claiming 4M entries).

P1.4 — Rate-limit install/uninstall: packagesWriteLimiter
  (10/min/user, burst 3). Admin-only mitigates but a compromised
  token could otherwise flood upstream (GitHub/pip/npm) or spam
  manifest mutations.

P1.6 — Non-Linux early reject: ErrUnsupportedOS guard at the top
  of Install(). Windows/macOS hosts no longer waste bandwidth
  fetching a Linux asset just to fail at the ELF machine check.

P1.7 — Manifest fsync: OpenFile → Write → Sync → Close → Rename →
  dir Sync, with tmp cleanup on every error path. POSIX doesn't
  guarantee durability via rename alone; XFS / ext4 with async
  journal can reorder.

P2.1 — Belt-and-suspenders zip runtime break when cumulative bytes
  reach the cap (pre-declared check already covers it but the
  streaming loop now bails immediately).

P2.6 — Binary-name collision warn: slog.Warn when a different repo
  already owns the basename we're about to overwrite. Last-writer-
  wins unchanged; operator now gets a signal instead of silence.

Hardening — rate-limit key: rateLimitKeyFromRequest prefers
  store.UserIDFromContext over the raw X-GoClaw-User-Id header so
  an admin can't rotate the header mid-session to dodge the bucket.
  Header/IP fallback retained for pre-auth / test callers.

Tests: 9 new cases on parseAndValidatePackage (github full/bare/
  empty/traversal/injection/space/leading-hyphen);
  TestExtractArchiveAs_RawELFUsesFallbackName;
  TestExtractTarGz_EntryCountCap + TestExtractZip_EntryCountCap;
  TestPeekZipEntryCount (DoS pre-check path).

Verified: go build ./... && go build -tags sqliteonly ./... &&
  go vet ./... && go test -race ./internal/skills/... ./internal/http/...
@viettranx
Copy link
Copy Markdown
Contributor

Review pass — address P0 blocker + P1/P2 fixes + new DoS vector

Spun up adversarial red-team on this PR, then on my own fixes. Pushed ac9c4561 to the branch.

Fixes applied

Ref Issue Fix
P0.1 UI uninstall always 400 — github:${pkg.name} is a bare binary name (manifest Name), but the HTTP validator forced it through ParseGitHubSpec which requires owner/repo. Uninstall from the dashboard was completely broken. parseAndValidatePackage now accepts bare form via validGitHubBareName. Install path re-validates strictly via ParseGitHubSpec and surfaces it as 400 (was 500).
P1.1 Raw (non-archive) ELF assets leaked /tmp/goclaw-gh-asset-XXXX.bin as the manifest binary name → broken PATH lookup New ExtractArchiveAs(path, fallbackName, max); installer passes parsed.Repo
P1.3 Tar/zip entry DoS — no cap on header count. Zero-size symlink headers don't count against maxUncompressed, so a crafted gzip could push the loop into billions of iterations maxArchiveEntries = 10_000 + ErrTooManyEntries. Tar counts ALL header iterations (incl. skipped types). Zip: see P0-NEW below.
P1.4 Install/uninstall unthrottled — compromised admin token could flood upstream packagesWriteLimiter (10/min/user, burst 3)
P1.6 Non-Linux hosts waste a full asset download before the ELF check fails ErrUnsupportedOS guard at the top of Install()
P1.7 saveManifest skipped fsync — rename alone doesn't guarantee POSIX durability on XFS/ext4 async journals OpenFile → Write → Sync → Close → Rename → dir Sync, tmp cleanup on every error path
P2.1 extractZip streaming loop didn't short-circuit when total >= max Early return ErrZipBomb
P2.6 Silent binary-basename collision across packages slog.Warn when a different repo already owns the basename

New finding from red-team pass (P0-NEW)

zip.OpenReader pre-allocates []*zip.File of declared capacity before my len(zr.File) > maxArchiveEntries check ever runs. Worst case: 200MB zip declaring ~4.3M entries → ~1GB heap alloc per install request. Admin-only + allowlist mitigate probability, but a compromised allowlisted repo could one-shot OOM a multi-tenant gateway.

Fix: peekZipEntryCount reads the EOCD record directly (0x06054b50 + 2-byte total entries), rejects oversized archives before zip.OpenReader. ZIP64 (count = 0xFFFF) falls through to the stdlib parser which has its own sanity checks, plus the post-open recheck.

Hardening — rate-limit key

rateLimitKeyFromRequest now prefers store.UserIDFromContext over the raw X-GoClaw-User-Id header. An admin session can no longer rotate the header mid-session to dodge the per-user bucket. Header + IP fallback retained for pre-auth / test callers.

Tests

  • 9 new parseAndValidatePackage cases: full spec, bare name, empty, traversal (github:../evil), injection (;rm -rf), space, leading hyphen
  • TestExtractArchiveAs_RawELFUsesFallbackName
  • TestExtractTarGz_EntryCountCap + TestExtractZip_EntryCountCap
  • TestPeekZipEntryCount (covers small zip exact count + non-zip graceful fallback)
go build ./... && go build -tags sqliteonly ./...   ✓
go vet ./...                                         ✓
go test -race -count=1 ./internal/skills/... ./internal/http/...  ✓

Carry-over items NOT addressed (intentional)

Ref Reason
P1.2 checksum strict-mode env flag Design decision; current fail-open is documented. Follow-up PR if policy changes.
P1.5 edition gating (Lite desktop wiring) Broader scope; ErrUnsupportedOS covers the immediate runtime error. Separate PR.
P1.8 TestDownloadAsset_MaxSize doesn't test overflow Pre-existing; needs SSRF-bypass test infra refactor.
P1.9 cron race-fix commits bundled Can't untangle in-branch; worth noting in final description.
P1.10 asset selection tie-break determinism Observational; slog.Info already logs picked asset name.

One thing to decide

packagesWriteLimiter = 10/min/user, burst 3. Adequate for interactive admin flows. If onboarding ever scripts 20+ sequential installs, this slows it to ~2 min. Happy to widen to 30/min/burst-10 (parity with the picker) if that's the prevailing use case.

@viettranx viettranx merged commit 2cbf838 into dev Apr 16, 2026
1 of 2 checks passed
@viettranx viettranx deleted the feat/github-releases-installer branch April 16, 2026 08:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhancement] Support CLI installation from GitHub Releases/URL + Unified credentials management

2 participants