
Sync dev to main #167

Closed
tis24dev wants to merge 18 commits into main from dev

Conversation

@tis24dev
Owner

@tis24dev tis24dev commented Mar 10, 2026

  • test(orchestrator): maximize coverage for network_apply
  • ci: bump the actions-updates group across 1 directory with 2 updates (#164)
  • backup/pve: clarify datastore skip logs
  • feat(webhook): optional Discord content fallback for embed-only messages
  • Revert "feat(webhook): optional Discord content fallback for embed-only messages"
  • Update go.mod
  • Harden restore decisions against untrusted backup metadata
  • Enforce staged backup checksum verification in restore/decrypt flows
  • fix(orchestrator): preserve staged network symlinks
  • fix(orchestrator): harden restore path validation against broken symlink escapes
  • fix(backup): sanitize datastore and storage path segments
  • fix(io): bound timed fs and input goroutine buildup
  • fix(orchestrator): preserve operational restore errors during path validation

Summary by CodeRabbit

  • New Features

    • Archive analysis to infer backup type, cluster payload, and hostname; filesystem-based namespace discovery.
    • Timeout-limited, concurrency-limited filesystem operations.
  • Bug Fixes

    • Restore safety hardened to reject symlink escapes and enforce root-bound extraction.
    • Safer handling of unsafe datastore/storage names to avoid collisions and leaking raw paths.
    • Skip raw artifacts when checksum verification is unavailable.
  • Improvements

    • Staged-archive integrity checks with normalized checksum validation.
    • Consolidated prepare/decrypt flows and coalesced concurrent input reads.
  • Chores

    • Updated Go toolchain and CI release workflow.
  • Tests

    • Large expansion of tests covering restore, network, PBS, path-security, integrity, and concurrency.

tis24dev and others added 13 commits February 27, 2026 15:30
- Add network_apply_additional_test.go with extensive tests for rollback (arm/disarm), NIC repair CLI (overrides/conflicts), snapshot IP parsing, command selection, and error paths.
- Use FakeFS/FakeCommandRunner plus PATH stubs to deterministically exercise both success and failure branches.
- Bring network_apply.go coverage to ~99% with no production code changes.
…164)

Bumps the actions-updates group with 2 updates in the / directory: [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) and [actions/attest-build-provenance](https://github.com/actions/attest-build-provenance).


Updates `goreleaser/goreleaser-action` from 6 to 7
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](goreleaser/goreleaser-action@v6...v7)

Updates `actions/attest-build-provenance` from 3 to 4
- [Release notes](https://github.com/actions/attest-build-provenance/releases)
- [Changelog](https://github.com/actions/attest-build-provenance/blob/main/RELEASE.md)
- [Commits](actions/attest-build-provenance@v3...v4)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions-updates
- dependency-name: actions/attest-build-provenance
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions-updates
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Make PVE datastore skip messages user-friendly: log disabled storages as SKIP,
offline storages as actionable WARNING, and always emit DEBUG skip details with
active/enabled/status and an explicit reason.
Fix detected datastores report header to match output column order.
Extend unit tests to cover runtime flag parsing and the new skip logging, and
make Makefile coverage targets honor the go.mod toolchain via GOTOOLCHAIN.
Adds per-endpoint opt-in Discord content text (WEBHOOK_<ENDPOINT>_DISCORD_CONTENT_ENABLED/..._DISCORD_CONTENT) to avoid “empty” messages when embeds aren’t rendered. Default remains unchanged (embed-only). Includes 2000-char truncation, docs/template updates, and tests.
Derive restore compatibility, cluster gating, and hostname warnings from archive-backed facts instead of candidate manifest metadata. Add restore archive analysis for internal backup metadata and category-based fallback, update restore planning and compatibility checks to consume trusted decision inputs, and add regression tests covering manifest spoofing for backup type and cluster mode.
Require checksum verification for staged backup archives before any restore or decrypt operation proceeds. Add strict SHA256 parsing and normalization, enforce checksum source agreement between sidecar and manifest metadata, reject raw backup candidates with no verifiable integrity source, centralize shared prepare-path verification for bundle and raw inputs, preserve source artifact checksum separately from the prepared plain archive checksum, and update regression tests to cover missing, mismatched, and legacy checksum scenarios.
Use Lstat in network staged apply so symlink entries are handled as symlinks instead of being dereferenced and copied as regular file contents. Reject symlinked staged network directories, extend the FS abstraction with Lstat, and add regression tests covering preserved symlinks and invalid symlinked stage roots.
…ink escapes

Replace the restore path safety checks with a shared FS-aware resolver that walks paths component by component using Lstat and Readlink, correctly handling missing tails and rejecting intermediate symlink escapes. Apply the new validation to regular restore, safety restore, symlink and hardlink handling, and add regression tests for broken escaping symlinks, symlink loops, and symlinked restore roots.
Introduce a shared path-safe key for PBS datastore names and PVE storage names when building collector output paths, preventing path traversal and filename collisions from raw config values. Keep raw names unchanged in metadata and command invocations, update all affected PBS/PVE collector call sites, and add regression tests covering unsafe names while preserving the existing layout for already-safe names.
Prevent unbounded goroutine accumulation in timeout/cancel wrappers by limiting in-flight safefs operations and reusing a single in-flight read per reader/file descriptor in the input package. Keep public APIs unchanged, preserve current timeout semantics, and add regression coverage for limiter saturation, repeated timeouts, and race-safe cleanup of timed operations.
…lidation

Teach the restore path resolver to distinguish security violations from local operational failures, so only real root-escape conditions are reported as illegal paths. Keep broken-symlink and traversal hardening intact, allow in-root permission and ENOTDIR failures to surface through the normal restore flow, and add regression tests covering the new security vs operational error classification.
@github-actions

github-actions bot commented Mar 10, 2026

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 2 package(s) with unknown licenses.
See the Details below.

License Issues

.github/workflows/release.yml

Package                          Version  License  Issue Type
actions/attest-build-provenance  4.*.*    Null     Unknown License
goreleaser/goreleaser-action     7.*.*    Null     Unknown License
Allowed Licenses: MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, LicenseRef-scancode-google-patent-license-golang

OpenSSF Scorecard

Package                          Version  Score
actions/attest-build-provenance  4.*.*    Unknown
goreleaser/goreleaser-action     7.*.*    🟢 5

Details

Check                Score  Reason
Code-Review          🟢 3   Found 5/16 approved changesets -- score normalized to 3
Binary-Artifacts     🟢 10  no binaries found in the repo
Maintained           🟢 10  16 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 10
Dangerous-Workflow   🟢 10  no dangerous workflow patterns detected
Packaging            ⚠️ -1  packaging workflow not detected
CII-Best-Practices   ⚠️ 0   no effort to earn an OpenSSF best practices badge detected
Token-Permissions    ⚠️ 0   detected GitHub workflow tokens with excessive permissions
License              🟢 10  license file detected
Fuzzing              ⚠️ 0   project is not fuzzed
Signed-Releases      ⚠️ -1  no releases found
Branch-Protection    ⚠️ -1  internal error during branchesHandler.setup: some github tokens can't read classic branch protection rules: https://github.com/ossf/scorecard-action/blob/main/docs/authentication/fine-grained-auth-token.md
Security-Policy      ⚠️ 0   security policy file not detected
SAST                 ⚠️ 0   SAST tool is not run on all commits -- score normalized to 0
Pinned-Dependencies  🟢 5   dependency not pinned by hash detected -- score normalized to 5

Scanned Files

  • .github/workflows/release.yml

@coderabbitai

coderabbitai bot commented Mar 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds root-bound path resolution and symlink safety, staged-archive integrity verification and unified bundle preparation, deterministic datastore path-keying, coalesced input reads, filesystem operation limiting, symlink-aware staged file application, many tests, and small CI/tooling bumps.

Changes

Cohort / File(s) — Summary
Build / Tooling
/.github/workflows/release.yml, Makefile, go.mod
Bump GitHub actions versions; add Makefile toolchain extraction and COVER_GOTOOLCHAIN usage; upgrade Go toolchain to go1.25.8.
Checksum utils & tests
internal/backup/checksum.go, internal/backup/checksum_test.go, internal/backup/checksum_legacy_test.go
Add NormalizeChecksum and ParseChecksumData; Normalize used in verification and legacy loader; tests use computed expected SHA values.
Collector keying & PBS model
internal/backup/collector.go, internal/backup/collector_helpers_extra_test.go, internal/backup/collector_pbs_datastore.go, internal/backup/...collector_pbs*.go, internal/backup/collector_pbs_datastore_inventory.go
Introduce collectorPathKey and deterministic pbsDatastore fields (Source, CLIName, NormalizedPath, OutputKey); use path-keyed filenames and merging; add collision/override tests.
PVE storage updates & tests
internal/backup/collector_pve.go, internal/backup/collector_pve_test.go, internal/backup/collector_pve_parse_test.go
Switch to storage.pathKey() for file paths and reports; improve skip/log details; extend parsing tests for runtime fields.
PBS staged apply & inventory tests
internal/orchestrator/pbs_staged_apply.go, internal/orchestrator/pbs_staged_apply_additional_test.go, internal/backup/collector_pbs_commands_coverage_test.go
Extend staged inventory with Origin/CLIName; ignore override-origin entries when loading datastore.cfg; tests ensure overrides omitted and collision handling.
Namespaces helper
internal/pbs/namespaces.go, internal/pbs/namespaces_test.go
Export DiscoverNamespacesFromFilesystem wrapper and add test.
Input concurrency
internal/input/input.go, internal/input/input_test.go
Coalesce concurrent ReadLine/ReadPassword via per-reader/FD in-flight state (sync.Map + single-flight goroutine); normalize context errors; add blocking-read tests.
Path-security core & restore integration
internal/orchestrator/path_security.go, internal/orchestrator/restore.go, internal/orchestrator/backup_safety.go, related tests
Add root-bound resolver with symlink hop limits and error kinds; replace ad-hoc resolution with FS-aware helpers across extraction and safety code; extensive symlink/escape tests.
FS Lstat & fake FS updates
internal/orchestrator/deps.go, internal/orchestrator/deps_test.go, internal/orchestrator/restore_errors_test.go
Add Lstat to FS interface and implementations (osFS, FakeFS, ErrorInjectingFS) to support symlink-aware resolution.
Staged integrity & bundle preparation
internal/orchestrator/decrypt_integrity.go, internal/orchestrator/decrypt_prepare_common.go, internal/orchestrator/decrypt_*.go, internal/orchestrator/decrypt_integrity_test.go, helpers
Resolve expected checksums from sidecar or manifest; verify staged archives; centralize preparePlainBundleCommon (staging, integrity, optional decryption); preparedBundle adds SourceChecksum; tests added.
Restore archive analysis & decision
internal/orchestrator/restore_decision.go, internal/orchestrator/restore_decision_test.go, internal/orchestrator/selective.go
Introduce AnalyzeRestoreArchive returning categories and RestoreDecisionInfo (BackupType, ClusterPayload, Hostname, Source); selective analysis delegates to it; tests added.
Restore workflow & compatibility
internal/orchestrator/restore_workflow_*.go, internal/orchestrator/compatibility.go, internal/orchestrator/compatibility_test.go, internal/orchestrator/restore_plan.go
PlanRestore now accepts clusterBackup bool; ValidateCompatibility accepts SystemType args via parseSystemTypeString; restore UI/workflow uses AnalyzeRestoreArchive and RestoreDecisionInfo; tests updated.
Network staged apply & symlink handling
internal/orchestrator/network_staged_apply.go, internal/orchestrator/network_staged_apply_test.go
Use Lstat to detect symlinks; add copySymlinkOverlay for relative symlink preservation; reject directory-stage symlinks; tests ensure preservation and rejection semantics.
Network apply tests
internal/orchestrator/network_apply_additional_test.go
Add extensive tests for network apply/rollback flows, markers, timers, NIC mapping repairs, and many edge cases.
Filesystem operation limiting
internal/safefs/safefs.go, internal/safefs/safefs_test.go
Add operation limiter, runLimited helper, ErrTimeout/TimeoutError and context-aware Stat/ReadDir/Statfs with timeout semantics; tests for limiter capacity and behavior.
Path-security tests
internal/orchestrator/path_security_test.go, related restore tests
Comprehensive tests for symlink loops, broken-intermediate symlink escapes, and operational vs security error classification.
Misc docs/templates/tests
docs/*, internal/config/templates/backup.env, various test helpers
Documentation clarifications for PBS_DATASTORE_PATH and template wording; tests updated to use checksum helpers and generated archive bytes; new test helpers added.

Sequence Diagram(s)

sequenceDiagram
    participant Client as AnalyzeRestoreArchive
    participant FS as restoreFS
    participant Reader as Tar Reader
    participant Parser as Metadata Parser
    participant Analyzer as Decision Builder

    Client->>+FS: Open archive path
    FS-->>-Client: Reader
    Client->>+Reader: Decompress & iterate entries
    Reader->>+Parser: Send each entry path & contents
    Parser->>Parser: Collect paths, detect metadata entry
    alt Metadata present
        Parser->>Parser: Parse BACKUP_TYPE / HOSTNAME / CLUSTER_MODE
    end
    Parser-->>-Reader: Completed facts
    Reader-->>-Client: Return categories & metadata
    Client->>+Analyzer: Build RestoreDecisionInfo
    Analyzer->>Analyzer: Derive BackupType, ClusterPayload, Hostname, Source
    Analyzer-->>-Client: Return RestoreDecisionInfo
sequenceDiagram
    participant Orch as preparePlainBundleCommon
    participant Rclone as Rclone Handler
    participant Staging as Staging FS
    participant Integrity as Integrity Checker
    participant Decrypt as Decryption Callback
    participant Manifest as Manifest Updater

    Orch->>Orch: Validate inputs
    alt rclone source
        Orch->>+Rclone: Download bundle
        Rclone-->>-Orch: Local bundle path
    end
    Orch->>+Staging: Stage bundle (extract or copy)
    Staging-->>-Orch: Staged archive path
    Orch->>+Integrity: Resolve expected checksum (sidecar/manifest)
    Integrity-->>-Orch: Expected checksum or error
    Orch->>+Integrity: Verify staged archive checksum
    Integrity-->>-Orch: Verified
    alt encrypted
        Orch->>+Decrypt: Decrypt to plain archive (callback)
        Decrypt-->>-Orch: Plain archive path
    else unencrypted
        Orch->>Orch: Copy staged to plain path
    end
    Orch->>+Manifest: Compute size & checksum, update manifest fields
    Manifest-->>-Orch: PreparedBundle with SourceChecksum
    Orch-->>Caller: Return prepared bundle + cleanup

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~110 minutes

Poem

🐰 I hop through paths where symlinks used to stray,
Checksums hum tidy, every byte in play,
Concurrency corralled, timeouts kept in line,
Datastores keyed safe, staged bundles all shine,
A rabbit tips its hat — the changes are fine!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage — ⚠️ Warning
  Explanation: Docstring coverage is 11.07%, below the required threshold of 80.00%.
  Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Description Check — ✅ Passed
  Check skipped — CodeRabbit's high-level summary is enabled.
Title Check — ✅ Passed
  The title 'Sync dev to main' accurately reflects the primary purpose of this pull request, which is a routine merge of accumulated development changes into the main branch.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/orchestrator/restore_workflow_ui.go (1)

89-98: ⚠️ Potential issue | 🟡 Minor

SystemTypeUnknown now bypasses the compatibility warning.

Line 89 relies entirely on decisionInfo.BackupType, but ValidateCompatibility() returns nil for SystemTypeUnknown and expects the caller to surface the warning. In this flow, restores whose source type cannot be inferred now proceed without telling the user that compatibility could not be validated.

Suggested fix
-	if warn := ValidateCompatibility(systemType, decisionInfo.BackupType); warn != nil {
+	if decisionInfo.BackupType == SystemTypeUnknown {
+		warn := fmt.Errorf("warning: could not determine backup system type from archive contents; compatibility could not be validated")
+		logger.Warning("Compatibility check: %v", warn)
+		proceed, perr := ui.ConfirmCompatibility(ctx, warn)
+		if perr != nil {
+			return perr
+		}
+		if !proceed {
+			return ErrRestoreAborted
+		}
+	} else if warn := ValidateCompatibility(systemType, decisionInfo.BackupType); warn != nil {
 		logger.Warning("Compatibility check: %v", warn)
 		proceed, perr := ui.ConfirmCompatibility(ctx, warn)
 		if perr != nil {
 			return perr
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/orchestrator/restore_workflow_ui.go` around lines 89 - 98,
ValidateCompatibility returning nil for SystemTypeUnknown means we must still
surface a warning when systemType == SystemTypeUnknown; change the block around
ValidateCompatibility(decisionInfo.BackupType) to also check for systemType ==
SystemTypeUnknown and in that case set a non-nil warning message (e.g., "source
system type unknown; compatibility could not be validated") so the existing flow
(logger.Warning(...); ui.ConfirmCompatibility(ctx, warn); handle perr and
!proceed -> return ErrRestoreAborted) runs for both real warnings and the
unknown-system case; update references to ValidateCompatibility, systemType,
decisionInfo.BackupType, logger.Warning, ui.ConfirmCompatibility, and
ErrRestoreAborted accordingly.
🧹 Nitpick comments (10)
internal/orchestrator/compatibility_test.go (1)

8-21: Consider removing the now-unnecessary filesystem setup.

With the updated ValidateCompatibility signature accepting explicit SystemType parameters, the fake filesystem setup (lines 12-17 creating /etc/pve) is no longer needed for this test. The validation is now driven entirely by the passed arguments, not by filesystem detection.

♻️ Suggested simplification
 func TestValidateCompatibility_Mismatch(t *testing.T) {
-	orig := compatFS
-	defer func() { compatFS = orig }()
-
-	fake := NewFakeFS()
-	defer func() { _ = os.RemoveAll(fake.Root) }()
-	compatFS = fake
-	if err := os.MkdirAll(fake.onDisk("/etc/pve"), 0o755); err != nil {
-		t.Fatalf("mkdir: %v", err)
-	}
-
 	if err := ValidateCompatibility(SystemTypePVE, SystemTypePBS); err == nil {
 		t.Fatalf("expected incompatibility error")
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/orchestrator/compatibility_test.go` around lines 8 - 21, The test
TestValidateCompatibility_Mismatch still sets up a fake filesystem (compatFS =
fake, NewFakeFS(), os.MkdirAll(fake.onDisk("/etc/pve"))) even though
ValidateCompatibility now takes explicit SystemType parameters; remove the
filesystem setup and cleanup (compatFS assignment, NewFakeFS(), os.RemoveAll,
and mkdir calls) so the test simply calls ValidateCompatibility(SystemTypePVE,
SystemTypePBS) and asserts an error—leave the function name
TestValidateCompatibility_Mismatch and the call to ValidateCompatibility
unchanged.
internal/safefs/safefs.go (2)

138-156: effectiveTimeout is called twice, potentially producing inconsistent values.

The TimeoutError is constructed with effectiveTimeout(ctx, timeout) before runLimited is called, but runLimited recalculates it internally (line 102). If the context deadline is close, time elapsed between the two calls could cause different results—the error's Timeout field may not match the actual timeout used.

Consider computing the effective timeout once and passing it through, or have runLimited construct the TimeoutError internally:

♻️ Proposed fix: compute effective timeout once
 func Stat(ctx context.Context, path string, timeout time.Duration) (fs.FileInfo, error) {
-	return runLimited(ctx, timeout, &TimeoutError{Op: "stat", Path: path, Timeout: effectiveTimeout(ctx, timeout)}, func() (fs.FileInfo, error) {
+	eff := effectiveTimeout(ctx, timeout)
+	return runLimited(ctx, eff, &TimeoutError{Op: "stat", Path: path, Timeout: eff}, func() (fs.FileInfo, error) {
 		return osStat(path)
 	})
 }

 func ReadDir(ctx context.Context, path string, timeout time.Duration) ([]os.DirEntry, error) {
-	return runLimited(ctx, timeout, &TimeoutError{Op: "readdir", Path: path, Timeout: effectiveTimeout(ctx, timeout)}, func() ([]os.DirEntry, error) {
+	eff := effectiveTimeout(ctx, timeout)
+	return runLimited(ctx, eff, &TimeoutError{Op: "readdir", Path: path, Timeout: eff}, func() ([]os.DirEntry, error) {
 		return osReadDir(path)
 	})
 }

 func Statfs(ctx context.Context, path string, timeout time.Duration) (syscall.Statfs_t, error) {
-	return runLimited(ctx, timeout, &TimeoutError{Op: "statfs", Path: path, Timeout: effectiveTimeout(ctx, timeout)}, func() (syscall.Statfs_t, error) {
+	eff := effectiveTimeout(ctx, timeout)
+	return runLimited(ctx, eff, &TimeoutError{Op: "statfs", Path: path, Timeout: eff}, func() (syscall.Statfs_t, error) {
 		var stat syscall.Statfs_t
 		err := syscallStatfs(path, &stat)
 		return stat, err
 	})
 }

Note: You would also need to adjust runLimited to skip the internal effectiveTimeout call when the passed timeout is already effective (or pass a flag/use a different signature).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/safefs/safefs.go` around lines 138 - 156, The effective timeout is
computed twice causing potential mismatch between the TimeoutError.Timeout and
the timeout used in runLimited; compute the effective timeout once (e.g. t :=
effectiveTimeout(ctx, timeout)) and use that t when constructing the
TimeoutError in Stat/ReadDir/Statfs and pass that same t into runLimited (or
change runLimited to accept an already-resolved timeout and stop calling
effectiveTimeout internally); update runLimited's signature/usage accordingly so
it no longer recalculates effectiveTimeout and the TimeoutError.Timeout always
matches the actual timeout used.

86-91: The default case in release() silently ignores unbalanced releases.

If release() is called without a matching acquire() (programmer error), this silently does nothing rather than failing fast. While the controlled usage in runLimited shouldn't trigger this, the default case could mask bugs during future refactoring.

Consider removing the default case (blocking receive is correct for matched acquire/release), or adding a log/metric for the unexpected path:

Option 1: Remove default (fail-fast on bugs)
 func (l *operationLimiter) release() {
-	select {
-	case <-l.slots:
-	default:
-	}
+	<-l.slots
 }
Option 2: Log unexpected condition
 func (l *operationLimiter) release() {
 	select {
 	case <-l.slots:
 	default:
+		// This indicates a programming error - release without acquire
+		// Log or metric here if observability is available
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/safefs/safefs.go` around lines 86 - 91, The release method currently
swallows unbalanced releases via a default case; change operationLimiter.release
so it performs a blocking receive from l.slots (remove the default case) so an
unexpected release without a matching acquire will block/fail-fast and surface
the bug; locate the method named release on type operationLimiter (and related
acquire/runLimited usage) to make this change — alternatively, if you prefer
non-blocking behavior, replace the default branch with a log/metric emitting the
unexpected condition instead.
Makefile (1)

4-5: Centralize the toolchain-prefixed go command.

This works, but the env prefix is now repeated in every coverage command and is easy to miss in future targets. Extracting a reusable GO/COVER_GO variable keeps the toolchain logic in one place and makes it simpler to apply consistently.

♻️ Possible cleanup
 TOOLCHAIN_FROM_MOD := $(shell awk '/^toolchain /{print $$2}' go.mod 2>/dev/null)
 COVER_GOTOOLCHAIN := $(if $(TOOLCHAIN_FROM_MOD),$(TOOLCHAIN_FROM_MOD)+auto,auto)
+GO := GOTOOLCHAIN=$(COVER_GOTOOLCHAIN) go
@@
-	GOTOOLCHAIN=$(COVER_GOTOOLCHAIN) go test -coverprofile=coverage.out ./...
-	GOTOOLCHAIN=$(COVER_GOTOOLCHAIN) go tool cover -html=coverage.out
+	$(GO) test -coverprofile=coverage.out ./...
+	$(GO) tool cover -html=coverage.out
@@
-	@GOTOOLCHAIN=$(COVER_GOTOOLCHAIN) go test -coverpkg=./... -coverprofile=coverage.out ./...
-	@GOTOOLCHAIN=$(COVER_GOTOOLCHAIN) go tool cover -func=coverage.out | tail -n 1
+	@$(GO) test -coverpkg=./... -coverprofile=coverage.out ./...
+	@$(GO) tool cover -func=coverage.out | tail -n 1
@@
-	@GOTOOLCHAIN=$(COVER_GOTOOLCHAIN) go test -coverpkg=./... -coverprofile=coverage.out ./...
-	@total=$$(GOTOOLCHAIN=$(COVER_GOTOOLCHAIN) go tool cover -func=coverage.out | grep total: | awk '{print $$3}' | sed 's/%//'); \
+	@$(GO) test -coverpkg=./... -coverprofile=coverage.out ./...
+	@total=$$($(GO) tool cover -func=coverage.out | awk '/^total:/ { gsub(/%/,"",$$3); print $$3 }'); \

Also applies to: 65-78

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 4 - 5, The Makefile repeats the toolchain-prefixed go
invocation logic across targets (TOOLCHAIN_FROM_MOD and COVER_GOTOOLCHAIN);
create a single reusable variable (e.g., GO or COVER_GO) that evaluates
TOOLCHAIN_FROM_MOD once and yields either "$(TOOLCHAIN_FROM_MOD)go" or "go" (and
similarly for coverage: COVER_GO using COVER_GOTOOLCHAIN), then replace all
direct uses of the current COVER_GOTOOLCHAIN+auto/auto expressions in coverage
and other targets with the new GO/COVER_GO variable so the toolchain prefix
logic is centralized and consistent (update references to TOOLCHAIN_FROM_MOD and
COVER_GOTOOLCHAIN to use the new variable).
.github/workflows/release.yml (1)

72-79: Pin the GoReleaser binary to a specific version instead of latest.

The action major is now fixed, but version: latest still makes this release job drift over time. A new upstream GoReleaser release can change defaults or artifact naming without any repo change here, which also makes the provenance step harder to reason about. The official documentation supports pinning to specific versions (e.g., version: v0.180.0).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 72 - 79, The workflow step "Run
GoReleaser" currently sets with: version: latest; change this to pin a concrete
GoReleaser tag (for example version: v0.180.0) so the
goreleaser/goreleaser-action@v7 invocation uses a fixed binary version; update
the version field in that step (and any other occurrences) to the chosen
semantic tag and commit the change so releases are reproducible and won't drift
when upstream publishes new releases.
internal/input/input.go (1)

47-50: Evict finished entries from the global state caches.

lineStates / passwordStates only grow. In particular, every distinct *bufio.Reader stays reachable from lineStates, so short-lived readers and their buffers cannot be garbage-collected after the prompt finishes. Once the consumed in-flight operation is cleared, delete the cache entry too.

Also applies to: 88-104

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/input/input.go` around lines 47 - 50, The global caches lineStates
and passwordStates are never pruned, causing retained references to short-lived
*bufio.Reader keys; after the in-flight prompt/operation completes (where you
currently clear the operation state), also remove its entry from the caches by
calling lineStates.Delete(readerKey) and passwordStates.Delete(readerKey) so
finished readers can be GC'd; locate the places that finish/resolve the prompt
(the code that currently removes the in-flight marker) and add the Delete calls
for the same key there.
internal/orchestrator/backup_sources_test.go (1)

232-239: Back this digest with concrete archive bytes.

SHA256 here is derived from a label string rather than bytes representing the raw archive. Using a tiny archive fixture for the digest would make the integrity precondition explicit and keep this test aligned if raw-candidate discovery starts validating manifest/source agreement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/orchestrator/backup_sources_test.go` around lines 232 - 239, The
manifest in the test uses SHA256 computed from a label string instead of the
actual archive bytes; replace this with a concrete small archive fixture (e.g.,
create a tiny tar/xz byte slice or load a test fixture bytes value) and compute
the digest from those bytes so the manifest.SHA256 matches the real archive
content used in the test; update the test to use
checksumHexForBytes(archiveBytes) (or the helper that computes the SHA256) and
ensure the ArchivePath/manifest variable refers to that same fixture so
manifest/source agreement is explicit and verifiable.
internal/orchestrator/network_staged_apply_test.go (1)

36-58: Use a relative target in this symlink-preservation test.

This expectation currently locks in /etc/network/interfaces -> /stage/..., so the test still passes even if the applied link becomes unusable once the staging tree is removed. A relative target exercises the same preservation path without depending on stageRoot surviving after apply.

♻️ Suggested test update
-	if err := fakeFS.Symlink("/stage/etc/network/interfaces.real", "/stage/etc/network/interfaces"); err != nil {
+	if err := fakeFS.Symlink("interfaces.real", "/stage/etc/network/interfaces"); err != nil {
 		t.Fatalf("create staged symlink: %v", err)
 	}
@@
-	if gotTarget != "/stage/etc/network/interfaces.real" {
-		t.Fatalf("symlink target=%q, want %q", gotTarget, "/stage/etc/network/interfaces.real")
+	if gotTarget != "interfaces.real" {
+		t.Fatalf("symlink target=%q, want %q", gotTarget, "interfaces.real")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/orchestrator/network_staged_apply_test.go` around lines 36 - 58, The
test creates and asserts an absolute symlink target which masks regressions when
the stage tree is removed; update the staged symlink creation and expectation to
use a relative target instead. In the test around
fakeFS.Symlink("/stage/etc/network/interfaces.real",
"/stage/etc/network/interfaces") and the Readlink assertion after calling
applyNetworkFilesFromStage(newTestLogger(), "/stage"), create the symlink with a
relative target (e.g. "../etc/network/interfaces.real" from the /stage/etc path)
and assert that fakeFS.Readlink("/etc/network/interfaces") returns that relative
path; keep the same checks using Lstat and os.ModeSymlink to ensure the link is
preserved.
internal/orchestrator/restore_decision.go (1)

115-152: Unusual but intentional dual-error return pattern.

The function returns ([]string, *restoreDecisionMetadata, error, error) where the third value is a recoverable metadata parse error and the fourth is a fatal archive read error. While unconventional, this design allows callers to continue with available data when metadata parsing fails while still catching critical errors.

Consider adding a doc comment to clarify the semantics:

// collectRestoreArchiveFacts iterates the tar and collects entry paths plus optional metadata.
// Returns (paths, metadata, metadataParseErr, fatalErr) where metadataParseErr is non-fatal.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/orchestrator/restore_decision.go` around lines 115 - 152, Add a doc
comment above collectRestoreArchiveFacts explaining the unusual dual-error
return semantics: describe that it returns (archivePaths []string, metadata
*restoreDecisionMetadata, metadataParseErr error, fatalErr error) where
metadataParseErr is a non-fatal error produced by parseRestoreDecisionMetadata
when metadata cannot be parsed but iteration can continue, and fatalErr
represents terminal read errors (e.g., tarReader.Next()/io.ReadAll failures);
reference isRestoreDecisionMetadataEntry and parseRestoreDecisionMetadata in the
comment so callers understand how metadata is detected and parsed.
internal/orchestrator/decrypt_prepare_common.go (1)

34-55: Consider using os.TempDir() instead of hardcoded /tmp.

The hardcoded /tmp path may not work correctly on all systems (e.g., macOS uses /private/tmp, Windows uses %TEMP%). Consider using os.TempDir() for portability.

♻️ Suggested change for portability
-	tempRoot := filepath.Join("/tmp", "proxsave")
+	tempRoot := filepath.Join(os.TempDir(), "proxsave")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/orchestrator/decrypt_prepare_common.go` around lines 34 - 55, The
code builds tempRoot with a hardcoded "/tmp"; change it to use os.TempDir()
(e.g., tempRoot := filepath.Join(os.TempDir(), "proxsave")) so
restoreFS.MkdirAll and restoreFS.MkdirTemp operate on the platform temp
directory; update imports to include "os" if missing and keep existing
permission (0o755) and cleanup behavior in the cleanup closure that calls
restoreFS.RemoveAll(workDir) and rcloneCleanup when present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/backup/collector_pbs_datastore.go`:
- Around line 25-27: The pathKey() method currently returns
collectorPathKey(ds.Name) which allows distinct PBSDatastorePaths with identical
base names (synthesized in getDatastoreList()) to collide; change pathKey() to
include a stable path-derived discriminator (for example the cleaned/absolute
override path or a short hash of filepath.Clean/Abs(override)) alongside ds.Name
when building the collectorPathKey, or alternatively add collision detection in
getDatastoreList() to detect duplicate keys and disambiguate (e.g., append path
hash or numeric suffix) before writing files; update references to pathKey,
collectorPathKey, getDatastoreList and PBSDatastorePaths accordingly so
filenames/directories are unique.

In `@internal/input/input.go`:
- Around line 127-134: The goroutine that does reader.ReadString and sends to
inflight.done must not clear state.inflight immediately; change the code so the
goroutine only sends lineResult{line, MapInputError(err)} to inflight.done but
does not set state.inflight = nil — instead have the caller that actually
receives from inflight.done (the retry/consumer path that waits on the done
channel) be responsible for clearing state.inflight after it consumes the
result. Apply the same change to the analogous block referenced by inflight and
state.inflight at the other location (lines ~175-182) so completed reads remain
attached until a retry consumes them.

In `@internal/orchestrator/path_security.go`:
- Around line 89-108: The code must resolve baseDir symlinks before validating
relative targets: call a symlink-resolution step on baseAbs (e.g.,
filepath.EvalSymlinks or the equivalent prepared-root-FS-aware resolver) and use
that resolved path when calling normalizeAbsolutePathWithinRoot and when
computing candidateAbs for non-absolute candidates; update the code paths around
normalizeAbsolutePathWithinRoot and resolvePathWithinPreparedRootFS to use the
resolved base (not the raw baseAbs), and add a regression test exercising a
directory symlink like linkdir -> . with linkdir/escape -> ../outside to ensure
the escape is caught (also check related logic in extractSymlink and
RestoreSafetyBackup).

In `@internal/safefs/safefs_test.go`:
- Around line 113-116: The defer currently only restores globals (osStat,
fsOpLimiter) but doesn't release the blocked worker created by the test, which
can leave the Stat goroutine stuck; update the defer to first release/unblock
the worker from the previous limiter (call the limiter's Release/Done method on
prevLimiter or otherwise signal it) and then restore osStat and fsOpLimiter,
mirroring the other timeout tests' pattern so the blocked Stat goroutine is
always released even if t.Fatalf runs; reference the prevLimiter/prevStat
variables, fsOpLimiter, osStat, and the Stat goroutine when locating where to
add the release call.

---

Outside diff comments:
In `@internal/orchestrator/restore_workflow_ui.go`:
- Around line 89-98: ValidateCompatibility returning nil for SystemTypeUnknown
means we must still surface a warning when systemType == SystemTypeUnknown;
change the block around ValidateCompatibility(decisionInfo.BackupType) to also
check for systemType == SystemTypeUnknown and in that case set a non-nil warning
message (e.g., "source system type unknown; compatibility could not be
validated") so the existing flow (logger.Warning(...);
ui.ConfirmCompatibility(ctx, warn); handle perr and !proceed -> return
ErrRestoreAborted) runs for both real warnings and the unknown-system case;
update references to ValidateCompatibility, systemType, decisionInfo.BackupType,
logger.Warning, ui.ConfirmCompatibility, and ErrRestoreAborted accordingly.

---

Nitpick comments:
In @.github/workflows/release.yml:
- Around line 72-79: The workflow step "Run GoReleaser" currently sets with:
version: latest; change this to pin a concrete GoReleaser tag (for example
version: v0.180.0) so the goreleaser/goreleaser-action@v7 invocation uses a
fixed binary version; update the version field in that step (and any other
occurrences) to the chosen semantic tag and commit the change so releases are
reproducible and won't drift when upstream publishes new releases.

In `@internal/input/input.go`:
- Around line 47-50: The global caches lineStates and passwordStates are never
pruned, causing retained references to short-lived *bufio.Reader keys; after the
in-flight prompt/operation completes (where you currently clear the operation
state), also remove its entry from the caches by calling
lineStates.Delete(readerKey) and passwordStates.Delete(readerKey) so finished
readers can be GC'd; locate the places that finish/resolve the prompt (the code
that currently removes the in-flight marker) and add the Delete calls for the
same key there.

In `@internal/orchestrator/backup_sources_test.go`:
- Around line 232-239: The manifest in the test uses SHA256 computed from a
label string instead of the actual archive bytes; replace this with a concrete
small archive fixture (e.g., create a tiny tar/xz byte slice or load a test
fixture bytes value) and compute the digest from those bytes so the
manifest.SHA256 matches the real archive content used in the test; update the
test to use checksumHexForBytes(archiveBytes) (or the helper that computes the
SHA256) and ensure the ArchivePath/manifest variable refers to that same fixture
so manifest/source agreement is explicit and verifiable.

In `@internal/orchestrator/compatibility_test.go`:
- Around line 8-21: The test TestValidateCompatibility_Mismatch still sets up a
fake filesystem (compatFS = fake, NewFakeFS(),
os.MkdirAll(fake.onDisk("/etc/pve"))) even though ValidateCompatibility now
takes explicit SystemType parameters; remove the filesystem setup and cleanup
(compatFS assignment, NewFakeFS(), os.RemoveAll, and mkdir calls) so the test
simply calls ValidateCompatibility(SystemTypePVE, SystemTypePBS) and asserts an
error—leave the function name TestValidateCompatibility_Mismatch and the call to
ValidateCompatibility unchanged.

In `@internal/orchestrator/decrypt_prepare_common.go`:
- Around line 34-55: The code builds tempRoot with a hardcoded "/tmp"; change it
to use os.TempDir() (e.g., tempRoot := filepath.Join(os.TempDir(), "proxsave"))
so restoreFS.MkdirAll and restoreFS.MkdirTemp operate on the platform temp
directory; update imports to include "os" if missing and keep existing
permission (0o755) and cleanup behavior in the cleanup closure that calls
restoreFS.RemoveAll(workDir) and rcloneCleanup when present.

In `@internal/orchestrator/network_staged_apply_test.go`:
- Around line 36-58: The test creates and asserts an absolute symlink target
which masks regressions when the stage tree is removed; update the staged
symlink creation and expectation to use a relative target instead. In the test
around fakeFS.Symlink("/stage/etc/network/interfaces.real",
"/stage/etc/network/interfaces") and the Readlink assertion after calling
applyNetworkFilesFromStage(newTestLogger(), "/stage"), create the symlink with a
relative target (e.g. "../etc/network/interfaces.real" from the /stage/etc path)
and assert that fakeFS.Readlink("/etc/network/interfaces") returns that relative
path; keep the same checks using Lstat and os.ModeSymlink to ensure the link is
preserved.

In `@internal/orchestrator/restore_decision.go`:
- Around line 115-152: Add a doc comment above collectRestoreArchiveFacts
explaining the unusual dual-error return semantics: describe that it returns
(archivePaths []string, metadata *restoreDecisionMetadata, metadataParseErr
error, fatalErr error) where metadataParseErr is a non-fatal error produced by
parseRestoreDecisionMetadata when metadata cannot be parsed but iteration can
continue, and fatalErr represents terminal read errors (e.g.,
tarReader.Next()/io.ReadAll failures); reference isRestoreDecisionMetadataEntry
and parseRestoreDecisionMetadata in the comment so callers understand how
metadata is detected and parsed.

In `@internal/safefs/safefs.go`:
- Around line 138-156: The effective timeout is computed twice causing potential
mismatch between the TimeoutError.Timeout and the timeout used in runLimited;
compute the effective timeout once (e.g. t := effectiveTimeout(ctx, timeout))
and use that t when constructing the TimeoutError in Stat/ReadDir/Statfs and
pass that same t into runLimited (or change runLimited to accept an
already-resolved timeout and stop calling effectiveTimeout internally); update
runLimited's signature/usage accordingly so it no longer recalculates
effectiveTimeout and the TimeoutError.Timeout always matches the actual timeout
used.
- Around line 86-91: The release method currently swallows unbalanced releases
via a default case; change operationLimiter.release so it performs a blocking
receive from l.slots (remove the default case) so an unexpected release without
a matching acquire will block/fail-fast and surface the bug; locate the method
named release on type operationLimiter (and related acquire/runLimited usage) to
make this change — alternatively, if you prefer non-blocking behavior, replace
the default branch with a log/metric emitting the unexpected condition instead.

In `@Makefile`:
- Around line 4-5: The Makefile repeats the toolchain-prefixed go invocation
logic across targets (TOOLCHAIN_FROM_MOD and COVER_GOTOOLCHAIN); create a single
reusable variable (e.g., GO or COVER_GO) that evaluates TOOLCHAIN_FROM_MOD once
and yields either "$(TOOLCHAIN_FROM_MOD)go" or "go" (and similarly for coverage:
COVER_GO using COVER_GOTOOLCHAIN), then replace all direct uses of the current
COVER_GOTOOLCHAIN+auto/auto expressions in coverage and other targets with the
new GO/COVER_GO variable so the toolchain prefix logic is centralized and
consistent (update references to TOOLCHAIN_FROM_MOD and COVER_GOTOOLCHAIN to use
the new variable).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 73564ae3-cba6-4081-8257-b96c076cdd62

📥 Commits

Reviewing files that changed from the base of the PR and between b7d9670 and 74b5123.

📒 Files selected for processing (53)
  • .github/workflows/release.yml
  • Makefile
  • go.mod
  • internal/backup/checksum.go
  • internal/backup/checksum_legacy_test.go
  • internal/backup/checksum_test.go
  • internal/backup/collector.go
  • internal/backup/collector_helpers_extra_test.go
  • internal/backup/collector_pbs.go
  • internal/backup/collector_pbs_datastore.go
  • internal/backup/collector_pbs_test.go
  • internal/backup/collector_pve.go
  • internal/backup/collector_pve_parse_test.go
  • internal/backup/collector_pve_test.go
  • internal/input/input.go
  • internal/input/input_test.go
  • internal/orchestrator/backup_safety.go
  • internal/orchestrator/backup_safety_test.go
  • internal/orchestrator/backup_sources.go
  • internal/orchestrator/backup_sources_test.go
  • internal/orchestrator/compatibility.go
  • internal/orchestrator/compatibility_test.go
  • internal/orchestrator/decrypt.go
  • internal/orchestrator/decrypt_integrity.go
  • internal/orchestrator/decrypt_integrity_test.go
  • internal/orchestrator/decrypt_integrity_test_helpers_test.go
  • internal/orchestrator/decrypt_prepare_common.go
  • internal/orchestrator/decrypt_test.go
  • internal/orchestrator/decrypt_tui.go
  • internal/orchestrator/decrypt_tui_test.go
  • internal/orchestrator/decrypt_workflow_test.go
  • internal/orchestrator/decrypt_workflow_ui.go
  • internal/orchestrator/deps.go
  • internal/orchestrator/deps_test.go
  • internal/orchestrator/network_apply_additional_test.go
  • internal/orchestrator/network_staged_apply.go
  • internal/orchestrator/network_staged_apply_test.go
  • internal/orchestrator/path_security.go
  • internal/orchestrator/path_security_test.go
  • internal/orchestrator/restore.go
  • internal/orchestrator/restore_decision.go
  • internal/orchestrator/restore_decision_test.go
  • internal/orchestrator/restore_errors_test.go
  • internal/orchestrator/restore_plan.go
  • internal/orchestrator/restore_plan_test.go
  • internal/orchestrator/restore_test.go
  • internal/orchestrator/restore_workflow_decision_test.go
  • internal/orchestrator/restore_workflow_more_test.go
  • internal/orchestrator/restore_workflow_ui.go
  • internal/orchestrator/restore_workflow_ui_helpers_test.go
  • internal/orchestrator/selective.go
  • internal/safefs/safefs.go
  • internal/safefs/safefs_test.go

- add explicit PBS datastore metadata for source, CLI identity and output key
- derive stable path-based output keys for PBS_DATASTORE_PATH overrides
- keep existing output names for real CLI-discovered PBS datastores
- skip datastore show/status CLI calls for override-only entries
- use filesystem-only namespace discovery for override paths
- keep PXAR outputs isolated for colliding override basenames
- preserve distinct override entries in PBS datastore inventory
- exclude override-only inventory entries from datastore.cfg fallback restore
- add regression coverage for basename collisions and restore safety
- update PBS_DATASTORE_PATH documentation to reflect scan-root semantics

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/backup/collector_pbs_datastore.go (1)

521-594: ⚠️ Potential issue | 🟠 Major

Keep override datastores reachable when PBS CLI discovery fails.

The early returns at Lines 527-534 happen before the override append at Lines 564-594. That means a missing proxmox-backup-manager binary or a datastore list error exits getDatastoreList before any PBSDatastorePaths entries are added, so override-only collection silently disappears in CLI-less restore/chroot environments.

🧹 Nitpick comments (1)
internal/orchestrator/pbs_staged_apply_additional_test.go (1)

283-315: Add a regression for the new cli_name fallback.

This test covers override filtering, but the new recovery branch also falls back from name to cli_name. A one-case inventory fixture with name:"" and cli_name:"DS1" would keep that path from regressing.

🧪 Suggested test
+func TestLoadPBSDatastoreCfgFromInventory_FallsBackToCLINameWhenNameMissing(t *testing.T) {
+	origFS := restoreFS
+	t.Cleanup(func() { restoreFS = origFS })
+
+	fakeFS := NewFakeFS()
+	t.Cleanup(func() { _ = os.RemoveAll(fakeFS.Root) })
+	restoreFS = fakeFS
+
+	stageRoot := "/stage"
+	inventory := `{"datastores":[{"name":"","cli_name":"DS1","path":"/mnt/ds1","comment":"primary","origin":"merged"}]}`
+	if err := fakeFS.WriteFile(stageRoot+"/var/lib/proxsave-info/commands/pbs/pbs_datastore_inventory.json", []byte(inventory), 0o640); err != nil {
+		t.Fatalf("write inventory: %v", err)
+	}
+
+	content, _, err := loadPBSDatastoreCfgFromInventory(stageRoot)
+	if err != nil {
+		t.Fatalf("loadPBSDatastoreCfgFromInventory: %v", err)
+	}
+
+	blocks, err := parsePBSDatastoreCfgBlocks(content)
+	if err != nil {
+		t.Fatalf("parsePBSDatastoreCfgBlocks: %v", err)
+	}
+	if len(blocks) != 1 || blocks[0].Name != "DS1" || blocks[0].Path != "/mnt/ds1" {
+		t.Fatalf("unexpected blocks: %+v", blocks)
+	}
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/orchestrator/pbs_staged_apply_additional_test.go` around lines 283 -
315, Add a regression case to
TestLoadPBSDatastoreCfgFromInventory_IgnoresOverrideEntries to cover the new
cli_name fallback: update the inventory fixture passed to
loadPBSDatastoreCfgFromInventory so one datastore has an empty "name" and a
"cli_name":"DS1" (and another override entry to ensure filtering still works),
then assert that src == "pbs_datastore_inventory.json.datastores", that the
generated content does not contain the override path ("/mnt/scan-root"), and
that parsePBSDatastoreCfgBlocks returns a single block with Name "DS1" and Path
"/mnt/ds1" to verify the fallback from name to cli_name and no regression.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/orchestrator/pbs_staged_apply_additional_test.go`:
- Around line 283-315: Add a regression case to
TestLoadPBSDatastoreCfgFromInventory_IgnoresOverrideEntries to cover the new
cli_name fallback: update the inventory fixture passed to
loadPBSDatastoreCfgFromInventory so one datastore has an empty "name" and a
"cli_name":"DS1" (and another override entry to ensure filtering still works),
then assert that src == "pbs_datastore_inventory.json.datastores", that the
generated content does not contain the override path ("/mnt/scan-root"), and
that parsePBSDatastoreCfgBlocks returns a single block with Name "DS1" and Path
"/mnt/ds1" to verify the fallback from name to cli_name and no regression.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 79d63a09-1740-4bad-9f93-7adca714da3e

📥 Commits

Reviewing files that changed from the base of the PR and between 74b5123 and 78dd8cb.

📒 Files selected for processing (13)
  • docs/CONFIGURATION.md
  • docs/RESTORE_GUIDE.md
  • internal/backup/collector_pbs.go
  • internal/backup/collector_pbs_commands_coverage_test.go
  • internal/backup/collector_pbs_datastore.go
  • internal/backup/collector_pbs_datastore_inventory.go
  • internal/backup/collector_pbs_datastore_inventory_test.go
  • internal/backup/collector_pbs_test.go
  • internal/config/templates/backup.env
  • internal/orchestrator/pbs_staged_apply.go
  • internal/orchestrator/pbs_staged_apply_additional_test.go
  • internal/pbs/namespaces.go
  • internal/pbs/namespaces_test.go
✅ Files skipped from review due to trivial changes (2)
  • docs/CONFIGURATION.md
  • internal/config/templates/backup.env
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/backup/collector_pbs.go

…tive links

resolvePathRelativeToBaseWithinRootFS now canonicalizes baseDir by resolving existing symlinks before validating relative link targets. This closes a bypass where an archive could first create an apparently harmless parent symlink and then install a relative symlink that escapes destRoot once the kernel resolves it during path lookup.
Make safefs timeout tests failure-safe by draining blocked workers from cleanup and separating worker cleanup from global state restore. Also capture limiter and fs operation hooks locally in runLimited/Stat/ReadDir/Statfs to avoid races with global resets while async work is still in flight.

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/safefs/safefs.go (1)

44-57: ⚠️ Potential issue | 🟠 Major

Keep ctx.Deadline() active when callers pass timeout == 0.

The early return in effectiveTimeout drops the context deadline entirely. Since ioTimeout can be 0 in production (see collector_pbs_datastore.go:155 and collector_pve.go:1038), calls to Stat/ReadDir/Statfs with a request-scoped deadline and zero explicit timeout will block past the deadline and fall back to the synchronous path.

Reorder the logic to check ctx.Deadline() before returning 0 on timeout, so the remaining deadline can be used to drive the async timeout path. Also add a regression test for Stat(context.WithTimeout(...), path, 0) to catch this scenario.

💡 Proposed fix
 func effectiveTimeout(ctx context.Context, timeout time.Duration) time.Duration {
-	if timeout <= 0 {
-		return 0
-	}
 	if deadline, ok := ctx.Deadline(); ok {
 		remaining := time.Until(deadline)
 		if remaining <= 0 {
 			return 0
 		}
-		if remaining < timeout {
+		if timeout <= 0 || remaining < timeout {
 			return remaining
 		}
 	}
+	if timeout <= 0 {
+		return 0
+	}
 	return timeout
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/safefs/safefs.go` around lines 44 - 57, The function
effectiveTimeout currently returns 0 immediately when timeout <= 0 which
discards any ctx.Deadline(); reorder the logic in effectiveTimeout to first
check ctx.Deadline() and use the remaining time if present (return remaining if
>0 and less than timeout or if timeout <=0), and only return 0 when no context
deadline exists or remaining <=0; update callers (e.g., Stat/ReadDir/Statfs)
behavior will automatically use the context deadline when timeout==0. Add a
regression test that calls Stat with context.WithTimeout(..., shortDuration) and
timeout==0 to assert the async timeout path is taken (i.e., operation respects
the context deadline).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@internal/safefs/safefs.go`:
- Around line 44-57: The function effectiveTimeout currently returns 0
immediately when timeout <= 0 which discards any ctx.Deadline(); reorder the
logic in effectiveTimeout to first check ctx.Deadline() and use the remaining
time if present (return remaining if >0 and less than timeout or if timeout
<=0), and only return 0 when no context deadline exists or remaining <=0; update
callers (e.g., Stat/ReadDir/Statfs) behavior will automatically use the context
deadline when timeout==0. Add a regression test that calls Stat with
context.WithTimeout(..., shortDuration) and timeout==0 to assert the async
timeout path is taken (i.e., operation respects the context deadline).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 428375c5-f4e0-4082-a0ff-8c0169c26f81

📥 Commits

Reviewing files that changed from the base of the PR and between 9e0fe52 and 86958be.

📒 Files selected for processing (2)
  • internal/safefs/safefs.go
  • internal/safefs/safefs_test.go

Keep completed in-flight line and password reads attached to their state until
a caller consumes the buffered result, instead of clearing the inflight state
from the producer goroutine.

This fixes the race where a read could complete after a timeout, be cleaned up
before the next retry started, and leave the completed input unreachable.
Add deterministic tests for both retry-while-pending and completion-before-retry
cases to lock the behavior down.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/input/input.go`:
- Around line 125-151: The mutex (state.mu) is being held while waiting on
ctx.Done() or inflight.done, causing queued callers to block and miss timeouts;
change the logic in the section that creates/reads state.inflight (using
lineInflight, inflight.done, inflight.completed, reader.ReadString,
MapInputError) so you only lock to install or read the pointer, then immediately
unlock before doing the select on ctx.Done() / inflight.done; after receiving a
result re-lock, check that state.inflight == inflight and clear it if so, and
return the result (preserve returning MapInputError and ErrInputAborted
behavior); apply the same pattern to the analogous block around lines 173-199.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 62a57fe5-100c-415c-8b57-6aad0a06cf93

📥 Commits

Reviewing files that changed from the base of the PR and between 86958be and 36a84d6.

📒 Files selected for processing (2)
  • internal/input/input.go
  • internal/input/input_test.go

Add a PBS-specific output-key resolver to guarantee stable, unique datastore filenames and directories across auto-detected datastores and PBS_DATASTORE_PATH overrides. This fixes real collisions between CLI datastore names and path-derived override keys, makes pbsDatastore.pathKey() source-aware when OutputKey is unset, and keeps inventory output_key values aligned with the actual files written by PBS collectors. Includes regression tests for CLI-vs-override collisions, override fallback behavior, and inventory consistency.
@tis24dev tis24dev closed this Mar 10, 2026