* Add tests for orchestrator, support, and tui modules
Introduces new test files for encryption, prompts, restore workflow, selective menu, support, and abort context functionalities. Refactors orchestrator/encryption.go to allow mocking terminal checks, and support.go to allow mocking email notifier creation for improved testability. Adds a stopHook to tui.App for controlled stopping in tests.
* Enforce root check only for real root filesystem restores
Updated restore privilege checks to require root only when restoring to the real system root (osFS), not for virtual or test filesystems. Added isRealRestoreFS helper to distinguish filesystem types.
* Expand storage tests and improve FilesystemDetector hooks
Added extensive test coverage for local and secondary storage, including error handling, edge cases, and permission scenarios. Refactored FilesystemDetector to support injectable test hooks for mount point and filesystem type lookups, and improved octal unescaping logic. These changes enhance testability and reliability of storage operations.
* Improve email and webhook notifier test coverage
Adds extensive unit tests for email and webhook notifiers, covering error branches, authentication methods, payload formats, and edge cases. Refactors email notifier to allow overriding Postfix config path for hermetic tests and fixes logger level checks for debug output.
* Add comprehensive tests for MAC, directory, and security logic
Added extensive unit tests to identity_test.go for MAC address handling, interface ranking, system data generation, and edge cases. Expanded directory_recreation_test.go with tests for storage/datastore config parsing, directory creation, error propagation, and ZFS detection. Added security_test.go tests for ownership/permission checks, config-driven logic, and error handling. These tests improve coverage and robustness for identity, orchestrator, and security modules.
* Add comprehensive coverage tests for decryption workflow
This commit adds extensive unit tests to internal/orchestrator/decrypt_test.go, covering error handling and edge cases for decryption workflows, rclone integration, bundle extraction, manifest inspection, and user prompt logic. The tests improve code reliability by simulating various failure scenarios, file system errors, and user interactions.
* Add network safe apply with rollback and diagnostics
Implements network configuration safe apply with a transactional rollback timer, health checks, NIC name repair, and diagnostics capture. Adds network inventory collection, network health/preflight validation, and CLI workflow for applying/restoring network config with rollback. Updates backup safety logic to support network-only rollback archives and integrates new reporting in system collector and restore guide documentation.
* Add cluster shadowing guard and NIC naming override detection
Introduces cluster shadowing guard to prevent direct restoration of /etc/pve paths during cluster recovery, with sanitization logic and tests. Adds detection and reporting of persistent NIC naming override rules (udev/systemd) to network_apply and TUI workflows, including user prompts and detailed logging. Enhances safe cluster apply to handle node mismatches, prompt for source node selection, and improves logging and test coverage for restore scenarios.
* feat: improve network staging, datastore handling, and restore workflows
- Add staged network file installation with automatic rollback on preflight validation failures in network_apply.go
- Implement node hostname mismatch detection when applying VM/CT configs in SAFE cluster restore mode (RESTORE_GUIDE)
- Add deferred datastore definition handling to prevent broken entries on unmounted disk locations (RESTORE_GUIDE)
- Implement NIC repair staged install workflow and persistent naming rule detection (network_apply.go and docs)
- Enhance directory_recreation.go with ZFS mount detection and datastore permission validation logic
- Add automatic /etc/resolv.conf repair documentation and failing PBS job config removal on live restores (RESTORE_GUIDE)
- Introduce promptYesNo CLI utility function for interactive confirmation prompts (prompts_cli.go)
- Add file deduplication optimization pass and additional test coverage in optimizations.go
- Expand restore workflow state management with additional safety checks and node handling (restore.go)
- Add staged installation documentation covering /tmp/proxsave/restore-stage-* workflow and rollback timer mechanics
* refactor: add filesystem category and smart fstab merge
- Add filesystem category (ID: "filesystem", path: "./etc/fstab") to restore workflow covering mount points and configurations
- Integrate filesystem category into storage, base, and full restore modes in GetStorageModeCategories and GetBaseModeCategories
- Implement skipFn parameter in extractArchiveNative and extractPlainArchive to skip /etc/fstab during initial extraction
- Add Smart Merge workflow for /etc/fstab via SmartMergeFstab function with user prompts on live restores to root (/)
- Intercept filesystem category during normal extraction pipeline in RunRestoreWorkflow to prevent blind overwrite
- Update extractArchiveNative to accept optional skipFn callback that filters entries before extraction with SKIPPED logging
- Add safeFstabMerge flag in runFullRestore when destRoot == "/" to defer /etc/fstab processing until after extraction
- Extend extractSelectiveArchive signature to pass skipFn parameter through the extraction chain
- Update TestGetStorageModeCategories and TestGetBaseModeCategories assertions to verify filesystem inclusion (+1 count)
- Refactor indentation in maybeInstallNetworkConfigFromStage and maybeApplyNetworkConfigCLI call chains for readability
* feat: enhance network apply diagnostics and error handling
• Increase network rollback timer from 90s to 180s (defaultNetworkRollbackTimeout constant)
• Add NetworkApplyNotCommittedError type to report rollback path and restored IP on timeout
• Refactor network validator order: prioritize ifup -n -a over ifquery --check -a for preflight validation
• Introduce runNetworkIfqueryDiagnostic function for non-blocking diagnostic checks of network state
• Capture baseline health report before apply with writeNetworkHealthReportFileNamed helper
• Generate network plan report and capture pre/post-apply ifquery diagnostics automatically
• Execute rollback immediately on timer expiration and capture after-rollback snapshots and ifquery output
• Enhance error messages with validation command names (preflight.CommandLine()) and rollback paths
• Add runCommandWithTimeoutCountdown function with visual progress feedback during service stop operations
• Update restore summary to report "warnings" when network apply incomplete, with restored IP information
* Add default wait delay to command runner
Introduces a default 3-second wait delay for commands executed via osCommandRunner. Handles exec.ErrWaitDelay by returning output without error, improving robustness of command execution.
* deps(deps): bump github.com/gdamore/tcell/v2 from 2.13.6 to 2.13.7 in the security-patches group (#112)
deps(deps): bump github.com/gdamore/tcell/v2
Bumps the security-patches group with 1 update: [github.com/gdamore/tcell/v2](https://github.com/gdamore/tcell).
Updates `github.com/gdamore/tcell/v2` from 2.13.6 to 2.13.7
- [Release notes](https://github.com/gdamore/tcell/releases)
- [Changelog](https://github.com/gdamore/tcell/blob/main/CHANGESv3.md)
- [Commits](gdamore/tcell@v2.13.6...v2.13.7)
---
updated-dependencies:
- dependency-name: github.com/gdamore/tcell/v2
dependency-version: 2.13.7
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: security-patches
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* deps(deps): bump golang.org/x/crypto from 0.46.0 to 0.47.0 (#113)
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.46.0 to 0.47.0.
- [Commits](golang/crypto@v0.46.0...v0.47.0)
---
updated-dependencies:
- dependency-name: golang.org/x/crypto
dependency-version: 0.47.0
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix octal unescaping to use ParseUint instead of ParseInt
Replaces strconv.ParseInt with strconv.ParseUint in unescapeOctal to correctly handle unsigned octal values. This prevents potential issues when parsing octal escape sequences as bytes.
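The difference between the two parsers can be illustrated with a minimal sketch. It assumes the escapes are three-digit `\NNN` sequences as found in `/proc/mounts` (for example `\040` for a space); the function body below is an illustration, not the project's actual `unescapeOctal`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unescapeOctal replaces three-digit \NNN octal escapes with the byte they
// encode. Sequences that do not parse as an unsigned 8-bit value are copied
// through unchanged. Illustrative only; the real function may differ.
func unescapeOctal(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] == '\\' && i+3 < len(s) {
			// ParseUint rejects a sign prefix, and bitSize 8 rejects values
			// above 0377, so only real byte values are accepted.
			if v, err := strconv.ParseUint(s[i+1:i+4], 8, 8); err == nil {
				b.WriteByte(byte(v))
				i += 3
				continue
			}
		}
		b.WriteByte(s[i])
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", unescapeOctal(`/mnt/my\040disk`))
	fmt.Printf("%q\n", unescapeOctal(`/mnt/odd\-12name`))
}
```

`strconv.ParseInt` accepts a leading `+` or `-`, so a sequence such as `\-12` would parse to a negative number and wrap when converted to a byte. With `ParseUint` the same sequence fails to parse and is left as written.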
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@@ -709,7 +710,8 @@ Cluster backup detected. Choose how to restore the cluster database:

 **Post-restore actions (SAFE mode)**:
 After export, the workflow offers interactive options to apply configurations via `pvesh`:
-1. **VM/CT configs**: Scans exported configs and applies them via `pvesh set /nodes/<node>/qemu/<vmid>/config`
+1. **VM/CT configs**: Scans exported configs (under `/etc/pve/nodes/<node>/...`) and applies them via `pvesh set /nodes/<node>/qemu/<vmid>/config`
+- If the target node hostname differs from the hostname stored in the backup (common after hardware migration / reinstall), ProxSave detects the mismatch and prompts you to select the exported node directory to import from (instead of silently reporting “No VM/CT configs found”).
 2. **Storage configuration**: Applies `storage.cfg` entries via `pvesh set /cluster/storage/<id>`
 3. **Datacenter configuration**: Applies `datacenter.cfg` via `pvesh set /cluster/config`

@@ -722,6 +724,7 @@ Each action prompts for confirmation before execution.
 - Unmounts `/etc/pve` FUSE filesystem
 - Writes directly to `/var/lib/pve-cluster/config.db`
 - Restarts services with restored configuration
+- Avoids restoring files under `/etc/pve/*` while pmxcfs is stopped/unmounted (to prevent "shadowed" writes on the underlying disk). Those files are expected to come from the restored `config.db`.

 **When to use**:
 - Complete disaster recovery
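The shadowing guard added in the hunk above amounts to a path filter applied to archive entries during extraction. A minimal sketch follows, assuming entries are named like `./etc/pve/storage.cfg`; the function name `isShadowedClusterPath` is hypothetical and the project's own sanitization logic may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isShadowedClusterPath reports whether an archive entry would be written to
// /etc/pve or anywhere below it. Hypothetical helper for illustration.
func isShadowedClusterPath(entry string) bool {
	// Normalize "./etc/pve/x", "etc/pve/x" and "/etc/pve/x" to one form and
	// resolve any ".." components before comparing.
	clean := path.Clean("/" + strings.TrimPrefix(entry, "./"))
	return clean == "/etc/pve" || strings.HasPrefix(clean, "/etc/pve/")
}

func main() {
	for _, e := range []string{
		"./etc/pve/storage.cfg",
		"./etc/pve-cluster/notes",
		"./var/lib/pve-cluster/config.db",
	} {
		fmt.Println(e, "skip:", isShadowedClusterPath(e))
	}
}
```

Matching on `/etc/pve/` with the trailing slash keeps sibling paths such as `/etc/pve-cluster` out of the filter, and cleaning the path first means an entry like `./etc/pve/../passwd` is judged by where it would actually land.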
@@ -1348,6 +1351,21 @@ These configurations are included in every backup and can be restored using **th
 Apply all VM/CT configs via pvesh? (y/N): y
 ```

+**If the node name changed** (example: backup from `pve-old`, restore on `pve-new`), ProxSave prompts for the exported source node:
+```
+SAFE cluster restore: applying configs via pvesh (node=pve-new)
+
+WARNING: VM/CT configs in this backup are stored under different node names.
+Current node: pve-new
+Select which exported node to import VM/CT configs from (they will be applied to the current node):
+[1] pve-old (qemu=12, lxc=3)
+[0] Skip VM/CT apply
+Choice: 1
+
+Found 15 VM/CT configs for exported node pve-old (will apply to current node pve-new)
+If the **network** category is restored, ProxSave can optionally apply the
+new network configuration immediately using a **transactional rollback timer**.
+
+**Important (console recommended)**:
+- Run the live network apply/commit step from the **local console** (physical console, IPMI/iDRAC/iLO, Proxmox console, or hypervisor console), not from SSH.
+- If the restored network config changes the management IP or routes, your SSH session will drop and you may be unable to type `COMMIT`.
+- In that case, ProxSave will treat the lack of `COMMIT` as “not confirmed” and will restore the previous network settings (rollback).
+
+**How it works**:
+- On live restores (writing to `/`), ProxSave **stages** network files first under `/tmp/proxsave/restore-stage-*` and does **not** overwrite `/etc/network/*` during archive extraction.
+- After extraction, ProxSave performs a prevention-first **staged install**: it writes the staged files to disk (no reload), runs safe NIC repair + preflight validation, and **rolls back automatically** if validation fails (leaving the staged copy for review).
+- If rollback backup creation fails (or ProxSave is not running as root), ProxSave keeps network files staged and avoids writing to `/etc`.
+- When you choose to apply live, ProxSave (re)validates and reloads networking inside the rollback timer window.
+- ProxSave arms a local rollback job **before** applying changes
+- Rollback restores **only network-related files** using a dedicated archive under `/tmp/proxsave/network_rollback_backup_*` (so it won’t undo other restored categories)
+- Rollback also prunes network config files that were **created after** the backup (e.g. extra files under `/etc/network/interfaces.d/`), so rollback returns to the exact pre-restore state
+- The user has **180 seconds** to type `COMMIT`
+- If `COMMIT` is not received, ProxSave triggers the rollback and restores the pre-restore network configuration
+- If the network-only rollback archive is not available, ProxSave prompts before falling back to the full safety backup (or skipping live apply)
+
+This protects SSH/GUI access during network changes.
+
+**Health checks**:
+- After applying changes, ProxSave runs local checks (SSH route if available, default route, link state, IP addresses, gateway ping, DNS config/resolve, local web UI port)
+- On PVE systems, additional checks are included for cluster networking: `/etc/pve` (pmxcfs) mount status, `pve-cluster` / `corosync` service state, and `pvecm status` quorum
+- The result is shown to help decide whether to type `COMMIT`
+- Diagnostics are saved under `/tmp/proxsave/network_apply_*` (snapshots `before.txt` / `after.txt` / `after_rollback.txt` when relevant, `health_before.txt` / `health_after.txt`, `preflight.txt`, `plan.txt`, and `ifquery_*`)
+
+**NIC name repair**:
+- If physical NIC names changed after reinstall (e.g. `eno1` → `enp3s0`), ProxSave attempts an automatic mapping using backup network inventory (permanent MAC / MAC / PCI path / udev IDs like `ID_PATH`, `ID_NET_NAME_PATH`, `ID_NET_NAME_SLOT`, `ID_SERIAL`)
+- When a safe mapping is found, `/etc/network/interfaces` and `/etc/network/interfaces.d/*` are rewritten before applying the network config
+- If you skip live network apply, ProxSave may still install the staged config to disk (no reload) after safe NIC repair + preflight; if validation fails, it rolls back and keeps the staged copy.
+- If a mapping would overwrite an interface name that already exists on the current system, ProxSave prompts before applying it (conflict-safe)
+- If persistent NIC naming rules are detected (custom udev `NAME=` rules or systemd `.link` files), ProxSave warns and prompts before applying NIC repair to avoid conflicts with user-intended naming
+- A backup of the pre-repair files is stored under `/tmp/proxsave/nic_repair_*`
+
+**Preflight validation**:
+- After NIC repair, ProxSave runs a **gate** validation of the ifupdown configuration before reloading networking (e.g. `ifup -n -a` / `ifup --no-act -a` / `ifreload --syntax-check -a`)
+- If validation fails, live apply is aborted and the validator output is saved under `/tmp/proxsave/network_apply_*/preflight.txt`
+- Additionally (diagnostics-only), ProxSave can run `ifquery --check -a` **before and after apply** to show how the runtime state matches the target config. Its output is saved under `/tmp/proxsave/network_apply_*/ifquery_*`. Note that `ifquery --check` can show `[fail]` **before apply** even when the config is valid (because the running state still reflects the old config).
+- On staged installs/applies, a failed preflight triggers an **automatic rollback of network files** (no prompt), returning to the pre-restore state and keeping the staged copy for review.
+
+**Result reporting**:
+- If you do not type `COMMIT`, ProxSave completes the restore with warnings and reports that the original network settings were restored (including the current IP, when detectable), plus the rollback log path.
+
 ### 4. Hard Guards

 **Path Traversal Prevention**:
@@ -2002,9 +2067,105 @@ zfs list
 # If ZFS, import pool
 zpool import <pool-name>

-# If directory, create it
-mkdir -p /mnt/datastore/{.chunks,.lock}
-chown backup:backup /mnt/datastore -R
+# If directory-based datastore (non-ZFS), verify permissions for backup user
+# NOTE:
+# - On live restores, ProxSave stages PBS datastore/job configuration first under `/tmp/proxsave/restore-stage-*`
+# and applies it safely after checking the current system state.
+# - If a datastore path looks like a mountpoint location (e.g. under `/mnt`) but resolves to the root filesystem,
+# ProxSave will **defer** that datastore definition (it will NOT be written to `datastore.cfg`), to avoid ending up
+# with a broken datastore entry that blocks re-creation on a new/empty disk. Deferred entries are saved under
+# `/tmp/proxsave/datastore.cfg.deferred.*` for manual review.
+# - ProxSave may create missing datastore directories and fix `.lock`/ownership, but it will NOT format disks.
+# - To avoid accidental writes to the wrong disk, ProxSave will skip datastore directory initialization if the
+# datastore path looks like a mountpoint location (e.g. under /mnt) but resolves to the root filesystem.
+# In that case, mount/import the datastore disk/pool first, then restart PBS (or re-run restore).
+# - If the datastore path is not empty and contains unexpected files/directories, ProxSave will not touch it.
+ls -ld /mnt/datastore /mnt/datastore/<DatastoreName> 2>/dev/null
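The "looks like a mountpoint location but resolves to the root filesystem" rule in the notes above can be sketched as a comparison of device IDs. This is a Linux-only illustration under stated assumptions: the function names are hypothetical, only the `/mnt` prefix mentioned in the text is checked, and the real detection may use other signals.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"syscall"
)

// looksLikeMountpointPath reports whether p sits under /mnt. Only the prefix
// named in the documentation is checked here; that limit is an assumption.
func looksLikeMountpointPath(p string) bool {
	clean := filepath.Clean(p)
	return clean == "/mnt" || strings.HasPrefix(clean, "/mnt/")
}

// shouldDeferDatastore is true when the path is expected to be on its own
// disk but shares a device ID with the root filesystem, i.e. nothing is
// mounted there yet. Hypothetical helper for illustration.
func shouldDeferDatastore(p string, pathDev, rootDev uint64) bool {
	return looksLikeMountpointPath(p) && pathDev == rootDev
}

// deviceOf returns the device ID of the filesystem that holds p (Linux).
func deviceOf(p string) (uint64, error) {
	info, err := os.Stat(p)
	if err != nil {
		return 0, err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, fmt.Errorf("no stat data for %s", p)
	}
	return uint64(st.Dev), nil
}

func main() {
	rootDev, err := deviceOf("/")
	if err != nil {
		fmt.Println("cannot stat /:", err)
		return
	}
	p := "/mnt/datastore"
	dev, err := deviceOf(p)
	if err != nil {
		fmt.Println("cannot stat", p+":", err)
		return
	}
	fmt.Println("defer datastore definition:", shouldDeferDatastore(p, dev, rootDev))
}
```

Keeping the decision in a function that takes the two device IDs as arguments makes it testable without real mounts.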
+**Issue: "Bad Request (400) unable to read /etc/resolv.conf (No such file or directory)"**
+
+**Cause**: `/etc/resolv.conf` is missing or a broken symlink. This can happen after a restore if a previous backup contained an invalid symlink (e.g. pointing to `../commands/resolv_conf.txt`), or if the target system uses `systemd-resolved` and the expected `/run/systemd/resolve/*` files are not present.
+
+**Solution**:
+```bash
+ls -la /etc/resolv.conf
+readlink /etc/resolv.conf 2>/dev/null || true
+
+# If the link is broken or points to commands/resolv_conf.txt, replace it:
+**Cause**: In PBS, properties inside a `datastore:` section must be indented. A malformed file (often from manual edits or very old configs) will prevent PBS from loading datastore config.
+
+**Solution**:
+```bash
+# ProxSave will attempt to auto-normalize datastore.cfg during restore and store a backup under /tmp/proxsave/,
+# but you can also fix it manually:
+cp -a /etc/proxmox-backup/datastore.cfg /root/datastore.cfg.bak.$(date +%F_%H%M%S)
+**Cause**: PBS job config files (`/etc/proxmox-backup/prune.cfg`, `/etc/proxmox-backup/verification.cfg`) are empty or malformed. PBS expects a section header at the first non-comment line; an empty file can trigger parse errors.
+
+**Restore behavior**:
+- On live restores, ProxSave stages PBS job config files and will **remove** empty staged job configs instead of writing a 0-byte file (to avoid breaking PBS parsing).
+**Issue: "Datastore error: Is a directory (os error 21)"**
+
+**Cause**: PBS expects a lock file at `<datastore-path>/.lock`. If `.lock` is a directory (common after manual fixes or incorrect initialization), PBS will fail to open it and the datastore becomes unavailable.
+
+**Solution**:
+```bash
+P=/mnt/datastore/<DatastoreName>
+ls -ld "$P/.lock"
+
+# If .lock is a directory, replace it with a file:
+#### Error during network preflight: `addr_add_dry_run() got an unexpected keyword argument 'nodad'`
+
+**Symptoms**:
+- Restore networking preflight fails when running `ifup -n -a`
+- Log contains: `NetlinkListenerWithCache.addr_add_dry_run() got an unexpected keyword argument 'nodad'`
+
+**Cause**:
+- A Proxmox-packaged `ifupdown2` version may ship a Python signature mismatch between `addr_add()` and `addr_add_dry_run()` (dry-run path), which crashes `ifup -n` when `nodad` is used.
+
+**What ProxSave does**:
+- During restore, ProxSave can apply a guarded hotfix (only when needed) by patching `/usr/share/ifupdown2/lib/nlcache.py` and writing a timestamped `.bak.*` backup first.
+
+**Recovery / rollback**:
+- To revert the hotfix, restore the `.bak.*` copy back onto `nlcache.py`, or upgrade `ifupdown2` when Proxmox publishes a fixed build.