rb3ckers
reviewed
Nov 5, 2025
rb3ckers
approved these changes
Nov 6, 2025
This PR adds VictoriaMetrics backup/restore functionality and introduces a new
internal/orchestration/restore/ package that eliminates code duplication between the Stackgraph and VictoriaMetrics restore operations, and potentially the upcoming Clickhouse and Configuration restore operations.
Architecture Improvements
Code Deduplication
This PR extracts common restore patterns into a reusable orchestration layer:
- internal/orchestration/restore/: New package containing shared restore operations
  - confirmation.go: User confirmation prompts
  - job.go: Kubernetes Job lifecycle management (wait, monitor, logs)
  - finalize.go: Background job status checking and cleanup orchestration
  - resources.go: ConfigMap and Secret resource management
StatefulSet Scaling Support
Extended the internal/orchestration/scale/ package to support both Deployments and StatefulSets through a unified interface, enabling VictoriaMetrics StatefulSet scaling during restore operations.
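To illustrate what a unified scaling interface might look like, here is a minimal in-memory sketch. It does not reproduce the code in internal/orchestration/scale/: the `Scalable` interface, the `Workload` type, the workload names, and the annotation key `example.com/original-replicas` are all assumptions made for illustration, and no Kubernetes client is involved.

```go
package main

import (
	"fmt"
	"strconv"
)

// replicasAnnotation is a hypothetical key; the PR does not name the real one.
const replicasAnnotation = "example.com/original-replicas"

// Scalable is an assumed minimal surface shared by Deployments and StatefulSets.
type Scalable interface {
	Kind() string
	Name() string
	Replicas() int32
	SetReplicas(n int32)
	Annotations() map[string]string
}

// Workload is an in-memory stand-in for either workload kind.
type Workload struct {
	kind, name  string
	replicas    int32
	annotations map[string]string
}

func NewWorkload(kind, name string, replicas int32) *Workload {
	return &Workload{kind: kind, name: name, replicas: replicas, annotations: map[string]string{}}
}

func (w *Workload) Kind() string                   { return w.kind }
func (w *Workload) Name() string                   { return w.name }
func (w *Workload) Replicas() int32                { return w.replicas }
func (w *Workload) SetReplicas(n int32)            { w.replicas = n }
func (w *Workload) Annotations() map[string]string { return w.annotations }

// ScaleDown records the current replica count in an annotation, then scales to zero.
// An existing record is kept, so calling ScaleDown twice does not overwrite it with 0.
func ScaleDown(s Scalable) {
	ann := s.Annotations()
	if _, recorded := ann[replicasAnnotation]; !recorded {
		ann[replicasAnnotation] = strconv.FormatInt(int64(s.Replicas()), 10)
	}
	s.SetReplicas(0)
}

// ScaleUp restores the recorded replica count and removes the annotation.
func ScaleUp(s Scalable) error {
	ann := s.Annotations()
	raw, ok := ann[replicasAnnotation]
	if !ok {
		return fmt.Errorf("%s/%s: no recorded replica count", s.Kind(), s.Name())
	}
	n, err := strconv.ParseInt(raw, 10, 32)
	if err != nil {
		return fmt.Errorf("%s/%s: invalid replica annotation %q: %w", s.Kind(), s.Name(), raw, err)
	}
	s.SetReplicas(int32(n))
	delete(ann, replicasAnnotation)
	return nil
}

func main() {
	// Hypothetical workloads; the names are illustrative only.
	workloads := []Scalable{
		NewWorkload("StatefulSet", "example-victoria-metrics", 1),
		NewWorkload("Deployment", "example-api", 3),
	}
	for _, w := range workloads {
		ScaleDown(w)
		fmt.Printf("%s/%s scaled down to %d\n", w.Kind(), w.Name(), w.Replicas())
	}
	for _, w := range workloads {
		if err := ScaleUp(w); err != nil {
			fmt.Println("scale up failed:", err)
			continue
		}
		fmt.Printf("%s/%s restored to %d\n", w.Kind(), w.Name(), w.Replicas())
	}
}
```

Because callers only see `Scalable`, the same scale-down and scale-up code path can serve both workload kinds; a real implementation would back each method with the corresponding Kubernetes API object.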
📖 Updated architecture documentation in ARCHITECTURE.md and README.md
New Commands
victoriametrics list
Lists available VictoriaMetrics backups from Minio S3 storage.
Examples:
List backups for a single-node VM setup
List backups for an HA VM setup (mirroring by vmagent)
victoriametrics restore
Restores VictoriaMetrics from a backup archive with automatic StatefulSet scaling and Kubernetes Job orchestration.
Restore Workflow
- --latest flag: Automatically fetches the most recent backup
- --archive flag: Uses the explicitly provided backup name
- Warns that restore will PURGE all existing VictoriaMetrics data
- Displays backup file and namespace
- Prompts: Do you want to continue? (yes/no):
- Scales down affected StatefulSets to zero replicas
- Waits for all pods to terminate gracefully
- Stores original replica counts in annotations
- ConfigMap: Contains restore script
- Secret: Mounts Minio credentials
- Job: Executes restore in containers (one per HA instance)
- Without --background: Waits for completion and streams logs
- With --background: Returns immediately for monitoring separately
- Restores StatefulSets to original replica counts
- Triggered after job completion (or immediately with --background)
- Job is automatically cleaned up via TTL (24 hours after completion)
Usage:
sts-backup victoriametrics restore [flags]
Flags:
--archive string Specific backup name to restore (e.g., sts-victoria-metrics-backup/victoria-metrics-0-20251030143500)
--background Run restore job in background without waiting for completion
--latest Restore from the most recent backup
-y, --yes Skip confirmation prompt
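How --latest and --archive might resolve to a single backup name can be sketched as below. Several things here are assumptions rather than facts from the PR: the function name `ResolveArchive`, the rule that exactly one of the two flags must be given, and the idea that "most recent" can be decided by comparing the trailing timestamp in the backup name (the format seen in the example name above). The extra backup names in `main` are made up for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// ResolveArchive picks the backup to restore from the flag values.
// available is the list of backup names, as the list command might return them.
func ResolveArchive(latest bool, archive string, available []string) (string, error) {
	switch {
	case latest && archive != "":
		return "", errors.New("use either --latest or --archive, not both")
	case archive != "":
		return archive, nil // explicit name is used as given
	case !latest:
		return "", errors.New("one of --latest or --archive is required")
	case len(available) == 0:
		return "", errors.New("no backups available")
	}
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]string(nil), available...)
	sort.Slice(sorted, func(i, j int) bool {
		return timestamp(sorted[i]) < timestamp(sorted[j])
	})
	return sorted[len(sorted)-1], nil
}

// timestamp returns the text after the last '-' in the name.
// Assumption: this suffix is a fixed-width timestamp, so string order is time order.
func timestamp(name string) string {
	return name[strings.LastIndex(name, "-")+1:]
}

func main() {
	available := []string{
		"sts-victoria-metrics-backup/victoria-metrics-0-20251030143500",
		"sts-victoria-metrics-backup/victoria-metrics-0-20251029143500", // hypothetical
		"sts-victoria-metrics-backup/victoria-metrics-0-20251101090000", // hypothetical
	}
	name, err := ResolveArchive(true, "", available)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("restoring from:", name)
}
```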
Example 1: Restore Latest Backup
sts-backup victoriametrics restore --namespace <namespace> --latest --yes
Example 2: Restore Specific Backup in Background
sts-backup victoriametrics restore --namespace <namespace> \
  --archive sts-victoria-metrics-backup/victoria-metrics-0-20251030143500 \
  --background
Examples:
Restoring the latest available VM backup with auto-confirmation
Running restore operation in background
victoriametrics check-and-finalize
Checks the status of a background VictoriaMetrics restore job and cleans up resources.
Usage:
sts-backup victoriametrics check-and-finalize --job <job-name> [--wait] -n <namespace>
Flags:
-j, --job string VictoriaMetrics restore job name (required)
-w, --wait Wait for job to complete before cleanup
Note: This command automatically scales up StatefulSets that were scaled down during restore.
Example: Check Job Status
sts-backup victoriametrics check-and-finalize \
  --job victoriametrics-restore-20251104t143000 \
  -n <namespace>
Example: Wait for Completion and Cleanup
sts-backup victoriametrics check-and-finalize \
  --job victoriametrics-restore-20251104t143000 \
  --wait \
  -n <namespace>
Examples:
Waiting for the restore job to complete, then cleaning up and scaling the StatefulSets back up
Stackgraph Updates
The stackgraph commands now also benefit from the shared orchestration layer. The
check-and-finalize command was refactored to use the same orchestration functions as
VictoriaMetrics, ensuring consistent behavior across services.