feat(backup): nightly encrypted off-host backups via restic #41
Merged
Closes audit #3 (no backups). Biggest remaining production risk after the resource-limits deploy: a VPS disk failure today loses the mainnet seed phrase from config/config.json with no recovery path.

Adds an opinionated, backend-agnostic backup chain:

- `scripts/backup.sh` -- nightly snapshot of:
    - config/config.json (the seed phrase + Blockfrost key)
    - .env (REDIS_PASSWORD, MAINNET flag)
    - Redis AOF volume (dedup keys, UTXO reservations)
    - data/files/ (uploaded payment-gated content)

  Stages to /tmp, runs `restic backup` to an encrypted remote repo, prunes per retention policy (14 daily / 8 weekly / 12 monthly), verifies a 5% pack-file sample. Lock file prevents overlapping runs.
- `scripts/restore.sh` -- restores any snapshot to a target dir (defaults to /tmp). Intentionally NEVER restores over the live tree; recovery is a deliberate operator copy from the restored staging area.
- `scripts/cardano402-backup.cron` -- cron template, 03:00 UTC nightly.
- `scripts/cardano402-restic.env.example` -- credentials template with inline blocks for Backblaze B2 (default), Cloudflare R2, AWS S3, Hetzner Storage Box, and tailnet SFTP. The env file lives at /etc/cardano402/restic.env (mode 0600, root-owned), outside the repo.
- `docs/backup-restore.md` -- operator runbook: one-time setup, restore verification, routine ops, three disaster-recovery scenarios, what's intentionally NOT backed up.
- `docs/operations.md` -- new pointer to backup-restore.md.

Restic gives us: client-side AES-256 encryption, content-addressed dedup (the growing AOF costs almost nothing per night), one binary, many backend choices via one config. The encryption passphrase MUST be stored offline; the runbook is explicit about that.

No production state changes from this PR -- it just adds files. The backup loop activates only after the operator installs restic, fills in the env file, runs `restic init`, and copies the cron template into place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architecture
What's backed up
Not backed up (by design): Docker images, `node_modules`, application logs, git working tree.
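The nightly chain described in the commit message (take a lock, snapshot the staged files, prune to 14 daily / 8 weekly / 12 monthly, verify a 5% sample) might look roughly like the sketch below. This is not the contents of `scripts/backup.sh`: the lock and staging paths, the `nightly` tag, the `DRY_RUN` switch, and the function names are assumptions made for illustration, and the staging copy itself is left out. The restic subcommands and flags shown (`backup --tag`, `forget --keep-daily/--keep-weekly/--keep-monthly --prune`, `check --read-data-subset`) are standard restic options.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; this is not the contents of scripts/backup.sh.
set -euo pipefail

LOCK_FILE="${LOCK_FILE:-/tmp/cardano402-backup.lock}"   # hypothetical path
STAGE_DIR="${STAGE_DIR:-/tmp/cardano402-backup-stage}"  # hypothetical path
DRY_RUN="${DRY_RUN:-1}"  # 1 prints the restic commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Take the lock atomically: with noclobber set, the redirect fails
# if the lock file already exists.
acquire_lock() {
  ( set -o noclobber; echo "$$" > "$LOCK_FILE" ) 2>/dev/null
}

# One backup cycle: snapshot, prune per retention policy, verify a sample.
backup_once() {
  if ! acquire_lock; then
    echo "another backup run holds $LOCK_FILE" >&2
    return 1
  fi
  local status=0
  {
    run restic backup --tag nightly "$STAGE_DIR" &&
    run restic forget --keep-daily 14 --keep-weekly 8 --keep-monthly 12 --prune &&
    run restic check --read-data-subset=5%
  } || status=$?
  rm -f "$LOCK_FILE"   # release the lock whether or not the chain succeeded
  return "$status"
}
```

Calling `backup_once` with the default `DRY_RUN=1` only prints the three restic commands. With `DRY_RUN=0` it would need restic installed and the repository credentials already exported. A lock left behind by a killed run is not detected in this sketch and would have to be removed by hand.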
Files
What happens on merge
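Nothing in production changes. This PR only adds files to the repo. The backup loop activates only after a manual operator setup on the VPS: install restic, fill in the env file at `/etc/cardano402/restic.env`, run `restic init`, and copy the cron template into place.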
Full step-by-step in `docs/backup-restore.md` § "One-time setup".
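Before running `restic init`, an operator may want to confirm that the credentials file is in the state the commit message describes (mode 0600). The preflight sketch below is not part of this PR. The function names are hypothetical, and it assumes the env file sets restic's standard `RESTIC_REPOSITORY` and `RESTIC_PASSWORD` (or `RESTIC_PASSWORD_FILE`) variables, which the template may or may not do in exactly this form. Root ownership is not checked here.

```shell
#!/usr/bin/env bash
# Illustrative preflight check; not one of the scripts added by this PR.
set -euo pipefail

ENV_FILE="${ENV_FILE:-/etc/cardano402/restic.env}"

# Print the octal permission bits (GNU stat first, BSD stat as a fallback).
file_mode() {
  stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Return 0 only if the file exists, is mode 600, and names a repo and a password.
preflight() {
  local file="$1"
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  [ "$(file_mode "$file")" = "600" ] || { echo "mode must be 0600: $file" >&2; return 1; }
  grep -Eq '^(export )?RESTIC_REPOSITORY=' "$file" \
    || { echo "RESTIC_REPOSITORY not set in $file" >&2; return 1; }
  grep -Eq '^(export )?RESTIC_PASSWORD(_FILE)?=' "$file" \
    || { echo "RESTIC_PASSWORD not set in $file" >&2; return 1; }
}
```

A typical call would be `preflight "$ENV_FILE" && echo "env file looks ready"`. For the schedule, a nightly 03:00 run corresponds to the cron time fields `0 3 * * *`, which is 03:00 UTC only if the host clock is set to UTC.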
Why these design choices
Test plan
🤖 Generated with Claude Code