66 changes: 66 additions & 0 deletions docs/stack-reference.md
@@ -10,6 +10,19 @@ Think Spring Boot for self-hosted services on a Mac.

---

## Stacklet vs core

The distinction is **infrastructure vs feature**. Core hosts cross-cutting
infrastructure that every running stack needs: Watchtower (auto-updates),
the bot runtime, the API socket, the LLM tools server. Vertical stacklets
host features a user might rationally decline: photos, documents, backup,
the AI stack. The test: could a user legitimately want this off? If yes,
it belongs in its own stacklet, even when multiple other stacklets opt
into it via manifest contracts (e.g. `[[backup.archive]]` declared by
photos and docs, consumed by the `backup` stacklet).

---

## Directory Structure

A stacklet is a directory under `stacklets/` containing at minimum a
@@ -293,6 +306,44 @@ Subscribing to custom events on the bot side isn't wired yet — that
comes with the first consumer bot. The emit contract is stable and
tested end-to-end (see `tests/integration/test_archivist_e2e.py`).

### Backup

A stacklet declares the data it wants backed up. The `backup` stacklet
discovers these declarations across all enabled stacklets and routes them
through configured targets (see `stack.toml` → `[backup.targets.*]`).
The stacklet itself never reads the manifest — discovery is the runtime's
job.

```toml
# stacklets/photos/stacklet.toml
[[backup.archive]]
name = "library"
path = "{data_dir}/photos/library/library"
min_files = 10
```

| Field | Description |
|---|---|
| `name` | Short slug for this source. Combined with the stacklet id, this becomes the global source id (`photos/library`). Used in `stack backup status` output and (future) `--source=` selection. |
| `path` | Filesystem path to sync. Template variables from the rendered environment are available (`{data_dir}`, etc.). |
| `min_files` | Coarse ransomware smoke test. The engine counts files at `path` before syncing and refuses if the count is below this. The canary file is the precise tripwire; this is the dumb-and-cheap secondary check. Keep low enough that fresh installs don't trip it. |
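
The `min_files` check described in the table can be sketched in a few lines of Python. This is an illustration under assumptions, not the engine's actual code: `count_files`, `check_min_files`, and `TooFewFilesError` are hypothetical names.

```python
import os
from pathlib import Path


class TooFewFilesError(RuntimeError):
    """Hypothetical error: a source holds fewer files than min_files."""


def count_files(path: Path) -> int:
    """Count non-directory entries under path, recursively."""
    return sum(len(files) for _root, _dirs, files in os.walk(path))


def check_min_files(path: Path, min_files: int) -> int:
    """Return the file count, or raise if it is below min_files."""
    found = count_files(path)
    if found < min_files:
        raise TooFewFilesError(
            f"{path}: found {found} files, expected at least {min_files}"
        )
    return found
```

A sync would call `check_min_files` for each source before copying anything and abort on the exception.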

**`[[backup.archive]]`** declares an append-only store: files are added,
never modified, never deleted. The engine commits to kernel-enforced
immutability where the filesystem supports it. If a stacklet's data is
genuinely append-only (photo originals, archived PDFs), this is the
right section. (Storage-industry vocabulary calls this WORM — Write
Once Read Many.)

**`[[backup.snapshot]]`** is reserved for time-stamped point-in-time
captures of mutable state (Postgres dumps, Docker volume tarballs). Not
yet implemented — declare an `archive` section today; a `snapshot`
section will be added later when DB-restore semantics ship.

A stacklet may declare zero, one, or several entries of each kind. Sources
flow to every configured target whose engine supports the declared
section type.

---

## Lifecycle
@@ -434,6 +485,7 @@ stack destroy:
| `on_start_ready` | Every up | **Runs after health checks pass.** The service is healthy and accepting API calls. Seed data, sync accounts, anything that needs the service running. Must be idempotent. |
| `on_stop` | Every down | Stop native services. Only stops services we manage (.state/ markers). |
| `on_destroy` | Once | Remove native services entirely (unload plists, uninstall). |
| `on_restore` | On `stack backup restore` | **Reserved — not yet invoked.** Runs after the backup engine has put a stacklet's files back on disk. Owns stacklet-specific recovery: DB import, search-index rebuild, account re-seed. Photos can ship an empty stub (Immich re-indexes from the library on its own); Docs needs `pg_restore` + Paperless reindex. Hook signature will match the other `run(ctx)` hooks. |

**File resolution:** for each hook, the runtime looks for `.py` first,
then `.sh`. Only one can exist — not both. Python is preferred.
@@ -843,8 +895,22 @@ openai_key = "local"
default = "mlx-community/Qwen2.5-14B-Instruct-4bit"
whisper_url = "http://localhost:6111/v1"
language = "en"

[backup]
# One block per destination. Engine name selects the implementation.
[backup.targets.vault]
engine = "external-disk"
disk = "backup-vault"
schedule = "0 2 * * *"
```

`[backup.targets.<name>]` defines one destination. `<name>` is the user's
label for that destination (any string). The required `engine` field
picks the implementation under `stacklets/backup/engines/`. Today only
`external-disk` ships. Engine-specific fields (`disk`, `schedule`,
future `repository`, `password`) live alongside `engine` in the same
block.

Stacklets never read `stack.toml` directly. The runtime resolves
template variables and passes everything through the rendered `.env`.

62 changes: 58 additions & 4 deletions docs/user-guide.md
Expand Up @@ -508,15 +508,69 @@ All commands output JSON when piped. Use `--json` to force it, `--pretty` to for

## Backups

This is the part everyone skips and regrets. famstack puts every byte of user data under one directory:
This is the part everyone skips and regrets. famstack ships an opt-in backup stacklet for the irreplaceable file-level data (photo originals, scanned documents). It does not yet cover stacklet databases or your config files; for those, layer it with Time Machine or a periodic tar.

### The backup stacklet

Run `stack up backup`. You need an APFS-formatted external drive plugged in. The setup wizard asks for the disk name, encryption (default: plain APFS), and a nightly time (default 02:00). It then installs a cron entry that runs `stack backup sync` at that hour, and walks you through granting Full Disk Access to `/usr/sbin/cron` in System Settings. The FDA grant is what lets the scheduled sync reach the archive disk; without it, cron jobs are sandbox-blocked from `/Volumes/*` on macOS Catalina and later. One drag-and-drop, then every cron job on this Mac inherits the access.

**What gets backed up**

| Source | Path on the archive disk |
|---|---|
| Immich photo originals | `/Volumes/<disk>/data/photos-library/` |
| Paperless archived PDFs | `/Volumes/<disk>/data/docs-media/` |

Postgres databases for both are not backed up yet. You get your files back but lose albums, tags, custom fields, and saved views. `pg_dump` snapshots will ship as `[[backup.snapshot]]` in a later release.

**How the protection works**

Every file written to the archive gets the kernel `uchg` flag. macOS refuses to modify or delete `uchg` files, even under `sudo`, until the flag is explicitly cleared with `chflags nouchg`. Because the sync uses `rsync --ignore-existing`, files already in the archive are skipped on every run, so the backup is append-only by design and an accidental `rm -rf` on your main system cannot propagate.
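
To make the append-only idea concrete, here is a Python sketch of the same behaviour. It is not how the engine works (the engine uses `rsync` and `chflags`); `append_only_copy` is a hypothetical function that copies only files missing from the destination and then sets the user-immutable flag where the platform offers `os.chflags`.

```python
import os
import shutil
import stat
from pathlib import Path


def append_only_copy(src: Path, dest: Path, lock: bool = True) -> list[Path]:
    """Copy files missing from dest; never overwrite or delete.

    Returns the newly added destination paths.
    """
    added = []
    for root, _dirs, files in os.walk(src):
        rel = Path(root).relative_to(src)
        (dest / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            target = dest / rel / name
            if target.exists():
                continue  # same effect as rsync --ignore-existing
            shutil.copy2(Path(root) / name, target)
            if lock and hasattr(os, "chflags"):
                # UF_IMMUTABLE is the flag that `chflags uchg` sets.
                os.chflags(target, stat.UF_IMMUTABLE)
            added.append(target)
    return added
```

Because existing files are skipped and nothing is ever removed, a source that was emptied or altered leaves the archive unchanged.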

A **canary file** is a tripwire (named after the canary miners used to take underground to detect bad air). famstack plants a small file with known contents inside `~/famstack-data`; before every sync the engine reads it and refuses to proceed if the contents have changed. If something has been encrypting or modifying files under the data directory, the canary will not match what was planted and the sync aborts before opening the archive, so the corrupted state cannot propagate.
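
The canary check can be illustrated with a short Python sketch. The function names and the choice to compare SHA-256 digests are assumptions; the text above only says that the engine compares the file against known contents.

```python
import hashlib
from pathlib import Path


def plant_canary(path: Path, contents: bytes) -> str:
    """Write the canary and return the digest to remember for later checks."""
    path.write_bytes(contents)
    return hashlib.sha256(contents).hexdigest()


def canary_intact(path: Path, expected_digest: str) -> bool:
    """True only if the canary is readable and its contents are unchanged."""
    try:
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
    except OSError:
        return False  # missing or unreadable counts as tripped
    return actual == expected_digest
```

A sync would call `canary_intact` first and stop before touching the archive disk whenever it returns `False`.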

**Daily operation**

```bash
ls ~/famstack-data
stack backup sync # run a sync now (from any context)
stack backup status # last run, source counts, current mount state
stack down backup # remove the cron entry; canary, logs and archive contents stay
stack destroy backup # also remove BACKUP_DATA_DIR; archive disk and Keychain are preserved
```

That is what you back up. Time Machine works. So does `restic`, `rsync`, or copying it to an external drive every Sunday. Pick one and do it.
The scheduled nightly run leaves the disk mounted between runs (cron cannot trigger eject under the macOS sandbox; files are kernel-locked regardless). Manual `stack backup sync` from Terminal does eject when it finishes. Results post to the `#famstack` Matrix room via stacker-bot.

**Recovery (no special tooling needed)**

Plug the archive disk into any Mac and browse the files in Finder. To copy locked files out:

```bash
sudo chflags -R nouchg /Volumes/<disk>/data/photos-library/<folder>/
cp -R /Volumes/<disk>/data/photos-library/<folder>/ ~/recovered/
```

A `stack backup restore` command and `on_restore` hooks for database recovery are planned but not yet shipped.

**What it protects you from**

| Threat | Covered |
|---|---|
| Ransomware encrypts your Mac | Yes (uchg + canary) |
| Accidental `rm -rf` on `~/famstack-data` | Yes (rsync never deletes from the archive) |
| You delete a photo on your phone | Yes (Immich propagates the delete to disk; the archive keeps the original) |
| Vault drive stolen from your house | Only if you opted in to APFS encryption |
| Vault drive hardware failure | No (single physical copy; offsite engine planned) |
| Fire or flood | No (archive is in the same building; offsite engine planned) |

**Limitations to know about**

Only APFS or HFS+ disks attached via USB or Thunderbolt are supported. Network shares (SMB/NFS/Synology) are refused at probe time because the kernel cannot enforce `uchg` over a network filesystem. Only one target is supported; encrypted offsite backup via restic is planned, not shipped.

### Everything else

`~/famstack-data` holds every byte of user data. Even with the backup stacklet running you may want Time Machine or `restic` against this directory for full coverage (databases, AI model caches).

What is **not** in `~/famstack-data` and therefore needs separate handling:
What is **not** in `~/famstack-data` and needs separate handling:

- `stack.toml`, `users.toml`, `.stack/secrets.toml` in the repo. Small but irreplaceable. Gitignored on purpose, so they will not survive a fresh `git clone`.
- oMLX models (in `~/.omlx/models`). Re-downloadable.
16 changes: 16 additions & 0 deletions stack.example.toml
Expand Up @@ -85,6 +85,22 @@ whisper_url = "http://localhost:42062/v1"
# "en" = English (alloy voice), "de" = German (onyx voice)
language = "en"

[backup]
# Nightly append-only backup of stacklet data. Enable the backup stacklet
# (`stack up backup`) to install the cron entry and FDA-granted .app
# wrapper that runs the sync. Each [backup.targets.*] block defines one
# destination; sources are discovered from stacklets that declare
# [[backup.archive]] (append-only) in their own manifest.
#
# In v1 only the external-disk engine ships. Restic-based encrypted
# offsite is planned and will slot in as a second target without
# touching this file's existing entries.

# [backup.targets.vault]
# engine = "external-disk" # rsync + chflags uchg + diskutil eject
# disk = "backup-vault" # APFS volume name (case-sensitive)
# schedule = "0 2 * * *" # 5-field cron, runs nightly at 02:00

[services]
# Homepage dashboard URL — set automatically when core is enabled.
homepage_url = "http://localhost:3000"
119 changes: 119 additions & 0 deletions stacklets/backup/README.md
@@ -0,0 +1,119 @@
# backup — append-only backup of stacklet data

## What it does

Coordinates nightly backups of stacklet data to attached external disks.
The model is an **append-only archive**: files are added, never modified,
never deleted. Once a photo or document lands on the backup disk, the
kernel itself refuses to let anything change it. The threat model is
*ransomware, accidents, mistakes* — not a sophisticated targeted
attack.

See `engines/external-disk/README.md` for the full protection layers and
the rationale behind each one.

## Architecture

Backup is **a coordinator, not a backup tool**. The actual work is done
by *engines*, each of which implements one well-defined backup strategy
with explicit guarantees:

| Engine | Status | What it does |
|---|---|---|
| `external-disk` | scaffolded, port pending | rsync + chflags uchg on attached APFS disk |
| `restic` | planned | encrypted, deduplicated, snapshotted offsite (S3/B2) |

Sources are discovered from other stacklets via a manifest contract.
Every stacklet that declares `[[backup.archive]]` (an append-only store)
in its `stacklet.toml` contributes one source path to the next sync:

```toml
# stacklets/photos/stacklet.toml
[[backup.archive]]
name = "library"
path = "{data_dir}/photos/library/library"
min_files = 10
```

Targets are configured in `stack.toml`. Today the only target is the
attached disk:

```toml
# stack.toml
[backup]
[backup.targets.vault]
engine = "external-disk"
disk = "backup-vault"
```

Routing: every `[[backup.archive]]` source flows to every target whose
engine supports append-only semantics. Adding a second target later
(offsite restic) is purely additive — no manifest change on photos/docs.
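
The routing rule amounts to a filtered cross product of sources and targets. The sketch below is illustrative only: `ENGINE_SECTIONS`, `route`, and the `docs/media` and `docs/db` source ids are hypothetical.

```python
# Assumption: which manifest section types each engine can handle.
ENGINE_SECTIONS = {"external-disk": {"archive"}}


def route(sources: list[tuple[str, str]], targets: dict[str, str]) -> list[tuple[str, str]]:
    """Pair each source with every target whose engine supports its type.

    sources holds (source id, section type) pairs;
    targets maps a target name to its engine name.
    """
    return [
        (source_id, target_name)
        for source_id, kind in sources
        for target_name, engine in targets.items()
        if kind in ENGINE_SECTIONS.get(engine, set())
    ]
```

Adding a second target only adds pairs to the result; the source list, and therefore each stacklet's manifest, stays the same.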

## CLI

```
stack backup sync [--dry-run] [--no-eject] [--verbose]
stack backup status # last run, source counts, cron presence
```

Per-stacklet aliases (`stack photos backup`, `stack docs backup`) and
restore (`stack backup restore --source=…`) are intentionally not in
v1 — they'll layer on once the engine port lands and the manifest
contract has been exercised on at least one production sync.

## Destroy semantics

`stack destroy backup` removes the backup *tooling* — never the
*backups*. Specifically:

- **Removed:** cron entry, local logs, canary file under BACKUP_DATA_DIR.
- **Preserved:** every file on the archive disk. The whole point of an
append-only archive is that it outlives the system that wrote it.
- **Preserved:** the macOS Keychain entry for the disk passphrase
(encrypted archives only). The user may want manual disk access after
uninstall; the command to remove it is surfaced if they want a fully
clean state.
- **Preserved:** the Full Disk Access grant on `/usr/sbin/cron`. It
also covers any other cron jobs on the system; the user can remove
it manually if they prefer.

Defensive measure: `on_configure` refuses to let `BACKUP_DATA_DIR`
point at a path under `/Volumes/`. That way the framework's automatic
data-dir cleanup at destroy time can never accidentally reach external
storage.
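
A minimal sketch of such a guard, assuming a hypothetical `validate_backup_data_dir` helper called from `on_configure`. It normalizes `..` segments but does not resolve symlinks, which a real check may also want to do.

```python
import posixpath
from pathlib import PurePosixPath

VOLUMES = PurePosixPath("/Volumes")


def validate_backup_data_dir(value: str) -> PurePosixPath:
    """Reject any BACKUP_DATA_DIR that is /Volumes or lives below it."""
    # normpath collapses ".." and "." so the check cannot be sidestepped
    # with a path like /Users/me/../../Volumes/disk.
    path = PurePosixPath(posixpath.normpath(value))
    if path == VOLUMES or VOLUMES in path.parents:
        raise ValueError(f"BACKUP_DATA_DIR must not be under /Volumes: {value}")
    return path
```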

## Recovery without restore tooling

The v1 engine writes plain files in plain directory structures. No
restore CLI exists yet — but you don't need one to get your photos
back:

```bash
# 1. Plug the archive disk into any Mac and unlock it (Finder prompts
# for the passphrase if encrypted)

# 2. Browse to the originals
ls /Volumes/backup-vault/data/photos/library/

# 3. Files are immutable. Unlock the ones you want to recover:
sudo chflags -R nouchg /Volumes/backup-vault/data/photos/library/

# 4. Copy them wherever you need
cp -R /Volumes/backup-vault/data/photos/library/ ~/recovered-photos/
```

This is the "survivalist" property the append-only design buys: no
special software needed to read the archive. The future restore CLI will
automate this and run stacklet-specific recovery via `on_restore`
hooks (DB import, search-index rebuild). For v1, manual recovery
is the documented path.

## Status

This stacklet is currently **scaffold only**. The hooks and CLI files
raise `NotImplementedError`. The next step is porting `vault-sync.sh`
from `family-server/backup/` into `engines/external-disk/`, with two
adaptations: source discovery via the manifest contract, and Matrix
notifications via the local `stacker-bot` instead of the legacy
`kit-control-bot`.