Merged
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -65,6 +65,25 @@ behaves identically.

## [Unreleased]

### Fix: backups with a non-default tablespace (#17)

A backup of a cluster that has any user tablespace failed to commit with
`backup.manifest_invalid: backup_label is empty (required for restore)`.
PG streams the base/default tablespace archive — the one carrying
`backup_label` and `tablespace_map` — *last* when user tablespaces exist,
but the tar sink only looked for those files in the first archive. It now
captures them from whichever archive holds them, so multi-tablespace
clusters back up (and restore) correctly.

### Fix: `pg_hardstorage demo` now actually runs (#15)

The `demo` command previously printed a one-line description and exited
without doing anything. It now runs the real end-to-end flow: start a
throwaway PostgreSQL in Docker, initialise a repo, back up, restore,
verify, then clean up. It drives your `docker` CLI, so a non-default daemon
set via `DOCKER_HOST` (Lima, Colima, Podman) is honoured, and it reports a
clear error if Docker isn't reachable instead of silently succeeding.

### Packaging: remove the obsolete homebrew-formula.json manifest

Dropped `scripts/homebrew-formula.json`, a leftover hand-maintained tap
4 changes: 3 additions & 1 deletion README.md
@@ -58,7 +58,9 @@ pg_hardstorage demo
```

The demo prints progress and a result summary. No existing PostgreSQL or
pg_hardstorage configuration is needed — just a running Docker daemon.
pg_hardstorage configuration is needed — just a running Docker daemon. It
drives your `docker` CLI, so a non-default daemon set via `DOCKER_HOST`
(Lima, Colima, Podman, a remote socket) is picked up automatically.

### One-command setup

2 changes: 1 addition & 1 deletion docs/reference/cli/index.md
@@ -58,7 +58,7 @@ generated from the Cobra command tree on every
| [`pg_hardstorage db`](pg_hardstorage_db.md) | In-database integration (SQL views, upsert helpers) |
| [`pg_hardstorage db install-extension`](pg_hardstorage_db_install-extension.md) | Install the pg_hardstorage in-DB extension (creates schema + tables + views + functions) |
| [`pg_hardstorage db uninstall-extension`](pg_hardstorage_db_uninstall-extension.md) | Remove the pg_hardstorage in-DB schema |
| [`pg_hardstorage demo`](pg_hardstorage_demo.md) | Run a 60-second demo with temporary PG 18 via Docker |
| [`pg_hardstorage demo`](pg_hardstorage_demo.md) | Run a throwaway end-to-end demo (init → backup → restore → verify) on a temporary PG in Docker |
| [`pg_hardstorage deployment`](pg_hardstorage_deployment.md) | Manage deployments (add/list/remove/edit/test) |
| [`pg_hardstorage deployment add`](pg_hardstorage_deployment_add.md) | Add a new deployment to the config |
| [`pg_hardstorage deployment edit`](pg_hardstorage_deployment_edit.md) | Update fields on an existing deployment |
2 changes: 1 addition & 1 deletion docs/reference/cli/pg_hardstorage.md
@@ -59,7 +59,7 @@ sandbox-PG runtime — extend per docs/SPEC.md.
* [pg_hardstorage compliance](pg_hardstorage_compliance.md) - Compliance reporting (SOC 2 / ISO 27001 / HIPAA / PCI / FedRAMP-friendly)
* [pg_hardstorage cost](pg_hardstorage_cost.md) - Per-deployment / per-tenant repository cost
* [pg_hardstorage db](pg_hardstorage_db.md) - In-database integration (SQL views, upsert helpers)
* [pg_hardstorage demo](pg_hardstorage_demo.md) - Run a 60-second demo with temporary PG 18 via Docker
* [pg_hardstorage demo](pg_hardstorage_demo.md) - Run a throwaway end-to-end demo (init → backup → restore → verify) on a temporary PG in Docker
* [pg_hardstorage deployment](pg_hardstorage_deployment.md) - Manage deployments (add/list/remove/edit/test)
* [pg_hardstorage doctor](pg_hardstorage_doctor.md) - Run health checks and suggest fixes
* [pg_hardstorage dsa](pg_hardstorage_dsa.md) - GDPR Data Subject Access helper: locate which backups contain a subject's data
2 changes: 1 addition & 1 deletion docs/reference/cli/pg_hardstorage_demo.md
@@ -12,7 +12,7 @@ tags:

## pg_hardstorage demo

Run a 60-second demo with temporary PG 18 via Docker
Run a throwaway end-to-end demo (init → backup → restore → verify) on a temporary PG in Docker

```
pg_hardstorage demo [flags]
34 changes: 21 additions & 13 deletions internal/backup/tarsink/tarsink.go
@@ -548,20 +548,28 @@ func (s *Sink) parseTar(ctx context.Context, r io.Reader, idx int) ([]backup.Fil
continue
}

// Special files captured for the manifest.
if idx == 0 {
switch hdr.Name {
case BackupLabelName:
if err := s.captureSpecial(tr, &s.backupLabel); err != nil {
return files, dirs, fmt.Errorf("read %s: %w", hdr.Name, err)
}
continue
case TablespaceMapName:
if err := s.captureSpecial(tr, &s.tablespaceMap); err != nil {
return files, dirs, fmt.Errorf("read %s: %w", hdr.Name, err)
}
continue
// Special files captured for the manifest. backup_label and
// tablespace_map sit at the root of the base/default tablespace's
// tar — and PG streams that archive LAST when user tablespaces
// exist, not first. Keying on idx==0 therefore silently dropped
// backup_label whenever a non-default tablespace was present,
// producing a manifest that fails its own invariant check and
// refuses to commit (issue #17). Match by name in whichever
// archive carries them instead: the exact-name match can only
// fire for the base tar, since user-tablespace entries are nested
// under PG_<ver>_<cat>/... and never named exactly "backup_label"
// or "tablespace_map" at the root.
switch hdr.Name {
case BackupLabelName:
if err := s.captureSpecial(tr, &s.backupLabel); err != nil {
return files, dirs, fmt.Errorf("read %s: %w", hdr.Name, err)
}
continue
case TablespaceMapName:
if err := s.captureSpecial(tr, &s.tablespaceMap); err != nil {
return files, dirs, fmt.Errorf("read %s: %w", hdr.Name, err)
}
continue
}

entry, err := s.chunkFile(ctx, hdr, tr)
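
The name-based capture in this hunk can be illustrated outside the sink. The following is a minimal, self-contained sketch, not the project's implementation: `buildTar`, `captureSpecials`, and the `captured` type are hypothetical names invented for this illustration, and only the standard library's `archive/tar` is used. It feeds a user-tablespace archive first and the base archive second, mirroring the ordering described in issue #17, and shows that matching on the exact entry name finds `backup_label` and `tablespace_map` regardless of archive index.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// buildTar is a hypothetical helper: it packs (name, body) pairs into an
// in-memory tar, standing in for one streamed tablespace archive.
func buildTar(files [][2]string) []byte {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Name: f[0], Typeflag: tar.TypeReg, Mode: 0o600, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			panic(err)
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			panic(err)
		}
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// captured holds the special files plus, per archive index, the names of
// all entries that were treated as regular files.
type captured struct {
	label, tablespaceMap []byte
	regular              [][]string
}

// captureSpecials walks the archives in stream order and records the
// special files by exact entry name, whichever archive carries them.
func captureSpecials(archives [][]byte) (captured, error) {
	c := captured{regular: make([][]string, len(archives))}
	for idx, a := range archives {
		tr := tar.NewReader(bytes.NewReader(a))
		for {
			hdr, err := tr.Next()
			if err == io.EOF {
				break
			}
			if err != nil {
				return c, fmt.Errorf("archive %d: %w", idx, err)
			}
			switch hdr.Name {
			case "backup_label", "tablespace_map":
				body, err := io.ReadAll(tr)
				if err != nil {
					return c, fmt.Errorf("read %s: %w", hdr.Name, err)
				}
				if hdr.Name == "backup_label" {
					c.label = body
				} else {
					c.tablespaceMap = body
				}
			default:
				c.regular[idx] = append(c.regular[idx], hdr.Name)
			}
		}
	}
	return c, nil
}

func main() {
	userTS := buildTar([][2]string{{"PG_18_202209061/16384/12345", "relfile"}})
	baseTS := buildTar([][2]string{
		{"backup_label", "START WAL LOCATION: 0/23000168\n"},
		{"tablespace_map", "16384 /tablespaces/tbs1\n"},
		{"PG_VERSION", "18\n"},
	})
	c, err := captureSpecials([][]byte{userTS, baseTS}) // base archive is index 1
	if err != nil {
		panic(err)
	}
	fmt.Printf("label=%q map=%q regular=%v\n", c.label, c.tablespaceMap, c.regular)
}
```

An `idx == 0` gate in `captureSpecials` would leave `c.label` empty for this input, which is the failure mode the hunk removes.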
63 changes: 44 additions & 19 deletions internal/backup/tarsink/tarsink_test.go
@@ -322,32 +322,57 @@ func TestTarsink_BackupLabel_And_TablespaceMap_Captured(t *testing.T) {
}
}

func TestTarsink_SpecialFiles_OnlyInTablespaceZero(t *testing.T) {
// TestTarsink_BackupLabel_CapturedFromBaseTablespace_NotIndexZero is the
// regression for issue #17: when a non-default tablespace exists, PG
// streams the user tablespace archive(s) FIRST and the base/default
// tablespace (the one carrying backup_label + tablespace_map) LAST. The
// sink must capture those special files from whichever archive holds
// them, not assume tablespace index 0. The previous idx==0 gate dropped
// backup_label here, leaving an empty manifest field that failed its own
// invariant check ("backup_label is empty (required for restore)") and
// refused to commit the backup.
func TestTarsink_BackupLabel_CapturedFromBaseTablespace_NotIndexZero(t *testing.T) {
sink, _ := newSinkAndCAS(t)
// A second tablespace also has a file called backup_label — but it
// is NOT the special one (only tablespace 0's is). The test confirms
// we don't intercept it.
t0 := buildTar(t, []fileSpec{
{name: "backup_label", body: []byte("real label")},

wantLabel := []byte("START WAL LOCATION: 0/23000168 (file 000000010000000000000023)\n")
wantMap := []byte("16384 /data/postgresql/18/tablespaces/tbs1\n")

// idx 0 — the USER tablespace (tbs1). Its tar entries are nested
// under PG_<ver>_<cat>/<dboid>/...; there is no root backup_label.
userTS := buildTar(t, []fileSpec{
{name: "PG_18_202209061/16384/12345", body: []byte("user tablespace relfile")},
})
t1 := buildTar(t, []fileSpec{
{name: "backup_label", body: []byte("not actually a label, just a same-named file")},
// idx 1 — the BASE/default tablespace, streamed last, carrying the
// special files at its root.
baseTS := buildTar(t, []fileSpec{
{name: "backup_label", body: wantLabel},
{name: "tablespace_map", body: wantMap},
{name: "PG_VERSION", body: []byte("18\n")},
{name: "global/pg_control", body: []byte("control")},
})
if err := drive(t, sink, 0, basebackup.TablespaceInfo{OID: 1663}, t0, 0); err != nil {
t.Fatal(err)

if err := drive(t, sink, 0, basebackup.TablespaceInfo{OID: 16384}, userTS, 0); err != nil {
t.Fatalf("drive user tablespace: %v", err)
}
if err := drive(t, sink, 1, basebackup.TablespaceInfo{OID: 16384}, t1, 0); err != nil {
t.Fatal(err)
if err := drive(t, sink, 1, basebackup.TablespaceInfo{OID: 0}, baseTS, 0); err != nil {
t.Fatalf("drive base tablespace: %v", err)
}

// Tablespace 0's backup_label is captured.
if string(sink.BackupLabel()) != "real label" {
t.Errorf("BackupLabel mismatch: %q", sink.BackupLabel())
if !bytes.Equal(sink.BackupLabel(), wantLabel) {
t.Errorf("backup_label not captured from the base tablespace (issue #17): got %q", sink.BackupLabel())
}
if !bytes.Equal(sink.TablespaceMap(), wantMap) {
t.Errorf("tablespace_map not captured from the base tablespace: got %q", sink.TablespaceMap())
}
// The special files must not leak into the base tablespace's file list.
for _, f := range sink.Files(1) {
if f.Path == "backup_label" || f.Path == "tablespace_map" {
t.Errorf("special file %q leaked into Files(1)", f.Path)
}
}
// Tablespace 1's same-named file IS in Files (it's not the special one).
files1 := sink.Files(1)
if len(files1) != 1 || files1[0].Path != "backup_label" {
t.Errorf("tablespace 1 should keep its backup_label as a regular file: %+v", files1)
// The user tablespace's real relfile is preserved as a normal file.
if files0 := sink.Files(0); len(files0) != 1 || files0[0].Path != "PG_18_202209061/16384/12345" {
t.Errorf("user tablespace file list = %+v, want the single relfile", files0)
}
}

201 changes: 201 additions & 0 deletions internal/cli/demo.go
@@ -0,0 +1,201 @@
// demo.go — `pg_hardstorage demo`: a self-contained, throwaway
// end-to-end run (init repo → backup → restore → verify) against a
// temporary PostgreSQL spun up in Docker. It exists so a brand-new user
// can see the whole flow work in under a couple of minutes without
// configuring anything.
//
// Previously this command only printed a one-line description and
// exited 0 — it never touched Docker (issue #15). It now drives the
// real flow through the `docker` CLI (which honours DOCKER_HOST, so
// Lima / Colima / Podman-with-docker-shim setups work) and the same
// subcommands an operator would run by hand.
//
// The orchestration is written against a small commandRunner seam so
// the full step sequence, error handling, and cleanup are unit-testable
// without a Docker daemon; the real end-to-end run is exercised in CI.
package cli

import (
"bufio"
"context"
"fmt"
"io"
"os"
"os/exec"
"strings"
"time"

"github.com/spf13/cobra"

"github.com/cybertec-postgresql/pg_hardstorage/internal/output"
)

// demoImage is the PostgreSQL image the demo runs. Kept here as a
// single constant so a supported-major bump is a one-line change.
const demoImage = "postgres:18"

// commandRunner runs an external command and returns its combined
// output. The seam lets tests drive the demo without a Docker daemon
// or a second pg_hardstorage process.
type commandRunner interface {
run(ctx context.Context, name string, args ...string) (string, error)
}

// execRunner is the production commandRunner: it shells out for real.
type execRunner struct{}

func (execRunner) run(ctx context.Context, name string, args ...string) (string, error) {
out, err := exec.CommandContext(ctx, name, args...).CombinedOutput()
return string(out), err
}

func newDemoCmdImpl() *cobra.Command {
return &cobra.Command{
Use: "demo",
Short: "Run a throwaway end-to-end demo (init → backup → restore → verify) on a temporary PG in Docker",
Args: cobra.NoArgs,
SilenceUsage: true,
RunE: func(cmd *cobra.Command, _ []string) error {
d := DispatcherFrom(cmd)
self, err := os.Executable()
if err != nil {
self = "pg_hardstorage"
}
if err := runDemo(cmd.Context(), cmd.ErrOrStderr(), execRunner{}, self); err != nil {
return err
}
return d.Result(output.NewResult(cmd.CommandPath()).WithBody(map[string]any{
"status": "ok",
"message": "demo completed: a temporary PostgreSQL was backed up, restored, and verified, then cleaned up",
}))
},
}
}

// runDemo executes the end-to-end demo. progress is written to w as the
// flow advances; r runs docker + self subcommands; self is the path to
// this binary (used to invoke the real backup/restore/verify verbs).
func runDemo(ctx context.Context, w io.Writer, r commandRunner, self string) error {
step := func(format string, a ...any) { fmt.Fprintf(w, " → "+format+"\n", a...) }

// 1. Preflight: Docker must be reachable. `docker info` fails fast
// and clearly when the daemon (or DOCKER_HOST) isn't set up.
fmt.Fprintln(w, "pg_hardstorage demo — spinning up a throwaway PostgreSQL in Docker")
if out, err := r.run(ctx, "docker", "info"); err != nil {
return output.NewError("demo.docker_unavailable",
"demo: Docker does not appear to be reachable").
WithSuggestion(&output.Suggestion{
Human: "start Docker (Docker Desktop / Colima / Lima / Podman) and ensure the daemon is up. " +
"If your socket isn't the default, set DOCKER_HOST (e.g. " +
"export DOCKER_HOST=unix:///path/to/docker.sock). Underlying error: " + firstLine(strings.TrimSpace(out)),
}).Wrap(err)
}

// 2. Start PG. POSTGRES_HOST_AUTH_METHOD=trust makes the official
// image emit a pg_hba `host replication all all trust` line, so
// BASE_BACKUP over the replication protocol works out of the box.
// Publishing 5432 to an ephemeral host port avoids collisions.
step("starting %s", demoImage)
cid, err := r.run(ctx, "docker", "run", "-d", "--rm",
"-e", "POSTGRES_HOST_AUTH_METHOD=trust",
"-P", demoImage,
"-c", "wal_level=replica", "-c", "max_wal_senders=10")
if err != nil {
return output.NewError("demo.start_failed",
"demo: could not start the PostgreSQL container: "+firstLine(strings.TrimSpace(cid))).Wrap(err)
}
cid = strings.TrimSpace(cid)
// Always tear the container down, even on a mid-flow failure.
defer func() {
_, _ = r.run(context.WithoutCancel(ctx), "docker", "rm", "-f", cid)
}()

// 3. Resolve the published host port for 5432.
portOut, err := r.run(ctx, "docker", "port", cid, "5432/tcp")
if err != nil {
return output.NewError("demo.port_failed",
"demo: could not resolve the container's published port: "+firstLine(strings.TrimSpace(portOut))).Wrap(err)
}
hostPort, err := parseDockerPort(portOut)
if err != nil {
return output.NewError("demo.port_failed", "demo: "+err.Error()).Wrap(err)
}

// 4. Wait for PG to accept connections.
step("waiting for PostgreSQL to become ready")
if err := waitForPG(ctx, r, cid); err != nil {
return err
}

// 5. Throwaway repo + restore target.
repoDir, err := os.MkdirTemp("", "pg_hardstorage-demo-repo-")
if err != nil {
return output.NewError("internal", "demo: temp repo: "+err.Error()).Wrap(err)
}
defer func() { _ = os.RemoveAll(repoDir) }()
restoreDir, err := os.MkdirTemp("", "pg_hardstorage-demo-restore-")
if err != nil {
return output.NewError("internal", "demo: temp restore dir: "+err.Error()).Wrap(err)
}
defer func() { _ = os.RemoveAll(restoreDir) }()

repoURL := "file://" + repoDir
dsn := fmt.Sprintf("postgres://postgres@127.0.0.1:%s/postgres?sslmode=disable", hostPort)

// 6. The real flow, through the same verbs an operator runs.
flow := []struct {
label string
args []string
}{
{"initialising repository", []string{"repo", "init", repoURL}},
{"taking a base backup", []string{"backup", "demo", "--pg-connection", dsn, "--repo", repoURL}},
{"restoring the backup", []string{"restore", "demo", "latest", "--repo", repoURL, "--target", restoreDir}},
{"verifying the backup", []string{"verify", "demo", "latest", "--repo", repoURL}},
}
for _, s := range flow {
step("%s", s.label)
if out, err := r.run(ctx, self, s.args...); err != nil {
return output.NewError("demo.step_failed",
fmt.Sprintf("demo: step %q failed: %s", s.label, firstLine(strings.TrimSpace(out)))).Wrap(err)
}
}

fmt.Fprintln(w, "✓ demo complete — backup, restore, and verify all succeeded; cleaning up")
return nil
}

// waitForPG polls pg_isready inside the container until PG accepts
// connections or the budget runs out.
func waitForPG(ctx context.Context, r commandRunner, cid string) error {
deadline := time.Now().Add(60 * time.Second)
for {
if _, err := r.run(ctx, "docker", "exec", cid, "pg_isready", "-U", "postgres"); err == nil {
return nil
}
if time.Now().After(deadline) {
return output.NewError("demo.pg_not_ready",
"demo: PostgreSQL did not become ready within 60s").Wrap(context.DeadlineExceeded)
}
select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(time.Second):
}
}
}

// parseDockerPort extracts the host port from `docker port` output,
// which looks like "0.0.0.0:49153" (optionally with extra IPv6 lines).
func parseDockerPort(out string) (string, error) {
sc := bufio.NewScanner(strings.NewReader(out))
for sc.Scan() {
line := strings.TrimSpace(sc.Text())
if i := strings.LastIndex(line, ":"); i >= 0 && i < len(line)-1 {
port := line[i+1:]
if port != "" {
return port, nil
}
}
}
return "", fmt.Errorf("could not parse a host port from docker output %q", strings.TrimSpace(out))
}
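
For a quick check of how `parseDockerPort` treats typical inputs, the function can be run in isolation. The sketch below copies its body from the diff above; the `main` driver and the sample strings are illustrative assumptions, not captured `docker port` output.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseDockerPort is copied from demo.go: it returns whatever follows the
// last ':' on the first line that has a non-empty suffix after it.
func parseDockerPort(out string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if i := strings.LastIndex(line, ":"); i >= 0 && i < len(line)-1 {
			port := line[i+1:]
			if port != "" {
				return port, nil
			}
		}
	}
	return "", fmt.Errorf("could not parse a host port from docker output %q", strings.TrimSpace(out))
}

func main() {
	samples := []string{
		"0.0.0.0:49153\n",             // IPv4 binding only
		"0.0.0.0:49153\n[::]:49153\n", // IPv4 plus IPv6 binding
		"[::]:32768\n",                // IPv6 binding only
		"",                            // no output at all
	}
	for _, s := range samples {
		port, err := parseDockerPort(s)
		fmt.Printf("%q -> port=%q err=%v\n", s, port, err)
	}
}
```

Because it splits on the last colon, the bracketed IPv6 form parses the same way as the IPv4 form. The function does not check that the suffix is numeric, so a caller that needs that guarantee would have to validate the result separately (for example with `strconv.Atoi`).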