Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 3 additions & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ roaring = "0.11"
aws-lc-rs = "1"
quick-xml = "0.40"
chrono = { version = "0.4", default-features = false, features = ["clock", "serde"] }
tar = "0.4"
astral-tokio-tar = "0.6"
postgres-protocol = "0.6"
fallible-iterator = "0.2"
tokio-rustls = { version = "0.26", default-features = false, features = ["aws-lc-rs", "tls12"] }
Expand All @@ -50,9 +50,11 @@ rustls-pki-types = "1"
rustls-pemfile = "2"
webpki-roots = "1"
dryoc = { version = "0.8", default-features = false, features = ["u64_backend"] }
libc = "0.2"

[dev-dependencies]
tempfile = "3"
tar = "0.4"

[features]
# Enabled only on the VM test runner: hits a real PG cluster at PGPORT
Expand Down
10 changes: 6 additions & 4 deletions bench/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,8 @@
Reproducible single-host benchmark comparing three PostgreSQL 18 WAL archivers on
**throughput** and **memory** under heavy write load:

- **walrus** (this repo, Rust) — serial wal-push daemon
- **wal-g** (Go) — fan-out daemon (`WALG_UPLOAD_CONCURRENCY`)
- **walrus** (this repo, Rust) — look-ahead fan-out daemon (`WALG_UPLOAD_CONCURRENCY`; pre-uploads `concurrency-1` segments, streaming per-upload, no full-segment buffer)
- **wal-g** (Go) — fan-out daemon (same `WALG_UPLOAD_CONCURRENCY`)
- **pgbackrest** (C) — daemonless; PG forks `archive-push`, async `process-max` workers

All three are driven identically: PG `archive_command` → the tool's own client → S3.
Expand Down Expand Up @@ -118,7 +118,7 @@ daemon (~27 MB for walrus; wal-g's fan-out daemon adds more baseline).

| OP | walrus / wal-g | pgbackrest | measures |
|---|---|---|---|
| `backup-send` | `backup-push --full` | `backup --type=full` | full base backup → S3 |
| `backup-send` | `backup-push <PGDATA> --full` | `backup --type=full` | full base backup → S3 |
| `backup-fetch` | `backup-fetch <dst> LATEST` | `restore` | restore ← S3 |
| `backup-delta` | `backup-push` (delta, `wi1`) | `backup --type=incr` | delta backup → S3 |
| `backup-delta-summaries` | `backup-push --delta-from-wal-summaries` | — (walrus-only) | delta from PG17 WAL summaries → S3 |
Expand Down Expand Up @@ -174,7 +174,9 @@ Notes:
## Config knobs

See `config.env.example`. Common ones: `UPLOAD_CONCURRENCY` (wal-g concurrency /
pgbackrest `process-max`), `SCALE` (pgbench DB size), `CHURN_ROWS`, `BURST_SECONDS`,
pgbackrest `process-max`; also seeds `WALG_DOWNLOAD_CONCURRENCY` so `backup-fetch`
scales with the same knob — set `DOWNLOAD_CONCURRENCY` to decouple), `SCALE`
(pgbench DB size), `CHURN_ROWS`, `BURST_SECONDS`,
`BURST_WORKERS`. `matrix.sh` honors `DAEMONS` (and `RUN_ID`). Operation benchmarks add
`RESTORE_DIR`, `WAL_RECV_DIR`, `WAL_RECEIVE_SECONDS`, `DELTA_CHURN_SECONDS`,
`DELTA_MAX_STEPS`, `DELTA_ORIGIN`; `op_matrix.sh` honors `OPS`, `TOOLS` (and
Expand Down
4 changes: 4 additions & 0 deletions bench/op_matrix.sh
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,10 @@
# Skipped cells: pgbackrest has no wal-receive equivalent; backup-delta-summaries
# is walrus-only (no wal-g / pgbackrest WAL-summary delta). Override OPS / TOOLS
# via env. Counterpart of matrix.sh (archive path).
#
# backup-delta-chain (DELTA_MAX_STEPS-deep chain + leaf restore) is omitted from
# the default sweep — it churns once per step, so its cost scales with depth. Opt
# in with OPS="backup-send backup-delta-chain" (backup-send must precede it).
set -euo pipefail

SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
Expand Down
148 changes: 135 additions & 13 deletions bench/run_op.sh
Original file line number Diff line number Diff line change
Expand Up @@ -2,20 +2,23 @@
#
# run_op.sh OP TOOL RUN_ID
#
# OP - backup-send | backup-fetch | backup-delta |
# backup-delta-summaries | wal-receive (data-movement operation)
# OP - backup-send | backup-fetch | backup-delta | backup-delta-summaries |
# backup-delta-chain | wal-receive (data-movement operation)
# TOOL - walrus | walg | pgbackrest (implementation)
# RUN_ID - free-form label, e.g. r1 / 2026-06-22
#
# Benchmarks ONE data-movement operation with ONE tool, single-host (PG + tool
# local), cross-tool where an equivalent exists. Counterpart of run.sh, which
# benches the archive_command (wal-push) path; this covers the rest of walrus:
#
# backup-send base backup -> S3 walrus/wal-g backup-push --full | pgbackrest backup --type=full
# backup-send base backup -> S3 walrus/wal-g backup-push ... --full |
# pgbackrest backup --type=full
# backup-fetch restore <- S3 walrus/wal-g backup-fetch | pgbackrest restore
# backup-delta delta backup -> S3 walrus/wal-g backup-push (wi1) | pgbackrest backup --type=incr
# backup-delta-summaries delta from WAL walrus backup-push | (walrus-only)
# summaries -> S3 --delta-from-wal-summaries
# backup-delta-chain N-deep delta chain walrus/wal-g backup-push xN | pgbackrest backup --type=incr xN
# + restore of leaf (origin=LATEST), then backup-fetch LATEST
# wal-receive stream WAL from PG walrus/wal-g wal-receive | (no pgbackrest peer)
#
# Delta cells need a parent full backup (backup-send must precede them) and a
Expand All @@ -28,6 +31,14 @@
# anchor to chain root. Delta size is S3-inventory byte growth across the push,
# not on-disk cluster size.
#
# backup-delta-chain builds a real DELTA_MAX_STEPS-deep chain: each step churns,
# drains, then pushes a delta with WALG_DELTA_ORIGIN=LATEST so it extends the
# PREVIOUS delta (LATEST_FULL would re-anchor each to the root, leaving restore
# depth 2). Every step is timed + sized on its own (chain_metrics.txt), then a
# backup-fetch LATEST walks full + all N deltas to exercise restore-time replay.
# Its churn is per-step and INSIDE the sampler window, so the daemon's archiving
# during churn is sampled too; the per-step push timings isolate the push.
#
# walrus's walsender (serving WAL via the replication protocol) has no CLI entry
# point yet, so wal-send is intentionally absent.
#
Expand Down Expand Up @@ -55,16 +66,16 @@ LOG_TAG=op
load_config

if [[ $# -ne 3 ]]; then
echo "usage: $0 <backup-send|backup-fetch|backup-delta|backup-delta-summaries|wal-receive> <walrus|walg|pgbackrest> <run_id>" >&2
echo "usage: $0 <backup-send|backup-fetch|backup-delta|backup-delta-summaries|backup-delta-chain|wal-receive> <walrus|walg|pgbackrest> <run_id>" >&2
exit 2
fi
OP="$1"
TOOL="$2"
RUN_ID="$3"

case "${OP}" in
backup-send|backup-fetch|backup-delta|backup-delta-summaries|wal-receive) ;;
*) echo "error: OP must be backup-send|backup-fetch|backup-delta|backup-delta-summaries|wal-receive, got '${OP}'" >&2; exit 2 ;;
backup-send|backup-fetch|backup-delta|backup-delta-summaries|backup-delta-chain|wal-receive) ;;
*) echo "error: OP must be backup-send|backup-fetch|backup-delta|backup-delta-summaries|backup-delta-chain|wal-receive, got '${OP}'" >&2; exit 2 ;;
esac
case "${TOOL}" in
walrus|walg|pgbackrest) ;;
Expand All @@ -80,17 +91,20 @@ if [[ "${OP}" == "backup-delta-summaries" && "${TOOL}" != "walrus" ]]; then
exit 2
fi

# Delta ops drive a churn phase, then a delta push; group them for branch tests.
# Single-delta ops drive one churn phase, then one delta push; group for branches.
IS_DELTA=0
[[ "${OP}" == "backup-delta" || "${OP}" == "backup-delta-summaries" ]] && IS_DELTA=1
# Chain op churns + pushes per step inside the timed loop (not the single step 1b).
IS_CHAIN=0
[[ "${OP}" == "backup-delta-chain" ]] && IS_CHAIN=1

# Backup-push ops (full + delta) take a base backup, whose pg_backup_stop blocks
# on BackupWaitWalArchive until the backup's WAL is archived. So the tool's
# archiver MUST stay live across these cells (the sampler then sees the op
# process plus the mostly-idle daemon; for walrus that baseline is ~27 MB).
# backup-fetch (restore) and wal-receive need no archiver.
NEEDS_ARCHIVE=0
case "${OP}" in backup-send|backup-delta|backup-delta-summaries) NEEDS_ARCHIVE=1 ;; esac
case "${OP}" in backup-send|backup-delta|backup-delta-summaries|backup-delta-chain) NEEDS_ARCHIVE=1 ;; esac

: "${BUCKET:?set BUCKET in config.env}"
: "${PGUSER:?set PGUSER in config.env}"
Expand Down Expand Up @@ -122,7 +136,7 @@ WAL_RECEIVE_SECONDS="${WAL_RECEIVE_SECONDS:-300}"
# Delta cells: churn window that dirties pages between the parent full and the
# delta push, and the delta-chain depth handed to walrus/wal-g (WALG_DELTA_MAX_STEPS).
DELTA_CHURN_SECONDS="${DELTA_CHURN_SECONDS:-300}"
DELTA_MAX_STEPS="${DELTA_MAX_STEPS:-7}"
DELTA_MAX_STEPS="${DELTA_MAX_STEPS:-3}"
DELTA_ORIGIN="${DELTA_ORIGIN:-LATEST_FULL}"

case "${TOOL}" in
Expand Down Expand Up @@ -161,6 +175,28 @@ inv_size() {
| awk '/Total Size:/ {print $3}' | tail -1
}

# Fail fast if no parent backup exists for a delta to anchor to. Without one,
# backup-push silently emits a FULL (mislabeled as a delta) and inv-growth sizing
# reports a full's bytes. op_matrix runs backup-send first; this guards lone runs.
assert_delta_parent() {
local roots
if [[ "${TOOL}" == "pgbackrest" ]]; then
# full backup-set dirs end in 'F/'; incr (delta) dirs end in 'I/'
roots="$(sudo aws s3 ls "s3://${BUCKET}${PGBACKREST_REPO_PATH}/backup/${PGBACKREST_STANZA}/" \
--region "${AWS_REGION}" 2>/dev/null | awk '/ PRE / && /F\/$/ {n++} END{print n+0}')"
else
# walrus/wal-g chain root = base_<lsn> without the _D_ delta suffix
roots="$(sudo aws s3 ls "${WALG_PREFIX}/basebackups_005/" \
--region "${AWS_REGION}" 2>/dev/null | awk '/ PRE base_/ && !/_D_/ {n++} END{print n+0}')"
fi
if [[ "${roots:-0}" -eq 0 ]]; then
echo "error: no parent full backup under ${INV_PREFIX}; run backup-send ${TOOL} ${RUN_ID} first" >&2
echo " (a delta with no parent silently becomes a full, corrupting the measurement)" >&2
exit 1
fi
log "parent check: ${roots} full backup(s) under ${INV_PREFIX}"
}

# --- pre-flight: DB seeded? (backup-send + wal-receive need a populated DB) ---
[[ "${OP}" == "backup-fetch" ]] || require_seeded

Expand Down Expand Up @@ -197,7 +233,7 @@ sudo -u postgres pgbackrest --stanza="${STANZA}" stanza-create || true
# backup (full or incr) needs WAL archiving live (pgbackrest blocks on the
# start-WAL archive), so point archive_command at pgbackrest and drain. restore
# reads only the repo. backup-delta (incr) churns + drains in the delta-prep step.
if [[ "${OP}" == "backup-send" || "${OP}" == "backup-delta" ]]; then
if [[ "${OP}" == "backup-send" || "${OP}" == "backup-delta" || "${OP}" == "backup-delta-chain" ]]; then
ARCHIVE_CMD="pgbackrest --stanza=${STANZA} archive-push %p"
sudo -u postgres "${PGBIN}/psql" -p 5432 -tA \
-c "ALTER SYSTEM SET archive_library = '';" \
Expand Down Expand Up @@ -238,6 +274,9 @@ if [[ "${OP}" == "backup-send" || "${OP}" == "wal-receive" ]]; then
CHECKPOINT_BEFORE_WORKLOAD=1
fi

# Delta ops must extend an existing full; bail before churning if none exists.
[[ "${IS_DELTA}" -eq 1 || "${IS_CHAIN}" -eq 1 ]] && assert_delta_parent

# --- step 1b: delta prep — churn between the parent full and the delta push ---
# The default delta map walks ARCHIVED WAL, so the churn WAL must reach the repo
# before the push. The tool's archiver is already live (step 1, NEEDS_ARCHIVE)
Expand Down Expand Up @@ -272,7 +311,7 @@ case "${OP}" in
backup-send)
log "base backup -> ${INV_PREFIX} (full)"
case "${TOOL}" in
walrus) run_tool "${WALRUS_BIN}" backup-push --full ;;
walrus) run_tool "${WALRUS_BIN}" backup-push "${PGDATA_DIR}" --full ;;
walg) run_tool "${WALG_BIN}" backup-push "${PGDATA_DIR}" --full ;;
pgbackrest) sudo -u postgres pgbackrest --stanza="${PGBACKREST_STANZA}" backup --type=full ;;
esac
Expand All @@ -285,7 +324,7 @@ case "${OP}" in
case "${TOOL}" in
walrus) run_tool env WALG_DELTA_MAX_STEPS="${DELTA_MAX_STEPS}" \
WALG_DELTA_ORIGIN="${DELTA_ORIGIN}" \
"${WALRUS_BIN}" backup-push --pgdata "${PGDATA_DIR}" ;;
"${WALRUS_BIN}" backup-push "${PGDATA_DIR}" ;;
walg) run_tool env WALG_DELTA_MAX_STEPS="${DELTA_MAX_STEPS}" \
WALG_DELTA_ORIGIN="${DELTA_ORIGIN}" \
"${WALG_BIN}" backup-push "${PGDATA_DIR}" ;;
Expand All @@ -300,10 +339,93 @@ case "${OP}" in
log "delta-from-wal-summaries backup -> ${INV_PREFIX} (origin=${DELTA_ORIGIN}; parent inventory ${inv_before} B)"
run_tool env WALG_DELTA_MAX_STEPS="${DELTA_MAX_STEPS}" \
WALG_DELTA_ORIGIN="${DELTA_ORIGIN}" \
"${WALRUS_BIN}" backup-push --pgdata "${PGDATA_DIR}" --delta-from-wal-summaries
"${WALRUS_BIN}" backup-push "${PGDATA_DIR}" --delta-from-wal-summaries
inv_after="$(inv_size)"; inv_after="${inv_after:-0}"
BYTES=$(( inv_after - inv_before )); (( BYTES < 0 )) && BYTES=0
;;
backup-delta-chain)
# Build a DELTA_MAX_STEPS-deep chain (origin=LATEST: each delta extends the
# prior one). Per step: churn, drain, then time + size the push alone. BYTES
# accumulates per-step delta payloads (not END-START inventory: that would
# also count the inter-step churn WAL). chain_metrics.txt holds the breakdown.
DELTA_ORIGIN=LATEST
CHAIN_METRICS="${RESULT_DIR}/chain_metrics.txt"
push_s_total=0
chain_rows=""
log "delta chain: ${DELTA_MAX_STEPS} steps (origin=LATEST, cap WALG_DELTA_MAX_STEPS=${DELTA_MAX_STEPS}) -> ${INV_PREFIX}"
for ((i=1; i<=DELTA_MAX_STEPS; i++)); do
log "chain step ${i}/${DELTA_MAX_STEPS}: checkpoint + churn ${DELTA_CHURN_SECONDS}s"
checkpoint_pg
CHECKPOINT_BEFORE_WORKLOAD=1
CH_ENV=(PGHOST="${PGHOST_DRIVER}" PGUSER="${PGUSER}" PGPASSWORD="${PGPASSWORD}"
DURATION="${DELTA_CHURN_SECONDS}" CHURN_ROWS="${CHURN_ROWS:-2000000}")
[[ -n "${BURST_WORKERS:-}" ]] && CH_ENV+=("WORKERS=${BURST_WORKERS}")
if ! env "${CH_ENV[@]}" bash "${SCRIPT_DIR}/scripts/driver/workload_burst.sh"; then
mark_invalid "chain step ${i} churn degraded (non-comparable delta)"
fi
drain_backlog 5 600
step_before="$(inv_size)"; step_before="${step_before:-0}"
step_t0="$(date +%s.%N)"
case "${TOOL}" in
walrus) run_tool env WALG_DELTA_MAX_STEPS="${DELTA_MAX_STEPS}" WALG_DELTA_ORIGIN=LATEST \
"${WALRUS_BIN}" backup-push "${PGDATA_DIR}" ;;
walg) run_tool env WALG_DELTA_MAX_STEPS="${DELTA_MAX_STEPS}" WALG_DELTA_ORIGIN=LATEST \
"${WALG_BIN}" backup-push "${PGDATA_DIR}" ;;
pgbackrest) sudo -u postgres pgbackrest --stanza="${PGBACKREST_STANZA}" backup --type=incr ;;
esac
step_t1="$(date +%s.%N)"
step_after="$(inv_size)"; step_after="${step_after:-0}"
step_bytes=$(( step_after - step_before )); (( step_bytes < 0 )) && step_bytes=0
step_s="$(awk -v a="${step_t0}" -v b="${step_t1}" 'BEGIN{printf "%.3f", b-a}')"
step_mbps="$(awk -v by="${step_bytes}" -v s="${step_s}" 'BEGIN{printf "%.2f",(s>0)?by/1e6/s:0}')"
push_s_total="$(awk -v a="${push_s_total}" -v b="${step_s}" 'BEGIN{printf "%.3f", a+b}')"
BYTES=$(( BYTES + step_bytes ))
log "chain step ${i}: elapsed=${step_s}s delta=${step_bytes} B (${step_mbps} MB/s)"
chain_rows+="step=${i} elapsed_s=${step_s} bytes=${step_bytes} mb_s=${step_mbps}"$'\n'
done

log "chain restore: backup-fetch LATEST (walks full + ${DELTA_MAX_STEPS} deltas) -> ${RESTORE_DIR}"
run_root "${RESTORE_DIR}" <<'REMOTE'
set -euo pipefail
RESTORE_DIR="$1"
rm -rf "${RESTORE_DIR}"
install -d -o postgres -g postgres "${RESTORE_DIR}"
REMOTE
restore_t0="$(date +%s.%N)"
case "${TOOL}" in
walrus) run_tool "${WALRUS_BIN}" backup-fetch "${RESTORE_DIR}" LATEST ;;
walg) run_tool "${WALG_BIN}" backup-fetch "${RESTORE_DIR}" LATEST ;;
pgbackrest)
sudo -u postgres pgbackrest --stanza="${PGBACKREST_STANZA}" \
--pg1-path="${RESTORE_DIR}" --type=none restore ;;
esac
restore_t1="$(date +%s.%N)"
restore_s="$(awk -v a="${restore_t0}" -v b="${restore_t1}" 'BEGIN{printf "%.3f", b-a}')"
restore_bytes="$(sudo du -sb "${RESTORE_DIR}" | awk '{print $1}')"
log "chain restore: elapsed=${restore_s}s restored=${restore_bytes} B"
sudo rm -rf "${RESTORE_DIR}"

run_root "${CHAIN_METRICS}" "${TOOL}" "${RUN_ID}" "${DELTA_MAX_STEPS}" \
"${push_s_total}" "${BYTES}" "${restore_s}" "${restore_bytes}" "${chain_rows}" <<'REMOTE'
set -euo pipefail
CHAIN_METRICS="$1"; TOOL="$2"; RUN_ID="$3"; STEPS="$4"; PUSH_S_TOTAL="$5"
TOTAL_BYTES="$6"; RESTORE_S="$7"; RESTORE_BYTES="$8"; ROWS="$9"
{
echo "op=backup-delta-chain"
echo "tool=${TOOL}"
echo "run_id=${RUN_ID}"
echo "delta_origin=LATEST"
echo "chain_steps=${STEPS}"
printf '%s' "${ROWS}"
echo "push_s_total=${PUSH_S_TOTAL}"
echo "chain_delta_bytes=${TOTAL_BYTES}"
echo "restore_s=${RESTORE_S}"
echo "restore_bytes=${RESTORE_BYTES}"
} >"${CHAIN_METRICS}"
chown postgres:postgres "${CHAIN_METRICS}" 2>/dev/null || true
cat "${CHAIN_METRICS}"
REMOTE
;;
backup-fetch)
log "restore LATEST -> ${RESTORE_DIR}"
run_root "${RESTORE_DIR}" <<'REMOTE'
Expand Down
2 changes: 1 addition & 1 deletion bench/scripts/sut/05_install_pgbackrest.sh
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@
set -euo pipefail

BUCKET="${BUCKET:-${1:-}}"
UPLOAD_CONCURRENCY="${UPLOAD_CONCURRENCY:-${2:-16}}"
UPLOAD_CONCURRENCY="${UPLOAD_CONCURRENCY:-${2:-4}}"
AWS_REGION="${AWS_REGION:-us-east-1}"
STANZA="${PGBACKREST_STANZA:-walbench}"
REPO_PATH="${PGBACKREST_REPO_PATH:-/pgbackrest-bench}"
Expand Down
Loading