Small CLI utility to incrementally sync rows from one PostgreSQL database into another for the htn-db-filler schema.
It reads from the source DB in batches and upserts each batch into the target DB (INSERT ... ON CONFLICT ... DO UPDATE).
- Syncs selected tables from source → target
- By default discovers and syncs every common table in the chosen schema
- Tracks a per-table cursor (by primary key ordering) in the target (default table: htn_db_sync_state) so repeated runs can resume
- Avoids pg_dump (works better for very large databases)
- Python 3.10+
- Connection auth via libpq env vars, .pgpass, or your local Postgres config
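The batched read and upsert described above can be sketched as two SQL builders. This is a minimal illustration, not the tool's actual code: the function names are hypothetical, the `%s` placeholders assume a psycopg-style driver, and the row-value comparison is one common way to resume from a primary-key cursor.

```python
from typing import Sequence


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_batch_select(schema: str, table: str, pk_cols: Sequence[str], resume: bool) -> str:
    """Keyset-paginated SELECT: rows strictly after the saved cursor, in PK order."""
    pk = ", ".join(quote_ident(c) for c in pk_cols)
    sql = f"SELECT * FROM {quote_ident(schema)}.{quote_ident(table)}"
    if resume:
        # A row-value comparison also covers composite primary keys.
        placeholders = ", ".join(["%s"] * len(pk_cols))
        sql += f" WHERE ({pk}) > ({placeholders})"
    return sql + f" ORDER BY {pk} LIMIT %s"


def build_upsert(schema: str, table: str, columns: Sequence[str], pk_cols: Sequence[str]) -> str:
    """INSERT ... ON CONFLICT on the primary key, updating every non-key column."""
    cols = ", ".join(quote_ident(c) for c in columns)
    values = ", ".join(["%s"] * len(columns))
    pk = ", ".join(quote_ident(c) for c in pk_cols)
    non_pk = [c for c in columns if c not in pk_cols]
    if non_pk:
        updates = ", ".join(f"{quote_ident(c)} = EXCLUDED.{quote_ident(c)}" for c in non_pk)
        action = f"DO UPDATE SET {updates}"
    else:
        # A table made only of key columns has nothing to update.
        action = "DO NOTHING"
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    return f"INSERT INTO {target} ({cols}) VALUES ({values}) ON CONFLICT ({pk}) {action}"
```

In a loop, the primary key of the last row in each batch would become the cursor values for the next `build_batch_select(..., resume=True)` query, and that same key is what a state table such as htn_db_sync_state would need to store.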
This repo’s filler uses SQL_URI; this tool uses:
- SOURCE_SQL_URI (e.g. postgresql://user:pass@host:5432/source_db)
- TARGET_SQL_URI (e.g. postgresql://user:pass@host:5432/target_db)
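As a rough sketch of how the two variables might be resolved, the helper below returns an explicitly supplied URI if there is one and otherwise falls back to the environment. The function name `resolve_uri` and the precedence rule (explicit value over environment variable) are assumptions for illustration; the tool's real behavior may differ.

```python
import os
from typing import Mapping, Optional


def resolve_uri(
    explicit: Optional[str],
    env_name: str,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the explicit URI if given, else the value of env_name."""
    if explicit:
        return explicit
    value = environ.get(env_name, "").strip()
    if not value:
        # Fail early with a message that names the missing variable.
        raise ValueError(f"no URI given: pass it explicitly or set {env_name}")
    return value
```

For example, `resolve_uri(None, "SOURCE_SQL_URI")` reads the source URI from the environment and raises `ValueError` if the variable is unset or empty.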
cd htn-db-sync
python -m pip install -e .
# Run via console script (if your venv bin is on PATH)
htn-db-sync sync
# Or run via module (always works)
python -m htn_db_sync.cli sync

By default it will discover all common tables in the public schema and sync them in primary-key order.
Or pass URIs explicitly:
htn-db-sync sync \
--source "postgresql://.../source" \
--target "postgresql://.../target"

- --schema public
- --tables all (default)
- --tables blocks,transactions,... (explicit list)
- --batch-size 5000
- --dry-run (no writes)
- --conflict newest-wins (default), --conflict source-wins, or --conflict target-wins
- --freshness-column block_time (override which column defines “newest” for newest-wins)
- Prefer using .pgpass over embedding passwords in URIs.
- Tables must have a primary key to be reconciled. Tables without PKs are skipped.
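One plausible way the three --conflict policies could map onto PostgreSQL's ON CONFLICT clause is sketched below. The mapping is an assumption made for illustration, not a description of the tool's internals; `conflict_action` is a hypothetical helper, and the identifiers are assumed to be plain names with no embedded quotes.

```python
from typing import Optional, Sequence


def conflict_action(
    policy: str,
    table: str,
    update_cols: Sequence[str],
    freshness_col: Optional[str] = None,
) -> str:
    """Build the action that follows ON CONFLICT (<primary key>)."""
    if policy == "target-wins" or not update_cols:
        # Keep the existing target row untouched.
        return "DO NOTHING"
    updates = ", ".join(f'"{c}" = EXCLUDED."{c}"' for c in update_cols)
    if policy == "source-wins":
        # Always overwrite the target row with the source row.
        return f"DO UPDATE SET {updates}"
    if policy == "newest-wins":
        if freshness_col is None:
            raise ValueError("newest-wins needs a freshness column")
        # Overwrite only when the incoming row is strictly newer.
        return (
            f"DO UPDATE SET {updates} "
            f'WHERE EXCLUDED."{freshness_col}" > "{table}"."{freshness_col}"'
        )
    raise ValueError(f"unknown conflict policy: {policy}")
```

The primary-key requirement follows from this shape: ON CONFLICT needs a unique constraint or index to detect a clash, and the primary key is the natural one to use.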