40 changes: 40 additions & 0 deletions cmd/genesis-replay/Makefile
@@ -0,0 +1,40 @@
BINARY := genesis-replay
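EXT_DATA_DIR ?= .docker-data
# Export the default so `docker compose` can interpolate ${EXT_DATA_DIR};
# make does not pass makefile-defined variables to recipes unless exported.
export EXT_DATA_DIR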

.PHONY: build keygen up down test-integration run verify clean

## build: compile the genesis-replay binary
build:
	go build -o $(BINARY) .

## keygen: generate a new genesis migration keypair
keygen: build
	./$(BINARY) keygen

## up: start the docker-compose test stack (detached)
up:
	docker compose up -d --wait

## down: tear down the stack and remove Docker-managed named volumes
down:
	docker compose down -v

## test-integration: run the full end-to-end integration test
## Prerequisites: stack must be running (make up)
test-integration:
	go test -v -tags integration -run TestGenesisReplay -timeout 20m ./...

## run: replay entities from a source DB to a bootstrap chain
## Required env vars: GENESIS_SRC_DSN, GENESIS_CHAIN_URL, GENESIS_MIGRATION_PRIVATE_KEY
run: build
	./$(BINARY) run

## verify: diff entities and plays between two discovery-provider databases
## Required env vars: GENESIS_SRC_DSN, GENESIS_DEST_DSN
verify: build
	./$(BINARY) verify

## clean: remove the compiled binary and data directories under EXT_DATA_DIR (default: .docker-data)
clean:
	rm -f $(BINARY)
	rm -rf "$(EXT_DATA_DIR)/test-validator" "$(EXT_DATA_DIR)/test-pg-data"
199 changes: 199 additions & 0 deletions cmd/genesis-replay/README.md
@@ -0,0 +1,199 @@
# genesis-replay

Bootstraps a new Core chain with full historical Audius state by replaying
synthetic `ManageEntity` and `TrackPlay` transactions sourced from a
discovery-provider PostgreSQL database.

## How it works

The discovery provider indexes entity data from on-chain `ManageEntity`
transactions. Because all historical data lives in the DP's postgres DB rather
than on-chain, a new chain starts empty — `genesis-replay` bridges the gap by
re-submitting every entity as a genesis migration transaction signed by a
dedicated keypair. The DP treats transactions from `genesis_migration_address`
as trusted and indexes them without wallet ownership checks.

## Commands

### `keygen`

Generates a new Ethereum keypair for use as the genesis migration identity.

```
genesis-replay keygen
```

Prints the address and private key. Set the address as `genesis_migration_address`
in the chain's genesis JSON (`pkg/core/config/genesis/prod-v2.json`), then pass
the private key to `genesis-replay run`.

---

### `run`

Reads every current, non-deleted entity from the source DB and submits it to
the bootstrap chain as a `ManageEntity` transaction; plays are submitted as
`TrackPlay` transactions.

```
genesis-replay run \
--src-dsn <postgres_dsn> \
--chain-url <bootstrap_url> \
--private-key <hex_privkey> \
[--network prod|stage|dev] \
[--concurrency 500] \
[--batch-size 1000] \
[--skip-users] [--skip-tracks] [--skip-playlists] [--skip-social] [--skip-plays]
```

All flags can also be set via environment variables:

| Flag | Env var | Default | Description |
|------|---------|---------|-------------|
| `--src-dsn` | `GENESIS_SRC_DSN` | — | Source PostgreSQL DSN |
| `--chain-url` | `GENESIS_CHAIN_URL` | `http://localhost:50051` | Bootstrap chain URL |
| `--private-key` | `GENESIS_MIGRATION_PRIVATE_KEY` | — | Genesis migration private key (hex) |
| `--network` | `NETWORK` | `prod` | EIP-712 signing domain (`prod`, `stage`, `dev`) |
| `--concurrency` | `GENESIS_CONCURRENCY` | `500` | Concurrent transaction submissions |
| `--batch-size` | `GENESIS_BATCH_SIZE` | `1000` | Rows fetched per DB query |
| `--skip-users` | `GENESIS_SKIP_USERS` | `false` | Skip user replay |
| `--skip-tracks` | `GENESIS_SKIP_TRACKS` | `false` | Skip track replay |
| `--skip-playlists` | `GENESIS_SKIP_PLAYLISTS` | `false` | Skip playlist replay |
| `--skip-social` | `GENESIS_SKIP_SOCIAL` | `false` | Skip follows/saves/reposts replay |
| `--skip-plays` | `GENESIS_SKIP_PLAYS` | `false` | Skip play replay |

Entities are replayed in dependency order: users → tracks → playlists → social → plays.
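
As a rough sketch of how the dependency order and the `--skip-*` flags could combine, the function below filters a fixed stage list. The stage names mirror the order stated above; the `planStages` function and the map-based skip set are illustrative assumptions, not the tool's real code.

```go
package main

import "fmt"

// replayOrder lists the stages in the dependency order from the README.
var replayOrder = []string{"users", "tracks", "playlists", "social", "plays"}

// planStages returns the stages to run, preserving dependency order and
// dropping any stage marked true in skip (hypothetical helper).
func planStages(skip map[string]bool) []string {
	stages := make([]string, 0, len(replayOrder))
	for _, s := range replayOrder {
		if !skip[s] {
			stages = append(stages, s)
		}
	}
	return stages
}

func main() {
	// Equivalent in spirit to passing --skip-social.
	fmt.Println(planStages(map[string]bool{"social": true}))
	// prints "[users tracks playlists plays]"
}
```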

For maximum throughput, point `--chain-url` directly at the node's gRPC port (`http://host:50051`)
rather than the HTTPS ingress. This bypasses nginx and TLS, roughly tripling the submission rate.

---

### `verify`

Streams two discovery-provider databases in sorted order and performs a
merge comparison to detect missing, extra, or changed rows.

```
genesis-replay verify \
--src <postgres_dsn> \
--dst <postgres_dsn> \
[--max-samples 10] \
[--skip-plays]
```

| Flag | Env var | Default | Description |
|------|---------|---------|-------------|
| `--src` | `GENESIS_SRC_DSN` | — | Source (reference) database |
| `--dst` | `GENESIS_DEST_DSN` | — | Destination database |
| `--max-samples` | `GENESIS_MAX_SAMPLES` | `10` | Mismatch rows to print per entity type |
| `--skip-plays` | `GENESIS_SKIP_PLAYS` | `false` | Skip play-count verification |

Produces a summary table:

```
entity src dst missing extra different status
------ --- --- ------- ----- --------- ------
users 1234567 1234567 0 0 0 OK
tracks 4567890 4567890 0 0 0 OK
plays 890123 890123 0 0 0 OK
```

**Exit codes**: `0` = all checks pass, `1` = mismatches found, `2` = fatal error.

Plays are compared as per-track aggregate counts (`GROUP BY play_item_id`) so the
command is safe to run against a full production dataset with billions of play rows.
Memory usage is O(1) — only two rows are held in memory at any time.
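
The merge comparison can be illustrated with a small Go sketch. It walks two ID-sorted streams with one current row from each side, which is where the constant memory bound comes from. The `Row`, `Diff`, `sliceIter`, and `exitCode` names are assumptions for the example; the real command streams from PostgreSQL rather than from slices, and exit code `2` (fatal error) is not modelled here.

```go
package main

import "fmt"

// Row is a simplified record: a sort key plus a digest of compared columns.
type Row struct {
	ID   int64
	Hash string
}

// Diff counts the three mismatch classes from the summary table.
type Diff struct{ Missing, Extra, Different int }

// sliceIter adapts a slice to a pull-style iterator (stand-in for a DB cursor).
func sliceIter(rows []Row) func() (Row, bool) {
	i := 0
	return func() (Row, bool) {
		if i >= len(rows) {
			return Row{}, false
		}
		r := rows[i]
		i++
		return r, true
	}
}

// mergeCompare walks two ID-sorted streams, holding one row from each.
func mergeCompare(nextSrc, nextDst func() (Row, bool)) Diff {
	var d Diff
	s, sok := nextSrc()
	t, tok := nextDst()
	for sok || tok {
		switch {
		case !tok || (sok && s.ID < t.ID): // in src only
			d.Missing++
			s, sok = nextSrc()
		case !sok || t.ID < s.ID: // in dst only
			d.Extra++
			t, tok = nextDst()
		default: // same ID on both sides
			if s.Hash != t.Hash {
				d.Different++
			}
			s, sok = nextSrc()
			t, tok = nextDst()
		}
	}
	return d
}

// exitCode maps a result to 0 (all checks pass) or 1 (mismatches found).
func exitCode(d Diff) int {
	if d.Missing+d.Extra+d.Different > 0 {
		return 1
	}
	return 0
}

func main() {
	src := []Row{{1, "a"}, {2, "b"}, {3, "c"}, {5, "e"}}
	dst := []Row{{1, "a"}, {3, "X"}, {4, "d"}, {5, "e"}}
	d := mergeCompare(sliceIter(src), sliceIter(dst))
	fmt.Printf("%+v exit=%d\n", d, exitCode(d))
	// prints "{Missing:1 Extra:1 Different:1} exit=1"
}
```

The approach only works if both queries use the same `ORDER BY`; an unsorted stream would cause matching rows to be reported as missing or extra.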

---

## Discovery provider patches

The DP must be rebuilt from source with the genesis-mode patch and run with the
following env var set. It relaxes validation rules that are incompatible with
historical data:

```
AUDIUS_GENESIS_MIGRATION_MODE=true
```

This disables, for `CREATE` actions only:
- Entity ID minimum offset checks (historical IDs are below the DP's offset thresholds)
- User wallet uniqueness checks (one signing key is used for all entities)
- Handle/name bad word filtering (historical handles may trip false positives)
- Bio/name character limit enforcement
- Signer ownership validation

From the `apps` repo root:

```bash
docker build -t audius-discovery-provider:latest \
-f packages/discovery-provider/Dockerfile.prod \
packages/discovery-provider
```

The docker-compose stack sets `AUDIUS_GENESIS_MIGRATION_MODE=true` automatically.

---

## Local development

The docker-compose stack in this directory provides everything needed to run the
integration test locally: a postgres instance pre-seeded with both the source
data and the discovery-provider schema, a local Core node, and a
discovery-provider service.

```bash
# Start the stack with data under the default EXT_DATA_DIR (.docker-data/)
make up

# Start the stack with data on an external drive
EXT_DATA_DIR="/Volumes/T7 Shield" make up

# Run the integration test (stack must be running)
make test-integration

# Tear down (bind-mounted data under EXT_DATA_DIR is left intact)
make down

# Remove external data directories
EXT_DATA_DIR="/Volumes/T7 Shield" make clean
```

By default, `EXT_DATA_DIR` falls back to `.docker-data/` in the current directory.
Data directories created under `EXT_DATA_DIR`:

| Directory | Contents |
|-----------|----------|
| `test-validator/` | Core node chain data |
| `test-pg-data/` | PostgreSQL data |

### Stack requirements

- `openaudio/go-openaudio:dev` — build with `make docker-dev` from the repo root
- `audius-discovery-provider:latest` — build from `apps` repo with genesis mode patch (see above)

### Ports

| Service | Host port |
|---------|-----------|
| PostgreSQL | `5434` |
| Core node (gRPC) | `50051` |
| Ingress (HTTP) | `80` |
| Ingress (HTTPS) | `443` |

### Monitoring DP errors

```bash
docker logs genesis-replay-discovery-provider-1 2>&1 \
| grep -v "index_spl_token\|index_rewards_manager\|solders\|ParseError\|Traceback\|File \"/audius\|transactions_history\|AttributeError" \
| grep -i "error\|critical" \
| sed 's/.*"msg": "\([^"]*\)".*/\1/' \
| sort | uniq -c | sort -rn \
| head -20
```

---

## Known limitations

- `is_verified` (ETH-verified artist badge) is controlled by on-chain Ethereum
  verification and cannot be set via `ManageEntity`. Replayed users will have
  `is_verified = false` regardless of source data.
155 changes: 155 additions & 0 deletions cmd/genesis-replay/docker-compose.yml
@@ -0,0 +1,155 @@
name: genesis-replay

# Minimal stack for genesis replay testing.
# Relative host paths resolve from this file's directory (cmd/genesis-replay),
# so they use ../../ to reach the repo root.
#
# Usage (from repo root):
#   docker compose -f cmd/genesis-replay/docker-compose.yml up -d
#   go run ./cmd/genesis-replay run --src-dsn postgres://... --chain-url https://node1.oap.devnet --private-key <hex_privkey>
#   docker compose -f cmd/genesis-replay/docker-compose.yml down -v

services:

  eth-ganache:
    image: audius/eth-ganache:latest
    pull_policy: always
    stop_grace_period: 0s
    # Run ganache directly against the pre-built contract DB; no blockscout wait needed.
    command: >
      npx ganache
      --server.host 0.0.0.0
      --wallet.deterministic
      --wallet.totalAccounts 50
      --database.dbPath /usr/db
      --chain.networkId 12345
    healthcheck:
      test:
        - CMD-SHELL
        - >
          node -e "require('http').request({port:8545,method:'POST',headers:{'Content-Type':'application/json'}},
          r=>{let d='';r.on('data',c=>d+=c);r.on('end',()=>process.exit(JSON.parse(d).result?0:1))}).end(JSON.stringify({jsonrpc:'2.0',method:'eth_blockNumber',params:[],id:1}))"
      interval: 5s
      timeout: 5s
      retries: 12

  openaudio-1:
    image: ${OPENAUDIO_IMAGE:-openaudio/go-openaudio:dev}
    healthcheck:
      test:
        - CMD-SHELL
        - "curl -fsk https://localhost/health-check | grep -v 'core service not ready'"
      interval: 10s
      timeout: 5s
      retries: 30
      start_period: 30s
    environment:
      NETWORK: dev
      OPENAUDIO_ENV: dev
      OPENAUDIO_GENESIS: dev-v2
      OPENAUDIO_TLS_SELF_SIGNED: "true"
      OPENAUDIO_GENESIS_MIGRATION: "true"
      nodeEndpoint: https://node1.oap.devnet
      delegatePrivateKey: d09ba371c359f10f22ccda12fd26c598c7921bda3220c9942174562bc6a36fe8
      archive: "true"
      stateSyncServeSnapshots: "true"
      stateSyncEnable: "false"
      audius_core_root_dir: /data/core
      uptimeDataDir: /data/bolt
      OPENAUDIO_PGALL: "true"
    extra_hosts:
      - "node1.oap.devnet:host-gateway"
    volumes:
      - ${EXT_DATA_DIR}/test-validator:/data
    ports:
      - "50051:50051"
    depends_on:
      eth-ganache:
        condition: service_healthy

  ingress:
    image: nginx:latest
    volumes:
      - ../../dev/nginx.conf:/etc/nginx/conf.d/vhost.conf:ro
      - ../../dev/tls/cert.pem:/etc/nginx/ssl/cert.pem:ro
      - ../../dev/tls/key.pem:/etc/nginx/ssl/key.pem:ro
    extra_hosts:
      - "node1.oap.devnet:host-gateway"
    ports:
      - "80:80"
      - "443:443"

  db:
    image: postgres:17
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: discovery_provider_1
    volumes:
      - ../../cmd/genesis-replay/testdata/dp_schema.sql:/docker-entrypoint-initdb.d/01_schema.sql:ro
      - ../../cmd/genesis-replay/testdata/dp_seed.sql:/docker-entrypoint-initdb.d/02_seed.sql:ro
      - ../../cmd/genesis-replay/testdata/source_init.sh:/docker-entrypoint-initdb.d/03_source_init.sh:ro
      - ../../cmd/genesis-replay/testdata/seed.sql:/tmp/source_seed.sql:ro
      - ${EXT_DATA_DIR}/test-pg-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
    ports:
      - "5434:5432"

  redis:
    image: redis:7.0
    command: redis-server
    healthcheck:
      test: ["CMD", "redis-cli", "PING"]
      interval: 5s
      timeout: 3s
      retries: 5

  discovery-provider:
    image: audius-discovery-provider:latest
    command: scripts/start.sh
    environment:
      # Database / cache
      audius_db_url: postgresql://postgres:postgres@db:5432/discovery_provider_1
      audius_db_url_read_replica: postgresql://postgres:postgres@db:5432/discovery_provider_1
      audius_redis_url: redis://redis:6379/00

      # ETH
      audius_web3_eth_provider_url: http://eth-ganache:8545
      audius_eth_contracts_registry: "0xABbfF712977dB51f9f212B85e8A4904c818C2b63"
      audius_eth_contracts_token: "0xdcB2fC9469808630DD0744b0adf97C0003fC29B2"

      # POA (entity manager / legacy contracts)
      audius_web3_host: http://eth-ganache:8545
      audius_contracts_registry: "0xCfEB869F69431e42cdB54A4F4f105C19C080A601"
      audius_contracts_entity_manager_address: "0x254dffcd3277C0b1660F6d42EFbB754edaBAbC2B"

      # Identity
      audius_discprov_url: http://discovery-provider:5000
      audius_delegate_owner_wallet: "0x73EB6d82CFB20bA669e9c178b718d770C49BB52f"
      audius_delegate_private_key: d09ba371c359f10f22ccda12fd26c598c7921bda3220c9942174562bc6a36fe8

      # Dev flags
      audius_db_run_migrations: "false"
      audius_discprov_dev_mode: "true"
      AUDIUS_GENESIS_MIGRATION_MODE: "true"
      audius_discprov_loglevel: info
      audius_enable_rsyslog: "false"
      PYTHONPYCACHEPREFIX: /tmp/pycache

      # Stub Solana: point to ganache so the process starts without a live validator
      audius_solana_endpoint: http://eth-ganache:8545
    extra_hosts:
      - "node1.oap.devnet:host-gateway"
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
      eth-ganache:
        condition: service_healthy
      openaudio-1:
        condition: service_healthy

Binary file added cmd/genesis-replay/genesis-replay
Binary file not shown.