Add genesis-writer for offline chain history population #210

Open
rickyrombo wants to merge 10 commits into main from mjp-genesis-writer

@rickyrombo

Summary

  • Adds cmd/genesis-writer, a tool that populates a new Core chain with full historical Audius state by writing synthetic blocks directly to PostgreSQL — no running network or consensus needed
  • Reads every current, non-deleted entity from a source Discovery Provider database, wraps each in a ManageEntityLegacyMigration proto signed with the genesis migration keypair, packs them into real CometBFT blocks, and writes to core_blocks/core_transactions/core_block_parts
  • After writing, primes CometBFT's state.db and blockstore.db so a single node can start from the written height and immediately propose the next live block
  • Includes integration test with Docker Compose (source DP postgres + target Core postgres + CometBFT state DBs)
  • Entity types: users, tracks, playlists, follows, saves, reposts, comments, tips, plays, developer apps, grants, dashboard wallets, emails

Test plan

  • Integration test (integration_test.go) verifies round-trip: write entities → read back from target DB → verify counts and field correctness
  • End-to-end: run against production DP snapshot, start a Core node from the written state, verify it proposes the next block

🤖 Generated with Claude Code

rickyrombo and others added 2 commits April 3, 2026 14:56
Replaces genesis-replay with a fully offline tool that reads from a
source DP database and writes real CometBFT blocks directly to Core
chain PostgreSQL + blockstore.db + state.db. Produces a distributable
snapshot that third-party indexers can process from block 1.

Includes ManageEntityLegacyMigration proto type to distinguish
genesis migration transactions from live ones, managed postgres
lifecycle, auto-generated validator keys and genesis.json, and
resume support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Adds a new cmd/genesis-writer tool to bootstrap a Core chain from an existing Discovery Provider Postgres DB by emitting synthetic CometBFT blocks directly into Core’s DB tables and writing CometBFT state/blockstore data so a node can start at the migrated height.

Changes:

  • Introduces ManageEntityLegacyMigration as a new SignedTransaction variant and wires it into Core’s ABCI finalize path.
  • Adds cmd/genesis-writer (writer, CometBFT state priming, managed local Postgres option) plus an end-to-end Docker Compose integration test + seed data.
  • Updates dependencies to support the new CLI tool and Postgres/migrations usage.

Reviewed changes

Copilot reviewed 26 out of 28 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| proto/core/v1/types.proto | Adds ManageEntityLegacyMigration message + SignedTransaction oneof case. |
| pkg/core/server/manage_entity.go | Adds finalize handler for migration manage-entity txs. |
| pkg/core/server/abci.go | Improves snapshot offer handling and routes migration txs through finalizeTransaction. |
| go.mod | Adds direct deps for genesis-writer (cli, pq) and related indirects. |
| go.sum | Updates module checksums after dependency changes. |
| cmd/genesis-writer/main.go | CLI entrypoint, flag parsing, key resolution, managed-postgres startup. |
| cmd/genesis-writer/writer.go | Core migration writer: reads DP entities, builds/signs txs, builds blocks, writes to Core DB + blockstore. |
| cmd/genesis-writer/batch.go | Generic batched/concurrent entity processing helpers. |
| cmd/genesis-writer/cmt_state.go | Loads genesis + validator key, opens blockstore, bootstraps CometBFT state.db, writes updated genesis.json. |
| cmd/genesis-writer/postgres.go | Local managed Postgres cluster lifecycle for offline runs. |
| cmd/genesis-writer/entities_user.go | User + wallet-related entity extraction/serialization. |
| cmd/genesis-writer/entities_track.go | Track + track-download extraction/serialization. |
| cmd/genesis-writer/entities_playlist.go | Playlist extraction/serialization. |
| cmd/genesis-writer/entities_social.go | Social actions extraction/serialization (follows/saves/reposts/etc). |
| cmd/genesis-writer/entities_play.go | Play event extraction into TrackPlays txs. |
| cmd/genesis-writer/entities_developer_app.go | Developer app + grant extraction/serialization. |
| cmd/genesis-writer/entities_dashboard_wallet.go | Dashboard wallet user extraction/serialization. |
| cmd/genesis-writer/entities_comment.go | Comment + comment-reaction extraction/serialization. |
| cmd/genesis-writer/entities_email.go | Encrypted email + email access extraction/serialization. |
| cmd/genesis-writer/entities_tip.go | Tip reaction extraction/serialization. |
| cmd/genesis-writer/integration_test.go | Docker-based integration test validating round-trip + consensus advancement + state sync. |
| cmd/genesis-writer/docker-compose.yml | Integration-test stack (source DP DB, core DBs, ganache, ingress, nodes). |
| cmd/genesis-writer/README.md | Tool documentation, usage, and integration test instructions. |
| cmd/genesis-writer/Makefile | Convenience target to run integration test flow. |
| cmd/genesis-writer/testdata/source_init.sh | Initializes/creates the seeded source DB in Docker. |
| cmd/genesis-writer/testdata/seed.sql | Comprehensive DP seed dataset for integration tests. |
| cmd/genesis-writer/testdata/dp_seed.sql | Minimal DP seed to satisfy DP indexer assumptions. |
Comments suppressed due to low confidence (1)

pkg/core/server/abci.go:597

  • In the "new snapshot offered" branch, acceptedSnapshotHeight/Hash are cleared but execution continues into the hash-mismatch check. Since acceptedSnapshotHash is now nil, this will always reject the newly offered snapshot (even though we intended to accept it). Restructure this logic so that when height differs you either (a) immediately treat it as a fresh offer (skip the hash check) or (b) update acceptedSnapshotHeight/Hash to the new snapshot before validating further state.
	// If we've already accepted a snapshot, check if CometBFT is re-offering the
	// same one (resume) or a different one (previous snapshot failed verification).
	if s.acceptedSnapshotHeight != 0 {
		if req.Snapshot.Height != s.acceptedSnapshotHeight {
			// CometBFT is offering a different snapshot, which means the previously
			// accepted one failed (e.g. consensus params verification error). Clear
			// the old state so we can accept the new snapshot.
			s.logger.Info("clearing previous snapshot state: CometBFT offered a new snapshot",
				zap.Uint64("previous_height", s.acceptedSnapshotHeight),
				zap.Uint64("new_height", req.Snapshot.Height))
			s.acceptedSnapshotHeight = 0
			s.acceptedSnapshotHash = nil
		}
		// Check hash matches too
		if !bytes.Equal(req.Snapshot.Hash, s.acceptedSnapshotHash) {
			s.logger.Info("rejecting snapshot: hash mismatch",
				zap.Uint64("height", req.Snapshot.Height),
				zap.String("offered_hash", hex.EncodeToString(req.Snapshot.Hash)),
				zap.String("accepted_hash", hex.EncodeToString(s.acceptedSnapshotHash)))
			return &abcitypes.OfferSnapshotResponse{
				Result: abcitypes.OFFER_SNAPSHOT_RESULT_REJECT,
			}, nil
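
One way to read option (a) from the comment above is to run the hash check only when the offered height matches the accepted one, and to treat a different height as a fresh offer. The following is a minimal sketch of that control flow, not the PR's actual handler: `snapshotState`, `offerResult`, and `offer` are hypothetical stand-ins for the server fields and the ABCI response type, and the validation that the real "first snapshot" path performs is omitted.

```go
package main

import (
	"bytes"
	"fmt"
)

// offerResult is a hypothetical stand-in for the ABCI offer-snapshot result.
type offerResult int

const (
	resultAccept offerResult = iota
	resultReject
)

// snapshotState is a hypothetical reduction of the server's snapshot fields.
type snapshotState struct {
	acceptedHeight uint64
	acceptedHash   []byte
}

// offer compares hashes only when the offered height equals the accepted
// height. A different height clears the old state and falls through to the
// fresh-offer path instead of being compared against a nil hash.
func (s *snapshotState) offer(height uint64, hash []byte) offerResult {
	if s.acceptedHeight != 0 {
		if height == s.acceptedHeight {
			if bytes.Equal(hash, s.acceptedHash) {
				return resultAccept // same snapshot re-offered (resume)
			}
			return resultReject // same height, different hash
		}
		// Different height: the previous snapshot is assumed to have failed.
		s.acceptedHeight = 0
		s.acceptedHash = nil
	}
	// Fresh offer. A real handler would validate the snapshot here.
	s.acceptedHeight = height
	s.acceptedHash = append([]byte(nil), hash...)
	return resultAccept
}

func main() {
	s := &snapshotState{}
	fmt.Println(s.offer(10, []byte{0xaa}) == resultAccept)
	fmt.Println(s.offer(10, []byte{0xbb}) == resultReject)
	fmt.Println(s.offer(20, []byte{0xcc}) == resultAccept)
}
```

With this shape, a new snapshot at a different height is no longer rejected by a comparison against the cleared hash.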

Comment thread pkg/core/server/manage_entity.go
Comment thread cmd/genesis-writer/writer.go
Comment thread cmd/genesis-writer/writer.go
Comment thread cmd/genesis-writer/batch.go
Comment thread proto/core/v1/types.proto
Comment thread cmd/genesis-writer/writer.go
Comment thread cmd/genesis-writer/writer.go Outdated
Comment thread cmd/genesis-writer/README.md Outdated
Fix blockWriteErr data race with atomic.Pointer, restore block linkage
on resume, wire BatchSize config, add defer stopBlockWriter for leak
safety, use streaming sha256 for appHash, fix README SaveBlock wording.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rickyrombo rickyrombo requested a review from Copilot April 24, 2026 01:39

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 3 comments.


Comment thread cmd/genesis-writer/writer.go
Comment thread cmd/genesis-writer/entities_social.go
Comment thread cmd/genesis-writer/cmt_state.go Outdated
- Remove redundant sql.Open/Close before RunMigrations
- Handle json.Marshal errors in social and comment entity writers
- Merge into existing app_state instead of replacing it in writeGenesisFile
- Improve README SaveBlock documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
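
The "merge into existing app_state" item in the commit above could look roughly like the following. It is only a sketch: `mergeAppState` is a hypothetical helper, the keys in `main` are made up for illustration, and the real `writeGenesisFile` may structure `app_state` differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeAppState overlays updates onto an existing app_state JSON object
// instead of replacing it, so keys the tool does not know about survive.
func mergeAppState(existing []byte, updates map[string]any) ([]byte, error) {
	merged := map[string]any{}
	if len(existing) > 0 {
		if err := json.Unmarshal(existing, &merged); err != nil {
			return nil, fmt.Errorf("parse existing app_state: %w", err)
		}
		if merged == nil { // existing was the JSON literal null
			merged = map[string]any{}
		}
	}
	for k, v := range updates {
		merged[k] = v
	}
	return json.Marshal(merged)
}

func main() {
	// Both keys below are illustrative, not taken from the PR.
	out, err := mergeAppState([]byte(`{"existing_key":1}`), map[string]any{"added_key": 2})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```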

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 3 comments.


Comment thread cmd/genesis-writer/writer.go
Comment thread cmd/genesis-writer/writer.go
Comment thread cmd/genesis-writer/writer.go
- Load prevAppHash from core_app_state instead of block header (off-by-one fix)
- Make blockstore required for resume (error if CMTHome not set)
- Error on missing block/commit in blockstore during resume
- Document that Signer field is an identity hint, not signature authority
- Use uppercase hex for tx_hash to match CometBFT's HexBytes.String()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

pkg/core/server/abci.go:598

  • In the "accepted snapshot already set" branch, when a different snapshot height is offered you reset acceptedSnapshotHeight/Hash to allow accepting a new snapshot, but the function then continues into the hash-mismatch check and will reject because acceptedSnapshotHash is now nil. Restructure this logic so that when the offered height differs you clear state and then fall through to the "first snapshot" validation/accept path (skipping the hash check for the previous snapshot).
	if s.acceptedSnapshotHeight != 0 {
		if req.Snapshot.Height != s.acceptedSnapshotHeight {
			// CometBFT is offering a different snapshot, which means the previously
			// accepted one failed (e.g. consensus params verification error). Clear
			// the old state so we can accept the new snapshot.
			s.logger.Info("clearing previous snapshot state: CometBFT offered a new snapshot",
				zap.Uint64("previous_height", s.acceptedSnapshotHeight),
				zap.Uint64("new_height", req.Snapshot.Height))
			s.acceptedSnapshotHeight = 0
			s.acceptedSnapshotHash = nil
		}
		// Check hash matches too
		if !bytes.Equal(req.Snapshot.Hash, s.acceptedSnapshotHash) {
			s.logger.Info("rejecting snapshot: hash mismatch",
				zap.Uint64("height", req.Snapshot.Height),
				zap.String("offered_hash", hex.EncodeToString(req.Snapshot.Hash)),
				zap.String("accepted_hash", hex.EncodeToString(s.acceptedSnapshotHash)))
			return &abcitypes.OfferSnapshotResponse{
				Result: abcitypes.OFFER_SNAPSHOT_RESULT_REJECT,
			}, nil
		}

Comment thread cmd/genesis-writer/writer.go Outdated
Comment thread cmd/genesis-writer/writer.go
Comment thread cmd/genesis-writer/README.md Outdated
Comment thread cmd/genesis-writer/writer.go Outdated
Comment thread cmd/genesis-writer/writer.go
- Write to blockstore after postgres commit to keep them in sync on failure
- Return ctx.Err() on interruption instead of breaking to success path
- Update README: indexers must recover signer from signature, not trust
  the signer field (which carries the entity wallet address)
- Document step-based resume granularity limitation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 5 comments.


Comment thread cmd/genesis-writer/integration_test.go
Comment thread cmd/genesis-writer/entities_track.go Outdated
Comment thread cmd/genesis-writer/entities_tip.go
Comment thread cmd/genesis-writer/writer.go Outdated
Comment thread cmd/genesis-writer/postgres.go
…rr, user lookup

- Sort imports in integration_test.go per gofmt
- Fall back to metadata_multihash for TrackCID when track_segments empty
- Add ORDER BY to wallet→user preload for deterministic tip attribution
- Check rows.Err() after iterating resume progress query
- Use os/user.Current() as fallback when USER env var is unset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

pkg/core/server/abci.go:598

  • If a different snapshot height is offered, the code clears acceptedSnapshotHeight/Hash but then immediately compares the offered hash against the now-nil acceptedSnapshotHash and will always reject. After clearing state, it should fall through to the “first snapshot” accept path (or reinitialize acceptedSnapshotHash) rather than performing the hash mismatch check on a cleared value.
	// If we've already accepted a snapshot, check if CometBFT is re-offering the
	// same one (resume) or a different one (previous snapshot failed verification).
	if s.acceptedSnapshotHeight != 0 {
		if req.Snapshot.Height != s.acceptedSnapshotHeight {
			// CometBFT is offering a different snapshot, which means the previously
			// accepted one failed (e.g. consensus params verification error). Clear
			// the old state so we can accept the new snapshot.
			s.logger.Info("clearing previous snapshot state: CometBFT offered a new snapshot",
				zap.Uint64("previous_height", s.acceptedSnapshotHeight),
				zap.Uint64("new_height", req.Snapshot.Height))
			s.acceptedSnapshotHeight = 0
			s.acceptedSnapshotHash = nil
		}
		// Check hash matches too
		if !bytes.Equal(req.Snapshot.Hash, s.acceptedSnapshotHash) {
			s.logger.Info("rejecting snapshot: hash mismatch",
				zap.Uint64("height", req.Snapshot.Height),
				zap.String("offered_hash", hex.EncodeToString(req.Snapshot.Hash)),
				zap.String("accepted_hash", hex.EncodeToString(s.acceptedSnapshotHash)))
			return &abcitypes.OfferSnapshotResponse{
				Result: abcitypes.OFFER_SNAPSHOT_RESULT_REJECT,
			}, nil
		}

Comment thread proto/core/v1/types.proto
Comment thread cmd/genesis-writer/batch.go
Comment thread cmd/genesis-writer/postgres.go
rickyrombo and others added 2 commits April 24, 2026 10:16
The Python DP indexer sets track_cid from the metadata JSON field
"track_cid", not from track_segments or metadata_multihash.
metadata_multihash is the CID of the metadata blob itself (unrelated
to the audio CID), so using it as a fallback was incorrect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Managed postgres uses trust auth for bulk-load performance. Ensure it
only listens on localhost to prevent exposing an unauthenticated
instance on the network.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rickyrombo rickyrombo requested a review from Copilot April 24, 2026 17:18

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 1 comment.


Comment thread cmd/genesis-writer/integration_test.go Outdated
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

pkg/core/server/abci.go:598

  • In the height-mismatch case you clear acceptedSnapshotHeight/acceptedSnapshotHash, but then still run the hash-mismatch check against a nil acceptedSnapshotHash, which will reject the newly offered snapshot. After clearing, this branch should fall through to the “First snapshot, validate and accept it” path (e.g., return ACCEPT after re-validating, or restructure with an else/goto so the hash check only runs when the snapshot height matches).
	if s.acceptedSnapshotHeight != 0 {
		if req.Snapshot.Height != s.acceptedSnapshotHeight {
			// CometBFT is offering a different snapshot, which means the previously
			// accepted one failed (e.g. consensus params verification error). Clear
			// the old state so we can accept the new snapshot.
			s.logger.Info("clearing previous snapshot state: CometBFT offered a new snapshot",
				zap.Uint64("previous_height", s.acceptedSnapshotHeight),
				zap.Uint64("new_height", req.Snapshot.Height))
			s.acceptedSnapshotHeight = 0
			s.acceptedSnapshotHash = nil
		}
		// Check hash matches too
		if !bytes.Equal(req.Snapshot.Hash, s.acceptedSnapshotHash) {
			s.logger.Info("rejecting snapshot: hash mismatch",
				zap.Uint64("height", req.Snapshot.Height),
				zap.String("offered_hash", hex.EncodeToString(req.Snapshot.Hash)),
				zap.String("accepted_hash", hex.EncodeToString(s.acceptedSnapshotHash)))
			return &abcitypes.OfferSnapshotResponse{
				Result: abcitypes.OFFER_SNAPSHOT_RESULT_REJECT,
			}, nil
		}

Comment on lines +517 to +519
w.prevBlockID = cmttypes.BlockID{
Hash: hashBytes,
}

Copilot AI Apr 24, 2026

In the resume path where finalHeight==0, this “recover final state” block overwrites w.prevBlockID with only the block hash from core_blocks, discarding the PartSetHeader that was already restored earlier from blockstore. That can leave LastBlockID incomplete when writeCMTState bootstraps state.db. Consider either (a) not overwriting prevBlockID if it’s already populated, or (b) reconstructing the full BlockID (hash + partset header) from blockstore at maxHeight like the earlier resume logic does.

Suggested change
- w.prevBlockID = cmttypes.BlockID{
- 	Hash: hashBytes,
- }
+ // Preserve any previously restored PartSetHeader when recovering
+ // the final block hash during resume.
+ w.prevBlockID.Hash = hashBytes
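
The difference between the two forms in the suggestion can be shown with local types. In this sketch `blockID` and `partSetHeader` are hypothetical mirrors of the corresponding CometBFT structs, and `overwrite` and `preserve` are illustrative helpers rather than code from the PR.

```go
package main

import "fmt"

// partSetHeader and blockID are hypothetical local mirrors of the
// CometBFT types, reduced to the fields discussed in the comment.
type partSetHeader struct {
	Total uint32
	Hash  []byte
}

type blockID struct {
	Hash          []byte
	PartSetHeader partSetHeader
}

// overwrite models the original code: a fresh struct literal silently
// resets PartSetHeader to its zero value.
func overwrite(prev blockID, hash []byte) blockID {
	return blockID{Hash: hash}
}

// preserve models the suggested change: only Hash is updated, so a
// previously restored PartSetHeader is kept.
func preserve(prev blockID, hash []byte) blockID {
	prev.Hash = hash
	return prev
}

func main() {
	restored := blockID{PartSetHeader: partSetHeader{Total: 3, Hash: []byte{0x01}}}
	fmt.Println(overwrite(restored, []byte{0x09}).PartSetHeader.Total)
	fmt.Println(preserve(restored, []byte{0x09}).PartSetHeader.Total)
}
```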
