fix(core): coerce nil ProverAddresses in finalizeStorageProof to unstick prod chain #313

Merged
raymondjacobson merged 1 commit into main from fix/pos-nil-prover-addresses
May 26, 2026

@raymondjacobson raymondjacobson commented May 26, 2026

Summary

Prod chain (audius-mainnet-alpha-beta) halted at block 25229899 on 2026-05-26 09:47:22 UTC. Every validator hits the same deterministic failure attempting to commit block 25229900.

A `StorageProof` transaction in block 25229900 has an empty/nil `ProverAddresses`. `finalizeStorageProof` (`pkg/core/server/pos.go:285`) passes that nil slice straight to `InsertStorageProof`. pgx serializes the nil slice as SQL `NULL`, which violates the `NOT NULL` constraint on `storage_proofs.prover_addresses` (migration `00013_proof_of_storage.sql:27`):

```
ERROR: null value in column "prover_addresses" of relation "storage_proofs"
violates not-null constraint (SQLSTATE 23502)
```

That leaves the pgx transaction in an aborted state (`SQLSTATE 25P02`), so every subsequent write in the same `FinalizeBlock` errors and `Commit` resolves to ROLLBACK:

```
ERROR client error during proxyAppConn.CommitSync err=commit unexpectedly resulted in rollback
ERROR CONSENSUS FAILURE!!! err=failed to apply block; error commit failed for application
```

CometBFT panics in `finalizeCommit` at height `0x180fa4c` (25229900 in decimal). Because the failure is deterministic against the same proposed block, every validator hits it and consensus cannot advance.
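
As a quick cross-check of the height in that panic, the hex value can be converted with Go's standard library. `heightFromHex` is a hypothetical helper name used only for this illustration; it is not part of the repository.

```go
package main

import (
	"fmt"
	"strconv"
)

// heightFromHex is a hypothetical helper (not from the repository). It parses
// a 0x-prefixed block height; base 0 lets strconv infer base 16 from the prefix.
func heightFromHex(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

func main() {
	h, err := heightFromHex("0x180fa4c")
	if err != nil {
		panic(err)
	}
	fmt.Println(h) // 25229900
}
```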

This patch coerces a nil `ProverAddresses` to `[]string{}` before insert so the column receives `'{}'` instead of `NULL`. The change is deterministic — once a supermajority of validators is on the new binary, block 25229900 commits with `prover_addresses = '{}'` and the chain advances.
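
A minimal sketch of the coercion described above, not the actual diff. The `storageProof` type and the `normalizeProverAddresses` helper are hypothetical names introduced for illustration; the real change lives in `finalizeStorageProof` and may be written inline.

```go
package main

import "fmt"

// storageProof is a hypothetical stand-in for the real StorageProof message;
// only the field relevant to this fix is modeled.
type storageProof struct {
	ProverAddresses []string
}

// normalizeProverAddresses (hypothetical helper) returns a non-nil slice so
// that a nil input is inserted as an empty array ('{}') rather than SQL NULL.
// Non-nil input is returned unchanged.
func normalizeProverAddresses(addrs []string) []string {
	if addrs == nil {
		return []string{}
	}
	return addrs
}

func main() {
	proof := storageProof{} // ProverAddresses is nil, as in the failing tx
	addrs := normalizeProverAddresses(proof.ProverAddresses)
	fmt.Println(addrs == nil, len(addrs)) // false 0
}
```

Because the helper depends only on its input, every validator computes the same value for the same transaction, which is what keeps the fix deterministic.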

What this PR does NOT do

This patch intentionally adds no new `CheckTx` rejection rules: a "reject empty `ProverAddresses`" check would keep the chain stuck. Stricter validation in `isValidStorageProofTx`, plus a guard in `sendPoSChallengeToStorage` that skips submission when no provers resolve, should ship as a follow-up once the chain is unstuck.

…geProof

A StorageProof tx with empty/nil ProverAddresses caused pgx to serialize
the value as SQL NULL, violating the NOT NULL constraint on
storage_proofs.prover_addresses. That aborted the FinalizeBlock
transaction (SQLSTATE 25P02), Commit resolved to ROLLBACK, and CometBFT
panicked with CONSENSUS FAILURE — halting the chain deterministically on
every validator at block 25229900.

Coerce nil to []string{} before insert so the column receives '{}'
instead of NULL, allowing the block to commit on the next restart.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@raymondjacobson raymondjacobson merged commit c786d3f into main May 26, 2026
6 checks passed
@raymondjacobson raymondjacobson deleted the fix/pos-nil-prover-addresses branch May 26, 2026 15:01
raymondjacobson added a commit that referenced this pull request May 26, 2026
…validation (#315)

Prevent future chain halts from empty ProverAddresses by adding guards
at two layers:

1. sendPoSChallengeToStorage: early-return if GetNodesByEndpoints errors
   or resolves zero prover addresses, instead of submitting a
   StorageProof tx with an empty (wire-roundtrips to nil) field.

2. isValidStorageProofTx (CheckTx/ProcessProposal): reject StorageProof
   txs with empty ProverAddresses so they never enter the mempool or
   a proposed block.

Follow-up to #313, which fixed the FinalizeBlock crash (layer 3).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
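
The two guards in the follow-up commit could look roughly like the sketch below. It is written under stated assumptions and is not the code from #315: `storageProofTx`, `validateStorageProofTx`, `buildStorageProofTx`, and the `resolve` callback (standing in for the `GetNodesByEndpoints` lookup) are hypothetical names and shapes.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoProvers is a hypothetical sentinel error for this sketch.
var errNoProvers = errors.New("storage proof has no prover addresses")

// storageProofTx is a hypothetical stand-in for the real StorageProof tx.
type storageProofTx struct {
	Height          int64
	ProverAddresses []string
}

// validateStorageProofTx sketches the validation-side guard: a tx with no
// prover addresses (nil or empty) is rejected before it can reach a block.
func validateStorageProofTx(tx *storageProofTx) error {
	if tx == nil {
		return errors.New("nil storage proof tx")
	}
	if len(tx.ProverAddresses) == 0 {
		return errNoProvers
	}
	return nil
}

// buildStorageProofTx sketches the sender-side guard: if the lookup fails or
// resolves zero provers, no tx is built and the caller skips submission.
func buildStorageProofTx(height int64, resolve func() ([]string, error)) (*storageProofTx, bool) {
	addrs, err := resolve()
	if err != nil || len(addrs) == 0 {
		return nil, false
	}
	return &storageProofTx{Height: height, ProverAddresses: addrs}, true
}

func main() {
	_, ok := buildStorageProofTx(1, func() ([]string, error) { return nil, nil })
	fmt.Println(ok) // false

	err := validateStorageProofTx(&storageProofTx{Height: 1})
	fmt.Println(errors.Is(err, errNoProvers)) // true
}
```

Using `len(...) == 0` rather than a nil comparison treats nil and empty slices the same, which matters here because an empty field round-trips over the wire to nil.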