fix(core): guard against empty ProverAddresses in PoS submission and validation#315

Merged
raymondjacobson merged 1 commit into main from fix/pos-guard-empty-provers
May 26, 2026

@raymondjacobson raymondjacobson commented May 26, 2026

Summary

Follow-up to #313 — hardens the PoS subsystem so a StorageProof with empty ProverAddresses can never reach FinalizeBlock again, regardless of why the field was empty.

Layer 1 — submission guard (sendPoSChallengeToStorage):

  • Early-return if GetNodesByEndpoints errors, instead of silently continuing with nodes == nil
  • Early-return if proverAddresses resolves to zero entries, with a log line capturing the replicas for debugging
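The two early returns above can be sketched as follows. This is a minimal illustration, not the code from the PR: `Node`, `lookupFunc`, `resolveProvers`, and the three lookup stubs are hypothetical names, and the real `GetNodesByEndpoints` signature is not shown in this description.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical stand-in for a validator record.
type Node struct {
	Endpoint string
	Address  string
}

// lookupFunc is a hypothetical stand-in for GetNodesByEndpoints.
type lookupFunc func(endpoints []string) ([]Node, error)

// resolveProvers returns prover addresses for the given replicas.
// It fails fast on a lookup error and on an empty result, so the caller
// never builds a StorageProof with zero prover addresses.
func resolveProvers(lookup lookupFunc, replicas []string) ([]string, error) {
	nodes, err := lookup(replicas)
	if err != nil {
		// Guard 1: do not continue with nodes == nil.
		return nil, fmt.Errorf("lookup nodes by endpoints: %w", err)
	}
	provers := make([]string, 0, len(nodes))
	for _, n := range nodes {
		provers = append(provers, n.Address)
	}
	if len(provers) == 0 {
		// Guard 2: include the replicas so the mismatch can be debugged.
		return nil, fmt.Errorf("no prover addresses resolved for PoS challenge (replicas=%v)", replicas)
	}
	return provers, nil
}

// Stub lookups used only for this illustration.
func failingLookup(_ []string) ([]Node, error) {
	return nil, errors.New("simulated pool contention")
}

func emptyLookup(_ []string) ([]Node, error) { return nil, nil }

func staticLookup(_ []string) ([]Node, error) {
	return []Node{{Endpoint: "https://node1.example.com", Address: "addr1"}}, nil
}

func main() {
	replicas := []string{"https://node1.example.com"}
	for _, lookup := range []lookupFunc{failingLookup, emptyLookup, staticLookup} {
		provers, err := resolveProvers(lookup, replicas)
		fmt.Printf("provers=%v err=%v\n", provers, err)
	}
}
```

Returning an error from a helper keeps the "skip this challenge" decision in one place; the caller only has to log and return.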

Layer 2 — validation guard (isValidStorageProofTx, used by CheckTx/ProcessProposal):

  • Reject any StorageProof tx where ProverAddresses is empty — prevents it from entering the mempool or being included in a proposed block
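A sketch of the validation-side check, under stated assumptions: `StorageProof` here is a simplified hypothetical struct rather than the generated proto type, and `validateStorageProof` is an illustrative name, not the body of `isValidStorageProofTx`. The point it shows is that `len(...) == 0` rejects both a nil slice and an empty non-nil slice, which matters because the field arrives as nil after a wire round trip.

```go
package main

import (
	"errors"
	"fmt"
)

// StorageProof is a simplified, hypothetical stand-in for the proto message.
type StorageProof struct {
	Height          int64
	ProverAddresses []string
}

var errEmptyProvers = errors.New("storage proof has no prover addresses")

// validateStorageProof rejects proofs that carry no prover addresses.
// len() is 0 for both nil and empty slices, so one check covers both.
func validateStorageProof(sp *StorageProof) error {
	if sp == nil {
		return errors.New("nil storage proof")
	}
	if len(sp.ProverAddresses) == 0 {
		return errEmptyProvers
	}
	return nil
}

func main() {
	proofs := []*StorageProof{
		{Height: 1, ProverAddresses: nil},
		{Height: 2, ProverAddresses: []string{}},
		{Height: 3, ProverAddresses: []string{"addr1"}},
	}
	for _, sp := range proofs {
		fmt.Printf("height=%d err=%v\n", sp.Height, validateStorageProof(sp))
	}
}
```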

Important: This PR should only be deployed AFTER the chain has advanced past block 25229900 (the halt block). The isValidStorageProofTx check would reject the poison tx during ProcessProposal, which is the correct long-term behavior but would interfere with the halt-recovery if validators haven't yet committed that block.

Root cause analysis

The exact trigger for the original empty ProverAddresses remains uncertain. What we know:

  • Mediorum returns a non-empty Replicas list (top-4 rendezvous-ranked hosts from 70+ entries)
  • All core_validators.endpoint values are lowercase, no trailing slash — matching the normalization in pos.go:60
  • The StorageProof proto message has only one construction site (pos.go:111)
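For readers who want to re-check the endpoint-matching finding, here is a hedged sketch of that kind of comparison. It assumes the normalization is "lowercase and strip trailing slashes", as the second bullet describes; `normalizeEndpoint` and `unmatched` are hypothetical helpers, and the actual code at pos.go:60 may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeEndpoint is a hypothetical helper: it lowercases the endpoint
// and strips surrounding whitespace and trailing slashes.
func normalizeEndpoint(raw string) string {
	return strings.TrimRight(strings.ToLower(strings.TrimSpace(raw)), "/")
}

// unmatched returns the replicas that have no counterpart among the
// registered endpoints once both sides are normalized.
func unmatched(replicas, registered []string) []string {
	known := make(map[string]struct{}, len(registered))
	for _, e := range registered {
		known[normalizeEndpoint(e)] = struct{}{}
	}
	var missing []string
	for _, r := range replicas {
		if _, ok := known[normalizeEndpoint(r)]; !ok {
			missing = append(missing, r)
		}
	}
	return missing
}

func main() {
	registered := []string{"https://node1.example.com", "https://node2.example.com"}
	replicas := []string{"HTTPS://Node1.Example.com/", "https://node3.example.com"}
	fmt.Println(unmatched(replicas, registered))
}
```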

The most plausible trigger: GetNodesByEndpoints errored (e.g. pgxpool contention, since the goroutine shares the pool with FinalizeBlock's transaction), and the code swallowed the error instead of returning early (line 78). The resulting empty []string{} then passed through submitStorageProofTx, where proto.Marshal omits the field on the wire (an empty proto3 repeated field has no encoding), so receivers unmarshal it as nil.
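The empty-to-nil round trip can be reproduced without the protobuf library. The sketch below hand-rolls the wire format for a single repeated string field (tag, length, bytes per element); it is an illustration of the encoding rule, not the generated StorageProof code, and the field number 3 is an arbitrary assumption.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeRepeatedString writes each value as tag + length + bytes, the way
// a proto3 repeated string field is laid out. Zero values produce zero bytes.
func encodeRepeatedString(field uint64, values []string) []byte {
	var buf []byte
	for _, v := range values {
		buf = binary.AppendUvarint(buf, (field<<3)|2) // wire type 2: length-delimited
		buf = binary.AppendUvarint(buf, uint64(len(v)))
		buf = append(buf, v...)
	}
	return buf
}

// decodeRepeatedString reads back a buffer that contains only that field.
// With no bytes to read, the result slice is never allocated and stays nil.
func decodeRepeatedString(buf []byte) ([]string, error) {
	var out []string
	for len(buf) > 0 {
		_, n := binary.Uvarint(buf) // tag (ignored in this single-field sketch)
		if n <= 0 {
			return nil, errors.New("malformed tag")
		}
		buf = buf[n:]
		size, n := binary.Uvarint(buf)
		if n <= 0 || uint64(len(buf)-n) < size {
			return nil, errors.New("malformed length")
		}
		buf = buf[n:]
		out = append(out, string(buf[:size]))
		buf = buf[size:]
	}
	return out, nil
}

func main() {
	sent := []string{} // empty but non-nil on the sender
	wire := encodeRepeatedString(3, sent)
	received, err := decodeRepeatedString(wire)
	fmt.Println("sender nil:", sent == nil, "wire bytes:", len(wire))
	fmt.Println("receiver nil:", received == nil, "err:", err)
}
```

This is why a receiver cannot tell "the sender set an empty list" from "the sender never set the field", and why the guards check length rather than nil-ness.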

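Regardless of the exact trigger, these guards stop an empty ProverAddresses at two independent layers upstream of FinalizeBlock: the proof is never submitted, and if one is submitted anyway it is rejected before it can enter the mempool or a proposed block.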

Test plan

  • Verify chain is past block 25229900 before deploying this change
  • CI passes (build + integration tests)
  • After deployment, monitor logs for "No prover addresses resolved for PoS challenge" — this would confirm the guard is being hit and would provide the replicas field needed to pinpoint the root cause

🤖 Generated with Claude Code

…validation

Prevent future chain halts from empty ProverAddresses by adding guards
at two layers:

1. sendPoSChallengeToStorage: early-return if GetNodesByEndpoints errors
   or resolves zero prover addresses, instead of submitting a
   StorageProof tx with an empty (wire-roundtrips to nil) field.

2. isValidStorageProofTx (CheckTx/ProcessProposal): reject StorageProof
   txs with empty ProverAddresses so they never enter the mempool or
   a proposed block.

Follow-up to #313, which fixed the FinalizeBlock crash (layer 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@raymondjacobson raymondjacobson merged commit 983bde8 into main May 26, 2026
7 checks passed
@raymondjacobson raymondjacobson deleted the fix/pos-guard-empty-provers branch May 26, 2026 18:14