@karlem karlem commented Oct 28, 2025

Closes #1441 and #1442


@karlem karlem changed the title from "feat: init lifecycle" to "feat: F3 e2e lifecycle" Oct 29, 2025
@karlem karlem force-pushed the f3-lifecycle branch 2 times, most recently from 91db005 to cbce51c November 4, 2025 17:20
Base automatically changed from f3-proofs-cache to main December 18, 2025 16:15
@karlem karlem marked this pull request as ready for review January 16, 2026 19:52
@karlem karlem requested a review from a team as a code owner January 16, 2026 19:52
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

///
/// This method should only be called from consensus code path which
/// contains the lightclient verifier. No additional validation is
/// performed here as it's expected to be done by the verifier.
Missing validation in F3 Light Client update_state

High Severity

The update_state function in state.rs has no validation logic—it unconditionally replaces light_client_state and returns Ok(()). However, multiple tests expect it to reject invalid updates with USR_ILLEGAL_ARGUMENT: test_update_state_non_advancing_height expects rejection when height doesn't advance, test_instance_id_skip_rejected expects rejection when instance_id skips values, and test_empty_epochs_rejected also expects an error. These tests will fail because the validation they expect does not exist in the implementation.
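One way the checks those tests expect could look is sketched below. This is a minimal illustration, not the actor's code: `LightClientState`, `UpdateError`, and `validate_update` are hypothetical names, the state is reduced to two fields, and the rule that `instance_id` may only stay the same or advance by one is an assumption read off the test names. In the real actor each error would presumably map to `ExitCode::USR_ILLEGAL_ARGUMENT`.

```rust
/// Hypothetical, simplified stand-in for the actor's light client state.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct LightClientState {
    pub instance_id: u64,
    pub latest_finalized_height: Option<i64>,
}

/// Hypothetical error type; the real actor would presumably return an
/// actor error carrying `ExitCode::USR_ILLEGAL_ARGUMENT` instead.
#[derive(Debug, PartialEq, Eq)]
pub enum UpdateError {
    MissingFinalizedHeight,
    NonAdvancingHeight { current: i64, proposed: i64 },
    InstanceIdSkipped { current: u64, proposed: u64 },
}

/// Checks a proposed state against the current one before replacing it.
pub fn validate_update(
    current: &LightClientState,
    new: &LightClientState,
) -> Result<(), UpdateError> {
    // Assumption: an update must always carry a finalized height.
    let proposed = new
        .latest_finalized_height
        .ok_or(UpdateError::MissingFinalizedHeight)?;

    // The finalized height must strictly advance once one is recorded.
    if let Some(cur) = current.latest_finalized_height {
        if proposed <= cur {
            return Err(UpdateError::NonAdvancingHeight { current: cur, proposed });
        }
    }

    // Assumption: the instance id may stay the same or advance by exactly one.
    let same = new.instance_id == current.instance_id;
    let next = current.instance_id.checked_add(1) == Some(new.instance_id);
    if !(same || next) {
        return Err(UpdateError::InstanceIdSkipped {
            current: current.instance_id,
            proposed: new.instance_id,
        });
    }
    Ok(())
}
```

If validation is instead meant to live entirely in the verifier, as the doc comment on `update_state` suggests, the alternative is to drop or move the tests rather than add checks like these.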

    assert!(result.is_err());
    let err = result.unwrap_err();
    assert_eq!(err.exit_code(), ExitCode::USR_ILLEGAL_ARGUMENT);
}
Test name doesn't match test behavior

Low Severity

The test test_empty_epochs_rejected claims to test rejection of "empty finalized_epochs" per its comment on line 399, but it creates a state with Some(10) for latest_finalized_height rather than None. This makes it identical to test_update_state_non_advancing_height instead of testing the distinct case of a missing/empty finalized height. If the intent was to test rejection of None, the test should use create_test_state(1, None, ...).

Comment on lines +338 to +339
if !proof_config.enabled {
    tracing::info!("F3 proof service disabled in configuration");

Do I understand correctly that this case is for full subnet nodes that want to follow the subnet with F3-based top-down, but don't follow the parent chain because they are not active validators?

Comment on lines +92 to +100
let cached = self
    .proof_cache
    .get_epoch_proof_with_certificate(msg.height)
    .ok_or_else(|| {
        anyhow::anyhow!(
            "proof bundle not found in local cache for height {}",
            msg.height
        )
    })?;

We'd better comply with CometBFT requirements and ensure that ProcessProposal is fully deterministic. In order to do that, we have to verify the certificate deterministically against the F3 power table stored in the F3 light client actor's current state. If the certificate is valid then we should not reject the proposal.

In order to avoid voting for the proposal for which we locally don't yet have the corresponding data, I think we could be waiting (e.g. by polling) for the corresponding entry to appear in the proof cache before accepting the proposal. This should be pretty safe, but we absolutely have to deterministically verify the validity of the proposed certificate first and immediately reject the proposal if the certificate happens to be invalid, otherwise we might end up waiting for a non-existent certificate made up by a Byzantine proposer.
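The ordering described above (verify deterministically first, only then wait for local data) could be sketched roughly as follows. The names `process_proposal`, `verify_certificate`, and `lookup_cached_proof` are hypothetical; the two closures stand in for the power-table check against the light client actor state and for the proof cache lookup, neither of which is modeled here.

```rust
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum ProposalDecision {
    Accept,
    Reject,
}

/// Sketch of the proposed ProcessProposal flow.
///
/// `verify_certificate` stands in for deterministic verification against
/// on-chain state; `lookup_cached_proof` stands in for checking whether the
/// local proof cache already holds the entry. Both are hypothetical.
pub fn process_proposal<V, L>(
    verify_certificate: V,
    mut lookup_cached_proof: L,
    poll_interval: Duration,
) -> ProposalDecision
where
    V: FnOnce() -> bool,
    L: FnMut() -> bool,
{
    // Step 1: deterministic check. An invalid certificate is rejected
    // immediately, so we never wait for data a Byzantine proposer made up.
    if !verify_certificate() {
        return ProposalDecision::Reject;
    }

    // Step 2: the certificate is valid, so the data must eventually show up.
    // Poll the local cache until it does; local availability never changes
    // the decision, only when it is returned.
    while !lookup_cached_proof() {
        thread::sleep(poll_interval);
    }
    ProposalDecision::Accept
}
```

The loop deliberately has no timeout: giving up after a local deadline would make the accept/reject outcome depend on node-local timing again.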

Comment on lines +133 to +138
// Epoch must advance by exactly 1 relative to the latest finalized epoch in state.
//
// At genesis this is `None` (no finality yet). In that case we skip the check here; the
// cache lookup (and later execution) will still enforce that we only process epochs we
// have proofs for.
if let Some(prev_finalized) = f3_state.latest_finalized_height {

I'm afraid this would allow the proposer of the first parent chain update after genesis to skip arbitrarily many epochs at the beginning of the certified chain extension. I think we have to set latest_finalized_height in genesis.
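A rough sketch of the stricter check, assuming genesis seeds the baseline height: `check_next_epoch` and `EpochCheckError` are hypothetical names, and epochs are modeled as plain `i64` values. The point is that a missing baseline becomes an error instead of a skipped check.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum EpochCheckError {
    /// No baseline height in state, e.g. genesis did not set one.
    MissingBaseline,
    /// The proposed epoch is not exactly one past the latest finalized epoch.
    NotNextEpoch { previous: i64, proposed: i64 },
}

/// Requires the proposed epoch to be exactly `latest_finalized_height + 1`.
/// Unlike the snippet under review, `None` is rejected rather than skipped.
pub fn check_next_epoch(
    latest_finalized_height: Option<i64>,
    proposed: i64,
) -> Result<(), EpochCheckError> {
    let previous = latest_finalized_height.ok_or(EpochCheckError::MissingBaseline)?;
    if previous.checked_add(1) != Some(proposed) {
        return Err(EpochCheckError::NotNextEpoch { previous, proposed });
    }
    Ok(())
}
```

With a baseline of `Some(100)`, epoch 101 passes and epoch 105 is rejected; with `None`, every proposal is rejected until genesis provides a height.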

Is this what genesis_epoch in the legacy top-down is for?

tracing::debug!(instance = instance_id, "updated F3LightClientActor state");

// Mark epoch as committed in cache.
if let Err(e) = self.mark_committed(epoch, instance_id) {

If epoch is not the last one finalized by this certificate then we probably should not mark instance_id yet.

Comment on lines +305 to +314
// Convert BigInt -> u64 (saturating if too large).
// Power should be non-negative; we ignore the sign here and keep the magnitude.
let (_sign, digits) = pe.power.to_u64_digits();
let power = if digits.is_empty() {
    0
} else if digits.len() == 1 {
    digits[0]
} else {
    u64::MAX
};

64 bits may not be enough. I haven't checked this thoroughly myself, but here's what Claude thinks:

Storage power values on Filecoin can be extremely large (representing Quality-Adjusted Power across the entire network), so they need arbitrary precision integers

Evidence Supporting the Statement:

  1. Historical Network Size Exceeded uint64:

Filecoin network peaked at 17 EiB in Q3 2022
17 EiB = 19,599,665,578,316,398,592 bytes
uint64 max = 18,446,744,073,709,551,615 bytes
The peak exceeded uint64 by ~1 EiB!

  2. Quality-Adjusted Power Multipliers:

Verified deals in Filecoin receive a 10x power multiplier
Even current capacity (3 EiB) → 30 EiB with 10x QAP
30 EiB vastly exceeds uint64 range

  3. Current Capacity Still Large:

Current network: ~3.0 EiB (Q3 2025)
While this fits in uint64, it's close enough that:

Calculations involving sums and products could overflow
Future growth requires headroom
Safety margins are essential

So, at the very least, we should raise an error on overflow rather than silently saturating to u64::MAX.
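A fallible version of the conversion could look roughly like this. It is a std-only sketch: `power_to_u64` and `PowerError` are hypothetical names, and the inputs mirror the `(sign, digits)` pair that `to_u64_digits()` returns in the snippet under review (least significant digit first), with the sign reduced to a `negative` flag.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum PowerError {
    /// The power value was negative.
    Negative,
    /// The magnitude does not fit into 64 bits.
    Overflow,
}

/// Converts base-2^64 digits (least significant first) into a `u64`,
/// returning an error instead of saturating or dropping the sign.
pub fn power_to_u64(negative: bool, digits: &[u64]) -> Result<u64, PowerError> {
    // A negative, non-zero power is rejected rather than silently
    // converted to its magnitude.
    if negative && digits.iter().any(|&d| d != 0) {
        return Err(PowerError::Negative);
    }
    match digits.split_first() {
        // No digits means the value is zero.
        None => Ok(0),
        // Any non-zero higher digit means the value exceeds u64::MAX.
        Some((_, high)) if high.iter().any(|&d| d != 0) => Err(PowerError::Overflow),
        Some((&low, _)) => Ok(low),
    }
}
```

For scale, 17 EiB is `17u128 << 60`, which needs a second digit and therefore yields `PowerError::Overflow` here, where the current code would store `u64::MAX`.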

In principle, we could store just the power table's CID in the actor, though we'd need to fetch the whole table from the endpoint at start and keep it updated somewhere off-chain. (Assuming state sync may only happen on bootstrap, CometBFT wouldn't need to know about this.)

Comment on lines +125 to +127
// Execute F3-specific logic (certificate validation, proof extraction, state updates)
let (msgs, validator_changes, instance_id) =
    f3.extract_messages_and_validator_changes(state, &msg)?;

Here or inside extract_messages_and_validator_changes, we have to keep trying until the corresponding entry appears in the proof cache. We cannot return an error if it's not there yet, because that introduces non-deterministic behavior in the consensus execution path.

f3.extract_messages_and_validator_changes(state, &msg)?;

// Commit parent finality to gateway
let finality = IPCParentFinality::new(msg.height as i64, vec![]);

Probably need to get the tipset key (the 2nd parameter) from the F3 cert.

Comment on lines +26 to +33
/// Generalized top-down finality structure
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct GeneralisedTopDown {
    /// The chain epoch this finality is for (height)
    pub height: ChainEpoch,
    /// The certificate that certifies finality (type-specific, proof is fetched from local cache)
    pub certificate: Certificate,
}

We could omit the height field and commit all the epochs finalized by the cert at once. Though, this won't be so easy when we include the data with integrity proofs, due to potential issues with the block size limits.

Comment on lines +248 to +255
// The last tipset in the certificate has no child tipset inside this certificate, so it
// cannot be proven yet. We only treat the epochs we generated proofs for as "finalized
// tipsets" for verification purposes.
let finalized_tipsets = {
    let parents: Vec<FinalizedTipset> =
        tipset_pairs.iter().map(|(p, _)| p.clone()).collect();
    FinalizedTipsets::from(parents.as_slice())
};

IIUC, verify_proof_bundle_with_tipsets wants to verify the child tipsets as well, so we should use the whole cert.ec_chain as finalized_tipsets.

Comment on lines +276 to +278
self.verifier
    .verify_proof_bundle_with_tipsets(&proof_bundle, &finalized_tipsets)
    .with_context(|| format!("Failed to verify proof for epoch {}", parent_epoch))?;

Apparently, there's no verification of continuity of top-down event nonces, yet.
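A continuity check of that kind might look like the sketch below. The names `check_nonce_continuity` and `NonceError` are hypothetical, and it assumes nonces are `u64` values that must increase by exactly one, starting from the next nonce the gateway expects; how nonces are actually extracted from the proof bundle is not modeled.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum NonceError {
    /// A nonce was missing, repeated, or out of order.
    Gap { expected: u64, found: u64 },
    /// The nonce counter would exceed `u64::MAX`.
    Overflow,
}

/// Verifies that `nonces` form a contiguous run starting at `next_expected`.
/// Returns the nonce expected after this batch, so the caller can persist it.
pub fn check_nonce_continuity(next_expected: u64, nonces: &[u64]) -> Result<u64, NonceError> {
    let mut expected = next_expected;
    for &found in nonces {
        if found != expected {
            return Err(NonceError::Gap { expected, found });
        }
        expected = expected.checked_add(1).ok_or(NonceError::Overflow)?;
    }
    Ok(expected)
}
```

An empty batch is accepted and leaves the expected nonce unchanged, which covers epochs that carry no top-down messages.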

Development

Successfully merging this pull request may close these issues.

F3 topdown: Proof Verification & Completeness Enforcement

3 participants