feat(fibre): implement Fibre Module #4892

Open
vgonkivs wants to merge 13 commits into main from add_fibre_module


vgonkivs commented Mar 25, 2026

vgonkivs self-assigned this Mar 25, 2026
vgonkivs requested a review from a team as a code owner March 25, 2026 16:59
vgonkivs requested a review from walldiss March 25, 2026 16:59
devin-ai-integration[bot]

This comment was marked as resolved.

vgonkivs force-pushed the add_fibre_module branch 3 times, most recently from 53ba2fa to cbe0341 March 25, 2026 17:18

Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>

Wondertan left a comment

There are two competing fibre blob submission paths, and one of them is incomplete

  • modfibre.Upload

The newly introduced API is incomplete, as it doesn't send the PFF (MsgPayForFibre) transactions. The doc comment says users should call modblob.SubmitFibreBlob to send the transactions; however, doing so requires sending the entire blob over the API a second time, and modblob.SubmitFibreBlob then uploads it again.

Essentially, there is no way for fibre users of Upload to pay for their uploads.

  • modblob.SubmitFibreBlob

The purpose of this method is unclear. It duplicates the entire fibre blob submission on the old blob service. However, the intention was clearly to introduce a new API, as seen in the new fibre module, so why duplicate it? If the intention was to allow sending PFFs through this module, then it is unclear why the blob module would be responsible for that and not the fibre module.

Too many layers of indirection

With this PR we have:

  • modfibre.Module
  • fibre.Client
  • fibre.client
  • txclient fibre methods
  • appfibre.Client

This is extremely messy and a pure pain to review. We can easily squash a bunch of them with no repercussions into:

  • modfibre.Module
  • fibre.Service
  • appfibre.Client

Besides, for whatever reason, out of all those layers, the txclient turned out to be responsible for actually uploading. TxClient, Carl! It is supposed to send transactions and not call appfibre.Client.Upload.
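
A minimal sketch of the three-layer split suggested above, in which a single service orchestrates the upload and the payment while the tx client only sends transactions. Every name below (Uploader, TxSubmitter, Service, Submit, PayForFibre, and the fakes) is hypothetical and chosen for illustration; the real appfibre.Client and txclient APIs in the PR may look different.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// UploadResult is a hypothetical stand-in for what an upload returns.
type UploadResult struct {
	Commitment string
}

// Uploader is the role an FSP client would play: it only moves data.
type Uploader interface {
	Upload(ctx context.Context, data []byte) (*UploadResult, error)
}

// TxSubmitter is the role a tx client would play: it only sends transactions.
type TxSubmitter interface {
	PayForFibre(ctx context.Context, commitment string) error
}

// Service is the single orchestration layer between the RPC module and the clients.
type Service struct {
	up Uploader
	tx TxSubmitter
}

// Submit uploads the blob exactly once and then pays for that same upload.
func (s *Service) Submit(ctx context.Context, data []byte) (*UploadResult, error) {
	if len(data) == 0 {
		return nil, errors.New("empty blob")
	}
	res, err := s.up.Upload(ctx, data)
	if err != nil {
		return nil, fmt.Errorf("uploading blob: %w", err)
	}
	if err := s.tx.PayForFibre(ctx, res.Commitment); err != nil {
		return nil, fmt.Errorf("paying for blob %s: %w", res.Commitment, err)
	}
	return res, nil
}

// fakeUploader counts uploads so the flow can be observed.
type fakeUploader struct{ uploads int }

func (f *fakeUploader) Upload(_ context.Context, data []byte) (*UploadResult, error) {
	f.uploads++
	return &UploadResult{Commitment: fmt.Sprintf("c-%d", len(data))}, nil
}

// fakeTx records the commitments that were paid for.
type fakeTx struct{ paid []string }

func (f *fakeTx) PayForFibre(_ context.Context, commitment string) error {
	f.paid = append(f.paid, commitment)
	return nil
}

// runSubmit wires the fakes together and reports how many uploads and payments happened.
func runSubmit(data []byte) (uploads, payments int, err error) {
	up, tx := &fakeUploader{}, &fakeTx{}
	svc := &Service{up: up, tx: tx}
	_, err = svc.Submit(context.Background(), data)
	return up.uploads, len(tx.paid), err
}

func main() {
	uploads, payments, err := runSubmit([]byte("hello"))
	fmt.Println(uploads, payments, err)
}
```

With this shape the blob crosses the API once, and the payment is tied to the commitment returned by that one upload.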

Subscriptions

There is no way to subscribe to blobs from fibre in the order they've landed through consensus.

// It encodes the blob, constructs a payment promise, uploads encoded rows to FSPs,
// and aggregates validator availability signatures. It does NOT submit MsgPayForFibre on-chain.
// Use blob.SubmitFibreBlob for the full submit flow.
Upload(ctx context.Context, ns libshare.Namespace, data []byte) (*fibre.UploadResult, error)

For modules, the interface definition should live together with the types it uses when those types are API-level helper types (Result/Response) rather than internal types (Blob).

// It encodes the blob, constructs a payment promise, uploads encoded rows to FSPs,
// and aggregates validator availability signatures. It does NOT submit MsgPayForFibre on-chain.
// Use blob.SubmitFibreBlob for the full submit flow.
Upload(ctx context.Context, ns libshare.Namespace, data []byte) (*fibre.UploadResult, error)

This is UploadResult, but GetBlobResponse? Consistency?

Comment on lines +74 to +88
func (m *blobMetrics) observeUpload(ctx context.Context, dur time.Duration, blobSize int, err error) {
	if m == nil {
		return
	}
	m.uploadDuration.Record(ctx, dur.Seconds(), blobAttrs(blobSize, err))
}

func (m *blobMetrics) observeSubmit(ctx context.Context, dur time.Duration, blobSize int, err error) {
	if m == nil {
		return
	}
	m.submitDuration.Record(ctx, dur.Seconds(), blobAttrs(blobSize, err))
}

func (m *blobMetrics) observeGet(ctx context.Context, dur time.Duration, err error) {

Do we really need these metrics if they are already part of appfibre.Client? The only exception is Submit, but that is just a tx submission metric.

vgonkivs (Member, Author) replied:

These metrics track different layers. fibre/metrics.go measures end-to-end node-level latency (upload, submit, get), txclient measures tx submission and gas estimation, and appfibre.Client tracks FSP networking internals. They complement each other — knowing that a submit took 5s total but only 200ms on tx submission tells you the bottleneck is in the upload phase.
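
The arithmetic behind that reply can be sketched as follows. The phaseTimings type and its methods are hypothetical names used only for illustration; they do not appear in the PR, and the sketch assumes the two durations were measured for the same submit call.

```go
package main

import (
	"fmt"
	"time"
)

// phaseTimings holds durations measured at two layers for one submit call.
type phaseTimings struct {
	Total    time.Duration // end-to-end latency measured at the node level
	TxSubmit time.Duration // latency measured inside the tx client
}

// uploadPhase returns the time not accounted for by tx submission.
func (p phaseTimings) uploadPhase() time.Duration {
	if p.TxSubmit > p.Total {
		return 0
	}
	return p.Total - p.TxSubmit
}

// bottleneck names the phase that dominates the total latency.
func (p phaseTimings) bottleneck() string {
	if p.uploadPhase() >= p.TxSubmit {
		return "upload"
	}
	return "tx submission"
}

func main() {
	p := phaseTimings{Total: 5 * time.Second, TxSubmit: 200 * time.Millisecond}
	fmt.Println(p.uploadPhase(), p.bottleneck()) // prints "4.8s upload"
}
```

Only the difference between the layers reveals the upload phase, which is why a single end-to-end number would not be enough to locate the bottleneck.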

Comment on lines +104 to +111
cl, err := appfibre.NewClient(c.keyring, appfibre.DefaultClientConfig())
if err != nil {
	return fmt.Errorf("failed to setup fibre client: %w", err)
}
if err := cl.Start(c.ctx); err != nil {
	return fmt.Errorf("failed to start fibre client: %w", err)
}
c.fibreClient = cl

Please provide the fibre client as a component to DI with full lifecycle management. Its Start method is blocking and is meant to be called while the node starts, so that a misconfiguration breaks node start, rather than being called at runtime.
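
A minimal sketch of the lifecycle idea in that comment, written with the standard library only. The Hook and Lifecycle types are hypothetical stand-ins for the start/stop hooks a DI framework would provide, and fibreClient is a fake, not the real appfibre client. The point it illustrates is that a failing Start aborts node startup instead of surfacing later at runtime.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hook pairs start and stop callbacks for one component.
type Hook struct {
	Name    string
	OnStart func(ctx context.Context) error
	OnStop  func(ctx context.Context) error
}

// Lifecycle runs hooks in registration order on Start and in reverse on Stop.
type Lifecycle struct {
	hooks   []Hook
	started int
}

// Append registers a component hook.
func (l *Lifecycle) Append(h Hook) { l.hooks = append(l.hooks, h) }

// Start runs every OnStart; the first failure rolls back and aborts startup.
func (l *Lifecycle) Start(ctx context.Context) error {
	for _, h := range l.hooks {
		if err := h.OnStart(ctx); err != nil {
			_ = l.Stop(ctx) // stop whatever already started
			return fmt.Errorf("starting %s: %w", h.Name, err)
		}
		l.started++
	}
	return nil
}

// Stop runs OnStop for started components in reverse order.
func (l *Lifecycle) Stop(ctx context.Context) error {
	var errs []error
	for ; l.started > 0; l.started-- {
		if err := l.hooks[l.started-1].OnStop(ctx); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

// fibreClient is a fake client whose Start fails on misconfiguration.
type fibreClient struct {
	endpoint string
	running  bool
}

func (c *fibreClient) Start(context.Context) error {
	if c.endpoint == "" {
		return errors.New("no endpoint configured")
	}
	c.running = true
	return nil
}

func (c *fibreClient) Stop(context.Context) error {
	c.running = false
	return nil
}

// startNode registers the client as a lifecycle component and starts the node.
func startNode(endpoint string) error {
	lc := &Lifecycle{}
	cl := &fibreClient{endpoint: endpoint}
	lc.Append(Hook{Name: "fibre client", OnStart: cl.Start, OnStop: cl.Stop})
	return lc.Start(context.Background())
}

func main() {
	fmt.Println(startNode(""))                 // misconfigured: startup fails
	fmt.Println(startNode("fsp.example:9090")) // configured: startup succeeds
}
```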


vgonkivs commented Apr 2, 2026

This PR was done in accordance with the ADR written by @cmwaters.

> There are two competing fibre blob submission paths, and one of them is incomplete
> modfibre.Upload
> The newly introduced API, however, is incomplete as it doesn't send the PFF transactions. The comment mentions that users should use modblob.SubmitFibreBlob to send TXs; however, doing so requires reuploading the entire blob over the API, and modblob.SubmitFibreBlob reuploads it again.

Link1

Link2

> Subscriptions
> There is no way to subscribe to blobs from fibre in the order they've landed through consensus.

Auto-fetching full fibre data from FSPs needs design work first. Feel free to open an ADR update with a proposal

@Wondertan

@vgonkivs, I acknowledge that the PR implements the ADR. However, it does not change the fact that users can't pay for the uploads they make and we should fix this.

> Auto-fetching full fibre data from FSPs needs design work first. Feel free to open an ADR update with a proposal

The subscription does not imply listening for data from FSPs, but rather listening for new fibre blobs in the square and fetching the respective fibre blobs by commitment.
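
A minimal sketch of the subscription described here: walk new heights in consensus order, read the fibre commitments included in each block, and fetch each blob by commitment. All types and names (Commitment, BlockSource, BlobFetcher, Event, Subscribe, and the map-backed fakes) are hypothetical; how commitments are actually read from the square is not specified in this thread. For brevity the sketch stops silently on errors, where a real implementation would report them to the subscriber.

```go
package main

import (
	"context"
	"fmt"
)

// Commitment identifies a fibre blob referenced from a block.
type Commitment string

// BlockSource returns the fibre commitments included at a height, in square order.
type BlockSource interface {
	FibreCommitments(ctx context.Context, height uint64) ([]Commitment, error)
}

// BlobFetcher retrieves the full blob for a commitment.
type BlobFetcher interface {
	Get(ctx context.Context, c Commitment) ([]byte, error)
}

// Event is one fibre blob delivered in consensus order.
type Event struct {
	Height     uint64
	Commitment Commitment
	Data       []byte
}

// Subscribe emits blobs for each incoming height, preserving block and in-block order.
func Subscribe(ctx context.Context, src BlockSource, f BlobFetcher, heights <-chan uint64) <-chan Event {
	out := make(chan Event)
	go func() {
		defer close(out)
		for {
			select {
			case <-ctx.Done():
				return
			case h, ok := <-heights:
				if !ok {
					return
				}
				cs, err := src.FibreCommitments(ctx, h)
				if err != nil {
					return // sketch only: errors end the stream
				}
				for _, c := range cs {
					data, err := f.Get(ctx, c)
					if err != nil {
						return
					}
					select {
					case out <- Event{Height: h, Commitment: c, Data: data}:
					case <-ctx.Done():
						return
					}
				}
			}
		}
	}()
	return out
}

// mapSource and mapStore are in-memory fakes.
type mapSource map[uint64][]Commitment

func (m mapSource) FibreCommitments(_ context.Context, h uint64) ([]Commitment, error) {
	return m[h], nil
}

type mapStore map[Commitment][]byte

func (m mapStore) Get(_ context.Context, c Commitment) ([]byte, error) {
	data, ok := m[c]
	if !ok {
		return nil, fmt.Errorf("blob %s not found", c)
	}
	return data, nil
}

// collect runs a subscription over two heights and formats the received events.
func collect() []string {
	src := mapSource{1: {"a"}, 2: {"b", "c"}}
	store := mapStore{"a": []byte("A"), "b": []byte("B"), "c": []byte("C")}
	heights := make(chan uint64, 2)
	heights <- 1
	heights <- 2
	close(heights)
	var got []string
	for ev := range Subscribe(context.Background(), src, store, heights) {
		got = append(got, fmt.Sprintf("%d:%s=%s", ev.Height, ev.Commitment, ev.Data))
	}
	return got
}

func main() {
	fmt.Println(collect())
}
```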


vgonkivs commented Apr 2, 2026

Feel free to open an ADR update with a proposal


vgonkivs commented Apr 2, 2026

> However, it does not change the fact that users can't pay for the uploads they make and we should fix this.

https://github.com/celestiaorg/celestia-node/pull/4892/changes#diff-575205cc93599bc2a9d28e62e576697e9fcf39733970a61aeebfade20493f1dbR50

@Wondertan

> Feel free to open an ADR update with a proposal

Nothing is stopping us from modifying the ADR in this PR. That's a normal feedback process: issues are discovered during implementation, and the spec/ADR is updated accordingly.

vgonkivs requested a review from Wondertan April 3, 2026 14:57

vgonkivs commented Apr 3, 2026

There is no final decision yet on the upload, so no changes were made to this part.

devin-ai-integration bot left a comment

Devin Review found 2 new potential issues.

View 12 additional findings in Devin Review.

}

if resp == nil {
	return nil, fmt.Errorf("querying escrow account %w", err)

🟡 Missing colon separator in error format string produces poorly formatted error message

At fibre/account.go:94, the error format string "querying escrow account %w" is missing a colon or separator before the %w verb. When an error occurs (e.g., all gRPC connections fail), the resulting error message will read like "querying escrow account connection refused" instead of "querying escrow account: connection refused", making it harder to parse both by humans and programmatic error handling.

Suggested change
- return nil, fmt.Errorf("querying escrow account %w", err)
+ return nil, fmt.Errorf("querying escrow account: %w", err)
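
A small self-contained illustration of the suggested format string. The errConnRefused sentinel and the helper names are hypothetical; only fmt.Errorf with %w and errors.Is are standard library behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// errConnRefused is a stand-in for an underlying transport error.
var errConnRefused = errors.New("connection refused")

// wrapEscrowErr wraps the cause with context, using a colon as the separator.
func wrapEscrowErr(err error) error {
	return fmt.Errorf("querying escrow account: %w", err)
}

// isRefused reports whether the wrapped chain contains errConnRefused.
func isRefused(err error) bool {
	return errors.Is(err, errConnRefused)
}

func main() {
	err := wrapEscrowErr(errConnRefused)
	fmt.Println(err)            // prints "querying escrow account: connection refused"
	fmt.Println(isRefused(err)) // prints "true"
}
```

Note that errors.Is and errors.As follow the %w chain regardless of the separator, so the colon mainly improves readability and any text-based log parsing.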

Comment on lines 133 to +136
	Blobstream: &blobstreamAPI,
	Header:     &headerAPI,
	Blob:       &readOnlyBlobAPI{&blobAPI},
	Fibre:      &fibreAPI,

🔴 Nil pointer dereference in ReadClient.Fraud — field initialized but never assigned

In api/client/read_client.go, the ReadClient struct declares a Fraud fraudapi.Module field (line 34), and the constructor creates and closes a fraudAPI/fraudCloser (lines 90–96, line 125), but the Fraud field is never populated in the returned struct (lines 115–136). Any caller using client.Fraud will get a nil Module and panic on method calls. The PR modifies this exact return block to add Fibre: &fibreAPI but leaves the pre-existing Fraud omission unfixed.

(Refers to lines 132-136)
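
The failure mode described in this finding can be reproduced in isolation. In the sketch below, ReadClient and its Fraud field mirror the names in the finding, while the Module interface, its Ping method, and fraudAPI are simplified hypothetical stand-ins for the real types.

```go
package main

import "fmt"

// Module is a simplified stand-in for an API module interface.
type Module interface {
	Ping() string
}

// fraudAPI is a trivial implementation of Module.
type fraudAPI struct{}

func (fraudAPI) Ping() string { return "pong" }

// ReadClient has an interface-typed field that a constructor may forget to set.
type ReadClient struct {
	Header Module
	Fraud  Module
}

// callFraud calls a method through the Fraud field and reports whether it panicked.
func callFraud(c *ReadClient) (out string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return c.Fraud.Ping(), false
}

func main() {
	// Fraud is left unset, so it holds a nil interface value.
	_, panicked := callFraud(&ReadClient{Header: fraudAPI{}})
	fmt.Println(panicked) // prints "true"

	// Assigning the field in the constructor avoids the panic.
	out, panicked := callFraud(&ReadClient{Header: fraudAPI{}, Fraud: fraudAPI{}})
	fmt.Println(out, panicked) // prints "pong false"
}
```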
