fix: improve retry and poll logic #813

Merged: hugomrdias merged 13 commits into master from hugomrdias/738 on Jun 4, 2026

@hugomrdias
Member

@hugomrdias hugomrdias commented Jun 2, 2026

iso-web 3.0.0 added support for retry and poll logic and this PR updates all SP functions for it.

  • updates to iso-web 3.0.0 with separate retry and poll logic
  • sp functions can now retry on errors and also poll on 2XXs
  • new Curio 429 retry-after header is respected and properly handled
  • most SP endpoints retry on error
  • findOnProviders uses a HEAD request on the retrieval url
  • response schema validation is now handled by iso-web
  • retry and poll options exposed in all methods
{
    /** The number of retries. Defaults to 2. */
    retryCount?: number
    /** The delay with exponential backoff between retries in milliseconds. Defaults to 250ms. */
    retryDelay?: number
    /** Whether to poll the request. Defaults to false. */
    poll?: boolean
    /** The poll interval in milliseconds. Defaults to 4 seconds (4000ms). */
    pollInterval?: number
}
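
A minimal sketch of how these options could translate into wait times. Only the option names and defaults come from the description above. The helper names (`resolveOptions`, `backoffDelay`, `parseRetryAfter`, `nextDelay`) are hypothetical, the doubling factor is an assumption about what "exponential backoff" means here, and none of this is iso-web's actual implementation. The `Retry-After` parsing follows the two forms HTTP allows (delay in seconds or an HTTP date).

```typescript
// Option names and defaults are taken from the PR description.
interface RetryPollOptions {
  retryCount?: number
  retryDelay?: number
  poll?: boolean
  pollInterval?: number
}

// Hypothetical helper: fill in the documented defaults.
function resolveOptions(options: RetryPollOptions = {}): Required<RetryPollOptions> {
  return {
    retryCount: options.retryCount ?? 2,
    retryDelay: options.retryDelay ?? 250,
    poll: options.poll ?? false,
    pollInterval: options.pollInterval ?? 4000,
  }
}

// Assumption: the delay doubles on every attempt (attempt is 0-based).
function backoffDelay(attempt: number, retryDelay: number): number {
  return retryDelay * 2 ** attempt
}

// Retry-After is either a number of seconds or an HTTP date.
// Returns the wait in milliseconds, or undefined if the header is absent or unparsable.
function parseRetryAfter(header: string | null, now: number = Date.now()): number | undefined {
  if (header === null) return undefined
  const value = header.trim()
  if (/^\d+$/.test(value)) return Number(value) * 1000
  const date = Date.parse(value)
  if (Number.isNaN(date)) return undefined
  return Math.max(0, date - now)
}

// Prefer the server's Retry-After hint (e.g. on a 429), else fall back to backoff.
function nextDelay(
  attempt: number,
  retryDelay: number,
  retryAfter: string | null,
  now?: number
): number {
  return parseRetryAfter(retryAfter, now) ?? backoffDelay(attempt, retryDelay)
}
```

Under these assumptions the default settings would wait 250ms before the first retry and 500ms before the second, unless the provider sends a `Retry-After` header, in which case that value wins.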

closes #738

@hugomrdias hugomrdias requested a review from rvagg as a code owner June 2, 2026 16:31
@github-project-automation github-project-automation Bot moved this to 📌 Triage in FOC Jun 2, 2026
@hugomrdias hugomrdias self-assigned this Jun 2, 2026
@BigLep BigLep requested a review from Copilot June 2, 2026 17:34
@BigLep BigLep moved this from 📌 Triage to 🔎 Awaiting review in FOC Jun 2, 2026
Contributor

Copilot AI left a comment

Pull request overview

Updates synapse-core and synapse-sdk Service Provider (SP) HTTP operations to align with iso-web@^3.0.0 by separating retry (errors) from polling (2XX status workflows), and by updating piece-resolution to use retrieval-url HEAD checks.
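
The retry/poll split can be sketched as a small decision function. This is an illustration only: `nextStep`, the `Outcome` shape, and the `ready` flag are hypothetical, and the retryable status list is the set of iso-web defaults quoted later in this conversation. It assumes a status endpoint that answers 2XX while the work is still pending.

```typescript
type NextStep = 'done' | 'poll' | 'retry' | 'fail'

// Hypothetical outcome of one request attempt.
interface Outcome {
  /** HTTP status; undefined models a network error with no response. */
  status?: number
  /** Hypothetical flag: the 2XX body says the awaited state was reached. */
  ready?: boolean
}

// Default retry status codes as quoted in the review discussion.
const RETRYABLE_STATUS_CODES = [408, 413, 429, 500, 502, 503, 504]

function nextStep(outcome: Outcome, retriesLeft: number, pollEnabled: boolean): NextStep {
  const { status, ready } = outcome

  // Success path: a 2XX either finishes the call or, with polling on, asks again later.
  if (status !== undefined && status >= 200 && status < 300) {
    if (ready === true || !pollEnabled) return 'done'
    return 'poll'
  }

  // Error path: only retryable failures consume the retry budget.
  const retryable = status === undefined || RETRYABLE_STATUS_CODES.includes(status)
  return retryable && retriesLeft > 0 ? 'retry' : 'fail'
}
```

The point of the separation is that polling never touches the retry budget: a pending 2XX response is a normal answer, not a failure.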

Changes:

  • Bumped iso-web to ^3.0.0 and refactored SP request calls to use retry/poll options (including schema validation via iso-web in some endpoints).
  • Updated “waitFor*” and other SP flows to use iso-web polling semantics, and adjusted tests accordingly.
  • Changed provider piece discovery to use HEAD /piece/{cid} and updated resolvers + tests to reflect the new behavior.
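
The HEAD-based discovery described in the last bullet might look roughly like the sketch below. The function names, the sequential probing order, and the assumption that the `/piece/{cid}` path is resolved against each provider's serviceURL are illustrative and not taken from the synapse-core source. The fetch implementation is passed in so the sketch can be exercised without a network.

```typescript
// Minimal subset of fetch that the sketch needs.
type FetchLike = (url: string, init: { method: string }) => Promise<{ ok: boolean }>

// Hypothetical helper: build the retrieval URL for a piece on one provider.
function pieceUrl(serviceURL: string, pieceCid: string): string {
  const base = serviceURL.endsWith('/') ? serviceURL : `${serviceURL}/`
  return new URL(`piece/${pieceCid}`, base).toString()
}

// Probe providers one by one with HEAD and return the first serviceURL that has the piece.
async function findOnProvidersSketch(
  serviceURLs: string[],
  pieceCid: string,
  fetchFn: FetchLike
): Promise<string | undefined> {
  for (const serviceURL of serviceURLs) {
    try {
      const response = await fetchFn(pieceUrl(serviceURL, pieceCid), { method: 'HEAD' })
      if (response.ok) return serviceURL
    } catch {
      // This provider is unreachable; move on to the next one.
    }
  }
  return undefined
}
```

A HEAD request only needs the response status, so no piece bytes are transferred while checking availability. In a runtime with a global `fetch`, it can be passed directly as `fetchFn`.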

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| pnpm-workspace.yaml | Bumps iso-web dependency to ^3.0.0. |
| packages/synapse-sdk/src/test/synapse.test.ts | Updates mocks to expect HEAD /piece/:pieceCid for retrieval probing. |
| packages/synapse-sdk/src/test/storage.test.ts | Updates storage tests to mock retrieval HEAD before GET. |
| packages/synapse-sdk/src/storage/context.ts | Switches findPiece usage from retry to poll for “piece becomes available” behavior. |
| packages/synapse-core/test/sp.test.ts | Updates SP tests to pass retryCount/retryDelay/poll options. |
| packages/synapse-core/test/resolve-piece-url.test.ts | Updates provider discovery tests for HEAD /piece/{cid} and new return value semantics. |
| packages/synapse-core/test/pull.test.ts | Adds retry-delay configuration for pull tests. |
| packages/synapse-core/src/utils/constants.ts | Splits retry/poll constants into POLL_*, RETRY_DELAY, TIMEOUT. |
| packages/synapse-core/src/sp/upload.ts | Adds retry options and switches JSON request bodies to json:; uses polling for findPiece. |
| packages/synapse-core/src/sp/upload-streaming.ts | Adds retry options to create/finalize steps; keeps streaming upload behavior. |
| packages/synapse-core/src/sp/schedule-piece-deletion.ts | Migrates delete flow to iso-web@3 error types and retry config. |
| packages/synapse-core/src/sp/pull-pieces.ts | Refactors pull + waitFor logic to use iso-web retry/poll configuration. |
| packages/synapse-core/src/sp/ping.ts | Adds retry + timeout options to ping request. |
| packages/synapse-core/src/sp/get-data-set.ts | Adds retry options and delegates schema validation to iso-web. |
| packages/synapse-core/src/sp/find-piece.ts | Replaces retry boolean with retryCount/retryDelay and adds poll options. |
| packages/synapse-core/src/sp/create-dataset.ts | Adds retry/poll options to create + waitFor status flow using iso-web polling. |
| packages/synapse-core/src/sp/create-dataset-add-pieces.ts | Threads retry/poll options through composite waitFor flow. |
| packages/synapse-core/src/sp/add-pieces.ts | Adds retry/poll options for waitFor status checks using iso-web polling. |
| packages/synapse-core/src/piece/resolve-piece-url.ts | Changes provider probing to HEAD on retrieval URL and returns the matching serviceURL. |
| packages/synapse-core/src/piece/download.ts | Adds retry options and signals to download operations. |
| examples/cli/src/commands/upload-dataset.ts | Updates CLI example to use poll: true for findPiece. |
| docs/src/content/docs/developer-guides/synapse-core.mdx | Updates docs example from retry to poll for findPiece. |

Collaborator

@rvagg rvagg left a comment

Approving, assuming all comments/suggestions are resolved. Copilot has a good point about poll on the waitFor msgs, and there's some minor cleanup that's possible.

@github-project-automation github-project-automation Bot moved this from 🔎 Awaiting review to ✔️ Approved by reviewer in FOC Jun 3, 2026
@rvagg
Collaborator

rvagg commented Jun 3, 2026

Lifting up something I was writing inline at createDataSet:

Cases like this call are interesting given that it'll retry on a 5xx. If we get a createDataSet or addPieces error from some intermediary piece, or perhaps from a Curio db problem, we will retry even though the first attempt might have resulted in a tx on the chain. The user will then get an error because of nonce reuse, so now we're in a crappy situation of having submitted something to the chain while the user sees a strange error that's disconnected from the real one. Since iso-web also retries on isNetworkError, it's even worse: bad wifi, but the request got submitted? Well, when we retry and submit it again, you're going to be told "can't reuse that nonce" or whatever our errors are for that.

I'm not sure what the answer is here tbh. Retrying on a 429 is an easy choice, but maybe the user should be handed errors in the case of 5xx's and network errors?

@juliangruber
Member

> Lifting up something I was writing inline at createDataSet:
>
> Cases like this call are interesting given that it'll retry on a 5xx. If we get a createDataSet or addPieces error from some intermediary piece, or perhaps from a Curio db problem, we will retry even though the first attempt might have resulted in a tx on the chain. The user will then get an error because of nonce reuse, so now we're in a crappy situation of having submitted something to the chain while the user sees a strange error that's disconnected from the real one. Since iso-web also retries on isNetworkError, it's even worse: bad wifi, but the request got submitted? Well, when we retry and submit it again, you're going to be told "can't reuse that nonce" or whatever our errors are for that.
>
> I'm not sure what the answer is here tbh. Retrying on a 429 is an easy choice, but maybe the user should be handed errors in the case of 5xx's and network errors?

Agreed. The mutating POSTs (createDataSet, createDataSetAndAddPieces, addPieces, pullPieces) now set retry: { methods: ['post'], ... }. iso-web's default retry status codes are [408, 413, 429, 500, 502, 503, 504] and it also retries all network errors. So if Curio submits an on-chain tx and then returns a 500 (or the connection drops), the SDK silently re-POSTs → second tx submission → nonce-reuse error surfaced to the user, disconnected from the real cause. For these non-idempotent, tx-producing endpoints I'd restrict retry to statusCodes: [429] (keeping afterStatusCodes for Retry-After) and not retry on 5xx/network. Worth resolving before merge.
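
The proposed policy could be expressed as a predicate like the one below. This is a sketch under stated assumptions: the `Failure` type and `shouldRetryFailure` are hypothetical names, not iso-web or synapse-core API. Only the default status code list and the "429 only for tx-producing endpoints" rule come from the comment above.

```typescript
// Default retry status codes as quoted in the comment above.
const DEFAULT_RETRY_STATUS_CODES = [408, 413, 429, 500, 502, 503, 504]

// Hypothetical model of a failed attempt: an HTTP error status or a network error.
type Failure = { kind: 'status'; status: number } | { kind: 'network' }

/**
 * Decide whether a failed attempt may be retried.
 * `mutating` marks non-idempotent, tx-producing endpoints
 * (createDataSet, createDataSetAndAddPieces, addPieces, pullPieces).
 */
function shouldRetryFailure(failure: Failure, mutating: boolean): boolean {
  if (mutating) {
    // A 429 means the provider rejected the request before acting on it.
    // A 5xx or a dropped connection may arrive after the tx was submitted.
    return failure.kind === 'status' && failure.status === 429
  }
  if (failure.kind === 'network') return true
  return DEFAULT_RETRY_STATUS_CODES.includes(failure.status)
}
```

Idempotent reads keep the broad defaults, while mutating calls surface 5xx and network errors to the caller instead of risking a second submission.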

@github-project-automation github-project-automation Bot moved this from ✔️ Approved by reviewer to ⌨️ In Progress in FOC Jun 3, 2026
@hugomrdias
Member Author

@rvagg @juliangruber I agree with your comments, but I would keep the retry on NetworkError. This error means the network failed and the request never got to the other side, so it's safe to retry in that case.

@juliangruber
Member

juliangruber commented Jun 3, 2026

> @rvagg @juliangruber I agree with your comments, but I would keep the retry on NetworkError. This error means the network failed and the request never got to the other side, so it's safe to retry in that case.

A NetworkError does not guarantee the request never arrived. iso-web's NetworkError comes from is-network-error, whose triggers include 'terminated', 'Network connection lost', 'fetch failed', and 'Failed to fetch' — all of which can fire after the request was sent and the server acted on it, when only the response was lost.
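
The failure mode can be reproduced with a toy model. Everything here is hypothetical (`FakeProvider`, `submitWithBlindRetry`, the error messages other than 'fetch failed'); it only simulates a server that acts on a request and then loses the response, followed by a client that blindly retries.

```typescript
// Toy stand-in for a provider that submits an on-chain tx per request.
class FakeProvider {
  private usedNonces = new Set<number>()
  submissions = 0

  // `loseResponse` simulates the connection dropping after the server has acted.
  submit(nonce: number, loseResponse: boolean): string {
    if (this.usedNonces.has(nonce)) throw new Error('nonce already used')
    this.usedNonces.add(nonce)
    this.submissions++
    // The side effect above has already happened when the client sees the failure.
    if (loseResponse) throw new TypeError('fetch failed')
    return 'ok'
  }
}

// A client that treats every network error as "the request never arrived".
function submitWithBlindRetry(provider: FakeProvider, nonce: number): string {
  try {
    return provider.submit(nonce, true)
  } catch {
    return provider.submit(nonce, false)
  }
}

const provider = new FakeProvider()
let surfacedError = ''
try {
  submitWithBlindRetry(provider, 1)
} catch (err) {
  surfacedError = (err as Error).message
}
// surfacedError is 'nonce already used' and provider.submissions is 1:
// the tx went through once, but the caller only sees the nonce error.
```

The original submission succeeded, yet the only error the caller receives comes from the retry, which is the disconnect described earlier in this thread.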

@hugomrdias hugomrdias requested a review from juliangruber June 3, 2026 12:09
Member

@juliangruber juliangruber left a comment

There's an issue with 429 handling: iso-web's shouldRetry replaces the built-in decision rather than filtering it further, so the statusCodes option is ignored.
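
A small model of the difference between "replace" and "filter" semantics may help. This is not iso-web's source: `RetryConfig`, `decideReplace`, `decideFilter`, and the `shouldRetry` signature are assumptions made for illustration, based only on the behavior described in this comment.

```typescript
// Hypothetical retry configuration; the real option shapes may differ.
interface RetryConfig {
  statusCodes: number[]
  shouldRetry?: (status: number) => boolean
}

// "Replace" semantics as described above: a custom shouldRetry is the whole decision.
function decideReplace(config: RetryConfig, status: number): boolean {
  if (config.shouldRetry) return config.shouldRetry(status)
  return config.statusCodes.includes(status)
}

// "Filter" semantics a caller might have expected: shouldRetry narrows the built-in check.
function decideFilter(config: RetryConfig, status: number): boolean {
  const builtIn = config.statusCodes.includes(status)
  return builtIn && (config.shouldRetry ? config.shouldRetry(status) : true)
}

// A callback that does not check the status itself.
const loose: RetryConfig = { statusCodes: [429], shouldRetry: () => true }
const looseReplace = decideReplace(loose, 500) // true: statusCodes is never consulted
const looseFilter = decideFilter(loose, 500) // false

// Under replace semantics the 429 check has to live inside the callback.
const strict: RetryConfig = { statusCodes: [429], shouldRetry: (status) => status === 429 }
const strictReplace = decideReplace(strict, 500) // false
```

With replace semantics, restricting retries to 429 means putting the status check in the callback itself rather than relying on `statusCodes`.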

@hugomrdias hugomrdias requested a review from juliangruber June 3, 2026 15:00
Collaborator

@rvagg rvagg left a comment

Approving again. Nice find by @juliangruber, and it looks like it's fixed precisely as prescribed.

The only possible remaining change is that methods: ['post'] / ['delete'] are now redundant where you have a shouldRetry, based on what Julian pointed out. Not a blocking concern, just possibly causing confusion.

@rvagg rvagg dismissed juliangruber’s stale review June 4, 2026 05:07

Julian's OOO today and the concern was taken care of

- updates to iso-web 3.0.0 with retry and poll separate logic
- most SP endpoints retry on error
- findOnProviders uses a HEAD request on the retrieval url
- response schema validation is now handled by iso-web
- retry and poll options exposed in all methods

closes #738
Comment thread pnpm-workspace.yaml Outdated
# Remove after 2026-06-08 when both age out.
- viem@2.52.0
- ox@0.14.27
- astro
Collaborator

can you version this pls, and add a comment

Collaborator

I'm doing it in commit 1287666 in #821

@hugomrdias hugomrdias changed the title from "feat: improve retry and poll logic" to "fix: improve retry and poll logic" Jun 4, 2026
@hugomrdias hugomrdias merged commit 3eafe1f into master Jun 4, 2026
10 checks passed
@github-project-automation github-project-automation Bot moved this from ⌨️ In Progress to 🎉 Done in FOC Jun 4, 2026
@hugomrdias hugomrdias deleted the hugomrdias/738 branch June 4, 2026 13:37
Linked issue: Retry sp http requests on network error

5 participants