DRIVERS-3535 - Client Backpressure with retryAfterMS #1953

Open
NoahStapp wants to merge 16 commits into mongodb:master from NoahStapp:DRIVERS-3535

@NoahStapp

Contributor

Please complete the following before merging:

  • Is the relevant DRIVERS ticket in the PR title?
  • Update changelog.
  • Test changes in at least one language driver. Python.
  • Test these changes against all server versions and topologies (including standalone, replica set, and sharded
    clusters).

@NoahStapp NoahStapp requested a review from a team as a code owner June 16, 2026 18:04
@NoahStapp NoahStapp requested review from jyemin and tadjik1 June 16, 2026 18:04
Comment thread source/client-backpressure/client-backpressure.md Outdated
Comment thread source/client-backpressure/client-backpressure.md Outdated
Comment thread source/client-backpressure/client-backpressure.md Outdated
Comment thread source/client-backpressure/client-backpressure.md Outdated
Comment thread source/client-backpressure/client-backpressure.md Outdated
NoahStapp and others added 2 commits June 18, 2026 09:31
Co-authored-by: Sergey Zelenov <sergey.zelenov@mongodb.com>
@NoahStapp NoahStapp requested a review from a team as a code owner June 18, 2026 13:34
@NoahStapp NoahStapp requested a review from mana2-bot June 18, 2026 13:34

@tadjik1 tadjik1 left a comment

Member

Thanks @NoahStapp, everything looks good! I will keep an eye on the retryAfterMS max value; once it's set, I'll approve this PR.

Comment thread source/client-backpressure/tests/README.md Outdated
Comment thread source/mongodb-handshake/tests/README.md Outdated
Comment thread source/mongodb-handshake/tests/README.md
@NoahStapp

Contributor Author

After further discussion with Server, I've simplified the backoff calculation formula to remove the custom jitter for retryAfterMS. Now retryAfterMS simply replaces BASE_BACKOFF in the existing formula.

@NoahStapp NoahStapp requested review from blink1073 and tadjik1 June 23, 2026 16:06
@blink1073

Member

As a note, the two failing checks should be fixed by #1958

retry_after_ms = exc.retry_after_ms
if retry_after_ms:
    retry_after = retry_after_ms / 1000  # Convert from milliseconds to seconds
    backoff = jitter * min(MAX_BACKOFF, retry_after * 2 ** (attempt - 1))

Member

jitter is undefined in this if block; should it be hoisted above it?

Contributor Author

Yup, good catch.

@NoahStapp NoahStapp requested a review from blink1073 June 24, 2026 16:10

# If present on the error, retryAfterMS sets the base backoff
retry_after_ms = exc.retry_after_ms
if retry_after_ms:

@blink1073 blink1073 Jun 24, 2026

Member

I think retry_after should default to BASE_BACKOFF and be overridden with the new value if retry_after_ms is given; a single line can then set the backoff.

Contributor Author

🙏 even more concise, thanks!
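
For readers following the thread, the suggested restructuring could look roughly like the sketch below. It is an illustration only, not the code in the PR. The function name `compute_backoff` and its `jitter` parameter are hypothetical, the constants are expressed in seconds to match the millisecond-to-second conversion in the snippet under review, and drawing jitter uniformly from [0, 1) is an assumption that this thread does not confirm.

```python
import random

BASE_BACKOFF = 0.1  # seconds (100ms in the spec text)
MAX_BACKOFF = 10.0  # seconds (10000ms in the spec text)


def compute_backoff(attempt, retry_after_ms=None, jitter=None):
    """Return the delay in seconds before retry number `attempt` (1 = first retry).

    `compute_backoff` and the `jitter` override are hypothetical names used
    for illustration; they do not come from the PR.
    """
    if jitter is None:
        # Assumption: jitter is a uniform random float in [0, 1).
        jitter = random.random()
    # Default to the driver's base backoff ...
    retry_after = BASE_BACKOFF
    # ... and let a positive server-supplied retryAfterMS replace it.
    if retry_after_ms is not None and retry_after_ms > 0:
        retry_after = retry_after_ms / 1000  # milliseconds to seconds
    # A single line computes the backoff for both cases.
    return jitter * min(MAX_BACKOFF, retry_after * 2 ** (attempt - 1))
```

Defining `jitter` before the branch also addresses the earlier comment about it being undefined inside the `if` block.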

@blink1073 blink1073 left a comment

Member

LGTM!

@NoahStapp NoahStapp requested a review from blink1073 June 24, 2026 16:24
- `MAX_BACKOFF` is 10000ms.
- This results in delays of 100ms and 200ms before accounting for jitter.
3. If the request is eligible for retry (as outlined in step 2 above), the client MUST apply backoff according to the
following formula: `backoff = jitter * min(MAX_BACKOFF, BASE_BACKOFF * 2^(attempt - 1))`

Contributor

Is attempt the number of retries, or does it include the first time the command is executed? It seems like it should be the former, but can we be explicit in the list below?

Contributor Author

attempt is the retry number, so attempt=1 is the first retry.
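
To make the numbering concrete, here is a small sketch of the pre-jitter delays under that reading, where `attempt=1` is the first retry. The helper name `pre_jitter_delay_ms` is hypothetical, and the loop runs over eight attempts only to show where the cap takes effect; it is not a statement about how many retries the spec allows.

```python
BASE_BACKOFF_MS = 100     # default base backoff from the spec text
MAX_BACKOFF_MS = 10_000   # cap from the spec text


def pre_jitter_delay_ms(attempt):
    """Upper bound on the delay (ms) before jitter; attempt=1 is the first retry."""
    if attempt < 1:
        raise ValueError("attempt counts retries and starts at 1")
    return min(MAX_BACKOFF_MS, BASE_BACKOFF_MS * 2 ** (attempt - 1))


# Hypothetical range, chosen only to show where the cap applies.
delays = [pre_jitter_delay_ms(n) for n in range(1, 9)]
print(delays)
```

The first two values are 100 and 200, matching the "100ms and 200ms" delays quoted from the spec text.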

2. `MAX_BACKOFF` is 10000ms.
3. `BASE_BACKOFF` is constant 100ms.
4. This results in delays of 100ms and 200ms before accounting for jitter.
5. If `retryAfterMS` is present on the error and has a positive value, the client MUST use that value instead of

Contributor

Is retryAfterMS mis-named? It seems weird to multiply a value with that name by 2^(attempt - 1).
I'm not asking that we change it, but consider a sentence of explanation.

Contributor Author

retryAfterMS is the server-supplied base backoff to use in place of the driver's default BASE_BACKOFF. Is the explanation you're looking for one that makes that clearer?
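
A worked example may help show why the value is scaled. Assuming a hypothetical server-supplied `retryAfterMS` of 500 (a number chosen purely for illustration) substituted for `BASE_BACKOFF` in the formula quoted above, the delays in milliseconds would be:

```latex
\begin{align*}
\text{backoff}(\text{attempt}) &= \text{jitter} \cdot \min\bigl(10000,\ 500 \cdot 2^{\text{attempt}-1}\bigr) \\
\text{backoff}(1) &= \text{jitter} \cdot \min(10000,\ 500) = \text{jitter} \cdot 500 \\
\text{backoff}(2) &= \text{jitter} \cdot \min(10000,\ 1000) = \text{jitter} \cdot 1000 \\
\text{backoff}(6) &= \text{jitter} \cdot \min(10000,\ 16000) = \text{jitter} \cdot 10000
\end{align*}
```

Read this way, `retryAfterMS` acts as the starting point of the exponential schedule rather than as a fixed wait, and `MAX_BACKOFF` still caps the result.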

else:
    cmd = {"legacy hello": 1, "helloOk": 1}
-cmd["backpressure"] = True
+cmd["backpressure"] = "2"

Contributor

Can we add this to the normative part of the spec, perhaps as the third paragraph in "Connection handshake"? Something with a MUST, that also defines the semantics of the value "2", as compared to True.
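
As a rough illustration of the behavior such a paragraph would need to pin down, the sketch below builds a handshake command carrying the new value. Everything except the `else` branch and the `"2"` assignment is an assumption: the helper name `build_handshake_command`, its `use_modern_hello` flag, and the `{"hello": 1}` branch are hypothetical. The meaning of `"2"` as opposed to `True` is not defined anywhere in this thread, so the sketch leaves it unspecified.

```python
def build_handshake_command(use_modern_hello):
    """Build the initial handshake command document (hypothetical helper).

    Only the `else` branch and the "backpressure" assignment mirror the
    snippet under review; the rest is assumed for illustration.
    """
    if use_modern_hello:
        # Assumption: the modern handshake uses the `hello` command.
        cmd = {"hello": 1}
    else:
        # "legacy hello" is the placeholder used in the reviewed snippet.
        cmd = {"legacy hello": 1, "helloOk": 1}
    # The reviewed change sends the string "2" where True was sent before.
    cmd["backpressure"] = "2"
    return cmd


print(build_handshake_command(use_modern_hello=False))
```

Note that the field is set on both branches, so normative text would presumably need to cover both handshake forms.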

@NoahStapp NoahStapp requested a review from jyemin June 25, 2026 19:25

5 participants