Fix/issue 9469 tso backoff weight #10569

Draft
ljluestc wants to merge 3 commits into tikv:master from ljluestc:fix/issue-9469-tso-backoff-weight

@ljluestc ljluestc commented Apr 4, 2026

client: fix TSO stream-loop timeout reset

What problem does this PR solve?

Issue Number: Close #9469

What is changed and how does it work?

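Background

Issue #9469 reports that tidb_backoff_weight has no effect on the TSO timeout, while adjusting pd-client.pd-server-timeout does change it. This is inconsistent with the expected behavior.

Problem analysis

In the current TSO dispatcher:

  • During initialization, GetTSOTimeout() supplies the initial value of the stream loop timer
  • Every batch iteration then calls streamLoopTimer.Reset(option.Timeout)
  • That reset pulls the timeout back to pd-server-timeout, so the TSO timeout configuration (and the mechanisms built on it) does not stay in effect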
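
The effect described above can be sketched in isolation. This is a minimal illustration, not the dispatcher code: the helper `armedTimeouts` and the 3-second PD timeout are assumptions made for the example, while the 15-second TSO timeout mirrors the default stated in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values only. pdTimeout stands in for option.Timeout
// (pd-server-timeout); its value here is an assumption. tsoTimeout stands
// in for the value returned by GetTSOTimeout().
const (
	pdTimeout  = 3 * time.Second
	tsoTimeout = 15 * time.Second
)

// armedTimeouts (hypothetical helper) creates a timer with `initial`, then
// re-arms it with `perBatch` once per batch. It returns every duration the
// timer was armed with, in order.
func armedTimeouts(initial, perBatch time.Duration, batches int) []time.Duration {
	timer := time.NewTimer(initial)
	defer timer.Stop()

	armed := []time.Duration{initial}
	for i := 0; i < batches; i++ {
		// Stop the timer and drain a pending tick before re-arming it.
		if !timer.Stop() {
			select {
			case <-timer.C:
			default:
			}
		}
		timer.Reset(perBatch)
		armed = append(armed, perBatch)
	}
	return armed
}

func main() {
	// Before the fix: created with the TSO timeout, re-armed with the PD timeout.
	fmt.Println("before:", armedTimeouts(tsoTimeout, pdTimeout, 3))
	// After the fix: the TSO timeout is used on every batch.
	fmt.Println("after: ", armedTimeouts(tsoTimeout, tsoTimeout, 3))
}
```

Only the first element of the "before" slice carries the TSO timeout; every later batch runs under the PD timeout, which matches the behavior reported in the issue.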

Fix

  1. client/opt/option.go: add TSO timeout configuration

    • Add a TSOTimeout field, defaulting to 15 seconds
    • Add the WithCustomTSOTimeoutOption() client option
    • Add a GetTSOTimeout() method that gives priority to the backoffer's total time limit
  2. client/clients/tso/dispatcher.go: fix the timeout reset logic

    • Change the per-batch streamLoopTimer.Reset(option.Timeout) to streamLoopTimer.Reset(tsoTimeout)
    • Ensure the stream-loop retry path and the deadline watcher both use TSO timeout semantics and no longer degrade to the PD timeout
  3. client/pkg/retry/backoff.go: support querying the backoffer's total time

    • Add a TotalTime() method that exposes the backoffer's total time limit
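
One way the timeout selection in step 1 could look is sketched below. The types `backoffer` and `option` and the method `effectiveTSOTimeout` are hypothetical stand-ins, not the code in this PR. The precedence shown (backoffer total time, then the configured value, then the 15-second default) is an assumption based on the description above.

```go
package main

import (
	"fmt"
	"time"
)

// defaultTSOTimeout mirrors the 15-second default stated in the PR description.
const defaultTSOTimeout = 15 * time.Second

// backoffer is a hypothetical stand-in for the retry backoffer.
type backoffer struct {
	total time.Duration
}

// TotalTime exposes the backoffer's total time limit. A nil backoffer
// reports zero, meaning "no limit configured".
func (b *backoffer) TotalTime() time.Duration {
	if b == nil {
		return 0
	}
	return b.total
}

// option is a hypothetical stand-in for the client option struct.
type option struct {
	tsoTimeout time.Duration
	bo         *backoffer
}

// effectiveTSOTimeout picks the timeout under the assumed precedence:
// backoffer total time, then the configured TSO timeout, then the default.
func (o *option) effectiveTSOTimeout() time.Duration {
	if t := o.bo.TotalTime(); t > 0 {
		return t
	}
	if o.tsoTimeout > 0 {
		return o.tsoTimeout
	}
	return defaultTSOTimeout
}

func main() {
	plain := &option{}
	custom := &option{tsoTimeout: 30 * time.Second}
	withBackoff := &option{tsoTimeout: 30 * time.Second, bo: &backoffer{total: time.Minute}}

	fmt.Println(plain.effectiveTSOTimeout())
	fmt.Println(custom.effectiveTSOTimeout())
	fmt.Println(withBackoff.effectiveTSOTimeout())
}
```

Under this assumed precedence, a larger backoffer budget would extend the TSO timeout without any change to pd-server-timeout.
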
client: fix TSO stream-loop timeout reset

Use the effective TSO timeout when resetting the dispatcher stream loop timer on each batch, so timeout behavior no longer falls back to pd-server-timeout.

Add a dispatcher unit test that verifies stream-loop retries honor TSO timeout instead of the PD timeout, and align suite setup timeout defaults with the new semantics.
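
The property such a test checks can be illustrated on a virtual clock. This is a sketch under stated assumptions and not the test added in this PR: `retryUntil`, `readyAfter`, the 1-second retry interval, and the 3-second PD timeout are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values only; pdTimeout and retryInterval are assumptions.
const (
	pdTimeout     = 3 * time.Second
	tsoTimeout    = 15 * time.Second
	retryInterval = 1 * time.Second
)

// retryUntil simulates a retry loop on a virtual clock. Each failed attempt
// advances the elapsed time by interval, and the loop gives up once the
// elapsed time reaches timeout. It returns the elapsed time at exit and
// whether a connection attempt succeeded.
func retryUntil(timeout, interval time.Duration, connect func(elapsed time.Duration) bool) (time.Duration, bool) {
	for elapsed := time.Duration(0); elapsed < timeout; elapsed += interval {
		if connect(elapsed) {
			return elapsed, true
		}
	}
	return timeout, false
}

// readyAfter returns a connect function that starts succeeding once the
// virtual clock has reached d.
func readyAfter(d time.Duration) func(time.Duration) bool {
	return func(elapsed time.Duration) bool { return elapsed >= d }
}

func main() {
	// The simulated stream becomes available 5 seconds in.
	connect := readyAfter(5 * time.Second)

	_, ok := retryUntil(pdTimeout, retryInterval, connect)
	fmt.Println("with PD timeout, connected:", ok)

	at, ok := retryUntil(tsoTimeout, retryInterval, connect)
	fmt.Println("with TSO timeout, connected:", ok, "after", at)
}
```

A stream that recovers after the PD timeout but within the TSO timeout is reached only when the loop honors the TSO timeout, which is the distinction the unit test is described as verifying.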

Remove local PR description artifact from tracked files.

Check List

Tests

  • Unit test

Code changes

  • Has the configuration change
  • Has HTTP APIs changed
  • Has persistent data change

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

Release note

Fix the issue where `tidb_backoff_weight` did not affect TSO timeout. The TSO dispatcher now correctly honors the TSO timeout configuration throughout the stream loop lifecycle.

Changed files

| File | Description |
| --- | --- |
| client/opt/option.go | Add TSOTimeout, WithCustomTSOTimeoutOption(), GetTSOTimeout() |
| client/clients/tso/dispatcher.go | Fix streamLoopTimer.Reset() to use the correct TSO timeout |
| client/pkg/retry/backoff.go | Add the TotalTime() method |
| client/opt/option_test.go | Add unit tests for the TSO timeout |
| client/clients/tso/dispatcher_test.go | Add a stream-loop timeout verification test |

Local verification

cd /home/calelin/dev/pd/client && make gotest GOTEST_ARGS='./clients/tso ./opt'

All tests pass.

Non-default keyspace group TSO services now read all keyspace group
timestamps at startup and use the maximum value, ensuring TSO
monotonicity without requiring metadata cleanup during TSO node
start/stop operations.

Signed-off-by: Jiale Lin <63439129+ljluestc@users.noreply.github.com>

Closes: tikv#9469

Signed-off-by: Jiale Lin <63439129+ljluestc@users.noreply.github.com>
@ti-chi-bot ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Apr 4, 2026

ti-chi-bot bot commented Apr 4, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign disksing for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


coderabbitai bot commented Apr 4, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f598b13f-6ac9-46e6-a7e4-23f422e9a5a5

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

@ti-chi-bot ti-chi-bot bot added contribution This PR is from a community contributor. dco-signoff: no Indicates the PR's author has not signed dco. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Apr 4, 2026

ti-chi-bot bot commented Apr 4, 2026

Hi @ljluestc. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


ti-chi-bot bot commented Apr 4, 2026

Thanks for your pull request. Before we can look at it, you'll need to add a 'DCO signoff' to your commits.

📝 Please follow instructions in the contributing guide to update your commits with the DCO

Full details of the Developer Certificate of Origin can be found at developercertificate.org.

The list of commits missing DCO signoff:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ti-chi-bot ti-chi-bot bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Apr 4, 2026
Labels

  • contribution - This PR is from a community contributor.
  • dco-signoff: no - Indicates the PR's author has not signed dco.
  • do-not-merge/work-in-progress - Indicates that a PR should not merge because it is a work in progress.
  • needs-ok-to-test - Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing.
  • release-note - Denotes a PR that will be considered when it comes time to generate release notes.
  • size/XL - Denotes a PR that changes 500-999 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

tidb_backoff_weight doesn't affect tso timeout

1 participant