TiFlash: add missing TiFlash Grafana metrics #21427

Open
hfxsd wants to merge 2 commits into pingcap:master from hfxsd:add-missing-tiflash-metrics

@hfxsd
Collaborator

@hfxsd hfxsd commented Mar 11, 2026

Expands the TiFlash monitoring doc with metrics and sections from the Grafana dashboards that were previously undocumented, and clarifies that the TiFlash proxy and Raft metrics overlap heavily with those of TiKV.

Added or renamed entries include Read Index OPS -> Raft Read Index OPS and Wait Index Duration -> Raft Wait Index Duration. Write & Delta Management Total is newly introduced.

New sections: Imbalance read/write, Memory trace, Storage Read Pool & Data Sharing, PageStorage, Rate Limiter, Raft Snapshot / IngestSST, Disaggregated-Write/Compute, S3, Pipeline Model, TiFlash Resource Control, Status Server, and Vector Search.

TiFlash-Proxy-Summary and TiFlash-Proxy-Details are also expanded extensively (cluster, errors, server, thread CPU, PD, raft IO/process/message/propose/admin, unified read pool, storage, scheduler, snapshot, task, threads, RocksDB, encryption, etc.).

First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions (in Chinese).

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)

What is the related PR or file link(s)?

  • This PR is translated from:
  • Other reference link(s):

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

@hfxsd hfxsd self-assigned this Mar 11, 2026
@ti-chi-bot ti-chi-bot bot added the contribution, missing-translation-status, and size/XL labels Mar 11, 2026
@hfxsd hfxsd requested review from 3pointer and niubell March 11, 2026 03:30
@hfxsd hfxsd added the translation/doing label and removed the missing-translation-status label Mar 11, 2026
@xzhangxian1008
Contributor

/assign

27 comment threads on tiflash/monitor-tiflash.md (the first 2 marked Outdated)
Co-authored-by: xzhangxian1008 <xzhangxian@foxmail.com>
@ti-chi-bot

ti-chi-bot bot commented Apr 15, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from hfxsd. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot

ti-chi-bot bot commented Apr 15, 2026

@hfxsd: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-verify | 733f73d | link | true | /test pull-verify |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

  • contribution: This PR is from a community contributor.
  • size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.
  • translation/doing: This PR’s assignee is translating this PR.

2 participants