Skip to content

util/stmtsummary: add tidb_stmt_summary_persist_evicted (#68513)#69132

Closed
ti-chi-bot wants to merge 1 commit into
pingcap:release-8.5from
ti-chi-bot:cherry-pick-68513-to-release-8.5
Closed

util/stmtsummary: add tidb_stmt_summary_persist_evicted (#68513)#69132
ti-chi-bot wants to merge 1 commit into
pingcap:release-8.5from
ti-chi-bot:cherry-pick-68513-to-release-8.5

Conversation

@ti-chi-bot

@ti-chi-bot ti-chi-bot commented Jun 12, 2026

Copy link
Copy Markdown
Member

This is an automated cherry-pick of #68513

What problem does this PR solve?

Issue Number: ref #67924

Problem Summary:

The v2 (persistent) statement summary maintains an LRU per window; when an entry is evicted before the window is persisted, downstream consumers of the stmt log have no record that the digest ever existed. Today the only signal is the aggregate "other" bucket appended at rotation time, which loses per-record identity.

What changed and how does it work?

Introduce a new global system variable tidb_stmt_summary_persist_evicted (default OFF).

  • When ON, every LRU eviction in the v2 backend persists a JSON entry to the stmt summary log with an "evicted":true marker, embedding the full StmtRecord snapshot, so downstream consumers can track which records were dropped from memory between rotations.
  • Persistence is asynchronous: evictions are cloned and enqueued on a buffered channel, so the Add() hot path never blocks on log I/O.
  • The async logger batches evicted records before writing them, flushing when the batch reaches 64 records, after 100 ms, or during shutdown. This reduces logger calls under high-frequency eviction while preserving one JSON line per evicted record.
  • The channel is bounded (1024 records). When full, evictions are dropped and a periodic warn log reports the dropped count so the operator can size accordingly.

The window callback was extended to optionally fire a per-eviction hook (onEvictFn). When the flag is OFF, the hook short-circuits immediately and no record cloning is performed.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

New test:

  • pkg/util/stmtsummary/v2/stmtsummary_test.go::TestStmtSummaryPersistEvicted

Verified locally:

  • GOCACHE=/private/tmp/tidb-go-build go test -run 'TestStmtSummaryPersistEvicted|TestWindowEvictedCountResetOnRotate|TestDefaultConfig' -tags=intest,deadlock -count=1 ./pkg/util/stmtsummary/v2
  • GOCACHE=/private/tmp/tidb-go-build go test -tags=intest,deadlock -count=1 ./pkg/util/stmtsummary/v2
  • make bazel_prepare
  • GOCACHE=/private/tmp/tidb-go-build make -o tools/bin/revive lint

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

When ON, stmt summary log volume grows in proportion to the eviction rate. The feature is default OFF, uses bounded buffering, performs non-blocking enqueue on the hot path, batches writes in the background, and may drop evicted records when the queue is full.

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Add a new system variable `tidb_stmt_summary_persist_evicted` (default OFF). When enabled, the v2 (persistent) statement summary persists JSON entries for LRU-evicted records to the stmt summary log with an `"evicted":true` marker, allowing downstream consumers to track records that were dropped from memory.

Summary by CodeRabbit

Release Notes

  • New Features
    • Added system variables tidb_stmt_summary_persist_evicted and tidb_stmt_summary_group_by_user for enhanced statement summary configuration control
    • Enhanced statement summary metrics to track and measure eviction events
    • Improved logging of evicted statements with detailed event tracking

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot ti-chi-bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. type/cherry-pick-for-release-8.5 This PR is cherry-picked to release-8.5 from a source PR. labels Jun 12, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 12, 2026

Copy link
Copy Markdown

This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick:

  1. It must be LGTMed and approved by the reviewers firstly.
  2. For pull requests to TiDB-x branches, it must have no failed tests.
  3. AFTER it has lgtm and approved labels, please wait for the cherry-pick merging approval from triage owners.
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot

Copy link
Copy Markdown
Member Author

@nolouch This PR has conflicts, I have hold it.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.

@ti-chi-bot

ti-chi-bot Bot commented Jun 12, 2026

Copy link
Copy Markdown

@ti-chi-bot: ## If you want to know how to resolve it, please read the guide in TiDB Dev Guide.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot

ti-chi-bot Bot commented Jun 12, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign terry1purcell for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown

Review Change Stack

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR extends TiDB's statement summary v2 system to persist individual evicted records and to optionally aggregate statements by executing user. It adds Prometheus metrics tracking, an async eviction logging pipeline with configurable persistence and grouping, and corresponding system variable hooks for runtime control.

Changes

Statement Summary Eviction Persistence and User Grouping

Layer / File(s) Summary
Prometheus metrics for statement summary
pkg/metrics/metrics.go, pkg/metrics/stmtsummary.go, pkg/metrics/metrics_internal_test.go
Defines StmtSummaryWindowRecordCount, StmtSummaryWindowEvictedCount, and StmtSummaryEvictedLogCounter metrics with lazy per-type gauge caching, initialization helpers, and test coverage for metric read/gather behavior.
Eviction persistence and user-grouping options in StmtSummary
pkg/util/stmtsummary/v2/stmtsummary.go
Extends StmtSummary with atomic runtime-togglable flags for eviction persistence and user-based grouping, exposing public getter/setter methods that coordinate state changes.
Async eviction logging infrastructure
pkg/util/stmtsummary/v2/stmtsummary.go
Implements onEvict callback to enqueue cloned evicted records into a buffered channel, evictedLogLoop goroutine to batch/flush with timer-based flushing and drop reporting, and wiring into window initialization paths.
Storage interface and dual-path evicted aggregation
pkg/util/stmtsummary/v2/stmtsummary.go
Extends stmtStorage with logEvicted method, updates stmtEvicted to maintain separate aggregates for normal vs per-record eviction logging, and adds cloneRecordForLog helper for safe async serialization.
Marshaling and logging of evicted records
pkg/util/stmtsummary/v2/logger.go
Adds logEvicted method to emit JSON-serialized evicted records with evicted marker, refactors marshaling helpers to conditionally include additional_fields from keyspace observability config, increments metrics on successful persistence.
Reader integration for persisted evicted records
pkg/util/stmtsummary/v2/reader.go, pkg/util/stmtsummary/v2/reader_test.go
Adds stmtPersistedRecord wrapper with Evicted flag, updates parse worker to skip processing records marked as evicted so they are not duplicated as aggregate rows.
Window lifecycle and user-based grouping
pkg/util/stmtsummary/v2/stmtsummary.go
Updates Add method to compute user-aware digest key under windowLock based on grouping mode, wires eviction callback into windows created during flush and rotation, uses compare-and-swap in Close for safe shutdown.
System variable configuration hooks
pkg/sessionctx/variable/sysvar.go, pkg/util/stmtsummary/v2/stmtsummary.go
Adds tidb_stmt_summary_persist_evicted and tidb_stmt_summary_group_by_user global sysvars backed by SetPersistEvicted and SetGroupByUser setter functions.
Test infrastructure and comprehensive coverage
pkg/util/stmtsummary/v2/BUILD.bazel, pkg/util/stmtsummary/v2/stmtsummary_test.go, pkg/util/stmtsummary/v2/record_test.go, pkg/metrics/metrics_internal_test.go
Adds Prometheus metric readers, tests for async eviction persistence with timestamp validation, non-duplication of logged records, user grouping behavior, gauge reset on window rotation, and additional_fields marshaling with evicted marker.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • pingcap/tidb#68513: Implements optional per-record eviction logging by extending v2 statement summary eviction path with async queueing and storage integration (onEvict/logEvicted).
  • pingcap/tidb#68404: Modifies statement-log JSON marshaling to incorporate keyspace observability fields, overlapping with this PR's marshalStmtRecord/evictedStmtRecord helpers.
  • pingcap/tidb#68836: Implements keyspace observability enhancements that drive the additional_fields attached to v2 statement-summary logs in this PR.

Suggested labels

cherry-pick-approved, ok-to-test, approved, lgtm

Suggested reviewers

  • yibin87
  • yudongusa
  • zimulala

Poem

🐰 Evictions hop through async lanes,
Records persist in metrics and chains,
User-grouped aggregates now align,
With gauges that reset and shine,
A statement summary now complete and fine!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.32% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding a new system variable tidb_stmt_summary_persist_evicted to the stmt summary feature.
Description check ✅ Passed The PR description is comprehensive, covering the problem statement, implementation details, test coverage, side effects, and documentation impact. All required template sections are completed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@nolouch nolouch closed this Jun 12, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 12, 2026

Copy link
Copy Markdown

@ti-chi-bot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
idc-jenkins-ci-tidb/check_dev 0d44e4d link true /test check-dev

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/cherry-pick-not-approved do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. type/cherry-pick-for-release-8.5 This PR is cherry-picked to release-8.5 from a source PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants