Skip to content

release-8.5: cherry-pick improments of statement summary coverage#69129

Open
nolouch wants to merge 3 commits into
pingcap:release-8.5from
nolouch:cherry-pick-statement-prs-release-8.5
Open

release-8.5: cherry-pick improments of statement summary coverage#69129
nolouch wants to merge 3 commits into
pingcap:release-8.5from
nolouch:cherry-pick-statement-prs-release-8.5

Conversation

@nolouch

@nolouch nolouch commented Jun 11, 2026

Copy link
Copy Markdown
Member

This is an automated cherry-pick of #66700

This is an automated cherry-pick of #68868

This is an automated cherry-pick of #68512

This is an automated cherry-pick of #68513

What problem does this PR solve?

Issue Number: close #68865, close #66669, ref #67924

Problem Summary:

Backport statement-summary improvements and the prerequisite ROW_COUNT restore fix to release-8.5.

  • SET SESSION_STATES could restore ROW_COUNT() from a previous SELECT as current affected rows, polluting affected-row metrics and statement summary records.
  • Statement summary window metrics from metrics, util/stmtsummary: add stmt summary window metrics #66700 are prerequisites for the later statement-summary backports on release-8.5.
  • Statement summary rows could not be deterministically grouped by executing user.
  • The v2 persistent statement summary could evict an LRU entry before window rotation without preserving the per-record identity in the stmt summary log.

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Manual test:

  • GOCACHE=/private/tmp/tidb-go-build-stmt-persist go test ./pkg/metrics -run "Test(StmtSummaryMetricLabels|RetLabel)"
  • GOCACHE=/private/tmp/tidb-go-build-stmt-persist go test ./pkg/util/stmtsummary/v2 -run "Test(StmtSummaryPersistEvicted|StmtSummaryPersistEvictedDoesNotPersistLoggedRecordsAsAggregate|WindowEvictedCountResetOnRotate|StmtRecord)"
  • GOCACHE=/private/tmp/tidb-go-build-stmt-persist go test ./pkg/sessionctx/variable -run "^$"
  • git diff --check

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

When tidb_stmt_summary_group_by_user is ON, statement summary cardinality can grow by distinct users per digest. When tidb_stmt_summary_persist_evicted is ON, stmt summary log volume grows with eviction rate. Both variables are default OFF.

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Add statement summary window observability metrics required by later statement-summary backports.

Fixed an issue that `SET SESSION_STATES` could restore `ROW_COUNT()` from a previous `SELECT` as current statement affected rows, causing affected-row metrics and statement summary records to report `18446744073709551615`.

Added a new system variable `tidb_stmt_summary_group_by_user` (default OFF). When enabled, statement summary rows are additionally grouped by executing user.

Added a new system variable `tidb_stmt_summary_persist_evicted` (default OFF). When enabled, the v2 persistent statement summary persists JSON entries for LRU-evicted records to the stmt summary log with an `"evicted":true` marker.

@ti-chi-bot

ti-chi-bot Bot commented Jun 11, 2026

Copy link
Copy Markdown

This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick:

  1. It must be LGTMed and approved by the reviewers firstly.
  2. For pull requests to TiDB-x branches, it must have no failed tests.
  3. AFTER it has lgtm and approved labels, please wait for the cherry-pick merging approval from triage owners.
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot Bot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 11, 2026
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR integrates Prometheus metrics into statement summary tracking, implements per-record eviction persistence with async logging, and enables optional grouping by executing user. The changes span v1 and v2 statement summary backends, system variables, session state handling, and comprehensive test coverage.

Changes

Statement Summary Metrics and Persistence

Layer / File(s) Summary
Prometheus metrics definitions and initialization
pkg/metrics/stmtsummary.go, pkg/metrics/metrics.go, pkg/metrics/BUILD.bazel, pkg/metrics/metrics_internal_test.go
Declares Prometheus gauge and counter vectors for window record count, evicted count, and evicted-log outcomes. InitStmtSummaryMetrics constructs vectors and caches v1/v2 gauge instances under mutex; SetStmtSummaryWindowMetrics updates gauges by type label. Tests validate metric collection and labeled counter increments.
Statement digest key user parameter and v1 grouping toggle
pkg/util/stmtsummary/statement_summary.go, pkg/util/stmtsummary/evicted_test.go, pkg/util/stmtsummary/statement_summary_test.go, pkg/util/stmtsummary/BUILD.bazel
Updates StmtDigestKey.Init to accept a user parameter and append length-prefixed user bytes when non-empty. Adds optGroupByUser flag to v1 map; AddStatement computes userForKey under lock and initializes digest key accordingly. Implements SetGroupByUser API to toggle grouping and clear existing summaries. All existing tests updated to pass empty string for user parameter; new tests validate grouping behavior and digest boundary correctness.
Statement summary v2 per-record eviction persistence and logging
pkg/util/stmtsummary/v2/stmtsummary.go, pkg/util/stmtsummary/v2/logger.go, pkg/util/stmtsummary/v2/reader.go
Adds async eviction pipeline: onEvict callback non-blockingly enqueues evicted records to evictedCh; dedicated evictedLogLoop batches and drains to storage via logEvicted. Logger marshals evicted records with "evicted": true JSON field (newline-separated), increments Prometheus counter. Reader filters out evicted records during history parsing. Storage interface extended with logEvicted method.
Statement summary v2 grouping-by-user and window metrics updates
pkg/util/stmtsummary/v2/stmtsummary.go
Extends v2 to track optGroupByUser; Add computes userForKey under windowLock and initializes digest key with user when grouping enabled. SetGroupByUser clears window to enforce consistent keying on toggle. rotateLoop calls updateMetrics under lock to set Prometheus window gauges. Proxy functions SetPersistEvicted and SetGroupByUser route configuration to global backend or v1 fallback.
Reader and logger handling of evicted records
pkg/util/stmtsummary/v2/reader.go, pkg/util/stmtsummary/v2/reader_test.go, pkg/util/stmtsummary/v2/record_test.go, pkg/util/stmtsummary/v2/logger.go
Reader unmarshals into stmtPersistedRecord wrapper with Evicted boolean, skips evicted records during output generation. Logger refactors marshaling into marshalStmtRecord and marshalEvictedStmtRecord helpers. Tests extended with JSON validation of evicted markers and digest field assertions.
System variables for eviction persistence and grouping
pkg/sessionctx/variable/tidb_vars.go, pkg/sessionctx/variable/sysvar.go
Defines TiDBStmtSummaryPersistEvicted and TiDBStmtSummaryGroupByUser as global scope variables with default false. SetGlobal implementations delegate to v2 setter functions; wired after TiDBStmtSummaryMaxSQLLength in variable registry.
Session state affected-rows handling
pkg/executor/select.go, pkg/sessionctx/variable/session.go, pkg/sessionctx/sessionstates/session_states_test.go
ResetContextOfStmt special-cases InSetSessionStatesStmt to copy PrevAffectedRows from context instead of deriving from AffectedRows(), preventing row-count pollution. DecodeSessionStates restores PrevAffectedRows directly from session state. Test added to assert AffectedRows() is 0 after select 1.
V1 and V2 tests and build configuration
pkg/util/stmtsummary/statement_summary_test.go, pkg/util/stmtsummary/v2/stmtsummary_test.go, pkg/util/stmtsummary/BUILD.bazel, pkg/util/stmtsummary/v2/BUILD.bazel
All existing v1 tests updated to pass empty user string to StmtDigestKey.Init. New v1 tests validate group-by-user behavior and digest key boundary collisions. V2 tests cover eviction persistence with controlled timestamps, evicted-record non-aggregation, grouping behavior, and metrics reset on rotate. Bazel targets updated with metrics dependency and increased shard counts for parallelism.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

  • pingcap/tidb#68513: Introduces per-eviction "evicted": true JSON field mechanism for statement-log entries and async eviction draining that the main PR builds upon for persistence.
  • pingcap/tidb#68512: Implements the initial group-by-user digest keying and SetGroupByUser toggles that are now integrated with Prometheus metrics in this PR.
  • pingcap/tidb#68868: Shares the ResetContextOfStmt special-casing of InSetSessionStatesStmt to prevent affected-rows pollution across session-state restores.

Suggested labels

release-note, size/XL, cherry-pick-approved, type/cherry-pick-for-release-8.5, ok-to-test, approved, lgtm

Suggested reviewers

  • yibin87
  • yudongusa
  • zimulala

🐰 Metrics bloom where evictions dance!
Per-record logs now persist and prance.
Users grouped, their stats arranged,
Prometheus gauges, neatly gauged.
From v1 to v2, no row escapes mischance! 📊✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title is partially related to the changeset—it mentions statement summary improvements but uses vague language ('improments' is likely a typo for 'improvements') and doesn't specify the key features being added (group-by-user and persist-evicted support). Consider revising the title to be more specific and correct the typo, e.g., 'release-8.5: cherry-pick statement summary group-by-user and persist-evicted support' to clearly indicate the main changes.
✅ Passed checks (3 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description is comprehensive and well-structured, addressing all key sections of the template including problem statement, changes, test coverage, side effects, and documentation.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@nolouch nolouch changed the title release-8.5: cherry-pick statement summary fixes release-8.5: cherry-pick improments of statement summary coverage Jun 11, 2026
@ti-chi-bot ti-chi-bot Bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 11, 2026
@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 82.99320% with 50 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release-8.5@27e4cbc). Learn more about missing BASE report.

Additional details and impacted files
@@               Coverage Diff                @@
##             release-8.5     #69129   +/-   ##
================================================
  Coverage               ?   55.0817%           
================================================
  Files                  ?       1847           
  Lines                  ?     665676           
  Branches               ?          0           
================================================
  Hits                   ?     366666           
  Misses                 ?     271655           
  Partials               ?      27355           
Flag Coverage Δ
integration 38.1765% <9.1836%> (?)
unit 65.1490% <82.9932%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 55.1998% <0.0000%> (?)
parser ∅ <0.0000%> (?)
br 54.5572% <0.0000%> (?)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
pkg/metrics/stmtsummary.go (1)

91-106: ⚡ Quick win

Combine duplicate switch cases for v1 and v2.

The v1 and v2 cases execute identical code (three lines each). Combine them into a single case using Go's multi-case syntax.

♻️ Refactor to eliminate duplication
 func SetStmtSummaryWindowMetrics(typ string, recordCount, evictedCount float64) {
 	switch typ {
-	case StmtSummaryTypeV1:
-		recordGauge, evictedGauge := getStmtSummaryWindowMetricsLocked(typ)
-		recordGauge.Set(recordCount)
-		evictedGauge.Set(evictedCount)
-	case StmtSummaryTypeV2:
+	case StmtSummaryTypeV1, StmtSummaryTypeV2:
 		recordGauge, evictedGauge := getStmtSummaryWindowMetricsLocked(typ)
 		recordGauge.Set(recordCount)
 		evictedGauge.Set(evictedCount)
 	default:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/metrics/stmtsummary.go` around lines 91 - 106, The switch in
SetStmtSummaryWindowMetrics duplicates the same logic for StmtSummaryTypeV1 and
StmtSummaryTypeV2; combine them into a single multi-case branch (case
StmtSummaryTypeV1, StmtSummaryTypeV2:) that calls
getStmtSummaryWindowMetricsLocked(typ) and sets recordGauge and evictedGauge,
leaving the default branch that uses
StmtSummaryWindowRecordCount.WithLabelValues(typ).Set(...) and
StmtSummaryWindowEvictedCount.WithLabelValues(typ).Set(...) unchanged.
pkg/util/stmtsummary/v2/stmtsummary.go (1)

856-876: ⚡ Quick win

Consider making the dual-backend setter more robust to future error scenarios.

SetGroupByUser applies the setting to both v1 and v2 backends sequentially. If v1 succeeds but v2 fails (or vice versa), the two backends will have inconsistent grouping modes. Currently both methods always return nil, so this doesn't happen in practice, but future changes to either implementation could introduce real errors and cause subtle bugs.

♻️ Proposed refactor to collect both errors
 func SetGroupByUser(v bool) error {
-	if err := stmtsummary.StmtSummaryByDigestMap.SetGroupByUser(v); err != nil {
-		return err
-	}
+	var errs []error
+	if err := stmtsummary.StmtSummaryByDigestMap.SetGroupByUser(v); err != nil {
+		errs = append(errs, fmt.Errorf("v1: %w", err))
+	}
 	if GlobalStmtSummary != nil {
-		return GlobalStmtSummary.SetGroupByUser(v)
+		if err := GlobalStmtSummary.SetGroupByUser(v); err != nil {
+			errs = append(errs, fmt.Errorf("v2: %w", err))
+		}
+	}
+	if len(errs) > 0 {
+		return errors.Join(errs...)
 	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/util/stmtsummary/v2/stmtsummary.go` around lines 856 - 876,
SetGroupByUser currently calls stmtsummary.StmtSummaryByDigestMap.SetGroupByUser
and GlobalStmtSummary.SetGroupByUser sequentially but returns early on the first
error, risking inconsistent state; change it to call both targets
unconditionally, capture any errors from
stmtsummary.StmtSummaryByDigestMap.SetGroupByUser and (if GlobalStmtSummary !=
nil) GlobalStmtSummary.SetGroupByUser, and return a combined error (e.g.,
concatenate messages or wrap both using fmt.Errorf or a multi-error helper) so
callers see all failures while both backends are attempted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/metrics/stmtsummary.go`:
- Around line 91-106: The switch in SetStmtSummaryWindowMetrics duplicates the
same logic for StmtSummaryTypeV1 and StmtSummaryTypeV2; combine them into a
single multi-case branch (case StmtSummaryTypeV1, StmtSummaryTypeV2:) that calls
getStmtSummaryWindowMetricsLocked(typ) and sets recordGauge and evictedGauge,
leaving the default branch that uses
StmtSummaryWindowRecordCount.WithLabelValues(typ).Set(...) and
StmtSummaryWindowEvictedCount.WithLabelValues(typ).Set(...) unchanged.

In `@pkg/util/stmtsummary/v2/stmtsummary.go`:
- Around line 856-876: SetGroupByUser currently calls
stmtsummary.StmtSummaryByDigestMap.SetGroupByUser and
GlobalStmtSummary.SetGroupByUser sequentially but returns early on the first
error, risking inconsistent state; change it to call both targets
unconditionally, capture any errors from
stmtsummary.StmtSummaryByDigestMap.SetGroupByUser and (if GlobalStmtSummary !=
nil) GlobalStmtSummary.SetGroupByUser, and return a combined error (e.g.,
concatenate messages or wrap both using fmt.Errorf or a multi-error helper) so
callers see all failures while both backends are attempted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 34943a00-e56f-40b1-b4b6-6ab613137f29

📥 Commits

Reviewing files that changed from the base of the PR and between 609096e and e704ff7.

📒 Files selected for processing (20)
  • pkg/executor/select.go
  • pkg/metrics/BUILD.bazel
  • pkg/metrics/metrics.go
  • pkg/metrics/metrics_internal_test.go
  • pkg/metrics/stmtsummary.go
  • pkg/sessionctx/sessionstates/session_states_test.go
  • pkg/sessionctx/variable/session.go
  • pkg/sessionctx/variable/sysvar.go
  • pkg/sessionctx/variable/tidb_vars.go
  • pkg/util/stmtsummary/BUILD.bazel
  • pkg/util/stmtsummary/evicted_test.go
  • pkg/util/stmtsummary/statement_summary.go
  • pkg/util/stmtsummary/statement_summary_test.go
  • pkg/util/stmtsummary/v2/BUILD.bazel
  • pkg/util/stmtsummary/v2/logger.go
  • pkg/util/stmtsummary/v2/reader.go
  • pkg/util/stmtsummary/v2/reader_test.go
  • pkg/util/stmtsummary/v2/record_test.go
  • pkg/util/stmtsummary/v2/stmtsummary.go
  • pkg/util/stmtsummary/v2/stmtsummary_test.go

@tiprow

tiprow Bot commented Jun 11, 2026

Copy link
Copy Markdown

@nolouch: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
fast_test_tiprow_for_release e704ff7 link true /test fast_test_tiprow_for_release

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@nolouch nolouch requested a review from yibin87 June 12, 2026 01:47

@yibin87 yibin87 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot Bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jun 12, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 12, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: terry1purcell, yibin87

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added approved lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jun 12, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 12, 2026

Copy link
Copy Markdown

[LGTM Timeline notifier]

Timeline:

  • 2026-06-12 01:49:03.687106987 +0000 UTC m=+1097444.757424376: ☑️ agreed by yibin87.
  • 2026-06-12 02:00:56.806953511 +0000 UTC m=+1098157.877270902: ☑️ agreed by terry1purcell.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved do-not-merge/cherry-pick-not-approved lgtm release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants