client: add pre-throttling demand RU/s metric #10582
JmPotato wants to merge 3 commits into tikv:master
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. Needs approval from an approver for each of the changed files; approvers can indicate their approval by writing a comment.
📝 Walkthrough: Adds client-side pre-throttling demand RU/s tracking: records total demanded RU, computes a time-aware EMA of demanded RU/s, exposes it via a new per-resource-group Prometheus gauge, and ensures metric label cleanup during resource-group removal. Includes tests validating accumulation and EMA.
Sequence diagram:

```mermaid
sequenceDiagram
    participant Client
    participant GroupController
    participant TokenCounter
    participant Metrics
    Client->>GroupController: Send request (RU v) -- pre-throttle sample
    GroupController->>TokenCounter: recordDemand(v) and calcDemandAvg(now)
    TokenCounter-->>GroupController: updated avgDemandRUPerSec
    GroupController->>Metrics: DemandRUPerSecGauge.Set(resource_group, avgDemandRUPerSec)
    GroupController->>TokenCounter: Reserve()/AcquireTokens() (may block/reject)
    TokenCounter-->>GroupController: allow/reject + consumption accounting
```
🚥 Pre-merge checks: 4 passed ✅, 1 failed ❌ (warning).
```go
tokenRequestCounter:     metrics.ResourceGroupTokenRequestCounter.WithLabelValues(oldName, name),
runningKVRequestCounter: metrics.GroupRunningKVRequestCounter.WithLabelValues(name),
consumeTokenHistogram:   metrics.TokenConsumedHistogram.WithLabelValues(name),
demandRUPerSecGauge:     metrics.DemandRUPerSecGauge.WithLabelValues(name),
```
Will these metrics be cleaned up?
Good catch. On the current master only ResourceGroupStatusGauge is cleaned in cleanUpResourceGroup, so this new gauge would leak label series the same way most other per-group metrics already do. The leak impact is small in practice (resource groups are typically long-lived, client-side only, bounded per process), but it's still a real leak and not a design decision. I'll add DeleteLabelValues for DemandRUPerSecGauge in the cleanup path to keep this PR self-consistent. Cleaning up the other pre-existing per-group metrics is out of scope here — I'm tracking that in a separate branch.
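To make the leak concrete, here is a minimal stdlib-only Go sketch. The map stands in for a `prometheus.GaugeVec`, which keeps one child series per label value until it is explicitly deleted; the method names mirror the real `WithLabelValues`/`DeleteLabelValues` API, but the types and the `cleanUpResourceGroup` helper below are illustrative stand-ins, not the pd client code.

```go
package main

import "fmt"

// gaugeVec is a toy stand-in for prometheus.GaugeVec: every label value
// creates a child series that lives until it is explicitly deleted.
type gaugeVec struct{ series map[string]float64 }

func (g *gaugeVec) withLabelValues(group string)   { g.series[group] = 0 }
func (g *gaugeVec) deleteLabelValues(group string) { delete(g.series, group) }

// cleanUpResourceGroup models the fix: removing a resource group must also
// drop its metric series, otherwise the label series leaks for the process
// lifetime even after the group is gone.
func cleanUpResourceGroup(demandGauge *gaugeVec, name string) {
	demandGauge.deleteLabelValues(name)
}

func main() {
	demand := &gaugeVec{series: map[string]float64{}}
	demand.withLabelValues("rg1") // group created: one child series exists
	cleanUpResourceGroup(demand, "rg1")
	fmt.Println("series left:", len(demand.series)) // no leaked label series
}
```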
```go
// ResourceGroupStatusGauge comments placeholder
ResourceGroupStatusGauge *prometheus.GaugeVec
// DemandRUPerSecGauge is the EMA of demanded RU/s before throttling per resource group.
DemandRUPerSecGauge *prometheus.GaugeVec
```
If it is throttled, will the demand be recorded?
Yes — that's exactly the design intent of this metric. Demand is accumulated at request entry (onRequestWaitImpl etc.) before acquireTokens is called, and on throttle error only consumption is rolled back via sub(gc.mu.consumption, delta); demandRUTotal is never subtracted. So throttled requests still count toward demand, which is the whole point of exposing this separately from the existing consumption-based avgRUPerSec. I'll also tighten the Help text / doc comment to make this invariant explicit ("including requests rejected by the token bucket").
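A minimal sketch of that accounting rule, using a simplified stand-in struct (the field names echo the PR, but this is illustrative, not the real cost controller): demand is recorded at request entry and never subtracted, while consumption is rolled back when the token bucket rejects the request.

```go
package main

import "fmt"

// costState is a toy stand-in for the controller's accounting state:
// demandRUTotal only ever grows; consumption is rolled back on rejection.
type costState struct {
	demandRUTotal float64 // monotonic: includes throttled requests
	consumption   float64 // post-throttling: rejections are subtracted back
}

// onRequest records demand before any token acquisition is attempted;
// throttled=true models a token-bucket rejection.
func (s *costState) onRequest(ru float64, throttled bool) {
	s.demandRUTotal += ru // never subtracted, even on rejection
	s.consumption += ru
	if throttled {
		s.consumption -= ru // only consumption is rolled back
	}
}

func main() {
	s := &costState{}
	s.onRequest(10, false) // accepted
	s.onRequest(40, true)  // rejected by the token bucket
	// demand sees both requests; consumption sees only the accepted one
	fmt.Printf("demand=%.0f consumption=%.0f\n", s.demandRUTotal, s.consumption)
}
```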
Add a new client-side Prometheus gauge `resource_manager_client_resource_group_demand_ru_per_sec` that tracks the EMA of demanded RU/s before Resource Control throttling takes effect.

The existing `avgRUPerSec` is based on post-throttling consumption: when a request is rejected by the token bucket, its RU cost is subtracted from the consumption counter. This means the consumption-based EMA underreports the true workload demand when the resource group is actively throttled.

The new demand metric samples RU cost at every `onRequestWaitImpl`, `onResponseImpl`, `onResponseWaitImpl`, and `addRUConsumption` entry point, accumulating into a monotonically increasing `demandRUTotal` counter that is never subtracted on throttle failure. A demand EMA is then computed using the same `movingAvgFactor` as the consumption EMA and flushed to the gauge on each `updateAvgRequestResourcePerSec` tick.

This enables operators to:

- See per-instance RU demand in Grafana (natural `instance` label).
- Aggregate cluster-wide demand via `sum by (resource_group)`.
- Identify the true workload peak via `max_over_time(...)`.

Close tikv#10581

Signed-off-by: JmPotato <ghzpotato@gmail.com>
Signed-off-by: JmPotato <github@ipotato.me>
Address review feedback on tikv#10582:

- Delete `DemandRUPerSecGauge` label series in `cleanUpResourceGroup` so the new gauge does not leak labels when a resource group is deleted. Add a TODO tracking the remaining per-group metrics that still leak (TokenConsumedHistogram, GroupRunningKVRequestCounter, SuccessfulRequestDuration, FailedRequestCounter, ResourceGroupTokenRequestCounter, RequestRetryCounter, FailedLimitReserveDuration).
- Clarify the metric Help text and doc comment to make the pre-throttling semantics explicit: the EMA includes requests rejected by the token bucket, which is the whole reason this metric is exposed separately from the consumption-based `avg_ru_per_sec`.

Signed-off-by: JmPotato <github@ipotato.me>
JmPotato force-pushed the branch from 5933f85 to 4d6937d (compare).
Codecov Report

❌ Patch coverage is

```
@@            Coverage Diff             @@
##           master   #10582      +/-   ##
==========================================
- Coverage   78.96%   78.94%   -0.03%
==========================================
  Files         532      532
  Lines       71883    72003     +120
==========================================
+ Hits        56766    56840      +74
- Misses      11093    11130      +37
- Partials     4024     4033       +9
==========================================
```

Flags with carried forward coverage won't be shown.
The new `demand_ru_per_sec` gauge promised "every entry point, never subtracted on throttle failure", but `onResponseWaitImpl` increments `mu.demandRUTotal` only after `acquireTokens` succeeds, so a throttle rejection silently drops the demand sample -- exactly the case the metric is meant to surface.

Root cause: the increment was co-located with the consumption update inside the post-acquire lock block, even though demand and consumption have different lifetimes (demand is monotonic; consumption is rolled back on rejection). Four call sites carried the same inline expression, making the wrong placement easy to add and hard to notice.

This commit makes the invariant structural:

* Add `(*groupCostController).recordDemand`, the single point where `mu.demandRUTotal` grows. Its doc comment states the rule: callers MUST invoke it before any limiter wait/acquire so demand survives a rejection.
* Route `onRequestWaitImpl`, `onResponseImpl`, `onResponseWaitImpl`, and `addRUConsumption` through `recordDemand`. In `onResponseWaitImpl` this also hoists the call above `acquireTokens`, fixing the bug.
* Add `TestDemandRUCapturedOnResponseWaitThrottle` to lock the invariant in via the throttle-fail path.
* Rewrite the EMA portion of `TestDemandRUTracking`: the previous version assigned `gc.run.now` only to have `updateRunState` immediately overwrite it with `time.Now()`, so the two-tick EMA assertion was a no-op. The new version drives `calcDemandAvg` directly with hand-set timestamps and asserts the actual EMA trajectory.
* Mirror the `acceleratedReportingPeriod` failpoint into `calcDemandAvg` so any test that accelerates `calcAvg` accelerates the demand EMA in lockstep.
* `calcDemandAvg` now returns whether it actually updated; the gauge Set is gated on that so we never re-publish a stale value when no time has elapsed. Drop the `< 0` clamp -- the input counter is monotonically increasing, so the EMA cannot go negative.
* Extend the leak TODO in `cleanUpResourceGroup` to include `LowTokenRequestNotifyCounter`, which has the same per-group label cardinality as the others on the list.

Signed-off-by: JmPotato <github@ipotato.me>
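The `recordDemand` refactor described in this commit can be sketched as follows. The types and method bodies are simplified stand-ins for illustration, not the real pd client code; what the sketch preserves is the structural invariant: demand is recorded before the acquire, so a token-bucket rejection cannot drop the sample.

```go
package main

import (
	"errors"
	"fmt"
)

// groupCostController is a toy stand-in for the client's cost controller.
type groupCostController struct {
	demandRUTotal float64
	consumption   float64
}

var errThrottled = errors.New("token bucket rejected request")

// recordDemand is the single point where demandRUTotal grows.
// Callers MUST invoke it before any limiter wait/acquire.
func (gc *groupCostController) recordDemand(ru float64) { gc.demandRUTotal += ru }

// acquireTokens models the limiter: throttle=true forces a rejection.
func (gc *groupCostController) acquireTokens(ru float64, throttle bool) error {
	if throttle {
		return errThrottled
	}
	gc.consumption += ru
	return nil
}

// onResponseWait hoists recordDemand above acquireTokens, so the demand
// sample survives even when the acquire fails.
func (gc *groupCostController) onResponseWait(ru float64, throttle bool) error {
	gc.recordDemand(ru) // before acquire: survives a rejection
	return gc.acquireTokens(ru, throttle)
}

func main() {
	gc := &groupCostController{}
	err := gc.onResponseWait(25, true) // throttled request
	fmt.Printf("err=%v demand=%.0f consumption=%.0f\n",
		err, gc.demandRUTotal, gc.consumption)
}
```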
@JmPotato: The following tests failed:
```go
gc.mu.Lock()
latestConsumption := *gc.mu.consumption
gc.mu.Unlock()
if equalRU(latestConsumption, *gc.run.consumption) {
```
Will the metrics be deleted if the request RU is large?
What problem does this PR solve?
Issue Number: Close #10581
What is changed and how does it work?
Check List
Tests
Release note