Add built-in OpenTelemetry metrics to MemoryCache#126146
Add built-in OpenTelemetry metrics to MemoryCache#126146rjmurillo wants to merge 1 commit intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @dotnet/area-extensions-caching |
There was a problem hiding this comment.
Pull request overview
Adds built-in OpenTelemetry metrics support to Microsoft.Extensions.Caching.Memory.MemoryCache by exposing additional cache statistics and wiring them into observable instruments, gated by MemoryCacheOptions.TrackStatistics.
Changes:
- Added
TotalEvictionstoMemoryCacheStatisticsand tracked eviction counts inMemoryCache. - Introduced
MemoryCacheOptions.Nameand newMemoryCachector overload supportingIMeterFactory. - Added unit tests for metrics publication and eviction statistics, plus conditional DiagnosticSource references for non-inbox TFMs.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/libraries/Microsoft.Extensions.Caching.Memory/src/MemoryCache.cs | Implements eviction counting and publishes OTEL observable instruments via Meter/IMeterFactory. |
| src/libraries/Microsoft.Extensions.Caching.Memory/src/MemoryCacheOptions.cs | Adds Name option used as a metrics dimension. |
| src/libraries/Microsoft.Extensions.Caching.Memory/src/Microsoft.Extensions.Caching.Memory.csproj | Adds conditional DiagnosticSource reference for TFMs missing in-box metrics types. |
| src/libraries/Microsoft.Extensions.Caching.Memory/ref/Microsoft.Extensions.Caching.Memory.cs | Updates public API surface for new ctor overload and Name option. |
| src/libraries/Microsoft.Extensions.Caching.Memory/tests/MemoryCacheMetricsTests.cs | Adds tests validating instrument creation and basic measurements. |
| src/libraries/Microsoft.Extensions.Caching.Memory/tests/MemoryCacheGetCurrentStatisticsTests.cs | Adds tests for eviction statistics behavior. |
| src/libraries/Microsoft.Extensions.Caching.Memory/tests/Microsoft.Extensions.Caching.Memory.Tests.csproj | Adds conditional DiagnosticSource project reference for .NETFramework tests. |
| src/libraries/Microsoft.Extensions.Caching.Abstractions/src/MemoryCacheStatistics.cs | Adds TotalEvictions to the statistics snapshot type. |
| src/libraries/Microsoft.Extensions.Caching.Abstractions/ref/Microsoft.Extensions.Caching.Abstractions.cs | Updates ref assembly for TotalEvictions. |
| if (_options.TrackStatistics) | ||
| { | ||
| _meter = meterFactory?.Create("Microsoft.Extensions.Caching.Memory.MemoryCache") ?? SharedMeter.Instance; | ||
| InitializeMetrics(); | ||
| } |
There was a problem hiding this comment.
InitializeMetrics() creates new observable instruments per MemoryCache instance. Observable instruments are not cached by Meter (unlike counters/histograms), and DefaultMeterFactory caches meters by name+tags, so multiple MemoryCache instances will typically share the same Meter and end up publishing duplicate instruments with the same names. This can cause unbounded instrument growth, duplicated metric streams, and leaked MemoryCache instances because the observable callbacks capture this and cannot be unregistered when the cache is disposed. Consider creating the observable instruments once per meter (static) and having the callbacks enumerate active caches (e.g., via a weak-reference registry), or otherwise ensuring a per-cache meter identity and a way to stop publishing on Dispose.
There was a problem hiding this comment.
@rjmurillo I think we want the KeyValuePair<string, object?> cacheNameTag = new("cache.name", _options.Name); here in .Create() call to have proper deduplication.
Also Dispose method should call _meter?.Dispose() for completeness.
| coherentState.RemoveEntry(entry, _options); | ||
|
|
||
| if (_allStats is not null) | ||
| { | ||
| Interlocked.Increment(ref _accumulatedEvictions); | ||
| } | ||
| } |
There was a problem hiding this comment.
Eviction statistics are incremented after calling coherentState.RemoveEntry(...) without knowing whether the entry was actually removed. CoherentState.RemoveEntry only removes if the key/value pair matches, so under concurrency the remove can fail and TotalEvictions will be over-counted. Consider having RemoveEntry return a bool (or adding a dedicated TryRemoveEntry) and only increment when removal succeeds.
| _coherentState.RemoveEntry(entry, _options); | ||
|
|
||
| if (_allStats is not null) |
There was a problem hiding this comment.
EntryExpired increments TotalEvictions unconditionally after RemoveEntry, but CoherentState.RemoveEntry is a best-effort remove that can fail if the entry was already removed/replaced. This can over-count evictions in concurrent scenarios. Consider incrementing only when the removal actually happens (e.g., via a bool return from RemoveEntry).
| _coherentState.RemoveEntry(entry, _options); | |
| if (_allStats is not null) | |
| bool removed = _coherentState.RemoveEntry(entry, _options); | |
| if (_allStats is not null && removed) |
| coherentState.RemoveEntry(entry, _options); | ||
|
|
||
| if (_allStats is not null) |
There was a problem hiding this comment.
ScanForExpiredItems increments TotalEvictions even if RemoveEntry didn't actually remove the entry (e.g., due to concurrent removals/updates). This can inflate eviction counts. Consider tracking whether the removal succeeded (e.g., changing RemoveEntry to return bool) and only incrementing on success.
| coherentState.RemoveEntry(entry, _options); | |
| if (_allStats is not null) | |
| bool removed = coherentState.RemoveEntry(entry, _options); | |
| if (removed && _allStats is not null) |
| foreach (CacheEntry entry in entriesToRemove) | ||
| { | ||
| coherentState.RemoveEntry(entry, _options); | ||
| } | ||
|
|
||
| if (_allStats is not null && entriesToRemove.Count > 0) | ||
| { | ||
| Interlocked.Add(ref _accumulatedEvictions, entriesToRemove.Count); | ||
| } |
There was a problem hiding this comment.
In Compact, eviction stats are updated using entriesToRemove.Count, but RemoveEntry can fail to remove individual entries under concurrency; this can over-count evictions. Consider counting successful removals (or returning bool from RemoveEntry) instead of assuming all candidates were removed.
2d7d40d to
f3ef1c4
Compare
cincuranet
left a comment
There was a problem hiding this comment.
For the _accumulatedEvictions comments from Copilot, I think the idea with bool return is sound.
Or maybe putting the logic into RemoveEntry - given there's already _cacheSize handling - with a flag whether to count or not (not all RemoveEntry calls (should) update _accumulatedEvictions).
Up to you.
| if (_options.TrackStatistics) | ||
| { | ||
| _meter = meterFactory?.Create("Microsoft.Extensions.Caching.Memory.MemoryCache") ?? SharedMeter.Instance; | ||
| InitializeMetrics(); | ||
| } |
There was a problem hiding this comment.
@rjmurillo I think we want the KeyValuePair<string, object?> cacheNameTag = new("cache.name", _options.Name); here in .Create() call to have proper deduplication.
Also Dispose method should call _meter?.Dispose() for completeness.
| _meter.CreateObservableCounter("cache.requests", | ||
| () => | ||
| { | ||
| MemoryCacheStatistics? stats = GetCurrentStatistics(); |
There was a problem hiding this comment.
@rjmurillo This will prevent GC collecting MemoryCache instances, because it captures this in the lambda, i.e.: Static root → Meter → _instruments list → ObservableCounter → _callback delegate → this.
I think using new WeakReference<MemoryCache>(this) and adding if (!weakThis.TryGetTarget(out var cache)) return [] (conceptually, probably adding also _disposed into the condition is good idea) is good way to solve it. Basically, after GC collects this instance, the instruments starts returning empty measurements.
Implements dotnet#124140. Adds observable OTEL instruments for cache requests (hit/miss), evictions, entry count, and estimated size. Key design decisions: - MeterOptions with cache.name tag for per-cache meter deduplication - WeakReference<MemoryCache> in observable callbacks to prevent GC leaks - RemoveEntry returns bool for accurate eviction counting - IEnumerable<Measurement<long>> overloads to avoid phantom zero measurements - No instruments without IMeterFactory (GetCurrentStatistics still works) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
f3ef1c4 to
98aa196
Compare
| private void InitializeMetrics(Meter meter) | ||
| { | ||
| var weakThis = new WeakReference<MemoryCache>(this); | ||
| KeyValuePair<string, object?> cacheNameTag = new("cache.name", _options.Name); | ||
|
|
||
| meter.CreateObservableCounter("cache.requests", | ||
| () => | ||
| { | ||
| if (!weakThis.TryGetTarget(out MemoryCache? cache) || cache._disposed) | ||
| { | ||
| return []; | ||
| } | ||
|
|
||
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | ||
| return stats is null | ||
| ? [] | ||
| : new Measurement<long>[] | ||
| { | ||
| new(stats.TotalHits, cacheNameTag, new("cache.request.type", "hit")), | ||
| new(stats.TotalMisses, cacheNameTag, new("cache.request.type", "miss")), | ||
| }; | ||
| }, | ||
| unit: "{requests}", | ||
| description: "Total cache requests."); | ||
|
|
||
| meter.CreateObservableCounter<long>("cache.evictions", | ||
| () => | ||
| { | ||
| if (!weakThis.TryGetTarget(out MemoryCache? cache) || cache._disposed) | ||
| { | ||
| return []; | ||
| } | ||
|
|
||
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | ||
| return stats is null | ||
| ? [] | ||
| : [new Measurement<long>(stats.TotalEvictions, cacheNameTag)]; | ||
| }, | ||
| unit: "{evictions}", | ||
| description: "Total cache evictions."); | ||
|
|
||
| meter.CreateObservableUpDownCounter<long>("cache.entries", | ||
| () => | ||
| { | ||
| if (!weakThis.TryGetTarget(out MemoryCache? cache) || cache._disposed) | ||
| { | ||
| return []; | ||
| } | ||
|
|
||
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | ||
| return stats is null | ||
| ? [] | ||
| : [new Measurement<long>(stats.CurrentEntryCount, cacheNameTag)]; | ||
| }, | ||
| unit: "{entries}", | ||
| description: "Current number of cache entries."); | ||
|
|
||
| meter.CreateObservableGauge<long>("cache.estimated_size", | ||
| () => | ||
| { | ||
| if (!weakThis.TryGetTarget(out MemoryCache? cache) || cache._disposed) | ||
| { | ||
| return []; | ||
| } | ||
|
|
||
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | ||
| return stats?.CurrentEstimatedSize is long size | ||
| ? [new Measurement<long>(size, cacheNameTag)] | ||
| : []; | ||
| }, | ||
| unit: "By", | ||
| description: "Estimated size of the cache."); | ||
| } | ||
|
|
There was a problem hiding this comment.
These observable instruments are created per MemoryCache instance, but IMeterFactory implementations (e.g., DefaultMeterFactory) cache and may return the same Meter for multiple caches with the same name+tags. Meter.CreateObservable* does not use the Meter’s instrument cache, so this can publish duplicate instruments (and callbacks) on a shared Meter, leading to duplicated/incorrect metric streams and unbounded instrument/callback growth. Consider a design that ensures per-cache meter identity even for duplicate Name values, or register instruments once and have callbacks enumerate active caches (e.g., via a weak registry).
| private void InitializeMetrics(Meter meter) | |
| { | |
| var weakThis = new WeakReference<MemoryCache>(this); | |
| KeyValuePair<string, object?> cacheNameTag = new("cache.name", _options.Name); | |
| meter.CreateObservableCounter("cache.requests", | |
| () => | |
| { | |
| if (!weakThis.TryGetTarget(out MemoryCache? cache) || cache._disposed) | |
| { | |
| return []; | |
| } | |
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | |
| return stats is null | |
| ? [] | |
| : new Measurement<long>[] | |
| { | |
| new(stats.TotalHits, cacheNameTag, new("cache.request.type", "hit")), | |
| new(stats.TotalMisses, cacheNameTag, new("cache.request.type", "miss")), | |
| }; | |
| }, | |
| unit: "{requests}", | |
| description: "Total cache requests."); | |
| meter.CreateObservableCounter<long>("cache.evictions", | |
| () => | |
| { | |
| if (!weakThis.TryGetTarget(out MemoryCache? cache) || cache._disposed) | |
| { | |
| return []; | |
| } | |
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | |
| return stats is null | |
| ? [] | |
| : [new Measurement<long>(stats.TotalEvictions, cacheNameTag)]; | |
| }, | |
| unit: "{evictions}", | |
| description: "Total cache evictions."); | |
| meter.CreateObservableUpDownCounter<long>("cache.entries", | |
| () => | |
| { | |
| if (!weakThis.TryGetTarget(out MemoryCache? cache) || cache._disposed) | |
| { | |
| return []; | |
| } | |
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | |
| return stats is null | |
| ? [] | |
| : [new Measurement<long>(stats.CurrentEntryCount, cacheNameTag)]; | |
| }, | |
| unit: "{entries}", | |
| description: "Current number of cache entries."); | |
| meter.CreateObservableGauge<long>("cache.estimated_size", | |
| () => | |
| { | |
| if (!weakThis.TryGetTarget(out MemoryCache? cache) || cache._disposed) | |
| { | |
| return []; | |
| } | |
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | |
| return stats?.CurrentEstimatedSize is long size | |
| ? [new Measurement<long>(size, cacheNameTag)] | |
| : []; | |
| }, | |
| unit: "By", | |
| description: "Estimated size of the cache."); | |
| } | |
| private static readonly ConditionalWeakTable<Meter, CacheMeterRegistration> s_meterRegistrations = new(); | |
| private sealed class CacheMeterRegistration | |
| { | |
| private readonly Meter _meter; | |
| private readonly object _gate = new(); | |
| private readonly List<WeakReference<MemoryCache>> _caches = []; | |
| internal CacheMeterRegistration(Meter meter) | |
| { | |
| _meter = meter ?? throw new ArgumentNullException(nameof(meter)); | |
| _meter.CreateObservableCounter("cache.requests", | |
| ObserveRequests, | |
| unit: "{requests}", | |
| description: "Total cache requests."); | |
| _meter.CreateObservableCounter<long>("cache.evictions", | |
| ObserveEvictions, | |
| unit: "{evictions}", | |
| description: "Total cache evictions."); | |
| _meter.CreateObservableUpDownCounter<long>("cache.entries", | |
| ObserveEntries, | |
| unit: "{entries}", | |
| description: "Current number of cache entries."); | |
| _meter.CreateObservableGauge<long>("cache.estimated_size", | |
| ObserveEstimatedSize, | |
| unit: "By", | |
| description: "Estimated size of the cache."); | |
| } | |
| internal void AddCache(MemoryCache cache) | |
| { | |
| if (cache is null) | |
| { | |
| throw new ArgumentNullException(nameof(cache)); | |
| } | |
| lock (_gate) | |
| { | |
| _caches.Add(new WeakReference<MemoryCache>(cache)); | |
| } | |
| } | |
| private IEnumerable<Measurement<long>> ObserveRequests() | |
| { | |
| List<Measurement<long>> measurements = []; | |
| lock (_gate) | |
| { | |
| for (int i = 0; i < _caches.Count; i++) | |
| { | |
| WeakReference<MemoryCache> weak = _caches[i]; | |
| if (!weak.TryGetTarget(out MemoryCache? cache) || cache._disposed) | |
| { | |
| _caches.RemoveAt(i); | |
| i--; | |
| continue; | |
| } | |
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | |
| if (stats is null) | |
| { | |
| continue; | |
| } | |
| KeyValuePair<string, object?> cacheNameTag = new("cache.name", cache._options.Name); | |
| measurements.Add(new Measurement<long>(stats.TotalHits, cacheNameTag, new("cache.request.type", "hit"))); | |
| measurements.Add(new Measurement<long>(stats.TotalMisses, cacheNameTag, new("cache.request.type", "miss"))); | |
| } | |
| } | |
| return measurements; | |
| } | |
| private IEnumerable<Measurement<long>> ObserveEvictions() | |
| { | |
| List<Measurement<long>> measurements = []; | |
| lock (_gate) | |
| { | |
| for (int i = 0; i < _caches.Count; i++) | |
| { | |
| WeakReference<MemoryCache> weak = _caches[i]; | |
| if (!weak.TryGetTarget(out MemoryCache? cache) || cache._disposed) | |
| { | |
| _caches.RemoveAt(i); | |
| i--; | |
| continue; | |
| } | |
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | |
| if (stats is null) | |
| { | |
| continue; | |
| } | |
| KeyValuePair<string, object?> cacheNameTag = new("cache.name", cache._options.Name); | |
| measurements.Add(new Measurement<long>(stats.TotalEvictions, cacheNameTag)); | |
| } | |
| } | |
| return measurements; | |
| } | |
| private IEnumerable<Measurement<long>> ObserveEntries() | |
| { | |
| List<Measurement<long>> measurements = []; | |
| lock (_gate) | |
| { | |
| for (int i = 0; i < _caches.Count; i++) | |
| { | |
| WeakReference<MemoryCache> weak = _caches[i]; | |
| if (!weak.TryGetTarget(out MemoryCache? cache) || cache._disposed) | |
| { | |
| _caches.RemoveAt(i); | |
| i--; | |
| continue; | |
| } | |
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | |
| if (stats is null) | |
| { | |
| continue; | |
| } | |
| KeyValuePair<string, object?> cacheNameTag = new("cache.name", cache._options.Name); | |
| measurements.Add(new Measurement<long>(stats.CurrentEntryCount, cacheNameTag)); | |
| } | |
| } | |
| return measurements; | |
| } | |
| private IEnumerable<Measurement<long>> ObserveEstimatedSize() | |
| { | |
| List<Measurement<long>> measurements = []; | |
| lock (_gate) | |
| { | |
| for (int i = 0; i < _caches.Count; i++) | |
| { | |
| WeakReference<MemoryCache> weak = _caches[i]; | |
| if (!weak.TryGetTarget(out MemoryCache? cache) || cache._disposed) | |
| { | |
| _caches.RemoveAt(i); | |
| i--; | |
| continue; | |
| } | |
| MemoryCacheStatistics? stats = cache.GetCurrentStatistics(); | |
| if (stats?.CurrentEstimatedSize is not long size) | |
| { | |
| continue; | |
| } | |
| KeyValuePair<string, object?> cacheNameTag = new("cache.name", cache._options.Name); | |
| measurements.Add(new Measurement<long>(size, cacheNameTag)); | |
| } | |
| } | |
| return measurements; | |
| } | |
| } | |
| private void InitializeMetrics(Meter meter) | |
| { | |
| CacheMeterRegistration registration = s_meterRegistrations.GetValue(meter, static m => new CacheMeterRegistration(m)); | |
| registration.AddCache(this); | |
| } |
| { | ||
| if (disposing) | ||
| { | ||
| _meter?.Dispose(); |
There was a problem hiding this comment.
Disposing the Meter returned from IMeterFactory can be problematic because the factory owns meter lifetime and may return shared/cached meters (the default DI factory explicitly makes Meter.Dispose a no-op). Consider not disposing meters created via IMeterFactory (or only disposing when you created/own a private/shared Meter yourself) to avoid interfering with other components using the same cached Meter in custom IMeterFactory implementations.
| _meter?.Dispose(); |
| /// Gets the total number of cache evictions. | ||
| /// </summary> |
There was a problem hiding this comment.
The new TotalEvictions documentation is vague about what counts as an eviction. The implementation/tests exclude explicit Remove/Clear (EvictionReason.Removed) and also don’t appear to count replacements; consider clarifying this in the XML docs to prevent consumers from misinterpreting the metric/statistic.
| /// Gets the total number of cache evictions. | |
| /// </summary> | |
| /// Gets the total number of cache entries evicted by the cache. | |
| /// </summary> | |
| /// <remarks> | |
| /// This count includes entries removed due to cache eviction policies such as expiration or capacity limits. | |
| /// It does not include entries removed explicitly by user code (for example, via <c>Remove</c> or <c>Clear</c>), | |
| /// and does not include entries that were replaced by new values. | |
| /// </remarks> |
Summary
Implements #124140 — adds built-in OpenTelemetry metrics to
Microsoft.Extensions.Caching.Memory.MemoryCache.API Changes
Microsoft.Extensions.Caching.Abstractions
Microsoft.Extensions.Caching.Memory
Observable Instruments
Meter name:
Microsoft.Extensions.Caching.Memory.MemoryCachecache.requestscache.name,cache.request.type(hit/miss)cache.evictionscache.namecache.entriescache.namecache.estimated_sizecache.nameKey Design Decisions
TrackStatistics: Instruments only created whenTrackStatistics = trueMemoryCacheServiceCollectionExtensions— DI picks 3-param ctor whenIMeterFactoryregistered, 2-param otherwiseIMeterFactoryprovided, uses a singletonSharedMeter(NOP Dispose) following theSystem.Net.HttppatternPostProcessTryGetValue,EntryExpired,ScanForExpiredItems,Compact. ExplicitRemove()/Clear()excluded (EvictionReason.Removed)NetCoreAppCurrentTFMs (types are in-box for net11.0+)Testing
Fixes #124140