-
Notifications
You must be signed in to change notification settings - Fork 855
Description
Description
Description
TLDR: The current per-request log buffering implementation suffers from a concurrency bug that renders it unusable for production use, as exceptions are regularly thrown under load.
Details: LogBufferingFilterRuleSelector has an unsynchronised race condition between Select() and InvalidateCache() that causes InvalidOperationException: Collection was modified; enumeration operation may not execute under concurrent load.
LogBufferingFilterRuleSelector is a shared singleton injected via DI. Multiple IncomingRequestLogBuffer instances (one per HTTP request) share the same LogBufferingFilterRuleSelector instance. Select() is called on every log call from any request thread. InvalidateCache() is called from IncomingRequestLogBuffer.Dispose() when a request completes. These two methods race on the internal List<LogBufferingFilterRule> values stored in _ruleCache.
This is easily reproduceable - see "Minimal Reproduction" section below and attached repro project.
Root Cause
NB: The analysis in this "Root Cause" section was generated by Claude Opus 4.6 Extended. However, please note that I have manually validated the reproduction steps and encountered the described issue many times during testing! The issue is sufficiently serious that I believe log buffering cannot currently be used in production.
Analysis:
LogBufferingFilterRuleSelector uses a ConcurrentDictionary<(LogLevel, EventId), List<LogBufferingFilterRule>> field called _ruleCache to cache filter rule candidates. The values in this dictionary are List<T> instances obtained from an ObjectPool.
ConcurrentDictionary guarantees thread-safety for its own operations (adding, removing, and retrieving entries). However, it provides no protection for the objects stored as values after they have been retrieved. Once two threads both hold a reference to the same List<T> value, any concurrent operations on that list are unsynchronised.
This is the source of the race condition.
Select() retrieves a cached list, then iterates it:
// Step 1: retrieve (or create) a List<LogBufferingFilterRule> from the cache
List<LogBufferingFilterRule> ruleCandidates = _ruleCache.GetOrAdd((logLevel, eventId), _ =>
{
List<LogBufferingFilterRule> candidates = _cachedRulePool.Get();
foreach (LogBufferingFilterRule rule in rules)
{
if (IsMatch(rule, logLevel, eventId))
candidates.Add(rule);
}
return candidates;
});
// Step 2: iterate the cached list to find the best matching rule
foreach (LogBufferingFilterRule ruleCandidate in ruleCandidates) // <-- enumerates the List<T>
{
if (IsAttributesMatch(ruleCandidate, attributes) && IsBetter(currentBest, ruleCandidate))
currentBest = ruleCandidate;
}The foreach at step 2 compiles to a List<T>.Enumerator. On each iteration, List<T>.Enumerator.MoveNext() checks whether the list's internal version counter has changed since enumeration began. If it has, MoveNext() throws InvalidOperationException.
InvalidateCache() clears those same cached lists by returning them to the object pool:
public void InvalidateCache()
{
foreach (((LogLevel, EventId) key, List<LogBufferingFilterRule> value) in _ruleCache)
{
_cachedRulePool.Return(value); // ObjectPool<List<T>>.Return() calls List.Clear()
}
_ruleCache.Clear();
}ObjectPool<List<T>>.Return() calls List<T>.Clear() on the returned list, which increments the list's internal version counter.
The crash sequence:
- Request A calls
Select(). Step 1 retrieves aList<LogBufferingFilterRule>reference from_ruleCache. Step 2 begins iterating it —List<T>.Enumeratorcaptures the list's current version counter. - Request B completes. Its
IncomingRequestLogBuffer.Dispose()callsInvalidateCache(), which calls_cachedRulePool.Return(value)on the same list. This callsList<T>.Clear(), incrementing the version counter. - Request A's
foreachat step 2 callsList<T>.Enumerator.MoveNext(). The version counter no longer matches, so it throwsInvalidOperationException: Collection was modified; enumeration operation may not execute.
Package Versions
Microsoft.AspNetCore.Diagnostics.Middleware10.3.0Microsoft.Extensions.Telemetry10.3.0
Reproduction Steps
Minimal Reproduction
A single-file .NET 10 minimal API project that reproduces the crash under concurrent load.
Program.cs
#pragma warning disable EXTEXP0003 // Log buffering is experimental
using Microsoft.Extensions.Diagnostics.Buffering;
using Microsoft.Extensions.Logging;
var builder = WebApplication.CreateBuilder(args);
builder.Logging.AddPerIncomingRequestBuffer(options =>
{
options.Rules.Add(new LogBufferingFilterRule(logLevel: LogLevel.Information));
});
var app = builder.Build();
app.MapGet("/", (ILogger<Program> logger) =>
{
// Log enough to populate the rule selector's cache and increase
// the chance of Select() iterating while InvalidateCache() fires.
for (int i = 0; i < 20; i++)
{
logger.LogInformation("Request log entry {Index}", i);
}
return "OK";
});
app.Run("http://localhost:5000");Repro.csproj
<Project Sdk="Microsoft.NET.Sdk.Web">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<NoWarn>$(NoWarn);EXTEXP0003</NoWarn>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.AspNetCore.Diagnostics.Middleware" Version="10.3.0" />
<PackageReference Include="Microsoft.Extensions.Telemetry" Version="10.3.0" />
</ItemGroup>
</Project>Steps to reproduce
- Run the application (e.g. with the debugger enabled + break on all exceptions).
- Hit it with concurrent requests, for example using [bombardier](https://github.com/codesenberg/bombardier):
bombardier -c 200 -d 30s http://localhost:5000/
- Within seconds, the application throws
InvalidOperationException: Collection was modified; enumeration operation may not executewith the stack trace shown above.
The race occurs because each completing request calls IncomingRequestLogBuffer.Dispose() → LogBufferingFilterRuleSelector.InvalidateCache(), which clears the cached List<T> instances while other in-flight requests are iterating them in Select().
Expected behavior
No exceptions should be thrown.
Actual behavior
Stack Trace
at System.Collections.Generic.List`1.Enumerator.MoveNext()
at Microsoft.Extensions.Diagnostics.Buffering.LogBufferingFilterRuleSelector.Select(IList`1 rules, LogLevel logLevel, EventId eventId, IReadOnlyList`1 attributes)
at Microsoft.AspNetCore.Diagnostics.Buffering.IncomingRequestLogBuffer.TryEnqueue[TState](LogEntry`1 logEntry)
at Microsoft.AspNetCore.Diagnostics.Buffering.PerRequestLogBufferManager.TryEnqueue[TState](IBufferedLogger bufferedLogger, LogEntry`1& logEntry)
at Microsoft.Extensions.Logging.ExtendedLogger.LegacyPath[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
at Microsoft.Extensions.Logging.ExtendedLogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
at Microsoft.Extensions.Logging.Logger`1.Microsoft.Extensions.Logging.ILogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
at Microsoft.Extensions.Logging.LoggerMessage.<>c__DisplayClass12_0`2.<Define>g__Log|0(ILogger logger, T1 arg1, T2 arg2, Exception exception)
at Microsoft.AspNetCore.Routing.Matching.DfaMatcher.MatchAsync(HttpContext httpContext)
at Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware.Invoke(HttpContext httpContext)
Regression?
Don't know
Known Workarounds
None
Configuration
Host:
Version: 10.0.3
Architecture: x64
Commit: c2435c3e0f
Running on Windows 11
Other information
Possible Fixes
- Snapshot the cached list to an array before iterating in Select().
- Do not return cached lists to the pool in InvalidateCache() while they may still be referenced by concurrent Select() callers (e.g. use a lock or interlocked swap pattern).
- Use an immutable collection (e.g. ImmutableArray) for cached rule sets instead of pooled List.
Not sure which approach would be most performant.