Skip to content

Thread-safety bug: LogBufferingFilterRuleSelector.Select races with InvalidateCache, causing InvalidOperationException - Collection was modified; enumeration operation may not execute #7383

@Sorttech

Description

@Sorttech

Description

Description

TLDR: The current per-request log buffering implementation suffers from a concurrency bug that renders it unusable for production use, as exceptions are regularly thrown under load.

Details: LogBufferingFilterRuleSelector has an unsynchronised race condition between Select() and InvalidateCache() that causes InvalidOperationException: Collection was modified; enumeration operation may not execute under concurrent load.

LogBufferingFilterRuleSelector is a shared singleton injected via DI. Multiple IncomingRequestLogBuffer instances (one per HTTP request) share the same LogBufferingFilterRuleSelector instance. Select() is called on every log call from any request thread. InvalidateCache() is called from IncomingRequestLogBuffer.Dispose() when a request completes. These two methods race on the internal List<LogBufferingFilterRule> values stored in _ruleCache.

This is easily reproduceable - see "Minimal Reproduction" section below and attached repro project.

Root Cause

NB: The analysis in this "Root Cause" section was generated by Claude Opus 4.6 Extended. However, please note that I have manually validated the reproduction steps and encountered the described issue many times during testing! The issue is sufficiently serious that I believe log buffering cannot currently be used in production.

Analysis:

LogBufferingFilterRuleSelector uses a ConcurrentDictionary<(LogLevel, EventId), List<LogBufferingFilterRule>> field called _ruleCache to cache filter rule candidates. The values in this dictionary are List<T> instances obtained from an ObjectPool.

ConcurrentDictionary guarantees thread-safety for its own operations (adding, removing, and retrieving entries). However, it provides no protection for the objects stored as values after they have been retrieved. Once two threads both hold a reference to the same List<T> value, any concurrent operations on that list are unsynchronised.

This is the source of the race condition.

Select() retrieves a cached list, then iterates it:

// Step 1: retrieve (or create) a List<LogBufferingFilterRule> from the cache
List<LogBufferingFilterRule> ruleCandidates = _ruleCache.GetOrAdd((logLevel, eventId), _ =>
{
    List<LogBufferingFilterRule> candidates = _cachedRulePool.Get();
    foreach (LogBufferingFilterRule rule in rules)
    {
        if (IsMatch(rule, logLevel, eventId))
            candidates.Add(rule);
    }
    return candidates;
});

// Step 2: iterate the cached list to find the best matching rule
foreach (LogBufferingFilterRule ruleCandidate in ruleCandidates)  // <-- enumerates the List<T>
{
    if (IsAttributesMatch(ruleCandidate, attributes) && IsBetter(currentBest, ruleCandidate))
        currentBest = ruleCandidate;
}

The foreach at step 2 compiles to a List<T>.Enumerator. On each iteration, List<T>.Enumerator.MoveNext() checks whether the list's internal version counter has changed since enumeration began. If it has, MoveNext() throws InvalidOperationException.

InvalidateCache() clears those same cached lists by returning them to the object pool:

public void InvalidateCache()
{
    foreach (((LogLevel, EventId) key, List<LogBufferingFilterRule> value) in _ruleCache)
    {
        _cachedRulePool.Return(value);  // ObjectPool<List<T>>.Return() calls List.Clear()
    }
    _ruleCache.Clear();
}

ObjectPool<List<T>>.Return() calls List<T>.Clear() on the returned list, which increments the list's internal version counter.

The crash sequence:

  1. Request A calls Select(). Step 1 retrieves a List<LogBufferingFilterRule> reference from _ruleCache. Step 2 begins iterating it — List<T>.Enumerator captures the list's current version counter.
  2. Request B completes. Its IncomingRequestLogBuffer.Dispose() calls InvalidateCache(), which calls _cachedRulePool.Return(value) on the same list. This calls List<T>.Clear(), incrementing the version counter.
  3. Request A's foreach at step 2 calls List<T>.Enumerator.MoveNext(). The version counter no longer matches, so it throws InvalidOperationException: Collection was modified; enumeration operation may not execute.

Package Versions

  • Microsoft.AspNetCore.Diagnostics.Middleware 10.3.0
  • Microsoft.Extensions.Telemetry 10.3.0

Reproduction Steps

Minimal Reproduction

A single-file .NET 10 minimal API project that reproduces the crash under concurrent load.

Program.cs

#pragma warning disable EXTEXP0003 // Log buffering is experimental

using Microsoft.Extensions.Diagnostics.Buffering;
using Microsoft.Extensions.Logging;

var builder = WebApplication.CreateBuilder(args);

builder.Logging.AddPerIncomingRequestBuffer(options =>
{
    options.Rules.Add(new LogBufferingFilterRule(logLevel: LogLevel.Information));
});

var app = builder.Build();

app.MapGet("/", (ILogger<Program> logger) =>
{
    // Log enough to populate the rule selector's cache and increase
    // the chance of Select() iterating while InvalidateCache() fires.
    for (int i = 0; i < 20; i++)
    {
        logger.LogInformation("Request log entry {Index}", i);
    }

    return "OK";
});

app.Run("http://localhost:5000");

Repro.csproj

<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <NoWarn>$(NoWarn);EXTEXP0003</NoWarn>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.AspNetCore.Diagnostics.Middleware" Version="10.3.0" />
    <PackageReference Include="Microsoft.Extensions.Telemetry" Version="10.3.0" />
  </ItemGroup>
</Project>

Steps to reproduce

  1. Run the application (e.g. with the debugger enabled + break on all exceptions).
  2. Hit it with concurrent requests, for example using [bombardier](https://github.com/codesenberg/bombardier):
    bombardier -c 200 -d 30s http://localhost:5000/
  3. Within seconds, the application throws InvalidOperationException: Collection was modified; enumeration operation may not execute with the stack trace shown above.

The race occurs because each completing request calls IncomingRequestLogBuffer.Dispose()LogBufferingFilterRuleSelector.InvalidateCache(), which clears the cached List<T> instances while other in-flight requests are iterating them in Select().

LogBufferingBugRepro.zip

Expected behavior

No exceptions should be thrown.

Actual behavior

Stack Trace

at System.Collections.Generic.List`1.Enumerator.MoveNext()
at Microsoft.Extensions.Diagnostics.Buffering.LogBufferingFilterRuleSelector.Select(IList`1 rules, LogLevel logLevel, EventId eventId, IReadOnlyList`1 attributes)
at Microsoft.AspNetCore.Diagnostics.Buffering.IncomingRequestLogBuffer.TryEnqueue[TState](LogEntry`1 logEntry)
at Microsoft.AspNetCore.Diagnostics.Buffering.PerRequestLogBufferManager.TryEnqueue[TState](IBufferedLogger bufferedLogger, LogEntry`1& logEntry)
at Microsoft.Extensions.Logging.ExtendedLogger.LegacyPath[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
at Microsoft.Extensions.Logging.ExtendedLogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
at Microsoft.Extensions.Logging.Logger`1.Microsoft.Extensions.Logging.ILogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
at Microsoft.Extensions.Logging.LoggerMessage.<>c__DisplayClass12_0`2.<Define>g__Log|0(ILogger logger, T1 arg1, T2 arg2, Exception exception)
at Microsoft.AspNetCore.Routing.Matching.DfaMatcher.MatchAsync(HttpContext httpContext)
at Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware.Invoke(HttpContext httpContext)

Regression?

Don't know

Known Workarounds

None

Configuration

Host:
Version: 10.0.3
Architecture: x64
Commit: c2435c3e0f

Running on Windows 11

Other information

Possible Fixes

  • Snapshot the cached list to an array before iterating in Select().
  • Do not return cached lists to the pool in InvalidateCache() while they may still be referenced by concurrent Select() callers (e.g. use a lock or interlocked swap pattern).
  • Use an immutable collection (e.g. ImmutableArray) for cached rule sets instead of pooled List.

Not sure which approach would be most performant.

Metadata

Metadata

Assignees

No one assigned

    Labels

    area-telemetrybugThis issue describes a behavior which is not expected - a bug.untriaged

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions