TelemetryManager.Dispose leaks Azure Monitor ActivityListener (only Shutdown(0)s, never disposes the providers) #17529

@radical

Summary

TelemetryManager.Dispose shuts down its TracerProvider instances but never disposes them, so the underlying ActivityListener registered on Aspire.Cli.Reported leaks for the rest of the process lifetime.

Repro

// src/Aspire.Cli/Telemetry/TelemetryManager.cs:202
public void Dispose()
{
    if (!_shuttingDown)
    {
        _azureMonitorProvider?.Shutdown(0);
        _profilingProvider?.Shutdown(0);
        _debugDiagnosticProvider?.Shutdown(0);
        // Dispose isn't used here because it always flushes telemetry and waits for completion.
    }
}

TracerProvider.Shutdown(timeoutMs) flushes pending spans and stops accepting new ones; it does not unregister the ActivityListener that the provider registered through ActivitySource.AddActivityListener. Only TracerProvider.Dispose() does that.
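To illustrate the listener lifetime in isolation, here is a minimal sketch that uses only System.Diagnostics, with no OpenTelemetry provider involved. The class name and the source name "Demo.ListenerLifetime" are hypothetical. It only shows that a listener added with ActivitySource.AddActivityListener stays attached to a matching source until the listener itself is disposed.

```csharp
using System;
using System.Diagnostics;

public static class ListenerLifetimeDemo
{
    // Hypothetical source name, chosen so the demo does not collide with real sources.
    private const string SourceName = "Demo.ListenerLifetime";

    // Reports whether the source has listeners before and after the listener is disposed.
    public static (bool WhileRegistered, bool AfterDispose) Run()
    {
        using var source = new ActivitySource(SourceName);

        var listener = new ActivityListener
        {
            ShouldListenTo = s => s.Name == SourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllData,
        };

        // Registration is process-wide: the listener attaches to every matching source.
        ActivitySource.AddActivityListener(listener);
        bool whileRegistered = source.HasListeners();

        // Disposing the listener is what detaches it from the source.
        listener.Dispose();
        bool afterDispose = source.HasListeners();

        return (whileRegistered, afterDispose);
    }
}
```

If the listener is never disposed, which is the situation this issue describes for the providers, the source keeps reporting that it has listeners.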

Expected vs. actual

Expected: After TelemetryManager.Dispose() returns, the ActivityListener it registered should be unregistered. After Shutdown(0) has run there is nothing to flush, so a subsequent Dispose() on each provider should be fast.
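One possible shape for the expected behavior is sketched below. This is not the actual TelemetryManager: the class name, the constructor, and the _disposed guard are assumptions made for illustration, and the existing _shuttingDown check is left out because its semantics are not shown in this issue. The sketch needs the OpenTelemetry package and assumes, as stated above, that Dispose() has nothing left to flush once Shutdown(0) has run.

```csharp
#nullable enable
using System;
using OpenTelemetry.Trace;

// Hypothetical stand-in for TelemetryManager; only the disposal path is sketched.
public sealed class TelemetryManagerSketch : IDisposable
{
    private readonly TracerProvider? _azureMonitorProvider;
    private readonly TracerProvider? _profilingProvider;
    private readonly TracerProvider? _debugDiagnosticProvider;
    private bool _disposed; // assumption: guards against double disposal

    public TelemetryManagerSketch(
        TracerProvider? azureMonitorProvider,
        TracerProvider? profilingProvider,
        TracerProvider? debugDiagnosticProvider)
    {
        _azureMonitorProvider = azureMonitorProvider;
        _profilingProvider = profilingProvider;
        _debugDiagnosticProvider = debugDiagnosticProvider;
    }

    public void Dispose()
    {
        if (_disposed)
        {
            return;
        }

        _disposed = true;
        ShutdownThenDispose(_azureMonitorProvider);
        ShutdownThenDispose(_profilingProvider);
        ShutdownThenDispose(_debugDiagnosticProvider);
    }

    private static void ShutdownThenDispose(TracerProvider? provider)
    {
        if (provider is null)
        {
            return;
        }

        // Shut down with a zero timeout first so we do not wait on a flush...
        provider.Shutdown(0);
        // ...then dispose so the provider can release its ActivityListener.
        provider.Dispose();
    }
}
```

Whether Dispose() is really fast after Shutdown(0) should be confirmed against the OpenTelemetry version in use before adopting this ordering.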

Actual: The listener stays alive process-wide. Every disposed TelemetryManager adds one more stale listener to Aspire.Cli.Reported.

Impact

  • In production: non-issue. One TelemetryManager per CLI process; the process exits soon after.
  • In the test process: this is the underlying cause of the microsoft.sample_rate race tracked by [Failing test]: Aspire.Cli.Tests.CliSmokeTests.MainReturnsExpectedExitCode(args: [], expectedExitCode: 1) #17450. Across many test classes, each disposed TelemetryManager leaves its listener behind. Subsequent parallel Reported activities then trigger several stale samplers, each calling ActivityCreationOptions.SamplingTags.Add("microsoft.sample_rate", ...) on the same activity, which throws InvalidOperationException: The collection already contains item with same key 'microsoft.sample_rate'.
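The key collision can be sketched with System.Diagnostics alone. This is a simplified, single-threaded illustration rather than a reproduction of the failing test: the class name, the source name, and the sample-rate value are hypothetical, and the Azure Monitor sampler is replaced by a hand-written Sample callback that adds the same tag. It only shows that when more than one live listener adds "microsoft.sample_rate" to the sampling tags of one activity, the second Add throws.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class DuplicateSamplingTagDemo
{
    // Hypothetical source name for the demo.
    private const string SourceName = "Demo.DuplicateSamplingTag";

    // Stand-in for a sampler that stamps the sample rate onto every activity it samples.
    private static ActivityListener CreateTaggingListener()
    {
        var listener = new ActivityListener
        {
            ShouldListenTo = s => s.Name == SourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
            {
                // All listeners see the same SamplingTags collection for one activity.
                options.SamplingTags.Add("microsoft.sample_rate", 100.0);
                return ActivitySamplingResult.AllDataAndRecorded;
            },
        };
        ActivitySource.AddActivityListener(listener);
        return listener;
    }

    // Returns true if starting an activity throws with the given number of live listeners.
    public static bool StartThrows(int listenerCount)
    {
        using var source = new ActivitySource(SourceName);
        var listeners = new List<ActivityListener>();
        try
        {
            for (int i = 0; i < listenerCount; i++)
            {
                listeners.Add(CreateTaggingListener());
            }

            using var activity = source.StartActivity("reported");
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
        finally
        {
            // Dispose every listener so the demo itself does not leak registrations.
            foreach (var listener in listeners)
            {
                listener.Dispose();
            }
        }
    }
}
```

With one listener the activity starts normally; with two, the second callback hits the duplicate key. In the test process the extra listeners are the ones left behind by earlier TelemetryManager instances.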

PR #17451 (commit e126a1a) worked around this by opting the whole Aspire.Cli.Tests assembly out of Azure Monitor, but the underlying lifecycle bug remains: any future test (or production caller) that opts back in and disposes a TelemetryManager still leaks the listener. This was rediscovered in #17461 when new tests in TelemetryConfigurationTests bypassed the workaround by directly constructing TelemetryManager and resurrected the race (fixed in that PR by routing the new tests through BuildHostAsync).
