Skip to content

CSHARP-5935: Command activities may be skipped when using pooled connection#1918

Open
ajcvickers wants to merge 2 commits intomongodb:mainfrom
ajcvickers:TestFixes_2
Open

CSHARP-5935: Command activities may be skipped when using pooled connection#1918
ajcvickers wants to merge 2 commits intomongodb:mainfrom
ajcvickers:TestFixes_2

Conversation

@ajcvickers
Copy link
Contributor

I was looking at flaky tests, and found MongoClient_should_create_activities_when_tracing_enabled was failing occasionally. Turns out it was an actual bug. Junie (Opus 4.6) says:

Root Cause

_shouldTrace in CommandEventHelper was set once at connection construction time via MongoTelemetry.ActivitySource.HasListeners(). Since connections are pooled and reused, if a connection was created before an ActivityListener was registered (or by a test with tracing disabled), _shouldTrace remained false permanently for that connection — command activities were never created even when a listener was later active.

…ection

I was looking at flaky tests, and found MongoClient_should_create_activities_when_tracing_enabled was failing occasionally. Turns out it was an actual bug. Junie (Opus 4.6) says:

### Root Cause

`_shouldTrace` in `CommandEventHelper` was set **once** at connection construction time via `MongoTelemetry.ActivitySource.HasListeners()`. Since connections are pooled and reused, if a connection was created before an `ActivityListener` was registered (or by a test with tracing disabled), `_shouldTrace` remained `false` permanently for that connection — command activities were never created even when a listener was later active.
@ajcvickers ajcvickers requested a review from adelinowona March 18, 2026 12:18
@ajcvickers ajcvickers requested a review from a team as a code owner March 18, 2026 12:18
Copilot AI review requested due to automatic review settings March 18, 2026 12:18
@ajcvickers ajcvickers added the bug Fixes issues or unintended behavior. label Mar 18, 2026
kmsProviders.Add("local", localKey);

var keyVaultNamespace = CollectionNamespace.FromFullName("encryption.__keyVault");
var keyVaultMongoClient = new MongoClient();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a simple test fix to use the configured local connection string. Not related to the bug fix--I can create another PR if necessary.

collection.Find(Builders<BsonDocument>.Filter.Empty).FirstOrDefault();
collection.DeleteOne(Builders<BsonDocument>.Filter.Eq("name", "test"));

SpinWait.SpinUntil(() => capturedActivities.Count >= 6, millisecondsTimeout: 10000);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is needed--I added it when I thought the test was flaky. But I don't think it does harm.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a tracing/instrumentation bug where command-level OpenTelemetry activities could be permanently skipped on pooled connections when an ActivityListener was registered after the connection was created.

Changes:

  • Update CommandEventHelper to decide whether to start command activities based on current ActivitySource listeners (instead of only at connection construction time).
  • Stabilize the OpenTelemetry smoke test by capturing activities thread-safely and waiting for async activity completion.
  • Minor smoke test / test infrastructure adjustments (key vault client URI selection; unobserved-exception tracking test case selection).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Adjusts tracing gating and activity creation logic for command events.
tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/OpenTelemetryTests.cs Makes activity capture thread-safe and waits for expected activities.
tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/LibmongocryptTests.cs Changes how the key vault MongoClient chooses its connection string.
tests/MongoDB.TestHelpers/XunitExtensions/TimeoutEnforcing/TimeoutEnforcingXunitTestAssemblyRunner.cs Avoids SingleOrDefault failure when multiple tracking test cases exist.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +57 to 61
_tracingDisabled = tracingOptions?.Disabled == true;
_queryTextMaxLength = tracingOptions?.QueryTextMaxLength ?? 0;

_shouldTrackState = _shouldTrackSucceeded || _shouldTrackFailed || _shouldTrace;
_shouldTrackState = _shouldTrackSucceeded || _shouldTrackFailed || !_tracingDisabled;
_shouldProcessRequestMessages = _shouldTrackStarted || _shouldTrackState;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While copilot is not wrong here, making this change re-introduces the bug, since if we're not tracking state, then we never get to the point of creating the activity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's important not to track stack when there are no listeners.
If it comes to that, we might need a more creative solution, or adjust our testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think you are right.

public void ConnectionFailed(ConnectionId connectionId, ObjectId? serviceId, Exception exception, bool skipLogging)
{
if (!_shouldTrackFailed && !_shouldTrace)
if (!_shouldTrackFailed && (_tracingDisabled || !MongoTelemetry.ActivitySource.HasListeners()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: probably it' cleaner to extract to IsTracingDisabled method.

Comment on lines +57 to 61
_tracingDisabled = tracingOptions?.Disabled == true;
_queryTextMaxLength = tracingOptions?.QueryTextMaxLength ?? 0;

_shouldTrackState = _shouldTrackSucceeded || _shouldTrackFailed || _shouldTrace;
_shouldTrackState = _shouldTrackSucceeded || _shouldTrackFailed || !_tracingDisabled;
_shouldProcessRequestMessages = _shouldTrackStarted || _shouldTrackState;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's important not to track stack when there are no listeners.
If it comes to that, we might need a more creative solution, or adjust our testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Fixes issues or unintended behavior.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants