
Fix OpenTelemetry missing root span by reordering session activity lifecycle#5245

Merged
thomhurst merged 2 commits into main from copilot/fix-missing-root-span
Mar 25, 2026

Contributor

Copilot AI commented Mar 25, 2026

The "test session" root span was never created or exported because the activity lifecycle was ordered incorrectly relative to user hooks.

Problem: StartActivity("test session") ran before [Before(TestSession)] hooks, but users set up their TracerProvider/ActivityListener in those hooks — so the ActivitySource had no listeners and returned null. Symmetrically, FinishSessionActivity ran after [After(TestSession)] hooks, which dispose the TracerProvider — so the span could never be exported. This caused Grafana Traces Drilldown to show no root spans, with empty Trace Service and Trace Name.

Fix in HookExecutor.cs:

  • ExecuteBeforeTestSessionHooksAsync: Start the session activity after hooks run, so the listener is attached first
  • ExecuteAfterTestSessionHooksAsync: Stop the session activity before hooks run, so the span is exported while the exporter is still alive
  • Determine session error status from actual test results (Failed/Timeout/Cancelled) instead of only tracking after-hook exceptions
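The ordering problem can be reproduced outside TUnit with a minimal sketch that uses only System.Diagnostics. The source name and the ListenerOrderingSketch class are hypothetical, and the ActivityListener stands in for what a TracerProvider built in a [Before(TestSession)] hook would register; this is not the code in HookExecutor.cs.

```csharp
using System.Diagnostics;

public static class ListenerOrderingSketch
{
    // Hypothetical source name, used only for this illustration.
    private static readonly ActivitySource Source = new("Sketch.ListenerOrdering");

    // Returns whether an activity was created before and after a listener was attached.
    public static (bool Before, bool After) Run()
    {
        // No listener is attached yet, so StartActivity returns null.
        using Activity? early = Source.StartActivity("test session");

        // Stand-in for the listener a TracerProvider would register in a before-hook.
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == Source.Name,
            Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
                ActivitySamplingResult.AllData,
        };
        ActivitySource.AddActivityListener(listener);

        // With a listener attached, the same call yields a real Activity.
        using Activity? late = Source.StartActivity("test session");

        return (early is not null, late is not null);
    }
}
```

Calling ListenerOrderingSketch.Run() should report that only the second call produced an activity, which is why the session activity has to start after the before-hooks have run.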
Original prompt

This section details the original issue you should resolve

<issue_title>[Bug]: OpenTelemetry - Missing root span</issue_title>
<issue_description>### Description

It seems that the produced trace is missing a root span.
The trace hierarchy is otherwise fine, but the missing root span creates issues in Grafana's Traces Drilldown.
See image:
Image

  1. The trace does not appear in the tab "Root spans" as there are none.
  2. Trace Service and Trace Name are "empty"

Impact is relatively low as you can still use drilldown to find the trace under "All Spans" and drilldown further.
But it's a bit confusing for users that don't know how this is supposed to work.

Also:
Using the Grafana LGTM stack docker.io/grafana/otel-lgtm which uses vanilla OpenTelemetry and not Alloy.

Expected Behavior

Expects root span to be sent so Traces Drilldown works better.

Actual Behavior

No root span is sent.

Steps to Reproduce

Followed the OpenTelemetry bootstrapping as per the docs: https://tunit.dev/docs/examples/opentelemetry/
Set up an LGTM instance with docker compose:

# docker-compose.yaml
volumes:
  grafana-data:
services:
  grafana-lgtm:
    image: docker.io/grafana/otel-lgtm
    restart: unless-stopped
    volumes:
      - grafana-data:/data
    ports:
      - 4040:4040
      - 3000:3000
      - 4317:4317
      - 4318:4318

[!WARNING]
If running behind a web proxy that presents a self-signed CA, Grafana Drilldown plugins will not install automatically, as that CA is most likely not in the image's trusted cert store.
In that case run:

docker exec -it -w /otel-lgtm/grafana {stackname*}-grafana-lgtm-1 bin/grafana cli plugins install grafana-metricsdrilldown-app --pluginsDir /data/grafana/plugins --insecure 
docker exec -it -w /otel-lgtm/grafana {stackname*}-grafana-lgtm-1 bin/grafana cli plugins install grafana-lokiexplore-app --pluginsDir /data/grafana/plugins --insecure 
docker exec -it -w /otel-lgtm/grafana {stackname*}-grafana-lgtm-1 bin/grafana cli plugins install grafana-exploretraces-app --pluginsDir /data/grafana/plugins --insecure 

* stackname is usually the folder where the docker-compose.yaml resides.

Force insecure gRPC:

.AddOtlpExporter(c =>
{
    c.Endpoint = new Uri("http://localhost:4317");
})

TUnit Version

1.21.6

.NET Version

.NET 10

Operating System

Windows

IDE / Test Runner

dotnet CLI (dotnet test / dotnet run)

Error Output / Stack Trace

Additional Context

No response

IDE-Specific Issue?

  • I've confirmed this issue occurs when running via dotnet test or dotnet run, not just in my IDE</issue_description>

Comments on the Issue (you are @copilot in this section)



…fecycle

The "test session" activity was created before [Before(TestSession)] hooks,
but users set up TracerProvider (which adds the ActivityListener) in those
hooks. The ActivitySource had no listeners so StartActivity returned null.
Similarly, FinishSessionActivity ran after [After(TestSession)] hooks which
dispose the TracerProvider, so the root span was never exported.

Fix: create the session activity after before-hooks run, and stop it before
after-hooks run.

Co-authored-by: thomhurst <30480171+thomhurst@users.noreply.github.com>
Agent-Logs-Url: https://github.com/thomhurst/TUnit/sessions/98881459-ab61-411c-8a31-0902db096e73
Copilot AI changed the title [WIP] Fix missing root span in OpenTelemetry traces Fix OpenTelemetry missing root span by reordering session activity lifecycle Mar 25, 2026
Copilot AI requested a review from thomhurst March 25, 2026 15:47
Copilot AI temporarily deployed to Pull Requests March 25, 2026 15:49 Inactive
Copilot AI temporarily deployed to Pull Requests March 25, 2026 15:49 Inactive
Contributor

@claude claude bot left a comment


Code Review: Fix OpenTelemetry missing root span

The fix correctly identifies and addresses the root cause: StartActivity was called before hooks registered the TracerProvider, and FinishSessionActivity was called after hooks disposed the exporter. The reordering is the right approach.

Correctness

The core logic is sound. The two-part fix is well-reasoned:

  1. Start session activity after before-hooks → listener is registered by the time StartActivity is called
  2. Stop session activity before after-hooks → exporter is still alive when the span is exported

Concerns

1. After-hook exceptions are no longer reflected in span status (inherent trade-off)

The original code:

FinishSessionActivity(hasErrors: exceptions is { Count: > 0 });

The new code:

var hasTestFailures = _contextProvider.TestSessionContext.AllTests
    .Any(t => t.Result is { State: TestState.Failed or TestState.Timeout or TestState.Cancelled });
FinishSessionActivity(hasErrors: hasTestFailures);

Since FinishSessionActivity must be called before after-hooks run, you cannot check whether those hooks will fail, so the span will never reflect after-session hook failures. This is an unavoidable consequence of the ordering fix, but it's worth documenting as a known limitation (a comment in the code, or in release notes for anyone relying on this behavior).

2. Session-level hook spans are now orphaned from the session root span

In ExecuteHookWithActivityAsync, the hook activity's parent is set via context.Activity?.Context ?? default. In the new flow:

  • Before-hooks: context.Activity is null when hooks run (session activity hasn't been created yet), so before-session hook spans become root spans with no parent.
  • After-hooks: context.Activity is null when hooks run (it's set to null in FinishSessionActivity), so after-session hook spans are also unparented.

This creates an inconsistency with assembly/class levels, where hooks run inside their span lifetime and are correctly parented. This is again an inherent trade-off — you can't parent hook spans to a span that doesn't exist yet — but worth acknowledging. Users using Grafana will see session-level hook spans as top-level spans rather than as children of the session root.
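The parenting behavior described above can be illustrated with a small standalone sketch. HookParentingSketch and the source name are hypothetical, and the session activity here is a plain Activity rather than TUnit's session context; only the expression session?.Context ?? default mirrors the one quoted from ExecuteHookWithActivityAsync.

```csharp
using System.Diagnostics;

public static class HookParentingSketch
{
    // Hypothetical source name, used only for this illustration.
    private static readonly ActivitySource Source = new("Sketch.HookParenting");

    // Returns true when the hook span ends up with a parent span, false when it is a root.
    // Assumes Activity.Current is null on entry; an ambient activity would be inherited
    // as the parent whenever the explicit parent context is default.
    public static bool HookHasParent(bool sessionAlive)
    {
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == Source.Name,
            Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
                ActivitySamplingResult.AllData,
        };
        ActivitySource.AddActivityListener(listener);

        // The session activity exists only when the hook runs inside its lifetime.
        using Activity? session = sessionAlive ? Source.StartActivity("test session") : null;

        // Same shape as the parent expression quoted in the review.
        ActivityContext parent = session?.Context ?? default;
        using Activity? hook = Source.StartActivity("hook", ActivityKind.Internal, parent);

        return hook is not null && hook.ParentSpanId != default(ActivitySpanId);
    }
}
```

With sessionAlive set to false, the hook span falls back to a default parent context and becomes a root span, which matches what session-level hooks see after the reordering.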

3. Minor: sessionContext is now scoped inside #if NET

Previously sessionContext was declared at the top of the method. In the new code it's only declared inside #if NET. This is fine since the non-NET paths don't reference it, but it's a subtle structural change.

What's Good

  • The fix is minimal and targeted — it changes only the session-level ordering, not the assembly/class levels (where the problem doesn't exist because users don't set up the provider there)
  • Using actual test result states (Failed, Timeout, Cancelled) for hasErrors is arguably more correct than the original (which only checked after-hook exceptions and defaulted to false when there were no after-hooks even if tests failed)
  • The comments in the code clearly explain the rationale for the ordering

Suggestion

Consider adding a note to FinishSessionActivity or the call site that after-hook exceptions are intentionally excluded from span status due to the ordering constraint:

// Note: after-hook exceptions cannot be reflected here since FinishSessionActivity
// must be called before after-hooks run (to ensure the exporter is still alive).
// The status reflects only test results.
var hasTestFailures = _contextProvider.TestSessionContext.AllTests
    .Any(t => t.Result is { State: TestState.Failed or TestState.Timeout or TestState.Cancelled });
FinishSessionActivity(hasErrors: hasTestFailures);

Overall

The fix is correct and the trade-offs are inherent to the problem domain — you cannot simultaneously parent spans correctly AND create them after the listener is registered. The PR fixes the more impactful user-facing issue (missing root span) and the approach is sound. The orphaned hook spans and missing after-hook error status are acceptable limitations worth documenting.

Contributor

@claude claude bot left a comment


Code Review: Fix OpenTelemetry missing root span

The fix correctly identifies and addresses the root cause: StartActivity was called before [Before(TestSession)] hooks registered the TracerProvider, and FinishSessionActivity was called after [After(TestSession)] hooks disposed the exporter. The reordering is the right approach.

Correctness

The core logic is sound. The two-part fix is well-reasoned:

  1. Start session activity after before-hooks → listener is registered by the time StartActivity is called
  2. Stop session activity before after-hooks → exporter is still alive when the span is exported

Concerns

1. After-hook exceptions are no longer reflected in span status (inherent trade-off)

The original code checked exceptions is { Count: > 0 } for hasErrors. The new code queries actual test results instead. Since FinishSessionActivity must be called before after-hooks run, you cannot check whether those hooks will fail — so the span will never reflect after-session hook failures. This is an unavoidable consequence of the ordering fix, but worth documenting as a known limitation.
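As a rough illustration of deriving the status from test results alone, the following sketch uses hypothetical stand-in types (TestInfo, TestResult, and a TestState enum whose Passed and Skipped members are assumptions). Only the pattern match mirrors the expression in the PR; TUnit's real TestSessionContext is not modeled here.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the framework's types, for illustration only.
public enum TestState { Passed, Failed, Timeout, Cancelled, Skipped }
public sealed record TestResult(TestState State);
public sealed record TestInfo(string Name, TestResult? Result);

public static class SessionStatusSketch
{
    // True when any test ended in a failing state. Tests without a result
    // (Result is null) never match the property pattern and are ignored.
    public static bool HasTestFailures(IEnumerable<TestInfo> tests) =>
        tests.Any(t => t.Result is
            { State: TestState.Failed or TestState.Timeout or TestState.Cancelled });
}
```

Because the value depends only on the collected test results, it can be computed before any after-hook runs, at the cost of not seeing exceptions thrown by those hooks.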

2. Session-level hook spans are now orphaned from the session root span

In ExecuteHookWithActivityAsync, the hook activity's parent is set via context.Activity?.Context ?? default. In the new flow:

  • Before-hooks: context.Activity is null when hooks run (session activity hasn't been created yet), so before-session hook spans become orphaned root spans.
  • After-hooks: context.Activity is null when hooks run (set to null in FinishSessionActivity), so after-session hook spans are also unparented.

This creates an inconsistency with assembly/class levels, where hooks run inside their span lifetime and are correctly parented. Users using Grafana will see session-level hook spans as top-level spans rather than children of the session root. This is inherent to the trade-off, but worth acknowledging.

3. Minor: sessionContext is now scoped inside #if NET

Previously declared at the top of the method, it's now only inside #if NET. This is fine since non-NET paths don't reference it, but it's a subtle structural change.

What's Good

  • The fix is minimal and targeted — it changes only the session-level ordering, not assembly/class levels
  • Using actual test result states (Failed, Timeout, Cancelled) for hasErrors is arguably more correct than the original (which defaulted to false when there were no after-hooks, even if tests failed)
  • The comments clearly explain the rationale for the ordering

Suggestion

Consider noting that after-hook exceptions are intentionally excluded from span status due to the ordering constraint:

// Note: after-hook exceptions cannot be reflected here since FinishSessionActivity
// must be called before after-hooks run (to ensure the exporter is still alive).
// The status reflects only test results.
var hasTestFailures = _contextProvider.TestSessionContext.AllTests
    .Any(t => t.Result is { State: TestState.Failed or TestState.Timeout or TestState.Cancelled });
FinishSessionActivity(hasErrors: hasTestFailures);

Overall

The fix is correct and the trade-offs are inherent to the problem domain — you cannot simultaneously parent spans correctly AND create them after the listener is registered. The PR resolves the more impactful user-facing issue (missing root span in Grafana Traces Drilldown) and the approach is sound. The orphaned hook spans and missing after-hook error status are acceptable limitations that warrant a brief comment.

@thomhurst thomhurst marked this pull request as ready for review March 25, 2026 15:58
@thomhurst thomhurst enabled auto-merge (squash) March 25, 2026 15:58
@thomhurst thomhurst disabled auto-merge March 25, 2026 15:58
@thomhurst
Owner

@copilot should we have another span that sits above the session?

Contributor Author

Copilot AI commented Mar 25, 2026

@copilot should we have another span that sits above the session?

I don't think an additional span above the session is needed. The "test session" span already serves as a clean root - it's the top of the hierarchy that all assemblies/suites/cases nest under.

The fundamental issue here is a chicken-and-egg problem: users set up their TracerProvider in [Before(TestSession)] hooks and tear it down in [After(TestSession)] hooks. Any span we create before those hooks will have no listener (so StartActivity returns null), and any span we stop after them will miss the exporter (already disposed). This applies equally whether the root span is called "test session" or something above it.

Adding another layer (e.g. "test run" → "test session") would just add nesting without solving an additional problem - the ordering fix already ensures the root span is properly created and exported. That said, if you'd like a different design (e.g. renaming the root or introducing a wrapper), happy to adjust.


Development

Successfully merging this pull request may close these issues.

[Bug]: OpenTelemetry - Missing root span

2 participants