Handle ServerNotAvailableException in CollectLinuxCommand process probing by Copilot · Pull Request #5705 · dotnet/diagnostics

Copilot · 2026-02-05T18:51:02Z

Summary

Handle ServerNotAvailableException and DiagnosticToolException in CollectLinuxCommand process probing to gracefully handle processes that cannot be resolved or connected to.

Fixes #5694

Problem

DiagnosticsClient.GetProcessInfo() throws ServerNotAvailableException when the diagnostics server is unavailable (process exits between enumeration and probing, connection failures, etc.). Additionally, CommandUtils.ResolveProcess() throws DiagnosticToolException for invalid process IDs or names. These unhandled exceptions caused probe operations to crash.

Solution

Introduce a four-state probe result to handle all outcomes:

Supported: Process supports UserEvents IPC command
NotSupported: Process does not support UserEvents IPC command (runtime too old)
ProcessNotFound: Process could not be resolved (invalid PID, no process with given name)
ConnectionFailed: Process resolved but unable to connect to diagnostic endpoint
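
As a rough illustration of how the four states relate to the two exception types, here is a minimal sketch. It is not the PR's implementation: the exception classes are local stand-ins for the real DiagnosticToolException and ServerNotAvailableException so the snippet compiles on its own, and the `resolve` and `supportsUserEvents` delegates, along with the `ProbeSketch.Classify` name, are hypothetical placeholders for the process-resolution and endpoint-query steps.

```csharp
using System;

// Stand-ins for the real exception types, so the sketch compiles on its own.
public class DiagnosticToolException : Exception
{
    public DiagnosticToolException(string message) : base(message) { }
}

public class ServerNotAvailableException : Exception
{
    public ServerNotAvailableException(string message) : base(message) { }
}

public enum UserEventsProbeResult
{
    Supported,
    NotSupported,
    ProcessNotFound,
    ConnectionFailed
}

public static class ProbeSketch
{
    // resolve: hypothetical stand-in for process resolution;
    //          expected to throw DiagnosticToolException on failure.
    // supportsUserEvents: hypothetical stand-in for the endpoint query;
    //          expected to throw ServerNotAvailableException when unreachable.
    public static UserEventsProbeResult Classify(Func<int> resolve, Func<int, bool> supportsUserEvents)
    {
        int pid;
        try
        {
            pid = resolve();
        }
        catch (DiagnosticToolException)
        {
            // Resolution failed: invalid PID or no process with the given name.
            return UserEventsProbeResult.ProcessNotFound;
        }

        try
        {
            return supportsUserEvents(pid)
                ? UserEventsProbeResult.Supported
                : UserEventsProbeResult.NotSupported;
        }
        catch (ServerNotAvailableException)
        {
            // Process resolved, but its diagnostic endpoint could not be reached.
            return UserEventsProbeResult.ConnectionFailed;
        }
    }
}
```

Keeping the two catch blocks separate is what lets callers report "not found" and "could not connect" differently.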

Behavior Changes

Non-probe mode (dotnet-trace collect-linux -p <pid>):

ProcessNotFound: [ERROR] Could not resolve process '<id>'.
ConnectionFailed: [ERROR] Unable to connect to process '<id>'. The process may have exited or its diagnostic endpoint is not accessible.
Both return TracingError

Single-process probe mode (dotnet-trace collect-linux --probe -p <pid>):

ProcessNotFound: Could not resolve process '<id>'.
ConnectionFailed: Process '<id>' could not be probed. Unable to connect to the process's diagnostic endpoint.
Returns Ok (informational output)

Machine-wide probe mode (dotnet-trace collect-linux --probe):

Shows "Processes that could not be probed" section when applicable
CSV output includes unknown value for unprobed processes
Processes that exit between enumeration and probing are handled gracefully
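
The non-probe and single-process probe behavior listed above can be summarized as a small mapping from failure state and mode to a return code and message. The sketch below is illustrative only: `ProbeOutcomeSketch.DescribeFailure` is a hypothetical helper, `ReturnCode` is a local stand-in that models just the two members named above, and `id` is assumed to be the already-formatted process identifier shown as '<id>'. Machine-wide probe mode is not covered.

```csharp
using System;

public enum UserEventsProbeResult { Supported, NotSupported, ProcessNotFound, ConnectionFailed }

// Local stand-in for the tool's ReturnCode enum; only the two members named above are modeled.
public enum ReturnCode { Ok, TracingError }

public static class ProbeOutcomeSketch
{
    // Hypothetical helper: maps a failure state to the return code and message
    // described for non-probe mode (probeMode == false) and single-process
    // probe mode (probeMode == true).
    public static (ReturnCode Code, string Message) DescribeFailure(
        UserEventsProbeResult result, bool probeMode, string id)
    {
        switch (result)
        {
            case UserEventsProbeResult.ProcessNotFound:
                return probeMode
                    ? (ReturnCode.Ok, $"Could not resolve process '{id}'.")
                    : (ReturnCode.TracingError, $"[ERROR] Could not resolve process '{id}'.");
            case UserEventsProbeResult.ConnectionFailed:
                return probeMode
                    ? (ReturnCode.Ok, $"Process '{id}' could not be probed. Unable to connect to the process's diagnostic endpoint.")
                    : (ReturnCode.TracingError, $"[ERROR] Unable to connect to process '{id}'. The process may have exited or its diagnostic endpoint is not accessible.");
            default:
                // Supported and NotSupported are not failure states.
                throw new ArgumentOutOfRangeException(nameof(result), "Only failure states are described here.");
        }
    }
}
```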

Other Changes

Added FormatProcessIdentifier helper - shows name (pid) when name is provided, just pid otherwise
Changed ".NET process" to "Process" in messages (probe accepts arbitrary PIDs)
Updated --probe option help text to document result categories
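
For reference, a helper with the behavior described for FormatProcessIdentifier might look like the sketch below. This is an assumption-laden illustration rather than the PR's code: the `(int pid, string name)` parameter order is inferred from the test calls in this PR, and treating a null or empty name as "no name provided" is an assumption. The `FormatSketch` class name is hypothetical.

```csharp
using System;
using System.Globalization;

public static class FormatSketch
{
    // Returns "name (pid)" when a name is provided, and just the pid otherwise.
    // Assumption: null and empty names both count as "not provided".
    public static string FormatProcessIdentifier(int pid, string name)
    {
        string pidText = pid.ToString(CultureInfo.InvariantCulture);
        return string.IsNullOrEmpty(name) ? pidText : $"{name} ({pidText})";
    }
}
```

For example, under these assumptions `FormatSketch.FormatProcessIdentifier(1, "dummy")` yields `dummy (1)`, while passing an empty name yields only the pid.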

…lpers Add UserEventsProbeResult enum (Supported/NotSupported) to replace boolean return. Introduce ProbeProcess helper for probing a single process. Add GetAndProbeAllProcesses helper that enumerates and probes all published processes. Update callers in CollectLinux and SupportsCollectLinux to use new helpers. Update BuildProcessSupportCsv to use UserEventsProbeResult enum.

…cess probing Add ProcessNotFound and ConnectionFailed values to UserEventsProbeResult enum. Update ProbeProcess to catch DiagnosticToolException (process resolution failed) and ServerNotAvailableException (diagnostic endpoint not accessible) separately. Add FormatProcessIdentifier helper for clean display of process ID/name. Add unknownProcesses/unknownCsv tracking for processes that could not be probed. Update probe mode output to show 'Processes that could not be probed' section. Include 'unknown' value in CSV output for unprobed processes. Update non-probe mode to show distinct errors for each failure type. Change '.NET process' to 'Process' in messages since arbitrary PIDs may not be .NET. Fixes #5694

Document that results are categorized as supported, not supported, or unknown. Clarify that unknown status occurs when diagnostic endpoint is not accessible.

…iled handling
Update test expectations to match new behavior:
- Add FormatProcessNotFoundError and FormatProcessIdentifier helpers
- Update ResolveProcessExceptions test data for ProcessNotFound handling
- Update probe error test cases for process resolution errors
- Tests now expect ReturnCode.TracingError for failures in non-probe mode
- Tests expect ReturnCode.Ok for probe mode with informational output

Copilot

Pull request overview

This PR improves dotnet-trace collect-linux resilience by handling process-resolution and diagnostics-connection failures during “process probing” so the command no longer crashes when a target process can’t be resolved or connected to (e.g., exits between enumeration and probing, cross-container endpoint issues).

Changes:

Replaced boolean “supports” probing with a 4-state probe result (Supported/NotSupported/ProcessNotFound/ConnectionFailed) and updated user-facing output.
Updated machine-wide probe to track and report “unknown/unprobed” processes and emit unknown in CSV.
Adjusted functional tests to match new probe behaviors/messages (partially—some existing expectations still appear outdated).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs` | Introduces multi-state probing, catches `DiagnosticToolException`/`ServerNotAvailableException`, updates probe messaging and CSV output. |
| `src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs` | Updates/extends tests for new probe outcomes and adds helpers for the new process identifier/message formatting. |

Copilot · 2026-02-06T20:06:17Z

src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs

+                            return (int)ReturnCode.TracingError;
+                        case UserEventsProbeResult.ProcessNotFound:
+                            Console.Error.WriteLine($"[ERROR] Could not resolve process '{FormatProcessIdentifier(resolvedProcessId, resolvedProcessName)}'.");
+                            return (int)ReturnCode.TracingError;


CollectLinux now maps ResolveProcess failures (invalid PID/name, both PID and name, etc.) to ReturnCode.TracingError and a generic "Could not resolve process ..." message. This is a behavioral change from the previous DiagnosticToolException flow (which returned ArgumentError with a more specific message) and may break callers/scripts relying on ArgumentError for invalid input. Consider preserving DiagnosticToolException.ReturnCode (or at least returning ArgumentError for invalid arguments like negative PID / both options) while still handling ServerNotAvailableException gracefully.

Suggested change

-                            return (int)ReturnCode.TracingError;
+                            return (int)ReturnCode.ArgumentError;

Copilot · 2026-02-06T20:06:17Z

src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs

+            catch (DiagnosticToolException)
+            {
+                resolvedPid = originalPid;
+                resolvedName = originalName;


ProbeProcess catches DiagnosticToolException from CommandUtils.ResolveProcess and always returns ProcessNotFound, which replaces specific, actionable error messages like "Only one of the --name or --process-id options may be specified." with the generic "Could not resolve process ...". Consider distinguishing argument-validation failures from true "process not found" (e.g., by surfacing the original DiagnosticToolException message/ReturnCode for invalid option combinations).

Suggested change

-            catch (DiagnosticToolException)
-            {
-                resolvedPid = originalPid;
-                resolvedName = originalName;
+            catch (DiagnosticToolException ex)
+            {
+                // Restore original values in case ResolveProcess mutated the out parameters
+                resolvedPid = originalPid;
+                resolvedName = originalName;
+                // Distinguish argument-validation failures from true "process not found" cases.
+                // For argument/command-line errors, rethrow so the caller can surface the
+                // specific, actionable message and return code.
+                if (ex.ReturnCode == ReturnCode.InvalidArgument ||
+                    ex.ReturnCode == ReturnCode.CommandLineError)
+                {
+                    throw;
+                }
+                // For all other failures from ResolveProcess, treat as "process not found".

Copilot · 2026-02-06T20:06:18Z

src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs

+                        Console.WriteLine($"Probing processes for support of the EventPipe UserEvents IPC command used by collect-linux. Requires runtime '{minRuntimeSupportingUserEventsIPCCommand}' or later.");
                    }
                    StringBuilder supportedProcesses = new();
                    StringBuilder unsupportedProcesses = new();
+                    StringBuilder unknownProcesses = new();

-                    IEnumerable<int> pids = DiagnosticsClient.GetPublishedProcesses();
-                    foreach (int pid in pids)
-                    {
-                        if (pid == Environment.ProcessId)
-                        {
-                            continue;
-                        }
-
-                        bool supports = ProcessSupportsUserEventsIpcCommand(pid, string.Empty, out int resolvedPid, out string resolvedName, out string detectedRuntimeVersion);
-                        BuildProcessSupportCsv(resolvedPid, resolvedName, supports, supportedCsv, unsupportedCsv);
-                        if (supports)
-                        {
-                            supportedProcesses.AppendLine($"{resolvedPid} {resolvedName}");
-                        }
-                        else
-                        {
-                            unsupportedProcesses.AppendLine($"{resolvedPid} {resolvedName} - Detected runtime: '{detectedRuntimeVersion}'");
-                        }
-                    }
+                    GetAndProbeAllProcesses(supportedProcesses, unsupportedProcesses, unknownProcesses, supportedCsv, unsupportedCsv, unknownCsv);

                    if (mode == ProbeOutputMode.Console)
                    {
-                        Console.WriteLine($".NET processes that support the command:");
+                        Console.WriteLine($"Processes that support the command:");
                        Console.WriteLine(supportedProcesses.ToString());
-                        Console.WriteLine($".NET processes that do NOT support the command:");
+                        Console.WriteLine($"Processes that do NOT support the command:");
                        Console.WriteLine(unsupportedProcesses.ToString());


SupportsCollectLinux console output strings were changed from ".NET process(es)" to "Process(es)", but existing functional tests still assert the old ".NET processes" headings (e.g., CollectLinuxCommand_Probe_ListsProcesses_WhenNoArgs). This will cause test failures unless the test expectations are updated (or the output wording is kept compatible).

Copilot · 2026-02-06T20:06:18Z

src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs

        [ConditionalTheory(nameof(IsCollectLinuxSupported))]
        [MemberData(nameof(ResolveProcessExceptions))]
-        public void CollectLinuxCommand_ResolveProcessExceptions(object testArgs, string[] expectedError)
+        public void CollectLinuxCommand_ResolveProcessExceptions(object testArgs, string[] expectedError, int expectedExitCode)
        {
            MockConsole console = new(200, 30, _outputHelper);

            int exitCode = Run(testArgs, console);

-            Assert.Equal((int)ReturnCode.ArgumentError, exitCode);
+            Assert.Equal(expectedExitCode, exitCode);
            console.AssertSanitizedLinesEqual(null, expectedError);


The functional tests for non-probe mode (CollectLinuxCommand_ReportsResolveProcessErrors / _ReportsResolveProcessNameErrors) still assert ReturnCode.ArgumentError and the raw ResolveProcess exception text, but CollectLinux now routes these cases through ProbeProcess and returns TracingError with a different message. Update those tests to match the new behavior, otherwise the test suite will fail.

Copilot · 2026-02-06T20:06:18Z

src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs

+                        break;
+                    case UserEventsProbeResult.ProcessNotFound:
+                    case UserEventsProbeResult.ConnectionFailed:
+                        unknownProcesses?.AppendLine($"{resolvedPid} {resolvedName} - Unable to connect");


GetAndProbeAllProcesses appends "- Unable to connect" for both ProcessNotFound and ConnectionFailed results. For ProcessNotFound (e.g., process exited between enumeration and probe) this message is misleading; consider wording that covers both cases (e.g., "Exited or diagnostic endpoint not accessible") or differentiating the two outcomes.

Suggested change

-                        unknownProcesses?.AppendLine($"{resolvedPid} {resolvedName} - Unable to connect");
+                        unknownProcesses?.AppendLine($"{resolvedPid} {resolvedName} - Exited or diagnostic endpoint not accessible");

Copilot · 2026-02-06T20:06:18Z

src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs

            new("--probe")
            {
-                Description = "Probe .NET processes for support of the EventPipe UserEvents IPC command used by collect-linux, without collecting a trace. Results list supported processes first. Use '-o stdout' to print CSV (pid,processName,supportsCollectLinux) to the console, or '-o <file>' to write the CSV. Probe a single process with -n|--name or -p|--process-id.",
+                Description = "Probe processes for support of the EventPipe UserEvents IPC command used by collect-linux, without collecting a trace. Results are categorized as supported, not supported, or unknown (when the process's diagnostic endpoint is not accessible). Use '-o stdout' to print CSV (pid,processName,supportsCollectLinux) to the console, or '-o <file>' to write the CSV. Probe a single process with -n|--name or -p|--process-id.",


--probe help text describes results as "supported, not supported, or unknown", but single-process probe mode also emits a distinct "Could not resolve process ..." case (ProcessNotFound). Consider updating the option description to mention the ProcessNotFound outcome (or clarifying that it is included under "unknown") so the CLI help matches observed behavior.

Suggested change

-                Description = "Probe processes for support of the EventPipe UserEvents IPC command used by collect-linux, without collecting a trace. Results are categorized as supported, not supported, or unknown (when the process's diagnostic endpoint is not accessible). Use '-o stdout' to print CSV (pid,processName,supportsCollectLinux) to the console, or '-o <file>' to write the CSV. Probe a single process with -n|--name or -p|--process-id.",
+                Description = "Probe processes for support of the EventPipe UserEvents IPC command used by collect-linux, without collecting a trace. Results are categorized as supported, not supported, or unknown (for example, when the process's diagnostic endpoint is not accessible or the process cannot be found). Use '-o stdout' to print CSV (pid,processName,supportsCollectLinux) to the console, or '-o <file>' to write the CSV. Probe a single process with -n|--name or -p|--process-id.",

Copilot · 2026-02-06T20:06:19Z

src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs

                    Console.WriteLine("pid,processName,supportsCollectLinux");
                    Console.Write(supportedCsv?.ToString());
                    Console.Write(unsupportedCsv?.ToString());
+                    Console.Write(unknownCsv?.ToString());


Condition is always not null because of ... == ....

Copilot · 2026-02-06T20:06:19Z

src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs

                    writer.Write(supportedCsv?.ToString());
                    writer.Write(unsupportedCsv?.ToString());
+                    writer.Write(unknownCsv?.ToString());


Condition is always not null because of ... == ....

Suggested change

-                    writer.Write(supportedCsv?.ToString());
-                    writer.Write(unsupportedCsv?.ToString());
-                    writer.Write(unknownCsv?.ToString());
+                    writer.Write(supportedCsv.ToString());
+                    writer.Write(unsupportedCsv.ToString());
+                    writer.Write(unknownCsv.ToString());

noahfalk · 2026-02-06T22:07:58Z

src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs

        {
            MockConsole console = new(200, 30, _outputHelper);
-            var args = TestArgs(processId: -1, probe: true);
+            var args = TestArgs(processId: -1, probe: true, output: new FileInfo(CommonOptions.DefaultTraceName));


Does the argument to the test need to change? I would expect the args to match what we expect the user to pass in at the command-line and I don't imagine users will be adding a trace name to their probe commands.

noahfalk · 2026-02-06T22:14:31Z

src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs

+            // ProcessNotFound shows just PID when no name is provided
+            string[] expected = ExpectPreviewWithMessages(
+                new[] {
+                    $"Could not resolve process '{FormatProcessIdentifier(-1, string.Empty)}'.",


The error text seems less clear than it was previously. "Resolving a process" sounds more like our app's internal terminology rather than terminology users will universally understand.

Also I think it's easier to review these UI test cases if the test includes the literal error text rather than calls into other formatting functions that have to be reasoned about. If changing app behavior causes 5 or 10 test cases to all have a similar looking edit, that is useful to see.

noahfalk · 2026-02-06T22:16:59Z

src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs

+            // ProcessNotFound shows "name (pid)" when name is provided - pid is 0 from input
+            string[] expected = ExpectPreviewWithMessages(
+                new[] {
+                    $"Could not resolve process '{FormatProcessIdentifier(0, "process-that-should-not-exist")}'.",


Previous error text seems clearer again

noahfalk · 2026-02-06T22:17:23Z

src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs

+            // ProcessNotFound shows "name (pid)" when both are provided
+            string[] expected = ExpectPreviewWithMessages(
+                new[] {
+                    $"Could not resolve process '{FormatProcessIdentifier(1, "dummy")}'.",


Previous error text seems clearer.

noahfalk · 2026-02-06T22:19:32Z

src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs

            {
                TestArgs(processId: -1, name: string.Empty),
-                FormatException("-1 is not a valid process ID")
+                FormatProcessNotFoundError(FormatProcessIdentifier(-1, string.Empty)),


This is another spot where it would be useful to inline the exact error text rather than needing to read through multiple layers of formatting helpers to figure it out. I'm also guessing that, like above, the new error messages are less precise than before.

noahfalk · 2026-02-06T22:24:54Z

src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs

        [ConditionalTheory(nameof(IsCollectLinuxSupported))]
        [MemberData(nameof(ResolveProcessExceptions))]
-        public void CollectLinuxCommand_ResolveProcessExceptions(object testArgs, string[] expectedError)
+        public void CollectLinuxCommand_ResolveProcessExceptions(object testArgs, string[] expectedError, int expectedExitCode)


I don't think I noticed any test case for the output when handling a ServerNotAvailableException. That seems important to add.

Is there an easy instrumentable way to have a .NET process that isn't ready to accept a diagnostics IPC command, or is shutting down by the time the probe tries to send the command? I haven't brainstormed anything simple yet. My big assumption is the tests are racing to probe a .NET process that is in the middle of shutting down, hence the flaky failure on CI.

Copilot AI assigned Copilot and mdh1418 Feb 5, 2026

Copilot started work on behalf of mdh1418 February 5, 2026 18:51

Copilot AI changed the title ~~[WIP] Investigate DiagnosticsClient.GetProcessInfo exception~~ Handle ServerNotAvailableException in CollectLinuxCommand process probing Feb 5, 2026

Copilot AI requested a review from mdh1418 February 5, 2026 19:03

Copilot finished work on behalf of mdh1418 February 5, 2026 19:03

mdh1418 force-pushed the copilot/fix-diagnostics-client-exception branch from 8fe26f7 to fe76ee6 Compare February 6, 2026 17:21

mdh1418 added 4 commits February 6, 2026 19:20

Update --probe option help text

6f420b2

Document that results are categorized as supported, not supported, or unknown. Clarify that unknown status occurs when diagnostic endpoint is not accessible.

mdh1418 force-pushed the copilot/fix-diagnostics-client-exception branch from fe76ee6 to 0cbcf44 Compare February 6, 2026 19:54

mdh1418 marked this pull request as ready for review February 6, 2026 20:02

mdh1418 requested a review from a team as a code owner February 6, 2026 20:02

mdh1418 requested review from Copilot, hoyosjs, lateralusX and noahfalk February 6, 2026 20:02

Copilot started reviewing on behalf of mdh1418 February 6, 2026 20:02

Copilot AI reviewed Feb 6, 2026

noahfalk reviewed Feb 6, 2026
