Description
Tracing a single process in a separate container with -p|--process-id fails with:
[ERROR] Microsoft.Diagnostics.NETCore.Client.ServerNotAvailableException: Unable to connect to Process 652542. Please verify that /tmp/ is writable by the current user. If the target process has environment variable TMPDIR set, please set TMPDIR to the same directory. Please also ensure that the target process has {TMPDIR}/dotnet-diagnostic-{pid}-{disambiguation_key}-socket shorter than 108 characters. Please see https://aka.ms/dotnet-diagnostics-port for more information
at Microsoft.Diagnostics.NETCore.Client.PidIpcEndpoint.GetDefaultAddress(Int32 pid)
at Microsoft.Diagnostics.NETCore.Client.PidIpcEndpoint.GetDefaultAddress()
at Microsoft.Diagnostics.NETCore.Client.PidIpcEndpoint.Connect(TimeSpan timeout)
at Microsoft.Diagnostics.NETCore.Client.IpcClient.SendMessageGetContinuation(IpcEndpoint endpoint, IpcMessage message)
at Microsoft.Diagnostics.NETCore.Client.DiagnosticsClient.TryGetProcessInfo3()
at Microsoft.Diagnostics.NETCore.Client.DiagnosticsClient.GetProcessInfo()
at Microsoft.Diagnostics.Tools.Trace.CollectLinuxCommandHandler.ProcessSupportsUserEventsIpcCommand(Int32 pid, String processName, Int32& resolvedPid, String& resolvedName, String& detectedRuntimeVersion)
at Microsoft.Diagnostics.Tools.Trace.CollectLinuxCommandHandler.CollectLinux(CollectLinuxArgs args)
When dotnet-trace collect-linux traces a single process, it first connects to that process and sends a GetProcessInfo command to probe for userevents support. When the target process lives in a separate container, I believe DiagnosticsClient cannot discover the .NET process from the PID alone, because the target's diagnostic port (the dotnet-diagnostic-{pid}-{disambiguation_key}-socket file under its TMPDIR) may live somewhere the tool does not look.
Tracing the cross-container process should still be possible: record-trace appears able to trace by PID because it currently enumerates PerfMaps and matches the corresponding PID from there. The limitation here is how CollectLinuxCommand determines whether a specific process supports userevents. Cross-container single-process tracing already works in dotnet-trace collect when using the diagnostic port.
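To make the suspected mismatch concrete, here is a minimal Python sketch of how one might look for the target's diagnostic socket from outside its container. It is not what DiagnosticsClient does today. It assumes the caller may read /proc/<pid>/status and /proc/<pid>/root (which usually needs elevated privileges), that the target uses /tmp as TMPDIR, and that the PID embedded in the socket name is the one the target sees in its own PID namespace. The function names are hypothetical.

```python
import glob
import os


def namespaced_pid(host_pid, proc_root="/proc"):
    """Return the PID the target sees in its innermost PID namespace."""
    status_path = os.path.join(proc_root, str(host_pid), "status")
    with open(status_path) as status:
        for line in status:
            if line.startswith("NSpid:"):
                # NSpid lists the PID in each nested namespace, outermost first.
                return int(line.split()[-1])
    # No NSpid line (older kernels): assume there is no namespace nesting.
    return host_pid


def candidate_sockets(host_pid, tmpdir="/tmp", proc_root="/proc"):
    """List diagnostic sockets visible through the target's own root filesystem."""
    inner_pid = namespaced_pid(host_pid, proc_root)
    # /proc/<pid>/root exposes the filesystem as the target process sees it.
    directory = os.path.join(proc_root, str(host_pid), "root", tmpdir.lstrip("/"))
    pattern = f"dotnet-diagnostic-{inner_pid}-*-socket"
    return sorted(glob.glob(os.path.join(glob.escape(directory), pattern)))
```

Calling `candidate_sockets(652542)` on the host would list any matching socket files inside the container's /tmp, or return an empty list if none are visible.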
So maybe we should remove the tool's self-imposed limitation https://github.com/dotnet/diagnostics/blob/3272544fa65aa8c6af5285cecceb01ccb4acb029/src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs#L95C20-L99C22.
If we want collect-linux --probe to remain useful, note that it currently does not require sudo to run (sudo would use the global namespace for process IDs), so users need to be careful to acquire the cross-container PID in the same terminal/namespace in which they run dotnet-trace collect-linux --probe.
collect-linux itself requires the user to have read/write access to tracefs, which can be granted by modifying user group permissions rather than running under sudo. The same caution about acquiring and using the PID in the same terminal/namespace applies there.
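As a rough illustration of the tracefs requirement, the following Python sketch does a coarse pre-flight check for read/write access without sudo. It assumes tracefs is mounted at the conventional /sys/kernel/tracing location, and the function name is hypothetical. A directory-level check like this does not guarantee that every file the tool needs is writable.

```python
import os


def has_tracefs_access(tracefs="/sys/kernel/tracing"):
    """Coarse check: is the tracefs mount point readable and writable by this user?"""
    # os.access tests against the real UID/GID, i.e. the invoking user.
    return os.path.isdir(tracefs) and os.access(tracefs, os.R_OK | os.W_OK)
```

If this returns False for a non-root user, group permissions on the tracefs mount are the first thing to check.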
Moreover, the tooling should be resilient to DiagnosticsClient APIs throwing exceptions.
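One possible shape for that resilience, sketched in Python rather than the tool's C#: treat a failed probe as "unknown" instead of a fatal error, and only refuse to trace on a definite negative answer. All names here are hypothetical, and the `get_supports_user_events` callable stands in for whatever DiagnosticsClient call the tool makes.

```python
import sys
from enum import Enum


class ProbeResult(Enum):
    SUPPORTED = "supported"
    UNSUPPORTED = "unsupported"
    UNKNOWN = "unknown"  # the process could not be reached


def probe_user_events(get_supports_user_events, pid):
    """Run the probe, downgrading any failure to UNKNOWN instead of crashing."""
    try:
        supported = get_supports_user_events(pid)
    except Exception as exc:  # broad on purpose: a probe must never abort the tool
        print(f"warning: could not probe process {pid}: {exc}", file=sys.stderr)
        return ProbeResult.UNKNOWN
    return ProbeResult.SUPPORTED if supported else ProbeResult.UNSUPPORTED


def should_trace(result):
    """Only a definite 'unsupported' answer blocks tracing."""
    return result is not ProbeResult.UNSUPPORTED
```

With this split, a cross-container process whose diagnostic port is unreachable would still be traced, with a warning, instead of failing up front.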