Search for the diagnostics socket in the mount NS of the target process#3539
Search for the diagnostics socket in the mount NS of the target process#3539Jongy wants to merge 1 commit intodotnet:mainfrom
Conversation
| { | ||
| try | ||
| { | ||
| defaultAddress = Directory.GetFiles(IpcRootPath, $"dotnet-diagnostic-{pid}-*-socket") // Try best match. |
There was a problem hiding this comment.
For Kubernetes scenarios with sharedProcessNamespace disabled, wouldn't this preclude the other solution from working? I.e. shared /tmp volume would not work and /proc will not be visible between containers.
There was a problem hiding this comment.
You're right, good point. Since dotnet-trace never actually required the PID to be visibile in /proc, we could always just share the /tmp and use --process-id with the NSpid of the process.
I think what will work is - if this path fails (i.e we don't have a valid socket in the /proc/pid/root/tmp/...nspid... path) we will fall back to the old behavior. Makes sense?
There was a problem hiding this comment.
That seems like it would work but be aware that in the above environment if two containers are both running managed apps as lead processes, they will both have a pid of 1. This conflict also exists for tmp mounting though.
Closes: #3480
This is a PoC for #3480 .
I tested the PoC by:
mcr.microsoft.com/dotnet/sdk:6.0-focal).mainbranch and ran it - I get:Sanity passes.
I had to fix 2 sites:
/proc/pid/root/.../proc/pid/statusfor that.TODOs
/proc/pid/rootby default. This requires privileges (I think it requires to be able to ptrace the process?) and dotnet-trace itself doesn't require root (I actually don't know which permission checks are implemented in the CLR before handling requests on the diagnostics socket. I know that in Java, HotSpot will compare the UID GID of the connecting process to the UID GID of itself). What makes sense, in order to maintain compatibility with the old versions of dotnet-trace, is to fallback to useIpcRootPathif we couldn't use/proc/pid/root. We can also check/proc/pid/ns/mntvs/proc/self/ns/mntto understand if we actually need to move mount namespace, but that requires ptrace permissions as well, so the first check ("can we use /proc/pid/root") is enough IMO./proc/pid/root/...which can fail if the added path contains absolute symlinks. The problem is explained thoroughly and a solution is given in Python here, this will affect profiled containers if their/tmpis e.g a link to/var/tmp. We can resolve it now or in a follow-up PR./proc/pid/status. This is available from Linux 4.1. If we want to support this feature on older kernels, there is a different method to get the NSpid, if we want to support that.dotnet-trace psuses). I don't think there's a way to properly "ps" all dotnet processes by their sockets, because this will require iterating over all mount namespaces? Or over all mount namespaces of alldotnetprocess? IDK. I suggest we skip that./tmpfor the temporary directory. Previous code usedPath.GetTempPathwhich getsTMPDIRfrom env, or defaults to/tmp. Since that's the same path the diagnostics socket will be created in - what makes sense is to implement the behavior ofGetTempPathfor the remote process, i.e read its environment variables and use/proc/pid/root/$TMPDIRifTMPDIRis present, and otherwise fallback to/tmp. Can probably be a follow-up PR but I'm just stating it. Btwdotnet-trace collect --process-idas of now wouldn't have worked on it processes on the same mount namespace if they change theirTMPDIRWhich of these TODOs do you think I should get done for merging it? I think only the first one is needed for MVP, to avoid regressions, and all others are features on top of it. If you agree I can implement it and we can move forward with merging this PR :)