fix: handle extra colons in /proc/self/cgroup entries #35

Merged
kacpersaw merged 1 commit into main from kacpersaw/fix-cgroup-extra-colons
Mar 4, 2026

@kacpersaw
Contributor

Fixes #34

Problem

On some Kubernetes nodes, /proc/self/cgroup contains entries with extra colons in the path:

0::/system.slice/kubepods-burstable-pod72e25f20.slice:cri-containerd:d24f9cc...

currentProcCgroup() split each entry with strings.Split(entry, ":") and required exactly 3 parts. An entry with extra colons in the path produced 4 or more parts, causing a parse error that crashed the agent.

Additionally, the error message wrapped a nil err variable, producing %!w(<nil>) in logs.
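
The `%!w(<nil>)` output can be reproduced in isolation. The following is a minimal sketch, not the project's code: both function names and the message texts are hypothetical, and it only assumes that the original error was built with `fmt.Errorf` and a `%w` verb applied to an error variable that was never assigned.

```go
package main

import "fmt"

// buggyParseError mirrors the assumed faulty pattern: err is never
// assigned, so %w is given a nil error and fmt reports a bad verb.
func buggyParseError(entry string) error {
	var err error // nil: no underlying error exists for a length mismatch
	return fmt.Errorf("parse cgroup entry: %w", err)
}

// fixedParseError reports the malformed entry itself instead of
// wrapping a nil error.
func fixedParseError(entry string) error {
	return fmt.Errorf("malformed cgroup entry %q: expected 3 colon-separated fields", entry)
}

func main() {
	entry := "0:/only-two-fields" // hypothetical malformed entry
	fmt.Println(buggyParseError(entry))
	fmt.Println(fixedParseError(entry))
}
```

The first line printed ends in `%!w(<nil>)`, while the second quotes the offending entry, which is the more useful form in agent logs.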

Fix

  • Use strings.SplitN(..., 3) to limit splitting to 3 parts, preserving extra colons in the path field.
  • Fix nil error wrapping in the parse error message to show the actual malformed entry.
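
The difference between the two calls can be sketched as follows. `parseCgroupEntry` and `cgroupEntry` are hypothetical names used for illustration, not the project's `currentProcCgroup()`, and `example-id` stands in for the truncated container ID in the entry above.

```go
package main

import (
	"fmt"
	"strings"
)

// cgroupEntry holds the three fields of one /proc/self/cgroup line:
// hierarchy-ID:controller-list:cgroup-path.
type cgroupEntry struct {
	HierarchyID string
	Controllers string
	Path        string
}

// parseCgroupEntry (hypothetical) splits on the first two colons only,
// so any further colons stay inside the path field.
func parseCgroupEntry(line string) (cgroupEntry, error) {
	parts := strings.SplitN(line, ":", 3)
	if len(parts) != 3 {
		return cgroupEntry{}, fmt.Errorf("malformed cgroup entry %q: expected 3 fields, got %d", line, len(parts))
	}
	return cgroupEntry{HierarchyID: parts[0], Controllers: parts[1], Path: parts[2]}, nil
}

func main() {
	// "example-id" is a placeholder for the real container ID.
	line := "0::/system.slice/kubepods-burstable-pod72e25f20.slice:cri-containerd:example-id"

	// strings.Split cuts at every colon, so this entry yields 5 parts.
	fmt.Println(len(strings.Split(line, ":")))

	entry, err := parseCgroupEntry(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(entry.Path) // the path keeps its embedded colons
}
```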

Tests

Added 3 test cases:

  • Extra colons in path
  • Extra colons with mixed cgroup hierarchy
  • Malformed entry with too few fields
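
A rough table-driven sketch of these three cases is shown below. It is not the PR's test code: `cgroupPaths`, the case inputs, and the `example-id` placeholder are all assumptions, and the checks run from `main` so that the example is self-contained rather than relying on the `testing` package.

```go
package main

import (
	"fmt"
	"strings"
)

// cgroupPaths (hypothetical) returns the path field of every entry in
// the given /proc/self/cgroup content, or an error on a malformed line.
func cgroupPaths(content string) ([]string, error) {
	var paths []string
	for _, line := range strings.Split(strings.TrimSpace(content), "\n") {
		parts := strings.SplitN(line, ":", 3)
		if len(parts) != 3 {
			return nil, fmt.Errorf("malformed cgroup entry %q", line)
		}
		paths = append(paths, parts[2])
	}
	return paths, nil
}

type testCase struct {
	name    string
	content string
	want    []string
	wantErr bool
}

// runCases executes every case and returns the number of failures.
func runCases() int {
	cases := []testCase{
		{"extra colons in path",
			"0::/a.slice/b.slice:cri-containerd:example-id",
			[]string{"/a.slice/b.slice:cri-containerd:example-id"}, false},
		{"extra colons with mixed hierarchy",
			"1:memory:/a.slice/b.slice:cri-containerd:example-id\n0::/a.slice/b.slice:cri-containerd:example-id",
			[]string{"/a.slice/b.slice:cri-containerd:example-id", "/a.slice/b.slice:cri-containerd:example-id"}, false},
		{"too few fields", "0:/missing-controller-field", nil, true},
	}
	failures := 0
	for _, tc := range cases {
		got, err := cgroupPaths(tc.content)
		ok := (err != nil) == tc.wantErr &&
			strings.Join(got, "\n") == strings.Join(tc.want, "\n")
		if !ok {
			failures++
			fmt.Printf("FAIL %s: got %v, err %v\n", tc.name, got, err)
		}
	}
	return failures
}

func main() {
	fmt.Println("failures:", runCases())
}
```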

Use strings.SplitN with limit of 3 instead of strings.Split so that
extra colons in the cgroup path (e.g. from Kubernetes cri-containerd)
are preserved in the path field rather than causing a parse error.

Also fix nil error wrapping in the parse error message.
@kacpersaw kacpersaw marked this pull request as ready for review March 3, 2026 14:05
@DanielleMaywood
Collaborator

Have we managed to create a reproduction of this issue? My understanding of the source is that the path we extract isn't the full path of the cgroup (although I could be completely misunderstanding, which is why it'd be nice to get a reproduction).

https://github.com/containerd/containerd/blob/cf600abecc27200a3c0e1415cd1f6c325eb05edf/pkg/cri/server/helpers_linux.go#L67-L75

// getCgroupsPath generates container cgroups path.
func getCgroupsPath(cgroupsParent, id string) string {
	base := path.Base(cgroupsParent)
	if strings.HasSuffix(base, ".slice") {
		// For a.slice/b.slice/c.slice, base is c.slice.
		// runc systemd cgroup path format is "slice:prefix:name".
		return strings.Join([]string{base, "cri-containerd", id}, ":")
	}
	return filepath.Join(cgroupsParent, id)
}
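
To see where the colons originate, the quoted function can be run against two parent paths. This is only an illustrative sketch: the function body is copied from the excerpt above, while both parent paths and the `example-id` container ID are made-up inputs.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strings"
)

// getCgroupsPath is copied from the containerd excerpt quoted above.
func getCgroupsPath(cgroupsParent, id string) string {
	base := path.Base(cgroupsParent)
	if strings.HasSuffix(base, ".slice") {
		// runc systemd cgroup path format is "slice:prefix:name".
		return strings.Join([]string{base, "cri-containerd", id}, ":")
	}
	return filepath.Join(cgroupsParent, id)
}

func main() {
	// Hypothetical systemd-style parent: the result is colon-separated
	// and contains only the last slice, not the full parent path.
	fmt.Println(getCgroupsPath("/kubepods.slice/kubepods-podexample.slice", "example-id"))

	// Hypothetical cgroupfs-style parent: the result is a plain joined path.
	fmt.Println(getCgroupsPath("/kubepods/podexample", "example-id"))
}
```

With a parent ending in `.slice`, the returned string uses the `slice:prefix:name` form, which matches the `...slice:cri-containerd:...` shape seen in the reported /proc/self/cgroup entry.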

Collaborator

@DanielleMaywood DanielleMaywood left a comment

This is a known misconfiguration (containerd/containerd#4900), so we may as well handle it cleanly. Approving.

@DanielleMaywood
Collaborator

Once merged, feel free to create a patch release and then bump the version used by coder/coder.

@kacpersaw kacpersaw merged commit a2db32a into main Mar 4, 2026
30 checks passed
@kacpersaw kacpersaw deleted the kacpersaw/fix-cgroup-extra-colons branch March 4, 2026 12:06

Development

Successfully merging this pull request may close these issues.

bug: Agent cgroup parsing error on Kubernetes