feat: add imagePullSecrets support for container-based skills by ppeau · Pull Request #1725 · kagent-dev/kagent

ppeau · 2026-04-21T18:59:20Z

Problem

Container-based skills using krane to pull OCI images had no way to authenticate against private registries (Artifactory, ACR, ECR, etc.). The imagePullSecrets defined on the agent deployment were not passed to the skills-init init container, causing authentication failures like:
No matching credentials were found for "docker.artifactory.dev.example.com"
Error: pulling ...: Authentication is required

Solution

Follows the approach discussed in #1222 by @s10gopal:

Added an imagePullSecrets field under spec.skills accepting a list of kubernetes.io/dockerconfigjson secrets
When imagePullSecrets is set, a new docker-auth-init init container is prepended — it merges all referenced secrets into a single config.json using jq
The skills-init container reads that merged config via the DOCKER_CONFIG env var, which krane picks up automatically when pulling skill images

Changes

go/api/v1alpha2/agent_types.go: add ImagePullSecrets []corev1.LocalObjectReference to SkillForAgent struct
go/api/v1alpha2/zz_generated.deepcopy.go: regenerated DeepCopy for new field
go/core/internal/controller/translator/agent/adk_api_translator.go: buildSkillsInitContainer now returns []Container, prepends docker-auth-init when imagePullSecrets are present
docker/skills-init/Dockerfile: add jq to the Alpine base image
.gitattributes: enforce LF line endings on *.sh.tmpl files (prevents shell script breakage for contributors on Windows)

Usage

apiVersion: kagent.dev/v1alpha2
kind: Agent
spec:
  skills:
    refs:
      - private-registry.example.com/my-org/my-skill:v1
    imagePullSecrets:
      - name: my-registry-secret  # kubernetes.io/dockerconfigjson secret

Testing

Validated end-to-end on a local Kubernetes cluster with a private registry protected by htpasswd authentication:

Skill image hosted on the private registry, inaccessible without credentials
Agent configured with imagePullSecrets referencing a dockerconfigjson secret
docker-auth-init merged the credentials, skills-init pulled the image successfully via krane
Skill was correctly loaded and executed by the agent

Copilot

Pull request overview

Adds first-class support for authenticating OCI pulls for container-based skills by allowing agents to reference kubernetes.io/dockerconfigjson secrets and wiring those credentials into the skills-init workflow.

Changes:

Extend the Agent/SandboxAgent spec.skills schema and Go types with imagePullSecrets.
Update the agent manifest translation to optionally prepend a docker-auth-init initContainer that merges multiple dockerconfigjson secrets into a single Docker config.json, and set DOCKER_CONFIG for skills-init.
Update the skills-init image to include jq, and add unit/e2e coverage for the new behavior.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.


File	Description
helm/kagent-crds/templates/kagent.dev_sandboxagents.yaml	Exposes `spec.skills.imagePullSecrets` in the Helm-rendered SandboxAgent CRD schema.
helm/kagent-crds/templates/kagent.dev_agents.yaml	Exposes `spec.skills.imagePullSecrets` in the Helm-rendered Agent CRD schema.
go/core/test/e2e/invoke_api_test.go	Adds an e2e test verifying `docker-auth-init` is injected and the agent still functions end-to-end.
go/core/internal/controller/translator/agent/manifest_builder.go	Passes `ImagePullSecrets` through and adapts to `buildSkillsInitContainer` returning multiple init containers.
go/core/internal/controller/translator/agent/git_skills_test.go	Adds translator unit tests validating volumes/mounts/env for `imagePullSecrets`.
go/core/internal/controller/translator/agent/adk_api_translator.go	Implements `docker-auth-init`, merge script generation, volume/mount wiring, and `DOCKER_CONFIG` env injection.
go/api/v1alpha2/zz_generated.deepcopy.go	Regenerates DeepCopy to include the new `ImagePullSecrets` field.
go/api/v1alpha2/agent_types.go	Adds `ImagePullSecrets []LocalObjectReference` to `SkillForAgent`.
go/api/config/crd/bases/kagent.dev_sandboxagents.yaml	Updates the base CRD schema for SandboxAgent to include `imagePullSecrets`.
go/api/config/crd/bases/kagent.dev_agents.yaml	Updates the base CRD schema for Agent to include `imagePullSecrets`.
docker/skills-init/Dockerfile	Installs `jq` so the merge init container can build a combined Docker config.
.gitattributes	Forces LF endings for `*.sh.tmpl` to avoid cross-platform template/script breakage.


Copilot · 2026-04-21T19:03:01Z

 	container, skillsVolumes, err := buildSkillsInitContainer(
 		gitRefs,
 		spec.Skills.GitAuthSecretRef,
 		skills,
 		spec.Skills.InsecureSkipVerify,
 		manifestCtx.deployment.SecurityContext,
 		initEnv,
 		getDefaultResources(initResources),
+		spec.Skills.ImagePullSecrets,
 	)
 	if err != nil {
 		return nil, fmt.Errorf("failed to build skills init container: %w", err)
 	}

 	*volumes = append(*volumes, skillsVolumes...)
-	return []corev1.Container{container}, nil
+	return container, nil


buildSkillsInitContainer now returns a slice of containers, but the receiving variable is still named container, which makes the call site harder to read. Consider renaming it to containers (or similar) to reflect the type and avoid confusion.

Copilot · 2026-04-21T19:03:01Z

+		for _, secret := range imagePullSecrets {
+			volName := "pull-secret-" + secret.Name
+			volumes = append(volumes, corev1.Volume{
+				Name: volName,
+				VolumeSource: corev1.VolumeSource{
+					Secret: &corev1.SecretVolumeSource{
+						SecretName: secret.Name,
+					},
+				},
+			})


Volume names are derived from the Secret name ("pull-secret-" + secret.Name). Kubernetes Secret names may contain characters (notably '.') and/or length that are invalid for Pod volume names, which can make the generated Deployment fail admission. Consider generating a safe volume name (e.g., index-based or hashed) and keep the SecretName field pointing at the original secret.
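
One way to act on this suggestion is to hash the Secret name, as sketched below. This is a sketch under assumptions, not the change that landed in the PR: `safeVolumeName` is a hypothetical helper, and the 8-byte truncation of the SHA-256 digest is an arbitrary choice. Secret names are DNS subdomains (dots allowed, up to 253 characters) while volume names must be DNS labels (no dots, at most 63 characters), so a fixed-length hex suffix avoids both problems.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// safeVolumeName (hypothetical helper) derives a Pod volume name from a
// Secret name. The result is always "pull-secret-" plus 16 lowercase hex
// characters, so it contains no dots and stays well under 63 characters.
func safeVolumeName(secretName string) string {
	sum := sha256.Sum256([]byte(secretName))
	return "pull-secret-" + hex.EncodeToString(sum[:8])
}

func main() {
	// A Secret name that is valid for a Secret but not for a volume name.
	fmt.Println(safeVolumeName("docker.artifactory.dev.example.com"))
}
```

The SecretName field of the volume source would still carry the original, unmodified Secret name; only the volume's own name changes.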

Copilot · 2026-04-21T19:03:01Z

+		// Shared EmptyDir volume for the merged Docker config.
+		volumes = append(volumes, corev1.Volume{
+			Name: "kagent-docker-config",
+			VolumeSource: corev1.VolumeSource{
+				EmptyDir: &corev1.EmptyDirVolumeSource{},
+			},
+		})
+
+		// Mount each imagePullSecret as a read-only directory under /docker-secrets/<name>.
+		authInitVolumeMounts := []corev1.VolumeMount{
+			{Name: "kagent-docker-config", MountPath: "/docker-config-out"},
+		}
+		for _, secret := range imagePullSecrets {
+			volName := "pull-secret-" + secret.Name
+			volumes = append(volumes, corev1.Volume{
+				Name: volName,
+				VolumeSource: corev1.VolumeSource{
+					Secret: &corev1.SecretVolumeSource{
+						SecretName: secret.Name,
+					},
+				},
+			})
+			authInitVolumeMounts = append(authInitVolumeMounts, corev1.VolumeMount{
+				Name:      volName,
+				MountPath: "/docker-secrets/" + secret.Name,
+				ReadOnly:  true,
+			})
+		}
+
+		mergeScript := buildDockerAuthMergeScript(imagePullSecrets)
+		dockerAuthInitContainer := corev1.Container{
+			Name:         "docker-auth-init",
+			Image:        DefaultSkillsInitImageConfig.Image(),
+			Command:      []string{"/bin/sh", "-c", mergeScript},
+			VolumeMounts: authInitVolumeMounts,
+		}
+		containers = append(containers, dockerAuthInitContainer)
+
+		// Mount the merged config into skills-init so krane picks it up via DOCKER_CONFIG.
+		volumeMounts = append(volumeMounts, corev1.VolumeMount{
+			Name:      "kagent-docker-config",
+			MountPath: "/.kagent/.docker",
+			ReadOnly:  true,
+		})


imagePullSecrets entries with an empty Name (LocalObjectReference allows this) or repeated names will currently produce invalid SecretVolumeSource.SecretName values and/or duplicate volume names. It would be safer to validate that every reference has a non-empty name and to de-duplicate (or error) before creating volumes/mounts.

Suggested change

	validImagePullSecrets := make([]corev1.LocalObjectReference, 0, len(imagePullSecrets))
	seenImagePullSecrets := make(map[string]struct{}, len(imagePullSecrets))
	for _, secret := range imagePullSecrets {
		if secret.Name == "" {
			continue
		}
		if _, seen := seenImagePullSecrets[secret.Name]; seen {
			continue
		}
		seenImagePullSecrets[secret.Name] = struct{}{}
		validImagePullSecrets = append(validImagePullSecrets, secret)
	}

	if len(validImagePullSecrets) > 0 {
		// Shared EmptyDir volume for the merged Docker config.
		volumes = append(volumes, corev1.Volume{
			Name: "kagent-docker-config",
			VolumeSource: corev1.VolumeSource{
				EmptyDir: &corev1.EmptyDirVolumeSource{},
			},
		})

		// Mount each imagePullSecret as a read-only directory under /docker-secrets/<name>.
		authInitVolumeMounts := []corev1.VolumeMount{
			{Name: "kagent-docker-config", MountPath: "/docker-config-out"},
		}
		for _, secret := range validImagePullSecrets {
			volName := "pull-secret-" + secret.Name
			volumes = append(volumes, corev1.Volume{
				Name: volName,
				VolumeSource: corev1.VolumeSource{
					Secret: &corev1.SecretVolumeSource{
						SecretName: secret.Name,
					},
				},
			})
			authInitVolumeMounts = append(authInitVolumeMounts, corev1.VolumeMount{
				Name:      volName,
				MountPath: "/docker-secrets/" + secret.Name,
				ReadOnly:  true,
			})
		}

		mergeScript := buildDockerAuthMergeScript(validImagePullSecrets)
		dockerAuthInitContainer := corev1.Container{
			Name:         "docker-auth-init",
			Image:        DefaultSkillsInitImageConfig.Image(),
			Command:      []string{"/bin/sh", "-c", mergeScript},
			VolumeMounts: authInitVolumeMounts,
		}
		containers = append(containers, dockerAuthInitContainer)

		// Mount the merged config into skills-init so krane picks it up via DOCKER_CONFIG.
		volumeMounts = append(volumeMounts, corev1.VolumeMount{
			Name:      "kagent-docker-config",
			MountPath: "/.kagent/.docker",
			ReadOnly:  true,
		})
	}

Copilot · 2026-04-21T19:03:02Z

+			Name:         "docker-auth-init",
+			Image:        DefaultSkillsInitImageConfig.Image(),
+			Command:      []string{"/bin/sh", "-c", mergeScript},
+			VolumeMounts: authInitVolumeMounts,


docker-auth-init is created without SecurityContext or resource requirements, while skills-init uses the pod/deployment securityContext and configured resources. This can cause PodSecurity admission failures or unexpected resource usage differences. Consider applying the same initSecCtx and resources (or a deliberate minimal set) to docker-auth-init as well.

Suggested change

 			Name:         "docker-auth-init",
 			Image:        DefaultSkillsInitImageConfig.Image(),
 			Command:      []string{"/bin/sh", "-c", mergeScript},
 			VolumeMounts: authInitVolumeMounts,
+			SecurityContext: initSecCtx,
+			Resources:       resources,

EItanya

Overall this makes sense, but before we go down the road of adding a new API, is there any way we can reuse the ImagePullSecrets that are already used for image pulling, or do those remain on the node without ever being mounted into the pod?

ppeau · 2026-04-29T22:36:53Z

Hi @EItanya, great question. We actually explored reusing the existing deployment imagePullSecrets first before adding a new field.

On the technical side, imagePullSecrets defined on a pod spec are consumed exclusively by the kubelet to pull container images. They are never mounted into the pod or made accessible to running containers. This means krane, executing inside the skills-init init container, has no way to read those credentials. We hit this wall directly during testing.

Even if it were technically possible, there is a design reason why it would not be the right approach in enterprise environments. The registry used to deploy the kagent system and the registry hosting skill images are typically owned by completely different teams with different security boundaries. The platform/ops team manages the kagent deployment and its registry credentials, while the line-of-business team produces and owns the skill images, hosted in their own private registry (Artifactory, ACR, ECR, etc.). Reusing the deployment imagePullSecrets would couple these two security contexts together, violate the principle of least privilege, and make skills effectively unusable for any team whose registry differs from the one used to deploy kagent. That is the common case at scale.

The new imagePullSecrets field under spec.skills directly mirrors the standard Kubernetes pattern where a pod can reference multiple imagePullSecrets for different registries. It introduces no new concept, just applies the same model at the skill level.

Happy to discuss further if needed!

EItanya · 2026-04-30T12:53:39Z

Ok I buy that logic. What do you think about renaming the field to PullSecrets instead of ImagePullSecrets, since these aren't really images?

ppeau · 2026-04-30T15:03:36Z

Good point, I can see both sides here.

For keeping imagePullSecrets: Skills are stored and pulled as OCI artifacts using the exact same kubernetes.io/dockerconfigjson secrets as normal container images. The name imagePullSecrets is the standard Kubernetes convention, so it feels familiar right away. Anyone who’s used Kubernetes already knows what it means and how to set it up.

For renaming to pullSecrets: Skills aren’t executed as containers, they’re more like content or configuration that gets pulled. The original imagePullSecrets name is specifically tied to pulling runnable container images at the pod level, so using it in this context could feel a bit overloaded. pullSecrets is more neutral and probably more accurate here.

Both options are valid.
I’m happy to go with pullSecrets if you prefer it.

Just say the word and I’ll rename the field right away. 👍

EItanya · 2026-05-01T12:57:36Z

Ok I buy that logic, let's stick with it for now. Just resolve merge conflicts and we'll get this merged

ppeau · 2026-05-01T15:09:44Z

Done! ✅
I’ve resolved the merge conflicts.

EItanya · 2026-05-04T12:39:39Z

+func buildDockerAuthMergeScript(imagePullSecrets []corev1.LocalObjectReference) string {
+	var sb strings.Builder
+	sb.WriteString(`set -e
+mkdir -p /docker-config-out
+merged='{"auths":{}}'
+`)
+	for _, secret := range imagePullSecrets {
+		sb.WriteString(`if [ -f /docker-secrets/` + secret.Name + `/.dockerconfigjson ]; then
+  merged="$(printf '%s\n%s\n' "$merged" "$(cat /docker-secrets/` + secret.Name + `/.dockerconfigjson)" | jq -s '.[0].auths * .[1].auths | {"auths": .}')"
+fi
+`)
+	}
+	sb.WriteString(`printf '%s' "$merged" > /docker-config-out/config.json
+`)
+	return sb.String()


Sorry for the continued reviews. Is there any way you could put this into a tmpl file similar to this existing one? In the future I want to move away from these scripts altogether, but I think they're a bit simpler to understand for now.

Perfect, I'll check that out right away. I'll get back to you as soon as it's ready.

Hi @EItanya, done! The docker-auth-init script has been moved to a dedicated docker-auth-init.sh.tmpl file, following the same pattern as skills-init.sh.tmpl (//go:embed + template.Must + typed data struct). All existing tests pass, including the imagePullSecrets ones.

I'll let you resolve the conversation if it looks good to you 👍

EItanya · 2026-05-07T18:24:40Z

Hey there, I'm sorry for going back and forth about this PR, but I have some more questions. It's not clear to me why we need a new container for this. Why can't we run this logic inside the existing skills-init container?

ppeau · 2026-05-07T20:03:40Z

Hi @EItanya, completely fair question. Technically yes, we could add the jq merge at the start of the skills-init script.

We went with a separate docker-auth-init container because the two-container approach was the design that came out of the discussion in #1222, and it felt like a clean separation between auth setup and skill pulling. skills-init.sh.tmpl is already handling git auth, SSH keys, and krane pulls, so adding credential merging on top would blur its responsibility. It also gives better failure isolation: if credentials fail to merge, Kubernetes reports the failed init container by name immediately, without having to parse skills-init logs.

Worth noting too that docker-auth-init only runs when imagePullSecrets is set, so existing deployments without private registries are completely unaffected.

That said, if you strongly prefer consolidating into skills-init, we are happy to do that. Just let us know!

EItanya · 2026-05-07T20:12:07Z

Although I agree with these, I also think that adding a new container comes with its own difficulties that I'd rather avoid, for example adding new SecurityContext and resource requirement options, which clog up the Agent object. As I've mentioned before, we're getting to the point where we really need to turn this skills-init container into a Go program so it's clearer what it's actually doing, and I think it should be responsible for all pieces related to that.

ppeau · 2026-05-07T23:08:44Z

Totally makes sense, thanks for taking the time to explain your reasoning!
I'll consolidate the credential merge logic directly into skills-init and drop the separate docker-auth-init container.

I'll get back to you once it's ready! 👍

Add authentication support for pulling skill images from private registries (Artifactory, ACR, ECR, etc.) by introducing a new imagePullSecrets field under spec.skills.

When imagePullSecrets is set, a docker-auth-init init container is prepended that merges all kubernetes.io/dockerconfigjson secrets into a single config.json using jq. The skills-init container then reads that config via the DOCKER_CONFIG env var, which krane picks up automatically when pulling skill images.

Closes kagent-dev#1222

Signed-off-by: ppeau <patrice.peau@gmail.com>

Signed-off-by: ppeau <patrice.peau@gmail.com>

Previously, when imagePullSecrets were specified, the controller created two init containers: docker-auth-init (to merge dockerconfigjson secrets into a shared EmptyDir volume) and skills-init (to pull OCI/git skills). This commit eliminates the separate docker-auth-init container by embedding the credential merge logic directly into the skills-init shell script template. Each imagePullSecret is now mounted directly on skills-init under /docker-secrets/<name>; the script merges them with jq into /tmp/kagent-docker-config/config.json and exports DOCKER_CONFIG before invoking krane.

Changes:
- Remove docker-auth-init.sh.tmpl and all associated Go code
- Add ImagePullSecrets []string field to skillsInitData
- Render credential merge block at top of skills-init.sh.tmpl
- Mount pull-secret volumes directly on skills-init container
- Update unit tests: assert exactly one init container, no docker-auth-init, no kagent-docker-config EmptyDir volume
- Update E2E test: verify single init container and script content

Signed-off-by: ppeau <patrice.peau@gmail.com>

Copilot AI review requested due to automatic review settings April 21, 2026 18:59

ppeau requested review from EItanya, ilackarms, peterj and yuval-k as code owners April 21, 2026 18:59

Copilot started reviewing on behalf of ppeau April 21, 2026 18:59

ppeau mentioned this pull request Apr 21, 2026

[FEATURE] Container-based skills image download authentication support #1222

Open

3 tasks

Copilot AI reviewed Apr 21, 2026

ppeau force-pushed the feat/skills-with-imagepullsecrets branch from 2129a46 to 58d8a73 Compare April 21, 2026 19:33

EItanya reviewed Apr 29, 2026

ppeau force-pushed the feat/skills-with-imagepullsecrets branch from acde26f to 2689d6c Compare May 1, 2026 14:40

ppeau requested review from iplay88keys, jmhbh and supreme-gg-gg as code owners May 1, 2026 14:40

EItanya reviewed May 4, 2026

ppeau force-pushed the feat/skills-with-imagepullsecrets branch from daaa447 to 9a9ce6e Compare May 5, 2026 15:10

ppeau added 3 commits May 7, 2026 19:10

refactor: move docker-auth-init script to tmpl file

0da774b

Signed-off-by: ppeau <patrice.peau@gmail.com>

ppeau force-pushed the feat/skills-with-imagepullsecrets branch from b1a02db to 1d6b918 Compare May 7, 2026 23:59
