
[BUG] Leaking kind-only resources onto the active kubectl context #1806

@danielorbach

Description


📋 Prerequisites

  • I have searched the existing issues to avoid creating a duplicate
  • By submitting this issue, you agree to follow our Code of Conduct
  • I am using the latest version of the software
  • I can consistently reproduce this issue

🎯 Affected Service(s)

Multiple services / System-wide issue (developer tooling: Makefile, go/Makefile, scripts/kind/, plus the e2e test code under go/core/test/e2e/)

🚦 Impact/Severity

Blocker — these are normal developer commands; nothing in their names hints at "may mutate whichever cluster your kubeconfig happens to point at right now." Anyone running them with a non-kind context active (intentionally or by accident) modifies that cluster instead.

🐛 Bug Description

Several developer-facing entry points run kubectl and istioctl without --context, so the commands target whichever context is currently active in the kubeconfig instead of the kind cluster they describe. The kube-public/local-registry-hosting ConfigMap that landed on a real cluster recently is the smallest possible reproducer of this; the same code paths are how a developer ends up:

  • creating a kagent namespace on a remote cluster (make use-kind-cluster);
  • mutating the local kubeconfig to set that remote context's default namespace to kagent (also use-kind-cluster, via kubectl config set-context --current --namespace kagent);
  • installing the entire MetalLB stack on a remote cluster (scripts/kind/setup-metallb.sh, called by make create-kind-cluster);
  • installing Istio with the demo profile on a remote cluster (make kagent-addon-install, the istioctl install line);
  • port-forwarding services from a remote cluster to local ports (make kagent-cli-port-forward / kagent-ui-port-forward);
  • creating test Agents, MCPServers, ModelConfigs and other CRs in a remote cluster's kagent namespace (make -C go e2e, whose Go test code calls controller-runtime's config.GetConfig() and bare exec.Command("kubectl", ...) — both honor whatever the active context is).

These are all everyday developer flows: standing up a kind cluster, running e2e, port-forwarding the dashboard, exercising the addon stack. Each of them silently follows the active kubeconfig context unless the operator has manually exported KUBECONFIG to a kind-only file beforehand.
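
The failure mode is easiest to see with a dry-run wrapper. The sketch below is illustrative only: `kubectl_kind` is a hypothetical helper (not repo code), and `KIND_CLUSTER_NAME` defaulting to `kagent` is an assumption based on the cluster name used elsewhere in this issue.

```shell
#!/bin/sh
# Illustrative sketch, not repo code: kubectl_kind is a hypothetical wrapper
# that pins every call to the kind cluster's context instead of following
# whatever current-context happens to be. The "kagent" default is assumed.
KIND_CLUSTER_NAME="${KIND_CLUSTER_NAME:-kagent}"

kubectl_kind() {
  # echo instead of exec so this sketch runs without a cluster;
  # drop the echo to execute for real.
  echo kubectl --context "kind-${KIND_CLUSTER_NAME}" "$@"
}

kubectl_kind create namespace kagent
# prints: kubectl --context kind-kagent create namespace kagent
```

Every entry point listed above would route through the explicit `--context` form rather than the bare `kubectl ...` it uses today.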

🔄 Steps To Reproduce

  1. Make a non-kind cluster the active context (e.g., kubectl config use-context against a GKE cluster; kubectl config current-context should report it).
  2. Run make create-kind-cluster.
  3. Observe a local-registry-hosting ConfigMap in the remote cluster's kube-public namespace, with host: "localhost:5001" pointing at the local kind registry. The same run also tries to install MetalLB onto the remote cluster.

The same shape reproduces with make use-kind-cluster (creates the kagent namespace remotely and rewrites the kubeconfig), make kagent-addon-install (installs Istio remotely), and make -C go e2e (creates test-model-config-*, everything-mcp-server-*, and dynamically-named test Agents in the remote kagent namespace).

🤔 Expected Behavior

All kubectl/istioctl invocations issued by these scripts and Make targets should pin to the kind cluster they're operating on (via --context "kind-${KIND_CLUSTER_NAME}"), and make e2e should isolate the test process with a kind-only KUBECONFIG, so a stray current-context cannot cause any of these flows to mutate another cluster.
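
Spelled out, the pinning would look roughly like this dry-run sketch. Flag spellings and service/namespace names are assumptions taken from this issue, not verified against the Makefile; istioctl accepts the same --context flag as kubectl.

```shell
#!/bin/sh
# Dry-run sketch of the pinned invocations; run() echoes instead of executing.
# Replace `echo "$@"` with `"$@"` to run for real. Service and port names
# below are assumptions for illustration, not taken from the repo.
KIND_CLUSTER_NAME="${KIND_CLUSTER_NAME:-kagent}"
KIND_CONTEXT="kind-${KIND_CLUSTER_NAME}"
run() { echo "$@"; }

run kubectl --context "${KIND_CONTEXT}" create namespace kagent
# Name the kind context explicitly instead of --current, so the default
# namespace is set on the kind entry and never on a remote one.
run kubectl config set-context "${KIND_CONTEXT}" --namespace kagent
run istioctl install --set profile=demo -y --context "${KIND_CONTEXT}"
run kubectl --context "${KIND_CONTEXT}" -n kagent port-forward svc/kagent 8081:80
```

The `set-context` line is the one subtle case: it legitimately edits the kubeconfig, but naming the kind context removes the dependence on whatever `--current` resolves to.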

📱 Actual Behavior

The unscoped invocations follow the user's current-context. Confirmed instance: a kube-public/local-registry-hosting ConfigMap was applied to a real GKE cluster after make create-kind-cluster was run while that context was active. The same run would have installed MetalLB there too if the cluster had matching node networking; on a hardened cluster it would have been a noisier failure but still a write attempt.

The same --kube-context / --context kind-$(KIND_CLUSTER_NAME) discipline is already enforced in helm-install*, helm-uninstall, push-test-agent, and the kubectl apply lines inside kagent-addon-install. The scripts and remaining Make targets just hadn't been brought to the same baseline.

💻 Environment

  • OS: macOS / Linux (any host running the Make targets)
  • Application version: main at or after 526107f
  • Kubernetes provider: kind locally, plus whatever remote cluster the developer happens to be authenticated to

🔍 Additional Context

This has been the case since the affected scripts and targets were introduced; it is not a regression. Workaround for users today: write a kind-only kubeconfig to a file and point KUBECONFIG at it (kind get kubeconfig --name kagent prints the kubeconfig YAML to stdout, so redirect it to a file; KUBECONFIG must hold a path, not the YAML itself), or run kubectl config use-context kind-kagent first. The repair is to pin every kubectl / istioctl call to kind-${KIND_CLUSTER_NAME} and have make -C go e2e materialize a kind-only KUBECONFIG before invoking go test.
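
The e2e half of that repair could be sketched as follows. The cluster name, the test path, and the `command -v` guard (which lets the sketch run on machines without kind installed) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: have `make -C go e2e` materialize a kind-only kubeconfig before
# invoking go test. Cluster name and test path are assumed, not verified.
set -eu
KIND_CLUSTER_NAME="${KIND_CLUSTER_NAME:-kagent}"
E2E_KUBECONFIG="$(mktemp)"
trap 'rm -f "${E2E_KUBECONFIG}"' EXIT

if command -v kind >/dev/null 2>&1; then
  # kind prints the kubeconfig on stdout; KUBECONFIG must hold a file path,
  # so redirect to a file rather than command-substituting the YAML.
  kind get kubeconfig --name "${KIND_CLUSTER_NAME}" > "${E2E_KUBECONFIG}"
  # Scoping KUBECONFIG to this one command covers both controller-runtime's
  # config.GetConfig() and the bare exec.Command("kubectl", ...) calls,
  # so the active context is never consulted.
  KUBECONFIG="${E2E_KUBECONFIG}" go test ./core/test/e2e/...
fi
```

Because the variable is scoped to the single `go test` command, the developer's own kubeconfig and current-context are left untouched.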

🙋 Are you willing to contribute?

  • I am willing to submit a PR to fix this issue
