From 6d94b479a8b19591db337680bb7b4a55e0d7574d Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com>
Date: Fri, 8 May 2026 10:58:42 +0000
Subject: [PATCH 1/2] docs: simplify architecture.mdx Cluster Stability Check section

Remove implementation-level details (specific threshold counts, timeouts,
busybox pod internals) from the end-user architecture page. These details
belong in CONTRIBUTING.md. The key concepts (two-phase installation,
stability check before GitOps) are preserved at a level useful to
advanced users.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 docs/src/content/docs/architecture.mdx | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/docs/src/content/docs/architecture.mdx b/docs/src/content/docs/architecture.mdx
index 5167931c6d..b409437d1c 100644
--- a/docs/src/content/docs/architecture.mdx
+++ b/docs/src/content/docs/architecture.mdx
@@ -114,17 +114,11 @@ Installed after a cluster stability check confirms the API server is fully ready
 - **Flux** — GitOps continuous delivery
 - **ArgoCD** — GitOps continuous delivery
 
-Before Phase 2, KSail always performs a **Cluster Stability Check** with three sequential steps:
-
-1. **API server stability** — requires consecutive successful health checks within a 2-minute timeout. The threshold is distribution-aware: 3 checks for Vanilla, K3s, and KWOK (which stabilize faster after webhook registration), and 5 for Talos and VCluster.
-2. **DaemonSet readiness** — verifies all kube-system DaemonSets are ready within a 3-minute timeout. Runs after API server stability, as it does not retry transient transport errors.
-3. **In-cluster API connectivity** *(Cilium only)* — creates a short-lived busybox pod that tests TCP connectivity to the API server ClusterIP (port 443) from within the cluster, with a 2-minute timeout. Only performed for Cilium CNI, where eBPF dataplane programming may lag behind DaemonSet readiness. Skipped for the default (distribution-provided) CNI and Calico.
-
-This check always runs before GitOps engines, even when no Phase 1 components are installed. It prevents race conditions where K3s/K3d clusters report creation success before the API server is fully ready to serve requests. On setups with Phase 1 infrastructure components, it also ensures API connectivity has recovered after those components register webhooks and CRDs.
+Before Phase 2, KSail always performs a **Cluster Stability Check** — verifying API server health, DaemonSet readiness, and (for Cilium) in-cluster connectivity. This prevents race conditions where clusters report creation success before the API server is fully ready. The check always runs even when no Phase 1 components are installed.
 
 ### Detection and Updates
 
-The detector service identifies installed components by querying Helm release history and the Kubernetes API, with additional checks against the Docker daemon where needed. It determines the active distribution, provider, and cluster name from the current kubeconfig context, and distinguishes KSail-managed GitOps resources from unrelated ones so it does not interfere with external GitOps setups.
+KSail detects installed components by querying Helm release history and the Kubernetes API, and determines the active distribution, provider, and cluster name from the kubeconfig context. It distinguishes KSail-managed GitOps resources from unrelated ones to avoid interfering with external GitOps setups.
 
 The diff service classifies update impact as **in-place** (no disruption), **reboot-required** (node reboot), or **recreate-required** (full cluster recreation).
 
From 685afcb2bdbbbc7b63fbffcf662a822379d738fb Mon Sep 17 00:00:00 2001
From: Copilot <223556219+Copilot@users.noreply.github.com>
Date: Fri, 8 May 2026 20:14:12 +0200
Subject: [PATCH 2/2] docs: restore Docker daemon detection clause in architecture.mdx

Re-add the mention that component detection also inspects the Docker
daemon for Docker-based providers, which was accidentally dropped during
the simplification of the Detection and Updates section.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 docs/src/content/docs/architecture.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/src/content/docs/architecture.mdx b/docs/src/content/docs/architecture.mdx
index b409437d1c..db935a79fe 100644
--- a/docs/src/content/docs/architecture.mdx
+++ b/docs/src/content/docs/architecture.mdx
@@ -118,7 +118,7 @@ Before Phase 2, KSail always performs a **Cluster Stability Check** — verifyin
 
 ### Detection and Updates
 
-KSail detects installed components by querying Helm release history and the Kubernetes API, and determines the active distribution, provider, and cluster name from the kubeconfig context. It distinguishes KSail-managed GitOps resources from unrelated ones to avoid interfering with external GitOps setups.
+KSail detects installed components by querying Helm release history and the Kubernetes API (and, for Docker-based providers, inspecting the Docker daemon where needed), and determines the active distribution, provider, and cluster name from the kubeconfig context. It distinguishes KSail-managed GitOps resources from unrelated ones to avoid interfering with external GitOps setups.
 
 The diff service classifies update impact as **in-place** (no disruption), **reboot-required** (node reboot), or **recreate-required** (full cluster recreation).