
feat: fix dependency chain gaps in Cilium and ClusterProfile #46

Merged
patrick-hermann-sva merged 9 commits into main from feat/dependency-chain-fixes on Mar 31, 2026
@patrick-hermann-sva

Summary

  • XCilium: Add gateway-api CRD → GatewayClass and GatewayClass → Gateway Usage resources to prevent race conditions during apply and teardown
  • ClusterProfile: Add FluxInit → Cilium Usage for deletion ordering, gate Gateway on TrustManager readiness (_vbsReady and _tmReady), add clarifying comments for cert-manager reconcile lag, VBS/TM parallelism, and XCilium two-phase spec mutation
  • ClusterProfile README: Update render commands with --extra-resources flag, add k3s example and yq filter examples
  • functions.yaml: Add missing function-environment-configs declaration

Dependency chain (after fix)

Observe RC
    │
    ▼
IP Reservation
    │ (status patched, next reconcile)
    ▼
cert-manager ──────────────┐
    │                      │
    ▼                      ▼
VaultBaseSetup         TrustManager
    │    └──────────────────┘
    │         both ready
    ▼
XCilium (phase 1: install → phase 2: LB → phase 3: Gateway)
    │
    ▼
FluxInit
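
The ordering in the diagram can be checked mechanically. The following is a minimal sketch, not code from this PR: it transcribes the diagram's edges by hand and treats XCilium as a single node, although the commit messages describe its install phase as gated only on observeReady. The names `DEPENDS_ON`, `apply_order`, and `teardown_order` are hypothetical.

```python
from graphlib import TopologicalSorter

# Edges transcribed by hand from the diagram above (an assumption, not
# generated from the composition). Each key depends on the listed nodes.
DEPENDS_ON = {
    "IP Reservation": {"Observe RC"},
    "cert-manager": {"IP Reservation"},
    "VaultBaseSetup": {"cert-manager"},
    "TrustManager": {"cert-manager"},
    "XCilium": {"VaultBaseSetup", "TrustManager"},
    "FluxInit": {"XCilium"},
}


def apply_order(graph):
    """Return one valid creation order: dependencies come before dependents."""
    return list(TopologicalSorter(graph).static_order())


def teardown_order(graph):
    """Return the reverse of the creation order, so dependents go first."""
    return list(reversed(apply_order(graph)))


if __name__ == "__main__":
    print(" -> ".join(apply_order(DEPENDS_ON)))
    print(" -> ".join(teardown_order(DEPENDS_ON)))
```

Reversing the creation order puts FluxInit ahead of XCilium on teardown, which matches the deletion ordering listed in the test plan.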

Test plan

  • crossplane render passes for cilium composition
  • crossplane render passes for cluster-profile composition (kind + k3s examples)
  • Verify new Usage resources appear in render output when gateway is enabled
  • Teardown ordering: FluxInit deleted before Cilium, GatewayClass before gateway-api CRDs

🤖 Generated with Claude Code

sthings user and others added 9 commits March 31, 2026 11:05
Move hardcoded Vault CA bundle, address, PKI role, and policy name
from the cluster-profile composition into a cluster-profile-defaults
EnvironmentConfig. The composition now uses function-environment-configs
to load defaults with a three-tier precedence chain:
spec.vaultBaseSetup.* > EnvironmentConfig > hardcoded fallback.
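
The precedence chain can be illustrated with a small lookup. This is a sketch under assumptions, not the composition's actual template logic: the function `resolve` and the flat key names (`address`, `pkiRole`) are hypothetical, and the values are placeholders.

```python
def resolve(field, claim_spec, env_config, fallback):
    """Pick a value using claim spec > EnvironmentConfig > hardcoded fallback.

    All three arguments are plain dicts here; the key names are illustrative.
    """
    for source in (claim_spec, env_config):
        value = source.get(field)
        if value is not None:
            return value
    return fallback[field]


if __name__ == "__main__":
    fallback = {"address": "https://vault.example.invalid", "pkiRole": "default"}
    env_config = {"address": "https://vault.env.invalid"}
    claim_spec = {"pkiRole": "team-a"}
    print(resolve("address", claim_spec, env_config, fallback))  # from EnvironmentConfig
    print(resolve("pkiRole", claim_spec, env_config, fallback))  # from the claim
```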

VaultBaseSetup and TrustManager are now auto-enabled for all non-kind
clusters when the EnvironmentConfig provides a vault.caBundle.

Also updates README with:
- RemoteCluster prerequisite example
- Minimal k3s ClusterProfile example
- EnvironmentConfig documentation and precedence table
- Updated install instructions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The wildcard cert defaulted to vault-pki issuer when VaultBaseSetup was
enabled, but VaultBaseSetup was gated on certManagerReady — creating a
circular dependency.

Now the wildcard cert uses cluster-ca (self-signed) initially, and only
switches to vault-pki once VaultBaseSetup is actually ready. This allows
cert-manager to become ready first, unblocking the rest of the pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cilium is now deployed in two phases:
- Phase 1 (stage 1): Helm install with only CNI config, gated on
  observeReady. This ensures pods can start before cert-manager,
  VaultBaseSetup, etc. are deployed.
- Phase 2 (stage 5): LB pool + Gateway are enabled on the same XCilium
  once IP reservation and VaultBaseSetup are ready.

Flux is now gated on ciliumInstallReady (CNI working) instead of full
ciliumReady (which includes Gateway), so GitOps starts earlier.

This fixes the deadlock where Cilium was gated behind cert-manager
but cert-manager pods couldn't start without a CNI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ates

- deployCilium boolean: defaults to true from EnvironmentConfig, can be
  disabled per-claim (cilium.enabled: false) or per-environment for
  clusters that already have a CNI
- Cilium LB: enabled independently when IP reservation is satisfied
- Cilium Gateway: enabled independently when VaultBaseSetup is ready
- Three-phase Cilium: install (stage 1) → LB (stage 5a) → Gateway (stage 5b)
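
The independent gates can be sketched as a function from readiness flags to the features enabled on the single XCilium resource. This is an illustration, not the composition code: the function name and dictionary keys are hypothetical, and the assumption that LB and Gateway also require the install phase is mine. The `tm_ready` flag reflects the TrustManager gate added by the last commit in this PR and defaults to True to model this commit's behaviour.

```python
def cilium_features(deploy_cilium, observe_ready, ip_reserved, vbs_ready, tm_ready=True):
    """Return which XCilium phases would be enabled for the given flags."""
    # Stage 1: CNI install, gated only on observation of the remote cluster.
    install = bool(deploy_cilium and observe_ready)
    return {
        "install": install,
        # Stage 5a: LB pool once the IP reservation is satisfied.
        "loadBalancer": bool(install and ip_reserved),
        # Stage 5b: Gateway once VaultBaseSetup (and later TrustManager) is ready.
        "gateway": bool(install and vbs_ready and tm_ready),
    }


if __name__ == "__main__":
    print(cilium_features(True, True, ip_reserved=True, vbs_ready=False))
```

In this model a cluster with `deploy_cilium=False` gets none of the three phases, which corresponds to the skip-Cilium case for clusters that already have a CNI.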

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…able

- Wrap secondary sections (kind pipeline, feature matrix, sub-compositions,
  prerequisites, status fields, examples) in <details> for readability
- Add Cilium Helm values per distribution table
- Add feature auto-enablement per distribution table
- Add skip-Cilium claim example with EnvironmentConfig override

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Read nodeCount from RemoteCluster status and use min(desired, nodeCount)
for Cilium operator replicas. This prevents scheduling 2 replicas on a
single-node cluster where they'd be pointless.
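
The replica calculation is a plain minimum. The sketch below adds one assumption that the commit message does not state: when the node count is missing or below 1, it falls back to the desired value. The function name is hypothetical.

```python
def operator_replicas(desired, node_count=None):
    """Return min(desired, node_count) for the Cilium operator.

    Assumption (not from the commit): if node_count is unknown or < 1,
    keep the desired value rather than scaling to zero.
    """
    if node_count is None or node_count < 1:
        return desired
    return min(desired, node_count)


if __name__ == "__main__":
    print(operator_replicas(2, 1))  # single-node cluster
    print(operator_replicas(2, 3))  # multi-node cluster
```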

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When gatewayAPI.enabled is true, the composition now:
- Installs Gateway API CRDs via the upstream gateway-api Helm chart
- Creates the cilium GatewayClass as a composed Object
- Makes the GatewayClass depend on the Cilium Helm release via a Usage

This eliminates the need to manually install Gateway API CRDs on
each target cluster before the Cilium Gateway can be created.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ions

XCilium: add gateway-api CRD → GatewayClass and GatewayClass → Gateway
Usage resources to prevent race conditions during apply and teardown.

ClusterProfile: add FluxInit → Cilium Usage for deletion ordering,
gate Gateway on TrustManager readiness (CA bundle propagation),
and add clarifying comments for cert-manager reconcile lag,
VBS/TM parallelism, and XCilium two-phase spec mutation.

Also fix functions.yaml (add function-environment-configs), update
README render commands with --extra-resources flag, and add yq
filter examples.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@patrick-hermann-sva patrick-hermann-sva merged commit 0703f9b into main Mar 31, 2026
@patrick-hermann-sva patrick-hermann-sva deleted the feat/dependency-chain-fixes branch March 31, 2026 20:48