The limitation
Referenced ConfigMaps and Secrets (#129) follow a workload to the location that runs it — but a piece of data shared by workloads in two or more locations is only delivered to one of them. Workloads in the other locations wait indefinitely for configuration that will never arrive, and their status reads as a routine "waiting for data" rather than "you've hit an unsupported shape." Nothing in the system detects the conflict, because each piece is following its own rules correctly.
Today this is unreachable — the platform currently offers a single location — so the cost of the limitation is zero right now. This issue exists so multi-location data delivery is solved before the location catalog grows, not after users find it.
Why it happens (current implementation)
The federator creates one PropagationPolicy per city in each hub namespace (ensurePropagationPolicy, named per city code). Each city's policy carries resource selectors for two things: the WorkloadDeployments routed to that city, and all companion ConfigMaps/Secrets in the namespace carrying the referenced-data label. Companions aren't city-scoped — the resolver creates one companion per referenced object, shared by every workload in the namespace that references it — so the companion selectors in every city's policy are identical in both content and precision.
Karmada permits exactly one PropagationPolicy to claim any given resource, and that claim is exclusive and sticky: with two equally-precise matching policies, the engine picks one (effectively alphabetically by policy name) and the companion only ever travels where that policy points. The losing city's cells never receive it; nothing re-evaluates the claim until the winning policy is deleted.
Downstream, the cell-side gate does exactly what it's designed to: it holds Instances until the companions listed for their deployment are present on the cell. In the losing city that list never fills, so Instances wait indefinitely with a routine "waiting for data" status — neither the resolver, the federator, nor the gate can see that a claim conflict upstream is the cause.
In short: workload routing is one-to-one (each deployment, one city) and companion delivery was attached to the workloads' per-city policies — but shared data is many-to-many, and the claiming layer underneath only permits one owner per object.
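The claim behaviour described above can be pictured with a small model. This is a sketch under stated assumptions, not Karmada's code: the `policy` type, the policy names, the city codes `ams` and `lhr`, and the label value are all hypothetical, and the tie-break is modelled as a plain sort by policy name to mirror the "effectively alphabetical" behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// policy is a hypothetical, simplified stand-in for a per-city PropagationPolicy.
type policy struct {
	name     string // policy name (assumed here to embed the city code)
	city     string // destination the policy points at
	selector string // label the policy uses to select companions
}

// claimWinner models the exclusive claim: among the policies whose selector
// matches the companion's label, the first one by name wins.
// It returns false when no policy matches.
func claimWinner(policies []policy, companionLabel string) (policy, bool) {
	var matching []policy
	for _, p := range policies {
		if p.selector == companionLabel {
			matching = append(matching, p)
		}
	}
	if len(matching) == 0 {
		return policy{}, false
	}
	sort.Slice(matching, func(i, j int) bool { return matching[i].name < matching[j].name })
	return matching[0], true
}

// starvedCities lists the consuming cities that never receive the companion.
func starvedCities(policies []policy, companionLabel string, consumers []string) []string {
	winner, ok := claimWinner(policies, companionLabel)
	var starved []string
	for _, c := range consumers {
		if !ok || c != winner.city {
			starved = append(starved, c)
		}
	}
	return starved
}

func main() {
	// Two cities, identical companion selectors: the shape this issue describes.
	policies := []policy{
		{name: "workloads-lhr", city: "lhr", selector: "referenced-data"},
		{name: "workloads-ams", city: "ams", selector: "referenced-data"},
	}
	winner, _ := claimWinner(policies, "referenced-data")
	fmt.Println("delivered to:", winner.city)
	fmt.Println("starved:", starvedCities(policies, "referenced-data", []string{"ams", "lhr"}))
}
```

In this model every consuming city other than the winner ends up in the starved list, which is the state the cell-side gate reports as a routine wait.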
Interim guard (separate, near-term work on #129)
Before #129 leaves draft, admission validation will reject the unsupported shape — one referenced object used by workloads targeting different cities — with a clear message. Sharing within a single city stays supported. This issue is about removing the limitation, not the guard.
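A minimal sketch of the check the guard could perform, assuming the validator can see every workload in the namespace together with its target city and referenced objects. The `workload` type, the `crossCityRefs` function, and the error wording are hypothetical and not taken from #129.

```go
package main

import (
	"fmt"
	"sort"
)

// workload is a hypothetical, simplified view of a deployment at admission time.
type workload struct {
	name string
	city string
	refs []string // names of referenced ConfigMaps/Secrets
}

// crossCityRefs returns an error naming every referenced object used by
// workloads that target more than one city, or nil if the shape is supported.
func crossCityRefs(workloads []workload) error {
	citiesByRef := map[string]map[string]bool{}
	for _, w := range workloads {
		for _, r := range w.refs {
			if citiesByRef[r] == nil {
				citiesByRef[r] = map[string]bool{}
			}
			citiesByRef[r][w.city] = true
		}
	}
	var offenders []string
	for ref, cities := range citiesByRef {
		if len(cities) > 1 {
			offenders = append(offenders, ref)
		}
	}
	if len(offenders) == 0 {
		return nil
	}
	sort.Strings(offenders) // stable message regardless of map iteration order
	return fmt.Errorf("referenced objects shared across cities are not supported: %v", offenders)
}

func main() {
	workloads := []workload{
		{name: "api", city: "ams", refs: []string{"app-config"}},
		{name: "worker", city: "ams", refs: []string{"app-config"}}, // same city: allowed
		{name: "edge", city: "lhr", refs: []string{"app-config"}},   // second city: rejected
	}
	fmt.Println(crossCityRefs(workloads))
}
```

Sharing within one city passes because the set of cities per referenced object stays at size one.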
Direction — open decision
Two candidate paths, both with substantial groundwork already done. Deciding between them (or finding a third) is part of this issue.
- Adopt the federation engine's native union mechanism (Karmada propagateDeps + dependency interpretation): attached bindings inherit the union of all parent workloads' destinations, dissolving the single-owner conflict structurally. A full adoption spike + adversarial source-level review concluded the architecture works. If chosen, the gating items are known: hub upgrade to ≥ v1.15.3 (the deployed v1.15.0 silently mishandles multiple same-kind dependencies; fixed upstream in karmada-io/karmada#6931), webhook protection for the dependency annotation (it becomes security-bearing), corrected migration/rollback procedures, and accepting ongoing ownership of the engine's upgrade cadence from compute's side.
- Platform-level dependency federation (e.g. a Milo federation policy with CEL-declared dependencies): engine-agnostic at the API, owned by the platform layer, available to every service rather than wired by one consumer. Cleaner layering; requires platform-team design and roadmap space.
A hand-maintained "union of destinations" per shared object (reference counting across cities, transition ordering, cleanup) was evaluated and rejected as the bug-prone path — it re-implements the engine's union semantics poorly.
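To make "union of destinations" concrete, the sketch below computes, for each shared object, the set of cities of every workload that references it. It illustrates the target semantics only. It is not a model of Karmada's implementation and not a proposal to hand-maintain the union, and the `destinationsFor` function, workload names, and city codes are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// destinationsFor returns, for each referenced object, the sorted union of the
// cities of every workload that references it. Workloads without a known city
// are skipped.
func destinationsFor(cityByWorkload map[string]string, refsByWorkload map[string][]string) map[string][]string {
	seen := map[string]map[string]bool{}
	for w, refs := range refsByWorkload {
		city, ok := cityByWorkload[w]
		if !ok {
			continue
		}
		for _, r := range refs {
			if seen[r] == nil {
				seen[r] = map[string]bool{}
			}
			seen[r][city] = true
		}
	}
	out := map[string][]string{}
	for r, cities := range seen {
		for c := range cities {
			out[r] = append(out[r], c)
		}
		sort.Strings(out[r])
	}
	return out
}

func main() {
	cityByWorkload := map[string]string{"api": "ams", "edge": "lhr"}
	refsByWorkload := map[string][]string{
		"api":  {"app-config"},
		"edge": {"app-config", "edge-secret"},
	}
	// The shared object should reach both cities; the unshared one only its own.
	fmt.Println(destinationsFor(cityByWorkload, refsByWorkload))
}
```

The computation itself is small; the rejected cost lies in keeping it correct across reference changes, city transitions, and cleanup.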
The spike and review materials (verified v1.15.x behavior, lifecycle and security traps, migration/rollback analysis) inform either path and should be pulled in when this is picked up.
Done when
References