CFP-42904: Rule evaluation order with multiple network policies #84

# CFP-42904: Rule evaluation order with multiple network policies

**SIG:** SIG-Policy

**Begin Design Discussion:** 2025-11-20

**Cilium Release:** 1.19

**Authors:** Blaz Zupan <blaz@google.com>

**Status:** Draft

## Summary

This CFP defines the order of evaluation for network policy rules when Kubernetes ClusterNetworkPolicy (formerly AdminNetworkPolicy), Kubernetes NetworkPolicy, and CiliumNetworkPolicy or CiliumClusterwideNetworkPolicy are all present in a cluster.

## Motivation

ClusterNetworkPolicy is a cluster-scoped custom resource that allows administrators to define network policies that apply cluster-wide, taking precedence over namespaced network policies. It also allows the definition of a baseline tier, which specifies network policies that apply to traffic not matched by any other network policy. ClusterNetworkPolicy assigns strict ordering to the evaluation of policy rules.

There are three levels of ordering:

### Tier

The **Admin tier** takes precedence over all other policies. Policies defined in this tier set cluster-wide security rules that cannot be overridden by the other tiers. If a policy in the Admin tier renders a final decision (Allow or Deny) for a connection, evaluation stops.

Standard Kubernetes v1.NetworkPolicy resources operate at the **NetworkPolicy tier**. These policies always make a final decision for the pods they select. Pods not selected by any v1.NetworkPolicy fall through to the Baseline tier.

The **Baseline tier** provides cluster-wide default policies. Policies in the NetworkPolicy tier take precedence over the Baseline tier.
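
The tier fall-through described above can be sketched as follows. This is an illustrative Go sketch, not Cilium's implementation: the `verdict` type and the placeholder tier functions are hypothetical, and an Admin-tier "Pass" is folded into the no-match case for simplicity.

```go
package main

import "fmt"

// verdict is a hypothetical type for this sketch: a tier either renders a
// final decision (allow/deny) or has no opinion and falls through.
type verdict int

const (
	noMatch verdict = iota // no final decision; continue to the next tier
	allow
	deny
)

// evaluate walks the tiers in precedence order: Admin first, then
// NetworkPolicy, then Baseline. The first tier that renders a final
// decision wins; otherwise the cluster default verdict applies.
func evaluate(tiers []func() verdict, dflt verdict) verdict {
	for _, tier := range tiers {
		if v := tier(); v != noMatch {
			return v
		}
	}
	return dflt
}

func main() {
	// Admin tier has no opinion; the NetworkPolicy tier allows the
	// connection, so the Baseline tier is never consulted.
	v := evaluate([]func() verdict{
		func() verdict { return noMatch }, // Admin tier
		func() verdict { return allow },   // NetworkPolicy tier
		func() verdict { return deny },    // Baseline tier (not reached)
	}, deny)
	fmt.Println(v == allow) // true
}
```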

### Priority

Priority is a value from 0 to 1000 indicating the precedence of a policy within its tier. Lower priority values indicate higher precedence: policies with lower values are evaluated first within the same tier. Admin tier policies always take precedence over NetworkPolicy or Baseline tier policies, regardless of the priority values within those tiers. If multiple policies in the same tier have the same priority and match a connection, the behavior is undefined; the implementation may choose any of the matching policies.

### Rule order

A maximum of 25 rules is allowed per direction (ingress and egress). Within a single ClusterNetworkPolicy object, rules are evaluated in the order they are listed; rules appearing earlier have higher precedence.

## Goals

* Implement ordering that conforms to the ClusterNetworkPolicy specification
* Preserve the semantics of all existing network policies
* Make the interaction between the various network policy types easy to understand

## Non-Goals

* Changing the fundamental behavior of existing policies: this proposal aims to integrate ClusterNetworkPolicy ordering around existing NetworkPolicy semantics, not alter how standard NetworkPolicies function on their own.
* Introducing policy ordering within the standard Kubernetes NetworkPolicy, CiliumNetworkPolicy, or CiliumClusterwideNetworkPolicy APIs. The priority is an internal implementation detail to handle tiering, not a proposed change to the Kubernetes APIs.
* Deprecating or replacing CiliumNetworkPolicy or CiliumClusterwideNetworkPolicy: these policy types will continue to be supported within the NetworkPolicy tier.

## Proposal

The interaction between ClusterNetworkPolicy and v1.NetworkPolicy is well defined in the ClusterNetworkPolicy CRD and is described above.

CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy do not support explicit rule ordering. Currently, their rules are ordered implicitly: Deny rules take precedence over Allow rules, and more specific L4/L7 rules take precedence over less specific ones. Otherwise, all rules effectively have the same priority. To maintain backward compatibility, CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy will be placed within the NetworkPolicy tier.

> **Member:** nit, I think this is a bit simplified and glosses over some of the nuance of how these rules work. I'm not sure whether the subtlety is material to the overall CFP, but I'll clarify just in case. Implication of the last rule is that if there is an (L3+)L4+L7 rule with a fairly limited "allow" (such as only allow I'm not sure if I would necessarily describe that as "L7 rules take precedence" rather than "allow rules ensure that traffic described by the allow rule is allowed, unless otherwise explicitly denied by another rule" + "the presence of an L7 match field in the policy rule enables all traffic on the same port+proto to be subject to L7 processing, including metrics and events generation". (From a pure API design perspective these side effects / implicit behaviors are not perfectly captured. They are important to the desired operation of Cilium, though, so breaking them could cause surprising behavior.)

> **Member:** I'll note that this behaviour is the case today with any mix of CNP, CCNP and KNP policies. There could be a CiliumNetworkPolicy or CiliumClusterwideNetworkPolicy rule with an L7 allow, then another overlapping allow in a Kubernetes NetworkPolicy, and this would cause traffic on the overlapping L3/L4 to generate L7 metrics, potentially with specific URLs/methods/etc.

> **Contributor:** (Footnote: this whole time, I was operating under the assumption that an L7 allow "cut off" an L4 allow. I just verified and, yes, an overlapping L7 and L3 will compound.)

> **Member:** I'll start a fresh thread here, but it's relating to the observability use case and is related to the L7 precedence discussion earlier. Today, if a user wants to gain L7 visibility into their environment, we have this documentation: https://docs.cilium.io/en/stable/observability/visibility/#layer-7-protocol-visibility. If you create these rules then Cilium will emit events with Layer 7 information. I infer from this proposal that KCNP would take precedence over CNP/CCNP, and that "allows" at the higher precedence would not overlap with the L7 allows at the lower precedence (i.e. CNP/CCNP). This would mean that if a cluster administrator creates an explicit allow rule (as opposed to a "pass" rule), this could disrupt L7 visibility that a namespace owner creates for their own purposes. Probably the implication is just that we should recommend cluster administrators use "allow" sparingly in the Admin tier, preferring "pass" statements instead, and leave allow/deny to the baseline tier. Most critically, however, this means that if a cluster admin created a KCNP with an allow for DNS traffic, this could break Cilium's toFQDNs rules unless we otherwise cover this use case. I recognize that KCNP provides some ability to match on domain names; however, in Cilium's case we must somehow identify DNS traffic, ensure the DNS server is trusted, and then process the DNS responses in order to enable the domain-based allows to work. Until now, this is all configured through CNP or CCNP.

> **Contributor (Author):** Given the KCNP semantics, I expect it to be mostly used for DENY rules. As you noted, instead of ALLOW the admin would use PASS to cede control to lower-priority policies. A user who wishes to use L7 policies should probably just stick to CCNP and forget about KCNP.

> **Member:** I think that's more or less fine, modulo domainNames handling in KCNP. We need to think about L7 to support that API field.

> **Contributor:** Indeed, I would expect that ALLOW in any sort of administrative tier would be used for required access such as security scanners. We should recommend in the documentation that ALLOW be generally restricted to the baseline tier.

[PR 42784](https://github.com/cilium/cilium/pull/42784) will implement the capability to explicitly order network policy rules by introducing a new *Priority* field in the internal PolicyEntry struct. Policies with lower numeric Priority values will have higher precedence. The default priority value will be 0, representing the highest precedence. To ensure predictable ordering, we will internally assign explicit priority values to all policy rules, including those that do not natively support explicit priorities.

The priority field is an *int32*, but only the lower 24 bits are usable; the upper 8 bits are reserved by the dataplane for mapping the ProxyPort precedence. ClusterNetworkPolicy supports a maximum of 1001 priorities (0-1000) with a maximum of 25 rules in each policy, giving a theoretical maximum of 25,025 distinct priorities.

There will be four numeric priority ranges: one for the Admin tier, one for the regular tier (Kubernetes NetworkPolicy and CNP/CCNP), one for the Baseline tier, and one for the explicit default allow or deny.

| Tier          | Priority range |
| ------------- | -------------- |
| Admin         | 0-25024        |
| NetworkPolicy | 100000         |
| Baseline      | 200000-225024  |
| Default       | 300000         |

The specific ranges are chosen to provide clear separation and ample space for future expansion:
* The ClusterNetworkPolicy spec allows for priorities 0-1000, with up to 25 rules per policy. Reserving a range of 100,000 for the Admin and Baseline tiers provides a simple mapping (as shown below) and room for growth.
* The NetworkPolicy and Default tiers do not have explicit priorities within them, but a large range is reserved for consistency and internal use.
* CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy will be assigned to the NetworkPolicy tier. This means that Admin tier policies will always override CiliumClusterwideNetworkPolicy rules.
* The priority numbering is purely internal to Cilium and only exists in memory within the Cilium agent. It is thus possible to change these allocations at any time without any external impact.
* While it would be possible to split the ranges on binary boundaries, there does not seem to be a compelling reason to do so.
* The priority mapping is applied independently for ingress and egress rules.
* The Default tier will only contain a single "allow all" or "deny all" fallback rule.
* The proposed priority range is well under the supported maximum (2^24 - 1).

Each ClusterNetworkPolicy rule will be mapped to a Cilium priority by multiplying the ClusterNetworkPolicy priority by 25 and adding the zero-based relative order of the rule within its policy, plus the tier's base offset. For example:

> **Contributor:** In the next day or so, I will be merging a PR that makes priority a float. That means it can be negative or decimal. It also adds a separate ordering pass, meaning that the internal datapath ordering can be completely distinct from any user-facing API.

| Tier          | Priority | Rule # | Calculation                 | Internal priority |
| ------------- | -------- | ------ | --------------------------- | ----------------- |
| Admin         | 0        | 1      | 0 * 25 + 1 - 1              | 0                 |
| Admin         | 0        | 25     | 0 * 25 + 25 - 1             | 24                |
| Admin         | 1        | 1      | 1 * 25 + 1 - 1              | 25                |
| Admin         | 5        | 2      | 5 * 25 + 2 - 1              | 126               |
| Admin         | 12       | 7      | 12 * 25 + 7 - 1             | 306               |
| Admin         | 1000     | 25     | 1000 * 25 + 25 - 1          | 25024             |
| NetworkPolicy | N/A      | 1      | 100000                      | 100000            |
| NetworkPolicy | N/A      | 10     | 100000                      | 100000            |
| Baseline      | 0        | 1      | 200000 + 0 * 25 + 1 - 1     | 200000            |
| Baseline      | 1        | 1      | 200000 + 1 * 25 + 1 - 1     | 200025            |
| Baseline      | 50       | 3      | 200000 + 50 * 25 + 3 - 1    | 201252            |
| Baseline      | 1000     | 25     | 200000 + 1000 * 25 + 25 - 1 | 225024            |
| Default       | N/A      | N/A    | 300000                      | 300000            |
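
The mapping above can be expressed as a small function. This is a Go sketch using the base offsets proposed in this CFP; the function and constant names are illustrative and are not Cilium's internal API.

```go
package main

import "fmt"

// Tier base offsets as proposed in the CFP. These values are internal to
// the Cilium agent and may change without external impact.
const (
	adminBase    = 0
	netpolBase   = 100000
	baselineBase = 200000
	defaultBase  = 300000

	rulesPerPolicy = 25 // maximum rules per direction in ClusterNetworkPolicy
)

// internalPriority maps a tier base offset, a ClusterNetworkPolicy priority
// (0-1000), and a one-based rule index (1-25) to an internal priority.
// Lower values take precedence.
func internalPriority(tierBase, policyPriority, ruleIndex int) int {
	return tierBase + policyPriority*rulesPerPolicy + ruleIndex - 1
}

func main() {
	fmt.Println(internalPriority(adminBase, 5, 2))     // 126
	fmt.Println(internalPriority(baselineBase, 50, 3)) // 201252
	fmt.Println(internalPriority(adminBase, 1000, 25)) // 25024
}
```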

The ClusterNetworkPolicy specification mandates that the behavior of policies with the same priority is undefined. For the NetworkPolicy tier, the existing precedence rules (deny over allow) remain in force even though all rules in that tier share the same internal priority. For simplicity, Cilium will apply the same rules to the other tiers as well.
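
Combining the internal priority with deny-over-allow tie-breaking yields a total order over rules. A minimal sketch (the `rule` type is hypothetical, not the actual PolicyEntry struct):

```go
package main

import (
	"fmt"
	"sort"
)

// rule is a simplified stand-in for an internal policy entry.
type rule struct {
	name     string
	priority int  // internal priority; lower wins
	deny     bool // deny takes precedence at equal priority
}

// orderRules sorts rules so that lower internal priority comes first,
// and deny rules come before allow rules at the same priority.
func orderRules(rules []rule) {
	sort.SliceStable(rules, func(i, j int) bool {
		if rules[i].priority != rules[j].priority {
			return rules[i].priority < rules[j].priority
		}
		return rules[i].deny && !rules[j].deny
	})
}

func main() {
	rules := []rule{
		{"np-allow", 100000, false},    // NetworkPolicy tier
		{"admin-deny", 25, true},       // Admin tier, priority 1, rule 1
		{"np-deny", 100000, true},      // NetworkPolicy tier
		{"default-deny", 300000, true}, // Default fallback
	}
	orderRules(rules)
	for _, r := range rules {
		fmt.Println(r.name)
	}
	// Prints: admin-deny, np-deny, np-allow, default-deny
}
```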

## Impacts / Key Questions

### Key Question: Should CiliumClusterwideNetworkPolicy be able to override ClusterNetworkPolicy rules?

### Option: Increase the priority of CiliumClusterwideNetworkPolicy

> **Member:** I'm not confident enough to argue that this is the right interpretation (yet?), but one idea I had early on during this effort was that CCNP should just be considered the highest precedence. This way, deny or L7 processing rules (such as for a DNS server) could remain in place whether you use KCNP or not, and would not change behavior based on the content of KCNP rules. This doesn't solve all the concerns I highlighted elsewhere, but it would mitigate some potential usability flaws if the user wanted to use both (or even migrate towards KCNP).

> **Contributor (Author):** If CCNP is highest precedence, then by definition it is higher than CNP. This then breaks compatibility with existing behavior.

> **Member:** Just to help share understanding, when you say "breaks compatibility", do you have an example ruleset in mind of the kind of breakage you expect?

> **Contributor:** A configurable tier for CCNP seems like a logical next step.

#### Pros

* CCNP rules can override ClusterNetworkPolicy and Kubernetes NetworkPolicy rules.

#### Cons

* Backwards incompatible. Currently CCNP has the same priority as Kubernetes NetworkPolicy and CiliumNetworkPolicy.

> **Member:** OK, I guess this is where the earlier discussion comes into play. The merging behavior of L7 rules between CNP and CCNP implies the same precedence level, so if CCNP was *moved* to a higher precedence level without retaining the rules at the same precedence level as the CNP/KNP rules, then this behavior would be broken.

> **Member:** (I emphasized *moved* here as I was pondering whether there's another alternative path where CCNP rules are *copied* to apply both at the highest precedence and at the same precedence as NP/CNP... an interesting thought whether that might help to solve some API compatibility concerns raised in other threads. More thought necessary to decide if that's helpful or viable. Definitely more complicated, which I don't love... but do I love it less than breaking compatibility? I'm not sure.)

> **Member:** After some further thought, "copying" doesn't solve the problem I was concerned about, which is ensuring that the merging of L7 properties performs the same as today. Probably there's too much complexity with that option to seriously consider it.

* The relative priority vs. ClusterNetworkPolicy is unclear. Should it be higher, lower, or equal?

> **Member:** One aspect is this: today, if a cluster administrator wishes to implement a blanket denylist, they can create that denylist with CCNP and there is no way to override it. You could imagine using this for DoS mitigation, geolocation compliance, or just general network segmentation use cases. The implication of putting Kubernetes ClusterNetworkPolicy (KCNP) at a higher precedence than CCNP is that it is now possible to override the intent of the CCNP. If we are requiring that the persona who manages CCNP and KCNP is the same persona with the same threat model and so on, then I could imagine we simply document this to say that it is your responsibility as a cluster administrator to understand this difference, and that if you opt to allow KCNP resources to be created in the cluster, then you are also responsible for (a) locking access down and (b) ensuring the cluster matches your intended deny posture.

> **Contributor (Author):** I would not expect a cluster to have both CCNP and KCNP. Existing users of CCNP can continue using that, if they wish. New users can choose KCNP and skip CCNP altogether. Combining both in a single cluster invites trouble, as the complexity of determining the actual effective policy becomes quite high. I expect KCNP and CCNP to co-exist only temporarily during a migration from CCNP to KCNP. If we really want to encourage permanent CCNP and KCNP co-existence, then there are several solutions:
>
> Given that the target audience of KCNP is largely the same as CCNP, is there truly an issue here?

> **Member:** Given that the personas are the same, I think this is more of a docs & user education issue than a technical issue to solve. I'm inclined to agree that mixing KCNP+CCNP at the same time will introduce unnecessary complexity to both maintenance and usage in the clusters.

> **Member:** Raising a point that was previously discussed in the policy sync: I'm curious how a service provider may be able to implement a blanket denylist in this model, while not impeding a cluster admin's choices for policies. Today, CCNP deny policies effectively are a blanket deny. Is it equivalent to have one or more Admin tier policies of priority 0? Perhaps an alternative way to phrase my question is: what happens if there are conflicting rules of equal priority? Does deny take precedence?

> **Contributor:** Deny still takes precedence at the same priority level.

> **Member:** Some service providers are using the command-line flag to preload high-priority deny policies. I think that's probably a reasonable mechanism. I understand this doesn't make the policies viewable via standard k8s mechanisms, but unless we invent yet another policy resource just for service providers, I don't think we could set appropriate RBAC for that. Probably that is more effort than it's worth if SPs can just sideload the policies and restrict tampering with the cilium-agent daemonset.

## Future Milestones

This CFP will enable the introduction of rule priorities, which are required to support AdminNetworkPolicy.

> One question came up during the community meeting today: is the intended persona for all Kubernetes ClusterNetworkPolicy resources also the cluster-wide administrator?

> Yes, but there could be multiple admins. That's one of the reasons why the CNPs are prioritized.