CFP-42453: Oracle Cloud Infrastructure (OCI) Cloud Provider Design #78
trungng92 wants to merge 1 commit into cilium:main
Signed-off-by: Trung Nguyen <trung.tn.nguyen@oracle.com>
antonipp
left a comment
Thank you for the write-up!
Just to give some perspective, as a Cilium user running self-managed Kubernetes on 3 Cloud Providers in parallel, I'd really prefer a solution consistent with the UX we have in other Cloud Providers, meaning Option 1 ("Kubernetes IPAM solution"), like in GCP, or Option 2 ("In-tree Extending the Cilium Operator"), like in AWS and Azure.
I think Option 3 is just a more complex Option 1, and I am not sure it's worth it. Option 4 ("Delegated IPAM") makes sense from the code maintenance perspective for Isovalent (and I know they've been pushing it, like here: cilium/cilium#34604 (comment)).
However, as a user, it really makes life much harder: the delegated IPAM model adds way more complexity. The setup is different from what we are running in other Cloud Providers, and there are more moving parts. It will be harder to debug and to operate, because we'd need to run this additional IPAM management system alongside the regular Kube and Cilium components. The failure modes will be different as well: for example, all Cloud Provider API interactions will now be made in a distributed fashion instead of being centralized in the Operator.
Moreover, the current Operator Cloud Provider-specific IPAM code mostly relies on core IPAM logic which has been tested for years, has many edge cases already figured out, and is still maintained. The only parts which are less maintained are the Cloud Provider interface implementations. However, the code there is not too complex (it's just implementations of the InstancesManager and Node interfaces). And I think relying on support from large enterprise users is not such a big issue.
### Key Question: Cloud Provider Support Model
What does the support model look like for Cloud Providers for integrated pieces, like the Cilium Operator?
I will let Cilium maintainers correct me, but what I've seen running Cilium on multiple Cloud Providers over the years is the following:
- For the Cilium AWS Operator, AWS itself doesn't seem to be involved (I believe they prefer to invest into https://github.com/aws/amazon-vpc-cni-k8s). So the burden of maintaining the AWS Operator falls on Isovalent + the users. We (Datadog) have been very active in maintaining the Operator because we heavily rely on it. This seems to be the case for other big users as well, such as Palantir.
- For Azure, I believe Azure used to maintain the Operator but then decided to move to the Delegated IPAM model so that they can maintain their code out of tree: https://github.com/Azure/azure-container-networking. We still use the Azure Operator and are not planning to switch to the Delegated IPAM model, so we actively maintain it in Cilium. I don't really know if there are any other big users.
- For GCP, they have implemented the first option you listed in this doc ("Kubernetes IPAM solution"), so they maintain their code in CCM: https://github.com/kubernetes/cloud-provider-gcp/tree/master/cmd/cloud-controller-manager
There's also the Alibaba Cloud integration, but I don't really have context around it.
I don't recall any Azure engineers being involved in the Azure IPAM implementation in Cilium. I believe the story was similar for both AWS and Azure with a combination of Isovalent and community user participants in developing the implementation.
On the topic of the highlighted text from this thread, what do you mean by "support model"?
I don't think Azure contributed to the Operator, our integrations started with delegated IPAM (I have worked on this from the Azure side since it was just an idea we had).
@antonipp have you had problems with delegated IPAM? Can you elaborate on your experience with delegated IPAM wrt this:
However, as a user, it really makes life much harder: the delegated IPAM model adds way more complexity.
Azure delegated IPAM is only available in AKS, where we manage everything. I expect OCI OKE would offer a similar managed experience. It's not intended for a self-managed cluster, but only because it requires special access to the networking control plane that is only available in AKS, so that we can offer advanced fabric integration (e.g. for Azure Overlay networking).
However, I built lots of the delegated IPAM implementation for AKS and I don't think it's more complex than operating Cilium already is. We run one additional daemonset that drops the azure-ipam binary and the CNI conflist that plugs it in. It would be trivial for you to install, operate, and debug, if you can do these for Cilium already.
For me, there's a clear separation of responsibilities - Cilium does the node-local CNI things it's good at, and the cloud provider owns the tight native integration. Doing this through standard interfaces is the only way that makes sense - Cilium talking to the cloud provider(s) directly is never going to be maintained in a way that makes everyone happy.
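To make the "CNI conflist that plugs it in" concrete, here is a minimal sketch of what such a configuration could look like, generated from Python. The `ipam.type` delegation field follows the CNI network configuration format; the `cniVersion`, the network name, and the helper function `build_delegated_ipam_conflist` are assumptions for illustration, not the actual file shipped by AKS or OKE.

```python
import json


def build_delegated_ipam_conflist(ipam_plugin: str, name: str = "cilium") -> dict:
    """Build a CNI conflist in which the Cilium CNI plugin delegates IPAM.

    Hypothetical helper: the values below are illustrative assumptions.
    """
    if not ipam_plugin:
        raise ValueError("ipam_plugin must name a CNI IPAM binary")
    return {
        "cniVersion": "0.3.1",  # assumed version, adjust to the runtime in use
        "name": name,
        "plugins": [
            {
                # Cilium handles the node-local datapath ...
                "type": "cilium-cni",
                # ... and asks a separate binary for the pod IP.
                "ipam": {"type": ipam_plugin},
            }
        ],
    }


# The binary name comes from the comment above; a daemonset would drop both
# the binary and this file onto every node.
conflist = build_delegated_ipam_conflist("azure-ipam")
print(json.dumps(conflist, indent=2))
```

An OCI equivalent would presumably swap in its own IPAM binary name while leaving the Cilium side unchanged, which is the separation of responsibilities described here.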
For context, we have not tried running delegated IPAM in our infra because, as you mentioned, it looks like it wasn't designed for self-managed clusters but only for AKS. I thought about trying it out because at one point there were talks about deprecating the Azure Operator IPAM model which we were using. The only reason why delegated IPAM looked more complex to me is that it indeed requires more moving parts, i.e. the daemonset + the binary, and it makes things quite different from what we already have in AWS and Azure, where it's exactly the same model with the Cilium Operator managing everything.
Cilium does the node-local CNI things it's good at, and the cloud provider owns the tight native integration.
This was my original thought as well, and is kind of how the initial (non multi-vnic) solution will work, where OCI OKE will attach the IPs, give them to Cilium, and let Cilium do all of its "node-local CNI things". The delegated IPAM would have a similar frame of mind, where the IPAM would have the ability to select which IP to use (e.g. which VNIC to pick an IP from), and pass that onto Cilium.
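As a rough illustration of the "select which VNIC to pick an IP from" idea, the sketch below models a delegated IPAM choosing an address and handing it on. Everything here is hypothetical: the function `pick_ip`, the VNIC names, the addresses, and the fallback policy are assumptions, not OCI or Cilium behavior.

```python
import ipaddress


def pick_ip(vnics, preferred=None):
    """Return (vnic_name, ip), trying the preferred VNIC first.

    `vnics` maps a VNIC name to its list of free IPs. Falling back to other
    VNICs in name order is an assumed policy chosen for this sketch.
    """
    order = [preferred] if preferred in vnics else []
    order += sorted(name for name in vnics if name != preferred)
    for name in order:
        if vnics[name]:
            # Remove the address from the free list so it is not reused.
            return name, vnics[name].pop(0)
    raise RuntimeError("no free IPs on any VNIC")


vnics = {
    "vnic-0": [ipaddress.ip_address("10.0.0.10")],
    "vnic-1": [ipaddress.ip_address("10.0.1.10"), ipaddress.ip_address("10.0.1.11")],
}

# The IPAM picks the address; the result is what would be passed on to Cilium.
choice = pick_ip(vnics, preferred="vnic-1")
print(choice)
```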
Although I think DRA (mentioned in the thread) may provide a way for users to specify a multi-nic approach when it provides consumable-capacity (using DRA, a pod could request 2 different VNIC devices).
I think @antonipp summarized the state pretty well 👍 The one thing I would add around option (2) is just that it has been a struggle to find people willing to help maintain IPAM code. The cloud SDKs are large dependencies that we have minimal understanding about, and I see that as a maintenance burden and risk for the project as we try to keep them up to date. I'm not excited about the idea of adding yet another cloud SDK.
Thanks for the replies. What @antonipp discussed is what I was looking for. Essentially, in a solution where there's an in-tree OCI integration with Cilium, who becomes responsible for the OCI integration? Cilium? (Of course I wouldn't want to put additional burden on your team 🙂) OCI? The community?
In a typical integration between two services (Service A and Service B), ideally:
But I know this doesn't always happen in the real world (as @antonipp gave good examples for). Also, it's not a 1:1 model, it's a 1:many model. I can understand that testing every cloud provider for every feature would be difficult for Isovalent, which could lead to features getting stale or left behind for certain cloud providers.
I am strongly leaning towards attaching a CIDR block and updating the
The main drawback is that it isn't compatible with a multi-NIC solution. At the same time though, there don't seem to be any "easy" solutions for multi-NIC use cases, and any multi-NIC solution will require a deeper in-tree or out-of-tree integration.
Aside from cloud providers implementing multi-NIC, does Cilium have any expectations around NIC usage? Is the default behavior/expectation that Cilium will just use the default ip route as specified by the OS?
Cc: @rbtr
We've also been exploring multi-NIC and so far are only offering it via AzCNI because we didn't want to dump a bunch of impl specific code in Cilium and there was no standard contract for it. But isn't this the promise of DRA(NET) and NRI?
I don't have experience with NRI, but we have looked into DRA a bit. For anyone who needs background on Dynamic Resource Allocation, the idea is that you can add "attributes" to devices (e.g. NICs), and then in your pod you can request devices that meet specified attribute requirements. It does seem useful for multi-NIC cases, although it might require the consumable capacity feature to be usable (multiple pods connecting to the same VNIC). Perhaps the answer to generic multi-NIC support for Cilium should be to wait for this feature to be available.
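The attribute-matching idea, and why consumable capacity matters for VNICs, can be sketched with a toy model. This is not the Kubernetes DRA API: the `Device` class, the `allocate` function, the `subnet` attribute, and the capacity numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    attributes: dict
    capacity: int  # how many pods may share this device (1 = exclusive)
    consumers: list = field(default_factory=list)


def allocate(devices, pod, required):
    """Bind `pod` to the first device that matches every required attribute
    and still has free capacity. Returns the device name, or None."""
    for dev in devices:
        matches = all(dev.attributes.get(k) == v for k, v in required.items())
        if matches and len(dev.consumers) < dev.capacity:
            dev.consumers.append(pod)
            return dev.name
    return None


devices = [
    Device("vnic-0", {"subnet": "frontend"}, capacity=2),  # shareable VNIC
    Device("vnic-1", {"subnet": "backend"}, capacity=1),   # exclusive VNIC
]

first = allocate(devices, "pod-a", {"subnet": "frontend"})
second = allocate(devices, "pod-b", {"subnet": "frontend"})
third = allocate(devices, "pod-c", {"subnet": "frontend"})
print(first, second, third)
```

With `capacity=1` on every device, the second pod requesting the frontend VNIC would get nothing, which is the limitation the consumable capacity feature is meant to lift.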
Cilium Issue Link
As discussed during the Cilium Weekly Community meeting, this is a CFP that starts a discussion on various possible integrations with Cilium and will ultimately help determine which solution works best for OCI.