
Commit 0f88627

This commit adds sequence diagrams that illustrate the end-to-end integration of the DpuNetwork CR with the Disruptive option, covering DpuNetwork creation, update, and deletion, NF provisioning, and pod creation flows.
Signed-off-by: Alkama Hasan <alkamah@marvell.com>
1 parent 1ab76b5 commit 0f88627

10 files changed

Lines changed: 679 additions & 0 deletions

doc/dpuNetworkCR_Create.pdf

32.8 KB
Binary file not shown.

doc/dpuNetworkCR_Delete.pdf

14.9 KB
Binary file not shown.

doc/dpuNetworkCR_Update.pdf

15.5 KB
Binary file not shown.

doc/dpunetwork_cr_create.puml

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
@startuml dpunetwork_cr_creation

actor user
box "Kubernetes Control Plane"
participant k8s_api
participant dpu_network_controller
participant configmap as "ConfigMap\ndpu-device-plugin-config"
end box

box "Host Node"
participant kubelet_host as "kubelet (Host)"
participant dpu_daemon_host as "dpu-daemon (Host)\n(Device Plugin Manager + Device Plugin)"
participant vsp_host as "vsp (Host)"
end box

box "DPU Node"
participant kubelet_dpu as "kubelet (DPU)"
participant dpu_daemon_dpu as "dpu-daemon (DPU)\n(Device Plugin Manager + Device Plugin)"
participant vsp_dpu as "vsp (DPU)"
end box

autonumber

== DpuNetwork CR Creation (Multiple Networks) ==

user -> k8s_api: Create DpuNetwork CR 1
activate k8s_api
note right: **DpuNetwork 1: "dpu-network-1"**\n\napiVersion: networking.example.com/v1\nkind: DpuNetwork\nmetadata:\n name: dpu-network-1\nspec:\n nodeSelector:\n matchLabels:\n node-role: dpu-node\n dpuSelector:\n matchExpressions:\n - key: dpu-type\n operator: In\n values: ["IPU Adapter E2100"]\n - key: vfId\n operator: In\n values: ["0-3", "5-7"]\n IsDisruptive: true

k8s_api -> dpu_network_controller: Reconcile Event
activate dpu_network_controller

== DpuNetwork Controller Reconciliation ==

dpu_network_controller -> k8s_api: List Nodes (match nodeSelector)
activate k8s_api
k8s_api -> dpu_network_controller: Matching Nodes
deactivate k8s_api

dpu_network_controller -> k8s_api: List Dpu CRs
activate k8s_api
k8s_api -> dpu_network_controller: All Dpu CRs
note right: Dpu CR contains:\n netdevs:\n - name: "ens2f0v0" vfId: 0\n - name: "ens2f0v1" vfId: 1\n - name: "ens2f0v2" vfId: 2\n - name: "ens2f0v3" vfId: 3\n - name: "ens2f0v4" vfId: 4\n - name: "ens2f0v5" vfId: 5\n - name: "ens2f0v6" vfId: 6\n - name: "ens2f0v7" vfId: 7
deactivate k8s_api

dpu_network_controller -> dpu_network_controller: Evaluate dpuSelector\n(match dpu-type and vfId)

dpu_network_controller -> dpu_network_controller: Parse vfId ranges\n("0-3" -> [0,1,2,3]\n"5-7" -> [5,6,7])

dpu_network_controller -> dpu_network_controller: Filter VFs from Dpu CRs\n(match selected VFs: 0,1,2,3,5,6,7)

dpu_network_controller -> dpu_network_controller: Generate ResourceName\n"openshift.io/dpunetwork-<dpuNetworkCR Name>"

== ConfigMap-Based Device Plugin Registration ==

dpu_network_controller -> dpu_network_controller: Aggregate all DpuNetwork CRs\nfor this node

dpu_network_controller -> dpu_network_controller: Generate ConfigMap data\n(config.json with resource definitions)

dpu_network_controller -> k8s_api: Create/Update ConfigMap\ndpu-device-plugin-config
activate k8s_api
note right: **ConfigMap Approach (Single Source of Truth)**\n\n**One ConfigMap describes resources for both Host and DPU nodes.**\n**Each entry carries a nodeSelector so local daemons only advertise their slice.**\n\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: dpu-device-plugin-config\n namespace: dpu-operator-system\ndata:\n config.json: |\n {\n "resources": [\n {\n "resourceName": "openshift.io/dpunetwork-dpu-network-1",\n "dpuNetworkName": "dpu-network-1",\n "nodeSelector": {"matchLabels": {"node-role": "host"}},\n "vfRanges": ["0-3", "5-7"]\n },\n {\n "resourceName": "openshift.io/dpunetwork-dpu-network-1",\n "dpuNetworkName": "dpu-network-1",\n "nodeSelector": {"matchLabels": {"node-role": "dpu"}},\n "vfRanges": ["0-3", "5-7"],\n "rpmRanges": ["0-0"],\n "vethRanges": ["0-1"]\n }\n // Additional resources per DpuNetwork CR\n ]\n }
k8s_api -> configmap: ConfigMap Created/Updated
activate configmap
k8s_api -> dpu_network_controller: ConfigMap Updated
deactivate k8s_api

== Host dpu-daemon Watches ConfigMap ==

configmap -> dpu_daemon_host: ConfigMap Change Event\n(watch notification)
activate dpu_daemon_host

dpu_daemon_host -> k8s_api: Get ConfigMap\ndpu-device-plugin-config
activate k8s_api
k8s_api -> dpu_daemon_host: ConfigMap with config.json
deactivate k8s_api

dpu_daemon_host -> dpu_daemon_host: Parse config.json\nFilter entries where node-role = host

note over dpu_daemon_host: **Per-Node Architecture Decision:**\n**Single device plugin instance per node**\n- Host instance only advertises host-scoped resources\n- Reads shared ConfigMap, filters via nodeSelector\n- Updates in-place on ConfigMap changes

alt Host Device Plugin Not Running
dpu_daemon_host -> dpu_daemon_host: Start Device Plugin Instance\n(read host resources)
else Host Device Plugin Already Running
dpu_daemon_host -> dpu_daemon_host: Reload Config\n(apply new host resource set)
end

dpu_daemon_host -> vsp_host: GetDevices()
activate vsp_host
vsp_host -> vsp_host: Return host-visible devices\n(VF repr set shared with DPU)
vsp_host -> dpu_daemon_host: Host device inventory
deactivate vsp_host

dpu_daemon_host -> dpu_daemon_host: Build device list\nApply vfRanges [0-3,5-7]
note right: Host Resource\n"openshift.io/dpunetwork-dpu-network-1"\nDevices: VFs 0,1,2,3,5,6,7 (no RPM/veth)

dpu_daemon_host -> dpu_daemon_host: ListAndWatch()\n(advertise host resource only)

dpu_daemon_host -> kubelet_host: Register Device Plugin\nResource "openshift.io/dpunetwork-dpu-network-1"
activate kubelet_host
kubelet_host -> dpu_daemon_host: Registration Accepted
kubelet_host -> kubelet_host: Add node capacity\n"openshift.io/dpunetwork-dpu-network-1": 7 (host)
deactivate kubelet_host

deactivate dpu_daemon_host

== DPU dpu-daemon Watches ConfigMap ==

configmap -> dpu_daemon_dpu: ConfigMap Change Event\n(watch notification)
activate dpu_daemon_dpu

dpu_daemon_dpu -> k8s_api: Get ConfigMap\ndpu-device-plugin-config
activate k8s_api
k8s_api -> dpu_daemon_dpu: ConfigMap with config.json
deactivate k8s_api

dpu_daemon_dpu -> dpu_daemon_dpu: Parse config.json\nFilter entries where node-role = dpu

note over dpu_daemon_dpu: **Per-Node Architecture Decision:**\n**Single DPU-side device plugin instance**\n- Reads same ConfigMap, filters for node-role=dpu\n- Advertises VF + RPM + veth resources\n- No restart required on updates

alt DPU Device Plugin Not Running
dpu_daemon_dpu -> dpu_daemon_dpu: Start Device Plugin Instance\n(read DPU resources)
else DPU Device Plugin Already Running
dpu_daemon_dpu -> dpu_daemon_dpu: Reload Config\n(apply new DPU resource set)
end

dpu_daemon_dpu -> vsp_dpu: GetDevices()
activate vsp_dpu
vsp_dpu -> vsp_dpu: Return devices by type\n(VF repr, RPM, veth)
vsp_dpu -> dpu_daemon_dpu: DPU device inventory
deactivate vsp_dpu

dpu_daemon_dpu -> dpu_daemon_dpu: Build device lists\n- VF repr filtered by vfRanges\n- RPM list via rpmRanges\n- veth list via vethRanges
note right: DPU Resources Advertised\n1. "openshift.io/dpunetwork-dpu-network-1" (VF x7)\n2. "openshift.io/rpm-disruptive" (rpmRange 0-0)\n3. "openshift.io/veth-nondisruptive" (vethRange 0-1)

dpu_daemon_dpu -> dpu_daemon_dpu: ListAndWatch()\n(advertise three resources)

dpu_daemon_dpu -> kubelet_dpu: Register Device Plugin\nAll DPU resources
activate kubelet_dpu
kubelet_dpu -> dpu_daemon_dpu: Registration Accepted
kubelet_dpu -> kubelet_dpu: Add node capacity\nVF=7, RPM=1, veth=2
deactivate kubelet_dpu

deactivate dpu_daemon_dpu
deactivate configmap

== BridgeID and NAD Generation (1 NAD per DpuNetwork CR) ==

dpu_network_controller -> dpu_network_controller: Create BridgeID

dpu_network_controller -> dpu_network_controller: Create single NAD\nfor all VFs in network\n(shared config: IsDisruptive, IPAM)

dpu_network_controller -> k8s_api: Create NetworkAttachmentDefinition
activate k8s_api
note right: **NAD 1 for DpuNetwork 1**\n\nmetadata:\n name: dpunetwork-1-nad\n namespace: default\n annotations:\n dpu.config.openshift.io/dpu-network: dpu-network-1\n k8s.v1.cni.cncf.io/resourceName: openshift.io/dpunetwork-dpu-network-1\nspec:\n config: {\n "type": "dpu-cni",\n "cniVersion": "0.4.0",\n "name": "dpu-cni",\n "BridgeID": "<created-bridgeID>",\n "IsDisruptive": "true",\n "ipam": {...}\n }\n\n**VFs (0,1,2,3,5,6,7) use this NAD**\n**Multiple pods can use this NAD**\n**Each pod gets allocated a VF from the pool**
k8s_api -> dpu_network_controller: NAD Created
deactivate k8s_api

note over dpu_network_controller: **About NRI (Network Resources Injector):**\nNRI webhook is installed once (via DpuOperatorConfig) and is not re-registered per DpuNetwork.\nDpuNetwork creation only needs to create NAD(s) and (optionally) publish a mapping (e.g., in DpuNetwork.status)\nso NRI can translate `dpu.config.openshift.io/dpu-network: <name>` into\n`k8s.v1.cni.cncf.io/networks: <nad list>` during Pod CREATE.

dpu_network_controller -> k8s_api: Update DpuNetwork 1 Status
activate k8s_api
note right: status:\n conditions:\n - type: Ready\n status: True\n message: NAD and Device Plugin created\n resourceName: "openshift.io/dpunetwork-dpu-network-1"\n selectedVFs: [0,1,2,3,5,6,7]\n excludedVFs: [4]
k8s_api -> dpu_network_controller: Status Updated
deactivate k8s_api

deactivate dpu_network_controller
deactivate k8s_api

note over k8s_api: **Architecture Summary:**\n**Single ConfigMap, per-node device plugin instances**\n**- Host dpu-daemon filters node-role=host resources**\n**- DPU dpu-daemon filters node-role=dpu resources (VF+RPM+veth)**\n**- Each node runs exactly one device plugin instance**\n**- Entries share resourceName when devices overlap**\n**- NAD per DpuNetwork CR stays unchanged**\n\n**When new DpuNetwork CR created:**\n**- Controller updates ConfigMap with host + DPU entries**\n**- Both daemons detect change and reload in-place**\n**- No new pods/daemons required, only ListAndWatch updates**

note right of user: **See:**\n- pod_creation_regular.puml for pod creation flow\n- pod_creation_nf_disruptive.puml for NF pod flow\n- dpunetwork_cr_update.puml for update flow\n- dpunetwork_cr_delete.puml for deletion flow

@enduml
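The vfId range handling in the reconciliation steps of this diagram can be sketched in Python. This is a minimal illustration under stated assumptions, not the controller's actual code: `parse_vf_ranges`, `select_netdevs`, and `resource_name` are hypothetical helper names, and the netdev dictionaries only mirror the `name` and `vfId` fields shown in the Dpu CR note.

```python
from typing import Iterable


def parse_vf_ranges(ranges: Iterable[str]) -> list[int]:
    """Expand strings such as "0-3" or "5" into a sorted list of unique VF IDs."""
    ids = set()
    for item in ranges:
        start_text, sep, end_text = item.strip().partition("-")
        start = int(start_text)
        # A bare number such as "5" is treated as a one-element range.
        end = int(end_text) if sep else start
        if end < start:
            raise ValueError(f"invalid vfId range: {item!r}")
        ids.update(range(start, end + 1))
    return sorted(ids)


def select_netdevs(netdevs, ranges):
    """Split netdevs into those matched by the vfId ranges and the excluded IDs."""
    wanted = set(parse_vf_ranges(ranges))
    selected = [dev for dev in netdevs if dev["vfId"] in wanted]
    excluded = [dev["vfId"] for dev in netdevs if dev["vfId"] not in wanted]
    return selected, excluded


def resource_name(dpu_network_name: str) -> str:
    """Build the resource name using the pattern shown in the diagram."""
    return f"openshift.io/dpunetwork-{dpu_network_name}"


# Hypothetical inventory mirroring the Dpu CR note: ens2f0v0 .. ens2f0v7.
netdevs = [{"name": f"ens2f0v{i}", "vfId": i} for i in range(8)]
selected, excluded = select_netdevs(netdevs, ["0-3", "5-7"])

print([dev["vfId"] for dev in selected])  # [0, 1, 2, 3, 5, 6, 7]
print(excluded)                           # [4]
print(resource_name("dpu-network-1"))     # openshift.io/dpunetwork-dpu-network-1
```

The seven selected VFs and the single excluded VF correspond to the `selectedVFs` and `excludedVFs` values in the status note, and to the capacity of 7 that kubelet records.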

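The per-node filtering that both dpu-daemons apply to the shared ConfigMap can be sketched as follows. This is an illustrative Python sketch, not the daemon's implementation: `entries_for_node` is a hypothetical name, and the label matching is assumed to be a plain equality check on `matchLabels`. The sample data follows the `config.json` note, minus its `//` placeholder comment, which a strict JSON parser would reject.

```python
import json

# Sample config.json modeled on the ConfigMap note in the diagram.
CONFIG_JSON = """
{
  "resources": [
    {
      "resourceName": "openshift.io/dpunetwork-dpu-network-1",
      "dpuNetworkName": "dpu-network-1",
      "nodeSelector": {"matchLabels": {"node-role": "host"}},
      "vfRanges": ["0-3", "5-7"]
    },
    {
      "resourceName": "openshift.io/dpunetwork-dpu-network-1",
      "dpuNetworkName": "dpu-network-1",
      "nodeSelector": {"matchLabels": {"node-role": "dpu"}},
      "vfRanges": ["0-3", "5-7"],
      "rpmRanges": ["0-0"],
      "vethRanges": ["0-1"]
    }
  ]
}
"""


def entries_for_node(config_json: str, node_labels: dict) -> list:
    """Return the resource entries whose matchLabels all match the node's labels."""
    config = json.loads(config_json)
    matched = []
    for entry in config.get("resources", []):
        match_labels = entry.get("nodeSelector", {}).get("matchLabels", {})
        if all(node_labels.get(key) == value for key, value in match_labels.items()):
            matched.append(entry)
    return matched


host_entries = entries_for_node(CONFIG_JSON, {"node-role": "host"})
dpu_entries = entries_for_node(CONFIG_JSON, {"node-role": "dpu"})

print(len(host_entries), "rpmRanges" in host_entries[0])  # 1 False
print(len(dpu_entries), dpu_entries[0]["rpmRanges"])      # 1 ['0-0']
```

Both entries share one `resourceName`, so each daemon advertises the same resource while reading only its own slice of the file.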
doc/dpunetwork_cr_delete.puml

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
@startuml dpunetwork_cr_deletion

actor user
box "Kubernetes Control Plane"
participant k8s_api
participant dpu_network_controller
participant configmap as "ConfigMap\ndpu-device-plugin-config"
end box

box "Host Node"
participant kubelet_host as "kubelet (Host)"
participant dpu_daemon_host as "dpu-daemon (Host)\n(Device Plugin Manager + Device Plugin)"
participant vsp_host as "vsp (Host)"
end box

box "DPU Node"
participant kubelet_dpu as "kubelet (DPU)"
participant dpu_daemon_dpu as "dpu-daemon (DPU)\n(Device Plugin Manager + Device Plugin)"
participant vsp_dpu as "vsp (DPU)"
end box

autonumber

== DpuNetwork Deletion (ConfigMap Approach) ==

note right of user: **Prerequisites:**\nDpuNetwork CR already created\nSee: dpunetwork_cr_create.puml

user -> k8s_api: Delete DpuNetwork CR
activate k8s_api

k8s_api -> dpu_network_controller: Reconcile Event (Deletion)
activate dpu_network_controller

dpu_network_controller -> dpu_network_controller: Aggregate remaining DpuNetwork CRs\n(remove deleted network from list)

dpu_network_controller -> dpu_network_controller: Generate updated ConfigMap data\n(remove network-1 resource definition)

dpu_network_controller -> k8s_api: Update ConfigMap\ndpu-device-plugin-config
activate k8s_api
note right: **ConfigMap Updated**\n\nconfig.json updated:\n "resources": [\n // network-1 removed\n {\n "resourceName": "openshift.io/dpunetwork-dpu-network-2",\n ...\n }\n ]
k8s_api -> configmap: ConfigMap Updated
activate configmap
k8s_api -> dpu_network_controller: ConfigMap Updated
deactivate k8s_api

== ConfigMap Change Propagates to Host and DPU Nodes ==

configmap -> dpu_daemon_host: ConfigMap Change Event\n(resource removed)
activate dpu_daemon_host

dpu_daemon_host -> k8s_api: Get Updated ConfigMap
activate k8s_api
k8s_api -> dpu_daemon_host: ConfigMap without network-1
deactivate k8s_api

dpu_daemon_host -> dpu_daemon_host: Parse config.json\nDetect removed resource "openshift.io/dpunetwork-dpu-network-1"

dpu_daemon_host -> dpu_daemon_host: Update single device plugin instance\n(reload config, rebuild advertised list)

dpu_daemon_host -> vsp_host: ReleaseHostVfPool(bridge_id="x1")
activate vsp_host
vsp_host -> vsp_host: Remove VF entries bound to host pods
vsp_host -> dpu_daemon_host: VF pool released
deactivate vsp_host

dpu_daemon_host -> kubelet_host: ListAndWatch Update\n(unregister CR-specific resource)
activate kubelet_host
kubelet_host -> kubelet_host: Remove node capacity\n"openshift.io/dpunetwork-dpu-network-1"
kubelet_host -> dpu_daemon_host: Resource Removed
deactivate kubelet_host

deactivate dpu_daemon_host

configmap -> dpu_daemon_dpu: ConfigMap Change Event\n(resource removed)
activate dpu_daemon_dpu

dpu_daemon_dpu -> k8s_api: Get Updated ConfigMap
activate k8s_api
k8s_api -> dpu_daemon_dpu: ConfigMap without network-1
deactivate k8s_api

dpu_daemon_dpu -> dpu_daemon_dpu: Parse config.json\nDetect removal of VF, RPM, veth entries for bridge "x1"

dpu_daemon_dpu -> dpu_daemon_dpu: Update device plugin instance\n(stop advertising VF resource, adjust RPM/veth counts)

dpu_daemon_dpu -> vsp_dpu: DeleteNetworkResources(bridge_id="x1")
activate vsp_dpu
vsp_dpu -> vsp_dpu: Tear down NF map entry\nremove VF + RPM interfaces
vsp_dpu -> vsp_dpu: Delete flow rules and bridge br-x1
vsp_dpu -> dpu_daemon_dpu: Network resources deleted
deactivate vsp_dpu

dpu_daemon_dpu -> kubelet_dpu: ListAndWatch Update\n(unregister VF resource, update RPM/veth counts)
activate kubelet_dpu
kubelet_dpu -> kubelet_dpu: Remove node capacity\n"openshift.io/dpunetwork-dpu-network-1" on DPU node
kubelet_dpu -> dpu_daemon_dpu: Resource Removed
deactivate kubelet_dpu

deactivate dpu_daemon_dpu
deactivate configmap

dpu_network_controller -> k8s_api: Delete NetworkAttachmentDefinition
activate k8s_api
k8s_api -> dpu_network_controller: NAD Deleted
deactivate k8s_api

dpu_network_controller -> k8s_api: Remove Finalizer
deactivate dpu_network_controller

k8s_api -> k8s_api: DpuNetwork CR Deleted
deactivate k8s_api

note right of user: **Related Diagrams:**\n- dpunetwork_cr_create.puml (host setup)\n- dpunetwork_cr_creation-dpu.puml (DPU setup)\n- dpunetwork_cr_update.puml (update flow)

@enduml
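The controller-side step in this deletion flow, aggregating the remaining DpuNetwork CRs and regenerating `config.json`, might look roughly like the Python sketch below. It is an assumption-laden illustration: `build_config` is a hypothetical name, the input dictionaries are a simplified stand-in for real DpuNetwork objects, and the ranges given for `dpu-network-2` are invented for the example.

```python
import json


def build_config(networks) -> str:
    """Render config.json with one host entry and one DPU entry per network."""
    resources = []
    for net in sorted(networks, key=lambda n: n["name"]):
        shared = {
            "resourceName": f"openshift.io/dpunetwork-{net['name']}",
            "dpuNetworkName": net["name"],
            "vfRanges": list(net["vfRanges"]),
        }
        # Host entry: VF ranges only.
        resources.append(
            {**shared, "nodeSelector": {"matchLabels": {"node-role": "host"}}}
        )
        # DPU entry: VF ranges plus RPM and veth ranges.
        resources.append(
            {
                **shared,
                "nodeSelector": {"matchLabels": {"node-role": "dpu"}},
                "rpmRanges": list(net["rpmRanges"]),
                "vethRanges": list(net["vethRanges"]),
            }
        )
    return json.dumps({"resources": resources}, indent=2)


# Simplified stand-ins for two DpuNetwork CRs (dpu-network-2 values are invented).
networks = [
    {"name": "dpu-network-1", "vfRanges": ["0-3", "5-7"],
     "rpmRanges": ["0-0"], "vethRanges": ["0-1"]},
    {"name": "dpu-network-2", "vfRanges": ["4"],
     "rpmRanges": ["1-1"], "vethRanges": ["2-3"]},
]

# Deleting dpu-network-1: aggregate what remains and regenerate the ConfigMap data.
remaining = [net for net in networks if net["name"] != "dpu-network-1"]
config_json = build_config(remaining)
print(config_json)
```

Regenerating the whole file from the remaining CRs, rather than patching out one entry, keeps the ConfigMap a pure function of cluster state, which fits the reconcile model shown in the diagram.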

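On the daemon side, detecting which resources disappeared between two versions of `config.json` reduces to a set difference. The following Python sketch shows one way to do it; `advertised_names`, `removed_resources`, and the `entry` helper are hypothetical, and only the `resourceName` and `nodeSelector` fields from the diagram are modeled.

```python
import json


def advertised_names(config_json: str, node_role: str) -> set:
    """Collect the resource names in config.json that target the given node role."""
    config = json.loads(config_json)
    names = set()
    for item in config.get("resources", []):
        labels = item.get("nodeSelector", {}).get("matchLabels", {})
        if labels.get("node-role") == node_role:
            names.add(item["resourceName"])
    return names


def removed_resources(old_json: str, new_json: str, node_role: str) -> list:
    """Return resource names present in the old config but absent from the new one."""
    old_names = advertised_names(old_json, node_role)
    new_names = advertised_names(new_json, node_role)
    return sorted(old_names - new_names)


def entry(network: str, role: str) -> dict:
    """Build a minimal resource entry for the example data."""
    return {
        "resourceName": f"openshift.io/dpunetwork-{network}",
        "nodeSelector": {"matchLabels": {"node-role": role}},
    }


old_config = json.dumps({"resources": [
    entry("dpu-network-1", "host"), entry("dpu-network-1", "dpu"),
    entry("dpu-network-2", "host"), entry("dpu-network-2", "dpu"),
]})
new_config = json.dumps({"resources": [
    entry("dpu-network-2", "host"), entry("dpu-network-2", "dpu"),
]})

print(removed_resources(old_config, new_config, "host"))
# ['openshift.io/dpunetwork-dpu-network-1']
```

A daemon could then stop advertising each returned name through its ListAndWatch stream before asking the VSP to release the associated devices, matching the order of steps in the diagram.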