Convert the Priority field from int32 to *int32 across all BPF program types (TC, TCX, XDP) and their cluster variants #485

Open
andreaskaris wants to merge 7 commits into bpfman:main from andreaskaris:issues484

@andreaskaris andreaskaris commented Oct 21, 2025

Summary

Convert the Priority field from int32 to *int32 across all BPF program types (TC, TCX, XDP) and their cluster variants.

Fixes #484

Motivation

Making Priority a pointer lets users either set the priority explicitly, including to 0, or omit the field to get the default behavior. With a plain int32, an omitted field and an explicit 0 are indistinguishable. This provides better API semantics and allows bpfman to apply its own default priority logic when the field is omitted. Following OpenShift API guide best practices, the default value is now set by the controller rather than in the CRD definition.
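
To make the motivation concrete, here is a minimal sketch of how the two field types behave when decoding JSON. The types valueSpec and pointerSpec and the helper functions are hypothetical stand-ins for illustration, not the operator's real API types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// valueSpec and pointerSpec are hypothetical stand-ins for an attach-info
// type before and after the change; they are not the operator's real types.
type valueSpec struct {
	Priority int32 `json:"priority,omitempty"`
}

type pointerSpec struct {
	Priority *int32 `json:"priority,omitempty"`
}

// decodeValue unmarshals a JSON document into a valueSpec.
func decodeValue(doc string) valueSpec {
	var s valueSpec
	if err := json.Unmarshal([]byte(doc), &s); err != nil {
		panic(err)
	}
	return s
}

// decodePointer unmarshals a JSON document into a pointerSpec.
func decodePointer(doc string) pointerSpec {
	var s pointerSpec
	if err := json.Unmarshal([]byte(doc), &s); err != nil {
		panic(err)
	}
	return s
}

// describe reports whether a decoded pointer priority was set by the user.
func describe(p *int32) string {
	if p == nil {
		return "unset"
	}
	return fmt.Sprintf("set to %d", *p)
}

func main() {
	// An omitted field and an explicit 0 decode to the same int32 value,
	// but to different *int32 values (nil versus a pointer to 0).
	for _, doc := range []string{`{}`, `{"priority":0}`} {
		v := decodeValue(doc)
		p := decodePointer(doc)
		fmt.Printf("%-16s value=%d pointer=%s\n", doc, v.Priority, describe(p.Priority))
	}
}
```

With the value type, both documents decode to 0, so a controller cannot tell whether the user asked for priority 0 or said nothing at all.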

Changes

Core API Changes (commit 7a20559)

  • API Changes: Changed Priority field type from int32 to *int32 in:

    • ClTcAttachInfo / ClTcAttachInfoState
    • ClTcxAttachInfo / ClTcxAttachInfoState
    • ClXdpAttachInfo / ClXdpAttachInfoState
    • TcAttachInfo / TcAttachInfoState
    • TcxAttachInfo / TcxAttachInfoState
    • XdpAttachInfo / XdpAttachInfoState
  • Kubebuilder Annotations: Removed +kubebuilder:default:=1000 annotations - the default is now set by the controller rather than via the API, following OpenShift API guide best practices

  • Helper Function: Added GetPriority() helper in pkg/helpers to handle nil pointer cases and provide default value of 1000

  • Controller Updates: Updated all reconcilers to use GetPriority() helper:

    • Cluster-scoped: cl_tc_program.go, cl_tcx_program.go, cl_xdp_program.go
    • Namespace-scoped: ns_tc_program.go, ns_tcx_program.go, ns_xdp_program.go
  • Generated Code: Updated CRDs, deepcopy functions, and CSV manifests

  • Tests: Updated all test files to use ptr.To() for priority values
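
As a rough illustration of the helper described above, a nil-safe priority lookup might look like the following sketch. The exact signature of GetPriority() in pkg/helpers is not shown in this PR description, so getPriority, defaultPriority, and the local to() helper (a stand-in for ptr.To from k8s.io/utils/ptr) are assumptions.

```go
package main

import "fmt"

// defaultPriority mirrors the default of 1000 mentioned in the description.
const defaultPriority int32 = 1000

// getPriority is a hypothetical stand-in for the GetPriority() helper:
// it returns the default when the pointer is nil, otherwise the value.
func getPriority(p *int32) int32 {
	if p == nil {
		return defaultPriority
	}
	return *p
}

// to returns a pointer to v; a local stand-in for ptr.To so that the
// sketch has no external dependencies.
func to[T any](v T) *T { return &v }

func main() {
	// The three cases the PR's unit tests are described as covering.
	cases := []struct {
		name string
		in   *int32
		want int32
	}{
		{"nil defaults to 1000", nil, 1000},
		{"explicit value is kept", to(int32(50)), 50},
		{"zero is allowed", to(int32(0)), 0},
	}
	for _, c := range cases {
		got := getPriority(c.in)
		fmt.Printf("%-24s got=%d ok=%v\n", c.name, got, got == c.want)
	}
}
```

Keeping the default in one helper means the reconcilers never dereference the pointer directly.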

Test Infrastructure Improvements (commits 7c42d03, 229e32f)

  • Table-Driven Tests (commit 7c42d03): Refactored BPF application tests to use table-driven pattern with reusable helper functions:

    • createFakeClusterReconciler() / createFakeNamespaceReconciler() for setup
    • runClusterReconciler() / runNamespaceReconciler() for execution
    • verifyClusterBpfApplicationState() / verifyNamespaceBpfApplicationState() for validation
    • verifyClusterBpfProgramState() / verifyNamespaceBpfProgramState() for status checks
  • Mockable Network Namespace Cache (commit 229e32f): Extracted NetnsCache into a mockable interface:

    • Defined NetNsCache interface with GetNetNsId() and Reset() methods
    • Implemented ReconcilerNetNsCache with original caching logic
    • Added MockNetNsCache for testing with predefined namespace mappings
    • Enabled proper unit testing without filesystem dependencies
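
A minimal sketch of what such a mockable cache could look like follows. The GetNetNsId(path string) *uint64 signature matches the diff quoted later in this conversation; the Reset() signature, the NewMockNetNsCache constructor, and the mock's internals are assumptions made for illustration.

```go
package main

import "fmt"

// NetNsCache is the interface described above. The Reset signature is an
// assumption; only its name appears in the PR description.
type NetNsCache interface {
	GetNetNsId(path string) *uint64
	Reset()
}

// MockNetNsCache serves predefined path-to-inode mappings so that tests
// never touch the filesystem. Its fields are hypothetical.
type MockNetNsCache struct {
	ids map[string]uint64
}

// NewMockNetNsCache copies the given mappings into a new mock
// (hypothetical constructor).
func NewMockNetNsCache(ids map[string]uint64) *MockNetNsCache {
	m := &MockNetNsCache{ids: make(map[string]uint64, len(ids))}
	for path, id := range ids {
		m.ids[path] = id
	}
	return m
}

// GetNetNsId returns the predefined inode for path, or nil if unknown.
func (m *MockNetNsCache) GetNetNsId(path string) *uint64 {
	id, ok := m.ids[path]
	if !ok {
		return nil
	}
	return &id
}

// Reset drops all mappings (assumed behaviour for this sketch).
func (m *MockNetNsCache) Reset() {
	m.ids = map[string]uint64{}
}

func main() {
	var cache NetNsCache = NewMockNetNsCache(map[string]uint64{
		"/host/proc/1/ns/net": 4026531840,
	})
	if id := cache.GetNetNsId("/host/proc/1/ns/net"); id != nil {
		fmt.Println("inode:", *id)
	}
	fmt.Println("unknown path is nil:", cache.GetNetNsId("/missing") == nil)
	cache.Reset()
	fmt.Println("after reset is nil:", cache.GetNetNsId("/host/proc/1/ns/net") == nil)
}
```

A reconciler that holds the interface rather than a concrete map can then be given the mock in unit tests and the filesystem-backed implementation in production.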

Priority Field Testing (commit 5cd629f)

  • Unit Tests: Added tests for priority field handling:
    • Verify nil priority defaults to 1000
    • Verify explicit priority values are correctly set
    • Verify zero priority is allowed
    • Coverage for both cluster-scoped and namespace-scoped BPF applications
    • Verify priority is correctly reflected in program status

Testing

  • Unit tests updated and priority field tests added
  • Integration tests updated

@andreaskaris andreaskaris marked this pull request as draft October 21, 2025 17:02
@andreaskaris andreaskaris force-pushed the issues484 branch 3 times, most recently from 0be6ca4 to 6b87e7d October 21, 2025 18:01

andreaskaris commented Oct 21, 2025

With this fix, applications can now have priority 0:

[root@centos9-bpfman bpfman]# kubectl apply -f /root/bytecode.yaml 
clusterbpfapplication.bpfman.io/go-xdp-counter-example created
[root@centos9-bpfman bpfman]# 
[root@centos9-bpfman bpfman]# 
[root@centos9-bpfman bpfman]# 
[root@centos9-bpfman bpfman]# kubectl get clusterbpfapplication
NAME                     NODESELECTOR   STATUS    AGE
go-xdp-counter-example                  Success   20s
[root@centos9-bpfman bpfman]# kubectl apply -f /root/bytecode.yaml ^C
[root@centos9-bpfman bpfman]# kubectl get clusterbpfapplicationstate  -o yaml 
apiVersion: v1
items:
- apiVersion: bpfman.io/v1alpha1
  kind: ClusterBpfApplicationState
  metadata:
    creationTimestamp: "2025-10-21T18:14:19Z"
    finalizers:
    - bpfman.io.clbpfapplicationcontroller/finalizer
    generation: 1
    labels:
      bpfman.io/ownedByProgram: go-xdp-counter-example
      kubernetes.io/hostname: bpfman-deployment-control-plane
    name: go-xdp-counter-example-33a1ac2f
    ownerReferences:
    - apiVersion: bpfman.io/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: ClusterBpfApplication
      name: go-xdp-counter-example
      uid: ad361693-6d2c-486f-a1b8-9fa3dae8a5ee
    resourceVersion: "1967"
    uid: 48ec7d07-8c22-4244-8010-8fc769e7b188
  status:
    appLoadStatus: LoadSuccess
    conditions:
    - lastTransitionTime: "2025-10-21T18:14:30Z"
      message: The BPF application has been successfully loaded and attached
      reason: Success
      status: "True"
      type: Success
    node: bpfman-deployment-control-plane
    programs:
    - name: xdp_stats
      programId: 2397
      programLinkStatus: Success
      type: XDP
      xdp:
        links:
        - interfaceName: eth0
          linkId: 1752065740
          linkStatus: Attached
          priority: 0
          proceedOn:
          - Pass
          - DispatcherReturn
          shouldAttach: true
          uuid: 915b77d2-6723-4cb1-a83f-2640a82143bb
    updateCount: 2
kind: List
metadata:
  resourceVersion: ""
[root@centos9-bpfman bpfman]# kubectl exec -n bpfman               bpfman-daemon-7rnsn -it -- /bin/bash
Defaulted container "bpfman" out of: bpfman, bpfman-agent, node-driver-registrar, mount-bpffs (init)
groups: cannot find name for group ID 2000
root@bpfman-deployment-control-plane:/# ./bpfman get link 1752065740
 Bpfman State                                                            
 BPF Function:       xdp_stats                                           
 Program Type:       xdp                                                 
 Program ID:         2397                                                
 Link ID:            1752065740                                          
 Interface:          eth0                                                
 Priority:           0                                                   
 Position:           0                                                   
 Proceed On:         pass, dispatcher_return                             
 Network Namespace:  None                                                
 Metadata:           bpfman.io/uuid=915b77d2-6723-4cb1-a83f-2640a82143bb 

@andreaskaris andreaskaris changed the title from "Issues484" to "Convert the Priority field from int32 to *int32 across all BPF program types (TC, TCX, XDP) and their cluster variants" Oct 22, 2025
@andreaskaris andreaskaris force-pushed the issues484 branch 17 times, most recently from 2cb3797 to af2a346 October 23, 2025 18:41
func TestTcGoCounter(t *testing.T) {
t.Log("deploying tc counter program")
require.NoError(t, clusters.KustomizeDeployForCluster(ctx, env.Cluster(), tcGoCounterKustomize))
addCleanup(func(context.Context) error {
Contributor Author

Note: functions added via addCleanup run exactly once, at the end of all tests. That causes conflicts when multiple tests register the same cleanup. Use t.Cleanup instead, which runs when the individual test finishes. Returning an error from cleanup is probably unnecessary too, so perhaps drop that?

@andreaskaris andreaskaris marked this pull request as ready for review October 23, 2025 18:53
frobware added a commit to frobware/bpfman-operator that referenced this pull request Jan 12, 2026
The PR bpfman#485 changed Priority from int32 to *int32 in both spec
(AttachInfo) and status (AttachInfoState) types. However, the status
field should remain int32 because the controller always resolves a
concrete priority value after reconciliation.

The field is annotated +required but was declared as *int32 with
json:"priority,omitempty" - these are contradictory: a required field
should not be a pointer with omitempty, as that allows it to be absent
from the serialised output.

This commit reverts only the status types back to int32 whilst keeping
the spec types as *int32 (which correctly allows distinguishing between
"not set" and "explicitly set to 0").

Signed-off-by: Andrew McDermott <amcdermo@redhat.com>

@frobware frobware left a comment

Should status be a pointer too? The spec field correctly uses *int32 to distinguish "not set" from "zero value", but I think the status field should remain int32 because:

  1. Semantic mismatch: Status represents what was actually applied, not user intent. The controller always resolves a concrete priority value (either from spec or the default 1000), so status is never "unset".
  2. Annotation contradiction: The status Priority field is marked +required but uses *int32 with omitempty. This is contradictory - omitempty allows the field to be absent in JSON, but +required says it must always be present. For required fields that always have a value, a non-pointer type is more appropriate.
  3. API clarity: Using a pointer for status implies the value might be absent, which creates confusion about the controller's behaviour. A non-pointer type makes it clear that status always reflects a resolved, concrete value.

I did some follow-up work in #492 which reverts status to be a non-pointer type while keeping spec as *int32. The PR includes testing evidence showing priority: 0 is correctly preserved in both spec and status.
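
The serialisation point in item 2 can be illustrated with a small sketch. The specLink and statusLink types are hypothetical, and the sketch assumes the status field is serialised without omitempty, which this conversation does not confirm.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// specLink models user intent: a nil pointer means "not set".
// Hypothetical type, not the operator's real spec struct.
type specLink struct {
	Priority *int32 `json:"priority,omitempty"`
}

// statusLink models the resolved state: a plain int32 that is always
// serialised. Dropping omitempty here is an assumption of this sketch.
type statusLink struct {
	Priority int32 `json:"priority"`
}

// mustJSON marshals v and panics on error, to keep the example short.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	zero := int32(0)
	fmt.Println(mustJSON(specLink{}))                // {}
	fmt.Println(mustJSON(specLink{Priority: &zero})) // {"priority":0}
	fmt.Println(mustJSON(statusLink{Priority: 0}))   // {"priority":0}
}
```

A pointer with omitempty can vanish from the output, which is appropriate for optional spec input but at odds with a field marked +required.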

@andreaskaris andreaskaris changed the title from "Convert the Priority field from int32 to *int32 across all BPF program types (TC, TCX, XDP) and their cluster variants" to "WIP: Convert the Priority field from int32 to *int32 across all BPF program types (TC, TCX, XDP) and their cluster variants" Jan 29, 2026
@andreaskaris andreaskaris marked this pull request as draft January 29, 2026 11:26
Convert the Priority field from int32 to *int32 across all BPF program
types (TC, TCX, XDP) and their cluster variants. This change:

- Removes the default value of 1000 from kubebuilder annotations
- Makes the Priority field truly optional in the API
- Updates all AttachInfo and AttachInfoState structs
- Adds a GetPriority helper function to handle nil pointer cases
- Updates all reconcilers to use the helper function
- Updates tests to use ptr.To() for priority values

This allows users to omit the priority field when they want to use
bpfman's default behavior, rather than always setting an explicit
value. It also follows API best practices to set the default value via
the controller, and not via the CRD definition.

The status field should remain int32 because the controller always
resolves a concrete priority value after reconciliation.

Co-authored-by: Andreas Karis <ak.karis@gmail.com>
Co-authored-by: Andrew McDermott <amcdermo@redhat.com>
Signed-off-by: Andreas Karis <ak.karis@gmail.com>
// programs (XDP, TC, TCX) to network interfaces. Priority determines execution
// order relative to other programs at the same attachment point, where lower
// values indicate higher precedence. Valid range is 0-1000.
const DefaultAttachPriority int32 = 1000
@andreaskaris andreaskaris Jan 29, 2026

My concern here is that this will have to go, as is, into the OpenShift API. And I'm wondering if we want to do that - because having such values in the API will make changes difficult if needed.

@andreaskaris andreaskaris changed the title from "WIP: Convert the Priority field from int32 to *int32 across all BPF program types (TC, TCX, XDP) and their cluster variants" to "Convert the Priority field from int32 to *int32 across all BPF program types (TC, TCX, XDP) and their cluster variants" Jan 29, 2026
@andreaskaris andreaskaris marked this pull request as ready for review January 29, 2026 14:05
make generate && make manifests && make bundle

Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Convert cluster and namespace BPF application tests to table-driven
structure for better maintainability and test coverage. Extract common
test functionality into reusable helper functions:
- createFakeClusterReconciler() and createFakeNamespaceReconciler() for
  setup
- runClusterReconciler() and runNamespaceReconciler() for execution
- verifyClusterBpfApplicationState() and
  verifyNamespaceBpfApplicationState()
- verifyClusterBpfProgramState() and verifyNamespaceBpfProgramState()

Inline program definitions within test cases, simplify reconciler setup
in GetBpfAppState tests, and add comprehensive verification of program
status across multiple reconciliation cycles.

Signed-off-by: Andreas Karis <ak.karis@gmail.com>
andreaskaris and others added 4 commits January 29, 2026 15:45
Extract the NetnsCache map and getNetnsId method from ReconcilerCommon
into a new NetNsCache interface with ReconcilerNetNsCache implementation.
This enables proper unit testing by allowing tests to inject a
MockNetNsCache instead of relying on actual filesystem operations.

Changes:
- Define NetNsCache interface with GetNetNsId and Reset methods
- Implement ReconcilerNetNsCache with the original caching logic
- Update all reconcilers to use NetNsCache.GetNetNsId() instead of
  ReconcilerCommon.getNetnsId()
- Add MockNetNsCache for testing with predefined namespace mappings
- Initialize NetNsCache in main.go and test setup

Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Add unit tests for XDP, TC, and TCX programs to verify that:
- nil priority defaults to 1000
- explicit priority values are correctly set
- zero priority is allowed

The tests cover both cluster-scoped and namespace-scoped BPF
applications, ensuring the priority field is properly handled
and reflected in the program status.

Co-authored-by: Andrew McDermott <amcdermo@redhat.com>
Co-authored-by: Andreas Karis <ak.karis@gmail.com>
Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Add integration tests to verify BPF program links are correctly ordered
by priority across XDP, TC, and TCX program types.

The new verification framework validates link ordering on each cluster
node by comparing ClusterBpfApplicationState data against actual bpfman
daemon state.

Co-authored-by: amcdermo@redhat.com
Co-authored-by: Andreas Karis <ak.karis@gmail.com>
Signed-off-by: Andreas Karis <ak.karis@gmail.com>
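
The ordering check described in the commit message above could be sketched roughly as follows. The link type, its fields, and orderedByPriority are hypothetical; the sketch assumes that a lower Position runs earlier and that lower priority values take precedence, as the DefaultAttachPriority comment states. It is not the PR's actual verification framework.

```go
package main

import (
	"fmt"
	"sort"
)

// link is a hypothetical record combining the Priority and Position
// values that `bpfman get link` prints for an attached program.
type link struct {
	Name     string
	Priority int32
	Position int
}

// orderedByPriority reports whether the links, arranged by Position,
// have non-decreasing Priority values.
func orderedByPriority(links []link) bool {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]link(nil), links...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Position < sorted[j].Position
	})
	for i := 1; i < len(sorted); i++ {
		if sorted[i].Priority < sorted[i-1].Priority {
			return false
		}
	}
	return true
}

func main() {
	good := []link{
		{Name: "b", Priority: 1000, Position: 1},
		{Name: "a", Priority: 0, Position: 0},
	}
	bad := []link{
		{Name: "a", Priority: 1000, Position: 0},
		{Name: "b", Priority: 0, Position: 1},
	}
	fmt.Println("good ordering:", orderedByPriority(good))
	fmt.Println("bad ordering:", orderedByPriority(bad))
}
```
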
Move the default attach priority value to the API package as a public
constant, providing a single authoritative definition. The helper
function now references this constant rather than a private value.

This ensures the default is defined alongside the types that use it and
can be referenced by documentation and external consumers.

Signed-off-by: Andrew McDermott <amcdermo@redhat.com>
@andreaskaris

@frobware
I applied your changes from #492, squashed everything as needed, and added Co-authored-by messages where both of us touched the code.

For verification, I rebased your suggestion in #492 onto upstream/main, then compared it to this PR, and these are the only differences (I updated some log messages):

diff --git a/controllers/bpfman-agent/common.go b/controllers/bpfman-agent/common.go
index f510a075..0a690c9a 100644
--- a/controllers/bpfman-agent/common.go
+++ b/controllers/bpfman-agent/common.go
@@ -94,32 +94,32 @@ type ReconcilerNetNsCache struct {
 // conversion to Stat_t fails, it returns nil.
 func (rnnc *ReconcilerNetNsCache) GetNetNsId(path string) *uint64 {
        if path == "" {
-               rnnc.logger.V(1).Info("Enter getNetnsId: Path is empty.  Using /host/proc/1/ns/net")
+               rnnc.logger.V(1).Info("Enter GetNetnsId: Path is empty.  Using /host/proc/1/ns/net")
                path = "/host/proc/1/ns/net"
        } else {
-               rnnc.logger.V(1).Info("Enter getNetnsId", "Path", path)
+               rnnc.logger.V(1).Info("Enter GetNetnsId", "Path", path)
        }
 
        // If path is in the cache, return the cached value
        if id, ok := rnnc.cache[path]; ok {
-               rnnc.logger.V(1).Info("Exit getNetnsId: Found in cache", "Path", path, "inode", id)
+               rnnc.logger.V(1).Info("Exit GetNetnsId: Found in cache", "Path", path, "inode", id)
                return &id
        }
 
        info, err := os.Stat(path)
        if err != nil {
-               rnnc.logger.V(1).Info("Exit getNetnsId: Failed to stat file", "path", path, "error", err)
+               rnnc.logger.V(1).Info("Exit GetNetnsId: Failed to stat file", "path", path, "error", err)
                return nil
        }
 
        stat, ok := info.Sys().(*syscall.Stat_t)
        if !ok {
-               rnnc.logger.V(1).Info("Exit getNetnsId: Failed to convert to Stat_t", "path", path)
+               rnnc.logger.V(1).Info("Exit GetNetnsId: Failed to convert to Stat_t", "path", path)
                return nil
        }
 
        rnnc.cache[path] = stat.Ino
-       rnnc.logger.V(1).Info("Exit getNetnsId", "Path", path, "inode", stat.Ino)
+       rnnc.logger.V(1).Info("Exit GetNetnsId", "Path", path, "inode", stat.Ino)
        return &stat.Ino
 }

I have a minor concern regarding DefaultAttachPriority (see above), but it's nothing I'm insisting on, especially given that you can easily change it if the API reviewers do not want that constant in the OpenShift API.

I reviewed up to "Refactor network namespace cache into mockable interface", and briefly checked "test: verify priority field handling in BPF program links" and "test-integration: Add priority ordering verification for BPF links". But I skipped review of the test-integration commit, as I have a hard time remembering everything about the bpfman-operator. I suppose that you reviewed all of this as well, so I suppose I don't have to re-review my own code.

Development

Successfully merging this pull request may close these issues.

bpfman-operator does not allow priority 0, defaults to 1000 instead
