Skip to content

Add disableWorkloadRBAC flag to skip per-workload RBAC creation#4030

Open
JAORMX wants to merge 3 commits intomainfrom
feat/disable-workload-rbac
Open

Add disableWorkloadRBAC flag to skip per-workload RBAC creation#4030
JAORMX wants to merge 3 commits intomainfrom
feat/disable-workload-rbac

Conversation

@JAORMX
Copy link
Collaborator

@JAORMX JAORMX commented Mar 6, 2026

Summary

  • Adds DISABLE_WORKLOAD_RBAC env var and operator.rbac.disableWorkloadRBAC Helm value (default: false)
  • When enabled, the operator skips creating per-workload ServiceAccount, Role, and RoleBinding resources
  • Conditionally removes roles/rolebindings permissions and ServiceAccount write verbs from the operator's ClusterRole
  • Conditionally excludes the registry-api ClusterRole and ClusterRoleBinding from the chart

Motivation

Some Kubernetes platform teams enforce strict policies on which cluster-scoped resources an operator's ClusterRole may reference. In environments managed by GitOps tools like ArgoCD, app-projects must whitelist every cluster-scoped resource the operator needs — including roles and rolebindings permissions used for dynamic RBAC creation at runtime. This creates friction for adoption in security-conscious environments.

By allowing the operator to opt out of per-workload RBAC management, platform teams can:

  • Pre-provision RBAC via GitOps: Ship ServiceAccount, Role, and RoleBinding alongside each workload CR in dedicated per-workload Helm charts or Kustomize overlays, keeping RBAC under full version control and audit
  • Satisfy ArgoCD app-project constraints: The operator's ClusterRole no longer needs roles/rolebindings permissions, eliminating the need to whitelist those resources in app-project definitions
  • Enforce least-privilege at the operator level: The operator only retains read access to ServiceAccounts (get/list/watch) and loses all write permissions for RBAC resources
  • Support policy engines: Works cleanly with OPA/Gatekeeper or Kyverno policies that restrict which controllers may create RBAC resources

This is PR1 of a two-part effort. A follow-up PR will add per-workload Helm charts that bundle SA + Role + RoleBinding + CR for each workload type, providing a turnkey solution for externally-managed RBAC.

Changes

Go code:

  • New DISABLE_WORKLOAD_RBAC constant and flag reading in main.go, passed to all controller setup functions
  • DisableWorkloadRBAC bool field added to MCPServerReconciler, MCPRemoteProxyReconciler, and VirtualMCPServerReconciler — each guards ensureRBACResources() with an early return
  • registryapi.manager accepts the flag and guards its own ensureRBACResources()
  • NewMCPRegistryReconciler forwards the flag to the registry API manager

Helm charts:

  • operator.rbac.disableWorkloadRBAC value in values.yaml and values-openshift.yaml
  • DISABLE_WORKLOAD_RBAC env var in deployment.yaml
  • ClusterRole: serviceaccounts split into its own rule block with conditional write verbs; roles/rolebindings block wrapped in conditional
  • registry-api-clusterrole.yaml and registry-api-clusterrolebinding.yaml wrapped in conditional

Tests:

  • Unit test per controller verifying ensureRBACResources returns nil and creates no resources when disabled
  • CI values file (externalRBAC-values.yaml) for Helm template linting

How to use

# values.yaml
operator:
  rbac:
    disableWorkloadRBAC: true

When this flag is set, users must pre-create the following resources for each workload:

  • MCPServer: <name>-proxy-runner ServiceAccount/Role/RoleBinding + <name>-mcp-server ServiceAccount
  • MCPRemoteProxy: <name>-remote-proxy-runner ServiceAccount/Role/RoleBinding
  • VirtualMCPServer: <name>-vmcp ServiceAccount/Role/RoleBinding
  • MCPRegistry: <name>-registry-api ServiceAccount/Role/RoleBinding

Deployments will fail to schedule if the required ServiceAccounts are not present — this is a safe fail-closed behavior.

Test plan

  • task lint-fix — 0 issues
  • task test — all unit tests pass (exit 0)
  • helm template with externalRBAC-values.yaml — ClusterRole has no roles/rolebindings, SA is read-only, registry-api resources absent
  • helm template with defaults — output unchanged from main
  • task helm-docs — README regenerated

🤖 Generated with Claude Code

@github-actions github-actions bot added the size/S Small PR: 100-299 lines changed label Mar 6, 2026
@JAORMX JAORMX force-pushed the feat/disable-workload-rbac branch from 3fe7ae2 to 57e9006 Compare March 6, 2026 11:38
@github-actions github-actions bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Mar 6, 2026
@codecov
Copy link

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 27.90698% with 31 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.67%. Comparing base (bc9b534) to head (fc259c3).

Files with missing lines Patch % Lines
cmd/thv-operator/main.go 0.00% 29 Missing ⚠️
...thv-operator/controllers/mcpregistry_controller.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4030   +/-   ##
=======================================
  Coverage   68.66%   68.67%           
=======================================
  Files         445      445           
  Lines       45343    45361   +18     
=======================================
+ Hits        31136    31151   +15     
- Misses      11802    11805    +3     
  Partials     2405     2405           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JAORMX JAORMX force-pushed the feat/disable-workload-rbac branch from 57e9006 to 45382c3 Compare March 6, 2026 11:52
@github-actions github-actions bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Mar 6, 2026
JAORMX and others added 2 commits March 6, 2026 15:10
Some K8s platform teams reject the operator because its ClusterRole
includes roles/rolebindings permissions for dynamic RBAC creation.
This adds an opt-in DISABLE_WORKLOAD_RBAC env var (exposed via
operator.rbac.disableWorkloadRBAC Helm value) so the operator skips
per-workload ServiceAccount, Role, and RoleBinding creation.

When enabled:
- All controller ensureRBACResources() methods return nil immediately
- ClusterRole omits roles/rolebindings rules and SA write verbs
- Registry API ClusterRole/ClusterRoleBinding are not rendered
- Users must pre-create RBAC resources externally

Default behavior (false) is unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ClusterRole is generated by controller-gen and CI verifies it
matches. Helm conditionals cannot be used in generated files.

The operator code guards are the enforcement mechanism — the
ClusterRole permissions are a ceiling, not a guarantee. The registry-api
ClusterRole/ClusterRoleBinding (hand-managed) retain their conditionals.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@JAORMX JAORMX force-pushed the feat/disable-workload-rbac branch from 65f0915 to 08bfe6d Compare March 6, 2026 13:11
@github-actions github-actions bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Mar 6, 2026
@github-actions github-actions bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Mar 6, 2026
featureVMCP = "ENABLE_VMCP"
// disableWorkloadRBAC disables per-workload RBAC management (ServiceAccount, Role, RoleBinding).
// When enabled, the operator will not create RBAC resources for workloads,
// allowing them to be managed externally (e.g., via per-workload Helm charts).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we want to create Helm Charts for this - Seems a bit overkill? It will only be 3 resources?

@@ -1,3 +1,4 @@
{{- if not .Values.operator.rbac.disableWorkloadRBAC }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually think this is used anyways. This was added previously by RedHat but I don't think it's used.

@@ -1,3 +1,4 @@
{{- if not .Values.operator.rbac.disableWorkloadRBAC }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/S Small PR: 100-299 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants