OCPBUGS-83865: Add MC to change stalld backend #757

Open
bartwensley wants to merge 1 commit into openshift-kni:main from bartwensley:stalld-backend

Conversation

@bartwensley bartwensley commented May 12, 2026

Adding an MC to revert the stalld backend to sched_debug. This is necessary because the new default queue_task backend causes an unacceptable latency penalty (see RHEL-175242).

@openshift-ci openshift-ci Bot requested review from MarSik and yuvalk May 12, 2026 20:30
Collaborator

openshift-ci-robot commented May 12, 2026

@bartwensley: This pull request references Jira Issue OCPBUGS-83865, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @dgonyier

The bug has been updated to refer to the pull request using the external bug tracker.

@openshift-ci openshift-ci Bot requested a review from dgonyier May 12, 2026 20:30
@openshift-ci-robot
Collaborator

@bartwensley: This pull request references Jira Issue OCPBUGS-83865, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @dgonyier


coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 6ded5cf5-aba2-455c-8745-a5ab0741015b

📥 Commits

Reviewing files that changed from the base of the PR and between 8eee90d and 1e0e48a.

📒 Files selected for processing (10)
  • telco-ran/configuration/argocd/AdditionalManifests.md
  • telco-ran/configuration/argocd/example/clusterinstance/kustomization.yaml
  • telco-ran/configuration/extra-manifests-builder/11-stalld-backend/build.sh
  • telco-ran/configuration/extra-manifests-builder/11-stalld-backend/stalld-backend
  • telco-ran/configuration/extra-manifests-builder/11-stalld-backend/stalld-backend.conf
  • telco-ran/configuration/kube-compare-reference/machine-config/stalld-backend/11-stalld-backend-master.yaml
  • telco-ran/configuration/kube-compare-reference/machine-config/stalld-backend/11-stalld-backend-worker.yaml
  • telco-ran/configuration/kube-compare-reference/metadata.yaml
  • telco-ran/configuration/source-crs/extra-manifest/11-stalld-backend-master.yaml
  • telco-ran/configuration/source-crs/extra-manifest/11-stalld-backend-worker.yaml
✅ Files skipped from review due to trivial changes (6)
  • telco-ran/configuration/argocd/example/clusterinstance/kustomization.yaml
  • telco-ran/configuration/extra-manifests-builder/11-stalld-backend/stalld-backend.conf
  • telco-ran/configuration/extra-manifests-builder/11-stalld-backend/stalld-backend
  • telco-ran/configuration/kube-compare-reference/machine-config/stalld-backend/11-stalld-backend-worker.yaml
  • telco-ran/configuration/argocd/AdditionalManifests.md
  • telco-ran/configuration/kube-compare-reference/machine-config/stalld-backend/11-stalld-backend-master.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • telco-ran/configuration/source-crs/extra-manifest/11-stalld-backend-master.yaml
  • telco-ran/configuration/source-crs/extra-manifest/11-stalld-backend-worker.yaml

📝 Walkthrough

This change introduces stalld-backend service configuration for OpenShift cluster nodes. A Bash build script uses mcmaker to generate MachineConfig manifests from template sources, producing both master and worker node configurations that provision sysconfig files and systemd service drop-ins.

Changes

stalld-backend Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Build script and configuration sources: telco-ran/configuration/extra-manifests-builder/11-stalld-backend/build.sh, telco-ran/configuration/extra-manifests-builder/11-stalld-backend/stalld-backend, telco-ran/configuration/extra-manifests-builder/11-stalld-backend/stalld-backend.conf | The build.sh script initializes Go environment variables and invokes mcmaker to generate manifests. The stalld-backend file defines BE="-b sched_debug", and stalld-backend.conf adds a [Service] section that loads /etc/sysconfig/stalld-backend via EnvironmentFile. |
| Generated MachineConfig manifests: telco-ran/configuration/source-crs/extra-manifest/11-stalld-backend-master.yaml, telco-ran/configuration/source-crs/extra-manifest/11-stalld-backend-worker.yaml, telco-ran/configuration/kube-compare-reference/machine-config/stalld-backend/11-stalld-backend-master.yaml, telco-ran/configuration/kube-compare-reference/machine-config/stalld-backend/11-stalld-backend-worker.yaml | Master and worker MachineConfig manifests set Ignition v3.2.0, write /etc/sysconfig/stalld-backend from base64 payload (mode 420), and install a stalld-backend.conf systemd drop-in for stalld.service that sources the environment file. |
| Kustomize, docs, and metadata references: telco-ran/configuration/argocd/AdditionalManifests.md, telco-ran/configuration/argocd/example/clusterinstance/kustomization.yaml, telco-ran/configuration/kube-compare-reference/metadata.yaml | Documentation and example kustomization entries were updated to include the new extra-manifest files; kube-compare metadata now references the new master and worker MachineConfig files. |
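
For readers unfamiliar with how a sysconfig file ends up inside a MachineConfig, here is a minimal sketch of the encoding step the summary describes. It assumes the file holds exactly the single line `BE="-b sched_debug"` quoted above; the variable names are illustrative and are not taken from the PR, and the actual manifests are generated by mcmaker rather than by this script. It also shows why the manifest lists mode 420: Ignition file modes are decimal, and 420 decimal is 0644 octal.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed file content, copied from the review summary (not read from the repo).
sysconfig_content='BE="-b sched_debug"'

# Base64-encode the payload; tr removes any line wrapping added by base64.
payload_b64=$(printf '%s\n' "$sysconfig_content" | base64 | tr -d '\n')

# Decode it again to confirm the round trip is lossless.
decoded=$(printf '%s' "$payload_b64" | base64 --decode)

# Convert the decimal mode from the manifest into the familiar octal form.
mode_octal=$(printf '%o' 420)

echo "payload: ${payload_b64}"
echo "decoded: ${decoded}"
echo "mode:    0${mode_octal}"   # prints "mode:    0644"
```

Decoding the payload found in a generated manifest the same way is a quick check that the file written to the node matches the source file in the builder directory.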

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding a MachineConfig to change the stalld backend, which aligns with the PR objectives and all file changes. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the purpose of adding the MachineConfig to revert stalld backend to sched_debug due to latency issues. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@telco-ran/configuration/extra-manifests-builder/11-stalld-backend/build.sh`:
- Around line 8-10: The shell invocation uses unquoted variable expansions
(${MCMAKER} and ${MCPROLE}) which can cause word-splitting/globbing errors;
update the command in build.sh to quote these expansions (use "${MCMAKER}" and
"${MCPROLE}") wherever they are passed to the command (the line invoking MCMAKER
-name 11-stalld-backend -mcp ${MCPROLE} -stdout ...) and apply the identical
change to the other occurrence in
01-container-mount-ns-and-kubelet-conf/build.sh (the -mcp ${MCPROLE} invocation)
so all variable uses are safely quoted.
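
The word-splitting risk named in this comment can be reproduced with a short, self-contained sketch. The path assigned to `MCMAKER` below is hypothetical and chosen only because it contains a space; `count_args` is an illustrative helper, not part of build.sh.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Report how many arguments a command actually receives.
count_args() { echo "$#"; }

# Hypothetical value containing a space (not the real mcmaker location).
MCMAKER="/tmp/go tools/bin/mcmaker"

# Unquoted: the shell splits the value on whitespace into two words.
# shellcheck disable=SC2086
unquoted_count=$(count_args ${MCMAKER})

# Quoted: the value is passed through as a single word.
quoted_count=$(count_args "${MCMAKER}")

echo "unquoted: ${unquoted_count} argument(s)"   # prints "unquoted: 2 argument(s)"
echo "quoted:   ${quoted_count} argument(s)"     # prints "quoted:   1 argument(s)"
```

With values that contain no whitespace or glob characters, both forms behave the same, which is why the unquoted form often goes unnoticed until a path changes.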

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: e71f296c-8e3c-46b5-846c-78127f7eede7

📥 Commits

Reviewing files that changed from the base of the PR and between a88ca23 and 8eee90d.

📒 Files selected for processing (5)
  • telco-ran/configuration/extra-manifests-builder/11-stalld-backend/build.sh
  • telco-ran/configuration/extra-manifests-builder/11-stalld-backend/stalld-backend
  • telco-ran/configuration/extra-manifests-builder/11-stalld-backend/stalld-backend.conf
  • telco-ran/configuration/source-crs/extra-manifest/11-stalld-backend-master.yaml
  • telco-ran/configuration/source-crs/extra-manifest/11-stalld-backend-worker.yaml

@bartwensley
Author

/cc @imiller0 @browsell

@openshift-ci openshift-ci Bot requested review from browsell and imiller0 May 12, 2026 20:35
Adding an MC to revert the stalld backend to sched_debug. This
is necessary because the new default queue_task backend causes an
unacceptable latency penalty (see RHEL-175242).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Bart Wensley <bwensley@redhat.com>
@browsell

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 12, 2026
Member

MarSik commented May 13, 2026

@bartwensley The config file generator stuff is a lot of files indeed. Btw, why not include this into NTO directly? I would say this is something the customers should not care about.

@bartwensley
Author

> @bartwensley The config file generator stuff is a lot of files indeed. Btw, why not include this into NTO directly? I would say this is something the customers should not care about.

I hadn't considered using NTO - I will take a look at that option now.

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 13, 2026
@bartwensley
Author

/cc @makelinux

@openshift-ci openshift-ci Bot requested a review from makelinux May 13, 2026 14:14

openshift-ci Bot commented May 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bartwensley, makelinux
Once this PR has been reviewed and has the lgtm label, please assign lack for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bartwensley
Author

I have created a PR to do this in the cluster-node-tuning-operator instead: openshift/cluster-node-tuning-operator#1515

Labels

  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • lgtm: Indicates that a PR is ready to be merged.

5 participants