OCPEDGE-2591: feat(two-node): add verify-rhel-bugfix skill#118
OCPEDGE-2591: feat(two-node): add verify-rhel-bugfix skill#118lucaconsalvi wants to merge 11 commits intoopenshift-eng:mainfrom
Conversation
Port the RHEL bug fix verification workflow from tnf-dev-env into the two-node plugin. The skill fetches bug context from Jira via MCP, checks cluster state, patches nodes with the fixed RPM, runs verification tests, and generates a JIRA comment report. Includes three scripts for deterministic operations: - verify-cluster.sh: cluster health check - patch-nodes.sh: RPM patching with reboot and verification - collect-logs.sh: pacemaker/etcd log collection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add missing blank lines around fenced code blocks and lists, add language specifiers to fenced code blocks, and remove inline HTML that triggered MD033. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New skill added (rhel-bugfix-verify), bump minor version. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: lucaconsalvi The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a two-node RHEL resource-agents verification workflow: three new Bash helper scripts, a detailed SKILL.md describing the verification flow and Jira/reporting steps, README formatting updates, and plugin metadata/marketplace registry version and ordering adjustments. ChangesRHEL Bugfix Verification Workflow
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
plugins/two-node/skills/rhel-bugfix-verify/SKILL.md (1)
410-411: 💤 Low valueConsider removing or generalizing the hardcoded user-specific path.
The reference to
~/Documents/claude_test_file/is a user-specific path that likely won't exist for other contributors. Consider removing this reference or making it more generic.Proposed fix
-- Reports from past verifications in `~/Documents/claude_test_file/` can be - used as format reference if available. +- Reports from past verifications can be used as format reference if available.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/two-node/skills/rhel-bugfix-verify/SKILL.md` around lines 410 - 411, The SKILL.md contains a hardcoded, user-specific path string "~/Documents/claude_test_file/" that should be removed or generalized; update the sentence that currently references that path to either remove the example entirely or replace it with a generic placeholder (e.g., "~/Documents/<report_directory>" or "$HOME/Documents/<report_directory>") and add a brief note saying "if available, use your local report directory" so other contributors can adapt it; ensure you update the exact text matching the string "~/Documents/claude_test_file/" in SKILL.md.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugins/two-node/scripts/collect-logs.sh`:
- Line 1: The script collect-logs.sh currently uses the shebang line
"#!/bin/bash"; update the topmost shebang to "#!/usr/bin/bash" to conform to
CONTRIBUTING.md; simply replace the existing shebang in collect-logs.sh so the
script invokes /usr/bin/bash instead of /bin/bash.
In `@plugins/two-node/scripts/patch-nodes.sh`:
- Line 1: The script's shebang currently uses /bin/bash but CONTRIBUTING.md
mandates /usr/bin/bash; update the shebang line in the patch-nodes.sh script to
use #!/usr/bin/bash so the script follows repository conventions (look for the
file's shebang at the top of patch-nodes.sh).
- Around line 79-91: The wait loop in patch-nodes.sh (the for i in $(seq 1 60)
loop that sets m0 and m1) lacks failure handling if masters never become "up";
modify the script after the loop to check whether both m0 and m1 reached "up"
and if not print a clear error including last-known values of m0 and m1 and exit
with a non-zero status (e.g., exit 1) so downstream verification does not
proceed; reference the variables m0, m1 and the loop using seq 1 60 when
locating where to add this check and error/exit behavior.
In `@plugins/two-node/scripts/verify-cluster.sh`:
- Line 1: The script uses the wrong shebang; replace the current shebang line
"#!/bin/bash" with the required "#!/usr/bin/bash" per CONTRIBUTING.md so the
script consistently uses /usr/bin/bash; update the first line of the file (the
shebang) in the verify-cluster.sh script to the new path.
---
Nitpick comments:
In `@plugins/two-node/skills/rhel-bugfix-verify/SKILL.md`:
- Around line 410-411: The SKILL.md contains a hardcoded, user-specific path
string "~/Documents/claude_test_file/" that should be removed or generalized;
update the sentence that currently references that path to either remove the
example entirely or replace it with a generic placeholder (e.g.,
"~/Documents/<report_directory>" or "$HOME/Documents/<report_directory>") and
add a brief note saying "if available, use your local report directory" so other
contributors can adapt it; ensure you update the exact text matching the string
"~/Documents/claude_test_file/" in SKILL.md.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 21d91ebd-8a38-44a7-be2b-662b8a7cc607
📒 Files selected for processing (7)
.claude/skills/two-node-rhel-bugfix-verifyplugins/two-node/.claude-plugin/plugin.jsonplugins/two-node/README.mdplugins/two-node/scripts/collect-logs.shplugins/two-node/scripts/patch-nodes.shplugins/two-node/scripts/verify-cluster.shplugins/two-node/skills/rhel-bugfix-verify/SKILL.md
Bump two-node version to 1.1.0 in marketplace catalog and alphabetize plugin entries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use #!/usr/bin/bash shebang per CONTRIBUTING.md in all three scripts - Add timeout failure handling in patch-nodes.sh reboot wait loop Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align skill name with the verb-noun convention used across all plugins (e.g. create-rhel-stories, generate-tests, check-release-readiness). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugins/two-node/scripts/collect-logs.sh`:
- Around line 22-23: Add strict shell flags and validate MINUTES_AGO is a
non-negative integer before using it: enable set -euo pipefail at the top of the
script, check the MINUTES_AGO variable (from its assignment
MINUTES_AGO="${1:-30}") with a numeric regex or integer test and exit with a
clear error if invalid, and then use the validated variable when building the
remote command (the interpolation near where MINUTES_AGO is used on the remote
ssh/exec line) while quoting variables like "${MINUTES_AGO}" and
"${OUTPUT_BASE}" to avoid word-splitting; ensure the remote command string is
constructed safely (prefer passing the integer as a separate argument or
explicit sanitized value) and run shellcheck to confirm no remaining issues.
In `@plugins/two-node/scripts/patch-nodes.sh`:
- Around line 61-63: The current inline SSH pipeline pipes "sudo rpm-ostree
override replace /tmp/${RPM_FILE} -C 2>&1 | grep -v '^Warning:'" which loses
rpm-ostree's exit code when grep runs; update the remote command in
scripts/patch-nodes.sh (the ssh "ec2-user@${HYPERVISOR}" -> ssh ${SSH_OPTS}
core@${node} invocation that runs "sudo rpm-ostree override replace
/tmp/${RPM_FILE} -C") to capture rpm-ostree's exit status and still filter
warnings—e.g. run the command so stdout/stderr are captured (with tee or a temp
file), record the rpm-ostree exit code via PIPESTATUS[0] or $? immediately after
the command, run grep -v '^Warning:' against the captured output, then exit with
the saved rpm-ostree status so pipeline returns the real result.
- Around line 109-113: The nested SSH call interpolates user-controlled
FIX_GREP_PATTERN directly into a remote shell command, allowing command
injection; fix it by sanitizing/passing the pattern as a safe argument instead
of embedding it unescaped: in patch-nodes.sh update the verification block that
uses FIX_GREP_PATTERN so you quote variables everywhere, call grep with -- and a
quoted argument (e.g. ssh ... core@${MASTER_1} 'grep -n -- "$1"
/usr/lib/ocf/resource.d/heartbeat/podman-etcd' -- "$FIX_GREP_PATTERN" or escape
the pattern with printf '%q' before embedding), add set -euo pipefail at the
script top, and run shellcheck to ensure no remaining quoting issues; reference
FIX_GREP_PATTERN and the nested ssh line when making the change.
In `@plugins/two-node/scripts/verify-cluster.sh`:
- Around line 50-51: The current pipeline using "oc get co | awk 'NR==1 ||
/False/'" matches any row with "False" (e.g., Progressing=False) and must be
changed to only flag operators whose AVAILABLE column is False; modify the awk
expression after "oc get co" to detect the header index for the AVAILABLE column
and then print the header plus only rows where that AVAILABLE field == "False"
(for example, replace the simple /False/ filter with an awk snippet that finds
the AVAILABLE column in NR==1 and for subsequent lines tests that column ==
"False").
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 2f9d93b7-a3fc-431f-846e-1102abbcce81
📒 Files selected for processing (3)
plugins/two-node/scripts/collect-logs.shplugins/two-node/scripts/patch-nodes.shplugins/two-node/scripts/verify-cluster.sh
Client-side variable expansion in ssh commands is intentional — we resolve HYPERVISOR, MASTER_0/1, SSH_OPTS, etc. locally before sending the command string to the remote host. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Validate minutes-ago input is numeric in collect-logs.sh - Fix cluster operator health filter in verify-cluster.sh to check Available/Progressing/Degraded columns instead of matching any "False" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The grep -v pipeline runs on the hypervisor's remote shell which has no pipefail, so rpm-ostree failures could be masked by grep's exit code. Capture output first, then filter warnings locally. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
/two-node:verify-rhel-bugfixskill to automate RHEL resource-agents bug fix verification on TNF clustersverify-cluster.sh,patch-nodes.sh,collect-logs.shTest plan
./marketplace validate two-node— passesbash plugins/tests/marketplace_smoke_test.sh— 23/23 passnpx markdownlint-cli2on changed files — 0 errors/two-node:verify-rhel-bugfix RHEL-XXXXXon a session with a running TNF clusterContext
Ported from the
tnf-dev-envproject's bugfix-verify skill (lightweight port — no project workspace scaffolding). Fills the gap in the two-node plugin lifecycle:create-rhel-storiescreates the Jira tickets,verify-rhel-bugfixdoes the actual verification,bug-reproducerhandles a different use case (reproducing bugs).Tracked by OCPEDGE-2591.
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Chores