MGMT-22782: Add state root cleanup script to day-2 worker ignition#9942

Open
CrystalChun wants to merge 1 commit into openshift:master from CrystalChun:day-2-capoa

@CrystalChun CrystalChun commented Feb 27, 2026

Previously, adding day-2 hosts skipped this cleanup script even though a day-2 host also boots from a persistent disk.

We should ensure that both day-1 and day-2 hosts that are booting from a disk image get this script.

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested)
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

Summary by CodeRabbit

  • New Features

    • Second-day worker ignition now includes an automatic state-root cleanup routine on first boot for persistent boots.
  • Refactor

    • Centralized ignition handling to apply cleanup logic consistently and simplify hostname-related processing.
  • Tests

    • Ignition tests updated to the newer v3.2 schema to validate the added cleanup units and scripts.
  • Chore

    • Ignition template version bumped to 3.2.0.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 27, 2026

openshift-ci bot commented Feb 27, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci-robot commented Feb 27, 2026

@CrystalChun: This pull request references MGMT-22782 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 27, 2026

coderabbitai bot commented Feb 27, 2026

Walkthrough

Adds two exported ignition utilities and a cleanup ignition constant in internal/ignition/common.go; removes duplicated hostname logic from discovery; replaces per-host inline stateroot-merge with centralized AddStateRootCleanupToIgnition calls; updates tests and bumps template ignition version to 3.2.0.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Ignition utilities: internal/ignition/common.go | Adds SetHostnameForNodeIgnition([]byte, *models.Host) ([]byte, error), AddStateRootCleanupToIgnition(logrus.FieldLogger, []byte, *models.Host) ([]byte, error), and cleanupDiscoveryStaterootIgnitionOverride constant (Ignition v3.2.0 payload for stateroot cleanup). |
| Discovery module updates: internal/ignition/discovery.go | Removes local SetHostnameForNodeIgnition and hostutil import; after formatting second-day worker ignitions, calls centralized AddStateRootCleanupToIgnition(...) with error handling. |
| Install manifests updates: internal/ignition/installmanifests.go | Removes inline per-host inventory-based stateroot merge; replaces it with calls to AddStateRootCleanupToIgnition(g.log, configBytes, host) and wraps/propagates errors with host context. |
| Tests (ignition parsing & cleanup behavior): internal/ignition/discovery_test.go | Switches tests from Ignition v3.1 to v3.2 parsing (config_32.Parse); adds tests validating presence/absence of stateroot cleanup units/scripts based on boot source. |
| Template version bump: internal/ignition/templates/node.ign | Bumps ignition.version from 3.1.0 to 3.2.0. |
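
For readers who want a feel for what "add cleanup units based on boot source" could look like, here is a minimal, self-contained sketch. It is not the PR's implementation: the function names addCleanupUnit and hasUnit, the unit name, and the bootedFromDisk flag are all hypothetical, and the sketch edits raw JSON with the standard library rather than using the repository's ignition helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cleanupUnitName is a hypothetical unit name used only for this sketch.
const cleanupUnitName = "example-stateroot-cleanup.service"

// addCleanupUnit appends a placeholder systemd unit to an Ignition config
// when the host booted from a persistent disk; otherwise it returns the
// config unchanged. The boolean stands in for whatever boot-source check
// the real code performs.
func addCleanupUnit(ign []byte, bootedFromDisk bool) ([]byte, error) {
	if !bootedFromDisk {
		return ign, nil
	}
	var cfg map[string]any
	if err := json.Unmarshal(ign, &cfg); err != nil {
		return nil, fmt.Errorf("failed to parse ignition: %w", err)
	}
	if cfg == nil {
		cfg = map[string]any{}
	}
	// Reuse the existing systemd section and unit list if they are present.
	systemd, _ := cfg["systemd"].(map[string]any)
	if systemd == nil {
		systemd = map[string]any{}
	}
	units, _ := systemd["units"].([]any)
	units = append(units, map[string]any{
		"name":    cleanupUnitName,
		"enabled": true,
	})
	systemd["units"] = units
	cfg["systemd"] = systemd
	return json.Marshal(cfg)
}

// hasUnit reports whether the config contains a systemd unit with the given name.
func hasUnit(ign []byte, name string) bool {
	var cfg struct {
		Systemd struct {
			Units []struct {
				Name string `json:"name"`
			} `json:"units"`
		} `json:"systemd"`
	}
	if err := json.Unmarshal(ign, &cfg); err != nil {
		return false
	}
	for _, u := range cfg.Systemd.Units {
		if u.Name == name {
			return true
		}
	}
	return false
}

func main() {
	base := []byte(`{"ignition":{"version":"3.2.0"}}`)
	for _, fromDisk := range []bool{true, false} {
		out, err := addCleanupUnit(base, fromDisk)
		if err != nil {
			panic(err)
		}
		fmt.Printf("bootedFromDisk=%v hasCleanup=%v\n", fromDisk, hasUnit(out, cleanupUnitName))
	}
}
```

Routing both the day-1 and day-2 paths through a single helper like this is what keeps the two from drifting apart, which appears to be the motivation for the centralization described in the walkthrough.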

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly references the specific Jira issue (MGMT-22782) and accurately describes the main change: adding a state root cleanup script to day-2 worker ignition configuration. |
| Description check | ✅ Passed | The description adequately explains the problem (day-2 hosts skipping cleanup script despite persistent disk boot), the solution (ensuring both day-1 and day-2 disk-booting hosts get the script), testing approach (manual), and addresses most template sections including issue classification (bug fix), environment impact (Cloud and Operator Managed Deployments), and testing selection. |
| Stable And Deterministic Test Names | ✅ Passed | All test names in modified files are static and deterministic with no dynamic content, formatting specifiers, concatenation operators, generated identifiers, timestamps, or references to pods/nodes/namespaces/IPs. |
| Test Structure And Quality | ✅ Passed | Four new tests for AddStateRootCleanupToIgnition have single responsibilities, proper setup/cleanup, meaningful assertions, and follow codebase patterns. |

openshift-ci bot commented Feb 27, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CrystalChun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 27, 2026

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/ignition/common.go (1)

61-64: Preserve the root cause when hostname lookup fails.

The current return path drops the underlying err, which makes failures harder to debug.

Proposed fix
 	hostname, err := hostutil.GetCurrentHostName(host)
 	if err != nil {
-		return nil, errors.Errorf("failed to get hostname for host %s", host.ID)
+		return nil, errors.Wrapf(err, "failed to get hostname for host %s", host.ID)
 	}
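
The difference between the two return paths can be seen in a small standalone sketch. It uses only the standard library's fmt.Errorf with %w (the alternative the review itself mentions) instead of errors.Wrapf from github.com/pkg/errors. The sentinel errNoHostname and the stub lookupHostname are hypothetical stand-ins for hostutil.GetCurrentHostName and whatever error it returns.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoHostname is a hypothetical sentinel standing in for the error the
// real hostname lookup might return; it does not come from the PR.
var errNoHostname = errors.New("inventory has no hostname")

// lookupHostname is a stub for the real lookup and always fails.
func lookupHostname(hostID string) (string, error) {
	return "", errNoHostname
}

// dropped mirrors the original code path: the underlying cause is discarded.
func dropped(hostID string) error {
	if _, err := lookupHostname(hostID); err != nil {
		return fmt.Errorf("failed to get hostname for host %s", hostID)
	}
	return nil
}

// wrapped mirrors the suggested fix: the cause is kept in the error chain.
func wrapped(hostID string) error {
	if _, err := lookupHostname(hostID); err != nil {
		return fmt.Errorf("failed to get hostname for host %s: %w", hostID, err)
	}
	return nil
}

// causePreserved reports whether err still carries errNoHostname in its chain.
func causePreserved(err error) bool {
	return errors.Is(err, errNoHostname)
}

func main() {
	fmt.Println(causePreserved(dropped("host-1"))) // cause lost
	fmt.Println(causePreserved(wrapped("host-1"))) // cause still reachable
	fmt.Println(wrapped("host-1"))
}
```

With the wrapped form, callers can match on the underlying error with errors.Is or errors.As, and the log line also shows why the lookup failed.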
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/ignition/common.go` around lines 61 - 64, The error returned from
hostutil.GetCurrentHostName is being discarded; update the failure return in the
block handling GetCurrentHostName to preserve and wrap the original err (e.g.,
use errors.Wrapf(err, "failed to get hostname for host %s", host.ID) or
fmt.Errorf("failed to get hostname for host %s: %w", host.ID, err)) so callers
see the root cause; look for the hostname lookup call to
hostutil.GetCurrentHostName in internal/ignition/common.go and change the return
that currently calls errors.Errorf to wrap the underlying err.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/ignition/common.go`:
- Around line 7-10: The imports in common.go are incorrect: replace imports of
"github.com/openshift/assisted-service/internal/ignition/ignitioncommon" and
"github.com/openshift/assisted-service/internal/models" with the
project-standard packages
"github.com/openshift/assisted-service/internal/common/ignition" and
"github.com/openshift/assisted-service/models" so the helper functions
ParseToLatest, SetFileInIgnition, MergeIgnitionConfig and the GetCurrentHostName
signature (which expects *models.Host) match the rest of the codebase and avoid
type/import mismatches; update the import block accordingly.

ℹ️ Review info

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 31527fa and dc8f270.

📒 Files selected for processing (3)
  • internal/ignition/common.go
  • internal/ignition/discovery.go
  • internal/ignition/installmanifests.go

Adding day-2 hosts originally skips adding this cleanup script
even though the day-2 host also is booting from a persistent disk.

We should ensure that both day-1 and day-2 hosts that are booting
from a disk image get this script.

@CrystalChun CrystalChun marked this pull request as ready for review February 27, 2026 20:58
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 27, 2026
@openshift-ci openshift-ci bot requested review from omertuc and ori-amizur February 27, 2026 21:00

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
internal/ignition/common.go (1)

61-64: Original error is discarded in error message.

The error from GetCurrentHostName is retrieved but not included in the returned error message. This could make debugging harder when hostname retrieval fails.

♻️ Suggested fix to include original error
 	hostname, err := hostutil.GetCurrentHostName(host)
 	if err != nil {
-		return nil, errors.Errorf("failed to get hostname for host %s", host.ID)
+		return nil, errors.Wrapf(err, "failed to get hostname for host %s", host.ID)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/ignition/common.go` around lines 61 - 64, The returned error
discards the original error from hostutil.GetCurrentHostName; update the error
return to include/wrap the original err (e.g. use errors.Wrapf(err, "failed to
get hostname for host %s", host.ID) or fmt.Errorf("failed to get hostname for
host %s: %w", host.ID, err)) so the original error context from
GetCurrentHostName is preserved; locate the hostname, err :=
hostutil.GetCurrentHostName(host) block and replace the current errors.Errorf
return accordingly.

ℹ️ Review info

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 2011b32 and f2918cb.

📒 Files selected for processing (5)
  • internal/ignition/common.go
  • internal/ignition/discovery.go
  • internal/ignition/discovery_test.go
  • internal/ignition/installmanifests.go
  • internal/ignition/templates/node.ign
✅ Files skipped from review due to trivial changes (1)
  • internal/ignition/templates/node.ign
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/ignition/installmanifests.go

@CrystalChun
Contributor Author

/retest

codecov bot commented Mar 2, 2026

Codecov Report

❌ Patch coverage is 53.33333% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.07%. Comparing base (10cee3a) to head (f2918cb).
⚠️ Report is 12 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/ignition/common.go | 58.33% | 5 Missing and 5 partials ⚠️ |
| internal/ignition/discovery.go | 33.33% | 1 Missing and 1 partial ⚠️ |
| internal/ignition/installmanifests.go | 33.33% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #9942   +/-   ##
=======================================
  Coverage   44.06%   44.07%           
=======================================
  Files         414      415    +1     
  Lines       72172    72218   +46     
=======================================
+ Hits        31803    31828   +25     
- Misses      37494    37512   +18     
- Partials     2875     2878    +3     

| Files with missing lines | Coverage Δ |
|---|---|
| internal/ignition/discovery.go | 75.00% <33.33%> (+0.64%) ⬆️ |
| internal/ignition/installmanifests.go | 55.29% <33.33%> (-0.13%) ⬇️ |
| internal/ignition/common.go | 58.33% <58.33%> (ø) |

... and 5 files with indirect coverage changes

@CrystalChun
Contributor Author

/cc @rccrdpccl

@openshift-ci openshift-ci bot requested a review from rccrdpccl March 2, 2026 16:47
@CrystalChun
Contributor Author

/test edge-subsystem-kubeapi-aws

openshift-ci bot commented Mar 2, 2026

@CrystalChun: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
