OCPNODE-4168: Empty disk should be ignored for gcp additional disk #141

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from ngopalak-redhat:ngopalak/fix_gcp_disk
Mar 10, 2026

Conversation

@ngopalak-redhat
Contributor

@ngopalak-redhat ngopalak-redhat commented Feb 25, 2026

Context: We are trying to add an additional disk to GCP machines to test split disk: https://issues.redhat.com/browse/OCPSTRAT-188

Error:

  Warning  FailedCreate  11m   gcpcontroller  ngopalak-feb25-gcp-1-887q4-worker-b-528mq: reconciler failed to Create machine: error launching instance: googleapi: Error 400: Invalid value for field 'resource.disks[1].initializeParams.sourceImage': 'https://compute.googleapis.com/compute/v1/projects/openshift-gce-devel/global/images/'. The URL is malformed.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "compute.googleapis.com",
    "metadata": {
      "field": "resource.disks[1].initializeParams.sourceImage",
      "reason": "The URL is malformed.",
      "value": "https://compute.googleapis.com/compute/v1/projects/openshift-gce-devel/global/images/"
    },
    "reason": "INVALID_FIELD_VALUE"
  },
  {
    "@type": "type.googleapis.com/google.rpc.LocalizedMessage",
    "locale": "en-US",
    "message": "Invalid value for field 'resource.disks[1].initializeParams.sourceImage': 'https://compute.googleapis.com/compute/v1/projects/openshift-gce-devel/global/images/'. The URL is malformed."
  }
]
, invalid

How to recreate:

  • Day 1: Add an additional disk for gcp with boot as false:
          - autoDelete: true
            boot: false
            sizeGb: 100
            type: pd-balanced
  • Day 2: Or edit the machine or machineset to add an additional non-boot disk for GCP
  • The machine object will have this:
      - autoDelete: true
        boot: false
        image: ""
        sizeGb: 100
        type: pd-balanced

This will cause the above error.

Fix:
Approach 1: Change the API to make the image field a pointer. I didn't take this approach because an API change can impact existing users.
Approach 2: Ignore an empty image. This is the approach taken.
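
A minimal sketch of why the empty image produces the malformed URL, and of what approach 2 amounts to. The function names and the exact resolution rule are assumptions made for illustration (they are not copied from reconciler.go); only the URL shape is taken from the error above.

```go
package main

import (
	"fmt"
	"strings"
)

// imagePrefix is a hypothetical helper: it builds the project-relative
// image URL prefix that appears in the error message.
func imagePrefix(project string) string {
	return "https://compute.googleapis.com/compute/v1/projects/" + project + "/global/images/"
}

// naiveSourceImage models the assumed old behavior: any image without a
// path separator is resolved relative to the project, even an empty one.
func naiveSourceImage(project, image string) string {
	if strings.Contains(image, "/") {
		return image
	}
	return imagePrefix(project) + image
}

// sourceImage models approach 2: an empty image is left empty, so no
// source image is sent and the disk is created blank.
func sourceImage(project, image string) string {
	if image == "" {
		return ""
	}
	return naiveSourceImage(project, image)
}

func main() {
	// Prints the bare prefix, the value rejected as malformed in the error above.
	fmt.Println(naiveSourceImage("openshift-gce-devel", ""))
	// Prints an empty line: nothing is resolved for an empty image.
	fmt.Println(sourceImage("openshift-gce-devel", ""))
}
```

With `image: ""` on the machine object, the naive path yields a URL that ends at `/global/images/`, which is what the API reports as malformed.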

Related PR: openshift/machine-api-operator#1471

Summary by CodeRabbit

  • Bug Fixes

    • Stricter validation: boot disks must specify an image; secondary disks may be created as blank with specified size/type.
    • Image strings without path separators are resolved relative to the current project.
  • Tests

    • Added coverage for boot-disk validation and for creating secondary blank disks with expected size, type, and deletion behavior.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Feb 25, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Feb 25, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ngopalak-redhat ngopalak-redhat marked this pull request as ready for review February 25, 2026 09:32
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Feb 25, 2026
@openshift-ci openshift-ci Bot requested review from chrischdi and mdbooth February 25, 2026 09:33
@kannon92

cc @theobarberbany

@ngopalak-redhat
Contributor Author

/retest-required

@theobarberbany
Contributor

theobarberbany commented Mar 5, 2026

Adding a unit test here would probably be beneficial, however not worth blocking on. If you feel like raising a follow up that would be great.

I've asked QE to verify, as we don't have any e2es for this :)

/lgtm
/approve

@openshift-ci openshift-ci Bot added the lgtm label (indicates that a PR is ready to be merged) on Mar 5, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Mar 5, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: theobarberbany

@openshift-ci openshift-ci Bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Mar 5, 2026
@theobarberbany
Contributor

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) and removed the lgtm label (indicates that a PR is ready to be merged) on Mar 5, 2026
@coderabbitai

coderabbitai Bot commented Mar 6, 2026

Walkthrough

Disk image handling in the GCP machine reconciler was changed: disk images are now optional, source image resolution to the project happens only for non-empty image strings, boot disks without an image now fail validation, and initialization parameters for attached disks are built via a shared initParams object. Tests updated accordingly.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Disk image handling: pkg/cloud/gcp/actuators/machine/reconciler.go | Move image handling into a prebuilt AttachedDiskInitializeParams (initParams); set SourceImage only when a non-empty image is provided (resolve relative image paths); return an error if a boot disk has no image; assign AutoDelete, Boot, and InitializeParams from initParams. |
| Test updates: pkg/cloud/gcp/actuators/machine/reconciler_test.go | Add Image usage to GCPDisk in tests and introduce cases: boot disk without image returns validation error; secondary disk without image creates a blank disk with expected size/type and empty SourceImage. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Stable And Deterministic Test Names | ✅ Passed | All test case names in the test file, including the two new test cases, are stable and deterministic static strings without dynamic information. |
| Test Structure And Quality | ✅ Passed | The reconciler tests use Go subtests rather than Ginkgo, so the custom Ginkgo-specific checks are not applicable. Each t.Run block tests a focused behavior with clear t.Errorf messages, and there are no cluster-scoped resources or indefinite waits requiring cleanup or timeouts. The tests employ the repository's fake client patterns consistently, maintaining single-responsibility and clarity. |
| Title check | ✅ Passed | The title accurately reflects the main change: handling empty disks for GCP additional (non-boot) disks by ignoring them instead of causing API errors. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/cloud/gcp/actuators/machine/reconciler_test.go`:
- Around line 407-416: In validateInstance (test helper) you currently call
t.Errorf when len(instance.Disks) != 2 but then immediately access
instance.Disks[1], which can panic; change the flow so the test stops or returns
when the disk count is wrong (e.g., replace t.Errorf with t.Fatalf or follow the
t.Errorf with an immediate return) before any access to instance.Disks[1] or
other indexes to avoid the panic.

In `@pkg/cloud/gcp/actuators/machine/reconciler.go`:
- Around line 317-341: The current logic skips setting initParams.SourceImage
whenever disk.Image is empty, which also affects boot disks; update the handling
so that an empty Image is only allowed for non-boot disks and boot disks with an
empty Image are rejected. Concretely, keep srcImage and the existing SourceImage
assignment, but before skipping SourceImage check disk.Boot (or the struct's
equivalent) and if it's a boot disk and disk.Image == "" return a clear error
(e.g., "boot disk must specify an image"); otherwise allow skipping SourceImage
for non-boot disks. Apply this change around the srcImage/disk.Image logic and
the initParams.SourceImage assignment in reconciler.go (symbols: srcImage,
disk.Image, disk.Boot, initParams.SourceImage).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 26e791e3-eae0-4dee-8ed9-969c665497e8

📥 Commits

Reviewing files that changed from the base of the PR and between 8074086 and c701585.

📒 Files selected for processing (2)
  • pkg/cloud/gcp/actuators/machine/reconciler.go
  • pkg/cloud/gcp/actuators/machine/reconciler_test.go

Comment on lines +407 to +416
validateInstance: func(t *testing.T, instance *compute.Instance) {
	if len(instance.Disks) != 2 {
		t.Errorf("expected two disks, got %d", len(instance.Disks))
	}
	if instance.Disks[1].InitializeParams.SourceImage != "" {
		t.Errorf("Expected blank disk (empty SourceImage), got: %q", instance.Disks[1].InitializeParams.SourceImage)
	}
	if instance.Disks[1].InitializeParams.DiskSizeGb != 100 {
		t.Errorf("Expected disk size 100, got: %d", instance.Disks[1].InitializeParams.DiskSizeGb)
	}
⚠️ Potential issue | 🟡 Minor

Avoid potential panic in validation path after length mismatch.

Line 408 uses t.Errorf and then still accesses instance.Disks[1] (Line 411+). If disk count is wrong, this can panic and hide the actual assertion failure.

✅ Suggested hardening
 			validateInstance: func(t *testing.T, instance *compute.Instance) {
 				if len(instance.Disks) != 2 {
-					t.Errorf("expected two disks, got %d", len(instance.Disks))
+					t.Fatalf("expected two disks, got %d", len(instance.Disks))
 				}
+				if instance.Disks[1].InitializeParams == nil {
+					t.Fatalf("expected InitializeParams on secondary disk")
+				}
 				if instance.Disks[1].InitializeParams.SourceImage != "" {
 					t.Errorf("Expected blank disk (empty SourceImage), got: %q", instance.Disks[1].InitializeParams.SourceImage)
 				}
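
The order of checks in the hardening above (length first, then nil, then field values) can be sketched outside the testing package as an early-return validator. The types and the `checkSecondaryDisk` name are hypothetical stand-ins for this illustration; in the actual test, each early return would correspond to a `t.Fatalf` call, which stops the test function before any later access.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the compute API types used by the test.
type initializeParams struct {
	SourceImage string
	DiskSizeGb  int64
}

type attachedDisk struct {
	InitializeParams *initializeParams
}

// checkSecondaryDisk returns at the first structural problem, so the index
// and pointer accesses that follow each guard cannot panic.
func checkSecondaryDisk(disks []*attachedDisk) error {
	if len(disks) != 2 {
		return fmt.Errorf("expected two disks, got %d", len(disks))
	}
	if disks[1] == nil || disks[1].InitializeParams == nil {
		return errors.New("expected InitializeParams on secondary disk")
	}
	if img := disks[1].InitializeParams.SourceImage; img != "" {
		return fmt.Errorf("expected blank disk (empty SourceImage), got: %q", img)
	}
	return nil
}

func main() {
	// Only one disk: the length guard reports the mismatch instead of panicking on disks[1].
	fmt.Println(checkSecondaryDisk([]*attachedDisk{{}}))
}
```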
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/cloud/gcp/actuators/machine/reconciler_test.go` around lines 407 - 416,
In validateInstance (test helper) you currently call t.Errorf when
len(instance.Disks) != 2 but then immediately access instance.Disks[1], which
can panic; change the flow so the test stops or returns when the disk count is
wrong (e.g., replace t.Errorf with t.Fatalf or follow the t.Errorf with an
immediate return) before any access to instance.Disks[1] or other indexes to
avoid the panic.

Comment thread pkg/cloud/gcp/actuators/machine/reconciler.go
@ngopalak-redhat ngopalak-redhat force-pushed the ngopalak/fix_gcp_disk branch from c701585 to 60f5226 Compare March 6, 2026 09:10

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
pkg/cloud/gcp/actuators/machine/reconciler_test.go (1)

421-429: ⚠️ Potential issue | 🟡 Minor

Guard against panic after disk-count mismatch in test validation.

At Line 421-Line 423, the test logs with t.Errorf and then still dereferences instance.Disks[1] on Line 424-Line 428. If disk creation regresses, this can panic and hide the root failure.

✅ Suggested hardening
 			validateInstance: func(t *testing.T, instance *compute.Instance) {
 				if len(instance.Disks) != 2 {
-					t.Errorf("expected two disks, got %d", len(instance.Disks))
+					t.Fatalf("expected two disks, got %d", len(instance.Disks))
+				}
+				if instance.Disks[1].InitializeParams == nil {
+					t.Fatalf("expected InitializeParams on secondary disk")
 				}
 				if instance.Disks[1].InitializeParams.SourceImage != "" {
 					t.Errorf("Expected blank disk (empty SourceImage), got: %q", instance.Disks[1].InitializeParams.SourceImage)
 				}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/cloud/gcp/actuators/machine/reconciler_test.go` around lines 421 - 429,
The test currently calls t.Errorf when the disk count is wrong but then
unconditionally dereferences instance.Disks[1], which can panic; change the
validation to stop the test immediately on wrong count (e.g., replace the
t.Errorf for len(instance.Disks) != 2 with t.Fatalf or add an immediate
return/FailNow) so subsequent checks of instance.Disks[1].InitializeParams
(SourceImage and DiskSizeGb) are only executed when the disk slice has the
expected length.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cf421ab0-f4c3-4647-8053-ea3ad06ba16c

📥 Commits

Reviewing files that changed from the base of the PR and between c701585 and 60f5226.

📒 Files selected for processing (2)
  • pkg/cloud/gcp/actuators/machine/reconciler.go
  • pkg/cloud/gcp/actuators/machine/reconciler_test.go

@openshift-ci
Contributor

openshift-ci Bot commented Mar 6, 2026

@ngopalak-redhat: all tests passed!

@nrb
Contributor

nrb commented Mar 9, 2026

/lgtm

@openshift-ci openshift-ci Bot added the lgtm label (indicates that a PR is ready to be merged) on Mar 9, 2026
@ngopalak-redhat ngopalak-redhat changed the title from "Empty disk should be ignored for gcp additional disk" to "OCPNODE-4168: Empty disk should be ignored for gcp additional disk" on Mar 10, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) on Mar 10, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 10, 2026

@ngopalak-redhat: This pull request references OCPNODE-4168 which is a valid jira issue.

@ngopalak-redhat
Contributor Author

@openshift-ci-robot openshift-ci-robot added the verified label (signifies that the PR passed pre-merge verification criteria) on Mar 10, 2026
@openshift-ci-robot
Contributor

@ngopalak-redhat: This PR has been marked as verified by @lunarwhite.

Details

In response to this:

/verified by @lunarwhite
https://redhat-internal.slack.com/archives/CBZHF4DHC/p1772800274422079?thread_ts=1772714225.784319&cid=CBZHF4DHC
Also verified by ci


@ngopalak-redhat
Contributor Author

/hold cancel

@openshift-ci openshift-ci Bot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Mar 10, 2026
@openshift-merge-bot openshift-merge-bot Bot merged commit e31069d into openshift:main Mar 10, 2026
10 checks passed

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria.

5 participants