SPLAT-2172: AWS dedicate host support#10079
Conversation
|
@vr4manta: This pull request references SPLAT-2172 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/cc |
24500b6 to
342ab2d
Compare
|
/retest-required |
tthvo
left a comment
There was a problem hiding this comment.
Cool feature 👍 🚀
I have quite some comments 😅 Sorry if I review a little too soon...
| switch *p.HostPlacement.Affinity { | ||
| case aws.HostAffinityAnyAvailable: | ||
| if p.HostPlacement.DedicatedHost != nil { | ||
| allErrs = append(allErrs, field.Required(fldPath.Child("dedicatedHost"), "dedicatedHost is required when 'affinity' is set to DedicatedHost, and forbidden otherwise")) |
There was a problem hiding this comment.
Did you mean hostPlacement.dedicatedHost is forbidden if affinity is AnyAvailable?
allErrs = append(allErrs, field.Forbidden(fldPath.Child("dedicatedHost"), "dedicatedHost must not be set when affinity is AnyAvailable"))There was a problem hiding this comment.
@tthvo So this was an interesting one. In openshift/api and openshift/machine-api-operator, it was suggested to error this way for this scenario. I was doing this to keep it consistent.
- https://github.com/openshift/api/blob/0598ae66682433c49adc29b1ff14a201487f3afe/machine/v1beta1/types_awsprovider.go#L406
- openshift/machine-api-operator@f669914#diff-30c686f7a9dbd2aa917e5e410401d720a2a4bf4a11163495f5610c8f4b98edaeR917
Now that said, I am happy to make your suggestion if you prefer installer say it this way. Just let me know.
| if err != nil { | ||
| allErrs = append(allErrs, field.InternalError(placementPath.Child("dedicatedHost"), err)) | ||
| } else { | ||
| // Check the returned configured hosts to see if the dedicated hosts defined in install-config exists. |
There was a problem hiding this comment.
💡 ❓ I got 2 questions here:
-
Do the zones of dedicated hosts need to match machinepool zone field if defined?
installer/pkg/types/aws/machinepool.go
Lines 5 to 9 in 7c9b71a
-
Do the user-input zones for dedicated hosts need to match with the actual zones returned from AWS?
// If user specified a zone, validate it matches AWS if host.Zone != "" && host.Zone != hostDetails.Zone { allErrs = append(allErrs, field.Invalid( fldPath.Child("hostPlacement", "dedicatedHost").Index(i).Child("zone"), host.Zone, fmt.Sprintf("specified zone %s does not match actual host zone %s", host.Zone, hostDetails.Zone))) }
There was a problem hiding this comment.
Do the user-input zones for dedicated hosts need to match with the actual zones returned from AWS?
On that point, I'm curious if users should need to specify the zone for the dedicated host? It seems like we can just look up the dedicated host by id and then use that to determine the zone (which the pr is already doing IIUC). Seems like we might be able to shed zone from the install config.
|
/hold |
82915c2 to
8ef8679
Compare
|
/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-dedicated |
|
@vr4manta: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1c5c6d80-c717-11f0-93e9-d918c346504c-0 |
|
/unhold |
|
/retest |
|
/hold |
c873bb3 to
5a397fb
Compare
|
/hold |
| // Dedicated hosts cannot be configured in defaultMachinePlatform | ||
| if platform.DefaultMachinePlatform.HostPlacement != nil { | ||
| defaultPath := fldPath.Child("defaultMachinePlatform").Child("hostPlacement") | ||
| errMsg := "dedicated hosts cannot be configured in defaultMachinePlatform, they must be specified per machine pool" |
There was a problem hiding this comment.
nit: it might be more user friendly to specify that they're only supported on compute machine pools, but NBD
| // Dedicated hosts are only supported on worker (compute) pools | ||
| if poolName != "" && poolName != types.MachinePoolComputeRoleName { | ||
| placementPath := fldPath.Child("hostPlacement") | ||
| errMsg := fmt.Sprintf("dedicated hosts are only supported on %s pools, not on %s pools", types.MachinePoolComputeRoleName, poolName) | ||
| allErrs = append(allErrs, field.Invalid(placementPath, pool.HostPlacement, errMsg)) | ||
| return allErrs | ||
| } | ||
|
|
||
| // Control plane pools cannot use dedicated hosts | ||
| if poolName == "" { | ||
| placementPath := fldPath.Child("hostPlacement") | ||
| errMsg := "dedicated hosts are not supported on control plane pools" | ||
| allErrs = append(allErrs, field.Invalid(placementPath, pool.HostPlacement, errMsg)) | ||
| return allErrs | ||
| } |
There was a problem hiding this comment.
nit: I would have expected these validations in pkg/types (because they don't require aws sdk)
| describeOutput, err := client.DescribeHosts(ctx, &ec2v2.DescribeHostsInput{ | ||
| HostIds: []string{id}, | ||
| }) | ||
| if err != nil { | ||
| errCode := HandleErrorCode(err) | ||
| if errCode == "InvalidHostID.NotFound" { | ||
| // Host doesn't exist, nothing to do | ||
| return nil |
There was a problem hiding this comment.
This call to DescribeHosts is always made but permissions.go is listing the permission as optional. We need to either make the permission required, or gracefully handle the permission error (i think handling the permission error is the way to go).
cc8a0b9 to
4b1d68a
Compare
mtulio
left a comment
There was a problem hiding this comment.
Hey @vr4manta , good progress!
A few questions related to tagging and covered use cases. It would be nice to know what would be the tagging behavior of a Day-2 instance created with BYO DH ID: do we need to set kubernetes cluster tags (kubernetes/cluster...) to it (mapa/capa component) or would be optional?
780e1ea to
35cdfe9
Compare
WalkthroughThis change introduces comprehensive AWS Dedicated Hosts support to the OpenShift installer. It adds a new Changes
Sequence Diagram(s)sequenceDiagram
participant Installer as Installer Process
participant InstallConfig as Install Config<br/>Validation
participant EC2API as EC2 API
participant Metadata as AWS Metadata<br/>Cache
participant MachineAsset as Machine Asset<br/>Generator
Installer->>InstallConfig: Validate install config
InstallConfig->>InstallConfig: Check hostPlacement config
InstallConfig->>Metadata: DedicatedHosts(ctx)
Metadata->>EC2API: DescribeHosts
EC2API-->>Metadata: Host list
Metadata->>Metadata: Cache result in Hosts map
Metadata-->>InstallConfig: map[hostID]Host
InstallConfig->>InstallConfig: Validate host existence,<br/>region/zone alignment,<br/>no duplicates
Installer->>MachineAsset: Generate machine assets
MachineAsset->>Metadata: Retrieve cached Hosts
Metadata-->>MachineAsset: map[hostID]Host
MachineAsset->>MachineAsset: DedicatedHost(hosts, placement, zone)<br/>select host per zone
MachineAsset->>MachineAsset: Configure machine with<br/>HostTenancy + DedicatedHost
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
@vr4manta: This pull request references SPLAT-2172 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/asset/machines/aws/machinesets.go (1)
91-109:⚠️ Potential issue | 🟠 MajorDon’t silently drop host placement for unmatched zones.
If
DedicatedHost()cannot find a host for the current AZ,dedicatedHostbecomes empty and this code still emits a machineset. ForAffinity=DedicatedHost, that creates a regular machineset for that zone instead of honoring the install-config.Suggested fix
dedicatedHost := DedicatedHost(in.Hosts, mpool.HostPlacement, az) + if mpool.HostPlacement != nil && + mpool.HostPlacement.Affinity != nil && + *mpool.HostPlacement.Affinity == aws.HostAffinityDedicatedHost && + dedicatedHost == "" { + return nil, errors.Errorf("no dedicated host configured for machine pool %q in zone %s", in.Pool.Name, az) + } provider, err := provider(&machineProviderInput{ clusterID: in.ClusterID,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/asset/machines/aws/machinesets.go` around lines 91 - 109, The code currently continues to build and emit a machineset even when DedicatedHost(in.Hosts, mpool.HostPlacement, az) returns an empty dedicatedHost, which silently drops host placement and creates a regular machineset; update the logic around calling provider(&machineProviderInput{...}) to check the pool's affinity (mpool.HostPlacement or the field that indicates Affinity=DedicatedHost) and if affinity is DedicatedHost and dedicatedHost == "" then skip creating/emitting the machineset (return or continue) so we don't produce a non-dedicated machineset for that AZ; reference DedicatedHost(), the dedicatedHost variable, machineProviderInput and mpool.HostPlacement when making the change.
♻️ Duplicate comments (2)
pkg/destroy/aws/ec2helpers.go (1)
973-985:⚠️ Potential issue | 🟠 MajorReturn early on unauthorized
DescribeHosts.The
UnauthorizedOperation/AccessDeniedbranch says “skipping” and then still returns an error, so destroy will fail anyway whenec2:DescribeHostsis missing.Suggested fix
if err != nil { errCode := HandleErrorCode(err) switch errCode { case "InvalidHostID.NotFound": // Host doesn't exist, nothing to do return nil case "UnauthorizedOperation", "AccessDenied": // User doesn't have permission to describe dedicated hosts // This is expected when dedicated host permissions are not configured logger.Warn("User does not have permission to describe dedicated hosts, skipping") + return nil } return fmt.Errorf("failed to describe dedicated host %s: %w", id, err) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/destroy/aws/ec2helpers.go` around lines 973 - 985, The DescribeHosts error handling in the block using HandleErrorCode(err) incorrectly logs a warning for "UnauthorizedOperation" and "AccessDenied" but still returns an error, causing destroy to fail when ec2:DescribeHosts is missing; update the switch in the same function so that for cases "UnauthorizedOperation" and "AccessDenied" you log the warning via logger.Warn("User does not have permission to describe dedicated hosts, skipping") and then return nil instead of falling through to the final fmt.Errorf return; keep the existing behavior for "InvalidHostID.NotFound" (return nil) and for other errors continue to return fmt.Errorf("failed to describe dedicated host %s: %w", id, err).pkg/asset/installconfig/aws/metadata.go (1)
379-391:⚠️ Potential issue | 🟡 MinorCache the empty dedicated-host result too.
len(m.Hosts) == 0conflates “not loaded yet” with “loaded, but there are no hosts”. In an account with zero dedicated hosts, this will callDescribeHostson every access and defeat the cache.Suggested fix
- if len(m.Hosts) == 0 { + if m.Hosts == nil { client, err := m.EC2Client(ctx) if err != nil { return nil, err } - m.Hosts, err = dedicatedHosts(ctx, client) + hosts, err := dedicatedHosts(ctx, client) if err != nil { return nil, fmt.Errorf("error listing dedicated hosts: %w", err) } + m.Hosts = hosts }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/asset/installconfig/aws/metadata.go` around lines 379 - 391, The cache check uses len(m.Hosts) == 0 which treats “not loaded” and “loaded-but-empty” the same and causes repeated DescribeHosts calls; change the guard to check for nil (if m.Hosts == nil) so an empty slice returned by dedicatedHosts is cached, and ensure dedicatedHosts or the assignment leaves m.Hosts as a non-nil empty slice when no hosts are found (refer to m.Hosts, EC2Client, and dedicatedHosts).
🧹 Nitpick comments (1)
pkg/types/aws/validation/machinepool.go (1)
111-121: Missing validation for duplicate host IDs.Per past review discussion, duplicate host IDs in the
DedicatedHostlist should be flagged. Currently, duplicates are silently accepted but would cause logical issues downstream (e.g., the map inDedicatedHost()helper would deduplicate them).♻️ Proposed fix to add uniqueness validation
case aws.HostAffinityDedicatedHost: if len(p.HostPlacement.DedicatedHost) == 0 { allErrs = append(allErrs, field.Required(fldPath.Child("dedicatedHost"), "dedicatedHost is required when 'affinity' is set to DedicatedHost, and forbidden otherwise")) } else { + seenIDs := make(map[string]int) for index, host := range p.HostPlacement.DedicatedHost { hostPath := fldPath.Child("dedicatedHost").Index(index) if len(host.ID) == 0 { allErrs = append(allErrs, field.Required(hostPath.Child("id"), "a hostID must be specified when configuring 'dedicatedHost'")) + } else if prevIdx, seen := seenIDs[host.ID]; seen { + allErrs = append(allErrs, field.Duplicate(hostPath.Child("id"), host.ID)) + } else { + seenIDs[host.ID] = index } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/types/aws/validation/machinepool.go` around lines 111 - 121, The DedicatedHost branch in Validate (case aws.HostAffinityDedicatedHost) lacks a check for duplicate host IDs in p.HostPlacement.DedicatedHost; add uniqueness validation by iterating the slice (the same loop that checks host.ID) and track seen IDs (e.g., with a map[string]bool) to append a field.Duplicate or field.Invalid error to fldPath.Child("dedicatedHost").Index(index).Child("id") when an ID repeats, ensuring behavior matches the DedicatedHost() helper which deduplicates IDs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@data/data/install.openshift.io_installconfigs.yaml`:
- Around line 202-251: The schema allows affinity: DedicatedHost with an empty
dedicatedHost array because has(self.dedicatedHost) only checks presence; update
the dedicatedHost array definition (the dedicatedHost property under
hostPlacement with type: array and item schema containing id) to include
minItems: 1 so empty lists are rejected, and ensure the existing
x-kubernetes-validations rule that references self.dedicatedHost remains
unchanged; apply the same minItems: 1 addition to the other duplicated
dedicatedHost array blocks referenced in the comment (the other two
occurrences).
In `@pkg/asset/cluster/aws/aws.go`:
- Around line 253-269: The collector currently only reads compute.Platform.AWS
from installConfig.Config.Compute for the "worker" pool and misses values
inherited from installConfig.Config.AWS.DefaultMachinePlatform; update the logic
that builds dhNames to first start from
installConfig.Config.AWS.DefaultMachinePlatform (create an awstypes.MachinePool
and call mpool.Set on the default), then overlay the worker compute (if present)
by calling mpool.Set(compute.Platform.AWS) as done in
tagSharedIAMRoles/tagSharedIAMProfiles, handle the compute == nil fallback, and
then read mpool.HostPlacement.DedicatedHost to append name.ID into dhNames so
inherited dedicated hosts are included.
- Around line 284-288: The error wrapper for the ec2Client.CreateTags call is
misleading (mentions subnets); update the wrapped error message in the
CreateTags error return to correctly reference dedicated hosts (e.g., change
"could not add tags to subnets" to "could not add tags to dedicated hosts" or
similar) so callers of ec2Client.CreateTags (using dhNames, tagKey, tagValue)
see the correct resource in the error text.
In `@pkg/asset/installconfig/aws/dedicatedhosts.go`:
- Line 36: Fix the typo in the debug log call by updating the log message string
used in the logrus.Debugf invocation (currently logrus.Debugf("Found dedicatd
host: %s", id)) to the correct wording, e.g. "Found dedicated host: %s",
preserving the formatting and the id argument.
In `@pkg/asset/installconfig/aws/permissions.go`:
- Around line 442-445: RequiredPermissionGroups currently includes
PermissionDedicatedHosts but not PermissionDynamicHostAllocation, so delete-time
permission assembly omits the ec2:ReleaseHosts action; update the code that
builds the destroy/delete permission set (the logic that assembles
RequiredPermissionGroups for destroy) to also include
PermissionDynamicHostAllocation alongside PermissionDedicatedHosts so
ec2:ReleaseHosts is present at destroy time; make the same change where the
delete-time groups are assembled later in the file (the second occurrence around
the 581-583 area). Ensure you reference PermissionDynamicHostAllocation,
PermissionDedicatedHosts and RequiredPermissionGroups when making the change.
- Around line 795-804: The includesDedicatdHosts helper currently only checks
pools named exactly types.MachinePoolComputeRoleName and misses other compute
pools with AWS.HostPlacement; change the loop condition to scan all compute
pools except edge pools (i.e., skip only when mpool.Name ==
types.MachinePoolEdgeRoleName) and return true if any mpool.Platform.AWS != nil
&& mpool.Platform.AWS.HostPlacement != nil; update includesDedicatdHosts
accordingly so DescribeHosts/ReleaseHosts run for any non-edge compute pool with
host placement.
In `@pkg/asset/installconfig/aws/validation.go`:
- Around line 556-560: The validation currently allows a dedicated-host pool
when pool.Zones is empty, which makes zone-to-host mapping non-deterministic;
add an explicit check in the same validation block to reject dedicated-host
pools that don't specify Zones: if len(pool.Zones) == 0 (and this is a
dedicated-host pool context), append a field.Invalid error on dhPath (use host
as the bad value) with a message like "machine pool with dedicated hosts must
specify explicit zones" so we fail fast instead of accepting any zone and later
producing MachineSets that lose dedicated-host placement.
In `@pkg/asset/machines/aws/machines.go`:
- Around line 371-382: The DedicatedHost function currently returns an empty
string when a placement is provided but no host matches the zone, which silently
omits Placement.Host in provider(); change DedicatedHost to return (string,
error), return a non-nil error when placement != nil and no host.ID matches the
given zone (include context like requested host IDs and zone), and update the
caller provider() to propagate or handle that error so the invalid host/zone
mapping is surfaced (e.g., fail validation or emit a clear error) rather than
falling back to shared placement.
In `@pkg/asset/machines/worker.go`:
- Around line 538-543: The code currently calls
installConfig.AWS.DedicatedHosts(ctx) whenever mpool.HostPlacement != nil, which
also runs for valid AnyAvailable placements; change the guard to only call
DedicatedHosts when the placement explicitly indicates DedicatedHost affinity
(i.e., check mpool.HostPlacement == DedicatedHost or the equivalent
enum/constant used in your codebase), so only the DedicatedHost case triggers
installConfig.AWS.DedicatedHosts(ctx) and other placements (e.g., AnyAvailable)
skip the host lookup.
In `@pkg/types/aws/validation/machinepool_test.go`:
- Around line 299-321: Test cases assert short 8-character host IDs as valid but
the validator in pkg/types/aws/machinepool.go expects DedicatedHost.ID to match
^h-[0-9a-f]{17}$; update the failing test entries in
pkg/types/aws/validation/machinepool_test.go that construct aws.MachinePool ->
aws.HostPlacement -> DedicatedHost (IDs "h-1a2b3c4d" and "h-9876abcd") to use
valid 17-hex-character IDs (e.g., "h-0123456789abcdef0") or mark them as invalid
so they align with the DedicatedHost.ID validation. Ensure you adjust the test
name/expectation (valid vs invalid) for those cases accordingly.
In `@pkg/types/aws/validation/machinepool.go`:
- Around line 85-99: The error path is being double-nested because fldPath is
already hostPlacement; remove the extra fldPath.Child("hostPlacement") calls and
use fldPath directly when building the Invalid field errors. Update both checks
that append to allErrs (the block comparing poolName to
types.MachinePoolComputeRoleName and the block for control plane pools) so they
pass fldPath (not fldPath.Child(...)) to field.Invalid along with
p.HostPlacement and the existing errMsg; keep the rest of the logic and messages
unchanged.
---
Outside diff comments:
In `@pkg/asset/machines/aws/machinesets.go`:
- Around line 91-109: The code currently continues to build and emit a
machineset even when DedicatedHost(in.Hosts, mpool.HostPlacement, az) returns an
empty dedicatedHost, which silently drops host placement and creates a regular
machineset; update the logic around calling provider(&machineProviderInput{...})
to check the pool's affinity (mpool.HostPlacement or the field that indicates
Affinity=DedicatedHost) and if affinity is DedicatedHost and dedicatedHost == ""
then skip creating/emitting the machineset (return or continue) so we don't
produce a non-dedicated machineset for that AZ; reference DedicatedHost(), the
dedicatedHost variable, machineProviderInput and mpool.HostPlacement when making
the change.
---
Duplicate comments:
In `@pkg/asset/installconfig/aws/metadata.go`:
- Around line 379-391: The cache check uses len(m.Hosts) == 0 which treats “not
loaded” and “loaded-but-empty” the same and causes repeated DescribeHosts calls;
change the guard to check for nil (if m.Hosts == nil) so an empty slice returned
by dedicatedHosts is cached, and ensure dedicatedHosts or the assignment leaves
m.Hosts as a non-nil empty slice when no hosts are found (refer to m.Hosts,
EC2Client, and dedicatedHosts).
In `@pkg/destroy/aws/ec2helpers.go`:
- Around line 973-985: The DescribeHosts error handling in the block using
HandleErrorCode(err) incorrectly logs a warning for "UnauthorizedOperation" and
"AccessDenied" but still returns an error, causing destroy to fail when
ec2:DescribeHosts is missing; update the switch in the same function so that for
cases "UnauthorizedOperation" and "AccessDenied" you log the warning via
logger.Warn("User does not have permission to describe dedicated hosts,
skipping") and then return nil instead of falling through to the final
fmt.Errorf return; keep the existing behavior for "InvalidHostID.NotFound"
(return nil) and for other errors continue to return fmt.Errorf("failed to
describe dedicated host %s: %w", id, err).
---
Nitpick comments:
In `@pkg/types/aws/validation/machinepool.go`:
- Around line 111-121: The DedicatedHost branch in Validate (case
aws.HostAffinityDedicatedHost) lacks a check for duplicate host IDs in
p.HostPlacement.DedicatedHost; add uniqueness validation by iterating the slice
(the same loop that checks host.ID) and track seen IDs (e.g., with a
map[string]bool) to append a field.Duplicate or field.Invalid error to
fldPath.Child("dedicatedHost").Index(index).Child("id") when an ID repeats,
ensuring behavior matches the DedicatedHost() helper which deduplicates IDs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7ae35e0e-d65b-46e5-883d-66408e6032fb
⛔ Files ignored due to path filters (1)
pkg/types/aws/zz_generated.deepcopy.gois excluded by!**/zz_generated*
📒 Files selected for processing (20)
data/data/install.openshift.io_installconfigs.yamlpkg/asset/cluster/aws/aws.gopkg/asset/installconfig/aws/dedicatedhosts.gopkg/asset/installconfig/aws/metadata.gopkg/asset/installconfig/aws/permissions.gopkg/asset/installconfig/aws/validation.gopkg/asset/installconfig/aws/validation_test.gopkg/asset/machines/aws/machines.gopkg/asset/machines/aws/machinesets.gopkg/asset/machines/worker.gopkg/asset/machines/worker_test.gopkg/destroy/aws/ec2helpers.gopkg/types/aws/machinepool.gopkg/types/aws/validation/featuregates.gopkg/types/aws/validation/featuregates_test.gopkg/types/aws/validation/machinepool.gopkg/types/aws/validation/machinepool_test.gopkg/types/aws/validation/platform.gopkg/types/machinepools.gopkg/types/validation/machinepools.go
| hostPlacement: | ||
| description: |- | ||
| hostPlacement configures placement on AWS Dedicated Hosts. This allows admins to assign instances to specific host | ||
| for a variety of needs including for regulatory compliance, to leverage existing per-socket or per-core software licenses (BYOL), | ||
| and to gain visibility and control over instance placement on a physical server. | ||
| When omitted, the instance is not constrained to a dedicated host. | ||
| properties: | ||
| affinity: | ||
| description: |- | ||
| affinity specifies the affinity setting for the instance. | ||
| Allowed values are AnyAvailable and DedicatedHost. | ||
| When Affinity is set to DedicatedHost, an instance started onto a specific host always restarts on the same host if stopped. In this scenario, the `dedicatedHost` field must be set. | ||
| When Affinity is set to AnyAvailable, and you stop and restart the instance, it can be restarted on any available host. | ||
| enum: | ||
| - DedicatedHost | ||
| - AnyAvailable | ||
| type: string | ||
| dedicatedHost: | ||
| description: |- | ||
| dedicatedHost specifies the exact host that an instance should be restarted on if stopped. | ||
| dedicatedHost is required when 'affinity' is set to DedicatedHost, and forbidden otherwise. | ||
| items: | ||
| description: DedicatedHost represents the configuration | ||
| for the usage of dedicated host. | ||
| properties: | ||
| id: | ||
| description: |- | ||
| id identifies the AWS Dedicated Host on which the instance must run. | ||
| The value must start with "h-" followed by 17 lowercase hexadecimal characters (0-9 and a-f). | ||
| Must be exactly 19 characters in length. | ||
| maxLength: 19 | ||
| minLength: 19 | ||
| type: string | ||
| x-kubernetes-validations: | ||
| - message: hostID must start with 'h-' followed | ||
| by 17 lowercase hexadecimal characters (0-9 | ||
| and a-f) | ||
| rule: self.matches('^h-[0-9a-f]{17}$') | ||
| required: | ||
| - id | ||
| type: object | ||
| type: array | ||
| required: | ||
| - affinity | ||
| type: object | ||
| x-kubernetes-validations: | ||
| - message: dedicatedHost is required when affinity is DedicatedHost, | ||
| and forbidden otherwise | ||
| rule: 'has(self.affinity) && self.affinity == ''DedicatedHost'' | ||
| ? has(self.dedicatedHost) : !has(self.dedicatedHost)' |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Kubernetes CRD x-kubernetes-validations, does has(self.someList) evaluate to true when the list field is present but empty ([]), and is minItems: 1 the recommended schema-level way to require at least one list element?
💡 Result:
-
Yes. In CRD CEL,
has(self.someList)is a presence test (for JSON-like objects: “is this field/key defined?”), not a “non-empty” test—so ifsomeListis present in the object as[],has(self.someList)evaluates to true. This aligns with CEL’s intent of unifying presence checks across protobuf and JSON-like types. [1][2]- If you need “present and non-empty” in CEL, use something like
has(self.someList) && self.someList.size() > 0.
- If you need “present and non-empty” in CEL, use something like
-
Yes. To require at least one list element at the schema level,
minItems: 1is the standard / recommended OpenAPI-schema validation for arrays (preferred over CEL when an OpenAPI validation exists). [3][4]
Sources: Kubernetes CEL docs [2], cel-go README (has macro semantics) [1], Kubernetes CRD validation best practices (prefer OpenAPI validations) [3], controller-tools / OpenAPI marker for MinItems=1 [4].
Enforce non-empty dedicatedHost array when affinity: DedicatedHost.
The has(self.dedicatedHost) check only validates field presence, so a config with affinity: DedicatedHost and dedicatedHost: [] passes validation despite being unusable. Add minItems: 1 to the array definition to reject empty lists at the schema boundary.
Suggested fix
dedicatedHost:
+ minItems: 1
description: |-Apply the same change at lines 1812–1861, 3362–3411, and 5112–5161.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| hostPlacement: | |
| description: |- | |
| hostPlacement configures placement on AWS Dedicated Hosts. This allows admins to assign instances to specific host | |
| for a variety of needs including for regulatory compliance, to leverage existing per-socket or per-core software licenses (BYOL), | |
| and to gain visibility and control over instance placement on a physical server. | |
| When omitted, the instance is not constrained to a dedicated host. | |
| properties: | |
| affinity: | |
| description: |- | |
| affinity specifies the affinity setting for the instance. | |
| Allowed values are AnyAvailable and DedicatedHost. | |
| When Affinity is set to DedicatedHost, an instance started onto a specific host always restarts on the same host if stopped. In this scenario, the `dedicatedHost` field must be set. | |
| When Affinity is set to AnyAvailable, and you stop and restart the instance, it can be restarted on any available host. | |
| enum: | |
| - DedicatedHost | |
| - AnyAvailable | |
| type: string | |
| dedicatedHost: | |
| description: |- | |
| dedicatedHost specifies the exact host that an instance should be restarted on if stopped. | |
| dedicatedHost is required when 'affinity' is set to DedicatedHost, and forbidden otherwise. | |
| items: | |
| description: DedicatedHost represents the configuration | |
| for the usage of dedicated host. | |
| properties: | |
| id: | |
| description: |- | |
| id identifies the AWS Dedicated Host on which the instance must run. | |
| The value must start with "h-" followed by 17 lowercase hexadecimal characters (0-9 and a-f). | |
| Must be exactly 19 characters in length. | |
| maxLength: 19 | |
| minLength: 19 | |
| type: string | |
| x-kubernetes-validations: | |
| - message: hostID must start with 'h-' followed | |
| by 17 lowercase hexadecimal characters (0-9 | |
| and a-f) | |
| rule: self.matches('^h-[0-9a-f]{17}$') | |
| required: | |
| - id | |
| type: object | |
| type: array | |
| required: | |
| - affinity | |
| type: object | |
| x-kubernetes-validations: | |
| - message: dedicatedHost is required when affinity is DedicatedHost, | |
| and forbidden otherwise | |
| rule: 'has(self.affinity) && self.affinity == ''DedicatedHost'' | |
| ? has(self.dedicatedHost) : !has(self.dedicatedHost)' | |
| hostPlacement: | |
| description: |- | |
| hostPlacement configures placement on AWS Dedicated Hosts. This allows admins to assign instances to specific host | |
| for a variety of needs including for regulatory compliance, to leverage existing per-socket or per-core software licenses (BYOL), | |
| and to gain visibility and control over instance placement on a physical server. | |
| When omitted, the instance is not constrained to a dedicated host. | |
| properties: | |
| affinity: | |
| description: |- | |
| affinity specifies the affinity setting for the instance. | |
| Allowed values are AnyAvailable and DedicatedHost. | |
| When Affinity is set to DedicatedHost, an instance started onto a specific host always restarts on the same host if stopped. In this scenario, the `dedicatedHost` field must be set. | |
| When Affinity is set to AnyAvailable, and you stop and restart the instance, it can be restarted on any available host. | |
| enum: | |
| - DedicatedHost | |
| - AnyAvailable | |
| type: string | |
| dedicatedHost: | |
| minItems: 1 | |
| description: |- | |
| dedicatedHost specifies the exact host that an instance should be restarted on if stopped. | |
| dedicatedHost is required when 'affinity' is set to DedicatedHost, and forbidden otherwise. | |
| items: | |
| description: DedicatedHost represents the configuration | |
| for the usage of dedicated host. | |
| properties: | |
| id: | |
| description: |- | |
| id identifies the AWS Dedicated Host on which the instance must run. | |
| The value must start with "h-" followed by 17 lowercase hexadecimal characters (0-9 and a-f). | |
| Must be exactly 19 characters in length. | |
| maxLength: 19 | |
| minLength: 19 | |
| type: string | |
| x-kubernetes-validations: | |
| - message: hostID must start with 'h-' followed | |
| by 17 lowercase hexadecimal characters (0-9 | |
| and a-f) | |
| rule: self.matches('^h-[0-9a-f]{17}$') | |
| required: | |
| - id | |
| type: object | |
| type: array | |
| required: | |
| - affinity | |
| type: object | |
| x-kubernetes-validations: | |
| - message: dedicatedHost is required when affinity is DedicatedHost, | |
| and forbidden otherwise | |
| rule: 'has(self.affinity) && self.affinity == ''DedicatedHost'' | |
| ? has(self.dedicatedHost) : !has(self.dedicatedHost)' |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data/data/install.openshift.io_installconfigs.yaml` around lines 202 - 251,
The schema allows affinity: DedicatedHost with an empty dedicatedHost array
because has(self.dedicatedHost) only checks presence; update the dedicatedHost
array definition (the dedicatedHost property under hostPlacement with type:
array and item schema containing id) to include minItems: 1 so empty lists are
rejected, and ensure the existing x-kubernetes-validations rule that references
self.dedicatedHost remains unchanged; apply the same minItems: 1 addition to the
other duplicated dedicatedHost array blocks referenced in the comment (the other
two occurrences).
| PermissionDynamicHostAllocation: { | ||
| // This is only used during cluster destroy if during cluster destroy we detect a dedicated host with appropriate tags on it. | ||
| "ec2:ReleaseHosts", | ||
| }, |
There was a problem hiding this comment.
Also append the destroy-time host release permission.
RequiredPermissionGroups opts into PermissionDedicatedHosts, but never PermissionDynamicHostAllocation. Since this function already assembles the delete-time permission set too, creds validation will miss ec2:ReleaseHosts, so dedicated-host clusters can validate successfully and still fail to clean up dynamically allocated hosts on destroy.
Suggested fix
if includesDedicatdHosts(ic) {
permissionGroups = append(permissionGroups, PermissionDedicatedHosts)
+ if !isSecretRegion {
+ permissionGroups = append(permissionGroups, PermissionDynamicHostAllocation)
+ }
}Also applies to: 581-583
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/asset/installconfig/aws/permissions.go` around lines 442 - 445,
RequiredPermissionGroups currently includes PermissionDedicatedHosts but not
PermissionDynamicHostAllocation, so delete-time permission assembly omits the
ec2:ReleaseHosts action; update the code that builds the destroy/delete
permission set (the logic that assembles RequiredPermissionGroups for destroy)
to also include PermissionDynamicHostAllocation alongside
PermissionDedicatedHosts so ec2:ReleaseHosts is present at destroy time; make
the same change where the delete-time groups are assembled later in the file
(the second occurrence around the 581-583 area). Ensure you reference
PermissionDynamicHostAllocation, PermissionDedicatedHosts and
RequiredPermissionGroups when making the change.
| func DedicatedHost(hosts map[string]aws.Host, placement *awstypes.HostPlacement, zone string) string { | ||
| // If install-config has HostPlacements configured, lets check the DedicatedHosts to see if one matches our region & zone. | ||
| if placement != nil { | ||
| // We only support one host ID currently for an instance. Need to also get host that matches the zone the machines will be put into. | ||
| for _, host := range placement.DedicatedHost { | ||
| hostDetails, found := hosts[host.ID] | ||
| if found && hostDetails.Zone == zone { | ||
| return hostDetails.ID | ||
| } | ||
| } | ||
| } | ||
| return "" |
There was a problem hiding this comment.
Don’t silently fall back to shared placement when no host matches the zone.
Returning "" here causes provider() on Lines 309-318 to omit Placement.Host completely. For a pool with Affinity == DedicatedHost, that emits a normal instance config instead of surfacing an invalid host/zone mapping.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/asset/machines/aws/machines.go` around lines 371 - 382, The DedicatedHost
function currently returns an empty string when a placement is provided but no
host matches the zone, which silently omits Placement.Host in provider(); change
DedicatedHost to return (string, error), return a non-nil error when placement
!= nil and no host.ID matches the given zone (include context like requested
host IDs and zone), and update the caller provider() to propagate or handle that
error so the invalid host/zone mapping is surfaced (e.g., fail validation or
emit a clear error) rather than falling back to shared placement.
|
@vr4manta: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
SPLAT-2172
Changes
Dependencies
Notes
MAO and CAO changes are needed for it to fully work. For now, this PR is adding the ability to generate the needed outputs for bootstrapping.
Summary by CodeRabbit
Release Notes
New Features
Improvements