WIP: Rebase 1.35 reenable blackbox test#2604

Open
jacobsee wants to merge 2456 commits into openshift:master from jacobsee:rebase-1.35-reenable-blackbox-test

@jacobsee
Member

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR is related to:

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


k8s-release-robot and others added 30 commits November 12, 2025 09:48
…pod-resize

Pod level in place pod resize - alpha
…containerd-skip

[KEP-4639] Remove image volume e2e test because CI has containerd < 2.1
Fix volume performance tests with performance constraints
update github.com/opencontainers/selinux to v1.13.0
* First version of batching w/out signatures.

* First version of pod signatures.

* Integrate batching with signatures.

* Fix merge conflicts.

* Fixes from self-review.

* Test fixes.

* Fix a bug that limited batches to size 2
Also add some new high-level logging and
simplify the pod affinity signature.

* Re-enable batching on perf tests for now.

* fwk.NewStatus(fwk.Success)

* Review feedback.

* Review feedback.

* Comment fix.

* Two plugin-specific unit tests.

* Add cycle state to the sign call, apply to topo spread.
Also add unit tests for several plugin signature
calls.

* Review feedback.

* Switch to distinct stats for hint and store calls.

* Switch signature from string to []byte

* Revert cyclestate in signs. Update node affinity.
Node affinity now sorts all of the various
nested arrays in the structure. CycleState no
longer in signature; revert to signing fewer
cases for pod spread.

* hack/update-vendor.sh

* Disable signatures when extenders are configured.

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update staging/src/k8s.io/kube-scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Disable node resource signatures when extended DRA enabled.

* Review feedback.

* Update pkg/scheduler/framework/plugins/imagelocality/image_locality.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/plugins/nodedeclaredfeatures/nodedeclaredfeatures.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Fixes for review suggestions.

* Add integration tests.

* Linter fixes, test fix.

* Whitespace fix.

* Remove broken test.

* Unschedulable test.

* Remove go.mod changes.

---------

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>
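The commit message above mentions that node affinity signatures sort the nested arrays and that signatures are `[]byte`. A minimal sketch of why sorting matters, assuming a simplified `requirement` type and a `sign` helper that are both hypothetical and not the scheduler's actual API: two pods whose selector terms differ only in ordering should produce the same signature so they can share a batch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// requirement is a hypothetical, simplified stand-in for a node selector
// requirement; it is not the type used by the scheduler.
type requirement struct {
	key      string
	operator string
	values   []string
}

// sign builds an order-independent signature: values are sorted inside each
// requirement, then the encoded requirements are sorted as a whole.
// The separators are illustrative only and are not escaped.
func sign(reqs []requirement) []byte {
	parts := make([]string, 0, len(reqs))
	for _, r := range reqs {
		vals := append([]string(nil), r.values...) // copy, do not mutate input
		sort.Strings(vals)
		parts = append(parts, r.key+" "+r.operator+" "+strings.Join(vals, ","))
	}
	sort.Strings(parts)
	return []byte(strings.Join(parts, ";"))
}

func main() {
	a := []requirement{
		{key: "zone", operator: "In", values: []string{"b", "a"}},
		{key: "arch", operator: "In", values: []string{"amd64"}},
	}
	b := []requirement{
		{key: "arch", operator: "In", values: []string{"amd64"}},
		{key: "zone", operator: "In", values: []string{"a", "b"}},
	}
	// Same constraints in a different order yield the same signature.
	fmt.Println(string(sign(a)) == string(sign(b)))
}
```

A real implementation would need an unambiguous encoding (for example, length-prefixed fields) so that keys or values containing the separator characters cannot collide.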
Update the procMount test expectations to match the intentional PSA
policy relaxation introduced in commit e8bd3f6.

As of Kubernetes 1.35+, Pod Security Admission Baseline policy
allows UnmaskedProcMount for pods with user namespaces (hostUsers:
false). This was an intentional change to support nested container
use cases while maintaining security through user namespace isolation.

The test "will fail to unmask proc mounts if not privileged" was
written before this relaxation and expected Baseline level to reject
UnmaskedProcMount. Since Baseline now allows it (for user namespace
pods), the test needs to use Restricted level instead, which
unconditionally blocks UnmaskedProcMount regardless of user namespace
settings.

Changes:
- Change PSA level from Baseline to Restricted
- Update test name to clarify it's testing Restricted level behavior
- Update framework name from "proc-mount-baseline-test" to
  "proc-mount-restricted-test"

Fixes the ci-crio-userns-e2e-serial test failure that started occurring
when runtimes began reporting user namespace support.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
…-userns-validation

test/e2e_node: Update procMount test to use Restricted PSA level
Fixes issue kubernetes#134023 where alpha API warnings were being logged
when binary version (1.34.1) and emulation version (1.34) differed
only in patch version.

The issue was in api_enablement.go where the version comparison
was using EqualTo() which compares all version components including
patch versions. The fix changes the comparison to only check
major.minor versions using version.MajorMinor().

Changes:
- Modified version comparison logic in ApplyTo() method to only
  compare major.minor versions, not patch versions
- Added comprehensive test cases to verify the fix works correctly
- Tests confirm that warnings are still logged for different
  major/minor versions but not for different patch versions

This prevents spurious warnings when emulation version is set to
major.minor (e.g., 1.34) and binary version includes patch (e.g., 1.34.1).
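The difference between the two comparisons can be sketched as follows. This is a self-contained illustration with a hypothetical `version` struct and helper names; it does not use the real version package or reproduce the actual `ApplyTo()` code.

```go
package main

import "fmt"

// version is a hypothetical stand-in for a parsed version number.
type version struct {
	major, minor, patch uint
}

// equalTo compares every component, including patch. Using this for the
// warning check is what produced the spurious warnings described above.
func (v version) equalTo(o version) bool {
	return v == o
}

// sameMajorMinor ignores the patch component.
func (v version) sameMajorMinor(o version) bool {
	return v.major == o.major && v.minor == o.minor
}

// shouldWarn reports whether the binary and emulation versions differ in a
// way that matters for alpha API warnings (major or minor, not patch).
func shouldWarn(binary, emulation version) bool {
	return !binary.sameMajorMinor(emulation)
}

func main() {
	binary := version{1, 34, 1}
	emulation := version{1, 34, 0} // emulation set to "1.34"

	fmt.Println(binary.equalTo(emulation))          // false: patch differs
	fmt.Println(shouldWarn(binary, emulation))      // false: same major.minor
	fmt.Println(shouldWarn(binary, version{1, 33, 0})) // true: minor differs
}
```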
…rnings-134023

Fix alpha API warnings for patch version differences
Signed-off-by: Aman Shrivastava <amanshrivastava118@gmail.com>
…rf_tests_on_featuregates

Fix failing scheduler_perf test cases that don't set any feature gate
The allowRelaxedServiceNameValidation() function currently only checks
service names in spec.rules, but it should also check the service name
in spec.defaultBackend.

When an Ingress has a defaultBackend with a service name that is valid
per RFC 1123 but invalid per RFC 1035 (e.g., starting with a digit like
"1-default-service"), the function incorrectly returns false. This
prevents users from updating such Ingresses even though they were
validly created in the past.

This commit adds validation for spec.defaultBackend.service.name to
maintain backward compatibility for existing Ingresses.
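A minimal sketch of the check described above, assuming simplified hypothetical types (`ingress`, `backend`) and a helper named `allowRelaxed` rather than the real API types or the real function body. The regular expression is the usual RFC 1035 label pattern; the 63-character length limit is left out for brevity.

```go
package main

import (
	"fmt"
	"regexp"
)

// rfc1035Label matches names that start with a letter. RFC 1123 labels may
// also start with a digit, so "1-default-service" is valid under RFC 1123
// but not under RFC 1035.
var rfc1035Label = regexp.MustCompile(`^[a-z]([-a-z0-9]*[a-z0-9])?$`)

// backend and ingress are hypothetical, simplified stand-ins.
type backend struct{ serviceName string }

type ingress struct {
	defaultBackend *backend  // corresponds to spec.defaultBackend
	ruleBackends   []backend // backends flattened from spec.rules
}

// needsRelaxed reports whether an existing name would fail RFC 1035 checks.
func needsRelaxed(name string) bool {
	return name != "" && !rfc1035Label.MatchString(name)
}

// allowRelaxed returns true when the old object already carries a service
// name that only passes relaxed validation, in either location.
func allowRelaxed(old *ingress) bool {
	if old == nil {
		return false
	}
	if old.defaultBackend != nil && needsRelaxed(old.defaultBackend.serviceName) {
		return true
	}
	for _, b := range old.ruleBackends {
		if needsRelaxed(b.serviceName) {
			return true
		}
	}
	return false
}

func main() {
	old := &ingress{defaultBackend: &backend{serviceName: "1-default-service"}}
	fmt.Println(allowRelaxed(old)) // true: updates keep using relaxed rules
}
```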
Fallback to live ns lookup on admission if lister cannot find namespace
…pand-flake-fix

e2e/storage: deflake CSI Mock volume expansion quota validation
bertinatto and others added 12 commits February 11, 2026 15:29
Signed-off-by: Harshal Patil <12152047+harche@users.noreply.github.com>
Analysis of flakes from the k8s suite has shown consistent examples
of otherwise well-behaved tests failing due to timeouts caused by
temporary load on controllers during parallel testing. Increasing these
timeouts will reduce flakes.
The pod resize e2e tests use memory limits as low as 20Mi for Guaranteed
QoS pods. On OpenShift/CRI-O, the container runtime (runc) runs inside
the pod's cgroup and requires ~20-22MB of memory during container
creation and restart operations. This causes intermittent OOM kills
when the pod's memory limit is at or below runc's memory footprint.

This issue does not occur on containerd-based clusters because
containerd's shim runs outside the pod's cgroup by default (ShimCgroup=""),
so runc's memory is not charged against the pod's limit.

Increase memory limits to provide sufficient headroom for runc:
- originalMem: 20Mi -> 35Mi
- reducedMem: 15Mi -> 30Mi
- increasedMem: 25Mi -> 40Mi

The test validates resize behavior, not minimal memory limits, so
larger values do not reduce test coverage.

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
It needs to run as [Serial] so that it does not accidentally evict other Pods.

Consider this scenario on a busy cluster, with all nodes at their attachment limit.

1. pod1 of the preemption test runs, pod2 is created.
2. The scheduler evicts pod1. That frees the RWOP volume and it also frees the last attachment slot on the node.
3. Some other e2e test creates a Pod and the scheduler puts it on a node, taking the last attachment slot.
4. The scheduler schedules pod2 again and sees that there is no node with a free attachment slot -> new round of eviction, now evicting a pod of an unrelated e2e test. The unrelated test will fail.
With the currently established limits, we can see the following error:

```
failed to list <object>: unable to determine group/version/kind: cbor:
exceeded max number of elements 1024 for CBOR array.
```
…t once per pod

The RS scale down closure was registered inside the per-pod loop,
causing it to run several times during cleanup. On the second+
iteration the RS Get/Update could fail with a conflict error if the
resourceVersion changed, failing the test during teardown even
though the test itself passed. Move it out of the loop so it runs once.
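The effect of registering the closure inside versus outside the loop can be sketched as follows. `cleanupStack` and `countScaleDowns` are hypothetical names made up for this illustration; they are not the e2e framework's cleanup API.

```go
package main

import "fmt"

// cleanupStack is a hypothetical stand-in for a test framework's deferred
// cleanup registry. Registered functions run in reverse order.
type cleanupStack struct{ fns []func() }

func (c *cleanupStack) add(fn func()) { c.fns = append(c.fns, fn) }

func (c *cleanupStack) run() {
	for i := len(c.fns) - 1; i >= 0; i-- {
		c.fns[i]()
	}
}

// countScaleDowns registers a scale-down closure either once per pod
// (the old behavior) or once in total (the fix), runs cleanup, and returns
// how many times the closure executed.
func countScaleDowns(pods []string, once bool) int {
	var c cleanupStack
	calls := 0
	scaleDown := func() { calls++ } // stands in for the RS Get/Update

	for range pods {
		// ... per-pod setup would go here ...
		if !once {
			c.add(scaleDown) // registered again on every iteration
		}
	}
	if once {
		c.add(scaleDown) // registered a single time, outside the loop
	}

	c.run()
	return calls
}

func main() {
	pods := []string{"pod-a", "pod-b", "pod-c"}
	fmt.Println(countScaleDowns(pods, false)) // 3
	fmt.Println(countScaleDowns(pods, true))  // 1
}
```

Each extra run is another Get/Update round trip, which is where a conflict on a changed resourceVersion can surface during teardown.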
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 28, 2026
@openshift-ci-robot openshift-ci-robot added the backports/unvalidated-commits Indicates that not all commits come to merged upstream PRs. label Feb 28, 2026
@openshift-ci-robot

@jacobsee: the contents of this pull request could not be automatically validated.

The following commits are valid:

The following commits could not be validated and must be approved by a top-level approver:

Comment /validate-backports to re-evaluate validity of the upstream PRs, for example when they are merged upstream.

@jacobsee
Member Author

/payload-job 4.22-e2e-aws-ovn-serial-2of2 4.22-e2e-aws-ovn-techpreview-serial-1of3

@openshift-ci

openshift-ci bot commented Feb 28, 2026

@jacobsee: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

@openshift-ci openshift-ci bot requested review from bertinatto and mrunalp February 28, 2026 02:02
@openshift-ci openshift-ci bot added the vendor-update Touching vendor dir or related files label Feb 28, 2026
@openshift-ci

openshift-ci bot commented Feb 28, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jacobsee
Once this PR has been reviewed and has the lgtm label, please assign p0lyn0mial for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jacobsee
Member Author

/payload-job 4.22-e2e-aws-ovn-serial-2of2

@openshift-ci

openshift-ci bot commented Feb 28, 2026

@jacobsee: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

@jacobsee
Member Author

/payload-job nightly-4.22-e2e-aws-ovn-serial-2of2

@openshift-ci

openshift-ci bot commented Feb 28, 2026

@jacobsee: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

@jacobsee
Member Author

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-serial-2of2 periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-techpreview-serial-1of3

@openshift-ci

openshift-ci bot commented Feb 28, 2026

@jacobsee: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-serial-2of2
  • periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-techpreview-serial-1of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/911c5b90-1450-11f1-9e0d-e70553aef5fe-0

@openshift-ci

openshift-ci bot commented Feb 28, 2026

@jacobsee: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-runc | c7cc8a5 | link | true | /test e2e-aws-ovn-runc |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jacobsee
Member Author

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-serial-2of2 periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-techpreview-serial-1of3

@openshift-ci

openshift-ci bot commented Feb 28, 2026

@jacobsee: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-serial-2of2
  • periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-techpreview-serial-1of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ddb34ac0-1475-11f1-9862-413330fa68e8-0

Labels

- backports/unvalidated-commits: Indicates that not all commits come to merged upstream PRs.
- do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
- vendor-update: Touching vendor dir or related files
