Skip to content

Prevent spurious failures due to APIBindings not being ready#3934

Merged
kcp-ci-bot merged 1 commit intokcp-dev:mainfrom
ntnn:wait-for-apibindings
Mar 21, 2026
Merged

Prevent spurious failures due to APIBindings not being ready#3934
kcp-ci-bot merged 1 commit intokcp-dev:mainfrom
ntnn:wait-for-apibindings

Conversation

@ntnn
Copy link
Member

@ntnn ntnn commented Mar 20, 2026

Summary

So I'm not sure about this. Not every workspace requires the kcp APIs to be ready for their test case. On the other hand it wouldn't hurt to wait the few milliseconds more.

Alternatively we'd have to add similar waits to all tests that do require the kcp APIs over time, which is a lot of tests.

In CI errors related to kcp resources sometimes come up, e.g.:

workspace_test.go:556:
    	Error Trace:	/home/prow/go/src/github.com/kcp-dev/kcp/test/e2e/authentication/workspace_test.go:556
    	Error:      	"the server could not find the requested resource (post workspaceauthenticationconfigurations.tenancy.kcp.io)" does not contain "claim and expression cannot both be specified"
    	Test:       	TestAcceptableWorkspaceAuthenticationConfigurations/claim-and-expression

The reason for this is that the fixture only waits for the workspace and underlying logical cluster to be ready.

Instead the fixture now waits until all APIBindings (which are usually just the kcp APIs) are bound.

What Type of PR Is This?

/kind flake
/kind bug

Related Issue(s)

Fixes #

Release Notes

NONE

@kcp-ci-bot kcp-ci-bot added release-note-none Denotes a PR that doesn't merit a release note. kind/flake Categorizes issue or PR as related to a flaky test. dco-signoff: yes Indicates the PR's author has signed the DCO. kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 20, 2026
@ntnn
Copy link
Member Author

ntnn commented Mar 20, 2026

Just had another failure just like it:

     controller_test.go:89: 
        	Error Trace:	/home/prow/go/src/github.com/kcp-dev/kcp/test/e2e/reconciler/workspacedeletion/controller_test.go:89
        	            				/home/prow/go/src/github.com/kcp-dev/kcp/test/e2e/reconciler/workspacedeletion/controller_test.go:318
        	Error:      	Received unexpected error:
        	            	the server could not find the requested resource (post workspaces.tenancy.kcp.io)
        	Test:       	TestWorkspaceDeletion/create_and_clean_workspace 

@ntnn ntnn added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Mar 20, 2026
@ntnn
Copy link
Member Author

ntnn commented Mar 20, 2026

I think this is due to #3847 as the installation of resources is done by confighelpers.Bootstrap, which now installs the bootstrapped resources in groups rather than trying to install everything at once and retrying until they install.

I remember seeing this error beforehand as well, the change just made it more pronounced, probably because there's at least one resource group being installed before the APIBindings.

In CI errors related to kcp resources sometimes come up, e.g.:

    workspace_test.go:556:
        	Error Trace:	/home/prow/go/src/github.com/kcp-dev/kcp/test/e2e/authentication/workspace_test.go:556
        	Error:      	"the server could not find the requested resource (post workspaceauthenticationconfigurations.tenancy.kcp.io)" does not contain "claim and expression cannot both be specified"
        	Test:       	TestAcceptableWorkspaceAuthenticationConfigurations/claim-and-expression

The reason for this is that the fixture only waits for the workspace and
underlying logical cluster to be ready.

Instead the fixture now waits until all APIBindings (which are usually
just the kcp APIs) are bound.

Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
@ntnn ntnn force-pushed the wait-for-apibindings branch from 8acb331 to 2e4705e Compare March 20, 2026 20:41
@ntnn ntnn added this to tbd Mar 20, 2026
@github-project-automation github-project-automation bot moved this to Backlog in tbd Mar 20, 2026
@mjudeikis
Copy link
Contributor

/lgtm
/approve

@kcp-ci-bot kcp-ci-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 21, 2026
@kcp-ci-bot
Copy link
Contributor

LGTM label has been added.

DetailsGit tree hash: e43e1d8bc65fae236528e9811668d4af033f0b0c

@kcp-ci-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mjudeikis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kcp-ci-bot kcp-ci-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 21, 2026
@ntnn
Copy link
Member Author

ntnn commented Mar 21, 2026

/retest

Haven't seen that one flake before:

     inactive_test.go:55: Mark the logicalcluster as inactive
    inactive_test.go:58: 
        	Error Trace:	/home/prow/go/src/github.com/kcp-dev/kcp/test/e2e/workspace/inactive_test.go:58
        	Error:      	Received unexpected error:
        	            	Operation cannot be fulfilled on logicalclusters.core.kcp.io "cluster": the object has been modified; please apply your changes to the latest version and try again
        	Test:       	TestInactiveLogicalCluster 

@kcp-ci-bot kcp-ci-bot merged commit c9acc4c into kcp-dev:main Mar 21, 2026
14 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in tbd Mar 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has signed the DCO. kind/bug Categorizes issue or PR as related to a bug. kind/flake Categorizes issue or PR as related to a flaky test. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants