2 changes: 1 addition & 1 deletion Makefile
@@ -410,7 +410,7 @@ test: ## Run tests
for MOD in $$(git ls-files '**/go.mod' | sed 's,/go.mod,,'); do \
if [ "$$MOD" != "." ]; then \
echo "Testing $$MOD module..."; \
(cd $$MOD && $(GO_TEST) -race $(COUNT_ARG) -coverprofile=coverage.txt -covermode=atomic $(TEST_ARGS) $(WHAT)); \
(cd $$MOD && $(GO_TEST) -race $(COUNT_ARG) -coverprofile=coverage.txt -covermode=atomic $(TEST_ARGS) $$(go list "$(WHAT)" | grep -v 'test/load/testing')); \
Member
Do we really need this? `test/load` is its own Go module, so I'd think it enters `test/load` and then runs `go test -race ...`.
It's just that this looks very hacky =/

Member Author

The problem is that `go test -race` would run the load tests themselves, which cannot (and should not) be executed as unit tests.
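The filter in the new Makefile line works because `go list` prints one import path per line and `grep -v` drops everything under `test/load/testing` before the list reaches `go test`. A standalone sketch of that pipeline (the package paths are made up for illustration):

```shell
# Stand-in for `go list "$(WHAT)"`: one import path per line.
# grep -v removes the load-test packages before they reach `go test`.
printf '%s\n' \
  'example.com/mod/pkg/metrics' \
  'example.com/mod/test/load/testing/level1' \
  'example.com/mod/pkg/runner' \
  | grep -v 'test/load/testing'
```

Only the `pkg/metrics` and `pkg/runner` lines survive the filter, so the load-test packages never reach the unit-test run.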

fi; \
done

125 changes: 125 additions & 0 deletions test/load/Proposal.md
@@ -0,0 +1,125 @@
# Load Testing

## 10000 workspaces reference architecture

Assumptions & Requirements:

* We want to test how a kcp installation with 10000 workspaces behaves on synthetic workloads
* We do not want to test how easily kcp handles adding 10000 workspaces at once
* We don't want to run etcd on machines which are hosting kcp
* We will be using kcp-operator to setup a sharded kcp instance
* The minimum amount of replicas for any component is 3
* The loadtests are infrastructure provider agnostic. This will allow us and community members to
experiment with different infrastructure sizes
* We treat the rootshard like we would any other shard. It will be filled with regular workspaces,
  so shard1 = rootshard
* As the kcp-operator currently has no support for a dedicated cache server, we have decided to stick
  with the default model of having an embedded cache in the rootshard (even if this adds load to the
  rootshard). Specifically, this means that with 3 shards, we create 1 rootshard and 2 regular shards
* Results are stored in some permanent storage so we can use them for comparison later

### Architecture

[drawing of the general layout](./architecture.excalidraw)

### Node calculation

All node calculations are based on the number of workspaces and use the following recommended constants:

* max_workspaces_per_shard = 3500
* min_replicas = 3
* kcp_server_buffer = 512MB
* #kcp_cache_nodes = 1
* #aux_nodes = 1
* #frontproxy_nodes = 3
* mem_per_workspace = 5MB

---

1. We calculate the number of shards

```txt
#shards = round_up(#workspaces / max_workspaces_per_shard)
```

1. Now we can calculate the number of etcd nodes.

```txt
#etcd_nodes = #shards * min_replicas
```

1. We can calculate the number of shard nodes and their size in relation to the number of workspaces

```txt
#shard_nodes = #shards * min_replicas
#actual_workspaces_per_shard = #workspaces / #shards
kcp_server_node_mem = kcp_server_buffer + (#actual_workspaces_per_shard * mem_per_workspace)
```

The total number of all required nodes is calculated as follows

```txt
#total_nodes = #frontproxy_nodes + #kcp_cache_nodes + #etcd_nodes + #kcp_server_nodes + #aux_nodes
```

#### Example for 10000 workspaces

```txt
#shards = 10000 / 3500 = 2.86 = 3
#etcd_nodes = 3 * 3 = 9
#shard_nodes = 3 * 3 = 9
#actual_workspaces_per_shard = 10000 / 3 = 3333
kcp_server_node_mem = 512 + (3333 * 5) = 17177MB
#total_nodes = 3 + 1 + 9 + 9 + 1 = 23
```
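The formulas above can be sketched as a small Go program. This is an illustrative calculation only; it assumes, as the total-nodes formula implies, that `#kcp_server_nodes` equals `#shard_nodes`:

```go
package main

import (
	"fmt"
	"math"
)

// Recommended constants from the proposal.
const (
	maxWorkspacesPerShard = 3500
	minReplicas           = 3
	kcpServerBufferMB     = 512
	kcpCacheNodes         = 1
	auxNodes              = 1
	frontproxyNodes       = 3
	memPerWorkspaceMB     = 5
)

func main() {
	workspaces := 10000

	// #shards = round_up(#workspaces / max_workspaces_per_shard)
	shards := int(math.Ceil(float64(workspaces) / maxWorkspacesPerShard))

	// etcd and shard node counts both scale with min_replicas.
	etcdNodes := shards * minReplicas
	shardNodes := shards * minReplicas

	// Per-shard workspace count and kcp server memory sizing.
	workspacesPerShard := workspaces / shards
	serverNodeMemMB := kcpServerBufferMB + workspacesPerShard*memPerWorkspaceMB

	// Assumes #kcp_server_nodes == #shard_nodes.
	totalNodes := frontproxyNodes + kcpCacheNodes + etcdNodes + shardNodes + auxNodes

	fmt.Println("shards:", shards)
	fmt.Println("etcd nodes:", etcdNodes)
	fmt.Println("shard nodes:", shardNodes)
	fmt.Println("workspaces per shard:", workspacesPerShard)
	fmt.Println("server node mem (MB):", serverNodeMemMB)
	fmt.Println("total nodes:", totalNodes)
}
```

For 10000 workspaces this reproduces the worked example: 3 shards, 9 etcd nodes, 9 shard nodes, 3333 workspaces per shard, and 17177 MB of memory per kcp server node.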

### Testing Protocol

Mantra: We want to test how a kcp installation with 10000 workspaces behaves under simulated, regular
activity. We don't want to test how easily we can add 10000 workspaces at once.

#### Procedure

1. Create 10000 workspaces, APIExports, etc. and patiently wait for all of them to become ready
2. Simulate real world activity by simulating end-users using custom kubeconfigs to create APIBindings
and then CRUD on their custom api-objects

##### Level 1 - 10000 empty workspaces

We are just going to put 10000 empty workspaces into a kcp installation. We will have a nesting level
setting so we can test whether nesting has any impact (it should not). This test case
is highly deterministic and should spread workspaces relatively equally across shards.
We mainly use this as a base consumption measurement and to verify nesting has no performance impact.

##### Level 2 - Basic CRUD

Every workspace has a type and we are going to do a basic parallel CRUD workflow which we will simulate
using simple Kubernetes Jobs. The workflow is done on basic objects from a single provider using a
singular APIExport.
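The shape of that per-workspace workflow can be sketched as a parallel create/read/update/delete loop. The in-memory store below is only a stand-in for the real API objects served through an APIExport, and all names are hypothetical:

```go
package main

import (
	"fmt"
	"sync"
)

// store is an in-memory stand-in for the API objects a single
// provider would serve through its APIExport.
type store struct {
	mu      sync.Mutex
	objects map[string]string
}

func (s *store) create(name, spec string) { s.mu.Lock(); defer s.mu.Unlock(); s.objects[name] = spec }
func (s *store) read(name string) string  { s.mu.Lock(); defer s.mu.Unlock(); return s.objects[name] }
func (s *store) update(name, spec string) { s.mu.Lock(); defer s.mu.Unlock(); s.objects[name] = spec }
func (s *store) remove(name string)       { s.mu.Lock(); defer s.mu.Unlock(); delete(s.objects, name) }

func main() {
	s := &store{objects: map[string]string{}}
	var wg sync.WaitGroup

	// One goroutine per simulated workspace, each running a full CRUD cycle,
	// mirroring the parallel Jobs described above.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(ws int) {
			defer wg.Done()
			name := fmt.Sprintf("ws-%d/widget", ws)
			s.create(name, "v1")
			_ = s.read(name)
			s.update(name, "v2")
			s.remove(name)
		}(i)
	}
	wg.Wait()
	fmt.Println("remaining objects:", len(s.objects))
}
```

In the real test each cycle would go through the kcp front-proxy against bound API objects; the sketch only illustrates the concurrency pattern.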

##### Level 3 - Multiple Providers

We are multiplexing the level 2 example to use multiple providers.

##### Outlook

We want to keep the initial version of the tests simple and deterministic. As a result, the following
topics have been discussed but are not part of the first three levels:

* direct user interaction via simulated users
* custom workspacetypes with initializers and finalizers
* integrating the init-agent
* nested workspaces living on different shards
* having a chaos monkey randomly killing shards

### Scraping of Metrics

We plan on using a plain Prometheus to scrape all of the kcp instances. On a higher level, we plan to
monitor:

* CPU + Mem on all components
* Number of Goroutines over time
* Request response times on the front-proxy (probably percentiles). Alternatively, this could be
  measured client-side inside the testing suite
* Disk IO and size on both etcd and rootshard
* Total number of workspaces (to compare expected with actual)
129 changes: 22 additions & 107 deletions test/load/Readme.md
# Load Testing

Load testing framework and loadtests for the kcp project.

## Architecture

Please refer to the [drawing of the general layout](./architecture.excalidraw).

## Setup

Installation scripts and manuals are provided in [setup/Readme](./setup/Readme.md).

## Usage

All test cases are organized in the `testing` folder. You can run the entire suite using:

```sh
go test ./testing/...
```

The tests will prompt you for any specific required variables and configs.

Alternatively, you can run a subset of tests using standard `go test` syntax, e.g.:

```sh
go test ./testing/... -run ^TestExample
```

## Development

The load-testing framework itself is organized in the `pkg` folder. You can run its unit
tests directly using:

```sh
go test ./pkg/...
```

## Partitioning

You can partition your loadtest by providing it with a unique `start` number. Please be advised that this multiplexes your test: any load you place will be multiplied by the number of partitions. Depending on the test, adjust throughput values like QPS accordingly.
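One way to think about partitioning is that each partition claims a disjoint workspace index range derived from its `start` number, while the aggregate QPS target is divided across partitions. The function and parameter names below are illustrative, not the framework's actual API:

```go
package main

import "fmt"

// partitionRange returns the workspace index range [first, last) that a
// partition identified by its unique start number operates on, plus the
// per-partition QPS needed to keep the aggregate load at targetQPS.
// All names here are hypothetical.
func partitionRange(start, perPartition, partitions int, targetQPS float64) (first, last int, qps float64) {
	first = start * perPartition
	last = first + perPartition
	qps = targetQPS / float64(partitions)
	return first, last, qps
}

func main() {
	// Partition with start=2, out of 4 partitions of 2500 workspaces each,
	// aiming for an aggregate target of 100 QPS.
	first, last, qps := partitionRange(2, 2500, 4, 100)
	fmt.Printf("workspaces [%d,%d), qps per partition %.1f\n", first, last, qps)
}
```

Without dividing QPS this way, running N partitions would multiply the load on the installation by N, which is exactly the pitfall the paragraph above warns about.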
41 changes: 41 additions & 0 deletions test/load/go.mod
@@ -0,0 +1,41 @@
module github.com/kcp-dev/kcp/test/load

go 1.25.0

require (
github.com/montanaflynn/stats v0.7.1
github.com/stretchr/testify v1.11.1
k8s.io/apimachinery v0.35.1
k8s.io/client-go v0.35.1
)

require (
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/fxamacker/cbor/v2 v2.9.0 // indirect
github.com/go-logr/logr v1.4.3 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/kr/pretty v0.3.1 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.3-0.20250322232337-35a7c28c31ee // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/spf13/pflag v1.0.10 // indirect
github.com/x448/float16 v0.8.4 // indirect
go.yaml.in/yaml/v2 v2.4.3 // indirect
golang.org/x/net v0.47.0 // indirect
golang.org/x/oauth2 v0.30.0 // indirect
golang.org/x/sys v0.38.0 // indirect
golang.org/x/term v0.37.0 // indirect
golang.org/x/text v0.31.0 // indirect
golang.org/x/time v0.9.0 // indirect
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
k8s.io/klog/v2 v2.140.0 // indirect
k8s.io/kube-openapi v0.0.0-20250910181357-589584f1c912 // indirect
k8s.io/utils v0.0.0-20260210185600-b8788abfbbc2 // indirect
sigs.k8s.io/json v0.0.0-20250730193827-2d320260d730 // indirect
sigs.k8s.io/randfill v1.0.0 // indirect
sigs.k8s.io/structured-merge-diff/v6 v6.3.0 // indirect
sigs.k8s.io/yaml v1.6.0 // indirect
)