|
 # Load Testing

-## 10000 workspaces reference architecture
+Load testing framework and loadtests for the kcp project.

-Assumptions & Requirements:
+## Architecture

-* We want to test how a kcp installation with 10000 workspaces behaves on synthetic workloads
-* We do not want to test how easily kcp handles adding 10000 workspaces at once
-* We don't want to run etcd on machines which are hosting kcp
-* We will be using kcp-operator to setup a sharded kcp instance
-* The minimum amount of replicas for any component is 3
-* The loadtests are infrastructure provider agnostic. This will allow us and community members to
-experiment with different infrastructure sizes
-* We treat the rootshard like we would any other shard. It will be filled with regular workspaces
-So shard1 = rootshard
-* As the kcp-operator currently has no support for a dedicated cache server: We have decided to still
-work with the default model of having an embedded cache in the rootshard (even if it overloads the
-rootshard). Specifically this means when we have 3 shards, we create 1 Rootshard and 2 Shards
-* Results are stored in some permanent storage so we can use them for comparison later
+Please refer to the [drawing of the general layout](./architecture.excalidraw).

-### Architecture
+## Setup

-[drawing of the general layout](./architecture.excalidraw)
+Installation scripts and manuals are provided in [setup/Readme](./setup/Readme.md).

-### Node calculation
+## Usage

-All node calculations are based on the number of workspaces and use the following recommended constants:
+All test cases are organized in the `testing` folder. You can run the entire suite using:

-* max_workspaces_per_shard = 3500
-* min_replicas = 3
-* kcp_server_buffer = 512MB
-* #kcp_cache_nodes = 1
-* #aux_nodes = 1
-* #frontproxy_nodes = 3
-* mem_per_workspace = 5MB
-
----
-
-1. We calculate the number of shards
-
-   ```txt
-   #shards = round_up(#workspaces / max_workspaces_per_shard)
-   ```
-
-1. Now we can calculate the number of etcd nodes.
-
-   ```txt
-   #etcd_nodes = #shards * min_replicas
-   ```
-
-1. We can calculcate the number of shards and their size in relation to the number of workspaces
-
-   ```txt
-   #shard_nodes = #shards * min_replicas
-   #actual_workspaces_per_shard = workspaces / shards
-   kcp_server_node_mem = kcp_server_buffer + (#actual_workspaces * mem_per_workspace)
-   ```
-
-The total number of all required nodes is calculated as follows
-
-   ```txt
-   #total_nodes = #frontproxy_nodes + #kcp_cache_nodes + #etcd_nodes + #kcp_server_nodes + #aux_nodes
-   ```
-
-#### Example for 10000 workspaces
-
-```txt
-#shards = 10000 / 3500 = 2,85 = 3
-#etcd_nodes = 3 * 3 = 9
-#shard_nodes = 3 * 3 = 9
-#actual_workspaces_per_shard = 10000 / 3 = 3333
-kcp_server_node_mem = 512 + (3333 * 5) = 17777MB
-#total_nodes = 3 + 1 + 9 + 3 + 1 = 17
+```sh
+go test ./testing/...
 ```

-### Testing Protocol
+The tests will prompt you for any specific required variables and configs.

-Mantra: We want to test how a kcp installation with 10000 workspaces behaves on simulated, regular
-activities. We don't want to test how easily we can add 10000 workspaces at once
+Alternatively, you can run a subset of tests using standard `go test` syntax, e.g.:

-#### Procedure
-
-1. Create 10000 workspaces, APIExports, etc. and patiently wait for all of them to become ready
-2. Simulate real world activity by simulating end-users using custom kubeconfigs to create APIBindings
-and then CRUD on their custom api-objects
-
-##### Level 1 - 10000 empty workspaces
-
-We are just going to put 10000 empty workspaces into a kcp installation. We will have a nesting level
-setting so we can try out if nesting has any impact (it should not). This test case
-is extremely deterministic and has should spread workspaces relatively equally across shards.
-We mainly use this as a base consumption measurement and to verify nesting has no performance impact.
-
-##### Level 2 - Basic CRUD
-
-Every workspace has a type and we are going to do a basic parallel CRUD workflow which we will simulate
-using simple Kubernetes Jobs. The workflow is done on basic objects from a single provider using a
-singular APIExport.
-
-##### Level 3 - Multiple Providers
-
-We are multixplexing the level 2 example to use multiple providers.
-
-##### Outlook
+```sh
+go test ./testing/... -run ^TestExample
+```

-We want to keep the initial version of the tests simple and deterministic. As a result the following
-topics have been discussed, but were considered not to be part of the first 3 level implementation:
+## Development

-* direct user interaction via simulated users
-* custom workspacetypes with initializers and finalizers
-* integrating the init-agent
-* nested workspaces living on different shards
-* having a chaos monkey randomly killing shards
+The load-testing framework itself is organized in the `pkg` folder. You can run its unit
+tests directly using:

-### Scraping of Metrics
+```sh
+go test ./pkg/...
+```

-We plan on using a plain Prometheus to scrape all of the kcp-instanes: On a higher level we plan to
-monitor:
+## Partitioning

-* CPU + Mem on all components
-* Number of Goroutines over time
-* Request response times on front-proxy (probably percentiles). This could also alternatively be
-measured inside the testing suite (clientside)
-* Disk IO and size on both etcd and rootshard
-* Total number of workspaces (to compare expected with actual)
+You can partition your loadtest by providing it with a unique `start` number. Note that this
+multiplexes your test: any load you place will be multiplied by the number of partitions.
+Depending on the test, adjust throughput values such as QPS accordingly.
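+
+For example, two partitions might be launched as sketched below, assuming the partition
+offset is supplied via a hypothetical `-start` flag (the actual mechanism may differ, e.g.
+an interactive prompt for the variable):
+
+```sh
+# '-start' is an illustrative flag name; give each partition a unique offset
+go test ./testing/... -start 0
+go test ./testing/... -start 5000
+```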