(feat): Extra Valkey config#44
Conversation
This PR implements the ability for users to supply additional configuration for Valkey. Any provided parameters are appended to default configuration required by the operator Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
|
ugh. why didn't lint/fmt/vet catch those? |
jdheyburn
left a comment
There was a problem hiding this comment.
Minor comments, keen to hear opinions, otherwise LGTM
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
|
e2e still fails due to #43 |
no this is different issue than 43. its failing because of events limitation. maybe as u have added new emit event other events are not coming. https://github.com/valkey-io/valkey-operator/actions/runs/20936033780/job/60159472854?pr=44. check this PR description #37. maybe you can update events check accordingly. currently we are not asserting for all events due to events rate limit. |
|
I only added 1 new event which isn't triggered during the e2e. When I run e2e on my local kind cluster, I don't get that failure; I get the same failure as #43. |
no i don't think its related. you can check the issue description failure test case and logs are different for both of these errors. i have neven seen failure of valkey creation CR test case in main branch. if in case we are seeing, we need to create different issue. this PR is failing this test "Manager when a ValkeyCluster CR is applied [It] creates a Valkey Cluster deployment" 43 issue is with other test "Manager when a ValkeyCluster experiences degraded state [It] should detect and recover when a deployment is deleted" https://github.com/valkey-io/valkey-operator/actions/runs/20806446577/job/59761668293 quick fix: we may need to remove assertions of some of the events. valkey-operator/test/e2e/e2e_test.go Line 405 in 44d674b |
|
Rebasing (or add merge commit of main) should fix the e2e test now. |
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
| Tolerations: cluster.Spec.Tolerations, | ||
| Exporter: cluster.Spec.Exporter, | ||
| Containers: cluster.Spec.Containers, | ||
| UsersConfigMapName: getConfigMapName(cluster.Name), |
There was a problem hiding this comment.
I wonder if UsersConfigMapName implies it is related to Users, in a similar way that UsersACLSecretName does. Or does Users mean the human deploying the operator?
There was a problem hiding this comment.
Users, as the human deploying the operator. There are some config params that the operator sets, and others that the users would set.
There was a problem hiding this comment.
My interpretation was that Users here meant for ACLs. If it is ambiguous maybe it should be removed?
| // This hash should be updated whenever the contents of either script changes, which would | ||
| // coincide with operator version bump. | ||
| // $ cat internal/controller/scripts/{liveness-check.sh,readiness-check.sh} | sha256sum | ||
| scriptsHash = "8531132f52ac311772dfcb45c107c34ab05e719a0df644cc332512277b564346" |
There was a problem hiding this comment.
Is there not a way we can have this calculated for us?
There was a problem hiding this comment.
The goal was to not have the operator (re)calculating sha256 on static file contents every 30s. As for an "outside" solution, I could probably do something similar to how version numbers are done, by linking the value to the variable during compile time?
There was a problem hiding this comment.
How do other operators do this? Could we perform a faster check before the sha256?
has_changed = false
if len(current) != len(desired):
has_changed = true
elif sha256(current) != sha256(desired):
has_changed = true
There was a problem hiding this comment.
I can speak for 3 of Percona's operators (mysql, pxc, mongo): The liveness, readiness, and health check scripts are part of the operator docker image. The image is used as init-container, where a separate entrypoint runs to copy the scripts into the running pod. The init container finishes, then the main container, using the same image with default entrypoint, launches as a regular container and the scripts are there to be used.
This has some benefits: 1) no checksum calculations, 2) nothing to 'embed', and 3) upgrades to the script become automatic new release/version of the operator.
Once we decide on if we will be making an operator-specific image (ie: for including modules), perhaps we could use a similar model/process?
There was a problem hiding this comment.
That could work, but until we decide on that - could we instead calculate the sha256sum on operator boot? Given that the scripts would be coupled with the operator version, we would then be able to avoid calculating the sha on every reconcile. How does that sound?
| var scripts embed.FS | ||
|
|
||
| func getConfigMapName(clusterName string) string { | ||
| return clusterName + "-config" |
There was a problem hiding this comment.
Can we prepend valkey- to this? I'm thinking in case there is a conflict and we can minimise that.
There was a problem hiding this comment.
Yup. I'll push that shortly.
|
|
||
| // Create or update a default valkey.conf | ||
| // If additional config is provided, append to the default map | ||
| func (r *ValkeyClusterReconciler) upsertConfigMap(ctx context.Context, cluster *valkeyiov1alpha1.ValkeyCluster) error { |
There was a problem hiding this comment.
I've started refactoring other upserts to use controllerutil.CreateOrUpdate, could the same be done here for consistency? It might help reduce some lines too.
There was a problem hiding this comment.
Probably. I will look into it.
There was a problem hiding this comment.
I wonder if this would help with the script-hash that is being set in the PR today.
|
|
||
| // Build the config | ||
| var configBuilder strings.Builder | ||
| configBuilder.Grow(len(specConfig) * averageParameterLength) |
There was a problem hiding this comment.
I wonder if the use of Grow here is prematurely optimising, would it simplify the logic if we removed it?
There was a problem hiding this comment.
It would simplify the logic by 1 line. I read that this pre-allocation was a simple memory optimization if the size of the eventual string was somewhat known.
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
|
Just an update, that I'm working on the minor changes requested above, but I'm getting golang crashes when building the docker image. |
Signed-off-by: utdrmac <matthew.boehm@percona.com>
bjosv
left a comment
There was a problem hiding this comment.
LGTM, just added a nit for the record.
Just a rebase needed, then open comments can be addressed in separate PR.
| // buildValkeyNodeConfigMap builds a ConfigMap containing the embedded liveness | ||
| // and readiness probe scripts, plus an empty valkey.conf. | ||
| // The ConfigMap is named after valkeyNodeResourceName(node). | ||
| // The ConfigMap is named via config.go:getConfigMapName(node). |
There was a problem hiding this comment.
nit: remove this line or change to GetServerConfigMapName, getConfigMapName was maybe used in a previous iteration.
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
|
@sandeepkunusoth Please check my merge with your latest TLS, as the config is moved into a separate file. |
Signed-off-by: utdrmac <matthew.boehm@percona.com>
Signed-off-by: utdrmac <matthew.boehm@percona.com>
…eyConfig Signed-off-by: Joseph Heyburn <jdheyburn@gmail.com>
|
@utdrmac I rebased this branch from main, looks like its causing some e2e tests to fail - would you mind taking a look? |
|
🎉 Thanks everyone for helping, commenting, and testing! |

This feature request implements the ability for users to supply their own configuration for Valkey as an inline stanza when deploying a new cluster. The user configuration is appended to the required base configuration, which is created as a configMap. Since the operator manages the configMap, any changes to the config are reconciled. Additionally, users cannot override the required cluster settings.