Use C* authentication with Reaper #222
Conversation
johntrimble left a comment
It's looking good John! I left some comments.
// Requeue with a delay to give a chance for C* schema changes to propagate
//
// TODO Should the delay be adjusted based on the C* cluster size?
return result.RequeueSoon(5)
Is there any way to ask Cassandra if the changes are done propagating? Like a query we can run or something?
There is. C* drivers provide an API to check for schema agreement. With the Java driver, for example, it is just a single method call, checkSchemaAgreement(), which returns a boolean. Internally, the driver executes the following queries:
SELECT peer, rpc_address, schema_version FROM system.peers
and
SELECT schema_version FROM system.local WHERE key='local'
It checks that schema_version is the same across all rows.
It would be easy to just add the checkSchemaAgreement call to the schema init job; however, I plan to get rid of the schema init job now that the management API supports creating keyspaces. Given that, I will implement a checkSchemaAgreement function in the operator.
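For illustration, a rough sketch of what such a checkSchemaAgreement function could look like, assuming direct CQL access through the gocql driver (github.com/gocql/gocql); the actual operator implementation may instead route this through the management API, and only the function name comes from the comment above:

func checkSchemaAgreement(session *gocql.Session) (bool, error) {
	// Collect schema_version from every peer.
	versions := make(map[string]struct{})
	var version string
	iter := session.Query("SELECT schema_version FROM system.peers").Iter()
	for iter.Scan(&version) {
		versions[version] = struct{}{}
	}
	if err := iter.Close(); err != nil {
		return false, err
	}
	// Include the local node's schema_version.
	if err := session.Query("SELECT schema_version FROM system.local WHERE key='local'").Scan(&version); err != nil {
		return false, err
	}
	versions[version] = struct{}{}
	// The schema has propagated once every node reports the same version.
	return len(versions) == 1, nil
}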
s := string(b)

return strings.Contains(s, "PasswordAuthenticator"), nil
Hmmm... so this config you have here isn't the full story. It will get handed to the config builder, which actually generates the cassandra.yaml. In the past, the config builder's defaults have sometimes varied from the actual C* defaults (as we do with the SeedProvider impl). Right now we use the "AllowAllAuthenticator" default in configbuilder, but that's the sort of thing I could see changing. Is there any reason we cannot just assume authentication is enabled? I think we always create a superuser with credentials irrespective of the authentication setting.
Also, if you look at json.RawMessage you'll see it's just a []byte:
type RawMessage []byte
So, if you are going to cheat and not actually parse the JSON, you can just do:
return strings.Contains(string(dc.Spec.Config), "PasswordAuthenticator")
Then you don't have to return an error from the function either. If the JSON is ill-formed, it will blow up elsewhere.
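To make that concrete, a minimal sketch of the simplified helper; the wrapper function name is made up here, while dc.Spec.Config and the string check come from the suggestion above:

func usesPasswordAuthenticator(dc *api.CassandraDatacenter) bool {
	// dc.Spec.Config is a json.RawMessage, i.e. just a []byte,
	// so it can be converted to a string without parsing the JSON.
	return strings.Contains(string(dc.Spec.Config), "PasswordAuthenticator")
}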
I agree with assuming authentication is enabled - but will all of this work if they toggle it on after setup?
//
//}
//
//return nil, nil
From debugging?
- name: r3
  config:
    cassandra-yaml:
      authenticator: org.apache.cassandra.auth.PasswordAuthenticator
If you tell reaper to use auth and give it the credentials for an actual user, but the PasswordAuthenticator is not actually enabled, will it still work?
| "os" | ||
| "strings" |
nit - stdlib imports should be the first group
Before Reaper can successfully deploy, Cassandra roles have to be created and the reaper keyspace needs to be initialized. When the statefulset is created, the reaper container is no longer deployed initially. When the schema init job completes, the new ReaperStatus.SchemaInitialized field is set to true. On the subsequent reconciliation loop, the statefulset(s) are updated to deploy the reaper container.
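Roughly, that gating might look like the sketch below; only the ReaperStatus.SchemaInitialized field is named in this PR, so the surrounding field path and helper are placeholders:

// Sketch only: addReaperContainer and the exact status path are placeholders.
if rc.Datacenter.Status.ReaperStatus.SchemaInitialized {
	// Schema init job has finished; deploy the reaper container on this pass.
	addReaperContainer(&statefulSet.Spec.Template.Spec)
}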
This commit adds a function for altering a keyspace, which I needed in order to configure the RF of system_auth. When authentication is enabled, system_auth should never have RF = 1. Because the amount of data is very small, increasing the RF even when authentication is disabled is not a big deal. In my testing of the reaper sidecar, I found that the RF of system_auth has to be changed when authentication is enabled; otherwise, there will be lots of flapping of containers, as only one C* node will have the credentials and it might not be the node that is up.
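As a usage sketch of the new alter-keyspace helper (the function and ReplicationFactor field appear later in this PR's diff; the call site, pod selection, and any other ReplicationSetting fields are assumptions):

rf := getSystemAuthRF(dc) // min(3, dc.Spec.Size) per this PR
settings := []httphelper.ReplicationSetting{{ReplicationFactor: rf}}
// Placeholder call site: error handling and pod selection will differ in the operator.
if err := nodeMgmtClient.CallAlterKeyspaceEndpoint(pod, "system_auth", settings); err != nil {
	return err
}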
Reaper coordinates schema changes with distributed locks implemented in a C* table. This is done in order to avoid concurrent schema updates, which are problematic in C*. The locks table uses IP addresses. When (not if) the reaper sidecar container's liveness and/or readiness probe fails and the container restarts, it will come back up with a different IP address. Reaper will then have to wait to reacquire the lock, which can lead to additional liveness and/or readiness probe failures. By applying schema updates with a single instance via a job, these problems go away. Lastly, based on this as well as on previous experience, I firmly believe that C* schema changes in a k8s deployment should always be done in a k8s job to avoid these types of problems.
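As a rough illustration of running schema changes from a single instance via a Job, such a Job could be constructed along these lines using client-go's batchv1/corev1/metav1 types; the name, image, and command are placeholders, not this PR's actual values:

job := &batchv1.Job{
	ObjectMeta: metav1.ObjectMeta{
		Name:      dc.Name + "-reaper-schema-init", // placeholder name
		Namespace: dc.Namespace,
	},
	Spec: batchv1.JobSpec{
		Template: corev1.PodTemplateSpec{
			Spec: corev1.PodSpec{
				RestartPolicy: corev1.RestartPolicyOnFailure,
				Containers: []corev1.Container{{
					Name:    "reaper-schema-init",
					Image:   "example/reaper-schema-migrator:latest", // placeholder image
					Command: []string{"schema-migrate"},              // placeholder command
				}},
			},
		},
	},
}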
So.... This has turned into quite a bit of work, which includes changes in Reaper. Please see thelastpickle/cassandra-reaper#956 for some additional context. There are two things I want to point out with the latest commits I pushed. First, I have temporarily added a second job for applying Reaper's schema. The original job runs a python script that creates the keyspace. The second job runs a modified version of Reaper that only applies schema updates. I plan to merge all of this into a single image which can then be run in one job. Secondly, I have added code for configuring the replication_factor of the system_auth keyspace. If authentication is enabled and system_auth is using RF = 1, there is a lot of flapping with the Reaper containers.
| "github.com/datastax/cass-operator/operator/pkg/serverconfig" | ||
| "github.com/datastax/cass-operator/operator/pkg/utils" | ||
| "os" |
nit: this import switch should be reverted
@jimdickinson by any chance are there any project style settings that I can import into GoLand? Or are there style guidelines available so that I can configure GoLand to be consistent?
operator/pkg/httphelper/client.go (outdated)
	ReplicationFactor int
}

func (client *NodeMgmtClient) CallAlterKeyspaceEndpoint(pod *corev1.Pod, keyspace string, replicationSettings[]ReplicationSetting) error {
Suggested change:
- func (client *NodeMgmtClient) CallAlterKeyspaceEndpoint(pod *corev1.Pod, keyspace string, replicationSettings[]ReplicationSetting) error {
+ func (client *NodeMgmtClient) CallAlterKeyspaceEndpoint(pod *corev1.Pod, keyspace string, replicationSettings []ReplicationSetting) error {
}

func getSystemAuthRF(dc *api.CassandraDatacenter) int {
	return int(math.Min(float64(3), float64(dc.Spec.Size)))
I wonder if we should use the number of racks, instead of the number of nodes, in this math.Min() call. In the isClusterHealthy() function, we seem to use the rack count as just such a proxy for the RF:
func (rc *ReconciliationContext) isClusterHealthy() bool {
	pods := FilterPodListByCassNodeState(rc.clusterPods, stateStarted)

	numRacks := len(rc.Datacenter.GetRacks())

	for _, pod := range pods {
		err := rc.NodeMgmtClient.CallProbeClusterEndpoint(pod, "LOCAL_QUORUM", numRacks)
		if err != nil {
			return false
		}
	}

	return true
}

So maybe return int(math.Min(float64(3), float64(len(dc.GetRacks())))) instead?
@johntrimble what about the scenario though in which we have just a single rack? We would end up with RF = 1 which is what we want to avoid.
What I'm observing in the code is that we use the number of racks to indicate the intended RF, as in the example above. Is using the number of racks not a suitable proxy for the intended RF? My expectation would be that the only case in which we would not have 3 racks would be the case in which a user is creating a cluster of only 1 node. Is there a reason a user would want to create a cluster of 3+ nodes with only a single rack?
It's possible the code example I gave above is what needs to change, but I'd like for us to do this in a consistent manner.
Is using the number of racks not a suitable proxy for the intended RF?
Having RF = num_racks is definitely a good practice.
My expectation would be that the only case in which we would not have 3 racks would be the case in which a user is creating a cluster of only 1 node. Is there a reason a user would want to create a cluster of 3+ nodes with only a single rack?
My understanding is that the operator allows you to specify both the number of nodes and the number of racks. I know that the operator will try to create balanced racks, but the key thing for me here is that there is nothing preventing me from creating a C* cluster with just one rack. If I am only running in a single zone, then a single rack is fine with respect to C* topology.
If we determine the RF of system_auth by the number of racks, then we can wind up with RF = 1 for system_auth even if there are multiple nodes. That would be bad, and not just for Reaper.
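A quick numeric sketch of that scenario, using a hypothetical 3-node, single-rack datacenter (values chosen for illustration only):

size, racks := 3, 1
rfFromSize := int(math.Min(3, float64(size)))   // 3: what getSystemAuthRF does in this PR
rfFromRacks := int(math.Min(3, float64(racks))) // 1: would leave system_auth under-replicated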
#307 supersedes this one.
This is for #211.