Added BlueGreen ingress that switches between active Svc + resolve path conflict on Blue and Green deployment ingresses#11
Added BlueGreen ingress that switches between active Svc + resolve path conflict on Blue and Green deployment ingresses#11drossos wants to merge 2 commits into
Conversation
b7975c7 to
f7f6cbb
Compare
james-kan-shopify
left a comment
There was a problem hiding this comment.
Should there be some unit tests for reconcileBlueGreenIngress?
| var flinkBlueGreenDeploymentSpec = context.getBgDeployment().getSpec(); | ||
| var objectMeta = context.getBgDeployment().getMetadata(); |
There was a problem hiding this comment.
do we know the type we should be getting here?
There was a problem hiding this comment.
We do, I was mostly following pattern from other IngressUtil reconcile but I can add in the explicit type for FlinkBlueGreenDeploymentSpec
| IngressUtils.reconcileBlueGreenIngress( | ||
| blueGreenContext, | ||
| flinkResourceContext.getOperatorConfig().isManageIngress(), | ||
| activeDeployment, | ||
| flinkResourceContext.getDeployConfig(activeDeployment.getSpec()), | ||
| blueGreenContext.getJosdkContext()); |
There was a problem hiding this comment.
if any part of this encounters exceptions like null pointers, resulting in exceptions, is this going to be a problem? Do we need to catch it and handle it?
There was a problem hiding this comment.
I think there is a reconciliation at kubernetes client level only, i.e any attempts it does to write / update will be retried . In terms of more operator exceptions being thrown, they throw at the top level in the FlinkBlueGreenDeploymentController.reconcile() method.
I will take a look a bit more to confirm here but seems its to be the pattern use by operator.
| // Update Ingress template if exists to prevent path collision between Blue and Green | ||
| if (flinkDeployment.getSpec().getIngress() != null) { | ||
| flinkDeployment | ||
| .getSpec() | ||
| .getIngress() | ||
| .setTemplate( | ||
| blueGreenDeploymentType.toString().toLowerCase() | ||
| + "-" | ||
| + flinkDeployment.getSpec().getIngress().getTemplate()); | ||
| } | ||
|
|
There was a problem hiding this comment.
Nit:
IngressSpec ingress = flinkDeployment.getSpec().getIngress();
if (ingress != null) {
ingress.setTemplate(blueGreenDeploymentType.name().toLowerCase() + "-" + ingress.getTemplate());
}
| Optional<? extends HasMetadata> ingress; | ||
| if (ingressInNetworkingV1(client.getClient())) { | ||
| ingress = | ||
| client.getSecondaryResource( | ||
| io.fabric8.kubernetes.api.model.networking.v1.Ingress.class); | ||
| } else { | ||
| ingress = client.getSecondaryResource(Ingress.class); | ||
| } | ||
| ingress.ifPresent(i -> client.getClient().resource(i).delete()); |
There was a problem hiding this comment.
This is for cleaning up and deleting the stable ingress, can we ever get into a scenario where upon next deployment this was removed but also it's now operatorManagedIngress is true? would that leave the ingress orphaned?
| @JsonProperty("configuration") | ||
| private Map<String, String> configuration; | ||
|
|
||
| @Nullable private IngressSpec ingress; |
There was a problem hiding this comment.
Is the ingress field tracked by FlinkBlueGreenDeploymentSpecDiff? If this gets changed, only when the next deploy occurs via another change that is tracked, will it be picked up. just checking that's the intended behaviour? Or did we want to immediately trigger a blue green deploy there?
There was a problem hiding this comment.
That is a good point, let me investigate this a bit more and test to see what would most sensible flow here
There was a problem hiding this comment.
Explored having a top level "parent-diff" that would update parent ingress without requiring blue-green transition. In the end I didn't love the number of edge cases and complexity it introduced. Especially since the precedent now with the blue-green configs that exist outside the template is that they only take effect upon blue-green transition.
I am going to leave as is for now (requires blue-green transition to modify parent ingress) and see what maintainers think would be best
There was a problem hiding this comment.
Ignore above, it was bugging me too much and seemed like bad design so added in a prelim ingress reconciliation to allow for in-place bluegreen ingress updates
Writing those up now, forgot to add them when flipped PR to green |
| return new FlinkBlueGreenDeploymentSpec(configuration, null, flinkDeploymentTemplateSpec); | ||
| } | ||
|
|
||
| // ==================== Ingress Rotation Tests ==================== |
There was a problem hiding this comment.
I don't think we have any tests handling ingress deletion? should we add that?
|
overall looks good, can approve, just wondering if you wanted to cover the tests around ingress deletion? |
james-kan-shopify
left a comment
There was a problem hiding this comment.
reviewed a few weeks ago, looks good and the PR is opened up in open source for review.
…c + resolve path conflict on Blue and Green deployment ingresses
…tries with exponential backoff
60168a4 to
cc322e8
Compare
Closes https://github.com/Shopify/streaming-compute/issues/667
Overview
Added ingress to FlinkBlueGreen deployment layer that will switch automatically between whatever is the active FlinkDeployment service.
This switching behaviour happens on the finalization step of any transition, meaning that after the flink deployment has transitioning to a new active state, only then will the ingress be switched over the to new service. This allow the ingress to also away be alive as it is pointing only at active running flink pipelines.
This FlinkBlueGreen ingress spec is separate from the FlinkDeployment template spec's ingress definition and can be found within the following part of the manifest (see under paragraph). This does imply that you can have both, neither or a mix of FlinkDeployment and FlinkBlueGreenDeployment ingress' defined.
Note
Testing
These changes are deployed into our sandbox environment and can be tested there. Namely the
yd-playground-sandbox-sql-bluegreensamplenamespace has a FlinkBlueGreenDeployment where you can trigger a bluegreen deployment to see this ingress switching behaviour.Can also be tested in minikube and utilize the following attached BlueGreenDeployment for testing: