@stokkie90 stokkie90 commented Dec 4, 2025

  • Added support for automatic labeling of Gateway API routes during canary deployments to prevent GitOps drift.
  • Updated documentation to reflect new features, including the ability to customize or disable the in-progress label.
  • Improved tests to verify the addition and removal of the in-progress label for HTTP, gRPC, TCP, and TLS routes.

This change enhances the integration with GitOps tools like Argo CD, ensuring smoother deployments and better resource management.

  labels:
    airalo.com/canary: 'true'
    app.kubernetes.io/component: app
    app.kubernetes.io/instance: plx-argo-rollouts-test-pr-1
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: plx-argo-rollouts-test-pr-1
    app.kubernetes.io/version: 1.0.0
    argocd.argoproj.io/instance: plx-argo-rollouts-test-pr-1
    helm.sh/chart: generic-0.5.14-alpha0
    rollouts.argoproj.io/gatewayapi-canary: in-progress  # Will be removed once the weight is set to 100/max
  name: plx-argo-rollouts-test-pr-1
  namespace: plx-argo-rollouts-test
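
The labeling behavior described above could be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `syncCanaryLabel` is a hypothetical helper, and the rule "label present while the weight is above 0 and below the maximum" is an assumption based on the annotation in the example metadata. Only the label key and value are taken from this PR.

```go
package main

import "fmt"

// Label key and value as shown in the PR description.
const (
	canaryLabelKey   = "rollouts.argoproj.io/gatewayapi-canary"
	canaryLabelValue = "in-progress"
)

// syncCanaryLabel is a hypothetical helper (not the plugin's real function).
// It returns a copy of labels in which the in-progress label is present while
// the canary weight is strictly between 0 and maxWeight, and absent otherwise.
// The exact weight rule is an assumption made for this sketch.
func syncCanaryLabel(labels map[string]string, weight, maxWeight int32) map[string]string {
	out := make(map[string]string, len(labels)+1)
	for k, v := range labels {
		out[k] = v
	}
	if weight > 0 && weight < maxWeight {
		out[canaryLabelKey] = canaryLabelValue
	} else {
		delete(out, canaryLabelKey)
	}
	return out
}

func main() {
	labels := map[string]string{"app.kubernetes.io/managed-by": "Helm"}

	// Mid-rollout: the label is added alongside the existing labels.
	mid := syncCanaryLabel(labels, 20, 100)
	fmt.Println(mid[canaryLabelKey])

	// Weight at max: the label is removed again.
	done := syncCanaryLabel(mid, 100, 100)
	_, present := done[canaryLabelKey]
	fmt.Println(present)
}
```

A GitOps tool such as Argo CD could then be configured to treat routes carrying this label differently while a canary is in progress; how that is done is outside the scope of this sketch.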

Signed-off-by: rick.stokkingreef <rick.stokkingreef@airalo.com>
@stokkie90 stokkie90 marked this pull request as ready for review December 4, 2025 10:44

@kostis-codefresh kostis-codefresh left a comment

Many thanks for this.

Can you please confirm that the flaky test suite passes locally on your workstation?


sleep 10

kubectl get gatewayclasses traefik
Collaborator

Could you explain why this is needed here? Why do we run the same command twice?

Did something change in the latest Traefik version that makes it slower to start up?

Contributor Author

I noticed that the Condition/Status was not Accepted right away when this script was triggered (at least locally). Waiting 10 seconds should make it a bit more stable. It might be related to the chart and version updates.
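
As an illustration of the trade-off discussed here, one alternative to a fixed sleep is to poll a condition a bounded number of times. The sketch below is generic Go and is not taken from this PR's script: `pollUntil` and the stand-in `accepted` check are hypothetical names, and a real check would have to query the cluster for the GatewayClass status.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pollUntil calls check up to maxAttempts times, sleeping for interval between
// unsuccessful attempts. It returns nil as soon as check reports true, and
// stops early if check itself returns an error.
func pollUntil(maxAttempts int, interval time.Duration, check func() (bool, error)) error {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		ok, err := check()
		if err != nil {
			return fmt.Errorf("attempt %d: %w", attempt, err)
		}
		if ok {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(interval)
		}
	}
	return errors.New("condition not met after maximum attempts")
}

func main() {
	calls := 0
	// Stand-in for "is the GatewayClass Accepted yet?" (hypothetical check).
	accepted := func() (bool, error) {
		calls++
		return calls >= 3, nil
	}

	err := pollUntil(5, 10*time.Millisecond, accepted)
	fmt.Println(err, calls)
}
```

Polling returns as soon as the condition holds and fails loudly when it never does, whereas a fixed sleep always costs the full delay and can still be too short.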

Comment on lines +399 to +408
	if managedRouteIndex < 0 || managedRouteIndex >= len(routeRuleList) {
		// stale or corrupted managed route index; clean references for this route and continue gracefully
		for name, managedMap := range managedRouteMap {
			delete(managedMap, httpRouteName)
			if len(managedMap) == 0 {
				delete(managedRouteMap, name)
			}
		}
		return routeRuleList, nil
	}
Contributor Author

I ran into these errors, and the only fix was to remove the ConfigMap.

time="2025-12-09T08:47:12Z" level=error msg="roCtx.reconcile err failed to remove managed routes via plugin: RemoveManagedRoutes rpc call error: unexpected EOF" generation=4 namespace=default resourceVersion=15228 rollout=grpcroute-filters-rollout
2025-12-09T08:47:12.365Z [ERROR] plugin: plugin process exited: plugin=/home/argo-rollouts/plugin-bin/argoproj-labs/gatewayAPI id=5252 error="exit status 2"
time="2025-12-09T08:47:12Z" level=info msg="Reconciliation completed" generation=4 namespace=default resourceVersion=15228 rollout=grpcroute-filters-rollout time_ms=216.659584
time="2025-12-09T08:47:12Z" level=error msg="rollout syncHandler error: failed to remove managed routes via plugin: RemoveManagedRoutes rpc call error: unexpected EOF" namespace=default rollout=grpcroute-filters-rollout
time="2025-12-09T08:47:12Z" level=info msg="rollout syncHandler queue retries: 39 : key \"default/grpcroute-filters-rollout\"" namespace=default rollout=grpcroute-filters-rollout
time="2025-12-09T08:47:12Z" level=error msg="failed to remove managed routes via plugin: RemoveManagedRoutes rpc call error: unexpected EOF" error="<nil>"
time="2025-12-09T08:47:22Z" level=info msg="Started syncing rollout" generation=4 namespace=default resourceVersion=15228 rollout=grpcroute-filters-rollout
time="2025-12-09T08:47:22Z" level=info msg="delaying service switch from  to 55f5b6b: ReplicaSet not fully available" namespace=default rollout=grpcroute-filters-rollout service=argo-rollouts-canary-service
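
The guard in the diff above can be reproduced in isolation with plain types. This is a simplified sketch, not the plugin's code: `dropStaleRoute` and the `map[string]map[string]int` shape are assumptions chosen for illustration. The exit status 2 in the log is what a Go process reports after an unrecovered panic, so an out-of-range index is a plausible cause of the crash, though the log alone does not confirm it.

```go
package main

import "fmt"

// dropStaleRoute mirrors the guard from the diff using simplified types.
// managed maps a managed-route name to (route name -> rule index).
// If idx does not point into rules, every reference to routeName is removed
// from managed, empty inner maps are dropped, and stale is reported as true.
func dropStaleRoute(rules []string, managed map[string]map[string]int, routeName string, idx int) (out []string, stale bool) {
	if idx < 0 || idx >= len(rules) {
		// Deleting entries while ranging over a map is safe in Go.
		for name, byRoute := range managed {
			delete(byRoute, routeName)
			if len(byRoute) == 0 {
				delete(managed, name)
			}
		}
		return rules, true
	}
	return rules, false
}

func main() {
	rules := []string{"rule-0"}
	managed := map[string]map[string]int{
		"header-route": {"my-route": 3}, // index 3 is out of range for a single rule
	}

	_, stale := dropStaleRoute(rules, managed, "my-route", 3)
	fmt.Println(stale, len(managed))
}
```

Cleaning up the stale bookkeeping and returning the rule list unchanged lets reconciliation continue instead of indexing past the end of the slice.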

…maximum of 5 attempts, improving reliability of e2e tests.

@kostis-codefresh kostis-codefresh merged commit 65f9467 into argoproj-labs:main Dec 10, 2025
6 of 7 checks passed
@kostis-codefresh
Collaborator

I ran the flaky tests locally and they pass.

This is now available as 0.9.0, both as binaries (https://github.com/argoproj-labs/rollouts-plugin-trafficrouter-gatewayapi/releases/tag/v0.9.0) and as a Docker image.
