WIP#21

Draft
james-kan-shopify wants to merge 3 commits into main from bg-bug-failing-state-ignore-state

Conversation

@james-kan-shopify

What is the purpose of the change

(For example: This pull request adds a new feature to periodically create and maintain savepoints through the FlinkDeployment custom resource.)

Brief change log

(for example:)

  • Periodic savepoint trigger is introduced to the custom resource
  • The operator checks on reconciliation whether the required time has passed
  • The JobManager's dispose savepoint API is used to clean up obsolete savepoints

Verifying this change

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (100MB)
  • Extended integration test for recovery after master (JobManager) failure
  • Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., are there any changes to the CustomResourceDescriptors: (yes / no)
  • Core observer or reconciler logic that is regularly executed: (yes / no)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@drossos

drossos commented Mar 13, 2026

Pushed new tests confirming that this approach is on the right track: it at least passes these tests, which failed on main.

Still looking at different approaches to find the best way to handle this.
