
Fix EMR and Glue operator on_kill to check terminal state before cancellation#1

Open
Copilot wants to merge 5 commits into main from copilot/fix-emr-job-terminal-state



Copilot AI commented Dec 4, 2025

The on_kill methods in EMR and Glue operators attempted to cancel jobs regardless of their current state, causing unnecessary API calls and potential errors when jobs were already complete.

Changes

Operator modifications:

  • EmrServerlessStartJobOperator: Check job state via get_job_run() before calling cancel_job_run()
  • EmrContainerOperator: Check job state via check_query_status() before calling stop_query()
  • GlueJobOperator: Check job state via get_job_state() before calling batch_stop_job_run()
  • EmrCreateJobFlowOperator: Check cluster state via describe_cluster() before calling terminate_job_flows()
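
The pattern shared by the four operators can be sketched roughly as follows, using the EMR Serverless case. This is a minimal illustration, not the PR's actual diff: `FakeEmrServerlessClient`, the standalone `on_kill` function, and the `TERMINAL_STATES` set are hypothetical stand-ins, and the terminal state names are an assumption about the service. The error fallback mentioned in the description is left out here for brevity.

```python
# Assumed terminal states for an EMR Serverless job run (not taken from the PR).
TERMINAL_STATES = frozenset({"SUCCESS", "FAILED", "CANCELLED"})


class FakeEmrServerlessClient:
    """Hypothetical stand-in for the service client; it only records calls."""

    def __init__(self, state):
        self.state = state
        self.calls = []

    def get_job_run(self, applicationId, jobRunId):
        self.calls.append("get_job_run")
        return {"jobRun": {"state": self.state}}

    def cancel_job_run(self, applicationId, jobRunId):
        self.calls.append("cancel_job_run")
        return {}


def on_kill(client, application_id, job_run_id):
    """Cancel the job run only if it is not already in a terminal state.

    Returns True if a cancel request was issued, False if it was skipped.
    """
    response = client.get_job_run(applicationId=application_id, jobRunId=job_run_id)
    state = response["jobRun"]["state"]
    if state in TERMINAL_STATES:
        # Nothing to cancel: the job has already finished.
        return False
    client.cancel_job_run(applicationId=application_id, jobRunId=job_run_id)
    return True
```

With a client reporting `SUCCESS`, only `get_job_run` is called. With one reporting `RUNNING`, `cancel_job_run` follows the state check.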

Hook improvements:

  • Added JOB_TERMINAL_STATES to GlueJobHook (["FAILED", "TIMEOUT", "SUCCEEDED", "STOPPED"])
  • Added CLUSTER_TERMINAL_STATES to EmrHook (["TERMINATED", "TERMINATED_WITH_ERRORS"])
  • Updated GlueJobHook._handle_state() to use class constants
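
As a rough sketch of what these class-level constants might look like, the snippet below uses simplified stand-in classes. The state lists are copied from the bullets above; the class names `GlueJobHookSketch` and `EmrHookSketch` and the `is_terminal` helper are hypothetical and are not claimed to exist in the real hooks.

```python
class GlueJobHookSketch:
    """Simplified stand-in for GlueJobHook, showing only the new constant."""

    # Values as listed in the PR description.
    JOB_TERMINAL_STATES = ["FAILED", "TIMEOUT", "SUCCEEDED", "STOPPED"]

    @classmethod
    def is_terminal(cls, state):
        # Hypothetical helper: True when the job run can no longer be stopped.
        return state in cls.JOB_TERMINAL_STATES


class EmrHookSketch:
    """Simplified stand-in for EmrHook, showing only the new constant."""

    # Values as listed in the PR description.
    CLUSTER_TERMINAL_STATES = ["TERMINATED", "TERMINATED_WITH_ERRORS"]

    @classmethod
    def is_terminal(cls, state):
        # Hypothetical helper: True when the cluster has already shut down.
        return state in cls.CLUSTER_TERMINAL_STATES
```

Keeping the lists on the hook class lets the operator and the hook's own state handling read from one definition instead of repeating string literals.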

All state checks include try-except fallback to preserve existing behavior if state retrieval fails.
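
One way to read this fallback is sketched below, assuming the state lookup and the cancellation are passed in as plain callables. `cancel_unless_terminal` and its parameters are hypothetical names for illustration, not the PR's code.

```python
import logging

log = logging.getLogger(__name__)


def cancel_unless_terminal(get_state, cancel, terminal_states):
    """Issue a cancel unless the job is known to be in a terminal state.

    If the state lookup itself fails, fall back to the previous behavior
    and attempt the cancellation anyway. Returns True if cancel was called.
    """
    try:
        state = get_state()
    except Exception:
        # State is unknown, so behave as before the change: try to cancel.
        log.warning("Could not retrieve job state; attempting cancellation anyway.")
        state = None

    if state is not None and state in terminal_states:
        log.info("Job already in terminal state %s; skipping cancellation.", state)
        return False

    cancel()
    return True
```

The broad `except Exception` is deliberate in this sketch: any lookup failure leaves the state unknown, and an unknown state is treated the same as a non-terminal one.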

Example

# Before: would call cancel_job_run even if job already succeeded
operator.on_kill()  # → cancel_job_run called → API error

# After: checks state first
operator.on_kill()  # → get_job_run → state is SUCCESS → skips cancel_job_run


Original prompt

The EMRServerlessStartJobRunOperator's on_kill command has a bug that causes the cancel job run command to sometimes run when the job is already in a terminal state. Make it so that the on_kill only runs when the job is not in a terminal state.

Do the same for EMR-EC2, EMR-EKS, and Glue job operators.



Copilot AI and others added 3 commits December 4, 2025 22:16
Co-authored-by: agupta01 <12228572+agupta01@users.noreply.github.com>
Co-authored-by: agupta01 <12228572+agupta01@users.noreply.github.com>
Co-authored-by: agupta01 <12228572+agupta01@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix job cancellation command for terminal states" to "Fix EMR and Glue operator on_kill to check terminal state before cancellation" Dec 4, 2025
Copilot AI requested a review from agupta01 December 4, 2025 22:31
@agupta01 agupta01 marked this pull request as ready for review December 5, 2025 02:07

@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional flags.



2 participants