Refactor SparkSubmitOperator resumable backends into separate methods/classes #68505

@jason810496

Description

Summary

The SparkSubmitOperator ResumableJobMixin implementation now supports three
deployment backends (Spark standalone driver-status tracking, YARN cluster mode,
and Kubernetes driver-pod tracking). Each mixin method branches on the backend
inline, so per-backend logic is scattered across many methods instead of living
in one place per backend. This issue tracks decoupling them.

Background

Resumability for SparkSubmitOperator landed incrementally:

During review of #68067, the refactor was raised as a non-blocking idea and the
author agreed to follow up after the 3.3.0 release
(#68067 (comment)).

Problem

In providers/apache/spark/src/airflow/providers/apache/spark/operators/spark_submit.py,
the ResumableJobMixin methods each carry their own backend branching:

  • submit_job
  • get_job_status
  • is_job_active
  • is_job_succeeded
  • poll_until_complete
  • on_kill

Every method repeats the same three-way branch: if self._hook._is_yarn_cluster_mode: ...,
then if self._hook._is_kubernetes: ..., else the standalone path.
A single backend's behaviour is therefore spread across six methods, which makes
the flow hard to follow, easy to break when adding a backend, and awkward to test
in isolation.
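
To make the pattern concrete, here is a simplified illustration of the inline branching. It is not copied from spark_submit.py: FakeHook, InlineBranchingOperator, the method signatures and the returned strings are all made up for the sketch. Only the two hook flags and the method names come from this issue.

```python
from dataclasses import dataclass


@dataclass
class FakeHook:
    """Stand-in for the hook; only the two flags named in this issue are modelled."""

    _is_yarn_cluster_mode: bool = False
    _is_kubernetes: bool = False


class InlineBranchingOperator:
    """Illustrative only: every method re-derives the backend from the hook flags."""

    def __init__(self, hook: FakeHook) -> None:
        self._hook = hook

    def get_job_status(self) -> str:
        if self._hook._is_yarn_cluster_mode:
            return "status via YARN"
        if self._hook._is_kubernetes:
            return "status via driver pod"
        return "status via standalone driver"

    def on_kill(self) -> str:
        # The same three-way branch again, now for a different concern.
        if self._hook._is_yarn_cluster_mode:
            return "kill YARN application"
        if self._hook._is_kubernetes:
            return "delete driver pod"
        return "kill standalone driver"
```

Reading everything the Kubernetes backend does means visiting the middle branch of every method, and adding a fourth backend means editing every method.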

Proposed change

Separate each backend's logic so it is cohesive - for example a per-backend
strategy/handler class (standalone / YARN / K8s) implementing a common interface
(submit_job, get_job_status, is_job_active, is_job_succeeded,
poll_until_complete, on_kill), with the operator selecting the handler based
on deploy mode and tracking flags. A lighter alternative is grouping each
backend's branch into dedicated private methods. Decide between the two during
design.
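
A rough sketch of the strategy/handler option, to anchor the design discussion. Everything here is an assumption except the six method names and the two hook flags: BackendHandler, the three handler classes, select_handler, FakeHook, the signatures and the status strings are hypothetical, and the in-memory bodies are placeholders for real hook calls.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class BackendHandler(ABC):
    """Hypothetical common interface; method names follow the list in this issue."""

    @abstractmethod
    def submit_job(self) -> str: ...

    @abstractmethod
    def get_job_status(self, job_id: str) -> str: ...

    @abstractmethod
    def is_job_active(self, job_id: str) -> bool: ...

    @abstractmethod
    def is_job_succeeded(self, job_id: str) -> bool: ...

    @abstractmethod
    def poll_until_complete(self, job_id: str) -> str: ...

    @abstractmethod
    def on_kill(self, job_id: str) -> None: ...


class _InMemoryHandler(BackendHandler):
    """Placeholder bodies so the sketch runs; real handlers would call the hook."""

    backend = "unset"

    def __init__(self) -> None:
        self._status = {}  # job_id -> made-up status string

    def submit_job(self) -> str:
        job_id = f"{self.backend}-job-{len(self._status) + 1}"
        self._status[job_id] = "RUNNING"
        return job_id

    def get_job_status(self, job_id: str) -> str:
        return self._status[job_id]

    def is_job_active(self, job_id: str) -> bool:
        return self._status[job_id] == "RUNNING"

    def is_job_succeeded(self, job_id: str) -> bool:
        return self._status[job_id] == "SUCCEEDED"

    def poll_until_complete(self, job_id: str) -> str:
        self._status[job_id] = "SUCCEEDED"  # pretend the job finished
        return self._status[job_id]

    def on_kill(self, job_id: str) -> None:
        self._status[job_id] = "KILLED"


class StandaloneHandler(_InMemoryHandler):
    backend = "standalone"


class YarnHandler(_InMemoryHandler):
    backend = "yarn"


class KubernetesHandler(_InMemoryHandler):
    backend = "k8s"


@dataclass
class FakeHook:
    """Stand-in for the hook; only the two flags named in this issue are modelled."""

    _is_yarn_cluster_mode: bool = False
    _is_kubernetes: bool = False


def select_handler(hook) -> BackendHandler:
    """The only place that inspects the hook flags."""
    if hook._is_yarn_cluster_mode:
        return YarnHandler()
    if hook._is_kubernetes:
        return KubernetesHandler()
    return StandaloneHandler()
```

With this shape the operator would call select_handler once and delegate each mixin method to the returned handler, so the flag checks live in one function instead of six.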

Acceptance criteria

  • Per-backend logic is cohesive (one class or one method group per backend), not
    interleaved across the mixin methods.
  • Backend selection happens once instead of being re-derived in every method.
  • Existing behaviour is unchanged; current tests pass and per-backend logic is
    unit-testable in isolation.
  • No public API change to SparkSubmitOperator.
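
As an illustration of the "unit-testable in isolation" criterion, a per-backend test could look roughly like the following. DriverPodHandler and the hook method delete_driver_pod are invented for the example and are not existing provider APIs. The point is only that a handler can be exercised against a mocked hook without constructing the operator.

```python
import unittest
from unittest.mock import Mock


class DriverPodHandler:
    """Hypothetical Kubernetes handler; `delete_driver_pod` is an invented hook method."""

    def __init__(self, hook) -> None:
        self._hook = hook

    def on_kill(self, job_id: str) -> None:
        self._hook.delete_driver_pod(job_id)


class TestDriverPodHandler(unittest.TestCase):
    def test_on_kill_deletes_only_the_driver_pod(self):
        # spec restricts the mock to one attribute, so any call to another
        # hook method from the handler would raise AttributeError.
        hook = Mock(spec=["delete_driver_pod"])

        DriverPodHandler(hook).on_kill("driver-pod-1")

        hook.delete_driver_pod.assert_called_once_with("driver-pod-1")
```

The equivalent test against the current mixin has to build the operator and set the hook flags so that the right branch is taken.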

Notes

  • Non-breaking, internal refactor -- target after the 3.3.0 release.
