feat(dag): Skip dataset-triggered dags without SerializedDagModel #63546
leossantos wants to merge 11 commits into apache:v2-11-stable
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
A couple of meta-questions:
@kaxil Thanks for the review.
Re: main. Yes, the same structural issue exists on main (asset-based code path). We plan a separate PR against main with a manual port; this PR targets 2.11.x only. Happy to follow whichever merge order you prefer.
Re: logging. The [DEBUG DATASETS] / per-DAG re-query instrumentation was for our deployment debugging and was not intended to merge as-is. The updated PR will keep only the SerializedDagModel guard + tests (+ newsfragment); any remaining log line in the fix will be DEBUG only.
Inline comments (joinedload if needed, DDRQ preservation test, minor cleanups) will be addressed in the next push.
Log which DAGs are selected as dataset-triggered (with ADRQ timestamp ranges) and log successful DagRun creation with dag_id, exec_date, prev_exec, event count, and event URIs. This provides visibility into the scheduler's dataset trigger decisions for debugging premature trigger incidents. Made-with: Cursor
…sions
Log the full context of dataset-triggered scheduling to debug premature trigger incidents:
- P0: Log condition, DDRQ URIs, and count when dataset_condition is satisfied (INFO in dags_needing_dagruns)
- P1: Warn on DDRQ/event mismatch when queued URIs have no matching DatasetEvent in the timestamp range (WARNING in _create_dag_runs_dataset_triggered)
- P2: Include data_interval start/end in the DagRun creation log
- P3: Log consumed event timestamps and source DAG/run_id (DEBUG)
Made-with: Cursor
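For readers unfamiliar with the P1 check, the DDRQ/event mismatch amounts to a set difference between queued URIs and URIs that actually have events in the range. The following is only a rough sketch with plain Python collections; `uris_missing_events` and its parameters are hypothetical names, not code from this PR or from Airflow.

```python
import logging
from typing import Iterable, List

log = logging.getLogger(__name__)


def uris_missing_events(queued_uris: Iterable[str], event_uris: Iterable[str]) -> List[str]:
    """Return queued URIs that have no matching event (hypothetical helper).

    queued_uris: URIs taken from queue rows for one DAG (assumed input).
    event_uris: URIs of events found in the timestamp range (assumed input).
    """
    seen = set(event_uris)
    missing = sorted(uri for uri in set(queued_uris) if uri not in seen)
    if missing:
        # A mismatch suggests the run would be created with partial events.
        log.warning("Queued dataset URIs with no matching event in range: %s", missing)
    return missing
```

A caller would pass the URIs gathered for a single DAG and treat a non-empty result as a signal worth investigating.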
Made-with: Cursor
…l is missing
DAGs with DDRQ entries but no corresponding SerializedDagModel were bypassing dataset condition evaluation in dags_needing_dagruns() and entering dataset_triggered_dag_info unchecked. This caused premature triggers with partial events when the DAG processor was mid-parse cycle. Now explicitly detects the mismatch and excludes those DAGs from the current scheduler loop. DDRQ entries are preserved so the DAG is re-evaluated on the next heartbeat (~5s) once serialization completes.
Made-with: Cursor
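The guard described in this commit can be illustrated with plain dictionaries and sets. This is a minimal sketch under stated assumptions, not the PR's diff: `drop_unserialized_dags`, `by_dag` as a mapping of dag_id to queued URIs, and `serialized_dag_ids` are stand-ins for the ORM objects the real method works with.

```python
import logging
from typing import Dict, List, Set, Tuple

log = logging.getLogger(__name__)


def drop_unserialized_dags(
    by_dag: Dict[str, List[str]],
    serialized_dag_ids: Set[str],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split queued DAGs into (candidates to evaluate, skipped dag_ids).

    by_dag: dag_id -> queued dataset URIs (assumed in-memory shape).
    serialized_dag_ids: dag_ids that currently have a serialized DAG (assumed).
    """
    skipped = sorted(dag_id for dag_id in by_dag if dag_id not in serialized_dag_ids)
    for dag_id in skipped:
        # DEBUG only: the condition is transient and resolves once parsing finishes.
        log.debug("Skipping dataset-triggered DAG %s: no serialized DAG yet", dag_id)
    kept = {dag_id: uris for dag_id, uris in by_dag.items() if dag_id in serialized_dag_ids}
    # The input mapping is left untouched, mirroring the idea that queue rows
    # are preserved so a later scheduler loop can re-evaluate the DAG.
    return kept, skipped
```

Only the entries in `kept` would go on to dataset condition evaluation; the skipped ones are simply retried on a later pass.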
…DAGs
This change ensures that the `missing_from_serialized` variable is deleted after its entries have been processed, preventing potential memory leaks and maintaining cleaner state management within the DAG model.
Drop [DEBUG DATASETS] instrumentation from SchedulerJobRunner and DagModel dataset-readiness loop; inline timetable dataset_condition where it is only used once.
Log the DDRQ-without-serialization case at debug and remove the [DEBUG DATASETS] prefix; drop redundant del of missing_from_serialized. Tests capture DEBUG, match the new message, and assert dataset_dag_run_queue rows remain after dags_needing_dagruns.
Force-pushed from 5682993 to 872c56c
I am creating the forward port for Airflow 3.
The forward-port PR for main is #64322.
Split DagModel and DatasetDagRunQueue inserts and flush after DagModel so foreign-key order matches production DB constraints in TestDagModel.
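Why insert order matters here can be shown with the standard-library `sqlite3` module. This is an analogy rather than the test code from this PR: the two tables are simplified stand-ins for the DAG and queue tables, and `make_db`, `insert_parent_then_child`, and `child_first_fails` are hypothetical helpers. With foreign keys enforced, the referencing row is only accepted once the referenced row exists.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory DB with a parent table and a referencing child table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE dataset_dag_run_queue ("
        "dataset_id INTEGER, "
        "target_dag_id TEXT REFERENCES dag(dag_id))"
    )
    return conn


def insert_parent_then_child(conn: sqlite3.Connection, dag_id: str, dataset_id: int) -> None:
    """Insert the referenced row first, then the row that points at it."""
    conn.execute("INSERT INTO dag (dag_id) VALUES (?)", (dag_id,))
    conn.execute(
        "INSERT INTO dataset_dag_run_queue (dataset_id, target_dag_id) VALUES (?, ?)",
        (dataset_id, dag_id),
    )


def child_first_fails(conn: sqlite3.Connection, dag_id: str, dataset_id: int) -> bool:
    """Return True if inserting the child row before its parent is rejected."""
    try:
        conn.execute(
            "INSERT INTO dataset_dag_run_queue (dataset_id, target_dag_id) VALUES (?, ?)",
            (dataset_id, dag_id),
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

In an ORM-based test, flushing the parent object before adding the child plays the same role as the explicit ordering in `insert_parent_then_child`.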
Summary
`DagModel.dags_needing_dagruns` could treat dataset-scheduled DAGs as ready for a new run when they had `DatasetDagRunQueue` rows but no `SerializedDagModel` row in the same evaluation window. The dataset timetable condition was never evaluated for those DAGs, yet they could still flow into `dataset_triggered_dag_info`, allowing premature dataset-triggered DagRuns.

This PR removes such `dag_id`s from the in-memory `by_dag` / `dag_statuses` maps until serialization exists. `DatasetDagRunQueue` ORM rows are not deleted here, so the scheduler can re-evaluate on a later heartbeat. DEBUG logging records skipped DAGs (missing serialization) and, when applicable, satisfied dataset conditions. Docstring updated accordingly.

Tests: two `TestDagModel` cases cover missing `SerializedDagModel` (single and multiple DAGs), DEBUG log expectations, and assert `dataset_dag_run_queue` row counts unchanged after `dags_needing_dagruns`.

Was generative AI tooling used to co-author this PR?
Cursor