refactor: simplify run_recurring_tasks for single-task processing#218
Conversation
The underlying SQL function `get_recurringtasks_to_process` is `LIMIT 1`, so the loop, bulk_create, and bulk_update were always operating on at most one row. Replace the bulk pipeline with straight-line single-task handling and rename the manager method to `get_task_to_process`, returning `RecurringTask | None`, so the Python contract matches the SQL.
Declare `task_run` as `RecurringTaskRun | None` up front and route the `_run_task` return through an intermediate `run` so the `assert isinstance` narrowing carries past the `if/else` join. Also include the task identifier in the start/finish debug logs for easier tracing.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #218 +/- ##
==========================================
+ Coverage 97.26% 97.29% +0.02%
==========================================
Files 104 104
Lines 4459 4466 +7
==========================================
+ Hits 4337 4345 +8
+ Misses 122 121 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
The function only ever processes one recurring task per call (the underlying SQL is `LIMIT 1`), so a list was always 0 or 1 items long. Tighten the signature to `RecurringTaskRun | None` so the contract matches reality and mirrors `RecurringTaskManager.get_task_to_process()`. The sole production caller (`threads.run_iteration`) discards the return value and is unaffected. All recurring test sites are updated to use the singular value: `len(...) == 1` becomes `is not None`, list indexing is removed, and the empty-result case becomes `is None`. A `test_run_recurring_tasks__no_tasks__does_nothing` test is added, mirroring the existing standard-task equivalent, to cover the empty-queue early return.
emyller
left a comment
There was a problem hiding this comment.
This looks like it also limits overall recurring tasks to one per iteration, despite the caller is still named run_recurring_tasks.
In practice, we'd be running pending recurring tasks one after another, with a one second delay:
flagsmith-common/src/task_processor/threads.py
Lines 92 to 96 in 273d3ad
Please update the logic so multiple recurring tasks can run per iteration, or make the intended behaviour explicit — e.g. in function names, comments, etc.
The function processes one task per call (the SQL function it wraps, `get_recurringtasks_to_process`, has been `LIMIT 1` since migration 0012). The plural name reads as if multiple tasks are processed per call, which has caused review confusion. Rename to the singular form so the call site contract is explicit; the per-iteration cadence and round-robin selection across tasks are still provided by `TaskRunner.run` and `ORDER BY last_picked_at NULLS FIRST` respectively.
No, it does not it was already limited to one per iteration
Yes, that's the current behaviour. This pull request is only a refactor — it doesn't change any behaviour.
Have renamed the function: 953e807 let me know if that helps |
A long-running recurring task can leave the main-thread DB connection
idle past the server's session timeout. RDS terminates the session,
and the follow-up task.save() / task_run.save() then raises
OperationalError ("SSL connection has been closed unexpectedly").
Call django.db.close_old_connections() between _run_task() and the
post-run saves so Django opens a fresh connection if the prior one
was killed.
The 5 affected recurring-task tests are switched to transaction=True
because pytest-django's atomic wrapper otherwise collides with
close_old_connections' autocommit check.
Fixes FLAGSMITH-API-5EM.
docs/if required so people know about the feature.Changes
Contributes to #152
The SQL function backing
RecurringTaskManager(get_recurringtasks_to_process) isLIMIT 1, so it always returns 0 or 1 rows. Despite that,run_recurring_taskswas structured around a loop withbulk_createforRecurringTaskRuns andbulk_updatefor theRecurringTasklock columns — bulk machinery operating on at most one row.This PR:
run_recurring_taskswith straight-line single-task handling: pick the one row, dispatch onis_task_registered/should_execute, thentask.save(update_fields=…)andtask_run.save()directly.RecurringTaskManager.get_tasks_to_process()→get_task_to_process()and changes its return type fromRawQuerySet[RecurringTask]toRecurringTask | None, so the Python contract matches the SQL.get_recurringtasks_to_process) untouched — it's referenced from migration history (reverse_sql) and renaming it would force a migration for zero functional gain.if task.should_execute / else task.unlock()control flow.TaskManager.get_tasks_to_process(num_tasks)(for non-recurringTasks) is genuinely multi-row and is unchanged.How did you test this code?