Revoke prior DNA task on re-upload (fix stuck-at-50%)#101
Merged
Conversation
When a user uploaded a CSV while a previous DNA task was still running, both tasks contended on the same Book/Author row locks in Postgres. The dashboard polled the new task, which appeared frozen (~50%) until the prior task released its locks. Now upload_view revokes the prior task (best-effort, SIGTERM) before dispatching the new one. The existing nonce mechanism already protects enrichment subtasks, and calculate_full_dna already deletes UserBook rows not present in the new CSV — so there are no stale rows from the revoked task to worry about. Also switches the pending_dna_task_id write to .update() so a cached UserProfile instance can't clobber DNA written inline by an eager task in tests.
4 tasks
vanajmoorthy
added a commit
that referenced
this pull request
May 14, 2026
…-50% (#105) PR #101 revoked the parent generate_reading_dna_task on re-upload but left two paths uncovered: 1. check_author_mainstream_status_task had no nonce protection. A StoryGraph upload spawns one of these per new author, and they kept running after the parent task was revoked — flooding the worker queue with stale work and starving the new DNA task. Now accepts user_id + upload_nonce and exits early on mismatch. 2. enrich_book_task only checked the nonce at task start. A task that passed the start check could still sit briefly on Book.objects.get before its API call, during which a new upload could supersede it. Added a second re-check after the DB fetch. Both dispatch sites in dna_analyser.py now pass user_id + upload_nonce. Tests cover the new author-task nonce path (skip / run / backwards- compat) and the new mid-task re-check on enrich_book_task.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the "stuck at 50%" bug from PR #98's handoff doc (§2): uploading a CSV while a previous DNA task was still running caused the new task's progress bar to stall at ~50% until the old task finished.
Root cause: Both tasks ran concurrently and contended on Postgres row locks during `get_or_create` on shared Book/Author rows. The new task blocked on the old task's open transaction, so the dashboard's polling saw frozen progress.
Fix: Before dispatching the new DNA task, `upload_view` revokes the prior pending task (`AsyncResult.revoke(terminate=True, signal="SIGTERM")`) — best-effort, with a try/except so a failed revoke can't block the new upload. This matches user intent ("I clearly meant to replace that upload"), which the handoff doc preferred over a reject-while-pending guard.
Why no orphan-row cleanup is needed:
Side fix: Switched `pending_dna_task_id` write to `UserProfile.objects.filter(...).update(...)`. The previous code accessed `request.user.userprofile` after `.delay()`, which in eager-mode tests fetched fresh after the task wrote DNA. My new revoke logic accesses `userprofile` before dispatch, caching a stale instance whose later `.save()` clobbered the DNA. `.update()` sidesteps the cached instance entirely.
Test plan