Extend upload-nonce to author + late enrich tasks (fix stuck-at-50% on re-upload) #105
Merged
…-50%

PR #101 revoked the parent generate_reading_dna_task on re-upload but left two paths uncovered:

1. check_author_mainstream_status_task had no nonce protection. A StoryGraph upload spawns one of these per new author, and they kept running after the parent task was revoked, flooding the worker queue with stale work and starving the new DNA task. It now accepts user_id + upload_nonce and exits early on mismatch.
2. enrich_book_task only checked the nonce at task start. A task that passed the start check could still sit briefly on Book.objects.get before its API call, during which a new upload could supersede it. Added a second re-check after the DB fetch.

Both dispatch sites in dna_analyser.py now pass user_id + upload_nonce. Tests cover the new author-task nonce path (skip / run / backwards-compat) and the new mid-task re-check on enrich_book_task.
Summary
PR #101 revoked the parent generate_reading_dna_task on re-upload, but the StoryGraph→Goodreads repro still hits the stuck-at-50% loading state. Diagnosed two uncovered paths:

1. check_author_mainstream_status_task had no nonce protection at all. StoryGraph uploads spawn one per new author at the two dispatch sites (core/services/dna_analyser.py:497, :652). Even after the parent DNA task is revoked, these keep running, flood the worker queue, and starve the new DNA task, which shows up as the loading bar freezing at the "Syncing books" stage (≈50%).
2. enrich_book_task only checked the nonce at task start. A task that passed the start check could still sit briefly on Book.objects.get before its ~5s of API calls. If a new upload landed in that window, the in-flight task would still spend the API time and write stale data.

Changes

- check_author_mainstream_status_task now accepts optional user_id + upload_nonce; it exits early on nonce mismatch using the same safe_cache_get("upload_nonce_{user_id}") pattern as enrich_book_task.
- enrich_book_task keeps its start-of-task check and adds a second re-check after Book.objects.get, before invoking enrich_book_from_apis.
- Both dispatch sites in core/services/dna_analyser.py now pass user_id=upload_user_id, upload_nonce=upload_nonce.
- Tests cover the new author-task nonce path and the new mid-task re-check on enrich_book_task.

Out of scope (follow-up)

- … book.save() even when superseded mid-API-call" window. Closing this requires splitting enrich_book_from_apis into fetch + persist phases so the task can re-check between them. Worth doing if we see stale writes in prod, but the API window is short (~5s per book, single-worker) compared to the queue-depth issue this PR addresses.
- … generate_reading_dna_task so it can't be starved by subtask backlog. Deployment-affecting change; left for a separate decision.

Test plan

- core.tests.test_integration.CheckAuthorMainstreamStatusTaskTests: skip / run / no-nonce.
- core.tests.test_integration.BookEnrichmentIntegrationTests.test_enrich_book_task_skipped_at_second_check_if_nonce_changed_during_db_fetch: validates the new mid-task re-check by mutating the nonce inside a Book.objects.get side-effect.
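For readers unfamiliar with the side-effect trick in the second test, here is a rough, self-contained sketch of the idea. It is not the project's test: the cache is a plain dict, the DB fetch and API call are injected as arguments rather than patched on Book.objects.get, and every name (enrich_book, nonce_is_current, run_supersede_during_fetch) is hypothetical.

```python
from unittest import mock

# Assumption: a dict stands in for the cache that holds the per-user upload nonce.
cache = {}


def nonce_is_current(user_id, upload_nonce):
    """True while the cached nonce still matches the one this task was queued with."""
    return cache.get(f"upload_nonce_{user_id}") == upload_nonce


def enrich_book(book_id, user_id, upload_nonce, fetch_book, enrich_from_apis):
    """Illustrative task body with a start check and a post-fetch re-check."""
    if not nonce_is_current(user_id, upload_nonce):
        return "skipped_at_start"
    book = fetch_book(book_id)  # stands in for the DB fetch
    if not nonce_is_current(user_id, upload_nonce):
        return "skipped_after_fetch"  # second check, before any API time is spent
    enrich_from_apis(book)
    return "enriched"


def run_supersede_during_fetch():
    """Simulate a new upload landing while the task is inside the DB fetch."""
    cache["upload_nonce_1"] = "old"

    def fetch_then_supersede(book_id):
        # The side effect mutates the nonce, as if a re-upload happened mid-fetch.
        cache["upload_nonce_1"] = "new"
        return {"id": book_id}

    fetch = mock.Mock(side_effect=fetch_then_supersede)
    api = mock.Mock()
    result = enrich_book(42, 1, "old", fetch, api)
    return result, fetch.called, api.called
```

The useful assertion is that the API mock is never called: the task passed the start check, ran the fetch, and still bailed out before the expensive step.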