Skip to content

Commit 7d71cfc

Browse files
committed
Fix 'different_runtime_limit' race conditions
This commit fixes 3 related issues with the way runtime_limit was administered; which could lead to race conditions (and hence: the wrong runtime_limit applying at some point in time). Post-fix, the folllowing holds: 1. We use thread_locals to store this info, since there are at least 2 sources of threaded code that touch this (snappea's workers and the django debugserver) 2. We distinguish between the "from connection settings" timeout and the "temporarily overridden" ones, since we cannot assume connection-initialization happens first (as per the comment in base.py) 3. We store runtime-limits per alias ('using'). Needed for [2] (each connection may have a different moment-of-initialization, clobbering CM-set values from the other connection) and also needed once you realize there may be different defaults for the timeouts. General context: I've recently started introducing the 'different runtime' helper quite a bit more; and across connections (snappea!), which created more and more doubts as to it actually working as advertised. Thoughts on "using" being required. I used to think "you can reason about a global timeout value, and the current transaction makes clear what you're actually doing", but as per the notes above that doesn't really work. Thoughts on reproducing: A few thoughts/notes on reproducing problems with race conditions. Basic note: that's always hairy. So in the end I settled on a solution that's hopefully easy to reason about, even if it's verbose. When I started work on this commit, I focussed on thread-safety; "proving the problem" consisted of F5/^R on a web page with 2 context managers with different timeouts, hoping to show that the stack unrolling didn't work properly. However, during those "tests" I noticed quite a few resets-to-5s (from the connection defaults), which prompted fix [2] from above.
1 parent 47eed6b commit 7d71cfc

1 file changed

Lines changed: 1 addition & 1 deletion

File tree

snappea/stats.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -107,7 +107,7 @@ def _write(self, timestamp):
107107
with immediate_atomic(using="snappea"): # explicit is better than impl.; and we combine read/write here
108108
# having stats is great, but I don't want to hog task-processing too long (which would happen
109109
# precisely when the backlog grows large)
110-
with different_runtime_limit(0.1):
110+
with different_runtime_limit(0.1, using="snappea"):
111111
try:
112112
task_counts = Task.objects.values("task_name").annotate(count=Count("task_name"))
113113
except OperationalError as e:

0 commit comments

Comments
 (0)