Conversation

@smudge (Member) commented Dec 10, 2025

This does two things:

  1. Avoids querying both locked_count and working_count separately (since they are synonymous)
  2. Splits the (total) count metric into two: failed_count and live_count

Worth noting that when a scan covers the entire table, a sequential scan might actually be better than an index scan. However, since the monitor already produces a count of failed jobs, we can memoize that result to avoid doing that work twice. (Furthermore, in postgres at least, these don't just become index scans -- they become INDEX ONLY scans, which should be cheaper than table scans since the index should be significantly more compact!)
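The memoization described above can be sketched in plain Ruby. This is a hypothetical illustration, not the gem's actual implementation -- `MonitorSketch` and its `counter` callable are stand-ins for the real monitor and its database queries:

```ruby
# Hypothetical sketch (not the gem's actual code): memoize per-run counts
# so the failed-jobs count is computed once and reused when deriving
# other metrics, avoiding a second full-table scan.
class MonitorSketch
  attr_reader :emitted

  def initialize(counter)
    @counter = counter # a callable that performs the (expensive) count
  end

  def run!
    @memo = {} # reset memoized counts at the start of each run
    @emitted = {}
    @emitted[:failed_count] = failed_count
    # live_count reuses the memoized failed count instead of re-scanning:
    @emitted[:live_count] = total_count - failed_count
  end

  private

  def failed_count
    @memo[:failed] ||= @counter.call(:failed)
  end

  def total_count
    @memo[:total] ||= @counter.call(:total)
  end
end
```

Even though `failed_count` is used twice per run, the underlying count runs only once; resetting `@memo` at the top of `run!` keeps each run's metrics fresh.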

/no-platform

@smudge smudge requested a review from effron December 10, 2025 19:03
```ruby
end

def run!
  @memo = {}
```
Contributor:
Maybe a bit of a stretch, but this makes me think that a separate class that can calculate the metrics might be helpful, with this class more in charge of the looping/sleeping mechanics. Then we wouldn't need this overwritable internal state; we'd just new up a metric-calculation class during each call to run!. I see that Runnable handles some of this, but I guess organizing it that way means we have this notion of resettable state, which is a bit smelly.
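The suggested split might look something like this sketch. The class names and shapes here are illustrative only, not the gem's actual API:

```ruby
# Hypothetical sketch of the suggested refactor: the monitor keeps only
# the loop mechanics and news up a fresh calculator on every tick, so
# there is no resettable internal state to manage.
class MetricsCalculation
  def initialize(counts)
    @counts = counts # e.g. a hash or query interface for job counts
    @memo = {}       # state lives only as long as this calculation
  end

  def failed_count
    @memo[:failed] ||= @counts.fetch(:failed)
  end

  def live_count
    @counts.fetch(:total) - failed_count
  end
end

class MonitorLoop
  # One tick of the loop: build a fresh calculator, emit its metrics.
  # A fresh instance per tick means nothing carries over between runs.
  def tick(counts)
    calc = MetricsCalculation.new(counts)
    { failed_count: calc.failed_count, live_count: calc.live_count }
  end
end
```

The memo hash still exists, but it is scoped to one calculation's lifetime rather than being mutable state on the long-lived monitor.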

Member Author (smudge):

Yeah, agreed -- I was also starting to think about how we could reimagine Runnable to take on less of the whole lifecycle of Delayed::Monitor and Delayed::Worker kinds of classes.

If I keep pulling on that thread, I imagine that this smelliness will go away, but I didn't want to refactor too much within this one PR.

effron previously approved these changes Dec 10, 2025
@effron (Contributor) left a comment:
Just some non-blocking thoughts; otherwise the changes look good!

LGTM

> Worth noting that when a scan covers the entire table, **a sequential scan might actually be better than an index scan.**
>
> However, since the monitor already produces a count of failed jobs, we can memoize that result to avoid doing that work twice.
>
> Furthermore, in postgres at least, these don't just become index scans -- they become _INDEX ONLY_ scans, which should be cheaper than table scans since the index should be significantly more compact!

These concepts are now synonymous. I went as far as removing the `working` scope because it's not even truthful. (We don't know whether it's actively working; we only know that it was claimed by a worker and that the claim hasn't yet expired.)
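The distinction can be illustrated in plain Ruby. This is a hypothetical stand-in, not the gem's ActiveRecord scope, and the 4-hour expiry window is an assumed default rather than a value taken from the gem:

```ruby
# Hypothetical illustration of why "working" was a misnomer: a claim is
# just a lock that hasn't expired -- we can't observe whether a worker
# is actively executing the job.
Job = Struct.new(:locked_at, :locked_by, keyword_init: true)

MAX_RUN_TIME = 4 * 60 * 60 # assumed lock-expiry window, in seconds

# "Claimed" means: locked by some worker, and the lock is still fresh.
# Nothing here can tell us whether the job is actually being worked.
def claimed?(job, now: Time.now)
  !job.locked_by.nil? &&
    !job.locked_at.nil? &&
    job.locked_at > now - MAX_RUN_TIME
end
```

A job locked five hours ago is no longer "claimed" under this definition even though nothing has touched its row since, which is exactly why the scope can't honestly be called `working`.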
```
-> Sort (cost=...)
     Output: (CASE WHEN ((priority >= 0) AND (priority < 10)) THEN 0 WHEN ((priority >= 10) AND (priority < 20)) THEN 10 WHEN ((priority >= 20) AND (priority < 30)) THEN 20 WHEN (priority >= 30) THEN 30 ELSE NULL::integer END), queue
     Sort Key: (CASE WHEN ((delayed_jobs.priority >= 0) AND (delayed_jobs.priority < 10)) THEN 0 WHEN ((delayed_jobs.priority >= 10) AND (delayed_jobs.priority < 20)) THEN 10 WHEN ((delayed_jobs.priority >= 20) AND (delayed_jobs.priority < 30)) THEN 20 WHEN (delayed_jobs.priority >= 30) THEN 30 ELSE NULL::integer END), delayed_jobs.queue
     -> Seq Scan on public.delayed_jobs (cost=...)
```
Member Author (smudge):

The [legacy index] case gets two Seq Scans instead of one. However, in practice, the `@memo` means that the monitor is still doing less work, as one of these Seq Scans is identical to a seq scan it already had to do to count failed rows.

@effron (Contributor) left a comment:

LGTM

@smudge smudge merged commit f7c8775 into Betterment:main Dec 11, 2025
25 checks passed
@smudge smudge deleted the query-optimization/13 branch December 11, 2025 18:18