Add monthly and weekly downloads to recent downloads calculation #12420

pksunkara · 2025-11-30T07:49:07Z

We calculate the recent downloads using a materialized view that we refresh every day. But it's only for last 90 days.

Most of the other ecosystems like NPM, PyPi, Packagist, Hex provide daily, weekly and monthly downloads. So, I added a weekly and monthly column in the recent downloads view and surfaced the data in the /api/v1/crates/{name} endpoint.

I have attached links for shields.io badges that show weekly and monthly downloads for each ecosystem. Once this is merged, I will add the same for Crates.io badges.

Turbo87

refreshing this materialized view is one of the most expensive operations that our database has to deal with. could you check what the before/after performance impact of this change is with a production-size database dump?

pksunkara · 2025-12-01T17:47:41Z

Refresh using the current query was taking 13.3s

SELECT crate_id, SUM(version_downloads.downloads) FROM version_downloads
INNER JOIN versions
  ON version_downloads.version_id = versions.id
WHERE version_downloads.date > date(CURRENT_TIMESTAMP - INTERVAL '90 days')
GROUP BY crate_id;

The query in this PR is taking 21.5s

SELECT
  crate_id,
  SUM(version_downloads.downloads),
  SUM(version_downloads.downloads) FILTER (WHERE version_downloads.date > date(CURRENT_TIMESTAMP - INTERVAL '30 days')),
  SUM(version_downloads.downloads) FILTER (WHERE version_downloads.date > date(CURRENT_TIMESTAMP - INTERVAL '7 days'))
FROM version_downloads
  INNER JOIN versions
    ON version_downloads.version_id = versions.id
  WHERE version_downloads.date > date(CURRENT_TIMESTAMP - INTERVAL '90 days')
  GROUP BY crate_id;

I improved it to take 13s by removing the timestamp conversions.

SELECT
  crate_id,
  SUM(version_downloads.downloads),
  SUM(version_downloads.downloads) FILTER (WHERE version_downloads.date > CURRENT_DATE - 30),
  SUM(version_downloads.downloads) FILTER (WHERE version_downloads.date > CURRENT_DATE - 7)
FROM version_downloads
  INNER JOIN versions
    ON version_downloads.version_id = versions.id
  WHERE version_downloads.date > CURRENT_DATE - 90
  GROUP BY crate_id;

NOTE1: Removing the timestamp conversions in the older query made it take 11.5s

NOTE2: We should probably remove the (version_id, date) index in version_downloads table because it's not actually being used since older data is archived and almost all the rows are used for calculation, thus postgres uses seq scan. That way, you can save some space. But, I will leave it upto you guys.

We calculate the recent downloads using a materialized view that we refresh every day. But it's only for last 90 days. Most of the other ecosystems like NPM, PyPi, Packagist, Hex provide daily, weekly and monthly downloads. So, I added a weekly and monthly column in the recent downloads view and surfaced the data in the `/api/v1/crates/{name}` endpoint.

Turbo87 · 2025-12-01T19:09:26Z

interesting. on our production database this operation often take multiple minutes 😅

@eth3lbert any thoughts on this?

eth3lbert · 2025-12-02T14:00:44Z

interesting. on our production database this operation often take multiple minutes 😅

I'm also a bit worried about the time this will take in production, so if we're sure it won't take longer than before, I'm not opposed to it! :D"

Referring to the downloads table, I should probably find time to migrate our downloads table to Declarative Partitioning. However, before actually doing this, it would be better to test whether a new database with the same scale, specifications, and data runs much faster than the current one. This would suggest that our performance is likely being affected by the (BRIN) index.

pksunkara · 2025-12-03T07:25:43Z

interesting. on our production database this operation often take multiple minutes 😅

That would mean that we get more savings on actual numbers :)

Do you need to confirm the benchmark yourself before merging this? Or is there anything else I can do?

Turbo87 · 2025-12-05T13:01:54Z

I thought about this some more. The 90-day download data for the crates is already available on the API, just not in summed up form, but anyone who wants to calculate these values for a certain crate can already do so. If shields.io wants to display such values, can't they subset the per-day download numbers themselves and sum them up?

As I mentioned above, the materialized view is one of the more expensive pieces of our database and I'm slightly hesitant to extend it since we can't go back once we offer these values in our API.

I'm adding this issue to the agenda for our crates.io team meeting today.

pksunkara · 2025-12-05T13:03:43Z

If shields.io wants to display such values, can't they subset the per-day download numbers themselves and sum them up?

How would they get it though? I saw some endpoints but don't think any of them solved this.

Turbo87 · 2025-12-05T13:05:58Z

I saw some endpoints but don't think any of them solved this.

https://crates.io/api/v1/crates/syn/downloads

pksunkara · 2025-12-05T13:09:59Z

👍. If deciding to use it, it would need a filter parameter for dates.

Turbo87 · 2025-12-05T13:10:35Z

If deciding to use it, it would need a filter parameter for dates.

why? you could filter it on the client side, no?

pksunkara · 2025-12-05T13:25:27Z

It would be better to pass the filter to the db to reduce the load. In fact, I am wondering how much load constantly requesting this endpoint would add to the db versus the current PR. But, I can't prove anything without having metrics.

Since the current PR keeps the time taken for refresh at least the same, without knowing metrics for that endpoint, I would prefer the current PR from a pure performance viewpoint.

rustbot added the A-backend ⚙️ label Nov 30, 2025

This comment was marked as outdated.

Sign in to view

Turbo87 reviewed Dec 1, 2025

View reviewed changes

Turbo87 added the C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works label Dec 1, 2025

pksunkara force-pushed the pavan/jj/sqvuonxt branch from 1c9b449 to cd32bcf Compare December 1, 2025 17:59

Turbo87 added this to crates.io team meetings Dec 5, 2025

Turbo87 moved this to For next meeting in crates.io team meetings Dec 5, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add monthly and weekly downloads to recent downloads calculation #12420

Add monthly and weekly downloads to recent downloads calculation #12420

Uh oh!

pksunkara commented Nov 30, 2025

Uh oh!

This comment was marked as outdated.

Turbo87 left a comment

Uh oh!

pksunkara commented Dec 1, 2025 •

edited

Loading

Uh oh!

Turbo87 commented Dec 1, 2025

Uh oh!

eth3lbert commented Dec 2, 2025

Uh oh!

pksunkara commented Dec 3, 2025 •

edited

Loading

Uh oh!

Turbo87 commented Dec 5, 2025

Uh oh!

pksunkara commented Dec 5, 2025

Uh oh!

Turbo87 commented Dec 5, 2025

Uh oh!

pksunkara commented Dec 5, 2025

Uh oh!

Turbo87 commented Dec 5, 2025

Uh oh!

pksunkara commented Dec 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Add monthly and weekly downloads to recent downloads calculation #12420

Are you sure you want to change the base?

Add monthly and weekly downloads to recent downloads calculation #12420

Uh oh!

Conversation

pksunkara commented Nov 30, 2025

Uh oh!

This comment was marked as outdated.

Turbo87 left a comment

Choose a reason for hiding this comment

Uh oh!

pksunkara commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Turbo87 commented Dec 1, 2025

Uh oh!

eth3lbert commented Dec 2, 2025

Uh oh!

pksunkara commented Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Turbo87 commented Dec 5, 2025

Uh oh!

pksunkara commented Dec 5, 2025

Uh oh!

Turbo87 commented Dec 5, 2025

Uh oh!

pksunkara commented Dec 5, 2025

Uh oh!

Turbo87 commented Dec 5, 2025

Uh oh!

pksunkara commented Dec 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pksunkara commented Dec 1, 2025 •

edited

Loading

pksunkara commented Dec 3, 2025 •

edited

Loading