Skip to content

[PULP-578] Optimize filtering of repo versions by content#7817

Open
jobselko wants to merge 1 commit into
pulp:mainfrom
jobselko:7799
Open

[PULP-578] Optimize filtering of repo versions by content#7817
jobselko wants to merge 1 commit into
pulp:mainfrom
jobselko:7799

Conversation

@jobselko

Copy link
Copy Markdown
Member

closes #7799

📜 Checklist

  • Commits are cleanly separated with meaningful messages (simple features and bug fixes should be squashed to one commit)
  • A changelog entry or entries has been added for any significant changes
  • Follows the Pulp policy on AI Usage
  • (For new features) - User documentation and test coverage has been added

See: Pull Request Walkthrough

@jobselko jobselko self-assigned this Jun 25, 2026

@mdellweg mdellweg left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.

if isinstance(content, models.QuerySet):
content_pks = list(content.values_list("pk", flat=True))
else:
content_pks = list(content)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this branch for? (According to the docstring, content is always a queryset.)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DistributionWithContentFilter.filter sends a list (versions = RepositoryVersion.objects.with_content([content.pk]).values_list("pk", flat=True)), so the docstring was already outdated. Updated now.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

values_list should still return a (lazily evaluated) queryset:

https://docs.djangoproject.com/en/6.0/ref/models/querysets/#values-list

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

values_list is not what is passed in. [content.pk] is the argument to with_content(), and this is a plain list, not a queryset

@jobselko jobselko Jun 26, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I changed list(content) to just content as __overlap accepts any iterable.

EDIT: overlap only works on lists, list() is still needed

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be missing something. But the queryset from line 902 works with overlap perfectly, right?

@jobselko jobselko Jun 26, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, overlap is happy with both querysets and lists (so not only with lists, I was wrong).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does that mean the branch is unnecessary?

Comment thread pulpcore/app/models/repository.py Outdated
@jobselko jobselko force-pushed the 7799 branch 2 times, most recently from 35787ad to fc896b9 Compare June 25, 2026 12:57
@jobselko

Copy link
Copy Markdown
Member Author

Measured benefit by repeatedly adding/removing the same content unit into a repo (creating 2 RepositoryVersion and 1 RepositoryContent per cycle):

Without patch:
Response time grows with the number of cycles (e.g. ~0.5s at 100, ~2-3s at 700)

With patch:
Response time stays constant at ~0.45s regardless of the number of cycles

closes pulp#7799
Assisted By: Claude Opus 4.6
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use new memoized field in repository content API filtering

3 participants