@qqmyers qqmyers commented Jan 14, 2026

What this PR does / why we need it: #11822 in v6.9 significantly sped up permission reindexing across many datasets (e.g. when someone is granted new roles on the root dataverse). However, in doing so, it introduced an additional find(dataset_id) that slowed the case where content and permission indexing are both done on a single dataset. This PR, which came out of discussion with @landreev (who identified the issue), refactors the code so that this find, which essentially reloads the dataset/versions/filemetadatas unnecessarily, is no longer required.

It does this by passing a list of datasetversions rather than a dataset id to indexDatasetFilesInNewTransaction. Since that method runs in a new transaction, and we want to avoid merging the versions into the new transaction context for performance reasons (it's not really needed, since we are not writing any changes to them), there is also new code that ensures the fileMetadatas are loaded before the method is called in the case where they will be used (there is a JvmSettings.MIN_FILES_TO_USE_PROXY setting to avoid pulling fileMetadatas from the db when there are many files).
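
For reviewers, here is a rough, self-contained sketch of the pattern described above. It is not the code in this PR: `PreloadSketch`, `Version`, `preload`, the threshold value, and the simplified `indexDatasetFilesInNewTransaction` are hypothetical stand-ins, and the direction of the threshold check (preload only below the limit) is an assumption based on the description. The idea it illustrates is that the caller touches the lazy collection while still in its own context, so the method that runs later can read the already-loaded list without a find or a merge.

```java
import java.util.ArrayList;
import java.util.List;

class PreloadSketch {

    // Hypothetical stand-in for the JvmSettings.MIN_FILES_TO_USE_PROXY value.
    static final int MIN_FILES_TO_USE_PROXY = 1000;

    /** Hypothetical stand-in for a dataset version with lazily loaded file metadata. */
    static class Version {
        private final int fileCount;
        private List<String> fileMetadatas; // null until first access ("lazy")

        Version(int fileCount) {
            this.fileCount = fileCount;
        }

        int getFileCount() {
            return fileCount;
        }

        boolean isLoaded() {
            return fileMetadatas != null;
        }

        // Simulates the database hit that a lazy collection triggers on first access.
        List<String> getFileMetadatas() {
            if (fileMetadatas == null) {
                fileMetadatas = new ArrayList<>();
                for (int i = 0; i < fileCount; i++) {
                    fileMetadatas.add("file-" + i);
                }
            }
            return fileMetadatas;
        }
    }

    // Caller side: force loading while still in the original context, but only
    // for versions below the threshold (assumed direction of the check).
    static void preload(List<Version> versions) {
        for (Version v : versions) {
            if (v.getFileCount() < MIN_FILES_TO_USE_PROXY) {
                v.getFileMetadatas().size(); // touching the collection loads it
            }
        }
    }

    // Callee side: receives the versions themselves instead of a dataset id, so
    // nothing is re-fetched. Here it just counts the entries it could read from
    // preloaded versions; how large versions are handled is not modeled.
    static int indexDatasetFilesInNewTransaction(List<Version> versions) {
        int indexed = 0;
        for (Version v : versions) {
            if (v.isLoaded()) {
                indexed += v.getFileMetadatas().size();
            }
        }
        return indexed;
    }

    public static void main(String[] args) {
        List<Version> versions = List.of(new Version(3), new Version(MIN_FILES_TO_USE_PROXY));
        preload(versions);
        System.out.println("indexed entries: " + indexDatasetFilesInNewTransaction(versions));
    }
}
```

In this sketch the version with 3 files is preloaded and read directly, while the version at the threshold is left unloaded, which mirrors the intent of not pulling fileMetadatas from the db when there are many files.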

Which issue(s) this PR closes:

  • Closes #

Special notes for your reviewer:

Suggestions on how to test this: Regression testing: permission indexing should work as before in all cases, and calling /admin/index/status shouldn't show any (new?) issues related to permission docs. For performance, there are two main cases:

  • adding a role for someone on the root collection, triggering (only) permission indexing across many dvobjects
  • editing a dataset with many files

In both cases, the PR should improve performance over 6.9, though it's not yet clear how big the change in the first case is. The latter should restore the pre-6.8 performance or improve on it.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

@pdurbin pdurbin added this to the 6.10 milestone Jan 14, 2026
@pdurbin pdurbin moved this to In Progress 💻 in IQSS Dataverse Project Jan 14, 2026
coveralls commented Jan 14, 2026

Coverage Status

coverage: 24.291% (-0.001%) from 24.292%
when pulling b4fc822 on QualitativeDataRepository:indexingperformance
into 45724b9 on IQSS:develop.

@qqmyers qqmyers moved this from In Progress 💻 to Ready for Review ⏩ in IQSS Dataverse Project Jan 14, 2026
@qqmyers qqmyers added the "Size: 10" (A percentage of a sprint. 7 hours.) and "GDCC: QDR" (of interest to QDR) labels Jan 14, 2026
@qqmyers qqmyers marked this pull request as ready for review January 14, 2026 20:58
@qqmyers qqmyers removed their assignment Jan 14, 2026
