Improve Permission Indexing Performance #12082
Open
+51
−22
What this PR does / why we need it: #11822 in v6.9 significantly sped up permission reindexing across many datasets (e.g. when someone is granted new roles on the root dataverse). However, in doing so it introduced an additional find(dataset_id) that slowed down cases where content and permission indexing are done on a single dataset. This PR, which came out of discussion with @landreev (who identified the issue), refactors the code to avoid a find that essentially reloads the dataset/versions/filemetadatas unnecessarily.
It does this by passing a list of datasetversions rather than a dataset id to indexDatasetFilesInNewTransaction. Since that method runs in a new transaction, and we want to avoid merging the versions into the new transaction context for performance reasons (and it's not really needed since we are not writing any changes to them), there is also new code that ensures the fileMetadatas are loaded before the method is called in the case where they will be used (there is a JvmSettings.MIN_FILES_TO_USE_PROXY setting to avoid pulling fileMetadatas from the db when there are many files).
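For reviewers who want a picture of the pattern, here is a minimal, self-contained sketch of the idea (preload lazily fetched children, then hand the already-loaded objects to the indexing method instead of an id). It is not the code in this PR: `PreloadSketch`, `Version`, `preloadFileMetadatas`, `indexFiles`, the `dbFetches` counter and the threshold value are all hypothetical stand-ins, there is no JPA or transaction handling, and the direction of the threshold comparison is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class PreloadSketch {
    // Hypothetical stand-in for JvmSettings.MIN_FILES_TO_USE_PROXY; the value is made up.
    static final int MIN_FILES_TO_USE_PROXY = 3;

    // Counts simulated database fetches so the effect of preloading is visible.
    static int dbFetches = 0;

    // Hypothetical stand-in for a dataset version with a lazily loaded file list.
    static class Version {
        final int fileCount;
        private List<String> fileMetadatas; // null until fetched

        Version(int fileCount) {
            this.fileCount = fileCount;
        }

        boolean isLoaded() {
            return fileMetadatas != null;
        }

        List<String> getFileMetadatas() {
            if (fileMetadatas == null) {
                dbFetches++; // simulated lazy load from the database
                fileMetadatas = new ArrayList<>();
                for (int i = 0; i < fileCount; i++) {
                    fileMetadatas.add("file-" + i);
                }
            }
            return fileMetadatas;
        }
    }

    // Caller side: touch the lazy list while the original context is still open,
    // but only for versions below the threshold (assumed comparison).
    static void preloadFileMetadatas(List<Version> versions) {
        for (Version v : versions) {
            if (v.fileCount < MIN_FILES_TO_USE_PROXY) {
                v.getFileMetadatas();
            }
        }
    }

    // Stand-in for the method that runs in a new transaction: it receives the
    // versions themselves rather than a dataset id, so it never re-finds the dataset.
    // Versions that were not preloaded are simply skipped in this sketch.
    static int indexFiles(List<Version> versions) {
        int docs = 0;
        for (Version v : versions) {
            if (v.isLoaded()) {
                docs += v.getFileMetadatas().size();
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Version> versions = List.of(new Version(2), new Version(5));
        preloadFileMetadatas(versions);
        int fetchesBefore = dbFetches;
        int docs = indexFiles(versions);
        System.out.println(docs + " docs, extra fetches: " + (dbFetches - fetchesBefore));
    }
}
```

The point of the sketch is that `indexFiles` triggers no further fetches of its own, because everything it reads was loaded by the caller beforehand.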
Which issue(s) this PR closes:
Special notes for your reviewer:
Suggestions on how to test this: Regression testing - permission indexing should work as before in all cases, calling /admin/index/status shouldn't show any (new?) issues related to permission docs. For performance, there are two main cases:
In both cases, the PR should improve performance over 6.9 - not sure yet how big the change in the first case is. The latter should restore the pre-6.8 performance or improve on it.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: