Improve the Efficiency of the /api/index/perms API call #12200

Open
qqmyers wants to merge 7 commits into IQSS:develop from QualitativeDataRepository:Update_/api/index/perms

qqmyers commented Mar 9, 2026

Member

What this PR does / why we need it: The (undocumented?) API call /api/index/perms iterates through all dvobjects in the database in one synchronous transaction, making it unusable. This PR replaces that logic with an asynchronous iteration over the datasets and dataverses in the root dataverse, reusing the index-self-and-children logic used elsewhere. That should make it much faster and less memory intensive.
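
The shape of that change can be sketched in plain Java. This is only an illustration, not the PR's code: `RootChildReindexer`, `reindexChildrenAsync`, and the `indexSelfAndChildren` callback are hypothetical names, and `CompletableFuture` stands in for the container-managed asynchronous call. The point is that the caller returns immediately and the work proceeds one root-level child at a time instead of loading every dvobject up front.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical stand-in for the service that drives the permission reindex. */
class RootChildReindexer {

    /**
     * Starts a background pass over the direct children of the root collection.
     * Each child is handed to indexSelfAndChildren, which is assumed to index
     * that child and everything beneath it.
     *
     * @return a future that completes with the number of children processed
     */
    static <T> CompletableFuture<Integer> reindexChildrenAsync(
            List<T> rootChildren, Consumer<T> indexSelfAndChildren) {
        return CompletableFuture.supplyAsync(() -> {
            int processed = 0;
            for (T child : rootChildren) {
                // One child (and its subtree) at a time keeps memory bounded.
                indexSelfAndChildren.accept(child);
                processed++;
            }
            return processed;
        });
    }
}
```

In this sketch a failure in one child aborts the remaining children; whether the real method continues past errors is not stated here.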

Which issue(s) this PR closes:

  • Closes #

Special notes for your reviewer: FWIW - the old code calls findAll on dvobjects which, despite the allExceptFiles variable name, includes files. The code then puts all the files in a map and, as far as I can see, never uses them.

Suggestions on how to test this: Run the API call before and after, confirm the speed/memory improvement, and check the db to make sure all last permission index times are updated after the new call.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

qqmyers added the GDCC: QDR (of interest to QDR) and Size: 3 (a percentage of a sprint; 2.1 hours) labels Mar 9, 2026
qqmyers added this to the 6.11 milestone Mar 25, 2026
cmbz moved this to Ready for Review ⏩ in IQSS Dataverse Project Jun 3, 2026
cmbz modified the milestones: 6.11, 6.12 Jun 3, 2026
cmbz added the FY26 Sprint 25 (2026-06-03 - 2026-06-17) label Jun 3, 2026
cmbz added the FY26 Sprint 26 (2026-06-17 - 2026-07-01) label Jun 18, 2026

rtreacy left a comment

Contributor

via Claude --
PR #12200: Improve the Efficiency of the /api/index/perms API call (by qqmyers)

The PR replaces a synchronous, memory-heavy indexAllPermissions() with an @Asynchronous method that iterates dataverses individually, delegating to the existing indexPermissionsOnSelfAndChildren() logic.

Key recommendations:

  1. Add @TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED): without it, the async method holds a single transaction open for the entire run (potentially hours on large installations), risking timeouts.
  2. Guard against concurrent invocations: the fire-and-forget async nature means nothing prevents multiple overlapping runs. An AtomicBoolean or @Singleton/@Lock(WRITE) pattern would prevent this.
  3. No authentication on the endpoint (pre-existing): any anonymous user can trigger resource-intensive reindexing via a simple GET request.
  4. Confirm the old method is fully deleted: the old indexAllPermissions() becomes dead code after this change.

The approach itself is sound — reusing indexPermissionsOnSelfAndChildren with its batched sub-transactions and em.clear() every 10 dataverses is correct for memory management.
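
Recommendation 2 could look roughly like the following. It is a minimal sketch of the AtomicBoolean variant only, written in plain Java so it runs without a container; `PermissionReindexGuard` and `runExclusively` are hypothetical names, not taken from the PR. In the actual bean the guarded method would presumably also carry the annotations discussed in recommendation 1.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the bean that owns the reindex job. */
class PermissionReindexGuard {

    // True while a reindex run is in progress.
    private final AtomicBoolean running = new AtomicBoolean(false);

    /**
     * Runs the job unless another run is already active.
     *
     * @return true if the job ran, false if it was skipped
     */
    boolean runExclusively(Runnable job) {
        // compareAndSet flips false -> true atomically, so only one caller wins.
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            job.run();
            return true;
        } finally {
            // Always release the flag, even if the job throws.
            running.set(false);
        }
    }
}
```

A caller that gets `false` back can report that a reindex is already running instead of starting a second one. The flag lives in one JVM, so this does not coordinate across multiple application servers.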

rtreacy moved this from Ready for Review ⏩ to In Review 🔎 in IQSS Dataverse Project Jun 22, 2026
rtreacy self-assigned this Jun 22, 2026

qqmyers commented Jun 23, 2026

Member Author

Changes made per review. In addition: changed the call to POST, as best practice to avoid cross-site issues; updated the call that reindexes permissions for a single dataset to use POST and require superuser; and added documentation (the calls were previously undocumented) in the release note, Solr guide, and changelog.
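
For illustration, a client-side call after the switch to POST might be built as below. This is a sketch under assumptions: `PermsReindexRequest`, the base URL, and the token value are placeholders, and it assumes the endpoint now expects a superuser API token sent in Dataverse's usual `X-Dataverse-key` header. The documentation added in this PR is the authority on the exact requirements.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that builds the reindex request without sending it. */
class PermsReindexRequest {

    static HttpRequest build(String baseUrl, String apiToken) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/index/perms"))
                // Assumption: a superuser API token is required after this PR.
                .header("X-Dataverse-key", apiToken)
                // POST with an empty body replaces the earlier GET.
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

The request can then be sent with `java.net.http.HttpClient`. Because the reindex runs asynchronously, a response only indicates that the job was accepted, not that it has finished.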

qqmyers added 3 commits June 23, 2026 11:00
# Conflicts:
#	doc/sphinx-guides/source/api/changelog.rst
#	src/main/java/edu/harvard/iq/dataverse/search/SolrIndexServiceBean.java
Labels

FY26 Sprint 25 (2026-06-03 - 2026-06-17)
FY26 Sprint 26 (2026-06-17 - 2026-07-01)
GDCC: QDR (of interest to QDR)
Size: 3 (a percentage of a sprint; 2.1 hours)

Projects

Status: In Review 🔎

3 participants