Feature: Add getPagesInfo() for batch page info retrieval (#20530) #20589
+65
−0
Description
This PR implements the getPagesInfo() method requested in #20530.

Currently, retrieving layout data for all pages requires calling getPage(i) sequentially, which causes N worker round-trips. This PR introduces a batch retrieval method that fetches view, rotate, and userUnit for all pages in a single round-trip.

Changes
- Add PDFDocumentProxy.prototype.getPagesInfo() in src/display/api.js.
- Add a GetPagesInfo handler in src/core/worker.js.
- Add tests in test/unit/api_spec.js to ensure the returned data is consistent with getPage().

Performance Verification
I measured the performance improvement using a benchmark script.
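The structural improvement (N -> 1 round-trips) provides a significant speedup, especially for large documents, because the cost of the getPage loop grows with the page count while the batch call stays at a single round-trip.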
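
To make the difference in call patterns concrete, here is a minimal sketch. It does not use the real worker: createMockDocument is a hypothetical stand-in that only counts simulated round-trips, and the result shape of getPagesInfo() (an array with one { view, rotate, userUnit } entry per page) is an assumption, not something confirmed by this description.

```javascript
// Hypothetical stand-in for a PDFDocumentProxy. It counts simulated worker
// round-trips instead of talking to a real worker.
function createMockDocument(numPages) {
  let roundTrips = 0;
  // Same placeholder layout data for every page (US Letter, no rotation).
  const infoFor = () => ({ view: [0, 0, 612, 792], rotate: 0, userUnit: 1 });
  return {
    numPages,
    get roundTrips() {
      return roundTrips;
    },
    // One round-trip per call, as with getPage(i) in a loop.
    async getPage(pageNumber) {
      roundTrips++;
      return infoFor(pageNumber);
    },
    // Assumed result shape: an array with one entry per page.
    async getPagesInfo() {
      roundTrips++;
      return Array.from({ length: numPages }, (_, i) => infoFor(i + 1));
    },
  };
}

// Existing pattern: N sequential calls, so N round-trips.
async function collectWithLoop(doc) {
  const result = [];
  for (let i = 1; i <= doc.numPages; i++) {
    const page = await doc.getPage(i);
    result.push({
      view: page.view,
      rotate: page.rotate,
      userUnit: page.userUnit,
    });
  }
  return result;
}

// Proposed pattern: a single call, so one round-trip.
async function collectWithBatch(doc) {
  return doc.getPagesInfo();
}
```

With the mock, both functions return the same data; only the round-trip counter differs (numPages for the loop, 1 for the batch call).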
Benchmark results table (measured values not preserved): getPage loop vs. getPagesInfo (New).
I verified the logic using a script similar to this (simplified for reproducibility in unit tests):