fix: migrate NioFileManagerImpl from jpedal to Apache PDFBox#2200
fix: migrate NioFileManagerImpl from jpedal to Apache PDFBox#2200
Conversation
- Replace jpedal imports with PDFBox equivalents - Update createCacheVersion2() to use PDFRenderer and renderImageWithDPI() - Convert page indexing from 1-based (jpedal) to 0-based (PDFBox) - Use standard ImageIO.write() instead of jpedal's ObjectStore - Simplify error handling (single IOException vs IOException + PdfException) - Remove unused InputStream import - Maintain 96 DPI rendering quality This migration improves code maintainability by using the standard Apache PDFBox library already in use throughout OpenO EMR. Fixes #2183 Co-authored-by: Michael Yingbull <yingbull@users.noreply.github.com>
📝 WalkthroughWalkthroughThis PR migrates the PDF page-to-PNG rendering in NioFileManagerImpl from the deprecated jpedal library to Apache PDFBox. The implementation now loads PDFs as PDDocuments, uses PDFRenderer to render pages to BufferedImages at 96 DPI, and saves results as PNG via ImageIO. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/main/java/ca/openosp/openo/managers/NioFileManagerImpl.java`:
- Around line 234-242: In NioFileManagerImpl's PDF rendering block, validate
that pageNum-1 is within document.getNumberOfPages() bounds before calling
PDFRenderer.renderImageWithDPI; if out of range return null. After rendering,
check the boolean return value of ImageIO.write and, on false or on any
IOException, ensure you do not return the cacheFilePath (delete any partially
written file if created) and return null instead. Use the existing symbols
PDDocument.load, PDFRenderer.renderImageWithDPI, document.getNumberOfPages(),
ImageIO.write and cacheFilePath to locate and implement these checks and early
returns.
|
@claude review |
|
Claude finished @yingbull's task in 1m 36s —— View job Code Review Complete ✅I've reviewed the jpedal → PDFBox migration in 🔴 Critical Issues1. Missing Page Index Validation (Security & Stability)Location: The code doesn't validate that Current code: try (PDDocument document = PDDocument.load(sourceFile.toFile())) {
PDFRenderer renderer = new PDFRenderer(document);
BufferedImage image_to_save = renderer.renderImageWithDPI(pageNum - 1, 96, ImageType.RGB);
ImageIO.write(image_to_save, "png", cacheFilePath.toFile());
image_to_save.flush();
} catch (IOException e) {
log.error("Error rendering PDF page to cache", e);
}Required fix: try (PDDocument document = PDDocument.load(sourceFile.toFile())) {
int pageIndex = pageNum - 1;
int pageCount = document.getNumberOfPages();
// Validate page index is within bounds
if (pageIndex < 0 || pageIndex >= pageCount) {
log.error("Requested page " + pageNum + " is out of range for document with " + pageCount + " pages");
return null;
}
PDFRenderer renderer = new PDFRenderer(document);
BufferedImage image_to_save = renderer.renderImageWithDPI(pageIndex, 96, ImageType.RGB);
// Check ImageIO.write success (returns false on failure)
if (!ImageIO.write(image_to_save, "png", cacheFilePath.toFile())) {
log.error("Failed to write PNG image to cache file: " + cacheFilePath);
return null;
}
image_to_save.flush();
} catch (IOException e) {
log.error("Error rendering PDF page to cache", e);
return null; // CRITICAL: Must return null on error
}Why this matters:
2. Missing Return Statement in Error HandlerLocation: When an Problem: } catch (IOException e) {
log.error("Error rendering PDF page to cache", e);
// Missing: return null;
}
return cacheFilePath; // Returns path to non-existent file!Fix: Add 3. Unchecked ImageIO.write() Return ValueLocation:
Fix: Check the return value and return null on failure (shown in code above). ✅ Strengths
💡 RecommendationsRequired Before Merge:
Testing Recommendations (from PR description):The PR description includes comprehensive testing recommendations. I'd add:
📋 SummaryVerdict: ❌ Request Changes - Critical error handling issues must be fixed The migration itself is clean and well-documented, but the error handling is incomplete. The three issues above can result in:
These are straightforward fixes that should take ~5 minutes to implement. Once addressed, this will be a solid migration that improves code maintainability. 📚 Additional ContextOpenO EMR Security Standards (from CLAUDE.md):
Package Migration (from CLAUDE.md):
|
|
@claude add the commit with the required error handling |
|
Claude finished @yingbull's task in 1m 12s —— View job ✅ Error Handling ImplementedI've successfully added all three critical error handling fixes to Changes Made1. Page Bounds Validation (Lines 235-240) int pageIndex = pageNum - 1;
int pageCount = document.getNumberOfPages();
// Validate page index is within bounds
if (pageIndex < 0 || pageIndex >= pageCount) {
log.error("Requested page " + pageNum + " is out of range for document with " + pageCount + " pages");
return null;
}2. ImageIO.write() Return Value Check (Lines 249-252) // Check ImageIO.write success (returns false on failure)
if (!ImageIO.write(image_to_save, "png", cacheFilePath.toFile())) {
log.error("Failed to write PNG image to cache file: " + cacheFilePath);
return null;
}3. Return null in IOException Catch Block (Lines 255-258) } catch (IOException e) {
log.error("Error rendering PDF page to cache", e);
return null; // Must return null on error
}Security & Stability Improvements✅ Prevents IndexOutOfBoundsException - Page requests beyond document page count are now validated before rendering Commit DetailsCommitted as: Next Steps: The PR is now ready for re-review. All critical error handling issues identified in the previous review have been addressed. |
- Add page bounds validation before renderImageWithDPI call - Validate pageNum-1 is within document page count (0 to pageCount-1) - Check ImageIO.write() return value and return null on failure - Add return null in IOException catch block to prevent returning path to non-existent file - Improves security and prevents IndexOutOfBoundsException Co-authored-by: Michael Yingbull <yingbull@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR migrates PDF page rendering in NioFileManagerImpl from jpedal to Apache PDFBox, aligning with the broader effort to standardize on PDFBox and removing the last direct jpedal usage in this class. It also tightens error handling around cache generation so callers no longer get paths to non-existent cache files.
Changes:
- Replaced jpedal imports with PDFBox (
PDDocument,PDFRenderer,ImageType) andImageIO, and removed the unusedInputStreamimport. - Reimplemented
createCacheVersion2()to usePDDocument.load(...)+PDFRenderer.renderImageWithDPI(...)at 96 DPI with explicit 0-based page indexing and bounds checks. - Simplified error handling to a single
IOExceptioncatch that logs a clear message and returnsnullon failure, and added a check onImageIO.writeto avoid returning a path when cache generation fails.
|
@claude review |
|
Claude finished @yingbull's task in 2m 20s —— View job Code Review: jpedal → PDFBox Migration ✅I've completed a thorough review of PR #2200. This migration successfully replaces jpedal with Apache PDFBox in 🎯 SummaryVerdict: APPROVED ✅ This is a well-executed migration that:
The follow-up commit (dde9d26) successfully addressed all three critical error handling issues identified in the previous review. ✅ What's Working Well1. Complete jpedal Removal- import org.jpedal.PdfDecoder;
- import org.jpedal.exception.PdfException;
- import org.jpedal.fonts.FontMappings;
+ import org.apache.pdfbox.pdmodel.PDDocument;
+ import org.apache.pdfbox.rendering.PDFRenderer;
+ import org.apache.pdfbox.rendering.ImageType;
+ import javax.imageio.ImageIO;All jpedal imports successfully removed from this file. 2. Excellent Error Handling (Lines 234-259)The follow-up commit added three critical safety checks: a. Page Bounds Validation: int pageIndex = pageNum - 1;
int pageCount = document.getNumberOfPages();
if (pageIndex < 0 || pageIndex >= pageCount) {
log.error("Requested page " + pageNum + " is out of range for document with " + pageCount + " pages");
return null;
}✅ Prevents b. ImageIO.write() Success Check: if (!ImageIO.write(image_to_save, "png", cacheFilePath.toFile())) {
log.error("Failed to write PNG image to cache file: " + cacheFilePath);
return null;
}✅ Detects disk full, permission errors, and other write failures c. IOException Handling: } catch (IOException e) {
log.error("Error rendering PDF page to cache", e);
return null; // Must return null on error
}✅ Returns null instead of falling through to return a bad path 3. Code SimplificationThe migration removes 17 lines of jpedal boilerplate and adds only 9 lines of PDFBox code: Before (jpedal - 17 lines): PdfDecoder decode_pdf = new PdfDecoder(true);
FontMappings.setFontReplacements();
decode_pdf.useHiResScreenDisplay(true);
decode_pdf.setExtractionMode(0, 96, 96 / 72f);
try (InputStream inputStream = Files.newInputStream(sourceFile)) {
decode_pdf.openPdfFileFromInputStream(inputStream, false);
BufferedImage image_to_save = decode_pdf.getPageAsImage(pageNum);
decode_pdf.getObjectStore().saveStoredImage(cacheFilePath.toString(), image_to_save, true, false, "png");
} catch (IOException e) {
log.error("Error", e);
} catch (PdfException e) {
log.error("Error", e);
} finally {
decode_pdf.flushObjectValues(true);
decode_pdf.closePdfFile();
}After (PDFBox - 9 lines + 17 lines error handling = 26 lines total): try (PDDocument document = PDDocument.load(sourceFile.toFile())) {
int pageIndex = pageNum - 1;
int pageCount = document.getNumberOfPages();
if (pageIndex < 0 || pageIndex >= pageCount) {
log.error("Requested page " + pageNum + " is out of range for document with " + pageCount + " pages");
return null;
}
PDFRenderer renderer = new PDFRenderer(document);
BufferedImage image_to_save = renderer.renderImageWithDPI(pageIndex, 96, ImageType.RGB);
if (!ImageIO.write(image_to_save, "png", cacheFilePath.toFile())) {
log.error("Failed to write PNG image to cache file: " + cacheFilePath);
return null;
}
image_to_save.flush();
} catch (IOException e) {
log.error("Error rendering PDF page to cache", e);
return null;
}While the new code is 9 lines longer, the added value is significant:
4. Proper Page Indexing Transition// Render at 96 DPI to match jpedal settings
// Note: jpedal uses 1-based page indexing, PDFBox uses 0-based
BufferedImage image_to_save = renderer.renderImageWithDPI(pageIndex, 96, ImageType.RGB);✅ Excellent comment explaining the 1-based → 0-based transition 5. Security Compliance✅ 6. Consistency with CodebaseThe PR description correctly notes this follows the same PDFBox pattern used in 📋 Testing RecommendationsThe PR description includes comprehensive testing recommendations. I'd emphasize: Critical Tests:
🔍 Minor Observations (Not Blocking)1. Unused Import CheckThe migration removed the unused 2. Error Message ConsistencyThe new error messages are more descriptive than the old ones:
This is a significant improvement for debugging. ✅ 3. Resource ManagementThe new code uses try-with-resources for 📚 Alignment with OpenO EMR StandardsVerified against ✅ Package Structure: Uses 🚀 RecommendationAPPROVE and MERGE ✅ This PR is ready to merge. The migration is:
The follow-up commit (dde9d26) successfully addressed all critical issues from the previous review. No additional changes needed. 📌 Next Steps (After Merge)Per issue #2170, jpedal can be removed from Review completed | Branch: claude/issue-2183-20260128-0117 | |
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Snapshot WarningsEnsure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice. Scanned FilesNone |
|
@claude I just updated the branch, I believe ManageDocument2Action.java is updated, can you check?If not, check if there is an issue ticket tracking migrating that java file, and if not, create one |
|
Claude finished @yingbull's task in 1m 15s —— View job Checking ManageDocument2Action.java Migration Status
✅ Good News: ManageDocument2Action.java Has Already Been Migrated!I've verified that ManageDocument2Action.java has already been successfully migrated from jpedal to Apache PDFBox: Verification Results:
Migration History:Per issue #2170, there were originally only 2 files using jpedal:
Next Steps:The jpedal → PDFBox migration is now complete! Per the parent issue #2170, the final step is:
Once this PR (#2200) is merged, you can create a follow-up issue to:
|
|
@claude create issue ticket for the work to remove jpedal |
|
Claude finished @yingbull's task in 1m 10s —— View job ✅ Issue Ticket Created
SummarySuccessfully created issue #2203 to track the removal of jpedal dependency from the project. Verification Results:
Issue Details:
The issue includes a detailed task list with acceptance criteria and regression testing steps to ensure PDF rendering continues to work correctly after jpedal removal. |
Summary
This PR migrates
NioFileManagerImpl.javafrom jpedal to Apache PDFBox for PDF page rendering.Changes Made
Removed jpedal imports:
org.jpedal.PdfDecoderorg.jpedal.exception.PdfExceptionorg.jpedal.fonts.FontMappingsAdded PDFBox imports:
org.apache.pdfbox.pdmodel.PDDocumentorg.apache.pdfbox.rendering.PDFRendererorg.apache.pdfbox.rendering.ImageTypejavax.imageio.ImageIOUpdated createCacheVersion2() method:
PdfDecoderwith PDFBoxPDDocument+PDFRendererpageNum - 1ObjectStore.saveStoredImage()with standardImageIO.write()IOExceptioninstead ofIOException+PdfException)Cleanup:
InputStreamimportTesting Recommendations
This migration requires thorough testing to ensure PDF page rendering works correctly:
Page Indexing Validation (CRITICAL):
PDF Compatibility:
Functional Testing:
hasCacheVersion2()removeCacheVersion()Quality Assurance:
Related Issues
Fixes #2183 (part of parent issue #2170: jpedal → PDFBox migration)
Notes
ManageDocument2Action.java(lines 557-559)pom.xml(see Migrate from jpedal to Apache PDFBox - Remove duplicate PDF rendering dependency #2170)🤖 Generated with Claude Code
Summary by cubic
Migrated NioFileManagerImpl from jpedal to Apache PDFBox for PDF page rendering, and added bounds checks and safer cache writes. Improves maintainability and keeps 96 DPI PNG caching. Fixes #2183.
Refactors
Bug Fixes
Written for commit 11a9974. Summary will update on new commits.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.