PDFImageViewer is a free tool for viewing and losslessly extracting embedded images from PDF documents. It can also convert image-only PDFs to JBIG2 PDFs.
This software owes its existence to MewUI, which satisfied most of my needs for a C# UI framework: high performance, AOT compatibility, dark theme, and adequate basic controls. The only catch is that the author prioritizes cross-platform technology over functional details, so the framework feels more like a tech demo. Consequently, I spent time fixing the behavior of some controls to make it viable for practical applications. Incidentally, I also found and worked around a critical crash bug related to image loading (though I am currently unable to report this issue due to the author’s circumstances).
For a long time, I relied on third-party tools such as pdfimages (Xpdf) and mutool (MuPDF) to extract images from PDFs. However, these tools cannot guarantee 100% accurate image extraction; color shifts and zero-byte output files occur frequently. For instance, when processing the sample file Separation-DCT.pdf, both tools export the DCT-encoded image data directly as JPG files without proper decoding and color correction. Furthermore, mutool often exports zero-byte PAM images. By comparison, the pdfimages from Poppler v26 is far superior.
Since these command-line tools lack real-time image viewing and export capabilities, I developed PDFImageViewer. It allows viewing and accurately extracting images, even when they combine different decoding filters and nested color spaces. Supported color spaces include RGB, Gray, CMYK, CalGray, CalRGB, Lab, ICCBased, Indexed, Separation, and DeviceN. The exceptions are certain formats that Windows cannot support and inline images (which I don’t care about).
For better compatibility with decoding filters and nested color spaces, PDFImageViewer doesn’t merge the decoding and pixel restoration steps. Instead, it uses a separate image processing function for each case. This makes it easier to handle both compliant and non-compliant image data generated by different PDF creators. The drawback is that for pipelines I haven’t encountered in my samples, PDFImageViewer might fail to extract the image (which is a coverage issue, not a technical one). If you encounter this, just submit an issue to me with a PDF file containing that page.
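The idea of keeping decoding and pixel restoration as separate steps can be sketched in Python as two lookup tables, one per step. This is only an illustration under simplifying assumptions, not the tool's actual structure: all names are hypothetical, only FlateDecode with 8-bit DeviceGray and DeviceRGB is covered, and predictors and Decode arrays are ignored.

```python
import zlib


class UnsupportedPipeline(Exception):
    """Raised when no handler exists for a filter / color space combination."""


def decode_flate(data):
    """Step 1: undo the stream filter (FlateDecode is zlib-compressed data)."""
    return zlib.decompress(data)


def restore_gray8(raw, width, height):
    """Step 2: turn 8-bit gray samples into rows of (r, g, b) pixels."""
    return [[(raw[y * width + x],) * 3 for x in range(width)]
            for y in range(height)]


def restore_rgb8(raw, width, height):
    """Step 2: turn packed 8-bit RGB samples into rows of (r, g, b) pixels."""
    return [[tuple(raw[(y * width + x) * 3:(y * width + x) * 3 + 3])
             for x in range(width)]
            for y in range(height)]


DECODERS = {"FlateDecode": decode_flate}
RESTORERS = {("DeviceGray", 8): restore_gray8,
             ("DeviceRGB", 8): restore_rgb8}


def extract(stream, filter_name, color_space, bits, width, height):
    """Pick one function per step; an unseen combination is a coverage gap."""
    try:
        decode = DECODERS[filter_name]
        restore = RESTORERS[(color_space, bits)]
    except KeyError:
        raise UnsupportedPipeline(f"{filter_name} + {color_space}/{bits}")
    return restore(decode(stream), width, height)


# A 2x1 RGB image: one red pixel followed by one green pixel.
stream = zlib.compress(bytes([255, 0, 0, 0, 255, 0]))
image = extract(stream, "FlateDecode", "DeviceRGB", 8, 2, 1)
```

With this layout, supporting a newly reported pipeline means adding one entry to a table rather than changing a combined decoder, which matches the "coverage issue" described above.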
Starting with V2, PDFImageViewer supports multiple binarization algorithms and can convert image-only PDFs to JBIG2 PDFs.
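Binarization is needed because JBIG2 stores bi-level (1-bit) images, so gray or color pages must be thresholded first. As an illustration, here is a minimal Python sketch of Otsu's method, a widely used global thresholding algorithm. Whether PDFImageViewer includes Otsu's method is an assumption on my part, and the function names and the output polarity (1 for pixels brighter than the threshold) are chosen only for this example.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_bg = 0        # sum of gray levels at or below the candidate threshold
    weight_bg = 0     # number of pixels at or below the candidate threshold
    best_t, best_var = 0, -1.0

    for t in range(256):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue              # no pixels in the darker class yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break                 # every pixel is already in the darker class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, threshold):
    """Map 8-bit gray samples to bits: 1 if brighter than the threshold."""
    return [1 if p > threshold else 0 for p in pixels]


# Tiny example: three dark samples and three bright samples.
gray = [10, 10, 10, 200, 200, 200]
t = otsu_threshold(gray)
bits = binarize(gray, t)
```

A single global threshold works well for evenly lit scans; pages with uneven lighting usually need a locally adaptive method, which is one reason to offer more than one algorithm.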
Contact me: QQ 564955427, Email liucq@163.com
URL: cnblogs
