Discard cropped / hidden content ("sanitize") or extract cropped images instead of raw images

My PDF has tons of cropped images and AFAIK PyMuPDF only allows me to extract raw (uncropped) ones.

I was wondering if any of the following is possible?

a. Discard the hidden / cropped part of all images (similar to "Redact" -> "Sanitize" in Acrobat, without rasterizing) prior to extracting.
b. Obtain the cropbox of each image so I can crop the extracted raw images using another library.
c. (Preferably) Ignore cropped data during extraction (i.e. extract just the cropped images instead of the raw ones).
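
To make option (b) concrete, here is a minimal sketch of the arithmetic I have in mind, not anything PyMuPDF provides. It assumes I can somehow get two rectangles in page coordinates (top-left origin, y pointing down): the rectangle where the full raw image is placed (as far as I know, the `bbox` reported by `Page.get_image_info()` is that kind of rectangle) and the rectangle that is actually visible after cropping. How to obtain the visible rectangle is exactly what I am asking about. The function name `pixel_crop_box` and its parameters are hypothetical, and the sketch ignores rotated or sheared images.

```python
import math
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), y grows downward


def pixel_crop_box(placed: Rect, visible: Rect,
                   width: int, height: int) -> Tuple[int, int, int, int]:
    """Map the visible part of a placed image to a pixel box in the raw image.

    placed  -- where the full raw image sits on the page (assumed input)
    visible -- the part of the page that actually shows the image (assumed input)
    width, height -- pixel dimensions of the raw image
    Returns (left, upper, right, lower) in raw-image pixels.
    """
    px0, py0, px1, py1 = placed
    if px1 <= px0 or py1 <= py0:
        raise ValueError("placed rectangle is empty")

    # Only the overlap of the two rectangles can be visible.
    vx0, vy0 = max(px0, visible[0]), max(py0, visible[1])
    vx1, vy1 = min(px1, visible[2]), min(py1, visible[3])
    if vx0 >= vx1 or vy0 >= vy1:
        raise ValueError("image is entirely hidden")

    # Pixels per page unit along each axis (no rotation or shear assumed).
    sx = width / (px1 - px0)
    sy = height / (py1 - py0)

    # Round outward so no visible pixel is lost, then clamp to the image.
    left = max(0, math.floor((vx0 - px0) * sx))
    upper = max(0, math.floor((vy0 - py0) * sy))
    right = min(width, math.ceil((vx1 - px0) * sx))
    lower = min(height, math.ceil((vy1 - py0) * sy))
    return left, upper, right, lower


if __name__ == "__main__":
    # A 400x200 px image placed at 200x100 pt, with the middle half visible.
    print(pixel_crop_box((100, 100, 300, 200), (150, 100, 250, 200), 400, 200))
```

The returned tuple is in the (left, upper, right, lower) order that Pillow's `Image.crop()` expects, so the extracted raw image could be cropped with it directly.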


Discard cropped / hidden content ("sanitize") or extract cropped images instead of raw images #1309