Closed
Description of the bug
I want to compress a PDF. I extracted the images from the PDF file and compressed them with pngquant, which reduced their size by more than 70%. However, after I replaced the images with the replace_image function, the new PDF was larger than the original. I want to know why this happens, and whether I am using the save function incorrectly.
How to reproduce the bug
import os

import pngquant
import pymupdf


def compress_pdf(input_path, output_path=""):
    doc = pymupdf.open(input_path)
    doc_name_with_extension = os.path.basename(input_path)
    doc_name = os.path.splitext(doc_name_with_extension)[0]
    for page_index in range(10):  # only the first 10 pages are processed
        page = doc[page_index]
        image_list = page.get_images()
        for image_index, img in enumerate(image_list, start=1):
            xref = img[0]
            pix = pymupdf.Pixmap(doc, xref)
            if pix.n - pix.alpha > 3:  # more than 3 color components: convert to RGB
                pix = pymupdf.Pixmap(pymupdf.csRGB, pix)
            origin_png_path = "./origin/%s_page_%s-image_%s.png" % (
                doc_name,
                page_index,
                image_index,
            )
            pix.save(origin_png_path)  # save the extracted image
            pix = None
            pngquant.compress_png(origin_png_path)
            compressed_png_path = "./origin/%s_page_%s-image_%s-fs8.png" % (
                doc_name,
                page_index,
                image_index,
            )
            # print(os.path.getsize(compressed_png_path), compressed_png_path)
            # Option 1: replace from a file
            # page.replace_image(xref, filename=compressed_png_path)
            # Option 2: replace from bytes
            with open(compressed_png_path, "rb") as compressed_png:
                compressed_png_bytes = compressed_png.read()
            print(len(compressed_png_bytes), "111")
            page.replace_image(xref, stream=compressed_png_bytes)
    doc.save(output_path, garbage=3, clean=True)
    doc.close()
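To put numbers on the sizes described above, a small helper along these lines could be used. This is only a sketch: `size_change` is a hypothetical name introduced here for illustration, it is not part of PyMuPDF or pngquant, and it uses nothing beyond the standard library.

```python
import os


def size_change(before_path, after_path):
    """Return (before_bytes, after_bytes, percent_change) for two files.

    percent_change is negative when the second file is smaller
    and positive when it is larger.
    """
    before = os.path.getsize(before_path)
    after = os.path.getsize(after_path)
    if before == 0:
        raise ValueError("reference file is empty: %s" % before_path)
    percent = (after - before) * 100.0 / before
    return before, after, percent
```

Calling it once per image pair (for example `size_change(origin_png_path, compressed_png_path)`) and once for the whole document (`size_change(input_path, output_path)`) would show the PNG reduction and the PDF growth side by side.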
PyMuPDF version
1.24.6
Operating system
MacOS
Python version
3.9