Closed
Labels
not a bug / user error / unable to reproduce
Description
Please provide all mandatory information!
Describe the bug (mandatory)
When I extract images, any image that has no background color comes out with a black background.
To Reproduce (mandatory)
Explain the steps to reproduce the behavior. For example, include a minimal code snippet, example files, etc.
Here is my code.
import io
import os

import fitz  # PyMuPDF
from PIL import Image


# get_closest_title() and add_title_big() are helper functions defined elsewhere (not shown).
def extract_images(pdf_path, output_folder, output_folder_with_title, classified_titles):
    doc = fitz.open(pdf_path)
    doc.colorspace = fitz.csRGB
    for page_number in range(len(doc)):
        page = doc[page_number]
        # only test 1 page
        # if page_number != 53:
        #     continue
        img_num = 0
        # get xref and rect objects
        images = page.get_images(full=True)
        for img_info in images:
            xref = img_info[0]
            img_rect = page.get_image_rects(xref)
            img_title = get_closest_title(page_number, img_rect, classified_titles)
            # extract the raw image stream and load it with Pillow
            base_image = doc.extract_image(xref)
            image_bytes = base_image["image"]
            image = Image.open(io.BytesIO(image_bytes))
            img_num += 1
            image_name = "page" + str(page_number) + "_" + str(img_num) + img_title
            image_path = os.path.join(output_folder, f"{image_name}.png")
            image_with_title_path = os.path.join(output_folder_with_title, f"{image_name}.png")
            image.save(image_path)
            image_with_title = add_title_big(image, "page" + str(page_number) + "_" + str(img_num))
            image_with_title.save(image_with_title_path)
    return
Expected behavior (optional)
Extract an image with a white background.
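For reference, here is a minimal sketch of one way to get that result. It assumes the black areas come from transparency that the PDF stores as a separate soft mask (the second entry of each `page.get_images()` item is the mask's xref, or 0 if there is none), and that `doc.extract_image()` returns only the base image without that mask. The function name `extract_image_on_white` is hypothetical, and the sketch assumes Pillow can open both extracted streams, which may not hold for every image format.

```python
import io

from PIL import Image


def extract_image_on_white(doc, img_info):
    """Return the image as RGB, with masked-out areas filled with white.

    doc      -- an open PyMuPDF document (anything offering extract_image(xref))
    img_info -- one item from page.get_images(): img_info[0] is the image xref,
                img_info[1] is the soft-mask xref (0 means "no mask")
    """
    xref, smask = img_info[0], img_info[1]

    # Base image: this is what the code above saves directly.
    base_bytes = doc.extract_image(xref)["image"]
    base = Image.open(io.BytesIO(base_bytes)).convert("RGB")
    if smask <= 0:
        return base  # no mask, nothing to blend

    # The soft mask is a grayscale image: 255 = opaque, 0 = fully transparent.
    mask_bytes = doc.extract_image(smask)["image"]
    mask = Image.open(io.BytesIO(mask_bytes)).convert("L")
    if mask.size != base.size:
        # Assumption: a mask of a different size can simply be rescaled.
        mask = mask.resize(base.size)

    # Where the mask is opaque keep the image, elsewhere show white.
    white = Image.new("RGB", base.size, (255, 255, 255))
    return Image.composite(base, white, mask)
```

In the loop above this would replace the `Image.open(...)` step, for example `image = extract_image_on_white(doc, img_info)`. Images that use other kinds of masks are not covered by this sketch.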
Screenshots (optional)
This is how it looks when I open it in Adobe.

Your configuration (mandatory)
- MacBook
- Python 3.7
- PyMuPDF version 1.22.1
Additional context (optional)
