Strange issue/bug when attempting to edit widgets on copied pages from a PDF #4311

@k3vob

Description of the bug

I have created the reproducible code below for my issue.

In a nutshell, I have a UI that lets a user edit the form fields (pymupdf.Widget) of a PDF. How they edit them is not important; what matters is that each widget is labelled with its widget index on the page. Each page is converted to an image and rendered in the UI, with every widget annotated with its index inside a red box.
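For concreteness, the placement of the red index box can be sketched in plain Python. This mirrors the arithmetic of annotate_widget in the script further down; the helper name `label_rect` and the sample text width of 3.9 are assumptions for illustration (the script obtains the width from pymupdf.get_text_length).

```python
Rect = tuple[float, float, float, float]


def label_rect(widget_rect: Rect, text_width: float,
               font_size: float = 7, margin_x: float = 1) -> Rect:
    """Return the label box anchored at the widget's left edge, vertically centred."""
    x0, y0, x1, y1 = widget_rect
    rect_height = font_size
    # The box is at least square, and wide enough for the text plus side margins.
    rect_width = max(rect_height, text_width + 2 * margin_x)
    # Centre the box vertically; never move it above the widget's top edge.
    offset = max(((y1 - y0) - rect_height) / 2, 0)
    return (x0, y0 + offset, x0 + rect_width, y0 + rect_height + offset)


# A 150 x 20 widget with an assumed text width of 3.9 points:
print(label_rect((100, 200, 250, 220), text_width=3.9))  # (100, 206.5, 107, 213.5)
```

When the widget is shorter than the font size, the offset clamps to 0 and the box starts at the widget's top edge.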

With some PDFs, the rendered image shows the widgets above the annotations in z-order, so the widgets cover the annotations, even though the widgets already existed in the PDF and the annotations were added afterwards.

Example:

Image

To overcome this first issue, what I decided to do was:

  1. create a new blank PDF document, so that the original PDF is not affected when adding/deleting widgets/annotations
  2. copy the required page from the original PDF to the new empty PDF
  3. add the annotations for each widget on this copied page
  4. delete all the widgets on this copied page so that they do not cover the annotations

This works as expected for the first page I process. Strangely, on every later call (whether for a different page or the same one), the newly copied page has 0 widgets, even though the corresponding page in the original PDF has widgets.
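One way to check whether the original document itself loses its widgets, or only the copies do, is to compare per-page widget counts before and after each call. The sketch below is a diagnostic only, not a fix. The helper names `widget_counts` and `source_changed_by` are hypothetical, and the code assumes a document that can be iterated page by page, with each page exposing `widgets()` as in the script below.

```python
from typing import Any, Callable


def widget_counts(doc: Any) -> list[int]:
    """Return the number of widgets on each page of a PyMuPDF-style document."""
    # Assumption: iterating `doc` yields pages, and each page has `widgets()`.
    return [len(list(page.widgets())) for page in doc]


def source_changed_by(doc: Any, action: Callable[[], object]) -> bool:
    """Run `action` and report whether the widget counts of `doc` changed."""
    before = widget_counts(doc)
    action()
    after = widget_counts(doc)
    print(f"before: {before}")
    print(f"after:  {after}")
    return before != after


# Hypothetical usage with the reproduction script:
# source_changed_by(pdf, lambda: generate_annotated_page_image(pdf=pdf, page_index=0))
```

If this reports a change, the source document is being modified by the copy/delete sequence; if not, the widgets are lost somewhere in the copy step.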

After calling the function once (expected results):

Image

After calling the function again, on both a different page and the same page (unexpected results: all widgets, and hence all annotations, are missing):

Image

Image

I have been trying to find a solution to either of these problems for the last 2 days, but cannot.

from typing import Final

import pymupdf

ANNOTATION_FONT_SIZE: Final[float] = 7
ANNOTATION_LINE_WEIGHT: Final[float] = 1.2
ANNOTATION_MARGIN_X: Final[float] = 1
ANNOTATION_PAD: Final[float] = 1

RGB = tuple[float, float, float]
RED: Final[RGB] = (1, 0, 0)
WHITE: Final[RGB] = (1, 1, 1)
BLACK: Final[RGB] = (0, 0, 0)


def generate_annotated_page_image(pdf: pymupdf.Document, page_index: int) -> bytes:
    """Generate an annotated page image from a PyMuPDF document page."""
    new_pdf = pymupdf.open()
    new_pdf.insert_pdf(docsrc=pdf, from_page=page_index, to_page=page_index)
    pdf_page: pymupdf.Page = new_pdf[0]

    for annotation in pdf_page.annots():
        pdf_page.delete_annot(annot=annotation)

    widgets = list(pdf_page.widgets())
    print(f"Page index {page_index}: {len(widgets)} widgets")
    for widget_index, widget in enumerate(iterable=widgets):
        annotate_widget(pdf_page=pdf_page, widget=widget, widget_index=widget_index)
        pdf_page.delete_widget(widget=widget)

    return pdf_page.get_pixmap(matrix=pymupdf.Matrix(2, 2)).tobytes()


def annotate_widget(pdf_page: pymupdf.Page, widget: pymupdf.Widget, widget_index: int) -> None:
    """Annotate a field in a PyMuPDF document."""
    annotation: str = str(object=widget_index)

    text_width: float = pymupdf.get_text_length(text=annotation, fontsize=ANNOTATION_FONT_SIZE)
    rect_height: float = ANNOTATION_FONT_SIZE
    rect_width: float = max(rect_height, text_width + (2 * ANNOTATION_MARGIN_X))
    offset: float = max((widget.rect.height - rect_height) / 2, 0)

    rect: tuple[float, float, float, float] = (
        widget.rect[0],
        widget.rect[1] + offset,
        widget.rect[0] + rect_width,
        widget.rect[1] + rect_height + offset,
    )

    text_rect: tuple[float, float, float, float] = (rect[0], rect[1] + ANNOTATION_PAD / 2, rect[2], rect[3])

    rect_annotation: pymupdf.Annot = pdf_page.add_rect_annot(
        rect=(rect[0] - ANNOTATION_PAD, rect[1] - ANNOTATION_PAD, rect[2] + ANNOTATION_PAD, rect[3] + ANNOTATION_PAD)
    )
    rect_annotation.set_colors(stroke=RED, fill=WHITE)
    rect_annotation.set_border(width=ANNOTATION_LINE_WEIGHT)
    rect_annotation.update()

    pdf_page.add_freetext_annot(
        rect=text_rect,
        text=annotation,
        fontsize=ANNOTATION_FONT_SIZE,
        text_color=BLACK,
        fill_color=WHITE,
        align=pymupdf.TEXT_ALIGN_CENTER,
    )


pdf_path = "/path/to/pdf/with/widgets.pdf"
pdf = pymupdf.open(pdf_path)

for i, page_index in enumerate([0, 1, 0]):
    image = generate_annotated_page_image(pdf=pdf, page_index=page_index)
    with open(f"image{i}.png", "wb") as f:
        f.write(image)

Output:

Page index 0: 63 widgets
Page index 1: 0 widgets
Page index 0: 0 widgets

How to reproduce the bug

Run the script above, replacing pdf_path = "/path/to/pdf/with/widgets.pdf" with the path to a PDF with widgets on at least 2 pages.

PyMuPDF version

1.25.3

Operating system

MacOS

Python version

3.12

Labels: not a bug (not a bug / user error / unable to reproduce)
