Incorrect block detection #4241

@mxav1111

Description of the bug

Block detection appears to return incorrect boundaries in certain cases.

I am trying to extract the blocks themselves (not their text), and the block boundary coordinates are not detected correctly. The problem is visible on the last page of the attached output PDF.

pip list -o confirms that the installed version is the latest.

Attached are the source PDF, the output PDF, and the code (as a .txt file):

output_check_last_page_for_issue.pdf
source.pdf
block_detection_issue_on_last_page.py.txt

Thanks for your help.
-M

How to reproduce the bug

To reproduce the issue, use the attached source PDF and code; the attached output PDF shows the result.

Rename the .txt file to block_detection_issue_on_last_page.py and run:

$ python block_detection_issue_on_last_page.py source.pdf output_new.pdf
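
For readers without the attachment, here is a minimal sketch of a script with the same command-line shape. It is not the attached code. It assumes the blocks are read with page.get_text("blocks") and outlined with page.draw_rect; the function names block_rects and outline_blocks are hypothetical and chosen only for illustration.

```python
import sys


def block_rects(blocks, text_only=False):
    """Return (x0, y0, x1, y1) for each tuple from page.get_text("blocks").

    Each tuple has the form (x0, y0, x1, y1, content, block_no, block_type).
    """
    rects = []
    for x0, y0, x1, y1, _content, _block_no, block_type in blocks:
        if text_only and block_type != 0:  # 0 = text block, 1 = image block
            continue
        rects.append((x0, y0, x1, y1))
    return rects


def outline_blocks(src_path, dst_path):
    """Draw a red rectangle around every detected block and save a copy."""
    # Imported here so block_rects() stays usable without PyMuPDF installed.
    import pymupdf

    doc = pymupdf.open(src_path)
    for page in doc:
        for rect in block_rects(page.get_text("blocks")):
            page.draw_rect(pymupdf.Rect(rect), color=(1, 0, 0), width=0.5)
    doc.save(dst_path)
    doc.close()


if __name__ == "__main__" and len(sys.argv) == 3:
    outline_blocks(sys.argv[1], sys.argv[2])
```

Comparing the rectangles drawn this way against the visible layout of the page is one way to check whether the reported coordinates match what is expected.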

PyMuPDF version

1.25.2

Operating system

Linux

Python version

3.12

Labels: not a bug (user error / unable to reproduce)