Page value 0-based and 1-based are not unified #3814

@zhu1205

Description of the bug

In the dest returned by Page.get_links() and Document.get_toc(), page values are not consistent: they are 0-based for kind=LINK_GOTO and LINK_GOTOR, but 1-based for kind=LINK_NAMED. Is this by design or a bug? If it is by design, what is the reason?

The documentation at https://pymupdf.readthedocs.io/en/latest/document.html#Document.get_toc specifies the dest item returned by Document.get_toc() as follows:

dest – (dict) included only if simple=False. Contains details of the TOC item as follows:

    kind: destination kind, see [Link Destination Kinds](https://pymupdf.readthedocs.io/en/latest/vars.html#linkdest-kinds).

    file: filename if kind is [LINK_GOTOR](https://pymupdf.readthedocs.io/en/latest/vars.html#LINK_GOTOR) or [LINK_LAUNCH](https://pymupdf.readthedocs.io/en/latest/vars.html#LINK_LAUNCH).

    page: target page, 0-based, [LINK_GOTOR](https://pymupdf.readthedocs.io/en/latest/vars.html#LINK_GOTOR) or [LINK_GOTO](https://pymupdf.readthedocs.io/en/latest/vars.html#LINK_GOTO) only.

How to reproduce the bug

ECR Cross Account.pdf
kind=LINK_NAMED, and page is 1-based in the dest returned by Document.get_toc()

pdf-example-bookmarks.pdf
kind=LINK_GOTO, and page is 0-based in the dest returned by Document.get_toc()
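
To make the comparison above easier to repeat, here is a minimal sketch that groups TOC entries by destination kind and records the difference between each item's own page number (the third element of a TOC entry, documented as 1-based) and its dest["page"]. The helper names `page_offsets_by_kind` and `load_toc` are hypothetical and not part of PyMuPDF. The numeric kind values are assumed to match `pymupdf.LINK_GOTO`, `pymupdf.LINK_NAMED` and `pymupdf.LINK_GOTOR`. The offset is only meaningful for destinations inside the same file, so LINK_GOTOR results should be read with care.

```python
from collections import defaultdict

# Assumption: these numeric values mirror pymupdf.LINK_GOTO,
# pymupdf.LINK_NAMED and pymupdf.LINK_GOTOR.
KIND_NAMES = {1: "LINK_GOTO", 4: "LINK_NAMED", 5: "LINK_GOTOR"}


def page_offsets_by_kind(toc):
    """Return {kind name: set of (toc page - dest page)} for a TOC.

    `toc` is a list of [level, title, page, dest] entries, the shape
    returned by Document.get_toc(simple=False).
    An offset of 1 suggests dest["page"] is 0-based; 0 suggests 1-based.
    """
    offsets = defaultdict(set)
    for _level, _title, page, dest in toc:
        dest_page = dest.get("page")
        if not isinstance(dest_page, int):
            continue  # this entry carries no numeric page in dest
        kind = dest.get("kind")
        offsets[KIND_NAMES.get(kind, kind)].add(page - dest_page)
    return dict(offsets)


def load_toc(path):
    """Read the full TOC of a PDF (hypothetical helper, needs PyMuPDF)."""
    import pymupdf  # imported lazily so the helper above works without it

    doc = pymupdf.open(path)
    try:
        return doc.get_toc(simple=False)
    finally:
        doc.close()
```

Calling `page_offsets_by_kind(load_toc("pdf-example-bookmarks.pdf"))` and the same for "ECR Cross Account.pdf" should show whether the two kinds really use different offsets in the attached files.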

PyMuPDF version

1.24.9

Operating system

Linux

Python version

3.8

    Labels

    not a bug (not a bug / user error / unable to reproduce)
