Spans detected in page.get_text("dict") fails in a weird pdf format 

### Description of the bug

As far as I know, **page.get_text("dict")** gives access to span bounding boxes (a span being a run of text that shares the same font style). However, some PDFs are encoded in an unusual way, and PyMuPDF does not seem to detect spans correctly for these files.
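
For reference, the nesting assumed here is: blocks contain lines, lines contain spans, and each span carries `text`, `font`, `size`, `flags`, and `bbox`. The sketch below walks a hand-written dictionary of that shape so that it runs without PyMuPDF. The sample values and the helper name `iter_spans` are made up for illustration and are not taken from either PDF.

```python
def iter_spans(page_dict):
    """Yield (block_no, line_no, span) for every text span in a "dict"-style result."""
    for block_no, block in enumerate(page_dict.get("blocks", [])):
        # Image blocks (type 1) have no "lines" key, so default to an empty list.
        for line_no, line in enumerate(block.get("lines", [])):
            for span in line.get("spans", []):
                yield block_no, line_no, span


# Hypothetical sample in the assumed shape of page.get_text("dict").
sample_page = {
    "blocks": [
        {"type": 0, "bbox": (72, 72, 200, 86), "lines": [
            {"bbox": (72, 72, 200, 86), "spans": [
                {"text": "Hello ", "font": "Helvetica", "size": 11.0,
                 "flags": 0, "bbox": (72, 72, 104, 86)},
                {"text": "world", "font": "Helvetica-Bold", "size": 11.0,
                 "flags": 16, "bbox": (104, 72, 136, 86)},
            ]},
        ]},
        {"type": 1, "bbox": (72, 100, 200, 200)},  # image block: no "lines"
    ]
}

for block_no, line_no, span in iter_spans(sample_page):
    print(block_no, line_no, repr(span["text"]), span["font"], span["size"], span["bbox"])
```

With a real page, `iter_spans(page.get_text("dict"))` should behave the same way, which makes it easy to compare how many spans each line is split into for the two files.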

### How to reproduce the bug

When I run the span detection code below on [no_bug.pdf](https://github.com/user-attachments/files/16620700/no_bug.pdf), it works fine and the spans are detected fairly accurately, as shown in this image:
<img width="952" alt="no_bug" src="https://github.com/user-attachments/assets/ea6a628c-44ec-4af0-88f8-5dac3afb2405">

```python
import sys

import fitz  # PyMuPDF
from PIL import Image, ImageDraw


def visualize_span_bbox(page, img_path, dpi=200):
    """Draw every span's bounding box on a rendering of the page and save it."""
    # PDF coordinates are in points (72 per inch), so scale them to the target DPI.
    zoom = dpi / 72.0
    mat = fitz.Matrix(zoom, zoom)

    # Render the page to an RGB image at the requested DPI.
    pix = page.get_pixmap(matrix=mat)
    img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
    draw = ImageDraw.Draw(img)

    for block in page.get_text("dict")["blocks"]:
        # Image blocks have no "lines" key, so skip them instead of raising.
        for line in block.get("lines", []):
            for span in line["spans"]:
                span_bbox = [int(coord * zoom) for coord in span["bbox"]]
                text = span["text"]
                fontname = span["font"]
                size = span["size"]

                # Guess the style from the font name alone.
                is_bold = "Bold" in fontname
                is_italic = "Italic" in fontname or "Oblique" in fontname
                print(f"{text} : {fontname}: {size} : ({is_bold}, {is_italic})")
                draw.rectangle(span_bbox, outline="blue", width=2)

    # Save the image with the span bounding boxes drawn on it.
    img.save(img_path)


if __name__ == "__main__":
    doc = fitz.open(sys.argv[1])
    for page in doc:
        print("Text from page %i:" % page.number)
        # One output file per page, so later pages do not overwrite earlier ones.
        visualize_span_bbox(page, f"result_{page.number}.png", dpi=200)
```

But when I switch to this PDF ([bug.pdf](https://github.com/user-attachments/files/16620827/bug.pdf)), the output breaks: the bounding boxes are split up in a strange way, and the bold/italic detection is also inaccurate:
<img width="952" alt="bug" src="https://github.com/user-attachments/assets/07d62811-0ef8-4b67-9e3c-c62cecb7f526">
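
The bold/italic test above only looks at the font name, which can miss fonts whose names do not contain "Bold" or "Italic". As far as I understand the PyMuPDF documentation, each span also has an integer `flags` field in which bit 1 (value 2) marks italic and bit 4 (value 16) marks bold. The sketch below compares the two signals; the helper name `span_style` and the example font names are hypothetical, and the lowercase name matching is my own choice rather than anything PyMuPDF does.

```python
# Bit values as described for the span "flags" field in the PyMuPDF docs.
FLAG_SUPERSCRIPT = 1 << 0
FLAG_ITALIC = 1 << 1
FLAG_SERIFED = 1 << 2
FLAG_MONOSPACED = 1 << 3
FLAG_BOLD = 1 << 4


def span_style(fontname, flags):
    """Report bold/italic as seen by the font name and by the flags bits."""
    name = fontname.lower()  # case-insensitive match (an assumption of this sketch)
    return {
        "bold_by_name": "bold" in name,
        "italic_by_name": "italic" in name or "oblique" in name,
        "bold_by_flags": bool(flags & FLAG_BOLD),
        "italic_by_flags": bool(flags & FLAG_ITALIC),
    }


# Hypothetical examples: a descriptive name, and an opaque name with flags set.
print(span_style("Helvetica-Bold", 16))
print(span_style("ABCDEF+F1", 18))
```

The flags come from the font information stored in the PDF, so they may be unreliable as well if the file is encoded unusually. A mismatch between the two signals would at least show which spans are affected.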

Can someone tell me why this happens?

### PyMuPDF version

1.24.9

### Operating system

Linux

### Python version

3.10
