Identify different sections in a PDF

### Description of the bug

I want to identify the different sections in a PDF for later use.
Consider the following content with 2 sections:

**1.1 Uitgangspunten
	Deze service, EftOphfCrOvkORCA mag enkel vanuit UPF aangeroepen worden. Er is daarom
	ook geen publieke service omschrijving beschikbaar.

1.2 Controles
	Alle beschreven meldingen in dit document worden in de monitor-tabel gezet.
	De volgende (standaard) controles worden altijd uitgevoerd.
	•	Elk gegeven moet het juiste formaat hebben.
		Melding: ‘Ongeldige invoer: <invoerveld> heeft niet het juiste formaat.’**

The heading "1.1 Uitgangspunten" and the 2 lines after it form the first section. The heading "1.2 Controles" and the 4 lines after it form the second section.
Does PyMuPDF set any unique value for each section when reading the page? I can't find anything like that in the properties.

I'm using the following code:

```
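import fitz  # PyMuPDF

doc = fitz.open("input.pdf")  # placeholder file name
page = doc[0]                 # first page, for illustration
page_blocks = page.get_text("dict")["blocks"]
for block in page_blocks:
    if "lines" not in block:  # image blocks have no "lines" key
        continue
    for line in block["lines"]:
        for span in line["spans"]:
            text = span["text"]  # text of one span within the line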
```
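
To make the goal concrete, here is a minimal sketch of the grouping I am after. It does not use anything PyMuPDF provides for sections. It assumes that a heading is a line starting with a dotted number such as "1.1", and the names `HEADING_RE` and `group_sections` are made up for illustration. The input is a plain list of line texts, for example the span texts of each line joined together from the loop above.

```python
import re

# Assumption: a section heading starts with a dotted number ("1.1", "2.3.4")
# followed by a title. This is a heuristic, not a value reported by PyMuPDF.
# Headings without a dot (e.g. "1 Introduction") are not detected.
HEADING_RE = re.compile(r"^\s*(\d+(?:\.\d+)+)\s+(\S.*)$")


def group_sections(lines):
    """Group text lines into sections keyed by their numbered heading."""
    sections = []
    current = None
    for line in lines:
        match = HEADING_RE.match(line)
        if match:
            # A new heading starts a new section.
            current = {
                "number": match.group(1),
                "title": match.group(2).strip(),
                "body": [],
            }
            sections.append(current)
        elif current is not None and line.strip():
            # Non-empty lines after a heading belong to that section.
            # Lines before the first heading are ignored.
            current["body"].append(line.strip())
    return sections


# Line texts as they appear in the example content of this issue.
sample = [
    "1.1 Uitgangspunten",
    "Deze service, EftOphfCrOvkORCA mag enkel vanuit UPF aangeroepen worden. Er is daarom",
    "ook geen publieke service omschrijving beschikbaar.",
    "",
    "1.2 Controles",
    "Alle beschreven meldingen in dit document worden in de monitor-tabel gezet.",
    "De volgende (standaard) controles worden altijd uitgevoerd.",
    "• Elk gegeven moet het juiste formaat hebben.",
    "Melding: ‘Ongeldige invoer: <invoerveld> heeft niet het juiste formaat.’",
]

sections = group_sections(sample)
for section in sections:
    print(section["number"], section["title"], len(section["body"]))
```

For the example content this yields two sections, "1.1" with 2 body lines and "1.2" with 4 body lines. A regex on the text is fragile when body text happens to start with a dotted number, which is why a value set by PyMuPDF would be preferable if one exists.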

@JorjMcKie Please help with this.

### How to reproduce the bug

NA

### PyMuPDF version

1.23.25

### Operating system

Windows

### Python version

3.8

Identify different sections in a PDF #3311
