Parsing with layout detection fails if `region_maxsize` is not specified in config

### System Info

WSL Ubuntu 24.04.1 LTS

### Who can help?

_No response_

### Information

- [x] The official example scripts
- [ ] My own modified scripts

### Reproduction

1. Put `test.pdf` into the project root directory.
2. Create `config.yaml` with the following content (the `layout` section is copied from the [config.yaml example](https://github.com/zai-org/GLM-OCR/blob/main/glmocr/config.yaml)):
```yaml
pipeline:
  maas:
    enabled: false
  
  ocr_api:
    api_host: localhost
    api_port: 11434
    api_path: /api/generate  # Use Ollama native endpoint
    model: glm-ocr:latest    # Required: specify model name
    api_mode: ollama_generate  # Required: use Ollama native format
  
  enable_layout: true
  # Layout detection settings (used when enable_layout=true)
  layout:
    # PP-DocLayoutV3 model directory
    # Can be a local folder or a Hugging Face model id
    # (Use *_safetensors for Transformers; PaddlePaddle/PP-DocLayoutV3 is a PaddleOCR export)
    model_dir: PaddlePaddle/PP-DocLayoutV3_safetensors

    # Detection threshold
    threshold: 0.3
    # threshold_by_class:           # per-class threshold override
    #   0: 0.5
    #   1: 0.3
    #   text: 0.5
    #   table: 0.2

    # Processing
    # batch_size: max images per model forward pass (reduce to 1 if OOM)
    batch_size: 1
    workers: 1
    cuda_visible_devices: "0"
    # img_size: null                # resize input (optional)

    # Post-processing
    layout_nms: true
    layout_unclip_ratio:
      - 1.0
      - 1.0

    # Merge mode for overlapping bboxes: "large" or "small"
    # Can be a single value or per-class dict
    layout_merge_bboxes_mode:
      0: large # abstract
      1: large # algorithm
      2: large # aside_text
      3: large # chart
      4: large # content
      5: large # display_formula
      6: large # doc_title
      7: large # figure_title
      8: large # footer
      9: large # footer_image
      10: large # footnote
      11: large # formula_number
      12: large # header
      13: large # header_image
      14: large # image
      15: large # inline_formula
      16: large # number
      17: large # paragraph_title
      18: small # reference
      19: large # reference_content
      20: large # seal
      21: large # table
      22: large # text
      23: large # vertical_text
      24: large # vision_footnote

    # Map detected labels to OCR task types
    # - text/table/formula: OCR with corresponding prompt
    # - skip: keep region but don't OCR (e.g., images)
    # - abandon: discard region entirely
    label_task_mapping:
      text:
        - abstract
        - algorithm
        - content
        - doc_title
        - figure_title
        - paragraph_title
        - reference_content
        - text
        - vertical_text
        - vision_footnote
        - seal
        - formula_number
      table:
        - table
      formula:
        - display_formula
        - inline_formula
      skip:
        - chart
        - image
      abandon:
        - header
        - footer
        - number
        - footnote
        - aside_text
        - reference
        - footer_image
        - header_image

    # Map label index to label name
    id2label:
      0: abstract
      1: algorithm
      2: aside_text
      3: chart
      4: content
      5: display_formula
      6: doc_title
      7: figure_title
      8: footer
      9: footer_image
      10: footnote
      11: formula_number
      12: header
      13: header_image
      14: image
      15: inline_formula
      16: number
      17: paragraph_title
      18: reference
      19: reference_content
      20: seal
      21: table
      22: text
      23: vertical_text
      24: vision_footnote
```
3. Run glmocr (self-hosted version):
```bash
glmocr parse test.pdf --config config.yaml
```
4. Get the following error:
```
Exception in thread Thread-3 (layout_detection_thread):
Traceback (most recent call last):
  File "/home/username/glm-ocr/glmocr/pipeline/pipeline.py", line 357, in layout_detection_thread
    self._stream_process_layout_batch(
  File "/home/username/glm-ocr/glmocr/pipeline/pipeline.py", line 636, in _stream_process_layout_batch
    region_queue.put(
  File "/home/username/.local/share/uv/python/cpython-3.12.9-linux-x86_64-gnu/lib/python3.12/queue.py", line 134, in put
    if self.maxsize > 0:
       ^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'NoneType' and 'int'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/username/.local/share/uv/python/cpython-3.12.9-linux-x86_64-gnu/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "/home/username/.local/share/uv/python/cpython-3.12.9-linux-x86_64-gnu/lib/python3.12/threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "/home/username/glm-ocr/glmocr/pipeline/pipeline.py", line 392, in layout_detection_thread
    state.region_queue.put(("error", None, None))
  File "/home/username/.local/share/uv/python/cpython-3.12.9-linux-x86_64-gnu/lib/python3.12/queue.py", line 134, in put
    if self.maxsize > 0:
       ^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'NoneType' and 'int'
```
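
The traceback points at `queue.Queue.put` comparing `self.maxsize > 0` while `maxsize` is `None`, which is what happens when a queue is constructed with `maxsize=None`. The sketch below reproduces that failure with the standard library alone and shows one possible guard. It assumes the pipeline passes the unset `region_maxsize` value straight to `queue.Queue`; `make_region_queue` is a hypothetical helper for illustration, not a function in glmocr, and falling back to `0` (unbounded) is only one possible default.

```python
import queue


def make_region_queue(region_maxsize=None):
    """Hypothetical helper: build a queue that tolerates an unset size.

    In Python's queue module, maxsize <= 0 means "unbounded", so None is
    mapped to 0 here. The choice of default is an assumption.
    """
    size = 0 if region_maxsize is None else int(region_maxsize)
    return queue.Queue(maxsize=size)


# Reproduce the reported failure: the constructor accepts None without
# complaint, and the TypeError only surfaces on the first put().
broken = queue.Queue(maxsize=None)
try:
    broken.put(("region", None, None))
    reproduced = False
except TypeError:
    reproduced = True

# With the guard, the same put() succeeds.
safe = make_region_queue(None)
safe.put(("error", None, None))
```

Because the constructor does not validate `maxsize`, the error shows up inside the worker thread rather than at startup, which would also explain the second exception raised while the thread tries to enqueue its error marker.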

### Expected behavior

`glmocr parse` should complete successfully even when `region_maxsize` is not specified in the config.
