Parallelize OCR block processing and add optional CUDA preprocessing #8
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: df510233a3
for block in blocks:
def process_block(block: dict) -> dict | None:
Avoid Python 3.10-only union syntax in bridge
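The new annotation -> dict | None on process_block in run_ocr uses PEP 604 union syntax, which is only supported at runtime on Python 3.10+. The expression itself parses on older versions, but on Python 3.9 and earlier the annotation is evaluated when the def statement executes and raises a TypeError (unsupported operand type(s) for |), so the bridge fails on runtimes that previously worked. This is a functional regression: OCR cannot run at all in those environments, and the README still documents Python 3.8+ support.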
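One way to address this, sketched here rather than taken from the PR, is to spell the return type with typing.Optional, which works on Python 3.8+. The function body below is a hypothetical stand-in for the real per-block logic; only the signature mirrors the diff.

```python
from typing import Optional


def process_block(block: dict) -> Optional[dict]:
    """Return a processed block, or None when the block should be skipped.

    Hypothetical body for illustration only; the real logic lives in the PR.
    """
    if not block.get("text"):
        return None
    # Copy the block so the caller's dict is left untouched.
    return {**block, "text": block["text"].strip()}
```

Adding from __future__ import annotations at the top of the module is an alternative: it defers annotation evaluation, so dict | None in a signature no longer raises on older interpreters.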
Motivation
Description
- ocr/bridge/ocr_bridge.py to process OCR blocks in parallel using ThreadPoolExecutor and DOCSTRUCT_OCR_WORKERS to control concurrency while preserving deterministic output ordering.
- _to_grayscale and _otsu_threshold that use OpenCV CUDA if DOCSTRUCT_OCR_USE_CUDA=1 is set and CUDA bindings are available, otherwise silently fall back to CPU implementations.
- … pix2tex) handling intact and re-applied them to the ordered results to preserve existing OCR logic.
- README.md (DOCSTRUCT_OCR_WORKERS, DOCSTRUCT_OCR_USE_CUDA, and existing DOCSTRUCT_BRIDGE/DOCSTRUCT_PYTHON).
Testing
- python3 -m py_compile ocr/bridge/ocr_bridge.py, which completed successfully.
- cargo test -q, and the tests completed successfully (all automated tests passed).
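The parallel block processing mentioned in the Description can be sketched as follows. This is a minimal illustration, not the PR's code: worker_count, run_blocks, the default of 4 workers, and the body of process_block are all assumptions made for the example. Only ThreadPoolExecutor and the DOCSTRUCT_OCR_WORKERS variable come from the description.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def worker_count(default: int = 4) -> int:
    """Read DOCSTRUCT_OCR_WORKERS; use `default` (an assumed value) on bad input."""
    raw = os.environ.get("DOCSTRUCT_OCR_WORKERS", "")
    try:
        return max(1, int(raw))  # never allow fewer than one worker
    except ValueError:
        return default


def process_block(block: dict) -> Optional[dict]:
    # Hypothetical stand-in for the per-block OCR work.
    if not block.get("text"):
        return None
    return {"id": block["id"], "text": block["text"].upper()}


def run_blocks(blocks: List[dict]) -> List[dict]:
    # Executor.map yields results in input order, regardless of which
    # worker finishes first, so the output ordering stays deterministic.
    with ThreadPoolExecutor(max_workers=worker_count()) as pool:
        results = list(pool.map(process_block, blocks))
    # Drop skipped blocks after ordering has been fixed.
    return [r for r in results if r is not None]
```

Using Executor.map rather than as_completed is the design choice that keeps ordering stable: results come back in submission order, so no separate index bookkeeping is needed.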