The OCR process in the inference_schema and document_data_capture steps is currently too slow.
The current approach relies on a vLLM for OCR, which is causing noticeable delays. Using an external OCR service instead might improve performance.
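
One way to try this without committing to a provider would be to put the OCR call behind a small switchable interface and time each backend on the same documents. The sketch below is only an illustration under assumptions: `vllm_ocr`, `external_ocr`, `OCR_BACKENDS`, and `run_ocr` are hypothetical names, not existing code in either step, and the two backend functions are placeholders that would need to be wired to the real calls.

```python
import time
from typing import Callable, Dict, Tuple

# An OCR backend is any callable that maps raw document bytes to extracted text.
OcrBackend = Callable[[bytes], str]


def vllm_ocr(document: bytes) -> str:
    """Hypothetical placeholder for the current vLLM-based OCR call."""
    raise NotImplementedError("wire this to the existing vLLM-based OCR")


def external_ocr(document: bytes) -> str:
    """Hypothetical placeholder for a call to an external OCR service."""
    raise NotImplementedError("wire this to the chosen external OCR service")


# Hypothetical registry so a step can pick a backend by name (e.g. from config).
OCR_BACKENDS: Dict[str, OcrBackend] = {
    "vllm": vllm_ocr,
    "external": external_ocr,
}


def run_ocr(document: bytes, backend: str = "vllm") -> Tuple[str, float]:
    """Run OCR with the named backend and return (text, elapsed_seconds)."""
    ocr = OCR_BACKENDS[backend]  # raises KeyError for an unknown backend name
    start = time.perf_counter()
    text = ocr(document)
    elapsed = time.perf_counter() - start
    return text, elapsed
```

With something like this in place, both steps could call `run_ocr(document, backend=...)` and log the elapsed time, which would give a direct comparison before switching the default.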