fix: update DocIntel default and surface OCR failures by imadreamerboy · Pull Request #1642 · microsoft/markitdown

imadreamerboy · 2026-03-27T13:54:24Z

This PR fixes Azure Document Intelligence handling for image OCR in markitdown.

There were two separate problems:

- DocumentIntelligenceConverter still defaulted to api_version="2024-07-31-preview", which can fail on valid Azure resources with 404 Resource not found during begin_analyze_document(...).
- That failure could be masked by fallback behavior in MarkItDown._convert(): after the DocIntel converter failed, ImageConverter could return an empty DocumentConverterResult(markdown=""), and markitdown treated that as a successful conversion. The caller then saw result.text_content == "" instead of the real Azure error.
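
To make the second problem concrete, here is a minimal, self-contained model of the masking behavior. It does not import markitdown: Result, docintel_converter, image_converter, and convert_old are hypothetical stand-ins written for illustration, and the real MarkItDown._convert() is more involved.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Result:
    # Stand-in for DocumentConverterResult; only the markdown text is modeled.
    markdown: str


def docintel_converter(data: bytes) -> Optional[Result]:
    # Simulates the Azure call failing under the old preview API version.
    raise RuntimeError("404 Resource not found")


def image_converter(data: bytes) -> Optional[Result]:
    # Simulates an image fallback that finds no text to extract.
    return Result(markdown="")


def convert_old(
    converters: List[Callable[[bytes], Optional[Result]]], data: bytes
) -> Result:
    failures: List[Exception] = []
    for converter in converters:
        try:
            result = converter(data)
        except Exception as exc:
            # Recorded, but discarded if any later converter returns a result.
            failures.append(exc)
            continue
        if result is not None:
            # An empty result counts as success here, which hides the failure.
            return result
    raise RuntimeError(f"conversion failed: {failures}")


result = convert_old([docintel_converter, image_converter], b"fake image bytes")
print(repr(result.markdown))  # prints ''
```

In this model the caller receives an empty string and never sees the 404, which is the misleading outcome described above.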

Changes

- Updated the default Azure Document Intelligence API version from 2024-07-31-preview to 2024-11-30
- Kept explicit docintel_api_version=... override behavior intact
- Changed conversion flow so an empty fallback result does not count as success if an earlier converter already failed
- Added regression tests for:
  - new default DocIntel API version
  - explicit API version override
  - empty image fallback no longer masking a prior converter failure
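
The changes above can be sketched roughly as follows. This is an illustrative model under stated assumptions, not the code in this PR: resolve_api_version, Result, ConversionFailed, and convert_fixed are hypothetical names, and treating only an exactly empty string as "empty" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# New default named in this PR.
DEFAULT_DOCINTEL_API_VERSION = "2024-11-30"


def resolve_api_version(explicit: Optional[str] = None) -> str:
    # Hypothetical helper: an explicit override wins, otherwise use the default.
    return explicit if explicit is not None else DEFAULT_DOCINTEL_API_VERSION


@dataclass
class Result:
    # Stand-in for DocumentConverterResult; only the markdown text is modeled.
    markdown: str


class ConversionFailed(Exception):
    """Hypothetical error that carries the earlier converter failures."""

    def __init__(self, failures: List[Exception]):
        super().__init__("; ".join(str(f) for f in failures))
        self.failures = failures


def convert_fixed(
    converters: List[Callable[[bytes], Optional[Result]]], data: bytes
) -> Result:
    failures: List[Exception] = []
    for converter in converters:
        try:
            result = converter(data)
        except Exception as exc:
            failures.append(exc)
            continue
        if result is None:
            continue
        if failures and result.markdown == "":
            # Empty output after an earlier failure is not treated as success.
            continue
        return result
    if failures:
        # Surface the real errors instead of an empty result.
        raise ConversionFailed(failures)
    raise ValueError("no converter accepted the input")
```

With a failing first converter and an empty fallback, convert_fixed raises ConversionFailed with the original error text. An empty result with no prior failure is still returned as before.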

Why

This matches current Azure behavior more reliably for OCR/image analysis and fixes a misleading failure mode where real Azure/DocIntel errors were swallowed and surfaced as “no OCR text extracted”.

Validation

Tested with focused pytest coverage for DocIntel and fallback behavior:

- test_docintel_default_api_version
- test_docintel_explicit_api_version
- test_empty_image_fallback_does_not_mask_prior_failure

These pass with the local package under test.

imadreamerboy · 2026-03-27T13:55:21Z

@microsoft-github-policy-service agree

fix: update DocIntel default and surface OCR failures

2561ae2


imadreamerboy wants to merge 1 commit into microsoft:main from imadreamerboy:fix-azure-api-endpoint
