Ocr

Overview

OCR API

Available Operations

process - OCR

process

OCR

Example Usage

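import os

from mistralai.client import Mistral


# Read the API key from the environment rather than hard-coding it.
with Mistral(
    api_key=os.getenv("MISTRAL_API_KEY", ""),
) as mistral:
    # "CX-9" and the URL are placeholder values; substitute your own
    # model name and a reachable document URL.
    res = mistral.ocr.process(model="CX-9", document={
        "type": "document_url",
        "document_url": "https://upset-labourer.net/",
    }, bbox_annotation_format={
        # See Parameters: only json_schema is documented as valid here.
        "type": "text",
    }, document_annotation_format={
        # See Parameters: only json_schema is documented as valid here.
        "type": "text",
    })

    # Handle response
    print(res)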

Parameters

Parameter	Type	Required	Description	Example
`model`	Nullable[str]	✔️	N/A
`document`	models.DocumentUnion	✔️	Document to run OCR on
`id`	Optional[str]	➖	N/A
`pages`	OptionalNullable[models.Pages]	➖	Specific pages to process. Accepts a list of integers or a string of comma-separated numbers and ranges (e.g. '0,1,2' or '0-5' or '0,2-4'). Page numbers start from 0.
`include_image_base64`	OptionalNullable[bool]	➖	Include base64-encoded images in the response
`image_limit`	OptionalNullable[int]	➖	Max images to extract
`image_min_size`	OptionalNullable[int]	➖	Minimum height and width of image to extract
`bbox_annotation_format`	OptionalNullable[models.ResponseFormat]	➖	Structured output class for extracting useful information from each extracted bounding box / image from document. Only json_schema is valid for this field	Example 1: { "type": "text" } Example 2: { "type": "json_object" } Example 3: { "type": "json_schema", "json_schema": { "schema": { "properties": { "name": { "title": "Name", "type": "string" }, "authors": { "items": { "type": "string" }, "title": "Authors", "type": "array" } }, "required": [ "name", "authors" ], "title": "Book", "type": "object", "additionalProperties": false }, "name": "book", "strict": true } }
`document_annotation_format`	OptionalNullable[models.ResponseFormat]	➖	Structured output class for extracting useful information from the entire document. Only json_schema is valid for this field	Example 1: { "type": "text" } Example 2: { "type": "json_object" } Example 3: { "type": "json_schema", "json_schema": { "schema": { "properties": { "name": { "title": "Name", "type": "string" }, "authors": { "items": { "type": "string" }, "title": "Authors", "type": "array" } }, "required": [ "name", "authors" ], "title": "Book", "type": "object", "additionalProperties": false }, "name": "book", "strict": true } }
`document_annotation_prompt`	OptionalNullable[str]	➖	Optional prompt to guide the model in extracting structured output from the entire document. A document_annotation_format must be provided.
`table_format`	OptionalNullable[models.TableFormat]	➖	N/A
`extract_header`	Optional[bool]	➖	N/A
`extract_footer`	Optional[bool]	➖	N/A
`confidence_scores_granularity`	OptionalNullable[models.ConfidenceScoresGranularity]	➖	Granularity for confidence scores: 'word' (per-word scores) or 'page' (aggregate only). Defaults to None (no confidence scores) to keep response payload small.
`retries`	Optional[utils.RetryConfig]	➖	Configuration to override the default retry behavior of the client.
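
The annotation parameters above may be easier to follow as a concrete request. The sketch below builds keyword arguments for `process` using the `json_schema` format from Example 3 together with a `pages` string and a `document_annotation_prompt`. The helpers `book_annotation_format` and `build_ocr_request` are hypothetical names introduced for illustration and are not part of the SDK; the model name and URL are the placeholders from Example Usage, and the prompt text is made up.

```python
from typing import Any, Dict, List, Optional, Union


def book_annotation_format() -> Dict[str, Any]:
    """Return the json_schema response format shown as Example 3 in the table."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": "book",
            "strict": True,
            "schema": {
                "title": "Book",
                "type": "object",
                "properties": {
                    "name": {"title": "Name", "type": "string"},
                    "authors": {
                        "title": "Authors",
                        "type": "array",
                        "items": {"type": "string"},
                    },
                },
                "required": ["name", "authors"],
                "additionalProperties": False,
            },
        },
    }


def build_ocr_request(
    model: str,
    document_url: str,
    pages: Optional[Union[str, List[int]]] = None,
    annotation_prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for mistral.ocr.process (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "model": model,
        "document": {"type": "document_url", "document_url": document_url},
        # The prompt parameter requires a document_annotation_format,
        # so the format is always included in this sketch.
        "document_annotation_format": book_annotation_format(),
    }
    # Optional parameters are only added when the caller supplies them.
    if pages is not None:
        kwargs["pages"] = pages
    if annotation_prompt is not None:
        kwargs["document_annotation_prompt"] = annotation_prompt
    return kwargs


request = build_ocr_request(
    model="CX-9",  # placeholder model name from Example Usage
    document_url="https://upset-labourer.net/",  # placeholder URL
    pages="0,2-4",  # page numbers start from 0
    annotation_prompt="Extract the book title and its authors.",
)
```

With a client opened as in Example Usage, the resulting dictionary could be passed as `mistral.ocr.process(**request)`. Keeping the request assembly separate from the network call makes it possible to inspect or log the arguments before sending them.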

Response

models.OCRResponse
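
This section does not list the fields of `models.OCRResponse`. As a rough sketch, assuming the response exposes a `pages` list whose items carry an `index` and a `markdown` attribute (an assumption to check against the model definition), the per-page text could be combined as follows. `pages_to_markdown` is a hypothetical helper, and the stand-in object exists only so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Any


def pages_to_markdown(response: Any) -> str:
    """Join per-page markdown in page order.

    Assumes response.pages is a list of objects with `index` and
    `markdown` attributes; these names are not confirmed by this section.
    """
    ordered = sorted(response.pages, key=lambda page: page.index)
    return "\n\n".join(page.markdown for page in ordered)


# Stand-in object shaped like the assumed response, deliberately out of order.
fake_response = SimpleNamespace(
    pages=[
        SimpleNamespace(index=1, markdown="Second page"),
        SimpleNamespace(index=0, markdown="# Title"),
    ]
)
combined = pages_to_markdown(fake_response)
```

In real use, `res` from `mistral.ocr.process(...)` would take the place of `fake_response`, provided the assumed attribute names hold.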

Errors

Error Type	Status Code	Content Type
errors.HTTPValidationError	422	application/json
errors.SDKError	4XX, 5XX	*/*
