Open
Description
Python -VV
Python 3.12.8 (main, Jul 24 2025, 15:51:39) [Clang 16.0.0 (clang-1600.0.26.6)]
Pip Freeze
accelerate==1.11.0
aiofiles==24.1.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.2
aiohttp-retry==2.9.1
aiosignal==1.4.0
amqp==5.3.1
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
anyio==4.11.0
appdirs==1.4.4
asttokens==3.0.0
asyncssh==2.21.1
atpublic==6.0.2
attrs==25.4.0
backoff==2.2.1
beautifulsoup4==4.14.2
billiard==4.2.2
black==25.9.0
cachetools==6.2.1
celery==5.5.3
certifi==2025.10.5
cffi==2.0.0
cfgv==3.4.0
chardet==5.2.0
charset-normalizer==3.4.4
click==8.3.0
click-didyoumean==0.3.1
click-plugins==1.1.1.2
click-repl==0.3.0
cloudpathlib==0.23.0
colorama==0.4.6
colorlog==6.10.1
configobj==5.0.9
coverage==7.11.0
cryptography==46.0.3
decorator==5.2.1
dictdiffer==0.9.0
dill==0.4.0
diskcache==5.6.3
distlib==0.4.0
distro==1.9.0
docling==2.58.0
docling-core==2.49.0
docling-ibm-models==3.10.2
docling-parse==4.7.0
dpath==2.2.0
dulwich==0.24.8
dvc==3.63.0
dvc-data==3.16.12
dvc-gs==3.0.2
dvc-http==2.32.0
dvc-objects==5.1.2
dvc-render==1.0.2
dvc-studio-client==0.22.0
dvc-task==0.40.2
dynaconf==3.2.12
entrypoints==0.4
et_xmlfile==2.0.0
eval_type_backport==0.2.2
executing==2.2.1
Faker==37.12.0
fastapi==0.115.14
filelock==3.20.0
filetype==1.2.0
flatten-dict==0.4.2
flufl.lock==8.2.0
frozenlist==1.8.0
fsspec==2025.9.0
funcy==2.0
gcloud-aio-auth==5.4.2
gcloud-aio-storage==9.6.0
gcsfs==2025.9.0
gitdb==4.0.12
GitPython==3.1.45
google-api-core==2.28.1
google-auth==2.42.0
google-auth-oauthlib==1.2.2
google-cloud-core==2.5.0
google-cloud-secret-manager==2.25.0
google-cloud-storage==3.4.1
google-crc32c==1.7.1
google-resumable-media==2.7.2
googleapis-common-protos==1.71.0
grandalf==0.8
grpc-google-iam-v1==0.14.3
grpcio==1.76.0
grpcio-status==1.76.0
gto==1.9.0
h11==0.16.0
hf-xet==1.2.0
httpcore==1.0.9
httpx==0.28.1
huggingface-hub==0.36.0
hydra-core==1.3.2
identify==2.6.15
idna==3.11
importlib_metadata==8.7.0
iniconfig==2.3.0
invoke==2.2.1
ipython==9.6.0
ipython_pygments_lexers==1.1.1
iterative-telemetry==0.0.10
jedi==0.19.2
Jinja2==3.1.6
jiter==0.11.1
joblib==1.5.2
jsonlines==4.0.0
jsonref==1.1.0
jsonschema==4.25.1
jsonschema-specifications==2025.9.1
kombu==5.5.4
langfuse==3.8.1
latex2mathml==3.78.1
lxml==5.4.0
markdown-it-py==4.0.0
marko==2.2.1
MarkupSafe==3.0.3
matplotlib-inline==0.2.1
mdurl==0.1.2
mistralai==1.9.11
mpire==2.10.2
mpmath==1.3.0
multidict==6.7.0
multiprocess==0.70.18
mypy==1.18.2
mypy_extensions==1.1.0
networkx==3.5
nodeenv==1.9.1
numpy==2.3.4
oauthlib==3.3.1
ocrmac==1.0.0
omegaconf==2.3.0
openai==1.109.1
opencv-python==4.11.0.86
openpyxl==3.1.5
opentelemetry-api==1.38.0
opentelemetry-exporter-otlp-proto-common==1.38.0
opentelemetry-exporter-otlp-proto-http==1.38.0
opentelemetry-proto==1.38.0
opentelemetry-sdk==1.38.0
opentelemetry-semantic-conventions==0.59b0
orjson==3.11.4
packaging==25.0
pandas==2.3.3
parso==0.8.5
pathspec==0.12.1
pexpect==4.9.0
pillow==11.3.0
platformdirs==4.5.0
pluggy==1.6.0
polyfactory==2.22.3
pre_commit==4.3.0
prompt_toolkit==3.0.52
propcache==0.4.1
proto-plus==1.26.1
protobuf==6.33.0
psutil==7.1.2
ptyprocess==0.7.0
pure_eval==0.2.3
pyasn1==0.6.1
pyasn1_modules==0.4.1
pyclipper==1.3.0.post6
pycparser==2.23
pydantic==2.12.3
pydantic-settings==2.11.0
pydantic_core==2.41.4
pydot==4.0.1
pygit2==1.19.0
Pygments==2.19.2
pygtrie==2.5.0
PyJWT==2.10.1
pylatexenc==2.10
pyobjc-core==12.0
pyobjc-framework-Cocoa==12.0
pyobjc-framework-CoreML==12.0
pyobjc-framework-Quartz==12.0
pyobjc-framework-Vision==12.0
pyparsing==3.2.5
PyPDF2==3.0.1
pypdfium2==4.30.0
pytest==8.4.2
pytest-asyncio==1.2.0
pytest-cov==7.0.0
pytest-cover==3.0.0
pytest-coverage==0.0
python-dateutil==2.9.0.post0
python-docx==1.2.0
python-dotenv==1.2.1
python-multipart==0.0.20
python-pptx==1.0.2
pytokens==0.2.0
pytz==2025.2
PyYAML==6.0.3
rapidocr==3.4.2
referencing==0.37.0
regex==2025.10.23
reportlab==4.4.4
requests==2.32.5
requests-oauthlib==2.0.0
rich==14.2.0
rpds-py==0.28.0
rsa==4.9.1
rtree==1.4.1
ruamel.yaml==0.18.16
ruamel.yaml.clib==0.2.14
ruff==0.14.2
safetensors==0.6.2
scikit-learn==1.7.2
scipy==1.16.3
scmrepo==3.5.2
semchunk==2.2.2
semver==3.0.4
setuptools==80.9.0
shapely==2.1.2
shellingham==1.5.4
shortuuid==1.0.13
shtab==1.7.2
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
soupsieve==2.8
sqltrie==0.11.2
stack-data==0.6.3
starlette==0.46.2
sympy==1.14.0
tabulate==0.9.0
tenacity==9.1.2
threadpoolctl==3.6.0
tokenizers==0.22.1
tomlkit==0.13.3
torch==2.8.0
torchvision==0.23.0
tqdm==4.67.1
traitlets==5.14.3
transformers==4.57.1
typer==0.19.2
typing-inspection==0.4.2
typing_extensions==4.15.0
tzdata==2025.2
urllib3==2.5.0
uv==0.9.6
uvicorn==0.34.3
vine==5.1.0
virtualenv==20.35.4
voluptuous==0.15.2
wcwidth==0.2.14
wrapt==1.17.3
xlsxwriter==3.2.9
yarl==1.22.0
zc.lockfile==4.0
zipp==3.23.0
Reproduction Steps
1. Run the following script:

import os
from pathlib import Path

from mistralai import Mistral, DocumentURLChunk
from mistralai.extra import response_format_from_pydantic_model
from mistralai.models import OCRResponse
from pydantic import BaseModel

# The original report read these two values from a project-specific
# `settings` object that was not included; environment variables stand in here.
MISTRAL_API_KEY = os.environ["MISTRAL_API_KEY"]
MISTRAL_MODEL = os.environ["MISTRAL_MODEL"]

# Example structured schema
class ExampleSchema(BaseModel):
    title: str | None = None

client = Mistral(api_key=MISTRAL_API_KEY)
path = Path("temp_file.docx")

# Upload the .docx file and get a signed URL for it
uploaded = client.files.upload(
    file={"file_name": path.name, "content": path.read_bytes()},
    purpose="ocr",
)
signed = client.files.get_signed_url(file_id=uploaded.id)

payload = {
    "model": MISTRAL_MODEL,
    "document": DocumentURLChunk(document_url=signed.url),
    "document_annotation_format": response_format_from_pydantic_model(ExampleSchema),
    "include_image_base64": False,
    "image_limit": 0,
    "pages": [0, 1, 2, 4, 5, 6, 7],  # seven pages requested
}

# Process OCR
response: OCRResponse = client.ocr.process(**payload)
print("Processing completed. Response:")
print(response)
Expected Behavior
From what I understand, the expected response is a pydantic OCRResponse model containing a list of 7 OCRPageObjects, one per requested page. Instead, the whole document gets crammed into a single OCRPageObject with index=0.
The expected behavior occurs with .pdf files; the bug only happens with .docx files.
I wonder why this happens, and whether it also affects billing.
Does this get billed as a single page?
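To make the mismatch easy to check, here is a minimal sketch that compares the requested page indices with the indices that actually come back. The helper name `missing_page_indices` and the hard-coded `returned_indices` are illustrative assumptions that mirror the behavior described above; they are not part of the mistralai client.

```python
from typing import Iterable


def missing_page_indices(requested: Iterable[int], returned: Iterable[int]) -> list[int]:
    """Return the requested page indices that are absent from the response."""
    returned_set = set(returned)
    return sorted(i for i in set(requested) if i not in returned_set)


# Hypothetical values mirroring the report: seven pages requested,
# but the .docx response carries a single page object at index 0.
requested_pages = [0, 1, 2, 4, 5, 6, 7]
returned_indices = [0]

missing = missing_page_indices(requested_pages, returned_indices)
print(f"requested={len(requested_pages)} returned={len(returned_indices)} missing={missing}")
```

With a real response, `returned_indices` would presumably be built as `[page.index for page in response.pages]`. For the billing question, the response's usage information (a `pages_processed` count, if it is populated for .docx input) may show how many pages were counted, though that is an assumption I have not verified for this case.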
Additional Context
No response
Suggested Solutions
No response