Ingestion issue with gpt4o: Cannot find nested property 'imageEmbedding' on the resource type 'search.documentFields' #1841

@tlievre

Description

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run ./scripts/prepdocs.sh with the GPT-4o vision feature enabled.

Any log messages given by the failure

vscode ➜ /workspaces/azure-search-openai-demo $ ./scripts/prepdocs.sh
Loading azd .env file from current environment...
Creating Python virtual environment "app/backend/.venv"...
Installing dependencies from "requirements.txt" into virtual environment (in quiet mode)...
Running "prepdocs.py"
Using local files: ./data/*
Ensuring search index gptkbindex exists
Search index gptkbindex already exists
Skipping ./data/EFM8UB20F32G-A-QFP48.pdf, no changes detected.
Ingesting 'IRF530PBF.pdf'
Extracting text from './data/IRF530PBF.pdf' using Azure Document Intelligence
Splitting 'IRF530PBF.pdf' into sections
Each page will be split into smaller chunks of text, but images will be of the entire page.
Section ends with unclosed table, starting next section with the table at page 0 offset 884 table start 503
Section ends with unclosed table, starting next section with the table at page 0 offset 3231 table start 788
Section ends with unclosed table, starting next section with the table at page 6 offset 18513 table start 818
Uploading blob for whole file -> IRF530PBF.pdf
Unable to find arial.ttf or FreeMono.ttf, using default font
Converting page 0 to image and uploading -> IRF530PBF-0.png
Converting page 1 to image and uploading -> IRF530PBF-1.png
Converting page 2 to image and uploading -> IRF530PBF-2.png
Converting page 3 to image and uploading -> IRF530PBF-3.png
Converting page 4 to image and uploading -> IRF530PBF-4.png
Converting page 5 to image and uploading -> IRF530PBF-5.png
Converting page 6 to image and uploading -> IRF530PBF-6.png
Converting page 7 to image and uploading -> IRF530PBF-7.png
Converting page 8 to image and uploading -> IRF530PBF-8.png
Computed embeddings in batch. Batch size: 16, Token count: 6507
Computed embeddings in batch. Batch size: 13, Token count: 4194
Traceback (most recent call last):
File "/workspaces/azure-search-openai-demo/./app/backend/prepdocs.py", line 479, in <module>
loop.run_until_complete(main(ingestion_strategy, setup_index=not args.remove and not args.removeall))
File "/usr/local/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/workspaces/azure-search-openai-demo/./app/backend/prepdocs.py", line 215, in main
await strategy.run()
File "/workspaces/azure-search-openai-demo/app/backend/prepdocslib/filestrategy.py", line 90, in run
await search_manager.update_content(sections, blob_image_embeddings, url=file.url)
File "/workspaces/azure-search-openai-demo/app/backend/prepdocslib/searchmanager.py", line 245, in update_content
await search_client.upload_documents(documents)
File "/workspaces/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/search/documents/aio/_search_client_async.py", line 571, in upload_documents
results = await self.index_documents(batch, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/tracing/decorator_async.py", line 94, in wrapper_use_tracer
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/search/documents/aio/_search_client_async.py", line 670, in index_documents
return await self._index_documents_actions(actions=batch.actions, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/search/documents/aio/_search_client_async.py", line 678, in _index_documents_actions
batch_response = await self._client.documents.index(batch=batch, error_map=error_map, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/tracing/decorator_async.py", line 94, in wrapper_use_tracer
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/search/documents/_generated/aio/operations/_documents_operations.py", line 883, in index
raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: () The request is invalid. Details: Cannot find nested property 'imageEmbedding' on the resource type 'search.documentFields'.
Code:
Message: The request is invalid. Details: Cannot find nested property 'imageEmbedding' on the resource type 'search.documentFields'.
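
A possible reading of this error, stated as an assumption rather than a confirmed diagnosis: the log reports "Search index gptkbindex already exists", so the index may have been created before the vision feature was enabled and may not define an imageEmbedding field, while the uploaded documents now carry one. The sketch below shows that kind of schema check in isolation. The field list and the sample document are hypothetical placeholders, and `fields_missing_from_index` is an illustrative helper, not part of the repository. With the azure-search-documents package, the real field names could be read from `SearchIndexClient.get_index("gptkbindex").fields`.

```python
from typing import Iterable


def fields_missing_from_index(index_fields: Iterable[str], documents: Iterable[dict]) -> set[str]:
    """Return the document keys that the index schema does not define."""
    known = set(index_fields)
    missing: set[str] = set()
    for doc in documents:
        missing.update(key for key in doc if key not in known)
    return missing


# Hypothetical schema of an index created before the vision feature was enabled.
index_fields = ["id", "content", "embedding", "category", "sourcepage", "sourcefile"]

# Hypothetical document shaped like an upload made with the vision feature enabled.
documents = [
    {
        "id": "example-id",
        "content": "example text",
        "embedding": [0.0],
        "imageEmbedding": [0.0],
    },
]

print(sorted(fields_missing_from_index(index_fields, documents)))  # ['imageEmbedding']
```

If the check against the real index reports imageEmbedding as missing, that would be consistent with the failure being independent of which document is ingested.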

Expected/desired behavior

Ingestion should complete without errors.

OS and Version?

Windows 10

azd version?

Versions

azd version 1.9.5

Mention any other details that might be useful

The problem persists even when I ingest a different document.

