running prepdocs.py now produces err: Cannot find nested property 'images' on the resource type 'search.documentFields'. #2718

@dan-hampton

The image feature is not enabled. This worked previously.

(main) $ python ./app/backend/prepdocs.py './data/*' --verbose

`INFO Request URL: 'https://gptkb-.search.windows.net//indexes('gptkbindex')/docs/search.index?api-version=REDACTED' _universal.py:510
Request method: 'POST'
Request headers:

                A body is sent with the request                                                                                                                                                                

[09:10:50] INFO Response status: 400 _universal.py:549
Response headers:

       ERROR    Unhandled exception during ingestion: () The request is invalid. Details: Cannot find nested property 'images' on the resource type 'search.documentFields'.                    prepdocs.py:700
                Code:                                                                                                                                                                                          
                Message: The request is invalid. Details: Cannot find nested property 'images' on the resource type 'search.documentFields'.                                                                   
                ╭───────────────────────────────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────────────────────────────╮                
                │ app\backend\prepdocs.py:698 in <module>                                                                                                   │                
                │                                                                                                                                                                             │                
                │   695 │                                                                                                                                                                     │                
                │   696 │   # Run ingestion with enhanced exception logging for Azure context                                                                                                 │                
                │   697 │   try:                                                                                                                                                              │                
                │ ❱ 698 │   │   loop.run_until_complete(main(ingestion_strategy, setup_index=not args.remove and                                                                              │                
                │       not args.removeall))                                                                                                                                                  │                
                │   699 │   except Exception as e:                                                                                                                                            │                
                │   700 │   │   logger.exception("Unhandled exception during ingestion: %s", e)                                                                                               │                
                │   701 │   │   # If this is an Azure HttpResponseError, try to log request/response context                                                                                  │                
                │                                                                                                                                                                             │                
                │ C:\Python313\Lib\asyncio\base_events.py:720 in run_until_complete                                                                                                           │                
                │                                                                                                                                                                             │                
                │    717 │   │   if not future.done():                                                                                                                                        │                
                │    718 │   │   │   raise RuntimeError('Event loop stopped before Future completed.')                                                                                        │                
                │    719 │   │                                                                                                                                                                │                
                │ ❱  720 │   │   return future.result()                                                                                                                                       │                
                │    721 │                                                                                                                                                                    │                
                │    722 │   def stop(self):                                                                                                                                                  │                
                │    723 │   │   """Stop running the event loop.                                                                                                                              │                
                │                                                                                                                                                                             │                
                │ app\backend\prepdocs.py:402 in main                                                                                                       │                
                │                                                                                                                                                                             │                
                │   399 │   if setup_index:                                                                                                                                                   │                
                │   400 │   │   await strategy.setup()                                                                                                                                        │                
                │   401 │                                                                                                                                                                     │                
                │ ❱ 402 │   await strategy.run()                                                                                                                                              │                
                │   403                                                                                                                                                                       │                
                │   404                                                                                                                                                                       │                
                │   405 if __name__ == "__main__":                                                                                                                                            │                
                │                                                                                                                                                                             │                
                │ app\backend\prepdocslib\filestrategy.py:123 in run                                                                                        │                
                │                                                                                                                                                                             │                
                │   120 │   │   │   │   │   │   file, self.file_processors, self.category, self.blob_manager,                                                                                 │                
                │       self.image_embeddings                                                                                                                                                 │                
                │   121 │   │   │   │   │   )                                                                                                                                                 │                
                │   122 │   │   │   │   │   if sections:                                                                                                                                      │                
                │ ❱ 123 │   │   │   │   │   │   await self.search_manager.update_content(sections, url=file.url)                                                                              │                
                │   124 │   │   │   │   finally:                                                                                                                                              │                
                │   125 │   │   │   │   │   if file:                                                                                                                                          │                
                │   126 │   │   │   │   │   │   file.close()                                                                                                                                  │                
                │                                                                                                                                                                             │                
                │ app\backend\prepdocslib\searchmanager.py:513 in update_content                                                                            │                
                │                                                                                                                                                                             │                
                │   510 │   │   │   │   │   len(documents),                                                                                                                                   │                
                │   511 │   │   │   │   │   self.search_info.index_name,                                                                                                                      │                
                │   512 │   │   │   │   )                                                                                                                                                     │                
                │ ❱ 513 │   │   │   │   await search_client.upload_documents(documents)                                                                                                       │                
                │   514 │                                                                                                                                                                     │                
                │   515 │   async def remove_content(self, path: Optional[str] = None, only_oid: Optional[str] =                                                                              │                
                │       None):                                                                                                                                                                │                
                │   516 │   │   logger.info(                                                                                                                                                  │                
                │                                                                                                                                                                             │                
                │ .venv\Lib\site-packages\azure\search\documents\aio\_search_client_async.py:599 in upload_documents                                        │                
                │                                                                                                                                                                             │                
                │   596 │   │   batch.add_upload_actions(documents)                                                                                                                           │                
                │   597 │   │                                                                                                                                                                 │                
                │   598 │   │   kwargs["headers"] = self._merge_client_headers(kwargs.get("headers"))                                                                                         │                
                │ ❱ 599 │   │   results = await self.index_documents(batch, **kwargs)                                                                                                         │                
                │   600 │   │   return cast(List[IndexingResult], results)                                                                                                                    │                
                │   601 │                                                                                                                                                                     │                
                │   602 │   # pylint:disable=client-method-missing-tracing-decorator-async,                                                                                                   │                
                │       delete-operation-wrong-return-type                                                                                                                                    │                
                │                                                                                                                                                                             │                
                │ .venv\Lib\site-packages\azure\core\tracing\decorator_async.py:94 in wrapper_use_tracer                                                    │                
                │                                                                                                                                                                             │                
                │    91 │   │   │                                                                                                                                                             │                
                │    92 │   │   │   span_impl_type = settings.tracing_implementation()                                                                                                        │                
                │    93 │   │   │   if span_impl_type is None:                                                                                                                                │                
                │ ❱  94 │   │   │   │   return await func(*args, **kwargs)                                                                                                                    │                
                │    95 │   │   │                                                                                                                                                             │                
                │    96 │   │   │   # Merge span is parameter is set, but only if no explicit parent are passed                                                                               │                
                │    97 │   │   │   if merge_span and not passed_in_parent:                                                                                                                   │                
                │                                                                                                                                                                             │                
                │ .venv\Lib\site-packages\azure\search\documents\aio\_search_client_async.py:698 in index_documents                                         │                
                │                                                                                                                                                                             │                
                │   695 │   │                                                                                                                                                                 │                
                │   696 │   │   :raises ~azure.search.documents.RequestEntityTooLargeError: The request is too                                                                                │                
                │       large.                                                                                                                                                                │                
                │   697 │   │   """                                                                                                                                                           │                
                │ ❱ 698 │   │   return await self._index_documents_actions(actions=batch.actions, **kwargs)                                                                                   │                
                │   699 │                                                                                                                                                                     │                
                │   700 │   async def _index_documents_actions(self, actions: List[IndexAction], **kwargs: Any)                                                                               │                
                │       -> List[IndexingResult]:                                                                                                                                              │                
                │   701 │   │   error_map = {413: RequestEntityTooLargeError}                                                                                                                 │                
                │                                                                                                                                                                             │                
                │ .venv\Lib\site-packages\azure\search\documents\aio\_search_client_async.py:706 in _index_documents_actions                                │                
                │                                                                                                                                                                             │                
                │   703 │   │   kwargs["headers"] = self._merge_client_headers(kwargs.get("headers"))                                                                                         │                
                │   704 │   │   batch = IndexBatch(actions=actions)                                                                                                                           │                
                │   705 │   │   try:                                                                                                                                                          │                
                │ ❱ 706 │   │   │   batch_response = await self._client.documents.index(batch=batch,                                                                                          │                
                │       error_map=error_map, **kwargs)                                                                                                                                        │                
                │   707 │   │   │   return cast(List[IndexingResult], batch_response.results)                                                                                                 │                
                │   708 │   │   except RequestEntityTooLargeError:                                                                                                                            │                
                │   709 │   │   │   if len(actions) == 1:                                                                                                                                     │                
                │                                                                                                                                                                             │                
                │ .venv\Lib\site-packages\azure\core\tracing\decorator_async.py:94 in wrapper_use_tracer                                                    │                
                │                                                                                                                                                                             │                
                │    91 │   │   │                                                                                                                                                             │                
                │    92 │   │   │   span_impl_type = settings.tracing_implementation()                                                                                                        │                
                │    93 │   │   │   if span_impl_type is None:                                                                                                                                │                
                │ ❱  94 │   │   │   │   return await func(*args, **kwargs)                                                                                                                    │                
                │    95 │   │   │                                                                                                                                                             │                
                │    96 │   │   │   # Merge span is parameter is set, but only if no explicit parent are passed                                                                               │                
                │    97 │   │   │   if merge_span and not passed_in_parent:                                                                                                                   │                
                │                                                                                                                                                                             │                
                │ .venv\Lib\site-packages\azure\search\documents\_generated\aio\operations\_documents_operations.py:887 in index                            │                
                │                                                                                                                                                                             │                
                │    884 │   │   if response.status_code not in [200, 207]:                                                                                                                   │                
                │    885 │   │   │   map_error(status_code=response.status_code, response=response,                                                                                           │                
                │        error_map=error_map)                                                                                                                                                 │                
                │    886 │   │   │   error = self._deserialize.failsafe_deserialize(_models.ErrorResponse,                                                                                    │                
                │        pipeline_response)                                                                                                                                                   │                
                │ ❱  887 │   │   │   raise HttpResponseError(response=response, model=error)                                                                                                  │                
                │    888 │   │                                                                                                                                                                │                
                │    889 │   │   deserialized = self._deserialize("IndexDocumentsResult",                                                                                                     │                
                │        pipeline_response.http_response)                                                                                                                                     │                
                │    890                                                                                                                                                                      │                
                ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯                
                HttpResponseError: () The request is invalid. Details: Cannot find nested property 'images' on the resource type 'search.documentFields'.                                                      
                Code:                                                                                                                                                                                          
                Message: The request is invalid. Details: Cannot find nested property 'images' on the resource type 'search.documentFields'.                                                                   
       ERROR    Azure response context: status=400 request_id=None url=https://gptkb-.search.windows.net//indexes('gptkbindex')/docs/search.index?api-version=2025-05-01-preview   prepdocs.py:710

Traceback (most recent call last):
  File "app\backend\prepdocs.py", line 698, in <module>
    loop.run_until_complete(main(ingestion_strategy, setup_index=not args.remove and not args.removeall))
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python313\Lib\asyncio\base_events.py", line 720, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "app\backend\prepdocs.py", line 402, in main
    await strategy.run()
  File "app\backend\prepdocslib\filestrategy.py", line 123, in run
    await self.search_manager.update_content(sections, url=file.url)
  File "app\backend\prepdocslib\searchmanager.py", line 513, in update_content
    await search_client.upload_documents(documents)
  File ".venv\Lib\site-packages\azure\search\documents\aio\_search_client_async.py", line 599, in upload_documents
    results = await self.index_documents(batch, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\azure\core\tracing\decorator_async.py", line 94, in wrapper_use_tracer
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\azure\search\documents\aio\_search_client_async.py", line 698, in index_documents
    return await self._index_documents_actions(actions=batch.actions, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\azure\search\documents\aio\_search_client_async.py", line 706, in _index_documents_actions
    batch_response = await self._client.documents.index(batch=batch, error_map=error_map, **kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\azure\core\tracing\decorator_async.py", line 94, in wrapper_use_tracer
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\azure\search\documents\_generated\aio\operations\_documents_operations.py", line 887, in index
    raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: () The request is invalid. Details: Cannot find nested property 'images' on the resource type 'search.documentFields'.
Code:
Message: The request is invalid. Details: Cannot find nested property 'images' on the resource type 'search.documentFields'.
Exception ignored in: <function ClientSession.__del__ at 0x000001C29CF83C40>
Traceback (most recent call last):
  File ".venv\Lib\site-packages\aiohttp\client.py", line 459, in __del__
  File "C:\Python313\Lib\asyncio\base_events.py", line 1897, in call_exception_handler
  File "C:\Python313\Lib\logging\__init__.py", line 1548, in error
  File "C:\Python313\Lib\logging\__init__.py", line 1664, in _log
  File "C:\Python313\Lib\logging\__init__.py", line 1680, in handle
  File "C:\Python313\Lib\logging\__init__.py", line 1736, in callHandlers
  File "C:\Python313\Lib\logging\__init__.py", line 1026, in handle
  File ".venv\Lib\site-packages\rich\logging.py", line 134, in emit
  File "C:\Python313\Lib\logging\__init__.py", line 998, in format
  File "C:\Python313\Lib\logging\__init__.py", line 719, in format
  File "C:\Python313\Lib\logging\__init__.py", line 669, in formatException
  File "C:\Python313\Lib\traceback.py", line 129, in print_exception
  File "C:\Python313\Lib\traceback.py", line 1044, in __init__
  File "C:\Python313\Lib\traceback.py", line 492, in _extract_from_extended_frame_gen
  File "C:\Python313\Lib\traceback.py", line 369, in line
  File "C:\Python313\Lib\traceback.py", line 350, in _set_lines
  File "C:\Python313\Lib\linecache.py", line 25, in getline
  File "C:\Python313\Lib\linecache.py", line 41, in getlines
  File "C:\Python313\Lib\linecache.py", line 88, in updatecache
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <function BaseConnector.__del__ at 0x000001C29CEE5440>
Traceback (most recent call last):
  File ".venv\Lib\site-packages\aiohttp\connector.py", line 388, in __del__
  File "C:\Python313\Lib\asyncio\base_events.py", line 1897, in call_exception_handler
  File "C:\Python313\Lib\logging\__init__.py", line 1548, in error
  File "C:\Python313\Lib\logging\__init__.py", line 1664, in _log
  File "C:\Python313\Lib\logging\__init__.py", line 1680, in handle
  File "C:\Python313\Lib\logging\__init__.py", line 1736, in callHandlers
  File "C:\Python313\Lib\logging\__init__.py", line 1026, in handle
  File ".venv\Lib\site-packages\rich\logging.py", line 134, in emit
  File "C:\Python313\Lib\logging\__init__.py", line 998, in format
  File "C:\Python313\Lib\logging\__init__.py", line 719, in format
  File "C:\Python313\Lib\logging\__init__.py", line 669, in formatException
  File "C:\Python313\Lib\traceback.py", line 129, in print_exception
  File "C:\Python313\Lib\traceback.py", line 1044, in __init__
  File "C:\Python313\Lib\traceback.py", line 492, in _extract_from_extended_frame_gen
  File "C:\Python313\Lib\traceback.py", line 369, in line
  File "C:\Python313\Lib\traceback.py", line 350, in _set_lines
  File "C:\Python313\Lib\linecache.py", line 25, in getline
  File "C:\Python313\Lib\linecache.py", line 41, in getlines
  File "C:\Python313\Lib\linecache.py", line 88, in updatecache
ImportError: sys.meta_path is None, Python is likely shutting down`
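
The 400 response appears to mean that the uploaded documents carry an `images` field that the existing `gptkbindex` schema does not define. That is a guess from the error text, not a confirmed diagnosis. A minimal sketch for checking this before upload, assuming the index field names have already been fetched separately (for example from the index definition returned by `SearchIndexClient.get_index`, not shown here). The helper name `find_unknown_fields` and the sample field names and document are hypothetical and only illustrate the comparison:

```python
from typing import Any, Iterable


def find_unknown_fields(
    documents: Iterable[dict[str, Any]], index_field_names: Iterable[str]
) -> set[str]:
    """Return top-level document keys that the index schema does not define."""
    known = set(index_field_names)
    unknown: set[str] = set()
    for doc in documents:
        # Keys starting with "@" (such as "@search.action") are control
        # properties of the request, not index fields, so they are skipped.
        unknown.update(k for k in doc if not k.startswith("@") and k not in known)
    return unknown


# Hypothetical data: an index created before an 'images' field existed,
# and a document produced by newer ingestion code that includes it.
index_field_names = ["id", "content", "embedding", "category", "sourcepage", "sourcefile"]
documents = [{"id": "file-1-page-0", "content": "example text", "images": []}]

print(find_unknown_fields(documents, index_field_names))  # {'images'}
```

If this reports `images` against the real index, the index was probably created by an older version of the ingestion code and would need its schema updated or the index recreated before `upload_documents` can succeed.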
