How to use AzureOpenAI Embeddings #47
-
Hi, I'm wondering what config changes I need to make to the YAML file so that it uses Azure OpenAI embeddings instead of OpenAI?
Replies: 5 comments 1 reply
-
Hi,
In your config.yaml file you have a section named "embedding" where you can define the endpoint, API key, and model. This should work for Azure.
Martin
-
Hi Martin
Thanks for your work on this; hopefully I can figure this out.
I've tried using the embeddings endpoint in the config.yaml file in many different ways, e.g.
https://xxx.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings?api-version=2023-05-15
or
https://xxx.cognitiveservices.azure.com/
but I always get this error:
[ERROR] EmbeddingWorker error processing batch: 404 Client Error: Resource Not Found for url:.
Any ideas? I'm sure I've configured this correctly. Have you tried using Qdrant Loader with Azure? The Azure endpoint expects a slightly different format from OpenAI's.
Appreciate any suggestions.
Thanks
Sze
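For readers comparing the two URL shapes mentioned above, here is a minimal Python sketch of how an Azure OpenAI embeddings request differs from a plain OpenAI one. It is not Qdrant Loader code: the function names and the `<key>` placeholder are hypothetical, and the resource name, deployment name, and API version are simply the ones quoted in this thread. The differences it illustrates are that Azure puts the deployment name in the path, requires an `api-version` query parameter, and authenticates with an `api-key` header rather than a bearer token.

```python
from urllib.parse import urlencode


def azure_embeddings_request(resource_endpoint, deployment, api_version, api_key):
    """Return (url, headers) for an Azure OpenAI embeddings call (illustrative helper)."""
    base = resource_endpoint.rstrip("/")
    # Azure expects the API version as a query parameter on every call.
    query = urlencode({"api-version": api_version})
    url = f"{base}/openai/deployments/{deployment}/embeddings?{query}"
    # Azure key authentication uses an "api-key" header.
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    return url, headers


def openai_embeddings_request(api_key, base_url="https://api.openai.com/v1"):
    """Return (url, headers) for a plain OpenAI embeddings call, for comparison."""
    url = f"{base_url.rstrip('/')}/embeddings"
    # OpenAI uses a bearer token; the model name goes in the JSON body instead.
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    return url, headers


azure_url, azure_headers = azure_embeddings_request(
    "https://xxx.cognitiveservices.azure.com/",  # resource endpoint from the thread
    "text-embedding-3-large",                    # deployment name from the thread
    "2023-05-15",                                # API version from the thread
    "<key>",                                     # placeholder, not a real key
)
openai_url, openai_headers = openai_embeddings_request("<key>")
print(azure_url)
print(openai_url)
```

A client that appends `/embeddings` to a configured base URL in the OpenAI style, or that drops the query string, would not produce the Azure form, which may be one reason neither variant tried above worked.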
-
Hi Martin
Please find the error logs attached. I've tried many variants of the Azure endpoint, but I always get an error. I will open an issue as you suggested.
Thanks
Sze
On Tue, Aug 26, 2025 at 4:01 PM Martin Papy wrote:
Would you mind sharing the detailed error logs and a sample of your config.yaml file? Maybe open an issue as well.
(qdrant-loader-env) PS C:\Users\sze.ding\OneDrive - Alceon Group\Documents\AI\04. RAG\QLoader> qdrant-loader ingest
04:53:10 [INFO] DefaultChunkingStrategy initialized with modular architecture chunk_overlap=100 chunk_size=1000 chunking_method=intelligent_text_processing has_encoding=True tokenizer=cl100k_base
04:53:11 [INFO] DefaultChunkingStrategy initialized with modular architecture chunk_overlap=100 chunk_size=1000 chunking_method=intelligent_text_processing has_encoding=True tokenizer=cl100k_base
04:53:11 [INFO] Initializing metrics directory at C:\Users\sze.ding\OneDrive - Alceon Group\Documents\AI\04. RAG\QLoader\metrics
04:53:11 [INFO] AsyncIngestionPipeline initialized with new modular architecture
04:53:11 [INFO] Initializing Project Manager
04:53:11 [INFO] Discovered project: support-project (PE Knowledge Base)
04:53:11 [INFO] Project Manager initialized with 1 projects
04:53:11 [INFO] [ROCKET] Starting document ingestion
04:53:11 [INFO] Processing 1 projects
04:53:11 [INFO] [ROCKET] Starting document ingestion
04:53:12 [INFO] Starting file conversion file_path='C:/Users/sze.ding/OneDrive - Alceon Group/Documents/AI/01. Loaded Data/QLTest/[AI] test/1a - Alceon PE overview.pdf'
04:53:14 [INFO] File conversion completed content_length=16796 file_path='C:/Users/sze.ding/OneDrive - Alceon Group/Documents/AI/01. Loaded Data/QLTest/[AI] test/1a - Alceon PE overview.pdf' timeout_used=300
04:53:14 [INFO] File conversion successful file_path='[AI] test/1a - Alceon PE overview.pdf'
04:53:15 [INFO] LocalFile: 1 documents from 1 sources
04:53:15 [INFO] [DOCUMENT] Collected 1 documents from all sources
04:53:15 [INFO] Starting change detection document_count=1
04:53:15 [INFO] Change detection completed deleted_count=0 new_count=1 updated_count=0
04:53:15 [INFO] [SEARCH] Change detection: 1 new, 0 updated, 0 deleted
04:53:15 [INFO] [GEAR] Processing 1 documents through pipeline
04:53:15 [INFO] Starting chunking phase...
04:53:15 [INFO] Chunking completed, transitioning to embedding phase...
04:53:15 [INFO] [TIMER] Chunking phase took 0.00 seconds
04:53:15 [INFO] Embedding phase ready, starting upsert phase...
04:53:15 [INFO] Starting embedding generation...
04:53:15 [INFO] Processing 1 documents for chunking...
04:53:15 [INFO] Using markdown strategy for converted file conversion_method=markitdown document_id=48b52eb7-fd17-1969-7e5b-dfc27343908e document_title='1a - Alceon PE overview.pdf' original_file_type=pdf
04:53:16 [INFO] Chunking 1a - Alceon PE overview.pdf (16,796 chars)
04:53:16 [INFO] Processing document: 1a - Alceon PE overview.pdf (16,796 chars) extra={'estimated_chunks': 19, 'chunk_size': 1000, 'max_chunks_allowed': 1000}
04:53:16 [INFO] Small document detected - minimal splitting extra={'total_headers': 0}
04:53:19 [INFO] Markdown chunking completed for document: 1a - Alceon PE overview.pdf extra={'document_id': '48b52eb7-fd17-1969-7e5b-dfc27343908e', 'total_chunks': 22, 'document_size': 16796, 'avg_chunk_size': 797}
04:53:20 [INFO] Chunking progress: 1/1 documents, 22 chunks generated
04:53:20 [INFO] [CHECK] Chunking completed: 1/1 documents processed, 22 total chunks
04:53:20 [WARNING] High memory usage detected: 95.9%. Running garbage collection...
04:53:20 [WARNING] Network error in embedding batch 1, will retry attempt=1 error='404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings' error_type=HTTPError max_retries=3
04:53:20 [WARNING] Retrying embedding batch 1 after network error attempt=1 delay_seconds=1.0 last_error='404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings' max_retries=3
04:53:22 [WARNING] Network error in embedding batch 1, will retry attempt=2 error='404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings' error_type=HTTPError max_retries=3
04:53:22 [WARNING] Retrying embedding batch 1 after network error attempt=2 delay_seconds=2.0 last_error='404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings' max_retries=3
04:53:24 [WARNING] Network error in embedding batch 1, will retry attempt=3 error='404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings' error_type=HTTPError max_retries=3
04:53:24 [WARNING] Retrying embedding batch 1 after network error attempt=3 delay_seconds=4.0 last_error='404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings' max_retries=3
04:53:28 [ERROR] All retry attempts failed for embedding batch 1 error_type=HTTPError final_error='404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings' total_attempts=4
04:53:28 [ERROR] EmbeddingWorker error processing batch: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] EmbeddingWorker final batch processing failed: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk fa97e279-e8c3-4aaf-ff2d-40cf8fe40209: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 96bfb1f9-f9ba-4eeb-e1fc-ef6b0877915b: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 5fca7642-84c4-664b-1288-bd54305df8ad: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 207dba2f-dfd5-6a85-f574-d8504f696662: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 1f496e61-8b3b-f25a-9037-a1b8c20d8633: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 9cf01c91-f037-fa6a-f277-7cb8cdaa3fea: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 24e65bf5-046f-93e1-5dac-41db1205dab0: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 7a475e83-6e58-0d28-d4af-0122eaede25f: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 1477e892-bc50-40f6-e3bd-e3a34e4bd992: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 62e3793d-cfca-991b-0626-1596cd87ef39: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 7ee7f54a-6a64-146e-d16f-af0dd8fc8e67: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk c5460778-41fe-09d1-53fb-6f9208aba52a: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 6520c283-4a52-f46c-e3a0-bc005d37d7a0: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 082e0756-a2db-3438-b8d1-14c8fcef8bbc: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 0bf9d1a6-d0ea-a872-4780-d802f05b1d32: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk e25d8e36-8916-f5aa-334f-13e7372b4793: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 6cc729ec-d827-cc02-f3d0-30f65c71f0d2: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk ff27781d-10b2-2740-07da-48bdea7369bb: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 1cc0ecf7-1b92-80c9-fd15-d7c1724623bc: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 725760e2-591f-1c10-0f9e-9115d94eb063: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk 74d2b7b8-5d9b-09a3-7cab-9a85c360b30b: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [ERROR] Embedding failed for chunk d6bd161e-6e2a-7706-22ee-c5e8d328ec90: 404 Client Error: Resource Not Found for url: https://alceon-ai-production-resource.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings
04:53:28 [INFO] [CHECK] Embedding completed: 0 chunks processed
04:53:28 [INFO] [TIMER] Embedding + Upsert phase took 13.93 seconds
04:53:28 [INFO] [TIMER] Total pipeline duration: 13.93 seconds
04:53:28 [INFO] [CHECK] Pipeline completed: 0 chunks processed, 0 errors
04:53:28 [INFO] [CHECK] Ingestion completed: 0 chunks processed successfully
04:53:28 [INFO] Completed processing all projects: 1 total documents
04:53:28 [INFO] Cleaning up pipeline resources
04:53:28 [ERROR] Error during pipeline cleanup: float division by zero
04:53:28 [INFO] Pipeline finished, awaiting cleanup.
04:53:29 [INFO] Cleaning up resources...
04:53:29 [INFO] Cleanup completed
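One detail in the log above is that the failing URL ends in `/embeddings` with no query string. The following Python sketch pulls the URL out of such a log line and checks whether it carries the `api-version` parameter that Azure OpenAI requires. It assumes the error message prints the full request URL including any query string (the wording matches the `requests` library's HTTP error messages, which do); the function name and report keys are hypothetical. If that assumption holds, the missing parameter is one plausible cause of the 404, not a confirmed diagnosis.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches the URL that follows "for url:" in the error message; stops at
# whitespace or a closing quote so it also works on the error='...' log lines.
URL_RE = re.compile(r"for url: (https?://[^\s']+)")
DEPLOYMENT_PATH_RE = re.compile(r"/openai/deployments/[^/]+/embeddings")


def check_logged_url(log_line):
    """Return a small report about the URL in a log line, or None if there is none."""
    match = URL_RE.search(log_line)
    if match is None:
        return None
    parts = urlsplit(match.group(1))
    return {
        "url": match.group(1),
        # Azure OpenAI expects ?api-version=... on every request.
        "has_api_version": "api-version" in parse_qs(parts.query),
        # The path should look like /openai/deployments/<deployment>/embeddings.
        "is_deployment_path": DEPLOYMENT_PATH_RE.fullmatch(parts.path) is not None,
    }


line = (
    "04:53:28 [ERROR] EmbeddingWorker error processing batch: 404 Client Error: "
    "Resource Not Found for url: https://alceon-ai-production-resource."
    "cognitiveservices.azure.com/openai/deployments/text-embedding-3-large/embeddings"
)
report = check_logged_url(line)
print(report)
```

For the line shown, the path has the expected deployment shape but no `api-version` is present, so checking how the configured endpoint's query string is handled would be a reasonable next step.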
-
I added support for Azure OpenAI embeddings in version 0.7.1, but to be honest I don't have access to Azure, so it might not fully work as expected :)
Note that I changed the configuration file quite a bit, adding a global.llm section for that purpose.
-
Hi Martin
Thanks, I'll test it on Monday and let you know if there are any issues. I completely appreciate that you're building from the docs rather than against a test API. I can possibly provide test API keys if the changes cause problems.
Sze