Replies: 2 comments
Great work on the Azure APIM integration! This is a common enterprise requirement.
Why APIM matters for enterprises:
Additional suggestions for the implementation:
For the Cohere Reranker:
We help enterprises deploy AI through APIM at Revolution AI. Your implementation looks solid. Would love to see this merged!
Azure APIM integration is essential for enterprise deployments! Your approach looks solid. The custom headers pattern is exactly right:

    default_headers={
        "Ocp-Apim-Subscription-Key": config.llm_api_key
    }

Additional considerations:

1. Rate limiting handling

    # APIM returns 429 with Retry-After header
    from tenacity import retry, stop_after_attempt, wait_exponential

    # Back off exponentially (1s to 60s) and give up after 5 attempts
    # instead of retrying forever.
    @retry(wait=wait_exponential(min=1, max=60), stop=stop_after_attempt(5))
    async def call_with_retry(prompt):
        return await llm.ainvoke(prompt)

2. Multiple backend routing

    default_query={
        "backend-pool": "eastus-westus"  # APIM routing hint
    }

3. Logging/tracing integration

    from uuid import uuid4

    default_headers={
        "Ocp-Apim-Subscription-Key": key,
        "x-request-id": str(uuid4()),  # For APIM logs correlation
    }

4. Environment config

    AZURE_APIM_ENDPOINT=https://your-apim.azure-api.net/openai
    AZURE_APIM_KEY=your-subscription-key
    AZURE_APIM_DEPLOYMENT=gpt-4o
    AZURE_APIM_API_VERSION=2024-02-15-preview

For Cohere Reranker:
We deploy AI through Azure APIM at Revolution AI. Your PR looks like a great addition to Quivr!
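One follow-up on the rate-limiting point: if the 429 response does carry a `Retry-After` header, it may be worth honoring it rather than relying only on exponential backoff. Here is a minimal, standard-library-only sketch of how the delay could be picked. The function name `retry_delay_seconds` and its parameters are hypothetical, and it only handles the delta-seconds form of the header (the HTTP-date form falls back to backoff).

```python
from typing import Mapping


def retry_delay_seconds(
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Return how long to wait before retry number `attempt` (1-based).

    Hypothetical helper: prefers the server's Retry-After hint and
    falls back to capped exponential backoff when the hint is missing
    or not a plain integer number of seconds.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("retry-after")
    if raw is not None:
        try:
            # Delta-seconds form only, e.g. "Retry-After: 7".
            return float(min(max(int(raw.strip()), 0), cap))
        except ValueError:
            pass  # Probably the HTTP-date form; use backoff instead.
    # Fallback: base * 2^(attempt - 1), capped at `cap` seconds.
    return float(min(base * 2 ** (attempt - 1), cap))
```

The result could be passed to `asyncio.sleep` inside a manual retry loop, or adapted into a custom wait strategy for whichever retry library is in use.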
Hello!
First of all, great project!
I was wondering if it's possible to integrate models via Azure APIM, as many enterprises rely on it. You can find simple information on this project here: https://github.com/Azure-Samples/AI-Gateway/tree/main/labs/GPT-4o-inferencing.
My goal is to configure something like this for the llm_endpoint:
And for embeddings:
I'm waiting for Cohere Reranker integration.
In my case, the Azure APIM has a complex URL structure, making it difficult to extract properties directly from the URL. Additionally, we have some extra properties like default_query that need to be handled.
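To illustrate the idea of not parsing the URL at all, here is a rough sketch in which every property is passed explicitly and then turned into client keyword arguments. The class `ApimSettings`, its field names, and the environment variable names are assumptions for illustration, not Quivr's actual configuration schema. Only `default_headers`, `default_query`, and the `Ocp-Apim-Subscription-Key` header come from this discussion; the other output keys would need to be mapped to whatever the target client expects.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping


@dataclass(frozen=True)
class ApimSettings:
    """Hypothetical explicit settings for a model served behind Azure APIM."""

    endpoint: str
    subscription_key: str
    deployment: str
    api_version: str
    # Extra query parameters sent with every request.
    default_query: Dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "ApimSettings":
        # Variable names are assumed; a missing one raises KeyError early.
        return cls(
            endpoint=env["AZURE_APIM_ENDPOINT"].rstrip("/"),
            subscription_key=env["AZURE_APIM_KEY"],
            deployment=env["AZURE_APIM_DEPLOYMENT"],
            api_version=env["AZURE_APIM_API_VERSION"],
        )

    def client_kwargs(self) -> Dict[str, object]:
        # Nothing is inferred from the URL: each value is passed through as given.
        return {
            "base_url": self.endpoint,
            "deployment": self.deployment,
            "api_version": self.api_version,
            "default_headers": {
                "Ocp-Apim-Subscription-Key": self.subscription_key
            },
            "default_query": dict(self.default_query),
        }
```

Usage would look something like `ApimSettings.from_env(os.environ).client_kwargs()`, with the resulting dictionary adapted to the LLM or embeddings client being configured.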
I’ve started working on this here: https://github.com/QuivrHQ/quivr/compare/main...dminier:quivr:feature/azure-apim?expand=1. I've tested it, and it seems to work well.
Thanks!