Commit da24cf4

[Docs] Add to readme: links, getting help, load balancing (#1019)
* Readme improvements
* Update screenshot and search sku message
* Dont use discussions
1 parent 87d15fc commit da24cf4

3 files changed

+59
-30
lines changed

README.md

Lines changed: 22 additions & 9 deletions
@@ -46,9 +46,9 @@ urlFragment: azure-search-openai-demo
4646
- [Customizing the UI and data](#customizing-the-ui-and-data)
4747
- [Productionizing](#productionizing)
4848
- [Resources](#resources)
49-
- [Note](#note)
5049
- [FAQ](#faq)
5150
- [Troubleshooting](#troubleshooting)
51+
- [Getting help](#getting-help)
5252

5353
[![Open in GitHub Codespaces](https://img.shields.io/static/v1?style=for-the-badge&label=GitHub+Codespaces&message=Open&color=brightgreen&logo=github)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=599293758&machine=standardLinux32gb&devcontainer_path=.devcontainer%2Fdevcontainer.json&location=WestUs2)
5454
[![Open in Dev Containers](https://img.shields.io/static/v1?style=for-the-badge&label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/azure-samples/azure-search-openai-demo)
@@ -69,6 +69,8 @@ The repo includes sample data so it's ready to try end to end. In this sample ap
6969

7070
![Chat screen](docs/chatscreen.png)
7171

72+
[📺 Watch a video overview of the app.](https://youtu.be/3acB0OWmLvM)
73+
7274
## Azure account requirements
7375

7476
**IMPORTANT:** In order to deploy and run this example, you'll need:
@@ -195,7 +197,7 @@ When you run `azd up` after and are prompted to select a value for `openAiResour
195197
1. Run `azd env set AZURE_SEARCH_SERVICE_RESOURCE_GROUP {Name of existing resource group with ACS service}`
196198
1. If that resource group is in a different location than the one you'll pick for the `azd up` step,
197199
then run `azd env set AZURE_SEARCH_SERVICE_LOCATION {Location of existing service}`
198-
1. If the search service's SKU is not standard, then run `azd env set AZURE_SEARCH_SERVICE_SKU {Name of SKU}`. The free tier won't work as it doesn't support managed identity. ([See other possible values](https://learn.microsoft.com/azure/templates/microsoft.search/searchservices?pivots=deployment-language-bicep#sku))
200+
1. If the search service's SKU is not standard, then run `azd env set AZURE_SEARCH_SERVICE_SKU {Name of SKU}`. The free tier won't work as it doesn't support managed identity. If your existing search service is using the free tier, you will need to deploy a new service since [search SKUs cannot be changed](https://learn.microsoft.com/azure/search/search-sku-tier#tier-upgrade-or-downgrade). ([See other possible SKU values](https://learn.microsoft.com/azure/templates/microsoft.search/searchservices?pivots=deployment-language-bicep#sku))
199201
1. If you have an existing index that is set up with all the expected fields, then run `azd env set AZURE_SEARCH_INDEX {Name of existing index}`. Otherwise, the `azd up` command will create a new index.
200202

201203
You can also customize the search service (new or existing) for non-English searches:
@@ -310,10 +312,12 @@ to production. Read through our [productionizing guide](docs/productionizing.md)
310312

311313
## Resources
312314

313-
* [Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and AI Search](https://aka.ms/entgptsearchblog)
314-
* [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search)
315-
* [Azure OpenAI Service](https://learn.microsoft.com/azure/cognitive-services/openai/overview)
316-
* [Comparing Azure OpenAI and OpenAI](https://learn.microsoft.com/azure/cognitive-services/openai/overview#comparing-azure-openai-and-openai/)
315+
* [📖 Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and AI Search](https://aka.ms/entgptsearchblog)
316+
* [📖 Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search)
317+
* [📖 Azure OpenAI Service](https://learn.microsoft.com/azure/cognitive-services/openai/overview)
318+
* [📖 Comparing Azure OpenAI and OpenAI](https://learn.microsoft.com/azure/cognitive-services/openai/overview#comparing-azure-openai-and-openai/)
319+
* [📖 Access Control in Generative AI applications with Azure Cognitive Search](https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/access-control-in-generative-ai-applications-with-azure/ba-p/3956408)
320+
* [📺 Quickly build and deploy OpenAI apps on Azure, infused with your own data](https://www.youtube.com/watch?v=j8i-OM5kwiY)
317321

318322
## Clean up
319323

@@ -325,9 +329,6 @@ To clean up all the resources created by this sample:
325329

326330
The resource group and all the resources will be deleted.
327331

328-
### Note
329-
330-
>Note: The PDF documents used in this demo contain information generated using a language model (Azure OpenAI Service). The information contained in these documents is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft makes no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the information contained in this document. All rights reserved to Microsoft.
331332

332333
### FAQ
333334

@@ -430,3 +431,15 @@ Here are the most common failure scenarios and solutions:
430431
1. You see `CERTIFICATE_VERIFY_FAILED` when the `prepdocs.py` script runs. That's typically due to incorrect SSL certificates setup on your machine. Try the suggestions in this [StackOverflow answer](https://stackoverflow.com/questions/35569042/ssl-certificate-verify-failed-with-python3/43855394#43855394).
431432

432433
1. After running `azd up` and visiting the website, you see a '404 Not Found' in the browser. Wait 10 minutes and try again, as it might be still starting up. Then try running `azd deploy` and wait again. If you still encounter errors with the deployed app, consult these [tips for debugging App Service app deployments](http://blog.pamelafox.org/2023/06/tips-for-debugging-flask-deployments-to.html) or watch [this video about downloading App Service logs](https://www.youtube.com/watch?v=f0-aYuvws54). Please file an issue if the logs don't help you resolve the error.
434+
435+
### Getting help
436+
437+
This is a sample built to demonstrate the capabilities of modern Generative AI apps and how they can be built in Azure.
438+
For help with deploying this sample, please post in [GitHub Issues](/issues). If you're a Microsoft employee, you can also post in [our Teams channel](https://aka.ms/azai-python-help).
439+
440+
This repository is supported by the maintainers, _not_ by Microsoft Support,
441+
so please use the support mechanisms described above, and we will do our best to help you out.
442+
443+
### Note
444+
445+
>Note: The PDF documents used in this demo contain information generated using a language model (Azure OpenAI Service). The information contained in these documents is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft makes no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the information contained in this document. All rights reserved to Microsoft.

docs/chatscreen.png

8.68 KB

docs/productionizing.md

Lines changed: 37 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,3 @@
1-
21
# Productionizing the Chat App
32

43
This sample is designed to be a starting point for your own production application,
@@ -7,26 +6,43 @@ to production. Here are some things to consider:
76

87
## Azure resource configuration
98

10-
* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
11-
to approximately 30 conversations per minute (assuming 1K per user message/response).
12-
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity`
13-
parameters in `infra/main.bicep` to your account's maximum capacity.
14-
You can also view the Quotas tab in [Azure OpenAI studio](https://oai.azure.com/)
15-
to understand how much capacity you have.
16-
* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
17-
To improve your resiliency, we recommend using `Standard_ZRS` for production deployments,
18-
which you can specify using the `sku` property under the `storage` module in `infra/main.bicep`.
19-
* **Azure AI Search**: The default search service uses the `Standard` SKU
20-
with the free semantic search option, which gives you 1000 free queries a month.
21-
Assuming your app will experience more than 1000 questions, you should either change `semanticSearch`
22-
to "standard" or disable semantic search entirely in the `/app/backend/approaches` files.
23-
If you see errors about search service capacity being exceeded, you may find it helpful to increase
24-
the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`
25-
or manually scaling it from the Azure Portal.
26-
* **Azure App Service**: The default app service plan uses the `Basic` SKU with 1 CPU core and 1.75 GB RAM.
27-
We recommend using a Premium level SKU, starting with 1 CPU core.
28-
You can use auto-scaling rules or scheduled scaling rules,
29-
and scale up the maximum/minimum based on load.
9+
### OpenAI Capacity
10+
11+
The default TPM (tokens per minute) is set to 30K. That is equivalent
12+
to approximately 30 conversations per minute (assuming 1K per user message/response).
13+
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity`
14+
parameters in `infra/main.bicep` to your account's maximum capacity.
15+
You can also view the Quotas tab in [Azure OpenAI studio](https://oai.azure.com/)
16+
to understand how much capacity you have.
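The conversations-per-minute estimate above is plain division. As a rough sketch, assuming about 1K tokens per user message/response exchange (the function name is hypothetical and not part of the repo):

```python
def conversations_per_minute(tpm: int, tokens_per_exchange: int = 1000) -> int:
    """Estimate how many message/response exchanges fit in a TPM quota.

    Both the function and the 1000-token default are illustrative assumptions;
    real token usage depends on prompt size and retrieved sources.
    """
    if tokens_per_exchange <= 0:
        raise ValueError("tokens_per_exchange must be positive")
    # Integer division: partial exchanges do not count.
    return tpm // tokens_per_exchange


# The default 30K TPM with ~1K tokens per exchange.
print(conversations_per_minute(30_000))
```

If your prompts include long retrieved passages, pass a larger `tokens_per_exchange` to get a more conservative estimate.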
17+
18+
If the maximum TPM isn't enough for your expected load, you have a few options:
19+
20+
* Use a backoff mechanism to retry the request. This is helpful if you're running into a short-term quota due to bursts of activity but aren't over long-term quota. The [tenacity](https://tenacity.readthedocs.io/en/latest/) library is a good option for this, and this [pull request](https://github.com/Azure-Samples/azure-search-openai-demo/pull/500) shows how to apply it to this app.
21+
22+
* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management following [this blog post](https://www.raffertyuy.com/raztype/azure-openai-load-balancing/) or [this repository](https://github.com/andredewes/apim-aoai-smart-loadbalancing). Another approach is to use [LiteLLM's load balancer](https://docs.litellm.ai/docs/providers/azure#azure-api-load-balancing) with Azure Cache for Redis.
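The backoff option above can be sketched with the standard library alone. This is a minimal illustration, not the code from the linked pull request: `RateLimitError` is a hypothetical stand-in for whatever rate-limit (HTTP 429) exception your OpenAI client raises, and `retry_with_backoff` is an assumed helper name. The tenacity library offers the same pattern through its `retry` decorator with wait and stop strategies.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a client's rate-limit (HTTP 429) error."""


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on RateLimitError with jittered exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            # Out of attempts: surface the original error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Upper bound doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            upper = min(max_delay, base_delay * 2**attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, upper))
    raise AssertionError("unreachable")
```

You would wrap the call that hits the OpenAI endpoint in a zero-argument callable and pass it to `retry_with_backoff`. Retrying helps with short bursts only; if you are consistently over quota, the load balancing option is the better fit.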
23+
24+
### Azure Storage
25+
26+
The default storage account uses the `Standard_LRS` SKU.
27+
To improve your resiliency, we recommend using `Standard_ZRS` for production deployments,
28+
which you can specify using the `sku` property under the `storage` module in `infra/main.bicep`.
29+
30+
### Azure AI Search
31+
32+
The default search service uses the `Standard` SKU
33+
with the free semantic search option, which gives you 1000 free queries a month.
34+
Assuming your app will experience more than 1000 questions, you should either change `semanticSearch`
35+
to "standard" or disable semantic search entirely in the `/app/backend/approaches` files.
36+
If you see errors about search service capacity being exceeded, you may find it helpful to increase
37+
the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`
38+
or manually scaling it from the Azure Portal.
39+
40+
### Azure App Service
41+
42+
The default app service plan uses the `Basic` SKU with 1 CPU core and 1.75 GB RAM.
43+
We recommend using a Premium level SKU, starting with 1 CPU core.
44+
You can use auto-scaling rules or scheduled scaling rules,
45+
and scale up the maximum/minimum based on load.
3046

3147
## Additional security measures
3248
