- [Customizing the UI and data](#customizing-the-ui-and-data)
47
47
- [Productionizing](#productionizing)
48
48
- [Resources](#resources)
49
-
- [Note](#note)
50
49
- [FAQ](#faq)
51
50
- [Troubleshooting](#troubleshooting)
51
+
- [Getting help](#getting-help)
52
52
53
53
[Open in GitHub Codespaces](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=599293758&machine=standardLinux32gb&devcontainer_path=.devcontainer%2Fdevcontainer.json&location=WestUs2)
54
54
[Open in Dev Containers](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/azure-samples/azure-search-openai-demo)
@@ -69,6 +69,8 @@ The repo includes sample data so it's ready to try end to end. In this sample ap
69
69
70
70

71
71
72
+
[📺 Watch a video overview of the app.](https://youtu.be/3acB0OWmLvM)
73
+
72
74
## Azure account requirements
73
75
74
76
**IMPORTANT:** In order to deploy and run this example, you'll need:
@@ -195,7 +197,7 @@ When you run `azd up` after and are prompted to select a value for `openAiResour
195
197
1. Run `azd env set AZURE_SEARCH_SERVICE_RESOURCE_GROUP {Name of existing resource group with ACS service}`
196
198
1. If that resource group is in a different location than the one you'll pick for the `azd up` step,
197
199
then run `azd env set AZURE_SEARCH_SERVICE_LOCATION {Location of existing service}`
198
-
1. If the search service's SKU is not standard, then run `azd env set AZURE_SEARCH_SERVICE_SKU {Name of SKU}`. The free tier won't work as it doesn't support managed identity. ([See other possible values](https://learn.microsoft.com/azure/templates/microsoft.search/searchservices?pivots=deployment-language-bicep#sku))
200
+
1. If the search service's SKU is not standard, then run `azd env set AZURE_SEARCH_SERVICE_SKU {Name of SKU}`. The free tier won't work as it doesn't support managed identity. If your existing search service is using the free tier, you will need to deploy a new service since [search SKUs cannot be changed](https://learn.microsoft.com/azure/search/search-sku-tier#tier-upgrade-or-downgrade). ([See other possible SKU values](https://learn.microsoft.com/azure/templates/microsoft.search/searchservices?pivots=deployment-language-bicep#sku))
199
201
1. If you have an existing index that is set up with all the expected fields, then run `azd env set AZURE_SEARCH_INDEX {Name of existing index}`. Otherwise, the `azd up` command will create a new index.
200
202
201
203
You can also customize the search service (new or existing) for non-English searches:
@@ -310,10 +312,12 @@ to production. Read through our [productionizing guide](docs/productionizing.md)
310
312
311
313
## Resources
312
314
313
-
* [Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and AI Search](https://aka.ms/entgptsearchblog)
314
-
* [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search)
* [📖 Comparing Azure OpenAI and OpenAI](https://learn.microsoft.com/azure/cognitive-services/openai/overview#comparing-azure-openai-and-openai/)
319
+
* [📖 Access Control in Generative AI applications with Azure Cognitive Search](https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/access-control-in-generative-ai-applications-with-azure/ba-p/3956408)
320
+
* [📺 Quickly build and deploy OpenAI apps on Azure, infused with your own data](https://www.youtube.com/watch?v=j8i-OM5kwiY)
317
321
318
322
## Clean up
319
323
@@ -325,9 +329,6 @@ To clean up all the resources created by this sample:
325
329
326
330
The resource group and all the resources will be deleted.
327
331
328
-
### Note
329
-
330
-
>Note: The PDF documents used in this demo contain information generated using a language model (Azure OpenAI Service). The information contained in these documents is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft makes no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the information contained in this document. All rights reserved to Microsoft.
331
332
332
333
### FAQ
333
334
@@ -430,3 +431,15 @@ Here are the most common failure scenarios and solutions:
430
431
1. You see `CERTIFICATE_VERIFY_FAILED` when the `prepdocs.py` script runs. That's typically due to incorrect SSL certificates setup on your machine. Try the suggestions in this [StackOverflow answer](https://stackoverflow.com/questions/35569042/ssl-certificate-verify-failed-with-python3/43855394#43855394).
431
432
432
433
1. After running `azd up` and visiting the website, you see a '404 Not Found' in the browser. Wait 10 minutes and try again, as it might be still starting up. Then try running `azd deploy` and wait again. If you still encounter errors with the deployed app, consult these [tips for debugging App Service app deployments](http://blog.pamelafox.org/2023/06/tips-for-debugging-flask-deployments-to.html) or watch [this video about downloading App Service logs](https://www.youtube.com/watch?v=f0-aYuvws54). Please file an issue if the logs don't help you resolve the error.
434
+
435
+
### Getting help
436
+
437
+
This is a sample built to demonstrate the capabilities of modern Generative AI apps and how they can be built in Azure.
438
+
For help with deploying this sample, please post in [GitHub Issues](/issues). If you're a Microsoft employee, you can also post in [our Teams channel](https://aka.ms/azai-python-help).
439
+
440
+
This repository is supported by the maintainers, _not_ by Microsoft Support,
441
+
so please use the support mechanisms described above, and we will do our best to help you out.
442
+
443
+
### Note
444
+
445
+
>Note: The PDF documents used in this demo contain information generated using a language model (Azure OpenAI Service). The information contained in these documents is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft makes no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the information contained in this document. All rights reserved to Microsoft.
docs/productionizing.md
Lines changed: 37 additions & 21 deletions
@@ -1,4 +1,3 @@
1
-
2
1
# Productionizing the Chat App
3
2
4
3
This sample is designed to be a starting point for your own production application,
@@ -7,26 +6,43 @@ to production. Here are some things to consider:
7
6
8
7
## Azure resource configuration
9
8
10
-
* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
11
-
to approximately 30 conversations per minute (assuming 1K per user message/response).
12
-
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity`
13
-
parameters in `infra/main.bicep` to your account's maximum capacity.
14
-
You can also view the Quotas tab in [Azure OpenAI studio](https://oai.azure.com/)
15
-
to understand how much capacity you have.
16
-
* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
17
-
To improve your resiliency, we recommend using `Standard_ZRS` for production deployments,
18
-
which you can specify using the `sku` property under the `storage` module in `infra/main.bicep`.
19
-
* **Azure AI Search**: The default search service uses the `Standard` SKU
20
-
with the free semantic search option, which gives you 1000 free queries a month.
21
-
Assuming your app will experience more than 1000 questions, you should either change `semanticSearch`
22
-
to "standard" or disable semantic search entirely in the `/app/backend/approaches` files.
23
-
If you see errors about search service capacity being exceeded, you may find it helpful to increase
24
-
the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`
25
-
or manually scaling it from the Azure Portal.
26
-
* **Azure App Service**: The default app service plan uses the `Basic` SKU with 1 CPU core and 1.75 GB RAM.
27
-
We recommend using a Premium level SKU, starting with 1 CPU core.
28
-
You can use auto-scaling rules or scheduled scaling rules,
29
-
and scale up the maximum/minimum based on load.
9
+
### OpenAI Capacity
10
+
11
+
The default TPM (tokens per minute) is set to 30K. That is equivalent
12
+
to approximately 30 conversations per minute (assuming 1K per user message/response).
13
+
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity`
14
+
parameters in `infra/main.bicep` to your account's maximum capacity.
15
+
You can also view the Quotas tab in [Azure OpenAI studio](https://oai.azure.com/)
16
+
to understand how much capacity you have.
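
The capacity estimate above is simple division, and a small helper makes it easy to redo for other quotas. This is only a sketch: the function name `conversations_per_minute` and its parameters are hypothetical, and the 1K-token figure is the same rough assumption the text makes about a single user message/response exchange.

```python
def conversations_per_minute(tpm_quota: int, tokens_per_exchange: int = 1_000) -> int:
    """Estimate how many message/response exchanges fit in one minute of quota.

    tpm_quota: tokens-per-minute capacity of the deployment.
    tokens_per_exchange: assumed tokens used by one user message plus its response.
    """
    if tokens_per_exchange <= 0:
        raise ValueError("tokens_per_exchange must be positive")
    # Integer division: a partially funded exchange does not count.
    return tpm_quota // tokens_per_exchange


# The default 30K TPM with roughly 1K tokens per exchange.
print(conversations_per_minute(30_000))  # 30
```

Real exchanges often use more than 1K tokens once retrieved sources are added to the prompt, so treat the result as an upper bound and measure actual usage before sizing capacity.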
17
+
18
+
If the maximum TPM isn't enough for your expected load, you have a few options:
19
+
20
+
* Use a backoff mechanism to retry the request. This is helpful if you're running into a short-term quota due to bursts of activity but aren't over long-term quota. The [tenacity](https://tenacity.readthedocs.io/en/latest/) library is a good option for this, and this [pull request](https://github.com/Azure-Samples/azure-search-openai-demo/pull/500) shows how to apply it to this app.
21
+
22
+
* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management following [this blog post](https://www.raffertyuy.com/raztype/azure-openai-load-balancing/) or [this repository](https://github.com/andredewes/apim-aoai-smart-loadbalancing). Another approach is to use [LiteLLM's load balancer](https://docs.litellm.ai/docs/providers/azure#azure-api-load-balancing) with Azure Cache for Redis.
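
The backoff option in the first bullet can be sketched with the standard library alone. This is an illustration of the general retry pattern, not the code from the linked pull request and not the tenacity API. `RateLimitError`, `retry_with_backoff`, and `flaky_completion` are hypothetical names, and the simulated failure stands in for whatever rate-limit error your OpenAI client raises.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate-limit error an OpenAI client raises."""


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            # The delay doubles on each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 1 second of random jitter keeps clients from retrying in lockstep.
            sleep(delay + random.uniform(0, 1))


# Demo: a fake call that is rate limited twice and then succeeds.
calls = {"count": 0}


def flaky_completion():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("simulated rate limit")
    return "ok"


# A no-op sleep is passed in so the demo finishes instantly.
result = retry_with_backoff(flaky_completion, sleep=lambda seconds: None)
print(result, calls["count"])
```

Backoff only helps with short bursts. If requests are consistently over quota, retries just add latency, which is why the second bullet points to load balancing instead.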
23
+
24
+
### Azure Storage
25
+
26
+
The default storage account uses the `Standard_LRS` SKU.
27
+
To improve your resiliency, we recommend using `Standard_ZRS` for production deployments,
28
+
which you can specify using the `sku` property under the `storage` module in `infra/main.bicep`.
29
+
30
+
### Azure AI Search
31
+
32
+
The default search service uses the `Standard` SKU
33
+
with the free semantic search option, which gives you 1000 free queries a month.
34
+
Assuming your app will experience more than 1000 questions, you should either change `semanticSearch`
35
+
to "standard" or disable semantic search entirely in the `/app/backend/approaches` files.
36
+
If you see errors about search service capacity being exceeded, you may find it helpful to increase
37
+
the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`
38
+
or manually scaling it from the Azure Portal.
39
+
40
+
### Azure App Service
41
+
42
+
The default app service plan uses the `Basic` SKU with 1 CPU core and 1.75 GB RAM.
43
+
We recommend using a Premium level SKU, starting with 1 CPU core.
44
+
You can use auto-scaling rules or scheduled scaling rules,