[Docs] Add to readme: links, getting help, load balancing (#1019)

pamelafox · web-flow · commit da24cf46d7fa · 2023-12-01T13:22:07.000-08:00
* Readme improvements

* Update screenshot and search sku message

* Don't use discussions
diff --git a/README.md b/README.md
@@ -46,9 +46,9 @@ urlFragment: azure-search-openai-demo
 - [Customizing the UI and data](#customizing-the-ui-and-data)
 - [Productionizing](#productionizing)
 - [Resources](#resources)
-  - [Note](#note)
   - [FAQ](#faq)
   - [Troubleshooting](#troubleshooting)
+  - [Getting help](#getting-help)
 
 [![Open in GitHub Codespaces](https://img.shields.io/static/v1?style=for-the-badge&label=GitHub+Codespaces&message=Open&color=brightgreen&logo=github)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=599293758&machine=standardLinux32gb&devcontainer_path=.devcontainer%2Fdevcontainer.json&location=WestUs2)
 [![Open in Dev Containers](https://img.shields.io/static/v1?style=for-the-badge&label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/azure-samples/azure-search-openai-demo)
@@ -69,6 +69,8 @@ The repo includes sample data so it's ready to try end to end. In this sample ap
 
 ![Chat screen](docs/chatscreen.png)
 
+[📺 Watch a video overview of the app.](https://youtu.be/3acB0OWmLvM)
+
 ## Azure account requirements
 
 **IMPORTANT:** In order to deploy and run this example, you'll need:
@@ -195,7 +197,7 @@ When you run `azd up` after and are prompted to select a value for `openAiResour
 1. Run `azd env set AZURE_SEARCH_SERVICE_RESOURCE_GROUP {Name of existing resource group with ACS service}`
 1. If that resource group is in a different location than the one you'll pick for the `azd up` step,
   then run `azd env set AZURE_SEARCH_SERVICE_LOCATION {Location of existing service}`
-1. If the search service's SKU is not standard, then run `azd env set AZURE_SEARCH_SERVICE_SKU {Name of SKU}`. The free tier won't work as it doesn't support managed identity. ([See other possible values](https://learn.microsoft.com/azure/templates/microsoft.search/searchservices?pivots=deployment-language-bicep#sku))
+1. If the search service's SKU is not standard, then run `azd env set AZURE_SEARCH_SERVICE_SKU {Name of SKU}`. The free tier won't work as it doesn't support managed identity. If your existing search service is using the free tier, you will need to deploy a new service since [search SKUs cannot be changed](https://learn.microsoft.com/azure/search/search-sku-tier#tier-upgrade-or-downgrade). ([See other possible SKU values](https://learn.microsoft.com/azure/templates/microsoft.search/searchservices?pivots=deployment-language-bicep#sku))
 1. If you have an existing index that is set up with all the expected fields, then run `azd env set AZURE_SEARCH_INDEX {Name of existing index}`. Otherwise, the `azd up` command will create a new index.
 
 You can also customize the search service (new or existing) for non-English searches:
@@ -310,10 +312,12 @@ to production. Read through our [productionizing guide](docs/productionizing.md)
 
 ## Resources
 
-* [Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and AI Search](https://aka.ms/entgptsearchblog)
-* [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search)
-* [Azure OpenAI Service](https://learn.microsoft.com/azure/cognitive-services/openai/overview)
-* [Comparing Azure OpenAI and OpenAI](https://learn.microsoft.com/azure/cognitive-services/openai/overview#comparing-azure-openai-and-openai/)
+* [📖 Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and AI Search](https://aka.ms/entgptsearchblog)
+* [📖 Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search)
+* [📖 Azure OpenAI Service](https://learn.microsoft.com/azure/cognitive-services/openai/overview)
+* [📖 Comparing Azure OpenAI and OpenAI](https://learn.microsoft.com/azure/cognitive-services/openai/overview#comparing-azure-openai-and-openai/)
+* [📖 Access Control in Generative AI applications with Azure Cognitive Search](https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/access-control-in-generative-ai-applications-with-azure/ba-p/3956408)
+* [📺 Quickly build and deploy OpenAI apps on Azure, infused with your own data](https://www.youtube.com/watch?v=j8i-OM5kwiY)
 
 ## Clean up
 
@@ -325,9 +329,6 @@ To clean up all the resources created by this sample:
 
 The resource group and all the resources will be deleted.
 
-### Note
-
->Note: The PDF documents used in this demo contain information generated using a language model (Azure OpenAI Service). The information contained in these documents is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft makes no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the information contained in this document. All rights reserved to Microsoft.
 
 ### FAQ
 
@@ -430,3 +431,15 @@ Here are the most common failure scenarios and solutions:
 1. You see `CERTIFICATE_VERIFY_FAILED` when the `prepdocs.py` script runs. That's typically due to incorrect SSL certificates setup on your machine. Try the suggestions in this [StackOverflow answer](https://stackoverflow.com/questions/35569042/ssl-certificate-verify-failed-with-python3/43855394#43855394).
 
 1. After running `azd up` and visiting the website, you see a '404 Not Found' in the browser. Wait 10 minutes and try again, as it might still be starting up. Then try running `azd deploy` and wait again. If you still encounter errors with the deployed app, consult these [tips for debugging App Service app deployments](http://blog.pamelafox.org/2023/06/tips-for-debugging-flask-deployments-to.html) or watch [this video about downloading App Service logs](https://www.youtube.com/watch?v=f0-aYuvws54). Please file an issue if the logs don't help you resolve the error.
+
+### Getting help
+
+This is a sample built to demonstrate the capabilities of modern Generative AI apps and how they can be built in Azure.
+For help with deploying this sample, please post in [GitHub Issues](/issues). If you're a Microsoft employee, you can also post in [our Teams channel](https://aka.ms/azai-python-help).
+
+This repository is supported by the maintainers, _not_ by Microsoft Support,
+so please use the support mechanisms described above, and we will do our best to help you out.
+
+### Note
+
+>Note: The PDF documents used in this demo contain information generated using a language model (Azure OpenAI Service). The information contained in these documents is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft makes no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the information contained in this document. All rights reserved to Microsoft.
diff --git a/docs/chatscreen.png b/docs/chatscreen.png
diff --git a/docs/productionizing.md b/docs/productionizing.md
@@ -1,4 +1,3 @@
-
 # Productionizing the Chat App
 
 This sample is designed to be a starting point for your own production application,
@@ -7,26 +6,43 @@ to production. Here are some things to consider:
 
 ## Azure resource configuration
 
-* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
-  to approximately 30 conversations per minute (assuming 1K per user message/response).
-  You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity`
-  parameters in `infra/main.bicep` to your account's maximum capacity.
-  You can also view the Quotas tab in [Azure OpenAI studio](https://oai.azure.com/)
-  to understand how much capacity you have.
-* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
-  To improve your resiliency, we recommend using `Standard_ZRS` for production deployments,
-  which you can specify using the `sku` property under the `storage` module in `infra/main.bicep`.
-* **Azure AI Search**: The default search service uses the `Standard` SKU
-  with the free semantic search option, which gives you 1000 free queries a month.
-  Assuming your app will experience more than 1000 questions, you should either change `semanticSearch`
-  to "standard" or disable semantic search entirely in the `/app/backend/approaches` files.
-  If you see errors about search service capacity being exceeded, you may find it helpful to increase
-  the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`
-  or manually scaling it from the Azure Portal.
-* **Azure App Service**: The default app service plan uses the `Basic` SKU with 1 CPU core and 1.75 GB RAM.
-  We recommend using a Premium level SKU, starting with 1 CPU core.
-  You can use auto-scaling rules or scheduled scaling rules,
-  and scale up the maximum/minimum based on load.
+### OpenAI Capacity
+
+The default TPM (tokens per minute) is set to 30K. That is equivalent
+to approximately 30 conversations per minute (assuming 1K tokens per user message/response).
+You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity`
+parameters in `infra/main.bicep` to your account's maximum capacity.
+You can also view the Quotas tab in [Azure OpenAI studio](https://oai.azure.com/)
+to understand how much capacity you have.
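
The capacity estimate above is simple division, and a minimal sketch makes it easy to redo for other numbers. The function name and the 1K-token default below are illustrative assumptions, not part of the sample's code:

```python
def conversations_per_minute(tpm: int, tokens_per_exchange: int = 1_000) -> int:
    """Estimate how many message/response exchanges fit into one minute of quota.

    Both arguments are planning assumptions: real exchanges vary in size,
    so treat the result as a rough ceiling rather than a guarantee.
    """
    if tokens_per_exchange <= 0:
        raise ValueError("tokens_per_exchange must be positive")
    return tpm // tokens_per_exchange


# The default 30K TPM with roughly 1K tokens per exchange.
print(conversations_per_minute(30_000))
```

If your prompts include long retrieved passages, pass a larger `tokens_per_exchange` to see how quickly the estimate drops.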
+
+If the maximum TPM isn't enough for your expected load, you have a few options:
+
+* Use a backoff mechanism to retry the request. This helps when bursts of activity push you over the short-term quota while your sustained usage stays within the long-term quota. The [tenacity](https://tenacity.readthedocs.io/en/latest/) library is a good option for this, and this [pull request](https://github.com/Azure-Samples/azure-search-openai-demo/pull/500) shows how to apply it to this app.
+
+* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management following [this blog post](https://www.raffertyuy.com/raztype/azure-openai-load-balancing/) or [this repository](https://github.com/andredewes/apim-aoai-smart-loadbalancing). Another approach is to use [LiteLLM's load balancer](https://docs.litellm.ai/docs/providers/azure#azure-api-load-balancing) with Azure Cache for Redis.
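
To illustrate the backoff option, here is a minimal sketch that uses only the standard library rather than tenacity, so it is not the approach taken in the linked pull request. `RateLimitError` and `call_with_backoff` are hypothetical names introduced for this example; in a real app you would catch whatever rate-limit exception your OpenAI client raises.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate-limit error raised by an OpenAI client."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Double the wait ceiling on each attempt, capped at max_delay,
            # and pick a random delay below it so clients do not retry in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.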
+
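
The load-balancing option is usually handled by a gateway such as Azure API Management, as the links above describe. Purely to show the idea, the sketch below rotates requests across several endpoints on the client side. The class name and the endpoint URLs are hypothetical, and this simple round-robin does not track quota or endpoint health the way a real load balancer would.

```python
import itertools


class RoundRobinEndpoints:
    """Cycle through a fixed list of endpoints, handing out one per request."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)

    def next_endpoint(self):
        # Each call returns the next endpoint, wrapping around at the end.
        return next(self._cycle)


# Hypothetical endpoint URLs, for illustration only.
pool = RoundRobinEndpoints([
    "https://example-openai-1.openai.azure.com",
    "https://example-openai-2.openai.azure.com",
])
print(pool.next_endpoint())
print(pool.next_endpoint())
```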
+### Azure Storage
+
+The default storage account uses the `Standard_LRS` SKU.
+To improve your resiliency, we recommend using `Standard_ZRS` for production deployments,
+which you can specify using the `sku` property under the `storage` module in `infra/main.bicep`.
+
+### Azure AI Search
+
+The default search service uses the `Standard` SKU
+with the free semantic search option, which gives you 1000 free queries a month.
+If you expect your app to receive more than 1000 questions a month, you should either change `semanticSearch`
+to "standard" or disable semantic search entirely in the `/app/backend/approaches` files.
+If you see errors about search service capacity being exceeded, you may find it helpful to increase
+the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`
+or manually scaling it from the Azure Portal.
+
+### Azure App Service
+
+The default app service plan uses the `Basic` SKU with 1 CPU core and 1.75 GB RAM.
+We recommend using a Premium-level SKU, starting with 1 CPU core.
+You can use auto-scaling rules or scheduled scaling rules,
+and scale up the maximum/minimum based on load.
 
 ## Additional security measures