Commit 9cdbc1c

Add section about productionizing (#577)
* Remove defaults for getenv
* Remove print
* missing output
* readme section
* Update README with productionizing tips
* Add networking section
* Review feedback from comments
1 parent 2e8adf4 commit 9cdbc1c

1 file changed: +41 -2 lines changed

README.md

Lines changed: 41 additions & 2 deletions
@@ -19,6 +19,7 @@
 - [Enabling authentication](#enabling-authentication)
 - [Using the app](#using-the-app)
 - [Running locally](#running-locally)
+- [Productionizing](#productionizing)
 - [Resources](#resources)
 - [Note](#note)
 - [FAQ](#faq)
@@ -47,7 +48,7 @@ The repo includes sample data so it's ready to try end to end. In this sample ap
 
 > **IMPORTANT:** In order to deploy and run this example, you'll need an **Azure subscription with access enabled for the Azure OpenAI service**. You can request access [here](https://aka.ms/oaiapply). You can also visit [here](https://azure.microsoft.com/free/cognitive-search/) to get some free Azure credits to get you started.
 
-## Azure deployment
+## Azure deployment
 
 ### Cost estimation
 
@@ -166,7 +167,7 @@ To then limit access to a specific set of users or groups, you can follow the st
 
 ## Running locally
 
-You can only run locally **after** having successfully run the `azd up` command.
+You can only run locally **after** having successfully run the `azd up` command. If you haven't yet, follow the steps in [Azure deployment](#azure-deployment) above.
 
 1. Run `azd auth login`
 2. Change dir to `app`
@@ -183,6 +184,44 @@ Once in the web app:
 * Explore citations and sources
 * Click on "settings" to try different options, tweak prompts, etc.
 
+## Productionizing
+
+This sample is designed to be a starting point for your own production application,
+but you should do a thorough review of the security and performance before deploying
+to production. Here are some things to consider:
+
+* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
+to approximately 30 conversations per minute (assuming 1K tokens per user message/response).
+You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity`
+parameters in `infra/main.bicep` to your account's maximum capacity.
+You can also view the Quotas tab in [Azure OpenAI Studio](https://oai.azure.com/)
+to understand how much capacity you have.
+* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
+To improve your resiliency, we recommend using `Standard_ZRS` for production deployments,
+which you can specify using the `sku` property under the `storage` module in `infra/main.bicep`.
+* **Azure Cognitive Search**: The default search service uses the `Standard` SKU
+with the free semantic search option, which gives you 1000 free queries a month.
+If you expect your app to receive more than 1000 questions a month, you should either change `semanticSearch`
+to "standard" or disable semantic search entirely in the `/app/backend/approaches` files.
+If you see errors about search service capacity being exceeded, you may find it helpful to increase
+the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`
+or manually scaling it from the Azure Portal.
+* **Azure App Service**: The default app service plan uses the `Basic` SKU with 1 CPU core and 1.75 GB RAM.
+We recommend using a Premium level SKU, starting with 1 CPU core.
+You can use auto-scaling rules or scheduled scaling rules,
+and scale up the maximum/minimum based on load.
+* **Authentication**: By default, the deployed app is publicly accessible.
+We recommend restricting access to authenticated users.
+See [Enabling authentication](#enabling-authentication) above for how to enable authentication.
+* **Networking**: We recommend deploying inside a Virtual Network. If the app is only for
+internal enterprise use, use a private DNS zone. Also consider using Azure API Management (APIM)
+for firewalls and other forms of protection.
+For more details, read [Azure OpenAI Landing Zone reference architecture](https://techcommunity.microsoft.com/t5/azure-architecture-blog/azure-openai-landing-zone-reference-architecture/ba-p/3882102).
+* **Load testing**: We recommend running a load test for your expected number of users.
+You can use the [locust tool](https://docs.locust.io/) with the `locustfile.py` in this sample
+or set up a load test with Azure Load Testing.
+
+
 ## Resources
 
 * [Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and Cognitive Search](https://aka.ms/entgptsearchblog)
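
The capacity arithmetic in the **OpenAI Capacity** and **Azure Cognitive Search** bullets of the added README section can be sanity-checked with a small calculation. The following is a minimal sketch, not part of the commit: the function names and the 30-day month are assumptions made for illustration, and the constants simply restate the figures quoted in the README text (30K TPM, roughly 1K tokens per message/response, 1000 free semantic queries a month).

```python
# Figures restated from the README text; adjust them for your own deployment.
DEFAULT_TPM = 30_000                     # default tokens-per-minute capacity
TOKENS_PER_EXCHANGE = 1_000              # assumed tokens per user message/response
FREE_SEMANTIC_QUERIES_PER_MONTH = 1_000  # free semantic search allowance


def conversations_per_minute(tpm: int, tokens_per_exchange: int = TOKENS_PER_EXCHANGE) -> int:
    """Whole message/response exchanges that fit into one minute of TPM capacity."""
    if tokens_per_exchange <= 0:
        raise ValueError("tokens_per_exchange must be positive")
    return tpm // tokens_per_exchange


def exceeds_free_semantic_tier(questions_per_day: float, days_per_month: int = 30) -> bool:
    """True if the expected monthly question volume is above the free allowance.

    Hypothetical helper: assumes one semantic query per question and a 30-day month.
    """
    return questions_per_day * days_per_month > FREE_SEMANTIC_QUERIES_PER_MONTH


if __name__ == "__main__":
    print(conversations_per_minute(DEFAULT_TPM))  # 30
    print(exceeds_free_semantic_tier(50))         # True (1500 questions a month)
```

If the first number is below your expected peak load, that suggests raising the deployment capacity parameters; if the second check is true, the free semantic search option is unlikely to be enough.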
