Commit fec5ef2

Update deploy-models-tsuzumi.md
Remove "More inference examples" section per PM instructions. Update PM alias in metadata
1 parent 7a2f039 commit fec5ef2

File tree

1 file changed: +2 −16 lines

articles/ai-studio/how-to/deploy-models-tsuzumi.md

Lines changed: 2 additions & 16 deletions
```diff
@@ -6,8 +6,8 @@ ms.service: azure-ai-studio
 manager: scottpolly
 ms.topic: how-to
 ms.date: 10/24/2024
-ms.reviewer: ssalgado
-reviewer: ssalgadodev
+ms.reviewer: haelhamm
+reviewer: hazemelh
 ms.author: ssalgado
 author: ssalgadodev
 ms.custom: references_regions, generated
@@ -1322,20 +1322,6 @@ The following example shows how to handle events when the model detects harmful
 
 ::: zone-end
 
-## More inference examples
-
-For more examples of how to use Tsuzumi models, see the following examples and tutorials:
-
-| Description                               | Language   | Sample                                                             |
-|-------------------------------------------|------------|--------------------------------------------------------------------|
-| CURL request                              | Bash       | [Link](https://aka.ms/meta-llama-3.1-405B-instruct-webrequests)    |
-| Azure AI Inference package for JavaScript | JavaScript | [Link](https://aka.ms/azsdk/azure-ai-inference/javascript/samples) |
-| Azure AI Inference package for Python     | Python     | [Link](https://aka.ms/azsdk/azure-ai-inference/python/samples)     |
-| Python web requests                       | Python     | [Link](https://aka.ms/meta-llama-3.1-405B-instruct-webrequests)    |
-| OpenAI SDK (experimental)                 | Python     | [Link](https://aka.ms/meta-llama-3.1-405B-instruct-openai)         |
-| LangChain                                 | Python     | [Link](https://aka.ms/meta-llama-3.1-405B-instruct-langchain)      |
-| LiteLLM                                   | Python     | [Link](https://aka.ms/meta-llama-3.1-405B-instruct-litellm)        |
-
 ## Cost and quota considerations for tsuzumi models deployed as serverless API endpoints
 
 Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
```
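The quota context above notes per-deployment limits of 200,000 tokens per minute and 1,000 API requests per minute. A client that hits those limits typically receives a throttling response and retries with exponential backoff. The sketch below illustrates that pattern only; the `RateLimitError` type, the `with_backoff` helper, and the delay schedule are assumptions for illustration, not part of the documented Tsuzumi API.

```python
import time


class RateLimitError(Exception):
    """Stand-in for an HTTP 429 (throttled) response from a serverless endpoint."""


def with_backoff(fn, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff.

    Delays grow as base_delay * 2**attempt (1s, 2s, 4s, 8s by default).
    The final failure is re-raised to the caller.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)


# Hypothetical usage: a call that is throttled twice before succeeding.
calls = {"n": 0}

def flaky_call():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise RateLimitError()
    return "ok"

result = with_backoff(flaky_call, sleep=lambda s: None)  # skip real sleeping in this demo
```

In a real client, `fn` would wrap the HTTP request to the deployment and the backoff keeps the request rate under the per-minute quota instead of failing immediately.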

0 commit comments

Comments
 (0)