
Commit fd34f15

Merge pull request #41 from MicrosoftDocs/main
8/29/2024 AM Publish
2 parents 946f97f + 14e9159 commit fd34f15

29 files changed: +77 -543 lines

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+# Operating System files
+.DS_Store
+Thumbs.db
+
 log/
 obj/
 _site/

articles/ai-services/openai/how-to/deployment-types.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Azure OpenAI offers three types of deployments. These provide a varied level of
 
 | **Offering** | **Global-Batch** | **Global-Standard** | **Standard** | **Provisioned** |
 |---|:---|:---|:---|:---|
-| **Best suited for** | Offline scoring <br><br> Workloads that are not latency sensitive and can be completed in hours.<br><br> For use cases that do not have data processing residency requirements.| Recommended starting place for customers. <br><br>Global-Standard will have the higher default quota and larger number of models available than Standard. <br><br> For production applications that do not have data processing residency requirements. | For customers with data residency requirements. Optimized for low to medium volume. | Real-time scoring for large consistent volume. Includes the highest commitments and limits.|
+| **Best suited for** | Offline scoring <br><br> Workloads that are not latency sensitive and can be completed in hours.<br><br> For use cases that do not have data processing residency requirements.| Recommended starting place for customers. <br><br>Global-Standard will have the higher default quota and larger number of models available than Standard. | For customers with data residency requirements. Optimized for low to medium volume. | Real-time scoring for large consistent volume. Includes the highest commitments and limits.|
 | **How it works** | Offline processing via files |Traffic may be routed anywhere in the world | | |
 | **Getting started** | [Global-Batch](./batch.md) | [Model deployment](./create-resource.md) | [Model deployment](./create-resource.md) | [Provisioned onboarding](./provisioned-throughput-onboarding.md) |
 | **Cost** | [Least expensive option](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) <br> 50% less cost compared to Global Standard prices. Access to all new models with larger quota allocations. | [Global deployment pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | [Regional pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | May experience cost savings for consistent usage |

articles/ai-studio/how-to/costs-plan-manage.md

Lines changed: 4 additions & 1 deletion
@@ -18,7 +18,10 @@ author: Blackmist
 
 [!INCLUDE [Feature preview](~/reusable-content/ce-skilling/azure/includes/ai-studio/includes/feature-preview.md)]
 
-This article describes how you plan for and manage costs for Azure AI Studio. First, you use the Azure pricing calculator to help plan for Azure AI Studio costs before you add any resources for the service to estimate costs. Next, as you add Azure resources, review the estimated costs.
+This article describes how you plan for and manage costs for Azure AI Studio. First, you use the Azure pricing calculator to help plan for Azure AI Studio costs before you add any resources for the service to estimate costs. Next, as you add Azure resources, review the estimated costs.
+
+> [!TIP]
+> Azure AI Studio does not have a specific page in the Azure pricing calculator. Azure AI Studio is composed of several other Azure services, some of which are optional. This article provides information on using the pricing calculator to estimate costs for these services.
 
 You use Azure AI services in Azure AI Studio. Costs for Azure AI services are only a portion of the monthly costs in your Azure bill. You're billed for all Azure services and resources used in your Azure subscription, including the third-party services.

articles/ai-studio/how-to/deploy-models-cohere-command.md

Lines changed: 5 additions & 3 deletions
@@ -231,7 +231,7 @@ print_stream(result)
 Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
 
 ```python
-from azure.ai.inference.models import ChatCompletionsResponseFormat
+from azure.ai.inference.models import ChatCompletionsResponseFormatText
 
 response = client.complete(
     messages=[
@@ -244,7 +244,7 @@ response = client.complete(
     stop=["<|endoftext|>"],
     temperature=0,
     top_p=1,
-    response_format={ "type": ChatCompletionsResponseFormat.TEXT },
+    response_format=ChatCompletionsResponseFormatText(),
 )
 ```
 
@@ -256,13 +256,15 @@ Cohere Command chat models can create JSON outputs. Set `response_format` to `js
 
 
 ```python
+from azure.ai.inference.models import ChatCompletionsResponseFormatJSON
+
 response = client.complete(
     messages=[
         SystemMessage(content="You are a helpful assistant that always generate responses in JSON format, using."
             " the following format: { ""answer"": ""response"" }."),
         UserMessage(content="How many languages are in the world?"),
     ],
-    response_format={ "type": ChatCompletionsResponseFormat.JSON_OBJECT }
+    response_format=ChatCompletionsResponseFormatJSON()
 )
 ```

articles/ai-studio/how-to/deploy-models-cohere-embed.md

Lines changed: 4 additions & 4 deletions
@@ -617,12 +617,12 @@ Cohere Embed V3 models can optimize the embeddings based on its use case.
 
 | Description | Language | Sample |
 |-------------------------------------------|-------------------|-----------------------------------------------------------------|
-| Web requests | Bash | [Command-R](https://aka.ms/samples/cohere-command-r/webrequests) - [Command-R+](https://aka.ms/samples/cohere-command-r-plus/webrequests) |
+| Web requests | Bash | [cohere-embed.ipynb](https://aka.ms/samples/embed-v3/webrequests) |
 | Azure AI Inference package for JavaScript | JavaScript | [Link](https://aka.ms/azsdk/azure-ai-inference/javascript/samples) |
 | Azure AI Inference package for Python | Python | [Link](https://aka.ms/azsdk/azure-ai-inference/python/samples) |
-| OpenAI SDK (experimental) | Python | [Link](https://aka.ms/samples/cohere-command/openaisdk) |
-| LangChain | Python | [Link](https://aka.ms/samples/cohere/langchain) |
-| Cohere SDK | Python | [Link](https://aka.ms/samples/cohere-python-sdk) |
+| OpenAI SDK (experimental) | Python | [Link](https://aka.ms/samples/cohere-embed/openaisdk) |
+| LangChain | Python | [Link](https://aka.ms/samples/cohere-embed/langchain) |
+| Cohere SDK | Python | [Link](https://aka.ms/samples/cohere-embed/cohere-python-sdk) |
 | LiteLLM SDK | Python | [Link](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/litellm.ipynb) |
 
 #### Retrieval Augmented Generation (RAG) and tool use samples

articles/ai-studio/how-to/deploy-models-jais.md

Lines changed: 2 additions & 2 deletions
@@ -201,7 +201,7 @@ print_stream(result)
 Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
 
 ```python
-from azure.ai.inference.models import ChatCompletionsResponseFormat
+from azure.ai.inference.models import ChatCompletionsResponseFormatText
 
 response = client.complete(
     messages=[
@@ -214,7 +214,7 @@ response = client.complete(
     stop=["<|endoftext|>"],
     temperature=0,
     top_p=1,
-    response_format={ "type": ChatCompletionsResponseFormat.TEXT },
+    response_format=ChatCompletionsResponseFormatText(),
 )
 ```

articles/ai-studio/how-to/deploy-models-llama.md

Lines changed: 2 additions & 2 deletions
@@ -255,7 +255,7 @@ print_stream(result)
 Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
 
 ```python
-from azure.ai.inference.models import ChatCompletionsResponseFormat
+from azure.ai.inference.models import ChatCompletionsResponseFormatText
 
 response = client.complete(
     messages=[
@@ -268,7 +268,7 @@ response = client.complete(
     stop=["<|endoftext|>"],
     temperature=0,
     top_p=1,
-    response_format={ "type": ChatCompletionsResponseFormat.TEXT },
+    response_format=ChatCompletionsResponseFormatText(),
 )
 ```

articles/ai-studio/how-to/deploy-models-mistral-nemo.md

Lines changed: 5 additions & 3 deletions
@@ -209,7 +209,7 @@ print_stream(result)
 Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
 
 ```python
-from azure.ai.inference.models import ChatCompletionsResponseFormat
+from azure.ai.inference.models import ChatCompletionsResponseFormatText
 
 response = client.complete(
     messages=[
@@ -222,7 +222,7 @@ response = client.complete(
     stop=["<|endoftext|>"],
     temperature=0,
     top_p=1,
-    response_format={ "type": ChatCompletionsResponseFormat.TEXT },
+    response_format=ChatCompletionsResponseFormatText(),
 )
 ```
 
@@ -234,13 +234,15 @@ Mistral Nemo chat model can create JSON outputs. Set `response_format` to `json_
 
 
 ```python
+from azure.ai.inference.models import ChatCompletionsResponseFormatJSON
+
 response = client.complete(
     messages=[
         SystemMessage(content="You are a helpful assistant that always generate responses in JSON format, using."
             " the following format: { ""answer"": ""response"" }."),
         UserMessage(content="How many languages are in the world?"),
     ],
-    response_format={ "type": ChatCompletionsResponseFormat.JSON_OBJECT }
+    response_format=ChatCompletionsResponseFormatJSON()
 )
 ```

articles/ai-studio/how-to/deploy-models-mistral-open.md

Lines changed: 2 additions & 2 deletions
@@ -257,7 +257,7 @@ print_stream(result)
 Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
 
 ```python
-from azure.ai.inference.models import ChatCompletionsResponseFormat
+from azure.ai.inference.models import ChatCompletionsResponseFormatText
 
 response = client.complete(
     messages=[
@@ -270,7 +270,7 @@ response = client.complete(
     stop=["<|endoftext|>"],
     temperature=0,
     top_p=1,
-    response_format={ "type": ChatCompletionsResponseFormat.TEXT },
+    response_format=ChatCompletionsResponseFormatText(),
 )
 ```

articles/ai-studio/how-to/deploy-models-mistral.md

Lines changed: 5 additions & 3 deletions
@@ -239,7 +239,7 @@ print_stream(result)
 Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
 
 ```python
-from azure.ai.inference.models import ChatCompletionsResponseFormat
+from azure.ai.inference.models import ChatCompletionsResponseFormatText
 
 response = client.complete(
     messages=[
@@ -252,7 +252,7 @@ response = client.complete(
     stop=["<|endoftext|>"],
     temperature=0,
     top_p=1,
-    response_format={ "type": ChatCompletionsResponseFormat.TEXT },
+    response_format=ChatCompletionsResponseFormatText(),
 )
 ```
 
@@ -264,13 +264,15 @@ Mistral premium chat models can create JSON outputs. Set `response_format` to `j
 
 
 ```python
+from azure.ai.inference.models import ChatCompletionsResponseFormatJSON
+
 response = client.complete(
     messages=[
         SystemMessage(content="You are a helpful assistant that always generate responses in JSON format, using."
             " the following format: { ""answer"": ""response"" }."),
         UserMessage(content="How many languages are in the world?"),
     ],
-    response_format={ "type": ChatCompletionsResponseFormat.JSON_OBJECT }
+    response_format=ChatCompletionsResponseFormatJSON()
 )
 ```
