Commit 6bb9e47

Some minor updates to package & samples README.md files (#35971)

1 parent acd606f commit 6bb9e47

2 files changed: +38 -24 lines changed

sdk/ai/azure-ai-inference/README.md

Lines changed: 22 additions & 14 deletions
@@ -1,6 +1,6 @@
# Azure AI Inference client library for Python

-The client Library (in preview) does inference, including chat completions, for AI models deployed by [Azure AI Studio](https://ai.azure.com) and [Azure Machine Learning Studio](https://ml.azure.com/). It supports Serverless API endpoints and Managed Compute Endpoints (formerly known as Managed Online Endpoints). The client library makes services calls using REST API version `2024-05-01-preview`, as documented in [Azure AI Model Inference API](https://learn.microsoft.com/azure/ai-studio/reference/reference-model-inference-api). For more information see [Overview: Deploy models, flows, and web apps with Azure AI Studio](https://learn.microsoft.com/azure/ai-studio/concepts/deployments-overview).
+The client library (in preview) does inference, including chat completions, for AI models deployed by [Azure AI Studio](https://ai.azure.com) and [Azure Machine Learning Studio](https://ml.azure.com/). It supports Serverless API endpoints and Managed Compute endpoints (formerly known as Managed Online Endpoints). The client library makes service calls using REST API version `2024-05-01-preview`, as documented in [Azure AI Model Inference API](https://learn.microsoft.com/azure/ai-studio/reference/reference-model-inference-api). For more information, see [Overview: Deploy models, flows, and web apps with Azure AI Studio](https://learn.microsoft.com/azure/ai-studio/concepts/deployments-overview).

Use the model inference client library to:

@@ -181,7 +181,7 @@ In the following sections you will find simple examples of:
* [Text Embeddings](#text-embeddings-example)
<!-- * [Image Embeddings](#image-embeddings-example) -->

-The examples create a synchronous client as mentioned in [Create and authenticate clients](#create-and-authenticate-clients). Only mandatory input settings are shown for simplicity.
+The examples create a synchronous client as mentioned in [Create and authenticate a client directly, using key](#create-and-authenticate-a-client-directly-using-key). Only mandatory input settings are shown for simplicity.

See the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder for full working samples for synchronous and asynchronous clients.
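For reference, a minimal sketch of the client creation these examples assume (key authentication; `endpoint` and `key` are placeholders you define as described earlier in this README):

```python
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# Synchronous chat completions client authenticated with a key.
# `endpoint` and `key` are assumed to be defined, as described in this README.
client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key),
)
```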

@@ -388,7 +388,7 @@ To generate embeddings for additional phrases, simply call `client.embed` multip
### Exceptions

-The `complete`, `embed` and `get_model_info` methods on the clients raise an [HttpResponseError](https://learn.microsoft.com/python/api/azure-core/azure.core.exceptions.httpresponseerror) exception for a non-success HTTP status code response from the service. The exception's `status_code` will be the HTTP response status code. The exception's `error.message` contains a detailed message that will allow you to diagnose the issue:
+The `complete`, `embed` and `get_model_info` methods on the clients raise an [HttpResponseError](https://learn.microsoft.com/python/api/azure-core/azure.core.exceptions.httpresponseerror) exception for a non-success HTTP status code response from the service. The exception's `status_code` will hold the HTTP response status code (with `reason` showing the friendly name). The exception's `error.message` contains a detailed message that may be helpful in diagnosing the issue:

```python
from azure.core.exceptions import HttpResponseError
@@ -399,6 +399,7 @@ try:
    result = client.complete( ... )
except HttpResponseError as e:
    print(f"Status code: {e.status_code} ({e.reason})")
+    print(f"{e.message}")
```

For example, when you provide a wrong authentication key:
@@ -408,7 +409,7 @@ Status code: 401 (Unauthorized)
Operation returned an invalid status 'Unauthorized'
```

-Or for example when you created an `EmbeddingsClient` and called `embed` on the client, but the endpoint does not
+Or when you create an `EmbeddingsClient` and call `embed` on the client, but the endpoint does not
support the `/embeddings` route:

```text
@@ -442,18 +443,25 @@ formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")
handler.setFormatter(formatter)
```

-By default logs redact the values of URL query strings, the values of some HTTP request and response headers (including `Authorization` which holds the key or token), and the request and response payloads. To create logs without redaction, set the method argument `logging_enable = True` when you construct the client library, or when you call any of the client's `create` methods.
+By default logs redact the values of URL query strings, the values of some HTTP request and response headers (including `Authorization`, which holds the key or token), and the request and response payloads. To create logs without redaction, do these two things:

-```python
-# Create a chat completions client with non redacted logs
-client = ChatCompletionsClient(
-    endpoint=endpoint,
-    credential=AzureKeyCredential(key),
-    logging_enable=True
-)
-```
+1. Set the method argument `logging_enable = True` when you construct the client library, or when you call the client's `complete` or `embed` methods.
+    ```python
+    client = ChatCompletionsClient(
+        endpoint=endpoint,
+        credential=AzureKeyCredential(key),
+        logging_enable=True
+    )
+    ```
+1. Set the log level to `logging.DEBUG`. Logs will be redacted with any other log level.
+
+Be sure to protect non-redacted logs to avoid compromising security.
+
+For more information, see [Configure logging in the Azure libraries for Python](https://aka.ms/azsdk/python/logging).
+
+### Reporting issues

-None redacted logs are generated for log level `logging.DEBUG` only. Be sure to protect non redacted logs to avoid compromising security. For more information see [Configure logging in the Azure libraries for Python](https://aka.ms/azsdk/python/logging)
+To report issues with the client library, or to request additional features, please open a GitHub issue [here](https://github.com/Azure/azure-sdk-for-python/issues).

## Next steps
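As a minimal sketch of the two steps above taken together (assuming the standard `azure` logger that the Azure SDK libraries log to):

```python
import sys
import logging

# Step 2: set the log level to DEBUG; logs stay redacted at any other level.
logger = logging.getLogger("azure")
logger.setLevel(logging.DEBUG)

# Send the logs to stdout, using the formatter shown earlier in this section.
handler = logging.StreamHandler(stream=sys.stdout)
handler.setFormatter(logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)

# Step 1 happens when constructing the client: pass logging_enable=True.
```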

sdk/ai/azure-ai-inference/samples/README.md

Lines changed: 16 additions & 10 deletions
@@ -10,7 +10,11 @@ urlFragment: model-inference-samples
# Samples for Azure AI Inference client library for Python

-These are runnable console Python scripts that show how to do chat completion and text embeddings using the clients in this package. Samples in this folder use the a synchronous clients. Samples in the subfolder `async_samples` use the asynchronous clients. The concepts are similar, you can easily modify any of the synchronous samples to asynchronous.
+These are runnable console Python scripts that show how to do chat completion and text embeddings against Serverless API endpoints and Managed Compute endpoints.
+
+Samples with `azure_openai` in their name show how to do chat completions and text embeddings against Azure OpenAI endpoints.
+
+Samples in this folder use the synchronous clients. Samples in the subfolder `async_samples` use the asynchronous clients. The concepts are similar, and you can easily modify any of the synchronous samples to be asynchronous, as the sketch below shows.
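A rough sketch of that conversion (assuming `endpoint` and `key` are already defined): the asynchronous clients live in the `azure.ai.inference.aio` namespace, and their operations are awaited.

```python
import asyncio

from azure.ai.inference.aio import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential


async def main() -> None:
    # Construction mirrors the synchronous client; calls are awaited instead.
    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    response = await client.complete(messages=[UserMessage(content="How many feet are in a mile?")])
    print(response.choices[0].message.content)
    await client.close()


asyncio.run(main())
```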
## Prerequisites

@@ -37,15 +41,17 @@ See [Prerequisites](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/
To construct any of the clients, you will need to pass in the endpoint URL. If you are using key authentication, you also need to pass in the key associated with your deployed AI model.

-* The endpoint URL has the form `https://your-deployment-name.your-azure-region.inference.ai.azure.com`, where `your-deployment-name` is your unique model deployment name and `your-azure-region` is the Azure region where the model is deployed (e.g. `eastus2`).
+* For Serverless API and Managed Compute endpoints, the endpoint URL has the form `https://your-unique-resource-name.your-azure-region.inference.ai.azure.com`, where `your-unique-resource-name` is your globally unique Azure resource name and `your-azure-region` is the Azure region where the model is deployed (e.g. `eastus2`).
+
+* For Azure OpenAI endpoints, the endpoint URL has the form `https://your-unique-resource-name.openai.azure.com/openai/deployments/your-deployment-name`, where `your-unique-resource-name` is your globally unique Azure OpenAI resource name, and `your-deployment-name` is your AI model deployment name.

* The key is a 32-character string.

For convenience, and to promote the practice of not hard-coding secrets in your source code, all samples here assume the endpoint URL and key are stored in environment variables. You will need to set these environment variables before running the samples as-is. The environment variables are mentioned in the tables below.

Note that the client library does not directly read these environment variables at run time. The sample code reads the environment variables and constructs the relevant client with these values, as the sketch below shows.
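A condensed sketch of that pattern (the environment variable names here are illustrative; the tables below list the actual names each sample uses):

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# The sample, not the client library, reads the environment variables...
endpoint = os.environ["CHAT_COMPLETIONS_ENDPOINT"]  # illustrative name
key = os.environ["CHAT_COMPLETIONS_KEY"]  # illustrative name

# ...and passes the values to the client explicitly.
client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
```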

-## Serverless API and Managed Compute Endpoints
+## Serverless API and Managed Compute endpoints

| Sample type | Endpoint environment variable name | Key environment variable name |
|----------|----------|----------|
@@ -57,7 +63,7 @@ Note that the client library does not directly read these environment variable a
To run against a Managed Compute endpoint, some samples also have an optional environment variable `CHAT_COMPLETIONS_DEPLOYMENT_NAME`. This is the value used to set the HTTP request header `azureml-model-deployment` when constructing the client, as the sketch below shows.
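A possible sketch of how a sample might apply that header (not the verbatim sample code), using the `headers` keyword that azure-core based clients accept at construction time; `endpoint` and `key` are assumed to be defined as above:

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# Route requests to a specific deployment on a Managed Compute endpoint by
# sending the `azureml-model-deployment` header with every request.
client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key),
    headers={"azureml-model-deployment": os.environ["CHAT_COMPLETIONS_DEPLOYMENT_NAME"]},
)
```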

-## Azure OpenAI Endpoints
+## Azure OpenAI endpoints

| Sample type | Endpoint environment variable name | Key environment variable name |
|----------|----------|----------|
@@ -84,11 +90,11 @@ similarly for the other samples.
|**File Name**|**Description**|
|----------------|-------------|
|[sample_chat_completions_streaming.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming.py) | One chat completion operation using a synchronous client and streaming response. |
-|[sample_chat_completions_streaming_with_entra_id_auth.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_entra_id_auth.py) | One chat completion operation using a synchronous client and streaming response, using Entra ID authentication. This sample also shows setting the `azureml-model-deployment` HTTP request header, which may be required for Selfhosted Endpoints. |
+|[sample_chat_completions_streaming_with_entra_id_auth.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_entra_id_auth.py) | One chat completion operation using a synchronous client and streaming response, using Entra ID authentication. This sample also shows setting the `azureml-model-deployment` HTTP request header, which may be required for some Managed Compute endpoints. |
|[sample_chat_completions.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions.py) | One chat completion operation using a synchronous client. |
-|[sample_chat_completions_with_history.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_history.py) | Two chat completion operations using a synchronous client, which the second completion using chat history from the first. |
+|[sample_chat_completions_with_history.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_history.py) | Two chat completion operations using a synchronous client, with the second completion using chat history from the first. |
|[sample_chat_completions_from_input_bytes.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_bytes.py) | One chat completion operation using a synchronous client, with input messages provided as `IO[bytes]`. |
-|[sample_chat_completions_from_input_json.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py) | One chat completion operation using a synchronous client, with input messages provided as `MutableMapping[str, Any]` |
+|[sample_chat_completions_from_input_json.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py) | One chat completion operation using a synchronous client, with input messages provided as a dictionary (type `MutableMapping[str, Any]`). |
|[sample_chat_completions_with_tools.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py) | Shows how to use a tool (function) in chat completions, for an AI model that supports tools. |
|[sample_load_client.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_load_client.py) | Shows how to use the function `load_client` to create the appropriate synchronous client based on the provided endpoint URL. In this example, it creates a synchronous `ChatCompletionsClient` (see the sketch after this table). |
|[sample_get_model_info.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_get_model_info.py) | Get AI model information using the chat completions client. The same can be done with all other clients. |
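For the `sample_load_client.py` row above, a minimal sketch of the `load_client` pattern (assuming `endpoint` and `key` are defined, and that the endpoint serves a chat completions model):

```python
from azure.ai.inference import load_client
from azure.core.credentials import AzureKeyCredential

# load_client queries the endpoint for model information and returns the
# matching client type; for a chat model that is a ChatCompletionsClient.
client = load_client(endpoint=endpoint, credential=AzureKeyCredential(key))
print(type(client).__name__)
```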
@@ -118,9 +124,9 @@ similarly for the other samples.
|----------------|-------------|
|[sample_chat_completions_streaming_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_streaming_async.py) | One chat completion operation using an asynchronous client and streaming response. |
|[sample_chat_completions_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_async.py) | One chat completion operation using an asynchronous client. |
-|[sample_load_client_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_load_client_async.py) | Shows how do use the function `load_async_client` to create the appropriate asynchronous client based on the provided endpoint URL. In this example, it creates an asynchronous `ChatCompletionsClient`. |
-|[sample_chat_completions_from_input_bytes_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_bytes_async.py) | One chat completion operation using a synchronous client, with input messages provided as `IO[bytes]`. |
-|[sample_chat_completions_from_input_json_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py) | One chat completion operation using a synchronous client, with input messages provided as `MutableMapping[str, Any]` |
+|[sample_load_client_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_load_client_async.py) | Shows how to use the function `load_client` to create the appropriate asynchronous client based on the provided endpoint URL. In this example, it creates an asynchronous `ChatCompletionsClient`. |
+|[sample_chat_completions_from_input_bytes_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_bytes_async.py) | One chat completion operation using an asynchronous client, with input messages provided as `IO[bytes]`. |
+|[sample_chat_completions_from_input_json_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py) | One chat completion operation using an asynchronous client, with input messages provided as a dictionary (type `MutableMapping[str, Any]`). |
|[sample_chat_completions_streaming_azure_openai_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_streaming_azure_openai_async.py) | One chat completion operation using an asynchronous client and streaming response against an Azure OpenAI endpoint. |
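For the streaming rows above, a rough sketch of asynchronous streaming (assuming `endpoint` and `key` are defined):

```python
import asyncio

from azure.ai.inference.aio import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential


async def main() -> None:
    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    # With stream=True, the returned object is an async iterable of updates.
    response = await client.complete(
        stream=True,
        messages=[UserMessage(content="Tell me a short story")],
    )
    async for update in response:
        if update.choices:  # the final usage update may carry no choices
            print(update.choices[0].delta.content or "", end="")
    await client.close()


asyncio.run(main())
```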
### Text embeddings
