Commit 2451dda

Merge pull request #262750 from MicrosoftDocs/main
1/9 11:00 AM IST Publish
2 parents 1eff420 + 461b6c0 commit 2451dda

45 files changed

+1136
-647
lines changed

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
---
2+
title: Azure OpenAI Service API version retirement
3+
description: Learn more about API version retirement in Azure OpenAI Service
4+
services: cognitive-services
5+
manager: nitinme
6+
ms.service: azure-ai-openai
7+
ms.topic: conceptual
8+
ms.date: 01/08/2024
9+
author: mrbullwinkle
10+
ms.author: mbullwin
11+
recommendations: false
12+
ms.custom:
13+
---
14+
15+
# Azure OpenAI API preview lifecycle
16+
17+
This article helps you understand the support lifecycle for Azure OpenAI API preview releases.
18+
19+
## Latest preview API release
20+
21+
Azure OpenAI API version 2023-12-01-preview is currently the latest preview release.
22+
23+
This version contains support for all the latest Azure OpenAI features including:
24+
25+
- [Fine-tuning](./how-to/fine-tuning.md) `gpt-35-turbo`, `babbage-002`, and `davinci-002` models. [**Added in 2023-10-01-preview**]
26+
- [Whisper](./whisper-quickstart.md). [**Added in 2023-09-01-preview**]
27+
- [Function calling](./how-to/function-calling.md) [**Added in 2023-07-01-preview**]
28+
- [DALL-E](./dall-e-quickstart.md) [**Added in 2023-06-01-preview**]
29+
- [Retrieval augmented generation with the on your data feature](./use-your-data-quickstart.md). [**Added in 2023-06-01-preview**]
30+
31+
## Retiring soon
32+
33+
On April 2, 2024, the following API preview releases will be retired and will stop accepting API requests:
34+
35+
- 2023-03-15-preview
36+
- 2023-06-01-preview
37+
- 2023-07-01-preview
38+
- 2023-08-01-preview
39+
40+
To avoid service disruptions, update to the latest preview version before the retirement date.
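
As a rough illustration, the retirement list above can be turned into a small check that flags a configured API version before the cutoff. This is a minimal sketch: the function names `needs_update` and `days_until_retirement` are hypothetical helpers, not part of any Azure SDK, and the versions and date are copied from this article.

```python
from datetime import date

# Retirement date and versions copied from the "Retiring soon" list.
RETIREMENT_DATE = date(2024, 4, 2)
RETIRING_VERSIONS = {
    "2023-03-15-preview",
    "2023-06-01-preview",
    "2023-07-01-preview",
    "2023-08-01-preview",
}
LATEST_PREVIEW = "2023-12-01-preview"


def needs_update(api_version: str) -> bool:
    """Return True if the configured version is on the retiring list."""
    return api_version in RETIRING_VERSIONS


def days_until_retirement(today: date) -> int:
    """Days left before the retiring versions stop accepting requests."""
    return (RETIREMENT_DATE - today).days
```

A check like this could run at application startup or in CI so that a retiring version is caught before requests start failing.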
41+
42+
## Updating API versions
43+
44+
We recommend testing the upgrade to a new API version first, to confirm that the update doesn't affect your application, before you make the change globally across your environment.
45+
46+
If you are using the OpenAI Python client library or the REST API, you will need to update your code directly to the latest preview API version.
47+
48+
If you are using one of the Azure OpenAI SDKs for C#, Go, Java, or JavaScript, you will instead need to update to the latest version of the SDK. Each SDK release is hardcoded to work with specific versions of the Azure OpenAI API.
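
For REST callers, the API version is passed as the `api-version` query parameter, so updating usually means changing one string. The sketch below assumes the standard `https://<resource>.openai.azure.com` endpoint form; `chat_completions_url` is a hypothetical helper, and the resource and deployment names are placeholders.

```python
from urllib.parse import urlencode

# Keep the version in one place so an upgrade is a single-line change.
API_VERSION = "2023-12-01-preview"


def chat_completions_url(resource: str, deployment: str,
                         api_version: str = API_VERSION) -> str:
    """Build the chat completions URL for a deployment (placeholder names)."""
    base = (
        f"https://{resource}.openai.azure.com"
        f"/openai/deployments/{deployment}/chat/completions"
    )
    return f"{base}?{urlencode({'api-version': api_version})}"
```

Centralizing the version string also makes it easy to test the new version in one environment before rolling it out everywhere.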
49+
50+
## Next steps
51+
52+
- [Learn more about Azure OpenAI](overview.md)
53+
- [Learn about working with Azure OpenAI models](./how-to/working-with-models.md)

articles/ai-services/openai/concepts/content-filter.md

Lines changed: 168 additions & 0 deletions
@@ -628,6 +628,174 @@ For details on the inference REST API endpoints for Azure OpenAI and how to crea
628628
}
629629
```
630630
631+
## Streaming
632+
633+
Azure OpenAI Service includes a content filtering system that works alongside core models. The following section describes the AOAI streaming experience and options in the context of content filters.
634+
635+
### Default
636+
637+
The content filtering system is integrated and enabled by default for all customers. In the default streaming scenario, completion content is buffered and the content filtering system runs on the buffered content. Depending on the content filtering configuration (Microsoft default or custom user configuration), the content is either returned to the user if it doesn't violate the content filtering policy, or it's immediately blocked and a content filtering error is returned without the harmful completion content. This process repeats until the end of the stream, so content is fully vetted against the content filtering policy before it's returned to the user. In this case, content isn't returned token by token but in "content chunks" of the respective buffer size.
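
The buffer-then-filter loop described above can be pictured with a toy model. This is only an illustrative sketch, not the service's implementation: `buffered_stream`, `ContentFiltered`, the `violates` callback, and the buffer size are all hypothetical.

```python
from typing import Callable, Iterable, Iterator


class ContentFiltered(Exception):
    """Stands in for the content filtering error (hypothetical)."""


def buffered_stream(tokens: Iterable[str], buffer_size: int,
                    violates: Callable[[str], bool]) -> Iterator[str]:
    """Yield vetted chunks; raise before any violating chunk is released."""
    buffer = ""
    for token in tokens:
        buffer += token
        if len(buffer) >= buffer_size:
            if violates(buffer):
                raise ContentFiltered("chunk blocked by policy")
            yield buffer  # released only after the filter has run
            buffer = ""
    if buffer:  # vet and flush whatever remains at end of stream
        if violates(buffer):
            raise ContentFiltered("chunk blocked by policy")
        yield buffer
```

The point of the model is the ordering: a chunk reaches the caller only after it has been checked, which is why output arrives in chunks rather than token by token.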
638+
639+
### Asynchronous modified filter
640+
641+
Customers who have been approved for modified content filters can choose Asynchronous Modified Filter as an additional option, which provides a new streaming experience. In this case, content filters run asynchronously and completion content is returned immediately, giving a smooth token-by-token streaming experience. Because no content is buffered, filtering adds no latency to the stream.
642+
643+
> [!NOTE]
644+
> Customers must be aware that while the feature improves latency, it can bring a trade-off in terms of the safety and real-time vetting of smaller sections of model output. Because content filters are run asynchronously, content moderation messages and the content filtering signal in case of a policy violation are delayed, which means some sections of harmful content that would otherwise have been filtered immediately could be displayed to the user.
645+
646+
**Annotations**: Annotations and content moderation messages are continuously returned during the stream. We strongly recommend that you consume annotations and implement additional AI content safety mechanisms, such as redacting content or returning additional safety information to the user.
647+
648+
**Content filtering signal**: The content filtering error signal is delayed; in case of a policy violation, it’s returned as soon as it’s available, and the stream is stopped. The content filtering signal is guaranteed within ~1,000-character windows in case of a policy violation.
649+
650+
Approval for Modified Content Filtering is required for access to Streaming – Asynchronous Modified Filter. To apply, use the [application form](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xURE01NDY1OUhBRzQ3MkQxMUhZSE1ZUlJKTiQlQCN0PWcu). To enable it in Azure OpenAI Studio, follow the [content filter instructions](/azure/ai-services/openai/how-to/content-filters) to create a new content filtering configuration, and select "Asynchronous Modified Filter" in the Streaming section, as shown in the screenshot below.
651+
652+
### Overview
653+
654+
| | Streaming - Default | Streaming - Asynchronous Modified Filter |
655+
|---|---|---|
656+
|Status |GA |Public Preview |
657+
| Access | Enabled by default, no action needed |Customers approved for Modified Content Filtering can configure directly via Azure OpenAI Studio (as part of a content filtering configuration; applied on deployment-level) |
658+
| Eligibility |All customers |Customers approved for Modified Content Filtering |
659+
|Modality and Availability |Text; all GPT-models |Text; all GPT-models except gpt-4-vision |
660+
|Streaming experience |Content is buffered and returned in chunks |Zero latency (no buffering, filters run asynchronously) |
661+
|Content filtering signal |Immediate filtering signal |Delayed filtering signal (in increments of up to ~1,000 characters) |
662+
|Content filtering configurations |Supports default and any customer-defined filter setting (including optional models) |Supports default and any customer-defined filter setting (including optional models) |
663+
664+
### Annotations and sample response stream
665+
666+
#### Prompt annotation message
667+
668+
This is the same as default annotations.
669+
670+
```json
671+
data: {
672+
"id": "",
673+
"object": "",
674+
"created": 0,
675+
"model": "",
676+
"prompt_filter_results": [
677+
{
678+
"prompt_index": 0,
679+
"content_filter_results": { ... }
680+
}
681+
],
682+
"choices": [],
683+
"usage": null
684+
}
685+
```
686+
687+
#### Completion token message
688+
689+
Completion messages are forwarded immediately. No moderation is performed first, and no annotations are provided initially.
690+
691+
```json
692+
data: {
693+
"id": "chatcmpl-7rAJvsS1QQCDuZYDDdQuMJVMV3x3N",
694+
"object": "chat.completion.chunk",
695+
"created": 1692905411,
696+
"model": "gpt-35-turbo",
697+
"choices": [
698+
{
699+
"index": 0,
700+
"finish_reason": null,
701+
"delta": {
702+
"content": "Color"
703+
}
704+
}
705+
],
706+
"usage": null
707+
}
708+
```
709+
710+
#### Annotation message
711+
712+
The text field will always be an empty string, indicating no new tokens. Annotations will only be relevant to already-sent tokens. There may be multiple Annotation Messages referring to the same tokens.
713+
714+
`start_offset` and `end_offset` are low-granularity offsets in text (with 0 at the beginning of the prompt) that the annotation is relevant to.
715+
716+
`check_offset` represents how much text has been fully moderated. It is an exclusive lower bound on the `end_offset` values of future annotations. It is nondecreasing.
717+
718+
```json
719+
data: {
720+
"id": "",
721+
"object": "",
722+
"created": 0,
723+
"model": "",
724+
"choices": [
725+
{
726+
"index": 0,
727+
"finish_reason": null,
728+
"content_filter_results": { ... },
729+
"content_filter_raw": [ ... ],
730+
"content_filter_offsets": {
731+
"check_offset": 44,
732+
"start_offset": 44,
733+
"end_offset": 198
734+
}
735+
}
736+
],
737+
"usage": null
738+
}
739+
```
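
The three message shapes above can be told apart by which fields they carry. The following is a minimal sketch, assuming each `data:` payload has already been parsed into a dictionary; `classify` and `latest_check_offset` are hypothetical helper names. It accepts both `prompt_filter_results` and `prompt_annotations` because both keys appear in the samples in this article.

```python
from typing import Iterable


def classify(message: dict) -> str:
    """Label a parsed stream message by the fields it carries."""
    if message.get("prompt_filter_results") or message.get("prompt_annotations"):
        return "prompt_annotation"
    choices = message.get("choices") or []
    if choices and "content_filter_offsets" in choices[0]:
        return "annotation"  # no new tokens, refers to already-sent text
    if choices and "delta" in choices[0]:
        return "completion_token"
    return "unknown"


def latest_check_offset(messages: Iterable[dict]) -> int:
    """Return the highest check_offset seen, i.e. how much text is vetted."""
    offset = 0
    for message in messages:
        for choice in message.get("choices") or []:
            offsets = choice.get("content_filter_offsets")
            if offsets:
                offset = max(offset, offsets["check_offset"])
    return offset
```

Because `check_offset` is nondecreasing, an application could treat any text beyond the latest value as not yet fully moderated.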
740+
741+
742+
### Sample response stream
743+
744+
Below is a real chat completion response using Asynchronous Modified Filter. Note how prompt annotations are not changed; completion tokens are sent without annotations; and new annotation messages are sent without tokens, instead associated with certain content filter offsets.
745+
746+
`{"temperature": 0, "frequency_penalty": 0, "presence_penalty": 1.0, "top_p": 1.0, "max_tokens": 800, "messages": [{"role": "user", "content": "What is color?"}], "stream": true}`
747+
748+
```
749+
data: {"id":"","object":"","created":0,"model":"","prompt_annotations":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[],"usage":null}
750+
751+
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}],"usage":null}
752+
753+
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":"Color"}}],"usage":null}
754+
755+
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" is"}}],"usage":null}
756+
757+
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" a"}}],"usage":null}
758+
759+
...
760+
761+
data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":44,"start_offset":44,"end_offset":198}}],"usage":null}
762+
763+
...
764+
765+
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":"stop","delta":{}}],"usage":null}
766+
767+
data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":506,"start_offset":44,"end_offset":571}}],"usage":null}
768+
769+
data: [DONE]
770+
```
771+
772+
### Sample response stream (blocking)
773+
774+
`{"temperature": 0, "frequency_penalty": 0, "presence_penalty": 1.0, "top_p": 1.0, "max_tokens": 800, "messages": [{"role": "user", "content": "Tell me the lyrics to \"Hey Jude\"."}], "stream": true}`
775+
776+
```
777+
data: {"id":"","object":"","created":0,"model":"","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[],"usage":null}
778+
779+
data: {"id":"chatcmpl-8JCbt5d4luUIhYCI7YH4dQK7hnHx2","object":"chat.completion.chunk","created":1699587397,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}],"usage":null}
780+
781+
data: {"id":"chatcmpl-8JCbt5d4luUIhYCI7YH4dQK7hnHx2","object":"chat.completion.chunk","created":1699587397,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":"Hey"}}],"usage":null}
782+
783+
data: {"id":"chatcmpl-8JCbt5d4luUIhYCI7YH4dQK7hnHx2","object":"chat.completion.chunk","created":1699587397,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" Jude"}}],"usage":null}
784+
785+
data: {"id":"chatcmpl-8JCbt5d4luUIhYCI7YH4dQK7hnHx2","object":"chat.completion.chunk","created":1699587397,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":","}}],"usage":null}
786+
787+
...
788+
789+
data: {"id":"chatcmpl-8JCbt5d4luUIhYCI7YH4dQK7hnHx2","object":"chat.completion.chunk","created":1699587397,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" better"}}],"usage":null}
792+
793+
data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":65,"start_offset":65,"end_offset":1056}}],"usage":null}
794+
795+
data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":"content_filter","content_filter_results":{"protected_material_text":{"detected":true,"filtered":true}},"content_filter_offsets":{"check_offset":65,"start_offset":65,"end_offset":1056}}],"usage":null}
796+
797+
data: [DONE]
798+
```
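
A client consuming a stream like the one above needs to notice the delayed `content_filter` signal and react to text it has already shown. The sketch below is one possible approach, assuming the stream arrives as an iterable of `data:` lines; the `consume` helper is hypothetical and not part of any SDK.

```python
import json
from typing import Iterable, Tuple


def consume(lines: Iterable[str]) -> Tuple[str, bool]:
    """Collect streamed text; return (text_so_far, blocked)."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        message = json.loads(payload)
        for choice in message.get("choices") or []:
            if choice.get("finish_reason") == "content_filter":
                # Delayed signal: earlier tokens may already be on screen.
                return "".join(parts), True
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
    return "".join(parts), False
```

When `blocked` is true, the application might redact or replace the text it has already displayed, in line with the recommendation to implement additional safety mechanisms.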
631799
## Best practices
632800
633801
As part of your application design, consider the following best practices to deliver a positive experience with your application while minimizing potential harms:

articles/ai-services/openai/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -65,6 +65,8 @@ items:
6565
href: ./concepts/use-your-image-data.md
6666
- name: How-to
6767
items:
68+
- name: API version lifecycle
69+
href: ./api-version-deprecation.md
6870
- name: Completions & chat completions
6971
items:
7072
- name: GPT-35-Turbo & GPT-4

articles/ai-services/translator/custom-translator/how-to/create-manage-workspace.md

Lines changed: 6 additions & 1 deletion
@@ -5,7 +5,7 @@ description: How to create and manage workspaces
55
author: laujan
66
manager: nitinme
77
ms.service: azure-ai-translator
8-
ms.date: 07/18/2023
8+
ms.date: 01/08/2024
99
ms.author: lajanuar
1010
ms.topic: how-to
1111

@@ -15,6 +15,11 @@ ms.topic: how-to
1515

1616
Workspaces are places to manage your documents, projects, and models. When you create a workspace, you can choose to use the workspace independently, or share it with teammates to divide up the work.
1717

18+
> [!NOTE]
19+
>
20+
> * [Custom Translator Portal](https://portal.customtranslator.azure.ai/) access can only be enabled through a public network.
21+
> * For information on how to use selected networks and private endpoints, see [Enable Custom Translator through Azure Virtual Network](enable-vnet-service-endpoint.md).
22+
1823
## Create workspace
1924

2025
1. After you sign in to Custom Translator, you'll be asked for permission to read your profile from the Microsoft identity platform to request your user access token and refresh token. Both tokens are needed for authentication and to ensure that you aren't signed out during your live session or while training your models. </br>Select **Yes**.

articles/aks/concepts-network.md

Lines changed: 17 additions & 9 deletions
@@ -24,19 +24,27 @@ This article introduces the core concepts that provide networking to your applic
2424
* [Network policies](#network-policies)
2525

2626
## Kubernetes basics
27+
Kubernetes employs a virtual networking layer to manage access within and between your applications or their components. This involves the following key aspects:
2728

28-
To allow access to your applications or between application components, Kubernetes provides an abstraction layer to virtual networking. Kubernetes nodes connect to a virtual network, providing inbound and outbound connectivity for pods. The *kube-proxy* component runs on each node to provide these network features.
29+
- **Kubernetes nodes and virtual network**: Kubernetes nodes are connected to a virtual network. This setup enables pods (basic units of deployment in Kubernetes) to have both inbound and outbound connectivity.
2930

30-
In Kubernetes:
31+
- **Kube-proxy component**: Running on each node, kube-proxy is responsible for providing the necessary network features.
3132

32-
* *Services* logically group pods to allow for direct access on a specific port via an IP address or DNS name.
33-
* *ServiceTypes* allow you to specify what kind of Service you want.
34-
* You can distribute traffic using a *load balancer*.
35-
* Layer 7 routing of application traffic can also be achieved with *ingress controllers*.
36-
* You can *control outbound (egress) traffic* for cluster nodes.
37-
* Security and filtering of the network traffic for pods is possible with *network policies*.
33+
Regarding specific Kubernetes functionalities:
3834

39-
The Azure platform also simplifies virtual networking for AKS clusters. When you create a Kubernetes load balancer, you also create and configure the underlying Azure load balancer resource. As you open network ports to pods, the corresponding Azure network security group rules are configured. For HTTP application routing, Azure can also configure *external DNS* as new Ingress routes are configured.
35+
- **Services**: These are used to logically group pods, allowing direct access to them through a specific IP address or DNS name on a designated port.
36+
- **Service types**: This feature lets you specify the kind of Service you wish to create.
37+
- **Load balancer**: You can use a load balancer to distribute network traffic evenly across various resources.
38+
- **Ingress controllers**: These facilitate Layer 7 routing, which is essential for directing application traffic.
39+
- **Egress traffic control**: Kubernetes allows you to manage and control outbound traffic from cluster nodes.
40+
- **Network policies**: These policies enable security measures and filtering for network traffic in pods.
41+
42+
In the context of the Azure platform:
43+
44+
- Azure streamlines virtual networking for AKS (Azure Kubernetes Service) clusters.
45+
- Creating a Kubernetes load balancer on Azure simultaneously sets up the corresponding Azure load balancer resource.
46+
- As you open network ports to pods, Azure automatically configures the necessary network security group rules.
47+
- Azure can also manage external DNS configurations for HTTP application routing as new Ingress routes are established.
4048

4149
## Services
4250

articles/aks/free-standard-pricing-tiers.md

Lines changed: 2 additions & 2 deletions
@@ -1,12 +1,12 @@
11
---
2-
title: Azure Kubernetes Service (AKS) Free Standard and Premium pricing tiers for cluster management
2+
title: Azure Kubernetes Service (AKS) Free, Standard and Premium pricing tiers for cluster management
33
description: Learn about the Azure Kubernetes Service (AKS) Free, Standard, and Premium pricing plans and what features, deployment patterns, and recommendations to consider between each plan.
44
ms.topic: conceptual
55
ms.date: 04/07/2023
66
ms.custom: references_regions, devx-track-azurecli
77
---
88

9-
# Free Standard and Premium pricing tiers for Azure Kubernetes Service (AKS) cluster management
9+
# Free, Standard and Premium pricing tiers for Azure Kubernetes Service (AKS) cluster management
1010

1111
Azure Kubernetes Service (AKS) is now offering three pricing tiers for cluster management: the **Free tier**, the **Standard tier** and the **Premium tier**. All tiers are in the **Base** sku.
1212
