
Commit 16333b2

usage updates
1 parent 6908860 commit 16333b2

4 files changed: +8 −5 lines changed


articles/api-management/azure-openai-emit-token-metric-policy.md
Lines changed: 3 additions & 1 deletion

@@ -79,8 +79,10 @@ The `azure-openai-emit-token-metric` policy sends metrics to Application Insight
 ### Usage notes
 
 * This policy can be used multiple times per policy definition.
-* You can configure at most 10 custom definitions for this policy.
+* You can configure at most 10 custom dimensions for this policy.
 * This policy can optionally be configured when adding an API from the Azure OpenAI Service using the portal.
+* Where available, values in the usage section of the response from the Azure OpenAI Service API are used to determine token metrics.
+* Certain Azure OpenAI endpoints support streaming of responses. When `stream` is set to `true` in the API request to enable streaming, token metrics are estimated.
 
 ## Example
 
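For context, a minimal sketch of how custom dimensions are configured on this policy (the namespace, dimension names, and value expression below are illustrative assumptions, not part of this commit):

```xml
<azure-openai-emit-token-metric namespace="openai">
    <!-- Each <dimension> element is one custom dimension; per the usage note
         above, at most 10 can be configured (values here are illustrative). -->
    <dimension name="API ID" />
    <dimension name="Client IP" value="@(context.Request.IpAddress)" />
</azure-openai-emit-token-metric>
```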

articles/api-management/azure-openai-semantic-cache-lookup-policy.md
Lines changed: 2 additions & 2 deletions

@@ -15,7 +15,7 @@ ms.author: danlep
 
 # Get cached responses of Azure OpenAI API requests
 
-[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
+[!INCLUDE [api-management-availability-basicv2-standardv2](../../includes/api-management-availability-basicv2-standardv2.md)]
 
 Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of responses to Azure OpenAI Chat Completion API and Completion API requests from a configured external cache, based on vector proximity of the prompt to previous requests and a specified similarity score threshold. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
 
@@ -60,7 +60,7 @@ Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of r
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted
+- [**Gateways:**](api-management-gateways-overview.md) v2
 
 ### Usage notes
 
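As a sketch of the policy this change scopes to v2 gateways (the threshold, backend ID, and auth values below are illustrative assumptions):

```xml
<!-- Inbound section: return a cached response when the prompt is within
     the similarity threshold of a previously cached request. -->
<azure-openai-semantic-cache-lookup
    score-threshold="0.05"
    embeddings-backend-id="embeddings-backend"
    embeddings-backend-auth="system-assigned" />
```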

articles/api-management/azure-openai-semantic-cache-store-policy.md
Lines changed: 2 additions & 2 deletions

@@ -15,7 +15,7 @@ ms.author: danlep
 
 # Cache responses to Azure OpenAI API requests
 
-[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
+[!INCLUDE [api-management-availability-basicv2-standardv2](../../includes/api-management-availability-basicv2-standardv2.md)]
 
 The `azure-openai-semantic-cache-store` policy caches responses to Azure OpenAI Chat Completion API and Completion API requests to a configured external cache. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
 
@@ -44,7 +44,7 @@ The `azure-openai-semantic-cache-store` policy caches responses to Azure OpenAI
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) outbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted
+- [**Gateways:**](api-management-gateways-overview.md) v2
 
 ### Usage notes
 
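A matching sketch for the store side (the duration value is an illustrative assumption):

```xml
<!-- Outbound section: cache the backend response for the configured number
     of seconds so the semantic-cache-lookup policy above can serve it. -->
<azure-openai-semantic-cache-store duration="60" />
```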

articles/api-management/azure-openai-token-limit-policy.md
Lines changed: 1 addition & 0 deletions

@@ -73,6 +73,7 @@ For more information, see [Azure OpenAI Service models](../ai-services/openai/co
 
 * This policy can be used multiple times per policy definition.
 * This policy can optionally be configured when adding an API from the Azure OpenAI Service using the portal.
+* Where available when `estimate-prompt-tokens` is set to `false`, values in the usage section of the response from the Azure OpenAI Service API are used to determine token usage.
 * Certain Azure OpenAI endpoints support streaming of responses. When `stream` is set to `true` in the API request to enable streaming, prompt tokens are always estimated, regardless of the value of the `estimate-prompt-tokens` attribute.
 * [!INCLUDE [api-management-rate-limit-key-scope](../../includes/api-management-rate-limit-key-scope.md)]
 
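To illustrate the usage note added here, a minimal sketch of the policy with prompt-token estimation disabled (the counter key, limit, and variable name are illustrative assumptions):

```xml
<!-- With estimate-prompt-tokens="false", token usage reported in the
     service response is used where available, per the added note. -->
<azure-openai-token-limit
    counter-key="@(context.Request.IpAddress)"
    tokens-per-minute="500"
    estimate-prompt-tokens="false"
    remaining-tokens-variable-name="remainingTokens" />
```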
