
Commit 970588b

Merge pull request #291808 from nimakamoosi/nimak/fix/move-semantic-caching-managed-identity-to-backend
Update documentation for moving Semantic caching embedding auth to backend.
2 parents 568e1f2 + 3f5c8ba commit 970588b

3 files changed (+4, −5 lines)

articles/api-management/azure-openai-enable-semantic-caching.md

Lines changed: 4 additions & 1 deletion
@@ -61,6 +61,10 @@ Configure a [backend](backends.md) resource for the embeddings API deployment wi
 ```
 https://my-aoai.openai.azure.com/openai/deployments/embeddings-deployment/embeddings
 ```
+* **Authorization credentials** - Go to the **Managed Identity** tab.
+* **Client identity** - Select *System assigned identity* or enter a user-assigned managed identity client ID.
+* **Resource ID** - Enter `https://cognitiveservices.azure.com/` for Azure OpenAI Service.
+
 ### Test backend
 
 To test the backend, create an API operation for your Azure OpenAI Service API:
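
For context, once the backend itself carries the managed identity credentials, the test operation only needs to route requests to it. A minimal inbound policy sketch, assuming the illustrative backend ID `embeddings-deployment` from the hunk above:

```xml
<policies>
    <inbound>
        <base />
        <!-- Route this test operation to the embeddings backend; the backend's
             managed identity configuration supplies the token for Azure OpenAI. -->
        <set-backend-service backend-id="embeddings-deployment" />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
</policies>
```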
@@ -123,7 +127,6 @@ Configure the following policies to enable semantic caching for Azure OpenAI API
 <azure-openai-semantic-cache-lookup
     score-threshold="0.8"
     embeddings-backend-id="embeddings-deployment"
-    embeddings-backend-auth="system-assigned"
     ignore-system-messages="true"
     max-message-count="10">
     <vary-by>@(context.Subscription.Id)</vary-by>
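
With `embeddings-backend-auth` removed, the lookup policy relies entirely on the backend's own credentials. A minimal sketch of the full pairing, with the lookup in the inbound section and a cache store in the outbound section (the 60-second duration is illustrative):

```xml
<policies>
    <inbound>
        <base />
        <!-- Look up a semantically similar cached completion before calling Azure OpenAI. -->
        <azure-openai-semantic-cache-lookup
            score-threshold="0.8"
            embeddings-backend-id="embeddings-deployment"
            ignore-system-messages="true"
            max-message-count="10">
            <vary-by>@(context.Subscription.Id)</vary-by>
        </azure-openai-semantic-cache-lookup>
    </inbound>
    <outbound>
        <base />
        <!-- Cache the completion for subsequent lookups; duration is in seconds (illustrative). -->
        <azure-openai-semantic-cache-store duration="60" />
    </outbound>
</policies>
```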

articles/api-management/azure-openai-semantic-cache-lookup-policy.md

Lines changed: 0 additions & 2 deletions
@@ -34,7 +34,6 @@ Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of r
 <azure-openai-semantic-cache-lookup
     score-threshold="similarity score threshold"
     embeddings-backend-id ="backend entity ID for embeddings API"
-    embeddings-backend-auth ="system-assigned"
     ignore-system-messages="true | false"
     max-message-count="count" >
     <vary-by>"expression to partition caching"</vary-by>
@@ -47,7 +46,6 @@ Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of r
 | ----------------- | ------------------------------------------------------ | -------- | ------- |
 | score-threshold | Similarity score threshold used to determine whether to return a cached response to a prompt. Value is a decimal between 0.0 and 1.0. [Learn more](../azure-cache-for-redis/cache-tutorial-semantic-cache.md#change-the-similarity-threshold). | Yes | N/A |
 | embeddings-backend-id | [Backend](backends.md) ID for OpenAI embeddings API call. | Yes | N/A |
-| embeddings-backend-auth | Authentication used for Azure OpenAI embeddings API backend. | Yes. Must be set to `system-assigned`. | N/A |
 | ignore-system-messages | Boolean. If set to `true`, removes system messages from a GPT chat completion prompt before assessing cache similarity. | No | false |
 | max-message-count | If specified, number of remaining dialog messages after which caching is skipped. | No | N/A |
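
After this change, a usage example in the reference reduces to the remaining attributes. A sketch with illustrative values (a matching `azure-openai-semantic-cache-store` policy in the outbound section completes the setup, as sketched earlier):

```xml
<azure-openai-semantic-cache-lookup
    score-threshold="0.05"
    embeddings-backend-id="embeddings-backend">
    <vary-by>@(context.Subscription.Id)</vary-by>
</azure-openai-semantic-cache-lookup>
```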

articles/api-management/llm-semantic-cache-lookup-policy.md

Lines changed: 0 additions & 2 deletions
@@ -34,7 +34,6 @@ Use the `llm-semantic-cache-lookup` policy to perform cache lookup of responses
 <llm-semantic-cache-lookup
     score-threshold="similarity score threshold"
     embeddings-backend-id ="backend entity ID for embeddings API"
-    embeddings-backend-auth ="system-assigned"
     ignore-system-messages="true | false"
     max-message-count="count" >
     <vary-by>"expression to partition caching"</vary-by>
@@ -47,7 +46,6 @@ Use the `llm-semantic-cache-lookup` policy to perform cache lookup of responses
 | ----------------- | ------------------------------------------------------ | -------- | ------- |
 | score-threshold | Similarity score threshold used to determine whether to return a cached response to a prompt. Value is a decimal between 0.0 and 1.0. [Learn more](../azure-cache-for-redis/cache-tutorial-semantic-cache.md#change-the-similarity-threshold). | Yes | N/A |
 | embeddings-backend-id | [Backend](backends.md) ID for OpenAI embeddings API call. | Yes | N/A |
-| embeddings-backend-auth | Authentication used for Azure OpenAI embeddings API backend. | Yes. Must be set to `system-assigned`. | N/A |
 | ignore-system-messages | Boolean. If set to `true`, removes system messages from a GPT chat completion prompt before assessing cache similarity. | No | false |
 | max-message-count | If specified, number of remaining dialog messages after which caching is skipped. | No | N/A |
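
The `llm-semantic-cache-lookup` policy mirrors the Azure OpenAI variant; a sketch with illustrative values after the attribute removal:

```xml
<llm-semantic-cache-lookup
    score-threshold="0.8"
    embeddings-backend-id="embeddings-backend"
    ignore-system-messages="true"
    max-message-count="10">
    <vary-by>@(context.Subscription.Id)</vary-by>
</llm-semantic-cache-lookup>
```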
