
Commit 8a5e5fa

Merge branch 'main' into edits0528
2 parents 9ebbb9d + e20f0a1 commit 8a5e5fa

File tree: 187 files changed (+1253, -1385 lines changed)


articles/active-directory-b2c/authorization-code-flow.md

Lines changed: 1 addition & 1 deletion
@@ -233,7 +233,7 @@ A successful token response looks like this:
 | access_token |The signed JWT that you requested. |
 | scope |The scopes that the token is valid for. You also can use the scopes to cache tokens for later use. |
 | expires_in |The length of time that the token is valid (in seconds). |
-| refresh_token |An OAuth 2.0 refresh token. The app can use this token to acquire additional tokens after the current token expires. Refresh tokens are long-lived, and can be used to retain access to resources for extended periods of time. For more information, see the [Azure AD B2C token reference](tokens-overview.md). |
+| refresh_token |An OAuth 2.0 refresh token. The app can use this token to acquire additional tokens after the current token expires. Refresh tokens are long-lived and can be used to retain access to resources for extended periods of time. For more information, see the [Azure AD B2C token reference](tokens-overview.md). |
 
 Error responses look like this:
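For reference, a token response carrying the fields described in this table generally has the following shape. The values below are truncated placeholders, not output captured from Azure AD B2C, and the exact set of fields depends on the scopes requested:

```json
{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIs...",
  "token_type": "Bearer",
  "expires_in": "3600",
  "refresh_token": "AwABAAAAvPM1KaPlrEqd...",
  "scope": "90c0fe63-bcf2-44d5-8fb7-b8bbc0b29dc6 offline_access"
}
```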

articles/active-directory-b2c/custom-policies-series-call-rest-api.md

Lines changed: 1 addition & 1 deletion
@@ -216,7 +216,7 @@ to:
 ```xml
 <ValidationTechnicalProfile ReferenceId="ValidateAccessCodeViaHttp"/>
 ```
-At this point, the Technical Profile with `Id` *CheckAccessCodeViaClaimsTransformationChecker* isn't needed, and can be removed.
+At this point, the Technical Profile with `Id` *CheckAccessCodeViaClaimsTransformationChecker* isn't needed and can be removed.
 
 
 ## Step 3 - Upload custom policy file
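For orientation, a `ValidationTechnicalProfile` reference of this kind sits inside the `ValidationTechnicalProfiles` collection of the self-asserted technical profile that collects the access code. The sketch below assumes a hypothetical profile `Id`; only the `ReferenceId` comes from the change above:

```xml
<!-- Sketch only: "AccessCodeInputCollector" is a hypothetical profile Id -->
<TechnicalProfile Id="AccessCodeInputCollector">
  <!-- ...display claims and other elements of the self-asserted profile... -->
  <ValidationTechnicalProfiles>
    <!-- The REST API call now performs the access code check -->
    <ValidationTechnicalProfile ReferenceId="ValidateAccessCodeViaHttp"/>
  </ValidationTechnicalProfiles>
</TechnicalProfile>
```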

articles/api-management/api-management-howto-mutual-certificates.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ After the certificate is uploaded, it shows in the **Certificates** window. If y
 > This change is effective immediately, and calls to operations of that API will use the certificate to authenticate on the backend server.
 
 > [!TIP]
-> When a certificate is specified for gateway authentication for the backend service of an API, it becomes part of the policy for that API, and can be viewed in the policy editor.
+> When a certificate is specified for gateway authentication for the backend service of an API, it becomes part of the policy for that API and can be viewed in the policy editor.
 
 ## Disable certificate chain validation for self-signed certificates
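For reference, the certificate typically shows up in the policy editor as an `authentication-certificate` policy in the `inbound` section of the API's policy; the thumbprint below is a placeholder:

```xml
<inbound>
    <base />
    <!-- Placeholder thumbprint; use the thumbprint of the certificate uploaded for gateway authentication -->
    <authentication-certificate thumbprint="CA06F56B258B7A0D4F2B05470939478651151984" />
</inbound>
```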

articles/api-management/api-management-key-concepts-experiment.md

Lines changed: 1 addition & 1 deletion
@@ -135,7 +135,7 @@ Operations in API Management are highly configurable, with control over URL mapp
 
 ### Products
 
-Products are how APIs are surfaced to developers. Products in API Management have one or more APIs, and can be *open* or *protected*. Protected products require a subscription key, while open products can be consumed freely.
+Products are how APIs are surfaced to developers. Products in API Management have one or more APIs and can be *open* or *protected*. Protected products require a subscription key, while open products can be consumed freely.
 
 When a product is ready for use by developers, it can be published. Once published, it can be viewed or subscribed to by developers. Subscription approval is configured at the product level and can either require an administrator's approval or be automatic.

articles/api-management/api-management-key-concepts.md

Lines changed: 1 addition & 1 deletion
@@ -163,7 +163,7 @@ Operations in API Management are highly configurable, with control over URL mapp
 
 ### Products
 
-Products are how APIs are surfaced to API consumers such as app developers. Products in API Management have one or more APIs, and can be *open* or *protected*. Protected products require a subscription key, while open products can be consumed freely.
+Products are how APIs are surfaced to API consumers such as app developers. Products in API Management have one or more APIs and can be *open* or *protected*. Protected products require a subscription key, while open products can be consumed freely.
 
 When a product is ready for use by consumers, it can be published. Once published, it can be viewed or subscribed to by users through the developer portal. Subscription approval is configured at the product level and can either require an administrator's approval or be automatic.
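As an illustration of the subscription key requirement for protected products (not shown in this diff), a call typically passes the key in a request header. The hostname, API path, and key below are placeholders; only the header name is the API Management default:

```console
# Placeholder host, path, and key; Ocp-Apim-Subscription-Key is the default header name
curl "https://contoso.azure-api.net/echo/resource?param1=sample" \
  -H "Ocp-Apim-Subscription-Key: <your-subscription-key>"
```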

articles/api-management/backends.md

Lines changed: 74 additions & 10 deletions
@@ -5,7 +5,7 @@ services: api-management
 author: dlepow
 ms.service: azure-api-management
 ms.topic: concept-article
-ms.date: 04/01/2025
+ms.date: 05/20/2025
 ms.author: danlep
 ms.custom:
   - build-2024
@@ -239,16 +239,63 @@ Use a backend pool for scenarios such as the following:
 
 API Management supports the following load balancing options for backend pools:
 
-* **Round-robin**: By default, requests are distributed evenly across the backends in the pool.
-* **Weighted**: Weights are assigned to the backends in the pool, and requests are distributed across the backends based on the relative weight assigned to each backend. Use this option for scenarios such as conducting a blue-green deployment.
-* **Priority-based**: Backends are organized in priority groups, and requests are sent to the backends in order of the priority groups. Within a priority group, requests are distributed either evenly across the backends, or (if assigned) according to the relative weight assigned to each backend.
-
+| Load balancing option | Description |
+|------------------|-------------|
+| **Round-robin** | Requests are distributed evenly across the backends in the pool by default. |
+| **Weighted** | Weights are assigned to the backends in the pool, and requests are distributed based on the relative weight of each backend. Useful for scenarios such as blue-green deployments. |
+| **Priority-based** | Backends are organized into priority groups. Requests are sent to higher priority groups first; within a group, requests are distributed evenly or according to assigned weights. |
+
 > [!NOTE]
 > Backends in lower priority groups will only be used when all backends in higher priority groups are unavailable because circuit breaker rules are tripped.
 
+### Session awareness
+
+With any of the preceding load balancing options, optionally enable **session awareness** (session affinity) to ensure that all requests from a specific user during a session are directed to the same backend in the pool. API Management sets a session ID cookie to maintain session state. This option is useful, for example, in scenarios with backends such as AI chat assistants or other conversational agents to route requests from the same session to the same endpoint.
+
+> [!NOTE]
+> Session awareness in load-balanced pools is being released first to the **AI Gateway Early** [update group](configure-service-update-settings.md).
+
+#### Manage cookies for session awareness
+
+When using session awareness, the client must handle cookies appropriately. The client needs to store the `Set-Cookie` header value and send it with subsequent requests to maintain session state.
+
+You can use API Management policies to help set cookies for session awareness. For example, for the case of the Assistants API (a feature of [Azure OpenAI in Azure AI Foundry Models](/azure/ai-services/openai/concepts/models)), the client needs to keep the session ID, extract the thread ID from the body, and keep the pair and send the right cookie for each call. Moreover, the client needs to know when to send a cookie or when not to send a cookie header. These requirements can be handled appropriately by defining the following example policies:
+
+
+```xml
+<policies>
+  <inbound>
+    <base />
+    <set-backend-service backend-id="APIMBackend" />
+  </inbound>
+  <backend>
+    <base />
+  </backend>
+  <outbound>
+    <base />
+    <set-variable name="gwSetCookie" value="@{
+      var payload = context.Response.Body.As<JObject>();
+      var threadId = payload["id"];
+      var gwSetCookieHeaderValue = context.Request.Headers.GetValueOrDefault("SetCookie", string.Empty);
+      if(!string.IsNullOrEmpty(gwSetCookieHeaderValue))
+      {
+        gwSetCookieHeaderValue = gwSetCookieHeaderValue + $";Path=/threads/{threadId};";
+      }
+      return gwSetCookieHeaderValue;
+    }" />
+    <set-header name="Set-Cookie" exists-action="override">
+      <value>Cookie=gwSetCookieHeaderValue</value>
+    </set-header>
+  </outbound>
+  <on-error>
+    <base />
+  </on-error>
+</policies>
+```
+
 ### Example
 
-Use the portal, API Management [REST API](/rest/api/apimanagement/backend), or a Bicep or ARM template to configure a backend pool. In the following example, the backend *myBackendPool* in the API Management instance *myAPIM* is configured with a backend pool. Example backends in the pool are named *backend-1* and *backend-2*. Both backends are in the highest priority group; within the group, *backend-1* has a greater weight than *backend-2* .
+Use the portal, API Management [REST API](/rest/api/apimanagement/backend), or a Bicep or ARM template to configure a backend pool. In the following example, the backend *myBackendPool* in the API Management instance *myAPIM* is configured with a backend pool. Example backends in the pool are named *backend-1* and *backend-2*. Both backends are in the highest priority group; within the group, *backend-1* has a greater weight than *backend-2*.
 
 
 #### [Portal](#tab/portal)
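The cookie handling described in this hunk implies a client exchange roughly like the following sketch. The hostname, route, cookie name, and values are placeholders, and the exact cookie depends on your gateway configuration:

```console
# First call in a session: the gateway responds with a session cookie (placeholder values)
curl -i https://contoso.azure-api.net/assistants/threads \
  -H "Ocp-Apim-Subscription-Key: <key>" -H "Content-Type: application/json" -d '{}'
#   ...
#   Set-Cookie: SessionId=abc123; Path=/threads/thread_x1y2;

# Later calls in the same session: replay the stored cookie so requests reach the same backend
curl https://contoso.azure-api.net/assistants/threads/thread_x1y2/messages \
  -H "Ocp-Apim-Subscription-Key: <key>" \
  -H "Cookie: SessionId=abc123"
```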
@@ -266,7 +313,9 @@ Use the portal, API Management [REST API](/rest/api/apimanagement/backend), or a
 
 #### [Bicep](#tab/bicep)
 
-Include a snippet similar to the following in your Bicep file for a load-balanced pool. Set the `type` property of the backend entity to `Pool` and specify the backends in the pool:
+Include a snippet similar to the following in your Bicep file for a load-balanced pool. Set the `type` property of the backend entity to `Pool` and specify the backends in the pool.
+
+This example includes an optional `sessionAffinity` pool configuration for session awareness. It sets a cookie so that requests from a user session are routed to a specific backend in the pool.
 
 ```bicep
 resource symbolicname 'Microsoft.ApiManagement/service/backends@2023-09-01-preview' = {
@@ -286,14 +335,23 @@ resource symbolicname 'Microsoft.ApiManagement/service/backends@2023-09-01-previ
           priority: 1
           weight: 1
         }
-      ]
+      ],
+      "sessionAffinity": {
+        "sessionId": {
+          "source": "Cookie",
+          "name": "SessionId"
+        }
+      }
     }
   }
 }
 ```
 #### [ARM](#tab/arm)
 
-Include a JSON snippet similar to the following in your ARM template for a load-balanced pool. Set the `type` property of the backend resource to `Pool` and specify the backends in the pool:
+Include a JSON snippet similar to the following in your ARM template for a load-balanced pool. Set the `type` property of the backend resource to `Pool` and specify the backends in the pool.
+
+This example includes an optional `sessionAffinity` pool configuration for session awareness. It sets a cookie so that requests from a user session are routed to a specific backend in the pool.
+
 
 ```json
 {
@@ -315,7 +373,13 @@ Include a JSON snippet similar to the following in your ARM template for a load-
           "priority": "1",
           "weight": "1"
         }
-      ]
+      ],
+      "sessionAffinity": {
+        "sessionId": {
+          "source": "Cookie",
+          "name": "SessionId"
+        }
+      }
     }
   }
 }

articles/application-gateway/self-signed-certificates.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ The Application Gateway v2 SKU introduces the use of Trusted Root Certificates t
 Application Gateway trusts your website's certificate by default if it's signed by a well-known CA (for example, GoDaddy or DigiCert). You don't need to explicitly upload the root certificate in that case. For more information, see [Overview of TLS termination and end to end TLS with Application Gateway](ssl-overview.md). However, if you have a dev/test environment and don't want to purchase a verified CA signed certificate, you can create your own custom Root CA and a leaf certificate signed by that Root CA.
 
 > [!NOTE]
-> Self-generated certificates are not trusted by default, and can be difficult to maintain. Also, they may use outdated hash and cipher suites that may not be strong. For better security, purchase a certificate signed by a well-known certificate authority.
+> Self-generated certificates aren't trusted by default and can be difficult to maintain. Also, they may use outdated hash and cipher suites that may not be strong. For better security, purchase a certificate signed by a well-known certificate authority.
 
 **You can use the following options to generate your private certificate for backend TLS connections.**
 1. Use the one-click private [**certificate generator tool**](https://appgwbackendcertgenerator.azurewebsites.net/). Using the domain name (Common Name) that you provide, this tool performs the same steps as documented in this article to generate Root and Server certificates. With the generated certificate files, you can immediately upload the Root certificate (.CER) file to the Backend Setting of your gateway and the corresponding certificate chain (.PFX) to the backend server. The password for the PFX file is also supplied in the downloaded ZIP file.
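For orientation only (the article's own steps and the generator tool above are authoritative), creating a custom Root CA and a leaf certificate signed by it generally follows this openssl pattern, with placeholder key and common names:

```console
# Root CA key and self-signed Root certificate (placeholder names)
openssl ecparam -out contoso.key -name prime256v1 -genkey
openssl req -new -sha256 -key contoso.key -out contoso.csr -subj "/CN=contoso.com"
openssl x509 -req -sha256 -days 365 -in contoso.csr -signkey contoso.key -out contoso.crt

# Server (leaf) key and CSR, signed by the Root CA
openssl ecparam -out fabrikam.key -name prime256v1 -genkey
openssl req -new -sha256 -key fabrikam.key -out fabrikam.csr -subj "/CN=www.fabrikam.com"
openssl x509 -req -in fabrikam.csr -CA contoso.crt -CAkey contoso.key -CAcreateserial -out fabrikam.crt -days 365 -sha256
```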

articles/automation/troubleshoot/shared-resources.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ A module is stuck in the *Importing* state when you're importing or updating you
 
 #### Cause
 
-Because importing PowerShell modules is a complex, multistep process, a module might not import correctly, and can be stuck in a transient state. To learn more about the import process, see [Importing a PowerShell module](/powershell/scripting/developer/module/importing-a-powershell-module#the-importing-process).
+Because importing PowerShell modules is a complex, multistep process, a module might not import correctly and can be stuck in a transient state. To learn more about the import process, see [Importing a PowerShell module](/powershell/scripting/developer/module/importing-a-powershell-module#the-importing-process).
 
 #### Resolution

articles/azure-app-configuration/howto-best-practices.md

Lines changed: 1 addition & 1 deletion
@@ -239,7 +239,7 @@ To address these concerns, we recommend that you use a proxy service between you
 
 ## Multitenant applications in App Configuration
 
-A multitenant application is built on an architecture where a shared instance of your application serves multiple customers or tenants. For example, you may have an email service that offers your users separate accounts and customized experiences. Your application usually manages different configurations for each tenant. Here are some architectural considerations for [using App Configuration in a multitenant application](/azure/architecture/guide/multitenant/service/app-configuration).
+A multitenant application is built on an architecture where a shared instance of your application serves multiple customers or tenants. For example, you may have an email service that offers your users separate accounts and customized experiences. Your application usually manages different configurations for each tenant. Here are some architectural considerations for [using App Configuration in a multitenant application](/azure/architecture/guide/multitenant/service/app-configuration). You can also reference the [example code for multitenant application setup](https://github.com/Azure/AppConfiguration/blob/main/examples/DotNetCore/MultiTenantApplicationSetup/README.md).
 
 ## Configuration as Code

articles/azure-cache-for-redis/cache-best-practices-kubernetes.md

Lines changed: 21 additions & 19 deletions
@@ -1,49 +1,51 @@
 ---
-title: Best practices for hosting a Kubernetes client application
-description: Learn how to host a Kubernetes client application.
+title: Best practices for Kubernetes-hosted client apps
+description: Learn about best practices for using Azure Cache for Redis in Kubernetes-hosted client applications.
 ms.custom: linux-related-content, ignite-2024
 ms.topic: conceptual
-ms.date: 11/10/2023
+ms.date: 05/28/2025
 appliesto:
   - ✅ Azure Cache for Redis
 
 ---
 
-# Kubernetes-hosted client application
+# Kubernetes-hosted client applications
 
-## Client connections from multiple pods
+This article provides best practices for using Azure Cache for Redis in Kubernetes-hosted client applications.
 
-When you have multiple pods connecting to a Redis server, make sure the new connections from the pods are created in a staggered manner. If multiple pods start in a short time without staggering, it causes a sudden spike in the number of client connections created. The high number of connections leads to high load on the Redis server and might cause timeouts.
+## Stagger multiple connections
 
-Avoid the same scenario when shutting down multiple pods at the same time. Failing to stagger shutdown might cause a steep dip in the number of connections that leads to CPU pressure.
+Make sure to stagger multiple pod connections to a Redis server. Starting multiple pods in a short time without staggering causes a sudden spike in the number of client connections, leading to high load on the Redis server and possible timeouts.
 
-## Sufficient pod resources
+Also avoid shutting down multiple pods at the same time. Failing to stagger shutdown might cause a steep dip in the number of connections leading to CPU pressure.
 
-Ensure that the pod running your client application is given enough CPU and memory resources. If the client application is running close to its resource limits, it can result in timeouts.
+## Provide sufficient pod resources
 
-## Sufficient node resources
+Make sure to give the pod running your client application enough CPU and memory resources. Client applications running close to their resource limits can lead to timeouts.
 
-A pod running the client application can be affected by other pods running on the same node and throttle Redis connections or IO operations. So always ensure that the node on which your client application pods run have enough memory, CPU, and network bandwidth. Running low on any of these resources could result in connectivity issues.
+## Provide sufficient node resources
 
-## Linux-hosted client applications and TCP settings
+The pod running the client application can be affected by other pods running on the same node, and throttle Redis connections or IO operations. Make sure the nodes that run your client application pods have enough memory, CPU, and network bandwidth. Insufficient amounts of these resources could result in connectivity issues.
 
-If your Azure Cache for Redis client application runs on a Linux-based container, we recommend updating some TCP settings. These settings are detailed in [TCP settings for Linux-hosted client applications](cache-best-practices-connection.md#tcp-settings-for-linux-hosted-client-applications).
+## Check TCP settings for Linux applications
 
-## Potential connection collision with _Istio/Envoy_
+If your Azure Redis client application runs on a Linux-based container, make sure your TCP settings match the [TCP settings for Linux-hosted client applications](cache-best-practices-connection.md#tcp-settings-for-linux-hosted-client-applications).
+
+## Avoid connection collision with Istio
 
 <!-- Currently, Azure Cache for Redis uses ports 15xxx for clustered caches to expose cluster nodes to client applications. As documented [here](https://istio.io/latest/docs/ops/deployment/application-requirements/#ports-used-by-istio), the same ports are also used by _Istio.io_ sidecar proxy called _Envoy_ and could interfere with creating connections, especially on port 15001 and 15006. -->
 
-When using _Istio_ with an Azure Managed Redis cluster, consider excluding the potential collision ports with an [istio annotation](https://istio.io/latest/docs/reference/config/annotations/).
+If you use Istio with an Azure Managed Redis cluster, consider excluding potential collision ports with the following [Istio annotation](https://istio.io/latest/docs/reference/config/annotations/):
 
-```
+```console
 annotations:
   traffic.sidecar.istio.io/excludeOutboundPorts: "15000,15001,15004,15006,15008,15009,15020"
 ```
 
-To avoid connection interference, we recommend:
+To avoid connection interference:
 
-- Consider using a nonclustered cache or an Enterprise tier cache instead
-- Avoid configuring _Istio_ sidecars on pods running Azure Cache for Redis client code
+- Consider using a nonclustered cache or an Azure Managed Redis cache instead.
+- Avoid configuring Istio sidecars on pods running Azure Redis client code.
 
 ## Related content
 
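For context, the exclusion annotation from the hunk above is applied to the client pods themselves, for example in a Deployment's pod template; the names and image below are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-client          # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: redis-client
  template:
    metadata:
      labels:
        app: redis-client
      annotations:
        # Keep the Envoy sidecar from intercepting the ports used by cache cluster nodes
        traffic.sidecar.istio.io/excludeOutboundPorts: "15000,15001,15004,15006,15008,15009,15020"
    spec:
      containers:
        - name: client
          image: contoso/redis-client:latest   # placeholder image
```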
