Commit 2117a64

Merge pull request #300121 from dlepow/lbaff
[APIM] Backend session affinity
2 parents f6b29df + e8d2577 commit 2117a64

1 file changed

+74
-10
lines changed

articles/api-management/backends.md

Lines changed: 74 additions & 10 deletions
@@ -5,7 +5,7 @@ services: api-management
55
author: dlepow
66
ms.service: azure-api-management
77
ms.topic: concept-article
8-
ms.date: 04/01/2025
8+
ms.date: 05/20/2025
99
ms.author: danlep
1010
ms.custom:
1111
- build-2024
@@ -239,16 +239,63 @@ Use a backend pool for scenarios such as the following:
239239

240240
API Management supports the following load balancing options for backend pools:
241241

242-
* **Round-robin**: By default, requests are distributed evenly across the backends in the pool.
243-
* **Weighted**: Weights are assigned to the backends in the pool, and requests are distributed across the backends based on the relative weight assigned to each backend. Use this option for scenarios such as conducting a blue-green deployment.
244-
* **Priority-based**: Backends are organized in priority groups, and requests are sent to the backends in order of the priority groups. Within a priority group, requests are distributed either evenly across the backends, or (if assigned) according to the relative weight assigned to each backend.
245-
242+
| Load balancing option | Description |
243+
|------------------|-------------|
244+
| **Round-robin** | Requests are distributed evenly across the backends in the pool by default. |
245+
| **Weighted** | Weights are assigned to the backends in the pool, and requests are distributed based on the relative weight of each backend. Useful for scenarios such as blue-green deployments. |
246+
| **Priority-based** | Backends are organized into priority groups. Requests are sent to higher priority groups first; within a group, requests are distributed evenly or according to assigned weights. |
247+
246248
> [!NOTE]
247249
> Backends in lower priority groups will only be used when all backends in higher priority groups are unavailable because circuit breaker rules are tripped.
248250
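The selection rules in the table and note above can be sketched in a few lines of Python. This is an illustrative model only, not API Management's implementation: the `Backend` class and `pick_backend` function are hypothetical names, it assumes a lower `priority` number means a higher priority group, the `available` flag stands in for a circuit breaker that hasn't tripped, and weighted random choice is used as a stand-in for the gateway's actual distribution logic.

```python
import random
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    priority: int = 1       # assumption: lower number = higher priority group
    weight: int = 1         # relative share of traffic within the group
    available: bool = True  # False models a tripped circuit breaker


def pick_backend(pool, rng=random):
    """Pick a backend from the highest-priority group that has available members."""
    candidates = [b for b in pool if b.available]
    if not candidates:
        raise RuntimeError("no available backends in the pool")
    top = min(b.priority for b in candidates)
    group = [b for b in candidates if b.priority == top]
    # Equal weights give an even split on average; unequal weights skew it.
    return rng.choices(group, weights=[b.weight for b in group], k=1)[0]
```

With `backend-1` (weight 3) and `backend-2` (weight 1) in priority group 1, a backend in group 2 is chosen only after both group 1 backends are marked unavailable.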
251+
### Session awareness
252+
253+
With any of the preceding load balancing options, optionally enable **session awareness** (session affinity) to ensure that all requests from a specific user during a session are directed to the same backend in the pool. API Management sets a session ID cookie to maintain session state. This option is useful, for example, in scenarios with backends such as AI chat assistants or other conversational agents to route requests from the same session to the same endpoint.
254+
255+
> [!NOTE]
256+
> Session awareness in load-balanced pools is being released first to the **AI Gateway Early** [update group](configure-service-update-settings.md).
257+
258+
#### Manage cookies for session awareness
259+
260+
When using session awareness, the client must handle cookies appropriately. The client needs to store the cookie returned in the `Set-Cookie` response header and send it back in the `Cookie` header of subsequent requests to maintain session state.
261+
262+
You can use API Management policies to help set cookies for session awareness. For example, with the Assistants API (a feature of [Azure OpenAI in Azure AI Foundry Models](/azure/ai-services/openai/concepts/models)), the client needs to keep the session ID, extract the thread ID from the response body, store the pair, and send the matching cookie on each call. The client also needs to know when to send a cookie header and when to omit it. The following example policies can help with these requirements:
263+
264+
265+
```xml
266+
<policies>
267+
  <inbound>
268+
    <base />
269+
    <set-backend-service backend-id="APIMBackend" />
270+
  </inbound>
271+
  <backend>
272+
    <base />
273+
  </backend>
274+
  <outbound>
275+
    <base />
276+
    <set-variable name="gwSetCookie" value="@{
277+
      var payload = context.Response.Body.As<JObject>(preserveContent: true);
278+
      var threadId = payload["id"];
279+
      var gwSetCookieHeaderValue = context.Response.Headers.GetValueOrDefault("Set-Cookie", string.Empty);
280+
      if(!string.IsNullOrEmpty(gwSetCookieHeaderValue))
281+
      {
282+
        gwSetCookieHeaderValue = gwSetCookieHeaderValue + $";Path=/threads/{threadId};";
283+
      }
284+
      return gwSetCookieHeaderValue;
285+
    }" />
286+
    <set-header name="Set-Cookie" exists-action="override">
287+
      <value>@((string)context.Variables["gwSetCookie"])</value>
288+
    </set-header>
289+
  </outbound>
290+
  <on-error>
291+
    <base />
292+
  </on-error>
293+
</policies>
294+
```
295+
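On the client side, the cookie handling described above could look roughly like the following Python sketch. It is a minimal illustration under stated assumptions, not an official client: `SessionCookieStore` and its methods are hypothetical names, the thread ID is assumed to come from the `id` field of the response body, and the cookie name `SessionId` is taken from the `sessionAffinity` example in this article.

```python
from http.cookies import SimpleCookie


class SessionCookieStore:
    """Keeps the session cookie for each conversation thread."""

    def __init__(self):
        self._by_thread = {}

    def remember(self, thread_id, set_cookie_header):
        """Store the name=value pairs from a Set-Cookie response header."""
        cookie = SimpleCookie()
        cookie.load(set_cookie_header)
        # Attributes such as Path stay on the morsel and are not sent back.
        pairs = {name: morsel.value for name, morsel in cookie.items()}
        if pairs:
            self._by_thread[thread_id] = pairs

    def headers_for(self, thread_id):
        """Return the Cookie header for a thread, or no header if none is stored."""
        pairs = self._by_thread.get(thread_id)
        if not pairs:
            return {}
        return {"Cookie": "; ".join(f"{k}={v}" for k, v in pairs.items())}
```

A client would call `remember` after the response that creates a thread, then merge `headers_for(thread_id)` into the headers of later requests for that thread. Requests for a thread with no stored cookie go out without a `Cookie` header.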
249296
### Example
250297

251-
Use the portal, API Management [REST API](/rest/api/apimanagement/backend), or a Bicep or ARM template to configure a backend pool. In the following example, the backend *myBackendPool* in the API Management instance *myAPIM* is configured with a backend pool. Example backends in the pool are named *backend-1* and *backend-2*. Both backends are in the highest priority group; within the group, *backend-1* has a greater weight than *backend-2* .
298+
Use the portal, API Management [REST API](/rest/api/apimanagement/backend), or a Bicep or ARM template to configure a backend pool. In the following example, the backend *myBackendPool* in the API Management instance *myAPIM* is configured with a backend pool. Example backends in the pool are named *backend-1* and *backend-2*. Both backends are in the highest priority group; within the group, *backend-1* has a greater weight than *backend-2*.
252299

253300

254301
#### [Portal](#tab/portal)
@@ -266,7 +313,9 @@ Use the portal, API Management [REST API](/rest/api/apimanagement/backend), or a
266313

267314
#### [Bicep](#tab/bicep)
268315

269-
Include a snippet similar to the following in your Bicep file for a load-balanced pool. Set the `type` property of the backend entity to `Pool` and specify the backends in the pool:
316+
Include a snippet similar to the following in your Bicep file for a load-balanced pool. Set the `type` property of the backend entity to `Pool` and specify the backends in the pool.
317+
318+
This example includes an optional `sessionAffinity` pool configuration for session awareness. It sets a cookie so that requests from a user session are routed to a specific backend in the pool.
270319

271320
```bicep
272321
resource symbolicname 'Microsoft.ApiManagement/service/backends@2023-09-01-preview' = {
@@ -286,14 +335,23 @@ resource symbolicname 'Microsoft.ApiManagement/service/backends@2023-09-01-previ
286335
priority: 1
287336
weight: 1
288337
}
289-
]
338+
]
339+
      sessionAffinity: {
340+
        sessionId: {
341+
          source: 'Cookie'
342+
          name: 'SessionId'
343+
        }
344+
      }
290345
}
291346
}
292347
}
293348
```
294349
#### [ARM](#tab/arm)
295350

296-
Include a JSON snippet similar to the following in your ARM template for a load-balanced pool. Set the `type` property of the backend resource to `Pool` and specify the backends in the pool:
351+
Include a JSON snippet similar to the following in your ARM template for a load-balanced pool. Set the `type` property of the backend resource to `Pool` and specify the backends in the pool.
352+
353+
This example includes an optional `sessionAffinity` pool configuration for session awareness. It sets a cookie so that requests from a user session are routed to a specific backend in the pool.
354+
297355

298356
```json
299357
{
@@ -315,7 +373,13 @@ Include a JSON snippet similar to the following in your ARM template for a load-
315373
"priority": "1",
316374
          "weight": "1"
317375
}
318-
]
376+
],
377+
      "sessionAffinity": {
378+
        "sessionId": {
379+
          "source": "Cookie",
380+
          "name": "SessionId"
381+
        }
382+
      }
319383
}
320384
}
321385
}
