|
9 | 9 | "## Backend pool Load Balancing lab\n", |
10 | 10 | "\n", |
11 | 11 | "\n", |
12 | | - "Playground to try the built-in load balancing [backend pool functionality of APIM](https://learn.microsoft.com/azure/api-management/backends?tabs=bicep) to either a list of Azure OpenAI endpoints.\n", |
| 12 | + "Playground to try the built-in load balancing [backend pool functionality of APIM](https://learn.microsoft.com/azure/api-management/backends?tabs=bicep) to a list of Azure OpenAI endpoints.\n", |
13 | 13 | "\n", |
14 | 14 | "Notes:\n", |
15 | | - "- The backend pool uses round-robin by default\n", |
16 | | - "- Priority and weight-based routing are also supported: Adjust the `priority` (the lower the number, the higher the priority) and `weight` parameters in the `openai_resources` variable\n", |
17 | | - "- The `retry` API Management policy initiates a retry to an available backend if an HTTP 429 status code is encountered\n", |
|    | 15 | + "- **This is a typical scenario: a prioritized PTU deployment with consumption fallback**. The lab showcases how a priority 1 (highest) backend is exhausted before traffic gracefully falls back to two equally weighted priority 2 backends.\n", |
| 16 | + "- The backend pool uses round-robin by default.\n", |
| 17 | + "- Priority and weight-based routing are supported and can be adjusted by modifying `priority` (the lower the number, the higher the priority) and `weight` parameters in the `openai_resources` variable below.\n", |
| 18 | + "- The `retry` API Management policy initiates a retry to an available backend if an HTTP 429 status code is encountered. This is transparent to the caller.\n", |
18 | 19 | "\n", |
19 | 20 | "### Result\n", |
20 | 21 | "\n", |
|
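The priority/weight scheme described in the notes above can be sketched in Python. This is a minimal simulation, not APIM's actual routing logic; the backend names, quota state, and the exact shape of `openai_resources` are assumptions for illustration (lower `priority` wins, `weight` splits traffic within a priority tier):

```python
import random

# Hypothetical shape of the `openai_resources` variable described in the notes:
# one priority-1 PTU backend, two equally weighted priority-2 fallbacks.
openai_resources = [
    {"name": "openai-ptu", "priority": 1, "weight": 100},
    {"name": "openai-eastus", "priority": 2, "weight": 50},
    {"name": "openai-westus", "priority": 2, "weight": 50},
]

def pick_backend(resources, throttled):
    """Pick a backend: highest-priority available tier first, weighted within it."""
    available = [r for r in resources if r["name"] not in throttled]
    if not available:
        raise RuntimeError("all backends throttled")
    top = min(r["priority"] for r in available)
    tier = [r for r in available if r["priority"] == top]
    return random.choices(tier, weights=[r["weight"] for r in tier])[0]

# While the PTU backend has capacity, it always wins:
print(pick_backend(openai_resources, throttled=set())["name"])  # openai-ptu

# Once it is throttled, traffic falls to the priority-2 tier:
fallback = pick_backend(openai_resources, throttled={"openai-ptu"})
print(fallback["priority"])  # 2
```

This mirrors the lab's scenario: the priority-1 backend is exhausted first, then the two weight-50 backends share the load.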
132 | 133 | "with open(\"policy.xml\", 'r') as policy_xml_file:\n", |
133 | 134 | " policy_template_xml = policy_xml_file.read()\n", |
134 | 135 | " if \"{backend-id}\" in policy_template_xml:\n", |
135 | | - " policy_xml = policy_template_xml.replace(\"{backend-id}\", str(\"openai-backend-pool\" if len(openai_resources) > 1 else openai_resources[0].get(\"name\"))) \n", |
| 136 | + " policy_xml = policy_template_xml.replace(\"{backend-id}\", str(\"openai-backend-pool\" if len(openai_resources) > 1 else openai_resources[0].get(\"name\")))\n", |
136 | 137 | " policy_xml_file.close()\n", |
137 | 138 | "if policy_xml is not None:\n", |
138 | 139 | " open(\"policy.xml\", 'w').write(policy_xml)\n", |
|
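The `retry` policy behavior noted earlier (a 429 from one backend triggers a transparent retry against the next available backend) can be simulated in a few lines of Python. This is an assumption-laden sketch, not the actual APIM policy engine; the backend names and response codes are made up:

```python
# Hypothetical simulation of APIM's retry-on-429: try backends in order,
# moving to the next one whenever the current backend is throttled.
def call_with_retry(backends, send):
    """`send(name)` returns an HTTP status code; retry on 429, as the policy does."""
    for name in backends:
        status = send(name)
        if status != 429:
            return name, status  # any non-throttle response ends the retry loop
    return None, 429             # every backend was throttled

# Toy transport: the PTU backend is exhausted, the fallback is not.
responses = {"openai-ptu": 429, "openai-eastus": 200}
backend, status = call_with_retry(["openai-ptu", "openai-eastus"], responses.get)
print(backend, status)  # openai-eastus 200
```

The caller only ever sees the final 200; the intermediate 429 is absorbed inside the gateway, which is what makes the fallback transparent.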
152 | 153 | " }\n", |
153 | 154 | "}\n", |
154 | 155 | "\n", |
155 | | - "# write the parameters to a file \n", |
| 156 | + "# write the parameters to a file\n", |
156 | 157 | "with open('params.json', 'w') as bicep_parameters_file:\n", |
157 | 158 | " bicep_parameters_file.write(json.dumps(bicep_parameters))\n", |
158 | 159 | "\n", |
159 | 160 | "# run the deployment\n", |
160 | | - "output = utils.run(f\"az deployment group create --name {deployment_name} --resource-group {resource_group_name} --template-file main.bicep --parameters params.json\", \n", |
| 161 | + "output = utils.run(f\"az deployment group create --name {deployment_name} --resource-group {resource_group_name} --template-file main.bicep --parameters params.json\",\n", |
161 | 162 | " f\"Deployment '{deployment_name}' succeeded\", f\"Deployment '{deployment_name}' failed\")\n", |
162 | 163 | "open(\"policy.xml\", 'w').write(policy_template_xml)\n", |
163 | 164 | "\n" |
|
370 | 371 | ], |
371 | 372 | "metadata": { |
372 | 373 | "kernelspec": { |
373 | | - "display_name": "Python 3", |
| 374 | + "display_name": ".venv", |
374 | 375 | "language": "python", |
375 | 376 | "name": "python3" |
376 | 377 | }, |
|
384 | 385 | "name": "python", |
385 | 386 | "nbconvert_exporter": "python", |
386 | 387 | "pygments_lexer": "ipython3", |
387 | | - "version": "3.12.8" |
| 388 | + "version": "3.12.0" |
388 | 389 | } |
389 | 390 | }, |
390 | 391 | "nbformat": 4, |
|