articles/ai-services/openai/concepts/use-your-data.md (+1, -33)
@@ -383,40 +383,8 @@ You can send a streaming request using the `stream` parameter, allowing data to
#### Conversation history for better results

-When you chat with a model, providing a history of the chat will help the model return higher quality results.
+When you chat with a model, providing a history of the chat will help the model return higher quality results. You don't need to include the `context` property of the assistant messages in your API requests. See [the API reference documentation](../references/on-your-data.md#examples) for examples.
-```json
-{
-    "dataSources": [
-        {
-            "type": "AzureCognitiveSearch",
-            "parameters": {
-                "endpoint": "'$AZURE_AI_SEARCH_ENDPOINT'",
-                "key": "'$AZURE_AI_SEARCH_API_KEY'",
-                "indexName": "'$AZURE_AI_SEARCH_INDEX'"
-            }
-        }
-    ],
-    "messages": [
-        {
-            "role": "user",
-            "content": "What are the differences between Azure Machine Learning and Azure AI services?"
-        },
-        {
-            "role": "tool",
-            "content": "{\"citations\": [{\"content\": \"title: Azure AI services and Machine Learning\\ntitleSuffix: Azure AI services\\ndescription: Learn where Azure AI services fits in with other Azure offerings for machine learning.\\nAzure AI services and machine learning\\nAzure AI services provides machine learning capabilities to solve general problems such as...\\n \"articles\\\\cognitive-services\\\\cognitive-services-and-machine-learning.md\", \"url\": null, \"metadata\": {\"chunking\": \"orignal document size=1018. Scores=0.32200050354003906 and 1.2880020141601562.Org Highlight count=115.\"}, \"chunk_id\": \"0\"}], \"intent\": \"[\\\"What are the differences between Azure Machine Learning and Azure AI services?\\\"]\"}"
-        },
-        {
-            "role": "assistant",
-            "content": "\nAzure Machine Learning is a product and service tailored for data scientists to build, train, and deploy machine learning models [doc1]..."
-        },
-        {
-            "role": "user",
-            "content": "How do I use Azure machine learning?"
-        }
-    ]
-}
-```
## Token usage estimation for Azure OpenAI On Your Data
@@ -48,7 +48,6 @@ The request body inherits the same schema of chat completions API request. This
|Name | Type | Required | Description |
|--- | --- | --- | --- |
-|`messages`|[ChatMessage](#chat-message)[]| True | The array of messages to generate chat completions for, in the chat format. The [request chat message](#chat-message) has a `context` property, which is added for Azure OpenAI On Your Data.|
|`data_sources`|[DataSource](#data-source)[]| True | The configuration entries for Azure OpenAI On Your Data. There must be exactly one element in the array. If `data_sources` is not provided, the service uses the chat completions model directly, and does not use Azure OpenAI On Your Data.|
## Response body
@@ -57,17 +56,17 @@ The response body inherits the same schema of chat completions API response. The
## Chat message
-In both request and response, when the chat message `role` is `assistant`, the chat message schema inherits from the chat completions assistant chat message, and is extended with the property `context`.
+The response assistant message schema inherits from the chat completions assistant [chat message](../reference.md#chatmessage), and is extended with the property `context`.
|Name | Type | Required | Description |
|--- | --- | --- | --- |
-|`context`|[Context](#context)| False | Represents the incremental steps performed by Azure OpenAI On Your Data while processing the request, including the detected search intent and the retrieved documents. |
+|`context`|[Context](#context)| False | Represents the incremental steps performed by Azure OpenAI On Your Data while processing the request, including the retrieved documents. |
## Context
|Name | Type | Required | Description |
|--- | --- | --- | --- |
-|`citations`|[Citation](#citation)[]| False | The data source retrieval result, used to generate the assistant message in the response.|
-|`intent`| string | False | The detected intent from the chat history, used to pass to the next turn to carry over the context.|
+|`citations`|[Citation](#citation)[]| False | The data source retrieval result, used to generate the assistant message in the response. Clients can render references from the citations. |
+|`intent`| string | False | The detected intent from the chat history. Passing back the previous intent is no longer needed. Ignore this property. |
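To illustrate the tables above, a response assistant message carrying a `context` object might look like the following sketch. The field set is abbreviated and the values are illustrative assumptions based on this reference, not verbatim service output:

```json
{
    "role": "assistant",
    "content": "Azure Machine Learning is a product and service tailored for data scientists... [doc1]",
    "context": {
        "citations": [
            {
                "content": "Azure AI services provides machine learning capabilities to solve general problems...",
                "url": null,
                "chunk_id": "0"
            }
        ],
        "intent": "[\"What are the differences between Azure Machine Learning and Azure AI services?\"]"
    }
}
```

Per the table above, clients can render references from `citations`, and `intent` can be ignored.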
## Citation
@@ -91,7 +90,7 @@ This list shows the supported data sources.
## Examples
-This example shows how to pass context with conversation history for better results.
+This example shows how to pass conversation history for better results.
Prerequisites:
* Configure the role assignments from the Azure OpenAI system-assigned managed identity to the Azure Search service. Required roles: `Search Index Data Reader`, `Search Service Contributor`.
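Following the updated guidance, a trimmed request that carries conversation history without any assistant `context` or tool messages might look like the sketch below. The field names mirror the example earlier in this diff; the endpoint, key, and index values are shell-substituted placeholders:

```json
{
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": "'$AZURE_AI_SEARCH_ENDPOINT'",
                "key": "'$AZURE_AI_SEARCH_API_KEY'",
                "indexName": "'$AZURE_AI_SEARCH_INDEX'"
            }
        }
    ],
    "messages": [
        {
            "role": "user",
            "content": "What are the differences between Azure Machine Learning and Azure AI services?"
        },
        {
            "role": "assistant",
            "content": "Azure Machine Learning is a product and service tailored for data scientists..."
        },
        {
            "role": "user",
            "content": "How do I use Azure machine learning?"
        }
    ]
}
```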
articles/aks/cluster-autoscaler-overview.md (+9, -6)
@@ -31,14 +31,16 @@ It's a common practice to enable cluster autoscaler for nodes and either the Ver
* To **effectively run workloads concurrently on both Spot and Fixed node pools**, consider using [*priority expanders*](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-expanders). This approach allows you to schedule pods based on the priority of the node pool.
* Exercise caution when **assigning CPU/Memory requests on pods**. The cluster autoscaler scales up based on pending pods rather than CPU/Memory pressure on nodes.
* For **clusters concurrently hosting both long-running workloads, like web apps, and short/bursty job workloads**, we recommend separating them into distinct node pools with [Affinity Rules](./operator-best-practices-advanced-scheduler.md#node-affinity)/[expanders](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-expanders) or using [PriorityClass](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass) to help prevent unnecessary node drain or scale down operations.
-* We **don't recommend making direct changes to nodes in autoscaled node pools**. All nodes in the same node group should have uniform capacity, labels, and system pods running on them.
+* In an autoscaler-enabled node pool, scale down nodes by removing workloads instead of manually reducing the node count. Manually reducing the count can be problematic if the node pool is already at maximum capacity or if there are active workloads running on the nodes, potentially causing unexpected behavior by the cluster autoscaler.
* Nodes don't scale up if pods have a PriorityClass value below -10. Priority -10 is reserved for [overprovisioning pods](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-can-i-configure-overprovisioning-with-cluster-autoscaler). For more information, see [Using the cluster autoscaler with Pod Priority and Preemption](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-does-cluster-autoscaler-work-with-pod-priority-and-preemption).
* **Don't combine other node autoscaling mechanisms**, such as Virtual Machine Scale Set autoscalers, with the cluster autoscaler.
* The cluster autoscaler **might be unable to scale down if pods can't move, such as in the following situations**:
* A directly created pod not backed by a controller object, such as a Deployment or ReplicaSet.
* A pod disruption budget (PDB) that's too restrictive and doesn't allow the number of pods to fall below a certain threshold.
* A pod uses node selectors or anti-affinity that can't be honored if scheduled on a different node.
For more information, see [What types of pods can prevent the cluster autoscaler from removing a node?](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node).
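As a concrete illustration of the priority cutoff noted in the bullets above, an overprovisioning placeholder `PriorityClass` uses the reserved value -10, as described in the upstream cluster autoscaler FAQ. This is a sketch; the name and description are arbitrary:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning          # hypothetical name
value: -10                        # reserved for overprovisioning placeholder pods
globalDefault: false
description: "Placeholder pods at this priority don't trigger cluster autoscaler scale-up."
```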
+> [!IMPORTANT]
+> **Do not make changes to individual nodes within the autoscaled node pools**. All nodes in the same node group should have uniform capacity, labels, taints, and system pods running on them.
## Cluster autoscaler profile
@@ -52,21 +54,22 @@ It's important to note that the cluster autoscaler profile settings are cluster-
#### Example 1: Optimizing for performance
-For clusters that handle substantial and bursty workloads with a primary focus on performance, we recommend increasing the `scan-interval` and decreasing the `scale-down-utilization-threshold`. These settings help batch multiple scaling operations into a single call, optimizing scaling time and the utilization of compute read/write quotas. It also helps mitigate the risk of swift scale down operations on underutilized nodes, enhancing the pod scheduling efficiency.
+For clusters that handle substantial and bursty workloads with a primary focus on performance, we recommend increasing the `scan-interval` and decreasing the `scale-down-utilization-threshold`. These settings help batch multiple scaling operations into a single call, optimizing scaling time and the utilization of compute read/write quotas. It also helps mitigate the risk of swift scale down operations on underutilized nodes, enhancing the pod scheduling efficiency. Also increase `ok-total-unready-count` and `max-total-unready-percentage`.
-For clusters with daemonset pods, we recommend setting `ignore-daemonset-utilization` to `true`, which effectively ignores node utilization by daemonset pods and minimizes unnecessary scale down operations.
+For clusters with daemonset pods, we recommend setting `ignore-daemonset-utilization` to `true`, which effectively ignores node utilization by daemonset pods and minimizes unnecessary scale down operations. See the [profile for bursty workloads](./cluster-autoscaler.md#configure-cluster-autoscaler-profile-for-bursty-workloads).
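The performance-oriented settings above can be combined into a single profile string, as in the sketch below. The parameter names come from this article; the specific values and the resource group and cluster names are illustrative assumptions to tune for your workload:

```shell
#!/bin/sh
# Performance-oriented cluster autoscaler profile.
# Values are illustrative assumptions, not recommendations.
PROFILE="scan-interval=30s,scale-down-utilization-threshold=0.3,ignore-daemonset-utilization=true,ok-total-unready-count=6,max-total-unready-percentage=60"

# Applying it would look like this (hypothetical resource group and cluster names):
# az aks update --resource-group myResourceGroup --name myAKSCluster \
#   --cluster-autoscaler-profile "$PROFILE"

echo "$PROFILE"
```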
#### Example 2: Optimizing for cost
-If you want a cost-optimized profile, we recommend setting the following parameter configurations:
-
+If you want a [cost-optimized profile](./cluster-autoscaler.md#configure-cluster-autoscaler-profile-for-aggressive-scale-down), we recommend setting the following parameter configurations:
* Reduce `scale-down-unneeded-time`, which is the amount of time a node should be unneeded before it's eligible for scale down.
* Reduce `scale-down-delay-after-add`, which is the amount of time to wait after a node is added before considering it for scale down.
* Increase `scale-down-utilization-threshold`, which is the utilization threshold for removing nodes.
* Increase `max-empty-bulk-delete`, which is the maximum number of nodes that can be deleted in a single call.
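The cost-optimization bullets above can likewise be sketched as one profile string. The parameter names are from this list; the values and the resource group and cluster names are illustrative assumptions:

```shell
#!/bin/sh
# Cost-optimized cluster autoscaler profile.
# Values are illustrative assumptions, not recommendations.
PROFILE="scale-down-unneeded-time=3m,scale-down-delay-after-add=5m,scale-down-utilization-threshold=0.7,max-empty-bulk-delete=20"

# Hypothetical invocation:
# az aks update --resource-group myResourceGroup --name myAKSCluster \
#   --cluster-autoscaler-profile "$PROFILE"

echo "$PROFILE"
```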
articles/aks/cluster-autoscaler.md (+31, -4)
@@ -195,6 +195,24 @@ The following table lists the available settings for the cluster autoscaler prof
--cluster-autoscaler-profile scan-interval=30s
```
+### Configure cluster autoscaler profile for aggressive scale down
+
+> [!NOTE]
+> Scaling down aggressively is not recommended for clusters that experience frequent scale-outs and scale-ins within short intervals, as it could result in extended node provisioning times under these circumstances. Increasing `scale-down-delay-after-add` can help by keeping nodes around longer to handle incoming workloads.
### Reset cluster autoscaler profile to default values
* Reset the cluster autoscaler profile using the [`az aks update`][az-aks-update-preview] command.
@@ -206,12 +224,11 @@ The following table lists the available settings for the cluster autoscaler prof
--cluster-autoscaler-profile ""
```
-## Retrieve cluster autoscaler logs and status updates
+## Retrieve cluster autoscaler logs and status
You can retrieve logs and status updates from the cluster autoscaler to help diagnose and debug autoscaler events. AKS manages the cluster autoscaler on your behalf and runs it in the managed control plane. You can enable control plane logs to see the logs and operations from the cluster autoscaler.
### [Azure CLI](#tab/azure-cli)
-
1. Set up a rule for resource logs to push cluster autoscaler logs to Log Analytics using the [instructions here][aks-view-master-logs]. Make sure you check the box for `cluster-autoscaler` when selecting options for **Logs**.
2. Select the **Log** section on your cluster.
3. Enter the following example query into Log Analytics:
@@ -224,8 +241,16 @@ You can retrieve logs and status updates from the cluster autoscaler to help dia
As long as there are logs to retrieve, you should see logs similar to the following:
:::image type="content" source="media/cluster-autoscaler/autoscaler-logs.png" alt-text="Screenshot of Log Analytics logs.":::
-
-The cluster autoscaler also writes out the health status to a `configmap` named `cluster-autoscaler-status`. You can retrieve these logs using the following `kubectl` command:
+
+4. View cluster autoscaler scale-up not triggered events on the CLI:
+```bash
+kubectl get events --field-selector source=cluster-autoscaler,reason=NotTriggerScaleUp
+```
+5. View cluster autoscaler warning events on the CLI:
+```bash
+kubectl get events --field-selector source=cluster-autoscaler,type=Warning
+```
+6. The cluster autoscaler also writes out the health status to a `configmap` named `cluster-autoscaler-status`. You can retrieve these logs using the following `kubectl` command:
```bash
kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml
@@ -244,6 +269,8 @@ You can retrieve logs and status updates from the cluster autoscaler to help dia
---
For more information, see the [Kubernetes/autoscaler GitHub project FAQ][kubernetes-faq].
+## Cluster autoscaler metrics
+
+You can enable [control plane metrics (Preview)](./monitor-control-plane-metrics.md) to see the logs and operations from the [cluster autoscaler](./control-plane-metrics-default-list.md#minimal-ingestion-for-default-off-targets) with the [Azure Monitor managed service for Prometheus add-on](../azure-monitor/essentials/prometheus-metrics-overview.md).
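Once the Prometheus add-on is scraping control plane metrics, autoscaler health can be inspected with queries along these lines. The metric names follow the upstream cluster autoscaler's Prometheus metrics and are assumptions here; confirm them against the default-off target list linked above:

```
# Pods the autoscaler currently considers unschedulable
cluster_autoscaler_unschedulable_pods_count

# Node count the autoscaler tracks, by state
cluster_autoscaler_nodes_count
```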
articles/aks/create-nginx-ingress-private-controller.md (+2, -2)
@@ -1,5 +1,5 @@
---
-title: Configure internal NGIX ingress controller for Azure private DNS zone
+title: Configure internal NGINX ingress controller for Azure private DNS zone
description: Understand how to configure an ingress controller with a private IP address and an Azure private DNS zone using the application routing add-on for Azure Kubernetes Service.
ms.subservice: aks-networking
ms.custom: devx-track-azurecli
@@ -320,4 +320,4 @@ For other configuration information related to SSL encryption other advanced NGI