`articles/ai-services/openai/concepts/use-your-data.md` (+1 −1)

```diff
@@ -122,7 +122,7 @@ You can modify the following additional settings in the **Data parameters** sect
 |Parameter name | Description |
 |---------|---------|
-|**Retrieved documents**| Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. The default value is 3. |
+|**Retrieved documents**| Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. The default value is 3. This is the `topNDocuments` parameter in the API.|
 |**Strictness**| Sets the threshold to categorize documents as relevant to your queries. Raising the value means a higher threshold for relevance and filters out more less-relevant documents for responses. Setting this value too high might cause the model to fail to generate responses due to limited available documents. The default value is 3. |

 ## Virtual network support & private endpoint support
```
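For readers setting this value through the API rather than the studio UI, a minimal sketch of where `topNDocuments` sits in the request body. Only `topNDocuments` itself comes from the change above; the surrounding field names (`parameters`, `indexName`) follow the reference table later in this PR, and the index name is a placeholder:

```python
# "Retrieved documents" in Azure OpenAI Studio maps to the API's
# `topNDocuments` field inside dataSources[].parameters (default: 3).
data_source = {
    "parameters": {
        "indexName": "YOUR_INDEX_NAME",  # placeholder index name
        "topNDocuments": 5,              # raise for short docs / more context
    }
}

assert data_source["parameters"]["topNDocuments"] == 5
```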
`articles/ai-services/openai/reference.md` (+2 −4)

```diff
@@ -381,9 +381,6 @@ curl -i -X POST YOUR_RESOURCE_NAME/openai/deployments/YOUR_DEPLOYMENT_NAME/exten
 |`stream`| boolean | Optional | false | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a message `"messages": [{"delta": {"content": "[DONE]"}, "index": 2, "end_turn": true}]`|
 |`stop`| string or array | Optional | null | Up to 2 sequences where the API will stop generating further tokens. |
 |`max_tokens`| integer | Optional | 1000 | The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return is `4096 - prompt_tokens`. |
-|`retrieved_documents`| number | Optional | 3 | Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. |
-|`strictness`| number | Optional | 3 | Sets the threshold to categorize documents as relevant to your queries. Raising the value means a higher threshold for relevance and filters out more less-relevant documents for responses. Setting this value too high might cause the model to fail to generate responses due to limited available documents. |

 The following parameters can be used inside of the `parameters` field inside of `dataSources`.
@@ -395,14 +392,15 @@ The following parameters can be used inside of the `parameters` field inside of
 |`indexName`| string | Required | null | The search index to be used. |
 |`fieldsMapping`| dictionary | Optional | null | Index data column mapping. |
 |`inScope`| boolean | Optional | true | If set, this value will limit responses specific to the grounding data content. |
-|`topNDocuments`| number | Optional |5|Number of documents that need to be fetched for document augmentation.|
+|`topNDocuments`| number | Optional |3|Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. This is the *retrieved documents* parameter in Azure OpenAI Studio. |
 |`queryType`| string | Optional | simple | Indicates which query option will be used for Azure Cognitive Search. Available types: `simple`, `semantic`, `vector`, `vectorSimpleHybrid`, `vectorSemanticHybrid`. |
 |`semanticConfiguration`| string | Optional | null | The semantic search configuration. Only required when `queryType` is set to `semantic` or `vectorSemanticHybrid`. |
 |`roleInformation`| string | Optional | null | Gives the model instructions about how it should behave and the context it should reference when generating a response. Corresponds to the "System Message" in Azure OpenAI Studio. See [Using your data](./concepts/use-your-data.md#system-message) for more information. There's a 100 token limit, which counts towards the overall token limit.|
 | `filter` | string | Optional | null | The filter pattern used for [restricting access to sensitive documents](./concepts/use-your-data.md#document-level-access-control). |
 |`embeddingEndpoint`| string | Optional | null | The endpoint URL for an Ada embedding model deployment, generally of the format `https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings?api-version=2023-05-15`. Use with the `embeddingKey` parameter for [vector search](./concepts/use-your-data.md#search-options) outside of private networks and private endpoints. |
 |`embeddingKey`| string | Optional | null | The API key for an Ada embedding model deployment. Use with `embeddingEndpoint` for [vector search](./concepts/use-your-data.md#search-options) outside of private networks and private endpoints. |
 |`embeddingDeploymentName`| string | Optional | null | The Ada embedding model deployment name within the same Azure OpenAI resource. Used instead of `embeddingEndpoint` and `embeddingKey` for [vector search](./concepts/use-your-data.md#search-options). Should only be used when both the `embeddingEndpoint` and `embeddingKey` parameters are defined. When this parameter is provided, Azure OpenAI on your data will use an internal call to evaluate the Ada embedding model, rather than calling the Azure OpenAI endpoint. This enables you to use vector search in private networks and private endpoints. Billing remains the same whether this parameter is defined or not. Available in regions where embedding models are [available](./concepts/models.md#embeddings-models) starting in API versions `2023-06-01-preview` and later.|
+|`strictness`| number | Optional | 3 | Sets the threshold to categorize documents as relevant to your queries. Raising the value means a higher threshold for relevance and filters out more less-relevant documents for responses. Setting this value too high might cause the model to fail to generate responses due to limited available documents. |
```
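The net effect of this file's changes is that retrieval tuning moves out of the top level of the request: `retrieved_documents` becomes `topNDocuments`, and `strictness` now lives inside `dataSources[].parameters`. A hedged sketch of a request body after the change; the `type` value and the message content are illustrative assumptions, while the parameter names come from the tables above:

```python
import json

body = {
    # Top-level request options (first table):
    "stream": False,
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "What does my plan cover?"}],
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",  # assumed data-source type
            "parameters": {
                "indexName": "YOUR_INDEX_NAME",  # placeholder
                "inScope": True,
                "queryType": "simple",
                "topNDocuments": 3,  # formerly top-level `retrieved_documents`
                "strictness": 3,     # now nested here, not at the top level
            },
        }
    ],
}

# Retrieval tuning no longer appears at the top level of the body:
assert "retrieved_documents" not in body and "strictness" not in body
print(json.dumps(body, indent=2))
```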
```diff
 Memory utilized by AKS includes the sum of two values.
-
-1. **`kubelet` daemon**
-
-   The `kubelet` daemon is installed on all Kubernetes agent nodes to manage container creation and termination.
-
-   By default on AKS, `kubelet` daemon has the *memory.available<750Mi* eviction rule, ensuring a node must always have at least 750Mi allocatable at all times. When a host is below that available memory threshold, the `kubelet` will trigger to terminate one of the running pods and free up memory on the host machine.
-
-2. **A regressive rate of memory reservations** for the kubelet daemon to properly function (*kube-reserved*).
-   - 25% of the first 4 GB of memory
-   - 20% of the next 4 GB of memory (up to 8 GB)
-   - 10% of the next 8 GB of memory (up to 16 GB)
-   - 6% of the next 112 GB of memory (up to 128 GB)
-   - 2% of any memory above 128 GB
+#### CPU
+
+Reserved CPU is dependent on node type and cluster configuration, which may cause less allocatable CPU due to running additional features.
```

```diff
 Memory utilized by AKS includes the sum of two values.
+
+> [!IMPORTANT]
+> AKS 1.28 includes certain changes to memory reservations. These changes are detailed in the following section.
+
+**AKS 1.28 and later**
+
+1. **`kubelet` daemon** has the *memory.available<100Mi* eviction rule by default. This ensures that a node always has at least 100Mi allocatable at all times. When a host is below that available memory threshold, the `kubelet` triggers the termination of one of the running pods and frees up memory on the host machine.
+2. **A rate of memory reservations** set according to the lesser value of: *20MB * Max Pods supported on the Node + 50MB* or *25% of the total system memory resources*.
+
+**Examples**:
+* If the VM provides 8GB of memory and the node supports up to 30 pods, AKS reserves *20MB * 30 Max Pods + 50MB = 650MB* for kube-reserved. `Allocatable space = 8GB - 0.65GB (kube-reserved) - 0.1GB (eviction threshold) = 7.25GB or 90.625% allocatable.`
+* If the VM provides 4GB of memory and the node supports up to 70 pods, AKS reserves *25% * 4GB = 1000MB* for kube-reserved, as this is less than *20MB * 70 Max Pods + 50MB = 1450MB*.
+
+For more information, see [Configure maximum pods per node in an AKS cluster](./azure-cni-overview.md#maximum-pods-per-node).
+
+**AKS versions prior to 1.28**
+
+1. **`kubelet` daemon** is installed on all Kubernetes agent nodes to manage container creation and termination. By default on AKS, `kubelet` daemon has the *memory.available<750Mi* eviction rule, ensuring a node must always have at least 750Mi allocatable at all times. When a host is below that available memory threshold, the `kubelet` will trigger to terminate one of the running pods and free up memory on the host machine.
+2. **A regressive rate of memory reservations** for the kubelet daemon to properly function (*kube-reserved*).
+   * 25% of the first 4GB of memory
+   * 20% of the next 4GB of memory (up to 8GB)
+   * 10% of the next 8GB of memory (up to 16GB)
+   * 6% of the next 112GB of memory (up to 128GB)
+   * 2% of any memory above 128GB
 
 >[!NOTE]
 > AKS reserves an additional 2GB for system process in Windows nodes that are not part of the calculated memory.
```
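The AKS 1.28 reservation rule is easy to sanity-check against the worked examples. A minimal sketch, assuming (as those examples do) 1 GB = 1000 MB and approximating the 100Mi eviction threshold as 100 MB:

```python
def kube_reserved_mb(node_memory_mb: int, max_pods: int) -> int:
    """AKS 1.28+ kube-reserved memory: the lesser of
    (20 MB * max pods + 50 MB) and 25% of total node memory."""
    return min(20 * max_pods + 50, node_memory_mb // 4)

def allocatable_mb(node_memory_mb: int, max_pods: int) -> int:
    # allocatable = total - kube-reserved - eviction threshold (~100 MB)
    return node_memory_mb - kube_reserved_mb(node_memory_mb, max_pods) - 100

# Reproduce the two examples from the section:
assert kube_reserved_mb(8000, 30) == 650    # 20*30 + 50 is less than 25% of 8 GB
assert kube_reserved_mb(4000, 70) == 1000   # 25% of 4 GB is less than 1450 MB
assert allocatable_mb(8000, 30) == 7250     # 7.25 GB, i.e. 90.625% allocatable
```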
`articles/application-gateway/configuration-frontend-ip.md` (+1 −1)

```diff
@@ -48,7 +48,7 @@ A frontend IP address is associated to a *listener*, which checks for incoming r
 
 > [!IMPORTANT]
 > **The default domain name behavior for V1 SKU**:
-> - Deployments before 1st May 2023: These deployments will continue to have the default domain names like <label>.cloudapp.net mapped to the application gateway's Public IP address.
+> - Deployments before 1st May 2023: These deployments will continue to have the default domain names like \<label>.cloudapp.net mapped to the application gateway's Public IP address.
 > - Deployments after 1st May 2023: For deployments after this date, there will NOT be any default domain name mapped to the gateway's Public IP address. You must manually configure using your domain name by mapping its DNS record to the gateway's IP address.
```