
Commit faa932a

Learn Build Service GitHub App authored and committed
Merging changes synced from https://github.com/MicrosoftDocs/azure-docs-pr (branch live)
2 parents deedf0b + ac67d75 commit faa932a


108 files changed (+1577, -2043 lines)


articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 1 addition & 1 deletion
@@ -122,7 +122,7 @@ You can modify the following additional settings in the **Data parameters** sect

|Parameter name | Description |
|---------|---------|
-|**Retrieved documents** | Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. The default value is 3. |
+|**Retrieved documents** | Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. The default value is 3. This is the `topNDocuments` parameter in the API. |
| **Strictness** | Sets the threshold to categorize documents as relevant to your queries. Raising the value means a higher threshold for relevance and filters out more less-relevant documents for responses. Setting this value too high might cause the model to fail to generate responses due to limited available documents. The default value is 3. |

## Virtual network support & private endpoint support

articles/ai-services/openai/reference.md

Lines changed: 2 additions & 4 deletions
@@ -381,9 +381,6 @@ curl -i -X POST YOUR_RESOURCE_NAME/openai/deployments/YOUR_DEPLOYMENT_NAME/exten
| `stream` | boolean | Optional | false | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a message `"messages": [{"delta": {"content": "[DONE]"}, "index": 2, "end_turn": true}]` |
| `stop` | string or array | Optional | null | Up to 2 sequences where the API will stop generating further tokens. |
| `max_tokens` | integer | Optional | 1000 | The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return is `4096 - prompt_tokens`. |
-| `retrieved_documents` | number | Optional | 3 | Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. |
-| `strictness` | number | Optional | 3 | Sets the threshold to categorize documents as relevant to your queries. Raising the value means a higher threshold for relevance and filters out more less-relevant documents for responses. Setting this value too high might cause the model to fail to generate responses due to limited available documents. |
-

The following parameters can be used inside of the `parameters` field inside of `dataSources`.

@@ -395,14 +392,15 @@ The following parameters can be used inside of the `parameters` field inside of
| `indexName` | string | Required | null | The search index to be used. |
| `fieldsMapping` | dictionary | Optional | null | Index data column mapping. |
| `inScope` | boolean | Optional | true | If set, this value will limit responses specific to the grounding data content. |
-| `topNDocuments` | number | Optional | 5 | Number of documents that need to be fetched for document augmentation. |
+| `topNDocuments` | number | Optional | 3 | Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. This is the *retrieved documents* parameter in Azure OpenAI Studio. |
| `queryType` | string | Optional | simple | Indicates which query option will be used for Azure Cognitive Search. Available types: `simple`, `semantic`, `vector`, `vectorSimpleHybrid`, `vectorSemanticHybrid`. |
| `semanticConfiguration` | string | Optional | null | The semantic search configuration. Only required when `queryType` is set to `semantic` or `vectorSemanticHybrid`. |
| `roleInformation` | string | Optional | null | Gives the model instructions about how it should behave and the context it should reference when generating a response. Corresponds to the "System Message" in Azure OpenAI Studio. See [Using your data](./concepts/use-your-data.md#system-message) for more information. There’s a 100 token limit, which counts towards the overall token limit.|
| `filter` | string | Optional | null | The filter pattern used for [restricting access to sensitive documents](./concepts/use-your-data.md#document-level-access-control)
| `embeddingEndpoint` | string | Optional | null | The endpoint URL for an Ada embedding model deployment, generally of the format `https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings?api-version=2023-05-15`. Use with the `embeddingKey` parameter for [vector search](./concepts/use-your-data.md#search-options) outside of private networks and private endpoints. |
| `embeddingKey` | string | Optional | null | The API key for an Ada embedding model deployment. Use with `embeddingEndpoint` for [vector search](./concepts/use-your-data.md#search-options) outside of private networks and private endpoints. |
| `embeddingDeploymentName` | string | Optional | null | The Ada embedding model deployment name within the same Azure OpenAI resource. Used instead of `embeddingEndpoint` and `embeddingKey` for [vector search](./concepts/use-your-data.md#search-options). Should only be used when both the `embeddingEndpoint` and `embeddingKey` parameters are defined. When this parameter is provided, Azure OpenAI on your data will use an internal call to evaluate the Ada embedding model, rather than calling the Azure OpenAI endpoint. This enables you to use vector search in private networks and private endpoints. Billing remains the same whether this parameter is defined or not. Available in regions where embedding models are [available](./concepts/models.md#embeddings-models) starting in API versions `2023-06-01-preview` and later.|
+| `strictness` | number | Optional | 3 | Sets the threshold to categorize documents as relevant to your queries. Raising the value means a higher threshold for relevance and filters out more less-relevant documents for responses. Setting this value too high might cause the model to fail to generate responses due to limited available documents. |

### Start an ingestion job
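
To make the `dataSources` parameter table above concrete, here is a minimal sketch of how `topNDocuments` and `strictness` fit into the extensions chat completions request referenced in the hunk header. It assumes an Azure Cognitive Search data source and the `2023-06-01-preview` API version; the resource, deployment, search service, index, and key values are placeholders.

```python
import requests

# Placeholder names -- substitute your own resource, deployment, and search details.
RESOURCE_NAME = "YOUR_RESOURCE_NAME"
DEPLOYMENT_NAME = "YOUR_DEPLOYMENT_NAME"

url = (
    f"https://{RESOURCE_NAME}.openai.azure.com/openai/deployments/"
    f"{DEPLOYMENT_NAME}/extensions/chat/completions?api-version=2023-06-01-preview"
)

body = {
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": "https://YOUR_SEARCH_SERVICE.search.windows.net",
                "key": "YOUR_SEARCH_ADMIN_KEY",
                "indexName": "YOUR_INDEX_NAME",
                "topNDocuments": 3,  # "Retrieved documents" in Azure OpenAI Studio
                "strictness": 3,     # relevance threshold for retrieved documents
                "inScope": True,
            },
        }
    ],
    "messages": [{"role": "user", "content": "What does my data say about X?"}],
    "max_tokens": 1000,
}

response = requests.post(url, headers={"api-key": "YOUR_AZURE_OPENAI_KEY"}, json=body)
print(response.json())
```

Raising `topNDocuments` feeds more retrieved chunks to the model, while raising `strictness` filters out more loosely relevant ones, so the two settings are typically tuned together.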

articles/aks/concepts-clusters-workloads.md

Lines changed: 36 additions & 21 deletions
@@ -99,27 +99,42 @@ To maintain node performance and functionality, AKS reserves resources on each n

Two types of resources are reserved:

-- **CPU**
-Reserved CPU is dependent on node type and cluster configuration, which may cause less allocatable CPU due to running additional features.
-
-| CPU cores on host | 1 | 2 | 4 | 8 | 16 | 32|64|
-|---|---|---|---|---|---|---|---|
-|Kube-reserved (millicores)|60|100|140|180|260|420|740|
-
-- **Memory**
-Memory utilized by AKS includes the sum of two values.
-
-1. **`kubelet` daemon**
-The `kubelet` daemon is installed on all Kubernetes agent nodes to manage container creation and termination.
-
-By default on AKS, `kubelet` daemon has the *memory.available<750Mi* eviction rule, ensuring a node must always have at least 750Mi allocatable at all times. When a host is below that available memory threshold, the `kubelet` will trigger to terminate one of the running pods and free up memory on the host machine.
-
-2. **A regressive rate of memory reservations** for the kubelet daemon to properly function (*kube-reserved*).
-- 25% of the first 4 GB of memory
-- 20% of the next 4 GB of memory (up to 8 GB)
-- 10% of the next 8 GB of memory (up to 16 GB)
-- 6% of the next 112 GB of memory (up to 128 GB)
-- 2% of any memory above 128 GB
+#### CPU
+
+Reserved CPU is dependent on node type and cluster configuration, which may cause less allocatable CPU due to running additional features.
+
+| CPU cores on host | 1 | 2 | 4 | 8 | 16 | 32|64|
+|---|---|---|---|---|---|---|---|
+|Kube-reserved (millicores)|60|100|140|180|260|420|740|
+
+#### Memory
+
+Memory utilized by AKS includes the sum of two values.
+
+> [!IMPORTANT]
+> AKS 1.28 includes certain changes to memory reservations. These changes are detailed in the following section.
+
+**AKS 1.28 and later**
+
+1. **`kubelet` daemon** has the *memory.available<100Mi* eviction rule by default. This ensures that a node always has at least 100Mi allocatable at all times. When a host is below that available memory threshold, the `kubelet` triggers the termination of one of the running pods and frees up memory on the host machine.
+2. **A rate of memory reservations** set according to the lesser value of: *20MB * Max Pods supported on the Node + 50MB* or *25% of the total system memory resources*.
+
+**Examples**:
+* If the VM provides 8GB of memory and the node supports up to 30 pods, AKS reserves *20MB * 30 Max Pods + 50MB = 650MB* for kube-reserved. `Allocatable space = 8GB - 0.65GB (kube-reserved) - 0.1GB (eviction threshold) = 7.25GB or 90.625% allocatable.`
+* If the VM provides 4GB of memory and the node supports up to 70 pods, AKS reserves *25% * 4GB = 1000MB* for kube-reserved, as this is less than *20MB * 70 Max Pods + 50MB = 1450MB*.
+
+For more information, see [Configure maximum pods per node in an AKS cluster](./azure-cni-overview.md#maximum-pods-per-node).
+
+**AKS versions prior to 1.28**
+
+1. **`kubelet` daemon** is installed on all Kubernetes agent nodes to manage container creation and termination. By default on AKS, `kubelet` daemon has the *memory.available<750Mi* eviction rule, ensuring a node must always have at least 750Mi allocatable at all times. When a host is below that available memory threshold, the `kubelet` will trigger to terminate one of the running pods and free up memory on the host machine.
+
+2. **A regressive rate of memory reservations** for the kubelet daemon to properly function (*kube-reserved*).
+* 25% of the first 4GB of memory
+* 20% of the next 4GB of memory (up to 8GB)
+* 10% of the next 8GB of memory (up to 16GB)
+* 6% of the next 112GB of memory (up to 128GB)
+* 2% of any memory above 128GB

>[!NOTE]
> AKS reserves an additional 2GB for system processes in Windows nodes that are not part of the calculated memory.
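
As a rough illustration of how these reservation rules combine, the sketch below computes kube-reserved and allocatable memory using the same approximations as the examples above (treating 1GB as 1000MB). The function names are illustrative only and not part of AKS or the kubelet.

```python
def kube_reserved_gb(total_memory_gb: float, max_pods: int, aks_1_28_or_later: bool = True) -> float:
    """Approximate kube-reserved memory per the rules described above."""
    if aks_1_28_or_later:
        # Lesser of (20 MB * max pods + 50 MB) and 25% of total memory.
        return min(0.02 * max_pods + 0.05, 0.25 * total_memory_gb)
    # Regressive rate for AKS versions prior to 1.28.
    tiers = [(4, 0.25), (4, 0.20), (8, 0.10), (112, 0.06), (float("inf"), 0.02)]
    reserved, remaining = 0.0, total_memory_gb
    for size, rate in tiers:
        step = min(remaining, size)
        reserved += step * rate
        remaining -= step
        if remaining <= 0:
            break
    return reserved


def allocatable_gb(total_memory_gb: float, max_pods: int) -> float:
    """Allocatable memory on AKS 1.28+: total minus kube-reserved minus the 100Mi eviction threshold."""
    return total_memory_gb - kube_reserved_gb(total_memory_gb, max_pods) - 0.1


# Reproduces the examples above.
print(allocatable_gb(8, 30))    # 7.25 (about 90.6% of an 8GB node)
print(kube_reserved_gb(4, 70))  # 1.0  (25% of 4GB, less than 1.45GB)
```

On Windows nodes, the additional 2GB reserved for system processes noted above comes on top of these values.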

articles/application-gateway/configuration-frontend-ip.md

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ A frontend IP address is associated to a *listener*, which checks for incoming r

> [!IMPORTANT]
> **The default domain name behavior for V1 SKU**:
-> - Deployments before 1st May 2023: These deployments will continue to have the default domain names like <label>.cloudapp.net mapped to the application gateway's Public IP address.
+> - Deployments before 1st May 2023: These deployments will continue to have the default domain names like \<label>.cloudapp.net mapped to the application gateway's Public IP address.
> - Deployments after 1st May 2023: For deployments after this date, there will NOT be any default domain name mapped to the gateway's Public IP address. You must manually configure your domain name by mapping its DNS record to the gateway's IP address.

## Next steps

Six binary image files changed (-12.9 KB, 212 KB, 243 KB, -57.1 KB, -319 KB, -2.18 MB); image previews not shown.
