articles/aks/custom-node-configuration.md (4 additions, 4 deletions)
@@ -205,14 +205,14 @@ For agent nodes, which are expected to handle very large numbers of concurrent s
 |`net.ipv4.tcp_fin_timeout`| 5 - 120 | 60 | The length of time an orphaned connection (one no longer referenced by any application) remains in the FIN_WAIT_2 state before it's aborted at the local end. |
 |`net.ipv4.tcp_keepalive_time`| 30 - 432000 | 7200 | How often TCP sends out `keepalive` messages when `keepalive` is enabled. |
 |`net.ipv4.tcp_keepalive_probes`| 1 - 15 | 9 | How many `keepalive` probes TCP sends out until it decides that the connection is broken. |
-|`net.ipv4.tcp_keepalive_intvl`| 10 - 75 | 75 | How frequently the probes are sent out. Multiplied by `tcp_keepalive_probes`, it makes up the time to kill a connection that isn't responding after probing starts. |
+|`net.ipv4.tcp_keepalive_intvl`| 10 - 90 | 75 | How frequently the probes are sent out. Multiplied by `tcp_keepalive_probes`, it makes up the time to kill a connection that isn't responding after probing starts. |
 |`net.ipv4.tcp_tw_reuse`| 0 or 1 | 0 | Allows `TIME-WAIT` sockets to be reused for new connections when it's safe from the protocol viewpoint. |
-|`net.ipv4.ip_local_port_range`| First: 1024 - 60999 and Last: 32768 - 65000 | First: 32768 and Last: 60999 | The local port range used by TCP and UDP traffic to choose the local port. Comprises two numbers: the first is the first local port allowed for TCP and UDP traffic on the agent node, and the second is the last. |
+|`net.ipv4.ip_local_port_range`| First: 1024 - 60999 and Last: 32768 - 65535 | First: 32768 and Last: 60999 | The local port range used by TCP and UDP traffic to choose the local port. Comprises two numbers: the first is the first local port allowed for TCP and UDP traffic on the agent node, and the second is the last. |
 |`net.ipv4.neigh.default.gc_thresh1`| 128 - 80000 | 4096 | Minimum number of entries that may be in the ARP cache. Garbage collection won't be triggered if the number of entries is below this setting. |
 |`net.ipv4.neigh.default.gc_thresh2`| 512 - 90000 | 8192 | Soft maximum number of entries that may be in the ARP cache. This setting is arguably the most important, as ARP garbage collection will be triggered about 5 seconds after reaching this soft maximum. |
 |`net.ipv4.neigh.default.gc_thresh3`| 1024 - 100000 | 16384 | Hard maximum number of entries in the ARP cache. |
-|`net.netfilter.nf_conntrack_max`| 131072 - 1048576 | 131072 | `nf_conntrack` is a module that tracks connection entries for NAT within Linux. The `nf_conntrack` module uses a hash table to record the *established connection* records of the TCP protocol. `nf_conntrack_max` is the maximum number of nodes in the hash table, that is, the maximum number of connections supported by the `nf_conntrack` module, or the size of the connection tracking table. |
-|`net.netfilter.nf_conntrack_buckets`| 65536 - 147456 | 65536 | `nf_conntrack` is a module that tracks connection entries for NAT within Linux. The `nf_conntrack` module uses a hash table to record the *established connection* records of the TCP protocol. `nf_conntrack_buckets` is the size of the hash table. |
+|`net.netfilter.nf_conntrack_max`| 131072 - 2097152 | 131072 | `nf_conntrack` is a module that tracks connection entries for NAT within Linux. The `nf_conntrack` module uses a hash table to record the *established connection* records of the TCP protocol. `nf_conntrack_max` is the maximum number of nodes in the hash table, that is, the maximum number of connections supported by the `nf_conntrack` module, or the size of the connection tracking table. |
+|`net.netfilter.nf_conntrack_buckets`| 65536 - 524288 | 65536 | `nf_conntrack` is a module that tracks connection entries for NAT within Linux. The `nf_conntrack` module uses a hash table to record the *established connection* records of the TCP protocol. `nf_conntrack_buckets` is the size of the hash table. |
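
With the defaults above, an idle peer is declared dead roughly `tcp_keepalive_time` + `tcp_keepalive_probes` × `tcp_keepalive_intvl` = 7200 + 9 × 75 = 7875 seconds after the last activity. These sysctls are set per node pool via a Linux OS config file at creation time. A minimal sketch using the widened ranges from this diff, assuming the camelCase key names of the AKS `linuxOSConfig` schema; the values and resource names are illustrative, not recommendations:

```azurecli
# Key names follow the AKS linuxOSConfig schema (assumed); values are illustrative.
cat > linuxosconfig.json <<'EOF'
{
  "sysctls": {
    "netIpv4TcpKeepaliveIntvl": 90,
    "netIpv4IpLocalPortRange": "32768 65535",
    "netNetfilterNfConntrackMax": 2097152,
    "netNetfilterNfConntrackBuckets": 524288
  }
}
EOF

# Apply the config when creating a node pool (placeholder resource names).
az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --linux-os-config ./linuxosconfig.json
```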
articles/aks/gpu-cluster.md (3 additions, 5 deletions)
@@ -25,7 +25,6 @@ To view supported GPU-enabled VMs, see [GPU-optimized VM sizes in Azure][gpu-sku

 * If you're using an Azure Linux GPU-enabled node pool, automatic security patches aren't applied, and the default behavior for the cluster is *Unmanaged*. For more information, see [auto-upgrade](./auto-upgrade-node-image.md).
 * [NVadsA10](../virtual-machines/nva10v5-series.md) v5-series are *not* a recommended SKU for GPU VHD.
-* AKS doesn't support Windows GPU-enabled node pools.
 * Updating an existing node pool to add GPU isn't supported.

 ## Before you begin
@@ -47,7 +46,7 @@ Using NVIDIA GPUs involves the installation of various NVIDIA software component

 ### Skip GPU driver installation (preview)

-AKS has automatic GPU driver installation enabled by default. In some cases, such as installing your own drivers or using the NVIDIA GPU Operator, you may want to skip GPU driver installation.
+AKS has automatic GPU driver installation enabled by default. In some cases, such as installing your own drivers or using the [NVIDIA GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html), you may want to skip GPU driver installation.

 [!INCLUDE [preview features callout](includes/preview/preview-callout.md)]
@@ -71,7 +70,6 @@ AKS has automatic GPU driver installation enabled by default. In some cases, suc
     --node-count 1 \
     --skip-gpu-driver-install \
     --node-vm-size Standard_NC6s_v3 \
-    --node-taints sku=gpu:NoSchedule \
     --enable-cluster-autoscaler \
     --min-count 1 \
     --max-count 3
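
Reassembled, the post-change command would look something like the following sketch; the resource group, cluster, and node pool names are placeholders, not part of this diff:

```azurecli
# Create a GPU node pool without automatic driver installation (names are placeholders).
az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name gpunp \
    --node-count 1 \
    --skip-gpu-driver-install \
    --node-vm-size Standard_NC6s_v3 \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 3
```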
@@ -81,7 +79,7 @@ AKS has automatic GPU driver installation enabled by default. In some cases, suc

 ### NVIDIA device plugin installation

-NVIDIA device plugin installation is required when using GPUs on AKS. In some cases, the installation is handled automatically, such as when using the [NVIDIA GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/microsoft-aks.html) or the [AKS GPU image (preview)](#use-the-aks-gpu-image-preview). Alternatively, you can manually install the NVIDIA device plugin.
+NVIDIA device plugin installation is required when using GPUs on AKS. In some cases, the installation is handled automatically, such as when using the [NVIDIA GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html) or the [AKS GPU image (preview)](#use-the-aks-gpu-image-preview). Alternatively, you can manually install the NVIDIA device plugin.

 #### Manually install the NVIDIA device plugin
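
For reference on the manual route, the device plugin ships as a DaemonSet in NVIDIA's [k8s-device-plugin](https://github.com/NVIDIA/k8s-device-plugin) repo. A minimal sketch, assuming the upstream static manifest; the release tag is illustrative, not from this diff:

```bash
# Apply the upstream device plugin DaemonSet manifest.
# The v0.14.1 tag is an assumed/illustrative release; check the repo for a current one.
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml

# Confirm the plugin pods are running (the upstream manifest deploys into kube-system).
kubectl get pods --namespace kube-system -l name=nvidia-device-plugin-ds
```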
@@ -222,7 +220,7 @@ The NVIDIA GPU Operator automates the management of all NVIDIA software componen

 1. Skip automatic GPU driver installation by creating a node pool using the [`az aks nodepool add`][az-aks-nodepool-add] command with `--skip-gpu-driver-install`. Adding the `--skip-gpu-driver-install` flag during node pool creation skips the automatic GPU driver installation. Any existing nodes aren't changed. You can scale the node pool to zero and then back up to make the change take effect.

-2. Follow the NVIDIA documentation to [Install the GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/install-gpu-ocp.html#install-nvidiagpu:~:text=NVIDIA%20GPU%20Operator-,Installing%20the%20NVIDIA%20GPU%20Operator,-%EF%83%81).
+2. Follow the NVIDIA documentation to [Install the GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html).

 3. Now that you successfully installed the GPU Operator, you can check that your [GPUs are schedulable](#confirm-that-gpus-are-schedulable) and [run a GPU workload](#run-a-gpu-enabled-workload).
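
For step 2, a minimal Helm sketch of a GPU Operator install, assuming NVIDIA's standard Helm repository; the release name and namespace are illustrative choices:

```bash
# Register NVIDIA's Helm repo and install the GPU Operator.
# With --skip-gpu-driver-install used above, the operator manages the driver stack;
# the release name and namespace below are illustrative, not prescribed by this diff.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator \
    --namespace gpu-operator \
    --create-namespace
```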