articles/storage/files/nfs-large-directories.md
Lines changed: 13 additions & 13 deletions
@@ -1,6 +1,6 @@
 ---
-title: Work with large directories in NFS Azure file shares
-description: Learn recommendations for working with large directories in NFS Azure file shares mounted on Linux clients, including mount options, commands, and operations.
+title: Work with large directories in Azure file shares
+description: Learn recommendations for working with large directories in Azure file shares mounted on Linux clients, including mount options, commands, and operations.
 author: khdownie
 ms.service: azure-file-storage
 ms.custom: linux-related-content
@@ -9,25 +9,25 @@ ms.date: 01/22/2025
 ms.author: kendownie
 ---
 
-# Recommendations for working with large directories in NFS Azure file shares
+# Optimize file share performance when accessing large directories from Linux clients
 
-This article provides recommendations for working with NFS directories that contain large numbers of files. It's usually a good practice to reduce the number of files in a single directory by spreading the files over multiple directories. However, there are situations in which large directories can't be avoided. Consider the following suggestions when working with large directories on NFS Azure file shares that are mounted on Linux clients.
+This article provides recommendations for working with directories that contain large numbers of files. It's usually a good practice to reduce the number of files in a single directory by spreading the files over multiple directories. However, there are situations in which large directories can't be avoided. Consider the following suggestions when working with large directories on Azure file shares that are mounted on Linux clients.
 
 ## Applies to
 
 | File share type | SMB | NFS |
 |-|:-:|:-:|
-| Standard file shares (GPv2), LRS/ZRS |||
-| Standard file shares (GPv2), GRS/GZRS |||
-| Premium file shares (FileStorage), LRS/ZRS |||
+| Standard file shares (GPv2), LRS/ZRS |||
+| Standard file shares (GPv2), GRS/GZRS |||
+| Premium file shares (FileStorage), LRS/ZRS |||
 
 ## Recommended mount options
 
 The following mount options are specific to enumeration and can reduce latency when working with large directories.
 
 ### actimeo
 
-Specifying `actimeo` sets all of `acregmin`, `acregmax`, `acdirmin`, and `acdirmax` to the same value. If `actimeo` isn't specified, the NFS client uses the defaults for each of these options.
+Specifying `actimeo` sets all of `acregmin`, `acregmax`, `acdirmin`, and `acdirmax` to the same value. If `actimeo` isn't specified, the client uses the defaults for each of these options.
 
 We recommend setting `actimeo` between 30 and 60 seconds when working with large directories. Setting a value in this range makes the attributes remain valid for a longer time period in the client's attribute cache, allowing operations to get file attributes from the cache instead of fetching them over the wire. This can reduce latency in situations where the cached attributes expire while the operation is still running.
 
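The `actimeo` paragraph in the hunk above says that one option fans out to four attribute-cache timeouts. The following is a minimal sketch of that rule, not part of the article or of any NFS tooling. The helper name `effective_attr_cache` is hypothetical, and the default values are an assumption taken from the Linux `nfs(5)` man page (in seconds); other clients may use different defaults.

```python
# Illustrative only: models how actimeo relates to the four timeouts.
# Assumed defaults (seconds) for the Linux NFS client, per nfs(5).
NFS_DEFAULTS = {"acregmin": 3, "acregmax": 60, "acdirmin": 30, "acdirmax": 60}


def effective_attr_cache(mount_options: str) -> dict:
    """Return the attribute-cache timeouts implied by a mount option string."""
    timeouts = dict(NFS_DEFAULTS)
    for option in mount_options.split(","):
        name, _, value = option.strip().partition("=")
        if name == "actimeo" and value:
            # actimeo sets all four timeouts to the same value.
            timeouts = {key: int(value) for key in timeouts}
        elif name in timeouts and value:
            # An individual option changes only its own timeout.
            timeouts[name] = int(value)
    return timeouts


print(effective_attr_cache("vers=4.1,sec=sys,actimeo=30"))
print(effective_attr_cache("vers=4.1,sec=sys"))
```

With `actimeo=30`, all four timeouts come back as 30, which is the low end of the 30 to 60 second range the article recommends for large directories. Without it, the assumed defaults are returned unchanged.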
@@ -70,7 +70,7 @@ The following chart compares the time it takes to output results using unaliased
 
 ### Increase the number of hash buckets
 
-The total amount of RAM present on the system doing the enumeration influences the internal working of filesystem protocols like NFS. Even if users aren't experiencing high memory usage, the amount of memory available influences the amount of hash buckets the system has, which impacts/improves enumeration performance for large directories. You can modify the amount of hash buckets the system has to reduce the hash collisions that can occur during large enumeration workloads.
+The total amount of RAM present on the system doing the enumeration influences the internal working of filesystem protocols like NFS and SMB. Even if users aren't experiencing high memory usage, the amount of memory available influences the amount of hash buckets the system has, which impacts/improves enumeration performance for large directories. You can modify the amount of hash buckets the system has to reduce the hash collisions that can occur during large enumeration workloads.
 
 To do this, you'll need to modify your boot configuration settings by providing an additional kernel command that takes effect during boot to increase the number of hash buckets. Follow these steps.
 
@@ -121,15 +121,15 @@ To do this, you'll need to modify your boot configuration settings by providing
 
 ## File copy and backup operations
 
-When copying data from an NFS file share or backing up from NFS file shares to another location, for optimal performance we recommend using a share snapshot as the source instead of the live file share with active I/O. Backup applications should run commands on the snapshot directly. For more information, see [NFS file share snapshots](storage-files-how-to-mount-nfs-shares.md#nfs-file-share-snapshots).
+When copying data from a file share or backing up from file shares to another location, for optimal performance we recommend using a share snapshot as the source instead of the live file share with active I/O. Backup applications should run commands on the snapshot directly. For more information, see [Use share snapshots with Azure Files](storage-snapshots-files.md).
 
 ## Application-level recommendations
 
-When developing applications that use large directories with NFS file shares, follow these recommendations.
+When developing applications that use large directories, follow these recommendations.
 
 - **Skip file attributes.** If the application only needs the file name and not file attributes like file type or last modified time, you can use multiple calls to system calls such as `getdents64` with a good buffer size. This will get the entries in the specified directory without the file type, making the operation faster by avoiding extra operations that aren't needed.
 
-- **Interleave stat calls.** If the application needs attributes and the file name, we recommend interleaving the stat calls along with `getdents64` instead of getting all entries until end of file with `getdents64` and then doing a statx on all entries returned. Interleaving the stat calls instructs the NFS client to request both the file and its attributes at once, reducing the number of calls to the server. When combined with a high `actimeo` value, this can significantly improve performance. For example, instead of `[ getdents64, getdents64, ... , getdents64, statx (entry1), ... , statx(n) ]`, place the statx calls after each `getdents64` like this: `[ getdents64, (statx, statx, ... , statx), getdents64, (statx, statx, ... , statx), ... ]`.
+- **Interleave stat calls.** If the application needs attributes and the file name, we recommend interleaving the stat calls along with `getdents64` instead of getting all entries until end of file with `getdents64` and then doing a statx on all entries returned. Interleaving the stat calls instructs the client to request both the file and its attributes at once, reducing the number of calls to the server. When combined with a high `actimeo` value, this can significantly improve performance. For example, instead of `[ getdents64, getdents64, ... , getdents64, statx (entry1), ... , statx(n) ]`, place the statx calls after each `getdents64` like this: `[ getdents64, (statx, statx, ... , statx), getdents64, (statx, statx, ... , statx), ... ]`.
 
 - **Increase I/O depth.** If possible, we suggest configuring `nconnect` to a non-zero value (greater than 1) and distributing the operation among multiple threads, or using asynchronous I/O. This will enable operations that can be asynchronous to benefit from multiple concurrent connections to the file share.
 
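The "Skip file attributes" and "Interleave stat calls" recommendations in the hunk above can be sketched in Python roughly as follows. This is an illustration under assumptions, not code from the article: the function names are hypothetical, and it relies on the understanding that on Linux `os.scandir` reads entries in batches (through `readdir`, which glibc typically backs with `getdents64`), so a stat issued inside the loop runs between batches. The exact system calls Python issues for the stat (for example `newfstatat` rather than `statx`) depend on the Python and libc build.

```python
import os
import tempfile


def names_only(path):
    """Names without attributes: no stat calls are issued at all."""
    with os.scandir(path) as entries:
        return [entry.name for entry in entries]


def list_then_stat(path):
    """Pattern to avoid: enumerate the whole directory, then stat every entry."""
    names = os.listdir(path)  # full enumeration happens first
    return {name: os.lstat(os.path.join(path, name)).st_size for name in names}


def interleaved_stat(path):
    """Suggested pattern: stat each entry while enumeration is still in progress."""
    sizes = {}
    with os.scandir(path) as entries:
        for entry in entries:
            # The stat for this entry is issued before later batches are read.
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


# Small local demonstration on a temporary directory (not a file share).
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(3):
        with open(os.path.join(demo_dir, f"file{i}.txt"), "w") as handle:
            handle.write("x" * i)
    print(sorted(interleaved_stat(demo_dir).items()))
```

Both stat-based functions return the same data. The difference is only in the order of calls sent to the server, which is what the recommendation targets. Running either under `strace` on a mounted share is one way to check the call order that a given Python build produces.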
@@ -138,4 +138,4 @@ When developing applications that use large directories with NFS file shares, fo