articles/azure-netapp-files/performance-large-volumes-linux.md (5 additions, 35 deletions)

@@ -19,54 +19,27 @@ ms.author: anfdocs
This article describes the tested performance capabilities of a single [Azure NetApp Files large volume](large-volumes-requirements-considerations.md) as it pertains to Linux use cases. The tests explored scenarios for both scale-out and scale-up read and write workloads, involving one and many virtual machines (VMs). Knowing the performance envelope of large volumes helps facilitate volume sizing.

## Testing summary

* The Azure NetApp Files large volumes feature offers three service levels, each with throughput limits. The service levels can be scaled up or down nondisruptively as your performance needs change.

  * Ultra service level: 10,240 MiB/s
  * Premium service level: 6,400 MiB/s
  * Standard service level: 1,600 MiB/s

  The Ultra service level was used in these tests.

* Sequential I/O: 100% sequential writes max out at 8,500 MiB/second, while a single large volume is capable of 10 GiB/second (10,240 MiB/second) throughput.
* Random I/O: The same single large volume delivers over 700,000 operations per second.
* Metadata-heavy workloads are advantageous for Azure NetApp Files large volumes due to the large volume’s increased parallelism. Performance benefits are noticeable in workloads heavy in file creation, unlink, and file rename operations, as is typical of VCS applications and of EDA workloads with high file counts. For more information on the performance of metadata-heavy workloads, see [Benefits of using Azure NetApp Files for electronic design automation](solutions-benefits-azure-netapp-files-electronic-design-automation.md).

* [FIO](https://fio.readthedocs.io/en/latest/fio_doc.html), a synthetic workload generator designed as a storage stress test, was used to drive these test results. There are fundamentally two models of storage performance testing (an example FIO job follows this list):

  * **Scale-out compute**, which refers to using multiple VMs to generate the maximum load possible on a single Azure NetApp Files volume.
  * **Scale-up compute**, which refers to using a large VM to test the upper boundaries of a single client on a single Azure NetApp Files volume.
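
For illustration only, here's a minimal sketch of the kind of FIO job such tests use against an NFS-mounted large volume, in this case an 8-KiB random-read job. The mount path, job count, queue depth, and file sizing are assumptions rather than the exact job definitions behind these results. In the scale-out model, the same job runs concurrently from many client VMs; in the scale-up model, `numjobs` and `iodepth` are raised on a single large VM instead.

```bash
# Hypothetical 8-KiB random-read FIO job against an NFS-mounted Azure NetApp Files
# large volume (path, sizing, job count, and queue depth are illustrative).
fio --name=rand-read \
    --directory=/mnt/anf-large-volume \
    --rw=randread --bs=8k --direct=1 \
    --ioengine=libaio --iodepth=64 --numjobs=16 \
    --size=16G --time_based --runtime=300 \
    --group_reporting
```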
<!--
## Test methodologies and tools

All scenarios documented in this article used FIO, a synthetic workload generator designed as a storage stress test; for the purposes of this testing, the storage stress test model was used.

Fundamentally, there are two models of storage performance testing:

* **Application level**

  For application-level testing, the effort is to drive I/O through client buffer caches in the same way that a typical application drives I/O. In general, direct I/O isn't used when testing in this manner.

  * Except for databases (for example, Oracle, SAP HANA, MySQL (InnoDB storage engine), PostgreSQL, and Teradata), few applications use direct I/O. Instead, most applications use a large memory cache for repeated reads and a write-behind cache for asynchronous writes.
  * SPECstorage 2020 (EDA, VDA, AI, genomics, and software build), HammerDB for SQL Server, and Login VSI are typical examples of application-level testing tools. None of them uses direct I/O.

* **Storage stress test**

  The most common parameter used in storage performance benchmarking is direct I/O. It's supported by FIO and Vdbench, while DISKSPD offers support for the similar construct of memory-mapped I/O. With direct I/O, the filesystem cache is bypassed, direct memory access (DMA) copy operations are avoided, and storage tests are made fast and simple.

  * Using the direct I/O parameter makes storage testing easy. No data is read from the filesystem cache on the client, so the test stresses the storage protocol and service itself rather than memory access speeds. Also, without the DMA memory copies, read and write operations are efficient from a processing perspective.
  * Take the Linux `dd` command as an example workload. Without the optional direct I/O flags (`iflag=direct` and `oflag=direct`), all I/O generated by `dd` is served from the Linux buffer cache. Reads of blocks already in memory aren't retrieved from storage. Reads resulting in a buffer-cache miss are read from storage using NFS read-ahead, with results that vary depending on factors such as the mount `rsize` and client read-ahead tunables. When writes are sent through the buffer cache, they use a write-behind mechanism, which is untuned and uses a significant amount of parallelism to send the data to the storage device. You might attempt to run two independent streams of I/O, one `dd` for reads and one `dd` for writes. However, the operating system, being untuned, favors writes over reads and uses more parallelism for them.
  * Except for databases, few applications use direct I/O. Instead, they take advantage of a large memory cache for repeated reads and a write-behind cache for asynchronous writes. In short, using direct I/O turns the test into a micro benchmark.
-->

## Linux scale-out test
The tests observed the performance thresholds of a single large volume on scale-out. The tests were conducted with the following configuration:
@@ -79,7 +52,7 @@ Tests observed performance thresholds of a single large volume on scale-out and
| Mount options | hard,rsize=65536,wsize=65536,vers=3 <br /> **NOTE:** Use of both 262144 and 65536 had similar performance results. |
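
For reference, a mount using the options in the table might look like the following sketch; the NFS server address, export path, and mount point are placeholders rather than values from the test environment.

```bash
# Hypothetical mount of an Azure NetApp Files large volume with the options listed
# above (server address, export path, and mount point are placeholders).
sudo mkdir -p /mnt/anf-large-volume
sudo mount -t nfs -o hard,rsize=65536,wsize=65536,vers=3 \
    10.0.0.4:/large-volume-export /mnt/anf-large-volume
```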

### 256-KiB sequential workloads (MiB/s)

The graph represents a 256-KiB sequential workload and a 1-TiB working set. It shows that a single Azure NetApp Files large volume can handle between approximately 8,518 MiB/s of pure sequential writes and 9,970 MiB/s of pure sequential reads.
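
As a hedged sketch, the read/write mix for such a workload can be expressed in FIO with `--rwmixread`; the 50/50 split, path, and sizing below are illustrative assumptions, while the graph itself spans pure writes through pure reads.

```bash
# Hypothetical 256-KiB sequential workload with a mixed read/write ratio
# (50% reads here); directory and sizing are illustrative.
fio --name=seq-mix \
    --directory=/mnt/anf-large-volume \
    --rw=rw --rwmixread=50 --bs=256k --direct=1 \
    --ioengine=libaio --iodepth=64 --numjobs=16 \
    --size=16G --time_based --runtime=300 \
    --group_reporting
```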
@@ -110,7 +83,6 @@ The graphs in this section show the results for the client-side mount option of
The following graphs compare the advantages of `nconnect` with an NFS-mounted volume without `nconnect`. In the tests, FIO generated the workload from a single E104id-v5 instance in the East US Azure region using a 64-KiB sequential workload. A 256-KiB I/O size, the largest I/O size recommended by Azure NetApp Files, was also used and resulted in comparable performance numbers. For more information, see [`rsize` and `wsize`](performance-linux-mount-options.md#rsize-and-wsize).
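
As a sketch of the client-side option being compared, an NFSv3 mount that uses `nconnect` might look like the following; the server address, export path, mount point, and `nconnect` value are assumptions, and `nconnect` requires a Linux kernel and NFS client recent enough to support it.

```bash
# Hypothetical NFSv3 mount using nconnect to open multiple TCP connections to the
# storage endpoint (address, export, mount point, and nconnect value are placeholders).
sudo mount -t nfs -o hard,rsize=262144,wsize=262144,vers=3,nconnect=8 \
    10.0.0.4:/large-volume-export /mnt/anf-large-volume
```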
### Linux read throughput
The following graphs show 256-KiB sequential reads of ~10,000 MiB/s with `nconnect`, which is roughly ten times the throughput achieved without `nconnect`.
@@ -125,12 +97,10 @@ The following graphs show sequential writes. Using `nconnect` provides observabl
:::image type="content" source="./media/performance-large-volumes-linux/write-throughput-comparison.png" alt-text="Comparison of write throughput with and without nconnect." lightbox="./media/performance-large-volumes-linux/write-throughput-comparison.png":::
### Linux read IOPS
The following graphs show 8-KiB random reads of ~426,000 read IOPS with `nconnect`, roughly seven times what is observed without `nconnect`.
:::image type="content" source="./media/performance-large-volumes-linux/read-iops-comparison.png" alt-text="Charts comparing read IOPS with and without nconnect." lightbox="./media/performance-large-volumes-linux/read-iops-comparison.png":::