Commit 41f01b0

new regular volume benchmarks
1 parent 2f85ce7 commit 41f01b0

articles/azure-netapp-files/performance-benchmarks-linux.md

Lines changed: 132 additions & 5 deletions
@@ -42,23 +42,150 @@ FIO (with and without setting randrepeat=0)
- [Manual QoS](manage-manual-qos-capacity-pool.md)
- Mount options: rw,nconnect=8,hard,rsize=262144,wsize=262144,vers=3,tcp,bg

## Parallel network connection (nconnect) benchmark configuration

These benchmarks used the following:

- A single Azure NetApp Files regular volume with a 1-TiB data set using the Ultra performance tier
- FIO (with and without setting randrepeat=0)
- 4-KiB and 64-KiB wsize/rsize
- A single D32s_v4 virtual machine running RHEL 9.3
- NFSv3 with and without nconnect
- Mount options: rw,nconnect=8,hard,rsize=262144,wsize=262144,vers=3,tcp,bg (see the example mount command after this list)
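
As a rough illustration of these mount options, a mount command might look like the following sketch. The server IP (10.0.0.4), export path (/benchmark-vol), and mount point (/mnt/benchmark) are placeholder values, not the endpoints used in these benchmarks.

```bash
# Hypothetical NFSv3 mount using the options listed above; server, export, and mount point are placeholders.
sudo mkdir -p /mnt/benchmark
sudo mount -t nfs -o rw,nconnect=8,hard,rsize=262144,wsize=262144,vers=3,tcp,bg \
  10.0.0.4:/benchmark-vol /mnt/benchmark
```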

## Scale-up benchmark tests

The scale-up test's intent is to show the performance of an Azure NetApp Files volume when scaling up (or increasing) the number of jobs generating simultaneous workload across multiple TCP connections on a single client to the same volume (such as with [nconnect](performance-linux-mount-options.md#nconnect)).

Without nconnect, these workloads cannot push the limits of a volume's maximum performance, since the client cannot generate enough I/O or network throughput. These tests are generally indicative of what a single user's experience might be in workloads such as media rendering, databases, AI/ML, and general file shares.

## High IOP scale-out benchmarks

The following benchmarks show the performance achieved for Azure NetApp Files with a high IOP workload using:

- 32 clients
- 4-KiB and 8-KiB random reads and writes
- 1-TiB dataset
- Read/write ratios as follows: 100%:0%, 90%:10%, 80%:20%, and so on
- With and without filesystem caching involved (using `randrepeat=0` in FIO)

For more information, see [Testing methodology](testing-methodology.md).
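
As a non-authoritative sketch, one data point in this sweep (an 80% read / 20% write, 4-KiB random mix with caching influence reduced) might be expressed with an FIO job like the following. The directory, file size, job count, runtime, and I/O depth are illustrative placeholders rather than the exact values used in these benchmarks, and buffered I/O is assumed so that client caching can come into play as described in the results that follow.

```bash
# Hypothetical FIO job: 4-KiB random I/O at an 80:20 read:write mix.
# randrepeat=0 varies the random pattern between runs to reduce cache reuse.
fio --name=4k-randrw-80-20 --directory=/mnt/benchmark --ioengine=libaio \
    --rw=randrw --rwmixread=80 --bs=4k --size=64G --numjobs=8 --iodepth=16 \
    --randrepeat=0 --time_based --runtime=300 --group_reporting
```

Stepping `--rwmixread` from 100 down to 0 across runs (and dropping `--randrepeat=0` for the caching-included cases) produces the read/write ratio sweep described above.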

## Results: 4-KiB, random, client caching included

In this benchmark, FIO ran without the `randrepeat=0` setting that randomizes data, so an indeterminate amount of client caching came into play. This configuration results in slightly better overall performance numbers than tests run without caching, where the entire I/O stack is utilized.

In the following graph, testing shows that an Azure NetApp Files regular volume can handle between approximately 130,000 pure random 4-KiB writes and approximately 460,000 pure random 4-KiB reads during this benchmark. The read:write mix for the workload was adjusted by 10% for each run.

As the read:write mix shifts toward write-heavy, the total IOPS decrease.

<!-- 4k random iops graph -->

## Results: 4-KiB, random, client caching excluded

In this benchmark, FIO was run with the setting `randrepeat=0` to randomize data, reducing the influence of caching on performance. This resulted in an approximately 8% reduction in write IOPS and an approximately 17% reduction in read IOPS, but it displays performance numbers more representative of what the storage can actually do.

In the following graph, testing shows that an Azure NetApp Files regular volume can handle between approximately 120,000 pure random 4-KiB writes and approximately 388,000 pure random 4-KiB reads. The read:write mix for the workload was adjusted by 25% for each run.

As the read:write mix shifts toward write-heavy, the total IOPS decrease.

<!-- -->

## Results: 8-KiB, random, client caching excluded

Larger read and write sizes result in fewer total IOPS, as more data can be sent with each operation. An 8-KiB read and write size was used to more accurately simulate what most modern applications use. For instance, many EDA applications utilize 8-KiB reads and writes.
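
As a rough sketch, the 8-KiB variant of the hypothetical FIO job shown earlier changes only the block size; the other placeholder parameters stay the same. Because each operation now moves twice as much data, total IOPS drop while the per-operation payload rises.

```bash
# Hypothetical 8-KiB variant: identical to the earlier placeholder job except for --bs.
fio --name=8k-randrw-80-20 --directory=/mnt/benchmark --ioengine=libaio \
    --rw=randrw --rwmixread=80 --bs=8k --size=64G --numjobs=8 --iodepth=16 \
    --randrepeat=0 --time_based --runtime=300 --group_reporting
```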

In this benchmark, FIO ran with `randrepeat=0` to randomize data, so the client caching impact was reduced. In the following graph, testing shows that an Azure NetApp Files regular volume can handle between approximately 111,000 pure random 8-KiB writes and approximately 293,000 pure random 8-KiB reads. The read:write mix for the workload was adjusted by 25% for each run.

As the read:write mix shifts toward write-heavy, the total IOPS decrease.

## Side-by-side comparisons

To illustrate how caching can influence the performance benchmark tests, the following graph shows total I/OPS for 4-KiB tests with and without caching mechanisms in place. As shown, caching provides a slight performance boost, with fairly consistent I/OPS trending.

## Specific offset, streaming random read/write workloads: scale-up tests using parallel network connections (nconnect)

The following tests show a high IOP benchmark using a single client with 4-KiB random workloads and a 1-TiB data set. The workload mix generated uses a different I/O depth each time. To boost the performance for a single client workload, the [nconnect mount option](performance-linux-mount-options.md#nconnect) was used to improve parallelism in comparison to client mounts without the nconnect mount option.

When using a standard TCP connection that provides only a single path to the storage, fewer total operations are sent per second than when a mount is able to leverage more TCP connections (such as with nconnect) per mount point. When using nconnect, the total latency for the operations is generally lower. These tests also ran with `randrepeat=0` to intentionally avoid caching. For more information on this option, see [Testing methodology](testing-methodology.md).
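
As a rough, non-authoritative sketch, this kind of I/O depth sweep might be scripted as follows: run the job across several queue depths on a plain NFSv3 mount, then remount the volume with `nconnect=8` and repeat. The mount endpoint, depths, and job parameters are placeholders; nconnect applies per server connection, which is why the volume is remounted rather than mounted twice with different settings.

```bash
#!/usr/bin/env bash
# Placeholder I/O depth sweep: first pass without nconnect, second pass with nconnect=8.
for opts in "rw,hard,rsize=262144,wsize=262144,vers=3,tcp,bg" \
            "rw,nconnect=8,hard,rsize=262144,wsize=262144,vers=3,tcp,bg"; do
  sudo umount /mnt/benchmark 2>/dev/null || true
  sudo mount -t nfs -o "$opts" 10.0.0.4:/benchmark-vol /mnt/benchmark
  for depth in 1 4 8 16 32; do
    # 4-KiB random mix at increasing queue depth; caching influence reduced via randrepeat=0.
    fio --name="4k-rand-qd${depth}" --directory=/mnt/benchmark --ioengine=libaio \
        --rw=randrw --rwmixread=80 --bs=4k --size=64G --numjobs=8 --iodepth="$depth" \
        --randrepeat=0 --time_based --runtime=300 --group_reporting
  done
done
```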

### Results: 4-KiB, random, nconnect vs. no nconnect, caching excluded

The following graphs show a side-by-side comparison of 4-KiB reads and writes with and without nconnect to highlight the performance improvements seen when using nconnect: higher overall IOPS, lower latency.

## High throughput benchmarks

The following benchmarks show the performance achieved for Azure NetApp Files with a high throughput workload.

High throughput workloads are more sequential in nature and are often read/write heavy with little metadata. Throughput is generally more important than IOPS. These workloads typically leverage larger read/write sizes (64 KiB to 256 KiB), which generate higher latencies than smaller read/write sizes, since larger payloads naturally take longer to process.

Examples of high throughput workloads include:

- Media repositories
- High performance compute
- AI/ML/LLP

The following tests show a high throughput benchmark using both 64-KiB and 256-KiB sequential workloads and a 1-TiB data set. The workload mix generated decreases by a set percentage at a time and demonstrates what you can expect when using varying read/write ratios (for instance, 100%:0%, 90%:10%, 80%:20%, and so on).
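
As a non-authoritative sketch, one point in this sequential sweep (a 64-KiB, 50% read / 50% write mix) might be expressed like the following; the 256-KiB runs would change only `--bs`. Directory, file size, job count, and runtime are illustrative placeholders, and the caching-included versus caching-excluded variants described below differ in how FIO's looping logic populates the cache, which this simple sketch doesn't attempt to reproduce.

```bash
# Hypothetical FIO job: 64-KiB sequential I/O at a 50:50 read:write mix.
fio --name=64k-seq-rw-50-50 --directory=/mnt/benchmark --ioengine=libaio \
    --rw=rw --rwmixread=50 --bs=64k --size=64G --numjobs=8 --iodepth=16 \
    --time_based --runtime=300 --group_reporting
```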

### Results: 64-KiB sequential I/O, caching included

In this benchmark, FIO ran using looping logic that more aggressively populated the cache, so an indeterminate amount of caching influenced the results. This results in slightly better overall performance numbers than tests run without caching.

In the following graph, testing shows that an Azure NetApp Files regular volume can handle between approximately 4,500 MiB/s pure sequential 64-KiB reads and approximately 1,600 MiB/s pure sequential 64-KiB writes. The read:write mix for the workload was adjusted by 10% for each run.

### Results: 64-KiB sequential I/O, caching excluded

In this benchmark, FIO ran using looping logic that less aggressively populated the cache, so client caching didn't influence the results. This configuration results in slightly better write performance numbers, but lower read numbers, than tests run with caching.

In the following graph, testing demonstrates that an Azure NetApp Files regular volume can handle between approximately 3,600 MiB/s pure sequential 64-KiB reads and approximately 2,400 MiB/s pure sequential 64-KiB writes. During the tests, a 50/50 mix showed total throughput on par with a pure sequential read workload.

The read:write mix for the workload was adjusted by 25% for each run.

### Results: 256-KiB sequential I/O, caching excluded

In this benchmark, FIO ran using looping logic that less aggressively populated the cache, so caching didn't influence the results. This configuration results in slightly higher write numbers, but lower read numbers, than the same 64-KiB tests run without caching.

In the following graph, testing shows that an Azure NetApp Files regular volume can handle between approximately 3,500 MiB/s pure sequential 256-KiB reads and approximately 2,500 MiB/s pure sequential 256-KiB writes. During the tests, a 50/50 mix showed total throughput that peaked higher than a pure sequential read workload.

The read:write mix for the workload was adjusted in 25% increments for each run.

### Side-by-side comparison

To better show how caching can influence the performance benchmark tests, the following graph shows total MiB/s for 64-KiB tests with and without caching mechanisms in place. Caching provides an initial slight performance boost for total MiB/s because caching generally improves reads more so than writes. As the read/write mix changes, the total MiB/s without caching exceeds the results that utilize client caching.

## Parallel network connections (nconnect)

The following tests show a high IOP benchmark using a single client with 64-KiB sequential workloads and a 1-TiB data set. The workload mix generated uses a different I/O depth each time. To boost the performance for a single client workload, the nconnect mount option was leveraged for better parallelism in comparison to client mounts that didn't use the nconnect mount option. These tests were run only with caching excluded.

### Results: 64-KiB, sequential, caching excluded, with and without nconnect

The following graphs show a scale-up test's results when reading and writing in 64-KiB chunks on an NFSv3 mount on a single client, with and without parallelization of operations (nconnect). The graphs show that as the I/O depth grows, the I/OPS also increase. But when using a standard TCP connection that provides only a single path to the storage, fewer total operations are sent per second than when a mount is able to leverage more TCP connections per mount point. In addition, the total latency for the operations is generally lower when using nconnect.

### Side-by-side comparison (with and without nconnect)

The following graphs show a side-by-side comparison of 64-KiB sequential reads and writes with and without nconnect to highlight the performance improvements seen when using nconnect: higher overall throughput, lower latency.

## More information

- [Testing methodology](testing-methodology.md)

<!-- -->

## Linux scale-out

This section describes performance benchmarks of Linux workload throughput and workload IOPS.

### Linux workload throughput

This graph represents a 64 kibibyte (KiB) sequential workload and a 1-TiB working set. It shows that a single Azure NetApp Files volume can handle between approximately 1,600 MiB/s pure sequential writes and approximately 4,500 MiB/s pure sequential reads.

The graph illustrates decreases of 10% at a time, from pure read to pure write. It demonstrates what you can expect when using varying read/write ratios (100%:0%, 90%:10%, 80%:20%, and so on).

![Linux workload throughput](./media/performance-benchmarks-linux/performance-benchmarks-linux-workload-throughput.png)

### Linux workload IOPS

The following graph represents a 4-KiB random workload and a 1-TiB working set. The graph shows that an Azure NetApp Files volume can handle between approximately 130,000 pure random writes and approximately 460,000 pure random reads.

This graph illustrates decreases of 10% at a time, from pure read to pure write. It demonstrates what you can expect when using varying read/write ratios (100%:0%, 90%:10%, 80%:20%, and so on).

@@ -72,7 +199,7 @@ The graphs compare the advantages of `nconnect` to a non-`connected` mounted vol

### Linux read throughput

The following graphs show 64-KiB sequential reads of approximately 3,500 MiB/s with `nconnect`, roughly 2.3X non-`nconnect`.

![Linux read throughput](./media/performance-benchmarks-linux/performance-benchmarks-linux-read-throughput.png)

@@ -84,13 +211,13 @@ The following graphs show sequential writes. They indicate that `nconnect` has n

### Linux read IOPS

The following graphs show 4-KiB random reads of approximately 200,000 read IOPS with `nconnect`, roughly 3X non-`nconnect`.

![Linux read IOPS](./media/performance-benchmarks-linux/performance-benchmarks-linux-read-iops.png)

### Linux write IOPS

The following graphs show 4-KiB random writes of approximately 135,000 write IOPS with `nconnect`, roughly 3X non-`nconnect`.

![Linux write IOPS](./media/performance-benchmarks-linux/performance-benchmarks-linux-write-iops.png)