
Commit be8a136

acrolinx
1 parent da9e92c commit be8a136

1 file changed: +33 -33 lines changed

articles/azure-netapp-files/performance-benchmarks-linux.md

Lines changed: 33 additions & 33 deletions
@@ -18,10 +18,10 @@ This article describes performance benchmarks Azure NetApp Files delivers for Li

 The intent of a scale-out test is to show the performance of an Azure NetApp File volume when scaling out (or increasing) the number of clients generating simultaneous workload to the same volume. These tests are generally able to push a volume to the edge of its performance limits and are indicative of workloads such as media rendering, AI/ML, and other workloads that utilize large compute farms to perform work.

-High IOP scale out benchmark configuration
+## High IOP scale-out benchmark configuration

 These benchmarks used the following:
-- A single Azure NetApp Files 100-TiB regular volume with a 1-TiB data set using the Ultra performance tier
+- A single Azure NetApp Files 100-TiB regular volume with a 1-TiB dataset using the Ultra performance tier
 - [FIO (with and without setting randrepeat=0)](testing-methodology.md)
 - 4-KiB and 8-KiB block sizes
 - 6 D32s_v5 virtual machines running RHEL 9.3
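The FIO jobs themselves aren't shown in this diff; as a reference only, the following is a minimal sketch of how the configuration above could be driven, assuming a hypothetical mount point /mnt/anf and illustrative queue-depth and job-count values.

```bash
# Not part of this commit: a minimal FIO sketch for the 4-KiB random workload
# described above, assuming a hypothetical mount point /mnt/anf.
# 16 jobs x 64-GiB files roughly matches the article's 1-TiB data set;
# iodepth and numjobs are illustrative values only.
# --randrepeat=0 reduces client-caching influence (omit it for the
# "caching included" style of run); use --bs=8k for the 8-KiB tests.
# --rwmixread is the read percentage, which the benchmarks step per run.
fio --name=anf-4k-rand --directory=/mnt/anf --size=64g --numjobs=16 \
    --bs=4k --rw=randrw --rwmixread=80 \
    --ioengine=libaio --direct=1 --iodepth=64 \
    --randrepeat=0 --time_based --runtime=300 --group_reporting
```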
@@ -33,7 +33,7 @@ These benchmarks used the following:

 These benchmarks used the following:

-- A single Azure NetApp Files regular volume with a 1-TiB data set using the Ultra performance tier
+- A single Azure NetApp Files regular volume with a 1-TiB dataset using the Ultra performance tier
 FIO (with and without setting randrepeat=0)
 - [FIO (with and without setting randrepeat=0)](testing-methodology.md)
 - 64-KiB and 256-KiB block size
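For the high throughput configuration above, a comparable sketch (again assuming a hypothetical /mnt/anf mount and illustrative parallelism values) would look like this:

```bash
# Not part of this commit: a minimal FIO sketch for the 64-KiB sequential
# throughput workload described above, assuming a hypothetical mount point
# /mnt/anf. Use --bs=256k for the 256-KiB runs. --rwmixread sets the read
# percentage, which the benchmarks sweep (100/0, 75/25, 50/50, and so on).
# numjobs, iodepth, and file sizes are illustrative values only.
fio --name=anf-64k-seq --directory=/mnt/anf --size=64g --numjobs=16 \
    --bs=64k --rw=rw --rwmixread=75 \
    --ioengine=libaio --direct=1 --iodepth=16 \
    --randrepeat=0 --time_based --runtime=300 --group_reporting
```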
@@ -42,21 +42,21 @@ FIO (with and without setting randrepeat=0)
 - [Manual QoS](manage-manual-qos-capacity-pool.md)
 - Mount options: rw,nconnect=8,hard,rsize=262144,wsize=262144,vers=3,tcp,bg

-## Parallel network connection (nconnect) benchmark configuration
+## Parallel network connection (`nconnect`) benchmark configuration

 These benchmarks used the following:
-- A single Azure NetApp Files regular volume with a 1-TiB data set using the Ultra performance tier
+- A single Azure NetApp Files regular volume with a 1-TiB dataset using the Ultra performance tier
 - FIO (with and without setting randrepeat=0)
 - 4-KiB and 64-KiB wsize/rsize
-- A single D32s_v4 virtual machines running RHEL 9.3
-- NFSv3 with and without nconnect
+- A single D32s_v4 virtual machine running RHEL 9.3
+- NFSv3 with and without `nconnect`
 - Mount options: rw,nconnect=8,hard,rsize=262144,wsize=262144,vers=3,tcp,bg
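The mount options listed in this configuration map directly to an NFS mount command; the sketch below assumes a hypothetical volume export (10.0.0.4:/anf-vol01) and mount point (/mnt/anf).

```bash
# Not part of this commit: the mount options listed above as an NFS mount
# command, assuming a hypothetical export 10.0.0.4:/anf-vol01 and mount
# point /mnt/anf. nconnect=8 opens eight TCP connections for the mount.
sudo mkdir -p /mnt/anf
sudo mount -t nfs -o rw,nconnect=8,hard,rsize=262144,wsize=262144,vers=3,tcp,bg \
    10.0.0.4:/anf-vol01 /mnt/anf
```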

 ## Scale-up benchmark tests

-The scale-up test’s intent is to show the performance of an Azure NetApp File volume when scaling up (or increasing) the number of jobs generating simultaneous workload across multiple TCP connections on a single client to the same volume (such as with [nconnect](performance-linux-mount-options.md#nconnect)).
+The scale-up test’s intent is to show the performance of an Azure NetApp File volume when scaling up (or increasing) the number of jobs generating simultaneous workload across multiple TCP connections on a single client to the same volume (such as with [`nconnect`](performance-linux-mount-options.md#nconnect)).

-Without nconnect, these workloads cannot push the limits of a volume’s maximum performance, since the client cannot generate enough IO or network throughput. These tests are generally indicative of what a single user’s experience might be in workloads such as media rendering, databases, AI/ML, and general file shares.
+Without `nconnect`, these workloads can't push the limits of a volume’s maximum performance, since the client can't generate enough IO or network throughput. These tests are generally indicative of what a single user’s experience might be in workloads such as media rendering, databases, AI/ML, and general file shares.

 ## High IOP scale-out benchmarks

@@ -74,62 +74,62 @@ For more information, see [Testing methodology](testing-methodology.md).

 In this benchmark, FIO ran without the `randrepeat` option to randomize data. Thus, an indeterminate amount of caching came into play. This configuration results in slightly better overall performance numbers than tests run without caching with the entire IO stack being utilized.

-In the following graph, testing shows an Azure NetApp Files regular volume can handle between approximately 130,000 pure random 4-KiB writes and approximately 460,000 pure random 4-KiB reads during this benchmark. Read:write mix for the workload adjusted by 10% for each run.
+In the following graph, testing shows an Azure NetApp Files regular volume can handle between approximately 130,000 pure random 4-KiB writes and approximately 460,000 pure random 4 KiB reads during this benchmark. Read-write mix for the workload adjusted by 10% for each run.

-As the read:write IOP mix increases towards write-heavy, the total IOPS decrease.
+As the read-write IOP mix increases towards write-heavy, the total IOPS decrease.

 <!-- 4k random iops graph -->

 ## Results: 4-KiB, random, client caching excluded

-In this benchmark, FIO was run with the setting `randrepeat=0` to randomize data, reducing the caching influence on performance. This resulted in an approximately 8% reduction in write IOPS and a approximately 17% reduction in read IOPS, but displays performance numbers more representative of what the storage can actually do.
+In this benchmark, FIO was run with the setting `randrepeat=0` to randomize data, reducing the caching influence on performance. This resulted in an approximately 8% reduction in write IOPS and an approximately 17% reduction in read IOPS, but displays performance numbers more representative of what the storage can actually do.

-In the following graph, testing shows an Azure NetApp Files regular volume can handle between approximately 120,000 pure random 4-KiB writes and approximately 388,000 pure random 4-KiB reads. Read:write mix for the workload adjusted by 25% for each run.
+In the following graph, testing shows an Azure NetApp Files regular volume can handle between approximately 120,000 pure random 4-KiB writes and approximately 388,000 pure random 4-KiB reads. Read-write mix for the workload adjusted by 25% for each run.

-As the read:write IOP mix increases towards write-heavy, the total IOPS decrease.
+As the read-write IOP mix increases towards write-heavy, the total IOPS decrease.
 <!-- -->

 ## Results: 8-KiB, random, client caching excluded

 Larger read and write sizes will result in fewer total IOPS, as more data can be sent with each operation. An 8-KiB read and write size was used to more accurately simulate what most modern applications use. For instance, many EDA applications utilize 8-KiB reads and writes.

-In this benchmark, FIO ran with `randrepeat=0` to randomize data so the client caching impact was reduced. In the following graph, testing shows that an Azure NetApp Files regular volume can handle between approximately 111,000 pure random 8-KiB writes and approximately 293,000 pure random 8-KiB reads. Read:write mix for the workload adjusted by 25% for each run.
+In this benchmark, FIO ran with `randrepeat=0` to randomize data so the client caching impact was reduced. In the following graph, testing shows that an Azure NetApp Files regular volume can handle between approximately 111,000 pure random 8-KiB writes and approximately 293,000 pure random 8-KiB reads. Read-write mix for the workload adjusted by 25% for each run.

-As the read:write IOP mix increases towards write-heavy, the total IOPS decrease.
+As the read-write IOP mix increases towards write-heavy, the total IOPS decrease.

 ## Side-by-side comparisons

 To illustrate how caching can influence the performance benchmark tests, the following graph shows total I/OPS for 4-KiB tests with and without caching mechanisms in place. As shown, caching provides a slight performance boost for I/OPS fairly consistent trending.

-## Specific offset, streaming random read/write workloads: scale-up tests using parallel network connections (nconnect)
+## Specific offset, streaming random read/write workloads: scale-up tests using parallel network connections (`nconnect`)

-The following tests show a high IOP benchmark using a single client with 4-KiB random workloads and a 1-TiB data set. The workload mix generated uses a different I/O depth each time. To boost the performance for a single client workload, the [nconnect mount option](performance-linux-mount-options.md#nconnect) was used to improve parallelism in comparison to client mounts without the nconnect mount option.
+The following tests show a high IOP benchmark using a single client with 4-KiB random workloads and a 1-TiB dataset. The workload mix generated uses a different I/O depth each time. To boost the performance for a single client workload, the [`nconnect` mount option](performance-linux-mount-options.md#nconnect) was used to improve parallelism in comparison to client mounts without the `nconnect` mount option.

-When using a standard TCP connection that provides only a single path to the storage, fewer total operations are sent per second than when a mount is able to leverage more TCP connections (such as with nconnect) per mount point. When using nconnect, he total latency for the operations is generally lower. These tests are also run with `randrepeat=0` to intentionally avoid caching. For more information on this option, see [Testing methodology](testing-methodology.md).
+When using a standard TCP connection that provides only a single path to the storage, fewer total operations are sent per second than when a mount is able to leverage more TCP connections (such as with `nconnect`) per mount point. When using `nconnect`, the total latency for the operations is generally lower. These tests are also run with `randrepeat=0` to intentionally avoid caching. For more information on this option, see [Testing methodology](testing-methodology.md).

-### Results: 4KiB, random, nconnect vs. no nconnect, caching excluded
+### Results: 4-KiB, random, with and without `nconnect`, caching excluded

-The following graphs show a side-by-side comparison of 4-KiB reads and writes with and without nconnect to highlight the performance improvements seen when using nconnect: higher overall IOPS, lower latency.
+The following graphs show a side-by-side comparison of 4-KiB reads and writes with and without `nconnect` to highlight the performance improvements seen when using `nconnect`: higher overall IOPS, lower latency.

 ## High throughput benchmarks

 The following benchmarks show the performance achieved for Azure NetApp Files with a high throughput workload.

-High throughput workloads are more sequential in nature and often are read/write heavy with low metadata. Throughput is generally more important than IOPS. These workloads typically leverage larger read/write sizes (64-256K), which will generate higher latencies than smaller read/write sizes, since larger payloads will naturally take longer to be processed.
+High throughput workloads are more sequential in nature and often are read/write heavy with low metadata. Throughput is generally more important than I/OPS. These workloads typically leverage larger read/write sizes (64K to 256K), which generate higher latencies than smaller read/write sizes, since larger payloads will naturally take longer to be processed.

 Examples of high throughput workloads include:

 - Media repositories
 - High performance compute
 - AI/ML/LLP

-The following tests show a high throughput benchmark using both 64-KiB and 256-KiB sequential workloads and a 1-TiB data set. The workload mix generated decreases a set percentage at a time and demonstrates what you can expect when using varying read/write ratios (for instance, 100%:0%, 90%:10%, 80%:20%, and so on).
+The following tests show a high throughput benchmark using both 64-KiB and 256-KiB sequential workloads and a 1-TiB dataset. The workload mix generated decreases a set percentage at a time and demonstrates what you can expect when using varying read/write ratios (for instance, 100%:0%, 90%:10%, 80%:20%, and so on).

 ### Results: 64-KiB sequential I/O, caching included

 In this benchmark, FIO ran using looping logic that more aggressively populated the cache, so an indeterminate amount of caching influenced the results. This results in slightly better overall performance numbers than tests run without caching.

-In the graph below, testing shows that an Azure NetApp Files regular volume can handle between approximately 4,500MiB/s pure sequential 64-KiB reads and approximately 1,600MiB/s pure sequential 64-KiB writes. The read:write mix for the workload was adjusted by 10% for each run.
+In the graph below, testing shows that an Azure NetApp Files regular volume can handle between approximately 4,500MiB/s pure sequential 64-KiB reads and approximately 1,600MiB/s pure sequential 64-KiB writes. The read-write mix for the workload was adjusted by 10% for each run.


 ### Results: 64-KiB sequential I/O, caching excluded
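The single-client scale-up tests described above repeat the same 4-KiB random workload at increasing I/O depths; the article doesn't list the depth values, so the following loop is only a sketch of that sweep against a hypothetical /mnt/anf mount.

```bash
# Not part of this commit: a sketch of the single-client scale-up sweep,
# repeating the same 4-KiB random job at increasing queue depths against a
# hypothetical /mnt/anf mount. The depth values are illustrative; run the
# sweep once on an nconnect=8 mount and once on a plain NFSv3 mount to
# produce the with/without comparison.
for depth in 1 4 8 16 32 64; do
  fio --name="anf-4k-qd${depth}" --directory=/mnt/anf --size=64g \
      --bs=4k --rw=randrw --rwmixread=50 \
      --ioengine=libaio --direct=1 --iodepth="${depth}" --numjobs=4 \
      --randrepeat=0 --time_based --runtime=120 --group_reporting
done
```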
@@ -138,31 +138,31 @@ In this benchmark, FIO ran using looping logic that less aggressively populated

 In the following graph, testing demonstrates that an Azure NetApp Files regular volume can handle between approximately 3,600MiB/s pure sequential 64-KiB reads and approximately 2,400MiB/s pure sequential 64-KiB writes. During the tests, a 50/50 mix showed total throughput on par with a pure sequential read workload.

-The read:write mix for the workload was adjusted by 25% for each run.
+The read-write mix for the workload was adjusted by 25% for each run.

 ### Results: 256-KiB sequential I/O, caching excluded

 In this benchmark, FIO ran using looping logic that less aggressively populated the cache, so caching didn't influence the results. This configuration results in slightly less write performance numbers than 64-KiB tests, but higher read numbers than the same 64-KiB tests run without caching.

 In the graph below, testing shows that an Azure NetApp Files regular volume can handle between approximately 3,500MiB/s pure sequential 256-KiB reads and approximately 2,500MiB/s pure sequential 256-KiB writes. During the tests, a 50/50 mix showed total throughput peaked higher than a pure sequential read workload.

-The read:write mix for the workload was adjusted in 25% increments for each run.
+The read-write mix for the workload was adjusted in 25% increments for each run.

 ### Side-by-side comparison

-To better show how caching can influence the performance benchmark tests, the following graph shows total MiB/s for 64-KiB tests with and without caching mechanisms in place. Caching provides an initial slight performance boost for total MiB/s because caching generally improves reads more so than writes. As the read/write mix changes, the total MiB/s without caching exceeds the results that utilize client caching.
+To better show how caching can influence the performance benchmark tests, the following graph shows total MiB/s for 64-KiB tests with and without caching mechanisms in place. Caching provides an initial slight performance boost for total MiB/s because caching generally improves reads more so than writes. As the read/write mix changes, the total throughput without caching exceeds the results that utilize client caching.

-## Parallel network connections (nconnect)
+## Parallel network connections (`nconnect`)

-The following tests show a high IOP benchmark using a single client with 64-KiB random workloads and a 1-TiB data set. The workload mix generated uses a different I/O depth each time. To boost the performance for a single client workload, the nconnect mount option was leveraged for better parallelism in comparison to client mounts that didn't use the nconnect mount option. These tests were run only with caching excluded.
+The following tests show a high IOP benchmark using a single client with 64-KiB random workloads and a 1-TiB dataset. The workload mix generated uses a different I/O depth each time. To boost the performance for a single client workload, the `nconnect` mount option was leveraged for better parallelism in comparison to client mounts that didn't use the `nconnect` mount option. These tests were run only with caching excluded.

-### Results: 64-KiB, sequential, caching excluded, with and without nconnect
+### Results: 64-KiB, sequential, caching excluded, with and without `nconnect`

-The following results show a scale-up test’s results when reading and writing in 4-KiB chunks on a NFSv3 mount on a single client with and without parallelization of operations (nconnect). The graphs show that as the I/O depth grows, the I/OPS also increase. But when using a standard TCP connection that provides only a single path to the storage, fewer total operations are sent per second than when a mount is able to leverage more TCP connections per mount point. In addition, the total latency for the operations is generally lower when using nconnect.
+The following results show a scale-up test’s results when reading and writing in 4-KiB chunks on a NFSv3 mount on a single client with and without parallelization of operations (`nconnect`). The graphs show that as the I/O depth grows, the I/OPS also increase. But when using a standard TCP connection that provides only a single path to the storage, fewer total operations are sent per second than when a mount is able to leverage more TCP connections per mount point. In addition, the total latency for the operations is generally lower when using `nconnect`.

-### Side-by-side comparison (with and without nconnect)
+### Side-by-side comparison (with and without `nconnect`)

-The following graphs show a side-by-side comparison of 64-KiB sequential reads and writes with and without nconnect to highlight the performance improvements seen when using nconnect: higher overall throughput, lower latency.
+The following graphs show a side-by-side comparison of 64-KiB sequential reads and writes with and without `nconnect` to highlight the performance improvements seen when using `nconnect`: higher overall throughput, lower latency.
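The with/without `nconnect` comparisons rerun the same workload against mounts that differ only in that option; a minimal sketch of the remount step, assuming the same hypothetical export as above:

```bash
# Not part of this commit: a sketch of the remount step behind the
# with/without nconnect comparisons, assuming the hypothetical export
# 10.0.0.4:/anf-vol01. Run the same FIO job against each mount in turn.
sudo mount -t nfs -o rw,hard,rsize=262144,wsize=262144,vers=3,tcp,bg \
    10.0.0.4:/anf-vol01 /mnt/anf      # single TCP connection
# ...run the workload, record results, then remount with nconnect...
sudo umount /mnt/anf
sudo mount -t nfs -o rw,nconnect=8,hard,rsize=262144,wsize=262144,vers=3,tcp,bg \
    10.0.0.4:/anf-vol01 /mnt/anf      # eight TCP connections
```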

 ## More information
