|
| 1 | +# DISCLAIMER |
| 2 | +This code is provided as a starting point for your own benchmarks or for adaptation to your specific needs. It is not production-ready, may lack a testing strategy, and may require modifications to function properly. |
| 3 | + |
| 4 | + |
| 5 | +# gds-benchmarks |
| 6 | +``` |
| 7 | +[root@machine gds_benchmarks]# tree -L 2 |
| 8 | +. |
| 9 | +├── backups |
| 10 | +│ ├── csv |
| 11 | +├── logs |
| 12 | +│ ├── CPU_GPU |
| 13 | +│ ├── CPU_ONLY |
| 14 | +│ ├── GPU_BATCH |
| 15 | +│ ├── GPU_DIRECT |
| 16 | +│ └── GPU_DIRECT_ASYNC |
| 17 | +├── README.md |
| 18 | +└── scripts |
| 19 | + ├── 2csv.sh |
| 20 | + ├── cufile.log |
| 21 | + ├── initialize.sh |
| 22 | + ├── launch.sh |
| 23 | + └── rename.sh |
| 24 | +``` |
| 25 | + |
| 26 | +The **'backups'** directory contains the log files of previous runs and older CSV files. It is kept in case some log files go missing. |
| 27 | +The **'logs'** directory stores the most recent log files. Its sub-directories are organized according to the **'mode'** used during the runs. |
| 28 | + |
| 29 | +The **'scripts'** directory contains all the bash scripts used for initializing the GDS machine, running benchmarks, and transforming log files into CSV format. |
| 30 | + |
| 31 | +**'initialize.sh'**: This script prepares the machine for GDS usage by performing necessary configurations. It disables ACS, sets the correct LBA for the NVMes, and handles the creation, formatting, and mounting of a RAID 0 array, while also adjusting file system privileges. |
| 32 | + |
| 33 | +**'launch.sh'**: This script executes the benchmarks and is parameterized by 'mode' (options 0, 1, 2, 5, and 6, corresponding to GPU_DIRECT, CPU_ONLY, CPU_GPU, GPU_DIRECT_ASYNC, and GPU_BATCH respectively) and by the number of threads. An optional third parameter specifies the block size (from 4k to 16M). |
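The mode numbers map onto the sub-directories of **'logs'** shown in the tree above. A minimal sketch of that mapping follows; `mode_name` is a hypothetical helper written for illustration and is not taken from launch.sh.

```shell
# Hypothetical helper (an assumption, not part of launch.sh):
# translate a mode number into the matching log sub-directory name.
mode_name() {
    case "$1" in
        0) echo GPU_DIRECT ;;
        1) echo CPU_ONLY ;;
        2) echo CPU_GPU ;;
        5) echo GPU_DIRECT_ASYNC ;;
        6) echo GPU_BATCH ;;
        *) echo "unsupported mode: $1" >&2; return 1 ;;
    esac
}

mode_name 2    # prints CPU_GPU
```

Rejecting unknown numbers up front avoids starting a long run that would log into a non-existent directory.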
| 34 | + |
| 35 | +During operation, **'launch.sh'** runs multiple tests over a predefined set of block sizes (4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1M, 2M, 4M, 8M, 16M). Each test is repeated four times. The file size varies with the number of threads and is kept relatively small, because the GDSIO benchmark scales the total amount of data with the number of threads. For example, running eight threads with a 128G file size would write 8 x 128G = 1024G (about 1 TB) of data in total. |
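The arithmetic behind that example can be checked with a few lines of shell. This is only a sketch: `total_gib` is a hypothetical helper, and it assumes, as the paragraph above describes, that every thread works on its own file of the given size.

```shell
# Hypothetical helper: total data handled by one pass, assuming each
# worker thread gets its own file of size_gib GiB.
total_gib() {
    local threads=$1 size_gib=$2
    echo $(( threads * size_gib ))
}

total_gib 8 128    # 8 threads x 128 GiB each -> prints 1024
```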
| 36 | + |
| 37 | + |
| 38 | +Each test is logged under the directory for its mode. Log file names have the following format: ``gdsio_s<block size>_m<mode number>_w<threads / workers>_r<repetition number>_d<timestamp>.log`` |
| 39 | +For example, ``gdsio_s1M_m2_w1_r1_d1671836744.log`` |
| 40 | +refers to repetition 1 of a test that used a 1M block size, mode 2 (CPU_GPU), and 1 thread; the ``d`` field (1671836744) is the timestamp. |
| 41 | +Each log also contains the full command that was issued, followed by the gdsio output: |
| 42 | +``` |
| 43 | +/usr/local/cuda/gds/tools/gdsio -D /mnt/nvme0/ -d 0 -w 1 -s 8G -x 2 -i 1M -I 1 -V >> /home/ubuntu/gds_benchmarks/scripts/../logs/CPU_GPU//gdsio_s1M_m2_w1_r1_d1671836744.log |
| 44 | +IoType: WRITE XferType: CPU_GPU Threads: 1 DataSetSize: 8388608/8388608(KiB) IOSize: 1024(KiB) Throughput: 4.179353 GiB/sec, Avg_Latency: 233.618286 usecs ops: 8192 total_time 1.914172 secs |
| 45 | +Verifying data |
| 46 | +IoType: READ XferType: CPU_GPU Threads: 1 DataSetSize: 8388608/8388608(KiB) IOSize: 1024(KiB) Throughput: 2.681619 GiB/sec, Avg_Latency: 364.043213 usecs ops: 8192 total_time 2.983272 secs |
| 47 | +``` |
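The file-name format described above can be split back into its fields when post-processing logs. The sketch below is one possible way to do it with `sed`; `parse_log_name` is a hypothetical helper, not one of the scripts in this repository.

```shell
# Hypothetical helper: extract the fields encoded in a log file name.
# Prints nothing if the name does not match the expected format.
parse_log_name() {
    # Strip any leading directory, then capture the five fields.
    basename "$1" | sed -n -E \
        's/^gdsio_s([^_]+)_m([0-9]+)_w([0-9]+)_r([0-9]+)_d([0-9]+)\.log$/block_size=\1 mode=\2 threads=\3 repetition=\4 timestamp=\5/p'
}

parse_log_name logs/CPU_GPU/gdsio_s1M_m2_w1_r1_d1671836744.log
# prints: block_size=1M mode=2 threads=1 repetition=1 timestamp=1671836744
```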
| 48 | + |
| 49 | +**'2csv.sh'**: This script is responsible for converting the log files into CSV format. The resulting CSV files can then be imported into Excel for data visualization and analysis. |
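The contents of 2csv.sh are not shown here, so the following is only a sketch of how such a conversion could work, based on the result lines in the sample log above. `log_to_csv` and the CSV column names are assumptions chosen for illustration.

```shell
# Hypothetical sketch (not the actual 2csv.sh): turn gdsio result lines
# into CSV rows. Reads the given log files, or stdin if none are given.
log_to_csv() {
    awk 'BEGIN {
             OFS = ","
             print "io_type,xfer_type,threads,io_size_kib,throughput_gib_s,avg_latency_us"
         }
         $1 == "IoType:" {
             io_size = $10
             sub(/\(KiB\)/, "", io_size)   # "1024(KiB)" -> "1024"
             print $2, $4, $6, io_size, $12, $15
         }' "$@"
}
```

Lines that do not start with `IoType:` (the issued command, `Verifying data`) are skipped, so a whole log file can be passed in unchanged, for example `log_to_csv ../logs/CPU_GPU/*.log > cpu_gpu.csv`.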
| 50 | + |
| 51 | +**'How to launch the scripts'**: The launch.sh script accepts the block size as an additional parameter, so an interrupted run can be resumed without restarting it entirely. For instance, the first loop below resumes mode 0 from 32 threads and a 16k block size (note that it also starts at 16k for 64 and 128 threads, so 4k and 8k are not run for those thread counts). The remaining loops perform complete runs of modes 2, 5, and 6 for all block sizes and thread counts: |
| 52 | +``` |
| 53 | +for i in 32 64 128; |
| 54 | + do |
| 55 | + for k in 16k 32k 64k 128k 256k 512k 1M 2M 4M 8M 16M; |
| 56 | + do |
| 57 | + echo ./launch.sh 0 $i $k; |
| 58 | + done; |
| 59 | +done | bash; |
| 60 | + |
| 61 | +for i in 1 2 4 8 16 32 64 128; |
| 62 | + do |
| 63 | + for k in 4k 8k 16k 32k 64k 128k 256k 512k 1M 2M 4M 8M 16M; |
| 64 | + do |
| 65 | + echo ./launch.sh 2 $i $k; |
| 66 | + done; |
| 67 | +done | bash; |
| 68 | + |
| 69 | +for i in 1 2 4 8 16 32 64 128; |
| 70 | + do |
| 71 | + for k in 4k 8k 16k 32k 64k 128k 256k 512k 1M 2M 4M 8M 16M; |
| 72 | + do |
| 73 | + echo ./launch.sh 5 $i $k; |
| 74 | + done; |
| 75 | +done | bash; |
| 76 | + |
| 77 | +for i in 1 2 4 8 16 32 64 128; |
| 78 | + do |
| 79 | + for k in 4k 8k 16k 32k 64k 128k 256k 512k 1M 2M 4M 8M 16M; |
| 80 | + do |
| 81 | + echo ./launch.sh 6 $i $k; |
| 82 | + done; |
| 83 | +done | bash; |
| 84 | +``` |
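When resuming, it can help to generate the list of remaining runs from the exact point of interruption, so that no block size is skipped for the later thread counts. The sketch below assumes the same thread counts and block sizes as the loops above; `resume_commands` is a hypothetical helper and only prints the commands.

```shell
threads_list="1 2 4 8 16 32 64 128"
blocks_list="4k 8k 16k 32k 64k 128k 256k 512k 1M 2M 4M 8M 16M"

# Hypothetical helper: print one "./launch.sh <mode> <threads> <block size>"
# line for every run from the given starting point onwards.
resume_commands() {
    local mode=$1 start_threads=$2 start_block=$3
    local started=0 t b
    for t in $threads_list; do
        for b in $blocks_list; do
            # Start emitting once the interrupted combination is reached.
            if [ "$t" = "$start_threads" ] && [ "$b" = "$start_block" ]; then
                started=1
            fi
            if [ "$started" -eq 1 ]; then
                echo "./launch.sh $mode $t $b"
            fi
        done
    done
}

resume_commands 0 32 16k    # review the list, then append "| bash" to run it
```

Printing the commands first keeps the `echo ... | bash` pattern used above while making it possible to inspect the list before anything is executed.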
| 85 | + |
| 86 | +# License |
| 87 | + |
| 88 | +Copyright (c) 2024 Oracle and/or its affiliates. |
| 89 | + |
| 90 | +Licensed under the Universal Permissive License (UPL), Version 1.0. |
| 91 | + |
| 92 | +See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details. |