|
1 | | -This folder contains sample scripts to run the paru_benchmark program on two |
2 | | -particular systems. These scripts were used to benchmark ParU and UMFPACK for |
3 | | -the ACM TOMS paper submission for ParU: |
| 1 | +This folder contains sample scripts to run the paru_benchmark program, and the |
| 2 | +MUMPS and SuperLU benchmarks, on the test matrices used for the ACM TOMS |
| 3 | +submission for ParU. Linux is required. |
| 4 | + |
| 5 | +To get the matrices (in Matrix Market format) from the sparse.tamu.edu website, |
| 6 | +use the following: |
| 7 | + |
| 8 | + chmod +x get_matrices |
| 9 | + ./get_matrices |
| 10 | + |
| 11 | +The above script will download the matrices into your /tmp/matrices folder. |
| 12 | +About 17GB is required. These are used for all solvers except SuperLU. |
| 13 | + |
| 14 | +Next, compile ParU and its demos/benchmark programs (where "SuiteSparse") is |
| 15 | +your top-level SuiteSparse repository (suppose it is in your home directory): |
| 16 | + |
| 17 | + cd ~/SuiteSparse |
| 18 | + make |
| 19 | + cd ParU |
| 20 | + make demos |
| 21 | + |
| 22 | +Finally, run the benchmarks for ParU and UMFPACK, with: |
| 23 | + |
| 24 | + chmod +x run_benchmarks |
| 25 | + script |
| 26 | + ./run_benchmarks |
| 27 | + |
| 28 | +This will run all the benchmarks for ParU and UMFPACK for the ACM TOMS paper, |
| 29 | +and will save the results in the file "typescript"; the results will also |
| 30 | +be displayed on your screen. |
| 31 | + |
| 32 | +Note that we used the following scripts to benchmark ParU and UMFPACK for the |
| 33 | +ACM TOMS paper submission for ParU, but they are specific to our two systems. |
| 34 | +We include them for reference: |
4 | 35 |
|
5 | 36 | do_paru_and_umf_hyper |
6 | 37 | do_paru_and_umf.slurm |
7 | 38 |
|
8 | | -You will need to revise these scripts to point to the particular locations of |
9 | | -the paru_benchmark program, and the test matrices (in Matrix Market format). |
10 | | -The test matrices are too large to include here. They can be obtained from the |
11 | | -SuiteSparse matrix collection, currently at sparse.tamu.edu. |
| 39 | +To benchmark MUMPS, first obtain a copy of MUMPS 5.7.3. After uncompressing |
| 40 | +the original MUMPS 5.7.3 into (say) a ~/MUMPS folder in your home directory, |
| 41 | +make the following modifications; |
| 42 | + |
| 43 | + cp -f mumps_573_benchmarking/Makefile.inc MUMPS/ |
| 44 | + cp -f mumps_573_benchmarking/examples/* MUMPS/examples |
| 45 | + |
| 46 | +Then edit your ~/MUMPS/Makefile.inc to select the appropriate libraries. |
| 47 | +You will likely need to revise the location of the metis-5.1.0 library; |
| 48 | +it is not included in MUMPS. You can obtain it at one of these links: |
| 49 | + |
| 50 | + https://github.com/KarypisLab/METIS |
| 51 | + https://karypis.github.io/glaros/software/metis/overview.html |
| 52 | + https://karypis.github.io/glaros/files/sw/metis/metis-5.1.0.tar.gz |
| 53 | + |
| 54 | +Place a copy in ~/metis-5.1.0 (for example), and revise your MUMPS/Makefile.inc |
| 55 | +file accordingly. Then build MUMPS, following the MUMPS instructions. Next, |
| 56 | +use the following to run MUMPS on the test matrices: |
| 57 | + |
| 58 | + cd ~/SuiteSparse/ParU/Demo/Benchmarking/mumps_573_benchmarking |
| 59 | + script |
| 60 | + ./run_mumps |
| 61 | + |
| 62 | +To benchmark SuperLU_MT 4.0.1, first obtain a copy of superlu_mt_401 and |
| 63 | +(suppose it appears as ~/superlu_mt) and copy a few revised files into the |
| 64 | +original distribution: |
| 65 | + |
| 66 | + cp -f superlu_mt_401_benchmarking/SRC/* ~/superlu_mt/SRC |
| 67 | + cp -f superlu_mt_401_benchmarking/EXAMPLE/* ~/superlu_mt/EXAMPLE |
| 68 | + cp -f build_with_* ~/superlu_mt/ |
| 69 | + cp -f CMakeLists.txt ~/superlu_mt/ |
| 70 | + |
| 71 | +Then revise the build_with_gcc_and_mkl to match your system (you will need |
| 72 | +to tell it where to find the Intel MKL library). Then build SuperLU_MT |
| 73 | +with: |
| 74 | + |
| 75 | + ./build_with_gcc_and_mkl |
| 76 | + |
| 77 | +download the matrices for SuperLU_MT with: |
| 78 | + |
| 79 | + ./get_RB_matrices |
| 80 | + |
| 81 | +(requires about 15GB). Next, run the SuperLU_MT benchmarks with: |
| 82 | + |
| 83 | + cd ~/SuiteSparse/ParU/Demo/Benchmarking/superlu_mt_401_benchmarking |
| 84 | + script |
| 85 | + ./run_superlu |
| 86 | + |
| 87 | +The output files from all of these benchmarks vary from program to program. |
| 88 | +To collect the run times for import into a CSV file, use the following on |
| 89 | +each the output files: |
| 90 | + |
| 91 | + grep TABLE typescript |
| 92 | + |
| 93 | +sample outputs are listed below. For UMFPACK and ParU, the 3rd column |
| 94 | +is the name of the matrix. The next 3 columns give the umfpack |
| 95 | +and paru strategies (1: unsym, 2: symmetric), and the ordering |
| 96 | +(1: amd/colamd, 3: metis). The sym_time is the symbolic analysis |
| 97 | +time, the num_times are the run times for each # of threads used |
| 98 | +(from high to low), followed by the solve times. |
| 99 | + |
| 100 | + TABLE, UMF, TSOPF_RS_b39_c30.mtx, 1, 1, 1, sym_time:, 7.406790e-02, num_times:, 1.018119e-01, 1.018070e-01, 1.014700e-01, 9.615564e-02, 9.654265e-02, 9.607372e-02, 9.630437e-02, sol_times:, 2.128671e-02, 2.123689e-02, 1.612758e-02, 1.608125e-02, 1.602140e-02, 1.599254e-02, 1.601883e-02, |
| 101 | + TABLE, UMF, TSOPF_RS_b39_c30.mtx, 1, 1, 3, sym_time:, 3.292655e-01, num_times:, 1.517104e-01, 1.516826e-01, 1.518991e-01, 1.511355e-01, 1.515573e-01, 1.528105e-01, 1.514552e-01, sol_times:, 2.080022e-02, 2.066906e-02, 2.067235e-02, 2.076371e-02, 2.066134e-02, 2.082458e-02, 2.074122e-02, |
| 102 | + TABLE, ParU, TSOPF_RS_b39_c30.mtx, 1, 1, 1, sym_time:, 7.695978e-02, num_times:, 1.453122e-01, 1.216802e-01, 1.230776e-01, 1.284256e-01, 1.216281e-01, 1.164964e-01, 9.921592e-02, sol_times:, 8.401886e-03, 7.752119e-03, 8.355235e-03, 8.849248e-03, 8.315628e-03, 7.050963e-03, 5.527283e-03, |
| 103 | + TABLE, ParU, TSOPF_RS_b39_c30.mtx, 1, 1, 3, sym_time:, 3.495287e-01, num_times:, 2.116326e-01, 1.406327e-01, 1.310422e-01, 1.301144e-01, 1.134511e-01, 1.313425e-01, 1.388268e-01, sol_times:, 1.492923e-02, 1.442635e-02, 1.443325e-02, 1.284580e-02, 1.152180e-02, 1.207555e-02, 9.344153e-03, |
| 104 | + |
| 105 | + |
| 106 | +An example MUMPS output is listed below. It has the same |
| 107 | +format as the ParU and UMFPACK outputs, except the run times are in |
| 108 | +order of low to high # of threads. The 4th column is the ordering |
| 109 | +(1: amd, 2: metis on A+A'). |
| 110 | + |
| 111 | + TABLE, MUMPS, /tmp/matrices/TSOPF_RS_b39_c30/TSOPF_RS_b39_c30.mtx, 1, sym_time:, 3.204014e-01, num_times:, 6.222303e-02, 7.103832e-02, 4.108610e-02, 4.489215e-02, 4.803422e-02, 6.339786e-02, sol_times:, 1.091543e-02, 2.187732e-02, 8.204759e-03, 9.458208e-03, 1.547582e-02, 1.731181e-02, |
| 112 | + |
| 113 | +SuperLU is similar, except that the matrix name is not listed |
| 114 | +(use awk to find both "TABLE" and "Matrix:" if preferred). |
| 115 | + |
| 116 | + TABLE, SuperLU_MT, threads:, 32, ordering:, 3, analyze_time:, 4.29381728e-02, 4.40463973e-02, 4.21937061e-02, 4.61900234e-02, 4.15172875e-02, 4.73920098e-02, factor_time:, 8.40138663e-02, 6.18000347e-02, 5.73010538e-02, 5.09567745e-02, 5.99720702e-02, 7.43006011e-02, solve_time:, 6.62509473e-02, 7.43965395e-02, 5.66021195e-02, 7.09967716e-02, 4.94663576e-02, 6.65261745e-02, |
| 117 | + |
| 118 | +For the ACM TOMS submissions, we then copied these run times from |
| 119 | +a spreadsheet into a MATLAB script that generated the plots in the |
| 120 | +figures in the paper. This step is a bit tedious so we have omitted |
| 121 | +the details. However, the final results for our two systems are |
| 122 | +in these files in this folder: |
| 123 | + |
| 124 | + analyze_grace.m plot the results on grace.hprc.tamu.edu |
| 125 | + analyze_hyper.m plot the results on a 24-core desktop |
| 126 | + plot_one_matrix.m used by analyze_*.m |
| 127 | + subplot_one_matrix.m used by analyze_*.m |
12 | 128 |
|
0 commit comments