Commit 1815ad8 (parent dbe3216): benchmarking paru, umfpack, superlu, and mumps
30 files changed: +7488, -7 lines

ParU/Demo/Benchmarking/README.txt (123 additions, 7 deletions):

This folder contains sample scripts to run the paru_benchmark program, and the
MUMPS and SuperLU benchmarks, on the test matrices used for the ACM TOMS
submission for ParU. Linux is required.

To get the matrices (in Matrix Market format) from the sparse.tamu.edu website,
use the following:

    chmod +x get_matrices
    ./get_matrices

The above script will download the matrices into your /tmp/matrices folder.
About 17GB is required. These are used for all solvers except SuperLU.
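
The download step can be sketched as follows. The mm_url helper, the wget call,
and the TSOPF matrix named below are illustrative only (the real get_matrices
script defines its own matrix list), but the URL layout is the one the
sparse.tamu.edu collection serves:

```shell
# Hypothetical sketch of what a get_matrices-style download does.
# The collection serves Matrix Market files as
#   https://sparse.tamu.edu/MM/<group>/<name>.tar.gz
mm_url () {
    printf 'https://sparse.tamu.edu/MM/%s/%s.tar.gz\n' "$1" "$2"
}

mkdir -p /tmp/matrices
# For a real download, uncomment the next two lines:
# wget -P /tmp/matrices "$(mm_url TSOPF TSOPF_RS_b39_c30)"
# tar -xzf /tmp/matrices/TSOPF_RS_b39_c30.tar.gz -C /tmp/matrices
mm_url TSOPF TSOPF_RS_b39_c30
```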

Next, compile ParU and its demo/benchmark programs (where "SuiteSparse" is
your top-level SuiteSparse repository; suppose it is in your home directory):

    cd ~/SuiteSparse
    make
    cd ParU
    make demos

Next, run the benchmarks for ParU and UMFPACK with:

    chmod +x run_benchmarks
    script
    ./run_benchmarks

This will run all the benchmarks for ParU and UMFPACK for the ACM TOMS paper,
and will save the results in the file "typescript"; the results will also
be displayed on your screen.
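
The bare "script" command above starts a capture shell; script(1) can also run
the benchmark as a single command while still recording everything to a file.
A small sketch, with echo standing in for ./run_benchmarks:

```shell
# script(1) copies all terminal output into a file ("typescript" by
# default; a path can be given explicitly).  With -c it runs one command
# instead of an interactive shell; -q suppresses the start/done banners.
# Here echo stands in for ./run_benchmarks.
script -q -c 'echo benchmark output' /tmp/typescript_demo
grep 'benchmark output' /tmp/typescript_demo
```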

Note that we used the following scripts to benchmark ParU and UMFPACK for the
ACM TOMS paper submission for ParU, but they are specific to our two systems.
We include them for reference:

    do_paru_and_umf_hyper
    do_paru_and_umf.slurm

To benchmark MUMPS, first obtain a copy of MUMPS 5.7.3. After uncompressing
the original MUMPS 5.7.3 into (say) a ~/MUMPS folder in your home directory,
make the following modifications:

    cp -f mumps_573_benchmarking/Makefile.inc ~/MUMPS/
    cp -f mumps_573_benchmarking/examples/* ~/MUMPS/examples

Then edit your ~/MUMPS/Makefile.inc to select the appropriate libraries.
You will likely need to revise the location of the metis-5.1.0 library;
it is not included in MUMPS. You can obtain it at one of these links:

    https://github.com/KarypisLab/METIS
    https://karypis.github.io/glaros/software/metis/overview.html
    https://karypis.github.io/glaros/files/sw/metis/metis-5.1.0.tar.gz

Place a copy in ~/metis-5.1.0 (for example), and revise your ~/MUMPS/Makefile.inc
file accordingly. Then build MUMPS, following the MUMPS instructions. Next,
use the following to run MUMPS on the test matrices:

    cd ~/SuiteSparse/ParU/Demo/Benchmarking/mumps_573_benchmarking
    script
    ./run_mumps

To benchmark SuperLU_MT 4.0.1, first obtain a copy of superlu_mt_401
(suppose it appears as ~/superlu_mt) and copy a few revised files into the
original distribution:

    cp -f superlu_mt_401_benchmarking/SRC/* ~/superlu_mt/SRC
    cp -f superlu_mt_401_benchmarking/EXAMPLE/* ~/superlu_mt/EXAMPLE
    cp -f build_with_* ~/superlu_mt/
    cp -f CMakeLists.txt ~/superlu_mt/

Then revise the build_with_gcc_and_mkl script to match your system (you will
need to tell it where to find the Intel MKL library). Then build SuperLU_MT
with:

    ./build_with_gcc_and_mkl

Download the matrices for SuperLU_MT with:

    ./get_RB_matrices

(requires about 15GB). Next, run the SuperLU_MT benchmarks with:

    cd ~/SuiteSparse/ParU/Demo/Benchmarking/superlu_mt_401_benchmarking
    script
    ./run_superlu

The output files from all of these benchmarks vary from program to program.
To collect the run times for import into a CSV file, use the following on
each of the output files:

    grep TABLE typescript

Sample outputs are listed below. For UMFPACK and ParU, the 3rd column
is the name of the matrix. The next 3 columns give the umfpack
and paru strategies (1: unsym, 2: symmetric), and the ordering
(1: amd/colamd, 3: metis). The sym_time is the symbolic analysis
time, the num_times are the run times for each # of threads used
(from high to low), followed by the solve times.

    TABLE, UMF, TSOPF_RS_b39_c30.mtx, 1, 1, 1, sym_time:, 7.406790e-02, num_times:, 1.018119e-01, 1.018070e-01, 1.014700e-01, 9.615564e-02, 9.654265e-02, 9.607372e-02, 9.630437e-02, sol_times:, 2.128671e-02, 2.123689e-02, 1.612758e-02, 1.608125e-02, 1.602140e-02, 1.599254e-02, 1.601883e-02,
    TABLE, UMF, TSOPF_RS_b39_c30.mtx, 1, 1, 3, sym_time:, 3.292655e-01, num_times:, 1.517104e-01, 1.516826e-01, 1.518991e-01, 1.511355e-01, 1.515573e-01, 1.528105e-01, 1.514552e-01, sol_times:, 2.080022e-02, 2.066906e-02, 2.067235e-02, 2.076371e-02, 2.066134e-02, 2.082458e-02, 2.074122e-02,
    TABLE, ParU, TSOPF_RS_b39_c30.mtx, 1, 1, 1, sym_time:, 7.695978e-02, num_times:, 1.453122e-01, 1.216802e-01, 1.230776e-01, 1.284256e-01, 1.216281e-01, 1.164964e-01, 9.921592e-02, sol_times:, 8.401886e-03, 7.752119e-03, 8.355235e-03, 8.849248e-03, 8.315628e-03, 7.050963e-03, 5.527283e-03,
    TABLE, ParU, TSOPF_RS_b39_c30.mtx, 1, 1, 3, sym_time:, 3.495287e-01, num_times:, 2.116326e-01, 1.406327e-01, 1.310422e-01, 1.301144e-01, 1.134511e-01, 1.313425e-01, 1.388268e-01, sol_times:, 1.492923e-02, 1.442635e-02, 1.443325e-02, 1.284580e-02, 1.152180e-02, 1.207555e-02, 9.344153e-03,
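
As a sketch of the CSV step: the TABLE fields are already comma-separated, so
after the grep it is mostly a matter of stripping each line's trailing comma.
The here-doc below stands in for a real typescript file:

```shell
# Filter the TABLE rows out of the script(1) output and write a CSV
# file, stripping the trailing comma each row ends with.  With a real
# capture file this would be:
#   grep TABLE typescript | sed 's/,[[:space:]]*$//' > results.csv
grep TABLE <<'EOF' | sed 's/,[[:space:]]*$//' > results.csv
Some other benchmark chatter ...
TABLE, UMF, TSOPF_RS_b39_c30.mtx, 1, 1, 1, sym_time:, 7.406790e-02,
EOF
cat results.csv
```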

An example MUMPS output is listed below. It has the same
format as the ParU and UMFPACK outputs, except the run times are in
order of low to high # of threads. The 4th column is the ordering
(1: amd, 2: metis on A+A').

    TABLE, MUMPS, /tmp/matrices/TSOPF_RS_b39_c30/TSOPF_RS_b39_c30.mtx, 1, sym_time:, 3.204014e-01, num_times:, 6.222303e-02, 7.103832e-02, 4.108610e-02, 4.489215e-02, 4.803422e-02, 6.339786e-02, sol_times:, 1.091543e-02, 2.187732e-02, 8.204759e-03, 9.458208e-03, 1.547582e-02, 1.731181e-02,

SuperLU is similar, except that the matrix name is not listed
(use awk to find both "TABLE" and "Matrix:" if preferred).

    TABLE, SuperLU_MT, threads:, 32, ordering:, 3, analyze_time:, 4.29381728e-02, 4.40463973e-02, 4.21937061e-02, 4.61900234e-02, 4.15172875e-02, 4.73920098e-02, factor_time:, 8.40138663e-02, 6.18000347e-02, 5.73010538e-02, 5.09567745e-02, 5.99720702e-02, 7.43006011e-02, solve_time:, 6.62509473e-02, 7.43965395e-02, 5.66021195e-02, 7.09967716e-02, 4.94663576e-02, 6.65261745e-02,
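
One way to do that pairing with awk is sketched below; the exact
"Matrix: <name>" line format is an assumption about the SuperLU_MT log,
not copied from it:

```shell
# Pair each TABLE line with the most recent "Matrix:" line, since the
# SuperLU_MT TABLE rows omit the matrix name.  The here-doc is a
# stand-in for the real benchmark log.
awk '/Matrix:/ {m = $2} /TABLE/ {print m ", " $0}' <<'EOF'
Matrix: TSOPF_RS_b39_c30.rb
TABLE, SuperLU_MT, threads:, 32, ordering:, 3,
EOF
```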
117+
118+
For the ACM TOMS submissions, we then copied these run times from
119+
a spreadsheet into a MATLAB script that generated the plots in the
120+
figures in the paper. This step is a bit tedious so we have omitted
121+
the details. However, the final results for our two systems are
122+
in these files in this folder:
123+
124+
analyze_grace.m plot the results on grace.hprc.tamu.edu
125+
analyze_hyper.m plot the results on a 24-core desktop
126+
plot_one_matrix.m used by analyze_*.m
127+
subplot_one_matrix.m used by analyze_*.m
12128
