GPU Saturation Scorer (gssr)

gssr is a utility for collecting and analyzing GPU performance metrics on the CSCS ALPS system. It is built on top of NVIDIA's DCGM tool.

Install

From PyPI

pip install gssr

From GitHub Source

pip install git+https://github.com/eth-cscs/GPU-saturation-scorer.git

To install from a specific branch, e.g. the development branch:

pip install git+https://github.com/eth-cscs/GPU-saturation-scorer.git@dev

To install a specific release from a tag, e.g. v0.4.0:

pip install git+https://github.com/eth-cscs/GPU-saturation-scorer.git@v0.4.0

Profile

Example

If you are submitting a batch job and the command you are executing is:

srun python test.py

The srun command should be modified as follows:

srun gssr profile python test.py
  • The gssr subcommand to run is "profile".
  • The default output directory is "profile_out_{job_id}".
  • You can optionally attach a label to the output data with the "-l" flag.

Output directory

If you need to write the output to a specific directory, use the "-o" flag:

srun gssr profile -o /abc/def python test.py
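If job commands are generated from a script, the wrapping described above can be sketched in Python. This is only an illustration: the helper name `gssr_profile_command` and its parameters are hypothetical and not part of gssr. It assumes "-l" takes a single value and is placed before the profiled command, as "-o" is in the example above.

```python
import shlex


def gssr_profile_command(program, output_dir=None, label=None, srun_args=()):
    """Build an srun argument list that runs `program` under `gssr profile`.

    Hypothetical helper, not part of gssr. `program` is the command to
    profile as a list of arguments, e.g. ["python", "test.py"].
    """
    cmd = ["srun", *srun_args, "gssr", "profile"]
    if output_dir is not None:
        cmd += ["-o", output_dir]  # "-o" sets the output directory
    if label is not None:
        cmd += ["-l", label]  # assumption: "-l" takes one value, placed like "-o"
    cmd += list(program)
    return cmd


# Render the argument list as a shell-quoted command line.
print(shlex.join(gssr_profile_command(["python", "test.py"], output_dir="/abc/def")))
# prints: srun gssr profile -o /abc/def python test.py
```

Returning a list rather than a string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.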

Analyze

Metric Output

The profiled output can be analyzed as follows:

gssr analyze -i ./profile_out

PDF File Output with Plots

gssr analyze -i ./profile_out --report

PDF report(s) will be generated containing time-series and load-balancing plots.

PDF File Output with Heatmap Plots

gssr analyze -i ./profile_out --report -hm

Generating heatmaps is very time-consuming, so enable this option only when you need it.

Exporting the Profiled Output as a SQLite3 file

gssr analyze -i ./profile_out --export data.sqlite3
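The exported file can be opened with any SQLite client. As a minimal sketch, the Python snippet below lists the tables in the export and their row counts using only the standard library. The table layout of the export is not described here, so the code makes no assumption about table or column names; the function name `summarize_export` is hypothetical, and `data.sqlite3` is the file name from the command above.

```python
import os
import sqlite3


def summarize_export(path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()


# Only inspect the file if it exists; sqlite3.connect would otherwise create it.
if os.path.exists("data.sqlite3"):
    for table, rows in summarize_export("data.sqlite3").items():
        print(f"{table}: {rows} rows")
```

Once the table names are known, the data can be queried with ordinary SQL.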

More Options

gssr --help