Commit 59d9679

Merge pull request #25 from gjbex/development

2 parents: 5771679 + 5664258

11 files changed: +86 additions, −864 deletions

docs/README.md
Lines changed: 15 additions & 21 deletions

@@ -21,20 +21,21 @@ When you complete this training you will
 
 ## Schedule
 
-Total duration: 8 hours.
-
-| Subject | Duration |
-|---------------------------------------------|-----------|
-| introduction and motivation | 5 min. |
-| performance and profiling | 25 min. |
-| libraries | 10 min. |
-| Cython | 90 min. |
-| interfacing with C/C++/Fortran | 60 min. |
-| multi-threaded programming | 60 min. |
-| MPI | 120 min. |
-| dask | 30 min. |
-| pyspark | 20 min. |
-| wrap up | 10 min. |
+Total duration: 4 hours.
+
+| Subject | Duration |
+|---------------------------------------------|----------|
+| introduction and motivation | 5 min. |
+| performance and profiling | 25 min. |
+| libraries | 10 min. |
+| Cython | 60 min. |
+| coffee break | 10 min. |
+| interfacing with C/C++/Fortran | 30 min. |
+| multi-threaded programming | 10 min. |
+| MPI | 45 min. |
+| dask | 15 min. |
+| pyspark | 20 min. |
+| wrap up | 10 min. |
 
 
 ## Training materials

@@ -65,13 +66,6 @@ If you plan to do Python programming in a Linux or HPC environment you should
 be familiar with these as well.
 
 
-## Level
-
-* Introductory: 10 %
-* Intermeidate: 30 %
-* Advanced: 60 %
-
-
 ## Trainer(s)
 
 * Geert Jan Bex ([geertjan.bex@uhasselt.be](mailto:geertjan.bex@uhasselt.be))

python_for_hpc.pptx
23.4 KB (binary file not shown)

source-code/dask/README.md
Lines changed: 3 additions & 0 deletions

@@ -10,6 +10,7 @@ CSV or HDF5 files.
 * `create_csv_data.py`: non-Dask script to generate a large CSV data set for
   experimenting with Dask.
 * `create_csv_data.pbs`: PBS script to run `create_csv_data.py`.
+* `create_csv_data.slurm`: Slurm script to run `create_csv_data.py`.
 * `dask_avg_csv.py`: Dask computation of the average value of columns in
   a large number of CSV files.
 * `dask_avg_csv.pbs`: PBS script to run `dask_avg_csv.py`.

@@ -27,6 +28,8 @@ CSV or HDF5 files.
   futures in a distributed setting.
 * `dask_distr_test.pbs`: PBS script that will launch a scheduler, workers,
   and run the `dask_distr_test.py` script.
+* `dask_distr_test.slurm`: Slurm script that will launch a scheduler, workers,
+  and run the `dask_distr_test.py` script.
 * `dask_sum_aarays.py`: somewhat artificial example of a Dask computation
   on `numpy` arrays.
 * `dask_sum_aarays.pbs`: PBS script to execute `dask_sum_aarays.py`.
source-code/dask/create_csv_data.slurm (new file; name inferred from the README entry above)
Lines changed: 13 additions & 0 deletions

@@ -0,0 +1,13 @@
+#!/usr/bin/env -S bash -l
+#SBATCH --account=lpt2_sysadmin
+#SBATCH --nodes=1
+#SBATCH --cpus-per-task=8
+#SBATCH --time=02:00:00
+
+source .mamba_init.sh
+mamba activate python_for_hpc
+
+DATA_DIR=$VSC_SCRATCH/data/time_series
+mkdir -p $DATA_DIR
+
+./create_csv_data.py --files 800 --rows 200000 --cols 100 $DATA_DIR
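The generator script invoked on the last line is not shown in this diff; a hypothetical stand-in matching the `--files`, `--rows`, and `--cols` flags could look like this (the function name and layout are assumptions, not the actual `create_csv_data.py`):

```python
import csv
import pathlib
import random


def create_csv_data(files, rows, cols, data_dir):
    """Write `files` CSV files of `rows` x `cols` random values into data_dir.

    Hypothetical sketch of what create_csv_data.py might do; the real script
    in the repository may differ.
    """
    data_dir = pathlib.Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    for i in range(files):
        with open(data_dir / f'data_{i:04d}.csv', 'w', newline='') as csv_file:
            writer = csv.writer(csv_file)
            # header row followed by `rows` rows of random floats
            writer.writerow([f'col_{j}' for j in range(cols)])
            for _ in range(rows):
                writer.writerow([random.random() for _ in range(cols)])
```

With the flags from the Slurm script above this would produce 800 files of 200,000 rows by 100 columns, enough to make Dask's partitioned reading worthwhile.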
source-code/dask/dask_distr_test.slurm (new file; name inferred from the README entry above)
Lines changed: 46 additions & 0 deletions

@@ -0,0 +1,46 @@
+#!/usr/bin/env -S bash -l
+#SBATCH --account=lpt2_sysadmin
+#SBATCH --cluster=wice
+#SBATCH --time=00:10:00
+#SBATCH --ntasks=1 --cpus-per-task=1
+#SBATCH --mem=1G
+#SBATCH hetjob
+#SBATCH --ntasks=4 --cpus-per-task=8
+
+# file name to store scheduler information for workers and client
+scheduler_file="$(pwd)/scheduler_${SLURM_JOB_ID}.json"
+
+# activate environment that has dask installed
+mamba activate python_for_hpc
+
+# launch dask server process
+echo "launching dask-server"
+srun --exclusive \
+    --het-group=0 \
+    --ntasks=$SLURM_NTASKS_HET_GROUP_0 \
+    --cpus-per-task=$SLURM_CPUS_PER_TASK_HET_GROUP_0 \
+    --mem=$SLURM_MEM_PER_NODE_PACK_GROUP_0 \
+    dask scheduler --scheduler-file $scheduler_file &
+
+# give server time to start
+sleep 5
+
+# launch dask worker processes
+for i in $(seq $SLURM_NTASKS_HET_GROUP_1)
+do
+    echo "launching dask-worker $i"
+    srun --exclusive \
+        --het-group=1 \
+        --ntasks=1 \
+        --cpus-per-task=$SLURM_CPUS_PER_TASK_HET_GROUP_1 \
+        --mem=$SLURM_MEM_PER_NODE_PACK_GROUP_1 \
+        dask worker --scheduler-file $scheduler_file &
+done
+
+# give workers time to start
+sleep 20
+
+# start the client process
+python dask_distr_test.py \
+    --scheduler-file $scheduler_file \
+    --verbose
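`dask_distr_test.py` is not shown in this diff; a minimal client along these lines is what the job script above serves. In the real script the client would attach via `Client(scheduler_file=...)`; here a threaded `LocalCluster` stands in for the scheduler and workers so the sketch is self-contained (the `square` workload is made up):

```python
from dask.distributed import Client, LocalCluster


def square(x):
    return x * x


# In the Slurm job the client attaches to the running scheduler with
# Client(scheduler_file=...); a LocalCluster stands in for it here.
cluster = LocalCluster(n_workers=2, processes=False)
client = Client(cluster)

# Scatter work over the workers and gather a single result;
# submit() resolves the futures passed inside the argument list.
futures = client.map(square, range(10))
total = client.submit(sum, futures).result()
print(total)  # sum of squares 0..9

client.close()
cluster.close()
```

The scheduler file written by `dask scheduler` contains the connection details, which is why both the workers and the client in the job script only need that one path.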
Lines changed: 3 additions & 7 deletions

@@ -1,10 +1,6 @@
-#!/bin/bash
+#!/usr/bin/env -S bash -l
 
-source "${VSC_DATA}/miniconda3/setenv.sh"
-source activate science 2> /dev/null
-if [ $? -ne 0 ]
-then
-    (>&2 echo '### error: conda environment not sourced correctly' )
-fi
+source ~/.mamba_init.sh
+mamba activate python_for_hpc
 
 nohup dask-scheduler &> "scheduler-${PBS_JOBID}.log" &

source-code/interfacing-c-c++-fortran/Pybind11/README.md
Lines changed: 1 addition & 1 deletion

@@ -5,7 +5,7 @@ pybind11 is a wrapper generator for C++ code that has a lot of nice features.
 ## What is it?
 
 1. `Simple`: very simple illustration of wrapping C++ functions.
-1. `Spectrum`: illustration of warpping to support the buffer protocol.
+1. `Spectrum`: illustration of wrapping to support the buffer protocol.
 1. `Stats`: illustration of wrapping a C++ class.
 1. `Convolution`: illustration of using the buffer protocol.
 1. `environment.yml`: conda environment specification for this directory.

source-code/ising/.gitignore
Lines changed: 4 additions & 0 deletions

@@ -1,3 +1,7 @@
 *.pyc
 _ising_cxx.so
 ising_cxx.py
+
+result-domains.txt
+result-magn.txt
+
source-code/ising/src/.gitignore
Lines changed: 1 addition & 0 deletions

@@ -1 +1,2 @@
 ising_cxx_wrap.cxx
+*.o

source-code/profiling/README.md
Lines changed: 0 additions & 2 deletions

@@ -22,5 +22,3 @@ functions use memory.
 1. `run_memory_prof.sh`: Bash shell script to create a memory profile.
    Note that this generates a lot of overhead in terms of CPU time.
 1. `cellular_automata.py`: example code to illustrate snakeviz.
-1. `microbenchmarking.ipynb`: some pitfalls when microbenchmarking
-   Python code.
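The profiling scripts themselves are not shown in this diff; the kind of CPU profile that snakeviz visualizes is standard `cProfile` output, which can be produced along these lines (the recursive `fib` is just a made-up workload):

```python
import cProfile
import io
import pstats


def fib(n):
    # deliberately slow recursive Fibonacci as a profiling target
    return n if n < 2 else fib(n - 1) + fib(n - 2)


profiler = cProfile.Profile()
profiler.enable()
result = fib(20)
profiler.disable()

# Print the five most expensive calls by cumulative time; snakeviz renders
# the same statistics graphically from a file written with
# profiler.dump_stats('profile.out').
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

Running `snakeviz profile.out` on a dumped stats file then gives the interactive icicle view that `cellular_automata.py` is meant to illustrate.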

0 commit comments