# Usage

MPI is based on a [single program, multiple data (SPMD)](https://en.wikipedia.org/wiki/SPMD) model, where multiple processes are launched running independent programs, which then communicate as necessary via messages.

To run a Julia script with MPI, the script should include `using MPI` (or `import MPI`) and a call to [`MPI.Init()`](@ref) at the top. For example, [`examples/01-hello.jl`](https://github.com/JuliaParallel/MPI.jl/blob/master/examples/01-hello.jl):

```julia
# examples/01-hello.jl
using MPI
MPI.Init()

comm = MPI.COMM_WORLD
println("Hello world, I am $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm))")
MPI.Barrier(comm)
```

The program can then be launched via an MPI launch command (typically `mpiexec`, `mpirun` or `srun`), e.g.

```
$ mpiexec -n 3 julia --project examples/01-hello.jl
Hello world, I am rank 0 of 3
Hello world, I am rank 2 of 3
Hello world, I am rank 1 of 3
```
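
To illustrate the message-passing part of the SPMD model, here is a minimal point-to-point sketch. It is not one of the shipped examples, and the positional `MPI.Send`/`MPI.Recv!` signatures shown here may vary between MPI.jl versions:

```julia
# sketch: send an array from rank 0 to rank 1 (assumes at least 2 ranks are launched)
using MPI
MPI.Init()

comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

if rank == 0
    buf = Float64[1.0, 2.0, 3.0]
    MPI.Send(buf, 1, 0, comm)        # send to rank 1, tag 0
elseif rank == 1
    buf = Array{Float64}(undef, 3)
    MPI.Recv!(buf, 0, 0, comm)       # receive from rank 0, tag 0
    println("rank 1 received $buf")
end

MPI.Barrier(comm)
```

It is launched the same way as the hello-world example above.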

## MPI and Julia parallel constructs together

To make MPI calls from a Julia cluster, use `MPIManager`, a cluster manager that starts the Julia workers using `mpirun`.

It has three modes of operation:

- Only worker processes execute MPI code. The Julia master process runs outside of, and is not part of, the MPI cluster. Free bi-directional TCP/IP connectivity is required between all processes.

- All processes (including the Julia master) are part of both the MPI and the Julia cluster. Free bi-directional TCP/IP connectivity is required between all processes.

- All processes are part of both the MPI and the Julia cluster, and MPI is used as the transport for Julia messages. This is useful in environments which do not allow TCP/IP connectivity between worker processes.

### MPIManager: only workers execute MPI code

An example is provided in [`examples/05-juliacman.jl`](https://github.com/JuliaParallel/MPI.jl/blob/master/examples/05-juliacman.jl).
The Julia master process is *not* part of the MPI cluster. The main script should be launched directly; `MPIManager` internally calls `mpirun` to launch the Julia/MPI workers.
All workers started via `MPIManager` will be part of the MPI cluster.

```julia
MPIManager(;np=Sys.CPU_THREADS, mpi_cmd=false, launch_timeout=60.0)
```

If not specified, `mpi_cmd` defaults to `mpirun -np $np`.
`stdout` from the launched workers is redirected back to the Julia session calling `addprocs` via a TCP connection, so the workers must be able to freely connect via TCP to the host session.
The following lines are typically required on the Julia master process to support both Julia and MPI:

```julia
# to import MPIManager
using MPI

# also import Distributed to use addprocs()
using Distributed

# specify the number of MPI workers, launch command, etc.
manager = MPIManager(np=4)

# start the MPI workers and add them as Julia workers too
addprocs(manager)
```

To execute code with MPI calls on all workers, use `@mpi_do`.

`@mpi_do manager expr` executes `expr` on all processes that are part of `manager`.

For example:
```julia
@mpi_do manager begin
    comm = MPI.COMM_WORLD
    println("Hello world, I am $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm))")
end
```
This executes only on the MPI workers belonging to `manager`.

[`examples/05-juliacman.jl`](https://github.com/JuliaParallel/MPI.jl/blob/master/examples/05-juliacman.jl) is a simple example of calling MPI functions on all workers, interspersed with Julia parallel methods.

This should be run _without_ `mpirun`:
```
julia 05-juliacman.jl
```
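
Because the MPI workers are also ordinary Julia workers, MPI calls can be interleaved with the usual `Distributed` constructs. A minimal sketch along those lines (this is not the contents of `05-juliacman.jl`; it assumes the `manager` created above):

```julia
# assumes: manager = MPIManager(np=4); addprocs(manager)
using Distributed

# ordinary Distributed constructs run on the same workers ...
squares = pmap(x -> x^2, 1:8)
println("pmap results: ", squares)

# ... interspersed with MPI calls executed via @mpi_do
@mpi_do manager begin
    comm = MPI.COMM_WORLD
    MPI.Barrier(comm)
    println("rank $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm)) passed the barrier")
end
```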

A single instantiation of `MPIManager` can be used only once to launch MPI workers (via `addprocs`).
To create multiple sets of MPI clusters, use separate, distinct `MPIManager` objects.

`procs(manager::MPIManager)` returns a list of Julia pids belonging to `manager`, and
`mpiprocs(manager::MPIManager)` returns a list of MPI ranks belonging to `manager`.

The fields `j2mpi` and `mpi2j` of `MPIManager` are associative collections mapping Julia pids to MPI ranks and vice versa.
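
For example, the mappings might be inspected from the master process as follows (a sketch assuming the `manager` created above and `Dict`-like indexing of `j2mpi`):

```julia
# assumes: manager = MPIManager(np=4); addprocs(manager)
julia_pids = procs(manager)      # Julia pids of the MPI workers, e.g. [2, 3, 4, 5]
mpi_ranks  = mpiprocs(manager)   # their MPI ranks, e.g. [0, 1, 2, 3]

# j2mpi maps Julia pid => MPI rank; mpi2j is the inverse mapping
for pid in julia_pids
    println("Julia pid $pid is MPI rank $(manager.j2mpi[pid])")
end
```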

### MPIManager: TCP/IP transport - all processes execute MPI code

This mode is useful in environments which do not allow TCP connections from outside the cluster.

An example is in [`examples/06-cman-transport.jl`](https://github.com/JuliaParallel/MPI.jl/blob/master/examples/06-cman-transport.jl):
```
mpirun -np 5 julia 06-cman-transport.jl TCP
```

This launches a total of 5 processes: MPI rank 0 is Julia pid 1, MPI rank 1 is Julia pid 2, and so on.

The program must call `MPI.start(TCP_TRANSPORT_ALL)`.
On MPI rank 0 it returns a `manager` which can be used with `@mpi_do`;
on the other processes (i.e. the workers) the function does not return.
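
A minimal sketch of what such a script might look like (this is not the full `06-cman-transport.jl`; the printed message and the shutdown step are placeholders):

```julia
using MPI

# On MPI rank 0 this returns an MPIManager; on every other rank it enters a
# message-processing loop and does not return.
manager = MPI.start(TCP_TRANSPORT_ALL)

# Only rank 0 reaches this point, and drives the workers from here.
@mpi_do manager begin
    comm = MPI.COMM_WORLD
    println("rank $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm)) is up")
end

# shut down the workers as done in the full example script
```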

### MPIManager: MPI transport - all processes execute MPI code

`MPI.start` must be called with the option `MPI_TRANSPORT_ALL` to use MPI as the transport:
```
mpirun -np 5 julia 06-cman-transport.jl MPI
```
will run the example using MPI as transport.
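
Since `06-cman-transport.jl` is invoked above with either `TCP` or `MPI` as its argument, a script supporting both transports might pick the mode from `ARGS`; the argument handling below is an assumption for illustration, not the example's exact code:

```julia
using MPI

# choose the transport from the script's first argument ("TCP" or "MPI")
transport = (!isempty(ARGS) && ARGS[1] == "MPI") ? MPI_TRANSPORT_ALL : TCP_TRANSPORT_ALL

# as in the TCP/IP case, MPI.start returns a manager on MPI rank 0 only
manager = MPI.start(transport)
```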

## Finalizers