
Commit 662a722

pevnak and janfrancu authored and committed
working on lecture 10
1 parent a863787 commit 662a722

File tree

2 files changed: +74 -12 lines changed


docs/src/lecture_10/juliaset_p.jl

Lines changed: 3 additions & 3 deletions
@@ -21,7 +21,7 @@ function juliaset_column!(img, c, n, colj, j)
     nothing
 end
 
-function juliaset_range(c, n, columns)
+function juliaset_columns(c, n, columns)
     img = Array{UInt8,2}(undef, n, length(columns))
     for (colj, j) in enumerate(columns)
         juliaset_column!(img, c, n, colj, j)
@@ -32,7 +32,7 @@ end
 function juliaset_distributed(x, y, partitions = nworkers(), n = 1000)
     c = x + y*im
     columns = Iterators.partition(1:n, div(n, partitions))
-    slices = pmap(cols -> juliaset_range(c, n, cols), columns)
+    slices = pmap(cols -> juliaset_columns(c, n, cols), columns)
     reduce(hcat, slices)
 end
 
@@ -70,4 +70,4 @@ function juliaset_shared(x, y, partitions = nworkers(), n = 1000)
 end
 
 juliaset_shared(-0.79, 0.15)
-
+juliaset_shared(-0.79, 0.15, 16)
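For orientation, `juliaset_distributed` above splits the image columns into `partitions` blocks, computes each block on a worker via `pmap`, and stitches the slices together with `reduce(hcat, ...)`. A minimal sketch of running it, assuming the file is loaded on all workers (e.g. by starting Julia as `julia -p 4 -L juliaset_p.jl`; the worker count is illustrative):

```julia
using Distributed

img = juliaset_distributed(-0.79, 0.15)   # one column block per worker, computed by pmap
size(img)                                 # (1000, 1000); entries are UInt8 iteration counts
```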

docs/src/lecture_10/lecture.md

Lines changed: 71 additions & 9 deletions
@@ -8,7 +8,7 @@ Julia offers different levels of parallel programming
 In this lecture, we will focus mainly on the first two, since SIMD instructions are mainly used for low-level optimization (such as writing your own very performant BLAS library), and task switching is not true parallelism, but allows running a different task while another task is waiting, for example, for IO.
 
 ## Process-level parallelism
-Process-level paralelism means that Julia runs several compilers in different processes. Compilers *do not share anything by default*, where by anything we mean no libraries, not variables. Everyhing has to be set-up, but Julia offers tooling for remote execution and communication primitives.
+Process-level parallelism means that Julia runs several compilers in different processes. By default, processes *do not share anything*: no libraries and no variables. Everything therefore has to be set up on all processes.
 
 Julia off-the-shelf supports a mode where a single *main* process controls several workers. The main process has `myid() == 1`; worker processes receive higher numbers. Julia can be started with multiple workers from the very beginning using the `-p` switch, as
 ```julia
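A minimal sketch of the two usual ways to obtain workers, assuming only the standard `Distributed` library (the worker counts are arbitrary):

```julia
# from the shell: start Julia with four worker processes
# julia -p 4

# or add workers from a running session
using Distributed
addprocs(4)
nworkers()   # 4
workers()    # e.g. [2, 3, 4, 5]; the main process keeps myid() == 1
```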
@@ -47,7 +47,7 @@ end
 ```
 Alternatively, we can put the code into a separate file and load it on all workers using `-L filename.jl`.
 
-A real benefit from multi-processing is when we can *schedulle* an execution of a function and return the control immeadiately to do something else. A low-level function providing this functionality is `remotecall(fun, worker_id, args...)`. For example
+Julia's multi-processing model is based on the message-passing paradigm, but the abstraction is closer to remote procedure calls. This saves users from prepending messages with headers and from implementing the logic that decides which function should be called for which header. Instead, we can *schedule* the execution of a function on a remote worker and return control immediately to continue our own work. A low-level function providing this functionality is `remotecall(fun, worker_id, args...)`. For example
 ```julia
 @everywhere begin
     function delayed_foo(x, y, n)
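The rest of the `@everywhere` block lies outside this hunk. Purely as a hypothetical sketch consistent with the call `remotecall(delayed_foo, 2, 1, 1, 60)` and with `fetch(r) == foo(1, 1)` below (the body and the helper `foo` are assumptions, not the author's code), it could read:

```julia
@everywhere begin
    foo(x, y) = x + y            # hypothetical stand-in for the lecture's foo
    function delayed_foo(x, y, n)
        sleep(n)                 # pretend the call is busy for n seconds
        foo(x, y)
    end
end
```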
@@ -57,10 +57,16 @@ A real benefit from multi-processing is when we can *schedulle* an execution of
 end
 r = remotecall(delayed_foo, 2, 1, 1, 60)
 ```
-which terminates immediately. `r` does not contain result of `foo(1, 1)`, but a struct `Future`. The `Future` can be seen as a *handle* allowing to retrieve the result later, using `fetch`, which either fetches the result or wait until the result is available.
+returns immediately, even though the function will take at least 60 seconds. `r` does not contain the result of `foo(1, 1)`, but a struct `Future`, which is a *remote reference* in Julia's terminology. It points to data located on some machine, indicates whether they are available, and allows us to `fetch` them from the remote worker. `fetch` is blocking, which means that execution is blocked until the data are available (if they never become available, the process can wait forever). The presence of data can be checked with `isready`, which in the case of a `Future` returned by `remotecall` indicates that the computation has finished.
 ```julia
+isready(r)
 fetch(r) == foo(1, 1)
 ```
+An advantage of a remote reference is that it can be freely shared between processes and the result can be retrieved on a different node than the one which issued the call.
+```julia
+r = remotecall(delayed_foo, 2, 1, 1, 60)
+remotecall(r -> println("value: ", fetch(r), " retrieved on ", myid()), 3, r)
+```
 An interesting feature of `fetch` is that it re-throws an exception raised on a different process.
 ```julia
 @everywhere begin
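The definition of `exfoo` falls between the hunks; a placeholder consistent with the surrounding text (the exact body is an assumption) only needs to throw:

```julia
@everywhere exfoo() = error("raised on worker ", myid())   # hypothetical body
```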
@@ -71,9 +77,18 @@ end
 r = @spawnat 2 exfoo()
 ```
 where `@spawnat` is an alternative to `remotecall`, which executes a closure around the expression (in this case `exfoo()`) on a specified worker (in this case 2). Fetching the result `r` throws an exception on the main process.
-```jula
+```julia
+fetch(r)
+```
+`@spawnat` can be called with `:any` to signal that the user does not care where the function will be executed; the choice is left up to Julia.
+```julia
+r = @spawnat :any foo(1,1)
 fetch(r)
 ```
+Finally, if for some reason you need to wait for the computed value, you can use `remotecall_fetch`, which combines the call and the `fetch` into one step.
+```julia
+remotecall_fetch(foo, 2, 1, 1)
+```
 
 ## Example: Julia sets
 Our example for explaining the mechanisms of distributed computing will be the computation of a Julia set fractal. The computation of the fractal can be easily parallelized, since the value of each pixel is independent of the others. The example is adapted from [Eric Aubanel](http://www.cs.unb.ca/~aubanel/JuliaMultithreadingNotes.html).
@@ -256,10 +271,57 @@ end
 By default, Julia does not provide any facility to kill a remote execution except sending `ctrl-c` to the remote worker as `interrupt(pids::Integer...)`.
 
 ## Sending data
-- Do not send `randn(1000, 1000)`
-- Sending references and ObjectID would not work
-- Serialization is very time consuming, an efficient converstion to something simple might be wort
-- Dict("a" => [1,2,3], "b" = [2,3,4,5]) -> (Array of elements, array of bounds, keys)
+Sending parameters to and receiving results from remotely called functions might incur a significant cost.
+1. Try to minimize data movement as much as possible. A prototypical example is
+```julia
+A = rand(1000,1000);
+Bref = @spawnat :any A^2;
+```
+and
+```julia
+Bref = @spawnat :any rand(1000,1000)^2;
+```
+In the first case the matrix is constructed on the main process and shipped to the worker; in the second it is generated directly on the worker, so nothing large is sent.
+2. It is not only the volume of data (in terms of the number of bytes) that matters, but also the complexity of the objects being sent. Serialization can be very time consuming, so an efficient conversion to something simple might be worth it.
+```julia
+using BenchmarkTools
+@everywhere begin
+    using Random
+    v = [randstring(rand(1:20)) for i in 1:1000];
+    p = [i => v[i] for i in 1:1000]
+    d = Dict(p)
+
+    send_vec() = v
+    send_dict() = d
+    send_pairs() = p
+    custom_serialization() = (length.(v), join(v, ""))
+end
+
+@btime remotecall_fetch(send_vec, 2);
+@btime remotecall_fetch(send_dict, 2);
+@btime remotecall_fetch(send_pairs, 2);
+@btime remotecall_fetch(custom_serialization, 2);
+```
+3. Some types of objects cannot be properly serialized and deserialized.
+```julia
+a = IdDict(
+    :a => rand(1,1),
+)
+b = remotecall_fetch(identity, 2, a)
+a[:a] === a[:a]    # true
+a[:a] === b[:a]    # false, the value was copied during (de)serialization
+```
+4. If you need to send data to a worker, i.e. you want to define (or overwrite) a global variable there,
+```julia
+@everywhere begin
+    g = rand()
+    show_secret() = println("secret of ", myid(), " is ", g)
+end
+@everywhere show_secret()
+
+remotecall_fetch(g -> eval(:(g = $(g))), 2, g)
+@everywhere show_secret()
+```
+which is implemented in the
 
 ## Practical advice
 Recall that (i) workers are started as clean processes and (ii) they might not share the same environment as the main process. The latter is because the files describing the environment (`Project.toml` and `Manifest.toml`) might not be available on remote machines.
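A sketch of one common way to deal with this, assuming the project files are reachable from the workers (the paths and worker count are illustrative):

```julia
using Distributed
# start local workers with the same project environment as the main process
addprocs(4; exeflags = "--project=$(Base.active_project())")

# or activate the environment explicitly on every worker
@everywhere begin
    using Pkg
    Pkg.activate(".")      # assumes the project folder sits at the same path on each machine
    Pkg.instantiate()
end
```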
@@ -298,7 +360,7 @@ A complete example can be seen in [`juliaset_p.jl`](juliaset_p.jl).
 ## Julia sets
 An example adapted from [Eric Aubanel](http://www.cs.unb.ca/~aubanel/JuliaMultithreadingNotes.html).
 
-For ilustration, we will use Julia set fractals, ad they can be easily paralelized. Some fractals (Julia set, Mandelbrot) are determined by properties of some complex-valued functions. Julia set counts, how many iteration is required for ``f(z)=z^2+c`` to be bigger than two in absolute value, ``|f(z)>=2``. The number of iterations can then be mapped to the pixel's color, which creates a nice visualization we know.
+For illustration, we will use Julia set fractals, as they can be easily parallelized. Some fractals (Julia set, Mandelbrot set) are determined by properties of complex-valued functions. The Julia set counts how many iterations are required for ``f(z) = z^2 + c`` to exceed two in absolute value, ``|f(z)| \geq 2``. The number of iterations can then be mapped to the pixel's color, which creates the familiar visualization.
 ```julia
 function juliaset_pixel(z₀, c)
     z = z₀
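The definition of `juliaset_pixel` continues beyond the last hunk; a hedged sketch of how such an escape-time pixel function typically continues (the iteration cap of 255 and the return convention are assumptions, not the file's actual code):

```julia
function juliaset_pixel_sketch(z₀, c)
    z = z₀
    for i in 1:255                       # cap the count so it fits into a UInt8
        abs2(z) ≥ 4 && return UInt8(i)   # |z| ≥ 2 is equivalent to abs2(z) ≥ 4
        z = z^2 + c
    end
    UInt8(255)                           # treated as "did not escape"
end

juliaset_pixel_sketch(0.0 + 0.0im, -0.79 + 0.15im)
```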
