docs/src/lecture_10/lecture.md
55 additions & 10 deletions
@@ -7,6 +7,8 @@ Julia offers different levels of parallel programming
In this lecture, we will focus mainly on the first two, since SIMD instructions are mainly used for low-level optimization (such as writing your own very performant BLAS library), and task switching is not true parallelism, but allows a different task to run while one task is waiting, for example, for IO.

+**The most important lesson is that before you jump into parallelism, make sure your code is fast sequentially.**
+
## Process-level parallelism
Process-level parallelism means that Julia runs several compilers in different processes. Different processes *do not share anything by default*, meaning neither libraries nor variables. Everything therefore has to be set up on all processes.
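
The point that nothing is shared by default can be illustrated with a minimal sketch using the `Distributed` standard library. The names `secret` and `increment` are hypothetical and chosen only for this illustration; they do not come from the lecture.

```julia
using Distributed

addprocs(2)                        # start two clean worker processes

secret = 42                        # hypothetical variable, defined only on the main process
@everywhere increment(a) = a + 1   # hypothetical function, defined explicitly on every process

w = first(workers())

# Works, because `increment` was defined on the worker by `@everywhere`.
result = remotecall_fetch(increment, w, 1)

# Ask the worker whether its own `Main` module has a binding called `secret`.
seen_on_worker = remotecall_fetch(isdefined, w, Main, :secret)
```

Under these assumptions, the worker should report that `secret` is not defined, since plain assignments on the main process are not propagated to the workers.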
@@ -321,11 +323,10 @@ end
remotecall_fetch(g -> eval(:(g = $(g))), 2, g)
@everywhere show_secret()
```
-which is implemented in the
+which is implemented in `ParallelDataTransfer.jl` along with other variants, but in general, this construct should be avoided.

## Practical advice
-Recall that (i) workers are started as clean processes and (ii) they might not share the same environment with the main process. The latter is due to the fact that files describing the environment (`Project.toml` and `Manifest.toml`) might not be available on remote machines.
-We recommend:
+Recall that (i) workers are started as clean processes and (ii) they might not share the same environment with the main process. The latter is because remote machines may have a different directory structure. Our best practices are:
- have a shared directory (shared home) with the code, and share the location of packages
- place all code for workers in one file, let's call it `worker.jl` (the author also includes the code for the master there)
- put at the beginning of `worker.jl` code activating the specified environment as
@@ -350,13 +351,6 @@ where `main()` is the function defined in `worker.jl` to be executed on the main
A complete example can be seen in [`juliaset_p.jl`](juliaset_p.jl).
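
For orientation, the layout described above might look roughly like the following sketch. It is not the content of `juliaset_p.jl`. It assumes that a `Project.toml` sits next to `worker.jl`, and the function `square` and the body of `main()` are hypothetical placeholders.

```julia
# worker.jl (hypothetical layout, not the lecture's actual file)

using Pkg
Pkg.activate(@__DIR__)    # assumption: the environment lives next to this file

using Distributed

# Code needed on the workers (placeholder computation).
square(x) = x^2

# Code intended for the main process only; it distributes work with `pmap`.
function main()
    return pmap(square, 1:4)
end
```

With this layout, the main process could load the file everywhere with `@everywhere include("worker.jl")` and then call `main()`. If no workers have been added, `pmap` simply runs on the main process.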
-
-## Multi-Threading
-- Locks / lock-free multi-threading
-- Show the effect of different schedulers
-- intra-model parallelism
-- sucks when operating with Heap
-
## Julia sets
An example adapted from [Eric Aubanel](http://www.cs.unb.ca/~aubanel/JuliaMultithreadingNotes.html).
Keep in mind that while threads are very easy and convenient to use, there are use cases where you might be better off with processes, even though there will be some communication overhead. One such case arises when you need to allocate and free a lot of memory. This is because Julia's garbage collector is single-threaded. Imagine the task of making a histogram of bytes in a directory.
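
To make the task concrete, a sequential baseline could be sketched as follows. The function names `histogram` and `histogram_dir` are hypothetical and not necessarily those used in the lecture. Each file gets a freshly allocated `Dict`, which is the kind of allocation pressure discussed above.

```julia
# Count byte occurrences in a single file; allocates a fresh Dict per file.
function histogram(path::AbstractString)
    h = Dict{UInt8,Int}()
    for b in read(path)               # `read(path)` returns a Vector{UInt8}
        h[b] = get(h, b, 0) + 1
    end
    return h
end

# Sequential baseline: walk the directory tree and merge the
# per-file histograms by summing the counts of equal bytes.
function histogram_dir(dir::AbstractString)
    files = [joinpath(root, f) for (root, _, fs) in walkdir(dir) for f in fs]
    return mapreduce(histogram, mergewith(+), files; init = Dict{UInt8,Int}())
end
```

A parallel variant would replace the `mapreduce` call with a threaded or distributed reduction while keeping `histogram` unchanged.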
+For a fair comparison, we will use `Transducers`, since they offer both thread- and process-based parallelism.
we see that threading is actually worse than process-based parallelism, despite us paying the price for serialization and deserialization of `Dict`. Needless to say, changing `Dict` to `Vector` as
When deciding what kind of parallelism to employ, consider the following:
- for tightly coupled computation over shared data, multi-threading is more suitable, since processes do not share data
- but if the computation requires frequent allocation and freeing of memory, or IO, separate processes are more suitable, since garbage collectors are independent between processes
+- Making all cores busy while achieving an ideally linear speedup is difficult and needs a lot of experience and knowledge. Tooling and profilers supporting debugging of parallel processes are not well developed.
- `Transducers` strives for (almost) the same code to support thread- and process-based parallelism.
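
As an illustration of the first point, the sketch below shows a computation over shared data with threads, using only `Base.Threads`. The function `threaded_sum_squares` and its keyword `ntasks` are hypothetical names introduced for this example. Each task reads a view of the same array, so no data is copied or serialized.

```julia
using Base.Threads: nthreads, @spawn

# Sum of squares over a shared array, one task per chunk.
# The chunks are views into `xs`, so all tasks read the same memory.
function threaded_sum_squares(xs::AbstractVector{<:Real}; ntasks::Int = nthreads())
    isempty(xs) && return zero(eltype(xs))
    chunks = Iterators.partition(xs, cld(length(xs), ntasks))
    tasks = [@spawn sum(abs2, chunk) for chunk in chunks]
    return sum(fetch, tasks)          # combine the partial results
end
```

The work per element allocates nothing here, which is the situation where threads tend to do well. Julia has to be started with several threads (for example `julia -t 4`) for the tasks to actually run in parallel.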