`docs/src/lecture_01/lab.md`
There are other options for importing a function/macro from a different package (two alternatives are sketched after the benchmark below); for now, let's keep it simple with the `using Module` syntax, which brings into the REPL all the variables/functions/macros exported by the `BenchmarkTools` package. Since `@btime` is exported, it can be accessed without qualification, i.e. by calling `@btime` instead of `BenchmarkTools.@btime`. More on the architecture of package/module loading in the package development lecture.
```julia
julia> using BenchmarkTools

julia> @btime polynomial(aexp, x)
  97.119 ns (1 allocation: 16 bytes)
3.004165230550543
```
The output gives us the execution time averaged over multiple runs (the number of samples is determined automatically based on the run time), the number of allocations, and the return value of the function being benchmarked.
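As an aside, the other import forms mentioned above would look as follows; this is a small sketch using standard Julia syntax (the `sum(rand(100))` call is just an illustrative workload):

```julia
import BenchmarkTools                  # only the module name enters scope,
BenchmarkTools.@btime sum(rand(100))   # so even exported names must be qualified

using BenchmarkTools: @btime           # bring in only the listed name
@btime sum(rand(100))                  # now usable without qualification
```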
`docs/src/lecture_02/lecture.md`
…one observes the second version produces more optimal code. Why is that?
This difference will indeed have an impact on the time of code execution.
On my i5-8279U CPU, the difference (as measured by BenchmarkTools) is
```julia
using BenchmarkTools
@btime energy(a)
@btime energy(b)
```
```
  159.669 ns (0 allocations: 0 bytes)
  44.571 ns (0 allocations: 0 bytes)
```
This nicely demonstrates that the choice of types affects performance. Does it mean that we should always use `Tuple`s instead of `Array`s? Surely not; each is better for different use-cases. Using tuples means that the compiler will compile a special function for each tuple length and each combination of item types it contains, which is clearly wasteful.
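A small sketch of the type-level difference behind this trade-off (standard Julia; nothing assumed beyond the REPL):

```julia
t = (1.0, 2.0, 3.0)   # Tuple{Float64, Float64, Float64}: the length and the
                      # type of every element are part of the type itself
v = [1.0, 2.0, 3.0]   # Vector{Float64}: the length is not part of the type
typeof((1, 2.0))      # Tuple{Int64, Float64}: each length/type combination
                      # is a distinct type the compiler specializes for
```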
`wolfpack_a` carries the type `Vector{Wolf}` while `wolfpack_b` has the type `Vector{Any}`. This means that in the first case, the compiler knows that all items are of the type `Wolf` and it can specialize functions using this information. In the case of `wolfpack_b`, it does not know which animal it will encounter (although all are of the same type), and therefore it needs to dynamically resolve the type of each item upon its use. This ultimately leads to less performant code.
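For illustration, two such packs might be constructed as follows (a sketch; the `Wolf` definition here is hypothetical, as the lecture defines its own type):

```julia
struct Wolf           # illustrative stand-in for the lecture's Wolf type
    name::String
    energy::Int
end

wolfpack_a = [Wolf("rem", 22), Wolf("rom", 22)]     # inferred as Vector{Wolf}
wolfpack_b = Any[Wolf("rem", 22), Wolf("rom", 22)]  # element type forced to Any
typeof(wolfpack_a)    # Vector{Wolf}
typeof(wolfpack_b)    # Vector{Any}
```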
```julia
@btime energy(wolfpack_a)
@btime energy(wolfpack_b)
```
```
  40.279 ns (0 allocations: 0 bytes)
  159.407 ns (0 allocations: 0 bytes)
```
To conclude, Julia is indeed a dynamically typed language, **but** if the compiler can infer all types in a called function in advance, it does not have to perform the type resolution…
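One way to see this inference at work is `@code_warntype` (standard Julia tooling; this sketch assumes the `energy` function and the two packs from above):

```julia
# Red `::Any` annotations in the output mark places where the compiler could
# not infer a concrete type and must fall back to dynamic dispatch.
@code_warntype energy(wolfpack_a)   # all types concrete: the fast path
@code_warntype energy(wolfpack_b)   # items are ::Any: dynamic dispatch
```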
This works like the definition above, except that the arguments are no longer converted to `Float64`. One can store different values in `x` and `y`, for example a `String` (e.g. `VaguePosition("Hello", "world")`). Although the above definition might be convenient, it limits the compiler's ability to specialize, as the type `VaguePosition` does not carry information about the types of `x` and `y`, which has a negative impact on performance. For example
```julia
using BenchmarkTools
move(a, b) = typeof(a)(a.x + b.x, a.y + b.y)
x = [PositionF64(rand(), rand()) for _ in 1:100]
y = [VaguePosition(rand(), rand()) for _ in 1:100]
@benchmark reduce(move, x)
@benchmark reduce(move, y)
```
```
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
…
```
Giving the fields of a composite type an abstract type does not really solve the problem of the compiler not knowing the type. In this example, it still does not know whether it should use instructions for `Float64` or `Int8`.
```julia
struct LessVaguePosition
    x::Real
    y::Real
end
z = [LessVaguePosition(rand(), rand()) for _ in 1:100];
@benchmark reduce(move, z)
```
```
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  16.542 μs … 5.043 ms  ┊ GC (min … max): 0.00% … 99.57%
  16.5 μs        Histogram: log(frequency) by time        21.3 μs <

 Memory estimate: 9.31 KiB, allocs estimate: 496.
```
From the perspective of generating optimal code, both definitions are equally uninformative to the compiler, as it cannot assume anything about the fields' types. However, `LessVaguePosition` ensures that the position contains only numbers, hence catching trivial errors such as instantiating `VaguePosition` with non-numeric types for which arithmetic operators are not defined (recall the discussion at the beginning of the lecture).
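The kind of trivial error being caught might look like this (a sketch assuming both struct definitions from above):

```julia
VaguePosition("Hello", "world")      # constructs fine; fails only later,
                                     # when `move` tries to add the Strings
LessVaguePosition("Hello", "world")  # fails immediately with a MethodError,
                                     # since a String cannot convert to Real
```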
…Note that the memory layout of mutable structures is different, as fields now c…
### Parametric types
So far, we had to trade off flexibility for generality in type definitions. Can we have both? The answer is affirmative. The way to achieve this **flexibility** in the definition of a type while still being able to generate optimal code is to **parametrize** the type definition. This is achieved by replacing the field types with a parameter (typically a single uppercase character) and specifying concrete types in curly brackets where needed. For example
```julia
struct PositionT{T}
    x::T
    y::T
end
u = [PositionT(rand(), rand()) for _ in 1:100]
@btime reduce(move, u)
```
```
  116.285 ns (1 allocation: 32 bytes)
```
Notice that the compiler can take advantage of specializing for different types (which has no effect here, since on modern processors the addition of `Float` and `Int` takes the same time).
```julia
v = [PositionT(rand(1:100), rand(1:100)) for _ in 1:100]
@btime reduce(move, v)
```
```
  116.892 ns (1 allocation: 32 bytes)
```
The above definition suffers from the same problem as `VaguePosition`: it allows us to instantiate `PositionT` with non-numeric types, e.g. `String`. We solve this by restricting the type parameter `T` to be a subtype of some supertype, in this case `Real`.
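A minimal sketch of such a restricted definition (the struct name is illustrative; the lecture's own naming may differ):

```julia
struct PositionR{T<:Real}   # T may only be a subtype of Real
    x::T
    y::T
end

PositionR(1.0, 2.0)       # works: Float64 <: Real
# PositionR("a", "b")     # MethodError: no constructor accepts Strings
```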
…If the compiler cannot narrow the types of arguments to concrete types, it has t…
An interesting intermediate between fully abstract and fully concrete types occurs when the compiler knows that the arguments have an abstract type composed of a small number of concrete types. This case is called union splitting, and it happens when there is just a little bit of uncertainty. Julia will do something like the following.
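Here is a hedged, hand-written sketch of what the compiler effectively does (the container and function are illustrative, not from the lecture):

```julia
# An array whose element type is a small Union of concrete types.
vals = Union{Int64, Float64}[1, 2.0, 3, 4.0]

function total(xs)
    r = 0.0
    for v in xs
        # The compiler inserts a branch like this automatically, so each
        # arm operates on a fully concrete type with no dynamic dispatch.
        if v isa Int64
            r += v    # v::Int64 in this branch
        else
            r += v    # v::Float64 in this branch
        end
    end
    r
end

total(vals)   # 10.0
```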