| 1 | +# How to write fast code |
| 2 | + |
| 3 | +In the GeoInterface ecosystem, and GeometryOps specifically, there are a few tricks that can help you keep your code fast and allocation-free. |
| 4 | + |
| 5 | +## Always propagate compile-time information |
| 6 | + |
| 7 | +The first time you call `trait` should be the last time you call `trait` on that geometry. |
| 8 | +After that, propagate the trait down the stack by passing it as an argument. |
| 9 | + |
| 10 | +If you don't, the compiler loses track of the trait, and when the trait is looked up again, Julia has to allocate to perform a dynamic dispatch. |
| 11 | +This is slow and can cause a 3x (or much larger) slowdown in your code. |
| 12 | + |
| 13 | +Things like the [`Applicator`](@ref GeometryOpsCore.Applicator)s and especially the [`WithTrait`](@ref GeometryOpsCore.WithTrait) applicator can help here. |
| 14 | + |
| 15 | +Similarly, you'll notice a pattern where we pass a floating point type down the call chain. This is also done for type stability. |
| 16 | +If GeoInterface gets a `coordtype` in the future, the default will become `float(coordtype(geom))`, but for now we fix it at `Float64` and let |
| 17 | +the user change it if they want. This avoids a whole class of issues with geometries whose coordinates are `Float32`, `BigFloat`, |
| 18 | +or some other number type. |
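
The pattern described in this section can be sketched without any packages. Everything below is a hypothetical stand-in (`MyPoint`, `MyLine`, the two trait structs, `total_length`, and `segment_length` are invented for illustration and are not the GeoInterface or GeometryOps API). It only shows the shape of the idea: look the trait up once at the entry point, then pass both the trait and the float type down as arguments.

```julia
# Hypothetical stand-ins for GeoInterface-style traits (not the real API).
abstract type AbstractTrait end
struct PointTrait <: AbstractTrait end
struct LineStringTrait <: AbstractTrait end

struct MyPoint
    x::Float64
    y::Float64
end

struct MyLine
    points::Vector{MyPoint}
end

# The trait depends only on the type, so the compiler can resolve it statically.
trait(::MyPoint) = PointTrait()
trait(::MyLine) = LineStringTrait()

# Entry point: call `trait` exactly once and fix the float type (Float64 by default).
total_length(geom, ::Type{T} = Float64) where {T} = total_length(trait(geom), T, geom)

# A point has no length.
total_length(::PointTrait, ::Type{T}, geom) where {T} = zero(T)

# Inner methods receive the trait and the float type as arguments
# instead of looking them up again.
function total_length(::LineStringTrait, ::Type{T}, geom) where {T}
    len = zero(T)
    pts = geom.points
    for i in 1:(length(pts) - 1)
        # The children of a line are points, so the trait is passed explicitly.
        len += segment_length(PointTrait(), T, pts[i], pts[i + 1])
    end
    return len
end

segment_length(::PointTrait, ::Type{T}, a, b) where {T} = T(hypot(b.x - a.x, b.y - a.y))
```

Calling `total_length(line)` accumulates in `Float64`, while `total_length(line, Float32)` switches the accumulator type without touching the inner methods.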
| 19 | + |
| 20 | +## Try not to allocate unless necessary |
| 21 | + |
| 22 | +There are a lot of algorithms that seem simple to implement with some `collect`s. Try to skip that if possible, and use |
| 23 | +GeoInterface constructs like `getgeom`, `getpoint`, and `getring`, which are faster anyway. |
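
As a rough illustration of the difference, here is a package-free sketch that computes the signed area of a ring in two ways. The ring is a plain vector of `(x, y)` tuples and both function names are hypothetical; in real code the indexing would go through GeoInterface accessors such as `getpoint` rather than `ring[i]`.

```julia
# Hypothetical stand-in: a closed ring stored as a vector of (x, y) tuples.
ring = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]

# Tempting version: `collect` builds two temporary arrays before doing any work.
function signed_area_collect(ring)
    xs = collect(p[1] for p in ring)
    ys = collect(p[2] for p in ring)
    return sum(xs[i] * ys[i + 1] - xs[i + 1] * ys[i] for i in 1:(length(xs) - 1)) / 2
end

# Same shoelace formula, reading each pair of points directly from the ring.
function signed_area_iter(ring)
    area = 0.0
    for i in 1:(length(ring) - 1)
        x1, y1 = ring[i]
        x2, y2 = ring[i + 1]
        area += x1 * y2 - x2 * y1
    end
    return area / 2
end
```

Both functions return the same value; the second one never materializes intermediate arrays, which is the property to aim for.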
| 24 | + |
| 25 | +## Analyse your code using Julia tools |
| 26 | + |
| 27 | +[**ProfileView.jl**](https://github.com/timholy/ProfileView.jl) and [**Cthulhu.jl**](https://github.com/JuliaDebug/Cthulhu.jl) work together very well to diagnose |
| 28 | +and fix type instability. [**JET.jl**](https://github.com/aviatesk/JET.jl) is also good here. |
| 29 | + |
| 30 | +[**TimerOutputs.jl**](https://github.com/KristofferC/TimerOutputs.jl) is excellent for characterizing where your time is being spent |
| 31 | +and which parts of your function you should focus on optimizing. Always use TimerOutputs before hyperoptimizing: you don't usually |
| 32 | +want to halve the cost of a function which contributes 1% of your runtime! |
| 33 | + |
| 34 | +## Use statically sized, immutable types where you can |
| 35 | + |
| 36 | +Statically sized, immutable types are very good because they can be inlined and do not allocate. |
| 37 | +This is not a ban on mutable types, though. Sometimes rolling your own stack (which allocates) is substantially faster than |
| 38 | +recursion (which technically doesn't). |
| 39 | + |
| 40 | +If you have a collection whose size you don't know, and which you believe is completely unpredictable at compile time, |
| 41 | +pay the cost and make it a vector instead of forcing type instability. This applies to tuples in particular, since a tuple's length is part of its type. |
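
A minimal sketch of both halves of this advice, using hypothetical types (`Segment`, `midpoint`, and `midpoints` are invented for illustration): a fixed-size immutable struct for data whose size is known, and a concretely typed `Vector` for a collection whose length is only known at run time.

```julia
# Hypothetical fixed-size, immutable type: two 2D endpoints stored inline.
struct Segment{T <: AbstractFloat}
    a::NTuple{2, T}
    b::NTuple{2, T}
end

# Returns a tuple, so the result is also fixed-size and immutable.
midpoint(s::Segment) = ((s.a[1] + s.b[1]) / 2, (s.a[2] + s.b[2]) / 2)

# The number of segments is only known at run time, so a Vector is used here.
# Its element type is concrete, which keeps the loop type-stable.
function midpoints(segments::Vector{Segment{T}}) where {T}
    out = Vector{NTuple{2, T}}(undef, length(segments))
    for (i, s) in enumerate(segments)
        out[i] = midpoint(s)
    end
    return out
end
```

`isbitstype(Segment{Float64})` returns `true`, meaning the struct can be stored inline without a separate heap allocation. Building an `NTuple{N}` where `N` comes from run-time data would instead make the return type depend on a value, which is the instability the paragraph above warns about.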
| 42 | + |