Locally, the memory is composed of a *stack* and a *heap*. The stack requires a
static allocation: it is ordered. Because it's ordered, it is very clear where
things are in the stack, and therefore accesses are very quick (think
- instantanious). However, because this is static, it requires that the size
+ instantaneous). However, because this is static, it requires that the size
of the variables is known at compile time (to determine all of the variable
locations). Since that is not possible with all variables, there exists the
heap. The heap is essentially a stack of pointers to objects in memory. When
@btime inner_noalloc!(C,A,B)
```

- Why does the array here get heap-allocated? It isn't able to prove/guarentee
+ Why does the array here get heap-allocated? It isn't able to prove/guarantee
at compile-time that the array's size will always be a given value, and thus
- it allocates it to the heap. `@btime` tells us this allocation occured and
+ it allocates it to the heap. `@btime` tells us this allocation occurred and
shows us the total heap memory that was taken. Meanwhile, the size of a Float64
number is known at compile-time (64-bits), and so this is stored onto the stack
and given a specific location that the compiler will be able to directly
@@ -277,7 +277,7 @@ without a care about performance.
temporary variables since the individual C kernels are written for specific
numbers of inputs and thus don't naturally fuse. Julia's broadcast mechanism
is just generating and JIT compiling Julia functions on the fly, and thus it
- can accomodate the combinatorial explosion in the amount of choices just by
+ can accommodate the combinatorial explosion in the amount of choices just by
only compiling the combinations that are necessary for a specific code)

### Heap Allocations from Slicing
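The copy-vs-view distinction that this section covers can be sketched quickly (the array and its size here are illustrative):

```julia
using BenchmarkTools

A = rand(100, 100)

@btime sum($A[:, 1])        # the slice A[:, 1] copies: a fresh vector is heap-allocated
@btime sum(@view $A[:, 1])  # a view aliases the column: no copy, no heap allocation
```

`@view` (and the function form `view(A, :, 1)`) returns a `SubArray` that indexes into the parent array's memory instead of copying it, which is why the allocation disappears.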
@@ -310,7 +310,7 @@ but with a relatively small constant).
### Asymptotic Cost of Heap Allocations

Heap allocations have to locate and prepare a space in RAM that is proportional
- to the amount of memory that is calcuated, which means that the cost of a heap
+ to the amount of memory that is calculated, which means that the cost of a heap
allocation for an array is O(n), with a large constant. As RAM begins to fill
up, this cost dramatically increases. If you run out of RAM, your computer
may begin to use *swap*, which is essentially RAM simulated on your hard drive.
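A quick way to see the O(n) scaling (the sizes here are arbitrary) is to time allocations of different lengths against reuse of a preallocated buffer:

```julia
using BenchmarkTools

@btime zeros(10)         # small allocation: cheap, but still a heap round-trip
@btime zeros(100_000)    # the cost grows roughly linearly with the requested size

buf = zeros(100_000)     # pay the O(n) allocation once...
@btime fill!($buf, 0.0)  # ...then reuse the buffer with no further allocation
```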
@@ -443,7 +443,7 @@ since it needs to decode and have a version for all primitive types!
Not only are there runtime overhead checks in function calls due to not being
explicit about types, there is also a memory overhead since it is impossible
to know how much memory a value will take since that's a property of its type.
- Thus the Python interpreter cannot statically guerentee exact unchanging values
+ Thus the Python interpreter cannot statically guarantee exact unchanging values
for the size that a value would take in the stack, meaning that the variables
are not stack-allocated. This means that every number ends up heap-allocated,
which hopefully begins to explain why this is not as fast as C.
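This boxing overhead can be imitated inside Julia itself by erasing the element type, which forces every number into its own heap cell (a hedged sketch; the variable names are illustrative):

```julia
using BenchmarkTools

unboxed = collect(1:1000)        # Vector{Int64}: integers stored inline, contiguously
boxed   = Vector{Any}(unboxed)   # Vector{Any}: each element is a pointer to a heap box

@btime sum($unboxed)  # tight loop over machine integers
@btime sum($boxed)    # pointer-chasing plus dynamic dispatch on every +
```

The two sums compute the same value; only the memory representation differs, which is the same gap that separates C-style numbers from Python-style objects.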
@@ -458,12 +458,12 @@ a + b
```

However, before JIT compilation, Julia runs a type inference algorithm which
finds out that `A` is an `Int`, and `B` is an `Int`. You can then understand
- that if it can prove that `A+B` is an `Int`, then it can propogate all of the
+ that if it can prove that `A+B` is an `Int`, then it can propagate all of the
types through.

### Type Specialization in Functions

- Julia is able to propogate type inference through functions because, even if
+ Julia is able to propagate type inference through functions because, even if
a function is "untyped", Julia will interpret this as a *generic function*
over possible *methods*, where every method has a concrete type. This means that
in Julia, the function:
@@ -564,7 +564,7 @@ and thus the output is unknown:

This means that its output type is `Union{Int,Float64}` (Julia uses union types
to keep the types still somewhat constrained). Once there are multiple choices,
- those need to get propogated through the compiler, and all subsequent calculations
+ those need to get propagated through the compiler, and all subsequent calculations
are the result of either being an `Int` or a `Float64`.

(Note that Julia has small union optimizations, so if this union is of size
@@ -615,7 +615,7 @@ ff(x::Number,y::Number) = x + y
ff(2.0,5)
```

- Notice that the fallback method still specailizes on the inputs:
+ Notice that the fallback method still specializes on the inputs:

```julia
@code_llvm ff(2.0,5)
@@ -643,7 +643,7 @@ Note that `f(x,y) = x+y` is equivalent to `f(x::Any,y::Any) = x+y`, where `Any`
is the maximal supertype of every Julia type. Thus `f(x,y) = x+y` is essentially
a fallback for all possible input values, telling it what to do in the case that
no other dispatches exist. However, note that this dispatch itself is not slow,
- since it will be specailized on the input types.
+ since it will be specialized on the input types.

### Ambiguities

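A minimal illustration of what an ambiguity looks like (the function `g` here is made up for this sketch):

```julia
g(x::Int, y::Number) = 1
g(x::Number, y::Int) = 2

g(2, 3.0)  # only the first method applies: returns 1
g(2.0, 3)  # only the second method applies: returns 2
# g(2, 3) would throw a MethodError: both methods match and neither is more specific

# The standard fix is to add a method that is more specific than both:
g(x::Int, y::Int) = 3
g(2, 3)    # now dispatches unambiguously: returns 3
```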
@@ -1199,10 +1199,10 @@ cheat() = qinline(1.0,2.0)
```

It realized that `1.0` and `2.0` are constants, so it did what's known as
- *constant propogation*, and then used those constants inside of the function.
+ *constant propagation*, and then used those constants inside of the function.
It realized that the solution is always `9`, so it compiled the function that...
spits out `9`! So it's fast because it's not computing anything. So be very
- careful about propogation of constants and literals. In general this is a very
+ careful about propagation of constants and literals. In general this is a very
helpful feature, but when benchmarking this can cause some weird behavior. If
a micro benchmark is taking less than a nanosecond, check and see if the compiler
"fixed" your code!
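One way to keep `@btime` honest (the helper `q` below is a stand-in, not the function above) is to pass the inputs through `Ref`s so the compiler cannot treat them as compile-time constants:

```julia
using BenchmarkTools

q(x, y) = x + y                      # stand-in for a small kernel being benchmarked

@btime q(1.0, 2.0)                   # literals: constant propagation can fold the whole call away
x, y = 1.0, 2.0
@btime q($(Ref(x))[], $(Ref(y))[])   # Ref trick: the values are loaded at run time, not compile time
```

The `$(Ref(x))[]` pattern is the idiom recommended in the BenchmarkTools documentation for benchmarks whose inputs would otherwise be constant-folded.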