
Commit 4061611

Merge pull request #65 from pitmonticone/master
Clean lectures and homeworks
2 parents eef041f + 5fae8ee commit 4061611

18 files changed: +118, -118 lines changed

_weave/homework02/hw2.jmd

Lines changed: 2 additions & 2 deletions
````diff
@@ -46,7 +46,7 @@ The forward sensitivity equations are given by:
 
 $$\frac{d}{dt}\left(\frac{du}{dp}\right) = \frac{df}{du}\frac{du}{dp} + \frac{df}{dp}$$
 
-Use this definition to simultaniously solve for the solution to the ODE along
+Use this definition to simultaneously solve for the solution to the ODE along
 with its derivative with respect to parameters.
 
 ### Part 3: Parameter Estimation
@@ -55,7 +55,7 @@ Generate data using the parameters from Part 1. Then perturb the parameters to
 start at $\alpha = 1.2$, $\beta = 0.8$, $\gamma = 2.8$, and $\delta = 0.8$. Use
 the L2 loss against the data as a cost function, and use the forward sensitivity
 equations to implement gradient descent and optimize the cost function to
-retreive the true parameters
+retrieve the true parameters
 
 ## Problem 2: Bandwidth Maximization
 
````
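The forward sensitivity equation quoted in this hunk can be sketched in a few lines of Julia. This is a hypothetical illustration (simple Euler stepping, the scalar ODE $u' = pu$, and made-up function names), not the homework's intended solution:

```julia
# Minimal forward-sensitivity sketch (hypothetical names, explicit Euler).
# For u' = f(u,p) = p*u, the sensitivity s = du/dp obeys the quoted equation:
#   s' = (df/du)*s + df/dp = p*s + u.
function euler_with_sensitivity(u0, p, dt, n)
    u, s = u0, 0.0  # s = du/dp is zero at t = 0 since u0 does not depend on p
    for _ in 1:n
        # Both updates read the old (u, s), so this is Euler on the coupled system.
        u, s = u + dt * (p * u), s + dt * (p * s + u)
    end
    return u, s
end

u, s = euler_with_sensitivity(1.0, 0.5, 0.001, 1000)
```

For this ODE, $u(t) = e^{pt}$ and $du/dp = t\,e^{pt}$, so at $t = 1$ both values should be close to $e^{0.5} \approx 1.6487$, which gives a direct check on the sensitivity solve.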
_weave/homework03/hw3.jmd

Lines changed: 1 addition & 1 deletion
````diff
@@ -106,5 +106,5 @@ underlying array types.
 
 ### Part 2: GPU Neural ODE
 
-Change the inital condition of the ODE solves to a CuArray to make your neural
+Change the initial condition of the ODE solves to a CuArray to make your neural
 ODE GPU-accelerated.
````

_weave/lecture02/optimizing.jmd

Lines changed: 13 additions & 13 deletions
````diff
@@ -88,7 +88,7 @@ end
 Locally, the stack is composed of a *stack* and a *heap*. The stack requires a
 static allocation: it is ordered. Because it's ordered, it is very clear where
 things are in the stack, and therefore accesses are very quick (think
-instantanious). However, because this is static, it requires that the size
+instantaneous). However, because this is static, it requires that the size
 of the variables is known at compile time (to determine all of the variable
 locations). Since that is not possible with all variables, there exists the
 heap. The heap is essentially a stack of pointers to objects in memory. When
@@ -124,9 +124,9 @@ end
 @btime inner_noalloc!(C,A,B)
 ```
 
-Why does the array here get heap-allocated? It isn't able to prove/guarentee
+Why does the array here get heap-allocated? It isn't able to prove/guarantee
 at compile-time that the array's size will always be a given value, and thus
-it allocates it to the heap. `@btime` tells us this allocation occured and
+it allocates it to the heap. `@btime` tells us this allocation occurred and
 shows us the total heap memory that was taken. Meanwhile, the size of a Float64
 number is known at compile-time (64-bits), and so this is stored onto the stack
 and given a specific location that the compiler will be able to directly
@@ -277,7 +277,7 @@ without a care about performance.
 temporary variables since the individual C kernels are written for specific
 numbers of inputs and thus don't naturally fuse. Julia's broadcast mechanism
 is just generating and JIT compiling Julia functions on the fly, and thus it
-can accomodate the combinatorial explosion in the amount of choices just by
+can accommodate the combinatorial explosion in the amount of choices just by
 only compiling the combinations that are necessary for a specific code)
 
 ### Heap Allocations from Slicing
@@ -310,7 +310,7 @@ but with a relatively small constant).
 ### Asymptotic Cost of Heap Allocations
 
 Heap allocations have to locate and prepare a space in RAM that is proportional
-to the amount of memory that is calcuated, which means that the cost of a heap
+to the amount of memory that is calculated, which means that the cost of a heap
 allocation for an array is O(n), with a large constant. As RAM begins to fill
 up, this cost dramatically increases. If you run out of RAM, your computer
 may begin to use *swap*, which is essentially RAM simulated on your hard drive.
@@ -443,7 +443,7 @@ since it needs to decode and have a version for all primitive types!
 Not only is there runtime overhead checks in function calls due to to not being
 explicit about types, there is also a memory overhead since it is impossible
 to know how much memory a value with take since that's a property of its type.
-Thus the Python interpreter cannot statically guerentee exact unchanging values
+Thus the Python interpreter cannot statically guarantee exact unchanging values
 for the size that a value would take in the stack, meaning that the variables
 are not stack-allocated. This means that every number ends up heap-allocated,
 which hopefully begins to explain why this is not as fast as C.
@@ -458,12 +458,12 @@ a + b
 
 However, before JIT compilation, Julia runs a type inference algorithm which
 finds out that `A` is an `Int`, and `B` is an `Int`. You can then understand
-that if it can prove that `A+B` is an `Int`, then it can propogate all of the
+that if it can prove that `A+B` is an `Int`, then it can propagate all of the
 types through.
 
 ### Type Specialization in Functions
 
-Julia is able to propogate type inference through functions because, even if
+Julia is able to propagate type inference through functions because, even if
 a function is "untyped", Julia will interpret this as a *generic function*
 over possible *methods*, where every method has a concrete type. This means that
 in Julia, the function:
@@ -564,7 +564,7 @@ and thus the output is unknown:
 
 This means that its output type is `Union{Int,Float64}` (Julia uses union types
 to keep the types still somewhat constrained). Once there are multiple choices,
-those need to get propogated through the compiler, and all subsequent calculations
+those need to get propagated through the compiler, and all subsequent calculations
 are the result of either being an `Int` or a `Float64`.
 
 (Note that Julia has small union optimizations, so if this union is of size
@@ -615,7 +615,7 @@ ff(x::Number,y::Number) = x + y
 ff(2.0,5)
 ```
 
-Notice that the fallback method still specailizes on the inputs:
+Notice that the fallback method still specializes on the inputs:
 
 ```julia
 @code_llvm ff(2.0,5)
@@ -643,7 +643,7 @@ Note that `f(x,y) = x+y` is equivalent to `f(x::Any,y::Any) = x+y`, where `Any`
 is the maximal supertype of every Julia type. Thus `f(x,y) = x+y` is essentially
 a fallback for all possible input values, telling it what to do in the case that
 no other dispatches exist. However, note that this dispatch itself is not slow,
-since it will be specailized on the input types.
+since it will be specialized on the input types.
 
 ### Ambiguities
 
@@ -1199,10 +1199,10 @@ cheat() = qinline(1.0,2.0)
 ```
 
 It realized that `1.0` and `2.0` are constants, so it did what's known as
-*constant propogation*, and then used those constants inside of the function.
+*constant propagation*, and then used those constants inside of the function.
 It realized that the solution is always `9`, so it compiled the function that...
 spits out `9`! So it's fast because it's not computing anything. So be very
-careful about propogation of constants and literals. In general this is a very
+careful about propagation of constants and literals. In general this is a very
 helpful feature, but when benchmarking this can cause some weird behavior. If
 a micro benchmark is taking less than a nanosecond, check and see if the compiler
 "fixed" your code!
````
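The constant-propagation pitfall described in the last hunk is easy to reproduce. A minimal sketch with hypothetical names mirroring the lecture's `qinline`/`cheat` (the `Ref` indirection is the usual trick for hiding literal values from the compiler when benchmarking):

```julia
# With literal arguments, the compiler can fold the whole call to a constant,
# so a "benchmark" of cheat_like() measures nothing.
qinline_like(x, y) = (x + y)^2          # cheap pure function
cheat_like() = qinline_like(1.0, 2.0)   # constant arguments: folds to 9.0

# To benchmark honestly, load the values at run time, e.g. from a Ref:
x, y = Ref(1.0), Ref(2.0)
honest() = qinline_like(x[], y[])       # compiler cannot fold the Ref loads away
```

Both return the same value; the difference only shows up in `@btime`-style micro benchmarks, where the folded version reports a meaningless sub-nanosecond time.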

_weave/lecture03/sciml.jmd

Lines changed: 2 additions & 2 deletions
````diff
@@ -54,7 +54,7 @@ A neural network is a function:
 \text{NN}(x) = W_3\sigma_2(W_2\sigma_1(W_1x + b_1) + b_2) + b_3
 ```
 
-where we can change the number of layers (`(W_i,b_i)`) as necesary. Let's assume
+where we can change the number of layers (`(W_i,b_i)`) as necessary. Let's assume
 we want to approximate some $R^{10} \rightarrow R^5$ function. To do this we need
 to make sure that we start with 10 inputs and arrive at 5 outputs. If we want a
 bigger middle layer for example, we can do something like (10,32,32,5). Size changing
@@ -491,7 +491,7 @@ the field. In scientific machine learning, neural networks and machine learning
 are used as the basis to solve problems in scientific computing. [Scientific
 computing, as a discipline also known as Computational Science, is a field of
 study which focuses on scientific simulation, using tools such as differential
-equations to investigate physical, biological, and other phonomena](https://en.wikipedia.org/wiki/Computational_science).
+equations to investigate physical, biological, and other phenomena](https://en.wikipedia.org/wiki/Computational_science).
 
 What we wish to do in scientific machine learning is use these properties of
 neural networks to improve the way that we investigate our scientific models.
````
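The network definition corrected in the first hunk can be written out directly. A minimal sketch assuming the (10,32,32,5) sizing mentioned in the quoted text, with `tanh` activations and random weights as arbitrary illustrative choices:

```julia
# NN(x) = W3*σ2(W2*σ1(W1*x + b1) + b2) + b3, sized (10,32,32,5).
σ = tanh
W1, b1 = randn(32, 10), randn(32)   # layer 1: R^10 -> R^32
W2, b2 = randn(32, 32), randn(32)   # layer 2: R^32 -> R^32
W3, b3 = randn(5, 32),  randn(5)    # layer 3: R^32 -> R^5
NN(x) = W3 * σ.(W2 * σ.(W1 * x .+ b1) .+ b2) .+ b3

y = NN(randn(10))   # maps R^10 -> R^5
```

Each `σ.` broadcasts the scalar activation over the layer output, and the final layer has no activation, matching the displayed formula.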

_weave/lecture04/dynamical_systems.jmd

Lines changed: 9 additions & 9 deletions
````diff
@@ -16,7 +16,7 @@ scientific models are dynamical systems. Thus if we want to start to dig into
 deeper methods, we will need to start looking into the theory and practice of
 nonlinear dynamical systems. In this lecture we will go over the basic
 properties of dynamical systems and understand their general behavior through
-code. We will also learn the idea of stability as an asymtopic property of a
+code. We will also learn the idea of stability as an asymptotic property of a
 mapping, and understand when a system is stable.
 
 ## Discrete Dynamical Systems
@@ -55,7 +55,7 @@ $$u_{n+1} = u_n + f(u_n,\theta)$$
 
 where $f$ is a neural network parameterized by $\theta$.
 
-Note that discrete dyamical systems are even more fundamental than just the
+Note that discrete dynamical systems are even more fundamental than just the
 ones shown. In any case where a continuous model is discretized to loop on the
 computer, the resulting algorithm is a discrete dynamical system and thus
 evolves according to its properties. This fact will be revisited later.
@@ -152,8 +152,8 @@ This is essentially another way of saying that a function that is differentiable
 is Lipschitz, where we can use the derivative as the Lipschitz bound. But notice
 this means that, in this neighborhood, a function with a derivative less than
 1 is a contraction mapping, and thus there is a limiting sequence which goes to
-the fixed point by the Banach Fixed Point Theorem. Furthermore, the uniquess
-guerentees that there is only one fixed point in a sufficiently small neighborhood
+the fixed point by the Banach Fixed Point Theorem. Furthermore, the uniqueness
+guarantees that there is only one fixed point in a sufficiently small neighborhood
 where the derivative is all less than 1.
 
 A way to interpret this result is that, any nice enough function $f$ is locally
@@ -271,7 +271,7 @@ analysis:
 $$u_{n+1} = \sum_{j=0}^{k-1} \alpha_j u_{n-j} + \epsilon_n$$
 
 In a very quick handwavy way, we can understand such a system by seeing how the
-perturbations propogate. If $u_0 = 0$, then the starting is just $\epsilon_0$.
+perturbations propagate. If $u_0 = 0$, then the starting is just $\epsilon_0$.
 If we assume all other $\epsilon_i = 0$, then this system is the same as a linear
 dynamical system with delays. If all of the roots are in the unit circle, then
 it goes to zero, meaning the perturbation is forgotten or squashed over time.
@@ -312,10 +312,10 @@ well: this was a periodic orbit of length 2.
 Chaos is another interesting property of a discrete dynamical system. It can be
 interpreted as a periodic orbit where the length is infinity. This can happen if,
 by changing a parameter, a period 2 orbit becomes a period 4, then a period 8,
-etc. (a phonomenon known as period doubling), and when it goes beyond the
+etc. (a phenomenon known as period doubling), and when it goes beyond the
 accumulation point the "infinite period orbit" is reached and chaos is found.
 A homework problem will delve into the properties of chaos as an example of a
-simple embaressingly data-parallel problem.
+simple embarrassingly data-parallel problem.
 
 ## Efficient Implmentation of Dynamical Systems
 
@@ -441,7 +441,7 @@ count.
 #### Multidimensional System Implementations
 
 When we go to multidimensional systems, some care needs to be taken to decrease
-the number of allocations which are occuring. One of the ways to do this is
+the number of allocations which are occurring. One of the ways to do this is
 to utilize statically sized arrays. For example, let's look at a discretization
 of the Lorenz system:
 
@@ -750,7 +750,7 @@ get involved, this can be a significant effect.
 
 1. What are some ways to compute steady states? Periodic orbits?
 2. When using the mutating algorithms, what are the data dependencies between
-different solves if they were to happen simultaniously?
+different solves if they were to happen simultaneously?
 3. We saw that there is a connection between delayed systems and multivariable
 systems. How deep does that go? Is every delayed system also a multivariable
 system and vice versa? Is this a useful idea to explore?
````
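The allocation-avoiding approach referenced in the `Multidimensional System Implementations` hunk can be sketched without any packages by using plain tuples, which, like the lecture's statically sized arrays, have compile-time-known size and stay off the heap (StaticArrays' `SVector` would be the idiomatic choice; the Lorenz parameters and Euler step below are illustrative assumptions):

```julia
# One explicit Euler step of the Lorenz system on a plain tuple: no heap
# allocations per step, since the tuple's size and element types are static.
function lorenz_step((x, y, z), (σ, ρ, β), dt)
    dx = σ * (y - x)
    dy = x * (ρ - z) - y
    dz = x * y - β * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)
end

function trajectory(u0, p, dt, n)
    u = u0
    for _ in 1:n
        u = lorenz_step(u, p, dt)   # rebinding a tuple, not mutating an array
    end
    return u
end

u = trajectory((1.0, 0.0, 0.0), (10.0, 28.0, 8 / 3), 0.01, 1000)
```

Because the state is immutable, each step returns a fresh stack value instead of writing into a heap buffer, which is exactly the trade the quoted section discusses.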

_weave/lecture05/parallelism_overview.jmd

Lines changed: 6 additions & 6 deletions
````diff
@@ -104,7 +104,7 @@ points at every line (it's an interpreted language and therefore the interpreter
 can always take control). In Julia, the yield points are minimized. The common
 yield points are allocations and I/O (`println`). This means that a tight
 non-allocating inner loop will not have any yield points and will be a thread
-that is not interruptable. While this is great for numerical performance, it is
+that is not interruptible. While this is great for numerical performance, it is
 something to be aware of.
 
 Side effect: if you run a long tight loop and wish to exit it, you may try
@@ -300,7 +300,7 @@ u = [Vector{Float64}(undef,3) for i in 1:1000]
 **Parallelism doesn't always make things faster**. There are two costs associated
 with this code. For one, we had to go to the slower heap+mutation version, so
 its implementation starting point is slower. But secondly, and more importantly,
-the cost of spinning a new thread is non-negligable. In fact, here we can see
+the cost of spinning a new thread is non-negligible. In fact, here we can see
 that it even needs to make a small allocation for the new context. The total
 cost is on the order of It's on the order of 50ns: not huge, but something
 to take note of. So what we've done is taken almost free calculations and made
@@ -324,7 +324,7 @@ their inputs. The following questions allow for independent simulations:
 conditions?
 - How does the solution very when I use a different `p`?
 
-The problem has a few descriptions. For one, it's called an *embaressingly parallel*
+The problem has a few descriptions. For one, it's called an *embarrassingly parallel*
 problem since the problem can remain largely intact to solve the parallelism
 problem. To solve this, we can use the exact same `solve_system_save_iip!`,
 and just change how we are calling it. Secondly, this is called a *data parallel*
@@ -454,7 +454,7 @@ serial_out - threaded_out
 ### Hierarchical Task-Based Multithreading and Dynamic Scheduling
 
 The major change in Julia v1.3 is that Julia's `Task`s, which are traditionally
-its green threads interface, are now the basis of its multithreading infrustructure.
+its green threads interface, are now the basis of its multithreading infrastructure.
 This means that all independent threads are parallelized, and a new interface for
 multithreading will exist that works by spawning threads.
 
@@ -487,7 +487,7 @@ However, if we check the timing we see:
 @btime tmap2(p -> compute_trajectory_mean5(@SVector([1.0,0.0,0.0]),p),ps)
 ```
 
-`Threads.@threads` is built on the same multithreading infrustructure, so why
+`Threads.@threads` is built on the same multithreading infrastructure, so why
 is this so much slower? The reason is because `Threads.@threads` employs
 **static scheduling** while `Threads.@spawn` is using **dynamic scheduling**.
 Dynamic scheduling is the model of allowing the runtime to determine the ordering
@@ -602,7 +602,7 @@ BLAS implementations. Extensions to these, known as LAPACK, include operations
 like factorizations, and are included in these standard libraries. These are
 all multithreaded. The reason why this is a location to target is because the
 operation count is high enough that parallelism can be made efficient even
-when only targetting this level: a matrix multiplication can take on the order
+when only targeting this level: a matrix multiplication can take on the order
 of seconds, minutes, hours, or even days, and these are all highly parallel
 operations. This means you can get away with a bunch just by parallelizing at
 this level, which happens to be a bottleneck for a lot scientific computing
````
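The dynamic-scheduling pattern these hunks contrast with `Threads.@threads` can be illustrated with a small spawn-based map. A sketch with hypothetical names (`tmap_spawn`, `work`); `Threads.@spawn` and `fetch` are the actual Base APIs:

```julia
# Dynamic scheduling: one Task per element, and the runtime assigns tasks to
# free worker threads, so uneven per-element costs are load-balanced.
function tmap_spawn(f, xs)
    tasks = [Threads.@spawn f(x) for x in xs]   # spawn eagerly
    return [fetch(t) for t in tasks]            # wait and collect in order
end

work(n) = sum(abs2, 1:n)   # deliberately uneven cost per element
out = tmap_spawn(work, [10, 10_000, 100, 1_000_000])
```

With static scheduling, the iteration range is split evenly up front, so one chunk may end up with all the expensive elements; spawning per element lets idle threads steal the remaining work, which is the behavior difference the quoted section measures.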
