
Commit 2de4cb1

"profiling" lesson is now presentable
1 parent 6e779cd commit 2de4cb1

File tree

3 files changed: +181 −71 lines changed


content/profiling.md

Lines changed: 137 additions & 71 deletions
@@ -1,37 +1,147 @@
# Profiling

:::{objectives}
- Understand when improving code performance is worth the time and effort.
- Know how to find performance bottlenecks in Python code.
- Try `scalene` as one of many tools to profile Python code.
:::

:::{instructor-note}
- Discussion: 20 min
- Exercise: 20 min
:::


## Should we even optimize the code?

A classic quote to keep in mind: "Premature optimization is the root of all evil." [Donald Knuth]

:::{discussion}
It is important to ask ourselves whether optimizing is worth it.
- Is it worth spending, say, 2 days to make a program run 20% faster?
- Is it worth optimizing the code so that it uses 90% less memory?

It depends. What does it depend on?
:::


## Measure instead of guessing

Before doing code surgery to optimize the run time or lower the memory usage,
we should **measure** where the bottlenecks are. This is called **profiling**.

Analogy: medical doctors don't start surgery based on guessing. They first measure
(X-ray, MRI, ...) to know precisely where the problem is.

It is not only beginners who guess wrong: experienced
programmers are also often surprised by the results of profiling.

## One of the simplest tools is to insert timers

Below we will list some tools that can be used to profile Python code.
But even without these tools, you can find **time-consuming parts** of your code
by inserting timers:

```{code-block} python
---
emphasize-lines: 1,8,10
---
import time


# ...
# code before the function


start = time.time()
result = some_function()
print(f"some_function took {time.time() - start} seconds")

# code after the function
# ...
```
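If you time things often, the same pattern can be wrapped in a small reusable helper. Below is a minimal sketch; the `timer` helper and the use of `time.perf_counter` (a monotonic clock better suited for measuring durations than `time.time`) are our own additions, not part of the lesson:

```python
import time
from contextlib import contextmanager


@contextmanager
def timer(label: str):
    # perf_counter is monotonic, so it is safe for measuring durations
    start = time.perf_counter()
    try:
        yield
    finally:
        print(f"{label} took {time.perf_counter() - start:.4f} seconds")


with timer("summing"):
    total = sum(range(1_000_000))
```

Any block of code wrapped in `with timer("..."):` gets timed, without repeating the start/stop boilerplate.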


## Many tools exist

The list below is probably not complete, but it gives an overview of the
different tools available for profiling Python code.

CPU profilers:
- [cProfile and profile](https://docs.python.org/3/library/profile.html)
- [line_profiler](https://kernprof.readthedocs.io/)
- [py-spy](https://github.com/benfred/py-spy)
- [Yappi](https://github.com/sumerc/yappi)
- [pyinstrument](https://pyinstrument.readthedocs.io/)
- [Perfetto](https://perfetto.dev/docs/analysis/trace-processor-python)
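As a quick taste of the first entry, `cProfile` ships with Python and can be driven from code. A minimal sketch (the `slow_sum` example function is our own, chosen only to give the profiler something to report):

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # deliberately naive loop so the profiler has something to show
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# print the five most expensive entries, sorted by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The report lists, per function, the number of calls and the time spent, which is often enough to spot the main offender.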

Memory profilers:
- [memory_profiler](https://pypi.org/project/memory-profiler/) (not actively maintained)
- [Pympler](https://pympler.readthedocs.io/)
- [tracemalloc](https://docs.python.org/3/library/tracemalloc.html)
- [guppy/heapy](https://github.com/zhuyifei1999/guppy3/)
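Of these, `tracemalloc` is in the standard library and easy to try. A minimal sketch (the allocation example is our own):

```python
import tracemalloc

tracemalloc.start()

# allocate something noticeable so it shows up in the statistics
data = [str(i) * 10 for i in range(100_000)]

current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")

# show the source lines responsible for the largest allocations
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)

tracemalloc.stop()
```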

Both CPU and memory:
- [Scalene](https://github.com/plasma-umass/scalene)

In the exercise below, we will use Scalene to profile a Python program. Scalene
is a sampling profiler that can profile CPU, memory, and GPU usage of Python programs.

## Tracing profilers vs. sampling profilers

**Tracing profilers** record every function call and event in the program,
logging the exact sequence and duration of events.
- **Pros:**
  - Provide detailed information on the program's execution.
  - Deterministic: capture exact call sequences and timings.
- **Cons:**
  - Higher overhead, slowing down the program.
  - Can generate large amounts of data.

**Sampling profilers** periodically sample the program's state (where it is
and how much memory is used), providing a statistical view of where time is
spent.
- **Pros:**
  - Lower overhead, since not every event is tracked.
  - Scale better with larger programs.
- **Cons:**
  - Less precise, potentially missing infrequent or short calls.
  - Provide an approximation rather than exact timing.

:::{discussion} Analogy: Imagine we want to optimize the London Underground (subway) system
We wish to detect bottlenecks in the system to improve the service, and for this we have
asked a few passengers to help us by tracking their journey.
- **Tracing**: We follow every train and passenger, recording every stop
  and delay. When passengers enter and exit the train, we record the exact time
  and location.
- **Sampling**: Every 5 minutes the phone notifies the passenger to note
  down their current location. We then use this information to estimate
  the most crowded stations and trains.
:::
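To make the sampling idea concrete, here is a toy sampler — our own sketch, nothing like Scalene's actual implementation — that periodically records which function the main thread is executing. It relies on the CPython-specific `sys._current_frames`:

```python
import collections
import sys
import threading
import time


def sample_main_thread(samples, interval=0.001, duration=0.3):
    # periodically look up what the main thread is currently executing
    main_id = threading.main_thread().ident
    deadline = time.perf_counter() + deadline_margin if False else time.perf_counter() + duration
    while time.perf_counter() < deadline:
        frame = sys._current_frames().get(main_id)
        if frame is not None:
            samples[frame.f_code.co_name] += 1
        time.sleep(interval)


def busy_work():
    # something CPU-bound for the sampler to catch
    total = 0
    for i in range(5_000_000):
        total += i * i
    return total


samples = collections.Counter()
sampler = threading.Thread(target=sample_main_thread, args=(samples,))
sampler.start()
busy_work()
sampler.join()
print(samples.most_common(3))
```

Functions in which the program spends a lot of time collect many samples; short or rare calls may be missed entirely, which is exactly the trade-off described above.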


## Choosing the right system size

Sometimes we can configure the system size (for instance the time step in a simulation,
the number of time steps, or the matrix dimensions) to make the program finish sooner.

For profiling, we should choose a system size that is **representative of the real-world**
use case. If we profile a program with a small input size, we might not see the same
bottlenecks as when running the program with a larger input size.

Often, when we scale up the system size or the number of processors, new bottlenecks
might appear which we didn't see before. This brings us back to: "measure instead of guessing".
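As a small illustration of why size matters, the toy comparison below (our own, not part of the lesson) only reveals the list-membership bottleneck once the input grows:

```python
import time


def unique_count_list(items):
    seen = []
    for x in items:
        if x not in seen:  # linear scan: cost grows with len(seen)
            seen.append(x)
    return len(seen)


def unique_count_set(items):
    seen = set()
    for x in items:
        seen.add(x)  # constant time on average
    return len(seen)


for n in (1_000, 10_000):
    items = [i % (n // 2) for i in range(n)]
    for func in (unique_count_list, unique_count_set):
        start = time.perf_counter()
        func(items)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__:18s} n={n:6d}: {elapsed:.4f} s")
```

At small `n` both versions look equally fast; at larger `n` the list version dominates the run time, which a profile of the small case would never show.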


## Exercises

::::{exercise} Exercise: Practicing profiling
In this exercise we will use the Scalene profiler to find out where most of the time is spent
and most of the memory is used in a given code example.

Please try to go through the exercise in the following steps:

@@ -58,55 +168,11 @@ Please try to go through the exercise in the following steps:

   You can find an example of the generated HTML report in the solution below.
1. Does the result match your prediction? Can you explain the results?

:::{literalinclude} profiling/exercise.py
:::

:::{solution}
```{figure} profiling/exercise.png
:alt: Result of the profiling run for the above code example.
:width: 100%

content/profiling/exercise.py

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
"""
The code below reads a text file and counts the number of unique words in it
(case-insensitive).
"""
import re


def count_unique_words1(file_path: str) -> int:
    with open(file_path, "r", encoding="utf-8") as file:
        text = file.read()
    words = re.findall(r"\b\w+\b", text.lower())
    return len(set(words))


def count_unique_words2(file_path: str) -> int:
    unique_words = []
    with open(file_path, "r", encoding="utf-8") as file:
        for line in file:
            words = re.findall(r"\b\w+\b", line.lower())
            for word in words:
                if word not in unique_words:
                    unique_words.append(word)
    return len(unique_words)


def count_unique_words3(file_path: str) -> int:
    unique_words = set()
    with open(file_path, "r", encoding="utf-8") as file:
        for line in file:
            words = re.findall(r"\b\w+\b", line.lower())
            for word in words:
                unique_words.add(word)
    return len(unique_words)


def main():
    # book.txt is downloaded from https://www.gutenberg.org/cache/epub/2600/pg2600.txt
    _result = count_unique_words1("book.txt")
    _result = count_unique_words2("book.txt")
    _result = count_unique_words3("book.txt")


if __name__ == "__main__":
    main()
