11# pybm example #1: "Hello, math world!"
22
3- This markdown contains a walkthrough for the inaugural pybm example, `sum`. It
4- is meant to display the usefulness of pybm in the context of Python library
5- development, where usually only a single implementation of a function is
6- maintained; therefore, especially in performance-critical sections, usually the
7- most optimized algorithm and implementation should be used.
3+ This markdown contains a walkthrough for the inaugural pybm example, `sum`. It is meant
4+ to display the usefulness of pybm in the context of Python library development, where
5+ usually only a single implementation of a function is maintained; therefore, especially
6+ in performance-critical sections, usually the most optimized algorithm and
7+ implementation should be used.
88
99## Prerequisites
1010
@@ -17,8 +17,8 @@ cd sum-example
1717git checkout master
1818```
1919
20- You need to run any pybm commands from a virtual environment with pybm
21- installed. The easiest way to do this is with the following series of commands:
20+ You need to run any pybm commands from a virtual environment with pybm installed. The
21+ easiest way to do this is with the following series of commands:
2222
2323```
2424# you should be in the root of the pybm-sum-example repository now
@@ -33,10 +33,10 @@ pybm init
3333
3434## Setting the stage
3535
36- We put ourselves in the perspective of the author of a fictitious Python math
37- library. As any good math package would require, there also has to be a
38- `sum` function, calculating the sum of the first `n` natural numbers for a given
39- integer input `n`. Currently, our author solved it like this:
36+ We put ourselves in the perspective of the author of a fictitious Python math library. As
37+ any good math package would require, there also has to be a `sum` function, calculating
38+ the sum of the first `n` natural numbers for a given integer input `n`. Currently, our
39+ author solved it like this:
4040
4141```python
4242def my_sum(n: int):
@@ -49,21 +49,19 @@ def my_sum(n: int):
4949```
5050
5151The code speaks volumes: Each summand `i` up to `n` is just the number 1
52- repeated `i` times. Not terribly clever, yet of course correct. But as you
53- notice, the computation is really tedious: A nested loop, with constant
54- increments of 1, each time.
52+ repeated `i` times. Not terribly clever, yet of course correct. But as you notice, the
53+ computation is really tedious: A nested loop, with constant increments of 1, each time.
5554
5655In fact, this code is pretty much a complete catastrophe: Our function has
57- _quadratic_ complexity, meaning that the computational workload scales with the
58- square of the input. Without even running it, we can assume that this will not
59- behave very well when users want to compute sums of large numbers. Can we do
60- better?
56+ _quadratic_ complexity, meaning that the computational workload scales with the square
57+ of the input. Without even running it, we can assume that this will not behave very well
58+ when users want to compute sums of large numbers. Can we do better?
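
The body of the original `my_sum` is elided in this diff. A minimal sketch that matches the description above (a nested loop that only ever adds 1) could look like the following. The name `my_sum_quadratic` is hypothetical and the exact loop shape is an assumption, not code taken from the repository:

```python
def my_sum_quadratic(n: int) -> int:
    """Sum 1..n by adding 1 exactly i times for every i (assumed reconstruction)."""
    result = 0
    for i in range(1, n + 1):
        # The inner loop contributes i increments of 1, i.e. the number i itself.
        for _ in range(i):
            result += 1
    return result


print(my_sum_quadratic(100))  # 5050
```

The innermost statement runs `n * (n + 1) / 2` times in total, which is where the quadratic growth comes from.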
6159
6260## Reducing it to linear time
6361
64- Alright, maybe the improvement here is already obvious. Of course, we can easily
65- cut the complexity by summing the actual numbers instead of ones. The new
66- function then looks like this:
62+ Alright, maybe the improvement here is already obvious. Of course, we can easily cut the
63+ complexity by summing the actual numbers instead of ones. The new function then looks
64+ like this:
6765
6866```python
6967def my_sum(n: int):
@@ -74,12 +72,12 @@ def my_sum(n: int):
7472 return result
7573```
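
To make the difference in workload concrete, the following sketch simply counts loop steps for both approaches. The helper names `steps_quadratic` and `steps_linear` are hypothetical, and the nested-loop shape is the same assumption as in the description of the original algorithm:

```python
def steps_quadratic(n: int) -> int:
    """Count the `+= 1` operations of a nested-loop sum (assumed shape)."""
    steps = 0
    for i in range(1, n + 1):
        for _ in range(i):
            steps += 1
    return steps


def steps_linear(n: int) -> int:
    """Count the additions of a single loop over 1..n."""
    steps = 0
    for _ in range(1, n + 1):
        steps += 1
    return steps


for n in (10, 100, 1000):
    print(n, steps_quadratic(n), steps_linear(n))
```

For `n = 1000` the nested loop performs 500500 steps while the single loop needs only 1000, and the gap widens as `n` grows.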
7674
77- But we need to adhere to a normal development workflow here! So instead of just
78- hacking the new algorithm and pushing the changes, we should create a feature
79- branch (we're calling it "linear-time") containing our improved algorithm. The
80- branch is already present in the example repository that you previously checked
81- out. You can create a pybm benchmark environment for it with the following
82- command, run from the repository root folder on `master`:
75+ But we need to adhere to a normal development workflow here! So instead of just hacking
76+ the new algorithm and pushing the changes, we should create a feature branch (we're
77+ calling it "linear-time") containing our improved algorithm. The branch is already
78+ present in the example repository that you previously checked out. You can create a pybm
79+ benchmark environment for it with the following command, run from the repository root
80+ folder on `master`:
8381
8482```shell
8583pybm env create linear-time
@@ -92,40 +90,40 @@ Successfully installed packages git+https://github.com/nicholasjng/pybm into vir
9290Successfully created benchmark environment for ref 'linear-time'.
9391```
9492
95- This checks out the HEAD of the branch "linear-time" into a separate git
96- worktree located in the parent folder of the repository, and creates a fresh
97- Python virtual environment for it.
93+ This checks out the branch "linear-time" at HEAD into a separate git worktree
94+ located in the parent folder of the repository, and creates a fresh Python virtual
95+ environment for it.
9896
9997But everything changes once we pick up an analysis textbook!
10098
101100## The super speedy sum, after C. F. Gauss
102100
103- At first glance, calculating a sum of `n` numbers looks like an inherently
104- linear problem. Yet, the mathematical problem contains so much hidden structure
105- that we can actually do it for any number `n` on a sheet of paper. The proof is
106- standard for any first-semester analysis course in university mathematics, and
107- sometimes finds its way into school curricula as well.
101+ At first glance, calculating a sum of `n` numbers looks like an inherently linear
102+ problem. Yet, the mathematical problem contains so much hidden structure that we can
103+ actually do it for any number `n` on a sheet of paper. The proof is standard for any
104+ first-semester analysis course in university mathematics, and sometimes finds its way
105+ into school curricula as well.
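
For completeness, the standard argument can be sketched as follows: write the sum once forwards and once backwards, then add the two lines column by column. The symbol `S` is introduced here only for illustration:

```latex
\begin{aligned}
S  &= 1 + 2 + \dots + (n-1) + n \\
S  &= n + (n-1) + \dots + 2 + 1 \\
2S &= \underbrace{(n+1) + (n+1) + \dots + (n+1)}_{n \text{ terms}} = n(n+1) \\
S  &= \frac{n(n+1)}{2}
\end{aligned}
```

Since one of `n` and `n + 1` is always even, the division by 2 never leaves a remainder.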
108106
109- In Germany specifically, it floats around as a nice little anecdote from the
110- early childhood of
111- [Carl Friedrich Gauss](https://en.wikipedia.org/wiki/Carl_Friedrich_Gauss),
112- commonly viewed as one of the greatest mathematicians of all time, who,
113- according to legend, used it to solve the summation of the first 100 numbers in
114- a matter of seconds, much faster than his fellow pupils. There is a nice
115- [article](https://de.wikipedia.org/wiki/Gau%C3%9Fsche_Summenformel) on German
116- Wikipedia on it as well.
107+ In Germany specifically, it floats around as a nice little anecdote from the early
108+ childhood of
109+ [Carl Friedrich Gauss](https://en.wikipedia.org/wiki/Carl_Friedrich_Gauss), commonly
110+ viewed as one of the greatest mathematicians of all time, who, according to legend, used
111+ it to solve his detention exercise of calculating the sum of the first 100
112+ numbers in a matter of seconds, much faster than his fellow pupils. There is a nice
113+ [article](https://de.wikipedia.org/wiki/Gau%C3%9Fsche_Summenformel) on German Wikipedia
114+ on it as well.
117115
118- The implementation is quite literally a one-liner, and looks like this:
116+ The implementation is a one-liner, and looks like this:
119117
120118```python
121119def my_sum(n: int):
122120 return n * (n + 1) // 2
123121```
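
As a quick sanity check, the closed form can be compared against Python's built-in `sum` over an explicit range. This is a small sketch and not part of the example repository; the name `gauss_sum` is hypothetical:

```python
def gauss_sum(n: int) -> int:
    """Closed-form sum of 1..n; the division is exact because n * (n + 1) is even."""
    return n * (n + 1) // 2


# Compare against an explicit summation for a range of inputs.
for n in range(0, 1001):
    assert gauss_sum(n) == sum(range(1, n + 1))

print(gauss_sum(100))  # 5050
```

Using integer division (`//`) keeps the result an `int`, so no floating-point rounding enters for large `n`.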
124122
125123No more loops, no `if`s, no buts: We have reduced the summation to a
126- _constant time_ problem! This looks very promising. Again, this algorithm is
127- already implemented on another branch called `constant-time`, for which we can
128- also create a benchmark environment:
124+ _constant time_ problem! This looks very promising. Again, this algorithm is already
125+ implemented on another branch called `constant-time`, for which we can also create a
126+ benchmark environment:
129127
130128```shell
131129pybm env create constant-time
@@ -138,28 +136,22 @@ Successfully installed packages git+https://github.com/nicholasjng/pybm into vir
138136Successfully created benchmark environment for ref 'constant-time'.
139137```
140138
141- Now we are left with a high-noon situation: Three implementation candidates,
142- three different algorithms, only one can be added to our math library. But what
143- are the numbers? We want to make an **informed decision** and find our best
144- performer in a scientific manner. That's where a benchmark helps!
139+ Now we are left with a high-noon situation: Three implementation candidates, three
140+ different algorithms, only one can be added to our math library. But what are the
141+ numbers? We want to make an **informed decision** and find our best performer in a
142+ scientific manner. That's where a benchmark helps!
145143
146144## Running the benchmark
147145
148- This is the perfect situation for pybm! We have environments for all of our
149- algorithms (our master branch is also contained in a benchmark environment
150- called "root", created during `pybm init`), so we can directly compare them. We
151- do this by writing a very basic benchmark test:
146+ This is the perfect situation for pybm! We have environments for all of our algorithms
147+ (our master branch is also contained in a benchmark environment called "root", created
148+ during `pybm init`), so we can directly compare them. We do this by writing a very basic
149+ benchmark test:
152150
153151```python
154152# benchmarks/sum.py
155- import os
156- import sys
157-
158153import pybm
159154
160- SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
161- sys.path.append(os.path.dirname(SCRIPT_DIR))
162-
163155from main import my_sum
164156
165157
@@ -171,20 +163,18 @@ if __name__ == "__main__":
171163 pybm.run(context=globals())
172164```
173165
174- Aside from some sys-path-hacking to get the python path set up correctly
175- (which you should just ignore right now), the test file is very simple: We
176- import our function `my_sum`, sum up all numbers from 1 to 10000, and run the
177- benchmark when executing the module as `__main__`. Everything else is set up by
178- pybm' s default configuration, so we do not need to tweak more options and spend
179- more time to get up and running.
166+ The test file is very simple: We import our function `my_sum`, sum up all numbers from 1
167+ to 10000, and run the benchmark when executing the module as `__main__`. Everything else
168+ is set up by pybm's default configuration, so we do not need to tweak more options and
169+ spend more time to get up and running.
180170
181- NOTE: The above benchmark file is the same on all three branches, and there is a
182- good reason for it! When comparing the different implementations, we do need the
171+ NOTE: The above benchmark file is the same on all three branches, and there is a good
172+ reason for it! When comparing the different implementations, we do need the
183173benchmarking _procedure_ itself to stay the same to yield comparable results.
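
The same-procedure principle can be illustrated outside of pybm with the standard library's `timeit` module. The sketch below is not how pybm measures internally; it only shows three candidates being timed with identical input and repetition settings. All function names are hypothetical, and the quadratic variant is an assumed reconstruction of the original algorithm:

```python
import timeit


def sum_quadratic(n: int) -> int:
    # Assumed shape of the original algorithm: add 1, i times, for each i.
    result = 0
    for i in range(1, n + 1):
        for _ in range(i):
            result += 1
    return result


def sum_linear(n: int) -> int:
    # Add the numbers themselves in a single loop.
    result = 0
    for i in range(1, n + 1):
        result += i
    return result


def sum_constant(n: int) -> int:
    # Gauss's closed form.
    return n * (n + 1) // 2


N = 300  # kept small so the quadratic variant finishes quickly
candidates = {
    "quadratic": sum_quadratic,
    "linear": sum_linear,
    "constant": sum_constant,
}

timings = {}
for name, func in candidates.items():
    # Every candidate must agree on the result before it is timed.
    assert func(N) == N * (N + 1) // 2
    # Same input and same repetition settings for every candidate.
    timings[name] = min(timeit.repeat(lambda: func(N), number=20, repeat=3))

for name, seconds in timings.items():
    print(f"{name:>9}: {seconds:.6f} s for 20 calls")
```

Taking the minimum over several repetitions is a common way to reduce noise from other processes; the absolute numbers depend on the machine and are not comparable to the pybm output in this walkthrough.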
184174
185175```shell
186176# Tells pybm to run the benchmarks in the benchmarks directory in all environments.
187- pybm run benchmarks/ --all
177+ pybm run benchmarks --all
188178
189179Starting benchmarking run in environment 'root'.
190180Discovering benchmark targets in environment 'root'.....done.
@@ -204,17 +194,17 @@ Finished benchmarking run in environment 'env_3'.
204194Finished benchmarking in all specified environments.
205195```
206196
207- And there we have it! Instead of the manual rinse-and-repeat in a checkout
208- branch->benchmark->save-results kind of workflow, we obtained all the results we
209- need in one single command. Very nice!
197+ And there we have it! Instead of the manual rinse-and-repeat in a checkout
198+ branch->benchmark->save-results kind of workflow, we obtained all the results we need in one
199+ single command. Very nice!
210200
211201## And finally... the numbers
212202
213- Lastly, we need to check how big our improvements actually are (or rather,
214- if we have achieved any in the first place!). This is handled by the
215- `pybm compare` command, which compares all measured results to a "frame of
216- reference" branch, which is taken to be the baseline for performance
217- comparisons. In our case, that is our fictitious math library's current
203+ Lastly, we need to check how big our improvements actually are (or rather, if we have
204+ achieved any in the first place!). This is handled by the
205+ `pybm compare` command, which compares all measured results to a "frame of reference"
206+ branch, which is taken to be the baseline for performance comparisons. In our case, that
207+ is our fictitious math library's current
218208`master`.
219209
220210```shell
@@ -227,13 +217,13 @@ pybm compare latest master linear-time constant-time
227217 benchmarks/sum.py:f | constant-time | 0.13 | 0.12 | -100.00% | 10759575.02x | 2000000
228218```
229219
230- And look here, instead of 10x-ing our previous algorithm like a normal engineer,
231- we actually... 10-million-x-ed it. Great work! Our constant time algorithm is
232- definitely ready for a pull request :-)
220+ And look here, instead of 10x-ing our previous algorithm like a normal engineer, we
221+ actually... 10-million-x-ed it. Great work! Our constant time algorithm is definitely
222+ ready for a pull request :-)
233223
234- These are of course video game numbers, obtained by algorithmic improvements.
235- More common real-world examples would see improvements in the one-to-three digit
236- percentage range, but the example you see above does happen from time to time.
224+ These are of course video game numbers, obtained by algorithmic improvements. More
225+ common real-world examples would see improvements in the one-to-three digit percentage
226+ range, but the example you see above does happen from time to time.
237227
238- And with that, the first pybm tutorial is finished. I hope you enjoyed it, and
239- catch you on the next one!
228+ And with that, the first pybm tutorial is finished. I hope you enjoyed it, and catch you
229+ on the next one!