Commit d18ba1c

Author: Anastasiia Shcherbakova

Fix spelling mistakes and changed output/description order in vectorisation

1 parent 010a273 commit d18ba1c

File tree

1 file changed: +16 −16 lines changed

episodes/optimisation-minimise-python.md

Lines changed: 16 additions & 16 deletions
@@ -20,15 +20,15 @@ exercises: 0

 ::::::::::::::::::::::::::::::::::::::::::::::::

-Python is an interpreted programming language. When you execute your `.py` file, the (default) CPython back-end compiles your Python source code to an intermediate bytecode. This bytecode is then interpreted in software at runtime generating instructions for the processor as necessary. This interpretation stage, and other features of the language, harm the performance of Python (whilst improving it's usability).<!-- https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow/ -->
+Python is an interpreted programming language. When you execute your `.py` file, the (default) CPython back-end compiles your Python source code to an intermediate bytecode. This bytecode is then interpreted in software at runtime generating instructions for the processor as necessary. This interpretation stage, and other features of the language, harm the performance of Python (whilst improving its usability).<!-- https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow/ -->

 In comparison, many languages such as C/C++ compile directly to machine code. This allows the compiler to perform low-level optimisations that better exploit hardware nuance to achieve fast performance. This however comes at the cost of compiled software not being cross-platform.

 Whilst Python will rarely be as fast as compiled languages like C/C++, it is possible to take advantage of the CPython back-end and packages such as NumPy and Pandas that have been written in compiled languages to expose this performance.

 A simple example of this would be to perform a linear search of a list (in the previous episode we did say this is not recommended).
 The below example creates a list of 2500 integers in the inclusive-exclusive range `[0, 5000)`.
-It then searches for all of the even numbers in that range.
+It then searches for all the even numbers in that range.
 `searchlistPython()` is implemented manually, iterating `ls` checking each individual item in Python code.
 `searchListC()` in contrast uses the `in` operator to perform each search, which allows CPython to implement the inner loop in it's C back-end.
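The episode's full `searchlistPython()`/`searchListC()` implementations are not shown in this diff; a minimal sketch of the two approaches it compares might look like this (the bodies here are assumptions, not the lesson's exact code):

```python
import random

def searchlistPython(ls, values):
    # Manual linear search: the inner loop runs as interpreted Python bytecode
    found = []
    for v in values:
        for item in ls:
            if item == v:
                found.append(v)
                break
    return found

def searchListC(ls, values):
    # The `in` operator lets CPython run the inner loop in its C back-end
    return [v for v in values if v in ls]

random.seed(42)  # seeded only so the sketch is reproducible
ls = random.sample(range(5000), 2500)
evens = range(0, 5000, 2)
assert searchlistPython(ls, evens) == searchListC(ls, evens)
```

Both functions return the same matches; the `in`-based version is typically much faster because the per-item comparison loop never re-enters the interpreter.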

@@ -281,7 +281,7 @@ In particular, those which are passed an `iterable` (e.g. lists) are likely to p

 ::::::::::::::::::::::::::::::::::::: callout

-The built-in functions [`filter()`](https://docs.python.org/3/library/functions.html#filter) and [`map()`](https://docs.python.org/3/library/functions.html#map) can be used for processing iterables However list-comprehension is likely to be more performant.
+The built-in functions [`filter()`](https://docs.python.org/3/library/functions.html#filter) and [`map()`](https://docs.python.org/3/library/functions.html#map) can be used for processing iterables. However, list-comprehension is likely to be more performant.

 <!-- Would this benefit from an example? -->
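The hunk's own comment asks whether an example would help; a short sketch of the equivalence being described:

```python
nums = list(range(10))

# filter() and map() return lazy iterators; wrap in list() to realise them
squares_fm = list(map(lambda x: x * x, filter(lambda x: x % 2 == 0, nums)))

# The equivalent list comprehension avoids a Python function call per element
squares_lc = [x * x for x in nums if x % 2 == 0]

assert squares_fm == squares_lc == [0, 4, 16, 36, 64]
```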

@@ -292,11 +292,11 @@ The built-in functions [`filter()`](https://docs.python.org/3/library/functions.

 [NumPy](https://numpy.org/) is a commonly used package for scientific computing, which provides a wide variety of methods.

-It adds restriction via it's own [basic numeric types](https://numpy.org/doc/stable/user/basics.types.html), and static arrays to enable even greater performance than that of core Python. However if these restrictions are ignored, the performance can become significantly worse.
+It adds restriction via its own [basic numeric types](https://numpy.org/doc/stable/user/basics.types.html), and static arrays to enable even greater performance than that of core Python. However, if these restrictions are ignored, the performance can become significantly worse.

 ### Arrays

-NumPy's arrays (not to be confused with the core Python `array` package) are static arrays. Unlike core Python's lists, they do not dynamically resize. Therefore if you wish to append to a NumPy array, you must call `resize()` first. If you treat this like `append()` for a Python list, resizing for each individual append you will be performing significantly more copies and memory allocations than a Python list.
+NumPy's arrays (not to be confused with the core Python `array` package) are static arrays. Unlike core Python's lists, they do not dynamically resize. Therefore, if you wish to append to a NumPy array, you must call `resize()` first. If you treat this like `append()` for a Python list, resizing for each individual append you will be performing significantly more copies and memory allocations than a Python list.

 The below example sees lists and arrays constructed from `range(100000)`.
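The anti-pattern the hunk warns about can be sketched as follows (a minimal illustration, not the lesson's benchmark):

```python
import numpy as np

# Growing a Python list: appends are amortised O(1)
ls = []
for i in range(1000):
    ls.append(i)

# Naively "appending" to a NumPy array reallocates and copies on every call
ar = np.zeros(0, dtype=np.int64)
for i in range(1000):
    ar = np.append(ar, i)  # whole-array copy each iteration -- avoid this

# Preferred: size the array once up front
ar_once = np.arange(1000)
assert list(ar) == ls == list(ar_once)
```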

@@ -390,7 +390,7 @@ There is however a trade-off, using `numpy.random.choice()` can be clearer to so

 ### Vectorisation

-The manner by which NumPy stores data in arrays enables it's functions to utilise vectorisation, whereby the processor executes one instruction across multiple variables simultaneously, for every mathematical operation between arrays.
+The manner by which NumPy stores data in arrays enables its functions to utilise vectorisation, whereby the processor executes one instruction across multiple variables simultaneously, for every mathematical operation between arrays.

 Earlier in this episode it was demonstrated that using core Python methods over a list, will outperform a loop performing the same calculation faster. The below example takes this a step further by demonstrating the calculation of dot product.
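The vectorisation described here means a single array expression replaces an explicit Python loop; a minimal illustration:

```python
import numpy as np

a = np.arange(5)
b = np.arange(5)

# One vectorised expression: NumPy performs the multiply element-wise in C,
# where the processor can apply one instruction across multiple elements
c = a * b

# Equivalent explicit Python loop, one interpreted iteration per element
assert list(c) == [x * y for x, y in zip(a, b)] == [0, 1, 4, 9, 16]
```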

@@ -416,18 +416,18 @@ print(f"numpy_sum_array: {timeit(np_sum_ar, setup=gen_array, number=repeats):.2f
 print(f"numpy_dot_array: {timeit(np_dot_ar, setup=gen_array, number=repeats):.2f}ms")
 ```

-* `python_sum_list` uses list comprehension to perform the multiplication, followed by the Python core `sum()`. This comes out at 46.93ms
-* `python_sum_array` instead directly multiplies the two arrays, taking advantage of NumPy's vectorisation. But uses the core Python `sum()`, this comes in slightly faster at 33.26ms.
-* `numpy_sum_array` again takes advantage of NumPy's vectorisation for the multiplication, and additionally uses NumPy's `sum()` implementation. These two rounds of vectorisation provide a much faster 1.44ms completion.
-* `numpy_dot_array` instead uses NumPy's `dot()` to calculate the dot product in a single operation. This comes out the fastest at 0.29ms, 162x faster than `python_sum_list`.
-
 ```output
 python_sum_list: 46.93ms
 python_sum_array: 33.26ms
 numpy_sum_array: 1.44ms
 numpy_dot_array: 0.29ms
 ```

+* `python_sum_list` uses list comprehension to perform the multiplication, followed by the Python core `sum()`. This comes out at 46.93ms
+* `python_sum_array` instead directly multiplies the two arrays, taking advantage of NumPy's vectorisation. But uses the core Python `sum()`, this comes in slightly faster at 33.26ms.
+* `numpy_sum_array` again takes advantage of NumPy's vectorisation for the multiplication, and additionally uses NumPy's `sum()` implementation. These two rounds of vectorisation provide a much faster 1.44ms completion.
+* `numpy_dot_array` instead uses NumPy's `dot()` to calculate the dot product in a single operation. This comes out the fastest at 0.29ms, 162x faster than `python_sum_list`.
+
 ::::::::::::::::::::::::::::::::::::: callout

 ## Parallel NumPy
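The bodies of the four timed functions are not shown in this diff; a hedged reconstruction consistent with their names and the bullet descriptions (the exact lesson code, sizes, and setup are assumptions) could be:

```python
import numpy as np

N = 1_000_000
ls_a, ls_b = list(range(N)), list(range(N))
ar_a, ar_b = np.arange(N), np.arange(N)

def python_sum_list():
    # List comprehension for the multiply, core Python sum()
    return sum([a * b for a, b in zip(ls_a, ls_b)])

def python_sum_array():
    # Vectorised multiply, but still core Python sum()
    return sum(ar_a * ar_b)

def numpy_sum_array():
    # Vectorised multiply and NumPy's sum()
    return np.sum(ar_a * ar_b)

def numpy_dot_array():
    # Dot product as a single NumPy operation
    return np.dot(ar_a, ar_b)

assert python_sum_list() == python_sum_array() == numpy_sum_array() == numpy_dot_array()
```

All four compute the same dot product; the performance gap comes from how much of the work escapes the interpreter.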
@@ -439,7 +439,7 @@ A small number of functions are backed by BLAS and LAPACK, enabling even greater

 The [supported functions](https://numpy.org/doc/stable/reference/routines.linalg.html) mostly correspond to linear algebra operations.

-The auto-parallelisation of these functions is hardware dependant, so you won't always automatically get the additional benefit of parallelisation.
+The auto-parallelisation of these functions is hardware-dependent, so you won't always automatically get the additional benefit of parallelisation.
 However, HPC systems should be primed to take advantage, so try increasing the number of cores you request when submitting your jobs and see if it improves the performance.

 *This might be why `numpy_dot_array` is that much faster than `numpy_sum_array` in the previous example!*
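Whether the BLAS-backed calls actually parallelise depends on which library NumPy was built against; a sketch of how you might inspect and constrain it (assuming an OpenBLAS- or MKL-backed build, which respect these environment variables):

```python
import os

# Thread-count variables must be set before NumPy is first imported
os.environ.setdefault("OMP_NUM_THREADS", "4")   # OpenBLAS / OpenMP builds
os.environ.setdefault("MKL_NUM_THREADS", "4")   # MKL builds

import numpy as np

# Reports the BLAS/LAPACK libraries this NumPy build links against
np.show_config()
```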
@@ -449,7 +449,7 @@ However, HPC systems should be primed to take advantage, so try increasing the n

 ### `vectorize()`

 Python's `map()` was introduced earlier, for applying a function to all elements within a list.
-NumPy provides `vectorize()` an equivalent for operating over it's arrays.
+NumPy provides `vectorize()` an equivalent for operating over its arrays.

 This doesn't actually make use of processor-level vectorisation, from the [documentation](https://numpy.org/doc/stable/reference/generated/numpy.vectorize.html):
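A small usage sketch of `numpy.vectorize()` (the scalar function here is a made-up example):

```python
import numpy as np

def clamp(x, low, high):
    # A plain scalar Python function
    return max(low, min(x, high))

# vectorize() broadcasts the scalar function over arrays; it is a
# convenience wrapper (essentially a Python-level loop), not SIMD
vclamp = np.vectorize(clamp)
result = vclamp(np.array([-5, 3, 12]), 0, 10)
assert list(result) == [0, 3, 10]
```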

@@ -497,7 +497,7 @@ Pandas' methods by default operate on columns. Each column or series can be thou

 Following the theme of this episode, iterating over the rows of a data frame using a `for` loop is not advised. The pythonic iteration will be slower than other approaches.

-Pandas allows it's own methods to be applied to rows in many cases by passing `axis=1`, where available these functions should be preferred over manual loops. Where you can't find a suitable method, `apply()` can be used, which is similar to `map()`/`vectorize()`, to apply your own function to rows.
+Pandas allows its own methods to be applied to rows in many cases by passing `axis=1`, where available these functions should be preferred over manual loops. Where you can't find a suitable method, `apply()` can be used, which is similar to `map()`/`vectorize()`, to apply your own function to rows.

 ```python
 from timeit import timeit
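The lesson's benchmark code is truncated in this diff; a minimal sketch of the two row-wise patterns the hunk describes, on a made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Prefer a built-in row-wise method where one exists (axis=1 means per-row)
col_sum = df.sum(axis=1)

# Fall back to apply() for custom per-row logic
custom = df.apply(lambda row: row["a"] * row["b"], axis=1)

assert list(col_sum) == [11, 22, 33]
assert list(custom) == [10, 40, 90]
```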
@@ -571,7 +571,7 @@ vectorize: 1.48ms

 It won't always be possible to take full advantage of vectorisation, for example you may have conditional logic.

-An alternate approach is converting your dataframe to a Python dictionary using `to_dict(orient='index')`. This creates a nested dictionary, where each row of the outer dictionary is an internal dictionary. This can then be processed via list-comprehension:
+An alternate approach is converting your DataFrame to a Python dictionary using `to_dict(orient='index')`. This creates a nested dictionary, where each row of the outer dictionary is an internal dictionary. This can then be processed via list-comprehension:

 ```python
 def to_dict():
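The `to_dict()` function body is cut off in this diff view; a minimal sketch of the pattern it names (the DataFrame and per-row conditional here are placeholders):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Convert once to a nested dict: {row_index: {column: value}}
rows = df.to_dict(orient="index")

# Conditional per-row logic via list comprehension over plain dicts
result = [row["a"] + row["b"] if row["a"] % 2 else row["b"]
          for row in rows.values()]
assert result == [11, 20, 33]
```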
@@ -588,7 +588,7 @@ Whilst still nearly 100x slower than pure vectorisation, it's twice as fast as `
 to_dict: 131.15ms
 ```

-This is because indexing into Pandas' `Series` (rows) is significantly slower than a Python dictionary. There is a slight overhead to creating the dictionary (40ms in this example), however the stark difference in access speed is more than enough to overcome that cost for any large dataframe.
+This is because indexing into Pandas' `Series` (rows) is significantly slower than a Python dictionary. There is a slight overhead to creating the dictionary (40ms in this example), however the stark difference in access speed is more than enough to overcome that cost for any large DataFrame.

 ```python
 from timeit import timeit
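The access-speed difference the hunk explains can be measured directly; a small sketch (data and repetition count are illustrative, timings will vary by machine):

```python
from timeit import timeit

import pandas as pd

df = pd.DataFrame({"a": range(1000)})
series_row = df.iloc[0]                    # one row as a pandas Series
dict_row = df.to_dict(orient="index")[0]   # the same row as a plain dict

# Same lookup, two containers: Series indexing goes through pandas'
# index machinery, dict indexing is a single hash lookup
t_series = timeit(lambda: series_row["a"], number=100_000)
t_dict = timeit(lambda: dict_row["a"], number=100_000)
print(f"Series: {t_series * 1000:.2f}ms, dict: {t_dict * 1000:.2f}ms")
```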
