
Commit c259bb8

Merge pull request #230 from wmvanvliet/numpy-advanced-tweaks
Some tweaks to the Advanced Numpy lesson
2 parents 555dd8c + 3ed3958 commit c259bb8

1 file changed: +39 −27 lines


content/numpy-advanced.rst

Lines changed: 39 additions & 27 deletions
@@ -82,19 +82,19 @@ Exercise 1
 
 .. highlight:: python
 
-The library behind the curtain: BLAS
-------------------------------------
+The libraries behind the curtain: MKL and BLAS
+----------------------------------------------
 
-NumPy is fast because it outsources most of its heavy lifting to
+NumPy is fast because it outsources most of its heavy lifting to heavily
+optimized math libraries, such as Intel's `Math Kernel Library (MKL) <https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/api-based-programming/intel-oneapi-math-kernel-library-onemkl.html>`_,
+which are in turn derived from a Fortran library called
 `Basic Linear Algebra Subprograms (BLAS) <https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms>`_.
 BLAS for Fortran was `published in 1979 <https://doi.org/10.1145/355841.355847>`_
 and is a collection of algorithms for common mathematical operations that are
-performed on arrays of numbers. Algorithms such as element-wise sum, matrix
-multiplication, computing the vector length, etc.
-
-The API of that software library was later standardized, and today there are
-many modern implementations available. These libraries represent over 40 years
-of optimizing efforts and make use of
+performed on arrays of numbers. Algorithms such as matrix multiplication,
+computing the vector length, etc. The API of the BLAS library was later
+standardized, and today there are many modern implementations available. These
+libraries represent over 40 years of optimizing efforts and make use of
 `specialized CPU instructions for manipulating arrays <https://www.youtube.com/watch?v=Pc8DfEyAxzg&list=PLzLzYGEbdY5lrUYSssHfk5ahwZERojgid>`_.
 In other words, they are *fast*.
 
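The hunk above credits MKL and BLAS for NumPy's speed. Which BLAS implementation a given NumPy build actually links against varies by installation; one standard way to check is ``numpy.show_config()``. A small sketch (the output shown depends entirely on how NumPy was installed):

```python
import numpy as np

# Print which BLAS/LAPACK libraries this NumPy build was linked
# against (MKL, OpenBLAS, ...). The exact output depends on the
# installation, so none is shown here.
np.show_config()
```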
@@ -136,6 +136,25 @@ NumPy copies data are not trivial and it is worth your while to take a closer
 look at them. This involves developing an understanding of how NumPy's
 :class:`numpy.ndarray` datastructure works behind the scenes.
 
+
+An example: matrix transpose
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Transposing a matrix means that all rows become columns and all columns become
+rows. All off-diagonal values change places. Let's see how long NumPy's
+transpose function takes, by transposing a huge (10 000 ✕ 20 000) matrix::
+
+    import numpy as np
+    a = np.random.rand(10_000, 20_000)
+    print(f'Matrix `a` takes up {a.nbytes / 10**6} MB')
+
+Let's time the :func:`numpy.transpose` function::
+
+    %%timeit
+    b = a.transpose()
+
+It takes mere nanoseconds to transpose 1600 MB of data! How?
+
+
 The ndarray exposed
 ~~~~~~~~~~~~~~~~~~~
 The first thing you need to know about :class:`numpy.ndarray` is that the
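A hint at the answer to the "How?" above: the transposed array is not a fresh copy but a view onto the same memory. A small sketch to verify this, using a smaller matrix and the standard ``numpy.shares_memory`` and ``.base`` attributes:

```python
import numpy as np

a = np.random.rand(1_000, 2_000)
b = a.transpose()

# The transpose is a view: both arrays use the same memory buffer,
# which is why no time is spent copying 8 bytes per element.
print(b.base is a)             # True
print(np.shares_memory(a, b))  # True
```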
@@ -159,15 +178,19 @@ Exercise 2
 
 .. challenge:: Exercises: Numpy-Advanced-2
 
-    Write a function called ``ravel()`` that takes as input:
+    Write a function called ``ravel()`` that takes the row and column of an
+    element in a 2D matrix and produces the appropriate index in a 1D array,
+    where all the rows are concatenated. See the image above to remind yourself
+    how each row of the 2D matrix ends up in the 1D array.
+
+    The function takes these inputs:
 
     - ``row`` The row of the requested element in the matrix as integer index.
     - ``col`` The column of the requested element in the matrix as integer index.
     - ``n_rows`` The total number of rows of the matrix.
     - ``n_cols`` The total number of columns of the matrix.
 
-    And produces as output the appropriate index in the 1D array. Use the image above as a
-    guide. Here are some examples of input and desired output:
+    Here are some examples of input and desired output:
 
     - ``ravel(2, 3, n_rows=4, n_cols=4)`` → ``11``
     - ``ravel(2, 3, n_rows=4, n_cols=8)`` → ``19``
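For reference, one possible row-major solution to the exercise above. Note that ``n_rows`` goes unused in row-major layout; it is kept here only to match the requested signature:

```python
def ravel(row, col, n_rows, n_cols):
    """Map a (row, col) position in an n_rows x n_cols matrix to the
    index in the flattened, row-major 1D array."""
    # Skip `row` full rows of `n_cols` elements, then `col` more.
    return row * n_cols + col

print(ravel(2, 3, n_rows=4, n_cols=4))  # 11
print(ravel(2, 3, n_rows=4, n_cols=8))  # 19
```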
@@ -201,25 +224,14 @@ double-precision floating point numbers. Each one of those bad boys takes up 8
 bytes, so all the indices are multiplied by 8 to get to the proper byte in the
 memory array. To move to the next column in the matrix, we skip ahead 8 bytes.
 
-
-An example: matrix transpose
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Transposing a matrix means that all rows become columns and all columns become
-rows. All off-diagonal values change places. Let's see how long NumPy's
-transpose function takes, by transposing a huge (10 000 ✕ 20 000) matrix::
+So now we know the mystery behind the speed of ``transpose()``. NumPy can avoid
+copying any data by just modifying the ``.strides`` of the array::
 
     import numpy as np
-    a = rng.rand(10_000, 20_000)
-    print(f'Matrix `a` takes up {a.nbytes / 10**6} MB')
-
-Let's time the :func:`numpy.transpose` function::
 
-    %%timeit
+    a = np.random.rand(10_000, 20_000)
     b = a.transpose()
 
-It takes mere nanoseconds to transpose 1600 MB of data! NumPy avoided copying
-any data by *only* modifying the ``.strides`` of the existing array in-place::
-
     print(a.strides) # (160000, 8)
     print(b.strides) # (8, 160000)
 
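The stride arithmetic described above (8 bytes per column, row length × 8 bytes per row) can be checked directly: the byte offset of element ``(i, j)`` is ``i * strides[0] + j * strides[1]``. A small sketch on a matrix small enough to inspect:

```python
import numpy as np

a = np.random.rand(10, 20)
print(a.strides)  # (160, 8): 20 columns * 8 bytes per row, 8 bytes per column

# The byte offset of element (i, j) follows from the strides:
i, j = 3, 7
offset = i * a.strides[0] + j * a.strides[1]

# Dividing by the 8-byte item size gives the index into the flat,
# row-major view of the same memory:
flat = a.reshape(-1)
print(flat[offset // a.itemsize] == a[i, j])  # True
```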
@@ -228,7 +240,7 @@ Another example: reshaping
 Modifying the shape of an array through :func:`numpy.reshape` is also
 accomplished without any copying of data by modifying the ``.strides``::
 
-    a = rng.rand(20_000, 10_000)
+    a = np.random.rand(20_000, 10_000)
     print(f'{a.strides=}') # (80000, 8)
     b = a.reshape(40_000, 5_000)
     print(f'{b.strides=}') # (40000, 8)
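A caveat worth knowing: reshaping by adjusting strides only works when the memory layout allows it. When it does not, for example after a transpose, ``reshape`` silently falls back to copying the data. A sketch using ``numpy.shares_memory`` to tell views from copies:

```python
import numpy as np

a = np.random.rand(200, 100)
b = a.reshape(400, 50)
print(np.shares_memory(a, b))  # True: a stride-only view, no copy

# After a transpose the rows are no longer contiguous in memory, so
# no choice of strides can describe the reshaped result; NumPy must
# copy the data instead:
c = a.T.reshape(400, 50)
print(np.shares_memory(a, c))  # False: a fresh copy
```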
