content/numpy-advanced.rst

.. highlight:: python

The libraries behind the curtain: MKL and BLAS
----------------------------------------------

NumPy is fast because it outsources most of its heavy lifting to heavily
optimized math libraries, such as Intel's `Math Kernel Library (MKL) <https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/api-based-programming/intel-oneapi-math-kernel-library-onemkl.html>`_,
which are in turn derived from a Fortran library called
`Basic Linear Algebra Subprograms (BLAS) <https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms>`_.
BLAS for Fortran was `published in 1979 <https://doi.org/10.1145/355841.355847>`_
and is a collection of algorithms for common mathematical operations performed
on arrays of numbers, such as matrix multiplication and computing the vector
length. The API of the BLAS library was later standardized, and today there
are many modern implementations available. These libraries represent over 40
years of optimizing efforts and make use of
`specialized CPU instructions for manipulating arrays <https://www.youtube.com/watch?v=Pc8DfEyAxzg&list=PLzLzYGEbdY5lrUYSssHfk5ahwZERojgid>`_.
In other words, they are *fast*.
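
Which of these libraries your own NumPy build delegates to depends on how it
was installed. One quick way to check (a minimal sketch; the exact output
format varies between NumPy versions and installations) is to print NumPy's
build configuration::

    import numpy as np

    # Lists the BLAS/LAPACK libraries this NumPy build was linked against,
    # e.g. MKL or OpenBLAS.
    np.show_config()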

The circumstances under which NumPy copies data are not trivial and it is
worth your while to take a closer look at them. This involves developing an
understanding of how NumPy's :class:`numpy.ndarray` data structure works
behind the scenes.

An example: matrix transpose
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Transposing a matrix means that all rows become columns and all columns become
rows. All off-diagonal values change places. Let's see how long NumPy's
transpose function takes, by transposing a huge (10 000 ✕ 20 000) matrix::

    import numpy as np
    a = np.random.rand(10_000, 20_000)
    print(f'Matrix `a` takes up {a.nbytes / 10**6} MB')

Let's time the :func:`numpy.transpose` function::

    %%timeit
    b = a.transpose()
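
Note that ``%%timeit`` is an IPython/Jupyter magic. Outside a notebook, a
rough equivalent (a sketch using only the standard library) would be::

    import timeit

    # Average over many runs; a single transpose is too fast to measure
    # reliably on its own.
    print(timeit.timeit(lambda: a.transpose(), number=10_000) / 10_000)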

It takes mere nanoseconds to transpose 1600 MB of data! How?

The ndarray exposed
~~~~~~~~~~~~~~~~~~~
The first thing you need to know about :class:`numpy.ndarray` is that the

Exercise 2
~~~~~~~~~~

.. challenge:: Exercises: Numpy-Advanced-2

   Write a function called ``ravel()`` that takes the row and column of an
   element in a 2D matrix and produces the appropriate index in a 1D array,
   where all the rows are concatenated. See the image above to remind yourself
   how each row of the 2D matrix ends up in the 1D array.

   The function takes these inputs:

   - ``row`` The row of the requested element in the matrix as integer index.
   - ``col`` The column of the requested element in the matrix as integer index.
   - ``n_rows`` The total number of rows of the matrix.
   - ``n_cols`` The total number of columns of the matrix.

   Here are some examples of input and desired output:

   - ``ravel(2, 3, n_rows=4, n_cols=4)`` → ``11``
   - ``ravel(2, 3, n_rows=4, n_cols=8)`` → ``19``
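
If you want to check your answer after attempting the exercise: in row-major
("C") order, element ``(row, col)`` lands at offset ``row * n_cols + col``.
One possible solution (a sketch; the ``assert`` bounds check is an optional
extra and the only use for ``n_rows``)::

    def ravel(row, col, n_rows, n_cols):
        """Return the index of (row, col) in the concatenated 1D array."""
        assert 0 <= row < n_rows and 0 <= col < n_cols
        return row * n_cols + col

    print(ravel(2, 3, n_rows=4, n_cols=4))  # 11
    print(ravel(2, 3, n_rows=4, n_cols=8))  # 19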

The matrix consists of double-precision floating point numbers. Each one of
those bad boys takes up 8 bytes, so all the indices are multiplied by 8 to
get to the proper byte in the memory array. To move to the next column in the
matrix, we skip ahead 8 bytes.

So now we know the mystery behind the speed of ``transpose()``. NumPy can
avoid copying any data by just modifying the ``.strides`` of the array::

    import numpy as np

    a = np.random.rand(10_000, 20_000)
    b = a.transpose()

    print(a.strides)  # (160000, 8)
    print(b.strides)  # (8, 160000)
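
To convince yourself that ``a`` and ``b`` really are two views onto the same
memory, one possible check (a sketch; :func:`numpy.shares_memory` is part of
the public NumPy API) is::

    print(np.shares_memory(a, b))  # True: the transpose copied nothing

    # Writing through the transposed view is visible in the original array.
    b[0, 1] = -1.0
    print(a[1, 0])  # -1.0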

Another example: reshaping
~~~~~~~~~~~~~~~~~~~~~~~~~~

Modifying the shape of an array through :func:`numpy.reshape` is also
accomplished without any copying of data by modifying the ``.strides``::
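
    # A sketch of what such an example could look like (the original code
    # block is not shown in this excerpt). Reshaping a contiguous array
    # re-interprets the same buffer: only the shape and strides change.
    c = a.reshape(20_000, 10_000)

    print(a.strides)  # (160000, 8)
    print(c.strides)  # (80000, 8)
    print(np.shares_memory(a, c))  # True: no data was copied

Only when the requested shape cannot be expressed as strides over the
existing buffer, for example when reshaping a transposed array, does
:func:`numpy.reshape` fall back to copying the data.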