Commit d1484ac

Added blue noise section and code
1 parent dc29c88 commit d1484ac

File tree

9 files changed: 419 additions & 84 deletions


01-preface.rst

Lines changed: 10 additions & 9 deletions
@@ -91,21 +91,22 @@ How to contribute

 If you want to contribute to this book, you can:

 * Review chapters (please contact me)
-* Report issues (https://github.com/rougier/numpy-book/issues)
-* Suggest improvements (https://github.com/rougier/numpy-book/pulls)
-* Correct English (https://github.com/rougier/numpy-book/issues)
+* Report issues (https://github.com/rougier/from-python-to-numpy/issues)
+* Suggest improvements (https://github.com/rougier/from-python-to-numpy/pulls)
+* Correct English (https://github.com/rougier/from-python-to-numpy/issues)
 * Design a better and more responsive html template for the book.
-
+* Star the project (https://github.com/rougier/from-python-to-numpy)

 Publishing
 ++++++++++

 If you're an editor interested in publishing this book, you can contact me if
-you agree to have this open access version online, you know how to deal with
-`restructured text <http://docutils.sourceforge.net/rst.html>`_ (Word is not an
-option), you provide a real added-value as well as supporting services, and
-more importantly, you have a truly amazing latex book template (and be warned
-that I'm a bit picky about typography & design: E.Tufte is my hero).
+you agree to have this version and all subsequent versions open access
+(i.e. online), you know how to deal with `restructured text
+<http://docutils.sourceforge.net/rst.html>`_ (Word is not an option), you
+provide real added value as well as supporting services, and more importantly,
+you have a truly amazing LaTeX book template (and be warned that I'm a bit
+picky about typography & design: E. Tufte is my hero).

 Still here?

05-problem-vectorization.rst

Lines changed: 129 additions & 10 deletions
@@ -333,6 +333,13 @@ illustrated below (reading from left to right, top to bottom). Once this is
 done, we can ascend the gradient from the starting node. You can check on the
 figure that this leads to the shortest path.

+.. admonition:: **Figure**
+   :class: legend
+
+   Value iteration algorithm on a simple maze. Once the entrance has been
+   reached, it is easy to find the shortest path by ascending the value
+   gradient.
+

 .. image:: data/value-iteration-1.pdf
    :width: 19%
@@ -561,13 +568,21 @@ References
 * `Animating Sand as a Fluid <https://www.cs.ubc.ca/%7Erbridson/docs/zhu-siggraph05-sandfluid.pdf>`_, Yongning Zhu & Robert Bridson, 2005.


-Blue noise
-----------
+Blue noise sampling
+-------------------
+
+Blue noise refers to sample sets that have a random yet uniform distribution
+with no spectral bias. Such noise is very useful in a variety of graphics
+applications such as rendering, dithering or stippling. Many different methods
+have been proposed to achieve such noise, the simplest of which is certainly
+the DART method.

 .. admonition:: **Figure 10**
    :class: legend

-   Detail of "The Starry Night", Vincent van Gogh, 1889.
+   Detail of "The Starry Night", Vincent van Gogh, 1889. The detail has been
+   resampled using Voronoi cells whose centers form a blue noise sample set.

 .. image:: data/mosaic.png
    :width: 100%
@@ -577,23 +592,117 @@ Blue noise
 DART method
 +++++++++++

-Numpy implementation
-++++++++++++++++++++
+The DART method is one of the earliest and simplest methods. It works by
+sequentially drawing uniform random points and only accepting those that lie
+at a minimum distance from every previously accepted sample. This sequential
+method is therefore extremely slow because each new candidate needs to be
+tested against all previously accepted candidates. The more points you
+accept, the slower the method becomes. Let's consider the unit surface and a
+minimum radius `r` to be enforced between any two points.
+
+Knowing that the densest packing of circles in the plane is the hexagonal
+lattice of the bee's honeycomb, we know this density is :math:`d =
+\frac{1}{6}\pi\sqrt{3}` (in fact `I learned it
+<https://en.wikipedia.org/wiki/Circle_packing>`_ while writing this book).
+Considering circles with radius r, we can pack at most :math:`\frac{d}{\pi r^2}
+= \frac{\sqrt{3}}{6r^2} = \frac{1}{2r^2\sqrt{3}}` of them. We know the
+theoretical upper limit for the number of discs we can pack onto the surface,
+but we'll likely not reach this upper limit because of random placements.
+Furthermore, because a lot of points will be rejected after a few have been
+accepted, we need to set a limit on the number of successive failed trials
+before we stop the whole process.

-.. admonition:: **Figure 11**
-   :class: legend

-   Comparison of uniform, grid-jittered and Poisson disc sampling.

+.. code:: python
+
+   import math
+   import random
+
+   def DART_sampling(width=1.0, height=1.0, r=0.025, k=100):
+       def distance(p0, p1):
+           dx, dy = p0[0]-p1[0], p0[1]-p1[1]
+           return math.hypot(dx, dy)
+
+       points = []
+       i = 0
+       last_success = 0
+       while True:
+           # Draw a candidate uniformly over the domain...
+           x = random.uniform(0, width)
+           y = random.uniform(0, height)
+           # ...and test it against every accepted point so far.
+           accept = True
+           for p in points:
+               if distance(p, (x, y)) < r:
+                   accept = False
+                   break
+           if accept is True:
+               points.append((x, y))
+               # Stop once an acceptance required more than k trials
+               # since the previous one.
+               if i-last_success > k:
+                   break
+               last_success = i
+           i += 1
+       return points
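As a quick sanity check, the number of accepted samples can be compared with
the theoretical limit derived above; a minimal sketch, assuming the
`DART_sampling` function defined just above (variable names are illustrative
only):

.. code:: python

   import math

   r = 0.025
   # Theoretical upper bound on the unit square: 1/(2*r**2*sqrt(3)),
   # i.e. roughly 462 discs for r = 0.025.
   bound = 1 / (2 * r * r * math.sqrt(3))
   points = DART_sampling(r=r, k=100)
   print(len(points), "samples accepted, theoretical maximum", int(bound))

Because of the random placement and the early-exit criterion, the accepted
count typically stays well below this bound, which is part of what motivates
the Bridson method below.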
+I leave the vectorization of the DART method as an exercise. The idea is to
+pre-compute enough uniform random samples as well as their paired distances,
+and to test for their sequential inclusion.
+
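One possible reading of this idea is sketched below; the function name
`DART_sampling_numpy`, the candidate count `n` and the use of scipy's `cdist`
are assumptions, and the DART-sampling-numpy.py file listed in the sources is
the actual solution:

.. code:: python

   import numpy as np
   from scipy.spatial.distance import cdist

   def DART_sampling_numpy(width=1.0, height=1.0, r=0.025, n=2000):
       # Pre-compute n uniform random candidates at once...
       P = np.random.uniform(0, 1, (n, 2)) * (width, height)
       # ...as well as all their paired distances.
       D = cdist(P, P)
       # Sequentially keep candidates lying at distance >= r from
       # every previously accepted one.
       accepted = []
       for i in range(n):
           if not accepted or (D[i, accepted] >= r).all():
               accepted.append(i)
       return P[accepted]

Note that the distance matrix is n×n, so memory bounds how many candidates can
be pre-computed in a single batch.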
+
+Bridson method
+++++++++++++++
+
+If the vectorization of the previous method poses no real difficulty, the
+speed improvement is not so good and the quality remains low and dependent on
+the `k` parameter: the higher it is, the better, since it basically governs
+how hard we try to insert a new sample. But when there is already a large
+number of accepted samples, only chance allows us to find a position where a
+new sample can be inserted. We could increase the `k` value, but this would
+make the method even slower without any guarantee in quality. It's time to
+think out of the box and, luckily enough, Robert Bridson did that for us and
+proposed a simple yet efficient method:
+
+**Step 0**. *Initialize an n-dimensional background grid for storing samples
+and accelerating spatial searches. We pick the cell size to be bounded by
+r/√n, so that each grid cell will contain at most one sample, and thus the
+grid can be implemented as a simple n-dimensional array of integers: the
+default −1 indicates no sample, a non-negative integer gives the index of the
+sample located in a cell.*
+
+**Step 1**. *Select the initial sample, x0, randomly chosen uniformly from the
+domain. Insert it into the background grid, and initialize the “active list”
+(an array of sample indices) with this index (zero).*
+
+**Step 2**. *While the active list is not empty, choose a random index from it
+(say i). Generate up to k points chosen uniformly from the spherical annulus
+between radius r and 2r around xi. For each point in turn, check if it is
+within distance r of existing samples (using the background grid to only test
+nearby samples). If a point is adequately far from existing samples, emit it
+as the next sample and add it to the active list. If after k attempts no such
+point is found, instead remove i from the active list.*
+
+Implementation poses no real problem and is left as an exercise for the
+reader. Note that not only is this method fast, it also offers a better
+quality (more samples) than the DART method, even with a high `k` parameter.
+
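A minimal, non-vectorized 2-D sketch of these three steps could look as
follows; the function name and parameter defaults are assumptions, and sample
coordinates are stored directly in the grid cells rather than integer indices
(the Bridson-sampling.py file listed in the sources is the actual solution):

.. code:: python

   import math
   import random

   def Bridson_sampling(width=1.0, height=1.0, r=0.025, k=30):
       # Step 0: background grid with cell size r/sqrt(2) so that each
       # cell holds at most one sample (None means "no sample yet").
       cellsize = r / math.sqrt(2)
       cols = int(math.ceil(width / cellsize))
       rows = int(math.ceil(height / cellsize))
       grid = [[None] * cols for _ in range(rows)]

       def grid_coords(p):
           return (min(int(p[1] / cellsize), rows - 1),
                   min(int(p[0] / cellsize), cols - 1))

       def fits(p):
           # Test p only against samples stored in nearby cells.
           row, col = grid_coords(p)
           for i in range(max(row - 2, 0), min(row + 3, rows)):
               for j in range(max(col - 2, 0), min(col + 3, cols)):
                   q = grid[i][j]
                   if q is not None and math.hypot(p[0]-q[0], p[1]-q[1]) < r:
                       return False
           return True

       # Step 1: initial sample, chosen uniformly from the domain.
       p0 = (random.uniform(0, width), random.uniform(0, height))
       samples = [p0]
       active = [0]
       row, col = grid_coords(p0)
       grid[row][col] = p0

       # Step 2: grow the sample set from the active list.
       while active:
           index = random.choice(active)
           base = samples[index]
           for _ in range(k):
               # Candidate in the annulus between r and 2r (radius-uniform draw).
               theta = random.uniform(0, 2 * math.pi)
               rho = random.uniform(r, 2 * r)
               p = (base[0] + rho * math.cos(theta),
                    base[1] + rho * math.sin(theta))
               if 0 <= p[0] < width and 0 <= p[1] < height and fits(p):
                   samples.append(p)
                   active.append(len(samples) - 1)
                   row, col = grid_coords(p)
                   grid[row][col] = p
                   break
           else:
               # k failed attempts: retire this sample from the active list.
               active.remove(index)
       return samples

For the same radius, this typically yields noticeably more samples than the
DART version, in line with the quality remark above.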
+.. admonition:: **Figure**
+   :class: legend

+   Comparison of uniform, grid-jittered and Bridson sampling.

 .. image:: data/sampling.png
    :width: 100%

+
+
 Sources
 +++++++

-* `sampling.py <code/sampling.py>`_
+* `DART-sampling-python.py <code/DART-sampling-python.py>`_
+* `DART-sampling-numpy.py <code/DART-sampling-numpy.py>`_ (solution to the exercise)
+* `Bridson-sampling.py <code/Bridson-sampling.py>`_ (solution to the exercise)
+* `sampling.py <code/sampling.py>`_
 * `mosaic.py <code/mosaic.py>`_
 * `voronoi.py <code/voronoi.py>`_

@@ -606,9 +715,19 @@ References
   Jose Esteve, 2012.
 * `Poisson Disk Sampling <http://devmag.org.za/2009/05/03/poisson-disk-sampling/>`_
   Herman Tulleken, 2009.
-* `Fast Poisson Disk Sampling in Arbitrary Dimensions <http://www.cs.ubc.ca/~rbridson/docs/bridson-siggraph07-poissondisk.pdf>`_
+* `Fast Poisson Disk Sampling in Arbitrary Dimensions <http://www.cs.ubc.ca/~rbridson/docs/bridson-siggraph07-poissondisk.pdf>`_,
   Robert Bridson, SIGGRAPH, 2007.


 Conclusion
 ----------
+
+The last example we've been studying is indeed a nice example where it is more
+important to vectorize the problem than to vectorize the code (and too early).
+In this specific case we were lucky enough to have the work done for us, but
+it won't always be the case, and the temptation might then be high to
+vectorize the first solution we've found. I hope you're now convinced that it
+is generally a good idea to look for alternative solutions once you've found
+one. You'll (almost) always improve speed by vectorizing your code, but in the
+process, you may miss huge improvements.

07-beyond-numpy.rst

Lines changed: 63 additions & 58 deletions
Original file line numberDiff line numberDiff line change
@@ -127,8 +127,8 @@ important to wonder if you need an actual instance of your result or if a
127127
simple generator might do the job.
128128

129129

130-
Friends of Numpy
131-
----------------
130+
Numpy & co
131+
----------
132132

133133
Beyond numpy, there are several other Python packages that are worth a look
134134
because they address similar yet different class of problems using different
@@ -306,72 +306,77 @@ and other massively parallel compute devices from Python.
306306
307307
308308
309-
..
310-
Friends of Scipy
311-
----------------
312-
313-
Here is a very short list of packages that are well-maintained, well tested and
314-
may simplify your scientific life (depending on your domain). There are of
315-
course many more and depending on your specific needs, chances are you do not
316-
have to program everything by yourself. But it is a good exercise if you have
317-
some spare time. For an extensive list, have a look at the `Awesome python list
318-
<https://awesome-python.com>`_.
319-
320-
scikit-learn
321-
++++++++++++
322-
323-
`scikit-learn <http://scikit-learn.org/stable/>`_ is a free software machine
324-
learning library for the Python programming language. It features various
325-
classification, regression and clustering algorithms including support vector
326-
machines, random forests, gradient boosting, k-means and DBSCAN, and is
327-
designed to interoperate with the Python numerical and scientific libraries
328-
NumPy and SciPy.
329-
330-
331-
scikit-image
332-
++++++++++++
333-
334-
`scikit-image <http://scikit-image.org>`_ is a Python package dedicated to
335-
image processing, and using natively NumPy arrays as image objects. This
336-
chapter describes how to use scikit-image on various image processing tasks,
337-
and insists on the link with other scientific Python modules such as NumPy and
338-
SciPy.
309+
Scipy & co
310+
----------
339311

340-
SympPy
341-
++++++
312+
If there are several additional packages for Numpy, there is a trillion
313+
additional packages for scipy. In fact, every domain of science probably has
314+
its own package and most of the examples we've been studying until now could
315+
have been solved in two or three calls to a method in the relevant package.
316+
But of course, it was not the goal an programming things yourself is generally
317+
a good exercise if you have some spare time. The biggest difficulty at this
318+
point is to find these relevant packages. Here is a very short list of packages
319+
that are well-maintained, well tested and may simplify your scientific life
320+
(depending on your domain). There are of course many more and depending on your
321+
specific needs, chances are you do not have to program everything by
322+
yourself. For an extensive list, have a look at the `Awesome python list
323+
<https://awesome-python.com>`_.
324+
325+
scikit-learn
326+
++++++++++++
327+
328+
`scikit-learn <http://scikit-learn.org/stable/>`_ is a free software machine
329+
learning library for the Python programming language. It features various
330+
classification, regression and clustering algorithms including support vector
331+
machines, random forests, gradient boosting, k-means and DBSCAN, and is
332+
designed to interoperate with the Python numerical and scientific libraries
333+
NumPy and SciPy.
334+
335+
336+
scikit-image
337+
++++++++++++
338+
339+
`scikit-image <http://scikit-image.org>`_ is a Python package dedicated to
340+
image processing, and using natively NumPy arrays as image objects. This
341+
chapter describes how to use scikit-image on various image processing tasks,
342+
and insists on the link with other scientific Python modules such as NumPy and
343+
SciPy.
344+
345+
SympPy
346+
++++++
342347

343-
`SymPy <http://www.sympy.org/en/index.html>`_ is a Python library for symbolic
344-
mathematics. It aims to become a full-featured computer algebra system (CAS)
345-
while keeping the code as simple as possible in order to be comprehensible and
346-
easily extensible. SymPy is written entirely in Python.
348+
`SymPy <http://www.sympy.org/en/index.html>`_ is a Python library for symbolic
349+
mathematics. It aims to become a full-featured computer algebra system (CAS)
350+
while keeping the code as simple as possible in order to be comprehensible and
351+
easily extensible. SymPy is written entirely in Python.
347352

348-
Astropy
349-
+++++++
353+
Astropy
354+
+++++++
350355

351-
The `Astropy <http://www.astropy.org>`_ project is a community effort to
352-
develop a single core package for astronomy in Python and foster
353-
interoperability between Python astronomy packages.
356+
The `Astropy <http://www.astropy.org>`_ project is a community effort to
357+
develop a single core package for astronomy in Python and foster
358+
interoperability between Python astronomy packages.
354359

355360

356-
Cartopy
357-
+++++++
361+
Cartopy
362+
+++++++
358363

359-
`Cartopy <http://scitools.org.uk/cartopy/>`_ is a Python package designed to
360-
make drawing maps for data analysis and visualisation as easy as
361-
possible. Cartopy makes use of the powerful PROJ.4, numpy and shapely libraries
362-
and has a simple and intuitive drawing interface to matplotlib for creating
363-
publication quality maps.
364+
`Cartopy <http://scitools.org.uk/cartopy/>`_ is a Python package designed to
365+
make drawing maps for data analysis and visualisation as easy as
366+
possible. Cartopy makes use of the powerful PROJ.4, numpy and shapely libraries
367+
and has a simple and intuitive drawing interface to matplotlib for creating
368+
publication quality maps.
364369

365370

366-
Brian
367-
+++++
371+
Brian
372+
+++++
368373

369-
`Brian <http://www.briansimulator.org>`_ is a free, open source simulator for
370-
spiking neural networks. It is written in the Python programming language and
371-
is available on almost all platforms. We believe that a simulator should not
372-
only save the time of processors, but also the time of scientists. Brian is
373-
therefore designed to be easy to learn and use, highly flexible and easily
374-
extensible.
374+
`Brian <http://www.briansimulator.org>`_ is a free, open source simulator for
375+
spiking neural networks. It is written in the Python programming language and
376+
is available on almost all platforms. We believe that a simulator should not
377+
only save the time of processors, but also the time of scientists. Brian is
378+
therefore designed to be easy to learn and use, highly flexible and easily
379+
extensible.
375380

376381

377382
Conclusion

08-conclusion.rst

Lines changed: 22 additions & 3 deletions
@@ -1,8 +1,27 @@
 Conclusion
 ===============================================================================

-|WIP|
+You've reached the end of this book. I hope you've learned something while
+reading it; I sure learned a lot writing it. Trying to explain something is
+generally a good exercise to test your knowledge of it. Of course, we've only
+scratched the surface of Numpy and there are many things left to discover.
+Have a look at the bibliography for books written by true experts, at the
+documentation written by the people making Numpy, and don't hesitate to ask
+your questions on the mailing lists, because the Numpy community is very
+friendly.

-.. contents:: **Contents**
-   :local:
+If there's a single message to retain from this book, it is "premature
+optimization is the root of all evil". We've seen that code vectorization can
+drastically improve your computation, by several orders of magnitude in some
+cases. Still, problem vectorization is generally much more powerful. If you
+apply code vectorization too early in your design process, you won't be able
+to think out of the box and you'll certainly miss some really powerful
+alternatives, because you won't be able to identify your problem properly
+anymore, as we've seen in the problem vectorization chapter. This requires
+some experience and you have to be patient: experience is not an overnight
+process.
+
+Finally, custom vectorization is an option worth considering once you've
+looked at the alternatives to numpy. When nothing works for you, Numpy still
+offers you a clever framework to forge your own tools. And who knows, this
+could be the start of an exciting adventure for you and the community, as it
+happened to me with the glumpy and vispy packages.
