Commit 3335b2c

minor fix py generator cls2 bug hunt
1 parent 70634f3 commit 3335b2c

1 file changed (+18, -15 lines changed)

source/classification2.md

Lines changed: 18 additions & 15 deletions
@@ -279,7 +279,7 @@ are completely determined by a
 but is actually totally reproducible. As long as you pick the same seed
 value, you get the same result!
 
-```{index} sample; numpy.random.choice
+```{index} sample, to_list
 ```
 
 Let's use an example to investigate how randomness works in Python. Say we
@@ -291,6 +291,8 @@ Below we use the seed number `1`. At
 that point, Python will keep track of the randomness that occurs throughout the code.
 For example, we can call the `sample` method
 on the series of numbers, passing the argument `n=10` to indicate that we want 10 samples.
+The `to_list` method converts the resulting series into a basic Python list to make
+the output easier to read.
 
 ```{code-cell} ipython3
 import numpy as np
@@ -300,7 +302,7 @@ np.random.seed(1)
 
 nums_0_to_9 = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
 
-random_numbers1 = nums_0_to_9.sample(n=10).to_numpy()
+random_numbers1 = nums_0_to_9.sample(n=10).to_list()
 random_numbers1
 ```
 You can see that `random_numbers1` is a list of 10 numbers
@@ -309,7 +311,7 @@ we run the `sample` method again,
 we will get a fresh batch of 10 numbers that also look random.
 
 ```{code-cell} ipython3
-random_numbers2 = nums_0_to_9.sample(n=10).to_numpy()
+random_numbers2 = nums_0_to_9.sample(n=10).to_list()
 random_numbers2
 ```
 
@@ -319,12 +321,12 @@ as before---and then call the `sample` method again.
 
 ```{code-cell} ipython3
 np.random.seed(1)
-random_numbers1_again = nums_0_to_9.sample(n=10).to_numpy()
+random_numbers1_again = nums_0_to_9.sample(n=10).to_list()
 random_numbers1_again
 ```
 
 ```{code-cell} ipython3
-random_numbers2_again = nums_0_to_9.sample(n=10).to_numpy()
+random_numbers2_again = nums_0_to_9.sample(n=10).to_list()
 random_numbers2_again
 ```
 
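The behaviour this hunk describes (re-seeding with the same value replays the same sequence of samples) can be checked with a minimal sketch. The helper name `two_samples` is hypothetical and not part of the book. The sketch assumes `Series.sample` falls back to `numpy`'s global generator when no `random_state` is passed, which is the behaviour the surrounding text relies on.

```python
import numpy as np
import pandas as pd

nums_0_to_9 = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


def two_samples(seed):
    """Seed numpy's global generator, then draw two samples in a row.

    Hypothetical helper for illustration only.
    """
    np.random.seed(seed)
    first = nums_0_to_9.sample(n=10).to_list()
    second = nums_0_to_9.sample(n=10).to_list()
    return first, second


run_a = two_samples(1)
run_b = two_samples(1)

# Same seed, same order of calls: both draws are replayed exactly.
print(run_a == run_b)
```

Each draw is a reordering of the ten values, so comparing the plain lists with `==` is enough; no `numpy` array comparison is needed once `to_list` is used.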
@@ -336,13 +338,13 @@ obtain a different sequence of random numbers.
 
 ```{code-cell} ipython3
 np.random.seed(4235)
-random_numbers = nums_0_to_9.sample(n=10).to_numpy()
-random_numbers
+random_numbers1_different = nums_0_to_9.sample(n=10).to_list()
+random_numbers1_different
 ```
 
 ```{code-cell} ipython3
-random_numbers = nums_0_to_9.sample(n=10).to_numpy()
-random_numbers
+random_numbers2_different = nums_0_to_9.sample(n=10).to_list()
+random_numbers2_different
 ```
 
 In other words, even though the sequences of numbers that Python is generating *look*
@@ -387,22 +389,23 @@ reproducible.
 In this book, we will generally only use packages that play nicely with `numpy`'s
 default random number generator, so we will stick with `np.random.seed`.
 You can achieve more careful control over randomness in your analysis
-by creating a `numpy` [`RandomState` object](https://numpy.org/doc/1.16/reference/generated/numpy.random.RandomState.html)
+by creating a `numpy` [`Generator` object](https://numpy.org/doc/stable/reference/random/generator.html)
 once at the beginning of your analysis, and passing it to
 the `random_state` argument that is available in many `pandas` and `scikit-learn`
-functions. Those functions will then use your `RandomState` to generate random numbers instead of
-`numpy`'s default generator. For example, we can reproduce our earlier example by using a `RandomState`
+functions. Those functions will then use your `Generator` to generate random numbers instead of
+`numpy`'s default generator. For example, we can reproduce our earlier example by using a `Generator`
 object with the `seed` value set to 1; we get the same lists of numbers once again.
 ```{code}
-rnd = np.random.RandomState(seed=1)
-random_numbers1_third = nums_0_to_9.sample(n=10, random_state=rnd).to_numpy()
+from numpy.random import Generator, PCG64
+rng = Generator(PCG64(seed=1))
+random_numbers1_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
 random_numbers1_third
 ```
 ```{code}
 array([2, 9, 6, 4, 0, 3, 1, 7, 8, 5])
 ```
 ```{code}
-random_numbers2_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
+random_numbers2_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
 random_numbers2_third
 ```
 ```{code}

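The `Generator` approach in the last hunk can be exercised with a minimal sketch. It assumes a `pandas` version whose `sample` accepts a `numpy.random.Generator` for `random_state` (to my knowledge 1.4 or later). The helper name `draw_twice` is hypothetical. The sketch only exercises `Series.sample`, and it only checks that two generators built from the same seed agree with each other. It does not claim that a `PCG64`-based `Generator` seeded with 1 reproduces the numbers from `np.random.seed(1)`; the two use different underlying algorithms, so their outputs should not be expected to match.

```python
import pandas as pd
from numpy.random import Generator, PCG64

nums_0_to_9 = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


def draw_twice(seed):
    """Build a fresh Generator from `seed` and draw two samples with it.

    Hypothetical helper for illustration only.
    """
    rng = Generator(PCG64(seed=seed))
    first = nums_0_to_9.sample(n=10, random_state=rng).to_list()
    second = nums_0_to_9.sample(n=10, random_state=rng).to_list()
    return first, second


third_a = draw_twice(1)
third_b = draw_twice(1)

# Two generators created from the same seed produce the same pair of draws.
print(third_a == third_b)
```

Passing the generator explicitly keeps the randomness local to the calls that receive it, rather than relying on `numpy`'s global state set through `np.random.seed`.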