@@ -279,7 +279,7 @@ are completely determined by a
but is actually totally reproducible. As long as you pick the same seed
value, you get the same result!

- ``` {index} sample; numpy.random.choice
+ ``` {index} sample, to_list
```

Let's use an example to investigate how randomness works in Python. Say we
@@ -291,6 +291,8 @@ Below we use the seed number `1`. At
that point, Python will keep track of the randomness that occurs throughout the code.
For example, we can call the `sample` method
on the series of numbers, passing the argument `n=10` to indicate that we want 10 samples.
+ The `to_list` method converts the resulting series into a basic Python list to make
+ the output easier to read.

``` {code-cell} ipython3
import numpy as np
@@ -300,7 +302,7 @@ np.random.seed(1)

nums_0_to_9 = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

- random_numbers1 = nums_0_to_9.sample(n=10).to_numpy()
+ random_numbers1 = nums_0_to_9.sample(n=10).to_list()
random_numbers1
```
You can see that `random_numbers1` is a list of 10 numbers
@@ -309,7 +311,7 @@ we run the `sample` method again,
we will get a fresh batch of 10 numbers that also look random.

``` {code-cell} ipython3
- random_numbers2 = nums_0_to_9.sample(n=10).to_numpy()
+ random_numbers2 = nums_0_to_9.sample(n=10).to_list()
random_numbers2
```

@@ -319,12 +321,12 @@ as before---and then call the `sample` method again.

``` {code-cell} ipython3
np.random.seed(1)
- random_numbers1_again = nums_0_to_9.sample(n=10).to_numpy()
+ random_numbers1_again = nums_0_to_9.sample(n=10).to_list()
random_numbers1_again
```

``` {code-cell} ipython3
- random_numbers2_again = nums_0_to_9.sample(n=10).to_numpy()
+ random_numbers2_again = nums_0_to_9.sample(n=10).to_list()
random_numbers2_again
```

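The re-seeding behaviour exercised by the cells above can be checked in isolation. The following is a minimal sketch, assuming `numpy` and `pandas` are installed; the helper name `two_batches` is hypothetical and does not appear in the chapter.

```python
import numpy as np
import pandas as pd

nums_0_to_9 = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


def two_batches(seed):
    """Seed numpy's global generator, then draw two consecutive samples.

    `two_batches` is a hypothetical helper used only for this illustration.
    """
    np.random.seed(seed)
    first = nums_0_to_9.sample(n=10).to_list()
    second = nums_0_to_9.sample(n=10).to_list()
    return first, second


run_a = two_batches(1)
run_b = two_batches(1)

# Re-seeding with the same value replays the same pair of batches.
print(run_a == run_b)
```

Each batch is a reordering of the ten values, since `sample` draws without replacement by default.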
@@ -336,13 +338,13 @@ obtain a different sequence of random numbers.

``` {code-cell} ipython3
np.random.seed(4235)
- random_numbers = nums_0_to_9.sample(n=10).to_numpy()
- random_numbers
+ random_numbers1_different = nums_0_to_9.sample(n=10).to_list()
+ random_numbers1_different
```

``` {code-cell} ipython3
- random_numbers = nums_0_to_9.sample(n=10).to_numpy()
- random_numbers
+ random_numbers2_different = nums_0_to_9.sample(n=10).to_list()
+ random_numbers2_different
```

In other words, even though the sequences of numbers that Python is generating *look*
@@ -387,22 +389,23 @@ reproducible.
In this book, we will generally only use packages that play nicely with `numpy`'s
default random number generator, so we will stick with `np.random.seed`.
You can achieve more careful control over randomness in your analysis
- by creating a `numpy` [`RandomState` object](https://numpy.org/doc/1.16/reference/generated/numpy.random.RandomState.html)
+ by creating a `numpy` [`Generator` object](https://numpy.org/doc/stable/reference/random/generator.html)
once at the beginning of your analysis, and passing it to
the `random_state` argument that is available in many `pandas` and `scikit-learn`
- functions. Those functions will then use your `RandomState` to generate random numbers instead of
- `numpy`'s default generator. For example, we can reproduce our earlier example by using a `RandomState`
+ functions. Those functions will then use your `Generator` to generate random numbers instead of
+ `numpy`'s default generator. For example, we can reproduce our earlier example by using a `Generator`
object with the `seed` value set to 1; we get the same lists of numbers once again.
```{code}
- rnd = np.random.RandomState(seed=1)
- random_numbers1_third = nums_0_to_9.sample(n=10, random_state=rnd).to_numpy()
+ from numpy.random import Generator, PCG64
+ rng = Generator(PCG64(seed=1))
+ random_numbers1_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
random_numbers1_third
```
```{code}
array([2, 9, 6, 4, 0, 3, 1, 7, 8, 5])
```
```{code}
- random_numbers2_third = nums_0_to_9.sample(n=10, random_state=rnd).to_numpy()
+ random_numbers2_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
random_numbers2_third
```
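The `Generator`-based pattern in this hunk can be tried on its own. The following is a minimal sketch, assuming a `pandas` version whose `sample` accepts a `numpy` `Generator` for `random_state` (to my knowledge, 1.4 or later); the helper name `draw_twice` is hypothetical.

```python
import numpy as np
import pandas as pd

nums_0_to_9 = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


def draw_twice(seed):
    """Create one seeded Generator and reuse it for two consecutive samples.

    `draw_twice` is a hypothetical helper used only for this illustration.
    """
    rng = np.random.Generator(np.random.PCG64(seed))
    first = nums_0_to_9.sample(n=10, random_state=rng).to_list()
    second = nums_0_to_9.sample(n=10, random_state=rng).to_list()
    return first, second


third_a = draw_twice(1)
third_b = draw_twice(1)

# Two Generators built from the same seed replay the same pair of samples.
print(third_a == third_b)
```

A `Generator` seeded with 1 draws from a different stream than the legacy global generator seeded with `np.random.seed(1)`, so its samples should not be expected to match the ones produced earlier with `np.random.seed(1)`. Since the samples now go through `to_list`, the static output would also be expected to display as a plain list rather than `array([...])`.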
```{code}