diff --git a/episodes/02-numpy.md b/episodes/02-numpy.md
index 21db9cc36..f281834b7 100644
--- a/episodes/02-numpy.md
+++ b/episodes/02-numpy.md
@@ -422,28 +422,50 @@ operation across an axis:
 
 ![](fig/python-operations-across-axes.png){alt="Per-patient maximum inflammation is computed row-wise across all columns usingnumpy.amax(data, axis=1). Per-day average inflammation is computed column-wise across all rows usingnumpy.mean(data, axis=0)."}
 
-To support this functionality,
-most array functions allow us to specify the axis we want to work on.
-If we ask for the average across axis 0 (rows in our 2D example),
-we get:
+To find the **maximum inflammation reported for each patient**, you would apply the `max` function moving across the columns (axis 1). To find the **daily average inflammation reported across patients**, you would apply the `mean` function moving down the rows (axis 0).
+
+To support this functionality, most array functions allow us to specify the axis we want to work on. If we ask for the max across axis 1 (columns in our 2D example), we get:
+
+```python
+print(numpy.max(data, axis=1))
+```
+
+```output
+[18. 18. 19. 17. 17. 18. 17. 20. 17. 18. 18. 18. 17. 16. 17. 18. 19. 19.
+ 17. 19. 19. 16. 17. 15. 17. 17. 18. 17. 20. 17. 16. 19. 15. 15. 19. 17.
+ 16. 17. 19. 16. 18. 19. 16. 19. 18. 16. 19. 15. 16. 18. 14. 20. 17. 15.
+ 17. 16. 17. 19. 18. 18.]
+```
+
+As a quick check, we can ask this array what its shape is. We expect 60 patient maximums:
+
+```python
+print(numpy.max(data, axis=1).shape)
+```
+
+```output
+(60,)
+```
+
+The expression `(60,)` tells us we have an N×1 vector, so this is the maximum inflammation across all days for each patient.
+
+If we ask for the average down axis 0 (rows in our 2D example), we get:
 
 ```python
 print(numpy.mean(data, axis=0))
 ```
 
 ```output
-[ 0. 0.45 1.11666667 1.75 2.43333333 3.15
- 3.8 3.88333333 5.23333333 5.51666667 5.95 5.9
- 8.35 7.73333333 8.36666667 9.5 9.58333333
- 10.63333333 11.56666667 12.35 13.25 11.96666667
- 11.03333333 10.16666667 10. 8.66666667 9.15 7.25
- 7.33333333 6.58333333 6.06666667 5.95 5.11666667 3.6
- 3.3 3.56666667 2.48333333 1.5 1.13333333
- 0.56666667]
+[ 0. 0.45 1.11666667 1.75 2.43333333 3.15
+ 3.8 3.88333333 5.23333333 5.51666667 5.95 5.9
+ 8.35 7.73333333 8.36666667 9.5 9.58333333 10.63333333
+ 11.56666667 12.35 13.25 11.96666667 11.03333333 10.16666667
+ 10. 8.66666667 9.15 7.25 7.33333333 6.58333333
+ 6.06666667 5.95 5.11666667 3.6 3.3 3.56666667
+ 2.48333333 1.5 1.13333333 0.56666667]
 ```
 
-As a quick check,
-we can ask this array what its shape is:
+Check the array shape. We expect 40 averages, one for each day of the study:
 
 ```python
 print(numpy.mean(data, axis=0).shape)
@@ -452,26 +474,19 @@ print(numpy.mean(data, axis=0).shape)
 ```output
 (40,)
 ```
-
-The expression `(40,)` tells us we have an N×1 vector,
-so this is the average inflammation per day for all patients.
-If we average across axis 1 (columns in our 2D example), we get:
+Similarly, we can apply the `mean` function to axis 1 to get each patient's average inflammation over the duration of the study (60 values).
 
 ```python
 print(numpy.mean(data, axis=1))
 ```
-
 ```output
-[ 5.45 5.425 6.1 5.9 5.55 6.225 5.975 6.65 6.625 6.525
- 6.775 5.8 6.225 5.75 5.225 6.3 6.55 5.7 5.85 6.55
- 5.775 5.825 6.175 6.1 5.8 6.425 6.05 6.025 6.175 6.55
- 6.175 6.35 6.725 6.125 7.075 5.725 5.925 6.15 6.075 5.75
- 5.975 5.725 6.3 5.9 6.75 5.925 7.225 6.15 5.95 6.275 5.7
- 6.1 6.825 5.975 6.725 5.7 6.25 6.4 7.05 5.9 ]
+[5.45 5.425 6.1 5.9 5.55 6.225 5.975 6.65 6.625 6.525 6.775 5.8
+ 6.225 5.75 5.225 6.3 6.55 5.7 5.85 6.55 5.775 5.825 6.175 6.1
+ 5.8 6.425 6.05 6.025 6.175 6.55 6.175 6.35 6.725 6.125 7.075 5.725
+ 5.925 6.15 6.075 5.75 5.975 5.725 6.3 5.9 6.75 5.925 7.225 6.15
+ 5.95 6.275 5.7 6.1 6.825 5.975 6.725 5.7 6.25 6.4 7.05 5.9 ]
 ```
 
-which is the average inflammation per patient across all days.
-
 ::::::::::::::::::::::::::::::::::::::: challenge
 
 ## Slicing Strings