Commit 8d4ec14

[Edit] Python:NumPy: .std() (#7446)
* [Edit] Python:NumPy: .std() * updated faqs based on PAA ---------
1 parent 4ea22fb commit 8d4ec14

File tree

1 file changed

+62
-46
lines changed
  • content/numpy/concepts/built-in-functions/terms/std

content/numpy/concepts/built-in-functions/terms/std/std.md

Lines changed: 62 additions & 46 deletions
Original file line number | Diff line number | Diff line change
@@ -13,90 +13,106 @@ CatalogContent:
1313
- 'paths/data-science'
1414
---
1515

16-
The **`.std()`** function calculates the standard deviation of given data along a specified axis. A standard deviation is a statistical measure indicating the spread of a distribution of data, represented by an array, along a specified axis.
16+
The NumPy **`.std()`** function calculates the standard deviation of the given data along a specified axis. Standard deviation is a statistical measure that indicates how spread out the values in a dataset (here, the elements of an array) are around their mean.
1717

18-
## Syntax
18+
## NumPy `.std()` Syntax
1919

2020
```pseudo
21-
numpy.std(a, axis, dtype, out, ddof, keepdims, where)
21+
np.std(a, axis, dtype, out, ddof, keepdims, where)
2222
```
2323

24-
- `a`: Array of elements used to find the standard deviation.
24+
**Parameters:**
2525

26-
### Optional Parameters
26+
- `a`: [Array](https://www.codecademy.com/resources/docs/numpy/ndarray) of elements used to find the standard deviation.
27+
- `axis` (Optional): The axis or axes along which the standard deviation will be computed. By default (`None`), the array is flattened before computation.
28+
  - If `0`, calculates the standard deviation down each column of a 2D array (along the vertical axis).
29+
  - If `1`, calculates the standard deviation across each row of a 2D array (along the horizontal axis).
30+
  - If a tuple of integers, calculates the standard deviation over all of the specified axes.
31+
- `dtype` (Optional): Type used in computing the standard deviation, if specified. By default, for arrays of integer type, it is float64, while for arrays of float types, it matches the array type.
32+
- `out` (Optional): Specifies an alternative output array to contain the result. This array must have the same shape as the expected output.
33+
- `ddof` (Optional): Stands for _Delta Degrees of Freedom_. The divisor used in the calculation is `N - ddof`, where `N` is the number of elements. By default, it is `0`; set it to `1` to compute the sample standard deviation.
34+
- `keepdims` (Optional): A boolean that determines whether the reduced axes are kept in the output as dimensions of size one, so that the result broadcasts correctly against the input array. By default, it is set to `False`.
35+
- `where` (Optional): A boolean array (or a condition that produces one) in which `True` marks the elements to include in the standard deviation calculation.
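
As a rough illustration of the `keepdims` and `where` parameters described above, the following sketch uses a small made-up 2D array. The variable names are hypothetical, and the `where` argument assumes a NumPy version that supports it (1.20 or later):

```py
import numpy as np

# Illustrative 2D array (values chosen only for this sketch)
arr = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])

# keepdims=True keeps the reduced axis as a dimension of size one
row_std_kept = np.std(arr, axis=1, keepdims=True)
print("Shape with keepdims:", row_std_kept.shape)

# where= restricts the calculation to elements where the mask is True
mask = arr > 1.0
masked_std = np.std(arr, where=mask)
print("STD of elements greater than 1:", masked_std)
```

Here the masked calculation only sees the values `2` through `6`, so the result differs from the standard deviation of the full array.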
2736

28-
- `axis`: Specifies the axis along which the standard deviation will be computed. By default, the array is flattened before computation.
37+
**Return value:**
2938

30-
- **axis = 0**: Calculates the standard deviation along the vertical axis.
39+
If the `out` parameter is `None`, the NumPy `.std()` function returns a new array containing the standard deviation. Otherwise, it assigns the result to the specified output array and returns its reference.
3140

32-
- **axis = 1**: Calculates the standard deviation along the horizontal axis.
41+
> **Notes:**
42+
>
43+
> 1. For floating-point inputs, the standard deviation is calculated with the same precision as the input data. This may cause inaccuracies, especially with the `np.float32` data type; specifying a higher-precision accumulator through the `dtype` parameter can reduce them.
44+
> 2. For complex numbers, `std` takes the absolute value of each deviation before squaring, so the result is real and non-negative.
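
The two notes above can be sketched as follows. This is a minimal illustration with made-up data, not a benchmark; the size of the `float32` rounding error depends on the data and the NumPy version, so no specific values are claimed for it:

```py
import numpy as np

# Made-up float32 data for illustration
values = np.full(1_000_000, 0.1, dtype=np.float32)
values[::2] = 0.3

# Computed in the input precision (float32)
std32 = np.std(values)

# Computed with a higher-precision accumulator
std64 = np.std(values, dtype=np.float64)

print("float32 result:", std32, std32.dtype)
print("float64 result:", std64, std64.dtype)

# Complex input: the result is a real, non-negative number
complex_std = np.std(np.array([1 + 1j, 1 - 1j]))
print("Complex input STD:", complex_std)
```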
3345
34-
- **tuple of ints**: Calculates the standard deviation along multiple specified axes.
46+
## Example 1: Basic Usage of NumPy `.std()`
3547

36-
- `dtype`: Type used in computing the standard deviation, if specified. By default, for arrays of integer type, it is float64, while for arrays of float types, it matches the array type.
48+
In this example, the NumPy `.std()` function calculates the standard deviation of the given data:
3749

38-
> **Note** For floating-point inputs, the standard deviation is calculated with the same precision as the input data. This may cause inaccuracies, especially with `np.float32` data type.
39-
40-
- `out`: Specifies an alternative output array to contain the result. This array must have the same shape as the expected output.
50+
```py
51+
import numpy as np
4152

42-
- `ddof`: It stands for _Delta Degrees of Freedom_. It helps adjust the calculation of standard deviation for samples.
53+
data = [10, 13, 23, 23, 16, 23, 21, 16]
4354

44-
- `keepdims`: It accepts a boolean value and is used to determine whether to retain the dimensions of the input array in the output. By default, it is set to `False`.
55+
std_dev = np.std(data)
4556

46-
- `where`: It accepts boolean arrays or conditions where `True` values indicate the indices, or elements within the array for which the standard deviation should be calculated.
57+
print("Standard Deviation:", std_dev)
58+
```
4759

48-
If the `out` parameter is `None`, the `.std()` function returns a new array containing the standard deviation. Otherwise, it assigns the result to the specified output array and returns its reference.
60+
The output of this code is:
4961

50-
> **Note** For complex numbers, `std` takes the absolute value before squaring for a real, nonnegative result.
62+
```shell
63+
Standard Deviation: 4.7549316504025585
64+
```
5165

52-
## Example
66+
## Example 2: Using NumPy `.std()` with `axis`
5367

54-
The following examples demonstrate the use of `.std()` with different parameters.
68+
In this example, the `axis` parameter is used with the NumPy `.std()` function to calculate the column-wise and row-wise standard deviation of a 2D array:
5569

5670
```py
5771
import numpy as np
5872

59-
arr = np.array([23, 54, 19, 45, 34])
73+
arr = np.array([[1, 2, 3],
74+
                [4, 5, 6]])
6075

61-
print("arr : ", arr)
76+
col_std = np.std(arr, axis=0)
77+
row_std = np.std(arr, axis=1)
6278

63-
print("\nStandard deviation of arr : ", np.std(arr))
79+
print("Column-wise STD:", col_std)
80+
print("Row-wise STD:", row_std)
81+
```
6482

65-
print("\nStandard deviation of arr (float32) : ", np.std(arr, dtype=np.float32))
83+
The output of this code is:
6684

67-
print("\nStandard deviation of arr (float64) : ", np.std(arr, dtype=np.float64))
85+
```shell
86+
Column-wise STD: [1.5 1.5 1.5]
87+
Row-wise STD: [0.81649658 0.81649658]
6888
```
6989

70-
Given below is the output for the above code block:
90+
## Codebyte Example: Using NumPy `.std()` with `ddof`
7191

72-
```shell
73-
arr : [23, 54, 19, 45, 34]
92+
In this codebyte example, the `ddof` parameter is used with the NumPy `.std()` function to compare the population and sample standard deviations of the given data:
93+
94+
```codebyte/python
95+
import numpy as np
7496
75-
Standard deviation of arr : 13.130118049735882
97+
data = [10, 13, 23, 23, 16, 23, 21, 16]
7698
77-
Standard deviation of arr (float32) : 13.130117
99+
pop_std = np.std(data)
100+
sample_std = np.std(data, ddof=1)
78101
79-
Standard deviation of arr (float64) : 13.130118049735882
102+
print("Population STD:", pop_std)
103+
print("Sample STD:", sample_std)
80104
```
81105

82-
## Codebyte Example
106+
## Frequently Asked Questions
83107

84-
Run the below codebyte to better understand the `.std()` function:
108+
### 1. What is the difference between NumPy `.std()` and `stdev()`?
85109

86-
```codebyte/python
87-
import numpy as np
110+
NumPy `.std()` works on NumPy arrays and is optimized for performance, supporting multi-dimensional data and the `axis` parameter. `statistics.stdev()` from Python’s standard library works on standard Python iterables, always calculates the **sample** standard deviation (`statistics.pstdev()` gives the population version), and doesn’t support multi-dimensional arrays or an `axis` argument.
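
A short sketch of this difference, reusing the data from the earlier examples (the variable names are only for illustration):

```py
import statistics

import numpy as np

data = [10, 13, 23, 23, 16, 23, 21, 16]

# Standard library: stdev() is the sample version, pstdev() the population version
py_sample = statistics.stdev(data)
py_population = statistics.pstdev(data)

# NumPy: population by default, sample with ddof=1
np_population = np.std(data)
np_sample = np.std(data, ddof=1)

print("statistics.stdev: ", py_sample)
print("np.std(ddof=1):   ", np_sample)
print("statistics.pstdev:", py_population)
print("np.std:           ", np_population)
```

Up to floating-point rounding, `statistics.stdev()` should match `np.std(..., ddof=1)`, and `statistics.pstdev()` should match the default `np.std()`.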
88111

89-
arr = [[8, 8, 8, 8, 8],
90-
[15, 10, 32, 9, 8],
91-
[27, 6, 63, 4, 8, ],
92-
[23, 54, 41, 9, 8]]
112+
### 2. Is NumPy `.std()` population or sample?
93113

94-
# flattened array
95-
print("\nStandard deviation of arr, when axis = None : ", np.std(arr))
114+
By default, NumPy `.std()` calculates the **population** standard deviation (`ddof=0`). To compute the sample standard deviation, set `ddof=1`.
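
To make the role of `ddof` concrete, the following sketch recomputes both versions by hand, assuming the usual definition in which the sum of squared deviations is divided by `N - ddof`:

```py
import numpy as np

data = np.array([10, 13, 23, 23, 16, 23, 21, 16], dtype=float)
n = data.size

# Sum of squared deviations from the mean
squared_dev_sum = ((data - data.mean()) ** 2).sum()

# Population: divide by N (ddof=0)
manual_population = np.sqrt(squared_dev_sum / n)

# Sample: divide by N - 1 (ddof=1)
manual_sample = np.sqrt(squared_dev_sum / (n - 1))

print("Manual population STD:", manual_population)
print("np.std(data):         ", np.std(data))
print("Manual sample STD:    ", manual_sample)
print("np.std(data, ddof=1): ", np.std(data, ddof=1))
```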
96115

97-
# along the axis = 0
98-
print("\nStandard deviation of arr, when axis = 0 : ", np.std(arr, axis = 0))
116+
### 3. What is the difference between NumPy `.std()` and Pandas `.std()`?
99117

100-
# along the axis = 1
101-
print("\nStandard deviation of arr, when axis = 1 : ", np.std(arr, axis = 1))
102-
```
118+
NumPy `.std()` calculates the standard deviation of NumPy arrays and defaults to the population standard deviation (`ddof=0`). Pandas `.std()` works on Series and DataFrame objects, excludes `NaN` values by default (`skipna=True`), and defaults to the sample standard deviation (`ddof=1`).
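
The `NaN` handling can be illustrated on the NumPy side alone. In this sketch, with made-up data, `np.nanstd()` with `ddof=1` is used as an approximation of the Pandas defaults described above; it is not a call into Pandas itself:

```py
import numpy as np

# Made-up data containing a missing value
values = np.array([1.0, 2.0, np.nan, 4.0])

# np.std() propagates NaN
plain_std = np.std(values)

# np.nanstd() ignores NaN; ddof=1 gives the sample standard deviation
nan_aware_sample_std = np.nanstd(values, ddof=1)

print("np.std:               ", plain_std)
print("np.nanstd with ddof=1:", nan_aware_sample_std)
```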
