content/numpy/concepts/built-in-functions/terms/std/std.md
Lines changed: 62 additions & 46 deletions
@@ -13,90 +13,106 @@ CatalogContent:
13
13
- 'paths/data-science'
14
14
---
15
15
16
-
The **`.std()`** function calculates the standard deviation of given data along a specified axis. A standard deviation is a statistical measure indicating the spread of a distribution of data, represented by an array, along a specified axis.
16
+
The NumPy **`.std()`** function calculates the standard deviation of the given data along a specified axis. Standard deviation is a statistical measure of how spread out the values in a dataset, here represented by an array, are around their mean.
-`a`: Array of elements used to find the standard deviation.
24
+
**Parameters:**
25
25
26
-
### Optional Parameters
26
+
- `a`: [Array](https://www.codecademy.com/resources/docs/numpy/ndarray) of elements used to find the standard deviation.
27
+
- `axis` (Optional): The axis or axes along which the standard deviation will be computed. By default, the array is flattened before computation.
28
+
- If `0`, calculates the standard deviation along the vertical axis (for a 2D array, one value per column).
29
+
- If `1`, calculates the standard deviation along the horizontal axis (for a 2D array, one value per row).
30
+
- If a tuple of integers, calculates the standard deviation along multiple specified axes.
31
+
- `dtype` (Optional): Type used in computing the standard deviation. By default, it is `float64` for arrays of integer type, while for arrays of float types it matches the array type.
32
+
- `out` (Optional): Specifies an alternative output array to contain the result. This array must have the same shape as the expected output.
33
+
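- `ddof` (Optional): Stands for _Delta Degrees of Freedom_. The divisor used in the calculation is `N - ddof`, where `N` is the number of elements. By default, it is `0` (population standard deviation); set it to `1` for the sample standard deviation.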
34
+
- `keepdims` (Optional): A boolean value that determines whether the reduced axes are retained in the output as dimensions of size one. By default, it is set to `False`.
35
+
- `where` (Optional): A boolean array or condition. Only the elements for which it is `True` are included in the standard deviation calculation.
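
The optional parameters above can be combined. The following is a minimal sketch, assuming a NumPy version recent enough to support the `where` argument; the array and the variable names are illustrative only:

```py
import numpy as np

arr = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])

# A tuple of axes reduces over both dimensions at once.
# For a 2D array this matches the flattened (default) result.
both_axes = np.std(arr, axis=(0, 1))

# keepdims=True keeps the reduced axis with length 1,
# so the result can broadcast against the original array.
row_std = np.std(arr, axis=1, keepdims=True)

# where= restricts the calculation to elements marked True.
mask = arr > 1.0
masked_std = np.std(arr, where=mask)

print("Both axes:", both_axes)
print("Shape with keepdims:", row_std.shape)
print("Masked STD:", masked_std)
```

Here `row_std` has shape `(2, 1)` rather than `(2,)`, which is what allows an expression such as `arr / row_std` to broadcast row by row.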
27
36
28
-
-`axis`: Specifies the axis along which the standard deviation will be computed. By default, the array is flattened before computation.
37
+
**Return value:**
29
38
30
-
-**axis = 0**: Calculates the standard deviation along the vertical axis.
39
+
If the `out` parameter is `None`, the NumPy `.std()` function returns a new array containing the standard deviation. Otherwise, it assigns the result to the specified output array and returns its reference.
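
As a rough illustration of the `out` behavior described above, the sketch below pre-allocates a result array (the name `result` is just an example) and checks that the returned object is that same array:

```py
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Reducing a (2, 3) array along axis 0 gives shape (3,),
# so the output array is allocated with that shape.
result = np.empty(3, dtype=np.float64)

returned = np.std(arr, axis=0, out=result)

print(result)
# The return value is a reference to the output array itself.
print(returned is result)
```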
31
40
32
-
-**axis = 1**: Calculates the standard deviation along the horizontal axis.
41
+
> **Notes:**
42
+
>
43
+
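> 1. For floating-point inputs, the standard deviation is calculated with the same precision as the input data. This may cause inaccuracies, especially with the `np.float32` data type. Passing a higher-precision type through the `dtype` parameter, such as `np.float64`, mitigates this.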
44
+
> 2. For complex numbers, `.std()` takes the absolute value of each deviation from the mean before squaring, so the result is real and non-negative.
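
Both notes can be checked with a short sketch. The sample values below are arbitrary and chosen only for illustration:

```py
import numpy as np

values = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)

# By default, the calculation is carried out in float32.
std32 = np.std(values)

# Passing dtype asks NumPy to compute in float64 instead.
std64 = np.std(values, dtype=np.float64)

print(std32.dtype, std64.dtype)

# Complex input still yields a real, non-negative result.
z = np.array([1 + 2j, 3 - 1j, -2 + 0.5j])
z_std = np.std(z)

print(z_std, np.isrealobj(z_std))
```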
33
45
34
-
-**tuple of ints**: Calculates the standard deviation along multiple specified axes.
46
+
## Example 1: Basic Usage of NumPy `.std()`
35
47
36
-
-`dtype`: Type used in computing the standard deviation, if specified. By default, for arrays of integer type, it is float64, while for arrays of float types, it matches the array type.
48
+
In this example, the NumPy `.std()` function calculates the standard deviation of the given data:
37
49
38
-
> **Note** For floating-point inputs, the standard deviation is calculated with the same precision as the input data. This may cause inaccuracies, especially with `np.float32` data type.
39
-
40
-
-`out`: Specifies an alternative output array to contain the result. This array must have the same shape as the expected output.
50
+
```py
51
+
import numpy as np
41
52
42
-
-`ddof`: It stands for _Delta Degrees of Freedom_. It helps adjust the calculation of standard deviation for samples.
53
+
data = [10, 13, 23, 23, 16, 23, 21, 16]
43
54
44
-
-`keepdims`: It accepts a boolean value and is used to determine whether to retain the dimensions of the input array in the output. By default, it is set to `False`.
55
+
std_dev = np.std(data)
45
56
46
-
-`where`: It accepts boolean arrays or conditions where `True` values indicate the indices, or elements within the array for which the standard deviation should be calculated.
57
+
print("Standard Deviation:", std_dev)
58
+
```
47
59
48
-
If the `out` parameter is `None`, the `.std()` function returns a new array containing the standard deviation. Otherwise, it assigns the result to the specified output array and returns its reference.
60
+
The output of this code is:
49
61
50
-
> **Note** For complex numbers, `std` takes the absolute value before squaring for a real, nonnegative result.
62
+
```shell
63
+
Standard Deviation: 4.7549316504025585
64
+
```
51
65
52
-
## Example
66
+
## Example 2: Using NumPy `.std()` with `axis`
53
67
54
-
The following examples demonstrate the use of `.std()`with different parameters.
68
+
In this example, the `axis` parameter is used with the NumPy `.std()` function to calculate the standard deviation of a 2D array along each axis:
55
69
56
70
```py
57
71
import numpy as np
58
72
59
-
arr = np.array([23, 54, 19, 45, 34])
73
+
arr = np.array([[1, 2, 3],
74
+
[4, 5, 6]])
60
75
61
-
print("arr : ", arr)
76
+
col_std = np.std(arr, axis=0)
77
+
row_std = np.std(arr, axis=1)
62
78
63
-
print("\nStandard deviation of arr : ", np.std(arr))
79
+
print("Column-wise STD:", col_std)
80
+
print("Row-wise STD:", row_std)
81
+
```
64
82
65
-
print("\nStandard deviation of arr (float32) : ", np.std(arr, dtype=np.float32))
83
+
The output of this code is:
66
84
67
-
print("\nStandard deviation of arr (float64) : ", np.std(arr, dtype=np.float64))
85
+
```shell
86
+
Column-wise STD: [1.5 1.5 1.5]
87
+
Row-wise STD: [0.81649658 0.81649658]
68
88
```
69
89
70
-
Given below is the output for the above code block:
90
+
## Codebyte Example: Using NumPy `.std()` with `ddof`
71
91
72
-
```shell
73
-
arr : [23, 54, 19, 45, 34]
92
+
In this codebyte example, the `ddof` parameter is used with the NumPy `.std()` function to calculate the standard deviation of the given data:
93
+
94
+
```codebyte/python
95
+
import numpy as np
74
96
75
-
Standard deviation of arr : 13.130118049735882
97
+
data = [10, 13, 23, 23, 16, 23, 21, 16]
76
98
77
-
Standard deviation of arr (float32) : 13.130117
99
+
pop_std = np.std(data)
100
+
sample_std = np.std(data, ddof=1)
78
101
79
-
Standard deviation of arr (float64) : 13.130118049735882
102
+
print("Population STD:", pop_std)
103
+
print("Sample STD:", sample_std)
80
104
```
81
105
82
-
## Codebyte Example
106
+
## Frequently Asked Questions
83
107
84
-
Run the below codebyte to better understand the`.std()`function:
108
+
### 1. What is the difference between NumPy `.std()` and `stdev()`?
85
109
86
-
```codebyte/python
87
-
import numpy as np
110
+
NumPy `.std()` works on NumPy arrays and is optimized for performance, supporting multi-dimensional data and the `axis` parameter. It calculates the **population** standard deviation by default. `statistics.stdev()` from Python's standard library works on standard Python iterables, always calculates the **sample** standard deviation (its population counterpart is `statistics.pstdev()`), and doesn't support multi-dimensional arrays or the `axis` argument.
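
A small side-by-side sketch, reusing the data from the earlier examples, shows how the two libraries line up once the population and sample variants are matched:

```py
import statistics

import numpy as np

data = [10, 13, 23, 23, 16, 23, 21, 16]

# Population standard deviation: NumPy default vs. statistics.pstdev()
np_pop = np.std(data)
py_pop = statistics.pstdev(data)

# Sample standard deviation: NumPy with ddof=1 vs. statistics.stdev()
np_sample = np.std(data, ddof=1)
py_sample = statistics.stdev(data)

print("Population:", np_pop, py_pop)
print("Sample:", np_sample, py_sample)
```

The paired values should agree up to floating-point rounding.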
88
111
89
-
arr = [[8, 8, 8, 8, 8],
90
-
[15, 10, 32, 9, 8],
91
-
[27, 6, 63, 4, 8, ],
92
-
[23, 54, 41, 9, 8]]
112
+
### 2. Is NumPy `.std()` population or sample?
93
113
94
-
# flattened array
95
-
print("\nStandard deviation of arr, when axis = None : ", np.std(arr))
114
+
By default, NumPy `.std()` calculates the **population** standard deviation (`ddof=0`). To compute the sample standard deviation, set `ddof=1`.
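
To make the role of `ddof` concrete, the sketch below recomputes both variants by hand. It assumes real-valued data and divides the sum of squared deviations by `N - ddof`:

```py
import numpy as np

data = np.array([10, 13, 23, 23, 16, 23, 21, 16])

n = data.size
squared_dev = (data - data.mean()) ** 2

# Population: divide by N (ddof=0).
manual_pop = np.sqrt(squared_dev.sum() / n)

# Sample: divide by N - 1 (ddof=1).
manual_sample = np.sqrt(squared_dev.sum() / (n - 1))

print(manual_pop, np.std(data))
print(manual_sample, np.std(data, ddof=1))
```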
96
115
97
-
# along the axis = 0
98
-
print("\nStandard deviation of arr, when axis = 0 : ", np.std(arr, axis = 0))
116
+
### 3. What is the difference between NumPy `.std()` and Pandas `.std()`?
99
117
100
-
# along the axis = 1
101
-
print("\nStandard deviation of arr, when axis = 1 : ", np.std(arr, axis = 1))
102
-
```
118
+
NumPy `.std()` calculates standard deviation on NumPy arrays and defaults to population standard deviation (`ddof=0`). Pandas `.std()` works on Series and DataFrame objects, automatically excludes `NaN` values, and defaults to the sample standard deviation (`ddof=1`).
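
The difference in defaults can be seen with a short sketch. It assumes pandas is installed, and it uses `np.nanstd()` with `ddof=1` as one way to reproduce the Pandas result in NumPy:

```py
import numpy as np
import pandas as pd

values = [2.0, 4.0, np.nan, 8.0]

# NumPy: NaN propagates, and ddof defaults to 0.
np_result = np.std(values)

# Pandas: NaN is skipped, and ddof defaults to 1.
pd_result = pd.Series(values).std()

# Ignoring NaN and using ddof=1 in NumPy matches the Pandas default.
matched = np.nanstd(values, ddof=1)

print("NumPy:", np_result)
print("Pandas:", pd_result)
print("NumPy nanstd, ddof=1:", matched)
```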