|
3 | 3 | In general, data transformations change raw feature vectors into |
4 | 4 | a representation that is more suitable for various estimators. |
5 | 5 |
|
6 | | -## Standardization |
| 6 | +## Standardization a.k.a. Z-score Normalization |
7 | 7 |
|
8 | | -**Standardization** of dataset is a common requirement for many machine |
9 | | -learning techniques. These techniques might perform poorly if the individual |
10 | | -features do not more or less look like standard normally distributed data. |
| 8 | +**Standardization**, also known as Z-score normalization, is a common requirement |
| 9 | +for many machine learning techniques. These techniques might perform poorly |
| 10 | +if the individual features do not look more or less like standard normally |
| 11 | +distributed data. |
11 | 12 |
|
12 | 13 | Standardization transforms data points into corresponding standard scores |
13 | | -by removing mean and scaling to unit variance. |
| 14 | +by subtracting the mean and scaling to unit variance. |
14 | 15 |
|
15 | | -The **standard score** is the signed number of standard deviations by which |
16 | | -the value of an observation or data point is above the mean value of what |
17 | | -is being observed or measured. |
| 16 | +The **standard score**, also known as the Z-score, is the signed number of |
| 17 | +standard deviations by which the value of an observation or data point |
| 18 | +is above the mean value of what is being observed or measured. |
18 | 19 |
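As a worked form of the definition above, the standard score of a value can be written as follows. The symbols are introduced here for illustration only: $x$ is an observed value of a feature, $\mu$ is that feature's mean, and $\sigma$ is its standard deviation.

```math
z = \frac{x - \mu}{\sigma}
```

For example, a value of 14 in a feature with mean 10 and standard deviation 2 has $z = (14 - 10) / 2 = 2$, that is, it lies two standard deviations above the mean.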
|
19 | | -Standardization can be performed using `fit(ZScoreTransform, ...)`. |
| 20 | +Standardization can be performed using `t = fit(ZScoreTransform, ...)` |
| 21 | +followed by `StatsBase.transform(t, ...)` or `StatsBase.transform!(t, ...)`. |
| 22 | +`standardize(ZScoreTransform, ...)` is a shorthand to perform both operations |
| 23 | +in a single call. |
20 | 24 |
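The two-step and one-call forms described above might be used roughly as in the sketch below. The data matrix is made up for illustration, and the sketch assumes the layout in which each row is a feature and each column is an observation. Newer StatsBase releases also accept a `dims` keyword to choose the direction explicitly; it is omitted here to match the signature documented on this page.

```julia
using StatsBase, Statistics

# Hypothetical 2×4 data matrix (assumption: rows are features, columns are observations).
X = [1.0  2.0  3.0  4.0;
     10.0 20.0 30.0 40.0]

t = fit(ZScoreTransform, X)       # learn the per-feature mean and standard deviation
Z = StatsBase.transform(t, X)     # apply the fitted transform to the data

# Each feature (row) of Z should now have mean ≈ 0 and standard deviation ≈ 1.
@assert all(isapprox.(mean(Z, dims=2), 0; atol=1e-12))
@assert all(isapprox.(std(Z, dims=2), 1))

# The one-call shorthand should give the same result.
@assert standardize(ZScoreTransform, X) ≈ Z
```

Keeping the fitted `t` around is useful when the same mean and scale must later be applied to new data, which the `standardize` shorthand does not allow.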
|
21 | 25 | ```@docs |
22 | 26 | fit(::Type{ZScoreTransform}, X::AbstractArray{<:Real,2}; center::Bool=true, scale::Bool=true) |
23 | 27 | ``` |
24 | 28 |
|
25 | | -## Unit range normalization |
| 29 | +## Unit Range Normalization |
26 | 30 |
|
27 | | -**Unit range normalization** is an alternative data transformation which scales features |
28 | | -to lie in the interval `[0; 1]`. |
| 31 | +**Unit range normalization**, also known as min-max scaling, is an alternative |
| 32 | +data transformation which scales features to lie in the interval `[0, 1]`. |
29 | 33 |
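The usual min-max formula behind this scaling can be written as follows. The symbols are introduced here for illustration only: $x$ is an observed value of a feature, and $x_{\min}$ and $x_{\max}$ are the smallest and largest values of that feature in the fitted data.

```math
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
```

The smallest value of a feature maps to 0 and the largest to 1. For example, a value of 14 in a feature ranging from 10 to 26 maps to $(14 - 10) / (26 - 10) = 0.25$.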
|
30 | | -Unit range normalization can be performed using `fit(UnitRangeTransform, ...)`. |
| 34 | +Unit range normalization can be performed using `t = fit(UnitRangeTransform, ...)` |
| 35 | +followed by `StatsBase.transform(t, ...)` or `StatsBase.transform!(t, ...)`. |
| 36 | +`standardize(UnitRangeTransform, ...)` is a shorthand to perform both operations |
| 37 | +in a single call. |
31 | 38 |
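A minimal sketch of the same workflow for unit range normalization is shown below. The data matrix is made up for illustration, and the sketch again assumes that each row is a feature and each column is an observation.

```julia
using StatsBase

# Hypothetical 2×4 data matrix (assumption: rows are features, columns are observations).
X = [0.0  1.0  2.0  4.0;
     10.0 14.0 18.0 26.0]

t = fit(UnitRangeTransform, X)    # learn the per-feature minimum and range
U = StatsBase.transform(t, X)     # rescale each feature

# Every value should now lie in [0, 1], with each feature reaching both ends.
@assert all(0 .<= U .<= 1)
@assert all(isapprox.(minimum(U, dims=2), 0; atol=1e-12))
@assert all(isapprox.(maximum(U, dims=2), 1))
```

Unlike the Z-score transform, this scaling depends only on the extreme values of each feature, so a single outlier can compress the remaining values into a narrow part of the interval.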
|
32 | 39 | ```@docs |
33 | 40 | fit(::Type{UnitRangeTransform}, X::AbstractArray{<:Real,2}; unit::Bool=true) |
34 | 41 | ``` |
35 | 42 |
|
36 | | -## Additional methods |
| 43 | +## Additional Methods |
37 | 44 | ```@docs |
38 | 45 | StatsBase.transform |
39 | 46 | StatsBase.transform! |
|