
Commit 449abff

wrapping up
1 parent 3cb5583 commit 449abff

File tree

1 file changed (+52 −56 lines)

src/MLJMultivariateStatsInterface.jl

Lines changed: 52 additions & 56 deletions
@@ -457,9 +457,9 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Hyper-parameters
 
-- `maxoutdim=0`: The maximum number of output dimensions. If not set, defaults to
-  0, where all components are kept (e.g., the number of components/output dimensions
-  is equal to the size of the smallest dimension of the training matrix)
+- `maxoutdim=0`: Controls the dimension (number of columns) of the output,
+  `outdim`. Specifically, `outdim = min(n, indim, maxoutdim)`, where `n` is the
+  number of observations and `indim` the input dimension.
 - `method=:auto`: The method to use to solve the problem. Choices are
   - `:svd`: Singular Value Decomposition of the matrix.
   - `:cov`: Covariance matrix decomposition.
@@ -472,8 +472,7 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Operations
 
-- `transform(mach, Xnew)`: Return lower dimensional projection of the target given new
-  features `Xnew` having the same scitype as `X` above.
+- `transform(mach, Xnew)`: Return a lower dimensional projection of the input `Xnew` having the same scitype as `X` above.
 
 # Fitted parameters
 
@@ -485,9 +484,11 @@ The fields of `fitted_params(mach)` are:
 # Report
 
 The fields of `report(mach)` are:
+`outdim = min(n, indim, maxoutdim)`, where `n` is the
+number of observations and `indim` the input dimension.
 
-- `indim`: Dimensions of the provided data.
-- `outdim`: Dimensions of the transformed result.
+- `indim`: The input dimension.
+- `outdim`: `min(n, indim, maxoutdim)`, where `n` is the number of observations.
 - `tprincipalvar`: Total variance of the principal components.
 - `tresidualvar`: Total residual variance.
 - `tvar`: Total observation variance (principal + residual variance).
@@ -506,7 +507,7 @@ X, y = @load_iris
 model = PCA(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 ```
 
 See also
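The `outdim = min(n, indim, maxoutdim)` rule above can be checked directly from the report. A minimal sketch, not taken from the commit, assuming the report fields documented in this docstring and the usual MLJ model-loading step:

```julia
using MLJ   # assumes MLJ and MLJMultivariateStatsInterface are installed
PCA = @load PCA pkg=MultivariateStats

X, _ = @load_iris                # n = 150 observations, indim = 4 features
mach = machine(PCA(maxoutdim=2), X) |> fit!

r = report(mach)
r.indim, r.outdim                # expected: (4, 2), since min(150, 4, 2) == 2
```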
@@ -549,8 +550,7 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Operations
 
-- `transform(mach, Xnew)`: Return predictions of the target given new
-  features `Xnew` having the same scitype as `X` above.
+- `transform(mach, Xnew)`: Return a lower dimensional projection of the input `Xnew` having the same scitype as `X` above.
 
 # Fitted parameters
 
@@ -563,8 +563,8 @@ The fields of `fitted_params(mach)` are:
 
 The fields of `report(mach)` are:
 
-- `indim`: Dimensions of the provided data.
-- `outdim`: Dimensions of the transformed result.
+- `indim`: The input dimension.
+- `outdim`: `min(n, indim, maxoutdim)`, where `n` is the number of observations.
 - `principalvars`: The variance of the principal components.
 
 # Examples
@@ -584,7 +584,7 @@ end
 model = KPCA(maxoutdim=2, kernel = rbf_kernel(1))
 mach = machine(model, X) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 ```
 
 See also
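The definition of `rbf_kernel` is elided from this hunk. A plausible shape, offered only as an assumption and not the author's actual helper, on the premise that KernelPCA's `kernel` hyper-parameter expects a function of two vectors returning a scalar:

```julia
using LinearAlgebra: norm

# Hypothetical helper: returns a Gaussian (RBF) kernel k(x, y) = exp(-γ‖x − y‖²)
# with bandwidth parameter γ; `rbf_kernel(1)` would then be a valid `kernel` value.
rbf_kernel(gamma) = (x, y) -> exp(-gamma * norm(x .- y)^2)
```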
@@ -594,9 +594,9 @@ KernelPCA
 """
 $(MMI.doc_header(ICA))
 
-`ICA` is a computational technique for separating a multivariate signal into
-additive subcomponents, with the assumption that the subcomponents are
-non-Gaussian and independent from each other.
+`ICA` (independent component analysis) is a computational technique for separating a
+multivariate signal into additive subcomponents, with the assumption that the subcomponents
+are non-Gaussian and independent from each other.
 
 # Training data
 
@@ -618,34 +618,34 @@ Train the machine using `fit!(mach, rows=...)`.
 - `fun::Symbol=:tanh`: The approximate neg-entropy function, one of `:tanh`, `:gaus`.
 - `do_whiten::Bool=true`: Whether or not to perform pre-whitening.
 - `maxiter::Int=100`: The maximum number of iterations.
-- `tol::Real=1e-6`: The convergence tolerance for change in matrix W.
+- `tol::Real=1e-6`: The convergence tolerance for change in the unmixing matrix W.
 - `mean::Union{Nothing, Real, Vector{Float64}}=nothing`: mean to use, if nothing (default)
-  centering is computed and applied, if zero, no centering, a vector of means can
+  centering is computed and applied, if zero, no centering; otherwise a vector of means can
   be passed.
-- `winit::Union{Nothing,Matrix{<:Real}}=nothing`: Initial guess for matrix `W` either
-  an empty matrix (random initilization of `W`), a matrix of size `k × k` (if `do_whiten`
-  is true), a matrix of size `m × k` otherwise. If unspecified i.e `nothing` an empty
-  `Matrix{<:Real}` is used.
+- `winit::Union{Nothing,Matrix{<:Real}}=nothing`: Initial guess for the unmixing matrix
+  `W`: either an empty matrix (for random initialization of `W`), a matrix of size `k × k`
+  (if `do_whiten` is true), or a matrix of size `m × k` otherwise. Here `m` is the number
+  of components (columns) of the input.
 
 # Operations
 
-- `transform(mach, Xnew)`: Return lower dimensional projection of the target given new
-  features `Xnew` having the same scitype as `X` above.
+- `transform(mach, Xnew)`: Return the component-separated version of input
+  `Xnew`, which should have the same scitype as `X` above.
 
 # Fitted parameters
 
 The fields of `fitted_params(mach)` are:
 
-BUG: Does not have a projection class. It would also be cool to see the whitened
-matrix in fitted_params, to show how the covariance is the identity
+# TODO: Now that this is fixed, document
 
 # Report
 
 The fields of `report(mach)` are:
 
-- `indim`: Dimensions of the provided data.
-- `outdim`: Dimensions of the transformed result.
-- `mean`: The mean vector.
+- `indim`: Dimension (number of columns/components) of the training
+  data and new data to be transformed.
+- `outdim`: Dimension of transformed data (number of separated components).
+- `mean`: The mean vector, which has length `indim`.
 
 # Examples
 
@@ -660,7 +660,7 @@ X, y = @load_iris
 model = ICA(k = 2, tol=0.1)
 mach = machine(model, X) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 ```
 
 See also
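A minimal sketch, not part of the commit, of the "component-separated" reading of `transform` documented above: three observed channels that are linear mixtures of two sources are unmixed back into `k = 2` columns. The synthetic data and the model-loading step are illustrative assumptions.

```julia
using MLJ
ICA = @load ICA pkg=MultivariateStats

t = 0:0.01:10
S = [sin.(2t) sign.(cos.(3t))]       # two independent, non-Gaussian sources
A = [1.0 0.5; 0.5 1.0; 0.2 0.8]      # 3×2 mixing matrix
X = MLJ.table(S * A')                # observed table: 3 mixed channels (indim = 3)

mach = machine(ICA(k=2, tol=0.1), X) |> fit!
Xproj = transform(mach, X)           # k = 2 recovered components (up to order and scale)
```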
@@ -718,8 +718,7 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Operations
 
-- `transform(mach, Xnew)`: Return lower dimensional projection of the target given new
-  features `Xnew` having scitype as `X` above.
+- `transform(mach, Xnew)`: Return a lower dimensional projection of the input `Xnew` having the same scitype as `X` above.
 - `predict(mach, Xnew)`: Return predictions of the target given
   features `Xnew` having the same scitype as `X` above. Predictions
   are probabilistic.
@@ -761,7 +760,7 @@ X, y = @load_iris
 model = LDA()
 mach = machine(model, X, y) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
 ```
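Since `predict` is documented above as probabilistic, a short sketch, not part of the commit, of how those predictions are typically consumed; the class name passed to `pdf` is an assumption taken from the iris labels:

```julia
using MLJ
LDA = @load LDA pkg=MultivariateStats

X, y = @load_iris
mach = machine(LDA(), X, y) |> fit!

y_hat = predict(mach, X)                 # vector of per-observation class distributions
p_virginica = pdf.(y_hat, "virginica")   # probability assigned to one class
labels = predict_mode(mach, X)           # most probable labels
```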
@@ -828,8 +827,7 @@ value `regcoef * eigmax(Sw)` where `Sw` is the within-class covariance estimator
 
 # Operations
 
-- `transform(mach, Xnew)`: Return lower dimensional projection of the target given new
-  features `Xnew` having scitype as `X` above.
+- `transform(mach, Xnew)`: Return a lower dimensional projection of the input `Xnew` having the same scitype as `X` above.
 - `predict(mach, Xnew)`: Return predictions of the target given
   features `Xnew` having the same scitype as `X` above. Predictions
   are probabilistic.
@@ -872,7 +870,7 @@ X, y = @load_iris
 model = BLDA()
 mach = machine(model, X, y) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
 ```
@@ -928,8 +926,7 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Operations
 
-- `transform(mach, Xnew)`: Return lower dimensional projection of the target given new
-  features `Xnew` having scitype as `X` above.
+- `transform(mach, Xnew)`: Return a lower dimensional projection of the input `Xnew` having the same scitype as `X` above.
 - `predict(mach, Xnew)`: Return predictions of the target given
   features `Xnew` having the same scitype as `X` above. Predictions
   are probabilistic.
@@ -970,7 +967,7 @@ X, y = @load_iris
 model = sLDA()
 mach = machine(model, X, y) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
 ```
@@ -1027,8 +1024,7 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Operations
 
-- `transform(mach, Xnew)`: Return lower dimensional projection of the target given new
-  features `Xnew` having scitype as `X` above.
+- `transform(mach, Xnew)`: Return a lower dimensional projection of the input `Xnew` having the same scitype as `X` above.
 - `predict(mach, Xnew)`: Return predictions of the target given
   features `Xnew` having the same scitype as `X` above. Predictions
   are probabilistic.
@@ -1069,7 +1065,7 @@ X, y = @load_iris
 model = bsLDA()
 mach = machine(model, X, y) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
 ```
@@ -1101,8 +1097,9 @@ Train the machine using `fit!(mach, rows=...)`.
 # Hyper-parameters
 
 - `method::Symbol=:cm`: Method to use to solve the problem, one of `:ml`, `:em`, `:bayes`.
-- `maxoutdim::Int=0`: Maximum number of output dimensions, uses max(no_of_features - 1, 1)
-  if 0 (default).
+- `maxoutdim=0`: Controls the dimension (number of columns) of the output,
+  `outdim`. Specifically, `outdim = min(n, indim, maxoutdim)`, where `n` is the
+  number of observations and `indim` the input dimension.
 - `maxiter::Int=1000`: Maximum number of iterations.
 - `tol::Real=1e-6`: Convergence tolerance.
 - `eta::Real=tol`: Variance lower bound.
@@ -1113,8 +1110,7 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Operations
 
-- `transform(mach, Xnew)`: Return predictions of the target given new
-  features `Xnew` having the same scitype as `X` above.
+- `transform(mach, Xnew)`: Return a lower dimensional projection of the input `Xnew` having the same scitype as `X` above.
 
 # Fitted parameters
 
@@ -1127,8 +1123,8 @@ The fields of `fitted_params(mach)` are:
 
 The fields of `report(mach)` are:
 
-- `indim`: Dimensions of the provided data.
-- `outdim`: Dimensions of the transformed result.
+- `indim`: The input dimension.
+- `outdim`: `min(n, indim, maxoutdim)`, where `n` is the number of observations.
 - `variance`: The variance of the factors.
 - `covariance_matrix`: The estimated covariance matrix.
 - `mean`: The mean vector.
@@ -1146,7 +1142,7 @@ X, y = @load_iris
 model = FA(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 ```
 
 See also
@@ -1177,8 +1173,9 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Hyper-parameters
 
-- `maxoutdim::Int=0`: The maximum number of output dimensions, uses max(no_of_features - 1, 1)
-  if 0 (default).
+- `maxoutdim=0`: Controls the dimension (number of columns) of the output,
+  `outdim`. Specifically, `outdim = min(n, indim, maxoutdim)`, where `n` is the
+  number of observations and `indim` the input dimension.
 - `method::Symbol=:ml`: The method to use to solve the problem, one of `:ml`, `:em`, `:bayes`.
 - `maxiter::Int=1000`: The maximum number of iterations.
 - `tol::Real=1e-6`: The convergence tolerance.
@@ -1189,8 +1186,7 @@ Train the machine using `fit!(mach, rows=...)`.
 
 # Operations
 
-- `transform(mach, Xnew)`: Return predictions of the target given new
-  features `Xnew` having the same Scitype as `X` above.
+- `transform(mach, Xnew)`: Return a lower dimensional projection of the input `Xnew` having the same scitype as `X` above.
 
 # Fitted parameters
 
@@ -1203,8 +1199,8 @@ The fields of `fitted_params(mach)` are:
 
 The fields of `report(mach)` are:
 
-- `indim`: Dimensions of the provided data.
-- `outdim`: Dimensions of the transformed result.
+- `indim`: The input dimension.
+- `outdim`: `min(n, indim, maxoutdim)`, where `n` is the number of observations.
 - `tvat`: The variance of the components.
 - `loadings`: The model's loadings, weights for each variable used when calculating
   principal components.
@@ -1221,7 +1217,7 @@ X, y = @load_iris
 model = PPCA(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-projection = transform(mach, X)
+Xproj = transform(mach, X)
 ```
 
 See also
