
Commit bc97ab7: "hot fix"

1 parent 06455a4 commit bc97ab7

3 files changed: +175 -4 lines changed


Project.toml

Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
 name = "MLJMultivariateStatsInterface"
 uuid = "1b6a4a23-ba22-4f51-9698-8599985d3728"
 authors = ["Anthony D. Blaom <[email protected]>", "Thibaut Lienart <[email protected]>", "Okon Samuel <[email protected]>"]
-version = "0.3.1"
+version = "0.3.2"
 
 [deps]
 Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"

src/MLJMultivariateStatsInterface.jl

Lines changed: 173 additions & 3 deletions

@@ -468,10 +468,24 @@ Where
 
 # Fitted parameters
 
-TODO: Example, coeff, report
-
 The fields of `fitted_params(mach)` are:
 
+- `projection`: Returns the projection matrix (of size `(d, p)`).
+  Each column of the projection matrix corresponds to a principal component.
+  The principal components are arranged in descending order of
+  the corresponding variances.
+
+# Report
+
+The fields of `report(mach)` are:
+
+- `indim`: Dimensions of the provided data.
+- `outdim`: Dimensions of the transformed result.
+- `tprincipalvar`: Total variance of the principal components.
+- `tresidualvar`: Total residual variance.
+- `tvar`: Total observation variance (principal + residual variance).
+- `mean`: The mean vector (of length `d`).
+- `principalvars`: The variance of the principal components.
 
 # Examples
 
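For orientation, here is a short sketch (not part of the commit) of how the fitted parameters and report fields documented in this hunk can be inspected once a machine is fitted. It assumes MLJ and this interface package are installed and follows the PCA example from the docstring below.

```
using MLJ

PCA = @load PCA pkg=MultivariateStats
X, _ = @load_iris

mach = machine(PCA(maxoutdim=2), X) |> fit!

W = fitted_params(mach).projection     # (d, p) projection matrix, one column per component
r = report(mach)                       # named tuple with the fields listed above
explained = r.tprincipalvar / r.tvar   # fraction of the total variance retained
@show size(W) r.indim r.outdim explained
```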
@@ -481,16 +495,172 @@ using MLJ
 PCA = @load PCA pkg=MultivariateStats
 
 X, y = @load_iris
+
 model = PCA(maxoutdim=2)
-mach = machine(pca, X) |> fit!
+mach = machine(model, X) |> fit!
 
+projection = transform(mach, X)
 ```
 
 See also
 TODO: ADD REFERENCES
 """
 PCA
+"""
+$(MMI.doc_header(KernelPCA))
+
+`KernelPCA`: kernel principal component analysis. Using a kernel function, the
+data are implicitly mapped into a higher-dimensional feature space and PCA is
+performed there, yielding a non-linear dimensionality reduction that preserves
+most of the initial variance.
+
+# Training data
+
+In MLJ or MLJBase, bind an instance `model` to data with
+    mach = machine(model, X)
+
+Where
+
+- `X`: is any table of input features (e.g., a `DataFrame`) whose columns
+  are of scitype `Continuous`; check the scitype with `schema(X)`
+
+# Hyper-parameters
+
+- `maxoutdim=0`: The maximum number of output dimensions. The default of 0
+  keeps all components (i.e., the number of components/output dimensions is
+  equal to the smallest dimension of the training matrix).
+- `kernel::Function=(x,y)->x'y`: The kernel function: takes two vector arguments
+  `x` and `y` and returns a scalar. Defaults to the dot product of `x` and `y`.
+- `solver::Symbol=:eig`: solver to use for the eigenvalues, one of `:eig` (default)
+  or `:eigs`.
+- `inverse::Bool=true`: perform the calculations needed for an inverse transform.
+- `beta::Real=1.0`: strength of the ridge regression that learns the inverse
+  transform when `inverse=true`.
+- `tol::Real=0.0`: convergence tolerance for the `:eigs` solver.
+- `maxiter::Int=300`: maximum number of iterations for the `:eigs` solver.
+
+# Operations
+
+- `transform(mach, Xnew)`: Return a lower-dimensional projection of the new
+  features `Xnew`, which should have the same scitype as `X` above.
+
+# Fitted parameters
+
+The fields of `fitted_params(mach)` are:
+
+- `projection`: Returns the projection matrix (of size `(d, p)`).
+  Each column of the projection matrix corresponds to a principal component.
+  The principal components are arranged in descending order of
+  the corresponding variances.
+
+# Report
+
+The fields of `report(mach)` are:
+
+- `indim`: Dimensions of the provided data.
+- `outdim`: Dimensions of the transformed result.
+- `principalvars`: The variance of the principal components.
+
+# Examples
+
+```
+using MLJ
+using LinearAlgebra
+
+KPCA = @load KernelPCA pkg=MultivariateStats
+
+X, y = @load_iris
+
+function rbf_kernel(length_scale)
+    return (x, y) -> exp(-norm(x - y)^2 / (2 * length_scale^2))
+end
+
+model = KPCA(maxoutdim=2, kernel=rbf_kernel(1))
+mach = machine(model, X) |> fit!
+
+projection = transform(mach, X)
+```
+
+See also
+TODO: ADD REFERENCES
+"""
 KernelPCA
+"""
+$(MMI.doc_header(ICA))
+
+`ICA`: independent component analysis. Learns a linear transformation that
+separates the observed multivariate signal into additive components that are
+maximally statistically independent.
+
+# Training data
+
+In MLJ or MLJBase, bind an instance `model` to data with
+    mach = machine(model, X)
+
+Where
+
+- `X`: is any table of input features (e.g., a `DataFrame`) whose columns
+  are of scitype `Continuous`; check the scitype with `schema(X)`
+
+# Hyper-parameters
+
+- `k=0`: The number of independent components to recover. The default of 0
+  sets the number of components to the smallest dimension of the training matrix.
+- `alg=:fastica`: The algorithm used to estimate the components (currently only
+  `:fastica` is supported).
+- `fun=:tanh`: The approximate neg-entropy function, one of `:tanh` or `:gaus`.
+- `do_whiten=true`: Whether to perform pre-whitening of the data.
+- `maxiter=100`: The maximum number of iterations.
+- `tol=1e-6`: The convergence tolerance for the change in the unmixing matrix `W`.
+- `winit=nothing`: Optional initial guess for the unmixing matrix `W`.
+- `mean=nothing`: Optionally, a pre-computed mean vector to use; specify `0`
+  if the data are already centered.
+
+# Operations
+
+- `transform(mach, Xnew)`: Return a lower-dimensional projection of the new
+  features `Xnew`, which should have the same scitype as `X` above.
+
+# Fitted parameters
+
+The fields of `fitted_params(mach)` are:
+
+- `projection`: The estimated component (unmixing) matrix.
+
+# Report
+
+The fields of `report(mach)` are:
+
+- `indim`: Dimensions of the provided data.
+- `outdim`: Dimensions of the transformed result.
+- `mean`: The mean vector of the training data.
+
+# Examples
+
+```
+using MLJ
+
+ICA = @load ICA pkg=MultivariateStats
+
+X, y = @load_iris
+
+model = ICA(k=2)
+mach = machine(model, X) |> fit!
+
+X_transformed = transform(mach, X)
+```
+
+See also
+TODO: ADD REFERENCES
+"""
 ICA
 LDA
 BayesianLDA
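As a complement to the RBF example in the KernelPCA docstring above, here is a minimal sketch (not part of the commit) of passing a different user-defined kernel. The name `poly_kernel` is illustrative; per the `kernel` hyper-parameter documented above, the only requirement is a two-argument function returning a scalar.

```
using MLJ

KernelPCA = @load KernelPCA pkg=MultivariateStats

X, _ = @load_iris

# Degree-2 polynomial kernel; any (x, y) -> scalar function is accepted.
poly_kernel(x, y) = (x'y + 1)^2

mach = machine(KernelPCA(maxoutdim=2, kernel=poly_kernel), X) |> fit!
Xproj = transform(mach, X)
```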

src/models/decomposition_models.jl

Lines changed: 1 addition & 0 deletions

@@ -43,6 +43,7 @@ function MMI.fit(model::PCA, verbosity::Int, X)
     )
     cache = nothing
     report = (
+        # TODO: Make PR to MultivariateStats
         indim=MS.size(fitresult)[1],
         outdim=MS.size(fitresult)[2],
         tprincipalvar=MS.tprincipalvar(fitresult),
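For context on the MultivariateStats accessors used to assemble this report, here is a sketch (not part of the commit) that calls MultivariateStats directly rather than going through the MLJ machine; the random data is purely illustrative.

```
using MultivariateStats, Random

Random.seed!(0)
X = randn(5, 100)            # MultivariateStats expects variables × observations
M = fit(PCA, X; maxoutdim=2)

indim, outdim = size(M)      # the pair stored as `indim`/`outdim` in the report
tpv = tprincipalvar(M)       # variance captured by the retained components
trv = tresidualvar(M)        # variance left in the discarded components
tv  = tvar(M)                # total variance; tpv + trv ≈ tv
pvars = principalvars(M)     # per-component variances; sum(pvars) ≈ tpv
```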
