@@ -463,7 +463,7 @@ Where
463463
464464# Operations
465465
466- - `transform(mach, Xnew)`: Return predictions of the target given new
466+ - `transform(mach, Xnew)`: Return a lower-dimensional projection of the input, given new
467467 features `Xnew` having the same Scitype as `X` above.
468468
469469# Fitted parameters
509509"""
510510$(MMI.doc_header(KernelPCA))
511511
512- `KernelPCA` Principal component analysis. Learns a linear transformation to
513- project the data on a lower dimensional space while preserving most of the initial
514- variance.
512+ `KernelPCA`: Kernel principal component analysis. Using a kernel, the linear
513+ operations of PCA are performed in a [reproducing kernel Hilbert space](https://en.wikipedia.org/wiki/Reproducing_kernel_Hilbert_space).
515514
516515# Training data
517516
@@ -587,9 +586,9 @@ KernelPCA
587586"""
588587$(MMI.doc_header(ICA))
589588
590- `ICA` Principal component analysis. Learns a linear transformation to
591- project the data on a lower dimensional space while preserving most of the initial
592- variance .
589+ `ICA` is a computational technique for separating a multivariate signal into
590+ additive subcomponents, with the assumption that the subcomponents are
591+ non-Gaussian and independent from each other.
593592
594593# Training data
595594
@@ -603,56 +602,51 @@ Where
603602
604603# Hyper-parameters
605604
606- - `maxoutdim=0`: The maximum number of output dimensions. If not set, defaults to
607- 0, where all components are kept (e.g., the number of components/output dimensions
608- is equal to the size of the smallest dimension of the training matrix).
609- - `kernel::Function=(x,y)->x'y`: The kernel function, takes in 2 vector arguments
610- x and y, returns a scalar value. Defaults to the dot product of X and Y.
611- - `solver::Symbol=:auto`: solver to use for the eigenvalues, one of `:eig`(default),
612- `:eigs`.
613- - `inverse::Bool=true`: perform calculations needed for inverse transform
614- - `beta::Real=1.0`: strength of the ridge regression that learns the inverse transform
615- when inverse is true.
616- - `tol::Real=0.0`: Convergence tolerance for eigs solver.
617- - `maxiter::Int=300`: maximum number of iterations for eigs solver.
605+ - `k::Int=0`: The number of independent components to recover, set automatically if `0`.
606+ - `alg::Symbol=:fastica`: The algorithm to use (only `:fastica` is supported at the moment).
607+ - `fun::Symbol=:tanh`: The approximate negentropy function, one of `:tanh`, `:gaus`.
608+ - `do_whiten::Bool=true`: Whether or not to perform pre-whitening.
609+ - `maxiter::Int=100`: The maximum number of iterations.
610+ - `tol::Real=1e-6`: The convergence tolerance for change in matrix W.
611+ - `mean::Union{Nothing, Real, Vector{Float64}}=nothing`: The mean to use. If `nothing` (default),
612+ the mean is computed and centering is applied; if `0`, no centering is applied. Alternatively,
613+ a vector of means can be passed.
614+ - `winit::Union{Nothing,Matrix{<:Real}}=nothing`: Initial guess for the matrix `W`: either
615+ an empty matrix (random initialization of `W`), a matrix of size `k × k` (if `do_whiten`
616+ is true), or a matrix of size `m × k` otherwise. If unspecified, i.e. `nothing`, an empty
617+ `Matrix{<:Real}` is used.
618618
619619# Operations
620620
621- - `transform(mach, Xnew)`: Return predictions of the target given new
621+ - `transform(mach, Xnew)`: Return a lower-dimensional projection of the input, given new
622622 features `Xnew` having the same Scitype as `X` above.
623623
624624# Fitted parameters
625625
626626The fields of `fitted_params(mach)` are:
627627
628- - `projection`: Returns the projection matrix (of size `(d, p)`).
629- Each column of the projection matrix corresponds to a principal component.
630- The principal components are arranged in descending order of
631- the corresponding variances.
628+ BUG: Does not have a projection class. It would also be cool to see the whitened
629+ matrix in fitted_params, to show how the covariance is the identity
632630
633631# Report
634632
635633The fields of `report(mach)` are:
636634
637635- `indim`: Dimensions of the provided data.
638636- `outdim`: Dimensions of the transformed result.
639- - `principalvars `: The variance of the principal components .
637+ - `mean`: The mean vector.
640638
641639# Examples
642640
643641```
644642using MLJ
645643using LinearAlgebra
646644
647- KPCA = @load KernelPCA pkg=MultivariateStats
645+ ICA = @load ICA pkg=MultivariateStats
648646
649647X, y = @load_iris
650648
651- function rbf_kernel(length_scale)
652- return (x,y) -> norm(x-y)^2 / ((2 * length_scale)^2)
653- end
654-
655- model = KPCA(maxoutdim=2, kernel = rbf_kernel(1))
649+ model = ICA(k = 2, tol=0.1)
656650mach = machine(model, X) |> fit!
657651
658652projection = transform(mach, X)
@@ -662,6 +656,88 @@ See also
662656TODO: ADD REFERENCES
663657"""
664658ICA
659+ """
660+ $(MMI.doc_header(LDA))
661+
662+ `LDA`: Multiclass linear discriminant analysis. The algorithm learns a
663+ projection matrix `P` that projects a feature matrix `Xtrain` onto a lower dimensional
664+ space of dimension `out_dim` such that the trace of the transformed between-class
665+ scatter matrix (`Pᵀ*Sb*P`) is maximized relative to the trace of the transformed
666+ within-class scatter matrix (`Pᵀ*Sw*P`). The projection matrix is scaled such that
667+ `Pᵀ*Sw*P=I` or `Pᵀ*Σw*P=I` (where `Σw` is the within-class covariance matrix).
668+ Predicted class posterior probabilities for a feature matrix `Xtest` are derived by
669+ applying a softmax transformation to a matrix `Pr`, such that rowᵢ of `Pr` contains
670+ computed distances (based on a distance metric) in the transformed space of rowᵢ in
671+ `Xtest` to the centroid of each class.
672+
673+ # Training data
674+
675+ In MLJ or MLJBase, bind an instance `model` to data with
676+ mach = machine(model, X, y)
677+
678+ Where
679+
680+ - `X`: is any table of input features (e.g., a `DataFrame`) whose columns
681+ are of scitype `Continuous`; check the scitype with `schema(X)`
682+
683+ # Hyper-parameters
684+
685+ - `method::Symbol=:gevd`: The solver, one of `:gevd` or `:whiten`.
686+ - `cov_w::CovarianceEstimator=SimpleCovariance`: An estimator for the within-class
687+ covariance (used in computing the within-class scatter matrix, `Sw`), by default set
688+ to the standard `MultivariateStats.SimpleCovariance()` but
689+ could be set to any robust estimator from `CovarianceEstimation.jl`.
690+ - `cov_b::CovarianceEstimator=SimpleCovariance`: The same as `cov_w` but for the between-class
691+ covariance (used in computing the between-class scatter matrix, `Sb`).
692+ - `out_dim::Int=0`: The output dimension, i.e. the dimension of the transformed space,
693+ automatically set if `0` is given (default).
694+ - `regcoef::Float64=1e-6`: The regularization coefficient. A positive
695+ value `regcoef * eigmax(Sw)`, where `Sw` is the within-class scatter matrix, is added
696+ to the diagonal of `Sw` to improve numerical stability. This can be useful if using
697+ the standard covariance estimator.
698+ - `dist::SemiMetric=SqEuclidean`: The distance metric to use when performing classification
699+ (to compare the distance between a new point and centroids in the transformed space);
700+ an alternative choice is `CosineDist`.
701+
702+ # Operations
703+
704+ - `transform(mach, Xnew)`: Return a lower-dimensional projection of the new
705+ features `Xnew`, which must have the same scitype as `X` above.
706+
707+ # Fitted parameters
708+
709+ The fields of `fitted_params(mach)` are:
710+
711+ BUG: Does not have a projection class. It would also be cool to see the whitened
712+ matrix in fitted_params, to show how the covariance is the identity
713+
714+ # Report
715+
716+ The fields of `report(mach)` are:
717+
718+ - `indim`: Dimensions of the provided data.
719+ - `outdim`: Dimensions of the transformed result.
720+ - `mean`: The mean vector.
721+
722+ # Examples
723+
724+ ```
725+ using MLJ
726+ using LinearAlgebra
727+
728+ LDA = @load LDA pkg=MultivariateStats
729+
730+ X, y = @load_iris
731+
732+ model = LDA()
733+ mach = machine(model, X, y) |> fit!
734+
735+ projection = transform(mach, X)
736+ ```
737+
738+ See also
739+ TODO: ADD REFERENCES
740+ """
665741LDA
666742BayesianLDA
667743SubspaceLDA
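
The note under the ICA "Fitted parameters" heading mentions wanting to show that the whitened data has identity covariance. A minimal sketch of that property, using only the Julia standard library, might look like the following. It does not call `MultivariateStats` or reproduce its internals: the toy mixing matrix `A`, the variable names, and the eigendecomposition-based whitening are all assumptions made for illustration.

```julia
using LinearAlgebra
using Statistics
using Random

Random.seed!(1)

# Toy data (assumed for illustration): 3 correlated features, 500 observations,
# stored one observation per column.
A = [2.0 0.5 0.0; 0.5 1.0 0.3; 0.0 0.3 0.5]
X = A * randn(3, 500)

# Center each feature (row) so the data has zero mean.
Xc = X .- mean(X, dims=2)

# Sample covariance of the features (columns are observations).
C = cov(Xc, dims=2)

# Whitening matrix from the eigendecomposition C = E * Diagonal(d) * E'.
d, E = eigen(Symmetric(C))
W = Diagonal(1 ./ sqrt.(d)) * E'

# Whitened data: uncorrelated features with unit variance.
Z = W * Xc
whitened_cov = cov(Z, dims=2)

# Check that the covariance of the whitened data is numerically the identity.
is_identity = isapprox(whitened_cov, Matrix{Float64}(I, 3, 3); atol=1e-8)
println("covariance of whitened data is identity: ", is_identity)
```

Exposing a matrix like `W` (or the whitened data) in `fitted_params` would let users run this kind of check directly on a fitted machine.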
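
The LDA docstring describes class probabilities as a softmax over distances to class centroids in the transformed space. The sketch below illustrates that idea in plain Julia. The helper names `softmax` and `class_probabilities` are hypothetical, squared Euclidean distance is assumed as the metric, and negating the distances before the softmax (so that closer centroids receive higher probability) is an assumption about the sign convention, not something the diff states.

```julia
# Numerically stable softmax over a vector of scores.
function softmax(scores::AbstractVector{<:Real})
    shifted = scores .- maximum(scores)
    e = exp.(shifted)
    return e ./ sum(e)
end

# Hypothetical helper: for each row of `Z` (observations already projected into
# the transformed space), compute a probability for each class centroid (rows of
# `centroids`).
function class_probabilities(Z::AbstractMatrix, centroids::AbstractMatrix)
    n, c = size(Z, 1), size(centroids, 1)
    probs = zeros(n, c)
    for i in 1:n
        # Squared Euclidean distance from observation i to every centroid.
        d = [sum(abs2, Z[i, :] .- centroids[j, :]) for j in 1:c]
        # Assumed convention: negate distances so nearer centroids score higher.
        probs[i, :] = softmax(-d)
    end
    return probs
end

# Toy centroids and projected points, made up for illustration.
centroids = [0.0 0.0; 4.0 0.0; 0.0 4.0]
Z = [0.1 -0.2; 3.8 0.3; 0.2 3.9]

probs = class_probabilities(Z, centroids)
predicted = [argmax(probs[i, :]) for i in 1:size(Z, 1)]
println(predicted)
```

Each row of `probs` sums to one, and the most probable class for a point is the one whose centroid is nearest under the chosen metric.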