|
6 | 6 | Multioutput feature selection
|
7 | 7 | ==============================
|
8 | 8 |
|
9 |
| -We can use :class:`FastCan` to handle multioutput feature selection. |
| 9 | +We can use :class:`FastCan` to handle multioutput feature selection, which means |
| 10 | +target ``y`` can be a matrix. For regression, :class:`FastCan` can be used for |
| 11 | +MIMO (Multi-Input Multi-Output) data. For classification, it can be used for |
| 12 | +multilabel data. Multioutput feature selection can also be useful for multiclass |
| 13 | +classification, which has one output with multiple categories. A multiclass |
| 14 | +problem can be converted to a multilabel problem by one-hot encoding the |
| 15 | +target ``y``. The canonical correlation coefficient between the features ``X`` and the |
| 16 | +one-hot encoded target ``y`` is equivalent, up to a monotonic transformation, to Fisher's criterion in |
| 17 | +LDA (Linear Discriminant Analysis) [1]_. Applying :class:`FastCan` to the converted |
| 18 | +multioutput data may result in better accuracy in the subsequent classification task |
| 19 | +than applying it directly to the original single-label data. See Figure 5 in [2]_. |
| 20 | + |
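The conversion from a multiclass target to a one-hot encoded (multilabel) target described above can be sketched as follows. This is a minimal illustration that uses only NumPy; the label values and variable names are made up for the example, and the resulting matrix ``Y`` is what would be passed as the multioutput target in place of the original vector ``y``.

```python
import numpy as np

# Hypothetical multiclass target with three categories.
y = np.array(["a", "c", "b", "a", "c"])

# Map each label to an integer index, then pick the matching row of an
# identity matrix to obtain the one-hot encoded target matrix.
classes, idx = np.unique(y, return_inverse=True)
Y = np.eye(len(classes))[idx]

# Y has one row per sample and one column per class.
print(Y.shape)
```

Each row of ``Y`` contains a single 1 in the column of the sample's class, so the single multiclass output becomes one binary output per class.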
| 21 | +Relationship on multiclass data |
| 22 | +------------------------------- |
| 23 | +Assume the feature matrix is :math:`X \in \mathbb{R}^{N\times n}`, the multiclass |
| 24 | +target vector is :math:`y \in \mathbb{R}^{N\times 1}`, and the one-hot encoded target |
| 25 | +matrix is :math:`Y \in \mathbb{R}^{N\times m}`. Let :math:`J` denote Fisher's criterion for |
| 26 | +:math:`X` and :math:`y`, and let :math:`R` denote the canonical correlation |
| 27 | +coefficient between :math:`X` and :math:`Y`. The relationship |
| 28 | +between :math:`J` and :math:`R` is given by |
| 29 | + |
| 30 | +.. math:: |
| 31 | + J = \frac{R^2}{1-R^2} |
| 32 | +
|
| 33 | +or |
| 34 | + |
| 35 | +.. math:: |
| 36 | + R^2 = \frac{J}{1+J} |
| 37 | +
|
| 38 | +Note that there can be more than one Fisher's criterion value and more than one canonical |
| 39 | +correlation coefficient. The number of non-zero canonical |
| 40 | +correlation coefficients is no more than :math:`\min (n, m)`, and each canonical correlation |
| 41 | +coefficient corresponds one-to-one to a Fisher's criterion value. |
| 42 | + |
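The relationship :math:`J = R^2 / (1 - R^2)` can be checked numerically. The sketch below is not taken from the library; it uses only NumPy and synthetic data (the sample size, class means, and variable names are assumptions for illustration). It takes Fisher's criterion values as the eigenvalues of :math:`S_w^{-1} S_b`, where :math:`S_w` and :math:`S_b` are the within-class and between-class scatter matrices, and computes the canonical correlations from orthonormal bases of the centered ``X`` and the centered one-hot ``Y``.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 300, 4, 3

# Synthetic multiclass data: each class has its own mean vector.
y = rng.integers(0, n_classes, size=n_samples)
means = rng.normal(size=(n_classes, n_features))
X = rng.normal(size=(n_samples, n_features)) + means[y]
Y = np.eye(n_classes)[y]  # one-hot encoded target

# Canonical correlations: singular values of Qx^T Qy, where Qx and Qy are
# orthonormal bases of the centered X and the centered Y.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
Qx, _ = np.linalg.qr(Xc)
# The centered one-hot matrix is rank deficient (its columns sum to zero),
# so keep only the directions with non-negligible singular values.
U, s, _ = np.linalg.svd(Yc, full_matrices=False)
Qy = U[:, s > 1e-10 * s.max()]
R = np.linalg.svd(Qx.T @ Qy, compute_uv=False)

# Fisher's criterion values: eigenvalues of S_w^{-1} S_b.
S_w = np.zeros((n_features, n_features))
S_b = np.zeros((n_features, n_features))
overall_mean = X.mean(axis=0)
for k in range(n_classes):
    Xk = X[y == k]
    mu_k = Xk.mean(axis=0)
    S_w += (Xk - mu_k).T @ (Xk - mu_k)
    diff = (mu_k - overall_mean)[:, None]
    S_b += len(Xk) * (diff @ diff.T)
eigvals = np.linalg.eigvals(np.linalg.solve(S_w, S_b)).real
J = np.sort(eigvals)[::-1][: len(R)]  # largest values, matching R's order

print(np.allclose(J, R**2 / (1 - R**2)))
```

The comparison pairs the largest Fisher's criterion value with the largest canonical correlation, the second largest with the second largest, and so on, which reflects the one-to-one correspondence noted above.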
| 43 | +.. rubric:: References |
| 44 | + |
| 45 | +.. [1] `"Orthogonal least squares based fast feature selection for |
| 46 | + linear classification" <https://doi.org/10.1016/j.patcog.2021.108419>`_ |
| 47 | + Zhang, S., & Lang, Z. Q. Pattern Recognition, 123, 108419 (2022). |
| 48 | +
|
| 49 | +.. [2] `"Canonical-correlation-based fast feature selection for structural |
| 50 | + health monitoring" <https://doi.org/10.1016/j.ymssp.2024.111895>`_ |
| 51 | + Zhang, S., Wang, T., Worden, K., Sun, L., & Cross, E. J. |
| 52 | + Mechanical Systems and Signal Processing, 223, 111895 (2025). |
| 53 | +
|
| 54 | +.. rubric:: Examples |
| 55 | + |
| 56 | +* See :ref:`sphx_glr_auto_examples_plot_fisher.py` for an example of |
| 57 | + the equivalent relationship between CCA and LDA on multiclass data. |