Robust principal component analysis via Principal Component Pursuit (PCP) with a scikit-learn transformer interface.
pip install skpcp
Principal Component Pursuit (PCP) is a method for decomposing a data matrix X into a low-rank component L and a sparse component S, i.e., X = L + S. The skpcp package provides an implementation of PCP with a scikit-learn compatible transformer interface.
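At its core, the algorithm solves the following optimization problem:
$$ \min_{L,S} \|L\|_* + \lambda \|S\|_1 \quad \text{s.t.} \quad X = L + S $$
where $\|L\|_*$ is the nuclear norm (sum of singular values) of L, $\|S\|_1$ is the entrywise $\ell_1$ norm (sum of absolute values) of S, $\lambda > 0$ weights the sparse term against the low-rank term, and X is the input data matrix.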
We refer readers to the original paper by Candès et al. (2011), "Robust Principal Component Analysis?", for more details.
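To make the optimization problem concrete, here is a minimal NumPy sketch of the augmented Lagrangian iteration described by Candès et al. (2011). It is an illustration only and not necessarily what skpcp does internally. The function names (`pcp_alm`, `soft_threshold`, `singular_value_threshold`) are hypothetical; the defaults for `lam` and `mu` follow the choices suggested in the paper.

```python
import numpy as np


def soft_threshold(A, tau):
    """Entrywise shrinkage: the proximal operator of tau * (l1 norm)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)


def singular_value_threshold(A, tau):
    """Shrink the singular values: the proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt


def pcp_alm(X, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Illustrative PCP solver (hypothetical helper, not the skpcp API).

    Alternates a low-rank update, a sparse update, and a dual ascent step
    on the multiplier Y of the constraint X = L + S.
    Assumes X is a nonzero 2-D array.
    """
    X = np.asarray(X, dtype=float)
    n1, n2 = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))        # weight suggested in the paper
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(X).sum())  # penalty suggested in the paper
    S = np.zeros_like(X)
    Y = np.zeros_like(X)
    norm_X = np.linalg.norm(X)
    for _ in range(max_iter):
        L = singular_value_threshold(X - S + Y / mu, 1.0 / mu)
        S = soft_threshold(X - L + Y / mu, lam / mu)
        residual = X - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_X:
            break
    return L, S


# Small synthetic check: rank-2 matrix plus roughly 5% large corruptions.
rng = np.random.default_rng(0)
L_true = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 60))
S_true = (rng.random((60, 60)) < 0.05) * rng.normal(scale=10, size=(60, 60))
L_hat, S_hat = pcp_alm(L_true + S_true)
rel_err = np.linalg.norm(L_hat - L_true) / np.linalg.norm(L_true)
print(f"relative error in L: {rel_err:.2e}")
```

Each iteration costs one SVD of a matrix the size of X, so this sketch is only practical for small to moderate matrices.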
import numpy as np
from skpcp import PCP
# Generate synthetic data with low-rank and sparse components
RNG = np.random.default_rng(42)
n_samples, n_features, rank = 100, 50, 5
L = np.dot(RNG.normal(size=(n_samples, rank)), RNG.normal(size=(rank, n_features))) # Low rank component
S = RNG.binomial(1, 0.1, size=(n_samples, n_features)) * RNG.normal(loc=0, scale=10, size=(n_samples, n_features)) # Sparse component
X = L + S
# Fit PCP model
pcp = PCP()
pcp.fit(X)
L_est = pcp.low_rank_ # Estimated low-rank component
S_est = pcp.sparse_ # Estimated sparse component
Alternatively, you can use the fit_transform method to fit the model and obtain the low-rank component in one step:
L_est = pcp.fit_transform(X)
Note that the fit method decomposes the input data matrix X into its low-rank component L_est and sparse component S_est.
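One way to sanity-check such a decomposition is to look at the numerical rank of the low-rank part, the fraction of nonzero entries in the sparse part, and how closely the two parts sum to X. The helper below, `decomposition_report`, is a hypothetical name and not part of skpcp; the snippet uses the ground-truth L and S as stand-ins for the estimates so that it runs with NumPy alone.

```python
import numpy as np


def decomposition_report(X, L_est, S_est, zero_tol=1e-8, rank_tol=None):
    """Summarize a low-rank plus sparse split of X (hypothetical helper)."""
    residual = np.linalg.norm(X - L_est - S_est) / np.linalg.norm(X)
    return {
        # Numerical rank; pass rank_tol to ignore tiny singular values
        # left behind by an iterative solver.
        "rank": int(np.linalg.matrix_rank(L_est, tol=rank_tol)),
        # Share of entries in S_est whose magnitude exceeds zero_tol.
        "nonzero_fraction": float(np.mean(np.abs(S_est) > zero_tol)),
        # How well L_est + S_est reproduces X.
        "relative_residual": float(residual),
    }


# Same synthetic setup as the example above. The true L and S stand in for
# pcp.low_rank_ and pcp.sparse_ so the snippet does not require skpcp.
rng = np.random.default_rng(42)
n_samples, n_features, rank = 100, 50, 5
L = rng.normal(size=(n_samples, rank)) @ rng.normal(size=(rank, n_features))
S = rng.binomial(1, 0.1, size=(n_samples, n_features)) * rng.normal(
    scale=10, size=(n_samples, n_features)
)
X = L + S

report = decomposition_report(X, L, S)
print(report)
```

With a fitted model, the same call would take pcp.low_rank_ and pcp.sparse_ in place of L and S.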
The behavior of the transform method of PCP differs from that of a typical scikit-learn transformer in that it accepts only the same data matrix X that was used in fit. You cannot pass a new data matrix to transform, as the decomposition is specific to the input data used in fit.
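The toy class below illustrates what a fit-bound transform means in practice. It is a hypothetical sketch, not skpcp code: the class name `FitBoundDecomposer`, the truncated SVD used as a placeholder for PCP, and the choice to raise `ValueError` are all assumptions made for illustration. How skpcp itself reacts to a different matrix is not stated here.

```python
import numpy as np


class FitBoundDecomposer:
    """Hypothetical toy class whose transform is tied to the fitted data."""

    def __init__(self, rank=1):
        self.rank = rank

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        # Placeholder decomposition (truncated SVD) standing in for PCP.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        k = self.rank
        self.low_rank_ = (U[:, :k] * s[:k]) @ Vt[:k]
        self._X_fit = X.copy()  # remember what was decomposed
        return self

    def transform(self, X):
        # The stored decomposition describes only the matrix seen in fit.
        if not np.array_equal(np.asarray(X, dtype=float), self._X_fit):
            raise ValueError("transform only accepts the matrix passed to fit")
        return self.low_rank_

    def fit_transform(self, X):
        return self.fit(X).transform(X)


rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 8))
X_new = rng.normal(size=(20, 8))

model = FitBoundDecomposer(rank=2)
L_train = model.fit_transform(X_train)

try:
    model.transform(X_new)  # different data: rejected in this sketch
    rejected = False
except ValueError:
    rejected = True

# For new data, fit a fresh model instead of reusing the old one.
L_new = FitBoundDecomposer(rank=2).fit_transform(X_new)
```

The practical consequence is the last line: a new data matrix needs its own fit, since there is no learned mapping to apply out of sample.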
Please see the examples and the API reference for more details.
The documentation is built with Sphinx and hosted on GitHub Pages.
To build the HTML pages locally, first make sure you have installed the package with its documentation dependencies:
uv pip install -e ".[docs]"
then run the following:
sphinx-build docs docs/_build