The table below shows the capabilities currently available in SML. In general, the following features are not (or only partly) supported in SML:

- Early stopping for training or iterative algorithms: we do not want to reveal any intermediate information (for some algorithms, we do reveal a few bits to accelerate computation).
- Manually setting the random seed: SPU cannot handle floating-point randomness properly, so if a random value (or matrix) is needed, the user should pass it in as a parameter (as in `rsvd` and `NMF`), or compute it in a plaintext environment.
- Data inspection, such as counting the number of labels or re-transforming the data or labels, is not performed. (So we may assume a "fixed" input format, or require the number of classes as a parameter.)
- Single-sample SGD is not implemented, for latency reasons; mini-batch SGD (which we simply call `sgd` in SML) replaces it.
- JAX ops like `eigh` and `svd` cannot run on SPU directly: the `svd` implemented now is expensive and cannot handle matrices that are not of full column rank.
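The "pass randomness in as a parameter" pattern can be sketched in plaintext NumPy. This is an illustrative randomized-SVD-style helper, not the actual SML API: the random test matrix `omega` is sampled by the caller in plaintext and handed to the function, instead of being generated inside the secure computation.

```python
import numpy as np

def rsvd_sketch(A, omega, n_components):
    # Illustrative randomized SVD. The random matrix `omega` is
    # generated in plaintext by the caller and passed as a parameter,
    # because the secure runtime cannot sample float randomness itself.
    Y = A @ omega                      # project onto a low-dim subspace
    Q, _ = np.linalg.qr(Y)             # orthonormal basis of the range
    B = Q.T @ A                        # small projected matrix
    u_tilde, s, vt = np.linalg.svd(B, full_matrices=False)
    u = Q @ u_tilde
    return u[:, :n_components], s[:n_components], vt[:n_components]

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))
omega = rng.normal(size=(20, 8))       # plaintext randomness, passed in
u, s, vt = rsvd_sketch(A, omega, n_components=5)
```

Inside SPU, only the deterministic linear algebra would run on secret data; the caller's plaintext `omega` plays the role of the seed the user cannot set inside the device.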
| Algorithm | Category | Supported Features | Notes |
|---|---|---|---|
| KMEANS | cluster | 1. init=random or k-means++<br>2. algorithm=lloyd only<br>3. Support n_init | 1. Only run the algorithm once for efficiency when n_init=1<br>2. Support multiple initializations with best-result selection |
| GaussianProcessClassifier | gaussian_process | 1. RBF kernel only<br>2. OVR for multi-class tasks only<br>3. Laplace approximation | 1. The current implementation does NOT optimize kernel parameters during training<br>2. Support sigmoid likelihood function |
| PCA | decomposition | 1. power_iteration method<br>2. serial_jacobi_iteration method<br>3. rsvd method | 1. If method=power_iteration, the covariance matrix is computed first<br>2. rsvd is very unstable under the fixed-point setting even in FM128, so only small data is supported<br>3. Support various parameter configurations for each method |
| NMF | decomposition | 1. init=random<br>2. solver=mu<br>3. beta_loss=frobenius only<br>4. Support L1/L2 regularization | 1. Support alpha_W and alpha_H regularization parameters<br>2. Support transform and inverse_transform methods |
| T-SNE | decomposition | 1. init=pca or random<br>2. Support various PCA method configurations<br>3. Support early exaggeration | 1. Comprehensive parameter control, including learning rate and momentum<br>2. Support custom Y_init for random initialization |
| Adaboost | ensemble | 1. Decision tree base estimator only<br>2. SAMME algorithm only<br>3. Support sample weights | 1. No early stopping is implemented<br>2. Support multiple estimators with weight calculation |
| Random Forest | ensemble | 1. Support gini criterion<br>2. Support best splitter<br>3. Support bootstrap sampling<br>4. Support max_features control | 1. No early stopping is implemented<br>2. Support feature subsampling and sample weights |
| Feature Selection | feature_selection | 1. chi2 univariate selection<br>2. f_classif (ANOVA F-test) supported | 1. Support p-value computation with configurable parameters<br>2. Support different numerical stability controls |
| Logistic Regression | linear_model | 1. sgd solver only<br>2. All regularization methods (l1, l2, elasticnet, None)<br>3. Support binary and OVR multi-class<br>4. Support early stopping | 1. The sigmoid is evaluated approximately<br>2. Support various sigmoid approximation methods<br>3. Support equal class weights only |
| Perceptron | linear_model | 1. All regularization methods (l1, l2, elasticnet, None)<br>2. Patience-based early stopping<br>3. Support sample batching | 1. Early stopping does not cut down the training time; it only stops the parameter updates<br>2. Support various batch sizes |
| Ridge | linear_model | 1. svd and cholesky solvers<br>2. Support bias fitting control | 1. Support preprocessing and bias handling<br>2. Efficient matrix decomposition methods |
| SGDClassifier | linear_model | 1. Linear regression and logistic regression<br>2. L2 regularization only | 1. The sigmoid is evaluated approximately<br>2. Support different batch sizes |
| GLM Regressors | linear_model | 1. PoissonRegressor, GammaRegressor, TweedieRegressor<br>2. newton-cholesky and lbfgs solvers<br>3. Support L2 regularization | 1. Support different link functions (log, identity)<br>2. Support sample weights<br>3. Tweedie supports a configurable power parameter |
| Quantile Regression | linear_model | 1. Support different quantiles (0-1)<br>2. L1 regularization<br>3. Linear-programming-based solver | 1. Support sample weights<br>2. Efficient simplex algorithm implementation |
| SVC | svm | 1. RBF kernel only<br>2. SMO solver only<br>3. Support C regularization | |
| GaussianNB | naive_bayes | 1. Manual setting of priors is not supported<br>2. Support online learning (partial_fit) | 1. Support incremental learning with proper variance updating<br>2. Efficient vectorized computation |
| KNN | neighbors | 1. brute algorithm only<br>2. uniform and distance weights supported<br>3. Support custom metrics | 1. KD-tree and Ball-tree cannot improve efficiency in the MPC setting<br>2. Support configurable n_neighbors and metric parameters |
| DecisionTreeClassifier | tree | 1. Implemented based on GTree<br>2. Support binary features (i.e. {0, 1}) and multi-class labels<br>3. Support gini criterion and best splitter<br>4. Support sample weights | 1. Memory and time complexity is around O(n_samples * n_labels * n_features * 2^max_depth)<br>2. Efficient oblivious array access implementation |
| Preprocessing | preprocessing | 1. LabelBinarizer, Binarizer, Normalizer<br>2. RobustScaler, MinMaxScaler, MaxAbsScaler<br>3. KBinsDiscretizer with multiple strategies<br>4. OneHotEncoder, QuantileTransformer | 1. Support various binning strategies (uniform, quantile, kmeans)<br>2. Support robust scaling with outlier handling<br>3. Comprehensive categorical encoding support |
| Manifold | manifold | 1. ISOMAP with k-NN graph construction<br>2. Spectral Embedding (SE)<br>3. Support custom distance metrics | 1. Efficient Floyd-Warshall and Dijkstra implementations<br>2. Jacobi eigenvalue decomposition<br>3. MDS-based dimensionality reduction |
| Classification | metrics | 1. roc_auc_score (binary only)<br>2. accuracy_score, precision_score, recall_score, f1_score<br>3. Support various averaging methods<br>4. precision_recall_curve, average_precision_score | 1. Support binary and multi-class scenarios<br>2. Comprehensive evaluation metrics with configurable parameters |
| Regression | metrics | 1. mean_squared_error, r2_score<br>2. explained_variance_score<br>3. GLM-specific metrics (Poisson, Gamma, Tweedie deviance) | 1. Support sample weights<br>2. D2 score for GLM models<br>3. Multiple output support |
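The "sigmoid is evaluated approximately" notes for Logistic Regression and SGDClassifier refer to replacing the exp-based sigmoid with something MPC-friendly, since low-degree polynomials are cheap to evaluate on secret shares while `exp` is not. A plaintext sketch of one such approach, using a least-squares cubic fit (illustrative only; the coefficients and fitting range here are assumptions, not necessarily the polynomial SML uses):

```python
import numpy as np

def sigmoid(x):
    # Exact sigmoid, for reference; exp() is expensive under MPC.
    return 1.0 / (1.0 + np.exp(-x))

# Fit a degree-3 polynomial to sigmoid on a bounded range. Secure
# backends evaluate such polynomials with a few multiplications.
xs = np.linspace(-5.0, 5.0, 1001)
coeffs = np.polyfit(xs, sigmoid(xs), deg=3)

def sigmoid_approx(x):
    # Polynomial surrogate; accurate only inside the fitted range.
    return np.polyval(coeffs, x)

max_err = np.max(np.abs(sigmoid_approx(xs) - sigmoid(xs)))
```

The trade-off is typical of fixed-point MPC: a small, bounded approximation error inside the fitting range in exchange for a circuit made only of additions and multiplications.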