
Commit 1ee69d4

ENH - New skglm README (#112)
1 parent f2bf4e3 commit 1ee69d4

File tree

3 files changed (+124, -99 lines)


README.md

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
<section align="center">

# ``skglm``

## A fast :zap: and modular :hammer_and_pick: scikit-learn replacement for sparse GLMs

</section>

![build](https://github.com/scikit-learn-contrib/skglm/workflows/pytest/badge.svg)
![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)
![Downloads](https://pepy.tech/badge/skglm/month)
![PyPI version](https://badge.fury.io/py/skglm.svg)
``skglm`` is a Python package that offers **fast estimators** for sparse Generalized Linear Models (GLMs) that are **100% compatible with ``scikit-learn``**. It is **highly flexible** and supports a wide range of GLMs. You get to choose from ``skglm``'s ready-made estimators or **customize your own** by combining the available datafits and penalties.

Excited to take a tour of the ``skglm`` [documentation](https://contrib.scikit-learn.org/skglm/) :memo:?

# Why ``skglm``?

``skglm`` is specifically conceived to solve sparse GLMs.
It supports many models missing from ``scikit-learn`` and ensures high performance.
There are several reasons to opt for ``skglm``, among which:
| | |
| ----- | -------------- |
| **Speed** :zap: | Fast solvers able to tackle large datasets, either dense or sparse, with millions of features, **up to 100 times faster** than ``scikit-learn`` |
| **Modularity** :hammer_and_pick: | User-friendly API that enables **composing custom estimators** with any combination of its existing datafits and penalties |
| **Extensibility** :arrow_up_down: | Flexible design that makes it **simple and easy to implement new datafits and penalties**, a matter of a few lines of code |
| **Compatibility** :electric_plug: | Estimators **fully compatible with the ``scikit-learn`` API** and drop-in replacements of its GLM estimators |
| | |
# Get started with ``skglm``

## Installing ``skglm``

``skglm`` is available on PyPI. Run the following command to get the latest version of the package:

```shell
pip install -U skglm
```

It is also available on Conda _(not yet, but very soon...)_ and can be installed via the command:

```shell
conda install skglm
```
## First steps with ``skglm``

Once you have installed ``skglm``, you can run the following code snippet to fit an MCP regression model on a toy dataset:

```python
# import model to fit
from skglm.estimators import MCPRegression
# import util to create a toy dataset
from skglm.utils import make_correlated_data

# generate a toy dataset
X, y, _ = make_correlated_data(n_samples=10, n_features=100)

# init and fit estimator
estimator = MCPRegression()
estimator.fit(X, y)

# print R²
print(estimator.score(X, y))
```
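Since the estimators are **fully compatible with the ``scikit-learn`` API**, they should also slot directly into ``scikit-learn`` tooling. The snippet below is an illustrative sketch of that claim (this particular cross-validation pairing is an illustration, not an example taken from the ``skglm`` docs):

```python
# score the same estimator with scikit-learn's model selection tools
from sklearn.model_selection import cross_val_score
from skglm.estimators import MCPRegression
from skglm.utils import make_correlated_data

# a slightly larger toy dataset so that the folds contain enough samples
X, y, _ = make_correlated_data(n_samples=100, n_features=300)

# 3-fold cross-validated R² scores, exactly as with any scikit-learn estimator
print(cross_val_score(MCPRegression(), X, y, cv=3))
```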
You can refer to the documentation to explore the list of ``skglm``'s ready-made estimators.

Didn't find one that suits you :monocle_face:? You can still compose your own.
Here is a code snippet that fits an MCP-regularized problem with a Huber loss:

```python
# import datafit, penalty and GLM estimator
from skglm.datafits import Huber
from skglm.penalties import MCPenalty
from skglm.estimators import GeneralizedLinearEstimator

from skglm.utils import make_correlated_data

X, y, _ = make_correlated_data(n_samples=10, n_features=100)
# create and fit GLM estimator with Huber loss and MCP penalty
estimator = GeneralizedLinearEstimator(
    datafit=Huber(delta=1.),
    penalty=MCPenalty(alpha=1e-2, gamma=3),
)
estimator.fit(X, y)
```
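Note that swapping in a different datafit or penalty only amounts to changing the two objects passed to ``GeneralizedLinearEstimator``; the solver and the ``scikit-learn`` interface stay the same, which is exactly the modularity advertised above.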
You will find a detailed description of the supported datafits and penalties, and of how to combine them, in the API section of the documentation.
You can also take our tutorial to learn how to create your own datafit and penalty.
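To give a rough idea of what writing a penalty involves, the standalone class below is a hypothetical illustration in plain NumPy (its name and methods are not ``skglm``'s actual interface; the tutorial documents the real one). A penalty essentially boils down to two ingredients: its value and its proximal operator, which for the L1 norm is soft-thresholding.

```python
import numpy as np


class ToyL1Penalty:
    """Hypothetical L1 penalty: a value and a proximal operator."""

    def __init__(self, alpha):
        self.alpha = alpha

    def value(self, w):
        # penalty value: alpha * ||w||_1
        return self.alpha * np.sum(np.abs(w))

    def prox(self, w, stepsize):
        # soft-thresholding: shrink each coefficient towards 0 by alpha * stepsize
        return np.sign(w) * np.maximum(np.abs(w) - self.alpha * stepsize, 0.0)


# quick check: coefficients below the threshold are set exactly to 0
penalty = ToyL1Penalty(alpha=0.5)
print(penalty.prox(np.array([1.5, -0.2, 0.7]), stepsize=1.0))  # [1. -0. 0.2]
```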
# Contribute to ``skglm``

``skglm`` is a continuous endeavour that relies on community efforts to last and evolve. Your contribution is welcome and highly valuable. It can be

- **bug report**: you may encounter a bug while using ``skglm``. Don't hesitate to report it on the [issue section](https://github.com/scikit-learn-contrib/skglm/issues).
- **feature request**: you may want to extend/add new features to ``skglm``. You can use [the issue section](https://github.com/scikit-learn-contrib/skglm/issues) to make suggestions.
- **pull request**: you may have fixed a bug, added a feature, or even fixed a small typo in the documentation. You can submit a [pull request](https://github.com/scikit-learn-contrib/skglm/pulls) and we will reach out to you as soon as possible.

# Cite
``skglm`` is the result of persevering research. It is licensed under [BSD 3-Clause](https://github.com/scikit-learn-contrib/skglm/blob/main/LICENSE). You are free to use it and, if you do so, please cite

```bibtex
@inproceedings{skglm,
    title     = {Beyond L1: Faster and better sparse models with skglm},
    author    = {Q. Bertrand and Q. Klopfenstein and P.-A. Bannier and G. Gidel and M. Massias},
    booktitle = {NeurIPS},
    year      = {2022},
}
```

# Useful links

- link to the documentation: https://contrib.scikit-learn.org/skglm/
- link to the ``skglm`` arXiv article: https://arxiv.org/pdf/2204.07826.pdf

README.rst

Lines changed: 0 additions & 98 deletions
This file was deleted.

setup.py

Lines changed: 2 additions & 1 deletion
@@ -13,7 +13,7 @@
 
 DISTNAME = 'skglm'
 DESCRIPTION = 'A fast and modular scikit-learn replacement for generalized linear models'
-with open('README.rst', 'r') as f:
+with open('README.md', 'r', encoding='utf-8') as f:
     LONG_DESCRIPTION = f.read()
 MAINTAINER = 'Mathurin Massias'
 MAINTAINER_EMAIL = '[email protected]'
@@ -26,6 +26,7 @@
     version=version,
     description=DESCRIPTION,
     long_description=LONG_DESCRIPTION,
+    long_description_content_type='text/markdown',
     maintainer=MAINTAINER,
     maintainer_email=MAINTAINER_EMAIL,
     url=URL,
