Commit 2742c73 (parent fdd3967)

Move more explanations into code

* change header underlining

12 files changed: +73, −100 lines


docs/src/index.rst (1 addition, 9 deletions)

@@ -1,11 +1,4 @@
-scikit-matter
-=============
-
-scikit-matter is a toolbox of methods developed in the
-computational chemical and materials science community, following the
-`scikit-learn <https://scikit.org/>`_ API
-and coding guidelines to promote usability and interoperability with existing workflows.
-
+.. automodule:: skmatter
 
 .. include:: ../../README.rst
    :start-after: marker-issues
@@ -22,6 +15,5 @@ and coding guidelines to promote usability and interoperability with existing wo
    contributing
    bibliography
 
-
 If you would like to contribute to scikit-matter, check out our :ref:`contributing`
 page!

docs/src/references/decomposition.rst (4 additions, 8 deletions)

@@ -4,11 +4,9 @@ Principal Covariates Regression (PCovR)
 .. _PCovR-api:
 
 PCovR
-#####
+-----
 
-.. currentmodule:: skmatter.decomposition
-
-.. autoclass:: PCovR
+.. autoclass:: skmatter.decomposition.PCovR
    :show-inheritance:
    :special-members:
 
@@ -25,11 +23,9 @@ PCovR
 .. _KPCovR-api:
 
 Kernel PCovR
-############
-
-.. currentmodule:: skmatter.decomposition
+------------
 
-.. autoclass:: KernelPCovR
+.. autoclass:: skmatter.decomposition.KernelPCovR
    :show-inheritance:
    :special-members:
 

docs/src/references/linear_models.rst (5 additions, 9 deletions)

@@ -1,21 +1,17 @@
 Linear Models
 =============
 
-.. currentmodule:: skmatter.linear_model._base
-
 Orthogonal Regression
-#####################
-
-.. autoclass:: OrthogonalRegression
+---------------------
 
-.. currentmodule:: skmatter.linear_model._ridge
+.. autoclass:: skmatter.linear_model.OrthogonalRegression
 
 Ridge Regression with Two-fold Cross Validation
-###############################################
+-----------------------------------------------
 
-.. autoclass:: RidgeRegression2FoldCV
+.. autoclass:: skmatter.linear_model.RidgeRegression2FoldCV
 
 PCovR
-#####
+-----
 
 Principal Covariates Regression is a linear model, see :ref:`PCovR-api`.

docs/src/references/metrics.rst (6 additions, 23 deletions)

@@ -1,45 +1,28 @@
-.. _gfrm:
-
 Reconstruction Measures
 =======================
 
-.. marker-reconstruction-introduction-begin
-
 .. automodule:: skmatter.metrics
 
-These reconstruction measures are available:
-
-* :ref:`GRE-api` (GRE) computes the amount of linearly-decodable information
-  recovered through a global linear reconstruction.
-* :ref:`GRD-api` (GRD) computes the amount of distortion contained in a global linear
-  reconstruction.
-* :ref:`LRE-api` (LRE) computes the amount of decodable information recovered through
-  a local linear reconstruction for the k-nearest neighborhood of each sample.
-
-.. marker-reconstruction-introduction-end
-
-.. currentmodule:: skmatter.metrics
-
 .. _GRE-api:
 
 Global Reconstruction Error
 ---------------------------
 
-.. autofunction:: pointwise_global_reconstruction_error
-.. autofunction:: global_reconstruction_error
+.. autofunction:: skmatter.metrics.pointwise_global_reconstruction_error
+.. autofunction:: skmatter.metrics.global_reconstruction_error
 
 .. _GRD-api:
 
 Global Reconstruction Distortion
 --------------------------------
 
-.. autofunction:: pointwise_global_reconstruction_distortion
-.. autofunction:: global_reconstruction_distortion
+.. autofunction:: skmatter.metrics.pointwise_global_reconstruction_distortion
+.. autofunction:: skmatter.metrics.global_reconstruction_distortion
 
 .. _LRE-api:
 
 Local Reconstruction Error
 --------------------------
 
-.. autofunction:: pointwise_local_reconstruction_error
-.. autofunction:: local_reconstruction_error
+.. autofunction:: skmatter.metrics.pointwise_local_reconstruction_error
+.. autofunction:: skmatter.metrics.local_reconstruction_error

docs/src/references/preprocessing.rst (1 addition, 0 deletions)

@@ -1,6 +1,7 @@
 Preprocessing
 =============
 
+.. automodule:: skmatter.preprocessing
 
 KernelNormalizer
 ----------------

docs/src/references/selection.rst (1 addition, 4 deletions)

@@ -10,7 +10,6 @@ Feature and Sample Selection
 CUR
 ---
 
-
 CUR decomposition begins by approximating a matrix :math:`{\mathbf{X}}` using a subset
 of columns and rows
 
@@ -72,7 +71,6 @@ computation of :math:`\pi`. S
    :undoc-members:
    :inherited-members:
 
-
 .. _FPS-api:
 
 Farthest Point-Sampling (FPS)
@@ -93,7 +91,6 @@ row-wise), and are built off of the same base class,
 These selectors can be instantiated using :py:class:`skmatter.feature_selection.FPS` and
 :py:class:`skmatter.sample_selection.FPS`.
 
-
 .. autoclass:: skmatter.feature_selection.FPS
    :members:
    :undoc-members:
@@ -139,7 +136,7 @@ When *Not* to Use Voronoi FPS
 
 In many cases, this algorithm may not increase upon the efficiency. For example, for
 simple metrics (such as Euclidean distance), Voronoi FPS will likely not accelerate, and
-may decelerate, computations when compared to FPS. The sweet spot for Voronoi FPS is 
+may decelerate, computations when compared to FPS. The sweet spot for Voronoi FPS is
 when the number of selectable samples is already enough to divide the space with Voronoi
 polyhedrons, but not yet comparable to the total number of samples, when the cost of
 bookkeeping significantly degrades the speed of work compared to FPS.
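The selection docs above describe CUR as ranking columns by an importance score :math:`\pi` computed from the singular vectors of X. As a rough illustration of that scoring step only (not the library's implementation, which also orthogonalizes between picks), `cur_column_scores` is a hypothetical helper built from the top-k right singular vectors:

```python
import numpy as np

def cur_column_scores(X, k=1):
    # Importance score pi_j: squared weight of column j across the top-k
    # right singular vectors of X (hypothetical helper, for illustration).
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return np.sum(Vt[:k] ** 2, axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
X[:, 2] *= 10.0  # make one column dominate the variance
pi = cur_column_scores(X, k=1)
best = int(np.argmax(pi))  # CUR would select this column first
```

Because each row of Vt has unit norm, the scores for k=1 sum to one, and the variance-dominating column receives nearly all of the weight.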

docs/src/references/utils.rst (9 additions, 13 deletions)

@@ -3,34 +3,30 @@ Utility Classes
 
 .. _PCovR_dist-api:
 
-.. currentmodule:: skmatter.utils._pcovr_utils
-
 Modified Gram Matrix :math:`\mathbf{\tilde{K}}`
-###############################################
+-----------------------------------------------
 
-.. autofunction:: pcovr_kernel
+.. autofunction:: skmatter.utils.pcovr_kernel
 
 
 Modified Covariance Matrix :math:`\mathbf{\tilde{C}}`
-#####################################################
+-----------------------------------------------------
 
-.. autofunction:: pcovr_covariance
+.. autofunction:: skmatter.utils.pcovr_covariance
 
 Orthogonalizers for CUR
-#######################
-
-.. currentmodule:: skmatter.utils._orthogonalizers
+-----------------------
 
 When computing non-iterative CUR, it is necessary to orthogonalize the input matrices
 after each selection. For this, we have supplied a feature and a sample orthogonalizer
 for feature and sample selection.
 
-.. autofunction:: X_orthogonalizer
-.. autofunction:: Y_feature_orthogonalizer
-.. autofunction:: Y_sample_orthogonalizer
+.. autofunction:: skmatter.utils.X_orthogonalizer
+.. autofunction:: skmatter.utils.Y_feature_orthogonalizer
+.. autofunction:: skmatter.utils.Y_sample_orthogonalizer
 
 
 Random Partitioning with Overlaps
-#################################
+---------------------------------
 
 .. autofunction:: skmatter.model_selection.train_test_split

src/skmatter/__init__.py (9 additions, 0 deletions)

@@ -1 +1,10 @@
+"""
+scikit-matter
+=============
+
+scikit-matter is a toolbox of methods developed in the computational chemical and
+materials science community, following the `scikit-learn <https://scikit.org/>`_ API and
+coding guidelines to promote usability and interoperability with existing workflows.
+"""
+
 __version__ = "0.1.4"

src/skmatter/_selection.py (13 additions, 14 deletions)

@@ -1,14 +1,13 @@
-r"""
-This module contains data sub-selection modules primarily corresponding to
-methods derived from CUR matrix decomposition and Farthest Point Sampling. In
-their classical form, CUR and FPS determine a data subset that maximizes the
-variance (CUR) or distribution (FPS) of the features or samples. These methods
-can be modified to combine supervised target information denoted by the methods
-`PCov-CUR` and `PCov-FPS`. For further reading, refer to [Imbalzano2018]_ and
-[Cersonsky2021]_. These selectors can be used for both feature and sample
-selection, with similar instantiations. All sub-selection methods scores each
-feature or sample (without an estimator) and chooses that with the maximum
-score. A simple example of usage:
+"""
+This module contains data sub-selection modules primarily corresponding to methods
+derived from CUR matrix decomposition and Farthest Point Sampling. In their classical
+form, CUR and FPS determine a data subset that maximizes the variance (CUR) or
+distribution (FPS) of the features or samples. These methods can be modified to combine
+supervised target information denoted by the methods `PCov-CUR` and `PCov-FPS`. For
+further reading, refer to [Imbalzano2018]_ and [Cersonsky2021]_. These selectors can be
+used for both feature and sample selection, with similar instantiations. All
+sub-selection methods scores each feature or sample (without an estimator) and chooses
+that with the maximum score. A simple example of usage:
 
 .. doctest::
 
@@ -64,9 +63,9 @@
   singular value decoposition.
 * :ref:`PCov-CUR-api` decomposition extends upon CUR by using augmented right or left
   singular vectors inspired by Principal Covariates Regression.
-* :ref:`FPS-api`: a common selection technique intended to exploit the diversity of
-  the input space. The selection of the first point is made at random or by a
-  separate metric
+* :ref:`FPS-api`: a common selection technique intended to exploit the diversity of the
+  input space. The selection of the first point is made at random or by a separate
+  metric
 * :ref:`PCov-FPS-api` extends upon FPS much like PCov-CUR does to CUR.
 * :ref:`Voronoi-FPS-api`: conduct FPS selection, taking advantage of Voronoi
   tessellations to accelerate selection.

src/skmatter/metrics/__init__.py (23 additions, 15 deletions)

@@ -1,18 +1,26 @@
-r"""
-This module contains a set of easily-interpretable error measures of the
-relative information capacity of feature space `F` with respect to feature
-space `F'`. The methods returns a value between 0 and 1, where 0 means that
-`F` and `F'` are completey distinct in terms of linearly-decodable
-information, and where 1 means that `F'` is contained in `F`. All methods
-are implemented as the root mean-square error for the regression of the
-feature matrix `X_F'` (or sometimes called `Y` in the doc) from `X_F` (or
-sometimes called `X` in the doc) for transformations with different
-constraints (linear, orthogonal, locally-linear). By default a custom 2-fold
-cross-validation :py:class:`skosmo.linear_model.RidgeRegression2FoldCV` is
-used to ensure the generalization of the transformation and efficiency of
-the computation, since we deal with a multi-target regression problem.
-Methods were applied to compare different forms of featurizations through
-different hyperparameters and induced metrics and kernels [Goscinski2021]_ .
+"""
+This module contains a set of easily-interpretable error measures of the relative
+information capacity of feature space `F` with respect to feature space `F'`. The
+methods returns a value between 0 and 1, where 0 means that `F` and `F'` are completey
+distinct in terms of linearly-decodable information, and where 1 means that `F'` is
+contained in `F`. All methods are implemented as the root mean-square error for the
+regression of the feature matrix `X_F'` (or sometimes called `Y` in the doc) from `X_F`
+(or sometimes called `X` in the doc) for transformations with different constraints
+(linear, orthogonal, locally-linear). By default a custom 2-fold cross-validation
+:py:class:`skosmo.linear_model.RidgeRegression2FoldCV` is used to ensure the
+generalization of the transformation and efficiency of the computation, since we deal
+with a multi-target regression problem. Methods were applied to compare different forms
+of featurizations through different hyperparameters and induced metrics and kernels
+[Goscinski2021]_ .
+
+These reconstruction measures are available:
+
+* :ref:`GRE-api` (GRE) computes the amount of linearly-decodable information
+  recovered through a global linear reconstruction.
+* :ref:`GRD-api` (GRD) computes the amount of distortion contained in a global linear
+  reconstruction.
+* :ref:`LRE-api` (LRE) computes the amount of decodable information recovered through
+  a local linear reconstruction for the k-nearest neighborhood of each sample.
 """
 
 from ._reconstruction_measures import (