Commit 94bf7b2

move code-blocks in sphinx doc to doctest examples in python doc
* adapt existing code blocks because they did not work
* change the sphinx documentation of classes to autoclass to include the doctest examples in the docs and additional information
* remove reference to GreedySelector, as this is more developer doc
1 parent c2e284e commit 94bf7b2

4 files changed

+404
-254
lines changed

docs/src/selection.rst

Lines changed: 65 additions & 193 deletions
@@ -11,15 +11,12 @@ can be modified to combine supervised and unsupervised learning, in a formulatio
1111
denoted `PCov-CUR` and `PCov-FPS`.
1212
For further reading, refer to [Imbalzano2018]_ and [Cersonsky2021]_.
1313

14-
1514
These selectors can be used for both feature and sample selection, with similar
16-
instantiations. Currently, all sub-selection methods extend :py:class:`GreedySelector`,
17-
where at each iteration the model scores each
18-
feature or sample (without an estimator) and chooses that with the maximum score.
19-
This can be executed using:
15+
instantiations. This can be executed using:
2016

2117
.. doctest::
2218

19+
>>> # feature selection
2320
>>> import numpy as np
2421
>>> from skmatter.feature_selection import CUR, FPS, PCovCUR, PCovFPS
2522
>>> selector = CUR(
@@ -36,10 +33,14 @@ This can be executed using:
3633
... # are exhausted
3734
... full=False,
3835
... )
39-
>>> X = np.array([[ 0.12, 0.21, 0.02], # 3 samples, 3 features
40-
... [-0.09, 0.32, -0.10],
41-
... [-0.03, -0.53, 0.08]])
42-
>>> y = np.array([0., 0., 1.]) # classes of each sample
36+
>>> X = np.array(
37+
... [
38+
... [0.12, 0.21, 0.02], # 3 samples, 3 features
39+
... [-0.09, 0.32, -0.10],
40+
... [-0.03, -0.53, 0.08],
41+
... ]
42+
... )
43+
>>> y = np.array([0.0, 0.0, 1.0]) # classes of each sample
4344
>>> selector.fit(X)
4445
CUR(n_to_select=2, progress_bar=True, score_threshold=1e-12)
4546
>>> Xr = selector.transform(X)
@@ -51,6 +52,8 @@ This can be executed using:
5152
>>> Xr = selector.transform(X)
5253
>>> print(Xr.shape)
5354
(3, 2)
55+
>>>
56+
>>> # Now sample selection
5457
>>> from skmatter.sample_selection import CUR, FPS, PCovCUR, PCovFPS
5558
>>> selector = CUR(n_to_select=2)
5659
>>> selector.fit(X)
@@ -59,23 +62,11 @@ This can be executed using:
5962
>>> print(Xr.shape)
6063
(2, 3)
6164

62-
where `Selector` is one of the classes below that overwrites the method
63-
:py:func:`score`.
64-
65-
From :py:class:`GreedySelector`, selectors inherit these public methods:
66-
67-
.. currentmodule:: skmatter._selection
68-
69-
.. class:: GreedySelector
70-
71-
.. automethod:: fit
72-
.. automethod:: transform
73-
.. automethod:: get_support
7465

7566
.. _CUR-api:
7667

7768
CUR
78-
###
69+
---
7970

8071

8172
CUR decomposition begins by approximating a matrix :math:`{\mathbf{X}}` using a subset
@@ -100,88 +91,50 @@ features in a single iteration based upon the relative :math:`\pi` importance.
10091
The feature and sample selection versions of CUR differ only in the computation of
10192
:math:`\pi`. In sample selection :math:`\pi` is computed using the left singular
10293
vectors, versus in feature selection, :math:`\pi` is computed using the right singular
103-
vectors. In addition to :py:class:`GreedySelector`, both instances of CUR selection
104-
build off of :py:class:`skmatter._selection._cur._CUR`, and inherit
105-
106-
.. currentmodule:: skmatter._selection
107-
108-
.. automethod:: _CUR.score
109-
.. automethod:: _CUR._compute_pi
110-
111-
They are instantiated using
112-
:py:class:`skmatter.feature_selection.CUR` and
113-
:py:class:`skmatter.sample_selection.CUR`, e.g.
114-
115-
.. code-block:: python
94+
vectors.
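
As a rough illustration of the left/right distinction described above, the following sketch computes a :math:`\pi` importance from a truncated SVD with NumPy. The helper name ``pi_importance`` and the specific formula (sum of squared entries of the top ``k`` singular vectors) are assumptions made for this example; this is not the skmatter implementation.

```python
import numpy as np


def pi_importance(X, k=1, mode="feature"):
    """Hypothetical helper: importance scores from the top-k singular vectors.

    mode="feature" uses the right singular vectors (one score per column),
    mode="sample" uses the left singular vectors (one score per row).
    """
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    if mode == "feature":
        # Rows of Vt are right singular vectors; sum squares over the top k.
        return (Vt[:k] ** 2).sum(axis=0)
    # Columns of U are left singular vectors; sum squares over the top k.
    return (U[:, :k] ** 2).sum(axis=1)


X = np.array(
    [
        [0.12, 0.21, 0.02],  # 3 samples, 3 features
        [-0.09, 0.32, -0.10],
        [-0.03, -0.53, 0.08],
    ]
)

pi_features = pi_importance(X, k=1, mode="feature")  # one score per column
pi_samples = pi_importance(X, k=1, mode="sample")  # one score per row
```

With ``k=1`` each score vector is the element-wise square of a unit vector, so its entries sum to one in both modes.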
11695

117-
from skmatter.feature_selection import CUR
96+
.. autoclass:: skmatter.feature_selection.CUR
97+
:members:
98+
:private-members: _compute_pi
99+
:undoc-members:
100+
:inherited-members:
118101

119-
selector = CUR(
120-
n_to_select=4,
121-
progress_bar=True,
122-
score_threshold=1e-12,
123-
full=False,
124-
# int, number of eigenvectors to use in computing pi
125-
k=1,
126-
# int, number of steps after which to recompute pi
127-
recompute_every=1,
128-
# float, threshold below which scores will be considered 0, defaults to 1E-12
129-
tolerance=1e-12,
130-
)
131-
selector.fit(X)
132-
133-
Xr = selector.transform(X)
102+
.. autoclass:: skmatter.sample_selection.CUR
103+
:members:
104+
:private-members: _compute_pi
105+
:undoc-members:
106+
:inherited-members:
134107

135108
.. _PCov-CUR-api:
136109

137110
PCov-CUR
138-
########
111+
--------
139112

140113
PCov-CUR extends upon CUR by using augmented right or left singular vectors inspired by
141114
Principal Covariates Regression, as demonstrated in [Cersonsky2021]_. These methods
142115
employ the modified kernel and covariance matrices introduced in :ref:`PCovR-api` and
143116
available via the Utility Classes.
144117

145118
Again, the feature and sample selection versions of PCov-CUR differ only in the
146-
computation of :math:`\pi`. So, in addition to :py:class:`GreedySelector`, both
147-
instances of PCov-CUR selection build off of
148-
:py:class:`skmatter._selection._cur._PCovCUR`, inheriting
149-
150-
.. currentmodule:: skmatter._selection
151-
152-
.. automethod:: _PCovCUR.score
153-
.. automethod:: _PCovCUR._compute_pi
154-
155-
and are instantiated using
156-
:py:class:`skmatter.feature_selection.PCovCUR` and :py:class:`skmatter.sample_selection.PCovCUR`.
157-
158-
.. code-block:: python
119+
computation of :math:`\pi`.
159120

160-
from skmatter.feature_selection import PCovCUR
121+
.. autoclass:: skmatter.feature_selection.PCovCUR
122+
:members:
123+
:private-members: _compute_pi
124+
:undoc-members:
125+
:inherited-members:
161126

162-
selector = PCovCUR(
163-
n_to_select=4,
164-
progress_bar=True,
165-
score_threshold=1e-12,
166-
full=False,
167-
# float, default=0.5
168-
# The PCovR mixing parameter, as described in PCovR as alpha
169-
mixing=0.5,
170-
# int, number of eigenvectors to use in computing pi
171-
k=1,
172-
# int, number of steps after which to recompute pi
173-
recompute_every=1,
174-
# float, threshold below which scores will be considered 0, defaults to 1E-12
175-
tolerance=1e-12,
176-
)
177-
selector.fit(X, y)
127+
.. autoclass:: skmatter.sample_selection.PCovCUR
128+
:members:
129+
:private-members: _compute_pi
130+
:undoc-members:
131+
:inherited-members:
178132

179-
Xr = selector.transform(X)
180133

181134
.. _FPS-api:
182135

183136
Farthest Point-Sampling (FPS)
184-
#############################
137+
-----------------------------
185138

186139
Farthest Point Sampling is a common selection technique intended to exploit the
187140
diversity of the input space.
@@ -194,116 +147,53 @@ distance, however other distance metrics may be employed.
194147
Similar to CUR, the feature and sample selection versions of FPS differ only in the way
195148
distance is computed (feature selection does so column-wise, sample selection does so
196149
row-wise), and are built off of the same base class.
197-
:py:class:`skmatter._selection._fps._FPS`, in addition to GreedySelector, and inherit
198-
199-
.. currentmodule:: skmatter._selection
200-
201-
.. automethod:: _FPS.score
202-
.. automethod:: _FPS.get_distance
203-
.. automethod:: _FPS.get_select_distance
204150

205151
These selectors can be instantiated using :py:class:`skmatter.feature_selection.FPS` and
206152
:py:class:`skmatter.sample_selection.FPS`.
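
The row-wise versus column-wise distinction can be sketched in plain Python. The function ``farthest_point_sampling`` below is a hypothetical helper written for this illustration, not the skmatter API: it assumes Euclidean distance and a fixed starting index, and greedily adds the point farthest from everything selected so far.

```python
import math


def farthest_point_sampling(points, n_to_select, initialize=0):
    """Hypothetical helper: greedy FPS over a list of equal-length points."""
    selected = [initialize]
    # Distance from each point to its nearest already-selected point.
    min_dist = [math.dist(p, points[initialize]) for p in points]
    while len(selected) < n_to_select:
        # Choose the point farthest from the current selection.
        idx = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(idx)
        # Refresh nearest-selected distances with the newly added point.
        min_dist = [
            min(d, math.dist(p, points[idx])) for d, p in zip(min_dist, points)
        ]
    return selected


X = [
    [0.12, 0.21, 0.02],  # 3 samples, 3 features
    [-0.09, 0.32, -0.10],
    [-0.03, -0.53, 0.08],
]

# Sample selection: each row of X is a point.
sample_idx = farthest_point_sampling(X, n_to_select=2)

# Feature selection: each column of X is a point.
columns = [list(col) for col in zip(*X)]
feature_idx = farthest_point_sampling(columns, n_to_select=2)

print(sample_idx)  # [0, 2]
print(feature_idx)  # [0, 1]
```

Both calls run the same loop; only the orientation of the input changes, which mirrors how the two selector variants share one base class.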
207153

208-
.. code-block:: python
209-
210-
from skmatter.feature_selection import FPS
211154

212-
selector = FPS(
213-
n_to_select=4,
214-
progress_bar=True,
215-
score_threshold=1e-12,
216-
full=False,
217-
# int or 'random', default=0
218-
# Index of the first selection.
219-
# If ‘random’, picks a random value when fit starts.
220-
initialize=0,
221-
)
222-
selector.fit(X)
155+
.. autoclass:: skmatter.feature_selection.FPS
156+
:members:
157+
:undoc-members:
158+
:inherited-members:
223159

224-
Xr = selector.transform(X)
160+
.. autoclass:: skmatter.sample_selection.FPS
161+
:members:
162+
:undoc-members:
163+
:inherited-members:
225164

226165
.. _PCov-FPS-api:
227166

228167
PCov-FPS
229-
########
168+
--------
230169

231170
PCov-FPS extends upon FPS much like PCov-CUR does to CUR. Instead of using the Euclidean
232171
distance solely in the space of :math:`\mathbf{X}`, we use a combined distance in terms
233172
of :math:`\mathbf{X}` and :math:`\mathbf{y}`.
234173

235-
Again, the feature and sample selection versions of PCov-FPS differ only in computing
236-
the distances. So, in addition to :py:class:`GreedySelector`, both instances of PCov-FPS
237-
selection build off of :py:class:`skmatter._selection._fps._PCovFPS`, and inherit
174+
.. autoclass:: skmatter.feature_selection.PCovFPS
175+
:members:
176+
:undoc-members:
177+
:inherited-members:
238178

239-
.. currentmodule:: skmatter._selection
240-
241-
.. automethod:: _PCovFPS.score
242-
.. automethod:: _PCovFPS.get_distance
243-
.. automethod:: _PCovFPS.get_select_distance
244-
245-
246-
and can
247-
be instantiated using
248-
:py:class:`skmatter.feature_selection.PCovFPS` and :py:class:`skmatter.sample_selection.PCovFPS`.
249-
250-
.. code-block:: python
251-
252-
from skmatter.feature_selection import PCovFPS
253-
254-
selector = PCovFPS(
255-
n_to_select=4,
256-
progress_bar=True,
257-
score_threshold=1e-12,
258-
full=False,
259-
# float, default=0.5
260-
# The PCovR mixing parameter, as described in PCovR as alpha
261-
mixing=0.5,
262-
# int or 'random', default=0
263-
# Index of the first selection.
264-
# If ‘random’, picks a random value when fit starts.
265-
initialize=0,
266-
)
267-
selector.fit(X, y)
268-
269-
Xr = selector.transform(X)
179+
.. autoclass:: skmatter.sample_selection.PCovFPS
180+
:members:
181+
:undoc-members:
182+
:inherited-members:
270183

271184
.. _Voronoi-FPS-api:
272185

273186
Voronoi FPS
274-
###########
275-
276-
.. currentmodule:: skmatter.sample_selection._voronoi_fps
277-
278-
.. autoclass :: VoronoiFPS
279-
280-
These selectors can be instantiated using
281-
:py:class:`skmatter.sample_selection.VoronoiFPS`.
187+
-----------
282188

283-
.. code-block:: python
189+
.. autoclass:: skmatter.sample_selection.VoronoiFPS
190+
:members:
191+
:undoc-members:
192+
:inherited-members:
284193

285-
from skmatter.feature_selection import VoronoiFPS
286-
287-
selector = VoronoiFPS(
288-
n_to_select=4,
289-
progress_bar=True,
290-
score_threshold=1e-12,
291-
full=False,
292-
# n_trial_calculation used for calculation of full_fraction,
293-
# so you need to determine only one parameter
294-
n_trial_calculation=4,
295-
full_fraction=None,
296-
# int or 'random', default=0
297-
# Index of the first selection.
298-
# If ‘random’, picks a random value when fit starts.
299-
initialize=0,
300-
)
301-
selector.fit(X)
302-
303-
Xr = selector.transform(X)
304194

305195
When *Not* to Use Voronoi FPS
306-
-----------------------------
196+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
307197

308198
In many cases, this algorithm may not improve efficiency. For example, for
309199
simple metrics (such as Euclidean distance), Voronoi FPS will likely not accelerate, and
@@ -315,27 +205,9 @@ bookkeeping significantly degrades the speed of work compared to FPS.
315205
.. _DCH-api:
316206

317207
Directional Convex Hull (DCH)
318-
#############################
319-
.. currentmodule:: skmatter.sample_selection._base
320-
321-
.. autoclass :: DirectionalConvexHull
322-
323-
This selector can be instantiated using
324-
:class:`skmatter.sample_selection.DirectionalConvexHull`.
325-
326-
.. code-block:: python
327-
328-
from skmatter.sample_selection import DirectionalConvexHull
329-
330-
selector = DirectionalConvexHull(
331-
# Indices of columns of X to use for fitting
332-
# the convex hull
333-
low_dim_idx=[0, 1],
334-
)
335-
selector.fit(X, y)
208+
-----------------------------
336209

337-
# Get the distance to the convex hull for samples used to fit the
338-
# convex hull. This can also be called using other samples (X_new)
339-
# and corresponding properties (y_new) that were not used to fit
340-
# the hull.
341-
Xr = selector.score_samples(X, y)
210+
.. autoclass:: skmatter.sample_selection.DirectionalConvexHull
211+
:members:
212+
:undoc-members:
213+
:inherited-members:
