Commit 76e921f

minor corrections
1 parent 47fe2b1 commit 76e921f

7 files changed: +37 -35 lines changed

docs/src/anatomy_of_an_implementation.md

Lines changed: 11 additions & 10 deletions
@@ -37,8 +37,8 @@ implementation given later.
 If the `data` object consumed by `fit`, `predict`, or `transform` is not
 not a suitable table¹, array³, tuple of tables and arrays, or some
 other object implementing
-the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/)
-`getobs`/`numobs` interface,
+the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/)
+`getobs`/`numobs` interface,
 then an implementation must: (i) overload [`obs`](@ref) to articulate how
 provided data can be transformed into a form that does support
 this interface, as illustrated below under
@@ -232,7 +232,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined:
 Ridge,
 constructor = Ridge,
 kinds_of_proxy=(Point(),),
-tags = (:regression,),
+tags = ("regression",),
 functions = (
 :(LearnAPI.fit),
 :(LearnAPI.learner),
@@ -295,6 +295,7 @@ nothing # hide
 learner = Ridge(lambda=0.5)
 @functions learner
 ```
+(Exact output may differ here because of way documentation is generated.)

 Training and predicting:

@@ -353,7 +354,7 @@ LearnAPI.strip(model::RidgeFitted) =
 Ridge,
 constructor = Ridge,
 kinds_of_proxy=(Point(),),
-tags = (:regression,),
+tags = ("regression",),
 functions = (
 :(LearnAPI.fit),
 :(LearnAPI.learner),
@@ -381,10 +382,10 @@ or `predict`, such as the matrix version `A` of `X` in the ridge example. That
 factor out of `fit` (and also `predict`) a data pre-processing step, `obs`, to expose
 its outcomes. These outcomes become alternative user inputs to `fit`/`predict`.

-In the default case, the alternative data representations will implement the MLUtils.jl
-`getobs/numobs` interface for observation subsampling, which is generally all a user or
-meta-algorithm will need, before passing the data on to `fit`/`predict` as you would the
-original data.
+In typical case (where [`LearnAPI.data_interface`](@ref) not overloaded) the alternative data
+representations will implement the MLUtils.jl `getobs/numobs` interface for observation
+subsampling, which is generally all a user or meta-algorithm will need, before passing the
+data on to `fit`/`predict` as you would the original data.

 So, instead of the pattern

@@ -472,7 +473,7 @@ LearnAPI.fit(learner::Ridge, data; kwargs...) =
 Providing `fit` signatures matching the output of [`obs`](@ref), is the first part of the
 `obs` contract. Since `obs(learner, data)` should evidently support all `data` that
 `fit(learner, data)` supports, we must be able to apply `obs(learner, _)` to it's own
-output (`observations` below). This leads to the additional "no-op" declaration
+output (`observations` below). This leads to the additional declaration

 ```@example anatomy2
 LearnAPI.obs(::Ridge, observations::RidgeFitObs) = observations
@@ -529,7 +530,7 @@ LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A

 Since LearnAPI.jl provides fallbacks for `obs` that simply return the unadulterated data
 argument, overloading `obs` is optional. This is provided data in publicized
-`fit`/`predict` signatures consists only of objects implement the
+`fit`/`predict` signatures already consists only of objects implement the
 [`LearnAPI.RandomAccess`](@ref) interface (most tables¹, arrays³, and tuples thereof).

 To opt out of supporting the MLUtils.jl interface altogether, an implementation must

docs/src/common_implementation_patterns.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ which introduces the main interface objects and terminology.

 Although an implementation is defined purely by the methods and traits it implements, many
 implementations fall into one (or more) of the following informally understood patterns or
-"tasks":
+tasks:

 - [Regression](@ref): Supervised learners for continuous targets

docs/src/fit_update.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ fit(learner; verbosity=LearnAPI.default_verbosity()) -> static_model
 ```

 A "static" algorithm is one that does not generalize to new observations (e.g., some
-clustering algorithms); there is no training data and the algorithm is executed by
+clustering algorithms); there is no training data and heavy lifting is carried out by
 `predict` or `transform` which receive the data. See example below.


docs/src/patterns/transformers.md

Lines changed: 2 additions & 4 deletions
@@ -1,7 +1,5 @@
 # [Transformers](@id transformers)

-Check out the following examples:
+Check out the following examples from the TestLearnAPI.jl test suite:

-- [Truncated
-SVD]((https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl
-(from the TestLearnAPI.jl test suite)
+- [Truncated SVD](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl)

docs/src/reference.md

Lines changed: 19 additions & 16 deletions
@@ -12,7 +12,7 @@ The LearnAPI.jl specification is predicated on a few basic, informally defined n

 ### Data and observations

-ML/statistical algorithms are typically applied in conjunction with resampling of
+ML/statistical algorithms are frequently applied in conjunction with resampling of
 *observations*, as in
 [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)). In this
 document *data* will always refer to objects encapsulating an ordered sequence of
@@ -35,9 +35,14 @@ see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details.

 Besides the data it consumes, a machine learning algorithm's behavior is governed by a
 number of user-specified *hyperparameters*, such as the number of trees in a random
-forest. In LearnAPI.jl, one is allowed to have hyperparameters that are not data-generic.
-For example, a class weight dictionary, which will only make sense for a target taking
-values in the set of dictionary keys, can be specified as a hyperparameter.
+forest. Hyperparameters are understood in a rather broad sense. For example, one is
+allowed to have hyperparameters that are not data-generic. For example, a class weight
+dictionary, which will only make sense for a target taking values in the set of specified
+dictionary keys, should be given as a hyperparameter. For simplicity, LearnAPI.jl
+discourages "run time" parameters (extra arguments to `fit`) such as acceleration
+options (cpu/gpu/multithreading/multiprocessing). These should be included as
+hyperparameters as far as possible. An exception is the compulsory `verbosity` keyword
+argument of `fit`.


 ### [Targets and target proxies](@id proxy)
@@ -56,16 +61,16 @@ compared with censored ground truth survival times. And so on ...

 #### Definitions

-More generally, whenever we have a variable (e.g., a class label) that can, at least in
-principle, be paired with a predicted value, or some predicted "proxy" for that variable
-(such as a class probability), then we call the variable a *target* variable, and the
-predicted output a *target proxy*. In this definition, it is immaterial whether or not the
-target appears in training (the algorithm is supervised) or whether or not predictions
-generalize to new input observations (the algorithm "learns").
+More generally, whenever we have a variable that can, at least in principle, be paired
+with a predicted value, or some predicted "proxy" for that variable (such as a class
+probability), then we call the variable a *target* variable, and the predicted output a
+*target proxy*. In this definition, it is immaterial whether or not the target appears in
+training (the algorithm is supervised) or whether or not predictions generalize to new
+input observations (the algorithm "learns").

 LearnAPI.jl provides singleton [target proxy types](@ref proxy_types) for prediction
-dispatch. These are also used to distinguish performance metrics provided by the package
-[StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/).
+dispatch. These are the same types used to distinguish performance metrics provided by the
+package [StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/).


 ### [Learners](@id learners)
@@ -149,9 +154,7 @@ interface.)
 [`LearnAPI.learner`](@ref), [`LearnAPI.constructor`](@ref) and
 [`LearnAPI.functions`](@ref).

-Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). For a
-minimal (but useless) implementation, see the implementation of `SmallLearner`
-[here](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/traits.jl).
+Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref).

 ### List of methods

@@ -187,7 +190,7 @@ minimal (but useless) implementation, see the implementation of `SmallLearner`
 - [Accessor functions](@ref accessor_functions): these include functions like
 `LearnAPI.feature_importances` and `LearnAPI.training_losses`, for extracting, from
 training outcomes, information common to many learners. This includes
-[`LearnAPI.strip(model)`](@ref) for replacing a learning outcome `model` with a
+[`LearnAPI.strip(model)`](@ref) for replacing a learning outcome, `model`, with a
 serializable version that can still `predict` or `transform`.

 - [Learner traits](@ref traits): methods that promise specific learner behavior or

docs/src/traits.md

Lines changed: 2 additions & 2 deletions
@@ -78,8 +78,8 @@ requires:

 1. *Finiteness:* The value of a trait is the same for all `learner`s with same value of
 [`LearnAPI.constructor(learner)`](@ref). This typically means trait values do not
-depend on type parameters! For composite models (`LearnAPI.learners(learner)`
-non-empty) this requirement is dropped.
+depend on type parameters! For composite models (non-empty
+`LearnAPI.learners(learner)`) this requirement is dropped.

 2. *Low level deserializability:* It should be possible to evaluate the trait *value* when
 `LearnAPI` and `ScientificTypesBase` are the only imported modules.

src/traits.jl

Lines changed: 1 addition & 1 deletion
@@ -136,7 +136,7 @@ argument) are excluded.

 ```
 julia> @functions my_feature_selector
-(fit, LearnAPI.learner, strip, obs, transform)
+(fit, LearnAPI.learner, clone, strip, obs, transform)

 ```
