
Commit 8e8123a: doc improvements

1 parent 07c815e

12 files changed: +81 −44 lines

docs/make.jl

Lines changed: 1 addition & 1 deletion
````diff
@@ -18,7 +18,7 @@ makedocs(
         "fit/update" => "fit_update.md",
         "predict/transform" => "predict_transform.md",
         "Kinds of Target Proxy" => "kinds_of_target_proxy.md",
-        "obs" => "obs.md",
+        "obs and Data Interfaces" => "obs.md",
         "target/weights/features" => "target_weights_features.md",
         "Accessor Functions" => "accessor_functions.md",
         "Learner Traits" => "traits.md",
````

docs/src/anatomy_of_an_implementation.md

Lines changed: 38 additions & 7 deletions
````diff
@@ -1,6 +1,6 @@
 # Anatomy of an Implementation
 
-This section explains a detailed implementation of the LearnAPI.jl for naive [ridge
+This tutorial details an implementation of LearnAPI.jl for naive [ridge
 regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of
 workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also
 refer to the [demonstration](@ref workflow) of the implementation given later.
@@ -35,8 +35,7 @@ A transformer ordinarily implements `transform` instead of `predict`. For more o
 then an implementation must: (i) overload [`obs`](@ref) to articulate how
 provided data can be transformed into a form that does support
 this interface, as illustrated below under
-[Providing a separate data front end](@ref), and which may additionally
-enable certain performance benefits; or (ii) overload the trait
+[Providing a separate data front end](@ref); or (ii) overload the trait
 [`LearnAPI.data_interface`](@ref) to specify a more relaxed data
 API.
@@ -62,7 +61,7 @@ nothing # hide
 
 Instances of `Ridge` are *[learners](@ref learners)*, in LearnAPI.jl parlance.
 
-Associated with each new type of LearnAPI.jl [learner](@ref learners) will be a keyword
+Associated with each new type of LearnAPI.jl learner will be a keyword
 argument constructor, providing default values for all properties (typically, struct
 fields) that are not other learners, and we must implement
 [`LearnAPI.constructor(learner)`](@ref), for recovering the constructor from an instance:
@@ -365,9 +364,41 @@ y = 2a - b + 3c + 0.05*rand(n)
 An implementation may optionally implement [`obs`](@ref), to expose to the user (or some
 meta-algorithm like cross-validation) the representation of input data internal to `fit`
 or `predict`, such as the matrix version `A` of `X` in the ridge example. That is, we may
-factor out of `fit` (and also `predict`) the data pre-processing step, `obs`, to expose
-its outcomes. These outcomes become alternative user inputs to `fit`/`predict`. To see the
-use of `obs` in action, see [below](@ref advanced_demo).
+factor out of `fit` (and also `predict`) a data pre-processing step, `obs`, to expose
+its outcomes. These outcomes become alternative user inputs to `fit`/`predict`.
+
+In the default case, the alternative data representations will implement the MLUtils.jl
+`getobs/numobs` interface for observation subsampling, which is generally all a user or
+meta-algorithm will need, before passing the data on to `fit`/`predict` as you would the
+original data.
+
+So, instead of the pattern
+
+```julia
+model = fit(learner, data)
+predict(model, newdata)
+```
+
+one enables the following alternative (which in any case will still work, because of a
+no-op `obs` fallback provided by LearnAPI.jl):
+
+```julia
+observations = obs(learner, data) # pre-processed training data
+
+# optional subsampling:
+observations = MLUtils.getobs(observations, train_indices)
+
+model = fit(learner, observations)
+
+newobservations = obs(model, newdata)
+
+# optional subsampling:
+newobservations = MLUtils.getobs(newobservations, test_indices)
+
+predict(model, newobservations)
+```
+
+See also the demonstration [below](@ref advanced_demo).
 
 Here we specifically wrap all the pre-processed data into a single object, for which we
 introduce a new type:
````
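The diffed excerpt ends before showing that type. A plausible sketch of such a wrapper, supporting MLUtils.jl subsampling as the text describes, is given below; the names `RidgeFitObs` and its fields are assumptions reconstructed from context, not taken from the diffed source:

```julia
import MLUtils

# Hypothetical container for the pre-processed ridge training data
# returned by `obs(learner, data)`:
struct RidgeFitObs{T}
    A::Matrix{T}          # feature matrix (observations as columns, assumed)
    names::Vector{Symbol} # feature names
    y::Vector{T}          # target vector
end

# observation subsampling, so `MLUtils.getobs(observations, I)` works:
Base.getindex(data::RidgeFitObs, I) = RidgeFitObs(data.A[:, I], data.names, data.y[I])
MLUtils.getobs(data::RidgeFitObs, I) = data[I]
MLUtils.numobs(data::RidgeFitObs) = length(data.y)
```

With such a wrapper, `fit` can accept either raw user data or a (possibly subsampled) `RidgeFitObs` instance.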

docs/src/obs.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -47,8 +47,6 @@ import MLUtils
 learner = <some supervised learner>
 
 data = <some data that `fit` can consume, with 30 observations>
-X = LearnAPI.features(learner, data)
-y = LearnAPI.target(learner, data)
 
 train_test_folds = map([1:10, 11:20, 21:30]) do test
     (setdiff(1:30, test), test)
@@ -65,12 +63,14 @@ scores = map(train_test_folds) do (train, test)
 
     # predict on the fold complement:
     if never_trained
+        X = LearnAPI.features(learner, data)
         global predictobs = obs(model, X)
         global never_trained = false
     end
     predictobs_subset = MLUtils.getobs(predictobs, test)
     ŷ = predict(model, Point(), predictobs_subset)
 
+    y = LearnAPI.target(learner, data)
     return <score comparing ŷ with y[test]>
 
 end
@@ -96,8 +96,8 @@ obs
 ### [Data interfaces](@id data_interfaces)
 
 New implementations must overload [`LearnAPI.data_interface(learner)`](@ref) if the
-output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess`](@ref). (Arrays, most
-tables, and all tuples thereof, implement `RandomAccess`.)
+output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess()`](@ref). Arrays, most
+tables, and all tuples thereof, implement `RandomAccess()`.
 
 - [`LearnAPI.RandomAccess`](@ref) (default)
 - [`LearnAPI.FiniteIterable`](@ref)
````

docs/src/predict_transform.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -86,7 +86,7 @@ dimension using distances from the cluster centres.
 
 Learners may additionally overload `transform` to apply `fit` first, using the supplied
 data if required, and then immediately `transform` the same data. In that case the first
-argument of `transform` is an *learner* instead of the output of `fit`:
+argument of `transform` is a *learner* instead of the output of `fit`:
 
 ```julia
 transform(learner, data) # `fit` implied
````

docs/src/reference.md

Lines changed: 13 additions & 7 deletions
````diff
@@ -80,9 +80,8 @@ Informally, we will sometimes use the word "model" to refer to the output of
 `fit(learner, ...)` (see below), something which typically *does* store learned
 parameters.
 
-For `learner` to be a valid LearnAPI.jl learner,
-[`LearnAPI.constructor(learner)`](@ref) must be defined and return a keyword constructor
-enabling recovery of `learner` from its properties:
+For every `learner`, [`LearnAPI.constructor(learner)`](@ref) must return a keyword
+constructor enabling recovery of `learner` from its properties:
 
 ```julia
 properties = propertynames(learner)
@@ -92,7 +91,7 @@ named_properties = NamedTuple{properties}(getproperty.(Ref(learner), properties)
 
 which can be tested with `@assert `[`LearnAPI.clone(learner)`](@ref)` == learner`.
 
-Note that if if `learner` is an instance of a *mutable* struct, this requirement
+Note that if `learner` is an instance of a *mutable* struct, this requirement
 generally requires overloading `Base.==` for the struct.
 
 !!! important
@@ -124,6 +123,13 @@ struct GradientRidgeRegressor{T<:Real}
     epochs::Int
     l2_regularization::T
 end
+
+"""
+    GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01)
+
+Instantiate a gradient ridge regressor with the specified hyperparameters.
+
+"""
 GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) =
     GradientRidgeRegressor(learning_rate, epochs, l2_regularization)
 LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor
@@ -132,9 +138,9 @@ LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor
 ## Documentation
 
 Attach public LearnAPI.jl-related documentation for a learner to its *constructor*,
-rather than to the struct defining its type. In this way, a learner can implement
-multiple interfaces, in addition to the LearnAPI interface, with separate document strings
-for each.
+rather than to the struct defining its type, as shown in the example above. (In this way,
+multiple interfaces can share a common struct, with separate document strings for each
+interface.)
 
 ## Methods
````
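The constructor/recovery contract described in this hunk can be checked mechanically. A minimal sketch, assuming a hypothetical learner type `MyLearner` (not part of the diff):

```julia
import LearnAPI

# hypothetical learner with one hyperparameter:
struct MyLearner
    lambda::Float64
end

# keyword constructor supplying defaults:
MyLearner(; lambda=0.1) = MyLearner(lambda)

# recover the constructor from an instance:
LearnAPI.constructor(::MyLearner) = MyLearner

# the recovery property quoted above:
learner = MyLearner(lambda=0.5)
properties = propertynames(learner)
named_properties = NamedTuple{properties}(getproperty.(Ref(learner), properties))
@assert LearnAPI.constructor(learner)(; named_properties...) == learner
```

Because `MyLearner` is immutable, the default `==` suffices here; a mutable struct would need `Base.==` overloaded, as the hunk notes.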

src/fit_update.jl

Lines changed: 3 additions & 2 deletions
````diff
@@ -21,7 +21,8 @@ The signature `fit(learner; verbosity=...)` (no `data`) is provided by learners
 not generalize to new observations (called *static algorithms*). In that case,
 `transform(model, data)` or `predict(model, ..., data)` carries out the actual algorithm
 execution, writing any byproducts of that operation to the mutable object `model` returned
-by `fit`.
+by `fit`. Inspect the value of [`LearnAPI.is_static(learner)`](@ref) to determine whether
+`fit` consumes `data` or not.
 
 Use `verbosity=0` for warnings only, and `-1` for silent training.
@@ -117,7 +118,7 @@ learner = MyNeuralNetwork(epochs=10, learning_rate=0.01)
 model = fit(learner, data)
 
 # train for two more epochs using new data and new learning rate:
-model = update_observations(model, new_data; epochs=2, learning_rate=0.1)
+model = update_observations(model, new_data; epochs=12, learning_rate=0.1)
 ```
 
 When following the call `fit(learner, data)`, the `update` call is semantically
````
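The `LearnAPI.is_static` inspection added in the first hunk can be used like this; a sketch only, with `learner` and `data` standing in for real objects:

```julia
import LearnAPI

# Branch on whether `fit` consumes data (static algorithms do not):
if LearnAPI.is_static(learner)
    model = fit(learner)         # no data at training time
    W = transform(model, data)   # the actual computation happens here,
                                 # byproducts written to mutable `model`
else
    model = fit(learner, data)   # training generalizes to new observations
    W = transform(model, data)
end
```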

src/obs.jl

Lines changed: 2 additions & 5 deletions
````diff
@@ -25,23 +25,20 @@ model = fit(learner, data_train)
 ŷ = predict(model, Point(), X[101:150])
 ```
 
-Alternative, data agnostic, workflow using `obs` and the MLUtils.jl method `getobs`
-(assumes `LearnAPI.data_interface(learner) == RandomAccess()`):
+Alternative workflow using `obs` and the MLUtils.jl method `getobs` to carry out
+subsampling (assumes `LearnAPI.data_interface(learner) == RandomAccess()`):
 
 ```julia
 import MLUtils
-
 fit_observations = obs(learner, data)
 model = fit(learner, MLUtils.getobs(fit_observations, 1:100))
-
 predict_observations = obs(model, X)
 ẑ = predict(model, Point(), MLUtils.getobs(predict_observations, 101:150))
 @assert ẑ == ŷ
 ```
 
 See also [`LearnAPI.data_interface`](@ref).
-
 # Extended help
 
 # New implementations
````

src/predict_transform.jl

Lines changed: 8 additions & 5 deletions
````diff
@@ -8,7 +8,8 @@ DOC_MUTATION(op) =
     """
 
     If [`LearnAPI.is_static(learner)`](@ref) is `true`, then `$op` may mutate its first
-    argument, but not in a way that alters the result of a subsequent call to `predict`,
+    argument (to record byproducts of the computation not naturally part of the return
+    value) but not in a way that alters the result of a subsequent call to `predict`,
     `transform` or `inverse_transform`. See more at [`fit`](@ref).
 
     """
@@ -82,8 +83,9 @@ See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref).
 
 # Extended help
 
-Note `predict` must not mutate any argument, except in the special case
-`LearnAPI.is_static(learner) == true`.
+In the special case `LearnAPI.is_static(learner) == true`, it is possible that
+`predict(model, ...)` will mutate `model`, but not in a way that affects subsequent
+`predict` calls.
 
 # New implementations
@@ -147,8 +149,9 @@ or, in one step (where supported):
 W = transform(learner, X) # `fit` implied
 ```
 
-Note `transform` does not mutate any argument, except in the special case
-`LearnAPI.is_static(learner) == true`.
+In the special case `LearnAPI.is_static(learner) == true`, it is possible that
+`transform(model, ...)` will mutate `model`, but not in a way that affects subsequent
+`transform` calls.
 
 See also [`fit`](@ref), [`predict`](@ref),
 [`inverse_transform`](@ref).
````

src/target_weights_features.jl

Lines changed: 2 additions & 3 deletions
````diff
@@ -80,10 +80,9 @@ ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = Point()`
 ```
 
 For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(learner)`)
-`ŷ` above is generally intended to be an approximate proxy for `LearnAPI.target(learner,
-data)`, the training target.
+`ŷ` above is generally intended to be an approximate proxy for the target variable.
 
-The object `X` returned by `LearnAPI.target` has the same number of observations as
+The object `X` returned by `LearnAPI.features` has the same number of observations as
 `observations` does and is guaranteed to implement the data interface specified by
 [`LearnAPI.data_interface(learner)`](@ref).
````
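For concreteness, the relationship between `LearnAPI.features`, `LearnAPI.target`, and `predict` described in this hunk can be sketched for tuple-form data. That `(X, y)` tuples split this way is an assumption about common fallback behavior, not something stated in the diff:

```julia
import LearnAPI

data = (X, y)  # supervised data as a (features, target) tuple (assumed form)

# extract the parts consumed and predicted by the model:
features = LearnAPI.features(learner, data)  # expected to act like X
target = LearnAPI.target(learner, data)      # expected to act like y

model = fit(learner, data)
ŷ = predict(model, Point(), features)        # approximate proxy for `target`
```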

src/traits.jl

Lines changed: 2 additions & 2 deletions
````diff
@@ -79,9 +79,9 @@ All new implementations must implement this trait. Here's a checklist for elemen
 return value:
 
 | expression             | implementation compulsory? | include in returned tuple?         |
-|------------------------|----------------------------|------------------------------------|
+|:-----------------------|:---------------------------|:-----------------------------------|
 | `:(LearnAPI.fit)`      | yes                        | yes                                |
-| `:(LearnAPI.learner)`  | yes                        | yes                                |
+| `:(LearnAPI.learner)`  | yes                        | yes                                |
 | `:(LearnAPI.strip)`    | no                         | yes                                |
 | `:(LearnAPI.obs)`      | no                         | yes                                |
 | `:(LearnAPI.features)` | no                         | yes, unless `fit` consumes no data |
````
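A `LearnAPI.functions` return value consistent with the checklist above might look as follows; `MyLearner` is a hypothetical learner type introduced only for illustration:

```julia
import LearnAPI

struct MyLearner end  # placeholder learner type

# declare which LearnAPI.jl methods this learner implements:
LearnAPI.functions(::MyLearner) = (
    :(LearnAPI.fit),      # compulsory
    :(LearnAPI.learner),  # compulsory
    :(LearnAPI.strip),
    :(LearnAPI.obs),
    :(LearnAPI.features), # included, since `fit` consumes data
    :(LearnAPI.predict),
)
```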
