From 1d3be8e1c550e76d387aa5537a3ba0e4c1399c74 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 25 Jan 2025 11:00:57 +1300 Subject: [PATCH 01/14] minor doc clarification --- src/traits.jl | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 46004d17..7c9a19ab 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -440,16 +440,17 @@ fit_observation_scitype(::Any) = Union{} LearnAPI.target_observation_scitype(learner) Return an upper bound `S` on the scitype of each observation of an applicable target -variable. Specifically: +variable. Specifically, both of the following is always true: - If `:(LearnAPI.target) in LearnAPI.functions(learner)` (i.e., `fit` consumes target - variables) then "target" means anything returned by `LearnAPI.target(learner, data)`, - where `data` is an admissible argument in the call `fit(learner, data)`. + variables) then "target" means anything returned by [`LearnAPI.target(learner, + observations)`](@ref), where `observations = `[`LearnAPI.obs(learner, data)`](@ref) and + `data` is an admissible argument in the call [`fit(learner, data)`](@ref). - `S` will always be an upper bound on the scitype of (point) observations that could be conceivably extracted from the output of [`predict`](@ref). -To illustate the second case, suppose we have +To illustrate the second property, suppose we have ```julia model = fit(learner, data) From 874201dfc6c490ea3f6adfc6fa1c9f8c73715394 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 25 Jan 2025 12:51:02 +1300 Subject: [PATCH 02/14] rename Continuous -> Interpolated --- docs/src/traits.md | 2 +- src/types.jl | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/src/traits.md b/docs/src/traits.md index a95404bf..83f87b41 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -82,7 +82,7 @@ requires: non-empty) this requirement is dropped. 2.
*Low level deserializability:* It should be possible to evaluate the trait *value* when - `LearnAPI` is the only imported module. + `LearnAPI` and `ScientificTypesBase` are the only imported modules. Because of 1, combining a lot of functionality into one learner (e.g. the learner can perform both classification or regression) can mean traits are necessarily less diff --git a/src/types.jl b/src/types.jl index faa6d250..b204dc6a 100644 --- a/src/types.jl +++ b/src/types.jl @@ -42,7 +42,7 @@ See also [`LearnAPI.KindOfProxy`](@ref). | `SurvivalDistribution` | probability distribution for survival time | | `SurvivalHazardFunction` | hazard function for survival time | | `OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | -| `Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | +| `Interpolated` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | ¹Provided for completeness but discouraged to avoid [ambiguities in representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation). @@ -72,7 +72,7 @@ const IID_SYMBOLS = [ :SurvivalDistribution, :HazardFunction, :OutlierScore, - :Continuous, + :Interpolated, :Quantile, :Expectile, ] From 6ef477bdd35f14808e018fcd624e3bf6005d42e9 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 25 Jan 2025 13:01:24 +1300 Subject: [PATCH 03/14] re order iid kinds of proxy to put Interpolated second --- src/types.jl | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/types.jl b/src/types.jl index b204dc6a..269fff2c 100644 --- a/src/types.jl +++ b/src/types.jl @@ -20,9 +20,10 @@ See also [`LearnAPI.KindOfProxy`](@ref). 
# Extended help -| type | form of an observation | -|:-------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `Point` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode | +| type | form of an observation | +|:----------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `Point` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode | +| `Interpolated` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | | `Sampleable` | object that can be sampled to obtain object of the same form as target observation | | `Distribution` | explicit probability density/mass function whose sample space is all possible target observations | | `LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations | @@ -42,7 +43,6 @@ See also [`LearnAPI.KindOfProxy`](@ref). | `SurvivalDistribution` | probability distribution for survival time | | `SurvivalHazardFunction` | hazard function for survival time | | `OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | -| `Interpolated` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | ¹Provided for completeness but discouraged to avoid [ambiguities in representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation). From 6c633a113b8bb32d0040b8f5c7d897010cded0a5 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sat, 25 Jan 2025 11:25:09 +1300 Subject: [PATCH 04/14] dump fit_observation_scitype in favour of fit_scitype --- docs/src/traits.md | 4 ++-- src/traits.jl | 49 +++++++++++++++++++++++++++++++++------------- test/traits.jl | 2 +- 3 files changed, 38 insertions(+), 17 deletions(-) diff --git a/docs/src/traits.md b/docs/src/traits.md index 83f87b41..c2487082 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -28,7 +28,7 @@ In the examples column of the table below, `Continuous` is a name owned the pack | [`LearnAPI.human_name`](@ref)`(learner)` | human name for the learner; should be a noun | type name with spaces | "elastic net regressor" | | [`LearnAPI.iteration_parameter`](@ref)`(learner)` | symbolic name of an iteration parameter | `nothing` | :epochs | | [`LearnAPI.data_interface`](@ref)`(learner)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | -| [`LearnAPI.fit_observation_scitype`](@ref)`(learner)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(learner, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | +| [`LearnAPI.fit_scitype`](@ref)`(learner)` | upper bound on `scitype(data)` ensuring `fit(learner, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | | [`LearnAPI.target_observation_scitype`](@ref)`(learner)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | | [`LearnAPI.is_static`](@ref)`(learner)` | `true` if `fit` consumes no data | `false` | `true` | @@ -105,7 +105,7 @@ LearnAPI.nonlearners LearnAPI.human_name LearnAPI.data_interface LearnAPI.iteration_parameter -LearnAPI.fit_observation_scitype +LearnAPI.fit_scitype LearnAPI.target_observation_scitype LearnAPI.is_static ``` diff --git a/src/traits.jl b/src/traits.jl index 7c9a19ab..1d6ed1cb 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ 
-416,16 +416,33 @@ Implement if algorithm is iterative. Returns a symbol or `nothing`. """ iteration_parameter(::Any) = nothing +# """ +# LearnAPI.fit_observation_scitype(learner) + +# Return an upper bound `S` on the scitype of individual observations guaranteed to work +# when calling `fit`: if `observations = obs(learner, data)` and +# `ScientificTypes.scitype(collect(o)) <:S` for each `o` in `observations`, then the call +# `fit(learner, data)` is supported. + +# $DOC_EXPLAIN_EACHOBS + +# See also [`LearnAPI.target_observation_scitype`](@ref). + +# # New implementations + +# Optional. The fallback return value is `Union{}`. + +# """ +# fit_observation_scitype(::Any) = Union{} """ - LearnAPI.fit_observation_scitype(learner) + LearnAPI.fit_scitype(learner) -Return an upper bound `S` on the scitype of individual observations guaranteed to work -when calling `fit`: if `observations = obs(learner, data)` and -`ScientificTypes.scitype(collect(o)) <:S` for each `o` in `observations`, then the call -`fit(learner, data)` is supported. +Return an upper bound `S` on the `scitype` (scientific type) of `data` for which the call +[`fit(learner, data)`](@ref) is supported. Specifically, if `ScientificTypes.scitype(data) +<: S` then the call is guaranteed to succeed. If not, the call may or may not succeed. -$DOC_EXPLAIN_EACHOBS +See ScientificTypes.jl documentation for more on the `scitype` function. See also [`LearnAPI.target_observation_scitype`](@ref). @@ -434,21 +451,25 @@ See also [`LearnAPI.target_observation_scitype`](@ref). Optional. The fallback return value is `Union{}`. """ -fit_observation_scitype(::Any) = Union{} +fit_scitype(::Any) = Union{} """ LearnAPI.target_observation_scitype(learner) -Return an upper bound `S` on the scitype of each observation of an applicable target -variable. 
Specifically, both of the following is always true: +Return an upper bound `S` on the `scitype` (scientific type) of each observation of any +target variable associated with the learner. See LearnAPI.jl documentation for the meaning +of "target variable". See ScientificTypes.jl documentation for an explanation of the +`scitype` function, which it provides. + +Specifically, both of the following is always true: - If `:(LearnAPI.target) in LearnAPI.functions(learner)` (i.e., `fit` consumes target variables) then "target" means anything returned by [`LearnAPI.target(learner, observations)`](@ref), where `observations = `[`LearnAPI.obs(learner, data)`](@ref) and - `data` is an admissible argument in the call [`fit(learner, data)`](@ref). + `data` is a supported argument in the call [`fit(learner, data)`](@ref). -- `S` will always be an upper bound on the scitype of (point) observations that could be - conceivably extracted from the output of [`predict`](@ref). +- `S` is an upper bound on the `scitype` of (point) observations that might normally be + extracted from the output of [`predict`](@ref). To illustrate the second property, suppose we have @@ -458,9 +479,9 @@ ŷ = predict(model, Sampleable(), data_new) ``` Then each individual sample generated by each "observation" of `ŷ` (a vector of sampleable -objects, say) will be bound in scitype by `S`. +objects, say) will be bound in `scitype` by `S`. -See also See also [`LearnAPI.fit_observation_scitype`](@ref). +See also [`LearnAPI.fit_scitype`](@ref).
# New implementations diff --git a/test/traits.jl b/test/traits.jl index 8b0353f3..0a7023dd 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -43,7 +43,7 @@ small = SmallLearner() @test LearnAPI.human_name(small) == "small learner" @test isnothing(LearnAPI.iteration_parameter(small)) @test LearnAPI.data_interface(small) == LearnAPI.RandomAccess() -@test !(6 isa LearnAPI.fit_observation_scitype(small)) +@test !(6 isa LearnAPI.fit_scitype(small)) @test 6 isa LearnAPI.target_observation_scitype(small) @test !LearnAPI.is_static(small) From 47fe2b1fbaf4b78e33443a95d6a64556e90457e2 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 25 Jan 2025 13:38:53 +1300 Subject: [PATCH 05/14] remove redundant doc constant --- src/traits.jl | 27 ++++++++++++++++----------- 1 file changed, 16 insertions(+), 11 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 1d6ed1cb..28f466b9 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -6,15 +6,15 @@ const DOC_UNKNOWN = "not overloaded the trait. " const DOC_ON_TYPE = "The value of the trait must depend only on the type of `learner`. " -const DOC_EXPLAIN_EACHOBS = - """ +# const DOC_EXPLAIN_EACHOBS = +# """ - Here, "for each `o` in `observations`" is understood in the sense of - [`LearnAPI.data_interface(learner)`](@ref). For example, if - `LearnAPI.data_interface(learner) == Base.HasLength()`, then this means "for `o` in - `MLUtils.eachobs(observations)`". +# Here, "for each `o` in `observations`" is understood in the sense of the data +# interface specified for the learner, [`LearnAPI.data_interface(learner)`](@ref). For +# example, if this is `LearnAPI.RandomAccess()`, then this means "for `o` in +# `MLUtils.eachobs(observations)`". - """ +# """ # # OVERLOADABLE TRAITS @@ -461,12 +461,17 @@ target variable associated with the learner. See LearnAPI.jl documentation for t of "target variable". See ScientificTypes.jl documentation for an explanation of the `scitype` function, which it provides. 
-Specifically, both of the following is always true: +Specifically, both of the following are always true: - If `:(LearnAPI.target) in LearnAPI.functions(learner)` (i.e., `fit` consumes target - variables) then "target" means anything returned by [`LearnAPI.target(learner, - observations)`](@ref), where `observations = `[`LearnAPI.obs(learner, data)`](@ref) and - `data` is a supported argument in the call [`fit(learner, data)`](@ref). + variables) then `ScientificTypes.scitype(o) <: S` for each `o` in `target_observations`, + where `target_observations = `[`LearnAPI.target(learner, observations)`](@ref), + `observations = `[`LearnAPI.obs(learner, data)`](@ref), and `data` is a supported + argument in the call [`fit(learner, data)`](@ref). Here, "for each `o` in + `target_observations`" is understood in the sense of the data interface specified for + the learner, [`LearnAPI.data_interface(learner)`](@ref). For example, if this is + `LearnAPI.RandomAccess()`, then this means "for each `o in + MLUtils.eachobs(target_observations)`". - `S` is an upper bound on the `scitype` of (point) observations that might normally be extracted from the output of [`predict`](@ref). From 76e921f756855a08ae57c742740bc8e9aa8c7603 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 25 Jan 2025 15:20:09 +1300 Subject: [PATCH 06/14] minor corrections --- docs/src/anatomy_of_an_implementation.md | 21 ++++++------- docs/src/common_implementation_patterns.md | 2 +- docs/src/fit_update.md | 2 +- docs/src/patterns/transformers.md | 6 ++-- docs/src/reference.md | 35 ++++++++++++---------- docs/src/traits.md | 4 +-- src/traits.jl | 2 +- 7 files changed, 37 insertions(+), 35 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index c977cf50..1b426f6e 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -37,8 +37,8 @@ implementation given later. 
If the `data` object consumed by `fit`, `predict`, or `transform` is not not a suitable table¹, array³, tuple of tables and arrays, or some other object implementing the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface, + the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) + `getobs`/`numobs` interface, then an implementation must: (i) overload [`obs`](@ref) to articulate how provided data can be transformed into a form that does support this interface, as illustrated below under @@ -232,7 +232,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined: Ridge, constructor = Ridge, kinds_of_proxy=(Point(),), - tags = (:regression,), + tags = ("regression",), functions = ( :(LearnAPI.fit), :(LearnAPI.learner), @@ -295,6 +295,7 @@ nothing # hide learner = Ridge(lambda=0.5) @functions learner ``` +(Exact output may differ here because of the way documentation is generated.) Training and predicting: @@ -353,7 +354,7 @@ LearnAPI.strip(model::RidgeFitted) = Ridge, constructor = Ridge, kinds_of_proxy=(Point(),), - tags = (:regression,), + tags = ("regression",), functions = ( :(LearnAPI.fit), :(LearnAPI.learner), @@ -381,10 +382,10 @@ or `predict`, such as the matrix version `A` of `X` in the ridge example. That factor out of `fit` (and also `predict`) a data pre-processing step, `obs`, to expose its outcomes. These outcomes become alternative user inputs to `fit`/`predict`. -In the default case, the alternative data representations will implement the MLUtils.jl -`getobs/numobs` interface for observation subsampling, which is generally all a user or -meta-algorithm will need, before passing the data on to `fit`/`predict` as you would the -original data.
+In the typical case (where [`LearnAPI.data_interface`](@ref) is not overloaded), the alternative data +representations will implement the MLUtils.jl `getobs/numobs` interface for observation +subsampling, which is generally all a user or meta-algorithm will need, before passing the +data on to `fit`/`predict` as you would the original data. So, instead of the pattern @@ -472,7 +473,7 @@ LearnAPI.fit(learner::Ridge, data; kwargs...) = Providing `fit` signatures matching the output of [`obs`](@ref), is the first part of the `obs` contract. Since `obs(learner, data)` should evidently support all `data` that `fit(learner, data)` supports, we must be able to apply `obs(learner, _)` to it's own -output (`observations` below). This leads to the additional "no-op" declaration +output (`observations` below). This leads to the additional declaration ```@example anatomy2 LearnAPI.obs(::Ridge, observations::RidgeFitObs) = observations @@ -529,7 +530,7 @@ LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A Since LearnAPI.jl provides fallbacks for `obs` that simply return the unadulterated data argument, overloading `obs` is optional. This is provided data in publicized -`fit`/`predict` signatures consists only of objects implement the +`fit`/`predict` signatures already consists only of objects that implement the [`LearnAPI.RandomAccess`](@ref) interface (most tables¹, arrays³, and tuples thereof). To opt out of supporting the MLUtils.jl interface altogether, an implementation must diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 85ebe507..0c57ff50 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -10,7 +10,7 @@ which introduces the main interface objects and terminology.
Although an implementation is defined purely by the methods and traits it implements, many implementations fall into one (or more) of the following informally understood patterns or -"tasks": +tasks: - [Regression](@ref): Supervised learners for continuous targets diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index c649e6dd..36179076 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -8,7 +8,7 @@ fit(learner; verbosity=LearnAPI.default_verbosity()) -> static_model ``` A "static" algorithm is one that does not generalize to new observations (e.g., some -clustering algorithms); there is no training data and the algorithm is executed by +clustering algorithms); there is no training data, and the heavy lifting is carried out by `predict` or `transform` which receive the data. See example below. diff --git a/docs/src/patterns/transformers.md b/docs/src/patterns/transformers.md index f085f928..c27f9682 100644 --- a/docs/src/patterns/transformers.md +++ b/docs/src/patterns/transformers.md @@ -1,7 +1,5 @@ # [Transformers](@id transformers) -Check out the following examples: +Check out the following examples from the TestLearnAPI.jl test suite: -- [Truncated - SVD]((https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl - (from the TestLearnAPI.jl test suite) +- [Truncated SVD](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl) diff --git a/docs/src/reference.md b/docs/src/reference.md index f068afc7..1e885e32 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -12,7 +12,7 @@ The LearnAPI.jl specification is predicated on a few basic, informally defined n ### Data and observations -ML/statistical algorithms are typically applied in conjunction with resampling of +ML/statistical algorithms are frequently applied in conjunction with resampling of *observations*, as in [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)).
In this document *data* will always refer to objects encapsulating an ordered sequence of @@ -35,9 +35,14 @@ see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details. Besides the data it consumes, a machine learning algorithm's behavior is governed by a number of user-specified *hyperparameters*, such as the number of trees in a random -forest. In LearnAPI.jl, one is allowed to have hyperparameters that are not data-generic. -For example, a class weight dictionary, which will only make sense for a target taking -values in the set of dictionary keys, can be specified as a hyperparameter. +forest. Hyperparameters are understood in a rather broad sense. For example, one is +allowed to have hyperparameters that are not data-generic. For instance, a class weight +dictionary, which will only make sense for a target taking values in the set of specified +dictionary keys, should be given as a hyperparameter. For simplicity, LearnAPI.jl +discourages "run time" parameters (extra arguments to `fit`) such as acceleration +options (cpu/gpu/multithreading/multiprocessing). These should be included as +hyperparameters as far as possible. An exception is the compulsory `verbosity` keyword +argument of `fit`. ### [Targets and target proxies](@id proxy) @@ -56,16 +61,16 @@ compared with censored ground truth survival times. And so on ... #### Definitions -More generally, whenever we have a variable (e.g., a class label) that can, at least in -principle, be paired with a predicted value, or some predicted "proxy" for that variable -(such as a class probability), then we call the variable a *target* variable, and the -predicted output a *target proxy*. In this definition, it is immaterial whether or not the -target appears in training (the algorithm is supervised) or whether or not predictions -generalize to new input observations (the algorithm "learns").
+More generally, whenever we have a variable that can, at least in principle, be paired +with a predicted value, or some predicted "proxy" for that variable (such as a class +probability), then we call the variable a *target* variable, and the predicted output a +*target proxy*. In this definition, it is immaterial whether or not the target appears in +training (the algorithm is supervised) or whether or not predictions generalize to new +input observations (the algorithm "learns"). LearnAPI.jl provides singleton [target proxy types](@ref proxy_types) for prediction -dispatch. These are also used to distinguish performance metrics provided by the package -[StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/). +dispatch. These are the same types used to distinguish performance metrics provided by the +package [StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/). ### [Learners](@id learners) @@ -149,9 +154,7 @@ interface.) [`LearnAPI.learner`](@ref), [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref). -Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). For a -minimal (but useless) implementation, see the implementation of `SmallLearner` -[here](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/traits.jl). +Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). ### List of methods @@ -187,7 +190,7 @@ minimal (but useless) implementation, see the implementation of `SmallLearner` - [Accessor functions](@ref accessor_functions): these include functions like `LearnAPI.feature_importances` and `LearnAPI.training_losses`, for extracting, from training outcomes, information common to many learners. This includes - [`LearnAPI.strip(model)`](@ref) for replacing a learning outcome `model` with a + [`LearnAPI.strip(model)`](@ref) for replacing a learning outcome, `model`, with a serializable version that can still `predict` or `transform`. 
- [Learner traits](@ref traits): methods that promise specific learner behavior or diff --git a/docs/src/traits.md b/docs/src/traits.md index c2487082..eebb9e8e 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -78,8 +78,8 @@ requires: 1. *Finiteness:* The value of a trait is the same for all `learner`s with same value of [`LearnAPI.constructor(learner)`](@ref). This typically means trait values do not - depend on type parameters! For composite models (`LearnAPI.learners(learner)` - non-empty) this requirement is dropped. + depend on type parameters! For composite models (non-empty + `LearnAPI.learners(learner)`) this requirement is dropped. 2. *Low level deserializability:* It should be possible to evaluate the trait *value* when `LearnAPI` and `ScientificTypesBase` are the only imported modules. diff --git a/src/traits.jl b/src/traits.jl index 28f466b9..56e0d2b8 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -136,7 +136,7 @@ argument) are excluded. ``` julia> @functions my_feature_selector -(fit, LearnAPI.learner, strip, obs, transform) +(fit, LearnAPI.learner, clone, strip, obs, transform) ``` From b6f606e6efc23bd8d269b6adaf7bf95e3a59a594 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sat, 25 Jan 2025 15:26:08 +1300 Subject: [PATCH 07/14] dump default_verbosity as likely introduces precompilation issues oops --- docs/src/anatomy_of_an_implementation.md | 4 ++-- docs/src/fit_update.md | 19 +++++++++--------- src/LearnAPI.jl | 1 - src/fit_update.jl | 11 +++++------ src/verbosity.jl | 25 ------------------------ test/runtests.jl | 1 - test/verbosity.jl | 7 ------- 7 files changed, 16 insertions(+), 52 deletions(-) delete mode 100644 src/verbosity.jl delete mode 100644 test/verbosity.jl diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 1b426f6e..fe9d9518 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -112,7 +112,7 @@ Note that we also include `learner` in the struct, for it must be possible to re The implementation of `fit` looks like this: ```@example anatomy -function LearnAPI.fit(learner::Ridge, data; verbosity=LearnAPI.default_verbosity()) +function LearnAPI.fit(learner::Ridge, data; verbosity=1) X, y = data @@ -443,7 +443,7 @@ methods - one to handle "regular" input, and one to handle the pre-processed dat (observations) which appears first below: ```@example anatomy2 -function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=LearnAPI.default_verbosity()) +function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) lambda = learner.lambda diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index 36179076..8e27126c 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -3,8 +3,8 @@ ### Training ```julia -fit(learner, data; verbosity=LearnAPI.default_verbosity()) -> model -fit(learner; verbosity=LearnAPI.default_verbosity()) -> static_model +fit(learner, data; verbosity=1) -> model +fit(learner; verbosity=1) -> static_model ``` A "static" algorithm is one that does not generalize to new observations (e.g., some @@ -101,18 +101,18 @@ See also [Density Estimation](@ref). 
Exactly one of the following must be implemented: -| method | fallback | -|:-----------------------------------------------------------------------|:---------| -| [`fit`](@ref)`(learner, data; verbosity=LearnAPI.default_verbosity())` | none | -| [`fit`](@ref)`(learner; verbosity=LearnAPI.default_verbosity())` | none | +| method | fallback | +|:--------------------------------------------|:---------| +| [`fit`](@ref)`(learner, data; verbosity=1)` | none | +| [`fit`](@ref)`(learner; verbosity=1)` | none | ### Updating | method | fallback | compulsory? | |:-------------------------------------------------------------------------------------|:---------|-------------| -| [`update`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)` | none | no | -| [`update_observations`](@ref)`(model, new_data; verbosity=..., hyperparameter_updates...)` | none | no | -| [`update_features`](@ref)`(model, new_data; verbosity=..., hyperparameter_updates...)` | none | no | +| [`update`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no | +| [`update_observations`](@ref)`(model, new_data; verbosity=1, hyperparameter_updates...)` | none | no | +| [`update_features`](@ref)`(model, new_data; verbosity=1, hyperparameter_updates...)` | none | no | There are some contracts governing the behaviour of the update methods, as they relate to a previous `fit` call. Consult the document strings for details. 
@@ -124,5 +124,4 @@ fit update update_observations update_features -LearnAPI.default_verbosity ``` diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index 9687c2e9..865f916d 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -1,7 +1,6 @@ module LearnAPI include("types.jl") -include("verbosity.jl") include("tools.jl") include("predict_transform.jl") include("fit_update.jl") diff --git a/src/fit_update.jl b/src/fit_update.jl index 015669e7..c1897b9f 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -1,8 +1,8 @@ # # FIT """ - fit(learner, data; verbosity=LearnAPI.default_verbosity()) - fit(learner; verbosity=LearnAPI.default_verbosity()) + fit(learner, data; verbosity=1) + fit(learner; verbosity=1) Execute the machine learning or statistical algorithm with configuration `learner` using the provided training `data`, returning an object, `model`, on which other methods, such @@ -26,7 +26,7 @@ by `fit`. Inspect the value of [`LearnAPI.is_static(learner)`](@ref) to determin Use `verbosity=0` for warnings only, and `-1` for silent training. -See also [`LearnAPI.default_verbosity`](@ref), [`predict`](@ref), [`transform`](@ref), +See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref), [`LearnAPI.functions`](@ref), [`obs`](@ref). # Extended help @@ -37,10 +37,9 @@ Implementation of exactly one of the signatures is compulsory. If `fit(learner; verbosity=...)` is implemented, then the trait [`LearnAPI.is_static`](@ref) must be overloaded to return `true`. -The signature must include `verbosity` with [`LearnAPI.default_verbosity()`](@ref) as -default. +The signature must include `verbosity` with `1` as default. -If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentation, then +If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentati[on, then [`LearnAPI.target(data)`](@ref) must be overloaded to return it. 
If [`predict`](@ref) or [`transform`](@ref) are implemented and consume data, then [`LearnAPI.features(data)`](@ref) must return something that can be passed as data to diff --git a/src/verbosity.jl b/src/verbosity.jl deleted file mode 100644 index 3723bb77..00000000 --- a/src/verbosity.jl +++ /dev/null @@ -1,25 +0,0 @@ -const DEFAULT_VERBOSITY = Ref(1) - -""" - LearnAPI.default_verbosity() - LearnAPI.default_verbosity(verbosity::Int) - -Respectively return, or set, the default `verbosity` level for LearnAPI.jl methods that -support it, which includes [`fit`](@ref), [`update`](@ref), -[`update_observations`](@ref), and [`update_features`](@ref). The effect in a top-level -call is generally: - - - -| `verbosity` | behaviour | -|:------------|:--------------| -| 1 | informational | -| 0 | warnings only | - - -Methods consuming `verbosity` generally call other verbosity-supporting methods -at one level lower, so increasing `verbosity` beyond `1` may be useful. - -""" -default_verbosity() = DEFAULT_VERBOSITY[] -default_verbosity(level) = (DEFAULT_VERBOSITY[] = level) diff --git a/test/runtests.jl b/test/runtests.jl index 056fa491..8a255c83 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -2,7 +2,6 @@ using Test test_files = [ "tools.jl", - "verbosity.jl", "traits.jl", "clone.jl", "predict_transform.jl", diff --git a/test/verbosity.jl b/test/verbosity.jl deleted file mode 100644 index 72ce29c8..00000000 --- a/test/verbosity.jl +++ /dev/null @@ -1,7 +0,0 @@ -using Test - -@test LearnAPI.default_verbosity() ==1 -LearnAPI.default_verbosity(42) -@test LearnAPI.default_verbosity() == 42 - -true From 337bc52b6de80958d9dbda2355b35cdec9058a4d Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom"
Date: Sun, 26 Jan 2025 12:39:58 +1300
Subject: [PATCH 08/14] more doc refinements

---
 src/fit_update.jl              | 10 ++++------
 src/predict_transform.jl       | 13 ++++++++++---
 src/target_weights_features.jl |  8 ++++----
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/src/fit_update.jl b/src/fit_update.jl
index c1897b9f..c33e40b8 100644
--- a/src/fit_update.jl
+++ b/src/fit_update.jl
@@ -39,12 +39,10 @@ overloaded to return `true`.

The signature must include `verbosity` with `1` as default.

-If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentati[on, then
-[`LearnAPI.target(data)`](@ref) must be overloaded to return it. If [`predict`](@ref) or
-[`transform`](@ref) are implemented and consume data, then
-[`LearnAPI.features(data)`](@ref) must return something that can be passed as data to
-these methods. A fallback returns `first(data)` if `data` is a tuple, and `data`
-otherwise.
+If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentation, then
+[`LearnAPI.target`](@ref) must be implemented. If [`predict`](@ref) or [`transform`](@ref)
+are implemented and consume data, then you may need to overload
+[`LearnAPI.features`](@ref).

The LearnAPI.jl specification has nothing to say regarding `fit` signatures with more than
two arguments. For convenience, for example, an implementation is free to implement a

diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index d4bfe0c8..0a92d3f5 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -98,8 +98,10 @@ implementation must be added to the list returned by
[`LearnAPI.kinds_of_proxy(learner)`](@ref). List all available kinds of proxy by doing
`LearnAPI.kinds_of_proxy()`.

-If `data` is not present in the implemented signature (e.g., for density estimators) then
-[`LearnAPI.features(learner, data)`](@ref) must return `nothing`.
+When `predict` is implemented, it may be necessary to overload
+[`LearnAPI.features`](@ref). If `data` is not present in the implemented signature (e.g.,
+for density estimators) then [`LearnAPI.features(learner, data)`](@ref) must always return
+`nothing`.

 $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.predict)"))

@@ -161,7 +163,12 @@ See also [`fit`](@ref), [`predict`](@ref),
# New implementations

Implementation for new LearnAPI.jl learners is
-optional. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.transform)"))
+optional.
+
+When `transform` is implemented, it may be necessary to overload
+[`LearnAPI.features`](@ref).
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.transform)"))

$(DOC_SLURPING(:transform))

diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl
index c14f467b..3fe95eae 100644
--- a/src/target_weights_features.jl
+++ b/src/target_weights_features.jl
@@ -24,7 +24,7 @@ the LearnAPI.jl documentation.

A fallback returns `nothing`. The method must be overloaded if [`fit`](@ref) consumes
data that includes a target variable. If `obs` is not being overloaded, then `observations`
-above is any `data` supported in calls of the form [`fit(learner, data)`](@ref). The form
+above is any `data` supported in calls of the form [`fit(learner, data)`](@ref). The form
of the output `y` should be suitable for pairing with the output of [`predict`](@ref),
in the evaluation of a loss function, for example.

@@ -93,11 +93,11 @@ The object `X` returned by `LearnAPI.features` has the same number of observatio

A fallback returns `first(observations)` if `observations` is a tuple, and otherwise
returns `observations`. New implementations may need to overload this method if this
-fallback is inadequate.
+fallback is inadequate.

For density estimators, whose `fit` typically consumes *only* a target variable, you
-should overload this method to return `nothing`. If `obs` is not being overloaded, then
-`observations` above is any `data` supported in calls of the form [`fit(learner,
+should overload this method to always return `nothing`.
If `obs` is not being overloaded, +then `observations` above is any `data` supported in calls of the form [`fit(learner, data)`](@ref). It must otherwise be possible to pass the return value `X` to `predict` and/or From 8dcfbde70a93e30994461f3eb730977e30c4ff93 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 3 Feb 2025 09:38:59 +1300 Subject: [PATCH 09/14] re-order features, target, weights in docs --- docs/make.jl | 2 +- ...get_weights_features.md => features_target_weights.md} | 8 ++++---- docs/src/reference.md | 4 ++-- 3 files changed, 7 insertions(+), 7 deletions(-) rename docs/src/{target_weights_features.md => features_target_weights.md} (96%) diff --git a/docs/make.jl b/docs/make.jl index 158117cd..95e0480a 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -20,7 +20,7 @@ makedocs( "predict/transform" => "predict_transform.md", "Kinds of Target Proxy" => "kinds_of_target_proxy.md", "obs and Data Interfaces" => "obs.md", - "target/weights/features" => "target_weights_features.md", + "features/target/weights" => "features_target_weights.md", "Accessor Functions" => "accessor_functions.md", "Learner Traits" => "traits.md", ], diff --git a/docs/src/target_weights_features.md b/docs/src/features_target_weights.md similarity index 96% rename from docs/src/target_weights_features.md rename to docs/src/features_target_weights.md index 925bae67..efac2b85 100644 --- a/docs/src/target_weights_features.md +++ b/docs/src/features_target_weights.md @@ -1,13 +1,13 @@ -# [`target`, `weights`, and `features`](@id input) +# [`features`, `target`, and `weights`](@id input) Methods for extracting parts of training observations. 
Here "observations" means the output of [`obs(learner, data)`](@ref); if `obs` is not overloaded for `learner`, then "observations" is any `data` supported in calls of the form [`fit(learner, data)`](@ref) ```julia +LearnAPI.features(learner, observations) -> LearnAPI.target(learner, observations) -> LearnAPI.weights(learner, observations) -> -LearnAPI.features(learner, observations) -> ``` Here `data` is something supported in a call of the form `fit(learner, data)`. @@ -33,15 +33,15 @@ training_loss = sum(ŷ .!= y) | method | fallback | compulsory? | |:----------------------------|:-----------------:|--------------------------| +| [`LearnAPI.features`](@ref) | see docstring | if fallback insufficient | | [`LearnAPI.target`](@ref) | returns `nothing` | no | | [`LearnAPI.weights`](@ref) | returns `nothing` | no | -| [`LearnAPI.features`](@ref) | see docstring | if fallback insufficient | # Reference ```@docs +LearnAPI.features LearnAPI.target LearnAPI.weights -LearnAPI.features ``` diff --git a/docs/src/reference.md b/docs/src/reference.md index 1e885e32..3f7a55dd 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -183,8 +183,8 @@ Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). implement the observation access API specified by [`LearnAPI.data_interface(learner)`](@ref). -- [`LearnAPI.target`](@ref input), [`LearnAPI.weights`](@ref input), - [`LearnAPI.features`](@ref): for extracting relevant parts of training data, where +- [`LearnAPI.features`](@ref input), [`LearnAPI.target`](@ref input), + [`LearnAPI.weights`](@ref input): for extracting relevant parts of training data, where defined. - [Accessor functions](@ref accessor_functions): these include functions like From 1edd43b7f9d4b26edf474d28cb13d31e1fb76e03 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Mon, 3 Feb 2025 09:49:04 +1300 Subject: [PATCH 10/14] re-org source files for target features weights --- src/LearnAPI.jl | 2 +- src/{target_weights_features.jl => features_target_weights.jl} | 0 test/{target_features.jl => features_target_weights.jl} | 0 test/runtests.jl | 2 +- 4 files changed, 2 insertions(+), 2 deletions(-) rename src/{target_weights_features.jl => features_target_weights.jl} (100%) rename test/{target_features.jl => features_target_weights.jl} (100%) diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index 865f916d..c32ab3b8 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -4,7 +4,7 @@ include("types.jl") include("tools.jl") include("predict_transform.jl") include("fit_update.jl") -include("target_weights_features.jl") +include("features_target_weights.jl") include("obs.jl") include("accessor_functions.jl") include("traits.jl") diff --git a/src/target_weights_features.jl b/src/features_target_weights.jl similarity index 100% rename from src/target_weights_features.jl rename to src/features_target_weights.jl diff --git a/test/target_features.jl b/test/features_target_weights.jl similarity index 100% rename from test/target_features.jl rename to test/features_target_weights.jl diff --git a/test/runtests.jl b/test/runtests.jl index 8a255c83..e8117976 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -7,7 +7,7 @@ test_files = [ "predict_transform.jl", "obs.jl", "accessor_functions.jl", - "target_features.jl", + "features_target_weights.jl", ] files = isempty(ARGS) ? test_files : ARGS From 8da64d9cbd4554679d4b386dba4277daffc6398b Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Tue, 4 Feb 2025 20:40:49 +1300 Subject: [PATCH 11/14] tweak the target/features/weights contracts and update docs --- docs/src/anatomy_of_an_implementation.md | 332 +++++++++++++---------- docs/src/features_target_weights.md | 28 +- docs/src/obs.md | 14 +- src/features_target_weights.jl | 124 +++++---- src/traits.jl | 5 +- src/types.jl | 23 +- 6 files changed, 294 insertions(+), 232 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index fe9d9518..ec59d06d 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -7,61 +7,56 @@ model = fit(learner, data) predict(model, newdata) ``` -Here `learner` specifies hyperparameters, while `model` stores learned parameters and any byproducts of algorithm execution. +Here `learner` specifies hyperparameters, while `model` stores learned parameters and any +byproducts of algorithm execution. -[Transformers](@ref) ordinarily implement `transform` instead of `predict`. For more on -`predict` versus `transform`, see [Predict or transform?](@ref) +Variations on this pattern: -["Static" algorithms](@ref static_algorithms) have a `fit` that consumes no `data` -(instead `predict` or `transform` does the heavy lifting). In [density -estimation](@ref density_estimation), `predict` consumes no data. +- [Transformers](@ref) ordinarily implement `transform` instead of `predict`. For more on + `predict` versus `transform`, see [Predict or transform?](@ref) + +- ["Static" (non-generalizing) algorithms](@ref static_algorithms) (some simple + transformers and some clustering algorithms) have a `fit` that consumes no + `data`. Instead `predict` or `transform` does the heavy lifting. + +- In [density estimation](@ref density_estimation), `predict` consumes no data. These are the basic possibilities. 
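The "static" variation above, for instance, corresponds to a workflow sketch like the following (the learner and its hyperparameter are hypothetical, shown only to illustrate the call pattern):

```julia
learner = MyClusterer(k=3)  # hypothetical static (non-generalizing) learner
model = fit(learner)        # no training data consumed here
labels = predict(model, LabelAmbiguous(), X)  # `predict` does the heavy lifting
```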
-Elaborating on the core pattern above, we detail in this tutorial an implementation of the +Elaborating on the core pattern above, this tutorial details an implementation of the LearnAPI.jl for naive [ridge regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also refer to the [demonstration](@ref workflow) of the implementation given later. -!!! note +## A basic implementation - New implementations of `fit`, `predict`, etc, - always have a *single* `data` argument as above. - For convenience, a signature such as `fit(learner, X, y)`, calling - `fit(learner, (X, y))`, can be added, but the LearnAPI.jl specification is - silent on the meaning or existence of signatures with extra arguments. +We suppose our algorithm's `fit` method consumes data in the form `(X, y)`, where +`X` is a suitable table¹ (the features) and `y` a vector (the target). -!!! note +!!! important - If the `data` object consumed by `fit`, `predict`, or `transform` is not - not a suitable table¹, array³, tuple of tables and arrays, or some - other object implementing - the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) - `getobs`/`numobs` interface, - then an implementation must: (i) overload [`obs`](@ref) to articulate how - provided data can be transformed into a form that does support - this interface, as illustrated below under - [Providing a separate data front end](@ref); or (ii) overload the trait - [`LearnAPI.data_interface`](@ref) to specify a more relaxed data - API. + Implementations wishing to support other data + patterns may need to take additional steps explained under + [Other data patterns](@ref di) below. The first line below imports the lightweight package LearnAPI.jl whose methods we will be extending. The second imports libraries needed for the core algorithm. 
+ ```@example anatomy using LearnAPI using LinearAlgebra, Tables nothing # hide ``` -## Defining learners +### Defining learners Here's a new type whose instances specify the single ridge regression hyperparameter: ```@example anatomy struct Ridge{T<:Real} - lambda::T + lambda::T end nothing # hide ``` @@ -75,7 +70,7 @@ fields) that are not other learners, and we must implement ```@example anatomy """ - Ridge(; lambda=0.1) + Ridge(; lambda=0.1) Instantiate a ridge regression learner, with regularization of `lambda`. """ @@ -89,7 +84,7 @@ For example, in this case, if `learner = Ridge(0.2)`, then the docstring to the *constructor*, not the struct. -## Implementing `fit` +### Implementing `fit` A ridge regressor requires two types of data for training: input features `X`, which here we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a vector.⁴ @@ -99,9 +94,9 @@ coefficients labelled by feature name for inspection after training: ```@example anatomy struct RidgeFitted{T,F} - learner::Ridge - coefficients::Vector{T} - named_coefficients::F + learner::Ridge + coefficients::Vector{T} + named_coefficients::F end nothing # hide ``` @@ -114,29 +109,29 @@ The implementation of `fit` looks like this: ```@example anatomy function LearnAPI.fit(learner::Ridge, data; verbosity=1) - X, y = data + X, y = data - # data preprocessing: - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - A = Tables.matrix(table, transpose=true) + # data preprocessing: + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + A = Tables.matrix(table, transpose=true) - lambda = learner.lambda + lambda = learner.lambda - # apply core algorithm: - coefficients = (A*A' + learner.lambda*I)\(A*y) # vector + # apply core algorithm: + coefficients = (A*A' + learner.lambda*I)\(A*y) # vector - # determine named coefficients: - named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + # determine named coefficients: 
+    named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]

-    # make some noise, if allowed:
-    verbosity > 0 && @info "Coefficients: $named_coefficients"
+    # make some noise, if allowed:
+    verbosity > 0 && @info "Coefficients: $named_coefficients"

-    return RidgeFitted(learner, coefficients, named_coefficients)
+    return RidgeFitted(learner, coefficients, named_coefficients)
end
```

-## Implementing `predict`
+### Implementing `predict`

One way users will be able to call `predict` is like this:

@@ -154,7 +149,7 @@ We provide this implementation for our ridge regressor:

```@example anatomy
LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
-    Tables.matrix(Xnew)*model.coefficients
+    Tables.matrix(Xnew)*model.coefficients
```

If the kind of proxy is omitted, as in `predict(model, Xnew)`, then a fallback grabs the
first element of the tuple returned by [`LearnAPI.kinds_of_proxy(learner)`](@ref), which
we overload appropriately below.

-## Extracting the target from training data
-
-The `fit` method consumes data which includes a [target variable](@ref proxy), i.e., the
-learner is a supervised learner. We must therefore declare how the target variable can be extracted
-from training data, by implementing [`LearnAPI.target`](@ref):
-
-```@example anatomy
-LearnAPI.target(learner, data) = last(data)
-```
-
-There is a similar method, [`LearnAPI.features`](@ref) for declaring how training features
-can be extracted (something that can be passed to `predict`) but this method has a
-fallback which suffices here: it returns `first(data)` if `data` is a tuple, and `data`
-otherwise.
-
-
-## Accessor functions
+### Accessor functions

An [accessor function](@ref accessor_functions) has the output of [`fit`](@ref) as its
sole argument.
Every new implementation must implement the accessor function @@ -204,14 +183,14 @@ dump the named version of the coefficients: ```@example anatomy LearnAPI.strip(model::RidgeFitted) = - RidgeFitted(model.learner, model.coefficients, nothing) + RidgeFitted(model.learner, model.coefficients, nothing) ``` Crucially, we can still use `LearnAPI.strip(model)` in place of `model` to make new predictions. -## Learner traits +### Learner traits Learner [traits](@ref traits) record extra generic information about a learner, or make specific promises of behavior. They are methods that have a learner as the sole @@ -229,44 +208,52 @@ A macro provides a shortcut, convenient when multiple traits are to be defined: ```@example anatomy @trait( - Ridge, - constructor = Ridge, - kinds_of_proxy=(Point(),), - tags = ("regression",), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.clone), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.predict), - :(LearnAPI.coefficients), + Ridge, + constructor = Ridge, + kinds_of_proxy=(Point(),), + tags = ("regression",), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.learner), + :(LearnAPI.clone), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.coefficients), ) ) nothing # hide ``` -The last trait, `functions`, returns a list of all LearnAPI.jl methods that can be -meaningfully applied to the learner or associated model. You always include the first five -you see here: `fit`, `learner`, `clone` ,`strip`, `obs`. Here [`clone`](@ref) is a utility -function provided by LearnAPI that you never overload; overloading [`obs`](@ref) is -optional (see [Providing a separate data front end](@ref)) but it is always included -because it has a fallback. See [`LearnAPI.functions`](@ref) for a checklist. 
+[`LearnAPI.functions`](@ref) (discussed further below) and [`LearnAPI.constructor`](@ref),
+are the only universally compulsory traits. However, it is worthwhile studying the [list
+of all traits](@ref traits_list) to see which might apply to a new implementation, to
+enable maximum buy-in to functionality provided by third party packages, and to assist
+third party algorithms that match machine learning algorithms to user-defined tasks.
+
+With [some exceptions](@ref trait_contract), the value of a trait should depend only on
+the *type* of the argument.
+
+### The `functions` trait
+
+The last trait, `functions`, above returns a list of all LearnAPI.jl methods that can be
+meaningfully applied to the learner or associated model, with the exception of traits. You
+always include the first six you see here: `fit`, `learner`, `clone`, `strip`, `obs`,
+`features`. Here [`clone`](@ref) is a utility function provided by LearnAPI that you never
+overload, while [`obs`](@ref) is discussed under [Providing a separate data front
+end](@ref) below and is always included because it has a meaningful fallback. The
+`features` method, here provided by a fallback, articulates how the features `X` can be
+extracted from the training data `(X, y)`. We must also include `target` here to flag our
+model as supervised; again the method itself is provided by a fallback valid in the
+present case.

-[`LearnAPI.functions`](@ref) and [`LearnAPI.constructor`](@ref), are the only universally
-compulsory traits. However, it is worthwhile studying the [list of all traits](@ref
-traits_list) to see which might apply to a new implementation, to enable maximum buy into
-functionality provided by third party packages, and to assist third party algorithms that
-match machine learning algorithms to user-defined tasks.

+See [`LearnAPI.functions`](@ref) for a checklist of what the `functions` trait needs to
+return.

-Note that we know `Ridge` instances are supervised learners because `:(LearnAPI.target)
-in LearnAPI.functions(learner)`, for every instance `learner`. With [some
-exceptions](@ref trait_contract), the value of a trait should depend only on the *type* of
-the argument.

-## Signatures added for convenience
+### Signatures added for convenience

We add one `fit` signature for user-convenience only. The LearnAPI.jl specification has
nothing to say about `fit` signatures with more than two positional arguments.

@@ -327,6 +314,30 @@ recovered_model = deserialize(filename)
@assert predict(recovered_model, X) == predict(model, X)
```

+## [Other data patterns](@id di)
+
+Here are some important remarks for implementations wanting to deviate in their
+assumptions about data from those made above.
+
+- New implementations of `fit`, `predict`, etc., always have a *single* `data` argument as
+  above. For convenience, a signature such as `fit(learner, X, y)`, calling `fit(learner,
+  (X, y))`, can be added, but the LearnAPI.jl specification is silent on the meaning or
+  existence of signatures with extra arguments.
+
+- If the `data` object consumed by `fit`, `predict`, or `transform` is not a suitable
+  table¹, array³, tuple of tables and arrays, or some other object implementing the
+  [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface,
+  then an implementation must: (i) overload [`obs`](@ref) to articulate how provided data
+  can be transformed into a form that does support this interface, as illustrated
+  under [Providing a separate data front end](@ref) below; or (ii) overload the trait
+  [`LearnAPI.data_interface`](@ref) to specify a more relaxed data API.
+
+- Where the form of data consumed by `fit` is different from that consumed by
+  `predict/transform` (as in classical supervised learning) it may be necessary to
+  explicitly overload the functions [`LearnAPI.features`](@ref) and (if supervised)
+  [`LearnAPI.target`](@ref).
The same holds if overloading [`obs`](@ref); see below. + + ## Providing a separate data front end ```@setup anatomy2 @@ -340,31 +351,31 @@ end Ridge(; lambda=0.1) = Ridge(lambda) struct RidgeFitted{T,F} - learner::Ridge - coefficients::Vector{T} - named_coefficients::F + learner::Ridge + coefficients::Vector{T} + named_coefficients::F end LearnAPI.learner(model::RidgeFitted) = model.learner LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients LearnAPI.strip(model::RidgeFitted) = - RidgeFitted(model.learner, model.coefficients, nothing) + RidgeFitted(model.learner, model.coefficients, nothing) @trait( - Ridge, - constructor = Ridge, - kinds_of_proxy=(Point(),), - tags = ("regression",), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.clone), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.predict), - :(LearnAPI.coefficients), + Ridge, + constructor = Ridge, + kinds_of_proxy=(Point(),), + tags = ("regression",), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.learner), + :(LearnAPI.clone), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.coefficients), ) ) @@ -379,13 +390,30 @@ y = 2a - b + 3c + 0.05*rand(n) An implementation may optionally implement [`obs`](@ref), to expose to the user (or some meta-algorithm like cross-validation) the representation of input data internal to `fit` or `predict`, such as the matrix version `A` of `X` in the ridge example. That is, we may -factor out of `fit` (and also `predict`) a data pre-processing step, `obs`, to expose +factor out of `fit` (and also `predict`) a data preprocessing step, `obs`, to expose its outcomes. These outcomes become alternative user inputs to `fit`/`predict`. 
-In typical case (where [`LearnAPI.data_interface`](@ref) not overloaded) the alternative data
-representations will implement the MLUtils.jl `getobs/numobs` interface for observation
-subsampling, which is generally all a user or meta-algorithm will need, before passing the
-data on to `fit`/`predict` as you would the original data.
+The [`obs`](@ref) method exists to:
+
+- Enable meta-algorithms to avoid redundant conversions of user-provided data into the form
+  ultimately used by the core training algorithms.
+
+- Through the provision of canned data front ends (as provided by
+  [LearnDataFrontEnds.jl](https://juliaai.github.io/LearnAPI.jl/dev/) for example) enable
+  users to provide data in a variety of formats while allowing new implementations to
+  focus on core algorithms that consume a standardized, preprocessed representation of
+  that data.
+
+!!! important
+
+    While many new learner implementations will want to adopt a canned data front end, we
+    focus here on a self-contained implementation of `obs` for the ridge example above, to show
+    how it works.
+
+In the typical case, where [`LearnAPI.data_interface`](@ref) is not overloaded, the
+alternative data representations will implement the MLUtils.jl `getobs/numobs` interface
+for observation subsampling, which is generally all a user or meta-algorithm will need,
+before passing the data on to `fit`/`predict` as you would the original data.
So, instead of the pattern @@ -395,10 +423,10 @@ predict(model, newdata) ``` one enables the following alternative (which in any case will still work, because of a -no-op `obs` fallback provided by LearnAPI.jl): +an `obs` fallback provided by LearnAPI.jl): ```julia -observations = obs(learner, data) # pre-processed training data +observations = obs(learner, data) # preprocessed training data # optional subsampling: observations = MLUtils.getobs(observations, train_indices) @@ -415,25 +443,25 @@ predict(model, newobservations) See also the demonstration [below](@ref advanced_demo). -Here we specifically wrap all the pre-processed data into single object, for which we +Here we specifically wrap all the preprocessed data into single object, for which we introduce a new type: ```@example anatomy2 struct RidgeFitObs{T,M<:AbstractMatrix{T}} - A::M # `p` x `n` matrix - names::Vector{Symbol} # features - y::Vector{T} # target + A::M # `p` x `n` matrix + names::Vector{Symbol} # features + y::Vector{T} # target end ``` -Now we overload `obs` to carry out the data pre-processing previously in `fit`, like this: +Now we overload `obs` to carry out the data preprocessing previously in `fit`, like this: ```@example anatomy2 function LearnAPI.obs(::Ridge, data) - X, y = data - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - return RidgeFitObs(Tables.matrix(table)', names, y) + X, y = data + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + return RidgeFitObs(Tables.matrix(table)', names, y) end ``` @@ -445,27 +473,27 @@ methods - one to handle "regular" input, and one to handle the pre-processed dat ```@example anatomy2 function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) - lambda = learner.lambda + lambda = learner.lambda - A = observations.A - names = observations.names - y = observations.y + A = observations.A + names = observations.names + y = observations.y - # apply core learner: - 
coefficients = (A*A' + learner.lambda*I)\(A*y) # 1 x p matrix + # apply core learner: + coefficients = (A*A' + learner.lambda*I)\(A*y) # 1 x p matrix - # determine named coefficients: - named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] - # make some noise, if allowed: - verbosity > 0 && @info "Coefficients: $named_coefficients" + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" - return RidgeFitted(learner, coefficients, named_coefficients) + return RidgeFitted(learner, coefficients, named_coefficients) end LearnAPI.fit(learner::Ridge, data; kwargs...) = - fit(learner, obs(learner, data); kwargs...) + fit(learner, obs(learner, data); kwargs...) ``` ### The `obs` contract @@ -489,7 +517,7 @@ this is [`LearnAPI.RandomAccess()`](@ref) (the default) it usually suffices to o ```@example anatomy2 Base.getindex(data::RidgeFitObs, I) = - RidgeFitObs(data.A[:,I], data.names, y[I]) + RidgeFitObs(data.A[:,I], data.names, y[I]) Base.length(data::RidgeFitObs) = length(data.y) ``` @@ -500,21 +528,25 @@ LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)' LearnAPI.obs(::RidgeFitted, observations::AbstractArray) = observations # involutivity LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) = - observations'*model.coefficients + observations'*model.coefficients LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = - predict(model, Point(), obs(model, Xnew)) + predict(model, Point(), obs(model, Xnew)) ``` -### `target` and `features` methods +### `features` and `target` methods -In the general case, we only need to implement [`LearnAPI.target`](@ref) and -[`LearnAPI.features`](@ref) to handle all possible output of `obs(learner, data)`, and now -the fallback for `LearnAPI.features` mentioned before is inadequate. 
+Two methods [`LearnAPI.features`](@ref) and [`LearnAPI.target`](@ref) articulate how +features and target can be extracted from `data` consumed by LearnAPI.jl +methods. Fallbacks provided by LearnAPI.jl sufficed in our basic implementation +above. Here we must explicitly overload them, so that they also handle the output of +`obs(learner, data)`: ```@example anatomy2 -LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A +LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y +LearnAPI.features(learner::Ridge, data) = LearnAPI.features(learner, obs(learner, data)) +LearnAPI.target(learner::Ridge, data) = LearnAPI.target(learner, obs(learner, data)) ``` ### Important notes: diff --git a/docs/src/features_target_weights.md b/docs/src/features_target_weights.md index efac2b85..e2878672 100644 --- a/docs/src/features_target_weights.md +++ b/docs/src/features_target_weights.md @@ -1,13 +1,12 @@ # [`features`, `target`, and `weights`](@id input) -Methods for extracting parts of training observations. Here "observations" means the -output of [`obs(learner, data)`](@ref); if `obs` is not overloaded for `learner`, then -"observations" is any `data` supported in calls of the form [`fit(learner, data)`](@ref) +Methods for extracting certain parts of `data` for all supported calls of the form +[`fit(learner, data)`](@ref). ```julia -LearnAPI.features(learner, observations) -> -LearnAPI.target(learner, observations) -> -LearnAPI.weights(learner, observations) -> +LearnAPI.features(learner, data) -> +LearnAPI.target(learner, data) -> +LearnAPI.weights(learner, data) -> ``` Here `data` is something supported in a call of the form `fit(learner, data)`. @@ -17,12 +16,11 @@ Here `data` is something supported in a call of the form `fit(learner, data)`. 
Not typically appearing in a general user's workflow but useful in meta-algorithms, such as
cross-validation (see the example in [`obs` and Data Interfaces](@ref data_interface)).

-Supposing `learner` is a supervised classifier predicting a one-dimensional vector
+Supposing `learner` is a supervised classifier predicting a vector
target:

```julia
-observations = obs(learner, data)
-model = fit(learner, observations)
+model = fit(learner, data)
X = LearnAPI.features(learner, data)
y = LearnAPI.target(learner, data)
ŷ = predict(model, Point(), X)
training_loss = sum(ŷ .!= y)
```

# Implementation guide

-| method                      | fallback          | compulsory?              |
-|:----------------------------|:-----------------:|--------------------------|
-| [`LearnAPI.features`](@ref) | see docstring     | if fallback insufficient |
-| [`LearnAPI.target`](@ref)   | returns `nothing` | no                       |
-| [`LearnAPI.weights`](@ref)  | returns `nothing` | no                       |
-
+| method                                     | fallback return value                         | compulsory?              |
+|:-------------------------------------------|:---------------------------------------------:|--------------------------|
+| [`LearnAPI.features(learner, data)`](@ref) | `first(data)` if `data` is tuple, else `data` | if fallback insufficient |
+| [`LearnAPI.target(learner, data)`](@ref)   | `last(data)`                                  | if fallback insufficient |
+| [`LearnAPI.weights(learner, data)`](@ref)  | `nothing`                                     | no                       |
+

# Reference

diff --git a/docs/src/obs.md b/docs/src/obs.md
index a583f27d..70b6eb46 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -12,6 +12,9 @@ obs(learner, data) # can be passed to `fit` instead of `data`
obs(model, data) # can be passed to `predict` or `transform` instead of `data`
```

+- [Data interfaces](@ref data_interfaces)
+
+
## [Typical workflows](@id obs_workflows)

LearnAPI.jl makes no universal assumptions about the form of `data` in a call
@@ -93,18 +96,11 @@ A sample implementation is given in [Providing a separate data front end](@ref).
obs ``` -### [Data interfaces](@id data_interfaces) - -New implementations must overload [`LearnAPI.data_interface(learner)`](@ref) if the -output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess()`](@ref). Arrays, most -tables, and all tuples thereof, implement `RandomAccess()`. - -- [`LearnAPI.RandomAccess`](@ref) (default) -- [`LearnAPI.FiniteIterable`](@ref) -- [`LearnAPI.Iterable`](@ref) +### [Available data interfaces](@id data_interfaces) ```@docs +LearnAPI.DataInterface LearnAPI.RandomAccess LearnAPI.FiniteIterable LearnAPI.Iterable diff --git a/src/features_target_weights.jl b/src/features_target_weights.jl index 3fe95eae..ca1811e2 100644 --- a/src/features_target_weights.jl +++ b/src/features_target_weights.jl @@ -1,14 +1,14 @@ """ - LearnAPI.target(learner, observations) -> target + LearnAPI.target(learner, data) -> target -Return, for every conceivable `observations` returned by a call of the form [`obs(learner, -data)`](@ref), the target variable part of `observations`. If `nothing` is returned, the -`learner` does not see a target variable in training (is unsupervised). +Return, for each form of `data` supported by the call [`fit(learner, data)`](@ref), the +target part of `data`, in a form suitable for pairing with predictions. The return value +is only meaningful if `learner` is supervised, i.e., if `:(LearnAPI.target) in +LearnAPI.functions(learner)`. -The returned object `y` has the same number of observations as `observations` does and is -guaranteed to implement the data interface specified by -[`LearnAPI.data_interface(learner)`](@ref). It's form should be suitable for pairing with -the output of [`predict`](@ref), for example in a loss function. +The returned object has the same number of observations +as `data` has and is guaranteed to implement the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). # Extended help @@ -22,38 +22,55 @@ the LearnAPI.jl documentation. 
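For example, with the `last(data)` fallback introduced in this patch, a learner that does
not overload `LearnAPI.target` has the last element of a `data` tuple treated as the
target. A sketch (the `Avocado` stand-in learner mirrors the one in this PR's tests; the
data is hypothetical):

```julia
using LearnAPI

struct Avocado end  # stand-in learner relying on LearnAPI.jl fallbacks

X = rand(3, 5)  # hypothetical feature matrix
y = rand(5)     # hypothetical target vector

# with no overloading, `LearnAPI.target` falls back to `last(data)`:
@assert LearnAPI.target(Avocado(), (X, y)) == y

# and `LearnAPI.features` falls back to `first(data)` for tuple `data`:
@assert LearnAPI.features(Avocado(), (X, y)) == X
```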
## New implementations -A fallback returns `nothing`. The method must be overloaded if [`fit`](@ref) consumes data -that includes a target variable. If `obs` is not being overloaded, then `observations` -above is any `data` supported in calls of the form [`fit(learner, data)`](@ref). The form -of the output `y` should be suitable for pairing with the output of [`predict`](@ref), in -the evaluation of a loss function, for example. +A fallback returns `last(data)`. The method must be overloaded if [`fit`](@ref) consumes +data that includes a target variable and this fallback fails to fulfill the contract stated +above. + +If `obs` is being overloaded, then typically it suffices to overload +`LearnAPI.target(learner, observations)` where `observations = obs(learner, data)` and +`data` is any documented supported `data` in calls of the form [`fit(learner, +data)`](@ref), and to add a declaration of the form + +```julia +LearnAPI.target(learner, data) = LearnAPI.target(learner, obs(learner, data)) +``` +to catch all other forms of supported input `data`. -Ensure the object `y` returned by `LearnAPI.target`, unless `nothing`, implements the data +Remember to ensure the return value of `LearnAPI.target` implements the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.target)"; overloaded=true)) """ -target(::Any, observations) = nothing +target(::Any, data) = last(data) """ - LearnAPI.weights(learner, observations) -> weights + LearnAPI.weights(learner, data) -> weights -Return, for every conceivable `observations` returned by a call of the form [`obs(learner, -data)`](@ref), the weights part of `observations`. Where `nothing` is returned, no weights -are part of `data`, which is to be interpreted as uniform weighting. +Return, for each form of `data` supported by the call [`fit(learner, data)`](@ref), the +per-observation weights part of `data`. 
-The returned object `w` has the same number of observations as `observations` does and is -guaranteed to implement the data interface specified by +The returned object has the same number of observations +as `data` has and is guaranteed to implement the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). +Where `nothing` is returned, weighting is understood to be uniform. + # Extended help # New implementations -Overloading is optional. A fallback returns `nothing`. If `obs` is not being overloaded, -then `observations` above is any `data` supported in calls of the form [`fit(learner, -data)`](@ref). +Overloading is optional. A fallback returns `nothing`. + +If `obs` is being overloaded, then typically it suffices to overload +`LearnAPI.weights(learner, observations)` where `observations = obs(learner, data)` and +`data` is any documented supported `data` in calls of the form [`fit(learner, +data)`](@ref), and to add a declaration of the form + +```julia +LearnAPI.weights(learner, data) = LearnAPI.weights(learner, obs(learner, data)) +``` +to catch all other forms of supported input `data`. Ensure the returned object, unless `nothing`, implements the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). @@ -61,53 +78,54 @@ Ensure the returned object, unless `nothing`, implements the data interface spec $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.weights)"; overloaded=true)) """ -weights(::Any, observations) = nothing +weights(::Any, data) = nothing """ - LearnAPI.features(learner, observations) + LearnAPI.features(learner, data) -Return, for every conceivable `observations` returned by a call of the form [`obs(learner, -data)`](@ref), the "features" part of `observations` (as opposed to the target variable, -for example). +Return, for each form of `data` supported by the call [`fit(learner, data)`](@ref), the +features part `X` of `data`. 
-It must always be possible to pass the returned object `X` to `predict` or `transform`, -where implemented, as in the following sample workflow: +While "features" will typically have the commonly understood meaning, the only +learner-generic guaranteed properties of `X` are: -```julia -observations = obs(learner, data) -model = fit(learner, observations) -X = LearnAPI.features(learner, observations) -ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` -``` +- `X` can be passed to [`predict`](@ref) or [`transform`](@ref) when these are supported + by `learner`, as in the call `predict(model, X)`, where `model = fit(learner, data)`. -For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(learner)`) -`ŷ` above is generally intended to be an approximate proxy for the target variable. +- `X` has the same number of observations as `data` has and is guaranteed to implement + the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). -The object `X` returned by `LearnAPI.features` has the same number of observations as -`observations` does and is guaranteed to implement the data interface specified by -[`LearnAPI.data_interface(learner)`](@ref). +Where `nothing` is returned, `predict` and `transform` consume no data. # Extended help # New implementations -A fallback returns `first(observations)` if `observations` is a tuple, and otherwise -returns `observations`. New implementations may need to overload this method if this -fallback is inadequate. +A fallback returns `first(data)` if `data` is a tuple, and otherwise +returns `data`. New implementations will need to overload this method if this +fallback is inadequate. For density estimators, whose `fit` typically consumes *only* a target variable, you -should overload this method to always return `nothing`. If `obs` is not being overloaded, -then `observations` above is any `data` supported in calls of the form [`fit(learner, -data)`](@ref). 
+should overload this method to always return `nothing`. -It must otherwise be possible to pass the return value `X` to `predict` and/or -`transform`, and `X` must have same number of observations as `data`. +If `obs` is being overloaded, then typically it suffices to overload +`LearnAPI.features(learner, observations)` where `observations = obs(learner, data)` and +`data` is any documented supported `data` in calls of the form [`fit(learner, +data)`](@ref), and to add a declaration of the form + +```julia +LearnAPI.features(learner, data) = LearnAPI.features(learner, obs(learner, data)) +``` +to catch all other forms of supported input `data`. Ensure the returned object, unless `nothing`, implements the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). +`:(LearnAPI.features)` must always be included in the return value of +[`LearnAPI.functions(learner)`](@ref). + """ -features(learner, observations) = _first(observations) -_first(observations) = observations -_first(observations::Tuple) = first(observations) +features(learner, data) = _first(data) +_first(data) = data +_first(data::Tuple) = first(data) # note the factoring above guards against method ambiguities diff --git a/src/traits.jl b/src/traits.jl index 56e0d2b8..12ccd80f 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -91,7 +91,7 @@ return value: | `:(LearnAPI.clone)` | never overloaded | yes | | `:(LearnAPI.strip)` | no | yes | | `:(LearnAPI.obs)` | no | yes | -| `:(LearnAPI.features)` | no | yes, unless `fit` consumes no data | +| `:(LearnAPI.features)` | no | yes | | `:(LearnAPI.target)` | no | only if implemented | | `:(LearnAPI.weights)` | no | only if implemented | | `:(LearnAPI.update)` | no | only if implemented | @@ -364,8 +364,7 @@ in representations of input data returned by [`obs(learner, data)`](@ref) or [`obs(model, data)`](@ref), whenever `learner == LearnAPI.learner(model)`. Here `data` is `fit`, `predict`, or `transform`-consumable data. 
-Possible return values are [`LearnAPI.RandomAccess`](@ref), -[`LearnAPI.FiniteIterable`](@ref), and [`LearnAPI.Iterable`](@ref). +See [`LearnAPI.DataInterface`](@ref) for possible return values. See also [`obs`](@ref). diff --git a/src/types.jl b/src/types.jl index 269fff2c..3212c3f2 100644 --- a/src/types.jl +++ b/src/types.jl @@ -7,7 +7,7 @@ abstract type KindOfProxy end LearnAPI.IID <: LearnAPI.KindOfProxy Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). If `kind_of_proxy` is an instance of -`LearnAPI.IID` then, given `data` constisting of ``n`` observations, the +`LearnAPI.IID` then, given `data` consisting of ``n`` observations, the following must hold: - `ŷ = LearnAPI.predict(model, kind_of_proxy, data)` is @@ -41,7 +41,7 @@ See also [`LearnAPI.KindOfProxy`](@ref). | `ProbabilisticFuzzy` | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one) | | `SurvivalFunction` | survival function | | `SurvivalDistribution` | probability distribution for survival time | -| `SurvivalHazardFunction` | hazard function for survival time | +| `HazardFunction` | hazard function for survival time | | `OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | ¹Provided for completeness but discouraged to avoid [ambiguities in @@ -186,6 +186,25 @@ KindOfProxy # # DATA INTERFACES +""" + + LearnAPI.DataInterface + +Abstract supertype for singleton types designating an interface for accessing observations +within a LearnAPI.jl data object. + +New learner implementations must overload [`LearnAPI.data_interface(learner)`](@ref) to +return one of the instances below if the output of [`obs`](@ref) does not implement the +default [`LearnAPI.RandomAccess()`](@ref) interface. Arrays, most tables, and all tuples +thereof, implement `RandomAccess()`. 
+ +Available instances: + +- [`LearnAPI.RandomAccess()`](@ref) (default) +- [`LearnAPI.FiniteIterable()`](@ref) +- [`LearnAPI.Iterable()`](@ref) + +""" abstract type DataInterface end abstract type Finite <: DataInterface end From 5836481ef9235dff51182b40487f26b8b971a7ff Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 6 Feb 2025 11:46:42 +1300 Subject: [PATCH 12/14] tweak to features/target contract and fallbacks --- docs/src/anatomy_of_an_implementation.md | 207 ++++++++++++----------- src/features_target_weights.jl | 11 +- src/traits.jl | 34 ++-- test/features_target_weights.jl | 2 +- 4 files changed, 133 insertions(+), 121 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index ec59d06d..d011889d 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -7,19 +7,20 @@ model = fit(learner, data) predict(model, newdata) ``` -Here `learner` specifies hyperparameters, while `model` stores learned parameters and any -byproducts of algorithm execution. +Here `learner` specifies [hyperparameters](@ref hyperparameters), while `model` stores +learned parameters and any byproducts of algorithm execution. Variations on this pattern: - [Transformers](@ref) ordinarily implement `transform` instead of `predict`. For more on `predict` versus `transform`, see [Predict or transform?](@ref) -- ["Static" (non-generalizing) algorithms](@ref static_algorithms) (some simple - transformers and some clustering algorithms) have a `fit` that consumes no +- ["Static" (non-generalizing) algorithms](@ref static_algorithms), which includes some + simple transformers and some clustering algorithms, have a `fit` that consumes no `data`. Instead `predict` or `transform` does the heavy lifting. -- In [density estimation](@ref density_estimation), `predict` consumes no data. +- In [density estimation](@ref density_estimation), the `newdata` argument in `predict` is + missing. 
These are the basic possibilities. @@ -31,14 +32,16 @@ implementation given later. ## A basic implementation +See [here](@ref code) for code without explanations. + We suppose our algorithm's `fit` method consumes data in the form `(X, y)`, where `X` is a suitable table¹ (the features) and `y` a vector (the target). !!! important - Implementations wishing to support other data - patterns may need to take additional steps explained under - [Other data patterns](@ref di) below. + Implementations wishing to support other data + patterns may need to take additional steps explained under + [Other data patterns](@ref di) below. The first line below imports the lightweight package LearnAPI.jl whose methods we will be extending. The second imports libraries needed for the core algorithm. @@ -56,7 +59,7 @@ Here's a new type whose instances specify the single ridge regression hyperparam ```@example anatomy struct Ridge{T<:Real} - lambda::T + lambda::T end nothing # hide ``` @@ -70,7 +73,7 @@ fields) that are not other learners, and we must implement ```@example anatomy """ - Ridge(; lambda=0.1) + Ridge(; lambda=0.1) Instantiate a ridge regression learner, with regularization of `lambda`. 
""" @@ -94,9 +97,9 @@ coefficients labelled by feature name for inspection after training: ```@example anatomy struct RidgeFitted{T,F} - learner::Ridge - coefficients::Vector{T} - named_coefficients::F + learner::Ridge + coefficients::Vector{T} + named_coefficients::F end nothing # hide ``` @@ -108,26 +111,25 @@ The implementation of `fit` looks like this: ```@example anatomy function LearnAPI.fit(learner::Ridge, data; verbosity=1) + X, y = data - X, y = data - - # data preprocessing: - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - A = Tables.matrix(table, transpose=true) + # data preprocessing: + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + A = Tables.matrix(table, transpose=true) - lambda = learner.lambda + lambda = learner.lambda - # apply core algorithm: - coefficients = (A*A' + learner.lambda*I)\(A*y) # vector + # apply core algorithm: + coefficients = (A*A' + learner.lambda*I)\(A*y) # vector - # determine named coefficients: - named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] - # make some noise, if allowed: - verbosity > 0 && @info "Coefficients: $named_coefficients" + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" - return RidgeFitted(learner, coefficients, named_coefficients) + return RidgeFitted(learner, coefficients, named_coefficients) end ``` @@ -149,7 +151,7 @@ We provide this implementation for our ridge regressor: ```@example anatomy LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = - Tables.matrix(Xnew)*model.coefficients + Tables.matrix(Xnew)*model.coefficients ``` If the kind of proxy is omitted, as in `predict(model, Xnew)`, then a fallback grabs the @@ -183,7 +185,7 @@ dump the named version of the coefficients: ```@example anatomy LearnAPI.strip(model::RidgeFitted) = - 
RidgeFitted(model.learner, model.coefficients, nothing) + RidgeFitted(model.learner, model.coefficients, nothing) ``` Crucially, we can still use `LearnAPI.strip(model)` in place of `model` to make new @@ -208,20 +210,20 @@ A macro provides a shortcut, convenient when multiple traits are to be defined: ```@example anatomy @trait( - Ridge, - constructor = Ridge, - kinds_of_proxy=(Point(),), - tags = ("regression",), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.clone), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.predict), - :(LearnAPI.coefficients), + Ridge, + constructor = Ridge, + kinds_of_proxy=(Point(),), + tags = ("regression",), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.learner), + :(LearnAPI.clone), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.coefficients), ) ) nothing # hide @@ -240,8 +242,8 @@ the *type* of the argument. The last trait, `functions`, above returns a list of all LearnAPI.jl methods that can be meaningfully applied to the learner or associated model, with the exception of traits. You -always include the first six you see here: `fit`, `learner`, `clone` ,`strip`, `obs`, -`features`. Here [`clone`](@ref) is a utility function provided by LearnAPI that you never +always include the first five you see here: `fit`, `learner`, `clone` ,`strip`, +`obs`. Here [`clone`](@ref) is a utility function provided by LearnAPI that you never overload, while [`obs`](@ref) is discussed under [Providing a separate data front end](@ref) below and is always included because it has a meaningful fallback. The `features` method, here provided by a fallback, articulates how the features `X` can be @@ -252,7 +254,6 @@ present case. See [`LearnAPI.functions`](@ref) for a checklist of what the `functions` trait needs to return. 
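As a quick sanity check, the declared traits can be queried on an instance — a sketch
only, assuming the `Ridge` implementation above:

```julia
using LearnAPI

learner = Ridge(lambda=0.5)

# traits are declared for the *type* but queried on instances:
@assert LearnAPI.constructor(learner) == Ridge
@assert LearnAPI.kinds_of_proxy(learner) == (Point(),)
@assert :(LearnAPI.target) in LearnAPI.functions(learner)
```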
- ### Signatures added for convenience We add one `fit` signature for user-convenience only. The LearnAPI.jl specification has @@ -314,6 +315,13 @@ recovered_model = deserialize(filename) @assert predict(recovered_model, X) == predict(model, X) ``` +### Testing an implementation + +```julia +using LearnTestAPI +@testapi learner (X, y) verbosity=0 +``` + ## [Other data patterns](@id di) Here are some important remarks for implementations wanting to deviate in their @@ -340,6 +348,8 @@ assumptions about data from those made above. ## Providing a separate data front end +See [here](@ref code) for code without explanations. + ```@setup anatomy2 using LearnAPI using LinearAlgebra, Tables @@ -351,31 +361,31 @@ end Ridge(; lambda=0.1) = Ridge(lambda) struct RidgeFitted{T,F} - learner::Ridge - coefficients::Vector{T} - named_coefficients::F + learner::Ridge + coefficients::Vector{T} + named_coefficients::F end LearnAPI.learner(model::RidgeFitted) = model.learner LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients LearnAPI.strip(model::RidgeFitted) = - RidgeFitted(model.learner, model.coefficients, nothing) + RidgeFitted(model.learner, model.coefficients, nothing) @trait( - Ridge, - constructor = Ridge, - kinds_of_proxy=(Point(),), - tags = ("regression",), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.clone), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.predict), - :(LearnAPI.coefficients), + Ridge, + constructor = Ridge, + kinds_of_proxy=(Point(),), + tags = ("regression",), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.learner), + :(LearnAPI.clone), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.coefficients), ) ) @@ -393,27 +403,25 @@ or `predict`, such as the matrix version `A` of `X` in the ridge example. 
That is, we factor out of `fit` (and also `predict`) a data preprocessing step, `obs`, to
expose its outcomes. These outcomes become alternative user inputs to `fit`/`predict`.

The [`obs`](@ref) methods exist to:

- Enable meta-algorithms to avoid redundant conversions of user-provided data into the
  form ultimately used by the core training algorithms.

- Through the provision of canned data front ends, enable users to provide data in a
  variety of formats, while allowing new implementations to focus on core algorithms that
  consume a standardized, preprocessed, representation of that data.

!!! important

    While many new learner implementations will want to adopt a canned data front end, such as those provided by [LearnDataFrontEnds.jl](https://juliaai.github.io/LearnAPI.jl/dev/), we
    focus here on a self-contained implemementation of `obs` for the ridge example above, to show
    how it works.

In the typical case, where [`LearnAPI.data_interface`](@ref) is not overloaded, the
alternative data representations must implement the MLUtils.jl `getobs/numobs` interface
for observation subsampling, which is generally all a user or meta-algorithm will need,
before passing the data on to `fit`/`predict`, as you would the original data.
So, instead of the pattern @@ -422,8 +430,7 @@ model = fit(learner, data) predict(model, newdata) ``` -one enables the following alternative (which in any case will still work, because of a -an `obs` fallback provided by LearnAPI.jl): +one enables the following alternative: ```julia observations = obs(learner, data) # preprocessed training data @@ -441,16 +448,20 @@ newobservations = MLUtils.getobs(observations, test_indices) predict(model, newobservations) ``` -See also the demonstration [below](@ref advanced_demo). +which works for any non-static learner implementing `predict`, no matter how one is +supposed to accesses the individual observations of `data` or `newdata`. See also the +demonstration [below](@ref advanced_demo). Furthermore, fallbacks ensure the above pattern +still works if we choose not to implement a front end at all, which is allowed, if +supported `data` and `newdata` already implement `getobs`/`numobs`. Here we specifically wrap all the preprocessed data into single object, for which we introduce a new type: ```@example anatomy2 struct RidgeFitObs{T,M<:AbstractMatrix{T}} - A::M # `p` x `n` matrix - names::Vector{Symbol} # features - y::Vector{T} # target + A::M # `p` x `n` matrix + names::Vector{Symbol} # features + y::Vector{T} # target end ``` @@ -458,10 +469,10 @@ Now we overload `obs` to carry out the data preprocessing previously in `fit`, l ```@example anatomy2 function LearnAPI.obs(::Ridge, data) - X, y = data - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - return RidgeFitObs(Tables.matrix(table)', names, y) + X, y = data + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + return RidgeFitObs(Tables.matrix(table)', names, y) end ``` @@ -473,27 +484,27 @@ methods - one to handle "regular" input, and one to handle the pre-processed dat ```@example anatomy2 function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) - lambda = learner.lambda + lambda = 
learner.lambda - A = observations.A - names = observations.names - y = observations.y + A = observations.A + names = observations.names + y = observations.y - # apply core learner: - coefficients = (A*A' + learner.lambda*I)\(A*y) # 1 x p matrix + # apply core learner: + coefficients = (A*A' + learner.lambda*I)\(A*y) # 1 x p matrix - # determine named coefficients: - named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] - # make some noise, if allowed: - verbosity > 0 && @info "Coefficients: $named_coefficients" + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" - return RidgeFitted(learner, coefficients, named_coefficients) + return RidgeFitted(learner, coefficients, named_coefficients) end LearnAPI.fit(learner::Ridge, data; kwargs...) = - fit(learner, obs(learner, data); kwargs...) + fit(learner, obs(learner, data); kwargs...) 
``` ### The `obs` contract @@ -517,7 +528,7 @@ this is [`LearnAPI.RandomAccess()`](@ref) (the default) it usually suffices to o ```@example anatomy2 Base.getindex(data::RidgeFitObs, I) = - RidgeFitObs(data.A[:,I], data.names, y[I]) + RidgeFitObs(data.A[:,I], data.names, y[I]) Base.length(data::RidgeFitObs) = length(data.y) ``` @@ -528,10 +539,10 @@ LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)' LearnAPI.obs(::RidgeFitted, observations::AbstractArray) = observations # involutivity LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) = - observations'*model.coefficients + observations'*model.coefficients LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = - predict(model, Point(), obs(model, Xnew)) + predict(model, Point(), obs(model, Xnew)) ``` ### `features` and `target` methods diff --git a/src/features_target_weights.jl b/src/features_target_weights.jl index ca1811e2..578772fa 100644 --- a/src/features_target_weights.jl +++ b/src/features_target_weights.jl @@ -101,9 +101,9 @@ Where `nothing` is returned, `predict` and `transform` consume no data. # New implementations -A fallback returns `first(data)` if `data` is a tuple, and otherwise -returns `data`. New implementations will need to overload this method if this -fallback is inadequate. +A fallback returns `first(data)` if `data` is a tuple, and otherwise returns `data`. The +method has no meaning for static learners (where `data` is not an argument of `fit`) and +otherwise an implementation needs to overload this method if the fallback is inadequate. For density estimators, whose `fit` typically consumes *only* a target variable, you should overload this method to always return `nothing`. @@ -121,8 +121,9 @@ to catch all other forms of supported input `data`. Ensure the returned object, unless `nothing`, implements the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). 
-`:(LearnAPI.features)` must always be included in the return value of -[`LearnAPI.functions(learner)`](@ref). +`:(LearnAPI.features)` must be included in the return value of +[`LearnAPI.functions(learner)`](@ref), unless the learner is static (`fit` consumes no +data). """ features(learner, data) = _first(data) diff --git a/src/traits.jl b/src/traits.jl index 12ccd80f..b8d6249c 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -84,23 +84,23 @@ LearnAPI.jl. All new implementations must implement this trait. Here's a checklist for elements in the return value: -| expression | implementation compulsory? | include in returned tuple? | -|:----------------------------------|:---------------------------|:-----------------------------------| -| `:(LearnAPI.fit)` | yes | yes | -| `:(LearnAPI.learner)` | yes | yes | -| `:(LearnAPI.clone)` | never overloaded | yes | -| `:(LearnAPI.strip)` | no | yes | -| `:(LearnAPI.obs)` | no | yes | -| `:(LearnAPI.features)` | no | yes | -| `:(LearnAPI.target)` | no | only if implemented | -| `:(LearnAPI.weights)` | no | only if implemented | -| `:(LearnAPI.update)` | no | only if implemented | -| `:(LearnAPI.update_observations)` | no | only if implemented | -| `:(LearnAPI.update_features)` | no | only if implemented | -| `:(LearnAPI.predict)` | no | only if implemented | -| `:(LearnAPI.transform)` | no | only if implemented | -| `:(LearnAPI.inverse_transform)` | no | only if implemented | -| < accessor functions> | no | only if implemented | +| expression | implementation compulsory? | include in returned tuple? 
| +|:----------------------------------|:---------------------------|:---------------------------------| +| `:(LearnAPI.fit)` | yes | yes | +| `:(LearnAPI.learner)` | yes | yes | +| `:(LearnAPI.clone)` | never overloaded | yes | +| `:(LearnAPI.strip)` | no | yes | +| `:(LearnAPI.obs)` | no | yes | +| `:(LearnAPI.features)` | no | yes, unless `learner` is static | +| `:(LearnAPI.target)` | no | only if implemented | +| `:(LearnAPI.weights)` | no | only if implemented | +| `:(LearnAPI.update)` | no | only if implemented | +| `:(LearnAPI.update_observations)` | no | only if implemented | +| `:(LearnAPI.update_features)` | no | only if implemented | +| `:(LearnAPI.predict)` | no | only if implemented | +| `:(LearnAPI.transform)` | no | only if implemented | +| `:(LearnAPI.inverse_transform)` | no | only if implemented | +| < accessor functions> | no | only if implemented | Also include any implemented accessor functions, both those owned by LearnaAPI.jl, and any learner-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST diff --git a/test/features_target_weights.jl b/test/features_target_weights.jl index b84ded25..4809f5df 100644 --- a/test/features_target_weights.jl +++ b/test/features_target_weights.jl @@ -3,7 +3,7 @@ using LearnAPI struct Avocado end -@test isnothing(LearnAPI.target(Avocado(), "salsa")) +@test LearnAPI.target(Avocado(), (1, 2, 3)) == 3 @test isnothing(LearnAPI.weights(Avocado(), "salsa")) @test LearnAPI.features(Avocado(), "salsa") == "salsa" @test LearnAPI.features(Avocado(), (:X, :y)) == :X From fb63aaa4ac83b9411edf75843fe99015d774a7b4 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Thu, 6 Feb 2025 21:19:58 +1300 Subject: [PATCH 13/14] add missing examples.md file; plus some tweaks --- docs/src/anatomy_of_an_implementation.md | 8 +- docs/src/examples.md | 192 +++++++++++++++++++++++ docs/src/index.md | 2 +- docs/src/reference.md | 22 ++- src/traits.jl | 14 +- 5 files changed, 224 insertions(+), 14 deletions(-) create mode 100644 docs/src/examples.md diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index d011889d..e6dba45a 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -324,12 +324,12 @@ using LearnTestAPI ## [Other data patterns](@id di) -Here are some important remarks for implementations wanting to deviate in their +Here are some important remarks for implementations deviating in their assumptions about data from those made above. - New implementations of `fit`, `predict`, etc, always have a *single* `data` argument as - above. For convenience, a signature such as `fit(learner, X, y)`, calling `fit(learner, - (X, y))`, can be added, but the LearnAPI.jl specification is silent on the meaning or + above. For convenience, a signature such as `fit(learner, table, formula)`, calling `fit(learner, + (table, formula))`, can be added, but the LearnAPI.jl specification is silent on the meaning or existence of signatures with extra arguments. - If the `data` object consumed by `fit`, `predict`, or `transform` is not not a suitable @@ -415,7 +415,7 @@ The [`obs`](@ref) methods exist to: !!! important While many new learner implementations will want to adopt a canned data front end, such as those provided by [LearnDataFrontEnds.jl](https://juliaai.github.io/LearnAPI.jl/dev/), we - focus here on a self-contained implemementation of `obs` for the ridge example above, to show + focus here on a self-contained implementation of `obs` for the ridge example above, to show how it works. 
In the typical case, where [`LearnAPI.data_interface`](@ref) is not overloaded, the diff --git a/docs/src/examples.md b/docs/src/examples.md new file mode 100644 index 00000000..dea9bc56 --- /dev/null +++ b/docs/src/examples.md @@ -0,0 +1,192 @@ +# [Code for ridge example](@id code) + +Below is the complete source code for the ridge implementations described in the tutorial, +[Anatomy of an Implementation](@ref). + +- [Basic implementation](@ref) +- [Implementation with data front end](@ref) + + +## Basic implementation + +```julia +using LearnAPI +using LinearAlgebra, Tables + +struct Ridge{T<:Real} + lambda::T +end + +""" + Ridge(; lambda=0.1) + +Instantiate a ridge regression learner, with regularization of `lambda`. +""" +Ridge(; lambda=0.1) = Ridge(lambda) +LearnAPI.constructor(::Ridge) = Ridge + +# struct for output of `fit` +struct RidgeFitted{T,F} + learner::Ridge + coefficients::Vector{T} + named_coefficients::F +end + +function LearnAPI.fit(learner::Ridge, data; verbosity=1) + X, y = data + + # data preprocessing: + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + A = Tables.matrix(table, transpose=true) + + lambda = learner.lambda + + # apply core algorithm: + coefficients = (A*A' + learner.lambda*I)\(A*y) # vector + + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" + + return RidgeFitted(learner, coefficients, named_coefficients) +end + +LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = + Tables.matrix(Xnew)*model.coefficients + +# accessor functions: +LearnAPI.learner(model::RidgeFitted) = model.learner +LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients +LearnAPI.strip(model::RidgeFitted) = + RidgeFitted(model.learner, model.coefficients, nothing) + +@trait( + Ridge, + constructor = Ridge, + kinds_of_proxy=(Point(),), + tags = ("regression",), 
+ functions = ( + :(LearnAPI.fit), + :(LearnAPI.learner), + :(LearnAPI.clone), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.coefficients), + ) +) + +# convenience method: +LearnAPI.fit(learner::Ridge, X, y; kwargs...) = fit(learner, (X, y); kwargs...) +``` + +## Implementation with data front end + +```julia +using LearnAPI +using LinearAlgebra, Tables + +struct Ridge{T<:Real} + lambda::T +end + +Ridge(; lambda=0.1) = Ridge(lambda) + +# struct for output of `fit`: +struct RidgeFitted{T,F} + learner::Ridge + coefficients::Vector{T} + named_coefficients::F +end + +# struct for internal representation of training data: +struct RidgeFitObs{T,M<:AbstractMatrix{T}} + A::M # `p` x `n` matrix + names::Vector{Symbol} # features + y::Vector{T} # target +end + +# implementation of `RandomAccess()` data interface for such representation: +Base.getindex(data::RidgeFitObs, I) = + RidgeFitObs(data.A[:,I], data.names, data.y[I]) +Base.length(data::RidgeFitObs) = length(data.y) + +# data front end for `fit`: +function LearnAPI.obs(::Ridge, data) + X, y = data + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + return RidgeFitObs(Tables.matrix(table)', names, y) +end +LearnAPI.obs(::Ridge, observations::RidgeFitObs) = observations + +function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) + + lambda = learner.lambda + + A = observations.A + names = observations.names + y = observations.y + + # apply core learner: + coefficients = (A*A' + learner.lambda*I)\(A*y) # vector + + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" + + return RidgeFitted(learner, coefficients, named_coefficients) + +end + +LearnAPI.fit(learner::Ridge, data; kwargs...) = + fit(learner, obs(learner, data); kwargs...)
+ +# data front end for `predict`: +LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)' +LearnAPI.obs(::RidgeFitted, observations::AbstractArray) = observations # involutivity + +LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) = + observations'*model.coefficients + +LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = + predict(model, Point(), obs(model, Xnew)) + +# methods to deconstruct training data: +LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A +LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y +LearnAPI.features(learner::Ridge, data) = LearnAPI.features(learner, obs(learner, data)) +LearnAPI.target(learner::Ridge, data) = LearnAPI.target(learner, obs(learner, data)) + +# accessor functions: +LearnAPI.learner(model::RidgeFitted) = model.learner +LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients +LearnAPI.strip(model::RidgeFitted) = + RidgeFitted(model.learner, model.coefficients, nothing) + +@trait( + Ridge, + constructor = Ridge, + kinds_of_proxy=(Point(),), + tags = ("regression",), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.learner), + :(LearnAPI.clone), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.coefficients), + ) +) + +``` diff --git a/docs/src/index.md b/docs/src/index.md index 55c18898..0d10db0f 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -47,7 +47,7 @@ Suppose `forest` is some object encapsulating the hyperparameters of the [random algorithm](https://en.wikipedia.org/wiki/Random_forest) (the number of trees, etc.). Then, a LearnAPI.jl interface can be implemented, for objects with the type of `forest`, to enable the basic workflow below. In this case data is presented following the -"scikit-learn" `X, y` pattern, although LearnAPI.jl supports other patterns as well. +"scikit-learn" `X, y` pattern, although LearnAPI.jl supports other data patterns.
```julia # `X` is some training features diff --git a/docs/src/reference.md b/docs/src/reference.md index 3f7a55dd..18fb92df 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -38,9 +38,9 @@ number of user-specified *hyperparameters*, such as the number of trees in a ran forest. Hyperparameters are understood in a rather broad sense. For example, one is allowed to have hyperparameters that are not data-generic. For example, a class weight dictionary, which will only make sense for a target taking values in the set of specified -dictionary keys, should be given as a hyperparameter. For simplicity, LearnAPI.jl -discourages "run time" parameters (extra arguments to `fit`) such as acceleration -options (cpu/gpu/multithreading/multiprocessing). These should be included as +dictionary keys, should be given as a hyperparameter. For simplicity and composability, +LearnAPI.jl discourages "run time" parameters (extra arguments to `fit`) such as +acceleration options (cpu/gpu/multithreading/multiprocessing). These should be included as hyperparameters as far as possible. An exception is the compulsory `verbosity` keyword argument of `fit`. @@ -102,7 +102,7 @@ generally requires overloading `Base.==` for the struct. !!! important No LearnAPI.jl method is permitted to mutate a learner. In particular, one should make - deep copies of RNG hyperparameters before using them in a new implementation of + deep copies of RNG hyperparameters before using them in an implementation of [`fit`](@ref). #### Composite learners (wrappers) @@ -114,9 +114,6 @@ properties that are not in [`LearnAPI.learners(learner)`](@ref). Instead, these learner-valued properties can have a `nothing` default, with the constructor throwing an error if the constructor call does not explicitly specify a new value. -Any object `learner` for which [`LearnAPI.functions(learner)`](@ref) is non-empty is -understood to have a valid implementation of the LearnAPI.jl interface. 
- #### Example Below is an example of a learner type with a valid constructor: @@ -139,6 +136,14 @@ GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor ``` +#### Testing something is a learner + +Any object `object` for which [`LearnAPI.functions(object)`](@ref) is non-empty is +understood to have a valid implementation of the LearnAPI.jl interface. You can test this +with the convenience method [`LearnAPI.is_learner(object)`](@ref), but this method is never explicitly +overloaded. + + ## Documentation Attach public LearnAPI.jl-related documentation for a learner to its *constructor*, @@ -200,11 +205,14 @@ Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). ## Utilities + +- [`LearnAPI.is_learner`](@ref) - [`clone`](@ref): for cloning a learner with specified hyperparameter replacements. - [`@trait`](@ref): for simultaneously declaring multiple traits - [`@functions`](@ref): for listing functions available for use with a learner ```@docs +LearnAPI.is_learner clone @trait @functions diff --git a/src/traits.jl b/src/traits.jl index b8d6249c..0a99aaff 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -74,8 +74,8 @@ reference functions not owned by LearnAPI.jl. The understanding is that `learner` is a LearnAPI-compliant object whenever the return value is non-empty. -Do `LearnAPI.functions()` to list all possible elements of the return value owned by -LearnAPI.jl. +Do `LearnAPI.functions()` to list all possible elements of the return value representing +functions owned by LearnAPI.jl. # Extended help @@ -513,6 +513,16 @@ This trait should not be overloaded. Instead overload [`LearnAPI.nonlearners`](@ """ learners(learner) = setdiff(propertynames(learner), nonlearners(learner)) + +""" + LearnAPI.is_learner(object) + +Returns `true` if `object` has a valid implementation of the LearnAPI.jl +interface.
Equivalent to non-emptiness of [`LearnAPI.functions(object)`](@ref). + +This trait should never be overloaded explicitly. + +""" is_learner(learner) = !isempty(functions(learner)) preferred_kind_of_proxy(learner) = first(kinds_of_proxy(learner)) target(learner) = :(LearnAPI.target) in functions(learner) From 017e61efad77b1fc26ee424475b7f0f9a0efd187 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 6 Feb 2025 21:40:41 +1300 Subject: [PATCH 14/14] bump 0.2.0 --- Project.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Project.toml b/Project.toml index 2172f4e7..d791a46e 100644 --- a/Project.toml +++ b/Project.toml @@ -1,7 +1,7 @@ name = "LearnAPI" uuid = "92ad9a40-7767-427a-9ee6-6e577f1266cb" authors = ["Anthony D. Blaom "] -version = "0.1.0" +version = "0.2.0" [compat] julia = "1.10"
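As a postscript to the examples.md file added in PATCH 13/14: the closed-form solve used in both ridge implementations, `(A*A' + lambda*I)\(A*y)` with `A` the `p` × `n` features-by-observations matrix, can be sanity-checked outside Julia. Below is a pure-Python sketch specialized to `p = 2` features (solved by Cramer's rule); the function name and data are made up for illustration and are not part of LearnAPI.jl.

```python
def ridge(A, y, lam):
    """Solve (A A' + lam I) c = A y for p = 2 features, via Cramer's rule."""
    p, n = len(A), len(A[0])
    assert p == 2, "this sketch handles exactly two features"
    # normal-equation matrix M = A A' + lam I and right-hand side b = A y:
    M = [[sum(A[i][k] * A[j][k] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(A[i][k] * y[k] for k in range(n)) for i in range(p)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

# exact-fit toy data: y = 2*x1 - 1*x2, with observations as columns of A
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [2.0, -1.0, 1.0]
```

With `lam = 0` this recovers the ordinary least-squares coefficients `[2, -1]` for the toy data; any `lam > 0` shrinks the coefficient vector toward zero, which is the regularization controlled by the `lambda` hyperparameter of `Ridge`.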