From bd5de3b0e9232b081605ab45e70012905450e088 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 11 Oct 2023 16:49:46 +1300 Subject: [PATCH 001/187] rm no longer applicable comment in docs --- docs/src/operations.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/operations.md b/docs/src/operations.md index 1fc0f103..5d21179f 100644 --- a/docs/src/operations.md +++ b/docs/src/operations.md @@ -77,7 +77,7 @@ LearnAPI.IID | `LearnAPI.LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering | | `LearnAPI.LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above | | `LearnAPI.LabelAmbiguousDistribution`| pdf/pmf version of `LabelAmbiguous`; see `Distribution` above | -| `LearnAPI.ConfidenceInterval` | confidence interval (possible requirement: observation `isa Tuple{Real,Real}`) | +| `LearnAPI.ConfidenceInterval` | confidence interval | | `LearnAPI.Set` | finite but possibly varying number of target observations | | `LearnAPI.ProbabilisticSet` | as for `Set` but labeled with probabilities (not necessarily summing to one) | | `LearnAPI.SurvivalFunction` | survival function (possible requirement: observation is single-argument function mapping `Real` to `Real`) | From 87495d55a12173da95c220419c6a6f1cc92d5846 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 27 Oct 2023 09:56:08 +1300 Subject: [PATCH 002/187] rm another redundant comment --- docs/src/operations.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/operations.md b/docs/src/operations.md index 5d21179f..4def3791 100644 --- a/docs/src/operations.md +++ b/docs/src/operations.md @@ -80,7 +80,7 @@ LearnAPI.IID | `LearnAPI.ConfidenceInterval` | confidence interval | | `LearnAPI.Set` | finite but possibly varying number of target observations | | `LearnAPI.ProbabilisticSet` | as for `Set` but labeled with probabilities (not necessarily summing to one) | -| `LearnAPI.SurvivalFunction` | survival function (possible requirement: observation is single-argument function mapping `Real` to `Real`) | +| `LearnAPI.SurvivalFunction` | survival function | | `LearnAPI.SurvivalDistribution` | probability distribution for survival time | | `LearnAPI.OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | | `LearnAPI.Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | From 7d9dae0fce9e61680ac4480b67b4542d48338f4a Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 30 Oct 2023 15:36:39 +1300 Subject: [PATCH 003/187] minor doc fix --- docs/src/reference.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/src/reference.md b/docs/src/reference.md index 1298073f..1d6d6b5d 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -2,10 +2,10 @@ > **Summary** In LearnAPI.jl an **algorithm** is a container for hyperparameters of some > ML/Statistics algorithm (which may or may not "learn"). Functionality is created by -> overloading **methods** provided by the interface, which are divided into training -> methods (e.g., `fit`), operations (e.g.,. `predict` and `transform`) and accessor -> functions (e.g., `feature_importances`). Promises of particular behavior are articulated -> by **algorithm traits**. 
+> overloading methods provided by the interface, which are divided into **training +> methods** (e.g., `fit`), **operations** (`predict` and `transform`) and **accessor +> functions** (e.g., `feature_importances`). Promises of particular behavior are +> articulated by **algorithm traits**. Here we give the definitive specification of the interface provided by LearnAPI.jl. For a more informal guide see [Anatomy of an Implementation](@ref) and [Common Implementation Patterns](@ref). From 68aa9be42530f8ff592ab0a0eb08aae166a7e704 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 2 Nov 2023 08:54:28 +1300 Subject: [PATCH 004/187] major refactor based on Julia Discourse feedback add SpellCheck GH action work in progress work in progress work in progress work in progress WIP tests passing, dumped ext for MLUtils as redundant work in progress work in progress work in progress wip wip big refactor, based on Julia Discourse feedback add components accessor function rename is_wrapper -> is_composite add predict shortcut tweak tweaks tweaks --- .github/workflows/SpellCheck.yml | 13 + Project.toml | 8 +- docs/make.jl | 24 +- docs/src/accessor_functions.md | 35 +- docs/src/algorithm_traits.md | 139 ------- docs/src/anatomy_of_an_implementation.md | 416 ++++++++++---------- docs/src/common_implementation_patterns.md | 17 +- docs/src/fit.md | 36 ++ docs/src/fit_update_and_ingest.md | 46 --- docs/src/goals_and_approach.md | 49 --- docs/src/index.md | 202 +++------- docs/src/kinds_of_target_proxy.md | 55 +++ docs/src/minimize.md | 34 ++ docs/src/obs.md | 100 +++++ docs/src/operations.md | 114 ------ docs/src/optional_data_interface.md | 34 -- docs/src/patterns/classification.md | 1 + docs/src/patterns/classifiers.md | 1 - docs/src/patterns/regression.md | 5 + docs/src/patterns/regressors.md | 1 - docs/src/patterns/static_algorithms.md | 7 + docs/src/patterns/static_transformers.md | 1 - docs/src/predict_transform.md | 71 ++++ docs/src/reference.md | 185 +++++---- docs/src/testing_an_implementation.md | 8 + docs/src/traits.md | 152 ++++++++ src/LearnAPI.jl | 18 +- src/accessor_functions.jl | 260 +++++++++++-- src/algorithms.jl | 19 - src/data_interface.jl | 107 ------ src/fit.jl | 184 +++++++++ src/fit_update_ingest.jl | 177 --------- src/minimize.jl | 41 ++ src/obs.jl | 122 ++++++ src/operations.jl | 184 --------- src/predict_transform.jl | 288 ++++++++++++++ src/tools.jl | 24 +- src/{algorithm_traits.jl => traits.jl} | 426 ++++++++++++--------- src/types.jl | 84 ++++ test/integration/regression.jl | 219 +++++++++++ test/integration/static_algorithms.jl | 112 ++++++ test/runtests.jl | 96 ++++- test/tools.jl | 4 + 43 files changed, 2595 insertions(+), 1524 deletions(-) create mode 100644 .github/workflows/SpellCheck.yml delete mode 100644 docs/src/algorithm_traits.md create mode 100644 docs/src/fit.md delete mode 100644 docs/src/fit_update_and_ingest.md delete mode 100644 docs/src/goals_and_approach.md create mode 100644 docs/src/kinds_of_target_proxy.md create mode 100644 docs/src/minimize.md create mode 100644 docs/src/obs.md delete mode 100644 docs/src/operations.md delete mode 100644 docs/src/optional_data_interface.md create mode 100644 docs/src/patterns/classification.md delete mode 100644 docs/src/patterns/classifiers.md create mode 100644 docs/src/patterns/regression.md delete mode 100644 docs/src/patterns/regressors.md create mode 100644 docs/src/patterns/static_algorithms.md delete mode 100644 docs/src/patterns/static_transformers.md create mode 100644 docs/src/predict_transform.md 
create mode 100644 docs/src/traits.md
 delete mode 100644 src/algorithms.jl
 delete mode 100644 src/data_interface.jl
 create mode 100644 src/fit.jl
 delete mode 100644 src/fit_update_ingest.jl
 create mode 100644 src/minimize.jl
 create mode 100644 src/obs.jl
 delete mode 100644 src/operations.jl
 create mode 100644 src/predict_transform.jl
 rename src/{algorithm_traits.jl => traits.jl} (52%)
 create mode 100644 src/types.jl
 create mode 100644 test/integration/regression.jl
 create mode 100644 test/integration/static_algorithms.jl

diff --git a/.github/workflows/SpellCheck.yml b/.github/workflows/SpellCheck.yml
new file mode 100644
index 00000000..3d62423c
--- /dev/null
+++ b/.github/workflows/SpellCheck.yml
@@ -0,0 +1,13 @@
+name: Spell Check
+
+on: [pull_request]
+
+jobs:
+  typos-check:
+    name: Spell Check with Typos
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout Actions Repository
+        uses: actions/checkout@v4
+      - name: Check spelling
+        uses: crate-ci/typos@master
\ No newline at end of file
diff --git a/Project.toml b/Project.toml
index 61029ea6..f8431fdd 100644
--- a/Project.toml
+++ b/Project.toml
@@ -5,14 +5,18 @@ version = "0.1.0"

[deps]
InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
-Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

[compat]
julia = "1.6"

[extras]
+DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
+LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
+MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54"
+Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
+Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
-test = ["SparseArrays", "Test"]
+test = ["DataFrames", "LinearAlgebra", "MLUtils", "Serialization", "SparseArrays", "Tables", "Test"]
diff --git a/docs/make.jl b/docs/make.jl
index cfd9356c..4d4d08a6 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -2,30 +2,32 @@ using Documenter
using LearnAPI
using ScientificTypesBase

-const REPO="github.com/JuliaAI/LearnAPI.jl"
+const REPO = Remotes.GitHub("JuliaAI", "LearnAPI.jl")

-makedocs(;
+makedocs(
    modules=[LearnAPI,],
    format=Documenter.HTML(prettyurls = get(ENV, "CI", nothing) == "true"),
    pages=[
-        "Overview" => "index.md",
-        "Goals and Approach" => "goals_and_approach.md",
+        "Home" => "index.md",
        "Anatomy of an Implementation" => "anatomy_of_an_implementation.md",
        "Reference" => "reference.md",
-        "Fit, update and ingest" => "fit_update_and_ingest.md",
-        "Predict and other operations" => "operations.md",
+        "Kinds of Target Proxy" => "kinds_of_target_proxy.md",
+        "fit" => "fit.md",
+        "predict, transform, and relatives" => "predict_transform.md",
+        "minimize" => "minimize.md",
+        "obs" => "obs.md",
        "Accessor Functions" => "accessor_functions.md",
-        "Optional Data Interface" => "optional_data_interface.md",
-        "Algorithm Traits" => "algorithm_traits.md",
+        "Algorithm Traits" => "traits.md",
        "Common Implementation Patterns" => "common_implementation_patterns.md",
        "Testing an Implementation" => "testing_an_implementation.md",
    ],
-    repo="https://$REPO/blob/{commit}{path}#L{line}",
-    sitename="LearnAPI.jl"
+    sitename="LearnAPI.jl",
+    warnonly = [:cross_references, :missing_docs],
+    repo = REPO,
)

deploydocs(
-    ; repo=REPO,
    devbranch="dev",
    push_preview=false,
+    repo=REPO,
)
diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md
index 13203c32..8e5e81b1 100644
--- a/docs/src/accessor_functions.md
+++ b/docs/src/accessor_functions.md
@@ -1,16 +1,39 @@
-# Accessor Functions
+# [Accessor
Functions](@id accessor_functions) -> **Summary.** While byproducts of training are ordinarily recorded in the `report` -> component of the output of `fit`/`update!`/`ingest!`, some families of algorithms report an -> item that is likely shared by multiple algorithm types, and it is useful to have common -> interface for accessing these directly. Training losses and feature importances are two -> examples. +The sole argument of an accessor function is the output, `model`, of [`fit`](@ref) or +[`obsfit`](@ref). + +- [`LearnAPI.algorithm(model)`](@ref) +- [`LearnAPI.extras(model)`](@ref) +- [`LearnAPI.coefficients(model)`](@ref) +- [`LearnAPI.intercept(model)`](@ref) +- [`LearnAPI.tree(model)`](@ref) +- [`LearnAPI.trees(model)`](@ref) +- [`LearnAPI.feature_importances(model)`](@ref) +- [`LearnAPI.training_labels(model)`](@ref) +- [`LearnAPI.training_losses(model)`](@ref) +- [`LearnAPI.training_scores(model)`](@ref) +- [`LearnAPI.components(model)`](@ref) + +## Implementation guide + +All new implementations must implement [`LearnAPI.algorithm`](@ref). All others are +optional. + +## Reference ```@docs +LearnAPI.algorithm +LearnAPI.extras +LearnAPI.coefficients +LearnAPI.intercept +LearnAPI.tree +LearnAPI.trees LearnAPI.feature_importances LearnAPI.training_losses LearnAPI.training_scores LearnAPI.training_labels +LearnAPI.components ``` diff --git a/docs/src/algorithm_traits.md b/docs/src/algorithm_traits.md deleted file mode 100644 index e2ccdb0f..00000000 --- a/docs/src/algorithm_traits.md +++ /dev/null @@ -1,139 +0,0 @@ -# Algorithm Traits - -> **Summary.** Traits allow one to promise particular behaviour for an algorithm, such as: -> *This algorithm supports per-observation weights, which must appear as the third -> argument of `fit`*, or *This algorithm's `transform` method predicts `Real` vectors*. - -Algorithm traits are functions whose first (and usually only) argument is an algorithm. In -a new implementation, a single-argument trait is declared following this pattern: - -```julia -LearnAPI.is_pure_julia(algorithm::MyAlgorithmType) = true -``` - -!!! important - - The value of a trait must be the same for all algorithms of the same type, - even if the types differ only in type parameters. There are exceptions for - some traits, if - `is_wrapper(algorithm) = true` for all instances `algorithm` of some type - (composite algorithms). This requirement occasionally requires that - an existing algorithm implementation be split into separate LearnAPI - implementations (e.g., one for regression and another for classification). - -The declaration above has the shorthand - -```julia -@trait MyAlgorithmType is_pure_julia=true -``` - -Multiple traits can be declared like this: - - -```julia -@trait( - MyAlgorithmType, - is_pure_julia = true, - pkg_name = "MyPackage", -) -``` - -### Special two-argument traits - -The two-argument version of [`LearnAPI.predict_output_scitype`](@ref) and -[`LearnAPI.predict_output_scitype`](@ref) are the only overloadable traits with more than -one argument. They cannot be declared using the `@trait` macro. - -## Trait summary - -**Overloadable traits** are available for overloading by any new LearnAPI -implementation. **Derived traits** are not, and should not be called by performance -critical code - -### Overloadable traits - -In the examples column of the table below, `Table`, `Continuous`, `Sampleable` are names owned by the -package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/). 
- -| trait | fallback value | return value | example | -|:-------------------------------------------------|:----------------------|:--------------|:--------| -| [`LearnAPI.functions`](@ref)`(algorithm)` | `()` | implemented LearnAPI functions (traits excluded) | `(:fit, :predict)` | -| [`LearnAPI.preferred_kind_of_proxy`](@ref)`(algorithm)` | `LearnAPI.None()` | an instance `tp` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, tp, ...)` is guaranteed. | `LearnAPI.Distribution()` | -| [`LearnAPI.position_of_target`](@ref)`(algorithm)` | `0` | ¹ the positional index of the **target** in `data` in `fit(..., data...; metadata)` calls | 2 | -| [`LearnAPI.position_of_weights`](@ref)`(algorithm)` | `0` | ¹ the positional index of **per-observation weights** in `data` in `fit(..., data...; metadata)` | 3 | -| [`LearnAPI.descriptors`](@ref)`(algorithm)` | `()` | lists one or more suggestive algorithm descriptors from `LearnAPI.descriptors()` | (:classifier, :probabilistic) | -| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `false` | is `true` if implementation is 100% Julia code | `true` | -| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | `"unknown"` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"DecisionTree"` | -| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | `"unknown"` | name of license of package providing core code | `"MIT"` | -| [`LearnAPI.doc_url`](@ref)`(algorithm)` | `"unknown"` | url providing documentation of the core code | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | -| [`LearnAPI.load_path`](@ref)`(algorithm)` | `"unknown"` | a string indicating where the struct for `typeof(algorithm)` is defined, beginning with name of package providing implementation | `FastTrees.LearnAPI.DecisionTreeClassifier` | -| [`LearnAPI.is_wrapper`](@ref)`(algorithm)` | `false` | is `true` if one or more properties (fields) of `algorithm` may be an algorithm | `true` | -| [`LearnAPI.human_name`](@ref)`(algorithm)` | type name with spaces | human name for the algorithm; should be a noun | "elastic net regressor" | -| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | `nothing` | symbolic name of an iteration parameter | :epochs | -| [`LearnAPI.fit_keywords`](@ref)`(algorithm)` | `()` | tuple of symbols for keyword arguments accepted by `fit` (corresponding to metadata) | `(:class_weights,)` | -| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `fit(algorithm, verbosity, data...)`² | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | -| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | `Union{}`| upper bound on `scitype(observation)` for `observation` in `data` and `data` in `fit(algorithm, verbosity, data...)`² | `Tuple{AbstractVector{Continuous}, Continuous}` | -| [`LearnAPI.fit_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `type(data)` in `fit(algorithm, verbosity, data...)`² | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | -| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | `Union{}`| upper bound on `type(observation)` for `observation` in `data` and `data` in `fit(algorithm, verbosity, data...)`* | `Tuple{AbstractVector{<:Real}, Real}` | -| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `predict(algorithm, fitted_params, data...)`² | `Table(Continuous)` | -| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm, kind_of_proxy)` | `Any` | upper bound on 
`scitype(first(predict(algorithm, kind_of_proxy, ...)))` | `AbstractVector{Continuous}` | -| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `predict(algorithm, fitted_params, data...)`² | `AbstractMatrix{<:Real}` | -| [`LearnAPI.predict_output_type`](@ref)`(algorithm, kind_of_proxy)` | `Any` | upper bound on `typeof(first(predict(algorithm, kind_of_proxy, ...)))` | `AbstractVector{<:Real}` | -| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `transform(algorithm, fitted_params, data...)`² | `Table(Continuous)` | -| [`LearnAPI.transform_output_scitype`](@ref)`(algorithm)` | `Any` | upper bound on `scitype(first(transform(algorithm, ...)))` | `Table(Continuous)` | -| [`LearnAPI.transform_input_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `transform(algorithm, fitted_params, data...)`² | `AbstractMatrix{<:Real}}` | -| [`LearnAPI.transform_output_type`](@ref)`(algorithm)` | `Any` | upper bound on `typeof(first(transform(algorithm, ...)))` | `AbstractMatrix{<:Real}` | - -¹ If the value is `0`, then the variable in boldface type is not supported and not -expected to appear in `data`. If `length(data)` is less than the trait value, then `data` -is understood to exclude the variable, but note that `fit` can have multiple signatures of -varying lengths, as in `fit(algorithm, verbosity, X, y)` and `fit(algorithm, verbosity, X, y, -w)`. A non-zero value is a promise that `fit` includes a signature of sufficient length to -include the variable. - -² Assuming no [optional data interface](@ref data_interface) is implemented. See docstring -for the general case. - - -### Derived Traits - -The following convenience methods are provided but intended for overloading: - -| trait | return value | example | -|:-------------------------------------|:------------------------------------------|:-----------| -| `LearnAPI.name(algorithm)` | algorithm type name as string | "PCA" | -| `LearnAPI.is_algorithm(algorithm)` | `true` if `functions(algorithm)` is not empty | `true` | -| [`LearnAPI.predict_output_scitype`](@ref)(algorithm) | dictionary of upper bounds on the scitype of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | -| [`LearnAPI.predict_output_type`](@ref)(algorithm) | dictionary of upper bounds on the type of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | - - -## Reference - -```@docs -LearnAPI.functions -LearnAPI.preferred_kind_of_proxy -LearnAPI.position_of_target -LearnAPI.position_of_weights -LearnAPI.descriptors -LearnAPI.is_pure_julia -LearnAPI.pkg_name -LearnAPI.pkg_license -LearnAPI.doc_url -LearnAPI.load_path -LearnAPI.is_wrapper -LearnAPI.fit_keywords -LearnAPI.human_name -LearnAPI.iteration_parameter -LearnAPI.fit_scitype -LearnAPI.fit_type -LearnAPI.fit_observation_scitype -LearnAPI.fit_observation_type -LearnAPI.predict_input_scitype -LearnAPI.predict_output_scitype -LearnAPI.predict_input_type -LearnAPI.predict_output_type -LearnAPI.transform_input_scitype -LearnAPI.transform_output_scitype -LearnAPI.transform_input_type -LearnAPI.transform_output_type -``` diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index d0180f35..87995d38 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -1,25 +1,14 @@ # Anatomy of an Implementation -> **Summary.** Formally, an **algorithm** is a container for the hyperparameters of some -> 
ML/statistics algorithm. A basic implementation of the ridge regressor requires -> implementing `fit` and `predict` methods dispatched on the algorithm type; `predict` is -> an example of an **operation**, the others are `transform` and `inverse_transform`. In -> this example we also implement an **accessor function**, called `feature_importance`, -> returning the absolute values of the linear coefficients. The ridge regressor has a -> target variable and outputs literal predictions of the target (rather than, say, -> probabilistic predictions); accordingly the overloaded `predict` method is dispatched on -> the `LiteralTarget` subtype of `KindOfProxy`. An **algorithm trait** declares this type -> as the preferred kind of target proxy. Other traits articulate the algorithm's training -> data type requirements and the input/output type of `predict`. - -We begin by describing an implementation of LearnAPI.jl for basic ridge regression -(without intercept) to introduce the main actors in any implementation. - +This section explains a detailed implementation of the LearnAPI for naive [ridge +regression](https://en.wikipedia.org/wiki/Ridge_regression). Most readers will want to +scan the [demonstration](@ref workflow) of the implementation before studying the +implementation itself. ## Defining an algorithm type The first line below imports the lightweight package LearnAPI.jl whose methods we will be -extending, the second, libraries needed for the core algorithm. +extending. The second imports libraries needed for the core algorithm. ```@example anatomy using LearnAPI @@ -27,293 +16,324 @@ using LinearAlgebra, Tables nothing # hide ``` -Next, we define a struct to store the single hyperparameter `lambda` of this algorithm: +A struct stores the regularization hyperparameter `lambda` of our ridge regressor: ```@example anatomy -struct MyRidge <: LearnAPI.Algorithm - lambda::Float64 +struct Ridge + lambda::Float64 end nothing # hide ``` -The subtyping `MyRidge <: LearnAPI.Algorithm` is optional but recommended where it is not -otherwise disruptive. - -Instances of `MyRidge` are called **algorithms** and `MyRidge` is an **algorithm type**. +Instances of `Ridge` are [algorithms](@ref algorithms), in LearnAPI.jl parlance. -A keyword argument constructor providing defaults for all hyperparameters should be -provided: +A keyword argument constructor provides defaults for all hyperparameters: ```@example anatomy -nothing # hide -MyRidge(; lambda=0.1) = MyRidge(lambda) +Ridge(; lambda=0.1) = Ridge(lambda) nothing # hide ``` -## Implementing training (fit) +## Implementing `fit` -A ridge regressor requires two types of data for training: **input features** `X` and a -[**target**](@ref scope) `y`. Training is implemented by overloading `fit`. Here `verbosity` is an integer -(`0` should train silently, unless warnings are needed): +A ridge regressor requires two types of data for training: *input features* `X`, which +here we suppose are tabular, and a [target](@ref proxy) `y`, which we suppose is a +vector. Users will accordingly call [`fit`](@ref) like this: -```@example anatomy -function LearnAPI.fit(algorithm::MyRidge, verbosity, X, y) +```julia +algorithm = Ridge(lambda=0.05) +fit(algorithm, X, y; verbosity=1) +``` + +However, a new implementation does not overload `fit`. 
Rather, it implements

```julia
obsfit(algorithm::Ridge, obsdata; verbosity=1)
```

for each `obsdata` returned by a data-preprocessing call `obs(fit, algorithm, X, y)`. You
can read "obs" as "observation-accessible", for reasons explained shortly. The
LearnAPI.jl definition

```julia
fit(algorithm, data...; verbosity=1) =
    obsfit(algorithm, obs(fit, algorithm, data...), verbosity)
```

then takes care of `fit`.

The `obs` and `obsfit` methods are public, and the user can call them like this:

```julia
obsdata = obs(fit, algorithm, X, y)
model = obsfit(algorithm, obsdata)
```

We begin by defining a struct¹ for the output of our data-preprocessing operation, `obs`,
which will store `y` and the matrix representation of `X`, together with its column names
(needed for recording named coefficients for user inspection):

```@example anatomy
struct RidgeFitData{T}
    A::Matrix{T}  # p x n
    names::Vector{Symbol}
    y::Vector{T}
end
nothing # hide
```

And we overload [`obs`](@ref) like this:

```@example anatomy
function LearnAPI.obs(::typeof(fit), ::Ridge, X, y)
    table = Tables.columntable(X)
    names = Tables.columnnames(table) |> collect
    return RidgeFitData(Tables.matrix(table, transpose=true), names, y)
end
nothing # hide
```

so that `obs(fit, Ridge(), X, y)` returns a combined `RidgeFitData` object with everything
the core algorithm will need.

Since `obs` is public, the user will have access to this object, but to make it useful to
her (and to fulfill the [`obs`](@ref) contract) this object must implement the
[MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) `getobs`/`numobs` interface, to enable
observation-resampling (which will be efficient, because observations are now columns).
It
usually suffices to overload `Base.getindex` and `Base.length` (which are the
`getobs`/`numobs` fallbacks), so we won't actually need to depend on MLUtils.jl:

```@example anatomy
Base.getindex(data::RidgeFitData, I) =
    RidgeFitData(data.A[:,I], data.names, data.y[I])
Base.length(data::RidgeFitData) = length(data.y)
nothing # hide
```

Next, we define a second struct for storing the outcomes of training, including named
versions of the learned coefficients:

```@example anatomy
struct RidgeFitted{T,F}
    algorithm::Ridge
    coefficients::Vector{T}
    named_coefficients::F
end
nothing # hide
```

We include `algorithm`, which must be recoverable from the output of `fit`/`obsfit` (see
[Accessor functions](@ref) below).

We are now ready to implement a suitable `obsfit` method to execute the core training:

```@example anatomy
function LearnAPI.obsfit(algorithm::Ridge, obsdata::RidgeFitData, verbosity)

    lambda = algorithm.lambda
    A = obsdata.A
    names = obsdata.names
    y = obsdata.y

    # apply core algorithm:
    coefficients = (A*A' + lambda*I)\(A*y)  # p-vector

    # determine named coefficients:
    named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]

    # make some noise, if allowed:
    verbosity > 0 && @info "Coefficients: $named_coefficients"

    return RidgeFitted(algorithm, coefficients, named_coefficients)

end
nothing # hide
```

Users set `verbosity=0` for warnings only, and `verbosity=-1` for silence.

-## [Algorithm traits](@id traits)
+## Implementing `predict`

-We have implemented `predict`, and it is possible to implement `predict` methods for
-multiple `KindOfProxy` types (see [Target proxies](@ref) for a complete
-list).
Accordingly, we are required to declare a preferred target proxy, which we do using -[`LearnAPI.preferred_kind_of_proxy`](@ref): +The primary `predict` call will look like this: -```@example anatomy -LearnAPI.preferred_kind_of_proxy(::MyRidge) = LearnAPI.LiteralTarget() -nothing # hide -``` -Or, you can use the shorthand - -```@example anatomy -@trait MyRidge preferred_kind_of_proxy=LearnAPI.LiteralTarget() -nothing # hide +```julia +predict(model, LiteralTarget(), Xnew) ``` -[`LearnAPI.preferred_kind_of_proxy`](@ref) is an example of a **algorithm trait**. A -complete list of traits and the contracts they imply is given in [Algorithm Traits](@ref). +where `Xnew` is a table (of the same form as `X` above). The argument `LiteralTarget()` +signals that we want literal predictions of the target variable, as opposed to a proxy for +the target, such as probability density functions. `LiteralTarget` is an example of a +[`LearnAPI.KindOfProxy`](@ref proxy_types) type. Targets and target proxies are defined +[here](@ref proxy). -We also need to indicate that a target variable appears in training (this is a supervised -algorithm). We do this by declaring *where* in the list of training data arguments (in this -case `(X, y)`) the target variable (in this case `y`) appears: +Rather than overload the primary signature above, however, we overload for +"observation-accessible" input, as we did for `fit`, ```@example anatomy -@trait MyRidge position_of_target=2 +LearnAPI.obspredict(model::RidgeFitted, ::LiteralTarget, Anew::Matrix) = + ((model.coefficients)'*Anew)' nothing # hide ``` -As explained in the introduction, LearnAPI.jl does not attempt to define strict algorithm -categories, such as "regression" or "clustering". However, we can optionally specify suggestive -descriptors, as in +and overload `obs` to make the table-to-matrix conversion: ```@example anatomy -@trait MyRidge descriptors=(:regression,) -nothing # hide +LearnAPI.obs(::typeof(predict), ::Ridge, Xnew) = Tables.matrix(Xnew, transpose=true) ``` -This declaration actually promises nothing, but can help in generating documentation. Do -`LearnAPI.descriptors()` to get a list of available descriptors. +As matrices (with observations as columns) already implement the MLUtils.jl +`getobs`/`numobs` interface, we already satisfy the [`obs`](@ref) contract, and there was +no need to create a wrapper for `obs` output. -Finally, we are required to declare what methods (excluding traits) we have explicitly -overloaded for our type: +The primary `predict` method, handling tabular input, is provided by a +LearnAPI.jl fallback similar to the `fit` fallback. -```@example anatomy -@trait MyRidge methods=( - :fit, - :predict, - :feature_importances, -) -nothing # hide -``` -## Training data types +## Accessor functions -Since LearnAPI.jl is a basement level API, one is discouraged from including explicit type -checks in an implementation of `fit`. Instead one uses traits to make promises about the -acceptable type of `data` consumed by `fit`. In general, this can be a promise regarding -the ordinary type of `data` or the [scientific -type](https://github.com/JuliaAI/ScientificTypes.jl) of `data` (but not -both). Alternatively, one may only promise a bound on the type/scitype of *observations* -in the data . See [Algorithm Traits](@ref) for further details. In this case we'll be -happy to restrict the scitype of the data: +An [accessor function](@ref accessor_functions) has the output of [`fit`](@ref) (a +"model") as it's sole argument. 
Every new implementation must implement the accessor +function [`LearnAPI.algorithm`](@ref) for recovering an algorithm from a fitted object: ```@example anatomy -import ScientificTypesBase: scitype, Table, Continuous -@trait MyRidge fit_scitype = Tuple{Table(Continuous), AbstractVector{Continuous}} -nothing # hide +LearnAPI.algorithm(model::RidgeFitted) = model.algorithm ``` -This is a contract that `data` is acceptable in the call `fit(algorithm, verbosity, data...)` -whenever +Other accessor functions extract learned parameters or some standard byproducts of +training, such as feature importances or training losses.² Implementing the +[`LearnAPI.coefficients`](@ref) accessor function is straightforward: -```julia -scitype(data) <: Tuple{Table(Continuous), AbstractVector{Continuous}} +```@example anatomy +LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients +nothing #hide ``` -Or, in other words: +## Tearing a model down for serialization -- `X` in `fit(algorithm, verbosity, X, y)` is acceptable, provided `scitype(X) <: - Table(Continuous)` - meaning that `X` `Tables.istable(X) == true` (see - [Tables.jl](https://github.com/JuliaData/Tables.jl)) and each column has some - `<:AbstractFloat` element type. +The `minimize` method falls back to the identity. Here, for the sake of illustration, we +overload it to dump the named version of the coefficients: + +```@example anatomy +LearnAPI.minimize(model::RidgeFitted) = + RidgeFitted(model.algorithm, model.coefficients, nothing) +``` -- `y` in `fit(algorithm, verbosity, X, y)` is acceptable if `scitype(y) <: - AbstractVector{Continuous}` - meaning that it is an abstract vector with `<:AbstractFloat` - elements. +## Algorithm traits -## Input types for operations +Algorithm [traits](@ref traits) record extra generic information about an algorithm, or +make specific promises of behavior. They usually have an algorithm as the single argument. -An optional promise about what `data` is guaranteed to work in a call like -`predict(algorithm, fitted_params, data...)` is articulated this way: +In LearnAPI.jl `predict` always outputs a [target or target proxy](@ref proxy), where +"target" is understood very broadly. We overload a trait to record the fact that the +target variable explicitly appears in training (i.e, the algorithm is supervised) and +where exactly it appears: -```@example anatomy -@trait MyRidge predict_input_scitype = Tuple{AbstractVector{<:Continuous}} +```julia +LearnAPI.position_of_target(::Ridge) = 2 ``` +Or, you can use the shorthand -Note that `data` is always a `Tuple`, even if it has only one component (the typical -case), which explains the `Tuple` on the right-hand side. - -Optionally, we may express our promise using regular types, using the -[`LearnAPI.predict_input_type`](@ref) trait. +```julia +@trait Ridge position_of_target = 2 +``` -One can optionally make promises about the outut of an operation. See [Algorithm -Traits](@ref) for details. +The macro can also be used to specify multiple traits simultaneously: +```@example anatomy +@trait( + Ridge, + position_of_target = 2, + kinds_of_proxy=(LiteralTarget(),), + descriptors = (:regression,), + functions = ( + fit, + obsfit, + minimize, + predict, + obspredict, + obs, + LearnAPI.algorithm, + LearnAPI.coefficients, + ) +) +nothing # hide +``` -## [Illustrative fit/predict workflow](@id workflow) +Implementing the last trait, [`LearnAPI.functions`](@ref), which must include all +non-trait functions overloaded for `Ridge`, is compulsory. 
This is the only universally +compulsory trait. It is worthwhile studying the [list of all traits](@ref traits_list) to +see which might apply to a new implementation, to enable maximum buy into functionality +provided by third party packages, and to assist third party algorithms that match machine +learning algorithms to user defined tasks. -We now illustrate how to interact directly with `MyRidge` instances using the methods we -have implemented: +## [Demonstration](@id workflow) -Here's some toy data for supervised learning: +We now illustrate how to interact directly with `Ridge` instances using the methods +just implemented. ```@example anatomy -using Tables - -n = 10 # number of training observations +# synthesize some data: +n = 10 # number of observations train = 1:6 test = 7:10 - a, b, c = rand(n), rand(n), rand(n) -X = (; a, b, c) |> Tables.rowtable +X = (; a, b, c) y = 2a - b + 3c + 0.05*rand(n) -nothing # hide -``` -Instantiate an algorithm with relevant hyperparameters (which is all the object stores): -```@example anatomy -algorithm = MyRidge(lambda=0.5) +algorithm = Ridge(lambda=0.5) +LearnAPI.functions(algorithm) ``` -Train the algorithm (the `0` means do so silently): +### Naive user workflow -```@example anatomy -import LearnAPI: fit, predict, feature_importances +Training and predicting with external resampling: -fitted_params, state, fit_report = fit(algorithm, 0, X[train], y[train]) +```@example anatomy +using Tables +model = fit(algorithm, Tables.subset(X, train), y[train]) +ŷ = predict(model, LiteralTarget(), Tables.subset(X, test)) ``` -Inspect the learned parameters and report: +### Advanced workflow + +We now train and predict using internal data representations, resampled using the generic +MLUtils.jl interface. ```@example anatomy -@info "training outcomes" fitted_params fit_report +import MLUtils +fit_data = obs(fit, algorithm, X, y) +predict_data = obs(predict, algorithm, X) +model = obsfit(algorithm, MLUtils.getobs(fit_data, train)) +ẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predict_data, test)) +@assert ẑ == ŷ +nothing # hide ``` -Inspect feature importances: +### Applying an accessor function and serialization + +Extracting coefficients: ```@example anatomy -feature_importances(algorithm, fitted_params, fit_report) +LearnAPI.coefficients(model) ``` -Make a prediction using new data: +Serialization/deserialization: -```@example anatomy -yhat, predict_report = predict(algorithm, LearnAPI.LiteralTarget(), fitted_params, X[test]) +```julia +using Serialization +small_model = minimize(model) +serialize("my_ridge.jls", small_model) + +recovered_model = deserialize("my_ridge.jls") +@assert LearnAPI.algorithm(recovered_model) == algorithm +predict(recovered_model, LiteralTarget(), X) == predict(model, LiteralTarget(), X) ``` -Compare predictions with ground truth +--- -```@example anatomy -deviations = yhat - y[test] -loss = deviations .^2 |> sum -@info "Sum of squares loss" loss -``` +¹ The definition of this and other structs above is not an explicit requirement of +LearnAPI.jl, whose constracts are purely functional. + +² An implementation can provide further accessor functions, if necessary, but +like the native ones, they must be included in the [`LearnAPI.functions`](@ref) +declaration. 
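To see how the two workflows above compose with resampling, here is a minimal sketch of
3-fold cross-validation built on the `obs`-based workflow just demonstrated; the `rmse`
helper and the fold boundaries are ad hoc choices for illustration and are not part of
LearnAPI.jl:

```julia
import MLUtils

# materialize the algorithm-specific data representations once:
fit_data = obs(fit, algorithm, X, y)
predict_data = obs(predict, algorithm, X)

rmse(ŷ, y) = sqrt(sum((ŷ .- y).^2)/length(y))  # illustrative loss function

folds = [1:3, 4:6, 7:10]  # hypothetical partition of the 10 observations
losses = map(folds) do test
    train = setdiff(1:10, test)
    model = obsfit(algorithm, MLUtils.getobs(fit_data, train))
    ẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predict_data, test))
    rmse(ẑ, y[test])
end
```

Because `MLUtils.getobs` slices the pre-processed representations directly, the
table-to-matrix conversion is not repeated from fold to fold.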
diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 26c221fd..91e5f925 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -1,5 +1,13 @@ # Common Implementation Patterns +```@raw html +🚧 +``` + +!!! warning + + Under construction + !!! warning This section is only an implementation guide. The definitive specification of the @@ -12,16 +20,17 @@ Although an implementation is defined purely by the methods and traits it implem implementations fall into one (or more) of the following informally understood patterns or "tasks": -- [Classifiers](@ref): Supervised learners for categorical targets +- [Classification](@ref): Supervised learners for categorical targets -- [Regressors](@ref): Supervised learners for continuous targets +- [Regression](@ref): Supervised learners for continuous targets - [Iterative Algorithms](@ref) - [Incremental Algorithms](@ref) -- [Static Transformers](@ref): Transformations that do not learn but which have - hyperparameters and/or deliver ancillary information about the transformation +- [Static Algorithms](@ref): Algorithms that do not learn, in the sense they must be + re-executed for each new data set (do not generalize), but which have hyperparameters + and/or deliver ancillary information about the computation. - [Dimension Reduction](@ref): Transformers that learn to reduce feature space dimension diff --git a/docs/src/fit.md b/docs/src/fit.md new file mode 100644 index 00000000..f2709611 --- /dev/null +++ b/docs/src/fit.md @@ -0,0 +1,36 @@ +# [`fit`](@ref fit) + +```julia +fit(algorithm, data...; verbosity=1) -> model +fit(model, data...; verbosity=1) -> updated_model +``` + +## Typical workflow + +```julia +# Train some supervised `algorithm`: +model = fit(algorithm, X, y) + +# Predict probability distributions: +ŷ = predict(model, Distribution(), Xnew) + +# Inspect some byproducts of training: +LearnAPI.feature_importances(model) +``` + +## Implementation guide + +The `fit` method is not implemented directly. Instead, implement [`obsfit`](@ref). + +| method | fallback | compulsory? | requires | +|:-----------------------------|:---------|-------------|-----------------------------| +| [`obsfit`](@ref)`(alg, ...)` | none | yes | [`obs`](@ref) in some cases | +| | | | | + + +## Reference + +```@docs +LearnAPI.fit +LearnAPI.obsfit +``` diff --git a/docs/src/fit_update_and_ingest.md b/docs/src/fit_update_and_ingest.md deleted file mode 100644 index db935a2a..00000000 --- a/docs/src/fit_update_and_ingest.md +++ /dev/null @@ -1,46 +0,0 @@ -# Fit, update! and ingest! - -> **Summary.** Algorithms that learn, i.e., generalize to new data, must overload `fit`; the -> fallback performs no operation and returns all `nothing`. Implement `update!` if certain -> hyperparameter changes do not necessitate retraining from scratch (e.g., increasing an -> iteration parameter). Implement `ingest!` to implement incremental learning. All -> training methods implemented must be named in the return value of the -> `functions` trait. - -| method | fallback | compulsory? 
| requires | -|:---------------------------|:---------------------------------------------------|-------------|-------------------| -| [`LearnAPI.fit`](@ref) | does nothing, returns `(nothing, nothing, nothing)`| no | | -| [`LearnAPI.update!`](@ref) | calls `fit` | no | [`LearnAPI.fit`](@ref) | -| [`LearnAPI.ingest!`](@ref) | none | no | [`LearnAPI.fit`](@ref) | - -All three methods above return a triple `(fitted_params, state, report)` whose components -are explained under [`LearnAPI.fit`](@ref) below. Items that might be returned in -`report` include: feature rankings/importances, SVM support vectors, clustering centers, -methods for visualizing training outcomes, methods for saving learned parameters in a -custom format, degrees of freedom, deviances. Precisely what `report` includes might be -controlled by hyperparameters (algorithm properties) especially if there is a performance -cost to it's inclusion. - -Implement `fit` unless all [operations](@ref operations), such as `predict` and -`transform`, ignore their `fitted_params` argument (which will be `nothing`). This is the -case for many algorithms that have hyperparameters, but do not generalize to new data, such -as a basic DBSCAN clustering algorithm. - -The `update!` method is intended for all subsequent calls to train an algorithm *using the same -observations*, but with possibly altered hyperparameters (`algorithm` argument). A fallback -implementation simply calls `fit`. The main use cases for implementing `update` are: - -- warm-restarting iterative algorithms - -- "smart" training of composite algorithms, such as linear pipelines; here "smart" means that - hyperparameter changes only trigger the retraining of downstream components. - -The `ingest!` method supports incremental learning (same hyperparameters, but new training -observations). Like `update!`, it depends on the output a preceding `fit` or `ingest!` -call. - -```@docs -LearnAPI.fit -LearnAPI.update! -LearnAPI.ingest! -``` diff --git a/docs/src/goals_and_approach.md b/docs/src/goals_and_approach.md deleted file mode 100644 index 467b600c..00000000 --- a/docs/src/goals_and_approach.md +++ /dev/null @@ -1,49 +0,0 @@ -# Goals and Approach - -## Goals - -- Ease of implementation for existing ML/statistics algorithms - -- Breadth of applicability - -- Flexibility in extending functionality - -- Provision of clear interface points for algorithm-generic tooling, such as performance - evaluation through resampling, hyperparameter optimization, and iterative algorithm - control. - -- Should make minimal assumptions about data containers - -- Should be documented in detail - -In particular, the first three goals are to take precedence over user convenience, which -is addressed with a separate, [User Interface](@ref). - - -## Approach - -ML/Statistics algorithms have a complicated taxonomy. Grouping algorithms, or modelling -tasks, into a relatively small number of categories, such as "classification" and -"clusterering", and then imposing uniform behavior within each group, is challenging. In -our experience developing the [MLJ -ecosystem](https://github.com/alan-turing-institute/MLJ.jl), this either leads to -limitations on the algorithms that can be included in a general interface, or additional -complexity needed to cope with exceptional cases. Even if a complete data science -framework might benefit from such groupings, a basement-level API should, in our view, -avoid them. 
- -In addition to basic methods, like `fit` and `predict`, LearnAPI provides a number of -optional algorithm -[traits](https://ahsmart.com/pub/holy-traits-design-patterns-and-best-practice-book/), -each promising a specific kind of behavior, such as "This algorithm supports class -weights". There is no abstract type hierarchy for ML/statistics algorithms. - -LearnAPI.jl intentionally focuses on the notion of [target variables and target -proxies](@ref proxy), which can exist in both the superised and unsupervised setting, -rather than on the supervised/unsupervised dichotomy. In this view a supervised model is -simply one which has a target variable *and* whose target variable appears in training. - -LearnAPI is a basement-level interface and not a general ML/statistics toolbox. Algorithms -can be supervised or not supervised, can generalize to new data observations (i.e., -"learn") or not generalize (e.g., "one-shot" clusterers). - diff --git a/docs/src/index.md b/docs/src/index.md index f6bede45..e1dc44df 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -5,164 +5,86 @@ LearnAPI.jl
A base Julia interface for machine learning and statistics -

+
+
``` -## Accelerated overview - -LearnAPI.jl provides a collection methods stubs, such as `fit` and `predict`, to be -implemented by algorithms from machine learning and statistics. Through such -implementations, such algorithms buy into algorithm-generic functionality, such as -hyperparameter optimization, as provided by ML/statistics toolboxes and other -packages. LearnAPI.jl also provides a number of Julia traits for making specific promises -of behavior. - -It is designed to be powerful, from the point of view of adding algorithm-generic -functionality, while minimizing the burden on developers implementing the API for a -specific algorithm. - -- To see how to **DEVELOPERS INTERACT** with algorithms implementing LearnAPI, see [Basic fit/predict - workflow](@ref workflow). - -- To see how **USERS INTERACT** with LearnAPI algorithms, see [User - Interface](@ref).[under construction] - -- For developers wanting to **IMPLEMENT** LearnAPI, see [Anatomy of - an Implementation](@ref). - -For more on package goals and philosophy, see [Goals and Approach](@ref). - - -## Methods - -In LearnAPI an *algorithm* is a Julia object storing the hyperparameters of some -ML/statistics algorithm. - -The following methods, dispatched on algorithm type, are provided: - -- `fit`, overloaded if an algorithm involves a learning step, as in classical supervised - learning; the principal output of `fit` is learned parameters - -- `update!`, for adding iterations to an algorithm, or responding efficiently to other - post-`fit`changes in hyperparameters - -- `ingest!`, for incremental learning (training further using *new* data, without - re-initializing learned parameters) - -- *operations*, which apply the algorithm to data, typically not seen in - training, if there is any: - - - `predict`, for predicting values of a target variable or a proxy for the target, such as probability distributions; see below +LearnAPI.jl is a lightweight, functional-style interface, providing a collection of +[methods](@ref Methods), such as `fit` and `predict`, to be implemented by algorithms from +machine learning and statistics. Through such implementations, these algorithms buy into +functionality, such as hyperparameter optimization, as provided by ML/statistics toolboxes +and other packages. LearnAPI.jl also provides a number of Julia [traits](@ref traits) for +promising specific behavior. - - `transform`, for other kinds transformations - - - `inverse_transform`, for reconstructing data from a transformed representation - -- common *accessor functions*, such as `feature_importances` and `training_losses`, for - extracting, from training outcomes, information common to a number of different - algorithms - -- *algorithm traits*, such as `predict_output_type(algorithm)`, for promising specific behavior - -Since this is a functional-style interface, `fit` returns algorithm `state`, in addition to -learned parameters, for passing to the optional `update!` and `ingest!` methods. These -training methods also return a `report` component, for exposing byproducts of training -different from learned parameters. Similarly, all operations also return a `report` -component (important for algorithms that do not generalize to new data). - - -## [Informal concepts](@id scope) - -LearnAPI.jl is predicated on a few basic, informally defined notions, in *italics* -below, which some higher-level interface might decide to formalize. - -- An object which generates ordered sequences of individual *observations* is called - *data*. 
For example a `DataFrame` instance, from - [DataFrames.jl](https://dataframes.juliadata.org/stable/), is considered data, the - observations being the rows. A matrix can be considered data, but whether the - observations are rows or columns is ambiguous and not fixed by LearnAPI. - -- Each machine learning algorithm's behavior is governed by a number of user-specified - *hyperparameters*. The regularization parameter in ridge regression is an - example. Hyperparameters are data-independent. For example, the number of target classes - is not a hyperparameter. - -- Information needed for training that is not a hyperparameter and not data is called - *metadata*. Examples, include target *class* weights and group lasso feature - groupings. Further examples include feature names, and the pool of target classes, when - these are not embedded in the data representation. - - -### [Targets and target proxies](@id proxy) - -After training, a supervised classifier predicts labels on some input which are then -compared with ground truth labels using some accuracy measure, to assesses the performance -of the classifier. Alternatively, the classifier predicts class probabilities, which are -instead paired with ground truth labels using a proper scoring rule, say. In outlier -detection, "outlier"/"inlier" predictions, or probability-like scores, are similarly -compared with ground truth labels. In clustering, integer labels assigned to observations -by the clustering algorithm can can be paired with human labels using, say, the Rand -index. In survival analysis, predicted survival functions or probability distributions are -compared with censored ground truth survival times. +```@raw html +🚧 +``` -More generally, whenever we have a predicted variable (e.g., a class label) paired with -itself or some proxy (such as a class probability) we call the variable a *target* -variable, and the predicted output a *target proxy*. It is immaterial whether or not the -target appears in training (is supervised) or whether the model generalizes to new -observations (learns) or not. +!!! warning -The target and the kind of predicted proxy are crucial features of ML/statistics -performance measures (not provided by this package) and LearnAPI.jl provides a detailed -list of proxy dispatch types (see [Target proxies](@ref)), as well as algorithm traits to -articulate target type /scitype. + The API described here is under active development and not ready for adoption. + Join an ongoing design discussion at + [this](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048) + Julia Discourse thread. + +## Sample workflow -## Optional data interface +Suppose `forest` is some object encapsulating the hyperparameters of the [random forest +algorithm](https://en.wikipedia.org/wiki/Random_forest) (the number of trees, +etc.). Then, a LearnAPI.jl interface can be implemented, for objects with the type of +`forest`, to enable the following basic workflow: -It can be useful to distinguish between data that exists at some high level, convenient -for the general user - such as a table (dataframe) or the path to a directory containing -image files - and a performant, algorithm-specific representation of that data, such as a -matrix or image "data loader". When retraining using the same data with new -hyperparameters, one wants to avoid recreating the algorithm-specific representation, and, -accordingly, a higher level interface may want to cache such representations. 
Furthermore, -in resampling (e.g., cross-validation), a higher level interface wants to directly -resample the algorithm-specific representation, so it needs to know how to do that. To -meet these two ends, LearnAPI provides two additional *data methods* dispatched on -algorithm type: +```julia +X = +y = +w = +Xnew = -- `reformat(algorithm, ...)`, for converting from a user data representation to a - performant algorithm-specific representation, whose output is for use in `fit`, - `predict`, etc. above +# Train: +model = fit(forest, X, y) -- `getobs(algorithm, ...)`, for extracting a subsample of observations of the - algorithm-specific representation +# Predict probability distributions: +predict(model, Distribution(), Xnew) -It should be emphasized that LearnAPI is itself agnostic to particular representations of -data or the particular methods of accessing observations within them. By overloading these -methods, each `algorithm` is free to choose its own data interface. +# Generate point predictions: +ŷ = predict(model, LiteralTarget(), Xnew) # or `predict(model, Xnew)` -See [Optional data Interface](@ref data_interface) for more details. +# Apply an "accessor function" to inspect byproducts of training: +LearnAPI.feature_importances(model) -## Contents +# Slim down and otherwise prepare model for serialization: +small_model = minimize(model) +serialize("my_random_forest.jls", small_model) -It is useful to have a guide to the interface, linked below, organized around common -*informally defined* patterns or "tasks". However, the definitive specification of the -interface is the [Reference](@ref reference) section. +# Recover saved model and algorithm configuration: +recovered_model = deserialize("my_random_forest.jls") +@assert LearnAPI.algorithm(recovered_model) == forest +@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ +``` -- Overview: [Anatomy of an Implementation](@ref) +`Distribution` and `LiteralTarget` are singleton types owned by LearnAPI.jl. They allow +dispatch based on the [kind of target proxy](@ref proxy), a key LearnAPI.jl concept. +LearnAPI.jl places more emphasis on the notion of target variables and target proxies than +on the usual supervised/unsupervised learning dichotomy. From this point of view, a +supervised algorithm is simply one in which a target variable exists, and happens to +appear as an input to training but not to prediction. -- Official Specification: [Reference](@ref reference) +In LearnAPI.jl, a method called [`obs`](@ref data_interface) gives users access to an +"internal", algorithm-specific, representation of input data, which is always +"observation-accessible", in the sense that it can be resampled using +[MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) `getobs/numobs` interface. The +implementation can arrange for this resampling to be efficient, and workflows based on +`obs` can have performance benefits. -- User guide: [Common Implementation Patterns](@ref) [under construction] +## Learning more -- [Testing an Implementation](@ref) [under construction] +- [Anatomy of an Implementation](@ref): informal introduction to the main actors in a new + LearnAPI.jl implementation -!!! info +- [Reference](@ref reference): official specification - It is recommended developers read [Anatomy of an Implementation](@ref) before - consulting the guide or reference sections. 
+- [Common Implementation Patterns](@ref): implementation suggestions for common, + informally defined, algorithm types -*Note.* In the future, LearnAPI.jl may become the new foundation for the -[MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) toolbox. However, LearnAPI.jl -is meant as a general purpose, stand-alone, lightweight, low level API (and has no -reference to the "machines" used in MLJ). +- [Testing an Implementation](@ref) diff --git a/docs/src/kinds_of_target_proxy.md b/docs/src/kinds_of_target_proxy.md new file mode 100644 index 00000000..03c7e032 --- /dev/null +++ b/docs/src/kinds_of_target_proxy.md @@ -0,0 +1,55 @@ +# [Kinds of Target Proxy](@id proxy_types) + +The available kinds of [target proxy](@ref proxy) are classified by subtypes of +`LearnAPI.KindOfProxy`. These types are intended for dispatch only and have no fields. + +```@docs +LearnAPI.KindOfProxy +``` +```@docs +LearnAPI.IID +``` + +## Simple target proxies (subtypes of `LearnAPI.IID`) + +| type | form of an observation | +|:-------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `LearnAPI.LiteralTarget` | same as target observations | +| `LearnAPI.Sampleable` | object that can be sampled to obtain object of the same form as target observation | +| `LearnAPI.Distribution` | explicit probability density/mass function whose sample space is all possible target observations | +| `LearnAPI.LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations | +| † `LearnAPI.Probability` | numerical probability or probability vector | +| † `LearnAPI.LogProbability` | log-probability or log-probability vector | +| † `LearnAPI.Parametric` | a list of parameters (e.g., mean and variance) describing some distribution | +| `LearnAPI.LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering | +| `LearnAPI.LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above | +| `LearnAPI.LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above | +| `LearnAPI.ConfidenceInterval` | confidence interval | +| `LearnAPI.Set` | finite but possibly varying number of target observations | +| `LearnAPI.ProbabilisticSet` | as for `Set` but labeled with probabilities (not necessarily summing to one) | +| `LearnAPI.SurvivalFunction` | survival function | +| `LearnAPI.SurvivalDistribution` | probability distribution for survival time | +| `LearnAPI.OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | +| `LearnAPI.Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | + +† Provided for completeness but discouraged to avoid [ambiguities in +representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation). + +> Table of concrete subtypes of `LearnAPI.IID <: LearnAPI.KindOfProxy`. 
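
+
+For example, a classifier supporting both point and probabilistic predictions might be
+used as follows (a sketch only; `algorithm`, `X`, `y` and `Xnew` are placeholders, and
+`mode` here would come from Distributions.jl):
+
+```julia
+model = fit(algorithm, X, y)
+dists = predict(model, Distribution(), Xnew)  # one pmf per observation
+ŷ = predict(model, LiteralTarget(), Xnew)     # point predictions
+mode.(dists)                                  # typically recovers `ŷ`
+```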

+
+
+## When the proxy for the target is a single object
+
+In the following table of subtypes `T <: LearnAPI.KindOfProxy` not falling under the `IID`
+umbrella, it is understood that the output of `predict(model, ::T, data...)` is
+not divided into individual observations, but represents a *single* probability
+distribution for the sample space ``Y^n``, where ``Y`` is the space in which the target
+variable takes its values, and `n` is the number of observations in `data`.
+
+| type `T` | form of output of `predict(model, ::T, data...)` |
+|:-------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `LearnAPI.JointSampleable` | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. |
+| `LearnAPI.JointDistribution` | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` |
+| `LearnAPI.JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` |
+
+> Table of `LearnAPI.KindOfProxy` subtypes not subtyping `LearnAPI.IID`
diff --git a/docs/src/minimize.md b/docs/src/minimize.md
new file mode 100644
index 00000000..a9423780
--- /dev/null
+++ b/docs/src/minimize.md
@@ -0,0 +1,34 @@
+# [`minimize`](@id algorithm_minimize)
+
+```julia
+minimize(model) -> 
+```
+
+# Typical workflow
+
+```julia
+model = fit(algorithm, X, y)
+ŷ = predict(model, LiteralTarget(), Xnew)
+LearnAPI.feature_importances(model)
+
+small_model = minimize(model)
+serialize("my_model.jls", small_model)
+
+recovered_model = deserialize("my_model.jls")
+@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ
+
+# throws MethodError:
+LearnAPI.feature_importances(recovered_model)
+```
+
+# Implementation guide
+
+| method | compulsory? | fallback | requires |
+|:-----------------------------|:-----------:|:--------:|:-------------:|
+| [`minimize`](@ref) | no | identity | [`fit`](@ref) |
+
+# Reference
+
+```@docs
+minimize
+```
diff --git a/docs/src/obs.md b/docs/src/obs.md
new file mode 100644
index 00000000..599ce590
--- /dev/null
+++ b/docs/src/obs.md
@@ -0,0 +1,100 @@
+# [`obs`](@id data_interface)
+
+The [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) package provides two methods
+`getobs` and `numobs` for resampling data divided into multiple observations, including
+arrays and tables. The data objects returned below are guaranteed to implement this
+interface and can be passed to the relevant method (`obsfit`, `obspredict` or
+`obstransform`), possibly after resampling using `MLUtils.getobs`. This may provide
+performance advantages over naive workflows.
+
+```julia
+obs(fit, algorithm, data...) -> 
+obs(predict, algorithm, data...) -> 
+obs(transform, algorithm, data...) -> 
+```
+
+## Typical workflows
+
+LearnAPI.jl makes no assumptions about the form of data `X` and `y` in a call like
+`fit(algorithm, X, y)`. The particular `algorithm` is free to articulate its own
+requirements. 
However, in this example, the definition
+
+```julia
+obsdata = obs(fit, algorithm, X, y)
+```
+
+combines `X` and `y` in a single object guaranteed to implement the MLUtils.jl
+`getobs`/`numobs` interface, which can be passed to `obsfit` instead of `fit`, as is, or
+after resampling using `MLUtils.getobs`:
+
+```julia
+# equivalent to `fit(algorithm, X, y)`:
+model = obsfit(algorithm, obsdata)
+
+# with resampling:
+resampled_obsdata = MLUtils.getobs(obsdata, 1:100)
+model = obsfit(algorithm, resampled_obsdata)
+```
+
+In some implementations, the alternative pattern above can be used to avoid repeating
+unnecessary internal data preprocessing, or to avoid inefficient resampling. For example,
+here's how a user might call `obs` and `MLUtils.getobs` to perform efficient
+cross-validation:
+
+```julia
+using LearnAPI
+import MLUtils
+
+X = 
+y = 
+w = 
+algorithm = 
+
+train_test_folds = map([1:10, 11:20, 21:30]) do test
+    (setdiff(1:30, test), test)
+end
+
+# create fixed model-specific representations of the whole data set:
+fit_data = obs(fit, algorithm, X, y)
+predict_data = obs(predict, algorithm, X)
+
+scores = map(train_test_folds) do (train_indices, test_indices)
+
+    # train using model-specific representation of data:
+    train_data = MLUtils.getobs(fit_data, train_indices)
+    model = obsfit(algorithm, train_data)
+
+    # predict on the fold complement:
+    test_data = MLUtils.getobs(predict_data, test_indices)
+    ŷ = obspredict(model, LiteralTarget(), test_data)
+
+    return 
+
+end
+```
+
+Note here that the output of `obspredict` will match the representation of `y`, i.e.,
+there is no concept of an algorithm-specific representation of *outputs*, only inputs.
+
+
+## Implementation guide
+
+| method        | compulsory? | fallback               |
+|:--------------|:-----------:|:----------------------:|
+| [`obs`](@ref) | depends     | slurps `data` argument |
+
+If the `data` consumed by `fit`, `predict` or `transform` consists only of tables and
+arrays (with last dimension the observation dimension) then overloading `obs` is
+optional. However, if an implementation overloads `obs` to return a (thinly wrapped)
+representation of user data that is closer to what the core algorithm actually uses, and
+overloads `MLUtils.getobs` (or, more typically, `Base.getindex`) to make resampling of
+that representation efficient, then those optimizations become available to the user,
+without the user concerning herself with the details of the representation.
+
+A sample implementation is given in the [`obs`](@ref) document-string below.
+
+```@docs
+obs
+```
+
diff --git a/docs/src/operations.md b/docs/src/operations.md
deleted file mode 100644
index 4def3791..00000000
--- a/docs/src/operations.md
+++ /dev/null
@@ -1,114 +0,0 @@
-# [Predict and Other Operations](@id operations)
-
-> **Summary** An method delivering output for some algorithm which has finished learning,
-> applied to (new) data, is called an **operation**. The output depends on the fitted
-> parameters associated with the algorithm, which is `nothing` for non-generalizing
-> algorithms. Implement the `predict` operation when the output is predictions of a target
-> variable or, more generally a proxy for the target, such as probability distributions.
-> Otherwise implement `transform` and, optionally `inverse_transform`.
-
-The methods `predict`, `transform` and `inverse_transform` are called *operations*. They
-are all dispatched on an algorithm, fitted parameters and data. 
The `predict` operation -additionally includes a `::ProxyType` argument in position two. If [`LearnAPI.fit`](@ref) -is not implemented, then the fitted parameters will always be `nothing`. - -Here's a snippet of code with a `LearnAPI.predict` call: - -```julia -fitted_params, state, fit_report = LearnAPI.fit(some_algorithm, 1, X, y) -ŷ, predict_report = - LearnAPI.predict(some_algorithm, LearnAPI.LiteralTarget(), fitted_params, Xnew) -``` - -| method | compulsory? | fallback | requires | -|:-----------------------------------|:-----------:|:--------:|:-----------:| -[`LearnAPI.predict`](@ref) | no | none | | -[`LearnAPI.transform`](@ref) | no | none | | -[`LearnAPI.inverse_transform`](@ref) | no | none | `transform` | - - -## General requirements - -- Operations always return a tuple `(output, report)` where `output` is the usual output - (e.g., the target predictions if the operation is `predict`) and `report` - includes byproducts of the computation, typically `nothing` unless the algorithm does not - generalize to new data (does not implement `fit`). - -- If implementing a `predict` method, you must also make a - [`LearnAPI.preferred_kind_of_proxy`](@ref) declaration. - -- The name of each operation explicitly overloaded must be included in the return value - of the [`LearnAPI.functions`](@ref) trait. - -## Predict or transform? - -If the algorithm has a notion of [target variable](@ref proxy), then implement a `predict` -method for each supported kind of target proxy (`LiteralTarget()`, `Distribution()`, -etc). See [Target proxies](@ref) below. - -If an operation is to have an inverse operation, then it cannot be `predict` - use -`transform`, and (optionally) `inverse_transform`, for inversion, broadly understood. See -[`LearnAPI.inverse_transform`](@ref) below. - - -## Target proxies - -The concept of **target proxy** is defined under [Targets and target proxies](@ref -proxy). The available kinds of target proxy are classified by subtypes of -`LearnAPI.KindOfProxy`. These types are intended for dispatch only and have no fields. 
- -```@docs -LearnAPI.KindOfProxy -``` -```@docs -LearnAPI.IID -``` - -| type | form of an observation | -|:-------------------------------:|:----------------------------------------------------| -| `LearnAPI.None` | has no declared relationship with a target variable | -| `LearnAPI.LiteralTarget` | same as target observations | -| `LearnAPI.Sampleable` | object that can be sampled to obtain object of the same form as target observation | -| `LearnAPI.Distribution` | explicit probability density/mass function whose sample space is all possible target observations | -| `LearnAPI.LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations | -| † `LearnAPI.Probability` | raw numerical probability or probability vector | -| † `LearnAPI.LogProbability` | log-probability or log-probability vector | -| † `LearnAPI.Parametric` | a list of parameters (e.g., mean and variance) describing some distribution | -| `LearnAPI.LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering | -| `LearnAPI.LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above | -| `LearnAPI.LabelAmbiguousDistribution`| pdf/pmf version of `LabelAmbiguous`; see `Distribution` above | -| `LearnAPI.ConfidenceInterval` | confidence interval | -| `LearnAPI.Set` | finite but possibly varying number of target observations | -| `LearnAPI.ProbabilisticSet` | as for `Set` but labeled with probabilities (not necessarily summing to one) | -| `LearnAPI.SurvivalFunction` | survival function | -| `LearnAPI.SurvivalDistribution` | probability distribution for survival time | -| `LearnAPI.OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | -| `LearnAPI.Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | - -† Provided for completeness but discouraged to avoid [ambiguities in -representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation). - -> Table of concrete subtypes of `LearnAPI.IID <: LearnAPI.KindOfProxy`. - -In the following table of subtypes `T <: LearnAPI.KindOfProxy` not falling under the `IID` -umbrella, the first return value of `predict(algorithm, ::T, fitted_params, data...)` is -not divided into individual observations, but represents a *single* probability -distribution for the sample space ``Y^n``, where ``Y`` is the space the target variable -takes its values, and `n` is the number of observations in `data`. - -| type `T` | form of output of `predict(algorithm, ::T, fitted_params, data...)` | -|:-------------------------------:|:--------------------------------------------------------------------------| -| `LearnAPI.JointSampleable` | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. 
| -| `LearnAPI.JointDistribution` | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` | -| `LearnAPI.JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` | - -> Table of `LearnAPI.KindOfProxy` subtypes not subtyping `LearnAPI.IID` - - -## Reference - -```@docs -LearnAPI.predict -LearnAPI.transform -LearnAPI.inverse_transform -``` diff --git a/docs/src/optional_data_interface.md b/docs/src/optional_data_interface.md deleted file mode 100644 index beccda9c..00000000 --- a/docs/src/optional_data_interface.md +++ /dev/null @@ -1,34 +0,0 @@ -# [Optional Data Interface](@id data_interface) - -> **Summary.** Implement `getobs` to articulate how to generate individual observations -> from data consumed by a LearnAPI algorithm. Implement `reformat` to provide a higher level -> interface the means to avoid repeating transformations from user representations of data -> (such as a dataframe) and algorithm-specific representations (such as a matrix). - -## Resampling - -To aid in programmatic resampling, such as cross-validation, it is helpful if each machine -learning algorithm articulates how the data it consumes can be subsampled - that is, how a -subset of observations can be extracted from that data. Another advantage of doing so is -to mitigate some of the ambiguities around structuring observations within the container: -Are the observations in a matrix the rows or the columns? - -In LearnAPI, an implementation can articulate a subsampling method by implementing -`LearnAPI.getobs(algorithm, func, I, data...)` for each function `func` consuming data, such -as `fit` and `predict`. Examples are given below. - -```@docs -LearnAPI.getobs -``` -## Preprocessing - -So that a higher level interface can avoid unnecessarily repeating calls to convert -user-supplied data (e.g., a dataframe) into some performant, algorithm-specific -representation, an algorithm can move such data conversions out of `fit`, `predict`, etc., and -into an implementation of `LearnAPI.reformat` created for each signature of such methods -that are implemented. Examples are given below. - -```@docs -LearnAPI.reformat -``` - diff --git a/docs/src/patterns/classification.md b/docs/src/patterns/classification.md new file mode 100644 index 00000000..4e8066d9 --- /dev/null +++ b/docs/src/patterns/classification.md @@ -0,0 +1 @@ +# Classification diff --git a/docs/src/patterns/classifiers.md b/docs/src/patterns/classifiers.md deleted file mode 100644 index 3571bc78..00000000 --- a/docs/src/patterns/classifiers.md +++ /dev/null @@ -1 +0,0 @@ -# Classifiers diff --git a/docs/src/patterns/regression.md b/docs/src/patterns/regression.md new file mode 100644 index 00000000..626d59ce --- /dev/null +++ b/docs/src/patterns/regression.md @@ -0,0 +1,5 @@ +# Regression + +See [these +examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/regression.jl) +from tests. 
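
+
+For orientation, a basic regression workflow might look like the following (a sketch;
+`RidgeRegressor` is a hypothetical implementation, and `X`, `y`, `Xnew` are placeholders):
+
+```julia
+using LearnAPI
+
+algorithm = RidgeRegressor(lambda=0.1)     # hyperparameters only
+model = fit(algorithm, X, y)               # learn the coefficients
+ŷ = predict(model, LiteralTarget(), Xnew)  # point predictions
+LearnAPI.coefficients(model)               # inspect learned parameters
+```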

diff --git a/docs/src/patterns/regressors.md b/docs/src/patterns/regressors.md
deleted file mode 100644
index 0b6163f6..00000000
--- a/docs/src/patterns/regressors.md
+++ /dev/null
@@ -1 +0,0 @@
-# Regressors
diff --git a/docs/src/patterns/static_algorithms.md b/docs/src/patterns/static_algorithms.md
new file mode 100644
index 00000000..7f420e2d
--- /dev/null
+++ b/docs/src/patterns/static_algorithms.md
@@ -0,0 +1,7 @@
+# Static Algorithms
+
+See [these
+examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/static_algorithms.jl)
+from tests.
+
+
diff --git a/docs/src/patterns/static_transformers.md b/docs/src/patterns/static_transformers.md
deleted file mode 100644
index f413050b..00000000
--- a/docs/src/patterns/static_transformers.md
+++ /dev/null
@@ -1 +0,0 @@
-# Static Transformers
diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md
new file mode 100644
index 00000000..a596ad1a
--- /dev/null
+++ b/docs/src/predict_transform.md
@@ -0,0 +1,71 @@
+# [`predict`, `transform`, and relatives](@id operations)
+
+```julia
+predict(model, kind_of_proxy, data...) -> prediction
+transform(model, data...) -> transformed_data
+inverse_transform(model, data...) -> inverted_data
+```
+
+## Typical workflows
+
+```julia
+# Train some supervised `algorithm`:
+model = fit(algorithm, X, y)
+
+# Predict probability distributions:
+ŷ = predict(model, Distribution(), Xnew)
+
+# Generate point predictions:
+ŷ = predict(model, LiteralTarget(), Xnew)
+```
+
+```julia
+# Train a dimension-reducing `algorithm`:
+model = fit(algorithm, X)
+Xnew_reduced = transform(model, Xnew)
+
+# Apply an approximate right inverse:
+inverse_transform(model, Xnew_reduced)
+```
+
+### An advanced workflow
+
+```julia
+fitdata = obs(fit, algorithm, X, y)
+predictdata = obs(predict, algorithm, Xnew)
+model = obsfit(algorithm, fitdata)
+ŷ = obspredict(model, LiteralTarget(), predictdata)
+```
+
+
+## Implementation guide
+
+The methods `predict` and `transform` are not directly overloaded.
+
+| method | compulsory? | fallback | requires |
+|:----------------------------|:-----------:|:--------:|:-------------------------------------:|
+| [`obspredict`](@ref) | no | none | [`fit`](@ref) |
+| [`obstransform`](@ref) | no | none | [`fit`](@ref) |
+| [`inverse_transform`](@ref) | no | none | [`fit`](@ref), [`obstransform`](@ref) |
+
+### Predict or transform?
+
+If the algorithm has a notion of [target variable](@ref proxy), then arrange for
+[`obspredict`](@ref) to output each supported [kind of target proxy](@ref
+proxy_types) (`LiteralTarget()`, `Distribution()`, etc.).
+
+For output not associated with a target variable, implement [`obstransform`](@ref)
+instead, which does not dispatch on [`LearnAPI.KindOfProxy`](@ref), but can be optionally
+paired with an implementation of [`inverse_transform`](@ref) for returning (approximate)
+right inverses to `transform`.
+
+
+## Reference
+
+```@docs
+predict
+obspredict
+transform
+obstransform
+inverse_transform
+```
diff --git a/docs/src/reference.md b/docs/src/reference.md
index 1d6d6b5d..e2c4db4a 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -1,30 +1,77 @@
 # [Reference](@id reference)
 
-> **Summary** In LearnAPI.jl an **algorithm** is a container for hyperparameters of some
-> ML/Statistics algorithm (which may or may not "learn"). 
Functionality is created by
-> overloading methods provided by the interface, which are divided into **training
-> methods** (e.g., `fit`), **operations** (`predict` and `transform`) and **accessor
-> functions** (e.g., `feature_importances`). Promises of particular behavior are
-> articulated by **algorithm traits**.
+Here we give the definitive specification of the LearnAPI.jl interface. For informal
+guides see [Anatomy of an Implementation](@ref) and [Common Implementation
+Patterns](@ref).

-Here we give the definitive specification of the interface provided by LearnAPI.jl. For a
-more informal guide see [Anatomy of an Implementation](@ref) and [Common Implementation Patterns](@ref).

-!!! important
+## [Important terms and concepts](@id scope)

-    The reader is assumed to be familiar with the LearnAPI-specific meanings of the following terms, as outlined in
-    [Scope and undefined notions](@ref scope): **data**, **metadata**,
-    **hyperparameter**, **observation**, **target**, **target proxy**.
-
-## Algorithms
+The LearnAPI.jl specification is predicated on a few basic, informally defined notions:

-In LearnAPI.jl an **algorithm** is some julia object `alg` storing the hyperparameters of
-some MLJ/statistics algorithm that transforms data in some way. Typically the algorithm
-"learns" from data in a training event, but this is not essential; "static" data
-processing, with parameters, is included.
-The type of `alg` will have a name reflecting that of the algorithm, such as
-`DecisionTreeRegressor`.
+
+### Data and observations
+
+ML/statistical algorithms are typically applied in conjunction with resampling of
+*observations*, as in
+[cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)). In this
+document *data* will always refer to objects encapsulating an ordered sequence of
+individual observations. If an algorithm is trained using multiple data objects, it is
+understood that individual objects share the same number of observations, and that
+resampling of one component implies synchronized resampling of the others.
+
+A `DataFrame` instance, from [DataFrames.jl](https://dataframes.juliadata.org/stable/), is
+an example of data, the observations being the rows. LearnAPI.jl makes no assumptions
+about how observations can be accessed, except in the case of the output of [`obs`](@ref
+data_interface), which must implement the MLUtils.jl `getobs`/`numobs` interface. For
+example, it is generally ambiguous whether the rows or columns of a matrix are considered
+observations, but if a matrix is returned by [`obs`](@ref data_interface) the observations
+must be the columns.
+
+### [Hyperparameters](@id hyperparameters)
+
+Besides the data it consumes, a machine learning algorithm's behavior is governed by a
+number of user-specified *hyperparameters*, such as the number of trees in a random
+forest. In LearnAPI.jl, one is allowed to have hyperparameters that are not data-generic.
+For example, a class weight dictionary will only make sense for a target taking values in
+the set of dictionary keys.
+
+
+### [Targets and target proxies](@id proxy)
+
+#### Context
+
+After training, a supervised classifier predicts labels on some input which are then
+compared with ground truth labels using some accuracy measure, to assess the performance
+of the classifier. Alternatively, the classifier predicts class probabilities, which are
+instead paired with ground truth labels using a proper scoring rule, say. 
In outlier
+detection, "outlier"/"inlier" predictions, or probability-like scores, are similarly
+compared with ground truth labels. In clustering, integer labels assigned to observations
+by the clustering algorithm can be paired with human labels using, say, the Rand
+index. In survival analysis, predicted survival functions or probability distributions are
+compared with censored ground truth survival times.
+
+#### Definitions
+
+More generally, whenever we have a variable (e.g., a class label) that can (in principle)
+be paired with a predicted value, or some predicted "proxy" for that variable (such as
+a class probability), then we call the variable a *target* variable, and the predicted
+output a *target proxy*. In this definition, it is immaterial whether or not the target
+appears in training (is supervised) or whether or not the model generalizes to new
+observations ("learns").
+
+LearnAPI.jl provides singleton [target proxy types](@ref proxy_types) for prediction
+dispatch. These are also used to distinguish performance metrics provided
+by the package
+[StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/).
+
+
+### [Algorithms](@id algorithms)
+
+An object implementing the LearnAPI.jl interface is called an *algorithm*, although it is
+more accurately "the configuration of some algorithm".¹ It will have a type name
+reflecting the name of some ML/statistics algorithm (e.g., `RandomForestRegressor`) and it
+will encapsulate a particular set of user-specified [hyperparameters](@ref).

 Additionally, for `alg::Alg` to be a LearnAPI algorithm, we require:

@@ -32,73 +79,83 @@ Additionally, for `alg::Alg` to be a LearnAPI algorithm, we require:

 - If `alg` is an algorithm, then so are all instances of the same type.

-- If `_alg` is another algorithm, then `alg == _alg` if and only if `typeof(alg) == typeof(_alg)` and
-  corresponding properties are `==`. This includes properties that are random number
-  generators (which should be copied in training to avoid mutation).
+- If `_alg` is another algorithm, then `alg == _alg` if and only if `typeof(alg) ==
+  typeof(_alg)` and corresponding properties are `==`. This includes properties that are
+  random number generators (which should be copied in training to avoid mutation).

-- If an algorithm has other algorithms as hyperparameters, then [`LearnAPI.is_wrapper`](@ref)`(alg)`
-  must be `true`.
+- If an algorithm has other algorithms as hyperparameters, then
+  [`LearnAPI.is_composite`](@ref)`(alg)` must be `true` (fallback is `false`).

 - A keyword constructor for `Alg` exists, providing default values for *all* non-algorithm
   hyperparameters.
+
+- At least one non-trait LearnAPI.jl function must be overloaded for instances of `Alg`,
+  and accordingly `LearnAPI.functions(algorithm)` must be non-empty.

-Whenever any LearnAPI method (excluding traits) is overloaded for some type `Alg` (e.g.,
-`predict`, `transform`, `fit`) then that is a promise that all instances of `Alg` are
-algorithms (and the trait [`LearnAPI.functions`](@ref)`(Alg)` will be non-empty).
-
-It is supposed that making copies of algorithm objects is a cheap operation. Consequently,
-*learned* parameters, such as weights in a neural network (the `fitted_params` described
-in [Fit, update! and ingest!](@ref)) should not be stored in the algorithm object. Storing
-learned parameters in an algorithm is not explicitly ruled out, but doing so might lead to
-performance issues in packages adopting LearnAPI.jl. 

+Any object `alg` for which [`LearnAPI.functions`](@ref)`(alg)` is non-empty is understood
+to have a valid implementation of the LearnAPI.jl interface.

### Example

-Any instance of `GradientRidgeRegressor` defined below is a valid LearnAPI.jl algorithm:
+Any instance of `GradientRidgeRegressor` defined below meets all but the last criterion
+above:

```julia
-struct GradientRidgeRegressor{T<:Real} <: LearnAPI.Algorithm
-    learning_rate::T
-    epochs::Int
-    l2_regularization::T
+struct GradientRidgeRegressor{T<:Real}
+    learning_rate::T
+    epochs::Int
+    l2_regularization::T
end
+GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) =
+    GradientRidgeRegressor(learning_rate, epochs, l2_regularization)
```

-The same is true if we omit the subtyping `<: LearnAPI.Algorithm`, but not if we also make
-this a `mutable struct`. In that case we will need to overload `Base.==` for
-`GradientRidgeRegressor`.
+The same is not true if we make this a `mutable struct`. In that case we will need to
+appropriately overload `Base.==` for `GradientRidgeRegressor`.

-```@docs
-LearnAPI.Algorithm
-```

## Methods

-None of the methods described in the linked sections below are compulsory, but any
-implemented or overloaded method that is not an algorithm trait must be added to the return
-value of [`LearnAPI.functions`](@ref), as in
+Only these method names are exported: `fit`, `obsfit`, `predict`, `obspredict`,
+`transform`, `obstransform`, `inverse_transform`, `minimize`, and `obs`. All new
+implementations must implement [`obsfit`](@ref), the accessor function
+[`LearnAPI.algorithm`](@ref algorithm_minimize) and the trait
+[`LearnAPI.functions`](@ref).

-```julia
-LearnAPI.functions(::Type{ importance::Real` pairs (e.g `[:gender => 0.23, :height => 0.7, :weight
-=> 0.1]`).
+"""
+    LearnAPI.feature_importances(model)
+
+Return the algorithm-specific feature importances of a `model` output by
+[`fit`](@ref)`(algorithm, ...)` for some `algorithm`. The value returned has the form of
+an abstract vector of `feature::Symbol => importance::Real` pairs (e.g. `[:gender => 0.23,
+:height => 0.7, :weight => 0.1]`).

-The `algorithm` supports feature importances if `:feature_importance in
+The `algorithm` supports feature importances if `LearnAPI.feature_importances in
 LearnAPI.functions(algorithm)`.

 If an algorithm is sometimes unable to report feature importances then
-`feature_importances` will return all importances as 0.0, as in `[:gender => 0.0, :height
-=> 0.0, :weight => 0.0]`.
+`LearnAPI.feature_importances` will return all importances as 0.0, as in `[:gender => 0.0,
+:height => 0.0, :weight => 0.0]`.

 # New implementations

-`LearnAPI.feature_importances(algorithm::SomeAlgorithmType, fitted_params, report)` may be
-overloaded for any type `SomeAlgorithmType` whose instances are algorithms in the LearnAPI
-sense. If an algorithm can report multiple feature importance types, then the specific type to
-be reported should be controlled by a hyperparameter (i.e., by some property of `algorithm`).
+Implementation is optional.

$(DOC_IMPLEMENTED_METHODS(:feature_importances)).

@@ -34,16 +61,99 @@ $(DOC_IMPLEMENTED_METHODS(:feature_importances)).
 function feature_importances end

 """
-    training_losses(algorithm, fitted_params, report)
+    LearnAPI.coefficients(model)
+
+For a linear model, return the learned coefficients. 
The value returned has the form of
+an abstract vector of `feature_or_class::Symbol => coefficient::Real` pairs (e.g. `[:gender
+=> 0.23, :height => 0.7, :weight => 0.1]`) or, in the case of multi-targets,
+`feature::Symbol => coefficients::AbstractVector{<:Real}` pairs.
+
+The `model` reports coefficients if `LearnAPI.coefficients in
+LearnAPI.functions(LearnAPI.algorithm(model))`.
+
+See also [`LearnAPI.intercept`](@ref).
+
+# New implementations
+
+Implementation is optional.
+
+$(DOC_IMPLEMENTED_METHODS(:coefficients)).
+
+"""
+function coefficients end
+
+"""
+    LearnAPI.intercept(model)
+
+For a linear model, return the learned intercept. The value returned is `Real` (single
+target) or an `AbstractVector{<:Real}` (multi-target).
+
+The `model` reports an intercept if `LearnAPI.intercept in
+LearnAPI.functions(LearnAPI.algorithm(model))`.
+
+See also [`LearnAPI.coefficients`](@ref).
+
+# New implementations
+
+Implementation is optional.
+
+$(DOC_IMPLEMENTED_METHODS(:intercept)).
+
+"""
+function intercept end
+
+"""
+    LearnAPI.tree(model)
+
+Return a user-friendly tree, in the form of a root object implementing the following
+interface defined in AbstractTrees.jl:
+
+- subtypes `AbstractTrees.AbstractNode{T}`
+- implements `AbstractTrees.children()`
+- implements `AbstractTrees.printnode()`
+
+Such a tree can be visualized using the TreeRecipe.jl package, for example.

-Return the training losses for `algorithm`, given `fitted_params` and
-`report`, as returned by [`LearnAPI.fit`](@ref), [`LearnAPI.update!`](@ref) or
-[`LearnAPI.ingest!`](@ref).
+See also [`LearnAPI.trees`](@ref).

 # New implementations

-Implement for iterative algorithms that compute and record training losses as part of training
-(e.g. neural networks).
+Implementation is optional.
+
+$(DOC_IMPLEMENTED_METHODS(:tree)).
+
+"""
+function tree end
+
+"""
+    LearnAPI.trees(model)
+
+For some ensemble model, return a vector of trees. See [`LearnAPI.tree`](@ref) for the
+form of such trees.
+
+See also [`LearnAPI.tree`](@ref).
+
+# New implementations
+
+Implementation is optional.
+
+$(DOC_IMPLEMENTED_METHODS(:trees)).
+
+"""
+function trees end
+
+"""
+    LearnAPI.training_losses(model)
+
+Return the training losses obtained when running `model = fit(algorithm, ...)` for some
+`algorithm`.
+
+See also [`fit`](@ref).
+
+# New implementations
+
+Implement for iterative algorithms that compute and record training losses as part of
+training (e.g. neural networks).

$(DOC_IMPLEMENTED_METHODS(:training_losses)).

@@ -51,17 +161,18 @@ $(DOC_IMPLEMENTED_METHODS(:training_losses)).
 function training_losses end

 """
-    training_scores(algorithm, fitted_params, report)
+    LearnAPI.training_scores(model)
+
+Return the training scores obtained when running `model = fit(algorithm, ...)` for some
+`algorithm`.

-Return the training scores for `algorithm`, given `fitted_params` and
-`report`, as returned by [`LearnAPI.fit`](@ref), [`LearnAPI.update!`](@ref) or
-[`LearnAPI.ingest!`](@ref).
+See also [`fit`](@ref).

 # New implementations

-Implement for algorithms, such as outlier detection algorithms, which associate a score with each
-observation during training, where these scores are of interest in later processes (e.g, in
-defining normalized scores on new data).
+Implement for algorithms, such as outlier detection algorithms, which associate a score
+with each observation during training, where these scores are of interest in later
+processes (e.g., in defining normalized scores for new data).

$(DOC_IMPLEMENTED_METHODS(:training_scores)). 

+
+"""
+function training_scores end

 """
-    training_labels(algorithm, fitted_params, report)
+    LearnAPI.components(model)

-Return the training labels for `algorithm`, given `fitted_params` and
-`report`, as returned by [`LearnAPI.fit`](@ref), [`LearnAPI.update!`](@ref) or
-[`LearnAPI.ingest!`](@ref).
+For a composite `model`, return the component models (`fit` outputs). These will be in the
+form of a vector of named pairs, `property_name::Symbol => component_model`. Here
+`property_name` is the name of some algorithm-valued property (hyperparameter) of
+`algorithm = LearnAPI.algorithm(model)`.
+
+A composite model is one for which the corresponding `algorithm` includes one or more
+algorithm-valued properties, and for which `LearnAPI.is_composite(algorithm)` is `true`.
+
+See also [`is_composite`](@ref).
+
+# New implementations
+
+Implement if and only if `model` is a composite model.
+
+$(DOC_IMPLEMENTED_METHODS(:components)).
+
+"""
+function components end
+
+"""
+    LearnAPI.training_labels(model)
+
+Return the training labels obtained when running `model = fit(algorithm, ...)` for some
+`algorithm`.
+
+See also [`fit`](@ref).

 # New implementations

$(DOC_IMPLEMENTED_METHODS(:training_labels)).

 """
 function training_labels end
+
+
+# :extras intentionally excluded:
+const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS = (
+    algorithm,
+    coefficients,
+    intercept,
+    tree,
+    trees,
+    feature_importances,
+    training_labels,
+    training_losses,
+    training_scores,
+    components,
+)
+
+const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS_LIST = join(
+    map(ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS) do f
+        "[`LearnAPI.$f`](@ref)"
+    end,
+    ", ",
+    " and ",
+)
+
+"""
+    LearnAPI.extras(model)
+
+Return miscellaneous byproducts of an algorithm's computation, from the object `model`
+returned by a call of the form `fit(algorithm, data)`.
+
+$DOC_STATIC
+
+See also [`fit`](@ref).
+
+# New implementations
+
+Implementation is discouraged for byproducts already covered by other LearnAPI.jl accessor
+functions: $ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS_LIST.
+
+$(DOC_IMPLEMENTED_METHODS(:extras)).
+
+"""
+function extras end
+
+const ACCESSOR_FUNCTIONS = (extras, ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS...)
+
+const ACCESSOR_FUNCTIONS_LIST = join(
+    map(ACCESSOR_FUNCTIONS) do f
+        "[`LearnAPI.$f`](@ref)"
+    end,
+    ", ",
+    " and ",
+)
+
diff --git a/src/algorithms.jl b/src/algorithms.jl
deleted file mode 100644
index 02e9614c..00000000
--- a/src/algorithms.jl
+++ /dev/null
@@ -1,19 +0,0 @@
-abstract type LearnAPIType end
-
-"""
-    LearnAPI.Algorithm
-
-An optional abstract type for algorithms implementing LearnAPI.jl.
-
-If `typeof(alg) <: LearnAPI.Algorithm`, then `alg` is guaranteed to be an ML/statistical
-algorithm in the strict LearnAPI sense.
-
-# New implementations
-
-While not a formal requirement, algorithm types implementing the LearnAPI.jl are
-encouraged to subtype `LearnAPI.Algorithm`, unless it is disruptive to do so.
-
-See also [`LearnAPI.functions`](@ref).
-
-"""
-abstract type Algorithm <: LearnAPIType end
diff --git a/src/data_interface.jl b/src/data_interface.jl
deleted file mode 100644
index 8db5f486..00000000
--- a/src/data_interface.jl
+++ /dev/null
@@ -1,107 +0,0 @@
-"""
-    LearnAPI.getobs(algorithm, LearnAPI.fit, I, data...)
-
-Return a subsample of `data` consisting of all observations with indices in `I`. Here
-`data` is data of the form expected in a call like `LearnAPI.fit(algorithm, verbosity,
-data...; metadata...)`. 
- -Always returns a tuple of the same length as `data`. - - LearnAPI.getobs(algorithm, operation, I, data...) - -Return a subsample of `data` consisting of all observations with indices in `I`. Here -`data` is data of the form expected in a call of the specified `operation`, e.g., in a -call like `LearnAPI.predict(algorithm, data...)`, if `operation = LearnAPI.predict`. Possible -values for `operation` are: $DOC_OPERATIONS_LIST_FUNCTION. - -Always returns a tuple of the same length as `data`. - -# New implementations - -Implementation is optional. If implemented, then ordinarily implemented for each signature -of `fit` and operation implemented for `algorithm`. - -$(DOC_IMPLEMENTED_METHODS(:reformat)) - -The subsample returned must be acceptable in place of `data` in calls of the function -named in the second argument. - -## Sample implementation - -Suppose that `MyClassifier` is an algorithm type for simple supervised classification, with -`LearnAPI.fit(algorithm::MyClassifier, verbosity, A, y)` and `predict(algorithm::MyClassifier, -fitted_params, A)` implemented assuming the target `y` is an ordinary abstract vector and -the features `A` is an abstract matrix with columns as observations. Then the following is -a valid implementation of `getobs`: - -```julia -LearnAPI.getobs(::MyClassifier, ::typeof(LearnAPI.fit), I, A, y) = - (view(A, :, I), view(y, I)) -LearnAPI.getobs(::MyClassifier, ::typeof(LearnAPI.predict), I, A) = (view(A, :, I),) -``` - -""" -function getobs end - -""" - LearnAPI.reformat(algorithm, LearnAPI.fit, user_data...; metadata...) - -Return the algorithm-specific representations `(data, metadata)` of user-supplied `(user_data, -user_metadata)`, for consumption, after splatting, by `LearnAPI.fit`, `LearnAPI.update!` -or `LearnAPI.ingest!`. - - LearnAPI.reformat(algorithm, operation, user_data...) - -Return the algorithm-specific representation `data` of user-supplied `user_data`, for -consumption, after splatting, by the specified `operation`, dispatched on `algorithm`. Here -`operation` is one of: $DOC_OPERATIONS_LIST_FUNCTION. - -The following sample workflow illustrates the use of both versions of `reformat`above. The -data objects `X`, `y`, and `Xtest` are the user-supplied versions of data. - -```julia -data, metadata = LearnAPI.reformat(algorithm, LearnAPI.fit, X, y; class_weights=some_dictionary) -fitted_params, state, fit_report = LearnAPI.fit(algorithm, 0, data...; metadata...) - -test_data = LearnAPI.reformat(algorithm, LearnAPI.predict, Xtest) -ŷ, predict_report = LearnAPI.predict(algorithm, fitted_params, test_data...) -``` - -# New implementations - -Implementation of `reformat` is optional. The fallback simply slurps the supplied -data/metadata. You will want to implement for each `fit` or operation signature -implemented for `algorithm`. - -$(DOC_IMPLEMENTED_METHODS(:reformat, overloaded=true)) - -Ideally, any potentially expensive transformation of user-supplied data that is carried -out during training only once, at the beginning, should occur in `reformat` instead of -`fit`/`update!`/`ingest!`. - -Note that the first form of `reformat`, for operations, should always return a tuple, -because the output is splat in calls to the operation (see the sample workflow -above). Similarly, in the return value `(data, metadata)` for the `fit` variant, `data` is -always a tuple and `metadata` always a named tuple (or `Base.Pairs` object). If there is -no metadata, a `NamedTuple()` can be returned in its place. 

-
-## Example implementation
-
-Suppose that `MyClassifier` is an algorithm type for simple supervised classification, with
-`LearnAPI.fit(algorithm::MyClassifier, verbosity, A, y; names=...)` and
-`predict(algorithm::MyClassifier, fitted_params, A)` implemented assuming that the target `y`
-is an ordinary vector, the features `A` is a matrix with columns as observations, and
-`names` are the names of the features. Then, supposing users supply features in tabular
-form, but target as expected, then we can provide the following implementation of
-`reformat`:
-
-```julia
-using Tables
-function LearnAPI.reformat(::MyClassifier, ::typeof(LearnAPI.fit), X, y)
-    names = Tables.schema(Tables.rows(X)).names
-    return ((Tables.matrix(X)', y), (; names))
-end
-LearnAPI.reformat(::MyClassifier, ::typeof(LearnAPI.predict), X) = (Tables.matrix(X)',)
-```
-"""
-reformat(::Any, ::Any, data...; algorithm_data...) = (data, algorithm_data)
diff --git a/src/fit.jl b/src/fit.jl
new file mode 100644
index 00000000..e83cbcfb
--- /dev/null
+++ b/src/fit.jl
@@ -0,0 +1,184 @@
+# # DOC STRING HELPERS
+
+const TRAINING_FUNCTIONS = (:fit,)
+
+
+# # FIT
+
+"""
+    LearnAPI.fit(algorithm, data...; verbosity=1)
+
+Execute the algorithm with configuration `algorithm` using the provided training `data`,
+returning an object, `model`, on which other methods, such as [`predict`](@ref) or
+[`transform`](@ref), can be dispatched. [`LearnAPI.functions(algorithm)`](@ref) returns a
+list of methods that can be applied to either `algorithm` or `model`.
+
+# Arguments
+
+- `algorithm`: property-accessible object whose properties are the hyperparameters of
+  some ML/statistical algorithm
+
+$(DOC_ARGUMENTS(:fit))
+
+- `verbosity=1`: logging level; set to `0` for warnings only, and `-1` for silent training
+
+See also [`obsfit`](@ref), [`predict`](@ref), [`transform`](@ref),
+[`inverse_transform`](@ref), [`LearnAPI.functions`](@ref), [`obs`](@ref).
+
+# Extended help
+
+# New implementations
+
+LearnAPI.jl provides the following definition of `fit`, which is never directly overloaded:
+
+```julia
+fit(algorithm, data...; verbosity=1) =
+    obsfit(algorithm, obs(fit, algorithm, data...), verbosity)
+```
+
+Rather, new algorithms should overload [`obsfit`](@ref). See also [`obs`](@ref).
+
+"""
+fit(algorithm, data...; verbosity=1) =
+    obsfit(algorithm, obs(fit, algorithm, data...), verbosity)
+
+"""
+    obsfit(algorithm, obsdata; verbosity=1)
+
+A lower-level alternative to [`fit`](@ref), this method consumes a pre-processed form of
+user data. Specifically, the following two code snippets are equivalent:
+
+```julia
+model = fit(algorithm, data...)
+```
+and
+
+```julia
+obsdata = obs(fit, algorithm, data...)
+model = obsfit(algorithm, obsdata)
+```
+
+Here `obsdata` is algorithm-specific, "observation-accessible" data, meaning it implements
+the MLUtils.jl `getobs`/`numobs` interface for observation resampling (even if `data` does
+not). Moreover, resampled versions of `obsdata` may be passed to `obsfit` in its place.
+
+The use of `obsfit` may offer performance advantages. See more at [`obs`](@ref).
+
+See also [`fit`](@ref), [`obs`](@ref).
+
+# Extended help
+
+# New implementations
+
+Implementation of the following method signature is compulsory for all new algorithms:
+
+```julia
+LearnAPI.obsfit(algorithm, obsdata, verbosity)
+```
+
+Here `obsdata` has the form explained above. If [`obs`](@ref)`(fit, ...)` is not being
+overloaded, then a fallback gives `obsdata = data` (always a tuple!). 
Note that
+`verbosity` is a positional argument, not a keyword argument, in the overloaded signature.
+
+New implementations must also implement [`LearnAPI.algorithm`](@ref).
+
+If overloaded, then the functions `LearnAPI.obsfit` and `LearnAPI.fit` must be included in
+the tuple returned by the [`LearnAPI.functions(algorithm)`](@ref) trait.
+
+## Non-generalizing algorithms
+
+If the algorithm does not generalize to new data (e.g., DBSCAN clustering) then `data = ()`
+and `obsfit` carries out no computation, as this happens entirely in a `transform` and/or
+`predict` call. In such cases, `obsfit(algorithm, ...)` may return `algorithm`, but
+another possibility is allowed: to provide a mechanism for `transform`/`predict` to report
+byproducts of the computation (e.g., a list of boundary points in DBSCAN clustering), these
+methods are allowed to *mutate* the `model` object returned by `obsfit`, which is then
+arranged to be a mutable struct wrapping `algorithm` and fields to store the byproducts.
+In that case,
+[`LearnAPI.predict_or_transform_mutates(algorithm)`](@ref) must be overloaded to return
+`true`.
+
+"""
+obsfit(algorithm, obsdata; verbosity=1) =
+    obsfit(algorithm, obsdata, verbosity)
+
+
+# # UPDATE
+
+"""
+    LearnAPI.update!(algorithm, verbosity, fitted_params, state, data...)
+
+Based on the values of `state` and `fitted_params` returned by a preceding call to
+[`LearnAPI.fit`](@ref), [`LearnAPI.ingest!`](@ref), or [`LearnAPI.update!`](@ref), update an
+algorithm's fitted parameters, returning new (or mutated) `state` and `fitted_params`.
+
+Intended for retraining when the training data has not changed, but `algorithm`
+properties (hyperparameters) may have changed, e.g., when increasing an iteration
+parameter. Specifically, the assumption is that `data` has the same values
+seen in the most recent call to `fit/update!/ingest!`.
+
+For incremental training (same algorithm, new data) see instead [`LearnAPI.ingest!`](@ref).
+
+# Return value
+
+Same as [`LearnAPI.fit`](@ref), namely a tuple (`fitted_params`, `state`, `report`). See
+[`LearnAPI.fit`](@ref) for details.
+
+
+# New implementations
+
+Overloading this method is optional. A fallback calls `LearnAPI.fit`:
+
+```julia
+LearnAPI.update!(algorithm, verbosity, fitted_params, state, data...) =
+    fit(algorithm, verbosity, data...)
+```
+$(DOC_IMPLEMENTED_METHODS(:fit))
+
+The most common use case is continuing training of an iterative algorithm: `state` is
+simply a copy of the algorithm used in the last training call (`fit`, `update!` or
+`ingest!`) and this will include the current number of iterations as a property. If
+`algorithm` and `state` differ only in the number of iterations (e.g., epochs in a neural
+network), which has increased, then the fitted parameters (network weights and biases) are
+updated, rather than computed from scratch. Otherwise, `update!` simply calls `fit`, to
+force retraining from scratch.
+
+It is permitted to return mutated versions of `state` and `fitted_params`.
+
+See also [`LearnAPI.fit`](@ref), [`LearnAPI.ingest!`](@ref).
+
+"""
+
+
+# # INGEST
+
+"""
+    LearnAPI.ingest!(algorithm, verbosity, fitted_params, state, data...)
+
+For an algorithm that supports incremental learning, update the fitted parameters using
+`data`, which has typically not been seen before. The arguments `state` and
+`fitted_params` are the output of a preceding call to [`LearnAPI.fit`](@ref),
+[`LearnAPI.ingest!`](@ref), or [`LearnAPI.update!`](@ref), of which mutated or new
+versions are returned. 
+ +For updating fitted parameters using the *same* data but new hyperparameters, see instead +[`LearnAPI.update!`](@ref). + +For training an algorithm with new hyperparameters but *unchanged* data, see instead +[`LearnAPI.update!`](@ref). + + +# Return value + +Same as [`LearnAPI.fit`](@ref), namely a tuple (`fitted_params`, `state`, `report`). See +[`LearnAPI.fit`](@ref) for details. + + +# New implementations + +Implementing this method is optional. It has no fallback. + +$(DOC_IMPLEMENTED_METHODS(:fit)) + +See also [`LearnAPI.fit`](@ref), [`LearnAPI.update!`](@ref). + +""" diff --git a/src/fit_update_ingest.jl b/src/fit_update_ingest.jl deleted file mode 100644 index e38ccc40..00000000 --- a/src/fit_update_ingest.jl +++ /dev/null @@ -1,177 +0,0 @@ -# # DOC STRING HELPERS - -const TRAINING_FUNCTIONS = (:fit, :update!, :ingest!) - -const DOC_OPERATIONS = "An *operation* is a method dispatched on an algorithm, "* - "associated learned parameters, and data. "* - "The LearnAPI operations are: $DOC_OPERATIONS_LIST_FUNCTION. " - -const DOC_METADATA = - "`metadata` is for extra information pertaining to the data that is never "* - "iterated or subsampled. Examples, include target *class* weights and group "* - "lasso feature groupings. Further examples include feature names, and the "* - "pool of target classes, when these are not embedded in the data representation. " - -const DOC_WHAT_IS_DATA = - """ - In LearnAPI, *data* is any tuple of objects sharing a - common number of "observations. " - """ - -const DOC_MUTATING_MODELS = - """ - !!! note - - The method is not permitted to mutate `algorithm`. In particular, if `algorithm` - has a random number generator as a hyperparameter (property) then it must be - copied before use. - """ - -# # FIT - -""" - LearnAPI.fit(algorithm, verbosity, data...; metadata...) - -Perform training associated with `algorithm` using the provided `data` and `metadata`. With -the exception of warnings, training will be silent if `verbosity == 0`. Lower values -should suppress warnings; any integer ought to be admissible. Here: - -- `algorithm` is a property-accessible object whose properties are the hyperparameters of - some ML/statistical algorithm. - -- `data` is a tuple of data objects with a common number of observations, for example, - `data = (X, y, w)` where `X` is a table of features, `y` is a target vector with the - same number of rows, and `w` a vector of per-observation weights. - -- $DOC_METADATA To see the keyword names for metadata supported by `algorithm`, do - `LearnAPI.fit_keywords(algorithm)`. " - - -# Return value - -Returns a tuple (`fitted_params`, `state`, `report`) where: - -- The `fitted_params` is the algorithm's learned parameters (eg, the coefficients in a linear - algorithm) in a form understood by operations. $DOC_OPERATIONS If some training - outcome of user-interest is not needed for operations, it should be part of `report` - instead (see below). - -- The `state` is for passing to [`LearnAPI.update!`](@ref) or - [`LearnAPI.ingest!`](@ref). For algorithms that implement neither, `state` should be - `nothing`. - -- The `report` records byproducts of training not in the `fitted_params`, such as feature - rankings, or out-of-sample estimates of performance. - - -# New implementations - -Overloading this method for new algorithms is optional. A fallback performs no -computation, returning `(nothing, nothing, nothing)`. - -See the LearnAPI.jl documentation for the detailed requirements of LearnAPI.jl algorithm -objects. 
- -$DOC_WHAT_IS_DATA - -$DOC_MUTATING_MODELS - -$(DOC_IMPLEMENTED_METHODS(:fit)) - -If supporting metadata, you must also implement [`LearnAPI.fit_keywords`](@ref) to list -the supported keyword argument names (e.g., `class_weights`). - -See also [`LearnAPI.update!`](@ref), [`LearnAPI.ingest!`](@ref). - -""" -fit(::Any, ::Any, ::Integer, data...; metadata...) = nothing, nothing, nothing - - -# # UPDATE - -""" - LearnAPI.update!(algorithm, verbosity, fitted_params, state, data...; metadata...) - -Based on the values of `state`, and `fitted_params` returned by a preceding call to -[`LearnAPI.fit`](@ref), [`LearnAPI.ingest!`](@ref), or [`LearnAPI.update!`](@ref), update a -algorithm's fitted parameters, returning new (or mutated) `state` and `fitted_params`. - -Intended for retraining when the training data has not changed, but `algorithm` -properties (hyperparameters) may have changed, e.g., when increasing an iteration -parameter. Specifically, the assumption is that `data` and `metadata` have the same values -seen in the most recent call to `fit/update!/ingest!`. - -For incremental training (same algorithm, new data) see instead [`LearnAPI.ingest!`](@ref). - -# Return value - -Same as [`LearnAPI.fit`](@ref), namely a tuple (`fitted_params`, `state`, `report`). See -[`LearnAPI.fit`](@ref) for details. - - -# New implementations - -Overloading this method is optional. A fallback calls `LearnAPI.fit`: - -```julia -LearnAPI.update!(algorithm, verbosity, fitted_params, state, data...; metadata...) = - fit(algorithm, verbosity, data; metadata...) -``` -$(DOC_IMPLEMENTED_METHODS(:fit)) - -$DOC_WHAT_IS_DATA - -The most common use case is continuing training of an iterative algorithm: `state` is -simply a copy of the algorithm used in the last training call (`fit`, `update!` or `ingest!`) -and this will include the current number of iterations as a property. If `algorithm` and -`state` differ only in the number of iterations (e.g., epochs in a neural network), which -has increased, then the fitted parameters (weights) are updated, rather than computed from -scratch. Otherwise, `update!` simply calls `fit`, to force retraining from scratch. - -It is permitted to return mutated versions of `state` and `fitted_params`. - -$DOC_MUTATING_MODELS - -See also [`LearnAPI.fit`](@ref), [`LearnAPI.ingest!`](@ref). - -""" -update!(algorithm, verbosity, fitted_params, state, data...; metadata...) = - fit(algorithm, verbosity, data...; metadata...) - - -# # INGEST - -""" - LernAPI.ingest!(algorithm, verbosity, fitted_params, state, data...; metadata...) - -For an algorithm that supports incremental learning, update the fitted parameters using -`data`, which has typically not been seen before. The arguments `state` and -`fitted_params` are the output of a preceding call to [`LearnAPI.fit`](@ref), -[`LearnAPI.ingest!`](@ref), or [`LearnAPI.update!`](@ref), of which mutated or new -versions are returned. - -For updating fitted parameters using the *same* data but new hyperparameters, see instead -[`LearnAPI.update!`](@ref). - -For training an algorithm with new hyperparameters but *unchanged* data, see instead -[`LearnAPI.update!`](@ref). - - -# Return value - -Same as [`LearnAPI.fit`](@ref), namely a tuple (`fitted_params`, `state`, `report`). See -[`LearnAPI.fit`](@ref) for details. - - -# New implementations - -Implementing this method is optional. It has no fallback. - -$(DOC_IMPLEMENTED_METHODS(:fit)) - -$DOC_MUTATING_MODELS - -See also [`LearnAPI.fit`](@ref), [`LearnAPI.update!`](@ref). 

-
-"""
-function ingest!(algorithm, verbosity, fitted_params, state, data...; metadata...) end
diff --git a/src/minimize.jl b/src/minimize.jl
new file mode 100644
index 00000000..173ee24f
--- /dev/null
+++ b/src/minimize.jl
@@ -0,0 +1,41 @@
+"""
+    minimize(model; options...)
+
+Return a version of `model` that will generally have a smaller memory footprint than
+`model`, suitable for serialization. Here `model` is any object returned by
+[`fit`](@ref). Accessor functions that can be called on `model` may not work on
+`minimize(model)`, but [`predict`](@ref), [`transform`](@ref) and
+[`inverse_transform`](@ref) will work, if implemented for `model`. Check
+`LearnAPI.functions(LearnAPI.algorithm(model))` to see what the original `model`
+implements.
+
+Specific algorithms may provide keyword `options` to control how much of the original
+functionality is preserved by `minimize`.
+
+# Extended help
+
+# New implementations
+
+Overloading `minimize` for new algorithms is optional. The fallback is the
+identity. $(DOC_IMPLEMENTED_METHODS(:minimize, overloaded=true))
+
+New implementations must enforce the following identities, whenever the right-hand side is
+defined:
+
+```julia
+predict(minimize(model; options...), args...; kwargs...) ==
+    predict(model, args...; kwargs...)
+transform(minimize(model; options...), args...; kwargs...) ==
+    transform(model, args...; kwargs...)
+inverse_transform(minimize(model; options...), args...; kwargs...) ==
+    inverse_transform(model, args...; kwargs...)
+```
+
+Additionally:
+
+```julia
+minimize(minimize(model)) == minimize(model)
+```
+
+"""
+minimize(model) = model
diff --git a/src/obs.jl b/src/obs.jl
new file mode 100644
index 00000000..75da42f4
--- /dev/null
+++ b/src/obs.jl
@@ -0,0 +1,122 @@
+"""
+    obs(func, algorithm, data...)
+
+Where `func` is `fit`, `predict` or `transform`, return a combined, algorithm-specific,
+representation of `data...`, which can be passed directly to `obsfit`, `obspredict` or
+`obstransform`, as shown in the example below.
+
+The returned object implements the `getobs`/`numobs` observation-resampling interface
+provided by MLUtils.jl, even if `data` does not.
+
+Calling `func` on the returned object may be cheaper than calling `func` directly on
+`data...`. And resampling the returned object using `MLUtils.getobs` may be cheaper than
+directly resampling the components of `data` (an operation not provided by the LearnAPI.jl
+interface).
+
+# Example
+
+Usual workflow, using data-specific resampling methods:
+
+```julia
+X = 
+y = 
+
+Xtrain = Tables.select(X, 1:100)
+ytrain = y[1:100]
+model = fit(algorithm, Xtrain, ytrain)
+ŷ = predict(model, LiteralTarget(), Tables.select(X, 101:150))
+```
+
+Alternative workflow using `obs`:
+
+```julia
+import MLUtils
+
+fitdata = obs(fit, algorithm, X, y)
+predictdata = obs(predict, algorithm, X)
+
+model = obsfit(algorithm, MLUtils.getobs(fitdata, 1:100))
+ẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predictdata, 101:150))
+@assert ẑ == ŷ
+```
+
+See also [`obsfit`](@ref), [`obspredict`](@ref), [`obstransform`](@ref).
+
+
+# Extended help
+
+# New implementations
+
+If the `data` to be consumed in standard user calls to `fit`, `predict` or `transform`
+consists only of tables and arrays (with last dimension the observation dimension) then
+overloading `obs` is optional, but the user will get no performance benefits by using
+it. The implementation of `obs` is optional under more general circumstances stated at the
+end. 
+
+The fallback for `obs` just slurps the provided data:
+
+```julia
+obs(func, alg, data...) = data
+```
+
+The only contractual obligation of `obs` is to return an object implementing the
+`getobs`/`numobs` interface. Generally it suffices to overload `Base.getindex` and
+`Base.length`. However, note that implementations of [`obsfit`](@ref),
+[`obspredict`](@ref), and [`obstransform`](@ref) depend on the form of output of `obs`.
+
+$(DOC_IMPLEMENTED_METHODS(:(obs), overloaded=true))
+
+## Sample implementation
+
+Suppose that `fit`, for an algorithm of type `Alg`, is to have the primary signature
+
+```julia
+fit(algorithm::Alg, X, y)
+```
+
+where `X` is a table, `y` a vector. Internally, the algorithm is to call a lower level
+function
+
+`train(A, names, y)`
+
+where `A = Tables.matrix(X)'` and `names` are the column names of `X`. Then relevant parts
+of an implementation might look like this:
+
+```julia
+# thin wrapper for algorithm-specific representation of data:
+struct ObsData{T}
+    A::Matrix{T}
+    names::Vector{Symbol}
+    y::Vector{T}
+end
+
+# (indirect) implementation of `getobs/numobs`:
+Base.getindex(data::ObsData, I) =
+    ObsData(data.A[:,I], data.names, data.y[I])
+Base.length(data::ObsData) = length(data.y)
+
+# implementation of `obs`:
+function LearnAPI.obs(::typeof(fit), ::Alg, X, y)
+    table = Tables.columntable(X)
+    names = Tables.columnnames(table) |> collect
+    return ObsData(Tables.matrix(table)', names, y)
+end
+
+# implementation of `obsfit`:
+function LearnAPI.obsfit(algorithm::Alg, data::ObsData; verbosity=1)
+    model = train(data.A, data.names, data.y)
+    verbosity > 0 && @info "Training using these features: $(data.names)."
+
+    return model
+end
+```
+
+## When is overloading `obs` optional?
+
+Overloading `obs` is optional, for a given `typeof(algorithm)` and `typeof(func)`, if the
+components of `data` in the standard call `func(algorithm_or_model, data...)` are already
+expected to separately implement the `getobs`/`numobs` interface. This is true for arrays
+whose last dimension is the observation dimension, and for suitable tables.
+
+"""
+obs(func, alg, data...) = data
diff --git a/src/operations.jl b/src/operations.jl
deleted file mode 100644
index 2782df3d..00000000
--- a/src/operations.jl
+++ /dev/null
@@ -1,184 +0,0 @@
-function DOC_IMPLEMENTED_METHODS(name; overloaded=false)
-    word = overloaded ? "overloaded" : "implemented"
-    "If $word, you must include `:$name` in the tuple returned by the "*
-    "[`LearnAPI.functions`](@ref) trait. "
-end
-
-const OPERATIONS = (:predict, :transform, :inverse_transform)
-const DOC_OPERATIONS_LIST_SYMBOL = join(map(op -> "`:$op`", OPERATIONS), ", ")
-const DOC_OPERATIONS_LIST_FUNCTION = join(map(op -> "`LearnAPI.$op`", OPERATIONS), ", ")
-
-const DOC_NEW_DATA =
-    "The `report` contains ancilliary byproducts of the computation, or "*
-    "is `nothing`; `data` is a tuple of data objects, "*
-    "generally a single object representing new observations "*
-    "not seen in training. "
-
-
-# # METHOD STUBS/FALLBACKS
-
-"""
-    LearnAPI.predict(algorithm, kind_of_proxy::LearnAPI.KindOfProxy, fitted_params, data...)
-
-Return `(ŷ, report)` where `ŷ` is the predictions (a data object with target predictions
-as observations) or a proxy for these, for the specified `algorithm` having learned
-parameters `fitted_params` (first object returned by [`LearnAPI.fit`](@ref)`(algorithm,
-...)`). 
$DOC_NEW_DATA - -Where available, use `kind_of_proxy=LiteralTarget()` for ordinary target predictions, and -`kind_of_proxy=Distribution()` for PDF/PMF predictions. Always available is -`kind_of_proxy=`LearnAPI.preferred_kind_of_proxy(algorithm)`. - -For a full list of target proxy types, run `subtypes(LearnAPI.KindOfProxy)` and -`subtypes(LearnAPI.IID)`. - -# New implementations - -$(DOC_IMPLEMENTED_METHODS(:predict)) - -If implementing `LearnAPI.predict`, then a -[`LearnAPI.preferred_kind_of_proxy`](@ref) declaration is required, as in - -```julia -LearnAPI.preferred_kind_of_proxy(::Type{<:SomeAlgorithm}) = LearnAPI.Distribution() -``` - -which has the shorthand - -```julia -@trait SomeAlgorithm preferred_kind_of_proxy=LearnAPI.Distribution() -``` - -The value of this trait must be an instance `T()`, where `T <: LearnAPI.KindOfProxy`. - -See also [`LearnAPI.fit`](@ref). - -""" -function predict end - -""" - LearnAPI.transform(algorithm, fitted_params, data...) - -Return `(output, report)`, where `output` is some kind of transformation of `data`, -provided by `algorithm`, based on the learned parameters `fitted_params` (the first object -returned by [`LearnAPI.fit`](@ref)`(algorithm, ...)`). The `fitted_params` could be -`nothing`, in the case of algorithms that do not generalize to new data. $DOC_NEW_DATA - - -# New implementations - -$(DOC_IMPLEMENTED_METHODS(:transform)) - -See also [`LearnAPI.inverse_transform`](@ref), [`LearnAPI.fit`](@ref), -[`LearnAPI.predict`](@ref), - -""" -function transform end - -""" - LearnAPI.inverse_transform(algorithm, fitted_params, data) - -Return `(data_inverted, report)`, where `data_inverted` is valid input to the call - -```julia -LearnAPI.transform(algorithm, fitted_params, data_inverted) -``` -$DOC_NEW_DATA - -Typically, the map - -```julia -data -> first(inverse_transform(algorithm, fitted_params, data)) -``` - -will be an inverse, approximate inverse, right inverse, or approximate right inverse, for -the map - -```julia -data -> first(transform(algorithm, fitted_params, data)) -``` - -For example, if `transform` corresponds to a projection, `inverse_transform` might be the -corresponding embedding. - - -# New implementations - -$(DOC_IMPLEMENTED_METHODS(:transform)) - -See also [`LearnAPI.fit`](@ref), [`LearnAPI.predict`](@ref), - -""" -function inverse_transform end - -function save end -function restore end - - -# # TARGET PROXIES - -""" - - LearnAPI.KindOfProxy - -Abstract type whose concrete subtypes `T` each represent a different kind of proxy for the -target variable, associated with some algorithm. Instances `T()` are used to request the -form of target predictions in [`LearnAPI.predict`](@ref) calls. - -For example, `LearnAPI.Distribution` is a concrete subtype of `LearnAPI.KindOfProxy` and -the call `LearnAPI.predict(algorithm , LearnAPI.Distribution(), data...)` returns a data -object whose observations are probability density/mass functions, assuming `algorithm` -supports predictions of that form. - -Run `subtypes(LearnAPI.KindOfProxy)` and `subtypes(LearnAPI.IID)` to list all concrete -subtypes of `KindOfProxy`. - -""" -abstract type KindOfProxy end - -""" - LearnAPI.IID <: LearnAPI.KindOfProxy - -Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). If `kind_of_proxy` is an instance of -`LearnAPI.IID` then, given `data` constisting of ``n`` observations, the following must -hold: - -- `LearnAPI.predict(algorithm, kind_of_proxy, data...) 
== (ŷ, report)` where `ŷ` is data
-  also consisting of ``n`` observations; and
-
-- The ``j``th observation of `ŷ`, for any ``j``, depends only on the ``j``th
-  observation of the provided `data` (no correlation between observations).
-
-See also [`LearnAPI.KindOfProxy`](@ref).
-
-"""
-abstract type IID <: KindOfProxy end
-
-struct LiteralTarget <: IID end
-struct Sampleable <: IID end
-struct Distribution <: IID end
-struct LogDistribution <: IID end
-struct Probability <: IID end
-struct LogProbability <: IID end
-struct Parametric <: IID end
-struct LabelAmbiguous <: IID end
-struct LabelAmbiguousSampleable <: IID end
-struct LabelAmbiguousDistribution <: IID end
-struct ConfidenceInterval <: IID end
-struct Set <: IID end
-struct ProbabilisticSet <: IID end
-struct SurvivalFunction <: IID end
-struct SurvivalDistribution <: IID end
-struct OutlierScore <: IID end
-struct Continuous <: IID end
-
-struct JointSampleable <: KindOfProxy end
-struct JointDistribution <: KindOfProxy end
-struct JointLogDistribution <: KindOfProxy end
-
-const CONCRETE_TARGET_PROXY_TYPES = [
-    subtypes(IID)...,
-    JointSampleable,
-    JointDistribution,
-    JointLogDistribution,
-]
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
new file mode 100644
index 00000000..4677894b
--- /dev/null
+++ b/src/predict_transform.jl
@@ -0,0 +1,288 @@
+function DOC_IMPLEMENTED_METHODS(name; overloaded=false)
+    word = overloaded ? "overloaded" : "implemented"
+    "If $word, you must include `$name` in the tuple returned by the "*
+    "[`LearnAPI.functions`](@ref) trait. "
+end
+
+const OPERATIONS = (:predict, :transform, :inverse_transform)
+const DOC_OPERATIONS_LIST_SYMBOL = join(map(op -> "`:$op`", OPERATIONS), ", ")
+const DOC_OPERATIONS_LIST_FUNCTION = join(map(op -> "`LearnAPI.$op`", OPERATIONS), ", ")
+
+DOC_ARGUMENTS(func) =
+"""
+- `data`: tuple of data objects with a common number of observations, for example,
+  `data = (X, y, w)` where `X` is a table of features, `y` is a target vector with the
+  same number of rows, and `w` a vector of per-observation weights.
+
+"""
+
+DOC_MUTATION(op) =
+    """
+
+    If [`LearnAPI.predict_or_transform_mutates(algorithm)`](@ref) is overloaded to return
+    `true`, then `$op` may mutate its first argument, but not in a way that alters the
+    result of a subsequent call to `obspredict`, `obstransform` or
+    `inverse_transform`. This is necessary for some non-generalizing algorithms but is
+    otherwise discouraged. See more at [`fit`](@ref).
+
+    """
+
+
+DOC_MINIMIZE(func) =
+    """
+
+    If, additionally, [`minimize(model)`](@ref) is overloaded, then the following identity
+    must hold:
+
+    ```julia
+    $func(minimize(model), args...) == $func(model, args...)
+    ```
+
+    """
+
+# # METHOD STUBS/FALLBACKS
+
+"""
+    predict(model, kind_of_proxy::LearnAPI.KindOfProxy, data...)
+    predict(model, data...)
+
+The first signature returns target or target proxy predictions for input features `data`,
+according to some `model` returned by [`fit`](@ref) or [`obsfit`](@ref). Where supported,
+these are literally target predictions if `kind_of_proxy = LiteralTarget()`, and
+probability density/mass functions if `kind_of_proxy =
+Distribution()`. $DOC_HOW_TO_LIST_PROXIES
+
+The shortcut `predict(model, data...) = predict(model, LiteralTarget(), data...)` is also
+provided.
+
+# Arguments
+
+- `model` is anything returned by a call of the form `fit(algorithm, ...)`, for some
+  LearnAPI-compliant `algorithm`.
+
+$(DOC_ARGUMENTS(:predict))
+
+# Example
+
+In the following, `algorithm` is some supervised learning algorithm with
+training features `X`, training target `y`, and test features `Xnew`:
+
+```julia
+model = fit(algorithm, X, y; verbosity=0)
+predict(model, LiteralTarget(), Xnew)
+```
+
+Note `predict` does not mutate any argument, except in the special case
+`LearnAPI.predict_or_transform_mutates(algorithm) = true`.
+
+See also [`obspredict`](@ref), [`fit`](@ref), [`transform`](@ref),
+[`inverse_transform`](@ref).
+
+# Extended help
+
+# New implementations
+
+LearnAPI.jl provides the following definition of `predict` which is never to be directly
+overloaded:
+
+```julia
+predict(model, kop::LearnAPI.KindOfProxy, data...) =
+    obspredict(model, kop, obs(predict, LearnAPI.algorithm(model), data...))
+```
+
+Rather, new algorithms overload [`obspredict`](@ref).
+
+"""
+predict(model, kind_of_proxy::KindOfProxy, data...) =
+    obspredict(model, kind_of_proxy, obs(predict, algorithm(model), data...))
+predict(model, data...) = predict(model, LiteralTarget(), data...)
+
+"""
+    obspredict(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata)
+
+Similar to `predict` but consumes algorithm-specific representations of input data,
+`obsdata`, as returned by `obs(predict, algorithm, data...)`. Here `data...` is the form
+of data expected in the main [`predict`](@ref) method. Alternatively, such `obsdata` may
+be replaced by a resampled version, where resampling is performed using `MLUtils.getobs`
+(always supported).
+
+For some algorithms and workflows, `obspredict` will have a performance benefit over
+[`predict`](@ref). See more at [`obs`](@ref).
+
+# Example
+
+In the following, `algorithm` is some supervised learning algorithm with
+training features `X`, training target `y`, and test features `Xnew`:
+
+```julia
+model = fit(algorithm, X, y)
+obsdata = obs(predict, algorithm, Xnew)
+ŷ = obspredict(model, LiteralTarget(), obsdata)
+@assert ŷ == predict(model, LiteralTarget(), Xnew)
+```
+
+See also [`predict`](@ref), [`fit`](@ref), [`transform`](@ref),
+[`inverse_transform`](@ref), [`obs`](@ref).
+
+# Extended help
+
+# New implementations
+
+Implementation of `obspredict` is optional, but required to enable `predict`. The method
+must also handle `obsdata` in the case it is replaced by `MLUtils.getobs(obsdata, I)` for
+some collection `I` of indices. If [`obs`](@ref) is not overloaded, then `obsdata = data`,
+where `data...` is what the standard [`predict`](@ref) call expects, as in the call
+`predict(model, kind_of_proxy, data...)`. Note `data` is always a tuple, even if `predict`
+has only one data argument. See more at [`obs`](@ref).
+
+$(DOC_MUTATION(:obspredict))
+
+If overloaded, you must include both `LearnAPI.obspredict` and `LearnAPI.predict` in the
+list of methods returned by the [`LearnAPI.functions`](@ref) trait.
+
+Each supported `kind_of_proxy` should be listed in the return value of the
+[`LearnAPI.kinds_of_proxy(algorithm)`](@ref) trait.
+
+$(DOC_MINIMIZE(:obspredict))
+
+"""
+function obspredict end
+
+"""
+    transform(model, data...)
+
+Return a transformation of some `data`, using some `model`, as returned by [`fit`](@ref).
+
+# Arguments
+
+- `model` is anything returned by a call of the form `fit(algorithm, ...)`, for some
+  LearnAPI-compliant `algorithm`.
+
+$(DOC_ARGUMENTS(:transform))
+
+# Example
+
+Here `X` and `Xnew` are data of the same form:
+
+```julia
+# For an algorithm that generalizes to new data ("learns"):
+model = fit(algorithm, X; verbosity=0)
+transform(model, Xnew)
+
+# For a static (non-generalizing) transformer:
+model = fit(algorithm)
+transform(model, X)
+```
+
+Note `transform` does not mutate any argument, except in the special case
+`LearnAPI.predict_or_transform_mutates(algorithm) = true`.
+
+See also [`obstransform`](@ref), [`fit`](@ref), [`predict`](@ref),
+[`inverse_transform`](@ref).
+
+# Extended help
+
+# New implementations
+
+LearnAPI.jl provides the following definition of `transform` which is never to be directly
+overloaded:
+
+```julia
+transform(model, data...) =
+    obstransform(model, obs(transform, LearnAPI.algorithm(model), data...))
+```
+
+Rather, new algorithms overload [`obstransform`](@ref).
+
+"""
+transform(model, data...) =
+    obstransform(model, obs(transform, LearnAPI.algorithm(model), data...))
+
+"""
+    obstransform(model, obsdata)
+
+Similar to `transform` but consumes algorithm-specific representations of input data,
+`obsdata`, as returned by `obs(transform, algorithm, data...)`. Here `data...` is the
+form of data expected in the main [`transform`](@ref) method. Alternatively, such
+`obsdata` may be replaced by a resampled version, where resampling is performed using
+`MLUtils.getobs` (always supported).
+
+For some algorithms and workflows, `obstransform` will have a performance benefit over
+[`transform`](@ref). See more at [`obs`](@ref).
+
+# Example
+
+In the following, `algorithm` is some unsupervised learning algorithm with
+training features `X`, and test features `Xnew`:
+
+```julia
+model = fit(algorithm, X)
+obsdata = obs(transform, algorithm, Xnew)
+W = obstransform(model, obsdata)
+@assert W == transform(model, Xnew)
+```
+
+See also [`transform`](@ref), [`fit`](@ref), [`predict`](@ref),
+[`inverse_transform`](@ref), [`obs`](@ref).
+
+# Extended help
+
+# New implementations
+
+Implementation of `obstransform` is optional, but required to enable `transform`. The
+method must also handle `obsdata` in the case it is replaced by `MLUtils.getobs(obsdata,
+I)` for some collection `I` of indices. If [`obs`](@ref) is not overloaded, then `obsdata
+= data`, where `data...` is what the standard [`transform`](@ref) call expects, as in the
+call `transform(model, data...)`. Note `data` is always a tuple, even if `transform` has
+only one data argument. See more at [`obs`](@ref).
+
+$(DOC_MUTATION(:obstransform))
+
+If overloaded, you must include both `LearnAPI.obstransform` and `LearnAPI.transform` in
+the list of methods returned by the [`LearnAPI.functions`](@ref) trait.
+
+$(DOC_MINIMIZE(:obstransform))
+"""
+function obstransform end
+
+"""
+    inverse_transform(model, data)
+
+Inverse transform `data` according to some `model` returned by [`fit`](@ref). Here
+"inverse" is to be understood broadly, e.g., as an approximate
+right inverse for [`transform`](@ref).
+
+# Arguments
+
+- `model`: anything returned by a call of the form `fit(algorithm, ...)`, for some
+  LearnAPI-compliant `algorithm`.
+ +- `data`: something having the same form as the output of `transform(model, inputs...)` + +# Example + +In the following, `algorithm` is some dimension-reducing algorithm that generalizes to new +data (such as PCA); `Xtrain` is the training input and `Xnew` the input to be reduced: + +```julia +model = fit(algorithm, Xtrain; verbosity=0) +W = transform(model, Xnew) # reduced version of `Xnew` +Ŵ = inverse_transform(model, W) # embedding of `W` in original space +``` + +See also [`fit`](@ref), [`transform`](@ref), [`predict`](@ref). + +# Extended help + +# New implementations + +Implementation is optional. $(DOC_IMPLEMENTED_METHODS(:inverse_transform, )) + +$(DOC_MINIMIZE(:inverse_transform)) + +""" +function inverse_transform end diff --git a/src/tools.jl b/src/tools.jl index 88e1b788..7a211729 100644 --- a/src/tools.jl +++ b/src/tools.jl @@ -20,17 +20,17 @@ macro trait(algorithm_ex, exs...) return esc(program) end -""" - typename(x) +# """ +# typename(x) -Return a symbolic representation of the name of `type(x)`, stripped of any type-parameters -and module qualifications. For example, if +# Return a symbolic representation of the name of `type(x)`, stripped of any type-parameters +# and module qualifications. For example, if - typeof(x) = MLJBase.Machine{MLJAlgorithms.ConstantRegressor,true} +# typeof(x) = MLJBase.Machine{MLJAlgorithms.ConstantRegressor,true} -Then `typename(x)` returns `:Machine`. +# Then `typename(x)` returns `:Machine`. -""" +# """ function typename(x) M = typeof(x) if isdefined(M, :name) @@ -47,14 +47,14 @@ function is_uppercase(char::Char) i > 64 && i < 91 end -""" - snakecase(str, del='_') +# """ +# snakecase(str, del='_') -Return the snake case version of the abstract string or symbol, `str`, as in +# Return the snake case version of the abstract string or symbol, `str`, as in - snakecase("TheLASERBeam") == "the_laser_beam" +# snakecase("TheLASERBeam") == "the_laser_beam" -""" +# """ function snakecase(str::AbstractString; delim='_') snake = Char[] n = length(str) diff --git a/src/algorithm_traits.jl b/src/traits.jl similarity index 52% rename from src/algorithm_traits.jl rename to src/traits.jl index d9d5ba86..56f46e68 100644 --- a/src/algorithm_traits.jl +++ b/src/traits.jl @@ -6,15 +6,17 @@ const DOC_UNKNOWN = "failed to overload the trait. " const DOC_ON_TYPE = "The value of the trait must depend only on the type of `algorithm`. " -const DOC_ONLY_ONE = - "No more than one of the following should be overloaded for an algorithm type: "* - "`LearnAPI.fit_scitype`, `LearnAPI.fit_type`, `LearnAPI.fit_observation_scitype`, "* - "`LearnAPI.fit_observation_type`." +DOC_ONLY_ONE(func) = + "Ordinarily, at most one of the following should be overloaded for given "* + "algorithm "* + "`LearnAPI.$(func)_scitype`, `LearnAPI.$(func)_type`, "* + "`LearnAPI.$(func)_observation_scitype`, "* + "`LearnAPI.$(func)_observation_type`." const TRAITS = [ :functions, - :preferred_kind_of_proxy, + :kinds_of_proxy, :position_of_target, :position_of_weights, :descriptors, @@ -23,10 +25,10 @@ const TRAITS = [ :pkg_license, :doc_url, :load_path, - :is_wrapper, + :is_composite, :human_name, :iteration_parameter, - :fit_keywords, + :predict_or_transform_mutates, :fit_scitype, :fit_observation_scitype, :fit_type, @@ -43,53 +45,69 @@ const TRAITS = [ :is_algorithm, ] -# # OVERLOADABLE TRAITS -functions() = METHODS = (TRAINING_FUNCTIONS..., OPERATIONS..., ACCESSOR_FUNCTIONS...) 
-
-const FUNCTIONS = map(d -> "`:$d`", functions())
 
+# # OVERLOADABLE TRAITS
 
 """
     LearnAPI.functions(algorithm)
 
-Return a tuple of symbols, such as `(:fit, :predict)`, corresponding to LearnAPI methods
-specifically implemented for objects having the same type as `algorithm`. If non-empty,
-this also guarantees `algorithm` is an algorithm, in the LearnAPI sense. See the Reference
-section of the manual for details.
+Return a tuple of functions that can be sensibly applied to `algorithm`, or to objects
+having the same type as `algorithm`, or to associated models (objects returned by
+`fit(algorithm, ...)`). Algorithm traits are excluded.
+
+In addition to functions, the returned tuple may include expressions, like
+`:(DecisionTree.print_tree)`, which reference functions not owned by LearnAPI.jl packages.
+
+The understanding is that `algorithm` is a LearnAPI-compliant object whenever this is
+non-empty.
+
+# Extended help
 
 # New implementations
 
-Every LearnAPI method that is not a trait and which is specifically implemented for
-`typeof(algorithm)` must be included in the return value of this trait. Specifically, the
-return value is a tuple of symbols from this list: $(join(FUNCTIONS, ", ")). To regenerate
-this list, do `LearnAPI.functions()`.
+All new implementations must overload this trait. Here's a checklist for elements in the
+return value:
+
+| function             | needs explicit implementation? | include in returned tuple?       |
+|----------------------|--------------------------------|----------------------------------|
+| `fit`                | no                             | yes                              |
+| `obsfit`             | yes                            | yes                              |
+| `minimize`           | optional                       | yes                              |
+| `predict`            | no                             | if `obspredict` is implemented   |
+| `obspredict`         | optional                       | if implemented                   |
+| `transform`          | no                             | if `obstransform` is implemented |
+| `obstransform`       | optional                       | if implemented                   |
+| `obs`                | optional                       | yes                              |
+| `inverse_transform`  | optional                       | if implemented                   |
+| `LearnAPI.algorithm` | yes                            | yes                              |
 
-See also [`LearnAPI.Algorithm`](@ref).
+Also include any implemented accessor functions. The LearnAPI.jl accessor functions are:
+$ACCESSOR_FUNCTIONS_LIST.
 
 """
 functions(::Any) = ()
 
 """
-    LearnAPI.preferred_kind_of_proxy(algorithm)
+    LearnAPI.kinds_of_proxy(algorithm)
 
-Returns an instance of [`LearnAPI.KindOfProxy`](@ref), unless `LearnAPI.predict` is not
-implemented for objects of type `typeof(algorithm)`, in which case it returns `nothing`.
-
-The returned target proxy is generally the one with the smallest computational cost, if
-more than one type is supported.
+Returns a tuple of instances, `kind`, for which `predict(model, kind, data...)` has a
+guaranteed implementation, whenever `model` is a result of `fit(algorithm, ...)`. The
+type of each such `kind` is a concrete subtype of
+[`LearnAPI.KindOfProxy`](@ref). Examples are `LiteralTarget()` (for predicting actual
+target values) and `Distribution()` (for predicting probability mass/density functions).
 
 See also [`LearnAPI.predict`](@ref), [`LearnAPI.KindOfProxy`](@ref).
 
+# Extended help
+
+# New implementations
+
+Implementation is optional but recommended whenever `predict` is overloaded.
 
-# New implementations
-
-Any algorithm implementing `LearnAPI.predict` must overload this trait.
-
-The trait must return a lone instance `T()` for some concrete subtype `T <:
-LearnAPI.KindOfProxy`. List these with `subtypes(LearnAPI.KindOfProxy)` and
-`subtypes(LearnAPI.IID)`.
+Elements of the returned tuple must be one of these: $CONCRETE_TARGET_PROXY_TYPES_LIST.
Suppose, for example, we have the following implementation of a supervised learner
-returning only probablistic predictions:
+returning only probabilistic predictions:
 
 ```julia
 LearnAPI.predict(algorithm::MyNewAlgorithmType, ::LearnAPI.Distribution, Xnew) = ...
 ```
 
 Then we can declare
 
 ```julia
-@trait MyNewAlgorithmType preferred_kind_of_proxy = LearnAPI.LiteralTarget()
-```
-
-which is shorthand for
-
-```julia
-LearnAPI.preferred_kind_of_proxy(::MyNewAlgorithmType) = LearnAPI.Distribution()
+@trait MyNewAlgorithmType kinds_of_proxy = (LearnAPI.Distribution(),)
 ```
 
 For more on target variables and target proxies, refer to the LearnAPI documentation.
 
 """
-preferred_kind_of_proxy(::Any) = nothing
+kinds_of_proxy(::Any) = ()
 
 """
     LearnAPI.position_of_target(algorithm)
@@ -128,7 +140,7 @@ position_of_target(::Any) = 0
     LearnAPI.position_of_weights(algorithm)
 
 Return the expected position of per-observation weights within `data` in
-calls of the form [`LearnAPI.fit`](@ref)`(algorithm, verbosity, data...)`.
+calls of the form [`LearnAPI.fit`](@ref)`(algorithm, data...)`.
 
 If this number is `0`, then no weights are expected. If this number exceeds
 `length(data)`, then `data` is understood to exclude weights, which are assumed to be
@@ -254,21 +266,25 @@ load_path(::Any) = "unknown"
 
 
 """
-    LearnAPI.is_wrapper(algorithm)
+    LearnAPI.is_composite(algorithm)
 
 Returns `true` if one or more properties (fields) of `algorithm` may themselves be
 algorithms, and `false` otherwise.
 
+See also [`LearnAPI.components`](@ref).
+
 # New implementations
 
-This trait must be overloaded if one or more properties (fields) of `algorithm` may take
-algorithm values. Fallback return value is `false`.
+This trait should be overloaded if one or more properties (fields) of `algorithm` may take
+algorithm values. Fallback return value is `false`. The keyword constructor for such an
+algorithm need not prescribe defaults for algorithm-valued properties. Implementation of
+the accessor function [`LearnAPI.components`](@ref) is recommended.
 
 $DOC_ON_TYPE
 
 
 """
-is_wrapper(::Any) = false
+is_composite(::Any) = false
 
 """
     LearnAPI.human_name(algorithm)
@@ -287,62 +303,47 @@ to return `"K-nearest neighbors regressor"`. Ideally, this is a "concrete" noun
 human_name(M) = snakecase(name(M), delim=' ') # `name` defined below
 
 """
-    LearnAPI.iteration_parameter(algorithm)
+    LearnAPI.predict_or_transform_mutates(algorithm)
 
-The name of the iteration parameter of `algorithm`, or `nothing` if the algorithm is not
-iterative.
+Returns `true` if [`predict`](@ref) or [`transform`](@ref) possibly mutate their first
+argument, `model`, when `LearnAPI.algorithm(model) == algorithm`. If `false`, no arguments
+are ever mutated.
 
 # New implementations
 
-Implement if algorithm is iterative. Returns a symbol or `nothing`.
+This trait, falling back to `false`, may only be overloaded when `fit` has no data
+arguments (`algorithm` does not generalize to new data). See more at [`fit`](@ref).
 
 """
-iteration_parameter(::Any) = nothing
+predict_or_transform_mutates(::Any) = false
 
 """
-    LearnAPI.fit_keywords(algorithm)
+    LearnAPI.iteration_parameter(algorithm)
 
-Return a list of keywords that can be provided to `fit` that correspond to
-metadata; $DOC_METADATA
+The name of the iteration parameter of `algorithm`, or `nothing` if the algorithm is not
+iterative.
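+
+For example, a hypothetical implementation for a neural network algorithm with an
+`epochs` hyperparameter might declare
+
+```julia
+LearnAPI.iteration_parameter(::MyNetwork) = :epochs
+```
+
+(`MyNetwork` being an invented algorithm type, for illustration only).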

 # New implementations
 
-If `LearnAPI.fit(algorithm, ...)` supports keyword arguments, then this trait must be
-overloaded, and otherwise not. Fallback returns `()`.
-
-Here's a sample implementation for a classifier that implements a `LearnAPI.fit` method
-with signature `fit(algorithm::MyClassifier, verbosity, X, y; class_weights=nothing)`:
-
-```
-LearnAPI.fit_keywords(::Any{<:MyClassifier}) = (:class_weights,)
-```
-
-or the shorthand
-
-```
-@trait MyClassifier fit_keywords=(:class_weights,)
-```
-
+Implement if algorithm is iterative. Returns a symbol or `nothing`.
 """
-fit_keywords(::Any) = ()
+iteration_parameter(::Any) = nothing
+
 
 """
     LearnAPI.fit_scitype(algorithm)
 
-Return an upper bound on the scitype of data guaranteeing it to work when training
-`algorithm`.
+Return an upper bound on the scitype of `data` guaranteed to work when calling
+`fit(algorithm, data...)`.
 
 Specifically, if the return value is `S` and `ScientificTypes.scitype(data) <: S`, then
-the following low-level calls are allowed (assuming `metadata` is also valid and
-`verbosity` is an integer):
+all the following calls are guaranteed to work:
 
 ```julia
-# apply data front-end:
-data2, metadata2 = LearnAPI.reformat(algorithm, LearnAPI.fit, data...; metadata...)
-
-# train:
-LearnAPI.fit(algorithm, verbosity, data2...; metadata2...)
+fit(algorithm, data...)
+obsdata = obs(fit, algorithm, data...)
+obsfit(algorithm, obsdata)
 ```
 
 See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_observation_scitype`](@ref),
@@ -350,7 +351,7 @@ See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_observation_scitype`](@ref)
 
 # New implementations
 
-Optional. The fallback return value is `Union{}`. $DOC_ONLY_ONE
+Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit))
 
 """
 fit_scitype(::Any) = Union{}
 
 """
     LearnAPI.fit_observation_scitype(algorithm)
 
-Return an upper bound on the scitype of observations guaranteed to work when training
-`algorithm` (independent of the type/scitype of the data container itself).
-
-Specifically, denoting the type returned above by `S`, suppose a user supplies training
-data, `data` - typically a tuple, such as `(X, y)` - and valid metadata, `metadata`, and
-one computes
-
-    data2, metadata2 = LearnAPI.reformat(algorithm, LearnAPI.fit, data...; metadata...)
-
-Then, assuming
+Return an upper bound on the scitype of observations guaranteed to work when calling
+`fit(algorithm, data...)`, independent of the type/scitype of the data container
+itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a
+value different from `Union{}`, the understanding is that `data` implements the MLUtils.jl
+`getobs`/`numobs` interface.
 
-    ScientificTypes.scitype(LearnAPI.getobs(algorithm, LearnAPI.fit, data2, i)) <: S
+Specifically, denoting the type returned above by `S`, supposing `S != Union{}`, and that
+the user supplies `data` satisfying
 
-for any valid index `i`, the following is guaranteed to work:
+```julia
+ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S
+```
+
+for any valid index `i`, then all the following are guaranteed to work:
 
 ```julia
-LearnAPI.fit(algorithm, verbosity, data2...; metadata2...)
+fit(algorithm, data...)
+obsdata = obs(fit, algorithm, data...)
+obsfit(algorithm, obsdata)
 ```
 
 See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref),
 [`LearnAPI.fit_observation_type`](@ref).
 
 # New implementations
 
-Optional. The fallback return value is `Union{}`. $DOC_ONLY_ONE
+Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit))
 
 """
 fit_observation_scitype(::Any) = Union{}
 
 """
     LearnAPI.fit_type(algorithm)
 
-Return an upper bound on the type of data guaranteeing it to work when training `algorithm`.
+Return an upper bound on the type of `data` guaranteed to work when calling
+`fit(algorithm, data...)`.
 
-Specifically, if the return value is `T` and `typeof(data) <: T`, then the following
-low-level calls are allowed (assuming `metadata` is also valid and `verbosity` is an
-integer):
+Specifically, if the return value is `T` and `typeof(data) <: T`, then
+all the following calls are guaranteed to work:
 
 ```julia
-# apply data front-end:
-data2, metadata2 = LearnAPI.reformat(algorithm, LearnAPI.fit, data...; metadata...)
-
-# train:
-LearnAPI.fit(algorithm, verbosity, data2...; metadata2...)
+fit(algorithm, data...)
+obsdata = obs(fit, algorithm, data...)
+obsfit(algorithm, obsdata)
 ```
 
 See also [`LearnAPI.fit_scitype`](@ref), [`LearnAPI.fit_observation_type`](@ref).
@@ -410,7 +411,7 @@ See also [`LearnAPI.fit_scitype`](@ref), [`LearnAPI.fit_observation_type`](@ref)
 
 # New implementations
 
-Optional. The fallback return value is `Union{}`. $DOC_ONLY_ONE
+Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit))
 
 """
 fit_type(::Any) = Union{}
 
 """
     LearnAPI.fit_observation_type(algorithm)
 
-Return an upper bound on the type of observations guaranteed to work when training
-`algorithm` (independent of the type/scitype of the data container itself).
-
-Specifically, denoting the type returned above by `T`, suppose a user supplies training
-data, `data` - typically a tuple, such as `(X, y)` - and valid metadata, `metadata`, and
-one computes
-
-    data2, metadata2 = LearnAPI.reformat(algorithm, LearnAPI.fit, data...; metadata...)
-
-Then, assuming
+Return an upper bound on the type of observations guaranteed to work when calling
+`fit(algorithm, data...)`, independent of the type/scitype of the data container
+itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value
+different from `Union{}`, the understanding is that `data` implements the MLUtils.jl
+`getobs`/`numobs` interface.
 
-    typeof(LearnAPI.getobs(algorithm, LearnAPI.fit, data2, i)) <: T
+Specifically, denoting the type returned above by `T`, supposing `T != Union{}`, and that
+the user supplies `data` satisfying
 
-for any valid index `i`, the following is guaranteed to work:
+```julia
+typeof(MLUtils.getobs(data, i)) <: T
+```
+
+for any valid index `i`, then the following is guaranteed to work:
 
 ```julia
-LearnAPI.fit(algorithm, verbosity, data2...; metadata2...)
+fit(algorithm, data...)
+obsdata = obs(fit, algorithm, data...)
+obsfit(algorithm, obsdata)
```
 
 See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref),
 [`LearnAPI.fit_observation_scitype`](@ref).
 
 # New implementations
 
-Optional. The fallback return value is `Union{}`. $DOC_ONLY_ONE
+Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit))
 
 """
 fit_observation_type(::Any) = Union{}
 
-DOC_INPUT_SCITYPE(op) =
+function DOC_INPUT_SCITYPE(op)
+    extra = op == :predict ? " kind_of_proxy," : ""
+    ONLY = DOC_ONLY_ONE(op)
     """
         LearnAPI.$(op)_input_scitype(algorithm)
 
-    Return an upper bound on the scitype of input data guaranteed to work with the `$op`
-    operation.
+    Return an upper bound on the scitype of `data` guaranteed to work in the call
+    `$op(model,$extra data...)`.
 
    Specifically, if `S` is the value returned and `ScientificTypes.scitype(data) <: S`,
-    then the following low-level calls are allowed
-
-        data2 = LearnAPI.reformat(algorithm, LearnAPI.$op, data...)
-        LearnAPI.$op(algorithm, fitted_params, data2...)
+    then the following is guaranteed to work:
 
-    Here `fitted_params` are the learned parameters returned by an appropriate call to
-    `LearnAPI.fit`.
+    ```julia
+    $op(model,$extra data...)
+    obsdata = obs($op, algorithm, data...)
+    obs$op(model,$extra obsdata)
+    ```
+
+    whenever `algorithm = LearnAPI.algorithm(model)`.
 
    See also [`LearnAPI.$(op)_input_type`](@ref).
 
    # New implementations
 
-    Implementation is optional. The fallback return value is `Union{}`. Should not be
-    overloaded if `LearnAPI.$(op)_input_type` is overloaded.
+    Implementation is optional. The fallback return value is `Union{}`. $ONLY
 
    """
+end
 
-DOC_INPUT_TYPE(op) =
+function DOC_INPUT_OBSERVATION_SCITYPE(op)
+    extra = op == :predict ? " kind_of_proxy," : ""
+    ONLY = DOC_ONLY_ONE(op)
     """
-        LearnAPI.$(op)_input_type(algorithm)
+        LearnAPI.$(op)_observation_scitype(algorithm)
 
-    Return an upper bound on the type of input data guaranteed to work with the `$op`
-    operation.
+    Return an upper bound on the scitype of observations guaranteed to work when calling
+    `$op(model,$extra data...)`, independent of the type/scitype of the data container
+    itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a
+    value different from `Union{}`, the understanding is that `data` implements the
+    MLUtils.jl `getobs`/`numobs` interface.
 
-    Specifically, if `T` is the value returned and `typeof(data) <: S`, then the following
-    low-level calls are allowed
+    Specifically, denoting the type returned above by `S`, supposing `S != Union{}`, and
+    that the user supplies `data` satisfying
 
-        data2 = LearnAPI.reformat(algorithm, LearnAPI.$op, data...)
-        LearnAPI.$op(algorithm, fitted_params, data2...)
+    ```julia
+    ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S
+    ```
 
-    Here `fitted_params` are the learned parameters returned by an appropriate call to
-    `LearnAPI.fit`.
+    for any valid index `i`, then all the following are guaranteed to work:
+
+    ```julia
+    $op(model,$extra data...)
+    obsdata = obs($op, algorithm, data...)
+    obs$op(model,$extra obsdata)
+    ```
+
+    whenever `algorithm = LearnAPI.algorithm(model)`.
+
+    See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref),
+    [`LearnAPI.fit_observation_type`](@ref).
+
+    # New implementations
+
+    Optional. The fallback return value is `Union{}`. $ONLY
+
+    """
+end
+
+function DOC_INPUT_TYPE(op)
+    extra = op == :predict ? " kind_of_proxy," : ""
+    ONLY = DOC_ONLY_ONE(op)
+    """
+        LearnAPI.$(op)_input_type(algorithm)
+
+    Return an upper bound on the type of `data` guaranteed to work in the call
+    `$op(model,$extra data...)`.
+
+    Specifically, if `T` is the value returned and `typeof(data) <: T`, then the following
+    is guaranteed to work:
+
+    ```julia
+    $op(model,$extra data...)
+    obsdata = obs($op, algorithm, data...)
+    obs$op(model,$extra obsdata)
+    ```
+
+    whenever `algorithm = LearnAPI.algorithm(model)`.
 
     See also [`LearnAPI.$(op)_input_scitype`](@ref).
 
     # New implementations
 
     Implementation is optional. The fallback return value is `Union{}`. Should not be
     overloaded if `LearnAPI.$(op)_input_scitype` is overloaded.
 
    """
+end
 
-DOC_OUTPUT_SCITYPE(op) =
+function DOC_INPUT_OBSERVATION_TYPE(op)
+    extra = op == :predict ? " kind_of_proxy," : ""
+    ONLY = DOC_ONLY_ONE(op)
     """
-        LearnAPI.$(op)_output_scitype(algorithm)
+        LearnAPI.$(op)_observation_type(algorithm)
 
-    Return an upper bound on the scitype of the output of the `$op` operation.
+    Return an upper bound on the type of observations guaranteed to work when calling
+    `$op(model,$extra data...)`, independent of the type/scitype of the data container
+    itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a
+    value different from `Union{}`, the understanding is that `data` implements the
+    MLUtils.jl `getobs`/`numobs` interface.
 
-    Specifically, if `S` is the value returned, and if
+    Specifically, denoting the type returned above by `T`, supposing `T != Union{}`, and
+    that the user supplies `data` satisfying
 
-        output, report = LearnAPI.$op(algorithm, fitted_params, data...)
+    ```julia
+    typeof(MLUtils.getobs(data, i)) <: T
+    ```
 
-    for suitable `fitted_params` and `data`, then
+    for any valid index `i`, then all the following are guaranteed to work:
 
-        ScientificTypes.scitype(output) <: S
+    ```julia
+    $op(model,$extra data...)
+    obsdata = obs($op, algorithm, data...)
+    obs$op(model,$extra obsdata)
+    ```
+
+    whenever `algorithm = LearnAPI.algorithm(model)`.
 
-    See also [`LearnAPI.$(op)_input_scitype`](@ref).
+    See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref),
+    [`LearnAPI.fit_observation_type`](@ref).
 
     # New implementations
 
-    Implementation is optional. The fallback return value is `Any`.
+    Optional. The fallback return value is `Union{}`. $ONLY
 
     """
+end
 
-DOC_OUTPUT_TYPE(op) =
+DOC_OUTPUT_SCITYPE(op) =
     """
-        LearnAPI.$(op)_output_type(algorithm)
+        LearnAPI.$(op)_output_scitype(algorithm)
 
-    Return an upper bound on the type of the output of the `$op` operation.
+    Return an upper bound on the scitype of the output of the `$op` operation.
 
-    Specifically, if `T` is the value returned, and if
+    See also [`LearnAPI.$(op)_input_scitype`](@ref).
 
-        output, report = LearnAPI.$op(algorithm, fitted_params, data...)
+    # New implementations
 
-    for suitable `fitted_params` and `data`, then
+    Implementation is optional. The fallback return value is `Any`.
 
-        typeof(output) <: T
+    """
 
-    See also [`LearnAPI.$(op)_input_type`](@ref).
+DOC_OUTPUT_TYPE(op) =
+    """
+        LearnAPI.$(op)_output_type(algorithm)
+
+    Return an upper bound on the type of the output of the `$op` operation.
# New implementations @@ -545,18 +611,36 @@ DOC_OUTPUT_TYPE(op) = "$(DOC_INPUT_SCITYPE(:predict))" predict_input_scitype(::Any) = Union{} +"$(DOC_INPUT_OBSERVATION_SCITYPE(:predict))" +predict_input_observation_scitype(::Any) = Union{} + "$(DOC_INPUT_TYPE(:predict))" predict_input_type(::Any) = Union{} +"$(DOC_INPUT_OBSERVATION_TYPE(:predict))" +predict_input_observation_type(::Any) = Union{} + +"$(DOC_OUTPUT_SCITYPE(:predict))" +predict_output_scitype(::Any) = Any + +"$(DOC_OUTPUT_TYPE(:predict))" +predict_output_type(::Any) = Any + "$(DOC_INPUT_SCITYPE(:transform))" transform_input_scitype(::Any) = Union{} -"$(DOC_OUTPUT_SCITYPE(:transform))" -transform_output_scitype(::Any) = Any +"$(DOC_INPUT_OBSERVATION_SCITYPE(:transform))" +transform_input_observation_scitype(::Any) = Union{} "$(DOC_INPUT_TYPE(:transform))" transform_input_type(::Any) = Union{} +"$(DOC_INPUT_OBSERVATION_TYPE(:transform))" +transform_input_observation_type(::Any) = Union{} + +"$(DOC_OUTPUT_SCITYPE(:transform))" +transform_output_scitype(::Any) = Any + "$(DOC_OUTPUT_TYPE(:transform))" transform_output_type(::Any) = Any @@ -571,7 +655,7 @@ const DOC_PREDICT_OUTPUT(s) = Return an upper bound for the $(s)s of predictions of the specified form where supported, and otherwise return `Any`. For example, if - ŷ, report = LearnAPI.predict(algorithm, LearnAPI.Distribution(), data...) + ŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...) successfully returns (i.e., `algorithm` supports predictions of target probability distributions) then the following is guaranteed to hold: @@ -590,9 +674,9 @@ const DOC_PREDICT_OUTPUT(s) = Overloading the trait is optional. Here's a sample implementation for a supervised regressor type `MyRgs` that only predicts actual values of the target: - LearnAPI.predict(alogrithm::MyRgs, ::LearnAPI.LiteralTarget, data...) = ... - LearnAPI.predict_output_$(s)(::MyRgs, ::LearnAPI.LiteralTarget) = - AbstractVector{ScientificTypesBase.Continuous} + ```julia + @trait MyRgs predict_output_$(s) = AbstractVector{ScientificTypesBase.Continuous} + ``` The fallback method returns `Any`. @@ -611,24 +695,26 @@ name(A) = string(typename(A)) is_algorithm(A) = !isempty(functions(A)) +preferred_kind_of_proxy(algorithm) = first(kinds_of_proxy(algorithm)) + const DOC_PREDICT_OUTPUT2(s) = """ - LearnAPI.predict_output_$(s)(algorithm) + LearnAPI.predict_output_$(s)s(algorithm) Return a dictionary of upper bounds on the $(s) of predictions, keyed on concrete - subtypes of [`LearnAPI.KindOfProxy`](@ref). Each of these subtypes respresents a - different form of target prediction (`LiteralTarget`, `Distribution`, `SurvivalFunction`, - etc) possibly supported by `algorithm`, but the existence of a key does not guarantee - that form is supported. + subtypes of [`LearnAPI.KindOfProxy`](@ref). Each of these subtypes represents a + different form of target prediction (`LiteralTarget`, `Distribution`, + `SurvivalFunction`, etc) possibly supported by `algorithm`, but the existence of a key + does not guarantee that form is supported. As an example, if - ŷ, report = LearnAPI.predict(algorithm, LearnAPI.Distribution(), data...) + ŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...) 
successfully returns (i.e., `algorithm` supports predictions of target probability
    distributions) then the following is guaranteed to hold:

-    $(s)(ŷ) <: LearnAPI.predict_output_$(s)(algorithm)[LearnAPI.Distribution]
+    $(s)(ŷ) <: LearnAPI.predict_output_$(s)s(algorithm)[LearnAPI.Distribution]

    See also [`LearnAPI.KindOfProxy`](@ref), [`LearnAPI.predict`](@ref),
    [`LearnAPI.predict_input_$(s)`](@ref).

    # New implementations

    This single argument trait should not be overloaded. Instead, overload
-    [`LearnAPI.predict_output_$(s)`](@ref)(algorithm, kind_of_proxy). See above.
+    [`LearnAPI.predict_output_$(s)`](@ref)(algorithm, kind_of_proxy).

    """

"$(DOC_PREDICT_OUTPUT2(:scitype))"
-predict_output_scitype(algorithm) =
+predict_output_scitypes(algorithm) =
    Dict(T => predict_output_scitype(algorithm, T())
         for T in CONCRETE_TARGET_PROXY_TYPES)

"$(DOC_PREDICT_OUTPUT2(:type))"
-predict_output_type(algorithm) =
+predict_output_types(algorithm) =
    Dict(T => predict_output_type(algorithm, T())
         for T in CONCRETE_TARGET_PROXY_TYPES)
-
-
diff --git a/src/types.jl b/src/types.jl
new file mode 100644
index 00000000..e72c159e
--- /dev/null
+++ b/src/types.jl
@@ -0,0 +1,84 @@
+# # TARGET PROXIES
+
+const DOC_HOW_TO_LIST_PROXIES =
+    "Run `LearnAPI.CONCRETE_TARGET_PROXY_TYPES` "*
+    "to list all options. "
+
+
+"""
+
+    LearnAPI.KindOfProxy
+
+Abstract type whose concrete subtypes `T` each represent a different kind of proxy for
+some target variable, associated with some algorithm. Instances `T()` are used to request
+the form of target predictions in [`predict`](@ref) calls.
+
+See LearnAPI.jl documentation for an explanation of "targets" and "target proxies".
+
+For example, `Distribution` is a concrete subtype of `LearnAPI.KindOfProxy` and a call
+like `predict(model, Distribution(), Xnew)` returns a data object whose observations are
+probability density/mass functions, assuming the algorithm supports predictions of that
+form.
+
+$DOC_HOW_TO_LIST_PROXIES
+
+"""
+abstract type KindOfProxy end
+
+"""
+    LearnAPI.IID <: LearnAPI.KindOfProxy
+
+Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). If `kind_of_proxy` is an instance of
+`LearnAPI.IID` then, given `data` consisting of ``n`` observations, the
+following must hold:
+
+- `ŷ = LearnAPI.predict(model, kind_of_proxy, data...)` is
+  data also consisting of ``n`` observations.
+
+- The ``j``th observation of `ŷ`, for any ``j``, depends only on the ``j``th
+  observation of the provided `data` (no correlation between observations).
+
+See also [`LearnAPI.KindOfProxy`](@ref).
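+
+For example, `LiteralTarget()` and `Distribution()` (defined below) are instances of
+subtypes of `LearnAPI.IID`: in both cases `predict` delivers exactly one (literal or
+probabilistic) target prediction for each observation of the input data.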
+
+"""
+abstract type IID <: KindOfProxy end
+
+struct LiteralTarget <: IID end
+struct Sampleable <: IID end
+struct Distribution <: IID end
+struct LogDistribution <: IID end
+struct Probability <: IID end
+struct LogProbability <: IID end
+struct Parametric <: IID end
+struct LabelAmbiguous <: IID end
+struct LabelAmbiguousSampleable <: IID end
+struct LabelAmbiguousDistribution <: IID end
+struct ConfidenceInterval <: IID end
+struct Set <: IID end
+struct ProbabilisticSet <: IID end
+struct SurvivalFunction <: IID end
+struct SurvivalDistribution <: IID end
+struct OutlierScore <: IID end
+struct Continuous <: IID end
+
+# struct None <: KindOfProxy end
+struct JointSampleable <: KindOfProxy end
+struct JointDistribution <: KindOfProxy end
+struct JointLogDistribution <: KindOfProxy end
+
+const CONCRETE_TARGET_PROXY_TYPES = [
+    subtypes(IID)...,
+    setdiff(subtypes(KindOfProxy), subtypes(IID))...,
+]
+
+const CONCRETE_TARGET_PROXY_TYPES_SYMBOLS = map(CONCRETE_TARGET_PROXY_TYPES) do T
+    Symbol(last(split(string(T), '.')))
+end
+
+const CONCRETE_TARGET_PROXY_TYPES_LIST = join(
+    map(CONCRETE_TARGET_PROXY_TYPES_SYMBOLS) do s
+        "`$s`"
+    end,
+    ", ",
+    " and ",
+)
diff --git a/test/integration/regression.jl b/test/integration/regression.jl
new file mode 100644
index 00000000..2c5d9d70
--- /dev/null
+++ b/test/integration/regression.jl
@@ -0,0 +1,219 @@
+using LearnAPI
+using LinearAlgebra
+using Tables
+import MLUtils
+import DataFrames
+
+
+# # NAIVE RIDGE REGRESSION WITH NO INTERCEPTS
+
+# We overload `obs` to expose internal representation of input data. See later for a
+# simpler variation using the `obs` fallback.
+
+struct Ridge
+    lambda::Float64
+end
+Ridge(; lambda=0.1) = Ridge(lambda)
+
+struct RidgeFitObs{T}
+    A::Matrix{T}  # p x n
+    names::Vector{Symbol}
+    y::Vector{T}
+end
+
+struct RidgeFitted{T,F}
+    algorithm::Ridge
+    coefficients::Vector{T}
+    feature_importances::F
+end
+
+Base.getindex(data::RidgeFitObs, I) =
+    RidgeFitObs(data.A[:,I], data.names, data.y[I])
+Base.length(data::RidgeFitObs) = length(data.y)
+
+function LearnAPI.obs(::typeof(fit), ::Ridge, X, y)
+    table = Tables.columntable(X)
+    names = Tables.columnnames(table) |> collect
+    RidgeFitObs(Tables.matrix(table, transpose=true), names, y)
+end
+
+function LearnAPI.obsfit(algorithm::Ridge, fitdata::RidgeFitObs, verbosity)
+
+    # unpack hyperparameters and data:
+    lambda = algorithm.lambda
+    A = fitdata.A
+    names = fitdata.names
+    y = fitdata.y
+
+    # apply core algorithm:
+    coefficients = (A*A' + lambda*I)\(A*y) # p-vector
+
+    # determine crude feature importances:
+    feature_importances =
+        [names[j] => abs(coefficients[j]) for j in eachindex(names)]
+    sort!(feature_importances, by=last) |> reverse!
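+    # `feature_importances` is now a `Vector{Pair{Symbol,Float64}}`, ordered from most
+    # to least important feature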
+
+    # make some noise, if allowed:
+    verbosity > 0 &&
+        @info "Features in order of importance: $(first.(feature_importances))"
+
+    return RidgeFitted(algorithm, coefficients, feature_importances)
+
+end
+
+LearnAPI.algorithm(model::RidgeFitted) = model.algorithm
+
+LearnAPI.obspredict(model::RidgeFitted, ::LiteralTarget, Anew::Matrix) =
+    ((model.coefficients)'*Anew)'
+
+LearnAPI.obs(::typeof(predict), ::Ridge, X) = Tables.matrix(X, transpose=true)
+
+LearnAPI.feature_importances(model::RidgeFitted) = model.feature_importances
+
+LearnAPI.minimize(model::RidgeFitted) =
+    RidgeFitted(model.algorithm, model.coefficients, nothing)
+
+@trait(
+    Ridge,
+    position_of_target=2,
+    kinds_of_proxy = (LiteralTarget(),),
+    functions = (
+        fit,
+        obsfit,
+        minimize,
+        predict,
+        obspredict,
+        obs,
+        LearnAPI.algorithm,
+        LearnAPI.feature_importances,
+    )
+)
+
+n = 10 # number of observations
+train = 1:6
+test = 7:10
+a, b, c = rand(n), rand(n), rand(n)
+X = (; a, b, c)
+X = DataFrames.DataFrame(X)
+y = 2a - b + 3c + 0.05*rand(n)
+
+@testset "test an implementation of ridge regression" begin
+    algorithm = Ridge(lambda=0.5)
+    @test LearnAPI.obs in LearnAPI.functions(algorithm)
+
+    # verbose fitting:
+    @test_logs(
+        (:info, r"Feature"),
+        fit(
+            algorithm,
+            Tables.subset(X, train),
+            y[train];
+            verbosity=1,
+        ),
+    )
+
+    # quiet fitting:
+    model = @test_logs(
+        fit(
+            algorithm,
+            Tables.subset(X, train),
+            y[train];
+            verbosity=0,
+        ),
+    )
+
+    ŷ = predict(model, LiteralTarget(), Tables.subset(X, test))
+    @test ŷ isa Vector{Float64}
+    @test predict(model, Tables.subset(X, test)) == ŷ
+
+    fitdata = LearnAPI.obs(fit, algorithm, X, y)
+    predictdata = LearnAPI.obs(predict, algorithm, X)
+    model = obsfit(algorithm, MLUtils.getobs(fitdata, train); verbosity=1)
+    @test obspredict(model, LiteralTarget(), MLUtils.getobs(predictdata, test)) == ŷ
+
+    @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}}
+
+    filename = tempname()
+    using Serialization
+    small_model = minimize(model)
+    serialize(filename, small_model)
+
+    recovered_model = deserialize(filename)
+    @test LearnAPI.algorithm(recovered_model) == algorithm
+    @test obspredict(
+        recovered_model,
+        LiteralTarget(),
+        MLUtils.getobs(predictdata, test)
+    ) == ŷ
+end
+
+# # VARIATION OF RIDGE REGRESSION THAT USES FALLBACK OF LearnAPI.obs
+
+struct BabyRidge
+    lambda::Float64
+end
+BabyRidge(; lambda=0.1) = BabyRidge(lambda)
+
+struct BabyRidgeFitted{T,F}
+    algorithm::BabyRidge
+    coefficients::Vector{T}
+    feature_importances::F
+end
+
+function LearnAPI.obsfit(algorithm::BabyRidge, data, verbosity)
+
+    X, y = data
+
+    lambda = algorithm.lambda
+
+    table = Tables.columntable(X)
+    names = Tables.columnnames(table) |> collect
+    A = Tables.matrix(table, transpose=true)
+
+    # apply core algorithm:
+    coefficients = (A*A' + lambda*I)\(A*y) # p-vector
+
+    feature_importances = nothing
+
+    return BabyRidgeFitted(algorithm, coefficients, feature_importances)
+
+end
+
+LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm
+
+function LearnAPI.obspredict(model::BabyRidgeFitted, ::LiteralTarget, data)
+    X = only(data)
+    Anew = Tables.matrix(X, transpose=true)
+    return ((model.coefficients)'*Anew)'
+end
+
+@trait(
+    BabyRidge,
+    position_of_target=2,
+    kinds_of_proxy = (LiteralTarget(),),
+    functions = (
+        fit,
+        obsfit,
+        minimize,
+        predict,
+        obspredict,
+        obs,
+        LearnAPI.algorithm,
+        LearnAPI.feature_importances,
+    )
)
+
+@testset "test a variation which does not overload LearnAPI.obs" begin
+    algorithm = BabyRidge(lambda=0.5)
+
+    model = fit(algorithm, Tables.subset(X, train), y[train]; verbosity=0)
+    ŷ = predict(model, LiteralTarget(), Tables.subset(X, test))
+    @test ŷ isa Vector{Float64}
+
+    fitdata = obs(fit, algorithm, X, y)
+    predictdata = LearnAPI.obs(predict, algorithm, X)
+    model = obsfit(algorithm, MLUtils.getobs(fitdata, train); verbosity=0)
+    @test obspredict(model, LiteralTarget(), MLUtils.getobs(predictdata, test)) == ŷ
+end
+
+true
diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl
new file mode 100644
index 00000000..e5295ddc
--- /dev/null
+++ b/test/integration/static_algorithms.jl
@@ -0,0 +1,112 @@
+using LearnAPI
+using LinearAlgebra
+using Tables
+import MLUtils
+import DataFrames
+
+
+# # TRANSFORMER TO SELECT SOME FEATURES (COLUMNS) OF A TABLE
+
+# See later for a variation that stores the names of rejected features in the model
+# object, for inspection by an accessor function.
+
+struct Selector
+    names::Vector{Symbol}
+end
+Selector(; names=Symbol[]) = Selector(names)
+
+LearnAPI.obsfit(algorithm::Selector, obsdata, verbosity) = algorithm
+LearnAPI.algorithm(model::Selector) = model # i.e., the algorithm itself
+
+function LearnAPI.obstransform(algorithm::Selector, obsdata)
+    X = only(obsdata)
+    table = Tables.columntable(X)
+    names = Tables.columnnames(table)
+    filtered_names = filter(in(algorithm.names), names)
+    filtered_columns = (Tables.getcolumn(table, name) for name in filtered_names)
+    filtered_table = NamedTuple{filtered_names}((filtered_columns...,))
+    return Tables.materializer(X)(filtered_table)
+end
+
+@trait Selector functions = (
+    fit,
+    obsfit,
+    minimize,
+    transform,
+    obstransform,
+    obs,
+    LearnAPI.algorithm,
+)
+
+@testset "test a static transformer" begin
+    algorithm = Selector(names=[:x, :w])
+    X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w])
+    model = fit(algorithm) # no data arguments!
+    @test model == algorithm
+    @test transform(model, X) ==
+        DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w])
+end
+
+
+# # FEATURE SELECTOR THAT REPORTS BYPRODUCTS OF SELECTION PROCESS
+
+# This is a variation of `Selector` above that stores the names of rejected features in
+# the model object, for inspection by an accessor function called `rejected`.
+
+struct Selector2
+    names::Vector{Symbol}
+end
+Selector2(; names=Symbol[]) = Selector2(names)
+
+mutable struct Selector2Fit
+    algorithm::Selector2
+    rejected::Vector{Symbol}
+    Selector2Fit(algorithm) = new(algorithm)
+end
+LearnAPI.algorithm(model::Selector2Fit) = model.algorithm
+rejected(model::Selector2Fit) = model.rejected
+
+# Here `obsdata=()` and we are just wrapping `algorithm` with a place-holder for
+# the `rejected` feature names.
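+# Note that `model.rejected` is left undefined by `obsfit`; it is only assigned when
+# `obstransform` is first called, as implemented below.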
+LearnAPI.obsfit(algorithm::Selector2, obsdata, verbosity) = Selector2Fit(algorithm)
+
+# output the filtered table and add `rejected` field to model (mutated)
+function LearnAPI.obstransform(model::Selector2Fit, obsdata)
+    X = only(obsdata)
+    table = Tables.columntable(X)
+    names = Tables.columnnames(table)
+    keep = LearnAPI.algorithm(model).names
+    filtered_names = filter(in(keep), names)
+    model.rejected = setdiff(names, filtered_names)
+    filtered_columns = (Tables.getcolumn(table, name) for name in filtered_names)
+    filtered_table = NamedTuple{filtered_names}((filtered_columns...,))
+    return Tables.materializer(X)(filtered_table)
+end
+
+@trait(
+    Selector2,
+    predict_or_transform_mutates = true,
+    functions = (
+        fit,
+        obsfit,
+        minimize,
+        transform,
+        obstransform,
+        obs,
+        LearnAPI.algorithm,
+        :(MyPkg.rejected), # accessor function not owned by LearnAPI.jl
+    )
+)
+
+@testset "test a variation that reports byproducts" begin
+    algorithm = Selector2(names=[:x, :w])
+    X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w])
+    model = fit(algorithm) # no data arguments!
+    @test !isdefined(model, :rejected)
+    @test LearnAPI.algorithm(model) == algorithm
+    filtered = DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w])
+    @test transform(model, X) == filtered
+    @test rejected(model) == [:y, :z]
+end
+
+true
diff --git a/test/runtests.jl b/test/runtests.jl
index a5c2a854..8697a248 100644
--- a/test/runtests.jl
+++ b/test/runtests.jl
@@ -1,7 +1,99 @@
-using LearnAPI
 using Test
-using SparseArrays
 
 @testset "tools.jl" begin
     include("tools.jl")
 end
+
+# # INTEGRATION TESTS
+
+@testset "regression" begin
+    include("integration/regression.jl")
+end
+
+# @testset "classification" begin
+#     include("integration/classification.jl")
+# end
+
+# @testset "clustering" begin
+#     include("integration/clustering.jl")
+# end
+
+# @testset "gradient_descent" begin
+#     include("integration/gradient_descent.jl")
+# end
+
+# @testset "iterative_algorithms" begin
+#     include("integration/iterative_algorithms.jl")
+# end
+
+# @testset "incremental_algorithms" begin
+#     include("integration/incremental_algorithms.jl")
+# end
+
+# @testset "dimension_reduction" begin
+#     include("integration/dimension_reduction.jl")
+# end
+
+# @testset "encoders" begin
+#     include("integration/encoders.jl")
+# end
+
+@testset "static_algorithms" begin
+    include("integration/static_algorithms.jl")
+end
+
+# @testset "missing_value_imputation" begin
+#     include("integration/missing_value_imputation.jl")
+# end
+
+# @testset "ensemble_algorithms" begin
+#     include("integration/ensemble_algorithms.jl")
+# end
+
+# @testset "wrappers" begin
+#     include("integration/wrappers.jl")
+# end
+
+# @testset "time_series_forecasting" begin
+#     include("integration/time_series_forecasting.jl")
+# end
+
+# @testset "time_series_classification" begin
+#     include("integration/time_series_classification.jl")
+# end
+
+# @testset "survival_analysis" begin
+#     include("integration/survival_analysis.jl")
+# end
+
+# @testset "distribution_fitters" begin
+#     include("integration/distribution_fitters.jl")
+# end
+
+# @testset "Bayesian_algorithms" begin
+#     include("integration/Bayesian_algorithms.jl")
+# end
+
+# @testset "outlier_detection" begin
+#     include("integration/outlier_detection.jl")
+# end
+
+# @testset "collaborative_filtering" begin
+#     include("integration/collaborative_filtering.jl")
+# end
+
+# @testset "text_analysis" begin
+#     include("integration/text_analysis.jl")
+# end
+
+# @testset "audio_analysis" begin
+# 
include("integration/audio_analysis.jl") +# end + +# @testset "natural_language_processing" begin +# include("integration/natural_language_processing.jl") +# end + +# @testset "image_processing" begin +# include("integration/image_processing.jl") +# end diff --git a/test/tools.jl b/test/tools.jl index 15d5c29c..1b2e942f 100644 --- a/test/tools.jl +++ b/test/tools.jl @@ -1,3 +1,7 @@ +using LearnAPI +using Test +using SparseArrays + module Fruit using LearnAPI From 2507568cbb9e5497caf251e3337a67bb46a41376 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 13:51:40 +1300 Subject: [PATCH 005/187] rm redundant line from doc example --- docs/src/index.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/src/index.md b/docs/src/index.md index e1dc44df..f5c793f7 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -38,7 +38,6 @@ etc.). Then, a LearnAPI.jl interface can be implemented, for objects with the ty ```julia X = y = -w = Xnew = # Train: From da2b607af9285354eac8cae5be5089a6a833ee1a Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 14:04:39 +1300 Subject: [PATCH 006/187] spelling --- docs/make.jl | 2 +- docs/src/anatomy_of_an_implementation.md | 2 +- docs/src/obs.md | 2 +- docs/src/reference.md | 4 ++-- docs/src/traits.md | 2 +- src/fit.jl | 8 ++++---- 6 files changed, 10 insertions(+), 10 deletions(-) diff --git a/docs/make.jl b/docs/make.jl index 4d4d08a6..88159ea7 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -14,7 +14,7 @@ makedocs( "Kinds of Target Proxy" => "kinds_of_target_proxy.md", "fit" => "fit.md", "predict, transform, and relatives" => "predict_transform.md", - "mimimize" => "minimize.md", + "minimize" => "minimize.md", "obs" => "obs.md", "Accessor Functions" => "accessor_functions.md", "Algorithm Traits" => "traits.md", diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 87995d38..2bc24a39 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -332,7 +332,7 @@ predict(recovered_model, LiteralTarget(), X) == predict(model, LiteralTarget(), --- ¹ The definition of this and other structs above is not an explicit requirement of -LearnAPI.jl, whose constracts are purely functional. +LearnAPI.jl, whose constructs are purely functional. ² An implementation can provide further accessor functions, if necessary, but like the native ones, they must be included in the [`LearnAPI.functions`](@ref) diff --git a/docs/src/obs.md b/docs/src/obs.md index 599ce590..1b71a920 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -84,7 +84,7 @@ there is no concept of an algorithm-specific representation of *outputs*, only i | [`obs`](@ref) | depends | slurps `data` argument | | | | | -If the `data` consumed by `fit`, `predict` or `tranform` consists only of tables and +If the `data` consumed by `fit`, `predict` or `transform` consists only of tables and arrays (with last dimension the observation dimension) then overloading `obs` is optional. However, if an implementation overloads `obs` to return a (thinly wrapped) representation of user data that is closer to what the core algorithm actually uses, and diff --git a/docs/src/reference.md b/docs/src/reference.md index e2c4db4a..3433c66b 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -24,7 +24,7 @@ A `DataFrame` instance, from [DataFrames.jl](https://dataframes.juliadata.org/st an example of data, the observations being the rows. 
LearnAPI.jl makes no assumptions about how observations can be accessed, except in the case of the output of [`obs`](@ref data_interface), which must implement the MLUtils.jl `getobs`/`numobs` interface. For -example, it is generally ambiguous whether the rows or columms of a matrix are considered +example, it is generally ambiguous whether the rows or columns of a matrix are considered observations, but if a matrix is returned by [`obs`](@ref data_interface) the observations must be the columns. @@ -125,7 +125,7 @@ implementations must implement [`obsfit`](@ref), the accessor function - [`fit`](@ref fit)/[`obsfit`](@ref): for training algorithms that generalize to new data -- [`predict`](@ref operations)/[`obspredict`](@ref): for outputing [targets](@ref proxy) +- [`predict`](@ref operations)/[`obspredict`](@ref): for outputting [targets](@ref proxy) or [target proxies](@ref proxy) (such as probability density functions) - [`transform`](@ref operations)/[`obstransform`](@ref): similar to `predict`, but for diff --git a/docs/src/traits.md b/docs/src/traits.md index 2531bf66..e46f0f71 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -70,7 +70,7 @@ The following convenience methods are provided but not overloadable by new imple | `LearnAPI.name(algorithm)` | algorithm type name as string | "PCA" | | `LearnAPI.is_algorithm(algorithm)` | `true` if `LearnAPI.functions(algorithm)` is not empty | `true` | | [`LearnAPI.predict_output_scitypes(algorithm)`](@ref) | dictionary of upper bounds on the scitype of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | -| [`LearnAPI.predict_output_types(alogorithm)`](@ref) | dictionary of upper bounds on the type of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | +| [`LearnAPI.predict_output_types(algorithm)`](@ref) | dictionary of upper bounds on the type of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | ## Implementation guide diff --git a/src/fit.jl b/src/fit.jl index e83cbcfb..797a497a 100644 --- a/src/fit.jl +++ b/src/fit.jl @@ -29,7 +29,7 @@ See also [`obsfit`](@ref), [`predict`](@ref), [`transform`](@ref), # New implementations -LearnAPI.jl provides the following defintion of `fit`, which is never directly overloaded: +LearnAPI.jl provides the following definition of `fit`, which is never directly overloaded: ```julia fit(algorithm, data...; verbosity=1) = @@ -58,9 +58,9 @@ obsdata = obs(fit, algorithm, data...) model = obsfit(algorithm, obsdata) ``` -Here `obsdata` is algorithm-specific, "observaton-accessible" data, meaning it implements +Here `obsdata` is algorithm-specific, "observation-accessible" data, meaning it implements the MLUtils.jl `getobs`/`numobs` interface for observation resampling (even if `data` does -not). Morevoer, resampled versions of `obsdata` may be passed to `obsfit` in its place. +not). Moreover, resampled versions of `obsdata` may be passed to `obsfit` in its place. The use of `obsfit` may offer performance advantages. See more at [`obs`](@ref). @@ -82,7 +82,7 @@ overloaded, then a fallback gives `obsdata = data` (always a tuple!). Note that New implementations must also implement [`LearnAPI.algorithm`](@ref). -If overloaded, then the functions `LearnAPI.obsfit` and `LeranAPI.fit` must be included in +If overloaded, then the functions `LearnAPI.obsfit` and `LearnAPI.fit` must be included in the tuple returned by the [`LearnAPI.functions(algorithm)`](@ref) trait. 
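For example, assuming `algorithm`, `X` and `y` are as above, and that MLUtils.jl is
available to the caller, training on a subsample without repeating data pre-processing
might look like this (a sketch only; the index range `train` is illustrative):

```julia
import MLUtils

obsdata = obs(fit, algorithm, X, y)    # observation-accessible representation
train = 1:50                           # indices of training observations
model = obsfit(algorithm, MLUtils.getobs(obsdata, train); verbosity=0)
```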
## Non-generalizing algorithms From 417bd0324f40ae90cbfe81f52f405d549c009c9e Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 14:07:15 +1300 Subject: [PATCH 007/187] spelling again --- src/fit.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/fit.jl b/src/fit.jl index 797a497a..010e53e0 100644 --- a/src/fit.jl +++ b/src/fit.jl @@ -152,7 +152,7 @@ See also [`LearnAPI.fit`](@ref), [`LearnAPI.ingest!`](@ref). # # INGEST """ - LernAPI.ingest!(algorithm, verbosity, fitted_params, state, data...) + LearnAPI.ingest!(algorithm, verbosity, fitted_params, state, data...) For an algorithm that supports incremental learning, update the fitted parameters using `data`, which has typically not been seen before. The arguments `state` and From c532f96e74370e3a8c09622941b52aeb4faa9fdf Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 14:08:15 +1300 Subject: [PATCH 008/187] make codecov less fussy --- .github/codecov.yml | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 .github/codecov.yml diff --git a/.github/codecov.yml b/.github/codecov.yml new file mode 100644 index 00000000..ed9d9f1c --- /dev/null +++ b/.github/codecov.yml @@ -0,0 +1,8 @@ +coverage: + status: + project: + default: + threshold: 0.5% + patch: + default: + target: 80% \ No newline at end of file From 71c57f44be64dc163bc1ba950a7ff6347ac34487 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 14:14:36 +1300 Subject: [PATCH 009/187] bump Documenter version to 1.0 --- docs/Project.toml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/Project.toml b/docs/Project.toml index e08cc4f7..e3b8ffad 100644 --- a/docs/Project.toml +++ b/docs/Project.toml @@ -4,5 +4,5 @@ ScientificTypesBase = "30f210dd-8aff-4c5f-94ba-8e64358c1161" Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" [compat] -Documenter = "^0.27" -julia = "1" +Documenter = "1" +julia = "1.6" From 3efe48b3b9e49723ead99283bcc7ad596ae4ccd9 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 14:24:27 +1300 Subject: [PATCH 010/187] add MLUtils as docs [deps] --- docs/Project.toml | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/Project.toml b/docs/Project.toml index e3b8ffad..caa42f70 100644 --- a/docs/Project.toml +++ b/docs/Project.toml @@ -1,5 +1,6 @@ [deps] Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" +MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" ScientificTypesBase = "30f210dd-8aff-4c5f-94ba-8e64358c1161" Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" From 562f67a7d9d1bdafb62cb7b4cd005dbf6b9118d8 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 21:48:46 +1300 Subject: [PATCH 011/187] revert repo kwarg in deploydocs to old string version --- docs/make.jl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/make.jl b/docs/make.jl index 88159ea7..fd54ce70 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -23,11 +23,11 @@ makedocs( ], sitename="LearnAPI.jl", warnonly = [:cross_references, :missing_docs], - repo =REPO, + repo = Remotes.GitHub("JuliaAI", "LearnAPI.jl"), ) deploydocs( devbranch="dev", push_preview=false, - repo=REPO, + repo="github.com/JuliaAI/LearnAPI.jl.git", ) From 354210e1e15d7a2ec822771153df8e9c443b2652 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Tue, 28 Nov 2023 22:07:16 +1300 Subject: [PATCH 012/187] tweak --- docs/src/obs.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/src/obs.md b/docs/src/obs.md index 1b71a920..cbcf5e9a 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -47,7 +47,6 @@ import MLUtils X = y = -w = algorithm = test_train_folds = map([1:10, 11:20, 21:30]) do test From c08b3bb2ddb3bddf5bc8bc9d0ea6f2a8089e1afc Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 22:08:03 +1300 Subject: [PATCH 013/187] doc tweak --- docs/src/obs.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/src/obs.md b/docs/src/obs.md index cbcf5e9a..1b3d5bc2 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -93,6 +93,8 @@ the user concerning herself with the details of the representation. A sample implementation is given in the [`obs`](@ref) document-string below. +## Reference + ```@docs obs ``` From 0f39c914a245dd6eef797602bda5dd8f29975d17 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 22:14:33 +1300 Subject: [PATCH 014/187] spelling --- src/predict_transform.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 4677894b..0444f2c7 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -101,7 +101,7 @@ predict(model, data...) = predict(model, LiteralTarget(), data...) obspredict(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata) Similar to `predict` but consumes algorithm-specific representations of input data, -`obsdata`, as returned by `obs(predict, algorithm, data...)`. Hre `data...` is the form of +`obsdata`, as returned by `obs(predict, algorithm, data...)`. Here `data...` is the form of data expected in the main [`predict`](@ref) method. Alternatively, such `obsdata` may be replaced by a resampled version, where resampling is performed using `MLUtils.getobs` (always supported). From e59e7915a10ebc0ff0e83c07af52f1d62039b7b0 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 22:25:36 +1300 Subject: [PATCH 015/187] fix issue with predict_output_type --- src/traits.jl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 56f46e68..6553c314 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -699,7 +699,7 @@ preferred_kind_of_proxy(algorithm) = first(kinds_of_proxy(algorithm)) const DOC_PREDICT_OUTPUT2(s) = """ - LearnAPI.predict_output_$(s)s(algorithm) + LearnAPI.predict_output_$(s)(algorithm) Return a dictionary of upper bounds on the $(s) of predictions, keyed on concrete subtypes of [`LearnAPI.KindOfProxy`](@ref). Each of these subtypes represents a @@ -727,11 +727,11 @@ const DOC_PREDICT_OUTPUT2(s) = """ "$(DOC_PREDICT_OUTPUT2(:scitype))" -predict_output_scitypes(algorithm) = +predict_output_scitype(algorithm) = Dict(T => predict_output_scitype(algorithm, T()) for T in CONCRETE_TARGET_PROXY_TYPES) "$(DOC_PREDICT_OUTPUT2(:type))" -predict_output_types(algorithm) = +predict_output_type(algorithm) = Dict(T => predict_output_type(algorithm, T()) for T in CONCRETE_TARGET_PROXY_TYPES) From 1fb6b4b43557fbd7e9fe6940773ed255f4aae758 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Tue, 28 Nov 2023 22:36:10 +1300 Subject: [PATCH 016/187] drop old names --- docs/src/traits.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/src/traits.md b/docs/src/traits.md index e46f0f71..53d75f58 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -147,6 +147,4 @@ LearnAPI.transform_input_observation_type LearnAPI.predict_or_transform_mutates LearnAPI.transform_output_scitype LearnAPI.transform_output_type -LearnAPI.predict_output_scitypes -LearnAPI.predict_output_types ``` From 7027ed81275330716b265af552411c8c578dc8c9 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 22:39:20 +1300 Subject: [PATCH 017/187] sneak in doc fix --- src/traits.jl | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 6553c314..422dbcfa 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -68,18 +68,18 @@ non-empty. All new implementations must overload this trait. Here's a checklist for elements in the return value: -| function | needs explicit implementation? | include in returned tuple? | -|----------------------|-------------------------|--------------------------------| -| `fit` | no | yes | -| `obsfit` | yes | yes | -| `minimize` | optional | yes | -| `predict` | no | if `obspredict` is implemented | -| `obspredict` | optional | if implemented | -| `transform` | no | if `transform` is implemented | -| `obstransform` | optional | if implemented | -| `obs` | optional | yes | -| `inverse_transform` | optional | if implemented | -| `LearnAPI.algorithm` | yes | yes | +| function | needs explicit implementation? | include in returned tuple? | +|----------------------|---------------------------------|----------------------------------| +| `fit` | no | yes | +| `obsfit` | yes | yes | +| `minimize` | optional | yes | +| `predict` | no | if `obspredict` is implemented | +| `obspredict` | optional | if implemented | +| `transform` | no | if `obstransform` is implemented | +| `obstransform` | optional | if implemented | +| `obs` | optional | yes | +| `inverse_transform` | optional | if implemented | +| `LearnAPI.algorithm` | yes | yes | Also include any implemented accessor functions. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST. From 48ab57fa91f61846e5a33d6222f013170b69adb6 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 Nov 2023 22:51:24 +1300 Subject: [PATCH 018/187] rm duplicate method definition --- src/traits.jl | 6 ------ 1 file changed, 6 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 422dbcfa..6d9af601 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -620,12 +620,6 @@ predict_input_type(::Any) = Union{} "$(DOC_INPUT_OBSERVATION_TYPE(:predict))" predict_input_observation_type(::Any) = Union{} -"$(DOC_OUTPUT_SCITYPE(:predict))" -predict_output_scitype(::Any) = Any - -"$(DOC_OUTPUT_TYPE(:predict))" -predict_output_type(::Any) = Any - "$(DOC_INPUT_SCITYPE(:transform))" transform_input_scitype(::Any) = Union{} From d68a1e709418608afd590b11313c5ae386931d98 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 29 Nov 2023 07:25:41 +1300 Subject: [PATCH 019/187] doc tweak --- docs/src/reference.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/src/reference.md b/docs/src/reference.md index 3433c66b..5a46c6ab 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -75,7 +75,8 @@ will encapsulate a particular set of user-specified [hyperparameters](@ref). 
Additionally, for `alg::Alg` to be a LearnAPI algorithm, we require:
 
-- `Base.propertynames(alg)` returns the hyperparameters of `alg`.
+- `Base.propertynames(alg)` returns the hyperparameter names; values can be accessed using
+  `Base.getproperty`
 
 - If `alg` is an algorithm, then so are all instances of the same type.
 
From fabe6e019787e570fefd56e365f94a988886169d Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Mon, 4 Dec 2023 17:31:05 +1300
Subject: [PATCH 020/187] doc tweak

---
 docs/src/obs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/src/obs.md b/docs/src/obs.md
index 1b3d5bc2..bfb35a69 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -28,7 +28,7 @@ combines `X` and `y` in a single object guaranteed to implement the MLUtils.jl
 after resampling using `MLUtils.getobs`:
 
 ```julia
-# equivalent to `fit(algorithm, X, y)`:
+# equivalent to `model = fit(algorithm, X, y)`:
 model = obsfit(algorithm, obsdata)
 
 # with resampling:
From 43b11e3108db7253b42371a502769ec5b4a15892 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Mon, 4 Dec 2023 17:35:08 +1300
Subject: [PATCH 021/187] docstring tweak

---
 src/traits.jl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/traits.jl b/src/traits.jl
index 6d9af601..2266c6d6 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -56,7 +56,7 @@ having the same type as `algorithm`, or to associated models (objects returned b
 `fit(algorithm, ...)`. Algorithm traits are excluded.
 
 In addition to functions, the returned tuple may include expressions, like
-`:(DecisionTree.print_tree)`, which reference functions not owned by LearnAPI.jl packages.
+`:(DecisionTree.print_tree)`, which reference functions not owned by LearnAPI.jl.
 
 The understanding is that `algorithm` is a LearnAPI-compliant object whenever this is
 non-empty.
From 06c086280a1c841079adb820ff035466c6c3c636 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Mon, 4 Dec 2023 17:38:26 +1300
Subject: [PATCH 022/187] docstring fix

---
 src/traits.jl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/traits.jl b/src/traits.jl
index 2266c6d6..5c36521b 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -189,7 +189,7 @@ Lists one or more suggestive algorithm descriptors from this list: $DOC_DESCRIPT
 
 # New implementations
 
-This trait should return a tuple of symbols, as in `(:classifier, :probabilistic)`.
+This trait should return a tuple of symbols, as in `(:classifier, :text_analysis)`.
 
 """
 descriptors(::Any) = ()
From 9745ea955b8f6d633d23b121b3212e7ca888e826 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Mon, 4 Dec 2023 17:39:19 +1300
Subject: [PATCH 023/187] docstring fix

---
 src/traits.jl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/traits.jl b/src/traits.jl
index 5c36521b..3d0c1d0f 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -278,7 +278,7 @@ See also `[LearnAPI.components]`(@ref).
 This trait should be overloaded if one or more properties (fields) of `algorithm` may take
 algorithm values. Fallback return value is `false`. The keyword constructor for such an
 algorithm need not prescribe defaults for algorithm-valued properties. Implementation of
-the accessor function `[LearnAPI.components]`(@ref) is recommended.
+the accessor function [`LearnAPI.components`](@ref) is recommended.
 
 $DOC_ON_TYPE
 
From c35ebf7bc3e47b184d2a8548a4411241c0853372 Mon Sep 17 00:00:00 2001
From: "Anthony D.
Blaom" Date: Mon, 4 Dec 2023 17:43:12 +1300 Subject: [PATCH 024/187] doc tweak --- docs/src/accessor_functions.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md index 8e5e81b1..89c95e2b 100644 --- a/docs/src/accessor_functions.md +++ b/docs/src/accessor_functions.md @@ -18,7 +18,9 @@ The sole argument of an accessor function is the output, `model`, of [`fit`](@re ## Implementation guide All new implementations must implement [`LearnAPI.algorithm`](@ref). All others are -optional. +optional. All implemented accessor functions must be added to the list returned by +[`LearnAPI.functions`](@ref). + ## Reference From 0d423c1a7e248c4595143d87bb49d425cd135f3c Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 5 Dec 2023 13:37:45 +1300 Subject: [PATCH 025/187] doc fix --- docs/src/traits.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/traits.md b/docs/src/traits.md index 53d75f58..d6e47be0 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -5,7 +5,7 @@ per-observation weights, which must appear as the third argument of `fit`*, or * algorithm's `transform` method predicts `Real` vectors*. They also record more mundane information, such as a package license. -Algorithm traits are functions whose (and usually only) argument is an algorithm. +Algorithm traits are functions whose first (and usually only) argument is an algorithm. ### Special two-argument traits From b875de245aedbc1df78b1fcce03d36c986621cb2 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 7 Dec 2023 09:58:51 +1300 Subject: [PATCH 026/187] doc enhancement --- docs/src/predict_transform.md | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md index a596ad1a..88dd8c62 100644 --- a/docs/src/predict_transform.md +++ b/docs/src/predict_transform.md @@ -1,11 +1,20 @@ # [`predict`, `transform`, and relatives](@id operations) +Standard methods: + ```julia -predict(model, kind_of_proxy, data...) -> prediction +predict(model, kind_of_proxy, data...) -> prediction transform(model, data...) -> transformed_data inverse_transform(model, data...) -> inverted_data ``` +Methods consuming output `obsdata` of data-preprocessor [`obs`](@ref): + +```julia +obspredict(model, kind_of_proxy, obsdata) -> prediction +obstransform(model, obsdata) -> transformed_data +``` + ## Typical worklows ```julia @@ -40,7 +49,7 @@ ŷ = obspredict(model, LiteralTarget(), predictdata) ## Implementation guide -The methods `predict` and `transform` are not directly overloaded. +The methods `predict` and `transform` are not directly overloaded. | method | compulsory? | fallback | requires | |:----------------------------|:-----------:|:--------:|:-------------------------------------:| From d59c5348f6e139fa5590022bdb99ef785cfe75a9 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 7 Dec 2023 10:00:07 +1300 Subject: [PATCH 027/187] doc tweak --- docs/src/predict_transform.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md index 88dd8c62..8d5b09e1 100644 --- a/docs/src/predict_transform.md +++ b/docs/src/predict_transform.md @@ -49,7 +49,8 @@ ŷ = obspredict(model, LiteralTarget(), predictdata) ## Implementation guide -The methods `predict` and `transform` are not directly overloaded. +The methods `predict` and `transform` are not directly overloaded. 
Implement `obspredict` +and `obstransform` instead: | method | compulsory? | fallback | requires | |:----------------------------|:-----------:|:--------:|:-------------------------------------:| From 86ba3d5a25310482cdc85d392a28dd245dab87ee Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 7 Dec 2023 10:19:52 +1300 Subject: [PATCH 028/187] doc tweaks --- docs/src/predict_transform.md | 2 +- docs/src/traits.md | 4 ++-- src/predict_transform.jl | 13 ++++++++----- src/traits.jl | 2 +- 4 files changed, 12 insertions(+), 9 deletions(-) diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md index 8d5b09e1..382216b8 100644 --- a/docs/src/predict_transform.md +++ b/docs/src/predict_transform.md @@ -8,7 +8,7 @@ transform(model, data...) -> transformed_data inverse_transform(model, data...) -> inverted_data ``` -Methods consuming output `obsdata` of data-preprocessor [`obs`](@ref): +Methods consuming output, `obsdata`, of data-preprocessor [`obs`](@ref): ```julia obspredict(model, kind_of_proxy, obsdata) -> prediction diff --git a/docs/src/traits.md b/docs/src/traits.md index d6e47be0..3a263595 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -69,8 +69,8 @@ The following convenience methods are provided but not overloadable by new imple |:-----------------------------------------------------|:--------------------------------------------------------------------------------------------------------------|:--------| | `LearnAPI.name(algorithm)` | algorithm type name as string | "PCA" | | `LearnAPI.is_algorithm(algorithm)` | `true` if `LearnAPI.functions(algorithm)` is not empty | `true` | -| [`LearnAPI.predict_output_scitypes(algorithm)`](@ref) | dictionary of upper bounds on the scitype of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | -| [`LearnAPI.predict_output_types(algorithm)`](@ref) | dictionary of upper bounds on the type of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | +| [`LearnAPI.predict_output_scitype(algorithm)`](@ref) | dictionary of upper bounds on the scitype of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | +| [`LearnAPI.predict_output_type(algorithm)`](@ref) | dictionary of upper bounds on the type of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | ## Implementation guide diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 0444f2c7..71e4e730 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -1,4 +1,4 @@ -function DOC_IMPLEMENTED_METHODS(name; overloaded=false) + function DOC_IMPLEMENTED_METHODS(name; overloaded=false) word = overloaded ? "overloaded" : "implemented" "If $word, you must include `$name` in the tuple returned by the "* "[`LearnAPI.functions`](@ref) trait. " @@ -49,8 +49,9 @@ DOC_MINIMIZE(func) = The first signature returns target or target proxy predictions for input features `data`, according to some `model` returned by [`fit`](@ref) or [`obsfit`](@ref). Where supported, these are literally target predictions if `kind_of_proxy = LiteralTarget()`, and -probability density/mass functions if `kind_of_proxy = -Distribution()`. $DOC_HOW_TO_LIST_PROXIES +probability density/mass functions if `kind_of_proxy = Distribution()`. List all options +with [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), where `algorithm = +LearnAPI.algorithm(model)`. The shortcut `predict(model, data...) = predict(model, LiteralTarget(), data...)` is also provided. 
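For example, the following two calls are then equivalent (a sketch, assuming `model` and
`Xnew` are as in a typical workflow):

```julia
ŷ = predict(model, LiteralTarget(), Xnew)
ŷ == predict(model, Xnew)  # true, by the shortcut above
```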
@@ -135,13 +136,15 @@ where `data...` is what the standard [`predict`](@ref) call expects, as in the c
 `predict(model, kind_of_proxy, data...)`. Note `data` is always a tuple, even if `predict`
 has only one data argument. See more at [`obs`](@ref).
 
+
 $(DOC_MUTATION(:obspredict))
 
 If overloaded, you must include both `LearnAPI.obspredict` and `LearnAPI.predict` in the
 list of methods returned by the [`LearnAPI.functions`](@ref) trait.
 
-Each supported `kind_of_proxy` should be listed in the return value of the
-[`LearnAPI.kinds_of_proxy(algorithm)`](@ref) trait.
+An implementation is provided for each kind of target proxy you wish to support. See the
+LearnAPI.jl documentation for options. Each supported `kind_of_proxy` instance should be
+listed in the return value of the [`LearnAPI.kinds_of_proxy(algorithm)`](@ref) trait.
 
 $(DOC_MINIMIZE(:obspredict))
 
diff --git a/src/traits.jl b/src/traits.jl
index 3d0c1d0f..73c3b03a 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -91,7 +91,7 @@ functions(::Any) = ()
 """
     LearnAPI.kinds_of_proxy(algorithm)
 
-Returns an tuple of instances, `kind`, for which for which `predict(algorithm, kind,
+Returns a tuple of all instances, `kind`, for which `predict(algorithm, kind,
 data...)` has a guaranteed implementation. Each such `kind` subtypes
 [`LearnAPI.KindOfProxy`](@ref). Examples are `LiteralTarget()` (for predicting actual
 target values) and `Distributions()` (for predicting probability mass/density functions).
From b53d2fd30d87f9ca60309475b58fd25ea20aa1ad Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Fri, 10 May 2024 15:09:22 +1200
Subject: [PATCH 029/187] doc tweak

---
 docs/src/accessor_functions.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md
index 89c95e2b..07c30f1f 100644
--- a/docs/src/accessor_functions.md
+++ b/docs/src/accessor_functions.md
@@ -17,8 +17,8 @@ The sole argument of an accessor function is the output, `model`, of [`fit`](@re
 
 ## Implementation guide
 
-All new implementations must implement [`LearnAPI.algorithm`](@ref). All others are
-optional. All implemented accessor functions must be added to the list returned by
+All new implementations must implement [`LearnAPI.algorithm`](@ref). While all others are
+optional, any implemented accessor functions must be added to the list returned by
 [`LearnAPI.functions`](@ref).
 
 
From b60fc22fa24c17b816de9ae3f744baf69eb39107 Mon Sep 17 00:00:00 2001
From: "Anthony D.
Blaom" Date: Sun, 19 May 2024 16:20:08 +1200 Subject: [PATCH 030/187] simplify, removing in particular, obsfit, obspredict, obstransform --- Project.toml | 2 +- docs/make.jl | 14 +- docs/src/accessor_functions.md | 5 +- docs/src/anatomy_of_an_implementation.md | 516 +++++++++++++++-------- docs/src/fit.md | 16 +- docs/src/index.md | 25 +- docs/src/kinds_of_target_proxy.md | 45 +- docs/src/minimize.md | 6 +- docs/src/obs.md | 96 ++--- docs/src/predict_transform.md | 70 +-- docs/src/reference.md | 109 ++--- docs/src/traits.md | 86 ++-- src/LearnAPI.jl | 3 +- src/accessor_functions.jl | 19 + src/fit.jl | 170 +------- src/minimize.jl | 2 +- src/obs.jl | 117 ++--- src/predict_transform.jl | 219 +++------- src/tools.jl | 43 +- src/traits.jl | 319 +++++++------- src/types.jl | 93 ++-- test/integration/regression.jl | 118 ++++-- test/integration/static_algorithms.jl | 64 +-- test/runtests.jl | 4 + test/tools.jl | 8 - test/traits.jl | 16 + 26 files changed, 1062 insertions(+), 1123 deletions(-) create mode 100644 test/traits.jl diff --git a/Project.toml b/Project.toml index f8431fdd..206a4038 100644 --- a/Project.toml +++ b/Project.toml @@ -19,4 +19,4 @@ Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [targets] -test = ["DataFrames", "LinearAlgebra", "MLUtils", "Serialization", "SparseArrays", "Tables", "Test"] +test = ["DataFrames", "LinearAlgebra", "MLUtils", "Serialization", "Tables", "Test"] diff --git a/docs/make.jl b/docs/make.jl index fd54ce70..b0705cda 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -11,13 +11,13 @@ makedocs( "Home" => "index.md", "Anatomy of an Implementation" => "anatomy_of_an_implementation.md", "Reference" => "reference.md", - "Kinds of Target Proxy" => "kinds_of_target_proxy.md", - "fit" => "fit.md", - "predict, transform, and relatives" => "predict_transform.md", - "minimize" => "minimize.md", - "obs" => "obs.md", - "Accessor Functions" => "accessor_functions.md", - "Algorithm Traits" => "traits.md", + "... fit" => "fit.md", + "... predict/transform" => "predict_transform.md", + "... Kinds of Target Proxy" => "kinds_of_target_proxy.md", + "... minimize" => "minimize.md", + "... obs" => "obs.md", + "... Accessor Functions" => "accessor_functions.md", + "... Algorithm Traits" => "traits.md", "Common Implementation Patterns" => "common_implementation_patterns.md", "Testing an Implementation" => "testing_an_implementation.md", ], diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md index 07c30f1f..f35adc54 100644 --- a/docs/src/accessor_functions.md +++ b/docs/src/accessor_functions.md @@ -1,7 +1,6 @@ # [Accessor Functions](@id accessor_functions) -The sole argument of an accessor function is the output, `model`, of [`fit`](@ref) or -[`obsfit`](@ref). +The sole argument of an accessor function is the output, `model`, of [`fit`](@ref). 
- [`LearnAPI.algorithm(model)`](@ref)
- [`LearnAPI.extras(model)`](@ref)
@@ -12,6 +11,7 @@ The sole argument of an accessor function is the output, `model`, of [`fit`](@re
 - [`LearnAPI.feature_importances(model)`](@ref)
 - [`LearnAPI.training_labels(model)`](@ref)
 - [`LearnAPI.training_losses(model)`](@ref)
+- [`LearnAPI.training_predictions(model)`](@ref)
 - [`LearnAPI.training_scores(model)`](@ref)
 - [`LearnAPI.components(model)`](@ref)
 
@@ -33,6 +33,7 @@ LearnAPI.tree
 LearnAPI.trees
 LearnAPI.feature_importances
 LearnAPI.training_losses
+LearnAPI.training_predictions
 LearnAPI.training_scores
 LearnAPI.training_labels
 LearnAPI.components
diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md
index 2bc24a39..1c011d21 100644
--- a/docs/src/anatomy_of_an_implementation.md
+++ b/docs/src/anatomy_of_an_implementation.md
@@ -1,11 +1,28 @@
 # Anatomy of an Implementation
 
 This section explains a detailed implementation of the LearnAPI for naive [ridge
-regression](https://en.wikipedia.org/wiki/Ridge_regression). Most readers will want to
-scan the [demonstration](@ref workflow) of the implementation before studying the
-implementation itself.
+regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of
+workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also
+refer to the [demonstration](@ref workflow) of the implementation given later.
 
-## Defining an algorithm type
+For a transformer, implementations ordinarily implement `transform` instead of
+`predict`. For more on `predict` versus `transform`, see [Predict or transform?](@ref)
+
+!!! important
+
+    The core implementations of `fit`, `predict`, etc,
+    always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`.
+    Calls like `fit(algorithm, X, y)` are provided as additional convenience methods.
+
+!!! note
+
+    If the `data` object consumed by `fit`, `predict`, or `transform` is
+    not a suitable table¹, array³, tuple of tables and arrays, or some
+    other object implementing
+    the MLUtils.jl `getobs`/`numobs` interface,
+    then an implementation must: (i) suitably overload the trait
+    [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as
+    illustrated below under [Providing an advanced data interface](@ref).
 
 The first line below imports the lightweight package LearnAPI.jl whose methods we will be
 extending. The second imports libraries needed for the core algorithm.
@@ -16,141 +33,81 @@ using LinearAlgebra, Tables
 nothing # hide
 ```
 
-A struct stores the regularization hyperparameter `lambda` of our ridge regressor:
+## Defining algorithms
+
+Here's a new type whose instances specify ridge regression parameters:
 
 ```@example anatomy
-struct Ridge
-    lambda::Float64
+struct Ridge{T<:Real}
+    lambda::T
 end
 nothing # hide
 ```
 
-Instances of `Ridge` are [algorithms](@ref algorithms), in LearnAPI.jl parlance.
+Instances of `Ridge` will be [algorithms](@ref algorithms), in LearnAPI.jl parlance.
 
-A keyword argument constructor provides defaults for all hyperparameters:
+To [qualify](@ref algorithms) as a LearnAPI algorithm, an object must come with a
+mechanism for creating new versions of itself, with modified property (field) values.
To +this end, we implement `LearnAPI.constructor`, which must return a keyword constructor: ```@example anatomy Ridge(; lambda=0.1) = Ridge(lambda) +LearnAPI.constructor(::Ridge) = Ridge nothing # hide ``` -## Implementing `fit` - -A ridge regressor requires two types of data for training: *input features* `X`, which -here we suppose are tabular, and a [target](@ref proxy) `y`, which we suppose is a -vector. Users will accordingly call [`fit`](@ref) like this: - -```julia -algorithm = Ridge(lambda=0.05) -fit(algorithm, X, y; verbosity=1) -``` - -However, a new implementation does not overload `fit`. Rather it -implements - -```julia -obsfit(algorithm::Ridge, obsdata; verbosity=1) -``` - -for each `obsdata` returned by a data-preprocessing call `obs(fit, algorithm, X, y)`. You -can read "obs" as "observation-accessible", for reasons explained shortly. The -LearnAPI.jl definition - -```julia -fit(algorithm, data...; verbosity=1) = - obsfit(algorithm, obs(fit, algorithm, data...), verbosity) -``` -then takes care of `fit`. - -The `obs` and `obsfit` method are public, and the user can call them like this: - -```julia -obsdata = obs(fit, algorithm, X, y) -model = obsfit(algorithm, obsdata) -``` - -We begin by defining a struct¹ for the output of our data-preprocessing operation, `obs`, -which will store `y` and the matrix representation of `X`, together with it's column names -(needed for recording named coefficients for user inspection): - -```@example anatomy -struct RidgeFitData{T} - A::Matrix{T} # p x n - names::Vector{Symbol} - y::Vector{T} -end -nothing # hide -``` - -And we overload [`obs`](@ref) like this +So, if `algorithm = Ridge(lambda=0.1)` then `LearnAPI.constructor(algorithm)(lambda=0.05)` +is another algorithm with the same properties, except that the value of `lambda` has been +changed to `0.05`. -```@example anatomy -function LearnAPI.obs(::typeof(fit), ::Ridge, X, y) - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - return RidgeFitData(Tables.matrix(table, transpose=true), names, y) -end -nothing # hide -``` - -so that `obs(fit, Ridge(), X, y)` returns a combined `RidgeFitData` object with everything -the core algorithm will need. -Since `obs` is public, the user will have access to this object, but to make it useful to -her (and to fulfill the [`obs`](@ref) contract) this object must implement the -[MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) `getobs`/`numobs` interface, to enable -observation-resampling (which will be efficient, because observations are now columns). It -usually suffices to overload `Base.getindex` and `Base.length` (which are the -`getobs`/`numobs` fallbacks) so we won't actually need to depend on MLUtils.jl: +## Implementing `fit` -```@example anatomy -Base.getindex(data::RidgeFitData, I) = - RidgeFitData(data.A[:,I], data.names, y[I]) -Base.length(data::RidgeFitData, I) = length(data.y) -nothing # hide -``` +A ridge regressor requires two types of data for training: *input features* `X`, which +here we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a +vector. 
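For example (purely illustrative data, not part of the implementation), such training
data might look like this:

```julia
X = (feature1 = [1.0, 2.0, 3.0], feature2 = [4.0, 5.0, 6.0])  # a column table
y = [0.9, 2.1, 3.1]                                           # target vector
```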
-Next, we define a second struct for storing the outcomes of training, including named -versions of the learned coefficients: +It is convenient to define a new type for the `fit` output, which will include +coefficients labelled by feature name for inspection after training: ```@example anatomy struct RidgeFitted{T,F} - algorithm::Ridge - coefficients::Vector{T} - named_coefficients::F + algorithm::Ridge + coefficients::Vector{T} + named_coefficients::F end nothing # hide ``` -We include `algorithm`, which must be recoverable from the output of `fit`/`obsfit` (see -[Accessor functions](@ref) below). +Note that we also include `algorithm` in the struct, for it must be possible to recover +`algorithm` from the output of `fit`; see [Accessor functions](@ref) below. -We are now ready to implement a suitable `obsfit` method to execute the core training: +The core implementation of `fit` looks like this: ```@example anatomy -function LearnAPI.obsfit(algorithm::Ridge, obsdata::RidgeFitData, verbosity) +function LearnAPI.fit(algorithm::Ridge, data; verbosity=1) - lambda = algorithm.lambda - A = obsdata.A - names = obsdata.names - y = obsdata.y + X, y = data - # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # 1 x p matrix + # data preprocessing: + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + A = Tables.matrix(table, transpose=true) - # determine named coefficients: - named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + lambda = algorithm.lambda - # make some noise, if allowed: - verbosity > 0 && @info "Coefficients: $named_coefficients" + # apply core algorithm: + coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector - return RidgeFitted(algorithm, coefficients, named_coefficients) + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" + + return RidgeFitted(algorithm, coefficients, named_coefficients) end -nothing # hide ``` -Users set `verbosity=0` for warnings only, and `verbosity=-1` for silence. - ## Implementing `predict` @@ -163,45 +120,29 @@ predict(model, LiteralTarget(), Xnew) where `Xnew` is a table (of the same form as `X` above). The argument `LiteralTarget()` signals that we want literal predictions of the target variable, as opposed to a proxy for the target, such as probability density functions. `LiteralTarget` is an example of a -[`LearnAPI.KindOfProxy`](@ref proxy_types) type. Targets and target proxies are defined +[`LearnAPI.KindOfProxy`](@ref proxy_types) type. Targets and target proxies are discussed [here](@ref proxy). -Rather than overload the primary signature above, however, we overload for -"observation-accessible" input, as we did for `fit`, - -```@example anatomy -LearnAPI.obspredict(model::RidgeFitted, ::LiteralTarget, Anew::Matrix) = - ((model.coefficients)'*Anew)' -nothing # hide -``` - -and overload `obs` to make the table-to-matrix conversion: +Here's the implementation for our ridge regressor: ```@example anatomy -LearnAPI.obs(::typeof(predict), ::Ridge, Xnew) = Tables.matrix(Xnew, transpose=true) +LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = + Tables.matrix(Xnew)*model.coefficients ``` -As matrices (with observations as columns) already implement the MLUtils.jl -`getobs`/`numobs` interface, we already satisfy the [`obs`](@ref) contract, and there was -no need to create a wrapper for `obs` output. 
- -The primary `predict` method, handling tabular input, is provided by a -LearnAPI.jl fallback similar to the `fit` fallback. - - ## Accessor functions -An [accessor function](@ref accessor_functions) has the output of [`fit`](@ref) (a -"model") as it's sole argument. Every new implementation must implement the accessor -function [`LearnAPI.algorithm`](@ref) for recovering an algorithm from a fitted object: +An [accessor function](@ref accessor_functions) has the output of [`fit`](@ref) as it's +sole argument. Every new implementation must implement the accessor function +[`LearnAPI.algorithm`](@ref) for recovering an algorithm from a fitted object: ```@example anatomy LearnAPI.algorithm(model::RidgeFitted) = model.algorithm ``` Other accessor functions extract learned parameters or some standard byproducts of -training, such as feature importances or training losses.² Implementing the -[`LearnAPI.coefficients`](@ref) accessor function is straightforward: +training, such as feature importances or training losses.² Here we implement an accessor +function to extract the linear coefficients: ```@example anatomy LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients @@ -215,56 +156,84 @@ overload it to dump the named version of the coefficients: ```@example anatomy LearnAPI.minimize(model::RidgeFitted) = - RidgeFitted(model.algorithm, model.coefficients, nothing) + RidgeFitted(model.algorithm, model.coefficients, nothing) ``` +Crucially, we can still use `LearnAPI.minimize(model)` in place of `model` to make new +predictions. + + ## Algorithm traits Algorithm [traits](@ref traits) record extra generic information about an algorithm, or -make specific promises of behavior. They usually have an algorithm as the single argument. +make specific promises of behavior. They usually have an algorithm as the single +argument. We regard [`LearnAPI.constructor`](@ref) defined above as a trait. In LearnAPI.jl `predict` always outputs a [target or target proxy](@ref proxy), where -"target" is understood very broadly. We overload a trait to record the fact that the -target variable explicitly appears in training (i.e, the algorithm is supervised) and -where exactly it appears: +"target" is understood very broadly. We overload a trait to record the fact here that the +target variable explicitly appears in training (i.e, the algorithm is supervised): ```julia -LearnAPI.position_of_target(::Ridge) = 2 +LearnAPI.target(::Ridge) = true ``` -Or, you can use the shorthand + +or, using a shortcut: ```julia -@trait Ridge position_of_target = 2 +@trait Ridge target = true ``` -The macro can also be used to specify multiple traits simultaneously: +The macro can be used to specify multiple traits simultaneously: ```@example anatomy @trait( - Ridge, - position_of_target = 2, - kinds_of_proxy=(LiteralTarget(),), - descriptors = (:regression,), - functions = ( - fit, - obsfit, - minimize, - predict, - obspredict, - obs, - LearnAPI.algorithm, - LearnAPI.coefficients, - ) + Ridge, + constructor = Ridge, + target = true, + kinds_of_proxy=(LiteralTarget(),), + descriptors = (:regression,), + functions = ( + fit, + minimize, + predict, + obs, + LearnAPI.algorithm, + LearnAPI.coefficients, + ) ) nothing # hide ``` -Implementing the last trait, [`LearnAPI.functions`](@ref), which must include all -non-trait functions overloaded for `Ridge`, is compulsory. This is the only universally -compulsory trait. 
It is worthwhile studying the [list of all traits](@ref traits_list) to
-see which might apply to a new implementation, to enable maximum buy into functionality
-provided by third party packages, and to assist third party algorithms that match machine
-learning algorithms to user defined tasks.
+The trait `kinds_of_proxy` is required here, because we implemented `predict`.
+
+The last trait `functions` returns a list of all LearnAPI.jl methods that can be
+meaningfully applied to the algorithm or associated model. See [`LearnAPI.functions`](@ref)
+for a checklist. This, and [`LearnAPI.constructor`](@ref), are the only universally
+compulsory traits. However, it is worthwhile studying the [list of all traits](@ref
+traits_list) to see which might apply to a new implementation, to enable maximum buy into
+functionality provided by third party packages, and to assist third party algorithms that
+match machine learning algorithms to user-defined tasks.
+
+Having set `LearnAPI.target(::Ridge) == true` we are obliged to overload a multi-argument
+version of `LearnAPI.target` to extract the target from the `data` that gets supplied to
+`fit`:
+
+```@example anatomy
+LearnAPI.target(::Ridge, data) = last(data)
+```
+
+## Convenience methods
+
+Finally, we extend `fit` and `predict` with signatures convenient for user interaction,
+enabling the kind of workflow previewed in [Sample workflow](@ref):
+
+```@example anatomy
+LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) =
+    fit(algorithm, (X, y); kwargs...)
+
+LearnAPI.predict(model::RidgeFitted, Xnew) =
+    predict(model, LiteralTarget(), Xnew)
+```
 
 ## [Demonstration](@id workflow)
 
 We now illustrate how to interact directly with `Ridge` instances using the methods we
 have implemented:
 
 ```@example anatomy
 n = 10 # number of observations
 train = 1:6
 test = 7:10
 a, b, c = rand(n), rand(n), rand(n)
 X = (; a, b, c)
 y = 2a - b + 3c + 0.05*rand(n)
+nothing # hide
+```
 
+```@example anatomy
 algorithm = Ridge(lambda=0.5)
-LearnAPI.functions(algorithm)
+foreach(println, LearnAPI.functions(algorithm))
 ```
 
-### Naive user workflow
-
-Training and predicting with external resampling:
+Training and predicting:
 
 ```@example anatomy
-using Tables
 model = fit(algorithm, Tables.subset(X, train), y[train])
 ŷ = predict(model, LiteralTarget(), Tables.subset(X, test))
 ```
 
-### Advanced workflow
-
-
- -```@example anatomy -import MLUtils -fit_data = obs(fit, algorithm, X, y) -predict_data = obs(predict, algorithm, X) -model = obsfit(algorithm, MLUtils.getobs(fit_data, train)) -ẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predict_data, test)) -@assert ẑ == ŷ -nothing # hide -``` - -### Applying an accessor function and serialization - Extracting coefficients: ```@example anatomy @@ -319,21 +271,213 @@ LearnAPI.coefficients(model) Serialization/deserialization: -```julia +```@example anatomy using Serialization small_model = minimize(model) -serialize("my_ridge.jls", small_model) +filename = tempname() +serialize(filename, small_model) +``` -recovered_model = deserialize("my_ridge.jls") +```julia +recovered_model = deserialize(filename) @assert LearnAPI.algorithm(recovered_model) == algorithm -predict(recovered_model, LiteralTarget(), X) == predict(model, LiteralTarget(), X) +@assert predict(recovered_model, X) == predict(model, X) +``` + +## Providing an advanced data interface + +```@setup anatomy2 +using LearnAPI +using LinearAlgebra, Tables + +struct Ridge{T<:Real} + lambda::T +end + +Ridge(; lambda=0.1) = Ridge(lambda) + +struct RidgeFitted{T,F} + algorithm::Ridge + coefficients::Vector{T} + named_coefficients::F +end + +LearnAPI.algorithm(model::RidgeFitted) = model.algorithm +LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients +LearnAPI.minimize(model::RidgeFitted) = + RidgeFitted(model.algorithm, model.coefficients, nothing) + +LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = + fit(algorithm, (X, y); kwargs...) +LearnAPI.predict(model::RidgeFitted, Xnew) = predict(model, LiteralTarget(), Xnew) + +@trait( + Ridge, + constructor = Ridge, + target = true, + kinds_of_proxy=(LiteralTarget(),), + descriptors = (:regression,), + functions = ( + fit, + minimize, + predict, + obs, + LearnAPI.algorithm, + LearnAPI.coefficients, + ) +) + +n = 10 # number of observations +train = 1:6 +test = 7:10 +a, b, c = rand(n), rand(n), rand(n) +X = (; a, b, c) +y = 2a - b + 3c + 0.05*rand(n) +``` + +An implementation may optionally implement [`obs`](@ref), to expose to the user (or some +meta-algorithm like cross-validation) the representation of input data internal to `fit` +or `predict`, such as the matrix version `A` of `X` in the ridge example. Here we +specifically wrap all the pre-processed data into single object, for which we introduce a +new type: + +```@example anatomy2 +struct RidgeFitObs{T,M<:AbstractMatrix{T}} + A::M # p x n + names::Vector{Symbol} # features + y::Vector{T} # target +end +``` + +Now we overload `obs` to carry out the data pre-processing previously in `fit`, like this: + +```@example anatomy2 +function LearnAPI.obs(::Ridge, data) + X, y = data + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + return RidgeFitObs(Tables.matrix(table)', names, y) +end +``` + +We informally refer to the output of `obs` as "observations" (see [The `obs` +contract](@ref) below). 
The previous core `fit` signature is now replaced with two
+methods - one to handle "regular" input, and one to handle the pre-processed data
+(observations) which appears first below:
+
+```@example anatomy2
+function LearnAPI.fit(algorithm::Ridge, observations::RidgeFitObs; verbosity=1)
+
+    lambda = algorithm.lambda
+
+    A = observations.A
+    names = observations.names
+    y = observations.y
+
+    # apply core algorithm:
+    coefficients = (A*A' + algorithm.lambda*I)\(A*y) # 1 x p matrix
+
+    # determine named coefficients:
+    named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]
+
+    # make some noise, if allowed:
+    verbosity > 0 && @info "Coefficients: $named_coefficients"
+
+    return RidgeFitted(algorithm, coefficients, named_coefficients)
+
+end
+
+LearnAPI.fit(algorithm::Ridge, data; kwargs...) =
+    fit(algorithm, obs(algorithm, data); kwargs...)
+```
+
+We provide an overloading of `LearnAPI.target` to handle the additional supported data
+argument of `fit`:
+
+```@example anatomy2
+LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y
+```
+
+### The `obs` contract
+
+Providing `fit` signatures matching the output of `obs` is the first part of the `obs`
+contract. The second part is this: *The output of `obs` must implement the*
+[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs/numobs` *interface for
+accessing individual observations*. It usually suffices to overload `Base.getindex` and
+`Base.length` (which are the `getobs/numobs` fallbacks):
+
+```@example anatomy2
+Base.getindex(data::RidgeFitObs, I) =
+    RidgeFitObs(data.A[:,I], data.names, data.y[I])
+Base.length(data::RidgeFitObs) = length(data.y)
+```
+
+We can do something similar for `predict`, but there's no need for a new type in this
+case:
+
+```@example anatomy2
+LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)'
+
+LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, observations::AbstractMatrix) =
+    observations'*model.coefficients
+
+LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) =
+    predict(model, LiteralTarget(), obs(model, Xnew))
+```
+
+### Important notes:
+
+- The observations to be consumed by `fit` are returned by `obs(algorithm::Ridge, ...)`,
+  while those consumed by `predict` are returned by `obs(model::RidgeFitted, ...)`. We
+  need the different signatures because the form of data consumed by `fit` and `predict`
+  are generally different.
+
+- We need the adjoint operator, `'`, because the last dimension in arrays is the
+  observation dimension, according to the MLUtils.jl convention. Remember, `Xnew` is a
+  table here.
+
+Since LearnAPI.jl provides fallbacks for `obs` that simply return the unadulterated data
+input, overloading `obs` is optional. This is provided that data in publicized
+`fit`/`predict` signatures consists of objects implementing the `getobs/numobs` interface
+(such as tables¹ and arrays³).
+
+To buy out of supporting the MLUtils.jl interface altogether, an implementation must
+overload the trait, [`LearnAPI.data_interface(algorithm)`](@ref).
+
+For more on data interfaces, see [`obs`](@ref) and
+[`LearnAPI.data_interface(algorithm)`](@ref).
+
+
+## Demonstration of an advanced `obs` workflow
+
+We can now train and predict using internal data representations, resampled using the
+generic MLUtils.jl interface:
+
+```@example anatomy2
+import MLUtils
+algorithm = Ridge()
+observations_for_fit = obs(algorithm, (X, y))
+model = fit(algorithm, MLUtils.getobs(observations_for_fit, train))
+observations_for_predict = obs(model, X)
+ẑ = predict(model, MLUtils.getobs(observations_for_predict, test))
+```
+
+```julia
+@assert ẑ == ŷ
+```
+
 ---
 
-¹ The definition of this and other structs above is not an explicit requirement of
-LearnAPI.jl, whose constructs are purely functional.
+¹ In LearnAPI.jl a *table* is any object `X` implementing the
+[Tables.jl](https://tables.juliadata.org/dev/) interface, additionally satisfying
+`Tables.istable(X) == true` and implementing `DataAPI.nrow` (and whence
+`MLUtils.numobs`). Tables that are also (unnamed) tuples are disallowed.
 
 ² An implementation can provide further accessor functions, if necessary, but like the
 native ones, they must be included in the [`LearnAPI.functions`](@ref)
 declaration.
+
+³ The last index must be the observation index.
+
+⁴ Guaranteed assuming
+`LearnAPI.data_interface(algorithm) == Base.HasLength()`, the default.
diff --git a/docs/src/fit.md b/docs/src/fit.md
index f2709611..c3727110 100644
--- a/docs/src/fit.md
+++ b/docs/src/fit.md
@@ -1,10 +1,13 @@
 # [`fit`](@ref fit)
 
 ```julia
-fit(algorithm, data...; verbosity=1) -> model
-fit(model, data...; verbosity=1) -> updated_model
+fit(algorithm, data; verbosity=1) -> model
+fit(model, data; verbosity=1) -> updated_model
 ```
 
+When `fit` expects a tuple form of argument, `data = (X1, ..., Xn)`, then the signature
+`fit(algorithm, X1, ..., Xn)` is also provided.
+
 ## Typical workflow
 
 ```julia
@@ -20,17 +23,14 @@ LearnAPI.feature_importances(model)
 
 ## Implementation guide
 
-The `fit` method is not implemented directly. Instead, implement [`obsfit`](@ref).
+| method                    | fallback | compulsory? |
+|:--------------------------|:---------|-------------|
+| [`fit`](@ref)`(alg, ...)` | none     | yes         |
 
-| method | fallback | compulsory? | requires |
-|:-----------------------------|:---------|-------------|-----------------------------|
-| [`obsfit`](@ref)`(alg, ...)` | none | yes | [`obs`](@ref) in some cases |
-| | | | |
+
 ## Reference
 
 ```@docs
 LearnAPI.fit
-LearnAPI.obsfit
 ```
diff --git a/docs/src/index.md b/docs/src/index.md
index f5c793f7..4f979070 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -22,11 +22,11 @@ promising specific behavior.
 
 !!! warning
 
 	The API described here is under active development and not ready for adoption.
 	Join an ongoing design discussion at
 	[this](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048)
 	Julia Discourse thread.
-	
+
 ## Sample workflow
@@ -69,12 +69,17 @@ on the usual supervised/unsupervised learning dichotomy. From this point of view
 supervised algorithm is simply one in which a target variable exists, and happens to
 appear as an input to training but not to prediction.
-In LearnAPI.jl, a method called [`obs`](@ref data_interface) gives users access to an -"internal", algorithm-specific, representation of input data, which is always -"observation-accessible", in the sense that it can be resampled using -[MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) `getobs/numobs` interface. The -implementation can arrange for this resampling to be efficient, and workflows based on -`obs` can have performance benefits. +Algorithms are free to consume data in any format. However, a method called [`obs`](@ref +data_interface) (read as "observations") gives users and meta-algorithms access to an +algorithm-specific representation of input data, which is also guaranteed to implement a +standard interface for accessing individual observations, unless an algorithm explicitly +opts out. The `fit` and `predict` methods consume these alternative representations of data. + +The fallback data interface is the [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) +`getobs/numobs` interface, and if the input consumed by the algorithm already implements +that interface (tables, arrays, etc.) then overloading `obs` is completely optional. A +plain iteration interface (to support, e.g., data loaders reading images from disk files) +can also be specified. ## Learning more diff --git a/docs/src/kinds_of_target_proxy.md b/docs/src/kinds_of_target_proxy.md index 03c7e032..a34e1f42 100644 --- a/docs/src/kinds_of_target_proxy.md +++ b/docs/src/kinds_of_target_proxy.md @@ -1,17 +1,19 @@ # [Kinds of Target Proxy](@id proxy_types) -The available kinds of [target proxy](@ref proxy) are classified by subtypes of -`LearnAPI.KindOfProxy`. These types are intended for dispatch only and have no fields. +The available kinds of [target proxy](@ref proxy) (used for `predict` dispatch) are +classified by subtypes of `LearnAPI.KindOfProxy`. These types are intended for dispatch +only and have no fields. 
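+
+For example, a `model` supporting probabilistic predictions dispatches on an *instance*
+of one of these types (a sketch only, assuming `Xnew` is suitable input for `model`):
+
+```julia
+predict(model, Distribution(), Xnew)   # probability distributions
+predict(model, LiteralTarget(), Xnew)  # point predictions
+```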
```@docs
 LearnAPI.KindOfProxy
 ```
+
+## Simple target proxies
+
 ```@docs
 LearnAPI.IID
 ```
 
-## Simple target proxies (subtypes of `LearnAPI.IID`)
-
 | type                                  | form of an observation                                                                                                                                                             |
 |:-------------------------------------:|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | `LearnAPI.LiteralTarget`              | same as target observations                                                                                                                                                        |
@@ -24,11 +26,13 @@ LearnAPI.IID
 | `LearnAPI.LabelAmbiguous`             | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering |
 | `LearnAPI.LabelAmbiguousSampleable`   | sampleable version of `LabelAmbiguous`; see `Sampleable` above                                                                                                                     |
 | `LearnAPI.LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above                                                                                                                      |
+| `LearnAPI.LabelAmbiguousFuzzy`        | same as `LabelAmbiguous` but with multiple values of indeterminate number                                                                                                          |
 | `LearnAPI.ConfidenceInterval`         | confidence interval                                                                                                                                                                |
-| `LearnAPI.Set`                        | finite but possibly varying number of target observations                                                                                                                          |
-| `LearnAPI.ProbabilisticSet`           | as for `Set` but labeled with probabilities (not necessarily summing to one)                                                                                                       |
+| `LearnAPI.Fuzzy`                      | finite but possibly varying number of target observations                                                                                                                          |
+| `LearnAPI.ProbabilisticFuzzy`         | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one)                                                                                                     |
 | `LearnAPI.SurvivalFunction`           | survival function                                                                                                                                                                  |
 | `LearnAPI.SurvivalDistribution`       | probability distribution for survival time                                                                                                                                        |
+| `LearnAPI.SurvivalHazardFunction`     | hazard function for survival time                                                                                                                                                  |
 | `LearnAPI.OutlierScore`               | numerical score reflecting degree of outlierness (not necessarily normalized)                                                                                                      |
 | `LearnAPI.Continuous`                 | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls)                                                                 |
 
@@ -38,18 +42,31 @@ representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/p
 
 > Table of concrete subtypes of `LearnAPI.IID <: LearnAPI.KindOfProxy`.
 
-## When the proxy for the target is a single object
+## Proxies for distribution-fitting algorithms
+
+```@docs
+LearnAPI.Single
+```
+
+| type `T`                         | form of output of `predict(model, ::T)`                                 |
+|:--------------------------------:|:-------------------------------------------------------------------------|
+| `LearnAPI.SingleSampleable`      | object that can be sampled to obtain a single target observation          |
+| `LearnAPI.SingleDistribution`    | explicit probability density/mass function for sampling the target        |
+| `LearnAPI.SingleLogDistribution` | explicit log-probability density/mass function for sampling the target    |
+
+> Table of `LearnAPI.KindOfProxy` subtypes subtyping `LearnAPI.Single`
+
 
-In the following table of subtypes `T <: LearnAPI.KindOfProxy` not falling under the `IID`
-umbrella, it is understood that `predict(model, ::T, ...)` is
-not divided into individual observations, but represents a *single* probability
-distribution for the sample space ``Y^n``, where ``Y`` is the space the target variable
-takes its values, and `n` is the number of observations in `data`.
+
+## Joint probability distributions
+
+```@docs
+LearnAPI.Joint
+```
 
-| type `T`                        | form of output of `predict(model, ::T, data...)`                                                                                                                      |
+| type `T`                        | form of output of `predict(model, ::T, data)`                                                                                                                         |
 |:-------------------------------:|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | `LearnAPI.JointSampleable`      | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. |
 | `LearnAPI.JointDistribution`    | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data`      |
 | `LearnAPI.JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data`  |
 
-> Table of `LearnAPI.KindOfProxy` subtypes not subtyping `LearnAPI.IID`
+> Table of `LearnAPI.KindOfProxy` subtypes subtyping `LearnAPI.Joint`
diff --git a/docs/src/minimize.md b/docs/src/minimize.md
index a9423780..6fad919a 100644
--- a/docs/src/minimize.md
+++ b/docs/src/minimize.md
@@ -23,9 +23,9 @@ LearnAPI.feature_importances(recovered_model)
 
 # Implementation guide
 
-| method             | compulsory? | fallback | requires      |
-|:-------------------|:-----------:|:--------:|:-------------:|
-| [`minimize`](@ref) | no          | identity | [`fit`](@ref) |
+| method             | compulsory? | fallback |
+|:-------------------|:-----------:|:--------:|
+| [`minimize`](@ref) | no          | identity |
 
 # Reference
 
diff --git a/docs/src/obs.md b/docs/src/obs.md
index bfb35a69..fe198a85 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -1,45 +1,40 @@
 # [`obs`](@id data_interface)
 
-The [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) package provides two methods
-`getobs` and `numobs` for resampling data divided into multiple observations, including
-arrays and tables. The data objects returned below are guaranteed to implement this
-interface and can be passed to the relevant method (`obsfit`, `obspredict` or
-`obstransform`) possibly after resampling using `MLUtils.getobs`. This may provide
-performance advantages over naive workflows.
+The `obs` method takes data intended as input to `fit`, `predict` or `transform`, and
+transforms it to an algorithm-specific form guaranteed to implement a form of observation
+access designated by the algorithm. The transformed data can then be resampled and passed
+on to the relevant method in place of the original input. Using `obs` may provide
+performance advantages over naive workflows in some cases (e.g., cross-validation).
 
 ```julia
-obs(fit, algorithm, data...) ->
-obs(predict, algorithm, data...) ->
-obs(transform, algorithm, data...) ->
+obs(algorithm, data) # can be passed to `fit` instead of `data`
+obs(model, data)     # can be passed to `predict` or `transform` instead of `data`
 ```
 
 ## Typical workflows
 
-LearnAPI.jl makes no assumptions about the form of data `X` and `y` in a call like
-`fit(algorithm, X, y)`. The particular `algorithm` is free to articulate it's own
-requirements. However, in this example, the definition
+LearnAPI.jl makes no explicit assumptions about the form of data `X` and `y` in a call
+like `fit(algorithm, (X, y))`. However, if we define
 
 ```julia
-obsdata = obs(fit, algorithm, X, y)
+observations = obs(algorithm, (X, y))
 ```
 
-combines `X` and `y` in a single object guaranteed to implement the MLUtils.jl
-`getobs`/`numobs` interface, which can be passed to `obsfit` instead of `fit`, as is, or
-after resampling using `MLUtils.getobs`:
+then, assuming the typical case that `LearnAPI.data_interface(algorithm) ==
+Base.HasLength()`, `observations` implements the
+[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface.
+Moreover, we can pass `observations` to `fit` in place of the original data, or first
+resample it using `MLUtils.getobs`:
 
 ```julia
-# equivalent to `mode = fit(algorithm, X, y)`:
-model = obsfit(algorithm, obsdata)
+# equivalent to `model = fit(algorithm, (X, y))` (or `fit(algorithm, X, y)`):
+model = fit(algorithm, observations)
 
 # with resampling:
-resampled_obsdata = MLUtils.getobs(obsdata, 1:100)
-model = obsfit(algorithm, resampled_obsdata)
+resampled_observations = MLUtils.getobs(observations, 1:10)
+model = fit(algorithm, resampled_observations)
 ```
 
 In some implementations, the alternative pattern above can be used to avoid repeating
 unnecessary internal data preprocessing, or inefficient resampling. For example, here's
-how a user might call `obs` and `MLUtils.getobs` to perform efficient
-cross-validation:
+how a user might call `obs` and `MLUtils.getobs` to perform efficient cross-validation:
 
 ```julia
 using LearnAPI
 import MLUtils
 
 X = 
 y = 
 algorithm = 
 
-test_train_folds = map([1:10, 11:20, 21:30]) do test
-    (test, setdiff(1:30, test))
-end
+train_test_folds = map([1:10, 11:20, 21:30]) do test
+    (setdiff(1:30, test), test)
+end
 
-# create fixed model-specific representations of the whole data set:
-fit_data = obs(fit, algorithm, X, y)
-predict_data = obs(predict, algorithm, predict, X)
+fitobs = obs(algorithm, (X, y))
+never_trained = true
 
-scores = map(train_test_folds) do (train_indices, test_indices)
-
-    # train using model-specific representation of data:
-    train_data = MLUtils.getobs(fit_data, train_indices)
-    model = obsfit(algorithm, train_data)
-
-    # predict on the fold complement:
-    test_data = MLUtils.getobs(predict_data, test_indices)
-    ŷ = obspredict(model, LiteralTarget(), test_data)
+scores = map(train_test_folds) do (train, test)
+
+    # train using model-specific representation of data:
+    trainobs = MLUtils.getobs(fitobs, train)
+    model = fit(algorithm, trainobs)
+
+    # predict on the test fold:
+    if never_trained
+        global predictobs = obs(model, X)
+        global never_trained = false
+    end
+    testobs = MLUtils.getobs(predictobs, test)
+    ŷ = predict(model, LiteralTarget(), testobs)
 
     return 
-
-end
+
+end
 ```
 
-Note here that the output of `obspredict` will match the representation of `y` , i.e.,
+Note here that the output of `predict` will match the representation of `y`, i.e.,
 there is no concept of an algorithm-specific representation of *outputs*, only inputs.
 
 
 ## Implementation guide
 
-| method        | compulsory? | fallback               |
-|:--------------|:-----------:|:----------------------:|
-| [`obs`](@ref) | depends     | slurps `data` argument |
-|               |             |                        |
+| method                                   | compulsory? | fallback       |
+|:-----------------------------------------|:-----------:|:--------------:|
+| [`obs(algorithm_or_model, data)`](@ref)  | depends     | returns `data` |
+|                                          |             |                |
 
-If the `data` consumed by `fit`, `predict` or `transform` consists only of tables and
-arrays (with last dimension the observation dimension) then overloading `obs` is
-optional. However, if an implementation overloads `obs` to return a (thinly wrapped)
-representation of user data that is closer to what the core algorithm actually uses, and
-overloads `MLUtils.getobs` (or, more typically `Base.getindex`) to make resampling of that
-representation efficient, then those optimizations become available to the user, without
-the user concerning herself with the details of the representation.
+A sample implementation is given in [Providing an advanced data interface](@ref).
 
-A sample implementation is given in the [`obs`](@ref) document-string below.
 
 ## Reference
 
 ```@docs
 obs
 ```
-
diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md
index 382216b8..35fb52d7 100644
--- a/docs/src/predict_transform.md
+++ b/docs/src/predict_transform.md
@@ -1,72 +1,74 @@
-# [`predict`, `transform`, and relatives](@id operations)
-
-Standard methods:
+# [`predict`, `transform` and `inverse_transform`](@id operations)
 
 ```julia
-predict(model, kind_of_proxy, data...) -> prediction
-transform(model, data...) -> transformed_data
-inverse_transform(model, data...) -> inverted_data
+predict(model, kind_of_proxy, data)
+transform(model, data)
+inverse_transform(model, data)
 ```
 
-Methods consuming output, `obsdata`, of data-preprocessor [`obs`](@ref):
-
-```julia
-obspredict(model, kind_of_proxy, obsdata) -> prediction
-obstransform(model, obsdata) -> transformed_data
-```
+When a method expects a tuple form of argument, `data = (X1, ..., Xn)`, then a slurping
+signature is also provided, as in `transform(model, X1, ..., Xn)`.
 
 ## Typical workflows
 
+Train some supervised `algorithm`:
+
 ```julia
-# Train some supervised `algorithm`:
 model = fit(algorithm, X, y)
+```
 
-# Predict probability distributions:
+Predict probability distributions:
+
+```julia
 ŷ = predict(model, Distribution(), Xnew)
+```
+
+Generate point predictions:
 
-# Generate point predictions:
+```julia
 ŷ = predict(model, LiteralTarget(), Xnew)
 ```
 
+Train a dimension-reducing `algorithm`:
+
 ```julia
-# Training a dimension-reducing `algorithm`:
 model = fit(algorithm, X)
 Xnew_reduced = transform(model, Xnew)
+```
+
+Apply an approximate right inverse:
 
-# Apply an approximate right inverse:
+```julia
 inverse_transform(model, Xnew_reduced)
 ```
 
 ### An advanced workflow
 
 ```julia
-fitdata = obs(fit, algorithm, X, y)
-predictdata = obs(predict, algorithm, Xnew)
-model = obsfit(algorithm, obsdata)
-ŷ = obspredict(model, LiteralTarget(), predictdata)
+fitobs = obs(algorithm, (X, y)) # algorithm-specific repr. of data
+model = fit(algorithm, MLUtils.getobs(fitobs, 1:100))
+predictobs = obs(model, MLUtils.getobs(X, 101:150))
+ŷ = predict(model, LiteralTarget(), predictobs)
 ```
 
 
 ## Implementation guide
 
-The methods `predict` and `transform` are not directly overloaded. Implement `obspredict`
-and `obstransform` instead:
-
-| method                      | compulsory? | fallback | requires                              |
-|:----------------------------|:-----------:|:--------:|:-------------------------------------:|
-| [`obspredict`](@ref)        | no          | none     | [`fit`](@ref)                         |
-| [`obstransform`](@ref)      | no          | none     | [`fit`](@ref)                         |
-| [`inverse_transform`](@ref) | no          | none     | [`fit`](@ref), [`obstransform`](@ref) |
+| method                      | compulsory? | fallback |
+|:----------------------------|:-----------:|:--------:|
+| [`predict`](@ref)           | no          | none     |
+| [`transform`](@ref)         | no          | none     |
+| [`inverse_transform`](@ref) | no          | none     |
 
 
 ### Predict or transform?
-
-If the algorithm has a notion of [target variable](@ref proxy), then arrange for
-[`obspredict`](@ref) to output each supported [kind of target proxy](@ref
+
+If the algorithm has a notion of [target variable](@ref proxy), then use
+[`predict`](@ref) to output each supported [kind of target proxy](@ref
 proxy_types) (`LiteralTarget()`, `Distribution()`, etc).
 
-For output not associated with a target variable, implement [`obstransform`](@ref)
+For output not associated with a target variable, implement [`transform`](@ref)
 instead, which does not dispatch on [`LearnAPI.KindOfProxy`](@ref), but can be optionally
-paired with an implementation of [`inverse_transform`](@ref) for returning (approximate)
+paired with an implementation of [`inverse_transform`](@ref), for returning (approximate)
 right inverses to `transform`.
 
@@ -74,8 +76,6 @@ right inverses to `transform`.
 
 ```@docs
 predict
-obspredict
 transform
-obstransform
 inverse_transform
 ```
diff --git a/docs/src/reference.md b/docs/src/reference.md
index 5a46c6ab..5b15e03e 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -21,12 +21,13 @@ undertood that individual objects share the same number of observations, and tha
 resampling of one component implies synchronized resampling of the others.
 
 A `DataFrame` instance, from [DataFrames.jl](https://dataframes.juliadata.org/stable/), is
-an example of data, the observations being the rows. LearnAPI.jl makes no assumptions
-about how observations can be accessed, except in the case of the output of [`obs`](@ref
-data_interface), which must implement the MLUtils.jl `getobs`/`numobs` interface. For
-example, it is generally ambiguous whether the rows or columns of a matrix are considered
-observations, but if a matrix is returned by [`obs`](@ref data_interface) the observations
-must be the columns.
+an example of data, the observations being the rows. Typically, data provided to
+LearnAPI.jl algorithms will implement the
+[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/stable) `getobs/numobs` interface for
+accessing individual observations, but implementations can opt out of this requirement;
+see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details. In the MLUtils.jl
+convention, observations in tables are the rows but observations in a matrix are the
+columns.
 
 ### [Hyperparameters](@id hyperparameters)
 
@@ -69,38 +70,31 @@ by the package
 
 ### [Algorithms](@id algorithms)
 
 An object implementing the LearnAPI.jl interface is called an *algorithm*, although it is
-more accurately "the configuration of some algorithm".¹ It will have a type name
-reflecting the name of some ML/statistics algorithm (e.g., `RandomForestRegressor`) and it
-will encapsulate a particular set of user-specified [hyperparameters](@ref).
+more accurately "the configuration of some algorithm".¹ An algorithm encapsulates a
+particular set of user-specified [hyperparameters](@ref) as the object's properties. It
+does not store learned parameters.
 
-Additionally, for `alg::Alg` to be a LearnAPI algorithm, we require:
+For `algorithm` to be a valid LearnAPI.jl algorithm,
+[`LearnAPI.constructor(algorithm)`](@ref) must be defined and return a keyword constructor
+enabling recovery of `algorithm` from its properties:
 
-- `Base.propertynames(alg)` returns the hyperparameter names; values can be accessed using
-  `Base.getproperty`
-
-- If `alg` is an algorithm, then so are all instances of the same type.
-
-- If `_alg` is another algorithm, then `alg == _alg` if and only if `typeof(alg) ==
-  typeof(_alg)` and corresponding properties are `==`. This includes properties that are
-  random number generators (which should be copied in training to avoid mutation).
-
-- If an algorithm has other algorithms as hyperparameters, then
-  [`LearnAPI.is_composite`](@ref)`(alg)` must be `true` (fallback is `false`).
-
-- A keyword constructor for `Alg` exists, providing default values for *all* non-algorithm
-  hyperparameters.
-
-- At least one non-trait LearnAPI.jl function must be overloaded for instances of `Alg`,
-  and accordingly `LearnAPI.functions(algorithm)` must be non-empty.
+```julia
+properties = propertynames(algorithm)
+named_properties = NamedTuple{properties}(getproperty.(Ref(algorithm), properties))
+@assert algorithm == LearnAPI.constructor(algorithm)(; named_properties...)
+```
 
-Any object `alg` for which [`LearnAPI.functions`](@ref)`(alg)` is non-empty is understood
-have a valid implementation of the LearnAPI.jl interface.
+Note that if `algorithm` is an instance of a *mutable* struct, this generally requires
+overloading `Base.==` for the struct.
 
+A *composite algorithm* is one with a property that can take other algorithms as values;
+for such algorithms [`LearnAPI.is_composite`](@ref)`(algorithm)` must be `true` (fallback
+is `false`). Generally, the keyword constructor provided by [`LearnAPI.constructor`](@ref)
+must provide default values for all non-algorithm properties.
 
 ### Example
 
-Any instance of `GradientRidgeRegressor` defined below meets all but the last criterion
-above:
+Any instance of `GradientRidgeRegressor` defined below is a valid algorithm.
 
 ```julia
 struct GradientRidgeRegressor{T<:Real}
@@ -110,27 +104,33 @@ struct GradientRidgeRegressor{T<:Real}
 end
 GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) =
     GradientRidgeRegressor(learning_rate, epochs, l2_regularization)
+LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor
 ```
 
-The same is not true if we make this a `mutable struct`. In that case we will need to
-appropriately overload `Base.==` for `GradientRidgeRegressor`.
+Any object `algorithm` for which [`LearnAPI.functions`](@ref)`(algorithm)` is non-empty is
+understood to have a valid implementation of the LearnAPI.jl interface.
 
 ## Methods
 
-Only these method names are exported: `fit`, `obsfit`, `predict`, `obspredict`,
-`transform`, `obstransform`, `inverse_transform`, `minimize`, and `obs`. All new
-implementations must implement [`obsfit`](@ref), the accessor function
-[`LearnAPI.algorithm`](@ref algorithm_minimize) and the trait
-[`LearnAPI.functions`](@ref).
+Only these method names are exported by LearnAPI: `fit`, `transform`, `inverse_transform`,
+`minimize`, and `obs`. All new implementations must implement [`fit`](@ref),
+[`LearnAPI.algorithm`](@ref algorithm_minimize), [`LearnAPI.constructor`](@ref) and
+[`LearnAPI.functions`](@ref). The last two are algorithm traits, which can be set with the
+[`@trait`](@ref) macro.
 
-- [`fit`](@ref fit)/[`obsfit`](@ref): for training algorithms that generalize to new data
+### List of methods
 
+- [`fit`](@ref fit): for training or updating algorithms that generalize to new data. For
+  non-generalizing ("static") algorithms, `fit(algorithm)` generally wraps the algorithm
+  in a mutable struct that can be mutated by `predict`/`transform` to record byproducts
+  of those operations.
 
-- [`predict`](@ref operations)/[`obspredict`](@ref): for outputting [targets](@ref proxy)
-  or [target proxies](@ref proxy) (such as probability density functions)
+- [`predict`](@ref operations): for outputting [targets](@ref proxy) or [target
+  proxies](@ref proxy) (such as probability density functions)
 
-- [`transform`](@ref operations)/[`obstransform`](@ref): similar to `predict`, but for
-  arbitrary kinds of output, and which can be paired with an `inverse_transform` method
+- [`transform`](@ref operations): similar to `predict`, but for arbitrary kinds of output,
+  and which can be paired with an `inverse_transform` method
 
 - [`inverse_transform`](@ref operations): for inverting the output of `transform`
   ("inverting" broadly understood)
@@ -138,21 +138,22 @@ implementations must implement [`obsfit`](@ref), the accessor function
 - [`minimize`](@ref algorithm_minimize): for stripping the `model` output by `fit` of
   inessential content, for purposes of serialization.
 
-- [`obs`](@ref data_interface): a method for exposing to the user "optimized",
-  algorithm-specific representations of data, which can be passed to `obsfit`,
-  `obspredict` or `obstransform`, but which can also be efficiently resampled using the
-  `getobs`/`numobs` interface provided by
-  [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl).
-
+- [`obs`](@ref data_interface): a method for exposing to the user algorithm-specific
+  representations of data guaranteed to implement observation access according to the
+  value of the [`LearnAPI.data_interface`](@ref) trait
+
 - [Accessor functions](@ref accessor_functions): include things like `feature_importances`
   and `training_losses`, for extracting, from training outcomes, information common to
   many algorithms.
 
-- [Algorithm traits](@ref traits): special methods that promise specific algorithm
-  behavior or for recording general information about the algorithm. The only universally
-  compulsory trait is `LearnAPI.functions(algorithm)`, which returns a list of the
-  explicitly overloaded non-trait methods.
-
+- [Algorithm traits](@ref traits): special methods that promise specific algorithm
+  behavior or record general information about the algorithm. Only
+  [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref) are universally
+  compulsory.
+
+- [`LearnAPI.target`](@ref) and [`LearnAPI.weights`](@ref) are both traits and methods to
+  extract, from `fit` input data, the target and per-observation weights, when available.
+
 ---
 
 ¹ We acknowledge users may not like this terminology, and may know "algorithm" by some
diff --git a/docs/src/traits.md b/docs/src/traits.md
index 3a263595..9ff63967 100644
--- a/docs/src/traits.md
+++ b/docs/src/traits.md
@@ -1,9 +1,8 @@
 # [Algorithm Traits](@id traits)
 
 Traits generally promise specific algorithm behavior, such as: *This algorithm supports
-per-observation weights, which must appear as the third argument of `fit`*, or *This
-algorithm's `transform` method predicts `Real` vectors*. They also record more mundane
-information, such as a package license.
+per-observation weights*, or *This algorithm's `transform` method predicts `Real`
+vectors*. They also record more mundane information, such as a package license.
 
 Algorithm traits are functions whose first (and usually only) argument is an algorithm.
 
@@ -20,46 +19,40 @@ one argument.
In the examples column of the table below, `Table`, `Continuous`, `Sampleable` are names owned by the package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/). -| trait | return value | fallback value | example | -|:----------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------|:----------------------|:---------------------------------------------------------| -| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(LearnAPI.fit, LearnAPI.predict, LearnAPI.algorithm)` | -| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kop` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kop, ...)` is guaranteed. | `()` | `(Distribution(), Interval())` | -| [`LearnAPI.position_of_target`](@ref)`(algorithm)` | the positional index¹ of the **target** in `data` in `fit(algorithm, data...)` calls | `0` | 2 | -| [`LearnAPI.position_of_weights`](@ref)`(algorithm)` | the positional index¹ of **per-observation weights** in `data` in `fit(algorithm, data...)` | `0` | 3 | -| [`LearnAPI.descriptors`](@ref)`(algorithm)` | lists one or more suggestive algorithm descriptors from `LearnAPI.descriptors()` | `()` | (:regression, :probabilistic) | -| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | -| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | -| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | -| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | -| [`LearnAPI.load_path`](@ref)`(algorithm)` | a string indicating where the struct for `typeof(algorithm)` is defined, beginning with name of package providing implementation | `"unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | -| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties (fields) of `algorithm` may be an algorithm | `false` | `true` | -| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | -| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | -| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | -| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | -| [`LearnAPI.fit_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | -| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{AbstractVector{<:Real}, Real}` | -| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | 
upper bound on `scitype(data)` ensuring `predict(model, kop, data...)` works | `Union{}` | `Table(Continuous)` | -| [`LearnAPI.predict_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `predict(model, kop, data...)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `predict(model, kop, data...)` works | `Union{}` | `AbstractMatrix{<:Real}` | -| [`LearnAPI.predict_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `predict(model, kop, data...)` works | `Union{}` | `Vector{<:Real}` | -| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `scitype(predict(model, ...))` | `Any` | `AbstractVector{Continuous}` | -| [`LearnAPI.predict_output_type`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `typeof(predict(model, ...))` | `Any` | `AbstractVector{<:Real}` | -| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `transform(model, data...)` works | `Union{}` | `Table(Continuous)` | -| [`LearnAPI.transform_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `transform(model, data...)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.transform_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)`ensuring `transform(model, data...)` works | `Union{}` | `AbstractMatrix{<:Real}}` | -| [`LearnAPI.transform_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `transform(model, data...)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.transform_output_scitype`](@ref)`(algorithm)` | upper bound on `scitype(transform(model, ...))` | `Any` | `Table(Continuous)` | -| [`LearnAPI.transform_output_type`](@ref)`(algorithm)` | upper bound on `typeof(transform(model, ...))` | `Any` | `AbstractMatrix{<:Real}` | -| [`LearnAPI.predict_or_transform_mutates`](@ref)`(algorithm)` | `true` if `predict` or `transform` mutates first argument | `false` | `true` | - -¹ If the value is `0`, then the variable in boldface type is not supported and not -expected to appear in `data`. If `length(data)` is less than the trait value, then `data` -is understood to exclude the variable, but note that `fit` can have multiple signatures of -varying lengths, as in `fit(algorithm, X, y)` and `fit(algorithm, X, y, w)`. A non-zero -value is a promise that `fit` includes a signature of sufficient length to include the -variable. 
- +| trait | return value | fallback value | example | +|:----------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:---------------------------------------------------------| +| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | +| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(fit, predict, minimize, LearnAPI.algorithm, obs)` | +| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. | `()` | `(Distribution(), Interval())` | +| [`LearnAPI.target`](@ref)`(algorithm)` | `true` if target can appear in `fit` data | `false` | `true` | +| [`LearnAPI.weights`](@ref)`(algorithm)` | `true` if per-observation weights can appear in `fit` data | `false` | `true` | +| [`LearnAPI.descriptors`](@ref)`(algorithm)` | lists one or more suggestive algorithm descriptors from `LearnAPI.descriptors()` | `()` | (:regression, :probabilistic) | +| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | +| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | +| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | +| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | +| [`LearnAPI.load_path`](@ref)`(algorithm)` | a string indicating where the struct for `typeof(algorithm)` is defined, beginning with name of package providing implementation | `"unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | +| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties (fields) of `algorithm` may be an algorithm | `false` | `true` | +| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | +| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | +| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | +| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | +| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | +| [`LearnAPI.fit_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | +| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `fit(algorithm, 
data...)` works | `Union{}` | `Tuple{AbstractVector{<:Real}, Real}` |
+| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)`                 | upper bound on `scitype(data)` ensuring `predict(model, kind, data...)` works                                                      | `Union{}`                                              | `Table(Continuous)`                                       |
+| [`LearnAPI.predict_input_observation_scitype`](@ref)`(algorithm)`     | upper bound on `scitype(observation)` for `observation` in `data` ensuring `predict(model, kind, data...)` works                   | `Union{}`                                              | `Vector{Continuous}`                                      |
+| [`LearnAPI.predict_input_type`](@ref)`(algorithm)`                    | upper bound on `typeof(data)` ensuring `predict(model, kind, data...)` works                                                       | `Union{}`                                              | `AbstractMatrix{<:Real}`                                  |
+| [`LearnAPI.predict_input_observation_type`](@ref)`(algorithm)`        | upper bound on `typeof(observation)` for `observation` in `data` ensuring `predict(model, kind, data...)` works                    | `Union{}`                                              | `Vector{<:Real}`                                          |
+| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `scitype(predict(model, ...))`                                                                                      | `Any`                                                  | `AbstractVector{Continuous}`                              |
+| [`LearnAPI.predict_output_type`](@ref)`(algorithm, kind_of_proxy)`    | upper bound on `typeof(predict(model, ...))`                                                                                       | `Any`                                                  | `AbstractVector{<:Real}`                                  |
+| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)`               | upper bound on `scitype(data)` ensuring `transform(model, data...)` works                                                          | `Union{}`                                              | `Table(Continuous)`                                       |
+| [`LearnAPI.transform_input_observation_scitype`](@ref)`(algorithm)`   | upper bound on `scitype(observation)` for `observation` in `data` ensuring `transform(model, data...)` works                       | `Union{}`                                              | `Vector{Continuous}`                                      |
+| [`LearnAPI.transform_input_type`](@ref)`(algorithm)`                  | upper bound on `typeof(data)` ensuring `transform(model, data...)` works                                                           | `Union{}`                                              | `AbstractMatrix{<:Real}`                                  |
+| [`LearnAPI.transform_input_observation_type`](@ref)`(algorithm)`      | upper bound on `typeof(observation)` for `observation` in `data` ensuring `transform(model, data...)` works                        | `Union{}`                                              | `Vector{Continuous}`                                      |
+| [`LearnAPI.transform_output_scitype`](@ref)`(algorithm)`              | upper bound on `scitype(transform(model, ...))`                                                                                    | `Any`                                                  | `Table(Continuous)`                                       |
+| [`LearnAPI.transform_output_type`](@ref)`(algorithm)`                 | upper bound on `typeof(transform(model, ...))`                                                                                     | `Any`                                                  | `AbstractMatrix{<:Real}`                                  |
+| [`LearnAPI.predict_or_transform_mutates`](@ref)`(algorithm)`          | `true` if `predict` or `transform` mutates first argument                                                                          | `false`                                                | `true`                                                    |
 
 ### Derived Traits
 
@@ -117,10 +110,11 @@ informative (as in `LearnAPI.predict_type(algorithm) = Any`).
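+
+As an illustration, here is a hypothetical trait declaration for the
+`GradientRidgeRegressor` defined under "Algorithms" in the reference, using the
+[`LearnAPI.@trait`](@ref) shorthand (a sketch only; the trait values are illustrative):
+
+```julia
+@trait(
+    GradientRidgeRegressor,
+    constructor = GradientRidgeRegressor,
+    kinds_of_proxy = (LiteralTarget(),),
+    descriptors = (:regression,),
+)
+```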
## Reference
 
 ```@docs
+LearnAPI.constructor
 LearnAPI.functions
 LearnAPI.kinds_of_proxy
-LearnAPI.position_of_target
-LearnAPI.position_of_weights
+LearnAPI.target
+LearnAPI.weights
 LearnAPI.descriptors
 LearnAPI.is_pure_julia
 LearnAPI.pkg_name
@@ -129,6 +123,7 @@ LearnAPI.doc_url
 LearnAPI.load_path
 LearnAPI.is_composite
 LearnAPI.human_name
+LearnAPI.data_interface
 LearnAPI.iteration_parameter
 LearnAPI.fit_scitype
 LearnAPI.fit_type
@@ -147,4 +142,5 @@ LearnAPI.transform_input_observation_type
 LearnAPI.predict_or_transform_mutates
 LearnAPI.transform_output_scitype
 LearnAPI.transform_output_type
+LearnAPI.@trait
 ```
diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl
index 24626bcd..9ba6b54e 100644
--- a/src/LearnAPI.jl
+++ b/src/LearnAPI.jl
@@ -12,8 +12,7 @@ include("accessor_functions.jl")
 include("traits.jl")
 
 export @trait
-export fit, predict, transform, inverse_transform, fit_transform, minimize
-export obs, obsfit, obspredict, obstransform
+export fit, predict, transform, inverse_transform, minimize, obs
 
 for name in Symbol.(CONCRETE_TARGET_PROXY_TYPES_SYMBOLS)
     @eval export $name
diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl
index d20f1da2..b87a3ab1 100644
--- a/src/accessor_functions.jl
+++ b/src/accessor_functions.jl
@@ -160,6 +160,24 @@ $(DOC_IMPLEMENTED_METHODS(:training_losses)).
 """
 function training_losses end
 
+"""
+    LearnAPI.training_predictions(model)
+
+Return internally computed training predictions when running `model = fit(algorithm, ...)`
+for some `algorithm`.
+
+See also [`fit`](@ref).
+
+# New implementations
+
+Implement for iterative algorithms that compute and record training predictions as part
+of training (e.g. neural networks).
+
+$(DOC_IMPLEMENTED_METHODS(:training_predictions)).
+
+"""
+function training_predictions end
+
 """
     LearnAPI.training_scores(model)
 
@@ -227,6 +245,7 @@ const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS = (
     feature_importances,
     training_labels,
     training_losses,
+    training_predictions,
    training_scores,
     components,
 )
diff --git a/src/fit.jl b/src/fit.jl
index 010e53e0..316d0eab 100644
--- a/src/fit.jl
+++ b/src/fit.jl
@@ -6,179 +6,35 @@ const TRAINING_FUNCTIONS = (:fit,)
 # # FIT
 
 """
-    LearnAPI.fit(algorithm, data...; verbosity=1)
+    LearnAPI.fit(algorithm, data; verbosity=1)
+    LearnAPI.fit(algorithm; verbosity=1)
 
 Execute the algorithm with configuration `algorithm` using the provided training `data`,
 returning an object, `model`, on which other methods, such as [`predict`](@ref) or
 [`transform`](@ref), can be dispatched. [`LearnAPI.functions(algorithm)`](@ref) returns a
 list of methods that can be applied to either `algorithm` or `model`.
 
-# Arguments
+The second signature applies to algorithms which do not generalize to new observations. In
+that case `predict` or `transform` actually execute the algorithm, but may also write to
+the (mutable) object returned by `fit`.
 
-- `algorithm`: property-accessible object whose properties are the hyperparameters of
-  some ML/statistical algorithm
-
-$(DOC_ARGUMENTS(:fit))
-
-- `verbosity=1`: logging level; set to `0` for warnings only, and `-1` for silent training
+When `data` is a tuple, a data slurping form of `fit` is typically provided.
 
-See also [`obsfit`](@ref), [`predict`](@ref), [`transform`](@ref),
-[`inverse_transform`](@ref), [`LearnAPI.functions`](@ref), [`obs`](@ref).
- -# Extended help - -# New implementations - -LearnAPI.jl provides the following definition of `fit`, which is never directly overloaded: - -```julia -fit(algorithm, data...; verbosity=1) = - obsfit(algorithm, Obs(), obs(fit, algorithm, data...); verbosity) -``` - -Rather, new algorithms should overload [`obsfit`](@ref). See also [`obs`](@ref). - -""" -fit(algorithm, data...; verbosity=1) = - obsfit(algorithm, obs(fit, algorithm, data...), verbosity) - -""" - obsfit(algorithm, obsdata; verbosity=1) - -A lower-level alternative to [`fit`](@ref), this method consumes a pre-processed form of -user data. Specifically, the following two code snippets are equivalent: +When `data` is a tuple, a data slurping form of `fit` is typically provided. ```julia -model = fit(algorithm, data...) +model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` +ŷ = predict(model, X) ``` -and -```julia -obsdata = obs(fit, algorithm, data...) -model = obsfit(algorithm, obsdata) -``` - -Here `obsdata` is algorithm-specific, "observation-accessible" data, meaning it implements -the MLUtils.jl `getobs`/`numobs` interface for observation resampling (even if `data` does -not). Moreover, resampled versions of `obsdata` may be passed to `obsfit` in its place. +Use `verbosity=0` for warnings only, and `-1` for silent training. -The use of `obsfit` may offer performance advantages. See more at [`obs`](@ref). - -See also [`fit`](@ref), [`obs`](@ref). +See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref), +[`LearnAPI.functions`](@ref), [`obs`](@ref). # Extended help # New implementations -Implementation of the following method signature is compulsory for all new algorithms: - -```julia -LearnAPI.obsfit(algorithm, obsdata, verbosity) -``` - -Here `obsdata` has the form explained above. If [`obs`](@ref)`(fit, ...)` is not being -overloaded, then a fallback gives `obsdata = data` (always a tuple!). Note that -`verbosity` is a positional argument, not a keyword argument in the overloaded signature. - -New implementations must also implement [`LearnAPI.algorithm`](@ref). - -If overloaded, then the functions `LearnAPI.obsfit` and `LearnAPI.fit` must be included in -the tuple returned by the [`LearnAPI.functions(algorithm)`](@ref) trait. - -## Non-generalizing algorithms - -If the algorithm does not generalize to new data (e.g, DBSCAN clustering) then `data = ()` -and `obsfit` carries out no computation, as this happen entirely in a `transform` and/or -`predict` call. In such cases, `obsfit(algorithm, ...)` may return `algorithm`, but -another possibility is allowed: To provide a mechanism for `transform`/`predict` to report -byproducts of the computation (e.g., a list of boundary points in DBSCAN clustering) they -are allowed to *mutate* the `model` object returned by `obsfit`, which is then arranged to -be a mutable struct wrapping `algorithm` and fields to store the byproducts. In that case, -[`LearnAPI.predict_or_transform_mutates(algorithm)`](@ref) must be overloaded to return -`true`. - -""" -obsfit(algorithm, obsdata; verbosity=1) = - obsfit(algorithm, obsdata, verbosity) - - -# # UPDATE - -""" - LearnAPI.update!(algorithm, verbosity, fitted_params, state, data...) - -Based on the values of `state`, and `fitted_params` returned by a preceding call to -[`LearnAPI.fit`](@ref), [`LearnAPI.ingest!`](@ref), or [`LearnAPI.update!`](@ref), update a -algorithm's fitted parameters, returning new (or mutated) `state` and `fitted_params`. 
-
-Intended for retraining when the training data has not changed, but `algorithm`
-properties (hyperparameters) may have changed, e.g., when increasing an iteration
-parameter. Specifically, the assumption is that `data` have the same values
-seen in the most recent call to `fit/update!/ingest!`.
-
-For incremental training (same algorithm, new data) see instead [`LearnAPI.ingest!`](@ref).
-
-# Return value
-
-Same as [`LearnAPI.fit`](@ref), namely a tuple (`fitted_params`, `state`, `report`). See
-[`LearnAPI.fit`](@ref) for details.
-
-
-# New implementations
-
-Overloading this method is optional. A fallback calls `LearnAPI.fit`:
-
-```julia
-LearnAPI.update!(algorithm, verbosity, fitted_params, state, data...) =
-    fit(algorithm, verbosity, data)
-```
-$(DOC_IMPLEMENTED_METHODS(:fit))
-
-The most common use case is continuing training of an iterative algorithm: `state` is
-simply a copy of the algorithm used in the last training call (`fit`, `update!` or
-`ingest!`) and this will include the current number of iterations as a property. If
-`algorithm` and `state` differ only in the number of iterations (e.g., epochs in a neural
-network), which has increased, then the fitted parameters (network weights and biases) are
-updated, rather than computed from scratch. Otherwise, `update!` simply calls `fit`, to
-force retraining from scratch.
-
-It is permitted to return mutated versions of `state` and `fitted_params`.
-
-See also [`LearnAPI.fit`](@ref), [`LearnAPI.ingest!`](@ref).
-
-"""
-
-
-# # INGEST
-
-"""
-    LearnAPI.ingest!(algorithm, verbosity, fitted_params, state, data...)
-
-For an algorithm that supports incremental learning, update the fitted parameters using
-`data`, which has typically not been seen before. The arguments `state` and
-`fitted_params` are the output of a preceding call to [`LearnAPI.fit`](@ref),
-[`LearnAPI.ingest!`](@ref), or [`LearnAPI.update!`](@ref), of which mutated or new
-versions are returned.
-
-For updating fitted parameters using the *same* data but new hyperparameters, see instead
-[`LearnAPI.update!`](@ref).
-
-For training an algorithm with new hyperparameters but *unchanged* data, see instead
-[`LearnAPI.update!`](@ref).
-
-
-# Return value
-
-Same as [`LearnAPI.fit`](@ref), namely a tuple (`fitted_params`, `state`, `report`). See
-[`LearnAPI.fit`](@ref) for details.
-
-
-# New implementations
-
-Implementing this method is optional. It has no fallback.
-
-$(DOC_IMPLEMENTED_METHODS(:fit))
-
-See also [`LearnAPI.fit`](@ref), [`LearnAPI.update!`](@ref).
-
-"""
diff --git a/src/minimize.jl b/src/minimize.jl
index 173ee24f..f37b9d0a 100644
--- a/src/minimize.jl
+++ b/src/minimize.jl
@@ -5,7 +5,7 @@
 Return a version of `model` that will generally have a smaller memory allocation than
 `model`, suitable for serialization. Here `model` is any object returned by
 [`fit`](@ref). Accessor functions that can be called on `model` may not work on
-`minimize(model)`, but [`predict`](@ref), [`transform`](@ref) and
-[`inverse_transform`](@ref) will work, if implemented for `model`. Check
+`minimize(model)`, but [`predict`](@ref), [`transform`](@ref) and
+[`inverse_transform`](@ref) will work, if implemented. Check
 `LearnAPI.functions(LearnAPI.algorithm(model))` to see what the original `model`
 implements.
diff --git a/src/obs.jl b/src/obs.jl
index 75da42f4..0348d3da 100644
--- a/src/obs.jl
+++ b/src/obs.jl
@@ -1,17 +1,20 @@
 """
-    obs(func, algorithm, data...)
+    obs(algorithm, data)
+    obs(model, data)
 
-Where `func` is `fit`, `predict` or `transform`, return a combined, algorithm-specific,
-representation of `data...`, which can be passed directly to `obsfit`, `obspredict` or
-`obstransform`, as shown in the example below.
+Return an algorithm-specific representation of `data`, suitable for passing to `fit`
+(first signature) or to `predict` and `transform` (second signature), in place of
+`data`. Here `model` is the return value of `fit(algorithm, ...)` for some LearnAPI.jl
+algorithm, `algorithm`.
 
-The returned object implements the `getobs`/`numobs` observation-resampling interface
-provided by MLUtils.jl, even if `data` does not.
+The returned object is guaranteed to implement observation access as indicated
+by [`LearnAPI.data_interface(algorithm)`](@ref) (typically the
+[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface).
 
-Calling `func` on the returned object may be cheaper than calling `func` directly on
-`data...`. And resampling the returned object using `MLUtils.getobs` may be cheaper than
-directly resampling the components of `data` (an operation not provided by the LearnAPI.jl
-interface).
+Calling `fit`/`predict`/`transform` on the returned objects may have performance
+advantages over calling directly on `data` in some contexts. And resampling the returned
+object using `MLUtils.getobs` may be cheaper than directly resampling the components of
+`data`.
 
 # Example
 
 ```julia
 X = 
 y = 
 
 Xtrain = Tables.select(X, 1:100)
 ytrain = y[1:100]
-model = fit(algorithm, Xtrain, ytrain)
+model = fit(algorithm, (Xtrain, ytrain))
 ŷ = predict(model, LiteralTarget(), Tables.select(X, 101:150))
 ```
 
-Alternative workflow using `obs`:
+Alternative workflow using `obs` and the MLUtils.jl API:
 
 ```julia
 import MLUtils
 
-fitdata = obs(fit, algorithm, X, y)
-predictdata = obs(predict, algorithm, X)
+fit_observations = obs(algorithm, (X, y))
+model = fit(algorithm, MLUtils.getobs(fit_observations, 1:100))
 
-model = obsfit(algorithm, MLUtils.getobs(fitdata, 1:100))
-ẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predictdata, 101:150))
+predict_observations = obs(model, X)
+ẑ = predict(model, LiteralTarget(), MLUtils.getobs(predict_observations, 101:150))
 
 @assert ẑ == ŷ
 ```
 
-See also [`obsfit`](@ref), [`obspredict`](@ref), [`obstransform`](@ref).
+See also [`LearnAPI.data_interface`](@ref).
 
 # Extended help
 
 # New implementations
 
-If the `data` to be consumed in standard user calls to `fit`, `predict` or `transform`
-consists only of tables and arrays (with last dimension the observation dimension) then
-overloading `obs` is optional, but the user will get no performance benefits by using
-it. The implementation of `obs` is optional under more general circumstances stated at the
-end.
+Implementation is typically optional.
 
-The fallback for `obs` just slurps the provided data:
+For each supported form of `data` in `fit(algorithm, data)`, `predict(model, data)`, and
+`transform(model, data)`, it must be true that `model = fit(algorithm, observations)` is
+supported, whenever `observations = obs(algorithm, data)`, and that `predict(model,
+observations)` and `transform(model, observations)` are supported, whenever `observations
+= obs(model, data)`.
 
-```julia
-obs(func, alg, data...) = data
-```
+The fallback for `obs` is `obs(model_or_algorithm, data) = data`, and the fallback for
+`LearnAPI.data_interface(algorithm)` indicates MLUtils.jl as the adopted interface. For
+details refer to the [`LearnAPI.data_interface`](@ref) document string.
 
-The only contractual obligation of `obs` is to return an object implementing the
-`getobs`/`numobs` interface. Generally it suffices to overload `Base.getindex` and
-`Base.length`. However, note that implementations of [`obsfit`](@ref),
-[`obspredict`](@ref), and [`obstransform`](@ref) depend on the form of output of `obs`.
-
-$(DOC_IMPLEMENTED_METHODS(:(obs), overloaded=true))
+In particular, if the `data` to be consumed by `fit`, `predict` or `transform` consists
+only of suitable tables and arrays, then `obs` and `LearnAPI.data_interface` do not need
+to be overloaded. However, the user will get no performance benefits by using `obs` in
+that case.
 
 ## Sample implementation
 
-Suppose that `fit`, for an algorithm of type `Alg`, is to have the primary signature
-
-```julia
-fit(algorithm::Alg, X, y)
-```
-
-where `X` is a table, `y` a vector. Internally, the algorithm is to call a lower level
-function
-
-`train(A, names, y)`
-
-where `A = Tables.matrix(X)'` and `names` are the column names of `X`. Then relevant parts
-of an implementation might look like this:
-
-```julia
-# thin wrapper for algorithm-specific representation of data:
-struct ObsData{T}
-    A::Matrix{T}
-    names::Vector{Symbol}
-    y::Vector{T}
-end
-
-# (indirect) implementation of `getobs/numobs`:
-Base.getindex(data::ObsData, I) =
-    ObsData(data.A[:,I], data.names, y[I])
-Base.length(data::ObsData, I) = length(data.y)
-
-# implementation of `obs`:
-function LearnAPI.obs(::typeof(fit), ::Alg, X, y)
-    table = Tables.columntable(X)
-    names = Tables.columnnames(table) |> collect
-    return ObsData(Tables.matrix(table)', names, y)
-end
-
-# implementation of `obsfit`:
-function LearnAPI.obsfit(algorithm::Alg, data::ObsData; verbosity=1)
-    coremodel = train(data.A, data.names, data.y)
-    data.verbosity > 0 && @info "Training using these features: $names."
-    
-    return model
-end
-```
-
-## When is overloading `obs` optional?
-
-Overloading `obs` is optional, for a given `typeof(algorithm)` and `typeof(fun)`, if the
-components of `data` in the standard call `func(algorithm_or_model, data...)` are already
-expected to separately implement the `getobs`/`numbobs` interface. This is true for arrays
-whose last dimension is the observation dimension, and for suitable tables.
+Refer to the "Anatomy of an Implementation" section of the LearnAPI
+[manual](https://juliaai.github.io/LearnAPI.jl/dev/).
 
 """
-obs(func, alg, data...) = data
+obs(algorithm_or_model, data) = data
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index 71e4e730..a20598f8 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -8,20 +8,12 @@ const OPERATIONS = (:predict, :transform, :inverse_transform)
 const DOC_OPERATIONS_LIST_SYMBOL = join(map(op -> "`:$op`", OPERATIONS), ", ")
 const DOC_OPERATIONS_LIST_FUNCTION = join(map(op -> "`LearnAPI.$op`", OPERATIONS), ", ")
 
-DOC_ARGUMENTS(func) =
-"""
-- `data`: tuple of data objects with a common number of observations, for example,
-  `data = (X, y, w)` where `X` is a table of features, `y` is a target vector with the
-  same number of rows, and `w` a vector of per-observation weights.
-
-"""
-
 DOC_MUTATION(op) =
     """
 
     If [`LearnAPI.predict_or_transform_mutates(algorithm)`](@ref) is overloaded to
     return `true`, then `$op` may mutate its first argument, but not in a way that alters the
-    result of a subsequent call to `obspredict`, `obstransform` or
+    result of a subsequent call to `predict`, `transform` or
     `inverse_transform`. This is necessary for some non-generalizing algorithms but is
     otherwise discouraged. See more at [`fit`](@ref).
 
@@ -43,25 +35,20 @@ DOC_MINIMIZE(func) =
 
 # # METHOD STUBS/FALLBACKS
 
 """
-    predict(model, kind_of_proxy::LearnAPI.KindOfProxy, data...)
-    predict(model, data...)
-
-The first signature returns target or target proxy predictions for input features `data`,
-according to some `model` returned by [`fit`](@ref) or [`obsfit`](@ref). Where supported,
-these are literally target predictions if `kind_of_proxy = LiteralTarget()`, and
-probability density/mass functions if `kind_of_proxy = Distribution()`. List all options
-with [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), where `algorithm =
+    predict(model, kind_of_proxy::LearnAPI.KindOfProxy, data)
+    predict(model, data)
+
+The first signature returns target predictions, or proxies for target predictions, for
+input features `data`, according to some `model` returned by [`fit`](@ref). Where
+supported, these are literally target predictions if `kind_of_proxy = LiteralTarget()`,
+and probability density/mass functions if `kind_of_proxy = Distribution()`. List all
+options with [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), where `algorithm =
 LearnAPI.algorithm(model)`.
 
-The shortcut `predict(model, data...) = predict(model, LiteralTarget(), data...)` is also
-provided.
-
-# Arguments
+The shortcut `predict(model, data)` calls the first method with an algorithm-specific
+`kind_of_proxy`.
 
-- `model` is anything returned by a call of the form `fit(algorithm, ...)`, for some
-  LearnAPI-complaint `algorithm`.
-
-$(DOC_ARGUMENTS(:predict))
+The argument `model` is anything returned by a call of the form `fit(algorithm, ...)`.
 
 # Example
 
 In the following, `algorithm` is some supervised learning algorithm with
 training features `X`, training target `y`, and test features `Xnew`:
 
 ```julia
 model = fit(algorithm, X, y)
 predict(model, LiteralTarget(), Xnew)
 ```
 
 Note `predict` does not mutate any argument, except in the special case
 `LearnAPI.predict_or_transform_mutates(algorithm) = true`.
 
-See also [`obspredict`](@ref), [`fit`](@ref), [`transform`](@ref),
-[`inverse_transform`](@ref).
+See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref).
 
 # Extended help
 
 # New implementations
 
-LearnAPI.jl provides the following definition of `predict` which is never to be directly
-overloaded:
-
-```julia
-predict(model, kop::LearnAPI.KindOfProxy, data...) =
-    obspredict(model, kop, obs(predict, LearnAPI.algorithm(model), data...))
-```
-
-Rather, new algorithms overload [`obspredict`](@ref).
-
-"""
-predict(model, kind_of_proxy::KindOfProxy, data...) =
-    obspredict(model, kind_of_proxy, obs(predict, algorithm(model), data...))
-predict(model, data...) = predict(model, LiteralTarget(), data...)
-
-"""
-    obspredict(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata)
-
-Similar to `predict` but consumes algorithm-specific representations of input data,
-`obsdata`, as returned by `obs(predict, algorithm, data...)`. Here `data...` is the form of
-data expected in the main [`predict`](@ref) method. Alternatively, such `obsdata` may be
-replaced by a resampled version, where resampling is performed using `MLUtils.getobs`
-(always supported).
-
-For some algorithms and workflows, `obspredict` will have a performance benefit over
-[`predict`](@ref). See more at [`obs`](@ref).
-
-# Example
-
-In the following, `algorithm` is some supervised learning algorithm with
-training features `X`, training target `y`, and test features `Xnew`:
-
-```julia
-model = fit(algorithm, X, y)
-obsdata = obs(predict, algorithm, Xnew)
-ŷ = obspredict(model, LiteralTarget(), obsdata)
-@assert ŷ == predict(model, LiteralTarget(), Xnew)
-```
-
-See also [`predict`](@ref), [`fit`](@ref), [`transform`](@ref),
-[`inverse_transform`](@ref), [`obs`](@ref).
-
-# Extended help
-
-# New implementations
-
-Implementation of `obspredict` is optional, but required to enable `predict`. The method
-must also handle `obsdata` in the case it is replaced by `MLUtils.getobs(obsdata, I)` for
-some collection `I` of indices. If [`obs`](@ref) is not overloaded, then `obsdata = data`,
-where `data...` is what the standard [`predict`](@ref) call expects, as in the call
-`predict(model, kind_of_proxy, data...)`. Note `data` is always a tuple, even if `predict`
-has only one data argument. See more at [`obs`](@ref).
+If there is no notion of a "target" variable in the LearnAPI.jl sense, or you need an
+operation with an inverse, implement [`transform`](@ref) instead.
 
-$(DOC_MUTATION(:obspredict))
+Implementation is optional. If the first signature is implemented for some
+`kind_of_proxy`, then an implementation of the second, convenience form should also be
+provided, but it is free to choose the fallback `kind_of_proxy`. Each
+`kind_of_proxy` that gets an implementation must be added to the list returned by
+[`LearnAPI.kinds_of_proxy`](@ref).
 
-If overloaded, you must include both `LearnAPI.obspredict` and `LearnAPI.predict` in the
-list of methods returned by the [`LearnAPI.functions`](@ref) trait.
+$(DOC_IMPLEMENTED_METHODS(:predict))
 
-An implementation is provided for each kind of target proxy you wish to support. See the
-LearnAPI.jl documentation for options. Each supported `kind_of_proxy` instance should be
-listed in the return value of the [`LearnAPI.kinds_of_proxy(algorithm)`](@ref) trait.
+$(DOC_MINIMIZE(:predict))
 
-$(DOC_MINIMIZE(:obspredict))
+$(DOC_MUTATION(:predict))
 
 """
-function obspredict end
+function predict end
+
 
 """
-    transform(model, data...)
+    transform(model, data)
 
-Return a transformation of some `data`, using some `model`, as returned by [`fit`](@ref).
+Return a transformation of some `data`, using some `model`, as returned by
+[`fit`](@ref).
 
-# Arguments
+For `data` that consists of a tuple, a slurping version is typically provided, i.e.,
+`transform(model, X1, X2, X3)` in place of `transform(model, (X1, X2, X3))`.
 
-- `model` is anything returned by a call of the form `fit(algorithm, ...)`, for some
-  LearnAPI-complaint `algorithm`.
-
-$(DOC_ARGUMENTS(:transform))
 
 # Example
 
-Here `X` and `Xnew` are data of the same form:
+Below, `X` and `Xnew` are data of the same form.
+
+For an `algorithm` that generalizes to new data ("learns"):
 
 ```julia
-# For an algorithm that generalizes to new data ("learns"):
 model = fit(algorithm, X; verbosity=0)
 transform(model, Xnew)
+```
 
-# For a static (non-generalizing) transformer:
-model = fit(algorithm)
-transform(model, X)
+For a static (non-generalizing) transformer:
+
+```julia
+model = fit(algorithm)
+W = transform(model, X)
+```
+
+or, in one step:
+
+```julia
+W = transform(algorithm, X)
 ```
 
 Note `transform` does not mutate any argument, except in the special case
 `LearnAPI.predict_or_transform_mutates(algorithm) = true`.
 
-See also [`obstransform`](@ref), [`fit`](@ref), [`predict`](@ref),
-[`inverse_transform`](@ref).
+See also [`fit`](@ref), [`predict`](@ref),
+[`inverse_transform`](@ref).
 
 # Extended help
 
 # New implementations
 
-LearnAPI.jl provides the following definition of `transform` which is never to be directly
-overloaded:
-
-```julia
-transform(model, data...) =
-    obstransform(model, obs(predict, LearnAPI.algorithm(model), data...))
-```
-
-Rather, new algorithms overload [`obstransform`](@ref).
-
-"""
-transform(model, data...) =
-    obstransform(model, obs(transform, LearnAPI.algorithm(model), data...))
-
-"""
-    obstransform(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata)
-
-Similar to `transform` but consumes algorithm-specific representations of input data,
-`obsdata`, as returned by `obs(transform, algorithm, data...)`. Here `data...` is the
-form of data expected in the main [`transform`](@ref) method. Alternatively, such
-`obsdata` may be replaced by a resampled version, where resampling is performed using
-`MLUtils.getobs` (always supported).
-
-For some algorithms and workflows, `obstransform` will have a performance benefit over
-[`transform`](@ref). See more at [`obs`](@ref).
-
-# Example
-
-In the following, `algorithm` is some unsupervised learning algorithm with
-training features `X`, and test features `Xnew`:
-
-```julia
-model = fit(algorithm, X, y)
-obsdata = obs(transform, algorithm, Xnew)
-W = obstransform(model, obsdata)
-@assert W == transform(model, Xnew)
-```
-
-See also [`transform`](@ref), [`fit`](@ref), [`predict`](@ref),
-[`inverse_transform`](@ref), [`obs`](@ref).
-
-# Extended help
-
-# New implementations
-
-Implementation of `obstransform` is optional, but required to enable `transform`. The
-method must also handle `obsdata` in the case it is replaced by `MLUtils.getobs(obsdata,
-I)` for some collection `I` of indices. If [`obs`](@ref) is not overloaded, then `obsdata
-= data`, where `data...` is what the standard [`transform`](@ref) call expects, as in the
-call `transform(model, data...)`. Note `data` is always a tuple, even if `transform` has
-only one data argument. See more at [`obs`](@ref).
-
-$(DOC_MUTATION(:obstransform))
+Implementation for new LearnAPI.jl algorithms is optional.
+$(DOC_IMPLEMENTED_METHODS(:transform))
 
-If overloaded, you must include both `LearnAPI.obstransform` and `LearnAPI.transform` in
-the list of methods returned by the [`LearnAPI.functions`](@ref) trait.
+$(DOC_MINIMIZE(:transform))
 
-Each supported `kind_of_proxy` should be listed in the return value of the
-[`LearnAPI.kinds_of_proxy(algorithm)`](@ref) trait.
+$(DOC_MUTATION(:transform))
 
-$(DOC_MINIMIZE(:obstransform))
 
 """
-function obstransform end
+function transform end
+
 
 """
     inverse_transform(model, data)
 
 Inverse transform `data` according to some `model` returned by [`fit`](@ref). Here
 "inverse" is to be understood broadly, e.g., an approximate right inverse for
 [`transform`](@ref).
 
-# Arguments
-
-- `model`: anything returned by a call of the form `fit(algorithm, ...)`, for some
-  LearnAPI-complaint `algorithm`.
- -- `data`: something having the same form as the output of `transform(model, inputs...)` - # Example In the following, `algorithm` is some dimension-reducing algorithm that generalizes to new data (such as PCA); `Xtrain` is the training input and `Xnew` the input to be reduced: ```julia -model = fit(algorithm, Xtrain; verbosity=0) +model = fit(algorithm, Xtrain) W = transform(model, Xnew) # reduced version of `Xnew` Ŵ = inverse_transform(model, W) # embedding of `W` in original space ``` @@ -283,7 +166,7 @@ See also [`fit`](@ref), [`transform`](@ref), [`predict`](@ref). # New implementations -Implementation is optional. $(DOC_IMPLEMENTED_METHODS(:inverse_transform, )) +Implementation is optional. $(DOC_IMPLEMENTED_METHODS(:inverse_transform)) $(DOC_MINIMIZE(:inverse_transform)) diff --git a/src/tools.jl b/src/tools.jl index 7a211729..d86e3d8d 100644 --- a/src/tools.jl +++ b/src/tools.jl @@ -8,6 +8,27 @@ function name_value_pair(ex) return (ex.args[1], ex.args[2]) end +""" + @trait(TypeEx, trait1=value1, trait2=value2, ...) + +Overload a number of traits for algorithms of type `TypeEx`. For example, the code + +```julia +@trait( + RidgeRegressor, + descriptors = ("regression", ), + doc_url = "https://some.cool.documentation", +) +``` + +is equivalent to + +```julia +LearnAPI.descriptors(::RidgeRegressor) = ("regression", ), +LearnAPI.doc_url(::RidgeRegressor) = "https://some.cool.documentation", +``` + +""" macro trait(algorithm_ex, exs...) program = quote end for ex in exs @@ -20,28 +41,6 @@ macro trait(algorithm_ex, exs...) return esc(program) end -# """ -# typename(x) - -# Return a symbolic representation of the name of `type(x)`, stripped of any type-parameters -# and module qualifications. For example, if - -# typeof(x) = MLJBase.Machine{MLJAlgorithms.ConstantRegressor,true} - -# Then `typename(x)` returns `:Machine`. - -# """ -function typename(x) - M = typeof(x) - if isdefined(M, :name) - return M.name.name - elseif isdefined(M, :body) - return typename(M.body) - else - return Symbol(string(M)) - end -end - function is_uppercase(char::Char) i = Int(char) i > 64 && i < 91 diff --git a/src/traits.jl b/src/traits.jl index 73c3b03a..f5709206 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -13,12 +13,22 @@ DOC_ONLY_ONE(func) = "`LearnAPI.$(func)_observation_scitype`, "* "`LearnAPI.$(func)_observation_type`." +const DOC_EXPLAIN_EACHOBS = + """ + + Here, "for each `o` in `observations`" is understood in the sense of + [`LearnAPI.data_interface(algorithm)`](@ref). For example, if + `LearnAPI.data_interface(algorithm) == Base.HasLength()`, then this means "for `o` in + `MLUtils.eachobs(observations)`". + + """ const TRAITS = [ + :constructor, :functions, :kinds_of_proxy, - :position_of_target, - :position_of_weights, + :target, + :weights, :descriptors, :is_pure_julia, :pkg_name, @@ -28,6 +38,7 @@ const TRAITS = [ :is_composite, :human_name, :iteration_parameter, + :data_interface, :predict_or_transform_mutates, :fit_scitype, :fit_observation_scitype, @@ -48,18 +59,51 @@ const TRAITS = [ # # OVERLOADABLE TRAITS +""" + Learn.API.constructor(algorithm) + +Return a keyword constructor that can be used to clone `algorithm` or make copies with +selectively altered property values: + +```julia-repl +julia> algorithm.lambda +0.1 +julia> C = LearnAPI.constructor(algorithm) +julia> algorithm2 = C(lambda=0.2) +julia> algorithm2.lambda +0.2 +``` + +# New implementations + +All new implementations must overload this trait. 
It must be possible to recover an +algorithm from the constructor returned as follows: + +```julia +properties = propertynames(algorithm) +named_properties = NamedTuple{properties}(getproperty.(Ref(algorithm), properties)) +@assert algorithm == LearnAPI.constructor(algorithm)(; named_properties...) +``` + +The keyword constructor provided by `LearnAPI.constructor` must provide default values for +all properties, with the exception of those that can take other LearnAPI.jl algorithms as +values. + +""" +function constructor end + """ LearnAPI.functions(algorithm) -Return a tuple of functions that can be sensibly applied to `algorithm`, or to objects -having the same type as `algorithm`, or to associated models (objects returned by -`fit(algorithm, ...)`. Algorithm traits are excluded. +Return a tuple of functions that can be meaningfully applied with `algorithm`, or an +associate model (object returned by `fit(algorithm, ...)`, as the first +argument. Algorithm traits (`algorithm` is the *only* argument) are excluded. In addition to functions, the returned tuple may include expressions, like `:(DecisionTree.print_tree)`, which reference functions not owned by LearnAPI.jl. -The understanding is that `algorithm` is a LearnAPI-compliant object whenever this is -non-empty. +The understanding is that `algorithm` is a LearnAPI-compliant object whenever the return +value is non-empty. # Extended help @@ -68,18 +112,15 @@ non-empty. All new implementations must overload this trait. Here's a checklist for elements in the return value: -| function | needs explicit implementation? | include in returned tuple? | -|----------------------|---------------------------------|----------------------------------| -| `fit` | no | yes | -| `obsfit` | yes | yes | -| `minimize` | optional | yes | -| `predict` | no | if `obspredict` is implemented | -| `obspredict` | optional | if implemented | -| `transform` | no | if `obstransform` is implemented | -| `obstransform` | optional | if implemented | -| `obs` | optional | yes | -| `inverse_transform` | optional | if implemented | -| `LearnAPI.algorithm` | yes | yes | +| function | implementation/overloading compulsory? | include in returned tuple? | +|----------------------|----------------------------------------|----------------------------| +| `fit` | yes | yes | +| `minimize` | no | yes | +| `obs` | no | yes | +| `LearnAPI.algorithm` | yes | yes | +| `inverse_transform` | no | only if implemented | +| `predict` | no | only if implemented | +| `transform` | no | only if implemented | Also include any implemented accessor functions. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST. @@ -125,29 +166,40 @@ For more on target variables and target proxies, refer to the LearnAPI documenta kinds_of_proxy(::Any) = () """ - LearnAPI.position_of_target(algorithm) + LearnAPI.target(algorithm)::Bool + LearnAPI.target(algorithm, data) -> target -Return the expected position of the target variable within `data` in calls of the form -[`LearnAPI.fit`](@ref)`(algorithm, verbosity, data...)`. +First method (an algorithm trait) returns `true` if the second method returns a target +variable for some value(s) of `data`, where `data` is a supported argument in +[`fit(algorithm, data)`](@ref). -If this number is `0`, then no target is expected. If this number exceeds `length(data)`, -then `data` is understood to exclude the target variable. +# New implementations + +The trait fallback returns `false`. A fallback for the second method returns `nothing`. 
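+
+For example, a supervised algorithm whose `fit` data is a tuple `(X, y)` might make the
+declarations below (a sketch only; `SomeSupervisedAlgorithm` is a hypothetical type):
+
+```julia
+LearnAPI.target(::SomeSupervisedAlgorithm) = true
+LearnAPI.target(::SomeSupervisedAlgorithm, data) = last(data)
+```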
""" -position_of_target(::Any) = 0 +target(::Any) = false +target(::Any, data) = nothing """ - LearnAPI.position_of_weights(algorithm) + LearnAPI.weights(algorithm)::Bool + LearnAPI.target(algorithm, data) -> weights + +First method (an algorithm trait) returns `true` if the second method returns +per-observation weights, for some value(s) of `data`, where `data` is a supported argument +in [`fit(algorithm, data)`](@ref). -Return the expected position of per-observation weights within `data` in -calls of the form [`LearnAPI.fit`](@ref)`(algorithm, data...)`. +Otherwise, weights, if they apply, are assumed uniform. -If this number is `0`, then no weights are expected. If this number exceeds -`length(data)`, then `data` is understood to exclude weights, which are assumed to be -uniform. +# New implementations + +The trait fallback returns `false`. A fallback for the second method returns `nothing`, +which is interpreted as uniform weights. """ -position_of_weights(::Any) = 0 +weights(::Any) = false +weights(::Any, data) = nothing + descriptors() = [ :regression, @@ -289,8 +341,8 @@ is_composite(::Any) = false """ LearnAPI.human_name(algorithm) -A human-readable string representation of `typeof(algorithm)`. Primarily intended for -auto-generation of documentation. +Return a human-readable string representation of `typeof(algorithm)`. Primarily intended +for auto-generation of documentation. # New implementations @@ -302,6 +354,32 @@ to return `"K-nearest neighbors regressor"`. Ideally, this is a "concrete" noun """ human_name(M) = snakecase(name(M), delim=' ') # `name` defined below +""" + LearnAPI.data_interface(algorithm) + +Return the data interface supported by `algorithm` for accessing individual observations in +representations of input data returned by [`obs(algorithm, data)`](@ref) or [`obs(model, +data)`](@ref). Here `data` is `fit`, `predict`, or `transform`-consumable data. + +Options for the return value: + +- `Base.HasLength()`: Data returned by `obs` implements the + [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs/numobs` interface; it + usually suffices to overload `Base.getindex` and `Base.length` (which are the + `getobs/numobs` fallbacks). + +- `Base.SizeUnknown()`: Data returned by `obs` implements Julia's `iterate` + interface. + +See also [`obs`](@ref). + +# New implementations + +The fallback returns `Base.HasLength`. + +""" +data_interface(::Any) = Base.HasLength() + """ LearnAPI.predict_or_transform_mutates(algorithm) @@ -334,17 +412,9 @@ iteration_parameter(::Any) = nothing """ LearnAPI.fit_scitype(algorithm) -Return an upper bound on the scitype of `data` guaranteed to work when calling -`fit(algorithm, data...)`. - -Specifically, if the return value is `S` and `ScientificTypes.scitype(data) <: S`, then -all the following calls are guaranteed to work: - -```julia -fit(algorithm, data...) -obsdata = obs(fit, algorithm, data...) -fit(algorithm, Obs(), obsdata) -``` +Return an upper bound `S` on the scitype of `data` guaranteed to work when calling +`fit(algorithm, data)`: if `ScientificTypes.scitype(data) <: S`, then is `fit(algorithm, +data)` is supported. See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_observation_scitype`](@ref), [`LearnAPI.fit_observation_type`](@ref). @@ -359,27 +429,12 @@ fit_scitype(::Any) = Union{} """ LearnAPI.fit_observation_scitype(algorithm) -Return an upper bound on the scitype of observations guaranteed to work when calling -`fit(algorithm, data...)`, independent of the type/scitype of the data container -itself. 
Here "observations" is in the sense of MLUtils.jl. Assuming this trait has -value different from `Union{}` the understanding is that `data` implements the MLUtils.jl -`getobs`/`numobs` interface. - -Specifically, denoting the type returned above by `S`, supposing `S != Union{}`, and that -user supplies `data` satisfying - -```julia -ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S -``` - -for any valid index `i`, then all the following are guaranteed to work: +Return an upper bound `S` on the scitype of individual observations guaranteed to work +when calling `fit`: if `observations = obs(algorithm, data)` and +`ScientificTypes.scitype(o) <:S` for each `o` in `observations`, then the call +`fit(algorithm, data)` is supported. - -```julia -fit(algorithm, data....) -obsdata = obs(fit, algorithm, data...) -fit(algorithm, Obs(), obsdata) -``` +$DOC_EXPLAIN_EACHOBS See also See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref), [`LearnAPI.fit_observation_type`](@ref). @@ -394,17 +449,8 @@ fit_observation_scitype(::Any) = Union{} """ LearnAPI.fit_type(algorithm) -Return an upper bound on the type of `data` guaranteed to work when calling -`fit(algorithm, data...)`. - -Specifically, if the return value is `T` and `typeof(data) <: T`, then -all the following calls are guaranteed to work: - -```julia -fit(algorithm, data...) -obsdata = obs(fit, algorithm, data...) -fit(algorithm, Obs(), obsdata) -``` +Return an upper bound `T` on the type of `data` guaranteed to work when calling +`fit(algorithm, data)`: if `typeof(data) <: T`, then `fit(algorithm, data)` is supported. See also [`LearnAPI.fit_scitype`](@ref), [`LearnAPI.fit_observation_type`](@ref). [`LearnAPI.fit_observation_scitype`](@ref) @@ -419,26 +465,12 @@ fit_type(::Any) = Union{} """ LearnAPI.fit_observation_type(algorithm) -Return an upper bound on the type of observations guaranteed to work when calling -`fit(algorithm, data...)`, independent of the type/scitype of the data container -itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has value -different from `Union{}` the understanding is that `data` implements the MLUtils.jl -`getobs`/`numobs` interface. +Return an upper bound `T` on the type of individual observations guaranteed to work +when calling `fit`: if `observations = obs(algorithm, data)` and +`typeof(o) <:S` for each `o` in `observations`, then the call +`fit(algorithm, data)` is supported. -Specifically, denoting the type returned above by `T`, supposing `T != Union{}`, and that -user supplies `data` satisfying - -```julia -typeof(MLUtils.getobs(data, i)) <: T -``` - -for any valid index `i`, then the following is guaranteed to work: - -```julia -fit(algorithm, data....) -obsdata = obs(fit, algorithm, data...) -fit(algorithm, Obs(), obsdata) -``` +$DOC_EXPLAIN_EACHOBS See also See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref), [`LearnAPI.fit_observation_scitype`](@ref). @@ -456,18 +488,9 @@ function DOC_INPUT_SCITYPE(op) """ LearnAPI.$(op)_input_scitype(algorithm) - Return an upper bound on the scitype of `data` guaranteed to work in the call - `$op(algorithm,$extra data...)`. - - Specifically, if `S` is the value returned and `ScientificTypes.scitype(data) <: S`, - then the following is guaranteed to work: - - ```julia - $op(model,$extra data...) - obsdata = obs($op, algorithm, data...) - $op(model,$extra Obs(), obsdata) - ``` - whenever `algorithm = LearnAPI.algorithm(model)`. 
+ Return an upper bound `S` on the scitype of `data` guaranteed to work in the call + `$op(algorithm,$extra data)`: if `ScientificTypes.scitype(data) <: S`, + then `$op(algorithm,$extra data)` is supported. See also [`LearnAPI.$(op)_input_type`](@ref). @@ -484,27 +507,12 @@ function DOC_INPUT_OBSERVATION_SCITYPE(op) """ LearnAPI.$(op)_observation_scitype(algorithm) - Return an upper bound on the scitype of observations guaranteed to work when calling - `$op(model,$extra data...)`, independent of the type/scitype of the data container - itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has - value different from `Union{}` the understanding is that `data` implements the - MLUtils.jl `getobs`/`numobs` interface. + Return an upper bound `S` on the scitype of individual observations guaranteed to work + when calling `$op`: if `observations = obs(model, data)`, for some `model` returned by + `fit(algorithm, ...)`, and `ScientificTypes.scitype(o) <: S` for each `o` in + `observations`, then the call `$(op)(model,$extra data)` is supported. - Specifically, denoting the type returned above by `S`, supposing `S != Union{}`, and - that user supplies `data` satisfying - - ```julia - ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S - ``` - - for any valid index `i`, then all the following are guaranteed to work: - - ```julia - $op(model,$extra data...) - obsdata = obs($op, algorithm, data...) - $op(model,$extra Obs(), obsdata) - ``` - whenever `algorithm = LearnAPI.algorithm(model)`. + $DOC_EXPLAIN_EACHOBS See also See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref), [`LearnAPI.fit_observation_type`](@ref). @@ -522,19 +530,11 @@ function DOC_INPUT_TYPE(op) """ LearnAPI.$(op)_input_type(algorithm) - Return an upper bound on the type of `data` guaranteed to work in the call - `$op(algorithm,$extra data...)`. - - Specifically, if `T` is the value returned and `typeof(data) <: T`, then the following - is guaranteed to work: - - ```julia - $op(model,$extra data...) - obsdata = obs($op, model, data...) - $op(model,$extra Obs(), obsdata) - ``` + Return an upper bound `T` on the scitype of `data` guaranteed to work in the call + `$op(algorithm,$extra data)`: if `typeof(data) <: T`, + then `$op(algorithm,$extra data)` is supported. - See also [`LearnAPI.$(op)_input_scitype`](@ref). + See also [`LearnAPI.$(op)_input_type`](@ref). # New implementations @@ -550,27 +550,12 @@ function DOC_INPUT_OBSERVATION_TYPE(op) """ LearnAPI.$(op)_observation_type(algorithm) - Return an upper bound on the type of observations guaranteed to work when calling - `$op(model,$extra data...)`, independent of the type/scitype of the data container - itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has - value different from `Union{}` the understanding is that `data` implements the - MLUtils.jl `getobs`/`numobs` interface. - - Specifically, denoting the type returned above by `T`, supposing `T != Union{}`, and - that user supplies `data` satisfying - - ```julia - typeof(MLUtils.getobs(data, i)) <: T - ``` - - for any valid index `i`, then all the following are guaranteed to work: + Return an upper bound `T` on the scitype of individual observations guaranteed to work + when calling `$op`: if `observations = obs(model, data)`, for some `model` returned by + `fit(algorithm, ...)`, and `typeof(o) <: T` for each `o` in + `observations`, then the call `$(op)(model,$extra data)` is supported. - ```julia - $op(model,$extra data...) 
- obsdata = obs($op, algorithm, data...) - $op(model,$extra Obs(), obsdata) - ``` - whenever `algorithm = LearnAPI.algorithm(model)`. + $DOC_EXPLAIN_EACHOBS See also See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref), [`LearnAPI.fit_observation_type`](@ref). @@ -649,19 +634,19 @@ const DOC_PREDICT_OUTPUT(s) = Return an upper bound for the $(s)s of predictions of the specified form where supported, and otherwise return `Any`. For example, if - ŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...) + ŷ = predict(model, Distribution(), data) successfully returns (i.e., `algorithm` supports predictions of target probability distributions) then the following is guaranteed to hold: - $(s)(ŷ) <: LearnAPI.predict_output_$(s)(algorithm, LearnAPI.Distribution()) + $(s)(ŷ) <: predict_output_$(s)(algorithm, Distribution()) **Note.** This trait has a single-argument "convenience" version `LearnAPI.predict_output_$(s)(algorithm)` derived from this one, which returns a dictionary keyed on target proxy types. - See also [`LearnAPI.KindOfProxy`](@ref), [`LearnAPI.predict`](@ref), - [`LearnAPI.predict_input_$(s)`](@ref). + See also [`LearnAPI.KindOfProxy`](@ref), [`predict`](@ref), + [`predict_input_$(s)`](@ref). # New implementations @@ -685,7 +670,7 @@ predict_output_type(algorithm, kind_of_proxy) = Any # # DERIVED TRAITS -name(A) = string(typename(A)) +name(A) = split(string(constructor(A)), ".") |> last is_algorithm(A) = !isempty(functions(A)) @@ -703,14 +688,14 @@ const DOC_PREDICT_OUTPUT2(s) = As an example, if - ŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...) + ŷ = predict(model, Distribution(), data...) successfully returns (i.e., `algorithm` supports predictions of target probability distributions) then the following is guaranteed to hold: - $(s)(ŷ) <: LearnAPI.predict_output_$(s)s(algorithm)[LearnAPI.Distribution] + $(s)(ŷ) <: LearnAPI.predict_output_$(s)s(algorithm)[Distribution] - See also [`LearnAPI.KindOfProxy`](@ref), [`LearnAPI.predict`](@ref), + See also [`LearnAPI.KindOfProxy`](@ref), [`predict`](@ref), [`LearnAPI.predict_input_$(s)`](@ref). # New implementations diff --git a/src/types.jl b/src/types.jl index e72c159e..e77c4cb7 100644 --- a/src/types.jl +++ b/src/types.jl @@ -1,28 +1,6 @@ # # TARGET PROXIES -const DOC_HOW_TO_LIST_PROXIES = - "Run `LearnAPI.CONCRETE_TARGET_PROXY_TYPES` "* - " to list all options. " - - -""" - - LearnAPI.KindOfProxy - -Abstract type whose concrete subtypes `T` each represent a different kind of proxy for -some target variable, associated with some algorithm. Instances `T()` are used to request -the form of target predictions in [`predict`](@ref) calls. - -See LearnAPI.jl documentation for an explanation of "targets" and "target proxies". - -For example, `Distribution` is a concrete subtype of `LearnAPI.KindOfProxy` and a call -like `predict(model, Distribution(), Xnew)` returns a data object whose observations are -probability density/mass functions, assuming `algorithm` supports predictions of that -form. - -$DOC_HOW_TO_LIST_PROXIES - -""" +# see later for doc string: abstract type KindOfProxy end """ @@ -32,7 +10,7 @@ Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). If `kind_of_proxy` is an ins `LearnAPI.IID` then, given `data` constisting of ``n`` observations, the following must hold: -- `ŷ = LearnAPI.predict(model, kind_of_proxy, data...)` is +- `ŷ = LearnAPI.predict(model, kind_of_proxy, data)` is data also consisting of ``n`` observations. 
- The ``j``th observation of `ŷ`, for any ``j``, depends only on the ``j``th @@ -53,22 +31,50 @@ struct Parametric <: IID end struct LabelAmbiguous <: IID end struct LabelAmbiguousSampleable <: IID end struct LabelAmbiguousDistribution <: IID end +struct LabelAmbiguousFuzzy <: IID end struct ConfidenceInterval <: IID end -struct Set <: IID end -struct ProbabilisticSet <: IID end +struct Fuzzy <: IID end +struct ProbabilisticFuzzy <: IID end struct SurvivalFunction <: IID end struct SurvivalDistribution <: IID end +struct HazardFunction <: IID end struct OutlierScore <: IID end struct Continuous <: IID end -# struct None <: KindOfProxy end -struct JointSampleable <: KindOfProxy end -struct JointDistribution <: KindOfProxy end -struct JointLogDistribution <: KindOfProxy end +""" + Joint <: KindOfProxy + +Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). If `kind_of_proxy` is an instance of +`LearnAPI.Joint` then, given `data` consisting of ``n`` observations, `predict(model, +kind_of_proxy, data)` represents a *single* probability distribution for the sample +space ``Y^n``, where ``Y`` is the space from which the target variable takes its values. + +""" +abstract type Joint <: KindOfProxy end +struct JointSampleable <: Joint end +struct JointDistribution <: Joint end +struct JointLogDistribution <: Joint end + +""" + Single <: KindOfProxy + +Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). It applies only to algorithms for +which `predict` has no data argument, i.e., is of the form `predict(model, +kind_of_proxy)`. An example is an algorithm learning a probability distribution from +samples, and we regard the samples as drawn from the "target" variable. If in this case, +`kind_of_proxy` is an instance of `LearnAPI.Single` then, `predict(algorithm)` returns a +single object representing a probability distribution. + +""" +abstract type Single <: KindOfProxy end +struct SingleSampeable <: Single end +struct SingleDistribution <: Single end +struct SingleLogDistribution <: Single end const CONCRETE_TARGET_PROXY_TYPES = [ subtypes(IID)..., - setdiff(subtypes(KindOfProxy), subtypes(IID))..., + subtypes(Single)..., + subtypes(Joint)..., ] const CONCRETE_TARGET_PROXY_TYPES_SYMBOLS = map(CONCRETE_TARGET_PROXY_TYPES) do T @@ -82,3 +88,28 @@ const CONCRETE_TARGET_PROXY_TYPES_LIST = join( ", ", " and ", ) + +const DOC_HOW_TO_LIST_PROXIES = + "The instances of [`LearnAPI.KindOfProxy`](@ref) are: "* + "$(LearnAPI.CONCRETE_TARGET_PROXY_TYPES_LIST). " + + +""" + + LearnAPI.KindOfProxy + +Abstract type whose concrete subtypes `T` each represent a different kind of proxy for +some target variable, associated with some algorithm. Instances `T()` are used to request +the form of target predictions in [`predict`](@ref) calls. + +See LearnAPI.jl documentation for an explanation of "targets" and "target proxies". + +For example, `Distribution` is a concrete subtype of `LearnAPI.KindOfProxy` and a call +like `predict(model, Distribution(), Xnew)` returns a data object whose observations are +probability density/mass functions, assuming `algorithm` supports predictions of that +form. 
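+
+A sketch of the corresponding workflow, assuming `algorithm` supports `Distribution()`
+predictions, `data` is training data, and `Xnew` is new input data:
+
+```julia
+model = fit(algorithm, data)
+ŷ = predict(model, Distribution(), Xnew)  # observations are pdf/pmf objects
+```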
+ +$DOC_HOW_TO_LIST_PROXIES + +""" +KindOfProxy diff --git a/test/integration/regression.jl b/test/integration/regression.jl index 2c5d9d70..ee419d21 100644 --- a/test/integration/regression.jl +++ b/test/integration/regression.jl @@ -13,10 +13,10 @@ import DataFrames struct Ridge lambda::Float64 end -Ridge(; lambda=0.1) = Ridge(lambda) +Ridge(; lambda=0.1) = Ridge(lambda) # LearnAPI.constructor defined later -struct RidgeFitObs{T} - A::Matrix{T} # p x n +struct RidgeFitObs{T,M<:AbstractMatrix{T}} + A::M # p x n names::Vector{Symbol} y::Vector{T} end @@ -27,23 +27,28 @@ struct RidgeFitted{T,F} feature_importances::F end +LearnAPI.algorithm(model::RidgeFitted) = model.algorithm + Base.getindex(data::RidgeFitObs, I) = RidgeFitObs(data.A[:,I], data.names, data.y[I]) Base.length(data::RidgeFitObs, I) = length(data.y) -function LearnAPI.obs(::typeof(fit), ::Ridge, X, y) +# observations for consumption by `fit`: +function LearnAPI.obs(::Ridge, data) + X, y = data table = Tables.columntable(X) names = Tables.columnnames(table) |> collect - RidgeFitObs(Tables.matrix(table, transpose=true), names, y) + RidgeFitObs(Tables.matrix(table)', names, y) end -function LearnAPI.obsfit(algorithm::Ridge, fitdata::RidgeFitObs, verbosity) +# for observations: +function LearnAPI.fit(algorithm::Ridge, observations::RidgeFitObs; verbosity=1) # unpack hyperparameters and data: lambda = algorithm.lambda - A = fitdata.A - names = fitdata.names - y = fitdata.y + A = observations.A + names = observations.names + y = observations.y # apply core algorithm: coefficients = (A*A' + algorithm.lambda*I)\(A*y) # 1 x p matrix @@ -61,12 +66,31 @@ function LearnAPI.obsfit(algorithm::Ridge, fitdata::RidgeFitObs, verbosity) end -LearnAPI.algorithm(model::RidgeFitted) = model.algorithm +# for unprocessed `data = (X, y)`: +LearnAPI.fit(algorithm::Ridge, data; kwargs...) = + fit(algorithm, obs(algorithm, data); kwargs...) + +# for convenience: +LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = + fit(algorithm, (X, y); kwargs...) 
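+
+# A usage sketch (with hypothetical table `X` and vector `y`): the two calls below
+# are equivalent:
+#
+#     fit(algorithm, (X, y))  # core signature
+#     fit(algorithm, X, y)    # convenience signature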
-LearnAPI.obspredict(model::RidgeFitted, ::LiteralTarget, Anew::Matrix) = - ((model.coefficients)'*Anew)' +# to extract the target: +LearnAPI.target(::Ridge, data) = last(data) +LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y -LearnAPI.obs(::typeof(predict), ::Ridge, X) = Tables.matrix(X, transpose=true) +# observations for consumption by `predict`: +LearnAPI.obs(::RidgeFitted, X) = Tables.matrix(X)' + +# matrix input: +LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, observations::AbstractMatrix) = + observations'*model.coefficients + +# tabular input: +LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = + predict(model, LiteralTarget(), obs(model, Xnew)) + +# convenience method: +LearnAPI.predict(model::RidgeFitted, data) = predict(model, LiteralTarget(), data) LearnAPI.feature_importances(model::RidgeFitted) = model.feature_importances @@ -75,21 +99,20 @@ LearnAPI.minimize(model::RidgeFitted) = @trait( Ridge, - position_of_target=2, + constructor = Ridge, + target=true, kinds_of_proxy = (LiteralTarget(),), functions = ( fit, - obsfit, minimize, predict, - obspredict, obs, LearnAPI.algorithm, LearnAPI.feature_importances, ) ) -n = 10 # number of observations +n = 30 # number of observations train = 1:6 test = 7:10 a, b, c = rand(n), rand(n), rand(n) @@ -112,7 +135,7 @@ y = 2a - b + 3c + 0.05*rand(n) ), ) - # quite fitting: + # quiet fitting: model = @test_logs( fit( algorithm, @@ -126,10 +149,10 @@ y = 2a - b + 3c + 0.05*rand(n) @test ŷ isa Vector{Float64} @test predict(model, Tables.subset(X, test)) == ŷ - fitdata = LearnAPI.obs(fit, algorithm, X, y) - predictdata = LearnAPI.obs(predict, algorithm, X) - model = obsfit(algorithm, MLUtils.getobs(fitdata, train); verbosity=1) - @test obspredict(model, LiteralTarget(), MLUtils.getobs(predictdata, test)) == ŷ + fitobs = LearnAPI.obs(algorithm, (X, y)) + predictobs = LearnAPI.obs(model, X) + model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) + @test predict(model, LiteralTarget(), MLUtils.getobs(predictobs, test)) ≈ ŷ @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}} @@ -140,11 +163,15 @@ y = 2a - b + 3c + 0.05*rand(n) recovered_model = deserialize(filename) @test LearnAPI.algorithm(recovered_model) == algorithm - @test obspredict( + @test predict( recovered_model, LiteralTarget(), - MLUtils.getobs(predictdata, test) - ) == ŷ + MLUtils.getobs(predictobs, test) + ) ≈ ŷ + + @test LearnAPI.target(algorithm, (X, y)) == y + @test LearnAPI.target(algorithm, fitobs) == y + end # # VARIATION OF RIDGE REGRESSION THAT USES FALLBACK OF LearnAPI.obs @@ -152,7 +179,7 @@ end struct BabyRidge lambda::Float64 end -BabyRidge(; lambda=0.1) = BabyRidge(lambda) +BabyRidge(; lambda=0.1) = BabyRidge(lambda) # LearnAPI.constructor defined later struct BabyRidgeFitted{T,F} algorithm::BabyRidge @@ -160,18 +187,17 @@ struct BabyRidgeFitted{T,F} feature_importances::F end -function LearnAPI.obsfit(algorithm::BabyRidge, data, verbosity) +function LearnAPI.fit(algorithm::BabyRidge, data; verbosity=1) X, y = data lambda = algorithm.lambda - table = Tables.columntable(X) names = Tables.columnnames(table) |> collect - A = Tables.matrix(table, transpose=true) + A = Tables.matrix(table)' # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # 1 x p matrix + coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector feature_importances = nothing @@ -179,25 +205,29 @@ function LearnAPI.obsfit(algorithm::BabyRidge, data, verbosity) end +LearnAPI.target(::BabyRidge, data) = last(data) + +# 
convenience form:
+LearnAPI.fit(algorithm::BabyRidge, X, y; kwargs...) =
+    fit(algorithm, (X, y); kwargs...)
+
 LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm

-function LearnAPI.obspredict(model::BabyRidgeFitted, ::LiteralTarget, data)
-    X = only(data)
-    Anew = Tables.matrix(X, transpose=true)
-    return ((model.coefficients)'*Anew)'
-end
+LearnAPI.predict(model::BabyRidgeFitted, ::LiteralTarget, Xnew) =
+    Tables.matrix(Xnew)*model.coefficients
+
+LearnAPI.minimize(model::BabyRidgeFitted) =
+    BabyRidgeFitted(model.algorithm, model.coefficients, nothing)

 @trait(
     BabyRidge,
-    position_of_target=2,
+    constructor = BabyRidge,
+    target=true,
     kinds_of_proxy = (LiteralTarget(),),
     functions = (
         fit,
-        obsfit,
         minimize,
         predict,
-        obspredict,
-        obs,
         LearnAPI.algorithm,
         LearnAPI.feature_importances,
     )
 )

@@ -210,10 +240,12 @@ end
     ŷ = predict(model, LiteralTarget(), Tables.subset(X, test))
     @test ŷ isa Vector{Float64}

-    fitdata = obs(fit, algorithm, X, y)
-    predictdata = LearnAPI.obs(predict, algorithm, X)
-    model = obsfit(algorithm, MLUtils.getobs(fitdata, train); verbosity=0)
-    @test obspredict(model, LiteralTarget(), MLUtils.getobs(predictdata, test)) == ŷ
+    fitobs = obs(algorithm, (X, y))
+    predictobs = LearnAPI.obs(model, X)
+    model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0)
+    @test predict(model, LiteralTarget(), MLUtils.getobs(predictobs, test)) == ŷ
+
+    @test LearnAPI.target(algorithm, (X, y)) == y
 end

 true
diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl
index e5295ddc..3991dbf4 100644
--- a/test/integration/static_algorithms.jl
+++ b/test/integration/static_algorithms.jl
@@ -13,13 +13,15 @@ import DataFrames
 struct Selector
     names::Vector{Symbol}
 end
-Selector(; names=Symbol[]) = Selector(names)
+Selector(; names=Symbol[]) = Selector(names) # LearnAPI.constructor defined later

-LearnAPI.obsfit(algorithm::Selector, obsdata, verbosity) = algorithm
-LearnAPI.algorithm(model) = model # i.e., the algorithm
+# `fit` has no input data, does no "learning", and just returns a thinly wrapped `algorithm`
+# (to distinguish it from the algorithm in dispatch):
+LearnAPI.fit(algorithm::Selector; verbosity=1) = Ref(algorithm)
+LearnAPI.algorithm(model) = model[]

-function LearnAPI.obstransform(algorithm::Selector, obsdata)
-    X = only(obsdata)
+function LearnAPI.transform(model::Base.RefValue{Selector}, X)
+    algorithm = LearnAPI.algorithm(model)
     table = Tables.columntable(X)
     names = Tables.columnnames(table)
     filtered_names = filter(in(algorithm.names), names)
@@ -28,23 +30,31 @@

     return Tables.materializer(X)(filtered_table)
 end

-@trait Selector functions = (
-    fit,
-    obsfit,
-    minimize,
-    transform,
-    obstransform,
-    obs,
-    Learn.algorithm,
+# fit and transform in one go:
+function LearnAPI.transform(algorithm::Selector, X)
+    model = fit(algorithm)
+    transform(model, X)
+end
+
+@trait(
+    Selector,
+    constructor = Selector,
+    functions = (
+        fit,
+        minimize,
+        transform,
+        LearnAPI.algorithm,
+    ),
 )

@testset "test a static transformer" begin
    algorithm = Selector(names=[:x, :w])
    X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w])
    model = fit(algorithm) # no data arguments!
-    @test model == algorithm
-    @test transform(model, X) ==
-        DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w])
+    @test LearnAPI.algorithm(model) == algorithm
+    W = transform(model, X)
+    @test W == DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w])
+    @test W == transform(algorithm, X)
 end


@@ -56,7 +66,7 @@ end
 struct Selector2
     names::Vector{Symbol}
 end
-Selector2(; names=Symbol[]) = Selector2(names)
+Selector2(; names=Symbol[]) = Selector2(names) # LearnAPI.constructor defined later

 mutable struct Selector2Fit
     algorithm::Selector2
@@ -66,13 +76,11 @@ end
 LearnAPI.algorithm(model::Selector2Fit) = model.algorithm
 rejected(model::Selector2Fit) = model.rejected

-# Here `obsdata=()` and we are just wrapping `algorithm` with a place-holder for
-# the `rejected` feature names.
-LearnAPI.obsfit(algorithm::Selector2, obsdata, verbosity) = Selector2Fit(algorithm)
+# Here we are wrapping `algorithm` with a place-holder for the `rejected` feature names.
+LearnAPI.fit(algorithm::Selector2; verbosity=1) = Selector2Fit(algorithm)

-# output the filtered table and add `rejected` field to model (mutatated)
-function LearnAPI.obstransform(model::Selector2Fit, obsdata)
-    X = only(obsdata)
+# output the filtered table and add `rejected` field to model (mutated!)
+function LearnAPI.transform(model::Selector2Fit, X)
     table = Tables.columntable(X)
     names = Tables.columnnames(table)
     keep = LearnAPI.algorithm(model).names
@@ -83,16 +91,21 @@ end
     return Tables.materializer(X)(filtered_table)
 end

+# fit and transform in one step:
+function LearnAPI.transform(algorithm::Selector2, X)
+    model = fit(algorithm)
+    transform(model, X)
+end
+
 @trait(
     Selector2,
+    constructor = Selector2,
     predict_or_transform_mutates = true,
     functions = (
         fit,
         minimize,
         transform,
-        obstransform,
-        obs,
         LearnAPI.algorithm,
         :(MyPkg.rejected), # accessor function not owned by LearnAPI.jl
     )
 )

@@ -106,6 +119,7 @@ end
     @test LearnAPI.algorithm(model) == algorithm
     filtered = DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w])
     @test transform(model, X) == filtered
+    @test transform(algorithm, X) == filtered
     @test rejected(model) == [:y, :z]
 end

diff --git a/test/runtests.jl b/test/runtests.jl
index 8697a248..93788bc4 100644
--- a/test/runtests.jl
+++ b/test/runtests.jl
@@ -4,6 +4,10 @@ using Test
     include("tools.jl")
 end

+@testset "traits.jl" begin
+    include("traits.jl")
+end
+
 # # INTEGRATION TESTS

 @testset "regression" begin
diff --git a/test/tools.jl b/test/tools.jl
index 1b2e942f..523f40e1 100644
--- a/test/tools.jl
+++ b/test/tools.jl
@@ -1,6 +1,5 @@
 using LearnAPI
 using Test
-using SparseArrays

 module Fruit
 using LearnAPI
@@ -22,13 +21,6 @@ import .Fruit

 ## HELPERS

-@testset "typename" begin
-    @test LearnAPI.typename(Fruit.RedApple(1)) == :RedApple
-    @test LearnAPI.typename(nothing) == :Nothing
-    m = SparseArrays.sparse([1,2], [1,3], [0.5, 0.6])
-    @test LearnAPI.typename(m) == :SparseMatrixCSC
-end
-
 @testset "snakecase" begin
     @test LearnAPI.snakecase("AnthonyBlaomsPetElk") ==
         "anthony_blaoms_pet_elk"
diff --git a/test/traits.jl b/test/traits.jl
new file mode 100644
index 00000000..3000d016
--- /dev/null
+++ b/test/traits.jl
@@ -0,0 +1,16 @@
+module FruitSalad
+using LearnAPI
+
+struct RedApple{T}
+    x::T
+end
+
+LearnAPI.constructor(::RedApple) = RedApple
+
+end
+
+import .FruitSalad
+
+@testset "name" begin
+    @test LearnAPI.name(FruitSalad.RedApple(1)) == "RedApple"
+end

From d47cabe03a50f236ae86c2e1a2aa46dd8b0ae149 Mon Sep 17 00:00:00 2001
From: "Anthony D. 
Blaom" Date: Sun, 19 May 2024 16:23:49 +1200 Subject: [PATCH 031/187] rm redundant pkg from [extras] --- Project.toml | 1 - 1 file changed, 1 deletion(-) diff --git a/Project.toml b/Project.toml index 206a4038..ee543d1a 100644 --- a/Project.toml +++ b/Project.toml @@ -14,7 +14,6 @@ DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b" -SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf" Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" From f0c68d53fc5355b92a7833ce697f6e30a358cf37 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 19 May 2024 17:05:37 +1200 Subject: [PATCH 032/187] fix typos --- docs/src/anatomy_of_an_implementation.md | 2 +- docs/src/obs.md | 2 +- src/obs.jl | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 1c011d21..b136a8ba 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -401,7 +401,7 @@ LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y ### The `obs` contract Providing `fit` signatures matching the output of `obs`, is the first part of the `obs` -contract. The second part is this: *The outupt of `obs` must implement the* +contract. The second part is this: *The output of `obs` must implement the* [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs/numobs` *interface for accessing individual observations*. It usually suffices to overload `Base.getindex` and `Base.length` (which are the `getobs/numobs` fallbacks): diff --git a/docs/src/obs.md b/docs/src/obs.md index fe198a85..3e40e3f9 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -8,7 +8,7 @@ performance advantages over naive workflows in some cases (e.g., cross-validatio ```julia obs(algorithm, data) # can be passed to `fit` instead of `data` -obs(model, data) # can be passed to `predict` or `tranform` instead of `data` +obs(model, data) # can be passed to `predict` or `transform` instead of `data` ``` ## Typical workflows diff --git a/src/obs.jl b/src/obs.jl index 0348d3da..f67b19c9 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -69,7 +69,7 @@ that case. ## Sample implementation -Refer to the "Anatomy of an Implemetation" section of the LearnAPI +Refer to the "Anatomy of an Implementation" section of the LearnAPI [manual](https://juliaai.github.io/LearnAPI.jl/dev/). From 3252e09899055371946700bd074df1b6f0f86d40 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 21 May 2024 09:29:26 +1200 Subject: [PATCH 033/187] more doc tweaks --- docs/src/anatomy_of_an_implementation.md | 11 ++++++----- docs/src/index.md | 8 ++++---- docs/src/reference.md | 18 +++++++++--------- 3 files changed, 19 insertions(+), 18 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index b136a8ba..93a82d1a 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -166,8 +166,8 @@ predictions. ## Algorithm traits Algorithm [traits](@ref traits) record extra generic information about an algorithm, or -make specific promises of behavior. They usually have an algorithm as the single -argument. We regard [`LearnAPI.constructor`](@ref) defined above as a trait. +make specific promises of behavior. 
They usually have an algorithm as the single argument, +and so we also regard [`LearnAPI.constructor`](@ref) defined above as a trait. In LearnAPI.jl `predict` always outputs a [target or target proxy](@ref proxy), where "target" is understood very broadly. We overload a trait to record the fact here that the @@ -214,9 +214,10 @@ traits_list) to see which might apply to a new implementation, to enable maximum functionality provided by third party packages, and to assist third party algorithms that match machine learning algorithms to user-defined tasks. -Having set `LearnAPI.target(::Ridge) == true` we are obliged to overload a multi-argument -version of `LearnAPI.target` to extract the target from the `data` that gets supplied to -`fit`: +According to the contract articulated in its document string, having set +[`LearnAPI.target(::Ridge)`](@ref) equal to `true`, we are obliged to overload a +multi-argument version of `LearnAPI.target` to extract the target from the `data` that +gets supplied to `fit`: ```@example anatomy LearnAPI.target(::Ridge, data) = last(data) diff --git a/docs/src/index.md b/docs/src/index.md index 4f979070..2e4afb74 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -12,9 +12,9 @@ A base Julia interface for machine learning and statistics LearnAPI.jl is a lightweight, functional-style interface, providing a collection of [methods](@ref Methods), such as `fit` and `predict`, to be implemented by algorithms from machine learning and statistics. Through such implementations, these algorithms buy into -functionality, such as hyperparameter optimization, as provided by ML/statistics toolboxes -and other packages. LearnAPI.jl also provides a number of Julia [traits](@ref traits) for -promising specific behavior. +functionality, such as hyperparameter optimization and model composition, as provided by +ML/statistics toolboxes and other packages. LearnAPI.jl also provides a number of Julia +[traits](@ref traits) for promising specific behavior. ```@raw html 🚧 @@ -78,7 +78,7 @@ opts out. The `fit` and `predict` methods consume these alternative representati The fallback data interface is the [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) `getobs/numobs` interface, and if the input consumed by the algorithm already implements that interface (tables, arrays, etc.) then overloading `obs` is completely optional. A -plain iteration interface (to support, e.g., data loaders reading images from disk files) +plain iteration interface (to support, e.g., data loaders reading images from disk) can also be specified. ## Learning more diff --git a/docs/src/reference.md b/docs/src/reference.md index 5b15e03e..f5be0824 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -54,16 +54,15 @@ compared with censored ground truth survival times. #### Definitions -More generally, whenever we have a variable (e.g., a class label) that can (in principle) -can be paired with a predicted value, or some predicted "proxy" for that variable (such as -a class probability), then we call the variable a *target* variable, and the predicted -output a *target proxy*. 
In this definition, it is immaterial whether or not the target -appears in training (is supervised) or whether or not the model generalizes to new +More generally, whenever we have a variable (e.g., a class label) that can, at least in +principle, be paired with a predicted value, or some predicted "proxy" for that variable +(such as a class probability), then we call the variable a *target* variable, and the +predicted output a *target proxy*. In this definition, it is immaterial whether or not the +target appears in training (is supervised) or whether or not the model generalizes to new observations ("learns"). LearnAPI.jl provides singleton [target proxy types](@ref proxy_types) for prediction -dispatch in LearnAPI.jl. These are also used to distinguish performance metrics provided -by the package +dispatch. These are also used to distinguish performance metrics provided by the package [StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/). @@ -151,8 +150,9 @@ Only these method names are exported by LearnAPI: `fit`, `transform`, `inverse_t [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref) are universally compulsory. -- [`LearnAPI.target`](@ref) and [`LearnAPI.weights`](@ref) are both traits and methods to - extract, from `fit` input data, the target and per-observation weights, when available. +- [`LearnAPI.target`](@ref) and [`LearnAPI.weights`](@ref) are traits which also include + extended signatures for extracting, from `fit` input data, the target and + per-observation weights, when available. --- From 69bd859b9e3be74d2f232761d6e2d933fafafeb5 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 May 2024 17:30:35 +1200 Subject: [PATCH 034/187] fix table of contents for the docs --- docs/Project.toml | 1 + docs/make.jl | 18 ++++++++++-------- docs/src/anatomy_of_an_implementation.md | 2 +- 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/docs/Project.toml b/docs/Project.toml index caa42f70..47eb52e6 100644 --- a/docs/Project.toml +++ b/docs/Project.toml @@ -1,5 +1,6 @@ [deps] Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" +LearnAPI = "92ad9a40-7767-427a-9ee6-6e577f1266cb" MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" ScientificTypesBase = "30f210dd-8aff-4c5f-94ba-8e64358c1161" Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" diff --git a/docs/make.jl b/docs/make.jl index b0705cda..d9695614 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -10,14 +10,16 @@ makedocs( pages=[ "Home" => "index.md", "Anatomy of an Implementation" => "anatomy_of_an_implementation.md", - "Reference" => "reference.md", - "... fit" => "fit.md", - "... predict/transform" => "predict_transform.md", - "... Kinds of Target Proxy" => "kinds_of_target_proxy.md", - "... minimize" => "minimize.md", - "... obs" => "obs.md", - "... Accessor Functions" => "accessor_functions.md", - "... 
Algorithm Traits" => "traits.md", + "Reference" => [ + "Summary" => "reference.md", + "fit" => "fit.md", + "predict/transform" => "predict_transform.md", + "Kinds of Target Proxy" => "kinds_of_target_proxy.md", + "minimize" => "minimize.md", + "obs" => "obs.md", + "Accessor Functions" => "accessor_functions.md", + "Algorithm Traits" => "traits.md", + ], "Common Implementation Patterns" => "common_implementation_patterns.md", "Testing an Implementation" => "testing_an_implementation.md", ], diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 93a82d1a..1ee22b2e 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -215,7 +215,7 @@ functionality provided by third party packages, and to assist third party algori match machine learning algorithms to user-defined tasks. According to the contract articulated in its document string, having set -[`LearnAPI.target(::Ridge)`](@ref) equal to `true`, we are obliged to overload a +[`LearnAPI.target`](@ref)`(::Ridge)`](@ref) equal to `true`, we are obliged to overload a multi-argument version of `LearnAPI.target` to extract the target from the `data` that gets supplied to `fit`: From f4b0fdda52879df2b76ea1004cdaefbe3670dd62 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 May 2024 17:32:40 +1200 Subject: [PATCH 035/187] tweak --- docs/src/index.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/src/index.md b/docs/src/index.md index 2e4afb74..1b6cc500 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -73,7 +73,8 @@ Algorithms are free to consume data in any format. However, a method called [`ob data_interface) (read as "observations") gives users and meta-algorithms access to an algorithm-specific representation of input data, which is also guaranteed to implement a standard interface for accessing individual observations, unless an algorithm explicitly -opts out. The `fit` and `predict` methods consume these alternative representations of data. +opts out. The `fit` and `predict` methods also consume these alternative representations +of data. The fallback data interface is the [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) `getobs/numobs` interface, and if the input consumed by the algorithm already implements From acac24f93fb39e8c424915338fce8d4a1d83322a Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 May 2024 17:35:53 +1200 Subject: [PATCH 036/187] doc tweak --- docs/src/reference.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/src/reference.md b/docs/src/reference.md index f5be0824..7a2a196b 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -120,10 +120,9 @@ Only these method names are exported by LearnAPI: `fit`, `transform`, `inverse_t ### List of methods -- [`fit`](@ref fit): for training or updating algorithms that generalize to new data. For - non-generalizing ("static") algorithms, `fit(algorithm)` generally wraps algorithm in a - mutable struct that can be mutated by `predict`/`transform` to record byproducts of - those operations. +- [`fit`](@ref fit): for training or updating algorithms that generalize to new data. Or, + for non-generalizing ("static") algorithms, wrap `algorithm` in a mutable struct that + can be mutated by `predict`/`transform` to record byproducts of those operations. 
- [`predict`](@ref operations): for outputting [targets](@ref proxy) or [target proxies](@ref proxy) (such as probability density functions) From 3b289f5dd1ab1a586b7c36f6ee974b77335b2e79 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 May 2024 17:37:16 +1200 Subject: [PATCH 037/187] tweak --- docs/make.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/make.jl b/docs/make.jl index d9695614..1ed6928f 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -11,7 +11,7 @@ makedocs( "Home" => "index.md", "Anatomy of an Implementation" => "anatomy_of_an_implementation.md", "Reference" => [ - "Summary" => "reference.md", + "Overview" => "reference.md", "fit" => "fit.md", "predict/transform" => "predict_transform.md", "Kinds of Target Proxy" => "kinds_of_target_proxy.md", From d6c320fdbea5186a99d0e7d3d73c8ec138c7d7cf Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 May 2024 17:50:17 +1200 Subject: [PATCH 038/187] whitespace fixes --- docs/src/anatomy_of_an_implementation.md | 172 +++++++++++------------ docs/src/reference.md | 10 +- 2 files changed, 93 insertions(+), 89 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 1ee22b2e..85eca73f 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -10,19 +10,19 @@ For a transformer, implementations ordinarily implement `transform` instead of !!! important - The core implementations of `fit`, `predict`, etc, - always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`. - Calls like `fit(algorithm, X, y)` are provided as additional convenience methods. + The core implementations of `fit`, `predict`, etc, + always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`. + Calls like `fit(algorithm, X, y)` are provided as additional convenience methods. !!! note - If the `data` object consumed by `fit`, `predict`, or `transform` is not - not a suitable table¹, array³, tuple of tables and arrays, or some - other object implementing - the MLUtils.jl `getobs`/`numobs` interface, - then an implementation must: (i) suitably overload the trait - [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as - illustrated below under [Providing an advanced data interface](@ref). + If the `data` object consumed by `fit`, `predict`, or `transform` is not + not a suitable table¹, array³, tuple of tables and arrays, or some + other object implementing + the MLUtils.jl `getobs`/`numobs` interface, + then an implementation must: (i) suitably overload the trait + [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as + illustrated below under [Providing an advanced data interface](@ref). The first line below imports the lightweight package LearnAPI.jl whose methods we will be extending. The second imports libraries needed for the core algorithm. @@ -39,7 +39,7 @@ Here's a new type whose instances specify ridge regression parameters: ```@example anatomy struct Ridge{T<:Real} - lambda::T + lambda::T end nothing # hide ``` @@ -63,7 +63,7 @@ changed to `0.05`. ## Implementing `fit` -A ridge regressor requires two types of data for training: *input features* `X`, which +A ridge regressor requires two types of data for training: input features `X`, which here we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a vector. 
@@ -72,9 +72,9 @@ coefficients labelled by feature name for inspection after training: ```@example anatomy struct RidgeFitted{T,F} - algorithm::Ridge - coefficients::Vector{T} - named_coefficients::F + algorithm::Ridge + coefficients::Vector{T} + named_coefficients::F end nothing # hide ``` @@ -87,25 +87,25 @@ The core implementation of `fit` looks like this: ```@example anatomy function LearnAPI.fit(algorithm::Ridge, data; verbosity=1) - X, y = data + X, y = data - # data preprocessing: - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - A = Tables.matrix(table, transpose=true) + # data preprocessing: + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + A = Tables.matrix(table, transpose=true) - lambda = algorithm.lambda + lambda = algorithm.lambda - # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector + # apply core algorithm: + coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector - # determine named coefficients: - named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] - # make some noise, if allowed: - verbosity > 0 && @info "Coefficients: $named_coefficients" + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" - return RidgeFitted(algorithm, coefficients, named_coefficients) + return RidgeFitted(algorithm, coefficients, named_coefficients) end ``` @@ -127,7 +127,7 @@ Here's the implementation for our ridge regressor: ```@example anatomy LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = - Tables.matrix(Xnew)*model.coefficients + Tables.matrix(Xnew)*model.coefficients ``` ## Accessor functions @@ -156,7 +156,7 @@ overload it to dump the named version of the coefficients: ```@example anatomy LearnAPI.minimize(model::RidgeFitted) = - RidgeFitted(model.algorithm, model.coefficients, nothing) + RidgeFitted(model.algorithm, model.coefficients, nothing) ``` Crucially, we can still use `LearnAPI.minimize(model)` in place of `model` to make new @@ -187,19 +187,19 @@ The macro can be used to specify multiple traits simultaneously: ```@example anatomy @trait( - Ridge, - constructor = Ridge, - target = true, - kinds_of_proxy=(LiteralTarget(),), - descriptors = (:regression,), - functions = ( - fit, - minimize, - predict, - obs, - LearnAPI.algorithm, - LearnAPI.coefficients, - ) + Ridge, + constructor = Ridge, + target = true, + kinds_of_proxy=(LiteralTarget(),), + descriptors = (:regression,), + functions = ( + fit, + minimize, + predict, + obs, + LearnAPI.algorithm, + LearnAPI.coefficients, + ) ) nothing # hide ``` @@ -230,10 +230,10 @@ enabling the kind of workflow previewed in [Sample workflow](@ref): ```@example anatomy LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = - fit(algorithm, (X, y); kwargs...) + fit(algorithm, (X, y); kwargs...) 
LearnAPI.predict(model::RidgeFitted, Xnew) = - predict(model, LiteralTarget(), Xnew) + predict(model, LiteralTarget(), Xnew) ``` ## [Demonstration](@id workflow) @@ -292,40 +292,40 @@ using LearnAPI using LinearAlgebra, Tables struct Ridge{T<:Real} - lambda::T + lambda::T end Ridge(; lambda=0.1) = Ridge(lambda) struct RidgeFitted{T,F} - algorithm::Ridge - coefficients::Vector{T} - named_coefficients::F + algorithm::Ridge + coefficients::Vector{T} + named_coefficients::F end LearnAPI.algorithm(model::RidgeFitted) = model.algorithm LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients LearnAPI.minimize(model::RidgeFitted) = - RidgeFitted(model.algorithm, model.coefficients, nothing) + RidgeFitted(model.algorithm, model.coefficients, nothing) LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = - fit(algorithm, (X, y); kwargs...) + fit(algorithm, (X, y); kwargs...) LearnAPI.predict(model::RidgeFitted, Xnew) = predict(model, LiteralTarget(), Xnew) @trait( - Ridge, - constructor = Ridge, - target = true, - kinds_of_proxy=(LiteralTarget(),), - descriptors = (:regression,), - functions = ( - fit, - minimize, - predict, - obs, - LearnAPI.algorithm, - LearnAPI.coefficients, - ) + Ridge, + constructor = Ridge, + target = true, + kinds_of_proxy=(LiteralTarget(),), + descriptors = (:regression,), + functions = ( + fit, + minimize, + predict, + obs, + LearnAPI.algorithm, + LearnAPI.coefficients, + ) ) n = 10 # number of observations @@ -344,9 +344,9 @@ new type: ```@example anatomy2 struct RidgeFitObs{T,M<:AbstractMatrix{T}} - A::M # p x n - names::Vector{Symbol} # features - y::Vector{T} # target + A::M # p x n + names::Vector{Symbol} # features + y::Vector{T} # target end ``` @@ -354,10 +354,10 @@ Now we overload `obs` to carry out the data pre-processing previously in `fit`, ```@example anatomy2 function LearnAPI.obs(::Ridge, data) - X, y = data - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - return RidgeFitObs(Tables.matrix(table)', names, y) + X, y = data + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + return RidgeFitObs(Tables.matrix(table)', names, y) end ``` @@ -369,27 +369,27 @@ methods - one to handle "regular" input, and one to handle the pre-processed dat ```@example anatomy2 function LearnAPI.fit(algorithm::Ridge, observations::RidgeFitObs; verbosity=1) - lambda = algorithm.lambda + lambda = algorithm.lambda - A = observations.A - names = observations.names - y = observations.y + A = observations.A + names = observations.names + y = observations.y - # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # 1 x p matrix + # apply core algorithm: + coefficients = (A*A' + algorithm.lambda*I)\(A*y) # 1 x p matrix - # determine named coefficients: - named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] - # make some noise, if allowed: - verbosity > 0 && @info "Coefficients: $named_coefficients" + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" - return RidgeFitted(algorithm, coefficients, named_coefficients) + return RidgeFitted(algorithm, coefficients, named_coefficients) end LearnAPI.fit(algorithm::Ridge, data; kwargs...) = - fit(algorithm, obs(algorithm, data); kwargs...) + fit(algorithm, obs(algorithm, data); kwargs...) 
``` We provide an overloading of `LearnAPI.target` to handle the additional supported data @@ -409,7 +409,7 @@ accessing individual observations*. It usually suffices to overload `Base.getind ```@example anatomy2 Base.getindex(data::RidgeFitObs, I) = - RidgeFitObs(data.A[:,I], data.names, y[I]) + RidgeFitObs(data.A[:,I], data.names, y[I]) Base.length(data::RidgeFitObs, I) = length(data.y) ``` @@ -420,10 +420,10 @@ case: LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)' LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, observations::AbstractMatrix) = - observations'*model.coefficients + observations'*model.coefficients LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = - predict(model, LiteralTarget(), obs(model, Xnew)) + predict(model, LiteralTarget(), obs(model, Xnew)) ``` ### Important notes: diff --git a/docs/src/reference.md b/docs/src/reference.md index 7a2a196b..41f57005 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -25,9 +25,13 @@ an example of data, the observations being the rows. Typically, data provided to LearnAPI.jl algorithms, will implement the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/stable) `getobs/numobs` interface for accessing individual observations, but implementations can opt out of this requirement; -see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details. In the MLUtils.jl -convention, observations in tables are the rows but observations in a matrix are the -columns. +see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details. + +!!! note + + In the MLUtils.jl + convention, observations in tables are the rows but observations in a matrix are the + columns. ### [Hyperparameters](@id hyperparameters) From 54a5f9b38130f11c028f56748f1a760a841b4eae Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 28 May 2024 17:55:25 +1200 Subject: [PATCH 039/187] fix whitespace --- docs/src/anatomy_of_an_implementation.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 85eca73f..81c81d22 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -10,19 +10,19 @@ For a transformer, implementations ordinarily implement `transform` instead of !!! important - The core implementations of `fit`, `predict`, etc, - always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`. - Calls like `fit(algorithm, X, y)` are provided as additional convenience methods. + The core implementations of `fit`, `predict`, etc, + always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`. + Calls like `fit(algorithm, X, y)` are provided as additional convenience methods. !!! note - If the `data` object consumed by `fit`, `predict`, or `transform` is not - not a suitable table¹, array³, tuple of tables and arrays, or some - other object implementing - the MLUtils.jl `getobs`/`numobs` interface, - then an implementation must: (i) suitably overload the trait - [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as - illustrated below under [Providing an advanced data interface](@ref). 
+    If the `data` object consumed by `fit`, `predict`, or `transform` is
+    not a suitable table¹, array³, tuple of tables and arrays, or some
+    other object implementing
+    the MLUtils.jl `getobs`/`numobs` interface,
+    then an implementation must: (i) suitably overload the trait
+    [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as
+    illustrated below under [Providing an advanced data interface](@ref).
 
 The first line below imports the lightweight package LearnAPI.jl whose methods we will be
 extending. The second imports libraries needed for the core algorithm.

From 0af5476db879c8c07824571d2f3f3e4e4499bdeb Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Thu, 30 May 2024 13:31:55 +1200
Subject: [PATCH 040/187] clarify importance of constructor over type in traits
 and docstrings

---
 docs/src/anatomy_of_an_implementation.md |  7 +++++++
 docs/src/reference.md                    | 15 +++++++++++----
 docs/src/traits.md                       | 11 ++++++-----
 src/traits.jl                            |  8 ++++++--
 src/types.jl                             |  2 +-
 test/integration/regression.jl           | 16 ++++++++++++++++
 6 files changed, 47 insertions(+), 12 deletions(-)

diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md
index 81c81d22..9779403a 100644
--- a/docs/src/anatomy_of_an_implementation.md
+++ b/docs/src/anatomy_of_an_implementation.md
@@ -51,6 +51,11 @@ mechanism for creating new versions of itself, with modified property (field) va
 this end, we implement `LearnAPI.constructor`, which must return a keyword constructor:
 
 ```@example anatomy
+"""
+    Ridge(; lambda=0.1)
+
+Instantiate a ridge regression algorithm, with regularization of `lambda`.
+"""
 Ridge(; lambda=0.1) = Ridge(lambda)
 LearnAPI.constructor(::Ridge) = Ridge
 nothing # hide
@@ -60,6 +65,8 @@ So, if `algorithm = Ridge(lambda=0.1)` then `LearnAPI.constructor(algorithm)(lam
 is another algorithm with the same properties, except that the value of `lambda` has been
 changed to `0.05`.
 
+Note that we attach the docstring to the constructor, not the struct.
+
 
 ## Implementing `fit`
 
diff --git a/docs/src/reference.md b/docs/src/reference.md
index 41f57005..11157384 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -95,6 +95,9 @@ for such algorithms [`LearnAPI.is_composite`](@ref)`(algorithm)` must be `true` 
 is `false`). Generally, the keyword constructor provided by [`LearnAPI.constructor`](@ref)
 must provide default values for all non-algorithm properties.
 
+Any object `algorithm` for which [`LearnAPI.functions`](@ref)`(algorithm)` is non-empty is
+understood have a valid implementation of the LearnAPI.jl interface.
+
 ### Example
 
 Any instance of `GradientRidgeRegressor` defined below is a valid algorithm.
@@ -110,8 +113,11 @@ GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) 
 LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor
 ```
 
+### Documentation
+
+Attach public LearnAPI.jl-related documentation for an algorithm to its *constructor*,
+rather than to the struct defining its type. In this way, an algorithm can implement
+non-LearnAPI interfaces (such as a native interface) with separate document strings.
+
 ## Methods
 
@@ -125,8 +131,9 @@ Only these method names are exported by LearnAPI: `fit`, `transform`, `inverse_t
 
 ### List of methods
 
 - [`fit`](@ref fit): for training or updating algorithms that generalize to new data. Or,
-  for non-generalizing ("static") algorithms, wrap `algorithm` in a mutable struct that
-  can be mutated by `predict`/`transform` to record byproducts of those operations.
+  for non-generalizing algorithms (see [Static Algorithms](@ref)), wrap `algorithm` in a
+  mutable struct that can be mutated by `predict`/`transform` to record byproducts of
+  those operations.
 
 - [`predict`](@ref operations): for outputting [targets](@ref proxy) or [target
   proxies](@ref proxy) (such as probability density functions)
diff --git a/docs/src/traits.md b/docs/src/traits.md
index 9ff63967..7862d680 100644
--- a/docs/src/traits.md
+++ b/docs/src/traits.md
@@ -95,12 +95,13 @@ Multiple traits can be declared like this:
 To ensure that trait metadata can be stored in an external algorithm registry, LearnAPI.jl
 requires:
 
-1. *Finiteness:* The value of a trait is the same for all algorithms with same
-   underlying `UnionAll` type. That is, even if the type parameters are different, the
-   trait should be the same. There is an exception if `is_composite(algorithm) = true`.
+1. *Finiteness:* The value of a trait is the same for all `algorithm`s with same value of
+   [`LearnAPI.constructor(algorithm)`](@ref). This typically means trait values do not
+   depend on type parameters! There is an exception if `is_composite(algorithm) = true`.
 
-2. *Serializability:* The value of any trait can be evaluated without installing any
-   third party package; `using LearnAPI` should suffice.
+2. *Immediate serializability:* It should be possible to call a trait without first
+   installing any third party package. Importing the package that defines the algorithm,
+   together with `import LearnAPI` should suffice.
 
 Because of 1, combining a lot of functionality into one algorithm (e.g. the algorithm can
 perform both classification or regression) can mean traits are necessarily less
diff --git a/src/traits.jl b/src/traits.jl
index f5709206..2af32b60 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -76,8 +76,12 @@ julia> algorithm2.lambda
 
 # New implementations
 
-All new implementations must overload this trait. It must be possible to recover an
-algorithm from the constructor returned as follows:
+All new implementations must overload this trait.
+
+Attach public LearnAPI.jl-related documentation for an algorithm to the constructor, not
+the algorithm struct.
+
+It must be possible to recover an algorithm from the constructor returned as follows:
 
 ```julia
 properties = propertynames(algorithm)
diff --git a/src/types.jl b/src/types.jl
index e77c4cb7..3d09c93a 100644
--- a/src/types.jl
+++ b/src/types.jl
@@ -83,7 +83,7 @@ end
 
 const CONCRETE_TARGET_PROXY_TYPES_LIST = join(
     map(CONCRETE_TARGET_PROXY_TYPES_SYMBOLS) do s
-        "`$s`"
+        "`$s()`"
     end,
     ", ",
     " and ",
diff --git a/test/integration/regression.jl b/test/integration/regression.jl
index ee419d21..0ff394e4 100644
--- a/test/integration/regression.jl
+++ b/test/integration/regression.jl
@@ -10,9 +10,17 @@ import DataFrames
 
 # We overload `obs` to expose internal representation of input data. See later for a
 # simpler variation using the `obs` fallback.
 
+# no docstring here - that goes with the constructor
 struct Ridge
     lambda::Float64
 end
+
+"""
+    Ridge(; lambda=0.1)
+
+Instantiate a ridge regression algorithm, with regularization of `lambda`.
+ +""" Ridge(; lambda=0.1) = Ridge(lambda) # LearnAPI.constructor defined later struct RidgeFitObs{T,M<:AbstractMatrix{T}} @@ -176,9 +184,17 @@ end # # VARIATION OF RIDGE REGRESSION THAT USES FALLBACK OF LearnAPI.obs +# no docstring here - that goes with the constructor struct BabyRidge lambda::Float64 end + +""" + BabyRidge(; lambda=0.1) + +Instantiate a ridge regression algorithm, with regularization of `lambda`. + +""" BabyRidge(; lambda=0.1) = BabyRidge(lambda) # LearnAPI.constructor defined later struct BabyRidgeFitted{T,F} From 4b7c09c648560396103db82b6f1ea444235cdc3a Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 13 Jun 2024 15:33:43 +1200 Subject: [PATCH 041/187] add Expectile and Quantile target proxy types --- docs/src/kinds_of_target_proxy.md | 17 +++++++++++------ src/traits.jl | 15 ++++++++++----- src/types.jl | 3 +++ 3 files changed, 24 insertions(+), 11 deletions(-) diff --git a/docs/src/kinds_of_target_proxy.md b/docs/src/kinds_of_target_proxy.md index a34e1f42..35d51e4c 100644 --- a/docs/src/kinds_of_target_proxy.md +++ b/docs/src/kinds_of_target_proxy.md @@ -16,18 +16,20 @@ LearnAPI.IID | type | form of an observation | |:-------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `LearnAPI.LiteralTarget` | same as target observations | +| `LearnAPI.LiteralTarget` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode | | `LearnAPI.Sampleable` | object that can be sampled to obtain object of the same form as target observation | | `LearnAPI.Distribution` | explicit probability density/mass function whose sample space is all possible target observations | | `LearnAPI.LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations | -| † `LearnAPI.Probability` | numerical probability or probability vector | -| † `LearnAPI.LogProbability` | log-probability or log-probability vector | -| † `LearnAPI.Parametric` | a list of parameters (e.g., mean and variance) describing some distribution | +| `LearnAPI.Probability`¹ | numerical probability or probability vector | +| `LearnAPI.LogProbability`¹ | log-probability or log-probability vector | +| `LearnAPI.Parametric`¹ | a list of parameters (e.g., mean and variance) describing some distribution | | `LearnAPI.LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering | | `LearnAPI.LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above | | `LearnAPI.LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above | | `LearnAPI.LabelAmbiguousFuzzy` | same as `LabelAmbiguous` but with multiple values of indeterminant number | -| `LearnAPI.ConfidenceInterval` | confidence interval | +| `LearnAPI.Quantile`² | same as target but with quantile interpretation | +| `LearnAPI.Expectile`² | same as target but with expectile interpretation | +| `LearnAPI.ConfidenceInterval`² | confidence interval | | `LearnAPI.Fuzzy` | finite but possibly varying number of target observations | | `LearnAPI.ProbabilisticFuzzy` | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one) | | `LearnAPI.SurvivalFunction` | survival function | @@ -36,9 +38,12 @@ LearnAPI.IID | 
`LearnAPI.OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | | `LearnAPI.Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | -† Provided for completeness but discouraged to avoid [ambiguities in +¹Provided for completeness but discouraged to avoid [ambiguities in representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation). +²The level will be controlled by a hyper-parameter; models providing only quantiles or +expectiles at 50% will provide `LiteralTarget` instead. + > Table of concrete subtypes of `LearnAPI.IID <: LearnAPI.KindOfProxy`. diff --git a/src/traits.jl b/src/traits.jl index 2af32b60..26d12597 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -304,14 +304,19 @@ doc_url(::Any) = "unknown" """ LearnAPI.load_path(algorithm) -Return a string indicating where the `struct` for `typeof(algorithm)` can be found, beginning -with the name of the package module defining it. For example, a return value of -`"FastTrees.LearnAPI.DecisionTreeClassifier"` means the following julia code will return the -algorithm type: +Return a string indicating where in code the definition of the algorithm's constructor can +be found, beginning with the name of the package module defining it. By "constructor" we +mean the return value of [`LearnAPI.constructor(algorithm)`](@ref). + +# Implementation + +For example, a return value of `"FastTrees.LearnAPI.DecisionTreeClassifier"` means the +following julia code will not error: ```julia import FastTrees -FastTrees.LearnAPI.DecisionTreeClassifier +import LearnAPI +@assert FastTrees.LearnAPI.DecisionTreeClassifier == LearnAPI.constructor(algorithm) ``` $DOC_UNKNOWN diff --git a/src/types.jl b/src/types.jl index 3d09c93a..02218bd3 100644 --- a/src/types.jl +++ b/src/types.jl @@ -40,6 +40,9 @@ struct SurvivalDistribution <: IID end struct HazardFunction <: IID end struct OutlierScore <: IID end struct Continuous <: IID end +struct Quantile <: IID end +struct Expectile <: IID end + """ Joint <: KindOfProxy From 82ade4072f976bd0a02f0bfd699a90d771032025 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 9 Sep 2024 09:34:21 +1200 Subject: [PATCH 042/187] add target_observation_scitype --- docs/make.jl | 5 ++++- docs/src/traits.md | 26 ++++++++++++++------------ src/traits.jl | 30 ++++++++++++++++++++++++++++++ 3 files changed, 48 insertions(+), 13 deletions(-) diff --git a/docs/make.jl b/docs/make.jl index 1ed6928f..1e3a6277 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -6,7 +6,10 @@ const REPO = Remotes.GitHub("JuliaAI", "LearnAPI.jl") makedocs( modules=[LearnAPI,], - format=Documenter.HTML(prettyurls = get(ENV, "CI", nothing) == "true"), + format=Documenter.HTML( + prettyurls = get(ENV, "CI", nothing) == "true", + collapselevel = 1, + ), pages=[ "Home" => "index.md", "Anatomy of an Implementation" => "anatomy_of_an_implementation.md", diff --git a/docs/src/traits.md b/docs/src/traits.md index 7862d680..84aedeef 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -36,20 +36,21 @@ package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase. 
| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | | [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | | [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | -| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | -| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | -| [`LearnAPI.fit_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | -| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `fit(algorithm, data...)` works | `Union{}` | `Tuple{AbstractVector{<:Real}, Real}` | -| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `predict(model, kind, data...)` works | `Union{}` | `Table(Continuous)` | -| [`LearnAPI.predict_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `predict(model, kind, data...)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `predict(model, kind, data...)` works | `Union{}` | `AbstractMatrix{<:Real}` | -| [`LearnAPI.predict_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `predict(model, kind, data...)` works | `Union{}` | `Vector{<:Real}` | +| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | +| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | +| [`LearnAPI.fit_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | +| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{<:Real}, Real}` | +| [`LearnAPI.target_observation scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | +| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `Table(Continuous)` | +| [`LearnAPI.predict_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{Continuous}` | +| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring 
`predict(model, kind, data)` works | `Union{}` | `AbstractMatrix{<:Real}` | +| [`LearnAPI.predict_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{<:Real}` | | [`LearnAPI.predict_output_scitype`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `scitype(predict(model, ...))` | `Any` | `AbstractVector{Continuous}` | | [`LearnAPI.predict_output_type`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `typeof(predict(model, ...))` | `Any` | `AbstractVector{<:Real}` | -| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `transform(model, data...)` works | `Union{}` | `Table(Continuous)` | -| [`LearnAPI.transform_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `transform(model, data...)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.transform_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)`ensuring `transform(model, data...)` works | `Union{}` | `AbstractMatrix{<:Real}}` | -| [`LearnAPI.transform_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `transform(model, data...)` works | `Union{}` | `Vector{Continuous}` | +| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `transform(model, data)` works | `Union{}` | `Table(Continuous)` | +| [`LearnAPI.transform_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `transform(model, data)` works | `Union{}` | `Vector{Continuous}` | +| [`LearnAPI.transform_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)`ensuring `transform(model, data)` works | `Union{}` | `AbstractMatrix{<:Real}}` | +| [`LearnAPI.transform_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `transform(model, data)` works | `Union{}` | `Vector{Continuous}` | | [`LearnAPI.transform_output_scitype`](@ref)`(algorithm)` | upper bound on `scitype(transform(model, ...))` | `Any` | `Table(Continuous)` | | [`LearnAPI.transform_output_type`](@ref)`(algorithm)` | upper bound on `typeof(transform(model, ...))` | `Any` | `AbstractMatrix{<:Real}` | | [`LearnAPI.predict_or_transform_mutates`](@ref)`(algorithm)` | `true` if `predict` or `transform` mutates first argument | `false` | `true` | @@ -130,6 +131,7 @@ LearnAPI.fit_scitype LearnAPI.fit_type LearnAPI.fit_observation_scitype LearnAPI.fit_observation_type +LearnAPI.target_observation_scitype LearnAPI.predict_input_scitype LearnAPI.predict_input_observation_scitype LearnAPI.predict_input_type diff --git a/src/traits.jl b/src/traits.jl index 26d12597..79fd3453 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -44,6 +44,7 @@ const TRAITS = [ :fit_observation_scitype, :fit_type, :fit_observation_type, + :target_observation_scitype, :predict_input_scitype, :predict_output_scitype, :predict_input_type, @@ -491,6 +492,35 @@ Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit)) """ fit_observation_type(::Any) = Union{} +""" + LearnAPI.target_observation_scitype(algorithm) + +Return an upper bound `S` on the scitype of each observation of `LearnAPI.target(data)`, +where `data` is an admissible argument in the call `fit(algorithm, data)`. 
+
+This interpretation only holds if `LearnAPI.target(algorithm)` is `true`. In any case,
+however, if `algorithm` implements `predict`, then `S` will always be an
+upper bound on the scitype of observations that could be conceivably extracted from the
+output of [`predict`](@ref). For example, suppose we have
+
+```julia
+model = fit(algorithm, data)
+ŷ = predict(model, Sampleable(), data_new)
+```
+
+Then each sample generated by each "observation" of `ŷ` (a vector of sampleable objects,
+say) will be bound in scitype by `S`.
+
+See also [`LearnAPI.fit_observation_scitype`](@ref).
+
+# New implementations
+
+Optional. The fallback return value is `Any`.
+
+"""
+target_observation_scitype(::Any) = Any
+
+
 function DOC_INPUT_SCITYPE(op)
     extra = op == :predict ? " kind_of_proxy," : ""
     ONLY = DOC_ONLY_ONE(op)

From 7a781a086493c601097941bdc75f2a5cae64cfdd Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Tue, 24 Sep 2024 13:30:13 +1200
Subject: [PATCH 043/187] more doc updates

---
 docs/make.jl                  |  1 +
 docs/src/fit.md               | 54 ++++++++++++++++++++-----
 docs/src/index.md             | 14 ++++---
 docs/src/minimize.md          |  2 +-
 docs/src/obs.md               | 57 +++++++++++++++++---------
 docs/src/predict_transform.md | 29 +++++++++++--
 docs/src/reference.md         | 75 +++++++++++++++++++---------------
 docs/src/traits.md            |  4 +-
 src/LearnAPI.jl               |  1 +
 src/accessor_functions.jl     | 33 ++++++++-------
 src/fit.jl                    | 27 +++++++++----
 src/minimize.jl               |  2 +-
 src/obs.jl                    | 37 +++++++++--------
 src/predict_transform.jl      | 40 +++++++++++++-----
 src/traits.jl                 | 57 +++++++++++++------------
 src/types.jl                  | 76 +++++++++++++++++++++++++++++++++++
 16 files changed, 353 insertions(+), 156 deletions(-)

diff --git a/docs/make.jl b/docs/make.jl
index 1e3a6277..ecfc1dd0 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -19,6 +19,7 @@ makedocs(
             "predict/transform" => "predict_transform.md",
             "Kinds of Target Proxy" => "kinds_of_target_proxy.md",
             "minimize" => "minimize.md",
+            "input" => "input.md",
             "obs" => "obs.md",
             "Accessor Functions" => "accessor_functions.md",
             "Algorithm Traits" => "traits.md",
diff --git a/docs/src/fit.md b/docs/src/fit.md
index c3727110..1687c686 100644
--- a/docs/src/fit.md
+++ b/docs/src/fit.md
@@ -1,33 +1,69 @@
 # [`fit`](@ref fit)
 
+Training for the first time:
+
 ```julia
 fit(algorithm, data; verbosity=1) -> model
-fit(model, data; verbosity=1) -> updated_model
+fit(algorithm; verbosity=1) -> static_model
 ```
 
+Updating:
+
+```
+fit(model, data; verbosity=1, param1=new_value1, param2=new_value2, ...) -> updated_model
+fit(model, NewObservations(), new_data; verbosity=1, param1=new_value1, ...) -> updated_model
+fit(model, NewFeatures(), new_data; verbosity=1, param1=new_value1, ...) -> updated_model
+```
+
 When `fit` expects a tuple form of argument, `data = (X1, ..., Xn)`, then the signature
 `fit(algorithm, X1, ..., Xn)` is also provided.
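For example, an implementation whose core `fit` consumes a two-element tuple `(X, y)`
might provide the extra signature as in the following sketch (`SomeAlgorithm` is a
hypothetical algorithm type, used here for illustration only):

```julia
# convenience method: slurp `X` and `y` into the single `data` argument
LearnAPI.fit(algorithm::SomeAlgorithm, X, y; kwargs...) =
    LearnAPI.fit(algorithm, (X, y); kwargs...)
```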
+ +## Typical workflows + +Supposing `Algorithm` is some supervised classifier type, with an iteration parameter `n`: ```julia -# Train some supervised `algorithm`: -model = fit(algorithm, X, y) +algorithm = Algorithm(n=100) +model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` # Predict probability distributions: ŷ = predict(model, Distribution(), Xnew) # Inspect some byproducts of training: LearnAPI.feature_importances(model) + +# Add 50 iterations and predict again: +model = fit(model; n=150) +predict(model, Distribution(), X) +``` + +### A static algorithm (no "learning") + +```julia +# Apply some clustering algorithm which cannot be generalized to new data: +model = fit(algorithm) +labels = predict(model, LabelAmbiguous(), X) # mutates `model` + +# inspect byproducts of the clustering algorithm (e.g., outliers): +LearnAPI.extras(model) ``` ## Implementation guide -| method | fallback | compulsory? | -|:--------------------------|:---------|-------------| -| [`fit`](@ref)`(alg, ...)` | none | yes | +Initial training: + +| method | fallback | compulsory? | +|:-------------------------------------------------------------------------------|:-----------------------------------------------------------------|--------------------| +| [`fit`](@ref)`(algorithm, data; verbosity=1)` | ignores `data` and applies signature below | yes, unless static | +| [`fit`](@ref)`(algorithm; verbosity=1)` | none | no, unless static | +Updating: +| method | fallback | compulsory? | +|:-------------------------------------------------------------------------------|:---------------------------------------------------------------------------|-------------| +| [`fit`](@ref)`(model, data; verbosity=1, param_updates...)` | retrains from scratch on `data` with specified hyperparameter replacements | no | +| [`fit`](@ref)`(model, ::NewObservations, data; verbosity=1, param_updates...)` | none | no | +| [`fit`](@ref)`(model, ::NewFeatures, data; verbosity=1, param_updates...)` | none | no | ## Reference diff --git a/docs/src/index.md b/docs/src/index.md index 1b6cc500..b66a6d74 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -69,18 +69,20 @@ on the usual supervised/unsupervised learning dichotomy. From this point of view supervised algorithm is simply one in which a target variable exists, and happens to appear as an input to training but not to prediction. +## Data interfaces + Algorithms are free to consume data in any format. However, a method called [`obs`](@ref data_interface) (read as "observations") gives users and meta-algorithms access to an algorithm-specific representation of input data, which is also guaranteed to implement a -standard interface for accessing individual observations, unless an algorithm explicitly -opts out. The `fit` and `predict` methods also consume these alternative representations -of data. +standard interface for accessing individual observations, unless the algorithm explicitly +opts out. Moreover, the `fit` and `predict` methods will also be able to consume these +alternative data representations. The fallback data interface is the [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) `getobs/numobs` interface, and if the input consumed by the algorithm already implements -that interface (tables, arrays, etc.) then overloading `obs` is completely optional. A -plain iteration interface (to support, e.g., data loaders reading images from disk) -can also be specified. +that interface (tables, arrays, etc.) then overloading `obs` is completely optional. 
Plain +iteration interfaces, with or without knowledge of the number of observations, can also be +specified (to support, e.g., data loaders reading images from disk). ## Learning more diff --git a/docs/src/minimize.md b/docs/src/minimize.md index 6fad919a..8e7a4efb 100644 --- a/docs/src/minimize.md +++ b/docs/src/minimize.md @@ -7,7 +7,7 @@ minimize(model) -> # Typical workflow ```julia -model = fit(algorithm, X, y) +model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` ŷ = predict(model, LiteralTarget(), Xnew) LearnAPI.feature_importances(model) diff --git a/docs/src/obs.md b/docs/src/obs.md index 3e40e3f9..bae83427 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -1,4 +1,4 @@ -# [`obs`](@id data_interface) +# [`obs` and Data Interfaces](@id data_interface) The `obs` method takes data intended as input to `fit`, `predict` or `transform`, and transforms it to an algorithm-specific form guaranteed to implement a form of observation @@ -13,18 +13,21 @@ obs(model, data) # can be passed to `predict` or `transform` instead of `dat ## Typical workflows -LearnAPI.jl makes no explicit assumptions about the form of data `X` and `y` in a call -like `fit(algorithm, (X, y))`. However, if we define +LearnAPI.jl makes no universal assumptions about the form of `data` in a call +like `fit(algorithm, data)`. However, if we define ```julia -observations = obs(algorithm, (X, y)) +observations = obs(algorithm, data) ``` -then, assuming the typical case that `LearnAPI.data_interface(algorithm) == Base.HasLength()`, `observations` implements the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface. Moreover, we can pass `observations` to `fit` in place of -the original data, or first resample it using `MLUtils.getobs`: +then, assuming the typical case that `LearnAPI.data_interface(algorithm) == +LearnAPI.RandomAccess()`, `observations` implements the +[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface, for +grabbing and counting observations. Moreover, we can pass `observations` to `fit` in place +of the original data, or first resample it using `MLUtils.getobs`: ```julia -# equivalent to `model = fit(algorithm, (X, y))` (or `fit(algorithm, X, y))`: +# equivalent to `model = fit(algorithm, data)` model = fit(algorithm, observations) # with resampling: @@ -40,40 +43,38 @@ how a user might call `obs` and `MLUtils.getobs` to perform efficient cross-vali using LearnAPI import MLUtils -X = -y = -algorithm = +algorithm = + +data = +X = LearnAPI.input(algorithm, data) +y = LearnAPI.target(algorithm, data) train_test_folds = map([1:10, 11:20, 21:30]) do test (setdiff(1:30, test), test) end -fitobs = obs(algorithm, (X, y)) +fitobs = obs(algorithm, data) never_trained = true scores = map(train_test_folds) do (train, test) # train using model-specific representation of data: - trainobs = MLUtils.getobs(fitobs, train) - model = fit(algorithm, trainobs) + fitobs_subset = MLUtils.getobs(fitobs, train) + model = fit(algorithm, fitobs_subset) # predict on the fold complement: if never_trained global predictobs = obs(model, X) global never_trained = false end - testobs = MLUtils.getobs(predictobs, test) - ŷ = predict(model, LiteralTarget(), testobs) + predictobs_subset = MLUtils.getobs(predictobs, test) + ŷ = predict(model, LiteralTarget(), predictobs_subset) return end ``` -Note here that the output of `predict` will match the representation of `y` , i.e., -there is no concept of an algorithm-specific representation of *outputs*, only inputs. 
-
-
 ## Implementation guide
 
 | method        | compulsory? | fallback       |
 |:--------------|:-----------:|:---------------|
 | [`obs`](@ref) | depends     | returns `data` |
 
@@ -89,3 +90,21 @@ A sample implementation is given in [Providing an advanced data interface](@ref)
 
 ```@docs
 obs
 ```
+
+### Data interfaces
+
+New implementations must overload [`LearnAPI.data_interface(algorithm)`](@ref) if the
+output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess`](@ref). (Arrays, most
+tables, and all tuples thereof, implement `RandomAccess`.)
+
+- [`LearnAPI.RandomAccess`](@ref) (default)
+- [`LearnAPI.FiniteIterable`](@ref)
+- [`LearnAPI.Iterable`](@ref)
+
+
+```@docs
+LearnAPI.RandomAccess
+LearnAPI.FiniteIterable
+LearnAPI.Iterable
+```
+
diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md
index 35fb52d7..2ec378ef 100644
--- a/docs/src/predict_transform.md
+++ b/docs/src/predict_transform.md
@@ -9,12 +9,13 @@ inverse_transform(model, data)
 When a method expects a tuple form of argument, `data = (X1, ..., Xn)`, then a slurping
 signature is also provided, as in `transform(model, X1, ..., Xn)`.
 
-## Typical worklows
+
+## [Typical workflows](@id predict_workflow)
 
 Train some supervised `algorithm`:
 
 ```julia
-model = fit(algorithm, X, y)
+model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)`
 ```
 
 Predict probability distributions:
@@ -52,7 +53,7 @@ ŷ = predict(model, LiteralTarget(), predictobs)
 ```
 
-## Implementation guide
+## [Implementation guide](@id predict_guide)
 
 | method | compulsory? | fallback |
 |:----------------------------|:-----------:|:--------:|
@@ -72,7 +73,27 @@ paired with an implementation of [`inverse_transform`](@ref), for returning (app
 right inverses to `transform`.
 
-## Reference
+### [One-liners combining fit and transform/predict](@id one_liners)
+
+Algorithms may optionally overload `transform` to apply `fit` first, using the supplied
+data if required, and then immediately `transform` the same data. The same applies to
+`predict`. In that case the first argument of `transform`/`predict` is an *algorithm*
+instead of the output of `fit`:
+
+```julia
+predict(algorithm, kind_of_proxy, data) # `fit` implied
+transform(algorithm, data) # `fit` implied
+```
+
+For example, if `fit(algorithm, X)` is defined, then `predict(algorithm, X)` will be
+shorthand for
+
+```julia
+model = fit(algorithm, X)
+predict(model, X)
+```
+
+## [Reference](@id predict_ref)
 
 ```@docs
 predict
diff --git a/docs/src/reference.md b/docs/src/reference.md
index 11157384..de0bb3d6 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -25,21 +25,21 @@ an example of data, the observations being the rows.
 
 Typically, data provided to LearnAPI.jl algorithms, will implement the
 [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/stable) `getobs/numobs` interface for
 accessing individual observations, but implementations can opt out of this requirement;
-see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details. 
+see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details.
 
-!!! note
+!!! note
 
-    In the MLUtils.jl
-    convention, observations in tables are the rows but observations in a matrix are the
-    columns.
+    In the MLUtils.jl
+    convention, observations in tables are the rows but observations in a matrix are the
+    columns.
 
 ### [Hyperparameters](@id hyperparameters)
 
 Besides the data it consumes, a machine learning algorithm's behavior is governed by a
 number of user-specified *hyperparameters*, such as the number of trees in a random
-forest. In LearnAPI.jl, one is allowed to have hyperparematers that are not data-generic. 
-For example, a class weight dictionary will only make sense for a target taking values in
-the set of dictionary keys.
+forest. In LearnAPI.jl, one is allowed to have hyperparameters that are not data-generic.
+For example, a class weight dictionary, which will only make sense for a target taking
+values in the set of dictionary keys, can be specified as a hyperparameter.
 
 ### [Targets and target proxies](@id proxy)
 
@@ -54,7 +54,7 @@ detection, "outlier"/"inlier" predictions, or probability-like scores, are simil
 compared with ground truth labels. In clustering, integer labels assigned to observations
 by the clustering algorithm can be paired with human labels using, say, the Rand
 index. In survival analysis, predicted survival functions or probability distributions are
-compared with censored ground truth survival times.
+compared with censored ground truth survival times. And so on ...
 
 #### Definitions
 
@@ -74,8 +74,12 @@ dispatch. These are also used to distinguish performance metrics provided by the
 
 An object implementing the LearnAPI.jl interface is called an *algorithm*, although it is
 more accurately "the configuration of some algorithm".¹ An algorithm encapsulates a
-particular set of user-specified [hyperparameters](@ref) as the object's properties. It
-does not store learned parameters.
+particular set of user-specified [hyperparameters](@ref) as the object's *properties*
+(which conceivably differ from its fields). It does not store learned parameters.
+
+Informally, we will sometimes use the word "model" to refer to the output of
+`fit(algorithm, ...)` (see below), something which typically does store learned
+parameters.
 
 For `algorithm` to be a valid LearnAPI.jl algorithm,
 [`LearnAPI.constructor(algorithm)`](@ref) must be defined and return a keyword constructor
@@ -90,13 +94,16 @@ named_properties = NamedTuple{properties}(getproperty.(Ref(algorithm), propertie
 
 Note that if `algorithm` is an instance of a *mutable* struct, this requirement
 generally requires overloading `Base.==` for the struct.
 
-A *composite algorithm* is one with a property that can take other algorithms as values;
-for such algorithms [`LearnAPI.is_composite`](@ref)`(algorithm)` must be `true` (fallback
-is `false`). Generally, the keyword constructor provided by [`LearnAPI.constructor`](@ref)
-must provide default values for all non-algorithm properties.
+#### Composite algorithms (wrappers)
+
+A *composite algorithm* is one with at least one property that can take other algorithms
+as values; for such algorithms [`LearnAPI.is_composite`](@ref)`(algorithm)` must be `true`
+(fallback is `false`). Generally, the keyword constructor provided by
+[`LearnAPI.constructor`](@ref) must provide default values for all fields that are not
+algorithm-valued.
 
 Any object `algorithm` for which [`LearnAPI.functions`](@ref)`(algorithm)` is non-empty is
-understood have a valid implementation of the LearnAPI.jl interface.
+understood to have a valid implementation of the LearnAPI.jl interface.
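So a crude programmatic check for LearnAPI.jl support might look like the following
sketch (the helper name is invented for illustration; it is not part of the API):

```julia
# the fallback for `LearnAPI.functions` returns the empty tuple `()`
implements_learnapi(object) = !isempty(LearnAPI.functions(object))
```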
### Example
 
 Any instance of `GradientRidgeRegressor` defined below is a valid algorithm.
 
@@ -109,7 +116,7 @@ struct GradientRidgeRegressor{T<:Real}
     l2_regularization::T
 end
 GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) =
-	GradientRidgeRegressor(learning_rate, epochs, l2_regularization)
+    GradientRidgeRegressor(learning_rate, epochs, l2_regularization)
 LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor
 ```
 
 ### Documentation
 
 Attach public LearnAPI.jl-related documentation for an algorithm to its *constructor*,
 rather than to the struct defining its type. In this way, an algorithm can implement
-non-LearnAPI interfaces (such as a native interface) with separate document strings.
+multiple interfaces, in addition to the LearnAPI interface, with separate document strings
+for each.
 
 ## Methods
 
 Only these method names are exported by LearnAPI: `fit`, `transform`, `inverse_transform`,
-`minimize`, and `obs`. All new implementations must implement [`fit`](@ref),
-[`LearnAPI.algorithm`](@ref algorithm_minimize), [`LearnAPI.constructor`](@ref) and
-[`LearnAPI.functions`](@ref). The last two are algorithm traits, which can be set with the
-[`@trait`](@ref) macro.
+`minimize`, and `obs`.
+
+!!! note
+
+    All new implementations must implement [`fit`](@ref),
+    [`LearnAPI.algorithm`](@ref algorithm_minimize), [`LearnAPI.constructor`](@ref) and
+    [`LearnAPI.functions`](@ref). The last two are algorithm traits, which can be set
+    with the [`@trait`](@ref) macro.
+
 
 ### List of methods
 
 - [`minimize`](@ref algorithm_minimize): for stripping the `model` output by `fit` of
   inessential content, for purposes of serialization.
 
+- [`LearnAPI.input`](@ref input): for extracting inputs from training data.
+
 - [`obs`](@ref data_interface): a method for exposing to the user algorithm-specific
-  representations of data guaranteed to implement observation access according to the
-  value of the [`LearnAPI.data_interface`](@ref) trait
+  representations of data that are guaranteed to implement observation access, as
+  specified by [`LearnAPI.data_interface(algorithm)`](@ref).
 
-- [Accessor functions](@ref accessor_functions): include things like `feature_importances`
-  and `training_losses`, for extracting, from training outcomes, information common to
-  many algorithms.
+- [Accessor functions](@ref accessor_functions): these include functions like
+  `feature_importances` and `training_losses`, for extracting, from training outcomes,
+  information common to many algorithms.
 
 - [Algorithm traits](@ref traits): special methods, that promise specific algorithm
   behavior or for recording general information about the algorithm. Only
   [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref) are universally
   compulsory.
 
-- [`LearnAPI.target`](@ref) and [`LearnAPI.weights`](@ref) are traits which also include
-  extended signatures for extracting, from `fit` input data, the target and
-  per-observation weights, when available.
-
 ---
 
 ¹ We acknowledge users may not like this terminology, and may know "algorithm" by some
diff --git a/docs/src/traits.md b/docs/src/traits.md
index 84aedeef..b04b494d 100644
--- a/docs/src/traits.md
+++ b/docs/src/traits.md
@@ -22,7 +22,7 @@ package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase. 
| trait | return value | fallback value | example |
|:----------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:---------------------------------------------------------|
 | [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` |
-| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(fit, predict, minimize, LearnAPI.algorithm, obs)` |
+| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` |
 | [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. | `()` | `(Distribution(), Interval())` |
 | [`LearnAPI.target`](@ref)`(algorithm)` | `true` if target can appear in `fit` data | `false` | `true` |
 | [`LearnAPI.weights`](@ref)`(algorithm)` | `true` if per-observation weights can appear in `fit` data | `false` | `true` |
@@ -40,7 +40,7 @@ package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.
 | [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` |
 | [`LearnAPI.fit_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` |
 | [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{<:Real}, Real}` |
-| [`LearnAPI.target_observation scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` |
+| [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the target | `Any` | `Continuous` |
 | [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `Table(Continuous)` |
 | [`LearnAPI.predict_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{Continuous}` |
 | [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `AbstractMatrix{<:Real}` |
diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl
index 9ba6b54e..66c9aa9e 100644
--- a/src/LearnAPI.jl
+++ b/src/LearnAPI.jl
@@ -7,6 +7,7 @@ include("types.jl")
 include("predict_transform.jl")
 include("fit.jl")
 include("minimize.jl")
+include("input.jl")
 include("obs.jl")
 include("accessor_functions.jl")
 include("traits.jl")
diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl
index b87a3ab1..854bfdb7 100644
--- a/src/accessor_functions.jl
+++ b/src/accessor_functions.jl
@@ -31,7 +31,7 @@ is `true`.
 
 # New implementations
 
 Implementation is compulsory for new algorithm types. 
The behaviour described above is the -only contract. $(DOC_IMPLEMENTED_METHODS(:algorithm)) +only contract. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.algorithm)")) """ function algorithm end @@ -44,7 +44,7 @@ Return the algorithm-specific feature importances of a `model` output by an abstract vector of `feature::Symbol => importance::Real` pairs (e.g `[:gender => 0.23, :height => 0.7, :weight => 0.1]`). -The `algorithm` supports feature importances if `LearnAPI.feature_importances in +The `algorithm` supports feature importances if `:(LearnAPI.feature_importances) in LearnAPI.functions(algorithm)`. If an algorithm is sometimes unable to report feature importances then @@ -55,7 +55,7 @@ If an algorithm is sometimes unable to report feature importances then Implementation is optional. -$(DOC_IMPLEMENTED_METHODS(:feature_importances)). +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.feature_importances)")). """ function feature_importances end @@ -68,7 +68,7 @@ an abstract vector of `feature_or_class::Symbol => coefficient::Real` pairs (e.g => 0.23, :height => 0.7, :weight => 0.1]`) or, in the case of multi-targets, `feature::Symbol => coefficients::AbstractVector{<:Real}` pairs. -The `model` reports coefficients if `LearnAPI.coefficients in +The `model` reports coefficients if `:(LearnAPI.coefficients) in LearnAPI.functions(Learn.algorithm(model))`. See also [`LearnAPI.intercept`](@ref). @@ -77,7 +77,7 @@ See also [`LearnAPI.intercept`](@ref). Implementation is optional. -$(DOC_IMPLEMENTED_METHODS(:coefficients)). +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.coefficients)")). """ function coefficients end @@ -88,7 +88,7 @@ function coefficients end For a linear model, return the learned intercept. The value returned is `Real` (single target) or an `AbstractVector{<:Real}` (multi-target). -The `model` reports intercept if `LearnAPI.intercept in +The `model` reports intercept if `:(LearnAPI.intercept) in LearnAPI.functions(Learn.algorithm(model))`. See also [`LearnAPI.coefficients`](@ref). @@ -97,7 +97,7 @@ See also [`LearnAPI.coefficients`](@ref). Implementation is optional. -$(DOC_IMPLEMENTED_METHODS(:intercept)). +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.intercept)")). """ function intercept end @@ -120,7 +120,7 @@ See also [`LearnAPI.trees`](@ref). Implementation is optional. -$(DOC_IMPLEMENTED_METHODS(:tree)). +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.tree)")). """ function tree end @@ -137,7 +137,7 @@ See also [`LearnAPI.tree`](@ref). Implementation is optional. -$(DOC_IMPLEMENTED_METHODS(:trees)). +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.trees)")). """ function trees end @@ -155,7 +155,7 @@ See also [`fit`](@ref). Implement for iterative algorithms that compute and record training losses as part of training (e.g. neural networks). -$(DOC_IMPLEMENTED_METHODS(:training_losses)). +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_losses)")). """ function training_losses end @@ -173,7 +173,7 @@ See also [`fit`](@ref). Implement for iterative algorithms that compute and record training losses as part of training (e.g. neural networks). -$(DOC_IMPLEMENTED_METHODS(:training_predictions)). +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_predictions)")). """ function training_predictions end @@ -192,7 +192,7 @@ Implement for algorithms, such as outlier detection algorithms, which associate with each observation during training, where these scores are of interest in later processes (e.g, in defining normalized scores for new data). -$(DOC_IMPLEMENTED_METHODS(:training_scores)). 
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_scores)")).

"""
function training_scores end

@@ -212,9 +212,9 @@ See also [`is_composite`](@ref).

# New implementations

-Implementent if and only if `model` is a composite model.
+Implement if and only if `model` is a composite model.

-$(DOC_IMPLEMENTED_METHODS(:components)).
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.components)")).

"""
function components end

@@ -229,7 +229,7 @@ See also [`fit`](@ref).

# New implementations

-$(DOC_IMPLEMENTED_METHODS(:training_labels)).
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_labels)")).

"""
function training_labels end

@@ -273,7 +273,7 @@ See also [`fit`](@ref).

Implementation is discouraged for byproducts already covered by other LearnAPI.jl
accessor functions: $ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS_LIST.

-$(DOC_IMPLEMENTED_METHODS(:training_labels)).
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.extras)")).

"""
function extras end

@@ -287,4 +287,3 @@ const ACCESSOR_FUNCTIONS_LIST = join(
    ", ",
    " and ",
)
-
diff --git a/src/fit.jl b/src/fit.jl
index 316d0eab..56087fd3 100644
--- a/src/fit.jl
+++ b/src/fit.jl
@@ -14,15 +14,19 @@ returning an object, `model`, on which other methods, such as [`predict`](@ref)
[`transform`](@ref), can be dispatched. [`LearnAPI.functions(algorithm)`](@ref) returns a
list of methods that can be applied to either `algorithm` or `model`.

-The second signature applies to algorithms which do not generalize to new observations. In
-that case `predict` or `transform` actually execute the algorithm, but may also write to
-the (mutable) object returned by `fit`.
+The second signature is provided by algorithms that do not generalize to new observations
+("static" algorithms). In that case, `transform(model, data)` or `predict(model, ...,
+data)` carries out the actual algorithm execution, writing any byproducts of that
+operation to the mutable object `model` returned by `fit`.

-When `data` is a tuple, a data slurping form of `fit` is typically provided.
+Whenever `fit` expects a tuple form of argument, `data = (X1, ..., Xn)`, the
+signature `fit(algorithm, X1, ..., Xn)` is also provided.
+
+For example, a supervised classifier will typically admit this workflow:

```julia
-model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)`
-ŷ = predict(model, X)
+model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)`
+ŷ = predict(model, Xnew)
```

Use `verbosity=0` for warnings only, and `-1` for silent training.
@@ -34,7 +38,14 @@ See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref),

# New implementations

-Implementation is compulsory. The signature must include `verbosity`.
+Implementation is compulsory. The signature must include `verbosity`. Note the requirement
+on providing slurping signatures. A fallback for the first signature calls the second,
+ignoring `data`:
+
+```julia
+fit(algorithm, data...; kwargs...) = fit(algorithm; kwargs...)
+```
+$(DOC_DATA_INTERFACE(:fit))

"""
-fit(algorithm, data...; kwargs...) = nothing
+fit(algorithm, data...; kwargs...) = fit(algorithm; kwargs...)
diff --git a/src/minimize.jl b/src/minimize.jl
index f37b9d0a..653d3fdf 100644
--- a/src/minimize.jl
+++ b/src/minimize.jl
@@ -17,7 +17,7 @@ functionality is preserved by `minimize`.

# New implementations

Overloading `minimize` for new algorithms is optional. The fallback is the
-identity. $(DOC_IMPLEMENTED_METHODS(:minimize, overloaded=true))
+identity. 
$(DOC_IMPLEMENTED_METHODS(":minimize", overloaded=true)) New implementations must enforce the following identities, whenever the right-hand side is defined: diff --git a/src/obs.jl b/src/obs.jl index f67b19c9..2a874d6a 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -7,9 +7,9 @@ Return an algorithm-specific representation of `data`, suitable for passing to ` `data`. Here `model` is the return value of `fit(algorithm, ...)` for some LearnAPI.jl algorithm, `algorithm`. -The returned object is guaranteed to implement observation access as indicated -by [`LearnAPI.data_interface(algorithm)`](@ref) (typically the -[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface). +The returned object is guaranteed to implement observation access as indicated by +[`LearnAPI.data_interface(algorithm)`](@ref) (typically +[`LearnAPI.RandomAccess()`](@ref)). Calling `fit`/`predict`/`transform` on the returned objects may have performance advantages over calling directly on `data` in some contexts. And resampling the returned @@ -21,21 +21,19 @@ object using `MLUtils.getobs` may be cheaper than directly resampling the compon Usual workflow, using data-specific resampling methods: ```julia -X = -y = - -Xtrain = Tables.select(X, 1:100) -ytrain = y[1:100] -model = fit(algorithm, (Xtrain, ytrain)) -ŷ = predict(model, LiteralTarget(), y[101:150]) +data = (X, y) # a DataFrame and a vector +data_train = (Tables.select(X, 1:100), y[1:100]) +model = fit(algorithm, data_train) +ŷ = predict(model, LiteralTarget(), X[101:150]) ``` -Alternative workflow using `obs` and the MLUtils.jl API: +Alternative workflow using `obs` and the MLUtils.jl method `getobs` (assumes +`LearnAPI.data_interface(algorithm) == RandomAccess()`): ```julia import MLUtils -fit_obsevations = obs(algorithm, (X, y)) +fit_observations = obs(algorithm, data) model = fit(algorithm, MLUtils.getobs(fit_observations, 1:100)) predict_observations = obs(model, X) @@ -52,15 +50,16 @@ See also [`LearnAPI.data_interface`](@ref). Implementation is typically optional. -For each supported form of `data` in `fit(algorithm, data)`, `predict(model, data)`, and -`transform(model, data)`, it must be true that `model = fit(algorithm, observations)` is -supported, whenever `observations = obs(algorithm, data)`, and that `predict(model, -observations)` and `transform(model, observations)` are supported, whenever `observations -= obs(model, data)`. +For each supported form of `data` in `fit(algorithm, data)`, it must be true that `model = +fit(algorithm, observations)` is equivalent to `model = fit(algorithm, data)`, whenever +`observations = obs(algorithm, data)`. For each supported form of `data` in calls +`predict(model, ..., data)` and `transform(model, data)`, where implemented, the calls +`predict(model, ..., observations)` and `transform(model, observations)` are supported +alternatives, whenever `observations = obs(model, data)`. The fallback for `obs` is `obs(model_or_algorithm, data) = data`, and the fallback for -`LearnAPI.data_interface(algorithm)` indicates MLUtils.jl as the adopted interface. For -details refer to the [`LearnAPI.data_interface`](@ref) document string. +`LearnAPI.data_interface(algorithm)` is `LearnAPI.RandomAccess()`. For details refer to +the [`LearnAPI.data_interface`](@ref) document string. 
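+
+Schematically, the contract just stated amounts to the following (a sketch only; `kind`
+stands for some supported `KindOfProxy` instance):
+
+```julia
+observations = obs(algorithm, data)
+fit(algorithm, observations)        # equivalent to `fit(algorithm, data)`
+
+observations = obs(model, data)
+predict(model, kind, observations)  # supported whenever `predict(model, kind, data)` is
+```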
In particular, if the `data` to be consumed by `fit`, `predict` or `transform` consists only of suitable tables and arrays, then `obs` and `LearnAPI.data_interface` do not need diff --git a/src/predict_transform.jl b/src/predict_transform.jl index a20598f8..97385a78 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -32,6 +32,21 @@ DOC_MINIMIZE(func) = """ +DOC_DATA_INTERFACE(method) = + """ + + ## Assumptions about data + + By default, it is assumed that `data` supports the [`LearnAPI.RandomAccess`](@ref) + interface (all matrices, with observations-as-columns, most tables, and tuples + thereof). See [`LearnAPI.RandomAccess`](@ref) for details. If this is not the case + then an implementation must suitably: (i) overload the trait + [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref). Refer to these + methods' document strings for details. + + """ + + # # METHOD STUBS/FALLBACKS """ @@ -56,17 +71,20 @@ In the following, `algorithm` is some supervised learning algorithm with training features `X`, training target `y`, and test features `Xnew`: ```julia -model = fit(algorithm, X, y; verbosity=0) +model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` predict(model, LiteralTarget(), Xnew) ``` -Note `predict ` does not mutate any argument, except in the special case -`LearnAPI.predict_or_transform_mutates(algorithm) = true`. - See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref). # Extended help +If `predict` supports data in the form of a tuple `data = (X1, ..., Xn)`, then a slurping +signature is also provided, as in `predict(model, X1, ..., Xn)`. + +Note `predict ` does not mutate any argument, except in the special case +`LearnAPI.predict_or_transform_mutates(algorithm) = true`. + # New implementations If there is no notion of a "target" variable in the LearnAPI.jl sense, or you need an @@ -78,12 +96,14 @@ convenience form, but it is free to choose the fallback `kind_of_proxy`. Each `kind_of_proxy` that gets an implementation must be added to the list returned by [`LearnAPI.kinds_of_proxy`](@ref). -$(DOC_IMPLEMENTED_METHODS(:predict)) +$(DOC_IMPLEMENTED_METHODS(":predict")) $(DOC_MINIMIZE(:predict)) $(DOC_MUTATION(:predict)) +$(DOC_DATA_INTERFACE(:predict)) + """ function predict end @@ -94,7 +114,7 @@ function predict end Return a transformation of some `data`, using some `model`, as returned by [`fit`](@ref). -For `data` that consists of a tuple, a slurping version is typically provided, i.e., +For `data` that consists of a tuple, a slurping version is also provided, i.e., you can do `transform(model, X1, X2, X3)` in place of `transform(model, (X1, X2, X3))`. # Example @@ -115,7 +135,7 @@ model = fit(algorithm) W = transform(model, X) ``` -or, in one step: +or, in one step (where supported): ```julia W = transform(algorithm, X) @@ -132,12 +152,14 @@ See also [`fit`](@ref), [`predict`](@ref), # New implementations Implementation for new LearnAPI.jl algorithms is optional. -$(DOC_IMPLEMENTED_METHODS(:transform)) +$(DOC_IMPLEMENTED_METHODS(":transform")) $(DOC_MINIMIZE(:transform)) $(DOC_MUTATION(:transform)) +$(DOC_DATA_INTERFACE(:transform)) + """ function transform end @@ -166,7 +188,7 @@ See also [`fit`](@ref), [`transform`](@ref), [`predict`](@ref). # New implementations -Implementation is optional. $(DOC_IMPLEMENTED_METHODS(:inverse_transform)) +Implementation is optional. 
$(DOC_IMPLEMENTED_METHODS(":inverse_transform")) $(DOC_MINIMIZE(:inverse_transform)) diff --git a/src/traits.jl b/src/traits.jl index 79fd3453..a3ccceb8 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -100,11 +100,11 @@ function constructor end """ LearnAPI.functions(algorithm) -Return a tuple of functions that can be meaningfully applied with `algorithm`, or an -associate model (object returned by `fit(algorithm, ...)`, as the first +Return a tuple of symbols respresenting functions that can be meaningfully applied with +`algorithm`, or an associate model (object returned by `fit(algorithm, ...)`, as the first argument. Algorithm traits (`algorithm` is the *only* argument) are excluded. -In addition to functions, the returned tuple may include expressions, like +In addition to symbols, the returned tuple may include expressions, like `:(DecisionTree.print_tree)`, which reference functions not owned by LearnAPI.jl. The understanding is that `algorithm` is a LearnAPI-compliant object whenever the return @@ -117,15 +117,15 @@ value is non-empty. All new implementations must overload this trait. Here's a checklist for elements in the return value: -| function | implementation/overloading compulsory? | include in returned tuple? | -|----------------------|----------------------------------------|----------------------------| -| `fit` | yes | yes | -| `minimize` | no | yes | -| `obs` | no | yes | -| `LearnAPI.algorithm` | yes | yes | -| `inverse_transform` | no | only if implemented | -| `predict` | no | only if implemented | -| `transform` | no | only if implemented | +| symbol | implementation/overloading compulsory? | include in returned tuple? | +|-----------------------|----------------------------------------|----------------------------| +| `:fit` | yes | yes | +| `:minimize` | no | yes | +| `:obs` | no | yes | +| `:LearnAPI.algorithm` | yes | yes | +| `:inverse_transform` | no | only if implemented | +| `:predict` | no | only if implemented | +| `:transform` | no | only if implemented | Also include any implemented accessor functions. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST. @@ -137,11 +137,15 @@ functions(::Any) = () """ LearnAPI.kinds_of_proxy(algorithm) -Returns an tuple of all instances, `kind`, for which for which `predict(algorithm, kind, +Returns a tuple of all instances, `kind`, for which for which `predict(algorithm, kind, data...)` has a guaranteed implementation. Each such `kind` subtypes [`LearnAPI.KindOfProxy`](@ref). Examples are `LiteralTarget()` (for predicting actual target values) and `Distributions()` (for predicting probability mass/density functions). +If a `predict(model, data)` is overloaded to return predictions for a specific kind of +proxy (e.g., `predict(model::MyModel, data) = predict(model, Distribution(), data)`) then +that kind appears first in the returned tuple. + See also [`LearnAPI.predict`](@ref), [`LearnAPI.KindOfProxy`](@ref). # Extended help @@ -188,7 +192,7 @@ target(::Any, data) = nothing """ LearnAPI.weights(algorithm)::Bool - LearnAPI.target(algorithm, data) -> weights + LearnAPI.weights(algorithm, data) -> weights First method (an algorithm trait) returns `true` if the second method returns per-observation weights, for some value(s) of `data`, where `data` is a supported argument @@ -333,7 +337,7 @@ load_path(::Any) = "unknown" Returns `true` if one or more properties (fields) of `algorithm` may themselves be algorithms, and `false` otherwise. -See also `[LearnAPI.components]`(@ref). 
+See also [`LearnAPI.components`](@ref). # New implementations @@ -367,28 +371,23 @@ human_name(M) = snakecase(name(M), delim=' ') # `name` defined below """ LearnAPI.data_interface(algorithm) -Return the data interface supported by `algorithm` for accessing individual observations in -representations of input data returned by [`obs(algorithm, data)`](@ref) or [`obs(model, -data)`](@ref). Here `data` is `fit`, `predict`, or `transform`-consumable data. - -Options for the return value: - -- `Base.HasLength()`: Data returned by `obs` implements the - [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs/numobs` interface; it - usually suffices to overload `Base.getindex` and `Base.length` (which are the - `getobs/numobs` fallbacks). +Return the data interface supported by `algorithm` for accessing individual observations +in representations of input data returned by [`obs(algorithm, data)`](@ref) or +[`obs(model, data)`](@ref), whenever `algorithm == LearnAPI.algorithm(model)`. Here `data` +is `fit`, `predict`, or `transform`-consumable data. -- `Base.SizeUnknown()`: Data returned by `obs` implements Julia's `iterate` - interface. +Possible return values are [`LearnAPI.RandomAccess`](@ref), +[`LearnAPI.FiniteIterable`](@ref), and [`LearnAPI.Iterable`](@ref). See also [`obs`](@ref). # New implementations -The fallback returns `Base.HasLength`. +The fallback returns [`LearnAPI.RandomAccess`](@ref), which applies to arrays, most +tables, and tuples of these. See the doc-string for details. """ -data_interface(::Any) = Base.HasLength() +data_interface(::Any) = LearnAPI.RandomAccess() """ LearnAPI.predict_or_transform_mutates(algorithm) diff --git a/src/types.jl b/src/types.jl index 02218bd3..8d755fdb 100644 --- a/src/types.jl +++ b/src/types.jl @@ -116,3 +116,79 @@ $DOC_HOW_TO_LIST_PROXIES """ KindOfProxy + + +# # DATA INTERFACES + +abstract type DataInterface end +abstract type Finite <: DataInterface end + +""" + LearnAPI.RandomAccess + +A data interface type. We say that `data` implements the `RandomAccess` interface if +`data` implements the methods `getobs` and `numobs` from MLUtils.jl. The first method +allows one to grab observations specified by an arbitrary index set, as in +`MLUtils.getobs(data, [2, 3, 5])`, while the second method returns the total number of +available observations, which is assumed to be known and finite. + +All arrays implement `RandomAccess`, with the last index being the observation index +(observations-as-columns in matrices). + +A Tables.jl compatible table `data` implements `RandomAccess` if `Tables.istable(data)` is +true and if `data` implements `DataAPI.nrows`. This includes many tables, and in +particular, `DataFrame`s. Tables that are also tuples are excluded. + +Any tuple of objects implementing `RandomAccess` also implements `RandomAccess`. + +If [`LearnAPI.data_interface(algorithm)`](@ref) takes the value `RandomAccess()`, then +[`obs`](@ref)`(algorithm, ...)` is guaranteed to return objects implementing the +`RandomAccess` interface, and the same holds for `obs(model, ...)`, whenever +`LearnAPI.algorithm(model) == algorithm`. + +# Implementing `RandomAccess` for new data types + +Typically, to implement `RandomAccess` for a new data type requires only implementing +`Base.getindex` and `Base.length`, which are the fallbacks for `MLUtils.getobs` and +`MLUtils.numobs`, and this avoids making MLUtils.jl a package dependency. + +See also [`LearnAPI.FiniteIterable`](@ref), [`LearnAPI.Iterable`](@ref). 
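+
+For example, here is a minimal sketch for a custom container (the type `MyData`, wrapping
+a vector of observations, is hypothetical):
+
+```julia
+struct MyData{T}
+    observations::Vector{T}
+end
+Base.getindex(data::MyData, I) = MyData(data.observations[I])
+Base.length(data::MyData) = length(data.observations)
+```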
+""" +struct RandomAccess <: Finite end + +""" + LearnAPI.FiniteIterable + +A data interface type. We say that `data` implements the `FiniteIterable` interface if +it implements Julia's `iterate` interface, including `Base.length`, and if +`Base.IteratorSize(typeof(data)) == Base.HasLength()`. For example, this is true if: + +- `data` implements the [`LearnAPI.RandomAccess`](@ref) interface (arrays and most tables) + +- `data isa MLUtils.DataLoader`, which includes output from `MLUtils.eachobs`. + +If [`LearnAPI.data_interface(algorithm)`](@ref) takes the value `FiniteIterable()`, then +[`obs`](@ref)`(algorithm, ...)` is guaranteed to return objects implementing the +`FiniteIterable` interface, and the same holds for `obs(model, ...)`, whenever +`LearnAPI.algorithm(model) == algorithm`. + +See also [`LearnAPI.RandomAccess`](@ref), [`LearnAPI.Iterable`](@ref). +""" +struct FiniteIterable <: Finite end + +""" + LearnAPI.Iterable + +A data interface type. We say that `data` implements the `Iterable` interface if it +implements Julia's basic `iterate` interface. (Such objects may not implement +`MLUtils.numobs` or `Base.length`.) + +If [`LearnAPI.data_interface(algorithm)`](@ref) takes the value `Iterable()`, then +[`obs`](@ref)`(algorithm, ...)` is guaranteed to return objects implementing `Iterable`, +and the same holds for `obs(model, ...)`, whenever `LearnAPI.algorithm(model) == +algorithm`. + +See also [`LearnAPI.FiniteIterable`](@ref), [`LearnAPI.RandomAccess`](@ref). + +""" +struct Iterable <: DataInterface end From 31c42c60892502660bc1bc7ca40b637a53a5eb51 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 26 Sep 2024 16:55:03 +1200 Subject: [PATCH 044/187] add fallbacks to reduce need to overload some convenience methods --- docs/src/anatomy_of_an_implementation.md | 222 ++++++++++++----------- src/fit.jl | 7 +- src/predict_transform.jl | 9 +- src/traits.jl | 39 ++-- test/integration/regression.jl | 79 ++++---- test/integration/static_algorithms.jl | 31 ++-- 6 files changed, 207 insertions(+), 180 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 9779403a..37c73c94 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -10,19 +10,19 @@ For a transformer, implementations ordinarily implement `transform` instead of !!! important - The core implementations of `fit`, `predict`, etc, - always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`. - Calls like `fit(algorithm, X, y)` are provided as additional convenience methods. + Implementations of `fit`, `predict`, etc, + always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`. + For user convenience, calls like `fit(algorithm, X, y)` automatically fallback to `fit(algorithm, (X, y))`. !!! note - If the `data` object consumed by `fit`, `predict`, or `transform` is not - not a suitable table¹, array³, tuple of tables and arrays, or some - other object implementing - the MLUtils.jl `getobs`/`numobs` interface, - then an implementation must: (i) suitably overload the trait - [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as - illustrated below under [Providing an advanced data interface](@ref). 
+    If the `data` object consumed by `fit`, `predict`, or `transform` is not
+    a suitable table¹, array³, tuple of tables and arrays, or some
+    other object implementing
+    the MLUtils.jl `getobs`/`numobs` interface,
+    then an implementation must: (i) suitably overload the trait
+    [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as
+    illustrated below under [Providing an advanced data interface](@ref).

The first line below imports the lightweight package LearnAPI.jl whose methods we will be
extending. The second imports libraries needed for the core algorithm.
@@ -46,9 +46,10 @@ nothing # hide

Instances of `Ridge` will be [algorithms](@ref algorithms), in LearnAPI.jl parlance.

-To [qualify](@ref algorithms) as a LearnAPI algorithm, an object must be come with a
-mechanism for creating new versions of itself, with modified property (field) values. To
-this end, we implement `LearnAPI.constructor`, which must return a keyword constructor:
+Associated with each new type of LearnAPI [algorithm](@ref algorithms) will be a keyword
+argument constructor, providing default values for all properties (struct fields) that are
+not other algorithms, and we must implement `LearnAPI.constructor(algorithm)`, for
+recovering the constructor from an instance:

```@example anatomy
"""
@@ -61,18 +62,15 @@ LearnAPI.constructor(::Ridge) = Ridge
nothing # hide
```

-So, if `algorithm = Ridge(lambda=0.1)` then `LearnAPI.constructor(algorithm)(lambda=0.05)`
-is another algorithm with the same properties, except that the value of `lambda` has been
-changed to `0.05`.
-
-Note that we attach the docstring to the constructor, not the struct.
+So, in this case, if `algorithm = Ridge(0.2)`, then
+`LearnAPI.constructor(algorithm)(lambda=0.2) == algorithm` is true. Note that we attach
+the docstring to the *constructor*, not the struct.


## Implementing `fit`

-A ridge regressor requires two types of data for training: input features `X`, which
-here we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a
-vector.
+A ridge regressor requires two types of data for training: input features `X`, which here
+we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a vector.

It is convenient to define a new type for the `fit` output, which will include
coefficients labelled by feature name for inspection after training:
@@ -134,9 +132,32 @@ Here's the implementation for our ridge regressor:

```@example anatomy
LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) =
-    Tables.matrix(Xnew)*model.coefficients
+    Tables.matrix(Xnew)*model.coefficients
```

+Since we can make no other kind of prediction in this case, we may overload the following
+for user convenience:
+
+```@example anatomy
+LearnAPI.predict(model::RidgeFitted, Xnew) = predict(model, LiteralTarget(), Xnew)
+```
+
+## Extracting the target from training data
+
+The `fit` method consumes data which includes a [target variable](@ref proxy), i.e., the
+algorithm is a supervised learner. We must therefore declare how the target variable can
+be extracted from training data, by implementing [`LearnAPI.target`](@ref):
+
+```@example anatomy
+LearnAPI.target(::Ridge, data) = last(data)
+```
+
+There is a similar method, [`LearnAPI.input`](@ref), for declaring how input data can be
+extracted (for passing to `predict`, for example), but this method has a fallback which
+typically suffices: return `first(data)` if `data` is a tuple, and otherwise return
+`data`. 
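+
+To illustrate (a sketch only, assuming `data = (X, y)` as above):
+
+```julia
+data = (X, y)
+LearnAPI.target(algorithm, data) == y   # true
+LearnAPI.input(algorithm, data) == X    # true, by the fallback
+```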
## Accessor functions

An [accessor function](@ref accessor_functions) has the output of [`fit`](@ref) as its
@@ -174,74 +195,46 @@ predictions.

Algorithm [traits](@ref traits) record extra generic information about an algorithm, or
make specific promises of behavior. They usually have an algorithm as the single argument,
-and so we also regard [`LearnAPI.constructor`](@ref) defined above as a trait.
-
-In LearnAPI.jl `predict` always outputs a [target or target proxy](@ref proxy), where
-"target" is understood very broadly. We overload a trait to record the fact here that the
-target variable explicitly appears in training (i.e, the algorithm is supervised):
-
-```julia
-LearnAPI.target(::Ridge) = true
-```
-
-or, using a shortcut:
+and so we regard [`LearnAPI.constructor`](@ref) defined above as a trait.
+
+Because we have implemented `predict`, we are required to overload the
+[`LearnAPI.kinds_of_proxy`](@ref) trait. Because we can only make point predictions of the
+target, we do this like this:

```julia
-@trait Ridge target = true
+LearnAPI.kinds_of_proxy(::Ridge) = (LiteralTarget(),)
```

-The macro can be used to specify multiple traits simultaneously:
+A macro provides a shortcut, convenient when multiple traits are to be defined:

```@example anatomy
@trait(
-    Ridge,
-    constructor = Ridge,
-    target = true,
-    kinds_of_proxy=(LiteralTarget(),),
-    descriptors = (:regression,),
-    functions = (
-        fit,
-        minimize,
-        predict,
-        obs,
-        LearnAPI.algorithm,
-        LearnAPI.coefficients,
-    )
+    Ridge,
+    constructor = Ridge,
+    kinds_of_proxy=(LiteralTarget(),),
+    descriptors = (:regression,),
+    functions = (
+        :(LearnAPI.fit),
+        :(LearnAPI.algorithm),
+        :(LearnAPI.minimize),
+        :(LearnAPI.obs),
+        :(LearnAPI.input),
+        :(LearnAPI.target),
+        :(LearnAPI.predict),
+        :(LearnAPI.coefficients),
+    )
)
nothing # hide
```

-The trait `kinds_of_proxy` is required here, because we implemented `predict`.
-
-The last trait `functions` returns a list of all LearnAPI.jl methods that can be
-meaninfully applied to the algorithm or associated model. See [`LearnAPI.functions`](@ref)
-for a checklist. This, and [`LearnAPI.constructor`](@ref), are the only universally
-compulsory traits. However, it is worthwhile studying the [list of all traits](@ref
-traits_list) to see which might apply to a new implementation, to enable maximum buy into
-functionality provided by third party packages, and to assist third party algorithms that
-match machine learning algorithms to user-defined tasks.
-
-According to the contract articulated in its document string, having set
-[`LearnAPI.target`](@ref)`(::Ridge)` equal to `true`, we are obliged to overload a
-multi-argument version of `LearnAPI.target` to extract the target from the `data` that
-gets supplied to `fit`:
-
-```@example anatomy
-LearnAPI.target(::Ridge, data) = last(data)
-```
-
-## Convenience methods
-
-Finally, we extend `fit` and `predict` with signatures convenient for user interaction,
-enabling the kind of workflow previewed in [Sample workflow](@ref):
-
-```@example anatomy
-LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) =
-    fit(algorithm, (X, y); kwargs...)
-
-LearnAPI.predict(model::RidgeFitted, Xnew) =
-    predict(model, LiteralTarget(), Xnew)
-```
+The last trait, `functions`, returns a list of all LearnAPI.jl methods that can be
+meaningfully applied to the algorithm or associated model. See [`LearnAPI.functions`](@ref)
+for a checklist. [`LearnAPI.functions`](@ref) and [`LearnAPI.constructor`](@ref) are the
+only universally compulsory traits. However, it is worthwhile studying the [list of all
+traits](@ref traits_list) to see which might apply to a new implementation, to enable
+maximum buy into functionality provided by third party packages, and to assist third party
+algorithms that match machine learning algorithms to user-defined tasks.
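+
+The `@trait` call above is just shorthand for defining each trait separately, as in the
+following partial sketch:
+
+```julia
+LearnAPI.constructor(::Ridge) = Ridge
+LearnAPI.kinds_of_proxy(::Ridge) = (LiteralTarget(),)
+LearnAPI.descriptors(::Ridge) = (:regression,)
+```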
-```@example anatomy -LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = - fit(algorithm, (X, y); kwargs...) - -LearnAPI.predict(model::RidgeFitted, Xnew) = - predict(model, LiteralTarget(), Xnew) -``` ## [Demonstration](@id workflow) @@ -267,7 +260,9 @@ foreach(println, LearnAPI.functions(algorithm)) Training and predicting: ```@example anatomy -model = fit(algorithm, Tables.subset(X, train), y[train]) +Xtrain = Tables.subset(X, train) +ytrain = y[train] +model = fit(algorithm, (Xtrain, ytrain)) # `fit(algorithm, Xtrain, ytrain)` will also work ŷ = predict(model, LiteralTarget(), Tables.subset(X, test)) ``` @@ -299,7 +294,7 @@ using LearnAPI using LinearAlgebra, Tables struct Ridge{T<:Real} - lambda::T + lambda::T end Ridge(; lambda=0.1) = Ridge(lambda) @@ -320,19 +315,20 @@ LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = LearnAPI.predict(model::RidgeFitted, Xnew) = predict(model, LiteralTarget(), Xnew) @trait( - Ridge, - constructor = Ridge, - target = true, - kinds_of_proxy=(LiteralTarget(),), - descriptors = (:regression,), - functions = ( - fit, - minimize, - predict, - obs, - LearnAPI.algorithm, - LearnAPI.coefficients, - ) + Ridge, + constructor = Ridge, + kinds_of_proxy=(LiteralTarget(),), + descriptors = (:regression,), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.minimize), + :(LearnAPI.obs), + :(LearnAPI.input), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.coefficients), + ) ) n = 10 # number of observations @@ -351,7 +347,7 @@ new type: ```@example anatomy2 struct RidgeFitObs{T,M<:AbstractMatrix{T}} - A::M # p x n + A::M # `p` x `n` matrix names::Vector{Symbol} # features y::Vector{T} # target end @@ -399,20 +395,13 @@ LearnAPI.fit(algorithm::Ridge, data; kwargs...) = fit(algorithm, obs(algorithm, data); kwargs...) ``` -We provide an overloading of `LearnAPI.target` to handle the additional supported data -argument of `fit`: - -```@example anatomy2 -LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y -``` - ### The `obs` contract Providing `fit` signatures matching the output of `obs`, is the first part of the `obs` -contract. The second part is this: *The output of `obs` must implement the* -[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs/numobs` *interface for -accessing individual observations*. It usually suffices to overload `Base.getindex` and -`Base.length` (which are the `getobs/numobs` fallbacks): +contract. The second part is this: *The output of `obs` must implement the interface +specified by the trait* [`LearnAPI.data_interface(algorithm)`](@ref). Assuming this is +[`LearnAPI.RandomAccess()`](@ref) (the default) it usually suffices to overload +`Base.getindex` and `Base.length`: ```@example anatomy2 Base.getindex(data::RidgeFitObs, I) = @@ -433,6 +422,24 @@ LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = predict(model, LiteralTarget(), obs(model, Xnew)) ``` +### `target` and `input` methods + +We provide an additional overloading of [`LearnAPI.target`](@ref) to handle the additional supported data +argument of `fit`: + +```@example anatomy2 +LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y +``` + +Similarly, we must overload [`LearnAPI.input`](@ref), which extracts inputs from training +data (objects that can be passed to `predict`) like this + +```@example anatomy2 +LearnAPI.input(::Ridge, observations::RidgeFitObs) = observations.A +``` +as the fallback mentioned above is no longer adequate. 
### Important notes:

- The observations to be consumed by `fit` are returned by `obs(algorithm::Ridge, ...)`,
@@ -445,9 +452,9 @@ LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) =
  table here.

Since LearnAPI.jl provides fallbacks for `obs` that simply return the unadulterated data
-input, overloading `obs` is optional. This is provided data in publicized `fit`/`predict`
-signatures consists of objects implementing the `getobs/numobs` interface (such as tables¹
-and arrays³).
+argument, overloading `obs` is optional. This is provided the data in publicized
+`fit`/`predict` signatures consists only of objects implementing the
+[`LearnAPI.RandomAccess`](@ref) interface (most tables¹, arrays³, and tuples thereof).

To buy out of supporting the MLUtils.jl interface altogether, an implementation must
overload the trait, [`LearnAPI.data_interface(algorithm)`](@ref).
@@ -486,6 +493,3 @@ like the native ones, they must be included in the [`LearnAPI.functions`](@ref)
declaration.

³ The last index must be the observation index.
-
-⁴ Guaranteed assuming
-`LearnAPI.data_interface(algorithm) == Base.HasLength()`, the default.
diff --git a/src/fit.jl b/src/fit.jl
index 56087fd3..ddc1dd9b 100644
--- a/src/fit.jl
+++ b/src/fit.jl
@@ -43,9 +43,12 @@
 
-Implementation is compulsory. The signature must include `verbosity`. Note the requirement
-on providing slurping signatures. A fallback for the first signature calls the second,
-ignoring `data`:
+Implementation is compulsory. The signature must include `verbosity`. A fallback for the
+first signature calls the second, ignoring `data`:

```julia
-fit(algorithm, data...; kwargs...) = fit(algorithm; kwargs...)
+fit(algorithm, data; kwargs...) = fit(algorithm; kwargs...)
```

+Fallbacks also provide the data slurping versions.
+
$(DOC_DATA_INTERFACE(:fit))

"""
-fit(algorithm, data...; kwargs...) = fit(algorithm; kwargs...)
+fit(algorithm, data; kwargs...) =
+    fit(algorithm; kwargs...)
+fit(algorithm, data1, datas...; kwargs...) =
+    fit(algorithm, (data1, datas...); kwargs...)
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index 97385a78..932ecc2f 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -105,7 +105,10 @@ $(DOC_MUTATION(:predict))

$(DOC_DATA_INTERFACE(:predict))

"""
-function predict end
+predict(model, k::KindOfProxy, data1, data2, datas...; kwargs...) =
+    predict(model, k, (data1, data2, datas...); kwargs...)
+predict(model, data1, data2, datas...; kwargs...) =
+    predict(model, (data1, data2, datas...); kwargs...)

@@ -161,7 +164,9 @@ $(DOC_MUTATION(:transform))

$(DOC_DATA_INTERFACE(:transform))

"""
-function transform end
+transform(model, data1, datas...; kwargs...) =
+    transform(model, (data1, datas...); kwargs...)

diff --git a/src/traits.jl b/src/traits.jl
index a3ccceb8..090716b4 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -100,12 +100,13 @@ function constructor end
"""
    LearnAPI.functions(algorithm)

-Return a tuple of symbols representing functions that can be meaningfully applied with
-`algorithm`, or an associated model (object returned by `fit(algorithm, ...)`), as the first
-argument. Algorithm traits (`algorithm` is the *only* argument) are excluded.
+Return a tuple of expressions representing functions that can be meaningfully applied
+with `algorithm`, or an associated model (object returned by `fit(algorithm, ...)`), as the
+first argument. Algorithm traits (methods for which `algorithm` is the *only* argument)
+are excluded.

-In addition to symbols, the returned tuple may include expressions, like
-`:(DecisionTree.print_tree)`, which reference functions not owned by LearnAPI.jl.
+The returned tuple may include expressions like `:(DecisionTree.print_tree)`, which
+reference functions not owned by LearnAPI.jl. 
The understanding is that `algorithm` is a LearnAPI-compliant object whenever the return value is non-empty. @@ -117,18 +118,22 @@ value is non-empty. All new implementations must overload this trait. Here's a checklist for elements in the return value: -| symbol | implementation/overloading compulsory? | include in returned tuple? | -|-----------------------|----------------------------------------|----------------------------| -| `:fit` | yes | yes | -| `:minimize` | no | yes | -| `:obs` | no | yes | -| `:LearnAPI.algorithm` | yes | yes | -| `:inverse_transform` | no | only if implemented | -| `:predict` | no | only if implemented | -| `:transform` | no | only if implemented | - -Also include any implemented accessor functions. The LearnAPI.jl accessor functions are: -$ACCESSOR_FUNCTIONS_LIST. +| symbol | implementation/overloading compulsory? | include in returned tuple? | +|---------------------------------|----------------------------------------|------------------------------------| +| `:(LearnAPI.fit)` | yes | yes | +| `:(LearnAPI.algorithm)` | yes | yes | +| `:(LearnAPI.minimize)` | no | yes | +| `:(LearnAPI.obs)` | no | yes | +| `:(LearnAPI.input)` | no | yes, unless `fit` consumes no data | +| `:(LearnAPI.target)` | no | only if implemented | +| `:(LearnAPI.weights)` | no | only if implemented | +| `:(LearnAPI.predict)` | no | only if implemented | +| `:(LearnAPI.transform)` | no | only if implemented | +| `:(LearnAPI.inverse_transform)` | no | only if implemented | +| | no | only if implemented | + +Also include any implemented accessor functions, both those owned by LearnaAPI.jl, and any +algorithm-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST. """ functions(::Any) = () diff --git a/test/integration/regression.jl b/test/integration/regression.jl index 0ff394e4..38b0e8f3 100644 --- a/test/integration/regression.jl +++ b/test/integration/regression.jl @@ -7,8 +7,8 @@ import DataFrames # # NAIVE RIDGE REGRESSION WITH NO INTERCEPTS -# We overload `obs` to expose internal representation of input data. See later for a -# simpler variation using the `obs` fallback. +# We overload `obs` to expose internal representation of data. See later for a simpler +# variation using the `obs` fallback. # no docstring here - that goes with the constructor struct Ridge @@ -78,13 +78,10 @@ end LearnAPI.fit(algorithm::Ridge, data; kwargs...) = fit(algorithm, obs(algorithm, data); kwargs...) -# for convenience: -LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = - fit(algorithm, (X, y); kwargs...) 
- -# to extract the target: +# extracting stuff from training data: LearnAPI.target(::Ridge, data) = last(data) LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y +LearnAPI.input(::Ridge, observations::RidgeFitObs) = observations.A # observations for consumption by `predict`: LearnAPI.obs(::RidgeFitted, X) = Tables.matrix(X)' @@ -100,6 +97,7 @@ LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = # convenience method: LearnAPI.predict(model::RidgeFitted, data) = predict(model, LiteralTarget(), data) +# accessor function: LearnAPI.feature_importances(model::RidgeFitted) = model.feature_importances LearnAPI.minimize(model::RidgeFitted) = @@ -108,18 +106,20 @@ LearnAPI.minimize(model::RidgeFitted) = @trait( Ridge, constructor = Ridge, - target=true, kinds_of_proxy = (LiteralTarget(),), functions = ( - fit, - minimize, - predict, - obs, - LearnAPI.algorithm, - LearnAPI.feature_importances, - ) + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.minimize), + :(LearnAPI.obs), + :(LearnAPI.input), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.feature_importances), + ) ) +# synthetic test data: n = 30 # number of observations train = 1:6 test = 7:10 @@ -127,10 +127,14 @@ a, b, c = rand(n), rand(n), rand(n) X = (; a, b, c) X = DataFrames.DataFrame(X) y = 2a - b + 3c + 0.05*rand(n) +data = (X, y) @testset "test an implementation of ridge regression" begin algorithm = Ridge(lambda=0.5) - @test LearnAPI.obs in LearnAPI.functions(algorithm) + @test :(LearnAPI.obs) in LearnAPI.functions(algorithm) + + @test LearnAPI.target(algorithm, data) == y + @test LearnAPI.input(algorithm, data) == X # verbose fitting: @test_logs( @@ -157,10 +161,12 @@ y = 2a - b + 3c + 0.05*rand(n) @test ŷ isa Vector{Float64} @test predict(model, Tables.subset(X, test)) == ŷ - fitobs = LearnAPI.obs(algorithm, (X, y)) + fitobs = LearnAPI.obs(algorithm, data) predictobs = LearnAPI.obs(model, X) model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) + @test LearnAPI.target(algorithm, fitobs) == y @test predict(model, LiteralTarget(), MLUtils.getobs(predictobs, test)) ≈ ŷ + @test predict(model, LearnAPI.input(algorithm, fitobs)) ≈ predict(model, X) @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}} @@ -177,9 +183,6 @@ y = 2a - b + 3c + 0.05*rand(n) MLUtils.getobs(predictobs, test) ) ≈ ŷ - @test LearnAPI.target(algorithm, (X, y)) == y - @test LearnAPI.target(algorithm, fitobs) == y - end # # VARIATION OF RIDGE REGRESSION THAT USES FALLBACK OF LearnAPI.obs @@ -221,32 +224,34 @@ function LearnAPI.fit(algorithm::BabyRidge, data; verbosity=1) end +# extracting stuff from training data: LearnAPI.target(::BabyRidge, data) = last(data) -# convenience form: -LearnAPI.fit(algorithm::BabyRidge, X, y; kwargs...) = - fit(algorithm, (X, y); kwargs...) 
- LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm LearnAPI.predict(model::BabyRidgeFitted, ::LiteralTarget, Xnew) = Tables.matrix(Xnew)*model.coefficients +# convenience method: +LearnAPI.predict(model::BabyRidgeFitted, data) = predict(model, LiteralTarget(), data) + LearnAPI.minimize(model::BabyRidgeFitted) = BabyRidgeFitted(model.algorithm, model.coefficients, nothing) @trait( BabyRidge, - constructor = Ridge, - target=true, + constructor = BabyRidge, kinds_of_proxy = (LiteralTarget(),), functions = ( - fit, - minimize, - predict, - LearnAPI.algorithm, - LearnAPI.feature_importances, - ) + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.minimize), + :(LearnAPI.obs), + :(LearnAPI.input), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.feature_importances), + ) ) @testset "test a variation which does not overload LearnAPI.obs" begin @@ -256,12 +261,14 @@ LearnAPI.minimize(model::BabyRidgeFitted) = ŷ = predict(model, LiteralTarget(), Tables.subset(X, test)) @test ŷ isa Vector{Float64} - fitobs = obs(algorithm, (X, y)) + fitobs = obs(algorithm, data) predictobs = LearnAPI.obs(model, X) model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) - @test predict(model, LiteralTarget(), MLUtils.getobs(predictobs, test)) == ŷ - - @test LearnAPI.target(algorithm, (X, y)) == y + @test predict(model, LiteralTarget(), MLUtils.getobs(predictobs, test)) == ŷ == + predict(model, MLUtils.getobs(predictobs, test)) + @test LearnAPI.target(algorithm, data) == y + @test LearnAPI.predict(model, X) ≈ + LearnAPI.predict(model, LearnAPI.input(algorithm, data)) end true diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl index 3991dbf4..1d6a2ad6 100644 --- a/test/integration/static_algorithms.jl +++ b/test/integration/static_algorithms.jl @@ -15,8 +15,8 @@ struct Selector end Selector(; names=Symbol[]) = Selector(names) # LearnAPI.constructor defined later -# `fit` has no input data, does no "learning", and just returns thinly wrapped `algorithm` -# (to distinguish it from the algorithm in dispatch): +# `fit` consumes no observational data, does no "learning", and just returns a thinly +# wrapped `algorithm` (to distinguish it from the algorithm in dispatch): LearnAPI.fit(algorithm::Selector; verbosity=1) = Ref(algorithm) LearnAPI.algorithm(model) = model[] @@ -40,10 +40,11 @@ end Selector, constructor = Selector, functions = ( - fit, - minimize, - transform, - Learn.algorithm, + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.minimize), + :(LearnAPI.obs), + :(LearnAPI.transform), ), ) @@ -61,7 +62,9 @@ end # # FEATURE SELECTOR THAT REPORTS BYPRODUCTS OF SELECTION PROCESS # This a variation of `Selector` above that stores the names of rejected features in the -# model object, for inspection by an accessor function called `rejected`. +# model object, for inspection by an accessor function called `rejected`. Since +# `transform(model, X)` mutates `model` in this case, we must overload the +# `predict_or_transform_mutates` trait. 
struct Selector2 names::Vector{Symbol} @@ -99,15 +102,15 @@ end @trait( Selector2, - constructor = Selector, + constructor = Selector2, predict_or_transform_mutates = true, functions = ( - fit, - obsfit, - minimize, - transform, - Learn.algorithm, - :(MyPkg.rejected), # accessor function not owned by LearnAPI.jl + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.minimize), + :(LearnAPI.obs), + :(LearnAPI.transform), + :(MyPkg.rejected), # accessor function not owned by LearnAPI.jl, ) ) From 79c67e320277423d39682116d41bb56c556525f4 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 26 Sep 2024 17:33:01 +1200 Subject: [PATCH 045/187] add fallbacks to rm need for overloading predict convenience fn --- docs/src/anatomy_of_an_implementation.md | 19 ++++++++----------- docs/src/obs.md | 2 +- src/fit.jl | 8 +++++--- src/obs.jl | 7 ++++++- src/predict_transform.jl | 13 +++++++------ test/integration/regression.jl | 6 ------ 6 files changed, 27 insertions(+), 28 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 37c73c94..716ef514 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -116,7 +116,7 @@ end ## Implementing `predict` -The primary `predict` call will look like this: +Users will be able to call `predict` like this: ```julia predict(model, LiteralTarget(), Xnew) @@ -128,19 +128,17 @@ the target, such as probability density functions. `LiteralTarget` is an exampl [`LearnAPI.KindOfProxy`](@ref proxy_types) type. Targets and target proxies are discussed [here](@ref proxy). -Here's the implementation for our ridge regressor: +So, we provide this implementation for our ridge regressor: ```@example anatomy LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = Tables.matrix(Xnew)*model.coefficients ``` -Since we can make no other kind of prediction in this case, we may overload the following -for user convenience: +If the kind of proxy is omitted, as in `predict(model, Xnew)`, then a fallback grabs the +first element of the tuple returned by [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), which +we overload appropriately below. -```@example anatomy -LearnAPI.predict(model::RidgeFitted, Xnew) = predict(model, LiteralTarget(), Xnew) -``` ## Extracting the target from training data @@ -263,7 +261,7 @@ Training and predicting: Xtrain = Tables.subset(X, train) ytrain = y[train] model = fit(algorithm, (Xtrain, ytrain)) # `fit(algorithm, Xtrain, ytrain)` will also work -ŷ = predict(model, LiteralTarget(), Tables.subset(X, test)) +ŷ = predict(model, Tables.subset(X, test)) ``` Extracting coefficients: @@ -312,7 +310,6 @@ LearnAPI.minimize(model::RidgeFitted) = LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = fit(algorithm, (X, y); kwargs...) 
-LearnAPI.predict(model::RidgeFitted, Xnew) = predict(model, LiteralTarget(), Xnew) @trait( Ridge, @@ -424,8 +421,8 @@ LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = ### `target` and `input` methods -We provide an additional overloading of [`LearnAPI.target`](@ref) to handle the additional supported data -argument of `fit`: +We provide an additional overloading of [`LearnAPI.target`](@ref) to handle the additional +supported data argument of `fit`: ```@example anatomy2 LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y diff --git a/docs/src/obs.md b/docs/src/obs.md index bae83427..ed44668b 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -82,7 +82,7 @@ end | [`obs(algorithm_or_model, data)`](@ref) | depends | returns `data` | | | | | -A sample implementation is given in [Providing an advanced data interface](@ref). +A sample implementation is given in [Providing an advanced data interface](@ref). ## Reference diff --git a/src/fit.jl b/src/fit.jl index ddc1dd9b..2a5e0cbf 100644 --- a/src/fit.jl +++ b/src/fit.jl @@ -38,13 +38,15 @@ See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref), # New implementations -Implementation is compulsory. The signature must include `verbosity`. Note the requirement -on providing slurping signatures. A fallback for the first signature calls the second, -ignoring `data`: +Implementation is compulsory. The signature must include `verbosity`. A fallback for the +first signature calls the second, ignoring `data`: ```julia fit(algorithm, data; kwargs...) = fit(algorithm; kwargs...) ``` + +Fallbacks also provide the data slurping versions. + $(DOC_DATA_INTERFACE(:fit)) """ diff --git a/src/obs.jl b/src/obs.jl index 2a874d6a..2d784a89 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -66,9 +66,14 @@ only of suitable tables and arrays, then `obs` and `LearnAPI.data_interface` do to be overloaded. However, the user will get no performance benefits by using `obs` in that case. +When overloading `obs(algorithm, data)` to output new model-specific representations of +data, it may be necessary to also overload [`LearnAPI.input`](@ref), +[`LearnAPI.target`](@ref) (supervised algorithms), and/or [`LearnAPI.weights`](@ref) (if +weights are supported), for extracting relevant parts of the representation. + ## Sample implementation -Refer to the "Anatomy of an Implementation" section of the LearnAPI +Refer to the "Anatomy of an Implementation" section of the LearnAPI.jl [manual](https://juliaai.github.io/LearnAPI.jl/dev/). diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 932ecc2f..c30cd7a5 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -61,7 +61,8 @@ options with [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), where `algorithm = LearnAPI.algorithm(model)`. The shortcut `predict(model, data)` calls the first method with an algorithm-specific -`kind_of_proxy`. +`kind_of_proxy`, namely the first element of [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), +which lists all supported target proxies. The argument `model` is anything returned by a call of the form `fit(algorithm, ...)`. @@ -90,9 +91,7 @@ Note `predict ` does not mutate any argument, except in the special case If there is no notion of a "target" variable in the LearnAPI.jl sense, or you need an operation with an inverse, implement [`transform`](@ref) instead. -Implementation is optional. 
If the first signature is implemented for some
-`kind_of_proxy`, then the implementation should provide an implementation of the second
-convenience form, but it is free to choose the fallback `kind_of_proxy`. Each
+Implementation is optional. Only the first signature need be implemented, but each
`kind_of_proxy` that gets an implementation must be added to the list returned by
[`LearnAPI.kinds_of_proxy`](@ref).

@@ -105,12 +104,14 @@ $(DOC_MUTATION(:predict))

$(DOC_DATA_INTERFACE(:predict))

"""
+predict(model, data) = predict(model, kinds_of_proxy(algorithm(model)) |> first, data)
predict(model, k::KindOfProxy, data1, data2, datas...; kwargs...) =
    predict(model, k, (data1, data2, datas...); kwargs...)
predict(model, data1, data2, datas...; kwargs...) =
    predict(model, (data1, data2, datas...); kwargs...)
diff --git a/test/integration/regression.jl b/test/integration/regression.jl
index 38b0e8f3..d8118a72 100644
--- a/test/integration/regression.jl
+++ b/test/integration/regression.jl
@@ -94,9 +94,6 @@ LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, observations::AbstractMatr
LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) =
    predict(model, LiteralTarget(), obs(model, Xnew))

-# convenience method:
-LearnAPI.predict(model::RidgeFitted, data) = predict(model, LiteralTarget(), data)
-
# accessor function:
LearnAPI.feature_importances(model::RidgeFitted) = model.feature_importances

@@ -232,9 +229,6 @@ LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm
LearnAPI.predict(model::BabyRidgeFitted, ::LiteralTarget, Xnew) =
    Tables.matrix(Xnew)*model.coefficients

-# convenience method:
-LearnAPI.predict(model::BabyRidgeFitted, data) = predict(model, LiteralTarget(), data)
-
LearnAPI.minimize(model::BabyRidgeFitted) =
    BabyRidgeFitted(model.algorithm, model.coefficients, nothing)

From d27022916f92f8771b7a3de927870b848fe66d95 Mon Sep 17 00:00:00 2001
From: "Anthony D. 
Blaom" Date: Thu, 26 Sep 2024 17:40:35 +1200 Subject: [PATCH 046/187] add some forgotten files --- docs/make.jl | 2 +- docs/src/target_weights_input.md | 35 ++++++++++++++++++++++++++++++++ src/LearnAPI.jl | 2 +- src/target_weights_input.jl | 30 +++++++++++++++++++++++++++ 4 files changed, 67 insertions(+), 2 deletions(-) create mode 100644 docs/src/target_weights_input.md create mode 100644 src/target_weights_input.jl diff --git a/docs/make.jl b/docs/make.jl index ecfc1dd0..2514acce 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -19,7 +19,7 @@ makedocs( "predict/transform" => "predict_transform.md", "Kinds of Target Proxy" => "kinds_of_target_proxy.md", "minimize" => "minimize.md", - "input" => "input.md", + "target/weights/input" => "target_weights_input.md", "obs" => "obs.md", "Accessor Functions" => "accessor_functions.md", "Algorithm Traits" => "traits.md", diff --git a/docs/src/target_weights_input.md b/docs/src/target_weights_input.md new file mode 100644 index 00000000..8dfaa9f3 --- /dev/null +++ b/docs/src/target_weights_input.md @@ -0,0 +1,35 @@ +# [`input`](@id input) + +```julia +LearnAPI.input(algorithm, data) -> +``` + +# Typical workflow + +Not typically appearing in a general user's workflow but useful in meta-alagorithms, such +as cross-validation (see the example in [`obs` and Data Interfaces](@ref data_interface)). + +Supposing `algorithm` is a supervised classifier predicting a one-dimensional vector +target: + +```julia +model = fit(algorithm, data) +X = LearnAPI.input(algorithm, data) +y = LearnAPI.target(algorithm, data) +ŷ = predict(model, LiteralTarget(), X) +training_loss = sum(ŷ .!= y) +``` + +# Implementation guide + +The fallback returns `first(data)`, assuming `data` is a tuple, and `data` otherwise. + +| method | compulsory? | +|:-------------------------|:-----------:| +| [`LearnAPI.input`](@ref) | no | + +# Reference + +```@docs +LearnAPI.input +``` diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index 66c9aa9e..0de8c026 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -7,7 +7,7 @@ include("types.jl") include("predict_transform.jl") include("fit.jl") include("minimize.jl") -include("input.jl") +include("target_weights_input.jl") include("obs.jl") include("accessor_functions.jl") include("traits.jl") diff --git a/src/target_weights_input.jl b/src/target_weights_input.jl new file mode 100644 index 00000000..35a3e3df --- /dev/null +++ b/src/target_weights_input.jl @@ -0,0 +1,30 @@ +""" + LearnAPI.input(algorithm, data) + +Where `data` is a supported data argument for `fit`, extract from `data` something +suitable for passing as the third argument to `predict`, as in the following sample +workflow: + +```julia +model = fit(algorithm, data) +X = input(data) +ŷ = predict(algorithm, kind_of_proxy, X) # eg, `kind_of_proxy = LiteralTarget()` +``` + +The return value has the same number of observations as `data` does. Where +`LearnAPI.target(algorithm)` is `true` (supervised learning) one expects `ŷ` above to be +an approximate proxy for `target(algorithm, data)`, the training target. + + +# New implementations + +The following fallbacks typically make overloading `LearnAPI.input` unnecessary: + +```julia +LearnAPI.input(algorithm, data) = data +LearnAPI.input(algorithm, data::Tuple) = first(data) +``` + +""" +input(algorithm, data) = data +input(algorithm, data::Tuple) = first(data) From 6e721c824dd46290233ce0cc74ef3c7fc143f933 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Fri, 27 Sep 2024 08:14:09 +1200 Subject: [PATCH 047/187] doc updates and some small re-organziation of code --- docs/src/anatomy_of_an_implementation.md | 48 ++++++++++---------- docs/src/index.md | 7 +-- docs/src/reference.md | 9 ++-- docs/src/target_weights_input.md | 21 ++++++--- docs/src/traits.md | 13 +++--- src/predict_transform.jl | 8 ++-- src/target_weights_input.jl | 56 +++++++++++++++++++++--- src/traits.jl | 41 +---------------- 8 files changed, 111 insertions(+), 92 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 716ef514..58b3ba7b 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -8,21 +8,23 @@ refer to the [demonstration](@ref workflow) of the implementation given later. For a transformer, implementations ordinarily implement `transform` instead of `predict`. For more on `predict` versus `transform`, see [Predict or transform?](@ref) -!!! important +!!! note - Implementations of `fit`, `predict`, etc, - always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`. - For user convenience, calls like `fit(algorithm, X, y)` automatically fallback to `fit(algorithm, (X, y))`. + New implementations of `fit`, `predict`, etc, + always have a *single* `data` argument, as in + `LearnAPI.fit(algorithm, data; verbosity=1) = ...`. + For convenience, user calls like `fit(algorithm, X, y)` automatically fallback + to `fit(algorithm, (X, y))`. !!! note - If the `data` object consumed by `fit`, `predict`, or `transform` is not - not a suitable table¹, array³, tuple of tables and arrays, or some - other object implementing - the MLUtils.jl `getobs`/`numobs` interface, - then an implementation must: (i) suitably overload the trait - [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as - illustrated below under [Providing an advanced data interface](@ref). + If the `data` object consumed by `fit`, `predict`, or `transform` is not + not a suitable table¹, array³, tuple of tables and arrays, or some + other object implementing + the MLUtils.jl `getobs`/`numobs` interface, + then an implementation must: (i) suitably overload the trait + [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as + illustrated below under [Providing an advanced data interface](@ref). The first line below imports the lightweight package LearnAPI.jl whose methods we will be extending. The second imports libraries needed for the core algorithm. @@ -48,7 +50,7 @@ Instances of `Ridge` will be [algorithms](@ref algorithms), in LearnAPI.jl parla Associated with each new type of LearnAPI [algorithm](@ref algorithms) will be a keyword argument constructor, providing default values for all properties (struct fields) that are -not other algorithms, and we must implement `LearnAPI.constructor(algorithm)`, for +not other algorithms, and we must implement [`LearnAPI.constructor(algorithm)`](@ref), for recovering the constructor from an instance: ```@example anatomy @@ -62,7 +64,7 @@ LearnAPI.constructor(::Ridge) = Ridge nothing # hide ``` -So, in this case, if `algorithm = Ridge(0.2)`, then +For example, in this case, if `algorithm = Ridge(0.2)`, then `LearnAPI.constructor(algorithm)(lambda=0.2) == algorithm` is true. Note that we attach the docstring to the *constructor*, not the struct. @@ -123,12 +125,12 @@ predict(model, LiteralTarget(), Xnew) ``` where `Xnew` is a table (of the same form as `X` above). 
The argument `LiteralTarget()` -signals that we want literal predictions of the target variable, as opposed to a proxy for -the target, such as probability density functions. `LiteralTarget` is an example of a -[`LearnAPI.KindOfProxy`](@ref proxy_types) type. Targets and target proxies are discussed -[here](@ref proxy). +signals that literal predictions of the target variable are sought, as opposed to some +proxy for the target, such as probability density functions. `LiteralTarget` is an +example of a [`LearnAPI.KindOfProxy`](@ref proxy_types) type. Targets and target proxies +are discussed [here](@ref proxy). -So, we provide this implementation for our ridge regressor: +We provide this implementation for our ridge regressor: ```@example anatomy LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = @@ -197,7 +199,7 @@ and so we regard [`LearnAPI.constructor`](@ref) defined above as a trait. Because we have implemented `predict`, we are required to overload the [`LearnAPI.kinds_of_proxy`](@ref) trait. Because we can only make point predictions of the -target, we do this like this: +target, we make this definition: ```julia LearnAPI.kinds_of_proxy(::Ridge) = (LiteralTarget(),) @@ -233,6 +235,11 @@ traits](@ref traits_list) to see which might apply to a new implementation, to e maximum buy into functionality provided by third party packages, and to assist third party algorithms that match machine learning algorithms to user-defined tasks. +Note that we know `Ridge` instances are supervised algorithms because `:(LearnAPI.target) +in LearnAPI.functions(algorithm)`, for every instance `algorithm`. With [some +exceptions](@ref trait_contract), the value of a trait should depend only on the *type* of +the argument. + ## [Demonstration](@id workflow) @@ -308,9 +315,6 @@ LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients LearnAPI.minimize(model::RidgeFitted) = RidgeFitted(model.algorithm, model.coefficients, nothing) -LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = - fit(algorithm, (X, y); kwargs...) - @trait( Ridge, constructor = Ridge, diff --git a/docs/src/index.md b/docs/src/index.md index b66a6d74..3bb96562 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -31,9 +31,10 @@ ML/statistics toolboxes and other packages. LearnAPI.jl also provides a number o ## Sample workflow Suppose `forest` is some object encapsulating the hyperparameters of the [random forest -algorithm](https://en.wikipedia.org/wiki/Random_forest) (the number of trees, -etc.). Then, a LearnAPI.jl interface can be implemented, for objects with the type of -`forest`, to enable the following basic workflow: +algorithm](https://en.wikipedia.org/wiki/Random_forest) (the number of trees, etc.). Then, +a LearnAPI.jl interface can be implemented, for objects with the type of `forest`, to +enable the basic workflow below. In this case data is presented following the +"scikit-learn" `X, y` pattern, although LearnAPI.jl supports other patterns as well. ```julia X = diff --git a/docs/src/reference.md b/docs/src/reference.md index de0bb3d6..3b8d7397 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -160,11 +160,12 @@ Only these method names are exported by LearnAPI: `fit`, `transform`, `inverse_t - [`minimize`](@ref algorithm_minimize): for stripping the `model` output by `fit` of inessential content, for purposes of serialization. -- [`LearnAPI.input`](@ref input): for extracting inputs from training data. 
+- [`LearnAPI.target`](@ref input), [`LearnAPI.weights`](@ref input),
+  [`LearnAPI.input`](@ref): for extracting relevant parts of training data, where defined.

-- [`obs`](@ref data_interface): a method for exposing to the user algorithm-specific
-  representations of data that are guaranteed to implement observation access, as
-  specified by [`LearnAPI.data_interface(algorithm)`](@ref).
+- [`obs`](@ref data_interface): optional method for exposing to the user
+  algorithm-specific representations of data that are guaranteed to implement observation
+  access, as specified by [`LearnAPI.data_interface(algorithm)`](@ref).

 - [Accessor functions](@ref accessor_functions): these include functions like
   `feature_importances` and `training_losses`, for extracting, from training outcomes,
diff --git a/docs/src/target_weights_input.md b/docs/src/target_weights_input.md
index 8dfaa9f3..847dbbec 100644
--- a/docs/src/target_weights_input.md
+++ b/docs/src/target_weights_input.md
@@ -1,9 +1,15 @@
-# [`input`](@id input)
+# [`target`, `weights`, and `input`](@id input)
+
+Methods for extracting parts of training data:

 ```julia
+LearnAPI.target(algorithm, data) ->
+LearnAPI.weights(algorithm, data) ->
 LearnAPI.input(algorithm, data) ->
 ```

+Here `data` is something supported in a call of the form `fit(algorithm, data)`.
+
 # Typical workflow

 Not typically appearing in a general user's workflow but useful in meta-algorithms, such
@@ -24,12 +30,17 @@ training_loss = sum(ŷ .!= y)

 # Implementation guide

 The fallback returns `first(data)`, assuming `data` is a tuple, and `data` otherwise.

-| method                   | compulsory? |
-|:-------------------------|:-----------:|
-| [`LearnAPI.input`](@ref) | no          |
+| method                      | fallback          | compulsory?            |   |
+|:----------------------------|:-----------------:|------------------------|---|
+| [`LearnAPI.target`](@ref)   | returns `nothing` | no                     |   |
+| [`LearnAPI.weights`](@ref)  | returns `nothing` | no                     |   |
+| [`LearnAPI.input`](@ref)    | see docstring     | only if fallback fails |   |
+

 # Reference

 ```@docs
+LearnAPI.target
+LearnAPI.weights
 LearnAPI.input
 ```
diff --git a/docs/src/traits.md b/docs/src/traits.md
index b04b494d..c75145b1 100644
--- a/docs/src/traits.md
+++ b/docs/src/traits.md
@@ -1,8 +1,9 @@
 # [Algorithm Traits](@id traits)

-Traits generally promise specific algorithm behavior, such as: *This algorithm supports
-per-observation weights, or *This algorithm's `transform` method predicts `Real`
-vectors*. They also record more mundane information, such as a package license.
+Traits generally promise specific algorithm behavior, such as: *This algorithm can make
+point or probabilistic predictions*, *This algorithm sees a target variable in training*,
+or *This algorithm's `transform` method predicts `Real` vectors*. They also record more
+mundane information, such as a package license.

 Algorithm traits are functions whose first (and usually only) argument is an algorithm.

@@ -24,8 +25,6 @@ package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.
| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` |
| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` |
| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. | `()` | `(Distribution(), Interval())` |
-| [`LearnAPI.target`](@ref)`(algorithm)` | `true` if target can appear in `fit` data | `false` | `true` |
-| [`LearnAPI.weights`](@ref)`(algorithm)` | `true` if per-observation weights can appear in `fit` data | `false` | `true` |
| [`LearnAPI.descriptors`](@ref)`(algorithm)` | lists one or more suggestive algorithm descriptors from `LearnAPI.descriptors()` | `()` | (:regression, :probabilistic) |
| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` |
| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` |
@@ -91,7 +90,7 @@ Multiple traits can be declared like this:
 )
 ```

-### The global trait contracts
+### [The global trait contract](@id trait_contract)

 To ensure that trait metadata can be stored in an external algorithm registry, LearnAPI.jl
 requires:
@@ -115,8 +114,6 @@ informative (as in `LearnAPI.predict_type(algorithm) = Any`).
 LearnAPI.constructor
 LearnAPI.functions
 LearnAPI.kinds_of_proxy
-LearnAPI.target
-LearnAPI.weights
 LearnAPI.descriptors
 LearnAPI.is_pure_julia
 LearnAPI.pkg_name
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index c30cd7a5..c1c9d9d2 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -1,4 +1,4 @@
- function DOC_IMPLEMENTED_METHODS(name; overloaded=false)
+function DOC_IMPLEMENTED_METHODS(name; overloaded=false)
     word = overloaded ? "overloaded" : "implemented"
     "If $word, you must include `$name` in the tuple returned by the "*
     "[`LearnAPI.functions`](@ref) trait. "
@@ -105,6 +105,8 @@ $(DOC_DATA_INTERFACE(:predict))
 """
 predict(model, data) = predict(model, kinds_of_proxy(algorithm(model)) |> first, data)
+
+# automatic slurping of multiple data arguments:
 predict(model, k::KindOfProxy, data1, data2, datas...; kwargs...) =
     predict(model, k, (data1, data2, datas...); kwargs...)
 predict(model, data1, data2, datas...; kwargs...) =
@@ -166,9 +168,7 @@ $(DOC_DATA_INTERFACE(:transform))
 """
 transform(model, data1, datas...; kwargs...) =
-    transform(model, (data1, datas...); kwargs...)
-
-
+    transform(model, (data1, datas...); kwargs...) # automatic slurping
 """
     inverse_transform(model, data)
diff --git a/src/target_weights_input.jl b/src/target_weights_input.jl
index 35a3e3df..b5d486e6 100644
--- a/src/target_weights_input.jl
+++ b/src/target_weights_input.jl
@@ -1,9 +1,47 @@
+"""
+    LearnAPI.target(algorithm, data) -> target
+
+Return, for each form of `data` supported in a call of the form [`fit(algorithm,
+data)`](@ref), the target variable part of `data`. If `nothing` is returned, the
+`algorithm` does not see a target variable in training (is unsupervised).
+
+Refer to LearnAPI.jl documentation for the precise meaning of "target".
+
+# New implementations
+
+A fallback returns `nothing`. Must be implemented if `fit` consumes data including a
+target variable.
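+
+For example, a supervised algorithm whose `fit` consumes `(X, y)` tuples might make the
+declaration below (a minimal sketch; the algorithm type `MyRidge` is hypothetical):
+
+```julia
+# training data has the form `(X, y)`, so the target is the last component:
+LearnAPI.target(::MyRidge, data::Tuple) = last(data)
+```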
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.target)"; overloaded=true))
+
+"""
+target(::Any, data) = nothing
+
+"""
+    LearnAPI.weights(algorithm, data) -> weights
+
+Return, for each form of `data` supported in a call of the form [`fit(algorithm,
+data)`](@ref), the per-observation weights part of `data`. Where `nothing` is returned, no
+weights are part of `data`, which is to be interpreted as uniform weighting.
+
+# New implementations
+
+Overloading is optional. A fallback returns `nothing`.
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.weights)"; overloaded=true))
+
+"""
+weights(::Any, data) = nothing
+
 """
     LearnAPI.input(algorithm, data)

-Where `data` is a supported data argument for `fit`, extract from `data` something
-suitable for passing as the third argument to `predict`, as in the following sample
-workflow:
+Return, for each form of `data` supported in a call of the form [`fit(algorithm,
+data)`](@ref), the "input" or "features" part of `data` (as opposed to the target
+variable, for example).
+
+The returned object `X` may always be passed to `predict` or `transform`, where
+implemented, as in the following sample workflow:

 ```julia
 model = fit(algorithm, data)
 X = input(algorithm, data)
 ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = LiteralTarget()`
 ```

-The return value has the same number of observations as `data` does. Where
-`LearnAPI.target(algorithm)` is `true` (supervised learning), one expects `ŷ` above to be
-an approximate proxy for `target(algorithm, data)`, the training target.
+The return value has the same number of observations as `data` does. For supervised models
+(i.e., where `:(LearnAPI.target) in LearnAPI.functions(algorithm)`), `ŷ` above is generally
+intended to be an approximate proxy for `LearnAPI.target(algorithm, data)`, the training
+target.

 # New implementations

 The following fallbacks typically make overloading `LearnAPI.input` unnecessary:

 ```julia
 LearnAPI.input(algorithm, data) = data
 LearnAPI.input(algorithm, data::Tuple) = first(data)
 ```

+Overloading may be necessary if [`obs(algorithm, data)`](@ref) is overloaded to return
+some algorithm-specific representation of training `data`. For density estimators, whose
+`fit` typically consumes *only* a target variable, you should overload this method to
+return `nothing`.
+
 """
 input(algorithm, data) = data
 input(algorithm, data::Tuple) = first(data)
diff --git a/src/traits.jl b/src/traits.jl
index 090716b4..7fcf63d6 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -27,8 +27,6 @@ const TRAITS = [
     :constructor,
     :functions,
     :kinds_of_proxy,
-    :target,
-    :weights,
     :descriptors,
     :is_pure_julia,
     :pkg_name,
@@ -63,8 +61,7 @@ const TRAITS = [
 """
     LearnAPI.constructor(algorithm)

-Return a keyword constructor that can be used to clone `algorithm` or make copies with
-selectively altered property values:
+Return a keyword constructor that can be used to clone `algorithm`:

 ```julia-repl
 julia> algorithm.lambda
@@ -179,42 +176,6 @@ For more on target variables and target proxies, refer to the LearnAPI documenta
 """
 kinds_of_proxy(::Any) = ()

-"""
-    LearnAPI.target(algorithm)::Bool
-    LearnAPI.target(algorithm, data) -> target
-
-First method (an algorithm trait) returns `true` if the second method returns a target
-variable for some value(s) of `data`, where `data` is a supported argument in
-[`fit(algorithm, data)`](@ref).
-
-# New implementations
-
-The trait fallback returns `false`. A fallback for the second method returns `nothing`.
- -""" -target(::Any) = false -target(::Any, data) = nothing - -""" - LearnAPI.weights(algorithm)::Bool - LearnAPI.weights(algorithm, data) -> weights - -First method (an algorithm trait) returns `true` if the second method returns -per-observation weights, for some value(s) of `data`, where `data` is a supported argument -in [`fit(algorithm, data)`](@ref). - -Otherwise, weights, if they apply, are assumed uniform. - -# New implementations - -The trait fallback returns `false`. A fallback for the second method returns `nothing`, -which is interpreted as uniform weights. - -""" -weights(::Any) = false -weights(::Any, data) = nothing - - descriptors() = [ :regression, :classification, From 729e0d790fefd221b0ebeabdb0074254ae99aa91 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 10:07:12 +1300 Subject: [PATCH 048/187] complete addition of update methods + other tweaks --- docs/make.jl | 4 +- docs/src/accessor_functions.md | 6 +- docs/src/anatomy_of_an_implementation.md | 39 ++++--- docs/src/fit.md | 57 ++++++---- docs/src/index.md | 34 +++--- docs/src/kinds_of_target_proxy.md | 2 +- docs/src/obs.md | 4 +- docs/src/reference.md | 47 ++++---- ...ts_input.md => target_weights_features.md} | 18 +-- src/LearnAPI.jl | 5 +- src/fit.jl | 107 ++++++++++++++++-- src/obs.jl | 2 +- src/predict_transform.jl | 12 +- ...ts_input.jl => target_weights_features.jl} | 20 ++-- src/traits.jl | 84 +++++++------- test/integration/regression.jl | 14 ++- test/integration/static_algorithms.jl | 2 + 17 files changed, 301 insertions(+), 156 deletions(-) rename docs/src/{target_weights_input.md => target_weights_features.md} (59%) rename src/{target_weights_input.jl => target_weights_features.jl} (78%) diff --git a/docs/make.jl b/docs/make.jl index 2514acce..dafb1c97 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -15,11 +15,11 @@ makedocs( "Anatomy of an Implementation" => "anatomy_of_an_implementation.md", "Reference" => [ "Overview" => "reference.md", - "fit" => "fit.md", + "fit/update" => "fit.md", "predict/transform" => "predict_transform.md", "Kinds of Target Proxy" => "kinds_of_target_proxy.md", "minimize" => "minimize.md", - "target/weights/input" => "target_weights_input.md", + "target/weights/features" => "target_weights_features.md", "obs" => "obs.md", "Accessor Functions" => "accessor_functions.md", "Algorithm Traits" => "traits.md", diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md index f35adc54..e6e50864 100644 --- a/docs/src/accessor_functions.md +++ b/docs/src/accessor_functions.md @@ -1,6 +1,7 @@ # [Accessor Functions](@id accessor_functions) -The sole argument of an accessor function is the output, `model`, of [`fit`](@ref). +The sole argument of an accessor function is the output, `model`, of +[`fit`](@ref). Algorithms are free to implement any number of these, or none of them. - [`LearnAPI.algorithm(model)`](@ref) - [`LearnAPI.extras(model)`](@ref) @@ -15,6 +16,9 @@ The sole argument of an accessor function is the output, `model`, of [`fit`](@re - [`LearnAPI.training_scores(model)`](@ref) - [`LearnAPI.components(model)`](@ref) +Algorithm-specific accessor functions may also be implemented. The names of all accessor +functions are included in the list returned by [`LearnAPI.functions(algorithm)`](@ref). + ## Implementation guide All new implementations must implement [`LearnAPI.algorithm`](@ref). 
While all others are
diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md
index 58b3ba7b..206e624d 100644
--- a/docs/src/anatomy_of_an_implementation.md
+++ b/docs/src/anatomy_of_an_implementation.md
@@ -5,7 +5,7 @@ regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. T
 workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also
 refer to the [demonstration](@ref workflow) of the implementation given later.

-For a transformer, implementations ordinarily implement `transform` instead of
+A transformer ordinarily implements `transform` instead of
 `predict`. For more on `predict` versus `transform`, see [Predict or transform?](@ref)

 !!! note
@@ -13,18 +13,26 @@ For a transformer, implementations ordinarily implement `transform` instead of
     New implementations of `fit`, `predict`, etc,
     always have a *single* `data` argument, as in
     `LearnAPI.fit(algorithm, data; verbosity=1) = ...`.
-    For convenience, user calls like `fit(algorithm, X, y)` automatically fall back
+    For convenience, user-calls, such as `fit(algorithm, X, y)`, automatically fall back
     to `fit(algorithm, (X, y))`.

 !!! note

+    By default, it is assumed that `data` supports the [`LearnAPI.RandomAccess`](@ref)
+    interface; this includes all matrices, with observations-as-columns, most tables, and
+    tuples thereof. See [`LearnAPI.RandomAccess`](@ref) for details. If this is not the
+    case then an implementation must either:
+
     If the `data` object consumed by `fit`, `predict`, or `transform` is
     not a suitable table¹, array³, tuple of tables and arrays, or some
     other object implementing
     the MLUtils.jl `getobs`/`numobs` interface,
-    then an implementation must: (i) suitably overload the trait
-    [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as
-    illustrated below under [Providing an advanced data interface](@ref).
+    then an implementation must: (i) overload [`obs`](@ref) to articulate how
+    provided data can be transformed into a form that does support
+    it, as illustrated below under
+    [Providing an advanced data interface](@ref); or (ii) overload the trait
+    [`LearnAPI.data_interface`](@ref) to specify a more relaxed data
+    API.

 The first line below imports the lightweight package LearnAPI.jl whose methods we will be
 extending.
@@ -152,9 +160,9 @@ from training data, by implementing [`LearnAPI.target`](@ref):
 LearnAPI.target(algorithm, data) = last(data)
 ```

-There is a similar method, [`LearnAPI.input`](@ref) for declaring how input data can be
-extracted (for passing to `predict`, for example) but this method has a fallback which
-typically suffices: return `first(data)` if `data` is a tuple, and otherwise return
+There is a similar method, [`LearnAPI.features`](@ref) for declaring how training features
+can be extracted (for passing to `predict`, for example) but this method has a fallback
+which typically suffices: return `first(data)` if `data` is a tuple, and otherwise return
 `data`.
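+
+With the tuple form of training data used in this tutorial, these fallbacks amount to the
+following sanity check (an illustrative sketch only, not part of the implementation):
+
+```julia
+data = (X, y)
+@assert LearnAPI.features(algorithm, data) == X  # fallback returns `first(data)`
+@assert LearnAPI.target(algorithm, data) == y    # as implemented above
+```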
@@ -218,7 +226,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined:
         :(LearnAPI.algorithm),
         :(LearnAPI.minimize),
         :(LearnAPI.obs),
-        :(LearnAPI.input),
+        :(LearnAPI.features),
         :(LearnAPI.target),
         :(LearnAPI.predict),
         :(LearnAPI.coefficients),
@@ -325,7 +333,7 @@ LearnAPI.minimize(model::RidgeFitted) =
         :(LearnAPI.algorithm),
         :(LearnAPI.minimize),
         :(LearnAPI.obs),
-        :(LearnAPI.input),
+        :(LearnAPI.features),
        :(LearnAPI.target),
        :(LearnAPI.predict),
        :(LearnAPI.coefficients),
@@ -423,7 +431,7 @@ LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) =
     predict(model, LiteralTarget(), obs(model, Xnew))
 ```

-### `target` and `input` methods
+### `target` and `features` methods

 We provide an additional overloading of [`LearnAPI.target`](@ref) to handle the additional
 supported data argument of `fit`:
@@ -432,11 +440,11 @@ supported data argument of `fit`:
 LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y
 ```

-Similarly, we must overload [`LearnAPI.input`](@ref), which extracts inputs from training
-data (objects that can be passed to `predict`) like this
+Similarly, we must overload [`LearnAPI.features`](@ref), which extracts features from
+training data (objects that can be passed to `predict`) like this

 ```@example anatomy2
-LearnAPI.input(::Ridge, observations::RidgeFitObs) = observations.A
+LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A
 ```

 as the fallback mentioned above is no longer adequate.
@@ -482,6 +490,9 @@ ẑ = predict(model, MLUtils.getobs(observations_for_predict, test))
 @assert ẑ == ŷ
 ```

+For an application of [`obs`](@ref) to efficient cross-validation, see [here](@ref
+obs_workflows).
+
 ---

 ¹ In LearnAPI.jl a *table* is any object `X` implementing the
diff --git a/docs/src/fit.md b/docs/src/fit.md
index 1687c686..c512be9c 100644
--- a/docs/src/fit.md
+++ b/docs/src/fit.md
@@ -1,22 +1,28 @@
-# [`fit`](@ref fit)
+# [`fit`, `update`, `update_observations`, and `update_features`](@id fit)

-Training for the first time:
+### Training

 ```julia
 fit(algorithm, data; verbosity=1) -> model
 fit(algorithm; verbosity=1) -> static_model
 ```

-Updating:
+A "static" algorithm is one that does not generalize to new observations (e.g., some
+clustering algorithms); there is no training data and the algorithm is executed by
+`predict` or `transform`, which receive the data. See example below.
+
+When `fit` expects a tuple form of argument, `data = (X1, ..., Xn)`, then the signature
+`fit(algorithm, X1, ..., Xn)` is also provided.
+
+### Updating

 ```
-fit(model, data; verbosity=1, param1=new_value1, param2=new_value2, ...) -> updated_model
-fit(model, NewObservations(), new_data; verbosity=1, param1=new_value1, ...) -> updated_model
-fit(model, NewFeatures(), new_data; verbosity=1, param1=new_value1, ...) -> updated_model
+update(model, data; verbosity=1, param1=new_value1, param2=new_value2, ...) -> updated_model
+update_observations(model, new_data; verbosity=1, param1=new_value1, ...) -> updated_model
+update_features(model, new_data; verbosity=1, param1=new_value1, ...) -> updated_model
 ```

-When `fit` expects a tuple form of argument, `data = (X1, ..., Xn)`, then the signature
-`fit(algorithm, X1, ..., Xn)` is also provided.
+Data slurping forms are similarly provided for updating methods.
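+
+For example, assuming a supervised algorithm trained on a tuple `(X, y)`, the following
+two calls are equivalent (`Xnew` and `ynew` are placeholders for the new observations):
+
+```julia
+model = update_observations(model, (Xnew, ynew))  # tuple form
+model = update_observations(model, Xnew, ynew)    # slurping form
+```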
## Typical workflows

```julia
algorithm = Algorithm(n=100)
model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)`

# Predict probability distributions:
ŷ = predict(model, Distribution(), Xnew)

# Inspect some byproducts of training:
LearnAPI.feature_importances(model)

# Add 50 iterations and predict again:
-model = fit(model; n=150)
+model = update(model; n=150)
predict(model, Distribution(), X)
```

```julia
# Apply some clustering algorithm which cannot be generalized to new data:
-model = fit(algorithm)
-labels = predict(model, LabelAmbiguous(), X) # mutates `model`
+model = fit(algorithm) # no training data
+labels = predict(model, LabelAmbiguous(), X) # may mutate `model`
+
+# Or, in one line:
+labels = predict(algorithm, LabelAmbiguous(), X)

-# inspect byproducts of the clustering algorithm (e.g., outliers):
+# But the two-line version exposes byproducts of the clustering algorithm (e.g., outliers):
LearnAPI.extras(model)
```

## Implementation guide

### Training

| method | fallback | compulsory? |
|:-------------------------------------------------------------------------------|:-----------------------------------------------------------------|--------------------|
| [`fit`](@ref)`(algorithm, data; verbosity=1)` | ignores `data` and applies signature below | yes, unless static |
| [`fit`](@ref)`(algorithm; verbosity=1)` | none | no, unless static |

### Updating

| method | fallback | compulsory? |
|:-------------------------------------------------------------------------------------|:---------|-------------|
| [`update`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no |
| [`update_observations`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no |
| [`update_features`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no |

There are some contracts regarding the behaviour of the update methods, as they relate to
a previous `fit` call. Consult the document strings for details.

## Reference

```@docs
fit
update
update_observations
update_features
```
diff --git a/docs/src/index.md b/docs/src/index.md
index 3bb96562..cf11d259 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -9,12 +9,14 @@ A base Julia interface for machine learning and statistics
```

-LearnAPI.jl is a lightweight, functional-style interface, providing a collection of
-[methods](@ref Methods), such as `fit` and `predict`, to be implemented by algorithms from
-machine learning and statistics. Through such implementations, these algorithms buy into
-functionality, such as hyperparameter optimization and model composition, as provided by
-ML/statistics toolboxes and other packages. LearnAPI.jl also provides a number of Julia
-[traits](@ref traits) for promising specific behavior.
+LearnAPI.jl is a lightweight, functional-style interface, providing a
+collection of [methods](@ref Methods), such as `fit` and `predict`, to be implemented by
+algorithms from machine learning and statistics. Through such implementations, these
+algorithms buy into functionality, such as hyperparameter optimization and model
+composition, as provided by ML/statistics toolboxes and other packages. LearnAPI.jl also
+provides a number of Julia [traits](@ref traits) for promising specific behavior.
+
+LearnAPI.jl has no package dependencies.

 ```@raw html
 🚧
@@ -31,9 +33,10 @@ ML/statistics toolboxes and other packages. LearnAPI.jl also provides a number o
 ## Sample workflow

 Suppose `forest` is some object encapsulating the hyperparameters of the [random forest
-algorithm](https://en.wikipedia.org/wiki/Random_forest) (the number of trees,
-etc.). Then, a LearnAPI.jl interface can be implemented, for objects with the type of
-`forest`, to enable the following basic workflow:
+algorithm](https://en.wikipedia.org/wiki/Random_forest) (the number of trees, etc.). Then,
+a LearnAPI.jl interface can be implemented, for objects with the type of `forest`, to
+enable the basic workflow below. In this case data is presented following the
+"scikit-learn" `X, y` pattern, although LearnAPI.jl supports other patterns as well.

```julia
X =
y =
Xnew =

+# List LearnAPI functions implemented for `forest`:
+LearnAPI.functions(forest)
+
# Train:
model = fit(forest, X, y)

+# Generate point predictions:
+ŷ = predict(model, Xnew) # or `predict(model, LiteralTarget(), Xnew)`
+
# Predict probability distributions:
predict(model, Distribution(), Xnew)

-# Generate point predictions:
-ŷ = predict(model, LiteralTarget(), Xnew) # or `predict(model, Xnew)`
-
# Apply an "accessor function" to inspect byproducts of training:
LearnAPI.feature_importances(model)

@@ -77,13 +82,14 @@ data_interface) (read as "observations") gives users and meta-algorithms access
algorithm-specific representation of input data, which is also guaranteed to implement a
standard interface for accessing individual observations, unless the algorithm explicitly
opts out. Moreover, the `fit` and `predict` methods will also be able to consume these
-alternative data representations.
+alternative data representations, for performance benefits in some situations.

The fallback data interface is the [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl)
-`getobs/numobs` interface, and if the input consumed by the algorithm already implements
-that interface (tables, arrays, etc.) then overloading `obs` is completely optional. Plain
-iteration interfaces, with or without knowledge of the number of observations, can also be
-specified (to support, e.g., data loaders reading images from disk).
+`getobs/numobs` interface (here tagged as [`LearnAPI.RandomAccess()`](@ref)) and if the
+input consumed by the algorithm already implements that interface (tables, arrays, etc.)
+then overloading `obs` is completely optional. Plain iteration interfaces, with or without
+knowledge of the number of observations, can also be specified (to support, e.g., data
+loaders reading images from disk).

## Learning more
diff --git a/docs/src/kinds_of_target_proxy.md b/docs/src/kinds_of_target_proxy.md
index 35d51e4c..218c378a 100644
--- a/docs/src/kinds_of_target_proxy.md
+++ b/docs/src/kinds_of_target_proxy.md
@@ -47,7 +47,7 @@ expectiles at 50% will provide `LiteralTarget` instead.

> Table of concrete subtypes of `LearnAPI.IID <: LearnAPI.KindOfProxy`.
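+
+For example, a probabilistic regressor supporting both point and distribution predictions
+might be queried as follows (an illustrative sketch only):
+
+```julia
+ŷ = predict(model, LiteralTarget(), Xnew)  # point predictions of the target
+d̂ = predict(model, Distribution(), Xnew)   # predicted probability distributions
+```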
-## Proxies for distribution-fitting algorithms
+## Proxies for density estimation algorithms

```@docs
LearnAPI.Single
diff --git a/docs/src/obs.md b/docs/src/obs.md
index ed44668b..82be98b5 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -11,7 +11,7 @@ obs(algorithm, data) # can be passed to `fit` instead of `data`
obs(model, data)     # can be passed to `predict` or `transform` instead of `data`
```

-## Typical workflows
+## [Typical workflows](@id obs_workflows)

LearnAPI.jl makes no universal assumptions about the form of `data` in a call like
`fit(algorithm, data)`. However, if we define
@@ -46,7 +46,7 @@ import MLUtils

algorithm =
data =
-X = LearnAPI.input(algorithm, data)
+X = LearnAPI.features(algorithm, data)
y = LearnAPI.target(algorithm, data)

train_test_folds = map([1:10, 11:20, 21:30]) do test
diff --git a/docs/src/reference.md b/docs/src/reference.md
index 3b8d7397..698d0943 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -105,7 +105,7 @@ algorithm-valued.

Any object `algorithm` for which [`LearnAPI.functions`](@ref)`(algorithm)` is non-empty is
understood to have a valid implementation of the LearnAPI.jl interface.

-### Example
+#### Example

Any instance of `GradientRidgeRegressor` defined below is a valid algorithm.
@@ -120,33 +120,35 @@ GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01)
LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor
```

-### Documentation
+## Documentation

Attach public LearnAPI.jl-related documentation for an algorithm to its *constructor*,
rather than to the struct defining its type. In this way, an algorithm can implement
multiple interfaces, in addition to the LearnAPI interface, with separate document strings
for each.

## Methods

-Only these method names are exported by LearnAPI: `fit`, `transform`, `inverse_transform`,
-`minimize`, and `obs`.
-
-!!! note
+!!! note "Compulsory methods"

-    All new implementations must implement [`fit`](@ref),
-    [`LearnAPI.algorithm`](@ref algorithm_minimize), [`LearnAPI.constructor`](@ref) and
-    [`LearnAPI.functions`](@ref). The last two are algorithm traits, which can be set
-    with the [`@trait`](@ref) macro.
+    All new algorithm types must implement [`fit`](@ref),
+    [`LearnAPI.algorithm`](@ref algorithm_minimize), [`LearnAPI.constructor`](@ref) and
+    [`LearnAPI.functions`](@ref).

+Most algorithms will also implement [`predict`](@ref) and/or [`transform`](@ref).

### List of methods

- [`fit`](@ref fit): for training or updating algorithms that generalize to new data. Or,
-  for non-generalizing algorithms (see [Static Algorithms](@ref)), wrap `algorithm` in a
-  mutable struct that can be mutated by `predict`/`transform` to record byproducts of
-  those operations.
+  for non-generalizing algorithms (see [Static Algorithms](@ref)), for wrapping
+  `algorithm` in a mutable struct that can be mutated by `predict`/`transform` to record
+  byproducts of those operations.

+- [`update`](@ref fit): for updating learning outcomes after hyperparameter changes, such
+  as increasing an iteration parameter.
+
+- [`update_observations`](@ref fit), [`update_features`](@ref fit): update learning
+  outcomes by presenting additional training data.

- [`predict`](@ref operations): for outputting [targets](@ref proxy) or [target
  proxies](@ref proxy) (such as probability density functions)
- [`LearnAPI.target`](@ref input), [`LearnAPI.weights`](@ref input), - [`LearnAPI.input`](@ref): for extracting relevant parts of training data, where defined. + [`LearnAPI.features`](@ref): for extracting relevant parts of training data, where + defined. -- [`obs`](@ref data_interface): optional method for exposing to the user - algorithm-specific representations of data that are guaranteed to implement observation - access, as specified by [`LearnAPI.data_interface(algorithm)`](@ref). +- [`obs`](@ref data_interface): method for exposing to the user + algorithm-specific representations of data, which are additionally guaranteed to + implement the observation access API specified by + [`LearnAPI.data_interface(algorithm)`](@ref). - [Accessor functions](@ref accessor_functions): these include functions like `feature_importances` and `training_losses`, for extracting, from training outcomes, information common to many algorithms. -- [Algorithm traits](@ref traits): special methods, that promise specific algorithm - behavior or for recording general information about the algorithm. Only - [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref) are universally - compulsory. +- [Algorithm traits](@ref traits): methods that promise specific algorithm behavior or + record general information about the algorithm. Only [`LearnAPI.constructor`](@ref) and + [`LearnAPI.functions`](@ref) are universally compulsory. --- diff --git a/docs/src/target_weights_input.md b/docs/src/target_weights_features.md similarity index 59% rename from docs/src/target_weights_input.md rename to docs/src/target_weights_features.md index 847dbbec..78205a44 100644 --- a/docs/src/target_weights_input.md +++ b/docs/src/target_weights_features.md @@ -1,11 +1,11 @@ -# [`target`, `weights`, and `input`](@id input) +# [`target`, `weights`, and `features`](@id input) Methods for extracting parts of training data: ```julia LearnAPI.target(algorithm, data) -> LearnAPI.weights(algorithm, data) -> -LearnAPI.input(algorithm, data) -> +LearnAPI.features(algorithm, data) -> ``` Here `data` is something supported in a call of the form `fit(algorithm, data)`. @@ -20,7 +20,7 @@ target: ```julia model = fit(algorithm, data) -X = LearnAPI.input(algorithm, data) +X = LearnAPI.features(algorithm, data) y = LearnAPI.target(algorithm, data) ŷ = predict(model, LiteralTarget(), X) training_loss = sum(ŷ .!= y) @@ -30,11 +30,11 @@ training_loss = sum(ŷ .!= y) The fallback returns `first(data)`, assuming `data` is a tuple, and `data` otherwise. -| method | fallback | compulsory? | | -|:---------------------------|:-----------------:|------------------------|---| -| [`LearnAPI.target`](@ref) | returns `nothing` | no | | -| [`LearnAPI.weights`](@ref) | returns `nothing` | no | | -| [`LearnAPI.input`](@ref) | see docstring | only if fallback fails | | +| method | fallback | compulsory? 
| +|:----------------------------|:-----------------:|------------------------| +| [`LearnAPI.target`](@ref) | returns `nothing` | no | +| [`LearnAPI.weights`](@ref) | returns `nothing` | no | +| [`LearnAPI.features`](@ref) | see docstring | only if fallback fails | # Reference @@ -42,5 +42,5 @@ The fallback returns `first(data)`, assuming `data` is a tuple, and `data` other ```@docs LearnAPI.target LearnAPI.weights -LearnAPI.input +LearnAPI.features ``` diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index 0de8c026..e98d6dbc 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -7,13 +7,14 @@ include("types.jl") include("predict_transform.jl") include("fit.jl") include("minimize.jl") -include("target_weights_input.jl") +include("target_weights_features.jl") include("obs.jl") include("accessor_functions.jl") include("traits.jl") export @trait -export fit, predict, transform, inverse_transform, minimize, obs +export fit, update, update_observations, update_features +export predict, transform, inverse_transform, minimize, obs for name in Symbol.(CONCRETE_TARGET_PROXY_TYPES_SYMBOLS) @eval export $name diff --git a/src/fit.jl b/src/fit.jl index 2a5e0cbf..faefd610 100644 --- a/src/fit.jl +++ b/src/fit.jl @@ -1,13 +1,8 @@ -# # DOC STRING HELPERS - -const TRAINING_FUNCTIONS = (:fit,) - - # # FIT """ - LearnAPI.fit(algorithm, data; verbosity=1) - LearnAPI.fit(algorithm; verbosity=1) + fit(algorithm, data; verbosity=1) + fit(algorithm; verbosity=1) Execute the algorithm with configuration `algorithm` using the provided training `data`, returning an object, `model`, on which other methods, such as [`predict`](@ref) or @@ -54,3 +49,101 @@ fit(algorithm, data; kwargs...) = fit(algorithm; kwargs...) fit(algorithm, data1, datas...; kwargs...) = fit(algorithm, (data1, datas...); kwargs...) + +# # UPDATE AND COUSINS + +""" + update(model, data; verbosity=1, hyperparam_replacements...) + +Return an updated version of the `model` object returned by a previous [`fit`](@ref) or +`update` call, but with the specified hyperparameter replacements, in the form `p1=value1, +p2=value2, ...`. + +Provided that `data` is identical with the data presented in a preceding `fit` call, as in +the example below, execution is semantically equivalent to the call `fit(algorithm, +data)`, where `algorithm` is `LearnAPI.algorithm(model)` with the specified +replacements. In some cases (typically, when changing an iteration parameter) there may be +a performance benefit to using `update` instead of retraining ab initio. + +If `data` differs from that in the preceding `fit` or `update` call, then behaviour is +algorithm-specific. + +```julia +algorithm = MyForest(ntrees=100) + +# train with 100 trees: +model = fit(algorithm, data) + +# add 50 more trees: +model = update(model, data; ntrees=150) +``` + +See also [`fit`](@ref), [`update_observations`](@ref), [`update_features`](@ref). + +# New implementations + +Implementation is optional. The signature must include +`verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update)")) + +""" +update(model, data1, datas...; kwargs...) = update(model, (data1, datas...); kwargs...) + +""" + update_observations(model, new_data; verbosity=1, parameter_replacements...) + +Return an updated version of the `model` object returned by a previous [`fit`](@ref) or +`update` call given the new observations present in `new_data`. One may additionally +specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`. 
+
+When following the call `fit(algorithm, data)`, the `update_observations` call is semantically
+equivalent to retraining ab initio using a concatenation of `data` and `new_data`,
+*provided there are no hyperparameter replacements.* Behaviour is otherwise
+algorithm-specific.
+
+```julia-repl
+algorithm = MyNeuralNetwork(epochs=10, learning_rate=0.01)
+
+# train for ten epochs:
+model = fit(algorithm, data)
+
+# train for two more epochs using new data and new learning rate:
+model = update_observations(model, new_data; epochs=2, learning_rate=0.1)
+```
+
+See also [`fit`](@ref), [`update`](@ref), [`update_features`](@ref).
+
+# Extended help
+
+# New implementations
+
+Implementation is optional. The signature must include
+`verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update_observations)"))
+
+"""
+update_observations(algorithm, data1, datas...; kwargs...) =
+    update_observations(algorithm, (data1, datas...); kwargs...)
+
+"""
+    update_features(model, new_data; verbosity=1, parameter_replacements...)
+
+Return an updated version of the `model` object returned by a previous [`fit`](@ref) or
+`update` call given the new features encapsulated in `new_data`. One may additionally
+specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`.
+
+When following the call `fit(algorithm, data)`, the `update_features` call is semantically
+equivalent to retraining ab initio using a concatenation of `data` and `new_data`,
+*provided there are no hyperparameter replacements.* Behaviour is otherwise
+algorithm-specific.
+
+See also [`fit`](@ref), [`update`](@ref), [`update_observations`](@ref).
+
+# Extended help
+
+# New implementations
+
+Implementation is optional. The signature must include
+`verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update_features)"))
+
+"""
+update_features(algorithm, data1, datas...; kwargs...) =
+    update_features(algorithm, (data1, datas...); kwargs...)
diff --git a/src/obs.jl b/src/obs.jl
index 2d784a89..e781351e 100644
--- a/src/obs.jl
+++ b/src/obs.jl
@@ -67,7 +67,7 @@ to be overloaded. However, the user will get no performance benefits by using `o
that case.

When overloading `obs(algorithm, data)` to output new model-specific representations of
-data, it may be necessary to also overload [`LearnAPI.input`](@ref),
+data, it may be necessary to also overload [`LearnAPI.features`](@ref),
[`LearnAPI.target`](@ref) (supervised algorithms), and/or [`LearnAPI.weights`](@ref) (if
weights are supported), for extracting relevant parts of the representation.
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index c1c9d9d2..6b62dfd5 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -38,11 +38,13 @@ DOC_DATA_INTERFACE(method) =

    ## Assumptions about data

    By default, it is assumed that `data` supports the [`LearnAPI.RandomAccess`](@ref)
-    interface (all matrices, with observations-as-columns, most tables, and tuples
-    thereof). See [`LearnAPI.RandomAccess`](@ref) for details.
+    interface; this includes all matrices, with observations-as-columns, most tables, and
+    tuples thereof. See [`LearnAPI.RandomAccess`](@ref) for details.
If this is not the
+    case then an implementation must either: (i) overload [`obs`](@ref) to articulate how
+    provided data can be transformed into a form that does support
+    [`LearnAPI.RandomAccess`](@ref); or (ii) overload the trait
+    [`LearnAPI.data_interface`](@ref) to specify a more relaxed data API. Refer to
+    document strings for details.

    """
diff --git a/src/target_weights_input.jl b/src/target_weights_features.jl
similarity index 78%
rename from src/target_weights_input.jl
rename to src/target_weights_features.jl
index b5d486e6..e7fd0b63 100644
--- a/src/target_weights_input.jl
+++ b/src/target_weights_features.jl
@@ -34,18 +34,18 @@ $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.weights)"; overloaded=true))
weights(::Any, data) = nothing

"""
-    LearnAPI.input(algorithm, data)
+    LearnAPI.features(algorithm, data)

-Return, for each form of `data` supported in a call of the form [`fit(algorithm,
-data)`](@ref), the "input" or "features" part of `data` (as opposed to the target
-variable, for example).
+Return, for each form of `data` supported in a call of the form [`fit(algorithm,
+data)`](@ref), the "features" part of `data` (as opposed to the target
+variable, for example).

The returned object `X` may always be passed to `predict` or `transform`, where
implemented, as in the following sample workflow:

```julia
model = fit(algorithm, data)
-X = input(algorithm, data)
+X = features(algorithm, data)
ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = LiteralTarget()`
```

The return value has the same number of observations as `data` does. For supervised models
(i.e., where `:(LearnAPI.target) in LearnAPI.functions(algorithm)`), `ŷ` above is generally
intended to be an approximate proxy for `LearnAPI.target(algorithm, data)`, the training
target.

# New implementations

-The following fallbacks typically make overloading `LearnAPI.input` unnecessary:
+The only contract `features` must satisfy is the one about passability of the output to
+`predict` or `transform`, for each supported input `data`. The following fallbacks
+typically make overloading `LearnAPI.features` unnecessary:

```julia
-LearnAPI.input(algorithm, data) = data
-LearnAPI.input(algorithm, data::Tuple) = first(data)
+LearnAPI.features(algorithm, data) = data
+LearnAPI.features(algorithm, data::Tuple) = first(data)
```

Overloading may be necessary if [`obs(algorithm, data)`](@ref) is overloaded to return
some algorithm-specific representation of training `data`. For density estimators, whose
`fit` typically consumes *only* a target variable, you should overload this method to
return `nothing`.

"""
-input(algorithm, data) = data
-input(algorithm, data::Tuple) = first(data)
+features(algorithm, data) = data
+features(algorithm, data::Tuple) = first(data)
diff --git a/src/traits.jl b/src/traits.jl
index 7fcf63d6..50ddda1d 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -115,19 +115,22 @@ value is non-empty. All new implementations must overload this trait.

Here's a checklist for elements in the return value:

-| symbol | implementation/overloading compulsory? | include in returned tuple?
| +|-----------------------------------|----------------------------------------|------------------------------------| +| `:(LearnAPI.fit)` | yes | yes | +| `:(LearnAPI.algorithm)` | yes | yes | +| `:(LearnAPI.minimize)` | no | yes | +| `:(LearnAPI.obs)` | no | yes | +| `:(LearnAPI.features)` | no | yes, unless `fit` consumes no data | +| `:(LearnAPI.update)` | no | only if implemented | +| `:(LearnAPI.update_observations)` | no | only if implemented | +| `:(LearnAPI.update_features)` | no | only if implemented | +| `:(LearnAPI.target)` | no | only if implemented | +| `:(LearnAPI.weights)` | no | only if implemented | +| `:(LearnAPI.predict)` | no | only if implemented | +| `:(LearnAPI.transform)` | no | only if implemented | +| `:(LearnAPI.inverse_transform)` | no | only if implemented | +| | no | only if implemented | Also include any implemented accessor functions, both those owned by LearnaAPI.jl, and any algorithm-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST. @@ -177,38 +180,39 @@ For more on target variables and target proxies, refer to the LearnAPI documenta kinds_of_proxy(::Any) = () descriptors() = [ - :regression, - :classification, - :clustering, - :gradient_descent, - :iterative_algorithms, - :incremental_algorithms, - :dimension_reduction, - :encoders, - :static_algorithms, - :missing_value_imputation, - :ensemble_algorithms, - :wrappers, - :time_series_forecasting, - :time_series_classification, - :survival_analysis, - :distribution_fitters, - :Bayesian_algorithms, - :outlier_detection, - :collaborative_filtering, - :text_analysis, - :audio_analysis, - :natural_language_processing, - :image_processing, + "regression", + "classification", + "clustering", + "gradient descent", + "iterative algorithms", + "incremental algorithms", + "dimension reduction", + "encoders", + "feature engineering", + "static algorithms", + "missing value imputation", + "ensemble algorithms", + "wrappers", + "time series forecasting", + "time series classification", + "survival analysis", + "density estimation", + "Bayesian algorithms", + "outlier detection", + "collaborative filtering", + "text analysis", + "audio analysis", + "natural language processing", + "image processing", ] -const DOC_DESCRIPTORS_LIST = join(map(d -> "`:$d`", descriptors()), ", ") +const DOC_DESCRIPTORS_LIST = join(map(d -> "`\"$d\"`", descriptors()), ", ") """ LearnAPI.descriptors(algorithm) -Lists one or more suggestive algorithm descriptors from this list: $DOC_DESCRIPTORS_LIST (do -`LearnAPI.descriptors()` to reproduce). +Lists one or more suggestive algorithm descriptors. Do `LearnAPI.descriptors()` to list +all possible. !!! warning The value of this trait guarantees no particular behavior. The trait is @@ -216,7 +220,7 @@ Lists one or more suggestive algorithm descriptors from this list: $DOC_DESCRIPT # New implementations -This trait should return a tuple of symbols, as in `(:classifier, :text_analysis)`. +This trait should return a tuple of strings, as in `("classifier", "text analysis")`. """ descriptors(::Any) = () diff --git a/test/integration/regression.jl b/test/integration/regression.jl index d8118a72..e34a2993 100644 --- a/test/integration/regression.jl +++ b/test/integration/regression.jl @@ -81,7 +81,7 @@ LearnAPI.fit(algorithm::Ridge, data; kwargs...) 
=
 # extracting stuff from training data:
 LearnAPI.target(::Ridge, data) = last(data)
 LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y
-LearnAPI.input(::Ridge, observations::RidgeFitObs) = observations.A
+LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A

 # observations for consumption by `predict`:
 LearnAPI.obs(::RidgeFitted, X) = Tables.matrix(X)'
@@ -104,12 +104,13 @@ LearnAPI.minimize(model::RidgeFitted) =
     Ridge,
     constructor = Ridge,
     kinds_of_proxy = (LiteralTarget(),),
+    descriptors = ("regression",),
     functions = (
        :(LearnAPI.fit),
        :(LearnAPI.algorithm),
        :(LearnAPI.minimize),
        :(LearnAPI.obs),
-       :(LearnAPI.input),
+       :(LearnAPI.features),
        :(LearnAPI.target),
        :(LearnAPI.predict),
        :(LearnAPI.feature_importances),
@@ -131,7 +132,7 @@ data = (X, y)
     @test :(LearnAPI.obs) in LearnAPI.functions(algorithm)

     @test LearnAPI.target(algorithm, data) == y
-    @test LearnAPI.input(algorithm, data) == X
+    @test LearnAPI.features(algorithm, data) == X

     # verbose fitting:
     @test_logs(
@@ -163,7 +164,7 @@ data = (X, y)
     model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0)
     @test LearnAPI.target(algorithm, fitobs) == y
     @test predict(model, LiteralTarget(), MLUtils.getobs(predictobs, test)) ≈ ŷ
-    @test predict(model, LearnAPI.input(algorithm, fitobs)) ≈ predict(model, X)
+    @test predict(model, LearnAPI.features(algorithm, fitobs)) ≈ predict(model, X)

     @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}}
@@ -236,12 +237,13 @@ LearnAPI.minimize(model::BabyRidgeFitted) =
     BabyRidge,
     constructor = BabyRidge,
     kinds_of_proxy = (LiteralTarget(),),
+    descriptors = ("regression",),
     functions = (
        :(LearnAPI.fit),
        :(LearnAPI.algorithm),
        :(LearnAPI.minimize),
        :(LearnAPI.obs),
-       :(LearnAPI.input),
+       :(LearnAPI.features),
        :(LearnAPI.target),
        :(LearnAPI.predict),
        :(LearnAPI.feature_importances),
@@ -262,7 +264,7 @@ LearnAPI.minimize(model::BabyRidgeFitted) =
         predict(model, MLUtils.getobs(predictobs, test))
     @test LearnAPI.target(algorithm, data) == y
     @test LearnAPI.predict(model, X) ≈
-        LearnAPI.predict(model, LearnAPI.input(algorithm, data))
+        LearnAPI.predict(model, LearnAPI.features(algorithm, data))
end

true
diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl
index 1d6a2ad6..6a7a72af 100644
--- a/test/integration/static_algorithms.jl
+++ b/test/integration/static_algorithms.jl
@@ -39,6 +39,7 @@ end
 @trait(
     Selector,
     constructor = Selector,
+    descriptors = ("feature engineering",),
     functions = (
        :(LearnAPI.fit),
        :(LearnAPI.algorithm),
@@ -104,6 +105,7 @@ end
     Selector2,
     constructor = Selector2,
     predict_or_transform_mutates = true,
+    descriptors = ("feature engineering",),
     functions = (
        :(LearnAPI.fit),
        :(LearnAPI.algorithm),
From 20b4bfff434f8424ea641f8af092873fc03273c4 Mon Sep 17 00:00:00 2001 From: "Anthony D.
Blaom" Date: Wed, 2 Oct 2024 10:19:42 +1300 Subject: [PATCH 049/187] rename fit.* -> fit_update.* and descriptors -> tags --- docs/make.jl | 2 +- docs/src/anatomy_of_an_implementation.md | 4 ++-- docs/src/{fit.md => fit_update.md} | 0 docs/src/traits.md | 4 ++-- src/LearnAPI.jl | 2 +- src/{fit.jl => fit_update.jl} | 0 src/tools.jl | 4 ++-- src/traits.jl | 12 ++++++------ test/integration/regression.jl | 4 ++-- test/integration/static_algorithms.jl | 4 ++-- 10 files changed, 18 insertions(+), 18 deletions(-) rename docs/src/{fit.md => fit_update.md} (100%) rename src/{fit.jl => fit_update.jl} (100%) diff --git a/docs/make.jl b/docs/make.jl index dafb1c97..a0b0bb37 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -15,7 +15,7 @@ makedocs( "Anatomy of an Implementation" => "anatomy_of_an_implementation.md", "Reference" => [ "Overview" => "reference.md", - "fit/update" => "fit.md", + "fit/update" => "fit_update.md", "predict/transform" => "predict_transform.md", "Kinds of Target Proxy" => "kinds_of_target_proxy.md", "minimize" => "minimize.md", diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 206e624d..13f17da1 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -220,7 +220,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined: Ridge, constructor = Ridge, kinds_of_proxy=(LiteralTarget(),), - descriptors = (:regression,), + tags = (:regression,), functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), @@ -327,7 +327,7 @@ LearnAPI.minimize(model::RidgeFitted) = Ridge, constructor = Ridge, kinds_of_proxy=(LiteralTarget(),), - descriptors = (:regression,), + tags = (:regression,), functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), diff --git a/docs/src/fit.md b/docs/src/fit_update.md similarity index 100% rename from docs/src/fit.md rename to docs/src/fit_update.md diff --git a/docs/src/traits.md b/docs/src/traits.md index c75145b1..7699bbce 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -25,7 +25,7 @@ package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase. | [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | | [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` | | [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. | `()` | `(Distribution(), Interval())` | -| [`LearnAPI.descriptors`](@ref)`(algorithm)` | lists one or more suggestive algorithm descriptors from `LearnAPI.descriptors()` | `()` | (:regression, :probabilistic) | +| [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | | [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | | [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | | [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | @@ -114,7 +114,7 @@ informative (as in `LearnAPI.predict_type(algorithm) = Any`). 
LearnAPI.constructor LearnAPI.functions LearnAPI.kinds_of_proxy -LearnAPI.descriptors +LearnAPI.tags LearnAPI.is_pure_julia LearnAPI.pkg_name LearnAPI.pkg_license diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index e98d6dbc..ffab0130 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -5,7 +5,7 @@ import InteractiveUtils.subtypes include("tools.jl") include("types.jl") include("predict_transform.jl") -include("fit.jl") +include("fit_update.jl") include("minimize.jl") include("target_weights_features.jl") include("obs.jl") diff --git a/src/fit.jl b/src/fit_update.jl similarity index 100% rename from src/fit.jl rename to src/fit_update.jl diff --git a/src/tools.jl b/src/tools.jl index d86e3d8d..1b033f05 100644 --- a/src/tools.jl +++ b/src/tools.jl @@ -16,7 +16,7 @@ Overload a number of traits for algorithms of type `TypeEx`. For example, the co ```julia @trait( RidgeRegressor, - descriptors = ("regression", ), + tags = ("regression", ), doc_url = "https://some.cool.documentation", ) ``` @@ -24,7 +24,7 @@ Overload a number of traits for algorithms of type `TypeEx`. For example, the co is equivalent to ```julia -LearnAPI.descriptors(::RidgeRegressor) = ("regression", ), +LearnAPI.tags(::RidgeRegressor) = ("regression", ), LearnAPI.doc_url(::RidgeRegressor) = "https://some.cool.documentation", ``` diff --git a/src/traits.jl b/src/traits.jl index 50ddda1d..30ad504b 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -27,7 +27,7 @@ const TRAITS = [ :constructor, :functions, :kinds_of_proxy, - :descriptors, + :tags, :is_pure_julia, :pkg_name, :pkg_license, @@ -179,7 +179,7 @@ For more on target variables and target proxies, refer to the LearnAPI documenta """ kinds_of_proxy(::Any) = () -descriptors() = [ +tags() = [ "regression", "classification", "clustering", @@ -206,12 +206,12 @@ descriptors() = [ "image processing", ] -const DOC_DESCRIPTORS_LIST = join(map(d -> "`\"$d\"`", descriptors()), ", ") +const DOC_TAGS_LIST = join(map(d -> "`\"$d\"`", tags()), ", ") """ - LearnAPI.descriptors(algorithm) + LearnAPI.tags(algorithm) -Lists one or more suggestive algorithm descriptors. Do `LearnAPI.descriptors()` to list +Lists one or more suggestive algorithm tags. Do `LearnAPI.tags()` to list all possible. !!! warning @@ -223,7 +223,7 @@ all possible. This trait should return a tuple of strings, as in `("classifier", "text analysis")`. 
""" -descriptors(::Any) = () +tags(::Any) = () """ LearnAPI.is_pure_julia(algorithm) diff --git a/test/integration/regression.jl b/test/integration/regression.jl index e34a2993..5b91561e 100644 --- a/test/integration/regression.jl +++ b/test/integration/regression.jl @@ -104,7 +104,7 @@ LearnAPI.minimize(model::RidgeFitted) = Ridge, constructor = Ridge, kinds_of_proxy = (LiteralTarget(),), - descriptors = ("regression",) + tags = ("regression",) functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), @@ -237,7 +237,7 @@ LearnAPI.minimize(model::BabyRidgeFitted) = BabyRidge, constructor = BabyRidge, kinds_of_proxy = (LiteralTarget(),), - descriptors = ("regression",) + tags = ("regression",) functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl index 6a7a72af..a143416b 100644 --- a/test/integration/static_algorithms.jl +++ b/test/integration/static_algorithms.jl @@ -39,7 +39,7 @@ end @trait( Selector, constructor = Selector, - descriptors = ("feature engineering",) + tags = ("feature engineering",) functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), @@ -105,7 +105,7 @@ end Selector2, constructor = Selector2, predict_or_transform_mutates = true, - descriptors = ("feature engineering",) + tags = ("feature engineering",) functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), From 1a92f479e796ff708bbe4d9f31a1b9973d889ace Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 10:32:01 +1300 Subject: [PATCH 050/187] tweak --- docs/src/traits.md | 66 +++++++++++++++++++++++----------------------- 1 file changed, 33 insertions(+), 33 deletions(-) diff --git a/docs/src/traits.md b/docs/src/traits.md index 7699bbce..c20171d7 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -20,39 +20,39 @@ one argument. In the examples column of the table below, `Table`, `Continuous`, `Sampleable` are names owned by the package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/). -| trait | return value | fallback value | example | -|:----------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:---------------------------------------------------------| -| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | -| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` | -| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. 
| `()` | `(Distribution(), Interval())` | -| [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | -| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | -| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | -| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | -| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | -| [`LearnAPI.load_path`](@ref)`(algorithm)` | a string indicating where the struct for `typeof(algorithm)` is defined, beginning with name of package providing implementation | `"unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | -| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties (fields) of `algorithm` may be an algorithm | `false` | `true` | -| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | -| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | -| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | -| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | -| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | -| [`LearnAPI.fit_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | -| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{<:Real}, Real}` | -| [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | -| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `Table(Continuous)` | -| [`LearnAPI.predict_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `AbstractMatrix{<:Real}` | -| [`LearnAPI.predict_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{<:Real}` | -| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `scitype(predict(model, ...))` | `Any` | `AbstractVector{Continuous}` | -| 
[`LearnAPI.predict_output_type`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `typeof(predict(model, ...))` | `Any` | `AbstractVector{<:Real}` | -| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `transform(model, data)` works | `Union{}` | `Table(Continuous)` | -| [`LearnAPI.transform_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `transform(model, data)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.transform_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)`ensuring `transform(model, data)` works | `Union{}` | `AbstractMatrix{<:Real}}` | -| [`LearnAPI.transform_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `transform(model, data)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.transform_output_scitype`](@ref)`(algorithm)` | upper bound on `scitype(transform(model, ...))` | `Any` | `Table(Continuous)` | -| [`LearnAPI.transform_output_type`](@ref)`(algorithm)` | upper bound on `typeof(transform(model, ...))` | `Any` | `AbstractMatrix{<:Real}` | -| [`LearnAPI.predict_or_transform_mutates`](@ref)`(algorithm)` | `true` if `predict` or `transform` mutates first argument | `false` | `true` | +| trait | return value | fallback value | example | +|:----------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:-----------------------------------------------------------| +| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | +| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` | +| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. 
| `()` | `(Distribution(), Interval())` | +| [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | +| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | +| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | +| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | +| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | +| [`LearnAPI.load_path`](@ref)`(algorithm)` | a string locating the name of `LearnAPI.constructor(algorithm)` is defined, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | +| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties of `algorithm` may be an algorithm | `false` | `true` | +| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | +| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | +| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | +| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | +| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | +| [`LearnAPI.fit_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | +| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{<:Real}, Real}` | +| [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | +| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `Table(Continuous)` | +| [`LearnAPI.predict_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{Continuous}` | +| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `AbstractMatrix{<:Real}` | +| [`LearnAPI.predict_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{<:Real}` | +| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `scitype(predict(model, ...))` | `Any` | `AbstractVector{Continuous}` | +| 
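As a usage note: every trait tabulated above is queried by a plain function call. A
minimal sketch, with `Ridge` standing in for any compliant algorithm, as in the
integration tests (the comments give the values declared there, or documented fallbacks):

```julia
using LearnAPI

algorithm = Ridge(lambda=0.5)        # compliant algorithm from the test suite
LearnAPI.tags(algorithm)             # ("regression",)
LearnAPI.kinds_of_proxy(algorithm)   # (LiteralTarget(),)
LearnAPI.pkg_name(algorithm)         # "unknown", unless overloaded
```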
From d1f32596d01de2e4824e4ca1e05a09145ab56c53 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Wed, 2 Oct 2024 10:47:09 +1300
Subject: [PATCH 051/187] tweak target_observation_scitype

---
 src/traits.jl | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/traits.jl b/src/traits.jl
index 30ad504b..c6a7889e 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -464,21 +464,25 @@ fit_observation_type(::Any) = Union{}
 """
     LearnAPI.target_observation_scitype(algorithm)
 
-Return an upper bound `S` on the scitype of each observation of `LearnAPI.target(data)`,
-where `data` is an admissible argument in the call `fit(algorithm, data)`.
+Return an upper bound `S` on the scitype of each observation of an applicable target
+variable. Specifically:
 
-This interpretation only holds if `LearnAPI.target(algorithm)` is `true`. In any case,
-however, if `algorithm` implements `predict`, then `S` will always be an
-upper bound on the scitype of observations that could be conceivably extracted from the
-output of [`predict`](@ref). For example, suppose we have
+- If `:(LearnAPI.target) in LearnAPI.functions(algorithm)` (i.e., `fit` consumes target
+  variables) then "target" means anything returned by `LearnAPI.target(algorithm, data)`,
+  where `data` is an admissible argument in the call `fit(algorithm, data)`.
+
+- `S` will always be an upper bound on the scitype of observations that could be
+  conceivably extracted from the output of [`predict`](@ref).
+
+To illustrate the second case, suppose we have
 
 ```julia
 model = fit(algorithm, data)
 ŷ = predict(model, Sampleable(), data_new)
 ```
 
-Then each sample generated by each "observation" of `ŷ` (a vector of sampleable objects,
-say) will be bound in scitype by `S`.
+Then each individual sample generated by each "observation" of `ŷ` (a vector of sampleable
+objects, say) will be bound in scitype by `S`.
 
 See also [`LearnAPI.fit_observation_scitype`](@ref).
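To make the rewritten docstring concrete, here is a sketch of the corresponding
declaration for a hypothetical regressor with a continuous target (`MyRegressor` is
illustrative only; `Continuous` is owned by ScientificTypesBase.jl):

```julia
using LearnAPI
import ScientificTypesBase: Continuous

struct MyRegressor end  # hypothetical algorithm type

# each target observation is promised to have scitype `Continuous`:
LearnAPI.target_observation_scitype(::MyRegressor) = Continuous
```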
Blaom" Date: Wed, 2 Oct 2024 11:36:12 +1300 Subject: [PATCH 052/187] purge a bunch of traits related to predict/transform input/output --- docs/src/traits.md | 105 +++++---------- src/traits.jl | 317 +++------------------------------------------ 2 files changed, 49 insertions(+), 373 deletions(-) diff --git a/docs/src/traits.md b/docs/src/traits.md index c20171d7..25edaa1c 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -1,17 +1,10 @@ # [Algorithm Traits](@id traits) -Traits generally promise specific algorithm behavior, such as: *This algorithm can make -point or probabilistic predictions*, *This algorithm sees a target variable in training*, -or *This algorithm's `transform` method predicts `Real` vectors*. They also record more -mundane information, such as a package license. +Algorithm traits are simply functions whose sole argument is an algorithm. -Algorithm traits are functions whose first (and usually only) argument is an algorithm. - -### Special two-argument traits - -The two-argument version of [`LearnAPI.predict_output_scitype`](@ref) and -[`LearnAPI.predict_output_scitype`](@ref) are the only overloadable traits with more than -one argument. +Traits promise specific algorithm behavior, such as: *This algorithm can make point or +probabilistic predictions* or *This algorithm is supervised* (sees a target in +training). They may also record more mundane information, such as a package license. ## [Trait summary](@id trait_summary) @@ -20,50 +13,35 @@ one argument. In the examples column of the table below, `Table`, `Continuous`, `Sampleable` are names owned by the package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/). -| trait | return value | fallback value | example | -|:----------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:-----------------------------------------------------------| -| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | -| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` | -| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. 
| `()` | `(Distribution(), Interval())` | -| [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | -| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | -| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | -| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | -| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | -| [`LearnAPI.load_path`](@ref)`(algorithm)` | a string locating the name of `LearnAPI.constructor(algorithm)` is defined, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | -| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties of `algorithm` may be an algorithm | `false` | `true` | -| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | -| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | -| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | -| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | -| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | -| [`LearnAPI.fit_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | -| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{<:Real}, Real}` | -| [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | -| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `Table(Continuous)` | -| [`LearnAPI.predict_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)` ensuring `predict(model, kind, data)` works | `Union{}` | `AbstractMatrix{<:Real}` | -| [`LearnAPI.predict_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `predict(model, kind, data)` works | `Union{}` | `Vector{<:Real}` | -| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `scitype(predict(model, ...))` | `Any` | `AbstractVector{Continuous}` | -| 
[`LearnAPI.predict_output_type`](@ref)`(algorithm, kind_of_proxy)` | upper bound on `typeof(predict(model, ...))` | `Any` | `AbstractVector{<:Real}` | -| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)` | upper bound on `scitype(data)` ensuring `transform(model, data)` works | `Union{}` | `Table(Continuous)` | -| [`LearnAPI.transform_input_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `transform(model, data)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.transform_input_type`](@ref)`(algorithm)` | upper bound on `typeof(data)`ensuring `transform(model, data)` works | `Union{}` | `AbstractMatrix{<:Real}}` | -| [`LearnAPI.transform_input_observation_type`](@ref)`(algorithm)` | upper bound on `typeof(observation)` for `observation` in `data` ensuring `transform(model, data)` works | `Union{}` | `Vector{Continuous}` | -| [`LearnAPI.transform_output_scitype`](@ref)`(algorithm)` | upper bound on `scitype(transform(model, ...))` | `Any` | `Table(Continuous)` | -| [`LearnAPI.transform_output_type`](@ref)`(algorithm)` | upper bound on `typeof(transform(model, ...))` | `Any` | `AbstractMatrix{<:Real}` | -| [`LearnAPI.predict_or_transform_mutates`](@ref)`(algorithm)` | `true` if `predict` or `transform` mutates first argument | `false` | `true` | +| trait | return value | fallback value | example | +|:-------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:-----------------------------------------------------------| +| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | +| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` | +| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. 
| `()` | `(Distribution(), Interval())` | +| [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | +| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | +| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | +| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | +| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | +| [`LearnAPI.load_path`](@ref)`(algorithm)` | string locating name returned by `LearnAPI.constructor(algorithm)`, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | +| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties of `algorithm` may be an algorithm | `false` | `true` | +| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | +| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | +| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | +| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | +| [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | +| [`LearnAPI.predict_or_transform_mutates`](@ref)`(algorithm)` | `true` if `predict` or `transform` mutates first argument | `false` | `true` | ### Derived Traits -The following convenience methods are provided but not overloadable by new implementations. 
+The following are provided for convenience but should not be overloaded by new algorithms: -| trait | return value | example | -|:-----------------------------------------------------|:--------------------------------------------------------------------------------------------------------------|:--------| -| `LearnAPI.name(algorithm)` | algorithm type name as string | "PCA" | -| `LearnAPI.is_algorithm(algorithm)` | `true` if `LearnAPI.functions(algorithm)` is not empty | `true` | -| [`LearnAPI.predict_output_scitype(algorithm)`](@ref) | dictionary of upper bounds on the scitype of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | -| [`LearnAPI.predict_output_type(algorithm)`](@ref) | dictionary of upper bounds on the type of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | | +| trait | return value | example | +|:-----------------------------------|:---------------------------------------------------------------------|:--------| +| `LearnAPI.name(algorithm)` | algorithm type name as string | "PCA" | +| `LearnAPI.is_algorithm(algorithm)` | `true` if `algorithm` is LearnAPI.jl-compliant | `true` | +| `LearnAPI.target(algorithm)` | `true` if [`LearnAPI.target(algorithm, data)`](@ref) is implemented | `false` | +| `LearnAPI.weights(algorithm)` | `true` if [`LearnAPI.weights(algorithm, data)`](@ref) is implemented | `false` | ## Implementation guide @@ -97,15 +75,15 @@ requires: 1. *Finiteness:* The value of a trait is the same for all `algorithm`s with same value of [`LearnAPI.constructor(algorithm)`](@ref). This typically means trait values do not - depend on type parameters! There is an exception if `is_composite(algorithm) = true`. + depend on type parameters! If `is_composite(algorithm) = true`, this requirement is + dropped. -2. *Immediate serializability:* It should be possible to call a trait without first - installing any third party package. Importing the package that defines the algorithm, - together with `import LearnAPI` should suffice. +2. *Low level deserializability:* It should be possible to evaluate the trait *value* when + `LearnAPI` is the only imported module. Because of 1, combining a lot of functionality into one algorithm (e.g. the algorithm can perform both classification or regression) can mean traits are necessarily less -informative (as in `LearnAPI.predict_type(algorithm) = Any`). +informative (as in `LearnAPI.target_observation_scitype(algorithm) = Any`). ## Reference @@ -124,23 +102,8 @@ LearnAPI.is_composite LearnAPI.human_name LearnAPI.data_interface LearnAPI.iteration_parameter -LearnAPI.fit_scitype -LearnAPI.fit_type LearnAPI.fit_observation_scitype -LearnAPI.fit_observation_type LearnAPI.target_observation_scitype -LearnAPI.predict_input_scitype -LearnAPI.predict_input_observation_scitype -LearnAPI.predict_input_type -LearnAPI.predict_input_observation_type -LearnAPI.predict_output_scitype -LearnAPI.predict_output_type -LearnAPI.transform_input_scitype -LearnAPI.transform_input_observation_scitype -LearnAPI.transform_input_type -LearnAPI.transform_input_observation_type LearnAPI.predict_or_transform_mutates -LearnAPI.transform_output_scitype -LearnAPI.transform_output_type LearnAPI.@trait ``` diff --git a/src/traits.jl b/src/traits.jl index c6a7889e..dfdd5c21 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -3,7 +3,7 @@ const DOC_UNKNOWN = "Returns `\"unknown\"` if the algorithm implementation has "* - "failed to overload the trait. " + "not overloaded the trait. 
" const DOC_ON_TYPE = "The value of the trait must depend only on the type of `algorithm`. " DOC_ONLY_ONE(func) = @@ -38,21 +38,11 @@ const TRAITS = [ :iteration_parameter, :data_interface, :predict_or_transform_mutates, - :fit_scitype, :fit_observation_scitype, - :fit_type, - :fit_observation_type, :target_observation_scitype, - :predict_input_scitype, - :predict_output_scitype, - :predict_input_type, - :predict_output_type, - :transform_input_scitype, - :transform_output_scitype, - :transform_input_type, - :transform_output_type, :name, :is_algorithm, + :target, ] @@ -147,9 +137,8 @@ data...)` has a guaranteed implementation. Each such `kind` subtypes [`LearnAPI.KindOfProxy`](@ref). Examples are `LiteralTarget()` (for predicting actual target values) and `Distributions()` (for predicting probability mass/density functions). -If a `predict(model, data)` is overloaded to return predictions for a specific kind of -proxy (e.g., `predict(model::MyModel, data) = predict(model, Distribution(), data)`) then -that kind appears first in the returned tuple. +The call `predict(model, data)` always returns `predict(model, kind, data)`, where `kind` +is the first element of the trait's return value. See also [`LearnAPI.predict`](@ref), [`LearnAPI.KindOfProxy`](@ref). @@ -157,9 +146,10 @@ See also [`LearnAPI.predict`](@ref), [`LearnAPI.KindOfProxy`](@ref). # New implementations -Implementation is optional but recommended whenever `predict` is overloaded. +Must be overloaded whenever `predict` is implemented. -Elements of the returned tuple must be one of these: $CONCRETE_TARGET_PROXY_TYPES_LIST. +Elements of the returned tuple must be one of the following, described further in +LearnAPI.jl documentation: $CONCRETE_TARGET_PROXY_TYPES_LIST. Suppose, for example, we have the following implementation of a supervised learner returning only probabilistic predictions: @@ -174,6 +164,8 @@ Then we can declare @trait MyNewAlgorithmType kinds_of_proxy = (LearnaAPI.Distribution(),) ``` +LearnAPI.jl provides the fallback for `predict(model, data)`. + For more on target variables and target proxies, refer to the LearnAPI documentation. """ @@ -336,7 +328,7 @@ to return `"K-nearest neighbors regressor"`. Ideally, this is a "concrete" noun `"ridge regressor"` rather than an "abstract" noun like `"ridge regression"`. """ -human_name(M) = snakecase(name(M), delim=' ') # `name` defined below +human_name(algorithm) = snakecase(name(alogorithm), delim=' ') # `name` defined below """ LearnAPI.data_interface(algorithm) @@ -388,23 +380,6 @@ Implement if algorithm is iterative. Returns a symbol or `nothing`. iteration_parameter(::Any) = nothing -""" - LearnAPI.fit_scitype(algorithm) - -Return an upper bound `S` on the scitype of `data` guaranteed to work when calling -`fit(algorithm, data)`: if `ScientificTypes.scitype(data) <: S`, then is `fit(algorithm, -data)` is supported. - -See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_observation_scitype`](@ref), -[`LearnAPI.fit_observation_type`](@ref). - -# New implementations - -Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit)) - -""" -fit_scitype(::Any) = Union{} - """ LearnAPI.fit_observation_scitype(algorithm) @@ -415,8 +390,7 @@ when calling `fit`: if `observations = obs(algorithm, data)` and $DOC_EXPLAIN_EACHOBS -See also See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref), -[`LearnAPI.fit_observation_type`](@ref). +See also [`LearnAPI.target_observation_scitype`](@ref). # New implementations @@ -425,42 +399,6 @@ Optional. 
The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit)) """ fit_observation_scitype(::Any) = Union{} -""" - LearnAPI.fit_type(algorithm) - -Return an upper bound `T` on the type of `data` guaranteed to work when calling -`fit(algorithm, data)`: if `typeof(data) <: T`, then `fit(algorithm, data)` is supported. - -See also [`LearnAPI.fit_scitype`](@ref), [`LearnAPI.fit_observation_type`](@ref). -[`LearnAPI.fit_observation_scitype`](@ref) - -# New implementations - -Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit)) - -""" -fit_type(::Any) = Union{} - -""" - LearnAPI.fit_observation_type(algorithm) - -Return an upper bound `T` on the type of individual observations guaranteed to work -when calling `fit`: if `observations = obs(algorithm, data)` and -`typeof(o) <:S` for each `o` in `observations`, then the call -`fit(algorithm, data)` is supported. - -$DOC_EXPLAIN_EACHOBS - -See also See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref), -[`LearnAPI.fit_observation_scitype`](@ref). - -# New implementations - -Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit)) - -""" -fit_observation_type(::Any) = Union{} - """ LearnAPI.target_observation_scitype(algorithm) @@ -494,235 +432,10 @@ Optional. The fallback return value is `Any`. target_observation_scitype(::Any) = Any -function DOC_INPUT_SCITYPE(op) - extra = op == :predict ? " kind_of_proxy," : "" - ONLY = DOC_ONLY_ONE(op) - """ - LearnAPI.$(op)_input_scitype(algorithm) - - Return an upper bound `S` on the scitype of `data` guaranteed to work in the call - `$op(algorithm,$extra data)`: if `ScientificTypes.scitype(data) <: S`, - then `$op(algorithm,$extra data)` is supported. - - See also [`LearnAPI.$(op)_input_type`](@ref). - - # New implementations - - Implementation is optional. The fallback return value is `Union{}`. $ONLY - - """ -end - -function DOC_INPUT_OBSERVATION_SCITYPE(op) - extra = op == :predict ? " kind_of_proxy," : "" - ONLY = DOC_ONLY_ONE(op) - """ - LearnAPI.$(op)_observation_scitype(algorithm) - - Return an upper bound `S` on the scitype of individual observations guaranteed to work - when calling `$op`: if `observations = obs(model, data)`, for some `model` returned by - `fit(algorithm, ...)`, and `ScientificTypes.scitype(o) <: S` for each `o` in - `observations`, then the call `$(op)(model,$extra data)` is supported. - - $DOC_EXPLAIN_EACHOBS - - See also See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref), - [`LearnAPI.fit_observation_type`](@ref). - - # New implementations - - Optional. The fallback return value is `Union{}`. $ONLY - - """ -end - -function DOC_INPUT_TYPE(op) - extra = op == :predict ? " kind_of_proxy," : "" - ONLY = DOC_ONLY_ONE(op) - """ - LearnAPI.$(op)_input_type(algorithm) - - Return an upper bound `T` on the scitype of `data` guaranteed to work in the call - `$op(algorithm,$extra data)`: if `typeof(data) <: T`, - then `$op(algorithm,$extra data)` is supported. - - See also [`LearnAPI.$(op)_input_type`](@ref). - - # New implementations - - Implementation is optional. The fallback return value is `Union{}`. Should not be - overloaded if `LearnAPI.$(op)_input_scitype` is overloaded. - - """ -end - -function DOC_INPUT_OBSERVATION_TYPE(op) - extra = op == :predict ? 
" kind_of_proxy," : "" - ONLY = DOC_ONLY_ONE(op) - """ - LearnAPI.$(op)_observation_type(algorithm) - - Return an upper bound `T` on the scitype of individual observations guaranteed to work - when calling `$op`: if `observations = obs(model, data)`, for some `model` returned by - `fit(algorithm, ...)`, and `typeof(o) <: T` for each `o` in - `observations`, then the call `$(op)(model,$extra data)` is supported. - - $DOC_EXPLAIN_EACHOBS - - See also See also [`LearnAPI.fit_type`](@ref), [`LearnAPI.fit_scitype`](@ref), - [`LearnAPI.fit_observation_type`](@ref). - - # New implementations - - Optional. The fallback return value is `Union{}`. $ONLY - - """ -end - -DOC_OUTPUT_SCITYPE(op) = - """ - LearnAPI.$(op)_output_scitype(algorithm) - - Return an upper bound on the scitype of the output of the `$op` operation. - - See also [`LearnAPI.$(op)_input_scitype`](@ref). - - # New implementations - - Implementation is optional. The fallback return value is `Any`. - - """ - -DOC_OUTPUT_TYPE(op) = - """ - LearnAPI.$(op)_output_type(algorithm) - - Return an upper bound on the type of the output of the `$op` operation. - - # New implementations - - Implementation is optional. The fallback return value is `Any`. - - """ - -"$(DOC_INPUT_SCITYPE(:predict))" -predict_input_scitype(::Any) = Union{} - -"$(DOC_INPUT_OBSERVATION_SCITYPE(:predict))" -predict_input_observation_scitype(::Any) = Union{} - -"$(DOC_INPUT_TYPE(:predict))" -predict_input_type(::Any) = Union{} - -"$(DOC_INPUT_OBSERVATION_TYPE(:predict))" -predict_input_observation_type(::Any) = Union{} - -"$(DOC_INPUT_SCITYPE(:transform))" -transform_input_scitype(::Any) = Union{} - -"$(DOC_INPUT_OBSERVATION_SCITYPE(:transform))" -transform_input_observation_scitype(::Any) = Union{} - -"$(DOC_INPUT_TYPE(:transform))" -transform_input_type(::Any) = Union{} - -"$(DOC_INPUT_OBSERVATION_TYPE(:transform))" -transform_input_observation_type(::Any) = Union{} - -"$(DOC_OUTPUT_SCITYPE(:transform))" -transform_output_scitype(::Any) = Any - -"$(DOC_OUTPUT_TYPE(:transform))" -transform_output_type(::Any) = Any - - -# # TWO-ARGUMENT TRAITS - -# Here `s` is `:type` or `:scitype`: -const DOC_PREDICT_OUTPUT(s) = - """ - LearnAPI.predict_output_$s(algorithm, kind_of_proxy::KindOfProxy) - - Return an upper bound for the $(s)s of predictions of the specified form where - supported, and otherwise return `Any`. For example, if - - ŷ = predict(model, Distribution(), data) - - successfully returns (i.e., `algorithm` supports predictions of target probability - distributions) then the following is guaranteed to hold: - - $(s)(ŷ) <: predict_output_$(s)(algorithm, Distribution()) - - **Note.** This trait has a single-argument "convenience" version - `LearnAPI.predict_output_$(s)(algorithm)` derived from this one, which returns a - dictionary keyed on target proxy types. - - See also [`LearnAPI.KindOfProxy`](@ref), [`predict`](@ref), - [`predict_input_$(s)`](@ref). - - # New implementations - - Overloading the trait is optional. Here's a sample implementation for a supervised - regressor type `MyRgs` that only predicts actual values of the target: - - ```julia - @trait MyRgs predict_output_$(s) = AbstractVector{ScientificTypesBase.Continuous} - ``` - - The fallback method returns `Any`. 
- - """ - -"$(DOC_PREDICT_OUTPUT(:scitype))" -predict_output_scitype(algorithm, kind_of_proxy) = Any - -"$(DOC_PREDICT_OUTPUT(:type))" -predict_output_type(algorithm, kind_of_proxy) = Any - - # # DERIVED TRAITS -name(A) = split(string(constructor(A)), ".") |> last - -is_algorithm(A) = !isempty(functions(A)) - +name(algorithm) = split(string(constructor(algorithm)), ".") |> last +is_algorithm(algorithm) = !isempty(functions(algorithm)) preferred_kind_of_proxy(algorithm) = first(kinds_of_proxy(algorithm)) - -const DOC_PREDICT_OUTPUT2(s) = - """ - LearnAPI.predict_output_$(s)(algorithm) - - Return a dictionary of upper bounds on the $(s) of predictions, keyed on concrete - subtypes of [`LearnAPI.KindOfProxy`](@ref). Each of these subtypes represents a - different form of target prediction (`LiteralTarget`, `Distribution`, - `SurvivalFunction`, etc) possibly supported by `algorithm`, but the existence of a key - does not guarantee that form is supported. - - As an example, if - - ŷ = predict(model, Distribution(), data...) - - successfully returns (i.e., `algorithm` supports predictions of target probability - distributions) then the following is guaranteed to hold: - - $(s)(ŷ) <: LearnAPI.predict_output_$(s)s(algorithm)[Distribution] - - See also [`LearnAPI.KindOfProxy`](@ref), [`predict`](@ref), - [`LearnAPI.predict_input_$(s)`](@ref). - - # New implementations - - This single argument trait should not be overloaded. Instead, overload - [`LearnAPI.predict_output_$(s)`](@ref)(algorithm, kind_of_proxy). - - """ - -"$(DOC_PREDICT_OUTPUT2(:scitype))" -predict_output_scitype(algorithm) = - Dict(T => predict_output_scitype(algorithm, T()) - for T in CONCRETE_TARGET_PROXY_TYPES) - -"$(DOC_PREDICT_OUTPUT2(:type))" -predict_output_type(algorithm) = - Dict(T => predict_output_type(algorithm, T()) - for T in CONCRETE_TARGET_PROXY_TYPES) +target(algorithm) = :(LearnAPI.target) in functions(algorithm) +weights(algorithm) = :(LearnAPI.weights) in functions(algorithm) From 11b38cf9696de43e3f4a004b06b862ea06cc174b Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 11:47:27 +1300 Subject: [PATCH 053/187] rename LiteralTarget -> Point --- docs/src/anatomy_of_an_implementation.md | 20 +++++----- docs/src/index.md | 6 +-- docs/src/kinds_of_target_proxy.md | 50 ------------------------ docs/src/minimize.md | 4 +- docs/src/obs.md | 2 +- docs/src/predict_transform.md | 6 +-- docs/src/target_weights_features.md | 2 +- src/obs.jl | 4 +- src/predict_transform.jl | 4 +- src/target_weights_features.jl | 2 +- src/traits.jl | 2 +- src/types.jl | 46 +++++++++++++++++++++- test/integration/regression.jl | 22 +++++------ 13 files changed, 82 insertions(+), 88 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 13f17da1..3c2a7d5f 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -129,19 +129,19 @@ end Users will be able to call `predict` like this: ```julia -predict(model, LiteralTarget(), Xnew) +predict(model, Point(), Xnew) ``` -where `Xnew` is a table (of the same form as `X` above). The argument `LiteralTarget()` +where `Xnew` is a table (of the same form as `X` above). The argument `Point()` signals that literal predictions of the target variable are sought, as opposed to some -proxy for the target, such as probability density functions. `LiteralTarget` is an +proxy for the target, such as probability density functions. 
`Point` is an example of a [`LearnAPI.KindOfProxy`](@ref proxy_types) type. Targets and target proxies are discussed [here](@ref proxy). We provide this implementation for our ridge regressor: ```@example anatomy -LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = +LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = Tables.matrix(Xnew)*model.coefficients ``` @@ -210,7 +210,7 @@ Because we have implemented `predict`, we are required to overload the target, we make this definition: ```julia -LearnAPI.kinds_of_proxy(::Ridge) = (LiteralTarget(),) +LearnAPI.kinds_of_proxy(::Ridge) = (Point(),) ``` A macro provides a shortcut, convenient when multiple traits are to be defined: @@ -219,7 +219,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined: @trait( Ridge, constructor = Ridge, - kinds_of_proxy=(LiteralTarget(),), + kinds_of_proxy=(Point(),), tags = (:regression,), functions = ( :(LearnAPI.fit), @@ -326,7 +326,7 @@ LearnAPI.minimize(model::RidgeFitted) = @trait( Ridge, constructor = Ridge, - kinds_of_proxy=(LiteralTarget(),), + kinds_of_proxy=(Point(),), tags = (:regression,), functions = ( :(LearnAPI.fit), @@ -424,11 +424,11 @@ case: ```@example anatomy2 LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)' -LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, observations::AbstractMatrix) = +LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) = observations'*model.coefficients -LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) = - predict(model, LiteralTarget(), obs(model, Xnew)) +LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = + predict(model, Point(), obs(model, Xnew)) ``` ### `target` and `features` methods diff --git a/docs/src/index.md b/docs/src/index.md index cf11d259..7b638aed 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -50,7 +50,7 @@ LearnAPI.functions(forest) model = fit(forest, X, y) # Generate point predictions: -ŷ = predict(model, Xnew) # or `predict(model, LiteralTarget(), Xnew)` +ŷ = predict(model, Xnew) # or `predict(model, Point(), Xnew)` # Predict probability distributions: predict(model, Distribution(), Xnew) @@ -65,10 +65,10 @@ serialize("my_random_forest.jls", small_model) # Recover saved model and algorithm configuration: recovered_model = deserialize("my_random_forest.jls") @assert LearnAPI.algorithm(recovered_model) == forest -@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ +@assert predict(recovered_model, Point(), Xnew) == ŷ ``` -`Distribution` and `LiteralTarget` are singleton types owned by LearnAPI.jl. They allow +`Distribution` and `Point` are singleton types owned by LearnAPI.jl. They allow dispatch based on the [kind of target proxy](@ref proxy), a key LearnAPI.jl concept. LearnAPI.jl places more emphasis on the notion of target variables and target proxies than on the usual supervised/unsupervised learning dichotomy. 
From this point of view, a diff --git a/docs/src/kinds_of_target_proxy.md b/docs/src/kinds_of_target_proxy.md index 218c378a..da150f96 100644 --- a/docs/src/kinds_of_target_proxy.md +++ b/docs/src/kinds_of_target_proxy.md @@ -14,64 +14,14 @@ LearnAPI.KindOfProxy LearnAPI.IID ``` -| type | form of an observation | -|:-------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `LearnAPI.LiteralTarget` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode | -| `LearnAPI.Sampleable` | object that can be sampled to obtain object of the same form as target observation | -| `LearnAPI.Distribution` | explicit probability density/mass function whose sample space is all possible target observations | -| `LearnAPI.LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations | -| `LearnAPI.Probability`¹ | numerical probability or probability vector | -| `LearnAPI.LogProbability`¹ | log-probability or log-probability vector | -| `LearnAPI.Parametric`¹ | a list of parameters (e.g., mean and variance) describing some distribution | -| `LearnAPI.LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering | -| `LearnAPI.LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above | -| `LearnAPI.LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above | -| `LearnAPI.LabelAmbiguousFuzzy` | same as `LabelAmbiguous` but with multiple values of indeterminant number | -| `LearnAPI.Quantile`² | same as target but with quantile interpretation | -| `LearnAPI.Expectile`² | same as target but with expectile interpretation | -| `LearnAPI.ConfidenceInterval`² | confidence interval | -| `LearnAPI.Fuzzy` | finite but possibly varying number of target observations | -| `LearnAPI.ProbabilisticFuzzy` | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one) | -| `LearnAPI.SurvivalFunction` | survival function | -| `LearnAPI.SurvivalDistribution` | probability distribution for survival time | -| `LearnAPI.SurvivalHazardFunction` | hazard function for survival time | -| `LearnAPI.OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | -| `LearnAPI.Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | - -¹Provided for completeness but discouraged to avoid [ambiguities in -representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation). - -²The level will be controlled by a hyper-parameter; models providing only quantiles or -expectiles at 50% will provide `LiteralTarget` instead. - -> Table of concrete subtypes of `LearnAPI.IID <: LearnAPI.KindOfProxy`. 
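For orientation while these tables migrate to `src/types.jl`: a kind of target proxy is
selected at prediction time by passing an instance as the second argument of `predict`.
A minimal sketch, assuming `model` was returned by `fit` on a probabilistic algorithm
supporting both kinds shown:

```julia
ŷ = predict(model, Point(), Xnew)          # literal target predictions
d̂ = predict(model, Distribution(), Xnew)   # probability mass/density functions
```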
-
-
 ## Proxies for density estimation algorithms
 
 ```@docs
 LearnAPI.Single
 ```
 
-| type `T` | form of output of `predict(model, ::T)` |
-|:---|:---|
-| `LearnAPI.SingleSampleable` | object that can be sampled to obtain a single target observation |
-| `LearnAPI.SingleDistribution` | explicit probability density/mass function for sampling the target |
-| `LearnAPI.SingleLogDistribution` | explicit log-probability density/mass function for sampling the target |
-
-> Table of `LearnAPI.KindOfProxy` subtypes subtyping `LearnAPI.Single`
-
 ## Joint probability distributions
 
 ```@docs
 LearnAPI.Joint
 ```
-
-| type `T` | form of output of `predict(model, ::T, data)` |
-|:---|:---|
-| `LearnAPI.JointSampleable` | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. |
-| `LearnAPI.JointDistribution` | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` |
-| `LearnAPI.JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` |
-
-> Table of `LearnAPI.KindOfProxy` subtypes subtyping `LearnAPI.Joint`
diff --git a/docs/src/minimize.md b/docs/src/minimize.md
index 8e7a4efb..03bc028e 100644
--- a/docs/src/minimize.md
+++ b/docs/src/minimize.md
@@ -8,14 +8,14 @@ minimize(model) ->
 
 ```julia
 model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)`
-ŷ = predict(model, LiteralTarget(), Xnew)
+ŷ = predict(model, Point(), Xnew)
 
 LearnAPI.feature_importances(model)
 
 small_model = minimize(model)
 serialize("my_model.jls", small_model)
 
 recovered_model = deserialize("my_model.jls")
-@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ
+@assert predict(recovered_model, Point(), Xnew) == ŷ
 
 # throws MethodError:
 LearnAPI.feature_importances(recovered_model)
diff --git a/docs/src/obs.md b/docs/src/obs.md
index 82be98b5..cf794d87 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -68,7 +68,7 @@ scores = map(train_test_folds) do (train, test)
         global never_trained = false
     end
     predictobs_subset = MLUtils.getobs(predictobs, test)
-    ŷ = predict(model, LiteralTarget(), predictobs_subset)
+    ŷ = predict(model, Point(), predictobs_subset)
     return
diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md
index 2ec378ef..df961719 100644
--- a/docs/src/predict_transform.md
+++ b/docs/src/predict_transform.md
@@ -27,7 +27,7 @@ ŷ = predict(model, Distribution(), Xnew)
 Generate point predictions:
 
 ```julia
-ŷ = predict(model, LiteralTarget(), Xnew)
+ŷ = predict(model, Point(), Xnew)
 ```
 
 Train a dimension-reducing `algorithm`:
@@ -49,7 +49,7 @@ inverse_transform(model, Xnew_reduced)
 fitobs = obs(algorithm, (X, y)) # algorithm-specific repr. of data
 model = fit(algorithm, MLUtils.getobs(fitobs, 1:100))
 predictobs = obs(model, MLUtils.getobs(X, 101:150))
-ŷ = predict(model, LiteralTarget(), predictobs)
+ŷ = predict(model, Point(), predictobs)
 ```
@@ -65,7 +65,7 @@ ŷ = predict(model, LiteralTarget(), predictobs)
 
 If the algorithm has a notion of [target variable](@ref proxy), then use
 [`predict`](@ref) to output each supported [kind of target proxy](@ref
-proxy_types) (`LiteralTarget()`, `Distribution()`, etc).
+proxy_types) (`Point()`, `Distribution()`, etc).
 
 For output not associated with a target variable, implement [`transform`](@ref)
 instead, which does not dispatch on [`LearnAPI.KindOfProxy`](@ref), but can be optionally
diff --git a/docs/src/target_weights_features.md b/docs/src/target_weights_features.md
index 78205a44..df4f76b7 100644
--- a/docs/src/target_weights_features.md
+++ b/docs/src/target_weights_features.md
@@ -22,7 +22,7 @@ target:
 model = fit(algorithm, data)
 X = LearnAPI.features(algorithm, data)
 y = LearnAPI.target(algorithm, data)
-ŷ = predict(model, LiteralTarget(), X)
+ŷ = predict(model, Point(), X)
 training_loss = sum(ŷ .!= y)
diff --git a/src/obs.jl b/src/obs.jl
index e781351e..47fd8b79 100644
--- a/src/obs.jl
+++ b/src/obs.jl
@@ -24,7 +24,7 @@ Usual workflow, using data-specific resampling methods:
 data = (X, y) # a DataFrame and a vector
 data_train = (Tables.select(X, 1:100), y[1:100])
 model = fit(algorithm, data_train)
-ŷ = predict(model, LiteralTarget(), X[101:150])
+ŷ = predict(model, Point(), X[101:150])
 ```
 
 Alternative workflow using `obs` and the MLUtils.jl method `getobs` (assumes
@@ -37,7 +37,7 @@
 fit_observations = obs(algorithm, data)
 model = fit(algorithm, MLUtils.getobs(fit_observations, 1:100))
 
 predict_observations = obs(model, X)
-ẑ = predict(model, LiteralTarget(), MLUtils.getobs(predict_observations, 101:150))
+ẑ = predict(model, Point(), MLUtils.getobs(predict_observations, 101:150))
 @assert ẑ == ŷ
 ```
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index 6b62dfd5..a87cf07b 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -57,7 +57,7 @@ DOC_DATA_INTERFACE(method) =
 
 The first signature returns target predictions, or proxies for target predictions, for
 input features `data`, according to some `model` returned by [`fit`](@ref). Where
-supported, these are literally target predictions if `kind_of_proxy = LiteralTarget()`,
+supported, these are literally target predictions if `kind_of_proxy = Point()`,
 and probability density/mass functions if `kind_of_proxy = Distribution()`. List all
 options with [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), where
 `algorithm = LearnAPI.algorithm(model)`.
@@ -75,7 +75,7 @@ training features `X`, training target `y`, and test features `Xnew`:
 
 ```julia
 model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)`
-predict(model, LiteralTarget(), Xnew)
+predict(model, Point(), Xnew)
 ```
 
 See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref).
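The hunk that follows touches the docstrings for the `target`/`features` data
deconstructors. For orientation, a minimal sketch of how these are typically overloaded
when `fit`'s `data` is an `(X, y)` tuple; the `target` line mirrors the `BabyRidge`
definition in the integration tests, while the `features` counterpart is an assumption:

```julia
# sketch only: `data = (X, y)`, as for `BabyRidge` in the integration tests
LearnAPI.target(::BabyRidge, data) = last(data)      # returns y (verbatim from tests)
LearnAPI.features(::BabyRidge, data) = first(data)   # returns X (assumed counterpart)
```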
diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl
index e7fd0b63..69fab433 100644
--- a/src/target_weights_features.jl
+++ b/src/target_weights_features.jl
@@ -46,7 +46,7 @@ implemented, as in the following sample workflow:
 
 ```julia
 model = fit(algorithm, data)
 X = features(algorithm, data)
-ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = LiteralTarget()`
+ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = Point()`
 ```
 
 The return value has the same number of observations as `data` does. For supervised models
diff --git a/src/traits.jl b/src/traits.jl
index dfdd5c21..97adf49c 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -134,7 +134,7 @@ functions(::Any) = ()
 
 Returns a tuple of all instances, `kind`, for which `predict(algorithm, kind,
 data...)` has a guaranteed implementation. Each such `kind` subtypes
-[`LearnAPI.KindOfProxy`](@ref). Examples are `LiteralTarget()` (for predicting actual
+[`LearnAPI.KindOfProxy`](@ref). Examples are `Point()` (for predicting actual
 target values) and `Distribution()` (for predicting probability mass/density functions).
diff --git a/src/types.jl b/src/types.jl
index 8d755fdb..e046384d 100644
--- a/src/types.jl
+++ b/src/types.jl
@@ -18,10 +18,42 @@ following must hold:
 
 See also [`LearnAPI.KindOfProxy`](@ref).
 
+# Extended help

| type | form of an observation |
|:---|:---|
| `LearnAPI.Point` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode |
| `LearnAPI.Sampleable` | object that can be sampled to obtain object of the same form as target observation |
| `LearnAPI.Distribution` | explicit probability density/mass function whose sample space is all possible target observations |
| `LearnAPI.LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations |
| `LearnAPI.Probability`¹ | numerical probability or probability vector |
| `LearnAPI.LogProbability`¹ | log-probability or log-probability vector |
| `LearnAPI.Parametric`¹ | a list of parameters (e.g., mean and variance) describing some distribution |
| `LearnAPI.LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering |
| `LearnAPI.LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above |
| `LearnAPI.LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above |
| `LearnAPI.LabelAmbiguousFuzzy` | same as `LabelAmbiguous` but with multiple values of indeterminate number |
| `LearnAPI.Quantile`² | same as target but with quantile interpretation |
| `LearnAPI.Expectile`² | same as target but with expectile interpretation |
| `LearnAPI.ConfidenceInterval`² | confidence interval |
| `LearnAPI.Fuzzy` | finite but possibly varying number of target observations |
| `LearnAPI.ProbabilisticFuzzy` | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one) |
| `LearnAPI.SurvivalFunction` | survival function |
| `LearnAPI.SurvivalDistribution` | probability distribution for survival time |
| `LearnAPI.SurvivalHazardFunction` | hazard function for survival time |
| `LearnAPI.OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) |
| `LearnAPI.Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) |

¹Provided for completeness but discouraged to avoid [ambiguities in
representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation).
+ +²The level will be controlled by a hyper-parameter; models providing only quantiles or +expectiles at 50% will provide `Point` instead. + """ abstract type IID <: KindOfProxy end -struct LiteralTarget <: IID end +struct Point <: IID end struct Sampleable <: IID end struct Distribution <: IID end struct LogDistribution <: IID end @@ -52,6 +84,12 @@ Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). If `kind_of_proxy` is an in kind_of_proxy, data)` represents a *single* probability distribution for the sample space ``Y^n``, where ``Y`` is the space from which the target variable takes its values. +| type `T` | form of output of `predict(model, ::T, data)` | +|:-------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `LearnAPI.JointSampleable` | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. | +| `LearnAPI.JointDistribution` | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` | +| `LearnAPI.JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` | + """ abstract type Joint <: KindOfProxy end struct JointSampleable <: Joint end @@ -68,6 +106,12 @@ samples, and we regard the samples as drawn from the "target" variable. If in th `kind_of_proxy` is an instance of `LearnAPI.Single` then, `predict(algorithm)` returns a single object representing a probability distribution. 
+| type `T`                         | form of output of `predict(model, ::T)`                                 |
+|:--------------------------------:|:-----------------------------------------------------------------------|
+| `LearnAPI.SingleSampleable`      | object that can be sampled to obtain a single target observation        |
+| `LearnAPI.SingleDistribution`    | explicit probability density/mass function for sampling the target      |
+| `LearnAPI.SingleLogDistribution` | explicit log-probability density/mass function for sampling the target  |
+
 """
 abstract type Single <: KindOfProxy end
 struct SingleSampleable <: Single end
diff --git a/test/integration/regression.jl b/test/integration/regression.jl
index 5b91561e..c61aa72e 100644
--- a/test/integration/regression.jl
+++ b/test/integration/regression.jl
@@ -87,12 +87,12 @@ LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A
 LearnAPI.obs(::RidgeFitted, X) = Tables.matrix(X)'
 
 # matrix input:
-LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, observations::AbstractMatrix) =
+LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) =
     observations'*model.coefficients
 
 # tabular input:
-LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) =
-    predict(model, LiteralTarget(), obs(model, Xnew))
+LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
+    predict(model, Point(), obs(model, Xnew))
 
 # accessor function:
 LearnAPI.feature_importances(model::RidgeFitted) = model.feature_importances
@@ -103,7 +103,7 @@ LearnAPI.minimize(model::RidgeFitted) =
 @trait(
     Ridge,
     constructor = Ridge,
-    kinds_of_proxy = (LiteralTarget(),),
+    kinds_of_proxy = (Point(),),
     tags = ("regression",)
     functions = (
        :(LearnAPI.fit),
@@ -155,7 +155,7 @@ data = (X, y)
        ),
    )
 
-    ŷ = predict(model, LiteralTarget(), Tables.subset(X, test))
+    ŷ = predict(model, Point(), Tables.subset(X, test))
     @test ŷ isa Vector{Float64}
     @test predict(model, Tables.subset(X, test)) == ŷ
 
     fitobs = LearnAPI.obs(algorithm, data)
     predictobs = LearnAPI.obs(model, X)
     model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0)
     @test LearnAPI.target(algorithm, fitobs) == y
-    @test predict(model, LiteralTarget(), MLUtils.getobs(predictobs, test)) ≈ ŷ
+    @test predict(model, Point(), MLUtils.getobs(predictobs, test)) ≈ ŷ
     @test predict(model, LearnAPI.features(algorithm, fitobs)) ≈ predict(model, X)
 
     @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}}
@@ -177,7 +177,7 @@
     @test LearnAPI.algorithm(recovered_model) == algorithm
     @test predict(
         recovered_model,
-        LiteralTarget(),
+        Point(),
         MLUtils.getobs(predictobs, test)
     ) ≈ ŷ
 
@@ -227,7 +227,7 @@ LearnAPI.target(::BabyRidge, data) = last(data)
 
 LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm
 
-LearnAPI.predict(model::BabyRidgeFitted, ::LiteralTarget, Xnew) =
+LearnAPI.predict(model::BabyRidgeFitted, ::Point, Xnew) =
     Tables.matrix(Xnew)*model.coefficients
 
 LearnAPI.minimize(model::BabyRidgeFitted) =
@@ -236,7 +236,7 @@ LearnAPI.minimize(model::BabyRidgeFitted) =
 @trait(
     BabyRidge,
     constructor = BabyRidge,
-    kinds_of_proxy = (LiteralTarget(),),
+    kinds_of_proxy = (Point(),),
     tags = ("regression",)
     functions = (
        :(LearnAPI.fit),
@@ -254,13 +254,13 @@
     algorithm = BabyRidge(lambda=0.5)
 
     model = fit(algorithm, Tables.subset(X, train), y[train]; verbosity=0)
-    ŷ = predict(model, LiteralTarget(), Tables.subset(X, test))
+    ŷ = predict(model, Point(), Tables.subset(X, test))
     @test ŷ isa Vector{Float64}
 
     fitobs = obs(algorithm, data)
     predictobs = LearnAPI.obs(model, X)
     model = fit(algorithm, MLUtils.getobs(fitobs,
train); verbosity=0) - @test predict(model, LiteralTarget(), MLUtils.getobs(predictobs, test)) == ŷ == + @test predict(model, Point(), MLUtils.getobs(predictobs, test)) == ŷ == predict(model, MLUtils.getobs(predictobs, test)) @test LearnAPI.target(algorithm, data) == y @test LearnAPI.predict(model, X) ≈ From 8fd02c96496b817e03fbc52a76ae8e46c9c82e14 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 13:39:59 +1300 Subject: [PATCH 054/187] fix typos --- src/fit_update.jl | 4 ++-- src/target_weights_features.jl | 6 +++--- src/traits.jl | 4 ++-- typos.toml | 6 ++++++ 4 files changed, 13 insertions(+), 7 deletions(-) create mode 100644 typos.toml diff --git a/src/fit_update.jl b/src/fit_update.jl index faefd610..44b427b2 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -96,7 +96,7 @@ Return an updated version of the `model` object returned by a previous [`fit`](@ specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`. When following the call `fit(algorithm, data)`, the `update` call is semantically -equivalent to retraining ab initio using a concatentation of `data` and `new_data`, +equivalent to retraining ab initio using a concatenation of `data` and `new_data`, *provided there are no hyperparameter replacements.* Behaviour is otherwise algorithm-specific. @@ -131,7 +131,7 @@ Return an updated version of the `model` object returned by a previous [`fit`](@ specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`. When following the call `fit(algorithm, data)`, the `update` call is semantically -equivalent to retraining ab initio using a concatentation of `data` and `new_data`, +equivalent to retraining ab initio using a concatenation of `data` and `new_data`, *provided there are no hyperparameter replacements.* Behaviour is otherwise algorithm-specific. diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl index 69fab433..7df72646 100644 --- a/src/target_weights_features.jl +++ b/src/target_weights_features.jl @@ -5,7 +5,7 @@ Return, for each form of `data` supported in a call of the form [`fit(algorithm, data)`](@ref), the target variable part of `data`. If `nothing` is returned, the `algorithm` does not see a target variable in training (is unsupervised). -Refer to LearnAPI.jl documenation for the precise meaning of "target". +Refer to LearnAPI.jl documentation for the precise meaning of "target". # New implementations @@ -22,7 +22,7 @@ target(::Any, data) = nothing Return, for each form of `data` supported in a call of the form `[`fit(algorithm, data)`](@ref), the per-observation weights part of `data`. Where `nothing` is returned, no -weights are part of `data`, which is to be interpretted as uniform weighting. +weights are part of `data`, which is to be interpreted as uniform weighting. # New implementations @@ -51,7 +51,7 @@ ŷ = predict(algorithm, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` The return value has the same number of observations as `data` does. For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(algorithm)`) `ŷ` above is generally -inteneded to be an approximate proxy for `LearnAPI.target(algorithm, data)`, the training +intended to be an approximate proxy for `LearnAPI.target(algorithm, data)`, the training target. 
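[Editor's note: the `update_observations` contract whose wording is corrected in the patch
above (retraining ab initio on a concatenation of `data` and `new_data`) can be
illustrated with a short sketch. This is not code from the package; `MyForest` is the
hypothetical iterative algorithm used in the `update` docstring, and the equivalence holds
only when no hyperparameter replacements are given:]

```julia
# Sketch of the `update_observations` contract; `MyForest` is hypothetical.
algorithm = MyForest(ntrees=100)
model = fit(algorithm, data)    # train on the initial data

# With no hyperparameter replacements, the following is semantically equivalent
# to calling `fit(algorithm, ...)` on the concatenation of `data` and `new_data`:
model = update_observations(model, new_data)
```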
diff --git a/src/traits.jl b/src/traits.jl index 97adf49c..93938007 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -87,7 +87,7 @@ function constructor end """ LearnAPI.functions(algorithm) -Return a tuple of expressions respresenting functions that can be meaningfully applied +Return a tuple of expressions representing functions that can be meaningfully applied with `algorithm`, or an associated model (object returned by `fit(algorithm, ...)`, as the first argument. Algorithm traits (methods for which `algorithm` is the *only* argument) are excluded. @@ -328,7 +328,7 @@ to return `"K-nearest neighbors regressor"`. Ideally, this is a "concrete" noun `"ridge regressor"` rather than an "abstract" noun like `"ridge regression"`. """ -human_name(algorithm) = snakecase(name(alogorithm), delim=' ') # `name` defined below +human_name(algorithm) = snakecase(name(algorithm), delim=' ') # `name` defined below """ LearnAPI.data_interface(algorithm) diff --git a/typos.toml b/typos.toml new file mode 100644 index 00000000..8f5d6f5a --- /dev/null +++ b/typos.toml @@ -0,0 +1,6 @@ +[default.extend-words] +# Don't correct "mape" to "map" +mape = "mape" +yhat = "yhat" +LSO ="LSO" +datas = "datas" From 60f8b6c0066097a615f45e98aff0c09ee85f00ad Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 13:44:10 +1300 Subject: [PATCH 055/187] fix syntax error in test --- test/integration/regression.jl | 4 ++-- test/integration/static_algorithms.jl | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/test/integration/regression.jl b/test/integration/regression.jl index c61aa72e..ba68cef1 100644 --- a/test/integration/regression.jl +++ b/test/integration/regression.jl @@ -104,7 +104,7 @@ LearnAPI.minimize(model::RidgeFitted) = Ridge, constructor = Ridge, kinds_of_proxy = (Point(),), - tags = ("regression",) + tags = ("regression",), functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), @@ -237,7 +237,7 @@ LearnAPI.minimize(model::BabyRidgeFitted) = BabyRidge, constructor = BabyRidge, kinds_of_proxy = (Point(),), - tags = ("regression",) + tags = ("regression",), functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl index a143416b..3812fbc6 100644 --- a/test/integration/static_algorithms.jl +++ b/test/integration/static_algorithms.jl @@ -39,7 +39,7 @@ end @trait( Selector, constructor = Selector, - tags = ("feature engineering",) + tags = ("feature engineering",), functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), @@ -105,7 +105,7 @@ end Selector2, constructor = Selector2, predict_or_transform_mutates = true, - tags = ("feature engineering",) + tags = ("feature engineering",), functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), From e25e4e738942e63bd8f823c9fc46d33ed8dbca1d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 13:48:20 +1300 Subject: [PATCH 056/187] add julia 1.10 testing to matrix Acked-by: Anthony D. Blaom --- .github/workflows/ci.yml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 37ef5474..ca263a9a 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -17,7 +17,8 @@ jobs: fail-fast: false matrix: version: - - '1.6' + - '1.6' # previous LTS release + - '1.10' # new LTS release - '1' # automatically expands to the latest stable 1.x release of Julia. 
os:
          - ubuntu-latest
@@ -65,4 +66,4 @@ jobs:
            using Documenter: DocMeta, doctest
            using LearnAPI
            DocMeta.setdocmeta!(LearnAPI, :DocTestSetup, :(using LearnAPI); recursive=true)
-            doctest(LearnAPI)'
\ No newline at end of file
+            doctest(LearnAPI)'

From aabdfa5c573f444d942cfef25524333bab4a8567 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Wed, 2 Oct 2024 14:11:28 +1300
Subject: [PATCH 057/187] typo

---
 docs/src/anatomy_of_an_implementation.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md
index 3c2a7d5f..4c13c742 100644
--- a/docs/src/anatomy_of_an_implementation.md
+++ b/docs/src/anatomy_of_an_implementation.md
@@ -20,7 +20,7 @@ A transformer ordinarily implements `transform` instead of
 
    By default, it is assumed that `data` supports the [`LearnAPI.RandomAccess`](@ref)
    interface; this includes all matrices, with observations-as-columns, most tables, and
-   tuples thereof). See [`LearnAPI.RandomAccess`](@ref) for details. If this is not the
+   tuples thereof. See [`LearnAPI.RandomAccess`](@ref) for details. If this is not the
    case then an implementation must either:
 
    If the `data` object consumed by `fit`, `predict`, or `transform` is not

From a3ece5c07f82397702fdf55c589cb8f0831cc0bb Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Wed, 2 Oct 2024 14:13:43 +1300
Subject: [PATCH 058/187] typo

---
 docs/src/anatomy_of_an_implementation.md | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md
index 4c13c742..08fd5cba 100644
--- a/docs/src/anatomy_of_an_implementation.md
+++ b/docs/src/anatomy_of_an_implementation.md
@@ -12,24 +12,19 @@ A transformer ordinarily implements `transform` instead of
 
    New implementations of `fit`, `predict`, etc, always have a *single* `data` argument, as
    in
-   `LearnAPI.fit(algorithm, data; verbosity=1) = ...`.
+   `LearnAPI.fit(algorithm, data; verbosity=1) = ...`.
 
   For convenience, user-calls, such as `fit(algorithm, X, y)`, automatically fallback
-   to `fit(algorithm, (X, y))`.
+   to `fit(algorithm, (X, y))`.
 
 !!! note
 
-   By default, it is assumed that `data` supports the [`LearnAPI.RandomAccess`](@ref)
-   interface; this includes all matrices, with observations-as-columns, most tables, and
-   tuples thereof. See [`LearnAPI.RandomAccess`](@ref) for details. If this is not the
-   case then an implementation must either:
-
   If the `data` object consumed by `fit`, `predict`, or `transform` is not
   a suitable table¹, array³, tuple of tables and arrays, or some other object
   implementing the MLUtils.jl `getobs`/`numobs` interface, then an implementation
   must: (i) overload [`obs`](@ref) to articulate how provided
   data can be transformed into a form that does support
-   it, as illustrated below under
+   this interface, as illustrated below under
   [Providing an advanced data interface](@ref); or (ii) overload the trait
   [`LearnAPI.data_interface`](@ref) to specify a more relaxed data
   API.

From b5cdd784999284ec6dcfad787f8f09fb5acc97db Mon Sep 17 00:00:00 2001
From: "Anthony D.
Blaom" Date: Wed, 2 Oct 2024 14:34:04 +1300 Subject: [PATCH 059/187] update readme --- README.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index af565313..052356d6 100644 --- a/README.md +++ b/README.md @@ -7,8 +7,7 @@ A base Julia interface for machine learning and statistics - [X] Detailed proposal stage ([this documentation](https://juliaai.github.io/LearnAPI.jl/dev/)). -- [ ] Initial feedback stage (opened mid-January, 2023). General feedback can be provided at [this Julia Discourse thread](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048/20). -- [ ] Implement feedback and finish "To do" list (below) +- [X] Initial feedback stage (opened mid-January, 2023). General feedback can be provided at [this Julia Discourse thread](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048/20). - [ ] Proof of concept implementation - [ ] Polish - [ ] **Register 0.2.0** @@ -17,8 +16,8 @@ You can join a discussion on the LearnAPI proposal at [this](https://discourse.j To do: -- [ ] Add methods to create/save persistent representation of learned parameters -- [ ] Add more repo tests +- [ ] ~~Add methods to create/save persistent representation of learned parameters~~ +- [X] Add more repo tests - [ ] Add methods to test an implementation - [ ] Add user guide ("Common Implementation Patterns" section of manual) From ec3223ce1e7145a511447729396f090916805963 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 15:04:15 +1300 Subject: [PATCH 060/187] typos --- src/predict_transform.jl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/predict_transform.jl b/src/predict_transform.jl index a87cf07b..6f894d6f 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -97,7 +97,7 @@ Implementation is optional. Only the first signature is implemented, but each `kind_of_proxy` that gets an implementation must be added to the list returned by [`LearnAPI.kinds_of_proxy`](@ref). -$(DOC_IMPLEMENTED_METHODS(":predict")) +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.predict)")) $(DOC_MINIMIZE(:predict)) @@ -160,7 +160,7 @@ See also [`fit`](@ref), [`predict`](@ref), # New implementations Implementation for new LearnAPI.jl algorithms is optional. A fallback provides the -slurping version. $(DOC_IMPLEMENTED_METHODS(":transform")) +slurping version. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.transform)")) $(DOC_MINIMIZE(:transform)) @@ -196,7 +196,7 @@ See also [`fit`](@ref), [`transform`](@ref), [`predict`](@ref). # New implementations -Implementation is optional. $(DOC_IMPLEMENTED_METHODS(":inverse_transform")) +Implementation is optional. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.inverse_transform)")) $(DOC_MINIMIZE(:inverse_transform)) From 6adca0bc868acf667cd63f6d116dad510f599b2d Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Wed, 2 Oct 2024 15:07:41 +1300 Subject: [PATCH 061/187] typo --- docs/src/kinds_of_target_proxy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/kinds_of_target_proxy.md b/docs/src/kinds_of_target_proxy.md index da150f96..ff9d3f4b 100644 --- a/docs/src/kinds_of_target_proxy.md +++ b/docs/src/kinds_of_target_proxy.md @@ -14,7 +14,7 @@ LearnAPI.KindOfProxy LearnAPI.IID ``` -## Proxies for density estimation lgorithms +## Proxies for density estimation algorithms ```@docs LearnAPI.Single From dd0799ea1fed5d6b350baf4893e2336a736aa9e3 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 17:04:13 +1300 Subject: [PATCH 062/187] doc tweak --- docs/src/anatomy_of_an_implementation.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 08fd5cba..e4d5922f 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -25,7 +25,8 @@ A transformer ordinarily implements `transform` instead of then an implementation must: (i) overload [`obs`](@ref) to articulate how provided data can be transformed into a form that does support this interface, as illustrated below under - [Providing an advanced data interface](@ref); or (ii) overload the trait + [Providing an advanced data interface](@ref), and which may additionally + enable certain performance benefits; or (ii) overload the trait [`LearnAPI.data_interface`](@ref) to specify a more relaxed data API. From a0f8934976038293f5ab2dbb9d82f7c9304a228f Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 17:12:04 +1300 Subject: [PATCH 063/187] clarify alternative data patterns in "anatomy of an interface" --- docs/src/anatomy_of_an_implementation.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index e4d5922f..dcb26e7a 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -76,7 +76,7 @@ the docstring to the *constructor*, not the struct. ## Implementing `fit` A ridge regressor requires two types of data for training: input features `X`, which here -we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a vector. +we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a vector.⁴ It is convenient to define a new type for the `fit` output, which will include coefficients labelled by feature name for inspection after training: @@ -501,3 +501,9 @@ like the native ones, they must be included in the [`LearnAPI.functions`](@ref) declaration. ³ The last index must be the observation index. + +⁴ The `data = (X, y)` pattern implemented here is not only supported pattern. For, +example, `data` might be a single table containing both features and target variable. In +this case, it will be necessary to overload [`LearnAPI.features`](@ref) in addition to +[`LearnAPI.target`](@ref); the name of the target column would need to be a hyperparameter +or `fit` keyword argument. From 56bea3711801f9cbbdbc31840d823271301984f9 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Wed, 2 Oct 2024 17:14:40 +1300 Subject: [PATCH 064/187] doc formatting --- src/target_weights_features.jl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl index 7df72646..98bd3888 100644 --- a/src/target_weights_features.jl +++ b/src/target_weights_features.jl @@ -20,7 +20,7 @@ target(::Any, data) = nothing """ LearnAPI.weights(algorithm, data) -> weights -Return, for each form of `data` supported in a call of the form `[`fit(algorithm, +Return, for each form of `data` supported in a call of the form [`fit(algorithm, data)`](@ref), the per-observation weights part of `data`. Where `nothing` is returned, no weights are part of `data`, which is to be interpreted as uniform weighting. @@ -36,7 +36,7 @@ weights(::Any, data) = nothing """ LearnAPI.features(algorithm, data) -Return, for each form of `data` supported in a call of the form `[`fit(algorithm, +Return, for each form of `data` supported in a call of the form [`fit(algorithm, data)`](@ref), the "features" part of `data` (as opposed to the target variable, for example). From 43a1ef5d2404ac46d6fa05511e51a17f9adde997 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 17:17:39 +1300 Subject: [PATCH 065/187] typo --- docs/src/anatomy_of_an_implementation.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index dcb26e7a..b8c0b006 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -198,8 +198,8 @@ predictions. ## Algorithm traits Algorithm [traits](@ref traits) record extra generic information about an algorithm, or -make specific promises of behavior. They usually have an algorithm as the single argument, -and so we regard [`LearnAPI.constructor`](@ref) defined above as a trait. +make specific promises of behavior. They are methods that have an algorithm as the solve +argument, and so we regard [`LearnAPI.constructor`](@ref) defined above as a trait. Because we have implemented `predict`, we are required to overload the [`LearnAPI.kinds_of_proxy`](@ref) trait. Because we can only make point predictions of the From bce74cfebc3d34a8ec10a066e2c5e1b8d288577f Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 2 Oct 2024 17:26:18 +1300 Subject: [PATCH 066/187] doc tweaks --- docs/src/anatomy_of_an_implementation.md | 8 +++----- docs/src/obs.md | 2 +- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index b8c0b006..96235eca 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -461,11 +461,9 @@ argument, overloading `obs` is optional. This is provided data in publicized `fit`/`predict` signatures consists only of objects implement the [`LearnAPI.RandomAccess`](@ref) interface (most tables¹, arrays³, and tuples thereof). -To buy out of supporting the MLUtils.jl interface altogether, an implementation must -overload the trait, [`LearnAPI.data_interface(algorithm)`](@ref). - -For more on data interfaces, see [`obs`](@ref) and -[`LearnAPI.data_interface(algorithm)`](@ref). +To opt out of supporting the MLUtils.jl interface altogether, an implementation must +overload the trait, [`LearnAPI.data_interface(algorithm)`](@ref). See [Data +interfaces](@ref data_interfaces) for details. 
## Demonstration of an advanced `obs` workflow
diff --git a/docs/src/obs.md b/docs/src/obs.md
index cf794d87..0bbb9f24 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -91,7 +91,7 @@ A sample implementation is given in [Providing an advanced data interface](@ref)
 obs
 ```
 
-### Data interfaces
+### [Data interfaces](@id data_interfaces)
 
 New implementations must overload [`LearnAPI.data_interface(algorithm)`](@ref) if the
 output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess`](@ref). (Arrays, most
From 392971485ee6cb7ae14a5c43594a0963967057ab Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Thu, 3 Oct 2024 08:45:17 +1300
Subject: [PATCH 067/187] tweak

---
 docs/src/traits.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/src/traits.md b/docs/src/traits.md
index 25edaa1c..b2a214da 100644
--- a/docs/src/traits.md
+++ b/docs/src/traits.md
@@ -10,8 +10,8 @@ training). They may also record more mundane information, such as a package lice
 
 ### [Overloadable traits](@id traits_list)
 
-In the examples column of the table below, `Table`, `Continuous`, `Sampleable` are names owned by the
-package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/).
+In the examples column of the table below, `Continuous` is a name owned by the package
+[ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/).
 
 | trait | return value | fallback value | example |
 |:------|:-------------|:---------------|:--------|
From d743d7d5340835836fca3b93ea8f8ed4d0c2b0b7 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Thu, 3 Oct 2024 08:49:24 +1300
Subject: [PATCH 068/187] typo

---
 docs/src/anatomy_of_an_implementation.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md
index 96235eca..1ee784ed 100644
--- a/docs/src/anatomy_of_an_implementation.md
+++ b/docs/src/anatomy_of_an_implementation.md
@@ -500,7 +500,7 @@ declaration.
 
 ³ The last index must be the observation index.
 
-⁴ The `data = (X, y)` pattern implemented here is not only supported pattern. For,
+⁴ The `data = (X, y)` pattern implemented here is not the only supported pattern. For,
 example, `data` might be a single table containing both features and target variable. In
 this case, it will be necessary to overload [`LearnAPI.features`](@ref) in addition to
 [`LearnAPI.target`](@ref); the name of the target column would need to be a hyperparameter
 or `fit` keyword argument.
From d59fb984689270b5412bda128701caaae3a4fe9c Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Thu, 3 Oct 2024 08:55:50 +1300
Subject: [PATCH 069/187] doc tweak

---
 src/types.jl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/types.jl b/src/types.jl
index e046384d..25f98d81 100644
--- a/src/types.jl
+++ b/src/types.jl
@@ -181,7 +181,7 @@ All arrays implement `RandomAccess`, with the last index being the observation i
 
 A Tables.jl compatible table `data` implements `RandomAccess` if `Tables.istable(data)`
 is true and if `data` implements `DataAPI.nrows`. This includes many tables, and in
-particular, `DataFrame`s. Tables that are also tuples are excluded.
+particular, `DataFrame`s. Tables that are also tuples are explicitly excluded.
Any tuple of objects implementing `RandomAccess` also implements `RandomAccess`. From 84ef5fcf7eaaa1a515250a498504eb55b333b07d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 3 Oct 2024 16:29:03 +1300 Subject: [PATCH 070/187] update readme --- LICENSE | 2 +- README.md | 54 +++++++++++++++------- ROADMAP.md | 52 +++++++++++++++++++++ docs/src/anatomy_of_an_implementation.md | 12 ++--- docs/src/common_implementation_patterns.md | 7 +++ docs/src/patterns/feature_engineering.md | 5 ++ docs/src/patterns/meta_algorithms.md | 1 + src/traits.jl | 11 +++-- 8 files changed, 117 insertions(+), 27 deletions(-) create mode 100644 ROADMAP.md create mode 100644 docs/src/patterns/feature_engineering.md create mode 100644 docs/src/patterns/meta_algorithms.md diff --git a/LICENSE b/LICENSE index 7609ebe2..4690371a 100644 --- a/LICENSE +++ b/LICENSE @@ -1,6 +1,6 @@ MIT License -MIT License Copyright (c) 2021 - JuliaAI +MIT License Copyright (c) 2024 - Anthony Blaom Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal diff --git a/README.md b/README.md index 052356d6..3ac84bee 100644 --- a/README.md +++ b/README.md @@ -2,26 +2,48 @@ A base Julia interface for machine learning and statistics +[![Lifecycle:Maturing](https://img.shields.io/badge/Lifecycle-Maturing-007EC6)](ROADMAP.md) +[![Build Status](https://github.com/JuliaAI/LearnAPI.jl/workflows/CI/badge.svg)](https://github.com/JuliaAI/LearnAPI.jl/actions) +[![Coverage](https://codecov.io/gh/JuliaAI/LearnAPI.jl/branch/master/graph/badge.svg)](https://codecov.io/github/JuliaAI/LearnAPI.jl?branch=master) +[![Docs](https://img.shields.io/badge/docs-dev-blue.svg)](https://juliaai.github.io/LearnAPI.jl/dev/) -**Devlopement Status:** +Comprehensive documentation is [here](https://juliaai.github.io/LearnAPI.jl/dev/). -- [X] Detailed proposal stage ([this - documentation](https://juliaai.github.io/LearnAPI.jl/dev/)). -- [X] Initial feedback stage (opened mid-January, 2023). General feedback can be provided at [this Julia Discourse thread](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048/20). -- [ ] Proof of concept implementation -- [ ] Polish -- [ ] **Register 0.2.0** +New contributions welcome. See the [road map](ROADMAP.md). -You can join a discussion on the LearnAPI proposal at [this](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048) Julia Discourse thread. 
+
+## Code snippet
+
+Configure a learning algorithm, and inspect available functionality:
 
-To do:
+```julia
+julia> algorithm = Ridge(lambda=0.1)
+julia> LearnAPI.functions(algorithm)
+(:(LearnAPI.fit), :(LearnAPI.algorithm), :(LearnAPI.minimize), :(LearnAPI.obs),
+:(LearnAPI.features), :(LearnAPI.target), :(LearnAPI.predict), :(LearnAPI.coefficients))
+```
 
-- [ ] ~~Add methods to create/save persistent representation of learned parameters~~
-- [X] Add more repo tests
-- [ ] Add methods to test an implementation
-- [ ] Add user guide ("Common Implementation Patterns" section of manual)
+Train:
+
+```julia
+julia> model = fit(algorithm, data)
+```
+
+Predict:
+
+```julia
+julia> predict(model, data)[1]
+"setosa"
+```
+
+Predict a probability distribution ([proxy](https://juliaai.github.io/LearnAPI.jl/dev/kinds_of_target_proxy/#proxy_types) for the target):
 
-[![Build Status](https://github.com/JuliaAI/LearnAPI.jl/workflows/CI/badge.svg)](https://github.com/JuliaAI/LearnAPI.jl/actions)
-[![Coverage](https://codecov.io/gh/JuliaAI/LearnAPI.jl/branch/master/graph/badge.svg)](https://codecov.io/github/JuliaAI/LearnAPI.jl?branch=master)
-[![Docs](https://img.shields.io/badge/docs-dev-blue.svg)](https://juliaai.github.io/LearnAPI.jl/dev/)
+```julia
+julia> predict(model, Distribution(), data)[1]
+UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.25, virginica=>0.75)
+```
+
+## Credits
+
+Created by Anthony Blaom, in cooperation with [members of the Julia
+community](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048).
diff --git a/ROADMAP.md b/ROADMAP.md
new file mode 100644
index 00000000..3d0f16d2
--- /dev/null
+++ b/ROADMAP.md
@@ -0,0 +1,52 @@
+# Road map
+
+- [ ] Mock up a challenging `update` use-case: controlling an iterative algorithm that
+    wants, for efficiency, to internally compute the out-of-sample predictions that will
+    be used for *externally* determined early stopping cc: @jeremiedb
+
+- [ ] Get code coverage to 100% (see next item)
+
+- [ ] Add to this repo or a utility repo methods to test a valid implementation of
+    LearnAPI.jl
+
+- [ ] Flush out "Common Implementation Patterns". The current plan is to mock up example
+    implementations, and add them as LearnAPI.jl tests, with links to the test file from
+    "Common Implementation Patterns". As real-world implementations roll out, we could
+    increasingly point to those instead, to conserve effort
+    - [x] regression
+    - [ ] classification
+    - [ ] clustering
+    - [ ] gradient descent
+    - [ ] iterative algorithms
+    - [ ] incremental algorithms
+    - [ ] dimension reduction
+    - [x] feature engineering
+    - [x] static algorithms
+    - [ ] missing value imputation
+    - [ ] transformers
+    - [ ] ensemble algorithms
+    - [ ] time series forecasting
+    - [ ] time series classification
+    - [ ] survival analysis
+    - [ ] density estimation
+    - [ ] Bayesian algorithms
+    - [ ] outlier detection
+    - [ ] collaborative filtering
+    - [ ] text analysis
+    - [ ] audio analysis
+    - [ ] natural language processing
+    - [ ] image processing
+    - [ ] meta-algorithms
+
+- [ ] In a utility package provide:
+    - [ ] Method to clone an algorithm with user-specified property(hyperparameter)
	changes, as in `LearnAPI.clone(algorithm, p1=value1, p22=value2, ...)` (since
	`algorithm` can have any type, can't really overload `Base.replace` without
	piracy). This will be needed in tuning meta-algorithms. Or should this be in
	LearnAPI.jl proper, to expose it to all users?
- [ ] Methods to facilitate common-use case data interfaces: support simultaneously
	`fit` data of the form `data = (X, y)` where `X` is table *or* matrix, and
	`data` a table with target specified by hyperparameter; here `obs` will return a
	thin wrapping of the matrix of `X`, the target `y`, and the names of all
	fields. We can have options to make `X` a concrete array or an adjoint,
	depending on what is more efficient for the algorithm.
diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md
index 1ee784ed..3296f59b 100644
--- a/docs/src/anatomy_of_an_implementation.md
+++ b/docs/src/anatomy_of_an_implementation.md
@@ -24,11 +24,11 @@ A transformer ordinarily implements `transform` instead of
   the MLUtils.jl `getobs`/`numobs` interface, then an implementation
   must: (i) overload [`obs`](@ref) to articulate how provided
   data can be transformed into a form that does support
-   this interface, as illustrated below under
-   [Providing an advanced data interface](@ref), and which may additionally
-   enable certain performance benefits; or (ii) overload the trait
+   this interface, as illustrated below under
+   [Providing an advanced data interface](@ref), and which may additionally
+   enable certain performance benefits; or (ii) overload the trait
   [`LearnAPI.data_interface`](@ref) to specify a more relaxed data
-   API.
+   API.
 
 The first line below imports the lightweight package LearnAPI.jl whose methods we will be
 extending. The second imports libraries needed for the core algorithm.
@@ -503,5 +503,5 @@ declaration.
 ⁴ The `data = (X, y)` pattern implemented here is not the only supported pattern. For,
 example, `data` might be a single table containing both features and target variable. In
 this case, it will be necessary to overload [`LearnAPI.features`](@ref) in addition to
-[`LearnAPI.target`](@ref); the name of the target column would need to be a hyperparameter
-or `fit` keyword argument.
+[`LearnAPI.target`](@ref); the name of the target column would need to be a
+hyperparameter.
diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md
index 91e5f925..7a4c0406 100644
--- a/docs/src/common_implementation_patterns.md
+++ b/docs/src/common_implementation_patterns.md
@@ -34,8 +34,13 @@ implementations fall into one (or more) of the following informally understood p
 - [Dimension Reduction](@ref): Transformers that learn to reduce feature space dimension
 
+- [Feature Engineering](@ref)
+
 - [Missing Value Imputation](@ref): Transformers that replace missing values.
 
+- [Transformers](@ref): Other transformers, such as standardizers, and categorical
+  encoders.
+
 - [Clustering](@ref): Algorithms that group data into clusters for classification and
   possibly dimension reduction. May be true learners (generalize to new data) or static.
 
@@ -53,3 +58,5 @@ implementations fall into one (or more) of the following informally understood p
 
 - [Survival Analysis](@ref)
 
+- [Meta-algorithms](@ref)
+
diff --git a/docs/src/patterns/feature_engineering.md b/docs/src/patterns/feature_engineering.md
new file mode 100644
index 00000000..614f94a6
--- /dev/null
+++ b/docs/src/patterns/feature_engineering.md
@@ -0,0 +1,5 @@
+# Feature Engineering
+
+- For a simple feature selection algorithm (no "learning") see [these
+examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/static_algorithms.jl)
+from tests.
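[Editor's note: for orientation, the "static" feature-selection pattern referenced in the
new page above might be reduced to something like the following sketch. It is not copied
from the test file; the `Selector` shown here is a simplified stand-in:]

```julia
using LearnAPI, Tables

# A static (non-learning) transformer: there is nothing to learn from data.
struct Selector
    names::Vector{Symbol}
end
Selector(; names=Symbol[]) = Selector(names)
LearnAPI.constructor(::Selector) = Selector

# `fit` consumes no data; the "model" is just the algorithm itself:
LearnAPI.fit(algorithm::Selector; verbosity=1) = algorithm

# Selection happens at `transform` time, by keeping only the named columns:
LearnAPI.transform(model::Selector, X) =
    NamedTuple{Tuple(model.names)}(Tables.columntable(X))
```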
diff --git a/docs/src/patterns/meta_algorithms.md b/docs/src/patterns/meta_algorithms.md new file mode 100644 index 00000000..1de7712f --- /dev/null +++ b/docs/src/patterns/meta_algorithms.md @@ -0,0 +1 @@ +# Meta-algorithms diff --git a/src/traits.jl b/src/traits.jl index 93938007..f3335feb 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -148,8 +148,9 @@ See also [`LearnAPI.predict`](@ref), [`LearnAPI.KindOfProxy`](@ref). Must be overloaded whenever `predict` is implemented. -Elements of the returned tuple must be one of the following, described further in -LearnAPI.jl documentation: $CONCRETE_TARGET_PROXY_TYPES_LIST. +Elements of the returned tuple must be instances of types in the return value of +`LearnAPI.kinds_of_proxy()`, i.e., one of the following, described further in LearnAPI.jl +documentation: $CONCRETE_TARGET_PROXY_TYPES_LIST. Suppose, for example, we have the following implementation of a supervised learner returning only probabilistic predictions: @@ -170,6 +171,8 @@ For more on target variables and target proxies, refer to the LearnAPI documenta """ kinds_of_proxy(::Any) = () +kinds_of_proxy() = CONCRETE_TARGET_PROXY_TYPES + tags() = [ "regression", @@ -179,12 +182,11 @@ tags() = [ "iterative algorithms", "incremental algorithms", "dimension reduction", - "encoders", + "transformers", "feature engineering", "static algorithms", "missing value imputation", "ensemble algorithms", - "wrappers", "time series forecasting", "time series classification", "survival analysis", @@ -196,6 +198,7 @@ tags() = [ "audio analysis", "natural language processing", "image processing", + "meta-algorithms" ] const DOC_TAGS_LIST = join(map(d -> "`\"$d\"`", tags()), ", ") From 15340c15dfa822981c16c55b5d5b543f53ff845a Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 3 Oct 2024 16:31:25 +1300 Subject: [PATCH 071/187] readme tweak --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 3ac84bee..57fb3774 100644 --- a/README.md +++ b/README.md @@ -32,7 +32,7 @@ Predict: ```julia julia> predict(model, data)[1] -"setosa" +"virginica" ``` Predict a probability distribution ([proxy](https://juliaai.github.io/LearnAPI.jl/dev/kinds_of_target_proxy/#proxy_types) for the target): From 4d53e3c8818a1fa4bd083c2f900d0fcd7524aec8 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 3 Oct 2024 16:33:20 +1300 Subject: [PATCH 072/187] typo --- docs/src/anatomy_of_an_implementation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 3296f59b..6d294b06 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -198,7 +198,7 @@ predictions. ## Algorithm traits Algorithm [traits](@ref traits) record extra generic information about an algorithm, or -make specific promises of behavior. They are methods that have an algorithm as the solve +make specific promises of behavior. They are methods that have an algorithm as the sole argument, and so we regard [`LearnAPI.constructor`](@ref) defined above as a trait. Because we have implemented `predict`, we are required to overload the From 8e47d30afad5a10a6509643f13ecd6e5649d2938 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Fri, 4 Oct 2024 17:59:03 +1300 Subject: [PATCH 073/187] add Cameron to credits --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 57fb3774..26e58da2 100644 --- a/README.md +++ b/README.md @@ -44,6 +44,7 @@ UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.25, virginica=>0.75) ## Credits -Created by Anthony Blaom, in cooperation with [members of the Julia +Created by Anthony Blaom, in cooperation with Cameron Bieganek and other [members of the +Julia community](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048). From 29ccc3be22cdd00ed2fd8ec0bbdd94848e152919 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 4 Oct 2024 21:21:01 +1300 Subject: [PATCH 074/187] fix formatting in roadmap --- ROADMAP.md | 76 +++++++++++++++++++++++++++--------------------------- 1 file changed, 38 insertions(+), 38 deletions(-) diff --git a/ROADMAP.md b/ROADMAP.md index 3d0f16d2..e1449f7c 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -10,43 +10,43 @@ LearnAPI.jl - [ ] Flush out "Common Implementation Patterns". The current plan is to mock up example - implementations, and add them as LearnAPI.jl tests, with links to the test file from - "Common Implementation Patterns". As real-world implementations roll out, we could - increasingly point to those instead, to conserve effort - - [x] regression - - [ ] classification - - [ ] clustering - - [ ] gradient descent - - [ ] iterative algorithms - - [ ] incremental algorithms - - [ ] dimension reduction - - [x] feature engineering - - [x] static algorithms - - [ ] missing value imputation - - [ ] transformers - - [ ] ensemble algorithms - - [ ] time series forecasting - - [ ] time series classification - - [ ] survival analysis - - [ ] density estimation - - [ ] Bayesian algorithms - - [ ] outlier detection - - [ ] collaborative filtering - - [ ] text analysis - - [ ] audio analysis - - [ ] natural language processing - - [ ] image processing - - [ ] meta-algorithms + implementations, and add them as LearnAPI.jl tests, with links to the test file from + "Common Implementation Patterns". As real-world implementations roll out, we could + increasingly point to those instead, to conserve effort + - [x] regression + - [ ] classification + - [ ] clustering + - [ ] gradient descent + - [ ] iterative algorithms + - [ ] incremental algorithms + - [ ] dimension reduction + - [x] feature engineering + - [x] static algorithms + - [ ] missing value imputation + - [ ] transformers + - [ ] ensemble algorithms + - [ ] time series forecasting + - [ ] time series classification + - [ ] survival analysis + - [ ] density estimation + - [ ] Bayesian algorithms + - [ ] outlier detection + - [ ] collaborative filtering + - [ ] text analysis + - [ ] audio analysis + - [ ] natural language processing + - [ ] image processing + - [ ] meta-algorithms - [ ] In a utility package provide: - - [ ] Method to clone an algorithm with user-specified property(hyperparameter) - changes, as in `LearnAPI.clone(algorithm, p1=value1, p22=value2, ...)` (since - `algorithm` can have any type, can't really overload `Base.replace` without - piracy). This will be needed in tuning meta-algorithms. Or should this be in - LearnAPI.jl proper, to expose it to all users? 
- - [ ] Methods to facilitate common-use case data interfaces: support simultaneously - `fit` data of the form `data = (X, y)` where `X` is table *or* matrix, and - `data` a table with target specified by hyperparameter; here `obs` will return a - thin wrapping of the matrix of `X`, the target `y`, and the names of all - fields. We can have options to make `X` a concrete array or an adjoint, - depending on what is more efficient for the algorithm. + - [ ] Method to clone an algorithm with user-specified property (hyperparameter) + replacement in `LearnAPI.clone(algorithm, p1=value1, p22=value2, ...)` (since + `algorithm` can have any type, can't really overload `Base.replace` without + piracy). This will be needed in tuning meta-algorithms. Or should this be in + LearnAPI.jl proper, to expose it to all users? + - [ ] Methods to facilitate common-use case data interfaces: support simultaneously + `fit` data of the form `data = (X, y)` where `X` is table *or* matrix, and `data` a + table with target specified by hyperparameter; here `obs` will return a thin wrapping + of the matrix of `X`, the target `y`, and the names of all fields. We can have + options to make `X` a concrete array or an adjoint, depending on what is more + efficient for the algorithm. From 511ce6cb9de2396815cdb2a27f5b82f80151846d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 6 Oct 2024 22:14:11 +1300 Subject: [PATCH 075/187] add LearnAPI.clone; tweak contract for update --- Project.toml | 4 +- ROADMAP.md | 5 -- docs/src/reference.md | 14 +++++ docs/src/traits.md | 1 - src/LearnAPI.jl | 1 + src/clone.jl | 23 +++++++++ src/fit_update.jl | 21 +++++--- src/traits.jl | 48 ++++++++++------- test/clone.jl | 22 ++++++++ test/runtests.jl | 116 ++++++------------------------------------ 10 files changed, 124 insertions(+), 131 deletions(-) create mode 100644 src/clone.jl create mode 100644 test/clone.jl diff --git a/Project.toml b/Project.toml index ee543d1a..08f56ba1 100644 --- a/Project.toml +++ b/Project.toml @@ -13,9 +13,11 @@ julia = "1.6" DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" +Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b" +Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [targets] -test = ["DataFrames", "LinearAlgebra", "MLUtils", "Serialization", "Tables", "Test"] +test = ["DataFrames", "LinearAlgebra", "MLUtils", "Random", "Serialization", "Statistics", "Tables", "Test"] diff --git a/ROADMAP.md b/ROADMAP.md index e1449f7c..0e0ca206 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -39,11 +39,6 @@ - [ ] meta-algorithms - [ ] In a utility package provide: - - [ ] Method to clone an algorithm with user-specified property (hyperparameter) - replacement in `LearnAPI.clone(algorithm, p1=value1, p22=value2, ...)` (since - `algorithm` can have any type, can't really overload `Base.replace` without - piracy). This will be needed in tuning meta-algorithms. Or should this be in - LearnAPI.jl proper, to expose it to all users? 
- [ ] Methods to facilitate common-use case data interfaces: support simultaneously `fit` data of the form `data = (X, y)` where `X` is table *or* matrix, and `data` a table with target specified by hyperparameter; here `obs` will return a thin wrapping diff --git a/docs/src/reference.md b/docs/src/reference.md index 698d0943..bcd5a922 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -91,9 +91,15 @@ named_properties = NamedTuple{properties}(getproperty.(Ref(algorithm), propertie @assert algorithm == LearnAPI.constructor(algorithm)(; named_properties...) ``` +which can be tested with `@assert `[`LearnAPI.clone(algorithm)`](@ref)` == algorithm`. + Note that if if `algorithm` is an instance of a *mutable* struct, this requirement generally requires overloading `Base.==` for the struct. +No LearnAPI.jl method is permitted to mutate an algorithm. In particular, one should make +deep copies of RNG hyperparameters before using them in a new implementation of +[`fit`](@ref). + #### Composite algorithms (wrappers) A *composite algorithm* is one with at least one property that can take other algorithms @@ -179,6 +185,14 @@ Most algorithms will also implement [`predict`](@ref) and/or [`transform`](@ref) record general information about the algorithm. Only [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref) are universally compulsory. + +## Utilities + +```@docs +LearnAPI.clone +LearnAPI.@trait +``` + --- ¹ We acknowledge users may not like this terminology, and may know "algorithm" by some diff --git a/docs/src/traits.md b/docs/src/traits.md index b2a214da..58759137 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -105,5 +105,4 @@ LearnAPI.iteration_parameter LearnAPI.fit_observation_scitype LearnAPI.target_observation_scitype LearnAPI.predict_or_transform_mutates -LearnAPI.@trait ``` diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index ffab0130..8a82874e 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -11,6 +11,7 @@ include("target_weights_features.jl") include("obs.jl") include("accessor_functions.jl") include("traits.jl") +include("clone.jl") export @trait export fit, update, update_observations, update_features diff --git a/src/clone.jl b/src/clone.jl new file mode 100644 index 00000000..2b6eee13 --- /dev/null +++ b/src/clone.jl @@ -0,0 +1,23 @@ +""" + LearnAPI.clone(algorithm; replacements...) + +Return a shallow copy of `algorithm` with the specified hyperparameter replacements. + +```julia +clone(algorithm; epochs=100, learning_rate=0.01) +``` + +It is guaranted that `LearnAPI.clone(algorithm) == algorithm`. + +""" +function clone(algorithm; replacements...) + reps = NamedTuple(replacements) + names = propertynames(algorithm) + rep_names = keys(reps) + + new_values = map(names) do name + name in rep_names && return getproperty(reps, name) + getproperty(algorithm, name) + end + return LearnAPI.constructor(algorithm)(NamedTuple{names}(new_values)...) +end diff --git a/src/fit_update.jl b/src/fit_update.jl index 44b427b2..eb2c9cb8 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -59,14 +59,15 @@ Return an updated version of the `model` object returned by a previous [`fit`](@ `update` call, but with the specified hyperparameter replacements, in the form `p1=value1, p2=value2, ...`. 
-Provided that `data` is identical with the data presented in a preceding `fit` call, as in -the example below, execution is semantically equivalent to the call `fit(algorithm, -data)`, where `algorithm` is `LearnAPI.algorithm(model)` with the specified -replacements. In some cases (typically, when changing an iteration parameter) there may be -a performance benefit to using `update` instead of retraining ab initio. +Provided that `data` is identical with the data presented in a preceding `fit` call *and* +there is at most one hyperparameter replacement, as in the example below, execution is +semantically equivalent to the call `fit(algorithm, data)`, where `algorithm` is +`LearnAPI.algorithm(model)` with the specified replacements. In some cases (typically, +when changing an iteration parameter) there may be a performance benefit to using `update` +instead of retraining ab initio. -If `data` differs from that in the preceding `fit` or `update` call, then behaviour is -algorithm-specific. +If `data` differs from that in the preceding `fit` or `update` call, or there is more than +one hyperparameter replacement, then behaviour is algorithm-specific. ```julia algorithm = MyForest(ntrees=100) @@ -85,6 +86,8 @@ See also [`fit`](@ref), [`update_observations`](@ref), [`update_features`](@ref) Implementation is optional. The signature must include `verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update)")) +See also [`LearnAPI.clone`](@ref) + """ update(model, data1, datas...; kwargs...) = update(model, (data1, datas...); kwargs...) @@ -119,6 +122,8 @@ See also [`fit`](@ref), [`update`](@ref), [`update_features`](@ref). Implementation is optional. The signature must include `verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update_observations)")) +See also [`LearnAPI.clone`](@ref). + """ update_observations(algorithm, data1, datas...; kwargs...) = update_observations(algorithm, (data1, datas...); kwargs...) @@ -144,6 +149,8 @@ See also [`fit`](@ref), [`update`](@ref), [`update_features`](@ref). Implementation is optional. The signature must include `verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update_features)")) +See also [`LearnAPI.clone`](@ref). + """ update_features(algorithm, data1, datas...; kwargs...) = update_features(algorithm, (data1, datas...); kwargs...) diff --git a/src/traits.jl b/src/traits.jl index f3335feb..bed7cab5 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -105,29 +105,43 @@ value is non-empty. All new implementations must overload this trait. Here's a checklist for elements in the return value: -| symbol | implementation/overloading compulsory? | include in returned tuple? | -|-----------------------------------|----------------------------------------|------------------------------------| -| `:(LearnAPI.fit)` | yes | yes | -| `:(LearnAPI.algorithm)` | yes | yes | -| `:(LearnAPI.minimize)` | no | yes | -| `:(LearnAPI.obs)` | no | yes | -| `:(LearnAPI.features)` | no | yes, unless `fit` consumes no data | -| `:(LearnAPI.update)` | no | only if implemented | -| `:(LearnAPI.update_observations)` | no | only if implemented | -| `:(LearnAPI.update_features)` | no | only if implemented | -| `:(LearnAPI.target)` | no | only if implemented | -| `:(LearnAPI.weights)` | no | only if implemented | -| `:(LearnAPI.predict)` | no | only if implemented | -| `:(LearnAPI.transform)` | no | only if implemented | -| `:(LearnAPI.inverse_transform)` | no | only if implemented | -| | no | only if implemented | +| expression | implementation compulsory? | include in returned tuple? 
| +|-----------------------------------|----------------------------|------------------------------------| +| `:(LearnAPI.fit)` | yes | yes | +| `:(LearnAPI.algorithm)` | yes | yes | +| `:(LearnAPI.minimize)` | no | yes | +| `:(LearnAPI.obs)` | no | yes | +| `:(LearnAPI.features)` | no | yes, unless `fit` consumes no data | +| `:(LearnAPI.target)` | no | only if implemented | +| `:(LearnAPI.weights)` | no | only if implemented | +| `:(LearnAPI.update)` | no | only if implemented | +| `:(LearnAPI.update_observations)` | no | only if implemented | +| `:(LearnAPI.update_features)` | no | only if implemented | +| `:(LearnAPI.predict)` | no | only if implemented | +| `:(LearnAPI.transform)` | no | only if implemented | +| `:(LearnAPI.inverse_transform)` | no | only if implemented | +| | no | only if implemented | Also include any implemented accessor functions, both those owned by LearnaAPI.jl, and any algorithm-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST. """ functions(::Any) = () - +functions() = ( + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.minimize), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.weights), + :(LearnAPI.update), + :(LearnAPI.update_observations), + :(LearnAPI.update_features), + :(LearnAPI.predict), + :(LearnAPI.transform), + :(LearnAPI.inverse_transform), +) """ LearnAPI.kinds_of_proxy(algorithm) diff --git a/test/clone.jl b/test/clone.jl new file mode 100644 index 00000000..13c5ce5a --- /dev/null +++ b/test/clone.jl @@ -0,0 +1,22 @@ +using Test +using LearnAPI + +struct Potato + x + y +end + +Potato(; x=1, y=2) = Potato(x, y) +LearnAPI.constructor(::Potato) = Potato + +@test LearnAPI.clone(Potato()) == Potato() + +p = LearnAPI.clone(Potato(), y=20) +@test p.y == 20 +@test p.x == 1 + +q = LearnAPI.clone(Potato(), y=20, x=10) +@test q.y == 20 +@test q.x == 10 + +true diff --git a/test/runtests.jl b/test/runtests.jl index 93788bc4..44a2a42f 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -1,103 +1,19 @@ using Test -@testset "tools.jl" begin - include("tools.jl") +test_files = [ + "tools.jl", + "traits.jl", + "clone.jl", + "integration/regression.jl", + "integration/static_algorithms.jl", +] + +files = isempty(ARGS) ? 
test_files : ARGS + +for file in files + quote + @testset $file begin + include($file*".jl") + end + end |> eval end - -@testset "traits.jl" begin - include("traits.jl") -end - -# # INTEGRATION TESTS - -@testset "regression" begin - include("integration/regression.jl") -end - -# @testset "classification" begin -# include("integration/classification.jl") -# end - -# @testset "clustering" begin -# include("integration/clustering.jl") -# end - -# @testset "gradient_descent" begin -# include("integration/gradient_descent.jl") -# end - -# @testset "iterative_algorithms" begin -# include("integration/iterative_algorithms.jl") -# end - -# @testset "incremental_algorithms" begin -# include("integration/incremental_algorithms.jl") -# end - -# @testset "dimension_reduction" begin -# include("integration/dimension_reduction.jl") -# end - -# @testset "encoders" begin -# include("integration/encoders.jl") -# end - -@testset "static_algorithms" begin - include("integration/static_algorithms.jl") -end - -# @testset "missing_value_imputation" begin -# include("integration/missing_value_imputation.jl") -# end - -# @testset "ensemble_algorithms" begin -# include("integration/ensemble_algorithms.jl") -# end - -# @testset "wrappers" begin -# include("integration/wrappers.jl") -# end - -# @testset "time_series_forecasting" begin -# include("integration/time_series_forecasting.jl") -# end - -# @testset "time_series_classification" begin -# include("integration/time_series_classification.jl") -# end - -# @testset "survival_analysis" begin -# include("integration/survival_analysis.jl") -# end - -# @testset "distribution_fitters" begin -# include("integration/distribution_fitters.jl") -# end - -# @testset "Bayesian_algorithms" begin -# include("integration/Bayesian_algorithms.jl") -# end - -# @testset "outlier_detection" begin -# include("integration/outlier_detection.jl") -# end - -# @testset "collaborative_filtering" begin -# include("integration/collaborative_filtering.jl") -# end - -# @testset "text_analysis" begin -# include("integration/text_analysis.jl") -# end - -# @testset "audio_analysis" begin -# include("integration/audio_analysis.jl") -# end - -# @testset "natural_language_processing" begin -# include("integration/natural_language_processing.jl") -# end - -# @testset "image_processing" begin -# include("integration/image_processing.jl") -# end From db2f287dda0559ba26ce914791668d7511051c58 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 7 Oct 2024 11:39:14 +1300 Subject: [PATCH 076/187] fix problem with runtests.jl --- test/runtests.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/runtests.jl b/test/runtests.jl index 44a2a42f..0a52be7b 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -13,7 +13,7 @@ files = isempty(ARGS) ? test_files : ARGS for file in files quote @testset $file begin - include($file*".jl") + include($file) end end |> eval end From 72009e256f17e7eead56733b796511ee22fb9537 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Tue, 8 Oct 2024 09:15:08 +1300 Subject: [PATCH 077/187] tweaks and corrections --- docs/src/anatomy_of_an_implementation.md | 2 +- docs/src/target_weights_features.md | 12 +++++------- src/target_weights_features.jl | 21 +++++++++------------ test/integration/regression.jl | 2 +- 4 files changed, 16 insertions(+), 21 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 6d294b06..2fba3329 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -411,7 +411,7 @@ specified by the trait* [`LearnAPI.data_interface(algorithm)`](@ref). Assuming t ```@example anatomy2 Base.getindex(data::RidgeFitObs, I) = RidgeFitObs(data.A[:,I], data.names, y[I]) -Base.length(data::RidgeFitObs, I) = length(data.y) +Base.length(data::RidgeFitObs) = length(data.y) ``` We can do something similar for `predict`, but there's no need for a new type in this diff --git a/docs/src/target_weights_features.md b/docs/src/target_weights_features.md index df4f76b7..910b9a4c 100644 --- a/docs/src/target_weights_features.md +++ b/docs/src/target_weights_features.md @@ -28,13 +28,11 @@ training_loss = sum(ŷ .!= y) # Implementation guide -The fallback returns `first(data)`, assuming `data` is a tuple, and `data` otherwise. - -| method | fallback | compulsory? | -|:----------------------------|:-----------------:|------------------------| -| [`LearnAPI.target`](@ref) | returns `nothing` | no | -| [`LearnAPI.weights`](@ref) | returns `nothing` | no | -| [`LearnAPI.features`](@ref) | see docstring | only if fallback fails | +| method | fallback | compulsory? | +|:----------------------------|:-----------------:|--------------------------| +| [`LearnAPI.target`](@ref) | returns `nothing` | no | +| [`LearnAPI.weights`](@ref) | returns `nothing` | no | +| [`LearnAPI.features`](@ref) | see docstring | if fallback insufficient | # Reference diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl index 98bd3888..3c9b075f 100644 --- a/src/target_weights_features.jl +++ b/src/target_weights_features.jl @@ -38,7 +38,7 @@ weights(::Any, data) = nothing Return, for each form of `data` supported in a call of the form [`fit(algorithm, data)`](@ref), the "features" part of `data` (as opposed to the target -variable, for example). +variable, for example). The returned object `X` may always be passed to `predict` or `transform`, where implemented, as in the following sample workflow: @@ -49,7 +49,7 @@ X = features(data) ŷ = predict(algorithm, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` ``` -The return value has the same number of observations as `data` does. For supervised models +The returned object has the same number of observations as `data`. For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(algorithm)`) `ŷ` above is generally intended to be an approximate proxy for `LearnAPI.target(algorithm, data)`, the training target. @@ -57,14 +57,9 @@ target. # New implementations -The only contract `features` must satisfy is the one about passability of the output to -`predict` or `transform`, for each supported input `data`. The following fallbacks -typically make overloading `LearnAPI.features` unnecessary: - -```julia -LearnAPI.features(algorithm, data) = data -LearnAPI.features(algorithm, data::Tuple) = first(data) -``` +That the output can be passed to `predict` and/or `transform`, and has the same number of +observations as `data`, are the only contracts. 
A fallback returns `first(data)` if `data` +is a tuple, and otherwise returns `data`. Overloading may be necessary if [`obs(algorithm, data)`](@ref) is overloaded to return some algorithm-specific representation of training `data`. For density estimators, whose @@ -72,5 +67,7 @@ some algorithm-specific representation of training `data`. For density estimator return `nothing`. """ -features(algorithm, data) = data -features(algorithm, data::Tuple) = first(data) +features(algorithm, data) = _first(data) +_first(data) = data +_first(data::Tuple) = first(data) +# note the factoring above guards agains method ambiguities diff --git a/test/integration/regression.jl b/test/integration/regression.jl index ba68cef1..34144c87 100644 --- a/test/integration/regression.jl +++ b/test/integration/regression.jl @@ -39,7 +39,7 @@ LearnAPI.algorithm(model::RidgeFitted) = model.algorithm Base.getindex(data::RidgeFitObs, I) = RidgeFitObs(data.A[:,I], data.names, data.y[I]) -Base.length(data::RidgeFitObs, I) = length(data.y) +Base.length(data::RidgeFitObs) = length(data.y) # observations for consumption by `fit`: function LearnAPI.obs(::Ridge, data) From 1530a84f730044a514211e953f5f071e7d44b2bf Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 8 Oct 2024 09:21:40 +1300 Subject: [PATCH 078/187] add iterative algorithms to integration tests --- Project.toml | 13 +- test/integration/iterative_algorithms.jl | 335 +++++++++++++++++++++++ 2 files changed, 347 insertions(+), 1 deletion(-) create mode 100644 test/integration/iterative_algorithms.jl diff --git a/Project.toml b/Project.toml index 08f56ba1..849adaeb 100644 --- a/Project.toml +++ b/Project.toml @@ -15,9 +15,20 @@ LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b" +StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3" Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [targets] -test = ["DataFrames", "LinearAlgebra", "MLUtils", "Random", "Serialization", "Statistics", "Tables", "Test"] +test = [ + "DataFrames", + "LinearAlgebra", + "MLUtils", + "Random", + "Serialization", + "StableRNGs", + "Statistics", + "Tables", + "Test", +] diff --git a/test/integration/iterative_algorithms.jl b/test/integration/iterative_algorithms.jl new file mode 100644 index 00000000..6e1e0f6e --- /dev/null +++ b/test/integration/iterative_algorithms.jl @@ -0,0 +1,335 @@ +using LearnAPI +using LinearAlgebra +using Tables +import MLUtils +import DataFrames +using Random +using Statistics +using StableRNGs + +# # ENSEMBLE OF RIDGE REGRESSORS + +# We implement a toy algorithm that creates an bagged ensemble of ridge regressors (as +# defined already in test/integration/regressors.jl), i.e, where each atomic model is +# trained on a random sample of the training observations (same number, but sampled with +# replacement). In particular this algorithm has an iteration parameter `n`, and we +# implement `update` for warm restarts when `n` increases. + +# By re-using the data interface for `Ridge`, we ensure that the resampling (bagging) is +# more efficient (no repeated table -> matrix conversions, and we resample matrices +# directly, not the tables). 
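+
+# A sketch of the workflow this file is building towards, using the names
+# defined below (`X`, `y`, `Xnew` stand for any compatible table/vector data;
+# the tests at the end of this file work through this in full):
+#
+#   algorithm = RidgeEnsemble(lambda=0.5, n=4, rng=StableRNG(123))
+#   model = fit(algorithm, X, y)      # trains 4 atomic ridge regressors
+#   model = update(model, X, y; n=7)  # warm restart: only 3 more are trained
+#   predict(model, Point(), Xnew)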
+ +# no docstring here - that goes with the constructor +struct RidgeEnsemble + lambda::Float64 + rng # leaving abstract for simplicity + n::Int +end + +""" + RidgeEnsemble(; lambda=0.1, rng=Random.default_rng(), n=10) + +Instantiate a RidgeEnsemble algorithm, bla, bla, bla... + +""" +RidgeEnsemble(; lambda=0.1, rng=Random.default_rng(), n=10) = + RidgeEnsemble(lambda, rng, n) # LearnAPI.constructor defined later + +struct RidgeEnsembleFitted + algorithm::RidgeEnsemble + atom::Ridge + rng # mutated copy of `algorithm.rng` + models # leaving type abstract for simplicity +end + +LearnAPI.algorithm(model::RidgeEnsembleFitted) = model.algorithm + +# we use the same data interface we provided for `Ridge` in regression.jl: +LearnAPI.obs(algorithm::RidgeEnsemble, data) = LearnAPI.obs(Ridge(), data) +LearnAPI.obs(model::RidgeEnsembleFitted, data) = LearnAPI.obs(first(model.models), data) +LearnAPI.target(algorithm::RidgeEnsemble, data) = LearnAPI.target(Ridge(), data) +LearnAPI.features(algorithm::Ridge, data) = LearnAPI.features(Ridge(), data) + +function d(rng) + i = digits(rng.state) + m = min(length(i), 4) + tail = i[end - m + 1:end] + println(join(string.(tail))) +end + +# because we need observation subsampling, we first implement `fit` for output of +# `obs`: +function LearnAPI.fit(algorithm::RidgeEnsemble, data::RidgeFitObs; verbosity=1) + + # unpack hyperparameters: + lambda = algorithm.lambda + rng = deepcopy(algorithm.rng) # to prevent mutation of `algorithm` + n = algorithm.n + + # instantiate atomic algorithm: + atom = Ridge(lambda) + + # initialize ensemble: + models = [] + + # get number of observations: + N = MLUtils.numobs(data) + + # train the ensemble: + for _ in 1:n + bag = rand(rng, 1:N, N) + data_subset = MLUtils.getobs(data, bag) + # step down one verbosity level in atomic fit: + model = fit(atom, data_subset; verbosity=verbosity - 1) + push!(models, model) + end + + # make some noise, if allowed: + verbosity > 0 && @info "Trained $n ridge regression models. " + + return RidgeEnsembleFitted(algorithm, atom, rng, models) + +end + +# ... and so need a `fit` for unprocessed `data = (X, y)`: +LearnAPI.fit(algorithm::RidgeEnsemble, data; kwargs...) = + fit(algorithm, obs(algorithm, data); kwargs...) + +# If `n` is increased, this `update` adds new regressors to the ensemble, including any +# new # hyperparameter updates (e.g, `lambda`) when computing the new +# regressors. Otherwise, update is equivalent to retraining from scratch, with the +# provided hyperparameter updates. +function LearnAPI.update( + model::RidgeEnsembleFitted, + data::RidgeFitObs; + verbosity=1, + replacements..., + ) + + :n in keys(replacements) || return fit(model, data) + + algorithm_old = LearnAPI.algorithm(model) + algorithm = LearnAPI.clone(algorithm_old; replacements...) + n = algorithm.n + Δn = n - algorithm_old.n + n < 0 && return fit(model, algorithm) + + # get number of observations: + N = MLUtils.numobs(data) + + # initialize: + models = model.models + rng = model.rng # as mutated in previous `fit`/`update` calls + + atom = Ridge(; lambda=algorithm.lambda) + + rng2 = StableRNG(123) + for _ in 1:10 + rand(rng2) + end + + # add new regressors to the ensemble: + for _ in 1:Δn + bag = rand(rng, 1:N, N) + data_subset = MLUtils.getobs(data, bag) + model = fit(atom, data_subset; verbosity=verbosity-1) + push!(models, model) + end + + # make some noise, if allowed: + verbosity > 0 && @info "Trained $Δn additional ridge regression models. 
" + + return RidgeEnsembleFitted(algorithm, atom, rng, models) +end + +# an `update` for unprocessed `data = (X, y)`: +LearnAPI.update(model::RidgeEnsembleFitted, data; kwargs...) = + update(model, obs(LearnAPI.algorithm(model), data); kwargs...) + +# `data` here can be pre-processed or not, because we're just calling the atomic +# `predict`, which already has a data interface, and we don't need any subsampling, like +# we did for `fit`: +LearnAPI.predict(model::RidgeEnsembleFitted, ::Point, data) = + mean(model.models) do atomic_model + predict(atomic_model, Point(), data) + end + +LearnAPI.minimize(model::RidgeEnsembleFitted) = RidgeEnsembleFitted( + model.algorithm, + model.atom, + model.rng, + minimize.(Ref(model.atom), models), +) + +# note the inclusion of `iteration_parameter`: +@trait( + RidgeEnsemble, + constructor = RidgeEnsemble, + iteration_parameter = :n, + kinds_of_proxy = (Point(),), + tags = ("regression", "ensemble algorithms", "iterative models"), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.minimize), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.update), + :(LearnAPI.predict), + :(LearnAPI.feature_importances), + ) +) + +# synthetic test data: +N = 10 # number of observations +train = 1:6 +test = 7:10 +a, b, c = rand(N), rand(N), rand(N) +X = (; a, b, c) +X = DataFrames.DataFrame(X) +y = 2a - b + 3c + 0.05*rand(N) +data = (X, y) +Xtrain = Tables.subset(X, train) +Xtest = Tables.subset(X, test) + +@testset "test an implementation of bagged ensemble of ridge regressors" begin + rng = StableRNG(123) + algorithm = RidgeEnsemble(lambda=0.5, n=4; rng) + @test LearnAPI.clone(algorithm) == algorithm + @test :(LearnAPI.obs) in LearnAPI.functions(algorithm) + @test LearnAPI.target(algorithm, data) == y + @test LearnAPI.features(algorithm, data) == X + + model = @test_logs( + (:info, r"Trained 4 ridge"), + fit(algorithm, Xtrain, y[train]; verbosity=1), + ); + + ŷ4 = predict(model, Point(), Xtest) + @test ŷ4 == predict(model, Xtest) + + # add 3 atomic models to the ensemble: + # model = @test_logs( + # (:info, r"Trained 3 additional"), + # update(model, Xtrain, y[train]; n=7), + # ) + model = update(model, Xtrain, y[train]; verbosity=0, n=7); + ŷ7 = predict(model, Xtest) + + # compare with cold restart: + model = fit(LearnAPI.clone(algorithm; n=7), Xtrain, y[train]; verbosity=0); + @test ŷ7 ≈ predict(model, Xtest) + + + update(model, Xtest; + fitobs = LearnAPI.obs(algorithm, data) + predictobs = LearnAPI.obs(model, X) + model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) + @test LearnAPI.target(algorithm, fitobs) == y + @test predict(model, Point(), MLUtils.getobs(predictobs, test)) ≈ ŷ + @test predict(model, LearnAPI.features(algorithm, fitobs)) ≈ predict(model, X) + + @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}} + + filename = tempname() + using Serialization + small_model = minimize(model) + serialize(filename, small_model) + + recovered_model = deserialize(filename) + @test LearnAPI.algorithm(recovered_model) == algorithm + @test predict( + recovered_model, + Point(), + MLUtils.getobs(predictobs, test) + ) ≈ ŷ + +end + +# # VARIATION OF RIDGE REGRESSION THAT USES FALLBACK OF LearnAPI.obs + +# no docstring here - that goes with the constructor +struct BabyRidge + lambda::Float64 +end + +""" + BabyRidge(; lambda=0.1) + +Instantiate a ridge regression algorithm, with regularization of `lambda`. 
+ +""" +BabyRidge(; lambda=0.1) = BabyRidge(lambda) # LearnAPI.constructor defined later + +struct BabyRidgeFitted{T,F} + algorithm::BabyRidge + coefficients::Vector{T} + feature_importances::F +end + +function LearnAPI.fit(algorithm::BabyRidge, data; verbosity=1) + + X, y = data + + lambda = algorithm.lambda + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + A = Tables.matrix(table)' + + # apply core algorithm: + coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector + + feature_importances = nothing + + return BabyRidgeFitted(algorithm, coefficients, feature_importances) + +end + +# extracting stuff from training data: +LearnAPI.target(::BabyRidge, data) = last(data) + +LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm + +LearnAPI.predict(model::BabyRidgeFitted, ::Point, Xnew) = + Tables.matrix(Xnew)*model.coefficients + +LearnAPI.minimize(model::BabyRidgeFitted) = + BabyRidgeFitted(model.algorithm, model.coefficients, nothing) + +@trait( + BabyRidge, + constructor = BabyRidge, + kinds_of_proxy = (Point(),), + tags = ("regression",), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.minimize), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.feature_importances), + ) +) + +@testset "test a variation which does not overload LearnAPI.obs" begin + algorithm = BabyRidge(lambda=0.5) + @test + + model = fit(algorithm, Tables.subset(X, train), y[train]; verbosity=0) + ŷ = predict(model, Point(), Tables.subset(X, test)) + @test ŷ isa Vector{Float64} + + fitobs = obs(algorithm, data) + predictobs = LearnAPI.obs(model, X) + model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) + @test predict(model, Point(), MLUtils.getobs(predictobs, test)) == ŷ == + predict(model, MLUtils.getobs(predictobs, test)) + @test LearnAPI.target(algorithm, data) == y + @test LearnAPI.predict(model, X) ≈ + LearnAPI.predict(model, LearnAPI.features(algorithm, data)) +end + +true From 7d45e08a68ff1fe3142a6781824f56998f0af649 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Tue, 8 Oct 2024 10:12:33 +1300 Subject: [PATCH 079/187] other tweaks --- ROADMAP.md | 4 +- docs/src/common_implementation_patterns.md | 41 +++-- docs/src/patterns/density_estimation.md | 1 + docs/src/patterns/ensemble_algorithms.md | 5 + docs/src/patterns/iterative_algorithms.md | 4 + .../learning_a_probability_distribution.md | 1 - docs/src/patterns/transformers.md | 1 + src/traits.jl | 4 +- test/integration/iterative_algorithms.jl | 169 ++---------------- test/runtests.jl | 1 + 10 files changed, 57 insertions(+), 174 deletions(-) create mode 100644 docs/src/patterns/density_estimation.md create mode 100644 docs/src/patterns/ensemble_algorithms.md delete mode 100644 docs/src/patterns/learning_a_probability_distribution.md create mode 100644 docs/src/patterns/transformers.md diff --git a/ROADMAP.md b/ROADMAP.md index 0e0ca206..cd524ba5 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -17,14 +17,14 @@ - [ ] classification - [ ] clustering - [ ] gradient descent - - [ ] iterative algorithms + - [x] iterative algorithms - [ ] incremental algorithms - [ ] dimension reduction - [x] feature engineering - [x] static algorithms - [ ] missing value imputation - [ ] transformers - - [ ] ensemble algorithms + - [x] ensemble algorithms - [ ] time series forecasting - [ ] time series classification - [ ] survival analysis diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 7a4c0406..f878aa36 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -20,43 +20,54 @@ Although an implementation is defined purely by the methods and traits it implem implementations fall into one (or more) of the following informally understood patterns or "tasks": +- [Regression](@ref): Supervised learners for continuous targets + - [Classification](@ref): Supervised learners for categorical targets -- [Regression](@ref): Supervised learners for continuous targets +- [Clusterering](@ref): Algorithms that group data into clusters for classification and + possibly dimension reduction. May be true learners (generalize to new data) or static. + +- [Gradient Descent](@ref): Including neural networks. - [Iterative Algorithms](@ref) - [Incremental Algorithms](@ref) +- [Feature Engineering](@ref): Algorithms for selecting or combining features + +- [Dimension Reduction](@ref): Transformers that learn to reduce feature space dimension + +- [Missing Value Imputation](@ref) + +- [Transformers](@ref): Other transformers, such as standardizers, and categorical + encoders. + - [Static Algorithms](@ref): Algorithms that do not learn, in the sense they must be re-executed for each new data set (do not generalize), but which have hyperparameters and/or deliver ancillary information about the computation. + +- [Ensemble Algorithms](@ref): Algorithms that blend predictions of multiple algorithms -- [Dimension Reduction](@ref): Transformers that learn to reduce feature space dimension +- [Time Series Forecasting](@ref) -- [Feature Engineering](@ref) +- [Time Series Classification](@ref) -- [Missing Value Imputation](@ref): Transformers that replace missing values. +- [Survival Analysis](@ref) -- [Transformers](@ref): Other transformers, such as standardizers, and categorical - encoders. +- [Density Estimation](@ref): Algorithms that learn a probability distribution -- [Clusterering](@ref): Algorithms that group data into clusters for classification and - possibly dimension reduction. 
May be true learners (generalize to new data) or static. +- [Bayesian Algorithms](@ref) - [Outlier Detection](@ref): Supervised, unsupervised, or semi-supervised learners for anomaly detection. -- [Learning a Probability Distribution](@ref): Algorithms that fit a distribution or - distribution-like object to data - -- [Time Series Forecasting](@ref) +- [Text Analysis](@ref) -- [Time Series Classification](@ref) +- [Audio Analysis](@ref) -- [Supervised Bayesian Algorithms](@ref) +- [Natural Language Processing](@ref) -- [Survival Analysis](@ref) +- [Image Processing](@ref) - [Meta-algorithms](@ref) diff --git a/docs/src/patterns/density_estimation.md b/docs/src/patterns/density_estimation.md new file mode 100644 index 00000000..f535f9fe --- /dev/null +++ b/docs/src/patterns/density_estimation.md @@ -0,0 +1 @@ +# Density Estimation diff --git a/docs/src/patterns/ensemble_algorithms.md b/docs/src/patterns/ensemble_algorithms.md new file mode 100644 index 00000000..44e94b52 --- /dev/null +++ b/docs/src/patterns/ensemble_algorithms.md @@ -0,0 +1,5 @@ +# Ensemble Algorithms + +See [this +example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/iterative_algorithms.jl) +from tests. diff --git a/docs/src/patterns/iterative_algorithms.md b/docs/src/patterns/iterative_algorithms.md index fafd1b7e..397ceb68 100644 --- a/docs/src/patterns/iterative_algorithms.md +++ b/docs/src/patterns/iterative_algorithms.md @@ -1 +1,5 @@ # Iterative Algorithms + +See [this +example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/iterative_algorithms.jl) +from tests. diff --git a/docs/src/patterns/learning_a_probability_distribution.md b/docs/src/patterns/learning_a_probability_distribution.md deleted file mode 100644 index 19a53b86..00000000 --- a/docs/src/patterns/learning_a_probability_distribution.md +++ /dev/null @@ -1 +0,0 @@ -# Learning a Probability Distribution diff --git a/docs/src/patterns/transformers.md b/docs/src/patterns/transformers.md new file mode 100644 index 00000000..08e10a25 --- /dev/null +++ b/docs/src/patterns/transformers.md @@ -0,0 +1 @@ +# Transformers diff --git a/src/traits.jl b/src/traits.jl index bed7cab5..d8104d93 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -195,11 +195,11 @@ tags() = [ "gradient descent", "iterative algorithms", "incremental algorithms", + "feature engineering", "dimension reduction", + "missing value imputation", "transformers", - "feature engineering", "static algorithms", - "missing value imputation", "ensemble algorithms", "time series forecasting", "time series classification", diff --git a/test/integration/iterative_algorithms.jl b/test/integration/iterative_algorithms.jl index 6e1e0f6e..7a1a3808 100644 --- a/test/integration/iterative_algorithms.jl +++ b/test/integration/iterative_algorithms.jl @@ -15,10 +15,6 @@ using StableRNGs # replacement). In particular this algorithm has an iteration parameter `n`, and we # implement `update` for warm restarts when `n` increases. -# By re-using the data interface for `Ridge`, we ensure that the resampling (bagging) is -# more efficient (no repeated table -> matrix conversions, and we resample matrices -# directly, not the tables). - # no docstring here - that goes with the constructor struct RidgeEnsemble lambda::Float64 @@ -44,22 +40,14 @@ end LearnAPI.algorithm(model::RidgeEnsembleFitted) = model.algorithm -# we use the same data interface we provided for `Ridge` in regression.jl: +# We add the same data interface we provided for `Ridge` in regression.jl. 
This is an +# optional step on which the later code does not depend. LearnAPI.obs(algorithm::RidgeEnsemble, data) = LearnAPI.obs(Ridge(), data) LearnAPI.obs(model::RidgeEnsembleFitted, data) = LearnAPI.obs(first(model.models), data) LearnAPI.target(algorithm::RidgeEnsemble, data) = LearnAPI.target(Ridge(), data) LearnAPI.features(algorithm::Ridge, data) = LearnAPI.features(Ridge(), data) -function d(rng) - i = digits(rng.state) - m = min(length(i), 4) - tail = i[end - m + 1:end] - println(join(string.(tail))) -end - -# because we need observation subsampling, we first implement `fit` for output of -# `obs`: -function LearnAPI.fit(algorithm::RidgeEnsemble, data::RidgeFitObs; verbosity=1) +function LearnAPI.fit(algorithm::RidgeEnsemble, data; verbosity=1) # unpack hyperparameters: lambda = algorithm.lambda @@ -69,16 +57,21 @@ function LearnAPI.fit(algorithm::RidgeEnsemble, data::RidgeFitObs; verbosity=1) # instantiate atomic algorithm: atom = Ridge(lambda) + # ensure data can be subsampled using MLUtils.jl, and that we're feeding the atomic + # `fit` data in an efficient (pre-processed) form: + + observations = obs(atom, data) + # initialize ensemble: models = [] # get number of observations: - N = MLUtils.numobs(data) + N = MLUtils.numobs(observations) # train the ensemble: for _ in 1:n bag = rand(rng, 1:N, N) - data_subset = MLUtils.getobs(data, bag) + data_subset = MLUtils.getobs(observations, bag) # step down one verbosity level in atomic fit: model = fit(atom, data_subset; verbosity=verbosity - 1) push!(models, model) @@ -91,21 +84,11 @@ function LearnAPI.fit(algorithm::RidgeEnsemble, data::RidgeFitObs; verbosity=1) end -# ... and so need a `fit` for unprocessed `data = (X, y)`: -LearnAPI.fit(algorithm::RidgeEnsemble, data; kwargs...) = - fit(algorithm, obs(algorithm, data); kwargs...) - # If `n` is increased, this `update` adds new regressors to the ensemble, including any # new # hyperparameter updates (e.g, `lambda`) when computing the new # regressors. Otherwise, update is equivalent to retraining from scratch, with the # provided hyperparameter updates. -function LearnAPI.update( - model::RidgeEnsembleFitted, - data::RidgeFitObs; - verbosity=1, - replacements..., - ) - +function LearnAPI.update(model::RidgeEnsembleFitted, data; verbosity=1, replacements...) :n in keys(replacements) || return fit(model, data) algorithm_old = LearnAPI.algorithm(model) @@ -114,24 +97,18 @@ function LearnAPI.update( Δn = n - algorithm_old.n n < 0 && return fit(model, algorithm) - # get number of observations: - N = MLUtils.numobs(data) + atom = Ridge(; lambda=algorithm.lambda) + observations = obs(atom, data) + N = MLUtils.numobs(observations) # initialize: models = model.models rng = model.rng # as mutated in previous `fit`/`update` calls - atom = Ridge(; lambda=algorithm.lambda) - - rng2 = StableRNG(123) - for _ in 1:10 - rand(rng2) - end - # add new regressors to the ensemble: for _ in 1:Δn bag = rand(rng, 1:N, N) - data_subset = MLUtils.getobs(data, bag) + data_subset = MLUtils.getobs(observations, bag) model = fit(atom, data_subset; verbosity=verbosity-1) push!(models, model) end @@ -142,13 +119,6 @@ function LearnAPI.update( return RidgeEnsembleFitted(algorithm, atom, rng, models) end -# an `update` for unprocessed `data = (X, y)`: -LearnAPI.update(model::RidgeEnsembleFitted, data; kwargs...) = - update(model, obs(LearnAPI.algorithm(model), data); kwargs...) 
- -# `data` here can be pre-processed or not, because we're just calling the atomic -# `predict`, which already has a data interface, and we don't need any subsampling, like -# we did for `fit`: LearnAPI.predict(model::RidgeEnsembleFitted, ::Point, data) = mean(model.models) do atomic_model predict(atomic_model, Point(), data) @@ -221,115 +191,6 @@ Xtest = Tables.subset(X, test) model = fit(LearnAPI.clone(algorithm; n=7), Xtrain, y[train]; verbosity=0); @test ŷ7 ≈ predict(model, Xtest) - - update(model, Xtest; - fitobs = LearnAPI.obs(algorithm, data) - predictobs = LearnAPI.obs(model, X) - model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) - @test LearnAPI.target(algorithm, fitobs) == y - @test predict(model, Point(), MLUtils.getobs(predictobs, test)) ≈ ŷ - @test predict(model, LearnAPI.features(algorithm, fitobs)) ≈ predict(model, X) - - @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}} - - filename = tempname() - using Serialization - small_model = minimize(model) - serialize(filename, small_model) - - recovered_model = deserialize(filename) - @test LearnAPI.algorithm(recovered_model) == algorithm - @test predict( - recovered_model, - Point(), - MLUtils.getobs(predictobs, test) - ) ≈ ŷ - -end - -# # VARIATION OF RIDGE REGRESSION THAT USES FALLBACK OF LearnAPI.obs - -# no docstring here - that goes with the constructor -struct BabyRidge - lambda::Float64 -end - -""" - BabyRidge(; lambda=0.1) - -Instantiate a ridge regression algorithm, with regularization of `lambda`. - -""" -BabyRidge(; lambda=0.1) = BabyRidge(lambda) # LearnAPI.constructor defined later - -struct BabyRidgeFitted{T,F} - algorithm::BabyRidge - coefficients::Vector{T} - feature_importances::F -end - -function LearnAPI.fit(algorithm::BabyRidge, data; verbosity=1) - - X, y = data - - lambda = algorithm.lambda - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - A = Tables.matrix(table)' - - # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector - - feature_importances = nothing - - return BabyRidgeFitted(algorithm, coefficients, feature_importances) - -end - -# extracting stuff from training data: -LearnAPI.target(::BabyRidge, data) = last(data) - -LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm - -LearnAPI.predict(model::BabyRidgeFitted, ::Point, Xnew) = - Tables.matrix(Xnew)*model.coefficients - -LearnAPI.minimize(model::BabyRidgeFitted) = - BabyRidgeFitted(model.algorithm, model.coefficients, nothing) - -@trait( - BabyRidge, - constructor = BabyRidge, - kinds_of_proxy = (Point(),), - tags = ("regression",), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.algorithm), - :(LearnAPI.minimize), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.predict), - :(LearnAPI.feature_importances), - ) -) - -@testset "test a variation which does not overload LearnAPI.obs" begin - algorithm = BabyRidge(lambda=0.5) - @test - - model = fit(algorithm, Tables.subset(X, train), y[train]; verbosity=0) - ŷ = predict(model, Point(), Tables.subset(X, test)) - @test ŷ isa Vector{Float64} - - fitobs = obs(algorithm, data) - predictobs = LearnAPI.obs(model, X) - model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) - @test predict(model, Point(), MLUtils.getobs(predictobs, test)) == ŷ == - predict(model, MLUtils.getobs(predictobs, test)) - @test LearnAPI.target(algorithm, data) == y - @test LearnAPI.predict(model, X) ≈ - LearnAPI.predict(model, LearnAPI.features(algorithm, data)) end true diff --git 
a/test/runtests.jl b/test/runtests.jl index 0a52be7b..dee0c17b 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -6,6 +6,7 @@ test_files = [ "clone.jl", "integration/regression.jl", "integration/static_algorithms.jl", + "integration/iterative_algorithms.jl", ] files = isempty(ARGS) ? test_files : ARGS From 82a9e687a5da68cace30cc885a5c45a9ffe632d4 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 8 Oct 2024 11:45:28 +1300 Subject: [PATCH 080/187] add trait tests --- docs/src/common_implementation_patterns.md | 36 ++++----- docs/src/patterns/meta_algorithms.md | 5 ++ docs/src/reference.md | 4 +- docs/src/traits.md | 14 ++-- src/traits.jl | 25 +----- test/integration/iterative_algorithms.jl | 93 +++++++++++----------- test/traits.jl | 47 ++++++++++- 7 files changed, 128 insertions(+), 96 deletions(-) diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index f878aa36..72d577de 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -16,30 +16,30 @@ This guide is intended to be consulted after reading [Anatomy of an Implementation](@ref), which introduces the main interface objects and terminology. -Although an implementation is defined purely by the methods and traits it implements, most +Although an implementation is defined purely by the methods and traits it implements, many implementations fall into one (or more) of the following informally understood patterns or "tasks": - [Regression](@ref): Supervised learners for continuous targets -- [Classification](@ref): Supervised learners for categorical targets +- Classification: Supervised learners for categorical targets -- [Clusterering](@ref): Algorithms that group data into clusters for classification and +- Clusterering: Algorithms that group data into clusters for classification and possibly dimension reduction. May be true learners (generalize to new data) or static. -- [Gradient Descent](@ref): Including neural networks. +- Gradient Descent: Including neural networks. - [Iterative Algorithms](@ref) -- [Incremental Algorithms](@ref) +- Incremental Algorithms - [Feature Engineering](@ref): Algorithms for selecting or combining features -- [Dimension Reduction](@ref): Transformers that learn to reduce feature space dimension +- Dimension Reduction: Transformers that learn to reduce feature space dimension -- [Missing Value Imputation](@ref) +- Missing Value Imputation -- [Transformers](@ref): Other transformers, such as standardizers, and categorical +- Transformers: Other transformers, such as standardizers, and categorical encoders. - [Static Algorithms](@ref): Algorithms that do not learn, in the sense they must be @@ -48,26 +48,26 @@ implementations fall into one (or more) of the following informally understood p - [Ensemble Algorithms](@ref): Algorithms that blend predictions of multiple algorithms -- [Time Series Forecasting](@ref) +- Time Series Forecasting -- [Time Series Classification](@ref) +- Time Series Classification -- [Survival Analysis](@ref) +- Survival Analysis -- [Density Estimation](@ref): Algorithms that learn a probability distribution +- Density Estimation: Algorithms that learn a probability distribution -- [Bayesian Algorithms](@ref) +- Bayesian Algorithms -- [Outlier Detection](@ref): Supervised, unsupervised, or semi-supervised learners for +- Outlier Detection: Supervised, unsupervised, or semi-supervised learners for anomaly detection. 
-- [Text Analysis](@ref) +- Text Analysis -- [Audio Analysis](@ref) +- Audio Analysis -- [Natural Language Processing](@ref) +- Natural Language Processing -- [Image Processing](@ref) +- Image Processing - [Meta-algorithms](@ref) diff --git a/docs/src/patterns/meta_algorithms.md b/docs/src/patterns/meta_algorithms.md index 1de7712f..302e218d 100644 --- a/docs/src/patterns/meta_algorithms.md +++ b/docs/src/patterns/meta_algorithms.md @@ -1 +1,6 @@ # Meta-algorithms + +Many meta-algorithms are wrappers. An example is [this bagged ensemble +algorithm](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/iterative_algorithms.jl) +from tests. + diff --git a/docs/src/reference.md b/docs/src/reference.md index bcd5a922..003f531e 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -141,7 +141,9 @@ for each. [`LearnAPI.algorithm`](@ref algorithm_minimize), [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref). -Most algorithms will also implement [`predict`](@ref) and/or [`transform`](@ref). +Most algorithms will also implement [`predict`](@ref) and/or [`transform`](@ref). For a +bare minimum implementation, see the implementation of `SmallAlgorithm` +[here](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/traits.jl). ### List of methods diff --git a/docs/src/traits.md b/docs/src/traits.md index 58759137..09c6fbf4 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -26,8 +26,8 @@ In the examples column of the table below, `Continuous` is a name owned the pack | [`LearnAPI.load_path`](@ref)`(algorithm)` | string locating name returned by `LearnAPI.constructor(algorithm)`, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | | [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties of `algorithm` may be an algorithm | `false` | `true` | | [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | -| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | | [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | +| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | | [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | | [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | | [`LearnAPI.predict_or_transform_mutates`](@ref)`(algorithm)` | `true` if `predict` or `transform` mutates first argument | `false` | `true` | @@ -36,12 +36,12 @@ In the examples column of the table below, `Continuous` is a name owned the pack The following are provided for convenience but should not be overloaded by new algorithms: -| trait | return value | example | -|:-----------------------------------|:---------------------------------------------------------------------|:--------| -| `LearnAPI.name(algorithm)` | algorithm type name as string | "PCA" | -| `LearnAPI.is_algorithm(algorithm)` | `true` if 
`algorithm` is LearnAPI.jl-compliant | `true` | -| `LearnAPI.target(algorithm)` | `true` if [`LearnAPI.target(algorithm, data)`](@ref) is implemented | `false` | -| `LearnAPI.weights(algorithm)` | `true` if [`LearnAPI.weights(algorithm, data)`](@ref) is implemented | `false` | +| trait | return value | example | +|:-----------------------------------|:-------------------------------------------------------------------------|:--------| +| `LearnAPI.name(algorithm)` | algorithm type name as string | "PCA" | +| `LearnAPI.is_algorithm(algorithm)` | `true` if `algorithm` is LearnAPI.jl-compliant | `true` | +| `LearnAPI.target(algorithm)` | `true` if `fit` sees a target variable; see [`LearnAPI.target`](@ref) | `false` | +| `LearnAPI.weights(algorithm)` | `true` if `fit` supports per-observation; see [`LearnAPI.weights`](@ref) | `false` | ## Implementation guide diff --git a/src/traits.jl b/src/traits.jl index d8104d93..64c58353 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -23,29 +23,6 @@ const DOC_EXPLAIN_EACHOBS = """ -const TRAITS = [ - :constructor, - :functions, - :kinds_of_proxy, - :tags, - :is_pure_julia, - :pkg_name, - :pkg_license, - :doc_url, - :load_path, - :is_composite, - :human_name, - :iteration_parameter, - :data_interface, - :predict_or_transform_mutates, - :fit_observation_scitype, - :target_observation_scitype, - :name, - :is_algorithm, - :target, -] - - # # OVERLOADABLE TRAITS """ @@ -426,7 +403,7 @@ variable. Specifically: variables) then "target" means anything returned by `LearnAPI.target(algorithm, data)`, where `data` is an admissible argument in the call `fit(algorithm, data)`. -- `S` will always be an upper bound on the scitype of observations that could be +- `S` will always be an upper bound on the scitype of (point) observations that could be conceivably extracted from the output of [`predict`](@ref). To illustate the second case, suppose we have diff --git a/test/integration/iterative_algorithms.jl b/test/integration/iterative_algorithms.jl index 7a1a3808..2dae816b 100644 --- a/test/integration/iterative_algorithms.jl +++ b/test/integration/iterative_algorithms.jl @@ -7,56 +7,57 @@ using Random using Statistics using StableRNGs -# # ENSEMBLE OF RIDGE REGRESSORS - -# We implement a toy algorithm that creates an bagged ensemble of ridge regressors (as -# defined already in test/integration/regressors.jl), i.e, where each atomic model is -# trained on a random sample of the training observations (same number, but sampled with -# replacement). In particular this algorithm has an iteration parameter `n`, and we -# implement `update` for warm restarts when `n` increases. - -# no docstring here - that goes with the constructor -struct RidgeEnsemble - lambda::Float64 - rng # leaving abstract for simplicity +# # ENSEMBLE OF REGRESSORS (A MODEL WRAPPER) + +# We implement a toy algorithm that creates an bagged ensemble of regressors, i.e, where +# each atomic model is trained on a random sample of the training observations (same +# number, but sampled with replacement). In particular this algorithm has an iteration +# parameter `n`, and we implement `update` for warm restarts when `n` increases. + +# no docstring here - that goes with the constructor; some fields left abstract for +# simplicity +# +struct Ensemble + atom # the base regressor being bagged + rng n::Int end +# Since the `atom` hyperparameter is another algorithm, it doesn't need a default in the +# kwarg constructor, but we do need to overload the `LearnAPI.is_composite` trait (done +# later). 
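+
+# So, for example, `Ensemble(Ridge(); n=100)` is a valid instance, the atomic
+# algorithm being passed positionally, and `LearnAPI.clone` can still rebuild
+# such an instance from its properties (see the tests below, which check
+# `LearnAPI.clone(algorithm) == algorithm`).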
+ """ - RidgeEnsemble(; lambda=0.1, rng=Random.default_rng(), n=10) + Ensemble(atom; rng=Random.default_rng(), n=10) -Instantiate a RidgeEnsemble algorithm, bla, bla, bla... +Instantiate a bagged ensemble of `n` regressors, with base regressor `atom`, etc """ -RidgeEnsemble(; lambda=0.1, rng=Random.default_rng(), n=10) = - RidgeEnsemble(lambda, rng, n) # LearnAPI.constructor defined later +Ensemble(atom; rng=Random.default_rng(), n=10) = + Ensemble(atom, rng, n) # `LearnAPI.constructor` defined later -struct RidgeEnsembleFitted - algorithm::RidgeEnsemble +struct EnsembleFitted + algorithm::Ensemble atom::Ridge rng # mutated copy of `algorithm.rng` models # leaving type abstract for simplicity end -LearnAPI.algorithm(model::RidgeEnsembleFitted) = model.algorithm +LearnAPI.algorithm(model::EnsembleFitted) = model.algorithm -# We add the same data interface we provided for `Ridge` in regression.jl. This is an -# optional step on which the later code does not depend. -LearnAPI.obs(algorithm::RidgeEnsemble, data) = LearnAPI.obs(Ridge(), data) -LearnAPI.obs(model::RidgeEnsembleFitted, data) = LearnAPI.obs(first(model.models), data) -LearnAPI.target(algorithm::RidgeEnsemble, data) = LearnAPI.target(Ridge(), data) -LearnAPI.features(algorithm::Ridge, data) = LearnAPI.features(Ridge(), data) +# We add the same data interface that the atomic regressor uses: +LearnAPI.obs(algorithm::Ensemble, data) = LearnAPI.obs(algorithm.atom, data) +LearnAPI.obs(model::EnsembleFitted, data) = LearnAPI.obs(first(model.models), data) +LearnAPI.target(algorithm::Ensemble, data) = LearnAPI.target(algorithm.atom, data) +LearnAPI.features(algorithm::Ridge, data) = LearnAPI.features(algorithm.atom, data) -function LearnAPI.fit(algorithm::RidgeEnsemble, data; verbosity=1) +function LearnAPI.fit(algorithm::Ensemble, data; verbosity=1) # unpack hyperparameters: - lambda = algorithm.lambda - rng = deepcopy(algorithm.rng) # to prevent mutation of `algorithm` + atom = algorithm.atom + rng = deepcopy(algorithm.rng) # to prevent mutation of `algorithm`! n = algorithm.n - # instantiate atomic algorithm: - atom = Ridge(lambda) - # ensure data can be subsampled using MLUtils.jl, and that we're feeding the atomic # `fit` data in an efficient (pre-processed) form: @@ -80,15 +81,16 @@ function LearnAPI.fit(algorithm::RidgeEnsemble, data; verbosity=1) # make some noise, if allowed: verbosity > 0 && @info "Trained $n ridge regression models. " - return RidgeEnsembleFitted(algorithm, atom, rng, models) + return EnsembleFitted(algorithm, atom, rng, models) end -# If `n` is increased, this `update` adds new regressors to the ensemble, including any -# new # hyperparameter updates (e.g, `lambda`) when computing the new -# regressors. Otherwise, update is equivalent to retraining from scratch, with the -# provided hyperparameter updates. -function LearnAPI.update(model::RidgeEnsembleFitted, data; verbosity=1, replacements...) +# Consistent with the documented `update` contract, we implement this behaviour: If `n` is +# increased, `update` adds new regressors to the ensemble, including any new +# hyperparameter updates (e.g, new `atom`) when computing the new atomic +# models. Otherwise, update is equivalent to retraining from scratch, with the provided +# hyperparameter updates. +function LearnAPI.update(model::EnsembleFitted, data; verbosity=1, replacements...) 
:n in keys(replacements) || return fit(model, data) algorithm_old = LearnAPI.algorithm(model) @@ -97,7 +99,7 @@ function LearnAPI.update(model::RidgeEnsembleFitted, data; verbosity=1, replacem Δn = n - algorithm_old.n n < 0 && return fit(model, algorithm) - atom = Ridge(; lambda=algorithm.lambda) + atom = algorithm.atom observations = obs(atom, data) N = MLUtils.numobs(observations) @@ -116,15 +118,15 @@ function LearnAPI.update(model::RidgeEnsembleFitted, data; verbosity=1, replacem # make some noise, if allowed: verbosity > 0 && @info "Trained $Δn additional ridge regression models. " - return RidgeEnsembleFitted(algorithm, atom, rng, models) + return EnsembleFitted(algorithm, atom, rng, models) end -LearnAPI.predict(model::RidgeEnsembleFitted, ::Point, data) = +LearnAPI.predict(model::EnsembleFitted, ::Point, data) = mean(model.models) do atomic_model predict(atomic_model, Point(), data) end -LearnAPI.minimize(model::RidgeEnsembleFitted) = RidgeEnsembleFitted( +LearnAPI.minimize(model::EnsembleFitted) = EnsembleFitted( model.algorithm, model.atom, model.rng, @@ -133,9 +135,10 @@ LearnAPI.minimize(model::RidgeEnsembleFitted) = RidgeEnsembleFitted( # note the inclusion of `iteration_parameter`: @trait( - RidgeEnsemble, - constructor = RidgeEnsemble, + Ensemble, + constructor = Ensemble, iteration_parameter = :n, + is_composite = true, kinds_of_proxy = (Point(),), tags = ("regression", "ensemble algorithms", "iterative models"), functions = ( @@ -165,7 +168,8 @@ Xtest = Tables.subset(X, test) @testset "test an implementation of bagged ensemble of ridge regressors" begin rng = StableRNG(123) - algorithm = RidgeEnsemble(lambda=0.5, n=4; rng) + atom = Ridge() + algorithm = Ensemble(atom; n=4, rng) @test LearnAPI.clone(algorithm) == algorithm @test :(LearnAPI.obs) in LearnAPI.functions(algorithm) @test LearnAPI.target(algorithm, data) == y @@ -190,7 +194,6 @@ Xtest = Tables.subset(X, test) # compare with cold restart: model = fit(LearnAPI.clone(algorithm; n=7), Xtrain, y[train]; verbosity=0); @test ŷ7 ≈ predict(model, Xtest) - end true diff --git a/test/traits.jl b/test/traits.jl index 3000d016..24a43ef8 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -1,6 +1,51 @@ -module FruitSalad +using Test using LearnAPI +# A MINIMUM IMPLEMENTATION OF AN ALGORITHM + +# does nothing useful +struct SmallAlgorithm end +LearnAPI.fit(algorithm::SmallAlgorithm, data; verbosity=1) = algorithm +LearnAPI.algorithm(algorithm::SmallAlgorithm) = algorithm +@trait( + SmallAlgorithm, + constructor = SmallAlgorithm, + functions = ( + :(LearnAPI.fit), + :(LearnAPI.algorithm), + ), +) +######## END OF IMPLEMENTATION ################## + +# ZERO ARGUMENT METHODS + +@test :(LearnAPI.fit) in LearnAPI.functions() +@test Point in LearnAPI.kinds_of_proxy() +@test "regression" in LearnAPI.tags() + +# OVERLOADABLE TRAITS + +small = SmallAlgorithm() +@test !LearnAPI.is_pure_julia(small) +@test LearnAPI.pkg_name(small) == "unknown" +@test LearnAPI.pkg_license(small) == "unknown" +@test LearnAPI.load_path(small) == "unknown" +@test !LearnAPI.is_composite(small) +@test LearnAPI.human_name(small) == "small algorithm" +@test isnothing(LearnAPI.iteration_parameter(small)) +@test LearnAPI.data_interface(small) == LearnAPI.RandomAccess() +@test !(6 isa LearnAPI.fit_observation_scitype(small)) +@test 6 isa LearnAPI.target_observation_scitype(small) + +# DERIVED TRAITS + +@test LearnAPI.is_algorithm(small) +@test !LearnAPI.target(small) +@test !LearnAPI.weights(small) + +module FruitSalad +import LearnAPI + struct RedApple{T} x::T end 
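 
 # A usage sketch for the minimal `SmallAlgorithm` implementation above: since
 # its `fit` simply returns the algorithm, one has, for example,
 #
 #     small = SmallAlgorithm()
 #     model = fit(small, nothing)  # any data argument is ignored
 #     @assert LearnAPI.algorithm(model) == small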
From 6da8531df1b0cfb9c68e9cfa4f51066473a39ca6 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 8 Oct 2024 12:35:38 +1300 Subject: [PATCH 081/187] dump `predict_or_transform_mutates` in favour of `is_static` --- docs/src/fit_update.md | 27 ++++++++++++++++--- docs/src/traits.md | 38 +++++++++++++-------------- src/clone.jl | 2 +- src/fit_update.jl | 12 +++++---- src/predict_transform.jl | 12 ++++----- src/target_weights_features.jl | 2 +- src/traits.jl | 23 +++++++++++----- test/integration/static_algorithms.jl | 9 ++++--- 8 files changed, 78 insertions(+), 47 deletions(-) diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index c512be9c..226e42c2 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -26,6 +26,8 @@ Data slurping forms are similarly provided for updating methods. ## Typical workflows +### Supervised models + Supposing `Algorithm` is some supervised classifier type, with an iteration parameter `n`: ```julia @@ -43,15 +45,32 @@ model = update(model; n=150) predict(model, Distribution(), X) ``` -### A static algorithm (no "learning") +### Tranformers + +A dimension-reducing transformer, `algorithm` might be used in this way: + +```julia +model = fit(algorithm, X) +transform(model, X) # or `transform(model, Xnew)` +``` + +or, if implemented, using a single call: + +```julia +transform(algorithm, X) # `fit` implied +``` + +### Static algorithms (no "learning") + +Suppose `algorithm` is some clustering algorithm that cannot be generalized to new data +(e.g. DBSCAN): ```julia -# Apply some clustering algorithm which cannot be generalized to new data: model = fit(algorithm) # no training data -labels = predict(model, LabelAmbiguous(), X) # may mutate `model` +labels = predict(model, X) # may mutate `model` # Or, in one line: -labels = predict(algorithm, LabelAmbiguous(), X) +labels = predict(algorithm, X) # But two-line version exposes byproducts of the clustering algorithm (e.g., outliers): LearnAPI.extras(model) diff --git a/docs/src/traits.md b/docs/src/traits.md index 09c6fbf4..83d3287d 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -13,24 +13,24 @@ training). They may also record more mundane information, such as a package lice In the examples column of the table below, `Continuous` is a name owned the package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/). -| trait | return value | fallback value | example | -|:-------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:-----------------------------------------------------------| -| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | -| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` | -| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. 
| `()` | `(Distribution(), Interval())` | -| [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | -| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | -| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | -| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | -| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | -| [`LearnAPI.load_path`](@ref)`(algorithm)` | string locating name returned by `LearnAPI.constructor(algorithm)`, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | -| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties of `algorithm` may be an algorithm | `false` | `true` | -| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | -| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | -| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | -| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | -| [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | -| [`LearnAPI.predict_or_transform_mutates`](@ref)`(algorithm)` | `true` if `predict` or `transform` mutates first argument | `false` | `true` | +| trait | return value | fallback value | example | +|:-----------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:-----------------------------------------------------------| +| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | +| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` | +| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. 
| `()` | `(Distribution(), Interval())` | +| [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | +| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | +| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | +| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | +| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | +| [`LearnAPI.load_path`](@ref)`(algorithm)` | string locating name returned by `LearnAPI.constructor(algorithm)`, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | +| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties of `algorithm` may be an algorithm | `false` | `true` | +| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | +| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | +| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | +| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | +| [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | +| [`LearnAPI.is_static`](@ref)`(algorithm)` | `true` if `fit` consumes no data | `false` | `true` | ### Derived Traits @@ -104,5 +104,5 @@ LearnAPI.data_interface LearnAPI.iteration_parameter LearnAPI.fit_observation_scitype LearnAPI.target_observation_scitype -LearnAPI.predict_or_transform_mutates +LearnAPI.is_static ``` diff --git a/src/clone.jl b/src/clone.jl index 2b6eee13..571ea7fe 100644 --- a/src/clone.jl +++ b/src/clone.jl @@ -7,7 +7,7 @@ Return a shallow copy of `algorithm` with the specified hyperparameter replaceme clone(algorithm; epochs=100, learning_rate=0.01) ``` -It is guaranted that `LearnAPI.clone(algorithm) == algorithm`. +It is guaranteed that `LearnAPI.clone(algorithm) == algorithm`. """ function clone(algorithm; replacements...) diff --git a/src/fit_update.jl b/src/fit_update.jl index eb2c9cb8..b6801359 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -10,8 +10,8 @@ returning an object, `model`, on which other methods, such as [`predict`](@ref) list of methods that can be applied to either `algorithm` or `model`. The second signature is provided by algorithms that do not generalize to new observations -("static" algorithms). In that case, `transform(model, data)` or `predict(model, ..., -data)` carries out the actual algorithm execution, writing any byproducts of that +(called *static algorithms*). In that case, `transform(model, data)` or `predict(model, +..., data)` carries out the actual algorithm execution, writing any byproducts of that operation to the mutable object `model` returned by `fit`. 
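+
+For example, a static transformer might be used as follows (a sketch):
+
+```julia
+model = fit(algorithm)   # no data arguments
+W = transform(model, X)  # actual computation happens here; may mutate `model`
+```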
Whenever `fit` expects a tuple form of argument, `data = (X1, ..., Xn)`, then the
@@ -33,14 +33,16 @@ See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref),
 
 # New implementations
 
-Implementation is compulsory. The signature must include `verbosity`. A fallback for the
-first signature calls the second, ignoring `data`:
+Implementation is compulsory. The signature must include `verbosity`. Fallbacks provide
+the data slurping versions. A fallback for the first signature calls the second, ignoring
+`data`:
 
 ```julia
 fit(algorithm, data; kwargs...) = fit(algorithm; kwargs...)
 ```
 
-Fallbacks also provide the data slurping versions.
+If only the `fit(algorithm)` signature is explicitly implemented, then the trait
+[`LearnAPI.is_static`](@ref) must be overloaded to return `true`.
 
 $(DOC_DATA_INTERFACE(:fit))
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index 6f894d6f..9598572b 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -11,11 +11,9 @@ const DOC_OPERATIONS_LIST_FUNCTION = join(map(op -> "`LearnAPI.$op`", OPERATIONS
 
 DOC_MUTATION(op) =
     """
 
-    If [`LearnAPI.predict_or_transform_mutates(algorithm)`](@ref) is overloaded to return
-    `true`, then `$op` may mutate it's first argument, but not in a way that alters the
-    result of a subsequent call to `predict`, `transform` or
-    `inverse_transform`. This is necessary for some non-generalizing algorithms but is
-    otherwise discouraged. See more at [`fit`](@ref).
+    If [`LearnAPI.is_static(algorithm)`](@ref) is `true`, then `$op` may mutate its first
+    argument, but not in a way that alters the result of a subsequent call to `predict`,
+    `transform` or `inverse_transform`. See more at [`fit`](@ref).
     """
 
@@ -86,7 +84,7 @@ If `predict` supports data in the form of a tuple `data = (X1, ..., Xn)`, then a
 signature is also provided, as in `predict(model, X1, ..., Xn)`.
 
 Note `predict ` does not mutate any argument, except in the special case
-`LearnAPI.predict_or_transform_mutates(algorithm) = true`.
+`LearnAPI.is_static(algorithm) == true`.
 
 # New implementations
 
@@ -150,7 +148,7 @@ W = transform(algorithm, X)
 ```
 
 Note `transform` does not mutate any argument, except in the special case
-`LearnAPI.predict_or_transform_mutates(algorithm) = true`.
+`LearnAPI.is_static(algorithm) == true`.
 
 See also [`fit`](@ref), [`predict`](@ref), [`inverse_transform`](@ref).
diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl
index 3c9b075f..58243030 100644
--- a/src/target_weights_features.jl
+++ b/src/target_weights_features.jl
@@ -70,4 +70,4 @@ return `nothing`.
 features(algorithm, data) = _first(data)
 _first(data) = data
 _first(data::Tuple) = first(data)
-# note the factoring above guards agains method ambiguities
+# note the factoring above guards against method ambiguities
diff --git a/src/traits.jl b/src/traits.jl
index 64c58353..867353b5 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -346,19 +346,30 @@ tables, and tuples of these. See the doc-string for details.
 data_interface(::Any) = LearnAPI.RandomAccess()
 
 """
-    LearnAPI.predict_or_transform_mutates(algorithm)
+    LearnAPI.is_static(algorithm)
 
-Returns `true` if [`predict`](@ref) or [`transform`](@ref) possibly mutate their first
-argument, `model`, when `LearnAPI.algorithm(model) == algorithm`. If `false`, no arguments
-are ever mutated.
+Returns `true` if [`fit`](@ref) is called with no data arguments, as in
+`fit(algorithm)`. 
That is, `algorithm` does not generalize to new data, and data is only +provided at the `predict` or `transform` step. + +For example, some clustering algorithms are applied with this workflow, to label points +observations in `X`: + +```julia +model = fit(algorithm) # no training data +labels = predict(model, X) # may mutate `model`! + +# extract some byproducts of the clustering algorithm (e.g., outliers): +LearnAPI.extras(model) +``` # New implementations This trait, falling back to `false`, may only be overloaded when `fit` has no data -arguments (`algorithm` does not generalize to new data). See more at [`fit`](@ref). +arguments. See more at [`fit`](@ref). """ -predict_or_transform_mutates(::Any) = false +is_static(::Any) = false """ LearnAPI.iteration_parameter(algorithm) diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl index 3812fbc6..f02a831d 100644 --- a/test/integration/static_algorithms.jl +++ b/test/integration/static_algorithms.jl @@ -36,10 +36,12 @@ function LearnAPI.transform(algorithm::Selector, X) transform(model, X) end +# note the necessity of overloading `is_static` (`fit` consumes no data): @trait( Selector, constructor = Selector, tags = ("feature engineering",), + is_static = true, functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), @@ -63,9 +65,7 @@ end # # FEATURE SELECTOR THAT REPORTS BYPRODUCTS OF SELECTION PROCESS # This a variation of `Selector` above that stores the names of rejected features in the -# model object, for inspection by an accessor function called `rejected`. Since -# `transform(model, X)` mutates `model` in this case, we must overload the -# `predict_or_transform_mutates` trait. +# output of `fit`, for inspection by an accessor function called `rejected`. struct Selector2 names::Vector{Symbol} @@ -101,10 +101,11 @@ function LearnAPI.transform(algorithm::Selector2, X) transform(model, X) end +# note the necessity of overloading `is_static` (`fit` consumes no data): @trait( Selector2, constructor = Selector2, - predict_or_transform_mutates = true, + is_static = true, tags = ("feature engineering",), functions = ( :(LearnAPI.fit), From aa8f9de0896a95c4a6d00eea260177db424f7980 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 8 Oct 2024 12:43:08 +1300 Subject: [PATCH 082/187] doc string typo --- src/traits.jl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 867353b5..1be85484 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -352,8 +352,8 @@ Returns `true` if [`fit`](@ref) is called with no data arguments, as in `fit(algorithm)`. That is, `algorithm` does not generalize to new data, and data is only provided at the `predict` or `transform` step. -For example, some clustering algorithms are applied with this workflow, to label points -observations in `X`: +For example, some clustering algorithms are applied with this workflow, to assign labels +to the observations in `X`: ```julia model = fit(algorithm) # no training data From 5afe7ac77828f06fd1176f956046a1dc60271861 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Tue, 8 Oct 2024 16:55:01 +1300 Subject: [PATCH 083/187] doc correction --- docs/src/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/index.md b/docs/src/index.md index 7b638aed..23496062 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -16,7 +16,7 @@ algorithms buy into functionality, such as hyperparameter optimization and model composition, as provided by ML/statistics toolboxes and other packages. LearnAPI.jl also provides a number of Julia [traits](@ref traits) for promising specific behavior. -LearnAPI.jl has no package dependencies. +LearnAPI.jl's only dependency is the standard library `InteractiveUtils`. ```@raw html 🚧 From f690c4497ee5fd551f04a19f528f21a671c7309d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 8 Oct 2024 18:42:42 +1300 Subject: [PATCH 084/187] tweak --- test/traits.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/traits.jl b/test/traits.jl index 24a43ef8..c498e7b7 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -6,7 +6,7 @@ using LearnAPI # does nothing useful struct SmallAlgorithm end LearnAPI.fit(algorithm::SmallAlgorithm, data; verbosity=1) = algorithm -LearnAPI.algorithm(algorithm::SmallAlgorithm) = algorithm +LearnAPI.algorithm(model::SmallAlgorithm) = model @trait( SmallAlgorithm, constructor = SmallAlgorithm, From 2312c1dbab1e6e65c80b57451b2ece9a4374c671 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 8 Oct 2024 18:47:02 +1300 Subject: [PATCH 085/187] tweak --- docs/src/fit_update.md | 2 +- docs/src/reference.md | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index 226e42c2..b49199c7 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -60,7 +60,7 @@ or, if implemented, using a single call: transform(algorithm, X) # `fit` implied ``` -### Static algorithms (no "learning") +### [Static algorithms (no "learning")](@id static_algorithms) Suppose `algorithm` is some clustering algorithm that cannot be generalized to new data (e.g. DBSCAN): diff --git a/docs/src/reference.md b/docs/src/reference.md index 003f531e..3b220052 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -148,9 +148,9 @@ bare minimum implementation, see the implementation of `SmallAlgorithm` ### List of methods - [`fit`](@ref fit): for training or updating algorithms that generalize to new data. Or, - for non-generalizing algorithms (see [Static Algorithms](@ref)), for wrapping - `algorithm` in a mutable struct that can be mutated by `predict`/`transform` to record - byproducts of those operations. + for non-generalizing algorithms (see [here](@ref static_algorithms) and [Static + Algorithms](@ref)), for wrapping `algorithm` in a mutable struct that can be mutated by + `predict`/`transform` to record byproducts of those operations. - [`update`](@ref fit): for updating learning outcomes after hyperparameter changes, such as increasing an iteration parameter. From 02af76687c7e24ddbafd9d928f817296b52003fc Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Tue, 8 Oct 2024 21:36:13 +1300 Subject: [PATCH 086/187] tweak --- src/traits.jl | 5 ++++- test/integration/iterative_algorithms.jl | 6 ++++++ 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/src/traits.jl b/src/traits.jl index 1be85484..81ecf778 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -54,9 +54,12 @@ named_properties = NamedTuple{properties}(getproperty.(Ref(algorithm), propertie @assert algorithm == LearnAPI.constructor(algorithm)(; named_properties...) ``` +which can be tested with `@assert LearnAPI.clone(algorithm) == algorithm`. + The keyword constructor provided by `LearnAPI.constructor` must provide default values for all properties, with the exception of those that can take other LearnAPI.jl algorithms as -values. +values. These can be provided with the default `nothing`, with the constructor throwing an +error if the default value persists. """ function constructor end diff --git a/test/integration/iterative_algorithms.jl b/test/integration/iterative_algorithms.jl index 2dae816b..e548ea87 100644 --- a/test/integration/iterative_algorithms.jl +++ b/test/integration/iterative_algorithms.jl @@ -36,6 +36,12 @@ Instantiate a bagged ensemble of `n` regressors, with base regressor `atom`, etc Ensemble(atom; rng=Random.default_rng(), n=10) = Ensemble(atom, rng, n) # `LearnAPI.constructor` defined later +# pure keyword argument constructor: +function Ensemble(; atom=nothing, kwargs...) + isnothing(atom) && error("You must specify `atom=...` ") + Ensemble(atom; kwargs...) +end + struct EnsembleFitted algorithm::Ensemble atom::Ridge From 040adcd7e580f3b9634ff6fee364ab20e9e38df2 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 8 Oct 2024 21:39:43 +1300 Subject: [PATCH 087/187] tweak --- docs/src/reference.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/src/reference.md b/docs/src/reference.md index 3b220052..d3e8076b 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -105,8 +105,9 @@ deep copies of RNG hyperparameters before using them in a new implementation of A *composite algorithm* is one with at least one property that can take other algorithms as values; for such algorithms [`LearnAPI.is_composite`](@ref)`(algorithm)` must be `true` (fallback is `false`). Generally, the keyword constructor provided by -[`LearnAPI.constructor`](@ref) must provide default values for all fields that are not -algorithm-valued. +[`LearnAPI.constructor`](@ref) must provide default values for all properties that are not +algorithm-valued. Instead, these algorithm-valued properties can have a `nothing` default, +with the constructor throwing an error if the default value persists. Any object `algorithm` for which [`LearnAPI.functions`](@ref)`(algorithm)` is non-empty is understood to have a valid implementation of the LearnAPI.jl interface. From 1f781549da27c1813efc0e131c3b146e0e6be71e Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Thu, 10 Oct 2024 13:48:46 +1300 Subject: [PATCH 088/187] add tests --- test/traits.jl | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/test/traits.jl b/test/traits.jl index c498e7b7..a0c8a3d9 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -26,9 +26,14 @@ LearnAPI.algorithm(model::SmallAlgorithm) = model # OVERLOADABLE TRAITS small = SmallAlgorithm() +@test LearnAPI.constructor(small) == SmallAlgorithm +@test LearnAPI.functions(small) == (:(LearnAPI.fit), :(LearnAPI.algorithm)) +@test isempty(LearnAPI.kinds_of_proxy(small)) +@test isempty(LearnAPI.tags(small)) @test !LearnAPI.is_pure_julia(small) @test LearnAPI.pkg_name(small) == "unknown" @test LearnAPI.pkg_license(small) == "unknown" +@test LearnAPI.doc_url(small) == "unknown" @test LearnAPI.load_path(small) == "unknown" @test !LearnAPI.is_composite(small) @test LearnAPI.human_name(small) == "small algorithm" From 083bae9a23ebc196e2ff5459e64562dbf2e4cf6d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Thu, 10 Oct 2024 17:43:09 +1300 Subject: [PATCH 089/187] add test for fit fallback --- test/integration/static_algorithms.jl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl index f02a831d..328725b9 100644 --- a/test/integration/static_algorithms.jl +++ b/test/integration/static_algorithms.jl @@ -55,6 +55,8 @@ end algorithm = Selector(names=[:x, :w]) X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w]) model = fit(algorithm) # no data arguments! + # if provided, data is ignored: + @test fit(algorithm, "junk")[] == model[] @test LearnAPI.algorithm(model) == algorithm W = transform(model, X) @test W == DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w]) From d792e51748b90c6c32351b9345e69f6046c29450 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 13:57:16 +1300 Subject: [PATCH 090/187] fix a mistake in implementation of Ensemble --- test/integration/iterative_algorithms.jl | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/test/integration/iterative_algorithms.jl b/test/integration/iterative_algorithms.jl index e548ea87..1e92e203 100644 --- a/test/integration/iterative_algorithms.jl +++ b/test/integration/iterative_algorithms.jl @@ -55,7 +55,7 @@ LearnAPI.algorithm(model::EnsembleFitted) = model.algorithm LearnAPI.obs(algorithm::Ensemble, data) = LearnAPI.obs(algorithm.atom, data) LearnAPI.obs(model::EnsembleFitted, data) = LearnAPI.obs(first(model.models), data) LearnAPI.target(algorithm::Ensemble, data) = LearnAPI.target(algorithm.atom, data) -LearnAPI.features(algorithm::Ridge, data) = LearnAPI.features(algorithm.atom, data) +LearnAPI.features(algorithm::Ensemble, data) = LearnAPI.features(algorithm.atom, data) function LearnAPI.fit(algorithm::Ensemble, data; verbosity=1) @@ -97,10 +97,11 @@ end # models. Otherwise, update is equivalent to retraining from scratch, with the provided # hyperparameter updates. function LearnAPI.update(model::EnsembleFitted, data; verbosity=1, replacements...) - :n in keys(replacements) || return fit(model, data) - algorithm_old = LearnAPI.algorithm(model) algorithm = LearnAPI.clone(algorithm_old; replacements...) 
+ + :n in keys(replacements) || return fit(algorithm, data) + n = algorithm.n Δn = n - algorithm_old.n n < 0 && return fit(model, algorithm) @@ -156,7 +157,6 @@ LearnAPI.minimize(model::EnsembleFitted) = EnsembleFitted( :(LearnAPI.target), :(LearnAPI.update), :(LearnAPI.predict), - :(LearnAPI.feature_importances), ) ) @@ -190,16 +190,18 @@ Xtest = Tables.subset(X, test) @test ŷ4 == predict(model, Xtest) # add 3 atomic models to the ensemble: - # model = @test_logs( - # (:info, r"Trained 3 additional"), - # update(model, Xtrain, y[train]; n=7), - # ) model = update(model, Xtrain, y[train]; verbosity=0, n=7); ŷ7 = predict(model, Xtest) # compare with cold restart: model = fit(LearnAPI.clone(algorithm; n=7), Xtrain, y[train]; verbosity=0); @test ŷ7 ≈ predict(model, Xtest) + + # test cold restart if another hyperparameter is changed: + model2 = update(model, Xtrain, y[train]; atom=Ridge(0.05)) + algorithm2 = LearnAPI.clone(LearnAPI.algorithm(model); atom=Ridge(0.05)) + @test predict(model, Xtest) ≈ predict(model2, Xtest) + end true From 43db086bab739624a390034e0da977cde85bcf34 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 14:10:04 +1300 Subject: [PATCH 091/187] some minor file re-organization --- docs/src/common_implementation_patterns.md | 2 +- docs/src/patterns/{ensemble_algorithms.md => ensembling.md} | 4 ++-- docs/src/patterns/incremental_algorithms.md | 1 - docs/src/patterns/incremental_models.md | 1 - docs/src/patterns/iterative_algorithms.md | 2 +- docs/src/patterns/meta_algorithms.md | 5 +++-- test/integration/{iterative_algorithms.jl => ensembling.jl} | 0 7 files changed, 7 insertions(+), 8 deletions(-) rename docs/src/patterns/{ensemble_algorithms.md => ensembling.md} (60%) delete mode 100644 docs/src/patterns/incremental_algorithms.md delete mode 100644 docs/src/patterns/incremental_models.md rename test/integration/{iterative_algorithms.jl => ensembling.jl} (100%) diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 72d577de..7a5f357c 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -46,7 +46,7 @@ implementations fall into one (or more) of the following informally understood p re-executed for each new data set (do not generalize), but which have hyperparameters and/or deliver ancillary information about the computation. -- [Ensemble Algorithms](@ref): Algorithms that blend predictions of multiple algorithms +- [Ensembling](@ref): Algorithms that blend predictions of multiple algorithms - Time Series Forecasting diff --git a/docs/src/patterns/ensemble_algorithms.md b/docs/src/patterns/ensembling.md similarity index 60% rename from docs/src/patterns/ensemble_algorithms.md rename to docs/src/patterns/ensembling.md index 44e94b52..1513997e 100644 --- a/docs/src/patterns/ensemble_algorithms.md +++ b/docs/src/patterns/ensembling.md @@ -1,5 +1,5 @@ -# Ensemble Algorithms +# Ensembling See [this -example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/iterative_algorithms.jl) +example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/ensembling.jl) from tests. 
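+
+A typical workflow for the linked example (a sketch only; `Ensemble`, `Ridge`, `Xtrain`,
+`Xtest`, `y` and `train` are as defined in that test file):
+
+```julia
+algorithm = Ensemble(Ridge(0.05); n=4)
+model = fit(algorithm, Xtrain, y[train]; verbosity=0)
+
+# warm restart: grow the ensemble to 7 atomic models, training only the new ones:
+model = update(model, Xtrain, y[train]; verbosity=0, n=7)
+ŷ = predict(model, Xtest)
+```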
diff --git a/docs/src/patterns/incremental_algorithms.md b/docs/src/patterns/incremental_algorithms.md
deleted file mode 100644
index 54095fa5..00000000
--- a/docs/src/patterns/incremental_algorithms.md
+++ /dev/null
@@ -1 +0,0 @@
-# Incremental Models
diff --git a/docs/src/patterns/incremental_models.md b/docs/src/patterns/incremental_models.md
deleted file mode 100644
index 2876c57f..00000000
--- a/docs/src/patterns/incremental_models.md
+++ /dev/null
@@ -1 +0,0 @@
-# Incremental Algorithms
diff --git a/docs/src/patterns/iterative_algorithms.md b/docs/src/patterns/iterative_algorithms.md
index 397ceb68..be2a0c7f 100644
--- a/docs/src/patterns/iterative_algorithms.md
+++ b/docs/src/patterns/iterative_algorithms.md
@@ -1,5 +1,5 @@
 # Iterative Algorithms
 
 See [this
-example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/iterative_algorithms.jl)
+example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/ensembling.jl)
 from tests.
diff --git a/docs/src/patterns/meta_algorithms.md b/docs/src/patterns/meta_algorithms.md
index 302e218d..2104ff26 100644
--- a/docs/src/patterns/meta_algorithms.md
+++ b/docs/src/patterns/meta_algorithms.md
@@ -1,6 +1,7 @@
 # Meta-algorithms
 
-Many meta-algorithms are wrappers. An example is [this bagged ensemble
-algorithm](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/iterative_algorithms.jl)
+Many meta-algorithms can be implemented as wrappers. An example is [this bagged
+ensemble
+algorithm](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/ensembling.jl)
 from tests.
diff --git a/test/integration/iterative_algorithms.jl b/test/integration/ensembling.jl
similarity index 100%
rename from test/integration/iterative_algorithms.jl
rename to test/integration/ensembling.jl

From 40fd7733f26d69a05520f0a1f8880dd01235b797 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom" 
Date: Fri, 11 Oct 2024 14:18:13 +1300
Subject: [PATCH 092/187] add a perceptron classifier to examples

---
 docs/src/common_implementation_patterns.md |   4 +-
 docs/src/patterns/classification.md        |   4 +
 docs/src/patterns/gradient_descent.md      |   5 +
 docs/src/patterns/iterative_algorithms.md  |   8 +-
 test/integration/gradient_descent.jl       | 386 +++++++++++++++++++++
 5 files changed, 402 insertions(+), 5 deletions(-)
 create mode 100644 docs/src/patterns/gradient_descent.md
 create mode 100644 test/integration/gradient_descent.jl

diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md
index 7a5f357c..29e67ca8 100644
--- a/docs/src/common_implementation_patterns.md
+++ b/docs/src/common_implementation_patterns.md
@@ -22,12 +22,12 @@ implementations fall into one (or more) of the following informally understood p
 
 - [Regression](@ref): Supervised learners for continuous targets
 
-- Classification: Supervised learners for categorical targets
+- [Classification](@ref): Supervised learners for categorical targets
 
 - Clusterering: Algorithms that group data into clusters for classification and possibly
   dimension reduction. May be true learners (generalize to new data) or static.
 
-- Gradient Descent: Including neural networks.
+- [Gradient Descent](@ref): Including neural networks. 
- [Iterative Algorithms](@ref)
 
diff --git a/docs/src/patterns/classification.md b/docs/src/patterns/classification.md
index 4e8066d9..86fa1158 100644
--- a/docs/src/patterns/classification.md
+++ b/docs/src/patterns/classification.md
@@ -1 +1,5 @@
 # Classification
+
+See these examples from tests:
+
+- [perceptron classifier](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/gradient_descent.jl)
diff --git a/docs/src/patterns/gradient_descent.md b/docs/src/patterns/gradient_descent.md
new file mode 100644
index 00000000..acded653
--- /dev/null
+++ b/docs/src/patterns/gradient_descent.md
@@ -0,0 +1,5 @@
+# Gradient Descent
+
+See [this
+example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/gradient_descent.jl)
+from tests.
diff --git a/docs/src/patterns/iterative_algorithms.md b/docs/src/patterns/iterative_algorithms.md
index be2a0c7f..abab6316 100644
--- a/docs/src/patterns/iterative_algorithms.md
+++ b/docs/src/patterns/iterative_algorithms.md
@@ -1,5 +1,7 @@
 # Iterative Algorithms
 
-See [this
-example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/ensembling.jl)
-from tests.
+See these examples from tests:
+
+- [bagged ensembling](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/ensembling.jl)
+
+- [perceptron classifier](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/gradient_descent.jl)
diff --git a/test/integration/gradient_descent.jl b/test/integration/gradient_descent.jl
new file mode 100644
index 00000000..6b582225
--- /dev/null
+++ b/test/integration/gradient_descent.jl
@@ -0,0 +1,386 @@
+using Pkg
+Pkg.activate("perceptron", shared=true)
+
+using LearnAPI
+using Random
+using Statistics
+using StableRNGs
+import Optimisers
+import Zygote
+import NNlib
+import CategoricalDistributions
+import CategoricalDistributions: pdf, mode
+import ComponentArrays
+
+# # PERCEPTRON
+
+# We implement a simple perceptron classifier to illustrate some common patterns for
+# gradient descent algorithms. This includes implementation of the following methods:
+
+# - `update`
+# - `update_observations`
+# - `iteration_parameter`
+# - `training_losses`
+# - `obs` for pre-processing (non-tabular) classification training data
+# - `predict(model, ::Distribution, Xnew)`
+
+# For simplicity, we use single-observation batches for gradient descent updates, and we
+# may dodge some standard optimizations.
+
+# This is also an example of a probability-predicting classifier.
+
+
+# ## Helpers
+
+"""
+    brier_loss(probs, hot)
+
+Return Brier (quadratic) loss.
+
+- `probs`: predicted probability vector
+- `hot`: corresponding ground truth observation, as a one-hot encoded bit vector
+
+"""
+function brier_loss(probs, hot)
+    offset = 1 + sum(probs.^2)
+    return offset - 2*(sum(probs.*hot))
+end
+
+"""
+    corefit(perceptron, X, y_hot, epochs, state, verbosity)
+
+Return updated `perceptron`, `state` and training losses by carrying out gradient descent
+for the specified number of `epochs`.
+
+- `perceptron`: component array with components `weights` and `bias`
+- `X`: feature matrix, of size (p, n)
+- `y_hot`: one-hot encoded target, of size (nclasses, n)
+- `epochs`: number of epochs
+- `state`: optimiser state
+
+"""
+function corefit(perceptron, X, y_hot, epochs, state, verbosity)
+    n = size(y_hot) |> last
+    losses = map(1:epochs) do _
+        total_loss = zero(Float32)
+        for i in 1:n
+            loss, grad = Zygote.withgradient(perceptron) do p
+                probs = p.weights*X[:,i] + p.bias |> NNlib.softmax
+                brier_loss(probs, y_hot[:,i])
+            end
+            ∇loss = only(grad)
+            state, perceptron = Optimisers.update(state, perceptron, ∇loss)
+            total_loss += loss
+        end
+        # make some noise, if allowed:
+        verbosity > 0 && @info "Training loss: $total_loss"
+        total_loss
+    end
+    return perceptron, state, losses
+end
+
+
+# ## Implementation
+
+# ### Algorithm
+
+# no docstring here - that goes with the constructor;
+# SOME FIELDS LEFT ABSTRACT FOR SIMPLICITY
+struct PerceptronClassifier
+    epochs::Int
+    optimiser # an optimiser from Optimisers.jl
+    rng
+end
+
+"""
+    PerceptronClassifier(; epochs=50, optimiser=Optimisers.Adam(), rng=Random.default_rng())
+
+Instantiate a perceptron classifier.
+
+Train an instance, `algorithm`, by doing `model = fit(algorithm, X, y)`, where
+
+- `X` is a `Float32` matrix, with observations-as-columns
+- `y` (target) is some one-dimensional `CategoricalArray`.
+
+Get probabilistic predictions with `predict(model, Xnew)` and
+point predictions with `predict(model, Point(), Xnew)`.
+
+# Warm restart options
+
+    update_observations(model, newdata; replacements...)
+
+Return an updated model, with the weights and bias of the previously learned perceptron
+used as the starting state in new gradient descent updates. Adopt any specified
+hyperparameter `replacements` (properties of `LearnAPI.algorithm(model)`).
+
+    update(model, newdata; epochs=n, replacements...)
+
+If `Δepochs = n - LearnAPI.algorithm(model).epochs` is non-negative, then return an
+updated model, with the weights and bias of the previously learned perceptron used as the
+starting state in new gradient descent updates for `Δepochs` epochs, and using the
+provided `newdata` instead of the previous training data. Any other hyperparameter
+`replacements` are also adopted. If `Δepochs` is negative or not specified, instead
+return `fit(algorithm, newdata)`, where
+`algorithm=LearnAPI.clone(algorithm; epochs=n, replacements...)`. 
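+
+A minimal usage sketch (epoch counts illustrative; assumes the default `epochs=50` at
+construction):
+
+```julia
+algorithm = PerceptronClassifier()
+model = fit(algorithm, X, y)           # 50 epochs of training
+model = update(model, X, y; epochs=90) # 40 further epochs, warm-started
+```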
+
+"""
+PerceptronClassifier(; epochs=50, optimiser=Optimisers.Adam(), rng=Random.default_rng()) =
+    PerceptronClassifier(epochs, optimiser, rng)
+
+
+# ### Data interface
+
+# For raw training data:
+LearnAPI.target(algorithm::PerceptronClassifier, data::Tuple) = last(data)
+
+# For wrapping pre-processed training data (output of `obs(algorithm, data)`):
+struct PerceptronClassifierObservations
+    X::Matrix{Float32}
+    y_hot::BitMatrix # one-hot encoded target
+    classes # the (ordered) pool of `y`, as `CategoricalValue`s
+end
+
+# For pre-processing the training data:
+function LearnAPI.obs(algorithm::PerceptronClassifier, data::Tuple)
+    X, y = data
+    classes = CategoricalDistributions.classes(y)
+    y_hot = classes .== permutedims(y) # one-hot encoding
+    return PerceptronClassifierObservations(X, y_hot, classes)
+end
+
+# implement `RandomAccess()` interface for output of `obs` (observations are the
+# columns of `X` and of `y_hot`):
+Base.length(observations::PerceptronClassifierObservations) =
+    size(observations.y_hot, 2)
+Base.getindex(observations::PerceptronClassifierObservations, I) =
+    PerceptronClassifierObservations(
+        (@view observations.X[:, I]),
+        (@view observations.y_hot[:, I]),
+        observations.classes,
+    )
+
+# recover the target labels from their one-hot encoding:
+LearnAPI.target(
+    algorithm::PerceptronClassifier,
+    observations::PerceptronClassifierObservations,
+) = [only(observations.classes[hot]) for hot in eachcol(observations.y_hot)]
+
+LearnAPI.features(
+    algorithm::PerceptronClassifier,
+    observations::PerceptronClassifierObservations,
+) = observations.X
+
+# Note that data consumed by `predict` needs no pre-processing, so no need to overload
+# `obs(model, data)`.
+
+
+# ### Fitting and updating
+
+# For wrapping outcomes of learning:
+struct PerceptronClassifierFitted
+    algorithm::PerceptronClassifier
+    perceptron # component array storing weights and bias
+    state # optimiser state
+    classes # target classes
+    losses
+end
+
+LearnAPI.algorithm(model::PerceptronClassifierFitted) = model.algorithm
+
+# `fit` for pre-processed data (output of `obs(algorithm, data)`):
+function LearnAPI.fit(
+    algorithm::PerceptronClassifier,
+    observations::PerceptronClassifierObservations;
+    verbosity=1,
+    )
+
+    # unpack hyperparameters:
+    epochs = algorithm.epochs
+    optimiser = algorithm.optimiser
+    rng = deepcopy(algorithm.rng) # to prevent mutation of `algorithm`!
+
+    # unpack data:
+    X = observations.X
+    y_hot = observations.y_hot
+    classes = observations.classes
+    nclasses = length(classes)
+    p = size(X, 1) # number of features
+
+    # initialize bias and weights:
+    weights = randn(rng, Float32, nclasses, p)
+    bias = zeros(Float32, nclasses)
+    perceptron = (; weights, bias) |> ComponentArrays.ComponentArray
+
+    # initialize optimiser:
+    state = Optimisers.setup(optimiser, perceptron)
+
+    perceptron, state, losses = corefit(perceptron, X, y_hot, epochs, state, verbosity)
+
+    return PerceptronClassifierFitted(algorithm, perceptron, state, classes, losses)
+end
+
+# `fit` for unprocessed data:
+LearnAPI.fit(algorithm::PerceptronClassifier, data; kwargs...) =
+    fit(algorithm, obs(algorithm, data); kwargs...)
+
+# see the `PerceptronClassifier` docstring for `update_observations` logic.
+function LearnAPI.update_observations(
+    model::PerceptronClassifierFitted,
+    observations_new::PerceptronClassifierObservations;
+    verbosity=1,
+    replacements...,
+    )
+
+    # unpack data:
+    X = observations_new.X
+    y_hot = observations_new.y_hot
+    classes = observations_new.classes
+    nclasses = length(classes)
+
+    classes == model.classes || error("New training target has incompatible classes.")
+
+    algorithm_old = LearnAPI.algorithm(model)
+    algorithm = LearnAPI.clone(algorithm_old; replacements...) 
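+
+    # warm restart: carry over the existing weights, bias and optimiser state from
+    # `model`, and descend the gradient on the new observations only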
+
+    perceptron = model.perceptron
+    state = model.state
+    losses = model.losses
+    epochs = algorithm.epochs
+
+    perceptron, state, losses_new = corefit(perceptron, X, y_hot, epochs, state, verbosity)
+    losses = vcat(losses, losses_new)
+
+    return PerceptronClassifierFitted(algorithm, perceptron, state, classes, losses)
+end
+LearnAPI.update_observations(model::PerceptronClassifierFitted, data; kwargs...) =
+    update_observations(model, obs(LearnAPI.algorithm(model), data); kwargs...)
+
+# see the `PerceptronClassifier` docstring for `update` logic.
+function LearnAPI.update(
+    model::PerceptronClassifierFitted,
+    observations::PerceptronClassifierObservations;
+    verbosity=1,
+    replacements...,
+    )
+
+    # unpack data:
+    X = observations.X
+    y_hot = observations.y_hot
+    classes = observations.classes
+    nclasses = length(classes)
+
+    classes == model.classes || error("New training target has incompatible classes.")
+
+    algorithm_old = LearnAPI.algorithm(model)
+    algorithm = LearnAPI.clone(algorithm_old; replacements...)
+    :epochs in keys(replacements) || return fit(algorithm, observations)
+
+    perceptron = model.perceptron
+    state = model.state
+    losses = model.losses
+
+    epochs = algorithm.epochs
+    Δepochs = epochs - algorithm_old.epochs
+    Δepochs < 0 && return fit(algorithm, observations)
+
+    perceptron, state, losses_new = corefit(perceptron, X, y_hot, Δepochs, state, verbosity)
+    losses = vcat(losses, losses_new)
+
+    return PerceptronClassifierFitted(algorithm, perceptron, state, classes, losses)
+end
+LearnAPI.update(model::PerceptronClassifierFitted, data; kwargs...) =
+    update(model, obs(LearnAPI.algorithm(model), data); kwargs...)
+
+
+# ### Predict
+
+function LearnAPI.predict(model::PerceptronClassifierFitted, ::Distribution, Xnew)
+    perceptron = model.perceptron
+    classes = model.classes
+    probs = perceptron.weights*Xnew .+ perceptron.bias |> NNlib.softmax
+    return CategoricalDistributions.UnivariateFinite(classes, probs')
+end
+
+LearnAPI.predict(model::PerceptronClassifierFitted, ::Point, Xnew) =
+    mode.(predict(model, Distribution(), Xnew))
+
+
+# ### Accessor functions
+
+LearnAPI.training_losses(model::PerceptronClassifierFitted) = model.losses
+
+
+# ### Traits
+
+@trait(
+    PerceptronClassifier,
+    constructor = PerceptronClassifier,
+    iteration_parameter = :epochs,
+    kinds_of_proxy = (Distribution(), Point()),
+    tags = ("classification", "iterative algorithms", "incremental algorithms"),
+    functions = (
+        :(LearnAPI.fit),
+        :(LearnAPI.algorithm),
+        :(LearnAPI.minimize),
+        :(LearnAPI.obs),
+        :(LearnAPI.features),
+        :(LearnAPI.target),
+        :(LearnAPI.update),
+        :(LearnAPI.update_observations),
+        :(LearnAPI.predict),
+        :(LearnAPI.training_losses),
+    )
+)
+
+
+# ## Tests
+
+# synthetic test data:
+N = 10
+n = 10N # number of observations
+p = 2 # number of features
+train = 1:6N
+test = (6N+1:10N)
+rng = StableRNG(123)
+X = randn(rng, Float32, p, n);
+coefficients = rand(rng, Float32, p)'
+y_continuous = coefficients*X |> vec
+η1 = quantile(y_continuous, 1/3)
+η2 = quantile(y_continuous, 2/3)
+y = map(y_continuous) do η
+    η < η1 && return "A"
+    η < η2 && return "B"
+    "C"
+end |> CategoricalDistributions.categorical;
+Xtrain = X[:, train];
+Xtest = X[:, test];
+ytrain = y[train];
+ytest = y[test];
+
+@testset "PerceptronClassifier" begin
+    rng = StableRNG(123)
+    algorithm = PerceptronClassifier(; optimiser=Optimisers.Adam(0.01), epochs=40, rng)
+    @test LearnAPI.clone(algorithm) == algorithm
+    @test :(LearnAPI.update) in LearnAPI.functions(algorithm)
+    @test LearnAPI.target(algorithm, (X, y)) 
== y + @test LearnAPI.features(algorithm, (X, y)) == X + + model40 = fit(algorithm, Xtrain, ytrain; verbosity=0) + + # 40 epochs is sufficient for 90% accuracy in this case: + @test sum(predict(model40, Point(), Xtest) .== ytest)/length(ytest) > 0.9 + + # get probabilistic predictions: + ŷ40 = predict(model40, Distribution(), Xtest); + @test predict(model40, Xtest) ≈ ŷ40 + + # add 30 epochs in an `update`: + model70 = update(model40, Xtrain, y[train]; verbosity=0, epochs=70) + ŷ70 = predict(model70, Xtest); + @test !(ŷ70 ≈ ŷ40) + + # compare with cold restart: + model = fit(LearnAPI.clone(algorithm; epochs=70), Xtrain, y[train]; verbosity=0); + @test ŷ70 ≈ predict(model, Xtest) + + # instead add 30 epochs using `update_observations` instead: + model70b = update_observations(model40, Xtrain, y[train]; verbosity=0, epochs=30) + @test ŷ70 ≈ predict(model70b, Xtest) ≈ predict(model, Xtest) +end + +true From 09290666b9ab1e8d0c4e972b949ab61f2e386f5c Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 14:22:05 +1300 Subject: [PATCH 093/187] update roadmap --- ROADMAP.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ROADMAP.md b/ROADMAP.md index cd524ba5..7da578ae 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -16,7 +16,7 @@ - [x] regression - [ ] classification - [ ] clustering - - [ ] gradient descent + - [x] gradient descent - [x] iterative algorithms - [ ] incremental algorithms - [ ] dimension reduction From 671491c8db8078ddacaf65ccf32ceaaf82a694a9 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 14:31:09 +1300 Subject: [PATCH 094/187] fix outdated file reference in tests --- test/runtests.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/runtests.jl b/test/runtests.jl index dee0c17b..2c66588d 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -6,7 +6,7 @@ test_files = [ "clone.jl", "integration/regression.jl", "integration/static_algorithms.jl", - "integration/iterative_algorithms.jl", + "integration/ensembling.jl", ] files = isempty(ARGS) ? test_files : ARGS From 458b03cb12867632fdcd7aca8da0229e62c22d88 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 14:52:36 +1300 Subject: [PATCH 095/187] fix flawed test --- test/integration/ensembling.jl | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/test/integration/ensembling.jl b/test/integration/ensembling.jl index 1e92e203..657f0d29 100644 --- a/test/integration/ensembling.jl +++ b/test/integration/ensembling.jl @@ -194,13 +194,14 @@ Xtest = Tables.subset(X, test) ŷ7 = predict(model, Xtest) # compare with cold restart: - model = fit(LearnAPI.clone(algorithm; n=7), Xtrain, y[train]; verbosity=0); - @test ŷ7 ≈ predict(model, Xtest) + model_cold = fit(LearnAPI.clone(algorithm; n=7), Xtrain, y[train]; verbosity=0); + @test ŷ7 ≈ predict(model_cold, Xtest) - # test cold restart if another hyperparameter is changed: + # test that we get a cold restart if another hyperparameter is changed: model2 = update(model, Xtrain, y[train]; atom=Ridge(0.05)) - algorithm2 = LearnAPI.clone(LearnAPI.algorithm(model); atom=Ridge(0.05)) - @test predict(model, Xtest) ≈ predict(model2, Xtest) + algorithm2 = Ensemble(Ridge(0.05); n=7, rng) + model_cold = fit(algorithm2, Xtrain, y[train]; verbosity=0) + @test predict(model2, Xtest) ≈ predict(model_cold, Xtest) end From 8360ad4d2bcd26c351516c8baf6694f10995cebc Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Fri, 11 Oct 2024 18:23:27 +1300 Subject: [PATCH 096/187] doc tweaks --- docs/src/index.md | 25 +++++++++++++++++++------ docs/src/obs.md | 9 +++++---- docs/src/predict_transform.md | 4 ++++ src/obs.jl | 4 ++-- 4 files changed, 30 insertions(+), 12 deletions(-) diff --git a/docs/src/index.md b/docs/src/index.md index 23496062..f79bf24f 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -9,12 +9,13 @@ A base Julia interface for machine learning and statistics
```
 
-LearnAPI.jl is a lightweight, functional-style interface, providing a
-collection of [methods](@ref Methods), such as `fit` and `predict`, to be implemented by
-algorithms from machine learning and statistics. Through such implementations, these
-algorithms buy into functionality, such as hyperparameter optimization and model
-composition, as provided by ML/statistics toolboxes and other packages. LearnAPI.jl also
-provides a number of Julia [traits](@ref traits) for promising specific behavior.
+LearnAPI.jl is a lightweight, functional-style interface, providing a collection of
+[methods](@ref Methods), such as `fit` and `predict`, to be implemented by algorithms from
+machine learning and statistics. Its careful design ensures algorithms implementing
+LearnAPI.jl can buy into functionality, such as external performance estimates,
+hyperparameter optimization and model composition, provided by ML/statistics toolboxes and
+other packages. LearnAPI.jl includes a number of Julia [traits](@ref traits) for promising
+specific behavior.
 
 LearnAPI.jl's only dependency is the standard library `InteractiveUtils`.
 
@@ -91,6 +92,18 @@ then overloading `obs` is completely optional. Plain iteration interfaces, with
 knowledge of the number of observations, can also be specified (to support, e.g., data
 loaders reading images from disk).
 
+## Hooks for adding functionality
+
+Key to enabling toolboxes to enhance LearnAPI.jl algorithm functionality is the
+implementation of two additional methods, beyond the usual `fit` and
+`predict`/`transform`. Given any training `data` consumed by `fit` (such as `data = (X,
+y)` in the example above), [`LearnAPI.features(algorithm, data)`](@ref input) tells us what
+part of `data` comprises *features*, which is something that can be passed on to
+`predict` or `transform` (`X` in the example) while [`LearnAPI.target(algorithm,
+data)`](@ref), if implemented, tells us what part comprises the target (`y` in the
+example). By explicitly requiring such methods, we free algorithms to consume data in
+multiple forms, including optimised, algorithm-specific forms, as described above.
+
 ## Learning more
 
 - [Anatomy of an Implementation](@ref): informal introduction to the main actors in a new
diff --git a/docs/src/obs.md b/docs/src/obs.md
index 0bbb9f24..5818ea76 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -77,10 +77,11 @@ end
 
 ## Implementation guide
 
-| method | compulsory? | fallback |
-|:----------------------------------------|:-----------:|:--------------:|
-| [`obs(algorithm_or_model, data)`](@ref) | depends | returns `data` |
-| | | |
+| method | comment | compulsory? | fallback |
+|:-------------------------------|:------------------------------------|:-------------:|:---------------|
+| [`obs(algorithm, data)`](@ref) | here `data` is `fit`-consumable | not typically | returns `data` |
+| [`obs(model, data)`](@ref) | here `data` is `predict`-consumable | not typically | returns `data` |
+
 
 A sample implementation is given in [Providing an advanced data interface](@ref).
 
diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md
index df961719..1cf50f54 100644
--- a/docs/src/predict_transform.md
+++ b/docs/src/predict_transform.md
@@ -72,6 +72,10 @@ instead, which does not dispatch on [`LearnAPI.KindOfProxy`](@ref), but can be o
 paired with an implementation of [`inverse_transform`](@ref), for returning (approximate)
 right inverses to `transform`.
 
+Of course, a single algorithm can implement both a `predict` and `transform` method. 
For +example a K-means clustering algorithm can `predict` labels and `transform` to reduce +dimension using distances from the cluster centres. + ### [One-liners combining fit and transform/predict](@id one_liners) diff --git a/src/obs.jl b/src/obs.jl index 47fd8b79..8b226211 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -8,8 +8,8 @@ Return an algorithm-specific representation of `data`, suitable for passing to ` algorithm, `algorithm`. The returned object is guaranteed to implement observation access as indicated by -[`LearnAPI.data_interface(algorithm)`](@ref) (typically -[`LearnAPI.RandomAccess()`](@ref)). +[`LearnAPI.data_interface(algorithm)`](@ref), typically +[`LearnAPI.RandomAccess()`](@ref). Calling `fit`/`predict`/`transform` on the returned objects may have performance advantages over calling directly on `data` in some contexts. And resampling the returned From 1e504fe95f25b34321b53d79acf4223c6f565ea0 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 19:26:14 +1300 Subject: [PATCH 097/187] replace minimize -> LearnAPI.strip oops oops tweak --- README.md | 2 +- docs/make.jl | 1 - docs/src/accessor_functions.md | 5 +- docs/src/anatomy_of_an_implementation.md | 14 +++--- docs/src/index.md | 2 +- docs/src/minimize.md | 34 ------------- docs/src/reference.md | 19 ++++---- docs/src/traits.md | 2 +- src/LearnAPI.jl | 3 +- src/accessor_functions.jl | 61 ++++++++++++++++++++++-- src/minimize.jl | 41 ---------------- src/predict_transform.jl | 4 +- src/traits.jl | 9 ++-- test/integration/ensembling.jl | 6 +-- test/integration/gradient_descent.jl | 2 +- test/integration/regression.jl | 10 ++-- test/integration/static_algorithms.jl | 4 +- 17 files changed, 100 insertions(+), 119 deletions(-) delete mode 100644 docs/src/minimize.md delete mode 100644 src/minimize.jl diff --git a/README.md b/README.md index 26e58da2..b6b08900 100644 --- a/README.md +++ b/README.md @@ -18,7 +18,7 @@ Configure a learning algorithm, and inspect available functionality: ```julia julia> algorithm = Ridge(lambda=0.1) julia> LearnAPI.functions(algorithm) -(:(LearnAPI.fit), :(LearnAPI.algorithm), :(LearnAPI.minimize), :(LearnAPI.obs), +(:(LearnAPI.fit), :(LearnAPI.algorithm), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), :(LearnAPI.predict), :(LearnAPI.coefficients)) ``` diff --git a/docs/make.jl b/docs/make.jl index a0b0bb37..77405bc2 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -18,7 +18,6 @@ makedocs( "fit/update" => "fit_update.md", "predict/transform" => "predict_transform.md", "Kinds of Target Proxy" => "kinds_of_target_proxy.md", - "minimize" => "minimize.md", "target/weights/features" => "target_weights_features.md", "obs" => "obs.md", "Accessor Functions" => "accessor_functions.md", diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md index e6e50864..68adab31 100644 --- a/docs/src/accessor_functions.md +++ b/docs/src/accessor_functions.md @@ -1,10 +1,12 @@ # [Accessor Functions](@id accessor_functions) The sole argument of an accessor function is the output, `model`, of -[`fit`](@ref). Algorithms are free to implement any number of these, or none of them. +[`fit`](@ref). Algorithms are free to implement any number of these, or none of them. Only +`LearnAPI.strip` has a fallback, namely the identity. 
- [`LearnAPI.algorithm(model)`](@ref) - [`LearnAPI.extras(model)`](@ref) +- [`LearnAPI.strip(model)`](@ref) - [`LearnAPI.coefficients(model)`](@ref) - [`LearnAPI.intercept(model)`](@ref) - [`LearnAPI.tree(model)`](@ref) @@ -31,6 +33,7 @@ optional, any implemented accessor functions must be added to the list returned ```@docs LearnAPI.algorithm LearnAPI.extras +LearnAPI.strip LearnAPI.coefficients LearnAPI.intercept LearnAPI.tree diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 2fba3329..850dd3f0 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -183,15 +183,15 @@ nothing #hide ## Tearing a model down for serialization -The `minimize` method falls back to the identity. Here, for the sake of illustration, we +The `LearnAPI.strip` method falls back to the identity. Here, for the sake of illustration, we overload it to dump the named version of the coefficients: ```@example anatomy -LearnAPI.minimize(model::RidgeFitted) = +LearnAPI.strip(model::RidgeFitted) = RidgeFitted(model.algorithm, model.coefficients, nothing) ``` -Crucially, we can still use `LearnAPI.minimize(model)` in place of `model` to make new +Crucially, we can still use `LearnAPI.strip(model)` in place of `model` to make new predictions. @@ -220,7 +220,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined: functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), @@ -285,7 +285,7 @@ Serialization/deserialization: ```@example anatomy using Serialization -small_model = minimize(model) +small_model = LearnAPI.strip(model) filename = tempname() serialize(filename, small_model) ``` @@ -316,7 +316,7 @@ end LearnAPI.algorithm(model::RidgeFitted) = model.algorithm LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients -LearnAPI.minimize(model::RidgeFitted) = +LearnAPI.strip(model::RidgeFitted) = RidgeFitted(model.algorithm, model.coefficients, nothing) @trait( @@ -327,7 +327,7 @@ LearnAPI.minimize(model::RidgeFitted) = functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), diff --git a/docs/src/index.md b/docs/src/index.md index f79bf24f..36e361d2 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -60,7 +60,7 @@ predict(model, Distribution(), Xnew) LearnAPI.feature_importances(model) # Slim down and otherwise prepare model for serialization: -small_model = minimize(model) +small_model = LearnAPI.strip(model) serialize("my_random_forest.jls", small_model) # Recover saved model and algorithm configuration: diff --git a/docs/src/minimize.md b/docs/src/minimize.md deleted file mode 100644 index 03bc028e..00000000 --- a/docs/src/minimize.md +++ /dev/null @@ -1,34 +0,0 @@ -# [`minimize`](@id algorithm_minimize) - -```julia -minimize(model) -> -``` - -# Typical workflow - -```julia -model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` -ŷ = predict(model, Point(), Xnew) -LearnAPI.feature_importances(model) - -small_model = minimize(model) -serialize("my_model.jls", small_model) - -recovered_model = deserialize("my_random_forest.jls") -@assert predict(recovered_model, Point(), Xnew) == ŷ - -# throws MethodError: -LearnAPI.feature_importances(recovered_model) -``` - -# Implementation guide - -| method | compulsory? 
| fallback | -|:-----------------------------|:-----------:|:--------:| -| [`minimize`](@ref) | no | identity | - -# Reference - -```@docs -minimize -``` diff --git a/docs/src/reference.md b/docs/src/reference.md index d3e8076b..9c13ee79 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -138,9 +138,9 @@ for each. !!! note "Compulsory methods" - All new algorithm types must implement [`fit`](@ref), - [`LearnAPI.algorithm`](@ref algorithm_minimize), [`LearnAPI.constructor`](@ref) and - [`LearnAPI.functions`](@ref). + All new algorithm types must implement [`fit`](@ref), + [`LearnAPI.algorithm`](@ref), [`LearnAPI.constructor`](@ref) and + [`LearnAPI.functions`](@ref). Most algorithms will also implement [`predict`](@ref) and/or [`transform`](@ref). For a bare minimum implementation, see the implementation of `SmallAlgorithm` @@ -152,10 +152,10 @@ bare minimum implementation, see the implementation of `SmallAlgorithm` for non-generalizing algorithms (see [here](@ref static_algorithms) and [Static Algorithms](@ref)), for wrapping `algorithm` in a mutable struct that can be mutated by `predict`/`transform` to record byproducts of those operations. - + - [`update`](@ref fit): for updating learning outcomes after hyperparameter changes, such as increasing an iteration parameter. - + - [`update_observations`](@ref fit), [`update_features`](@ref fit): update learning outcomes by presenting additional training data. @@ -168,9 +168,6 @@ bare minimum implementation, see the implementation of `SmallAlgorithm` - [`inverse_transform`](@ref operations): for inverting the output of `transform` ("inverting" broadly understood) -- [`minimize`](@ref algorithm_minimize): for stripping the `model` output by `fit` of - inessential content, for purposes of serialization. - - [`LearnAPI.target`](@ref input), [`LearnAPI.weights`](@ref input), [`LearnAPI.features`](@ref): for extracting relevant parts of training data, where defined. @@ -181,8 +178,10 @@ bare minimum implementation, see the implementation of `SmallAlgorithm` [`LearnAPI.data_interface(algorithm)`](@ref). - [Accessor functions](@ref accessor_functions): these include functions like - `feature_importances` and `training_losses`, for extracting, from training outcomes, - information common to many algorithms. + `LearnAPI.feature_importances` and `LearnAPI.training_losses`, for extracting, from + training outcomes, information common to many algorithms. This includes + [`LearnAPI.strip(model)`](@ref) for replacing a learning outcome `model` with a + serializable version that can still `predict` or `transform`. - [Algorithm traits](@ref traits): methods that promise specific algorithm behavior or record general information about the algorithm. 
Only [`LearnAPI.constructor`](@ref) and diff --git a/docs/src/traits.md b/docs/src/traits.md index 83d3287d..cb03f03d 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -16,7 +16,7 @@ In the examples column of the table below, `Continuous` is a name owned the pack | trait | return value | fallback value | example | |:-----------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:-----------------------------------------------------------| | [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | -| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :minimize, :(LearnAPI.algorithm), :obs)` | +| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :LearnAPI.strip, :(LearnAPI.algorithm), :obs)` | | [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. | `()` | `(Distribution(), Interval())` | | [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | | [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index 8a82874e..74fdd84b 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -6,7 +6,6 @@ include("tools.jl") include("types.jl") include("predict_transform.jl") include("fit_update.jl") -include("minimize.jl") include("target_weights_features.jl") include("obs.jl") include("accessor_functions.jl") @@ -15,7 +14,7 @@ include("clone.jl") export @trait export fit, update, update_observations, update_features -export predict, transform, inverse_transform, minimize, obs +export predict, transform, inverse_transform, obs for name in Symbol.(CONCRETE_TARGET_PROXY_TYPES_SYMBOLS) @eval export $name diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index 854bfdb7..84859307 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -16,15 +16,15 @@ const DOC_STATIC = """ LearnAPI.algorithm(model) - LearnAPI.algorithm(minimized_model) + LearnAPI.algorithm(LearnAPI.stripd_model) -Recover the algorithm used to train `model` or the output of [`minimize(model)`](@ref). +Recover the algorithm used to train `model` or the output of [`LearnAPI.strip(model)`](@ref). In other words, if `model = fit(algorithm, data...)`, for some `algorithm` and `data`, then ```julia -LearnAPI.algorithm(model) == algorithm == LearnAPI.algorithm(minimize(model)) +LearnAPI.algorithm(model) == algorithm == LearnAPI.algorithm(LearnAPI.strip(model)) ``` is `true`. @@ -36,6 +36,61 @@ only contract. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.algorithm)")) """ function algorithm end +""" + LearnAPI.strip(model; options...) + +Return a version of `model` that will generally have a smaller memory allocation than +`model`, suitable for serialization. Here `model` is any object returned by +[`fit`](@ref). 
Accessor functions that can be called on `model` may not work on
+`LearnAPI.strip(model)`, but [`predict`](@ref), [`transform`](@ref) and
+[`inverse_transform`](@ref) will work, if implemented. Check
+`LearnAPI.functions(LearnAPI.algorithm(model))` to see what the original `model`
+implements.
+
+Specific algorithms may provide keyword `options` to control how much of the original
+functionality is preserved by `LearnAPI.strip`.
+
+# Typical workflow
+
+```julia
+model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)`
+ŷ = predict(model, Point(), Xnew)
+
+small_model = LearnAPI.strip(model)
+serialize("my_model.jls", small_model)
+
+recovered_model = deserialize("my_model.jls")
+@assert predict(recovered_model, Point(), Xnew) == ŷ
+```
+
+# Extended help
+
+# New implementations
+
+Overloading `LearnAPI.strip` for new algorithms is optional. The fallback is the
+identity.
+
+New implementations must enforce the following identities, whenever the right-hand side is
+defined:
+
+```julia
+predict(LearnAPI.strip(model; options...), args...; kwargs...) ==
+    predict(model, args...; kwargs...)
+transform(LearnAPI.strip(model; options...), args...; kwargs...) ==
+    transform(model, args...; kwargs...)
+inverse_transform(LearnAPI.strip(model; options), args...; kwargs...) ==
+    inverse_transform(model, args...; kwargs...)
+```
+
+Additionally:
+
+```julia
+LearnAPI.strip(LearnAPI.strip(model)) == LearnAPI.strip(model)
+```
+
+"""
+LearnAPI.strip(model) = model
+
 """
     LearnAPI.feature_importances(model)
diff --git a/src/minimize.jl b/src/minimize.jl
deleted file mode 100644
index 653d3fdf..00000000
--- a/src/minimize.jl
+++ /dev/null
@@ -1,41 +0,0 @@
-"""
-    minimize(model; options...)
-
-Return a version of `model` that will generally have a smaller memory allocation than
-`model`, suitable for serialization. Here `model` is any object returned by
-[`fit`](@ref). Accessor functions that can be called on `model` may not work on
-`minimize(model)`, but [`predict`](@ref), [`transform`](@ref) and
-[`inverse_transform`](@ref) will work, if implemented. Check
-`LearnAPI.functions(LearnAPI.algorithm(model))` to view see what the original `model`
-implements.
-
-Specific algorithms may provide keyword `options` to control how much of the original
-functionality is preserved by `minimize`.
-
-# Extended help
-
-# New implementations
-
-Overloading `minimize` for new algorithms is optional. The fallback is the
-identity. $(DOC_IMPLEMENTED_METHODS(":minimize", overloaded=true))
-
-New implementations must enforce the following identities, whenever the right-hand side is
-defined:
-
-```julia
-predict(minimize(model; options...), args...; kwargs...) ==
-    predict(model, args...; kwargs...)
-transform(minimize(model; options...), args...; kwargs...) ==
-    transform(model, args...; kwargs...)
-inverse_transform(minimize(model; options), args...; kwargs...) ==
-    inverse_transform(model, args...; kwargs...)
-```
-
-Additionally:
-
-```julia
-minimize(minimize(model)) == minimize(model)
-```
-
-"""
-minimize(model) = model
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index 9598572b..d59ac78e 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -21,11 +21,11 @@ DOC_MUTATION(op) =
 
 DOC_MINIMIZE(func) =
     """
-    If, additionally, [`minimize(model)`](@ref) is overloaded, then the following identity
+    If, additionally, [`LearnAPI.strip(model)`](@ref) is overloaded, then the following identity
     must hold:
 
     ```julia
-    $func(minimize(model), args...) = $func(model, args...) 
+ $func(LearnAPI.strip(model), args...) = $func(model, args...) ``` """ diff --git a/src/traits.jl b/src/traits.jl index 81ecf778..614daf53 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -89,7 +89,7 @@ return value: |-----------------------------------|----------------------------|------------------------------------| | `:(LearnAPI.fit)` | yes | yes | | `:(LearnAPI.algorithm)` | yes | yes | -| `:(LearnAPI.minimize)` | no | yes | +| `:(LearnAPI.strip)` | no | yes | | `:(LearnAPI.obs)` | no | yes | | `:(LearnAPI.features)` | no | yes, unless `fit` consumes no data | | `:(LearnAPI.target)` | no | only if implemented | @@ -100,17 +100,18 @@ return value: | `:(LearnAPI.predict)` | no | only if implemented | | `:(LearnAPI.transform)` | no | only if implemented | | `:(LearnAPI.inverse_transform)` | no | only if implemented | -| | no | only if implemented | +| < accessor functions> | no | only if implemented | Also include any implemented accessor functions, both those owned by LearnaAPI.jl, and any -algorithm-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST. +algorithm-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST +(`LearnAPI.strip` is always included). """ functions(::Any) = () functions() = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), diff --git a/test/integration/ensembling.jl b/test/integration/ensembling.jl index 657f0d29..bcebacaa 100644 --- a/test/integration/ensembling.jl +++ b/test/integration/ensembling.jl @@ -133,11 +133,11 @@ LearnAPI.predict(model::EnsembleFitted, ::Point, data) = predict(atomic_model, Point(), data) end -LearnAPI.minimize(model::EnsembleFitted) = EnsembleFitted( +LearnAPI.strip(model::EnsembleFitted) = EnsembleFitted( model.algorithm, model.atom, model.rng, - minimize.(Ref(model.atom), models), + LearnAPI.strip.(Ref(model.atom), models), ) # note the inclusion of `iteration_parameter`: @@ -151,7 +151,7 @@ LearnAPI.minimize(model::EnsembleFitted) = EnsembleFitted( functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), diff --git a/test/integration/gradient_descent.jl b/test/integration/gradient_descent.jl index 6b582225..dbf95442 100644 --- a/test/integration/gradient_descent.jl +++ b/test/integration/gradient_descent.jl @@ -316,7 +316,7 @@ LearnAPI.training_losses(model::PerceptronClassifierFitted) = model.losses functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), diff --git a/test/integration/regression.jl b/test/integration/regression.jl index 34144c87..4bcc9fe1 100644 --- a/test/integration/regression.jl +++ b/test/integration/regression.jl @@ -97,7 +97,7 @@ LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = # accessor function: LearnAPI.feature_importances(model::RidgeFitted) = model.feature_importances -LearnAPI.minimize(model::RidgeFitted) = +LearnAPI.strip(model::RidgeFitted) = RidgeFitted(model.algorithm, model.coefficients, nothing) @trait( @@ -108,7 +108,7 @@ LearnAPI.minimize(model::RidgeFitted) = functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), @@ -170,7 +170,7 @@ data = (X, y) filename = tempname() using Serialization - small_model = minimize(model) + small_model = 
LearnAPI.strip(model) serialize(filename, small_model) recovered_model = deserialize(filename) @@ -230,7 +230,7 @@ LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm LearnAPI.predict(model::BabyRidgeFitted, ::Point, Xnew) = Tables.matrix(Xnew)*model.coefficients -LearnAPI.minimize(model::BabyRidgeFitted) = +LearnAPI.strip(model::BabyRidgeFitted) = BabyRidgeFitted(model.algorithm, model.coefficients, nothing) @trait( @@ -241,7 +241,7 @@ LearnAPI.minimize(model::BabyRidgeFitted) = functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl index 328725b9..aac7283b 100644 --- a/test/integration/static_algorithms.jl +++ b/test/integration/static_algorithms.jl @@ -45,7 +45,7 @@ end functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.transform), ), @@ -112,7 +112,7 @@ end functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), - :(LearnAPI.minimize), + :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.transform), :(MyPkg.rejected), # accessor function not owned by LearnAPI.jl, From 55caed4462acbfd5f1d321389cdbd38c02351067 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 23:02:55 +1300 Subject: [PATCH 098/187] doc tweak --- docs/src/anatomy_of_an_implementation.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 850dd3f0..a41325d4 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -181,10 +181,10 @@ LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients nothing #hide ``` -## Tearing a model down for serialization - -The `LearnAPI.strip` method falls back to the identity. Here, for the sake of illustration, we -overload it to dump the named version of the coefficients: +The [`LearnAPI.strip(model)`](@ref) accessor function is for returning a version of +`model` suitable for serialization (typically smaller and data anonymized). It has a +fallback that just returns `model` but for the sake of illustration, we overload it to +dump the named version of the coefficients: ```@example anatomy LearnAPI.strip(model::RidgeFitted) = From 1e9d5e51a82b0703b281607038f87d1121c3ca1d Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Fri, 11 Oct 2024 23:08:51 +1300 Subject: [PATCH 099/187] rename a struct in a test --- test/integration/gradient_descent.jl | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/test/integration/gradient_descent.jl b/test/integration/gradient_descent.jl index dbf95442..92c9fdeb 100644 --- a/test/integration/gradient_descent.jl +++ b/test/integration/gradient_descent.jl @@ -134,7 +134,7 @@ PerceptronClassifier(; epochs=50, optimiser=Optimisers.Adam(), rng=Random.defaul LearnAPI.target(algorithm::PerceptronClassifier, data::Tuple) = last(data) # For wrapping pre-processed training data (output of `obs(algorithm, data)`): -struct PerceptronClassifierObservations +struct PerceptronClassifierObs X::Matrix{Float32} y_hot::BitMatrix # one-hot encoded target classes # the (ordered) pool of `y`, as `CategoricalValue`s @@ -145,12 +145,12 @@ function LearnAPI.obs(algorithm::PerceptronClassifier, data::Tuple) X, y = data classes = CategoricalDistributions.classes(y) y_hot = classes .== permutedims(y) # one-hot encoding - return PerceptronClassifierObservations(X, y_hot, classes) + return PerceptronClassifierObs(X, y_hot, classes) end # implement `RadomAccess()` interface for output of `obs`: -Base.length(observations::PerceptronClassifierObservations) = length(observations.y) -Base.getindex(observations, I) = PerceptronClassifierObservations( +Base.length(observations::PerceptronClassifierObs) = length(observations.y) +Base.getindex(observations, I) = PerceptronClassifierObs( (@view observations.X[:, I]), (@view observations.y[I]), observations.classes, @@ -158,12 +158,12 @@ Base.getindex(observations, I) = PerceptronClassifierObservations( LearnAPI.target( algorithm::PerceptronClassifier, - observations::PerceptronClassifierObservations, + observations::PerceptronClassifierObs, ) = observations.y LearnAPI.features( algorithm::PerceptronClassifier, - observations::PerceptronClassifierObservations, + observations::PerceptronClassifierObs, ) = observations.X # Note that data consumed by `predict` needs no pre-processing, so no need to overload @@ -186,7 +186,7 @@ LearnAPI.algorithm(model::PerceptronClassifierFitted) = model.algorithm # `fit` for pre-processed data (output of `obs(algorithm, data)`): function LearnAPI.fit( algorithm::PerceptronClassifier, - observations::PerceptronClassifierObservations; + observations::PerceptronClassifierObs; verbosity=1, ) @@ -221,7 +221,7 @@ LearnAPI.fit(algorithm::PerceptronClassifier, data; kwargs...) = # see the `PerceptronClassifier` docstring for `update_observations` logic. function LearnAPI.update_observations( model::PerceptronClassifierFitted, - observations_new::PerceptronClassifierObservations; + observations_new::PerceptronClassifierObs; verbosity=1, replacements..., ) @@ -253,7 +253,7 @@ LearnAPI.update_observations(model::PerceptronClassifierFitted, data; kwargs...) # see the `PerceptronClassifier` docstring for `update` logic. function LearnAPI.update( model::PerceptronClassifierFitted, - observations::PerceptronClassifierObservations; + observations::PerceptronClassifierObs; verbosity=1, replacements..., ) From 21383ab95a8b38070b6c25c29c0375b74a4e1eeb Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Fri, 11 Oct 2024 23:14:18 +1300 Subject: [PATCH 100/187] typo --- test/integration/gradient_descent.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/integration/gradient_descent.jl b/test/integration/gradient_descent.jl index 92c9fdeb..ff41782b 100644 --- a/test/integration/gradient_descent.jl +++ b/test/integration/gradient_descent.jl @@ -120,7 +120,7 @@ If `Δepochs = n - perceptron.epochs` is non-negative, then return an updated mo the weights and bias of the previously learned perceptron used as the starting state in new gradient descent updates for `Δepochs` epochs, and using the provided `newdata` instead of the previous training data. Any other hyperparaameter `replacements` are also -adopted. In `Δepochs` is negative or not specified, instead return `fit(algorithm, +adopted. If `Δepochs` is negative or not specified, instead return `fit(algorithm, newdata)`, where `algorithm=LearnAPI.clone(algorithm; epochs=n, replacements....)`. """ From 3a56ad3c21dcc50661f2993fd1b4a67b6968fa46 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 23:18:52 +1300 Subject: [PATCH 101/187] rename a struct in tests --- test/integration/static_algorithms.jl | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/test/integration/static_algorithms.jl b/test/integration/static_algorithms.jl index aac7283b..21b43738 100644 --- a/test/integration/static_algorithms.jl +++ b/test/integration/static_algorithms.jl @@ -69,24 +69,24 @@ end # This a variation of `Selector` above that stores the names of rejected features in the # output of `fit`, for inspection by an accessor function called `rejected`. -struct Selector2 +struct FancySelector names::Vector{Symbol} end -Selector2(; names=Symbol[]) = Selector2(names) # LearnAPI.constructor defined later +FancySelector(; names=Symbol[]) = FancySelector(names) # LearnAPI.constructor defined later -mutable struct Selector2Fit - algorithm::Selector2 +mutable struct FancySelectorFitted + algorithm::FancySelector rejected::Vector{Symbol} - Selector2Fit(algorithm) = new(algorithm) + FancySelectorFitted(algorithm) = new(algorithm) end -LearnAPI.algorithm(model::Selector2Fit) = model.algorithm -rejected(model::Selector2Fit) = model.rejected +LearnAPI.algorithm(model::FancySelectorFitted) = model.algorithm +rejected(model::FancySelectorFitted) = model.rejected # Here we are wrapping `algorithm` with a place-holder for the `rejected` feature names. -LearnAPI.fit(algorithm::Selector2; verbosity=1) = Selector2Fit(algorithm) +LearnAPI.fit(algorithm::FancySelector; verbosity=1) = FancySelectorFitted(algorithm) # output the filtered table and add `rejected` field to model (mutatated!) 
-function LearnAPI.transform(model::Selector2Fit, X) +function LearnAPI.transform(model::FancySelectorFitted, X) table = Tables.columntable(X) names = Tables.columnnames(table) keep = LearnAPI.algorithm(model).names @@ -98,15 +98,15 @@ function LearnAPI.transform(model::Selector2Fit, X) end # fit and transform in one step: -function LearnAPI.transform(algorithm::Selector2, X) +function LearnAPI.transform(algorithm::FancySelector, X) model = fit(algorithm) transform(model, X) end # note the necessity of overloading `is_static` (`fit` consumes no data): @trait( - Selector2, - constructor = Selector2, + FancySelector, + constructor = FancySelector, is_static = true, tags = ("feature engineering",), functions = ( @@ -120,7 +120,7 @@ end ) @testset "test a variation that reports byproducts" begin - algorithm = Selector2(names=[:x, :w]) + algorithm = FancySelector(names=[:x, :w]) X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w]) model = fit(algorithm) # no data arguments! @test !isdefined(model, :reject) From f3afd3665066fa69a86c6d511b17fa436ead474f Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 23:26:13 +1300 Subject: [PATCH 102/187] rename test/integration folder test/patterns --- docs/src/patterns/classification.md | 2 +- docs/src/patterns/ensembling.md | 2 +- docs/src/patterns/feature_engineering.md | 2 +- docs/src/patterns/gradient_descent.md | 2 +- docs/src/patterns/iterative_algorithms.md | 4 ++-- docs/src/patterns/meta_algorithms.md | 2 +- docs/src/patterns/regression.md | 2 +- docs/src/patterns/static_algorithms.md | 2 +- test/{integration => patterns}/ensembling.jl | 0 test/{integration => patterns}/gradient_descent.jl | 0 test/{integration => patterns}/regression.jl | 0 test/{integration => patterns}/static_algorithms.jl | 0 test/runtests.jl | 6 +++--- 13 files changed, 12 insertions(+), 12 deletions(-) rename test/{integration => patterns}/ensembling.jl (100%) rename test/{integration => patterns}/gradient_descent.jl (100%) rename test/{integration => patterns}/regression.jl (100%) rename test/{integration => patterns}/static_algorithms.jl (100%) diff --git a/docs/src/patterns/classification.md b/docs/src/patterns/classification.md index 86fa1158..2913cea5 100644 --- a/docs/src/patterns/classification.md +++ b/docs/src/patterns/classification.md @@ -2,4 +2,4 @@ See these examples from tests: -- [perceptron classifier](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/gradient_descent.jl) +- [perceptron classifier](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/gradient_descent.jl) diff --git a/docs/src/patterns/ensembling.md b/docs/src/patterns/ensembling.md index 1513997e..a93ae305 100644 --- a/docs/src/patterns/ensembling.md +++ b/docs/src/patterns/ensembling.md @@ -1,5 +1,5 @@ # Ensembling See [this -example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/ensembling.jl) +example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/ensembling.jl) from tests. diff --git a/docs/src/patterns/feature_engineering.md b/docs/src/patterns/feature_engineering.md index 614f94a6..96b8b45e 100644 --- a/docs/src/patterns/feature_engineering.md +++ b/docs/src/patterns/feature_engineering.md @@ -1,5 +1,5 @@ # Feature Engineering - For a simple feature selection algorithm (no "learning) see [these -examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/static_algorithms.jl) +examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/static_algorithms.jl) from tests. 
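For orientation, the static-algorithm pattern exercised by the `FancySelector` code above reduces to the following workflow. This is a sketch assembled from the test diff; `rejected` is a test-local accessor function, not a method owned by LearnAPI.jl:

```julia
using LearnAPI, DataFrames

algorithm = FancySelector(names=[:x, :w])
X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w])
model = fit(algorithm)        # static algorithm: `fit` consumes no data
Xsmall = transform(model, X)  # computation happens here; byproducts recorded in `model`
rejected(model)               # -> [:y, :z]
```
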
diff --git a/docs/src/patterns/gradient_descent.md b/docs/src/patterns/gradient_descent.md index acded653..7fd4a11c 100644 --- a/docs/src/patterns/gradient_descent.md +++ b/docs/src/patterns/gradient_descent.md @@ -1,5 +1,5 @@ # Gradient Descent See [this -example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/gradient_descent.jl) +example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/gradient_descent.jl) from tests. diff --git a/docs/src/patterns/iterative_algorithms.md b/docs/src/patterns/iterative_algorithms.md index abab6316..1cf4ab23 100644 --- a/docs/src/patterns/iterative_algorithms.md +++ b/docs/src/patterns/iterative_algorithms.md @@ -2,6 +2,6 @@ See these examples from tests: -- [bagged ensembling](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/ensembling.jl) +- [bagged ensembling](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/ensembling.jl) -- [perceptron classifier](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/gradient_descent.jl) +- [perceptron classifier](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/gradient_descent.jl) diff --git a/docs/src/patterns/meta_algorithms.md b/docs/src/patterns/meta_algorithms.md index 2104ff26..17ccad8f 100644 --- a/docs/src/patterns/meta_algorithms.md +++ b/docs/src/patterns/meta_algorithms.md @@ -2,6 +2,6 @@ Many meta-algorithms are can be implemented as wrappers. An example is [this bagged ensemble -algorithm](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/ensembling.jl) +algorithm](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/ensembling.jl) from tests. diff --git a/docs/src/patterns/regression.md b/docs/src/patterns/regression.md index 626d59ce..7cf3b6d0 100644 --- a/docs/src/patterns/regression.md +++ b/docs/src/patterns/regression.md @@ -1,5 +1,5 @@ # Regression See [these -examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/regression.jl) +examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/regression.jl) from tests. diff --git a/docs/src/patterns/static_algorithms.md b/docs/src/patterns/static_algorithms.md index 7f420e2d..21a517dc 100644 --- a/docs/src/patterns/static_algorithms.md +++ b/docs/src/patterns/static_algorithms.md @@ -1,7 +1,7 @@ # Static Algorithms See [these -examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/static_algorithms.jl) +examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/static_algorithms.jl) from tests. 
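Similarly, the regression pattern those pages link to boils down to a workflow like the following sketch, based on the `Ridge` example in the renamed `regression.jl` file (the `lambda` keyword is assumed from that file):

```julia
algorithm = Ridge(lambda=0.5)              # hyperparameter container
model = fit(algorithm, (X, y))             # train
ŷ = predict(model, Point(), Xnew)          # point predictions
LearnAPI.feature_importances(model)        # accessor function
small = LearnAPI.strip(model)              # smaller object; predictions unchanged
@assert predict(small, Point(), Xnew) == ŷ
```
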
diff --git a/test/integration/ensembling.jl b/test/patterns/ensembling.jl
similarity index 100%
rename from test/integration/ensembling.jl
rename to test/patterns/ensembling.jl
diff --git a/test/integration/gradient_descent.jl b/test/patterns/gradient_descent.jl
similarity index 100%
rename from test/integration/gradient_descent.jl
rename to test/patterns/gradient_descent.jl
diff --git a/test/integration/regression.jl b/test/patterns/regression.jl
similarity index 100%
rename from test/integration/regression.jl
rename to test/patterns/regression.jl
diff --git a/test/integration/static_algorithms.jl b/test/patterns/static_algorithms.jl
similarity index 100%
rename from test/integration/static_algorithms.jl
rename to test/patterns/static_algorithms.jl
diff --git a/test/runtests.jl b/test/runtests.jl
index 2c66588d..5385a731 100644
--- a/test/runtests.jl
+++ b/test/runtests.jl
@@ -4,9 +4,9 @@ test_files = [
     "tools.jl",
     "traits.jl",
     "clone.jl",
-    "integration/regression.jl",
-    "integration/static_algorithms.jl",
-    "integration/ensembling.jl",
+    "patterns/regression.jl",
+    "patterns/static_algorithms.jl",
+    "patterns/ensembling.jl",
 ]
 
 files = isempty(ARGS) ? test_files : ARGS

From 168e0c64ce7b953bf79fe8cb355537f71229e7e6 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Fri, 11 Oct 2024 23:31:07 +1300
Subject: [PATCH 103/187] tweaks

---
 docs/src/common_implementation_patterns.md | 6 ------
 docs/src/patterns/feature_engineering.md   | 5 ++---
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md
index 29e67ca8..5ab63cce 100644
--- a/docs/src/common_implementation_patterns.md
+++ b/docs/src/common_implementation_patterns.md
@@ -4,12 +4,6 @@
 🚧
 ```
 
-!!! warning
-
-    Under construction
-
-!!! warning
-
 This section is only an implementation guide. The definitive specification of the Learn
 API is given in [Reference](@ref reference).
 
diff --git a/docs/src/patterns/feature_engineering.md b/docs/src/patterns/feature_engineering.md
index 96b8b45e..850dc0e3 100644
--- a/docs/src/patterns/feature_engineering.md
+++ b/docs/src/patterns/feature_engineering.md
@@ -1,5 +1,4 @@
 # Feature Engineering
 
-- For a simple feature selection algorithm (no "learning) see [these
-examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/static_algorithms.jl)
-from tests.
+For a simple feature selection algorithm (no "learning") see
+[these examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/static_algorithms.jl) from tests.

From a683c9325cfd318cbfe4d96a64fa3977d89a9af5 Mon Sep 17 00:00:00 2001
From: "Anthony D.
Blaom" Date: Sat, 12 Oct 2024 11:06:36 +1300 Subject: [PATCH 104/187] add a observation-updatable density estimator to tests --- Project.toml | 2 + docs/src/common_implementation_patterns.md | 8 +- docs/src/patterns/density_estimation.md | 4 + docs/src/patterns/incremental_algorithms.md | 5 + src/predict_transform.jl | 4 + src/types.jl | 54 ++++---- test/patterns/incremental_algorithms.jl | 135 ++++++++++++++++++++ test/runtests.jl | 1 + test/traits.jl | 5 +- 9 files changed, 185 insertions(+), 33 deletions(-) create mode 100644 docs/src/patterns/incremental_algorithms.md create mode 100644 test/patterns/incremental_algorithms.jl diff --git a/Project.toml b/Project.toml index 849adaeb..2d23d7e2 100644 --- a/Project.toml +++ b/Project.toml @@ -11,6 +11,7 @@ julia = "1.6" [extras] DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" +Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f" LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" @@ -23,6 +24,7 @@ Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [targets] test = [ "DataFrames", + "Distributions", "LinearAlgebra", "MLUtils", "Random", diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 5ab63cce..7f7641a4 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -1,8 +1,6 @@ # Common Implementation Patterns -```@raw html -🚧 -``` +!!! warning This section is only an implementation guide. The definitive specification of the Learn API is given in [Reference](@ref reference). @@ -25,7 +23,7 @@ implementations fall into one (or more) of the following informally understood p - [Iterative Algorithms](@ref) -- Incremental Algorithms +- [Incremental Algorithms](@ref): Algorithms that can be updated with new observations. - [Feature Engineering](@ref): Algorithms for selecting or combining features @@ -48,7 +46,7 @@ implementations fall into one (or more) of the following informally understood p - Survival Analysis -- Density Estimation: Algorithms that learn a probability distribution +- [Density Estimation](@ref): Algorithms that learn a probability distribution - Bayesian Algorithms diff --git a/docs/src/patterns/density_estimation.md b/docs/src/patterns/density_estimation.md index f535f9fe..e9ca083b 100644 --- a/docs/src/patterns/density_estimation.md +++ b/docs/src/patterns/density_estimation.md @@ -1 +1,5 @@ # Density Estimation + +See these examples from tests: + +- [normal distribution estimator](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/incremental_algorithms.jl) diff --git a/docs/src/patterns/incremental_algorithms.md b/docs/src/patterns/incremental_algorithms.md new file mode 100644 index 00000000..89ad8643 --- /dev/null +++ b/docs/src/patterns/incremental_algorithms.md @@ -0,0 +1,5 @@ +# Incremental Algorithms + +See these examples from tests: + +- [normal distribution estimator](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/incremental_algorithms.jl) diff --git a/src/predict_transform.jl b/src/predict_transform.jl index d59ac78e..39bff2a9 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -66,6 +66,9 @@ which lists all supported target proxies. The argument `model` is anything returned by a call of the form `fit(algorithm, ...)`. +If `LearnAPI.features(LearnAPI.algorithm(model)) == nothing`, then argument `data` is +omitted. An example is density estimators. 
+ # Example In the following, `algorithm` is some supervised learning algorithm with @@ -105,6 +108,7 @@ $(DOC_DATA_INTERFACE(:predict)) """ predict(model, data) = predict(model, kinds_of_proxy(algorithm(model)) |> first, data) +predict(model) = predict(model, kinds_of_proxy(algorithm(model)) |> first) # automatic slurping of multiple data arguments: predict(model, k::KindOfProxy, data1, data2, datas...; kwargs...) = diff --git a/src/types.jl b/src/types.jl index 25f98d81..be40922f 100644 --- a/src/types.jl +++ b/src/types.jl @@ -22,27 +22,27 @@ See also [`LearnAPI.KindOfProxy`](@ref). | type | form of an observation | |:-------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `LearnAPI.Point` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode | -| `LearnAPI.Sampleable` | object that can be sampled to obtain object of the same form as target observation | -| `LearnAPI.Distribution` | explicit probability density/mass function whose sample space is all possible target observations | -| `LearnAPI.LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations | -| `LearnAPI.Probability`¹ | numerical probability or probability vector | -| `LearnAPI.LogProbability`¹ | log-probability or log-probability vector | -| `LearnAPI.Parametric`¹ | a list of parameters (e.g., mean and variance) describing some distribution | -| `LearnAPI.LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering | -| `LearnAPI.LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above | -| `LearnAPI.LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above | -| `LearnAPI.LabelAmbiguousFuzzy` | same as `LabelAmbiguous` but with multiple values of indeterminant number | -| `LearnAPI.Quantile`² | same as target but with quantile interpretation | -| `LearnAPI.Expectile`² | same as target but with expectile interpretation | -| `LearnAPI.ConfidenceInterval`² | confidence interval | -| `LearnAPI.Fuzzy` | finite but possibly varying number of target observations | -| `LearnAPI.ProbabilisticFuzzy` | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one) | -| `LearnAPI.SurvivalFunction` | survival function | -| `LearnAPI.SurvivalDistribution` | probability distribution for survival time | -| `LearnAPI.SurvivalHazardFunction` | hazard function for survival time | -| `LearnAPI.OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | -| `LearnAPI.Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | +| `Point` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode | +| `Sampleable` | object that can be sampled to obtain object of the same form as target observation | +| `Distribution` | explicit probability density/mass function whose sample space is all possible target observations | +| `LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations | +| `Probability`¹ | numerical probability or probability vector | +| 
`LogProbability`¹ | log-probability or log-probability vector | +| `Parametric`¹ | a list of parameters (e.g., mean and variance) describing some distribution | +| `LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering | +| `LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above | +| `LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above | +| `LabelAmbiguousFuzzy` | same as `LabelAmbiguous` but with multiple values of indeterminant number | +| `Quantile`² | same as target but with quantile interpretation | +| `Expectile`² | same as target but with expectile interpretation | +| `ConfidenceInterval`² | confidence interval | +| `Fuzzy` | finite but possibly varying number of target observations | +| `ProbabilisticFuzzy` | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one) | +| `SurvivalFunction` | survival function | +| `SurvivalDistribution` | probability distribution for survival time | +| `SurvivalHazardFunction` | hazard function for survival time | +| `OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) | +| `Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) | ¹Provided for completeness but discouraged to avoid [ambiguities in representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation). @@ -86,9 +86,9 @@ space ``Y^n``, where ``Y`` is the space from which the target variable takes its | type `T` | form of output of `predict(model, ::T, data)` | |:-------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `LearnAPI.JointSampleable` | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. | -| `LearnAPI.JointDistribution` | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` | -| `LearnAPI.JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` | +| `JointSampleable` | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. | +| `JointDistribution` | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` | +| `JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` | """ abstract type Joint <: KindOfProxy end @@ -108,9 +108,9 @@ single object representing a probability distribution. 
| type `T` | form of output of `predict(model, ::T)` | |:--------------------------------:|:-----------------------------------------------------------------------| -| `LearnAPI.SingleSampleable` | object that can be sampled to obtain a single target observation | -| `LearnAPI.SingleDistribution` | explicit probability density/mass function for sampling the target | -| `LearnAPI.SingleLogDistribution` | explicit log-probability density/mass function for sampling the target | +| `SingleSampleable` | object that can be sampled to obtain a single target observation | +| `SingleDistribution` | explicit probability density/mass function for sampling the target | +| `SingleLogDistribution` | explicit log-probability density/mass function for sampling the target | """ abstract type Single <: KindOfProxy end diff --git a/test/patterns/incremental_algorithms.jl b/test/patterns/incremental_algorithms.jl new file mode 100644 index 00000000..71f9bb26 --- /dev/null +++ b/test/patterns/incremental_algorithms.jl @@ -0,0 +1,135 @@ +using LearnAPI +using Statistics +using StableRNGs + +import Distributions + +# # NORMAL DENSITY ESTIMATOR + +# An example of density estimation and also of incremental learning +# (`update_observations`). + + +# ## Implementation + +""" + NormalEstimator() + +Instantiate an algorithm for finding the maximum likelihood normal distribution fitting +some real univariate data `y`. Estimates can be updated with new data. + +```julia +model = fit(NormalEstimator(), y) +d = predict(model) # returns the learned `Normal` distribution +``` + +While the above is equivalent to the single operation `d = +predict(NormalEstimator(), y)`, the above workflow allows for the presentation of +additional observations post facto: The following is equivalent to `d2 = +predict(NormalEstimator(), vcat(y, ynew))`: + +```julia +update_observations(model, ynew) +d2 = predict(model) +``` + +Inspect all learned parameters with `LearnAPI.extras(model)`. 
Predict a 95% +confidence interval with `predict(model, ConfidenceInterval())` + +""" +struct NormalEstimator end + +struct NormalEstimatorFitted{T} + Σy::T + ȳ::T + ss::T # sum of squared residuals + n::Int +end + +LearnAPI.algorithm(::NormalEstimatorFitted) = NormalEstimator() + +function LearnAPI.fit(::NormalEstimator, y) + n = length(y) + Σy = sum(y) + ȳ = Σy/n + ss = sum(x->x^2, y) - n*ȳ^2 + return NormalEstimatorFitted(Σy, ȳ, ss, n) +end + +function LearnAPI.update_observations(model::NormalEstimatorFitted, ynew) + m = length(ynew) + n = model.n + m + Σynew = sum(ynew) + Σy = model.Σy + Σynew + ȳ = Σy/n + δ = model.n*((m*model.ȳ - Σynew)/n)^2 + ss = model.ss + δ + sum(x -> (x - ȳ)^2, ynew) + return NormalEstimatorFitted(Σy, ȳ, ss, n) +end + +LearnAPI.features(::NormalEstimator, y) = nothing +LearnAPI.target(::NormalEstimator, y) = y + +LearnAPI.predict(model::NormalEstimatorFitted, ::Distribution) = + Distributions.Normal(model.ȳ, sqrt(model.ss/model.n)) +LearnAPI.predict(model::NormalEstimatorFitted, ::Point) = model.ȳ +function LearnAPI.predict(model::NormalEstimatorFitted, ::ConfidenceInterval) + d = predict(model, Distribution()) + return (quantile(d, 0.025), quantile(d, 0.975)) +end + +# for fit and predict in one line: +LearnAPI.predict(::NormalEstimator, k::LearnAPI.KindOfProxy, y) = + predict(fit(NormalEstimator(), y), k) +LearnAPI.predict(::NormalEstimator, y) = predict(NormalEstimator(), Distribution(), y) + +LearnAPI.extras(model::NormalEstimatorFitted) = (μ=model.ȳ, σ=sqrt(model.ss/model.n)) + +@trait( + NormalEstimator, + constructor = NormalEstimator, + kinds_of_proxy = (Distribution(), Point(), ConfidenceInterval()), + tags = ("density estimation", "incremental algorithms"), + is_pure_julia = true, + human_name = "normal distribution estimator", + functions = ( + :(LearnAPI.fit), + :(LearnAPI.algorithm), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.update_observations), + :(LearnAPI.extras), + ), +) + +# ## Tests + +@testset "NormalEstimator" begin + rng = StableRNG(123) + y = rand(rng, 50); + ynew = rand(rng, 10); + algorithm = NormalEstimator() + model = fit(algorithm, y) + d = predict(model) + μ, σ = Distributions.params(d) + @test μ ≈ mean(y) + @test σ ≈ std(y)*sqrt(49/50) # `std` uses Bessel's correction + + # accessor function: + @test LearnAPI.extras(model) == (; μ, σ) + + # one-liner: + @test predict(algorithm, y) == d + @test predict(algorithm, Point(), y) ≈ μ + @test predict(algorithm, ConfidenceInterval(), y)[1] ≈ quantile(d, 0.025) + + # updating: + model = update_observations(model, ynew) + μ2, σ2 = LearnAPI.extras(model) + μ3, σ3 = LearnAPI.extras(fit(algorithm, vcat(y, ynew))) # training ab initio + @test μ2 ≈ μ3 + @test σ2 ≈ σ3 +end diff --git a/test/runtests.jl b/test/runtests.jl index 5385a731..63cdfe62 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -7,6 +7,7 @@ test_files = [ "patterns/regression.jl", "patterns/static_algorithms.jl", "patterns/ensembling.jl", + "patterns/incremental_algorithms.jl", ] files = isempty(ARGS) ? 
test_files : ARGS diff --git a/test/traits.jl b/test/traits.jl index a0c8a3d9..ab4cad1a 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -13,6 +13,9 @@ LearnAPI.algorithm(model::SmallAlgorithm) = model functions = ( :(LearnAPI.fit), :(LearnAPI.algorithm), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), ), ) ######## END OF IMPLEMENTATION ################## @@ -27,7 +30,7 @@ LearnAPI.algorithm(model::SmallAlgorithm) = model small = SmallAlgorithm() @test LearnAPI.constructor(small) == SmallAlgorithm -@test LearnAPI.functions(small) == (:(LearnAPI.fit), :(LearnAPI.algorithm)) +@test :(LearnAPI.algorithm) in LearnAPI.functions(small) @test isempty(LearnAPI.kinds_of_proxy(small)) @test isempty(LearnAPI.tags(small)) @test !LearnAPI.is_pure_julia(small) From 9b9e4d45a91954e14230b4073092208bc391170e Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 12 Oct 2024 11:13:34 +1300 Subject: [PATCH 105/187] typo --- docs/src/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/index.md b/docs/src/index.md index 36e361d2..959d1dc7 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -98,7 +98,7 @@ A key to enabling toolboxes to enhance LearnAPI.jl algorithm functionality is th implementation of two key additional methods, beyond the usual `fit` and `predict`/`transform`. Given any training `data` consumed by `fit` (such as `data = (X, y)` in the example above) [`LearnAPI.features(algorithm, data)`](@ref input) tells us what -part of `data` comprises *features*, which is something that can be passsed onto to +part of `data` comprises *features*, which is something that can be passed onto to `predict` or `transform` (`X` in the example) while [`LearnAPI.target(algorithm, data)`](@ref), if implemented, tells us what part comprises the target (`y` in the example). By explicitly requiring such methods, we free algorithms to consume data in From d82eaa5a4572a07120bce19e5586d9aa6ef888be Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 12 Oct 2024 12:02:15 +1300 Subject: [PATCH 106/187] add a test for predict and transform slurping fallbacks oops --- src/predict_transform.jl | 8 ++------ test/predict_transform.jl | 19 +++++++++++++++++++ test/runtests.jl | 1 + 3 files changed, 22 insertions(+), 6 deletions(-) create mode 100644 test/predict_transform.jl diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 39bff2a9..1fff01f3 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -4,10 +4,6 @@ function DOC_IMPLEMENTED_METHODS(name; overloaded=false) "[`LearnAPI.functions`](@ref) trait. " end -const OPERATIONS = (:predict, :transform, :inverse_transform) -const DOC_OPERATIONS_LIST_SYMBOL = join(map(op -> "`:$op`", OPERATIONS), ", ") -const DOC_OPERATIONS_LIST_FUNCTION = join(map(op -> "`LearnAPI.$op`", OPERATIONS), ", ") - DOC_MUTATION(op) = """ @@ -171,8 +167,8 @@ $(DOC_MUTATION(:transform)) $(DOC_DATA_INTERFACE(:transform)) """ -transform(model, data1, data2...; kwargs...) = - transform(model, (data1, datas...); kwargs...) # automatic slurping +transform(model, data1, data2, datas...; kwargs...) = + transform(model, (data1, data2, datas...); kwargs...) 
# automatic slurping """ inverse_transform(model, data) diff --git a/test/predict_transform.jl b/test/predict_transform.jl new file mode 100644 index 00000000..7a496115 --- /dev/null +++ b/test/predict_transform.jl @@ -0,0 +1,19 @@ +using Test +using LearnAPI + +struct Goose end + +LearnAPI.fit(algorithm::Goose) = Ref(algorithm) +LearnAPI.algorithm(::Base.RefValue{Goose}) = Goose() +LearnAPI.predict(::Base.RefValue{Goose}, ::Point, data) = sum(data) +LearnAPI.transform(::Base.RefValue{Goose}, data) = prod(data) +@trait Goose kinds_of_proxy = (Point(),) + +@testset "predict and transform argument slurping" begin + model = fit(Goose()) + @test predict(model, Point(), 2, 3, 4) == 9 + @test predict(model, 2, 3, 4) == 9 + @test transform(model, 2, 3, 4) == 24 +end + +true diff --git a/test/runtests.jl b/test/runtests.jl index 63cdfe62..9af76002 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -4,6 +4,7 @@ test_files = [ "tools.jl", "traits.jl", "clone.jl", + "predict_transform.jl", "patterns/regression.jl", "patterns/static_algorithms.jl", "patterns/ensembling.jl", From d4546c335336d205daef21e48aa9e04544cf8b73 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 12 Oct 2024 12:34:32 +1300 Subject: [PATCH 107/187] improve coverage --- src/traits.jl | 14 ++------------ test/accessor_functions.jl | 4 ++++ test/fit_update.jl | 14 ++++++++++++++ test/runtests.jl | 2 ++ test/traits.jl | 1 + 5 files changed, 23 insertions(+), 12 deletions(-) create mode 100644 test/accessor_functions.jl create mode 100644 test/fit_update.jl diff --git a/src/traits.jl b/src/traits.jl index 81ecf778..8c41a2a9 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -6,13 +6,6 @@ const DOC_UNKNOWN = "not overloaded the trait. " const DOC_ON_TYPE = "The value of the trait must depend only on the type of `algorithm`. " -DOC_ONLY_ONE(func) = - "Ordinarily, at most one of the following should be overloaded for given "* - "algorithm "* - "`LearnAPI.$(func)_scitype`, `LearnAPI.$(func)_type`, "* - "`LearnAPI.$(func)_observation_scitype`, "* - "`LearnAPI.$(func)_observation_type`." - const DOC_EXPLAIN_EACHOBS = """ @@ -82,7 +75,7 @@ value is non-empty. # New implementations -All new implementations must overload this trait. Here's a checklist for elements in the +All new implementations must implement this trait. Here's a checklist for elements in the return value: | expression | implementation compulsory? | include in returned tuple? | @@ -106,7 +99,6 @@ Also include any implemented accessor functions, both those owned by LearnaAPI.j algorithm-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST. """ -functions(::Any) = () functions() = ( :(LearnAPI.fit), :(LearnAPI.algorithm), @@ -195,8 +187,6 @@ tags() = [ "meta-algorithms" ] -const DOC_TAGS_LIST = join(map(d -> "`\"$d\"`", tags()), ", ") - """ LearnAPI.tags(algorithm) @@ -402,7 +392,7 @@ See also [`LearnAPI.target_observation_scitype`](@ref). # New implementations -Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit)) +Optional. The fallback return value is `Union{}`. 
""" fit_observation_scitype(::Any) = Union{} diff --git a/test/accessor_functions.jl b/test/accessor_functions.jl new file mode 100644 index 00000000..f22e73bb --- /dev/null +++ b/test/accessor_functions.jl @@ -0,0 +1,4 @@ +using Test +using LearnAPI + +@test strip("junk") == "junk" diff --git a/test/fit_update.jl b/test/fit_update.jl new file mode 100644 index 00000000..aa783432 --- /dev/null +++ b/test/fit_update.jl @@ -0,0 +1,14 @@ +using Test +using LearnAPI + +struct Gander end + +LearnAPI.update(::Gander, data) = sum(data) +LearnAPI.update_features(::Gander, data) = prod(data) + +@testset "update, update_features slurping" begin + @test update(Gander(), 2, 3, 4) == 9 + @test update_features(Gander(), 2, 3, 4) == 24 +end + +true diff --git a/test/runtests.jl b/test/runtests.jl index 2c66588d..a32dbecf 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -4,6 +4,8 @@ test_files = [ "tools.jl", "traits.jl", "clone.jl", + "fit_update.jl", + "accessor_functions.jl", "integration/regression.jl", "integration/static_algorithms.jl", "integration/ensembling.jl", diff --git a/test/traits.jl b/test/traits.jl index a0c8a3d9..39484fa4 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -41,6 +41,7 @@ small = SmallAlgorithm() @test LearnAPI.data_interface(small) == LearnAPI.RandomAccess() @test !(6 isa LearnAPI.fit_observation_scitype(small)) @test 6 isa LearnAPI.target_observation_scitype(small) +@test !LearnAPI.is_static(small) # DERIVED TRAITS From 893b138104a4b912c4700fdacbff9eb445c78fe0 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 12 Oct 2024 12:34:32 +1300 Subject: [PATCH 108/187] improve coverage --- src/traits.jl | 14 ++------------ test/accessor_functions.jl | 4 ++++ test/fit_update.jl | 14 ++++++++++++++ test/runtests.jl | 5 +++++ test/traits.jl | 1 + 5 files changed, 26 insertions(+), 12 deletions(-) create mode 100644 test/accessor_functions.jl create mode 100644 test/fit_update.jl diff --git a/src/traits.jl b/src/traits.jl index 614daf53..9b566120 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -6,13 +6,6 @@ const DOC_UNKNOWN = "not overloaded the trait. " const DOC_ON_TYPE = "The value of the trait must depend only on the type of `algorithm`. " -DOC_ONLY_ONE(func) = - "Ordinarily, at most one of the following should be overloaded for given "* - "algorithm "* - "`LearnAPI.$(func)_scitype`, `LearnAPI.$(func)_type`, "* - "`LearnAPI.$(func)_observation_scitype`, "* - "`LearnAPI.$(func)_observation_type`." - const DOC_EXPLAIN_EACHOBS = """ @@ -82,7 +75,7 @@ value is non-empty. # New implementations -All new implementations must overload this trait. Here's a checklist for elements in the +All new implementations must implement this trait. Here's a checklist for elements in the return value: | expression | implementation compulsory? | include in returned tuple? | @@ -107,7 +100,6 @@ algorithm-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCT (`LearnAPI.strip` is always included). """ -functions(::Any) = () functions() = ( :(LearnAPI.fit), :(LearnAPI.algorithm), @@ -196,8 +188,6 @@ tags() = [ "meta-algorithms" ] -const DOC_TAGS_LIST = join(map(d -> "`\"$d\"`", tags()), ", ") - """ LearnAPI.tags(algorithm) @@ -403,7 +393,7 @@ See also [`LearnAPI.target_observation_scitype`](@ref). # New implementations -Optional. The fallback return value is `Union{}`. $(DOC_ONLY_ONE(:fit)) +Optional. The fallback return value is `Union{}`. 
""" fit_observation_scitype(::Any) = Union{} diff --git a/test/accessor_functions.jl b/test/accessor_functions.jl new file mode 100644 index 00000000..f22e73bb --- /dev/null +++ b/test/accessor_functions.jl @@ -0,0 +1,4 @@ +using Test +using LearnAPI + +@test strip("junk") == "junk" diff --git a/test/fit_update.jl b/test/fit_update.jl new file mode 100644 index 00000000..aa783432 --- /dev/null +++ b/test/fit_update.jl @@ -0,0 +1,14 @@ +using Test +using LearnAPI + +struct Gander end + +LearnAPI.update(::Gander, data) = sum(data) +LearnAPI.update_features(::Gander, data) = prod(data) + +@testset "update, update_features slurping" begin + @test update(Gander(), 2, 3, 4) == 9 + @test update_features(Gander(), 2, 3, 4) == 24 +end + +true diff --git a/test/runtests.jl b/test/runtests.jl index 9af76002..c71b5e38 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -4,11 +4,16 @@ test_files = [ "tools.jl", "traits.jl", "clone.jl", + "fit_update.jl", + "accessor_functions.jl", "predict_transform.jl", "patterns/regression.jl", "patterns/static_algorithms.jl", "patterns/ensembling.jl", "patterns/incremental_algorithms.jl", + "patterns/regression.jl", + "patterns/static_algorithms.jl", + "integration/ensembling.jl", ] files = isempty(ARGS) ? test_files : ARGS diff --git a/test/traits.jl b/test/traits.jl index ab4cad1a..e6eaae45 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -44,6 +44,7 @@ small = SmallAlgorithm() @test LearnAPI.data_interface(small) == LearnAPI.RandomAccess() @test !(6 isa LearnAPI.fit_observation_scitype(small)) @test 6 isa LearnAPI.target_observation_scitype(small) +@test !LearnAPI.is_static(small) # DERIVED TRAITS From 92e5e3b1529c3a7f8b5ad0cfd339885fc7ea46e8 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 12 Oct 2024 12:41:01 +1300 Subject: [PATCH 109/187] fix duplication --- test/runtests.jl | 2 -- 1 file changed, 2 deletions(-) diff --git a/test/runtests.jl b/test/runtests.jl index 4d2aa289..9ef643f8 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -11,8 +11,6 @@ test_files = [ "patterns/static_algorithms.jl", "patterns/ensembling.jl", "patterns/incremental_algorithms.jl", - "patterns/regression.jl", - "patterns/static_algorithms.jl", ] files = isempty(ARGS) ? test_files : ARGS From 2bdfece441a5141867c98dc5ffa7deeab856b96d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 12 Oct 2024 12:59:59 +1300 Subject: [PATCH 110/187] update codecov to v4 --- .github/codecov.yml | 3 ++- .github/workflows/ci.yml | 6 ++++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/.github/codecov.yml b/.github/codecov.yml index ed9d9f1c..914690d9 100644 --- a/.github/codecov.yml +++ b/.github/codecov.yml @@ -3,6 +3,7 @@ coverage: project: default: threshold: 0.5% + removed_code_behavior: fully_covered_patch patch: default: - target: 80% \ No newline at end of file + target: 80%coverage: diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index ca263a9a..70e66dce 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -45,9 +45,11 @@ jobs: env: JULIA_NUM_THREADS: 2 - uses: julia-actions/julia-processcoverage@v1 - - uses: codecov/codecov-action@v1 + - uses: codecov/codecov-action@v4 with: - file: lcov.info + token: ${{ secrets.CODECOV_TOKEN }} + fail_ci_if_error: false + verbose: true docs: name: Documentation runs-on: ubuntu-latest From 45ee2a62bd4f9eefd68f32dab6af640597946e16 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sat, 12 Oct 2024 13:04:26 +1300 Subject: [PATCH 111/187] update codecov badge --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index b6b08900..e93ac8a2 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ A base Julia interface for machine learning and statistics [![Lifecycle:Maturing](https://img.shields.io/badge/Lifecycle-Maturing-007EC6)](ROADMAP.md) [![Build Status](https://github.com/JuliaAI/LearnAPI.jl/workflows/CI/badge.svg)](https://github.com/JuliaAI/LearnAPI.jl/actions) -[![Coverage](https://codecov.io/gh/JuliaAI/LearnAPI.jl/branch/master/graph/badge.svg)](https://codecov.io/github/JuliaAI/LearnAPI.jl?branch=master) +[![codecov](https://codecov.io/gh/JuliaAI/LearnAPI.jl/graph/badge.svg?token=9IWT9KYINZ)](https://codecov.io/gh/JuliaAI/LearnAPI.jl?branch=dev) [![Docs](https://img.shields.io/badge/docs-dev-blue.svg)](https://juliaai.github.io/LearnAPI.jl/dev/) Comprehensive documentation is [here](https://juliaai.github.io/LearnAPI.jl/dev/). From fd672ca7eb070d327c6827a41c0df4386e4d3bed Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 23:32:09 +1300 Subject: [PATCH 112/187] fix formatting mistake --- docs/src/common_implementation_patterns.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 7f7641a4..528501b8 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -2,6 +2,8 @@ !!! warning +!!! warn + This section is only an implementation guide. The definitive specification of the Learn API is given in [Reference](@ref reference). From 0605baad809273d92c37b5e4a0db4fc2accfeb7e Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 11 Oct 2024 23:32:54 +1300 Subject: [PATCH 113/187] ditto --- docs/src/common_implementation_patterns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 528501b8..c554ca45 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -2,7 +2,7 @@ !!! warning -!!! warn +!!! warning This section is only an implementation guide. The definitive specification of the Learn API is given in [Reference](@ref reference). From ac5419b58d97c6009fe5acc8a9528cefcc131fe3 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Mon, 14 Oct 2024 11:18:43 +1300 Subject: [PATCH 114/187] remove slurping fallback signatures --- docs/src/anatomy_of_an_implementation.md | 39 +++++++++++++++----- docs/src/fit_update.md | 20 +++++++++-- docs/src/predict_transform.md | 7 ++-- src/fit_update.jl | 23 ++++++------ src/predict_transform.jl | 45 ++++++++++++------------ test/fit_update.jl | 14 -------- test/patterns/ensembling.jl | 7 ++++ test/patterns/gradient_descent.jl | 16 +++++++-- test/patterns/regression.jl | 20 +++++++++++ test/predict_transform.jl | 19 ---------- test/runtests.jl | 2 -- 11 files changed, 123 insertions(+), 89 deletions(-) delete mode 100644 test/fit_update.jl delete mode 100644 test/predict_transform.jl diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index a41325d4..28de88a3 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -1,20 +1,27 @@ # Anatomy of an Implementation -This section explains a detailed implementation of the LearnAPI for naive [ridge +This section explains a detailed implementation of the LearnAPI.jl for naive [ridge regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also refer to the [demonstration](@ref workflow) of the implementation given later. -A transformer ordinarily implements `transform` instead of -`predict`. For more on `predict` versus `transform`, see [Predict or transform?](@ref) +The core LearnAPI.jl pattern looks like this: + +```julia +model = fit(algorithm, data) +predict(model, newdata) +``` + +A transformer ordinarily implements `transform` instead of `predict`. For more on +`predict` versus `transform`, see [Predict or transform?](@ref) !!! note New implementations of `fit`, `predict`, etc, - always have a *single* `data` argument, as in - `LearnAPI.fit(algorithm, data; verbosity=1) = ...`. - For convenience, user-calls, such as `fit(algorithm, X, y)`, automatically fallback - to `fit(algorithm, (X, y))`. + always have a *single* `data` argument as above. + For convenience, a signature such as `fit(algorithm, X, y)`, calling + `fit(algorithm, (X, y))`, can be added, but the LearnAPI.jl specification is + silent on the meaning or existence of signatures with extra arguments. !!! note @@ -52,7 +59,7 @@ nothing # hide Instances of `Ridge` will be [algorithms](@ref algorithms), in LearnAPI.jl parlance. -Associated with each new type of LearnAPI [algorithm](@ref algorithms) will be a keyword +Associated with each new type of LearnAPI.jl [algorithm](@ref algorithms) will be a keyword argument constructor, providing default values for all properties (struct fields) that are not other algorithms, and we must implement [`LearnAPI.constructor(algorithm)`](@ref), for recovering the constructor from an instance: @@ -244,6 +251,14 @@ in LearnAPI.functions(algorithm)`, for every instance `algorithm`. With [some exceptions](@ref trait_contract), the value of a trait should depend only on the *type* of the argument. +## Signatures added for convenience + +We add one `fit` signature for user-convenience only. The LearnAPI.jl specification has +nothing to say about `fit` signatures with more than two positional arguments. + +```@example anatomy +LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = fit(algorithm, (X, y); kwargs...) 
+``` ## [Demonstration](@id workflow) @@ -466,6 +481,14 @@ overload the trait, [`LearnAPI.data_interface(algorithm)`](@ref). See [Data interfaces](@ref data_interfaces) for details. +### Addition of signatures for user convenience + +As above, we add a signature which plays no role vis-à-vis LearnAPI.jl. + +```@exammple anatomy2 +LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = fit(algorithm, (X, y); kwargs...) +``` + ## Demonstration of an advanced `obs` workflow We now can train and predict using internal data representations, resampled using the diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index b49199c7..74ee1e0a 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -11,8 +11,6 @@ A "static" algorithm is one that does not generalize to new observations (e.g., clustering algorithms); there is no trainiing data and the algorithm is executed by `predict` or `transform` which receive the data. See example below. -When `fit` expects a tuple form of argument, `data = (X1, ..., Xn)`, then the signature -`fit(algorithm, X1, ..., Xn)` is also provided. ### Updating @@ -32,7 +30,7 @@ Supposing `Algorithm` is some supervised classifier type, with an iteration para ```julia algorithm = Algorithm(n=100) -model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` +model = fit(algorithm, (X, y)) # Predict probability distributions: ŷ = predict(model, Distribution(), Xnew) @@ -76,6 +74,22 @@ labels = predict(algorithm, X) LearnAPI.extras(model) ``` +### Density estimation + +In density estimation, `fit` consumes no features, only a target variable; `predict`, +which consumes no data, returns the learned density: + +```julia +model = fit(algorithm, y) # no features +predict(model) # shortcut for `predict(model, Distribution())` +``` + +A one-liner will typically be implemented as well: + +```julia +predict(algorithm, y) +``` + ## Implementation guide ### Training diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md index 1cf50f54..605ee27a 100644 --- a/docs/src/predict_transform.md +++ b/docs/src/predict_transform.md @@ -6,16 +6,15 @@ transform(model, data) inverse_transform(model, data) ``` -When a method expects a tuple form of argument, `data = (X1, ..., Xn)`, then a slurping -signature is also provided, as in `transform(model, X1, ..., Xn)`. - +Versions without the `data` argument may also appear, for example in [Density +estimation](@ref). ## [Typical worklows](@id predict_workflow) Train some supervised `algorithm`: ```julia -model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` +model = fit(algorithm, (X, y)) ``` Predict probability distributions: diff --git a/src/fit_update.jl b/src/fit_update.jl index b6801359..75ae5d6e 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -14,13 +14,10 @@ The second signature is provided by algorithms that do not generalize to new obs ..., data)` carries out the actual algorithm execution, writing any byproducts of that operation to the mutable object `model` returned by `fit`. -Whenever `fit` expects a tuple form of argument, `data = (X1, ..., Xn)`, then the -signature `fit(algorithm, X1, ..., Xn)` is also provided. 
- -For example, a supervised classifier will typically admit this workflow: +For example, a supervised classifier might have a workflow like this: ```julia -model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` +model = fit(algorithm, (X, y)) ŷ = predict(model, Xnew) ``` @@ -33,16 +30,16 @@ See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref), # New implementations -Implementation is compulsory. The signature must include `verbosity`. Fallbacks provide -the data slurping versions. A fallback for the first signature calls the second, ignoring -`data`: +Implementation of exactly one of the signatures is compulsory. If `fit(algorithm; +verbosity=1)` is implemented, then the trait [`LearnAPI.is_static`](@ref) must be +overloaded to return `true`. -```julia -fit(algorithm, data; kwargs...) = fit(algorithm; kwargs...) -``` +The signature must include `verbosity`. -If only the `fit(algorithm)` signature is expliclty implemented, then the trait -[`LearnAPI.is_static`](@ref) must be overloaded to return `true`. +The LearnAPI.jl specification has nothing to say regarding `fit` signatures with more than +two arguments. For convenience, for example, an algorithm is free to implement a slurping +signature, such as `fit(algorithm, X, y, extras...) = fit(algorithm, (X, y, extras...))` but +LearnAPI.jl does not guarantee such signatures are actually implemented. $(DOC_DATA_INTERFACE(:fit)) diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 1fff01f3..16c6cabf 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -13,12 +13,20 @@ DOC_MUTATION(op) = """ +DOC_SLURPING(op) = + """ + + An algorithm is free to implement `$op` signatures with additional positional + arguments (eg., data-slurping signatures) but LearnAPI.jl is silent about their + interpretation or existence. + + """ DOC_MINIMIZE(func) = """ - If, additionally, [`LearnAPI.strip(model)`](@ref) is overloaded, then the following identity - must hold: + If, additionally, [`LearnAPI.strip(model)`](@ref) is overloaded, then the following + identity must hold: ```julia $func(LearnAPI.strip(model), args...) = $func(model, args...) @@ -63,7 +71,7 @@ which lists all supported target proxies. The argument `model` is anything returned by a call of the form `fit(algorithm, ...)`. If `LearnAPI.features(LearnAPI.algorithm(model)) == nothing`, then argument `data` is -omitted. An example is density estimators. +omitted in both signatures. An example is density estimators. # Example @@ -79,10 +87,7 @@ See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref). # Extended help -If `predict` supports data in the form of a tuple `data = (X1, ..., Xn)`, then a slurping -signature is also provided, as in `predict(model, X1, ..., Xn)`. - -Note `predict ` does not mutate any argument, except in the special case +Note `predict ` must not mutate any argument, except in the special case `LearnAPI.is_static(algorithm) == true`. # New implementations @@ -90,9 +95,12 @@ Note `predict ` does not mutate any argument, except in the special case If there is no notion of a "target" variable in the LearnAPI.jl sense, or you need an operation with an inverse, implement [`transform`](@ref) instead. -Implementation is optional. Only the first signature is implemented, but each -`kind_of_proxy` that gets an implementation must be added to the list returned by -[`LearnAPI.kinds_of_proxy`](@ref). +Implementation is optional. 
Only the first signature (with or without the `data` argument) +is implemented, but each `kind_of_proxy` that gets an implementation must be added to the +list returned by [`LearnAPI.kinds_of_proxy`](@ref). + +If `data` is not present in the implemented signature (eg., for density estimators) then +[`LearnAPI.features(algorithm, data)`](@ref) must return `nothing`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.predict)")) @@ -106,23 +114,12 @@ $(DOC_DATA_INTERFACE(:predict)) predict(model, data) = predict(model, kinds_of_proxy(algorithm(model)) |> first, data) predict(model) = predict(model, kinds_of_proxy(algorithm(model)) |> first) -# automatic slurping of multiple data arguments: -predict(model, k::KindOfProxy, data1, data2, datas...; kwargs...) = - predict(model, k, (data1, data2, datas...); kwargs...) -predict(model, data1, data2, datas...; kwargs...) = - predict(model, (data1, data2, datas...); kwargs...) - - - """ transform(model, data) Return a transformation of some `data`, using some `model`, as returned by [`fit`](@ref). -For `data` that consists of a tuple, a slurping version is also provided, i.e., you can do -`transform(model, X1, X2, X3)` in place of `transform(model, (X1, X2, X3))`. - # Example Below, `X` and `Xnew` are data of the same form. @@ -157,8 +154,10 @@ See also [`fit`](@ref), [`predict`](@ref), # New implementations -Implementation for new LearnAPI.jl algorithms is optional. A fallback provides the -slurping version. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.transform)")) +Implementation for new LearnAPI.jl algorithms is +optional. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.transform)")) + +$(DOC_SLURPING(:transform)) $(DOC_MINIMIZE(:transform)) diff --git a/test/fit_update.jl b/test/fit_update.jl deleted file mode 100644 index aa783432..00000000 --- a/test/fit_update.jl +++ /dev/null @@ -1,14 +0,0 @@ -using Test -using LearnAPI - -struct Gander end - -LearnAPI.update(::Gander, data) = sum(data) -LearnAPI.update_features(::Gander, data) = prod(data) - -@testset "update, update_features slurping" begin - @test update(Gander(), 2, 3, 4) == 9 - @test update_features(Gander(), 2, 3, 4) == 24 -end - -true diff --git a/test/patterns/ensembling.jl b/test/patterns/ensembling.jl index bcebacaa..8571818a 100644 --- a/test/patterns/ensembling.jl +++ b/test/patterns/ensembling.jl @@ -160,6 +160,13 @@ LearnAPI.strip(model::EnsembleFitted) = EnsembleFitted( ) ) +# convenience method: +LearnAPI.fit(algorithm::Ensemble, X, y, extras...; kwargs...) = + fit(algorithm, (X, y, extras...); kwargs...) +LearnAPI.update(algorithm::Ensemble, X, y, extras...; kwargs...) = + update(algorithm, (X, y, extras...); kwargs...) + + # synthetic test data: N = 10 # number of observations train = 1:6 diff --git a/test/patterns/gradient_descent.jl b/test/patterns/gradient_descent.jl index ff41782b..19f0d363 100644 --- a/test/patterns/gradient_descent.jl +++ b/test/patterns/gradient_descent.jl @@ -227,9 +227,9 @@ function LearnAPI.update_observations( ) # unpack data: - X = observations.X - y_hot = observations.y_hot - classes = observations.classes + X = observations_new.X + y_hot = observations_new.y_hot + classes = observations_new.classes nclasses = length(classes) classes == model.classes || error("New training target has incompatible classes.") @@ -328,6 +328,16 @@ LearnAPI.training_losses(model::PerceptronClassifierFitted) = model.losses ) +# ### Convenience methods + +LearnAPI.fit(algorithm::PerceptronClassifier, X, y; kwargs...) = + fit(algorithm, (X, y); kwargs...) 
+LearnAPI.update_observations(algorithm::PerceptronClassifier, X, y; kwargs...) = + update_observations(algorithm, (X, y); kwargs...) +LearnAPI.update(algorithm::PerceptronClassifier, X, y; kwargs...) = + update(algorithm, (X, y); kwargs...) + + # ## Tests # synthetic test data: diff --git a/test/patterns/regression.jl b/test/patterns/regression.jl index 4bcc9fe1..35376519 100644 --- a/test/patterns/regression.jl +++ b/test/patterns/regression.jl @@ -10,6 +10,9 @@ import DataFrames # We overload `obs` to expose internal representation of data. See later for a simpler # variation using the `obs` fallback. + +# ## Implementation + # no docstring here - that goes with the constructor struct Ridge lambda::Float64 @@ -117,6 +120,13 @@ LearnAPI.strip(model::RidgeFitted) = ) ) +# convenience method: +LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = + fit(algorithm, (X, y); kwargs...) + + +# ## Tests + # synthetic test data: n = 30 # number of observations train = 1:6 @@ -190,6 +200,9 @@ struct BabyRidge lambda::Float64 end + +# ## Implementation + """ BabyRidge(; lambda=0.1) @@ -250,6 +263,13 @@ LearnAPI.strip(model::BabyRidgeFitted) = ) ) +# convenience method: +LearnAPI.fit(algorithm::BabyRidge, X, y; kwargs...) = + fit(algorithm, (X, y); kwargs...) + + +# ## Tests + @testset "test a variation which does not overload LearnAPI.obs" begin algorithm = BabyRidge(lambda=0.5) diff --git a/test/predict_transform.jl b/test/predict_transform.jl deleted file mode 100644 index 7a496115..00000000 --- a/test/predict_transform.jl +++ /dev/null @@ -1,19 +0,0 @@ -using Test -using LearnAPI - -struct Goose end - -LearnAPI.fit(algorithm::Goose) = Ref(algorithm) -LearnAPI.algorithm(::Base.RefValue{Goose}) = Goose() -LearnAPI.predict(::Base.RefValue{Goose}, ::Point, data) = sum(data) -LearnAPI.transform(::Base.RefValue{Goose}, data) = prod(data) -@trait Goose kinds_of_proxy = (Point(),) - -@testset "predict and transform argument slurping" begin - model = fit(Goose()) - @test predict(model, Point(), 2, 3, 4) == 9 - @test predict(model, 2, 3, 4) == 9 - @test transform(model, 2, 3, 4) == 24 -end - -true diff --git a/test/runtests.jl b/test/runtests.jl index 9ef643f8..f6210235 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -4,9 +4,7 @@ test_files = [ "tools.jl", "traits.jl", "clone.jl", - "fit_update.jl", "accessor_functions.jl", - "predict_transform.jl", "patterns/regression.jl", "patterns/static_algorithms.jl", "patterns/ensembling.jl", From c71298e8bde2ab244c427f242751f77af0955e58 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 14 Oct 2024 11:40:49 +1300 Subject: [PATCH 115/187] fix some mistakes --- src/fit_update.jl | 14 +++++--------- src/predict_transform.jl | 4 ++-- test/patterns/ensembling.jl | 2 +- test/patterns/static_algorithms.jl | 1 - 4 files changed, 8 insertions(+), 13 deletions(-) diff --git a/src/fit_update.jl b/src/fit_update.jl index 75ae5d6e..de022826 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -44,10 +44,8 @@ LearnAPI.jl does not guarantee such signatures are actually implemented. $(DOC_DATA_INTERFACE(:fit)) """ -fit(algorithm, data; kwargs...) = - fit(algorithm; kwargs...) -fit(algorithm, data1, datas...; kwargs...) = - fit(algorithm, (data1, datas...); kwargs...) +function fit end + # # UPDATE AND COUSINS @@ -88,7 +86,7 @@ Implementation is optional. The signature must include See also [`LearnAPI.clone`](@ref) """ -update(model, data1, datas...; kwargs...) = update(model, (data1, datas...); kwargs...) 
+function update end """ update_observations(model, new_data; verbosity=1, parameter_replacements...) @@ -124,8 +122,7 @@ Implementation is optional. The signature must include See also [`LearnAPI.clone`](@ref). """ -update_observations(algorithm, data1, datas...; kwargs...) = - update_observations(algorithm, (data1, datas...); kwargs...) +function update_observations end """ update_features(model, new_data; verbosity=1, parameter_replacements...) @@ -151,5 +148,4 @@ Implementation is optional. The signature must include See also [`LearnAPI.clone`](@ref). """ -update_features(algorithm, data1, datas...; kwargs...) = - update_features(algorithm, (data1, datas...); kwargs...) +function update_features end diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 16c6cabf..9e773192 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -166,8 +166,8 @@ $(DOC_MUTATION(:transform)) $(DOC_DATA_INTERFACE(:transform)) """ -transform(model, data1, data2, datas...; kwargs...) = - transform(model, (data1, data2, datas...); kwargs...) # automatic slurping +function transform end + """ inverse_transform(model, data) diff --git a/test/patterns/ensembling.jl b/test/patterns/ensembling.jl index 8571818a..ad348e4a 100644 --- a/test/patterns/ensembling.jl +++ b/test/patterns/ensembling.jl @@ -163,7 +163,7 @@ LearnAPI.strip(model::EnsembleFitted) = EnsembleFitted( # convenience method: LearnAPI.fit(algorithm::Ensemble, X, y, extras...; kwargs...) = fit(algorithm, (X, y, extras...); kwargs...) -LearnAPI.update(algorithm::Ensemble, X, y, extras...; kwargs...) = +LearnAPI.update(algorithm::EnsembleFitted, X, y, extras...; kwargs...) = update(algorithm, (X, y, extras...); kwargs...) diff --git a/test/patterns/static_algorithms.jl b/test/patterns/static_algorithms.jl index 21b43738..5a4c277f 100644 --- a/test/patterns/static_algorithms.jl +++ b/test/patterns/static_algorithms.jl @@ -56,7 +56,6 @@ end X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w]) model = fit(algorithm) # no data arguments! # if provided, data is ignored: - @test fit(algorithm, "junk")[] == model[] @test LearnAPI.algorithm(model) == algorithm W = transform(model, X) @test W == DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w]) From 6242737794086e96d5f6d005d11fe0b9666f838d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 14 Oct 2024 11:43:48 +1300 Subject: [PATCH 116/187] typo --- docs/src/anatomy_of_an_implementation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 28de88a3..319e98ed 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -485,7 +485,7 @@ interfaces](@ref data_interfaces) for details. As above, we add a signature which plays no role vis-à-vis LearnAPI.jl. -```@exammple anatomy2 +```@example anatomy2 LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = fit(algorithm, (X, y); kwargs...) ``` From 76a3df8cfb261b8393f3e4ad30974f6afffdfdad Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 14 Oct 2024 12:36:17 +1300 Subject: [PATCH 117/187] doc tweaks --- docs/src/index.md | 20 +++++++++----------- src/fit_update.jl | 7 +++++++ 2 files changed, 16 insertions(+), 11 deletions(-) diff --git a/docs/src/index.md b/docs/src/index.md index 959d1dc7..2066e774 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -92,17 +92,15 @@ then overloading `obs` is completely optional. 
Plain iteration interfaces, with knowledge of the number of observations, can also be specified (to support, e.g., data loaders reading images from disk). -## Hooks for adding functionality - -A key to enabling toolboxes to enhance LearnAPI.jl algorithm functionality is the -implementation of two key additional methods, beyond the usual `fit` and -`predict`/`transform`. Given any training `data` consumed by `fit` (such as `data = (X, -y)` in the example above) [`LearnAPI.features(algorithm, data)`](@ref input) tells us what -part of `data` comprises *features*, which is something that can be passed onto to -`predict` or `transform` (`X` in the example) while [`LearnAPI.target(algorithm, -data)`](@ref), if implemented, tells us what part comprises the target (`y` in the -example). By explicitly requiring such methods, we free algorithms to consume data in -multiple forms, including optimised, algorithm-specific forms, as described above. +## Hooks adding algorithm-generic functonality + +Given any training `data` consumed by `fit` (such as `data = (X, y)` in the example above) +[`LearnAPI.features(algorithm, data)`](@ref input) tells us what part of `data` comprises +*features*, which is something that can be passed onto to `predict` or `transform` (`X` in +the example) while [`LearnAPI.target(algorithm, data)`](@ref), if implemented, tells us +what part comprises the target (`y` in the example). By explicitly requiring such methods, +LearnAPI.jl frees algorithms to consume data in multiple forms, including optimised, +algorithm-specific forms, as described above. ## Learning more diff --git a/src/fit_update.jl b/src/fit_update.jl index de022826..24c7140d 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -36,6 +36,13 @@ overloaded to return `true`. The signature must include `verbosity`. +If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentation, then +[`LearnAPI.target(data)`] must be overloaded to return it. If [`predict`](@ref) or +[`transform`](@ref) are implemented and consume data, then +[`LearnAPI.features(data)`](@ref) must return something that can be passed as data to +these methods. A fallback returns `first(data)` if `data` is a tuple, and `data` +otherwise`. + The LearnAPI.jl specification has nothing to say regarding `fit` signatures with more than two arguments. For convenience, for example, an algorithm is free to implement a slurping signature, such as `fit(algorithm, X, y, extras...) = fit(algorithm, (X, y, extras...))` but From 36112adbc4ec4466d97b09bf13c206b0f794989d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 14 Oct 2024 13:40:17 +1300 Subject: [PATCH 118/187] minor fixes --- docs/src/fit_update.md | 9 ++++++++- src/fit_update.jl | 2 +- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index 74ee1e0a..a6125f0d 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -43,6 +43,8 @@ model = update(model; n=150) predict(model, Distribution(), X) ``` +See also [Classification](@ref) and [Regression](@ref). 
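+
+To make the semantics of `update` concrete: when only the iteration parameter changes,
+and the data is unchanged, the warm restart and retraining ab initio are semantically
+equivalent (a sketch, with `Algorithm`, `X` and `y` as above):
+
+```julia
+model = update(fit(Algorithm(n=100), (X, y)), (X, y); n=150)  # warm restart
+model = fit(Algorithm(n=150), (X, y))                         # same result, from scratch
+```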
+ ### Tranformers A dimension-reducing transformer, `algorithm` might be used in this way: @@ -74,6 +76,8 @@ labels = predict(algorithm, X) LearnAPI.extras(model) ``` +See also [Static Algorithms](@ref) + ### Density estimation In density estimation, `fit` consumes no features, only a target variable; `predict`, @@ -81,7 +85,7 @@ which consumes no data, returns the learned density: ```julia model = fit(algorithm, y) # no features -predict(model) # shortcut for `predict(model, Distribution())` +predict(model) # shortcut for `predict(model, Distribution())`, or similar ``` A one-liner will typically be implemented as well: @@ -90,6 +94,9 @@ A one-liner will typically be implemented as well: predict(algorithm, y) ``` +See also [Density Estimation](@ref). + + ## Implementation guide ### Training diff --git a/src/fit_update.jl b/src/fit_update.jl index 24c7140d..d91bc4e9 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -37,7 +37,7 @@ overloaded to return `true`. The signature must include `verbosity`. If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentation, then -[`LearnAPI.target(data)`] must be overloaded to return it. If [`predict`](@ref) or +[`LearnAPI.target(data)`](@ref) must be overloaded to return it. If [`predict`](@ref) or [`transform`](@ref) are implemented and consume data, then [`LearnAPI.features(data)`](@ref) must return something that can be passed as data to these methods. A fallback returns `first(data)` if `data` is a tuple, and `data` From a6bc77eaf44bf234aa6132bb54257ca052a0f763 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 14 Oct 2024 13:42:47 +1300 Subject: [PATCH 119/187] readme tweak --- README.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index e93ac8a2..39238486 100644 --- a/README.md +++ b/README.md @@ -13,10 +13,15 @@ New contributions welcome. See the [road map](ROADMAP.md). ## Code snippet -Configure a learning algorithm, and inspect available functionality: +Configure a learning algorithm: ```julia julia> algorithm = Ridge(lambda=0.1) +``` + +Inspect available functionality: + +``` julia> LearnAPI.functions(algorithm) (:(LearnAPI.fit), :(LearnAPI.algorithm), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), :(LearnAPI.predict), :(LearnAPI.coefficients)) From 6771a69add09c87fcf8cf98e14a95a48563c6552 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Mon, 14 Oct 2024 13:46:41 +1300 Subject: [PATCH 120/187] fix kind of proxy for density estimation examples --- docs/src/fit_update.md | 2 +- test/patterns/incremental_algorithms.jl | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index a6125f0d..1a87fc9d 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -85,7 +85,7 @@ which consumes no data, returns the learned density: ```julia model = fit(algorithm, y) # no features -predict(model) # shortcut for `predict(model, Distribution())`, or similar +predict(model) # shortcut for `predict(model, SingleDistribution())`, or similar ``` A one-liner will typically be implemented as well: diff --git a/test/patterns/incremental_algorithms.jl b/test/patterns/incremental_algorithms.jl index 71f9bb26..ff1a0352 100644 --- a/test/patterns/incremental_algorithms.jl +++ b/test/patterns/incremental_algorithms.jl @@ -70,25 +70,25 @@ end LearnAPI.features(::NormalEstimator, y) = nothing LearnAPI.target(::NormalEstimator, y) = y -LearnAPI.predict(model::NormalEstimatorFitted, ::Distribution) = +LearnAPI.predict(model::NormalEstimatorFitted, ::SingleDistribution) = Distributions.Normal(model.ȳ, sqrt(model.ss/model.n)) LearnAPI.predict(model::NormalEstimatorFitted, ::Point) = model.ȳ function LearnAPI.predict(model::NormalEstimatorFitted, ::ConfidenceInterval) - d = predict(model, Distribution()) + d = predict(model, SingleDistribution()) return (quantile(d, 0.025), quantile(d, 0.975)) end # for fit and predict in one line: LearnAPI.predict(::NormalEstimator, k::LearnAPI.KindOfProxy, y) = predict(fit(NormalEstimator(), y), k) -LearnAPI.predict(::NormalEstimator, y) = predict(NormalEstimator(), Distribution(), y) +LearnAPI.predict(::NormalEstimator, y) = predict(NormalEstimator(), SingleDistribution(), y) LearnAPI.extras(model::NormalEstimatorFitted) = (μ=model.ȳ, σ=sqrt(model.ss/model.n)) @trait( NormalEstimator, constructor = NormalEstimator, - kinds_of_proxy = (Distribution(), Point(), ConfidenceInterval()), + kinds_of_proxy = (SingleDistribution(), Point(), ConfidenceInterval()), tags = ("density estimation", "incremental algorithms"), is_pure_julia = true, human_name = "normal distribution estimator", From cf567287e99e733180696ff36bbb83b7a5ea0362 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 14 Oct 2024 13:49:21 +1300 Subject: [PATCH 121/187] drop some fluff from docs/index.md --- docs/src/index.md | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/docs/src/index.md b/docs/src/index.md index 2066e774..713932cf 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -92,16 +92,6 @@ then overloading `obs` is completely optional. Plain iteration interfaces, with knowledge of the number of observations, can also be specified (to support, e.g., data loaders reading images from disk). -## Hooks adding algorithm-generic functonality - -Given any training `data` consumed by `fit` (such as `data = (X, y)` in the example above) -[`LearnAPI.features(algorithm, data)`](@ref input) tells us what part of `data` comprises -*features*, which is something that can be passed onto to `predict` or `transform` (`X` in -the example) while [`LearnAPI.target(algorithm, data)`](@ref), if implemented, tells us -what part comprises the target (`y` in the example). By explicitly requiring such methods, -LearnAPI.jl frees algorithms to consume data in multiple forms, including optimised, -algorithm-specific forms, as described above. 
- ## Learning more - [Anatomy of an Implementation](@ref): informal introduction to the main actors in a new From f014c8d67d5186dcc312c6ade607e6929cef5151 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 16 Oct 2024 10:12:01 +1300 Subject: [PATCH 122/187] doc and test tweaks --- docs/src/fit_update.md | 2 +- src/fit_update.jl | 43 +++++++++++++++++++------------------- src/predict_transform.jl | 15 +++++-------- test/accessor_functions.jl | 4 +++- 4 files changed, 31 insertions(+), 33 deletions(-) diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index 1a87fc9d..a0486bdb 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -45,7 +45,7 @@ predict(model, Distribution(), X) See also [Classification](@ref) and [Regression](@ref). -### Tranformers +### Transformers A dimension-reducing transformer, `algorithm` might be used in this way: diff --git a/src/fit_update.jl b/src/fit_update.jl index d91bc4e9..96407651 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -9,11 +9,6 @@ returning an object, `model`, on which other methods, such as [`predict`](@ref) [`transform`](@ref), can be dispatched. [`LearnAPI.functions(algorithm)`](@ref) returns a list of methods that can be applied to either `algorithm` or `model`. -The second signature is provided by algorithms that do not generalize to new observations -(called *static algorithms*). In that case, `transform(model, data)` or `predict(model, -..., data)` carries out the actual algorithm execution, writing any byproducts of that -operation to the mutable object `model` returned by `fit`. - For example, a supervised classifier might have a workflow like this: ```julia @@ -21,6 +16,12 @@ model = fit(algorithm, (X, y)) ŷ = predict(model, Xnew) ``` +The second signature, with `data` omitted, is provided by algorithms that do not +generalize to new observations (called *static algorithms*). In that case, +`transform(model, data)` or `predict(model, ..., data)` carries out the actual algorithm +execution, writing any byproducts of that operation to the mutable object `model` returned +by `fit`. + Use `verbosity=0` for warnings only, and `-1` for silent training. See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref), @@ -41,7 +42,7 @@ If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentat [`transform`](@ref) are implemented and consume data, then [`LearnAPI.features(data)`](@ref) must return something that can be passed as data to these methods. A fallback returns `first(data)` if `data` is a tuple, and `data` -otherwise`. +otherwise. The LearnAPI.jl specification has nothing to say regarding `fit` signatures with more than two arguments. For convenience, for example, an algorithm is free to implement a slurping @@ -63,16 +64,6 @@ Return an updated version of the `model` object returned by a previous [`fit`](@ `update` call, but with the specified hyperparameter replacements, in the form `p1=value1, p2=value2, ...`. -Provided that `data` is identical with the data presented in a preceding `fit` call *and* -there is at most one hyperparameter replacement, as in the example below, execution is -semantically equivalent to the call `fit(algorithm, data)`, where `algorithm` is -`LearnAPI.algorithm(model)` with the specified replacements. In some cases (typically, -when changing an iteration parameter) there may be a performance benefit to using `update` -instead of retraining ab initio. 
- -If `data` differs from that in the preceding `fit` or `update` call, or there is more than -one hyperparameter replacement, then behaviour is algorithm-specific. - ```julia algorithm = MyForest(ntrees=100) @@ -83,6 +74,16 @@ model = fit(algorithm, data) model = update(model, data; ntrees=150) ``` +Provided that `data` is identical with the data presented in a preceding `fit` call *and* +there is at most one hyperparameter replacement, as in the above example, execution is +semantically equivalent to the call `fit(algorithm, data)`, where `algorithm` is +`LearnAPI.algorithm(model)` with the specified replacements. In some cases (typically, +when changing an iteration parameter) there may be a performance benefit to using `update` +instead of retraining ab initio. + +If `data` differs from that in the preceding `fit` or `update` call, or there is more than +one hyperparameter replacement, then behaviour is algorithm-specific. + See also [`fit`](@ref), [`update_observations`](@ref), [`update_features`](@ref). # New implementations @@ -102,11 +103,6 @@ Return an updated version of the `model` object returned by a previous [`fit`](@ `update` call given the new observations present in `new_data`. One may additionally specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`. -When following the call `fit(algorithm, data)`, the `update` call is semantically -equivalent to retraining ab initio using a concatenation of `data` and `new_data`, -*provided there are no hyperparameter replacements.* Behaviour is otherwise -algorithm-specific. - ```julia-repl algorithm = MyNeuralNetwork(epochs=10, learning_rate=0.01) @@ -117,6 +113,11 @@ model = fit(algorithm, data) model = update_observations(model, new_data; epochs=2, learning_rate=0.1) ``` +When following the call `fit(algorithm, data)`, the `update` call is semantically +equivalent to retraining ab initio using a concatenation of `data` and `new_data`, +*provided there are no hyperparameter replacements* (which rules out the example +above). Behaviour is otherwise algorithm-specific. + See also [`fit`](@ref), [`update`](@ref), [`update_features`](@ref). # Extended help diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 9e773192..726f263f 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -64,6 +64,11 @@ and probability density/mass functions if `kind_of_proxy = Distribution()`. List options with [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), where `algorithm = LearnAPI.algorithm(model)`. +```julia +model = fit(algorithm, (X, y)) +predict(model, Point(), Xnew) +``` + The shortcut `predict(model, data)` calls the first method with an algorithm-specific `kind_of_proxy`, namely the first element of [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), which lists all supported target proxies. @@ -73,16 +78,6 @@ The argument `model` is anything returned by a call of the form `fit(algorithm, If `LearnAPI.features(LearnAPI.algorithm(model)) == nothing`, then argument `data` is omitted in both signatures. An example is density estimators. -# Example - -In the following, `algorithm` is some supervised learning algorithm with -training features `X`, training target `y`, and test features `Xnew`: - -```julia -model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` -predict(model, Point(), Xnew) -``` - See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref). 
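For instance, if `Distribution()` happens to head that list (an assumption made here
purely for illustration), then the following two calls dispatch to the same
implementation:

```julia
predict(model, Xnew)
predict(model, Distribution(), Xnew)
```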
# Extended help diff --git a/test/accessor_functions.jl b/test/accessor_functions.jl index f22e73bb..0ee61e55 100644 --- a/test/accessor_functions.jl +++ b/test/accessor_functions.jl @@ -1,4 +1,6 @@ using Test using LearnAPI -@test strip("junk") == "junk" +@test LearnAPI.strip("junk") == "junk" + +true From 8a00b3bc5cc5ab8bd47ff12af5658c5c14de3615 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 16 Oct 2024 10:26:51 +1300 Subject: [PATCH 123/187] fix codecov.yml --- .github/codecov.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/codecov.yml b/.github/codecov.yml index 914690d9..c62bedf5 100644 --- a/.github/codecov.yml +++ b/.github/codecov.yml @@ -6,4 +6,4 @@ coverage: removed_code_behavior: fully_covered_patch patch: default: - target: 80%coverage: + target: 80 From e9f863b64fa4b64d7d2e107ca8a6938537adf5c6 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 16 Oct 2024 12:43:17 +1300 Subject: [PATCH 124/187] add test add test --- test/runtests.jl | 1 + test/target_features.jl | 11 +++++++++++ 2 files changed, 12 insertions(+) create mode 100644 test/target_features.jl diff --git a/test/runtests.jl b/test/runtests.jl index f6210235..47411cb2 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -5,6 +5,7 @@ test_files = [ "traits.jl", "clone.jl", "accessor_functions.jl", + "target_features.jl", "patterns/regression.jl", "patterns/static_algorithms.jl", "patterns/ensembling.jl", diff --git a/test/target_features.jl b/test/target_features.jl new file mode 100644 index 00000000..718d21a3 --- /dev/null +++ b/test/target_features.jl @@ -0,0 +1,11 @@ +using Test +using LearnAPI + +struct Avacado end + +@test isnothing(LearnAPI.target(Avacado(), "salsa")) +@test isnothing(LearnAPI.weights(Avacado(), "salsa")) +@test LearnAPI.features(Avacado(), "salsa") == "salsa" +@test LearnAPI.features(Avacado(), (:X, :y)) == :X + +true From d48d4173b6de76e12b09507c748cdea2b33f6dd9 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 16 Oct 2024 13:23:06 +1300 Subject: [PATCH 125/187] typos --- test/target_features.jl | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/test/target_features.jl b/test/target_features.jl index 718d21a3..b84ded25 100644 --- a/test/target_features.jl +++ b/test/target_features.jl @@ -1,11 +1,11 @@ using Test using LearnAPI -struct Avacado end +struct Avocado end -@test isnothing(LearnAPI.target(Avacado(), "salsa")) -@test isnothing(LearnAPI.weights(Avacado(), "salsa")) -@test LearnAPI.features(Avacado(), "salsa") == "salsa" -@test LearnAPI.features(Avacado(), (:X, :y)) == :X +@test isnothing(LearnAPI.target(Avocado(), "salsa")) +@test isnothing(LearnAPI.weights(Avocado(), "salsa")) +@test LearnAPI.features(Avocado(), "salsa") == "salsa" +@test LearnAPI.features(Avocado(), (:X, :y)) == :X true From 01245baf5d0a7c577c06b9a5eac4087b3f97aa72 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Fri, 18 Oct 2024 15:49:24 +1300 Subject: [PATCH 126/187] use "learner" instead of "algorithm" --- docs/make.jl | 2 +- docs/src/accessor_functions.md | 12 +- docs/src/anatomy_of_an_implementation.md | 123 +++++++++-------- docs/src/common_implementation_patterns.md | 4 +- docs/src/fit_update.md | 42 +++--- docs/src/index.md | 6 +- docs/src/obs.md | 32 ++--- docs/src/predict_transform.md | 38 +++--- docs/src/reference.md | 100 +++++++------- docs/src/target_weights_features.md | 16 +-- docs/src/traits.md | 68 ++++----- src/accessor_functions.jl | 74 +++++----- src/clone.jl | 16 +-- src/fit_update.jl | 47 +++---- src/obs.jl | 36 ++--- src/predict_transform.jl | 58 ++++---- src/target_weights_features.jl | 28 ++-- src/tools.jl | 8 +- src/traits.jl | 152 ++++++++++----------- src/types.jl | 30 ++-- test/patterns/ensembling.jl | 86 ++++++------ test/patterns/gradient_descent.jl | 106 +++++++------- test/patterns/incremental_algorithms.jl | 18 +-- test/patterns/regression.jl | 86 ++++++------ test/patterns/static_algorithms.jl | 50 +++---- test/traits.jl | 24 ++-- 26 files changed, 639 insertions(+), 623 deletions(-) diff --git a/docs/make.jl b/docs/make.jl index 77405bc2..86525b98 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -21,7 +21,7 @@ makedocs( "target/weights/features" => "target_weights_features.md", "obs" => "obs.md", "Accessor Functions" => "accessor_functions.md", - "Algorithm Traits" => "traits.md", + "Learner Traits" => "traits.md", ], "Common Implementation Patterns" => "common_implementation_patterns.md", "Testing an Implementation" => "testing_an_implementation.md", diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md index 68adab31..cba1a91c 100644 --- a/docs/src/accessor_functions.md +++ b/docs/src/accessor_functions.md @@ -1,10 +1,10 @@ # [Accessor Functions](@id accessor_functions) The sole argument of an accessor function is the output, `model`, of -[`fit`](@ref). Algorithms are free to implement any number of these, or none of them. Only +[`fit`](@ref). Learners are free to implement any number of these, or none of them. Only `LearnAPI.strip` has a fallback, namely the identity. -- [`LearnAPI.algorithm(model)`](@ref) +- [`LearnAPI.learner(model)`](@ref) - [`LearnAPI.extras(model)`](@ref) - [`LearnAPI.strip(model)`](@ref) - [`LearnAPI.coefficients(model)`](@ref) @@ -18,12 +18,12 @@ The sole argument of an accessor function is the output, `model`, of - [`LearnAPI.training_scores(model)`](@ref) - [`LearnAPI.components(model)`](@ref) -Algorithm-specific accessor functions may also be implemented. The names of all accessor -functions are included in the list returned by [`LearnAPI.functions(algorithm)`](@ref). +Learner-specific accessor functions may also be implemented. The names of all accessor +functions are included in the list returned by [`LearnAPI.functions(learner)`](@ref). ## Implementation guide -All new implementations must implement [`LearnAPI.algorithm`](@ref). While, all others are +All new implementations must implement [`LearnAPI.learner`](@ref). While, all others are optional, any implemented accessor functions must be added to the list returned by [`LearnAPI.functions`](@ref). 
@@ -31,7 +31,7 @@ optional, any implemented accessor functions must be added to the list returned ## Reference ```@docs -LearnAPI.algorithm +LearnAPI.learner LearnAPI.extras LearnAPI.strip LearnAPI.coefficients diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 319e98ed..4c36ae2d 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -8,10 +8,12 @@ refer to the [demonstration](@ref workflow) of the implementation given later. The core LearnAPI.jl pattern looks like this: ```julia -model = fit(algorithm, data) +model = fit(learner, data) predict(model, newdata) ``` +Here `learner` specifies hyperparameters, while `model` stores learned parameters and any byproducts of algorithm execution. + A transformer ordinarily implements `transform` instead of `predict`. For more on `predict` versus `transform`, see [Predict or transform?](@ref) @@ -19,8 +21,8 @@ A transformer ordinarily implements `transform` instead of `predict`. For more o New implementations of `fit`, `predict`, etc, always have a *single* `data` argument as above. - For convenience, a signature such as `fit(algorithm, X, y)`, calling - `fit(algorithm, (X, y))`, can be added, but the LearnAPI.jl specification is + For convenience, a signature such as `fit(learner, X, y)`, calling + `fit(learner, (X, y))`, can be added, but the LearnAPI.jl specification is silent on the meaning or existence of signatures with extra arguments. !!! note @@ -28,7 +30,8 @@ A transformer ordinarily implements `transform` instead of `predict`. For more o If the `data` object consumed by `fit`, `predict`, or `transform` is not not a suitable table¹, array³, tuple of tables and arrays, or some other object implementing - the MLUtils.jl `getobs`/`numobs` interface, + the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) + `getobs`/`numobs` interface, then an implementation must: (i) overload [`obs`](@ref) to articulate how provided data can be transformed into a form that does support this interface, as illustrated below under @@ -46,9 +49,9 @@ using LinearAlgebra, Tables nothing # hide ``` -## Defining algorithms +## Defining learners -Here's a new type whose instances specify ridge regression parameters: +Here's a new type whose instances specify ridge regression hyperparameters: ```@example anatomy struct Ridge{T<:Real} @@ -57,26 +60,26 @@ end nothing # hide ``` -Instances of `Ridge` will be [algorithms](@ref algorithms), in LearnAPI.jl parlance. +Instances of `Ridge` are *[learners](@ref learners)*, in LearnAPI.jl parlance. -Associated with each new type of LearnAPI.jl [algorithm](@ref algorithms) will be a keyword -argument constructor, providing default values for all properties (struct fields) that are -not other algorithms, and we must implement [`LearnAPI.constructor(algorithm)`](@ref), for -recovering the constructor from an instance: +Associated with each new type of LearnAPI.jl [learner](@ref learners) will be a keyword +argument constructor, providing default values for all properties (typically, struct +fields) that are not other learners, and we must implement +[`LearnAPI.constructor(learner)`](@ref), for recovering the constructor from an instance: ```@example anatomy """ Ridge(; lambda=0.1) -Instantiate a ridge regression algorithm, with regularization of `lambda`. +Instantiate a ridge regression learner, with regularization of `lambda`. 
""" Ridge(; lambda=0.1) = Ridge(lambda) LearnAPI.constructor(::Ridge) = Ridge nothing # hide ``` -For example, in this case, if `algorithm = Ridge(0.2)`, then -`LearnAPI.constructor(algorithm)(lambda=0.2) == algorithm` is true. Note that we attach +For example, in this case, if `learner = Ridge(0.2)`, then +`LearnAPI.constructor(learner)(lambda=0.2) == learner` is true. Note that we attach the docstring to the *constructor*, not the struct. @@ -90,20 +93,20 @@ coefficients labelled by feature name for inspection after training: ```@example anatomy struct RidgeFitted{T,F} - algorithm::Ridge + learner::Ridge coefficients::Vector{T} named_coefficients::F end nothing # hide ``` -Note that we also include `algorithm` in the struct, for it must be possible to recover -`algorithm` from the output of `fit`; see [Accessor functions](@ref) below. +Note that we also include `learner` in the struct, for it must be possible to recover +`learner` from the output of `fit`; see [Accessor functions](@ref) below. The core implementation of `fit` looks like this: ```@example anatomy -function LearnAPI.fit(algorithm::Ridge, data; verbosity=1) +function LearnAPI.fit(learner::Ridge, data; verbosity=1) X, y = data @@ -112,10 +115,10 @@ function LearnAPI.fit(algorithm::Ridge, data; verbosity=1) names = Tables.columnnames(table) |> collect A = Tables.matrix(table, transpose=true) - lambda = algorithm.lambda + lambda = learner.lambda # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector + coefficients = (A*A' + learner.lambda*I)\(A*y) # vector # determine named coefficients: named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] @@ -123,7 +126,7 @@ function LearnAPI.fit(algorithm::Ridge, data; verbosity=1) # make some noise, if allowed: verbosity > 0 && @info "Coefficients: $named_coefficients" - return RidgeFitted(algorithm, coefficients, named_coefficients) + return RidgeFitted(learner, coefficients, named_coefficients) end ``` @@ -149,34 +152,34 @@ LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = ``` If the kind of proxy is omitted, as in `predict(model, Xnew)`, then a fallback grabs the -first element of the tuple returned by [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), which +first element of the tuple returned by [`LearnAPI.kinds_of_proxy(learner)`](@ref), which we overload appropriately below. ## Extracting the target from training data The `fit` method consumes data which includes a [target variable](@ref proxy), i.e., the -algorithm is a supervised learner. We must therefore declare how the target variable can be extracted +learner is a supervised learner. We must therefore declare how the target variable can be extracted from training data, by implementing [`LearnAPI.target`](@ref): ```@example anatomy -LearnAPI.target(algorithm, data) = last(data) +LearnAPI.target(learner, data) = last(data) ``` There is a similar method, [`LearnAPI.features`](@ref) for declaring how training features -can be extracted (for passing to `predict`, for example) but this method has a fallback -which typically suffices: return `first(data)` if `data` is a tuple, and otherwise return -`data`. +can be extracted (something that can be passed to `predict`) but this method has a +fallback which suffices here: it returns `first(data)` if `data` is a tuple, and `data` +otherwise. ## Accessor functions An [accessor function](@ref accessor_functions) has the output of [`fit`](@ref) as it's sole argument. 
Every new implementation must implement the accessor function -[`LearnAPI.algorithm`](@ref) for recovering an algorithm from a fitted object: +[`LearnAPI.learner`](@ref) for recovering a learner from a fitted object: ```@example anatomy -LearnAPI.algorithm(model::RidgeFitted) = model.algorithm +LearnAPI.learner(model::RidgeFitted) = model.learner ``` Other accessor functions extract learned parameters or some standard byproducts of @@ -195,17 +198,17 @@ dump the named version of the coefficients: ```@example anatomy LearnAPI.strip(model::RidgeFitted) = - RidgeFitted(model.algorithm, model.coefficients, nothing) + RidgeFitted(model.learner, model.coefficients, nothing) ``` Crucially, we can still use `LearnAPI.strip(model)` in place of `model` to make new predictions. -## Algorithm traits +## Learner traits -Algorithm [traits](@ref traits) record extra generic information about an algorithm, or -make specific promises of behavior. They are methods that have an algorithm as the sole +Learner [traits](@ref traits) record extra generic information about a learner, or +make specific promises of behavior. They are methods that have a learner as the sole argument, and so we regard [`LearnAPI.constructor`](@ref) defined above as a trait. Because we have implemented `predict`, we are required to overload the @@ -226,7 +229,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined: tags = (:regression,), functions = ( :(LearnAPI.fit), - :(LearnAPI.algorithm), + :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), @@ -239,15 +242,15 @@ nothing # hide ``` The last trait, `functions`, returns a list of all LearnAPI.jl methods that can be -meaninfully applied to the algorithm or associated model. See [`LearnAPI.functions`](@ref) +meaninfully applied to the learner or associated model. See [`LearnAPI.functions`](@ref) for a checklist. [`LearnAPI.functions`](@ref) and [`LearnAPI.constructor`](@ref), are the only universally compulsory traits. However, it is worthwhile studying the [list of all traits](@ref traits_list) to see which might apply to a new implementation, to enable maximum buy into functionality provided by third party packages, and to assist third party algorithms that match machine learning algorithms to user-defined tasks. -Note that we know `Ridge` instances are supervised algorithms because `:(LearnAPI.target) -in LearnAPI.functions(algorithm)`, for every instance `algorithm`. With [some +Note that we know `Ridge` instances are supervised learners because `:(LearnAPI.target) +in LearnAPI.functions(learner)`, for every instance `learner`. With [some exceptions](@ref trait_contract), the value of a trait should depend only on the *type* of the argument. @@ -257,7 +260,7 @@ We add one `fit` signature for user-convenience only. The LearnAPI.jl specificat nothing to say about `fit` signatures with more than two positional arguments. ```@example anatomy -LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = fit(algorithm, (X, y); kwargs...) +LearnAPI.fit(learner::Ridge, X, y; kwargs...) = fit(learner, (X, y); kwargs...) 
``` ## [Demonstration](@id workflow) @@ -277,8 +280,8 @@ nothing # hide ``` ```@example anatomy -algorithm = Ridge(lambda=0.5) -foreach(println, LearnAPI.functions(algorithm)) +learner = Ridge(lambda=0.5) +foreach(println, LearnAPI.functions(learner)) ``` Training and predicting: @@ -286,7 +289,7 @@ Training and predicting: ```@example anatomy Xtrain = Tables.subset(X, train) ytrain = y[train] -model = fit(algorithm, (Xtrain, ytrain)) # `fit(algorithm, Xtrain, ytrain)` will also work +model = fit(learner, (Xtrain, ytrain)) # `fit(learner, Xtrain, ytrain)` will also work ŷ = predict(model, Tables.subset(X, test)) ``` @@ -307,7 +310,7 @@ serialize(filename, small_model) ```julia recovered_model = deserialize(filename) -@assert LearnAPI.algorithm(recovered_model) == algorithm +@assert LearnAPI.learner(recovered_model) == learner @assert predict(recovered_model, X) == predict(model, X) ``` @@ -324,15 +327,15 @@ end Ridge(; lambda=0.1) = Ridge(lambda) struct RidgeFitted{T,F} - algorithm::Ridge + learner::Ridge coefficients::Vector{T} named_coefficients::F end -LearnAPI.algorithm(model::RidgeFitted) = model.algorithm +LearnAPI.learner(model::RidgeFitted) = model.learner LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients LearnAPI.strip(model::RidgeFitted) = - RidgeFitted(model.algorithm, model.coefficients, nothing) + RidgeFitted(model.learner, model.coefficients, nothing) @trait( Ridge, @@ -341,7 +344,7 @@ LearnAPI.strip(model::RidgeFitted) = tags = (:regression,), functions = ( :(LearnAPI.fit), - :(LearnAPI.algorithm), + :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), @@ -390,16 +393,16 @@ methods - one to handle "regular" input, and one to handle the pre-processed dat (observations) which appears first below: ```@example anatomy2 -function LearnAPI.fit(algorithm::Ridge, observations::RidgeFitObs; verbosity=1) +function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) - lambda = algorithm.lambda + lambda = learner.lambda A = observations.A names = observations.names y = observations.y - # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # 1 x p matrix + # apply core learner: + coefficients = (A*A' + learner.lambda*I)\(A*y) # 1 x p matrix # determine named coefficients: named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] @@ -407,19 +410,19 @@ function LearnAPI.fit(algorithm::Ridge, observations::RidgeFitObs; verbosity=1) # make some noise, if allowed: verbosity > 0 && @info "Coefficients: $named_coefficients" - return RidgeFitted(algorithm, coefficients, named_coefficients) + return RidgeFitted(learner, coefficients, named_coefficients) end -LearnAPI.fit(algorithm::Ridge, data; kwargs...) = - fit(algorithm, obs(algorithm, data); kwargs...) +LearnAPI.fit(learner::Ridge, data; kwargs...) = + fit(learner, obs(learner, data); kwargs...) ``` ### The `obs` contract Providing `fit` signatures matching the output of `obs`, is the first part of the `obs` contract. The second part is this: *The output of `obs` must implement the interface -specified by the trait* [`LearnAPI.data_interface(algorithm)`](@ref). Assuming this is +specified by the trait* [`LearnAPI.data_interface(learner)`](@ref). Assuming this is [`LearnAPI.RandomAccess()`](@ref) (the default) it usually suffices to overload `Base.getindex` and `Base.length`: @@ -462,7 +465,7 @@ as the fallback mentioned above is no longer adequate. 
### Important notes: -- The observations to be consumed by `fit` are returned by `obs(algorithm::Ridge, ...)`, +- The observations to be consumed by `fit` are returned by `obs(learner::Ridge, ...)`, while those consumed by `predict` are returned by `obs(model::RidgeFitted, ...)`. We need the different signatures because the form of data consumed by `fit` and `predict` are generally different. @@ -477,7 +480,7 @@ argument, overloading `obs` is optional. This is provided data in publicized [`LearnAPI.RandomAccess`](@ref) interface (most tables¹, arrays³, and tuples thereof). To opt out of supporting the MLUtils.jl interface altogether, an implementation must -overload the trait, [`LearnAPI.data_interface(algorithm)`](@ref). See [Data +overload the trait, [`LearnAPI.data_interface(learner)`](@ref). See [Data interfaces](@ref data_interfaces) for details. @@ -486,7 +489,7 @@ interfaces](@ref data_interfaces) for details. As above, we add a signature which plays no role vis-à-vis LearnAPI.jl. ```@example anatomy2 -LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = fit(algorithm, (X, y); kwargs...) +LearnAPI.fit(learner::Ridge, X, y; kwargs...) = fit(learner, (X, y); kwargs...) ``` ## Demonstration of an advanced `obs` workflow @@ -496,9 +499,9 @@ generic MLUtils.jl interface: ```@example anatomy2 import MLUtils -algorithm = Ridge() -observations_for_fit = obs(algorithm, (X, y)) -model = fit(algorithm, MLUtils.getobs(observations_for_fit, train)) +learner = Ridge() +observations_for_fit = obs(learner, (X, y)) +model = fit(learner, MLUtils.getobs(observations_for_fit, train)) observations_for_predict = obs(model, X) ẑ = predict(model, MLUtils.getobs(observations_for_predict, test)) ``` diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index c554ca45..7959dce6 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -1,8 +1,6 @@ # Common Implementation Patterns -!!! warning - -!!! warning +!!! important This section is only an implementation guide. The definitive specification of the Learn API is given in [Reference](@ref reference). diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index a0486bdb..29f7af01 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -3,12 +3,12 @@ ### Training ```julia -fit(algorithm, data; verbosity=1) -> model -fit(algorithm; verbosity=1) -> static_model +fit(learner, data; verbosity=1) -> model +fit(learner; verbosity=1) -> static_model ``` A "static" algorithm is one that does not generalize to new observations (e.g., some -clustering algorithms); there is no trainiing data and the algorithm is executed by +clustering algorithms); there is no training data and the algorithm is executed by `predict` or `transform` which receive the data. See example below. @@ -20,17 +20,15 @@ update_observations(model, new_data; verbosity=1, param1=new_value1, ...) -> upd update_features(model, new_data; verbosity=1, param1=new_value1, ...) -> updated_model ``` -Data slurping forms are similarly provided for updating methods. 
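
The three updating methods differ in what is assumed to have changed: `update` revisits
the same data, typically after a hyperparameter replacement, while `update_observations`
and `update_features` ingest new observations and new features, respectively. A sketch
(`model` being the output of a preceding `fit` call):

```julia
model = update(model, data; n=150)            # same data, new value of hyperparameter `n`
model = update_observations(model, new_data)  # additional observations
```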
- ## Typical workflows ### Supervised models -Supposing `Algorithm` is some supervised classifier type, with an iteration parameter `n`: +Supposing `Learner` is some supervised classifier type, with an iteration parameter `n`: ```julia -algorithm = Algorithm(n=100) -model = fit(algorithm, (X, y)) +learner = Learner(n=100) +model = fit(learner, (X, y)) # Predict probability distributions: ŷ = predict(model, Distribution(), Xnew) @@ -47,30 +45,30 @@ See also [Classification](@ref) and [Regression](@ref). ### Transformers -A dimension-reducing transformer, `algorithm` might be used in this way: +A dimension-reducing transformer, `learner` might be used in this way: ```julia -model = fit(algorithm, X) +model = fit(learner, X) transform(model, X) # or `transform(model, Xnew)` ``` or, if implemented, using a single call: ```julia -transform(algorithm, X) # `fit` implied +transform(learner, X) # `fit` implied ``` ### [Static algorithms (no "learning")](@id static_algorithms) -Suppose `algorithm` is some clustering algorithm that cannot be generalized to new data +Suppose `learner` is some clustering algorithm that cannot be generalized to new data (e.g. DBSCAN): ```julia -model = fit(algorithm) # no training data +model = fit(learner) # no training data labels = predict(model, X) # may mutate `model` # Or, in one line: -labels = predict(algorithm, X) +labels = predict(learner, X) # But two-line version exposes byproducts of the clustering algorithm (e.g., outliers): LearnAPI.extras(model) @@ -84,14 +82,14 @@ In density estimation, `fit` consumes no features, only a target variable; `pred which consumes no data, returns the learned density: ```julia -model = fit(algorithm, y) # no features +model = fit(learner, y) # no features predict(model) # shortcut for `predict(model, SingleDistribution())`, or similar ``` A one-liner will typically be implemented as well: ```julia -predict(algorithm, y) +predict(learner, y) ``` See also [Density Estimation](@ref). @@ -101,10 +99,12 @@ See also [Density Estimation](@ref). ### Training -| method | fallback | compulsory? | -|:-------------------------------------------------------------------------------|:-----------------------------------------------------------------|--------------------| -| [`fit`](@ref)`(algorithm, data; verbosity=1)` | ignores `data` and applies signature below | yes, unless static | -| [`fit`](@ref)`(algorithm; verbosity=1)` | none | no, unless static | +Exactly one of the following must be implemented: + +| method | fallback | +|:--------------------------------------------|:---------| +| [`fit`](@ref)`(learner, data; verbosity=1)` | none | +| [`fit`](@ref)`(learner; verbosity=1)` | none | ### Updating @@ -114,7 +114,7 @@ See also [Density Estimation](@ref). | [`update_observations`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no | | [`update_features`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no | -There are some contracts regarding the behaviour of the update methods, as they relate to +There are some contracts governing the behaviour of the update methods, as they relate to a previous `fit` call. Consult the document strings for details. 
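
To illustrate the static row of the training table, here is a sketch only, with a
hypothetical `Quantizer` learner, following the `Ref`-wrapping pattern used in this
package's test suite; `fit` consumes no data and execution happens in `transform`:

```julia
struct Quantizer
    levels::Int
end

# no training data; the trait `LearnAPI.is_static` must be overloaded to return `true`:
LearnAPI.fit(learner::Quantizer; verbosity=1) = Ref(learner)
LearnAPI.learner(model::Base.RefValue{Quantizer}) = model[]

# algorithm execution happens here, not in `fit`:
LearnAPI.transform(model::Base.RefValue{Quantizer}, X) =
    round.(X .* model[].levels) ./ model[].levels
```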
## Reference diff --git a/docs/src/index.md b/docs/src/index.md index 713932cf..10d38430 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -63,9 +63,9 @@ LearnAPI.feature_importances(model) small_model = LearnAPI.strip(model) serialize("my_random_forest.jls", small_model) -# Recover saved model and algorithm configuration: +# Recover saved model and algorithm configuration ("learner"): recovered_model = deserialize("my_random_forest.jls") -@assert LearnAPI.algorithm(recovered_model) == forest +@assert LearnAPI.learner(recovered_model) == forest @assert predict(recovered_model, Point(), Xnew) == ŷ ``` @@ -73,7 +73,7 @@ recovered_model = deserialize("my_random_forest.jls") dispatch based on the [kind of target proxy](@ref proxy), a key LearnAPI.jl concept. LearnAPI.jl places more emphasis on the notion of target variables and target proxies than on the usual supervised/unsupervised learning dichotomy. From this point of view, a -supervised algorithm is simply one in which a target variable exists, and happens to +supervised learner is simply one in which a target variable exists, and happens to appear as an input to training but not to prediction. ## Data interfaces diff --git a/docs/src/obs.md b/docs/src/obs.md index 5818ea76..3d206b70 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -1,38 +1,38 @@ # [`obs` and Data Interfaces](@id data_interface) The `obs` method takes data intended as input to `fit`, `predict` or `transform`, and -transforms it to an algorithm-specific form guaranteed to implement a form of observation -access designated by the algorithm. The transformed data can then be resampled and passed +transforms it to a learner-specific form guaranteed to implement a form of observation +access designated by the learner. The transformed data can then be resampled and passed on to the relevant method in place of the original input. Using `obs` may provide performance advantages over naive workflows in some cases (e.g., cross-validation). ```julia -obs(algorithm, data) # can be passed to `fit` instead of `data` +obs(learner, data) # can be passed to `fit` instead of `data` obs(model, data) # can be passed to `predict` or `transform` instead of `data` ``` ## [Typical workflows](@id obs_workflows) LearnAPI.jl makes no universal assumptions about the form of `data` in a call -like `fit(algorithm, data)`. However, if we define +like `fit(learner, data)`. However, if we define ```julia -observations = obs(algorithm, data) +observations = obs(learner, data) ``` -then, assuming the typical case that `LearnAPI.data_interface(algorithm) == +then, assuming the typical case that `LearnAPI.data_interface(learner) == LearnAPI.RandomAccess()`, `observations` implements the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface, for grabbing and counting observations. 
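
For example (a sketch, assuming the `RandomAccess` case just described):

```julia
import MLUtils
MLUtils.numobs(observations)       # count the observations
MLUtils.getobs(observations, 2:3)  # an object of the same kind, with two observations
```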
Moreover, we can pass `observations` to `fit` in place of the original data, or first resample it using `MLUtils.getobs`: ```julia -# equivalent to `model = fit(algorithm, data)` -model = fit(algorithm, observations) +# equivalent to `model = fit(learner, data)` +model = fit(learner, observations) # with resampling: resampled_observations = MLUtils.getobs(observations, 1:10) -model = fit(algorithm, resampled_observations) +model = fit(learner, resampled_observations) ``` In some implementations, the alternative pattern above can be used to avoid repeating @@ -43,24 +43,24 @@ how a user might call `obs` and `MLUtils.getobs` to perform efficient cross-vali using LearnAPI import MLUtils -algorithm = +learner = data = -X = LearnAPI.features(algorithm, data) -y = LearnAPI.target(algorithm, data) +X = LearnAPI.features(learner, data) +y = LearnAPI.target(learner, data) train_test_folds = map([1:10, 11:20, 21:30]) do test (setdiff(1:30, test), test) end -fitobs = obs(algorithm, data) +fitobs = obs(learner, data) never_trained = true scores = map(train_test_folds) do (train, test) # train using model-specific representation of data: fitobs_subset = MLUtils.getobs(fitobs, train) - model = fit(algorithm, fitobs_subset) + model = fit(learner, fitobs_subset) # predict on the fold complement: if never_trained @@ -79,7 +79,7 @@ end | method | comment | compulsory? | fallback | |:-------------------------------|:------------------------------------|:-------------:|:---------------| -| [`obs(algorithm, data)`](@ref) | here `data` is `fit`-consumable | not typically | returns `data` | +| [`obs(learner, data)`](@ref) | here `data` is `fit`-consumable | not typically | returns `data` | | [`obs(model, data)`](@ref) | here `data` is `predict`-consumable | not typically | returns `data` | @@ -94,7 +94,7 @@ obs ### [Data interfaces](@id data_interfaces) -New implementations must overload [`LearnAPI.data_interface(algorithm)`](@ref) if the +New implementations must overload [`LearnAPI.data_interface(learner)`](@ref) if the output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess`](@ref). (Arrays, most tables, and all tuples thereof, implement `RandomAccess`.) diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md index 605ee27a..a6a00047 100644 --- a/docs/src/predict_transform.md +++ b/docs/src/predict_transform.md @@ -6,15 +6,15 @@ transform(model, data) inverse_transform(model, data) ``` -Versions without the `data` argument may also appear, for example in [Density +Versions without the `data` argument may apply, for example in [Density estimation](@ref). ## [Typical worklows](@id predict_workflow) -Train some supervised `algorithm`: +Train some supervised `learner`: ```julia -model = fit(algorithm, (X, y)) +model = fit(learner, (X, y)) ``` Predict probability distributions: @@ -29,10 +29,10 @@ Generate point predictions: ŷ = predict(model, Point(), Xnew) ``` -Train a dimension-reducing `algorithm`: +Train a dimension-reducing `learner`: ```julia -model = fit(algorithm, X) +model = fit(learner, X) Xnew_reduced = transform(model, Xnew) ``` @@ -42,11 +42,17 @@ Apply an approximate right inverse: inverse_transform(model, Xnew_reduced) ``` +Fit and transform in one line: + +```julia +transform(learner, data) # `fit` implied +``` + ### An advanced workflow ```julia -fitobs = obs(algorithm, (X, y)) # algorithm-specific repr. of data -model = fit(algorithm, MLUtils.getobs(fitobs, 1:100)) +fitobs = obs(learner, (X, y)) # learner-specific repr. 
of data +model = fit(learner, MLUtils.getobs(fitobs, 1:100)) predictobs = obs(model, MLUtils.getobs(X, 101:150)) ŷ = predict(model, Point(), predictobs) ``` @@ -62,37 +68,37 @@ ŷ = predict(model, Point(), predictobs) ### Predict or transform? -If the algorithm has a notion of [target variable](@ref proxy), then use +If the learner has a notion of [target variable](@ref proxy), then use [`predict`](@ref) to output each supported [kind of target proxy](@ref proxy_types) (`Point()`, `Distribution()`, etc). For output not associated with a target variable, implement [`transform`](@ref) instead, which does not dispatch on [`LearnAPI.KindOfProxy`](@ref), but can be optionally paired with an implementation of [`inverse_transform`](@ref), for returning (approximate) -right inverses to `transform`. +right or left inverses to `transform`. -Of course, the one algorithm can implement both a `predict` and `transform` method. For +Of course, the one learner can implement both a `predict` and `transform` method. For example a K-means clustering algorithm can `predict` labels and `transform` to reduce dimension using distances from the cluster centres. ### [One-liners combining fit and transform/predict](@id one_liners) -Algorithms may optionally overload `transform` to apply `fit` first, using the supplied +Learners may optionally overload `transform` to apply `fit` first, using the supplied data if required, and then immediately `transform` the same data. The same applies to -`predict`. In that case the first argument of `transform`/`predict` is an *algorithm* +`predict`. In that case the first argument of `transform`/`predict` is an *learner* instead of the output of `fit`: ```julia -predict(algorithm, kind_of_proxy, data) # `fit` implied -transform(algorithm, data) # `fit` implied +predict(learner, kind_of_proxy, data) # `fit` implied +transform(learner, data) # `fit` implied ``` -For example, if `fit(algorithm, X)` is defined, then `predict(algorithm, X)` will be +For example, if `fit(learner, X)` is defined, then `predict(learner, X)` will be shorthand for ```julia -model = fit(algorithm, X) +model = fit(learner, X) predict(model, X) ``` diff --git a/docs/src/reference.md b/docs/src/reference.md index 9c13ee79..749e9708 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -16,7 +16,7 @@ ML/statistical algorithms are typically applied in conjunction with resampling o *observations*, as in [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)). In this document *data* will always refer to objects encapsulating an ordered sequence of -individual observations. If an algorithm is trained using multiple data objects, it is +individual observations. If a learner is trained using multiple data objects, it is undertood that individual objects share the same number of observations, and that resampling of one component implies synchronized resampling of the others. @@ -29,9 +29,9 @@ see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details. !!! note - In the MLUtils.jl - convention, observations in tables are the rows but observations in a matrix are the - columns. + In the MLUtils.jl + convention, observations in tables are the rows but observations in a matrix are the + columns. ### [Hyperparameters](@id hyperparameters) @@ -70,67 +70,69 @@ dispatch. These are also used to distinguish performance metrics provided by the [StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/). 
-### [Algorithms](@id algorithms) +### [Learners](@id learners) -An object implementing the LearnAPI.jl interface is called an *algorithm*, although it is -more accurately "the configuration of some algorithm".¹ An algorithm encapsulates a -particular set of user-specified [hyperparameters](@ref) as the object's *properties* -(which conceivably differ from its fields). It does not store learned parameters. +An object implementing the LearnAPI.jl interface is called a *learner*, although it is +more accurately "the configuration of some machine learning or statistical algorithm".¹ A +learner encapsulates a particular set of user-specified [hyperparameters](@ref) as the +object's *properties* (which conceivably differ from its fields). It does not store +learned parameters. Informally, we will sometimes use the word "model" to refer to the output of -`fit(algorithm, ...)` (see below), something which typically does store learned +`fit(learner, ...)` (see below), something which typically does *store* learned parameters. -For `algorithm` to be a valid LearnAPI.jl algorithm, -[`LearnAPI.constructor(algorithm)`](@ref) must be defined and return a keyword constructor -enabling recovery of `algorithm` from its properties: +For `learner` to be a valid LearnAPI.jl learner, +[`LearnAPI.constructor(learner)`](@ref) must be defined and return a keyword constructor +enabling recovery of `learner` from its properties: ```julia -properties = propertynames(algorithm) -named_properties = NamedTuple{properties}(getproperty.(Ref(algorithm), properties)) -@assert algorithm == LearnAPI.constructor(algorithm)(; named_properties...) +properties = propertynames(learner) +named_properties = NamedTuple{properties}(getproperty.(Ref(learner), properties)) +@assert learner == LearnAPI.constructor(learner)(; named_properties...) ``` -which can be tested with `@assert `[`LearnAPI.clone(algorithm)`](@ref)` == algorithm`. +which can be tested with `@assert `[`LearnAPI.clone(learner)`](@ref)` == learner`. -Note that if if `algorithm` is an instance of a *mutable* struct, this requirement +Note that if if `learner` is an instance of a *mutable* struct, this requirement generally requires overloading `Base.==` for the struct. -No LearnAPI.jl method is permitted to mutate an algorithm. In particular, one should make +No LearnAPI.jl method is permitted to mutate a learner. In particular, one should make deep copies of RNG hyperparameters before using them in a new implementation of [`fit`](@ref). -#### Composite algorithms (wrappers) +#### Composite learners (wrappers) -A *composite algorithm* is one with at least one property that can take other algorithms -as values; for such algorithms [`LearnAPI.is_composite`](@ref)`(algorithm)` must be `true` +A *composite learner* is one with at least one property that can take other learners as +values; for such learners [`LearnAPI.is_composite`](@ref)`(learner)` must be `true` (fallback is `false`). Generally, the keyword constructor provided by [`LearnAPI.constructor`](@ref) must provide default values for all properties that are not -algorithm-valued. Instead, these algorithm-valued properties can have a `nothing` default, -with the constructor throwing an error if the default value persists. +learner-valued. Instead, these learner-valued properties can have a `nothing` default, +with the constructor throwing an error if the the constructor call does not explicitly +specify a new value. 
-Any object `algorithm` for which [`LearnAPI.functions`](@ref)`(algorithm)` is non-empty is +Any object `learner` for which [`LearnAPI.functions(learner)`](@ref) is non-empty is understood to have a valid implementation of the LearnAPI.jl interface. #### Example -Any instance of `GradientRidgeRegressor` defined below is a valid algorithm. +Any instance of `GradientRidgeRegressor` defined below is a valid learner. ```julia struct GradientRidgeRegressor{T<:Real} - learning_rate::T - epochs::Int - l2_regularization::T + learning_rate::T + epochs::Int + l2_regularization::T end GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) = - GradientRidgeRegressor(learning_rate, epochs, l2_regularization) + GradientRidgeRegressor(learning_rate, epochs, l2_regularization) LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor ``` ## Documentation -Attach public LearnAPI.jl-related documentation for an algorithm to it's *constructor*, -rather than to the struct defining its type. In this way, an algorithm can implement +Attach public LearnAPI.jl-related documentation for a learner to it's *constructor*, +rather than to the struct defining its type. In this way, a learner can implement multiple interfaces, in addition to the LearnAPI interface, with separate document strings for each. @@ -138,20 +140,20 @@ for each. !!! note "Compulsory methods" - All new algorithm types must implement [`fit`](@ref), - [`LearnAPI.algorithm`](@ref), [`LearnAPI.constructor`](@ref) and - [`LearnAPI.functions`](@ref). + All new learner types must implement [`fit`](@ref), + [`LearnAPI.learner`](@ref), [`LearnAPI.constructor`](@ref) and + [`LearnAPI.functions`](@ref). -Most algorithms will also implement [`predict`](@ref) and/or [`transform`](@ref). For a -bare minimum implementation, see the implementation of `SmallAlgorithm` +Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). For a +bare minimum implementation, see the implementation of `SmallLearner` [here](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/traits.jl). ### List of methods -- [`fit`](@ref fit): for training or updating algorithms that generalize to new data. Or, - for non-generalizing algorithms (see [here](@ref static_algorithms) and [Static - Algorithms](@ref)), for wrapping `algorithm` in a mutable struct that can be mutated by - `predict`/`transform` to record byproducts of those operations. +- [`fit`](@ref fit): for (i) training or updating learners that generalize to new data; or + (ii) wrapping `learner` in an object that is possibly mutated by `predict`/`transform`, + to record byproducts of those operations, in the special case of *non-generalizing* + learners (called here [static algorithms](@ref static_algorithms)) - [`update`](@ref fit): for updating learning outcomes after hyperparameter changes, such as increasing an iteration parameter. @@ -173,18 +175,18 @@ bare minimum implementation, see the implementation of `SmallAlgorithm` defined. - [`obs`](@ref data_interface): method for exposing to the user - algorithm-specific representations of data, which are additionally guaranteed to + learner-specific representations of data, which are additionally guaranteed to implement the observation access API specified by - [`LearnAPI.data_interface(algorithm)`](@ref). + [`LearnAPI.data_interface(learner)`](@ref). 
- [Accessor functions](@ref accessor_functions): these include functions like `LearnAPI.feature_importances` and `LearnAPI.training_losses`, for extracting, from - training outcomes, information common to many algorithms. This includes + training outcomes, information common to many learners. This includes [`LearnAPI.strip(model)`](@ref) for replacing a learning outcome `model` with a serializable version that can still `predict` or `transform`. -- [Algorithm traits](@ref traits): methods that promise specific algorithm behavior or - record general information about the algorithm. Only [`LearnAPI.constructor`](@ref) and +- [Learner traits](@ref traits): methods that promise specific learner behavior or + record general information about the learner. Only [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref) are universally compulsory. @@ -197,8 +199,8 @@ LearnAPI.@trait --- -¹ We acknowledge users may not like this terminology, and may know "algorithm" by some -other name, such as "strategy", "options", "hyperparameter set", "configuration", or -"model". Consensus on this point is difficult; see, e.g., +¹ We acknowledge users may not like this terminology, and may know "learner" by some other +name, such as "strategy", "options", "hyperparameter set", "configuration", "algorithm", +or "model". Consensus on this point is difficult; see, e.g., [this](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048/20) Julia Discourse discussion. diff --git a/docs/src/target_weights_features.md b/docs/src/target_weights_features.md index 910b9a4c..c54639a6 100644 --- a/docs/src/target_weights_features.md +++ b/docs/src/target_weights_features.md @@ -3,25 +3,25 @@ Methods for extracting parts of training data: ```julia -LearnAPI.target(algorithm, data) -> -LearnAPI.weights(algorithm, data) -> -LearnAPI.features(algorithm, data) -> +LearnAPI.target(learner, data) -> +LearnAPI.weights(learner, data) -> +LearnAPI.features(learner, data) -> ``` -Here `data` is something supported in a call of the form `fit(algorithm, data)`. +Here `data` is something supported in a call of the form `fit(learner, data)`. # Typical workflow Not typically appearing in a general user's workflow but useful in meta-alagorithms, such as cross-validation (see the example in [`obs` and Data Interfaces](@ref data_interface)). -Supposing `algorithm` is a supervised classifier predicting a one-dimensional vector +Supposing `learner` is a supervised classifier predicting a one-dimensional vector target: ```julia -model = fit(algorithm, data) -X = LearnAPI.features(algorithm, data) -y = LearnAPI.target(algorithm, data) +model = fit(learner, data) +X = LearnAPI.features(learner, data) +y = LearnAPI.target(learner, data) ŷ = predict(model, Point(), X) training_loss = sum(ŷ .!= y) ``` diff --git a/docs/src/traits.md b/docs/src/traits.md index cb03f03d..f47f1633 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -1,9 +1,9 @@ -# [Algorithm Traits](@id traits) +# [Learner Traits](@id traits) -Algorithm traits are simply functions whose sole argument is an algorithm. +Learner traits are simply functions whose sole argument is a learner. -Traits promise specific algorithm behavior, such as: *This algorithm can make point or -probabilistic predictions* or *This algorithm is supervised* (sees a target in +Traits promise specific learner behavior, such as: *This learner can make point or +probabilistic predictions* or *This learner is supervised* (sees a target in training). 
They may also record more mundane information, such as a package license. ## [Trait summary](@id trait_summary) @@ -15,46 +15,46 @@ In the examples column of the table below, `Continuous` is a name owned the pack | trait | return value | fallback value | example | |:-----------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:-----------------------------------------------------------| -| [`LearnAPI.constructor`](@ref)`(algorithm)` | constructor for generating new or modified versions of `algorithm` | (no fallback) | `RidgeRegressor` | -| [`LearnAPI.functions`](@ref)`(algorithm)` | functions you can apply to `algorithm` or associated model (traits excluded) | `()` | `(:fit, :predict, :LearnAPI.strip, :(LearnAPI.algorithm), :obs)` | -| [`LearnAPI.kinds_of_proxy`](@ref)`(algorithm)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, kind, ...)` is guaranteed. | `()` | `(Distribution(), Interval())` | -| [`LearnAPI.tags`](@ref)`(algorithm)` | lists one or more suggestive algorithm tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | -| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `true` if implementation is 100% Julia code | `false` | `true` | -| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | -| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | name of license of package providing core code | `"unknown"` | `"MIT"` | -| [`LearnAPI.doc_url`](@ref)`(algorithm)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | -| [`LearnAPI.load_path`](@ref)`(algorithm)` | string locating name returned by `LearnAPI.constructor(algorithm)`, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | -| [`LearnAPI.is_composite`](@ref)`(algorithm)` | `true` if one or more properties of `algorithm` may be an algorithm | `false` | `true` | -| [`LearnAPI.human_name`](@ref)`(algorithm)` | human name for the algorithm; should be a noun | type name with spaces | "elastic net regressor" | -| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | symbolic name of an iteration parameter | `nothing` | :epochs | -| [`LearnAPI.data_interface`](@ref)`(algorithm)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | -| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(algorithm, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | -| [`LearnAPI.target_observation_scitype`](@ref)`(algorithm)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | -| [`LearnAPI.is_static`](@ref)`(algorithm)` | `true` if `fit` consumes no data | `false` | `true` | +| [`LearnAPI.constructor`](@ref)`(learner)` | constructor for generating new or modified versions of `learner` | (no fallback) | `RidgeRegressor` | +| [`LearnAPI.functions`](@ref)`(learner)` | functions you can apply to `learner` or associated model (traits excluded) | `()` | `(:fit, :predict, :LearnAPI.strip, :(LearnAPI.learner), :obs)` | +| 
[`LearnAPI.kinds_of_proxy`](@ref)`(learner)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(learner, kind, ...)` is guaranteed. | `()` | `(Distribution(), Interval())` | +| [`LearnAPI.tags`](@ref)`(learner)` | lists one or more suggestive learner tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | +| [`LearnAPI.is_pure_julia`](@ref)`(learner)` | `true` if implementation is 100% Julia code | `false` | `true` | +| [`LearnAPI.pkg_name`](@ref)`(learner)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | +| [`LearnAPI.pkg_license`](@ref)`(learner)` | name of license of package providing core code | `"unknown"` | `"MIT"` | +| [`LearnAPI.doc_url`](@ref)`(learner)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | +| [`LearnAPI.load_path`](@ref)`(learner)` | string locating name returned by `LearnAPI.constructor(learner)`, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | +| [`LearnAPI.is_composite`](@ref)`(learner)` | `true` if one or more properties of `learner` may be a learner | `false` | `true` | +| [`LearnAPI.human_name`](@ref)`(learner)` | human name for the learner; should be a noun | type name with spaces | "elastic net regressor" | +| [`LearnAPI.iteration_parameter`](@ref)`(learner)` | symbolic name of an iteration parameter | `nothing` | :epochs | +| [`LearnAPI.data_interface`](@ref)`(learner)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | +| [`LearnAPI.fit_observation_scitype`](@ref)`(learner)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(learner, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | +| [`LearnAPI.target_observation_scitype`](@ref)`(learner)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | +| [`LearnAPI.is_static`](@ref)`(learner)` | `true` if `fit` consumes no data | `false` | `true` | ### Derived Traits -The following are provided for convenience but should not be overloaded by new algorithms: +The following are provided for convenience but should not be overloaded by new learners: | trait | return value | example | |:-----------------------------------|:-------------------------------------------------------------------------|:--------| -| `LearnAPI.name(algorithm)` | algorithm type name as string | "PCA" | -| `LearnAPI.is_algorithm(algorithm)` | `true` if `algorithm` is LearnAPI.jl-compliant | `true` | -| `LearnAPI.target(algorithm)` | `true` if `fit` sees a target variable; see [`LearnAPI.target`](@ref) | `false` | -| `LearnAPI.weights(algorithm)` | `true` if `fit` supports per-observation; see [`LearnAPI.weights`](@ref) | `false` | +| `LearnAPI.name(learner)` | learner type name as string | "PCA" | +| `LearnAPI.is_learner(learner)` | `true` if `learner` is LearnAPI.jl-compliant | `true` | +| `LearnAPI.target(learner)` | `true` if `fit` sees a target variable; see [`LearnAPI.target`](@ref) | `false` | +| `LearnAPI.weights(learner)` | `true` if `fit` supports per-observation; see [`LearnAPI.weights`](@ref) | `false` | ## Implementation guide A single-argument trait is declared following this pattern: ```julia -LearnAPI.is_pure_julia(algorithm::MyAlgorithmType) = true 
+LearnAPI.is_pure_julia(learner::MyLearnerType) = true ``` A shorthand for single-argument traits is available: ```julia -@trait MyAlgorithmType is_pure_julia=true +@trait MyLearnerType is_pure_julia=true ``` Multiple traits can be declared like this: @@ -62,7 +62,7 @@ Multiple traits can be declared like this: ```julia @trait( - MyAlgorithmType, + MyLearnerType, is_pure_julia = true, pkg_name = "MyPackage", ) @@ -70,20 +70,20 @@ Multiple traits can be declared like this: ### [The global trait contract](@id trait_contract) -To ensure that trait metadata can be stored in an external algorithm registry, LearnAPI.jl +To ensure that trait metadata can be stored in an external learner registry, LearnAPI.jl requires: -1. *Finiteness:* The value of a trait is the same for all `algorithm`s with same value of - [`LearnAPI.constructor(algorithm)`](@ref). This typically means trait values do not - depend on type parameters! If `is_composite(algorithm) = true`, this requirement is +1. *Finiteness:* The value of a trait is the same for all `learner`s with same value of + [`LearnAPI.constructor(learner)`](@ref). This typically means trait values do not + depend on type parameters! If `is_composite(learner) = true`, this requirement is dropped. 2. *Low level deserializability:* It should be possible to evaluate the trait *value* when `LearnAPI` is the only imported module. -Because of 1, combining a lot of functionality into one algorithm (e.g. the algorithm can +Because of 1, combining a lot of functionality into one learner (e.g. the learner can perform both classification or regression) can mean traits are necessarily less -informative (as in `LearnAPI.target_observation_scitype(algorithm) = Any`). +informative (as in `LearnAPI.target_observation_scitype(learner) = Any`). ## Reference diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index 84859307..bbc713fc 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -9,32 +9,32 @@ const DOC_STATIC = """ - For "static" algorithms (those without training `data`) it may be necessary to first + For "static" learners (those without training `data`) it may be necessary to first call `transform` or `predict` on `model`. """ """ - LearnAPI.algorithm(model) - LearnAPI.algorithm(LearnAPI.stripd_model) + LearnAPI.learner(model) + LearnAPI.learner(LearnAPI.stripd_model) -Recover the algorithm used to train `model` or the output of [`LearnAPI.strip(model)`](@ref). +Recover the learner used to train `model` or the output of [`LearnAPI.strip(model)`](@ref). -In other words, if `model = fit(algorithm, data...)`, for some `algorithm` and `data`, +In other words, if `model = fit(learner, data...)`, for some `learner` and `data`, then ```julia -LearnAPI.algorithm(model) == algorithm == LearnAPI.algorithm(LearnAPI.strip(model)) +LearnAPI.learner(model) == learner == LearnAPI.learner(LearnAPI.strip(model)) ``` is `true`. # New implementations -Implementation is compulsory for new algorithm types. The behaviour described above is the -only contract. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.algorithm)")) +Implementation is compulsory for new learner types. The behaviour described above is the +only contract. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.learner)")) """ -function algorithm end +function learner end """ LearnAPI.strip(model; options...) @@ -44,16 +44,16 @@ Return a version of `model` that will generally have a smaller memory allocation [`fit`](@ref). 
Accessor functions that can be called on `model` may not work on `LearnAPI.strip(model)`, but [`predict`](@ref), [`transform`](@ref) and [`inverse_transform`](@ref) will work, if implemented. Check -`LearnAPI.functions(LearnAPI.algorithm(model))` to view see what the original `model` +`LearnAPI.functions(LearnAPI.learner(model))` to view see what the original `model` implements. -Specific algorithms may provide keyword `options` to control how much of the original -functionality is preserved by `LearnAPI.strip`. +Implementations may provide learner-specific keyword `options` to control how much of the +original functionality is preserved by `LearnAPI.strip`. # Typical workflow ```julia -model = fit(algorithm, (X, y)) # or `fit(algorithm, X, y)` +model = fit(learner, (X, y)) # or `fit(learner, X, y)` ŷ = predict(model, Point(), Xnew) small_model = LearnAPI.strip(model) @@ -67,7 +67,7 @@ recovered_model = deserialize("my_random_forest.jls") # New implementations -Overloading `LearnAPI.strip` for new algorithms is optional. The fallback is the +Overloading `LearnAPI.strip` for new learners is optional. The fallback is the identity. New implementations must enforce the following identities, whenever the right-hand side is @@ -94,15 +94,15 @@ LearnAPI.strip(model) = model """ LearnAPI.feature_importances(model) -Return the algorithm-specific feature importances of a `model` output by -[`fit`](@ref)`(algorithm, ...)` for some `algorithm`. The value returned has the form of +Return the learner-specific feature importances of a `model` output by +[`fit`](@ref)`(learner, ...)` for some `learner`. The value returned has the form of an abstract vector of `feature::Symbol => importance::Real` pairs (e.g `[:gender => 0.23, :height => 0.7, :weight => 0.1]`). -The `algorithm` supports feature importances if `:(LearnAPI.feature_importances) in -LearnAPI.functions(algorithm)`. +The `learner` supports feature importances if `:(LearnAPI.feature_importances) in +LearnAPI.functions(learner)`. -If an algorithm is sometimes unable to report feature importances then +If a learner is sometimes unable to report feature importances then `LearnAPI.feature_importances` will return all importances as 0.0, as in `[:gender => 0.0, :height => 0.0, :weight => 0.0]`. @@ -124,7 +124,7 @@ an abstract vector of `feature_or_class::Symbol => coefficient::Real` pairs (e.g `feature::Symbol => coefficients::AbstractVector{<:Real}` pairs. The `model` reports coefficients if `:(LearnAPI.coefficients) in -LearnAPI.functions(Learn.algorithm(model))`. +LearnAPI.functions(Learn.learner(model))`. See also [`LearnAPI.intercept`](@ref). @@ -144,7 +144,7 @@ For a linear model, return the learned intercept. The value returned is `Real` target) or an `AbstractVector{<:Real}` (multi-target). The `model` reports intercept if `:(LearnAPI.intercept) in -LearnAPI.functions(Learn.algorithm(model))`. +LearnAPI.functions(Learn.learner(model))`. See also [`LearnAPI.coefficients`](@ref). @@ -200,8 +200,8 @@ function trees end """ LearnAPI.training_losses(model) -Return the training losses obtained when running `model = fit(algorithm, ...)` for some -`algorithm`. +Return the training losses obtained when running `model = fit(learner, ...)` for some +`learner`. See also [`fit`](@ref). @@ -218,8 +218,8 @@ function training_losses end """ LearnAPI.training_predictions(model) -Return internally computed training predictions when running `model = fit(algorithm, ...)` -for some `algorithm`. 
+Return internally computed training predictions when running `model = fit(learner, ...)` +for some `learner`. See also [`fit`](@ref). @@ -236,14 +236,14 @@ function training_predictions end """ LearnAPI.training_scores(model) -Return the training scores obtained when running `model = fit(algorithm, ...)` for some -`algorithm`. +Return the training scores obtained when running `model = fit(learner, ...)` for some +`learner`. See also [`fit`](@ref). # New implementations -Implement for algorithms, such as outlier detection algorithms, which associate a score +Implement for learners, such as outlier detection algorithms, which associate a score with each observation during training, where these scores are of interest in later processes (e.g, in defining normalized scores for new data). @@ -257,11 +257,11 @@ function training_scores end For a composite `model`, return the component models (`fit` outputs). These will be in the form of a vector of named pairs, `property_name::Symbol => component_model`. Here -`property_name` is the name of some algorithm-valued property (hyper-parameter) of -`algorithm = LearnAPI.algorithm(model)`. +`property_name` is the name of some learner-valued property (hyper-parameter) of +`learner = LearnAPI.learner(model)`. -A composite model is one for which the corresponding `algorithm` includes one or more -algorithm-valued properties, and for which `LearnAPI.is_composite(algorithm)` is `true`. +A composite model is one for which the corresponding `learner` includes one or more +learner-valued properties, and for which `LearnAPI.is_composite(learner)` is `true`. See also [`is_composite`](@ref). @@ -277,8 +277,8 @@ function components end """ LearnAPI.training_labels(model) -Return the training labels obtained when running `model = fit(algorithm, ...)` for some -`algorithm`. +Return the training labels obtained when running `model = fit(learner, ...)` for some +`learner`. See also [`fit`](@ref). @@ -292,7 +292,7 @@ function training_labels end # :extras intentionally excluded: const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS = ( - algorithm, + learner, coefficients, intercept, tree, @@ -316,8 +316,8 @@ const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS_LIST = join( """ LearnAPI.extras(model) -Return miscellaneous byproducts of an algorithm's computation, from the object `model` -returned by a call of the form `fit(algorithm, data)`. +Return miscellaneous byproducts of a learning algorithm's execution, from the +object `model` returned by a call of the form `fit(learner, data)`. $DOC_STATIC diff --git a/src/clone.jl b/src/clone.jl index 571ea7fe..fef6515d 100644 --- a/src/clone.jl +++ b/src/clone.jl @@ -1,23 +1,23 @@ """ - LearnAPI.clone(algorithm; replacements...) + LearnAPI.clone(learner; replacements...) -Return a shallow copy of `algorithm` with the specified hyperparameter replacements. +Return a shallow copy of `learner` with the specified hyperparameter replacements. ```julia -clone(algorithm; epochs=100, learning_rate=0.01) +clone(learner; epochs=100, learning_rate=0.01) ``` -It is guaranteed that `LearnAPI.clone(algorithm) == algorithm`. +It is guaranteed that `LearnAPI.clone(learner) == learner`. """ -function clone(algorithm; replacements...) +function clone(learner; replacements...) 
reps = NamedTuple(replacements) - names = propertynames(algorithm) + names = propertynames(learner) rep_names = keys(reps) new_values = map(names) do name name in rep_names && return getproperty(reps, name) - getproperty(algorithm, name) + getproperty(learner, name) end - return LearnAPI.constructor(algorithm)(NamedTuple{names}(new_values)...) + return LearnAPI.constructor(learner)(NamedTuple{names}(new_values)...) end diff --git a/src/fit_update.jl b/src/fit_update.jl index 96407651..2421acba 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -1,22 +1,23 @@ # # FIT """ - fit(algorithm, data; verbosity=1) - fit(algorithm; verbosity=1) + fit(learner, data; verbosity=1) + fit(learner; verbosity=1) -Execute the algorithm with configuration `algorithm` using the provided training `data`, -returning an object, `model`, on which other methods, such as [`predict`](@ref) or -[`transform`](@ref), can be dispatched. [`LearnAPI.functions(algorithm)`](@ref) returns a -list of methods that can be applied to either `algorithm` or `model`. +Execute the machine learning or statistical algorithm with configuration `learner` using +the provided training `data`, returning an object, `model`, on which other methods, such +as [`predict`](@ref) or [`transform`](@ref), can be dispatched. +[`LearnAPI.functions(learner)`](@ref) returns a list of methods that can be applied to +either `learner` or `model`. For example, a supervised classifier might have a workflow like this: ```julia -model = fit(algorithm, (X, y)) +model = fit(learner, (X, y)) ŷ = predict(model, Xnew) ``` -The second signature, with `data` omitted, is provided by algorithms that do not +The second signature, with `data` omitted, is provided by learners that do not generalize to new observations (called *static algorithms*). In that case, `transform(model, data)` or `predict(model, ..., data)` carries out the actual algorithm execution, writing any byproducts of that operation to the mutable object `model` returned @@ -31,7 +32,7 @@ See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref), # New implementations -Implementation of exactly one of the signatures is compulsory. If `fit(algorithm; +Implementation of exactly one of the signatures is compulsory. If `fit(learner; verbosity=1)` is implemented, then the trait [`LearnAPI.is_static`](@ref) must be overloaded to return `true`. @@ -45,9 +46,9 @@ these methods. A fallback returns `first(data)` if `data` is a tuple, and `data` otherwise. The LearnAPI.jl specification has nothing to say regarding `fit` signatures with more than -two arguments. For convenience, for example, an algorithm is free to implement a slurping -signature, such as `fit(algorithm, X, y, extras...) = fit(algorithm, (X, y, extras...))` but -LearnAPI.jl does not guarantee such signatures are actually implemented. +two arguments. For convenience, for example, an implementation is free to implement a +slurping signature, such as `fit(learner, X, y, extras...) = fit(learner, (X, y, +extras...))` but LearnAPI.jl does not guarantee such signatures are actually implemented. $(DOC_DATA_INTERFACE(:fit)) @@ -65,10 +66,10 @@ Return an updated version of the `model` object returned by a previous [`fit`](@ p2=value2, ...`. 
```julia -algorithm = MyForest(ntrees=100) +learner = MyForest(ntrees=100) # train with 100 trees: -model = fit(algorithm, data) +model = fit(learner, data) # add 50 more trees: model = update(model, data; ntrees=150) @@ -76,13 +77,13 @@ model = update(model, data; ntrees=150) Provided that `data` is identical with the data presented in a preceding `fit` call *and* there is at most one hyperparameter replacement, as in the above example, execution is -semantically equivalent to the call `fit(algorithm, data)`, where `algorithm` is -`LearnAPI.algorithm(model)` with the specified replacements. In some cases (typically, +semantically equivalent to the call `fit(learner, data)`, where `learner` is +`LearnAPI.learner(model)` with the specified replacements. In some cases (typically, when changing an iteration parameter) there may be a performance benefit to using `update` instead of retraining ab initio. If `data` differs from that in the preceding `fit` or `update` call, or there is more than -one hyperparameter replacement, then behaviour is algorithm-specific. +one hyperparameter replacement, then behaviour is learner-specific. See also [`fit`](@ref), [`update_observations`](@ref), [`update_features`](@ref). @@ -104,19 +105,19 @@ Return an updated version of the `model` object returned by a previous [`fit`](@ specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`. ```julia-repl -algorithm = MyNeuralNetwork(epochs=10, learning_rate=0.01) +learner = MyNeuralNetwork(epochs=10, learning_rate=0.01) # train for ten epochs: -model = fit(algorithm, data) +model = fit(learner, data) # train for two more epochs using new data and new learning rate: model = update_observations(model, new_data; epochs=2, learning_rate=0.1) ``` -When following the call `fit(algorithm, data)`, the `update` call is semantically +When following the call `fit(learner, data)`, the `update` call is semantically equivalent to retraining ab initio using a concatenation of `data` and `new_data`, *provided there are no hyperparameter replacements* (which rules out the example -above). Behaviour is otherwise algorithm-specific. +above). Behaviour is otherwise learner-specific. See also [`fit`](@ref), [`update`](@ref), [`update_features`](@ref). @@ -139,10 +140,10 @@ Return an updated version of the `model` object returned by a previous [`fit`](@ `update` call given the new features encapsulated in `new_data`. One may additionally specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`. -When following the call `fit(algorithm, data)`, the `update` call is semantically +When following the call `fit(learner, data)`, the `update` call is semantically equivalent to retraining ab initio using a concatenation of `data` and `new_data`, *provided there are no hyperparameter replacements.* Behaviour is otherwise -algorithm-specific. +learner-specific. See also [`fit`](@ref), [`update`](@ref), [`update_features`](@ref). diff --git a/src/obs.jl b/src/obs.jl index 8b226211..d107fa77 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -1,14 +1,14 @@ """ - obs(algorithm, data) + obs(learner, data) obs(model, data) -Return an algorithm-specific representation of `data`, suitable for passing to `fit` +Return learner-specific representation of `data`, suitable for passing to `fit` (first signature) or to `predict` and `transform` (second signature), in place of -`data`. Here `model` is the return value of `fit(algorithm, ...)` for some LearnAPI.jl -algorithm, `algorithm`. +`data`. 
Here `model` is the return value of `fit(learner, ...)` for some LearnAPI.jl +learner, `learner`. The returned object is guaranteed to implement observation access as indicated by -[`LearnAPI.data_interface(algorithm)`](@ref), typically +[`LearnAPI.data_interface(learner)`](@ref), typically [`LearnAPI.RandomAccess()`](@ref). Calling `fit`/`predict`/`transform` on the returned objects may have performance @@ -23,18 +23,18 @@ Usual workflow, using data-specific resampling methods: ```julia data = (X, y) # a DataFrame and a vector data_train = (Tables.select(X, 1:100), y[1:100]) -model = fit(algorithm, data_train) +model = fit(learner, data_train) ŷ = predict(model, Point(), X[101:150]) ``` -Alternative workflow using `obs` and the MLUtils.jl method `getobs` (assumes -`LearnAPI.data_interface(algorithm) == RandomAccess()`): +Alternative, data agnostic, workflow using `obs` and the MLUtils.jl method `getobs` +(assumes `LearnAPI.data_interface(learner) == RandomAccess()`): ```julia import MLUtils -fit_observations = obs(algorithm, data) -model = fit(algorithm, MLUtils.getobs(fit_observations, 1:100)) +fit_observations = obs(learner, data) +model = fit(learner, MLUtils.getobs(fit_observations, 1:100)) predict_observations = obs(model, X) ẑ = predict(model, Point(), MLUtils.getobs(predict_observations, 101:150)) @@ -50,15 +50,15 @@ See also [`LearnAPI.data_interface`](@ref). Implementation is typically optional. -For each supported form of `data` in `fit(algorithm, data)`, it must be true that `model = -fit(algorithm, observations)` is equivalent to `model = fit(algorithm, data)`, whenever -`observations = obs(algorithm, data)`. For each supported form of `data` in calls +For each supported form of `data` in `fit(learner, data)`, it must be true that `model = +fit(learner, observations)` is equivalent to `model = fit(learner, data)`, whenever +`observations = obs(learner, data)`. For each supported form of `data` in calls `predict(model, ..., data)` and `transform(model, data)`, where implemented, the calls `predict(model, ..., observations)` and `transform(model, observations)` are supported alternatives, whenever `observations = obs(model, data)`. -The fallback for `obs` is `obs(model_or_algorithm, data) = data`, and the fallback for -`LearnAPI.data_interface(algorithm)` is `LearnAPI.RandomAccess()`. For details refer to +The fallback for `obs` is `obs(model_or_learner, data) = data`, and the fallback for +`LearnAPI.data_interface(learner)` is `LearnAPI.RandomAccess()`. For details refer to the [`LearnAPI.data_interface`](@ref) document string. In particular, if the `data` to be consumed by `fit`, `predict` or `transform` consists @@ -66,9 +66,9 @@ only of suitable tables and arrays, then `obs` and `LearnAPI.data_interface` do to be overloaded. However, the user will get no performance benefits by using `obs` in that case. -When overloading `obs(algorithm, data)` to output new model-specific representations of +When overloading `obs(learner, data)` to output new model-specific representations of data, it may be necessary to also overload [`LearnAPI.features`](@ref), -[`LearnAPI.target`](@ref) (supervised algorithms), and/or [`LearnAPI.weights`](@ref) (if +[`LearnAPI.target`](@ref) (supervised learners), and/or [`LearnAPI.weights`](@ref) (if weights are supported), for extracting relevant parts of the representation. 
## Sample implementation @@ -78,4 +78,4 @@ Refer to the "Anatomy of an Implementation" section of the LearnAPI.jl """ -obs(algorithm_or_model, data) = data +obs(learner_or_model, data) = data diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 726f263f..5c888e48 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -7,7 +7,7 @@ end DOC_MUTATION(op) = """ - If [`LearnAPI.is_static(algorithm)`](@ref) is `true`, then `$op` may mutate it's first + If [`LearnAPI.is_static(learner)`](@ref) is `true`, then `$op` may mutate it's first argument, but not in a way that alters the result of a subsequent call to `predict`, `transform` or `inverse_transform`. See more at [`fit`](@ref). @@ -16,9 +16,9 @@ DOC_MUTATION(op) = DOC_SLURPING(op) = """ - An algorithm is free to implement `$op` signatures with additional positional - arguments (eg., data-slurping signatures) but LearnAPI.jl is silent about their - interpretation or existence. + An implementation is free to implement `$op` signatures with additional positional + arguments (eg., data-slurping signatures) but LearnAPI.jl is silent about their + interpretation or existence. """ @@ -29,7 +29,7 @@ DOC_MINIMIZE(func) = identity must hold: ```julia - $func(LearnAPI.strip(model), args...) = $func(model, args...) + $func(LearnAPI.strip(model), args...) == $func(model, args...) ``` """ @@ -41,7 +41,7 @@ DOC_DATA_INTERFACE(method) = By default, it is assumed that `data` supports the [`LearnAPI.RandomAccess`](@ref) interface; this includes all matrices, with observations-as-columns, most tables, and - tuples thereof). See [`LearnAPI.RandomAccess`](@ref) for details. If this is not the + tuples thereof. See [`LearnAPI.RandomAccess`](@ref) for details. If this is not the case then an implementation must either: (i) overload [`obs`](@ref) to articulate how provided data can be transformed into a form that does support [`LearnAPI.RandomAccess`](@ref); or (ii) overload the trait @@ -61,21 +61,21 @@ The first signature returns target predictions, or proxies for target prediction input features `data`, according to some `model` returned by [`fit`](@ref). Where supported, these are literally target predictions if `kind_of_proxy = Point()`, and probability density/mass functions if `kind_of_proxy = Distribution()`. List all -options with [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), where `algorithm = -LearnAPI.algorithm(model)`. +options with [`LearnAPI.kinds_of_proxy(learner)`](@ref), where `learner = +LearnAPI.learner(model)`. ```julia -model = fit(algorithm, (X, y)) +model = fit(learner, (X, y)) predict(model, Point(), Xnew) ``` -The shortcut `predict(model, data)` calls the first method with an algorithm-specific -`kind_of_proxy`, namely the first element of [`LearnAPI.kinds_of_proxy(algorithm)`](@ref), +The shortcut `predict(model, data)` calls the first method with learner-specific +`kind_of_proxy`, namely the first element of [`LearnAPI.kinds_of_proxy(learner)`](@ref), which lists all supported target proxies. -The argument `model` is anything returned by a call of the form `fit(algorithm, ...)`. +The argument `model` is anything returned by a call of the form `fit(learner, ...)`. -If `LearnAPI.features(LearnAPI.algorithm(model)) == nothing`, then argument `data` is +If `LearnAPI.features(LearnAPI.learner(model)) == nothing`, then the argument `data` is omitted in both signatures. An example is density estimators. See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref). 
@@ -83,7 +83,7 @@ See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref). # Extended help Note `predict ` must not mutate any argument, except in the special case -`LearnAPI.is_static(algorithm) == true`. +`LearnAPI.is_static(learner) == true`. # New implementations @@ -95,7 +95,7 @@ is implemented, but each `kind_of_proxy` that gets an implementation must be add list returned by [`LearnAPI.kinds_of_proxy`](@ref). If `data` is not present in the implemented signature (eg., for density estimators) then -[`LearnAPI.features(algorithm, data)`](@ref) must return `nothing`. +[`LearnAPI.features(learner, data)`](@ref) must return `nothing`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.predict)")) @@ -106,8 +106,8 @@ $(DOC_MUTATION(:predict)) $(DOC_DATA_INTERFACE(:predict)) """ -predict(model, data) = predict(model, kinds_of_proxy(algorithm(model)) |> first, data) -predict(model) = predict(model, kinds_of_proxy(algorithm(model)) |> first) +predict(model, data) = predict(model, kinds_of_proxy(learner(model)) |> first, data) +predict(model) = predict(model, kinds_of_proxy(learner(model)) |> first) """ transform(model, data) @@ -119,28 +119,34 @@ Return a transformation of some `data`, using some `model`, as returned by Below, `X` and `Xnew` are data of the same form. -For an `algorithm` that generalizes to new data ("learns"): +For a `learner` that generalizes to new data ("learns"): ```julia -model = fit(algorithm, X; verbosity=0) +model = fit(learner, X; verbosity=0) transform(model, Xnew) ``` +or, in one step (where supported): + +```julia +W = transform(learner, X) # `fit` implied +``` + For a static (non-generalizing) transformer: ```julia -model = fit(algorithm) +model = fit(learner) W = transform(model, X) ``` or, in one step (where supported): ```julia -W = transform(algorithm, X) +W = transform(learner, X) # `fit` implied ``` Note `transform` does not mutate any argument, except in the special case -`LearnAPI.is_static(algorithm) == true`. +`LearnAPI.is_static(learner) == true`. See also [`fit`](@ref), [`predict`](@ref), [`inverse_transform`](@ref). @@ -149,7 +155,7 @@ See also [`fit`](@ref), [`predict`](@ref), # New implementations -Implementation for new LearnAPI.jl algorithms is +Implementation for new LearnAPI.jl learners is optional. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.transform)")) $(DOC_SLURPING(:transform)) @@ -169,15 +175,15 @@ function transform end Inverse transform `data` according to some `model` returned by [`fit`](@ref). Here "inverse" is to be understood broadly, e.g, an approximate -right inverse for [`transform`](@ref). +right or left inverse for [`transform`](@ref). 
# Example -In the following, `algorithm` is some dimension-reducing algorithm that generalizes to new +In the following, `learner` is some dimension-reducing algorithm that generalizes to new data (such as PCA); `Xtrain` is the training input and `Xnew` the input to be reduced: ```julia -model = fit(algorithm, Xtrain) +model = fit(learner, Xtrain) W = transform(model, Xnew) # reduced version of `Xnew` Ŵ = inverse_transform(model, W) # embedding of `W` in original space ``` diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl index 58243030..aee3481a 100644 --- a/src/target_weights_features.jl +++ b/src/target_weights_features.jl @@ -1,9 +1,9 @@ """ - LearnAPI.target(algorithm, data) -> target + LearnAPI.target(learner, data) -> target -Return, for each form of `data` supported in a call of the form [`fit(algorithm, +Return, for each form of `data` supported in a call of the form [`fit(learner, data)`](@ref), the target variable part of `data`. If `nothing` is returned, the -`algorithm` does not see a target variable in training (is unsupervised). +`learner` does not see a target variable in training (is unsupervised). Refer to LearnAPI.jl documentation for the precise meaning of "target". @@ -18,9 +18,9 @@ $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.target)"; overloaded=true)) target(::Any, data) = nothing """ - LearnAPI.weights(algorithm, data) -> weights + LearnAPI.weights(learner, data) -> weights -Return, for each form of `data` supported in a call of the form [`fit(algorithm, +Return, for each form of `data` supported in a call of the form [`fit(learner, data)`](@ref), the per-observation weights part of `data`. Where `nothing` is returned, no weights are part of `data`, which is to be interpreted as uniform weighting. @@ -34,9 +34,9 @@ $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.weights)"; overloaded=true)) weights(::Any, data) = nothing """ - LearnAPI.features(algorithm, data) + LearnAPI.features(learner, data) -Return, for each form of `data` supported in a call of the form [`fit(algorithm, +Return, for each form of `data` supported in a call of the form [`fit(learner, data)`](@ref), the "features" part of `data` (as opposed to the target variable, for example). @@ -44,14 +44,14 @@ The returned object `X` may always be passed to `predict` or `transform`, where implemented, as in the following sample workflow: ```julia -model = fit(algorithm, data) +model = fit(learner, data) X = features(data) -ŷ = predict(algorithm, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` +ŷ = predict(learner, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` ``` The returned object has the same number of observations as `data`. For supervised models -(i.e., where `:(LearnAPI.target) in LearnAPI.functions(algorithm)`) `ŷ` above is generally -intended to be an approximate proxy for `LearnAPI.target(algorithm, data)`, the training +(i.e., where `:(LearnAPI.target) in LearnAPI.functions(learner)`) `ŷ` above is generally +intended to be an approximate proxy for `LearnAPI.target(learner, data)`, the training target. @@ -61,13 +61,13 @@ That the output can be passed to `predict` and/or `transform`, and has the same observations as `data`, are the only contracts. A fallback returns `first(data)` if `data` is a tuple, and otherwise returns `data`. -Overloading may be necessary if [`obs(algorithm, data)`](@ref) is overloaded to return -some algorithm-specific representation of training `data`. 
For density estimators, whose +Overloading may be necessary if [`obs(learner, data)`](@ref) is overloaded to return +some learner-specific representation of training `data`. For density estimators, whose `fit` typically consumes *only* a target variable, you should overload this method to return `nothing`. """ -features(algorithm, data) = _first(data) +features(learner, data) = _first(data) _first(data) = data _first(data::Tuple) = first(data) # note the factoring above guards against method ambiguities diff --git a/src/tools.jl b/src/tools.jl index 1b033f05..731860ff 100644 --- a/src/tools.jl +++ b/src/tools.jl @@ -9,9 +9,9 @@ function name_value_pair(ex) end """ - @trait(TypeEx, trait1=value1, trait2=value2, ...) + @trait(LearnerType, trait1=value1, trait2=value2, ...) -Overload a number of traits for algorithms of type `TypeEx`. For example, the code +Overload a number of traits for learners of type `LearnerType`. For example, the code ```julia @trait( @@ -29,13 +29,13 @@ LearnAPI.doc_url(::RidgeRegressor) = "https://some.cool.documentation", ``` """ -macro trait(algorithm_ex, exs...) +macro trait(learner_ex, exs...) program = quote end for ex in exs trait_ex, value_ex = name_value_pair(ex) push!( program.args, - :($LearnAPI.$trait_ex(::$algorithm_ex) = $value_ex), + :($LearnAPI.$trait_ex(::$learner_ex) = $value_ex), ) end return esc(program) diff --git a/src/traits.jl b/src/traits.jl index 9b566120..a72742e0 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -1,17 +1,17 @@ # There are two types of traits - ordinary traits that an implementation overloads to make -# promises of algorithm behavior, and derived traits, which are never overloaded. +# promises of learner behavior, and derived traits, which are never overloaded. const DOC_UNKNOWN = - "Returns `\"unknown\"` if the algorithm implementation has "* + "Returns `\"unknown\"` if the learner implementation has "* "not overloaded the trait. " -const DOC_ON_TYPE = "The value of the trait must depend only on the type of `algorithm`. " +const DOC_ON_TYPE = "The value of the trait must depend only on the type of `learner`. " const DOC_EXPLAIN_EACHOBS = """ Here, "for each `o` in `observations`" is understood in the sense of - [`LearnAPI.data_interface(algorithm)`](@ref). For example, if - `LearnAPI.data_interface(algorithm) == Base.HasLength()`, then this means "for `o` in + [`LearnAPI.data_interface(learner)`](@ref). For example, if + `LearnAPI.data_interface(learner) == Base.HasLength()`, then this means "for `o` in `MLUtils.eachobs(observations)`". """ @@ -19,16 +19,16 @@ const DOC_EXPLAIN_EACHOBS = # # OVERLOADABLE TRAITS """ - Learn.API.constructor(algorithm) + Learn.API.constructor(learner) -Return a keyword constructor that can be used to clone `algorithm`: +Return a keyword constructor that can be used to clone `learner`: ```julia-repl -julia> algorithm.lambda +julia> learner.lambda 0.1 -julia> C = LearnAPI.constructor(algorithm) -julia> algorithm2 = C(lambda=0.2) -julia> algorithm2.lambda +julia> C = LearnAPI.constructor(learner) +julia> learner2 = C(lambda=0.2) +julia> learner2.lambda 0.2 ``` @@ -36,21 +36,21 @@ julia> algorithm2.lambda All new implementations must overload this trait. -Attach public LearnAPI.jl-related documentation for an algorithm to the constructor, not -the algorithm struct. +Attach public LearnAPI.jl-related documentation for learner to the constructor, not +the learner struct. 
-It must be possible to recover an algorithm from the constructor returned as follows:
+It must be possible to recover a learner from the constructor returned as follows:

```julia
-properties = propertynames(algorithm)
-named_properties = NamedTuple{properties}(getproperty.(Ref(algorithm), properties))
-@assert algorithm == LearnAPI.constructor(algorithm)(; named_properties...)
+properties = propertynames(learner)
+named_properties = NamedTuple{properties}(getproperty.(Ref(learner), properties))
+@assert learner == LearnAPI.constructor(learner)(; named_properties...)
```

-which can be tested with `@assert LearnAPI.clone(algorithm) == algorithm`.
+which can be tested with `@assert LearnAPI.clone(learner) == learner`.

 The keyword constructor provided by `LearnAPI.constructor` must provide default values for
-all properties, with the exception of those that can take other LearnAPI.jl algorithms as
+all properties, with the exception of those that can take other LearnAPI.jl learners as
 values. These can be provided with the default `nothing`, with the constructor throwing an
 error if the default value persists.

@@ -58,17 +58,17 @@ error if the default value persists.
 function constructor end

 """
-    LearnAPI.functions(algorithm)
+    LearnAPI.functions(learner)

 Return a tuple of expressions representing functions that can be meaningfully applied
-with `algorithm`, or an associated model (object returned by `fit(algorithm, ...)`, as the
-first argument. Algorithm traits (methods for which `algorithm` is the *only* argument)
+with `learner`, or an associated model (object returned by `fit(learner, ...)`), as the
+first argument. Learner traits (methods for which `learner` is the *only* argument)
 are excluded.

 The returned tuple may include expressions like `:(DecisionTree.print_tree)`, which
 reference functions not owned by LearnAPI.jl.

-The understanding is that `algorithm` is a LearnAPI-compliant object whenever the return
+The understanding is that `learner` is a LearnAPI-compliant object whenever the return
 value is non-empty.

 # Extended help

@@ -81,7 +81,7 @@ return value:

 | expression                        | implementation compulsory? | include in returned tuple?         |
 |-----------------------------------|----------------------------|------------------------------------|
 | `:(LearnAPI.fit)`                 | yes                        | yes                                |
-| `:(LearnAPI.algorithm)`           | yes                        | yes                                |
+| `:(LearnAPI.learner)`             | yes                        | yes                                |
 | `:(LearnAPI.strip)`               | no                         | yes                                |
 | `:(LearnAPI.obs)`                 | no                         | yes                                |
 | `:(LearnAPI.features)`            | no                         | yes, unless `fit` consumes no data |
@@ -96,13 +96,13 @@ return value:
 | < accessor functions>             | no                         | only if implemented                |

 Also include any implemented accessor functions, both those owned by LearnAPI.jl, and any
-algorithm-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST
+learner-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST
 (`LearnAPI.strip` is always included).

"""
functions() = (
    :(LearnAPI.fit),
-    :(LearnAPI.algorithm),
+    :(LearnAPI.learner),
    :(LearnAPI.strip),
    :(LearnAPI.obs),
    :(LearnAPI.features),
@@ -117,9 +117,9 @@
)

"""
-    LearnAPI.kinds_of_proxy(algorithm)
+    LearnAPI.kinds_of_proxy(learner)

-Returns a tuple of all instances, `kind`, for which for which `predict(algorithm, kind,
+Returns a tuple of all instances, `kind`, for which `predict(learner, kind,
data...)` has a guaranteed implementation. Each such `kind` subtypes
[`LearnAPI.KindOfProxy`](@ref). 
Examples are `Point()` (for predicting
actual target values) and `Distribution()` (for predicting probability mass/density
functions).

@@ -143,13 +143,13 @@
Suppose, for example, we have the following implementation of a supervised learner
returning only probabilistic predictions:

```julia
-LearnAPI.predict(algorithm::MyNewAlgorithmType, LearnAPI.Distribution(), Xnew) = ...
+LearnAPI.predict(learner::MyNewLearnerType, LearnAPI.Distribution(), Xnew) = ...
```

Then we can declare

```julia
-@trait MyNewAlgorithmType kinds_of_proxy = (LearnaAPI.Distribution(),)
+@trait MyNewLearnerType kinds_of_proxy = (LearnAPI.Distribution(),)
```

LearnAPI.jl provides the fallback for `predict(model, data)`.

@@ -166,7 +166,7 @@
tags() = [
    "classification",
    "clustering",
    "gradient descent",
-    "iterative algorithms",
+    "iterative learners",
    "incremental algorithms",
    "feature engineering",
    "dimension reduction",
@@ -189,9 +189,9 @@
]

"""
-    LearnAPI.tags(algorithm)
+    LearnAPI.tags(learner)

-Lists one or more suggestive algorithm tags. Do `LearnAPI.tags()` to list
+Lists one or more suggestive learner tags. Do `LearnAPI.tags()` to list
all possible.

!!! warning

@@ -206,9 +206,9 @@ This trait should return a tuple of strings, as in `("classifier", "text analysi
tags(::Any) = ()

"""
-    LearnAPI.is_pure_julia(algorithm)
+    LearnAPI.is_pure_julia(learner)

-Returns `true` if training `algorithm` requires evaluation of pure Julia code only.
+Returns `true` if training `learner` requires evaluation of pure Julia code only.

# New implementations

The fallback is `false`.

"""
is_pure_julia(::Any) = false

"""
-    LearnAPI.pkg_name(algorithm)
+    LearnAPI.pkg_name(learner)

Return the name of the package module which supplies the core training algorithm for
-`algorithm`. This is not necessarily the package providing the LearnAPI
+`learner`. This is not necessarily the package providing the LearnAPI
interface.

$DOC_UNKNOWN

# New implementations

Must return a string, as in `"DecisionTree"`.

"""
pkg_name(::Any) = "unknown"

"""
-    LearnAPI.pkg_license(algorithm)
+    LearnAPI.pkg_license(learner)

Return the name of the software license, such as `"MIT"`, applying to the package where the
-core algorithm for `algorithm` is implemented.
+core algorithm for `learner` is implemented.

"""
pkg_license(::Any) = "unknown"

"""
-    LearnAPI.doc_url(algorithm)
+    LearnAPI.doc_url(learner)

-Return a url where the core algorithm for `algorithm` is documented.
+Return a url where the core algorithm for `learner` is documented.

$DOC_UNKNOWN

# New implementations

Must return a string, such as `"https://en.wikipedia.org/wiki/Decision_tree_lear

doc_url(::Any) = "unknown"

"""
-    LearnAPI.load_path(algorithm)
+    LearnAPI.load_path(learner)

-Return a string indicating where in code the definition of the algorithm's constructor can
+Return a string indicating where in code the definition of the learner's constructor can
be found, beginning with the name of the package module defining it. By "constructor" we
-mean the return value of [`LearnAPI.constructor(algorithm)`](@ref).
+mean the return value of [`LearnAPI.constructor(learner)`](@ref). 
# Implementation

@@ -271,7 +271,7 @@ following julia code will not error:

```julia
import FastTrees
import LearnAPI
-@assert FastTrees.LearnAPI.DecisionTreeClassifier == LearnAPI.constructor(algorithm)
+@assert FastTrees.LearnAPI.DecisionTreeClassifier == LearnAPI.constructor(learner)
```

$DOC_UNKNOWN

"""
load_path(::Any) = "unknown"


"""
-    LearnAPI.is_composite(algorithm)
+    LearnAPI.is_composite(learner)

-Returns `true` if one or more properties (fields) of `algorithm` may themselves be
-algorithms, and `false` otherwise.
+Returns `true` if one or more properties (fields) of `learner` may themselves be
+learners, and `false` otherwise.

See also [`LearnAPI.components`](@ref).

# New implementations

-This trait should be overloaded if one or more properties (fields) of `algorithm` may take
-algorithm values. Fallback return value is `false`. The keyword constructor for such an
-algorithm need not prescribe defaults for algorithm-valued properties. Implementation of
+This trait should be overloaded if one or more properties (fields) of `learner` may take
+learner values. Fallback return value is `false`. The keyword constructor for such a
+learner need not prescribe defaults for learner-valued properties. Implementation of
the accessor function [`LearnAPI.components`](@ref) is recommended.

$DOC_ON_TYPE

"""
is_composite(::Any) = false

"""
-    LearnAPI.human_name(algorithm)
+    LearnAPI.human_name(learner)

-Return a human-readable string representation of `typeof(algorithm)`. Primarily intended
+Return a human-readable string representation of `typeof(learner)`. Primarily intended
for auto-generation of documentation.

# New implementations

Optional. A fallback takes the type name, inserts spaces and removes capitalization. For
example, `KNNRegressor` becomes `"knn regressor"`. Better would be to overload the trait
to return `"K-nearest neighbors regressor"`. Ideally, this is a "concrete" noun like
`"ridge regressor"` rather than an "abstract" noun like `"ridge regression"`.

"""
-human_name(algorithm) = snakecase(name(algorithm), delim=' ') # `name` defined below
+human_name(learner) = snakecase(name(learner), delim=' ') # `name` defined below

"""
-    LearnAPI.data_interface(algorithm)
+    LearnAPI.data_interface(learner)

-Return the data interface supported by `algorithm` for accessing individual observations
-in representations of input data returned by [`obs(algorithm, data)`](@ref) or
-[`obs(model, data)`](@ref), whenever `algorithm == LearnAPI.algorithm(model)`. Here `data`
+Return the data interface supported by `learner` for accessing individual observations
+in representations of input data returned by [`obs(learner, data)`](@ref) or
+[`obs(model, data)`](@ref), whenever `learner == LearnAPI.learner(model)`. Here `data`
is `fit`, `predict`, or `transform`-consumable data.

Possible return values are [`LearnAPI.RandomAccess`](@ref),
[`LearnAPI.FiniteIterable`](@ref), and [`LearnAPI.Iterable`](@ref).

@@ -340,17 +340,17 @@ tables, and tuples of these. See the doc-string for details.

data_interface(::Any) = LearnAPI.RandomAccess()

"""
-    LearnAPI.is_static(algorithm)
+    LearnAPI.is_static(learner)

Returns `true` if [`fit`](@ref) is called with no data arguments, as in
-`fit(algorithm)`. That is, `algorithm` does not generalize to new data, and data is only
+`fit(learner)`. That is, `learner` does not generalize to new data, and data is only
provided at the `predict` or `transform` step.

For example, some clustering algorithms are applied with this workflow, to assign labels
to the observations in `X`:

```julia
-model = fit(algorithm) # no training data
+model = fit(learner) # no training data
labels = predict(model, X) # may mutate `model`!

# extract some byproducts of the clustering algorithm (e.g., outliers):
@@ -366,9 +366,9 @@ arguments. See more at [`fit`](@ref).
is_static(::Any) = false
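Gathering several of the traits documented above into a single declaration, an implementation might include something like the following sketch, in which `MyClusterer` and all values shown are hypothetical:

```julia
@trait(
    MyClusterer,                  # hypothetical learner type
    constructor = MyClusterer,
    is_static = true,             # `fit(learner)` consumes no data
    pkg_name = "MyClusteringPkg",
    doc_url = "https://example.com/MyClusteringPkg/docs",
    human_name = "my clusterer",
)
```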
"""
-    LearnAPI.iteration_parameter(algorithm)
+    LearnAPI.iteration_parameter(learner)

-The name of the iteration parameter of `algorithm`, or `nothing` if the algorithm is not
+The name of the iteration parameter of `learner`, or `nothing` if the algorithm is not
iterative.

# New implementations

@@ -380,12 +380,12 @@
iteration_parameter(::Any) = nothing


"""
-    LearnAPI.fit_observation_scitype(algorithm)
+    LearnAPI.fit_observation_scitype(learner)

Return an upper bound `S` on the scitype of individual observations guaranteed to work
-when calling `fit`: if `observations = obs(algorithm, data)` and
+when calling `fit`: if `observations = obs(learner, data)` and
`ScientificTypes.scitype(o) <:S` for each `o` in `observations`, then the call
-`fit(algorithm, data)` is supported.
+`fit(learner, data)` is supported.

$DOC_EXPLAIN_EACHOBS

# New implementations

Optional. The fallback return value is `Union{}`.

"""
fit_observation_scitype(::Any) = Union{}

"""
-    LearnAPI.target_observation_scitype(algorithm)
+    LearnAPI.target_observation_scitype(learner)

Return an upper bound `S` on the scitype of each observation of an applicable target
variable. Specifically:

-- If `:(LearnAPI.target) in LearnAPI.functions(algorithm)` (i.e., `fit` consumes target
-  variables) then "target" means anything returned by `LearnAPI.target(algorithm, data)`,
-  where `data` is an admissible argument in the call `fit(algorithm, data)`.
+- If `:(LearnAPI.target) in LearnAPI.functions(learner)` (i.e., `fit` consumes target
+  variables) then "target" means anything returned by `LearnAPI.target(learner, data)`,
+  where `data` is an admissible argument in the call `fit(learner, data)`.

- `S` will always be an upper bound on the scitype of (point) observations that could be
  conceivably extracted from the output of [`predict`](@ref).

To illustrate the second case, suppose we have

```julia
-model = fit(algorithm, data)
+model = fit(learner, data)
ŷ = predict(model, Sampleable(), data_new)
```

@@ -433,8 +433,8 @@
target_observation_scitype(::Any) = Any

# # DERIVED TRAITS

-name(algorithm) = split(string(constructor(algorithm)), ".") |> last
-is_algorithm(algorithm) = !isempty(functions(algorithm))
-preferred_kind_of_proxy(algorithm) = first(kinds_of_proxy(algorithm))
-target(algorithm) = :(LearnAPI.target) in functions(algorithm)
-weights(algorithm) = :(LearnAPI.weights) in functions(algorithm)
+name(learner) = split(string(constructor(learner)), ".") |> last
+is_learner(learner) = !isempty(functions(learner))
+preferred_kind_of_proxy(learner) = first(kinds_of_proxy(learner))
+target(learner) = :(LearnAPI.target) in functions(learner)
+weights(learner) = :(LearnAPI.weights) in functions(learner)
diff --git a/src/types.jl b/src/types.jl
index be40922f..8da8f318 100644
--- a/src/types.jl
+++ b/src/types.jl
@@ -99,11 +99,11 @@ struct JointLogDistribution <: Joint end
"""
    Single <: KindOfProxy

-Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). It applies only to algorithms for
+Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). It applies only to learners for
which `predict` has no data argument, i.e., is of the form `predict(model,
kind_of_proxy)`. An example is an algorithm learning a probability distribution from
samples, and we regard the samples as drawn from the "target" variable. 
If in this case, -`kind_of_proxy` is an instance of `LearnAPI.Single` then, `predict(algorithm)` returns a +`kind_of_proxy` is an instance of `LearnAPI.Single` then, `predict(learner)` returns a single object representing a probability distribution. | type `T` | form of output of `predict(model, ::T)` | @@ -146,14 +146,14 @@ const DOC_HOW_TO_LIST_PROXIES = LearnAPI.KindOfProxy Abstract type whose concrete subtypes `T` each represent a different kind of proxy for -some target variable, associated with some algorithm. Instances `T()` are used to request +some target variable, associated with some learner. Instances `T()` are used to request the form of target predictions in [`predict`](@ref) calls. See LearnAPI.jl documentation for an explanation of "targets" and "target proxies". For example, `Distribution` is a concrete subtype of `LearnAPI.KindOfProxy` and a call like `predict(model, Distribution(), Xnew)` returns a data object whose observations are -probability density/mass functions, assuming `algorithm` supports predictions of that +probability density/mass functions, assuming `learner` supports predictions of that form. $DOC_HOW_TO_LIST_PROXIES @@ -180,15 +180,15 @@ All arrays implement `RandomAccess`, with the last index being the observation i (observations-as-columns in matrices). A Tables.jl compatible table `data` implements `RandomAccess` if `Tables.istable(data)` is -true and if `data` implements `DataAPI.nrows`. This includes many tables, and in +true and if `data` implements `DataAPI.nrow`. This includes many tables, and in particular, `DataFrame`s. Tables that are also tuples are explicitly excluded. Any tuple of objects implementing `RandomAccess` also implements `RandomAccess`. -If [`LearnAPI.data_interface(algorithm)`](@ref) takes the value `RandomAccess()`, then -[`obs`](@ref)`(algorithm, ...)` is guaranteed to return objects implementing the +If [`LearnAPI.data_interface(learner)`](@ref) takes the value `RandomAccess()`, then +[`obs`](@ref)`(learner, ...)` is guaranteed to return objects implementing the `RandomAccess` interface, and the same holds for `obs(model, ...)`, whenever -`LearnAPI.algorithm(model) == algorithm`. +`LearnAPI.learner(model) == learner`. # Implementing `RandomAccess` for new data types @@ -211,10 +211,10 @@ it implements Julia's `iterate` interface, including `Base.length`, and if - `data isa MLUtils.DataLoader`, which includes output from `MLUtils.eachobs`. -If [`LearnAPI.data_interface(algorithm)`](@ref) takes the value `FiniteIterable()`, then -[`obs`](@ref)`(algorithm, ...)` is guaranteed to return objects implementing the +If [`LearnAPI.data_interface(learner)`](@ref) takes the value `FiniteIterable()`, then +[`obs`](@ref)`(learner, ...)` is guaranteed to return objects implementing the `FiniteIterable` interface, and the same holds for `obs(model, ...)`, whenever -`LearnAPI.algorithm(model) == algorithm`. +`LearnAPI.learner(model) == learner`. See also [`LearnAPI.RandomAccess`](@ref), [`LearnAPI.Iterable`](@ref). """ @@ -227,10 +227,10 @@ A data interface type. We say that `data` implements the `Iterable` interface if implements Julia's basic `iterate` interface. (Such objects may not implement `MLUtils.numobs` or `Base.length`.) -If [`LearnAPI.data_interface(algorithm)`](@ref) takes the value `Iterable()`, then -[`obs`](@ref)`(algorithm, ...)` is guaranteed to return objects implementing `Iterable`, -and the same holds for `obs(model, ...)`, whenever `LearnAPI.algorithm(model) == -algorithm`. 
+If [`LearnAPI.data_interface(learner)`](@ref) takes the value `Iterable()`, then
+[`obs`](@ref)`(learner, ...)` is guaranteed to return objects implementing `Iterable`,
+and the same holds for `obs(model, ...)`, whenever `LearnAPI.learner(model) ==
+learner`.

See also [`LearnAPI.FiniteIterable`](@ref), [`LearnAPI.RandomAccess`](@ref).

diff --git a/test/patterns/ensembling.jl b/test/patterns/ensembling.jl
index ad348e4a..73b864b8 100644
--- a/test/patterns/ensembling.jl
+++ b/test/patterns/ensembling.jl
@@ -9,10 +9,10 @@ using StableRNGs

# # ENSEMBLE OF REGRESSORS (A MODEL WRAPPER)

-# We implement a toy algorithm that creates an bagged ensemble of regressors, i.e, where
-# each atomic model is trained on a random sample of the training observations (same
-# number, but sampled with replacement). In particular this algorithm has an iteration
-# parameter `n`, and we implement `update` for warm restarts when `n` increases.
+# We implement a learner that creates a bagged ensemble of regressors, i.e., where each
+# atomic model is trained on a random sample of the training observations (same number,
+# but sampled with replacement). In particular this learner has an iteration parameter
+# `n`, and we implement `update` to execute a warm restart when `n` increases.

# no docstring here - that goes with the constructor; some fields left abstract for
# simplicity
@@ -23,9 +23,9 @@ struct Ensemble
    n::Int
end

-# Since the `atom` hyperparameter is another algorithm, it doesn't need a default in the
-# kwarg constructor, but we do need to overload the `LearnAPI.is_composite` trait (done
-# later).
+# Since the `atom` hyperparameter is another learner, the user must explicitly set it in
+# constructor calls or an error is thrown. We also need to overload the
+# `LearnAPI.is_composite` trait (done later).

"""
    Ensemble(atom; rng=Random.default_rng(), n=10)

Instantiate a bagged ensemble of `n` regressors, with base regressor `atom`, etc

"""
Ensemble(atom; rng=Random.default_rng(), n=10) =
    Ensemble(atom, rng, n) # `LearnAPI.constructor` defined later

-# pure keyword argument constructor:
+# need a pure keyword argument constructor:
function Ensemble(; atom=nothing, kwargs...)
    isnothing(atom) && error("You must specify `atom=...` ")
    Ensemble(atom; kwargs...)
end

struct EnsembleFitted
-    algorithm::Ensemble
+    learner::Ensemble
    atom::Ridge
-    rng # mutated copy of `algorithm.rng`
+    rng # mutated copy of `learner.rng`
    models # leaving type abstract for simplicity
end

-LearnAPI.algorithm(model::EnsembleFitted) = model.algorithm
+LearnAPI.learner(model::EnsembleFitted) = model.learner

# We add the same data interface that the atomic regressor uses:
-LearnAPI.obs(algorithm::Ensemble, data) = LearnAPI.obs(algorithm.atom, data)
+LearnAPI.obs(learner::Ensemble, data) = LearnAPI.obs(learner.atom, data)
LearnAPI.obs(model::EnsembleFitted, data) = LearnAPI.obs(first(model.models), data)
-LearnAPI.target(algorithm::Ensemble, data) = LearnAPI.target(algorithm.atom, data)
-LearnAPI.features(algorithm::Ensemble, data) = LearnAPI.features(algorithm.atom, data)
+LearnAPI.target(learner::Ensemble, data) = LearnAPI.target(learner.atom, data)
+LearnAPI.features(learner::Ensemble, data) = LearnAPI.features(learner.atom, data)

-function LearnAPI.fit(algorithm::Ensemble, data; verbosity=1)
+function LearnAPI.fit(learner::Ensemble, data; verbosity=1)

    # unpack hyperparameters:
-    atom = algorithm.atom
-    rng = deepcopy(algorithm.rng) # to prevent mutation of `algorithm`!
-    n = algorithm.n
+    atom = learner.atom
+    rng = deepcopy(learner.rng) # to prevent mutation of `learner`!
+    n = learner.n

    # ensure data can be subsampled using MLUtils.jl, and that we're feeding the atomic
    # `fit` data in an efficient (pre-processed) form:

@@ -87,7 +87,7 @@ function LearnAPI.fit(algorithm::Ensemble, data; verbosity=1)
    # make some noise, if allowed:
    verbosity > 0 && @info "Trained $n ridge regression models. "

-    return EnsembleFitted(algorithm, atom, rng, models)
+    return EnsembleFitted(learner, atom, rng, models)
end


@@ -97,16 +97,16 @@ end
# models. Otherwise, update is equivalent to retraining from scratch, with the provided
# hyperparameter updates.
function LearnAPI.update(model::EnsembleFitted, data; verbosity=1, replacements...)
-    algorithm_old = LearnAPI.algorithm(model)
-    algorithm = LearnAPI.clone(algorithm_old; replacements...)
+    learner_old = LearnAPI.learner(model)
+    learner = LearnAPI.clone(learner_old; replacements...)

-    :n in keys(replacements) || return fit(algorithm, data)
+    :n in keys(replacements) || return fit(learner, data)

-    n = algorithm.n
-    Δn = n - algorithm_old.n
-    n < 0 && return fit(model, algorithm)
+    n = learner.n
+    Δn = n - learner_old.n
+    Δn < 0 && return fit(learner, data)

-    atom = algorithm.atom
+    atom = learner.atom
    observations = obs(atom, data)
    N = MLUtils.numobs(observations)

@@ -125,7 +125,7 @@ function LearnAPI.update(model::EnsembleFitted, data; verbosity=1, replacements.
    # make some noise, if allowed:
    verbosity > 0 && @info "Trained $Δn additional ridge regression models. "

-    return EnsembleFitted(algorithm, atom, rng, models)
+    return EnsembleFitted(learner, atom, rng, models)
end

LearnAPI.predict(model::EnsembleFitted, ::Point, data) =
@@ -134,13 +134,13 @@ LearnAPI.predict(model::EnsembleFitted, ::Point, data) =
    end

LearnAPI.strip(model::EnsembleFitted) = EnsembleFitted(
-    model.algorithm,
+    model.learner,
    model.atom,
    model.rng,
-    LearnAPI.strip.(Ref(model.atom), models),
+    LearnAPI.strip.(model.models),
)

-# note the inclusion of `iteration_parameter`:
+# learner traits (note the inclusion of `iteration_parameter`):
@trait(
    Ensemble,
    constructor = Ensemble,
@@ -150,7 +150,7 @@ LearnAPI.strip(model::EnsembleFitted) = EnsembleFitted(
    tags = ("regression", "ensemble algorithms", "iterative models"),
    functions = (
        :(LearnAPI.fit),
-        :(LearnAPI.algorithm),
+        :(LearnAPI.learner),
        :(LearnAPI.strip),
        :(LearnAPI.obs),
        :(LearnAPI.features),
@@ -161,10 +161,10 @@ LearnAPI.strip(model::EnsembleFitted) = EnsembleFitted(
)

# convenience method:
-LearnAPI.fit(algorithm::Ensemble, X, y, extras...; kwargs...) =
-    fit(algorithm, (X, y, extras...); kwargs...)
-LearnAPI.update(algorithm::EnsembleFitted, X, y, extras...; kwargs...) =
-    update(algorithm, (X, y, extras...); kwargs...)
+LearnAPI.fit(learner::Ensemble, X, y, extras...; kwargs...) =
+    fit(learner, (X, y, extras...); kwargs...)
+LearnAPI.update(model::EnsembleFitted, X, y, extras...; kwargs...) =
+    update(model, (X, y, extras...); kwargs...)
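Before the tests, a sketch of how the `update` logic above plays out in use; `X` and `y` are hypothetical training data:

```julia
learner = Ensemble(Ridge(); n=10)
model = fit(learner, X, y)                     # trains 10 atomic models
model = update(model, X, y; n=15)              # warm restart: trains only 5 more
model = update(model, X, y; atom=Ridge(0.05))  # non-`n` replacement: cold restart
```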
# synthetic test data:

@@ -182,15 +182,15 @@ Xtest = Tables.subset(X, test)

@testset "test an implementation of bagged ensemble of ridge regressors" begin
    rng = StableRNG(123)
    atom = Ridge()
-    algorithm = Ensemble(atom; n=4, rng)
-    @test LearnAPI.clone(algorithm) == algorithm
-    @test :(LearnAPI.obs) in LearnAPI.functions(algorithm)
-    @test LearnAPI.target(algorithm, data) == y
-    @test LearnAPI.features(algorithm, data) == X
+    learner = Ensemble(atom; n=4, rng)
+    @test LearnAPI.clone(learner) == learner
+    @test :(LearnAPI.obs) in LearnAPI.functions(learner)
+    @test LearnAPI.target(learner, data) == y
+    @test LearnAPI.features(learner, data) == X

    model = @test_logs(
        (:info, r"Trained 4 ridge"),
-        fit(algorithm, Xtrain, y[train]; verbosity=1),
+        fit(learner, Xtrain, y[train]; verbosity=1),
    );

    ŷ4 = predict(model, Point(), Xtest)
@@ -201,13 +201,13 @@ Xtest = Tables.subset(X, test)
    ŷ7 = predict(model, Xtest)

    # compare with cold restart:
-    model_cold = fit(LearnAPI.clone(algorithm; n=7), Xtrain, y[train]; verbosity=0);
+    model_cold = fit(LearnAPI.clone(learner; n=7), Xtrain, y[train]; verbosity=0);
    @test ŷ7 ≈ predict(model_cold, Xtest)

    # test that we get a cold restart if another hyperparameter is changed:
    model2 = update(model, Xtrain, y[train]; atom=Ridge(0.05))
-    algorithm2 = Ensemble(Ridge(0.05); n=7, rng)
-    model_cold = fit(algorithm2, Xtrain, y[train]; verbosity=0)
+    learner2 = Ensemble(Ridge(0.05); n=7, rng)
+    model_cold = fit(learner2, Xtrain, y[train]; verbosity=0)
    @test predict(model2, Xtest) ≈ predict(model_cold, Xtest)
end
diff --git a/test/patterns/gradient_descent.jl b/test/patterns/gradient_descent.jl
index 19f0d363..27c9791e 100644
@@ -22,12 +22,12 @@ import ComponentArrays
# - `iteration_parameter`
# - `training_losses`
# - `obs` for pre-processing (non-tabular) classification training data
-# - `predict(algorithm, ::Distribution, Xnew)`
+# - `predict(learner, ::Distribution, Xnew)`

# For simplicity, we use single-observation batches for gradient descent updates, and we
-# may dodge some standard optimizations.
+# may dodge some optimizations.

-# This is also an example of a probability-predicting classifier.
+# This is an example of a probability-predicting classifier.

# ## Helpers

"""
    brier_loss(probs, hot)

Return Brier (quadratic) loss.

- `probs`: predicted probability vector
-- `hot`: corresponding ground truth observation, as a one-hot encoded bit vector
+- `hot`: corresponding ground truth observation, as a one-hot encoded `BitVector`

"""
function brier_loss(probs, hot)
@@ -54,8 +54,8 @@ for the specified number of `epochs`.

- `perceptron`: component array with components `weights` and `bias`
- `optimiser`: optimiser from Optimisers.jl
-- `X`: feature matrix, of size (p, n)
-- `y_hot`: one-hot encoded target, of size (nclasses, n)
+- `X`: feature matrix, of size `(p, n)`
+- `y_hot`: one-hot encoded target, of size `(nclasses, n)`
- `epochs`: number of epochs
- `state`: optimiser state

@@ -83,7 +83,7 @@ end

# ## Implementation

-# ### Algorithm
+# ### Learner

# no docstring here - that goes with the constructor;
# SOME FIELDS LEFT ABSTRACT FOR SIMPLICITY
@@ -98,7 +98,7 @@ end

Instantiate a perceptron classifier.

-Train an instance, `algorithm`, by doing `model = fit(algorithm, X, y)`, where
+Train an instance, `learner`, by doing `model = fit(learner, X, y)`, where

- `X` is a `Float32` matrix, with observations-as-columns
- `y` (target) is some one-dimensional `CategoricalArray`. 
@@ -112,7 +112,7 @@ point predictions with `predict(model, Point(), Xnew)`.

Return an updated model, with the weights and bias of the previously learned perceptron
used as the starting state in new gradient descent updates. Adopt any specified
-hyperparameter `replacements` (properties of `LearnAPI.algorithm(model)`).
+hyperparameter `replacements` (properties of `LearnAPI.learner(model)`).

    update(model, newdata; epochs=n, replacements...)

If `Δepochs = n - perceptron.epochs` is non-negative, then return an updated model, with
the weights and bias of the previously learned perceptron used as the starting state in
new gradient descent updates for `Δepochs` epochs, and using the provided `newdata`
instead of the previous training data. Any other hyperparameter `replacements` are also
-adopted. If `Δepochs` is negative or not specified, instead return `fit(algorithm,
-newdata)`, where `algorithm=LearnAPI.clone(algorithm; epochs=n, replacements....)`.
+adopted. If `Δepochs` is negative or not specified, instead return `fit(learner,
+newdata)`, where `learner=LearnAPI.clone(learner; epochs=n, replacements...)`.

"""
PerceptronClassifier(; epochs=50, optimiser=Optimisers.Adam(), rng=Random.default_rng()) =
    PerceptronClassifier(epochs, optimiser, rng)


# ### Data interface

# For raw training data:
-LearnAPI.target(algorithm::PerceptronClassifier, data::Tuple) = last(data)
+LearnAPI.target(learner::PerceptronClassifier, data::Tuple) = last(data)

-# For wrapping pre-processed training data (output of `obs(algorithm, data)`):
+# For wrapping pre-processed training data (output of `obs(learner, data)`):
struct PerceptronClassifierObs
    X::Matrix{Float32}
    y_hot::BitMatrix # one-hot encoded target
end

# For pre-processing the training data:
-function LearnAPI.obs(algorithm::PerceptronClassifier, data::Tuple)
+function LearnAPI.obs(learner::PerceptronClassifier, data::Tuple)
    X, y = data
    classes = CategoricalDistributions.classes(y)
    y_hot = classes .== permutedims(y) # one-hot encoding
@@ -157,12 +157,12 @@ Base.getindex(observations, I) = PerceptronClassifierObs(
)

LearnAPI.target(
-    algorithm::PerceptronClassifier,
+    learner::PerceptronClassifier,
    observations::PerceptronClassifierObs,
) = observations.y

LearnAPI.features(
-    algorithm::PerceptronClassifier,
+    learner::PerceptronClassifier,
    observations::PerceptronClassifierObs,
) = observations.X

@@ -174,26 +174,26 @@ LearnAPI.features(

# For wrapping outcomes of learning:
struct PerceptronClassifierFitted
-    algorithm::PerceptronClassifier
+    learner::PerceptronClassifier
    perceptron # component array storing weights and bias
    state # optimiser state
    classes # target classes
    losses
end

-LearnAPI.algorithm(model::PerceptronClassifierFitted) = model.algorithm
+LearnAPI.learner(model::PerceptronClassifierFitted) = model.learner

-# `fit` for pre-processed data (output of `obs(algorithm, data)`):
+# `fit` for pre-processed data (output of `obs(learner, data)`):
function LearnAPI.fit(
-    algorithm::PerceptronClassifier,
+    learner::PerceptronClassifier,
    observations::PerceptronClassifierObs;
    verbosity=1,
    )

    # unpack hyperparameters:
-    epochs = algorithm.epochs
-    optimiser = algorithm.optimiser
-    rng = deepcopy(algorithm.rng) # to prevent mutation of `algorithm`!
+    epochs = learner.epochs
+    optimiser = learner.optimiser
+    rng = deepcopy(learner.rng) # to prevent mutation of `learner`!
    # unpack data:
    X = observations.X
@@ -211,12 +211,12 @@ function LearnAPI.fit(

    perceptron, state, losses = corefit(perceptron, X, y_hot, epochs, state, verbosity)

-    return PerceptronClassifierFitted(algorithm, perceptron, state, classes, losses)
+    return PerceptronClassifierFitted(learner, perceptron, state, classes, losses)
end

# `fit` for unprocessed data:
-LearnAPI.fit(algorithm::PerceptronClassifier, data; kwargs...) =
-    fit(algorithm, obs(algorithm, data); kwargs...)
+LearnAPI.fit(learner::PerceptronClassifier, data; kwargs...) =
+    fit(learner, obs(learner, data); kwargs...)

# see the `PerceptronClassifier` docstring for `update_observations` logic.
function LearnAPI.update_observations(
@@ -234,21 +234,21 @@ function LearnAPI.update_observations(

    classes == model.classes || error("New training target has incompatible classes.")

-    algorithm_old = LearnAPI.algorithm(model)
-    algorithm = LearnAPI.clone(algorithm_old; replacements...)
+    learner_old = LearnAPI.learner(model)
+    learner = LearnAPI.clone(learner_old; replacements...)

    perceptron = model.perceptron
    state = model.state
    losses = model.losses
-    epochs = algorithm.epochs
+    epochs = learner.epochs

    perceptron, state, losses_new = corefit(perceptron, X, y_hot, epochs, state, verbosity)
    losses = vcat(losses, losses_new)

-    return PerceptronClassifierFitted(algorithm, perceptron, state, classes, losses)
+    return PerceptronClassifierFitted(learner, perceptron, state, classes, losses)
end
LearnAPI.update_observations(model::PerceptronClassifierFitted, data; kwargs...) =
-    update_observations(model, obs(LearnAPI.algorithm(model), data); kwargs...)
+    update_observations(model, obs(LearnAPI.learner(model), data); kwargs...)

# see the `PerceptronClassifier` docstring for `update` logic.
function LearnAPI.update(
@@ -266,25 +266,25 @@ function LearnAPI.update(

    classes == model.classes || error("New training target has incompatible classes.")

-    algorithm_old = LearnAPI.algorithm(model)
-    algorithm = LearnAPI.clone(algorithm_old; replacements...)
-    :epochs in keys(replacements) || return fit(algorithm, observations)
+    learner_old = LearnAPI.learner(model)
+    learner = LearnAPI.clone(learner_old; replacements...)
+    :epochs in keys(replacements) || return fit(learner, observations)

    perceptron = model.perceptron
    state = model.state
    losses = model.losses

-    epochs = algorithm.epochs
-    Δepochs = epochs - algorithm_old.epochs
-    epochs < 0 && return fit(model, algorithm)
+    epochs = learner.epochs
+    Δepochs = epochs - learner_old.epochs
+    Δepochs < 0 && return fit(learner, observations)

    perceptron, state, losses_new = corefit(perceptron, X, y_hot, Δepochs, state, verbosity)
    losses = vcat(losses, losses_new)

-    return PerceptronClassifierFitted(algorithm, perceptron, state, classes, losses)
+    return PerceptronClassifierFitted(learner, perceptron, state, classes, losses)
end
LearnAPI.update(model::PerceptronClassifierFitted, data; kwargs...) =
-    update(model, obs(LearnAPI.algorithm(model), data); kwargs...)
+    update(model, obs(LearnAPI.learner(model), data); kwargs...)
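A sketch of how the `update` and `update_observations` methods above chain together, using the convenience methods defined at the end of this file; `X`, `y`, `Xnew` and `ynew` are hypothetical:

```julia
learner = PerceptronClassifier(epochs=40)
model = fit(learner, X, y)                      # 40 epochs from scratch
model = update(model, X, y; epochs=70)          # 30 further epochs, warm start
model = update_observations(model, Xnew, ynew)  # current `epochs` (70) more, on new data
```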
# ### Predict

@@ -315,7 +315,7 @@ LearnAPI.training_losses(model::PerceptronClassifierFitted) = model.losses
    tags = ("classification", "iterative algorithms", "incremental algorithms"),
    functions = (
        :(LearnAPI.fit),
-        :(LearnAPI.algorithm),
+        :(LearnAPI.learner),
        :(LearnAPI.strip),
        :(LearnAPI.obs),
        :(LearnAPI.features),
@@ -330,12 +330,12 @@ LearnAPI.training_losses(model::PerceptronClassifierFitted) = model.losses

# ### Convenience methods

-LearnAPI.fit(algorithm::PerceptronClassifier, X, y; kwargs...) =
-    fit(algorithm, (X, y); kwargs...)
-LearnAPI.update_observations(algorithm::PerceptronClassifier, X, y; kwargs...) =
-    update_observations(algorithm, (X, y); kwargs...)
-LearnAPI.update(algorithm::PerceptronClassifier, X, y; kwargs...) =
-    update(algorithm, (X, y); kwargs...)
+LearnAPI.fit(learner::PerceptronClassifier, X, y; kwargs...) =
+    fit(learner, (X, y); kwargs...)
+LearnAPI.update_observations(model::PerceptronClassifierFitted, X, y; kwargs...) =
+    update_observations(model, (X, y); kwargs...)
+LearnAPI.update(model::PerceptronClassifierFitted, X, y; kwargs...) =
+    update(model, (X, y); kwargs...)

# ## Tests

@@ -364,13 +364,13 @@ ytest = y[test];

@testset "PerceptronClassifier" begin
    rng = StableRNG(123)
-    algorithm = PerceptronClassifier(; optimiser=Optimisers.Adam(0.01), epochs=40, rng)
-    @test LearnAPI.clone(algorithm) == algorithm
-    @test :(LearnAPI.update) in LearnAPI.functions(algorithm)
-    @test LearnAPI.target(algorithm, (X, y)) == y
-    @test LearnAPI.features(algorithm, (X, y)) == X
+    learner = PerceptronClassifier(; optimiser=Optimisers.Adam(0.01), epochs=40, rng)
+    @test LearnAPI.clone(learner) == learner
+    @test :(LearnAPI.update) in LearnAPI.functions(learner)
+    @test LearnAPI.target(learner, (X, y)) == y
+    @test LearnAPI.features(learner, (X, y)) == X

-    model40 = fit(algorithm, Xtrain, ytrain; verbosity=0)
+    model40 = fit(learner, Xtrain, ytrain; verbosity=0)

    # 40 epochs is sufficient for 90% accuracy in this case:
    @test sum(predict(model40, Point(), Xtest) .== ytest)/length(ytest) > 0.9
@@ -385,7 +385,7 @@ ytest = y[test];
    @test !(ŷ70 ≈ ŷ40)

    # compare with cold restart:
-    model = fit(LearnAPI.clone(algorithm; epochs=70), Xtrain, y[train]; verbosity=0);
+    model = fit(LearnAPI.clone(learner; epochs=70), Xtrain, y[train]; verbosity=0);
    @test ŷ70 ≈ predict(model, Xtest)

    # add 30 epochs using `update_observations` instead:
diff --git a/test/patterns/incremental_algorithms.jl b/test/patterns/incremental_algorithms.jl
index ff1a0352..20b01779 100644
@@ -15,7 +15,7 @@ import Distributions
"""
    NormalEstimator()

-Instantiate an algorithm for finding the maximum likelihood normal distribution fitting
+Instantiate a learner for finding the maximum likelihood normal distribution fitting
some real univariate data `y`. Estimates can be updated with new data. 
```julia @@ -46,7 +46,7 @@ struct NormalEstimatorFitted{T} n::Int end -LearnAPI.algorithm(::NormalEstimatorFitted) = NormalEstimator() +LearnAPI.learner(::NormalEstimatorFitted) = NormalEstimator() function LearnAPI.fit(::NormalEstimator, y) n = length(y) @@ -94,7 +94,7 @@ LearnAPI.extras(model::NormalEstimatorFitted) = (μ=model.ȳ, σ=sqrt(model.ss/ human_name = "normal distribution estimator", functions = ( :(LearnAPI.fit), - :(LearnAPI.algorithm), + :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), @@ -111,8 +111,8 @@ LearnAPI.extras(model::NormalEstimatorFitted) = (μ=model.ȳ, σ=sqrt(model.ss/ rng = StableRNG(123) y = rand(rng, 50); ynew = rand(rng, 10); - algorithm = NormalEstimator() - model = fit(algorithm, y) + learner = NormalEstimator() + model = fit(learner, y) d = predict(model) μ, σ = Distributions.params(d) @test μ ≈ mean(y) @@ -122,14 +122,14 @@ LearnAPI.extras(model::NormalEstimatorFitted) = (μ=model.ȳ, σ=sqrt(model.ss/ @test LearnAPI.extras(model) == (; μ, σ) # one-liner: - @test predict(algorithm, y) == d - @test predict(algorithm, Point(), y) ≈ μ - @test predict(algorithm, ConfidenceInterval(), y)[1] ≈ quantile(d, 0.025) + @test predict(learner, y) == d + @test predict(learner, Point(), y) ≈ μ + @test predict(learner, ConfidenceInterval(), y)[1] ≈ quantile(d, 0.025) # updating: model = update_observations(model, ynew) μ2, σ2 = LearnAPI.extras(model) - μ3, σ3 = LearnAPI.extras(fit(algorithm, vcat(y, ynew))) # training ab initio + μ3, σ3 = LearnAPI.extras(fit(learner, vcat(y, ynew))) # training ab initio @test μ2 ≈ μ3 @test σ2 ≈ σ3 end diff --git a/test/patterns/regression.jl b/test/patterns/regression.jl index 35376519..f7d8d073 100644 --- a/test/patterns/regression.jl +++ b/test/patterns/regression.jl @@ -21,7 +21,7 @@ end """ Ridge(; lambda=0.1) -Instantiate a ridge regression algorithm, with regularization of `lambda`. +Instantiate a ridge regression learner, with regularization of `lambda`. """ Ridge(; lambda=0.1) = Ridge(lambda) # LearnAPI.constructor defined later @@ -33,12 +33,12 @@ struct RidgeFitObs{T,M<:AbstractMatrix{T}} end struct RidgeFitted{T,F} - algorithm::Ridge + learner::Ridge coefficients::Vector{T} feature_importances::F end -LearnAPI.algorithm(model::RidgeFitted) = model.algorithm +LearnAPI.learner(model::RidgeFitted) = model.learner Base.getindex(data::RidgeFitObs, I) = RidgeFitObs(data.A[:,I], data.names, data.y[I]) @@ -53,16 +53,16 @@ function LearnAPI.obs(::Ridge, data) end # for observations: -function LearnAPI.fit(algorithm::Ridge, observations::RidgeFitObs; verbosity=1) +function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) # unpack hyperparameters and data: - lambda = algorithm.lambda + lambda = learner.lambda A = observations.A names = observations.names y = observations.y - # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # 1 x p matrix + # apply core learner: + coefficients = (A*A' + learner.lambda*I)\(A*y) # 1 x p matrix # determine crude feature importances: feature_importances = @@ -73,13 +73,13 @@ function LearnAPI.fit(algorithm::Ridge, observations::RidgeFitObs; verbosity=1) verbosity > 0 && @info "Features in order of importance: $(first.(feature_importances))" - return RidgeFitted(algorithm, coefficients, feature_importances) + return RidgeFitted(learner, coefficients, feature_importances) end # for unprocessed `data = (X, y)`: -LearnAPI.fit(algorithm::Ridge, data; kwargs...) = - fit(algorithm, obs(algorithm, data); kwargs...) 
+LearnAPI.fit(learner::Ridge, data; kwargs...) = + fit(learner, obs(learner, data); kwargs...) # extracting stuff from training data: LearnAPI.target(::Ridge, data) = last(data) @@ -101,7 +101,7 @@ LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = LearnAPI.feature_importances(model::RidgeFitted) = model.feature_importances LearnAPI.strip(model::RidgeFitted) = - RidgeFitted(model.algorithm, model.coefficients, nothing) + RidgeFitted(model.learner, model.coefficients, nothing) @trait( Ridge, @@ -110,7 +110,7 @@ LearnAPI.strip(model::RidgeFitted) = tags = ("regression",), functions = ( :(LearnAPI.fit), - :(LearnAPI.algorithm), + :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), @@ -121,8 +121,8 @@ LearnAPI.strip(model::RidgeFitted) = ) # convenience method: -LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) = - fit(algorithm, (X, y); kwargs...) +LearnAPI.fit(learner::Ridge, X, y; kwargs...) = + fit(learner, (X, y); kwargs...) # ## Tests @@ -138,17 +138,17 @@ y = 2a - b + 3c + 0.05*rand(n) data = (X, y) @testset "test an implementation of ridge regression" begin - algorithm = Ridge(lambda=0.5) - @test :(LearnAPI.obs) in LearnAPI.functions(algorithm) + learner = Ridge(lambda=0.5) + @test :(LearnAPI.obs) in LearnAPI.functions(learner) - @test LearnAPI.target(algorithm, data) == y - @test LearnAPI.features(algorithm, data) == X + @test LearnAPI.target(learner, data) == y + @test LearnAPI.features(learner, data) == X # verbose fitting: @test_logs( (:info, r"Feature"), fit( - algorithm, + learner, Tables.subset(X, train), y[train]; verbosity=1, @@ -158,7 +158,7 @@ data = (X, y) # quiet fitting: model = @test_logs( fit( - algorithm, + learner, Tables.subset(X, train), y[train]; verbosity=0, @@ -169,12 +169,12 @@ data = (X, y) @test ŷ isa Vector{Float64} @test predict(model, Tables.subset(X, test)) == ŷ - fitobs = LearnAPI.obs(algorithm, data) + fitobs = LearnAPI.obs(learner, data) predictobs = LearnAPI.obs(model, X) - model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) - @test LearnAPI.target(algorithm, fitobs) == y + model = fit(learner, MLUtils.getobs(fitobs, train); verbosity=0) + @test LearnAPI.target(learner, fitobs) == y @test predict(model, Point(), MLUtils.getobs(predictobs, test)) ≈ ŷ - @test predict(model, LearnAPI.features(algorithm, fitobs)) ≈ predict(model, X) + @test predict(model, LearnAPI.features(learner, fitobs)) ≈ predict(model, X) @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}} @@ -184,7 +184,7 @@ data = (X, y) serialize(filename, small_model) recovered_model = deserialize(filename) - @test LearnAPI.algorithm(recovered_model) == algorithm + @test LearnAPI.learner(recovered_model) == learner @test predict( recovered_model, Point(), @@ -206,45 +206,45 @@ end """ BabyRidge(; lambda=0.1) -Instantiate a ridge regression algorithm, with regularization of `lambda`. +Instantiate a ridge regression learner, with regularization of `lambda`. 
""" BabyRidge(; lambda=0.1) = BabyRidge(lambda) # LearnAPI.constructor defined later struct BabyRidgeFitted{T,F} - algorithm::BabyRidge + learner::BabyRidge coefficients::Vector{T} feature_importances::F end -function LearnAPI.fit(algorithm::BabyRidge, data; verbosity=1) +function LearnAPI.fit(learner::BabyRidge, data; verbosity=1) X, y = data - lambda = algorithm.lambda + lambda = learner.lambda table = Tables.columntable(X) names = Tables.columnnames(table) |> collect A = Tables.matrix(table)' - # apply core algorithm: - coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector + # apply core learner: + coefficients = (A*A' + learner.lambda*I)\(A*y) # vector feature_importances = nothing - return BabyRidgeFitted(algorithm, coefficients, feature_importances) + return BabyRidgeFitted(learner, coefficients, feature_importances) end # extracting stuff from training data: LearnAPI.target(::BabyRidge, data) = last(data) -LearnAPI.algorithm(model::BabyRidgeFitted) = model.algorithm +LearnAPI.learner(model::BabyRidgeFitted) = model.learner LearnAPI.predict(model::BabyRidgeFitted, ::Point, Xnew) = Tables.matrix(Xnew)*model.coefficients LearnAPI.strip(model::BabyRidgeFitted) = - BabyRidgeFitted(model.algorithm, model.coefficients, nothing) + BabyRidgeFitted(model.learner, model.coefficients, nothing) @trait( BabyRidge, @@ -253,7 +253,7 @@ LearnAPI.strip(model::BabyRidgeFitted) = tags = ("regression",), functions = ( :(LearnAPI.fit), - :(LearnAPI.algorithm), + :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), @@ -264,27 +264,27 @@ LearnAPI.strip(model::BabyRidgeFitted) = ) # convenience method: -LearnAPI.fit(algorithm::BabyRidge, X, y; kwargs...) = - fit(algorithm, (X, y); kwargs...) +LearnAPI.fit(learner::BabyRidge, X, y; kwargs...) = + fit(learner, (X, y); kwargs...) 
# ## Tests @testset "test a variation which does not overload LearnAPI.obs" begin - algorithm = BabyRidge(lambda=0.5) + learner = BabyRidge(lambda=0.5) - model = fit(algorithm, Tables.subset(X, train), y[train]; verbosity=0) + model = fit(learner, Tables.subset(X, train), y[train]; verbosity=0) ŷ = predict(model, Point(), Tables.subset(X, test)) @test ŷ isa Vector{Float64} - fitobs = obs(algorithm, data) + fitobs = obs(learner, data) predictobs = LearnAPI.obs(model, X) - model = fit(algorithm, MLUtils.getobs(fitobs, train); verbosity=0) + model = fit(learner, MLUtils.getobs(fitobs, train); verbosity=0) @test predict(model, Point(), MLUtils.getobs(predictobs, test)) == ŷ == predict(model, MLUtils.getobs(predictobs, test)) - @test LearnAPI.target(algorithm, data) == y + @test LearnAPI.target(learner, data) == y @test LearnAPI.predict(model, X) ≈ - LearnAPI.predict(model, LearnAPI.features(algorithm, data)) + LearnAPI.predict(model, LearnAPI.features(learner, data)) end true diff --git a/test/patterns/static_algorithms.jl b/test/patterns/static_algorithms.jl index 5a4c277f..fef3cff1 100644 --- a/test/patterns/static_algorithms.jl +++ b/test/patterns/static_algorithms.jl @@ -16,23 +16,23 @@ end Selector(; names=Symbol[]) = Selector(names) # LearnAPI.constructor defined later # `fit` consumes no observational data, does no "learning", and just returns a thinly -# wrapped `algorithm` (to distinguish it from the algorithm in dispatch): -LearnAPI.fit(algorithm::Selector; verbosity=1) = Ref(algorithm) -LearnAPI.algorithm(model) = model[] +# wrapped `learner` (to distinguish it from the learner in dispatch): +LearnAPI.fit(learner::Selector; verbosity=1) = Ref(learner) +LearnAPI.learner(model) = model[] function LearnAPI.transform(model::Base.RefValue{Selector}, X) - algorithm = LearnAPI.algorithm(model) + learner = LearnAPI.learner(model) table = Tables.columntable(X) names = Tables.columnnames(table) - filtered_names = filter(in(algorithm.names), names) + filtered_names = filter(in(learner.names), names) filtered_columns = (Tables.getcolumn(table, name) for name in filtered_names) filtered_table = NamedTuple{filtered_names}((filtered_columns...,)) return Tables.materializer(X)(filtered_table) end # fit and transform in one go: -function LearnAPI.transform(algorithm::Selector, X) - model = fit(algorithm) +function LearnAPI.transform(learner::Selector, X) + model = fit(learner) transform(model, X) end @@ -44,7 +44,7 @@ end is_static = true, functions = ( :(LearnAPI.fit), - :(LearnAPI.algorithm), + :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.transform), @@ -52,14 +52,14 @@ end ) @testset "test a static transformer" begin - algorithm = Selector(names=[:x, :w]) + learner = Selector(names=[:x, :w]) X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w]) - model = fit(algorithm) # no data arguments! + model = fit(learner) # no data arguments! 
    # if provided, data is ignored:
-    @test LearnAPI.algorithm(model) == algorithm
+    @test LearnAPI.learner(model) == learner
    W = transform(model, X)
    @test W == DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w])
-    @test W == transform(algorithm, X)
+    @test W == transform(learner, X)
end


@@ -74,21 +74,21 @@ end
FancySelector(; names=Symbol[]) = FancySelector(names) # LearnAPI.constructor defined later

mutable struct FancySelectorFitted
-    algorithm::FancySelector
+    learner::FancySelector
    rejected::Vector{Symbol}
-    FancySelectorFitted(algorithm) = new(algorithm)
+    FancySelectorFitted(learner) = new(learner)
end
-LearnAPI.algorithm(model::FancySelectorFitted) = model.algorithm
+LearnAPI.learner(model::FancySelectorFitted) = model.learner
rejected(model::FancySelectorFitted) = model.rejected

-# Here we are wrapping `algorithm` with a place-holder for the `rejected` feature names.
-LearnAPI.fit(algorithm::FancySelector; verbosity=1) = FancySelectorFitted(algorithm)
+# Here we are wrapping `learner` with a place-holder for the `rejected` feature names.
+LearnAPI.fit(learner::FancySelector; verbosity=1) = FancySelectorFitted(learner)

# output the filtered table and add `rejected` field to model (mutated!)
function LearnAPI.transform(model::FancySelectorFitted, X)
    table = Tables.columntable(X)
    names = Tables.columnnames(table)
-    keep = LearnAPI.algorithm(model).names
+    keep = LearnAPI.learner(model).names
    filtered_names = filter(in(keep), names)
    model.rejected = setdiff(names, filtered_names)
    filtered_columns = (Tables.getcolumn(table, name) for name in filtered_names)
@@ -97,8 +97,8 @@ function LearnAPI.transform(model::FancySelectorFitted, X)
end

# fit and transform in one step:
-function LearnAPI.transform(algorithm::FancySelector, X)
-    model = fit(algorithm)
+function LearnAPI.transform(learner::FancySelector, X)
+    model = fit(learner)
    transform(model, X)
end

@@ -110,7 +110,7 @@ end
    tags = ("feature engineering",),
    functions = (
        :(LearnAPI.fit),
-        :(LearnAPI.algorithm),
+        :(LearnAPI.learner),
        :(LearnAPI.strip),
        :(LearnAPI.obs),
        :(LearnAPI.transform),
@@ -119,14 +119,14 @@ end
)

@testset "test a variation that reports byproducts" begin
-    algorithm = FancySelector(names=[:x, :w])
+    learner = FancySelector(names=[:x, :w])
    X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w])
-    model = fit(algorithm) # no data arguments!
+    model = fit(learner) # no data arguments!
    @test !isdefined(model, :rejected)
-    @test LearnAPI.algorithm(model) == algorithm
+    @test LearnAPI.learner(model) == learner
    filtered = DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w])
    @test transform(model, X) == filtered
-    @test transform(algorithm, X) == filtered
+    @test transform(learner, X) == filtered
    @test rejected(model) == [:y, :z]
end
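As the tests above confirm, this mutating design means the byproducts only exist after `transform` is called; schematically (`X` hypothetical):

```julia
model = fit(learner)  # `model.rejected` not yet defined
transform(model, X)   # mutates `model`, recording the rejected names
rejected(model)       # now returns the rejected feature names
```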
diff --git a/test/traits.jl b/test/traits.jl
index e6eaae45..32a0c5d4 100644
--- a/test/traits.jl
+++ b/test/traits.jl
@@ -1,18 +1,18 @@
using Test
using LearnAPI

-# A MINIMUM IMPLEMENTATION OF AN ALGORITHM
+# A MINIMUM IMPLEMENTATION OF A LEARNER

# does nothing useful
-struct SmallAlgorithm end
-LearnAPI.fit(algorithm::SmallAlgorithm, data; verbosity=1) = algorithm
-LearnAPI.algorithm(model::SmallAlgorithm) = model
+struct SmallLearner end
+LearnAPI.fit(learner::SmallLearner, data; verbosity=1) = learner
+LearnAPI.learner(model::SmallLearner) = model
@trait(
-    SmallAlgorithm,
-    constructor = SmallAlgorithm,
+    SmallLearner,
+    constructor = SmallLearner,
    functions = (
        :(LearnAPI.fit),
-        :(LearnAPI.algorithm),
+        :(LearnAPI.learner),
        :(LearnAPI.strip),
        :(LearnAPI.obs),
        :(LearnAPI.features),
@@ -28,9 +28,9 @@ LearnAPI.algorithm(model::SmallAlgorithm) = model

# OVERLOADABLE TRAITS

-small = SmallAlgorithm()
-@test LearnAPI.constructor(small) == SmallAlgorithm
-@test :(LearnAPI.algorithm) in LearnAPI.functions(small)
+small = SmallLearner()
+@test LearnAPI.constructor(small) == SmallLearner
+@test :(LearnAPI.learner) in LearnAPI.functions(small)
@test isempty(LearnAPI.kinds_of_proxy(small))
@test isempty(LearnAPI.tags(small))
@test !LearnAPI.is_pure_julia(small)
@@ -39,7 +39,7 @@ small = SmallAlgorithm()
@test LearnAPI.doc_url(small) == "unknown"
@test LearnAPI.load_path(small) == "unknown"
@test !LearnAPI.is_composite(small)
-@test LearnAPI.human_name(small) == "small algorithm"
+@test LearnAPI.human_name(small) == "small learner"
@test isnothing(LearnAPI.iteration_parameter(small))
@test LearnAPI.data_interface(small) == LearnAPI.RandomAccess()
@test !(6 isa LearnAPI.fit_observation_scitype(small))
@@ -48,7 +48,7 @@ small = SmallAlgorithm()

# DERIVED TRAITS

-@test LearnAPI.is_algorithm(small)
+@test LearnAPI.is_learner(small)
@test !LearnAPI.target(small)
@test !LearnAPI.weights(small)

From 28ac8595e43d15505c824316bdc24104e4edfc5a Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom" 
Date: Fri, 18 Oct 2024 16:25:45 +1300
Subject: [PATCH 127/187] doc tweak

---
 docs/src/common_implementation_patterns.md |  2 +-
 docs/src/index.md                          | 12 ++++++------
 docs/src/reference.md                      |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md
index 7959dce6..9b128c6a 100644
--- a/docs/src/common_implementation_patterns.md
+++ b/docs/src/common_implementation_patterns.md
@@ -1,4 +1,4 @@
-# Common Implementation Patterns
+# [Common Implementation Patterns](@id patterns)

!!! important

diff --git a/docs/src/index.md b/docs/src/index.md
index 10d38430..9ad3e22c 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -11,11 +11,11 @@ A base Julia interface for machine learning and statistics

LearnAPI.jl is a lightweight, functional-style interface, providing a collection of
[methods](@ref Methods), such as `fit` and `predict`, to be implemented by algorithms from
-machine learning and statistics. 
Its careful design ensures algorithms implementing
-LearnAPI.jl can buy into functionality, such as external performance estimates,
-hyperparameter optimization and model composition, provided by ML/statistics toolboxes and
-other packages. LearnAPI.jl includes a number of Julia [traits](@ref traits) for promising
-specific behavior.
+machine learning and statistics, some examples of which are listed [here](@ref
+patterns). A careful design ensures algorithms implementing LearnAPI.jl can buy into
+functionality, such as external performance estimates, hyperparameter optimization and
+model composition, provided by ML/statistics toolboxes and other packages. LearnAPI.jl
+includes a number of Julia [traits](@ref traits) for promising specific behavior.

 LearnAPI.jl's only dependency is the standard library `InteractiveUtils`.

@@ -99,7 +99,7 @@ loaders reading images from disk).

- [Reference](@ref reference): official specification

-- [Common Implementation Patterns](@ref): implementation suggestions for common,
+- [Common Implementation Patterns](@ref patterns): implementation suggestions for common,
  informally defined, algorithm types

- [Testing an Implementation](@ref)
diff --git a/docs/src/reference.md b/docs/src/reference.md
index 749e9708..c6e9aaf3 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -2,7 +2,7 @@

Here we give the definitive specification of the LearnAPI.jl interface. For informal
guides see [Anatomy of an Implementation](@ref) and [Common Implementation
-Patterns](@ref).
+Patterns](@ref patterns).

## [Important terms and concepts](@id scope)

From 059a6b46f64e308bac5750c44fff12e7b928edeb Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom" 
Date: Fri, 18 Oct 2024 18:29:55 +1300
Subject: [PATCH 128/187] get rid of InteractiveUtils dep

---
 Project.toml             |   3 -
 docs/src/index.md        |   2 +-
 src/LearnAPI.jl          |   4 +-
 src/predict_transform.jl |   8 ++-
 src/traits.jl            |  12 ++--
 src/types.jl             | 124 +++++++++++++++++++++++----------------
 test/traits.jl           |   2 +-
 7 files changed, 89 insertions(+), 66 deletions(-)

diff --git a/Project.toml b/Project.toml
index 2d23d7e2..7f9d9267 100644
--- a/Project.toml
+++ b/Project.toml
@@ -3,9 +3,6 @@ uuid = "92ad9a40-7767-427a-9ee6-6e577f1266cb"
 authors = ["Anthony D. Blaom "]
 version = "0.1.0"

-[deps]
-InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
-
 [compat]
 julia = "1.6"

diff --git a/docs/src/index.md b/docs/src/index.md
index 9ad3e22c..727199ff 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -17,7 +17,7 @@ functionality, such as external performance estimates, hyperparameter optimizati
 model composition, provided by ML/statistics toolboxes and other packages. LearnAPI.jl
 includes a number of Julia [traits](@ref traits) for promising specific behavior.

-LearnAPI.jl's only dependency is the standard library `InteractiveUtils`.
+LearnAPI.jl has no package dependencies. 
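To give a flavor of the trait-driven design (a sketch only; `Ridge` here is a hypothetical learner):

```julia
learner = Ridge(lambda=0.1)
:(LearnAPI.predict) in LearnAPI.functions(learner)  # can we call `predict`?
Point() in LearnAPI.kinds_of_proxy(learner)         # are point predictions supported?
```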
```@raw html
🚧
```

!!! warning

diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl
index 74fdd84b..c1564e06 100644
--- a/src/LearnAPI.jl
+++ b/src/LearnAPI.jl
@@ -1,7 +1,5 @@
module LearnAPI

-import InteractiveUtils.subtypes
-
include("tools.jl")
include("types.jl")
include("predict_transform.jl")
@@ -16,7 +14,5 @@ export @trait
export fit, update, update_observations, update_features
export predict, transform, inverse_transform, obs

-for name in Symbol.(CONCRETE_TARGET_PROXY_TYPES_SYMBOLS)
+for name in CONCRETE_TARGET_PROXY_SYMBOLS
    @eval export $name
end

diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index 5c888e48..8bb0a254 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -45,7 +45,7 @@ DOC_DATA_INTERFACE(method) =
    case then an implementation must either: (i) overload [`obs`](@ref) to articulate how
    provided data can be transformed into a form that does support
    [`LearnAPI.RandomAccess`](@ref); or (ii) overload the trait
-    [`LearnAPI.data_interface`](@ref) to specify a more relaxed data API. Refer to
+    [`LearnAPI.data_interface`](@ref) to specify a more relaxed data API. Refer to
    document strings for details.
    """

@@ -91,8 +91,10 @@ If there is no notion of a "target" variable in the LearnAPI.jl sense, or you ne
operation with an inverse, implement [`transform`](@ref) instead.

Implementation is optional. Only the first signature (with or without the `data` argument)
-is implemented, but each `kind_of_proxy` that gets an implementation must be added to the
-list returned by [`LearnAPI.kinds_of_proxy`](@ref).
+is implemented, but each `kind_of_proxy::`[`KindOfProxy`](@ref) that gets an
+implementation must be added to the list returned by
+[`LearnAPI.kinds_of_proxy(learner)`](@ref). List all available kinds of proxy by doing
+`LearnAPI.kinds_of_proxy()`.

If `data` is not present in the implemented signature (e.g., for density estimators) then
[`LearnAPI.features(learner, data)`](@ref) must return `nothing`.

diff --git a/src/traits.jl b/src/traits.jl
@@ -135,9 +135,8 @@ See also [`LearnAPI.predict`](@ref), [`LearnAPI.KindOfProxy`](@ref).

Must be overloaded whenever `predict` is implemented.

-Elements of the returned tuple must be instances of types in the return value of
-`LearnAPI.kinds_of_proxy()`, i.e., one of the following, described further in LearnAPI.jl
-documentation: $CONCRETE_TARGET_PROXY_TYPES_LIST.
+Elements of the returned tuple must be instances of [`LearnAPI.KindOfProxy`](@ref). List
+all possibilities by running `LearnAPI.kinds_of_proxy()`.

Suppose, for example, we have the following implementation of a supervised learner
returning only probabilistic predictions:

@@ -158,7 +157,12 @@
"""
kinds_of_proxy(::Any) = ()
-kinds_of_proxy() = CONCRETE_TARGET_PROXY_TYPES
+kinds_of_proxy() = map(CONCRETE_TARGET_PROXY_SYMBOLS) do ex
+    quote
+        $ex()
+    end |> eval
+end


tags() = [
diff --git a/src/types.jl b/src/types.jl
index be40922f..8a53672d 100644
--- a/src/types.jl
+++ b/src/types.jl
@@ -53,27 +53,35 @@ expectiles at 50% will provide `Point` instead. 
""" abstract type IID <: KindOfProxy end -struct Point <: IID end -struct Sampleable <: IID end -struct Distribution <: IID end -struct LogDistribution <: IID end -struct Probability <: IID end -struct LogProbability <: IID end -struct Parametric <: IID end -struct LabelAmbiguous <: IID end -struct LabelAmbiguousSampleable <: IID end -struct LabelAmbiguousDistribution <: IID end -struct LabelAmbiguousFuzzy <: IID end -struct ConfidenceInterval <: IID end -struct Fuzzy <: IID end -struct ProbabilisticFuzzy <: IID end -struct SurvivalFunction <: IID end -struct SurvivalDistribution <: IID end -struct HazardFunction <: IID end -struct OutlierScore <: IID end -struct Continuous <: IID end -struct Quantile <: IID end -struct Expectile <: IID end +const IID_SYMBOLS = [ + :Point, + :Sampleable, + :Distribution, + :LogDistribution, + :Probability, + :LogProbability, + :Parametric, + :LabelAmbiguous, + :LabelAmbiguousSampleable, + :LabelAmbiguousDistribution, + :LabelAmbiguousFuzzy, + :ConfidenceInterval, + :Fuzzy, + :ProbabilisticFuzzy, + :SurvivalFunction, + :SurvivalDistribution, + :HazardFunction, + :OutlierScore, + :Continuous, + :Quantile, + :Expectile, +] + +for S in IID_SYMBOLS + quote + struct $S <: IID end + end |> eval +end """ @@ -92,9 +100,18 @@ space ``Y^n``, where ``Y`` is the space from which the target variable takes its """ abstract type Joint <: KindOfProxy end -struct JointSampleable <: Joint end -struct JointDistribution <: Joint end -struct JointLogDistribution <: Joint end + +const JOINT_SYMBOLS = [ + :JointSampleable, + :JointDistribution, + :JointLogDistribution, +] + +for S in JOINT_SYMBOLS + quote + struct $S <: Joint end + end |> eval +end """ Single <: KindOfProxy @@ -114,32 +131,24 @@ single object representing a probability distribution. """ abstract type Single <: KindOfProxy end -struct SingleSampeable <: Single end -struct SingleDistribution <: Single end -struct SingleLogDistribution <: Single end - -const CONCRETE_TARGET_PROXY_TYPES = [ - subtypes(IID)..., - subtypes(Single)..., - subtypes(Joint)..., + +const SINGLE_SYMBOLS = [ + :SingleSampeable, + :SingleDistribution, + :SingleLogDistribution, ] -const CONCRETE_TARGET_PROXY_TYPES_SYMBOLS = map(CONCRETE_TARGET_PROXY_TYPES) do T - Symbol(last(split(string(T), '.'))) +for S in SINGLE_SYMBOLS + quote + struct $S <: Single end + end |> eval end -const CONCRETE_TARGET_PROXY_TYPES_LIST = join( - map(CONCRETE_TARGET_PROXY_TYPES_SYMBOLS) do s - "`$s()`" - end, - ", ", - " and ", -) - -const DOC_HOW_TO_LIST_PROXIES = - "The instances of [`LearnAPI.KindOfProxy`](@ref) are: "* - "$(LearnAPI.CONCRETE_TARGET_PROXY_TYPES_LIST). " - +const CONCRETE_TARGET_PROXY_SYMBOLS = [ + IID_SYMBOLS..., + SINGLE_SYMBOLS..., + JOINT_SYMBOLS..., +] """ @@ -151,12 +160,25 @@ the form of target predictions in [`predict`](@ref) calls. See LearnAPI.jl documentation for an explanation of "targets" and "target proxies". -For example, `Distribution` is a concrete subtype of `LearnAPI.KindOfProxy` and a call -like `predict(model, Distribution(), Xnew)` returns a data object whose observations are -probability density/mass functions, assuming `learner` supports predictions of that -form. 
+For example, `Distribution` is a concrete subtype of `IID <: LearnAPI.KindOfProxy` and a +call like `predict(model, Distribution(), Xnew)` returns a data object whose observations +are probability density/mass functions, assuming `learner = LearnAPI.learner(model)` +supports predictions of that form, which is true if `Distribution() in` +[`LearnAPI.kinds_of_proxy(learner)`](@ref). + +Proxy types are grouped under three abstract subtypes: + +- [`LearnAPI.IID`](@ref): The main type, for proxies consisting of uncorrelated individual + components, one for each input observation + +- [`LearnAPI.Joint`](@ref): For learners that predict a single probabilistic structure + encapsulating correlations between target predictions for different input observations + +- [`LearnAPI.Single`](@ref): For learners, such as density estimators, that are trained on + a target variable only (no features); `predict` consumes no data and the returned target + proxy is a single probabilistic structure. -$DOC_HOW_TO_LIST_PROXIES +For lists of all concrete instances, refer to documentation for the relevant subtype. """ KindOfProxy diff --git a/test/traits.jl b/test/traits.jl index 32a0c5d4..6f0b06e8 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -23,7 +23,7 @@ LearnAPI.learner(model::SmallLearner) = model # ZERO ARGUMENT METHODS @test :(LearnAPI.fit) in LearnAPI.functions() -@test Point in LearnAPI.kinds_of_proxy() +@test Point() in LearnAPI.kinds_of_proxy() @test "regression" in LearnAPI.tags() # OVERLOADABLE TRAITS From 7ae83b079b619e86a978d5f3c959765d452a6af1 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 18 Oct 2024 18:36:00 +1300 Subject: [PATCH 129/187] add test --- test/traits.jl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/test/traits.jl b/test/traits.jl index 6f0b06e8..b75ed658 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -48,9 +48,11 @@ small = SmallLearner() # DERIVED TRAITS +@trait SmallLearner kinds_of_proxy=(Point(),) @test LearnAPI.is_learner(small) @test !LearnAPI.target(small) @test !LearnAPI.weights(small) +@test LearnAPI.preferred_kind_of_proxy(small) == Point() module FruitSalad import LearnAPI From a156d4645b070b64c56d632023ca1faf130d1f09 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 23 Oct 2024 17:50:08 +1300 Subject: [PATCH 130/187] update readme re algorithm -> learner --- README.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 39238486..d42e6b7e 100644 --- a/README.md +++ b/README.md @@ -13,24 +13,24 @@ New contributions welcome. See the [road map](ROADMAP.md). ## Code snippet -Configure a learning algorithm: +Configure a machine learning algorithm: ```julia -julia> algorithm = Ridge(lambda=0.1) +julia> learner = Ridge(lambda=0.1) ``` Inspect available functionality: ``` -julia> LearnAPI.functions(algorithm) -(:(LearnAPI.fit), :(LearnAPI.algorithm), :(LearnAPI.strip), :(LearnAPI.obs), +julia> LearnAPI.functions(learner) +(:(LearnAPI.fit), :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), :(LearnAPI.predict), :(LearnAPI.coefficients)) ``` Train: ```julia -julia> model = fit(algorithm, data) +julia> model = fit(learner, data) ``` Predict: From f433d50893dbacb8f1aa95270cacbaad0e9755e9 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Wed, 23 Oct 2024 17:51:23 +1300 Subject: [PATCH 131/187] upate again --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index d42e6b7e..6d1287e7 100644 --- a/README.md +++ b/README.md @@ -16,13 +16,13 @@ New contributions welcome. See the [road map](ROADMAP.md). Configure a machine learning algorithm: ```julia -julia> learner = Ridge(lambda=0.1) +julia> ridge = Ridge(lambda=0.1) ``` Inspect available functionality: ``` -julia> LearnAPI.functions(learner) +julia> LearnAPI.functions(ridge) (:(LearnAPI.fit), :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), :(LearnAPI.target), :(LearnAPI.predict), :(LearnAPI.coefficients)) ``` @@ -30,7 +30,7 @@ julia> LearnAPI.functions(learner) Train: ```julia -julia> model = fit(learner, data) +julia> model = fit(ridge, data) ``` Predict: From 63f2f0de8c7732bb3fe9b986a5680b534cfb50a1 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 23 Oct 2024 17:54:00 +1300 Subject: [PATCH 132/187] fix doc string --- src/accessor_functions.jl | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index bbc713fc..1fe06d75 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -16,9 +16,10 @@ const DOC_STATIC = """ LearnAPI.learner(model) - LearnAPI.learner(LearnAPI.stripd_model) + LearnAPI.learner(stripped_model) -Recover the learner used to train `model` or the output of [`LearnAPI.strip(model)`](@ref). +Recover the learner used to train `model` or the output, `stripped_model`, of +[`LearnAPI.strip(model)`](@ref). In other words, if `model = fit(learner, data...)`, for some `learner` and `data`, then From 5e60305a726053ab54bdff90af689ee77f6c7478 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 23 Oct 2024 18:12:02 +1300 Subject: [PATCH 133/187] doc tweak --- src/target_weights_features.jl | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl index aee3481a..8e58c07e 100644 --- a/src/target_weights_features.jl +++ b/src/target_weights_features.jl @@ -5,9 +5,17 @@ Return, for each form of `data` supported in a call of the form [`fit(learner, data)`](@ref), the target variable part of `data`. If `nothing` is returned, the `learner` does not see a target variable in training (is unsupervised). -Refer to LearnAPI.jl documentation for the precise meaning of "target". +# Extended help -# New implementations +## What is a target variable? + +Examples of target variables are house prices in realestate pricing estimates, the +"spam"/"not spam" labels in an email spam filtering task, "outlier"/"inlier" labels in +outlier detection, cluster labels in clustering problems, and censored survival times in +survival analysis. For more on targets and target proxies, see the "Reference" section of +the LearnAPI.jl documentation. + +## New implementations A fallback returns `nothing`. Must be implemented if `fit` consumes data including a target variable. From 4b9bc5b8cebe34880bd8aa5eb47d34a85af0e202 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Fri, 25 Oct 2024 16:09:33 +1300 Subject: [PATCH 134/187] give LearnAPI.functions() a default --- src/traits.jl | 1 + test/patterns/static_algorithms.jl | 18 ++++++++++++++++-- 2 files changed, 17 insertions(+), 2 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 6886a2ec..fe20ce53 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -115,6 +115,7 @@ functions() = ( :(LearnAPI.transform), :(LearnAPI.inverse_transform), ) +functions(::Any) = () """ LearnAPI.kinds_of_proxy(learner) diff --git a/test/patterns/static_algorithms.jl b/test/patterns/static_algorithms.jl index fef3cff1..243cab44 100644 --- a/test/patterns/static_algorithms.jl +++ b/test/patterns/static_algorithms.jl @@ -1,5 +1,4 @@ using LearnAPI -using LinearAlgebra using Tables import MLUtils import DataFrames @@ -71,7 +70,22 @@ end struct FancySelector names::Vector{Symbol} end -FancySelector(; names=Symbol[]) = FancySelector(names) # LearnAPI.constructor defined later + +""" + FancySelector(; names=Symbol[]) + +Instantiate a feature selector that exposes the names of rejected features. + +```julia +learner = FancySelector(names=[:x, :w]) +X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w]) +model = fit(learner) # no data arguments! +transform(model, X) # mutates `model` +@assert rejected(model) == [:y, :z] +``` + +""" +FancySelector(; names=Symbol[]) = FancySelector(names) mutable struct FancySelectorFitted learner::FancySelector From 099d86e83e9c59fd4cb2038bd95766722ea6c0f0 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 25 Oct 2024 16:21:02 +1300 Subject: [PATCH 135/187] add test --- test/traits.jl | 1 + 1 file changed, 1 insertion(+) diff --git a/test/traits.jl b/test/traits.jl index b75ed658..5fbfdca4 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -50,6 +50,7 @@ small = SmallLearner() @trait SmallLearner kinds_of_proxy=(Point(),) @test LearnAPI.is_learner(small) +@test !LearnAPI.is_learner("junk") @test !LearnAPI.target(small) @test !LearnAPI.weights(small) @test LearnAPI.preferred_kind_of_proxy(small) == Point() From bec86e410bc4303b58890223bb5e96edb2622f90 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 25 Oct 2024 20:00:16 +1300 Subject: [PATCH 136/187] typo --- docs/src/fit_update.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index 29f7af01..2e2c0858 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -45,7 +45,7 @@ See also [Classification](@ref) and [Regression](@ref). ### Transformers -A dimension-reducing transformer, `learner` might be used in this way: +A dimension-reducing transformer, `learner`, might be used in this way: ```julia model = fit(learner, X) From 6f436ef0e31caba03a380f11a3adf483985ecc6d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 25 Oct 2024 20:38:46 +1300 Subject: [PATCH 137/187] tweak a docstring --- src/accessor_functions.jl | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index 1fe06d75..c284b015 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -32,7 +32,8 @@ is `true`. # New implementations Implementation is compulsory for new learner types. The behaviour described above is the -only contract. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.learner)")) +only contract. You must include `:(LearnAPI.learner)` in the return value of +[`LearnAPI.functions(learner)`](@ref). 
""" function learner end From 7f795387e2f2307a047a1d7b5f1afb95b410328b Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 16:04:41 +1300 Subject: [PATCH 138/187] remove testing of patterns --- test/runtests.jl | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/test/runtests.jl b/test/runtests.jl index 47411cb2..631b0111 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -6,10 +6,10 @@ test_files = [ "clone.jl", "accessor_functions.jl", "target_features.jl", - "patterns/regression.jl", - "patterns/static_algorithms.jl", - "patterns/ensembling.jl", - "patterns/incremental_algorithms.jl", + # "patterns/regression.jl", + # "patterns/static_algorithms.jl", + # "patterns/ensembling.jl", + # "patterns/incremental_algorithms.jl", ] files = isempty(ARGS) ? test_files : ARGS From 8da527bec0542f2bfd5db5625478e0c1bc82fdd5 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 17:10:15 +1300 Subject: [PATCH 139/187] add tests --- test/obs.jl | 8 ++++++++ test/predict_transform.jl | 27 +++++++++++++++++++++++++++ test/runtests.jl | 2 ++ test/traits.jl | 2 ++ 4 files changed, 39 insertions(+) create mode 100644 test/obs.jl create mode 100644 test/predict_transform.jl diff --git a/test/obs.jl b/test/obs.jl new file mode 100644 index 00000000..b9ab8119 --- /dev/null +++ b/test/obs.jl @@ -0,0 +1,8 @@ +using Test +using LearnAPI + +@testset "`obs` fallback" begin + @test obs("some learner", 42) == 42 +end + +true diff --git a/test/predict_transform.jl b/test/predict_transform.jl new file mode 100644 index 00000000..3f9648c9 --- /dev/null +++ b/test/predict_transform.jl @@ -0,0 +1,27 @@ +using Test +using LearnAPI + +struct Cherry end + +LearnAPI.fit(learner::Cherry, data; verbosity=1) = Ref(learner) +LearnAPI.learner(model::Base.RefValue{Cherry}) = model[] +LearnAPI.predict(model::Base.RefValue{Cherry}, ::Point, x) = 2x +@trait Cherry kinds_of_proxy=(Point(),) + +struct Ripe end + +LearnAPI.fit(learner::Ripe, data; verbosity=1) = Ref(learner) +LearnAPI.learner(model::Base.RefValue{Ripe}) = model[] +LearnAPI.predict(model::Base.RefValue{Ripe}, ::Distribution) = "a distribution" +LearnAPI.features(::Ripe, data) = nothing +@trait Ripe kinds_of_proxy=(Distribution(),) + +@testset "`predict` with no kind of proxy specified" begin + model = fit(Cherry(), "junk") + @test predict(model, 42) == 84 + + model = fit(Ripe(), "junk") + @test predict(model) == "a distribution" +end + +true diff --git a/test/runtests.jl b/test/runtests.jl index 631b0111..a9bd9760 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -4,6 +4,8 @@ test_files = [ "tools.jl", "traits.jl", "clone.jl", + "predict_transform.jl", + "obs.jl", "accessor_functions.jl", "target_features.jl", # "patterns/regression.jl", diff --git a/test/traits.jl b/test/traits.jl index 5fbfdca4..f76a361b 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -71,3 +71,5 @@ import .FruitSalad @testset "name" begin @test LearnAPI.name(FruitSalad.RedApple(1)) == "RedApple" end + +true From bc7aff101f18fdd64c5ce43d8845d054cc33db04 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sun, 27 Oct 2024 17:13:39 +1300 Subject: [PATCH 140/187] drop julia 1.6 support --- .github/workflows/ci.yml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 70e66dce..d71082e6 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -17,8 +17,7 @@ jobs: fail-fast: false matrix: version: - - '1.6' # previous LTS release - - '1.10' # new LTS release + - '1.10' # LTS release - '1' # automatically expands to the latest stable 1.x release of Julia. os: - ubuntu-latest From ddcc2dc1c81ab1fac1b293f1c48df9f36f5f97cc Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 17:20:19 +1300 Subject: [PATCH 141/187] dump test dependencies no longer needed --- Project.toml | 22 +--------------------- test/runtests.jl | 4 ---- 2 files changed, 1 insertion(+), 25 deletions(-) diff --git a/Project.toml b/Project.toml index 7f9d9267..c488c715 100644 --- a/Project.toml +++ b/Project.toml @@ -7,27 +7,7 @@ version = "0.1.0" julia = "1.6" [extras] -DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" -Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f" -LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" -MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" -Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" -Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b" -StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3" -Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" -Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [targets] -test = [ - "DataFrames", - "Distributions", - "LinearAlgebra", - "MLUtils", - "Random", - "Serialization", - "StableRNGs", - "Statistics", - "Tables", - "Test", -] +test = ["Test",] diff --git a/test/runtests.jl b/test/runtests.jl index a9bd9760..8a255c83 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -8,10 +8,6 @@ test_files = [ "obs.jl", "accessor_functions.jl", "target_features.jl", - # "patterns/regression.jl", - # "patterns/static_algorithms.jl", - # "patterns/ensembling.jl", - # "patterns/incremental_algorithms.jl", ] files = isempty(ARGS) ? test_files : ARGS From 04393f031f47e92e42cfcbd66b171cba90809c7c Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 17:21:36 +1300 Subject: [PATCH 142/187] bump compat julia=1.10 --- Project.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Project.toml b/Project.toml index c488c715..2172f4e7 100644 --- a/Project.toml +++ b/Project.toml @@ -4,7 +4,7 @@ authors = ["Anthony D. Blaom "] version = "0.1.0" [compat] -julia = "1.6" +julia = "1.10" [extras] Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" From 283619998b4500713183bc18818e9f535396a5ef Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 17:54:38 +1300 Subject: [PATCH 143/187] bump compat for docs --- docs/Project.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Project.toml b/docs/Project.toml index 47eb52e6..12dcdd5f 100644 --- a/docs/Project.toml +++ b/docs/Project.toml @@ -7,4 +7,4 @@ Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" [compat] Documenter = "1" -julia = "1.6" +julia = "1.10" From 318d84f141a164529221ca65684566e4ee564a7c Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sun, 27 Oct 2024 20:12:54 +1300 Subject: [PATCH 144/187] fix a tag --- src/traits.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/traits.jl b/src/traits.jl index fe20ce53..4046b178 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -171,7 +171,7 @@ tags() = [ "classification", "clustering", "gradient descent", - "iterative learners", + "iterative algorithms", "incremental algorithms", "feature engineering", "dimension reduction", From 1b8be296d51920aaa1df074ff0b794b394926c3d Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 20:13:32 +1300 Subject: [PATCH 145/187] whitespace --- src/traits.jl | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 4046b178..e55dfffe 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -164,8 +164,6 @@ kinds_of_proxy() = map(CONCRETE_TARGET_PROXY_SYMBOLS) do ex end |> eval end - - tags() = [ "regression", "classification", From 2ed8af3e5386fe804d495318940af2929719bd28 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 28 Oct 2024 16:26:05 +1300 Subject: [PATCH 146/187] docstring tweak --- src/obs.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/obs.jl b/src/obs.jl index d107fa77..11f4d647 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -55,7 +55,7 @@ fit(learner, observations)` is equivalent to `model = fit(learner, data)`, whene `observations = obs(learner, data)`. For each supported form of `data` in calls `predict(model, ..., data)` and `transform(model, data)`, where implemented, the calls `predict(model, ..., observations)` and `transform(model, observations)` are supported -alternatives, whenever `observations = obs(model, data)`. +alternatives with the same return value, whenever `observations = obs(model, data)`. The fallback for `obs` is `obs(model_or_learner, data) = data`, and the fallback for `LearnAPI.data_interface(learner)` is `LearnAPI.RandomAccess()`. For details refer to From 2b11e6b76c4a28b08d3bfa73466586a3119f876a Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 30 Oct 2024 18:46:36 +1300 Subject: [PATCH 147/187] clarify need for `obs` to be involutive --- docs/src/anatomy_of_an_implementation.md | 23 ++++++++++++++------ src/obs.jl | 27 ++++++++++++++++++------ 2 files changed, 37 insertions(+), 13 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 4c36ae2d..7af5c6d6 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -420,10 +420,21 @@ LearnAPI.fit(learner::Ridge, data; kwargs...) = ### The `obs` contract -Providing `fit` signatures matching the output of `obs`, is the first part of the `obs` -contract. The second part is this: *The output of `obs` must implement the interface -specified by the trait* [`LearnAPI.data_interface(learner)`](@ref). Assuming this is -[`LearnAPI.RandomAccess()`](@ref) (the default) it usually suffices to overload +Providing `fit` signatures matching the output of [`obs`](@ref), is the first part of the +`obs` contract. Since `obs(learner, data)` should evidentally support all `data` that +`fit(learner, data)` supports, we must be able to apply `obs(learner, _)` to it's own +output (`observations` below). 
This leads to the additional "no-op" declaration
+
+```@example anatomy2
+LearnAPI.obs(::Ridge, observations::RidgeFitObs) = observations
+```
+
+In other words, we ensure that `obs(learner, _)` is
+[involutive](https://en.wikipedia.org/wiki/Involution_(mathematics)).
+
+The second part of the `obs` contract is this: *The output of `obs` must implement the
+interface specified by the trait* [`LearnAPI.data_interface(learner)`](@ref). Assuming
+this is [`LearnAPI.RandomAccess()`](@ref) (the default) it usually suffices to overload
 `Base.getindex` and `Base.length`:
 
 ```@example anatomy2
 Base.getindex(data::RidgeFitObs, I) =
     RidgeFitObs(data.A[:,I], data.names, data.y[I])
 Base.length(data::RidgeFitObs) = length(data.y)
 ```
 
-We can do something similar for `predict`, but there's no need for a new type in this
-case:
+We do something similar for `predict`, but there's no need for a new type in this case:
 
 ```@example anatomy2
 LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)'
+LearnAPI.obs(::RidgeFitted, observations::AbstractArray) = observations # involutivity
 
 LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) =
     observations'*model.coefficients
 
diff --git a/src/obs.jl b/src/obs.jl
index d107fa77..bfcd87bb 100644
--- a/src/obs.jl
+++ b/src/obs.jl
@@ -54,8 +54,19 @@ For each supported form of `data` in `fit(learner, data)`, it must be true that
 fit(learner, observations)` is equivalent to `model = fit(learner, data)`, whenever
 `observations = obs(learner, data)`. For each supported form of `data` in calls
 `predict(model, ..., data)` and `transform(model, data)`, where implemented, the calls
-`predict(model, ..., observations)` and `transform(model, observations)` are supported
-alternatives, whenever `observations = obs(model, data)`.
+`predict(model, ..., observations)` and `transform(model, observations)` must be supported
+alternatives with the same output, whenever `observations = obs(model, data)`.
+
+Implicit in the above requirements is that `obs(learner, _)` and `obs(model, _)` are
+involutive, meaning both the following hold:
+
+```julia
+obs(learner, obs(learner, data)) == obs(learner, data)
+obs(model, obs(model, data)) == obs(model, data)
+```
+
+If one overloads `obs`, one typically needs additional overloadings to guarantee
+involutivity.
 
 The fallback for `obs` is `obs(model_or_learner, data) = data`, and the fallback for
 `LearnAPI.data_interface(learner)` is `LearnAPI.RandomAccess()`. For details refer to
@@ -67,14 +78,16 @@ to be overloaded. However, the user will get no performance benefits by using `o
 that case.
 
 When overloading `obs(learner, data)` to output new model-specific representations of
-data, it may be necessary to also overload [`LearnAPI.features`](@ref),
-[`LearnAPI.target`](@ref) (supervised learners), and/or [`LearnAPI.weights`](@ref) (if
-weights are supported), for extracting relevant parts of the representation.
+data, it may be necessary to also overload [`LearnAPI.features(learner,
+observations)`](@ref), [`LearnAPI.target(learner, observations)`](@ref) (supervised
+learners), and/or [`LearnAPI.weights(learner, observations)`](@ref) (if weights are
+supported), for each kind output `observations` of `obs(learner, data)`.
 
 ## Sample implementation
 
-Refer to the "Anatomy of an
-Implementation" section of the LearnAPI.jl
-[manual](https://juliaai.github.io/LearnAPI.jl/dev/).
+Refer to the ["Anatomy of an +Implementation"](https://juliaai.github.io/LearnAPI.jl/dev/anatomy_of_an_implementation/#Providing-an-advanced-data-interface) +section of the LearnAPI.jl manual. """ From 1c7123f763be3706dc8e9e579b6f45a08265337c Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 16:04:41 +1300 Subject: [PATCH 148/187] remove testing of patterns --- test/runtests.jl | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/test/runtests.jl b/test/runtests.jl index 47411cb2..631b0111 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -6,10 +6,10 @@ test_files = [ "clone.jl", "accessor_functions.jl", "target_features.jl", - "patterns/regression.jl", - "patterns/static_algorithms.jl", - "patterns/ensembling.jl", - "patterns/incremental_algorithms.jl", + # "patterns/regression.jl", + # "patterns/static_algorithms.jl", + # "patterns/ensembling.jl", + # "patterns/incremental_algorithms.jl", ] files = isempty(ARGS) ? test_files : ARGS From 69726175dc5ff46eac77db0ba9ffc550f4d0aaf3 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 17:10:15 +1300 Subject: [PATCH 149/187] add tests --- test/obs.jl | 8 ++++++++ test/predict_transform.jl | 27 +++++++++++++++++++++++++++ test/runtests.jl | 2 ++ test/traits.jl | 2 ++ 4 files changed, 39 insertions(+) create mode 100644 test/obs.jl create mode 100644 test/predict_transform.jl diff --git a/test/obs.jl b/test/obs.jl new file mode 100644 index 00000000..b9ab8119 --- /dev/null +++ b/test/obs.jl @@ -0,0 +1,8 @@ +using Test +using LearnAPI + +@testset "`obs` fallback" begin + @test obs("some learner", 42) == 42 +end + +true diff --git a/test/predict_transform.jl b/test/predict_transform.jl new file mode 100644 index 00000000..3f9648c9 --- /dev/null +++ b/test/predict_transform.jl @@ -0,0 +1,27 @@ +using Test +using LearnAPI + +struct Cherry end + +LearnAPI.fit(learner::Cherry, data; verbosity=1) = Ref(learner) +LearnAPI.learner(model::Base.RefValue{Cherry}) = model[] +LearnAPI.predict(model::Base.RefValue{Cherry}, ::Point, x) = 2x +@trait Cherry kinds_of_proxy=(Point(),) + +struct Ripe end + +LearnAPI.fit(learner::Ripe, data; verbosity=1) = Ref(learner) +LearnAPI.learner(model::Base.RefValue{Ripe}) = model[] +LearnAPI.predict(model::Base.RefValue{Ripe}, ::Distribution) = "a distribution" +LearnAPI.features(::Ripe, data) = nothing +@trait Ripe kinds_of_proxy=(Distribution(),) + +@testset "`predict` with no kind of proxy specified" begin + model = fit(Cherry(), "junk") + @test predict(model, 42) == 84 + + model = fit(Ripe(), "junk") + @test predict(model) == "a distribution" +end + +true diff --git a/test/runtests.jl b/test/runtests.jl index 631b0111..a9bd9760 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -4,6 +4,8 @@ test_files = [ "tools.jl", "traits.jl", "clone.jl", + "predict_transform.jl", + "obs.jl", "accessor_functions.jl", "target_features.jl", # "patterns/regression.jl", diff --git a/test/traits.jl b/test/traits.jl index 5fbfdca4..f76a361b 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -71,3 +71,5 @@ import .FruitSalad @testset "name" begin @test LearnAPI.name(FruitSalad.RedApple(1)) == "RedApple" end + +true From 0685024aaa0d9d54aba4b40db53a51559a9ad197 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sun, 27 Oct 2024 17:13:39 +1300 Subject: [PATCH 150/187] drop julia 1.6 support --- .github/workflows/ci.yml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 70e66dce..d71082e6 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -17,8 +17,7 @@ jobs: fail-fast: false matrix: version: - - '1.6' # previous LTS release - - '1.10' # new LTS release + - '1.10' # LTS release - '1' # automatically expands to the latest stable 1.x release of Julia. os: - ubuntu-latest From c5cbeb95f1f19a41a22f9ce1c7d985f9ac73954e Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 17:20:19 +1300 Subject: [PATCH 151/187] dump test dependencies no longer needed --- Project.toml | 22 +--------------------- test/runtests.jl | 4 ---- 2 files changed, 1 insertion(+), 25 deletions(-) diff --git a/Project.toml b/Project.toml index 7f9d9267..c488c715 100644 --- a/Project.toml +++ b/Project.toml @@ -7,27 +7,7 @@ version = "0.1.0" julia = "1.6" [extras] -DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" -Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f" -LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" -MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" -Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" -Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b" -StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3" -Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" -Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [targets] -test = [ - "DataFrames", - "Distributions", - "LinearAlgebra", - "MLUtils", - "Random", - "Serialization", - "StableRNGs", - "Statistics", - "Tables", - "Test", -] +test = ["Test",] diff --git a/test/runtests.jl b/test/runtests.jl index a9bd9760..8a255c83 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -8,10 +8,6 @@ test_files = [ "obs.jl", "accessor_functions.jl", "target_features.jl", - # "patterns/regression.jl", - # "patterns/static_algorithms.jl", - # "patterns/ensembling.jl", - # "patterns/incremental_algorithms.jl", ] files = isempty(ARGS) ? test_files : ARGS From 5eee97c66a4beaca0d930393d8dc048d77b40cd9 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 17:21:36 +1300 Subject: [PATCH 152/187] bump compat julia=1.10 --- Project.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Project.toml b/Project.toml index c488c715..2172f4e7 100644 --- a/Project.toml +++ b/Project.toml @@ -4,7 +4,7 @@ authors = ["Anthony D. Blaom "] version = "0.1.0" [compat] -julia = "1.6" +julia = "1.10" [extras] Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" From a3354c416b3664c0bc372616ea0f0ba44106365b Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 17:54:38 +1300 Subject: [PATCH 153/187] bump compat for docs --- docs/Project.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Project.toml b/docs/Project.toml index 47eb52e6..12dcdd5f 100644 --- a/docs/Project.toml +++ b/docs/Project.toml @@ -7,4 +7,4 @@ Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" [compat] Documenter = "1" -julia = "1.6" +julia = "1.10" From a4bfd9af36ec5813c9a9c2a119a7fbc18dea6caa Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sun, 27 Oct 2024 20:12:54 +1300 Subject: [PATCH 154/187] fix a tag --- src/traits.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/traits.jl b/src/traits.jl index fe20ce53..4046b178 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -171,7 +171,7 @@ tags() = [ "classification", "clustering", "gradient descent", - "iterative learners", + "iterative algorithms", "incremental algorithms", "feature engineering", "dimension reduction", From 7e72784d9bea94c4b7135f74c143acd7760639af Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 27 Oct 2024 20:13:32 +1300 Subject: [PATCH 155/187] whitespace --- src/traits.jl | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/traits.jl b/src/traits.jl index 4046b178..e55dfffe 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -164,8 +164,6 @@ kinds_of_proxy() = map(CONCRETE_TARGET_PROXY_SYMBOLS) do ex end |> eval end - - tags() = [ "regression", "classification", From 69fbfebd56bfa321a29dec366ee677a07e37963b Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Wed, 30 Oct 2024 17:26:49 +1300 Subject: [PATCH 156/187] add links to dim-reduction pattern; correct "ensembling" tag --- docs/src/common_implementation_patterns.md | 4 ++-- docs/src/patterns/dimension_reduction.md | 6 ++++++ docs/src/patterns/transformers.md | 6 ++++++ src/traits.jl | 2 +- 4 files changed, 15 insertions(+), 3 deletions(-) diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 9b128c6a..6af70007 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -27,11 +27,11 @@ implementations fall into one (or more) of the following informally understood p - [Feature Engineering](@ref): Algorithms for selecting or combining features -- Dimension Reduction: Transformers that learn to reduce feature space dimension +- [Dimension Reduction](@ref): Transformers that learn to reduce feature space dimension - Missing Value Imputation -- Transformers: Other transformers, such as standardizers, and categorical +- [Transformers](@ref): Other transformers, such as standardizers, and categorical encoders. - [Static Algorithms](@ref): Algorithms that do not learn, in the sense they must be diff --git a/docs/src/patterns/dimension_reduction.md b/docs/src/patterns/dimension_reduction.md index 3174adb8..63877c8c 100644 --- a/docs/src/patterns/dimension_reduction.md +++ b/docs/src/patterns/dimension_reduction.md @@ -1 +1,7 @@ # Dimension Reduction + +Check out the following examples: + +- [Truncated + SVD]((https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl + (from the TestLearnAPI.jl test suite) diff --git a/docs/src/patterns/transformers.md b/docs/src/patterns/transformers.md index 08e10a25..b6a6336e 100644 --- a/docs/src/patterns/transformers.md +++ b/docs/src/patterns/transformers.md @@ -1 +1,7 @@ # Transformers + +Check out the following examples: + +- [Truncated + SVD]((https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl + (from the TestLearnAPI.jl test suite) diff --git a/src/traits.jl b/src/traits.jl index e55dfffe..7c2dd6a2 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -176,7 +176,7 @@ tags() = [ "missing value imputation", "transformers", "static algorithms", - "ensemble algorithms", + "ensembling", "time series forecasting", "time series classification", "survival analysis", From 0d6669829a92c0ef228d06f39e354307919ea109 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Thu, 31 Oct 2024 10:46:19 +1300 Subject: [PATCH 157/187] add data intrfce rqrmnt on output of `features`, `target`, `weights` --- docs/src/anatomy_of_an_implementation.md | 16 ++++--- docs/src/obs.md | 2 +- docs/src/reference.md | 14 +++--- src/fit_update.jl | 2 +- src/obs.jl | 6 ++- src/target_weights_features.jl | 56 +++++++++++++++++------- 6 files changed, 64 insertions(+), 32 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 7af5c6d6..ca86b190 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -35,7 +35,7 @@ A transformer ordinarily implements `transform` instead of `predict`. For more o then an implementation must: (i) overload [`obs`](@ref) to articulate how provided data can be transformed into a form that does support this interface, as illustrated below under - [Providing an advanced data interface](@ref), and which may additionally + [Providing a separate data front end](@ref), and which may additionally enable certain performance benefits; or (ii) overload the trait [`LearnAPI.data_interface`](@ref) to specify a more relaxed data API. @@ -314,7 +314,7 @@ recovered_model = deserialize(filename) @assert predict(recovered_model, X) == predict(model, X) ``` -## Providing an advanced data interface +## Providing a separate data front end ```@setup anatomy2 using LearnAPI @@ -364,9 +364,13 @@ y = 2a - b + 3c + 0.05*rand(n) An implementation may optionally implement [`obs`](@ref), to expose to the user (or some meta-algorithm like cross-validation) the representation of input data internal to `fit` -or `predict`, such as the matrix version `A` of `X` in the ridge example. Here we -specifically wrap all the pre-processed data into single object, for which we introduce a -new type: +or `predict`, such as the matrix version `A` of `X` in the ridge example. That is, we may +factor out of `fit` (and also `predict`) the data pre-processing step, `obs`, to expose +its outcomes. These outcomes become alternative user inputs to `fit`. To see the use of +`obs` in action, see [below](@ref advanced_demo). + +Here we specifically wrap all the pre-processed data into single object, for which we +introduce a new type: ```@example anatomy2 struct RidgeFitObs{T,M<:AbstractMatrix{T}} @@ -503,7 +507,7 @@ As above, we add a signature which plays no role vis-à-vis LearnAPI.jl. LearnAPI.fit(learner::Ridge, X, y; kwargs...) = fit(learner, (X, y); kwargs...) ``` -## Demonstration of an advanced `obs` workflow +## [Demonstration of an advanced `obs` workflow](@id advanced_demo) We now can train and predict using internal data representations, resampled using the generic MLUtils.jl interface: diff --git a/docs/src/obs.md b/docs/src/obs.md index 3d206b70..5d81012e 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -83,7 +83,7 @@ end | [`obs(model, data)`](@ref) | here `data` is `predict`-consumable | not typically | returns `data` | -A sample implementation is given in [Providing an advanced data interface](@ref). +A sample implementation is given in [Providing a separate data front end](@ref). ## Reference diff --git a/docs/src/reference.md b/docs/src/reference.md index c6e9aaf3..422ca675 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -16,9 +16,7 @@ ML/statistical algorithms are typically applied in conjunction with resampling o *observations*, as in [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)). 
In this document *data* will always refer to objects encapsulating an ordered sequence of -individual observations. If a learner is trained using multiple data objects, it is -undertood that individual objects share the same number of observations, and that -resampling of one component implies synchronized resampling of the others. +individual observations. A `DataFrame` instance, from [DataFrames.jl](https://dataframes.juliadata.org/stable/), is an example of data, the observations being the rows. Typically, data provided to @@ -97,9 +95,11 @@ which can be tested with `@assert `[`LearnAPI.clone(learner)`](@ref)` == learner Note that if if `learner` is an instance of a *mutable* struct, this requirement generally requires overloading `Base.==` for the struct. -No LearnAPI.jl method is permitted to mutate a learner. In particular, one should make -deep copies of RNG hyperparameters before using them in a new implementation of -[`fit`](@ref). +!!! important + + No LearnAPI.jl method is permitted to mutate a learner. In particular, one should make + deep copies of RNG hyperparameters before using them in a new implementation of + [`fit`](@ref). #### Composite learners (wrappers) @@ -145,7 +145,7 @@ for each. [`LearnAPI.functions`](@ref). Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). For a -bare minimum implementation, see the implementation of `SmallLearner` +minimal (but useless) implementation, see the implementation of `SmallLearner` [here](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/traits.jl). ### List of methods diff --git a/src/fit_update.jl b/src/fit_update.jl index 2421acba..39a662a9 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -17,7 +17,7 @@ model = fit(learner, (X, y)) ŷ = predict(model, Xnew) ``` -The second signature, with `data` omitted, is provided by learners that do not +The signature `fit(learner; verbosity=1)` (no `data`) is provided by learners that do not generalize to new observations (called *static algorithms*). In that case, `transform(model, data)` or `predict(model, ..., data)` carries out the actual algorithm execution, writing any byproducts of that operation to the mutable object `model` returned diff --git a/src/obs.jl b/src/obs.jl index bfcd87bb..e6751e6c 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -77,11 +77,13 @@ only of suitable tables and arrays, then `obs` and `LearnAPI.data_interface` do to be overloaded. However, the user will get no performance benefits by using `obs` in that case. -When overloading `obs(learner, data)` to output new model-specific representations of +If overloading `obs(learner, data)` to output new model-specific representations of data, it may be necessary to also overload [`LearnAPI.features(learner, observations)`](@ref), [`LearnAPI.target(learner, observations)`](@ref) (supervised learners), and/or [`LearnAPI.weights(learner, observations)`](@ref) (if weights are -supported), for each kind output `observations` of `obs(learner, data)`. +supported), for each kind output `observations` of `obs(learner, data)`. Moreover, the +outputs of these methods, applied to `observations`, must also implement the interface +specfied by [`LearnAPI.data_interface(learner)`](@ref). 
## Sample implementation
 
diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl
index 8e58c07e..ba1a3b75 100644
--- a/src/target_weights_features.jl
+++ b/src/target_weights_features.jl
@@ -5,11 +5,15 @@
 Return, for each form of `data` supported in a call of the form [`fit(learner,
 data)`](@ref), the target variable part of `data`. If `nothing` is returned, the
 `learner` does not see a target variable in training (is unsupervised).
 
+The returned object `y` has the same number of observations as `data`. If `data` is the
+output of an [`obs`](@ref) call, then `y` is additionally guaranteed to implement the
+data interface specified by [`LearnAPI.data_interface(learner)`](@ref).
+
 # Extended help
 
 ## What is a target variable?
 
-Examples of target variables are house prices in realestate pricing estimates, the
+Examples of target variables are house prices in real estate pricing estimates, the
 "spam"/"not spam" labels in an email spam filtering task, "outlier"/"inlier" labels in
 outlier detection, cluster labels in clustering problems, and censored survival times in
 survival analysis. For more on targets and target proxies, see the "Reference" section of
 the LearnAPI.jl documentation.
 
 ## New implementations
 
-A fallback returns `nothing`. Must be implemented if `fit` consumes data including a
-target variable.
+A fallback returns `nothing`. The method must be overloaded if `fit` consumes data
+including a target variable.
+
+If overloading [`obs`](@ref), ensure that the return value, unless `nothing`, implements
+the data interface specified by [`LearnAPI.data_interface(learner)`](@ref), in the special
+case that `data` is the output of an `obs` call.
 
 $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.target)"; overloaded=true))
 
@@ -32,10 +40,20 @@
 Return, for each form of `data` supported in a call of the form [`fit(learner,
 data)`](@ref), the per-observation weights part of `data`. Where `nothing` is returned,
 no weights are part of `data`, which is to be interpreted as uniform weighting.
 
+The returned object `w` has the same number of observations as `data`. If `data` is the
+output of an [`obs`](@ref) call, then `w` is additionally guaranteed to implement the
+data interface specified by [`LearnAPI.data_interface(learner)`](@ref).
+
+# Extended help
+
 # New implementations
 
 Overloading is optional. A fallback returns `nothing`.
 
+If overloading [`obs`](@ref), ensure that the return value, unless `nothing`, implements
+the data interface specified by [`LearnAPI.data_interface(learner)`](@ref), in the special
+case that `data` is the output of an `obs` call.
+
 $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.weights)"; overloaded=true))
 
 """
@@ -53,26 +71,34 @@ implemented, as in the following sample workflow:
 
 ```julia
 model = fit(learner, data)
-X = features(data)
+X = LearnAPI.features(learner, data)
 ŷ = predict(learner, kind_of_proxy, X) # eg, `kind_of_proxy = Point()`
 ```
 
-The returned object has the same number of observations as `data`. For supervised models
-(i.e., where `:(LearnAPI.target) in LearnAPI.functions(learner)`) `ŷ` above is generally
-intended to be an approximate proxy for `LearnAPI.target(learner, data)`, the training
-target.
+For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(learner)`)
+`ŷ` above is generally intended to be an approximate proxy for `LearnAPI.target(learner,
+data)`, the training target.
+
+The object `X` returned by `LearnAPI.features` has the same number of observations as
+`data`.
If `data` is the output of an [`obs`](@ref) call, then `X` is additionally +guaranteed to implement the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). +# Extended help # New implementations -That the output can be passed to `predict` and/or `transform`, and has the same number of -observations as `data`, are the only contracts. A fallback returns `first(data)` if `data` -is a tuple, and otherwise returns `data`. +For density estimators, whose `fit` typically consumes *only* a target variable, you +should overload this method to return `nothing`. + +It must otherwise be possible to pass the return value `X` to `predict` and/or +`transform`, and `X` must have same number of observations as `data`. A fallback returns +`first(data)` if `data` is a tuple, and otherwise returns `data`. -Overloading may be necessary if [`obs(learner, data)`](@ref) is overloaded to return -some learner-specific representation of training `data`. For density estimators, whose -`fit` typically consumes *only* a target variable, you should overload this method to -return `nothing`. +Further overloadings may be necessary to handle the case that `data` is the output of +[`obs(learner, data)`](@ref), if `obs` is being overloaded. In this case, be sure that +`X`, unless `nothing`, implements the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). """ features(learner, data) = _first(data) From 12b8f4bac92ced2bc6b249209193297aea9259ae Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 10:46:00 +1300 Subject: [PATCH 158/187] add to docs: fit/predict/trans must support subsampled data --- docs/src/common_implementation_patterns.md | 4 ++-- docs/src/patterns/transformers.md | 2 +- src/obs.jl | 8 +++++++- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index 6af70007..b46ed768 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -31,8 +31,8 @@ implementations fall into one (or more) of the following informally understood p - Missing Value Imputation -- [Transformers](@ref): Other transformers, such as standardizers, and categorical - encoders. +- [Transformers](@ref transformers): Other transformers, such as standardizers, and + categorical encoders. - [Static Algorithms](@ref): Algorithms that do not learn, in the sense they must be re-executed for each new data set (do not generalize), but which have hyperparameters diff --git a/docs/src/patterns/transformers.md b/docs/src/patterns/transformers.md index b6a6336e..f085f928 100644 --- a/docs/src/patterns/transformers.md +++ b/docs/src/patterns/transformers.md @@ -1,4 +1,4 @@ -# Transformers +# [Transformers](@id transformers) Check out the following examples: diff --git a/src/obs.jl b/src/obs.jl index e6751e6c..23d5b5b9 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -57,7 +57,13 @@ fit(learner, observations)` is equivalent to `model = fit(learner, data)`, whene `predict(model, ..., observations)` and `transform(model, observations)` must be supported alternatives with the same output, whenever `observations = obs(model, data)`. 
-Implicit in the above requirements is that `obs(learner, _)` and `obs(model, _)` are +If `LearnAPI.data_interface(learner) == RandomAccess()` (the default), then `fit`, +`predict` and `transform` must additionally accept `obs` output that has been *subsampled* +using `MLUtils.getobs`, with the obvious interpretation applying to the outcomes of such +calls (e.g., if *all* observations are subsampled, then outcomes should be the same as if +using the original data). + +Implicit in preceding requirements is that `obs(learner, _)` and `obs(model, _)` are involutive, meaning both the following hold: ```julia From ec42302dd233d9308ea073e18c8400c30be2e556 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 10:50:54 +1300 Subject: [PATCH 159/187] doc tweak --- src/clone.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/clone.jl b/src/clone.jl index fef6515d..47fc0a3b 100644 --- a/src/clone.jl +++ b/src/clone.jl @@ -7,7 +7,7 @@ Return a shallow copy of `learner` with the specified hyperparameter replacement clone(learner; epochs=100, learning_rate=0.01) ``` -It is guaranteed that `LearnAPI.clone(learner) == learner`. +A LearnAPI.jl contract ensures that `LearnAPI.clone(learner) == learner`. """ function clone(learner; replacements...) From 3ed7301d99221cec8bbbc80815bd0bde2c076d73 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 11:38:35 +1300 Subject: [PATCH 160/187] fix a docstring --- docs/src/reference.md | 2 +- src/target_weights_features.jl | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/src/reference.md b/docs/src/reference.md index 422ca675..a387d115 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -116,7 +116,7 @@ understood to have a valid implementation of the LearnAPI.jl interface. #### Example -Any instance of `GradientRidgeRegressor` defined below is a valid learner. +Below is an example of a learner type with a valid constructor: ```julia struct GradientRidgeRegressor{T<:Real} diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl index ba1a3b75..c7831072 100644 --- a/src/target_weights_features.jl +++ b/src/target_weights_features.jl @@ -72,7 +72,7 @@ implemented, as in the following sample workflow: ```julia model = fit(learner, data) X = LearnAPI.features(learner, data) -ŷ = predict(learner, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` +ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` ``` For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(learner)`) From 9d35cb94c41f6f1399ce0c2d28f1fea2142a0da6 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 14:50:36 +1300 Subject: [PATCH 161/187] typos --- docs/src/anatomy_of_an_implementation.md | 2 +- src/obs.jl | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index ca86b190..754bf94d 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -425,7 +425,7 @@ LearnAPI.fit(learner::Ridge, data; kwargs...) = ### The `obs` contract Providing `fit` signatures matching the output of [`obs`](@ref), is the first part of the -`obs` contract. Since `obs(learner, data)` should evidentally support all `data` that +`obs` contract. 
Since `obs(learner, data)` should evidently support all `data` that `fit(learner, data)` supports, we must be able to apply `obs(learner, _)` to it's own output (`observations` below). This leads to the additional "no-op" declaration diff --git a/src/obs.jl b/src/obs.jl index 23d5b5b9..6ea5544e 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -89,7 +89,7 @@ observations)`](@ref), [`LearnAPI.target(learner, observations)`](@ref) (supervi learners), and/or [`LearnAPI.weights(learner, observations)`](@ref) (if weights are supported), for each kind output `observations` of `obs(learner, data)`. Moreover, the outputs of these methods, applied to `observations`, must also implement the interface -specfied by [`LearnAPI.data_interface(learner)`](@ref). +specified by [`LearnAPI.data_interface(learner)`](@ref). ## Sample implementation From 1a235f11059ee73bf84bd9b16482ec720bfc0efd Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 15:33:50 +1300 Subject: [PATCH 162/187] remove redundant files --- test/patterns/ensembling.jl | 215 ------------- test/patterns/gradient_descent.jl | 396 ------------------------ test/patterns/incremental_algorithms.jl | 135 -------- test/patterns/regression.jl | 290 ----------------- test/patterns/static_algorithms.jl | 147 --------- 5 files changed, 1183 deletions(-) delete mode 100644 test/patterns/ensembling.jl delete mode 100644 test/patterns/gradient_descent.jl delete mode 100644 test/patterns/incremental_algorithms.jl delete mode 100644 test/patterns/regression.jl delete mode 100644 test/patterns/static_algorithms.jl diff --git a/test/patterns/ensembling.jl b/test/patterns/ensembling.jl deleted file mode 100644 index 73b864b8..00000000 --- a/test/patterns/ensembling.jl +++ /dev/null @@ -1,215 +0,0 @@ -using LearnAPI -using LinearAlgebra -using Tables -import MLUtils -import DataFrames -using Random -using Statistics -using StableRNGs - -# # ENSEMBLE OF REGRESSORS (A MODEL WRAPPER) - -# We implement a learner that creates an bagged ensemble of regressors, i.e, where each -# atomic model is trained on a random sample of the training observations (same number, -# but sampled with replacement). In particular this learner has an iteration parameter -# `n`, and we implement `update` to execute a warm restarts when `n` increases. - -# no docstring here - that goes with the constructor; some fields left abstract for -# simplicity -# -struct Ensemble - atom # the base regressor being bagged - rng - n::Int -end - -# Since the `atom` hyperparameter is another learner, the user must explicitly set it in -# constructor calls or an error is thrown. We also need to overload the -# `LearnAPI.is_composite` trait (done later). - -""" - Ensemble(atom; rng=Random.default_rng(), n=10) - -Instantiate a bagged ensemble of `n` regressors, with base regressor `atom`, etc - -""" -Ensemble(atom; rng=Random.default_rng(), n=10) = - Ensemble(atom, rng, n) # `LearnAPI.constructor` defined later - -# need a pure keyword argument constructor: -function Ensemble(; atom=nothing, kwargs...) - isnothing(atom) && error("You must specify `atom=...` ") - Ensemble(atom; kwargs...) 
-end - -struct EnsembleFitted - learner::Ensemble - atom::Ridge - rng # mutated copy of `learner.rng` - models # leaving type abstract for simplicity -end - -LearnAPI.learner(model::EnsembleFitted) = model.learner - -# We add the same data interface that the atomic regressor uses: -LearnAPI.obs(learner::Ensemble, data) = LearnAPI.obs(learner.atom, data) -LearnAPI.obs(model::EnsembleFitted, data) = LearnAPI.obs(first(model.models), data) -LearnAPI.target(learner::Ensemble, data) = LearnAPI.target(learner.atom, data) -LearnAPI.features(learner::Ensemble, data) = LearnAPI.features(learner.atom, data) - -function LearnAPI.fit(learner::Ensemble, data; verbosity=1) - - # unpack hyperparameters: - atom = learner.atom - rng = deepcopy(learner.rng) # to prevent mutation of `learner`! - n = learner.n - - # ensure data can be subsampled using MLUtils.jl, and that we're feeding the atomic - # `fit` data in an efficient (pre-processed) form: - - observations = obs(atom, data) - - # initialize ensemble: - models = [] - - # get number of observations: - N = MLUtils.numobs(observations) - - # train the ensemble: - for _ in 1:n - bag = rand(rng, 1:N, N) - data_subset = MLUtils.getobs(observations, bag) - # step down one verbosity level in atomic fit: - model = fit(atom, data_subset; verbosity=verbosity - 1) - push!(models, model) - end - - # make some noise, if allowed: - verbosity > 0 && @info "Trained $n ridge regression models. " - - return EnsembleFitted(learner, atom, rng, models) - -end - -# Consistent with the documented `update` contract, we implement this behaviour: If `n` is -# increased, `update` adds new regressors to the ensemble, including any new -# hyperparameter updates (e.g, new `atom`) when computing the new atomic -# models. Otherwise, update is equivalent to retraining from scratch, with the provided -# hyperparameter updates. -function LearnAPI.update(model::EnsembleFitted, data; verbosity=1, replacements...) - learner_old = LearnAPI.learner(model) - learner = LearnAPI.clone(learner_old; replacements...) - - :n in keys(replacements) || return fit(learner, data) - - n = learner.n - Δn = n - learner_old.n - n < 0 && return fit(model, learner) - - atom = learner.atom - observations = obs(atom, data) - N = MLUtils.numobs(observations) - - # initialize: - models = model.models - rng = model.rng # as mutated in previous `fit`/`update` calls - - # add new regressors to the ensemble: - for _ in 1:Δn - bag = rand(rng, 1:N, N) - data_subset = MLUtils.getobs(observations, bag) - model = fit(atom, data_subset; verbosity=verbosity-1) - push!(models, model) - end - - # make some noise, if allowed: - verbosity > 0 && @info "Trained $Δn additional ridge regression models. 
" - - return EnsembleFitted(learner, atom, rng, models) -end - -LearnAPI.predict(model::EnsembleFitted, ::Point, data) = - mean(model.models) do atomic_model - predict(atomic_model, Point(), data) - end - -LearnAPI.strip(model::EnsembleFitted) = EnsembleFitted( - model.learner, - model.atom, - model.rng, - LearnAPI.strip.(Ref(model.atom), models), -) - -# learner traits (note the inclusion of `iteration_parameter`): -@trait( - Ensemble, - constructor = Ensemble, - iteration_parameter = :n, - is_composite = true, - kinds_of_proxy = (Point(),), - tags = ("regression", "ensemble algorithms", "iterative models"), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.update), - :(LearnAPI.predict), - ) -) - -# convenience method: -LearnAPI.fit(learner::Ensemble, X, y, extras...; kwargs...) = - fit(learner, (X, y, extras...); kwargs...) -LearnAPI.update(learner::EnsembleFitted, X, y, extras...; kwargs...) = - update(learner, (X, y, extras...); kwargs...) - - -# synthetic test data: -N = 10 # number of observations -train = 1:6 -test = 7:10 -a, b, c = rand(N), rand(N), rand(N) -X = (; a, b, c) -X = DataFrames.DataFrame(X) -y = 2a - b + 3c + 0.05*rand(N) -data = (X, y) -Xtrain = Tables.subset(X, train) -Xtest = Tables.subset(X, test) - -@testset "test an implementation of bagged ensemble of ridge regressors" begin - rng = StableRNG(123) - atom = Ridge() - learner = Ensemble(atom; n=4, rng) - @test LearnAPI.clone(learner) == learner - @test :(LearnAPI.obs) in LearnAPI.functions(learner) - @test LearnAPI.target(learner, data) == y - @test LearnAPI.features(learner, data) == X - - model = @test_logs( - (:info, r"Trained 4 ridge"), - fit(learner, Xtrain, y[train]; verbosity=1), - ); - - ŷ4 = predict(model, Point(), Xtest) - @test ŷ4 == predict(model, Xtest) - - # add 3 atomic models to the ensemble: - model = update(model, Xtrain, y[train]; verbosity=0, n=7); - ŷ7 = predict(model, Xtest) - - # compare with cold restart: - model_cold = fit(LearnAPI.clone(learner; n=7), Xtrain, y[train]; verbosity=0); - @test ŷ7 ≈ predict(model_cold, Xtest) - - # test that we get a cold restart if another hyperparameter is changed: - model2 = update(model, Xtrain, y[train]; atom=Ridge(0.05)) - learner2 = Ensemble(Ridge(0.05); n=7, rng) - model_cold = fit(learner2, Xtrain, y[train]; verbosity=0) - @test predict(model2, Xtest) ≈ predict(model_cold, Xtest) - -end - -true diff --git a/test/patterns/gradient_descent.jl b/test/patterns/gradient_descent.jl deleted file mode 100644 index 27c9791e..00000000 --- a/test/patterns/gradient_descent.jl +++ /dev/null @@ -1,396 +0,0 @@ -using Pkg -Pkg.activate("perceptron", shared=true) - -using LearnAPI -using Random -using Statistics -using StableRNGs -import Optimisers -import Zygote -import NNlib -import CategoricalDistributions -import CategoricalDistributions: pdf, mode -import ComponentArrays - -# # PERCEPTRON - -# We implement a simple perceptron classifier to illustrate some common patterns for -# gradient descent algorithms. This includes implementation of the following methods: - -# - `update` -# - `update_observations` -# - `iteration_parameter` -# - `training_losses` -# - `obs` for pre-processing (non-tabular) classification training data -# - `predict(learner, ::Distribution, Xnew)` - -# For simplicity, we use single-observation batches for gradient descent updates, and we -# may dodge some optimizations. - -# This is an example of a probability-predicting classifier. 
- - -# ## Helpers - -""" - brier_loss(probs, hot) - -Return Brier (quadratic) loss. - -- `probs`: predicted probability vector -- `hot`: corresponding ground truth observation, as a one-hot encoded `BitVector` - -""" -function brier_loss(probs, hot) - offset = 1 + sum(probs.^2) - return offset - 2*(sum(probs.*hot)) -end - -""" - corefit(perceptron, optimiser, X, y_hot, epochs, state, verbosity) - -Return updated `perceptron`, `state` and training losses by carrying out gradient descent -for the specified number of `epochs`. - -- `perceptron`: component array with components `weights` and `bias` -- `optimiser`: optimiser from Optimiser.jl -- `X`: feature matrix, of size `(p, n)` -- `y_hot`: one-hot encoded target, of size `(nclasses, n)` -- `epochs`: number of epochs -- `state`: optimiser state - -""" -function corefit(perceptron, X, y_hot, epochs, state, verbosity) - n = size(y_hot) |> last - losses = map(1:epochs) do _ - total_loss = zero(Float32) - for i in 1:n - loss, grad = Zygote.withgradient(perceptron) do p - probs = p.weights*X[:,i] + p.bias |> NNlib.softmax - brier_loss(probs, y_hot[:,i]) - end - ∇loss = only(grad) - state, perceptron = Optimisers.update(state, perceptron, ∇loss) - total_loss += loss - end - # make some noise, if allowed: - verbosity > 0 && @info "Training loss: $total_loss" - total_loss - end - return perceptron, state, losses -end - - -# ## Implementation - -# ### Learner - -# no docstring here - that goes with the constructor; -# SOME FIELDS LEFT ABSTRACT FOR SIMPLICITY -struct PerceptronClassifier - epochs::Int - optimiser # an optmiser from Optimsers.jl - rng -end - -""" - PerceptronClassifier(; epochs=50, optimiser=Optimisers.Adam(), rng=Random.default_rng()) - -Instantiate a perceptron classifier. - -Train an instance, `learner`, by doing `model = fit(learner, X, y)`, where - -- `X is a `Float32` matrix, with observations-as-columns -- `y` (target) is some one-dimensional `CategoricalArray`. - -Get probabilistic predictions with `predict(model, Xnew)` and -point predictions with `predict(model, Point(), Xnew)`. - -# Warm restart options - - update_observations(model, newdata; replacements...) - -Return an updated model, with the weights and bias of the previously learned perceptron -used as the starting state in new gradient descent updates. Adopt any specified -hyperparameter `replacements` (properties of `LearnAPI.learner(model)`). - - update(model, newdata; epochs=n, replacements...) - -If `Δepochs = n - perceptron.epochs` is non-negative, then return an updated model, with -the weights and bias of the previously learned perceptron used as the starting state in -new gradient descent updates for `Δepochs` epochs, and using the provided `newdata` -instead of the previous training data. Any other hyperparaameter `replacements` are also -adopted. If `Δepochs` is negative or not specified, instead return `fit(learner, -newdata)`, where `learner=LearnAPI.clone(learner; epochs=n, replacements....)`. 
- -""" -PerceptronClassifier(; epochs=50, optimiser=Optimisers.Adam(), rng=Random.default_rng()) = - PerceptronClassifier(epochs, optimiser, rng) - - -# ### Data interface - -# For raw training data: -LearnAPI.target(learner::PerceptronClassifier, data::Tuple) = last(data) - -# For wrapping pre-processed training data (output of `obs(learner, data)`): -struct PerceptronClassifierObs - X::Matrix{Float32} - y_hot::BitMatrix # one-hot encoded target - classes # the (ordered) pool of `y`, as `CategoricalValue`s -end - -# For pre-processing the training data: -function LearnAPI.obs(learner::PerceptronClassifier, data::Tuple) - X, y = data - classes = CategoricalDistributions.classes(y) - y_hot = classes .== permutedims(y) # one-hot encoding - return PerceptronClassifierObs(X, y_hot, classes) -end - -# implement `RadomAccess()` interface for output of `obs`: -Base.length(observations::PerceptronClassifierObs) = length(observations.y) -Base.getindex(observations, I) = PerceptronClassifierObs( - (@view observations.X[:, I]), - (@view observations.y[I]), - observations.classes, -) - -LearnAPI.target( - learner::PerceptronClassifier, - observations::PerceptronClassifierObs, -) = observations.y - -LearnAPI.features( - learner::PerceptronClassifier, - observations::PerceptronClassifierObs, -) = observations.X - -# Note that data consumed by `predict` needs no pre-processing, so no need to overload -# `obs(model, data)`. - - -# ### Fitting and updating - -# For wrapping outcomes of learning: -struct PerceptronClassifierFitted - learner::PerceptronClassifier - perceptron # component array storing weights and bias - state # optimiser state - classes # target classes - losses -end - -LearnAPI.learner(model::PerceptronClassifierFitted) = model.learner - -# `fit` for pre-processed data (output of `obs(learner, data)`): -function LearnAPI.fit( - learner::PerceptronClassifier, - observations::PerceptronClassifierObs; - verbosity=1, - ) - - # unpack hyperparameters: - epochs = learner.epochs - optimiser = learner.optimiser - rng = deepcopy(learner.rng) # to prevent mutation of `learner`! - - # unpack data: - X = observations.X - y_hot = observations.y_hot - classes = observations.classes - nclasses = length(classes) - - # initialize bias and weights: - weights = randn(rng, Float32, nclasses, p) - bias = zeros(Float32, nclasses) - perceptron = (; weights, bias) |> ComponentArrays.ComponentArray - - # initialize optimiser: - state = Optimisers.setup(optimiser, perceptron) - - perceptron, state, losses = corefit(perceptron, X, y_hot, epochs, state, verbosity) - - return PerceptronClassifierFitted(learner, perceptron, state, classes, losses) -end - -# `fit` for unprocessed data: -LearnAPI.fit(learner::PerceptronClassifier, data; kwargs...) = - fit(learner, obs(learner, data); kwargs...) - -# see the `PerceptronClassifier` docstring for `update_observations` logic. -function LearnAPI.update_observations( - model::PerceptronClassifierFitted, - observations_new::PerceptronClassifierObs; - verbosity=1, - replacements..., - ) - - # unpack data: - X = observations_new.X - y_hot = observations_new.y_hot - classes = observations_new.classes - nclasses = length(classes) - - classes == model.classes || error("New training target has incompatible classes.") - - learner_old = LearnAPI.learner(model) - learner = LearnAPI.clone(learner_old; replacements...) 
- - perceptron = model.perceptron - state = model.state - losses = model.losses - epochs = learner.epochs - - perceptron, state, losses_new = corefit(perceptron, X, y_hot, epochs, state, verbosity) - losses = vcat(losses, losses_new) - - return PerceptronClassifierFitted(learner, perceptron, state, classes, losses) -end -LearnAPI.update_observations(model::PerceptronClassifierFitted, data; kwargs...) = - update_observations(model, obs(LearnAPI.learner(model), data); kwargs...) - -# see the `PerceptronClassifier` docstring for `update` logic. -function LearnAPI.update( - model::PerceptronClassifierFitted, - observations::PerceptronClassifierObs; - verbosity=1, - replacements..., - ) - - # unpack data: - X = observations.X - y_hot = observations.y_hot - classes = observations.classes - nclasses = length(classes) - - classes == model.classes || error("New training target has incompatible classes.") - - learner_old = LearnAPI.learner(model) - learner = LearnAPI.clone(learner_old; replacements...) - :epochs in keys(replacements) || return fit(learner, observations) - - perceptron = model.perceptron - state = model.state - losses = model.losses - - epochs = learner.epochs - Δepochs = epochs - learner_old.epochs - epochs < 0 && return fit(model, learner) - - perceptron, state, losses_new = corefit(perceptron, X, y_hot, Δepochs, state, verbosity) - losses = vcat(losses, losses_new) - - return PerceptronClassifierFitted(learner, perceptron, state, classes, losses) -end -LearnAPI.update(model::PerceptronClassifierFitted, data; kwargs...) = - update(model, obs(LearnAPI.learner(model), data); kwargs...) - - -# ### Predict - -function LearnAPI.predict(model::PerceptronClassifierFitted, ::Distribution, Xnew) - perceptron = model.perceptron - classes = model.classes - probs = perceptron.weights*Xnew .+ perceptron.bias |> NNlib.softmax - return CategoricalDistributions.UnivariateFinite(classes, probs') -end - -LearnAPI.predict(model::PerceptronClassifierFitted, ::Point, Xnew) = - mode.(predict(model, Distribution(), Xnew)) - - -# ### Accessor functions - -LearnAPI.training_losses(model::PerceptronClassifierFitted) = model.losses - - -# ### Traits - -@trait( - PerceptronClassifier, - constructor = PerceptronClassifier, - iteration_parameter = :epochs, - kinds_of_proxy = (Distribution(), Point()), - tags = ("classification", "iterative algorithms", "incremental algorithms"), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.update), - :(LearnAPI.update_observations), - :(LearnAPI.predict), - :(LearnAPI.training_losses), - ) -) - - -# ### Convenience methods - -LearnAPI.fit(learner::PerceptronClassifier, X, y; kwargs...) = - fit(learner, (X, y); kwargs...) -LearnAPI.update_observations(learner::PerceptronClassifier, X, y; kwargs...) = - update_observations(learner, (X, y); kwargs...) -LearnAPI.update(learner::PerceptronClassifier, X, y; kwargs...) = - update(learner, (X, y); kwargs...) 
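
To contrast the two warm-restart methods defined above, a hedged sketch, in which `X`,
`y`, `Xnew`, `ynew` are placeholders and `model` is assumed already trained with
`epochs=50`:

```julia
# `update` re-presents data and, when `epochs` increases, runs only the extra epochs
# from the current weights; any other hyperparameter change, or a decrease in
# `epochs`, triggers retraining from scratch:
model2 = update(model, (X, y); epochs=80)          # 30 additional epochs

# `update_observations` treats its data as genuinely new observations, running a
# full `epochs` worth of gradient descent from the current weights:
model3 = update_observations(model, (Xnew, ynew))
```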
- - -# ## Tests - -# synthetic test data: -N = 10 -n = 10N # number of observations -p = 2 # number of features -train = 1:6N -test = (6N+1:10N) -rng = StableRNG(123) -X = randn(rng, Float32, p, n); -coefficients = rand(rng, Float32, p)' -y_continuous = coefficients*X |> vec -η1 = quantile(y_continuous, 1/3) -η2 = quantile(y_continuous, 2/3) -y = map(y_continuous) do η - η < η1 && return "A" - η < η2 && return "B" - "C" -end |> CategoricalDistributions.categorical; -Xtrain = X[:, train]; -Xtest = X[:, test]; -ytrain = y[train]; -ytest = y[test]; - -@testset "PerceptronClassfier" begin - rng = StableRNG(123) - learner = PerceptronClassifier(; optimiser=Optimisers.Adam(0.01), epochs=40, rng) - @test LearnAPI.clone(learner) == learner - @test :(LearnAPI.update) in LearnAPI.functions(learner) - @test LearnAPI.target(learner, (X, y)) == y - @test LearnAPI.features(learner, (X, y)) == X - - model40 = fit(learner, Xtrain, ytrain; verbosity=0) - - # 40 epochs is sufficient for 90% accuracy in this case: - @test sum(predict(model40, Point(), Xtest) .== ytest)/length(ytest) > 0.9 - - # get probabilistic predictions: - ŷ40 = predict(model40, Distribution(), Xtest); - @test predict(model40, Xtest) ≈ ŷ40 - - # add 30 epochs in an `update`: - model70 = update(model40, Xtrain, y[train]; verbosity=0, epochs=70) - ŷ70 = predict(model70, Xtest); - @test !(ŷ70 ≈ ŷ40) - - # compare with cold restart: - model = fit(LearnAPI.clone(learner; epochs=70), Xtrain, y[train]; verbosity=0); - @test ŷ70 ≈ predict(model, Xtest) - - # instead add 30 epochs using `update_observations` instead: - model70b = update_observations(model40, Xtrain, y[train]; verbosity=0, epochs=30) - @test ŷ70 ≈ predict(model70b, Xtest) ≈ predict(model, Xtest) -end - -true diff --git a/test/patterns/incremental_algorithms.jl b/test/patterns/incremental_algorithms.jl deleted file mode 100644 index 20b01779..00000000 --- a/test/patterns/incremental_algorithms.jl +++ /dev/null @@ -1,135 +0,0 @@ -using LearnAPI -using Statistics -using StableRNGs - -import Distributions - -# # NORMAL DENSITY ESTIMATOR - -# An example of density estimation and also of incremental learning -# (`update_observations`). - - -# ## Implementation - -""" - NormalEstimator() - -Instantiate a learner for finding the maximum likelihood normal distribution fitting -some real univariate data `y`. Estimates can be updated with new data. - -```julia -model = fit(NormalEstimator(), y) -d = predict(model) # returns the learned `Normal` distribution -``` - -While the above is equivalent to the single operation `d = -predict(NormalEstimator(), y)`, the above workflow allows for the presentation of -additional observations post facto: The following is equivalent to `d2 = -predict(NormalEstimator(), vcat(y, ynew))`: - -```julia -update_observations(model, ynew) -d2 = predict(model) -``` - -Inspect all learned parameters with `LearnAPI.extras(model)`. 
Predict a 95% -confidence interval with `predict(model, ConfidenceInterval())` - -""" -struct NormalEstimator end - -struct NormalEstimatorFitted{T} - Σy::T - ȳ::T - ss::T # sum of squared residuals - n::Int -end - -LearnAPI.learner(::NormalEstimatorFitted) = NormalEstimator() - -function LearnAPI.fit(::NormalEstimator, y) - n = length(y) - Σy = sum(y) - ȳ = Σy/n - ss = sum(x->x^2, y) - n*ȳ^2 - return NormalEstimatorFitted(Σy, ȳ, ss, n) -end - -function LearnAPI.update_observations(model::NormalEstimatorFitted, ynew) - m = length(ynew) - n = model.n + m - Σynew = sum(ynew) - Σy = model.Σy + Σynew - ȳ = Σy/n - δ = model.n*((m*model.ȳ - Σynew)/n)^2 - ss = model.ss + δ + sum(x -> (x - ȳ)^2, ynew) - return NormalEstimatorFitted(Σy, ȳ, ss, n) -end - -LearnAPI.features(::NormalEstimator, y) = nothing -LearnAPI.target(::NormalEstimator, y) = y - -LearnAPI.predict(model::NormalEstimatorFitted, ::SingleDistribution) = - Distributions.Normal(model.ȳ, sqrt(model.ss/model.n)) -LearnAPI.predict(model::NormalEstimatorFitted, ::Point) = model.ȳ -function LearnAPI.predict(model::NormalEstimatorFitted, ::ConfidenceInterval) - d = predict(model, SingleDistribution()) - return (quantile(d, 0.025), quantile(d, 0.975)) -end - -# for fit and predict in one line: -LearnAPI.predict(::NormalEstimator, k::LearnAPI.KindOfProxy, y) = - predict(fit(NormalEstimator(), y), k) -LearnAPI.predict(::NormalEstimator, y) = predict(NormalEstimator(), SingleDistribution(), y) - -LearnAPI.extras(model::NormalEstimatorFitted) = (μ=model.ȳ, σ=sqrt(model.ss/model.n)) - -@trait( - NormalEstimator, - constructor = NormalEstimator, - kinds_of_proxy = (SingleDistribution(), Point(), ConfidenceInterval()), - tags = ("density estimation", "incremental algorithms"), - is_pure_julia = true, - human_name = "normal distribution estimator", - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.predict), - :(LearnAPI.update_observations), - :(LearnAPI.extras), - ), -) - -# ## Tests - -@testset "NormalEstimator" begin - rng = StableRNG(123) - y = rand(rng, 50); - ynew = rand(rng, 10); - learner = NormalEstimator() - model = fit(learner, y) - d = predict(model) - μ, σ = Distributions.params(d) - @test μ ≈ mean(y) - @test σ ≈ std(y)*sqrt(49/50) # `std` uses Bessel's correction - - # accessor function: - @test LearnAPI.extras(model) == (; μ, σ) - - # one-liner: - @test predict(learner, y) == d - @test predict(learner, Point(), y) ≈ μ - @test predict(learner, ConfidenceInterval(), y)[1] ≈ quantile(d, 0.025) - - # updating: - model = update_observations(model, ynew) - μ2, σ2 = LearnAPI.extras(model) - μ3, σ3 = LearnAPI.extras(fit(learner, vcat(y, ynew))) # training ab initio - @test μ2 ≈ μ3 - @test σ2 ≈ σ3 -end diff --git a/test/patterns/regression.jl b/test/patterns/regression.jl deleted file mode 100644 index f7d8d073..00000000 --- a/test/patterns/regression.jl +++ /dev/null @@ -1,290 +0,0 @@ -using LearnAPI -using LinearAlgebra -using Tables -import MLUtils -import DataFrames - - -# # NAIVE RIDGE REGRESSION WITH NO INTERCEPTS - -# We overload `obs` to expose internal representation of data. See later for a simpler -# variation using the `obs` fallback. - - -# ## Implementation - -# no docstring here - that goes with the constructor -struct Ridge - lambda::Float64 -end - -""" - Ridge(; lambda=0.1) - -Instantiate a ridge regression learner, with regularization of `lambda`. 
- -""" -Ridge(; lambda=0.1) = Ridge(lambda) # LearnAPI.constructor defined later - -struct RidgeFitObs{T,M<:AbstractMatrix{T}} - A::M # p x n - names::Vector{Symbol} - y::Vector{T} -end - -struct RidgeFitted{T,F} - learner::Ridge - coefficients::Vector{T} - feature_importances::F -end - -LearnAPI.learner(model::RidgeFitted) = model.learner - -Base.getindex(data::RidgeFitObs, I) = - RidgeFitObs(data.A[:,I], data.names, data.y[I]) -Base.length(data::RidgeFitObs) = length(data.y) - -# observations for consumption by `fit`: -function LearnAPI.obs(::Ridge, data) - X, y = data - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - RidgeFitObs(Tables.matrix(table)', names, y) -end - -# for observations: -function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) - - # unpack hyperparameters and data: - lambda = learner.lambda - A = observations.A - names = observations.names - y = observations.y - - # apply core learner: - coefficients = (A*A' + learner.lambda*I)\(A*y) # 1 x p matrix - - # determine crude feature importances: - feature_importances = - [names[j] => abs(coefficients[j]) for j in eachindex(names)] - sort!(feature_importances, by=last) |> reverse! - - # make some noise, if allowed: - verbosity > 0 && - @info "Features in order of importance: $(first.(feature_importances))" - - return RidgeFitted(learner, coefficients, feature_importances) - -end - -# for unprocessed `data = (X, y)`: -LearnAPI.fit(learner::Ridge, data; kwargs...) = - fit(learner, obs(learner, data); kwargs...) - -# extracting stuff from training data: -LearnAPI.target(::Ridge, data) = last(data) -LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y -LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A - -# observations for consumption by `predict`: -LearnAPI.obs(::RidgeFitted, X) = Tables.matrix(X)' - -# matrix input: -LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) = - observations'*model.coefficients - -# tabular input: -LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = - predict(model, Point(), obs(model, Xnew)) - -# accessor function: -LearnAPI.feature_importances(model::RidgeFitted) = model.feature_importances - -LearnAPI.strip(model::RidgeFitted) = - RidgeFitted(model.learner, model.coefficients, nothing) - -@trait( - Ridge, - constructor = Ridge, - kinds_of_proxy = (Point(),), - tags = ("regression",), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.predict), - :(LearnAPI.feature_importances), - ) -) - -# convenience method: -LearnAPI.fit(learner::Ridge, X, y; kwargs...) = - fit(learner, (X, y); kwargs...) 
- - -# ## Tests - -# synthetic test data: -n = 30 # number of observations -train = 1:6 -test = 7:10 -a, b, c = rand(n), rand(n), rand(n) -X = (; a, b, c) -X = DataFrames.DataFrame(X) -y = 2a - b + 3c + 0.05*rand(n) -data = (X, y) - -@testset "test an implementation of ridge regression" begin - learner = Ridge(lambda=0.5) - @test :(LearnAPI.obs) in LearnAPI.functions(learner) - - @test LearnAPI.target(learner, data) == y - @test LearnAPI.features(learner, data) == X - - # verbose fitting: - @test_logs( - (:info, r"Feature"), - fit( - learner, - Tables.subset(X, train), - y[train]; - verbosity=1, - ), - ) - - # quiet fitting: - model = @test_logs( - fit( - learner, - Tables.subset(X, train), - y[train]; - verbosity=0, - ), - ) - - ŷ = predict(model, Point(), Tables.subset(X, test)) - @test ŷ isa Vector{Float64} - @test predict(model, Tables.subset(X, test)) == ŷ - - fitobs = LearnAPI.obs(learner, data) - predictobs = LearnAPI.obs(model, X) - model = fit(learner, MLUtils.getobs(fitobs, train); verbosity=0) - @test LearnAPI.target(learner, fitobs) == y - @test predict(model, Point(), MLUtils.getobs(predictobs, test)) ≈ ŷ - @test predict(model, LearnAPI.features(learner, fitobs)) ≈ predict(model, X) - - @test LearnAPI.feature_importances(model) isa Vector{<:Pair{Symbol}} - - filename = tempname() - using Serialization - small_model = LearnAPI.strip(model) - serialize(filename, small_model) - - recovered_model = deserialize(filename) - @test LearnAPI.learner(recovered_model) == learner - @test predict( - recovered_model, - Point(), - MLUtils.getobs(predictobs, test) - ) ≈ ŷ - -end - -# # VARIATION OF RIDGE REGRESSION THAT USES FALLBACK OF LearnAPI.obs - -# no docstring here - that goes with the constructor -struct BabyRidge - lambda::Float64 -end - - -# ## Implementation - -""" - BabyRidge(; lambda=0.1) - -Instantiate a ridge regression learner, with regularization of `lambda`. - -""" -BabyRidge(; lambda=0.1) = BabyRidge(lambda) # LearnAPI.constructor defined later - -struct BabyRidgeFitted{T,F} - learner::BabyRidge - coefficients::Vector{T} - feature_importances::F -end - -function LearnAPI.fit(learner::BabyRidge, data; verbosity=1) - - X, y = data - - lambda = learner.lambda - table = Tables.columntable(X) - names = Tables.columnnames(table) |> collect - A = Tables.matrix(table)' - - # apply core learner: - coefficients = (A*A' + learner.lambda*I)\(A*y) # vector - - feature_importances = nothing - - return BabyRidgeFitted(learner, coefficients, feature_importances) - -end - -# extracting stuff from training data: -LearnAPI.target(::BabyRidge, data) = last(data) - -LearnAPI.learner(model::BabyRidgeFitted) = model.learner - -LearnAPI.predict(model::BabyRidgeFitted, ::Point, Xnew) = - Tables.matrix(Xnew)*model.coefficients - -LearnAPI.strip(model::BabyRidgeFitted) = - BabyRidgeFitted(model.learner, model.coefficients, nothing) - -@trait( - BabyRidge, - constructor = BabyRidge, - kinds_of_proxy = (Point(),), - tags = ("regression",), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.features), - :(LearnAPI.target), - :(LearnAPI.predict), - :(LearnAPI.feature_importances), - ) -) - -# convenience method: -LearnAPI.fit(learner::BabyRidge, X, y; kwargs...) = - fit(learner, (X, y); kwargs...) 
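
Since `BabyRidge` does not overload `obs`, the LearnAPI.jl fallback applies, which
simply returns the data unchanged. An illustrative check, assuming that fallback and
with `X`, `y` as above:

```julia
learner = BabyRidge(lambda=0.5)
@assert obs(learner, (X, y)) === (X, y)  # fallback: `obs` is the identity on `data`
```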
- - -# ## Tests - -@testset "test a variation which does not overload LearnAPI.obs" begin - learner = BabyRidge(lambda=0.5) - - model = fit(learner, Tables.subset(X, train), y[train]; verbosity=0) - ŷ = predict(model, Point(), Tables.subset(X, test)) - @test ŷ isa Vector{Float64} - - fitobs = obs(learner, data) - predictobs = LearnAPI.obs(model, X) - model = fit(learner, MLUtils.getobs(fitobs, train); verbosity=0) - @test predict(model, Point(), MLUtils.getobs(predictobs, test)) == ŷ == - predict(model, MLUtils.getobs(predictobs, test)) - @test LearnAPI.target(learner, data) == y - @test LearnAPI.predict(model, X) ≈ - LearnAPI.predict(model, LearnAPI.features(learner, data)) -end - -true diff --git a/test/patterns/static_algorithms.jl b/test/patterns/static_algorithms.jl deleted file mode 100644 index 243cab44..00000000 --- a/test/patterns/static_algorithms.jl +++ /dev/null @@ -1,147 +0,0 @@ -using LearnAPI -using Tables -import MLUtils -import DataFrames - - -# # TRANSFORMER TO SELECT SOME FEATURES (COLUMNS) OF A TABLE - -# See later for a variation that stores the names of rejected features in the model -# object, for inspection by an accessor function. - -struct Selector - names::Vector{Symbol} -end -Selector(; names=Symbol[]) = Selector(names) # LearnAPI.constructor defined later - -# `fit` consumes no observational data, does no "learning", and just returns a thinly -# wrapped `learner` (to distinguish it from the learner in dispatch): -LearnAPI.fit(learner::Selector; verbosity=1) = Ref(learner) -LearnAPI.learner(model) = model[] - -function LearnAPI.transform(model::Base.RefValue{Selector}, X) - learner = LearnAPI.learner(model) - table = Tables.columntable(X) - names = Tables.columnnames(table) - filtered_names = filter(in(learner.names), names) - filtered_columns = (Tables.getcolumn(table, name) for name in filtered_names) - filtered_table = NamedTuple{filtered_names}((filtered_columns...,)) - return Tables.materializer(X)(filtered_table) -end - -# fit and transform in one go: -function LearnAPI.transform(learner::Selector, X) - model = fit(learner) - transform(model, X) -end - -# note the necessity of overloading `is_static` (`fit` consumes no data): -@trait( - Selector, - constructor = Selector, - tags = ("feature engineering",), - is_static = true, - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.transform), - ), -) - -@testset "test a static transformer" begin - learner = Selector(names=[:x, :w]) - X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w]) - model = fit(learner) # no data arguments! - # if provided, data is ignored: - @test LearnAPI.learner(model) == learner - W = transform(model, X) - @test W == DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w]) - @test W == transform(learner, X) -end - - -# # FEATURE SELECTOR THAT REPORTS BYPRODUCTS OF SELECTION PROCESS - -# This a variation of `Selector` above that stores the names of rejected features in the -# output of `fit`, for inspection by an accessor function called `rejected`. - -struct FancySelector - names::Vector{Symbol} -end - -""" - FancySelector(; names=Symbol[]) - -Instantiate a feature selector that exposes the names of rejected features. - -```julia -learner = FancySelector(names=[:x, :w]) -X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w]) -model = fit(learner) # no data arguments! 
-transform(model, X) # mutates `model` -@assert rejected(model) == [:y, :z] -``` - -""" -FancySelector(; names=Symbol[]) = FancySelector(names) - -mutable struct FancySelectorFitted - learner::FancySelector - rejected::Vector{Symbol} - FancySelectorFitted(learner) = new(learner) -end -LearnAPI.learner(model::FancySelectorFitted) = model.learner -rejected(model::FancySelectorFitted) = model.rejected - -# Here we are wrapping `learner` with a place-holder for the `rejected` feature names. -LearnAPI.fit(learner::FancySelector; verbosity=1) = FancySelectorFitted(learner) - -# output the filtered table and add `rejected` field to model (mutatated!) -function LearnAPI.transform(model::FancySelectorFitted, X) - table = Tables.columntable(X) - names = Tables.columnnames(table) - keep = LearnAPI.learner(model).names - filtered_names = filter(in(keep), names) - model.rejected = setdiff(names, filtered_names) - filtered_columns = (Tables.getcolumn(table, name) for name in filtered_names) - filtered_table = NamedTuple{filtered_names}((filtered_columns...,)) - return Tables.materializer(X)(filtered_table) -end - -# fit and transform in one step: -function LearnAPI.transform(learner::FancySelector, X) - model = fit(learner) - transform(model, X) -end - -# note the necessity of overloading `is_static` (`fit` consumes no data): -@trait( - FancySelector, - constructor = FancySelector, - is_static = true, - tags = ("feature engineering",), - functions = ( - :(LearnAPI.fit), - :(LearnAPI.learner), - :(LearnAPI.strip), - :(LearnAPI.obs), - :(LearnAPI.transform), - :(MyPkg.rejected), # accessor function not owned by LearnAPI.jl, - ) -) - -@testset "test a variation that reports byproducts" begin - learner = FancySelector(names=[:x, :w]) - X = DataFrames.DataFrame(rand(3, 4), [:x, :y, :z, :w]) - model = fit(learner) # no data arguments! - @test !isdefined(model, :reject) - @test LearnAPI.learner(model) == learner - filtered = DataFrames.DataFrame(Tables.matrix(X)[:,[1,4]], [:x, :w]) - @test transform(model, X) == filtered - @test transform(learner, X) == filtered - @test rejected(model) == [:y, :z] -end - -true From cb8133d400a04c371a54edf37f442c7ce3f16109 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sat, 2 Nov 2024 15:49:27 +1300 Subject: [PATCH 163/187] fix links for patterns to point to LearnTestAPI.jl --- docs/src/patterns/classification.md | 5 +++-- docs/src/patterns/density_estimation.md | 4 ++-- docs/src/patterns/ensembling.md | 7 ++++--- docs/src/patterns/feature_engineering.md | 7 +++++-- docs/src/patterns/gradient_descent.md | 7 ++++--- docs/src/patterns/incremental_algorithms.md | 4 ++-- docs/src/patterns/iterative_algorithms.md | 6 +++--- docs/src/patterns/meta_algorithms.md | 4 ++-- docs/src/patterns/regression.md | 8 +++++--- docs/src/patterns/static_algorithms.md | 8 +++++--- 10 files changed, 35 insertions(+), 25 deletions(-) diff --git a/docs/src/patterns/classification.md b/docs/src/patterns/classification.md index 2913cea5..47e04043 100644 --- a/docs/src/patterns/classification.md +++ b/docs/src/patterns/classification.md @@ -1,5 +1,6 @@ # Classification -See these examples from tests: +See these examples from the JuliaTestAPI.jl test suite: -- [perceptron classifier](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/gradient_descent.jl) +- [perceptron + classifier](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/gradient_descent.jl) diff --git a/docs/src/patterns/density_estimation.md b/docs/src/patterns/density_estimation.md index e9ca083b..9fc0144a 100644 --- a/docs/src/patterns/density_estimation.md +++ b/docs/src/patterns/density_estimation.md @@ -1,5 +1,5 @@ # Density Estimation -See these examples from tests: +See these examples from the JuliaTestAPI.jl test suite: -- [normal distribution estimator](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/incremental_algorithms.jl) +- [normal distribution estimator](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/incremental_algorithms.jl) diff --git a/docs/src/patterns/ensembling.md b/docs/src/patterns/ensembling.md index a93ae305..ea5faf88 100644 --- a/docs/src/patterns/ensembling.md +++ b/docs/src/patterns/ensembling.md @@ -1,5 +1,6 @@ # Ensembling -See [this -example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/ensembling.jl) -from tests. +See these examples from the JuliaTestAPI.jl test suite: + +- [bagged ensembling of a regression model](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/ensembling.jl) + diff --git a/docs/src/patterns/feature_engineering.md b/docs/src/patterns/feature_engineering.md index 850dc0e3..6e3c656c 100644 --- a/docs/src/patterns/feature_engineering.md +++ b/docs/src/patterns/feature_engineering.md @@ -1,4 +1,7 @@ # Feature Engineering -For a simple feature selection algorithm (no "learning) see -[these examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/static_algorithms.jl) from tests. +See these examples from the JuliaTestAI.jl test suite: + +- [feature + selectors](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/static_algorithms.jl) + from tests. diff --git a/docs/src/patterns/gradient_descent.md b/docs/src/patterns/gradient_descent.md index 7fd4a11c..c898b38e 100644 --- a/docs/src/patterns/gradient_descent.md +++ b/docs/src/patterns/gradient_descent.md @@ -1,5 +1,6 @@ # Gradient Descent -See [this -example](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/gradient_descent.jl) -from tests. 
+See these examples from the JuliaTestAI.jl test suite:
+
+- [perceptron
+classifier](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/gradient_descent.jl)
diff --git a/docs/src/patterns/incremental_algorithms.md b/docs/src/patterns/incremental_algorithms.md
index 89ad8643..d2855a55 100644
--- a/docs/src/patterns/incremental_algorithms.md
+++ b/docs/src/patterns/incremental_algorithms.md
@@ -1,5 +1,5 @@
 # Incremental Algorithms

-See these examples from tests:
+See these examples from the JuliaTestAPI.jl test suite:

-- [normal distribution estimator](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/incremental_algorithms.jl)
+- [normal distribution estimator](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/incremental_algorithms.jl)
diff --git a/docs/src/patterns/iterative_algorithms.md b/docs/src/patterns/iterative_algorithms.md
index 1cf4ab23..b12c6142 100644
--- a/docs/src/patterns/iterative_algorithms.md
+++ b/docs/src/patterns/iterative_algorithms.md
@@ -1,7 +1,7 @@
 # Iterative Algorithms

-See these examples from tests:
+See these examples from the JuliaTestAPI.jl test suite:

-- [bagged ensembling](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/ensembling.jl)
+- [bagged ensembling](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/ensembling.jl)

-- [perceptron classifier](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/gradient_descent.jl)
+- [perceptron classifier](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/gradient_descent.jl)
diff --git a/docs/src/patterns/meta_algorithms.md b/docs/src/patterns/meta_algorithms.md
index 17ccad8f..6a9e7300 100644
--- a/docs/src/patterns/meta_algorithms.md
+++ b/docs/src/patterns/meta_algorithms.md
@@ -1,7 +1,7 @@
 # Meta-algorithms

-Many meta-algorithms are can be implemented as wrappers. An example is [this bagged
+Many meta-algorithms can be implemented as wrappers. An example is [this bagged
 ensemble
-algorithm](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/ensembling.jl)
+algorithm](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/ensembling.jl)
 from tests.
diff --git a/docs/src/patterns/regression.md b/docs/src/patterns/regression.md
index 7cf3b6d0..1a50aecf 100644
--- a/docs/src/patterns/regression.md
+++ b/docs/src/patterns/regression.md
@@ -1,5 +1,7 @@
 # Regression

-See [these
-examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/regression.jl)
-from tests.
+See these examples from the JuliaTestAI.jl test suite:
+
+- [ridge
+  regression](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/regression.jl)
+
diff --git a/docs/src/patterns/static_algorithms.md b/docs/src/patterns/static_algorithms.md
index 21a517dc..4724006f 100644
--- a/docs/src/patterns/static_algorithms.md
+++ b/docs/src/patterns/static_algorithms.md
@@ -1,7 +1,9 @@
 # Static Algorithms

-See [these
-examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/static_algorithms.jl)
-from tests.
+See these examples from the JuliaTestAPI.jl test suite:
+
+- [feature
+  selection](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/static_algorithms.jl)
+

From dbb5ba6adb1f8335ef369d029b970f8bc1556ce5 Mon Sep 17 00:00:00 2001
From: "Anthony D.
Blaom" Date: Sat, 2 Nov 2024 15:53:17 +1300 Subject: [PATCH 164/187] add default_verbosity --- src/verbosity.jl | 25 +++++++++++++++++++++++++ test/verbosity.jl | 7 +++++++ 2 files changed, 32 insertions(+) create mode 100644 src/verbosity.jl create mode 100644 test/verbosity.jl diff --git a/src/verbosity.jl b/src/verbosity.jl new file mode 100644 index 00000000..f403ad3c --- /dev/null +++ b/src/verbosity.jl @@ -0,0 +1,25 @@ +const DEFAULT_VERBOSITY = Ref(1) + +""" + LearnAPI.default_verbosity() + LearnAPI.default_verbosity(level::Int) + +Respectively return and set the default verbosity level for LearnAPI.jl, applying, in +particular, to [`fit`](@ref), [`update`](@ref), [`update_observations`](@ref), and +[`update_features`](@ref). The effect in a top-level call is generally: + + + +| `level` | behaviour | +|:--------|:----------------------------------| +| 1 | informational | +| 0 | warnings only | +| -1 | silent | + + +Methods consuming `verbosity` generally call other verbosity-supporting methods +at one level lower, so increasing `verbosity` beyond `1` may be useful. + +""" +default_verbosity() = DEFAULT_VERBOSITY[] +default_verbosity(level) = (DEFAULT_VERBOSITY[] = level) diff --git a/test/verbosity.jl b/test/verbosity.jl new file mode 100644 index 00000000..72ce29c8 --- /dev/null +++ b/test/verbosity.jl @@ -0,0 +1,7 @@ +using Test + +@test LearnAPI.default_verbosity() ==1 +LearnAPI.default_verbosity(42) +@test LearnAPI.default_verbosity() == 42 + +true From b945e73f449c8e5c07753a8177b2d4c66f43894f Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 16:08:07 +1300 Subject: [PATCH 165/187] fix anatomy of an interface re verbosity --- docs/src/anatomy_of_an_implementation.md | 4 +-- docs/src/fit_update.md | 23 +++++++++-------- src/LearnAPI.jl | 3 ++- src/fit_update.jl | 33 ++++++++++++++++-------- src/verbosity.jl | 7 ++--- test/runtests.jl | 1 + 6 files changed, 43 insertions(+), 28 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 754bf94d..ee097e4f 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -106,7 +106,7 @@ Note that we also include `learner` in the struct, for it must be possible to re The core implementation of `fit` looks like this: ```@example anatomy -function LearnAPI.fit(learner::Ridge, data; verbosity=1) +function LearnAPI.fit(learner::Ridge, data; verbosity=LearnAPI.default_verbosity()) X, y = data @@ -397,7 +397,7 @@ methods - one to handle "regular" input, and one to handle the pre-processed dat (observations) which appears first below: ```@example anatomy2 -function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) +function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=LearnAPI.default_verbosity()) lambda = learner.lambda diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index 2e2c0858..08f5067a 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -3,8 +3,8 @@ ### Training ```julia -fit(learner, data; verbosity=1) -> model -fit(learner; verbosity=1) -> static_model +fit(learner, data; verbosity=LearnAPI.default_verbosity()) -> model +fit(learner; verbosity=LearnAPI.default_verbosity()) -> static_model ``` A "static" algorithm is one that does not generalize to new observations (e.g., some @@ -15,8 +15,8 @@ clustering algorithms); there is no training data and the algorithm is executed ### Updating ``` -update(model, data; verbosity=1, param1=new_value1, 
param2=new_value2, ...) -> updated_model -update_observations(model, new_data; verbosity=1, param1=new_value1, ...) -> updated_model +update(model, data; verbosity=..., param1=new_value1, param2=new_value2, ...) -> updated_model +update_observations(model, new_data; verbosity=..., param1=new_value1, ...) -> updated_model update_features(model, new_data; verbosity=1, param1=new_value1, ...) -> updated_model ``` @@ -101,18 +101,18 @@ See also [Density Estimation](@ref). Exactly one of the following must be implemented: -| method | fallback | -|:--------------------------------------------|:---------| -| [`fit`](@ref)`(learner, data; verbosity=1)` | none | -| [`fit`](@ref)`(learner; verbosity=1)` | none | +| method | fallback | +|:-----------------------------------------------------------------------|:---------| +| [`fit`](@ref)`(learner, data; verbosity=LearnAPI.default_verbosity())` | none | +| [`fit`](@ref)`(learner; verbosity=LearnAPI.default_verbosity())` | none | ### Updating | method | fallback | compulsory? | |:-------------------------------------------------------------------------------------|:---------|-------------| -| [`update`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no | -| [`update_observations`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no | -| [`update_features`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no | +| [`update`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)` | none | no | +| [`update_observations`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)` | none | no | +| [`update_features`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)` | none | no | There are some contracts governing the behaviour of the update methods, as they relate to a previous `fit` call. Consult the document strings for details. @@ -124,4 +124,5 @@ fit update update_observations update_features +LearnAPI.default_verbosity ``` diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index c1564e06..5d31ce93 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -1,7 +1,8 @@ module LearnAPI -include("tools.jl") include("types.jl") +include("verbosity.jl") +include("tools.jl") include("predict_transform.jl") include("fit_update.jl") include("target_weights_features.jl") diff --git a/src/fit_update.jl b/src/fit_update.jl index 39a662a9..4d9c9e2e 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -1,8 +1,8 @@ # # FIT """ - fit(learner, data; verbosity=1) - fit(learner; verbosity=1) + fit(learner, data; verbosity=LearnAPI.default_verbosity()) + fit(learner; verbosity=LearnAPI.default_verbosity()) Execute the machine learning or statistical algorithm with configuration `learner` using the provided training `data`, returning an object, `model`, on which other methods, such @@ -17,26 +17,27 @@ model = fit(learner, (X, y)) ŷ = predict(model, Xnew) ``` -The signature `fit(learner; verbosity=1)` (no `data`) is provided by learners that do not -generalize to new observations (called *static algorithms*). In that case, +The signature `fit(learner; verbosity=...)` (no `data`) is provided by learners that do +not generalize to new observations (called *static algorithms*). In that case, `transform(model, data)` or `predict(model, ..., data)` carries out the actual algorithm execution, writing any byproducts of that operation to the mutable object `model` returned by `fit`. Use `verbosity=0` for warnings only, and `-1` for silent training. 
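
A hedged sketch of how this interacts with the global default introduced earlier in
this patch series (`learner` and `data` are placeholders):

```julia
LearnAPI.default_verbosity(0)  # make subsequent `fit` calls warnings-only by default
model = fit(learner, data)     # equivalent to `fit(learner, data; verbosity=0)`
LearnAPI.default_verbosity(1)  # restore informational logging
```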
-See also [`predict`](@ref), [`transform`](@ref), [`inverse_transform`](@ref), -[`LearnAPI.functions`](@ref), [`obs`](@ref). +See also [`LearnAPI.default_verbosity`](@ref), [`predict`](@ref), [`transform`](@ref), +[`inverse_transform`](@ref), [`LearnAPI.functions`](@ref), [`obs`](@ref). # Extended help # New implementations Implementation of exactly one of the signatures is compulsory. If `fit(learner; -verbosity=1)` is implemented, then the trait [`LearnAPI.is_static`](@ref) must be +verbosity=...)` is implemented, then the trait [`LearnAPI.is_static`](@ref) must be overloaded to return `true`. -The signature must include `verbosity`. +The signature must include `verbosity` with [`LearnAPI.default_verbosity()`](@ref) as +default. If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentation, then [`LearnAPI.target(data)`](@ref) must be overloaded to return it. If [`predict`](@ref) or @@ -59,7 +60,7 @@ function fit end # # UPDATE AND COUSINS """ - update(model, data; verbosity=1, hyperparam_replacements...) + update(model, data; verbosity=LearnAPI.default_verbosity(), hyperparam_replacements...) Return an updated version of the `model` object returned by a previous [`fit`](@ref) or `update` call, but with the specified hyperparameter replacements, in the form `p1=value1, @@ -98,7 +99,12 @@ See also [`LearnAPI.clone`](@ref) function update end """ - update_observations(model, new_data; verbosity=1, parameter_replacements...) + update_observations( + model, + new_data; + parameter_replacements..., + verbosity=LearnAPI.default_verbosity(), + ) Return an updated version of the `model` object returned by a previous [`fit`](@ref) or `update` call given the new observations present in `new_data`. One may additionally @@ -134,7 +140,12 @@ See also [`LearnAPI.clone`](@ref). function update_observations end """ - update_features(model, new_data; verbosity=1, parameter_replacements...) + update_features( + model, + new_data; + parameter_replacements..., + verbosity=LearnAPI.default_verbosity(), + ) Return an updated version of the `model` object returned by a previous [`fit`](@ref) or `update` call given the new features encapsulated in `new_data`. One may additionally diff --git a/src/verbosity.jl b/src/verbosity.jl index f403ad3c..4f77e659 100644 --- a/src/verbosity.jl +++ b/src/verbosity.jl @@ -4,9 +4,10 @@ const DEFAULT_VERBOSITY = Ref(1) LearnAPI.default_verbosity() LearnAPI.default_verbosity(level::Int) -Respectively return and set the default verbosity level for LearnAPI.jl, applying, in -particular, to [`fit`](@ref), [`update`](@ref), [`update_observations`](@ref), and -[`update_features`](@ref). The effect in a top-level call is generally: +Respectively return, or set, the default `verbosity` level for LearnAPI.jl methods that +support it, which includes [`fit`](@ref), [`update`](@ref), +[`update_observations`](@ref), and [`update_features`](@ref). The effect in a top-level +call is generally: diff --git a/test/runtests.jl b/test/runtests.jl index 8a255c83..056fa491 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -2,6 +2,7 @@ using Test test_files = [ "tools.jl", + "verbosity.jl", "traits.jl", "clone.jl", "predict_transform.jl", From f6d8358f7946499c9dd02c8b6b3e2d90ff52b586 Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sat, 2 Nov 2024 17:01:21 +1300 Subject: [PATCH 166/187] doc fix --- docs/src/fit_update.md | 2 +- docs/src/reference.md | 16 ++++++++-------- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index 08f5067a..64f68d73 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -1,4 +1,4 @@ -# [`fit`, `update`, `update_observations`, and `update_features`](@id fit) +# [`fit`, `update`, `update_observations`, and `update_features`](@id fit_docs) ### Training diff --git a/docs/src/reference.md b/docs/src/reference.md index a387d115..53798954 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -150,16 +150,16 @@ minimal (but useless) implementation, see the implementation of `SmallLearner` ### List of methods -- [`fit`](@ref fit): for (i) training or updating learners that generalize to new data; or - (ii) wrapping `learner` in an object that is possibly mutated by `predict`/`transform`, - to record byproducts of those operations, in the special case of *non-generalizing* - learners (called here [static algorithms](@ref static_algorithms)) +- [`fit`](@ref fit_docs): for (i) training or updating learners that generalize to new + data; or (ii) wrapping `learner` in an object that is possibly mutated by + `predict`/`transform`, to record byproducts of those operations, in the special case of + *non-generalizing* learners (called here [static algorithms](@ref static_algorithms)) -- [`update`](@ref fit): for updating learning outcomes after hyperparameter changes, such - as increasing an iteration parameter. +- [`update`](@ref fit_docs): for updating learning outcomes after hyperparameter changes, + such as increasing an iteration parameter. -- [`update_observations`](@ref fit), [`update_features`](@ref fit): update learning - outcomes by presenting additional training data. +- [`update_observations`](@ref fit_docs), [`update_features`](@ref fit_docs): update + learning outcomes by presenting additional training data. - [`predict`](@ref operations): for outputting [targets](@ref proxy) or [target proxies](@ref proxy) (such as probability density functions) From d6cd610d2d751d5067d63811170729d71db1ee48 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 17:22:03 +1300 Subject: [PATCH 167/187] tweak docs --- docs/src/obs.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/docs/src/obs.md b/docs/src/obs.md index 5d81012e..da6b2ccc 100644 --- a/docs/src/obs.md +++ b/docs/src/obs.md @@ -2,13 +2,14 @@ The `obs` method takes data intended as input to `fit`, `predict` or `transform`, and transforms it to a learner-specific form guaranteed to implement a form of observation -access designated by the learner. The transformed data can then be resampled and passed -on to the relevant method in place of the original input. Using `obs` may provide -performance advantages over naive workflows in some cases (e.g., cross-validation). +access designated by the learner. The transformed data can then passed on to the relevant +method in place of the original input (after first resampling it, if the learner supports +this). Using `obs` may provide performance advantages over naive workflows in some cases +(e.g., cross-validation). 
```julia obs(learner, data) # can be passed to `fit` instead of `data` -obs(model, data) # can be passed to `predict` or `transform` instead of `data` +obs(model, data) # can be passed to `predict` or `transform` instead of `data` ``` ## [Typical workflows](@id obs_workflows) From 14c6a1638803883936366a9cc1d0a2f07b2b2a75 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 17:24:41 +1300 Subject: [PATCH 168/187] tweak a docstring --- src/obs.jl | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/src/obs.jl b/src/obs.jl index 6ea5544e..63a08a76 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -12,9 +12,7 @@ The returned object is guaranteed to implement observation access as indicated b [`LearnAPI.RandomAccess()`](@ref). Calling `fit`/`predict`/`transform` on the returned objects may have performance -advantages over calling directly on `data` in some contexts. And resampling the returned -object using `MLUtils.getobs` may be cheaper than directly resampling the components of -`data`. +advantages over calling directly on `data` in some contexts. # Example From 49ce36b9f01513302e9122b89eac7c7b7e8ea21c Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 17:25:31 +1300 Subject: [PATCH 169/187] typo --- docs/src/patterns/regression.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/patterns/regression.md b/docs/src/patterns/regression.md index 1a50aecf..3a47db65 100644 --- a/docs/src/patterns/regression.md +++ b/docs/src/patterns/regression.md @@ -1,6 +1,6 @@ # Regression -See these examples from the JuliaTestAI.jl test suite: +See these examples from the JuliaTestAPI.jl test suite: - [ridge regression](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/regression.jl) From 33f228ec4e9f437fafd95e58865273142db2ddbf Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sat, 2 Nov 2024 17:28:53 +1300 Subject: [PATCH 170/187] fix some links --- docs/src/patterns/classification.md | 3 +-- docs/src/patterns/dimension_reduction.md | 7 +++---- docs/src/patterns/gradient_descent.md | 3 +-- docs/src/patterns/regression.md | 3 +-- 4 files changed, 6 insertions(+), 10 deletions(-) diff --git a/docs/src/patterns/classification.md b/docs/src/patterns/classification.md index 47e04043..fd278478 100644 --- a/docs/src/patterns/classification.md +++ b/docs/src/patterns/classification.md @@ -2,5 +2,4 @@ See these examples from the JuliaTestAPI.jl test suite: -- [perceptron - classifier](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/gradient_descent.jl) +- [perceptron classifier](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/gradient_descent.jl) diff --git a/docs/src/patterns/dimension_reduction.md b/docs/src/patterns/dimension_reduction.md index 63877c8c..e886dd15 100644 --- a/docs/src/patterns/dimension_reduction.md +++ b/docs/src/patterns/dimension_reduction.md @@ -1,7 +1,6 @@ # Dimension Reduction -Check out the following examples: +See these examples from the JuliaTestAPI.jl test suite: + +- [Truncated SVD](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl) -- [Truncated - SVD]((https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl - (from the TestLearnAPI.jl test suite) diff --git a/docs/src/patterns/gradient_descent.md b/docs/src/patterns/gradient_descent.md index c898b38e..9dc5401a 100644 --- a/docs/src/patterns/gradient_descent.md +++ b/docs/src/patterns/gradient_descent.md @@ -2,5 +2,4 @@ See these examples from the JuliaTestAI.jl test suite: -- [perceptron -classifier](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/gradient_descent.jl) +- [perceptron classifier](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/gradient_descent.jl) diff --git a/docs/src/patterns/regression.md b/docs/src/patterns/regression.md index 3a47db65..ca68b308 100644 --- a/docs/src/patterns/regression.md +++ b/docs/src/patterns/regression.md @@ -2,6 +2,5 @@ See these examples from the JuliaTestAPI.jl test suite: -- [ridge - regression](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/regression.jl) +- [ridge regression](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/regression.jl) From f30880d56df44be039b626a6b6ebf9edca0aac01 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 2 Nov 2024 17:32:42 +1300 Subject: [PATCH 171/187] typo --- docs/src/anatomy_of_an_implementation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index ee097e4f..3b7ae4b8 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -242,7 +242,7 @@ nothing # hide ``` The last trait, `functions`, returns a list of all LearnAPI.jl methods that can be -meaninfully applied to the learner or associated model. See [`LearnAPI.functions`](@ref) +meaningfully applied to the learner or associated model. See [`LearnAPI.functions`](@ref) for a checklist. [`LearnAPI.functions`](@ref) and [`LearnAPI.constructor`](@ref), are the only universally compulsory traits. 
However, it is worthwhile studying the [list of all traits](@ref traits_list) to see which might apply to a new implementation, to enable From 41ecbf3137c8dac595a52c1e8d3a7984d3c93eac Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 3 Nov 2024 14:52:26 +1300 Subject: [PATCH 172/187] typo --- docs/src/reference.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/reference.md b/docs/src/reference.md index 53798954..30bcc743 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -77,7 +77,7 @@ object's *properties* (which conceivably differ from its fields). It does not st learned parameters. Informally, we will sometimes use the word "model" to refer to the output of -`fit(learner, ...)` (see below), something which typically does *store* learned +`fit(learner, ...)` (see below), something which typically *does* store learned parameters. For `learner` to be a valid LearnAPI.jl learner, From 242cb34e9fa920f1d1b3dcdf99460bb0eb8dffd2 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 3 Nov 2024 18:04:09 +1300 Subject: [PATCH 173/187] doc tweak --- docs/src/reference.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/src/reference.md b/docs/src/reference.md index 30bcc743..e288439d 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -60,8 +60,8 @@ More generally, whenever we have a variable (e.g., a class label) that can, at l principle, be paired with a predicted value, or some predicted "proxy" for that variable (such as a class probability), then we call the variable a *target* variable, and the predicted output a *target proxy*. In this definition, it is immaterial whether or not the -target appears in training (is supervised) or whether or not the model generalizes to new -observations ("learns"). +target appears in training (the algorithm is supervised) or whether or not predictions +generalize to new input observations (the algorithm "learns"). LearnAPI.jl provides singleton [target proxy types](@ref proxy_types) for prediction dispatch. These are also used to distinguish performance metrics provided by the package From 5e08aabf4309471b658f307919f2d3d548c402ac Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sun, 3 Nov 2024 18:26:29 +1300 Subject: [PATCH 174/187] doc tweak --- docs/src/predict_transform.md | 23 ++++++++++++++--------- docs/src/reference.md | 4 ++-- 2 files changed, 16 insertions(+), 11 deletions(-) diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md index a6a00047..b15b9055 100644 --- a/docs/src/predict_transform.md +++ b/docs/src/predict_transform.md @@ -84,24 +84,29 @@ dimension using distances from the cluster centres. ### [One-liners combining fit and transform/predict](@id one_liners) -Learners may optionally overload `transform` to apply `fit` first, using the supplied -data if required, and then immediately `transform` the same data. The same applies to -`predict`. In that case the first argument of `transform`/`predict` is an *learner* -instead of the output of `fit`: +Learners may additionally overload `transform` to apply `fit` first, using the supplied +data if required, and then immediately `transform` the same data. 
In that case the first
+argument of `transform` is an *learner* instead of the output of `fit`:

```julia
-predict(learner, kind_of_proxy, data) # `fit` implied
transform(learner, data) # `fit` implied
```

-For example, if `fit(learner, X)` is defined, then `predict(learner, X)` will be
-shorthand for
+This will be shorthand for

```julia
-model = fit(learner, X)
-predict(model, X)
+model = fit(learner, X) # or `fit(learner)` in the static case
+transform(model, X)
+```
+
+The same remarks apply to `predict`, as in
+
+```julia
+predict(learner, kind_of_proxy, data) # `fit` implied
```

+LearnAPI.jl does not, however, guarantee the provision of these one-liners.
+
## [Reference](@id predict_ref)

```@docs
diff --git a/docs/src/reference.md b/docs/src/reference.md
index e288439d..866e9d3b 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -108,7 +108,7 @@ values; for such learners [`LearnAPI.is_composite`](@ref)`(learner)` must be `tr
(fallback is `false`). Generally, the keyword constructor provided by
[`LearnAPI.constructor`](@ref) must provide default values for all properties that are
not learner-valued. Instead, these learner-valued properties can have a `nothing` default,
-with the constructor throwing an error if the the constructor call does not explicitly
+with the constructor throwing an error if the constructor call does not explicitly
specify a new value.

Any object `learner` for which [`LearnAPI.functions(learner)`](@ref) is non-empty is

From 962ec13284cf71054be57707787b7fb5006602d8 Mon Sep 17 00:00:00 2001
From: "Anthony D. Blaom"
Date: Mon, 4 Nov 2024 11:08:27 +1300
Subject: [PATCH 175/187] add feature_names to accessor functions

---
 src/accessor_functions.jl | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl
index c284b015..c1b4447f 100644
--- a/src/accessor_functions.jl
+++ b/src/accessor_functions.jl
@@ -93,6 +93,25 @@ LearnAPI.strip(LearnAPI.strip(model)) == LearnAPI.strip(model)
"""
LearnAPI.strip(model) = model

+"""
+ LearnAPI.feature_names(model)
+
+Return the names of features encountered when fitting or updating some `learner` to obtain
+`model`.
+
+The returned value is a vector of symbols.
+
+This method is implemented if `:(LearnAPI.feature_names) in LearnAPI.functions(learner)`.
+
+See also [`fit`](@ref).
+
+# New implementations
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.feature_names)")).
+
+"""
+function feature_names end
+
"""
 LearnAPI.feature_importances(model)

@@ -291,7 +310,6 @@ $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_labels)")).
"""
function training_labels end
-
# :extras intentionally excluded:
const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS = (
 learner,
@@ -299,6 +317,7 @@ const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS = (
 intercept,
 tree,
 trees,
+ feature_names,
 feature_importances,
 training_labels,
 training_losses,

From 14498147e6e01eee80415c2b02117811404c3d77 Mon Sep 17 00:00:00 2001
From: "Anthony D.
Blaom" Date: Mon, 4 Nov 2024 11:10:34 +1300 Subject: [PATCH 176/187] update docs feature_names --- docs/src/accessor_functions.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md index cba1a91c..93b377a7 100644 --- a/docs/src/accessor_functions.md +++ b/docs/src/accessor_functions.md @@ -11,6 +11,7 @@ The sole argument of an accessor function is the output, `model`, of - [`LearnAPI.intercept(model)`](@ref) - [`LearnAPI.tree(model)`](@ref) - [`LearnAPI.trees(model)`](@ref) +- [`LearnAPI.feature_names(model)`](@ref) - [`LearnAPI.feature_importances(model)`](@ref) - [`LearnAPI.training_labels(model)`](@ref) - [`LearnAPI.training_losses(model)`](@ref) @@ -38,6 +39,7 @@ LearnAPI.coefficients LearnAPI.intercept LearnAPI.tree LearnAPI.trees +LearnAPI.feature_names LearnAPI.feature_importances LearnAPI.training_losses LearnAPI.training_predictions From 105d7ffee4543f74dfba508f64469fd9f48a8d09 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 23 Nov 2024 08:16:55 +1300 Subject: [PATCH 177/187] more doc tweaks --- docs/make.jl | 2 +- docs/src/anatomy_of_an_implementation.md | 16 ++-- docs/src/common_implementation_patterns.md | 2 +- docs/src/index.md | 16 +++- docs/src/reference.md | 8 +- docs/src/target_weights_features.md | 13 ++-- src/obs.jl | 12 +-- src/target_weights_features.jl | 88 ++++++++++++---------- src/traits.jl | 4 +- 9 files changed, 83 insertions(+), 78 deletions(-) diff --git a/docs/make.jl b/docs/make.jl index 86525b98..9c73ec0a 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -18,8 +18,8 @@ makedocs( "fit/update" => "fit_update.md", "predict/transform" => "predict_transform.md", "Kinds of Target Proxy" => "kinds_of_target_proxy.md", - "target/weights/features" => "target_weights_features.md", "obs" => "obs.md", + "target/weights/features" => "target_weights_features.md", "Accessor Functions" => "accessor_functions.md", "Learner Traits" => "traits.md", ], diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 3b7ae4b8..8a65910c 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -462,21 +462,14 @@ LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = ### `target` and `features` methods -We provide an additional overloading of [`LearnAPI.target`](@ref) to handle the additional -supported data argument of `fit`: +In the general case, we only need to implement [`LearnAPI.target`](@ref) and +[`LearnAPI.features`](@ref) to handle all possible output of `obs(learner, data)`, and now +the fallback for `LearnAPI.features` mentioned before is inadequate. ```@example anatomy2 LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y -``` - -Similarly, we must overload [`LearnAPI.features`](@ref), which extracts features from -training data (objects that can be passed to `predict`) like this - -```@example anatomy2 LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A ``` -as the fallback mentioned above is no longer adequate. - ### Important notes: @@ -501,7 +494,8 @@ interfaces](@ref data_interfaces) for details. ### Addition of signatures for user convenience -As above, we add a signature which plays no role vis-à-vis LearnAPI.jl. +As above, we add a signature for convenience, which the LearnAPI.jl specification +neither requires nor forbids: ```@example anatomy2 LearnAPI.fit(learner::Ridge, X, y; kwargs...) = fit(learner, (X, y); kwargs...) 
diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md index b46ed768..85ebe507 100644 --- a/docs/src/common_implementation_patterns.md +++ b/docs/src/common_implementation_patterns.md @@ -3,7 +3,7 @@ !!! important This section is only an implementation guide. The definitive specification of the - Learn API is given in [Reference](@ref reference). + LearnAPI is given in [Reference](@ref reference). This guide is intended to be consulted after reading [Anatomy of an Implementation](@ref), which introduces the main interface objects and terminology. diff --git a/docs/src/index.md b/docs/src/index.md index 727199ff..87f1dd90 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -1,5 +1,15 @@ ```@raw html + +
+ Tutorial  |  + Reference  |  + Patterns +
+ LearnAPI.jl
@@ -86,11 +96,11 @@ opts out. Moreover, the `fit` and `predict` methods will also be able to consume alternative data representations, for performance benefits in some situations. The fallback data interface is the [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl) -`getobs/numobs` interface (here tagged as [`LearnAPI.RandomAccess()`](@ref)) and if the +`getobs/numobs` interface, here tagged as [`LearnAPI.RandomAccess()`](@ref), and if the input consumed by the algorithm already implements that interface (tables, arrays, etc.) then overloading `obs` is completely optional. Plain iteration interfaces, with or without -knowledge of the number of observations, can also be specified (to support, e.g., data -loaders reading images from disk). +knowledge of the number of observations, can also be specified, to support, e.g., data +loaders reading images from disk. ## Learning more diff --git a/docs/src/reference.md b/docs/src/reference.md index 866e9d3b..233ea162 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -170,15 +170,15 @@ minimal (but useless) implementation, see the implementation of `SmallLearner` - [`inverse_transform`](@ref operations): for inverting the output of `transform` ("inverting" broadly understood) -- [`LearnAPI.target`](@ref input), [`LearnAPI.weights`](@ref input), - [`LearnAPI.features`](@ref): for extracting relevant parts of training data, where - defined. - - [`obs`](@ref data_interface): method for exposing to the user learner-specific representations of data, which are additionally guaranteed to implement the observation access API specified by [`LearnAPI.data_interface(learner)`](@ref). +- [`LearnAPI.target`](@ref input), [`LearnAPI.weights`](@ref input), + [`LearnAPI.features`](@ref): for extracting relevant parts of training data, where + defined. + - [Accessor functions](@ref accessor_functions): these include functions like `LearnAPI.feature_importances` and `LearnAPI.training_losses`, for extracting, from training outcomes, information common to many learners. This includes diff --git a/docs/src/target_weights_features.md b/docs/src/target_weights_features.md index c54639a6..925bae67 100644 --- a/docs/src/target_weights_features.md +++ b/docs/src/target_weights_features.md @@ -1,11 +1,13 @@ # [`target`, `weights`, and `features`](@id input) -Methods for extracting parts of training data: +Methods for extracting parts of training observations. Here "observations" means the +output of [`obs(learner, data)`](@ref); if `obs` is not overloaded for `learner`, then +"observations" is any `data` supported in calls of the form [`fit(learner, data)`](@ref) ```julia -LearnAPI.target(learner, data) -> -LearnAPI.weights(learner, data) -> -LearnAPI.features(learner, data) -> +LearnAPI.target(learner, observations) -> +LearnAPI.weights(learner, observations) -> +LearnAPI.features(learner, observations) -> ``` Here `data` is something supported in a call of the form `fit(learner, data)`. 
@@ -19,7 +21,8 @@ Supposing `learner` is a supervised classifier predicting a one-dimensional vect target: ```julia -model = fit(learner, data) +observations = obs(learner, data) +model = fit(learner, observations) X = LearnAPI.features(learner, data) y = LearnAPI.target(learner, data) ŷ = predict(model, Point(), X) diff --git a/src/obs.jl b/src/obs.jl index 63a08a76..7485cf4d 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -61,8 +61,8 @@ using `MLUtils.getobs`, with the obvious interpretation applying to the outcomes calls (e.g., if *all* observations are subsampled, then outcomes should be the same as if using the original data). -Implicit in preceding requirements is that `obs(learner, _)` and `obs(model, _)` are -involutive, meaning both the following hold: +It is required that `obs(learner, _)` and `obs(model, _)` are involutive, meaning both the +following hold: ```julia obs(learner, obs(learner, data)) == obs(learner, data) @@ -81,14 +81,6 @@ only of suitable tables and arrays, then `obs` and `LearnAPI.data_interface` do to be overloaded. However, the user will get no performance benefits by using `obs` in that case. -If overloading `obs(learner, data)` to output new model-specific representations of -data, it may be necessary to also overload [`LearnAPI.features(learner, -observations)`](@ref), [`LearnAPI.target(learner, observations)`](@ref) (supervised -learners), and/or [`LearnAPI.weights(learner, observations)`](@ref) (if weights are -supported), for each kind output `observations` of `obs(learner, data)`. Moreover, the -outputs of these methods, applied to `observations`, must also implement the interface -specified by [`LearnAPI.data_interface(learner)`](@ref). - ## Sample implementation Refer to the ["Anatomy of an diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl index c7831072..611da656 100644 --- a/src/target_weights_features.jl +++ b/src/target_weights_features.jl @@ -1,13 +1,13 @@ """ - LearnAPI.target(learner, data) -> target + LearnAPI.target(learner, observations) -> target -Return, for each form of `data` supported in a call of the form [`fit(learner, -data)`](@ref), the target variable part of `data`. If `nothing` is returned, the +Return, for every conceivable `observations` returned by a call of the form [`obs(learner, +data)`](@ref), the target variable part of `observations`. If `nothing` is returned, the `learner` does not see a target variable in training (is unsupervised). -The returned object `y` has the same number of observations as `data`. If `data` is the -output of an [`obs`](@ref) call, then `y` is additionally guaranteed to implement the -data interface specified by [`LearnAPI.data_interface(learner)`](@ref). +The returned object `y` has the same number of observations as `observations` does and is +guaranteed to implement the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). # Extended help @@ -21,57 +21,61 @@ the LearnAPI.jl documentation. ## New implementations -A fallback returns `nothing`. The method must be overloaded if `fit` consumes data -including a target variable. +A fallback returns `nothing`. The method must be overloaded if [`fit`](@ref) consumes data +that includes a target variable. If `obs` is not being overloaded, then `observations` +above is any `data` supported in calls of the form [`fit(learner, data)`](@ref). The form +of the output `y` should be suitable for pairing with the output of [`predict`](@ref), in +the evaluation of a loss function, for example. 
-If overloading [`obs`](@ref), ensure that the return value, unless `nothing`, implements -the data interface specified by [`LearnAPI.data_interface(learner)`](@ref), in the special -case that `data` is the output of an `obs` call. +Ensure the object `y` returned by `LearnAPI.target`, unless `nothing`, implements the data +interface specified by [`LearnAPI.data_interface(learner)`](@ref). $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.target)"; overloaded=true)) """ -target(::Any, data) = nothing +target(::Any, observations) = nothing """ - LearnAPI.weights(learner, data) -> weights + LearnAPI.weights(learner, observations) -> weights -Return, for each form of `data` supported in a call of the form [`fit(learner, -data)`](@ref), the per-observation weights part of `data`. Where `nothing` is returned, no -weights are part of `data`, which is to be interpreted as uniform weighting. +Return, for every conceivable `observations` returned by a call of the form [`obs(learner, +data)`](@ref), the weights part of `observations`. Where `nothing` is returned, no weights +are part of `data`, which is to be interpreted as uniform weighting. -The returned object `w` has the same number of observations as `data`. If `data` is the -output of an [`obs`](@ref) call, then `w` is additionally guaranteed to implement the -data interface specified by [`LearnAPI.data_interface(learner)`](@ref). +The returned object `w` has the same number of observations as `observations` does and is +guaranteed to implement the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). # Extended help # New implementations -Overloading is optional. A fallback returns `nothing`. +Overloading is optional. A fallback returns `nothing`. If `obs` is not being overloaded, +then `observations` above is any `data` supported in calls of the form [`fit(learner, +data)`](@ref). -If overloading [`obs`](@ref), ensure that the return value, unless `nothing`, implements -the data interface specified by [`LearnAPI.data_interface(learner)`](@ref), in the special -case that `data` is the output of an `obs` call. +Ensure the returned object, unless `nothing`, implements the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.weights)"; overloaded=true)) """ -weights(::Any, data) = nothing +weights(::Any, observations) = nothing """ - LearnAPI.features(learner, data) + LearnAPI.features(learner, observations) -Return, for each form of `data` supported in a call of the form [`fit(learner, -data)`](@ref), the "features" part of `data` (as opposed to the target -variable, for example). +Return, for every conceivable `observations` returned by a call of the form [`obs(learner, +data)`](@ref), the "features" part of `data` (as opposed to the target variable, for +example). The returned object `X` may always be passed to `predict` or `transform`, where implemented, as in the following sample workflow: ```julia -model = fit(learner, data) -X = LearnAPI.features(learner, data) +observations = obs(learner, data) +model = fit(learner, observations) +X = LearnAPI.features(learner, observations) ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` ``` @@ -80,28 +84,30 @@ For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(lea data)`, the training target. The object `X` returned by `LearnAPI.target` has the same number of observations as -`data`. 
If `data` is the output of an [`obs`](@ref) call, then `X` is additionally -guaranteed to implement the data interface specified by +`observations` does and is guaranteed to implement the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). # Extended help # New implementations +A fallback returns `first(observations)` if `observations` is a tuple, and otherwise +returns `observations`. New implementations may need to overload this method if this +fallback is inadequate. + For density estimators, whose `fit` typically consumes *only* a target variable, you -should overload this method to return `nothing`. +should overload this method to return `nothing`. If `obs` is not being overloaded, then +`observations` above is any `data` supported in calls of the form [`fit(learner, +data)`](@ref). It must otherwise be possible to pass the return value `X` to `predict` and/or -`transform`, and `X` must have same number of observations as `data`. A fallback returns -`first(data)` if `data` is a tuple, and otherwise returns `data`. +`transform`, and `X` must have same number of observations as `data`. -Further overloadings may be necessary to handle the case that `data` is the output of -[`obs(learner, data)`](@ref), if `obs` is being overloaded. In this case, be sure that -`X`, unless `nothing`, implements the data interface specified by +Ensure the returned object, unless `nothing`, implements the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). """ -features(learner, data) = _first(data) -_first(data) = data -_first(data::Tuple) = first(data) +features(learner, observations) = _first(observations) +_first(observations) = observations +_first(observations::Tuple) = first(observations) # note the factoring above guards against method ambiguities diff --git a/src/traits.jl b/src/traits.jl index 7c2dd6a2..2994b456 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -387,7 +387,7 @@ iteration_parameter(::Any) = nothing Return an upper bound `S` on the scitype of individual observations guaranteed to work when calling `fit`: if `observations = obs(learner, data)` and -`ScientificTypes.scitype(o) <:S` for each `o` in `observations`, then the call +`ScientificTypes.scitype(collect(o)) <:S` for each `o` in `observations`, then the call `fit(learner, data)` is supported. $DOC_EXPLAIN_EACHOBS @@ -396,7 +396,7 @@ See also [`LearnAPI.target_observation_scitype`](@ref). # New implementations -Optional. The fallback return value is `Union{}`. +Optional. The fallback return value is `Union{}`. """ fit_observation_scitype(::Any) = Union{} From 07c815e467fe6a4c90e7e273b6ae6769c41964ee Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Sat, 23 Nov 2024 16:20:31 +1300 Subject: [PATCH 178/187] fix bad urls --- docs/make.jl | 2 +- docs/src/anatomy_of_an_implementation.md | 10 ++++------ docs/src/fit_update.md | 2 +- docs/src/index.md | 6 +++--- 4 files changed, 9 insertions(+), 11 deletions(-) diff --git a/docs/make.jl b/docs/make.jl index 9c73ec0a..48683a0a 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -7,7 +7,7 @@ const REPO = Remotes.GitHub("JuliaAI", "LearnAPI.jl") makedocs( modules=[LearnAPI,], format=Documenter.HTML( - prettyurls = get(ENV, "CI", nothing) == "true", + prettyurls = true,#get(ENV, "CI", nothing) == "true", collapselevel = 1, ), pages=[ diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 8a65910c..9a8ccc14 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -366,8 +366,8 @@ An implementation may optionally implement [`obs`](@ref), to expose to the user meta-algorithm like cross-validation) the representation of input data internal to `fit` or `predict`, such as the matrix version `A` of `X` in the ridge example. That is, we may factor out of `fit` (and also `predict`) the data pre-processing step, `obs`, to expose -its outcomes. These outcomes become alternative user inputs to `fit`. To see the use of -`obs` in action, see [below](@ref advanced_demo). +its outcomes. These outcomes become alternative user inputs to `fit`/`predict`. To see the +use of `obs` in action, see [below](@ref advanced_demo). Here we specifically wrap all the pre-processed data into single object, for which we introduce a new type: @@ -536,7 +536,5 @@ declaration. ³ The last index must be the observation index. ⁴ The `data = (X, y)` pattern implemented here is not the only supported pattern. For, -example, `data` might be a single table containing both features and target variable. In -this case, it will be necessary to overload [`LearnAPI.features`](@ref) in addition to -[`LearnAPI.target`](@ref); the name of the target column would need to be a -hyperparameter. +example, `data` might be `(T, formula)` where `T` is a table and `formula` is an R-style +formula. diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index 64f68d73..d0ae1dc9 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -17,7 +17,7 @@ clustering algorithms); there is no training data and the algorithm is executed ``` update(model, data; verbosity=..., param1=new_value1, param2=new_value2, ...) -> updated_model update_observations(model, new_data; verbosity=..., param1=new_value1, ...) -> updated_model -update_features(model, new_data; verbosity=1, param1=new_value1, ...) -> updated_model +update_features(model, new_data; verbosity=..., param1=new_value1, ...) -> updated_model ``` ## Typical workflows diff --git a/docs/src/index.md b/docs/src/index.md index 87f1dd90..9a0c94f3 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -2,11 +2,11 @@
- Tutorial  |  - Reference  |  - Patterns
From 8e8123a081260ceffe5c496b0406645c99f53a78 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 16 Dec 2024 12:14:12 +1300 Subject: [PATCH 179/187] doc improvements --- docs/make.jl | 2 +- docs/src/anatomy_of_an_implementation.md | 45 ++++++++++++++++++++---- docs/src/obs.md | 8 ++--- docs/src/predict_transform.md | 2 +- docs/src/reference.md | 20 +++++++---- src/fit_update.jl | 5 +-- src/obs.jl | 7 ++-- src/predict_transform.jl | 13 ++++--- src/target_weights_features.jl | 5 ++- src/traits.jl | 4 +-- src/types.jl | 3 +- src/verbosity.jl | 11 +++--- 12 files changed, 81 insertions(+), 44 deletions(-) diff --git a/docs/make.jl b/docs/make.jl index 48683a0a..11759727 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -18,7 +18,7 @@ makedocs( "fit/update" => "fit_update.md", "predict/transform" => "predict_transform.md", "Kinds of Target Proxy" => "kinds_of_target_proxy.md", - "obs" => "obs.md", + "obs and Data Interfaces" => "obs.md", "target/weights/features" => "target_weights_features.md", "Accessor Functions" => "accessor_functions.md", "Learner Traits" => "traits.md", diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 9a8ccc14..0d4b45f8 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -1,6 +1,6 @@ # Anatomy of an Implementation -This section explains a detailed implementation of the LearnAPI.jl for naive [ridge +This tutorial details an implementation of the LearnAPI.jl for naive [ridge regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also refer to the [demonstration](@ref workflow) of the implementation given later. @@ -35,8 +35,7 @@ A transformer ordinarily implements `transform` instead of `predict`. For more o then an implementation must: (i) overload [`obs`](@ref) to articulate how provided data can be transformed into a form that does support this interface, as illustrated below under - [Providing a separate data front end](@ref), and which may additionally - enable certain performance benefits; or (ii) overload the trait + [Providing a separate data front end](@ref); or (ii) overload the trait [`LearnAPI.data_interface`](@ref) to specify a more relaxed data API. @@ -62,7 +61,7 @@ nothing # hide Instances of `Ridge` are *[learners](@ref learners)*, in LearnAPI.jl parlance. -Associated with each new type of LearnAPI.jl [learner](@ref learners) will be a keyword +Associated with each new type of LearnAPI.jl learner will be a keyword argument constructor, providing default values for all properties (typically, struct fields) that are not other learners, and we must implement [`LearnAPI.constructor(learner)`](@ref), for recovering the constructor from an instance: @@ -365,9 +364,41 @@ y = 2a - b + 3c + 0.05*rand(n) An implementation may optionally implement [`obs`](@ref), to expose to the user (or some meta-algorithm like cross-validation) the representation of input data internal to `fit` or `predict`, such as the matrix version `A` of `X` in the ridge example. That is, we may -factor out of `fit` (and also `predict`) the data pre-processing step, `obs`, to expose -its outcomes. These outcomes become alternative user inputs to `fit`/`predict`. To see the -use of `obs` in action, see [below](@ref advanced_demo). +factor out of `fit` (and also `predict`) a data pre-processing step, `obs`, to expose +its outcomes. 
These outcomes become alternative user inputs to `fit`/`predict`.
+
+In the default case, the alternative data representations will implement the MLUtils.jl
+`getobs/numobs` interface for observation subsampling, which is generally all a user or
+meta-algorithm will need, before passing the data on to `fit`/`predict` as you would the
+original data.
+
+So, instead of the pattern
+
+```julia
+model = fit(learner, data)
+predict(model, newdata)
+```
+
+one enables the following alternative (which in any case will still work, because of a
+no-op `obs` fallback provided by LearnAPI.jl):
+
+```julia
+observations = obs(learner, data) # pre-processed training data
+
+# optional subsampling:
+observations = MLUtils.getobs(observations, train_indices)
+
+model = fit(learner, observations)
+
+newobservations = obs(model, newdata)
+
+# optional subsampling:
+newobservations = MLUtils.getobs(newobservations, test_indices)
+
+predict(model, newobservations)
+```
+
+See also the demonstration [below](@ref advanced_demo).

Here we specifically wrap all the pre-processed data into single object, for which we
introduce a new type:

diff --git a/docs/src/obs.md b/docs/src/obs.md
index da6b2ccc..a583f27d 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -47,8 +47,6 @@ import MLUtils
learner =
data =
-X = LearnAPI.features(learner, data)
-y = LearnAPI.target(learner, data)
train_test_folds = map([1:10, 11:20, 21:30]) do test
 (setdiff(1:30, test), test)
@@ -65,12 +63,14 @@ scores = map(train_test_folds) do (train, test)
 # predict on the fold complement:
 if never_trained
+ X = LearnAPI.features(learner, data)
 global predictobs = obs(model, X)
 global never_trained = false
 end
 predictobs_subset = MLUtils.getobs(predictobs, test)
 ŷ = predict(model, Point(), predictobs_subset)
+ y = LearnAPI.target(learner, data)
 return
end
@@ -96,8 +96,8 @@ obs
### [Data interfaces](@id data_interfaces)
New implementations must overload [`LearnAPI.data_interface(learner)`](@ref) if the
-output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess`](@ref). (Arrays, most
-tables, and all tuples thereof, implement `RandomAccess`.)
+output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess()`](@ref). Arrays, most
+tables, and all tuples thereof, implement `RandomAccess()`.
- [`LearnAPI.RandomAccess`](@ref) (default)
- [`LearnAPI.FiniteIterable`](@ref)
diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md
index b15b9055..200094bb 100644
--- a/docs/src/predict_transform.md
+++ b/docs/src/predict_transform.md
@@ -86,7 +86,7 @@ dimension using distances from the cluster centres.
Learners may additionally overload `transform` to apply `fit` first, using the supplied
data if required, and then immediately `transform` the same data. In that case the first
-argument of `transform` is an *learner* instead of the output of `fit`:
+argument of `transform` is a *learner* instead of the output of `fit`:
```julia
transform(learner, data) # `fit` implied
diff --git a/docs/src/reference.md b/docs/src/reference.md
index 866e9d3b..39283937 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -80,9 +80,8 @@ Informally, we will sometimes use the word "model" to refer to the output of
`fit(learner, ...)` (see below), something which typically *does* store learned
parameters.
-For `learner` to be a valid LearnAPI.jl learner, -[`LearnAPI.constructor(learner)`](@ref) must be defined and return a keyword constructor -enabling recovery of `learner` from its properties: +For every `learner`, [`LearnAPI.constructor(learner)`](@ref) must return a keyword +constructor enabling recovery of `learner` from its properties: ```julia properties = propertynames(learner) @@ -92,7 +91,7 @@ named_properties = NamedTuple{properties}(getproperty.(Ref(learner), properties) which can be tested with `@assert `[`LearnAPI.clone(learner)`](@ref)` == learner`. -Note that if if `learner` is an instance of a *mutable* struct, this requirement +Note that if `learner` is an instance of a *mutable* struct, this requirement generally requires overloading `Base.==` for the struct. !!! important @@ -124,6 +123,13 @@ struct GradientRidgeRegressor{T<:Real} epochs::Int l2_regularization::T end + +""" + GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) + +Instantiate a gradient ridge regressor with the specified hyperparameters. + +""" GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) = GradientRidgeRegressor(learning_rate, epochs, l2_regularization) LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor @@ -132,9 +138,9 @@ LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor ## Documentation Attach public LearnAPI.jl-related documentation for a learner to it's *constructor*, -rather than to the struct defining its type. In this way, a learner can implement -multiple interfaces, in addition to the LearnAPI interface, with separate document strings -for each. +rather than to the struct defining its type, as shown in the example above. (In this way, +multiple interfaces can share a common struct, with separate document strings for each +interface.) ## Methods diff --git a/src/fit_update.jl b/src/fit_update.jl index 4d9c9e2e..56cbe710 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -21,7 +21,8 @@ The signature `fit(learner; verbosity=...)` (no `data`) is provided by learners not generalize to new observations (called *static algorithms*). In that case, `transform(model, data)` or `predict(model, ..., data)` carries out the actual algorithm execution, writing any byproducts of that operation to the mutable object `model` returned -by `fit`. +by `fit`. Inspect the value of [`LearnAPI.is_static(learner)`](@ref) to determine whether +`fit` consumes `data` or not. Use `verbosity=0` for warnings only, and `-1` for silent training. 
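The static case just described might look like this in use (a hedged sketch only; `MyStaticTransformer` is a hypothetical learner, not one defined in these patches):

```julia
learner = MyStaticTransformer() # hypothetical non-generalizing learner
@assert LearnAPI.is_static(learner)
model = fit(learner)            # no data is consumed at this point
W = transform(model, data)      # computation happens here; may mutate `model`
LearnAPI.extras(model)          # inspect byproducts recorded in `model`, if any
```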
@@ -117,7 +118,7 @@ learner = MyNeuralNetwork(epochs=10, learning_rate=0.01) model = fit(learner, data) # train for two more epochs using new data and new learning rate: -model = update_observations(model, new_data; epochs=2, learning_rate=0.1) +model = update_observations(model, new_data; epochs=12, learning_rate=0.1) ``` When following the call `fit(learner, data)`, the `update` call is semantically diff --git a/src/obs.jl b/src/obs.jl index 7485cf4d..99153170 100644 --- a/src/obs.jl +++ b/src/obs.jl @@ -25,15 +25,13 @@ model = fit(learner, data_train) ŷ = predict(model, Point(), X[101:150]) ``` -Alternative, data agnostic, workflow using `obs` and the MLUtils.jl method `getobs` -(assumes `LearnAPI.data_interface(learner) == RandomAccess()`): +Alternative workflow using `obs` and the MLUtils.jl method `getobs` to carry out +subsampling (assumes `LearnAPI.data_interface(learner) == RandomAccess()`): ```julia import MLUtils - fit_observations = obs(learner, data) model = fit(learner, MLUtils.getobs(fit_observations, 1:100)) - predict_observations = obs(model, X) ẑ = predict(model, Point(), MLUtils.getobs(predict_observations, 101:150)) @assert ẑ == ŷ @@ -41,7 +39,6 @@ ẑ = predict(model, Point(), MLUtils.getobs(predict_observations, 101:150)) See also [`LearnAPI.data_interface`](@ref). - # Extended help # New implementations diff --git a/src/predict_transform.jl b/src/predict_transform.jl index 8bb0a254..d4bfe0c8 100644 --- a/src/predict_transform.jl +++ b/src/predict_transform.jl @@ -8,7 +8,8 @@ DOC_MUTATION(op) = """ If [`LearnAPI.is_static(learner)`](@ref) is `true`, then `$op` may mutate it's first - argument, but not in a way that alters the result of a subsequent call to `predict`, + argument (to record byproducts of the computation not naturally part of the return + value) but not in a way that alters the result of a subsequent call to `predict`, `transform` or `inverse_transform`. See more at [`fit`](@ref). """ @@ -82,8 +83,9 @@ See also [`fit`](@ref), [`transform`](@ref), [`inverse_transform`](@ref). # Extended help -Note `predict ` must not mutate any argument, except in the special case -`LearnAPI.is_static(learner) == true`. +In the special case `LearnAPI.is_static(learner) == true`, it is possible that +`predict(model, ...)` will mutate `model`, but not in a way that affects subsequent +`predict` calls. # New implementations @@ -147,8 +149,9 @@ or, in one step (where supported): W = transform(learner, X) # `fit` implied ``` -Note `transform` does not mutate any argument, except in the special case -`LearnAPI.is_static(learner) == true`. +In the special case `LearnAPI.is_static(learner) == true`, it is possible that +`transform(model, ...)` will mutate `model`, but not in a way that affects subsequent +`transform` calls. See also [`fit`](@ref), [`predict`](@ref), [`inverse_transform`](@ref). diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl index 611da656..5defdba8 100644 --- a/src/target_weights_features.jl +++ b/src/target_weights_features.jl @@ -80,10 +80,9 @@ ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` ``` For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(learner)`) -`ŷ` above is generally intended to be an approximate proxy for `LearnAPI.target(learner, -data)`, the training target. +`ŷ` above is generally intended to be an approximate proxy for the target variable. 
-The object `X` returned by `LearnAPI.target` has the same number of observations as +The object `X` returned by `LearnAPI.features` has the same number of observations as `observations` does and is guaranteed to implement the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). diff --git a/src/traits.jl b/src/traits.jl index 2994b456..ca01783f 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -79,9 +79,9 @@ All new implementations must implement this trait. Here's a checklist for elemen return value: | expression | implementation compulsory? | include in returned tuple? | -|-----------------------------------|----------------------------|------------------------------------| +|:----------------------------------|:---------------------------|:-----------------------------------| | `:(LearnAPI.fit)` | yes | yes | -| `:(LearnAPI.learner)` | yes | yes | +| `:(LearnAPI.learner)` | yes | yes | | `:(LearnAPI.strip)` | no | yes | | `:(LearnAPI.obs)` | no | yes | | `:(LearnAPI.features)` | no | yes, unless `fit` consumes no data | diff --git a/src/types.jl b/src/types.jl index 8a53672d..faa6d250 100644 --- a/src/types.jl +++ b/src/types.jl @@ -229,7 +229,8 @@ A data interface type. We say that `data` implements the `FiniteIterable` inter it implements Julia's `iterate` interface, including `Base.length`, and if `Base.IteratorSize(typeof(data)) == Base.HasLength()`. For example, this is true if: -- `data` implements the [`LearnAPI.RandomAccess`](@ref) interface (arrays and most tables) +- `data` implements the [`LearnAPI.RandomAccess`](@ref) interface (arrays and most + tables); or - `data isa MLUtils.DataLoader`, which includes output from `MLUtils.eachobs`. diff --git a/src/verbosity.jl b/src/verbosity.jl index 4f77e659..3723bb77 100644 --- a/src/verbosity.jl +++ b/src/verbosity.jl @@ -2,7 +2,7 @@ const DEFAULT_VERBOSITY = Ref(1) """ LearnAPI.default_verbosity() - LearnAPI.default_verbosity(level::Int) + LearnAPI.default_verbosity(verbosity::Int) Respectively return, or set, the default `verbosity` level for LearnAPI.jl methods that support it, which includes [`fit`](@ref), [`update`](@ref), @@ -11,11 +11,10 @@ call is generally: -| `level` | behaviour | -|:--------|:----------------------------------| -| 1 | informational | -| 0 | warnings only | -| -1 | silent | +| `verbosity` | behaviour | +|:------------|:--------------| +| 1 | informational | +| 0 | warnings only | Methods consuming `verbosity` generally call other verbosity-supporting methods From 6279b25eff320d9a0ea97468a2fd431671492bdd Mon Sep 17 00:00:00 2001 From: "Anthony D. 
Blaom" Date: Mon, 16 Dec 2024 14:59:39 +1300 Subject: [PATCH 180/187] add @functions and have `LearnAPI.functions()` return accessors --- README.md | 5 ++--- docs/src/index.md | 2 +- docs/src/reference.md | 3 ++- src/LearnAPI.jl | 2 +- src/accessor_functions.jl | 31 ++++++++++++++++--------------- src/traits.jl | 35 ++++++++++++++++++++++++++++++++++- 6 files changed, 56 insertions(+), 22 deletions(-) diff --git a/README.md b/README.md index 6d1287e7..26dea491 100644 --- a/README.md +++ b/README.md @@ -22,9 +22,8 @@ julia> ridge = Ridge(lambda=0.1) Inspect available functionality: ``` -julia> LearnAPI.functions(ridge) -(:(LearnAPI.fit), :(LearnAPI.learner), :(LearnAPI.strip), :(LearnAPI.obs), -:(LearnAPI.features), :(LearnAPI.target), :(LearnAPI.predict), :(LearnAPI.coefficients)) +julia> @functions ridge +(fit, LearnAPI.learner, LearnAPI.strip, obs, LearnAPI.features, LearnAPI.target, predict, LearnAPI.coefficients ``` Train: diff --git a/docs/src/index.md b/docs/src/index.md index 9a0c94f3..ea7f347e 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -55,7 +55,7 @@ y = Xnew = # List LearnaAPI functions implemented for `forest`: -LearnAPI.functions(forest) +@functions forest # Train: model = fit(forest, X, y) diff --git a/docs/src/reference.md b/docs/src/reference.md index 39283937..9cff53d8 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -199,8 +199,9 @@ minimal (but useless) implementation, see the implementation of `SmallLearner` ## Utilities ```@docs +@functions LearnAPI.clone -LearnAPI.@trait +@trait ``` --- diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index 5d31ce93..2e4a3ee5 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -11,7 +11,7 @@ include("accessor_functions.jl") include("traits.jl") include("clone.jl") -export @trait +export @trait, @functions export fit, update, update_observations, update_features export predict, transform, inverse_transform, obs diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index c1b4447f..1de228cf 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -312,23 +312,23 @@ function training_labels end # :extras intentionally excluded: const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS = ( - learner, - coefficients, - intercept, - tree, - trees, - feature_names, - feature_importances, - training_labels, - training_losses, - training_predictions, - training_scores, - components, + :(LearnAPI.learner), + :(LearnAPI.coefficients), + :(LearnAPI.intercept), + :(LearnAPI.tree), + :(LearnAPI.trees), + :(LearnAPI.feature_names), + :(LearnAPI.feature_importances), + :(LearnAPI.training_labels), + :(LearnAPI.training_losses), + :(LearnAPI.training_predictions), + :(LearnAPI.training_scores), + :(LearnAPI.components), ) const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS_LIST = join( map(ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS) do f - "[`LearnAPI.$f`](@ref)" + "[`$f`](@ref)" end, ", ", " and ", @@ -354,11 +354,12 @@ $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_labels)")). """ function extras end -const ACCESSOR_FUNCTIONS = (extras, ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS...) +const ACCESSOR_FUNCTIONS = + (:(LearnAPI.extras), ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS...) const ACCESSOR_FUNCTIONS_LIST = join( map(ACCESSOR_FUNCTIONS) do f - "[`LearnAPI.$f`](@ref)" + "[`$f`](@ref)" end, ", ", " and ", diff --git a/src/traits.jl b/src/traits.jl index ca01783f..40d20da1 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -65,12 +65,18 @@ with `learner`, or an associated model (object returned by `fit(learner, ...)`, first argument. 
Learner traits (methods for which `learner` is the *only* argument) are excluded. +To return actual functions, instead of symbols, use [`@functions`](@ref)` learner` +instead. + The returned tuple may include expressions like `:(DecisionTree.print_tree)`, which reference functions not owned by LearnAPI.jl. The understanding is that `learner` is a LearnAPI-compliant object whenever the return value is non-empty. +Do `LearnAPI.functions()` to list all possible elements of the return value owned by +LearnAPI.jl. + # Extended help # New implementations @@ -100,6 +106,7 @@ learner-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIO (`LearnAPI.strip` is always included). """ +functions(::Any) = () functions() = ( :(LearnAPI.fit), :(LearnAPI.learner), @@ -114,8 +121,34 @@ functions() = ( :(LearnAPI.predict), :(LearnAPI.transform), :(LearnAPI.inverse_transform), + ACCESSOR_FUNCTIONS..., ) -functions(::Any) = () + +""" + @functions learner + +Return a tuple of functions that can be meaningfully applied with `learner`, or an +associated model, as the first argument. An "associated model" is an object returned by +`fit(learner, ...)`. Learner traits (methods for which `learner` is the *only* argument) +are excluded. + +``` +julia> @functions my_feature_selector +(fit, LearnAPI.learner, strip, obs, transform) + +``` + +New learner implementations should overload [`LearnAPI.functions`](@ref). + +See also [`LearnAPI.functions`](@ref). + +""" +macro functions(learner) + quote + exs = LearnAPI.functions(learner) + eval.(exs) + end |> esc +end """ LearnAPI.kinds_of_proxy(learner) From dff3cdedb712079717635f32a9c971b4fe597934 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 20 Dec 2024 15:57:35 +1300 Subject: [PATCH 181/187] typo --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 26dea491..7ef89a62 100644 --- a/README.md +++ b/README.md @@ -23,7 +23,7 @@ Inspect available functionality: ``` julia> @functions ridge -(fit, LearnAPI.learner, LearnAPI.strip, obs, LearnAPI.features, LearnAPI.target, predict, LearnAPI.coefficients +(fit, LearnAPI.learner, LearnAPI.strip, obs, LearnAPI.features, LearnAPI.target, predict, LearnAPI.coefficients) ``` Train: From e4bdbbb238370dd737b068ce99340b9f4fe3e3c3 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 20 Dec 2024 16:00:45 +1300 Subject: [PATCH 182/187] typo --- src/accessor_functions.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index 1de228cf..91c5d76a 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -349,7 +349,7 @@ See also [`fit`](@ref). Implementation is discouraged for byproducts already covered by other LearnAPI.jl accessor functions: $ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS_LIST. -$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_labels)")). +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.extras)")). """ function extras end From f4aed662ff4bb86b3b1561d686050ca74911d1a0 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 21 Dec 2024 11:11:05 +1300 Subject: [PATCH 183/187] typo --- docs/src/reference.md | 2 +- src/accessor_functions.jl | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/src/reference.md b/docs/src/reference.md index 9cff53d8..7935489a 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -45,7 +45,7 @@ values in the set of dictionary keys, can be specified as a hyperparameter. 
#### Context After training, a supervised classifier predicts labels on some input which are then -compared with ground truth labels using some accuracy measure, to assesses the performance +compared with ground truth labels using some accuracy measure, to assess the performance of the classifier. Alternatively, the classifier predicts class probabilities, which are instead paired with ground truth labels using a proper scoring rule, say. In outlier detection, "outlier"/"inlier" predictions, or probability-like scores, are similarly diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index 91c5d76a..32458eb2 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -247,7 +247,7 @@ See also [`fit`](@ref). # New implementations Implement for iterative algorithms that compute and record training losses as part of -training (e.g. neural networks). +training (e.g. neural networks). Return value should be `AbstractVector`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_predictions)")). From 1e946cf2afc0a4e7fde780e102bfea2b77e4e07f Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Sat, 28 Dec 2024 14:54:51 +1300 Subject: [PATCH 184/187] make multi changes to accssr fnctns; add clone to functions() --- docs/src/accessor_functions.md | 10 +- docs/src/anatomy_of_an_implementation.md | 5 +- docs/src/fit_update.md | 4 +- docs/src/index.md | 11 +- docs/src/patterns/ensembling.md | 1 + docs/src/patterns/iterative_algorithms.md | 2 + docs/src/patterns/regression.md | 1 + docs/src/reference.md | 38 +++--- src/LearnAPI.jl | 2 +- src/accessor_functions.jl | 157 ++++++++++++++++------ src/clone.jl | 2 + src/obs.jl | 8 +- src/target_weights_features.jl | 3 +- src/tools.jl | 3 +- src/traits.jl | 6 +- test/traits.jl | 1 + 16 files changed, 168 insertions(+), 86 deletions(-) diff --git a/docs/src/accessor_functions.md b/docs/src/accessor_functions.md index 93b377a7..11ac67e0 100644 --- a/docs/src/accessor_functions.md +++ b/docs/src/accessor_functions.md @@ -13,9 +13,10 @@ The sole argument of an accessor function is the output, `model`, of - [`LearnAPI.trees(model)`](@ref) - [`LearnAPI.feature_names(model)`](@ref) - [`LearnAPI.feature_importances(model)`](@ref) -- [`LearnAPI.training_labels(model)`](@ref) - [`LearnAPI.training_losses(model)`](@ref) -- [`LearnAPI.training_predictions(model)`](@ref) +- [`LearnAPI.out_of_sample_losses(model)`](@ref) +- [`LearnAPI.predictions(model)`](@ref) +- [`LearnAPI.out_of_sample_indices(model)`](@ref) - [`LearnAPI.training_scores(model)`](@ref) - [`LearnAPI.components(model)`](@ref) @@ -42,9 +43,10 @@ LearnAPI.trees LearnAPI.feature_names LearnAPI.feature_importances LearnAPI.training_losses -LearnAPI.training_predictions +LearnAPI.out_of_sample_losses +LearnAPI.predictions +LearnAPI.out_of_sample_indices LearnAPI.training_scores -LearnAPI.training_labels LearnAPI.components ``` diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 0d4b45f8..9d5de48f 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -50,7 +50,7 @@ nothing # hide ## Defining learners -Here's a new type whose instances specify ridge regression hyperparameters: +Here's a new type whose instances specify the single ridge regression hyperparameter: ```@example anatomy struct Ridge{T<:Real} @@ -280,7 +280,7 @@ nothing # hide ```@example anatomy learner = Ridge(lambda=0.5) -foreach(println, LearnAPI.functions(learner)) +@functions learner ``` Training and predicting: @@ -344,6 
+344,7 @@ LearnAPI.strip(model::RidgeFitted) = functions = ( :(LearnAPI.fit), :(LearnAPI.learner), + :(LearnAPI.clone), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index d0ae1dc9..a574f3a1 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -111,8 +111,8 @@ Exactly one of the following must be implemented: | method | fallback | compulsory? | |:-------------------------------------------------------------------------------------|:---------|-------------| | [`update`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)` | none | no | -| [`update_observations`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)` | none | no | -| [`update_features`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)` | none | no | +| [`update_observations`](@ref)`(model, new_data; verbosity=..., hyperparameter_updates...)` | none | no | +| [`update_features`](@ref)`(model, new_data; verbosity=..., hyperparameter_updates...)` | none | no | There are some contracts governing the behaviour of the update methods, as they relate to a previous `fit` call. Consult the document strings for details. diff --git a/docs/src/index.md b/docs/src/index.md index ea7f347e..55c18898 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -50,9 +50,9 @@ enable the basic workflow below. In this case data is presented following the "scikit-learn" `X, y` pattern, although LearnAPI.jl supports other patterns as well. ```julia -X = -y = -Xnew = +# `X` is some training features +# `y` is some training target +# `Xnew` is some test or production features # List LearnaAPI functions implemented for `forest`: @functions forest @@ -72,11 +72,6 @@ LearnAPI.feature_importances(model) # Slim down and otherwise prepare model for serialization: small_model = LearnAPI.strip(model) serialize("my_random_forest.jls", small_model) - -# Recover saved model and algorithm configuration ("learner"): -recovered_model = deserialize("my_random_forest.jls") -@assert LearnAPI.learner(recovered_model) == forest -@assert predict(recovered_model, Point(), Xnew) == ŷ ``` `Distribution` and `Point` are singleton types owned by LearnAPI.jl. 
They allow diff --git a/docs/src/patterns/ensembling.md b/docs/src/patterns/ensembling.md index ea5faf88..8d774f5e 100644 --- a/docs/src/patterns/ensembling.md +++ b/docs/src/patterns/ensembling.md @@ -4,3 +4,4 @@ See these examples from the JuliaTestAPI.jl test suite: - [bagged ensembling of a regression model](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/ensembling.jl) +- [extremely randomized ensemble of decision stumps (regression)](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/ensembling.jl) diff --git a/docs/src/patterns/iterative_algorithms.md b/docs/src/patterns/iterative_algorithms.md index b12c6142..265dddf7 100644 --- a/docs/src/patterns/iterative_algorithms.md +++ b/docs/src/patterns/iterative_algorithms.md @@ -5,3 +5,5 @@ See these examples from the JuliaTestAI.jl test suite: - [bagged ensembling](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/ensembling.jl) - [perceptron classifier](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/gradient_descent.jl) + +- [extremely randomized ensemble of decision stumps (regression)](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/ensembling.jl) diff --git a/docs/src/patterns/regression.md b/docs/src/patterns/regression.md index ca68b308..a6de5b10 100644 --- a/docs/src/patterns/regression.md +++ b/docs/src/patterns/regression.md @@ -4,3 +4,4 @@ See these examples from the JuliaTestAPI.jl test suite: - [ridge regression](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/regression.jl) +- [extremely randomized ensemble of decision stumps](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/ensembling.jl) diff --git a/docs/src/reference.md b/docs/src/reference.md index 7935489a..5484d243 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -27,9 +27,9 @@ see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details. !!! note - In the MLUtils.jl - convention, observations in tables are the rows but observations in a matrix are the - columns. + In the MLUtils.jl + convention, observations in tables are the rows but observations in a matrix are the + columns. ### [Hyperparameters](@id hyperparameters) @@ -96,9 +96,9 @@ generally requires overloading `Base.==` for the struct. !!! important - No LearnAPI.jl method is permitted to mutate a learner. In particular, one should make - deep copies of RNG hyperparameters before using them in a new implementation of - [`fit`](@ref). + No LearnAPI.jl method is permitted to mutate a learner. In particular, one should make + deep copies of RNG hyperparameters before using them in a new implementation of + [`fit`](@ref). #### Composite learners (wrappers) @@ -119,19 +119,19 @@ Below is an example of a learner type with a valid constructor: ```julia struct GradientRidgeRegressor{T<:Real} - learning_rate::T - epochs::Int - l2_regularization::T + learning_rate::T + epochs::Int + l2_regularization::T end """ - GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) - + GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) + Instantiate a gradient ridge regressor with the specified hyperparameters. """ GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) = - GradientRidgeRegressor(learning_rate, epochs, l2_regularization) + GradientRidgeRegressor(learning_rate, epochs, l2_regularization) LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor ``` @@ -146,9 +146,9 @@ interface.) !!! 
note "Compulsory methods" - All new learner types must implement [`fit`](@ref), - [`LearnAPI.learner`](@ref), [`LearnAPI.constructor`](@ref) and - [`LearnAPI.functions`](@ref). + All new learner types must implement [`fit`](@ref), + [`LearnAPI.learner`](@ref), [`LearnAPI.constructor`](@ref) and + [`LearnAPI.functions`](@ref). Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). For a minimal (but useless) implementation, see the implementation of `SmallLearner` @@ -198,10 +198,14 @@ minimal (but useless) implementation, see the implementation of `SmallLearner` ## Utilities +- [`clone`](@ref): for cloning a learner with specified hyperparameter replacements. +- [`@trait`](@ref): for simultaneously declaring multiple traits +- [`@functions`](@ref): for listing functions available for use with a learner + ```@docs -@functions -LearnAPI.clone +clone @trait +@functions ``` --- diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index 2e4a3ee5..9687c2e9 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -11,7 +11,7 @@ include("accessor_functions.jl") include("traits.jl") include("clone.jl") -export @trait, @functions +export @trait, @functions, clone export fit, update, update_observations, update_features export predict, transform, inverse_transform, obs diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index 32458eb2..db8cb666 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -181,20 +181,22 @@ function intercept end """ LearnAPI.tree(model) -Return a user-friendly tree, in the form of a root object implementing the following -interface defined in AbstractTrees.jl: - -- subtypes `AbstractTrees.AbstractNode{T}` -- implements `AbstractTrees.children()` -- implements `AbstractTrees.printnode()` - -Such a tree can be visualized using the TreeRecipe.jl package, for example. +Return a user-friendly `tree`, implementing the AbstractTrees.jl interface. In particular, +such a tree can be visualized using `AbstractTrees.print_tree(tree)` or using the +TreeRecipe.jl package. See also [`LearnAPI.trees`](@ref). # New implementations -Implementation is optional. +Implementation is optional. The returned object should implement the following interface +defined in AbstractTrees.jl: + +- `tree` subtypes `AbstractTrees.AbstractNode{T}` + +- `AbstractTrees.children(tree)` + +- `AbstractTrees.printnode(tree)` should be human-readable $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.tree)")). @@ -204,14 +206,15 @@ function tree end """ LearnAPI.trees(model) -For some ensemble model, return a vector of trees. See [`LearnAPI.tree`](@ref) for the -form of such trees. +For tree ensemble model, return a vector of trees, each implementing the AbstractTrees.jl +interface. See also [`LearnAPI.tree`](@ref). # New implementations -Implementation is optional. +Implementation is optional. See [`LearnAPI.tree`](@ref) for the interface each tree in the +ensemble should implement. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.trees)")). @@ -221,15 +224,20 @@ function trees end """ LearnAPI.training_losses(model) -Return the training losses obtained when running `model = fit(learner, ...)` for some -`learner`. +Return internally computed training losses obtained when running `model = fit(learner, +...)` for some `learner`, one for each iteration of the algorithm. This will be a +numerical vector. The metric used to compute the loss is generally learner-specific, but +may be a user-specifiable learner hyperparameter. Generally, the smaller the loss, the +better the performance. 
See also [`fit`](@ref).

# New implementations

-Implement for iterative algorithms that compute and record training losses as part of
-training (e.g. neural networks).
+Implement for iterative algorithms that compute meausures of training performance as part
+of training (e.g. neural networks). Return one value per iteration, in chronological
+order, with an optional pre-training intial value. If scores are being computed rather
+than losses, ensure values are multiplied by -1.

$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_losses)")).

@@ -237,36 +245,111 @@ $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_losses)")).
function training_losses end

"""
-    LearnAPI.training_predictions(model)
+    LearnAPI.out_of_sample_losses(model)
+
+Return internally computed out-of-sample losses obtained when running `model =
+fit(learner, ...)` for some `learner`, one for each iteration of the algorithm. This will
+be a numeric vector. The metric used to compute the loss is generally learner-specific,
+but may be a user-specifiable learner hyperparameter. Generally, the smaller the loss, the
+better the performance.
+
+If the learner is not setting aside a separate validation set, then the losses are all
+`Inf`.
+
+See also [`fit`](@ref).
+
+# New implementations
+
+Only implement this method for learners that specifically allow for the supplied training
+data to be internally split into separate "train" and "validation" subsets, and which
+additionally compute an out-of-sample loss. Return one value per iteration, in
+chronological order, with an optional pre-training intial value. If scores are being
+computed rather than losses, ensure values are multiplied by -1.
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.out_of_sample_losses)")).
+
+"""
+function out_of_sample_losses end

-Return internally computed training predictions when running `model = fit(learner, ...)`
-for some `learner`.
+"""
+    LearnAPI.predictions(model)
+
+Return internally computed predictions on the training data when running `model =
+fit(learner, ...)` for some `learner`. These will be actual target predictions or proxies
+for the target, according to the first value of
+[`LearnAPI.kinds_of_proxy(learner)`](@ref).

See also [`fit`](@ref).

# New implementations

-Implement for iterative algorithms that compute and record training losses as part of
-training (e.g. neural networks). Return value should be `AbstractVector`.
+Implement for algorithms that internally compute predictions for the training
+data. Predictions for the complete training data must be returned, even if only a subset is
+used for training. Here are use cases:
+
+- Clustering algorithms that generalize to new data, but by first learning labels for the
+  training data (e.g., K-means); use `predictions(model)` to expose these labels
+  to the user so they can avoid a separate `predict` call.
+
+- Iterative learners such as neural networks, that need to make in-sample predictions
+  to estimate an in-sample loss; use `predictions(model)`
+  to expose these predictions to the user so they can avoid a separate `predict` call.
+
+- Ensemble learners, such as gradient tree boosting algorithms, may split the training
+  data into internal train and validation subsets and can efficiently build up predictions
+  on both with an update for each new ensemble member; expose these predictions to the
+  user (for external iteration control, for example) using `predictions(model)`
+  and articulate the actual split used using
+  [`LearnAPI.out_of_sample_indices(model)`](@ref).
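For instance, the first use case in the list above might look like this (a hypothetical sketch; `KMeansLikeLearner` is an invented name, not part of LearnAPI.jl):

```julia
learner = KMeansLikeLearner(k=3)      # hypothetical clustering learner
model = fit(learner, X)
labels = LearnAPI.predictions(model)  # cluster labels already computed during `fit`
# semantically the same as, but cheaper than, a separate call:
# labels == predict(model, X)
```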
-$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_predictions)")).
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.predictions)")).

"""
-function training_predictions end
+function predictions end
+
+"""
+    LearnAPI.out_of_sample_indices(model)
+
+For a learner implementing [`LearnAPI.predictions`](@ref), return a vector of
+observation indices identifying which part, if any, of `yhat =
+LearnAPI.predictions(model)`, is actually out-of-sample predictions. If the
+learner trained on all data this will be an empty vector.
+
+Here's a sample workflow for some such `learner`, with training data, `(X, y)`, where `y`
+is the training target, here assumed to be a vector.
+
+```julia
+using Statistics: mean
+model = fit(learner, (X, y))
+yhat = LearnAPI.predictions(model)
+test_indices = LearnAPI.out_of_sample_indices(model)
+out_of_sample_loss = mean(yhat[test_indices] .!= y[test_indices])
+```
+
+# New implementations
+
+Implement for algorithms that internally split training data into "train" and
+"validate" subsets. Assumes
+[`LearnAPI.data_interface(learner)`](@ref)`==LearnAPI.RandomAccess()`.
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.out_of_sample_indices)")).
+"""
+function out_of_sample_indices end

"""
    LearnAPI.training_scores(model)

Return the training scores obtained when running `model = fit(learner, ...)` for some
-`learner`.
+`learner`. This will be a numerical vector whose length coincides with the number of
+training observations, and whose interpretation depends on the learner.

See also [`fit`](@ref).

# New implementations

-Implement for learners, such as outlier detection algorithms, which associate a score
-with each observation during training, where these scores are of interest in later
-processes (e.g, in defining normalized scores for new data).
+Implement for learners, such as outlier detection algorithms, which associate a numerical
+score with each observation during training, when these scores are of interest in
+workflows (e.g., to normalize the scores for new observations).

$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_scores)")).

@@ -295,21 +378,6 @@ $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.components)")).

"""
function components end

-"""
-    LearnAPI.training_labels(model)
-
-Return the training labels obtained when running `model = fit(learner, ...)` for some
-`learner`.
-
-See also [`fit`](@ref).
-
-# New implementations
-
-$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_labels)")).
-
-"""
-function training_labels end
-
# :extras intentionally excluded:
const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS = (
    :(LearnAPI.learner),
@@ -319,9 +387,10 @@ const ACCESSOR_FUNCTIONS_WITHOUT_EXTRAS = (
    :(LearnAPI.trees),
    :(LearnAPI.feature_names),
    :(LearnAPI.feature_importances),
-    :(LearnAPI.training_labels),
    :(LearnAPI.training_losses),
+    :(LearnAPI.out_of_sample_losses),
+    :(LearnAPI.predictions),
+    :(LearnAPI.out_of_sample_indices),
    :(LearnAPI.training_scores),
    :(LearnAPI.components),
)
diff --git a/src/clone.jl b/src/clone.jl
index 47fc0a3b..3f7d478d 100644
--- a/src/clone.jl
+++ b/src/clone.jl
@@ -9,6 +9,8 @@ clone(learner; epochs=100, learning_rate=0.01)

A LearnAPI.jl contract ensures that `LearnAPI.clone(learner) == learner`.

+A new learner implementation does not overload `clone`.
+
"""
function clone(learner; replacements...)
reps = NamedTuple(replacements)
diff --git a/src/obs.jl b/src/obs.jl
index 99153170..2e631d30 100644
--- a/src/obs.jl
+++ b/src/obs.jl
@@ -2,10 +2,10 @@
    obs(learner, data)
    obs(model, data)

-Return learner-specific representation of `data`, suitable for passing to `fit`
-(first signature) or to `predict` and `transform` (second signature), in place of
-`data`. Here `model` is the return value of `fit(learner, ...)` for some LearnAPI.jl
-learner, `learner`.
+Return learner-specific representation of `data`, suitable for passing to `fit`, `update`,
+`update_observations`, or `update_features` (first signature) or to `predict` and
+`transform` (second signature), in place of `data`. Here `model` is the return value of
+`fit(learner, ...)` for some LearnAPI.jl learner, `learner`.

The returned object is guaranteed to implement observation access as indicated by
[`LearnAPI.data_interface(learner)`](@ref), typically
diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl
index 5defdba8..2af85bad 100644
--- a/src/target_weights_features.jl
+++ b/src/target_weights_features.jl
@@ -7,7 +7,8 @@ data)`](@ref), the target variable part of `observations`. If `nothing` is retur

The returned object `y` has the same number of observations as `observations` does and is
guaranteed to implement the data interface specified by
-[`LearnAPI.data_interface(learner)`](@ref).
+[`LearnAPI.data_interface(learner)`](@ref). Its form should be suitable for pairing with
+the output of [`predict`](@ref), for example in a loss function.

# Extended help
diff --git a/src/tools.jl b/src/tools.jl
index 731860ff..e051b64b 100644
--- a/src/tools.jl
+++ b/src/tools.jl
@@ -11,7 +11,8 @@ end
"""
    @trait(LearnerType, trait1=value1, trait2=value2, ...)

-Overload a number of traits for learners of type `LearnerType`. For example, the code
+Simultaneously overload a number of traits for learners of type `LearnerType`. For
+example, the code

```julia
@trait(
diff --git a/src/traits.jl b/src/traits.jl
index 40d20da1..e8f93a4f 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -88,6 +88,7 @@ return value:
|:----------------------------------|:---------------------------|:-----------------------------------|
| `:(LearnAPI.fit)` | yes | yes |
| `:(LearnAPI.learner)` | yes | yes |
+| `:(LearnAPI.clone)` | never overloaded | yes |
| `:(LearnAPI.strip)` | no | yes |
| `:(LearnAPI.obs)` | no | yes |
| `:(LearnAPI.features)` | no | yes, unless `fit` consumes no data |
@@ -110,6 +111,7 @@ functions(::Any) = ()
functions() = (
    :(LearnAPI.fit),
    :(LearnAPI.learner),
+    :(LearnAPI.clone),
    :(LearnAPI.strip),
    :(LearnAPI.obs),
    :(LearnAPI.features),
@@ -129,8 +131,8 @@

Return a tuple of functions that can be meaningfully applied with `learner`, or an
associated model, as the first argument. An "associated model" is an object returned by
-`fit(learner, ...)`. Learner traits (methods for which `learner` is the *only* argument)
-are excluded.
+`fit(learner, ...)`. Learner traits (methods for which `learner` is always the *only*
+argument) are excluded.

```
julia> @functions my_feature_selector
diff --git a/test/traits.jl b/test/traits.jl
index f76a361b..35d53430 100644
--- a/test/traits.jl
+++ b/test/traits.jl
@@ -13,6 +13,7 @@ LearnAPI.learner(model::SmallLearner) = model
functions = (
    :(LearnAPI.fit),
    :(LearnAPI.learner),
+    :(LearnAPI.clone),
    :(LearnAPI.strip),
    :(LearnAPI.obs),
    :(LearnAPI.features),

From d37e60e03049a470e4859341022283d78591b648 Mon Sep 17 00:00:00 2001
From: "Anthony D.
Blaom" Date: Thu, 23 Jan 2025 19:35:39 +1300 Subject: [PATCH 185/187] replace is_composite trait with nonlearners trait --- docs/src/anatomy_of_an_implementation.md | 41 ++++++++++------ docs/src/fit_update.md | 2 +- docs/src/predict_transform.md | 4 +- docs/src/reference.md | 11 ++--- docs/src/traits.md | 61 ++++++++++++----------- src/accessor_functions.jl | 62 ++++++++++++------------ src/clone.jl | 10 ++-- src/fit_update.jl | 51 +++++++++---------- src/target_weights_features.jl | 8 +-- src/traits.jl | 41 +++++++++++----- test/traits.jl | 3 +- 11 files changed, 166 insertions(+), 128 deletions(-) diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md index 9d5de48f..c977cf50 100644 --- a/docs/src/anatomy_of_an_implementation.md +++ b/docs/src/anatomy_of_an_implementation.md @@ -1,10 +1,5 @@ # Anatomy of an Implementation -This tutorial details an implementation of the LearnAPI.jl for naive [ridge -regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of -workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also -refer to the [demonstration](@ref workflow) of the implementation given later. - The core LearnAPI.jl pattern looks like this: ```julia @@ -14,9 +9,21 @@ predict(model, newdata) Here `learner` specifies hyperparameters, while `model` stores learned parameters and any byproducts of algorithm execution. -A transformer ordinarily implements `transform` instead of `predict`. For more on +[Transformers](@ref) ordinarily implement `transform` instead of `predict`. For more on `predict` versus `transform`, see [Predict or transform?](@ref) +["Static" algorithms](@ref static_algorithms) have a `fit` that consumes no `data` +(instead `predict` or `transform` does the heavy lifting). In [density +estimation](@ref density_estimation), `predict` consumes no data. + +These are the basic possibilities. + +Elaborating on the core pattern above, we detail in this tutorial an implementation of the +LearnAPI.jl for naive [ridge regression](https://en.wikipedia.org/wiki/Ridge_regression) +with no intercept. The kind of workflow we want to enable has been previewed in [Sample +workflow](@ref). Readers can also refer to the [demonstration](@ref workflow) of the +implementation given later. + !!! note New implementations of `fit`, `predict`, etc, @@ -102,7 +109,7 @@ nothing # hide Note that we also include `learner` in the struct, for it must be possible to recover `learner` from the output of `fit`; see [Accessor functions](@ref) below. -The core implementation of `fit` looks like this: +The implementation of `fit` looks like this: ```@example anatomy function LearnAPI.fit(learner::Ridge, data; verbosity=LearnAPI.default_verbosity()) @@ -131,7 +138,7 @@ end ## Implementing `predict` -Users will be able to call `predict` like this: +One way users will be able to call `predict` is like this: ```julia predict(model, Point(), Xnew) @@ -229,6 +236,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined: functions = ( :(LearnAPI.fit), :(LearnAPI.learner), + :(LearnAPI.clone), :(LearnAPI.strip), :(LearnAPI.obs), :(LearnAPI.features), @@ -241,12 +249,17 @@ nothing # hide ``` The last trait, `functions`, returns a list of all LearnAPI.jl methods that can be -meaningfully applied to the learner or associated model. See [`LearnAPI.functions`](@ref) -for a checklist. [`LearnAPI.functions`](@ref) and [`LearnAPI.constructor`](@ref), are the -only universally compulsory traits. 
However, it is worthwhile studying the [list of all
-traits](@ref traits_list) to see which might apply to a new implementation, to enable
-maximum buy into functionality provided by third party packages, and to assist third party
-algorithms that match machine learning algorithms to user-defined tasks.
+meaningfully applied to the learner or associated model. You always include the first five
+you see here: `fit`, `learner`, `clone`, `strip`, `obs`. Here [`clone`](@ref) is a utility
+function provided by LearnAPI that you never overload; overloading [`obs`](@ref) is
+optional (see [Providing a separate data front end](@ref)) but it is always included
+because it has a fallback. See [`LearnAPI.functions`](@ref) for a checklist.
+
+[`LearnAPI.functions`](@ref) and [`LearnAPI.constructor`](@ref), are the only universally
+compulsory traits. However, it is worthwhile studying the [list of all traits](@ref
+traits_list) to see which might apply to a new implementation, to enable maximum buy-in to
+functionality provided by third party packages, and to assist third party algorithms that
+match machine learning algorithms to user-defined tasks.

 Note that we know `Ridge` instances are supervised learners because `:(LearnAPI.target) in
 LearnAPI.functions(learner)`, for every instance `learner`. With [some
diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md
index a574f3a1..c649e6dd 100644
--- a/docs/src/fit_update.md
+++ b/docs/src/fit_update.md
@@ -76,7 +76,7 @@ LearnAPI.extras(model)

 See also [Static Algorithms](@ref)

-### Density estimation
+### [Density estimation](@id density_estimation)

 In density estimation, `fit` consumes no features, only a target variable; `predict`,
 which consumes no data, returns the learned density:
diff --git a/docs/src/predict_transform.md b/docs/src/predict_transform.md
index 200094bb..d6ab8f25 100644
--- a/docs/src/predict_transform.md
+++ b/docs/src/predict_transform.md
@@ -6,8 +6,8 @@ transform(model, data)
 inverse_transform(model, data)
 ```

-Versions without the `data` argument may apply, for example in [Density
-estimation](@ref).
+Versions without the `data` argument may apply, for example in [density
+estimation](@ref density_estimation).

 ## [Typical workflows](@id predict_workflow)

diff --git a/docs/src/reference.md b/docs/src/reference.md
index 5484d243..f068afc7 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -103,12 +103,11 @@ generally requires overloading `Base.==` for the struct.

 #### Composite learners (wrappers)

 A *composite learner* is one with at least one property that can take other learners as
-values; for such learners [`LearnAPI.is_composite`](@ref)`(learner)` must be `true`
-(fallback is `false`). Generally, the keyword constructor provided by
-[`LearnAPI.constructor`](@ref) must provide default values for all properties that are not
-learner-valued. Instead, these learner-valued properties can have a `nothing` default,
-with the constructor throwing an error if the constructor call does not explicitly
-specify a new value.
+values; for such learners [`LearnAPI.learners(learner)`](@ref) is non-empty. A keyword
+constructor provided by [`LearnAPI.constructor`](@ref) must provide default values for all
+properties that are not in [`LearnAPI.learners(learner)`](@ref). Instead, these
+learner-valued properties can have a `nothing` default, with the constructor throwing an
+error if the constructor call does not explicitly specify a new value.
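By way of illustration, here is a minimal sketch of a constructor obeying this convention for a hypothetical wrapper (`MyBaggedEnsemble` and its properties are invented names, not part of LearnAPI.jl):

```julia
struct MyBaggedEnsemble{L}
    atom::L  # learner-valued property
    n::Int   # ordinary hyperparameter
end

# ordinary hyperparameters get working defaults, but the learner-valued
# property defaults to `nothing` and must be supplied explicitly:
function MyBaggedEnsemble(; atom=nothing, n=100)
    isnothing(atom) && error("You must specify `atom=...`.")
    MyBaggedEnsemble(atom, n)
end

LearnAPI.constructor(::MyBaggedEnsemble) = MyBaggedEnsemble
```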
Any object `learner` for which [`LearnAPI.functions(learner)`](@ref) is non-empty is
understood to have a valid implementation of the LearnAPI.jl interface.
diff --git a/docs/src/traits.md b/docs/src/traits.md
index f47f1633..a95404bf 100644
--- a/docs/src/traits.md
+++ b/docs/src/traits.md
@@ -13,45 +13,48 @@ training). They may also record more mundane information, such as a package lice
 In the examples column of the table below, `Continuous` is a name owned by the package
 [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/).

-| trait                                                       | return value                                                                                                               | fallback value                                         | example                                                     |
-|:-----------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------|:-----------------------------------------------------------|
-| [`LearnAPI.constructor`](@ref)`(learner)`                   | constructor for generating new or modified versions of `learner`                                                          | (no fallback)                                          | `RidgeRegressor`                                            |
| [`LearnAPI.functions`](@ref)`(learner)` | functions you can apply to `learner` or associated model (traits excluded) | `()` | `(:fit, :predict, :LearnAPI.strip, :(LearnAPI.learner), :obs)` |
| [`LearnAPI.kinds_of_proxy`](@ref)`(learner)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(learner, kind, ...)` is guaranteed.
| `()` | `(Distribution(), Interval())` | -| [`LearnAPI.tags`](@ref)`(learner)` | lists one or more suggestive learner tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) | -| [`LearnAPI.is_pure_julia`](@ref)`(learner)` | `true` if implementation is 100% Julia code | `false` | `true` | -| [`LearnAPI.pkg_name`](@ref)`(learner)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` | -| [`LearnAPI.pkg_license`](@ref)`(learner)` | name of license of package providing core code | `"unknown"` | `"MIT"` | -| [`LearnAPI.doc_url`](@ref)`(learner)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | -| [`LearnAPI.load_path`](@ref)`(learner)` | string locating name returned by `LearnAPI.constructor(learner)`, beginning with a package name | "unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` | -| [`LearnAPI.is_composite`](@ref)`(learner)` | `true` if one or more properties of `learner` may be a learner | `false` | `true` | -| [`LearnAPI.human_name`](@ref)`(learner)` | human name for the learner; should be a noun | type name with spaces | "elastic net regressor" | -| [`LearnAPI.iteration_parameter`](@ref)`(learner)` | symbolic name of an iteration parameter | `nothing` | :epochs | -| [`LearnAPI.data_interface`](@ref)`(learner)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | -| [`LearnAPI.fit_observation_scitype`](@ref)`(learner)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(learner, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | -| [`LearnAPI.target_observation_scitype`](@ref)`(learner)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | -| [`LearnAPI.is_static`](@ref)`(learner)` | `true` if `fit` consumes no data | `false` | `true` | +| [`LearnAPI.kinds_of_proxy`](@ref)`(learner)` | instances `kind` of `KindOfProxy` for which an implementation of `LearnAPI.predict(learner, kind, ...)` is guaranteed. 
| `()` | `(Distribution(), Interval())` |
+| [`LearnAPI.tags`](@ref)`(learner)` | lists one or more suggestive learner tags from `LearnAPI.tags()` | `()` | (:regression, :probabilistic) |
+| [`LearnAPI.is_pure_julia`](@ref)`(learner)` | `true` if implementation is 100% Julia code | `false` | `true` |
+| [`LearnAPI.pkg_name`](@ref)`(learner)` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"unknown"` | `"DecisionTree"` |
+| [`LearnAPI.pkg_license`](@ref)`(learner)` | name of license of package providing core code | `"unknown"` | `"MIT"` |
+| [`LearnAPI.doc_url`](@ref)`(learner)` | url providing documentation of the core code | `"unknown"` | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` |
+| [`LearnAPI.load_path`](@ref)`(learner)` | string locating name returned by `LearnAPI.constructor(learner)`, beginning with a package name | `"unknown"` | `FastTrees.LearnAPI.DecisionTreeClassifier` |
+| [`LearnAPI.nonlearners`](@ref)`(learner)` | properties *not* corresponding to other learners | all properties | `(:K, :leafsize, :metric,)` |
+| [`LearnAPI.human_name`](@ref)`(learner)` | human name for the learner; should be a noun | type name with spaces | "elastic net regressor" |
+| [`LearnAPI.iteration_parameter`](@ref)`(learner)` | symbolic name of an iteration parameter | `nothing` | :epochs |
+| [`LearnAPI.data_interface`](@ref)`(learner)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) |
+| [`LearnAPI.fit_observation_scitype`](@ref)`(learner)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(learner, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` |
+| [`LearnAPI.target_observation_scitype`](@ref)`(learner)` | upper bound on the scitype of each observation of the target | `Any` | `Continuous` |
+| [`LearnAPI.is_static`](@ref)`(learner)` | `true` if `fit` consumes no data | `false` | `true` |

### Derived Traits

The following are provided for convenience but should not be overloaded by new learners:

| trait                          | return value                                                                      | example       |
|:-------------------------------|:----------------------------------------------------------------------------------|:--------------|
| `LearnAPI.name(learner)` | learner type name as string | "PCA" |
| `LearnAPI.learners(learner)` | properties with learner values | `(:atom, )` |
| `LearnAPI.is_learner(learner)` | `true` if `learner` is LearnAPI.jl-compliant | `true` |
| `LearnAPI.target(learner)` | `true` if `fit` sees a target variable; see [`LearnAPI.target`](@ref) | `false` |
| `LearnAPI.weights(learner)` | `true` if `fit` supports per-observation weights; see [`LearnAPI.weights`](@ref) | `false` |

## Implementation guide

+Only `LearnAPI.constructor` and `LearnAPI.functions` are universally compulsory.
+
A single-argument trait is declared following this pattern:

```julia
LearnAPI.is_pure_julia(learner::MyLearnerType) = true
```

-A shorthand for single-argument traits is available:
+A macro [`@trait`](@ref) provides a short-cut:

```julia
@trait MyLearnerType is_pure_julia=true
```

@@ -75,8 +78,8 @@ requires:

 1. *Finiteness:* The value of a trait is the same for all `learner`s with same value of
    [`LearnAPI.constructor(learner)`](@ref). This typically means trait values do not
-   depend on type parameters! If `is_composite(learner) = true`, this requirement is
-   dropped.
+   depend on type parameters! For composite models (`LearnAPI.learners(learner)`
+   non-empty) this requirement is dropped.

 2. *Low level deserializability:* It should be possible to evaluate the trait *value*
    when `LearnAPI` is the only imported module.

@@ -98,7 +101,7 @@ LearnAPI.pkg_name
LearnAPI.pkg_license
LearnAPI.doc_url
LearnAPI.load_path
-LearnAPI.is_composite
+LearnAPI.nonlearners
LearnAPI.human_name
LearnAPI.data_interface
LearnAPI.iteration_parameter
@@ -106,3 +109,7 @@ LearnAPI.fit_observation_scitype
LearnAPI.target_observation_scitype
LearnAPI.is_static
```
+
+```@docs
+LearnAPI.learners
+```
diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl
index db8cb666..bb674b00 100644
--- a/src/accessor_functions.jl
+++ b/src/accessor_functions.jl
@@ -96,8 +96,8 @@ LearnAPI.strip(model) = model
 """
     LearnAPI.feature_names(model)

-Return the names of features encountered when fitting or updating some `learner` to obtain
-`model`.
+Where supported, return the names of features encountered when fitting or updating some
+`learner` to obtain `model`.

The returned value is a vector of symbols.

@@ -115,9 +115,9 @@ function feature_names end
 """
     LearnAPI.feature_importances(model)

-Return the learner-specific feature importances of a `model` output by
-[`fit`](@ref)`(learner, ...)` for some `learner`. The value returned has the form of
-an abstract vector of `feature::Symbol => importance::Real` pairs (e.g `[:gender => 0.23,
+Where supported, return the learner-specific feature importances of a `model` output by
+[`fit`](@ref)`(learner, ...)` for some `learner`. The value returned has the form of an
+abstract vector of `feature::Symbol => importance::Real` pairs (e.g. `[:gender => 0.23,
 :height => 0.7, :weight => 0.1]`).

 The `learner` supports feature importances if `:(LearnAPI.feature_importances) in
@@ -247,11 +247,11 @@ function training_losses end
 """
     LearnAPI.out_of_sample_losses(model)

-Return internally computed out-of-sample losses obtained when running `model =
-fit(learner, ...)` for some `learner`, one for each iteration of the algorithm. This will
-be a numeric vector. The metric used to compute the loss is generally learner-specific,
-but may be a user-specifiable learner hyperparameter. Generally, the smaller the loss, the
-better the performance.
+Where supported, return internally computed out-of-sample losses obtained when running
+`model = fit(learner, ...)` for some `learner`, one for each iteration of the
+algorithm. This will be a numeric vector. The metric used to compute the loss is generally
+learner-specific, but may be a user-specifiable learner hyperparameter. Generally, the
+smaller the loss, the better the performance.

 If the learner is not setting aside a separate validation set, then the losses are all
 `Inf`.
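To make the intended usage concrete, a hypothetical inspection of these losses might read (`MyBooster` and its keyword arguments are invented names, not part of LearnAPI.jl):

```julia
learner = MyBooster(nrounds=500, validation_fraction=0.2)  # hypothetical learner
model = fit(learner, data)
losses = LearnAPI.out_of_sample_losses(model)  # one value per iteration
all(isinf, losses) && @warn "no validation set was held out internally"
best_iteration = argmin(losses)
```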
@@ -274,10 +274,10 @@ function out_of_sample_losses end
 """
     LearnAPI.predictions(model)

-Return internally computed predictions on the training data when running `model =
-fit(learner, ...)` for some `learner`. These will be actual target predictions or proxies
-for the target, according to the first value of
-[`LearnAPI.kinds_of_proxy(learner)`](@ref).
+Where supported, return internally computed predictions on the training `data` after
+running `model = fit(learner, data)` for some `learner`. Sematically equivalent to calling
+`LearnAPI.predict(model, X)`, where `X = LearnAPI.features(obs(learner, data))` but
+generally cheaper.

 See also [`fit`](@ref).

@@ -285,11 +285,12 @@ See also [`fit`](@ref).

 Implement for algorithms that internally compute predictions for the training
 data. Predictions for the complete training data must be returned, even if only a subset is
-used for training. Here are use cases:
+internally used for training. Cannot be implemented for static algorithms (algorithms for
+which `fit` consumes no data). Here are some possible use cases:

 - Clustering algorithms that generalize to new data, but by first learning labels for the
   training data (e.g., K-means); use `predictions(model)` to expose these labels
-  to the user so they can avoid a separate `predict` call.
+  to the user so they can avoid the expense of a separate `predict` call.

 - Iterative learners such as neural networks, that need to make in-sample predictions
   to estimate an in-sample loss; use `predictions(model)`
@@ -298,9 +299,8 @@ used for training. Here are use cases:
 - Ensemble learners, such as gradient tree boosting algorithms, may split the training
   data into internal train and validation subsets and can efficiently build up predictions
   on both with an update for each new ensemble member; expose these predictions to the
-  user (for external iteration control, for example) using `predictions(model)`
-  and articulate the actual split used using
-  [`LearnAPI.out_of_sample_indices(model)`](@ref).
+  user (for external iteration control, for example) using `predictions(model)` and
+  articulate the actual split used using [`LearnAPI.out_of_sample_indices(model)`](@ref).

 $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.predictions)")).

@@ -310,10 +310,10 @@ function predictions end
 """
     LearnAPI.out_of_sample_indices(model)

-For a learner implementing [`LearnAPI.predictions`](@ref), return a vector of
+For a learner also implementing [`LearnAPI.predictions`](@ref), return a vector of
 observation indices identifying which part, if any, of `yhat =
-LearnAPI.predictions(model)`, is actually out-of-sample predictions. If the
-learner trained on all data this will be an empty vector.
+LearnAPI.predictions(model)`, is actually out-of-sample predictions. If the learner
+trained on all data this will be an empty vector.

 Here's a sample workflow for some such `learner`, with training data, `(X, y)`, where `y`
 is the training target, here assumed to be a vector.
@@ -339,9 +339,9 @@ function out_of_sample_indices end
 """
     LearnAPI.training_scores(model)

-Return the training scores obtained when running `model = fit(learner, ...)` for some
-`learner`. This will be a numerical vector whose length coincides with the number of
-training observations, and whose interpretation depends on the learner.
+Where supported, return the training scores obtained when running `model = fit(learner,
+...)` for some `learner`.
This will be a numerical vector whose length coincides with the
+number of training observations, and whose interpretation depends on the learner.

 See also [`fit`](@ref).

@@ -360,14 +360,14 @@ function training_scores end
 LearnAPI.components(model)

 For a composite `model`, return the component models (`fit` outputs). These will be in the
-form of a vector of named pairs, `property_name::Symbol => component_model`. Here
-`property_name` is the name of some learner-valued property (hyper-parameter) of
-`learner = LearnAPI.learner(model)`.
+form of a vector of named pairs, `sublearner::Symbol => component_model(s)`, one for each
+`sublearner` in [`LearnAPI.learners(learner)`](@ref), where `learner =
+LearnAPI.learner(model)`. Here `component_model(s)` will be the `fit` output (or vector of
+`fit` outputs) generated internally for the corresponding sublearner.

-A composite model is one for which the corresponding `learner` includes one or more
-learner-valued properties, and for which `LearnAPI.is_composite(learner)` is `true`.
+The `model` is composite if [`LearnAPI.learners(learner)`](@ref) is non-empty.

-See also [`is_composite`](@ref).
+See also [`LearnAPI.learners`](@ref).

 # New implementations

diff --git a/src/clone.jl b/src/clone.jl
index 3f7d478d..d3e6c872 100644
--- a/src/clone.jl
+++ b/src/clone.jl
@@ -1,9 +1,12 @@
 """
+    LearnAPI.clone(learner, replacements...)
     LearnAPI.clone(learner; replacements...)

-Return a shallow copy of `learner` with the specified hyperparameter replacements.
+Return a shallow copy of `learner` with the specified hyperparameter replacements. Two
+syntaxes are supported, as shown in the following examples:

 ```julia
+clone(learner, :epochs => 100, :learning_rate => 0.01)
 clone(learner; epochs=100, learning_rate=0.01)
 ```

 A LearnAPI.jl contract ensures that `LearnAPI.clone(learner) == learner`.

 A new learner implementation does not overload `clone`.

 """
-function clone(learner; replacements...)
-    reps = NamedTuple(replacements)
+function clone(learner, args...; kwargs...)
+    reps = merge(NamedTuple(args), NamedTuple(kwargs))
     names = propertynames(learner)
     rep_names = keys(reps)
-
     new_values = map(names) do name
         name in rep_names && return getproperty(reps, name)
         getproperty(learner, name)
diff --git a/src/fit_update.jl b/src/fit_update.jl
index 56cbe710..015669e7 100644
--- a/src/fit_update.jl
+++ b/src/fit_update.jl
@@ -61,11 +61,11 @@ function fit end
 # # UPDATE AND COUSINS

 """
-    update(model, data; verbosity=LearnAPI.default_verbosity(), hyperparam_replacements...)
+    update(model, data, param_replacements...; verbosity=1)

 Return an updated version of the `model` object returned by a previous [`fit`](@ref) or
-`update` call, but with the specified hyperparameter replacements, in the form `p1=value1,
-p2=value2, ...`.
+`update` call, but with the specified hyperparameter replacements, in the form `p1 =>
+value1, p2 => value2, ...`.

 ```julia
 learner = MyForest(ntrees=100)
@@ -74,7 +74,7 @@ learner = MyForest(ntrees=100)
 model = fit(learner, data)

 # add 50 more trees:
-model = update(model, data; ntrees=150)
+model = update(model, data, :ntrees => 150)
 ```

 Provided that `data` is identical with the data presented in a preceding `fit` call *and*
@@ -91,8 +91,12 @@ See also [`fit`](@ref), [`update_observations`](@ref), [`update_features`](@ref)

 # New implementations

-Implementation is optional. The signature must include
-`verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update)"))
+Implementation is optional.
The signature must include `verbosity`. It should be true that
+`LearnAPI.learner(newmodel) == newlearner`, where `newmodel` is the return value and
+`newlearner = LearnAPI.clone(learner, replacements...)`.
+
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update)"))

 See also [`LearnAPI.clone`](@ref)

 """
 function update end

 """
-    update_observations(
-        model,
-        new_data;
-        parameter_replacements...,
-        verbosity=LearnAPI.default_verbosity(),
-    )
+    update_observations(model, new_data, param_replacements...; verbosity=1)

 Return an updated version of the `model` object returned by a previous [`fit`](@ref) or
 `update` call given the new observations present in `new_data`. One may additionally
-specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`.
+specify hyperparameter replacements in the form `p1 => value1, p2 => value2, ...`.

 ```julia-repl
 learner = MyNeuralNetwork(epochs=10, learning_rate=0.01)

 # train for ten epochs:
 model = fit(learner, data)

 # train for two more epochs using new data and new learning rate:
-model = update_observations(model, new_data; epochs=12, learning_rate=0.1)
+model = update_observations(model, new_data, :epochs => 12, :learning_rate => 0.1)
 ```

 When following the call `fit(learner, data)`, the `update` call is semantically
@@ -132,8 +131,11 @@ See also [`fit`](@ref), [`update`](@ref), [`update_features`](@ref).

 # New implementations

-Implementation is optional. The signature must include
-`verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update_observations)"))
+Implementation is optional. The signature must include `verbosity`. It should be true that
+`LearnAPI.learner(newmodel) == newlearner`, where `newmodel` is the return value and
+`newlearner = LearnAPI.clone(learner, replacements...)`.
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update_observations)"))

 See also [`LearnAPI.clone`](@ref).

 """
 function update_observations end

 """
-    update_features(
-        model,
-        new_data;
-        parameter_replacements...,
-        verbosity=LearnAPI.default_verbosity(),
-    )
+    update_features(model, new_data, param_replacements...; verbosity=1)

 Return an updated version of the `model` object returned by a previous [`fit`](@ref) or
 `update` call given the new features encapsulated in `new_data`. One may additionally
-specify hyperparameter replacements in the form `p1=value1, p2=value2, ...`.
+specify hyperparameter replacements in the form `p1 => value1, p2 => value2, ...`.

 When following the call `fit(learner, data)`, the `update` call is semantically
 equivalent to retraining ab initio using a concatenation of `data` and `new_data`,
@@ -163,8 +161,11 @@ See also [`fit`](@ref), [`update`](@ref), [`update_features`](@ref).

 # New implementations

-Implementation is optional. The signature must include
-`verbosity`. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update_features)"))
+Implementation is optional. The signature must include `verbosity`. It should be true that
+`LearnAPI.learner(newmodel) == newlearner`, where `newmodel` is the return value and
+`newlearner = LearnAPI.clone(learner, replacements...)`.
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.update_features)"))

 See also [`LearnAPI.clone`](@ref).
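The contract stated in each of the docstrings above can be illustrated with a hypothetical warm-restart loop, reusing the `MyForest` example from the `update` docstring (a sketch only):

```julia
learner = MyForest(ntrees=100)
model = fit(learner, data)
for n in [200, 300]
    # grow the forest without retraining ab initio:
    model = update(model, data, :ntrees => n)
    # the updated model reports the correspondingly updated learner:
    @assert LearnAPI.learner(model) == LearnAPI.clone(learner, :ntrees => n)
end
```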
diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl
index 2af85bad..c14f467b 100644
--- a/src/target_weights_features.jl
+++ b/src/target_weights_features.jl
@@ -67,11 +67,11 @@ weights(::Any, observations) = nothing
     LearnAPI.features(learner, observations)

 Return, for every conceivable `observations` returned by a call of the form [`obs(learner,
-data)`](@ref), the "features" part of `data` (as opposed to the target variable, for
-example).
+data)`](@ref), the "features" part of `observations` (as opposed to the target variable,
+for example).

-The returned object `X` may always be passed to `predict` or `transform`, where
-implemented, as in the following sample workflow:
+It must always be possible to pass the returned object `X` to `predict` or `transform`,
+where implemented, as in the following sample workflow:

 ```julia
 observations = obs(learner, data)
diff --git a/src/traits.jl b/src/traits.jl
index e8f93a4f..46004d17 100644
--- a/src/traits.jl
+++ b/src/traits.jl
@@ -60,10 +60,10 @@ function constructor end
 """
     LearnAPI.functions(learner)

-Return a tuple of expressions representing functions that can be meaningfully applied
-with `learner`, or an associated model (object returned by `fit(learner, ...)`, as the
-first argument. Learner traits (methods for which `learner` is the *only* argument)
-are excluded.
+Return a tuple of expressions representing functions that can be meaningfully applied with
+`learner`, or an associated model (object returned by `fit(learner, ...)`), as the first
+argument. Learner traits (methods for which `learner` is the *only* argument) are
+excluded.

 To return actual functions, instead of symbols, use [`@functions`](@ref)`
 learner` instead.
@@ -320,25 +320,25 @@ load_path(::Any) = "unknown"


 """
-    LearnAPI.is_composite(learner)
+    LearnAPI.nonlearners(learner)

-Returns `true` if one or more properties (fields) of `learner` may themselves be
-learners, and `false` otherwise.
+Return the properties of `learner` whose corresponding values are not themselves
+learners.

-See also [`LearnAPI.components`](@ref).
+See also [`LearnAPI.learners`](@ref).

 # New implementations

-This trait should be overloaded if one or more properties (fields) of `learner` may take
-learner values. Fallback return value is `false`. The keyword constructor for such an
-learner need not prescribe defaults for learner-valued properties. Implementation of
-the accessor function [`LearnAPI.components`](@ref) is recommended.
+This trait should be overloaded if one or more properties (fields) of `learner` take
+learner values. The fallback returns `propertynames(learner)`, meaning no properties have
+learner values. If overloaded, implementation of the accessor function
+[`LearnAPI.components`](@ref) is recommended.

 $DOC_ON_TYPE

 """
-is_composite(::Any) = false
+nonlearners(learner) = propertynames(learner)

 """
     LearnAPI.human_name(learner)
@@ -472,6 +472,21 @@ target_observation_scitype(::Any) = Any
 # # DERIVED TRAITS

 name(learner) = split(string(constructor(learner)), ".") |> last
+
+"""
+    LearnAPI.learners(learner)
+
+Return the properties of `learner` whose corresponding values are themselves
+learners.
+
+See also [`LearnAPI.nonlearners`](@ref).
+
+# New implementations
+
+This trait should not be overloaded. Instead overload [`LearnAPI.nonlearners`](@ref).
+ +""" +learners(learner) = setdiff(propertynames(learner), nonlearners(learner)) is_learner(learner) = !isempty(functions(learner)) preferred_kind_of_proxy(learner) = first(kinds_of_proxy(learner)) target(learner) = :(LearnAPI.target) in functions(learner) diff --git a/test/traits.jl b/test/traits.jl index 35d53430..8b0353f3 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -36,10 +36,10 @@ small = SmallLearner() @test isempty(LearnAPI.tags(small)) @test !LearnAPI.is_pure_julia(small) @test LearnAPI.pkg_name(small) == "unknown" +@test isempty(LearnAPI.nonlearners(small)) @test LearnAPI.pkg_license(small) == "unknown" @test LearnAPI.doc_url(small) == "unknown" @test LearnAPI.load_path(small) == "unknown" -@test !LearnAPI.is_composite(small) @test LearnAPI.human_name(small) == "small learner" @test isnothing(LearnAPI.iteration_parameter(small)) @test LearnAPI.data_interface(small) == LearnAPI.RandomAccess() @@ -49,6 +49,7 @@ small = SmallLearner() # DERIVED TRAITS +@test isempty(LearnAPI.learners(small)) @trait SmallLearner kinds_of_proxy=(Point(),) @test LearnAPI.is_learner(small) @test !LearnAPI.is_learner("junk") From d6f8cecb18252a4efc2b8d0f9f92a1f6e2fe1887 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 24 Jan 2025 11:47:39 +1300 Subject: [PATCH 186/187] add DocumenterInterLinks --- docs/Project.toml | 1 + docs/make.jl | 1 + 2 files changed, 2 insertions(+) diff --git a/docs/Project.toml b/docs/Project.toml index 12dcdd5f..4d8ef094 100644 --- a/docs/Project.toml +++ b/docs/Project.toml @@ -1,5 +1,6 @@ [deps] Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" +DocumenterInterLinks = "d12716ef-a0f6-4df4-a9f1-a5a34e75c656" LearnAPI = "92ad9a40-7767-427a-9ee6-6e577f1266cb" MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54" ScientificTypesBase = "30f210dd-8aff-4c5f-94ba-8e64358c1161" diff --git a/docs/make.jl b/docs/make.jl index 11759727..158117cd 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -1,6 +1,7 @@ using Documenter using LearnAPI using ScientificTypesBase +using DocumenterInterLinks const REPO = Remotes.GitHub("JuliaAI", "LearnAPI.jl") From d21c403438b93fa360b7d12fe0e12d3bff6d2b5e Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Fri, 24 Jan 2025 12:02:19 +1300 Subject: [PATCH 187/187] fix typos --- src/accessor_functions.jl | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/accessor_functions.jl b/src/accessor_functions.jl index bb674b00..b5f487b4 100644 --- a/src/accessor_functions.jl +++ b/src/accessor_functions.jl @@ -234,9 +234,9 @@ See also [`fit`](@ref). # New implementations -Implement for iterative algorithms that compute meausures of training performance as part +Implement for iterative algorithms that compute measures of training performance as part of training (e.g. neural networks). Return one value per iteration, in chronological -order, with an optional pre-training intial value. If scores are being computed rather +order, with an optional pre-training initial value. If scores are being computed rather than losses, ensure values are multiplied by -1. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.training_losses)")). @@ -263,7 +263,7 @@ See also [`fit`](@ref). Only implement this method for learners that specifically allow for the supplied training data to be internally split into separate "train" and "validation" subsets, and which additionally compute an out-of-sample loss. Return one value per iteration, in -chronological order, with an optional pre-training intial value. 
If scores are being +chronological order, with an optional pre-training initial value. If scores are being computed rather than losses, ensure values are multiplied by -1. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.out_of_sample_losses)")). @@ -275,7 +275,7 @@ function out_of_sample_losses end LearnAPI.predictions(model) Where supported, return internally computed predictions on the training `data` after -running `model = fit(learner, data)` for some `learner`. Sematically equivalent to calling +running `model = fit(learner, data)` for some `learner`. Semantically equivalent to calling `LearnAPI.predict(model, X)`, where `X = LearnAPI.features(obs(learner, data))` but generally cheaper.