|
| 1 | +# Algorithm Traits |
| 2 | + |
| 3 | +> **Summary.** Traits allow one to promise particular behaviour for an algorithm, such as: |
| 4 | +> *This algorithm supports per-observation weights, which must appear as the third |
| 5 | +> argument of `fit`*, or *This algorithm predicts probability distributions for the |
| 6 | +> target*, or *This algorithm's `transform` method predicts `Real` vectors*. |
| 7 | + |
| 8 | +For any (non-trivial) algorithm, [`LearnAPI.functions`](@ref)`(algorithm)` must be |
| 9 | +overloaded to list the LearnAPI methods that have been explicitly overloaded (algorithm |
| 10 | +traits excluded). Otherwise, overloading traits is optional, except where required by the |
| 11 | +implementation of some LearnAPI method and explicitly documented in that method's |
| 12 | +docstring. |
| 13 | + |
| 14 | +Traits are often called on instances but are usually *defined* on algorithm *types*, as in |
| 15 | + |
| 16 | +```julia |
| 17 | +LearnAPI.is_pure_julia(::Type{<:MyAlgorithmType}) = true |
| 18 | +``` |
| 19 | + |
| 20 | +which has the shorthand |
| 21 | + |
| 22 | +```julia |
| 23 | +@trait MyAlgorithmType is_pure_julia=true |
| 24 | +``` |
| 25 | + |
| 26 | +For convenience, every trait `t` is provided with the fallback implementation |
| 27 | + |
| 28 | +```julia |
| 29 | +t(algorithm) = t(typeof(algorithm)) |
| 30 | +``` |
| 31 | + |
| 32 | +This means `LearnAPI.is_pure_julia(algorithm)` returns `true` whenever `algorithm isa MyAlgorithmType` in the |
| 33 | +above example. |
| 34 | + |
| 35 | +Every trait has a global fallback implementation for `::Type`. |
| 36 | + |
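The dispatch pattern described above can be sketched without loading LearnAPI.jl at all. In this minimal standalone illustration, `is_pure_julia` is a local stand-in for `LearnAPI.is_pure_julia`, and `MyAlgorithmType` with its `lambda` field is a hypothetical algorithm type; the sketch assumes the fallbacks behave as the text describes.

```julia
# Standalone sketch: mimics the trait fallback pattern without LearnAPI.jl.
struct MyAlgorithmType
    lambda::Float64  # hypothetical hyperparameter
end

# Global fallback for types with no explicit overload.
is_pure_julia(::Type) = false

# Instance fallback: delegate to the definition on the type.
is_pure_julia(algorithm) = is_pure_julia(typeof(algorithm))

# Type-level overload for the algorithm type (and any subtypes).
is_pure_julia(::Type{<:MyAlgorithmType}) = true

is_pure_julia(MyAlgorithmType(0.1))  # true, via the instance fallback
is_pure_julia(MyAlgorithmType)       # true, via the type-level overload
is_pure_julia(42)                    # false, via the global fallback
```

Because only the type-level method is overloaded, every instance of `MyAlgorithmType` necessarily reports the same trait value.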
| 37 | +Traits that vary from instance to instance of the same type are disallowed, except in the |
| 38 | +case of composite algorithms (`LearnAPI.is_wrapper(algorithm) == true`), where this is unavoidable. (One |
| 39 | +reason for this is so that one can associate with each algorithm type a unique set of trait-based |
| 40 | +"algorithm metadata" for inclusion in searchable algorithm databases.) This restriction |
| 41 | +occasionally means that an existing algorithm implementation must be split into separate |
| 42 | +LearnAPI implementations (e.g., one for regression and another for classification). |
| 43 | + |
| 44 | +**Ordinary traits** are available for overloading by any new LearnAPI implementation. **Derived |
| 45 | +traits** are not. |
| 46 | + |
| 47 | +## Ordinary Traits |
| 48 | + |
| 49 | +In the examples column of the table below, `Table`, `Continuous` and `Sampleable` are names owned by the |
| 50 | +package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/). |
| 51 | + |
| 52 | +| trait | fallback value | return value | example | |
| 53 | +|:-------------------------------------------------|:----------------------|:--------------|:--------| |
| 54 | +| [`LearnAPI.functions`](@ref)`(algorithm)` | `()` | implemented LearnAPI functions (traits excluded) | `(:fit, :predict)` | |
| 55 | +| [`LearnAPI.predict_proxy`](@ref)`(algorithm)` | `LearnAPI.None()` | form of target proxy output by `predict` | `LearnAPI.Distribution()` | |
| 56 | +| [`LearnAPI.predict_joint_proxy`](@ref)`(algorithm)` | `LearnAPI.None()` | form of target proxy output by `predict_joint` | `LearnAPI.Distribution()` | |
| 57 | +| [`LearnAPI.position_of_target`](@ref)`(algorithm)` | `0` | † the positional index of the **target** in `data` in `fit(..., data...; metadata)` calls | `2` | |
| 58 | +| [`LearnAPI.position_of_weights`](@ref)`(algorithm)` | `0` | † the positional index of **per-observation weights** in `data` in `fit(..., data...; metadata)` calls | `3` | |
| 59 | +| [`LearnAPI.descriptors`](@ref)`(algorithm)` | `()` | lists one or more suggestive algorithm descriptors from `LearnAPI.descriptors()` | `(:classifier, :probabilistic)` | |
| 60 | +| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `false` | is `true` if implementation is 100% Julia code | `true` | |
| 61 | +| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | `"unknown"` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"DecisionTree"` | |
| 62 | +| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | `"unknown"` | name of license of package providing core code | `"MIT"` | |
| 63 | +| [`LearnAPI.doc_url`](@ref)`(algorithm)` | `"unknown"` | url providing documentation of the core code | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` | |
| 64 | +| [`LearnAPI.load_path`](@ref)`(algorithm)` | `"unknown"` | a string indicating where the struct for `typeof(algorithm)` is defined, beginning with name of package providing implementation | `"FastTrees.LearnAPI.DecisionTreeClassifier"` | |
| 65 | +| [`LearnAPI.is_wrapper`](@ref)`(algorithm)` | `false` | is `true` if one or more properties (fields) of `algorithm` may be an algorithm | `true` | |
| 66 | +| [`LearnAPI.human_name`](@ref)`(algorithm)` | type name with spaces | human name for the algorithm; should be a noun | `"elastic net regressor"` | |
| 67 | +| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | `nothing` | symbolic name of an iteration parameter | `:epochs` | |
| 68 | +| [`LearnAPI.fit_keywords`](@ref)`(algorithm)` | `()` | tuple of symbols for keyword arguments accepted by `fit` (corresponding to metadata) | `(:class_weights,)` | |
| 69 | +| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `fit(algorithm, verbosity, data...)`†† | `Tuple{Table(Continuous), AbstractVector{Continuous}}` | |
| 70 | +| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(observation)` for `observation` in `data` and `data` in `fit(algorithm, verbosity, data...)`†† | `Tuple{AbstractVector{Continuous}, Continuous}` | |
| 71 | +| [`LearnAPI.fit_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `fit(algorithm, verbosity, data...)`†† | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` | |
| 72 | +| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(observation)` for `observation` in `data` and `data` in `fit(algorithm, verbosity, data...)`†† | `Tuple{AbstractVector{<:Real}, Real}` | |
| 73 | +| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `predict(algorithm, fitted_params, data...)`†† | `Table(Continuous)` | |
| 74 | +| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm)` | `Any` | upper bound on `scitype(first(predict(algorithm, ...)))` | `AbstractVector{Continuous}` | |
| 75 | +| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `predict(algorithm, fitted_params, data...)`†† | `AbstractMatrix{<:Real}` | |
| 76 | +| [`LearnAPI.predict_output_type`](@ref)`(algorithm)` | `Any` | upper bound on `typeof(first(predict(algorithm, ...)))` | `AbstractVector{<:Real}` | |
| 77 | +| [`LearnAPI.predict_joint_input_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `predict_joint(algorithm, fitted_params, data...)`†† | `Table(Continuous)` | |
| 78 | +| [`LearnAPI.predict_joint_output_scitype`](@ref)`(algorithm)` | `Any` | upper bound on `scitype(first(predict_joint(algorithm, ...)))` | `Sampleable{<:AbstractVector{Continuous}}` | |
| 79 | +| [`LearnAPI.predict_joint_input_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `predict_joint(algorithm, fitted_params, data...)`†† | `AbstractMatrix{<:Real}` | |
| 80 | +| [`LearnAPI.predict_joint_output_type`](@ref)`(algorithm)` | `Any` | upper bound on `typeof(first(predict_joint(algorithm, ...)))` | `Distributions.Sampleable{Distributions.Multivariate,Distributions.Continuous}` | |
| 81 | +| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `transform(algorithm, fitted_params, data...)`†† | `Table(Continuous)` | |
| 82 | +| [`LearnAPI.transform_output_scitype`](@ref)`(algorithm)` | `Any` | upper bound on `scitype(first(transform(algorithm, ...)))` | `Table(Continuous)` | |
| 83 | +| [`LearnAPI.transform_input_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `transform(algorithm, fitted_params, data...)`†† | `AbstractMatrix{<:Real}` | |
| 84 | +| [`LearnAPI.transform_output_type`](@ref)`(algorithm)` | `Any` | upper bound on `typeof(first(transform(algorithm, ...)))` | `AbstractMatrix{<:Real}` | |
| 85 | +| [`LearnAPI.inverse_transform_input_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `inverse_transform(algorithm, fitted_params, data...)`†† | `Table(Continuous)` | |
| 86 | +| [`LearnAPI.inverse_transform_output_scitype`](@ref)`(algorithm)` | `Any` | upper bound on `scitype(first(inverse_transform(algorithm, ...)))` | `Table(Continuous)` | |
| 87 | +| [`LearnAPI.inverse_transform_input_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `inverse_transform(algorithm, fitted_params, data...)`†† | `AbstractMatrix{<:Real}` | |
| 88 | +| [`LearnAPI.inverse_transform_output_type`](@ref)`(algorithm)` | `Any` | upper bound on `typeof(first(inverse_transform(algorithm, ...)))` | `AbstractMatrix{<:Real}` | |
| 89 | + |
| 90 | + |
| 91 | +† If the value is `0`, then the variable in boldface type is not supported and not |
| 92 | +expected to appear in `data`. If `length(data)` is less than the trait value, then `data` |
| 93 | +is understood to exclude the variable, but note that `fit` can have multiple signatures of |
| 94 | +varying lengths, as in `fit(algorithm, verbosity, X, y)` and `fit(algorithm, verbosity, X, y, |
| 95 | +w)`. A non-zero value is a promise that `fit` includes a signature of sufficient length to |
| 96 | +include the variable. |
| 97 | + |
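As an illustration of the † convention, the following sketch shows how calling code might use a position trait value to pick a variable out of `data`. The helper `variable_at` is hypothetical and not part of LearnAPI.jl; the position is passed in directly rather than queried from a real algorithm.

```julia
# Hypothetical helper, not part of LearnAPI.jl: return the variable at a
# trait-declared position, or `nothing` if it is unsupported or absent.
function variable_at(position::Integer, data::Tuple)
    position == 0 && return nothing            # variable not supported
    length(data) < position && return nothing  # this call excludes the variable
    return data[position]
end

X = rand(3, 2)
y = [1.0, 2.0, 3.0]
w = [0.5, 0.25, 0.25]

variable_at(3, (X, y, w))  # returns `w`: weights sit in third position
variable_at(3, (X, y))     # returns `nothing`: shorter signature, no weights
variable_at(0, (X, y, w))  # returns `nothing`: weights not supported
```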
| 98 | +†† Assuming no [optional data interface](@ref data_interface) is implemented. See the trait's |
| 99 | +docstring for the general case. |
| 100 | + |
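To make the "upper bound" wording concrete, here is a small sketch using only Base Julia. It takes the example value shown in the `LearnAPI.fit_type` row and tests candidate data against it with `<:`. How a given implementation or caller actually uses the bound is not specified here, so treat the check as one plausible reading.

```julia
# The example bound from the `LearnAPI.fit_type` row of the table.
fit_type_bound = Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}

X = rand(3, 2)
y = [1.0, 2.0, 3.0]
data = (X, y)

# The type of `data` sits below the declared upper bound.
typeof(data) <: fit_type_bound  # true

# Under the fallback value `Union{}`, no data is promised to be supported.
typeof(data) <: Union{}  # false

# A target of strings falls outside the bound.
labels = ["a", "b", "c"]
typeof((X, labels)) <: fit_type_bound  # false
```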
| 101 | + |
| 102 | +## Derived Traits |
| 103 | + |
| 104 | +The following convenience methods are provided but are not intended for overloading: |
| 105 | + |
| 106 | +| trait | return value | example | |
| 107 | +|:-----------------------------|:------------------------------------------|:--------| |
| 108 | +| `LearnAPI.name(algorithm)` | algorithm type name as string | `"PCA"` | |
| 109 | +| `LearnAPI.is_algorithm(algorithm)` | `true` if `functions(algorithm)` is not empty | `true` | |
| 110 | + |
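A derived trait is computed from other information rather than overloaded. The standalone sketch below mimics the two table entries with local functions. The definitions are assumptions based on the "return value" column, not LearnAPI.jl's actual source, and the `PCA` struct is a hypothetical algorithm type.

```julia
struct PCA end  # hypothetical algorithm type

# Ordinary trait (local stand-in), with the usual fallbacks.
functions(::Type) = ()
functions(algorithm) = functions(typeof(algorithm))
functions(::Type{<:PCA}) = (:fit, :transform)

# Derived traits: computed from the above, never overloaded per algorithm.
is_algorithm(algorithm) = !isempty(functions(algorithm))
name(algorithm) = string(nameof(typeof(algorithm)))

is_algorithm(PCA())  # true
is_algorithm(42)     # false, since `functions` falls back to `()`
name(PCA())          # "PCA"
```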
| 111 | +## Reference |
| 112 | + |
| 113 | +```@docs |
| 114 | +LearnAPI.functions |
| 115 | +LearnAPI.predict_proxy |
| 116 | +LearnAPI.predict_joint_proxy |
| 117 | +LearnAPI.position_of_target |
| 118 | +LearnAPI.position_of_weights |
| 119 | +LearnAPI.descriptors |
| 120 | +LearnAPI.is_pure_julia |
| 121 | +LearnAPI.pkg_name |
| 122 | +LearnAPI.pkg_license |
| 123 | +LearnAPI.doc_url |
| 124 | +LearnAPI.load_path |
| 125 | +LearnAPI.is_wrapper |
| 126 | +LearnAPI.fit_keywords |
| 127 | +LearnAPI.human_name |
| 128 | +LearnAPI.iteration_parameter |
| 129 | +LearnAPI.fit_scitype |
| 130 | +LearnAPI.fit_type |
| 131 | +LearnAPI.fit_observation_scitype |
| 132 | +LearnAPI.fit_observation_type |
| 133 | +LearnAPI.predict_input_scitype |
| 134 | +LearnAPI.predict_output_scitype |
| 135 | +LearnAPI.predict_input_type |
| 136 | +LearnAPI.predict_output_type |
| 137 | +LearnAPI.predict_joint_input_scitype |
| 138 | +LearnAPI.predict_joint_output_scitype |
| 139 | +LearnAPI.predict_joint_input_type |
| 140 | +LearnAPI.predict_joint_output_type |
| 141 | +LearnAPI.transform_input_scitype |
| 142 | +LearnAPI.transform_output_scitype |
| 143 | +LearnAPI.transform_input_type |
| 144 | +LearnAPI.transform_output_type |
| 145 | +LearnAPI.inverse_transform_input_scitype |
| 146 | +LearnAPI.inverse_transform_output_scitype |
| 147 | +LearnAPI.inverse_transform_input_type |
| 148 | +LearnAPI.inverse_transform_output_type |
| 149 | +``` |