Merged
2 changes: 2 additions & 0 deletions Project.toml
@@ -11,6 +11,7 @@ julia = "1.6"

[extras]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
@@ -23,6 +24,7 @@ Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
[targets]
test = [
"DataFrames",
"Distributions",
"LinearAlgebra",
"MLUtils",
"Random",
8 changes: 3 additions & 5 deletions docs/src/common_implementation_patterns.md
@@ -1,8 +1,6 @@
# Common Implementation Patterns

```@raw html
🚧
```
!!! warning

This section is only an implementation guide. The definitive specification of the
Learn API is given in [Reference](@ref reference).
@@ -25,7 +23,7 @@ implementations fall into one (or more) of the following informally understood p

- [Iterative Algorithms](@ref)

- Incremental Algorithms
- [Incremental Algorithms](@ref): Algorithms that can be updated with new observations.

- [Feature Engineering](@ref): Algorithms for selecting or combining features

@@ -48,7 +46,7 @@ implementations fall into one (or more) of the following informally understood p

- Survival Analysis

- Density Estimation: Algorithms that learn a probability distribution
- [Density Estimation](@ref): Algorithms that learn a probability distribution

- Bayesian Algorithms

2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -98,7 +98,7 @@ A key to enabling toolboxes to enhance LearnAPI.jl algorithm functionality is th
implementation of two key additional methods, beyond the usual `fit` and
`predict`/`transform`. Given any training `data` consumed by `fit` (such as `data = (X,
y)` in the example above) [`LearnAPI.features(algorithm, data)`](@ref input) tells us what
part of `data` comprises *features*, which is something that can be passsed onto to
part of `data` comprises *features*, which is something that can be passed onto to
`predict` or `transform` (`X` in the example) while [`LearnAPI.target(algorithm,
data)`](@ref), if implemented, tells us what part comprises the target (`y` in the
example). By explicitly requiring such methods, we free algorithms to consume data in
4 changes: 4 additions & 0 deletions docs/src/patterns/density_estimation.md
@@ -1 +1,5 @@
# Density Estimation

See these examples from tests:

- [normal distribution estimator](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/incremental_algorithms.jl)
5 changes: 5 additions & 0 deletions docs/src/patterns/incremental_algorithms.md
@@ -0,0 +1,5 @@
# Incremental Algorithms

See these examples from tests:

- [normal distribution estimator](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/patterns/incremental_algorithms.jl)
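
The linked test file is not reproduced in this diff. As a rough, self-contained illustration of what "updated with new observations" can mean for a normal distribution estimator, here is a minimal sketch using Welford's running-moments update. The names `NormalFit`, `fit_normal`, `update_normal` and `estimates` are hypothetical and are not part of LearnAPI.jl or its tests.

```julia
# Hypothetical names throughout; this is not the LearnAPI.jl test implementation.
struct NormalFit
    n::Int        # number of observations seen so far
    mean::Float64 # running mean
    m2::Float64   # running sum of squared deviations from the mean
end

# Welford's update: fold new observations into an existing fit
# without revisiting the observations already consumed.
function update_normal(model::NormalFit, ynew)
    n, mean, m2 = model.n, model.mean, model.m2
    for y in ynew
        n += 1
        delta = y - mean
        mean += delta / n
        m2 += delta * (y - mean)
    end
    return NormalFit(n, mean, m2)
end

# Fitting from scratch is just an update of the empty state.
fit_normal(y) = update_normal(NormalFit(0, 0.0, 0.0), y)

# Maximum likelihood (population) estimates of the mean and variance.
estimates(model::NormalFit) = (mean = model.mean, var = model.m2 / model.n)

model = fit_normal([1.0, 2.0, 3.0])
model = update_normal(model, [4.0, 5.0])         # incremental update
batch = fit_normal([1.0, 2.0, 3.0, 4.0, 5.0])    # same data in one pass
```

Under these assumptions, `estimates(model)` and `estimates(batch)` should agree up to floating-point error, which is the property an incremental algorithm is expected to preserve.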
12 changes: 6 additions & 6 deletions src/predict_transform.jl
@@ -4,10 +4,6 @@ function DOC_IMPLEMENTED_METHODS(name; overloaded=false)
"[`LearnAPI.functions`](@ref) trait. "
end

const OPERATIONS = (:predict, :transform, :inverse_transform)
const DOC_OPERATIONS_LIST_SYMBOL = join(map(op -> "`:$op`", OPERATIONS), ", ")
const DOC_OPERATIONS_LIST_FUNCTION = join(map(op -> "`LearnAPI.$op`", OPERATIONS), ", ")

DOC_MUTATION(op) =
"""

@@ -66,6 +62,9 @@ which lists all supported target proxies.

The argument `model` is anything returned by a call of the form `fit(algorithm, ...)`.

If `LearnAPI.features(LearnAPI.algorithm(model)) == nothing`, then argument `data` is
omitted. An example is density estimators.

# Example

In the following, `algorithm` is some supervised learning algorithm with
@@ -105,6 +104,7 @@ $(DOC_DATA_INTERFACE(:predict))

"""
predict(model, data) = predict(model, kinds_of_proxy(algorithm(model)) |> first, data)
predict(model) = predict(model, kinds_of_proxy(algorithm(model)) |> first)

# automatic slurping of multiple data arguments:
predict(model, k::KindOfProxy, data1, data2, datas...; kwargs...) =
@@ -167,8 +167,8 @@ $(DOC_MUTATION(:transform))
$(DOC_DATA_INTERFACE(:transform))

"""
transform(model, data1, data2...; kwargs...) =
transform(model, (data1, datas...); kwargs...) # automatic slurping
transform(model, data1, data2, datas...; kwargs...) =
transform(model, (data1, data2, datas...); kwargs...) # automatic slurping

"""
inverse_transform(model, data)
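
The `transform` slurping change in this file can be read as follows: the earlier body referred to `datas`, a name its signature never bound, while the new signature names `data1`, `data2` and `datas...` and so applies only to calls with at least two data arguments. A minimal sketch of the same pattern, using a hypothetical stand-in `demo_transform` rather than the LearnAPI.jl function:

```julia
# Hypothetical stand-in for `transform`; not the LearnAPI.jl function itself.

# Core method: exactly one data argument. Here it just echoes its inputs.
demo_transform(model, data; kwargs...) = (model, data)

# Slurping method: two or more data arguments are bundled into one tuple
# and forwarded to the core method. Because `data2` is required, this
# method never matches a call with a single data argument.
demo_transform(model, data1, data2, datas...; kwargs...) =
    demo_transform(model, (data1, data2, datas...); kwargs...)

single = demo_transform(:m, 1)        # handled by the core method
pair   = demo_transform(:m, 1, 2)     # slurped into the tuple (1, 2)
triple = demo_transform(:m, 1, 2, 3)  # slurped into the tuple (1, 2, 3)
```

The slurped `predict` methods shown earlier in this file follow the same shape, with the extra `KindOfProxy` argument ahead of the data.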
54 changes: 27 additions & 27 deletions src/types.jl
@@ -22,27 +22,27 @@ See also [`LearnAPI.KindOfProxy`](@ref).
| type | form of an observation |
|:-------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `LearnAPI.Point` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode |
| `LearnAPI.Sampleable` | object that can be sampled to obtain object of the same form as target observation |
| `LearnAPI.Distribution` | explicit probability density/mass function whose sample space is all possible target observations |
| `LearnAPI.LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations |
| `LearnAPI.Probability`¹ | numerical probability or probability vector |
| `LearnAPI.LogProbability`¹ | log-probability or log-probability vector |
| `LearnAPI.Parametric`¹ | a list of parameters (e.g., mean and variance) describing some distribution |
| `LearnAPI.LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering |
| `LearnAPI.LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above |
| `LearnAPI.LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above |
| `LearnAPI.LabelAmbiguousFuzzy` | same as `LabelAmbiguous` but with multiple values of indeterminant number |
| `LearnAPI.Quantile`² | same as target but with quantile interpretation |
| `LearnAPI.Expectile`² | same as target but with expectile interpretation |
| `LearnAPI.ConfidenceInterval`² | confidence interval |
| `LearnAPI.Fuzzy` | finite but possibly varying number of target observations |
| `LearnAPI.ProbabilisticFuzzy` | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one) |
| `LearnAPI.SurvivalFunction` | survival function |
| `LearnAPI.SurvivalDistribution` | probability distribution for survival time |
| `LearnAPI.SurvivalHazardFunction` | hazard function for survival time |
| `LearnAPI.OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) |
| `LearnAPI.Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) |
| `Point` | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode |
| `Sampleable` | object that can be sampled to obtain object of the same form as target observation |
| `Distribution` | explicit probability density/mass function whose sample space is all possible target observations |
| `LogDistribution` | explicit log-probability density/mass function whose sample space is possible target observations |
| `Probability`¹ | numerical probability or probability vector |
| `LogProbability`¹ | log-probability or log-probability vector |
| `Parametric`¹ | a list of parameters (e.g., mean and variance) describing some distribution |
| `LabelAmbiguous` | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering |
| `LabelAmbiguousSampleable` | sampleable version of `LabelAmbiguous`; see `Sampleable` above |
| `LabelAmbiguousDistribution` | pdf/pmf version of `LabelAmbiguous`; see `Distribution` above |
| `LabelAmbiguousFuzzy` | same as `LabelAmbiguous` but with multiple values of indeterminant number |
| `Quantile`² | same as target but with quantile interpretation |
| `Expectile`² | same as target but with expectile interpretation |
| `ConfidenceInterval`² | confidence interval |
| `Fuzzy` | finite but possibly varying number of target observations |
| `ProbabilisticFuzzy` | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one) |
| `SurvivalFunction` | survival function |
| `SurvivalDistribution` | probability distribution for survival time |
| `SurvivalHazardFunction` | hazard function for survival time |
| `OutlierScore` | numerical score reflecting degree of outlierness (not necessarily normalized) |
| `Continuous` | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls) |
¹Provided for completeness but discouraged to avoid [ambiguities in
representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation).
@@ -86,9 +86,9 @@ space ``Y^n``, where ``Y`` is the space from which the target variable takes its
| type `T` | form of output of `predict(model, ::T, data)` |
|:-------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `LearnAPI.JointSampleable` | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. |
| `LearnAPI.JointDistribution` | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` |
| `LearnAPI.JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` |
| `JointSampleable` | object that can be sampled to obtain a *vector* whose elements have the form of target observations; the vector length matches the number of observations in `data`. |
| `JointDistribution` | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` |
| `JointLogDistribution` | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in `data` |
"""
abstract type Joint <: KindOfProxy end
@@ -108,9 +108,9 @@ single object representing a probability distribution.
| type `T` | form of output of `predict(model, ::T)` |
|:--------------------------------:|:-----------------------------------------------------------------------|
| `LearnAPI.SingleSampleable` | object that can be sampled to obtain a single target observation |
| `LearnAPI.SingleDistribution` | explicit probability density/mass function for sampling the target |
| `LearnAPI.SingleLogDistribution` | explicit log-probability density/mass function for sampling the target |
| `SingleSampleable` | object that can be sampled to obtain a single target observation |
| `SingleDistribution` | explicit probability density/mass function for sampling the target |
| `SingleLogDistribution` | explicit log-probability density/mass function for sampling the target |
"""
abstract type Single <: KindOfProxy end
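
To make the `predict(model, ::T)` form in the `Single` table concrete, and to show how it combines with the new dataless fallback `predict(model)` added in `src/predict_transform.jl`, here is a minimal sketch built on stand-in types. Every name prefixed with `Demo` or `demo_` is hypothetical; the real LearnAPI.jl types and traits are not redefined or called here.

```julia
# All names below are hypothetical stand-ins, not LearnAPI.jl definitions.
abstract type DemoKindOfProxy end
abstract type DemoSingle <: DemoKindOfProxy end
struct DemoSingleDistribution <: DemoSingle end
struct DemoSingleSampleable <: DemoSingle end

# A fitted density model: it has no features, so prediction takes no data.
struct DemoDensityModel
    mean::Float64
    std::Float64
end

# Kinds of proxy the model supports, preferred kind first.
demo_kinds_of_proxy(::DemoDensityModel) =
    (DemoSingleDistribution(), DemoSingleSampleable())

# Explicit density: returns the normal pdf as a function of one observation.
demo_predict(model::DemoDensityModel, ::DemoSingleDistribution) =
    y -> exp(-((y - model.mean) / model.std)^2 / 2) / (model.std * sqrt(2π))

# Sampleable: returns a zero-argument function drawing one observation.
demo_predict(model::DemoDensityModel, ::DemoSingleSampleable) =
    () -> model.mean + model.std * randn()

# Fallback mirroring the diff: with no kind given, use the first supported one.
demo_predict(model) = demo_predict(model, first(demo_kinds_of_proxy(model)))

model = DemoDensityModel(0.0, 1.0)
pdf = demo_predict(model)                           # falls back to the distribution kind
draw = demo_predict(model, DemoSingleSampleable())  # explicit kind
```

Dispatching on the kind-of-proxy instance keeps each output form in its own method, and the ordering returned by `demo_kinds_of_proxy` decides what the dataless call returns by default.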