Commit 84ef5fc

update readme
1 parent d59fb98 commit 84ef5fc

8 files changed: +117 -27 lines changed

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
MIT License
22

3-
MIT License Copyright (c) 2021 - JuliaAI
3+
MIT License Copyright (c) 2024 - Anthony Blaom
44

55
Permission is hereby granted, free of charge, to any person obtaining a copy
66
of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 38 additions & 16 deletions
@@ -2,26 +2,48 @@
22

33
A base Julia interface for machine learning and statistics
44

5+
[![Lifecycle:Maturing](https://img.shields.io/badge/Lifecycle-Maturing-007EC6)](ROADMAP.md)
6+
[![Build Status](https://github.com/JuliaAI/LearnAPI.jl/workflows/CI/badge.svg)](https://github.com/JuliaAI/LearnAPI.jl/actions)
7+
[![Coverage](https://codecov.io/gh/JuliaAI/LearnAPI.jl/branch/master/graph/badge.svg)](https://codecov.io/github/JuliaAI/LearnAPI.jl?branch=master)
8+
[![Docs](https://img.shields.io/badge/docs-dev-blue.svg)](https://juliaai.github.io/LearnAPI.jl/dev/)
59

6-
**Devlopement Status:**
10+
Comprehensive documentation is [here](https://juliaai.github.io/LearnAPI.jl/dev/).
711

8-
- [X] Detailed proposal stage ([this
9-
documentation](https://juliaai.github.io/LearnAPI.jl/dev/)).
10-
- [X] Initial feedback stage (opened mid-January, 2023). General feedback can be provided at [this Julia Discourse thread](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048/20).
11-
- [ ] Proof of concept implementation
12-
- [ ] Polish
13-
- [ ] **Register 0.2.0**
12+
New contributions welcome. See the [road map](ROADMAP.md).
1413

15-
You can join a discussion on the LearnAPI proposal at [this](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048) Julia Discourse thread.
14+
## Code snippet
1615

17-
To do:
16+
Configure a learning algorithm, and inspect available functionality:
1817

19-
- [ ] ~~Add methods to create/save persistent representation of learned parameters~~
20-
- [X] Add more repo tests
21-
- [ ] Add methods to test an implementation
22-
- [ ] Add user guide ("Common Implementation Patterns" section of manual)
18+
```julia
19+
julia> algorithm = Ridge(lambda=0.1)
20+
julia> LearnAPI.functions(algorithm)
21+
(:(LearnAPI.fit), :(LearnAPI.algorithm), :(LearnAPI.minimize), :(LearnAPI.obs),
22+
:(LearnAPI.features), :(LearnAPI.target), :(LearnAPI.predict), :(LearnAPI.coefficients))
23+
```
2324

24-
[![Build Status](https://github.com/JuliaAI/LearnAPI.jl/workflows/CI/badge.svg)](https://github.com/JuliaAI/LearnAPI.jl/actions)
25-
[![Coverage](https://codecov.io/gh/JuliaAI/LearnAPI.jl/branch/master/graph/badge.svg)](https://codecov.io/github/JuliaAI/LearnAPI.jl?branch=master)
26-
[![Docs](https://img.shields.io/badge/docs-dev-blue.svg)](https://juliaai.github.io/LearnAPI.jl/dev/)
25+
Train:
26+
27+
```julia
28+
julia> model = fit(algorithm, data)
29+
```
30+
31+
Predict:
32+
33+
```julia
34+
julia> predict(model, data)[1]
35+
"setosa"
36+
```
37+
38+
Predict a probability distribution ([proxy](https://juliaai.github.io/LearnAPI.jl/dev/kinds_of_target_proxy/#proxy_types) for the target):
39+
40+
```julia
41+
julia> predict(model, Distribution(), data)[1]
42+
UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.25, virginica=>0.75)
43+
```
44+
45+
## Credits
46+
47+
Created by Anthony Blaom, in cooperation with [members of the Julia
48+
community](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048).
2749
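
The `fit`/`predict` workflow shown in the README snippet can be sketched without LearnAPI.jl at all. What follows is a minimal, self-contained illustration, not the package's implementation: the `Ridge` struct, the `RidgeFitted` wrapper, and the features-by-observations layout of `X` are assumptions made for this sketch, and `fit` and `predict` are plain local functions rather than the LearnAPI.jl methods.

```julia
using LinearAlgebra  # provides the identity `I`

# Hypothetical algorithm type: holds hyperparameters only, no learned state.
struct Ridge
    lambda::Float64
end
Ridge(; lambda=0.1) = Ridge(lambda)  # keyword constructor, as in `Ridge(lambda=0.1)`

# Hypothetical output of `fit`: the algorithm plus what was learned.
struct RidgeFitted
    algorithm::Ridge
    coefficients::Vector{Float64}
end

# Train: solve the regularized normal equations (A'A + lambda*I) w = A'y,
# where A has one row per observation.
function fit(algorithm::Ridge, data)
    X, y = data  # assumed layout: X is features × observations
    A = X'
    coefficients = (A' * A + algorithm.lambda * I) \ (A' * y)
    return RidgeFitted(algorithm, coefficients)
end

# Predict: one prediction per column of `Xnew`.
predict(model::RidgeFitted, Xnew) = Xnew' * model.coefficients

X = [1.0 2.0 3.0]          # one feature, three observations
y = [2.0, 4.0, 6.0]
model = fit(Ridge(lambda=0.1), (X, y))
predict(model, [4.0 5.0])  # predictions for two new observations
```

Keeping hyperparameters in `Ridge` and learned parameters in a separate fitted object mirrors the split between `algorithm` and `model` in the snippet above.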

ROADMAP.md

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
# Road map
2+
3+
- [ ] Mock up a challenging `update` use-case: controlling an iterative algorithm that
4+
wants, for efficiency, to internally compute the out-of-sample predictions that will
5+
be used for *externally* determined early stopping. cc: @jeremiedb
6+
7+
- [ ] Get code coverage to 100% (see next item)
8+
9+
- [ ] Add to this repo or a utility repo methods to test a valid implementation of
10+
LearnAPI.jl
11+
12+
- [ ] Flesh out "Common Implementation Patterns". The current plan is to mock up example
13+
implementations, and add them as LearnAPI.jl tests, with links to the test file from
14+
"Common Implementation Patterns". As real-world implementations roll out, we could
15+
increasingly point to those instead, to conserve effort
16+
- [x] regression
17+
- [ ] classification
18+
- [ ] clustering
19+
- [ ] gradient descent
20+
- [ ] iterative algorithms
21+
- [ ] incremental algorithms
22+
- [ ] dimension reduction
23+
- [x] feature engineering
24+
- [x] static algorithms
25+
- [ ] missing value imputation
26+
- [ ] transformers
27+
- [ ] ensemble algorithms
28+
- [ ] time series forecasting
29+
- [ ] time series classification
30+
- [ ] survival analysis
31+
- [ ] density estimation
32+
- [ ] Bayesian algorithms
33+
- [ ] outlier detection
34+
- [ ] collaborative filtering
35+
- [ ] text analysis
36+
- [ ] audio analysis
37+
- [ ] natural language processing
38+
- [ ] image processing
39+
- [ ] meta-algorithms
40+
41+
- [ ] In a utility package provide:
42+
- [ ] Method to clone an algorithm with user-specified property (hyperparameter)
43+
changes, as in `LearnAPI.clone(algorithm, p1=value1, p2=value2, ...)` (since
44+
`algorithm` can have any type, can't really overload `Base.replace` without
45+
piracy). This will be needed in tuning meta-algorithms. Or should this be in
46+
LearnAPI.jl proper, to expose it to all users?
47+
- [ ] Methods to facilitate common-use case data interfaces: support simultaneously
48+
`fit` data of the form `data = (X, y)` where `X` is a table *or* matrix, and
49+
`data` a table with target specified by hyperparameter; here `obs` will return a
50+
thin wrapping of the matrix of `X`, the target `y`, and the names of all
51+
fields. We can have options to make `X` a concrete array or an adjoint,
52+
depending on what is more efficient for the algorithm.
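
The proposed clone method in the road map could look roughly like the sketch below. This is only one possible shape for it, not an existing LearnAPI.jl function: `clone` is defined locally, `SketchRidge` is a hypothetical algorithm type, and the approach assumes the algorithm's type is non-parametric and has a keyword constructor accepting every property name (here supplied by `Base.@kwdef`).

```julia
# Hypothetical algorithm type; `Base.@kwdef` supplies the keyword constructor
# that `clone` below relies on.
Base.@kwdef struct SketchRidge
    lambda::Float64 = 0.1
    fit_intercept::Bool = true
end

# Return a copy of `algorithm` with the given properties replaced.
# Assumption: `typeof(algorithm)` can be called with all properties as keywords.
function clone(algorithm; replacements...)
    names = propertynames(algorithm)
    current = NamedTuple{names}(map(name -> getproperty(algorithm, name), names))
    return typeof(algorithm)(; merge(current, values(replacements))...)
end

algorithm = SketchRidge()
tuned = clone(algorithm, lambda=0.5)  # `algorithm` itself is left unchanged
```

Because the function is a new generic rather than a method of `Base.replace`, it avoids the type piracy concern raised in the road map item.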

docs/src/anatomy_of_an_implementation.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -24,11 +24,11 @@ A transformer ordinarily implements `transform` instead of
2424
the MLUtils.jl `getobs`/`numobs` interface,
2525
then an implementation must: (i) overload [`obs`](@ref) to articulate how
2626
provided data can be transformed into a form that does support
27-
this interface, as illustrated below under
28-
[Providing an advanced data interface](@ref), and which may additionally
29-
enable certain performance benefits; or (ii) overload the trait
27+
this interface, as illustrated below under
28+
[Providing an advanced data interface](@ref), and which may additionally
29+
enable certain performance benefits; or (ii) overload the trait
3030
[`LearnAPI.data_interface`](@ref) to specify a more relaxed data
31-
API.
31+
API.
3232

3333
The first line below imports the lightweight package LearnAPI.jl whose methods we will be
3434
extending. The second imports libraries needed for the core algorithm.
@@ -503,5 +503,5 @@ declaration.
503503
⁴ The `data = (X, y)` pattern implemented here is not the only supported pattern. For
504504
example, `data` might be a single table containing both features and target variable. In
505505
this case, it will be necessary to overload [`LearnAPI.features`](@ref) in addition to
506-
[`LearnAPI.target`](@ref); the name of the target column would need to be a hyperparameter
507-
or `fit` keyword argument.
506+
[`LearnAPI.target`](@ref); the name of the target column would need to be a
507+
hyperparameter.
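
The single-table pattern described in footnote ⁴ can be illustrated with a small sketch. Everything here is an assumption made for illustration: a "table" is taken to be a named tuple of column vectors, `TableRegressor` is a hypothetical algorithm type, and `features` and `target` are local stand-ins with an assumed `(algorithm, data)` signature, not the actual `LearnAPI.features` and `LearnAPI.target` methods.

```julia
# Hypothetical algorithm whose target column name is a hyperparameter.
struct TableRegressor
    target::Symbol
end

# Stand-in for `LearnAPI.target`: pull out the target column.
target(algorithm::TableRegressor, data) = getproperty(data, algorithm.target)

# Stand-in for `LearnAPI.features`: keep every column except the target.
function features(algorithm::TableRegressor, data)
    keep = filter(name -> name != algorithm.target, keys(data))
    return NamedTuple{keep}(data)
end

data = (x1 = [1.0, 2.0], x2 = [3.0, 4.0], y = [0.0, 1.0])
algorithm = TableRegressor(:y)
target(algorithm, data)          # the `y` column
keys(features(algorithm, data))  # the remaining column names
```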

docs/src/common_implementation_patterns.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,8 +34,13 @@ implementations fall into one (or more) of the following informally understood p
3434

3535
- [Dimension Reduction](@ref): Transformers that learn to reduce feature space dimension
3636

37+
- [Feature Engineering](@ref)
38+
3739
- [Missing Value Imputation](@ref): Transformers that replace missing values.
3840

41+
- [Transformers](@ref): Other transformers, such as standardizers, and categorical
42+
encoders.
43+
3944
- [Clusterering](@ref): Algorithms that group data into clusters for classification and
4045
possibly dimension reduction. May be true learners (generalize to new data) or static.
4146

@@ -53,3 +58,5 @@ implementations fall into one (or more) of the following informally understood p
5358

5459
- [Survival Analysis](@ref)
5560

61+
- [Meta-algorithms](@ref)
62+
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
# Feature Engineering
2+
3+
- For a simple feature selection algorithm (no "learning") see [these
4+
examples](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/integration/static_algorithms.jl)
5+
from tests.
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
# Meta-algorithms

src/traits.jl

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -148,8 +148,9 @@ See also [`LearnAPI.predict`](@ref), [`LearnAPI.KindOfProxy`](@ref).
148148
149149
Must be overloaded whenever `predict` is implemented.
150150
151-
Elements of the returned tuple must be one of the following, described further in
152-
LearnAPI.jl documentation: $CONCRETE_TARGET_PROXY_TYPES_LIST.
151+
Elements of the returned tuple must be instances of types in the return value of
152+
`LearnAPI.kinds_of_proxy()`, i.e., one of the following, described further in LearnAPI.jl
153+
documentation: $CONCRETE_TARGET_PROXY_TYPES_LIST.
153154
154155
Suppose, for example, we have the following implementation of a supervised learner
155156
returning only probabilistic predictions:
@@ -170,6 +171,8 @@ For more on target variables and target proxies, refer to the LearnAPI documenta
170171
171172
"""
172173
kinds_of_proxy(::Any) = ()
174+
kinds_of_proxy() = CONCRETE_TARGET_PROXY_TYPES
175+
173176

174177
tags() = [
175178
"regression",
@@ -179,12 +182,11 @@ tags() = [
179182
"iterative algorithms",
180183
"incremental algorithms",
181184
"dimension reduction",
182-
"encoders",
185+
"transformers",
183186
"feature engineering",
184187
"static algorithms",
185188
"missing value imputation",
186189
"ensemble algorithms",
187-
"wrappers",
188190
"time series forecasting",
189191
"time series classification",
190192
"survival analysis",
@@ -196,6 +198,7 @@ tags() = [
196198
"audio analysis",
197199
"natural language processing",
198200
"image processing",
201+
"meta-algorithms"
199202
]
200203

201204
const DOC_TAGS_LIST = join(map(d -> "`\"$d\"`", tags()), ", ")
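
The new zero-argument `kinds_of_proxy()` method makes the docstring's requirement checkable: every element returned by `kinds_of_proxy(algorithm)` should be an instance of one of the listed types. The sketch below mocks this up in isolation, under stated assumptions: `PointProxy`, `MyClassifier`, `BadClassifier` and `has_valid_proxies` are hypothetical names, the two-element `CONCRETE_TARGET_PROXY_TYPES` tuple is a stand-in for the real (longer) list, and whether the real constant is a tuple or a vector is not shown in this diff.

```julia
# Stand-ins for LearnAPI's proxy types; the real package defines more.
abstract type KindOfProxy end
struct PointProxy <: KindOfProxy end    # hypothetical name
struct Distribution <: KindOfProxy end  # name taken from the README snippet

const CONCRETE_TARGET_PROXY_TYPES = (PointProxy, Distribution)

kinds_of_proxy(::Any) = ()                      # fallback: no `predict` implemented
kinds_of_proxy() = CONCRETE_TARGET_PROXY_TYPES  # all admissible proxy types

# Hypothetical learner returning only probabilistic predictions.
struct MyClassifier end
kinds_of_proxy(::MyClassifier) = (Distribution(),)

# Hypothetical learner with an invalid trait declaration.
struct BadClassifier end
kinds_of_proxy(::BadClassifier) = (:not_a_proxy,)

# Check that each declared proxy is an instance of an admissible type.
has_valid_proxies(algorithm) = all(kinds_of_proxy(algorithm)) do proxy
    any(T -> proxy isa T, kinds_of_proxy())
end

has_valid_proxies(MyClassifier())
has_valid_proxies(BadClassifier())
```

A check of this kind is the sort of thing an implementation-testing utility, as mentioned in the road map, could run against the trait.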
