@@ -10,19 +10,19 @@ For a transformer, implementations ordinarily implement `transform` instead of

!!! important

-	The core implementations of `fit`, `predict`, etc.,
-	always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`.
-	Calls like `fit(algorithm, X, y)` are provided as additional convenience methods.
+    The core implementations of `fit`, `predict`, etc.,
+    always have a *single* `data` argument, as in `fit(algorithm, data; verbosity=1)`.
+    Calls like `fit(algorithm, X, y)` are provided as additional convenience methods.

!!! note

-	If the `data` object consumed by `fit`, `predict`, or `transform` is not
-	a suitable table¹, array³, tuple of tables and arrays, or some
-	other object implementing
-	the MLUtils.jl `getobs`/`numobs` interface,
-	then an implementation must: (i) suitably overload the trait
-	[`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as
-	illustrated below under [Providing an advanced data interface](@ref).
+    If the `data` object consumed by `fit`, `predict`, or `transform` is not
+    a suitable table¹, array³, tuple of tables and arrays, or some
+    other object implementing
+    the MLUtils.jl `getobs`/`numobs` interface,
+    then an implementation must: (i) suitably overload the trait
+    [`LearnAPI.data_interface`](@ref); and/or (ii) overload [`obs`](@ref), as
+    illustrated below under [Providing an advanced data interface](@ref).

The first line below imports the lightweight package LearnAPI.jl whose methods we will be
extending. The second imports libraries needed for the core algorithm.
@@ -39,7 +39,7 @@ Here's a new type whose instances specify ridge regression parameters:

```@example anatomy
struct Ridge{T<:Real}
-	lambda::T
+    lambda::T
end
nothing # hide
```
@@ -63,7 +63,7 @@ changed to `0.05`.

## Implementing `fit`

-A ridge regressor requires two types of data for training: *input features* `X`, which
+A ridge regressor requires two types of data for training: input features `X`, which
here we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a
vector.
@@ -72,9 +72,9 @@ coefficients labelled by feature name for inspection after training:

```@example anatomy
struct RidgeFitted{T,F}
-	algorithm::Ridge
-	coefficients::Vector{T}
-	named_coefficients::F
+    algorithm::Ridge
+    coefficients::Vector{T}
+    named_coefficients::F
end
nothing # hide
```
@@ -87,25 +87,25 @@ The core implementation of `fit` looks like this:

```@example anatomy
function LearnAPI.fit(algorithm::Ridge, data; verbosity=1)

-	X, y = data
+    X, y = data

-	# data preprocessing:
-	table = Tables.columntable(X)
-	names = Tables.columnnames(table) |> collect
-	A = Tables.matrix(table, transpose=true)
+    # data preprocessing:
+    table = Tables.columntable(X)
+    names = Tables.columnnames(table) |> collect
+    A = Tables.matrix(table, transpose=true)

-	lambda = algorithm.lambda
+    lambda = algorithm.lambda

-	# apply core algorithm:
-	coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector
+    # apply core algorithm:
+    coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector

-	# determine named coefficients:
-	named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]
+    # determine named coefficients:
+    named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]

-	# make some noise, if allowed:
-	verbosity > 0 && @info "Coefficients: $named_coefficients"
+    # make some noise, if allowed:
+    verbosity > 0 && @info "Coefficients: $named_coefficients"

-	return RidgeFitted(algorithm, coefficients, named_coefficients)
+    return RidgeFitted(algorithm, coefficients, named_coefficients)
end
```
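As a quick aside, the core computation in `fit` above can be sanity-checked on a tiny synthetic problem. The sketch below is independent of LearnAPI and uses hypothetical data; with `lambda = 0.0` the formula reduces to ordinary least squares:

```julia
using LinearAlgebra

# A is the p × n matrix of features, observations as columns, as in `fit` above:
A = [1.0 2.0 3.0 4.0;   # feature 1 across n = 4 observations
     1.0 1.0 1.0 1.0]   # feature 2: a constant, playing the role of an intercept
y = [3.0, 5.0, 7.0, 9.0]  # exactly 2*x1 + 1

lambda = 0.0  # no regularization, so we recover the exact least-squares solution
coefficients = (A*A' + lambda*I)\(A*y)  # 2-element vector
# coefficients ≈ [2.0, 1.0]
```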
@@ -127,7 +127,7 @@ Here's the implementation for our ridge regressor:

```@example anatomy
LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) =
-	Tables.matrix(Xnew)*model.coefficients
+    Tables.matrix(Xnew)*model.coefficients
```

## Accessor functions
@@ -156,7 +156,7 @@ overload it to dump the named version of the coefficients:

```@example anatomy
LearnAPI.minimize(model::RidgeFitted) =
-	RidgeFitted(model.algorithm, model.coefficients, nothing)
+    RidgeFitted(model.algorithm, model.coefficients, nothing)
```

Crucially, we can still use `LearnAPI.minimize(model)` in place of `model` to make new
@@ -187,19 +187,19 @@ The macro can be used to specify multiple traits simultaneously:

```@example anatomy
@trait(
-	Ridge,
-	constructor = Ridge,
-	target = true,
-	kinds_of_proxy = (LiteralTarget(),),
-	descriptors = (:regression,),
-	functions = (
-		fit,
-		minimize,
-		predict,
-		obs,
-		LearnAPI.algorithm,
-		LearnAPI.coefficients,
-	)
+    Ridge,
+    constructor = Ridge,
+    target = true,
+    kinds_of_proxy = (LiteralTarget(),),
+    descriptors = (:regression,),
+    functions = (
+        fit,
+        minimize,
+        predict,
+        obs,
+        LearnAPI.algorithm,
+        LearnAPI.coefficients,
+    )
)
nothing # hide
```
@@ -230,10 +230,10 @@ enabling the kind of workflow previewed in [Sample workflow](@ref):

```@example anatomy
LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) =
-	fit(algorithm, (X, y); kwargs...)
+    fit(algorithm, (X, y); kwargs...)

LearnAPI.predict(model::RidgeFitted, Xnew) =
-	predict(model, LiteralTarget(), Xnew)
+    predict(model, LiteralTarget(), Xnew)
```

## [Demonstration](@id workflow)
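The convenience methods above are plain multiple dispatch: a multi-argument method delegating to the core single-`data` method. A minimal self-contained sketch of the same pattern, using toy names (not LearnAPI's):

```julia
# Toy illustration of the convenience-method pattern; `ToyAlgo` and `corefit`
# are hypothetical names invented for this sketch:
struct ToyAlgo end

# core method: a *single* data argument, here just echoed back
corefit(a::ToyAlgo, data; verbosity=1) = (data[1], data[2])

# convenience method: bundles X and y into a tuple and delegates to the core
corefit(a::ToyAlgo, X, y; kwargs...) = corefit(a, (X, y); kwargs...)

X, y = [1, 2, 3], [4, 5, 6]
result = corefit(ToyAlgo(), X, y)  # same as corefit(ToyAlgo(), (X, y))
```

Both call forms reach the same core implementation, so documentation and tests need only target the single-`data` method.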
@@ -292,40 +292,40 @@ using LearnAPI
using LinearAlgebra, Tables

struct Ridge{T<:Real}
-	lambda::T
+    lambda::T
end

Ridge(; lambda=0.1) = Ridge(lambda)

struct RidgeFitted{T,F}
-	algorithm::Ridge
-	coefficients::Vector{T}
-	named_coefficients::F
+    algorithm::Ridge
+    coefficients::Vector{T}
+    named_coefficients::F
end

LearnAPI.algorithm(model::RidgeFitted) = model.algorithm
LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients
LearnAPI.minimize(model::RidgeFitted) =
-	RidgeFitted(model.algorithm, model.coefficients, nothing)
+    RidgeFitted(model.algorithm, model.coefficients, nothing)

LearnAPI.fit(algorithm::Ridge, X, y; kwargs...) =
-	fit(algorithm, (X, y); kwargs...)
+    fit(algorithm, (X, y); kwargs...)
LearnAPI.predict(model::RidgeFitted, Xnew) = predict(model, LiteralTarget(), Xnew)

@trait(
-	Ridge,
-	constructor = Ridge,
-	target = true,
-	kinds_of_proxy = (LiteralTarget(),),
-	descriptors = (:regression,),
-	functions = (
-		fit,
-		minimize,
-		predict,
-		obs,
-		LearnAPI.algorithm,
-		LearnAPI.coefficients,
-	)
+    Ridge,
+    constructor = Ridge,
+    target = true,
+    kinds_of_proxy = (LiteralTarget(),),
+    descriptors = (:regression,),
+    functions = (
+        fit,
+        minimize,
+        predict,
+        obs,
+        LearnAPI.algorithm,
+        LearnAPI.coefficients,
+    )
)

n = 10 # number of observations
@@ -344,20 +344,20 @@ new type:

```@example anatomy2
struct RidgeFitObs{T,M<:AbstractMatrix{T}}
-	A::M # p x n
-	names::Vector{Symbol} # features
-	y::Vector{T} # target
+    A::M # p x n
+    names::Vector{Symbol} # features
+    y::Vector{T} # target
end
```

Now we overload `obs` to carry out the data pre-processing previously in `fit`, like this:

```@example anatomy2
function LearnAPI.obs(::Ridge, data)
-	X, y = data
-	table = Tables.columntable(X)
-	names = Tables.columnnames(table) |> collect
-	return RidgeFitObs(Tables.matrix(table)', names, y)
+    X, y = data
+    table = Tables.columntable(X)
+    names = Tables.columnnames(table) |> collect
+    return RidgeFitObs(Tables.matrix(table)', names, y)
end
```
@@ -369,27 +369,27 @@ methods - one to handle "regular" input, and one to handle the pre-processed data:

```@example anatomy2
function LearnAPI.fit(algorithm::Ridge, observations::RidgeFitObs; verbosity=1)

-	lambda = algorithm.lambda
+    lambda = algorithm.lambda

-	A = observations.A
-	names = observations.names
-	y = observations.y
+    A = observations.A
+    names = observations.names
+    y = observations.y

-	# apply core algorithm:
-	coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector
+    # apply core algorithm:
+    coefficients = (A*A' + algorithm.lambda*I)\(A*y) # vector

-	# determine named coefficients:
-	named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]
+    # determine named coefficients:
+    named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]

-	# make some noise, if allowed:
-	verbosity > 0 && @info "Coefficients: $named_coefficients"
+    # make some noise, if allowed:
+    verbosity > 0 && @info "Coefficients: $named_coefficients"

-	return RidgeFitted(algorithm, coefficients, named_coefficients)
+    return RidgeFitted(algorithm, coefficients, named_coefficients)

end

LearnAPI.fit(algorithm::Ridge, data; kwargs...) =
-	fit(algorithm, obs(algorithm, data); kwargs...)
+    fit(algorithm, obs(algorithm, data); kwargs...)
```

We provide an overloading of `LearnAPI.target` to handle the additional supported data
@@ -409,7 +409,7 @@ accessing individual observations*. It usually suffices to overload `Base.getindex`

```@example anatomy2
Base.getindex(data::RidgeFitObs, I) =
-	RidgeFitObs(data.A[:,I], data.names, data.y[I])
+    RidgeFitObs(data.A[:,I], data.names, data.y[I])
Base.length(data::RidgeFitObs) = length(data.y)
```
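The `getindex`/`length` contract above is what enables observation subsampling (as in MLUtils.jl's `getobs`/`numobs`). A self-contained sketch with a toy container (not the tutorial's `RidgeFitObs`):

```julia
# Toy data container; observations are the *columns* of A, paired with targets y.
# `ToyObs` is a hypothetical name invented for this illustration:
struct ToyObs
    A::Matrix{Float64}   # p × n
    y::Vector{Float64}   # length n
end

# subsampling: indexing selects columns of A and entries of y together
Base.getindex(data::ToyObs, I) = ToyObs(data.A[:, I], data.y[I])
Base.length(data::ToyObs) = length(data.y)

data = ToyObs([1.0 2.0 3.0; 4.0 5.0 6.0], [10.0, 20.0, 30.0])
subset = data[2:3]  # observations 2 and 3, features and targets kept in sync
```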
@@ -420,10 +420,10 @@ case:
LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)'

LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, observations::AbstractMatrix) =
-	observations'*model.coefficients
+    observations'*model.coefficients

LearnAPI.predict(model::RidgeFitted, ::LiteralTarget, Xnew) =
-	predict(model, LiteralTarget(), obs(model, Xnew))
+    predict(model, LiteralTarget(), obs(model, Xnew))
```

### Important notes:
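Note that routing `predict` through the pre-processed (p × n) representation changes nothing numerically: transposing twice is the identity. A small self-contained check with made-up numbers:

```julia
# Hypothetical fitted coefficients and feature matrix for illustration only:
coefficients = [2.0, 1.0]
Xmatrix = [1.0 1.0;              # n = 3 observations (rows), p = 2 features (columns)
           2.0 1.0;
           3.0 1.0]
observations = Xmatrix'          # p × n, the layout produced by `obs` above

# predicting from either representation gives the same vector:
direct  = Xmatrix * coefficients
via_obs = observations' * coefficients
# direct == via_obs == [3.0, 5.0, 7.0]
```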