# Anatomy of an Implementation

- This section explains a detailed implementation of the LearnAPI.jl for naive [ridge
+ This tutorial details an implementation of the LearnAPI.jl for naive [ridge
regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of
workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also
refer to the [demonstration](@ref workflow) of the implementation given later.
@@ -35,8 +35,7 @@ A transformer ordinarily implements `transform` instead of `predict`. For more o
then an implementation must: (i) overload [`obs`](@ref) to articulate how
provided data can be transformed into a form that does support
this interface, as illustrated below under
- [Providing a separate data front end](@ref), and which may additionally
- enable certain performance benefits; or (ii) overload the trait
+ [Providing a separate data front end](@ref); or (ii) overload the trait
[`LearnAPI.data_interface`](@ref) to specify a more relaxed data
API.

@@ -62,7 +61,7 @@ nothing # hide

Instances of `Ridge` are *[learners](@ref learners)*, in LearnAPI.jl parlance.

- Associated with each new type of LearnAPI.jl [learner](@ref learners) will be a keyword
+ Associated with each new type of LearnAPI.jl learner will be a keyword
argument constructor, providing default values for all properties (typically, struct
fields) that are not other learners, and we must implement
[`LearnAPI.constructor(learner)`](@ref), for recovering the constructor from an instance:
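
A minimal sketch of this requirement, assuming LearnAPI.jl is installed. The field name `lambda`, its default value, and the type parameter are assumptions made for illustration and may differ from the tutorial's actual definition of `Ridge`:

```julia
using LearnAPI

# Illustrative learner type; the field name and type bound are assumptions.
struct Ridge{T<:Real}
    lambda::T
end

# Keyword constructor providing a default value for every property.
Ridge(; lambda=0.1) = Ridge(lambda)

# Recover the constructor from an instance.
LearnAPI.constructor(::Ridge) = Ridge

learner = Ridge(lambda=0.5)

# Rebuild an equivalent learner from its properties alone.
clone = LearnAPI.constructor(learner)(lambda=learner.lambda)
```

With these definitions, a learner can be reconstructed from its property values without knowing its concrete type in advance, which is what generic tooling needs in order to copy a learner or change one of its hyperparameters.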
@@ -365,9 +364,41 @@ y = 2a - b + 3c + 0.05*rand(n)
An implementation may optionally implement [`obs`](@ref), to expose to the user (or some
meta-algorithm like cross-validation) the representation of input data internal to `fit`
or `predict`, such as the matrix version `A` of `X` in the ridge example. That is, we may
- factor out of `fit` (and also `predict`) the data pre-processing step, `obs`, to expose
- its outcomes. These outcomes become alternative user inputs to `fit`/`predict`. To see the
- use of `obs` in action, see [below](@ref advanced_demo).
+ factor out of `fit` (and also `predict`) a data pre-processing step, `obs`, to expose
+ its outcomes. These outcomes become alternative user inputs to `fit`/`predict`.
+
+ In the default case, the alternative data representations will implement the MLUtils.jl
+ `getobs/numobs` interface for observation subsampling, which is generally all a user or
+ meta-algorithm will need, before passing the data on to `fit`/`predict` as you would the
+ original data.
+
+ So, instead of the pattern
+
+ ```julia
+ model = fit(learner, data)
+ predict(model, newdata)
+ ```
+
+ one enables the following alternative (which in any case will still work, because of a
+ no-op `obs` fallback provided by LearnAPI.jl):
+
+ ```julia
+ observations = obs(learner, data) # pre-processed training data
+
+ # optional subsampling:
+ observations = MLUtils.getobs(observations, train_indices)
+
+ model = fit(learner, observations)
+
+ newobservations = obs(model, newdata)
+
+ # optional subsampling:
+ newobservations = MLUtils.getobs(newobservations, test_indices)
+
+ predict(model, newobservations)
+ ```
+
+ See also the demonstration [below](@ref advanced_demo).

Here we specifically wrap all the pre-processed data into a single object, for which we
introduce a new type:
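
As a rough sketch of what such a wrapper could look like (this is not the tutorial's actual type): the name `RidgeObsSketch` and its fields are hypothetical, and the layout with one observation per column of `A` is an assumption. The sketch relies on the documented MLUtils.jl behaviour that `numobs` and `getobs` fall back to `length` and `getindex` for custom types, so only those two `Base` methods are implemented:

```julia
# Hypothetical wrapper for pre-processed training data; all names are illustrative.
struct RidgeObsSketch{T,M<:AbstractMatrix{T}}
    A::M          # features, one observation per column (p × n), by assumption
    y::Vector{T}  # target, one entry per observation
end

# Number of observations; `MLUtils.numobs` falls back to `length`.
Base.length(data::RidgeObsSketch) = length(data.y)

# Subsample observations; `MLUtils.getobs` falls back to `getindex`.
# Only collections of indices (vectors, ranges) are supported in this sketch.
Base.getindex(data::RidgeObsSketch, I::AbstractVector{<:Integer}) =
    RidgeObsSketch(data.A[:, I], data.y[I])

data = RidgeObsSketch([1.0 2.0 3.0; 4.0 5.0 6.0], [10.0, 20.0, 30.0])
sub = data[[1, 3]]  # keep the first and third observations
```

Because subsampling returns another object of the same wrapper type, the result can be passed on to `fit` exactly as the full set of observations would be.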