Commit c8d50ca

replace MLUtils with MLCore
1 parent 6e2d8c8 commit c8d50ca

12 files changed: +46 -45 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ Here `learner` specifies the configuration the algorithm (the hyperparameters) w
 
 ## Related packages
 
-- [MLCore.jl](https://github.com/JuliaML/MLCore.jl) ([docs](https://juliaml.github.io/MLUtils.jl/stable/api/#Core-API))
+- [MLCore.jl](https://github.com/JuliaML/MLCore.jl) ([docs](https://juliaml.github.io/MLCore.jl/stable/api/#Core-API))
 
 - [LearnTestAPI.jl](https://github.com/JuliaAI/LearnTestAPI.jl): Package to test implementations of LearnAPI.jl (but documented here)
 

docs/Project.toml

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
 DocumenterInterLinks = "d12716ef-a0f6-4df4-a9f1-a5a34e75c656"
 LearnAPI = "92ad9a40-7767-427a-9ee6-6e577f1266cb"
 LearnTestAPI = "3111ed91-c4f2-40e7-bb19-7f6c618409b8"
-MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54"
+MLCore = "c2834f40-e789-41da-a90e-33b280584a8c"
 ScientificTypesBase = "30f210dd-8aff-4c5f-94ba-8e64358c1161"
 Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
 

docs/src/anatomy_of_an_implementation.md

Lines changed: 11 additions & 11 deletions
@@ -334,7 +334,7 @@ assumptions about data from those made above.
 
 - If the `data` object consumed by `fit`, `predict`, or `transform` is not not a suitable
 table¹, array³, tuple of tables and arrays, or some other object implementing the
-[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface,
+[MLCore.jl](https://juliaml.github.io/MLCore.jl/dev/) `getobs`/`numobs` interface,
 then an implementation must: (i) overload [`obs`](@ref) to articulate how provided data
 can be transformed into a form that does support this interface, as illustrated below
 under [Providing a separate data front end](@ref) below; or (ii) overload the trait
@@ -419,7 +419,7 @@ The [`obs`](@ref) methods exist to:
 how it works.
 
 In the typical case, where [`LearnAPI.data_interface`](@ref) is not overloaded, the
-alternative data representations must implement the MLUtils.jl `getobs/numobs` interface
+alternative data representations must implement the MLCore.jl `getobs/numobs` interface
 for observation subsampling, which is generally all a user or meta-algorithm will need,
 before passing the data on to `fit`/`predict`, as you would the original data.
 
@@ -436,14 +436,14 @@ one enables the following alternative:
 observations = obs(learner, data) # preprocessed training data
 
 # optional subsampling:
-observations = MLUtils.getobs(observations, train_indices)
+observations = MLCore.getobs(observations, train_indices)
 
 model = fit(learner, observations)
 
 newobservations = obs(model, newdata)
 
 # optional subsampling:
-newobservations = MLUtils.getobs(observations, test_indices)
+newobservations = MLCore.getobs(observations, test_indices)
 
 predict(model, newobservations)
 ```
@@ -568,15 +568,15 @@ LearnAPI.target(learner::Ridge, data) = LearnAPI.target(learner, obs(learner, da
 are generally different.
 
 - We need the adjoint operator, `'`, because the last dimension in arrays is the
-observation dimension, according to the MLUtils.jl convention. Remember, `Xnew` is a
+observation dimension, according to the MLCore.jl convention. Remember, `Xnew` is a
 table here.
 
 Since LearnAPI.jl provides fallbacks for `obs` that simply return the unadulterated data
 argument, overloading `obs` is optional. This is provided data in publicized
 `fit`/`predict` signatures already consists only of objects implement the
 [`LearnAPI.RandomAccess`](@ref) interface (most tables¹, arrays³, and tuples thereof).
 
-To opt out of supporting the MLUtils.jl interface altogether, an implementation must
+To opt out of supporting the MLCore.jl interface altogether, an implementation must
 overload the trait, [`LearnAPI.data_interface(learner)`](@ref). See [Data
 interfaces](@ref data_interfaces) for details.
 
@@ -593,15 +593,15 @@ LearnAPI.fit(learner::Ridge, X, y; kwargs...) = fit(learner, (X, y); kwargs...)
 ## [Demonstration of an advanced `obs` workflow](@id advanced_demo)
 
 We now can train and predict using internal data representations, resampled using the
-generic MLUtils.jl interface:
+generic MLCore.jl interface:
 
 ```@example anatomy2
-import MLUtils
+import MLCore
 learner = Ridge()
 observations_for_fit = obs(learner, (X, y))
-model = fit(learner, MLUtils.getobs(observations_for_fit, train))
+model = fit(learner, MLCore.getobs(observations_for_fit, train))
 observations_for_predict = obs(model, X)
-ẑ = predict(model, MLUtils.getobs(observations_for_predict, test))
+ẑ = predict(model, MLCore.getobs(observations_for_predict, test))
 ```
 
 ```julia
@@ -616,7 +616,7 @@ obs_workflows).
 ¹ In LearnAPI.jl a *table* is any object `X` implementing the
 [Tables.jl](https://tables.juliadata.org/dev/) interface, additionally satisfying
 `Tables.istable(X) == true` and implementing `DataAPI.nrow` (and whence
-`MLUtils.numobs`). Tables that are also (unnamed) tuples are disallowed.
+`MLCore.numobs`). Tables that are also (unnamed) tuples are disallowed.
 
 ² An implementation can provide further accessor functions, if necessary, but
 like the native ones, they must be included in the [`LearnAPI.functions`](@ref)

docs/src/index.md

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -92,12 +92,13 @@ unless the algorithm explicitly opts out. Moreover, the `fit` and `predict` meth
9292
also be able to consume these alternative data representations, for performance benefits
9393
in some situations.
9494

95-
The fallback data interface is the [MLUtils.jl](https://github.com/JuliaML/MLUtils.jl)
96-
`getobs/numobs` interface, here tagged as [`LearnAPI.RandomAccess()`](@ref), and if the
97-
input consumed by the algorithm already implements that interface (tables, arrays, etc.)
98-
then overloading `obs` is completely optional. Plain iteration interfaces, with or without
99-
knowledge of the number of observations, can also be specified, to support, e.g., data
100-
loaders reading images from disk.
95+
The fallback data interface is the [MLCore.jl](https://github.com/JuliaML/MLCore.jl)
96+
`getobs/numobs` interface (previously provided by MLUtils.jl) here tagged as
97+
[`LearnAPI.RandomAccess()`](@ref). However, if the input consumed by the algorithm already
98+
implements that interface (tables, arrays, etc.) then overloading `obs` is completely
99+
optional. Plain iteration interfaces, with or without knowledge of the number of
100+
observations, can also be specified, to support, e.g., data loaders reading images from
101+
disk.
101102

102103
Some canned data front ends (implementations of [`obs`](@ref)) are provided by the
103104
[LearnDataFrontEnds.jl](https://juliaai.github.io/LearnDataFrontEnds.jl/stable/) package.

docs/src/obs.md

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -26,26 +26,26 @@ observations = obs(learner, data)
2626

2727
then, assuming the typical case that `LearnAPI.data_interface(learner) ==
2828
LearnAPI.RandomAccess()`, `observations` implements the
29-
[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface, for
29+
[MLCore.jl](https://juliaml.github.io/MLCore.jl/dev/) `getobs`/`numobs` interface, for
3030
grabbing and counting observations. Moreover, we can pass `observations` to `fit` in place
31-
of the original data, or first resample it using `MLUtils.getobs`:
31+
of the original data, or first resample it using `MLCore.getobs`:
3232

3333
```julia
3434
# equivalent to `model = fit(learner, data)`
3535
model = fit(learner, observations)
3636

3737
# with resampling:
38-
resampled_observations = MLUtils.getobs(observations, 1:10)
38+
resampled_observations = MLCore.getobs(observations, 1:10)
3939
model = fit(learner, resampled_observations)
4040
```
4141

4242
In some implementations, the alternative pattern above can be used to avoid repeating
4343
unnecessary internal data preprocessing, or inefficient resampling. For example, here's
44-
how a user might call `obs` and `MLUtils.getobs` to perform efficient cross-validation:
44+
how a user might call `obs` and `MLCore.getobs` to perform efficient cross-validation:
4545

4646
```julia
4747
using LearnAPI
48-
import MLUtils
48+
import MLCore
4949

5050
learner = <some supervised learner>
5151

@@ -61,7 +61,7 @@ never_trained = true
6161
scores = map(train_test_folds) do (train, test)
6262

6363
# train using model-specific representation of data:
64-
fitobs_subset = MLUtils.getobs(fitobs, train)
64+
fitobs_subset = MLCore.getobs(fitobs, train)
6565
model = fit(learner, fitobs_subset)
6666

6767
# predict on the fold complement:
@@ -70,7 +70,7 @@ scores = map(train_test_folds) do (train, test)
7070
global predictobs = obs(model, X)
7171
global never_trained = false
7272
end
73-
predictobs_subset = MLUtils.getobs(predictobs, test)
73+
predictobs_subset = MLCore.getobs(predictobs, test)
7474
= predict(model, Point(), predictobs_subset)
7575

7676
y = LearnAPI.target(learner, data)

docs/src/predict_transform.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -52,8 +52,8 @@ transform(learner, data) # `fit` implied
5252

5353
```julia
5454
fitobs = obs(learner, (X, y)) # learner-specific repr. of data
55-
model = fit(learner, MLUtils.getobs(fitobs, 1:100))
56-
predictobs = obs(model, MLUtils.getobs(X, 101:150))
55+
model = fit(learner, MLCore.getobs(fitobs, 1:100))
56+
predictobs = obs(model, MLCore.getobs(X, 101:150))
5757
= predict(model, Point(), predictobs)
5858
```
5959

docs/src/reference.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -21,13 +21,13 @@ individual observations.
2121
A `DataFrame` instance, from [DataFrames.jl](https://dataframes.juliadata.org/stable/), is
2222
an example of data, the observations being the rows. Typically, data provided to
2323
LearnAPI.jl algorithms, will implement the
24-
[MLUtils.jl](https://juliaml.github.io/MLUtils.jl/stable) `getobs/numobs` interface for
24+
[MLCore.jl](https://juliaml.github.io/MLCore.jl/stable) `getobs/numobs` interface for
2525
accessing individual observations, but implementations can opt out of this requirement;
2626
see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details.
2727

2828
!!! note
2929

30-
In the MLUtils.jl
30+
In the MLCore.jl
3131
convention, observations in tables are the rows but observations in a matrix are the
3232
columns.
3333

docs/src/traits.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ In the examples column of the table below, `Continuous` is a name owned the pack
2727
| [`LearnAPI.nonlearners`](@ref)`(learner)` | properties *not* corresponding to other learners | all properties | `(:K, :leafsize, :metric,)` |
2828
| [`LearnAPI.human_name`](@ref)`(learner)` | human name for the learner; should be a noun | type name with spaces | "elastic net regressor" |
2929
| [`LearnAPI.iteration_parameter`](@ref)`(learner)` | symbolic name of an iteration parameter | `nothing` | :epochs |
30-
| [`LearnAPI.data_interface`](@ref)`(learner)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) |
30+
| [`LearnAPI.data_interface`](@ref)`(learner)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLCore.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) |
3131
| [`LearnAPI.fit_scitype`](@ref)`(learner)` | upper bound on `scitype(data)` ensuring `fit(learner, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` |
3232
| [`LearnAPI.target_observation_scitype`](@ref)`(learner)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` |
3333
| [`LearnAPI.is_static`](@ref)`(learner)` | `true` if `fit` consumes no data | `false` | `true` |

src/accessor_functions.jl

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -319,7 +319,7 @@ Here's a sample workflow for some such `learner`, with training data, `(X, y)`,
319319
is the training target, here assumed to be a vector.
320320
321321
```julia
322-
import MLUtils.getobs
322+
import MLCore.getobs
323323
model = fit(learner, (X, y))
324324
yhat = LearnAPI.predictions(model)
325325
test_indices = LearnAPI.out_of_sample_indices(model)

src/obs.jl

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -25,15 +25,15 @@ model = fit(learner, data_train)
2525
ŷ = predict(model, Point(), X[101:150])
2626
```
2727
28-
Alternative workflow using `obs` and the MLUtils.jl method `getobs` to carry out
28+
Alternative workflow using `obs` and the MLCore.jl method `getobs` to carry out
2929
subsampling (assumes `LearnAPI.data_interface(learner) == RandomAccess()`):
3030
3131
```julia
32-
import MLUtils
32+
import MLCore
3333
fit_observations = obs(learner, data)
34-
model = fit(learner, MLUtils.getobs(fit_observations, 1:100))
34+
model = fit(learner, MLCore.getobs(fit_observations, 1:100))
3535
predict_observations = obs(model, X)
36-
ẑ = predict(model, Point(), MLUtils.getobs(predict_observations, 101:150))
36+
ẑ = predict(model, Point(), MLCore.getobs(predict_observations, 101:150))
3737
@assert ẑ == ŷ
3838
```
3939
@@ -54,7 +54,7 @@ alternatives with the same output, whenever `observations = obs(model, data)`.
5454
5555
If `LearnAPI.data_interface(learner) == RandomAccess()` (the default), then `fit`,
5656
`predict` and `transform` must additionally accept `obs` output that has been *subsampled*
57-
using `MLUtils.getobs`, with the obvious interpretation applying to the outcomes of such
57+
using `MLCore.getobs`, with the obvious interpretation applying to the outcomes of such
5858
calls (e.g., if *all* observations are subsampled, then outcomes should be the same as if
5959
using the original data).
6060

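The renamed docs rely throughout on the `getobs`/`numobs` convention that the last array dimension indexes observations, so observations in a matrix are its columns. The sketch below illustrates that convention only. The names `sketch_numobs` and `sketch_getobs` are hypothetical stand-ins written for this note; they are not the MLCore.jl implementations, and the real package may differ in details such as views, tables, and error handling.

```julia
# Hypothetical stand-ins illustrating the "last dimension = observations"
# convention; these are NOT the MLCore.jl functions.

# Arrays: the number of observations is the size of the last dimension.
sketch_numobs(A::AbstractArray) = size(A, ndims(A))

# Arrays: select observations by indexing the last dimension only.
sketch_getobs(A::AbstractArray, idx) =
    A[ntuple(_ -> Colon(), ndims(A) - 1)..., idx]

# Tuples: all components must agree on the number of observations.
function sketch_numobs(t::Tuple)
    n = sketch_numobs(first(t))
    all(x -> sketch_numobs(x) == n, t) ||
        throw(DimensionMismatch("components have different numbers of observations"))
    return n
end

# Tuples: subsample each component with the same indices.
sketch_getobs(t::Tuple, idx) = map(x -> sketch_getobs(x, idx), t)

X = reshape(collect(1.0:15.0), 3, 5)  # 3 features, 5 observations (columns)
y = collect(1:5)                      # one target value per observation

n = sketch_numobs((X, y))
Xsub, ysub = sketch_getobs((X, y), 2:3)
```

Here `(X, y)` plays the role of the `data` tuple in the workflows above: subsampling with `2:3` keeps columns 2 and 3 of `X` and the matching entries of `y`.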