
Commit 84ad4a2

docs tweaks (#24)
* maxnet is registered
* fully implement MLJ (#22)
* add mlj docstring
* test with MLJTestInterface
* throw a helpful error if input data only has one class
* mljtestinterface is not a dep (oops)
* move allequal error to main function
* fix allequal error
* fix tests
* add MLJBase as docs dep
* fix mlj doctest
* attempt fix of multiclass printing
* use @example instead of jldoctest
* test for no failures in mlj interface test
* more MLJ docs
* small tweaks to core function docs
* add check scitypes
* Clogloglink is from Maxnet

Co-authored-by: Anthony Blaom, PhD <[email protected]>
1 parent 379dbc8 commit 84ad4a2

File tree

3 files changed: +64 −11 lines changed


docs/src/usage/quickstart.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -3,10 +3,10 @@ CurrentModule = Maxnet
 ```
 
 ## Installation
-Maxnet.jl is not yet registered - install by running
+Install the latest version of Maxnet.jl by running
 ```julia
 ]
-add https://github.com/tiemvanderdeure/Maxnet.jl
+add Maxnet
 ```
 
 ## Basic usage
@@ -31,7 +31,7 @@ There are numerous settings that can be tweaked to change the model fit. These a
 ### Model settings
 The two most important settings to change when running Maxnet is the feature classes selected and the regularization factor.
 
-By default, the feature classes selected depends on the number of presence points, see [Maxnet.default_features](@ref). To set them manually, specify the `features` keyword using either a `Vector` of `AbstractFeatureClass`, or a `string`, where `l` represents `LinearFeature` and `CategoricalFeature`, `q` represents `QuadraticFeature`, `p` represents `ProductFeature`, `t` represents `ThresholdFeature` and `h` represents `HingeFeature`.
+By default, the feature classes selected depend on the number of presence points, see [default_features](@ref). To set them manually, specify the `features` keyword using either a `Vector` of `AbstractFeatureClass`, or a `string`, where `l` represents `LinearFeature` and `CategoricalFeature`, `q` represents `QuadraticFeature`, `p` represents `ProductFeature`, `t` represents `ThresholdFeature` and `h` represents `HingeFeature`.
 
 For example:
 ```julia
````

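The single-letter feature codes documented above follow a fixed lookup convention. As a minimal sketch of that convention (the `parse_features` helper is hypothetical, purely for illustration; only the letter-to-class mapping comes from the docs):

```python
# Map Maxnet.jl's single-letter feature codes to the feature classes
# they stand for, per the quickstart docs above. parse_features is a
# hypothetical helper, not part of Maxnet.jl.
FEATURE_CODES = {
    "l": ["LinearFeature", "CategoricalFeature"],  # 'l' covers both
    "q": ["QuadraticFeature"],
    "p": ["ProductFeature"],
    "t": ["ThresholdFeature"],
    "h": ["HingeFeature"],
}

def parse_features(spec):
    """Expand a code string like 'lqh' into feature class names."""
    classes = []
    for code in spec:
        classes.extend(FEATURE_CODES[code])
    return classes

print(parse_features("lq"))
# ['LinearFeature', 'CategoricalFeature', 'QuadraticFeature']
```

This matches the example output shown in the maxnet docstring below, where `features = "lq"` yields linear, categorical and quadratic feature classes.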
src/maxnet_function.jl

Lines changed: 6 additions & 3 deletions
````diff
@@ -16,9 +16,11 @@
 - `features`: Either a `Vector` of `AbstractFeatureClass` to be used in the model,
    or a `String` where "l" = linear and categorical, "q" = quadratic, "p" = product, "t" = threshold, "h" = hinge (e.g. "lqh"); or
    By default, the features are based on the number of presences are used. See [`default_features`](@ref)
-- `regularization_multiplier`: A constant to adjust regularization, where a higher `regularization_multiplier` results in a higher penalization for features
-- `regularization_function`: A function to compute a regularization for each feature. A default `regularization_function` is built in.
-- `addsamplestobackground`: A boolean, where `true` adds the background samples to the predictors. Defaults to `true`.
+- `regularization_multiplier`: A constant to adjust regularization, where a higher `regularization_multiplier` results in a higher
+   penalization for features and therefore less overfitting.
+- `regularization_function`: A function to compute a regularization for each feature. A default `regularization_function` is built in
+   and should be used in most cases.
+- `addsamplestobackground`: Whether to add presence values to the background. Defaults to `true`.
 - `n_knots`: the number of knots used for Threshold and Hinge features. Defaults to 50. Ignored if there are neither Threshold nor Hinge features
 - `weight_factor`: A `Float64` value to adjust the weight of the background samples. Defaults to 100.0.
 - `kw...`: Further arguments to be passed to `GLMNet.glmnet`
@@ -32,6 +34,7 @@ using Maxnet
 p_a, env = Maxnet.bradypus();
 bradypus_model = maxnet(p_a, env; features = "lq")
 
+# Output
 Fit Maxnet model
 Features classes: Maxnet.AbstractFeatureClass[LinearFeature(), CategoricalFeature(), QuadraticFeature()]
 Entropy: 6.114650341746531
````

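The docstring above says a higher `regularization_multiplier` penalizes features more and so reduces overfitting. Since the model is fit via `GLMNet.glmnet` (a lasso-style solver), the intuition can be sketched with the standard lasso soft-thresholding operator: scaling up the penalty drives more coefficients to exactly zero. This is an illustrative sketch of the general mechanism, not Maxnet.jl's actual code:

```python
def soft_threshold(z, penalty):
    """Lasso soft-thresholding: shrink z toward zero, clipping at zero."""
    if z > penalty:
        return z - penalty
    if z < -penalty:
        return z + penalty
    return 0.0

coefs = [0.9, -0.3, 0.05]  # toy unpenalized feature coefficients
# A larger multiplier scales the penalty, zeroing out more features.
weak_nonzero = sum(1 for c in coefs if soft_threshold(c, 0.1) != 0.0)
strong_nonzero = sum(1 for c in coefs if soft_threshold(c, 0.5) != 0.0)
print(weak_nonzero, strong_nonzero)  # 2 1
```

Fewer surviving features means a simpler, less overfit model, which is the behaviour the `regularization_multiplier` keyword exposes.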
src/mlj_interface.jl

Lines changed: 55 additions & 5 deletions
````diff
@@ -46,16 +46,66 @@ MMI.metadata_model(
 """
 $(MMI.doc_header(MaxnetBinaryClassifier))
 
-The keywords `link`, and `clamp` are passed to [`predict`](@ref), while all other keywords are passed to [`maxnet`](@ref).
-See the documentation of these functions for the meaning of these parameters and their defaults.
+# Training data
+
+In MLJ or MLJBase, bind an instance `model` to data with
+
+    mach = machine(model, X, y)
+
+where
+
+- `X`: any table of input features (eg, a `DataFrame`) whose columns
+  each have one of the following element scitypes: `Continuous` or `<:Multiclass`. Check
+  scitypes with `schema(X)`.
+
+- `y`: the target, which can be any `AbstractVector` whose element
+  scitype is `<:Binary`. The first class should refer to background values,
+  and the second class to presence values.
+
+# Hyper-parameters
+
+- `features`: Specifies which feature classes to use in the model, e.g. "lqh" for linear, quadratic and hinge features.
+  See also [Maxnet.maxnet](@ref)
+- `regularization_multiplier = 1.0`: Adjusts how tightly the model will fit. Increasing this will reduce overfitting.
+- `regularization_function`: A function to compute the regularization of each feature class. Defaults to `Maxnet.default_regularization`
+- `addsamplestobackground = true`: Controls whether to add presence values to the background.
+- `n_knots = 50`: The number of knots used for Threshold and Hinge features. A higher number gives more flexibility for these features.
+- `weight_factor = 100.0`: A `Float64` value to adjust the weight of the background samples.
+- `link = Maxnet.CloglogLink()`: The link function to use when predicting. See `Maxnet.predict`
+- `clamp = false`: Clamp values passed to `MLJBase.predict` to the range the model was trained on.
+
+# Operations
+
+- `predict(mach, Xnew)`: return predictions of the target given
+  features `Xnew` having the same scitype as `X` above. Predictions are
+  probabilistic and can be interpreted as the probability of presence.
+
+# Fitted Parameters
+
+The fields of `fitted_params(mach)` are:
+
+- `fitresult`: A `Tuple` where the first entry is the `Maxnet.MaxnetModel` returned by the Maxnet algorithm,
+  and the second entry is the classes of `y`
+
+# Report
+
+The fields of `report(mach)` are:
+
+- `selected_variables`: A `Vector` of `Symbol`s of the variables that were selected.
+- `selected_features`: A `Vector` of `Maxnet.ModelMatrixColumn` with the features that were selected.
+- `complexity`: the number of selected features in the model.
 
 # Example
+
 ```@example
-using MLJBase
+using MLJBase, Maxnet
 p_a, env = Maxnet.bradypus()
+y = coerce(p_a, Binary)
+X = coerce(env, Count => Continuous)
 
-mach = machine(MaxnetBinaryClassifier(features = "lqp"), env, categorical(p_a), scitype_check_level = 0)
-fit!(mach, verbosity = 0)
+mach = machine(MaxnetBinaryClassifier(features = "lqp"), X, y)
+fit!(mach)
 yhat = MLJBase.predict(mach, env)
 
 ```
````

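The default `link = Maxnet.CloglogLink()` in the hyper-parameter list above maps the model's linear predictor onto a presence probability via the standard complementary log-log inverse link, p = 1 − exp(−exp(η)). A quick sketch of that transform (Maxnet's actual cloglog prediction also involves the fitted model's entropy as an offset to η, which this sketch omits):

```python
import math

def cloglog_inverse(eta):
    """Complementary log-log inverse link: maps any real eta into (0, 1)."""
    return 1.0 - math.exp(-math.exp(eta))

# Any real-valued linear predictor becomes a valid probability.
for eta in (-3.0, 0.0, 3.0):
    p = cloglog_inverse(eta)
    assert 0.0 < p < 1.0

print(round(cloglog_inverse(0.0), 4))  # 0.6321
```

This is why the probabilistic predictions returned by `MLJBase.predict` can be read directly as probabilities of presence.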