Commit 900efbc
default hyperparameters
Parent: ac02f22

7 files changed: 30 additions, 30 deletions

API_REFERENCE_FOR_CLASSIFICATION.md (6 additions, 6 deletions)

```diff
@@ -1,6 +1,6 @@
 # APLRClassifier
 
-## class aplr.APLRClassifier(m:int=9000, v:float=0.1, random_state:int=0, n_jobs:int=0, validation_ratio:float=0.2, bins:int=300, verbosity:int=0, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, boosting_steps_before_pruning_is_done:int = 500, boosting_steps_before_interactions_are_allowed: int = 0)
+## class aplr.APLRClassifier(m:int=9000, v:float=0.1, random_state:int=0, n_jobs:int=0, validation_ratio:float=0.2, bins:int=100, verbosity:int=0, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=20, max_eligible_terms:int=10, boosting_steps_before_pruning_is_done:int = 0, boosting_steps_before_interactions_are_allowed: int = 0)
 
 ### Constructor parameters
 
@@ -19,7 +19,7 @@ Multi-threading parameter. If ***0*** then uses all available cores for multi-th
 #### validation_ratio (default = 0.2)
 The ratio of training observations to use for validation instead of training. The number of boosting steps is automatically tuned to minimize validation error.
 
-#### bins (default = 300)
+#### bins (default = 100)
 Specifies the maximum number of bins to discretize the data into when searching for the best split. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs. Must be greater than 1.
 
 #### verbosity (default = 0)
@@ -34,14 +34,14 @@ The maximum number of interactions allowed. A lower value may be used to reduce
 #### min_observations_in_split (default = 20)
 The minimum effective number of observations that a term in the model must rely on. This hyperparameter should be tuned. Larger values are more appropriate for larger datasets. Larger values result in more robust models (lower variance), potentially at the expense of increased bias.
 
-#### ineligible_boosting_steps_added (default = 10)
+#### ineligible_boosting_steps_added (default = 20)
 Controls how many boosting steps a term that becomes ineligible has to remain ineligible. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
 
-#### max_eligible_terms (default = 5)
+#### max_eligible_terms (default = 10)
 Limits 1) the number of terms already in the model that can be considered as interaction partners in a boosting step and 2) how many terms remain eligible in the next boosting step. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
 
-#### boosting_steps_before_pruning_is_done (default = 500)
-Specifies how many boosting steps to wait before pruning the model. If 0 then pruning is not done. If for example 500 (default) then the model will be pruned in boosting steps 500, 1000, and so on. When pruning, terms are removed as long as this reduces the training error. This can be a computationally costly operation especially if the model gets many terms. Pruning may improve predictiveness.
+#### boosting_steps_before_pruning_is_done (default = 0)
+Specifies how many boosting steps to wait before pruning the model. If 0 (default) then pruning is not done. If for example 500 then the model will be pruned in boosting steps 500, 1000, and so on. When pruning, terms are removed as long as this reduces the training error. This can be a computationally costly operation especially if the model gets many terms. Pruning may slightly improve predictiveness.
 
 #### boosting_steps_before_interactions_are_allowed (default = 0)
 Specifies how many boosting steps to wait before searching for interactions. If for example 800, then the algorithm will be forced to only fit main effects in the first 800 boosting steps, after which it is allowed to search for interactions (given that other hyperparameters that control interactions also allow this). The motivation for fitting main effects first may be 1) to get a cleaner looking model that puts more emphasis on main effects and 2) to speed up the algorithm since looking for interactions is computationally more demanding.
```
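The pruning schedule described in the changed paragraph can be sketched as a small predicate. `should_prune` is a hypothetical helper written for illustration, not part of the aplr API:

```python
def should_prune(boosting_step: int, boosting_steps_before_pruning_is_done: int) -> bool:
    """Return True if the model would be pruned at this 1-based boosting step.

    With the new default of 0, pruning is never done; with e.g. 500, pruning
    happens in boosting steps 500, 1000, 1500, and so on.
    """
    if boosting_steps_before_pruning_is_done == 0:
        return False  # pruning disabled (the new default)
    return boosting_step % boosting_steps_before_pruning_is_done == 0
```

Under the old default of 500 this fires at steps 500, 1000, and so on; under the new default of 0 it never fires, which avoids the potentially costly pruning pass.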

API_REFERENCE_FOR_REGRESSION.md (6 additions, 6 deletions)

````diff
@@ -1,6 +1,6 @@
 # APLRRegressor
 
-## class aplr.APLRRegressor(m:int=1000, v:float=0.1, random_state:int=0, loss_function:str="mse", link_function:str="identity", n_jobs:int=0, validation_ratio:float=0.2, bins:int=300, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, verbosity:int=0, dispersion_parameter:float=1.5, validation_tuning_metric:str="default", quantile:float=0.5, calculate_custom_validation_error_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_loss_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_negative_gradient_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_transform_linear_predictor_to_predictions_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_differentiate_predictions_wrt_linear_predictor_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, boosting_steps_before_pruning_is_done: int = 500, boosting_steps_before_interactions_are_allowed: int = 0)
+## class aplr.APLRRegressor(m:int=1000, v:float=0.1, random_state:int=0, loss_function:str="mse", link_function:str="identity", n_jobs:int=0, validation_ratio:float=0.2, bins:int=100, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=20, max_eligible_terms:int=10, verbosity:int=0, dispersion_parameter:float=1.5, validation_tuning_metric:str="default", quantile:float=0.5, calculate_custom_validation_error_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_loss_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_negative_gradient_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_transform_linear_predictor_to_predictions_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_differentiate_predictions_wrt_linear_predictor_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, boosting_steps_before_pruning_is_done: int = 0, boosting_steps_before_interactions_are_allowed: int = 0)
 
 ### Constructor parameters
 
@@ -25,7 +25,7 @@ Multi-threading parameter. If ***0*** then uses all available cores for multi-th
 #### validation_ratio (default = 0.2)
 The ratio of training observations to use for validation instead of training. The number of boosting steps is automatically tuned to minimize validation error.
 
-#### bins (default = 300)
+#### bins (default = 100)
 Specifies the maximum number of bins to discretize the data into when searching for the best split. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs. Must be greater than 1.
 
 #### max_interaction_level (default = 1)
@@ -37,10 +37,10 @@ The maximum number of interactions allowed. A lower value may be used to reduce
 #### min_observations_in_split (default = 20)
 The minimum effective number of observations that a term in the model must rely on. This hyperparameter should be tuned. Larger values are more appropriate for larger datasets. Larger values result in more robust models (lower variance), potentially at the expense of increased bias.
 
-#### ineligible_boosting_steps_added (default = 10)
+#### ineligible_boosting_steps_added (default = 20)
 Controls how many boosting steps a term that becomes ineligible has to remain ineligible. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
 
-#### max_eligible_terms (default = 5)
+#### max_eligible_terms (default = 10)
 Limits 1) the number of terms already in the model that can be considered as interaction partners in a boosting step and 2) how many terms remain eligible in the next boosting step. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
 
 #### verbosity (default = 0)
@@ -102,8 +102,8 @@ def calculate_custom_differentiate_predictions_wrt_linear_predictor(linear_predi
     return differentiated_predictions
 ```
 
-#### boosting_steps_before_pruning_is_done (default = 500)
-Specifies how many boosting steps to wait before pruning the model. If 0 then pruning is not done. If for example 500 (default) then the model will be pruned in boosting steps 500, 1000, and so on. When pruning, terms are removed as long as this reduces the training error. This can be a computationally costly operation especially if the model gets many terms. Pruning may improve predictiveness.
+#### boosting_steps_before_pruning_is_done (default = 0)
+Specifies how many boosting steps to wait before pruning the model. If 0 (default) then pruning is not done. If for example 500 then the model will be pruned in boosting steps 500, 1000, and so on. When pruning, terms are removed as long as this reduces the training error. This can be a computationally costly operation especially if the model gets many terms. Pruning may slightly improve predictiveness.
 
 #### boosting_steps_before_interactions_are_allowed (default = 0)
 Specifies how many boosting steps to wait before searching for interactions. If for example 800, then the algorithm will be forced to only fit main effects in the first 800 boosting steps, after which it is allowed to search for interactions (given that other hyperparameters that control interactions also allow this). The motivation for fitting main effects first may be 1) to get a cleaner looking model that puts more emphasis on main effects and 2) to speed up the algorithm since looking for interactions is computationally more demanding.
````
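The boosting_steps_before_interactions_are_allowed behavior described in that paragraph can likewise be sketched as a predicate. `interactions_allowed` is a hypothetical helper for illustration only, not part of the aplr API:

```python
def interactions_allowed(boosting_step: int,
                         boosting_steps_before_interactions_are_allowed: int) -> bool:
    """True if interaction search may happen at this 1-based boosting step,
    assuming the other interaction-controlling hyperparameters also allow it.

    With e.g. 800, only main effects are fitted in steps 1..800; with the
    default of 0, interactions are allowed from the first step.
    """
    return boosting_step > boosting_steps_before_interactions_are_allowed
```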

aplr/aplr.py (5 additions, 5 deletions)

```diff
@@ -14,12 +14,12 @@ def __init__(
         link_function: str = "identity",
         n_jobs: int = 0,
         validation_ratio: float = 0.2,
-        bins: int = 300,
+        bins: int = 100,
         max_interaction_level: int = 1,
         max_interactions: int = 100000,
         min_observations_in_split: int = 20,
-        ineligible_boosting_steps_added: int = 10,
-        max_eligible_terms: int = 5,
+        ineligible_boosting_steps_added: int = 20,
+        max_eligible_terms: int = 10,
         verbosity: int = 0,
         dispersion_parameter: float = 1.5,
         validation_tuning_metric: str = "default",
@@ -60,7 +60,7 @@ def __init__(
         calculate_custom_differentiate_predictions_wrt_linear_predictor_function: Optional[
             Callable[[npt.ArrayLike], npt.ArrayLike]
         ] = None,
-        boosting_steps_before_pruning_is_done: int = 500,
+        boosting_steps_before_pruning_is_done: int = 0,
         boosting_steps_before_interactions_are_allowed: int = 0,
     ):
         self.m = m
@@ -279,7 +279,7 @@ def __init__(
         min_observations_in_split: int = 20,
         ineligible_boosting_steps_added: int = 10,
         max_eligible_terms: int = 5,
-        boosting_steps_before_pruning_is_done: int = 500,
+        boosting_steps_before_pruning_is_done: int = 0,
         boosting_steps_before_interactions_are_allowed: int = 0,
     ):
         self.m = m
```

cpp/APLRClassifier.h (3 additions, 3 deletions)

```diff
@@ -46,9 +46,9 @@ class APLRClassifier
     size_t boosting_steps_before_interactions_are_allowed;
 
     APLRClassifier(size_t m = 9000, double v = 0.1, uint_fast32_t random_state = std::numeric_limits<uint_fast32_t>::lowest(), size_t n_jobs = 0,
-                   double validation_ratio = 0.2, size_t reserved_terms_times_num_x = 100, size_t bins = 300, size_t verbosity = 0, size_t max_interaction_level = 1,
-                   size_t max_interactions = 100000, size_t min_observations_in_split = 20, size_t ineligible_boosting_steps_added = 10, size_t max_eligible_terms = 5,
-                   size_t boosting_steps_before_pruning_is_done = 500, size_t boosting_steps_before_interactions_are_allowed = 0);
+                   double validation_ratio = 0.2, size_t reserved_terms_times_num_x = 100, size_t bins = 100, size_t verbosity = 0, size_t max_interaction_level = 1,
+                   size_t max_interactions = 100000, size_t min_observations_in_split = 20, size_t ineligible_boosting_steps_added = 20, size_t max_eligible_terms = 10,
+                   size_t boosting_steps_before_pruning_is_done = 0, size_t boosting_steps_before_interactions_are_allowed = 0);
     APLRClassifier(const APLRClassifier &other);
     ~APLRClassifier();
     void fit(const MatrixXd &X, const std::vector<std::string> &y, const VectorXd &sample_weight = VectorXd(0),
```

cpp/APLRRegressor.h (4 additions, 4 deletions)

```diff
@@ -158,15 +158,15 @@ class APLRRegressor
 
     APLRRegressor(size_t m = 1000, double v = 0.1, uint_fast32_t random_state = std::numeric_limits<uint_fast32_t>::lowest(), std::string loss_function = "mse",
                   std::string link_function = "identity", size_t n_jobs = 0, double validation_ratio = 0.2,
-                  size_t reserved_terms_times_num_x = 100, size_t bins = 300, size_t verbosity = 0, size_t max_interaction_level = 1, size_t max_interactions = 100000,
-                  size_t min_observations_in_split = 20, size_t ineligible_boosting_steps_added = 10, size_t max_eligible_terms = 5, double dispersion_parameter = 1.5,
+                  size_t reserved_terms_times_num_x = 100, size_t bins = 100, size_t verbosity = 0, size_t max_interaction_level = 1, size_t max_interactions = 100000,
+                  size_t min_observations_in_split = 20, size_t ineligible_boosting_steps_added = 20, size_t max_eligible_terms = 10, double dispersion_parameter = 1.5,
                   std::string validation_tuning_metric = "default", double quantile = 0.5,
                   const std::function<double(VectorXd, VectorXd, VectorXd, VectorXi, MatrixXd)> &calculate_custom_validation_error_function = {},
                   const std::function<double(VectorXd, VectorXd, VectorXd, VectorXi, MatrixXd)> &calculate_custom_loss_function = {},
                   const std::function<VectorXd(VectorXd, VectorXd, VectorXi, MatrixXd)> &calculate_custom_negative_gradient_function = {},
                   const std::function<VectorXd(VectorXd)> &calculate_custom_transform_linear_predictor_to_predictions_function = {},
                   const std::function<VectorXd(VectorXd)> &calculate_custom_differentiate_predictions_wrt_linear_predictor_function = {},
-                  size_t boosting_steps_before_pruning_is_done = 500, size_t boosting_steps_before_interactions_are_allowed = 0);
+                  size_t boosting_steps_before_pruning_is_done = 0, size_t boosting_steps_before_interactions_are_allowed = 0);
     APLRRegressor(const APLRRegressor &other);
     ~APLRRegressor();
     void fit(const MatrixXd &X, const VectorXd &y, const VectorXd &sample_weight = VectorXd(0), const std::vector<std::string> &X_names = {},
@@ -835,8 +835,8 @@ void APLRRegressor::execute_boosting_step(size_t boosting_step)
         consider_interactions(predictor_indexes, boosting_step);
         select_the_best_term_and_update_errors(boosting_step);
         prune_terms(boosting_step);
-        update_coefficient_steps(boosting_step);
     }
+    update_coefficient_steps(boosting_step);
     if (abort_boosting)
         return;
     update_term_eligibility();
```
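Beyond the default changes, the second hunk above moves update_coefficient_steps out of the block that ends at the closing brace, so it now runs on every boosting step instead of only when that block executes. A minimal Python sketch of the behavioral difference, where `branch_taken` stands in for the condition guarding that block (the condition itself is not shown in the hunk):

```python
def boosting_step_calls(branch_taken: bool, pre_commit: bool) -> list:
    """Simulate which functions run in one pass of execute_boosting_step.

    pre_commit=True models the old placement of update_coefficient_steps
    (inside the conditional block); pre_commit=False models the new placement
    (after the block, unconditional).
    """
    calls = []
    if branch_taken:
        calls += ["consider_interactions",
                  "select_the_best_term_and_update_errors",
                  "prune_terms"]
        if pre_commit:
            calls.append("update_coefficient_steps")  # old: only inside the block
    if not pre_commit:
        calls.append("update_coefficient_steps")  # new: runs every step
    return calls
```

The difference shows when the block is skipped: before this commit, update_coefficient_steps would not run at all in such a step; after it, it always runs.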

cpp/pythonbinding.cpp (5 additions, 5 deletions)

```diff
@@ -28,9 +28,9 @@ PYBIND11_MODULE(aplr_cpp, m)
                       int &, int &>(),
                  py::arg("m") = 1000, py::arg("v") = 0.1, py::arg("random_state") = 0, py::arg("loss_function") = "mse", py::arg("link_function") = "identity",
                  py::arg("n_jobs") = 0, py::arg("validation_ratio") = 0.2,
-                 py::arg("reserved_terms_times_num_x") = 100, py::arg("bins") = 300, py::arg("verbosity") = 0,
+                 py::arg("reserved_terms_times_num_x") = 100, py::arg("bins") = 100, py::arg("verbosity") = 0,
                  py::arg("max_interaction_level") = 1, py::arg("max_interactions") = 100000, py::arg("min_observations_in_split") = 20,
-                 py::arg("ineligible_boosting_steps_added") = 10, py::arg("max_eligible_terms") = 5,
+                 py::arg("ineligible_boosting_steps_added") = 20, py::arg("max_eligible_terms") = 10,
                  py::arg("dispersion_parameter") = 1.5,
                  py::arg("validation_tuning_metric") = "default",
                  py::arg("quantile") = 0.5,
@@ -39,7 +39,7 @@ PYBIND11_MODULE(aplr_cpp, m)
                  py::arg("calculate_custom_negative_gradient_function") = empty_calculate_custom_negative_gradient_function,
                  py::arg("calculate_custom_transform_linear_predictor_to_predictions_function") = empty_calculate_custom_transform_linear_predictor_to_predictions_function,
                  py::arg("calculate_custom_differentiate_predictions_wrt_linear_predictor_function") = empty_calculate_custom_differentiate_predictions_wrt_linear_predictor_function,
-                 py::arg("boosting_steps_before_pruning_is_done") = 500, py::arg("boosting_steps_before_interactions_are_allowed") = 0)
+                 py::arg("boosting_steps_before_pruning_is_done") = 0, py::arg("boosting_steps_before_interactions_are_allowed") = 0)
         .def("fit", &APLRRegressor::fit, py::arg("X"), py::arg("y"), py::arg("sample_weight") = VectorXd(0), py::arg("X_names") = std::vector<std::string>(),
              py::arg("validation_set_indexes") = std::vector<size_t>(), py::arg("prioritized_predictors_indexes") = std::vector<size_t>(),
              py::arg("monotonic_constraints") = std::vector<int>(), py::arg("group") = VectorXi(0),
@@ -206,9 +206,9 @@ PYBIND11_MODULE(aplr_cpp, m)
     py::class_<APLRClassifier>(m, "APLRClassifier", py::module_local())
         .def(py::init<int &, double &, int &, int &, double &, int &, int &, int &, int &, int &, int &, int &, int &, int &, int &>(),
              py::arg("m") = 9000, py::arg("v") = 0.1, py::arg("random_state") = 0, py::arg("n_jobs") = 0, py::arg("validation_ratio") = 0.2,
-             py::arg("reserved_terms_times_num_x") = 100, py::arg("bins") = 300, py::arg("verbosity") = 0,
+             py::arg("reserved_terms_times_num_x") = 100, py::arg("bins") = 100, py::arg("verbosity") = 0,
              py::arg("max_interaction_level") = 1, py::arg("max_interactions") = 100000, py::arg("min_observations_in_split") = 20,
-             py::arg("ineligible_boosting_steps_added") = 10, py::arg("max_eligible_terms") = 5, py::arg("boosting_steps_before_pruning_is_done") = 500,
+             py::arg("ineligible_boosting_steps_added") = 20, py::arg("max_eligible_terms") = 10, py::arg("boosting_steps_before_pruning_is_done") = 0,
              py::arg("boosting_steps_before_interactions_are_allowed") = 0)
         .def("fit", &APLRClassifier::fit, py::arg("X"), py::arg("y"), py::arg("sample_weight") = VectorXd(0), py::arg("X_names") = std::vector<std::string>(),
              py::arg("validation_set_indexes") = std::vector<size_t>(), py::arg("prioritized_predictors_indexes") = std::vector<size_t>(),
```

setup.py (1 addition, 1 deletion)

```diff
@@ -15,7 +15,7 @@
 
 setuptools.setup(
     name="aplr",
-    version="7.4.1",
+    version="7.5.0",
     description="Automatic Piecewise Linear Regression",
     ext_modules=[sfc_module],
     author="Mathias von Ottenbreit",
```
