Commit 9a660ea

pruning
1 parent a720be6 commit 9a660ea

7 files changed, +17 -13 lines changed

API_REFERENCE_FOR_CLASSIFICATION.md

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 # APLRClassifier

-## class aplr.APLRClassifier(m:int=9000, v:float=0.1, random_state:int=0, n_jobs:int=0, validation_ratio:float=0.2, bins:int=300, verbosity:int=0, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, boosting_steps_before_pruning_is_done:int = 0, boosting_steps_before_interactions_are_allowed: int = 0)
+## class aplr.APLRClassifier(m:int=9000, v:float=0.1, random_state:int=0, n_jobs:int=0, validation_ratio:float=0.2, bins:int=300, verbosity:int=0, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, boosting_steps_before_pruning_is_done:int = 500, boosting_steps_before_interactions_are_allowed: int = 0)

 ### Constructor parameters

@@ -40,8 +40,8 @@ Controls how many boosting steps a term that becomes ineligible has to remain in
 #### max_eligible_terms (default = 5)
 Limits 1) the number of terms already in the model that can be considered as interaction partners in a boosting step and 2) how many terms remain eligible in the next boosting step. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.

-#### boosting_steps_before_pruning_is_done (default = 0)
-Specifies how many boosting steps to wait before pruning the model. If 0 (default) then pruning is not done. If for example 500 then the model will be pruned in boosting steps 500, 1000, and so on. When pruning, terms are removed as long as this reduces the training error. This can be a computationally costly operation especially if the model gets many terms. Pruning may improve predictiveness.
+#### boosting_steps_before_pruning_is_done (default = 500)
+Specifies how many boosting steps to wait before pruning the model. If 0 then pruning is not done. If for example 500 (default) then the model will be pruned in boosting steps 500, 1000, and so on. When pruning, terms are removed as long as this reduces the training error. This can be a computationally costly operation especially if the model gets many terms. Pruning may improve predictiveness.

 #### boosting_steps_before_interactions_are_allowed (default = 0)
 Specifies how many boosting steps to wait before searching for interactions. If for example 800, then the algorithm will be forced to only fit main effects in the first 800 boosting steps, after which it is allowed to search for interactions (given that other hyperparameters that control interactions also allow this). The motivation for fitting main effects first may be 1) to get a cleaner looking model that puts more emphasis on main effects and 2) to speed up the algorithm since looking for interactions is computationally more demanding.
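
Reviewer note: the pruning schedule described in the `boosting_steps_before_pruning_is_done` paragraph can be illustrated with a small sketch. The helper `pruning_steps` is hypothetical and not part of aplr. It assumes that boosting steps are counted from 1, that pruning happens at every multiple of the setting up to and including `m`, and that training runs for all `m` steps. The source only states "500, 1000, and so on", so the exact boundary behaviour is an assumption.

```python
from typing import List


def pruning_steps(m: int, boosting_steps_before_pruning_is_done: int) -> List[int]:
    """Return the boosting steps at which pruning would run (illustrative only).

    Assumption: steps are 1-based and pruning occurs at each multiple of the
    interval, up to and including step m.
    """
    interval = boosting_steps_before_pruning_is_done
    if interval <= 0:
        # A value of 0 disables pruning, as stated in the documentation.
        return []
    return list(range(interval, m + 1, interval))


if __name__ == "__main__":
    # APLRRegressor defaults after this commit: m=1000, interval=500.
    print(pruning_steps(1000, 500))
    # Old default (0): no pruning at all.
    print(pruning_steps(1000, 0))
```

Under these assumptions, the new default makes the regressor (m=1000) prune twice and the classifier (m=9000) prune 18 times per fit, where the previous default never pruned.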

API_REFERENCE_FOR_REGRESSION.md

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 # APLRRegressor

-## class aplr.APLRRegressor(m:int=1000, v:float=0.1, random_state:int=0, loss_function:str="mse", link_function:str="identity", n_jobs:int=0, validation_ratio:float=0.2, bins:int=300, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, verbosity:int=0, dispersion_parameter:float=1.5, validation_tuning_metric:str="default", quantile:float=0.5, calculate_custom_validation_error_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_loss_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_negative_gradient_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_transform_linear_predictor_to_predictions_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_differentiate_predictions_wrt_linear_predictor_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, boosting_steps_before_pruning_is_done: int = 0, boosting_steps_before_interactions_are_allowed: int = 0)
+## class aplr.APLRRegressor(m:int=1000, v:float=0.1, random_state:int=0, loss_function:str="mse", link_function:str="identity", n_jobs:int=0, validation_ratio:float=0.2, bins:int=300, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, verbosity:int=0, dispersion_parameter:float=1.5, validation_tuning_metric:str="default", quantile:float=0.5, calculate_custom_validation_error_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_loss_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_negative_gradient_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_transform_linear_predictor_to_predictions_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_differentiate_predictions_wrt_linear_predictor_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, boosting_steps_before_pruning_is_done: int = 500, boosting_steps_before_interactions_are_allowed: int = 0)

 ### Constructor parameters

@@ -102,8 +102,8 @@ def calculate_custom_differentiate_predictions_wrt_linear_predictor(linear_predi
     return differentiated_predictions
 ```

-#### boosting_steps_before_pruning_is_done (default = 0)
-Specifies how many boosting steps to wait before pruning the model. If 0 (default) then pruning is not done. If for example 500 then the model will be pruned in boosting steps 500, 1000, and so on. When pruning, terms are removed as long as this reduces the training error. This can be a computationally costly operation especially if the model gets many terms. Pruning may improve predictiveness.
+#### boosting_steps_before_pruning_is_done (default = 500)
+Specifies how many boosting steps to wait before pruning the model. If 0 then pruning is not done. If for example 500 (default) then the model will be pruned in boosting steps 500, 1000, and so on. When pruning, terms are removed as long as this reduces the training error. This can be a computationally costly operation especially if the model gets many terms. Pruning may improve predictiveness.

 #### boosting_steps_before_interactions_are_allowed (default = 0)
 Specifies how many boosting steps to wait before searching for interactions. If for example 800, then the algorithm will be forced to only fit main effects in the first 800 boosting steps, after which it is allowed to search for interactions (given that other hyperparameters that control interactions also allow this). The motivation for fitting main effects first may be 1) to get a cleaner looking model that puts more emphasis on main effects and 2) to speed up the algorithm since looking for interactions is computationally more demanding.

aplr/aplr.py

Lines changed: 2 additions & 2 deletions
@@ -60,7 +60,7 @@ def __init__(
         calculate_custom_differentiate_predictions_wrt_linear_predictor_function: Optional[
             Callable[[npt.ArrayLike], npt.ArrayLike]
         ] = None,
-        boosting_steps_before_pruning_is_done: int = 0,
+        boosting_steps_before_pruning_is_done: int = 500,
         boosting_steps_before_interactions_are_allowed: int = 0,
     ):
         self.m = m
@@ -279,7 +279,7 @@ def __init__(
         min_observations_in_split: int = 20,
         ineligible_boosting_steps_added: int = 10,
         max_eligible_terms: int = 5,
-        boosting_steps_before_pruning_is_done: int = 0,
+        boosting_steps_before_pruning_is_done: int = 500,
         boosting_steps_before_interactions_are_allowed: int = 0,
     ):
         self.m = m
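
Reviewer note: because the constructor default changes from 0 to 500, callers who relied on the old no-pruning behaviour now need to pass the value explicitly. A minimal sketch of pinning it is below. The helper `legacy_pruning_params` is hypothetical. Only the parameter name `boosting_steps_before_pruning_is_done` and the class `APLRRegressor` come from the diff. The import is guarded so the snippet still runs where the aplr package is not installed.

```python
from typing import Any, Dict


def legacy_pruning_params(**overrides: Any) -> Dict[str, Any]:
    """Constructor kwargs that keep pruning disabled unless overridden (hypothetical helper)."""
    params: Dict[str, Any] = {"boosting_steps_before_pruning_is_done": 0}
    params.update(overrides)
    return params


if __name__ == "__main__":
    kwargs = legacy_pruning_params(m=1000)
    try:
        from aplr import APLRRegressor  # requires the aplr package
    except ImportError:
        # aplr is not available here; just show the kwargs that would be passed.
        print(kwargs)
    else:
        model = APLRRegressor(**kwargs)
        print(type(model).__name__)
```

Passing the value explicitly gives the same pruning behaviour on both sides of this commit, which may matter when comparing models fitted with 7.3.0 and 7.4.0.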

cpp/APLRClassifier.h

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ class APLRClassifier
     APLRClassifier(size_t m = 9000, double v = 0.1, uint_fast32_t random_state = std::numeric_limits<uint_fast32_t>::lowest(), size_t n_jobs = 0,
                    double validation_ratio = 0.2, size_t reserved_terms_times_num_x = 100, size_t bins = 300, size_t verbosity = 0, size_t max_interaction_level = 1,
                    size_t max_interactions = 100000, size_t min_observations_in_split = 20, size_t ineligible_boosting_steps_added = 10, size_t max_eligible_terms = 5,
-                   size_t boosting_steps_before_pruning_is_done = 0, size_t boosting_steps_before_interactions_are_allowed = 0);
+                   size_t boosting_steps_before_pruning_is_done = 500, size_t boosting_steps_before_interactions_are_allowed = 0);
     APLRClassifier(const APLRClassifier &other);
     ~APLRClassifier();
     void fit(const MatrixXd &X, const std::vector<std::string> &y, const VectorXd &sample_weight = VectorXd(0),

cpp/APLRRegressor.h

Lines changed: 5 additions & 1 deletion
@@ -166,7 +166,7 @@ class APLRRegressor
                   const std::function<VectorXd(VectorXd, VectorXd, VectorXi, MatrixXd)> &calculate_custom_negative_gradient_function = {},
                   const std::function<VectorXd(VectorXd)> &calculate_custom_transform_linear_predictor_to_predictions_function = {},
                   const std::function<VectorXd(VectorXd)> &calculate_custom_differentiate_predictions_wrt_linear_predictor_function = {},
-                  size_t boosting_steps_before_pruning_is_done = 0, size_t boosting_steps_before_interactions_are_allowed = 0);
+                  size_t boosting_steps_before_pruning_is_done = 500, size_t boosting_steps_before_interactions_are_allowed = 0);
     APLRRegressor(const APLRRegressor &other);
     ~APLRRegressor();
     void fit(const MatrixXd &X, const VectorXd &y, const VectorXd &sample_weight = VectorXd(0), const std::vector<std::string> &X_names = {},
@@ -1269,6 +1269,10 @@ void APLRRegressor::prune_terms(size_t boosting_step)
     {
         remove_unused_terms();
         remove_ineligibility();
+        if (verbosity >= 2)
+        {
+            std::cout << "Pruned " << std::to_string(terms_pruned) <<" terms.\n";
+        }
     }
 }


cpp/pythonbinding.cpp

Lines changed: 2 additions & 2 deletions
@@ -39,7 +39,7 @@ PYBIND11_MODULE(aplr_cpp, m)
              py::arg("calculate_custom_negative_gradient_function") = empty_calculate_custom_negative_gradient_function,
              py::arg("calculate_custom_transform_linear_predictor_to_predictions_function") = empty_calculate_custom_transform_linear_predictor_to_predictions_function,
              py::arg("calculate_custom_differentiate_predictions_wrt_linear_predictor_function") = empty_calculate_custom_differentiate_predictions_wrt_linear_predictor_function,
-             py::arg("boosting_steps_before_pruning_is_done") = 0, py::arg("boosting_steps_before_interactions_are_allowed") = 0)
+             py::arg("boosting_steps_before_pruning_is_done") = 500, py::arg("boosting_steps_before_interactions_are_allowed") = 0)
         .def("fit", &APLRRegressor::fit, py::arg("X"), py::arg("y"), py::arg("sample_weight") = VectorXd(0), py::arg("X_names") = std::vector<std::string>(),
              py::arg("validation_set_indexes") = std::vector<size_t>(), py::arg("prioritized_predictors_indexes") = std::vector<size_t>(),
              py::arg("monotonic_constraints") = std::vector<int>(), py::arg("group") = VectorXi(0),
@@ -208,7 +208,7 @@ PYBIND11_MODULE(aplr_cpp, m)
              py::arg("m") = 9000, py::arg("v") = 0.1, py::arg("random_state") = 0, py::arg("n_jobs") = 0, py::arg("validation_ratio") = 0.2,
              py::arg("reserved_terms_times_num_x") = 100, py::arg("bins") = 300, py::arg("verbosity") = 0,
              py::arg("max_interaction_level") = 1, py::arg("max_interactions") = 100000, py::arg("min_observations_in_split") = 20,
-             py::arg("ineligible_boosting_steps_added") = 10, py::arg("max_eligible_terms") = 5, py::arg("boosting_steps_before_pruning_is_done") = 0,
+             py::arg("ineligible_boosting_steps_added") = 10, py::arg("max_eligible_terms") = 5, py::arg("boosting_steps_before_pruning_is_done") = 500,
              py::arg("boosting_steps_before_interactions_are_allowed") = 0)
         .def("fit", &APLRClassifier::fit, py::arg("X"), py::arg("y"), py::arg("sample_weight") = VectorXd(0), py::arg("X_names") = std::vector<std::string>(),
              py::arg("validation_set_indexes") = std::vector<size_t>(), py::arg("prioritized_predictors_indexes") = std::vector<size_t>(),

setup.py

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@

 setuptools.setup(
     name="aplr",
-    version="7.3.0",
+    version="7.4.0",
     description="Automatic Piecewise Linear Regression",
     ext_modules=[sfc_module],
     author="Mathias von Ottenbreit",
