Commit 0f42709

10.7.2
1 parent 3b605af commit 0f42709

6 files changed

+90
-10
lines changed

API_REFERENCE_FOR_CLASSIFICATION.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -41,7 +41,7 @@ Controls how many boosting steps a term that becomes ineligible has to remain in
4141
Limits 1) the number of terms already in the model that can be considered as interaction partners in a boosting step and 2) how many terms remain eligible in the next boosting step. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
4242

4343
#### boosting_steps_before_interactions_are_allowed (default = 0)
44-
Specifies how many boosting steps to wait before searching for interactions. If for example 800, then the algorithm will be forced to only fit main effects in the first 800 boosting steps, after which it is allowed to search for interactions (given that other hyperparameters that control interactions also allow this). The motivation for fitting main effects first may be 1) to get a cleaner looking model that puts more emphasis on main effects and 2) to speed up the algorithm since looking for interactions is computationally more demanding.
44+
Specifies how many boosting steps to wait before searching for interactions. If for example 800, then the algorithm will be forced to only fit main effects in the first 800 boosting steps, after which it is allowed to search for interactions (given that other hyperparameters that control interactions also allow this). The motivation for fitting main effects first may be 1) to get a cleaner looking model that puts more emphasis on main effects and 2) to speed up the algorithm since looking for interactions is computationally more demanding. Please note that when greater than zero then the algorithm chooses the model from the boosting step with the lowest validation error before proceeding to interaction terms. The latter prevents overfitting.
4545

4646
#### monotonic_constraints_ignore_interactions (default = False)
4747
See ***monotonic_constraints*** in the ***fit*** method.
@@ -50,7 +50,7 @@ See ***monotonic_constraints*** in the ***fit*** method.
5050
If validation loss does not improve during the last ***early_stopping_rounds*** boosting steps then boosting is aborted. The point with this constructor parameter is to speed up the training and make it easier to select a high ***m***.
5151

5252
#### num_first_steps_with_linear_effects_only (default = 0)
53-
Specifies the number of initial boosting steps that are reserved only for linear effects. 0 means that non-linear effects are allowed from the first boosting step. Reasons for setting this parameter to a higher value than 0 could be to 1) build a more interpretable model with more emphasis on linear effects or 2) build a linear only model by setting ***num_first_steps_with_linear_effects_only*** to no less than ***m***.
53+
Specifies the number of initial boosting steps that are reserved only for linear effects. 0 means that non-linear effects are allowed from the first boosting step. Reasons for setting this parameter to a higher value than 0 could be to 1) build a more interpretable model with more emphasis on linear effects or 2) build a linear only model by setting ***num_first_steps_with_linear_effects_only*** to no less than ***m***. Please note that when greater than zero then the algorithm chooses the model from the boosting step with the lowest validation error before proceeding to non-linear effects or interactions. The latter prevents overfitting.
5454

5555
#### penalty_for_non_linearity (default = 0.0)
5656
Specifies a penalty in the range [0.0, 1.0] on terms that are not linear effects. A higher value increases model interpretability but can hurt predictiveness. Values outside of the [0.0, 1.0] range are rounded to the nearest boundary within the range.
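
The boundary rule described for ***penalty_for_non_linearity*** can be illustrated with a minimal sketch. The helper name `clamp_penalty_for_non_linearity` is hypothetical and is not taken from the APLR source; it only mirrors the documented behaviour that out-of-range values are moved to the nearest boundary of [0.0, 1.0].

```cpp
#include <algorithm>

// Hypothetical helper (not part of APLR): maps any input to the documented
// [0.0, 1.0] range by moving out-of-range values to the nearest boundary.
double clamp_penalty_for_non_linearity(double penalty)
{
    return std::min(std::max(penalty, 0.0), 1.0);
}
```

Under this sketch a value of -0.3 becomes 0.0, a value of 1.7 becomes 1.0, and a value already inside the range is returned unchanged.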

API_REFERENCE_FOR_REGRESSION.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -103,7 +103,7 @@ def calculate_custom_differentiate_predictions_wrt_linear_predictor(linear_predi
103103
```
104104

105105
#### boosting_steps_before_interactions_are_allowed (default = 0)
106-
Specifies how many boosting steps to wait before searching for interactions. If for example 800, then the algorithm will be forced to only fit main effects in the first 800 boosting steps, after which it is allowed to search for interactions (given that other hyperparameters that control interactions also allow this). The motivation for fitting main effects first may be 1) to get a cleaner looking model that puts more emphasis on main effects and 2) to speed up the algorithm since looking for interactions is computationally more demanding.
106+
Specifies how many boosting steps to wait before searching for interactions. If for example 800, then the algorithm will be forced to only fit main effects in the first 800 boosting steps, after which it is allowed to search for interactions (given that other hyperparameters that control interactions also allow this). The motivation for fitting main effects first may be 1) to get a cleaner looking model that puts more emphasis on main effects and 2) to speed up the algorithm since looking for interactions is computationally more demanding. Please note that when greater than zero then the algorithm chooses the model from the boosting step with the lowest validation error before proceeding to interaction terms. The latter prevents overfitting.
107107

108108
#### monotonic_constraints_ignore_interactions (default = False)
109109
See ***monotonic_constraints*** in the ***fit*** method.
@@ -118,7 +118,7 @@ When ***loss_function*** equals ***group_mse_cycle*** then ***group_mse_cycle_mi
118118
If validation loss does not improve during the last ***early_stopping_rounds*** boosting steps then boosting is aborted. The point with this constructor parameter is to speed up the training and make it easier to select a high ***m***.
119119

120120
#### num_first_steps_with_linear_effects_only (default = 0)
121-
Specifies the number of initial boosting steps that are reserved only for linear effects. 0 means that non-linear effects are allowed from the first boosting step. Reasons for setting this parameter to a higher value than 0 could be to 1) build a more interpretable model with more emphasis on linear effects or 2) build a linear only model by setting ***num_first_steps_with_linear_effects_only*** to no less than ***m***.
121+
Specifies the number of initial boosting steps that are reserved only for linear effects. 0 means that non-linear effects are allowed from the first boosting step. Reasons for setting this parameter to a higher value than 0 could be to 1) build a more interpretable model with more emphasis on linear effects or 2) build a linear only model by setting ***num_first_steps_with_linear_effects_only*** to no less than ***m***. Please note that when greater than zero then the algorithm chooses the model from the boosting step with the lowest validation error before proceeding to non-linear effects or interactions. The latter prevents overfitting.
122122

123123
#### penalty_for_non_linearity (default = 0.0)
124124
Specifies a penalty in the range [0.0, 1.0] on terms that are not linear effects. A higher value increases model interpretability but can hurt predictiveness. Values outside of the [0.0, 1.0] range are rounded to the nearest boundary within the range.
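
How ***num_first_steps_with_linear_effects_only*** and ***boosting_steps_before_interactions_are_allowed*** split the boosting steps into phases can be sketched as follows. The enum `BoostingPhase` and both function names are hypothetical and chosen for illustration only; the comparisons follow the flags computed in `execute_boosting_steps` in this commit, but this is a simplified reading rather than the library's implementation.

```cpp
#include <cstddef>

// Hypothetical names for the three phases implied by the two hyperparameters.
enum class BoostingPhase
{
    LinearEffectsOnly,  // only linear effects may be fitted
    MainEffectsOnly,    // non-linear main effects allowed, interactions not yet
    InteractionsAllowed // no restriction from these two hyperparameters
};

// Classifies a zero-based boosting step.
BoostingPhase phase_of_step(std::size_t boosting_step,
                            std::size_t num_first_steps_with_linear_effects_only,
                            std::size_t boosting_steps_before_interactions_are_allowed)
{
    if (boosting_step < num_first_steps_with_linear_effects_only)
        return BoostingPhase::LinearEffectsOnly;
    if (boosting_step < boosting_steps_before_interactions_are_allowed)
        return BoostingPhase::MainEffectsOnly;
    return BoostingPhase::InteractionsAllowed;
}

// True when the step is the last one of a restricted phase and at least one
// more boosting step follows, i.e. a point where the model with the lowest
// validation error so far would be selected before moving on.
bool is_checkpoint_step(std::size_t boosting_step, std::size_t m,
                        std::size_t num_first_steps_with_linear_effects_only,
                        std::size_t boosting_steps_before_interactions_are_allowed)
{
    if (boosting_step + 1 >= m)
        return false;
    BoostingPhase phase = phase_of_step(boosting_step,
                                        num_first_steps_with_linear_effects_only,
                                        boosting_steps_before_interactions_are_allowed);
    if (phase == BoostingPhase::LinearEffectsOnly)
        return boosting_step + 1 == num_first_steps_with_linear_effects_only;
    if (phase == BoostingPhase::MainEffectsOnly)
        return boosting_step + 1 == boosting_steps_before_interactions_are_allowed;
    return false;
}
```

With the values used in the accompanying test (80 linear-only steps, 90 steps before interactions, ***m*** = 100), steps 0 to 79 are linear only, steps 80 to 89 allow non-linear main effects, and steps 90 onward may add interactions; under this sketch the checkpoints fall on steps 79 and 89.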

cpp/APLRRegressor.h

Lines changed: 28 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -76,10 +76,12 @@ class APLRRegressor
7676
double best_validation_error_so_far;
7777
size_t best_m_so_far;
7878
bool linear_effects_only_in_this_boosting_step;
79+
bool non_linear_effects_allowed_in_this_boosting_step;
7980
bool max_terms_reached;
8081
bool round_robin_update_of_existing_terms;
8182
size_t term_to_update_in_this_boosting_step;
8283
size_t cores_to_use;
84+
bool stopped_early;
8385

8486
void validate_input_to_fit(const MatrixXd &X, const VectorXd &y, const VectorXd &sample_weight, const std::vector<std::string> &X_names,
8587
const MatrixXi &cv_observations, const std::vector<size_t> &prioritized_predictors_indexes,
@@ -1171,11 +1173,26 @@ VectorXd APLRRegressor::differentiate_predictions_wrt_linear_predictor()
11711173

11721174
void APLRRegressor::execute_boosting_steps(Eigen::Index fold_index)
11731175
{
1176+
stopped_early = false;
11741177
abort_boosting = false;
11751178
for (size_t boosting_step = 0; boosting_step < m; ++boosting_step)
11761179
{
11771180
linear_effects_only_in_this_boosting_step = num_first_steps_with_linear_effects_only > boosting_step;
1181+
non_linear_effects_allowed_in_this_boosting_step = boosting_steps_before_interactions_are_allowed > boosting_step && !linear_effects_only_in_this_boosting_step;
1182+
bool last_linear_effects_only_step{linear_effects_only_in_this_boosting_step && boosting_step == num_first_steps_with_linear_effects_only - 1};
1183+
bool last_step_before_interactions{non_linear_effects_allowed_in_this_boosting_step && boosting_step == boosting_steps_before_interactions_are_allowed - 1};
11781184
execute_boosting_step(boosting_step, fold_index);
1185+
if (stopped_early)
1186+
{
1187+
if (linear_effects_only_in_this_boosting_step)
1188+
boosting_step = std::min(num_first_steps_with_linear_effects_only - 1, m - 1);
1189+
else if (non_linear_effects_allowed_in_this_boosting_step)
1190+
boosting_step = std::min(boosting_steps_before_interactions_are_allowed - 1, m - 1);
1191+
best_m_so_far = boosting_step;
1192+
stopped_early = false;
1193+
}
1194+
else if ((last_linear_effects_only_step || last_step_before_interactions) && boosting_step + 1 < m)
1195+
find_optimal_m_and_update_model_accordingly();
11791196
if (abort_boosting)
11801197
break;
11811198
if (loss_function == "group_mse_cycle")
@@ -1823,9 +1840,17 @@ void APLRRegressor::abort_boosting_when_no_validation_error_improvement_in_the_l
18231840
bool no_improvement_for_too_long{boosting_step > best_m_so_far + early_stopping_rounds};
18241841
if (no_improvement_for_too_long)
18251842
{
1826-
abort_boosting = true;
1827-
if (verbosity >= 1)
1828-
std::cout << "Aborting boosting because of no validation error improvement in the last " << std::to_string(early_stopping_rounds) << " steps.\n";
1843+
if (linear_effects_only_in_this_boosting_step || non_linear_effects_allowed_in_this_boosting_step)
1844+
{
1845+
find_optimal_m_and_update_model_accordingly();
1846+
stopped_early = true;
1847+
}
1848+
else
1849+
{
1850+
abort_boosting = true;
1851+
if (verbosity >= 1)
1852+
std::cout << "Aborting boosting because of no validation error improvement in the last " << std::to_string(early_stopping_rounds) << " steps.\n";
1853+
}
18291854
}
18301855
}
18311856
}
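
The control flow added above can be read as: inside a restricted phase, a lack of validation improvement no longer aborts boosting but instead selects the best model so far and jumps to the next phase. The following is a simplified, stand-alone sketch of that idea, not the APLR implementation. `PhaseOutcome` and `run_restricted_phase` are hypothetical names, the validation errors are supplied as a plain vector instead of being computed from a model, and the vector is assumed to be non-empty with one entry per step of the phase.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical summary of what happens during one restricted phase.
struct PhaseOutcome
{
    std::size_t best_step; // step whose model would be kept
    std::size_t next_step; // step at which boosting resumes (start of next phase)
    bool stopped_early;    // true if the phase ended before its last step ran
};

// Walks through the validation errors of one phase (one entry per step).
// If no improvement is seen for more than early_stopping_rounds steps, the
// phase ends early; in both cases boosting resumes at the next phase.
PhaseOutcome run_restricted_phase(const std::vector<double> &validation_error_per_step,
                                  std::size_t early_stopping_rounds)
{
    std::size_t best_step = 0;
    double best_error = std::numeric_limits<double>::infinity();
    const std::size_t phase_length = validation_error_per_step.size();
    for (std::size_t step = 0; step < phase_length; ++step)
    {
        if (validation_error_per_step[step] < best_error)
        {
            best_error = validation_error_per_step[step];
            best_step = step;
        }
        // Same shape as the check in the diff: boosting_step > best_m_so_far + early_stopping_rounds.
        bool no_improvement_for_too_long = step > best_step + early_stopping_rounds;
        if (no_improvement_for_too_long)
            return {best_step, phase_length, true};
    }
    return {best_step, phase_length, false};
}
```

For example, with errors `{5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 2.0}` and `early_stopping_rounds = 2`, the sketch stops at step 5, keeps the model from step 2, and resumes at step 7; the later improvement at step 6 is never evaluated, which is the trade-off of stopping early within a phase.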

cpp/tests.cpp

Lines changed: 57 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -236,8 +236,62 @@ class Tests
236236
model.ineligible_boosting_steps_added = 10;
237237
model.max_eligible_terms = 5;
238238
model.dispersion_parameter = 1.0;
239-
model.boosting_steps_before_interactions_are_allowed = 60;
239+
model.boosting_steps_before_interactions_are_allowed = 90;
240+
model.num_first_steps_with_linear_effects_only = 80;
241+
242+
// Data
243+
MatrixXd X_train{load_csv_into_eigen_matrix<MatrixXd>("data/X_train.csv")};
244+
MatrixXd X_test{load_csv_into_eigen_matrix<MatrixXd>("data/X_test.csv")};
245+
VectorXd y_train{load_csv_into_eigen_matrix<MatrixXd>("data/y_train.csv")};
246+
VectorXd y_test{load_csv_into_eigen_matrix<MatrixXd>("data/y_test.csv")};
247+
248+
VectorXd sample_weight{VectorXd::Constant(y_train.size(), 1.0)};
249+
250+
MatrixXi cv_observations = MatrixXi::Constant(y_train.rows(), 2, 1);
251+
cv_observations.col(0)[273] = -1;
252+
cv_observations.col(0)[272] = -1;
253+
cv_observations.col(0)[271] = -1;
254+
cv_observations.col(0)[270] = -1;
255+
cv_observations.col(0)[269] = -1;
256+
cv_observations.col(0)[268] = -1;
257+
cv_observations.col(0)[267] = -1;
258+
cv_observations.col(0)[266] = -1;
259+
cv_observations.col(1) = -cv_observations.col(0);
260+
261+
// Fitting
262+
// model.fit(X_train,y_train);
263+
model.fit(X_train, y_train, sample_weight);
264+
// model.fit(X_train, y_train, sample_weight, {}, cv_observations);
265+
std::cout << "feature importance\n"
266+
<< model.feature_importance << "\n\n";
267+
268+
VectorXd predictions{model.predict(X_test)};
269+
270+
// Saving results
271+
save_as_csv_file("data/output.csv", predictions);
272+
273+
std::cout << predictions.mean() << "\n\n";
274+
tests.push_back(is_approximately_equal(predictions.mean(), 17.380763842227257));
275+
}
276+
277+
void test_aplrregressor_cauchy_linear_effects_only_first_2()
278+
{
279+
// Model
280+
APLRRegressor model{APLRRegressor()};
281+
model.m = 100;
282+
model.v = 1.0;
283+
model.bins = 200;
284+
model.n_jobs = 1;
285+
model.loss_function = "cauchy";
286+
model.verbosity = 3;
287+
model.max_interaction_level = 100;
288+
model.min_observations_in_split = 10;
289+
model.ineligible_boosting_steps_added = 10;
290+
model.max_eligible_terms = 5;
291+
model.dispersion_parameter = 1.0;
292+
model.boosting_steps_before_interactions_are_allowed = 90;
240293
model.num_first_steps_with_linear_effects_only = 80;
294+
model.early_stopping_rounds = 1;
241295

242296
// Data
243297
MatrixXd X_train{load_csv_into_eigen_matrix<MatrixXd>("data/X_train.csv")};
@@ -271,7 +325,7 @@ class Tests
271325
save_as_csv_file("data/output.csv", predictions);
272326

273327
std::cout << predictions.mean() << "\n\n";
274-
tests.push_back(is_approximately_equal(predictions.mean(), 17.965154984786622));
328+
tests.push_back(is_approximately_equal(predictions.mean(), 17.886569073729863));
275329
}
276330

277331
void test_aplrregressor_cauchy_group_mse_validation()
@@ -2354,6 +2408,7 @@ int main()
23542408
tests.test_aplrregressor_cauchy_predictor_specific_penalties_and_learning_rates();
23552409
tests.test_aplrregressor_cauchy_penalties();
23562410
tests.test_aplrregressor_cauchy_linear_effects_only_first();
2411+
tests.test_aplrregressor_cauchy_linear_effects_only_first_2();
23572412
tests.test_aplrregressor_cauchy_group_mse_validation();
23582413
tests.test_aplrregressor_cauchy_group_mse_by_prediction_validation();
23592414
tests.test_aplrregressor_cauchy();
Binary file not shown.

setup.py

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -27,7 +27,7 @@
2727

2828
setuptools.setup(
2929
name="aplr",
30-
version="10.7.1",
30+
version="10.7.2",
3131
description="Automatic Piecewise Linear Regression",
3232
ext_modules=[sfc_module],
3333
author="Mathias von Ottenbreit",
