Commit a39bd12

Added the possibility to use linear effects only for a custom number of initial boosting steps
1 parent 0069510 commit a39bd12

10 files changed: +167 -66 lines changed

API_REFERENCE_FOR_CLASSIFICATION.md

Lines changed: 4 additions & 1 deletion
@@ -1,6 +1,6 @@
 # APLRClassifier

-## class aplr.APLRClassifier(m:int=3000, v:float=0.1, random_state:int=0, n_jobs:int=0, cv_folds:int=5, bins:int=300, verbosity:int=0, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, boosting_steps_before_interactions_are_allowed: int = 0, monotonic_constraints_ignore_interactions: bool = False, early_stopping_rounds: int = 500)
+## class aplr.APLRClassifier(m:int=3000, v:float=0.1, random_state:int=0, n_jobs:int=0, cv_folds:int=5, bins:int=300, verbosity:int=0, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, boosting_steps_before_interactions_are_allowed: int = 0, monotonic_constraints_ignore_interactions: bool = False, early_stopping_rounds: int = 500, num_first_steps_with_linear_effects_only: int = 0)

 ### Constructor parameters

@@ -49,6 +49,9 @@ See ***monotonic_constraints*** in the ***fit*** method.
 #### early_stopping_rounds (default = 500)
 If the validation loss does not improve during the last ***early_stopping_rounds*** boosting steps, boosting is aborted. The purpose of this constructor parameter is to speed up training and make it easier to select a high ***m***.

+#### num_first_steps_with_linear_effects_only (default = 0)
+Specifies the number of initial boosting steps that are reserved for linear effects only. 0 means that non-linear effects are allowed from the first boosting step. Reasons for setting this parameter to a value higher than 0 include 1) building a more interpretable model with more emphasis on linear effects and 2) building a linear-only model by setting ***num_first_steps_with_linear_effects_only*** to at least ***m***.
+

 ## Method: fit(X:npt.ArrayLike, y:List[str], sample_weight:npt.ArrayLike = np.empty(0), X_names:List[str]=[], cv_observations: npt.ArrayLike = np.empty([0, 0]), prioritized_predictors_indexes:List[int]=[], monotonic_constraints:List[int]=[], interaction_constraints:List[List[int]]=[])
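
The two use cases that the documentation lists for ***num_first_steps_with_linear_effects_only*** can be sketched as constructor keyword arguments. This is a minimal illustration, not code from the commit: the helper name `aplr_linear_phase_kwargs` and its arguments `linear_only_model` and `linear_steps` are hypothetical, and only `m` and `num_first_steps_with_linear_effects_only` come from the documented signature.

```python
def aplr_linear_phase_kwargs(m=3000, linear_only_model=False, linear_steps=0):
    """Hypothetical helper: build keyword arguments for APLRClassifier or APLRRegressor."""
    if linear_steps < 0:
        raise ValueError("linear_steps must be non-negative")
    # Per the documentation, a value of at least m gives a linear-only model.
    steps = m if linear_only_model else linear_steps
    return {"m": m, "num_first_steps_with_linear_effects_only": steps}


# Use case 1: reserve the first 500 of 3000 boosting steps for linear effects.
emphasis_kwargs = aplr_linear_phase_kwargs(m=3000, linear_steps=500)

# Use case 2: a linear-only model (every boosting step is reserved).
linear_only_kwargs = aplr_linear_phase_kwargs(m=1000, linear_only_model=True)
```

Assuming a version of the aplr package that includes this commit is installed, the dictionaries could presumably be passed on as `APLRClassifier(**emphasis_kwargs)` or `APLRRegressor(**linear_only_kwargs)`.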

API_REFERENCE_FOR_REGRESSION.md

Lines changed: 4 additions & 1 deletion
@@ -1,6 +1,6 @@
 # APLRRegressor

-## class aplr.APLRRegressor(m:int=3000, v:float=0.1, random_state:int=0, loss_function:str="mse", link_function:str="identity", n_jobs:int=0, cv_folds:int=5, bins:int=300, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, verbosity:int=0, dispersion_parameter:float=1.5, validation_tuning_metric:str="default", quantile:float=0.5, calculate_custom_validation_error_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_loss_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_negative_gradient_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_transform_linear_predictor_to_predictions_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_differentiate_predictions_wrt_linear_predictor_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, boosting_steps_before_interactions_are_allowed: int = 0, monotonic_constraints_ignore_interactions: bool = False, group_mse_by_prediction_bins: int = 10, group_mse_cycle_min_obs_in_bin: int = 30, early_stopping_rounds: int = 500)
+## class aplr.APLRRegressor(m:int=3000, v:float=0.1, random_state:int=0, loss_function:str="mse", link_function:str="identity", n_jobs:int=0, cv_folds:int=5, bins:int=300, max_interaction_level:int=1, max_interactions:int=100000, min_observations_in_split:int=20, ineligible_boosting_steps_added:int=10, max_eligible_terms:int=5, verbosity:int=0, dispersion_parameter:float=1.5, validation_tuning_metric:str="default", quantile:float=0.5, calculate_custom_validation_error_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_loss_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], float]]=None, calculate_custom_negative_gradient_function:Optional[Callable[[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike, npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_transform_linear_predictor_to_predictions_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, calculate_custom_differentiate_predictions_wrt_linear_predictor_function:Optional[Callable[[npt.ArrayLike], npt.ArrayLike]]=None, boosting_steps_before_interactions_are_allowed: int = 0, monotonic_constraints_ignore_interactions: bool = False, group_mse_by_prediction_bins: int = 10, group_mse_cycle_min_obs_in_bin: int = 30, early_stopping_rounds: int = 500, num_first_steps_with_linear_effects_only: int = 0)

 ### Constructor parameters

@@ -117,6 +117,9 @@ When ***loss_function*** equals ***group_mse_cycle*** then ***group_mse_cycle_mi
 #### early_stopping_rounds (default = 500)
 If the validation loss does not improve during the last ***early_stopping_rounds*** boosting steps, boosting is aborted. The purpose of this constructor parameter is to speed up training and make it easier to select a high ***m***.

+#### num_first_steps_with_linear_effects_only (default = 0)
+Specifies the number of initial boosting steps that are reserved for linear effects only. 0 means that non-linear effects are allowed from the first boosting step. Reasons for setting this parameter to a value higher than 0 include 1) building a more interpretable model with more emphasis on linear effects and 2) building a linear-only model by setting ***num_first_steps_with_linear_effects_only*** to at least ***m***.
+

 ## Method: fit(X:npt.ArrayLike, y:npt.ArrayLike, sample_weight:npt.ArrayLike = np.empty(0), X_names:List[str]=[], cv_observations: npt.ArrayLike = np.empty([0, 0]), prioritized_predictors_indexes:List[int]=[], monotonic_constraints:List[int]=[], group:npt.ArrayLike = np.empty(0), interaction_constraints:List[List[int]]=[], other_data: npt.ArrayLike = np.empty([0, 0]))

aplr/aplr.py

Lines changed: 16 additions & 0 deletions
@@ -65,6 +65,7 @@ def __init__(
         group_mse_by_prediction_bins: int = 10,
         group_mse_cycle_min_obs_in_bin: int = 30,
         early_stopping_rounds: int = 500,
+        num_first_steps_with_linear_effects_only: int = 0,
     ):
         self.m = m
         self.v = v
@@ -105,6 +106,9 @@ def __init__(
         self.group_mse_by_prediction_bins = group_mse_by_prediction_bins
         self.group_mse_cycle_min_obs_in_bin = group_mse_cycle_min_obs_in_bin
         self.early_stopping_rounds = early_stopping_rounds
+        self.num_first_steps_with_linear_effects_only = (
+            num_first_steps_with_linear_effects_only
+        )

         # Creating aplr_cpp and setting parameters
         self.APLRRegressor = aplr_cpp.APLRRegressor()
@@ -159,6 +163,9 @@ def __set_params_cpp(self):
             self.group_mse_cycle_min_obs_in_bin
         )
         self.APLRRegressor.early_stopping_rounds = self.early_stopping_rounds
+        self.APLRRegressor.num_first_steps_with_linear_effects_only = (
+            self.num_first_steps_with_linear_effects_only
+        )

     def fit(
         self,
@@ -286,6 +293,7 @@ def get_params(self, deep=True):
             "group_mse_by_prediction_bins": self.group_mse_by_prediction_bins,
             "group_mse_cycle_min_obs_in_bin": self.group_mse_cycle_min_obs_in_bin,
             "early_stopping_rounds": self.early_stopping_rounds,
+            "num_first_steps_with_linear_effects_only": self.num_first_steps_with_linear_effects_only,
         }

     # For sklearn
@@ -314,6 +322,7 @@ def __init__(
         boosting_steps_before_interactions_are_allowed: int = 0,
         monotonic_constraints_ignore_interactions: bool = False,
         early_stopping_rounds: int = 500,
+        num_first_steps_with_linear_effects_only: int = 0,
     ):
         self.m = m
         self.v = v
@@ -334,6 +343,9 @@ def __init__(
             monotonic_constraints_ignore_interactions
         )
         self.early_stopping_rounds = early_stopping_rounds
+        self.num_first_steps_with_linear_effects_only = (
+            num_first_steps_with_linear_effects_only
+        )

         # Creating aplr_cpp and setting parameters
         self.APLRClassifier = aplr_cpp.APLRClassifier()
@@ -362,6 +374,9 @@ def __set_params_cpp(self):
             self.monotonic_constraints_ignore_interactions
         )
         self.APLRClassifier.early_stopping_rounds = self.early_stopping_rounds
+        self.APLRClassifier.num_first_steps_with_linear_effects_only = (
+            self.num_first_steps_with_linear_effects_only
+        )

     def fit(
         self,
@@ -434,6 +449,7 @@ def get_params(self, deep=True):
             "boosting_steps_before_interactions_are_allowed": self.boosting_steps_before_interactions_are_allowed,
             "monotonic_constraints_ignore_interactions": self.monotonic_constraints_ignore_interactions,
             "early_stopping_rounds": self.early_stopping_rounds,
+            "num_first_steps_with_linear_effects_only": self.num_first_steps_with_linear_effects_only,
         }

     # For sklearn
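
The `get_params` additions in both wrapper classes are likely needed because scikit-learn-style cloning rebuilds an estimator from the dictionary that `get_params` returns, so a parameter missing from that dictionary would silently fall back to its default. The sketch below illustrates the pattern with a hypothetical stand-in class `TinyEstimator` and a hypothetical helper `clone_from_params`; neither is part of aplr, and the real wrappers carry many more parameters.

```python
class TinyEstimator:
    """Hypothetical stand-in that mirrors the parameter plumbing in the diff."""

    def __init__(self, m=3000, num_first_steps_with_linear_effects_only=0):
        self.m = m
        self.num_first_steps_with_linear_effects_only = (
            num_first_steps_with_linear_effects_only
        )

    def get_params(self, deep=True):
        # Every constructor parameter must appear here to survive a clone.
        return {
            "m": self.m,
            "num_first_steps_with_linear_effects_only": self.num_first_steps_with_linear_effects_only,
        }

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self


def clone_from_params(estimator):
    """Rebuild an estimator from its own get_params(), as a clone would."""
    return type(estimator)(**estimator.get_params())


original = TinyEstimator(m=100, num_first_steps_with_linear_effects_only=40)
copy = clone_from_params(original)
```

If the new key were left out of `get_params`, `copy` would be constructed with the default value of 0 instead of 40.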

cpp/APLRClassifier.h

Lines changed: 9 additions & 4 deletions
@@ -45,12 +45,13 @@ class APLRClassifier
     size_t boosting_steps_before_interactions_are_allowed;
     bool monotonic_constraints_ignore_interactions;
     size_t early_stopping_rounds;
+    size_t num_first_steps_with_linear_effects_only;

     APLRClassifier(size_t m = 3000, double v = 0.1, uint_fast32_t random_state = std::numeric_limits<uint_fast32_t>::lowest(), size_t n_jobs = 0,
                    size_t cv_folds = 5, size_t reserved_terms_times_num_x = 100, size_t bins = 300, size_t verbosity = 0, size_t max_interaction_level = 1,
                    size_t max_interactions = 100000, size_t min_observations_in_split = 20, size_t ineligible_boosting_steps_added = 10, size_t max_eligible_terms = 5,
                    size_t boosting_steps_before_interactions_are_allowed = 0, bool monotonic_constraints_ignore_interactions = false,
-                   size_t early_stopping_rounds = 500);
+                   size_t early_stopping_rounds = 500, size_t num_first_steps_with_linear_effects_only = 0);
     APLRClassifier(const APLRClassifier &other);
     ~APLRClassifier();
     void fit(const MatrixXd &X, const std::vector<std::string> &y, const VectorXd &sample_weight = VectorXd(0),
@@ -71,13 +72,14 @@ APLRClassifier::APLRClassifier(size_t m, double v, uint_fast32_t random_state, s
                                size_t reserved_terms_times_num_x, size_t bins, size_t verbosity, size_t max_interaction_level, size_t max_interactions,
                                size_t min_observations_in_split, size_t ineligible_boosting_steps_added, size_t max_eligible_terms,
                                size_t boosting_steps_before_interactions_are_allowed, bool monotonic_constraints_ignore_interactions,
-                               size_t early_stopping_rounds)
+                               size_t early_stopping_rounds, size_t num_first_steps_with_linear_effects_only)
     : m{m}, v{v}, random_state{random_state}, n_jobs{n_jobs}, cv_folds{cv_folds},
       reserved_terms_times_num_x{reserved_terms_times_num_x}, bins{bins}, verbosity{verbosity}, max_interaction_level{max_interaction_level},
       max_interactions{max_interactions}, min_observations_in_split{min_observations_in_split},
       ineligible_boosting_steps_added{ineligible_boosting_steps_added}, max_eligible_terms{max_eligible_terms},
       boosting_steps_before_interactions_are_allowed{boosting_steps_before_interactions_are_allowed},
-      monotonic_constraints_ignore_interactions{monotonic_constraints_ignore_interactions}, early_stopping_rounds{early_stopping_rounds}
+      monotonic_constraints_ignore_interactions{monotonic_constraints_ignore_interactions}, early_stopping_rounds{early_stopping_rounds},
+      num_first_steps_with_linear_effects_only{num_first_steps_with_linear_effects_only}
 {
 }

@@ -91,7 +93,8 @@ APLRClassifier::APLRClassifier(const APLRClassifier &other)
       feature_importance{other.feature_importance},
       boosting_steps_before_interactions_are_allowed{other.boosting_steps_before_interactions_are_allowed},
       monotonic_constraints_ignore_interactions{other.monotonic_constraints_ignore_interactions},
-      early_stopping_rounds{other.early_stopping_rounds}
+      early_stopping_rounds{other.early_stopping_rounds},
+      num_first_steps_with_linear_effects_only{other.num_first_steps_with_linear_effects_only}
 {
 }

@@ -117,6 +120,7 @@ void APLRClassifier::fit(const MatrixXd &X, const std::vector<std::string> &y, c
         logit_models[categories[0]].boosting_steps_before_interactions_are_allowed = boosting_steps_before_interactions_are_allowed;
         logit_models[categories[0]].monotonic_constraints_ignore_interactions = monotonic_constraints_ignore_interactions;
         logit_models[categories[0]].early_stopping_rounds = early_stopping_rounds;
+        logit_models[categories[0]].num_first_steps_with_linear_effects_only = num_first_steps_with_linear_effects_only;
         logit_models[categories[0]].fit(X, response_values[categories[0]], sample_weight, X_names, cv_observations, prioritized_predictors_indexes,
                                         monotonic_constraints, VectorXi(0), interaction_constraints);

@@ -133,6 +137,7 @@ void APLRClassifier::fit(const MatrixXd &X, const std::vector<std::string> &y, c
             logit_models[category].boosting_steps_before_interactions_are_allowed = boosting_steps_before_interactions_are_allowed;
             logit_models[category].monotonic_constraints_ignore_interactions = monotonic_constraints_ignore_interactions;
             logit_models[category].early_stopping_rounds = early_stopping_rounds;
+            logit_models[category].num_first_steps_with_linear_effects_only = num_first_steps_with_linear_effects_only;
             logit_models[category].fit(X, response_values[category], sample_weight, X_names, cv_observations, prioritized_predictors_indexes,
                                        monotonic_constraints, VectorXi(0), interaction_constraints);
         }

cpp/APLRRegressor.h

Lines changed: 13 additions & 5 deletions
@@ -72,6 +72,7 @@ class APLRRegressor
     VectorXd intercept_steps;
     double best_validation_error_so_far;
     size_t best_m_so_far;
+    bool linear_effects_only_in_this_boosting_step;

     void validate_input_to_fit(const MatrixXd &X, const VectorXd &y, const VectorXd &sample_weight, const std::vector<std::string> &X_names,
                                const MatrixXi &cv_observations, const std::vector<size_t> &prioritized_predictors_indexes,
@@ -210,6 +211,7 @@ class APLRRegressor
     VectorXi term_main_predictor_indexes;
     VectorXi term_interaction_levels;
     size_t early_stopping_rounds;
+    size_t num_first_steps_with_linear_effects_only;

     APLRRegressor(size_t m = 3000, double v = 0.1, uint_fast32_t random_state = std::numeric_limits<uint_fast32_t>::lowest(), std::string loss_function = "mse",
                   std::string link_function = "identity", size_t n_jobs = 0, size_t cv_folds = 5,
@@ -222,7 +224,8 @@ class APLRRegressor
                   const std::function<VectorXd(VectorXd)> &calculate_custom_transform_linear_predictor_to_predictions_function = {},
                   const std::function<VectorXd(VectorXd)> &calculate_custom_differentiate_predictions_wrt_linear_predictor_function = {},
                   size_t boosting_steps_before_interactions_are_allowed = 0, bool monotonic_constraints_ignore_interactions = false,
-                  size_t group_mse_by_prediction_bins = 10, size_t group_mse_cycle_min_obs_in_bin = 30, size_t early_stopping_rounds = 500);
+                  size_t group_mse_by_prediction_bins = 10, size_t group_mse_cycle_min_obs_in_bin = 30, size_t early_stopping_rounds = 500,
+                  size_t num_first_steps_with_linear_effects_only = 0);
     APLRRegressor(const APLRRegressor &other);
     ~APLRRegressor();
     void fit(const MatrixXd &X, const VectorXd &y, const VectorXd &sample_weight = VectorXd(0), const std::vector<std::string> &X_names = {},
@@ -262,7 +265,8 @@ APLRRegressor::APLRRegressor(size_t m, double v, uint_fast32_t random_state, std
                              const std::function<VectorXd(VectorXd)> &calculate_custom_transform_linear_predictor_to_predictions_function,
                              const std::function<VectorXd(VectorXd)> &calculate_custom_differentiate_predictions_wrt_linear_predictor_function,
                              size_t boosting_steps_before_interactions_are_allowed, bool monotonic_constraints_ignore_interactions,
-                             size_t group_mse_by_prediction_bins, size_t group_mse_cycle_min_obs_in_bin, size_t early_stopping_rounds)
+                             size_t group_mse_by_prediction_bins, size_t group_mse_cycle_min_obs_in_bin, size_t early_stopping_rounds,
+                             size_t num_first_steps_with_linear_effects_only)
     : reserved_terms_times_num_x{reserved_terms_times_num_x}, intercept{NAN_DOUBLE}, m{m}, v{v},
       loss_function{loss_function}, link_function{link_function}, cv_folds{cv_folds}, n_jobs{n_jobs}, random_state{random_state},
       bins{bins}, verbosity{verbosity}, max_interaction_level{max_interaction_level},
@@ -276,7 +280,8 @@ APLRRegressor::APLRRegressor(size_t m, double v, uint_fast32_t random_state, std
       calculate_custom_differentiate_predictions_wrt_linear_predictor_function{calculate_custom_differentiate_predictions_wrt_linear_predictor_function},
       boosting_steps_before_interactions_are_allowed{boosting_steps_before_interactions_are_allowed},
       monotonic_constraints_ignore_interactions{monotonic_constraints_ignore_interactions}, group_mse_by_prediction_bins{group_mse_by_prediction_bins},
-      group_mse_cycle_min_obs_in_bin{group_mse_cycle_min_obs_in_bin}, cv_error{NAN_DOUBLE}, early_stopping_rounds{early_stopping_rounds}
+      group_mse_cycle_min_obs_in_bin{group_mse_cycle_min_obs_in_bin}, cv_error{NAN_DOUBLE}, early_stopping_rounds{early_stopping_rounds},
+      num_first_steps_with_linear_effects_only{num_first_steps_with_linear_effects_only}
 {
 }

@@ -301,7 +306,8 @@ APLRRegressor::APLRRegressor(const APLRRegressor &other)
       monotonic_constraints_ignore_interactions{other.monotonic_constraints_ignore_interactions}, group_mse_by_prediction_bins{other.group_mse_by_prediction_bins},
       group_mse_cycle_min_obs_in_bin{other.group_mse_cycle_min_obs_in_bin}, cv_error{other.cv_error},
       term_main_predictor_indexes{other.term_main_predictor_indexes}, term_interaction_levels{other.term_interaction_levels},
-      early_stopping_rounds{other.early_stopping_rounds}
+      early_stopping_rounds{other.early_stopping_rounds},
+      num_first_steps_with_linear_effects_only{other.num_first_steps_with_linear_effects_only}
 {
 }

@@ -1026,6 +1032,7 @@ void APLRRegressor::execute_boosting_steps(Eigen::Index fold_index)
     abort_boosting = false;
     for (size_t boosting_step = 0; boosting_step < m; ++boosting_step)
     {
+        linear_effects_only_in_this_boosting_step = num_first_steps_with_linear_effects_only > boosting_step;
         execute_boosting_step(boosting_step, fold_index);
         if (abort_boosting)
             break;
@@ -1137,7 +1144,8 @@ void APLRRegressor::estimate_split_point_for_each_term(std::vector<Term> &terms,
 #pragma omp parallel for schedule(guided) if (multithreading)
     for (size_t i = 0; i < terms_indexes.size(); ++i)
     {
-        terms[terms_indexes[i]].estimate_split_point(X_train, neg_gradient_current, sample_weight_train, bins, v, min_observations_in_split);
+        terms[terms_indexes[i]].estimate_split_point(X_train, neg_gradient_current, sample_weight_train, bins, v, min_observations_in_split,
+                                                     linear_effects_only_in_this_boosting_step);
     }
 }
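
The per-step flag set in `execute_boosting_steps` can be illustrated with a small Python sketch. It mirrors only the comparison `num_first_steps_with_linear_effects_only > boosting_step` from the diff; the function name `linear_only_flags` is hypothetical, and the sketch ignores early stopping, which may end the real loop before step `m`.

```python
def linear_only_flags(m, num_first_steps_with_linear_effects_only):
    """Return, for each boosting step, whether only linear effects are allowed."""
    return [
        num_first_steps_with_linear_effects_only > boosting_step
        for boosting_step in range(m)
    ]


# With m = 5 and two reserved steps, only steps 0 and 1 are linear-only.
flags = linear_only_flags(m=5, num_first_steps_with_linear_effects_only=2)
print(flags)  # [True, True, False, False, False]
```

Setting the parameter to at least `m` makes every flag true, which corresponds to the linear-only model described in the API reference, while the default of 0 makes every flag false.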
