
Commit cc21bd2

Changed default parameters based on empirical results. Added APLRTuner object.
1 parent 4e9bf81

16 files changed: +967 -41 lines

.gitignore

Lines changed: 5 additions & 1 deletion
@@ -6,4 +6,8 @@ aplr/data
 build/
 __pycache__/
 dist/
-aplr.egg-info/
+aplr.egg-info/
+*.db
+python/*.xlsx
+python/*.zip
+python/results_min_obs_in_split/

API_REFERENCE_FOR_APLR_TUNER.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+# APLRTuner
+
+## class aplr.APLRTuner(parameters: Union[Dict[str, List[float]], List[Dict[str, List[float]]]] = {"max_interaction_level": [0, 1], "min_observations_in_split": [4, 10, 20, 100, 500, 1000]}, is_regressor: bool = True)
+
+### Constructor parameters
+
+#### parameters (default = {"max_interaction_level": [0, 1], "min_observations_in_split": [4, 10, 20, 100, 500, 1000]})
+The parameters that you wish to tune.
+
+#### is_regressor (default = True)
+Whether to use APLRRegressor (True) or APLRClassifier (False).
+
+
+## Method: fit(X: FloatMatrix, y: FloatVector, **kwargs)
+
+***This method tunes the model to data.***
+
+### Parameters
+
+#### X
+A numpy matrix with predictor values.
+
+#### y
+A numpy vector with response values.
+
+#### kwargs
+Optional parameters sent to the fit methods in the underlying APLRRegressor or APLRClassifier models.
+
+
+## Method: predict(X: FloatMatrix, **kwargs)
+
+***Returns the predictions of the best tuned model, as a numpy array for regression or a list of strings for classification.***
+
+### Parameters
+
+#### X
+A numpy matrix with predictor values.
+
+#### kwargs
+Optional parameters sent to the predict method in the best tuned model.
+
+
+## Method: predict_class_probabilities(X: FloatMatrix, **kwargs)
+
+***This method returns predicted class probabilities of the best tuned model as a numpy matrix.***
+
+### Parameters
+
+#### X
+A numpy matrix with predictor values.
+
+#### kwargs
+Optional parameters sent to the predict_class_probabilities method in the best tuned model.
+
+
+## Method: predict_proba(X: FloatMatrix, **kwargs)
+
+***This method returns predicted class probabilities of the best tuned model as a numpy matrix. It is identical to predict_class_probabilities, but the name predict_proba is compatible with scikit-learn.***
+
+### Parameters
+
+#### X
+A numpy matrix with predictor values.
+
+#### kwargs
+Optional parameters sent to the predict_class_probabilities method in the best tuned model.
+
+
+## Method: get_best_estimator()
+
+***Returns the best tuned model. This is an APLRRegressor or APLRClassifier object.***
+
+
+## Method: get_cv_results()
+
+***Returns the cv results from the tuning as a list of dictionaries, List[Dict[str, float]].***
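
A minimal usage sketch of the tuner documented above, on synthetic data. The package-level import and the custom parameter grid are assumptions for illustration:

```python
import numpy as np
from aplr import APLRTuner  # assumed package-level import

# Synthetic regression data for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

# Tune over a small custom grid instead of the default one.
tuner = APLRTuner(
    parameters={
        "max_interaction_level": [0, 1],
        "min_observations_in_split": [4, 20, 100],
    },
    is_regressor=True,
)
tuner.fit(X, y)

predictions = tuner.predict(X)           # numpy array for regression
best_model = tuner.get_best_estimator()  # an APLRRegressor
cv_results = tuner.get_cv_results()      # List[Dict[str, float]]
```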

API_REFERENCE_FOR_CLASSIFICATION.md

Lines changed: 7 additions & 7 deletions
@@ -1,14 +1,14 @@
 # APLRClassifier
 
-## class aplr.APLRClassifier(m:int = 20000, v:float = 0.1, random_state:int = 0, n_jobs:int = 0, cv_folds:int = 5, bins:int = 300, verbosity:int = 0, max_interaction_level:int = 1, max_interactions:int = 100000, min_observations_in_split:int = 20, ineligible_boosting_steps_added:int = 10, max_eligible_terms:int = 5, boosting_steps_before_interactions_are_allowed: int = 0, monotonic_constraints_ignore_interactions: bool = False, early_stopping_rounds: int = 500, num_first_steps_with_linear_effects_only: int = 0, penalty_for_non_linearity: float = 0.0, penalty_for_interactions: float = 0.0, max_terms: int = 0)
+## class aplr.APLRClassifier(m:int = 20000, v:float = 0.5, random_state:int = 0, n_jobs:int = 0, cv_folds:int = 5, bins:int = 300, verbosity:int = 0, max_interaction_level:int = 1, max_interactions:int = 100000, min_observations_in_split:int = 4, ineligible_boosting_steps_added:int = 15, max_eligible_terms:int = 7, boosting_steps_before_interactions_are_allowed: int = 0, monotonic_constraints_ignore_interactions: bool = False, early_stopping_rounds: int = 500, num_first_steps_with_linear_effects_only: int = 0, penalty_for_non_linearity: float = 0.0, penalty_for_interactions: float = 0.0, max_terms: int = 0)
 
 ### Constructor parameters
 
 #### m (default = 20000)
 The maximum number of boosting steps. If validation error does not flatten out at the end of the ***m***th boosting step, then try increasing it (or alternatively increase the learning rate).
 
-#### v (default = 0.1)
-The learning rate. Must be greater than zero and not more than one. The higher the faster the algorithm learns and the lower ***m*** is required. However, empirical evidence suggests that ***v <= 0.1*** gives better results. If the algorithm learns too fast (requires few boosting steps to converge) then try lowering the learning rate. Computational costs can be reduced by increasing the learning rate while simultaneously decreasing ***m***, potentially at the expense of predictiveness.
+#### v (default = 0.5)
+The learning rate. Must be greater than zero and not more than one. The higher it is, the faster the algorithm learns and the lower the required ***m***, reducing computational costs, potentially at the expense of predictiveness. Empirical evidence suggests that ***v <= 0.5*** gives good results for APLR.
 
 #### random_state (default = 0)
 Used to randomly split training observations into cv_folds if ***cv_observations*** is not specified when fitting.
@@ -31,13 +31,13 @@ Specifies the maximum allowed depth of interaction terms. ***0*** means that int
 #### max_interactions (default = 100000)
 The maximum number of interactions allowed in each underlying model. A lower value may be used to reduce computational time or to increase interpretability.
 
-#### min_observations_in_split (default = 20)
+#### min_observations_in_split (default = 4)
 The minimum effective number of observations that a term in the model must rely on. This hyperparameter should be tuned. Larger values are more appropriate for larger datasets. Larger values result in more robust models (lower variance), potentially at the expense of increased bias.
 
-#### ineligible_boosting_steps_added (default = 10)
+#### ineligible_boosting_steps_added (default = 15)
 Controls how many boosting steps a term that becomes ineligible has to remain ineligible. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
 
-#### max_eligible_terms (default = 5)
+#### max_eligible_terms (default = 7)
 Limits 1) the number of terms already in the model that can be considered as interaction partners in a boosting step and 2) how many terms remain eligible in the next boosting step. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
 
 #### boosting_steps_before_interactions_are_allowed (default = 0)
@@ -93,7 +93,7 @@ An optional list of integers specifying monotonic constraints on model terms. Fo
 An optional list containing lists of integers. Specifies interaction constraints on model terms. For example, interaction_constraints = [[0,1], [1,2,3]] means that 1) the first and second predictors may interact with each other, and that 2) the second, third and fourth predictors may interact with each other. There are no interaction constraints on predictors not mentioned in interaction_constraints.
 
 #### predictor_learning_rates
-An optional list of floats specifying learning rates for each predictor. If provided then this supercedes ***v***. For example, if there are two predictors in ***X***, then predictor_learning_rates = [0.1,0.2] means that all terms using the first predictor in ***X*** as a main effect will have a learning rate of 0.1 and that all terms using the second predictor in ***X*** as a main effect will have a learning rate of 0.2.
+An optional list of floats specifying learning rates for each predictor. If provided, this supersedes ***v***. For example, if there are two predictors in ***X***, then predictor_learning_rates = [0.1, 0.2] means that all terms using the first predictor in ***X*** as a main effect will have a learning rate of 0.1 and that all terms using the second predictor in ***X*** as a main effect will have a learning rate of 0.2.
 
 #### predictor_penalties_for_non_linearity
 An optional list of floats specifying penalties for non-linearity for each predictor. If provided then this supersedes ***penalty_for_non_linearity***. For example, if there are two predictors in ***X***, then predictor_penalties_for_non_linearity = [0.1,0.2] means that all terms using the first predictor in ***X*** as a main effect will have a penalty for non-linearity of 0.1 and that all terms using the second predictor in ***X*** as a main effect will have a penalty for non-linearity of 0.2.
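
A short sketch of the per-predictor learning rates described under predictor_learning_rates above, on synthetic data. The import path, and passing predictor_learning_rates to fit (which the parameter listing above suggests), are assumptions:

```python
import numpy as np
from aplr import APLRClassifier  # assumed package-level import

# Synthetic binary classification data; labels are strings.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = ["yes" if value > 0 else "no" for value in X[:, 0] + X[:, 1]]

clf = APLRClassifier()  # new defaults: v=0.5, min_observations_in_split=4
# Per-predictor learning rates supersede v: 0.1 for the first predictor,
# 0.2 for the second.
clf.fit(X, y, predictor_learning_rates=[0.1, 0.2])
probabilities = clf.predict_class_probabilities(X)  # numpy matrix
```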

API_REFERENCE_FOR_REGRESSION.md

Lines changed: 7 additions & 7 deletions
@@ -1,14 +1,14 @@
 # APLRRegressor
 
-## class aplr.APLRRegressor(m:int = 20000, v:float = 0.1, random_state:int = 0, loss_function:str = "mse", link_function:str = "identity", n_jobs:int = 0, cv_folds:int = 5, bins:int = 300, max_interaction_level:int = 1, max_interactions:int = 100000, min_observations_in_split:int = 20, ineligible_boosting_steps_added:int = 10, max_eligible_terms:int = 5, verbosity:int = 0, dispersion_parameter:float = 1.5, validation_tuning_metric:str = "default", quantile:float = 0.5, calculate_custom_validation_error_function:Optional[Callable[[FloatVector, FloatVector, FloatVector, FloatVector, FloatMatrix], float]] = None, calculate_custom_loss_function:Optional[Callable[[FloatVector, FloatVector, FloatVector, FloatVector, FloatMatrix], float]] = None, calculate_custom_negative_gradient_function:Optional[Callable[[FloatVector, FloatVector, FloatVector, FloatMatrix],FloatVector]] = None, calculate_custom_transform_linear_predictor_to_predictions_function:Optional[Callable[[FloatVector], FloatVector]] = None, calculate_custom_differentiate_predictions_wrt_linear_predictor_function:Optional[Callable[[FloatVector], FloatVector]] = None, boosting_steps_before_interactions_are_allowed:int = 0, monotonic_constraints_ignore_interactions:bool = False, group_mse_by_prediction_bins:int = 10, group_mse_cycle_min_obs_in_bin:int = 30, early_stopping_rounds:int = 500, num_first_steps_with_linear_effects_only:int = 0, penalty_for_non_linearity:float = 0.0, penalty_for_interactions:float = 0.0, max_terms:int = 0)
+## class aplr.APLRRegressor(m:int = 20000, v:float = 0.5, random_state:int = 0, loss_function:str = "mse", link_function:str = "identity", n_jobs:int = 0, cv_folds:int = 5, bins:int = 300, max_interaction_level:int = 1, max_interactions:int = 100000, min_observations_in_split:int = 4, ineligible_boosting_steps_added:int = 15, max_eligible_terms:int = 7, verbosity:int = 0, dispersion_parameter:float = 1.5, validation_tuning_metric:str = "default", quantile:float = 0.5, calculate_custom_validation_error_function:Optional[Callable[[FloatVector, FloatVector, FloatVector, FloatVector, FloatMatrix], float]] = None, calculate_custom_loss_function:Optional[Callable[[FloatVector, FloatVector, FloatVector, FloatVector, FloatMatrix], float]] = None, calculate_custom_negative_gradient_function:Optional[Callable[[FloatVector, FloatVector, FloatVector, FloatMatrix],FloatVector]] = None, calculate_custom_transform_linear_predictor_to_predictions_function:Optional[Callable[[FloatVector], FloatVector]] = None, calculate_custom_differentiate_predictions_wrt_linear_predictor_function:Optional[Callable[[FloatVector], FloatVector]] = None, boosting_steps_before_interactions_are_allowed:int = 0, monotonic_constraints_ignore_interactions:bool = False, group_mse_by_prediction_bins:int = 10, group_mse_cycle_min_obs_in_bin:int = 30, early_stopping_rounds:int = 500, num_first_steps_with_linear_effects_only:int = 0, penalty_for_non_linearity:float = 0.0, penalty_for_interactions:float = 0.0, max_terms:int = 0)
 
 ### Constructor parameters
 
 #### m (default = 20000)
 The maximum number of boosting steps. If validation error does not flatten out at the end of the ***m***th boosting step, then try increasing it (or alternatively increase the learning rate).
 
-#### v (default = 0.1)
-The learning rate. Must be greater than zero and not more than one. The higher the faster the algorithm learns and the lower ***m*** is required. However, empirical evidence suggests that ***v <= 0.1*** gives better results. If the algorithm learns too fast (requires few boosting steps to converge) then try lowering the learning rate. Computational costs can be reduced by increasing the learning rate while simultaneously decreasing ***m***, potentially at the expense of predictiveness.
+#### v (default = 0.5)
+The learning rate. Must be greater than zero and not more than one. The higher it is, the faster the algorithm learns and the lower the required ***m***, reducing computational costs, potentially at the expense of predictiveness. Empirical evidence suggests that ***v <= 0.5*** gives good results for APLR.
 
 #### random_state (default = 0)
 Used to randomly split training observations into cv_folds if ***cv_observations*** is not specified when fitting.
@@ -34,13 +34,13 @@ Specifies the maximum allowed depth of interaction terms. ***0*** means that int
 #### max_interactions (default = 100000)
 The maximum number of interactions allowed in each underlying model. A lower value may be used to reduce computational time or to increase interpretability.
 
-#### min_observations_in_split (default = 20)
+#### min_observations_in_split (default = 4)
 The minimum effective number of observations that a term in the model must rely on. This hyperparameter should be tuned. Larger values are more appropriate for larger datasets. Larger values result in more robust models (lower variance), potentially at the expense of increased bias.
 
-#### ineligible_boosting_steps_added (default = 10)
+#### ineligible_boosting_steps_added (default = 15)
 Controls how many boosting steps a term that becomes ineligible has to remain ineligible. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
 
-#### max_eligible_terms (default = 5)
+#### max_eligible_terms (default = 7)
 Limits 1) the number of terms already in the model that can be considered as interaction partners in a boosting step and 2) how many terms remain eligible in the next boosting step. The default value works well according to empirical results. This hyperparameter is intended for reducing computational costs.
 
 #### verbosity (default = 0)
@@ -167,7 +167,7 @@ An optional list containing lists of integers. Specifies interaction constraints
 An optional numpy matrix with other data. This is used in custom loss, negative gradient and validation error functions.
 
 #### predictor_learning_rates
-An optional list of floats specifying learning rates for each predictor. If provided then this supercedes ***v***. For example, if there are two predictors in ***X***, then predictor_learning_rates = [0.1,0.2] means that all terms using the first predictor in ***X*** as a main effect will have a learning rate of 0.1 and that all terms using the second predictor in ***X*** as a main effect will have a learning rate of 0.2.
+An optional list of floats specifying learning rates for each predictor. If provided, this supersedes ***v***. For example, if there are two predictors in ***X***, then predictor_learning_rates = [0.1, 0.2] means that all terms using the first predictor in ***X*** as a main effect will have a learning rate of 0.1 and that all terms using the second predictor in ***X*** as a main effect will have a learning rate of 0.2.
 
 #### predictor_penalties_for_non_linearity
 An optional list of floats specifying penalties for non-linearity for each predictor. If provided then this supersedes ***penalty_for_non_linearity***. For example, if there are two predictors in ***X***, then predictor_penalties_for_non_linearity = [0.1,0.2] means that all terms using the first predictor in ***X*** as a main effect will have a penalty for non-linearity of 0.1 and that all terms using the second predictor in ***X*** as a main effect will have a penalty for non-linearity of 0.2.
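
A sketch of the v/m trade-off described under ***v*** above, on synthetic data. The import path and both configurations are illustrative assumptions, not recommendations:

```python
import numpy as np
from aplr import APLRRegressor  # assumed package-level import

# Synthetic regression data for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

# Higher learning rate: learns faster, so a lower m suffices.
fast = APLRRegressor(v=0.5, m=3000)
fast.fit(X, y)

# Lower learning rate: slower, but potentially more predictive.
careful = APLRRegressor(v=0.05, m=20000)
careful.fit(X, y)
```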

README.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 Automatic Piecewise Linear Regression.
 
 # About
-Build predictive and interpretable parametric regression or classification machine learning models in Python based on the Automatic Piecewise Linear Regression (APLR) methodology developed by Mathias von Ottenbreit. APLR is often able to compete with tree-based methods on predictiveness, but unlike tree-based methods APLR is interpretable. Please see the [documentation](https://github.com/ottenbreit-data-science/aplr/tree/main/documentation) for more information. Links to published article: [https://link.springer.com/article/10.1007/s00180-024-01475-4](https://link.springer.com/article/10.1007/s00180-024-01475-4) and [https://rdcu.be/dz7bF](https://rdcu.be/dz7bF). More functionality has been added to APLR since the article was published.
+Build predictive and interpretable parametric regression or classification machine learning models in Python based on the Automatic Piecewise Linear Regression (APLR) methodology developed by Mathias von Ottenbreit. APLR is often able to compete with tree-based methods on predictiveness, but unlike tree-based methods APLR is interpretable. Furthermore, APLR produces smoother predictions than tree-based methods. Please see the [documentation](https://github.com/ottenbreit-data-science/aplr/tree/main/documentation) for more information. Links to the published article: [https://link.springer.com/article/10.1007/s00180-024-01475-4](https://link.springer.com/article/10.1007/s00180-024-01475-4) and [https://rdcu.be/dz7bF](https://rdcu.be/dz7bF). More functionality has been added to APLR since the article was published.
 
 # How to install
 ***pip install aplr***
