Commit 8c75abf

10.12.1
1 parent d3fe6c4 commit 8c75abf

File tree

4 files changed: +19 -15 lines changed

API_REFERENCE_FOR_REGRESSION.md

Lines changed: 1 addition & 1 deletion

@@ -50,7 +50,7 @@ Limits 1) the number of terms already in the model that can be considered as int
 Specifies the variance power when ***loss_function*** is "tweedie". Specifies a dispersion parameter when ***loss_function*** is "negative_binomial", "cauchy" or "weibull".

 #### validation_tuning_metric (default = "default")
-Specifies which metric to use for validating the model and tuning ***m***. The model will try to minimize the validation metric. Available options are "default" (using the same methodology as when calculating the training error), "mse", "mae", "negative_gini" (normalized), "group_mse", "group_mse_by_prediction", "neg_top_quantile_mean_response", "bottom_quantile_mean_response" and "custom_function". The default is often a choice that fits well with respect to the ***loss_function*** chosen. However, if you want to use ***loss_function*** or ***dispersion_parameter*** as tuning parameters then the default is not suitable. "group_mse" requires that the "group" argument in the ***fit*** method is provided. "group_mse_by_prediction" groups predictions by up to ***group_mse_by_prediction_bins*** groups and calculates groupwise mse. "neg_top_quantile_mean_response" calculates the negative of the sample weighted mean response for observations with predictions in the top quantile (as specified by the ***quantile*** parameter). For example, if ***quantile*** is 0.95, this metric will be the negative of the sample weighted mean response for the 5% of observations with the highest predictions. "bottom_quantile_mean_response" calculates the sample weighted mean response for observations with predictions in the bottom quantile (as specified by the ***quantile*** parameter). For example, if ***quantile*** is 0.05, this metric will be the sample weighted mean response for the 5% of observations with the lowest predictions. For "custom_function" see ***calculate_custom_validation_error_function*** below.
+Specifies which metric to use for validating the model and tuning ***m***. The model will try to minimize the validation metric. Available options are "default" (using the same methodology as when calculating the training error), "mse", "mae", "negative_gini" (normalized), "group_mse", "group_mse_by_prediction", "neg_top_quantile_mean_response", "bottom_quantile_mean_response" and "custom_function". The default is often a choice that fits well with respect to the ***loss_function*** chosen. However, if you want to use ***loss_function*** or ***dispersion_parameter*** as tuning parameters then the default is not suitable. "group_mse" requires that the "group" argument in the ***fit*** method is provided. "group_mse_by_prediction" groups predictions by up to ***group_mse_by_prediction_bins*** groups and calculates groupwise mse. "neg_top_quantile_mean_response" calculates the negative of the sample weighted mean response for observations with predictions in the top quantile (as specified by the ***quantile*** parameter). For example, if ***quantile*** is 0.95, this metric will be the negative of the sample weighted mean response for the 5% of observations with the highest predictions. "bottom_quantile_mean_response" calculates the sample weighted mean response for observations with predictions in the bottom quantile (as specified by the ***quantile*** parameter). For example, if ***quantile*** is 0.05, this metric will be the sample weighted mean response for the 5% of observations with the lowest predictions. For "custom_function" see ***calculate_custom_validation_error_function*** below. Please note that for non-default values a significantly higher ***early_stopping_rounds*** than the default of 200 might be needed.

 #### quantile (default = 0.5)
 Specifies the quantile to use when ***loss_function*** is "quantile" or when ***validation_tuning_metric*** is "neg_top_quantile_mean_response" or "bottom_quantile_mean_response".
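
Below is a minimal usage sketch illustrating the documented parameters (it is not part of this commit). It assumes the Python package installs as `aplr` and exposes `APLRRegressor` with constructor arguments named as in the API reference above (`validation_tuning_metric`, `quantile`, `early_stopping_rounds`); the synthetic data and the chosen values are purely illustrative.

```python
import numpy as np
from aplr import APLRRegressor  # assumed Python entry point for the aplr package

# Illustrative synthetic data with a strictly positive response, suitable for a log link.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.exp(X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=1000))

# With a non-default validation_tuning_metric, the updated documentation notes that a
# significantly higher early_stopping_rounds than the default of 200 might be needed.
model = APLRRegressor(
    loss_function="mse",
    link_function="log",
    validation_tuning_metric="neg_top_quantile_mean_response",
    quantile=0.95,                # metric is based on the top 5% of predictions
    early_stopping_rounds=1000,   # raised from the default of 200
)
model.fit(X, y)
predictions = model.predict(X)
```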

cpp/APLRRegressor.h

Lines changed: 17 additions & 13 deletions

@@ -945,7 +945,6 @@ void APLRRegressor::scale_response_if_using_log_link_function()
        {
            scaling_factor_for_log_link_function = 1 / inverse_scaling_factor;
            y_train *= scaling_factor_for_log_link_function;
-           y_validation *= scaling_factor_for_log_link_function;
        }
        else
            scaling_factor_for_log_link_function = 1.0;
@@ -1773,13 +1772,19 @@ void APLRRegressor::calculate_and_validate_validation_error(size_t boosting_step

double APLRRegressor::calculate_validation_error(const VectorXd &predictions)
{
+   VectorXd predictions_used{predictions};
+   if (link_function == "log")
+   {
+       predictions_used /= scaling_factor_for_log_link_function;
+   }
+
    if (validation_tuning_metric == "default")
    {
        if (loss_function == "custom_function")
        {
            try
            {
-               return calculate_custom_loss_function(y_validation, predictions, sample_weight_validation, group_validation, other_data_validation);
+               return calculate_custom_loss_function(y_validation, predictions_used, sample_weight_validation, group_validation, other_data_validation);
            }
            catch (const std::exception &e)
            {
@@ -1789,33 +1794,33 @@ double APLRRegressor::calculate_validation_error(const VectorXd &predictions)
        }
        else if (loss_function == "group_mse_cycle")
        {
-           return calculate_group_mse_by_prediction_validation_error(predictions);
+           return calculate_group_mse_by_prediction_validation_error(predictions_used);
        }
        else
-           return calculate_mean_error(calculate_errors(y_validation, predictions, sample_weight_validation, loss_function, dispersion_parameter, group_validation, unique_groups_validation, quantile), sample_weight_validation);
+           return calculate_mean_error(calculate_errors(y_validation, predictions_used, sample_weight_validation, loss_function, dispersion_parameter, group_validation, unique_groups_validation, quantile), sample_weight_validation);
    }
    else if (validation_tuning_metric == "mse")
-       return calculate_mean_error(calculate_errors(y_validation, predictions, sample_weight_validation, MSE_LOSS_FUNCTION), sample_weight_validation);
+       return calculate_mean_error(calculate_errors(y_validation, predictions_used, sample_weight_validation, MSE_LOSS_FUNCTION), sample_weight_validation);
    else if (validation_tuning_metric == "mae")
-       return calculate_mean_error(calculate_errors(y_validation, predictions, sample_weight_validation, "mae"), sample_weight_validation);
+       return calculate_mean_error(calculate_errors(y_validation, predictions_used, sample_weight_validation, "mae"), sample_weight_validation);
    else if (validation_tuning_metric == "negative_gini")
-       return -calculate_gini(y_validation, predictions, sample_weight_validation) / calculate_gini(y_validation, y_validation, sample_weight_validation);
+       return -calculate_gini(y_validation, predictions_used, sample_weight_validation) / calculate_gini(y_validation, y_validation, sample_weight_validation);
    else if (validation_tuning_metric == "group_mse")
    {
        bool group_is_not_provided{group_validation.rows() == 0};
        if (group_is_not_provided)
            throw std::runtime_error("When validation_tuning_metric is group_mse then the group argument in fit() must be provided.");
-       return calculate_mean_error(calculate_errors(y_validation, predictions, sample_weight_validation, "group_mse", dispersion_parameter, group_validation, unique_groups_validation, quantile), sample_weight_validation);
+       return calculate_mean_error(calculate_errors(y_validation, predictions_used, sample_weight_validation, "group_mse", dispersion_parameter, group_validation, unique_groups_validation, quantile), sample_weight_validation);
    }
    else if (validation_tuning_metric == "group_mse_by_prediction")
    {
-       return calculate_group_mse_by_prediction_validation_error(predictions);
+       return calculate_group_mse_by_prediction_validation_error(predictions_used);
    }
    else if (validation_tuning_metric == "custom_function")
    {
        try
        {
-           return calculate_custom_validation_error_function(y_validation, predictions, sample_weight_validation, group_validation, other_data_validation);
+           return calculate_custom_validation_error_function(y_validation, predictions_used, sample_weight_validation, group_validation, other_data_validation);
        }
        catch (const std::exception &e)
        {
@@ -1825,7 +1830,7 @@ double APLRRegressor::calculate_validation_error(const VectorXd &predictions)
    }
    else if (validation_tuning_metric == "neg_top_quantile_mean_response")
    {
-       double mean_response{calculate_quantile_mean_response(predictions, true)};
+       double mean_response{calculate_quantile_mean_response(predictions_used, true)};
        if (std::isinf(mean_response))
        {
            return mean_response;
@@ -1834,7 +1839,7 @@ double APLRRegressor::calculate_validation_error(const VectorXd &predictions)
    }
    else if (validation_tuning_metric == "bottom_quantile_mean_response")
    {
-       return calculate_quantile_mean_response(predictions, false);
+       return calculate_quantile_mean_response(predictions_used, false);
    }
    else
        throw std::runtime_error(validation_tuning_metric + " is an invalid validation_tuning_metric.");
@@ -2025,7 +2030,6 @@ void APLRRegressor::revert_scaling_if_using_log_link_function()
    if (link_function == "log")
    {
        y_train /= scaling_factor_for_log_link_function;
-       y_validation /= scaling_factor_for_log_link_function;
        intercept += std::log(1 / scaling_factor_for_log_link_function);
        for (Eigen::Index i = 0; i < intercept_steps.size(); ++i)
        {
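
The diff above changes where log-link scaling is handled during validation: previously the validation response was multiplied by the scaling factor, whereas now the predictions passed to the validation metrics are divided by it, so the metrics are computed on the original response scale. The following standalone Python sketch (not library code, and not part of this commit) mirrors that pattern; the weighted-mse computation stands in for the full dispatch on `validation_tuning_metric`.

```python
import numpy as np

def validation_error_sketch(y_validation, predictions, sample_weight,
                            link_function, scaling_factor_for_log_link_function):
    # Work on a copy of the predictions and, under a log link, bring them back
    # to the original response scale before evaluating the metric.
    predictions_used = predictions.copy()
    if link_function == "log":
        predictions_used = predictions_used / scaling_factor_for_log_link_function
    # Representative weighted-mse branch; the C++ method dispatches on
    # validation_tuning_metric and supports several other metrics.
    errors = (y_validation - predictions_used) ** 2
    return np.average(errors, weights=sample_weight)
```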
Binary file not shown.

setup.py

Lines changed: 1 addition & 1 deletion

@@ -28,7 +28,7 @@

 setuptools.setup(
     name="aplr",
-    version="10.12.0",
+    version="10.12.1",
     description="Automatic Piecewise Linear Regression",
     ext_modules=[sfc_module],
     author="Mathias von Ottenbreit",
