Releases: ottenbreit-data-science/aplr
Bugfix
Fixed a minor issue in the `get_unique_term_affiliation_shape` method that could cause crashes in rare cases.
Improved visualization of interactions
Changes
- The `get_unique_term_affiliation_shape` method now includes an optional parameter, `additional_points` (default: 250). This parameter adds evenly spaced points for two-way or higher-order interactions, ensuring smoother visualization and reducing artifacts from sparse data. For the same reason, the default value for `max_rows_before_sampling` has been increased from 100,000 to 500,000.
- The example scripts now use heatmaps instead of 3D charts to plot two-way interactions, improving clarity and interpretability.
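The shape returned for a two-way interaction can be plotted as a heatmap once it is pivoted into a grid. The sketch below shows one way to do that pivot with NumPy; the three-column layout (value of predictor 1, value of predictor 2, contribution) and the `shape_to_grid` helper are assumptions for illustration, not part of the APLR API.

```python
import numpy as np

def shape_to_grid(shape_rows):
    """Pivot (x1, x2, contribution) rows into a 2-D grid for a heatmap.

    `shape_rows` stands in for the output of get_unique_term_affiliation_shape
    for a two-way interaction; the three-column layout is an assumption here.
    """
    x1 = np.unique(shape_rows[:, 0])
    x2 = np.unique(shape_rows[:, 1])
    grid = np.full((x2.size, x1.size), np.nan)  # NaN marks missing lattice points
    i1 = np.searchsorted(x1, shape_rows[:, 0])
    i2 = np.searchsorted(x2, shape_rows[:, 1])
    grid[i2, i1] = shape_rows[:, 2]
    return x1, x2, grid

# Synthetic stand-in for an interaction shape on a 3x2 lattice
rows = np.array([
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 3.0],
    [1.0, 1.0, 4.0],
    [2.0, 0.0, 5.0],
    [2.0, 1.0, 6.0],
])
x1, x2, grid = shape_to_grid(rows)
# The grid can then be passed to, e.g., matplotlib's plt.pcolormesh(x1, x2, grid)
```

With `additional_points` filling in evenly spaced values, the resulting grid is denser and the heatmap correspondingly smoother.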
Added a convenience method to APLRRegressor
`remove_provided_custom_functions` method added to `APLRRegressor`:
- This method removes any custom functions provided for calculating the loss, negative gradient, or validation error.
- Useful after model training with custom functions, ensuring that the `APLRRegressor` object no longer depends on these functions, so they do not need to be present in the Python environment when loading a saved model.
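A minimal pure-Python sketch of why this matters: an object holding a locally defined callable (such as a lambda loss function) cannot be pickled, while dropping the callable makes it picklable again. The `model_state` dict is an illustrative stand-in, not an APLR internal.

```python
import pickle

# Stand-in for a fitted model that stored a user-supplied callable.
model_state = {
    "coefficients": [0.5, -1.2],
    "loss_function": lambda y, p: (y - p) ** 2,  # custom loss callable
}

try:
    pickle.dumps(model_state)           # fails: lambdas cannot be pickled
    pickled_with_callable = True
except (pickle.PicklingError, AttributeError, TypeError):
    pickled_with_callable = False

# Analogue of remove_provided_custom_functions: drop the callable so the
# saved model no longer depends on it being importable at load time.
model_state["loss_function"] = None

restored = pickle.loads(pickle.dumps(model_state))  # now succeeds
```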
Smoothing and Mitigation of Boundary Problems by Regularization. Backwards Compatibility.
- Added `ridge_penalty` Hyperparameter: Introduced a new hyperparameter, `ridge_penalty` (default: 0.0001), which specifies the (weighted) ridge penalty applied to the model. Positive values can smooth model effects and help mitigate boundary problems, such as regression coefficients with excessively high magnitudes near the boundaries. To find the optimal value, consider using a grid search or similar tuning methods. Negative values are treated as zero. The default value of 0.0001 was determined based on empirical tests on more than a hundred datasets.
- Changed Default Value for `early_stopping_rounds`: Updated the default value of the `early_stopping_rounds` hyperparameter from 500 to 200 to improve convergence speed.
- Introduced Backwards Compatibility: Enabled backwards compatibility for already trained models. It is now possible to load models (e.g., using pickle or joblib) as long as they were trained with APLR version 10.6.1 or newer.
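The effect of a ridge penalty on boundary problems can be sketched with generic closed-form ridge regression rather than APLR itself (APLR applies a weighted variant internally): with near-collinear predictors, even a small penalty such as the 0.0001 default pulls inflated coefficients back toward zero.

```python
import numpy as np

def ridge_fit(X, y, penalty):
    """Closed-form ridge regression, solving (X'X + penalty * I) b = X'y.
    A generic illustration of coefficient shrinkage, not APLR's internal code."""
    return np.linalg.solve(X.T @ X + penalty * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X[:, 1] = X[:, 0] + rng.normal(scale=1e-3, size=50)   # near-collinear columns
y = X[:, 0] + rng.normal(scale=0.1, size=50)

unpenalized = ridge_fit(X, y, penalty=0.0)
penalized = ridge_fit(X, y, penalty=0.0001)           # the new default value

# Near-collinearity inflates the unpenalized coefficients; the ridge
# penalty pulls the coefficient vector back toward zero.
shrinkage = np.linalg.norm(penalized) / np.linalg.norm(unpenalized)
```

In practice one would tune `ridge_penalty` over a grid (e.g. 0, 0.0001, 0.001, 0.01) against validation error, as the release notes suggest.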
New Features: Specify Minimum Observations Per Split for Each Predictor & Python 3.13 Support
- Specify Minimum Observations Per Split: Added the ability to set `min_observations_in_split` for individual predictors by passing the `predictor_min_observations_in_split` parameter to the `fit` method.
- Python 3.13 Support: Introduced compatibility with Python 3.13.
- PyPy Wheels Removed: Removed PyPy wheels due to a bug in `setuptools` that prevents using a newer version of `cibuildwheel` for building wheels.
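The per-predictor minimum can be illustrated with a small stand-alone check (an illustrative re-implementation, not APLR's internal code): a candidate split on a predictor is only allowed if both sides retain at least the required number of observations.

```python
def split_allowed(values, threshold, min_obs):
    """Check whether a candidate split on one predictor leaves at least
    `min_obs` observations on each side -- the constraint that
    min_observations_in_split (and its per-predictor variant) imposes.
    Illustrative only; not how APLR implements it internally."""
    left = sum(1 for v in values if v <= threshold)
    right = len(values) - left
    return left >= min_obs and right >= min_obs

values = [0.1, 0.4, 0.5, 0.9, 1.3, 2.0]
allowed = split_allowed(values, threshold=0.5, min_obs=2)       # 3 on each side
too_strict = split_allowed(values, threshold=0.5, min_obs=4)    # only 3 on each side
```

A larger minimum for a given predictor forces its splits to be supported by more data, which can reduce overfitting on that predictor.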
Minor Bugfix: Early Stopping Issue
Fixed a bug that could, on rare occasions, cause premature early stopping when `num_first_steps_with_linear_effects_only` > 0 or `boosting_steps_before_interactions_are_allowed` > 0.
Bugfix: Negative Gradient Calculation for Group MSE Loss Functions
Fixed a bug in the calculation of the negative gradient for the `group_mse` and `group_mse_cycle` loss functions.
Enhanced Control Over Training Stages: Linear, Non-Linear, and Interaction Effects
This update improves the ability to sequentially train linear effects, then non-linear effects, and finally interaction effects. While the default hyperparameters do not follow this sequence, it can be enabled using the `num_first_steps_with_linear_effects_only` and `boosting_steps_before_interactions_are_allowed` parameters.
With this update, at each stage, the algorithm now selects the boosting step with the lowest validation error before progressing to the next stage. This enhancement helps prevent overfitting and allows for a fully fitted linear model before moving on to non-linear and interaction effects. This approach can improve interpretability, as linear effects are typically easier to understand than non-linear effects, and interactions are often the most complex to interpret.
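The stage-wise selection rule described above can be sketched as follows; `best_step_per_stage` and the stage-boundary encoding are illustrative assumptions, not the APLR API.

```python
def best_step_per_stage(validation_errors, stage_boundaries):
    """For each training stage (linear-only, then non-linear, then
    interactions), pick the boosting step with the lowest validation error
    within that stage -- the selection rule described in the release notes,
    re-implemented here for illustration only.

    `stage_boundaries` are the step indices where a new stage begins, e.g.
    derived from num_first_steps_with_linear_effects_only and
    boosting_steps_before_interactions_are_allowed.
    """
    starts = [0] + list(stage_boundaries)
    ends = list(stage_boundaries) + [len(validation_errors)]
    best = []
    for start, end in zip(starts, ends):
        stage = validation_errors[start:end]
        best.append(start + min(range(len(stage)), key=stage.__getitem__))
    return best

errors = [9.0, 7.0, 6.5, 6.8, 5.9, 6.1, 5.5, 5.6]
# Stage 1: steps 0-2 (linear only), stage 2: steps 3-5, stage 3: steps 6-7
chosen = best_step_per_stage(errors, [3, 6])
```

Carrying forward the best step of each stage, rather than the last one, is what prevents a poorly converged stage from contaminating the next.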
Minor bugfix: set_intercept method
This release includes a minor bugfix for the `set_intercept` method. The method now ensures that `get_term_coefficients` returns the updated intercept, reflecting any adjustments made.
New Feature: Adjust Intercept After Model Fitting
The new `set_intercept` method in `APLRRegressor` enables users to manually adjust the model intercept after fitting, which is useful for calibration purposes. The API reference has been updated accordingly, and the README has been rewritten for clarity, with contact details added.
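One simple calibration that an adjustable intercept enables is removing the mean bias of the predictions by shifting the intercept; the adjustment rule below is a generic illustration, not part of APLR.

```python
import numpy as np

def calibrated_intercept(intercept, predictions, y_true):
    """Shift the intercept by the mean residual so that the calibrated
    predictions are unbiased on average. The new value could then be
    passed to a model's set_intercept method; the rule itself is an
    illustration, not APLR code."""
    return intercept + float(np.mean(y_true - predictions))

predictions = np.array([1.0, 2.0, 3.0])
y_true = np.array([1.5, 2.5, 3.5])    # predictions are biased low by 0.5
new_intercept = calibrated_intercept(0.0, predictions, y_true)
# Adding new_intercept to the predictions removes the average bias.
```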