
Commit bfa0cbd (release 10.18.0)
Parent commit: 7e03e55

13 files changed: +514 -411 lines

.github/workflows/build_wheels.yml

Lines changed: 3 additions & 3 deletions
@@ -6,11 +6,11 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        os: [ubuntu-latest, windows-latest, macos-13, macos-14]
+        os: [ubuntu-latest, ubuntu-24.04-arm, windows-latest, windows-11-arm, macos-15-intel, macos-14]
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v5
       - name: Build wheels
-        uses: pypa/cibuildwheel@v2.22.0
+        uses: pypa/cibuildwheel@v3.2.1
         env:
           CIBW_SKIP: "*musllinux* pp*"
           CIBW_ENVIRONMENT: MACOSX_DEPLOYMENT_TARGET=11.0

API_REFERENCE_FOR_APLR_TUNER.md

Lines changed: 8 additions & 8 deletions
@@ -11,14 +11,14 @@ The parameters that you wish to tune.
 Whether you want to use APLRRegressor (True) or APLRClassifier (False).
 
 
-## Method: fit(X: FloatMatrix, y: FloatVector, **kwargs)
+## Method: fit(X: Union[pd.DataFrame, FloatMatrix], y: FloatVector, **kwargs)
 
 ***This method tunes the model to data.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 #### y
 A numpy vector with response values.
@@ -27,40 +27,40 @@ A numpy vector with response values.
 Optional parameters sent to the fit methods in the underlying APLRRegressor or APLRClassifier models.
 
 
-## Method: predict(X: FloatMatrix, **kwargs)
+## Method: predict(X: Union[pd.DataFrame, FloatMatrix], **kwargs)
 
 ***Returns the predictions of the best tuned model as a numpy array if regression or as a list of strings if classification.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 #### kwargs
 Optional parameters sent to the predict method in the best tuned model.
 
 
-## Method: predict_class_probabilities(X: FloatMatrix, **kwargs)
+## Method: predict_class_probabilities(X: Union[pd.DataFrame, FloatMatrix], **kwargs)
 
 ***This method returns predicted class probabilities of the best tuned model as a numpy matrix.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 #### kwargs
 Optional parameters sent to the predict_class_probabilities method in the best tuned model.
 
 
-## Method: predict_proba(X: FloatMatrix, **kwargs)
+## Method: predict_proba(X: Union[pd.DataFrame, FloatMatrix], **kwargs)
 
 ***This method returns predicted class probabilities of the best tuned model as a numpy matrix. Similar to the predict_class_probabilities method but the name predict_proba is compatible with scikit-learn.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 #### kwargs
 Optional parameters sent to the predict_class_probabilities method in the best tuned model.
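
A minimal usage sketch of the DataFrame support documented above (not part of the commit; it assumes aplr 10.18.0 and pandas are installed, that the APLRTuner constructor accepts `parameters` and `is_regressor` as described in this API reference, and that `parameters` is a list of candidate parameter dicts; the data and grid values are illustrative):

```python
import numpy as np
import pandas as pd
from aplr import APLRTuner

rng = np.random.default_rng(0)
n = 150
# Predictors may now be passed as a pandas DataFrame instead of a numpy matrix.
X = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
y = 2.0 * X["x1"].to_numpy() + rng.normal(0, 0.5, n)

tuner = APLRTuner(
    # Assumed format: each dict is one candidate parameter combination to try.
    parameters=[{"max_interaction_level": 0}, {"max_interaction_level": 1}],
    is_regressor=True,
)
tuner.fit(X, y)               # DataFrame input, per the updated signature
predictions = tuner.predict(X)  # numpy array, since is_regressor=True
print(predictions[:5])
```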

API_REFERENCE_FOR_CLASSIFICATION.md

Lines changed: 9 additions & 9 deletions
@@ -65,23 +65,23 @@ Restricts the maximum number of terms in any of the underlying models trained to
 Specifies the (weighted) ridge penalty applied to the model. Positive values can smooth model effects and help mitigate boundary problems, such as regression coefficients with excessively high magnitudes near the boundaries. To find the optimal value, consider using a grid search or similar. Negative values are treated as zero.
 
 
-## Method: fit(X:FloatMatrix, y:List[str], sample_weight:FloatVector = np.empty(0), X_names:List[str] = [], cv_observations:IntMatrix = np.empty([0, 0]), prioritized_predictors_indexes:List[int] = [], monotonic_constraints:List[int] = [], interaction_constraints:List[List[int]] = [], predictor_learning_rates:List[float] = [], predictor_penalties_for_non_linearity:List[float] = [], predictor_penalties_for_interactions:List[float] = [], predictor_min_observations_in_split: List[int] = [])
+## Method: fit(X:Union[pd.DataFrame, FloatMatrix], y:Union[FloatVector, List[str]], sample_weight:FloatVector = np.empty(0), X_names:List[str] = [], cv_observations:IntMatrix = np.empty([0, 0]), prioritized_predictors_indexes:List[int] = [], monotonic_constraints:List[int] = [], interaction_constraints:List[List[int]] = [], predictor_learning_rates:List[float] = [], predictor_penalties_for_non_linearity:List[float] = [], predictor_penalties_for_interactions:List[float] = [], predictor_min_observations_in_split: List[int] = [])
 
 ***This method fits the model to data.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values. If a pandas DataFrame is provided, the model will automatically handle categorical features and missing values. Categorical features will be one-hot encoded. Missing values will be imputed with the median of the column, and a new binary feature will be added to indicate that the value was missing.
 
 #### y
-A list of strings with response values (class names).
+A numpy array or list of strings with response values (class names). Other data types will be converted to strings.
 
 #### sample_weight
 An optional numpy vector with sample weights. If not specified then the observations are weighted equally.
 
 #### X_names
-An optional list of strings containing names for each predictor in ***X***. Naming predictors may increase model readability because model terms get names based on ***X_names***.
+An optional list of strings containing names for each predictor in ***X***. Naming predictors may increase model readability because model terms get names based on ***X_names***. **Note:** This parameter is ignored if ***X*** is a pandas DataFrame; the DataFrame's column names will be used instead.
 
 #### cv_observations
 An optional integer matrix specifying how each training observation is used in cross validation. If this is specified then ***cv_folds*** is not used. Specifying ***cv_observations*** may be useful for example when modelling time series data (you can place more recent observations in the holdout folds). ***cv_observations*** must contain a column for each desired fold combination. For a given column, row values equalling 1 specify that these rows will be used for training, while row values equalling -1 specify that these rows will be used for validation. Row values equalling 0 will not be used.
@@ -108,35 +108,35 @@ An optional list of floats specifying interaction penalties for each predictor.
 An optional list of integers specifying the minimum effective number of observations in a split for each predictor. If provided then this supercedes ***min_observations_in_split***.
 
 
-## Method: predict_class_probabilities(X:FloatMatrix, cap_predictions_to_minmax_in_training:bool = False)
+## Method: predict_class_probabilities(X:Union[pd.DataFrame, FloatMatrix], cap_predictions_to_minmax_in_training:bool = False)
 
 ***Returns a numpy matrix containing predictions of the data in X. Requires that the model has been fitted with the fit method.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 #### cap_predictions_to_minmax_in_training
 If ***True*** then for each underlying logit model the predictions are capped so that they are not less than the minimum and not greater than the maximum prediction or response in the training dataset.
 
 
-## Method: predict(X:FloatMatrix, cap_predictions_to_minmax_in_training:bool = False)
+## Method: predict(X:Union[pd.DataFrame, FloatMatrix], cap_predictions_to_minmax_in_training:bool = False)
 
 ***Returns a list of strings containing predictions of the data in X. An observation is classified to the category with the highest predicted class probability. Requires that the model has been fitted with the fit method.***
 
 ### Parameters
 Parameters are the same as in ***predict_class_probabilities()***.
 
 
-## Method: calculate_local_feature_contribution(X:FloatMatrix)
+## Method: calculate_local_feature_contribution(X:Union[pd.DataFrame, FloatMatrix])
 
 ***Returns a numpy matrix containing feature contribution to the linear predictor in X for each predictor. For each prediction this method uses calculate_local_feature_contribution() in the logit APLRRegressor model for the category that corresponds to the prediction. Example: If a prediction is "myclass" then the method uses calculate_local_feature_contribution() in the logit model that predicts whether an observation belongs to class "myclass" or not.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 
 ## Method: get_categories()
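
A minimal sketch of fitting APLRClassifier on a pandas DataFrame as documented above (not part of the commit; assumes aplr 10.18.0 and pandas are installed; the column names, data, and labels are illustrative):

```python
import numpy as np
import pandas as pd
from aplr import APLRClassifier

rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame(
    {
        "x1": rng.normal(size=n),
        "color": rng.choice(["red", "green", "blue"], n),  # string column: one-hot encoded per the updated docs
    }
)
X.loc[5, "x1"] = np.nan  # missing value: median-imputed, plus an added missingness indicator column
# y as a list of strings (class names), as documented.
y = ["positive" if (c == "red" or v > 0.5) else "negative"
     for c, v in zip(X["color"], X["x1"].fillna(0))]

model = APLRClassifier()
model.fit(X, y)  # X_names is ignored; the DataFrame's column names are used
probabilities = model.predict_class_probabilities(X)  # numpy matrix, one column per class
labels = model.predict(X)                             # list of strings
print(model.get_categories(), labels[:3])
```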

API_REFERENCE_FOR_REGRESSION.md

Lines changed: 17 additions & 17 deletions
@@ -139,14 +139,14 @@ If true, then a mean bias correction is applied to the model's intercept term. T
 If true, then a scaling is applied to the negative gradient to speed up convergence. This should primarily be used when the algorithm otherwise converges too slowly or prematurely. This is only applied for the "identity" and "log" link functions.
 This will not speed up the combination of "mse" loss with an "identity" link, as this combination is already optimized for speed within the algorithm. Furthermore, this option is not effective for all loss functions, such as "mae" and "quantile".
 
-## Method: fit(X:FloatMatrix, y:FloatVector, sample_weight:FloatVector = np.empty(0), X_names:List[str] = [], cv_observations:IntMatrix = np.empty([0, 0]), prioritized_predictors_indexes:List[int] = [], monotonic_constraints:List[int] = [], group:FloatVector = np.empty(0), interaction_constraints:List[List[int]] = [], other_data:FloatMatrix = np.empty([0, 0]), predictor_learning_rates:List[float] = [], predictor_penalties_for_non_linearity:List[float] = [], predictor_penalties_for_interactions:List[float] = [], predictor_min_observations_in_split: List[int] = [])
+## Method: fit(X:Union[pd.DataFrame, FloatMatrix], y:FloatVector, sample_weight:FloatVector = np.empty(0), X_names:List[str] = [], cv_observations:IntMatrix = np.empty([0, 0]), prioritized_predictors_indexes:List[int] = [], monotonic_constraints:List[int] = [], group:FloatVector = np.empty(0), interaction_constraints:List[List[int]] = [], other_data:FloatMatrix = np.empty([0, 0]), predictor_learning_rates:List[float] = [], predictor_penalties_for_non_linearity:List[float] = [], predictor_penalties_for_interactions:List[float] = [], predictor_min_observations_in_split: List[int] = [])
 
 ***This method fits the model to data.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values. If a pandas DataFrame is provided, the model will automatically handle categorical features and missing values. Categorical features will be one-hot encoded. Missing values will be imputed with the median of the column, and a new binary feature will be added to indicate that the value was missing.
 
 #### y
 A numpy vector with response values.
@@ -155,7 +155,7 @@ A numpy vector with response values.
 An optional numpy vector with sample weights. If not specified then the observations are weighted equally.
 
 #### X_names
-An optional list of strings containing names for each predictor in ***X***. Naming predictors may increase model readability because model terms get names based on ***X_names***.
+An optional list of strings containing names for each predictor in ***X***. Naming predictors may increase model readability because model terms get names based on ***X_names***. **Note:** This parameter is ignored if ***X*** is a pandas DataFrame; the DataFrame's column names will be used instead.
 
 #### cv_observations
 An optional integer matrix specifying how each training observation is used in cross validation. If this is specified then ***cv_folds*** is not used. Specifying ***cv_observations*** may be useful for example when modelling time series data (you can place more recent observations in the holdout folds). ***cv_observations*** must contain a column for each desired fold combination. For a given column, row values equalling 1 specify that these rows will be used for training, while row values equalling -1 specify that these rows will be used for validation. Row values equalling 0 will not be used.
@@ -188,14 +188,14 @@ An optional list of floats specifying interaction penalties for each predictor.
 An optional list of integers specifying the minimum effective number of observations in a split for each predictor. If provided then this supercedes ***min_observations_in_split***.
 
 
-## Method: predict(X:FloatMatrix, cap_predictions_to_minmax_in_training:bool = True)
+## Method: predict(X:Union[pd.DataFrame, FloatMatrix], cap_predictions_to_minmax_in_training:bool = True)
 
 ***Returns a numpy vector containing predictions of the data in X. Requires that the model has been fitted with the fit method.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 #### cap_predictions_to_minmax_in_training
 If ***True*** then predictions are capped so that they are not less than the minimum and not greater than the maximum prediction or response in the training dataset. This is recommended especially if ***max_interaction_level*** is high. However, if you need the model to extrapolate then set this parameter to ***False***.
@@ -211,67 +211,67 @@ If ***True*** then predictions are capped so that they are not less than the min
 A list of strings containing names for each predictor in the ***X*** matrix that the model was trained on.
 
 
-## Method: calculate_feature_importance(X:FloatMatrix, sample_weight:FloatVector = np.empty(0))
+## Method: calculate_feature_importance(X:Union[pd.DataFrame, FloatMatrix], sample_weight:FloatVector = np.empty(0))
 
 ***Returns a numpy matrix containing estimated feature importance in X for each predictor.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 
-## Method: calculate_term_importance(X:FloatMatrix, sample_weight:FloatVector = np.empty(0))
+## Method: calculate_term_importance(X:Union[pd.DataFrame, FloatMatrix], sample_weight:FloatVector = np.empty(0))
 
 ***Returns a numpy matrix containing estimated term importance in X for each term in the model.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 
-## Method: calculate_local_feature_contribution(X:FloatMatrix)
+## Method: calculate_local_feature_contribution(X:Union[pd.DataFrame, FloatMatrix])
 
 ***Returns a numpy matrix containing feature contribution to the linear predictor in X for each predictor.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 
-## Method: calculate_local_term_contribution(X:FloatMatrix)
+## Method: calculate_local_term_contribution(X:Union[pd.DataFrame, FloatMatrix])
 
 ***Returns a numpy matrix containing term contribution to the linear predictor in X for each term in the model.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 
-## Method: calculate_local_contribution_from_selected_terms(X:FloatMatrix, predictor_indexes:List[int])
+## Method: calculate_local_contribution_from_selected_terms(X:Union[pd.DataFrame, FloatMatrix], predictor_indexes:List[int])
 
 ***Returns a numpy vector containing the contribution to the linear predictor from an user specified combination of interacting predictors for each observation in X. This makes it easier to interpret interactions (or main effects if just one predictor is specified), for example by plotting predictor values against the term contribution.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 #### predictor_indexes
 A list of integers specifying the indexes of predictors in X to use. For example, [1, 3] means the second and fourth predictors in X.
 
 
-## Method: calculate_terms(X:FloatMatrix)
+## Method: calculate_terms(X:Union[pd.DataFrame, FloatMatrix])
 
 ***Returns a numpy matrix containing values of model terms calculated on X.***
 
 ### Parameters
 
 #### X
-A numpy matrix with predictor values.
+A numpy matrix or pandas DataFrame with predictor values.
 
 
 ## Method: get_term_names()
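
A minimal sketch of the corresponding APLRRegressor workflow on a pandas DataFrame (not part of the commit; assumes aplr 10.18.0 and pandas are installed; column names and data are illustrative):

```python
import numpy as np
import pandas as pd
from aplr import APLRRegressor

rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame(
    {
        "age": rng.uniform(18, 80, n),
        "income": rng.normal(50000, 10000, n),
        "region": rng.choice(["north", "south", "west"], n),  # string column: one-hot encoded per the updated docs
    }
)
X.loc[0, "income"] = np.nan  # missing value: median-imputed, plus an added missingness indicator column
y = 0.1 * X["age"].to_numpy() + rng.normal(0, 1, n)

model = APLRRegressor()
model.fit(X, y)  # X_names is ignored; the DataFrame's column names are used for term names
predictions = model.predict(X)                       # numpy vector
contributions = model.calculate_local_feature_contribution(X)  # per-predictor contributions
print(predictions[:5])
```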
