
Commit 783c70e

10.2.0
1 parent f8e7bb2 commit 783c70e

16 files changed: +274 -114 lines changed

API_REFERENCE_FOR_CLASSIFICATION.md

Lines changed: 8 additions & 3 deletions
@@ -29,7 +29,7 @@ Specifies the maximum number of bins to discretize the data into when searching
 Specifies the maximum allowed depth of interaction terms. ***0*** means that interactions are not allowed. This hyperparameter should be tuned by for example doing a grid search for best predictiveness. For best interpretability use 0 (or 1 if interactions are needed).
 
 #### max_interactions (default = 100000)
-The maximum number of interactions allowed in each underlying model. A lower value may be used to reduce computational time.
+The maximum number of interactions allowed in each underlying model. A lower value may be used to reduce computational time or to increase interpretability.
 
 #### min_observations_in_split (default = 20)
 The minimum effective number of observations that a term in the model must rely on. This hyperparameter should be tuned. Larger values are more appropriate for larger datasets. Larger values result in more robust models (lower variance), potentially at the expense of increased bias.
@@ -125,7 +125,7 @@ Parameters are the same as in ***predict_class_probabilities()***.
 
 ## Method: calculate_local_feature_contribution(X:npt.ArrayLike)
 
-***Returns a numpy matrix containing estimated feature contribution to the linear predictor in X for each predictor.***
+***Returns a numpy matrix containing feature contribution to the linear predictor in X for each predictor. For each prediction this method uses calculate_local_feature_contribution() in the logit APLRRegressor model for the category that corresponds to the prediction. Example: If a prediction is "myclass" then the method uses calculate_local_feature_contribution() in the logit model that predicts whether an observation belongs to class "myclass" or not.***
 
 ### Parameters
 
@@ -160,4 +160,9 @@ A string specifying the label of the category.
 
 ## Method: get_feature_importance()
 
-***Returns a numpy vector containing the feature importance of each predictor, estimated as an average of feature importances for the underlying logit models.***
+***Returns a numpy vector containing the feature importance of each predictor, estimated as an average of feature importances for the underlying logit models.***
+
+
+## Method: get_unique_term_affiliations()
+
+***Returns a list of strings containing unique predictor affiliations for terms.***
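
Usage note: after this change, the columns of calculate_local_feature_contribution() and the entries of get_feature_importance() line up with the affiliations returned by get_unique_term_affiliations(). Below is a minimal sketch of how the updated classifier API could be exercised; the toy data, class labels and the `from aplr import APLRClassifier` import are assumptions for illustration, not part of this commit.

import numpy as np
from aplr import APLRClassifier

# Hypothetical toy data: 100 observations, 3 predictors, 2 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = np.where(X[:, 0] + X[:, 1] > 0, "myclass", "otherclass").tolist()

model = APLRClassifier()
model.fit(X, y)

affiliations = model.get_unique_term_affiliations()            # unique predictor affiliations
importance = model.get_feature_importance()                    # averaged over the underlying logit models
contributions = model.calculate_local_feature_contribution(X)  # one column per affiliation
predictions = model.predict(X)

# Local contributions for the first observation, taken from the logit model
# of its predicted category.
print("prediction:", predictions[0])
for name, value in zip(affiliations, contributions[0]):
    print(f"  {name}: {value:+.4f}")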

API_REFERENCE_FOR_REGRESSION.md

Lines changed: 12 additions & 2 deletions
@@ -32,7 +32,7 @@ Specifies the maximum number of bins to discretize the data into when searching
 Specifies the maximum allowed depth of interaction terms. ***0*** means that interactions are not allowed. This hyperparameter should be tuned by for example doing a grid search for best predictiveness. For best interpretability use 0 (or 1 if interactions are needed).
 
 #### max_interactions (default = 100000)
-The maximum number of interactions allowed in each underlying model. A lower value may be used to reduce computational time.
+The maximum number of interactions allowed in each underlying model. A lower value may be used to reduce computational time or to increase interpretability.
 
 #### min_observations_in_split (default = 20)
 The minimum effective number of observations that a term in the model must rely on. This hyperparameter should be tuned. Larger values are more appropriate for larger datasets. Larger values result in more robust models (lower variance), potentially at the expense of increased bias.
@@ -221,7 +221,7 @@ A numpy matrix with predictor values.
 
 ## Method: calculate_local_feature_contribution(X:npt.ArrayLike)
 
-***Returns a numpy matrix containing estimated feature contribution to the linear predictor in X for each predictor.***
+***Returns a numpy matrix containing feature contribution to the linear predictor in X for each predictor.***
 
 ### Parameters
 
@@ -267,6 +267,16 @@ A numpy matrix with predictor values.
 ***Returns a list of strings containing term names.***
 
 
+## Method: get_term_affiliations()
+
+***Returns a list of strings containing predictor affiliations for terms.***
+
+
+## Method: get_unique_term_affiliations()
+
+***Returns a list of strings containing unique predictor affiliations for terms.***
+
+
 ## Method: get_term_coefficients()
 
 ***Returns a numpy vector containing term regression coefficients.***
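
A rough sketch of the two new regressor methods: get_term_affiliations() returns one affiliation per term in the fitted model, while get_unique_term_affiliations() returns the deduplicated set. The toy data and the `from aplr import APLRRegressor` import below are assumptions for illustration only.

from collections import Counter

import numpy as np
from aplr import APLRRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] * X[:, 1] + X[:, 2] + rng.normal(scale=0.1, size=200)

model = APLRRegressor()
model.fit(X, y)

per_term = model.get_term_affiliations()         # one entry per term
unique = model.get_unique_term_affiliations()    # deduplicated predictor combinations

print(Counter(per_term))   # how many terms fall under each affiliation
print(unique)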

aplr/aplr.py

Lines changed: 9 additions & 0 deletions
@@ -250,6 +250,12 @@ def calculate_terms(self, X: npt.ArrayLike) -> npt.ArrayLike:
     def get_term_names(self) -> List[str]:
         return self.APLRRegressor.get_term_names()
 
+    def get_term_affiliations(self) -> List[str]:
+        return self.APLRRegressor.get_term_affiliations()
+
+    def get_unique_term_affiliations(self) -> List[str]:
+        return self.APLRRegressor.get_unique_term_affiliations()
+
     def get_term_coefficients(self) -> npt.ArrayLike:
         return self.APLRRegressor.get_term_coefficients()
 
@@ -469,6 +475,9 @@ def get_cv_error(self) -> float:
     def get_feature_importance(self) -> npt.ArrayLike:
         return self.APLRClassifier.get_feature_importance()
 
+    def get_unique_term_affiliations(self) -> List[str]:
+        return self.APLRClassifier.get_unique_term_affiliations()
+
     # For sklearn
     def get_params(self, deep=True):
         return {

cpp/APLRClassifier.h

Lines changed: 52 additions & 5 deletions
@@ -21,6 +21,7 @@ class APLRClassifier
     void define_cv_observations(const std::vector<std::string> &y, const MatrixXi &cv_observations_);
     void invert_second_model_in_two_class_case(APLRRegressor &second_model);
     void calculate_validation_metrics();
+    void calculate_unique_term_affiliations();
     void cleanup_after_fit();
 
 public:
@@ -49,6 +50,8 @@ class APLRClassifier
     double penalty_for_non_linearity;
     double penalty_for_interactions;
     size_t max_terms;
+    std::vector<std::string> unique_term_affiliations;
+    std::map<std::string, size_t> unique_term_affiliation_map;
 
     APLRClassifier(size_t m = 3000, double v = 0.1, uint_fast32_t random_state = std::numeric_limits<uint_fast32_t>::lowest(), size_t n_jobs = 0,
                    size_t cv_folds = 5, size_t reserved_terms_times_num_x = 100, size_t bins = 300, size_t verbosity = 0, size_t max_interaction_level = 1,
@@ -72,6 +75,7 @@ class APLRClassifier
     MatrixXd get_validation_error_steps();
     double get_cv_error();
     VectorXd get_feature_importance();
+    std::vector<std::string> get_unique_term_affiliations();
 };
 
 APLRClassifier::APLRClassifier(size_t m, double v, uint_fast32_t random_state, size_t n_jobs, size_t cv_folds,
@@ -104,7 +108,8 @@ APLRClassifier::APLRClassifier(const APLRClassifier &other)
       early_stopping_rounds{other.early_stopping_rounds},
       num_first_steps_with_linear_effects_only{other.num_first_steps_with_linear_effects_only},
       penalty_for_non_linearity{other.penalty_for_non_linearity}, penalty_for_interactions{other.penalty_for_interactions},
-      max_terms{other.max_terms}
+      max_terms{other.max_terms}, unique_term_affiliations{other.unique_term_affiliations},
+      unique_term_affiliation_map{other.unique_term_affiliation_map}
 {
 }
 
@@ -163,6 +168,7 @@ void APLRClassifier::fit(const MatrixXd &X, const std::vector<std::string> &y, c
         }
     }
 
+    calculate_unique_term_affiliations();
     calculate_validation_metrics();
     cleanup_after_fit();
 }
@@ -227,17 +233,47 @@ void APLRClassifier::invert_second_model_in_two_class_case(APLRRegressor &second
     }
 }
 
+void APLRClassifier::calculate_unique_term_affiliations()
+{
+    size_t number_of_term_affiliations{0};
+    for (std::string &category : categories)
+    {
+        number_of_term_affiliations += logit_models[category].number_of_unique_term_affiliations;
+    }
+    std::vector<std::string> term_affiliations;
+    term_affiliations.reserve(number_of_term_affiliations);
+    size_t counter{0};
+    for (std::string &category : categories)
+    {
+        for (auto &affiliation : logit_models[category].unique_term_affiliations)
+        {
+            term_affiliations.push_back(affiliation);
+            ++counter;
+        }
+    }
+    unique_term_affiliations = get_unique_strings_as_vector(term_affiliations);
+    for (size_t i = 0; i < unique_term_affiliations.size(); ++i)
+    {
+        unique_term_affiliation_map[unique_term_affiliations[i]] = i;
+    }
+}
+
 void APLRClassifier::calculate_validation_metrics()
 {
     double category_weight{1.0 / static_cast<double>(categories.size())};
     validation_error_steps = MatrixXd::Constant(m, cv_observations.cols(), 0.0);
     cv_error = 0.0;
-    feature_importance = VectorXd::Constant(logit_models[categories[0]].get_feature_importance().rows(), 0.0);
+    feature_importance = VectorXd::Constant(unique_term_affiliations.size(), 0.0);
     for (std::string &category : categories)
    {
         cv_error += logit_models[category].get_cv_error() * category_weight;
         validation_error_steps += logit_models[category].get_validation_error_steps() * category_weight;
-        feature_importance += logit_models[category].get_feature_importance() * category_weight;
+        for (auto &affiliation : logit_models[category].unique_term_affiliations)
+        {
+            size_t feature_number_in_classifier{unique_term_affiliation_map[affiliation]};
+            size_t feature_number_in_logit_model{logit_models[category].unique_term_affiliation_map[affiliation]};
+            feature_importance[feature_number_in_classifier] += logit_models[category].get_feature_importance()[feature_number_in_logit_model] * category_weight;
+        }
     }
 }
 
@@ -282,11 +318,17 @@ std::vector<std::string> APLRClassifier::predict(const MatrixXd &X, bool cap_pre
 
 MatrixXd APLRClassifier::calculate_local_feature_contribution(const MatrixXd &X)
 {
-    MatrixXd output{MatrixXd::Constant(X.rows(), feature_importance.rows(), 0)};
+    MatrixXd output{MatrixXd::Constant(X.rows(), unique_term_affiliations.size(), 0)};
     std::vector<std::string> predictions{predict(X, false)};
     for (size_t row = 0; row < predictions.size(); ++row)
     {
-        output.row(row) = logit_models[predictions[row]].calculate_local_feature_contribution(X.row(row));
+        VectorXd local_feature_contribution_from_logit_model{logit_models[predictions[row]].calculate_local_feature_contribution(X.row(row)).row(0)};
+        for (auto &affiliation : logit_models[predictions[row]].unique_term_affiliations)
+        {
+            size_t feature_number_in_classifier{unique_term_affiliation_map[affiliation]};
+            size_t feature_number_in_logit_model{logit_models[predictions[row]].unique_term_affiliation_map[affiliation]};
+            output.col(feature_number_in_classifier)[row] = local_feature_contribution_from_logit_model[feature_number_in_logit_model];
+        }
     }
 
     return output;
@@ -327,4 +369,9 @@ double APLRClassifier::get_cv_error()
 VectorXd APLRClassifier::get_feature_importance()
 {
     return feature_importance;
+}
+
+std::vector<std::string> APLRClassifier::get_unique_term_affiliations()
+{
+    return unique_term_affiliations;
 }
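
The per-affiliation bookkeeping above is needed because each logit model may contain a different subset of term affiliations, so feature importances and local contributions can no longer be combined as whole vectors. Below is a standalone Python sketch of the same aggregation idea, with invented category names, affiliation labels and importance values (not taken from APLR itself).

import numpy as np

# Invented per-category feature importances, keyed by term affiliation.
# Each logit model may know about a different subset of affiliations.
importance_per_category = {
    "class_a": {"x1": 0.5, "x1 & x2": 0.25},
    "class_b": {"x1": 0.25, "x3": 0.75},
}

# Union of affiliations across categories, as in calculate_unique_term_affiliations().
unique_affiliations = sorted({a for imps in importance_per_category.values() for a in imps})
affiliation_index = {a: i for i, a in enumerate(unique_affiliations)}

# As in calculate_validation_metrics(): each category contributes its importance
# for an affiliation, weighted by 1 / number_of_categories.
category_weight = 1.0 / len(importance_per_category)
feature_importance = np.zeros(len(unique_affiliations))
for imps in importance_per_category.values():
    for affiliation, value in imps.items():
        feature_importance[affiliation_index[affiliation]] += value * category_weight

print(dict(zip(unique_affiliations, feature_importance.tolist())))
# {'x1': 0.375, 'x1 & x2': 0.125, 'x3': 0.375}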
