Commit fd73bc3

api reference rephrasing
1 parent 79d8796 commit fd73bc3

1 file changed: +1 -1 lines changed

API_REFERENCE.md

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ Limits 1) the number of terms already in the model that can be considered as int
Specifies the variance power for the "tweedie" ***family***.

#### group_size_for_validation_group_mse (default = 100)
-APLR calculates a tuning metric, mean squared error for groups of observations in the validation set. This metric is provided by the method ***get_validation_group_mse()***. The metric may be useful for tuning ***tweedie_power*** and to some extent ***family*** or ***link_function***. The reasoning behind this is that while mean squared error (MSE) could be inappropriate for evaluating goodness of fit on for example tweedie distributed data, MSE is often appropriate for evaluating normally distributed data. The mean response and mean prediction of a group of observations is approximately normally distributed according to the Central Limit Theorem (CLT) if there are enough observations in the group, even if individual observations are not normally distributed. Ideally, ***group_size_for_validation_group_mse*** should be large enough so that the Central Limit Theorem holds (at least 30, but the default of 100 is a safer choice). Also, the number of observations in the validation set should be substantially higher than ***group_size_for_validation_group_mse***.
+APLR calculates a tuning metric, mean squared error for groups of observations in the validation set. This metric is provided by the method ***get_validation_group_mse()***. The metric may be useful for tuning ***tweedie_power*** and to some extent ***family*** or ***link_function***. The reasoning behind this is that mean squared error (MSE) is often appropriate for evaluating goodness of fit on approximately normally distributed data. The mean of a group of observations is approximately normally distributed according to the Central Limit Theorem (CLT) if there are enough observations in the group, regardless of how individual observations are distributed. Ideally, ***group_size_for_validation_group_mse*** should be large enough so that the Central Limit Theorem holds (at least 30, but the default of 100 is a safer choice). Also, the number of observations in the validation set should be substantially higher than ***group_size_for_validation_group_mse***.
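
The grouped metric described in the changed paragraph can be illustrated with a minimal NumPy sketch. This is not APLR's implementation: the function name `group_mse`, the choice to form groups from consecutive observations, and the dropping of a trailing incomplete group are all assumptions made for illustration. It only shows the idea of comparing the mean response with the mean prediction per group and then averaging the squared differences.

```python
import numpy as np


def group_mse(y_true, y_pred, group_size=100):
    """Illustrative grouped MSE (hypothetical helper, not part of APLR).

    Splits the observations into consecutive groups of `group_size`,
    compares the mean response with the mean prediction in each group,
    and returns the mean of the squared differences.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")

    n_groups = y_true.size // group_size
    if n_groups == 0:
        raise ValueError("need at least group_size observations")

    # Assumption: a trailing incomplete group is simply dropped.
    usable = n_groups * group_size
    mean_true = y_true[:usable].reshape(n_groups, group_size).mean(axis=1)
    mean_pred = y_pred[:usable].reshape(n_groups, group_size).mean(axis=1)

    return float(np.mean((mean_true - mean_pred) ** 2))
```

With a validation set that is only slightly larger than the group size, `n_groups` is 1 and the metric rests on a single pair of means, which is one way to see why the validation set should be substantially larger than ***group_size_for_validation_group_mse***.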

## Method: fit(X:npt.ArrayLike, y:npt.ArrayLike, sample_weight:npt.ArrayLike = np.empty(0), X_names:List[str]=[], validation_set_indexes:List[int]=[])

0 commit comments