@@ -130,8 +130,24 @@ cross-validation::
     ...     f"{cv_results['test_score'].std():.3f}"
     ... )
     Balanced accuracy mean +/- std. dev.: 0.724 +/- 0.042
+
+The cross-validation performance looks good, but evaluating the classifiers
+on the left-out data shows a different picture::
 
-We see that the statistical performance are worse than in the previous case.
+    >>> scores = []
+    >>> for fold_id, cv_model in enumerate(cv_results["estimator"]):
+    ...     scores.append(
+    ...         balanced_accuracy_score(
+    ...             y_left_out, cv_model.predict(X_left_out)
+    ...         )
+    ...     )
+    >>> print(
+    ...     f"Balanced accuracy mean +/- std. dev.: "
+    ...     f"{np.mean(scores):.3f} +/- {np.std(scores):.3f}"
+    ... )
+    Balanced accuracy mean +/- std. dev.: 0.698 +/- 0.014
+
+We see that the performance is now worse than the cross-validated performance.
 Indeed, the data leakage gave us too optimistic results due to the reason
 stated earlier in this section.
 
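For readers who want to try the check added in this hunk outside the documentation build, here is a minimal standalone sketch. It is not the code from the patched file. It assumes synthetic data from `make_classification`, a `LogisticRegression` model, and a hand-written undersampling step in place of a dedicated sampler. The names `X_train`, `keep`, `X_resampled` and `cv_mean` are hypothetical; only `cv_results`, `X_left_out`, `y_left_out` and `scores` mirror the names used in the diff.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import cross_validate, train_test_split

# Assumption: synthetic imbalanced data (roughly 5% positives) stands in
# for the dataset used in the documentation.
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_left_out, y_train, y_left_out = train_test_split(
    X, y, stratify=y, random_state=0
)

# Wrong pattern: undersample the majority class once, before cross-validation,
# so every validation fold is balanced unlike the data seen at prediction time.
rng = np.random.default_rng(0)
minority = np.flatnonzero(y_train == 1)
majority = rng.choice(
    np.flatnonzero(y_train == 0), size=minority.size, replace=False
)
keep = np.concatenate([minority, majority])
X_resampled, y_resampled = X_train[keep], y_train[keep]

cv_results = cross_validate(
    LogisticRegression(max_iter=1000),
    X_resampled,
    y_resampled,
    scoring="balanced_accuracy",
    return_estimator=True,  # keep the fitted model of each fold
)

# Score every fold's fitted estimator on data with the original class balance.
scores = [
    balanced_accuracy_score(y_left_out, cv_model.predict(X_left_out))
    for cv_model in cv_results["estimator"]
]
cv_mean = cv_results["test_score"].mean()
print(f"Cross-validated balanced accuracy: {cv_mean:.3f}")
print(f"Left-out balanced accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

The size of the gap between the two printed numbers depends on the data and the model, so this sketch makes no claim about reproducing the 0.724 and 0.698 figures quoted in the diff.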