From my understanding, the workflow suggested in the README is the following (supposing we want to perform hyperparameter optimization):
- Split the data into train, val, and test sets
- For each hyperparameter configuration, train the model on the train set and evaluate it on the val set
- Choose the model with the lowest val loss and retrain it on the train+val set
- Calibrate the model with temperature scaling using the val set
- Evaluate the final model on the test set
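To make the calibration step concrete, here is a minimal self-contained sketch of temperature scaling fitted on held-out logits. The data is synthetic: labels are drawn from a calibrated softmax, and the "model" is then made overconfident by scaling its logits, so the fitted temperature should come out above 1. All names (`nll`, `fit_temperature`) are mine, not from the library under discussion.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def nll(logits, labels, T):
    """Mean negative log-likelihood of temperature-scaled logits."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(val_logits, val_labels):
    """Fit a single scalar temperature T by minimizing NLL on held-out data."""
    res = minimize_scalar(lambda T: nll(val_logits, val_labels, T),
                          bounds=(0.05, 10.0), method="bounded")
    return res.x

# Synthetic check: labels sampled from softmax(z) are calibrated at T=1;
# a model emitting 3*z is overconfident, so the fitted T should be near 3.
rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 3))
probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
labels = np.array([rng.choice(3, p=p) for p in probs])
overconfident_logits = 3.0 * z
T = fit_temperature(overconfident_logits, labels)
print(f"fitted temperature: {T:.2f}")
```

The key property of temperature scaling is that it only rescales confidences and never changes the argmax, so accuracy is untouched regardless of which split it is fitted on.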
But why should we calibrate the model only on the validation set? For example, in the scikit-learn calibration example, the classifier is calibrated on the train set.
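For reference, the scikit-learn pattern alluded to above looks roughly like this: `CalibratedClassifierCV` fits both the base estimator and the calibrator on (internal cross-validation splits of) the same training data. The dataset and base model here are illustrative assumptions, not taken from the original example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Calibration is fitted entirely on the training set via 5-fold internal CV:
# each fold trains the base model on 4/5 of the data and fits the sigmoid
# calibrator on the remaining 1/5.
clf = CalibratedClassifierCV(LogisticRegression(max_iter=1000),
                             method="sigmoid", cv=5)
clf.fit(X_train, y_train)
probs = clf.predict_proba(X_test)
print(probs.shape)
```

Note that even here the calibrator never sees the exact samples its base model was trained on within a fold, so "calibrate on the train set" still means calibrating on data held out from the underlying fit.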
My idea would instead be the following:
- Split the data into train, val, and test sets
- For each hyperparameter configuration, train and calibrate the model on the train set, then evaluate it on the val set
- Choose the model with the lowest val loss, then retrain and recalibrate it on the train+val set
- Evaluate the final model on the test set
Is there something I'm missing?
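The alternative workflow above can be sketched as follows. Everything here is an illustrative assumption (synthetic data, logistic regression as the model, `C` as the hyperparameter, log loss for selection); the point is only where calibration happens relative to the splits.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import log_loss

# Step 1: split into train, val, and test sets
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_trval, X_test, y_trval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trval, y_trval, test_size=0.25, random_state=0)

best = None
for C in [0.01, 0.1, 1.0, 10.0]:
    # Step 2: train AND calibrate on the train set only,
    # then score the calibrated model on the val set
    model = CalibratedClassifierCV(
        LogisticRegression(C=C, max_iter=1000), cv=3)
    model.fit(X_train, y_train)
    val_loss = log_loss(y_val, model.predict_proba(X_val))
    if best is None or val_loss < best[0]:
        best = (val_loss, C)

# Step 3: refit the winning configuration on train+val, calibrating there too
final = CalibratedClassifierCV(
    LogisticRegression(C=best[1], max_iter=1000), cv=3)
final.fit(X_trval, y_trval)

# Step 4: the test set is touched exactly once, for the final evaluation
test_loss = log_loss(y_test, final.predict_proba(X_test))
print(f"best C={best[1]}, test log loss={test_loss:.3f}")
```

One consequence of this arrangement: the val loss used for model selection is computed on an already-calibrated model, so selection and calibration are consistent, at the cost of calibrating on less data per configuration.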