Separate validation precision in trainer arguments #20606

### Description & Motivation

The trainer should accept separate precision settings for training and validation.

This would let users trade some exactness in validation metrics for faster validation. That matters when validation takes a noticeable fraction of total training time, which is common when training language models (e.g., for machine translation), where model quality is measured by comparing *autoregressively generated* outputs with reference sequences.


### Pitch

The functionality would be exposed via a new `precision_val` parameter:

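```python
import lightning

trainer = lightning.Trainer(
    precision="16-mixed",     # used for training, as today
    precision_val="16-true",  # proposed parameter; not part of the current Trainer API
)
```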

### Alternatives

Users could implement custom precision logic in the `on_validation_start` and `on_validation_end` hooks of their `LightningModule`.

However, Lightning seems to generally advise against calling `.to(device, dtype)` to change how the model or tensors are represented in memory, because Lightning handles that itself. Writing custom precision logic in a `LightningModule` therefore appears out of place and error-prone.
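
For illustration, here is a minimal sketch of what that workaround might look like. It uses plain PyTorch so that it runs on its own; `HalfPrecisionValidationMixin` and `TinyModel` are hypothetical names, and I have not verified how this interacts with Lightning's precision plugins or the optimizer state. Only the hook names `on_validation_start` and `on_validation_end` come from Lightning.

```python
import torch
from torch import nn


class HalfPrecisionValidationMixin:
    """Hypothetical mixin: validate with float16 weights, then restore float32."""

    def on_validation_start(self) -> None:
        # Keep an exact float32 copy of the weights. Casting half() and then
        # float() alone would round every weight through float16.
        self._fp32_state = {
            name: tensor.detach().clone()
            for name, tensor in self.state_dict().items()
        }
        self.half()

    def on_validation_end(self) -> None:
        # Cast back and reload the saved float32 values.
        self.float()
        self.load_state_dict(self._fp32_state)
        del self._fp32_state


class TinyModel(HalfPrecisionValidationMixin, nn.Linear):
    """Stand-in for a LightningModule so the sketch needs only PyTorch."""


model = TinyModel(4, 2)
weights_before = model.weight.detach().clone()

model.on_validation_start()
dtype_during_validation = model.weight.dtype

model.on_validation_end()
weights_restored = torch.equal(model.weight, weights_before)
```

Even this small version has to hold a second copy of the weights to avoid silently degrading the training parameters, which is the kind of pitfall a built-in option would hide from the user.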

### Additional context

For inspiration, `Seq2SeqTrainer` in the `transformers` library supports the proposed functionality via the `fp16_full_eval=True` and `bf16_full_eval=True` options of `Seq2SeqTrainingArguments`.

Relevant docs: https://huggingface.co/docs/transformers/en/main_classes/trainer#transformers.Seq2SeqTrainingArguments.fp16_full_eval


cc @lantiga @borda
