Question related to the loss calculation

Hi, thank you for sharing the great code base!!

I have one question about the loss calculation. Why is the loss averaged across all segments during training, while only the loss from the last segment is used as the evaluation loss (and perplexity)? Averaging during training makes sense to me, but shouldn't evaluation also average over all segments, so that the comparison with methods that do not segment the data is fair?
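
To make the question concrete, here is a minimal sketch of the two evaluation choices I am comparing. It is not taken from this repository: the function names `eval_all_segments` and `eval_last_segment` are hypothetical, and the per-token loss values are made up purely for illustration.

```python
import math

def perplexity(token_losses):
    """Perplexity from a flat list of per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_losses) / len(token_losses))

def eval_all_segments(segment_losses):
    """Token-weighted average over every segment (hypothetical helper)."""
    flat = [loss for segment in segment_losses for loss in segment]
    return perplexity(flat)

def eval_last_segment(segment_losses):
    """Average over the tokens of the final segment only (hypothetical helper)."""
    return perplexity(segment_losses[-1])

# Made-up per-token losses for one document split into three segments.
# Later segments are given lower loss here only to illustrate the gap.
segments = [[3.0, 3.0], [2.0, 2.0], [1.0, 1.0]]

ppl_all = eval_all_segments(segments)
ppl_last = eval_last_segment(segments)
print(f"all segments: {ppl_all:.3f}, last segment only: {ppl_last:.3f}")
```

If later segments tend to have lower loss because they see more context, the last-segment number would look better than the all-segment average, which is why I am asking which one is meant to be compared against non-segmented baselines.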


Question related to the loss calculation #28