Question related to the loss calculation #28

@zeyuliu1037

Hi, thank you for sharing the great code base!!

I have a question about the loss calculation. Why is the loss averaged across all segments during training, while only the loss from the last segment is used as the evaluation loss and for perplexity? Averaging during training makes sense to me, but shouldn't the evaluation loss also be averaged over all segments, so that the comparison with methods that do not segment the data is fair?
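To make the comparison concrete, here is a minimal sketch of the two evaluation conventions as I understand them. It is not the repository's code: `eval_metrics` and the loss values are hypothetical, and it assumes every segment contains the same number of tokens, so a plain mean over segments equals the token-weighted mean.

```python
import math


def eval_metrics(segment_losses):
    """Compare last-segment loss with the mean loss over all segments.

    segment_losses: mean per-token cross-entropy (in nats) for each segment
    of one evaluation sequence, in order. Hypothetical helper for illustration.
    """
    last = segment_losses[-1]
    mean = sum(segment_losses) / len(segment_losses)
    return {
        "last_loss": last,
        "mean_loss": mean,
        # Perplexity is exp of the mean per-token cross-entropy.
        "last_ppl": math.exp(last),
        "mean_ppl": math.exp(mean),
    }


# Made-up illustrative values: later segments see more context, so lower loss.
losses = [3.0, 2.5, 2.0]
metrics = eval_metrics(losses)
print(metrics)
```

If later segments have lower loss because they see more context, the last-segment perplexity will be lower than the all-segment perplexity, which is the gap I am asking about.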
