Skip to content

Vocab index ordering? #75

@Tahlor

Description

@Tahlor

Why do we calculate word counts if we ultimately include all words and sort the vocab alphabetically? Mostly I'm wondering if you ordered it with the most common words first, and later reverted to alphabetical ordering for some reason. Would it be bad to have the most common words first? Or do we expect the model would learn better with a random ordering?

In utils.py:

        vocabulary_inv = [x[0] for x in word_counts.most_common()]
        vocabulary_inv = list(sorted(vocabulary_inv))

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions