Cross entropy loss over cosine distance #26

@yl3829

Description

I'm wondering about the reason for using a cross-entropy loss over the cosine similarity between attractors and embeddings.

I suspect that this could cause inefficient use of the embedding space. Cross-entropy loss tries to maximize the cosine similarity between attractors and embeddings on the positive pairs, which is fine. But on the negative pairs, it will minimize the cosine similarity, pushing it toward -1 instead of 0. A similarity of -1 means the two vectors are fully related but point in opposite directions, whereas 0 means they are orthogonal (unrelated). I believe this restricts the representation space of the attractors and embeddings.
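For concreteness, here is a minimal PyTorch sketch of the behavior I'm describing. The function name, tensor shapes, and the exact BCE-over-logits formulation are my assumptions for illustration, not necessarily what this repo actually does:

```python
import torch
import torch.nn.functional as F

def cosine_bce_loss(attractors, embeddings, labels):
    # attractors: (A, D), embeddings: (E, D), labels: (A, E) in {0, 1}
    # Cosine similarity matrix in [-1, 1] via broadcasting.
    sim = F.cosine_similarity(
        attractors.unsqueeze(1),   # (A, 1, D)
        embeddings.unsqueeze(0),   # (1, E, D)
        dim=-1,
    )                              # -> (A, E)
    # Using the raw similarity as a logit: for a negative pair (label 0),
    # the loss keeps decreasing all the way down to sim = -1, so the
    # optimum is "opposite direction", not "orthogonal" (sim = 0).
    return F.binary_cross_entropy_with_logits(sim, labels.float())

# All-negative toy example: the gradient is nonzero until every cosine
# similarity reaches -1.
attractors = torch.randn(3, 16, requires_grad=True)
embeddings = torch.randn(5, 16)
labels = torch.zeros(3, 5)
loss = cosine_bce_loss(attractors, embeddings, labels)
loss.backward()
```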

Would love to hear your thoughts and considerations here. Thanks!
