I'm wondering about the reason for using cross entropy loss over the cosine similarity between attractors and embeddings.
I suspect this causes inefficient use of the embedding space. Cross entropy loss tries to maximize the cosine similarity between the attractors and the embeddings on the positive samples, which is fine. But on the negative samples, it minimizes the cosine similarity, pushing it toward -1 instead of 0. A similarity of -1 means the vectors are fully correlated, just pointing in opposite directions; orthogonality (similarity 0) would be the more natural target for unrelated pairs. I believe this restricts the representation space of the attractors and embeddings.
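Here's a minimal sketch of what I mean, assuming a binary cross entropy over the raw cosine similarity used as the logit (the names and setup are illustrative, not the repo's actual code). For a negative pair, the loss keeps dropping all the way to similarity -1, so there is nothing stopping at 0:

```python
import torch
import torch.nn.functional as F

# Hypothetical loss: cosine similarity fed directly into BCE as a logit.
def bce_over_cosine(attractor, embedding, target):
    cos = F.cosine_similarity(attractor, embedding, dim=-1)
    return F.binary_cross_entropy_with_logits(cos, target)

# For a negative pair (target = 0), the loss decreases monotonically as
# the cosine goes from +1 down to -1, so gradient descent drives the pair
# toward similarity -1 (anti-parallel), not 0 (orthogonal).
for cos in [0.5, 0.0, -0.5, -1.0]:
    a = torch.tensor([[1.0, 0.0]])                      # unit vector
    e = torch.tensor([[cos, (1 - cos**2) ** 0.5]])      # unit vector at angle acos(cos)
    loss = bce_over_cosine(a, e, torch.tensor([0.0]))
    print(f"cos = {cos:+.1f} -> BCE loss = {loss.item():.4f}")
```

Running this prints a strictly decreasing loss down to cos = -1.0, which is the behavior I'm concerned about.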
Would love to hear your thoughts and considerations here. Thanks!