Cross entropy loss over cosine distance #26

@yl3829

Description

I'm wondering about the reason for using a cross-entropy loss over the cosine similarity between attractors and embeddings.

I suspect that this could cause inefficient use of the embedding space. Cross-entropy loss tries to maximize the cosine similarity between attractors and embeddings on the positive pairs, which is fine. But on the negative pairs, it will minimize the cosine similarity, pushing it toward -1 instead of 0. A similarity of -1 means the two vectors are fully related but point in opposite directions, whereas 0 means they are orthogonal (unrelated). I believe this restricts the representation space of the attractors and embeddings.
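For concreteness, here is a minimal PyTorch sketch of the behavior I'm describing. The function name, tensor shapes, and the exact BCE-over-logits formulation are my assumptions for illustration, not necessarily what this repo actually does:

```python
import torch
import torch.nn.functional as F

def cosine_bce_loss(attractors, embeddings, labels):
    # attractors: (A, D), embeddings: (E, D), labels: (A, E) in {0, 1}
    # Cosine similarity matrix in [-1, 1] via broadcasting.
    sim = F.cosine_similarity(
        attractors.unsqueeze(1),   # (A, 1, D)
        embeddings.unsqueeze(0),   # (1, E, D)
        dim=-1,
    )                              # -> (A, E)
    # Using the raw similarity as a logit: for a negative pair (label 0),
    # the loss keeps decreasing all the way down to sim = -1, so the
    # optimum is "opposite direction", not "orthogonal" (sim = 0).
    return F.binary_cross_entropy_with_logits(sim, labels.float())

# All-negative toy example: the gradient is nonzero until every cosine
# similarity reaches -1.
attractors = torch.randn(3, 16, requires_grad=True)
embeddings = torch.randn(5, 16)
labels = torch.zeros(3, 5)
loss = cosine_bce_loss(attractors, embeddings, labels)
loss.backward()
```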

Would love to hear your thoughts and considerations here. Thanks!
