Great question. The short answer is yes: the FAISS index type matters, and the right choice depends on your use case and embedding setup.


🧠 Cosine vs Dot Product:

  • IndexFlatL2 uses L2 (Euclidean) distance, not cosine similarity.
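  • If you're using cosine-based embeddings (like all-MiniLM-L6-v2), then you should either:
    • Normalize your vectors to unit length before adding them (on unit vectors, L2 distance ranks results the same way as cosine similarity), or
    • Switch to IndexFlatIP (inner product), which equals cosine similarity only if both the stored vectors and the queries are normalized.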

⚠️ Without normalization, IndexFlatIP is magnitude-sensitive: longer vectors score higher regardless of direction, so the ranking can differ from a cosine ranking.
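
To make the warning concrete, here is a small NumPy-only sketch of the math behind the two index types. It does not call FAISS; it only computes the scores that IndexFlatIP (inner product) and IndexFlatL2 (Euclidean distance) are based on. The toy vectors and the `unit` helper are made up for illustration.

```python
import numpy as np

# Hypothetical toy data: one query and three documents in 2-D.
query = np.array([[1.0, 0.0]], dtype=np.float32)
docs = np.array([
    [0.9, 0.1],  # doc 0: almost the same direction as the query, small norm
    [5.0, 5.0],  # doc 1: 45 degrees away from the query, large norm
    [0.0, 1.0],  # doc 2: orthogonal to the query
], dtype=np.float32)

def unit(x):
    """Scale each row to unit L2 norm (assumes no all-zero rows)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Cosine similarity: inner product of unit vectors (higher is better).
cosine = (unit(query) @ unit(docs).T)[0]

# Raw inner product, i.e. inner-product scoring without normalization.
raw_ip = (query @ docs.T)[0]

# Squared L2 distance with and without normalization (lower is better).
raw_l2_sq = ((query - docs) ** 2).sum(axis=1)
unit_l2_sq = ((unit(query) - unit(docs)) ** 2).sum(axis=1)

cosine_rank = np.argsort(-cosine)      # best match first
raw_ip_rank = np.argsort(-raw_ip)      # dominated by the large-norm doc 1
raw_l2_rank = np.argsort(raw_l2_sq)    # also differs from the cosine order
unit_l2_rank = np.argsort(unit_l2_sq)  # matches the cosine order

print("cosine ranking:       ", cosine_rank.tolist())
print("raw inner product:    ", raw_ip_rank.tolist())
print("raw L2:               ", raw_l2_rank.tolist())
print("L2 on unit vectors:   ", unit_l2_rank.tolist())
```

On unit vectors the squared L2 distance equals 2 − 2·cos, which is why the normalized L2 ranking agrees with cosine while both unnormalized rankings drift.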


💡 When to normalize:

import numpy as np
# vectors: float array of shape (n, d) with no all-zero rows; normalize query vectors the same way
normalized_vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
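
A slightly more defensive variant of the one-liner above, as a sketch: it guards against all-zero rows (which would otherwise divide by zero) and returns float32, the dtype FAISS indexes expect. The helper name `normalize_rows` and the `eps` parameter are hypothetical, not part of FAISS or NumPy.

```python
import numpy as np

def normalize_rows(x, eps=1e-12):
    """Return a float32 copy of x with each row scaled to unit L2 norm.

    Rows whose norm is below eps are left as zeros instead of producing NaN.
    """
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

# Apply the same normalization to the stored vectors and to the queries.
corpus = normalize_rows([[3.0, 4.0], [0.0, 0.0], [1.0, 1.0]])
queries = normalize_rows([[10.0, 0.0]])

print(corpus)
print(queries)
```

The normalized arrays can then be passed to `index.add(...)` and `index.search(...)`. FAISS's Python API also provides `faiss.normalize_L2(x)`, which normalizes a float32 array in place, if you prefer not to maintain your own helper.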

If you use HuggingFaceEmbedding…

Answer selected by davidshen84