What does normalizing embeddings do? #5067
ShelbyJenkins
started this conversation in
General
Replies: 2 comments
-
Hi @ShelbyJenkins, sorry to hear that the large amount of memory required by Haystack's dependencies is an issue. This discussion may be relevant to you: #5032. Vector length normalization scales all vectors to the same length, so that when we compare vectors we are comparing only their direction, not their magnitude.
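To make the "direction only" point concrete, here is a small sketch (the vectors and the `normalize` helper are my own illustration, not Haystack code): after L2 normalization, the dot product of two vectors equals their cosine similarity, so magnitude differences stop mattering.

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length (L2 norm = 1)."""
    return v / np.linalg.norm(v)

# Two embedding-like vectors pointing in nearly the same direction,
# but with very different magnitudes.
a = np.array([3.0, 4.0])
b = np.array([30.0, 40.5])

a_n, b_n = normalize(a), normalize(b)

# Both normalized vectors now have length 1 ...
print(np.linalg.norm(a_n))  # → 1.0

# ... so their dot product is the cosine similarity: close to 1.0 here,
# because only direction is compared, not magnitude.
print(np.dot(a_n, b_n))
```

This is why mixing normalized and unnormalized vectors in the same index (e.g. indexing with one tool and querying with another) can silently break dot-product retrieval.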
-
@julian-risch thanks for the update. Good to know about that option. FWIW, my little chat bot's image size without Haystack ended up at 150 MB. I'm new to deploying apps in containers, but that seems reasonable.
-
I really liked Haystack, but I'm trying to migrate off of it for my chat bot project due to the large size (5 GB compressed) of the containers.
Without Haystack, though, I can't retrieve embeddings from Pinecone: queries return no results when I use just the OpenAI embedding endpoint and the pinecone-client package. I think I've tracked the issue down to Haystack performing operations on the embedding vectors before indexing. I'm sure there is a way to replicate that without Haystack, but the levels of abstraction are too much of a time sink for me to parse, so I'll just recreate my Pinecone index from scratch.
But I am curious what value this sort of normalization brings!
from:
haystack/haystack/document_stores/base.py
Line 313 in a9a49e2