-
I am using HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2"). When I look at the raw numbers, they do not match what I see in default__vector_store.json. Is this the expected behaviour? Does llama_index apply some post-processing to the embeddings?
Replies: 2 comments
-
🤖 Hey there, @gzguevara! Good to see you back here. Don't worry, I'm diving into your issue and will get back to you with a detailed response soon. Hang tight!
-
🤖 Hey @gzguevara, hope you're doing well! Good to see you again. We're diving into another issue, let's get started. Yes, the behavior you're observing is expected. LlamaIndex does apply post-processing to the embeddings from HuggingFaceEmbedding. The raw numbers do not match the values in default__vector_store.json because of the following post-processing steps:
Pooling: Depending on the pooling attribute of the HuggingFaceEmbedding class, the embeddings are pooled. If pooling is set to Pooling.CLS, the embeddings are pooled using the cls_pooling method of the Pooling class. If pooling is not set to Pooling.CLS, the _mean_pooling method is used to pool the embeddings. This m…
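To make the difference between the two pooling modes concrete, here is a minimal pure-Python sketch on toy data. It is not LlamaIndex's actual implementation: the function names `cls_pooling` and `mean_pooling`, the list-based vectors, and the attention-mask handling are assumptions chosen for illustration only. It shows why a pooled vector will generally differ from any single raw token vector the model emits.

```python
from typing import List

Vector = List[float]


def cls_pooling(token_embeddings: List[Vector]) -> Vector:
    """Hypothetical helper: take the first token's vector (the [CLS] position)."""
    return list(token_embeddings[0])


def mean_pooling(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Hypothetical helper: average the vectors of tokens whose mask value is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # avoid dividing by zero if every token is masked
    return [t / count for t in total]


# Toy "model output": four token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

cls_vec = cls_pooling(tokens)
mean_vec = mean_pooling(tokens, mask)

print("CLS pooled: ", cls_vec)
print("Mean pooled:", mean_vec)
```

On this toy input the two modes give different vectors, and the padded token is ignored by the mean. When comparing your own raw model output against the stored values, it is worth checking which pooling mode was in effect.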