Replies: 1 comment
Answer generated by a 🤖

Hello,

Yes, the question needs to be embedded for the similarity search performed by the retriever, and this process does consume tokens. In the LangChain framework, the relevant retriever method looks like this:

```typescript
async _getRelevantDocuments(
  query: string,
  runManager?: CallbackManagerForRetrieverRun
): Promise<Document[]> {
  const searchResults = await this.hybridSearch(
    query,
    this.similarityK,
    this.keywordK,
    runManager?.getChild("hybrid_search")
  );
  return searchResults.map(([doc]) => doc);
}
```

In this code, the question is converted into a query that is used to retrieve documents. Embedding the question for this search is what consumes tokens.

As for the number 768 that appears in the shared context: it is the length of the embedding vector, not the number of tokens consumed. The test case for HuggingFaceInferenceEmbeddings checks that the query embedding has 768 dimensions:

```typescript
const queryEmbedding = await embeddings.embedQuery(documents[0]);
expect(queryEmbedding).toHaveLength(768);
```

The embedding dimension is fixed by the model, whereas the number of tokens consumed depends on the length of the question itself.

I hope this answers your question. If you have any more questions, feel free to ask.

Best,

Sources
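To make the distinction concrete: the embedding dimension is a fixed property of the model, while the token count grows with the query text. Below is a minimal self-contained sketch of that distinction, using a hypothetical whitespace tokenizer and a stand-in embedder — these are illustrative toys, not real LangChain or HuggingFace APIs:

```typescript
// Toy illustration (NOT LangChain code): the number of tokens depends on the
// length of the query text, while the embedding dimension is a fixed property
// of the model (768 in the test above).
function countTokensNaively(text: string): number {
  // Hypothetical whitespace tokenizer; real models use subword tokenizers,
  // so actual token counts will differ.
  return text.trim().split(/\s+/).length;
}

function fakeEmbed(text: string, dim: number = 768): number[] {
  // Stand-in embedder: always returns a vector of fixed length `dim`,
  // regardless of how long the input text is.
  return Array.from(
    { length: dim },
    (_, i) => (text.charCodeAt(i % text.length) % 100) / 100
  );
}

const shortQuery = "What is LangChain?";
const longQuery =
  "What is LangChain and how many tokens does embedding a question consume?";

console.log(countTokensNaively(shortQuery)); // small token count
console.log(countTokensNaively(longQuery));  // larger token count
console.log(fakeEmbed(shortQuery).length);   // 768
console.log(fakeEmbed(longQuery).length);    // 768
```

So whichever question you embed, the resulting vector has length 768, but a longer question costs more tokens to embed.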
This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.