How can I split a query into sentences for sentence window retrieval and reranking, while still sending it as a single query to the LLM? #13835
So here is a snippet of my program:

The issue here is that my queries are not the typical questions one might ask with RAG; they are huge paragraphs, so retrieval is not very good. What I want is:

Query -> broken into sentences -> each sentence retrieving a few documents (including the sentence window and Rerank) -> the query + all the retrieved documents from all the sentences going to the LLM as {context}

Of course, there is the option to use SubquestionPostProcessors, MultiStepPostProcessors, etc.
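The pipeline described above (query -> sentence split -> per-sentence retrieval -> combined context for one LLM call) can be sketched framework-free. This is only an illustration: `split_sentences`, `retrieve`, and `build_context` are hypothetical stand-ins for LlamaIndex's `SentenceSplitter` and a real retriever, and the word-overlap scoring is a toy ranking, not what a vector index does.

```python
import re

def split_sentences(query: str) -> list[str]:
    # Naive regex splitter; a stand-in for SentenceSplitter
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", query) if s.strip()]

def retrieve(sentence: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the sentence
    words = set(re.findall(r"\w+", sentence.lower()))
    scored = sorted(
        corpus,
        key=lambda doc: -len(words & set(re.findall(r"\w+", doc.lower()))),
    )
    return scored[:top_k]

def build_context(query: str, corpus: list[str]) -> list[str]:
    # Retrieve per sentence, deduplicating while preserving order;
    # the ORIGINAL query plus these docs would then go to the LLM as {context}
    seen, docs = set(), []
    for sentence in split_sentences(query):
        for doc in retrieve(sentence, corpus):
            if doc not in seen:
                seen.add(doc)
                docs.append(doc)
    return docs
```

The key point is the deduplication step: when sentences in the same paragraph retrieve overlapping documents, each document should appear in the context only once.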
Replies: 1 comment 2 replies
Hey there, @Satyam7166-tech! I'm Dosu, a bot here to help you with bug fixes, answering questions, and becoming a contributor while you wait for a human maintainer. How can I assist you today?

To split a query into sentences for sentence window retrieval and reranking while still sending it as a single query to the LLM efficiently, you can use the `QueryFusionRetriever` and `SentenceSplitter` classes from the LlamaIndex framework.
Here's a modified version of your code snippet incorporating these steps:

```python
from llama_index.core.retrievers.fusion_retriever import QueryFusionRetriever, FUSION_MODES
from llama_index.core.llms.utils import resolve_llm
from llama_index.core.schema import QueryBundle
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.postprocessor import MetadataReplacementPostProcessor

# Initialize the QueryFusionRetriever
retrievers = [BaseRetriever1(), BaseRetriever2()]  # Replace with actual retrievers
llm = resolve_llm("openai-gpt-3")  # Replace with actual LLM
query_fusion_retriever = QueryFusionRetriever(
    retrievers=retrievers,
    llm=llm,
    mode=FUSION_MODES.RECIPROCAL_RANK,  # Choose the desired fusion mode
    num_queries=4,   # Number of queries to generate
    use_async=True,  # Use asynchronous execution
)

# Initialize the SentenceSplitter
sentence_splitter = SentenceSplitter.from_defaults()

# Example query
query = "This is the first sentence. This is the second sentence. This is the third sentence."
query_bundle = QueryBundle(query)

# Retrieve and rerank results (use the public retrieve() method, not _retrieve())
results = query_fusion_retriever.retrieve(query_bundle)

# Split sentences
split_sentences = sentence_splitter.split_text(query)

# Combine the sentences back into a single query string
single_query = " ".join(split_sentences)

# Use the combined query in your query engine
query_engine = sentence_index.as_query_engine(
    similarity_top_k=top_k,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window"),
        rerank,  # the reranker postprocessor defined earlier in your snippet
    ],
)
window_response = query_engine.query(single_query)
print(window_response)
```

This approach ensures that your query is split into sentences, processed for semantic similarity, optimized for relevance, and then combined back into a single query string for efficient LLM processing [1][2][3][4].
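For intuition on what `FUSION_MODES.RECIPROCAL_RANK` does when combining results from several retrievers (or several sub-queries), here is a minimal sketch of reciprocal-rank fusion. This is an illustration of the technique, not LlamaIndex's internal code; `k=60` is the smoothing constant conventionally used with RRF, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each inner list is one retriever's ranking, best result first.
    # A document's fused score is the sum of 1 / (k + rank) over all
    # rankings it appears in, so documents ranked well by several
    # retrievers rise to the top.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because the score only depends on ranks, RRF needs no score normalization across retrievers, which is why it is a popular default fusion mode.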