⭐️ Highlights
🔍 Smarter, Broader Retrieval with Multi-Query RAG
This release introduces three new components that significantly boost retrieval recall in RAG systems by expanding the user query and retrieving documents across multiple reformulations:
- QueryExpander generates semantically similar variations of a user query to broaden search coverage.
- MultiQueryTextRetriever runs multiple queries in parallel using a text-based retriever (e.g., BM25) and merges results by score.
- MultiQueryEmbeddingRetriever performs the same multi-query retrieval flow using embeddings, enabling richer semantic recall.
Used together, these components form a multi-query retrieval pipeline that improves recall, especially when queries are short or ambiguous.
🧪 Example: Expanding a Query and Retrieving More Relevant Documents
from haystack import Pipeline
from haystack.components.query import QueryExpander
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.components.retrievers import MultiQueryTextRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack import Document
from haystack.document_stores.types import DuplicatePolicy
# Sample documents
docs = [
    Document(content="Renewable energy comes from natural sources like wind and sunlight."),
    Document(content="Geothermal energy is heat from beneath the Earth's surface."),
    Document(content="Hydropower generates electricity using flowing water."),
]
# Store documents
store = InMemoryDocumentStore()
writer = DocumentWriter(document_store=store, policy=DuplicatePolicy.SKIP)
writer.run(documents=docs)
# Components
expander = QueryExpander()
retriever = InMemoryBM25Retriever(document_store=store, top_k=1)
multi_retriever = MultiQueryTextRetriever(retriever=retriever)
# Expand and retrieve
expanded = expander.run(query="renewable energy")
results = multi_retriever.run(queries=expanded["queries"])
for doc in results["documents"]:
    print(doc.content)
This example expands "renewable energy" into multiple related queries, retrieves documents for each in parallel, and returns a richer set of relevant results, showing how multi-query retrieval can improve recall with little extra code.
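To make the "merge results by score" idea concrete without depending on Haystack or an LLM, here is a minimal pure-Python sketch. Everything in it is an assumption made for illustration: `CORPUS`, `toy_retrieve`, and `multi_query_retrieve` are hypothetical names, the word-overlap scoring stands in for BM25, and the merge rule (keep each document's best score, then sort) is one plausible strategy rather than a description of how MultiQueryTextRetriever is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus; the ids and texts are made up for this illustration.
CORPUS = {
    "d1": "renewable energy comes from natural sources like wind and sunlight",
    "d2": "geothermal energy is heat from beneath the earth's surface",
    "d3": "hydropower generates electricity using flowing water",
}


def toy_retrieve(query, top_k=2):
    """Hypothetical stand-in for a text retriever: score = number of shared words."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in CORPUS.items():
        overlap = len(terms & set(text.split()))
        if overlap:
            scored.append((doc_id, float(overlap)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def multi_query_retrieve(queries, top_k=2):
    """Run all queries concurrently, keep each document's best score, sort by score."""
    with ThreadPoolExecutor() as pool:
        per_query = list(pool.map(lambda q: toy_retrieve(q, top_k), queries))
    best = {}
    for hits in per_query:
        for doc_id, score in hits:
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    # Highest score first; ties are broken by document id for a stable order.
    return sorted(best.items(), key=lambda pair: (-pair[1], pair[0]))


single = multi_query_retrieve(["renewable energy"])
merged = multi_query_retrieve(
    ["renewable energy", "electricity from flowing water", "heat energy"]
)
print(single)
print(merged)
```

With the original query alone, the hydropower document is never returned; it only appears once a reformulation mentions electricity and water, which is the recall gain that query expansion is meant to provide.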
⬆️ Upgrade Notes
- Updated the default Azure OpenAI model from gpt-4o-mini to gpt-4.1-mini and the default API version from 2023-05-15 to 2024-12-01-preview for both AzureOpenAIGenerator and AzureOpenAIChatGenerator.
- The default OpenAI model has been changed from gpt-4o-mini to gpt-5-mini for OpenAIChatGenerator and OpenAIGenerator. If you rely on the default model and need to continue using gpt-4o-mini, explicitly specify it when initializing these components: OpenAIChatGenerator(model="gpt-4o-mini").
🚀 New Features
- Added three new components: QueryExpander, MultiQueryEmbeddingRetriever, and MultiQueryTextRetriever. When used together, they expand a query and use each expansion to retrieve a potentially different set of documents.
⚡️ Enhancement Notes
- Added a return_empty_on_no_match parameter (default True) to RegexTextExtractor.__init__(). When set to False, the component returns {"captured_text": ""} instead of {} when no regex match is found, which gives downstream pipeline components a consistent output structure.
- The FilterRetriever and AutoMergingRetriever components now support asynchronous execution.
- Previously, when using tracing with objects like ByteStream and ImageContent, the payload sent to the tracing backend could become too large, hitting provider limits or causing performance degradation. We now replace these objects with string placeholders to avoid oversized payloads.
- The default OpenAI model for OpenAIChatGenerator and OpenAIGenerator has been updated from gpt-4o-mini to gpt-5-mini.
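The two output shapes described for return_empty_on_no_match in the first note above can be sketched in plain Python. This is an illustration of the documented behavior only: `extract_captured_text` is a hypothetical helper written for this note, not the RegexTextExtractor implementation, and its choice of the first capture group is an assumption.

```python
import re


def extract_captured_text(pattern, text, return_empty_on_no_match=True):
    """Hypothetical helper that mirrors the two documented no-match outputs."""
    match = re.search(pattern, text)
    if match:
        # Assumption: use the first capture group if the pattern defines one,
        # otherwise fall back to the whole match.
        captured = match.group(1) if match.groups() else match.group(0)
        return {"captured_text": captured or ""}
    # No match: an empty dict by default, or a stable key with an empty string.
    return {} if return_empty_on_no_match else {"captured_text": ""}


pattern = r"<issue>(.*?)</issue>"
found = extract_captured_text(pattern, "<issue>42</issue>")
default_miss = extract_captured_text(pattern, "no tags here")
stable_miss = extract_captured_text(pattern, "no tags here", return_empty_on_no_match=False)
print(found, default_miss, stable_miss)
```

With the flag set to False, a downstream consumer can always read the "captured_text" key without first checking whether it exists.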
🐛 Bug Fixes
-
Ensure request header keys are unique in link_content to prevent 400 Bad Request errors.
Some image providers return a 400 Bad Request when using ImageContent.from_url() because the User-Agent header appears multiple times with different casing (e.g., user-agent, User-Agent). This update normalizes header keys in a case-insensitive way, removes duplicates, and preserves only the last occurrence.
-
Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.
-
Fix the serialization and deserialization of pipeline_outputs in pipeline_snapshot so that it uses the same schema as the rest of the pipeline state when running pipelines with breakpoints. Deserialization of the older pipeline_outputs format, which has no serialization schema, is supported until Haystack 2.23.0.
-
Fixed ToolInvoker missing tools after warmup for lazy-initialized toolsets. The invoker now refreshes its tool registry post-warmup, ensuring replaced placeholders (e.g., MCPToolset with eager_connect=False) resolve to the actual tool names at invocation time.
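As an illustration of the header fix listed first in this section, the following sketch deduplicates header keys case-insensitively and keeps only the last occurrence. `dedupe_headers` is a hypothetical function written for this note, not the code in link_content, and keeping the casing of the last occurrence is an assumption; the actual fix may normalize casing differently.

```python
def dedupe_headers(pairs):
    """Collapse header keys case-insensitively, keeping only the last occurrence."""
    latest = {}
    for key, value in pairs:
        lowered = key.lower()
        # Drop any earlier entry so the surviving key sits at its last position.
        latest.pop(lowered, None)
        latest[lowered] = (key, value)
    return dict(latest.values())


headers = dedupe_headers(
    [
        ("user-agent", "python-httpx"),
        ("Accept", "image/*"),
        ("User-Agent", "haystack-example"),
    ]
)
print(headers)
```

Only one User-Agent entry survives, so a server that rejects repeated headers no longer has a reason to return 400 Bad Request.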
💙 Big thank you to everyone who contributed to this release!
@Amnah199, @anakin87, @davidsbatista, @dfokina, @mrchtr, @OscarPindaro, @schwartzadev, @sjrl, @TaMaN2031A, @vblagoje, @YassineGabsi, @ZeJ0hn