Release v2.21.0 · deepset-ai/haystack

⭐️ Highlights

🔍 Smarter, Broader Retrieval with Multi-Query RAG

This release introduces three new components that significantly boost retrieval recall in RAG systems by expanding the user query and retrieving documents across multiple reformulations:

QueryExpander generates semantically similar variations of a user query to broaden search coverage.
MultiQueryTextRetriever runs multiple queries in parallel using a text-based retriever (e.g., BM25) and merges results by score.
MultiQueryEmbeddingRetriever performs the same multi-query retrieval flow using embeddings, enabling richer semantic recall.

Used together, these components create a multi-query retrieval pipeline that improves recall, especially when queries are short or ambiguous.

🧪 Example: Expanding a Query and Retrieving More Relevant Documents

from haystack import Pipeline
from haystack.components.query import QueryExpander
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.components.retrievers import MultiQueryTextRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack import Document
from haystack.document_stores.types import DuplicatePolicy

# Sample documents
docs = [
    Document(content="Renewable energy comes from natural sources like wind and sunlight."),
    Document(content="Geothermal energy is heat from beneath the Earth's surface."),
    Document(content="Hydropower generates electricity using flowing water."),
]

# Store documents
store = InMemoryDocumentStore()
writer = DocumentWriter(document_store=store, policy=DuplicatePolicy.SKIP)
writer.run(documents=docs)

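# Components
# QueryExpander falls back to an OpenAI chat generator when none is passed,
# so OPENAI_API_KEY is expected in the environment.
expander = QueryExpander()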
retriever = InMemoryBM25Retriever(document_store=store, top_k=1)
multi_retriever = MultiQueryTextRetriever(retriever=retriever)

# Expand and retrieve
expanded = expander.run(query="renewable energy")
results = multi_retriever.run(queries=expanded["queries"])

for doc in results["documents"]:
    print(doc.content)

This example expands "renewable energy" into multiple related queries, retrieves documents for each in parallel, and returns a richer set of relevant results, showing how multi-query retrieval can improve recall with minimal effort.
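To make the "merges results by score" step more concrete, here is a minimal sketch in plain Python of how results from several query variants might be deduplicated and ranked. It is not Haystack's implementation: `ScoredDoc`, `merge_by_score`, the document IDs, and the scores are all hypothetical, and keeping each document's best score is an assumption about the merge policy.

```python
from dataclasses import dataclass


@dataclass
class ScoredDoc:
    """Hypothetical stand-in for a retrieved document (not Haystack's Document class)."""
    doc_id: str
    content: str
    score: float


def merge_by_score(results_per_query, top_k=None):
    """Deduplicate documents across queries, keeping each document's best score."""
    best = {}
    for docs in results_per_query:
        for doc in docs:
            current = best.get(doc.doc_id)
            if current is None or doc.score > current.score:
                best[doc.doc_id] = doc
    # Highest-scoring documents first.
    merged = sorted(best.values(), key=lambda d: d.score, reverse=True)
    return merged if top_k is None else merged[:top_k]


# Results for two hypothetical query variants; scores are made up for illustration.
per_query = [
    [ScoredDoc("a", "Renewable energy comes from natural sources.", 0.9)],
    [
        ScoredDoc("b", "Hydropower generates electricity using flowing water.", 0.7),
        ScoredDoc("a", "Renewable energy comes from natural sources.", 0.4),
    ],
]
merged = merge_by_score(per_query)
print([d.doc_id for d in merged])
```

Document "a" is returned by both variants but appears only once in the merged list, which is why a broader set of queries does not have to produce duplicate results.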

⬆️ Upgrade Notes

Updated the default Azure OpenAI model from gpt-4o-mini to gpt-4.1-mini and the default API version from 2023-05-15 to 2024-12-01-preview for both AzureOpenAIGenerator and AzureOpenAIChatGenerator.
The default OpenAI model has been changed from gpt-4o-mini to gpt-5-mini for OpenAIChatGenerator and OpenAIGenerator. If you rely on the default model and need to continue using gpt-4o-mini, explicitly specify it when initializing these components: OpenAIChatGenerator(model="gpt-4o-mini").

🚀 New Features

Added three new components: QueryExpander, MultiQueryEmbeddingRetriever, and MultiQueryTextRetriever. When used together, they expand a query and use each expansion to retrieve a potentially different set of documents.

⚡️ Enhancement Notes

Added a return_empty_on_no_match parameter (default True) to RegexTextExtractor.__init__(). When set to False, the component returns {"captured_text": ""} instead of {} when no regex match is found. This provides a consistent output structure for pipeline integration.
The FilterRetriever and AutoMergingRetriever components now support asynchronous execution.
Previously, when using tracing with objects like ByteStream and ImageContent, the payload sent to the tracing backend could become too large, hitting provider limits or causing performance degradation. We now replace these objects with string placeholders to avoid oversized payloads.
The default OpenAI model for OpenAIChatGenerator and OpenAIGenerator has been updated from gpt-4o-mini to gpt-5-mini.
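The effect of return_empty_on_no_match on the output shape can be illustrated with a small standard-library sketch. The function `extract_first_group` below is a hypothetical stand-in, not the RegexTextExtractor API; it only mirrors the two no-match outputs described in the first note of this section.

```python
import re


def extract_first_group(pattern, text, return_empty_on_no_match=True):
    """Hypothetical stand-in mirroring the two output shapes described above."""
    match = re.search(pattern, text)
    if match:
        return {"captured_text": match.group(1)}
    # No match: either an empty dict (the default) or a dict with an empty string.
    return {} if return_empty_on_no_match else {"captured_text": ""}


pattern = r"<answer>(.*?)</answer>"
found = extract_first_group(pattern, "<answer>42</answer>")
default_miss = extract_first_group(pattern, "no tags here")
stable_miss = extract_first_group(pattern, "no tags here", return_empty_on_no_match=False)
print(found, default_miss, stable_miss)
```

With the flag set to False, downstream code can always read the "captured_text" key without first checking whether it exists.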

🐛 Bug Fixes

Ensure request header keys are unique in link_content to prevent 400 Bad Request errors.

Some image providers return a 400 Bad Request when using ImageContent.from_url() because the User-Agent header appears multiple times with different casing (e.g., user-agent, User-Agent). This update normalizes header keys in a case-insensitive way, removes duplicates, and preserves only the last occurrence.
Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.
Fixed the serialization and deserialization of pipeline_outputs in pipeline_snapshot so that it uses the same schema as the rest of the pipeline state when running pipelines with breakpoints. Deserialization of the older pipeline_outputs format, which has no serialization schema, is supported until Haystack 2.23.0.
Fixed ToolInvoker missing tools after warmup for lazy-initialized toolsets. The invoker now refreshes its tool registry post-warmup, ensuring replaced placeholders (e.g., MCPToolset with eager_connect=False) resolve to the actual tool names at invocation time.
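The header fix in the first item of this list (case-insensitive normalization, with the last occurrence preserved) can be sketched in a few lines of plain Python. `normalize_headers` is a hypothetical helper written for illustration, not the code used in link_content, and the header values are made up.

```python
def normalize_headers(pairs):
    """Hypothetical helper: merge header keys case-insensitively, last occurrence wins."""
    latest = {}
    for key, value in pairs:
        lowered = key.lower()
        # Drop any earlier entry so the surviving header keeps the last casing and value.
        latest.pop(lowered, None)
        latest[lowered] = (key, value)
    return dict(latest.values())


raw = [
    ("user-agent", "default-client"),
    ("Accept", "image/*"),
    ("User-Agent", "my-app/1.0"),
]
headers = normalize_headers(raw)
print(headers)
```

Only one User-Agent entry survives, so a server that rejects repeated headers no longer sees a duplicate.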

💙 Big thank you to everyone who contributed to this release!

@Amnah199, @anakin87, @davidsbatista, @dfokina, @mrchtr, @OscarPindaro, @schwartzadev, @sjrl, @TaMaN2031A, @vblagoje, @YassineGabsi, @ZeJ0hn
