v1.17 release blog #2138
Merged
12 commits
- 76479a1 v1.17 release blog (abdonpijpelink)
- e1585c7 Trigger Build (abdonpijpelink)
- 615f634 Use 'Search Latency Improvements' instead of 'Faster Search' (abdonpijpelink)
- 7477a6f Add images (abdonpijpelink)
- f5e680b Add Web UI update (abdonpijpelink)
- f46f6f0 Add audit logging to the list of honorable mentions (abdonpijpelink)
- 4395767 Review feedback (abdonpijpelink)
- f4d8d4f Review feedback (abdonpijpelink)
- b59dc21 Finalize relevance feedback section (abdonpijpelink)
- 0437868 Update publication date (abdonpijpelink)
- d62e7f9 Remove link to relevance feeback article (abdonpijpelink)
- e67a302 Revert "Remove link to relevance feeback article" (mrscoopers)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| --- | ||
| title: "Qdrant 1.17 - Relevance Feedback & Search Latency Improvements" | ||
| draft: false | ||
| slug: qdrant-1.17.x | ||
| short_description: "Version 1.17 of Qdrant features a new Relevance Feedback Query and search latency improvements." | ||
| description: "Version 1.17 of Qdrant features a new Relevance Feedback Query, search latency improvements, and better operational observability." | ||
| preview_image: /blog/qdrant-1.17.x/social_preview.jpg | ||
| social_preview_image: /blog/qdrant-1.17.x/social_preview.jpg | ||
| date: 2026-02-20T00:00:00-08:00 | ||
| author: Abdon Pijpelink | ||
| featured: true | ||
| tags: | ||
| - vector search | ||
| - relevance feedback | ||
| - search performance | ||
| - observability | ||
| --- | ||
|
|
||
| [**Qdrant 1.17.0 is out!**](https://github.com/qdrant/qdrant/releases/tag/v1.17.0) Let’s look at the main features for this version: | ||
|
|
||
| **Relevance Feedback Query:** Improve the quality of search results by incorporating information about their relevance. | ||
|
|
||
| **Search Latency Improvements:** Manage search latency with new tools, such as an update queue and delayed fan-outs, as well as many internal search performance improvements. | ||
|
|
||
| **Greater Operational Observability:** Better insights into operational metrics and faster troubleshooting with a new cluster-wide telemetry API and segment optimization monitoring. | ||
|
|
||
| ## Relevance Feedback Query | ||
|
|
||
|  | ||
|
|
||
| Crafting queries is hard: users often struggle to precisely formulate search queries. At the same time, judging the relevance of a given search result is often much easier. Retrieval systems can leverage this [relevance feedback](/articles/search-feedback-loop/) to iteratively refine results toward user intent. | ||
|
|
||
| This release introduces a new [Relevance Feedback Query](/documentation/concepts/search-relevance/#relevance-feedback) as a scalable, vector‑native approach to incorporating relevance feedback. The Relevance Feedback Query uses a small amount of model‑generated feedback to guide the retriever through the entire vector space, effectively nudging search toward “more relevant” results without requiring expensive loops, expensive retrievers, or human labeling. This enables the engine to traverse billions of vectors with improved recall without having to retrain models. | ||
|
|
||
| This method works by collecting lightweight feedback on just a few top results, creating “context pairs” of more‑ and less‑relevant examples. These pairs define a signal that adjusts the scoring function during the next retrieval pass. Instead of rewriting queries or rescoring large batches of documents, Qdrant modifies how similarity is computed. Experiments demonstrate substantial gains, especially when pairing expressive retrievers with strong feedback models. For the methodology and experiments behind this feature, see our article [Relevance Feedback in Qdrant](/articles/relevance-feedback). To get started, refer to the [documentation](/documentation/concepts/search-relevance/#relevance-feedback). | ||
|
|
||
| <figure> | ||
| <img src="/blog/qdrant-1.17.x/relevance-feedback-overview.png"> | ||
| <figcaption> | ||
| Feedback-based scoring combines a candidate’s similarity to a query with its relative distances (delta) to the positive and negative items in context pairs. | ||
| </figcaption> | ||
| </figure> | ||
|
|
||
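The idea in the caption above can be illustrated with a toy scoring function. This is only a sketch: the post does not give the formula Qdrant uses, so the additive form, the `weight` parameter, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def feedback_score(query, candidate, context_pairs, weight=0.5):
    """Toy feedback-adjusted score (hypothetical formula, not Qdrant's).

    context_pairs is a list of (more_relevant, less_relevant) vectors.
    The delta term rewards candidates that sit closer to the positive
    item than to the negative item of each pair.
    """
    base = cosine(query, candidate)
    delta = sum(
        cosine(candidate, positive) - cosine(candidate, negative)
        for positive, negative in context_pairs
    )
    return base + weight * delta
```

With no context pairs the score reduces to plain similarity; each pair then shifts candidates toward the positive example and away from the negative one.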
| ## Search Latency Improvements | ||
|
|
||
|  | ||
|
|
||
| This release includes several changes that reduce search latency. To improve query response times in environments with high write loads, Qdrant can now be configured to avoid creating large unoptimized segments. Additionally, delayed fan-outs help reduce tail latency by querying a second replica if the first does not respond within a configurable latency threshold. | ||
|
|
||
| ### Search Latency Under Write Load | ||
|
|
||
| A common pattern with vector search engines like Qdrant involves bulk uploads, for example periodically refreshing data from an external source of truth using nightly batch updates. Newly ingested data needs to be indexed, which is a resource-intensive operation. When the data ingestion rate exceeds the indexing rate, this can lead to issues such as: | ||
|
|
||
| - Back-pressure and rejected update operations due to a full update queue. | ||
| - Slow queries over data that has not yet been indexed. | ||
|
|
||
| This release addresses these issues by changing how data is ingested. Shards still process data through the familiar stages: WAL persistence, queued updates, application to unoptimized segments, and eventual full indexing, but two new features reshape how systems behave under heavy write load. | ||
|
|
||
| A new [update queue](/documentation/guides/low-latency-search/#query-indexed-data-only) tracks up to one million pending changes. When the queue fills, back pressure slows incoming writes, preventing runaway load and helping clusters stay stable even during large batch operations or recovery after downtime. | ||
|
|
||
| For applications that demand consistently low-latency search, indexed‑only mode ensures queries touch only fully indexed segments. A side-effect of using indexed-only queries was that they could temporarily hide the newest updates, before they were indexed. A new [`prevent_unoptimized` optimizer setting](/documentation/guides/low-latency-search/#query-indexed-data-only) solves this by throttling updates to match the indexing rate, reducing the creation of large unoptimized segments. | ||
|
|
||
| Together, these features give developers tighter control over write throughput, indexing behavior, and search performance, especially in high‑volume environments. | ||
|
|
||
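The general back-pressure pattern behind a bounded update queue can be sketched with Python's standard library. This is a conceptual model only, not Qdrant's implementation: `run_ingest`, `capacity`, and `index_delay` are hypothetical names, and a tiny capacity stands in for the one-million-entry queue described above.

```python
import queue
import threading
import time


def run_ingest(num_updates, capacity, index_delay):
    """Push updates through a bounded queue drained by a slow indexer.

    Returns the updates in the order they were applied and the largest
    queue depth the writer observed.
    """
    pending = queue.Queue(maxsize=capacity)
    applied = []
    max_depth = 0

    def indexer():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more updates
                break
            time.sleep(index_delay)  # simulate indexing cost
            applied.append(item)

    worker = threading.Thread(target=indexer)
    worker.start()
    for update in range(num_updates):
        # put() blocks while the queue is full, so the writer is slowed
        # down to the indexer's pace instead of being rejected.
        pending.put(update)
        max_depth = max(max_depth, pending.qsize())
    pending.put(None)
    worker.join()
    return applied, max_depth
```

Calling `run_ingest(20, 4, 0.001)` applies all twenty updates in order while the queue never holds more than four pending items.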
| ### Reduced Tail Latency with Delayed Fan-Outs | ||
|
|
||
| By default, a search operation queries a single replica of each shard within a collection. If one of these replicas responds slowly due to load or network issues, this negatively impacts the overall search latency. This phenomenon, where a single slow replica increases the 95th or 99th percentile latency of the entire system, is known as “tail latency.” High tail latency can noticeably degrade the user experience. | ||
|
|
||
| To mitigate tail latency for read operations, this release introduces a new [delayed fan-out](/documentation/guides/low-latency-search/#use-delayed-fan-outs) feature. With delayed fan-outs, if the initial request to a replica exceeds a configurable latency threshold, an additional read request is sent to another replica, and Qdrant will use the first available response. Delayed fan-outs help your application provide a consistent, low latency experience to end-users. | ||
|
|
||
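The request pattern described above can be sketched with `asyncio`. This is a client-side illustration of the technique under stated assumptions, not Qdrant's internal code: `delayed_fan_out`, `replica`, and the delays are hypothetical, and error handling is left out.

```python
import asyncio


async def delayed_fan_out(primary, secondary, threshold):
    """Query `primary`; if it has not answered within `threshold`
    seconds, also query `secondary` and return whichever finishes first.

    `primary` and `secondary` are zero-argument coroutine functions.
    """
    first = asyncio.create_task(primary())
    done, _ = await asyncio.wait({first}, timeout=threshold)
    if done:
        return first.result()

    # Threshold exceeded: fan out to a second replica.
    second = asyncio.create_task(secondary())
    done, pending = await asyncio.wait(
        {first, second}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the slower request is no longer needed
    return next(iter(done)).result()


async def replica(name, delay):
    """Stand-in for a replica that answers after `delay` seconds."""
    await asyncio.sleep(delay)
    return name


def demo():
    async def main():
        return await delayed_fan_out(
            lambda: replica("replica-a", 0.2),   # slow replica
            lambda: replica("replica-b", 0.01),  # fast replica
            threshold=0.05,
        )

    return asyncio.run(main())
```

In `demo()`, the slow replica misses the 50 ms threshold, so the second request is sent and its answer is returned. When the first replica answers in time, no second request is made.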
| ## Greater Operational Observability | ||
|
|
||
|  | ||
|
|
||
| We are continuously working to enhance the operational observability of Qdrant clusters. In this release, we introduce two new features: a new cluster-wide telemetry API and segment optimization monitoring. | ||
|
|
||
| ### Cluster-Wide Telemetry | ||
|
|
||
| Qdrant’s API exposes a `/telemetry` endpoint that provides information about the current state of a peer in a cluster, including the number of vectors, shards, and other useful information. However, this endpoint does not give a complete view of the entire cluster: you have to query each peer and piece the results together yourself. | ||
|
|
||
| In version 1.17, we’re introducing a new [`/cluster/telemetry` endpoint](/documentation/guides/monitoring/#cluster-wide-telemetry). This API provides information about all peers in a cluster, offering insights into cluster-wide operations such as leader elections, resharding, and shard transfers. | ||
|
|
||
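As a rough sketch, the endpoint could be called with Python's standard library as follows. The base URL `http://localhost:6333` and the `api-key` header are assumptions about a typical deployment, `build_request` and `fetch_json` are hypothetical helpers, and the response structure is not described here, so the example does not inspect it.

```python
import json
import urllib.request


def build_request(base_url, path, api_key=None):
    """Build a GET request for a Qdrant REST endpoint (assumed layout)."""
    headers = {"Accept": "application/json"}
    if api_key:
        headers["api-key"] = api_key  # assumption: API-key authentication
    return urllib.request.Request(
        base_url.rstrip("/") + path, headers=headers, method="GET"
    )


def fetch_json(request, timeout=10.0):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Assumed local deployment; adjust the base URL for your cluster.
cluster_request = build_request("http://localhost:6333", "/cluster/telemetry")
```

Against a running cluster, `fetch_json(cluster_request)` would return the decoded telemetry document.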
| ### Segment Optimization Monitoring | ||
|
|
||
| Optimization is a background process where Qdrant removes data marked for deletion, merges segments, and creates indexes. To improve visibility into this process, this release introduces [segment optimization monitoring capabilities](/documentation/concepts/optimizer/#optimization-monitoring). | ||
|
|
||
| A new `/collections/{collection_name}/optimizations` API endpoint provides cluster-wide information about the current optimization status, as well as detailed information for current and past optimization operations. Because the output of the API can be verbose, we’ve added a new Optimizations tab to the Collections interface in the Web UI that makes it easier to analyze the data. Here, you can find an overview of the current optimization status, a timeline of current and past optimization operations, and a breakdown of the tasks in a specific cycle and their durations. | ||
|
|
||
| <figure> | ||
| <img width="75%" src="/blog/qdrant-1.17.x/optimizer-web-ui.png"> | ||
| <figcaption> | ||
| The new user interface in the Web UI provides an overview of the current cluster-wide optimization status and a timeline of current and past optimization cycles. | ||
| </figcaption> | ||
| </figure> | ||
|
Review comment on lines 89 to 94 (Member): This view is cluster-wide now, and not just for a single node. I think it's worth mentioning. |
||
|
|
||
| ## Redesigned Web UI Point Search | ||
timvisee marked this conversation as resolved.
|
||
|
|
||
|  | ||
|
|
||
| [Web UI](/documentation/web-ui/) is Qdrant’s user interface for managing deployments and collections. It enables you to create and manage collections, run API calls, import sample datasets, and learn about Qdrant's API through interactive tutorials. | ||
|
|
||
| Many people have been asking about point filtering in the Web UI, and now it's back, better than ever. In this release, we have redesigned the point search interface in the Web UI to make exploring your data and discovering relevant points easier and more intuitive. The new two-field layout enables searching for points similar to another point, filtering by payload values, and finding points by ID. | ||
|
|
||
| <figure> | ||
| <img src="/blog/qdrant-1.17.x/web-ui-search.png"> | ||
| <figcaption> | ||
| The redesigned point search interface in the Web UI provides a way to find points similar to another point and filter on payload values. | ||
| </figcaption> | ||
| </figure> | ||
|
|
||
| ## Honorable Mentions | ||
|
|
||
|  | ||
|
|
||
| As an open source project, we welcome contributions from the Qdrant community. This release features two contributions from community members: | ||
|
|
||
| - Not all payload field indexes are used in combination with dense vector queries. With this release, you can [specify whether individual payload field indexes should be reflected in the HNSW index](/documentation/concepts/indexing/#disable-the-creation-of-extra-edges-for-payload-fields). | ||
| - A new API endpoint is available to [list all user-defined shard keys](/documentation/guides/distributed_deployment/#user-defined-sharding). | ||
|
|
||
| Additionally, this release adds the following features: | ||
|
|
||
| - Upserts now support an [update mode](/documentation/concepts/points/#update-mode) for insert-only or update-only operations. | ||
| - To speed up the recovery of the replicas after they’ve been down, shards will [increase the size of their write-ahead log](https://github.com/qdrant/qdrant/pull/7834) when they detect that one of their remote replicas is unavailable. | ||
| - Reciprocal Rank Fusion (RRF) combines multiple query results into one list, but its default equal weighting can let weaker rankers dilute stronger ones. [Weighted RRF](/documentation/concepts/hybrid-queries/#reciprocal-rank-fusion-rrf) in Qdrant 1.17 addresses this by letting you assign weights to individual queries. | ||
| - A new [user interface in the Web UI enables resharding collections](https://github.com/qdrant/qdrant-web-ui/pull/341) on Qdrant Cloud. | ||
| - Qdrant now supports [audit logging](/documentation/guides/security/#audit-logging) to track all API operations that require authentication or authorization. | ||
| - [External provider API keys for inference requests](/documentation/concepts/inference/#external-embedding-model-providers) can now be provided in the request header. | ||
|
|
||
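To make the weighted RRF item in the list above concrete, here is a minimal sketch of the general technique. It is not Qdrant's implementation: the function name, the constant `k=60` (a conventional choice for RRF, not necessarily Qdrant's default), and the tie-breaking rule are assumptions.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of point IDs with per-list weights.

    Each list contributes weight / (k + rank) to a point's score,
    where rank starts at 1 for the top result.
    """
    if len(rankings) != len(weights):
        raise ValueError("one weight is required per ranking")
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] += weight / (k + rank)
    # Highest score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda pid: (-scores[pid], str(pid)))


dense = ["a", "b", "c"]
sparse = ["c", "b", "a"]
favor_dense = weighted_rrf([dense, sparse], weights=[3.0, 1.0])
favor_sparse = weighted_rrf([dense, sparse], weights=[1.0, 3.0])
```

With equal weights the two opposing rankings cancel out; weighting one list more heavily lets its ordering dominate the fused result.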
| For a full list of all changes in version 1.17, please refer to the [change log](https://github.com/qdrant/qdrant/releases/tag/v1.17.0). | ||
|
|
||
| ## Upgrading to Version 1.17 | ||
|
|
||
|  | ||
|
|
||
| On Qdrant Cloud, navigate to the Cluster Details screen and select Version 1.17 from the dropdown menu. The upgrade process may take a few moments. | ||
|
|
||
| We recommend upgrading versions one by one. On Qdrant Cloud, this is done automatically when you select the target version. If you are self-hosting, upgrade to the latest patch version of each intermediate minor version first, for example 1.15.5->1.16.3->1.17.0. | ||
|
|
||
| ## Engage | ||
|
|
||
|  | ||
|
|
||
| We would love to hear your thoughts on this release. If you have any questions or feedback, join our [Discord](https://discord.gg/qdrant) or create an issue on [GitHub](https://github.com/qdrant/qdrant/issues). | ||
Binary file added
BIN
+106 KB
qdrant-landing/static/blog/qdrant-1.17.x/relevance-feedback-overview.png