PoolingCuVSResourceManager with memory availability #133242

ldematte · 2025-08-20T16:29:26Z

This PR expands #132670 to account for GPU memory availability: a requesting thread can obtain a resource only if enough physical GPU resources are available (in this first iteration, only memory is considered). Otherwise, the requesting thread blocks; it is signalled to re-check the conditions when memory is freed, that is, when another thread releases a resource.

Depends on rapidsai/cuvs#1267 (which is now merged into branch-25.10)
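
The blocking behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `MemoryAwarePool`, its integer resource ids, the byte-count bookkeeping, and the up-front rejection of oversized requests are all assumptions made for the example. It uses only `java.util.concurrent.locks` from the JDK, and it waits uninterruptibly just to keep the sketch short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a pool that also gates on available memory. */
final class MemoryAwarePool {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private final Deque<Integer> idle = new ArrayDeque<>(); // ids of idle pooled resources
    private final long[] reserved;                          // bytes reserved per resource id
    private final long totalBytes;
    private long freeBytes;

    MemoryAwarePool(int poolSize, long totalBytes) {
        this.reserved = new long[poolSize];
        this.totalBytes = totalBytes;
        this.freeBytes = totalBytes;
        for (int i = 0; i < poolSize; i++) {
            idle.push(i);
        }
    }

    /** Blocks until a resource is idle AND requiredBytes fit into free memory. */
    int acquire(long requiredBytes) {
        if (requiredBytes > totalBytes) {
            // Assumption of this sketch: a request that can never fit is rejected
            // immediately instead of waiting forever.
            throw new IllegalArgumentException("request exceeds total memory");
        }
        lock.lock();
        try {
            // A loop, not an "if": every wake-up re-checks both conditions
            // while holding the lock.
            while (idle.isEmpty() || freeBytes < requiredBytes) {
                changed.awaitUninterruptibly();
            }
            int id = idle.pop();
            reserved[id] = requiredBytes;
            freeBytes -= requiredBytes;
            return id;
        } finally {
            lock.unlock();
        }
    }

    /** Returns the resource and its memory, then wakes every waiter to re-check. */
    void release(int id) {
        lock.lock();
        try {
            freeBytes += reserved[id];
            reserved[id] = 0L;
            idle.push(id);
            // signalAll rather than signal: waiters need different amounts of
            // memory, so a single woken thread may not be one that can proceed.
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    long freeBytes() {
        lock.lock();
        try {
            return freeBytes;
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch, a thread calling `acquire(60)` on a pool with 40 free bytes parks until some other thread calls `release`, at which point it wakes, re-evaluates both conditions, and either proceeds or waits again.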

elasticsearchmachine · 2025-08-27T07:55:05Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

x-pack/plugin/gpu/src/main/java/org/elasticsearch/xpack/gpu/codec/CuVSResourceManager.java

ChrisHegarty

LGTM


ldematte added 2 commits August 20, 2025 18:27

PoolingCuVSResourceManager with memory availability

2b0cfa5

Add tests + fixes

204405b

ldematte requested a review from ChrisHegarty August 21, 2025 10:33

ldematte added the >non-issue and :Search Relevance/Vectors (Vector search) labels Aug 21, 2025

ldematte added 3 commits August 21, 2025 15:55

Fix: re-acquire res before re-evaluating condition(s)

2cf0388

signalAll

8608731

Short circuit to avoid livelock + spotless

d32a466

ldematte mentioned this pull request Aug 22, 2025

PoolingCuVSResourceManager with GPU utilization #133390

Draft

ldematte marked this pull request as ready for review August 27, 2025 07:54

elasticsearchmachine added the Team:Search Relevance label (Meta label for the Search Relevance team in Elasticsearch) Aug 27, 2025

ldematte commented Aug 27, 2025

x-pack/plugin/gpu/src/main/java/org/elasticsearch/xpack/gpu/codec/CuVSResourceManager.java

ldematte added 2 commits August 27, 2025 17:30

Add dataType to acquire + logging

147c8fe

Fix signature for latest cuvs-java

2fcf997

ldematte added the test-gpu label (Run tests using a GPU) Sep 2, 2025

ChrisHegarty approved these changes Sep 3, 2025

Merge remote-tracking branch 'upstream/es-gpu' into resource-manager-…

f986aa6

…with-memory

ldematte merged commit 83aa729 into elastic:es-gpu Sep 3, 2025
11 of 62 checks passed

ldematte deleted the resource-manager-with-memory branch September 3, 2025 09:28
