[9.2][GPU] Copy to host in case of small matrices to release resources early (#136464) #136532

ldematte · 2025-10-14T08:09:48Z

Backport of #136464

…ly (elastic#136464)

This PR makes a small change to improve parallelism during graph build, addressing an issue we noticed with NVIDIA in profiler traces. If the resulting graph is "small enough" (the threshold is currently set to 128 MB), we copy the graph entirely to host memory, release the cuVS resources, and proceed. The alternative is to download the data from the device in pages and write each page to disk, which is more efficient but holds the resources until we have finished writing to disk; on a busy system this can take a while.
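The trade-off described above can be illustrated with a minimal sketch. This is not the code from the PR: `DeviceGraph`, `InMemoryGraph`, `flush`, and the `diskWriter` callback are hypothetical names invented for illustration, and treating "small enough" as "less than or equal to the threshold" is an assumption. Only the 128 MB figure comes from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class GraphFlushSketch {
    // 128 MB, the "small enough" threshold quoted in the PR description.
    static final long SMALL_GRAPH_THRESHOLD_BYTES = 128L * 1024 * 1024;

    /** Hypothetical stand-in for a graph in device memory and the GPU resources backing it. */
    interface DeviceGraph {
        long sizeInBytes();
        byte[] copyToHost();                        // copy the whole graph to host memory
        byte[] readPage(long offset, int length);   // copy a single page to host memory
        void release();                             // free the GPU resources
    }

    /** diskWriter is a hypothetical callback standing in for the (possibly slow) disk write. */
    static void flush(DeviceGraph graph, Consumer<byte[]> diskWriter, long thresholdBytes, int pageSize) {
        long size = graph.sizeInBytes();
        if (size <= thresholdBytes) {   // assumption: the threshold is inclusive
            // Small graph: one bulk copy, release early, then write without holding the GPU.
            byte[] hostCopy;
            try {
                hostCopy = graph.copyToHost();
            } finally {
                graph.release();
            }
            diskWriter.accept(hostCopy);
        } else {
            // Large graph: page by page, resources stay held until the last write completes.
            try {
                for (long offset = 0; offset < size; offset += pageSize) {
                    int length = (int) Math.min(pageSize, size - offset);
                    diskWriter.accept(graph.readPage(offset, length));
                }
            } finally {
                graph.release();
            }
        }
    }

    /** In-memory fake that records the order of calls, for demonstration only. */
    static final class InMemoryGraph implements DeviceGraph {
        private final byte[] data;
        private final List<String> events;

        InMemoryGraph(byte[] data, List<String> events) {
            this.data = data;
            this.events = events;
        }

        public long sizeInBytes() { return data.length; }
        public byte[] copyToHost() { events.add("copyToHost"); return data.clone(); }
        public byte[] readPage(long offset, int length) {
            events.add("readPage");
            return Arrays.copyOfRange(data, (int) offset, (int) offset + length);
        }
        public void release() { events.add("release"); }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        // Tiny threshold and page size so both paths can be exercised with a 10-byte "graph".
        flush(new InMemoryGraph(new byte[10], events), page -> events.add("write"), 16, 4);
        System.out.println(events);   // small path: release happens before the write
        events.clear();
        flush(new InMemoryGraph(new byte[10], events), page -> events.add("write"), 8, 4);
        System.out.println(events);   // paged path: release happens after the last write
    }
}
```

In the small-graph path the fake records `release` before `write`, which is the point of the change: another graph build can acquire the GPU while the disk write is still in progress. The cost is a host buffer as large as the graph, which is presumably why the approach is limited by a size threshold.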

ldematte added the >non-issue, backport, auto-merge-without-approval, :Search Relevance/Vectors, and v9.2.1 labels Oct 14, 2025

elasticsearchmachine merged commit d3f82cb into elastic:9.2 Oct 14, 2025
34 checks passed

ldematte deleted the backport/9.2/gpu/optimize-flush-memory-release branch October 14, 2025 09:10

2 participants
