-
Notifications
You must be signed in to change notification settings - Fork 25.5k
[GPU] Support for performance profiling #136021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
x-pack/plugin/gpu/src/main/java/org/elasticsearch/xpack/gpu/codec/ES92GpuHnswVectorsWriter.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/gpu/src/main/java/org/elasticsearch/xpack/gpu/codec/CuVSResourceManager.java
Outdated
Show resolved
Hide resolved
qa/vector/src/main/java/org/elasticsearch/test/knn/KnnIndexer.java
Outdated
Show resolved
Hide resolved
if (System.getenv("DO_PROFILING") != null) { | ||
jvmArgs '-XX:StartFlightRecording=dumponexit=true,maxsize=250M,filename=knn.jfr,settings=profile.jfc' | ||
} | ||
def asyncProfilerPath = System.getProperty("asyncProfiler.path", null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am uncertain about this part about async profiler, but I trust your expertise on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ldematte Thanks Lorenzo, makes sense. I've left some comments to address, but otherwise it looks good.
…csearch into knn-index-tester-changes
if (asyncProfilerPath != null) { | ||
if (OS.current().equals(OS.MAC)) { | ||
def asyncProfilerAgent = "${asyncProfilerPath}/lib/libasyncProfiler.dylib" | ||
println "Using async-profiler agent ${asyncProfilerAgent}" | ||
jvmArgs "-agentpath:${asyncProfilerAgent}=start,event=cpu,interval=10ms,file=${layout.buildDirectory.asFile.get()}/tmp/elasticsearch-0_%t_%p.jfr" | ||
} else if (OS.current().equals(OS.LINUX)) { | ||
def asyncProfilerAgent = "${asyncProfilerPath}/lib/libasyncProfiler.so" | ||
println "Using async-profiler agent ${asyncProfilerAgent}" | ||
jvmArgs "-agentpath:${asyncProfilerAgent}=start,event=cpu,interval=10ms,wall=50ms,file=${layout.buildDirectory.asFile.get()}/tmp/elasticsearch-0_%t_%p.jfr" | ||
} else { | ||
println "Ignoring 'asyncProfiler.path': not available on ${OS.current()}"; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am cool with this. However, why don't we add wall to MAC as well?
iwc.setMaxBufferedDocs(IndexWriterConfig.DISABLE_AUTO_FLUSH); | ||
iwc.setRAMBufferSizeMB(writerBufferSizeInMb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be careful, we should by default benchmark with ES defaults. Optimizing our benchmarks but not our production code can give a false sense of improvement
} catch (IOException ioe) { | ||
throw new UncheckedIOException(ioe); | ||
} | ||
logger.debug("Index thread times: [{}ms] read, [{}ms] add doc", readTime / 1_000_000, docAddTime / 1_000_000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we are going to worry about this nuance, we should simply adjust this to return the time each thread spent indexing and sum those up in the results
Additionally, we would need to separate the call to:
ConcurrentMergeScheduler cms = (ConcurrentMergeScheduler) iwc.getMergeScheduler();
cms.sync();
and maybe have a "exact index time" vs "overall index time" or something.
In order to better understand the performance characteristics of vector indexing with a GPU, this PR introduces 2 changes: