Commit 5afd7fd
committed
Enhance Flamegraph Documentation and GPU Profiling Scripts
- Added an example flamegraph for Qwen3 LLM inference, highlighting key insights and performance bottlenecks.
- Updated README.md to include detailed explanations of CPU and GPU profiling results, emphasizing the correlation between CPU stacks and GPU kernels.
- Modified gpuperf.py to ensure absolute paths are used for output files, improving reliability across different working directories.
- Enhanced merge_gpu_cpu_trace.py to strip ANSI escape sequences from CPU stack traces, ensuring cleaner output for analysis.
- Introduced a new SVG file for the Qwen3 flamegraph, providing a visual representation of profiling data with interactive features.1 parent ad58376 commit 5afd7fd
File tree
4 files changed
+905
-175
lines changed- src/xpu/flamegraph
4 files changed
+905
-175
lines changed
0 commit comments