
Commit 8a92c8c ("updated readme"), 1 parent f554ca8

File tree: 1 file changed (+8, -1 lines)


README.md

Lines changed: 8 additions & 1 deletion
@@ -1,4 +1,11 @@
-# Modified HeCBench for Roofline Analysis
+# *gpuFLOPBench*: Counting Without Running: Evaluating LLMs’ Reasoning About Code Complexity
+
+This repo is based on the [HeCBench Suite](https://github.com/zjin-lcf/HeCBench); we build, profile, and categorize all of its CUDA codes to create the **gpuFLOPBench** dataset.
+This dataset is designed to test the FLOP-prediction capability of state-of-the-art LLMs: we supply only the source code, compiler arguments, and command-line input arguments, expecting the LLMs to perform constant propagation and predict the number of FLOPs a target CUDA kernel would perform.
+The querying is done with simple zero-shot prompting techniques and tool calls, without any agentic or MCP features.
+This work gives us a baseline understanding of where current SoTA models stand with respect to GPU performance prediction from the perspective of FLOP counts.
+
+## Modified HeCBench for GPU FLOP Performance Prediction using LLMs
 
 We took this version of HeCBench and modified it to build the CUDA and OMP codes and gather their roofline performance data.
 So far, a large portion of the CUDA and OMP codes build without issue. We use CMake because `autohecbench.py` made it difficult to switch out compilers and build options, and many of the individual Makefiles had their own problems, so we consolidated all of the build commands into one `CMakeLists.txt` file for simplicity. We also wanted distinct phases for building and for gathering data, which was not easy with `autohecbench.py`.
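To make the prediction task concrete, here is a hypothetical illustration (not taken from the dataset) of the kind of reasoning the benchmark expects: for a saxpy-style kernel (`y[i] = a * x[i] + y[i]`), the exact FLOP count follows from constant propagation over the launch parameters alone, with no execution needed.

```python
# Minimal sketch of "counting without running", assuming a saxpy-style
# kernel: one multiply and one add per element, so 2 * n FLOPs total.
# The function name and parameters are illustrative, not from the repo.

def predicted_flops(n: int, ops_per_element: int = 2) -> int:
    """Return the FLOP count derived purely by constant propagation:
    each of the n elements incurs ops_per_element floating-point ops."""
    return ops_per_element * n

if __name__ == "__main__":
    # A launch covering n = 1 << 20 elements yields 2 * 2^20 FLOPs.
    n = 1 << 20
    print(predicted_flops(n))  # 2097152
```

An LLM given the kernel source, compiler args, and the command-line value of `n` would be expected to produce this same closed-form count.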
