<h1 align="center">GraphNet: A Large-Scale Computational Graph Dataset for Tensor Compiler Research</h1>

**GraphNet** is a large-scale dataset of deep learning **computation graphs**, built as a standard benchmark for **tensor compiler** optimization. It provides over 2.7K computation graphs extracted from state-of-the-art deep learning models spanning diverse tasks and ML frameworks. With standardized formats and rich metadata, GraphNet enables fair comparison and reproducible evaluation of the general optimization capabilities of tensor compilers, thereby supporting advanced research such as AI for System on compilers (AI for Compiler).

<div align="center">
  <img src="/pics/Eval_result.png" alt="Violin plots of speedup distributions" width="65%">
</div>

Compiler developers can use GraphNet samples to evaluate tensor compilers (e.g., CINN, TorchInductor, TVM) on target tasks. The figure above shows the speedup of two compilers (CINN and TorchInductor) across two tasks (CV and NLP).

## 🧱 Dataset Construction

To guarantee the dataset’s overall quality, reproducibility, and cross-compiler compatibility, we define the following construction **constraints**:

1. Computation graphs must be executable in imperative (eager) mode.
2. Computation graphs and their corresponding Python code must support serialization and deserialization.
3. The full graph can be decomposed into two disjoint subgraphs.
4. Operator names within each computation graph must be statically parseable.
5. If custom operators are used, their implementation code must be fully accessible.

### Graph Extraction & Validation

We provide automated extraction and validation tools for constructing this dataset. For a step-by-step illustration of how GraphNet extracts and constructs a computation graph sample on PyTorch, see the Contribute More Samples section under Quick Start below.

## News

- [2025-10-14] ✨ Our technical report is out: a detailed study of dataset construction and compiler benchmarking, introducing the novel performance metrics Speedup Score S(t) and Error-aware Speedup Score ES(t). [📘 GraphNet: A Large-Scale Computational Graph Dataset for Tensor Compiler Research](./GraphNet_technical_report.pdf)
- [2025-08-20] 🚀 The second round of [open contribution tasks](https://github.com/PaddlePaddle/Paddle/issues/74773) was released. (completed ✅)
- [2025-07-30] 🚀 The first round of [open contribution tasks](https://github.com/PaddlePaddle/GraphNet/issues/44) was released. (completed ✅)

## Benchmark Results

We evaluate two representative tensor compiler backends, CINN (PaddlePaddle) and TorchInductor (PyTorch), on GraphNet's NLP and CV subsets. The evaluation adopts two quantitative metrics proposed in the [GraphNet Technical Report](./GraphNet_technical_report.pdf):

- **Speedup Score** S(t) — measures the overall speedup a compiler achieves on the benchmarked samples.
- **Error-aware Speedup Score** ES(t) — further accounts for runtime and compilation errors.

## Quick Start

This section shows how to evaluate tensor compilers and reproduce benchmark results (for compiler users and developers), as well as how to contribute new computation graphs (for GraphNet contributors).

### ⚖️ Compiler Evaluation

**Step 1: Benchmark**

Use `graph_net.torch.test_compiler` to benchmark GraphNet samples with specific batch and logging configurations:

```bash
# Set your benchmark directory
export GRAPH_NET_BENCHMARK_PATH=...

# Run the benchmark (remaining arguments omitted here)
python -m graph_net.torch.test_compiler ...
```

After executing, `graph_net.torch.test_compiler` will write per-sample logs (runtime, correctness, and failure information) under the benchmark directory.

**Step 2: Generate JSON Record**

Extract runtime, correctness, and failure information from the benchmark logs; the results are saved to multiple `model_compiler.json` files via:

```bash
# Convert the benchmark logs into JSON records (remaining arguments omitted here)
python -m graph_net.log2json \
    ...
```

**Step 3: Analysis**

Use `graph_net.violin_analysis` to generate [violin plots](https://en.m.wikipedia.org/wiki/Violin_plot) and `graph_net.S_analysis` to generate S(t) and ES(t) plots from the JSON results.
The scripts are designed to process a file structure of the form `/benchmark_path/category_name/`, with the items on the x-axis identified by the names of the sub-directories. After executing, summary plots of the results per category (model tasks, libraries, ...) will be exported to `$GRAPH_NET_BENCHMARK_PATH`.
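
The following is only a rough sketch of how the two scripts might be pointed at the benchmark directory; the exact command-line arguments are assumptions, so check the scripts themselves for the real interface:

```bash
# Expected layout (one sub-directory per category, holding the JSON records):
#   $GRAPH_NET_BENCHMARK_PATH/nlp/*.json
#   $GRAPH_NET_BENCHMARK_PATH/cv/*.json
# Hypothetical invocations -- argument names/positions may differ in the actual tools.
python -m graph_net.violin_analysis "$GRAPH_NET_BENCHMARK_PATH"
python -m graph_net.S_analysis "$GRAPH_NET_BENCHMARK_PATH"
```
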
### 🧱 Contribute More Samples

GraphNet provides automated tools for graph extraction and validation.

* Source code of a custom_op is required **only when** the corresponding operator is used in the module, and **no specific format** is required.

**Step 1: graph_net.torch.extract**

Wrapping the model with `graph_net.torch.extract(name=model_name, dynamic=dynamic_mode)` is all you need:

```python
import graph_net

# Instantiate the model (e.g. a torchvision model)
model = ...

# Extract your own model
model = graph_net.torch.extract(name="model_name", dynamic="True")(model)
```

After running, the extracted graph will be saved to `$GRAPH_NET_EXTRACT_WORKSPACE/model_name/`.

For more details, see the docstring of `graph_net.torch.extract` defined in `graph_net/torch/extractor.py`.
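
For instance, here is a minimal end-to-end sketch, assuming `torchvision` is installed and that running a forward pass on the wrapped model is what triggers the extraction (the model choice and input shape are only illustrative):

```python
import torch
import torchvision
import graph_net

# Illustrative example: extract the computation graph of a torchvision ResNet-18.
model = torchvision.models.resnet18(weights=None)
model = graph_net.torch.extract(name="resnet18", dynamic="True")(model)

# Assumption: a forward pass on the wrapped model produces the extracted graph
# under $GRAPH_NET_EXTRACT_WORKSPACE/resnet18/.
model(torch.randn(1, 3, 224, 224))
```
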
**Step 2: graph_net.torch.validate**

To verify that the extracted model meets the requirements, we run `graph_net.torch.validate` in the CI pipeline and also ask contributors to self-check in advance.
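
A hypothetical self-check invocation is sketched below; the exact command-line arguments are an assumption rather than the documented interface, so consult the validator itself before relying on it:

```bash
# Hypothetical invocation -- the argument form is assumed, not taken from the official docs.
python -m graph_net.torch.validate "$GRAPH_NET_EXTRACT_WORKSPACE/model_name"
```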

All the **construction constraints** will be examined automatically. After passing validation, a unique `graph_hash.txt` will be generated and later checked in the CI procedure to avoid redundancy.

## Future Roadmap

1. Scale GraphNet to 10K+ graphs.
2. Further annotate GraphNet samples into more granular sub-categories.

**Vision**: GraphNet aims to lay the foundation for AI for Compiler by enabling **large-scale, systematic evaluation** of tensor compiler optimizations, and providing a **dataset for models to learn** and transfer optimization strategies.

## GraphNet Community

You can join our community via the following group chats. You are welcome to ask any questions about using and building GraphNet.

## License and Acknowledgement

GraphNet is released under the [MIT License](./LICENSE).

If you find this project helpful, please cite:

```bibtex
@article{li2025graphnet,
  title  = {GraphNet: A Large-Scale Computational Graph Dataset for Tensor Compiler Research},
  author = {Xinqi Li and Yiqun Liu and Shan Jiang and Enrong Zheng and Huaijin Zheng and Wenhao Dai and Haodong Deng and Dianhai Yu and Yanjun Ma},
  year   = {2025}
}
```