Commit 162198a

Merge branch 'develop' into add_nlp_samples_3
2 parents 94205f2 + 24567e3

File tree: 238 files changed (+3360, −316 lines)

Some content is hidden: large commits have some content hidden by default.

README.md

Lines changed: 68 additions & 73 deletions
@@ -1,145 +1,143 @@
 # GraphNet ![](https://img.shields.io/badge/version-v0.1-brightgreen) ![](https://img.shields.io/github/issues/PaddlePaddle/GraphNet?label=open%20issues) [![](https://img.shields.io/badge/Contribute%20to%20GraphNet-blue)](https://github.com/PaddlePaddle/GraphNet/issues/98)
 
+**GraphNet** is a large-scale dataset of deep learning **computation graphs**, built as a standard benchmark for **tensor compiler** optimization. It provides 2.7K computation graphs extracted from state-of-the-art deep learning models spanning diverse tasks and ML frameworks. With standardized formats and rich metadata, GraphNet enables fair comparison and reproducible evaluation of the general optimization capabilities of tensor compilers, thereby supporting advanced research such as AI for System on compilers (AI for Compiler).
 
-**GraphNet** is a large-scale dataset of deep learning **computation graphs**, built as a standard benchmark for **tensor compiler** optimization. It provides 2.7K computation graphs extracted from state-of-the-art deep learning models spanning diverse tasks and ML frameworks. With standardized formats and rich metadata, GraphNet enables fair comparison, reproducible evaluation, and deeper research into the general optimization capabilities of tensor compilers.
 <br>
 <div align="center">
-<img src="/pics/graphnet_overview.jpg" alt="GraphNet Architecture Overview" width="65%">
+<img src="/pics/Eval_result.png" alt="Violin plots of speedup distributions" width="65%">
 </div>

-With GraphNet, users can:
-1. **Contribute new computation graphs** through the built-in automated extraction and validation pipeline.
-2. **Evaluate tensor compilers** on existing graphs with the integrated compiler evaluation tool, supporting multiple compiler backends.
-3. **Advance research** in tensor compiler optimization using the test data and statistics provided by GraphNet.
-
-
-
-
-**Vision**: We aim to achieve cross-hardware portability of compiler optimizations by allowing models to learn and transfer optimization strategies. It will significantly reduce the manual effort required to develop efficient operator implementations.
-
+Compiler developers can use GraphNet samples to evaluate tensor compilers (e.g., CINN, TorchInductor, TVM) on target tasks. The figure above shows the speedup of two compilers (CINN and TorchInductor) across two tasks (CV and NLP).

-## Dataset Construction
+## 🧱 Dataset Construction
 
 To guarantee the dataset’s overall quality, reproducibility, and cross-compiler compatibility, we define the following construction **constraints**:
 
-1. Dynamic graphs must execute correctly.
-2. Graphs and their corresponding Python code must support serialization and deserialization.
+1. Computation graphs must be executable in imperative (eager) mode.
+2. Computation graphs and their corresponding Python code must support serialization and deserialization.
 3. The full graph can be decomposed into two disjoint subgraphs.
 4. Operator names within each computation graph must be statically parseable.
 5. If custom operators are used, their implementation code must be fully accessible.
 
-
 ### Graph Extraction & Validation
-For full implementation details, please refer to the [Co-Creation Tutorial](https://github.com/PaddlePaddle/GraphNet/blob/develop/CONTRIBUTE_TUTORIAL.md#co-creation-tutorial).
+
+We provide automated extraction and validation tools for constructing this dataset.
+
+<div align="center">
+<img src="/pics/graphnet_overview.jpg" alt="GraphNet Architecture Overview" width="65%">
+</div>

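Constraint 4 above — operator names must be statically parseable — can be illustrated with a small stdlib-only sketch. This is an editorial example, not GraphNet code: it parses a module's source with `ast` and collects the dotted names of every call without executing anything.

```python
import ast

# Hypothetical model source; in GraphNet this would come from a sample's model.py.
SOURCE = '''
class GraphModule:
    def forward(self, x):
        y = torch.relu(x)
        return torch.nn.functional.softmax(y, dim=-1)
'''

def collect_called_names(source):
    """Statically collect dotted call names (candidate operator names)."""
    def dotted(node):
        # Rebuild "torch.nn.functional.softmax" from nested Attribute nodes.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Attribute):
            base = dotted(node.value)
            return f"{base}.{node.attr}" if base else None
        return None

    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted(node.func)
            if name:
                names.append(name)
    return names

print(collect_called_names(SOURCE))
```

Because the analysis is purely syntactic, it works even when `torch` is not importable — which is exactly what makes the constraint checkable at validation time.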
 **Demo: Extract & Validate ResNet‑18**
-```
+```bash
 git clone https://github.com/PaddlePaddle/GraphNet.git
 cd GraphNet
 
 # Set your workspace directory
-export GRAPH_NET_EXTRACT_WORKSPACE=/home/yourname/graphnet_workspace
+export GRAPH_NET_EXTRACT_WORKSPACE=/home/yourname/graphnet_workspace/
 
 # Extract the ResNet‑18 computation graph
 python graph_net/test/vision_model_test.py
 
-# Validate the extracted graph (e.g. /home/yourname/graphnet_workspace/resnet18)
+# Validate the extracted graph (e.g. /home/yourname/graphnet_workspace/resnet18/)
 python -m graph_net.torch.validate \
-    --model-path $GRAPH_NET_EXTRACT_WORKSPACE/resnet18
+    --model-path $GRAPH_NET_EXTRACT_WORKSPACE/resnet18/
 ```

-**graph_net.torch.extract**
+**Illustration: How does GraphNet extract and construct a computation graph sample on PyTorch?**
+
+<div align="center">
+<img src="/pics/graphnet_sample.png" alt="GraphNet Extract Sample" width="65%">
+</div>
+
+* Source code of custom_op is required **only when** the corresponding operator is used in the module, and **no specific format** is required.
+
+**Step 1: graph_net.torch.extract**
 
-```python
+Importing and wrapping the model with `graph_net.torch.extract(name=model_name, dynamic=dynamic_mode)()` is all you need:
+
+```python
 import graph_net
 
 # Instantiate the model (e.g. a torchvision model)
 model = ...
 
 # Extract your own model
-model = graph_net.torch.extract(name="model_name")(model)
-
-# After running, the extracted graph will be saved to:
-# $GRAPH_NET_EXTRACT_WORKSPACE/model_name
+model = graph_net.torch.extract(name="model_name", dynamic=True)(model)
 ```

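The `extract(name=...)(model)` shape is a wrapper-factory pattern. A stdlib-only toy version, as an editorial sketch — the real extractor captures the traced computation graph, which this stand-in does not attempt:

```python
def extract(name, dynamic=False):
    """Toy stand-in for graph_net.torch.extract: wraps a callable model
    and records each invocation under the given sample name."""
    def wrap(model):
        def wrapped(*args, **kwargs):
            out = model(*args, **kwargs)
            wrapped.calls.append((name, args, kwargs))
            return out
        wrapped.calls = []
        return wrapped
    return wrap

# Same call shape as in the README: model = extract(name=...)(model)
model = extract(name="toy_model")(lambda x: x * 2)
print(model(21))         # 42
print(len(model.calls))  # 1
```

The two-step call is what lets the same decorator-style API work both inline (as above) and as `@extract(name=...)` on a model-building function.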
-For details, see docstring of `graph_net.torch.extract` defined in `graph_net/torch/extractor.py`
-
-**graph_net.torch.validate**
-```
-# Verify that the extracted model meets requirements
-python -m graph_net.torch.validate \
-    --model-path $GRAPH_NET_EXTRACT_WORKSPACE/model_name
-```
+After running, the extracted graph will be saved to `$GRAPH_NET_EXTRACT_WORKSPACE/model_name/`.
 
+For more details, see the docstring of `graph_net.torch.extract` defined in `graph_net/torch/extractor.py`.
 
-## Compiler Evaluation
+**Step 2: graph_net.torch.validate**
 
-The compiler evaluation process takes a GraphNet sample as input and involves:
-1. Running the original model in eager mode to record a baseline.
-2. Compiling the model with the specified backend (e.g., CINN, TorchInductor, TVM).
-3. Executing the compiled model and collecting its runtime and outputs.
-4. Analyzing performance by comparing the compiled results against the baseline.
+To verify that the extracted model meets the requirements, we use `graph_net.torch.validate` in the CI pipeline, and also ask contributors to self-check in advance:
 
-### Evaluation Metrics
+```bash
+python -m graph_net.torch.validate \
+    --model-path $GRAPH_NET_EXTRACT_WORKSPACE/model_name
+```

-We define two key metrics here: **rectified speedup** and **GraphNet Score**. Rectified speedup measures runtime performance while incorporating compilation success, time cost, and correctness. GraphNet Score aggregates the rectified speedup of a compiler on specified tasks, providing a measure of its general optimization capability.
+All the **construction constraints** are examined automatically. After passing validation, a unique `graph_hash.txt` is generated and later checked in the CI procedure to avoid redundancy.
 
-**Demo: How to benchmark your compiler on the model:**
+## ⚖️ Compiler Evaluation
 
-1. Benchmark
+**Step 1: Benchmark**
 
-We use ```graph_net/benchmark_demo.sh``` to benchmark GraphNet computation graph samples:
+We use `graph_net/benchmark_demo.sh` to benchmark GraphNet computation graph samples:
 
-```
+```bash
 bash graph_net/benchmark_demo.sh &
 ```

-The script will run ```graph_net.torch.test_compiler``` with specific batch and log configurations.
+The script runs `graph_net.torch.test_compiler` with specific batch and log configurations.
 
-Or you can customize and use ```graph_net.torch.test_compiler``` yourself:
+Or you can customize and use `graph_net.torch.test_compiler` yourself:
 
-```
-python3 -m graph_net.torch.test_compiler \
+```bash
+python -m graph_net.torch.test_compiler \
     --model-path $GRAPH_NET_EXTRACT_WORKSPACE/model_name/ \
-    --compiler /path/to/custom/compiler/ \
+    --compiler /custom/or/builtin/compiler/ \
+    --warmup /times/to/warmup/ \
+    --trials /times/to/test/ \
+    --device /device/to/execute/ \
     --output-dir /path/to/save/JSON/result/file/
+
 # Note: if --compiler is omitted, PyTorch’s built-in compiler is used by default
 ```

-2. Analysis
+After executing, `graph_net.torch.test_compiler` will:
+1. Run the original model in eager mode to record a baseline.
+2. Compile the model with the specified backend (e.g., CINN, TVM, Inductor, TensorRT, XLA, BladeDISC).
+3. Execute the compiled model and collect its runtime and outputs.
+4. Compute the speedup by comparing the compiled results against the baseline.
 
-After processing, we provide ```graph_net/analysis.py``` to generate [violin plot](https://en.m.wikipedia.org/wiki/Violin_plot) based on the JSON results.
+**Step 2: Analysis**
 
-```
-python3 graph_net/analysis.py \
+After processing, we provide `graph_net/analysis.py` to generate [violin plots](https://en.m.wikipedia.org/wiki/Violin_plot) based on the JSON results.
+
+```bash
+python -m graph_net.analysis \
     --benchmark-path /path/to/read/JSON/result/file/ \
     --output-dir /path/to/save/output/figures/
 ```

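The baseline-vs-compiled comparison reduces to timing the same workload twice and taking a ratio. A stdlib-only editorial sketch (hypothetical helpers, not the `test_compiler` implementation) mirroring the `--warmup`/`--trials` flags:

```python
import time

def bench(fn, arg, warmup=2, trials=5):
    """Median wall-clock time of fn(arg), after warmup runs."""
    for _ in range(warmup):
        fn(arg)
    times = []
    for _ in range(trials):
        start = time.perf_counter()
        fn(arg)
        times.append(time.perf_counter() - start)
    return sorted(times)[len(times) // 2]

# Stand-ins: an "eager" implementation and a much faster "compiled" one.
def eager(n):
    return sum(i * i for i in range(n))

def compiled(n):
    return n * (n - 1) * (2 * n - 1) // 6  # closed form of the same sum

assert eager(1000) == compiled(1000)  # correctness check, as in step 4
speedup = bench(eager, 100_000) / bench(compiled, 100_000)
print(f"speedup: {speedup:.1f}x")
```

Using the median rather than the mean makes the ratio robust to one-off scheduler hiccups during the trials.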
-After executing, one summary plot of results on all compilers (as shown below in "Evaluation Results Example"), as well as multiple sub-plots of results in categories (model tasks, Library...) on a single compiler.
-
-The script is designed to process a file structure as ```/benchmark_path/compiler_name/category_name/``` (for example ```/benchmark_logs/paddle/nlp/```), and items on x-axis are identified by name of the folders. So you can modify ```read_all_speedups``` function to fit the benchmark settings on your demand.
+After executing, one summary plot of results on all compilers, as well as multiple sub-plots of results in categories (model tasks, library, ...) on a single compiler, will be exported.
 
-### Evaluation Results Example
-
-<div align="center">
-<img src="/pics/Eval_result.png" alt="Violin plots of rectified speedup distributions" width="65%">
-</div>
+The script expects a file structure of the form `/benchmark_path/compiler_name/category_name/` (for example `/benchmark_logs/paddle/nlp/`); items on the x-axis are identified by folder names, so you can modify the `read_all_speedups` function to fit your own benchmark settings.
 
-
-## Roadmap
+## 📌 Roadmap

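The expected `/benchmark_path/compiler_name/category_name/` layout can be sketched with a hypothetical `read_all_speedups` stand-in — the real function lives in `graph_net/analysis.py` and may differ:

```python
import json
import tempfile
from pathlib import Path

def read_all_speedups(benchmark_path):
    """Group per-model speedups by (compiler, category) folder names."""
    grouped = {}
    for result in Path(benchmark_path).glob("*/*/*.json"):
        compiler = result.parent.parent.name
        category = result.parent.name
        grouped.setdefault((compiler, category), []).append(
            json.loads(result.read_text())["speedup"]
        )
    return grouped

# Build a throwaway /benchmark_logs/paddle/nlp/ style tree and read it back.
with tempfile.TemporaryDirectory() as root:
    sample_dir = Path(root, "paddle", "nlp")
    sample_dir.mkdir(parents=True)
    (sample_dir / "model_a.json").write_text(json.dumps({"speedup": 1.3}))
    print(read_all_speedups(root))  # {('paddle', 'nlp'): [1.3]}
```

Deriving the x-axis labels from folder names (rather than from file contents) is what makes the plot layout configurable purely by how you arrange the result directories.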
 1. Scale GraphNet to 10K+ graphs.
 2. Further annotate GraphNet samples into more granular sub-categories.
 3. Extract samples from multi-GPU scenarios to support benchmarking and optimization for large-scale, distributed computing.
 4. Enable splitting full graphs into independently optimized subgraphs and operator sequences.
 
-## GraphNet Community:
-
+**Vision**: GraphNet aims to lay the foundation for AI for Compiler by enabling **large-scale, systematic evaluation** of tensor compiler optimizations, and by providing a **dataset for models to learn** and transfer optimization strategies.
 
-You can join GraphNet community via the following group chats.
+## 💬 GraphNet Community
 
+You can join our community via the following group chats. Feel free to ask any questions about using and building GraphNet.
 
 <div align="center">
 <table>
@@ -155,8 +153,5 @@ You can join GraphNet community via the following group chats.
 </table>
 </div>

-
-
-## License
+## 🪪 License
 This project is released under the [MIT License](LICENSE).
-

graph_net/paddle/validate.py

Lines changed: 39 additions & 20 deletions
@@ -11,7 +11,7 @@
 import numpy as np
 import graph_net
 import os
-import re
+import ast
 import paddle
 

@@ -29,29 +29,49 @@ def _get_sha_hash(content):
     return m.hexdigest()
 
 
-def _save_to_model_path(dump_dir, hash_text):
-    file_path = f"{dump_dir}/graph_hash.txt"
-    with open(file_path, "w") as f:
-        f.write(hash_text)
+def _extract_forward_source(model_path):
+    source = None
+    with open(f"{model_path}/model.py", "r") as f:
+        source = f.read()
 
+    tree = ast.parse(source)
+    forward_code = None
 
-def extract_from_forward_regex(text, case_sensitive=True):
-    pattern = r"forward.*"
-    flags = 0 if case_sensitive else re.IGNORECASE
+    for node in tree.body:
+        if isinstance(node, ast.ClassDef) and node.name == "GraphModule":
+            for fn in node.body:
+                if isinstance(fn, ast.FunctionDef) and fn.name == "forward":
+                    return ast.unparse(fn)
+    return None
 
-    match = re.search(pattern, text, flags)
-    if match:
-        return match.group(0)
+
+def check_graph_hash(args):
+    model_path = args.model_path
+    file_path = f"{model_path}/graph_hash.txt"
+    if args.dump_graph_hash_key:
+        model_str = _extract_forward_source(model_path)
+        assert model_str is not None, f"model_str of {args.model_path} is None."
+        new_hash_text = _get_sha_hash(model_str)
+
+        old_hash_text = None
+        if os.path.exists(file_path):
+            with open(file_path, "r") as f:
+                old_hash_text = f.read()
+
+        if old_hash_text is None or new_hash_text != old_hash_text:
+            print(f"Writing to {file_path}.")
+            with open(file_path, "w") as f:
+                f.write(new_hash_text)
+        if old_hash_text is not None:
+            assert (
+                new_hash_text == old_hash_text
+            ), f"Hash value for {model_path} is not consistent."
     else:
-        raise ValueError("Erroneous case occurs.")
+        assert os.path.exists(file_path), f"{file_path} does not exist."
 
 
 def main(args):
-    model_path = args.model_path
-    with open(f"{model_path}/model.py", "r") as fp:
-        model_str = fp.read()
-    model_str = extract_from_forward_regex(model_str)
-    _save_to_model_path(model_path, _get_sha_hash(model_str))
+    check_graph_hash(args)
 
     model_path = args.model_path
     model_class = load_class_from_file(
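The substantive change in this hunk is replacing the fragile `forward.*` regex — which matched only to the end of the first line containing "forward" — with an AST walk that recovers the whole method. A self-contained stdlib sketch of the same technique, with the model source inlined rather than read from `model.py`:

```python
import ast

# Hypothetical sample source, in place of reading model.py from disk.
MODEL_SOURCE = '''
class GraphModule:
    def helper(self):
        pass

    def forward(self, x):
        return x + 1
'''

def extract_forward_source(source):
    """Return the normalized source of GraphModule.forward, or None."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef) and node.name == "GraphModule":
            for fn in node.body:
                if isinstance(fn, ast.FunctionDef) and fn.name == "forward":
                    return ast.unparse(fn)  # Python 3.9+
    return None

print(extract_forward_source(MODEL_SOURCE))
# def forward(self, x):
#     return x + 1
```

A side benefit of `ast.unparse` is that it normalizes formatting, so a purely cosmetic reflow of `model.py` does not change the hash written to `graph_hash.txt`.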
@@ -100,17 +120,16 @@ def main(args):
         required=True,
         help="Path to folder e.g '../test_dataset'",
     )
-
     parser.add_argument(
         "--no-check-redundancy",
         action="store_true",
+        default=False,
         help="whether check model graph redundancy",
     )
-
     parser.add_argument(
         "--dump-graph-hash-key",
         action="store_true",
-        default=False,
+        default=True,
         help="Dump graph hash key",
     )
     parser.add_argument(
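One detail worth flagging as an editorial observation: after this change, `--dump-graph-hash-key` combines `action="store_true"` with `default=True`, so the option can never be switched off from the command line — passing the flag and omitting it both yield `True`. A stdlib demonstration, including `argparse.BooleanOptionalAction` (Python 3.9+) as one possible alternative:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dump-graph-hash-key", action="store_true", default=True)

# Whether or not the flag is passed, the value is True.
print(parser.parse_args([]).dump_graph_hash_key)                         # True
print(parser.parse_args(["--dump-graph-hash-key"]).dump_graph_hash_key)  # True

# BooleanOptionalAction generates a --no-... variant that can turn it off.
parser2 = argparse.ArgumentParser()
parser2.add_argument(
    "--dump-graph-hash-key",
    action=argparse.BooleanOptionalAction,
    default=True,
)
print(parser2.parse_args(["--no-dump-graph-hash-key"]).dump_graph_hash_key)  # False
```

If the always-on behavior is intentional (the `else` branch in `check_graph_hash` is still reachable when the attribute is set programmatically), a comment in the source would make that explicit.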
Eight single-line hash files (file names hidden by the viewer), 1 addition & 1 deletion each.

Six files change:
@@ -1 +1 @@
-c1e7e52eab55414cee7c44a9e8c4f81bbd59e3837b185e179e6317efa04f69ec
+f2b5a332b1b19703e7ccfb450de96c9c12244144c7b9d305d20587f772fb6672

Two files change:
@@ -1 +1 @@
-1bad8e4fab570ff456bad864ef45a755f07b2e466cced7983a8383abccc8fc7a
+02fa10efca360c8ba7818c367cdeb9979e2af8c72cf489913396a1f241bbad07

0 commit comments
