
Commit ea0278b

Merge branch 'develop' of https://github.com/PaddlePaddle/GraphNet into correctness

2 parents: f0b51a6 + 04f3c15

File tree: 1,163 files changed (+1,496,675 / -66,158 lines)

.github/actions/check-bypass/action.yml

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 name: "Check bypass"
-description: "A custom action to encapsulate PFCCLab/ci-bypass"
+description: "A custom action to encapsulate GraphNet"
 inputs:
   github-token:
     description: "GitHub token"
@@ -18,7 +18,7 @@ runs:
    - id: check-bypass
      name: Check Bypass
      env:
-       CI_TEAM_MEMBERS: '["SigureMo", "risemeup1", "tianshuo78520a", "0x3878f", "swgu98", "luotao1", "XieYunshen"]'
+       CI_TEAM_MEMBERS: '["lixinqi", "Xreki"]'
      uses: PFCCLab/ci-bypass@v1
      with:
        github-token: ${{ inputs.github-token }}

.github/workflows/Validate-GPU.yml

Lines changed: 2 additions & 2 deletions
@@ -61,7 +61,7 @@ jobs:
           -v "/home/data/cfs/.ccache:/root/.ccache" \
           -v "/dev/shm:/dev/shm" \
           -v ${{ github.workspace }}/../../..:${{ github.workspace }}/../../.. \
-          -v ${{ github.workspace }}:/graphnet \
+          -v ${{ github.workspace }}:${{ github.workspace }} \
           -e python \
           -e core_index \
           -e BRANCH \
@@ -73,7 +73,7 @@ jobs:
           -e CACHE_DIR \
           -e GITHUB_API_TOKEN \
           -e CFS_DIR \
-          -w /graphnet --network host ${docker_image}
+          -w ${{ github.workspace }} --network host ${docker_image}

       - name: Run check
         env:
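The mount change above maps the CI workspace to the same path inside the container instead of the fixed `/graphnet` mount point. A minimal sketch of why that matters, using a hypothetical workspace path and image name (not the real CI values):

```shell
# Hypothetical workspace path; in GitHub Actions this is ${{ github.workspace }}.
WORKSPACE="/home/runner/work/GraphNet/GraphNet"

# Same-path mount: absolute paths recorded on the host (logs, caches,
# the checkout itself) resolve identically inside the container,
# so no path translation is needed between host and container.
cmd="docker run -v ${WORKSPACE}:${WORKSPACE} -w ${WORKSPACE} --network host graphnet_image"
echo "${cmd}"
```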

README.md

Lines changed: 97 additions & 50 deletions
@@ -1,93 +1,143 @@
 # GraphNet ![](https://img.shields.io/badge/version-v0.1-brightgreen) ![](https://img.shields.io/github/issues/PaddlePaddle/GraphNet?label=open%20issues) [![](https://img.shields.io/badge/Contribute%20to%20GraphNet-blue)](https://github.com/PaddlePaddle/GraphNet/issues/98)
 
+**GraphNet** is a large-scale dataset of deep learning **computation graphs**, built as a standard benchmark for **tensor compiler** optimization. It provides 2.7K computation graphs extracted from state-of-the-art deep learning models spanning diverse tasks and ML frameworks. With standardized formats and rich metadata, GraphNet enables fair comparison and reproducible evaluation of the general optimization capabilities of tensor compilers, thereby supporting advanced research such as AI for Systems on compilers (AI for Compiler).
 
-**GraphNet** is a large-scale dataset of deep learning **computation graphs**, designed to serve as a standard benchmark and training corpus for **AI-driven tensor compiler optimization**. It contains diverse graphs extracted from state-of-the-art models, enabling effective evaluation of compiler pass optimizations across frameworks and hardware platforms.
-
+<br>
+<div align="center">
+    <img src="/pics/Eval_result.png" alt="Violin plots of speedup distributions" width="65%">
+</div>
 
-With GraphNet, users can:
-1. Quickly benchmark the optimization performance of various compiler strategies.
-2. Easily conduct regression tests on existing compilers.
-3. Train AI‑for‑Systems models to automatically generate compiler optimization passes.
+Compiler developers can use GraphNet samples to evaluate tensor compilers (e.g., CINN, TorchInductor, TVM) on target tasks. The figure above shows the speedup of two compilers (CINN and TorchInductor) across two tasks (CV and NLP).
 
-**Vision**: We aim to achieve cross-hardware portability of compiler optimizations by allowing models to learn and transfer optimization strategies. It will significantly reduce the manual effort required to develop efficient operator implementations.
+## 🧱 Dataset Construction
 
+To guarantee the dataset's overall quality, reproducibility, and cross-compiler compatibility, we define the following construction **constraints**:
 
-### Dataset Construction Constraints:
-1. Dynamic graphs must execute correctly.
-2. Graphs and their corresponding Python code must support serialization and deserialization.
+1. Computation graphs must be executable in imperative (eager) mode.
+2. Computation graphs and their corresponding Python code must support serialization and deserialization.
 3. The full graph can be decomposed into two disjoint subgraphs.
 4. Operator names within each computation graph must be statically parseable.
 5. If custom operators are used, their implementation code must be fully accessible.
 
+### Graph Extraction & Validation
 
-## ⚡ Quick Start
-For full implementation details, please refer to the [Co-Creation Tutorial](https://github.com/PaddlePaddle/GraphNet/blob/develop/CONTRIBUTE_TUTORIAL.md#co-creation-tutorial).
-### Benchmark your compiler on the model:
+We provide automated extraction and validation tools for constructing this dataset.
 
-**graph_net.torch.test_compiler**
-```
-python3 -m graph_net.torch.test_compiler \
-  --model-path $GRAPH_NET_EXTRACT_WORKSPACE/model_name/ \
-  --compiler /path/to/custom/compiler
-# Note: if --compiler is omitted, PyTorch's built-in compiler is used by default
-```
+<div align="center">
+    <img src="/pics/graphnet_overview.jpg" alt="GraphNet Architecture Overview" width="65%">
+</div>
 
-### Contribute computation graphs to GraphNet:
 **Demo: Extract & Validate ResNet‑18**
-```
+```bash
 git clone https://github.com/PaddlePaddle/GraphNet.git
 cd GraphNet
 
 # Set your workspace directory
-export GRAPH_NET_EXTRACT_WORKSPACE=/home/yourname/graphnet_workspace
+export GRAPH_NET_EXTRACT_WORKSPACE=/home/yourname/graphnet_workspace/
 
 # Extract the ResNet‑18 computation graph
 python graph_net/test/vision_model_test.py
 
-# Validate the extracted graph (e.g. /home/yourname/graphnet_workspace/resnet18)
+# Validate the extracted graph (e.g. /home/yourname/graphnet_workspace/resnet18/)
 python -m graph_net.torch.validate \
-  --model-path $GRAPH_NET_EXTRACT_WORKSPACE/resnet18
+  --model-path $GRAPH_NET_EXTRACT_WORKSPACE/resnet18/
 ```
 
-**graph_net.torch.extract**
+**Illustration: How does GraphNet extract and construct a computation graph sample on PyTorch?**
 
-```python
+<div align="center">
+    <img src="/pics/graphnet_sample.png" alt="GraphNet Extract Sample" width="65%">
+</div>
+
+* Source code of custom_op is required **only when** the corresponding operator is used in the module, and **no specific format** is required.
+
+**Step 1: graph_net.torch.extract**
+
+Importing and wrapping the model with `graph_net.torch.extract(name=model_name, dynamic=dynamic_mode)(model)` is all you need:
+
+```python
 import graph_net
 
 # Instantiate the model (e.g. a torchvision model)
 model = ...
 
 # Extract your own model
-model = graph_net.torch.extract(name="model_name")(model)
-
-# After running, the extracted graph will be saved to:
-# $GRAPH_NET_EXTRACT_WORKSPACE/model_name
+model = graph_net.torch.extract(name="model_name", dynamic=True)(model)
 ```
 
-**graph_net.torch.validate**
-```
-# Verify that the extracted model meets requirements
+After running, the extracted graph will be saved to `$GRAPH_NET_EXTRACT_WORKSPACE/model_name/`.
+
+For more details, see the docstring of `graph_net.torch.extract` defined in `graph_net/torch/extractor.py`.
+
+**Step 2: graph_net.torch.validate**
+
+To verify that the extracted model meets the requirements, we use `graph_net.torch.validate` in the CI tool, and we also ask contributors to self-check in advance:
+
+```bash
 python -m graph_net.torch.validate \
   --model-path $GRAPH_NET_EXTRACT_WORKSPACE/model_name
 ```
 
-**graph_net.pack**
-```
-# Create a ZIP archive of $GRAPH_NET_EXTRACT_WORKSPACE.
-# The --clear-after-pack flag (True|False) determines whether to delete the workspace after packing.
-python -m graph_net.pack \
-  --output /path/to/output.zip \
-  --clear-after-pack True
+All the **construction constraints** are examined automatically. After validation passes, a unique `graph_hash.txt` is generated; it is later checked in the CI procedure to avoid redundancy.
+
+## ⚖️ Compiler Evaluation
+
+**Step 1: Benchmark**
+
+We use `graph_net/benchmark_demo.sh` to benchmark GraphNet computation graph samples:
+
+```bash
+bash graph_net/benchmark_demo.sh &
 ```
 
-Note: To configure your user details (username and email) for GraphNet, run:
+The script runs `graph_net.torch.test_compiler` with specific batch and log configurations.
+
+Or you can customize and run `graph_net.torch.test_compiler` yourself:
+
+```bash
+python -m graph_net.torch.test_compiler \
+  --model-path $GRAPH_NET_EXTRACT_WORKSPACE/model_name/ \
+  --compiler /custom/or/builtin/compiler/ \
+  --warmup /times/to/warmup/ \
+  --trials /times/to/test/ \
+  --device /device/to/execute/ \
+  --output-dir /path/to/save/JSON/result/file/
+
+# Note: if --compiler is omitted, PyTorch's built-in compiler is used by default
 ```
-python -m graph_net.config --global \
-  --username "your-name" \
-  --email "your-email"
+
+After executing, `graph_net.torch.test_compiler` will:
+1. Run the original model in eager mode to record a baseline.
+2. Compile the model with the specified backend (e.g., CINN, TVM, Inductor, TensorRT, XLA, BladeDISC).
+3. Execute the compiled model and collect its runtime and outputs.
+4. Compute the speedup by comparing the compiled results against the baseline.
+
+**Step 2: Analysis**
+
+After processing, we provide `graph_net/analysis.py` to generate [violin plots](https://en.m.wikipedia.org/wiki/Violin_plot) based on the JSON results.
+
+```bash
+python -m graph_net.analysis \
+  --benchmark-path /path/to/read/JSON/result/file/ \
+  --output-dir /path/to/save/output/figures/
 ```
-Once you have packaged these extracted computation graphs, submit them to the GraphNet community via the following group chats.
 
+After executing, one summary plot of results across all compilers, as well as multiple sub-plots of per-compiler results by category (model task, library, ...), will be exported.
+
+The script is designed to process a file structure such as `/benchmark_path/compiler_name/category_name/` (for example `/benchmark_logs/paddle/nlp/`); items on the x-axis are identified by the folder names, so you can modify the `read_all_speedups` function to fit your own benchmark settings.
+
+## 📌 Roadmap
+
+1. Scale GraphNet to 10K+ graphs.
+2. Further annotate GraphNet samples into more granular sub-categories.
+3. Extract samples from multi-GPU scenarios to support benchmarking and optimization for large-scale, distributed computing.
+4. Enable splitting full graphs into independently optimized subgraphs and operator sequences.
+
+**Vision**: GraphNet aims to lay the foundation for AI for Compiler by enabling **large-scale, systematic evaluation** of tensor compiler optimizations, and by providing a **dataset for models to learn** and transfer optimization strategies.
+
+## 💬 GraphNet Community
+
+You can join our community via the following group chats. You are welcome to ask any questions about using and building GraphNet.
 
 <div align="center">
 <table>
@@ -103,8 +153,5 @@
 </table>
 </div>
 
-
-
-## License
+## 🪪 License
 This project is released under the [MIT License](LICENSE).
-
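The README's description of `graph_net.torch.test_compiler` (eager baseline, compiled run, output check, speedup) can be sketched framework-free. The names below (`eager_model`, `compiled_model`, `benchmark`) are hypothetical stand-ins, not GraphNet or PyTorch APIs; a real run would compile with a backend such as Inductor:

```python
import time

def benchmark(fn, x, warmup=3, trials=10):
    """Time a callable: discard warmup runs, then average the trials."""
    for _ in range(warmup):
        fn(x)
    start = time.perf_counter()
    for _ in range(trials):
        fn(x)
    return (time.perf_counter() - start) / trials

def eager_model(x):
    # Stand-in for running the original model in eager mode.
    return [v * 2.0 + 1.0 for v in x]

def compiled_model(x):
    # Stand-in for the compiler backend's output for the same graph.
    return [v * 2.0 + 1.0 for v in x]

x = [float(i) for i in range(1000)]
baseline = benchmark(eager_model, x)      # step 1: eager baseline
optimized = benchmark(compiled_model, x)  # step 3: compiled runtime

# Correctness: compiled outputs must match the eager baseline.
assert eager_model(x) == compiled_model(x)

# Step 4: speedup is baseline time over compiled time.
speedup = baseline / optimized
print(f"speedup: {speedup:.2f}x")
```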

graph_net/benchmark_demo.sh

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ for package_path in "${samples_dir}"/*/; do
 
       echo "[$(date)] FINISHED: ${package_name}/${model_name}"
     fi
-  } >> "$global_log" 2>&1 &
+  } >> "$global_log" 2>&1
 done
 done
 
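The one-character change above drops the trailing `&`, so each benchmark block now runs to completion before the next starts, instead of launching every block as a concurrent background job that can skew timing results. A minimal sketch of the foreground pattern, with hypothetical task names:

```shell
LOG="$(mktemp)"

# Foreground loop: no trailing '&' after the redirected block, so each
# iteration finishes (and flushes its log lines) before the next begins.
for name in alpha beta; do
  {
    echo "START ${name}"
    echo "FINISHED ${name}"
  } >> "${LOG}" 2>&1
done

# With '} >> "${LOG}" 2>&1 &' the blocks would run concurrently, and a
# 'wait' would be required before the log is complete.
cat "${LOG}"
```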
graph_net/paddle/check_redundant_incrementally.py

Lines changed: 51 additions & 21 deletions
@@ -43,42 +43,72 @@ def is_single_model_dir(model_dir):
 
 
 def main(args):
-    assert os.path.isdir(args.model_path)
-    assert os.path.isdir(args.graph_net_samples_path)
-    current_model_graph_hash_pathes = set(
-        graph_hash_path
-        for model_path in get_recursively_model_pathes(args.model_path)
-        for graph_hash_path in [f"{model_path}/graph_hash.txt"]
+    assert os.path.isdir(
+        args.graph_net_samples_path
+    ), f"args.graph_net_samples_path ({args.graph_net_samples_path}) is not a directory!"
+
+    current_model_graph_hash_pathes = set()
+    if args.model_path:
+        assert os.path.isdir(
+            args.model_path
+        ), f"args.model_path {args.model_path} is not a directory!"
+        current_model_graph_hash_pathes = set(
+            graph_hash_path
+            for model_path in get_recursively_model_pathes(args.model_path)
+            for graph_hash_path in [f"{model_path}/graph_hash.txt"]
+        )
+
+    find_redundant = False
+    graph_hash2graph_net_model_path = {}
+    for model_path in get_recursively_model_pathes(args.graph_net_samples_path):
+        graph_hash_path = f"{model_path}/graph_hash.txt"
+        if (
+            os.path.isfile(graph_hash_path)
+            and graph_hash_path not in current_model_graph_hash_pathes
+        ):
+            graph_hash = open(graph_hash_path).read()
+            if graph_hash not in graph_hash2graph_net_model_path.keys():
+                graph_hash2graph_net_model_path[graph_hash] = [graph_hash_path]
+            else:
+                find_redundant = True
+                graph_hash2graph_net_model_path[graph_hash].append(graph_hash_path)
+    print(
+        f"Totally {len(graph_hash2graph_net_model_path)} unique samples under {args.graph_net_samples_path}."
     )
-    graph_hash2graph_net_model_path = {
-        graph_hash: graph_hash_path
-        for model_path in get_recursively_model_pathes(args.graph_net_samples_path)
-        for graph_hash_path in [f"{model_path}/graph_hash.txt"]
-        if os.path.isfile(graph_hash_path)
-        if graph_hash_path not in current_model_graph_hash_pathes
-        for graph_hash in [open(graph_hash_path).read()]
-    }
-    for current_model_graph_hash_path in current_model_graph_hash_pathes:
-        graph_hash = open(current_model_graph_hash_path).read()
+
+    if args.model_path:
+        # Check whether the specified model is redundant.
+        for current_model_graph_hash_path in current_model_graph_hash_pathes:
+            graph_hash = open(current_model_graph_hash_path).read()
+            assert (
+                graph_hash not in graph_hash2graph_net_model_path
+            ), f"Redundant models detected.\n\tgraph_hash:{graph_hash}, newly-added-model-path:{current_model_graph_hash_path}, existing-model-path:{graph_hash2graph_net_model_path[graph_hash]}."
+    else:
+        # Check whether there are redundant samples under the samples directory.
+        for graph_hash, graph_paths in graph_hash2graph_net_model_path.items():
+            if len(graph_paths) > 1:
+                print(f"Redundant models detected for graph_hash {graph_hash}:")
+                for model_path in graph_paths:
+                    print(f"  {model_path}")
    assert (
-        graph_hash not in graph_hash2graph_net_model_path
-    ), f"Redundant models detected. old-model-path:{current_model_graph_hash_path}, new-model-path:{graph_hash2graph_net_model_path[graph_hash]}."
+        not find_redundant
+    ), f"Redundant models detected under {args.graph_net_samples_path}."
 
 
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="Test compiler performance.")
     parser.add_argument(
         "--model-path",
         type=str,
-        required=True,
+        required=False,
         help="Path to model file(s), each subdirectory containing graph_net.json will be regarded as a model",
     )
     parser.add_argument(
         "--graph-net-samples-path",
         type=str,
-        required=False,
-        default="default",
+        required=True,
         help="Path to GraphNet samples",
     )
     args = parser.parse_args()
+    print(args)
     main(args=args)
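The rewritten check above collects every `graph_hash.txt` value into a hash-to-paths mapping and flags any hash that appears more than once. A standalone sketch of that grouping logic, with hypothetical (hash, path) pairs in place of real sample files:

```python
from collections import defaultdict

# Hypothetical (graph_hash, sample_path) pairs standing in for the
# contents of graph_hash.txt files found under the samples directory.
samples = [
    ("abc123", "samples/resnet18"),
    ("def456", "samples/bert_base"),
    ("abc123", "samples/resnet18_copy"),  # same graph hash: redundant
]

# Group sample paths by their graph hash.
hash_to_paths = defaultdict(list)
for graph_hash, path in samples:
    hash_to_paths[graph_hash].append(path)

# Any hash with more than one path is a redundant sample.
redundant = {h: p for h, p in hash_to_paths.items() if len(p) > 1}
for graph_hash, paths in redundant.items():
    print(f"Redundant models detected for graph_hash {graph_hash}: {paths}")

assert redundant == {"abc123": ["samples/resnet18", "samples/resnet18_copy"]}
```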

graph_net/paddle/samples_util.py

Lines changed: 2 additions & 1 deletion
@@ -3,4 +3,5 @@
 
 
 def get_default_samples_directory():
-    return f"{os.path.dirname(graph_net.__file__)}/../paddle_samples"
+    graph_net_root = os.path.dirname(os.path.dirname(graph_net.__file__))
+    return f"{graph_net_root}/paddle_samples"
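The new `get_default_samples_directory` walks up to the repository root with two `os.path.dirname` calls instead of appending `/..`, which yields an already-normalized path. A small sketch of the difference, using a hypothetical install location for the package's `__init__.py`:

```python
import os.path

# Hypothetical location of the graph_net package's __init__.py.
pkg_file = "/opt/project/graph_net/__init__.py"

# Old form: string concatenation leaves an unnormalized ".." segment.
old = f"{os.path.dirname(pkg_file)}/../paddle_samples"

# New form: two dirname calls step up cleanly to the repo root.
root = os.path.dirname(os.path.dirname(pkg_file))
new = f"{root}/paddle_samples"

print(old)  # /opt/project/graph_net/../paddle_samples
print(new)  # /opt/project/paddle_samples

# Both forms refer to the same directory once normalized.
assert os.path.normpath(old) == new
```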
