
Commit 992b765: merge code (2 parents: 34733d4 + c2280ca)

File tree: 9 files changed (+388, -103 lines)

README.md

Lines changed: 3 additions & 0 deletions

@@ -96,6 +96,9 @@ python -m graph_net.plot_violin \
 
 The scripts expect a file structure of the form `/benchmark_path/category_name/`; items on the x-axis are identified by the names of the sub-directories. After execution, several summary plots of the results grouped by category (model tasks, libraries, ...) are exported to `$GRAPH_NET_BENCHMARK_PATH`.
 
+### Hardware Regression Testing
+We also provide a two-step workflow that validates compiler correctness and performance against a "golden" reference, which is crucial for hardware-specific testing and regression tracking. Details can be found in this [guide](./docs/hardware_test.md).
+
 ### 🧱 Construction & Contribution Guide
 Want to understand how GraphNet is built or contribute new samples?
 Check out the [Construction Guide](./docs/README_contribute.md) for details on the extraction and validation workflow.

docs/hardware_test.md

Lines changed: 20 additions & 0 deletions

## Hardware Regression Testing

### Step 1: Generate Reference Data
First, use `graph_net.paddle.test_reference_device` on a trusted setup (e.g., a specific hardware/compiler version) to generate baseline logs and output files.
```bash
python -m graph_net.paddle.test_reference_device \
    --model-path /path/to/all_models/ \
    --reference-dir ./golden_reference \
    --compiler cinn \
    --device cuda
# --reference-dir: (Required) Directory where the output .log (performance/config) and .pdout (output tensors) files will be saved.
# --compiler: Specifies the compiler backend.
```

### Step 2: Run Regression Test
After changing hardware, run the correctness test script. It reads the reference data, re-runs the models with exactly the same configuration, and compares the new results against the "golden" reference.
```bash
python -m graph_net.paddle.test_device_correctness \
    --reference-dir ./golden_reference \
    --device cuda
```
This script reports any failures (e.g., compilation errors, output mismatches) and prints a performance comparison (speedup/slowdown) against the reference log, so you can quickly identify regressions.
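For intuition, the comparison boils down to a tolerance check on saved outputs plus a timing ratio. The sketch below is illustrative only; the file formats, helper name, and tolerances are assumptions, not graph_net's actual implementation.

```python
# Minimal sketch of a reference-vs-new comparison (illustrative, not graph_net's code).
# Assumes each model produced a saved output array and a measured latency in milliseconds.
import numpy as np


def compare_one_model(ref_output, new_output, ref_ms, new_ms, rtol=1e-4, atol=1e-4):
    """Return (is_correct, speedup) for a single model."""
    # Correctness: new outputs must match the golden reference within tolerance.
    is_correct = np.allclose(ref_output, new_output, rtol=rtol, atol=atol)
    # Performance: > 1.0 means the new run is faster than the reference.
    speedup = ref_ms / new_ms
    return is_correct, speedup


# Example with synthetic data:
ref = np.random.rand(4, 8).astype("float32")
new = ref + np.random.uniform(-1e-6, 1e-6, ref.shape).astype("float32")
ok, speedup = compare_one_model(ref, new, ref_ms=12.0, new_ms=10.0)
print(f"correct={ok}, speedup={speedup:.2f}x")
```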
Lines changed: 20 additions & 0 deletions

#!/bin/bash
set -x

# input model path
MODEL_PATH_IN_SAMPLES=/timm/resnet18
# split the graph at node positions 8, 16 and 32 (head/tail grouped, chained subgraphs)
read -r -d '' json_str <<'EOF'
{
  "output_dir": "/tmp/naive_decompose_workspace",
  "split_positions": [8, 16, 32],
  "group_head_and_tail": true,
  "chain_style": true
}
EOF
CONFIG=$(echo "$json_str" | base64 -w 0)

mkdir -p /tmp/naive_decompose_workspace
GRAPH_NET_ROOT=$(python3 -c "import graph_net; import os; print(os.path.dirname(graph_net.__file__))")
python3 -m graph_net.torch.single_device_runner \
    --model-path "$GRAPH_NET_ROOT/../samples/$MODEL_PATH_IN_SAMPLES" \
    --enable-extract True \
    --extract-name resnet18 \
    --dump-graph-hash-key \
    --custom-extractor-path="$GRAPH_NET_ROOT/torch/naive_graph_decomposer.py" \
    --custom-extractor-config="$CONFIG"
Lines changed: 17 additions & 0 deletions

#!/bin/bash

# input model path
MODEL_PATH_IN_SAMPLES=/timm/resnet18
# split the graph at node positions 8 and 32 (head/tail grouped)
read -r -d '' json_str <<'EOF'
{
  "output_dir": "/tmp/naive_decompose_workspace",
  "split_positions": [8, 32],
  "group_head_and_tail": true
}
EOF
CONFIG=$(echo "$json_str" | base64 -w 0)

mkdir -p /tmp/naive_decompose_workspace
GRAPH_NET_ROOT=$(python3 -c "import graph_net; import os; print(os.path.dirname(graph_net.__file__))")
python3 -m graph_net.torch.single_device_runner \
    --model-path "$GRAPH_NET_ROOT/../samples/$MODEL_PATH_IN_SAMPLES" \
    --enable-extract True \
    --extract-name resnet18 \
    --dump-graph-hash-key \
    --custom-extractor-path="$GRAPH_NET_ROOT/torch/naive_graph_decomposer.py" \
    --custom-extractor-config="$CONFIG"
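Both scripts hand the decomposer its settings through `--custom-extractor-config` as a base64-encoded JSON string. The snippet below only illustrates that encoding round-trip; it is not graph_net's own decoding code, and the variable names are made up.

```python
# Illustration of the base64-encoded JSON config round-trip (not graph_net's own code).
import base64
import json

config = {
    "output_dir": "/tmp/naive_decompose_workspace",
    "split_positions": [8, 32],
    "group_head_and_tail": True,
}

# Encode the same way the shell scripts do with `base64 -w 0`.
encoded = base64.b64encode(json.dumps(config).encode("utf-8")).decode("ascii")

# A consumer of --custom-extractor-config can recover the dict like this.
decoded = json.loads(base64.b64decode(encoded).decode("utf-8"))
assert decoded["split_positions"] == [8, 32]
print(encoded)
print(decoded)
```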

graph_net/test/torch_extractor_test.py

Lines changed: 8 additions & 21 deletions

@@ -19,36 +19,22 @@ def forward(self, x):
 
 
 class WrapperModule(torch.nn.Module):
-    def __init__(self, submodule):
+    def __init__(self, submodule, seq_no):
         super().__init__()
         self.submodule = submodule
+        self.seq_no = seq_no
 
     def forward(self, *args):
         print("Args:")
         print(args)
         return self.submodule(*args)
 
 
-def submodule_hook(submodule: torch.fx.GraphModule):
-    print(f"{'-'*8} [submodule] {'-'*8}\n")
+def submodule_hook(submodule: torch.fx.GraphModule, seq_no):
+    print(f"{'-'*8} [submodule-{seq_no}] {'-'*8}\n")
     print(submodule.graph)
-    """
-    graph():
-        %add : [num_users=1] = placeholder[target=add]
-        %mul : [num_users=1] = call_function[target=operator.mul](args = (%add, 2), kwargs = {})
-        %clamp : [num_users=1] = call_method[target=clamp](args = (%mul,), kwargs = {min: 0.0, max: 1.0})
-        return (clamp,)
-
-    """
     print(submodule.code)
-    """
-    def forward(self, add):
-        mul = add * 2; add = None
-        clamp = mul.clamp(min = 0.0, max = 1.0); mul = None
-        return (clamp,)
-    """
-
-    return WrapperModule(submodule)
+    return WrapperModule(submodule, seq_no)
 
 
 class TestExtractorSubmodule(unittest.TestCase):
@@ -87,9 +73,10 @@ def forward(self, x):
 
         folded = fold_range_to_submodule(
             symbolic_traced,
-            start_node_idx=2,
-            end_node_idx=4,
+            start_node_idx=0,
+            end_node_idx=2,
             submodule_hook=submodule_hook,
+            # group_head_and_tail=False,
         )
         folded_output = folded(inp)
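The test now passes a sequence number into the hook so each folded submodule knows its position in the chain. Below is a small self-contained torch.fx sketch of that pattern; it builds toy "pieces" by hand instead of calling `fold_range_to_submodule` (whose import path is not shown in this diff), so treat it as an illustration only.

```python
# Minimal sketch of the seq_no-aware hook pattern (illustrative; not the
# fold_range_to_submodule implementation exercised by the test above).
import torch
import torch.fx


class WrapperModule(torch.nn.Module):
    """Wraps a folded submodule and remembers its position in the chain."""

    def __init__(self, submodule, seq_no):
        super().__init__()
        self.submodule = submodule
        self.seq_no = seq_no

    def forward(self, *args):
        print(f"[submodule-{self.seq_no}] input shapes: {[tuple(a.shape) for a in args]}")
        return self.submodule(*args)


def submodule_hook(submodule: torch.fx.GraphModule, seq_no):
    # Same shape as the hook in the test: inspect the graph, then wrap it.
    print(f"{'-'*8} [submodule-{seq_no}] {'-'*8}")
    print(submodule.code)
    return WrapperModule(submodule, seq_no)


# Pretend the model was already split into two pieces; here each piece is a
# traced toy module, numbered in order by the hook.
class Piece(torch.nn.Module):
    def forward(self, x):
        return (x * 2).clamp(min=0.0, max=1.0)


pieces = [submodule_hook(torch.fx.symbolic_trace(Piece()), seq_no=i) for i in range(2)]

x = torch.rand(3)
for piece in pieces:
    x = piece(x)
print(x)
```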
