Commit 103873e

Merge branch 'develop' of github.com:PaddlePaddle/GraphNet into develop
2 parents b2e46e8 + 9297cf6

27 files changed: +21550 −172 lines

docs/CONTRIBUTE_TUTORIAL.md

Lines changed: 6 additions & 19 deletions

````diff
@@ -255,35 +255,22 @@ python -m graph_net.config \
    ```
 
-2. **Package the graph**
+2. **Commit the changes**
 
+   Move the new sample to the **samples** directory and commit.
    ```bash
-   python -m graph_net.pack --output /path/to/output.zip --clear-after-pack True
-   ```
-
-   This API:
-
-   a. Packages all files under `$GRAPH_NET_EXTRACT_WORKSPACE` into `/path/to/output.zip` (you can set it to `GraphNet/samples`)
-
-   b. Clears the workspace if `--clear-after-pack` is `True`
-
-   Note: If third-party ops are used, contributors must include them manually in the package. As long as `validate` passes, no specific folder structure is required.
-
-3. **Commit the changes**
-
-   Move the packaged computational graph from the previous step to the **samples** directory and commit.
-   ```bash
-   git add <the packaged computational graph>
+   git add <the new sample>
    git commit -m "Description"
    ```
+   Note: If third-party ops are used, contributors must include them manually in the package.
 
-4. **Push the branch to your fork**
+3. **Push the branch to your fork**
 
    ```bash
    git push origin feature/your-branch-name
    ```
 
-5. **Submit a Pull Request**
+4. **Submit a Pull Request**
 
 > **Note**: For clarity and maintainability, each PR should follow the Single Responsibility Principle (SRP). Submit only a single graph or a focused feature improvement per PR. For example, if you both update extraction logic and collect multiple models, each graph and each method update should be a separate PR.
````
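The updated contributor steps (move the sample into `samples/`, commit, push) can be sketched end to end. This is a self-contained dry run against a throwaway repository; the model name `my_model` and the branch name are illustrative, not part of the project's real layout.

```shell
set -e

# Stand-in for $GRAPH_NET_EXTRACT_WORKSPACE holding a freshly extracted sample.
workspace=$(mktemp -d)
mkdir -p "$workspace/my_model"
echo "graph" > "$workspace/my_model/model.py"

# Stand-in for your GraphNet fork clone.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "john_doe@example.com"
git config user.name "john_doe"
git commit -q --allow-empty -m "init"
mkdir -p samples

# Step 2: move the new sample into samples/ and commit it.
mv "$workspace/my_model" samples/
git add samples/my_model
git commit -q -m "Add my_model computation graph sample"

# Step 3 would push to your fork (no remote exists in this sketch):
#   git push origin feature/your-branch-name
git log --oneline
```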

docs/CONTRIBUTE_TUTORIAL_cn.md

Lines changed: 7 additions & 17 deletions

````diff
@@ -247,32 +247,22 @@ python -m graph_net.config \
    --username "john_doe" \
    ```
 
-2. **Package the computation graph**
-
-   ```bash
-   python -m graph_net.pack --output /path/to/output.zip --clear-after-pack True
-   ```
-
-   This API:
-
-   a. Packages all files under `$GRAPH_NET_EXTRACT_WORKSPACE` into `/path/to/output.zip` (this can be set to `GraphNet/samples`)
-
-   b. If `--clear-after-pack` is `True`, clears `$GRAPH_NET_EXTRACT_WORKSPACE` after packing
+2. **Commit the changes**
 
-   Note: if third-party ops are used, contributors must package them into the graph archive themselves. No particular directory layout is required; passing the `validate` step is the acceptance criterion.
-
-3. **Commit the changes**
-
-   Move the graph archive packed in the previous step to the **samples** directory, then commit.
+   Move the new computation-graph sample to the **samples** directory, then commit.
    ```bash
-   git add <graph archive>
+   git add <new graph sample>
    git commit -m "Description"
    ```
-4. **Push the branch to your fork**
+   Note: if third-party ops are used, contributors must package them into the sample themselves.
+
+3. **Push the branch to your fork**
 
    ```bash
    git push origin feature/your-branch-name
    ```
-5. **Submit a Pull Request**
+4. **Submit a Pull Request**
 
 > **Note**: To keep changes manageable, each PR should follow the Single Responsibility Principle (SRP): **add only a single computation graph, or focus on a single feature improvement**; avoid mixing multiple changes in one submission. For example, if you modify the extraction method and then collect data for a class of models, each individual model's graph and the updated extraction method should each be opened as a separate PR.
````

graph_net/sample_pass/agent_unittest_generator.py

Lines changed: 21 additions & 15 deletions

````diff
@@ -224,7 +224,7 @@ def __init__(
         self.output_dir = Path(output_dir)
         self.device = self._choose_device(device)
         self.generate_main = generate_main
-        self.try_run = try_run and generate_main
+        self.try_run = try_run
         self.data_input_predicator = self._make_data_input_predicator(
             data_input_predicator_filepath, data_input_predicator_class_name
         )
@@ -244,20 +244,26 @@ def generate(self):
             input_tensor_metas,
             weight_tensor_metas,
         ) = self._get_input_and_weight_tensor_metas(input_arg_names, weight_arg_names)
-        graph_module_desc = GraphModuleDescriptor(
-            device=self.device,
-            generate_main=self.generate_main,
-            model_name=model_name,
-            input_arg_names=input_arg_names,
-            input_tensor_metas=input_tensor_metas,
-            weight_arg_names=weight_arg_names,
-            weight_tensor_metas=weight_tensor_metas,
-            forward_body=self._get_forward_body(
-                graph_module, input_arg_names, weight_arg_names
-            ),
-        )
-        unittest = self._render_template(graph_module_desc)
-        if self._try_to_run_unittest(unittest):
+
+        def _generate_unittest(generate_main):
+            graph_module_desc = GraphModuleDescriptor(
+                device=self.device,
+                generate_main=generate_main,
+                model_name=model_name,
+                input_arg_names=input_arg_names,
+                input_tensor_metas=input_tensor_metas,
+                weight_arg_names=weight_arg_names,
+                weight_tensor_metas=weight_tensor_metas,
+                forward_body=self._get_forward_body(
+                    graph_module, input_arg_names, weight_arg_names
+                ),
+            )
+            return self._render_template(graph_module_desc)
+
+        # Generate unittest with main for try-run.
+        unittest_for_try_run = _generate_unittest(generate_main=self.try_run)
+        if self._try_to_run_unittest(unittest_for_try_run):
+            unittest = _generate_unittest(generate_main=self.generate_main)
             self._write_to_file(unittest, self.output_dir)
 
     def _choose_device(self, device) -> str:
````
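The refactor above separates rendering from writing: the unittest is rendered once for the try-run (with a main block when `try_run` is set) and, only if that run succeeds, rendered again with the configured `generate_main` and written out. A standalone sketch of that control flow, with `render`, `try_to_run`, and `write` as illustrative stand-ins for the class's `_render_template`, `_try_to_run_unittest`, and `_write_to_file` helpers:

```python
def generate(render, try_to_run, write, *, try_run: bool, generate_main: bool):
    # The try-run needs an executable main, so render with generate_main=try_run.
    unittest_for_try_run = render(generate_main=try_run)
    if try_to_run(unittest_for_try_run):
        # Only after the try-run succeeds: render with the caller-requested
        # generate_main setting and write the final unittest out.
        write(render(generate_main=generate_main))


# Toy usage: "rendering" just records whether a main block would be emitted.
outputs = []
generate(
    render=lambda generate_main: f"unittest(main={generate_main})",
    try_to_run=lambda text: True,  # pretend the try-run passed
    write=outputs.append,
    try_run=True,
    generate_main=False,
)
print(outputs)  # -> ['unittest(main=False)']
```

The point of the two renders is that a failing try-run never produces an output file, while the written file still honors the configured `generate_main`.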
Lines changed: 254 additions & 0 deletions (new file)

```python
from graph_net.sample_pass.sample_pass import SamplePass
from graph_net.sample_pass.resumable_sample_pass_mixin import ResumableSamplePassMixin
from pathlib import Path
import json
from itertools import groupby
from dataclasses import dataclass


class FusibleSubgraphRangesGenerator(SamplePass, ResumableSamplePassMixin):
    """Sample pass that writes fusible subgraph ranges for each model as JSON."""

    def __init__(self, config):
        super().__init__(config)

    def declare_config(
        self,
        model_path_prefix: str,
        output_dir: str,
        input_json_file_name: str,
        resume: bool = False,
        limits_handled_models: int | None = None,
        output_json_file_name: str = "fusible_subgraph_ranges.json",
    ):
        pass

    def __call__(self, rel_model_path: str):
        self.resumable_handle_sample(rel_model_path)

    def sample_handled(self, rel_model_path: str) -> bool:
        file_name = self.config["output_json_file_name"]
        return self.naive_sample_handled(rel_model_path, search_file_name=file_name)

    def resume(self, rel_model_path: str):
        analyzer = self._make_analyzer(rel_model_path)
        output_obj = {
            "subgraph_ranges": analyzer.analyze(),
        }
        self._save_output(rel_model_path, output_obj)

    def _save_output(self, rel_model_path, output_obj):
        output_json = json.dumps(output_obj, indent=4)
        output_dir_path = Path(self.config["output_dir"]) / rel_model_path
        output_dir_path.mkdir(parents=True, exist_ok=True)
        output_file_path = output_dir_path / self.config["output_json_file_name"]
        output_file_path.write_text(output_json)

    def _make_analyzer(self, rel_model_path: str):
        model_path = (
            Path(self.config["model_path_prefix"])
            / rel_model_path
            / self.config["input_json_file_name"]
        )
        json_ctx = self._make_json_ctx(model_path)
        return FusibleSubgraphRangesAnalyzer(
            num_subgraph_kernels_list=self._get_num_subgraph_kernels_list(json_ctx),
            num_subgraph_ops_list=self._get_num_subgraph_ops_list(json_ctx),
            start_offset_in_original_graph=self._get_start_offset_in_original_graph(
                json_ctx
            ),
        )

    def _get_start_offset_in_original_graph(self, json_ctx):
        return json_ctx["start_offset_in_original_graph"]

    def _get_num_subgraph_kernels_list(self, json_ctx):
        return json_ctx["num_subgraph_kernels"]

    def _get_num_subgraph_ops_list(self, json_ctx):
        return json_ctx["num_subgraph_ops"]

    def _make_json_ctx(self, model_path: Path):
        obj = json.loads(model_path.read_text())
        assert len(obj["num_subgraph_kernels"]) == len(obj["num_subgraph_ops"])
        return obj
```
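The generator's file contract can be sketched concretely. Everything below is self-contained; `some_model` and `subgraph_stats.json` are illustrative names (the real input file name comes from the pass config), and `[[1, 3]]` is an illustrative result, not a computed one.

```python
import json
import tempfile
from pathlib import Path

# A throwaway model directory standing in for <model_path_prefix>/<rel_model_path>.
model_dir = Path(tempfile.mkdtemp()) / "some_model"
model_dir.mkdir(parents=True)

# Input side: parallel per-subgraph kernel and op counts plus a graph offset,
# matching the keys _make_json_ctx reads and the length check it asserts.
(model_dir / "subgraph_stats.json").write_text(json.dumps({
    "num_subgraph_kernels": [1, 2, 2, 1],
    "num_subgraph_ops": [1, 2, 3, 4],
    "start_offset_in_original_graph": 0,
}))

ctx = json.loads((model_dir / "subgraph_stats.json").read_text())
assert len(ctx["num_subgraph_kernels"]) == len(ctx["num_subgraph_ops"])

# Output side: the analyzer's ranges wrapped under "subgraph_ranges", written
# as indented JSON the way _save_output does.
out_path = model_dir / "fusible_subgraph_ranges.json"
out_path.write_text(json.dumps({"subgraph_ranges": [[1, 3]]}, indent=4))
```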
```python
class FusibleSubgraphRangesAnalyzer:
    def __init__(
        self,
        num_subgraph_kernels_list: list[int],
        num_subgraph_ops_list: list[int],
        start_offset_in_original_graph: int,
    ):
        assert len(num_subgraph_kernels_list) == len(num_subgraph_ops_list)
        self.num_subgraph_kernels_list = num_subgraph_kernels_list
        self.num_subgraph_ops_list = num_subgraph_ops_list
        self.start_offset_in_original_graph = start_offset_in_original_graph

    def analyze(self):
        analysis_ctx = self._make_analysis_ctx()
        num_kernels_and_num_ops_list = analysis_ctx.num_kernels_and_num_ops_list
        # The tail num_kernels equals the head num_kernels for each num_ops_list.
        naive_proposal_fused_num_ops_lists = [
            sorted(set(num_ops_list))
            for _, num_ops_list in num_kernels_and_num_ops_list
            if len(set(num_ops_list)) > 1
        ]
        proposal_fused_num_ops_lists = self._merge_all_decreasing_num_ops_lists(
            analysis_ctx, naive_proposal_fused_num_ops_lists
        )
        return self._create_subgraph_ranges_from_proposal(
            analysis_ctx,
            proposal_fused_num_ops_lists,
        )

    def _merge_all_decreasing_num_ops_lists(self, analysis_ctx, num_ops_lists):
        # Repeat single merges until a fixed point; the counter guards against
        # a non-terminating loop.
        dead_loop_detect_cnt = 0
        kLimit = 99999
        while True:
            last_len_num_ops_lists = len(num_ops_lists)
            num_ops_lists = self._merge_one_decreasing_num_ops_lists(
                analysis_ctx, num_ops_lists
            )
            assert last_len_num_ops_lists >= len(num_ops_lists)
            if last_len_num_ops_lists == len(num_ops_lists):
                break
            dead_loop_detect_cnt += 1
            assert dead_loop_detect_cnt < kLimit, f"{dead_loop_detect_cnt=}"
        return num_ops_lists

    def _merge_one_decreasing_num_ops_lists(self, analysis_ctx, num_ops_lists):
        merge_pos = self._detect_mergable_decreasing_position(
            analysis_ctx, num_ops_lists
        )
        if merge_pos is None:
            return num_ops_lists
        assert merge_pos >= 0
        assert merge_pos < len(num_ops_lists) - 1
        return [
            *num_ops_lists[:merge_pos],
            [*num_ops_lists[merge_pos], *num_ops_lists[merge_pos + 1]],
            *num_ops_lists[merge_pos + 2 :],
        ]

    def _detect_mergable_decreasing_position(self, analysis_ctx, num_ops_lists):
        def get_cur_tail_num_kernels(i):
            return analysis_ctx.num_kernels4num_ops(num_ops_lists[i][-1])

        def get_next_head_num_kernels(i):
            return analysis_ctx.num_kernels4num_ops(num_ops_lists[i + 1][0])

        for i in range(len(num_ops_lists) - 1):
            assert len(num_ops_lists[i]) > 1
            if get_cur_tail_num_kernels(i) >= get_next_head_num_kernels(i):
                return i
        return None
```
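The fixed-point loop above repeatedly merges neighbouring `num_ops` lists while the kernel count at the tail of one list is at least the kernel count at the head of the next, i.e. while fusing further does not make the kernel count rise. A self-contained sketch of that merge rule (`merge_decreasing` and the toy kernel table are illustrative, not the class's API):

```python
def merge_decreasing(num_ops_lists, num_kernels4num_ops):
    while True:
        for i in range(len(num_ops_lists) - 1):
            tail = num_kernels4num_ops(num_ops_lists[i][-1])
            head = num_kernels4num_ops(num_ops_lists[i + 1][0])
            if tail >= head:
                # Merge list i with list i + 1, then rescan from the start.
                num_ops_lists = [
                    *num_ops_lists[:i],
                    [*num_ops_lists[i], *num_ops_lists[i + 1]],
                    *num_ops_lists[i + 2:],
                ]
                break
        else:
            # No mergeable position found: fixed point reached.
            return num_ops_lists


# Kernel counts plateau at 3 then drop to 2, so the two runs merge into one.
kernels = {2: 3, 3: 3, 4: 3, 5: 2, 6: 2}
merged = merge_decreasing([[2, 3, 4], [5, 6]], kernels.__getitem__)
print(merged)  # -> [[2, 3, 4, 5, 6]]
```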
```python
    def _create_subgraph_ranges_from_proposal(
        self, analysis_ctx, proposal_fused_num_ops_lists
    ):
        # Filter valid num_ops_list.

        def is_a_range(int_list):
            assert len(int_list) > 1
            return (int_list[-1] + 1) - int_list[0] == len(int_list)

        def have_any_increasing(num_ops_list: list[int]):
            for i, cur_num_ops in enumerate(num_ops_list):
                if i == 0:
                    continue
                cur_num_kernels = analysis_ctx.num_kernels4num_ops(cur_num_ops)
                last_num_kernels = analysis_ctx.num_kernels4num_ops(num_ops_list[i - 1])
                if cur_num_kernels > last_num_kernels:
                    return True
            return False

        def head_eq_tail(num_ops_list: list[int]):
            return analysis_ctx.num_kernels4num_ops(
                num_ops_list[0]
            ) == analysis_ctx.num_kernels4num_ops(num_ops_list[-1])

        def head_gt_tail(num_ops_list: list[int]):
            return analysis_ctx.num_kernels4num_ops(
                num_ops_list[0]
            ) > analysis_ctx.num_kernels4num_ops(num_ops_list[-1])

        def valid_fused_ops(num_ops_list: list[int]):
            if head_gt_tail(num_ops_list):
                return True
            if head_eq_tail(num_ops_list):
                return not have_any_increasing(num_ops_list)
            return False

        proposal_fused_num_ops_lists = [
            sorted(set(num_ops_list)) for num_ops_list in proposal_fused_num_ops_lists
        ]
        num_ops_lists = [
            num_ops_list
            for num_ops_list in proposal_fused_num_ops_lists
            if len(num_ops_list) > 1
            if is_a_range(num_ops_list)
            if valid_fused_ops(num_ops_list)
        ]
        fusible_subgraph_ranges = [
            (start, end)
            for num_ops_list in num_ops_lists
            for start in [num_ops_list[0] - 1]
            for end in [num_ops_list[-1]]
        ]

        # Sorted by `start`.
        def range_sort_key(pair):
            start, end = pair
            # Smaller `start` first, bigger `end` first.
            return (start, -end)

        fusible_subgraph_ranges = sorted(fusible_subgraph_ranges, key=range_sort_key)
        # Remove shadowed ranges.
        fusible_subgraph_ranges = [
            fusible_subgraph_ranges[i]
            for i in range(len(fusible_subgraph_ranges))
            if i == 0
            or (fusible_subgraph_ranges[i][0] >= fusible_subgraph_ranges[i - 1][1])
        ]
        return fusible_subgraph_ranges

    def _make_analysis_ctx(self):
        return AnalysisContext(
            num_kernels_and_num_ops_list=self._make_num_kernels_and_num_ops_list(),
            num_ops2num_kernels=self._make_num_ops2num_kernels(),
        )

    def _make_num_ops2num_kernels(self):
        return dict(zip(self.num_subgraph_ops_list, self.num_subgraph_kernels_list))

    def _make_num_kernels_and_num_ops_list(self):
        num_kernels_and_num_ops = zip(
            self.num_subgraph_kernels_list,
            self.num_subgraph_ops_list,
        )

        def get_num_kernels(pair):
            return pair[0]

        def get_num_ops(pair):
            return pair[1]

        # Sort by num_ops, then group consecutive runs sharing a kernel count.
        num_kernels_and_num_ops = sorted(num_kernels_and_num_ops, key=get_num_ops)
        grouped_num_kernels_and_num_ops = groupby(
            num_kernels_and_num_ops, key=get_num_kernels
        )
        num_kernels_and_num_ops_list = [
            (num_kernels, [num_ops for _, num_ops in group])
            for num_kernels, group in grouped_num_kernels_and_num_ops
        ]
        return num_kernels_and_num_ops_list


@dataclass
class AnalysisContext:
    num_kernels_and_num_ops_list: list[tuple[int, list[int]]]
    num_ops2num_kernels: dict[int, int]

    def num_kernels4num_ops(self, num_ops: int):
        return self.num_ops2num_kernels[num_ops]
```
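The last step of `_create_subgraph_ranges_from_proposal` sorts candidate ranges by `(start, -end)` (wider ranges first at a given start) and drops any range that overlaps the one before it in sorted order. Extracted as a standalone sketch (`drop_shadowed` is an illustrative name, not the class's API):

```python
def drop_shadowed(ranges):
    # Sort so that, for equal starts, the widest range comes first.
    ranges = sorted(ranges, key=lambda r: (r[0], -r[1]))
    # Keep a range only if it starts at or after the previous range's end.
    return [
        ranges[i]
        for i in range(len(ranges))
        if i == 0 or ranges[i][0] >= ranges[i - 1][1]
    ]


# (1, 3) is shadowed by (0, 5); (6, 7) is shadowed by (5, 8).
print(drop_shadowed([(1, 3), (0, 5), (5, 8), (6, 7)]))  # -> [(0, 5), (5, 8)]
```

Note that each range is compared against its predecessor in sorted order, not against the last range kept, exactly as in the list comprehension above.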
