[SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp #3694
+12
−14
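This PR depends on two PRs in the Paddle main framework:

- `pir::Place` in CudaGraphOp output to avoid memcpy Paddle#75078

Changes in this PR:

- #3302 added `append_attention_with_output`, but enabling it still caused graph breaks. This PR eliminates the graph breaks that occur when `full_cuda_graph=false`.
- The cpp_extensions that run in dynamic graph mode do not need `key_cache_out` or `value_cache_out`. This PR removes `key_cache_out` and `value_cache_out` from the custom operator registration so that it is aligned with the dynamic graph path.
- In addition, static graphs have no place, so SOT hits a graph break during dynamic-to-static conversion. The `.to(qkv.place)` call is therefore removed.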
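The graph break caused by reading `place` can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions: `SymbolicTensor`, `EagerTensor`, `GraphBreak`, and `trace` are hypothetical names invented for illustration, not Paddle or SOT APIs, and the real SOT fallback logic is considerably more involved. The sketch shows only the idea that a value captured in a static graph carries a shape but no device placement, so code that reads `.place` forces a fallback to eager execution.

```python
class GraphBreak(Exception):
    """Raised by the toy tracer when it meets something it cannot record."""


class SymbolicTensor:
    """Hypothetical stand-in for a static-graph value: it has a shape but no place."""

    def __init__(self, shape):
        self.shape = shape

    def __getattr__(self, name):
        # Only called for attributes that do not exist, e.g. `place`.
        raise GraphBreak(f"symbolic tensor has no attribute {name!r}")


class EagerTensor:
    """Hypothetical stand-in for a dynamic-graph tensor that knows its device."""

    def __init__(self, shape, place="gpu:0"):
        self.shape = shape
        self.place = place

    def to(self, place):
        return EagerTensor(self.shape, place)


def make_buffer_with_place(qkv):
    # Mirrors the removed pattern: allocate a buffer, then move it to qkv's place.
    out = EagerTensor(qkv.shape, place="cpu")
    return out.to(qkv.place)


def make_buffer_without_place(qkv):
    # Place-free variant: only the shape is read, so tracing can continue.
    return type(qkv)(qkv.shape)


def trace(fn, shape):
    """Try to run fn symbolically; on a graph break, fall back to eager execution.

    Returns a pair (result, broke), where broke tells whether a fallback happened.
    """
    try:
        return fn(SymbolicTensor(shape)), False
    except GraphBreak:
        return fn(EagerTensor(shape)), True


if __name__ == "__main__":
    print(trace(make_buffer_with_place, (2, 8))[1])
    print(trace(make_buffer_without_place, (2, 8))[1])
```

In this toy model the function that reads `qkv.place` always falls back to eager mode, while the place-free variant is traced end to end, which is the motivation given above for dropping `.to(qkv.place)`.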
PS: The script currently used for CUDAGraph + subgraph splitting:
cc @SigureMo @zyfncg @gongshaotian