Replies: 2 comments 5 replies
-
Yes, Due to this and the simplicity of the |
-
Thanks for the quick and concise explanation. |
-
Thanks for supporting the great development environment.
When using gcn_conv in torch_geometric, I noticed that there are two ways to describe the aggregation stage: passing adj_t or passing edge_index. Both give exactly the same result.
But there seems to be a performance gap between them: sparse matmul with adj_t is at least two times faster than the edge_index path with cached=True.
In general, as far as I know, in a torch-only model the aggregation is a sparse matmul of the adjacency matrix with the input, like "torch.spmm(adj, input)".
Given that, I'm curious why edge_index is used in most pytorch_geometric model implementations.
Even in your benchmark (pytorch_geometric/benchmark/kernel/main_performance.py), the default method uses edge_index with gather and scatter for aggregation.
Are there any advantages of edge_index compared with adj_t?
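To make the two formulations concrete, here is a minimal pure-Python sketch of sum aggregation done both ways. It is only an illustration under stated assumptions, not how torch_geometric implements either path: the toy graph, the function names, and the CSR layout are all hypothetical, and GCN-style normalization and self-loops are left out. The gather/scatter variant corresponds loosely to the index_select and scatter_add_ entries in the profile below, and the second variant to a sparse-dense product with adj_t.

```python
# Toy graph (hypothetical): each column of edge_index is one edge source -> target.
edge_index = [
    [0, 0, 1, 2],  # source nodes
    [1, 2, 2, 0],  # target nodes
]
x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # one feature row per node


def aggregate_gather_scatter(edge_index, x):
    """Sum-aggregate by gathering source rows, then scatter-adding to targets."""
    src, dst = edge_index
    num_feats = len(x[0])
    # Gather: one message per edge, so an E x F intermediate is materialized.
    messages = [x[j] for j in src]
    # Scatter-add: accumulate each message into the row of its target node.
    out = [[0.0] * num_feats for _ in range(len(x))]
    for msg, i in zip(messages, dst):
        for f in range(num_feats):
            out[i][f] += msg[f]
    return out


def to_csr_transposed(edge_index, num_nodes):
    """Build adj_t in CSR form: row i lists the sources j of all edges j -> i."""
    src, dst = edge_index
    counts = [0] * num_nodes
    for i in dst:
        counts[i] += 1
    rowptr = [0]
    for c in counts:
        rowptr.append(rowptr[-1] + c)
    col = [0] * len(src)
    fill = rowptr[:-1].copy()  # next free slot in each row
    for j, i in zip(src, dst):
        col[fill[i]] = j
        fill[i] += 1
    return rowptr, col


def aggregate_spmm(rowptr, col, x):
    """Sum-aggregate as the sparse-dense product adj_t @ x (all edge weights 1)."""
    num_nodes = len(rowptr) - 1
    num_feats = len(x[0])
    out = [[0.0] * num_feats for _ in range(num_nodes)]
    for i in range(num_nodes):
        # Accumulate directly into row i; no per-edge message list is built.
        for k in range(rowptr[i], rowptr[i + 1]):
            j = col[k]
            for f in range(num_feats):
                out[i][f] += x[j][f]
    return out


out_edge_index = aggregate_gather_scatter(edge_index, x)
rowptr, col = to_csr_transposed(edge_index, len(x))
out_adj_t = aggregate_spmm(rowptr, col, x)
print(out_edge_index == out_adj_t)
```

Both functions return the same values. The difference is only in how the work is organized: the first builds one message per edge before reducing, while the second reduces row by row over the transposed adjacency.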
Profile results
(Sorry for the messy paste; the editor automatically reformats the table below.)
with adj_t

Name                      Self        Self %   Total      Avg        # Calls
message_and_aggregate     11.000us    0.07%    11.861ms   11.861ms   1
torch_sparse::spmm_sum    11.844ms    78.71%   11.846ms   11.846ms   1
aten::scatter_add_        1.279ms     8.50%    1.292ms    646.000us  2
aten::linear              44.000us    0.29%    900.000us  300.000us  3
dense_matmul              42.000us    0.28%    865.000us  865.000us  1
aten::matmul              19.000us    0.13%    723.000us  723.000us  1

with edge_index

Name                      Self        Self %   Total      Avg        # Calls
aten::index_select        60.884ms    54.79%   60.888ms   60.888ms   1
aten::scatter_add_        26.154ms    23.54%   26.168ms   8.723ms    3
aggregate                 17.000us    0.02%    24.854ms   24.854ms   1
message                   10.000us    0.01%    21.628ms   21.628ms   1
(I'm running the model on a GPU.)
"total" is the total running time of the model.
"propagate" is the propagate method ("def propagate").
As you can see, in "cuda total" the model runs more than 7 times faster with adj_t than with edge_index.