Hello, why does the transformation in the ipynb make matrix multiplication more cache-friendly? #13
Unanswered · mazdarx7fc3s asked this question in Q&A
Replies: 2 comments 1 reply
-
It's because the transformation improves the cache hit rate. Please see https://tvm.apache.org/docs/how_to/optimize_operators/opt_gemm.html#blocking
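For intuition, here is a minimal pure-Python sketch of the blocking idea that the linked tutorial describes (the matrix size and block factor below are illustrative assumptions, not the tutorial's TVM code). Both functions perform exactly the same multiply-adds; the blocked version just reorders them so a small tile of A, B, and C is reused many times while it is still hot in cache:

```python
def matmul_naive(A, B, n):
    """Plain triple loop: n**3 multiply-adds."""
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C


def matmul_blocked(A, B, n, bs=16):
    """Same n**3 multiply-adds, but iterated tile by tile.

    While the (io, jo, ko) tile is being processed, the bs*bs chunks of
    A, B, and C it touches fit in cache and are reused bs times each,
    instead of being evicted between uses.
    """
    C = [[0.0] * n for _ in range(n)]
    for io in range(0, n, bs):
        for jo in range(0, n, bs):
            for ko in range(0, n, bs):
                for i in range(io, min(io + bs, n)):
                    for k in range(ko, min(ko + bs, n)):
                        a = A[i][k]  # scalar held across the inner loop
                        for j in range(jo, min(jo + bs, n)):
                            C[i][j] += a * B[k][j]
    return C
```

The inner `j` loop also walks `B` and `C` row-wise (stride 1), which is the same loop-permutation trick the tutorial applies on top of blocking.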
1 reply
-
Usually, reuse of a small chunk of data helps cache-friendliness. Consider the buffer accesses under the inner loops, i.e.
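The reply is cut off here, but the reuse argument can be illustrated with a toy model: count how many distinct cache lines the innermost loop touches for each access pattern. The line size and matrix size below are assumptions for illustration (8 float64 elements per 64-byte line), not measurements:

```python
LINE = 8  # assumed elements per cache line (64-byte line, float64)


def lines_touched(indices, line=LINE):
    """Count distinct cache lines covered by a sequence of flat element indices."""
    return len({idx // line for idx in indices})


n = 64  # assumed matrix dimension, row-major layout

# Inner k loop of the naive (i, j, k) order, for fixed i=0, j=0:
a_row = [0 * n + k for k in range(n)]  # A[0][k]: stride-1 walk along a row
b_col = [k * n + 0 for k in range(n)]  # B[k][0]: stride-n walk down a column

# Inner j loop after permuting to (i, k, j), for fixed i=0, k=0:
b_row = [0 * n + j for j in range(n)]  # B[0][j]: stride-1 walk along a row

print(lines_touched(a_row))  # 8  lines for 64 accesses: each line reused 8x
print(lines_touched(b_col))  # 64 lines for 64 accesses: no reuse at all
print(lines_touched(b_row))  # 8  lines again once the loops are permuted
```

Same number of loop iterations either way; the stride-n column walk pulls in a fresh cache line on every access, while the stride-1 version reuses each line eight times.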
0 replies
-
before transformation:
after transformation:
In my view, before the transformation we need a 1024*1024*1024 loop nest, and after the transformation we still need a 1024*1024*1024 loop nest. Why does the time cost decrease so much?