Commit 5cef2fd
Back out "Do not use BNNS copy when dtypes differ in CoreML (pytorch#13018)"
Summary:
Diff D79416945 made model inference slow.
1. Old 08/01 build, runner on Mac (P1905141721):
Prefilled 18 tokens @ 250 tokens/second.
Generated 23 tokens @ 18.4 tokens/second.
2. Today's 08/14 build, runner on Mac (P1905142300):
Prefilled 18 tokens @ 36.5112 tokens/s in 493 ms.
Generated 23 tokens @ 2.25734 tokens/s in 10189 ms.
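For reference, the size of the regression can be worked out from the two runs above. The sketch below is illustrative only: the helper names `throughput` and `slowdown` are made up for this note, and the figures are copied from the quoted runs. It suggests prefill is roughly 7x slower and generation roughly 8x slower.

```python
# Throughput figures copied from the two runs quoted in the summary (tokens/s).
baseline = {"prefill": 250.0, "generate": 18.4}        # 08/01 build
regressed = {"prefill": 36.5112, "generate": 2.25734}  # 08/14 build


def throughput(tokens: int, elapsed_ms: float) -> float:
    """Tokens per second for a run that took elapsed_ms milliseconds."""
    return tokens / (elapsed_ms / 1000.0)


def slowdown(before: float, after: float) -> float:
    """How many times slower the 'after' throughput is than 'before'."""
    return before / after


# Sanity check: the 08/14 rates follow from the reported token counts and timings.
recomputed = {
    "prefill": throughput(18, 493),
    "generate": throughput(23, 10189),
}

slowdowns = {
    phase: slowdown(baseline[phase], regressed[phase]) for phase in baseline
}

for phase, factor in slowdowns.items():
    print(f"{phase}: {factor:.1f}x slower")
```

The recomputed rates agree with the reported 08/14 numbers, so the slowdown is not a reporting artifact of the new log format.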
Differential Revision: D80362730
1 parent 8e208ad commit 5cef2fd
1 file changed, 0 additions, 3 deletions (original lines 126-128 removed)