Inference speed comparison: PyTorch engine vs. TurboMind #2538
zhuchen1109 started this conversation in General
Replies: 2 comments 10 replies
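There is still a gap at the moment.
For numbers, see https://github.com/InternLM/lmdeploy/actions/runs/11051851005. RPS generally differs by about 15%, and the gap is still large for long inputs. Suggestions or thoughts on this are welcome.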
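For anyone comparing their own runs, here is a minimal sketch of how a relative RPS gap like the "about 15%" above could be computed. It assumes the gap is measured relative to the faster engine (taken here to be TurboMind). The function name `relative_gap` and the numbers are hypothetical placeholders, not results from the linked benchmark.

```python
def relative_gap(baseline_rps: float, candidate_rps: float) -> float:
    """Return how far the candidate falls short of the baseline, as a fraction of the baseline."""
    if baseline_rps <= 0:
        raise ValueError("baseline_rps must be positive")
    return (baseline_rps - candidate_rps) / baseline_rps


# Placeholder values for illustration only; replace with measured requests per second.
turbomind_rps = 20.0
pytorch_rps = 17.0

gap = relative_gap(turbomind_rps, pytorch_rps)
print(f"PyTorch engine RPS is {gap:.0%} below TurboMind")
```

A negative result would mean the candidate engine is faster than the baseline.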
One more question: are there plans to support chunked prefill?
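The new version of the PyTorch engine supports CUDA graphs, and the speedup is noticeable. Has the PyTorch engine now caught up with or surpassed TurboMind in inference speed? Is there any official comparison data for reference?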