TG speed prediction via RAM BW test #818
magikRUKKOLA started this conversation in Show and tell
It seems that token generation (TG) speed can be predicted from the RAM read bandwidth. For example, I have two machines, one with an AMD Threadripper PRO 3995wx (64C) and one with an AMD Threadripper 3445wx (12C), both with 8-channel DDR4 at 3200 MT/s.
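For reference, the theoretical ceiling for that memory configuration can be worked out by hand. The sketch below assumes a standard 64-bit (8-byte) data bus per DDR4 channel; `peak_bandwidth_gbs` is a hypothetical helper name, and real measured read bandwidth will land below this figure.

```python
def peak_bandwidth_gbs(mt_per_s: float, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    Assumes each channel moves `bus_bytes` bytes per transfer
    (8 bytes for a standard 64-bit DDR4 channel).
    """
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


# 8-channel DDR4 at 3200 MT/s, as described in the post
peak = peak_bandwidth_gbs(3200, 8)
print(f"{peak:.1f} GB/s")  # prints "204.8 GB/s"
```

This is only an upper bound to compare the measured numbers against; the prediction itself uses measured bandwidth.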
I can measure the RAM BW via the stream tool:
So, for the 64C machine it's:
And for the 12C machine it's:
Now, let's get the sweep-bench result for the 64C machine (GLM-4.6 IQ5_K):
Now, if I want to predict the TG speed for the 12C machine, I would just:
So the predicted speed is about 5.1 tps, which is very close to the actual result:
Could anyone post their ./stream and llama-sweep-bench results so we can compare how RAM bandwidth relates to TG speed?
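For anyone who wants to try this with their own numbers, the prediction step can be sketched as a simple proportional scaling. This assumes that TG speed scales linearly with measured read bandwidth for the same model and quantization. The function name `predict_tg` and the sample values are hypothetical placeholders, not the measurements from the machines above.

```python
def predict_tg(ref_tg_tps: float, ref_bw: float, target_bw: float) -> float:
    """Estimate TG speed on a target machine from a reference machine.

    Assumption: TG speed is proportional to RAM read bandwidth, so
    predicted = reference TG * (target bandwidth / reference bandwidth).
    Both bandwidths must be in the same unit (e.g. MB/s from ./stream).
    """
    if ref_bw <= 0 or target_bw <= 0:
        raise ValueError("bandwidth values must be positive")
    return ref_tg_tps * target_bw / ref_bw


# Placeholder values for illustration only (not real measurements):
# reference machine: 10.0 tps at 100000 MB/s; target machine: 50000 MB/s
example = predict_tg(ref_tg_tps=10.0, ref_bw=100_000.0, target_bw=50_000.0)
print(f"predicted TG: {example:.1f} tps")  # prints "predicted TG: 5.0 tps"
```

Substitute your own ./stream bandwidth and llama-sweep-bench TG figures; if the prediction is far off from the measured TG, the linear assumption probably does not hold for that setup.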