Hey,
I served the model using the CLI script, but it gives repetitive answers, gets stuck in a loop, and ends abruptly. In some instances it just repeats the user's prompt and gives no response at all. I tried running the model in 4-bit, 8-bit, and full-precision modes, but the issue persisted in all three.
I also tried to run the eval script on a GPU with 6 GB of VRAM, but the script has no 4-bit configuration, and adding a 4-bit configuration when loading the pretrained model gave me data mismatch errors. Could you please share a configuration of the eval script that runs in 4-bit mode, if you have one? I am using the pretrained weights because of hardware limitations.
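For context, here is a minimal sketch of the kind of 4-bit setup I mean. It assumes the eval script loads the model through Hugging Face `transformers` with `bitsandbytes`, which may not match how the script actually loads it. The helper names `weight_memory_gib` and `make_4bit_config` and the 7B parameter count are hypothetical, used only for illustration.

```python
def weight_memory_gib(num_params: float, bits: int) -> float:
    """Approximate memory needed for the weights alone, in GiB.

    Ignores activations, the KV cache, and quantization overhead,
    so real usage will be higher than this estimate.
    """
    return num_params * bits / 8 / 2**30


def make_4bit_config():
    """Build a 4-bit quantization config (assumes transformers + bitsandbytes).

    Imports are done lazily so the estimate above works without a GPU stack.
    """
    import torch
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
        bnb_4bit_compute_dtype=torch.float16,  # dtype used for matmuls
    )


if __name__ == "__main__":
    # Hypothetical 7B-parameter model; the real parameter count may differ.
    for bits in (4, 8, 16):
        print(f"{bits}-bit weights: ~{weight_memory_gib(7e9, bits):.2f} GiB")
```

With this setup, the config would be passed as `quantization_config=make_4bit_config()` to `from_pretrained`. By this estimate only the 4-bit weights of a model that size would fit in 6 GB, which is why I need a 4-bit path for the eval script.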