Open
Description
I am not sure that any pair of models works. I tried a bunch of different routers (mf always gives me a rate limit error, bert gives an authorization error, random gives the same) and a bunch of models:
ollama/llama models: all API connections failed
chatgpt models: rate limit exceeded, quota exceeded
huggingface llama models: rate limit exceeded
There have probably been some changes over the past 8 months that I am not aware of, or that are not surfaced in the tutorial.
Commands I was running (e.g. with a Hugging Face Llama model and an Anyscale Mistral model):
python3 -m routellm.openai_server --routers bert --strong-model huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct --weak-model anyscale/mistralai/Mixtral-8x7B-Instruct-v0.1
python3 -m examples.router_chat --router bert --threshold 0.11593
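One way to narrow down the authorization errors above might be to check that an API key is set for each provider before starting the server. This is a minimal sketch, not part of RouteLLM: the helper names (`provider_of`, `missing_keys`) and the mapping are hypothetical, and the environment variable names are assumptions based on common `provider/model` conventions, so they should be verified against the docs of whichever client library is actually in use. It does not address the rate limit or quota errors.

```python
import os

# Hypothetical mapping from provider prefix to the environment variable that is
# ASSUMED to hold its API key. Verify these names against your client library.
PROVIDER_KEY_VARS = {
    "huggingface": "HUGGINGFACE_API_KEY",
    "anyscale": "ANYSCALE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "ollama": None,  # local server, assumed to need no key
}


def provider_of(model: str) -> str:
    """Return the prefix of a 'provider/model' string, or 'openai' if there is none."""
    return model.split("/", 1)[0] if "/" in model else "openai"


def missing_keys(models, env=None):
    """List the assumed key variables that are unset for the given model strings."""
    env = os.environ if env is None else env
    missing = []
    for model in models:
        var = PROVIDER_KEY_VARS.get(provider_of(model))
        # Unknown providers and key-less providers are skipped.
        if var and not env.get(var) and var not in missing:
            missing.append(var)
    return missing


if __name__ == "__main__":
    # The strong and weak models from the command above.
    models = [
        "huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct",
        "anyscale/mistralai/Mixtral-8x7B-Instruct-v0.1",
    ]
    for var in missing_keys(models):
        print(f"{var} is not set")
```

If this prints nothing and the errors persist, the keys are at least present, and the problem is more likely on the provider side (invalid key, exhausted quota, or a retired endpoint).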