Is this the correct LLM inference setup #1491
-
| I am still fairly new to using LLMs locally and I would be grateful if you could let me know if I am doing it correctly. I am using Llama 2 and Mistral locally for inference for a specific function call use case (get_current_weather). This is part of my thesis where I test small base models function call performance vs finetuned vs baseline models (GPT 3.5-Turbo and GPT 4). The LLM is supposed to infer from the question asked if it should use the get_current_weather function or not. Below is how I am running inference: 
 
 system_message = "You are a helpful assistant with access to functions. Use the provided function to answer current weather questions." Unsurprisingly, the small base models don't manage to perform the function call. I just want to make sure it's truly caused by bad performance and not my me using the tool wrong. With a similar OpenAI API setup I managed to get decent and good results. Thanks for your input :) | 
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 1 reply
-
| You won't get any function call responses from this model, mainly because it does not have a function calling capable chat template (and you're using  Additionally you are doing a couple of things wrong: 
 {
    "type": "function",
    "function": {
        "name": "get_current_weather"
    }
} | 
Beta Was this translation helpful? Give feedback.
You won't get any function call responses from this model, mainly because it does not have a function calling capable chat template (and you're using
chat_format="llama-2"anyway, which will ignore the chat template), if you want to use thetoolsparameter you have to use a model with the correct chat template (like this one) or a function calling chat format that the model supports.Additionally you are doing a couple of things wrong:
tools=tools, you are passing a single function instead, that won't workassistant_responseis set wrong it should be set toresult["choices"][0]["text"]tool_choice="auto"actually does nothing with regular chat templates (which might be what …