Model response format from custom chat completion service to ensure that functions are auto invoked by SK #6389
-
Hi Team,

We have a custom chat completion service that calls the model and returns the response. The service also handles other tasks, such as load balancing and content checks. Currently the streaming response returned by this service is not interpreted by SK as a trigger to call functions automatically, and it seems the issue is with our service's response format. Any idea how we can produce the format SK expects so that functions are invoked automatically? We are using SK version 1.9.

Below is a code snippet. If I use an Azure OpenAI model, it works fine; with our custom service it does not, which must be due to the response format returned by the service itself. Any pointers for identifying the expected format so that auto function invocation happens?

```csharp
Kernel kernel = kernelBuilder.Build();
kernel.ImportPluginFromObject(new WeatherSearchPlugin(), nameof(WeatherSearchPlugin));

var stream = chatCompletion.GetStreamingChatMessageContentsAsync(/* ... */);
var content = "";
// try { ... } (rest of snippet truncated in the original post)

public class WeatherSearchPlugin
{
    // ...
}
```

The response returned by the model is in the following format:
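For reference, the SK OpenAI connector recognizes tool calls in a streamed response when it follows the OpenAI chat-completion SSE chunk shape: tool calls arrive incrementally in `delta.tool_calls`, and the final chunk carries `finish_reason: "tool_calls"`. A hedged sketch of that wire format (the id, function name, and argument values below are illustrative, not taken from the original post):

```json
{"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","tool_calls":[{"index":0,"id":"call_abc123","type":"function","function":{"name":"WeatherSearchPlugin-GetWeather","arguments":""}}]},"finish_reason":null}]}
{"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"city\":\"Seattle\"}"}}]},"finish_reason":null}]}
{"object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}
```

Each line here is the payload of one `data:` SSE event; the stream ends with `data: [DONE]`.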
Replies: 4 comments
-
Hi @ddramireddy!

Non-streaming:

Streaming:

Looking at the model response you provided, it looks like a response for the non-streaming scenario. You could try getting the non-streaming chat message content and see if your function is called. If you still want to use streaming, your custom model should return a response in the same format as shown above for the streaming scenario. Please let me know if that helps!
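To try the non-streaming path suggested here, something like the following should work with the SK 1.9 OpenAI connector (a sketch reusing `chatCompletion`, `chatHistory`, and `kernel` from the original snippet; `ToolCallBehavior.AutoInvokeKernelFunctions` is what enables automatic invocation):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var settings = new OpenAIPromptExecutionSettings
{
    // Ask the connector to execute any tool calls it finds in the response.
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

// Non-streaming: the connector sees the complete tool_calls payload at once,
// which makes it easier to verify whether your service's format is the problem.
var result = await chatCompletion.GetChatMessageContentAsync(chatHistory, settings, kernel);
Console.WriteLine(result.Content);
```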
-
Hi @dmytrostruk, thank you for the quick response. It seems I misunderstood: I thought the Semantic Kernel framework itself called the functions automatically based on the model response, but it appears each connector has to parse the model response and, if there are any tool calls, execute those functions. I can see the OpenAI connector invokes functions based on the `tool_calls` in the response. Is there a reason this is not abstracted into the Semantic Kernel framework itself, instead of being delegated to the individual connectors/services?
-
@ddramireddy That's correct, but SK also has a mode where it asks the LLM which functions to call without calling them automatically; instead, it returns the function information to you so you can invoke them manually. This file contains examples of both automatic and manual function calling:
https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/Concepts/AutoFunctionCalling/OpenAI_FunctionCalling.cs
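The manual mode described above can be sketched roughly as follows (based on the SK 1.x OpenAI connector; type and helper names such as `TryGetFunctionAndArguments` reflect my reading of that sample, and error handling is omitted):

```csharp
var settings = new OpenAIPromptExecutionSettings
{
    // Ask the model which functions to call, but do not invoke them automatically.
    ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions
};

var result = (OpenAIChatMessageContent)await chatCompletion
    .GetChatMessageContentAsync(chatHistory, settings, kernel);

chatHistory.Add(result);

foreach (var toolCall in result.ToolCalls.OfType<ChatCompletionsFunctionToolCall>())
{
    // Resolve each tool call to a registered kernel function and invoke it yourself.
    if (kernel.Plugins.TryGetFunctionAndArguments(toolCall, out var function, out var arguments))
    {
        var functionResult = await function.InvokeAsync(kernel, arguments);
        chatHistory.Add(new ChatMessageContent(AuthorRole.Tool, functionResult.ToString()));
    }
}
```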
Great question! This work is currently in progress: |
-
Wow, great to know this. :) If this change lands in the next few weeks, we can wait for it, and it will reduce some of our development effort. Thank you very much @dmytrostruk