.NET SDK: Slow performance on Chat Completions API with Azure Open AI #7437
Hello, We are using the Semantic Kernel C# SDK to interact with Azure OpenAI. In one of the steps of our flow, we call the Chat Completions API in the following manner. Adding the Kernel:
Call to Chat Completions API:
The call to ChatMessageContent messageContent = await chatCompletion.GetChatMessageContentAsync( takes between 10 and 15 s. In an earlier implementation, we call the Chat Completions REST API directly, and a similar call takes between 2 and 3 s. With that approach we have full control over the endpoint and the version of that endpoint we use. With the Semantic Kernel SDK, we are facing this performance problem. The Semantic Kernel SDK version is 1.16.1. I have tested the API both on localhost on my DEV box and deployed to Azure on an App Service, using an App Service Plan with 1 instance only, and the problem happens in both the DEV and Azure deployments. Any help is appreciated! Thanks
@miguelisidoro Thanks for reporting this problem! Would it be possible for you to run Debug > Performance Profiler with Instrumentation enabled in Visual Studio? https://learn.microsoft.com/en-us/visualstudio/profiling/profiling-feature-tour?view=vs-2022#instrumentation This should output a breakdown for each C# function, with its execution time and number of calls. With this view, you should be able to see exactly where the delay in execution occurs.
It would also be interesting to compare both approaches using the same deployment and model, so that the results are more accurate. Things like a different deployment or a different AI model can affect performance as well.
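If the profiler output is hard to read, a lighter way to compare the two code paths is to time each call with a Stopwatch. The following is only a minimal sketch: LatencyProbe, TimeAsync and SampleAsync are hypothetical helper names, not part of Semantic Kernel or the Azure SDK, and it measures wall-clock time only, so it shows how long a call takes but not where inside the SDK the time is spent.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of Semantic Kernel.
public static class LatencyProbe
{
    // Runs the async call once and returns its result together with the elapsed wall-clock time.
    public static async Task<(T Result, TimeSpan Elapsed)> TimeAsync<T>(Func<Task<T>> call)
    {
        var stopwatch = Stopwatch.StartNew();
        T result = await call().ConfigureAwait(false);
        stopwatch.Stop();
        return (result, stopwatch.Elapsed);
    }

    // Runs the call several times and returns one duration per run, so a slow
    // first call can be told apart from the latency of the following calls.
    public static async Task<TimeSpan[]> SampleAsync<T>(Func<Task<T>> call, int runs)
    {
        if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));
        var samples = new TimeSpan[runs];
        for (int i = 0; i < runs; i++)
        {
            var (_, elapsed) = await TimeAsync(call).ConfigureAwait(false);
            samples[i] = elapsed;
        }
        return samples;
    }
}
```

To use it, wrap the existing call in a lambda, for example LatencyProbe.TimeAsync(() => chatCompletion.GetChatMessageContentAsync(...)) with the same arguments as the original call, and do the same for the direct REST call. Sending the same prompt to the same deployment in both cases keeps the two measurements comparable.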
Hello @dmytrostruk, Regarding instrumentation, what I could gather basically doesn't help. Any other suggestions? Regarding your other suggestion about deployments and AI models, I will run some tests. Thanks
Hello @dmytrostruk, Problem solved. Thanks