diff --git a/README.md b/README.md index 74984f7f..4d928d33 100644 --- a/README.md +++ b/README.md @@ -7,6 +7,7 @@ languages: - bicep products: - ai-services +- azure-signalr - azure-blob-storage - azure-container-apps - azure-cognitive-search @@ -57,11 +58,11 @@ description: A csharp sample app that chats with your data using OpenAI and AI S [![Open in GitHub - Codespaces](https://img.shields.io/static/v1?style=for-the-badge&label=GitHub+Codespaces&message=Open&color=brightgreen&logo=github)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=624102171&machine=standardLinux32gb&devcontainer_path=.devcontainer%2Fdevcontainer.json&location=WestUs2) [![Open in Remote - Containers](https://img.shields.io/static/v1?style=for-the-badge&label=Remote%20-%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/azure-samples/azure-search-openai-demo-csharp) -This sample demonstrates a few approaches for creating ChatGPT-like experiences over your own data using the Retrieval Augmented Generation pattern. It uses Azure OpenAI Service to access the ChatGPT model (`gpt-4o-mini`), and Azure AI Search for data indexing and retrieval. +This sample demonstrates a few approaches for creating ChatGPT-like experiences over your own data using the Retrieval Augmented Generation pattern. It uses Azure OpenAI Service to access the ChatGPT model (`gpt-4o-mini`), and Azure AI Search for data indexing and retrieval, and Azure SignalR Service for real-time streaming responses. The repo includes sample data so it's ready to try end-to-end. In this sample application, we use a fictitious company called Contoso Electronics, and the experience allows its employees to ask questions about the benefits, internal policies, as well as job descriptions and roles. -![RAG Architecture](docs/appcomponents.png) +![RAG Architecture](docs/appcomponents-signalr.png) For more details on how this application was built, check out: @@ -75,6 +76,7 @@ We want to hear from you! Are you interested in building or currently building i ## Features - Voice Chat, Chat and Q&A interfaces +- Real-time streaming responses using Azure SignalR Service - Explores various options to help users evaluate the trustworthiness of responses with citations, tracking of source content, etc. - Shows possible approaches for data preparation, prompt construction, and orchestration of interaction between model (ChatGPT) and retriever (Azure AI Search) - Settings directly in the UX to tweak the behavior and experiment with options @@ -83,10 +85,11 @@ We want to hear from you! Are you interested in building or currently building i ## Application architecture -- **User interface** - The application’s chat interface is a [Blazor WebAssembly](https://learn.microsoft.com/aspnet/core/blazor/) application. This interface is what accepts user queries, routes request to the application backend, and displays generated responses. +- **User interface** - The application's chat interface is a [Blazor WebAssembly](https://learn.microsoft.com/aspnet/core/blazor/) application. This interface is what accepts user queries, routes request to the application backend, and displays generated responses. - **Backend** - The application backend is an [ASP.NET Core Minimal API](https://learn.microsoft.com/aspnet/core/fundamentals/minimal-apis/overview). The backend hosts the Blazor static web application and what orchestrates the interactions among the different services. Services used in this application include: - [**Azure AI Search**](https://learn.microsoft.com/azure/search/search-what-is-azure-search) – indexes documents from the data stored in an Azure Storage Account. This makes the documents searchable using [vector search](https://learn.microsoft.com/azure/search/search-get-started-vector) capabilities. - [**Azure OpenAI Service**](https://learn.microsoft.com/azure/ai-services/openai/overview) – provides the Large Language Models to generate responses. [Semantic Kernel](https://learn.microsoft.com/semantic-kernel/whatissk) is used in conjunction with the Azure OpenAI Service to orchestrate the more complex AI workflows. + - [**Azure SignalR Service**](https://learn.microsoft.com/azure/azure-signalr/signalr-overview) - enables real-time streaming of AI responses to the client application. ## Getting Started @@ -108,8 +111,9 @@ Pricing varies per region and usage, so it isn't possible to predict exact costs - [**Azure OpenAI Service**](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). Standard tier, GPT and Ada models. Pricing per 1K tokens used, and at least 1K tokens are used per question. - [**Azure AI Document Intelligence**](https://azure.microsoft.com/pricing/details/ai-document-intelligence/). SO (Standard) tier using pre-built layout. Pricing per document page, sample documents have 261 pages total. - [**Azure AI Search**](https://azure.microsoft.com/pricing/details/search/) Basic tier, 1 replica, free level of semantic search. Pricing per hour. -- [**Azure Blob Storage**](https://azure.microsoft.com/pricing/details/storage/blobs/). Standard tier with ZRS (Zone-redundant storage). Pricing per storage and read operations. +- [**Azure Blob Storage**](https://azure.microsoft.com/pricing/details/storage/blobs/). Standard tier with ZRS (Zone-redundant storage). Pricing per storage and read operations. - [**Azure Monitor**](https://azure.microsoft.com/pricing/details/monitor/). Pay-as-you-go tier. Costs based on data ingested. +- [**Azure SignalR Service**](https://azure.microsoft.com/pricing/details/signalr-service/). Premium tier with 1 unit. Pricing per unit per hour. To reduce costs, you can switch to free SKUs for various services, but those SKUs have limitations. See this [guide on deploying with minimal costs](./docs/deploy_lowcost.md) for more details. @@ -374,6 +378,7 @@ to production. Here are some things to consider: ### Resources - [Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and Azure AI Search](https://aka.ms/entgptsearchblog) +- [Azure SignalR Service](https://learn.microsoft.com/azure/azure-signalr/signalr-overview) - [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search) - [Azure OpenAI Service](https://learn.microsoft.com/azure/cognitive-services/openai/overview) - [`Azure.AI.OpenAI` NuGet package](https://www.nuget.org/packages/Azure.AI.OpenAI) diff --git a/app/Directory.Packages.props b/app/Directory.Packages.props index 051efaf3..dd17fa18 100644 --- a/app/Directory.Packages.props +++ b/app/Directory.Packages.props @@ -52,5 +52,7 @@ + + \ No newline at end of file diff --git a/app/SharedWebComponents/Components/Answer.razor b/app/SharedWebComponents/Components/Answer.razor index 106f4e65..ced0cbce 100644 --- a/app/SharedWebComponents/Components/Answer.razor +++ b/app/SharedWebComponents/Components/Answer.razor @@ -28,11 +28,11 @@ } } - @if (answer is { FollowupQuestions.Count: > 0 }) + @if (Retort?.Context?.FollowupQuestions is { Length: > 0 }) {
Follow-up questions: - @foreach (var followup in answer.FollowupQuestions) + @foreach (var followup in Retort.Context.FollowupQuestions) { diff --git a/app/SharedWebComponents/Components/SettingsPanel.razor b/app/SharedWebComponents/Components/SettingsPanel.razor index 0f8b7a7f..8e933716 100644 --- a/app/SharedWebComponents/Components/SettingsPanel.razor +++ b/app/SharedWebComponents/Components/SettingsPanel.razor @@ -41,6 +41,10 @@ Color="Color.Primary" Label="Use query-contextual summaries instead of whole documents" /> + + diff --git a/app/SharedWebComponents/Models/StreamingMessage.cs b/app/SharedWebComponents/Models/StreamingMessage.cs new file mode 100644 index 00000000..31f93b44 --- /dev/null +++ b/app/SharedWebComponents/Models/StreamingMessage.cs @@ -0,0 +1,9 @@ +// Copyright (c) Microsoft. All rights reserved. + +namespace SharedWebComponents.Models; + +internal class StreamingMessage +{ + public string Type { get; set; } = ""; + public object? Content { get; set; } +} \ No newline at end of file diff --git a/app/SharedWebComponents/Pages/Chat.razor b/app/SharedWebComponents/Pages/Chat.razor index a36b4467..a061f659 100644 --- a/app/SharedWebComponents/Pages/Chat.razor +++ b/app/SharedWebComponents/Pages/Chat.razor @@ -88,6 +88,10 @@ OnClick="@OnClearChat" Disabled=@(_isReceivingResponse || _questionAndAnswerMap is { Count: 0 }) /> + + + diff --git a/app/SharedWebComponents/Pages/Chat.razor.cs b/app/SharedWebComponents/Pages/Chat.razor.cs index 8211087a..40bf7b53 100644 --- a/app/SharedWebComponents/Pages/Chat.razor.cs +++ b/app/SharedWebComponents/Pages/Chat.razor.cs @@ -1,13 +1,19 @@ // Copyright (c) Microsoft. All rights reserved. namespace SharedWebComponents.Pages; +using Microsoft.AspNetCore.SignalR.Client; +using System.Text.Json; +using SharedWebComponents.Models; -public sealed partial class Chat +public sealed partial class Chat : IAsyncDisposable { private string _userQuestion = ""; private UserQuestion _currentQuestion; private string _lastReferenceQuestion = ""; private bool _isReceivingResponse = false; + private bool _useStreaming = true; + private HubConnection? _hubConnection; + private string _streamingResponse = ""; private readonly Dictionary _questionAndAnswerMap = []; @@ -15,16 +21,153 @@ public sealed partial class Chat [Inject] public required ApiClient ApiClient { get; set; } + [Inject] public required NavigationManager NavigationManager { get; set; } + [CascadingParameter(Name = nameof(Settings))] public required RequestSettingsOverrides Settings { get; set; } [CascadingParameter(Name = nameof(IsReversed))] public required bool IsReversed { get; set; } - private Task OnAskQuestionAsync(string question) + protected override async Task OnInitializedAsync() { - _userQuestion = question; - return OnAskClickedAsync(); + await ConnectToHub(); + } + + private async Task ConnectToHub() + { + if (_hubConnection?.State == HubConnectionState.Connected) + { + return; + } + + _hubConnection = new HubConnectionBuilder() + .WithUrl(NavigationManager.ToAbsoluteUri("/chat-hub")) + .WithAutomaticReconnect(new[] { TimeSpan.Zero, TimeSpan.FromSeconds(2), TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(30) }) + .Build(); + + _hubConnection.On("ReceiveMessage", (message) => + { + try + { + if (_currentQuestion != default) + { + var streamingMessage = JsonSerializer.Deserialize(message); + if (streamingMessage?.Content == null) return; + + switch (streamingMessage.Type.ToLowerInvariant()) + { + case "content": + _streamingResponse += streamingMessage.Content; + UpdateAnswerInMap(_streamingResponse); + break; + + case "answer": + if (streamingMessage.Content is JsonElement answerElement) + { + var answer = answerElement.GetString(); + if (answer != null) + { + _streamingResponse = answer; + UpdateAnswerInMap(answer); + } + } + else if (streamingMessage.Content is string answerString) + { + _streamingResponse = answerString; + UpdateAnswerInMap(answerString); + } + break; + + case "thoughts": + if (streamingMessage.Content is JsonElement thoughtsElement) + { + var thoughts = thoughtsElement.EnumerateArray() + .Select(t => new Thoughts( + t.GetProperty("Title").GetString()!, + t.GetProperty("Description").GetString()!)) + .ToArray(); + UpdateThoughtsInMap(thoughts); + } + break; + + case "followup": + if (streamingMessage.Content is JsonElement followupElement) + { + var followupQuestions = followupElement.ValueKind == JsonValueKind.Array + ? followupElement.EnumerateArray() + .Select(q => q.GetString() ?? string.Empty) + .Where(q => !string.IsNullOrEmpty(q)) + .ToArray() + : new[] { followupElement.GetString() ?? string.Empty }; + + if (followupQuestions.Any()) + { + UpdateFollowupQuestionsInMap(followupQuestions); + } + } + break; + + case "supporting": + if (streamingMessage.Content is JsonElement supportingElement) + { + var supportingContent = supportingElement.EnumerateArray() + .Select(s => new SupportingContentRecord( + s.GetProperty("Title").GetString()!, + s.GetProperty("Description").GetString()! + )) + .ToArray(); + + if (supportingContent.Any()) + { + UpdateSupportingContentInMap(supportingContent); + } + } + break; + + case "images": + if (streamingMessage.Content is JsonElement imagesElement) + { + var images = imagesElement.EnumerateArray() + .Select(i => new SupportingImageRecord( + i.GetProperty("Title").GetString()!, + i.GetProperty("Url").GetString()!)) + .ToArray(); + + if (images.Any()) + { + UpdateImagesInMap(images); + } + } + break; + + case "complete": + if (streamingMessage.Content is JsonElement completeElement) + { + var citationBaseUrl = completeElement.GetProperty("citationBaseUrl").GetString(); + UpdateAnswerInMap(_streamingResponse, citationBaseUrl); + _userQuestion = ""; + _currentQuestion = default; + } + break; + } + StateHasChanged(); + } + } + catch (JsonException ex) + { + Console.WriteLine($"Error deserializing response: {ex.Message}"); + } + }); + + try + { + await _hubConnection.StartAsync(); + } + catch (Exception ex) + { + Console.WriteLine($"Error starting SignalR connection: {ex.Message}"); + } } private async Task OnAskClickedAsync() @@ -38,24 +181,47 @@ private async Task OnAskClickedAsync() _lastReferenceQuestion = _userQuestion; _currentQuestion = new(_userQuestion, DateTime.Now); _questionAndAnswerMap[_currentQuestion] = null; + _streamingResponse = ""; try { var history = _questionAndAnswerMap - .Where(x => x.Value?.Choices is { Length: > 0}) - .SelectMany(x => new ChatMessage[] { new ChatMessage("user", x.Key.Question), new ChatMessage("assistant", x.Value!.Choices[0].Message.Content) }) + .Where(x => x.Value?.Choices is { Length: > 0 }) + .SelectMany(x => new ChatMessage[] { + new ChatMessage("user", x.Key.Question), + new ChatMessage("assistant", x.Value!.Choices[0].Message.Content) + }) .ToList(); history.Add(new ChatMessage("user", _userQuestion)); var request = new ChatRequest([.. history], Settings.Overrides); - var result = await ApiClient.ChatConversationAsync(request); - _questionAndAnswerMap[_currentQuestion] = result.Response; - if (result.IsSuccessful) + if (_useStreaming && _hubConnection?.State == HubConnectionState.Connected) { - _userQuestion = ""; - _currentQuestion = default; + try + { + await _hubConnection.InvokeAsync("SendChatRequest", request); + } + catch (Exception ex) + { + _questionAndAnswerMap[_currentQuestion] = new ChatAppResponseOrError( + Array.Empty(), + $"Error: {ex.Message}"); + _userQuestion = ""; + _currentQuestion = default; + } + } + else + { + var result = await ApiClient.ChatConversationAsync(request); + _questionAndAnswerMap[_currentQuestion] = result.Response; + + if (_questionAndAnswerMap[_currentQuestion]?.Error == null) + { + _userQuestion = ""; + _currentQuestion = default; + } } } finally @@ -70,4 +236,101 @@ private void OnClearChat() _currentQuestion = default; _questionAndAnswerMap.Clear(); } -} + + public async ValueTask DisposeAsync() + { + if (_hubConnection is not null) + { + await _hubConnection.DisposeAsync(); + } + } + + private async Task OnAskQuestionAsync(string question) + { + _userQuestion = question; + await OnAskClickedAsync(); + } + + private void UpdateAnswerInMap(string answer, string? citationBaseUrl = null) + { + var currentResponse = _questionAndAnswerMap[_currentQuestion]; + var choice = currentResponse?.Choices.FirstOrDefault(); + var context = choice?.Context ?? new ResponseContext(null, null, Array.Empty(), Array.Empty()); + + _questionAndAnswerMap[_currentQuestion] = new ChatAppResponseOrError(new[] { + new ResponseChoice( + Index: 0, + Message: new ResponseMessage("assistant", answer), + Context: context, + CitationBaseUrl: citationBaseUrl ?? choice?.CitationBaseUrl ?? "") + }); + } + + private void UpdateThoughtsInMap(Thoughts[] thoughts) + { + var currentResponse = _questionAndAnswerMap[_currentQuestion]; + if (currentResponse?.Choices.FirstOrDefault() is { } choice) + { + var context = new ResponseContext( + choice.Context.DataPointsContent, + choice.Context.DataPointsImages, + choice.Context.FollowupQuestions, + thoughts); + + _questionAndAnswerMap[_currentQuestion] = new ChatAppResponseOrError(new[] { + choice with { Context = context } + }); + } + } + + private void UpdateFollowupQuestionsInMap(string[] followupQuestions) + { + var currentResponse = _questionAndAnswerMap[_currentQuestion]; + if (currentResponse?.Choices.FirstOrDefault() is { } choice) + { + var context = new ResponseContext( + choice.Context.DataPointsContent, + choice.Context.DataPointsImages, + followupQuestions, + choice.Context.Thoughts); + + _questionAndAnswerMap[_currentQuestion] = new ChatAppResponseOrError(new[] { + choice with { Context = context } + }); + } + } + + private void UpdateSupportingContentInMap(SupportingContentRecord[] supportingContent) + { + var currentResponse = _questionAndAnswerMap[_currentQuestion]; + if (currentResponse?.Choices.FirstOrDefault() is { } choice) + { + var context = new ResponseContext( + supportingContent, + choice.Context.DataPointsImages, + choice.Context.FollowupQuestions, + choice.Context.Thoughts); + + _questionAndAnswerMap[_currentQuestion] = new ChatAppResponseOrError(new[] { + choice with { Context = context } + }); + } + } + + private void UpdateImagesInMap(SupportingImageRecord[] images) + { + var currentResponse = _questionAndAnswerMap[_currentQuestion]; + if (currentResponse?.Choices.FirstOrDefault() is { } choice) + { + var context = new ResponseContext( + choice.Context.DataPointsContent, + images, + choice.Context.FollowupQuestions, + choice.Context.Thoughts); + + _questionAndAnswerMap[_currentQuestion] = new ChatAppResponseOrError(new[] { + choice with { Context = context } + }); + } + } +} \ No newline at end of file diff --git a/app/SharedWebComponents/SharedWebComponents.csproj b/app/SharedWebComponents/SharedWebComponents.csproj index e979cb4e..9077cbd0 100644 --- a/app/SharedWebComponents/SharedWebComponents.csproj +++ b/app/SharedWebComponents/SharedWebComponents.csproj @@ -20,6 +20,7 @@ + diff --git a/app/backend/Extensions/ServiceCollectionExtensions.cs b/app/backend/Extensions/ServiceCollectionExtensions.cs index 7b5f8ff9..2738fe76 100644 --- a/app/backend/Extensions/ServiceCollectionExtensions.cs +++ b/app/backend/Extensions/ServiceCollectionExtensions.cs @@ -1,5 +1,8 @@ // Copyright (c) Microsoft. All rights reserved. +using Microsoft.AspNetCore.SignalR; +using MinimalApi.Hubs; + namespace MinimalApi.Extensions; internal static class ServiceCollectionExtensions @@ -82,6 +85,8 @@ internal static IServiceCollection AddAzureServices(this IServiceCollection serv var useVision = config["UseVision"] == "true"; var openAIClient = sp.GetRequiredService(); var searchClient = sp.GetRequiredService(); + var hubContext = sp.GetRequiredService>(); + if (useVision) { var azureComputerVisionServiceEndpoint = config["AzureComputerVisionServiceEndpoint"]; @@ -89,11 +94,22 @@ internal static IServiceCollection AddAzureServices(this IServiceCollection serv var httpClient = sp.GetRequiredService().CreateClient(); var visionService = new AzureComputerVisionService(httpClient, azureComputerVisionServiceEndpoint, s_azureCredential); - return new ReadRetrieveReadChatService(searchClient, openAIClient, config, visionService, s_azureCredential); + return new ReadRetrieveReadChatService( + searchClient, + openAIClient, + config, + hubContext, + visionService, + s_azureCredential); } else { - return new ReadRetrieveReadChatService(searchClient, openAIClient, config, tokenCredential: s_azureCredential); + return new ReadRetrieveReadChatService( + searchClient, + openAIClient, + config, + hubContext, + tokenCredential: s_azureCredential); } }); diff --git a/app/backend/Extensions/WebApplicationExtensions.cs b/app/backend/Extensions/WebApplicationExtensions.cs index 64464f52..f1e86f07 100644 --- a/app/backend/Extensions/WebApplicationExtensions.cs +++ b/app/backend/Extensions/WebApplicationExtensions.cs @@ -1,5 +1,7 @@ // Copyright (c) Microsoft. All rights reserved. +using MinimalApi.Hubs; + namespace MinimalApi.Extensions; internal static class WebApplicationExtensions @@ -25,6 +27,9 @@ internal static WebApplication MapApi(this WebApplication app) api.MapGet("enableLogout", OnGetEnableLogout); + // Only need to map the Hub + app.MapHub(ChatHub.HubUrl); + return app; } diff --git a/app/backend/Hubs/ChatHub.cs b/app/backend/Hubs/ChatHub.cs new file mode 100644 index 00000000..5e1fdfef --- /dev/null +++ b/app/backend/Hubs/ChatHub.cs @@ -0,0 +1,23 @@ +using Microsoft.AspNetCore.SignalR; +using System; + +namespace MinimalApi.Hubs; + +public class ChatHub : Hub +{ + public const string HubUrl = "/chat-hub"; + private readonly ReadRetrieveReadChatService _chatService; + + public ChatHub(ReadRetrieveReadChatService chatService) + { + _chatService = chatService ?? throw new ArgumentNullException(nameof(chatService)); + } + + public async Task SendChatRequest(ChatRequest request) + { + await _chatService.ReplyStreamingAsync( + request.History, + request.Overrides, + Context.ConnectionId); + } +} \ No newline at end of file diff --git a/app/backend/MinimalApi.csproj b/app/backend/MinimalApi.csproj index 04783d07..f5061190 100644 --- a/app/backend/MinimalApi.csproj +++ b/app/backend/MinimalApi.csproj @@ -27,6 +27,7 @@ + diff --git a/app/backend/Program.cs b/app/backend/Program.cs index ecada14c..a53b1549 100644 --- a/app/backend/Program.cs +++ b/app/backend/Program.cs @@ -1,6 +1,9 @@ // Copyright (c) Microsoft. All rights reserved. using Microsoft.AspNetCore.Antiforgery; +using Microsoft.AspNetCore.SignalR; +using Azure.Identity; +using Microsoft.Azure.SignalR; var builder = WebApplication.CreateBuilder(args); @@ -16,6 +19,29 @@ builder.Services.AddAzureServices(); builder.Services.AddAntiforgery(options => { options.HeaderName = "X-CSRF-TOKEN-HEADER"; options.FormFieldName = "X-CSRF-TOKEN-FORM"; }); builder.Services.AddHttpClient(); +builder.Services.AddSignalR().AddAzureSignalR(options => +{ + static string? GetEnvVar(string key) => Environment.GetEnvironmentVariable(key); + + // Try to get endpoint and client ID first + var endpoint = GetEnvVar("AZURE_SIGNALR_ENDPOINT"); + var clientId = GetEnvVar("AZURE_CLIENT_ID"); + + if (endpoint != null && clientId != null) + { + options.Endpoints = new[] + { + new ServiceEndpoint(new Uri(endpoint), new ManagedIdentityCredential(clientId)) + }; + } + else + { + // Fall back to connection string + var connectionString = GetEnvVar("AZURE_SIGNALR_CONNECTION_STRING") + ?? throw new InvalidOperationException("Neither managed identity credentials nor connection string are configured for Azure SignalR"); + options.ConnectionString = connectionString; + } +}); if (builder.Environment.IsDevelopment()) { diff --git a/app/backend/Services/ReadRetrieveReadChatService.cs b/app/backend/Services/ReadRetrieveReadChatService.cs index 7c72dcd9..82ff3562 100644 --- a/app/backend/Services/ReadRetrieveReadChatService.cs +++ b/app/backend/Services/ReadRetrieveReadChatService.cs @@ -1,9 +1,13 @@ // Copyright (c) Microsoft. All rights reserved. using Azure.Core; +using Microsoft.AspNetCore.SignalR; using Microsoft.SemanticKernel.ChatCompletion; using Microsoft.SemanticKernel.Connectors.OpenAI; using Microsoft.SemanticKernel.Embeddings; +using MinimalApi.Hubs; +using System.Text; +using System.Text.RegularExpressions; namespace MinimalApi.Services; #pragma warning disable SKEXP0011 // Mark members as static @@ -15,11 +19,17 @@ public class ReadRetrieveReadChatService private readonly IConfiguration _configuration; private readonly IComputerVisionService? _visionService; private readonly TokenCredential? _tokenCredential; + private readonly IHubContext? _hubContext; + private record StreamingMessage(string Type, T Content); + + private string? _currentAnswer = null; + private string? _currentThoughts = null; public ReadRetrieveReadChatService( ISearchService searchClient, OpenAIClient client, IConfiguration configuration, + IHubContext? hubContext = null, IComputerVisionService? visionService = null, TokenCredential? tokenCredential = null) { @@ -54,6 +64,7 @@ public ReadRetrieveReadChatService( _configuration = configuration; _visionService = visionService; _tokenCredential = tokenCredential; + _hubContext = hubContext; } public async Task ReplyAsync( @@ -110,7 +121,7 @@ standard plan AND dental AND employee benefit. } else { - documentContents = string.Join("\r", documentContentList.Select(x =>$"{x.Title}:{x.Content}")); + documentContents = string.Join("\r", documentContentList.Select(x => $"{x.Title}:{x.Content}")); } // step 2.5 @@ -140,7 +151,6 @@ standard plan AND dental AND employee benefit. } } - if (images != null) { var prompt = @$"## Source ## @@ -244,4 +254,328 @@ You answer needs to be a json object with the following format. return new ChatAppResponse(new[] { choice }); } + + public async Task ReplyStreamingAsync( + ChatMessage[] history, + RequestOverrides? overrides, + string? connectionId = null, + CancellationToken cancellationToken = default) + { + if (_hubContext is null) + { + throw new InvalidOperationException("HubContext is required for streaming response"); + } + + if (string.IsNullOrEmpty(connectionId)) + { + throw new ArgumentException("ConnectionId is required for streaming response", nameof(connectionId)); + } + + var chat = _kernel.GetRequiredService(); + var embedding = _kernel.GetRequiredService(); + float[]? embeddings = null; + var question = history.LastOrDefault(m => m.IsUser)?.Content is { } userQuestion + ? userQuestion + : throw new InvalidOperationException("Use question is null"); + + if (overrides?.RetrievalMode != RetrievalMode.Text && embedding is not null) + { + embeddings = (await embedding.GenerateEmbeddingAsync(question, cancellationToken: cancellationToken)).ToArray(); + } + + // step 1 + // use llm to get query if retrieval mode is not vector + string? query = null; + if (overrides?.RetrievalMode != RetrievalMode.Vector) + { + var getQueryChat = new ChatHistory(@"You are a helpful AI assistant, generate search query for followup question. +Make your respond simple and precise. Return the query only, do not return any other text. +e.g. +Northwind Health Plus AND standard plan. +standard plan AND dental AND employee benefit. +"); + + getQueryChat.AddUserMessage(question); + var result = await chat.GetChatMessageContentAsync( + getQueryChat, + cancellationToken: cancellationToken); + + query = result.Content ?? throw new InvalidOperationException("Failed to get search query"); + } + + // step 2 + // use query to search related docs + var documentContentList = await _searchClient.QueryDocumentsAsync(query, embeddings, overrides, cancellationToken); + + string documentContents = string.Empty; + if (documentContentList.Length == 0) + { + documentContents = "no source available."; + } + else + { + documentContents = string.Join("\r", documentContentList.Select(x => $"{x.Title}:{x.Content}")); + } + + // step 2.5 + // retrieve images if _visionService is available + SupportingImageRecord[]? images = default; + if (_visionService is not null) + { + var queryEmbeddings = await _visionService.VectorizeTextAsync(query ?? question, cancellationToken); + images = await _searchClient.QueryImagesAsync(query, queryEmbeddings.vector, overrides, cancellationToken); + } + + // step 3 + // put together related docs and conversation history to generate answer + var answerChat = new ChatHistory( + "You are a system assistant who helps the company employees with their questions. Be brief in your answers"); + + // add chat history + foreach (var message in history) + { + if (message.IsUser) + { + answerChat.AddUserMessage(message.Content); + } + else + { + answerChat.AddAssistantMessage(message.Content); + } + } + + if (images != null) + { + var prompt = @$"## Source ## +{documentContents} +## End ## + +Answer question based on available source and images. +Respond in the following format: +[ANSWER START] +Your answer here. If no source available, say I don't know. +[ANSWER END] + +[THOUGHTS START] +Brief thoughts on how you came up with the answer, e.g. what sources you used, what you thought about, etc. +[THOUGHTS END]"; + + var tokenRequestContext = new TokenRequestContext(new[] { "https://storage.azure.com/.default" }); + var sasToken = await (_tokenCredential?.GetTokenAsync(tokenRequestContext, cancellationToken) ?? throw new InvalidOperationException("Failed to get token")); + var sasTokenString = sasToken.Token; + var imageUrls = images.Select(x => $"{x.Url}?{sasTokenString}").ToArray(); + var collection = new ChatMessageContentItemCollection(); + collection.Add(new TextContent(prompt)); + foreach (var imageUrl in imageUrls) + { + collection.Add(new ImageContent(new Uri(imageUrl))); + } + + answerChat.AddUserMessage(collection); + } + else + { + var prompt = @$" ## Source ## +{documentContents} +## End ## + +Respond in the following format: +[ANSWER START] +Your answer here, add a source reference to the end of each sentence. e.g. Apple is a fruit [reference1.pdf][reference2.pdf]. If no source available, say I don't know. +[ANSWER END] + +[THOUGHTS START] +Brief thoughts on how you came up with the answer, e.g. what sources you used, what you thought about, etc. +[THOUGHTS END]"; + answerChat.AddUserMessage(prompt); + } + + var promptExecutingSetting = new OpenAIPromptExecutionSettings + { + MaxTokens = 1024, + Temperature = overrides?.Temperature ?? 0.7, + StopSequences = [], + }; + + try + { + var streamingResponse = chat.GetStreamingChatMessageContentsAsync( + answerChat, + promptExecutingSetting, + cancellationToken: cancellationToken); + + var currentStreamedContent = new StringBuilder(); + await foreach (var content in streamingResponse) + { + if (!string.IsNullOrEmpty(content.Content)) + { + try + { + currentStreamedContent.Append(content.Content); + var currentContent = currentStreamedContent.ToString(); + + // Handle answer section + if (currentContent.Contains("[ANSWER START]")) + { + var answerContent = currentContent + .Split(new[] { "[ANSWER START]" }, StringSplitOptions.None)[1] + .Split(new[] { "[ANSWER END]" }, StringSplitOptions.None)[0] + .Trim(); + + if (answerContent != _currentAnswer) + { + _currentAnswer = answerContent; + var answerMessage = new StreamingMessage("answer", _currentAnswer); + await _hubContext.Clients.Client(connectionId) + .SendAsync("ReceiveMessage", JsonSerializer.Serialize(answerMessage), cancellationToken); + } + } + + // Handle thoughts section + if (currentContent.Contains("[THOUGHTS START]")) + { + var thoughtsContent = currentContent + .Split(new[] { "[THOUGHTS START]" }, StringSplitOptions.None)[1] + .Split(new[] { "[THOUGHTS END]" }, StringSplitOptions.None)[0] + .Trim(); + + if (thoughtsContent != _currentThoughts) + { + _currentThoughts = thoughtsContent; + var thoughtsArray = new[] + { + new + { + Title = "Thoughts", + Description = _currentThoughts + } + }; + var thoughtsMessage = new StreamingMessage( + "thoughts", + thoughtsArray); + await _hubContext.Clients.Client(connectionId) + .SendAsync("ReceiveMessage", JsonSerializer.Serialize(thoughtsMessage), cancellationToken); + } + } + + // If we haven't started any section yet, send as raw content + if (!currentContent.Contains("[ANSWER START]") && !currentContent.Contains("[THOUGHTS START]")) + { + var streamingMessage = new StreamingMessage("content", content.Content); + await _hubContext.Clients.Client(connectionId) + .SendAsync("ReceiveMessage", JsonSerializer.Serialize(streamingMessage), cancellationToken); + } + } + catch (Exception ex) + { + throw new InvalidOperationException("Error processing streaming response", ex); + } + } + } + + // Handle follow-up questions + if (overrides?.SuggestFollowupQuestions is true) + { + var followUpQuestionChat = new ChatHistory(@"You are a helpful AI assistant"); + followUpQuestionChat.AddUserMessage($@"Generate three follow-up question based on the answer you just generated. +# Answer +{_currentAnswer} + +# Format of the response +[FOLLOWUP START] +Your follow-up questions here, one per line. +[FOLLOWUP END]"); + + var followUpStreamingResponse = chat.GetStreamingChatMessageContentsAsync( + followUpQuestionChat, + promptExecutingSetting, + cancellationToken: cancellationToken); + + var followUpQuestions = new List(); + var followUpContent = new StringBuilder(); + + await foreach (var content in followUpStreamingResponse) + { + if (!string.IsNullOrEmpty(content.Content)) + { + try + { + followUpContent.Append(content.Content); + var currentContent = followUpContent.ToString(); + + if (currentContent.Contains("[FOLLOWUP START]")) + { + var questionsContent = currentContent + .Split(new[] { "[FOLLOWUP START]" }, StringSplitOptions.None)[1] + .Split(new[] { "[FOLLOWUP END]" }, StringSplitOptions.None)[0] + .Trim(); + + var currentQuestions = questionsContent + .Split('\n', StringSplitOptions.RemoveEmptyEntries) + .Select(q => q.Trim()) + .Where(q => !string.IsNullOrEmpty(q)) + .ToList(); + + + if (!followUpQuestions.SequenceEqual(currentQuestions)) + { + followUpQuestions = currentQuestions; + var followUpMessage = new StreamingMessage( + "followup", + followUpQuestions.ToArray()); + await _hubContext.Clients.Client(connectionId) + .SendAsync("ReceiveMessage", JsonSerializer.Serialize(followUpMessage), cancellationToken); + } + } + } + catch (Exception ex) + { + throw new InvalidOperationException("Error processing follow-up questions streaming", ex); + } + } + } + } + + // Send supporting content first + if (documentContentList.Length > 0) + { + var supportingContentArray = documentContentList.Select(x => new + { + Title = x.Title, + Description = x.Content + }).ToArray(); + + var supportingContent = new StreamingMessage( + "supporting", + supportingContentArray); + + await _hubContext.Clients.Client(connectionId) + .SendAsync("ReceiveMessage", JsonSerializer.Serialize(supportingContent), cancellationToken); + } + + // Then send supporting images if available + if (images?.Length > 0) + { + var supportingImagesArray = images.Select(x => new SupportingImageRecord(x.Title, x.Url)).ToArray(); + + var supportingImages = new StreamingMessage("images", supportingImagesArray); + + await _hubContext.Clients.Client(connectionId) + .SendAsync("ReceiveMessage", JsonSerializer.Serialize(supportingImages), cancellationToken); + } + + // Finally send the complete message + var completeMessage = new StreamingMessage("complete", new + { + citationBaseUrl = _configuration.ToCitationBaseUrl() + }); + await _hubContext.Clients.Client(connectionId) + .SendAsync("ReceiveMessage", JsonSerializer.Serialize(completeMessage), cancellationToken); + + } + catch (Exception ex) + { + throw new InvalidOperationException("Error in streaming response", ex); + } + } } diff --git a/app/shared/Shared/Models/RequestOverrides.cs b/app/shared/Shared/Models/RequestOverrides.cs index 87cec34b..c995c3fb 100644 --- a/app/shared/Shared/Models/RequestOverrides.cs +++ b/app/shared/Shared/Models/RequestOverrides.cs @@ -46,4 +46,7 @@ public record RequestOverrides [JsonPropertyName("vector_fields")] public bool? VectorFields { get; set; } = false; + + [JsonPropertyName("useStreaming")] + public bool UseStreaming { get; set; } = false; } diff --git a/app/tests/MinimalApi.Tests/ReadRetrieveReadChatServiceTest.cs b/app/tests/MinimalApi.Tests/ReadRetrieveReadChatServiceTest.cs index 42812634..823be8a4 100644 --- a/app/tests/MinimalApi.Tests/ReadRetrieveReadChatServiceTest.cs +++ b/app/tests/MinimalApi.Tests/ReadRetrieveReadChatServiceTest.cs @@ -104,6 +104,7 @@ public async Task FinancialReportTestAsync() azureSearchService, openAIClient, configuration, + null, azureComputerVisionService, azureCredential); diff --git a/docs/appcomponents-signalr.png b/docs/appcomponents-signalr.png new file mode 100644 index 00000000..80093e9c Binary files /dev/null and b/docs/appcomponents-signalr.png differ diff --git a/infra/app/web.bicep b/infra/app/web.bicep index af0bd22c..7f93d7ec 100644 --- a/infra/app/web.bicep +++ b/infra/app/web.bicep @@ -65,6 +65,9 @@ param openAiApiKey string @description('An array of service binds') param serviceBinds array +@description('The SignalR endpoint') +param signalREndpoint string + resource webIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = { name: identityName location: location @@ -149,6 +152,10 @@ module app '../core/host/container-app-upsert.bicep' = { name: 'OPENAI_API_KEY' value: openAiApiKey } + { + name: 'AZURE_SIGNALR_ENDPOINT' + value: signalREndpoint + } ] targetPort: 8080 } diff --git a/infra/core/signalr/signalr.bicep b/infra/core/signalr/signalr.bicep new file mode 100644 index 00000000..7df8390a --- /dev/null +++ b/infra/core/signalr/signalr.bicep @@ -0,0 +1,26 @@ +metadata description = 'Creates an Azure SignalR Services instance.' +param name string +param location string = resourceGroup().location +param tags object = {} + +@description('Controls whether local authentication is disabled') +param disableLocalAuth bool = true + +@description('The SKU of the SignalR service') +param sku object = { + name: 'Premium_P1' +} + +resource signalR 'Microsoft.SignalRService/signalR@2023-08-01-preview' = { + name: name + location: location + tags: tags + sku: sku + properties: { + disableLocalAuth: disableLocalAuth + } +} + +output endpoint string = 'https://${signalR.properties.hostName}' +output id string = signalR.id +output name string = signalR.name diff --git a/infra/main.bicep b/infra/main.bicep index 79010b0d..80268ee9 100644 --- a/infra/main.bicep +++ b/infra/main.bicep @@ -339,6 +339,7 @@ module web './app/web.bicep' = { openAiChatGptDeployment: useAOAI ? azureChatGptDeploymentName : '' openAiEmbeddingDeployment: useAOAI ? azureEmbeddingDeploymentName : '' serviceBinds: [] + signalREndpoint: signalr.outputs.endpoint } } @@ -740,6 +741,37 @@ module visionRoleBackend 'core/security/role.bicep' = if (useVision) { } } +module signalr './core/signalr/signalr.bicep' = { + name: 'signalr' + scope: resourceGroup + params: { + name: '${abbrs.signalRServiceSignalR}${resourceToken}' + location: location + tags: updatedTags + } +} + +// Add SignalR role assignments for the web app +module signalRRoleUser 'core/security/role.bicep' = { + scope: resourceGroup + name: 'signalr-role-user' + params: { + principalId: principalId + roleDefinitionId: '420fcaa2-552c-430f-98ca-3264be4806c7' // SignalR App Server + principalType: principalType + } +} + +module signalRRoleBackend 'core/security/role.bicep' = { + scope: resourceGroup + name: 'signalr-role-backend' + params: { + principalId: web.outputs.SERVICE_WEB_IDENTITY_PRINCIPAL_ID + roleDefinitionId: '420fcaa2-552c-430f-98ca-3264be4806c7' // SignalR App Server + principalType: 'ServicePrincipal' + } +} + output APPLICATIONINSIGHTS_CONNECTION_STRING string = monitoring.outputs.applicationInsightsConnectionString output APPLICATIONINSIGHTS_NAME string = monitoring.outputs.applicationInsightsName output AZURE_USE_APPLICATION_INSIGHTS bool = useApplicationInsights @@ -782,3 +814,5 @@ output USE_VISION bool = useVision output OPENAI_EMBEDDING_DEPLOYMENT string = openAiEmbeddingDeployment output AZURE_OPENAI_CHATGPT_MODEL_VERSION string = azureOpenAIChatGptModelVersion output AZURE_OPENAI_CHATGPT_MODEL_NAME string = azureOpenAIChatGptModelName +output AZURE_SIGNALR_ENDPOINT string = signalr.outputs.endpoint +output AZURE_CLIENT_ID string = web.outputs.SERVICE_WEB_IDENTITY_PRINCIPAL_ID