Commit 692d9e3

Merge pull request #213 from dotnet-presentations/copilot/improve-part-3-readme
Improve Part 3 readme to explain how function invocation works
2 parents 0e6199e + 92b0324 commit 692d9e3

Part 3 - Template Exploration/README.md

Lines changed: 229 additions & 0 deletions
@@ -213,6 +213,231 @@ Key points about `IChatClient`:
1. It supports both one-off responses and conversation history
1. It enables function calling and other advanced features

## How Function Invocation Connects the LLM to the Vector Database

One of the most powerful features of the AI Web Chat template is how it uses **function invocation** to enable the large language model (LLM) to search the vector database when needed. This creates a Retrieval Augmented Generation (RAG) system where the AI can access your custom data.

### Function Invocation Architecture

Here's how the LLM decides when to search the vector database and retrieve relevant information:
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#f4f4f4', 'primaryTextColor': '#000', 'primaryBorderColor': '#333', 'lineColor': '#333', 'secondaryColor': '#e1f5fe', 'tertiaryColor': '#f3e5f5' }}}%%
sequenceDiagram
    participant User
    participant Chat as Chat.razor
    participant LLM as IChatClient<br/>(LLM with Function Calling)
    participant Search as SearchAsync Function
    participant VDB as Vector Database<br/>(Qdrant)

    User->>Chat: "What are the GPS watch features?"

    Chat->>LLM: Send message + available tools<br/>(SearchAsync function)

    Note over LLM: LLM analyzes the question<br/>and decides it needs to<br/>search for information

    LLM->>Chat: Function call request<br/>SearchAsync("GPS watch features")

    Chat->>Search: Execute SearchAsync<br/>searchPhrase: "GPS watch features"

    Search->>VDB: Vector similarity search<br/>for "GPS watch features"

    VDB-->>Search: Top 5 relevant chunks<br/>from ingested documents

    Search-->>Chat: Formatted results in XML format

    Chat->>LLM: Function result (search results)

    Note over LLM: LLM uses search results<br/>to generate accurate answer<br/>with citations

    LLM-->>Chat: Final response with citations

    Chat-->>User: Answer with citations in XML format
```
This sequence diagram shows the complete flow from user question to AI response, highlighting how the LLM autonomously decides to call the search function.

**Note on data format:** The search results are returned in XML format like `<result filename="..." page_number="...">text</result>`, and the LLM includes citations as `<citation filename='...' page_number='...'>quote</citation>`.

### Enabling Function Invocation in Program.cs

Function invocation is enabled when configuring the chat client:
```csharp
// From Program.cs
var openai = builder.AddAzureOpenAIClient("openai");
openai.AddChatClient("gpt-4o-mini")
    .UseFunctionInvocation() // This enables the LLM to call functions
    .UseOpenTelemetry(configure: c =>
        c.EnableSensitiveData = builder.Environment.IsDevelopment());
```
The `.UseFunctionInvocation()` middleware:

1. **Intercepts function call requests** from the LLM
2. **Executes the requested function** with the parameters the LLM provides
3. **Returns the function results** back to the LLM
4. **Continues the conversation** so the LLM can use the results

Without this middleware, the LLM would only receive function call *requests* but wouldn't be able to execute them.
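The loop the middleware runs can be sketched with plain types. This is illustrative only: the real Microsoft.Extensions.AI implementation works with `IChatClient` and `ChatMessage`, while the `Message`, `FunctionCall`, and `FunctionInvokingClient` types below are made up for the sketch.

```csharp
// Minimal sketch of a function-invocation loop (hypothetical types,
// not the actual Microsoft.Extensions.AI middleware).
using System;
using System.Collections.Generic;

public record Message(string Role, string Text);
public record FunctionCall(string Name, string Argument);

public class FunctionInvokingClient
{
    private readonly Func<List<Message>, object> _innerClient;        // LLM stand-in: returns string or FunctionCall
    private readonly Dictionary<string, Func<string, string>> _tools; // registered functions by name

    public FunctionInvokingClient(
        Func<List<Message>, object> innerClient,
        Dictionary<string, Func<string, string>> tools)
    {
        _innerClient = innerClient;
        _tools = tools;
    }

    public string GetResponse(List<Message> messages)
    {
        while (true)
        {
            var reply = _innerClient(messages);  // ask the model
            if (reply is string text)
                return text;                     // plain text: conversation is done

            // The model requested a tool call: execute it, append the
            // result to the conversation, and ask the model again.
            var call = (FunctionCall)reply;
            var result = _tools[call.Name](call.Argument);
            messages.Add(new Message("tool", result));
        }
    }
}
```

With a search function registered under `"search"`, a question that needs retrieval triggers one loop iteration (tool call, tool result) before the final answer comes back.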
### Registering the Search Function in Chat.razor

The search function is registered as a tool that the LLM can use:
```csharp
// From Chat.razor OnInitialized
protected override void OnInitialized()
{
    statefulMessageCount = 0;
    messages.Add(new(ChatRole.System, SystemPrompt));
    chatOptions.Tools = [AIFunctionFactory.Create(SearchAsync)];
}
```
**What happens here:**

1. `AIFunctionFactory.Create(SearchAsync)` analyzes the `SearchAsync` method
2. It reads the `[Description]` attributes to understand what the function does
3. It creates a tool definition that gets sent to the LLM with each request
4. The LLM sees the tool definition and knows it can call this function when needed

The tool definition sent to the LLM looks conceptually like this:
```json
{
  "name": "SearchAsync",
  "description": "Searches for information using a phrase or keyword",
  "parameters": {
    "searchPhrase": {
      "type": "string",
      "description": "The phrase to search for."
    },
    "filenameFilter": {
      "type": "string",
      "description": "If possible, specify the filename to search that file only. If not provided or empty, the search includes all files.",
      "required": false
    }
  }
}
```
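A rough sketch of how a definition like this can be derived from the method's metadata via reflection. This is illustrative only: `AIFunctionFactory.Create` does the real work (and emits a full JSON schema); `ToolSchemaSketch` and `SampleTool` are hypothetical names for the sketch.

```csharp
// Illustrative sketch of what AIFunctionFactory.Create gathers from a
// method: its [Description] attributes and parameter metadata.
using System;
using System.ComponentModel;
using System.Linq;
using System.Reflection;

public static class ToolSchemaSketch
{
    public static string Describe(MethodInfo method)
    {
        var methodDesc = method.GetCustomAttribute<DescriptionAttribute>()?.Description ?? "";
        var parameters = string.Join("; ", method.GetParameters().Select(p =>
        {
            var paramDesc = p.GetCustomAttribute<DescriptionAttribute>()?.Description ?? "";
            var optional = p.IsOptional ? ", optional" : "";
            return $"{p.Name} ({p.ParameterType.Name}{optional}): {paramDesc}";
        }));
        return $"{method.Name} -- {methodDesc} -- {parameters}";
    }
}

// A sample tool method to run the sketch against.
public static class SampleTool
{
    [Description("Searches for information using a phrase or keyword")]
    public static string SearchAsync(
        [Description("The phrase to search for.")] string searchPhrase,
        [Description("Optional filename filter.")] string? filenameFilter = null) => "";
}
```

Calling `ToolSchemaSketch.Describe(typeof(SampleTool).GetMethod("SearchAsync"))` yields a flat description that names the method, its purpose, and each parameter, including whether it's optional.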
### The SearchAsync Function Implementation

Here's how the search function is implemented:
```csharp
[Description("Searches for information using a phrase or keyword")]
private async Task<IEnumerable<string>> SearchAsync(
    [Description("The phrase to search for.")] string searchPhrase,
    [Description("If possible, specify the filename to search that file only. If not provided or empty, the search includes all files.")] string? filenameFilter = null)
{
    await InvokeAsync(StateHasChanged); // Update UI to show function is being called
    var results = await Search.SearchAsync(searchPhrase, filenameFilter, maxResults: 5);
    return results.Select(result =>
        $"<result filename=\"{result.DocumentId}\" page_number=\"{result.PageNumber}\">{result.Text}</result>");
}
```
**Key aspects of this function:**

1. **Description attributes**: The `[Description]` attributes provide context to the LLM about when and how to use this function
2. **Parameters**: The LLM can extract search phrases from the user's question and pass them as parameters
3. **Vector search**: Calls `SemanticSearch.SearchAsync()` to find the most relevant chunks from the vector database
4. **Formatted results**: Returns results in a structured XML format that the LLM can parse and use
5. **UI update**: Calls `StateHasChanged()` so users can see when a search is happening
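The result-formatting step can be sketched in isolation. One caveat the sketch adds that the template's interpolation does not: escaping XML special characters, so that `&` or `<` in document text can't break the markup. `SearchHit` and `ResultFormatter` are stand-in names, not types from the template.

```csharp
// Sketch of formatting search hits as <result> elements, with XML
// escaping added (SearchHit is a stand-in for the template's result type).
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security;

public record SearchHit(string DocumentId, int PageNumber, string Text);

public static class ResultFormatter
{
    public static IEnumerable<string> Format(IEnumerable<SearchHit> hits) =>
        hits.Select(h =>
            $"<result filename=\"{SecurityElement.Escape(h.DocumentId)}\" " +
            $"page_number=\"{h.PageNumber}\">" +
            $"{SecurityElement.Escape(h.Text)}</result>");
}
```

For a hit from page 3 of `manual.pdf` containing "GPS & maps", this produces `<result filename="manual.pdf" page_number="3">GPS &amp; maps</result>`.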
### How the LLM Decides to Call the Function

The LLM uses several factors to decide when to call the search function:

1. **System prompt**: The system prompt tells the LLM to "Use the search tool to find relevant information"
2. **Function description**: The description "Searches for information using a phrase or keyword" helps the LLM understand the function's purpose
3. **User question**: When a user asks a question that requires specific information, the LLM recognizes it needs to search
4. **Context**: If the conversation history doesn't contain the needed information, the LLM will search

**Example decision process:**
- User asks: "What are the GPS watch features?"
- LLM thinks: "I need specific information about GPS watch features"
- LLM sees: the `SearchAsync` tool is available for finding information
- LLM decides: call `SearchAsync(searchPhrase: "GPS watch features")`
- LLM receives: relevant text chunks about GPS watches
- LLM generates: an answer based on the search results, with citations
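In terms of conversation state, that exchange leaves behind a message sequence like the following plain-data sketch. The roles, contents, and filename are illustrative; real Microsoft.Extensions.AI conversations use `ChatMessage` objects with structured tool-call content.

```csharp
// Plain-data sketch of the conversation after the exchange above
// (illustrative contents; the filename is made up).
using System;
using System.Collections.Generic;

var conversation = new List<(string Role, string Content)>
{
    ("system",    "Use the search tool to find relevant information"),
    ("user",      "What are the GPS watch features?"),
    ("assistant", "tool call: SearchAsync(searchPhrase: \"GPS watch features\")"),
    ("tool",      "<result filename=\"example.pdf\" page_number=\"3\">GPS tracking, heart rate, ...</result>"),
    ("assistant", "The watch offers GPS tracking <citation filename='example.pdf' page_number='3'>GPS tracking</citation>")
};
```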
### The Complete RAG Flow

Here's what happens in a complete Retrieval Augmented Generation (RAG) interaction:
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#f4f4f4', 'primaryTextColor': '#000', 'primaryBorderColor': '#333', 'lineColor': '#333', 'secondaryColor': '#e1f5fe', 'tertiaryColor': '#f3e5f5' }}}%%
flowchart TB
    Start([User Asks Question]) --> AddMsg[Add user message<br/>to conversation]

    AddMsg --> Stream[Call ChatClient.<br/>GetStreamingResponseAsync]

    Stream --> LLMAnalyze{LLM Analyzes:<br/>Do I need more info?}

    LLMAnalyze -->|Yes, need to search| FuncCall[LLM returns<br/>function call request]
    LLMAnalyze -->|No, can answer directly| DirectAnswer[LLM generates<br/>direct response]

    FuncCall --> Middleware[UseFunctionInvocation<br/>middleware intercepts]

    Middleware --> Execute[Execute SearchAsync<br/>with LLM's parameters]

    Execute --> VectorSearch[Vector similarity search<br/>in Qdrant]

    VectorSearch --> Results[Return top 5<br/>matching chunks]

    Results --> BackToLLM[Send results<br/>back to LLM]

    BackToLLM --> LLMGenerate[LLM generates answer<br/>using search results]

    LLMGenerate --> AddCitations[LLM adds citations<br/>in XML format]

    AddCitations --> StreamToUser[Stream response<br/>to user]

    DirectAnswer --> StreamToUser

    StreamToUser --> End([User sees answer<br/>with citations])

    style Start fill:#e8f5e8
    style LLMAnalyze fill:#fff4e6
    style FuncCall fill:#f9d5e5
    style Middleware fill:#e1f5fe
    style VectorSearch fill:#e1f5fe
    style LLMGenerate fill:#d5e8d4
    style End fill:#e8f5e8
```
**Step-by-step breakdown:**

1. **User question**: User submits a question through the chat interface
2. **Message added**: The question is added to the conversation history
3. **LLM analysis**: The LLM analyzes whether it needs additional information
4. **Function call decision**:
   - If needed: the LLM requests a call to `SearchAsync` with specific parameters
   - If not needed: the LLM generates a direct response
5. **Middleware interception**: The `UseFunctionInvocation()` middleware catches the function call request
6. **Function execution**: `SearchAsync` is executed with the LLM's chosen parameters
7. **Vector search**: The search service queries Qdrant for semantically similar content
8. **Results return**: The top 5 matching chunks are formatted and returned to the middleware
9. **Back to the LLM**: The function results are sent back to the LLM as part of the conversation
10. **Answer generation**: The LLM uses the search results to generate an accurate, grounded response
11. **Citations added**: The LLM includes citations in the specified XML format
12. **Streaming response**: The complete answer is streamed back to the user
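The vector search in steps 7-8 boils down to ranking stored chunks by similarity between their embeddings and the query embedding, then keeping the best few. A minimal sketch using plain cosine similarity, standing in for what Qdrant does at scale with approximate nearest-neighbor indexes:

```csharp
// Sketch of top-k retrieval by cosine similarity. Qdrant does this at
// scale with ANN indexes; this brute-force version shows the idea.
using System;
using System.Collections.Generic;
using System.Linq;

public static class VectorSearch
{
    static double Cosine(double[] a, double[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na  += a[i] * a[i];
            nb  += b[i] * b[i];
        }
        return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
    }

    // Rank chunks by similarity to the query embedding; keep the best k.
    public static IEnumerable<string> TopK(
        double[] query,
        IEnumerable<(string Text, double[] Embedding)> chunks,
        int k = 5) =>
        chunks.OrderByDescending(c => Cosine(query, c.Embedding))
              .Take(k)
              .Select(c => c.Text);
}
```

In the template, the query embedding comes from the configured embedding generator and the chunk embeddings were computed during ingestion; here both are just arrays.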
### Why This Approach is Powerful

This function invocation pattern provides several benefits:

1. **Autonomous decision making**: The LLM decides when it needs to search, rather than relying on hardcoded logic
2. **Dynamic parameter selection**: The LLM extracts the right search terms from the user's question
3. **Grounded responses**: Answers are based on actual data from your documents, not the LLM's training data
4. **Transparent citations**: Users can see exactly which documents and pages were used
5. **Flexible RAG**: The same pattern can be extended with additional functions (e.g., calculator, weather, database queries)
6. **Provider agnostic**: Works with any LLM that supports function calling (OpenAI, Azure OpenAI, etc.)
## Microsoft Extensions for Vector Data with Vector Collections

The template uses Microsoft Extensions for Vector Data to implement document ingestion and semantic search. Instead of using a separate database for tracking ingested documents, everything is stored directly in vector collections.
@@ -533,6 +758,10 @@ The semantic search process:
- How services are configured and orchestrated in .NET Aspire
- How the main application is structured and configured
- How `IChatClient` is set up and used for interacting with AI models
- **How function invocation enables the LLM to autonomously search the vector database**
- **How `.UseFunctionInvocation()` middleware processes function calls**
- **How `AIFunctionFactory.Create()` registers functions as tools for the LLM**
- **How the complete RAG (Retrieval Augmented Generation) flow works from question to cited answer**
- How vector collections are used to store both document chunks and metadata
- How Microsoft Extensions for Vector Data simplifies document ingestion with vector-native storage
- How the simplified architecture eliminates the need for separate ingestion cache databases
