Key points about `IChatClient`:

1. It supports both one-off responses and conversation history
1. It enables function calling and other advanced features

## How Function Invocation Connects the LLM to the Vector Database
One of the most powerful features of the AI Web Chat template is how it uses **function invocation** to enable the large language model (LLM) to search the vector database when needed. This creates a Retrieval Augmented Generation (RAG) system where the AI can access your custom data.
### Function Invocation Architecture
Here's how the LLM decides when to search the vector database and retrieve relevant information:

```mermaid
sequenceDiagram
    participant User
    participant Chat as Chat UI
    participant LLM
    participant Search as SearchAsync
    participant VDB as Vector Database

    User->>Chat: Asks a question
    Chat->>LLM: Conversation history + tool definitions
    Note over LLM: LLM decides it needs<br/>information from the documents
    LLM->>Chat: Function call request<br/>(SearchAsync with parameters)
    Chat->>Search: Execute SearchAsync(searchPhrase)
    Search->>VDB: Semantic similarity search
    VDB-->>Search: Top 5 relevant chunks<br/>from ingested documents
    Search-->>Chat: Formatted results in XML format
    Chat->>LLM: Function result (search results)
    Note over LLM: LLM uses search results<br/>to generate accurate answer<br/>with citations
    LLM-->>Chat: Final response with citations
    Chat-->>User: Answer with citations in XML format
```
This sequence diagram shows the complete flow from user question to AI response, highlighting how the LLM autonomously decides to call the search function.
**Note on data format:** The search results are returned in XML format like `<result filename="..." page_number="...">text</result>`, and the LLM includes citations as `<citation filename='...' page_number='...'>quote</citation>`.
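As an illustration only (this helper is not part of the template), citations in that format could be pulled out of a response string with a simple regular expression:

```csharp
// Hypothetical helper: extracts <citation filename='...' page_number='...'>quote</citation>
// entries from a model response. Shown for illustration; the template renders
// citations in the UI rather than parsing them like this.
using System.Text.RegularExpressions;

static IEnumerable<(string File, string Page, string Quote)> ExtractCitations(string response)
{
    var pattern = @"<citation filename='([^']*)' page_number='([^']*)'>(.*?)</citation>";
    foreach (Match m in Regex.Matches(response, pattern, RegexOptions.Singleline))
        yield return (m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value);
}
```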
### Enabling Function Invocation in Program.cs
Function invocation is enabled when configuring the chat client:
```csharp
// From Program.cs
var openai = builder.AddAzureOpenAIClient("openai");

openai.AddChatClient("gpt-4o-mini")
    .UseFunctionInvocation(); // This enables the LLM to call functions
```
When the chat page prepares its tools, `AIFunctionFactory.Create` turns the search method into a tool definition:

1. `AIFunctionFactory.Create(SearchAsync)` analyzes the `SearchAsync` method
2. It reads the `[Description]` attributes to understand what the function does
3. It creates a tool definition that gets sent to the LLM with each request
4. The LLM sees the tool definition and knows it can call this function when needed
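In code, this registration looks roughly like the following (a minimal sketch; `ChatOptions` and `AIFunctionFactory` come from `Microsoft.Extensions.AI`):

```csharp
// Sketch: exposing SearchAsync as a tool the LLM may call.
var chatOptions = new ChatOptions
{
    Tools = [AIFunctionFactory.Create(SearchAsync)]
};
```

The `ChatOptions` instance is then passed along with the conversation on each request, so the tool definition travels with every call to the model.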
The tool definition sent to the LLM looks conceptually like this:
```json
{
  "name": "SearchAsync",
  "description": "Searches for information using a phrase or keyword",
  "parameters": {
    "searchPhrase": {
      "type": "string",
      "description": "The phrase to search for."
    },
    "filenameFilter": {
      "type": "string",
      "description": "If possible, specify the filename to search that file only. If not provided or empty, the search includes all files.",
      "required": false
    }
  }
}
```
### The SearchAsync Function Implementation
Here's how the search function is implemented:
```csharp
[Description("Searches for information using a phrase or keyword")]
private async Task<IEnumerable<string>> SearchAsync(
    [Description("The phrase to search for.")] string searchPhrase,
    [Description("If possible, specify the filename to search that file only. If not provided or empty, the search includes all files.")] string? filenameFilter = null)
{
    await InvokeAsync(StateHasChanged); // Update UI to show function is being called
    var results = await Search.SearchAsync(searchPhrase, filenameFilter, maxResults: 5);
    return results.Select(result =>
        $"<result filename=\"{result.FileName}\" page_number=\"{result.PageNumber}\">{result.Text}</result>");
}
```
### The Complete RAG Flow

The following flowchart summarizes the path from question to cited answer:

```mermaid
flowchart TD
    Start([User asks a question]) --> LLMAnalyze{"LLM analyzes:<br/>is more information needed?"}
    LLMAnalyze -- No --> LLMGenerate[LLM generates answer]
    LLMAnalyze -- Yes --> FuncCall[LLM requests SearchAsync<br/>with chosen parameters]
    FuncCall --> Middleware[UseFunctionInvocation<br/>middleware intercepts the call]
    Middleware --> VectorSearch[Semantic search in Qdrant<br/>returns top 5 chunks]
    VectorSearch --> LLMGenerate
    LLMGenerate --> StreamToUser[Response streamed<br/>with citations]
    StreamToUser --> End([User sees answer<br/>with citations])

    style Start fill:#e8f5e8
    style LLMAnalyze fill:#fff4e6
    style FuncCall fill:#f9d5e5
    style Middleware fill:#e1f5fe
    style VectorSearch fill:#e1f5fe
    style LLMGenerate fill:#d5e8d4
    style End fill:#e8f5e8
```
**Step-by-step breakdown:**

1. **User Question**: User submits a question through the chat interface
2. **Message Added**: The question is added to the conversation history
3. **LLM Analysis**: The LLM analyzes whether it needs additional information
4. **Function Call Decision**:
   - If needed: LLM requests to call SearchAsync with specific parameters
   - If not needed: LLM generates a direct response
5. **Middleware Interception**: `UseFunctionInvocation()` middleware catches the function call request
6. **Function Execution**: `SearchAsync` is executed with the LLM's chosen parameters
7. **Vector Search**: The search service queries Qdrant for semantically similar content
8. **Results Return**: Top 5 matching chunks are formatted and returned to the middleware
9. **Back to LLM**: The function results are sent back to the LLM as part of the conversation
10. **Answer Generation**: The LLM uses the search results to generate an accurate, grounded response
11. **Citations Added**: The LLM includes citations in the specified XML format
12. **Streaming Response**: The complete answer is streamed back to the user
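Conceptually, steps 4 through 9 amount to a loop inside the function-invocation middleware. The following is a simplified sketch, not the actual `Microsoft.Extensions.AI` implementation; `FunctionCallContent` and `FunctionResultContent` are its public content types, but the control flow here is illustrative:

```csharp
// Simplified conceptual sketch of what UseFunctionInvocation() does.
while (true)
{
    var response = await innerClient.GetResponseAsync(messages, options);

    // Collect any function calls the LLM requested in this turn
    var calls = response.Messages
        .SelectMany(m => m.Contents)
        .OfType<FunctionCallContent>()
        .ToList();

    if (calls.Count == 0)
        return response; // No tool needed: this is the final answer

    foreach (var call in calls)
    {
        // Run the matching tool (e.g., SearchAsync) with the LLM's arguments
        var tool = options.Tools!.OfType<AIFunction>().First(t => t.Name == call.Name);
        var result = await tool.InvokeAsync(new AIFunctionArguments(call.Arguments));

        // Feed the result back so the LLM can ground its answer
        messages.Add(new ChatMessage(ChatRole.Tool,
            [new FunctionResultContent(call.CallId, result)]));
    }
}
```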
### Why This Approach is Powerful
This function invocation pattern provides several benefits:
1. **Autonomous Decision Making**: The LLM decides when it needs to search, not hardcoded logic
2. **Dynamic Parameter Selection**: The LLM extracts the right search terms from the user's question
3. **Grounded Responses**: Answers are based on actual data from your documents, not the LLM's training data
4. **Transparent Citations**: Users can see exactly which documents and pages were used
5. **Flexible RAG**: The same pattern can be extended with additional functions (e.g., calculator, weather, database queries)
6. **Provider Agnostic**: Works with any LLM that supports function calling (OpenAI, Azure OpenAI, etc.)
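For instance, the extensibility point in benefit 5 could be exercised by registering a second tool next to the search function. This is a hypothetical sketch; `Add` is not part of the template:

```csharp
// Hypothetical extra tool registered alongside SearchAsync.
[Description("Adds two numbers")]
private static double Add(
    [Description("First number")] double a,
    [Description("Second number")] double b) => a + b;

// Both tools are offered to the LLM; it picks whichever fits the question.
var options = new ChatOptions
{
    Tools =
    [
        AIFunctionFactory.Create(SearchAsync),
        AIFunctionFactory.Create(Add)
    ]
};
```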
## Microsoft Extensions for Vector Data with Vector Collections
The template uses Microsoft Extensions for Vector Data to implement document ingestion and semantic search. Instead of using a separate database for tracking ingested documents, everything is stored directly in vector collections.
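For example, a chunk record stored in such a collection might be shaped roughly like this. This is a sketch using `Microsoft.Extensions.VectorData` attributes; the property names and the vector dimension are illustrative, not taken verbatim from the template:

```csharp
// Illustrative sketch of a vector-collection record for a document chunk.
using Microsoft.Extensions.VectorData;

public class IngestedChunk
{
    [VectorStoreKey]
    public Guid Key { get; set; }

    [VectorStoreData]
    public string FileName { get; set; } = "";

    [VectorStoreData]
    public int PageNumber { get; set; }

    [VectorStoreData]
    public string Text { get; set; } = "";

    // Embedding vector; the dimension depends on the embedding model (1536 is illustrative)
    [VectorStoreVector(1536)]
    public ReadOnlyMemory<float> Vector { get; set; }
}
```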
- How services are configured and orchestrated in .NET Aspire
- How the main application is structured and configured
- How `IChatClient` is set up and used for interacting with AI models
- **How function invocation enables the LLM to autonomously search the vector database**
- **How `.UseFunctionInvocation()` middleware processes function calls**
- **How `AIFunctionFactory.Create()` registers functions as tools for the LLM**
- **How the complete RAG (Retrieval Augmented Generation) flow works from question to cited answer**
- How vector collections are used to store both document chunks and metadata
- How Microsoft Extensions for Vector Data simplifies document ingestion with vector-native storage
- How the simplified architecture eliminates the need for separate ingestion cache databases