
Commit 6b1f29f

Add support for integrated vectorization
1 parent e6ede18 commit 6b1f29f

14 files changed (+312 −165 lines)

README.md

Lines changed: 15 additions & 11 deletions
@@ -38,22 +38,24 @@ graph TD
    acs[Azure AI Search]
    aoai[Azure OpenAI]
    webapp[Web App]
-   functionapp[Function App]
+   functionapp[Function Apps]
    storage[Storage Account]

-   webapp -->|Generate query embeddings for vector search| aoai
+   webapp -->|Generate query embeddings for vector search (for external vectorization)| aoai
    webapp -->|Send chat requests| aoai
    webapp -->|Send search requests| acs
    webapp -->|Upload new documents| storage
-   functionapp -->|Generate embeddings for chunks| aoai
+   functionapp -->|Generate embeddings for chunks (for external vectorization)| aoai
+   functionapp -->|Push chunks into search index (for push model)| acs
+   acs -->|Generate embeddings for chunks and search queries (for integrated vectorization)| aoai
    acs -->|Populate search index from documents| storage
-   acs -->|Generate chunks and embeddings to index| functionapp
+   acs -->|Generate chunks and embeddings to index (for external vectorization)| functionapp
    aoai -->|Find relevant context to build prompt for Azure OpenAI on your data| acs
```

When you deploy the solution, it creates an [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search) service which indexes document content from a blob storage container. (Note that documents are assumed to be in English.)

-The documents in the index are also chunked into smaller pieces, and vector embeddings are created for these chunks using a Function App based on the [Azure OpenAI Embeddings Generator power skill](https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator). This allows you to easily try out [vector and hybrid search](https://learn.microsoft.com/azure/search/vector-search-overview). With Azure AI Search on its own, the responses *always* come directly from the source data, rather than being generated by an AI model. You can optionally use [semantic ranking](https://learn.microsoft.com/azure/search/semantic-search-overview) which *does* use AI, not to generate content but to increase the relevancy of the results and provide semantic answers and captions.
+The documents in the index are also chunked into smaller pieces, and vector embeddings are created for these chunks using either [integrated vectorization](https://learn.microsoft.com/azure/search/vector-search-integrated-vectorization), or external vectorization using a Function App. This allows you to easily try out [vector and hybrid search](https://learn.microsoft.com/azure/search/vector-search-overview). With Azure AI Search on its own, the responses *always* come directly from the source data, rather than being generated by an AI model. You can optionally use [semantic ranking](https://learn.microsoft.com/azure/search/semantic-search-overview) which *does* use AI, not to generate content but to increase the relevancy of the results and provide semantic answers and captions.

The solution also deploys an [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/overview) service. It provides an embeddings model to generate the vector representations of the document chunks and search queries, and a GPT model to generate answers to your search queries. If you choose the option to use [Azure OpenAI "on your data"](https://learn.microsoft.com/azure/ai-services/openai/concepts/use-your-data), these AI-generated responses can be grounded in (and even limited to) the information in your Azure AI Search indexes. This option allows you to let Azure OpenAI orchestrate the [Retrieval Augmented Generation (RAG)](https://aka.ms/what-is-rag) pattern. This means your search query will first be used to retrieve the most relevant documents (or preferably *smaller chunks of those documents*) from your private data source. Those search results are then used as context in the prompt that gets sent to the AI model, along with the original search query. This allows the AI model to generate a response based on the most relevant source data, rather than the public data that was used to train the model. Next to letting Azure OpenAI orchestrate the RAG pattern, the web application can also use [Semantic Kernel](https://learn.microsoft.com/semantic-kernel/overview/) to perform that orchestration, using a prompt and other parameters you can control yourself.
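The RAG flow described in the paragraph above (retrieve the most relevant chunks, then pass them as grounding context to the GPT model along with the question) can be sketched roughly as follows. Only `SearchClient` comes from Azure.Search.Documents; the chat-completion callback, the `content` field name, and the prompt wording are illustrative assumptions rather than code from this repository.

```csharp
// Rough sketch of the RAG pattern described above; the "content" field name and the
// chat-completion callback are illustrative assumptions, not code from this commit.
using System;
using System.Text;
using System.Threading.Tasks;
using Azure.Search.Documents;
using Azure.Search.Documents.Models;

public class RagSketch
{
    private readonly SearchClient chunksSearchClient;
    private readonly Func<string, Task<string>> completeChatAsync; // e.g. a call to an Azure OpenAI GPT deployment

    public RagSketch(SearchClient chunksSearchClient, Func<string, Task<string>> completeChatAsync)
    {
        this.chunksSearchClient = chunksSearchClient;
        this.completeChatAsync = completeChatAsync;
    }

    public async Task<string> AskAsync(string question)
    {
        // 1. Retrieve the most relevant document chunks for the question.
        var response = await this.chunksSearchClient.SearchAsync<SearchDocument>(question, new SearchOptions { Size = 5 });
        var context = new StringBuilder();
        await foreach (var result in response.Value.GetResultsAsync())
        {
            context.AppendLine(result.Document["content"]?.ToString()); // assumes the chunk text lives in a "content" field
        }

        // 2. Ground the prompt in those chunks and let the GPT model generate the answer.
        var prompt = $"Answer the question using only the sources below.\n\nSources:\n{context}\nQuestion: {question}";
        return await this.completeChatAsync(prompt);
    }
}
```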
@@ -109,9 +111,9 @@ This can easily be done by setting up the built-in [authentication and authoriza
## Configuration

-The ARM template deploys the services and sets the configuration settings for the Web App and Function App. Most of these shouldn't be changed as they contain connection settings between the various services, but you can change the settings below for the App Service Web App.
+The ARM template deploys the services and sets the configuration settings for the Web App and Function Apps. Most of these shouldn't be changed as they contain connection settings between the various services, but you can change the settings below for the App Service Web App.

-> Note that the settings of the Function App shouldn't be changed, as the [power skill](https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator) was tweaked for this project to take any relevant settings from the request sent by the Azure AI Search skillset instead of from configuration (for example, the embedding model and chunk size to use).
+> Note that the settings of the Function Apps shouldn't be changed, as the [power skill](https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator) was tweaked for this project to take any relevant settings from the request sent by the Azure AI Search skillset instead of from configuration (for example, the embedding model and chunk size to use).

| Setting | Purpose | Default value |
| ------- | ------- | ------------- |
@@ -121,12 +123,14 @@ The ARM template deploys the services and sets the configuration settings for th
| `OpenAIGptDeployment` | The deployment name of the [Azure OpenAI GPT model](https://learn.microsoft.com/azure/ai-services/openai/concepts/models) to use | `gpt-35-turbo` |
| `StorageContainerNameBlobDocuments`* | The name of the storage container that contains the documents | `blob-documents` |
| `StorageContainerNameBlobChunks`* | The name of the storage container that contains the document chunks | `blob-chunks` |
-| `TextEmbedderNumTokens` | The number of tokens per chunk when splitting documents into smaller pieces | `2048` |
-| `TextEmbedderTokenOverlap` | The number of tokens to overlap between consecutive chunks | `0` |
-| `TextEmbedderMinChunkSize` | The minimum number of tokens of a chunk (smaller chunks are excluded) | `10` |
+| `TextChunkerPageLength` | In case of integrated vectorization, the number of characters per page (chunk) when splitting documents into smaller pieces | `2000` |
+| `TextChunkerPageOverlap` | In case of integrated vectorization, the number of characters to overlap between consecutive pages (chunks) | `500` |
+| `TextEmbedderNumTokens` | In case of external vectorization, the number of tokens per chunk when splitting documents into smaller pieces | `2048` |
+| `TextEmbedderTokenOverlap` | In case of external vectorization, the number of tokens to overlap between consecutive chunks | `0` |
+| `TextEmbedderMinChunkSize` | In case of external vectorization, the minimum number of tokens of a chunk (smaller chunks are excluded) | `10` |
| `SearchIndexNameBlobDocuments`* | The name of the search index that contains the documents | `blob-documents` |
| `SearchIndexNameBlobChunks`* | The name of the search index that contains the document chunks | `blob-chunks` |
-| `SearchIndexerSkillType`* | The type of chunking and embedding skill to use as part of the documents indexer: `pull` uses a [knowledge store](https://learn.microsoft.com/azure/search/knowledge-store-concept-intro) to store the chunk data in blobs and a separate indexer to pull these into the document chunks index; `push` directly uploads the data from the custom skill into the document chunks index | `pull` |
+| `SearchIndexerSkillType`* | The type of chunking and embedding skill to use as part of the documents indexer: `integrated` uses [integrated vectorization](https://learn.microsoft.com/azure/search/vector-search-integrated-vectorization); `pull` uses a custom skill with a [knowledge store](https://learn.microsoft.com/azure/search/knowledge-store-concept-intro) to store the chunk data in blobs and a separate indexer to pull these into the document chunks index; `push` directly uploads the data from a custom skill into the document chunks index | `integrated` |
| `SearchIndexerScheduleMinutes`* | The number of minutes between indexer executions in Azure AI Search | `5` |
| `InitialDocumentUrls` | A space-separated list of URLs for the documents to include by default | A [resiliency](https://azure.microsoft.com/mediahandler/files/resourcefiles/resilience-in-azure-whitepaper/Resiliency-whitepaper.pdf) and [compliance](https://azure.microsoft.com/mediahandler/files/resourcefiles/data-residency-data-sovereignty-and-compliance-in-the-microsoft-cloud/Data_Residency_Data_Sovereignty_Compliance_Microsoft_Cloud.pdf) document |
| `DefaultSystemRoleInformation` | The default instructions for the AI model | "You are an AI assistant that helps people find information." |
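The new `TextChunkerPageLength` and `TextChunkerPageOverlap` settings describe character-based pages with a fixed overlap; with integrated vectorization the actual splitting is performed inside Azure AI Search, so the snippet below is only a minimal illustration of what a 2000-character page with a 500-character overlap means.

```csharp
// Illustrative only: shows the meaning of TextChunkerPageLength / TextChunkerPageOverlap.
// In the deployed solution the splitting is done by Azure AI Search, not by application code.
using System;
using System.Collections.Generic;

public static class TextChunkerIllustration
{
    public static IEnumerable<string> SplitIntoPages(string text, int pageLength = 2000, int pageOverlap = 500)
    {
        if (pageOverlap >= pageLength)
        {
            throw new ArgumentException("The overlap must be smaller than the page length.");
        }

        // Each page starts (pageLength - pageOverlap) characters after the previous one,
        // so consecutive pages share pageOverlap characters of text.
        for (var start = 0; start < text.Length; start += pageLength - pageOverlap)
        {
            yield return text.Substring(start, Math.Min(pageLength, text.Length - start));
        }
    }
}
```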

azuredeploy.json

Lines changed: 10 additions & 0 deletions
@@ -61,6 +61,8 @@
"openaiApiVersion": "2023-06-01-preview",
"storageContainerNameBlobDocuments": "blob-documents",
"storageContainerNameBlobChunks": "blob-chunks",
+"textChunkerPageLength": 2000,
+"textChunkerPageOverlap": 500,
"textEmbedderNumTokens": 2048,
"textEmbedderTokenOverlap": 0,
"textEmbedderMinChunkSize": 10,
@@ -534,6 +536,14 @@
  "type": "string",
  "value": "[variables('functionApiKey')]"
},
+"textChunkerPageLength": {
+  "type": "string",
+  "value": "[variables('textChunkerPageLength')]"
+},
+"textChunkerPageOverlap": {
+  "type": "string",
+  "value": "[variables('textChunkerPageOverlap')]"
+},
"textEmbedderNumTokens": {
  "type": "int",
  "value": "[variables('textEmbedderNumTokens')]"

src/Azure.AISearch.WebApp/AppSettings.cs

Lines changed: 3 additions & 1 deletion
@@ -14,6 +14,8 @@ public class AppSettings
public string? TextEmbedderFunctionEndpointPython { get; set; }
public string? TextEmbedderFunctionEndpointDotNet { get; set; }
public string? TextEmbedderFunctionApiKey { get; set; }
+public int? TextChunkerPageLength { get; set; } // If unspecified, will use 2000 characters per page.
+public int? TextChunkerPageOverlap { get; set; } // If unspecified, will use 500 characters overlap.
public int? TextEmbedderNumTokens { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
public int? TextEmbedderTokenOverlap { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
public int? TextEmbedderMinChunkSize { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
@@ -22,7 +24,7 @@ public class AppSettings
public string? SearchServiceSku { get; set; }
public string? SearchIndexNameBlobDocuments { get; set; }
public string? SearchIndexNameBlobChunks { get; set; }
-public string? SearchIndexerSkillType { get; set; } // If unspecified, will use the "pull" model.
+public string? SearchIndexerSkillType { get; set; } // If unspecified, will use the "integrated" model.
public int? SearchIndexerScheduleMinutes { get; set; } // If unspecified, will be set to 5 minutes.
public string? InitialDocumentUrls { get; set; }
public string? DefaultSystemRoleInformation { get; set; }
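The comments above document the fallback values used when a setting is left empty. A small sketch of how the web app could resolve those defaults is shown below; the helper is illustrative and not part of this commit.

```csharp
// Illustrative helper: applies the defaults documented in the comments above.
public static class AppSettingsDefaults
{
    public static int GetTextChunkerPageLength(AppSettings settings)
        => settings.TextChunkerPageLength ?? 2000;

    public static int GetTextChunkerPageOverlap(AppSettings settings)
        => settings.TextChunkerPageOverlap ?? 500;

    public static string GetSearchIndexerSkillType(AppSettings settings)
        => settings.SearchIndexerSkillType ?? Constants.SearchIndexerSkillTypes.Integrated;
}
```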

src/Azure.AISearch.WebApp/Constants.cs

Lines changed: 4 additions & 1 deletion
@@ -7,11 +7,14 @@ public static class Constants
    public static class ConfigurationNames
    {
        public const string SemanticConfigurationNameDefault = "default";
-       public const string VectorSearchConfigurationNameDefault = "default";
+       public const string VectorSearchProfileNameDefault = "default-profile";
+       public const string VectorSearchAlgorithNameDefault = "default-algorithm";
+       public const string VectorSearchVectorizerNameDefault = "default-vectorizer";
    }

    public static class SearchIndexerSkillTypes
    {
+       public const string Integrated = "integrated";
        public const string Pull = "pull";
        public const string Push = "push";
    }
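The renamed constants reflect that a vector search configuration now needs three named pieces: an algorithm, a profile, and (for integrated vectorization) a vectorizer. A rough sketch of how they could be combined with the Azure.Search.Documents SDK follows; it assumes SDK version 11.6.x, the Azure OpenAI endpoint, deployment, and key values are placeholders, and the exact type and property names may differ from what this commit actually uses.

```csharp
// Sketch only; assumes Azure.Search.Documents 11.6.x, with placeholder Azure OpenAI values.
using System;
using Azure.Search.Documents.Indexes.Models;

var vectorSearch = new VectorSearch();

// HNSW algorithm registered under the default-algorithm name.
vectorSearch.Algorithms.Add(
    new HnswAlgorithmConfiguration(Constants.ConfigurationNames.VectorSearchAlgorithNameDefault));

// Vectorizer that lets Azure AI Search call Azure OpenAI itself (integrated vectorization).
vectorSearch.Vectorizers.Add(
    new AzureOpenAIVectorizer(Constants.ConfigurationNames.VectorSearchVectorizerNameDefault)
    {
        Parameters = new AzureOpenAIVectorizerParameters
        {
            ResourceUri = new Uri("https://my-openai.openai.azure.com"), // placeholder
            DeploymentName = "text-embedding-ada-002",                   // placeholder
            ApiKey = "<api-key>"                                         // placeholder
        }
    });

// Profile that ties the algorithm and vectorizer together; vector fields reference this profile by name.
vectorSearch.Profiles.Add(
    new VectorSearchProfile(
        Constants.ConfigurationNames.VectorSearchProfileNameDefault,
        Constants.ConfigurationNames.VectorSearchAlgorithNameDefault)
    {
        VectorizerName = Constants.ConfigurationNames.VectorSearchVectorizerNameDefault
    });
```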

src/Azure.AISearch.WebApp/Models/AppSettingsOverride.cs

Lines changed: 3 additions & 1 deletion
@@ -4,9 +4,11 @@ namespace Azure.AISearch.WebApp.Models;
// as they are not used anywhere else and don't depend on other settings.
public class AppSettingsOverride
{
+   public int? TextChunkerPageLength { get; set; } // If unspecified, will use 2000 characters per page.
+   public int? TextChunkerPageOverlap { get; set; } // If unspecified, will use 500 characters overlap.
    public int? TextEmbedderNumTokens { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
    public int? TextEmbedderTokenOverlap { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
    public int? TextEmbedderMinChunkSize { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
-   public string? SearchIndexerSkillType { get; set; } // If unspecified, will use the "pull" model.
+   public string? SearchIndexerSkillType { get; set; } // If unspecified, will use the "integrated" model.
    public int? SearchIndexerScheduleMinutes { get; set; } // If unspecified, will be set to 5 minutes.
}

src/Azure.AISearch.WebApp/Models/DocumentChunk.cs

Lines changed: 0 additions & 3 deletions
@@ -3,9 +3,6 @@ namespace Azure.AISearch.WebApp.Models;
public class DocumentChunk
{
    public string? Id { get; set; }
-   public long ChunkIndex { get; set; }
-   public long ChunkOffset { get; set; }
-   public long ChunkLength { get; set; }
    public string? Content { get; set; }
    public IReadOnlyList<float>? ContentVector { get; set; }
    public string? SourceDocumentId { get; set; }

src/Azure.AISearch.WebApp/Models/SearchRequest.cs

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ public class SearchRequest
public QuerySyntax QuerySyntax { get; set; } = QuerySyntax.Simple;
public DataSourceType DataSource { get; set; } = DataSourceType.None;
public string? OpenAIGptDeployment { get; set; }
+public bool UseIntegratedVectorization { get; set; }
public int? VectorNearestNeighborsCount { get; set; } = Constants.Defaults.VectorNearestNeighborsCount;
public bool LimitToDataSource { get; set; } = true; // "Limit responses to your data content"
public string? SystemRoleInformation { get; set; } // Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant’s personality, tell it what it should and shouldn’t answer, and tell it how to format responses. There’s no token limit for this section, but it will be included with every API call, so it counts against the overall token limit.
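The new `UseIntegratedVectorization` flag lets a request choose between having Azure AI Search vectorize the query text itself and having the web app supply a precomputed embedding. A rough sketch of that branch with Azure.Search.Documents is shown below; it assumes SDK 11.6.x, and `getQueryEmbeddingAsync`, the `queryText` parameter, and the vector field name are illustrative assumptions.

```csharp
// Sketch only; assumes Azure.Search.Documents 11.6.x. The embedding callback is a hypothetical helper.
using System;
using System.Threading.Tasks;
using Azure.Search.Documents;
using Azure.Search.Documents.Models;

public static class VectorQuerySketch
{
    public static async Task<SearchOptions> BuildOptionsAsync(
        SearchRequest request,
        string queryText,
        string vectorFieldName,
        Func<string, Task<ReadOnlyMemory<float>>> getQueryEmbeddingAsync)
    {
        var options = new SearchOptions { VectorSearch = new VectorSearchOptions() };

        if (request.UseIntegratedVectorization)
        {
            // Integrated vectorization: send the raw text and let the index's vectorizer embed it.
            options.VectorSearch.Queries.Add(new VectorizableTextQuery(queryText)
            {
                KNearestNeighborsCount = request.VectorNearestNeighborsCount,
                Fields = { vectorFieldName }
            });
        }
        else
        {
            // External vectorization: embed the query in the web app (e.g. via Azure OpenAI) and send the vector.
            var embedding = await getQueryEmbeddingAsync(queryText);
            options.VectorSearch.Queries.Add(new VectorizedQuery(embedding)
            {
                KNearestNeighborsCount = request.VectorNearestNeighborsCount,
                Fields = { vectorFieldName }
            });
        }

        return options;
    }
}
```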

src/Azure.AISearch.WebApp/Models/SearchResult.cs

Lines changed: 0 additions & 1 deletion
@@ -6,7 +6,6 @@ public class SearchResult
public string? SearchIndexKey { get; set; }
public string? DocumentId { get; set; }
public string? DocumentTitle { get; set; }
-public int? ChunkIndex { get; set; }
public double? Score { get; set; }
public IDictionary<string, IList<string>> Highlights { get; set; } = new Dictionary<string, IList<string>>();
public IList<string> Captions { get; set; } = new List<string>();
