articles/search/cognitive-search-custom-skill-interface.md (4 additions & 4 deletions)
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
8
8
ms.custom:
9
9
- ignite-2023
10
10
ms.topic: how-to
11
-
ms.date: 05/28/2024
11
+
ms.date: 01/15/2025
12
12
---
13
13
14
14
# Add a custom skill to an Azure AI Search enrichment pipeline
@@ -23,13 +23,13 @@ If you're building a custom skill, this article describes the interface you use
23
23
24
24
Building a custom skill gives you a way to insert transformations unique to your content. For example, you could build custom classification models to differentiate business and financial contracts and documents, or add a speech recognition skill to reach deeper into audio files for relevant content. For a step-by-step example, see [Example: Creating a custom skill for AI enrichment](cognitive-search-create-custom-skill-example.md).
25
25
26
-
## Set the endpoint and timeout interval
26
+
## Set the endpoint and time-out interval
27
27
28
28
The interface for a custom skill is specified through the [Custom Web API skill](cognitive-search-custom-skill-web-api.md).
"description": "This skill has a 230 second timeout",
32
+
"description": "This skill has a 230 second time-out",
33
33
"uri": "https://[your custom skill uri goes here]",
34
34
"authResourceId": "[for managed identity connections, your app's client ID goes here]",
35
35
"timeout": "PT230S",
@@ -45,7 +45,7 @@ If instead your function or app uses Azure managed identities and Azure roles fo
45
45
46
46
+ Your [custom skill definition](cognitive-search-custom-skill-web-api.md) must include an `authResourceId` property. This property takes an application (client) ID, in a [supported format](/azure/active-directory/develop/security-best-practices-for-app-registration#application-id-uri): `api://<appId>`.
47
47
48
-
By default, the connection to the endpoint times out if a response isn't returned within a 30-second window (`PT30S`). The indexing pipeline is synchronous and indexing will produce a timeout error if a response isn't received in that time frame. You can increase the interval to a maximum value of 230 seconds by setting the timeout parameter (`PT230S`).
48
+
By default, the connection to the endpoint times out if a response isn't returned within a 30-second window (`PT30S`). The indexing pipeline is synchronous and indexing will produce a time-out error if a response isn't received in that time frame. You can increase the interval to a maximum value of 230 seconds by setting the `timeout` parameter (`PT230S`).
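The timeout limits described above can be checked before a skill definition is sent to the service. The following Python sketch is illustrative only: the helper names `timeout_seconds` and `build_custom_skill` are hypothetical, the parser handles only simple minute/second durations, and the `@odata.type` value is the one used for the Custom Web API skill. Inputs and outputs are omitted, so this isn't a complete skill definition.

```python
import re

MAX_TIMEOUT_SECONDS = 230   # maximum interval stated in the article
DEFAULT_TIMEOUT = "PT30S"   # default interval stated in the article


def timeout_seconds(duration: str) -> int:
    """Convert a simple ISO 8601 duration such as 'PT230S' or 'PT1M30S' to seconds."""
    match = re.fullmatch(r"PT(?:(\d+)M)?(?:(\d+)S)?", duration)
    if not match or not any(match.groups()):
        raise ValueError(f"Unsupported duration: {duration!r}")
    minutes, seconds = (int(group) if group else 0 for group in match.groups())
    return minutes * 60 + seconds


def build_custom_skill(uri: str, app_id: str, timeout: str = DEFAULT_TIMEOUT) -> dict:
    """Build a partial Custom Web API skill definition (hypothetical helper)."""
    if timeout_seconds(timeout) > MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout exceeds the 230-second maximum")
    return {
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "uri": uri,
        # Managed identity connections take the app's client ID as api://<appId>.
        "authResourceId": f"api://{app_id}",
        "timeout": timeout,
    }
```

Rejecting an out-of-range value on the client side gives an earlier and clearer error than waiting for the request to fail.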
articles/search/cognitive-search-working-with-skillsets.md (8 additions & 6 deletions)
@@ -6,12 +6,12 @@ author: HeidiSteen
6
6
ms.author: heidist
7
7
ms.service: azure-ai-search
8
8
ms.topic: conceptual
9
-
ms.date: 06/06/2024
9
+
ms.date: 01/15/2025
10
10
---
11
11
12
12
# Skillset concepts in Azure AI Search
13
13
14
-
This article is for developers who need a deeper understanding of skillset concepts and composition, and assumes familiarity with the high-level concepts of [applied AI](cognitive-search-concept-intro.md) in Azure AI Search.
14
+
This article is for developers who need a deeper understanding of skillset composition, and assumes familiarity with the high-level concepts of [AI enrichment](cognitive-search-concept-intro.md), or applied AI, in Azure AI Search.
15
15
16
16
A skillset is a reusable object in Azure AI Search that's attached to [an indexer](search-indexer-overview.md). It contains one or more skills that call built-in AI or external custom processing over documents retrieved from an external data source.
17
17
@@ -21,7 +21,7 @@ The following diagram illustrates the basic data flow of skillset execution.
21
21
22
22
From the onset of skillset processing to its conclusion, skills read from and write to an [*enriched document*](#enrichment-tree) that exists in memory. Initially, an enriched document is just the raw content extracted from a data source (articulated as the `"/document"` root node). With each skill execution, the enriched document gains structure and substance as each skill writes its output as nodes in the graph.
23
23
24
-
After skillset execution is done, the output of an enriched document finds its way into an index through user-defined *output field mappings*. Any raw content that you want transferred intact, from source to an index, is defined through *field mappings*.
24
+
After skillset execution is done, the output of an enriched document finds its way into an index through user-defined *output field mappings*. Any raw content that you want transferred intact, from source to an index, is defined through *field mappings*. In contrast, *output field mappings* transfer in-memory content (nodes) to the index.
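To make the distinction concrete, the following Python sketch builds the two mapping sections of an indexer definition as a dictionary. It's an illustration under stated assumptions: the target field names (`file_name`, `keyphrases`) and the enrichment node path are hypothetical, and `build_indexer_mappings` and `is_enrichment_path` are helper names invented for this example. The property names `fieldMappings`, `outputFieldMappings`, `sourceFieldName`, and `targetFieldName` follow the indexer REST definition.

```python
def is_enrichment_path(source: str) -> bool:
    """Return True if the source refers to a node in the enriched document."""
    return source.startswith("/document")


def build_indexer_mappings() -> dict:
    """Build the mapping sections of an indexer definition (hypothetical values)."""
    return {
        # Field mappings: raw source content, transferred intact to the index.
        "fieldMappings": [
            {"sourceFieldName": "metadata_storage_name", "targetFieldName": "file_name"},
        ],
        # Output field mappings: in-memory nodes created by skills.
        "outputFieldMappings": [
            {
                "sourceFieldName": "/document/reviews_text/pages/*/keyphrases/*",
                "targetFieldName": "keyphrases",
            },
        ],
    }
```

In this sketch, only the output field mapping uses a path rooted at `/document`, because only that mapping reads from the enriched document.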
25
25
26
26
To configure applied AI, specify settings in a skillset and indexer.
27
27
@@ -191,9 +191,11 @@ The root node for all enrichments is `"/document"`. When you're working with blo
191
191
192
192
### Skill #1: Split skill
193
193
194
-
When source content consists of large chunks of text, it's helpful to break it into smaller components for greater accuracy of language, sentiment, and key phrase detection. There are two grains available: pages and sentences. A page consists of approximately 5,000 characters.
194
+
When source content consists of large chunks of text, it's helpful to break it into smaller components for [integrated vectorization](vector-search-integrated-vectorization.md), or for greater accuracy of language, sentiment, and key phrase detection. There are two grains available: pages and sentences. A page consists of approximately 5,000 characters.
195
195
196
-
A text split skill is typically first in a skillset.
196
+
An alternative to chunking with the Split skill is the [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md), but that skill is out of scope for this article.
197
+
198
+
When chunking is required, the Split skill is typically first in a skillset.
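As a rough illustration of the page grain, the following Python sketch builds a Split skill definition as a dictionary. The helper name `build_split_skill` is hypothetical, and the source path `/document/reviews_text` is assumed from the example paths used elsewhere in this article. The 5,000-character default mirrors the approximate page size mentioned above.

```python
def build_split_skill(source: str = "/document/reviews_text", page_length: int = 5000) -> dict:
    """Build a Split skill definition that chunks text into pages (hypothetical helper)."""
    return {
        "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
        "context": source,
        "textSplitMode": "pages",          # the other grain is "sentences"
        "maximumPageLength": page_length,  # characters per page
        "inputs": [{"name": "text", "source": source}],
        # Output lands under the context node, for example /document/reviews_text/pages.
        "outputs": [{"name": "textItems", "targetName": "pages"}],
    }
```

Because the output is written under the skill's context, downstream skills can address each chunk through a path ending in `pages/*`.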
@@ -239,7 +241,7 @@ Customer feedback reflects a range of positive and negative experiences. The sen
239
241
240
242
Given the context of `/document/reviews_text/pages/*`, both sentiment analysis and key phrase skills are invoked once for each of the items in the `pages` collection. The output from the skill will be a node under the associated page element.
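The per-item invocation can be modeled with a small Python sketch. This is a simplified, hypothetical model of the enrichment tree using nested dictionaries and lists, not the service's implementation. The function `expand_context` resolves a context path and returns one concrete node path per skill invocation.

```python
def expand_context(tree: dict, context: str) -> list[str]:
    """Return the concrete node paths that a skill context resolves to."""
    paths = [("", tree)]
    for segment in context.strip("/").split("/"):
        next_paths = []
        for path, node in paths:
            if segment == "*":
                # A wildcard fans out over every item in a collection.
                if isinstance(node, list):
                    next_paths.extend((f"{path}/{i}", item) for i, item in enumerate(node))
            elif isinstance(node, dict) and segment in node:
                next_paths.append((f"{path}/{segment}", node[segment]))
        paths = next_paths
    return [path for path, _ in paths]


# Hypothetical enriched document with three pages produced by a Split skill.
EXAMPLE_TREE = {
    "document": {"reviews_text": {"pages": ["page one", "page two", "page three"]}}
}
```

With `EXAMPLE_TREE`, the context `/document/reviews_text/pages/*` resolves to three paths, so a skill with that context runs three times. The context `/document` resolves to one path, so the skill runs once.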
241
243
242
-
You should now be able to look at the rest of the skills in the skillset and visualize how the tree of enrichments continue to grow with the execution of each skill. Some skills, such as the merge skill and the shaper skill, also create new nodes but only use data from existing nodes and don't create net new enrichments.
244
+
You should now be able to look at the rest of the skills in the skillset and visualize how the enrichment tree grows with the execution of each skill. Some skills, such as the merge skill and the shaper skill, also create new nodes but only use data from existing nodes and don't create net new enrichments.
243
245
244
246

articles/search/search-blob-storage-integration.md (10 additions & 8 deletions)
@@ -10,23 +10,23 @@ ms.service: azure-ai-search
10
10
ms.custom:
11
11
- ignite-2023
12
12
ms.topic: conceptual
13
-
ms.date: 05/04/2024
13
+
ms.date: 01/15/2025
14
14
---
15
15
16
16
# Search over Azure Blob Storage content
17
17
18
18
Searching across the variety of content types stored in Azure Blob Storage can be a difficult problem to solve, but [Azure AI Search](search-what-is-azure-search.md) provides deep integration at the content layer, extracting and inferring textual information, which can then be queried in a search index.
19
19
20
-
In this article, review the basic workflow for extracting content and metadata from blobs and sending it to a [search index](search-what-is-an-index.md) in Azure AI Search. The resulting index can be queried using full text search. Optionally, you can send processed blob content to a [knowledge store](knowledge-store-concept-intro.md) for non-search scenarios.
20
+
In this article, review the basic workflow for extracting content and metadata from blobs and sending it to a [search index](search-what-is-an-index.md) in Azure AI Search. The resulting index can be queried using full text search or vector search. Optionally, you can send processed blob content to a [knowledge store](knowledge-store-concept-intro.md) for non-search scenarios.
21
21
22
22
> [!NOTE]
23
23
> Already familiar with the workflow and composition? [Configure a blob indexer](search-howto-indexing-azure-blob-storage.md) is your next step.
24
24
25
-
## What it means to add full text search to blob data
25
+
## What it means to add search over blob data
26
26
27
27
Azure AI Search is a standalone search service that supports indexing and query workloads over user-defined indexes that contain your private searchable content hosted in the cloud. Co-locating your searchable content with the query engine in the cloud is necessary for performance, returning results at a speed users have come to expect from search queries.
28
28
29
-
Azure AI Search integrates with Azure Blob Storage at the indexing layer, importing your blob content as search documents that are indexed into *inverted indexes* and other query structures that support free-form text queries and filter expressions. Because your blob content is indexed into a search index, you can use the full range of query features in Azure AI Search to find information in your blob content.
29
+
Azure AI Search integrates with Azure Blob Storage at the indexing layer, importing your blob content as search documents that are indexed into *inverted indexes* and other query structures that support free-form text queries, vector queries, and filter expressions. Because your blob content is indexed into a search index, you can use the full range of query features in Azure AI Search to find information in your blob content.
30
30
31
31
Inputs are your blobs, in a single container, in Azure Blob Storage. Blobs can be almost any kind of text data. If your blobs contain images, you can add [AI enrichment](cognitive-search-concept-intro.md) to create and extract text and features from images.
32
32
@@ -58,6 +58,8 @@ By default, most blobs are indexed as a single search document in the index, inc
+[Indexing plain text blobs](search-howto-index-plaintext-blobs.md)
61
63
62
64
A compound or embedded document (such as a ZIP archive, a Word document with embedded Outlook email containing attachments, or an .MSG file with attachments) is also indexed as a single document. For example, all images extracted from the attachments of an .MSG file will be returned in the normalized_images field. If you have images, consider adding [AI enrichment](cognitive-search-concept-intro.md) to get more search utility from that content.
63
65
@@ -70,7 +72,7 @@ Textual content of a document is extracted into a string field named "content".
70
72
71
73
An *indexer* is a data-source-aware subservice in Azure AI Search, equipped with internal logic for sampling data, reading and retrieving data and metadata, and serializing data from native formats into JSON documents for subsequent import.
72
74
73
-
Blobs in Azure Storage are indexed using the [blob indexer](search-howto-indexing-azure-blob-storage.md). You can invoke this indexer by using the **Azure AI Search** command in Azure Storage, the **Import data** wizard, a REST API, or the .NET SDK. In code, you use this indexer by setting the type, and by providing connection information that includes an Azure Storage account along with a blob container. You can subset your blobs by creating a virtual directory, which you can then pass as a parameter, or by filtering on a file type extension.
75
+
Blobs in Azure Storage are indexed using the [blob indexer](search-howto-indexing-azure-blob-storage.md). You can invoke this indexer by using the **Azure AI Search** command in Azure Storage, the [**Import data** wizards](search-import-data-portal.md), a REST API, or the .NET SDK. In code, you use this indexer by setting the type, and by providing connection information that includes an Azure Storage account along with a blob container. You can subset your blobs by creating a virtual directory, which you can then pass as a parameter, or by filtering on a file type extension.
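The "in code" path can be sketched as a data source definition that sets the type, the connection information, and an optional virtual directory. The following Python example is a minimal sketch: `build_blob_data_source` is a hypothetical helper, and the data source name, container name, and connection string placeholder are illustrative values. The property names follow the data source REST definition for blob indexers.

```python
from typing import Optional


def build_blob_data_source(
    name: str,
    connection_string: str,
    container: str,
    virtual_directory: Optional[str] = None,
) -> dict:
    """Build a blob data source definition for an indexer (hypothetical helper)."""
    container_definition = {"name": container}
    if virtual_directory:
        # Limits indexing to blobs under this virtual directory.
        container_definition["query"] = virtual_directory
    return {
        "name": name,
        "type": "azureblob",  # selects the blob indexer
        "credentials": {"connectionString": connection_string},
        "container": container_definition,
    }
```

For example, `build_blob_data_source("my-blobs", "<connection string>", "docs", "contracts/2024")` describes a data source scoped to one folder-like prefix in the `docs` container.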
74
76
75
77
An indexer ["cracks a document"](search-indexer-overview.md#document-cracking), opening a blob to inspect content. After connecting to the data source, it's the first step in the pipeline. For blob data, this is where PDF, Office docs, and other content types are detected. There's no charge for document cracking with text extraction. If your blobs contain image content, images are ignored unless you [add AI enrichment](cognitive-search-concept-intro.md). Standard indexing applies only to text content.
76
78
@@ -92,7 +94,7 @@ By running a blob indexer over a container, you can extract text and metadata fr
92
94
93
95
You can control which blobs are indexed, and which are skipped, by the blob's file type or by setting properties on the blobs themselves, causing the indexer to skip over them.
94
96
95
-
Include specific file extensions by setting `"indexedFileNameExtensions"` to a comma-separated list of file extensions (with a leading dot). Exclude specific file extensions by setting `"excludedFileNameExtensions"` to the extensions that should be skipped. If the same extension is in both lists, it will be excluded from indexing.
97
+
Include specific file extensions by setting `"indexedFileNameExtensions"` to a comma-separated list of file extensions (with a leading dot). Exclude specific file extensions by setting `"excludedFileNameExtensions"` to the extensions that should be skipped. If the same extension is in both lists, it's excluded from indexing.
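The precedence rule can be modeled in a few lines of Python. This sketch is a local approximation of the documented behavior, not the indexer's code. The helper names are hypothetical, extensions are compared exactly as written, and an empty include list is assumed to mean that every extension not excluded is indexed.

```python
from pathlib import PurePosixPath


def parse_extensions(value: str) -> set[str]:
    """Parse a comma-separated list such as '.pdf, .docx' into a set."""
    return {ext.strip() for ext in value.split(",") if ext.strip()}


def should_index(blob_name: str, indexed: str = "", excluded: str = "") -> bool:
    """Model whether a blob is indexed, given the two extension settings."""
    extension = PurePosixPath(blob_name).suffix  # includes the leading dot
    if extension in parse_extensions(excluded):
        return False  # exclusion wins when an extension is in both lists
    included = parse_extensions(indexed)
    # Assumption: with no include list, everything not excluded is indexed.
    return not included or extension in included
```

A call such as `should_index("report.pdf", indexed=".pdf", excluded=".pdf")` returns `False` in this model, which matches the rule that an extension in both lists is excluded.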
96
98
97
99
```http
98
100
PUT /indexers/[indexer name]?api-version=2024-07-01
@@ -110,7 +112,7 @@ PUT /indexers/[indexer name]?api-version=2024-07-01
110
112
111
113
The indexer configuration parameters apply to all blobs in the container or folder. Sometimes, you want to control how *individual blobs* are indexed.
112
114
113
-
Add the following metadata properties and values to blobs in Blob Storage. When the indexer encounters this property, it will skip the blob or its content in the indexing run.
115
+
Add the following metadata properties and values to blobs in Blob Storage. When the indexer encounters this property, it skips the blob or its content in the indexing run.
114
116
115
117
| Property name | Property value | Explanation |
116
118
| ------------- | -------------- | ----------- |
@@ -122,7 +124,7 @@ Add the following metadata properties and values to blobs in Blob Storage. When
122
124
A common scenario that makes it easy to sort through blobs of any content type is to [index both custom metadata and system properties](search-blob-metadata-properties.md) for each blob. In this way, information for all blobs is indexed regardless of document type, stored in an index in your search service. Using your new index, you can then proceed to sort, filter, and facet across all Blob storage content.
123
125
124
126
> [!NOTE]
125
-
> Blob Index tags are natively indexed by the Blob storage service and exposed for querying. If your blobs' key/value attributes require indexing and filtering capabilities, Blob Index tags should be leveraged instead of metadata.
127
+
> Blob Index tags are natively indexed by the Blob storage service and exposed for querying. If your blobs' key/value attributes require indexing and filtering capabilities, Blob Index tags should be used instead of metadata.
126
128
>
127
129
> To learn more about Blob Index, see [Manage and find data on Azure Blob Storage with Blob Index](/azure/storage/blobs/storage-manage-find-blobs).
articles/search/search-create-service-portal.md (5 additions & 5 deletions)
@@ -11,7 +11,7 @@ ms.custom:
11
11
- references_regions
12
12
- build-2024
13
13
ms.topic: conceptual
14
-
ms.date: 10/17/2024
14
+
ms.date: 01/15/2025
15
15
---
16
16
17
17
# Create an Azure AI Search service in the Azure portal
@@ -119,8 +119,8 @@ Generally, choose a region near you, unless the following considerations apply:
119
119
120
120
Currently, the following regions offer all three services (Azure AI Search, Azure OpenAI, Azure AI Vision multimodal). This list isn't definitive, and depending on the tier, there might be more choices beyond the regions listed here. Region status can change quickly, so confirm your region choice before you create the service.
121
121
122
-
+ **Americas**: West US
123
-
+ **Europe**: France Central, North Europe, Sweden Central
122
+
+ **Americas**: West US, East US
123
+
+ **Europe**: Switzerland North, Sweden Central
124
124
125
125
## Choose a tier
126
126
@@ -161,7 +161,7 @@ Unless you're using the Azure portal, programmatic access to your new service re
161
161
162
162
:::image type="content" source="media/search-create-service-portal/set-authentication-options.png" lightbox="media/search-create-service-portal/set-authentication-options.png" alt-text="Screenshot of the Keys page with authentication options." border="true":::
163
163
164
-
An endpoint and key aren't needed for portal-based tasks. the Azure portal is already linked to your Azure AI Search resource with admin rights. For a portal walkthrough, start with [Quickstart: Create an Azure AI Search index in the Azure portal](search-get-started-portal.md).
164
+
An endpoint and key aren't needed for portal-based tasks. The Azure portal is already linked to your Azure AI Search resource with admin rights. For a portal walkthrough, start with [Quickstart: Create an Azure AI Search index in the Azure portal](search-get-started-portal.md).
165
165
166
166
## Scale your service
167
167
@@ -195,7 +195,7 @@ Although most customers use just one service, service redundancy might be necess
195
195
+ Globally deployed applications might require search services in each geography to minimize latency.
196
196
197
197
> [!NOTE]
198
-
> In Azure AI Search, you cannot segregate indexing and querying operations; thus, you would never create multiple services for segregated workloads. An index is always queried on the service in which it was created (you cannot create an index in one service and copy it to another).
198
+
> In Azure AI Search, you can't segregate indexing and querying operations; thus, you would never create multiple services for segregated workloads. An index is always queried on the service in which it was created (you can't create an index in one service and copy it to another).
199
199
200
200
A second service isn't required for high availability. High availability for queries is achieved when you use two or more replicas in the same service. Replica updates are sequential, which means at least one is operational when a service update is rolled out. For more information about uptime, see [Service Level Agreements](https://azure.microsoft.com/support/legal/sla/search/v1_0/).
articles/search/search-explorer.md (4 additions & 3 deletions)
@@ -7,7 +7,7 @@ author: HeidiSteen
7
7
ms.author: heidist
8
8
ms.service: azure-ai-search
9
9
ms.topic: quickstart
10
-
ms.date: 06/14/2024
10
+
ms.date: 01/15/2025
11
11
ms.custom:
12
12
- mode-ui
13
13
---
@@ -54,7 +54,7 @@ There are two approaches for querying in Search explorer.
54
54
+ JSON view supports parameterized queries. Filters, orderby, select, count, searchFields, and all other parameters must be set in JSON view.
55
55
56
56
> [!TIP]
57
-
> JSON view provides intellisense for parameter name completion. Place the cursor inside the JSON view and type a space character to show a list of all query parameters, or type a single letter like "s" to show just the query parameters starting with "s". Intellisense doesn't exclude invalid parameters so use your best judgement.
57
+
> JSON view provides IntelliSense for parameter name completion. Place the cursor inside the JSON view and type a space character to show a list of all query parameters, or type a single letter like "s" to show just the query parameters starting with "s". IntelliSense doesn't exclude invalid parameters, so use your best judgment.
58
58
59
59
Switch to **JSON view** for parameterized queries. The examples in this article assume JSON view throughout. You can paste JSON examples from this article into the text area.
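If you prefer to generate the JSON rather than type it, the following Python sketch assembles a parameterized query body that you can paste into JSON view. The helper `build_query` is hypothetical, and the field names used in the usage note (`HotelName`, `Rating`) are placeholders for fields in your own index. The parameter names are the ones listed earlier for JSON view.

```python
import json


def build_query(search: str = "*", **params) -> str:
    """Serialize a query body for Search explorer's JSON view (hypothetical helper)."""
    body = {"search": search, **params}
    return json.dumps(body, indent=2)
```

For example, `build_query("*", count=True, select="HotelName, Rating", filter="Rating gt 4", orderby="Rating desc")` produces a JSON object that runs an empty search with a count, a filter, and a sort order.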
60
60
@@ -70,7 +70,8 @@ Equivalent syntax for an empty search is `*` or `"search": "*"`.