Commit cfcd77c

committed
January freshness, group 1 files
1 parent 45ff12c commit cfcd77c

11 files changed: +52 -55 lines changed

articles/search/cognitive-search-custom-skill-interface.md

Lines changed: 4 additions & 4 deletions
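The hunks below change only wording, but the rule they restate is concrete: the connection times out after 30 seconds (`PT30S`) unless `timeout` is raised, up to a maximum of 230 seconds (`PT230S`). The following Python sketch checks a `timeout` value against that ceiling before it goes into a skill definition. The helper names (`parse_seconds`, `build_web_api_skill`) and the example URI are hypothetical, and the parser handles only the minutes-and-seconds subset of ISO 8601 durations; treat it as an illustration under those assumptions, not as the service's own validation.

```python
import re

DEFAULT_TIMEOUT = "PT30S"   # default stated in the article
MAX_TIMEOUT_SECONDS = 230   # maximum stated in the article (PT230S)


def parse_seconds(duration: str) -> int:
    """Convert a duration such as 'PT230S' or 'PT3M50S' to whole seconds.

    Only the minutes/seconds subset of ISO 8601 is handled (an assumption
    made to keep the sketch short).
    """
    match = re.fullmatch(r"PT(?:(\d+)M)?(?:(\d+)S)?", duration)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {duration!r}")
    minutes, seconds = (int(group) if group else 0 for group in match.groups())
    return minutes * 60 + seconds


def build_web_api_skill(uri: str, timeout: str = DEFAULT_TIMEOUT) -> dict:
    """Return a partial skill definition, rejecting time-outs above the maximum."""
    if parse_seconds(timeout) > MAX_TIMEOUT_SECONDS:
        raise ValueError(f"timeout {timeout} exceeds {MAX_TIMEOUT_SECONDS} seconds")
    return {
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "uri": uri,          # hypothetical endpoint supplied by the caller
        "timeout": timeout,
    }
```

With these definitions, a call that passes `"PT230S"` is accepted, while `"PT231S"` raises `ValueError` before any request is sent.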
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
 ms.custom:
 - ignite-2023
 ms.topic: how-to
-ms.date: 05/28/2024
+ms.date: 01/15/2025
 ---

 # Add a custom skill to an Azure AI Search enrichment pipeline
@@ -23,13 +23,13 @@ If you're building a custom skill, this article describes the interface you use

 Building a custom skill gives you a way to insert transformations unique to your content. For example, you could build custom classification models to differentiate business and financial contracts and documents, or add a speech recognition skill to reach deeper into audio files for relevant content. For a step-by-step example, see [Example: Creating a custom skill for AI enrichment](cognitive-search-create-custom-skill-example.md).

-## Set the endpoint and timeout interval
+## Set the endpoint and time-out interval

 The interface for a custom skill is specified through the [Custom Web API skill](cognitive-search-custom-skill-web-api.md).

 ```json
 "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
-"description": "This skill has a 230 second timeout",
+"description": "This skill has a 230 second time-out",
 "uri": "https://[your custom skill uri goes here]",
 "authResourceId": "[for managed identity connections, your app's client ID goes here]",
 "timeout": "PT230S",
@@ -45,7 +45,7 @@ If instead your function or app uses Azure managed identities and Azure roles fo

 + Your [custom skill definition](cognitive-search-custom-skill-web-api.md) must include an `authResourceId` property. This property takes an application (client) ID, in a [supported format](/azure/active-directory/develop/security-best-practices-for-app-registration#application-id-uri): `api://<appId>`.

-By default, the connection to the endpoint times out if a response isn't returned within a 30-second window (`PT30S`). The indexing pipeline is synchronous and indexing will produce a timeout error if a response isn't received in that time frame. You can increase the interval to a maximum value of 230 seconds by setting the timeout parameter (`PT230S`).
+By default, the connection to the endpoint times out if a response isn't returned within a 30-second window (`PT30S`). The indexing pipeline is synchronous and indexing will produce a time-out error if a response isn't received in that time frame. You can increase the interval to a maximum value of 230 seconds by setting the `timeout` parameter (`PT230S`).

 ## Format Web API inputs

articles/search/cognitive-search-working-with-skillsets.md

Lines changed: 8 additions & 6 deletions
@@ -6,12 +6,12 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: azure-ai-search
 ms.topic: conceptual
-ms.date: 06/06/2024
+ms.date: 01/15/2025
 ---

 # Skillset concepts in Azure AI Search

-This article is for developers who need a deeper understanding of skillset concepts and composition, and assumes familiarity with the high-level concepts of [applied AI](cognitive-search-concept-intro.md) in Azure AI Search.
+This article is for developers who need a deeper understanding of skillset composition, and assumes familiarity with the high-level concepts of [AI enrichment](cognitive-search-concept-intro.md), or applied AI, in Azure AI Search.

 A skillset is a reusable object in Azure AI Search that's attached to [an indexer](search-indexer-overview.md). It contains one or more skills that call built-in AI or external custom processing over documents retrieved from an external data source.

@@ -21,7 +21,7 @@ The following diagram illustrates the basic data flow of skillset execution.

 From the onset of skillset processing to its conclusion, skills read from and write to an [*enriched document*](#enrichment-tree) that exists in memory. Initially, an enriched document is just the raw content extracted from a data source (articulated as the `"/document"` root node). With each skill execution, the enriched document gains structure and substance as each skill writes its output as nodes in the graph.

-After skillset execution is done, the output of an enriched document finds its way into an index through user-defined *output field mappings*. Any raw content that you want transferred intact, from source to an index, is defined through *field mappings*.
+After skillset execution is done, the output of an enriched document finds its way into an index through user-defined *output field mappings*. Any raw content that you want transferred intact, from source to an index, is defined through *field mappings*. In contrast, *output field mappings* transfer in-memory content (nodes) to the index.

 To configure applied AI, specify settings in a skillset and indexer.

@@ -191,9 +191,11 @@ The root node for all enrichments is `"/document"`. When you're working with blo

 ### Skill #1: Split skill

-When source content consists of large chunks of text, it's helpful to break it into smaller components for greater accuracy of language, sentiment, and key phrase detection. There are two grains available: pages and sentences. A page consists of approximately 5,000 characters.
+When source content consists of large chunks of text, it's helpful to break it into smaller components for [integrated vectorization](vector-search-integrated-vectorization.md), or for greater accuracy of language, sentiment, and key phrase detection. There are two grains available: pages and sentences. A page consists of approximately 5,000 characters.

-A text split skill is typically first in a skillset.
+An alternative to chunking with the Split skill is through the [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md), but that skill is out of scope for this article.
+
+When chunking is required, the Split skill is typically first in a skillset.

 ```json
 "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
@@ -239,7 +241,7 @@ Customer feedback reflects a range of positive and negative experiences. The sen

 Given the context of `/document/reviews_text/pages/*`, both sentiment analysis and key phrase skills are invoked once for each of the items in the `pages` collection. The output from the skill will be a node under the associated page element.

-You should now be able to look at the rest of the skills in the skillset and visualize how the tree of enrichments continue to grow with the execution of each skill. Some skills, such as the merge skill and the shaper skill, also create new nodes but only use data from existing nodes and don't create net new enrichments.
+You should now be able to look at the rest of the skills in the skillset and visualize how the enrichment tree grows with the execution of each skill. Some skills, such as the merge skill and the shaper skill, also create new nodes but only use data from existing nodes and don't create net new enrichments.

 ![enrichment tree after all skills](media/cognitive-search-working-with-skillsets/enrichment-tree-final.png "Enrichment tree after all skills")

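The last hunk for this file keeps the statement that a skill with the context `/document/reviews_text/pages/*` is invoked once for each item in the `pages` collection. The Python sketch below illustrates that expansion on a toy enriched document. Modeling the enrichment tree as nested dictionaries and lists, and the function name `expand_context`, are assumptions made for illustration; the service's in-memory representation is not described in this diff.

```python
def expand_context(document: dict, context: str) -> list[str]:
    """Return the concrete node paths that a context path resolves to.

    A '*' segment fans out over every item of a collection, which is why a
    skill with that context runs once per item.
    """
    segments = context.strip("/").split("/")
    # The first segment, 'document', is the root of the enrichment tree.
    matches = [("/" + segments[0], document)]
    for segment in segments[1:]:
        expanded = []
        for prefix, node in matches:
            if segment == "*" and isinstance(node, list):
                expanded.extend(
                    (f"{prefix}/{index}", item) for index, item in enumerate(node)
                )
            elif isinstance(node, dict) and segment in node:
                expanded.append((f"{prefix}/{segment}", node[segment]))
        matches = expanded
    return [path for path, _ in matches]


# Toy enriched document: two pages produced by a split step (hypothetical data).
enriched = {"reviews_text": {"pages": ["first page text", "second page text"]}}
paths = expand_context(enriched, "/document/reviews_text/pages/*")
```

Here `paths` holds one entry per page, `/document/reviews_text/pages/0` and `/document/reviews_text/pages/1`, so a skill bound to that context would run twice.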

articles/search/search-blob-storage-integration.md

Lines changed: 10 additions & 8 deletions
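One rule restated in this file's diff is that an extension listed in both `"indexedFileNameExtensions"` and `"excludedFileNameExtensions"` is excluded from indexing. The Python sketch below models that precedence for a single blob name. The function names are hypothetical, and treating an empty include list as "index everything" is an assumption of this sketch, not something the diff states.

```python
import os


def parse_extensions(value: str) -> set[str]:
    """Split a comma-separated list such as '.pdf,.docx' into a set."""
    return {ext.strip().lower() for ext in value.split(",") if ext.strip()}


def should_index(blob_name: str, indexed: str = "", excluded: str = "") -> bool:
    """Decide whether a blob would be indexed under the two extension lists.

    Exclusion is checked first, so an extension present in both lists is
    skipped, matching the rule quoted from the article.
    """
    extension = os.path.splitext(blob_name)[1].lower()
    if extension in parse_extensions(excluded):
        return False
    include = parse_extensions(indexed)
    # Assumption: with no include list, every non-excluded blob is indexed.
    return not include or extension in include
```

For instance, `should_index("report.pdf", indexed=".pdf,.docx", excluded=".pdf")` returns `False` because the exclusion list takes precedence.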
@@ -10,23 +10,23 @@ ms.service: azure-ai-search
 ms.custom:
 - ignite-2023
 ms.topic: conceptual
-ms.date: 05/04/2024
+ms.date: 01/15/2025
 ---

 # Search over Azure Blob Storage content

 Searching across the variety of content types stored in Azure Blob Storage can be a difficult problem to solve, but [Azure AI Search](search-what-is-azure-search.md) provides deep integration at the content layer, extracting and inferring textual information, which can then be queried in a search index.

-In this article, review the basic workflow for extracting content and metadata from blobs and sending it to a [search index](search-what-is-an-index.md) in Azure AI Search. The resulting index can be queried using full text search. Optionally, you can send processed blob content to a [knowledge store](knowledge-store-concept-intro.md) for non-search scenarios.
+In this article, review the basic workflow for extracting content and metadata from blobs and sending it to a [search index](search-what-is-an-index.md) in Azure AI Search. The resulting index can be queried using full text search or vector search. Optionally, you can send processed blob content to a [knowledge store](knowledge-store-concept-intro.md) for non-search scenarios.

 > [!NOTE]
 > Already familiar with the workflow and composition? [Configure a blob indexer](search-howto-indexing-azure-blob-storage.md) is your next step.

-## What it means to add full text search to blob data
+## What it means to add search over blob data

 Azure AI Search is a standalone search service that supports indexing and query workloads over user-defined indexes that contain your private searchable content hosted in the cloud. Co-locating your searchable content with the query engine in the cloud is necessary for performance, returning results at a speed users have come to expect from search queries.

-Azure AI Search integrates with Azure Blob Storage at the indexing layer, importing your blob content as search documents that are indexed into *inverted indexes* and other query structures that support free-form text queries and filter expressions. Because your blob content is indexed into a search index, you can use the full range of query features in Azure AI Search to find information in your blob content.
+Azure AI Search integrates with Azure Blob Storage at the indexing layer, importing your blob content as search documents that are indexed into *inverted indexes* and other query structures that support free-form text queries, vector queries, and filter expressions. Because your blob content is indexed into a search index, you can use the full range of query features in Azure AI Search to find information in your blob content.

 Inputs are your blobs, in a single container, in Azure Blob Storage. Blobs can be almost any kind of text data. If your blobs contain images, you can add [AI enrichment](cognitive-search-concept-intro.md) to create and extract text and features from images.

@@ -58,6 +58,8 @@ By default, most blobs are indexed as a single search document in the index, inc

 + [Indexing JSON blobs](search-howto-index-json-blobs.md)
 + [Indexing CSV blobs](search-howto-index-csv-blobs.md)
++ [Indexing Markdown blobs](search-how-to-index-markdown-blobs.md)
++ [Indexing plain text blobs](search-howto-index-plaintext-blobs.md)

 A compound or embedded document (such as a ZIP archive, a Word document with embedded Outlook email containing attachments, or an .MSG file with attachments) is also indexed as a single document. For example, all images extracted from the attachments of an .MSG file will be returned in the normalized_images field. If you have images, consider adding [AI enrichment](cognitive-search-concept-intro.md) to get more search utility from that content.

@@ -70,7 +72,7 @@ Textual content of a document is extracted into a string field named "content".

 An *indexer* is a data-source-aware subservice in Azure AI Search, equipped with internal logic for sampling data, reading and retrieving data and metadata, and serializing data from native formats into JSON documents for subsequent import.

-Blobs in Azure Storage are indexed using the [blob indexer](search-howto-indexing-azure-blob-storage.md). You can invoke this indexer by using the **Azure AI Search** command in Azure Storage, the **Import data** wizard, a REST API, or the .NET SDK. In code, you use this indexer by setting the type, and by providing connection information that includes an Azure Storage account along with a blob container. You can subset your blobs by creating a virtual directory, which you can then pass as a parameter, or by filtering on a file type extension.
+Blobs in Azure Storage are indexed using the [blob indexer](search-howto-indexing-azure-blob-storage.md). You can invoke this indexer by using the **Azure AI Search** command in Azure Storage, the [**Import data** wizards](search-import-data-portal.md), a REST API, or the .NET SDK. In code, you use this indexer by setting the type, and by providing connection information that includes an Azure Storage account along with a blob container. You can subset your blobs by creating a virtual directory, which you can then pass as a parameter, or by filtering on a file type extension.

 An indexer ["cracks a document"](search-indexer-overview.md#document-cracking), opening a blob to inspect content. After connecting to the data source, it's the first step in the pipeline. For blob data, this is where PDF, Office docs, and other content types are detected. Document cracking with text extraction is no charge. If your blobs contain image content, images are ignored unless you [add AI enrichment](cognitive-search-concept-intro.md). Standard indexing applies only to text content.

@@ -92,7 +94,7 @@ By running a blob indexer over a container, you can extract text and metadata fr

 You can control which blobs are indexed, and which are skipped, by the blob's file type or by setting properties on the blob themselves, causing the indexer to skip over them.

-Include specific file extensions by setting `"indexedFileNameExtensions"` to a comma-separated list of file extensions (with a leading dot). Exclude specific file extensions by setting `"excludedFileNameExtensions"` to the extensions that should be skipped. If the same extension is in both lists, it will be excluded from indexing.
+Include specific file extensions by setting `"indexedFileNameExtensions"` to a comma-separated list of file extensions (with a leading dot). Exclude specific file extensions by setting `"excludedFileNameExtensions"` to the extensions that should be skipped. If the same extension is in both lists, it's excluded from indexing.

 ```http
 PUT /indexers/[indexer name]?api-version=2024-07-01
@@ -110,7 +112,7 @@ PUT /indexers/[indexer name]?api-version=2024-07-01

 The indexer configuration parameters apply to all blobs in the container or folder. Sometimes, you want to control how *individual blobs* are indexed.

-Add the following metadata properties and values to blobs in Blob Storage. When the indexer encounters this property, it will skip the blob or its content in the indexing run.
+Add the following metadata properties and values to blobs in Blob Storage. When the indexer encounters this property, it skips the blob or its content in the indexing run.

 | Property name | Property value | Explanation |
 | ------------- | -------------- | ----------- |
@@ -122,7 +124,7 @@ Add the following metadata properties and values to blobs in Blob Storage. When
 A common scenario that makes it easy to sort through blobs of any content type is to [index both custom metadata and system properties](search-blob-metadata-properties.md) for each blob. In this way, information for all blobs is indexed regardless of document type, stored in an index in your search service. Using your new index, you can then proceed to sort, filter, and facet across all Blob storage content.

 > [!NOTE]
-> Blob Index tags are natively indexed by the Blob storage service and exposed for querying. If your blobs' key/value attributes require indexing and filtering capabilities, Blob Index tags should be leveraged instead of metadata.
+> Blob Index tags are natively indexed by the Blob storage service and exposed for querying. If your blobs' key/value attributes require indexing and filtering capabilities, Blob Index tags should be used instead of metadata.
 >
 > To learn more about Blob Index, see [Manage and find data on Azure Blob Storage with Blob Index](/azure/storage/blobs/storage-manage-find-blobs).

articles/search/search-create-service-portal.md

Lines changed: 5 additions & 5 deletions
@@ -11,7 +11,7 @@ ms.custom:
 - references_regions
 - build-2024
 ms.topic: conceptual
-ms.date: 10/17/2024
+ms.date: 01/15/2025
 ---

 # Create an Azure AI Search service in the Azure portal
@@ -119,8 +119,8 @@ Generally, choose a region near you, unless the following considerations apply:

 Currently, the following regions offer cross-region among all three services (Azure AI Search, Azure OpenAI, Azure AI Vision multimodal). This list isn't definitive, and there might be more choices beyond the regions listed here depending on the tier. Also, region status can change quickly, so be sure to confirm region choice before installing.

-+ **Americas**: West US
-+ **Europe**: France Central, North Europe, Sweden Central
++ **Americas**: West US, East US
++ **Europe**: Switzerland North, Sweden Central

 ## Choose a tier

@@ -161,7 +161,7 @@ Unless you're using the Azure portal, programmatic access to your new service re

 :::image type="content" source="media/search-create-service-portal/set-authentication-options.png" lightbox="media/search-create-service-portal/set-authentication-options.png" alt-text="Screenshot of the Keys page with authentication options." border="true":::

-An endpoint and key aren't needed for portal-based tasks. the Azure portal is already linked to your Azure AI Search resource with admin rights. For a portal walkthrough, start with [Quickstart: Create an Azure AI Search index in the Azure portal](search-get-started-portal.md).
+An endpoint and key aren't needed for portal-based tasks. The Azure portal is already linked to your Azure AI Search resource with admin rights. For a portal walkthrough, start with [Quickstart: Create an Azure AI Search index in the Azure portal](search-get-started-portal.md).

 ## Scale your service

@@ -195,7 +195,7 @@ Although most customers use just one service, service redundancy might be necess
 + Globally deployed applications might require search services in each geography to minimize latency.

 > [!NOTE]
-> In Azure AI Search, you cannot segregate indexing and querying operations; thus, you would never create multiple services for segregated workloads. An index is always queried on the service in which it was created (you cannot create an index in one service and copy it to another).
+> In Azure AI Search, you can't segregate indexing and querying operations; thus, you would never create multiple services for segregated workloads. An index is always queried on the service in which it was created (you can't create an index in one service and copy it to another).

 A second service isn't required for high availability. High availability for queries is achieved when you use two or more replicas in the same service. Replica updates are sequential, which means at least one is operational when a service update is rolled out. For more information about uptime, see [Service Level Agreements](https://azure.microsoft.com/support/legal/sla/search/v1_0/).

articles/search/search-explorer.md

Lines changed: 4 additions & 3 deletions
@@ -7,7 +7,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: azure-ai-search
 ms.topic: quickstart
-ms.date: 06/14/2024
+ms.date: 01/15/2025
 ms.custom:
 - mode-ui
 ---
@@ -54,7 +54,7 @@ There are two approaches for querying in Search explorer.
 + JSON view supports parameterized queries. Filters, orderby, select, count, searchFields, and all other parameters must be set in JSON view.

 > [!TIP]
-> JSON view provides intellisense for parameter name completion. Place the cursor inside the JSON view and type a space character to show a list of all query parameters, or type a single letter like "s" to show just the query parameters starting with "s". Intellisense doesn't exclude invalid parameters so use your best judgement.
+> JSON view provides intellisense for parameter name completion. Place the cursor inside the JSON view and type a space character to show a list of all query parameters, or type a single letter like "s" to show just the query parameters starting with "s". Intellisense doesn't exclude invalid parameters so use your best judgment.

 Switch to **JSON view** for parameterized queries. The examples in this article assume JSON view throughout. You can paste JSON examples from this article into the text area.

@@ -70,7 +70,8 @@ Equivalent syntax for an empty search is `*` or `"search": "*"`.

 ```json
 {
-"search": "*"
+"search": "*",
+"count": true
 }
 ```
