articles/search/cognitive-search-quickstart-blob.md
+36 -25 (36 additions & 25 deletions)
Original file line number
Diff line number
Diff line change
@@ -10,13 +10,13 @@ ms.service: cognitive-search
10
10
ms.custom:
11
11
- ignite-2023
12
12
ms.topic: quickstart
13
-
ms.date: 06/29/2023
13
+
ms.date: 11/30/2023
14
14
---
15
15
# Quickstart: Create a skillset in the Azure portal
16
16
17
-
In this Azure AI Search quickstart, you learn how a skillset in Azure AI Search adds Optical Character Recognition (OCR), image analysis, language detection, text translation, and entity recognition to create text-searchable content in a search index.
17
+
In this quickstart, you learn how a skillset in Azure AI Search adds Optical Character Recognition (OCR), image analysis, language detection, text translation, and entity recognition to generate text-searchable content in a search index.
18
18
19
-
You can run the **Import data** wizard in the Azure portal to apply skills that create and transform textual content during indexing. Output is a searchable index containing AI-generated image text, captions, and entities. Generated content is queryable in the portal using [**Search explorer**](search-explorer.md).
19
+
You can run the **Import data** wizard in the Azure portal to apply skills that create and transform textual content during indexing. Input is your raw data, usually blobs in Azure Storage. Output is a searchable index containing AI-generated image text, captions, and entities. Generated content is queryable in the portal using [**Search explorer**](search-explorer.md).
20
20
21
21
To prepare, you create a few resources and upload sample files before running the wizard.
22
22
@@ -47,9 +47,9 @@ In the following steps, set up a blob container in Azure Storage to store hetero
47
47
48
48
+ Choose the StorageV2 (general purpose V2).
49
49
50
-
1. In Azure portal, open your Azure Storage page and create a container. You can use the default public access level.
50
+
1. In Azure portal, open your Azure Storage page and create a container. You can use the default access level.
51
51
52
-
1. In Container, select **Upload** to upload the sample files you downloaded in the first step. Notice that you have a wide range of content types, including images and application files that aren't full text searchable in their native formats.
52
+
1. In Container, select **Upload** to upload the sample files. Notice that you have a wide range of content types, including images and application files that aren't full text searchable in their native formats.
53
53
54
54
:::image type="content" source="media/cognitive-search-quickstart-blob/sample-data.png" alt-text="Screenshot of source files in Azure Blob Storage." border="false":::
55
55
@@ -59,7 +59,7 @@ You're now ready to move on the Import data wizard.
59
59
60
60
1. Sign in to the [Azure portal](https://portal.azure.com/) with your Azure account.
61
61
62
-
1. [Find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Storage%2storageAccounts/) and on the Overview page, select **Import data** on the command bar to set up cognitive enrichment in four steps.
62
+
1. [Find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Storage%2storageAccounts/) and on the Overview page, select **Import data** on the command bar to create searchable content in four steps.
63
63
64
64
:::image type="content" source="media/search-import-data-portal/import-data-cmd.png" alt-text="Screenshot of the Import data command." border="true":::
65
65
@@ -101,72 +101,83 @@ Next, configure AI enrichment to invoke OCR, image analysis, and natural languag
101
101
102
102
### Step 3: Configure the index
103
103
104
-
An index contains your searchable content and the **Import data** wizard can usually create the schema for you by sampling the data source. In this step, review the generated schema and potentially revise any settings. Below is the default schema created for the demo Blob data set.
104
+
An index contains your searchable content and the **Import data** wizard can usually create the schema by sampling the data source. In this step, review the generated schema and potentially revise any settings.
105
105
106
106
For this quickstart, the wizard does a good job setting reasonable defaults:
107
107
108
-
+ Default fields are based on metadata properties for existing blobs, plus the new fields for the enrichment output (for example, `people`, `organizations`, `locations`). Data types are inferred from metadata and by data sampling.
108
+
+ Default fields are based on metadata properties of existing blobs, plus the new fields for the enrichment output (for example, `people`, `organizations`, `locations`). Data types are inferred from metadata and by data sampling.
109
109
110
110
+ Default document key is *metadata_storage_path* (selected because the field contains unique values).
111
111
112
112
+ Default attributes are **Retrievable** and **Searchable**. **Searchable** enables full text search on a field. **Retrievable** means field values can be returned in results. The wizard assumes you want these fields to be retrievable and searchable because you created them via a skillset. Select **Filterable** if you want to use fields in a filter expression.
113
113
114
114
:::image type="content" source="media/cognitive-search-quickstart-blob/index-fields.png" alt-text="Screenshot of the index definition page." border="true":::
115
115
116
-
Marking a field as **Retrievable** doesn't mean that the field *must* be present in the search results. You can control search results composition by using the **$select** query parameter to specify which fields to include.
116
+
Marking a field as **Retrievable** doesn't mean that the field *must* be present in the search results. You can control search results composition by using the **select** query parameter to specify which fields to include.
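To make the attribute and `select` behavior described above concrete, here is a minimal Python sketch. It models a few index fields as plain dictionaries whose keys mirror the **Searchable**, **Retrievable**, and **Filterable** attributes, and uses a hypothetical helper, `shape_result`, to return only the selected retrievable fields. The field list, the sample document, and the helper are illustrative assumptions, not the wizard's exact output or the service's implementation.

```python
# Illustrative field definitions. The attribute names mirror the index schema,
# but this list is an assumption, not the schema the wizard generates.
FIELDS = [
    {"name": "metadata_storage_path", "searchable": False, "retrievable": True, "filterable": False},
    {"name": "people", "searchable": True, "retrievable": True, "filterable": False},
    {"name": "organizations", "searchable": True, "retrievable": True, "filterable": False},
    {"name": "locations", "searchable": True, "retrievable": True, "filterable": False},
]


def shape_result(document, fields, select=None):
    """Keep only retrievable fields, narrowed further by an optional select list."""
    retrievable = {f["name"] for f in fields if f["retrievable"]}
    wanted = retrievable if select is None else retrievable & set(select)
    return {name: value for name, value in document.items() if name in wanted}


# Hypothetical enriched document.
doc = {
    "metadata_storage_path": "aHR0cHM6Ly9leGFtcGxl",
    "people": ["Satya Nadella"],
    "organizations": ["Microsoft"],
    "locations": ["Redmond"],
}

# Without select, every retrievable field comes back.
print(shape_result(doc, FIELDS))

# With select, only the named fields are included.
print(shape_result(doc, FIELDS, select=["people", "organizations"]))
```

In this toy model, a field marked retrievable is only eligible to appear in results; the `select` list decides which eligible fields are actually returned.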
117
117
118
118
Continue to the next page.
119
119
120
120
### Step 4: Configure the indexer
121
121
122
122
The indexer drives the indexing process. It specifies the data source name, a target index, and frequency of execution. The **Import data** wizard creates several objects, including an indexer that you can reset and run repeatedly.
123
123
124
-
1. In the **Indexer** page, you can accept the default name and select **Once** to run it immediately.
124
+
1. In the **Indexer** page, accept the default name and select **Once**.
125
125
126
126
:::image type="content" source="media/cognitive-search-quickstart-blob/indexer-def.png" alt-text="Screenshot of the indexer definition page." border="true":::
127
127
128
128
1. Select **Submit** to create and simultaneously run the indexer.
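As a rough illustration of what this step produces, the sketch below assembles an indexer definition as a Python dictionary. The property names (`dataSourceName`, `targetIndexName`, `skillsetName`, `schedule`) follow the Create Indexer REST API; the object names and the `build_indexer` helper are hypothetical placeholders. Selecting **Once** is modeled here as omitting the schedule, which is an assumption about how the wizard expresses that choice.

```python
import json


def build_indexer(name, data_source, index, skillset=None, interval=None):
    """Build an indexer definition; omit the schedule to model a one-time run."""
    definition = {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
    }
    if skillset is not None:
        definition["skillsetName"] = skillset
    if interval is not None:
        # ISO 8601 duration, for example "PT2H" for every two hours.
        definition["schedule"] = {"interval": interval}
    return definition


# Hypothetical object names, chosen only for this example.
run_once = build_indexer(
    "demo-indexer", "demo-datasource", "demo-index", skillset="demo-skillset"
)
print(json.dumps(run_once, indent=2))
```

Passing `interval="PT2H"` to the same helper would add a recurring schedule instead of a one-time run.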
129
129
130
130
## Monitor status
131
131
132
-
Cognitive skills indexing takes longer to complete than typical text-based indexing, especially OCR and image analysis. To monitor progress, go to the Overview page and select **Indexers** in the middle of page.
132
+
Select **Indexers** from the left navigation pane to monitor status, and then select the indexer. Skills-based indexing takes longer than text-based indexing, especially OCR and image analysis.
133
133
134
134
:::image type="content" source="media/cognitive-search-quickstart-blob/indexer-notification.png" alt-text="Screenshot of the indexer status page." border="true":::
135
135
136
-
To check details about execution status, select an indexer from the list, and then select **Success** (or **Failed**) to view execution details.
136
+
To view execution details, select **Success** (or **Failed**).
137
137
138
-
In this demo, there's one warning: `"Could not execute skill because one or more skill input was invalid."` It tells you that a PNG file in the data source doesn't provide a text input to Entity Recognition. This warning occurs because the upstream OCR skill didn't recognize any text in the image, and thus couldn't provide a text input to the downstream Entity Recognition skill.
138
+
In this demo, there are a few warnings, including `"Could not execute skill because one or more skill input was invalid."` This warning tells you that a PNG file in the data source doesn't provide a text input to Entity Recognition. It occurs because the upstream OCR skill didn't recognize any text in the image, and thus couldn't provide a text input to the downstream Entity Recognition skill.
139
139
140
140
Warnings are common in skillset execution. As you become familiar with how skills iterate over your data, you might begin to notice patterns and learn which warnings are safe to ignore.
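The dependency between an upstream and a downstream skill can be pictured with a toy model. The Python sketch below is not how the service is implemented; `ocr_skill`, `entity_skill`, and `run_pipeline` are hypothetical stand-ins, and the entity logic (capitalized words) is a deliberate simplification. It only shows why an image with no recognizable text leads to a warning rather than a failure.

```python
WARNING = "Could not execute skill because one or more skill input was invalid."


def ocr_skill(recognizable_text):
    """Stand-in for OCR: returns whatever text the image contains, if any."""
    return recognizable_text.strip()


def entity_skill(text):
    """Stand-in for entity recognition: treats capitalized words as entities."""
    return [word for word in text.split() if word[:1].isupper()]


def run_pipeline(images):
    """Run OCR, then entity recognition, collecting warnings along the way."""
    results, warnings = {}, []
    for name, recognizable_text in images.items():
        text = ocr_skill(recognizable_text)
        if not text:
            # The downstream skill has no text input, so it is skipped with a warning.
            warnings.append(f"{name}: {WARNING}")
            results[name] = []
            continue
        results[name] = entity_skill(text)
    return results, warnings


# Hypothetical inputs: one image with text, one without.
results, warnings = run_pipeline(
    {"slide.png": "Satya Nadella spoke in Redmond", "logo.png": ""}
)
print(results)
print(warnings)
```

Both files still appear in the results; only the file with no text produces a warning, which is the pattern to look for when deciding whether a warning is safe to ignore.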
141
141
142
142
## Query in Search explorer
143
143
144
-
After an index is created, run queries in **Search explorer** to return results.
144
+
After an index is created, use **Search explorer** to return results.
145
145
146
-
1. On the search service dashboard page, select **Search explorer** on the command bar.
146
+
1. On the left, select **Indexes** and then select the index. **Search explorer** is on the first tab.
147
147
148
-
1. Select **Change Index** at the top to select the index you created.
149
-
150
-
1. Enter a search string to query the index, such as `search=Satya Nadella&$select=people,organizations,locations&$count=true`.
148
+
1. Enter a search string to query the index, such as `satya nadella`. The search bar accepts keywords, quote-enclosed phrases, and operators (`"Satya Nadella" +"Bill Gates" +"Steve Ballmer"`).
151
149
152
150
Results are returned as verbose JSON, which can be hard to read, especially in large documents. Some tips for searching in this tool include the following techniques:
153
151
154
-
+ Append `$select` to limit the fields returned in results.
152
+
+ Switch to JSON view to specify parameters that shape results.
153
+
+ Add `select` to limit the fields in results.
154
+
+ Add `count` to show the number of matches.
155
155
+ Use CTRL-F to search within the JSON for specific properties or terms.
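The tips above can be sketched as a small Python helper that builds a JSON query body of the kind you might paste into JSON view. The `search`, `select`, and `count` property names follow the Search Documents REST API; the `build_query` helper itself is a hypothetical convenience, and the field names come from the enrichment fields mentioned earlier in this quickstart.

```python
import json


def build_query(search, select=None, count=False):
    """Build a query body with optional select and count parameters."""
    body = {"search": search}
    if select:
        # select takes a comma-separated list of field names.
        body["select"] = ", ".join(select)
    if count:
        body["count"] = True
    return body


query = build_query(
    "satya nadella",
    select=["people", "organizations", "locations"],
    count=True,
)
print(json.dumps(query, indent=2))
```

Limiting the fields with `select` keeps the verbose JSON response readable, and `count` reports the number of matches alongside the results.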
156
156
157
-
Query strings are case-sensitive so if you get an "unknown field" message, check **Fields** or **Index Definition (JSON)** to verify name and case.
158
-
159
157
:::image type="content" source="media/cognitive-search-quickstart-blob/search-explorer.png" alt-text="Screenshot of the Search explorer page." border="true":::
> Query strings are case-sensitive so if you get an "unknown field" message, check **Fields** or **Index Definition (JSON)** to verify name and case.
171
+
161
172
## Takeaways
162
173
163
-
You've now created your first skillset and learned important concepts useful for prototyping an enriched search solution using your own data.
174
+
You've now created your first skillset and learned the basic steps of skills-based indexing.
164
175
165
-
Some key concepts that we hope you picked up include the dependency on Azure data sources. A skillset is bound to an indexer, and indexers are Azure and source-specific. Although this quickstart uses Azure Blob Storage, other Azure data sources are possible. For more information, see [Indexers in Azure AI Search](search-indexer-overview.md).
176
+
Some key concepts that we hope you picked up include the dependencies. A skillset is bound to an indexer, and indexers are Azure and source-specific. Although this quickstart uses Azure Blob Storage, other Azure data sources are possible. For more information, see [Indexers in Azure AI Search](search-indexer-overview.md).
166
177
167
178
Another important concept is that skills operate over content types, and when working with heterogeneous content, some inputs are skipped. Also, large files or fields might exceed the indexer limits of your service tier. It's normal to see warnings when these events occur.
168
179
169
-
Output is directed to a search index, and there's a mapping between name-value pairs created during indexing and individual fields in your index. Internally, the portal sets up [annotations](cognitive-search-concept-annotations-syntax.md) and defines a [skillset](cognitive-search-defining-skillset.md), establishing the order of operations and general flow. These steps are hidden in the portal, but when you start writing code, these concepts become important.
180
+
Output is routed to a search index, and there's a mapping between name-value pairs created during indexing and individual fields in your index. Internally, the wizard sets up [an enrichment tree](cognitive-search-concept-annotations-syntax.md) and defines a [skillset](cognitive-search-defining-skillset.md), establishing the order of operations and general flow. These steps are hidden in the wizard, but when you start writing code, these concepts become important.
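As a hedged illustration of the mapping mentioned above, the Python sketch below generates output field mappings that route enrichment nodes to index fields. The `sourceFieldName` and `targetFieldName` property names follow the indexer definition in the REST API; the enrichment path `/document/merged_content` and the `build_output_field_mappings` helper are assumptions made for this example.

```python
import json


def build_output_field_mappings(enrichment_root, field_names):
    """Map each enrichment node under the root to an index field of the same name."""
    return [
        {
            "sourceFieldName": f"{enrichment_root}/{name}",
            "targetFieldName": name,
        }
        for name in field_names
    ]


# The enrichment root below is an assumed path, not taken from this quickstart.
mappings = build_output_field_mappings(
    "/document/merged_content", ["people", "organizations", "locations"]
)
print(json.dumps(mappings, indent=2))
```

When you move from the wizard to code, mappings like these are what connect the output of a skill to a field in the index.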
170
181
171
182
Finally, you learned that you can verify content by querying the index. In the end, what Azure AI Search provides is a searchable index, which you can query using either the [simple](/rest/api/searchservice/simple-query-syntax-in-azure-search) or [fully extended query syntax](/rest/api/searchservice/lucene-query-syntax-in-azure-search). An index containing enriched fields is like any other. If you want to incorporate standard or [custom analyzers](search-analyzers.md), [scoring profiles](/rest/api/searchservice/add-scoring-profiles-to-a-search-index), [synonyms](search-synonyms.md), [faceted navigation](search-faceted-navigation.md), geo-search, or any other Azure AI Search feature, you can certainly do so.
172
183
@@ -176,7 +187,7 @@ When you're working in your own subscription, it's a good idea at the end of a p
176
187
177
188
You can find and manage resources in the portal, using the **All resources** or **Resource groups** link in the left-navigation pane.
178
189
179
-
If you're using a free service, remember that you're limited to three indexes, indexers, and data sources. You can delete individual items in the portal to stay under the limit.
190
+
If you use a free service, remember that you're limited to three indexes, indexers, and data sources. You can delete individual items in the portal to stay under the limit.
articles/search/search-get-started-portal-import-vectors.md
+2 -2 (2 additions & 2 deletions)
Original file line number
Diff line number
Diff line change
@@ -9,7 +9,7 @@ ms.service: cognitive-search
9
9
ms.custom:
10
10
- ignite-2023
11
11
ms.topic: quickstart
12
-
ms.date: 11/06/2023
12
+
ms.date: 11/29/2023
13
13
---
14
14
15
15
# Quickstart: Integrated vectorization (preview)
@@ -141,7 +141,7 @@ Search explorer accepts text strings as input and then vectorizes the text for v
141
141
142
142
1. Make sure the API version is **2023-10-01-preview**.
143
143
144
-
1. Enter your search string. Here's a string that gets a count of the chunked documents and selects just the title and chunk fields: `$count=true&$select=title,chunk`.
144
+
1. Select **JSON view** so that you can enter text for your vector query in the **text** vector query parameter.
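For orientation, the Python sketch below builds a JSON body with a **text** vector query parameter of the kind this step refers to. The `vectorQueries`, `kind`, `text`, `fields`, and `k` property names reflect the 2023-10-01-preview REST API as best understood here; the vector field name `vector`, the value of `k`, the sample query text, and the `build_vector_query` helper are assumptions. The `title` and `chunk` fields come from the earlier query string in this article.

```python
import json


def build_vector_query(text, vector_field, k=5, select=None, count=False):
    """Build a query body whose text is vectorized by the service at query time."""
    body = {
        "vectorQueries": [
            {"kind": "text", "text": text, "fields": vector_field, "k": k}
        ]
    }
    if select:
        body["select"] = ", ".join(select)
    if count:
        body["count"] = True
    return body


# "vector" is an assumed field name; check the index definition for the real one.
vector_query = build_vector_query(
    "which plan covers vision exams",
    vector_field="vector",
    select=["title", "chunk"],
    count=True,
)
print(json.dumps(vector_query, indent=2))
```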