Skip to content

Commit 500beb9

Browse files
committed
review references to alphanumeric
1 parent c62c7a9 commit 500beb9

6 files changed

+15
-15
lines changed

articles/search/cognitive-search-create-custom-skill-example.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -32,7 +32,7 @@ Although this example uses an Azure Function to host a web API, it isn't require
3232

3333
1. In Visual Studio, select **New** > **Project** from the File menu.
3434

35-
1. Choose **Azure Functions** as the template and select **Next**. Type a name for your project, and select **Create**. The function app name must be valid as a C# namespace, so don't use underscores, hyphens, or any other non-alphanumeric characters.
35+
1. Choose **Azure Functions** as the template and select **Next**. Type a name for your project, and select **Create**. The function app name must be valid as a C# namespace, so don't use underscores, hyphens, or any other special characters.
3636

3737
1. Select a framework that has long term support.
3838

articles/search/index-add-scoring-profiles.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -99,7 +99,7 @@ Scoring profiles supplement the default scoring algorithm by boosting the scores
9999

100100
For standalone text queries, scoring profiles identify the maximum 1,000 matches in a [BM25-ranked search](index-similarity-and-scoring.md), and the top 50 are returned in results.
101101

102-
For pure vectors, the query is vector-only, but if the [*k*-matching documents](vector-search-ranking.md) include alphanumeric fields that a scoring profile can process, a scoring profile is applied. The scoring profile revises the result set by boosting documents that match criteria in the profile.
102+
For pure vectors, the query is vector-only, but if the [*k*-matching documents](vector-search-ranking.md) include nonvector alphanumeric fields, a scoring profile is applied. The scoring profile revises the result set by boosting documents that match criteria in the profile.
103103

104104
For text queries in a hybrid query, scoring profiles identify the maximum 1,000 matches in a BM25-ranked search. However, once those 1,000 results are identified, they're restored to their original BM25 order so that they can be rescored alongside vectors results in the final [Reciprocal Ranking Function (RRF)](hybrid-search-ranking.md) ordering, where the scoring profile (identified as "final document boosting adjustment" in the illustration) is applied to the merged results, along with [vector weighting](vector-search-how-to-query.md#vector-weighting), and [semantic ranking](semantic-search-overview.md) as the last step.
105105

articles/search/search-filters.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -46,7 +46,7 @@ Filtering occurs in tandem with search, qualifying which documents to include in
4646

4747
## How filters are defined
4848

49-
Filters apply to alphanumeric content on fields that are attributed as `filterable`.
49+
Filters apply to text and numeric (nonvector) content on fields that are attributed as `filterable`.
5050

5151
Filters are OData expressions, articulated in the [filter syntax](search-query-odata-filter.md) supported by Azure AI Search.
5252

articles/search/search-how-to-create-indexers.md

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ ms.service: azure-ai-search
1111
ms.custom:
1212
- ignite-2023
1313
ms.topic: how-to
14-
ms.date: 10/24/2024
14+
ms.date: 12/17/2024
1515
---
1616

1717
# Create an indexer in Azure AI Search
@@ -22,9 +22,9 @@ You can use an indexer to automate data import and indexing in Azure AI Search.
2222

2323
Indexers support two workflows:
2424

25-
+ **Text-based indexing**: Extract strings and metadata from textual content for full text search scenarios.
25+
+ **Raw content indexing (plain text or vectors)**: Extract strings and metadata from textual content for full text search scenarios. Extracts raw vector content for vector search (for example, vectors in an Azure SQL database or Azure Cosmos DB collection). In this workflow, indexing occurs only over existing content that you provide.
2626

27-
+ **Skills-based indexing**: Use built-in or custom skills that add integrated machine learning for analysis over images and large undifferentiated content, extracting or inferring text and structure. Skills-based indexing enables search over content that isn't otherwise easily full text searchable. To learn more, see [AI enrichment in Azure AI Search](cognitive-search-concept-intro.md).
27+
+ **Skills-based indexing**: Extends indexing through built-in or custom skills that create or generate new searchable content. For example, you can add integrated machine learning for analysis over images and unstructured text, extracting or inferring text and structure. Or, use skills to chunk and vectorize content from text and images. Skills-based indexing creates or generates new content that doesn't exist in your external data source. New content becomes part of your index when you add fields to the index schema that accepts the incoming data. To learn more, see [AI enrichment in Azure AI Search](cognitive-search-concept-intro.md).
2828

2929
## Prerequisites
3030

@@ -38,11 +38,11 @@ Indexers support two workflows:
3838

3939
## Indexer patterns
4040

41-
When you create an indexer, the definition is one of two patterns: *text-based indexing* or *skills-based indexing*. The patterns are the same, except that skills-based indexing has more definitions.
41+
When you create an indexer, the definition is one of two patterns: *content-based indexing* or *skills-based indexing*. The patterns are the same, except that skills-based indexing has more definitions.
4242

43-
### Indexer example for text-based indexing
43+
### Indexer example for content-based indexing
4444

45-
Text-based indexing for full text search is the primary use case for indexers. For this workflow, an indexer looks like this example.
45+
Content-based indexing for full text or vector search is the primary use case for indexers. For this workflow, an indexer looks like this example.
4646

4747
```json
4848
{
@@ -114,16 +114,16 @@ Indexers work with data sets. When you run an indexer, it connects to your data
114114

115115
| Source data | Tasks |
116116
|-------------|-------|
117-
| JSON documents | Make sure the structure or shape of incoming data corresponds to the schema of your search index. Most search indexes are fairly flat, where the fields collection consists of fields at the same level. However, hierarchical or nested structures are possible through [complex fields and collections](search-howto-complex-data-types.md). |
117+
| JSON documents | JSON documents can contain text, numbers, and vectors. Make sure the structure or shape of incoming data corresponds to the schema of your search index. Most search indexes are fairly flat, where the fields collection consists of fields at the same level. However, hierarchical or nested structures are possible through [complex fields and collections](search-howto-complex-data-types.md). |
118118
| Relational | Provide data as a flattened row set, where each row becomes a full or partial search document in the index. <br><br> To flatten relational data into a row set, you should create a SQL view, or build a query that returns parent and child records in the same row. For example, the built-in hotels sample dataset is an SQL database that has 50 records (one for each hotel), linked to room records in a related table. The query that flattens the collective data into a row set embeds all of the room information in JSON documents in each hotel record. The embedded room information is a generated by a query that uses a **FOR JSON AUTO** clause. <br><br> You can learn more about this technique in [define a query that returns embedded JSON](index-sql-relational-data.md#define-a-query-that-returns-embedded-json). This is just one example; you can find other approaches that produce the same result. |
119119
| Files | An indexer generally creates one search document for each file, where the search document consists of fields for content and metadata. Depending on the file type, the indexer can sometimes [parse one file into multiple search documents](search-howto-index-one-to-many-blobs.md). For example, in a CSV file, each row can become a standalone search document. |
120120

121121
Remember that you only need to pull in searchable and filterable data:
122122

123-
+ Searchable data is text
124-
+ Filterable data is alphanumeric
123+
+ Searchable data is text or vectors
124+
+ Filterable data is text and numbers (non-vector fields)
125125

126-
Azure AI Search can't search over binary data in any format, although it can extract and infer text descriptions of image files (see [AI enrichment](cognitive-search-concept-intro.md)) to create searchable content. Likewise, large text can be broken down and analyzed by natural language models to find structure or relevant information, generating new content that you can add to a search document.
126+
Azure AI Search can't do a full-text search over binary data in any format, although it can extract and infer text descriptions of image files (see [AI enrichment](cognitive-search-concept-intro.md)) to create searchable content. Likewise, large text can be broken down and analyzed by natural language models to find structure or relevant information, generating new content that you can add to a search document. It can also do vector search over embeddings, including quantized embeddings in a binary format.
127127

128128
Given that indexers don't fix data problems, other forms of data cleansing or manipulation might be needed. For more information, you should refer to the product documentation of your [Azure database product](/azure/?product=databases).
129129

articles/search/search-howto-incremental-index.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -134,7 +134,7 @@ PUT https://[YOUR-SEARCH-SERVICE].search.windows.net/indexers/[YOUR-INDEXER-NAME
134134
}
135135
```
136136

137-
If you now issue another GET request on the indexer, the response from the service includes an `ID` property in the cache object. The alphanumeric string is appended to the name of the container containing all the cached results and intermediate state of each document processed by this indexer. The ID is used to uniquely name the cache in Blob storage.
137+
If you now issue another GET request on the indexer, the response from the service includes an `ID` property in the cache object. The string is appended to the name of the container containing all the cached results and intermediate state of each document processed by this indexer. The ID is used to uniquely name the cache in Blob storage.
138138

139139
```http
140140
"cache": {

articles/search/search-import-data-portal.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ customer intent: As a developer, I want to use wizards for index creation so tha
1717

1818
Azure AI Search has two import wizards that automate indexing and object creation so that you can begin querying immediately. If you're new to Azure AI Search, these wizards are one of the most powerful features at your disposal. With minimal effort, you can create an indexing or enrichment pipeline that exercises most of the functionality of Azure AI Search.
1919

20-
+ **Import data wizard** supports nonvector workflows. You can extract alphanumeric text from raw documents. You can also configure applied AI and built-in skills that infer structure and generate text searchable content from image files and unstructured data.
20+
+ **Import data wizard** supports nonvector workflows. You can extract text and numbers from raw documents. You can also configure applied AI and built-in skills that infer structure and generate text searchable content from image files and unstructured data.
2121

2222
+ **Import and vectorize data wizard** adds chunking and vectorization. You must specify an existing deployment of an embedding model, but the wizard makes the connection, formulates the request, and handles the response. It generates vector content from text or image content.
2323

0 commit comments

Comments
 (0)