articles/search/search-how-to-create-indexers.md (8 additions, 3 deletions)
@@ -11,7 +11,7 @@ ms.service: azure-ai-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 10/10/2024
+ms.date: 10/24/2024
 ---
 
 # Create an indexer in Azure AI Search
@@ -177,7 +177,7 @@ When you're ready to create an indexer on a remote search service, you need a se
 
 ### [**REST**](#tab/indexer-rest)
 
-Visual Studio Code with a REST client can send indexer requests. Using the app, you can connect to your search service and send [Create indexer (REST)](/rest/api/searchservice/indexers/create) or [Update indexer](/rest/api/searchservice/indexers/create-or-update) requests.
+Visual Studio Code with a REST client can send indexer requests. Using the app, you can connect to your search service and send [Create indexer (REST)](/rest/api/searchservice/indexers/create) or [Create or Update indexer](/rest/api/searchservice/indexers/create-or-update) requests.
 
 ```http
 POST /indexers?api-version=[api-version]
@@ -188,12 +188,17 @@ POST /indexers?api-version=[api-version]
   "parameters": {
     "batchSize": null,
     "maxFailedItems": null,
-    "maxFailedItemsPerBatch": null
+    "maxFailedItemsPerBatch": null,
+    "configuration": {
+      "executionEnvironment": "standard"
+    }
   },
   "fieldMappings": [ optional unless there are field discrepancies that need resolution]
 }
 ```
 
+Parameters are used to set the batch size and how to handle processing failures. The [execution environment](search-howto-run-reset-indexers.md#indexer-execution) determines whether indexer and skillset processing can use the multitenant capabilities provided by Microsoft or the private processing nodes allocated exclusively to your search service.
+
 There are numerous tutorials and examples that demonstrate REST clients for creating objects. [Quickstart: Text search using REST](search-get-started-rest.md) can get you started.
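For context on the change above, the following sketch shows how the new `configuration.executionEnvironment` parameter fits into a complete [Create or Update indexer](/rest/api/searchservice/indexers/create-or-update) request. It's a minimal, hedged example assuming the 2024-07-01 GA API version; the service name, indexer name, data source, index, and admin key are placeholders.

```http
### Hedged example: pin indexer processing to the service's own nodes
PUT https://[service-name].search.windows.net/indexers/my-indexer?api-version=2024-07-01
Content-Type: application/json
api-key: [admin-key]

{
  "name": "my-indexer",
  "dataSourceName": "my-datasource",
  "targetIndexName": "my-index",
  "parameters": {
    "batchSize": null,
    "maxFailedItems": null,
    "maxFailedItemsPerBatch": null,
    "configuration": {
      "executionEnvironment": "private"
    }
  }
}
```

Per the diff text, `standard` lets processing use the multitenant environment provided by Microsoft, while `private` keeps it on the nodes allocated exclusively to your search service.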
articles/search/search-limits-quotas-capacity.md (15 additions, 13 deletions)
@@ -63,16 +63,18 @@ You might find some variation in maximum limits if your service happens to be pr
 
 Maximum number of documents per index are:
 
-+ 24 billion on Basic, S1, S2, S3, L1, and L2 search services.
-+ 2 billion on S3 HD.
++ 24 billion on Basic, S1, S2, S3
++ 2 billion on S3 HD
++ 288 billion on L1
++ 576 billion on L2
 
 Each instance of a complex collection counts as a separate document in terms of these limits.
 
-Maximum document size when calling an Index API is approximately 16 megabytes.
+Maximum size of each document is approximately 16 megabytes. Document size is actually a limit on the size of the indexing API request payload, which is 16 megabytes. That payload can be a single document, or a batch of documents. For a batch with a single document, the maximum document size is 16 MB of JSON.
 
-Document size is actually a limit on the size of the Index API request body. Since you can pass a batch of multiple documents to the Index API at once, the size limit realistically depends on how many documents are in the batch. For a batch with a single document, the maximum document size is 16 MB of JSON.
+Document size applies to *push mode* indexing that uploads documents to a search service. If you're using an indexer for *pull mode* indexing, your source files can be any file size, subject to [indexer limits](#indexer-limits). For the blob indexer, file size limits are larger for higher tiers. For example, the S1 limit is 128 megabytes, S2 limit is 256 megabytes, and so forth.
 
-When estimating document size, remember to consider only those fields that add value to your search scenarios, and exclude any source fields that have no purpose in the queries you intend to run.
+When estimating document size, remember to index only those fields that add value to your search scenarios, and exclude any source fields that have no purpose in the queries you intend to run.
 
 ## Vector index size limits
 
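To make the 16 MB payload limit concrete, here's a hedged sketch of a push-mode indexing request: the limit applies to the entire JSON body, whether it carries one document or a batch. The index name, fields, and API version are placeholders and would need to match your index schema.

```http
### Hedged example: one request body carries the whole batch (16 MB cap)
POST https://[service-name].search.windows.net/indexes/my-index/docs/index?api-version=2024-07-01
Content-Type: application/json
api-key: [admin-key]

{
  "value": [
    { "@search.action": "upload", "id": "1", "title": "First document" },
    { "@search.action": "upload", "id": "2", "title": "Second document" }
  ]
}
```

If a batch approaches the cap, split it into smaller requests; the per-request limit is independent of total index size.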
@@ -119,9 +121,9 @@ Maximum running times exist to provide balance and stability to the service as a
 | Maximum indexing load per invocation |10,000 documents |Limited only by maximum documents |Limited only by maximum documents |Limited only by maximum documents |Limited only by maximum documents |N/A |No limit |No limit |
 | Maximum running time <sup>5</sup>| 1-3 minutes |2 or 24 hours |2 or 24 hours |2 or 24 hours |2 or 24 hours |N/A |2 or 24 hours |2 or 24 hours |
-| Maximum running time for indexers with a skillset <sup>6</sup> | 3-10 minutes |2 hours |2 hours |2 hours |2 hours |N/A |2 hours |2 hours |
-| Blob indexer: maximum characters of content extracted from a blob <sup>7</sup> |32,000 |64,000 |4 million |8 million |16 million |N/A |4 million |4 million |
+| Maximum running time for indexers with a skillset <sup>6</sup> | 3-10 minutes |2 or 24 hours |2 or 24 hours |2 or 24 hours |2 or 24 hours |N/A |2 or 24 hours |2 or 24 hours |
+| Blob indexer: maximum characters of content extracted from a blob <sup>6</sup> |32,000 |64,000 |4 million |8 million |16 million |N/A |4 million |4 million |
 
 <sup>1</sup> Free services have indexer maximum execution time of 3 minutes for blob sources and 1 minute for all other data sources. Indexer invocation is once every 180 seconds. For AI indexing that calls into Azure AI services, free services are limited to 20 free transactions per indexer per day, where a transaction is defined as a document that successfully passes through the enrichment pipeline (tip: you can reset an indexer to reset its count).
@@ -131,11 +133,9 @@ Maximum running times exist to provide balance and stability to the service as a
 
 <sup>4</sup> Maximum of 30 skills per skillset.
 
-<sup>5</sup> Regarding the 2 or 24 hour maximum duration for indexers: a 2-hour maximum is the most common and it's what you should plan for. The 24-hour limit is from an older indexer implementation. If you have unscheduled indexers that run continuously for 24 hours, it's because those indexers couldn't be migrated to the newer infrastructure. As a general rule, for indexing jobs that can't finish within two hours, put the indexer on a [2-hour schedule](search-howto-schedule-indexers.md). When the first 2-hour interval is complete, the indexer picks up where it left off when starting the next 2-hour interval.
+<sup>5</sup> Regarding the 2 or 24 hour maximum duration for indexers: a 2-hour maximum is the most common and it's what you should plan for. It refers to indexers that run in the [public environment](search-howto-run-reset-indexers.md#indexer-execution), used to offload computationally intensive processing and leave more resources for queries. The 24-hour limit applies if you configure the indexer to run in a private environment using only the infrastructure that's allocated to your search service. Note that some older indexers are incapable of running in the public environment, and those indexers always have a 24-hour processing range. If you have unscheduled indexers that run continuously for 24 hours, you can assume those indexers couldn't be migrated to the newer infrastructure. As a general rule, for indexing jobs that can't finish within two hours, put the indexer on a [2-hour schedule](search-howto-schedule-indexers.md). When the first 2-hour interval is complete, the indexer picks up where it left off when starting the next 2-hour interval.
 
-<sup>6</sup> Skillset execution, and image analysis in particular, are computationally intensive and consume disproportionate amounts of available processing power. Running time for these workloads is shorter so that other jobs in the queue have more opportunity to run.
-
-<sup>7</sup> The maximum number of characters is based on Unicode code units, specifically UTF-16.
+<sup>6</sup> The maximum number of characters is based on Unicode code units, specifically UTF-16.
 
 > [!NOTE]
 > As stated in the [Index limits](#index-limits), indexers will also enforce the upper limit of 3000 elements across all complex collections per document starting with the latest GA API version that supports complex types (`2019-05-06`) onwards. This means that if you've created your indexer with a prior API version, you will not be subject to this limit. To preserve maximum compatibility, an indexer that was created with a prior API version and then updated with an API version `2019-05-06` or later, will still be **excluded** from the limits. Customers should be aware of the adverse impact of having very large complex collections (as stated previously) and we highly recommend creating any new indexers with the latest GA API version.
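As a hedged illustration of the footnote 5 guidance, a long-running indexer can be put on a 2-hour schedule. `interval` takes an ISO 8601 duration; the names and API version below are placeholders.

```http
### Hedged example: 2-hour schedule so long jobs resume across intervals
PUT https://[service-name].search.windows.net/indexers/my-indexer?api-version=2024-07-01
Content-Type: application/json
api-key: [admin-key]

{
  "name": "my-indexer",
  "dataSourceName": "my-datasource",
  "targetIndexName": "my-index",
  "schedule": {
    "interval": "PT2H"
  }
}
```

Each run picks up where the previous one left off, so a job that can't finish in two hours still completes across successive intervals.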
@@ -204,8 +204,10 @@ L2 reranking using the semantic reranker has an expected volume:
 
 ## API request limits
 
-+ Maximum of 16 MB per request <sup>1</sup>
-+ Maximum 8-KB URL length
+Except where noted, the following API requests apply to all programmable interfaces, including the Azure SDKs.
+
++ Maximum of 16 MB per indexing or query request when pushing a payload to the search service <sup>1</sup>
++ Maximum 8-KB URL length (applies to REST APIs only)
 + Maximum 1,000 documents per batch of index uploads, merges, or deletes