articles/search/search-indexer-troubleshooting.md
Lines changed: 15 additions & 1 deletion
@@ -8,7 +8,7 @@ author: mgottein
8
8
ms.author: magottei
9
9
ms.service: cognitive-search
10
10
ms.topic: conceptual
11
-
ms.date: 05/23/2022
11
+
ms.date: 06/24/2022
12
12
---
13
13
14
14
# Indexer troubleshooting guidance for Azure Cognitive Search
@@ -214,6 +214,20 @@ api-key: [admin key]
214
214
215
215
Azure Cognitive Search has an implicit dependency on Cosmos DB indexing. If you turn off automatic indexing in Cosmos DB, Azure Cognitive Search returns a successful state, but fails to index container contents. For instructions on how to check settings and turn on indexing, see [Manage indexing in Azure Cosmos DB](../cosmos-db/how-to-manage-indexing-policy.md#use-the-azure-portal).
216
216
217
+
218
+
## Indexer reflects a different document count than data source or index
219
+
220
+
An indexer may show a different document count than the data source, the index, or a count in your code at a given point in time. Possible causes include:
221
+
222
+
- The indexer has a deleted document policy. Deleted documents are counted by the indexer if they were indexed before they were deleted.
223
+
- The ID column in the data source isn't unique. This applies to data sources that have the concept of columns, such as Cosmos DB.
224
+
- The data source definition has a different query than the one you're using to estimate the number of records. For example, you might count every record in your database, while the query in the data source definition selects only a subset of records to index.
225
+
- The counts are checked at different times for each component of the pipeline: data source, indexer, and index.
226
+
- The index can take a few minutes to show the actual document count.
227
+
- The data source has a file that's mapped to many documents. This condition can occur when [indexing blobs](search-howto-index-json-blobs.md) and "parsingMode" is set to **`jsonArray`** or **`jsonLines`**.
228
+
- Some documents are processed more than once. See [documents processed multiple times](#documents-processed-multiple-times).
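
Two of the causes above (one file mapped to many documents, and non-unique IDs) can be illustrated with a minimal Python sketch. It is not the service's parser: the helper names `documents_in_blob` and `expected_index_count` are hypothetical, and the sketch assumes that documents sharing a key collapse into a single index entry.

```python
import json


def documents_in_blob(content: str, parsing_mode: str) -> int:
    """Estimate how many search documents one JSON blob yields (illustrative only)."""
    if parsing_mode == "json":
        return 1  # the whole blob becomes one document
    if parsing_mode == "jsonArray":
        return len(json.loads(content))  # one document per array element
    if parsing_mode == "jsonLines":
        # one document per non-empty line
        return sum(1 for line in content.splitlines() if line.strip())
    raise ValueError(f"parsing mode not covered by this sketch: {parsing_mode}")


def expected_index_count(keys) -> int:
    """Assumption: documents with the same key overwrite each other in the index."""
    return len(set(keys))
```

For example, a single blob containing a three-element array counts as one file in the data source but three documents under `jsonArray`, and three source rows with keys `a`, `b`, `a` would be expected to leave two documents in the index.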
229
+
230
+
217
231
## Documents processed multiple times
218
232
219
233
Indexers use a conservative buffering strategy to ensure that every new and changed document in the data source is picked up during indexing. In certain situations, these buffers can overlap, causing an indexer to index a document two or more times, so that the processed document count exceeds the actual number of documents in the data source. This behavior does **not** affect the data stored in the index (for example, it doesn't duplicate documents); it only means the index may take longer to reach eventual consistency. This behavior is especially likely if any of the following conditions are true: