Commit 3f39242

Fixing branch conflicts
1 parent 1acf93f commit 3f39242

1 file changed: +15 -1 lines changed

articles/search/search-indexer-troubleshooting.md

Lines changed: 15 additions & 1 deletion
@@ -8,7 +8,7 @@ author: mgottein
 ms.author: magottei
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 05/23/2022
+ms.date: 06/24/2022
 ---
 
 # Indexer troubleshooting guidance for Azure Cognitive Search
@@ -214,6 +214,20 @@ api-key: [admin key]
 
 Azure Cognitive Search has an implicit dependency on Cosmos DB indexing. If you turn off automatic indexing in Cosmos DB, Azure Cognitive Search returns a successful state, but fails to index container contents. For instructions on how to check settings and turn on indexing, see [Manage indexing in Azure Cosmos DB](../cosmos-db/how-to-manage-indexing-policy.md#use-the-azure-portal).
 
+
+## Indexer reflects a different document count than data source or index
+
+An indexer may show a different document count than the data source, the index, or a count taken in your own code at a given point in time, depending on specific circumstances. Possible causes include:
+
+- The indexer has a Deleted Document Policy. Deleted documents are counted by the indexer if they were indexed before they were deleted.
+- The ID column in the data source isn't unique. This applies to data sources that have the concept of columns, such as Cosmos DB.
+- The data source definition uses a different query than the one you use to estimate the number of records. For example, you might count every record in your database, while the query in the data source definition selects only a subset of records to index.
+- The counts are checked at different intervals for each component of the pipeline: data source, indexer, and index.
+- The index may take a few minutes to show the real document count.
+- The data source has a file that's mapped to many documents. This condition can occur when [indexing blobs](search-howto-index-json-blobs.md) and "parsingMode" is set to **`jsonArray`** or **`jsonLines`**.
+- Documents were [processed multiple times](#documents-processed-multiple-times).
+
+
 ## Documents processed multiple times
 
 Indexers use a conservative buffering strategy to ensure that every new and changed document in the data source is picked up during indexing. In certain situations, these buffers can overlap, causing an indexer to index a document two or more times, so that the processed document count exceeds the actual number of documents in the data source. This behavior does **not** affect the data stored in the index (for example, it doesn't duplicate documents); it only means that reaching eventual consistency may take longer. This can be especially prevalent if any of the following conditions are true:
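
As a rough illustration of two of the count discrepancies described in this diff (non-unique IDs and documents processed more than once), the following Python sketch simulates how the three counts can diverge. It doesn't call any Azure API. The function `simulate_counts`, the sample rows, and the overlapping batch layout are all hypothetical, and the sketch assumes that index writes behave as upserts keyed on the document ID.

```python
def simulate_counts(source_rows, batches):
    """Simulate an indexing run over possibly overlapping batches.

    source_rows: list of dicts, each with an "id" key (IDs may repeat).
    batches: list of lists of row positions; overlap models a row being
             picked up more than once.
    """
    index = {}      # assumption: writes are upserts keyed on the document ID
    processed = 0   # what an indexer-side "documents processed" counter would see
    for batch in batches:
        for position in batch:
            row = source_rows[position]
            index[row["id"]] = row   # a repeated ID overwrites, it never duplicates
            processed += 1
    return {
        "source_rows": len(source_rows),
        "processed": processed,
        "indexed": len(index),
    }


# Hypothetical data: the ID "b" appears twice, and row 2 falls into both batches.
rows = [{"id": "a"}, {"id": "b"}, {"id": "b"}, {"id": "c"}]
batches = [[0, 1, 2], [2, 3]]

counts = simulate_counts(rows, batches)
print(counts)
```

In this toy run the data source holds 4 rows, the simulated indexer processes 5, and the index ends up with 3 documents, so all three counts differ even though nothing is stored twice.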
