You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/search/search-howto-index-sharepoint-online.md
+25-4Lines changed: 25 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -9,7 +9,7 @@ manager: liamca
9
9
10
10
ms.service: cognitive-search
11
11
ms.topic: how-to
12
-
ms.date: 09/08/2022
12
+
ms.date: 02/23/2023
13
13
---
14
14
15
15
# Index data from SharePoint document libraries
@@ -19,15 +19,13 @@ ms.date: 09/08/2022
19
19
20
20
This article explains how to configure a [search indexer](search-indexer-overview.md) to index documents stored in SharePoint document libraries for full text search in Azure Cognitive Search. Configuration steps are followed by a deeper exploration of behaviors and scenarios you're likely to encounter.
21
21
22
-
> [!NOTE]
23
-
> SharePoint supports a granular authorization model that determines per-user access at the document level. The SharePoint indexer does not pull these permissions into the search index, and Cognitive Search does not support document-level authorization. When a document is indexed from SharePoint into a search service, the content is available to anyone who has read access to the index. If you require document-level permissions, you should investigate [security filters to trim results](search-security-trimming-for-azure-search-with-aad.md) of unauthorized content.
24
22
25
23
## Functionality
26
24
27
25
An indexer in Azure Cognitive Search is a crawler that extracts searchable data and metadata from a data source. The SharePoint indexer will connect to your SharePoint site and index documents from one or more document libraries. The indexer provides the following functionality:
28
26
29
27
+ Index content and metadata from one or more document libraries.
30
-
+ Incremental indexing, where the indexer identifies which files have changed and indexes only the updated content. For example, if five PDFs are originally indexed and one is updated, only the updated PDF is indexed.
28
+
+ Incremental indexing, where the indexer identifies which file content or metadata have changed and indexes only the updated data. For example, if five PDFs are originally indexed and one is updated, only the updated PDF is indexed.
31
29
+ Deletion detection is built in. If a document is deleted from a document library, the indexer will detect the delete on the next indexer run and remove the document from the index.
32
30
+ Text and normalized images will be extracted by default from the documents that are indexed. Optionally a [skillset](cognitive-search-working-with-skillsets.md) can be added to the pipeline for [AI enrichment](cognitive-search-concept-intro.md).
33
31
@@ -466,6 +464,29 @@ You can also continue indexing if errors happen at any point of processing, eith
466
464
}
467
465
```
468
466
467
+
## Limitations and considerations
468
+
469
+
These are the limitations of this feature:
470
+
471
+
+ Indexing [SharePoint Lists](https://support.microsoft.com/office/introduction-to-lists-0a1c3ace-def0-44af-b225-cfa8d92c52d7) is not supported.
472
+
473
+
+ If a SharePoint file content and/or metadata has been indexed, renaming a SharePoint folder in its parent hierarchy is not a condition that will re-index the document.
474
+
475
+
+ Indexing SharePoint .ASPX site content is not supported.
476
+
477
+
+[Private endpoint](search-indexer-howto-access-private.md) is not supported.
478
+
479
+
+ SharePoint supports a granular authorization model that determines per-user access at the document level. The SharePoint indexer does not pull these permissions into the search index, and Cognitive Search does not support document-level authorization. When a document is indexed from SharePoint into a search service, the content is available to anyone who has read access to the index. If you require document-level permissions, you should investigate [security filters to trim results](search-security-trimming-for-azure-search-with-aad.md) of unauthorized content.
480
+
481
+
482
+
These are the considerations when using this feature:
483
+
484
+
+ If there is a requirement to implement a SharePoint content indexing solution with Cognitive Search in a production environment, consider create a custom connector using [Microsoft Graph Data Connect](/graph/data-connect-concept-overview) with [Blob indexer](search-howto-indexing-azure-blob-storage.md) and [Microsoft Graph API](/graph/use-the-api) for incremental indexing.
485
+
486
+
+ There could be Microsoft 365 processes that update SharePoint file system-metadata (based on different configurations in SharePoint) and will cause the SharePoint indexer to trigger. Make sure that you test your setup and understand the document processing count prior to using any AI enrichment. Since this is a third-party connector to Azure (since SharePoint is located in Microsoft 365), SharePoint configuration is not checked by the indexer.
487
+
488
+
489
+
469
490
## See also
470
491
471
492
+[Indexers in Azure Cognitive Search](search-indexer-overview.md)
0 commit comments