Commit 6861055

Merge pull request #6586 from HeidiSteen/heidist-freshness
Refresh sharepoint indexer doc
2 parents d5f3c4f + f507b27 commit 6861055

6 files changed

+37
-39
lines changed

articles/search/search-howto-index-sharepoint-online.md

Lines changed: 37 additions & 39 deletions
@@ -6,7 +6,7 @@ author: gmndrg
 ms.author: gimondra
 ms.service: azure-ai-search
 ms.topic: how-to
-ms.date: 06/17/2025
+ms.date: 08/13/2025
 ms.custom:
 - ignite-2023
 - sfi-image-nochange
@@ -22,14 +22,14 @@ ms.custom:
 >
 > [Fill out this form](https://aka.ms/azure-cognitive-search/indexer-preview) to register for the preview. All requests are approved automatically. After you fill out the form, use a [preview REST API](search-api-preview.md) to index your content.

-This article explains how to configure a [search indexer](search-indexer-overview.md) to index documents stored in SharePoint document libraries for full text search in Azure AI Search. Configuration steps are first, followed by behaviors and scenarios
+This article explains how to configure a [search indexer](search-indexer-overview.md) to index documents stored in SharePoint document libraries for full text search in Azure AI Search. Configuration steps are first, followed by behaviors and scenarios.

-In Azure AI Search, an indexer extracts searchable data and metadata from a data source. The SharePoint Online indexer connects to your SharePoint site and indexes documents from one or more document libraries. The indexer provides the following functionality:
+In Azure AI Search, an indexer extracts searchable data and metadata from a data source. The SharePoint Online indexer provides the following functionality:

 + Indexes files and metadata from one or more document libraries.
 + Indexes incrementally, picking up just the new and changed files and metadata.
 + Detects deleted content automatically. Document deletion in the library is picked up on the next indexer run, and the corresponding search document is removed from the index.
-+ Extracts text and normalized images from indexed documents automatically. Optionally, you can add a [skillset](cognitive-search-working-with-skillsets.md) for deeper [AI enrichment](cognitive-search-concept-intro.md), like OCR or text translation.
++ Extracts text and normalized images from indexed documents automatically. Optionally, you can add a [skillset](cognitive-search-working-with-skillsets.md) for deeper [AI enrichment](cognitive-search-concept-intro.md), like OCR or entity recognition.

 ## Prerequisites

@@ -39,6 +39,8 @@ In Azure AI Search, an indexer extracts searchable data and metadata from a data

 + Files in a [document library](https://support.microsoft.com/office/what-is-a-document-library-3b5976dd-65cf-4c9e-bf5a-713c10ca2872).

++ [Visual Studio Code](https://code.visualstudio.com/download) with the [REST Client extension](https://marketplace.visualstudio.com/items?itemName=humao.rest-client) for setting up and running the indexer pipeline.
+
 ## Supported document formats

 The SharePoint Online indexer can extract text from the following document formats:
@@ -73,9 +75,9 @@ Here are some considerations when using this feature:

 + If you still need a custom SharePoint Online content indexing solution using Azure AI Search in a production environment, despite the recommendation to use Microsoft Copilot Studio, consider:

-- Creating a custom connector with [SharePoint Webhooks](/sharepoint/dev/apis/webhooks/overview-sharepoint-webhooks), calling [Microsoft Graph API](/graph/use-the-api) to export the data to an Azure Blob container, and then use the [Azure blob indexer](search-howto-indexing-azure-blob-storage.md) for incremental indexing.
++ Creating a custom connector with [SharePoint Webhooks](/sharepoint/dev/apis/webhooks/overview-sharepoint-webhooks), calling [Microsoft Graph API](/graph/use-the-api) to export the data to an Azure Blob container, and then use the [Azure blob indexer](search-howto-indexing-azure-blob-storage.md) for incremental indexing.

-- Creating your own [Azure Logic Apps workflow](/azure/logic-apps/logic-apps-overview) using [Azure Logic Apps SharePoint connector](/connectors/sharepointonline/) and [Azure AI Search connector](/connectors/azureaisearch/) when reaching General Availability. You can use the workflow generated by the [Azure portal wizard](search-how-to-index-logic-apps-indexers.md) as a starting point and then customize it in the [Azure Logic Apps designer](/azure/logic-apps/quickstart-create-example-consumption-workflow#add-the-trigger) to include the transformation steps you need. The Azure Logic App workflow created when using the [Azure AI Search wizard](search-how-to-index-logic-apps-indexers.md) to index SharePoint Online data is a [consumption workflow](/azure/logic-apps/logic-apps-overview#key-terms). If you're setting up production workloads, make sure to switch to a [standard logic app workflow](/azure/logic-apps/logic-apps-overview#key-terms) and take advantage of its additional enterprise features.
++ Creating your own [Azure Logic Apps workflow](/azure/logic-apps/logic-apps-overview) using [Azure Logic Apps SharePoint connector](/connectors/sharepointonline/) and [Azure AI Search connector](/connectors/azureaisearch/) when reaching General Availability. You can use the workflow generated by the [Azure portal wizard](search-how-to-index-logic-apps-indexers.md) as a starting point and then customize it in the [Azure Logic Apps designer](/azure/logic-apps/quickstart-create-example-consumption-workflow#add-the-trigger) to include the transformation steps you need. The Azure Logic App workflow created when using the [Azure AI Search wizard](search-how-to-index-logic-apps-indexers.md) to index SharePoint Online data is a [consumption workflow](/azure/logic-apps/logic-apps-overview#key-terms). If you're setting up production workloads, make sure to switch to a [standard logic app workflow](/azure/logic-apps/logic-apps-overview#key-terms) and take advantage of its additional enterprise features.

 Regardless of the approach you choose, whether building a custom connector with SharePoint hooks or creating an Azure Logic Apps workflow, be sure to implement robust security measures. These measures include configuring shared private links, setting up firewalls, preserving user permissions from the source and honor those permissions at query time, among others. You should also regularly audit and monitor your pipeline.

@@ -123,58 +125,53 @@ The SharePoint Online indexer uses a Microsoft Entra application for authenticat

 1. On the navigation pane under **Manage**, select **API permissions**, then **Add a permission**, then **Microsoft Graph**.

-+ If the indexer is using application API permissions, then select **Application permissions** and add the following:
++ If the indexer is using application API permissions, select **Application permissions**, and then select **Application.Read.All**.
+
+:::image type="content" source="media/search-howto-index-sharepoint-online/application-api-permissions.png" alt-text="Screenshot of application API permissions." lightbox="media/search-howto-index-sharepoint-online/application-api-permissions.png":::

-+ **Application - Files.Read.All**
-+ **Application - Sites.Read.All**
-
-:::image type="content" source="media/search-howto-index-sharepoint-online/application-api-permissions.png" alt-text="Screenshot of application API permissions.":::
-
-Using application permissions means that the indexer accesses the SharePoint site in a service context. So when you run the indexer it will have access to all content in the SharePoint tenant, which requires tenant admin approval. A client secret is also required for authentication. Setting up the client secret is described later in this article.
+Using application permissions means that the indexer accesses the SharePoint site in a service context. So when you run the indexer, it has access to all content in the SharePoint tenant, which requires tenant admin approval. A client secret is also required for authentication. Setting up the client secret is described later in this article.

-+ If the indexer is using delegated API permissions, select **Delegated permissions** and add the following:
++ If the indexer is using delegated API permissions, select **Delegated permissions** and then select **Application.Read.All**.

-+ **Delegated - Files.Read.All**
-+ **Delegated - Sites.Read.All**
-+ **Delegated - User.Read**
-
-:::image type="content" source="media/search-howto-index-sharepoint-online/delegated-api-permissions.png" alt-text="Screenshot showing delegated API permissions.":::
-
-Delegated permissions allow the search client to connect to SharePoint under the security identity of the current user.
+:::image type="content" source="media/search-howto-index-sharepoint-online/delegated-api-permissions.png" alt-text="Screenshot showing delegated API permissions." lightbox="media/search-howto-index-sharepoint-online/delegated-api-permissions.png":::
+
+Delegated permissions allow the search client to connect to SharePoint under the security identity of the current user.

 1. Give admin consent.

 Tenant admin consent is required when using application API permissions. Some tenants are locked down in such a way that tenant admin consent is required for delegated API permissions as well. If either of these conditions apply, you’ll need to have a tenant admin grant consent for this Microsoft Entra application before creating the indexer.

-:::image type="content" source="media/search-howto-index-sharepoint-online/aad-app-grant-admin-consent.png" alt-text="Screenshot showing Microsoft Entra app grant admin consent.":::
+:::image type="content" source="media/search-howto-index-sharepoint-online/aad-app-grant-admin-consent.png" alt-text="Screenshot showing Microsoft Entra app grant admin consent." lightbox="media/search-howto-index-sharepoint-online/aad-app-grant-admin-consent.png":::
+
+1. From the menu, select **Authentication (Preview)**.

-1. Select the **Authentication** tab.
+1. On the **Redirect URI configuration** tab, select **+ Add Redirect URI**, then **Mobile and desktop applications**, then check `https://login.microsoftonline.com/common/oauth2/nativeclient`, then **Configure**.

-1. Set **Allow public client flows** to **Yes** then select **Save**.
+:::image type="content" source="media/search-howto-index-sharepoint-online/aad-app-authentication-configuration.png" alt-text="Screenshot showing Microsoft Entra app authentication configuration." lightbox="media/search-howto-index-sharepoint-online/aad-app-authentication-configuration.png" :::

-1. Select **+ Add a platform**, then **Mobile and desktop applications**, then check `https://login.microsoftonline.com/common/oauth2/nativeclient`, then **Configure**.
+1. Select the **Settings** tab.

-:::image type="content" source="media/search-howto-index-sharepoint-online/aad-app-authentication-configuration.png" alt-text="Screenshot showing Microsoft Entra app authentication configuration.":::
+1. Enable **Allow public client flows**. Save your changes.

 1. (Application API Permissions only) To authenticate to the Microsoft Entra application using application permissions, the indexer requires a client secret.

-+ Select **Certificates & Secrets** from the menu on the left, then **Client secrets**, then **New client secret**.
-
-:::image type="content" source="media/search-howto-index-sharepoint-online/application-client-secret.png" alt-text="Screenshot showing new client secret.":::
-
++ From the menu, select **Certificates & Secrets**, then **Client secrets**, then **New client secret**.
+
+:::image type="content" source="media/search-howto-index-sharepoint-online/application-client-secret.png" alt-text="Screenshot showing new client secret." lightbox="media/search-howto-index-sharepoint-online/application-client-secret.png" :::
+
 + In the menu that pops up, enter a description for the new client secret. Adjust the expiration date if necessary. If the secret expires, it needs to be recreated and the indexer needs to be updated with the new secret.
-
-:::image type="content" source="media/search-howto-index-sharepoint-online/application-client-secret-setup.png" alt-text="Screenshot showing how to set up a client secret.":::
-
-+ The new client secret appears in the secret list. Once you navigate away from the page, the secret is no longer be visible, so copy it using the copy button and save it in a secure location.
-
-:::image type="content" source="media/search-howto-index-sharepoint-online/application-client-secret-copy.png" alt-text="Screenshot showing where to copy a client secret.":::
+
+:::image type="content" source="media/search-howto-index-sharepoint-online/application-client-secret-setup.png" alt-text="Screenshot showing how to set up a client secret." lightbox="media/search-howto-index-sharepoint-online/application-client-secret-setup.png":::
+
++ The new client secret appears in the secret list. Once you navigate away from the page, the secret is no longer be visible, so copy the value using the copy button and save it in a secure location.
+
+:::image type="content" source="media/search-howto-index-sharepoint-online/application-client-secret-copy.png" alt-text="Screenshot showing where to copy a client secret.":::

 <a name="create-data-source"></a>

 ### Step 4: Create data source

-Starting in this section, use a preview REST API for the remaining steps. We recommend the latest preview API.
+Starting in this section, use a preview REST API and a REST client for the remaining steps. We recommend the latest preview API.

 A data source specifies which data to index, credentials, and policies to efficiently identify changes in the data (new, modified, or deleted rows). A data source can be used by multiple indexers in the same search service.

@@ -202,7 +199,7 @@ api-key: [admin key]

 #### Connection string format

-The format of the connection string changes based on whether the indexer is using delegated API permissions or application API permissions
+The format of the connection string changes based on whether the indexer is using delegated API permissions or application API permissions.

 + Delegated API permissions connection string format

@@ -212,6 +209,8 @@ The format of the connection string changes based on whether the indexer is usin

 `SharePointOnlineEndpoint=[SharePoint site url];ApplicationId=[Azure AD App ID];ApplicationSecret=[Azure AD App client secret];TenantId=[SharePoint site tenant id]`

+You can get tenant ID from the overview page in the Microsoft Entra admin center in your M365 subscription.
+
 > [!NOTE]
 > If the SharePoint site is in the same tenant as the search service and system-assigned managed identity is enabled, `TenantId` doesn't have to be included in the connection string. If the SharePoint site is in a different tenant from the search service, `TenantId` must be included.

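As an illustration of the application-permissions connection string shown in the hunk above, here is a minimal Python sketch that assembles the string from its parts. The function name `build_app_connection_string` and all sample values are hypothetical and are not part of the commit. Omitting `TenantId` follows the note above, which allows it only when the SharePoint site is in the same tenant as the search service and system-assigned managed identity is enabled.

```python
from typing import Optional


def build_app_connection_string(
    site_url: str,
    app_id: str,
    app_secret: str,
    tenant_id: Optional[str] = None,
) -> str:
    """Assemble an application-permissions connection string.

    Hypothetical helper: it only mirrors the key order shown in the
    documented format and performs no validation of the values.
    """
    parts = [
        f"SharePointOnlineEndpoint={site_url}",
        f"ApplicationId={app_id}",
        f"ApplicationSecret={app_secret}",
    ]
    # Per the note, TenantId is required when the SharePoint site is in a
    # different tenant from the search service, so pass it in that case.
    if tenant_id:
        parts.append(f"TenantId={tenant_id}")
    return ";".join(parts)


# Placeholder values for illustration only.
with_tenant = build_app_connection_string(
    "https://contoso.sharepoint.com/sites/docs",
    "my-app-id",
    "my-app-secret",
    tenant_id="my-tenant-id",
)
without_tenant = build_app_connection_string(
    "https://contoso.sharepoint.com/sites/docs",
    "my-app-id",
    "my-app-secret",
)
print(with_tenant)
print(without_tenant)
```

In practice the client secret would come from a secure store rather than a literal in code.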
@@ -267,7 +266,6 @@ There are a few steps to creating the indexer:
 "batchSize": null,
 "maxFailedItems": null,
 "maxFailedItemsPerBatch": null,
-"base64EncodeKeys": null,
 "configuration": {
 "indexedFileNameExtensions" : ".pdf, .docx",
 "excludedFileNameExtensions" : ".png, .jpg",
