
Commit 4c57054

SharePoint Online indexer consistency pass
1 parent 519db06 commit 4c57054

5 files changed: +21 -21 lines changed

articles/search/retrieval-augmented-generation-overview.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -77,7 +77,7 @@ The information retrieval system provides the searchable index, query logic, and
7777

7878
The LLM receives the original prompt, plus the results from Azure AI Search. The LLM analyzes the results and formulates a response. If the LLM is ChatGPT, the user interaction might be a back and forth conversation. If you're using Davinci, the prompt might be a fully composed answer. An Azure solution most likely uses Azure OpenAI, but there's no hard dependency on this specific service.
7979

80-
Azure AI Search doesn't provide native LLM integration, web frontends, or vector encoding (embeddings) out of the box, so you need to write code that handles those parts of the solution. You can review demo source ([Azure-Samples/azure-search-openai-demo](https://github.com/Azure-Samples/azure-search-openai-demo)) for a blueprint of what a full solution entails.
80+
Azure AI Search doesn't provide native LLM integration for prompt flows or chat preservation, so you need to write code that handles orchestration and state. You can review demo source ([Azure-Samples/azure-search-openai-demo](https://github.com/Azure-Samples/azure-search-openai-demo)) for a blueprint of what a full solution entails.
8181

8282
## Searchable content in Azure AI Search
8383

articles/search/search-api-preview.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ Preview features are removed from this list if they're retired or transition to
3131
| [**Text Split skill**](cognitive-search-skill-textsplit.md) | AI enrichment (skills) | Text Split has two new chunking-related properties in preview: `maximumPagesToTake`, `pageOverlapLength`. | [Create or Update Skillset (preview)](/rest/api/searchservice/preview-api/create-or-update-skillset), 2023-10-01-Preview or later. Also available in the portal through the [Import and vectorize data wizard](search-get-started-portal-import-vectors.md). |
3232
| [**Index projections**](index-projections-concept-intro.md) | AI enrichment (skills) | A component of a skillset definition that defines the shape of a secondary index, supporting a one-to-many index pattern, where content from an enrichment pipeline can target multiple indexes.| [Create or Update Skillset (preview)](/rest/api/searchservice/preview-api/create-or-update-skillset), 2023-10-01-Preview or later. Also available in the portal through the [Import and vectorize data wizard](search-get-started-portal-import-vectors.md). |
3333
| [**Azure Files indexer**](search-file-storage-integration.md) | Indexer data source | New data source for indexer-based indexing from [Azure Files](https://azure.microsoft.com/services/storage/files/) | [Create or Update Data Source (preview)](/rest/api/searchservice/preview-api/create-or-update-data-source), 2021-04-30-Preview or later. |
34-
| [**SharePoint Indexer**](search-howto-index-sharepoint-online.md) | Indexer data source | New data source for indexer-based indexing of SharePoint content. | [Sign up](https://aka.ms/azure-cognitive-search/indexer-preview) to enable the feature. Use [Create or Update Data Source (preview)](/rest/api/searchservice/preview-api/create-or-update-data-source), 2020-06-30-Preview or later, or the Azure portal. |
34+
| [**SharePoint Online indexer**](search-howto-index-sharepoint-online.md) | Indexer data source | New data source for indexer-based indexing of SharePoint content. | [Sign up](https://aka.ms/azure-cognitive-search/indexer-preview) to enable the feature. Use [Create or Update Data Source (preview)](/rest/api/searchservice/preview-api/create-or-update-data-source), 2020-06-30-Preview or later, or the Azure portal. |
3535
| [**MySQL indexer**](search-howto-index-mysql.md) | Indexer data source | New data source for indexer-based indexing of Azure MySQL data sources.| [Sign up](https://aka.ms/azure-cognitive-search/indexer-preview) to enable the feature. Use [Create or Update Data Source (preview)](/rest/api/searchservice/preview-api/create-or-update-data-source), 2020-06-30-Preview or later, [.NET SDK 11.2.1](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourcetype.mysql), and Azure portal. |
3636
| [**Azure Cosmos DB for MongoDB indexer**](search-howto-index-cosmosdb.md) | Indexer data source | New data source for indexer-based indexing through the MongoDB APIs in Azure Cosmos DB. | [Sign up](https://aka.ms/azure-cognitive-search/indexer-preview) to enable the feature. Use [Create or Update Data Source (preview)](/rest/api/searchservice/preview-api/create-or-update-data-source), 2020-06-30-Preview or later, or the Azure portal.|
3737
| [**Azure Cosmos DB for Apache Gremlin indexer**](search-howto-index-cosmosdb.md) | Indexer data source | New data source for indexer-based indexing through the Apache Gremlin APIs in Azure Cosmos DB. | [Sign up](https://aka.ms/azure-cognitive-search/indexer-preview) to enable the feature. Use [Create or Update Data Source (preview)](/rest/api/searchservice/preview-api/create-or-update-data-source), 2020-06-30-Preview or later.|

articles/search/search-blob-metadata-properties.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ Azure AI Search supports blob indexing and SharePoint document indexing for the
2424

2525
## Properties by document format
2626

27-
The following table summarizes processing for each document format, and describes the metadata properties extracted by a blob indexer and the SharePoint indexer.
27+
The following table summarizes processing for each document format, and describes the metadata properties extracted by a blob indexer and the SharePoint Online indexer.
2828

2929
| Document format / content type | Extracted metadata | Processing details |
3030
| --- | --- | --- |

articles/search/search-howto-index-sharepoint-online.md

Lines changed: 16 additions & 16 deletions
@@ -1,7 +1,7 @@
11
---
2-
title: SharePoint indexer (preview)
2+
title: SharePoint Online indexer (preview)
33
titleSuffix: Azure AI Search
4-
description: Set up a SharePoint indexer to automate indexing of document library content in Azure AI Search.
4+
description: Set up a SharePoint Online indexer to automate indexing of document library content in Azure AI Search.
55
author: gmndrg
66
ms.author: gimondra
77

@@ -15,7 +15,7 @@ ms.date: 03/07/2024
1515
# Index data from SharePoint document libraries
1616

1717
> [!IMPORTANT]
18-
> SharePoint indexer support is in public preview. It's offered "as-is", under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/) and supported on best effort only. Preview features aren't recommended for production workloads and aren't guaranteed to become generally available.
18+
> SharePoint Online indexer support is in public preview. It's offered "as-is", under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/) and supported on best effort only. Preview features aren't recommended for production workloads and aren't guaranteed to become generally available.
1919
>
2020
> Be sure to visit the [known limitations](#limitations-and-considerations) section before you start.
2121
>
@@ -25,7 +25,7 @@ This article explains how to configure a [search indexer](search-indexer-overvie
2525

2626
## Functionality
2727

28-
An indexer in Azure AI Search is a crawler that extracts searchable data and metadata from a data source. The SharePoint indexer connects to your SharePoint site and indexes documents from one or more document libraries. The indexer provides the following functionality:
28+
An indexer in Azure AI Search is a crawler that extracts searchable data and metadata from a data source. The SharePoint Online indexer connects to your SharePoint site and indexes documents from one or more document libraries. The indexer provides the following functionality:
2929

3030
+ Index files and metadata from one or more document libraries.
3131
+ Index incrementally, picking up just the new and changed files and metadata.
@@ -40,7 +40,7 @@ An indexer in Azure AI Search is a crawler that extracts searchable data and met
4040

4141
## Supported document formats
4242

43-
The SharePoint indexer can extract text from the following document formats:
43+
The SharePoint Online indexer can extract text from the following document formats:
4444

4545
[!INCLUDE [search-document-data-sources](../../includes/search-blob-data-sources.md)]
4646

@@ -66,13 +66,13 @@ Here are the considerations when using this feature:
6666

6767
+ If you need a SharePoint content indexing solution in a production environment, consider creating a custom connector with [SharePoint Webhooks](/sharepoint/dev/apis/webhooks/overview-sharepoint-webhooks), calling [Microsoft Graph API](/graph/use-the-api) to export the data to an Azure Blob container, and then use the [Azure Blob indexer](search-howto-indexing-azure-blob-storage.md) for incremental indexing.
6868

69-
<!-- + There could be Microsoft 365 processes that update SharePoint file system-metadata (based on different configurations in SharePoint) and will cause the SharePoint indexer to trigger. Make sure that you test your setup and understand the document processing count prior to using any AI enrichment. Since this is a third-party connector to Azure (SharePoint is located in Microsoft 365), SharePoint configuration is not checked by the indexer. -->
69+
<!-- + There could be Microsoft 365 processes that update SharePoint file system-metadata (based on different configurations in SharePoint) and will cause the SharePoint Online indexer to trigger. Make sure that you test your setup and understand the document processing count prior to using any AI enrichment. Since this is a third-party connector to Azure (SharePoint is located in Microsoft 365), SharePoint configuration is not checked by the indexer. -->
7070

71-
+ If your SharePoint configuration allows Microsoft 365 processes to update SharePoint file system metadata, be aware that these updates can trigger the SharePoint indexer, causing the indexer to ingest documents multiple times. Because the SharePoint indexer is a third-party connector to Azure, the indexer can't read the configuration or vary its behavior. It responds to changes in new and changed content, regardless of how those updates are made. For this reason, make sure that you test your setup and understand the document processing count prior to using the indexer and any AI enrichment.
71+
+ If your SharePoint configuration allows Microsoft 365 processes to update SharePoint file system metadata, be aware that these updates can trigger the SharePoint Online indexer, causing the indexer to ingest documents multiple times. Because the SharePoint Online indexer is a third-party connector to Azure, the indexer can't read the configuration or vary its behavior. It responds to changes in new and changed content, regardless of how those updates are made. For this reason, make sure that you test your setup and understand the document processing count prior to using the indexer and any AI enrichment.
7272

73-
## Configure the SharePoint indexer
73+
## Configure the SharePoint Online indexer
7474

75-
To set up the SharePoint indexer, use both the Azure portal and a preview REST API.
75+
To set up the SharePoint Online indexer, use both the Azure portal and a preview REST API.
7676

7777
This section provides the steps. You can also watch the following video.
7878

@@ -92,7 +92,7 @@ After selecting **Save**, you get an Object ID that has been assigned to your se
9292

9393
### Step 2: Decide which permissions the indexer requires
9494

95-
The SharePoint indexer supports both [delegated and application](/graph/auth/auth-concepts#delegated-and-application-permissions) permissions. Choose which permissions you want to use based on your scenario.
95+
The SharePoint Online indexer supports both [delegated and application](/graph/auth/auth-concepts#delegated-and-application-permissions) permissions. Choose which permissions you want to use based on your scenario.
9696

9797
We recommend app-based permissions. See [limitations](#limitations-and-considerations) for known issues related to delegated permissions.
9898

@@ -106,7 +106,7 @@ If your Microsoft Entra organization has [conditional access enabled](../active-
106106

107107
### Step 3: Create a Microsoft Entra application registration
108108

109-
The SharePoint indexer uses this Microsoft Entra application for authentication.
109+
The SharePoint Online indexer uses this Microsoft Entra application for authentication.
110110

111111
1. Sign in to the [Azure portal](https://portal.azure.com).
112112

@@ -240,7 +240,7 @@ api-key: [admin key]
240240
```
241241

242242
> [!IMPORTANT]
243-
> Only [`metadata_spo_site_library_item_id`](#metadata) may be used as the key field in an index populated by the SharePoint indexer. If a key field doesn't exist in the data source, `metadata_spo_site_library_item_id` is automatically mapped to the key field.
243+
> Only [`metadata_spo_site_library_item_id`](#metadata) may be used as the key field in an index populated by the SharePoint Online indexer. If a key field doesn't exist in the data source, `metadata_spo_site_library_item_id` is automatically mapped to the key field.
244244
245245
### Step 6: Create an indexer
246246

@@ -309,7 +309,7 @@ There are a few steps to creating the indexer:
309309
310310
:::image type="content" source="media/search-howto-index-sharepoint-online/enter-device-code.png" alt-text="Screenshot showing how to enter a device code.":::
311311
312-
1. The SharePoint indexer will access the SharePoint content as the signed-in user. The user that logs in during this step will be that signed-in user. So, if you sign in with a user account that doesn’t have access to a document in the Document Library that you want to index, the indexer won’t have access to that document.
312+
1. The SharePoint Online indexer will access the SharePoint content as the signed-in user. The user that logs in during this step will be that signed-in user. So, if you sign in with a user account that doesn’t have access to a document in the Document Library that you want to index, the indexer won’t have access to that document.
313313
314314
If possible, we recommend creating a new user account and giving that new user the exact permissions that you want the indexer to have.
315315
@@ -383,7 +383,7 @@ If you're indexing document metadata (`"dataToExtract": "contentAndMetadata"`),
383383
| metadata_spo_item_weburi | Edm.String | The URI of the item. |
384384
| metadata_spo_item_path | Edm.String | The combination of the parent path and item name. |
385385
386-
The SharePoint indexer also supports metadata specific to each document type. More information can be found in [Content metadata properties used in Azure AI Search](search-blob-metadata-properties.md).
386+
The SharePoint Online indexer also supports metadata specific to each document type. More information can be found in [Content metadata properties used in Azure AI Search](search-blob-metadata-properties.md).
387387
388388
> [!NOTE]
389389
> To index custom metadata, "additionalColumns" must be specified in the [query parameter of the data source](#query).
@@ -410,7 +410,7 @@ PUT /indexers/[indexer name]?api-version=2020-06-30
410410

411411
## Controlling which documents are indexed
412412

413-
A single SharePoint indexer can index content from one or more document libraries. Use the "container" parameter on the data source definition to indicate which sites and document libraries to index from.
413+
A single SharePoint Online indexer can index content from one or more document libraries. Use the "container" parameter on the data source definition to indicate which sites and document libraries to index from.
414414

415415
The [data source "container" section](#create-data-source) has two properties for this task: "name" and "query".
416416

@@ -443,7 +443,7 @@ The "query" parameter of the data source is made up of keyword/value pairs. The
443443

444444
## Handling errors
445445

446-
By default, the SharePoint indexer stops as soon as it encounters a document with an unsupported content type (for example, an image). You can use the `excludedFileNameExtensions` parameter to skip certain content types. However, you might need to index documents without knowing all the possible content types in advance. To continue indexing when an unsupported content type is encountered, set the `failOnUnsupportedContentType` configuration parameter to false:
446+
By default, the SharePoint Online indexer stops as soon as it encounters a document with an unsupported content type (for example, an image). You can use the `excludedFileNameExtensions` parameter to skip certain content types. However, you might need to index documents without knowing all the possible content types in advance. To continue indexing when an unsupported content type is encountered, set the `failOnUnsupportedContentType` configuration parameter to false:
447447

448448
```http
449449
PUT https://[service name].search.windows.net/indexers/[indexer name]?api-version=2023-10-01-Preview

articles/search/search-indexer-troubleshooting.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -80,15 +80,15 @@ If the database is paused, the first sign in from your search service is expecte
8080

8181
## Microsoft Entra Conditional Access policies
8282

83-
When you create a SharePoint indexer, there's a step requiring you to sign in to your Microsoft Entra app after providing a device code. If you receive a message that says `"Your sign-in was successful but your admin requires the device requesting access to be managed"`, the indexer is probably blocked from the SharePoint document library by a [Conditional Access](../active-directory/conditional-access/overview.md) policy.
83+
When you create a SharePoint Online indexer, there's a step requiring you to sign in to your Microsoft Entra app after providing a device code. If you receive a message that says `"Your sign-in was successful but your admin requires the device requesting access to be managed"`, the indexer is probably blocked from the SharePoint document library by a [Conditional Access](../active-directory/conditional-access/overview.md) policy.
8484

8585
To update the policy and allow indexer access to the document library:
8686

8787
1. Open the Azure portal and search for **Microsoft Entra Conditional Access**.
8888

8989
1. Select **Policies** on the left menu. If you don't have access to view this page, you need to either find someone who has access or get access.
9090

91-
1. Determine which policy is blocking the SharePoint indexer from accessing the document library. The policy that might be blocking the indexer includes the user account that you used to authenticate during the indexer creation step in the **Users and groups** section. The policy also might have **Conditions** that:
91+
1. Determine which policy is blocking the SharePoint Online indexer from accessing the document library. The policy that might be blocking the indexer includes the user account that you used to authenticate during the indexer creation step in the **Users and groups** section. The policy also might have **Conditions** that:
9292

9393
* Restrict **Windows** platforms.
9494
* Restrict **Mobile apps and desktop clients**.
