Commit 9669a6a

Merge pull request #7322 from gmndrg/release-sept-2025-search
Update indexing guide for OneLake files
2 parents cfa72b7 + 27e54e8 commit 9669a6a

2 files changed: 26 additions and 9 deletions

articles/search/search-how-to-index-onelake-files.md

Lines changed: 24 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ ms.author: gimondra
77
manager: nitinme
88
ms.service: azure-ai-search
99
ms.topic: how-to
10-
ms.date: 09/17/2025
10+
ms.date: 09/26/2025
1111
ms.custom:
1212
- build-2024
1313
- ignite-2024
@@ -88,7 +88,12 @@ The following OneLake shortcuts are supported by the OneLake files indexer:
8888

8989
+ This indexer doesn't support SQL queries. The `query` property in the data source configuration is used only to optionally specify the folder or shortcut to access.
9090

91-
+ There's no support to ingest files from **My Workspace** workspace in OneLake since this is a personal repository per user.
91+
+ There's no support to ingest files from **My Workspace** workspace in OneLake since this is a personal repository per user.
92+
93+
+ Microsoft Purview sensitivity labels applied via Data Map aren't currently supported. If sensitivity labels are applied to artifacts in OneLake using [Microsoft Purview Data Map](/purview/data-map-sensitivity-labels-apply), the indexer might fail to run. To work around this restriction, the IT team in your organization that manages Purview sensitivity labels and Data Map configurations must grant an exception.
94+
95+
+ Workspace role-based permissions in Microsoft OneLake may affect indexer access to files. Ensure that the Azure AI Search service principal (managed identity) has sufficient permissions over the files you intend to access in the target [Microsoft Fabric workspace](/fabric/fundamentals/workspaces).
96+
9297

9398
## Prepare data for indexing
9499

@@ -162,19 +167,23 @@ The minimum role assignment for your search service identity is Contributor.
162167

163168
:::image type="content" source="media/search-how-to-index-onelake-files/add-user-assigned-managed-identity.png" alt-text="Screenshot showing a Contributor role assignment for a search service user-assigned managed identity in the Azure portal." lightbox="media/search-how-to-index-onelake-files/add-user-assigned-managed-identity.png":::
164169

170+
## Configure a shared private link (required if using Fabric workspace-level private link)
171+
172+
If your Fabric workspace is secured with a [private link](/fabric/security/security-workspace-level-private-links-overview), Azure AI Search won't be able to access your lakehouse data over the public internet, and you won't be able to configure the indexer or its required dependencies, such as the data source. To enable access, you must configure a [shared private link](search-indexer-howto-access-private.md) between Azure AI Search and your Fabric workspace.
173+
165174
## Define the data source
166175

167-
A data source is defined as an independent resource so that it can be used by multiple indexers.
176+
A data source is defined as an independent resource so that it can be used by multiple indexers.
168177

169178
1. Use the [Create or update a data source REST API](/rest/api/searchservice/data-sources/create-or-update) to set its definition. These are the most significant steps of the definition.
170179

171180
1. Set `"type"` to `"onelake"` (required).
172181

173182
1. Get the Microsoft Fabric workspace GUID and the lakehouse GUID:
174183

175-
+ In Power BI, open the lakehouse you'd like to import data from. Notice the lakehouse URL in the browser. It should look similar to this example: "https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000/lakehouses/11111111-1111-1111-1111-111111111111". The URL contains both the workspace GUID and the lakehouse GUID.
184+
+ In Power BI, open the lakehouse you'd like to import data from. Notice the lakehouse URL in the browser. It should look similar to this example: "https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000/lakehouses/11111111-1111-1111-1111-111111111111". The URL contains both the workspace GUID and the lakehouse GUID. If the Fabric workspace is secured with a private link, the URL would start with "https://{FabricWorkspaceGuid}.z{xy}.blob.fabric.microsoft.com".
176185

177-
+ Copy the workspace GUID, which is listed to the right of "groups" in the URL. In this example, it would be 00000000-0000-0000-0000-000000000000. In your REST file, create an environment variable for `{FabricWorkspaceGuid}` and set it to the workspace GUID.
186+
+ Copy the workspace GUID, which is listed to the right of "groups" in the URL. In this example, it would be 00000000-0000-0000-0000-000000000000. In your REST file, create an environment variable for `{FabricWorkspaceGuid}` and set it to the workspace GUID. If your workspace uses a private link, the workspace GUID will appear in a different location in the URL. Be sure to reference the correct part of the URL based on your setup.
178187

179188
:::image type="content" source="media/search-how-to-index-onelake-files/fabric-guid.png" alt-text="Screenshot of the Fabric workspace GUID in the Azure portal." lightbox="media/search-how-to-index-onelake-files/fabric-guid.png" :::
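If you prefer not to copy the GUIDs by hand, the extraction described in this step can be scripted. The following is a minimal sketch, assuming the public-internet URL shape shown in the example (`/groups/{workspace}/lakehouses/{lakehouse}`). The function name `extract_guids` is hypothetical and isn't part of any Azure SDK. The sketch doesn't handle the private link URL format, where the workspace GUID appears in a different location.

```python
import re
from urllib.parse import urlparse

# Standard 8-4-4-4-12 hexadecimal GUID layout.
GUID = r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}"
# Assumed URL shape, taken from the example URL in this article.
_LAKEHOUSE_PATH = re.compile(rf"/groups/({GUID})/lakehouses/({GUID})")


def extract_guids(lakehouse_url: str) -> tuple[str, str]:
    """Return (workspace_guid, lakehouse_guid) parsed from a lakehouse URL."""
    path = urlparse(lakehouse_url).path
    match = _LAKEHOUSE_PATH.search(path)
    if match is None:
        raise ValueError("URL doesn't match /groups/{guid}/lakehouses/{guid}")
    return match.group(1), match.group(2)


workspace_guid, lakehouse_guid = extract_guids(
    "https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000"
    "/lakehouses/11111111-1111-1111-1111-111111111111"
)
print(workspace_guid)
print(lakehouse_guid)
```

The two values map to the `{FabricWorkspaceGuid}` and `{LakehouseGuid}` variables used in the data source definition.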
180189

@@ -190,16 +199,24 @@ A data source is defined as an independent resource so that it can be used by mu
190199
}
191200
```
192201

202+
If you use a [shared private link](search-indexer-howto-access-private.md), set up the managed identity with the following connection string, which differs from the one used for connections over the public internet. Not only is the URL different, but the connection string also uses `WorkspaceEndpoint` instead of `ResourceId`. This applies whether you configure a system-managed identity or a user-assigned managed identity.
203+
204+
```json
205+
"credentials": {
206+
"connectionString": "WorkspaceEndpoint=https://{FabricWorkspaceGuid}.z{xy}.blob.fabric.microsoft.com"
207+
}
208+
```
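The difference between the two connection string forms can be sketched as a small helper. This is an illustration only: the function name `build_connection_string` and its parameters are hypothetical, the `ResourceId={FabricWorkspaceGuid}` form for public-internet access is an assumption based on the `ResourceId` key mentioned above, and the `{xy}` zone value is passed in by the caller rather than derived here.

```python
from typing import Optional


def build_connection_string(workspace_guid: str,
                            private_link_zone: Optional[str] = None) -> str:
    """Build the OneLake data source connection string.

    private_link_zone stands in for the {xy} placeholder in the private
    link endpoint. Leave it as None for access over the public internet.
    """
    if private_link_zone is None:
        # Assumed public-internet form: ResourceId plus the workspace GUID.
        return f"ResourceId={workspace_guid}"
    # Private link form, following the WorkspaceEndpoint template above.
    return (
        "WorkspaceEndpoint="
        f"https://{workspace_guid}.z{private_link_zone}.blob.fabric.microsoft.com"
    )


# Placeholders are left unresolved on purpose; substitute your own values.
print(build_connection_string("{FabricWorkspaceGuid}"))
print(build_connection_string("{FabricWorkspaceGuid}", "{xy}"))
```

Keeping the choice in one function makes it harder to mix the `ResourceId` key with a private endpoint URL, or the reverse.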
209+
193210
1. Set `"container.name"` to the lakehouse GUID, replacing `{LakehouseGuid}` with the value you copied in the previous step. Use `"query"` to optionally specify a lakehouse subfolder or shortcut.
194211

195-
```json
212+
```json
196213
"container": {
197214
"name": "{LakehouseGuid}",
198215
"query": "{optionalLakehouseFolderOrShortcut}"
199216
}
200217
```
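To see how the `type`, `credentials`, and `container` fragments fit together, the following sketch assembles them into a single request body. It's a minimal illustration, not the definitive payload: the function name `build_onelake_data_source` and the sample data source name are hypothetical, omitting `query` when no folder or shortcut is given is an assumption, and the identity settings covered in the next steps are left out.

```python
import json
from typing import Optional


def build_onelake_data_source(name: str,
                              connection_string: str,
                              lakehouse_guid: str,
                              query: Optional[str] = None) -> dict:
    """Assemble a OneLake data source definition as a plain dictionary."""
    container = {"name": lakehouse_guid}
    if query:
        # Optional lakehouse subfolder or shortcut.
        container["query"] = query
    return {
        "name": name,
        "type": "onelake",  # required for the OneLake files indexer
        "credentials": {"connectionString": connection_string},
        "container": container,
    }


payload = build_onelake_data_source(
    name="my-onelake-datasource",  # hypothetical data source name
    connection_string="ResourceId={FabricWorkspaceGuid}",  # assumed form
    lakehouse_guid="{LakehouseGuid}",
    query="{optionalLakehouseFolderOrShortcut}",
)
print(json.dumps(payload, indent=2))
```

The resulting JSON can be used as the body of a Create or update a data source request after you add the identity settings that apply to your search service.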
201218

202-
1. Set the authentication method using the user-assigned managed identity, or skip to the next step for system-managed identity.
219+
1. Set the authentication method using the user-assigned managed identity, or skip to the next step for system-managed identity.
203220

204221
```json
205222
{

articles/search/search-indexer-howto-access-private.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ author: mrcarter8
77
ms.author: mcarter
88
ms.service: azure-ai-search
99
ms.topic: how-to
10-
ms.date: 07/31/2025
10+
ms.date: 09/26/2025
1111
ms.update-cycle: 180-days
1212
ms.custom:
1313
- ignite-2024
@@ -134,7 +134,7 @@ You can create a shared private link for the following resources.
134134

135135
<sup>8</sup> Shared private links are now supported (as of November 2024) for connections to Azure AI services multi-service accounts. Azure AI Search connects to Azure AI services multi-service for [billing purposes](cognitive-search-attach-cognitive-services.md). These connections can now be private through a shared private link. Shared private link is only supported when configuring [a managed identity (keyless configuration)](cognitive-search-attach-cognitive-services.md#bill-through-a-keyless-connection) in the skillset definition.
136136

137-
<sup>9</sup> Shared private link is supported for connections to OneLake workspace. To create a `privateLinkServicesForFabric` resource specific to a workspace, [register](/azure/azure-resource-manager/management/resource-providers-and-types#register-resource-provider) `Microsoft.Fabric` namespace to your subscription and refer to step 2 as documented in [Create the private link service in Azure](/fabric/security/security-workspace-level-private-links-set-up#step-2-create-the-private-link-service-in-azure).
137+
<sup>9</sup> Shared private link is supported for connections to OneLake workspace. To create a `privateLinkServicesForFabric` resource specific to a workspace, [register](/azure/azure-resource-manager/management/resource-providers-and-types#register-resource-provider) `Microsoft.Fabric` namespace to your subscription and refer to step 2 as documented in [Create the private link service in Azure](/fabric/security/security-workspace-level-private-links-set-up#step-2-create-the-private-link-service-in-azure). Note that when using a shared private link, the OneLake data source configuration must be defined with a specific connection string as outlined in the [OneLake indexer documentation](search-how-to-index-onelake-files.md#define-the-data-source).
138138

139139
## 1 - Create a shared private link
140140
