---
title: Linking Tables in OneLake to Azure Machine Learning Through the UI
titleSuffix: Azure Machine Learning
description: Learn how to link a table in a OneLake lakehouse to Azure Machine Learning and create a datastore through the UI.
author: helenzusa1
ms.author: helenzeng
ms.reviewer: franksolomon
ms.service: azure-machine-learning
ms.subservice: mldata
ms.topic: how-to
ms.date: 03/03/2025
#Customer intent: Existing solutions help link lakehouse files to Azure Machine Learning resources, and create a datastore through the SDK. However, some customers have lakehouse tables, and they want to create a datastore in Azure Machine Learning through the UI.
---

# Quickstart: Create a datastore in Azure Machine Learning through the UI to link a lakehouse table

Existing solutions can link an Azure Machine Learning resource to OneLake, extract the data, and create a datastore in Azure Machine Learning. However, [in those solutions](#references), the OneLake data is of type "Files." Those solutions don't work for OneLake table-type data, as shown in the following screenshot:

:::image type="content" source="media/create-datastore-with-user-interface/show-fabric-table.png" alt-text="Screenshot showing a table in Microsoft Fabric." lightbox="./media/create-datastore-with-user-interface/show-fabric-table.png":::

Additionally, some customers might prefer to build the link in the UI instead of through the SDK. In this article, you learn how to link OneLake tables to Azure Machine Learning studio resources through the UI.

## Prerequisites

- An Azure subscription; if you don't have an Azure subscription, [create a free account](https://azure.microsoft.com/free) before you start.
- An Azure Machine Learning workspace. Visit [Create workspace resources](./quickstart-create-resources.md).
- An Azure Data Lake Storage (ADLS) storage account. Visit [Create an Azure Data Lake Storage (ADLS) storage account](/azure/storage/blobs/create-data-lake-storage-account).
- Familiarity with role assignments in an Azure storage account.
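
Later sections of this article offer optional scripted alternatives to some of the portal steps. These are minimal sketches, not part of the UI walkthrough itself; they assume the `azure-identity`, `azure-ai-ml`, `azure-mgmt-storage`, and `azure-storage-blob` Python packages, and the placeholder values shown here, which you replace with your own:

```python
# Optional setup for the scripted sketches later in this article. All names and
# IDs below are placeholders -- replace them with your own values.
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
workspace_name = "<workspace-name>"
storage_account = "<storage-account-name>"

# DefaultAzureCredential picks up your Azure CLI, VS Code, or environment login.
credential = DefaultAzureCredential()

# Client for the Azure Machine Learning workspace, used in a later sketch.
ml_client = MLClient(credential, subscription_id, resource_group, workspace_name)
```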

## Solution structure

This solution has three parts. First, create and set up a Data Lake Storage account in the Azure portal. Next, use a Fabric data pipeline to copy the data from OneLake to that Azure Data Lake Storage account. Lastly, create a datastore in Azure Machine Learning that links to the copied data. The following screenshot shows the overall flow of the solution:

:::image type="content" source="media/create-datastore-with-user-interface/overall-idea.png" alt-text="Screenshot showing the overall flow of the solution." lightbox="./media/create-datastore-with-user-interface/overall-idea.png":::

## Set up the Data Lake Storage account in the Azure portal

Assign the **Storage Blob Data Contributor** and **Storage File Data Privileged Contributor** roles to the user identity or service principal, to enable key access and container creation permissions. To assign the appropriate roles to the user identity (a scripted alternative follows these steps):

1. Open the [Microsoft Azure portal](https://portal.azure.com).
1. Select the **Storage accounts** service.

   :::image type="content" source="media/apache-spark-environment-configuration/find-storage-accounts-service.png" lightbox="media/apache-spark-environment-configuration/find-storage-accounts-service.png" alt-text="Screenshot showing selection of the Storage accounts service.":::

1. On the **Storage accounts** page, select the Data Lake Storage account you created in the prerequisite step. A page showing the storage account properties opens.

   :::image type="content" source="media/create-datastore-with-user-interface/create-storage-account.png" alt-text="Screenshot showing the properties page of the Data Lake Storage account." lightbox="./media/create-datastore-with-user-interface/create-storage-account.png":::

1. Select **Access keys** in the left panel, and record a key value. You need this value in a later step.

1. Enable **Allow storage account key access**, as shown in the following screenshot:

   :::image type="content" source="media/create-datastore-with-user-interface/enable-key-access.png" alt-text="Screenshot showing how to enable key access for the Data Lake Storage account in the Azure portal." lightbox="./media/create-datastore-with-user-interface/enable-key-access.png":::

1. Select **Access Control (IAM)** in the left panel, and assign the **Storage Blob Data Contributor** and **Storage File Data Privileged Contributor** roles to the user identity or service principal.

   :::image type="content" source="media/create-datastore-with-user-interface/assign-roles.png" alt-text="Screenshot showing how to assign roles for the Data Lake Storage account in the Azure portal." lightbox="./media/create-datastore-with-user-interface/assign-roles.png":::

1. Create a container in the storage account, and name it **onelake-table**. Later steps use this container as the destination for the copied table data.

   :::image type="content" source="media/create-datastore-with-user-interface/create-container.png" alt-text="Screenshot showing creation of a Data Lake Storage account container in the Azure portal." lightbox="./media/create-datastore-with-user-interface/create-container.png":::
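
If you prefer to script the storage setup, here's a minimal sketch that retrieves an account key and creates the **onelake-table** container. It builds on the setup block in the prerequisites and assumes the role assignments above are already in place:

```python
from azure.mgmt.storage import StorageManagementClient
from azure.storage.blob import BlobServiceClient

# Retrieve an account key for the Data Lake Storage account. This is the same
# value the portal shows on the Access keys page.
storage_client = StorageManagementClient(credential, subscription_id)
keys = storage_client.storage_accounts.list_keys(resource_group, storage_account)
account_key = keys.keys[0].value

# Create the destination container for the copied OneLake table data.
blob_service = BlobServiceClient(
    account_url=f"https://{storage_account}.blob.core.windows.net",
    credential=account_key,
)
blob_service.create_container("onelake-table")
```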

## Use a Fabric data pipeline to copy data to an Azure Data Lake Storage account

1. In the Fabric portal, select **Data pipeline** on the **New item** page.

   :::image type="content" source="media/create-datastore-with-user-interface/create-pipeline.png" alt-text="Screenshot showing selection of Data pipeline on the Fabric New item page." lightbox="./media/create-datastore-with-user-interface/create-pipeline.png":::

1. Select **Copy data assistant**.

   :::image type="content" source="media/create-datastore-with-user-interface/copy-data-assistant.png" alt-text="Screenshot showing selection of the Copy data assistant." lightbox="./media/create-datastore-with-user-interface/copy-data-assistant.png":::

1. In the **Copy data assistant**, select **Azure Blobs**:

   :::image type="content" source="media/create-datastore-with-user-interface/select-azure-blob.png" alt-text="Screenshot showing selection of Azure Blobs in the Fabric Copy data assistant." lightbox="./media/create-datastore-with-user-interface/select-azure-blob.png":::

1. To create a connection to the Azure Data Lake Storage account, select **Authentication kind: Account key**, and then select **Next**:
   :::image type="content" source="media/create-datastore-with-user-interface/create-connection.png" alt-text="Screenshot that shows how to create a connection in a Fabric data pipeline." lightbox="./media/create-datastore-with-user-interface/create-connection.png":::

1. Select the data destination, and then select **Next**:
   :::image type="content" source="media/create-datastore-with-user-interface/select-destination-folder.png" alt-text="Screenshot that shows selection of the data destination." lightbox="./media/create-datastore-with-user-interface/select-destination-folder.png":::

1. Connect to the data destination, and then select **Next**:

   :::image type="content" source="media/create-datastore-with-user-interface/connect-data-destination.png" alt-text="Screenshot that shows connection to the data destination." lightbox="./media/create-datastore-with-user-interface/connect-data-destination.png":::

1. The previous step automatically starts the data copy job:
   :::image type="content" source="media/create-datastore-with-user-interface/copy-activity-scheduled.png" alt-text="Screenshot that shows that the copy activity is scheduled." lightbox="./media/create-datastore-with-user-interface/copy-activity-scheduled.png":::

   This step might take a while to finish, and it leads directly to the next step.

1. Check that the data copy job finished successfully (a scripted check follows these steps):

   :::image type="content" source="media/create-datastore-with-user-interface/copy-activity-success.png" alt-text="Screenshot showing that the copy operation succeeded." lightbox="./media/create-datastore-with-user-interface/copy-activity-success.png":::
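
To confirm the copy programmatically instead of in the UI, a minimal sketch like the following lists the blobs that the pipeline wrote to the destination container. It reuses the `blob_service` client from the earlier setup sketch:

```python
# List what the Fabric pipeline copied into the destination container.
container_client = blob_service.get_container_client("onelake-table")
for blob in container_client.list_blobs():
    print(blob.name, blob.size)
```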

## Create a datastore in Azure Machine Learning that links to the Azure Data Lake Storage container

Now that your data is in the Azure Data Lake Storage resource, you can create an Azure Machine Learning datastore. A scripted alternative follows these steps.

1. In the Azure storage account, verify that the **onelake-table** container you created earlier now holds the copied data, as shown in the following screenshot:
   :::image type="content" source="media/create-datastore-with-user-interface/check-container.png" alt-text="Screenshot that shows how to verify the data in the Azure storage account container." lightbox="./media/create-datastore-with-user-interface/check-container.png":::

1. In Azure Machine Learning studio, create a data asset, and select the **File (uri_file)** type:

   :::image type="content" source="media/create-datastore-with-user-interface/create-data-asset.png" alt-text="Screenshot showing selection of the File (uri_file) type." lightbox="./media/create-datastore-with-user-interface/create-data-asset.png":::

1. Select **From Azure storage**:

   :::image type="content" source="media/create-datastore-with-user-interface/select-azure-storage.png" alt-text="Screenshot that shows how to select Azure storage." lightbox="./media/create-datastore-with-user-interface/select-azure-storage.png":::

1. Using the **Account key** value you recorded earlier, create a **New datastore**:

   :::image type="content" source="media/create-datastore-with-user-interface/new-datastore.png" alt-text="Screenshot that shows how to create a new datastore in Azure Machine Learning." lightbox="./media/create-datastore-with-user-interface/new-datastore.png":::

1. You can also create a datastore directly in Azure Machine Learning studio:

   :::image type="content" source="media/create-datastore-with-user-interface/create-datastore.png" alt-text="Screenshot that shows how to create a datastore in Azure Machine Learning." lightbox="./media/create-datastore-with-user-interface/create-datastore.png":::

1. You can review details of the datastore you created:

   :::image type="content" source="media/create-datastore-with-user-interface/datastore-created.png" alt-text="Screenshot that shows details of the datastore you created." lightbox="./media/create-datastore-with-user-interface/datastore-created.png":::

1. Review the data in the datastore:

   :::image type="content" source="media/create-datastore-with-user-interface/access-datastore.png" alt-text="Screenshot that shows how to access a datastore in Azure Machine Learning." lightbox="./media/create-datastore-with-user-interface/access-datastore.png":::

Now that you successfully created the datastore in Azure Machine Learning, you can use it in machine learning exercises.
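
As a scripted alternative to the studio steps above, here's a minimal sketch that creates the same blob datastore and a **File (uri_file)** data asset with the `azure-ai-ml` SDK. It reuses `ml_client` and `account_key` from the earlier sketches; the datastore name, asset name, and file path inside the container are placeholders:

```python
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import (
    AccountKeyConfiguration,
    AzureBlobDatastore,
    Data,
)

# Datastore that points at the onelake-table container, authenticated with the
# account key recorded earlier.
datastore = AzureBlobDatastore(
    name="onelake_table_datastore",  # placeholder name
    account_name=storage_account,
    container_name="onelake-table",
    credentials=AccountKeyConfiguration(account_key=account_key),
)
ml_client.datastores.create_or_update(datastore)

# File (uri_file) data asset that points at one of the copied files. The path
# inside the container is a placeholder -- use the path the pipeline wrote.
data_asset = Data(
    name="onelake-table-data",  # placeholder name
    type=AssetTypes.URI_FILE,
    path="azureml://datastores/onelake_table_datastore/paths/<copied-file-path>",
)
ml_client.data.create_or_update(data_asset)
```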

## References

+ [Read from a specified table from lakehouse in One workspace using Notebook in other workspace](https://community.fabric.microsoft.com/t5/Data-Engineering/Read-from-a-specified-table-from-lakehouse-in-One-workspace/m-p/4234885)
+ [Delta Lake Tables For Optimal Direct Lake Performance In Fabric Python Notebook](https://fabric.guru/delta-lake-tables-for-optimal-direct-lake-performance-in-fabric-python-notebook)
+ [Create a OneLake (Microsoft Fabric) datastore (preview)](./how-to-datastore.md#create-a-onelake-microsoft-fabric-datastore-preview)
+ [Spark connector for Microsoft Fabric Data Warehouse](/fabric/data-engineering/spark-data-warehouse-connector)
+ [AML and OneLake and Fabric Better Together Demo](https://github.com/azeltov/aml_one_lake)