Commit 243616e

Merge pull request #3536 from fbsolo-ms1/freshness-updates
Freshness update for how-to-access-data.md . . .
2 parents 8ba4bc1 + a8a704d commit 243616e

2 files changed: +34 −34 lines changed

articles/machine-learning/v1/how-to-access-data.md

Lines changed: 33 additions & 33 deletions
@@ -9,7 +9,7 @@ ms.topic: how-to
 ms.author: yogipandey
 author: ynpandey
 ms.reviewer: nibaccam
-ms.date: 02/27/2024
+ms.date: 03/13/2025
 ms.custom: UpdateFrequency5, data4ml
 #Customer intent: As an experienced Python developer, I need to make my data in Azure storage available to my remote compute to train my machine learning models.
 ---
@@ -21,14 +21,14 @@ ms.custom: UpdateFrequency5, data4ml
 In this article, learn how to connect to data storage services on Azure with Azure Machine Learning datastores and the [Azure Machine Learning Python SDK](/python/api/overview/azure/ml/intro).

-Datastores securely connect to your storage service on Azure, and they avoid risk to your authentication credentials or the integrity of your original data store. A datastore stores connection information - for example, your subscription ID or token authorization - in the [Key Vault](https://azure.microsoft.com/services/key-vault/) associated with the workspace. With a datastore, you can securely access your storage because you can avoid hard-coding connection information in your scripts. You can create datastores that connect to [these Azure storage solutions](#supported-data-storage-service-types).
+A datastore securely connects to your storage service on Azure without putting your authentication credentials or the integrity of your original data store at risk. A datastore stores connection information - for example, your subscription ID or token authorization - in the [Key Vault](https://azure.microsoft.com/services/key-vault/) associated with the workspace. With a datastore, you can securely access your storage without hard-coding connection information in your scripts. You can create datastores that connect to [these Azure storage solutions](#supported-data-storage-service-types).

-For information that describes how datastores fit with the Azure Machine Learning overall data access workflow, visit [Securely access data](concept-data.md#data-workflow) article.
+For more information about how datastores fit into the overall Azure Machine Learning data access workflow, visit the [Securely access data](concept-data.md#data-workflow) article.

 To learn how to connect to a data storage resource with a UI, visit [Connect to data storage with the studio UI](how-to-connect-data-ui.md#create-datastores).

 >[!TIP]
-> This article assumes that you will connect to your storage service with credential-based authentication credentials - for example, a service principal or a shared access signature (SAS) token. Note that if credentials are registered with datastores, all users with the workspace *Reader* role can retrieve those credentials. For more information, visit [Manage roles in your workspace](../how-to-assign-roles.md#default-roles).
+> This article assumes that you want to connect to your storage service with credential-based authentication - for example, a service principal or a shared access signature (SAS) token. If credentials are registered with datastores, all users with the workspace *Reader* role can retrieve those credentials. For more information, visit [Manage roles in your workspace](../how-to-assign-roles.md#default-roles).
 >
 > For more information about identity-based data access, visit [Identity-based data access to storage services (v1)](../how-to-identity-based-data-access.md).
@@ -42,28 +42,28 @@ To learn how to connect to a data storage resource with a UI, visit [Connect to

 - An Azure Machine Learning workspace.

-  [Create an Azure Machine Learning workspace](../quickstart-create-resources.md), or use an existing workspace via the Python SDK
+  [Create an Azure Machine Learning workspace](../quickstart-create-resources.md), or use an existing workspace via the Python SDK

-  Import the `Workspace` and `Datastore` class, and load your subscription information from the `config.json` file with the `from_config()` function. By default, the function looks for the JSON file in the current directory, but you can also specify a path parameter to point to the file with `from_config(path="your/file/path")`:
+  Import the `Workspace` and `Datastore` classes, and load your subscription information from the `config.json` file with the `from_config()` function. By default, the function looks for the JSON file in the current directory, but you can also specify a path parameter to point to the file with `from_config(path="your/file/path")`:

-  ```Python
-  import azureml.core
-  from azureml.core import Workspace, Datastore
-
-  ws = Workspace.from_config()
-  ```
+  ```python
+  import azureml.core
+  from azureml.core import Workspace, Datastore
+
+  ws = Workspace.from_config()
+  ```

-Workspace creation automatically registers an Azure blob container and an Azure file share, as datastores, to the workspace. They're named `workspaceblobstore` and `workspacefilestore`, respectively. The `workspaceblobstore` stores workspace artifacts and your machine learning experiment logs. It serves as the **default datastore** and can't be deleted from the workspace. The `workspacefilestore` stores notebooks and R scripts authorized via [compute instance](../concept-compute-instance.md#accessing-files).
+Workspace creation automatically registers an Azure blob container and an Azure file share as datastores in the workspace. They're named `workspaceblobstore` and `workspacefilestore`, respectively. The `workspaceblobstore` stores workspace artifacts and your machine learning experiment logs. It serves as the **default datastore** and can't be deleted from the workspace. The `workspacefilestore` stores notebooks and R scripts authorized via a [compute instance](../concept-compute-instance.md#accessing-files).

-> [!NOTE]
-> Azure Machine Learning designer automatically creates a datastore named **azureml_globaldatasets** when you open a sample in the designer homepage. This datastore only contains sample datasets. Please **do not** use this datastore for any confidential data access.
+> [!NOTE]
+> Azure Machine Learning designer automatically creates a datastore named **azureml_globaldatasets** when you open a sample on the designer homepage. This datastore only contains sample datasets. **Do not** use this datastore for any confidential data access.

 ## Supported data storage service types

-Datastores currently support storage of connection information to the storage services listed in this matrix:
+Datastores currently support storage of connection information for the storage services listed in this matrix:

 > [!TIP]
-> **For unsupported storage solutions** (those not listed in the following table), you might encounter issues as you connect and work with your data. We suggest that you [move your data](#move-data-to-supported-azure-storage-solutions) to a supported Azure storage solution. This can also help with additional scenarios- - for example, reduction of data egress cost during ML experiments.
+> **For unsupported storage solutions** (those not listed in the following table), you might encounter issues as you connect and work with your data. We suggest that you [move your data](#move-data-to-supported-azure-storage-solutions) to a supported Azure storage solution. This can also help with other scenarios - for example, reduction of data egress costs during ML experiments.

 | Storage type | Authentication type | [Azure Machine Learning studio](https://ml.azure.com/) | [Azure Machine Learning Python SDK](/python/api/overview/azure/ml/intro) | [Azure Machine Learning CLI](reference-azure-machine-learning-cli.md) | [Azure Machine Learning REST API](/rest/api/azureml/) | VS Code |
 |---|---|---|---|---|---|---|
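The prerequisite steps in the hunk above (load the workspace from `config.json`, then use the two automatically registered datastores) can be sketched as follows. This is an editorial sketch, not part of the diff: the constant names and the try/except guard are additions so the snippet stays runnable without azureml-core or workspace credentials.

```python
# Default datastore names come from the text above; the guard is an assumption
# so this sketch degrades gracefully when azureml-core or config.json is absent.
DEFAULT_BLOB_STORE = "workspaceblobstore"   # workspace artifacts and experiment logs
DEFAULT_FILE_STORE = "workspacefilestore"   # notebooks and scripts from a compute instance

try:
    from azureml.core import Workspace, Datastore

    ws = Workspace.from_config()  # looks for config.json in the current directory
    blob_store = Datastore.get(ws, DEFAULT_BLOB_STORE)
    file_store = Datastore.get(ws, DEFAULT_FILE_STORE)
except Exception:
    # azureml-core not installed or no workspace configuration available
    ws = blob_store = file_store = None
```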
@@ -108,7 +108,7 @@ Azure Machine Learning can receive requests from clients outside of the virtual
 ### Access validation

 > [!WARNING]
-> Cross tenant access to storage accounts is not supported. If your scenario needs cross tenant access, reach out to the Azure Machine Learning Data Support team alias at **[email protected]** for assistance with a custom code solution.
+> Cross-tenant access to storage accounts isn't supported. If your scenario needs cross-tenant access, reach out to the [Azure Machine Learning Data Support team](mailto:[email protected]) for assistance with a custom code solution.

 **As part of the initial datastore creation and registration process**, Azure Machine Learning automatically validates that the underlying storage service exists and that the user-provided principal (username, service principal, or SAS token) can access the specified storage.

@@ -119,7 +119,7 @@ To authenticate your access to the underlying storage service, you can provide e
 You can find account key, SAS token, and service principal information at your [Azure portal](https://portal.azure.com).

 * To use an account key or SAS token for authentication, select **Storage Accounts** on the left pane, and choose the storage account that you want to register
-  * The **Overview** page provides account name, file share name, container, etc. information
+  * The **Overview** page provides information such as the account name, file share name, and container
 * For account keys, go to **Access keys** on the **Settings** pane
 * For SAS tokens, go to **Shared access signatures** on the **Settings** pane

@@ -139,7 +139,7 @@ For Azure blob container and Azure Data Lake Gen 2 storage, ensure that your aut

 ## Create and register datastores

-Registration of an Azure storage solution as a datastore automatically creates and registers that datastore to a specific workspace. Review [storage access & permissions](#storage-access-and-permissions) in this document for guidance about virtual network scenarios, and where to find required authentication credentials.
+Registration of an Azure storage solution as a datastore automatically creates and registers that datastore to a specific workspace. Review the [storage access & permissions](#storage-access-and-permissions) section in this document for guidance about virtual network scenarios, and where to find required authentication credentials.

 That section offers examples that describe how to create and register a datastore via the Python SDK for these storage types. The parameters shown in these examples are the **required parameters** to create and register a datastore:

@@ -152,18 +152,18 @@ That section offers examples that describe how to create and register a datastor
 To learn how to connect to a data storage resource with a UI, visit [Connect to data with Azure Machine Learning studio](how-to-connect-data-ui.md).

 >[!IMPORTANT]
-> If you unregister and re-register a datastore with the same name, and the re-registration fails, the Azure Key Vault for your workspace may not have soft-delete enabled. By default, soft-delete is enabled for the key vault instance created by your workspace, but it may not be enabled if you used an existing key vault or have a workspace created before October 2020. For information that describes how to enable soft-delete, see [Turn on Soft Delete for an existing key vault](/azure/key-vault/general/soft-delete-change#turn-on-soft-delete-for-an-existing-key-vault).
+> If you unregister and re-register a datastore with the same name, and the re-registration fails, the Azure Key Vault for your workspace might not have soft-delete enabled. By default, soft-delete is enabled for the key vault instance created by your workspace. However, it might not be enabled if you used an existing key vault, or if you have a workspace created before October 2020. For more information about how to enable soft-delete, visit [Turn on Soft Delete for an existing key vault](/azure/key-vault/general/soft-delete-change#turn-on-soft-delete-for-an-existing-key-vault).

 > [!NOTE]
-> A datastore name should only contain lowercase letters, digits and underscores.
+> A datastore name should only contain lowercase letters, digits, and underscores.

 ### Azure blob container

 To register an Azure blob container as a datastore, use the [`register_azure_blob_container()`](/python/api/azureml-core/azureml.core.datastore%28class%29#azureml-core-datastore-register-azure-blob-container) method.

-This code sample creates and registers the `blob_datastore_name` datastore to the `ws` workspace. The datastore uses the provided account access key to access the `my-container-name` blob container on the `my-account-name` storage account. Review the [storage access & permissions](#storage-access-and-permissions) section for guidance about virtual network scenarios, and where to find required authentication credentials.
+The following code sample creates and registers the `blob_datastore_name` datastore to the `ws` workspace. The datastore uses the provided account access key to access the `my-container-name` blob container on the `my-account-name` storage account. Review the [storage access & permissions](#storage-access-and-permissions) section for guidance about virtual network scenarios, and where to find required authentication credentials.

-```Python
+```python
 blob_datastore_name='azblobsdk' # Name of the datastore to workspace
 container_name=os.getenv("BLOB_CONTAINER", "<my-container-name>") # Name of Azure blob container
 account_name=os.getenv("BLOB_ACCOUNTNAME", "<my-account-name>") # Storage account name
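The hunk above truncates the blob-container example after the first three assignments. A possible completion, sketched under the assumption that the standard `Datastore.register_azure_blob_container()` signature applies; the `BLOB_ACCOUNT_KEY` environment-variable name is hypothetical, and the small helper encodes the datastore-name rule from the note above.

```python
import os
import re

def is_valid_datastore_name(name: str) -> bool:
    # Naming rule from the note above: lowercase letters, digits, and underscores only.
    return re.fullmatch(r"[a-z0-9_]+", name) is not None

blob_datastore_name = 'azblobsdk'                                    # name of the datastore in the workspace
container_name = os.getenv("BLOB_CONTAINER", "<my-container-name>")  # Azure blob container name
account_name = os.getenv("BLOB_ACCOUNTNAME", "<my-account-name>")    # storage account name
account_key = os.getenv("BLOB_ACCOUNT_KEY", "<my-account-key>")      # hypothetical env var for the access key

assert is_valid_datastore_name(blob_datastore_name)

try:
    from azureml.core import Workspace, Datastore

    ws = Workspace.from_config()
    blob_datastore = Datastore.register_azure_blob_container(
        workspace=ws,
        datastore_name=blob_datastore_name,
        container_name=container_name,
        account_name=account_name,
        account_key=account_key)
except Exception:
    # azureml-core or workspace credentials unavailable; registration skipped
    blob_datastore = None
```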
@@ -182,7 +182,7 @@ To register an Azure file share as a datastore, use the [`register_azure_file_sh
 This code sample creates and registers the `file_datastore_name` datastore to the `ws` workspace. The datastore uses the `my-fileshare-name` file share on the `my-account-name` storage account, with the provided account access key. Review the [storage access & permissions](#storage-access-and-permissions) section for guidance about virtual network scenarios, and where to find required authentication credentials.

-```Python
+```python
 file_datastore_name='azfilesharesdk' # Name of the datastore to workspace
 file_share_name=os.getenv("FILE_SHARE_CONTAINER", "<my-fileshare-name>") # Name of Azure file share container
 account_name=os.getenv("FILE_SHARE_ACCOUNTNAME", "<my-account-name>") # Storage account name
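The file-share hunk is likewise truncated after the first assignments. A possible completion, assuming the standard `Datastore.register_azure_file_share()` signature; the `FILE_SHARE_ACCOUNT_KEY` environment-variable name is a guess following the pattern of the visible lines, and the guard keeps the sketch runnable without Azure access.

```python
import os

file_datastore_name = 'azfilesharesdk'                                      # name of the datastore in the workspace
file_share_name = os.getenv("FILE_SHARE_CONTAINER", "<my-fileshare-name>")  # Azure file share name
account_name = os.getenv("FILE_SHARE_ACCOUNTNAME", "<my-account-name>")     # storage account name
account_key = os.getenv("FILE_SHARE_ACCOUNT_KEY", "<my-account-key>")       # hypothetical env var for the access key

try:
    from azureml.core import Workspace, Datastore

    ws = Workspace.from_config()
    file_datastore = Datastore.register_azure_file_share(
        workspace=ws,
        datastore_name=file_datastore_name,
        file_share_name=file_share_name,
        account_name=account_name,
        account_key=account_key)
except Exception:
    # azureml-core or workspace credentials unavailable; registration skipped
    file_datastore = None
```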
@@ -197,13 +197,13 @@ file_datastore = Datastore.register_azure_file_share(workspace=ws,

 ### Azure Data Lake Storage Generation 2

-For an Azure Data Lake Storage Generation 2 (ADLS Gen 2) datastore, use the[register_azure_data_lake_gen2()](/python/api/azureml-core/azureml.core.datastore%28class%29#azureml-core-datastore-register-azure-data-lake-gen2) method to register a credential datastore connected to an Azure Data Lake Gen 2 storage with [service principal permissions](/azure/active-directory/develop/howto-create-service-principal-portal).
+For an Azure Data Lake Storage Generation 2 (ADLS Gen 2) datastore, use the [register_azure_data_lake_gen2()](/python/api/azureml-core/azureml.core.datastore%28class%29#azureml-core-datastore-register-azure-data-lake-gen2) method to register a credential datastore connected to an Azure Data Lake Gen 2 storage with [service principal permissions](/azure/active-directory/develop/howto-create-service-principal-portal).

 To use your service principal, you must [register your application](/azure/active-directory/develop/app-objects-and-service-principals) and grant the service principal data access via either Azure role-based access control (Azure RBAC) or access control lists (ACL). For more information, visit [access control set up for ADLS Gen 2](/azure/storage/blobs/data-lake-storage-access-control-model).

 This code creates and registers the `adlsgen2_datastore_name` datastore to the `ws` workspace. This datastore accesses the file system `test` in the `account_name` storage account, through use of the provided service principal credentials. Review the [storage access & permissions](#storage-access-and-permissions) section for guidance on virtual network scenarios, and where to find required authentication credentials.

-```python
+```python
 adlsgen2_datastore_name = 'adlsgen2datastore'

 subscription_id=os.getenv("ADL_SUBSCRIPTION", "<my_subscription_id>") # subscription id of ADLS account
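The ADLS Gen 2 hunk also cuts off before the registration call. A possible completion under the assumption that the standard `Datastore.register_azure_data_lake_gen2()` signature applies: the `filesystem='test'` value comes from the text above, while the environment-variable names beyond `ADL_SUBSCRIPTION` are guesses following the same pattern.

```python
import os

adlsgen2_datastore_name = 'adlsgen2datastore'

subscription_id = os.getenv("ADL_SUBSCRIPTION", "<my_subscription_id>")   # subscription id of ADLS account
tenant_id = os.getenv("ADLSGEN2_TENANT", "<my_tenant_id>")                # tenant id of the service principal (guessed env var)
client_id = os.getenv("ADLSGEN2_CLIENTID", "<my_client_id>")              # client id of the service principal (guessed env var)
client_secret = os.getenv("ADLSGEN2_SECRET", "<my_client_secret>")        # secret of the service principal (guessed env var)
account_name = os.getenv("ADLSGEN2_ACCOUNTNAME", "<my_account_name>")     # ADLS Gen 2 account name (guessed env var)

try:
    from azureml.core import Workspace, Datastore

    ws = Workspace.from_config()
    adlsgen2_datastore = Datastore.register_azure_data_lake_gen2(
        workspace=ws,
        datastore_name=adlsgen2_datastore_name,
        filesystem='test',              # file system name from the text above
        account_name=account_name,
        tenant_id=tenant_id,
        client_id=client_id,
        client_secret=client_secret)
except Exception:
    # azureml-core or service principal credentials unavailable; registration skipped
    adlsgen2_datastore = None
```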
@@ -242,13 +242,13 @@ After datastore creation, [create an Azure Machine Learning dataset](how-to-crea

 To get a specific datastore registered in the current workspace, use the [`get()`](/python/api/azureml-core/azureml.core.datastore%28class%29#get-workspace--datastore-name-) static method on the `Datastore` class:

-```Python
+```python
 # Get a named datastore from the current workspace
 datastore = Datastore.get(ws, datastore_name='your datastore name')
 ```
 To get the list of datastores registered with a given workspace, use the [`datastores`](/python/api/azureml-core/azureml.core.workspace%28class%29#datastores) property on a workspace object:

-```Python
+```python
 # List all datastores registered in the current workspace
 datastores = ws.datastores
 for name, datastore in datastores.items():
@@ -257,18 +257,18 @@ for name, datastore in datastores.items():

 This code sample shows how to get the default datastore of the workspace:

-```Python
+```python
 datastore = ws.get_default_datastore()
 ```
-You can also change the default datastore with this code sample. Only the SDK supports this ability:
+You can also change the default datastore with the following code sample. Only the SDK supports this ability:

-```Python
+```python
 ws.set_default_datastore(new_default_datastore)
 ```

 ## Access data during scoring

-Azure Machine Learning provides several ways to use your models for scoring. Some of these methods provide no access to datastores. The following table describes which methods allow access to datastores during scoring:
+Azure Machine Learning provides several ways to use your models for scoring. Some of these methods provide no access to datastores. The following table describes the methods that allow access to datastores during scoring:

 | Method | Datastore access | Description |
 | ----- | :-----: | ----- |
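The retrieval patterns in the hunks above can be combined into one guarded sketch. The constant and the try/except guard are editorial additions so the snippet stays runnable without azureml-core; `set_default_datastore()` takes the datastore's name as a string.

```python
# Combined sketch of the retrieval patterns shown above; the guard is an
# assumption so this degrades gracefully without a workspace configuration.
DEFAULT_NAME = 'workspaceblobstore'  # default datastore created with every workspace

try:
    from azureml.core import Workspace, Datastore

    ws = Workspace.from_config()
    datastore = Datastore.get(ws, datastore_name=DEFAULT_NAME)  # one datastore by name
    for name, ds in ws.datastores.items():                      # all registered datastores
        print(name, ds.datastore_type)
    ws.set_default_datastore(DEFAULT_NAME)                      # SDK-only: switch the default
except Exception:
    # No azureml-core or workspace configuration in this environment
    datastore = None
```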

articles/machine-learning/v1/how-to-connect-data-ui.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ ms.topic: how-to
 ms.author: yogipandey
 author: ynpandey
 ms.reviewer: fsolomon
-ms.date: 02/09/2024
+ms.date: 03/13/2025
 ms.custom: UpdateFrequency5, data4ml
 #Customer intent: As a low-code experience data scientist, I need to make my data in storage on Azure available to my remote compute to train my ML models.
 ---
