articles/machine-learning/how-to-access-data.md
21 lines changed: 21 additions & 21 deletions
@@ -6,10 +6,10 @@ services: machine-learning
 ms.service: machine-learning
 ms.subservice: core
 ms.topic: conceptual
-ms.author: ylxiong
-author: YLXiong1125
+ms.author: sihhu
+author: MayMSFT
 ms.reviewer: nibaccam
-ms.date: 01/13/2020
+ms.date: 01/15/2020
 ms.custom: seodec18

 # Customer intent: As an experienced Python developer, I need to make my data in Azure Storage available to my remote compute to train my machine learning models.
@@ -56,13 +56,16 @@ Datastores currently support storing connection information to the storage services
-Databricks File System| No authentication | | ✓`*` | ✓ `*`|✓`*`
+Databricks File System| No authentication | | ✓* | ✓ * |✓*

-`*`only supported on local compute target scenarios
+*only supported on local compute target scenarios

 ### Storage guidance

-We recommend creating a datastore for an Azure blob container. Both standard and premium storage are available for blobs. Although premium storage is more expensive, its faster throughput speeds might improve the speed of your training runs, particularly if you train against a large dataset. For information about the cost of storage accounts, see the [Azure pricing calculator](https://azure.microsoft.com/pricing/calculator/?service=machine-learning-service).
+We recommend creating a datastore for an Azure blob container.
+Both standard and premium storage are available for blobs. Although premium storage is more expensive, its faster throughput speeds might improve the speed of your training runs, particularly if you train against a large dataset. For information about the cost of storage accounts, see the [Azure pricing calculator](https://azure.microsoft.com/pricing/calculator/?service=machine-learning-service).
+
+When you create a workspace, an Azure blob container and an Azure file share are automatically registered to the workspace. They're named `workspaceblobstore` and `workspacefilestore`, respectively. They store the connection information for the blob container and the file share that are provisioned in the storage account attached to the workspace. The `workspaceblobstore` container is set as the default datastore.
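For reference, the two automatically registered datastores described in the added paragraph can be retrieved with the SDK. A minimal sketch, assuming a `config.json` for the workspace is available locally; `Workspace.from_config()`, `get_default_datastore()`, and `Datastore.get()` are existing azureml-core calls, and the variable names are illustrative:

```python
from azureml.core import Workspace, Datastore

# Load the workspace from a local config.json file.
ws = Workspace.from_config()

# `workspaceblobstore` is the default datastore of a new workspace.
default_store = ws.get_default_datastore()

# The automatically registered file share is retrieved by name.
file_store = Datastore.get(ws, 'workspacefilestore')

print(default_store.name, file_store.name)
```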
-For an Azure Data Lake Storage Generation 2 (ADLS Gen 2) datastore, use [register_azure_data_lake_gen2()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py#register-azure-data-lake-gen2-workspace--datastore-name--filesystem--account-name--tenant-id--client-id--client-secret--resource-url-none--authority-url-none--protocol-none--endpoint-none--overwrite-false-) to register a credential datastore connected to an Azure DataLake Gen 2 storage with service principal permissions. Learn more about [access control et up for ADLS Gen 2](https://docs.microsoft.com/azure/storage/blobs/data-lake-storage-access-control).
+For an Azure Data Lake Storage Generation 2 (ADLS Gen 2) datastore, use [register_azure_data_lake_gen2()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py#register-azure-data-lake-gen2-workspace--datastore-name--filesystem--account-name--tenant-id--client-id--client-secret--resource-url-none--authority-url-none--protocol-none--endpoint-none--overwrite-false-) to register a credential datastore connected to an Azure DataLake Gen 2 storage with service principal permissions. Learn more about [access control set up for ADLS Gen 2](https://docs.microsoft.com/azure/storage/blobs/data-lake-storage-access-control).

 The following code creates and registers the `adlsgen2_datastore_name` datastore to the `ws` workspace. This datastore accesses the file system `test` on the `account_name` storage account, by using the provided service principal credentials.
@@ -144,14 +147,13 @@ tenant_id=os.getenv("ADLSGEN2_TENANT", "<my_tenant_id>") # tenant id of service principal
 client_id=os.getenv("ADLSGEN2_CLIENTID", "<my_client_id>") # client id of service principal
 client_secret=os.getenv("ADLSGEN2_CLIENT_SECRET", "<my_client_secret>") # the secret of service principal
+    account_name=account_name, # ADLS Gen2 account name
+    filesystem='test', # ADLS Gen2 filesystem
+    tenant_id=tenant_id, # tenant id of service principal
+    client_id=client_id, # client id of service principal
+    client_secret=client_secret) # the secret of service principal
 ```
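Because the diff shows only fragments of the registration snippet, here is a minimal end-to-end sketch of the `register_azure_data_lake_gen2()` call this hunk edits. The parameter names come from the SDK reference linked above; the datastore name, the `ADLSGEN2_ACCOUNTNAME` environment variable, and the `test` filesystem are illustrative placeholders:

```python
import os
from azureml.core import Workspace, Datastore

ws = Workspace.from_config()

adlsgen2_datastore_name = 'adlsgen2datastore'  # illustrative datastore name

account_name = os.getenv("ADLSGEN2_ACCOUNTNAME", "<my_account_name>")      # ADLS Gen2 account name
tenant_id = os.getenv("ADLSGEN2_TENANT", "<my_tenant_id>")                 # tenant id of service principal
client_id = os.getenv("ADLSGEN2_CLIENTID", "<my_client_id>")               # client id of service principal
client_secret = os.getenv("ADLSGEN2_CLIENT_SECRET", "<my_client_secret>")  # the secret of service principal

adlsgen2_datastore = Datastore.register_azure_data_lake_gen2(
    workspace=ws,
    datastore_name=adlsgen2_datastore_name,
    account_name=account_name,  # ADLS Gen2 account name
    filesystem='test',          # ADLS Gen2 filesystem
    tenant_id=tenant_id,
    client_id=client_id,
    client_secret=client_secret)
```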
 ### Azure Machine Learning studio

@@ -189,8 +191,6 @@ for name, datastore in datastores.items():
     print(name, datastore.datastore_type)
 ```

-When you create a workspace, an Azure blob container and an Azure file share are automatically registered to the workspace. They're named `workspaceblobstore` and `workspacefilestore`, respectively. They store the connection information for the blob container and the file share that are provisioned in the storage account attached to the workspace. The `workspaceblobstore` container is set as the default datastore.
-
 To get the workspace's default datastore, use this line:
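The snippet that the last context line introduces is cut off in this hunk. A minimal sketch of the listing loop shown above together with the default-datastore call it leads into, assuming `ws` is a loaded `Workspace`; both `ws.datastores` and `get_default_datastore()` are existing azureml-core members:

```python
from azureml.core import Workspace

ws = Workspace.from_config()

# List every datastore registered to the workspace.
datastores = ws.datastores
for name, datastore in datastores.items():
    print(name, datastore.datastore_type)

# Get the workspace's default datastore (`workspaceblobstore` unless changed).
datastore = ws.get_default_datastore()
```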