
Commit ff19a4b

Merge pull request #101202 from nibaccam/datastore
Data| Datastore minor updates
2 parents: 2c97afe + 73daf90

File tree

1 file changed: +21 -21 lines changed

articles/machine-learning/how-to-access-data.md

Lines changed: 21 additions & 21 deletions
@@ -6,10 +6,10 @@ services: machine-learning
 ms.service: machine-learning
 ms.subservice: core
 ms.topic: conceptual
-ms.author: ylxiong
-author: YLXiong1125
+ms.author: sihhu
+author: MayMSFT
 ms.reviewer: nibaccam
-ms.date: 01/13/2020
+ms.date: 01/15/2020
 ms.custom: seodec18

 # Customer intent: As an experienced Python developer, I need to make my data in Azure Storage available to my remote compute to train my machine learning models.
@@ -56,13 +56,16 @@ Datastores currently support storing connection information to the storage servi
 Azure&nbsp;SQL&nbsp;Database| SQL authentication <br>Service principal| ✓ | ✓ | ✓ |✓
 Azure&nbsp;PostgreSQL | SQL authentication| ✓ | ✓ | ✓ |✓
 Azure&nbsp;Database&nbsp;for&nbsp;MySQL | SQL authentication| | ✓ | ✓ |✓
-Databricks&nbsp;File&nbsp;System| No authentication | | ✓`*` | ✓ `*`|✓`*`
+Databricks&nbsp;File&nbsp;System| No authentication | | ✓* | ✓ * |✓*

-`*` only supported on local compute target scenarios
+*only supported on local compute target scenarios

 ### Storage guidance

-We recommend creating a datastore for an Azure blob container. Both standard and premium storage are available for blobs. Although premium storage is more expensive, its faster throughput speeds might improve the speed of your training runs, particularly if you train against a large dataset. For information about the cost of storage accounts, see the [Azure pricing calculator](https://azure.microsoft.com/pricing/calculator/?service=machine-learning-service).
+We recommend creating a datastore for an Azure blob container.
+Both standard and premium storage are available for blobs. Although premium storage is more expensive, its faster throughput speeds might improve the speed of your training runs, particularly if you train against a large dataset. For information about the cost of storage accounts, see the [Azure pricing calculator](https://azure.microsoft.com/pricing/calculator/?service=machine-learning-service).
+
+When you create a workspace, an Azure blob container and an Azure file share are automatically registered to the workspace. They're named `workspaceblobstore` and `workspacefilestore`, respectively. They store the connection information for the blob container and the file share that are provisioned in the storage account attached to the workspace. The `workspaceblobstore` container is set as the default datastore.

 <a name="access"></a>

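The guidance above recommends a blob-container datastore, but this hunk doesn't show the registration call itself. As a sketch for context (not part of this commit), registering one with azureml-core looks roughly like this; the datastore, container, and account names and the key are all hypothetical placeholders:

```Python
# Sketch only, not part of this diff: register the recommended
# blob-container datastore. All names and keys are placeholders.
from azureml.core import Workspace, Datastore

ws = Workspace.from_config()  # reads workspace details from a local config.json

blob_datastore = Datastore.register_azure_blob_container(workspace=ws,
                                                         datastore_name='my_blob_datastore',  # placeholder
                                                         container_name='my-container',       # placeholder
                                                         account_name='mystorageacct',        # placeholder
                                                         account_key='<my-account-key>')      # placeholder key
```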
@@ -121,15 +124,15 @@ account_name=os.getenv("FILE_SHARE_ACCOUNTNAME", "<my-account-name>") # Storage
 account_key=os.getenv("FILE_SHARE_ACCOUNT_KEY", "<my-account-key>") # Storage account key

 file_datastore = Datastore.register_azure_file_share(workspace=ws,
-                                                    datastore_name=file_datastore_name,
-                                                    file_share_name=file_share_name,
-                                                    account_name=account_name,
-                                                    account_key=account_key)
+                                                     datastore_name=file_datastore_name,
+                                                     file_share_name=file_share_name,
+                                                     account_name=account_name,
+                                                     account_key=account_key)
 ```

 #### Azure Data Lake Storage Generation 2

-For an Azure Data Lake Storage Generation 2 (ADLS Gen 2) datastore, use [register_azure_data_lake_gen2()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py#register-azure-data-lake-gen2-workspace--datastore-name--filesystem--account-name--tenant-id--client-id--client-secret--resource-url-none--authority-url-none--protocol-none--endpoint-none--overwrite-false-) to register a credential datastore connected to an Azure DataLake Gen 2 storage with service principal permissions. Learn more about [access control et up for ADLS Gen 2](https://docs.microsoft.com/azure/storage/blobs/data-lake-storage-access-control).
+For an Azure Data Lake Storage Generation 2 (ADLS Gen 2) datastore, use [register_azure_data_lake_gen2()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py#register-azure-data-lake-gen2-workspace--datastore-name--filesystem--account-name--tenant-id--client-id--client-secret--resource-url-none--authority-url-none--protocol-none--endpoint-none--overwrite-false-) to register a credential datastore connected to an Azure DataLake Gen 2 storage with service principal permissions. Learn more about [access control set up for ADLS Gen 2](https://docs.microsoft.com/azure/storage/blobs/data-lake-storage-access-control).

 The following code creates and registers the `adlsgen2_datastore_name` datastore to the `ws` workspace. This datastore accesses the file system `test` on the `account_name` storage account, by using the provided service principal credentials.

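For context beyond what this hunk shows: once registered, a datastore can be fetched by name in later sessions. A minimal sketch (not part of this change), assuming the `ws` and `file_datastore_name` variables from the snippet above:

```Python
# Sketch only: retrieve a previously registered datastore by name.
from azureml.core import Datastore

file_datastore = Datastore.get(ws, datastore_name=file_datastore_name)
print(file_datastore.name, file_datastore.datastore_type)
```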
@@ -144,14 +147,13 @@ tenant_id=os.getenv("ADLSGEN2_TENANT", "<my_tenant_id>") # tenant id of service
 client_id=os.getenv("ADLSGEN2_CLIENTID", "<my_client_id>") # client id of service principal
 client_secret=os.getenv("ADLSGEN2_CLIENT_SECRET", "<my_client_secret>") # the secret of service principal

-adlsgen2_datastore = Datastore.register_azure_data_lake_gen2(
-    workspace=ws,
-    datastore_name=adlsgen2_datastore_name,
-    account_name=account_name, # ADLS Gen2 account name
-    filesystem='test', # Name of ADLS Gen2 filesystem
-    tenant_id=tenant_id, # tenant id of service principal
-    client_id=client_id, # client id of service principal
-    client_secret=client_secret) # the secret of service principal
+adlsgen2_datastore = Datastore.register_azure_data_lake_gen2(workspace=ws,
+                                                             datastore_name=adlsgen2_datastore_name,
+                                                             account_name=account_name, # ADLS Gen2 account name
+                                                             filesystem='test', # ADLS Gen2 filesystem
+                                                             tenant_id=tenant_id, # tenant id of service principal
+                                                             client_id=client_id, # client id of service principal
+                                                             client_secret=client_secret) # the secret of service principal
 ```

 ### Azure Machine Learning studio
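For context (not part of this change): a registered datastore like `adlsgen2_datastore` is typically consumed through a dataset. A minimal sketch, where the folder `sample_data` is a hypothetical path inside the `test` filesystem:

```Python
# Sketch only: reference files on the registered ADLS Gen 2 datastore.
# 'sample_data' is a hypothetical folder used for illustration.
from azureml.core import Dataset

dataset = Dataset.File.from_files(path=(adlsgen2_datastore, 'sample_data/**'))
print(dataset.to_path())  # lists the matched file paths
```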
@@ -189,8 +191,6 @@ for name, datastore in datastores.items():
     print(name, datastore.datastore_type)
 ```

-When you create a workspace, an Azure blob container and an Azure file share are automatically registered to the workspace. They're named `workspaceblobstore` and `workspacefilestore`, respectively. They store the connection information for the blob container and the file share that are provisioned in the storage account attached to the workspace. The `workspaceblobstore` container is set as the default datastore.
-
 To get the workspace's default datastore, use this line:

 ```Python
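The hunk ends at the opening fence, so the line itself isn't part of this diff. As a hedged sketch, fetching the default datastore with azureml-core is typically:

```Python
# Sketch only, not shown in this hunk: get the workspace's default
# datastore (the `workspaceblobstore` container unless it was changed).
datastore = ws.get_default_datastore()
print(datastore.name)
```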
