title: Add additional Azure storage accounts to HDInsight
3
-
description: Learn how to add additional Azure storage accounts to an existing HDInsight cluster.
2
+
title: Add additional Azure Storage accounts to HDInsight
3
+
description: Learn how to add additional Azure Storage accounts to an existing HDInsight cluster.
4
4
author: hrasheed-msft
5
5
ms.author: hrasheed
6
6
ms.reviewer: jasonh
7
7
ms.service: hdinsight
8
8
ms.topic: conceptual
9
-
ms.date: 10/31/2019
9
+
ms.date: 01/21/2020
10
10
---
11
11
12
12
# Add additional storage accounts to HDInsight
13
13
14
-
Learn how to use script actions to add additional Azure storage *accounts* to HDInsight. The steps in this document add a storage *account* to an existing Linux-based HDInsight cluster. This article applies to storage *accounts* (not the default cluster storage account), and not additional storage such as [Azure Data Lake Storage Gen1](hdinsight-hadoop-use-data-lake-store.md) and [Azure Data Lake Storage Gen2](hdinsight-hadoop-use-data-lake-storage-gen2.md).
14
+
Learn how to use script actions to add additional Azure Storage *accounts* to HDInsight. The steps in this document add a storage *account* to an existing HDInsight cluster. This article applies to storage *accounts* (not the default cluster storage account), and not additional storage such as [Azure Data Lake Storage Gen1](hdinsight-hadoop-use-data-lake-store.md) and [Azure Data Lake Storage Gen2](hdinsight-hadoop-use-data-lake-storage-gen2.md).
15
15
16
16
> [!IMPORTANT]
17
17
> The information in this document is about adding additional storage account(s) to a cluster after it has been created. For information on adding storage accounts during cluster creation, see [Set up clusters in HDInsight with Apache Hadoop, Apache Spark, Apache Kafka, and more](hdinsight-hadoop-provision-linux-clusters.md).
@@ -20,21 +20,10 @@ Learn how to use script actions to add additional Azure storage *accounts* to HD
20
20
21
21
* A Hadoop cluster on HDInsight. See [Get Started with HDInsight on Linux](./hadoop/apache-hadoop-linux-tutorial-get-started.md).
22
22
* Storage account name and key. See [Manage storage account access keys](../storage/common/storage-account-keys-manage.md).
* If using PowerShell, you'll need the Az module. See [Overview of Azure PowerShell](https://docs.microsoft.com/powershell/azure/overview).
25
-
* If you haven't installed the Azure CLI, see [Azure Command-Line Interface (CLI)](https://docs.microsoft.com/cli/azure/?view=azure-cli-latest).
26
-
* If using bash or a Windows command prompt, you'll also need **jq**, a command-line JSON processor. See [https://stedolan.github.io/jq/](https://stedolan.github.io/jq/). For bash on Ubuntu on Windows 10, see [Windows Subsystem for Linux Installation Guide for Windows 10](https://docs.microsoft.com/windows/wsl/install-win10).
27
24
28
25
## How it works
29
26
30
-
This script takes the following parameters:
31
-
32
-
* __Azure storage account name__: The name of the storage account to add to the HDInsight cluster. After running the script, HDInsight can read and write data stored in this storage account.
33
-
34
-
* __Azure storage account key__: A key that grants access to the storage account.
35
-
36
-
* __-p__ (optional): If specified, the key isn't encrypted and is stored in the core-site.xml file as plain text.
37
-
38
27
During processing, the script performs the following actions:
39
28
40
29
* If the storage account already exists in the core-site.xml configuration for the cluster, the script exits and no further actions are performed.
@@ -50,80 +39,38 @@ During processing, the script performs the following actions:
50
39
> [!WARNING]
51
40
> Using a storage account in a different location than the HDInsight cluster is not supported.
__Requirements__: The script must be applied on the __Head nodes__. You don't need to mark this script as __Persisted__, as it directly updates the Ambari configuration for the cluster.
58
-
59
-
## To use the script
60
-
61
-
This script can be used from the Azure PowerShell, Azure CLI, or the Azure portal.
62
-
63
-
### PowerShell
64
-
65
-
Using [Submit-AzHDInsightScriptAction](https://docs.microsoft.com/powershell/module/az.hdinsight/submit-azhdinsightscriptaction). Replace `CLUSTERNAME`, `ACCOUNTNAME`, and `ACCOUNTKEY` with the appropriate values.
Using [az hdinsight script-action execute](https://docs.microsoft.com/cli/azure/hdinsight/script-action?view=azure-cli-latest#az-hdinsight-script-action-execute). Replace `CLUSTERNAME`, `RESOURCEGROUP`, `ACCOUNTNAME`, and `ACCOUNTKEY` with the appropriate values.
Use [Script Action](hdinsight-hadoop-customize-cluster-linux.md#apply-a-script-action-to-a-running-cluster) to apply the changes with the following considerations:
See [Apply a script action to a running cluster](hdinsight-hadoop-customize-cluster-linux.md#apply-a-script-action-to-a-running-cluster).
52
+
* `ACCOUNTNAME` is the name of the storage account to add to the HDInsight cluster.
53
+
* `ACCOUNTKEY` is the access key for `ACCOUNTNAME`.
54
+
* `-p` is optional. If specified, the key isn't encrypted and is stored in the core-site.xml file as plain text.
101
55
102
-
## Known issues
103
-
104
-
### Storage firewall
105
-
106
-
If you choose to secure your storage account with the **Firewalls and virtual networks** restrictions on **Selected networks**, be sure to enable the exception **Allow trusted Microsoft services...** so that HDInsight can access your storage account.
56
+
## Verification
107
57
108
-
### Storage accounts not displayed in Azure portal or tools
58
+
When viewing the HDInsight cluster in the Azure portal, selecting the __Storage Accounts__ entry under __Properties__ doesn't display storage accounts added through this script action. Azure PowerShell and Azure CLI don't display the additional storage account either. The storage information isn't displayed because the script only modifies the `core-site.xml` configuration for the cluster. This information isn't used when retrieving the cluster information using Azure management APIs.
109
59
110
-
When viewing the HDInsight cluster in the Azure portal, selecting the __Storage Accounts__ entry under __Properties__ doesn't display storage accounts added through this script action. Azure PowerShell and Azure CLI don't display the additional storage account either.
60
+
To verify the additional storage, use one of the methods shown below:
111
61
112
-
The storage information isn't displayed because the script only modifies the core-site.xml configuration for the cluster. This information isn't used when retrieving the cluster information using Azure management APIs.
62
+
### PowerShell
113
63
114
-
To view storage account information added to the cluster using this script, use the Ambari REST API. Use the following commands to retrieve this information for your cluster:
115
-
116
-
### PowerShell
117
-
118
-
Replace `CLUSTERNAME` with the properly cased cluster name. Replace `ACCOUNTNAME` with the actual names. When prompted, enter the cluster login password.
64
+
The script returns the names of the storage accounts associated with the given cluster. Replace `CLUSTERNAME` with the actual cluster name, and then run the script.
$value = ($respObj.items.configurations | Where type -EQ "core-site").properties | Get-Member -membertype properties | Where Name -Like "fs.azure.account.key.*"
87
+
foreach ($name in $value ) { $name.Name.Split(".")[4]}
138
88
```
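The name extraction in the last two lines of the script can also be sketched outside PowerShell. The following Python sketch is an illustration only. It assumes the Ambari response has already been parsed into a dictionary with the `items` / `configurations` / `properties` shape that the PowerShell above reads, and that each key property is named `fs.azure.account.key.<account>.blob.core.windows.net`. The function name `storage_account_names` and the sample data are hypothetical.

```python
def storage_account_names(response):
    """Return storage account names found in the core-site properties."""
    prefix = "fs.azure.account.key."
    names = []
    for item in response.get("items", []):
        for config in item.get("configurations", []):
            # Only the core-site configuration holds the storage keys.
            if config.get("type") != "core-site":
                continue
            for prop in config.get("properties", {}):
                if prop.startswith(prefix):
                    # Same index the PowerShell uses: Split(".")[4]
                    names.append(prop.split(".")[4])
    return names


# Hypothetical sample in the assumed response shape.
sample = {
    "items": [{
        "configurations": [
            {"type": "hdfs-site", "properties": {"dfs.replication": "3"}},
            {"type": "core-site", "properties": {
                "fs.azure.account.key.mystorage.blob.core.windows.net": "ENCRYPTED_KEY",
                "fs.azure.account.keyprovider.mystorage.blob.core.windows.net": "PROVIDER",
            }},
        ]
    }]
}

print(storage_account_names(sample))
```

The trailing dot in the prefix matters: it keeps `keyprovider` entries from being counted as a second account, which mirrors the `-Like "fs.azure.account.key.*"` filter.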
139
89
140
-
### bash
90
+
### Apache Ambari
141
91
142
-
Replace `CLUSTERNAME` with the properly cased cluster name. Replace `PASSWORD` with the cluster admin password. Replace `STORAGEACCOUNT` with the actual storage account name.
92
+
1. From a web browser, navigate to `https://CLUSTERNAME.azurehdinsight.net`, where `CLUSTERNAME` is the name of your cluster.
159
103
160
-
Replace `CLUSTERNAME` with the properly cased cluster name in both scripts. First identify the service config version in use by entering the command below:
This text is an example of an encrypted key, which is used to access the storage account.
116
+
If you choose to secure your storage account with the **Firewalls and virtual networks** restrictions on **Selected networks**, be sure to enable the exception **Allow trusted Microsoft services...** so that HDInsight can access your storage account.
181
117
182
118
### Unable to access storage after changing key
183
119
184
120
If you change the key for a storage account, HDInsight can no longer access the storage account. HDInsight uses a cached copy of the key in the core-site.xml file for the cluster. This cached copy must be updated to match the new key.
185
121
186
122
Running the script action again does __not__ update the key, as the script checks to see if an entry for the storage account already exists. If an entry already exists, it doesn't make any changes.
187
123
188
-
To work around this problem, you must remove the existing entry for the storage account. Use the following steps to remove the existing entry:
124
+
To work around this problem:
125
+
1. Remove the storage account.
126
+
1. Add the storage account.
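A toy model can illustrate why rerunning the script leaves the old key in place and why removing and then adding the account works. This Python sketch is not the script action itself: it models core-site.xml as a dictionary, and the helper names `add_storage_account` and `remove_storage_account` are hypothetical. It assumes only the behavior described above (an existing entry causes the add step to exit without changes) and the public-cloud property suffix `.blob.core.windows.net`.

```python
KEY_PREFIX = "fs.azure.account.key."
SUFFIX = ".blob.core.windows.net"  # assumption: public Azure endpoint suffix


def add_storage_account(core_site, account, key):
    """Add the account key unless an entry already exists; return True if added."""
    prop = KEY_PREFIX + account + SUFFIX
    if prop in core_site:
        # Mirrors the documented behavior: an existing entry means no changes.
        return False
    core_site[prop] = key
    return True


def remove_storage_account(core_site, account):
    """Delete every fs.azure.account.* entry that refers to the account."""
    marker = "." + account + SUFFIX
    stale = [p for p in core_site
             if p.startswith("fs.azure.account.") and p.endswith(marker)]
    for prop in stale:
        del core_site[prop]


core_site = {}
prop_name = KEY_PREFIX + "mystorage" + SUFFIX

add_storage_account(core_site, "mystorage", "old-key")

# Rerunning with the rotated key changes nothing: the old key stays cached.
rerun_added = add_storage_account(core_site, "mystorage", "new-key")
key_after_rerun = core_site[prop_name]

# Workaround: remove the entry first, then add the account again.
remove_storage_account(core_site, "mystorage")
readd_added = add_storage_account(core_site, "mystorage", "new-key")
key_after_workaround = core_site[prop_name]

print(rerun_added, key_after_rerun)
print(readd_added, key_after_workaround)
```

On a real cluster the stored value is an encrypted key unless `-p` was specified, so the plain strings here stand in for whatever the script writes.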
189
127
190
128
> [!IMPORTANT]
191
129
> Rotating the storage key for the primary storage account attached to a cluster is not supported.
192
130
193
-
1. In a web browser, open the Ambari Web UI for your HDInsight cluster. The URI is `https://CLUSTERNAME.azurehdinsight.net`. Replace `CLUSTERNAME` with the name of your cluster.
194
-
195
-
When prompted, enter the HTTP login user and password for your cluster.
196
-
197
-
2. From the list of services on the left of the page, select __HDFS__. Then select the __Configs__ tab in the center of the page.
198
-
199
-
3. In the __Filter...__ field, enter a value of __fs.azure.account__. This returns entries for any additional storage accounts that have been added to the cluster. There are two types of entries: __keyprovider__ and __key__. Both contain the name of the storage account as part of the key name.
200
-
201
-
The following are example entries for a storage account named __mystorage__:
4. After you've identified the keys for the storage account you need to remove, use the red '-' icon to the right of the entry to delete it. Then use the __Save__ button to save your changes.
207
-
208
-
5. After changes have been saved, use the script action to add the storage account and new key value to the cluster.
209
-
210
131
### Poor performance
211
132
212
133
If the storage account is in a different region than the HDInsight cluster, you may experience poor performance. Accessing data in a different region sends network traffic outside the regional Azure data center and across the public internet, which can introduce latency.