
Commit 3bd6ce7

Merge pull request #99076 from hrasheed-msft/hdi_vmss
Hdi vmss
2 parents 672a41e + 6502bce commit 3bd6ce7

17 files changed: 68 additions, 36 deletions

articles/hdinsight/domain-joined/hdinsight-use-oozie-domain-joined-clusters.md

Lines changed: 19 additions & 1 deletion
@@ -81,7 +81,7 @@ Oozie workflow definitions are written in Apache Hadoop Process Definition Langu
  <credential name="metastore_token" type="hcat">
  <property>
  <name>hcat.metastore.uri</name>
- <value>thrift://hn0-<clustername>.<Domain>.com:9083</value>
+ <value>thrift://<active-headnode-name>-<clustername>.<Domain>.com:9083</value>
  </property>
  <property>
  <name>hcat.metastore.principal</name>
@@ -209,6 +209,23 @@ Oozie workflow definitions are written in Apache Hadoop Process Definition Langu

  2. After the nano editor opens, use the following XML as the contents of the file:

+ <<<<<<< HEAD
+ ```bash
+ nameNode=adl://home
+ jobTracker=headnodehost:8050
+ queueName=default
+ examplesRoot=examples
+ oozie.wf.application.path=${nameNode}/user/[domainuser]/examples/apps/map-reduce/workflow.xml
+ hiveScript1=${nameNode}/user/${user.name}/countrowshive1.hql
+ hiveScript2=${nameNode}/user/${user.name}/countrowshive2.hql
+ oozie.use.system.libpath=true
+ user.name=[domainuser]
+ jdbcPrincipal=hive/<active-headnode-name>.<Domain>.com@<Domain>.COM
+ jdbcURL=[jdbcurlvalue]
+ hiveOutputDirectory1=${nameNode}/user/${user.name}/hiveresult1
+ hiveOutputDirectory2=${nameNode}/user/${user.name}/hiveresult2
+ ```
+ =======
  ```bash
  nameNode=adl://home
  jobTracker=headnodehost:8050
@@ -224,6 +241,7 @@ Oozie workflow definitions are written in Apache Hadoop Process Definition Langu
  hiveOutputDirectory1=${nameNode}/user/${user.name}/hiveresult1
  hiveOutputDirectory2=${nameNode}/user/${user.name}/hiveresult2
  ```
+ >>>>>>> 0650d78429b6d1b43cddf90fc713eb4050d71eef

  - Use the `adl://home` URI for the `nameNode` property if you have Azure Data Lake Storage Gen1 as your primary cluster storage. If you're using Azure Blob Storage, then change this to `wasb://home`. If you're using Azure Data Lake Storage Gen2, then change this to `abfs://home`.
  - Replace `domainuser` with your username for the domain.
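The change above swaps the hard-coded `hn0-` prefix for an `<active-headnode-name>` placeholder. A minimal sketch for finding the value to substitute, assuming the Ambari admin account and the `jq` utility are available (the cluster name and password prompt below are placeholders):

```bash
# Sketch: list the FQDNs Ambari tracks for this cluster, then pick the
# head node entry to substitute for <active-headnode-name> in job.properties.
CLUSTERNAME=mycluster          # placeholder cluster name
read -s -p "Ambari admin password: " PASSWORD && echo

curl -sS -u admin:"$PASSWORD" \
  "https://$CLUSTERNAME.azurehdinsight.net/api/v1/clusters/$CLUSTERNAME/hosts" \
  | jq -r '.items[].Hosts.host_name'
```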

articles/hdinsight/hadoop/apache-ambari-troubleshoot-fivezerotwo-error.md

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ service ambari-server start
  In some scenarios, your headnode runs out of memory, and the Linux oom-killer starts to pick processes to kill. You can verify this situation by searching the AmbariServer process ID, which should not be found. Then look at your `/var/log/syslog`, and look for something like this:

  ```
- Jul 27 15:29:30 hn0-xxxxxx kernel: [874192.703153] java invoked oom-killer: gfp_mask=0x23201ca, order=0, oom_score_adj=0
+ Jul 27 15:29:30 xxx-xxxxxx kernel: [874192.703153] java invoked oom-killer: gfp_mask=0x23201ca, order=0, oom_score_adj=0
  ```

  Then identify which processes are taking memories and try to further root cause.
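A minimal check along the lines described in this hunk, assuming an SSH session on the affected head node:

```bash
# Confirm the Ambari server process is gone, then look for oom-killer activity.
ps -ef | grep -i [a]mbari-server        # no output means the process is not running
grep -iE "oom-killer|killed process" /var/log/syslog | tail -n 20
sudo service ambari-server start        # restart Ambari once memory pressure is resolved
```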

articles/hdinsight/hadoop/hdinsight-hdfs-troubleshoot-safe-mode.md

Lines changed: 3 additions & 3 deletions
@@ -18,8 +18,8 @@ This article describes troubleshooting steps and possible resolutions for issues
  The local Apache Hadoop Distributed File System (HDFS) is stuck in safe mode on the HDInsight cluster. You receive an error message similar as follows:

  ```output
- hdiuser@hn0-spark2:~$ hdfs dfs -D "fs.default.name=hdfs://mycluster/" -mkdir /temp
- 17/04/05 16:20:52 WARN retry.RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.mkdirs over hn0-spark2.2oyzcdm4sfjuzjmj5dnmvscjpg.dx.internal.cloudapp.net/10.0.0.22:8020. Not retrying because try once and fail.
+ hdiuser@spark2:~$ hdfs dfs -D "fs.default.name=hdfs://mycluster/" -mkdir /temp
+ 17/04/05 16:20:52 WARN retry.RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.mkdirs over spark2.2oyzcdm4sfjuzjmj5dnmvscjpg.dx.internal.cloudapp.net/10.0.0.22:8020. Not retrying because try once and fail.
  org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /temp. Name node is in safe mode.
  It was turned on manually. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1359)
@@ -42,7 +42,7 @@ The HDInsight cluster has been scaled down to very few nodes below, or number of
  1. Check on the integrity of HDFS on the HDInsight cluster with the following command:

  ```bash
- hdiuser@hn0-spark2:~$ hdfs fsck -D "fs.default.name=hdfs://mycluster/" /
+ hdiuser@spark2:~$ hdfs fsck -D "fs.default.name=hdfs://mycluster/" /
  ```

  1. If determined there are no missing, corrupt or under replicated blocks or those blocks can be ignored run the following command to take the name node out of safe mode:
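The step that last context line refers to looks roughly like the following sketch, assuming an SSH session on a head node and that `fsck` reported nothing that still needs recovery:

```bash
# Check the current safe mode state, then take the name node out of safe mode.
hdfs dfsadmin -D "fs.default.name=hdfs://mycluster/" -safemode get
hdfs dfsadmin -D "fs.default.name=hdfs://mycluster/" -safemode leave
```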

articles/hdinsight/hdinsight-hadoop-customize-cluster-linux.md

Lines changed: 3 additions & 3 deletions
@@ -443,11 +443,11 @@ If cluster creation fails because of a script error, the logs are kept in the cl

  Under this directory, the logs are organized separately for **headnode**, **worker node**, and **zookeeper node**. See the following examples:

- * **Headnode**: `<uniqueidentifier>AmbariDb-hn0-<generated_value>.cloudapp.net`
+ * **Headnode**: `<ACTIVE-HEADNODE-NAME>.cloudapp.net`

- * **Worker node**: `<uniqueidentifier>AmbariDb-wn0-<generated_value>.cloudapp.net`
+ * **Worker node**: `<ACTIVE-WORKERNODE-NAME>.cloudapp.net`

- * **Zookeeper node**: `<uniqueidentifier>AmbariDb-zk0-<generated_value>.cloudapp.net`
+ * **Zookeeper node**: `<ACTIVE-ZOOKEEPERNODE-NAME>.cloudapp.net`

  * All **stdout** and **stderr** of the corresponding host is uploaded to the storage account. There's one **output-\*.txt** and **errors-\*.txt** for each script action. The **output-*.txt** file contains information about the URI of the script that was run on the host. The following text is an example of this information:
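One hypothetical way to pull those per-node **output-\*.txt** and **errors-\*.txt** blobs out of the cluster's default storage account is sketched below; the account name, container name, and Azure CLI usage are assumptions, not part of the documented steps:

```bash
# List the script-action output and error logs in the default storage container.
az storage blob list \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --query "[?contains(name, 'output-') || contains(name, 'errors-')].name" \
  --output tsv
```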

articles/hdinsight/hdinsight-hadoop-hue-linux.md

Lines changed: 1 addition & 1 deletion
@@ -67,7 +67,7 @@ SSH Tunneling is the only way to access Hue on the cluster once it is running. T

  This will return a name similar to the following:

- hn0-myhdi-nfebtpfdv1nubcidphpap2eq2b.ex.internal.cloudapp.net
+ myhdi-nfebtpfdv1nubcidphpap2eq2b.ex.internal.cloudapp.net

  This is the hostname of the primary headnode where the Hue website is located.
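A minimal sketch of the tunneling workflow this hunk describes, with `sshuser`, `CLUSTERNAME`, and port 9876 as placeholder values:

```bash
# 1. Open a dynamic (SOCKS) tunnel to the cluster's SSH endpoint.
ssh -C2qTnNf -D 9876 sshuser@CLUSTERNAME-ssh.azurehdinsight.net
# 2. In an SSH session on the cluster, print the primary head node's FQDN;
#    with the browser routed through the tunnel, Hue should be reachable at
#    http://<that-fqdn>:8888 (port assumed from the default Hue setup).
hostname -f
```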

articles/hdinsight/hdinsight-hadoop-linux-use-ssh-unix.md

Lines changed: 1 addition & 1 deletion
@@ -156,7 +156,7 @@ The head nodes and edge node (if there is one) can be accessed over the internet
  > [!IMPORTANT]
  > The previous examples assume that you are using password authentication, or that certificate authentication is occurring automatically. If you use an SSH key-pair for authentication, and the certificate is not used automatically, use the `-i` parameter to specify the private key. For example, `ssh -i ~/.ssh/mykey [email protected]`.

- Once connected, the prompt changes to indicate the SSH user name and the node you're connected to. For example, when connected to the primary head node as `sshuser`, the prompt is `sshuser@hn0-clustername:~$`.
+ Once connected, the prompt changes to indicate the SSH user name and the node you're connected to. For example, when connected to the primary head node as `sshuser`, the prompt is `sshuser@<active-headnode-name>:~$`.

  ### Connect to worker and Apache Zookeeper nodes
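Worker and Zookeeper nodes are not directly exposed to the internet, so a common pattern (sketched here with assumed placeholder names) is to connect to a head node first and hop from there:

```bash
# Connect to a head node, then SSH onward to an internal node whose FQDN
# was looked up through Ambari (placeholder names throughout).
ssh -i ~/.ssh/mykey sshuser@CLUSTERNAME-ssh.azurehdinsight.net
ssh sshuser@<worker-node-fqdn>    # run from within the head node session
```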

articles/hdinsight/hdinsight-high-availability-linux.md

Lines changed: 4 additions & 4 deletions
@@ -28,7 +28,7 @@ Nodes in an HDInsight cluster are implemented using Azure Virtual Machines. The

  To ensure high availability of Hadoop services, HDInsight provides two head nodes. Both head nodes are active and running within the HDInsight cluster simultaneously. Some services, such as Apache HDFS or Apache Hadoop YARN, are only 'active' on one head node at any given time. Other services such as HiveServer2 or Hive MetaStore are active on both head nodes at the same time.

- Head nodes (and other nodes in HDInsight) have a numeric value as part of the hostname of the node. For example, `hn0-CLUSTERNAME` or `hn4-CLUSTERNAME`.
+ To obtain the hostnames for different node types in your cluster, please use the [Ambari REST API](hdinsight-hadoop-manage-ambari-rest-api.md#example-get-the-fqdn-of-cluster-nodes).

  > [!IMPORTANT]
  > Do not associate the numeric value with whether a node is primary or secondary. The numeric value is only present to provide a unique name for each node.
@@ -83,7 +83,7 @@ curl -u admin:$password "https://$clusterName.azurehdinsight.net/api/v1/clusters
  This command returns a value similar to the following, which contains the internal URL to use with the `oozie` command:

  ```output
- "oozie.base.url": "http://hn0-CLUSTERNAME-randomcharacters.cx.internal.cloudapp.net:11000/oozie"
+ "oozie.base.url": "http://<ACTIVE-HEADNODE-NAME>cx.internal.cloudapp.net:11000/oozie"
  ```

  For more information on working with the Ambari REST API, see [Monitor and Manage HDInsight using the Apache Ambari REST API](hdinsight-hadoop-manage-ambari-rest-api.md).
@@ -189,7 +189,7 @@ The response is similar to the following JSON:

  ```json
  {
- "href" : "http://hn0-CLUSTERNAME.randomcharacters.cx.internal.cloudapp.net:8080/api/v1/clusters/mycluster/services/HDFS?fields=ServiceInfo/state",
+ "href" : "http://mycluster.wutj3h4ic1zejluqhxzvckxq0g.cx.internal.cloudapp.net:8080/api/v1/clusters/mycluster/services/HDFS?fields=ServiceInfo/state",
  "ServiceInfo" : {
  "cluster_name" : "mycluster",
  "service_name" : "HDFS",
@@ -198,7 +198,7 @@ The response is similar to the following JSON:
  }
  ```

- The URL tells us that the service is currently running on a head node named **hn0-CLUSTERNAME**.
+ The URL tells us that the service is currently running on a head node named **mycluster.wutj3h4ic1zejluqhxzvckxq0g**.

  The state tells us that the service is currently running, or **STARTED**.
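The JSON above is the response to the Ambari services query referenced earlier in the article; a sketch of that request, with `$clusterName` and `$password` assumed to hold the cluster name and Ambari admin password:

```bash
# Ask Ambari which head node currently hosts the HDFS service and
# whether the service is STARTED.
curl -u admin:$password \
  "https://$clusterName.azurehdinsight.net/api/v1/clusters/$clusterName/services/HDFS?fields=ServiceInfo/state"
```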

articles/hdinsight/hdinsight-operationalize-data-pipeline.md

Lines changed: 4 additions & 0 deletions
@@ -537,7 +537,11 @@ As with workflows, the configuration of a coordinator is defined in a `job.prope

  ```
  nameNode=wasbs://[CONTAINERNAME]@[ACCOUNTNAME].blob.core.windows.net
+ <<<<<<< HEAD
+ jobTracker=[ACTIVE-HEADNODE-NAME].[UNIQUESTRING].dx.internal.cloudapp.net:8050
+ =======
  jobTracker=[ACTIVERESOURCEMANAGER]:8050
+ >>>>>>> 50a435bd9528fcbaac7bc5ab96745734e63167da
  queueName=default
  oozie.use.system.libpath=true
  appBase=wasbs://[CONTAINERNAME]@[ACCOUNTNAME].blob.core.windows.net/oozie
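Once a `job.properties` like the one above is in place, the coordinator is submitted with the Oozie CLI; this is only a sketch run from a head node, and the endpoint placeholder should be replaced with the `oozie.base.url` value retrieved from Ambari:

```bash
# Submit the coordinator job using the properties file defined above.
oozie job -oozie http://<active-headnode-name>:11000/oozie \
  -config job.properties -run
```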

articles/hdinsight/hdinsight-scaling-best-practices.md

Lines changed: 3 additions & 3 deletions
@@ -142,10 +142,10 @@ org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create director
  ```

  ```
- org.apache.http.conn.HttpHostConnectException: Connect to hn0-clustername.servername.internal.cloudapp.net:10001 [hn0-clustername.servername. internal.cloudapp.net/1.1.1.1] failed: Connection refused
+ org.apache.http.conn.HttpHostConnectException: Connect to active-headnode-name.servername.internal.cloudapp.net:10001 [active-headnode-name.servername. internal.cloudapp.net/1.1.1.1] failed: Connection refused
  ```

- You can review the name node logs from the `/var/log/hadoop/hdfs/` folder, near the time when the cluster was scaled, to see when it entered safe mode. The log files are named `Hadoop-hdfs-namenode-hn0-clustername.*`.
+ You can review the name node logs from the `/var/log/hadoop/hdfs/` folder, near the time when the cluster was scaled, to see when it entered safe mode. The log files are named `Hadoop-hdfs-namenode-<active-headnode-name>.*`.

  The root cause of the previous errors is that Hive depends on temporary files in HDFS while running queries. When HDFS enters safe mode, Hive cannot run queries because it cannot write to HDFS. The temp files in HDFS are located in the local drive mounted to the individual worker node VMs, and replicated amongst other worker nodes at three replicas, minimum.

@@ -189,7 +189,7 @@ If Hive has left behind temporary files, then you can manually clean up those fi
  Here is a sample output when files exist:

  ```output
- sshuser@hn0-scalin:~$ hadoop fs -ls -R hdfs://mycluster/tmp/hive/hive
+ sshuser@scalin:~$ hadoop fs -ls -R hdfs://mycluster/tmp/hive/hive
  drwx------ - hive hdfs 0 2017-07-06 13:40 hdfs://mycluster/tmp/hive/hive/4f3f4253-e6d0-42ac-88bc-90f0ea03602c
  drwx------ - hive hdfs 0 2017-07-06 13:40 hdfs://mycluster/tmp/hive/hive/4f3f4253-e6d0-42ac-88bc-90f0ea03602c/_tmp_space.db
  -rw-r--r-- 3 hive hdfs 27 2017-07-06 13:40 hdfs://mycluster/tmp/hive/hive/4f3f4253-e6d0-42ac-88bc-90f0ea03602c/inuse.info
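If those files belong to Hive sessions that are no longer active, the manual clean-up this hunk refers to can be sketched as follows; note that `-skipTrash` deletes permanently:

```bash
# Remove stale Hive scratch files left behind after scaling.
hadoop fs -rm -r -skipTrash hdfs://mycluster/tmp/hive/
```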

articles/hdinsight/hdinsight-sync-aad-users-to-cluster.md

Lines changed: 2 additions & 2 deletions
@@ -60,7 +60,7 @@ The following method uses POST with the Ambari REST API. For more information, s
  {
  "resources" : [
  {
- "href" : "http://hn0-hadoop.<YOUR DOMAIN>.com:8080/api/v1/ldap_sync_events/1",
+ "href" : "http://<ACTIVE-HEADNODE-NAME>.<YOUR DOMAIN>.com:8080/api/v1/ldap_sync_events/1",
  "Event" : {
  "id" : 1
  }
@@ -79,7 +79,7 @@ The following method uses POST with the Ambari REST API. For more information, s

  ```json
  {
- "href" : "http://hn0-hadoop.YOURDOMAIN.com:8080/api/v1/ldap_sync_events/1",
+ "href" : "http://<ACTIVE-HEADNODE-NAME>.YOURDOMAIN.com:8080/api/v1/ldap_sync_events/1",
  "Event" : {
  "id" : 1,
  "specs" : [