Commit d356447

Merge pull request #112606 from TobyTu/zhanguo03
update content according to ci 117042
2 parents 5a5d27a + c071e45 commit d356447

1 file changed: +12 -10 lines changed

articles/data-factory/data-factory-troubleshoot-guide.md

Lines changed: 12 additions & 10 deletions
@@ -632,15 +632,13 @@ The following table applies to Azure Batch.
632632

633633
- **Cause**: There was an internal error while trying to read the Service Principal or instantiating the MSI authentication.
634634

635-
- **Recommendation**: Please consider providing a Service Principal that has permissions to create an HDInsight cluster in the provided subscription, and try again. If this is not an acceptable solution, contact the ADF support team for further assistance.
635+
- **Recommendation**: Please consider providing a Service Principal that has permissions to create an HDInsight cluster in the provided subscription, and try again. Make sure the [managed identities are set up correctly](https://docs.microsoft.com/azure/hdinsight/hdinsight-managed-identities). If this is not an acceptable solution, contact the ADF support team for further assistance.
636636

637637

638638
### Error code: 2300
639639

640640
- **Message**: `Failed to submit the job '%jobId;' to the cluster '%cluster;'. Error: %errorMessage;.`
641641

642-
<br>
643-
644642
- **Cause**: When error message contains a message similar to 'The remote name could not be resolved.', this could mean the provided cluster URI is invalid.
645643

646644

@@ -651,27 +649,32 @@ The following table applies to Azure Batch.
651649

652650
- **Cause**: When error message contains a message similar to 'A task was canceled.', this means that the job submission timed out.
653651

654-
- **Recommendation**: The problem could be either general HDInsight connectivity or network connectivity. First confirm that the HDInsight Ambari UI is available from any browser. Confirm that your credentials are still valid. If you're using a self-hosted integration runtime (IR), make sure to do this from the VM or machine where the self-hosted IR is installed. Then try submitting the job from Data Factory again. If it still fails, contact the Data Factory team for support.
652+
- **Recommendation**: The problem could be either general HDInsight connectivity or network connectivity. First confirm that the HDInsight Ambari UI is available from any browser. Confirm that your credentials are still valid. For more information, read [Ambari Web UI](https://docs.microsoft.com/azure/hdinsight/hdinsight-hadoop-manage-ambari#ambari-web-ui). If you're using a self-hosted integration runtime (IR), make sure to do this from the VM or machine where the self-hosted IR is installed. Then try submitting the job from Data Factory again. If it still fails, contact the Data Factory team for support.
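
The Ambari availability check described above can also be sketched from a shell, which is convenient on a self-hosted IR machine without a browser. This is a minimal sketch, not part of the commit: `mycluster` and the `admin` login are placeholder assumptions, and the `curl` call is left commented out because it needs network access and prompts for the password.

```shell
# Hypothetical placeholders: replace with your own cluster name and Ambari login.
CLUSTER_NAME="${CLUSTER_NAME:-mycluster}"
AMBARI_USER="${AMBARI_USER:-admin}"

# Build the Ambari REST URL for a cluster behind the HDInsight gateway.
ambari_cluster_url() {
  printf 'https://%s.azurehdinsight.net/api/v1/clusters/%s\n' "$1" "$1"
}

# Print the URL that the check would call.
ambari_cluster_url "$CLUSTER_NAME"

# Uncomment to run the check; curl prompts for the password and prints only
# the HTTP status code of the response.
# curl -sS -o /dev/null -w '%{http_code}\n' -u "$AMBARI_USER" \
#   "$(ambari_cluster_url "$CLUSTER_NAME")"
```

A successful status code suggests that both connectivity and credentials are fine; a timeout or an authentication failure points at the network or at the credentials, respectively.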
655653

656654
<br>
657655

658656
- **Cause**: When error message contains a message similar to 'User admin is locked out in Ambari' or 'Unauthorized: Ambari user name or password is incorrect', this means the credentials for HDInsight are incorrect or have expired.
659657

660-
- **Recommendation**: Correct the credentials and redeploy the linked service. First make sure the credentials work on HDInsight by opening the cluster URI on any browser and trying to sign in. If the credentials don't work, you can reset them from the Azure portal.
658+
- **Recommendation**: Correct the credentials and redeploy the linked service. First make sure the credentials work on HDInsight by opening the cluster URI on any browser and trying to sign in. If the credentials don't work, you can reset them from the Azure portal. For ESP cluster, you can [reset the password through self service password reset](https://docs.microsoft.com/azure/active-directory/user-help/active-directory-passwords-update-your-own-password).
661659

662660
<br>
663661

664662
- **Cause**: When error message contains a message similar to '502 - Web server received an invalid response while acting as a gateway or proxy server', this error is returned by HDInsight service.
665663

664+
- **Recommendation**: A 502 error most often means that the Ambari Server process was shut down. You can restart the Ambari services by rebooting the head node.
666665

667-
- **Recommendation**: Look through Azure HDInsight troubleshooting documentation, for example, https://hdinsight.github.io/ambari/ambari-ui-502-error.html, https://hdinsight.github.io/spark/spark-thriftserver-errors.html, https://docs.microsoft.com/azure/application-gateway/application-gateway-troubleshooting-502.
668-
666+
1. Connect to one of your nodes on HDInsight using SSH.
667+
2. Identify your active head node host by running `ping headnodehost`.
668+
3. Connect to your active head node using SSH, because Ambari Server runs on the active head node.
669+
4. Reboot the active head node.
670+
671+
For more information: Look through Azure HDInsight troubleshooting documentation, for example: [Ambari UI 502 error](https://hdinsight.github.io/ambari/ambari-ui-502-error.html), [RpcTimeoutException for Apache Spark thrift server](https://docs.microsoft.com/azure/hdinsight/spark/apache-spark-troubleshoot-rpctimeoutexception), [Troubleshooting bad gateway errors in Application Gateway](https://docs.microsoft.com/azure/application-gateway/application-gateway-troubleshooting-502).
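
The four reboot steps for the 502 case can be sketched as shell commands. This is a dry run under stated assumptions, not part of the commit: `mycluster` and `sshuser` are placeholders, the `<clustername>-ssh.azurehdinsight.net` endpoint is the standard HDInsight SSH address, and the function only prints the commands so that nothing is rebooted by accident.

```shell
# Hypothetical placeholders: replace with your own cluster name and SSH user.
CLUSTER_NAME="${CLUSTER_NAME:-mycluster}"
SSH_USER="${SSH_USER:-sshuser}"

# HDInsight exposes SSH at <clustername>-ssh.azurehdinsight.net.
ssh_endpoint() {
  printf '%s-ssh.azurehdinsight.net\n' "$1"
}

# Print the commands for steps 1 to 4 instead of running them (dry run).
print_reboot_steps() {
  echo "ssh $2@$(ssh_endpoint "$1")"  # step 1: connect to a cluster node
  echo "ping -c 1 headnodehost"       # step 2: shows the active head node
  echo "ssh $2@headnodehost"          # step 3: connect to the active head node
  echo "sudo reboot"                  # step 4: reboot the active head node
}

print_reboot_steps "$CLUSTER_NAME" "$SSH_USER"
```

Steps 2 to 4 are meant to be run inside the SSH session opened in step 1, since `headnodehost` resolves only from within the cluster.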
669672

670673
<br>
671674

672675
- **Cause**: When error message contains a message similar to 'Unable to service the submit job request as templeton service is busy with too many submit job requests' or 'Queue root.joblauncher already has 500 applications, cannot accept submission of application', this means that too many jobs are being submitted to HDInsight at the same time.
673676

674-
- **Recommendation**: Consider limiting the number of concurrent jobs submitted to HDInsight. Refer to Data Factory activity concurrency if the jobs are being submitted by the same activity. Change the triggers so the concurrent pipeline runs are spread out over time. Refer to HDInsight documentation to adjust templeton.parallellism.job.submit as the error suggests.
677+
- **Recommendation**: Consider limiting the number of concurrent jobs submitted to HDInsight. Refer to Data Factory activity concurrency if the jobs are being submitted by the same activity. Change the triggers so the concurrent pipeline runs are spread out over time. Refer to [HDInsight documentation](https://docs.microsoft.com/azure/hdinsight/hdinsight-hadoop-templeton-webhcat-debug-errors) to adjust `templeton.parallellism.job.submit` as the error suggests.
675678

676679

677680
### Error code: 2301
@@ -731,7 +734,6 @@ The following table applies to Azure Batch.
731734
- **Recommendation**: The error message should help to identify the issue. Please fix the json configuration and try again. Check https://docs.microsoft.com/azure/data-factory/compute-linked-services#azure-hdinsight-on-demand-linked-service for more information.
732735

733736

734-
735737
### Error code: 2310
736738

737739
- **Message**: `Failed to submit Spark job. Error: '%message;'`
@@ -906,7 +908,7 @@ The following table applies to Azure Batch.
906908

907909
- **Cause**: The storage linked service type is not supported by the activity.
908910

909-
- **Recommendation**: Please make sure the selected linked service has one of the supported types for the activity. HDI activities support AzureBlobStorage and AzureBlobFSStorage linked services.
911+
- **Recommendation**: Please make sure the selected linked service has one of the supported types for the activity. HDI activities support AzureBlobStorage and AzureBlobFSStorage linked services. For more information, read [Compare storage options for use with Azure HDInsight clusters](https://docs.microsoft.com/azure/hdinsight/hdinsight-hadoop-compare-storage-options).
910912

911913

912914
### Error code: 2355
