
Commit e6abebe

update content according to ci 117042
1 parent 5a5d27a commit e6abebe


1 file changed

+15
-11
lines changed


articles/data-factory/data-factory-troubleshoot-guide.md

Lines changed: 15 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -632,15 +632,13 @@ The following table applies to Azure Batch.
632632

633633
- **Cause**: There was an internal error while trying to read the Service Principal or instantiating the MSI authentication.
634634

635-
- **Recommendation**: Please consider providing a Service Principal which has permissions to create an HDInsight cluster in the provided subscription and try again. In case if this is not an acceptable solution, contact ADF support team for further assistance.
635+
- **Recommendation**: Please consider providing a Service Principal which has permissions to create an HDInsight cluster in the provided subscription and try again. Make sure the [managed identities are set up correctly](https://docs.microsoft.com/azure/hdinsight/hdinsight-managed-identities). In case if this is not an acceptable solution, contact ADF support team for further assistance.
636636

637637

638638
### Error code: 2300
639639

640640
- **Message**: `Failed to submit the job '%jobId;' to the cluster '%cluster;'. Error: %errorMessage;.`
641641

642-
<br>
643-
644642
- **Cause**: When error message contains a message similar to 'The remote name could not be resolved.', this could mean the provided cluster URI is invalid.
645643

646644

@@ -651,27 +649,34 @@ The following table applies to Azure Batch.
651649

652650
- **Cause**: When error message contains a message similar to 'A task was canceled.', this means that the job submission timed out.
653651

654-
- **Recommendation**: The problem could be either general HDInsight connectivity or network connectivity. First confirm that the HDInsight Ambari UI is available from any browser. Confirm that your credentials are still valid. If you're using self-hosted integrated runtime (IR), make sure to do this from the VM or machine where the self-hosted IR is installed. Then try submitting the job from Data Factory again. If it still fails, contact the Data Factory team for support.
652+
- **Recommendation**: The problem could be either general HDInsight connectivity or network connectivity. First confirm that the HDInsight Ambari UI is available from any browser. Confirm that your credentials are still valid. For more information, read [Ambari Web UI](https://docs.microsoft.com/azure/hdinsight/hdinsight-hadoop-manage-ambari#ambari-web-ui). If you're using self-hosted integrated runtime (IR), make sure to do this from the VM or machine where the self-hosted IR is installed. Then try submitting the job from Data Factory again. If it still fails, contact the Data Factory team for support.
655653

656654
<br>
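The Ambari availability and credential check in the recommendation above can also be scripted from the machine that hosts the self-hosted IR. The following is a minimal sketch, not an official tool. It assumes the cluster is reachable at `https://<clustername>.azurehdinsight.net`, that the Ambari REST endpoint `/api/v1/clusters` accepts HTTP basic authentication with the cluster login, and that the helper names (`ambari_clusters_url`, `interpret_status`, `probe_ambari`) are hypothetical.

```python
import base64
import urllib.error
import urllib.request


def ambari_clusters_url(cluster_name: str) -> str:
    # Assumption: default public HDInsight endpoint naming.
    return f"https://{cluster_name}.azurehdinsight.net/api/v1/clusters"


def interpret_status(status: int) -> str:
    # Map an HTTP status code to the situations described in this guide.
    if status == 200:
        return "reachable, credentials accepted"
    if status in (401, 403):
        return "reachable, credentials rejected"
    if status == 502:
        return "gateway error, Ambari Server may be down"
    return f"unexpected HTTP status {status}"


def probe_ambari(cluster_name: str, user: str, password: str,
                 timeout: float = 30.0) -> str:
    # Build a basic-auth request against the Ambari REST API.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        ambari_clusters_url(cluster_name),
        headers={"Authorization": f"Basic {token}"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return interpret_status(response.status)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return interpret_status(err.code)
    except OSError as err:
        # DNS failure, refused connection, or timeout.
        return f"not reachable: {err}"
```

Calling `probe_ambari("<clustername>", "<user>", "<password>")` returns a short description; a "not reachable" result points to the network path rather than to the credentials.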
657655

658656
- **Cause**: When error message contains a message similar to 'User admin is locked out in Ambari' or 'Unauthorized: Ambari user name or password is incorrect', this means the credentials for HDInsight are incorrect or have expired.
659657

660-
- **Recommendation**: Correct the credentials and redeploy the linked service. First make sure the credentials work on HDInsight by opening the cluster URI on any browser and trying to sign in. If the credentials don't work, you can reset them from the Azure portal.
658+
- **Recommendation**: Correct the credentials and redeploy the linked service. First make sure the credentials work on HDInsight by opening the cluster URI on any browser and trying to sign in. If the credentials don't work, you can reset them from the Azure portal. For an ESP cluster, you can [reset the password through self-service password reset](https://docs.microsoft.com/azure/active-directory/user-help/active-directory-passwords-update-your-own-password).
661659

662660
<br>
663661

664662
- **Cause**: When error message contains a message similar to '502 - Web server received an invalid response while acting as a gateway or proxy server', this error is returned by HDInsight service.
665663

664+
- **Recommendation**: A 502 error usually means that your Ambari Server process was shut down. You can restart the Ambari services by rebooting the head node.
666665

667-
- **Recommendation**: Look through Azure HDInsight troubleshooting documentation, for example, https://hdinsight.github.io/ambari/ambari-ui-502-error.html, https://hdinsight.github.io/spark/spark-thriftserver-errors.html, https://docs.microsoft.com/azure/application-gateway/application-gateway-troubleshooting-502.
668-
666+
1. Connect to one of your nodes on HDInsight using SSH.
667+
2. Identify your active head node host by running `ping headnodehost`.
668+
3. Connect to your active head node using SSH, because Ambari Server runs on the active head node.
669+
4. Reboot the active head node.
669670

670-
<br>
671+
For more information, look through the Azure HDInsight troubleshooting documentation, for example:
672+
673+
- [Ambari UI 502 error](https://hdinsight.github.io/ambari/ambari-ui-502-error.html).
674+
- [Scenario: RpcTimeoutException for Apache Spark thrift server in Azure HDInsight](https://docs.microsoft.com/azure/hdinsight/spark/apache-spark-troubleshoot-rpctimeoutexception).
675+
- [Troubleshooting bad gateway errors in Application Gateway](https://docs.microsoft.com/azure/application-gateway/application-gateway-troubleshooting-502).
671676

672677
- **Cause**: When error message contains a message similar to 'Unable to service the submit job request as templeton service is busy with too many submit job requests' or 'Queue root.joblauncher already has 500 applications, cannot accept submission of application', this means that too many jobs are being submitted to HDInsight at the same time.
673678

674-
- **Recommendation**: Consider limiting the number of concurrent jobs submitted to HDInsight. Refer to Data Factory activity concurrency if the jobs are being submitted by the same activity. Change the triggers so the concurrent pipeline runs are spread out over time. Refer to HDInsight documentation to adjust templeton.parallellism.job.submit as the error suggests.
679+
- **Recommendation**: Consider limiting the number of concurrent jobs submitted to HDInsight. Refer to Data Factory activity concurrency if the jobs are being submitted by the same activity. Change the triggers so the concurrent pipeline runs are spread out over time. Refer to [HDInsight documentation](https://docs.microsoft.com/azure/hdinsight/hdinsight-hadoop-templeton-webhcat-debug-errors) to adjust `templeton.parallellism.job.submit` as the error suggests.
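The causes listed for error code 2300 in this diff are each keyed on a phrase in the returned error message, so the lookup can be sketched as a small substring match. This is an illustrative helper only: the function name and the short cause labels are hypothetical, and it covers just the phrases quoted in the lines shown here.

```python
# Phrases quoted in the guide for error code 2300, paired with a short
# label for the cause the guide gives. The labels are illustrative.
ERROR_2300_CAUSES = [
    ("The remote name could not be resolved", "invalid cluster URI"),
    ("A task was canceled", "job submission timed out"),
    ("User admin is locked out in Ambari", "incorrect or expired credentials"),
    ("Ambari user name or password is incorrect",
     "incorrect or expired credentials"),
    ("502 - Web server received an invalid response",
     "error returned by HDInsight service"),
    ("templeton service is busy with too many submit job requests",
     "too many concurrent job submissions"),
    ("Queue root.joblauncher already has",
     "too many concurrent job submissions"),
]


def classify_error_2300(error_message: str) -> str:
    """Return the first matching cause label, or 'unknown'."""
    lowered = error_message.lower()
    for phrase, cause in ERROR_2300_CAUSES:
        # The guide says "a message similar to", so match loosely:
        # case-insensitive substring rather than exact equality.
        if phrase.lower() in lowered:
            return cause
    return "unknown"
```

For example, passing the `%errorMessage;` part of a failed activity run to `classify_error_2300` gives a first hint about which recommendation to read; an "unknown" result means the message matches none of the phrases above.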
675680

676681

677682
### Error code: 2301
@@ -731,7 +736,6 @@ The following table applies to Azure Batch.
731736
- **Recommendation**: The error message should help to identify the issue. Please fix the json configuration and try again. Check https://docs.microsoft.com/azure/data-factory/compute-linked-services#azure-hdinsight-on-demand-linked-service for more information.
732737

733738

734-
735739
### Error code: 2310
736740

737741
- **Message**: `Failed to submit Spark job. Error: '%message;'`
@@ -906,7 +910,7 @@ The following table applies to Azure Batch.
906910

907911
- **Cause**: The storage linked service type is not supported by the activity.
908912

909-
- **Recommendation**: Please make sure the selected linked service has one of the supported types for the activity. HDI activities support AzureBlobStorage and AzureBlobFSStorage linked services.
913+
- **Recommendation**: Please make sure the selected linked service has one of the supported types for the activity. HDI activities support AzureBlobStorage and AzureBlobFSStorage linked services. For more information, read [Compare storage options for use with Azure HDInsight clusters](https://docs.microsoft.com/azure/hdinsight/hdinsight-hadoop-compare-storage-options).
910914

911915

912916
### Error code: 2355
