Skip to content

Conversation

cnauroth
Copy link
Contributor

@cnauroth cnauroth commented Aug 5, 2025

Description of PR

The NodeManager invokes an external binary (e.g. nvidia-smi) to discover attached GPUs. Right now, there is a hard-coded 10-second timeout on execution of this binary and a hard-coded max error count of 10, beyond which the NodeManager will stop attempting discovery. This change will provide new configuration properties to control both the timeout and the max errors, which is useful in environments where there may be a delay in binding the GPU to the host. Default values for the new configuration properties will be set so as to maintain the existing behavior.

Special thanks to @jjayadeep06 for co-authoring this patch.

How was this patch tested?

New unit test and manual testing.

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 24m 11s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 9m 20s Maven dependency ordering for branch
+1 💚 mvninstall 37m 58s trunk passed
+1 💚 compile 6m 12s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 5m 55s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 2m 13s trunk passed
+1 💚 mvnsite 3m 25s trunk passed
+1 💚 javadoc 3m 21s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 3m 17s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 5m 40s trunk passed
+1 💚 shadedclient 42m 46s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 32s Maven dependency ordering for patch
+1 💚 mvninstall 1m 43s the patch passed
+1 💚 compile 5m 20s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 5m 20s the patch passed
+1 💚 compile 5m 1s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 5m 1s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 52s /results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 164 unchanged - 0 fixed = 166 total (was 164)
+1 💚 mvnsite 2m 32s the patch passed
+1 💚 javadoc 2m 36s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 23s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 5m 45s the patch passed
+1 💚 shadedclient 41m 37s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 6s hadoop-yarn-api in the patch passed.
+1 💚 unit 5m 47s hadoop-yarn-common in the patch passed.
+1 💚 unit 27m 14s hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
254m 18s
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/1/artifact/out/Dockerfile
GITHUB PR #7857
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux b4cf637d6047 5.15.0-144-generic #157-Ubuntu SMP Mon Jun 16 07:33:10 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 8d04c84
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/1/testReport/
Max. process+thread count 536 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Closes apache#7857

Co-authored-by: Jayadeep Jayaraman <[email protected]>
Reviewed-by: Ashutosh Gupta <[email protected]>
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 24s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for branch
-1 ❌ mvninstall 0m 22s /branch-mvninstall-root.txt root in trunk failed.
-1 ❌ compile 1m 49s /branch-compile-hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04.txt hadoop-yarn in trunk failed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 56s /branch-compile-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09.txt hadoop-yarn in trunk failed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09.
+1 💚 checkstyle 4m 25s trunk passed
-1 ❌ mvnsite 0m 15s /branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt hadoop-yarn-common in trunk failed.
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 55s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
-1 ❌ spotbugs 0m 11s /branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt hadoop-yarn-common in trunk failed.
+1 💚 shadedclient 23m 59s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 2m 8s Maven dependency ordering for patch
-1 ❌ mvninstall 0m 9s /patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt hadoop-yarn-common in the patch failed.
-1 ❌ compile 0m 25s /patch-compile-hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04.txt hadoop-yarn in the patch failed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04.
-1 ❌ javac 0m 25s /patch-compile-hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04.txt hadoop-yarn in the patch failed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 23s /patch-compile-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09.txt hadoop-yarn in the patch failed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09.
-1 ❌ javac 0m 23s /patch-compile-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09.txt hadoop-yarn in the patch failed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 42s /results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 164 unchanged - 0 fixed = 166 total (was 164)
-1 ❌ mvnsite 0m 11s /patch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt hadoop-yarn-common in the patch failed.
+1 💚 javadoc 1m 9s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 8s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
-1 ❌ spotbugs 0m 9s /patch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt hadoop-yarn-common in the patch failed.
+1 💚 shadedclient 21m 9s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 34s hadoop-yarn-api in the patch passed.
-1 ❌ unit 0m 13s /patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt hadoop-yarn-common in the patch failed.
-1 ❌ unit 0m 13s /patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt hadoop-yarn-server-nodemanager in the patch failed.
+1 💚 asflicense 0m 25s The patch does not generate ASF License warnings.
71m 34s
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/2/artifact/out/Dockerfile
GITHUB PR #7857
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 9e3d228d4477 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 5db27ec
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/2/testReport/
Max. process+thread count 701 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 10m 10s Maven dependency ordering for branch
+1 💚 mvninstall 20m 30s trunk passed
+1 💚 compile 2m 55s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 2m 40s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 0m 59s trunk passed
+1 💚 mvnsite 1m 39s trunk passed
+1 💚 javadoc 1m 47s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 43s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 26s trunk passed
+1 💚 shadedclient 21m 30s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 21s Maven dependency ordering for patch
+1 💚 mvninstall 1m 6s the patch passed
+1 💚 compile 2m 41s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 2m 41s the patch passed
+1 💚 compile 2m 43s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 2m 43s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 54s /results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 164 unchanged - 0 fixed = 166 total (was 164)
+1 💚 mvnsite 1m 37s the patch passed
+1 💚 javadoc 1m 43s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 41s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 39s the patch passed
+1 💚 shadedclient 21m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 44s hadoop-yarn-api in the patch passed.
+1 💚 unit 4m 27s hadoop-yarn-common in the patch passed.
+1 💚 unit 24m 7s hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
137m 40s
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/4/artifact/out/Dockerfile
GITHUB PR #7857
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux f966db9c6ea7 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 5db27ec
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/4/testReport/
Max. process+thread count 555 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

final String msg = "Failed to execute GPU device detection script";

// The default max errors is 10. Verify that it keeps going for an 11th try.
for (int i = 0; i < 11; ++i) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this 11 to 15 & still the test doesn't fail for me, can you check once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is covering the case where you disable the max errors by setting a negative value. To make this clearer, I dialed it up to 20 attempts, and I also added another test that sets the configuration to 11 and confirms it tries exactly 11 times.

yarn.nodemanager.resource-plugins.gpu.discovery-max-errors.
</description>
<name>yarn.nodemanager.resource-plugins.gpu.discovery-timeout</name>
<value>10000ms</value>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason for not using 10s?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, updated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 9m 7s Maven dependency ordering for branch
+1 💚 mvninstall 37m 37s trunk passed
+1 💚 compile 5m 57s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 5m 7s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 2m 0s trunk passed
+1 💚 mvnsite 2m 51s trunk passed
+1 💚 javadoc 2m 55s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 44s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 5m 29s trunk passed
+1 💚 shadedclient 42m 12s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 59s Maven dependency ordering for patch
+1 💚 mvninstall 1m 44s the patch passed
+1 💚 compile 5m 25s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 5m 25s the patch passed
+1 💚 compile 5m 4s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 5m 4s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 55s /results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 164 unchanged - 0 fixed = 166 total (was 164)
+1 💚 mvnsite 2m 34s the patch passed
+1 💚 javadoc 2m 37s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 28s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 5m 47s the patch passed
+1 💚 shadedclient 42m 26s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 5s hadoop-yarn-api in the patch passed.
+1 💚 unit 5m 48s hadoop-yarn-common in the patch passed.
-1 ❌ unit 27m 14s /patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
226m 16s
Reason Tests
Failed junit tests hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.TestGpuDiscoverer
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/5/artifact/out/Dockerfile
GITHUB PR #7857
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux b2f5bc390399 5.15.0-144-generic #157-Ubuntu SMP Mon Jun 16 07:33:10 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 7608230
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/5/testReport/
Max. process+thread count 589 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 1s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 12m 12s Maven dependency ordering for branch
+1 💚 mvninstall 37m 58s trunk passed
+1 💚 compile 5m 59s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 5m 6s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 2m 6s trunk passed
+1 💚 mvnsite 2m 49s trunk passed
+1 💚 javadoc 2m 56s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 41s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 5m 29s trunk passed
+1 💚 shadedclient 42m 4s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 32s Maven dependency ordering for patch
+1 💚 mvninstall 1m 43s the patch passed
+1 💚 compile 5m 16s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 5m 16s the patch passed
+1 💚 compile 5m 0s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 5m 0s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 51s the patch passed
+1 💚 mvnsite 2m 28s the patch passed
+1 💚 javadoc 2m 37s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 26s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 5m 45s the patch passed
+1 💚 shadedclient 42m 42s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 6s hadoop-yarn-api in the patch passed.
+1 💚 unit 6m 4s hadoop-yarn-common in the patch passed.
-1 ❌ unit 28m 10s /patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 0m 57s The patch does not generate ASF License warnings.
229m 57s
Reason Tests
Failed junit tests hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.TestGpuDiscoverer
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/6/artifact/out/Dockerfile
GITHUB PR #7857
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux a3806f78278b 5.15.0-144-generic #157-Ubuntu SMP Mon Jun 16 07:33:10 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 42ccf81
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/6/testReport/
Max. process+thread count 579 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 9m 38s Maven dependency ordering for branch
+1 💚 mvninstall 19m 53s trunk passed
+1 💚 compile 2m 57s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 2m 44s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 1m 0s trunk passed
+1 💚 mvnsite 1m 53s trunk passed
+1 💚 javadoc 1m 53s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 47s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 26s trunk passed
+1 💚 shadedclient 21m 34s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 23s Maven dependency ordering for patch
+1 💚 mvninstall 1m 3s the patch passed
+1 💚 compile 2m 39s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 2m 39s the patch passed
+1 💚 compile 2m 37s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 2m 37s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 54s the patch passed
+1 💚 mvnsite 1m 33s the patch passed
+1 💚 javadoc 1m 32s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 42s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 36s the patch passed
+1 💚 shadedclient 21m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 41s hadoop-yarn-api in the patch passed.
+1 💚 unit 4m 45s hadoop-yarn-common in the patch passed.
-1 ❌ unit 24m 6s /patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
136m 50s
Reason Tests
Failed junit tests hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.TestGpuDiscoverer
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/7/artifact/out/Dockerfile
GITHUB PR #7857
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 10b8de8c0b98 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 5e299fd
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/7/testReport/
Max. process+thread count 546 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 50s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 9m 14s Maven dependency ordering for branch
+1 💚 mvninstall 38m 5s trunk passed
+1 💚 compile 5m 59s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 5m 7s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 2m 0s trunk passed
+1 💚 mvnsite 2m 50s trunk passed
+1 💚 javadoc 2m 57s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 40s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 5m 28s trunk passed
+1 💚 shadedclient 42m 19s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 32s Maven dependency ordering for patch
+1 💚 mvninstall 1m 43s the patch passed
+1 💚 compile 5m 18s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 5m 18s the patch passed
+1 💚 compile 5m 3s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 5m 3s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 53s the patch passed
+1 💚 mvnsite 2m 32s the patch passed
+1 💚 javadoc 2m 36s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 24s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 5m 43s the patch passed
+1 💚 shadedclient 41m 36s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 4s hadoop-yarn-api in the patch passed.
+1 💚 unit 5m 47s hadoop-yarn-common in the patch passed.
+1 💚 unit 27m 12s hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 0m 56s The patch does not generate ASF License warnings.
225m 6s
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/8/artifact/out/Dockerfile
GITHUB PR #7857
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 5f19b2e6d0ce 5.15.0-144-generic #157-Ubuntu SMP Mon Jun 16 07:33:10 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / acf8406
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/8/testReport/
Max. process+thread count 593 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7857/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@cnauroth
Copy link
Contributor Author

cnauroth commented Aug 9, 2025

@ayushtkn , thanks for the review! All comments addressed, and +1 from Yetus. How does this look now?

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cnauroth cnauroth closed this in 0f34922 Aug 11, 2025
cnauroth added a commit that referenced this pull request Aug 11, 2025
Closes #7857

Co-authored-by: Jayadeep Jayaraman <[email protected]>
Reviewed-by: Ashutosh Gupta <[email protected]>
Signed-off-by: Ayush Saxena <[email protected]>
(cherry picked from commit 0f34922)
cnauroth added a commit that referenced this pull request Aug 11, 2025
Closes #7857

Co-authored-by: Jayadeep Jayaraman <[email protected]>
Reviewed-by: Ashutosh Gupta <[email protected]>
Signed-off-by: Ayush Saxena <[email protected]>
(cherry picked from commit 0f34922)
(cherry picked from commit f2b69ba)
@cnauroth
Copy link
Contributor Author

I committed to trunk and merged to branch-3.4 and branch-3.3 with some minor conflicts resolved. @ayushtkn and @hotcodemacha , thank you for the reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants