Commit 1c3324b

Revert CUDA upgrade for AWS EFA tests
libfabric hangs and the tests fail on timeout when CUDA 13 images are used.

Signed-off-by: Alexey Rivkin <[email protected]>
1 parent ddfdc9a commit 1c3324b

3 files changed, 4 additions and 4 deletions

contrib/aws-efa/README.md

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ The AWS test script:

 ## Container Image

-The script uses the container image: `nvcr.io/nvidia/cuda-dl-base:25.10-cuda13.0-devel-ubuntu24.04`
+The script uses the container image: `nvcr.io/nvidia/cuda-dl-base:25.06-cuda12.9-devel-ubuntu24.04`
 You can override this by setting the `CONTAINER_IMAGE` environment variable:

 ```bash

contrib/aws-efa/aws_job_def.json

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "imagePullSecrets": [],
 "containers": [
 {
-"image": "nvcr.io/nvidia/cuda-dl-base:25.10-cuda13.0-devel-ubuntu24.04",
+"image": "nvcr.io/nvidia/cuda-dl-base:25.06-cuda12.9-devel-ubuntu24.04",
 "command": [
 "/bin/bash",
 "-c",

contrib/aws-efa/aws_test.sh

Lines changed: 2 additions & 2 deletions
@@ -30,7 +30,7 @@ usage() {
 echo " GITHUB_REPOSITORY - GitHub repository (e.g., \"ai-dynamo/nixl\")"
 echo ""
 echo "Optional environment variables:"
-echo " CONTAINER_IMAGE - Container image to use (default: nvcr.io/nvidia/cuda-dl-base:25.10-cuda13.0-devel-ubuntu24.04)"
+echo " CONTAINER_IMAGE - Container image to use (default: nvcr.io/nvidia/cuda-dl-base:25.06-cuda12.9-devel-ubuntu24.04)"
 echo " TEST_TIMEOUT - Timeout for test execution in minutes"
 exit 1
 }
@@ -47,7 +47,7 @@ if [ -z "$GITHUB_REF" ] || [ -z "$GITHUB_SERVER_URL" ] || [ -z "$GITHUB_REPOSITO
 fi

 test_cmd="$1"
-export CONTAINER_IMAGE=${CONTAINER_IMAGE:-"nvcr.io/nvidia/cuda-dl-base:25.10-cuda13.0-devel-ubuntu24.04"}
+export CONTAINER_IMAGE=${CONTAINER_IMAGE:-"nvcr.io/nvidia/cuda-dl-base:25.06-cuda12.9-devel-ubuntu24.04"}

 # Set Git checkout command based on GITHUB_REF
 case "$GITHUB_REF" in
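
Note on the override: `aws_test.sh` sets the image with the `${CONTAINER_IMAGE:-default}` expansion, so the CUDA 12.9 default applies only when the caller has not provided an image. The sketch below isolates that behavior. `DEFAULT_IMAGE` and `resolve_image` are hypothetical names used for illustration and are not part of the script; the CUDA 13 tag is used only as an example override value.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of aws_test.sh: shows how the default is chosen.
DEFAULT_IMAGE="nvcr.io/nvidia/cuda-dl-base:25.06-cuda12.9-devel-ubuntu24.04"

resolve_image() {
    # ${VAR:-default} falls back to the default when VAR is unset OR empty.
    echo "${CONTAINER_IMAGE:-$DEFAULT_IMAGE}"
}

# No override set: prints the CUDA 12.9 default.
unset CONTAINER_IMAGE
resolve_image

# Explicit override: prints whatever the caller supplied.
CONTAINER_IMAGE="nvcr.io/nvidia/cuda-dl-base:25.10-cuda13.0-devel-ubuntu24.04"
resolve_image
```

Because the expansion also treats an empty value as unset, exporting `CONTAINER_IMAGE=""` would still select the default rather than pass an empty image name along.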
