Commit b3e9d02

mo committed
k8s: fallback to TF CPU images for prepare/train; remove GPU requests to ensure scheduling; keep path fixes and blinker guard.
1 parent f23a72e commit b3e9d02

2 files changed (+2 -4 lines changed)

k8s-prepare-job.yaml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ spec:
     spec:
       containers:
       - name: cerebros-prepare
-        image: cerebros-runner:pi
+        image: tensorflow/tensorflow:2.19.0
         imagePullPolicy: IfNotPresent
         command: ["/bin/sh"]
         args:

k8s-train-job.yaml

Lines changed: 1 addition & 3 deletions
@@ -11,7 +11,7 @@ spec:
     spec:
       containers:
       - name: cerebros-train
-        image: cerebros-runner:pi
+        image: tensorflow/tensorflow:2.19.0
         imagePullPolicy: IfNotPresent
         command: ["/bin/sh"]
         args:
@@ -44,11 +44,9 @@ spec:
           requests:
             memory: "6Gi"
             cpu: "1500m"
-            nvidia.com/gpu: 1
           limits:
             memory: "12Gi"
             cpu: "3000m"
-            nvidia.com/gpu: 1
       restartPolicy: Never
       volumes:
       - name: data-storage
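
Note on the GPU removal: a pod that requests `nvidia.com/gpu` can only be scheduled onto a node that advertises that extended resource, so on a CPU-only cluster the train job would stay Pending. The sketch below is not part of this commit; it is a minimal, hypothetical check (the helper name `gpu_resource_entries` and the hand-written dicts are assumptions for illustration) showing how a container spec could be scanned for leftover GPU entries before applying the manifest.

```python
# Hypothetical helper, not part of this commit: report any GPU entries
# that a container spec still carries under resources.requests/limits.
GPU_RESOURCE = "nvidia.com/gpu"


def gpu_resource_entries(container):
    """Return (section, value) pairs where the container asks for a GPU."""
    resources = container.get("resources", {})
    found = []
    for section in ("requests", "limits"):
        value = resources.get(section, {}).get(GPU_RESOURCE)
        if value is not None:
            found.append((section, value))
    return found


# Container spec mirroring k8s-train-job.yaml after this commit.
train_container = {
    "name": "cerebros-train",
    "image": "tensorflow/tensorflow:2.19.0",
    "resources": {
        "requests": {"memory": "6Gi", "cpu": "1500m"},
        "limits": {"memory": "12Gi", "cpu": "3000m"},
    },
}

# The same container as it looked before the commit, for comparison.
old_train_container = {
    "name": "cerebros-train",
    "image": "cerebros-runner:pi",
    "resources": {
        "requests": {"memory": "6Gi", "cpu": "1500m", GPU_RESOURCE: 1},
        "limits": {"memory": "12Gi", "cpu": "3000m", GPU_RESOURCE: 1},
    },
}

if __name__ == "__main__":
    print("after commit:", gpu_resource_entries(train_container))
    print("before commit:", gpu_resource_entries(old_train_container))
```

In practice the dict would come from parsing the manifest with a YAML loader; the hand-written dicts here only keep the sketch free of third-party dependencies.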
