Problem Description
The Service manifest in jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml has hardcoded app: notebook labels in both metadata.labels and spec.selector, while the kustomization layer adds an additional app: jupyter-rocm-tensorflow-ubi9-python-3-12 label to both objects and their selectors (includeSelectors: true).
After kustomize render, the Service selector will demand both labels (app: notebook AND app: jupyter-rocm-tensorflow-ubi9-python-3-12), yet the pod template in the StatefulSet only receives the generated label (it has no app: notebook). This results in an empty Endpoints list and 503 errors when trying to access the notebook service.
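For context, a minimal sketch of the kind of kustomization that applies a label to both objects and selectors. This is not copied from the repository: the resource file names other than service.yaml are assumptions, and only the label value and includeSelectors: true come from the description above.

```yaml
# Hypothetical kustomization.yaml, for illustration only.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - service.yaml
  - statefulset.yaml   # assumed file name
labels:
  # includeSelectors: true makes kustomize write the pair into
  # selectors (Service spec.selector, StatefulSet spec.selector
  # and pod template labels) as well as into metadata.labels.
  - includeSelectors: true
    pairs:
      app: jupyter-rocm-tensorflow-ubi9-python-3-12
```

Running kustomize build on the base directory and comparing the Service's spec.selector with the StatefulSet's spec.template.metadata.labels is one way to confirm whether the two actually diverge.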
Expected Behavior
The Service should be able to route traffic to the notebook pods successfully.
Proposed Solution
Update the service's metadata.labels and spec.selector to rely on kustomize to inject the canonical app label:
    metadata:
      name: notebook
      # Let kustomize inject the canonical app label;
      # keep this block minimal to avoid selector divergence.
      labels: {}
    spec:
      type: ClusterIP
      ports:
        - port: 8888
          protocol: TCP
          targetPort: notebook-port
      # Rely on kustomize to fill in the selector.
      selector: {}
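As a rough illustration of the intended result, the rendered Service would be expected to look something like the sketch below once kustomize injects the label. This is an assumed render, not captured kustomize build output; the apiVersion and kind lines are filled in as the standard values for a Service.

```yaml
# Expected shape after kustomize build (assumed, not verified output).
apiVersion: v1
kind: Service
metadata:
  name: notebook
  labels:
    app: jupyter-rocm-tensorflow-ubi9-python-3-12
spec:
  type: ClusterIP
  ports:
    - port: 8888
      protocol: TCP
      targetPort: notebook-port
  # A single app label, matching the label the pod template receives.
  selector:
    app: jupyter-rocm-tensorflow-ubi9-python-3-12
```

With this shape the selector carries only the generated label, which is the same label the description says the StatefulSet pod template receives.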
References
- PR: RHOAIENG-27434: Create ROCm Tensorflow Python 3.12 Image #1259
- Comment: RHOAIENG-27434: Create ROCm Tensorflow Python 3.12 Image #1259 (comment)
- Related issues: Service/Pod label mismatch in ROCm PyTorch kustomize manifests causes connectivity issues #1251 (same problem in PyTorch), StatefulSet selector and template labels configuration issue in jupyter/rocm/tensorflow #1264 (StatefulSet issue in same component)
- Reported by: @jiridanek