[SPARK-52997] Fixes wrong worker assignment if multiple clusters are deployed to the same namespace #291
Conversation
Hey @dongjoon-hyun, it would be great if you could have a look at this. Our team is using the operator and we are grateful for the great work you folks have been doing.
Thank you for making a PR, @schmaxXximilian. cc @peter-toth
...-submission-worker/src/main/java/org/apache/spark/k8s/operator/SparkClusterResourceSpec.java
      .withClusterIP("None")
      .withSelector(
-         Collections.singletonMap(LABEL_SPARK_ROLE_NAME, LABEL_SPARK_ROLE_MASTER_VALUE))
+         Map.of(
May I ask why we need to use Map.of instead of Collections.singletonMap?
Hi @dongjoon-hyun, I'm afraid I'm not quite sure what you are suggesting. Collections.singletonMap only creates a map with a single element.
I changed my initial commit to the approach below. Do you think it would be more suitable?
.addToSelector(Collections.singletonMap(LABEL_SPARK_CLUSTER_NAME, name))
.addToSelector(
Collections.singletonMap(LABEL_SPARK_ROLE_NAME, LABEL_SPARK_ROLE_MASTER_VALUE))
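For context, a Service built with selectors like the above would render roughly as the following manifest. This is a hypothetical sketch, not the operator's actual output; the label keys and the cluster name are illustrative placeholders for whatever the operator's label constants (e.g. LABEL_SPARK_CLUSTER_NAME, LABEL_SPARK_ROLE_NAME) resolve to:

```yaml
# Hypothetical sketch of the master Service spec after the fix.
# Label keys/values are illustrative, not the operator's exact constants.
spec:
  clusterIP: None
  selector:
    spark-role: master        # previously the only selector entry
    spark-cluster: cluster-a  # added so the Service targets only its own cluster's pods
```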
I left a few comments including https://github.com/apache/spark-kubernetes-operator/pull/291/files#r2243410142 .
In general, I understand your requirements although this is not recommended for HPA-enabled Spark Clusters. Let me play with this for a while.
dongjoon-hyun left a comment:
+1, LGTM. Thank you, @schmaxXximilian. Sorry for the delay.
Merged to main.
I added you to the Apache Spark contributor group (of ASF JIRA) and assigned SPARK-52997 to you. Welcome to the Apache Spark community.
Closes apache#291 from schmaxXximilian/main. Authored-by: Schmöller Maximilian <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 7915164)
What changes were proposed in this pull request?
Updated the podSelector for master and worker services to include both clusterRole and name labels.
Why are the changes needed?
Using only clusterRole caused service misrouting when multiple Spark clusters were deployed in the same namespace. Adding name ensures correct pod targeting.
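To make the misrouting concrete, the following is a small, self-contained sketch, not operator code, that simulates Kubernetes label-selector semantics (a selector matches a pod when every selector entry is present in the pod's labels). The label keys `spark-role` and `spark-cluster` are illustrative placeholders for the operator's actual label constants:

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration: simulates why a role-only Service selector
// matches master pods from every cluster in the namespace, while adding
// the cluster name label narrows matching to the intended cluster.
public class SelectorDemo {
  // Kubernetes equality-based selector semantics: every selector entry
  // must appear with the same value in the pod's labels.
  static boolean matches(Map<String, String> selector, Map<String, String> podLabels) {
    return selector.entrySet().stream()
        .allMatch(e -> e.getValue().equals(podLabels.get(e.getKey())));
  }

  public static void main(String[] args) {
    // Two master pods from different clusters sharing one namespace.
    List<Map<String, String>> pods = List.of(
        Map.of("spark-role", "master", "spark-cluster", "cluster-a"),
        Map.of("spark-role", "master", "spark-cluster", "cluster-b"));

    // Old selector: role only -- matches both clusters' masters.
    Map<String, String> roleOnly = Map.of("spark-role", "master");
    // Fixed selector: role plus cluster name -- matches only its own pods.
    Map<String, String> roleAndName =
        Map.of("spark-role", "master", "spark-cluster", "cluster-a");

    long oldMatches = pods.stream().filter(p -> matches(roleOnly, p)).count();
    long newMatches = pods.stream().filter(p -> matches(roleAndName, p)).count();

    System.out.println("role-only matches: " + oldMatches);  // 2 -> misrouting
    System.out.println("role+name matches: " + newMatches);  // 1 -> correct
  }
}
```

With the role-only selector, the Service's endpoints include pods from both clusters, so traffic can land on the wrong cluster's master; adding the name label removes the ambiguity.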
Does this PR introduce any user-facing change?
No, this is an internal fix to service selectors.
How was this patch tested?
Tested with multiple clusters in the same namespace. Verified each service only matched its own pods via kubectl describe service.
Also adapted unit tests to reflect the new behaviour.
Was this patch authored or co-authored using generative AI tooling?
Yes, PR metadata was assisted by AI, but code changes were made manually.