@jiangzho jiangzho commented Aug 8, 2025

What changes were proposed in this pull request?

This is a bug fix that applies the health probe port override from the Helm chart values to the operator config.

Why are the changes needed?

Without this patch, a port override set in values.yaml is applied only to the operator container spec and not to the operator config, which then causes probe failures.
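
For illustration only, the mismatch can be sketched roughly as below. This is not the chart or operator code: the `probePort` key, the `DEFAULT_PROBE_PORT` value, and all function names are hypothetical stand-ins chosen for the example.

```python
# Hypothetical model of the mismatch; key names and the default are
# placeholders, not the real chart keys or the operator's actual default.
DEFAULT_PROBE_PORT = 9090


def container_probe_port(values: dict) -> int:
    """Port that the rendered container spec would probe."""
    return values.get("probePort", DEFAULT_PROBE_PORT)


def operator_config_port(values: dict, propagate_override: bool) -> int:
    """Port that the operator would serve health probes on, per its config."""
    if propagate_override:
        # Behaviour with the fix: the override also reaches the config.
        return values.get("probePort", DEFAULT_PROBE_PORT)
    # Behaviour without the fix: the override is ignored by the config.
    return DEFAULT_PROBE_PORT


def probe_would_succeed(values: dict, propagate_override: bool) -> bool:
    """The probe can only succeed when both sides agree on the port."""
    return container_probe_port(values) == operator_config_port(
        values, propagate_override
    )
```

Under these assumptions, an override such as `{"probePort": 18080}` only yields matching ports when the override is propagated to the config as well; with no override, both sides fall back to the same default and agree either way.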

Does this PR introduce any user-facing change?

No

How was this patch tested?

CIs, helm lint, and local testing

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the BUILD label Aug 8, 2025

jiangzho commented Aug 8, 2025

cc @peter-toth

@peter-toth

@jiangzho, can you please resolve the conflict?

### What changes were proposed in this pull request?

This PR adds SparkCluster as a supported type in metrics and therefore supports publishing execution metrics for it.

### Why are the changes needed?

By default, the operator publishes metrics by resource type (SparkApplication). With this change, the same set of counters and histograms for received events and timed executions is supported for SparkCluster.

### Does this PR introduce _any_ user-facing change?

More metrics become available for SparkClusters.

### How was this patch tested?

In a dev sandbox, we can see metrics such as:

```
metrics_operator.sdk_sparkapplication_added_resource_event_Count{type="counters"} 9010
metrics_operator.sdk_sparkapplication_reconciliation_failed_Count{type="counters"} 9
metrics_operator.sdk_sparkapplication_reconciliation_finished_Count{type="counters"} 182841
metrics_operator.sdk_sparkapplication_reconciliation_retries_Count{type="counters"} 9
metrics_operator.sdk_sparkcluster_added_resource_event_Count{type="counters"} 9009
metrics_operator.sdk_sparkcluster_reconciliation_failed_Count{type="counters"} 0
metrics_operator.sdk_sparkcluster_reconciliation_finished_Count{type="counters"} 182821
metrics_operator.sdk_sparkcluster_reconciliation_retries_Count{type="counters"} 0
```
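
As a rough aid for reading output like the above, the counters can be grouped by resource type with a small script. This is only a sketch: the regular expression assumes the exact line shape shown in the sample, and `counters_by_resource` is a hypothetical helper, not part of the operator.

```python
import re
from collections import defaultdict

# Sample lines copied from the sandbox output above.
SAMPLE = """\
metrics_operator.sdk_sparkapplication_added_resource_event_Count{type="counters"} 9010
metrics_operator.sdk_sparkapplication_reconciliation_failed_Count{type="counters"} 9
metrics_operator.sdk_sparkapplication_reconciliation_finished_Count{type="counters"} 182841
metrics_operator.sdk_sparkapplication_reconciliation_retries_Count{type="counters"} 9
metrics_operator.sdk_sparkcluster_added_resource_event_Count{type="counters"} 9009
metrics_operator.sdk_sparkcluster_reconciliation_failed_Count{type="counters"} 0
metrics_operator.sdk_sparkcluster_reconciliation_finished_Count{type="counters"} 182821
metrics_operator.sdk_sparkcluster_reconciliation_retries_Count{type="counters"} 0
"""

# Assumes the line shape seen in the sample; other metric kinds are skipped.
LINE = re.compile(
    r'^metrics_operator\.sdk_(sparkapplication|sparkcluster)_(\w+)_Count'
    r'\{type="counters"\} (\d+)$'
)


def counters_by_resource(text: str) -> dict:
    """Group counter values by resource type, then by counter name."""
    grouped = defaultdict(dict)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            resource, name, value = match.groups()
            grouped[resource][name] = int(value)
    return dict(grouped)


counters = counters_by_resource(SAMPLE)
```

With the sample above, both resource types end up with the same four counter names, which is a quick way to check that SparkCluster publishes the same set as SparkApplication.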

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#295 from jiangzho/cluster_metrics.

Authored-by: Zhou JIANG <[email protected]>
Signed-off-by: Peter Toth <[email protected]>
@jiangzho jiangzho force-pushed the probe_port_override branch from bddf7fd to c44db7d Compare August 8, 2025 20:47
@peter-toth

Thanks @jiangzho for the fix!

Merged to main (0.5.0).
