@jiangzho jiangzho commented Aug 7, 2025

What changes were proposed in this pull request?

This PR adds SparkCluster as a supported type in metrics and therefore supports publishing execution metrics for it.

Why are the changes needed?

By default, the operator publishes metrics per resource type (SparkApplication). With this change, the same set of counters and histograms for received events and timed executions is also published for SparkCluster.
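To illustrate what "metrics per resource type" means, here is a minimal sketch, not the operator's actual code (the operator is written in Java and its implementation is not shown in this description). The class `ResourceMetrics`, the set `SUPPORTED_KINDS`, and the rule that unsupported kinds are skipped are all assumptions made for illustration; only the naming pattern `operator.sdk_<kind>_<event>` is taken from the metric names reported in the test output.

```python
from collections import Counter

# Hypothetical: the set of resource kinds that get their own metrics.
# Adding "SparkCluster" here mirrors the intent of this PR.
SUPPORTED_KINDS = {"SparkApplication", "SparkCluster"}


class ResourceMetrics:
    """Illustrative per-kind counter registry (not the operator's real API)."""

    def __init__(self, supported_kinds):
        self.supported_kinds = set(supported_kinds)
        self.counters = Counter()

    def metric_name(self, kind, event):
        # Naming pattern inferred from the sample output,
        # e.g. operator.sdk_sparkcluster_reconciliation_finished
        return f"operator.sdk_{kind.lower()}_{event}"

    def record(self, kind, event):
        # Assumption: events for kinds outside the supported set are not counted.
        if kind not in self.supported_kinds:
            return False
        self.counters[self.metric_name(kind, event)] += 1
        return True


metrics = ResourceMetrics(SUPPORTED_KINDS)
metrics.record("SparkApplication", "reconciliation_finished")
metrics.record("SparkCluster", "reconciliation_finished")
metrics.record("SparkCluster", "added_resource_event")
```

With both kinds registered, each one accumulates its own counters under a kind-specific name, which is the behavior the PR description reports for SparkCluster.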

Does this PR introduce any user-facing change?

Yes. More metrics become available for SparkCluster resources.

How was this patch tested?

In a dev sandbox, the operator exposes metrics such as:

metrics_operator.sdk_sparkapplication_added_resource_event_Count{type="counters"} 9010
metrics_operator.sdk_sparkapplication_reconciliation_failed_Count{type="counters"} 9
metrics_operator.sdk_sparkapplication_reconciliation_finished_Count{type="counters"} 182841
metrics_operator.sdk_sparkapplication_reconciliation_retries_Count{type="counters"} 9
metrics_operator.sdk_sparkcluster_added_resource_event_Count{type="counters"} 9009
metrics_operator.sdk_sparkcluster_reconciliation_failed_Count{type="counters"} 0
metrics_operator.sdk_sparkcluster_reconciliation_finished_Count{type="counters"} 182821
metrics_operator.sdk_sparkcluster_reconciliation_retries_Count{type="counters"} 0
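One way to check that SparkCluster now reports the same counters as SparkApplication is to group the sample output by resource kind. The snippet below is a sketch: the regular expression and the `parse` helper are assumptions written against the exact line format shown above, not part of the operator or any metrics library.

```python
import re
from collections import defaultdict

# Sample lines copied from the test output above.
SAMPLE = """\
metrics_operator.sdk_sparkapplication_added_resource_event_Count{type="counters"} 9010
metrics_operator.sdk_sparkapplication_reconciliation_failed_Count{type="counters"} 9
metrics_operator.sdk_sparkapplication_reconciliation_finished_Count{type="counters"} 182841
metrics_operator.sdk_sparkapplication_reconciliation_retries_Count{type="counters"} 9
metrics_operator.sdk_sparkcluster_added_resource_event_Count{type="counters"} 9009
metrics_operator.sdk_sparkcluster_reconciliation_failed_Count{type="counters"} 0
metrics_operator.sdk_sparkcluster_reconciliation_finished_Count{type="counters"} 182821
metrics_operator.sdk_sparkcluster_reconciliation_retries_Count{type="counters"} 0
"""

# Hypothetical pattern: <prefix>_<kind>_<event>_Count{type="counters"} <value>
LINE = re.compile(
    r'^metrics_operator\.sdk_(?P<kind>[a-z]+)_(?P<event>\w+)_Count'
    r'\{type="counters"\} (?P<value>\d+)$'
)


def parse(text):
    """Group counter values by resource kind, then by event name."""
    by_kind = defaultdict(dict)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            by_kind[match["kind"]][match["event"]] = int(match["value"])
    return dict(by_kind)


counts = parse(SAMPLE)
same_events = counts["sparkapplication"].keys() == counts["sparkcluster"].keys()
```

For the sample above, `same_events` is `True`: both kinds expose the four counters `added_resource_event`, `reconciliation_failed`, `reconciliation_finished`, and `reconciliation_retries`.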

Was this patch authored or co-authored using generative AI tooling?

No

jiangzho commented Aug 7, 2025

cc @peter-toth for review. Thanks a lot!

@peter-toth peter-toth left a comment

LGTM, thanks for the improvement!

@peter-toth peter-toth closed this in cdd2790 Aug 8, 2025
@peter-toth
Thanks for the PR @jiangzho!

Merged to main (0.5.0).

jiangzho added a commit to jiangzho/spark-kubernetes-operator that referenced this pull request Aug 8, 2025

Closes apache#295 from jiangzho/cluster_metrics.

Authored-by: Zhou JIANG <[email protected]>
Signed-off-by: Peter Toth <[email protected]>