Commit 486f4ab

fix: re-enable the jmx exporter (#585)
* fix: re-enable the jmx exporter for history
* update docs and changelog
1 parent e8a4654 commit 486f4ab

6 files changed, with 25 additions and 16 deletions.

CHANGELOG.md

Lines changed: 5 additions & 4 deletions
@@ -23,7 +23,7 @@ All notable changes to this project will be documented in this file.
 by `FILE_LOG_DIRECTORY` (or via `--file-log-directory <DIRECTORY>`).
 - Replace stackable-operator `print_startup_string` with `tracing::info!` with fields.
 - BREAKING: Inject the vector aggregator address into the vector config using the env var `VECTOR_AGGREGATOR_ADDRESS` instead
-of having the operator write it to the vector config ([#551]).
+of having the operator write it to the vector config ([#551]).
 - Document that Spark Connect doesn't integrate with the history server ([#559])
 - test: Bump to Vector `0.46.1` ([#565]).
 - Use versioned common structs ([#572]).
@@ -33,7 +33,7 @@ All notable changes to this project will be documented in this file.
 - The `runAsUser` and `runAsGroup` fields will not be set anymore by the operator
 - The defaults from the docker images itself will now apply, which will be different from 1000/0 going forward
 - This is marked as breaking because tools and policies might exist, which require these fields to be set
-- BREAKING: the JMX exporter has been an replaced with the built-in Prometheus servlet. The history server pods do not expose metrics anymore ([#584])
+- Enable the built-in Prometheus servlet. The jmx exporter was removed in ([#584]) but added back in ([#585]).

 ### Fixed

@@ -61,6 +61,7 @@ All notable changes to this project will be documented in this file.
 [#580]: https://github.com/stackabletech/spark-k8s-operator/pull/580
 [#575]: https://github.com/stackabletech/spark-k8s-operator/pull/575
 [#584]: https://github.com/stackabletech/spark-k8s-operator/pull/584
+[#585]: https://github.com/stackabletech/spark-k8s-operator/pull/585

 ## [25.3.0] - 2025-03-21

@@ -111,7 +112,7 @@ All notable changes to this project will be documented in this file.
 - BREAKING: The fields `connection` and `host` on `S3Connection` as well as `bucketName` on `S3Bucket`are now mandatory ([#472]).
 - Fix `envOverrides` for SparkApplication and SparkHistoryServer ([#451]).
 - Ensure SparkApplications can only create a single submit Job. Fix for #457 ([#460]).
-- Invalid `SparkApplication`/`SparkHistoryServer` objects don't cause the operator to stop functioning (#[482]).
+- Invalid `SparkApplication`/`SparkHistoryServer` objects don't cause the operator to stop functioning (#[482]).

 ### Removed

@@ -186,7 +187,7 @@ All notable changes to this project will be documented in this file.
 - Support PodDisruptionBudgets for HistoryServer ([#288]).
 - Support for versions 3.4.1, 3.5.0 ([#291]).
 - History server now exports metrics via jmx exporter (port 18081) ([#291]).
-- Document graceful shutdown ([#306]).
+- Document graceful shutdown ([#306]).

 ### Changed

docs/modules/spark-k8s/pages/usage-guide/history-server.adoc

Lines changed: 2 additions & 2 deletions
@@ -161,6 +161,6 @@ image::history-server-ui.png[History Server Console]

 [NOTE]
 ====
-Up to version 25.3 of the Stackable Data Platform, the history server used the JMX exporter to expose metrics on a separate port.
-Starting with version 25.7 the JMX exporter has been removed and the history server doesn't expose metrics as of Spark version 3.5.6.
+Starting with version 25.7, the built-in Prometheus servlet is enabled in addition to the existing JMX exporter.
+The JMX exporter is still available but it is deprecated and will be removed in a future release.
 ====

docs/modules/spark-k8s/pages/usage-guide/operations/applications.adoc

Lines changed: 2 additions & 2 deletions
@@ -13,8 +13,8 @@ To resubmit an application, a new SparkApplication resource must be created.

 [NOTE]
 ====
-Up to version 25.3 of the Stackable Data Platform, Spark applications used the JMX exporter to expose metrics on a separate port.
-Starting with version 25.7, the built-in Prometheus servlet is used instead.
+Starting with version 25.7, the built-in Prometheus servlet is enabled.
+The JMX exporter is available but not used for applications. It has never been used automatically for applications and now it is deprecated.
 ====

 Application driver pods expose Prometheus metrics at the following endpoints:

rust/operator-binary/src/crd/constants.rs

Lines changed: 1 addition & 0 deletions
@@ -86,6 +86,7 @@ pub const SPARK_DEFAULTS_FILE_NAME: &str = "spark-defaults.conf";
 pub const SPARK_ENV_SH_FILE_NAME: &str = "spark-env.sh";

 pub const SPARK_CLUSTER_ROLE: &str = "spark-k8s-clusterrole";
+pub const METRICS_PORT: u16 = 18081;
 pub const HISTORY_UI_PORT: u16 = 18080;

 pub const LISTENER_VOLUME_NAME: &str = "listener";

rust/operator-binary/src/history/config/jvm.rs

Lines changed: 9 additions & 3 deletions
@@ -5,8 +5,9 @@ use stackable_operator::role_utils::{

 use crate::crd::{
 constants::{
-JVM_SECURITY_PROPERTIES_FILE, LOG4J2_CONFIG_FILE, STACKABLE_TLS_STORE_PASSWORD,
-STACKABLE_TRUST_STORE, VOLUME_MOUNT_PATH_CONFIG, VOLUME_MOUNT_PATH_LOG_CONFIG,
+JVM_SECURITY_PROPERTIES_FILE, LOG4J2_CONFIG_FILE, METRICS_PORT,
+STACKABLE_TLS_STORE_PASSWORD, STACKABLE_TRUST_STORE, VOLUME_MOUNT_PATH_CONFIG,
+VOLUME_MOUNT_PATH_LOG_CONFIG,
 },
 history::HistoryConfigFragment,
 logdir::ResolvedLogDir,
@@ -32,6 +33,9 @@ pub fn construct_history_jvm_args(
 format!(
 "-Djava.security.properties={VOLUME_MOUNT_PATH_CONFIG}/{JVM_SECURITY_PROPERTIES_FILE}"
 ),
+format!(
+"-javaagent:/stackable/jmx/jmx_prometheus_javaagent.jar={METRICS_PORT}:/stackable/jmx/config.yaml"
+),
 ];

 if logdir.tls_enabled() {
@@ -82,7 +86,8 @@ mod tests {
 assert_eq!(
 jvm_config,
 "-Dlog4j.configurationFile=/stackable/log_config/log4j2.properties \
--Djava.security.properties=/stackable/spark/conf/security.properties"
+-Djava.security.properties=/stackable/spark/conf/security.properties \
+-javaagent:/stackable/jmx/jmx_prometheus_javaagent.jar=18081:/stackable/jmx/config.yaml"
 );
 }

@@ -125,6 +130,7 @@ mod tests {
 jvm_config,
 "-Dlog4j.configurationFile=/stackable/log_config/log4j2.properties \
 -Djava.security.properties=/stackable/spark/conf/security.properties \
+-javaagent:/stackable/jmx/jmx_prometheus_javaagent.jar=18081:/stackable/jmx/config.yaml \
 -Dhttps.proxyHost=proxy.my.corp \
 -Djava.net.preferIPv4Stack=true \
 -Dhttps.proxyPort=1234"
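
The hunks above add a `-javaagent` flag to the history server's JVM arguments, with the port taken from `METRICS_PORT`. The following is a minimal standalone sketch of that string construction, not the operator's actual code: `jmx_agent_arg` and `join_jvm_args` are hypothetical helper names introduced for illustration, while the jar path, config path, and port value are copied from the diff.

```rust
/// Port the JMX exporter listens on (value taken from the diff).
pub const METRICS_PORT: u16 = 18081;

/// Hypothetical helper: builds the `-javaagent` flag in the
/// `<jar>=<port>:<config>` form that appears in the diff.
pub fn jmx_agent_arg(port: u16) -> String {
    format!(
        "-javaagent:/stackable/jmx/jmx_prometheus_javaagent.jar={port}:/stackable/jmx/config.yaml"
    )
}

/// Hypothetical helper: joins individual JVM flags into the single
/// space-separated string that the tests in the diff compare against.
pub fn join_jvm_args(args: &[String]) -> String {
    args.join(" ")
}
```

Calling `jmx_agent_arg(METRICS_PORT)` yields the same flag that the updated test expectations contain. Keeping the port in one constant means the JVM flag and the container port declaration cannot drift apart.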

rust/operator-binary/src/history/history_controller.rs

Lines changed: 6 additions & 5 deletions
@@ -56,11 +56,11 @@ use crate::{
 constants::{
 ACCESS_KEY_ID, HISTORY_APP_NAME, HISTORY_CONTROLLER_NAME, HISTORY_ROLE_NAME,
 HISTORY_UI_PORT, JVM_SECURITY_PROPERTIES_FILE, LISTENER_VOLUME_DIR,
-LISTENER_VOLUME_NAME, MAX_SPARK_LOG_FILES_SIZE, OPERATOR_NAME, SECRET_ACCESS_KEY,
-SPARK_DEFAULTS_FILE_NAME, SPARK_ENV_SH_FILE_NAME, SPARK_IMAGE_BASE_NAME,
-STACKABLE_TRUST_STORE, VOLUME_MOUNT_NAME_CONFIG, VOLUME_MOUNT_NAME_LOG,
-VOLUME_MOUNT_NAME_LOG_CONFIG, VOLUME_MOUNT_PATH_CONFIG, VOLUME_MOUNT_PATH_LOG,
-VOLUME_MOUNT_PATH_LOG_CONFIG,
+LISTENER_VOLUME_NAME, MAX_SPARK_LOG_FILES_SIZE, METRICS_PORT, OPERATOR_NAME,
+SECRET_ACCESS_KEY, SPARK_DEFAULTS_FILE_NAME, SPARK_ENV_SH_FILE_NAME,
+SPARK_IMAGE_BASE_NAME, STACKABLE_TRUST_STORE, VOLUME_MOUNT_NAME_CONFIG,
+VOLUME_MOUNT_NAME_LOG, VOLUME_MOUNT_NAME_LOG_CONFIG, VOLUME_MOUNT_PATH_CONFIG,
+VOLUME_MOUNT_PATH_LOG, VOLUME_MOUNT_PATH_LOG_CONFIG,
 },
 history::{self, HistoryConfig, SparkHistoryServerContainer, v1alpha1},
 listener_ext,
@@ -574,6 +574,7 @@ fn build_stateful_set(
 ])
 .args(command_args(log_dir))
 .add_container_port("http", HISTORY_UI_PORT.into())
+.add_container_port("metrics", METRICS_PORT.into())
 .add_env_vars(merged_env)
 .add_volume_mounts(log_dir.volume_mounts())
 .context(AddVolumeMountSnafu)?
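
The hunk above declares a second named container port, `metrics`, next to the existing `http` port. As a rough illustration of the resulting port list, the sketch below models it with a plain struct instead of the real container builder. `NamedPort` and `history_container_ports` are hypothetical names, not part of stackable-operator. The widening from `u16` to `i32` mirrors the `.into()` calls in the diff and assumes the builder takes an `i32` port, as Kubernetes container ports do.

```rust
pub const HISTORY_UI_PORT: u16 = 18080;
pub const METRICS_PORT: u16 = 18081;

/// Hypothetical stand-in for a Kubernetes container port entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NamedPort {
    pub name: &'static str,
    pub port: i32,
}

/// Hypothetical helper: the two ports the history server container
/// declares after this commit, in the order the diff adds them.
pub fn history_container_ports() -> Vec<NamedPort> {
    vec![
        NamedPort { name: "http", port: i32::from(HISTORY_UI_PORT) },
        NamedPort { name: "metrics", port: i32::from(METRICS_PORT) },
    ]
}
```

Naming the port `metrics` lets monitoring configuration refer to it by name rather than by number, which is presumably why the diff declares it explicitly.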
