Description
Spark application logs should be storable in Azure Data Lake Storage Gen2 (ADLS).
The hadoop-azure module must be added to the Spark image, and the spec.logFileDirectory structure could be extended with an adls variant; currently only s3 is supported. Alternatively, it should be possible to specify a custom log directory.
The SAS token should be read from a Secret.
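A rough sketch of what such an adls variant with a SAS token read from a Secret could look like. This is only an illustration of the proposal, not an implemented API; the adls, storageAccount, container and credentials field names and the Secret layout are all assumptions:

```yaml
# Hypothetical sketch of the proposed adls variant; all field names are assumptions.
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkHistoryServer
metadata:
  name: spark-history
spec:
  logFileDirectory:
    adls:                          # proposed alongside the existing s3 variant
      storageAccount: myaccount    # Azure storage account name (placeholder)
      container: spark-logs        # ADLS Gen2 container holding the event logs
      credentials:
        secretName: adls-sas-token # Secret providing the SAS token
---
apiVersion: v1
kind: Secret
metadata:
  name: adls-sas-token
stringData:
  sasToken: "<SAS token>"          # placeholder; never commit real tokens
```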
- feat: Make spark-env.sh configurable #473
- stackabletech/decisions#32
- feat: custom log directory #479 (including an integration test for a custom log directory in HDFS)
- Manually test a custom log directory in ADLS with spark-k8s-operator:0.0.0-pr479 (a rough configuration sketch follows this list):
  - SparkHistoryServer: https://github.com/stackabletech/sbernauer-customers/blob/e2e6db97645ac2d0fd54130c81e45a8d3e7107e7/airplus/poc-cluster/specs/spark-history.yaml
  - SparkApplication: https://github.com/stackabletech/sbernauer-customers/blob/e2e6db97645ac2d0fd54130c81e45a8d3e7107e7/airplus/poc-cluster/test/spark.yaml
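
A sketch of what such a manually tested ADLS setup could look like. The customLogDirectory field name and the sparkConf wiring are assumptions based on the linked docs and specs, not copied from them; the ABFS properties shown follow the standard hadoop-azure SAS configuration:

```yaml
# Rough sketch only; field names, account name and paths are assumptions.
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkHistoryServer
metadata:
  name: spark-history
spec:
  logFileDirectory:
    customLogDirectory: abfs://spark-logs@myaccount.dfs.core.windows.net/eventlogs/
  sparkConf:
    # Standard hadoop-azure SAS settings; the SAS token itself should come from a
    # Secret (e.g. injected via a configurable spark-env.sh, see #473) rather than
    # being written into the spec.
    spark.hadoop.fs.azure.account.auth.type.myaccount.dfs.core.windows.net: SAS
    # Fixed SAS tokens require a sufficiently recent hadoop-azure; older versions
    # need a custom SASTokenProvider implementation instead.
    spark.hadoop.fs.azure.sas.fixed.token.myaccount.dfs.core.windows.net: "<SAS token, e.g. from a Secret>"
```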
Documentation
https://docs.stackable.tech/home/nightly/spark-k8s/usage-guide/history-server#_custom_log_directory
Release Notes
New / extended platform features
Other product features
- Apache Spark: A custom log directory can be specified for the event logs, allowing a location other than an S3 bucket to be chosen, e.g. HDFS or ABFS.
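
For illustration, a minimal sketch of the HDFS case on the application side; the customLogDirectory field name on the SparkApplication and the HDFS address are assumptions:

```yaml
# Minimal sketch; field name and HDFS address are assumptions.
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  logFileDirectory:
    customLogDirectory: hdfs://simple-hdfs/eventlogs/  # written by the job, read by the history server
  # remaining application settings (image, mainClass, ...) omitted
```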