
feat: Add Tubi production configuration for Spark Kubernetes Operator #1

Open
Sunninsky wants to merge 3 commits into main from tubi/production-config

Conversation

@Sunninsky
Collaborator

Summary

This PR adds Tubi-specific configuration and manifests for running the Apache Spark Kubernetes Operator in production.

Changes

Configuration Files

  • tubi-production-values.yaml - Helm chart values for production deployment
  • upgrade-to-0.6.0.sh - Script for upgrading operator versions safely
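A minimal sketch of the kind of upgrade flow a script like `upgrade-to-0.6.0.sh` typically wraps; the Helm release name, chart reference, and namespace below are assumptions, not taken from this PR:

```shell
# Hypothetical upgrade flow; release name, chart reference,
# and namespace are placeholders for illustration only.
helm repo update
helm upgrade spark-kubernetes-operator \
  spark-kubernetes-operator/spark-kubernetes-operator \
  --namespace spark-operator \
  --version 0.6.0 \
  -f tubi-production-values.yaml
```

Pinning `--version` and passing the production values file together keeps the operator version and its configuration in lockstep during the upgrade.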

Manifests

  • manifests/spark-connect-server.yaml - Spark Connect Server with production configuration
  • manifests/spark-history-server.yaml - Spark History Server with S3 log storage
  • manifests/labeled-presentation-job.yaml - Example Scala batch job

PySpark Test Manifests

  • manifests/pyspark-test.yaml - Basic PySpark validation
  • manifests/pyspark-tensorflow-test.yaml - TensorFlow + wheel installation via init container
  • manifests/pyspark-s3-test.yaml - S3 pyFiles test
  • manifests/pyspark-wheel-test.yaml - Wheel distribution test
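The TensorFlow/wheel test relies on a standard Kubernetes pattern: an init container installs packages into a shared volume that the driver then picks up via `PYTHONPATH`. The fragment below is an illustrative sketch of that pattern only; the container names, image, wheel filename, and paths are assumptions and the exact schema depends on the operator's CRD:

```yaml
# Illustrative pod-spec fragment for the init-container wheel pattern;
# names, image, and paths are placeholders, not the PR's actual manifest.
initContainers:
  - name: install-wheels
    image: python:3.10-slim
    command: ["pip", "install", "--target", "/opt/wheels",
              "tensorflow", "/wheels/my_package-1.0-py3-none-any.whl"]
    volumeMounts:
      - name: wheel-target
        mountPath: /opt/wheels
containers:
  - name: spark-driver
    env:
      - name: PYTHONPATH
        value: /opt/wheels
    volumeMounts:
      - name: wheel-target
        mountPath: /opt/wheels
volumes:
  - name: wheel-target
    emptyDir: {}
```

The `emptyDir` volume exists only for the pod's lifetime, so the wheels are installed fresh on each run without baking them into the Spark image.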

Documentation

  • TUBI_README.md - Usage documentation for Tubi engineers

Tested On

  • Cluster: scalamigo-v2-production
  • Operator version: 0.6.0
  • Spark: 3.5.3 (Scala 2.12)
  • Image: public.ecr.aws/dataminded/spark-k8s-glue:v3.5.3-hadoop-3.3.6-v2

Notes

  • Unity Catalog tokens are commented out; they should be provided via Kubernetes Secrets
  • IAM roles use kiam annotations for S3 access
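With kiam, pods assume an IAM role declared through a pod annotation, which the kiam agent intercepts when the pod calls the EC2 metadata API. A minimal example of the annotation; the role name here is a placeholder:

```yaml
# kiam pod annotation granting S3 access; the role name is hypothetical.
metadata:
  annotations:
    iam.amazonaws.com/role: tubi-spark-s3-access-role
```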

dongjoon-hyun and others added 3 commits November 3, 2025 17:33
### What changes were proposed in this pull request?

This PR aims to update `GitHub Action` YAML file in `branch-0.6`.

### Why are the changes needed?

To enable GitHub Action CI.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

CI is triggered on this PR and passes.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#415 from dongjoon-hyun/SPARK-54167.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
- Add tubi-production-values.yaml for Helm chart configuration
- Add manifests for Spark Connect Server, History Server, and batch jobs
- Add PySpark test manifests for TensorFlow/wheel installation validation
- Add upgrade script for operator version upgrades
- Add TUBI_README.md with usage documentation

Tested on scalamigo-v2-production cluster with:
- Operator version 0.6.0
- Spark 3.5.3 with Scala 2.12
- S3 access via kiam/IRSA
- TensorFlow and custom wheel installation
