To set up the spark-cluster chart locally you need to:
- Install Minikube or Kubernetes on Docker Desktop for your OS
  - Supported Kubernetes versions: 1.11.0 - 1.18.0.
    minikube start --kubernetes-version=1.18.0 --cpus=12 --memory=14g
- Install Kubectl
- Install Helm and initialize it (for Helm 2.x)
    export TILLER_NAMESPACE=kube-system
    kubectl create -n kube-system -f scripts/cluster-admin.yaml
    kubectl create serviceaccount tiller --namespace kube-system
    kubectl create clusterrolebinding tiller-cluster-role --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
    helm init --upgrade --service-account tiller
- Add and sync Helm repository jahstreet
    helm repo add jetstack https://charts.jetstack.io
    helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
    helm repo add jupyterhub https://jupyterhub.github.io/helm-chart
    helm repo add loki https://grafana.github.io/loki/charts
    helm repo add jahstreet https://jahstreet.github.io/helm-charts
    helm repo update
- Run in a separate terminal
    minikube tunnel
- Install cluster-base chart
    kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v0.15.2/cert-manager.crds.yaml
    helm upgrade --install cluster-base jahstreet/cluster-base --namespace kube-system
- Check the Nginx Ingress Controller load balancer external IP
    kubectl get service cluster-base-ingress-nginx-controller --namespace kube-system
- Add entry to hosts file
    <load-balancer-external-IP> my-cluster.example.com
- Install spark-cluster chart (NOTE: use release name spark-cluster)
    helm upgrade --install spark-cluster --namespace spark-cluster jahstreet/spark-cluster \
      --timeout 600 \
      -f charts/spark-cluster/examples/custom-values-local.yaml
- Installation may take some time; wait until the Pods are Running
    kubectl get pods --watch --namespace spark-cluster
- Go to https://my-cluster.example.com/jupyterhub in your browser
- Enter login admin and password admin
- Spawn a Jupyter profile and you'll be redirected to your personal Jupyter Notebook once it's up and running
- You can find the Livy UI, with clickable links to the Spark UI, logs and debug info for the running Jupyter sessions, at https://my-cluster.example.com/livy
- Try out the notebooks in the examples/ folder
- Install spark-monitoring chart
    helm upgrade --install spark-monitoring --namespace monitoring jahstreet/spark-monitoring \
      --timeout 600 \
      -f charts/spark-monitoring/examples/custom-values-example.yaml
  Note: at present the spark-monitoring chart must be installed with the release name spark-monitoring into the monitoring namespace for the Prometheus Pushgateway service monitor to work properly. To change that, refer to the pushgateway section of charts/spark-monitoring/values.yaml.
- Installation may take some time; wait until the Pods are Running
    kubectl get pods --watch --namespace monitoring
- Go to https://my-cluster.example.com/grafana in your browser
- Log in to Grafana with user admin and password admin
- Go to the Explore page via the corresponding tab on the left panel, select the Loki datasource and choose the Kubernetes labels to get Pod logs for
- You can also find pre-installed Grafana dashboards: Spark Metrics and Cluster State Board
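
The supported Kubernetes version range mentioned in the first step can be checked before starting Minikube. The following is a minimal sketch, not part of the chart or its scripts: the function name version_in_range and the variable K8S_VERSION are hypothetical, and it assumes a `sort` that supports `-V` (as in GNU coreutils).

```shell
# Hypothetical helper: succeeds when version $1 lies within [$2, $3], inclusive.
# Assumes `sort -V` (version sort) is available locally.
version_in_range() {
    version="$1"; min="$2"; max="$3"
    # The smaller of (min, version) must be min ...
    lowest="$(printf '%s\n' "$min" "$version" | sort -V | head -n 1)"
    # ... and the larger of (version, max) must be max.
    highest="$(printf '%s\n' "$version" "$max" | sort -V | tail -n 1)"
    [ "$lowest" = "$min" ] && [ "$highest" = "$max" ]
}

# Example: validate the version before passing it to `minikube start`.
K8S_VERSION="1.18.0"
if version_in_range "$K8S_VERSION" "1.11.0" "1.18.0"; then
    echo "Kubernetes $K8S_VERSION is within the supported range"
else
    echo "Kubernetes $K8S_VERSION is outside the supported range 1.11.0 - 1.18.0" >&2
fi
```

Comparing with version sort rather than plain string comparison matters here, because a lexical comparison would wrongly place 1.9.0 after 1.11.0.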
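
The steps for looking up the load balancer IP, adding the hosts entry and waiting for the Pods could be scripted roughly as follows. This is a sketch under assumptions, not something the charts provide: the function names ingress_hosts_entry and wait_for_pods are hypothetical, the service is assumed to expose an IP (not a hostname) under `.status.loadBalancer.ingress[0]`, and `kubectl wait` is used here as an alternative to watching `kubectl get pods --watch` by hand.

```shell
# Hypothetical helper: prints a hosts-file line for the ingress load balancer.
# $1 is the host name; it defaults to the one used in this guide.
ingress_hosts_entry() {
    host="${1:-my-cluster.example.com}"
    ip="$(kubectl get service cluster-base-ingress-nginx-controller \
        --namespace kube-system \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
    if [ -z "$ip" ]; then
        echo "No external IP assigned yet; is 'minikube tunnel' running?" >&2
        return 1
    fi
    printf '%s %s\n' "$ip" "$host"
}

# Hypothetical helper: blocks until every Pod in namespace $1 reports Ready,
# or until the timeout $2 (default 600s) expires.
# Caveat: Pods of completed Jobs never become Ready, so this can time out
# in namespaces that contain them.
wait_for_pods() {
    namespace="$1"
    timeout="${2:-600s}"
    kubectl wait --for=condition=Ready pods --all \
        --namespace "$namespace" --timeout="$timeout"
}

# Possible usage (not executed here):
#   ingress_hosts_entry | sudo tee -a /etc/hosts
#   wait_for_pods spark-cluster
#   wait_for_pods monitoring
```

Printing the hosts line instead of writing it directly keeps the privileged step (appending to the hosts file) explicit and under the user's control.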
