Closed
10 changes: 5 additions & 5 deletions docker/playground/.env
@@ -18,15 +18,15 @@
AWS_JAVA_SDK_VERSION=1.12.367
HADOOP_VERSION=3.3.6
HIVE_VERSION=2.3.9
ICEBERG_VERSION=1.6.1
KYUUBI_VERSION=1.9.0
ICEBERG_VERSION=1.10.1
KYUUBI_VERSION=1.11.0
KYUUBI_HADOOP_VERSION=3.3.6
POSTGRES_VERSION=12
POSTGRES_JDBC_VERSION=42.3.4
SCALA_BINARY_VERSION=2.12
SPARK_VERSION=3.4.4
SPARK_BINARY_VERSION=3.4
SPARK_VERSION=3.5.7
SPARK_BINARY_VERSION=3.5
SPARK_HADOOP_VERSION=3.3.4
Member Author

Spark 4 uses Hadoop client 3.4, which switches to AWS SDK 2.x and requires more work, so let's keep using Spark 3.5 for now. This also matches the current state of the Kyuubi project, where the default Spark version is 3.5.
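
As a rough illustration of the version boundary mentioned above, the following sketch maps a Hadoop client version to the AWS SDK bundle it is expected to pair with. The helper name `aws_sdk_bundle_for` is hypothetical and not part of the playground scripts; it assumes `sort -V` is available and that the switch to AWS SDK 2.x happens at Hadoop 3.4.0.

```shell
# Hypothetical helper (not part of this PR): print the AWS SDK bundle
# coordinates expected for a given Hadoop client version.
# Assumption: Hadoop clients from 3.4.0 onward use AWS SDK 2.x.
aws_sdk_bundle_for() {
  # The lower of the two versions is 3.4.0 only when "$1" >= 3.4.0.
  lowest=$(printf '%s\n%s\n' "$1" "3.4.0" | sort -V | head -n 1)
  if [ "$lowest" = "3.4.0" ]; then
    echo "software.amazon.awssdk:bundle"
  else
    echo "com.amazonaws:aws-java-sdk-bundle"
  fi
}

# SPARK_HADOOP_VERSION=3.3.4 in this .env stays on the 1.x bundle.
aws_sdk_bundle_for 3.3.4
```

With the versions pinned in this `.env`, every Hadoop client stays below 3.4.0, which is consistent with keeping `AWS_JAVA_SDK_VERSION` on the 1.12.x line.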

ZOOKEEPER_VERSION=3.6.3
PROMETHEUS_VERSION=2.53.3
PROMETHEUS_VERSION=2.53.5
GRAFANA_VERSION=11.4.0
21 changes: 16 additions & 5 deletions docker/playground/README.md
@@ -10,24 +10,35 @@ Playground

### Play

1. Connect using `beeline`
1. Connect using `kyuubi-beeline`

`docker exec -it kyuubi /opt/kyuubi/bin/beeline -u 'jdbc:hive2://0.0.0.0:10009/tpcds/tiny'`;
```
docker exec -it kyuubi /opt/kyuubi/bin/kyuubi-beeline -u 'jdbc:kyuubi://0.0.0.0:10009/tpcds/tiny'
```

2. Connect using DBeaver

Add a Kyuubi datasource with

- connection url `jdbc:hive2://0.0.0.0:10009/tpcds/tiny`
- connection url `jdbc:kyuubi://0.0.0.0:10009/tpcds/tiny`
- username: `anonymous`
- password: `<empty>`

3. Use built-in dataset

Kyuubi supply some built-in dataset, after Kyuubi started, you can run the following command to load the different datasets:

- For loading TPC-DS tiny dataset to `spark_catalog.tpcds_tiny`, run `docker exec -it kyuubi /opt/kyuubi/bin/beeline -u 'jdbc:hive2://0.0.0.0:10009/' -f /opt/load_data/load-dataset-tpcds-tiny.sql`
- For loading TPC-H tiny dataset to `spark_catalog.tpch_tiny`, run `docker exec -it kyuubi /opt/kyuubi/bin/beeline -u 'jdbc:hive2://0.0.0.0:10009/' -f /opt/load_data/load-dataset-tpch-tiny.sql`
- For loading TPC-DS tiny dataset to `spark_catalog.tpcds_tiny`, run

```
docker exec -it kyuubi /opt/kyuubi/bin/kyuubi-beeline -u 'jdbc:kyuubi://0.0.0.0:10009/' -f /opt/load_data/load-dataset-tpcds-tiny.sql
```

- For loading TPC-H tiny dataset to `spark_catalog.tpch_tiny`, run

```
docker exec -it kyuubi /opt/kyuubi/bin/kyuubi-beeline -u 'jdbc:kyuubi://0.0.0.0:10009/' -f /opt/load_data/load-dataset-tpch-tiny.sql
```

### Access Service

3 changes: 1 addition & 2 deletions docker/playground/build-image.sh
@@ -19,7 +19,7 @@

set -e

APACHE_MIRROR=${APACHE_MIRROR:-https://dlcdn.apache.org}
APACHE_MIRROR=${APACHE_MIRROR:-https://archive.apache.org/dist}
MAVEN_MIRROR=${MAVEN_MIRROR:-https://maven-central-asia.storage-download.googleapis.com/maven2}
BUILD_CMD="docker build"

@@ -44,7 +44,6 @@ ${BUILD_CMD} \
--build-arg APACHE_MIRROR=${APACHE_MIRROR} \
--build-arg MAVEN_MIRROR=${MAVEN_MIRROR} \
--build-arg KYUUBI_VERSION=${KYUUBI_VERSION} \
--build-arg AWS_JAVA_SDK_VERSION=${AWS_JAVA_SDK_VERSION} \
--build-arg HADOOP_VERSION=${HADOOP_VERSION} \
--file "${SELF_DIR}/image/kyuubi-playground-hadoop.Dockerfile" \
--tag nekyuubi/kyuubi-playground-hadoop:${KYUUBI_VERSION} \
6 changes: 6 additions & 0 deletions docker/playground/compose.yml
@@ -75,6 +75,8 @@ services:
- 9083
volumes:
- ./conf/core-site.xml:/etc/hadoop/conf/core-site.xml
- ./conf/hadoop-env.sh:/etc/hadoop/conf/hadoop-env.sh
- ./conf/hive-env.sh:/etc/hive/conf/hive-env.sh
- ./conf/hive-site.xml:/etc/hive/conf/hive-site.xml
depends_on:
- postgres
@@ -89,9 +91,13 @@ services:
- 10099:10099
volumes:
- ./conf/core-site.xml:/etc/hadoop/conf/core-site.xml
- ./conf/hadoop-env.sh:/etc/hadoop/conf/hadoop-env.sh
- ./conf/hive-env.sh:/etc/hive/conf/hive-env.sh
- ./conf/hive-site.xml:/etc/hive/conf/hive-site.xml
- ./conf/spark-defaults.conf:/etc/spark/conf/spark-defaults.conf
- ./conf/spark-env.sh:/etc/spark/conf/spark-env.sh
- ./conf/kyuubi-defaults.conf:/etc/kyuubi/conf/kyuubi-defaults.conf
- ./conf/kyuubi-env.sh:/etc/kyuubi/conf/kyuubi-env.sh
- ./conf/kyuubi-log4j2.xml:/etc/kyuubi/conf/log4j2.xml
- ./script/load-dataset-tpcds-tiny.sql:/opt/load_data/load-dataset-tpcds-tiny.sql
- ./script/load-dataset-tpch-tiny.sql:/opt/load_data/load-dataset-tpch-tiny.sql
18 changes: 18 additions & 0 deletions docker/playground/conf/hadoop-env.sh
@@ -0,0 +1,18 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

export JAVA_HOME=$(update-java-alternatives --list | grep java-1.8.0-openjdk | awk '{print $NF}')
18 changes: 18 additions & 0 deletions docker/playground/conf/hive-env.sh
@@ -0,0 +1,18 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

export JAVA_HOME=$(update-java-alternatives --list | grep java-1.8.0-openjdk | awk '{print $NF}')
18 changes: 18 additions & 0 deletions docker/playground/conf/kyuubi-env.sh
@@ -0,0 +1,18 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

export JAVA_HOME=$(update-java-alternatives --list | grep java-1.17.0-openjdk | awk '{print $NF}')
18 changes: 18 additions & 0 deletions docker/playground/conf/spark-env.sh
@@ -0,0 +1,18 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

export JAVA_HOME=$(update-java-alternatives --list | grep java-1.17.0-openjdk | awk '{print $NF}')
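
The four `*-env.sh` files above all derive `JAVA_HOME` from the same pipeline. A minimal sketch of what that pipeline does follows, run against sample text instead of the real command. The sample lines (priorities and paths) are assumed values in the shape `update-java-alternatives --list` prints on Ubuntu, and `pick_java_home` is a hypothetical helper used only for illustration.

```shell
# Assumed sample of `update-java-alternatives --list` output:
# <name> <priority> <path>; real values depend on the installed packages.
sample_list='java-1.17.0-openjdk-amd64 1711 /usr/lib/jvm/java-1.17.0-openjdk-amd64
java-1.8.0-openjdk-amd64 1081 /usr/lib/jvm/java-1.8.0-openjdk-amd64'

# Hypothetical helper: keep the line matching the given name prefix
# and print its last whitespace-separated field, i.e. the JVM path.
pick_java_home() {
  printf '%s\n' "$sample_list" | grep "$1" | awk '{print $NF}'
}

# Hadoop and Hive pick JDK 8; Kyuubi and Spark pick JDK 17.
java8_home=$(pick_java_home java-1.8.0-openjdk)
java17_home=$(pick_java_home java-1.17.0-openjdk)

echo "JDK 8:  $java8_home"
echo "JDK 17: $java17_home"
```

Selecting by name rather than hard-coding the path keeps the scripts independent of the CPU architecture suffix (for example `-amd64` versus `-arm64`).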
16 changes: 14 additions & 2 deletions docker/playground/image/kyuubi-playground-base.Dockerfile
@@ -10,13 +10,25 @@
# See the License for the specific language governing permissions and
# limitations under the License.

FROM eclipse-temurin:8-focal
FROM ubuntu:focal
Member

Ubuntu 20.04 (Focal Fossa) has already reached the end of standard support.

https://ubuntu.com/blog/ubuntu-20-04-lts-end-of-life-standard-support-is-coming-to-an-end-heres-how-to-prepare

Let's upgrade the OS version in this PR or a separate PR.

Member Author

@pan3793 Jan 5, 2026

Yes, we should move forward.

One additional consideration: we'd better align it with the Hadoop dev container, otherwise there might be issues when using the Hadoop native libs, especially when users play with security configs. For example, Ubuntu Focal is the latest version that provides OpenSSL 1.x, and the Hadoop native libs shipped in the official release are compiled against Ubuntu Focal with OpenSSL 1.x, so a runtime linkage error will be thrown if we try to enable Kerberos on Ubuntu Jammy or Noble.

But I think there are no issues in SIMPLE mode.

Member Author

Not related to this PR, but another issue involving Hadoop and Ubuntu: the APT repo's jsvc is too old to support modern JDKs. As Hadoop trunk is moving to JDK 17+, this could be another nuisance for users running kerberized Hadoop with JDK 17+ on Ubuntu. Maybe we should contact the Debian or Ubuntu Java team to upgrade it ...

Member

> APT repo's jsvc is too old

Found a very old issue: https://bugs.launchpad.net/ubuntu/+source/commons-daemon/+bug/1788154

Member

FYI: Filed https://issues.apache.org/jira/browse/HADOOP-19774 to use Ubuntu 24.04 in Hadoop


ENV DEBIAN_FRONTEND=noninteractive
ENV DEBCONF_TERSE=true

RUN set -x && \
echo 'APT::Install-Recommends "0";' > /etc/apt/apt.conf.d/10disableextras && \
echo 'APT::Install-Suggests "0";' >> /etc/apt/apt.conf.d/10disableextras && \
ln -snf /usr/bin/bash /usr/bin/sh && \
apt-get update -q && \
apt-get install -yq retry busybox && \
apt-get install -yq \
retry \
busybox \
ca-certificates-java \
openjdk-8-jdk-headless \
openjdk-17-jdk-headless && \
rm -rf /var/lib/apt/lists/* && \
update-ca-certificates -f && \
update-java-alternatives --set $(update-java-alternatives --list | grep java-1.8.0-openjdk | awk '{print $NF}') || \
Member Author

Use `||` to ignore the error code returned by the `update-java-alternatives` command, as JDK 8 lacks some commands provided by modern JDKs.
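
To make the effect of the `||` concrete, here is a small sketch of how a POSIX shell groups a mixed `&&`/`||` chain. `step_ok` and `step_fail` are hypothetical stand-ins for the real commands, not part of the Dockerfile. Since `&&` and `||` have equal precedence and associate left to right, `a && b || c && d` runs as `((a && b) || c) && d`, so the command right after `||` only runs when something before it failed.

```shell
# Hypothetical stand-ins for real commands: print a marker, then succeed or fail.
step_ok()   { echo "ran: $1"; return 0; }
step_fail() { echo "ran: $1"; return 1; }

# Case 1: the command before `||` fails, so the fallback (mkdir) runs.
chain_when_set_fails=$(step_ok install && step_fail set-java || step_ok mkdir && step_ok busybox)

# Case 2: the command before `||` succeeds, so the fallback (mkdir) is skipped.
chain_when_set_succeeds=$(step_ok install && step_ok set-java || step_ok mkdir && step_ok busybox)

# Case 3: grouping only the fallible command keeps the rest of the chain
# independent of its exit code.
chain_grouped=$(step_ok install && { step_ok set-java || true; } && step_ok mkdir && step_ok busybox)

printf '%s\n---\n%s\n---\n%s\n' \
  "$chain_when_set_fails" "$chain_when_set_succeeds" "$chain_grouped"
```

If the intent is only to ignore the exit code of one command, grouping it as `{ cmd || true; }` leaves the rest of the chain unaffected. Whether that matters here depends on `update-java-alternatives` always returning non-zero for JDK 8, as the comment describes.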

mkdir /opt/busybox && \
busybox --install /opt/busybox

7 changes: 2 additions & 5 deletions docker/playground/image/kyuubi-playground-hadoop.Dockerfile
@@ -14,7 +14,6 @@ ARG KYUUBI_VERSION

FROM nekyuubi/kyuubi-playground-base:${KYUUBI_VERSION}

ARG AWS_JAVA_SDK_VERSION
ARG HADOOP_VERSION

ARG APACHE_MIRROR
@@ -29,9 +28,7 @@ RUN set -x && \
tar -xzf ${HADOOP_TAR_NAME}.tar.gz -C /opt && \
ln -s /opt/hadoop-${HADOOP_VERSION} ${HADOOP_HOME} && \
rm ${HADOOP_TAR_NAME}.tar.gz && \
HADOOP_CLOUD_STORAGE_JAR_NAME=hadoop-cloud-storage && \
wget -q ${MAVEN_MIRROR}/org/apache/hadoop/${HADOOP_CLOUD_STORAGE_JAR_NAME}/${HADOOP_VERSION}/${HADOOP_CLOUD_STORAGE_JAR_NAME}-${HADOOP_VERSION}.jar -P ${HADOOP_HOME}/share/hadoop/hdfs/lib && \
Member Author

`hadoop-cloud-storage` is an assembly-only package and contains no classes.

HADOOP_AWS_JAR_NAME=hadoop-aws && \
wget -q ${MAVEN_MIRROR}/org/apache/hadoop/${HADOOP_AWS_JAR_NAME}/${HADOOP_VERSION}/${HADOOP_AWS_JAR_NAME}-${HADOOP_VERSION}.jar -P ${HADOOP_HOME}/share/hadoop/hdfs/lib && \
ln -s ${HADOOP_HOME}/share/hadoop/tools/lib/${HADOOP_AWS_JAR_NAME}-${HADOOP_VERSION}.jar ${HADOOP_HOME}/share/hadoop/hdfs/lib/ && \
AWS_JAVA_SDK_BUNDLE_JAR_NAME=aws-java-sdk-bundle && \
wget -q ${MAVEN_MIRROR}/com/amazonaws/${AWS_JAVA_SDK_BUNDLE_JAR_NAME}/${AWS_JAVA_SDK_VERSION}/${AWS_JAVA_SDK_BUNDLE_JAR_NAME}-${AWS_JAVA_SDK_VERSION}.jar -P ${HADOOP_HOME}/share/hadoop/hdfs/lib
ln -s $(find ${HADOOP_HOME}/share/hadoop/tools/lib/ -name "${AWS_JAVA_SDK_BUNDLE_JAR_NAME}-*.jar") ${HADOOP_HOME}/share/hadoop/hdfs/lib/ \
2 changes: 0 additions & 2 deletions docker/playground/image/kyuubi-playground-kyuubi.Dockerfile
@@ -29,8 +29,6 @@ RUN set -x && \
tar -xzf apache-kyuubi-${KYUUBI_VERSION}-bin.tgz -C /opt && \
ln -s /opt/apache-kyuubi-${KYUUBI_VERSION}-bin ${KYUUBI_HOME} && \
rm apache-kyuubi-${KYUUBI_VERSION}-bin.tgz && \
HADOOP_CLOUD_STORAGE_JAR_NAME=hadoop-cloud-storage && \
wget -q ${MAVEN_MIRROR}/org/apache/hadoop/${HADOOP_CLOUD_STORAGE_JAR_NAME}/${KYUUBI_HADOOP_VERSION}/${HADOOP_CLOUD_STORAGE_JAR_NAME}-${KYUUBI_HADOOP_VERSION}.jar -P ${KYUUBI_HOME}/jars && \
HADOOP_AWS_JAR_NAME=hadoop-aws && \
wget -q ${MAVEN_MIRROR}/org/apache/hadoop/${HADOOP_AWS_JAR_NAME}/${KYUUBI_HADOOP_VERSION}/${HADOOP_AWS_JAR_NAME}-${KYUUBI_HADOOP_VERSION}.jar -P ${KYUUBI_HOME}/jars && \
AWS_JAVA_SDK_BUNDLE_JAR_NAME=aws-java-sdk-bundle && \
2 changes: 0 additions & 2 deletions docker/playground/image/kyuubi-playground-spark.Dockerfile
@@ -40,8 +40,6 @@ RUN set -x && \
wget -q ${MAVEN_MIRROR}/org/apache/iceberg/${ICEBERG_SPARK_JAR_NAME}/${ICEBERG_VERSION}/${ICEBERG_SPARK_JAR_NAME}-${ICEBERG_VERSION}.jar -P ${SPARK_HOME}/jars && \
SPARK_HADOOP_CLOUD_JAR_NAME=spark-hadoop-cloud_${SCALA_BINARY_VERSION} && \
wget -q ${MAVEN_MIRROR}/org/apache/spark/${SPARK_HADOOP_CLOUD_JAR_NAME}/${SPARK_VERSION}/${SPARK_HADOOP_CLOUD_JAR_NAME}-${SPARK_VERSION}.jar -P ${SPARK_HOME}/jars && \
HADOOP_CLOUD_STORAGE_JAR_NAME=hadoop-cloud-storage && \
wget -q ${MAVEN_MIRROR}/org/apache/hadoop/${HADOOP_CLOUD_STORAGE_JAR_NAME}/${SPARK_HADOOP_VERSION}/${HADOOP_CLOUD_STORAGE_JAR_NAME}-${SPARK_HADOOP_VERSION}.jar -P ${SPARK_HOME}/jars && \
HADOOP_AWS_JAR_NAME=hadoop-aws && \
wget -q ${MAVEN_MIRROR}/org/apache/hadoop/${HADOOP_AWS_JAR_NAME}/${SPARK_HADOOP_VERSION}/${HADOOP_AWS_JAR_NAME}-${SPARK_HADOOP_VERSION}.jar -P ${SPARK_HOME}/jars && \
AWS_JAVA_SDK_BUNDLE_JAR_NAME=aws-java-sdk-bundle && \