
Conversation


@dervoeti dervoeti commented Jun 13, 2025

Description

Implemented solution for https://github.com/stackabletech/decisions/issues/44
Fixes #1068

This PR enables custom versions (with a suffix like -stackable0.0.0-dev) for all components we apply custom patches to. It also makes the following products use our patched Hadoop libraries:

  • HBase
  • Phoenix
  • hbase-operator-tools
  • Hive
  • Druid
  • Spark

These components now use our patched HBase libraries:

  • Phoenix
  • hbase-operator-tools
  • Spark

In the SBOMs, the custom version is replaced by the original one (using sed), so vulnerabilities filed directly against e.g. Hadoop 3.3.6 are still found when scanning the SBOMs (which we do in our vulnerability management pipeline). Otherwise vulnerabilities against Hadoop 3.3.6 might be missed, because the version would be something like 3.3.6-stackable25.7.0 and vulnerability databases don't contain entries for that particular version.

The patched libraries are used by overriding, for instance, the Hadoop version when running Maven (like -Dhadoop.compile.version=3.3.6-stackable0.0.0-dev). Since they are not found in our Nexus Maven repo, we copy them into the local Maven repo from the Hadoop builder:

COPY --from=hadoop-builder --chown=${STACKABLE_USER_UID}:0 /stackable/patched-libs /stackable/patched-libs
cp -r /stackable/patched-libs/maven/* /stackable/.m2/repository

We can't COPY directly into /stackable/.m2/repository, since that directory is usually cached and the copied libraries would be overwritten by the cache.
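The SBOM version rewrite mentioned above can be sketched like this (the regex and file name are illustrative assumptions, not the pipeline's actual command):

```shell
# Hedged sketch (regex and file name are assumptions, not the actual
# pipeline code): strip the "-stackableX.Y.Z[-dev]" suffix so scanners
# match CVEs filed against the upstream version.
printf '%s\n' '3.3.6-stackable25.7.0' '3.4.1-stackable0.0.0-dev' > versions.txt
sed -i 's/-stackable[0-9][0-9A-Za-z.-]*//g' versions.txt
cat versions.txt    # 3.3.6 / 3.4.1
```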

Further changes:

  • The patch that fixes CVE-2023-34455 in Druid was removed, since Druid now uses the patched Hadoop version which does not contain the vulnerability
  • HADOOP-18837 and HADOOP-18496 were backported to Hadoop 3.3.6 to check if the vulnerabilities fixed by these patches are also fixed in products that use our patched Hadoop version
  • Building HBase now happens in a separate Dockerfile (hbase/hbase/Dockerfile) to differentiate the final HBase image (hbase/Dockerfile) from HBase the application. This was needed because, for example, Phoenix now depends on our patched HBase version, but the HBase image depends on Phoenix, which would be a circular dependency if the HBase container image and the application were the same image. Now all components of the HBase image (except hadoop-s3-builder) are built in separate Dockerfiles instead of inline.
  • The same applies to Trino: trino/trino/Dockerfile is now the application, since storage-connector uses our patched Trino version, while the Trino image includes the storage-connector
  • hbase-operator-tools and Phoenix now have the HBase version they were built with as a suffix in their Docker target names (but not in their versions). The reason is that while the Phoenix version itself stays the same (5.2.1-stackable25.7.0), Phoenix can be built against either HBase 2.6.1 or HBase 2.6.2, which in turn pull in different Hadoop versions. We don't want e.g. the HBase 2.6.2 image to include a static Phoenix 5.2.1 that was built with HBase 2.6.1 and Hadoop 3.3.6. So when we include Phoenix in the HBase image, we need to specify which variant of 5.2.1 we want. The same goes for hbase-operator-tools.
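The image/application split and the version-suffixed targets might look roughly like this (a hypothetical sketch; all target and path names here are assumptions, not the actual Dockerfile contents):

```dockerfile
# Hypothetical sketch, NOT the actual Dockerfiles.
#
# hbase/hbase/Dockerfile -- HBase the application, built against the
# patched Hadoop libraries copied in from the hadoop-builder stage.
#
# hbase/Dockerfile -- the final image; it selects the Phoenix variant
# that was built against the matching HBase (and thus Hadoop) version:
FROM phoenix-5.2.1-hbase2.6.2 AS phoenix
FROM hbase AS final
COPY --from=phoenix /stackable/phoenix /stackable/phoenix
```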

I successfully built these products and tested them with the smoke tests:

  • Kafka 3.9.0
  • NiFi 1.28.1
  • NiFi 2.4.0
  • Hive 3.1.3
  • Hive 4.0.1
  • OPA 1.4.2
  • Zookeeper 3.9.3
  • Druid 30.0.1
  • HBase 2.6.1
  • HBase 2.6.2
  • Hadoop 3.3.6
  • Hadoop 3.4.1
  • Spark-k8s 3.5.5
  • Trino 470
  • Trino 476

I also built omid 1.1.3 successfully; as far as I know, we don't have a smoke test for it.

Definition of Done Checklist

Note

Not all of these items are applicable to all PRs; the author should update this template to leave in only the boxes that are relevant.

Please make sure all these things are done and tick the boxes

  • Changes are OpenShift compatible
  • All added packages (via microdnf or otherwise) have a comment on why they are added
  • Things not downloaded from Red Hat repositories should be mirrored in the Stackable repository and downloaded from there
  • All packages should have (if available) signatures/hashes verified
  • Add an entry to the CHANGELOG.md file
  • Integration tests ran successfully
TIP: Running integration tests with a new product image

The image can be built and uploaded to the kind cluster with the following commands:

bake --product <product> --image-version <stackable-image-version>
kind load docker-image <image-tagged-with-the-major-version> --name=<name-of-your-test-cluster>

See the output of bake to retrieve the image tag for <image-tagged-with-the-major-version>.

dervoeti and others added 20 commits May 23, 2025 11:10
| datasource | package           | from   | to     |
| ---------- | ----------------- | ------ | ------ |
| docker     | docker/dockerfile | 1.10.0 | 1.15.1 |
* remove 3.7.1 and 3.8.0

* add 4.0.0

* update changelog

* bump to java 23 for kafka 4.0.0

* fix kcat image name
* chore: stop building kcat image

* fix: adjust kafka / kafka-testing-tools watched paths
@dervoeti dervoeti changed the title Feat/custom product versions feat: custom product versions Jun 13, 2025
@dervoeti dervoeti moved this to Development: Waiting for Review in Stackable Engineering Jun 13, 2025
@dervoeti dervoeti self-assigned this Jun 13, 2025
@dervoeti dervoeti requested review from a team and removed request for a team June 13, 2025 10:23
@dervoeti
Member Author

This PR is too large. I will try to split it up.

@dervoeti dervoeti closed this Jun 13, 2025
@lfrancke lfrancke deleted the feat/custom-product-versions branch July 4, 2025 10:13