@VardhanThigle VardhanThigle commented Dec 30, 2025

Swap Go Spanner Cassandra Proxy with Java Client.

Fixes b/471713394

Goal

Replace the go-spanner-cassandra proxy with the java-spanner-cassandra proxy in the sources/cassandra Docker image, while maintaining the existing dual-write proxy functionality (zdm-proxy). The Java client will continue to be maintained and will accommodate newer options and features.

Current Components of the Docker Image

  • Dockerfile:
    • Base: golang:1.24-bullseye
    • Builds zdm-proxy (Go)
    • Builds go-spanner-cassandra (Go)
    • Final Image: alpine:3.22 with binaries copied over.
  • Entrypoint:
    • Runs cassandra-spanner-proxy with CLI flags derived from environment variables.
    • Runs zdm-proxy with a config file.
    • Monitors PIDs.

Proposed Changes

1. Dockerfile Restructuring (Multi-Stage Build)

We need a multi-stage build to handle both Go (for zdm-proxy) and Java (for java-spanner-cassandra) compilation.

Stages:

  1. Go Builder:
    • Image: golang:1.24-bullseye (Keep existing).
    • Action: Build zdm-proxy as before.
  2. Java Builder:
    • Base Image: maven:3.9-eclipse-temurin-17.
    • Reasoning: The java-spanner-cassandra client requires Java 8 or later. We will use Java 17 LTS as it is a modern, widely supported standard that offers better performance and container support than Java 8.
    • Action: Clone java-spanner-cassandra and run mvn clean install to produce the shaded process/launcher jar.
  3. Final Image:
    • Image: alpine:3.22 (Keep existing).
    • Action:
      • Install openjdk17-jre (or headless version) + bash + gettext (for envsubst).
      • Copy zdm-proxy binary from Go Builder.
      • Copy spanner-cassandra-launcher.jar from Java Builder.
      • Copy entrypoint.sh.
      • Copy new spanner.yaml (template).
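
Because the final stage assembles pieces from two builders plus Alpine packages, a preflight check in the entrypoint could catch a missing piece early. The sketch below is illustrative only: missing_tools is a hypothetical helper that is not part of this PR, and the tool names in the usage comment are assumptions based on the packages listed above.

```shell
# Hypothetical helper (not from the PR): print each command name that
# cannot be found on PATH, one per line. Prints nothing if all exist.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example use inside an entrypoint (assumed tool names):
#   missing=$(missing_tools java envsubst bash)
#   [ -z "$missing" ] || { echo "missing tools: $missing" >&2; exit 1; }
```

Failing at startup with a named tool is usually easier to diagnose than a later "command not found" from the middle of the script.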

2. Configuration Strategy

We will map the existing environment variables to the new Java Proxy's configuration format using a template and envsubst. This ensures that users do not need to change how they deploy the container.

Current Configuration Flow (Old)

  1. User Input: User passes environment variables to docker run: SPANNER_PROJECT, SPANNER_INSTANCE, SPANNER_DATABASE, GRPC_CHANNELS (optional).
  2. Entrypoint Execution: entrypoint.sh reads these variables directly.
  3. Process Start: entrypoint.sh constructs a command line string:
    /cassandra-spanner-proxy --db="projects/$SPANNER_PROJECT/instances/$SPANNER_INSTANCE/databases/$SPANNER_DATABASE" ...

Proposed Configuration Flow (New)

  1. User Input: User passes identical environment variables as before (SPANNER_PROJECT, etc.). No user-side change required.
  2. Template Creation: We bake a spanner-cassandra-config.yaml template into the image.
  3. Entrypoint Execution:
    • entrypoint.sh uses envsubst to read the template and replace placeholders (e.g., ${SPANNER_PROJECT}) with actual values from the current environment.
    • References:
      • The template will use standard env placeholders.
      • envsubst allows dynamic generation of the complex YAML structure required by the Java proxy, which is more robust than passing many -D system properties for nested configs.
  4. Process Start:
    • entrypoint.sh starts the Java process pointing to the generated config file.
    • Command: java -DconfigFilePath=/app/generated-config.yaml -jar ...
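
The rendering step in the flow above can be sketched as follows. This is a minimal illustration, not the entrypoint from the PR: render_config is a hypothetical helper name, and the only tool it relies on is envsubst from gettext, which replaces ${VAR} references in its input with values of exported environment variables.

```shell
# Hypothetical helper: render a template by replacing ${VAR} references
# with values from the environment. envsubst ships with gettext.
render_config() {
  template="$1"
  output="$2"
  envsubst < "$template" > "$output"
}

# Example use with the paths named in this description:
#   render_config /app/spanner-cassandra-config.yaml /app/generated-config.yaml
```

Variables passed with docker run -e are already exported inside the container, so no extra export step should be needed before calling the helper.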

Configuration Flexibility (FAQ)

Q: Can a user have a config file in the code repo and only override a few parameters via env variables?
A: Yes. The config file in the repo (sources/cassandra/spanner-cassandra-config.yaml) serves as the template.

  • Static Values: You can hardcode values in this file (e.g., max_prepared_statements: 100) if they should always be the same.
  • Overridable Values: For values you want to control via environment variables (like Project ID), you must use placeholders (e.g., ${SPANNER_PROJECT}) in the file.
  • Runtime Behavior: When the container starts, envsubst will replace the placeholders with the actual environment variable values. Hardcoded values remain unchanged.
  • Full Override: If a user wants to supply a completely different config at runtime, they can mount a file to /app/spanner-cassandra-config.yaml, and the entrypoint will still process it with envsubst (allowing them to use their own placeholders if desired).
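
One consequence of the runtime behavior described above: envsubst substitutes an empty string for any referenced variable that is unset, so a forgotten SPANNER_PROJECT would still produce a config file, just an unusable one. A guard along the following lines could run before rendering. missing_vars is a hypothetical helper, not something this PR is known to include, and which variables count as required is an assumption taken from the user inputs listed earlier.

```shell
# Hypothetical helper: print the name of every listed variable that is
# unset or empty, one per line. Prints nothing if all are set.
missing_vars() {
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Example use before rendering the template (assumed required set):
#   missing=$(missing_vars SPANNER_PROJECT SPANNER_INSTANCE SPANNER_DATABASE)
#   [ -z "$missing" ] || { echo "unset variables: $missing" >&2; exit 1; }
```

GRPC_CHANNELS is left out of the example list because the description marks it as optional.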

3. Entrypoint Script (entrypoint.sh)

Update entrypoint.sh to:

  1. Generate generated-config.yaml from spanner-cassandra-config.yaml using envsubst.
  2. Launch the Java proxy as a background process.
  3. Launch zdm-proxy (unchanged).
  4. Monitor processes (update PID tracking for Java process).
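
The launch-and-monitor steps above can be sketched with a simple polling loop. This is an assumption about one reasonable shape for the logic, not the script from the PR: supervise is a hypothetical function, and the two commands it receives stand in for the Java proxy and zdm-proxy invocations, which are not reproduced here.

```shell
# Hypothetical sketch: start two commands in the background, poll their
# PIDs, and stop the survivor as soon as either one exits.
supervise() {
  sh -c "$1" &
  first_pid=$!
  sh -c "$2" &
  second_pid=$!

  # kill -0 sends no signal; it only tests whether the PID still exists.
  while kill -0 "$first_pid" 2>/dev/null && kill -0 "$second_pid" 2>/dev/null; do
    sleep 1
  done

  # One process is gone: terminate the other and reap both.
  kill "$first_pid" "$second_pid" 2>/dev/null || true
  wait "$first_pid" 2>/dev/null || true
  wait "$second_pid" 2>/dev/null || true
}
```

Stopping the surviving process lets the container exit as a whole, so an orchestrator can restart it instead of leaving one proxy running without the other.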

Detailed Steps

  1. Create sources/cassandra/spanner-cassandra-config.yaml: Define the default template.
  2. Update sources/cassandra/Dockerfile:
    • Add maven:3.9-eclipse-temurin-17 build stage.
    • Install openjdk17-jre and gettext in the final stage.
    • Copy JARs and template.
  3. Update sources/cassandra/entrypoint.sh:
    • Add envsubst step.
    • Update process start command.

Questions/Notes

  • Java Version: Proceeding with 17 (Client supports 8+).
  • Config: Proceeding with Template + Envsubst to maintain backward compatibility for user inputs while enabling static config in repo.

Tests

Basic

  1. Builds
  2. Run built ZDM image
  3. Insert 15 rows, modify 5 rows, delete 5 rows. Compare output with Spanner.
    The test passes. It is currently manual because the Spanner Emulator does not support the Cassandra Adapter.

IOPS test with cassandra-stress

No regression in throughput with respect to main.
All latencies below are in milliseconds.

| Version of ZDM Image | Number of Threads | Number of GRPC Channels | IOPS | Latency Mean | Latency Median | Latency 95p | Latency 99p | Latency 99.9p | Latency Max |
| At PR | 250 | 4 | 1,789 op/s | 137.5 | 134.5 | 157.3 | 198.8 | 364.6 | 534.5 |
| At PR | 250 | 500 | 1,790 op/s | 139.3 | 138.7 | 151.8 | 159.6 | 248 | 351.5 |
| At Main | 250 | 500 | 1,744 op/s | 142.4 | 141.7 | 156.2 | 164.6 | 178.4 | 345.8 |
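
To put the throughput comparison in relative terms, the following sketch computes the percentage change between two measurements. pct_change is a hypothetical helper added for illustration; the numbers in the usage comment are the IOPS values from the two 500-channel rows above.

```shell
# Hypothetical helper: percentage change from a baseline to a new value,
# rounded to one decimal place. Arguments must be plain numbers (no commas).
pct_change() {
  awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# Example with the 500-channel rows: main = 1744 op/s, PR = 1790 op/s
#   pct_change 1744 1790   # prints 2.6
```
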
  • Tests pass
  • Appropriate changes to README are included in PR

@VardhanThigle VardhanThigle requested a review from a team as a code owner December 30, 2025 08:48
@VardhanThigle VardhanThigle requested review from aasthabharill and bharadwaj-aditya and removed request for a team December 30, 2025 08:48
codecov bot commented Dec 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 47.22%. Comparing base (063866d) to head (b9a4e46).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1256      +/-   ##
==========================================
- Coverage   47.22%   47.22%   -0.01%     
==========================================
  Files         231      231              
  Lines       26698    26698              
  Branches      581      581              
==========================================
- Hits        12609    12607       -2     
- Misses      13368    13370       +2     
  Partials      721      721              
| Components | Coverage Δ |
| backend-apis | 42.11% <ø> (ø) |
| backend-library | 51.58% <ø> (ø) |
| cli | 24.44% <ø> (ø) |
| frontend | 38.77% <ø> (-0.05%) ⬇️ |
see 1 file with indirect coverage changes

@rohitwali rohitwali left a comment

thanks for the changes. mostly LGTM, couple of clarifications

@bharadwaj-aditya bharadwaj-aditya left a comment

Couple of parameters to check. Looks fine overall.

@VardhanThigle VardhanThigle force-pushed the dualwrites branch 4 times, most recently from 69943e3 to 167755f Compare December 31, 2025 07:05
@VardhanThigle VardhanThigle merged commit d00573f into GoogleCloudPlatform:master Jan 16, 2026
12 checks passed