@gabotechs gabotechs commented Nov 25, 2025

Adds a Trino cluster to the CDK deployment, with the full TPCH setup, so that we can benchmark this project against Trino.

To do that, this PR makes the following changes:

  • Add a Trino deployment to the CDK code
  • Refactor the benchmarking script for better reusability
  • Add a new trino-bench.ts script for benchmarking Trino

Results of benchmarking TPCH at scale factor 10 against Trino:

prev=Trino
new=DataFusion Distributed

==== Comparison with previous run ====
      q1: prev=2009 ms, new=4702 ms, 2.34x slower ❌
      q2: prev=3326 ms, new=1373 ms, 2.42x faster ✅
      q3: prev=3529 ms, new=3789 ms, 1.07x slower ✖
      q4: prev=2411 ms, new=2630 ms, 1.09x slower ✖
      q5: prev=6195 ms, new=4413 ms, 1.40x faster ✅
      q6: prev=1447 ms, new=2353 ms, 1.63x slower ❌
      q7: prev=7985 ms, new=5810 ms, 1.37x faster ✅
      q8: prev=8513 ms, new=6657 ms, 1.28x faster ✅
      q9: prev=9854 ms, new=8303 ms, 1.19x faster ✔
     q10: prev=3663 ms, new=4078 ms, 1.11x slower ✖
     q11: prev=2390 ms, new=1425 ms, 1.68x faster ✅
     q12: prev=2389 ms, new=2492 ms, 1.04x slower ✖
     q13: prev=3961 ms, new=3124 ms, 1.27x faster ✅
     q14: prev=1826 ms, new=2425 ms, 1.33x slower ❌
     q15: prev=4035 ms, new=3445 ms, 1.17x faster ✔
     q16: prev=1850 ms, new=1218 ms, 1.52x faster ✅
     q17: prev=6522 ms, new=6722 ms, 1.03x slower ✖
     q18: prev=10231 ms, new=6049 ms, 1.69x faster ✅
     q19: prev=1922 ms, new=3030 ms, 1.58x slower ❌
     q20: prev=4115 ms, new=2967 ms, 1.39x faster ✅
     q21: prev=15599 ms, new=7960 ms, 1.96x faster ✅
     q22: prev=1600 ms, new=1039 ms, 1.54x faster ✅
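
For readers who want to reproduce the comparison lines above, here is a minimal sketch of how each one could be derived from two timings. It is not the code from the benchmarking script in this PR: the names `compare`, `formatLine`, and `SIGNIFICANT` are hypothetical, and the 1.2x cutoff between ✔/✅ (and ✖/❌) is only inferred from the output, where 1.19x gets ✔ and 1.27x gets ✅.

```typescript
// Hypothetical helpers, not taken from the PR's scripts.
type Comparison = {
  factor: number; // always >= 1: how many times faster or slower
  direction: "faster" | "slower";
  marker: string;
};

// Assumption: the cutoff is inferred from the output above, not confirmed.
const SIGNIFICANT = 1.2;

function compare(prevMs: number, newMs: number): Comparison {
  // Equal timings are reported as "1.00x faster" in this sketch.
  const faster = newMs <= prevMs;
  const factor = faster ? prevMs / newMs : newMs / prevMs;
  const significant = factor >= SIGNIFICANT;
  const marker = faster
    ? (significant ? "✅" : "✔")
    : (significant ? "❌" : "✖");
  return { factor, direction: faster ? "faster" : "slower", marker };
}

function formatLine(query: string, prevMs: number, newMs: number): string {
  const c = compare(prevMs, newMs);
  return `${query}: prev=${prevMs} ms, new=${newMs} ms, ` +
    `${c.factor.toFixed(2)}x ${c.direction} ${c.marker}`;
}

// Timings taken from the q1 and q9 rows above.
console.log(formatLine("q1", 2009, 4702));
console.log(formatLine("q9", 9854, 8303));
```

The factor is always the larger timing divided by the smaller one, so "2.34x slower" for q1 means DataFusion Distributed took 4702 / 2009 ≈ 2.34 times as long as Trino.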

@gabotechs gabotechs force-pushed the gabrielmusat/add-trino-to-cdk branch from 7376956 to 1744711 Compare November 25, 2025 15:00
Collaborator

Is the @ in the file name expected?

Collaborator Author

Yes, it's a convention I used in large TypeScript projects and found very helpful. All the "common" code scoped to a folder is prefixed with @ so that:

  • IDEs place it at the top of the folder, separating it from any other normal code
  • A developer's eye quickly recognizes the file as "something that contains utils used across two or more files in the same folder"

# Conflicts:
#	benchmarks/cdk/bin/datafusion-bench.ts
#	benchmarks/cdk/lib/cdk-stack.ts
@gabotechs gabotechs merged commit 3262025 into main Dec 8, 2025
4 checks passed
@gabotechs gabotechs deleted the gabrielmusat/add-trino-to-cdk branch December 8, 2025 10:28