
Commit 2a15d10

Merge branch 'ClickHouse:main' into main
2 parents 7f0a902 + 5ee0a45 commit 2a15d10

622 files changed: +21130 -8908 lines changed

Some content is hidden: large commits have some content hidden by default.


.github/workflows/generate-results.yml

Lines changed: 5 additions & 4 deletions
@@ -1,7 +1,8 @@
-name: "Generate index.html"
+name: "Build the website"
 on:
+  workflow_dispatch: # This allows manual trigger from the UI
   push:
-    branches:
+    branches:
       - main

 permissions:
@@ -10,8 +11,8 @@ permissions:
 jobs:
   build:
     runs-on: ubuntu-latest
-    env:
-      CI_COMMIT_MESSAGE: "[bot] update index.html"
+    env:
+      CI_COMMIT_MESSAGE: "[bot] Build the website"
       CI_COMMIT_AUTHOR: github
     steps:
       - uses: actions/checkout@v3

README.md

Lines changed: 46 additions & 94 deletions
@@ -54,12 +54,12 @@ TLDR: *All Benchmarks Are ~~Bastards~~ Liars*.

 To introduce a new system, simply copy-paste one of the directories and edit the files accordingly:

-- `benchmark.sh`: this is the main script to run the benchmark on a fresh VM; Ubuntu 22.04 or newer should be used by default, or any other system if specified in the comments. The script may not necessarily run in a fully automated manner - it is recommended always to copy-paste the commands one by one and observe the results. For managed databases, if the setup requires clicking in the UI, write a `README.md` instead.
+- `benchmark.sh`: this is the main script to run the benchmark on a fresh VM; Ubuntu 24.04 or newer should be used by default. For databases that could be installed locally, the script should be able to run in a fully automated manner, so it can be used as a cloud-init script. It should output the results in the following format: - one or more lines `Load time: 1234` with the time in seconds; - a line `Data size: 1234567890` with the data size in bytes; the data size should include indexes and transaction logs if applicable; - 43 consecutive lines in the form of `[1.234, 5.678, 9.012],` for the runtimes of every query; - the output may include other lines with the logs, that are not used for the report. For managed databases, if the setup requires clicking in the UI, write a `README.md` instead.
 - `README.md`: contains comments and observations if needed. For managed databases, it can describe the setup procedure to be used instead of a shell script.
 - `create.sql`: a CREATE TABLE statement. If it's a NoSQL system, another file like `wtf.json` can be presented.
 - `queries.sql`: contains 43 queries to run;
 - `run.sh`: a loop for running the queries; every query is run three times; if it's a database with local on-disk storage, the first query should be run after dropping the page cache;
-- `results`: put the .json files with the results for every hardware configuration there.
+- `results`: put the .json files with the results for every hardware configuration there. Please double-check that each file is valid JSON (e.g., no comma errors).

 To introduce a new result for an existing system on different hardware configurations, add a new file to `results`.
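
The output format that the new `benchmark.sh` bullet describes can be consumed with a small parser. The following is a minimal sketch, not the script the repository actually uses: the function names (`parse_benchmark_output`, `_parse_row`) are hypothetical, and summing multiple `Load time:` lines into one total is an assumption, since the README text only says that one or more such lines may appear.

```python
import json
import re

# Patterns for the three kinds of lines described in the README text.
LOAD_RE = re.compile(r"^Load time:\s*([0-9]+(?:\.[0-9]+)?)$")
SIZE_RE = re.compile(r"^Data size:\s*([0-9]+)$")
ROW_RE = re.compile(r"^\[.*\],?$")


def _parse_row(line):
    """Return the list of timings if the line looks like `[a, b, c],`, else None."""
    if not ROW_RE.match(line):
        return None
    try:
        row = json.loads(line.rstrip(","))
    except ValueError:
        return None  # e.g. a log line that merely starts with "[" and ends with "]"
    return row if isinstance(row, list) else None


def parse_benchmark_output(text, expected_queries=43):
    """Extract load time, data size and query timings from benchmark output."""
    load_times = []
    data_size = None
    block = []     # current run of consecutive timing lines
    result = None  # first complete run of `expected_queries` timing lines
    for raw in text.splitlines():
        line = raw.strip()
        row = _parse_row(line)
        if row is not None:
            block.append(row)
            if result is None and len(block) == expected_queries:
                result = list(block)
            continue
        block = []  # any other line breaks the consecutive run
        match = LOAD_RE.match(line)
        if match:
            load_times.append(float(match.group(1)))
            continue
        match = SIZE_RE.match(line)
        if match:
            data_size = int(match.group(1))
    if not load_times or data_size is None or result is None:
        raise ValueError(
            f"missing load time, data size, or {expected_queries} consecutive timing lines"
        )
    return {
        # Assumption: several "Load time" lines are added up into one total.
        "load_time": sum(load_times),
        "data_size": data_size,
        "result": result,
    }
```

Unrelated log lines are ignored, which matches the statement that the output may include other lines that are not used for the report.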

@@ -144,7 +144,7 @@ We allow but do not recommend creating scoreboards from this benchmark or saying

 There is a web page to navigate across benchmark results and present a summary report. It allows filtering out some systems, setups, or queries. For example, if you found some subset of the 43 queries are irrelevant, you can simply exclude them from the calculation and share the report without these queries.

-You can select the summary metric from one of the following: "Cold Run", "Hot Run", "Load Time", and "Data Size". If you select the "Load Time" or "Data Size", the entries will be simply ordered from best to worst, and additionally, the ratio to the best non-zero result will be shown (the number of times one system is worse than the best system in this metric). Load time can be zero for stateless query engines like `clickhouse-local` or `Amazon Athena`.
+You can select the summary metric from one of the following: "Cold Run", "Hot Run", "Load Time", "Data Size", and "Combined". If you select the "Load Time" or "Data Size", the entries will be simply ordered from best to worst, and additionally, the ratio to the best non-zero result will be shown (the number of times one system is worse than the best system in this metric). Load time can be zero for stateless query engines like `clickhouse-local` or `Amazon Athena`.

 If you select "Cold Run" or "Hot Run", the aggregation across the queries is performed in the following way:

@@ -170,6 +170,7 @@ For example, one system crashed while trying to run a query which can highlight

 Why geometric mean? The ratios can only be naturally averaged in this way. Imagine there are two queries and two systems. The first system ran the first query in 1s and the second query in 20s. The second system ran the first query in 2s and the second query in 10s. So, the first system is two times faster on the first query and two times slower on the second query and vice-versa. The final score should be identical for these systems.

+The "Combined" metric summarizes all the results as a weighted geometric mean with the following weights: load time: 10%, data size: 10%, cold runtime: 20%, hot runtime: 60%.

 ## History and Motivation
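
The two-system example and the "Combined" weights in this hunk can be checked numerically. This is a sketch under stated assumptions rather than the website's actual scoring code: the per-query ratio is taken as runtime divided by the best runtime for that query, and the inputs to the combined score are assumed to be per-metric ratios to the best system. The README text here only gives the weights, so the names `geometric_mean`, `weighted_geometric_mean`, and `COMBINED_WEIGHTS` and the sample ratios are illustrative.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def weighted_geometric_mean(values, weights):
    """Weighted geometric mean; `values` and `weights` are dicts with the same keys."""
    total_weight = sum(weights[key] for key in values)
    log_sum = sum(weights[key] * math.log(values[key]) for key in values)
    return math.exp(log_sum / total_weight)


# Runtimes in seconds from the README example: two queries, two systems.
system_a = [1.0, 20.0]
system_b = [2.0, 10.0]
best = [min(a, b) for a, b in zip(system_a, system_b)]

# Ratio to the best runtime per query, then the geometric mean of the ratios.
score_a = geometric_mean([t / b for t, b in zip(system_a, best)])
score_b = geometric_mean([t / b for t, b in zip(system_b, best)])

# Weights quoted in the added README line.
COMBINED_WEIGHTS = {
    "load_time": 0.10,
    "data_size": 0.10,
    "cold_run": 0.20,
    "hot_run": 0.60,
}

# Hypothetical per-metric ratios for one system (1.0 means "equal to the best").
example_ratios = {"load_time": 2.0, "data_size": 1.0, "cold_run": 1.5, "hot_run": 1.2}
combined = weighted_geometric_mean(example_ratios, COMBINED_WEIGHTS)
```

Both systems end up with the same score, the square root of 2, which is the symmetry the README paragraph asks for. Because the computation takes logarithms, a zero input (for example a zero load time) would need special handling that this sketch does not attempt.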

@@ -201,109 +202,60 @@ We also introduced the [Hardware Benchmark](https://benchmark.clickhouse.com/har

 ## Systems Included

-- [x] ClickHouse
-- [x] ClickHouse on local Parquet files
-- [x] ClickHouse operating like "Athena" on remote Parquet files
-- [x] ClickHouse on a VFS over HTTPs on CDN
-- [x] MySQL InnoDB
-- [x] MySQL MyISAM
-- [x] MariaDB
-- [x] MariaDB ColumnStore
-- [x] MemSQL/SingleStore
-- [x] PostgreSQL
-- [x] Greenplum
-- [x] TimescaleDB
-- [x] Citus
-- [x] Vertica (without publishing)
-- [x] QuestDB
-- [x] chdb
-- [x] DuckDB
-- [x] DuckDB over local Parquet files
+ClickBench provides [publicly available benchmark results for over 60 database management systems](https://benchmark.clickhouse.com/).
+
+By default, all tests are run on c6a.4xlarge VM in AWS with 500 GB gp2.
+
+In addition, there are also systems where the code to run the benchmark is provided, but the results cannot be published.
+Currently, this includes
+
+- Vertica
+
+Please help us add more systems and run the benchmarks on more types of VMs:
+
+- [ ] Actian Vector
+- [ ] Apache Ignite
+- [ ] Apache Kudu
+- [ ] Apache Kylin
+- [ ] Azure Synapse
+- [ ] Boilingdata
+- [ ] CockroachDB Serverless
+- [ ] Databricks
+- [ ] DolphinDB
+- [ ] Dremio (without publishing)
 - [ ] DuckDB operating like "Athena" on remote Parquet files
-- [x] MonetDB
-- [x] mapD/Omnisci/HeavyAI
-- [x] Databend
-- [x] DataFusion
-- [x] ByteHouse
-- [x] Doris/PALO
-- [x] SelectDB
-- [x] Druid
-- [x] Pinot
-- [x] CrateDB
-- [x] Spark SQL
-- [x] Starrocks
-- [ ] ShitholeDB
+- [ ] EventQL
+- [ ] Exasol
 - [ ] Hive
-- [x] Hydra
+- [ ] Hydrolix
 - [ ] Impala
-- [x] Hyper
-- [x] Umbra
-- [x] SQLite
-- [x] Redshift
-- [x] Redshift Serverless
-- [ ] Redshift Spectrum
-- [ ] Presto
-- [ ] Trino
-- [x] Amazon Athena
-- [x] Bigquery (without publishing)
-- [x] Snowflake
-- [ ] Rockset
-- [x] CockroachDB
-- [ ] CockroachDB Serverless
-- [ ] Databricks
-- [ ] Planetscale (without publishing)
-- [ ] TiDB (TiFlash)
-- [x] Amazon RDS Aurora for MySQL
-- [x] Amazon RDS Aurora for Postgres
 - [ ] InfluxDB
-- [ ] TDEngine
-- [x] MongoDB
-- [ ] Cassandra
-- [ ] ScyllaDB
-- [x] Elasticsearch
-- [ ] Apache Ignite
-- [x] Motherduck
-- [x] Infobright
-- [ ] Actian Vector
+- [ ] LocustDB
 - [ ] Manticore Search
-- [x] Vertica (without publishing)
-- [ ] Azure Synapse
-- [ ] Starburst Galaxy
 - [ ] MS SQL Server with Column Store Index (without publishing)
-- [ ] Dremio (without publishing)
-- [ ] Exasol
-- [ ] LocustDB
-- [ ] EventQL
-- [x] Apache Drill
-- [ ] Apache Kudu
-- [ ] Apache Kylin
-- [x] S3 select command in AWS
-- [x] Kinetica
-- [ ] YDB
 - [ ] OceanBase
-- [ ] Boilingdata
-- [x] Byteconity
-- [ ] DolphinDB
-- [x] Oxla
+- [ ] Planetscale (without publishing)
+- [ ] Presto
 - [ ] Quickwit
-- [x] AlloyDB
-- [x] ParadeDB
-- [x] GlareDB
+- [ ] Redshift Spectrum
+- [ ] Rockset
 - [ ] Seafowl
+- [ ] ShitholeDB
 - [ ] Sneller
-- [x] Tablespace
-- [x] Tembo
-- [x] Cloudberry
-- [x] Daft
-- [x] Pandas
-- [x] Polars
-- [x] OctoSQL
-- [x] VictoriaLogs
-- [x] Hologres
+- [ ] Starburst Galaxy
+- [ ] Trino
+- [ ] TDEngine

-By default, all tests are run on c6a.4xlarge VM in AWS with 500 GB gp2.
+The list above _may_ include systems that cannot run ClickBench for various limitations.
+Systems that have been identified to have known limitations or issues and could not be benchmarked are:

-Please help us add more systems and run the benchmarks on more types of VMs.
+- Cassandra (see [discussion](https://github.com/ClickHouse/ClickBench/issues/384))
+- csvq (see [README](https://github.com/ClickHouse/ClickBench/tree/main/csvq))
+- dsq (see [README](https://github.com/ClickHouse/ClickBench/tree/main/dsq))
+- Hydrolix (see [README](https://github.com/ClickHouse/ClickBench/tree/main/hydrolix))
+- LoctusDB (see [README](https://github.com/ClickHouse/ClickBench/tree/main/locustdb))
+- ScyllaDB (see [discussion](https://github.com/ClickHouse/ClickBench/issues/384))
+- S3 select command in AWS (see [README](https://github.com/ClickHouse/ClickBench/tree/main/s3select))

 ## Similar Projects

alloydb/README.md

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@ Note: As of current date, AlloyDB can only be accessed by setting up Alloy Auth
 2. Setup a EC2 instance with 30gb disk
 a. SSH in and download Alloy Auth Proxy https://cloud.google.com/alloydb/docs/auth-proxy/overview
 ```bash
-wget https://storage.googleapis.com/alloydb-auth-proxy/v1.5.0/alloydb-auth-proxy.linux.amd64 -O alloydb-auth-proxy
+wget --continue --progress=dot:giga https://storage.googleapis.com/alloydb-auth-proxy/v1.5.0/alloydb-auth-proxy.linux.amd64 -O alloydb-auth-proxy

 chmod +x alloydb-auth-proxy
 ```
@@ -26,7 +26,7 @@ Note: As of current date, AlloyDB can only be accessed by setting up Alloy Auth

 4. Download public dataset and required scripts
 ```bash
-wget --continue 'https://datasets.clickhouse.com/hits_compatible/hits.tsv.gz'
+wget --continue --progress=dot:giga 'https://datasets.clickhouse.com/hits_compatible/hits.tsv.gz'
 ```
 Load scripts in this repo

alloydb/benchmark.sh

File mode changed from 100644 to 100755.

alloydb/results/gcp.128GB_tuned.json

Lines changed: 46 additions & 43 deletions
@@ -4,54 +4,57 @@
     "machine": "16 vCPU 128GB",
     "cluster_size": "serverless",
     "comment": "",
+    "proprietary": "yes",
+    "tuned": "no",

     "tags": ["C", "column-oriented", "PostgreSQL compatible", "managed", "gcp"],

+    "load_time": 0,
     "data_size": 9941379875,

     "result": [
-        [3.58975, 3.57992, 3.54662],
-        [0.08625, 0.0838, 0.08574],
-        [3.10807, 2.97731, 2.9985],
-        [2.44647, 2.36909, 2.39708],
-        [41.17192, 39.8976, 41.06776],
-        [133.67873, 131.0997, 128.0839],
-        [4.12758, 4.0239, 4.08013],
-        [0.09695, 0.09319, 0.09563],
-        [36.86525, 35.32732, 35.25588],
-        [48.45764, 47.93144, 46.52182],
-        [2.97389, 2.89167, 2.95393],
-        [2.72543, 2.63553, 2.69429],
-        [23.60888, 23.60363, 23.46187],
-        [33.28765, 32.03713, 31.81454],
-        [12.95648, 12.47305, 12.33908],
-        [21.37397, 21.35981, 21.19967],
-        [46.84838, 46.70314, 46.10616],
-        [0.06719, 0.06616, 0.06635],
-        [49.55343, 49.13112, 48.82092],
-        [0.06027, 0.05733, 0.05961],
-        [7.33196, 7.22324, 7.31305],
-        [7.30749, 7.03247, 7.22055],
-        [2.55208, 2.53827, 2.4771],
-        [1.20815, 1.16618, 1.19975],
-        [0.76076, 0.73289, 0.75499],
-        [6.91366, 6.74895, 6.64693],
-        [1.02818, 1.0217, 0.98556],
-        [6.67123, 6.46139, 6.44516],
-        [73.94566, 71.68655, 73.74145],
-        [0.00645, 0.00614, 0.00639],
-        [11.49935, 11.35433, 10.95156],
-        [23.93414, 23.63846, 22.78162],
-        [204.59582, 195.91745, 203.92405],
-        [198.93847, 190.58213, 191.488],
-        [197.07735, 193.70621, 190.22602],
-        [21.8236, 21.72214, 21.08265],
-        [0.5763, 0.55371, 0.57517],
-        [0.15114, 0.14738, 0.14813],
-        [0.10535, 0.1045, 0.10124],
-        [61.08416, 59.11649, 60.05224],
-        [0.15439, 0.1529, 0.15313],
-        [0.10687, 0.10602, 0.10481],
-        [0.06382, 0.06367, 0.06166]
+        [3.589, 3.579, 3.546],
+        [0.086, 0.083, 0.085],
+        [3.108, 2.977, 2.998],
+        [2.446, 2.369, 2.397],
+        [41.171, 39.897, 41.067],
+        [133.678, 131.099, 128.083],
+        [4.127, 4.023, 4.080],
+        [0.096, 0.093, 0.095],
+        [36.865, 35.327, 35.255],
+        [48.457, 47.931, 46.521],
+        [2.973, 2.891, 2.953],
+        [2.725, 2.635, 2.694],
+        [23.608, 23.603, 23.461],
+        [33.287, 32.037, 31.814],
+        [12.956, 12.473, 12.339],
+        [21.373, 21.359, 21.199],
+        [46.848, 46.703, 46.106],
+        [0.067, 0.066, 0.066],
+        [49.553, 49.131, 48.820],
+        [0.060, 0.057, 0.059],
+        [7.331, 7.223, 7.313],
+        [7.307, 7.032, 7.220],
+        [2.552, 2.538, 2.477],
+        [1.208, 1.166, 1.199],
+        [0.760, 0.732, 0.754],
+        [6.913, 6.748, 6.646],
+        [1.028, 1.021, 0.985],
+        [6.671, 6.461, 6.445],
+        [73.945, 71.686, 73.741],
+        [0.006, 0.006, 0.006],
+        [11.499, 11.354, 10.951],
+        [23.934, 23.638, 22.781],
+        [204.595, 195.917, 203.924],
+        [198.938, 190.582, 191.488],
+        [197.077, 193.706, 190.226],
+        [21.823, 21.722, 21.082],
+        [0.576, 0.553, 0.575],
+        [0.151, 0.147, 0.148],
+        [0.105, 0.104, 0.101],
+        [61.084, 59.116, 60.052],
+        [0.154, 0.152, 0.153],
+        [0.106, 0.106, 0.104],
+        [0.063, 0.063, 0.061]
     ]
 }
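
The README change in this commit asks contributors to double-check that each results file is valid JSON. A check along those lines could look like the sketch below. It is not a validator shipped with the repository: the function name `check_result_json` is hypothetical, the list of expected keys is taken only from the fields visible in this hunk, and requiring exactly three non-negative numbers per query is an assumption based on the rows shown here.

```python
import json

# Keys visible in the hunk above; the real files may require more or fewer.
EXPECTED_KEYS = (
    "machine", "cluster_size", "comment", "proprietary", "tuned",
    "tags", "load_time", "data_size", "result",
)


def check_result_json(text, expected_queries=43, runs_per_query=3):
    """Return a list of human-readable problems; an empty list means no issue found."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        # Catches comma errors such as a trailing comma after the last row.
        return [f"invalid JSON: {exc}"]
    if not isinstance(doc, dict):
        return ["top-level value is not an object"]

    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in doc]
    result = doc.get("result")
    if not isinstance(result, list):
        problems.append('"result" is not a list')
        return problems
    if len(result) != expected_queries:
        problems.append(f"expected {expected_queries} rows, found {len(result)}")
    for index, row in enumerate(result, start=1):
        if not isinstance(row, list) or len(row) != runs_per_query:
            problems.append(f"row {index}: expected {runs_per_query} timings")
        elif not all(
            isinstance(t, (int, float)) and not isinstance(t, bool) and t >= 0
            for t in row
        ):
            problems.append(f"row {index}: timings must be non-negative numbers")
    return problems
```

Returning a list of problems instead of raising on the first one lets a contributor fix several issues in a single pass.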
