Conversation

@cometkim cometkim commented Nov 13, 2025

Notes:

  • Quickwit hasn't released a new version in a long time, and many people actually use nightly builds. I used a prebuilt binary here to avoid running Docker, but we can request a new binary release from the Quickwit team before merging this PR.

  • Quickwit does not support Q5. Running it would require additional features, such as an equivalent of Elasticsearch's bucket_script.

  • The result for Q2 appears to be inconsistent with other engines. It's unclear whether this is a bug, a precision loss, or data corruption.

  • Quickwit's terms aggregation does not support unlimited buckets. There is no explicit "return all" option in aggregations, and even if I specify an arbitrarily large number, the number of buckets is capped by the searcher's aggregation_bucket_limit setting.

  • I haven't tuned the settings for the instance size.

  • This may differ significantly from actual production results. It's more like benchmarking Tantivy. Since Quickwit is typically configured with S3 and a Postgres metastore, I suspect there will be additional overhead from other components and networking.

  • There were several errors while loading the 1000m data, but they weren't logged, so I don't know the exact cause. I need to run this at least once more to check the data quality.
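
To make the bucket-limit note above concrete, here is a minimal sketch of a terms aggregation request body in the Elasticsearch-style syntax that Quickwit's search API accepts. The field name `kind`, the aggregation name `by_field`, and the helper `build_terms_agg_request` are illustrative assumptions, not taken from this PR. Per the note, a very large `size` does not act as "return all": the effective bucket count is still capped by the searcher's aggregation_bucket_limit.

```python
import json


def build_terms_agg_request(field: str, size: int) -> dict:
    """Build a search request body with a single terms aggregation.

    Assumption: `field` and the aggregation name `by_field` are
    placeholders chosen for illustration only.
    """
    return {
        "query": "*",    # match all documents
        "max_hits": 0,   # only aggregation results are wanted, no hits
        "aggs": {
            "by_field": {
                # `size` is the requested number of buckets; there is no
                # "unlimited" value, so a large number is passed instead.
                "terms": {"field": field, "size": size},
            }
        },
    }


body = build_terms_agg_request("kind", 1_000_000)
print(json.dumps(body, indent=2))
```

If the benchmark queries need every distinct term, the requested `size` and the configured limit both have to be at least the field's cardinality; otherwise the result set may be incomplete.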

CLAassistant commented Nov 13, 2025

CLA assistant check
All committers have signed the CLA.

@@ -0,0 +1,18 @@
#!/bin/bash

# The latest official release of Quickwit is too old; many Tantivy queries are unsupported.
Member

@rschu1ze rschu1ze Nov 14, 2025

So what stops us from using the latest and greatest Docker builds?

Quickwit hasn't released a new version in a long time, and many people actually use nightly builds. I used a prebuilt binary here to avoid running Docker, but we can request a new binary release from the Quickwit team before merging this PR.

EDIT: Using Docker is fine, see the starrocks and singlestore submissions in this repository.

@rschu1ze
Member

Quickwit does not support Q5. Running it would require additional features, such as an equivalent of Elasticsearch's bucket_script.

It is fine to add [null, null, null] as measurements for Q5.
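
As a rough sketch of what that could look like, the snippet below fills in `[null, null, null]` for a query that cannot be run. The timings are hypothetical placeholders, not measurements from this PR, and the helper name `to_measurements` and the flat list layout are assumptions about the results file rather than its confirmed schema.

```python
import json

TRIES = 3  # assumption: each query is timed three times

# Hypothetical timings in seconds for Q1..Q5; None marks an unsupported query.
timings = [
    [0.10, 0.09, 0.09],  # Q1 (placeholder values)
    [0.20, 0.18, 0.18],  # Q2 (placeholder values)
    [0.30, 0.28, 0.27],  # Q3 (placeholder values)
    [0.40, 0.35, 0.35],  # Q4 (placeholder values)
    None,                # Q5: not supported by the engine
]


def to_measurements(rows, tries=TRIES):
    """Replace each missing query with a row of `tries` nulls."""
    return [row if row is not None else [None] * tries for row in rows]


measurements = to_measurements(timings)
print(json.dumps(measurements[4]))  # prints [null, null, null]
```

Python's `None` serializes to JSON `null`, so the unsupported query keeps its slot and the remaining entries stay aligned with their query numbers.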

@rschu1ze
Member

The result for Q2 appears to be inconsistent with other engines. It's unclear whether this is a bug, a precision loss, or data corruption.

Some more debugging would be nice, but we can again mark Q2 as [null, null, null].

@rschu1ze
Member

Quickwit's terms aggregation does not support unlimited buckets. There is no explicit "return all" option in aggregations, and even if I specify an arbitrarily large number, the number of buckets is capped by the searcher's aggregation_bucket_limit setting.

I don't really understand what that means. Is performance slower than it could be?

@rschu1ze
Member

I haven't tuned the settings for the instance size.

That's good. As per the benchmark rules, as little tuning as possible should be applied (i.e., databases should run with their default settings).

@rschu1ze
Member

@cometkim I'm interested in merging this, thanks for the PR. It seems more work is needed; please ping me when this is ready.
