Skip to content

[Feature request] Allow for Query Hints / Settings to be accessible to query builder for a given driverΒ #8838

@Korg95

Description

@Korg95

Is your feature request related to a problem? Please describe.

In an example such as ClickHouse, there are many settings that can be applied at the server, session, or query level to optimize performance. However, the best setting for one query may not be ideal for others. Relying on server-level settings for query performance often leads to suboptimal results for certain queries, especially those with varying parameters. Many queries can benefit from runtime hints/settings to improve performance based on their specific needs (Telemetry might be another case).

Currently, Cube.js forces us to use SQL instead of table names in cube definitions as a workaround to apply settings. This causes subqueries to be formed where the setting is only applied to the subquery, rather than the entire query. As a result, the intended settings are not fully applied, leading to performance inefficiencies, particularly for real-time applications. There are also potentially databases where applying settings at the subquery level is not possible, making this workaround even more problematic.

Describe the solution you'd like

We would like Cube.js to support the ability to pass query hints or settings to the query builder, specific to a given driver. Ideally, these hints could be calculated dynamically during query_rewrite and then passed down to the query generation.

Additionally, allowing API callers to pass parameters that would control query hints would provide better control over query execution, especially when there is no direct customer exposure. For example setting of parameters may be calculated via a call to a library in CSharp which is our main language for example. Forcing individuals to use query_rewrite to do it all might be a showstopper in this case.

An alternative on this latter part to avoid this low level hinting would be to use schema to help schematize the options available for a given datastore to pass to the query generation layer. So when you query a cube linked to a datastore, a range of options would be available.

Describe alternatives you've considered

Using raw SQL instead of table names in cube definitions to allow for injection of settings using filter params. This workaround forces the query to become a subquery, limiting the scope of the applied settings to the subquery rather than the entire query. This not only leads to inefficiencies but may also not be possible in languages where settings cannot be applied to subqueries.
Manually setting certain parameters at the server or session level, though this does not offer the flexibility needed for queries with varied workloads and priorities.

Another workaround is to ask cube for the raw SQL and then alter it as required and execute the query yourself. This is useful ofcourse for edge case scenarios where you might do things like alter the query and export the result to s3 but it totally defeats the purpose of cube for regular query scenarios.

Additional context
Some specific use cases where query hints/settings would be beneficial:

  • Distributed queries in ClickHouse Cloud require settings to use more than one replica.
  • Longer lookback queries could benefit from allocating more threads, ensuring better latency for less frequent, long-running queries.
  • Overriding the join algorithm for specific queries to optimize execution plans.
  • Setting a query ID for end-to-end query tracing and monitoring.

Providing this flexibility would improve query execution performance, especially for real-time applications.

If there is existing functionality to do this, happy to move this to a question so that others can find the info there.

Thank you!

Metadata

Metadata

Assignees

No one assigned

    Labels

    driver:clickhouseIssues related to the ClickHouse driver

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions