Skip to content

Rollup Join doesn't support filtering & querying a subset of dimensionsΒ #10115

@niklasreher

Description

@niklasreher

Problem

Hi! πŸ™‹β€β™‚οΈ
I'm trying to join data from different data sources (BigQuery and Clickhouse) via rollup join. So far, everything works as expected using the following (simplified) code:

cubes:
  - name: brands_from_products
    data_source: default
    sql: SELECT DISTINCT brand_id, organization_id FROM products

    dimensions:
        - name: brand_id
          sql: brand_id
          type: string
          primary_key: true
          public: true

        - name: brand_name
          sql: "dictGet('brands_dictionary', 'name', cityHash64(brand_id))"
          type: string

    pre_aggregations:
        - name: brands_from_products_rollup
          type: rollup
          external: true
          dimensions:
            - brand_id             
            - brand_name
          indexes:
            - name: brand_id_index
              columns:
                - brand_id
          refresh_key:
            every: 1 day

  - name: bq_orders
    data_source: bigquery
    sql_table: (SELECT * FROM fc-datawarehouse.dbt_mart_fc_platform_hh.dim_orders)

    dimensions:
      - name: brand_id
        sql: brand_id
        type: string

    joins:
      - name: brands_from_products
        sql: "{bq_orders.brand_id} = {brands_from_products.brand_id}"
        relationship: many_to_one

    pre_aggregations:
      - name: orders_rollup
        type: rollup
        external: true
        dimensions:
          - brand_id
        indexes:
          - name: brand_id_index
            columns:
              - brand_id
        refresh_key:
          every: 1 day

      - name: combined_rollup
        type: rollup_join
        external: true
        dimensions:
          - bq_orders.brand_id
          - brands_from_products.brand_id
          - brands_from_products.brand_name
        rollups:
          - bq_orders.orders_rollup
          - brands_from_products.brands_from_products_rollup

However, whenever I query only a subset of the dimensions from the rollup join, e.g.

SELECT
  `brands_from_products__brand_id` `brands_from_products__brand_id`,
  `bq_orders__brand_id` `bq_orders__brand_id`
FROM
  metrics_service_rollups.bq_orders_orders_rollup AS `bq_orders__orders_rollup`
  LEFT JOIN metrics_service_rollups.brands_from_products_brands_from_products_rollup AS `brands_from_products__brands_from_products_rollup` ON `bq_orders__brand_id` = `brands_from_products__brand_id`
GROUP BY
  1,
  2

... Cube gives me the following error message:

To join across data sources use rollupJoin with Cube Store. If rollupJoin is defined, this error indicates it doesn't match the query. Please use Rollup Designer to verify it's definition. Found data sources: default, bigquery

The same happens when I try to add a filter to the query.

Is there anything I'm missing that would allow me to create a rollup join with many dimensions and measures in it and query only a subset of it?

Many thanks in advance! πŸ™

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionThe issue is a question. Please use Stack Overflow for questions.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions