Skip to content

Fact to Fact JOINS where Querying more than 1 Fact tableΒ #9209

@ElisabethPA

Description

@ElisabethPA

TWO Fact tables are joined in the same query indirectly, while there is no Fact-to-Fact JOIN defined in example .js file .

We are migrating from Microsoft multidimensional OLAP to CubeDev. Our backend data is in star schema format. We have many facts and dimensions and we are not able to merge fact tables together. They have different granularities. In OLAP cubes we are able to query two measures from different facts in the same MDX query without any problem, while in CubeDev we have a major stopper.

The structure of SQL query generated by CubeDev is slowing down the performance, because we have 1 Fact table join with common dimension + 2nd Fact table joined with common dimension. Both of these queries are combined using JOIN as well. In practice this is generating a cross join between tables, that potentially for us have millions of rows.

Is there any work-around to avoid having LEFT join used in a simple queries where more than one FACT table is involved?
Here is our example:

Tables used:

  1. FactSales
  2. DimProduct
  3. FactTargets

Query we tried from our side:

{
  "limit": 5000,
  "measures": [
    "FactSales.totalSales",
    "FactTargets.totalTargets"
  ],
  "dimensions": [
    "DimProduct.productName"
  ]
}

SQL Query that is generated involves LEFT JOIN on Fact tables, which slows down the performance when tables will have more than millions rows.

The problem for us is the inner query, which shows that two fact tables in example above are cross-joined. Is there any way to avoid it?

Example of .js file here:

cube(`FactSales`, {
  sql: `
    SELECT * FROM (VALUES
      (1, 1001, 500, '2024-02-01'),
      (2, 1002, 700, '2024-02-02'),
      (3, 1001, 800, '2024-02-03'),
      (4, 1003, 600, '2024-02-04')
    ) AS t(sale_id, product_id, sales_value, sale_date)
  `,

  joins: {
    DimProduct: {
      sql: `${CUBE}.product_id = ${DimProduct}.product_id`,
      relationship: `belongsTo`
    }
  },

  measures: {
    totalSales: {
      type: `sum`,
      sql: `sales_value`
    }
  },

  dimensions: {
    saleId: {
      sql: `sale_id`,
      type: `number`,
      primaryKey: true
    },
    productId: {
      sql: `product_id`,
      type: `string`
    },
    saleDate: {
      sql: `sale_date`,
      type: `time`
    }
  }
});

cube(`FactTargets`, {
  sql: `
    SELECT * FROM (VALUES
      (1, 1001, 1500, '2024-02'),
      (2, 1002, 1200, '2024-02'),
      (3, 1003, 1000, '2024-02')
    ) AS t(target_id, product_id, target_value, target_month)
  `,

  joins: {
    DimProduct: {
      sql: `${CUBE}.product_id = ${DimProduct}.product_id`,
      relationship: `belongsTo`
    }
  },

  measures: {
    totalTargets: {
      type: `sum`,
      sql: `target_value`
    }
  },

  dimensions: {
    targetId: {
      sql: `target_id`,
      type: `number`,
      primaryKey: true
    },
    productId: {
      sql: `product_id`,
      type: `string`
    },
    targetMonth: {
      sql: `target_month`,
      type: `string`
    }
  }
});

cube(`DimProduct`, {
  sql: `
    SELECT * FROM (VALUES
      (1001, 'Laptop', 'Electronics'),
      (1002, 'Phone', 'Electronics'),
      (1003, 'Desk', 'Furniture')
    ) AS t(product_id, product_name, category)
  `,

  joins: {
    FactSales: {
      sql: `${CUBE}.product_id = ${FactSales}.product_id`,
      relationship: `hasMany`
    }
  },

  dimensions: {
    productId: {
      sql: `product_id`,
      type: `string`,
      primaryKey: true
    },
    productName: {
      sql: `product_name`,
      type: `string`
    },
    category: {
      sql: `category`,
      type: `string`
    }
  }
});

Metadata

Metadata

Assignees

Labels

questionThe issue is a question. Please use Stack Overflow for questions.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions