Skip to content

Cube preAggregation fail to incrementally capture the up-to-date time series dataΒ #9293

@kevinleeTCA

Description

@kevinleeTCA

Problem

We have an incremental preAggregation rollup, updated on daily basis, and partition by month

  preAggregations: {
    rollup: {
      measures: [CUBE.incomeAmount, CUBE.count, CUBE.managementFeeIncomeAmount, CUBE.minDate],
      dimensions: [CUBE.transactionTaxCategoryId, CUBE.generalLedgerManagement, CUBE.accountOwner, CUBE.accountBookType],
      timeDimension: CUBE.createdAt,
      granularity: `day`,
      indexes: {
        idx: {
          columns: [ CUBE.accountOwner, CUBE.generalLedgerManagement],
        },
      },
      refresh_key: {
        every: `30 6 * * *`,
        timezone: `Australia/Sydney`,
        incremental: true,
        update_window: `120 days`,
      },
      partition_granularity: `month`,
      build_range_start: {
        sql: `SELECT '2020-08-11'::timestamp AT TIME ZONE 'utc'`,
      },
      build_range_end: {
        sql: `SELECT NOW()`,
      },
    },
  },

We also have the 5 timezones that we would like to support for our customer' query:

  scheduledRefreshTimeZones: [
    'Australia/Adelaide',
    'Australia/Brisbane',
    'Australia/Darwin',
    'Australia/Perth',
    'Australia/Sydney'
  ],

our configuration:

Cube server type: self-host via docker
Cube version: 1.1.18
Cubestore version: 1.1.18

We found a missing record in the pre-aggregation rollups, that record created at 2025-03-02 10pm (Australia/Sydney). Daily rollup runs at 6.30am (Australia/Sydney).

Details

  1. looking at all our rollups for March, we have 5 of them corresponding to the 5 timezones, but hard to figure it out which one represents which timezone, any clue ?
    Image
  2. Among these 5, only 1 of them , i.e., Australia/Perth partition has the missing management fee transaction data ($35). but it also means it has 1 missing transaction ($18) that happens at 11pm, 28th, Feb (Australia/Perth)

Image

  1. All other 4 partitions have no such transaction, but does have the $18 that happens 2025-02-28T15:01:29.343Z (UTC), as it is legitimately 1st of March in their timezone.
  2. This is aligned with the query we emitted via API.

Image

  1. Looking at the actual time of the $35, it is 2025-03-02T11:07:17.596Z (UTC) which is about 10pm at 02/03 in Australia/Sydney timezone, so it should be captured in other time partitions, as we did the daily aggregation 03/03 at 6.30am Australia/Sydney timezone, you can see the build_range_end are all about 7am 03/03, so it should cover the missing transactions.

Related Cube.js schema

...
  measures: {
    incomeAmount: {
      sql: `amount`,
      type: `sum`,
    },
    feeIncomeAmount: {
      type: `sum`,
      title: `Income excluding any Rent`,
      sql: `amount`,
      filters: [
        {
          sql: `${CUBE.transactionTaxCategoryId} != 'RENT'`,
        },
      ],
    },
    count: {
      type: `count`,
    },
    minDate: {
      sql: `created_at::TIMESTAMP`,
      type: `min`,
    },
    managementFeeIncomeAmount: {
      type: `sum`,
      title: `Total Management Fee Income`,
      sql: `amount`,
      filters: [
        {
          sql: `${CUBE.transactionTaxCategoryId} = 'MANAGEMENT_FEES'`,
        },
      ],
      description: "Management fee income also known as commission fee or commission rate income, " +
          "it is a service fee that landlord/owners/investor charged on top of the rent collected from tenants"
    },
  },
  dimensions: {
    eventId: {
      sql: `event_id`,
      type: `string`,
      primaryKey: true,
    },
    accountId: {
      sql: `account_id`,
      type: `string`,
      primaryKey: true,
      shown: true,
    },
    accountBookType: {
      sql: `book`,
      type: `string`,
    },
    createdAt: {
      sql: `created_at::TIMESTAMP`,
      type: `time`,
    },
    accountOwner: {
      sql: `account_owner_reference`,
      type: `string`,
    },
    generalLedgerManagement: {
      sql: `relating_to_management`,
      type: `string`,
    },
    transactionTaxCategoryId: {
      sql: `tax_category`,
      type: `string`,
    },
  },

Related Cube.js query

query glIncomeByTaxCategoryReport($organisationId: String!, $teamIds: [String!], $groupIds: [String!], $propertyTypes: [String!], $dateRange: [String!], $orderBy: RootOrderByInput!, $limit: Int, $timezone: String) {
  cube(
    timezone: $timezone
    where: {pM_Teams: {organisationId: {equals: $organisationId}, teamId: {in: $teamIds}, propertyType: {in: $propertyTypes}}, generalLedger_Income: {createdAt: {inDateRange: $dateRange}, accountBookType: {equals: "CASH"}}, groupsByMa: {groupIds: {contains: $groupIds}}}
    orderBy: $orderBy
    limit: $limit
  ) {
    generalLedger_Income {
      incomeAmount
      managementFeeIncomeAmount
			generalLedgerManagement
      count
      __typename
    }
    pM_FeeTaxCategories {
      feeTaxCategoryName
      __typename
    }
		pM_Teams{
			teamName
		}
    __typename
  }
}

params

{
	"organisationId": "4df60ddb-d2ba-4b32-96b0-616c58ecafb8",
	"dateRange": [
		"2025-03-01",
		"2025-03-28"
	],
	"orderBy": {
		"generalLedger_Income": {
			"createdAt": "desc"
		}
	},
	"limit": 3000,
	"timezone": "Australia/Perth"
}

Metadata

Metadata

Labels

questionThe issue is a question. Please use Stack Overflow for questions.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions