Skip to content

Conversation

@karlomedallo
Copy link
Contributor

Fixes #9281 by implementing explicit CAST for numeric filters in the DuckDB driver. This ensures correct numeric comparisons and prevents mismatches caused by string-to-number conversions.

Added test cases to validate casting behavior for different filter scenarios.

Check List

  • Tests have been run in packages where changes made if available
  • Linter has been run for changed code
  • Tests for the changes have been added if not covered yet
  • Docs have been added / updated if required

Issue Reference this PR resolves

#9281

Description of Changes Made (if issue reference is not provided)

Fixes [cube-js#9281](cube-js#9281) by implementing explicit CAST for numeric filters in the DuckDB driver. This ensures correct numeric comparisons and prevents mismatches caused by string-to-number conversions.

Added test cases to validate casting behavior for different filter scenarios.
@karlomedallo karlomedallo requested review from a team as code owners March 11, 2025 15:15
@github-actions github-actions bot added the pr:community Contribution from Cube.js community members. label Mar 11, 2025
Copy link
Member

@KSDaemon KSDaemon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karlomedallo Thank you for the fix and even tests! However, I think this needs a bit of edits.


class DuckDBFilter extends BaseFilter {
public castParameter() {
if (this.measure || this.definition().type === 'number') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is not exactly right. Measures might be of different types and only numbers-alike (e.g. count, sum) should be cast.

Copy link
Contributor Author

@karlomedallo karlomedallo Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the review. I actually found another bug when I created the test for other measure types. The bug is about time measure filters. I'm not sure if you want to track this as another issue but for the meantime, I just added it here. Just let me know if you want me to put this in another PR.

give this cube:

cube(`Orders`, {
  sql: `
  select id, amount, status, created_at, is_paid from (
    select 1 as id, 100 as amount, 'new' status, TIMESTAMPTZ '2020-01-01 00:00:00' created_at, TRUE as is_paid
    UNION ALL
    select 2 as id, 200 as amount, 'new' status, TIMESTAMPTZ '2020-01-02 00:00:00' created_at, TRUE as is_paid
    UNION ALL
    select 3 as id, 300 as amount, 'processed' status, TIMESTAMPTZ '2020-01-03 00:00:00' created_at, TRUE as is_paid
    UNION ALL
    select 4 as id, 500 as amount, 'processed' status, TIMESTAMPTZ '2020-01-04 00:00:00' created_at, FALSE as is_paid
    UNION ALL
    select 5 as id, 600 as amount, 'shipped' status, TIMESTAMPTZ '2020-01-05 00:00:00' created_at, FALSE as is_paid
  )
  `,
  measures: {
    count: {
      type: `count`,
    },
    totalAmount: {
      sql: `amount`,
      type: `sum`,
    },
    avgAmount: {
      sql: `amount`,
      type: `avg`,
    },
    statusList: {
      sql: `status`,
      type: `string`,
    },
    lastStatus: {
      sql: `MAX(status)`,
      type: `string`,
    },
    hasUnpaidOrders: {
      sql: `BOOL_OR(NOT is_paid)`,
      type: `boolean`,
    },
    maxCreatedAt: {
      sql: `MAX(created_at)`,
      type: `time`,
    },
    toRemove: {
      type: `count`,
    },
  },
  dimensions: {
    status: {
      sql: `status`,
      type: `string`,
    },
    amount: {
      sql: `amount`,
      type: `number`,
    },
    isPaid: {
      sql: `is_paid`,
      type: `boolean`,
    },
    createdAt: {
      sql: `created_at`,
      type: `time`,
    },
  },
});

these tests will fail:

test('time measure filter - afterDate operator', async () => {
    // This test verifies that filters on time measures don't use CAST
    const response = await client.load({
      measures: [
        'Orders.count'
      ],
      filters: [
        {
          member: 'Orders.maxCreatedAt',
          operator: 'afterDate',
          values: ['2020-01-03']
        }
      ]
    });
    
    // Orders after 2020-01-03
    expect(response.rawData()[0]['Orders.count']).toBe('5');
  });
  
  test('time measure filter - beforeDate operator', async () => {
    // This test verifies that beforeDate operator works correctly with time measures
    const response = await client.load({
      measures: [
        'Orders.count'
      ],
      filters: [
        {
          member: 'Orders.maxCreatedAt',
          operator: 'beforeDate',
          values: ['2020-01-06']
        }
      ]
    });
    // All orders should be counted since max date (2020-01-05) is before 2020-01-06
    expect(response.rawData()[0]['Orders.count']).toBe('5');
  });

  test('time measure filter - beforeOrOnDate operator', async () => {
    // This test verifies that beforeOrOnDate operator works correctly with time measures
    const response = await client.load({
      measures: [
        'Orders.count'
      ],
      filters: [
        {
          member: 'Orders.maxCreatedAt',
          operator: 'beforeOrOnDate',
          values: ['2020-01-05']
        }
      ]
    });
    // All orders should be counted since max date (2020-01-05) is equal to 2020-01-05
    expect(response.rawData()[0]['Orders.count']).toBe('5');
  });

  test('time measure filter - afterOrOnDate operator', async () => {
    // This test verifies that afterOrOnDate operator works correctly with time measures
    const response = await client.load({
      measures: [
        'Orders.count'
      ],
      filters: [
        {
          member: 'Orders.maxCreatedAt',
          operator: 'afterOrOnDate',
          values: ['2020-01-05']
        }
      ]
    });
    // All orders should be counted since max date (2020-01-05) is equal to 2020-01-05
    expect(response.rawData()[0]['Orders.count']).toBe('5');
  });

  test('time measure filter - inDateRange operator', async () => {
    // This test verifies that inDateRange operator works correctly with time measures
    const response = await client.load({
      measures: [
        'Orders.count'
      ],
      filters: [
        {
          member: 'Orders.maxCreatedAt',
          operator: 'inDateRange',
          values: ['2020-01-04', '2020-01-06']
        }
      ]
    });
    // All orders should be counted since max date (2020-01-05) is within the range
    expect(response.rawData()[0]['Orders.count']).toBe('5');
  });

  test('time measure filter - notInDateRange operator', async () => {
    // This test verifies that notInDateRange operator works correctly with time measures
    const response = await client.load({
      measures: [
        'Orders.count'
      ],
      filters: [
        {
          member: 'Orders.maxCreatedAt',
          operator: 'notInDateRange',
          values: ['2020-01-06', '2020-01-07']
        }
      ]
    });
    // All orders should be counted since max date (2020-01-05) is outside the range
    expect(response.rawData()[0]['Orders.count']).toBe('5');
  });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated my PR now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @karlomedallo Great that you found other bug and fixed it!
No need for a separate PR. I'll review new changes soon.

@KSDaemon KSDaemon self-assigned this Mar 13, 2025
@karlomedallo
Copy link
Contributor Author

I reviewed the logs for the recent CI run and noticed the following error:

2025-03-14T22:59:21.2032573Z     FetchError: request to https://github.com/cube-js/cube/releases/download/v1.2.22/cubestored-x86_64-unknown-linux-gnu.tar.gz failed, reason: socket hang up

It appears to be a network issue with fetching the cubestored tarball. I checked the URL and it works now, suggesting it might have been a temporary glitch. Perhaps rerunning the actions could resolve the issue.

Copy link
Member

@KSDaemon KSDaemon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 Thank you again for the fix and wide range covering tests!

@KSDaemon KSDaemon merged commit 969508d into cube-js:master Mar 19, 2025
26 checks passed
marianore-muttdata pushed a commit to MuttData/cube that referenced this pull request Jun 17, 2025
…s#9328)

* feat(duckdb-driver): Fix numeric filter comparisons in DuckDB

Fixes [cube-js#9281](cube-js#9281) by implementing explicit CAST for numeric filters in the DuckDB driver. This ensures correct numeric comparisons and prevents mismatches caused by string-to-number conversions.

Added test cases to validate casting behavior for different filter scenarios.

* feat(duckdb-driver): Fix numeric and time measure filter comparisons

* ci: trigger checks for PR cube-js#9281

---------

Co-authored-by: Konstantin Burkalev <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

driver:duckdb pr:community Contribution from Cube.js community members.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Numeric filter values passed as strings in DuckDB dialect

3 participants