-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat(duckdb-driver): Fix numeric filter comparisons in DuckDB #9328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixes [cube-js#9281](cube-js#9281) by implementing explicit CAST for numeric filters in the DuckDB driver. This ensures correct numeric comparisons and prevents mismatches caused by string-to-number conversions. Added test cases to validate casting behavior for different filter scenarios.
KSDaemon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@karlomedallo Thank you for the fix and even tests! However, I think this needs a bit of edits.
|
|
||
| class DuckDBFilter extends BaseFilter { | ||
| public castParameter() { | ||
| if (this.measure || this.definition().type === 'number') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is not exactly right. Measures might be of different types and only numbers-alike (e.g. count, sum) should be cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the review. I actually found another bug when I created the test for other measure types. The bug is about time measure filters. I'm not sure if you want to track this as another issue but for the meantime, I just added it here. Just let me know if you want me to put this in another PR.
give this cube:
cube(`Orders`, {
sql: `
select id, amount, status, created_at, is_paid from (
select 1 as id, 100 as amount, 'new' status, TIMESTAMPTZ '2020-01-01 00:00:00' created_at, TRUE as is_paid
UNION ALL
select 2 as id, 200 as amount, 'new' status, TIMESTAMPTZ '2020-01-02 00:00:00' created_at, TRUE as is_paid
UNION ALL
select 3 as id, 300 as amount, 'processed' status, TIMESTAMPTZ '2020-01-03 00:00:00' created_at, TRUE as is_paid
UNION ALL
select 4 as id, 500 as amount, 'processed' status, TIMESTAMPTZ '2020-01-04 00:00:00' created_at, FALSE as is_paid
UNION ALL
select 5 as id, 600 as amount, 'shipped' status, TIMESTAMPTZ '2020-01-05 00:00:00' created_at, FALSE as is_paid
)
`,
measures: {
count: {
type: `count`,
},
totalAmount: {
sql: `amount`,
type: `sum`,
},
avgAmount: {
sql: `amount`,
type: `avg`,
},
statusList: {
sql: `status`,
type: `string`,
},
lastStatus: {
sql: `MAX(status)`,
type: `string`,
},
hasUnpaidOrders: {
sql: `BOOL_OR(NOT is_paid)`,
type: `boolean`,
},
maxCreatedAt: {
sql: `MAX(created_at)`,
type: `time`,
},
toRemove: {
type: `count`,
},
},
dimensions: {
status: {
sql: `status`,
type: `string`,
},
amount: {
sql: `amount`,
type: `number`,
},
isPaid: {
sql: `is_paid`,
type: `boolean`,
},
createdAt: {
sql: `created_at`,
type: `time`,
},
},
});these tests will fail:
test('time measure filter - afterDate operator', async () => {
// This test verifies that filters on time measures don't use CAST
const response = await client.load({
measures: [
'Orders.count'
],
filters: [
{
member: 'Orders.maxCreatedAt',
operator: 'afterDate',
values: ['2020-01-03']
}
]
});
// Orders after 2020-01-03
expect(response.rawData()[0]['Orders.count']).toBe('5');
});
test('time measure filter - beforeDate operator', async () => {
// This test verifies that beforeDate operator works correctly with time measures
const response = await client.load({
measures: [
'Orders.count'
],
filters: [
{
member: 'Orders.maxCreatedAt',
operator: 'beforeDate',
values: ['2020-01-06']
}
]
});
// All orders should be counted since max date (2020-01-05) is before 2020-01-06
expect(response.rawData()[0]['Orders.count']).toBe('5');
});
test('time measure filter - beforeOrOnDate operator', async () => {
// This test verifies that beforeOrOnDate operator works correctly with time measures
const response = await client.load({
measures: [
'Orders.count'
],
filters: [
{
member: 'Orders.maxCreatedAt',
operator: 'beforeOrOnDate',
values: ['2020-01-05']
}
]
});
// All orders should be counted since max date (2020-01-05) is equal to 2020-01-05
expect(response.rawData()[0]['Orders.count']).toBe('5');
});
test('time measure filter - afterOrOnDate operator', async () => {
// This test verifies that afterOrOnDate operator works correctly with time measures
const response = await client.load({
measures: [
'Orders.count'
],
filters: [
{
member: 'Orders.maxCreatedAt',
operator: 'afterOrOnDate',
values: ['2020-01-05']
}
]
});
// All orders should be counted since max date (2020-01-05) is equal to 2020-01-05
expect(response.rawData()[0]['Orders.count']).toBe('5');
});
test('time measure filter - inDateRange operator', async () => {
// This test verifies that inDateRange operator works correctly with time measures
const response = await client.load({
measures: [
'Orders.count'
],
filters: [
{
member: 'Orders.maxCreatedAt',
operator: 'inDateRange',
values: ['2020-01-04', '2020-01-06']
}
]
});
// All orders should be counted since max date (2020-01-05) is within the range
expect(response.rawData()[0]['Orders.count']).toBe('5');
});
test('time measure filter - notInDateRange operator', async () => {
// This test verifies that notInDateRange operator works correctly with time measures
const response = await client.load({
measures: [
'Orders.count'
],
filters: [
{
member: 'Orders.maxCreatedAt',
operator: 'notInDateRange',
values: ['2020-01-06', '2020-01-07']
}
]
});
// All orders should be counted since max date (2020-01-05) is outside the range
expect(response.rawData()[0]['Orders.count']).toBe('5');
});There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated my PR now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @karlomedallo Great that you found other bug and fixed it!
No need for a separate PR. I'll review new changes soon.
|
I reviewed the logs for the recent CI run and noticed the following error: It appears to be a network issue with fetching the cubestored tarball. I checked the URL and it works now, suggesting it might have been a temporary glitch. Perhaps rerunning the actions could resolve the issue. |
KSDaemon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 Thank you again for the fix and wide range covering tests!
…s#9328) * feat(duckdb-driver): Fix numeric filter comparisons in DuckDB Fixes [cube-js#9281](cube-js#9281) by implementing explicit CAST for numeric filters in the DuckDB driver. This ensures correct numeric comparisons and prevents mismatches caused by string-to-number conversions. Added test cases to validate casting behavior for different filter scenarios. * feat(duckdb-driver): Fix numeric and time measure filter comparisons * ci: trigger checks for PR cube-js#9281 --------- Co-authored-by: Konstantin Burkalev <[email protected]>
Fixes #9281 by implementing explicit CAST for numeric filters in the DuckDB driver. This ensures correct numeric comparisons and prevents mismatches caused by string-to-number conversions.
Added test cases to validate casting behavior for different filter scenarios.
Check List
Issue Reference this PR resolves
#9281
Description of Changes Made (if issue reference is not provided)