-
Notifications
You must be signed in to change notification settings - Fork 14
Adds test ensuring dictionary corruption does not occur anymore #208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds test ensuring dictionary corruption does not occur anymore #208
Conversation
| /// Test that multiple first_value() aggregations work correctly in distributed queries. | ||
| // TODO: Once https://github.com/apache/datafusion/pull/18303 is merged, this test will lose | ||
| // meaning, since the PR above will mask the underlying problem. Different queries or | ||
| // a new approach must be used in this case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just check for duplicate column names directly then? We could make a MemoryExec or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can look into that as a follow-up? I want to get the remaining issues with distributed activation first, this can be a side-quest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure yes we'll see what the maintainers here think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine. It's still a good test. If you could file and issue and put the number in the todo, that would help us track it :)
jayshrivastava
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for this
|
|
||
| // Print them out, the error message from `assert_eq` is otherwise hard to read. | ||
| println!("{}", expected_result); | ||
| println!("{}", actual_result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It would be preferable to only do this on failure. I think it's fine if we do
if actual_result.to_string() != expected_result {
println!("{}", expected_result);
println!("{}", actual_result);
panic!(...)
}
| /// Test that multiple first_value() aggregations work correctly in distributed queries. | ||
| // TODO: Once https://github.com/apache/datafusion/pull/18303 is merged, this test will lose | ||
| // meaning, since the PR above will mask the underlying problem. Different queries or | ||
| // a new approach must be used in this case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine. It's still a good test. If you could file and issue and put the number in the todo, that would help us track it :)
gabotechs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 makes sense. Thanks!
This test ensures #205 has the desired effect on distributed execution. I also snuck in a few docs and a single typo fix to make up for the ones I am adding 😬 .
It has a bit of half-life, as the fixes from apache/datafusion#18303 will mask the issue again, but for now, it is better than nothing.