-
Notifications
You must be signed in to change notification settings - Fork 666
Add support for TABLESAMPLE #1580
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
789e1db to
84b58d7
Compare
src/ast/query.rs
Outdated
| sample: Option<Box<TableSample>>, | ||
| /// Position of the table sample modifier in the table factor. Default is after the table alias | ||
| /// e.g. `SELECT * FROM tbl t TABLESAMPLE (10 ROWS)`. See `Dialect::supports_table_sample_before_alias`. | ||
| sample_before_alias: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use an enum here? e.g
enum TableSampleKind {
BeforeTableAlias(Box<TableSample>)
// ...
}
sample: Option<TableSampleKind>thinking that avoids the surplus flag on the table factor and would lend itself to be extensible if required later on
src/parser/mod.rs
Outdated
| self.expect_token(&Token::RParen)?; | ||
| TableSample::Bucket(TableSampleBucket { bucket, total, on }) | ||
| } else { | ||
| let value = match self.try_parse(|p| p.parse_number_value()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the try_parse rather be maybe_parse (we seem to be ignoring the returned error otherwise)?
src/parser/mod.rs
Outdated
| let value = match self.maybe_parse(|p| p.parse_number_value()) { | ||
| Ok(Some(num)) => num, | ||
| _ => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| let value = match self.maybe_parse(|p| p.parse_number_value()) { | |
| Ok(Some(num)) => num, | |
| _ => { | |
| let value = match self.maybe_parse(|p| p.parse_number_value())? { | |
| Some(num) => num, | |
| None => { |
I think this is usually the recursion limit or similar fatal error we can propagate
src/parser/mod.rs
Outdated
| if self.peek_token().token == Token::RParen | ||
| && !self.dialect.supports_implicit_table_sample_method() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this could be simplified as if self.dialect.supports_implicit_table_sample_method() && self.consume_token(Token::RParen) it would also let us skip the expect_token that follows as well?
src/parser/mod.rs
Outdated
| repeatable: seed, | ||
| }) | ||
| // Try to parse without an explicit table sample method keyword | ||
| } else if self.peek_token().token == Token::LParen { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } else if self.peek_token().token == Token::LParen { | |
| } else if self.consume_token(Token::LParen) { |
tests/sqlparser_snowflake.rs
Outdated
| "SELECT * FROM testtable SAMPLE (10)", | ||
| "SELECT * FROM testtable TABLESAMPLE BERNOULLI (10)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the scenarios that currently rely on one_statement_parse_to, could we represent them faithfully when displaying? e.g this and the ROW vs BERNOULLI variants etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I understand the question, but if you mean whether the variants I added as interchangeable are really interchangeable in the dialect, as far as I understand yes. If not, please elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah so I was rather thinking that ideally we preserve the syntax roundtrip, even though they'r interchangeable in some dialects
snowflake_and_generic().verified_stmt("SELECT * FROM testtable TABLESAMPLE ROW (20.3)");
snowflake_and_generic().verified_stmt("SELECT * FROM testtable SAMPLE BLOCK (3) SEED (82)");|
@iffyio please see latest commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @yoavcloud! Left a comment to distinguish sample vs tablesample. Otherwise this looks good to me!
tests/sqlparser_snowflake.rs
Outdated
| snowflake_and_generic().one_statement_parses_to( | ||
| "SELECT * FROM testtable SAMPLE (10)", | ||
| "SELECT * FROM testtable TABLESAMPLE (10)", | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh could we do the same here as well? clickhouse for example doesn't have has SAMPLE but not TABLESAMPLE so that the roundtrip stays correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added support for clickhouse as well, with a different approach to how to model the it in the AST
a0fdfbf to
ef04565
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @yoavcloud
cc @alamb
|
Thanks @yoavcloud and @iffyio |
This PR adds support for the
TABLESAMPLEoption in the following dialects:Collateral work includes expanding the use of a constructor function to create
Tablestructs in unit tests to avoid modifying many files when adding default options to theTablestruct.