Allow setting the recursion limit for sql parsing #14756
I see that a recursion limit of 50 is introduced for SQL parsing.
@jatin510 This number is actually the default in So it will essentially be the exact same behaviour, but configurable.
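I'll note that we encountered this when parsing a dynamically generated expression containing a lot of ANDs and ORs. I believe these sorts of expressions are parsed recursively, so it's pretty easy to hit 50 levels of recursion if each AND is a level.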
alamb left a comment
Thanks @cetra3 -- this change makes sense to me. Thank you 🙏
To merge this PR I think we should:
- try and improve the API (I left comments about this)
- Add a test (so that if we break this in the future accidentally we'll know)
A good test would be a sqllogictest that runs a query that passes, then cranks the limit down to something low, runs the same query again, and shows that it errors.
I'll note that we encountered this when parsing a dynamically generated expression containing a lot of ANDs and ORs. I believe these sorts of expressions are parsed recursively, so it's pretty easy to hit 50 levels of recursion if each AND is a level.
@adriangb I think you are saying that this is a good change, right?
let recursion_limit = self.config.options().sql_parser.recursion_limit;

let mut statements = DFParser::parse_sql_with_dialect_limit(
The API for DFParser is already somewhat tough.
Rather than adding a new method here, could you make this builder style instead, so this would look something like this?
- let mut statements = DFParser::parse_sql_with_dialect_limit(
+ let mut statements = DFParser::new_with_dialect(sql, dialect.with_ref())
+     .with_recursion_limit(recursion_limit)
+     .parse_statements()
I've adjusted the PR to include a new DFParserBuilder that accomplishes this, but I've tried to keep the original methods as close as possible for backwards compatibility.
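To illustrate the builder style discussed in this thread, here is a minimal sketch using only the standard library. `ToyParser`, `ToyParserBuilder`, `with_dialect`, `build`, and the `"generic"` dialect string are hypothetical names made up for this example; only `with_recursion_limit` and the default of 50 come from the discussion, and this is not the actual `DFParserBuilder` code.

```rust
/// Default taken from the discussion above (same value as `sqlparser`).
const DEFAULT_RECURSION_LIMIT: usize = 50;

/// Hypothetical stand-in for a configured parser; it only records its settings.
#[derive(Debug, PartialEq)]
struct ToyParser {
    sql: String,
    dialect: String,
    recursion_limit: usize,
}

/// Hypothetical builder: every option has a default, and each `with_*`
/// method overrides one option and returns the builder for chaining.
struct ToyParserBuilder<'a> {
    sql: &'a str,
    dialect: &'a str,
    recursion_limit: usize,
}

impl<'a> ToyParserBuilder<'a> {
    /// Start from the SQL text with default settings.
    fn new(sql: &'a str) -> Self {
        Self {
            sql,
            dialect: "generic", // assumed placeholder default
            recursion_limit: DEFAULT_RECURSION_LIMIT,
        }
    }

    /// Override the dialect.
    fn with_dialect(mut self, dialect: &'a str) -> Self {
        self.dialect = dialect;
        self
    }

    /// Override the recursion limit.
    fn with_recursion_limit(mut self, recursion_limit: usize) -> Self {
        self.recursion_limit = recursion_limit;
        self
    }

    /// Consume the builder and produce the configured parser.
    fn build(self) -> ToyParser {
        ToyParser {
            sql: self.sql.to_string(),
            dialect: self.dialect.to_string(),
            recursion_limit: self.recursion_limit,
        }
    }
}
```

A caller that only cares about the limit would write `ToyParserBuilder::new(sql).with_recursion_limit(10).build()`. Adding another option later means adding one more `with_*` method rather than another constructor variant.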
datafusion/sql/src/parser.rs
Outdated
}

/// Same as `sqlparser`
const DEFAULT_RECURSION_LIMIT: usize = 50;
I verified this is the same default:
https://github.com/apache/datafusion-sqlparser-rs/blob/648efd7057d63c65b53eddc3d05cc89d5697d85c/src/parser/mod.rs#L187
Yes, I was just explaining why this is necessary - it's not immediately obvious how parsing SQL can be so recursive.
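As a rough illustration of how a recursive parser can run into a depth limit, here is a toy recursive-descent parser with a depth budget, written against the standard library only. Everything in it (`ToyParser`, `ParseError`, `nested`, `parse`, the grammar) is hypothetical and is not how `sqlparser` or `DFParser` are implemented. In this toy grammar only parentheses add a level; how a real parser counts a chain of ANDs depends on how it handles binary operators.

```rust
/// Errors the toy parser can report.
#[derive(Debug, PartialEq)]
enum ParseError {
    RecursionLimitExceeded,
    /// Unexpected token (or end of input) at this token index.
    Unexpected(usize),
}

/// Toy parser over whitespace-separated tokens, e.g. `( ( a AND b ) OR c )`.
struct ToyParser<'a> {
    tokens: Vec<&'a str>,
    pos: usize,
    remaining_depth: usize,
}

impl<'a> ToyParser<'a> {
    fn new(input: &'a str, recursion_limit: usize) -> Self {
        Self {
            tokens: input.split_whitespace().collect(),
            pos: 0,
            remaining_depth: recursion_limit,
        }
    }

    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }

    /// expr := term (("AND" | "OR") term)*
    /// Each nested call spends one unit of the depth budget and gives it
    /// back on return, so the budget bounds nesting, not input length.
    fn parse_expr(&mut self) -> Result<(), ParseError> {
        if self.remaining_depth == 0 {
            return Err(ParseError::RecursionLimitExceeded);
        }
        self.remaining_depth -= 1;
        let result = self.parse_expr_inner();
        self.remaining_depth += 1;
        result
    }

    fn parse_expr_inner(&mut self) -> Result<(), ParseError> {
        self.parse_term()?;
        while matches!(self.peek(), Some("AND") | Some("OR")) {
            self.pos += 1;
            self.parse_term()?;
        }
        Ok(())
    }

    /// term := "(" expr ")" | identifier
    fn parse_term(&mut self) -> Result<(), ParseError> {
        match self.peek() {
            Some("(") => {
                self.pos += 1;
                self.parse_expr()?; // recursion happens here
                if self.peek() == Some(")") {
                    self.pos += 1;
                    Ok(())
                } else {
                    Err(ParseError::Unexpected(self.pos))
                }
            }
            Some(tok) if !matches!(tok, ")" | "AND" | "OR") => {
                self.pos += 1;
                Ok(())
            }
            _ => Err(ParseError::Unexpected(self.pos)),
        }
    }
}

/// Build an expression with `depth` levels of parentheses.
fn nested(depth: usize) -> String {
    let mut s = String::from("x");
    for _ in 0..depth {
        s = format!("( {} AND y )", s);
    }
    s
}

/// Parse the whole input with the given recursion limit.
fn parse(input: &str, recursion_limit: usize) -> Result<(), ParseError> {
    let mut parser = ToyParser::new(input, recursion_limit);
    parser.parse_expr()?;
    if parser.pos == parser.tokens.len() {
        Ok(())
    } else {
        Err(ParseError::Unexpected(parser.pos))
    }
}
```

With this sketch, `parse(&nested(30), 50)` succeeds, while the same input with a limit of 10 returns `RecursionLimitExceeded`. That is the same shape as the test suggested earlier: run an input that passes, lower the limit, and check that the same input now errors.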
@alamb I've added a test and a builder struct. Let me know if you want further changes.
alamb left a comment
Thank you @cetra3
In order to try and minimize downstream API breakages, following https://datafusion.apache.org/library-user-guide/api-health.html, I think we should:
- put back DFParser::new() and DFParser::new_with_dialect, marked as deprecated
- add some doc strings and examples to DFParserBuilder
Here is a proposed PR with those changes to your fork:
Add some examples, restore old APIs and deprecate
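The "restore and deprecate" approach can be sketched as follows: the old constructor stays, carries a `#[deprecated]` attribute, and delegates to the builder so both paths behave the same. `SketchParser`, `SketchParserBuilder`, and `parse_via_legacy_api` are hypothetical names for illustration and do not reflect the contents of the proposed PR.

```rust
const DEFAULT_RECURSION_LIMIT: usize = 50;

/// Hypothetical parser type that only records its configuration.
#[derive(Debug, PartialEq)]
struct SketchParser {
    sql: String,
    recursion_limit: usize,
}

/// Hypothetical builder that is the preferred entry point.
struct SketchParserBuilder {
    sql: String,
    recursion_limit: usize,
}

impl SketchParserBuilder {
    fn new(sql: &str) -> Self {
        Self {
            sql: sql.to_string(),
            recursion_limit: DEFAULT_RECURSION_LIMIT,
        }
    }

    fn with_recursion_limit(mut self, recursion_limit: usize) -> Self {
        self.recursion_limit = recursion_limit;
        self
    }

    fn build(self) -> SketchParser {
        SketchParser {
            sql: self.sql,
            recursion_limit: self.recursion_limit,
        }
    }
}

impl SketchParser {
    /// Old entry point, kept so existing callers still compile.
    /// Callers get a compiler warning pointing them at the builder.
    #[deprecated(note = "use SketchParserBuilder instead")]
    fn new(sql: &str) -> Self {
        SketchParserBuilder::new(sql).build()
    }
}

/// Simulates downstream code that has not migrated yet.
#[allow(deprecated)]
fn parse_via_legacy_api(sql: &str) -> SketchParser {
    SketchParser::new(sql)
}
```

Because the deprecated constructor delegates to the builder, downstream code keeps compiling with a warning, and the defaults live in one place.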
Thanks again @cetra3
Which issue does this PR close?
No issue, just running into this in production.
Rationale for this change
At the moment there isn't a clean way to set the recursion limit on the sql parser.
What changes are included in this PR?
This PR adds a config option, datafusion.sql_parser.recursion_limit, which allows overriding the recursion limit.
Are these changes tested?
No, it's just a config option.
Are there any user-facing changes?
There will be some changes to the public-facing API. I'm also not sure if this is the best approach here, as DFParser might need a deeper refactor to keep things clean.