Enhance the nested type access for Generic and DuckDB dialect #1541
Closed

5 commits
221b9dc  extract `support_period_map_access_key` config (goldmedal)
2778211  handle the chain of the subscript and map accesses for generic and du… (goldmedal)
cd7b567  Merge branch 'main' into feature/1533-dereference-expr-2 (goldmedal)
a4a5448  fix the doc test (goldmedal)
dc5e540  fix doc (goldmedal)

Ah, regarding your comment: I'm thinking it makes sense to merge the subscript behavior/representation into `MapAccess` already in this PR? That looks like it would resolve both issues, whereas adding a new dialect flag and extending the two code paths seems to compound the problem.

Thanks @iffyio. Indeed, if we merge them in this PR, we can fix many things. It could be a big refactor 🤔
I have two candidate proposals for it:
1. Merge `Subscript` into `MapAccess` and rename `MapAccess`
   - Remove `Expr::Subscript` and add a new `MapAccessSyntax::Slice` for `[1:5]`
   - Rename SQL `MapAccess` to `ElementAccess` for the element access of `Map` and `Array`.
2. Remove `MapAccess` and integrate with `CompositeAccess`
   - `CompositeAccess` is a syntax structure for `expr1.expr2`. I think we can use it to represent the period map access.
   - We can use `CompositeAccess` and `Subscript` to represent an access chain like `expr1.expr2[1].expr3`
   - Then, we don't need `MapAccess` for the chain.

What do you think? Which one do you prefer?
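
To make option 2 concrete, here is a minimal sketch of how `expr1.expr2[1].expr3` could nest when only `CompositeAccess` and `Subscript` are used. The `Expr` enum below is a toy stand-in written for this illustration, not sqlparser's real `Expr`; its field names and the `example_chain` helper are assumptions.

```rust
use std::fmt;

// Toy AST for illustration only: NOT sqlparser's real `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A bare identifier such as `expr1`.
    Identifier(String),
    /// An integer literal used as an index.
    Number(i64),
    /// `expr.key`, the period access discussed in option 2.
    CompositeAccess { expr: Box<Expr>, key: String },
    /// `expr[index]`.
    Subscript { expr: Box<Expr>, index: Box<Expr> },
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Identifier(name) => write!(f, "{name}"),
            Expr::Number(n) => write!(f, "{n}"),
            Expr::CompositeAccess { expr, key } => write!(f, "{expr}.{key}"),
            Expr::Subscript { expr, index } => write!(f, "{expr}[{index}]"),
        }
    }
}

/// Builds the nested tree for `expr1.expr2[1].expr3` (hypothetical helper).
/// The outermost node is the last access in the chain.
fn example_chain() -> Expr {
    let dot = Expr::CompositeAccess {
        expr: Box::new(Expr::Identifier("expr1".to_string())),
        key: "expr2".to_string(),
    };
    let sub = Expr::Subscript {
        expr: Box::new(dot),
        index: Box::new(Expr::Number(1)),
    };
    Expr::CompositeAccess {
        expr: Box::new(sub),
        key: "expr3".to_string(),
    }
}
```

Calling `example_chain().to_string()` renders the chain back to SQL text, which shows that the two variants alone are enough to round-trip this particular shape.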

The first option, merging `Subscript` into `MapAccess`, sounds reasonable! I'm thinking we could skip the rename, at least to start with, to keep the breakage minimal. I imagine it shouldn't be as large a change in that case, wdyt?

I think removing `Subscript` causes more breakage than removing `MapAccess`. 🤔
Do you know how many downstream projects use `MapAccess`? I found that DataFusion hasn't implemented it, but `Subscript` is used to handle array syntax. I'm not sure which one is better.
I drafted a version for option 2: #1551.
It still has some issues with `Expr::Method` parsing, but I think it preserves `Subscript` and avoids a significant breaking change for downstream projects.

By the way, #1551 can also support the syntax like

I prefer to keep the nested representation for the following reasons:
- … `a[1]`) and maps (`a['field']`).
- With `Subscript` and `CompositeAccess` (and possibly `Method`, if needed 🤔), we can cover the entire syntax of access chains without requiring users to introduce additional `Expr`. This makes the SQL syntax more stable. Some examples include:

I was primarily thinking it would be easier and more efficient to traverse the AST if the above variants are expressed as a chain without nesting. Currently, both the parser and downstream crates tend to struggle with the recursion when there's a lot of nesting in large or complex SQL.
Offhand, I'm not sure I have a full picture of either approach, though. It's not super clear to me what the disadvantage of the linear form would be, or whether there are advantages to using a nested representation.
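
The traversal concern can be illustrated with a small sketch. It models the same subscript chain in a nested and in a flat form and measures each one. All type and function names here (`Nested`, `Linear`, `nested_depth`, `linear_len`, `build`) are hypothetical and exist only for this comparison; they are not sqlparser APIs.

```rust
/// Nested form: each access wraps the expression it applies to.
#[allow(dead_code)]
enum Nested {
    Ident(String),
    Subscript { base: Box<Nested>, index: i64 },
}

/// Linear form: one root plus a flat list of indexes.
#[allow(dead_code)]
struct Linear {
    root: String,
    indexes: Vec<i64>,
}

/// Walking the nested form recurses once per access,
/// so the call stack grows with the length of the chain.
fn nested_depth(e: &Nested) -> usize {
    match e {
        Nested::Ident(_) => 0,
        Nested::Subscript { base, .. } => 1 + nested_depth(base),
    }
}

/// The linear form needs no recursion: the chain length is just the list length.
fn linear_len(e: &Linear) -> usize {
    e.indexes.len()
}

/// Builds `a[0][1]...[n-1]` in both forms.
fn build(n: i64) -> (Nested, Linear) {
    let mut nested = Nested::Ident("a".to_string());
    for i in 0..n {
        nested = Nested::Subscript { base: Box::new(nested), index: i };
    }
    let linear = Linear { root: "a".to_string(), indexes: (0..n).collect() };
    (nested, linear)
}
```

Both forms carry the same information. The difference is that any visitor over `Nested` needs one stack frame per access, while a visitor over `Linear` can use a plain loop.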

It's just my assumption. If I'm wrong, please correct me.
I mean if we accept the nested representation, we don't need to change `Subscript` and won't make breaking changes to the array or map syntax. I'm not sure about other downstream projects. At least, DataFusion won't be broken. If it would cause a big breaking change, I think we should reconsider it 🤔

When working on this issue, I found we implemented many different `Expr` variants for similar purposes (access chains). For example:
- `MapAccess` for `a[1][2][3]` or `a[1].b.c[3]`
- `Subscript` for `a[1]` or `a[1][2]`, ...
- `JsonAccess` for `a.b[0].c`, ...
- `CompoundIdentifier` for `a.b.c`
- `CompositeAccess` for `( .. ).a`
- `Method` for `func1().func2().func3()`

They are the same thing, with different combinations and orderings, and maybe for various dialects. I think it also means we have various code paths for the same purpose.
I hope to use some basic components to represent all of them.
I think it's possible to use a linear representation to do it, but it could be a huge breaking change.
It makes sense to me. Indeed, the complex nested representation is a potential issue for performance or usage 🤔
I tried to draft a new linear representation like:
I think it can cover many cases of the access chain. I'm not sure about naming, but I'd prefer not to keep using `Expr::Subscript` or `Expr::MapAccess` because they have taken on a different meaning. I prefer to remove both of them.

What do you think?
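
The drafted representation is not reproduced in this copy of the thread. Purely as an illustration of the general idea, and not as the author's draft, a flat "root plus steps" shape could look like the sketch below. `AccessStep`, `AccessChain`, and `chain` are hypothetical names, and roots and indexes are kept as plain strings to keep the example small.

```rust
use std::fmt;

/// One step in an access chain (hypothetical names throughout).
#[derive(Debug, Clone, PartialEq)]
enum AccessStep {
    /// `.field`
    Dot(String),
    /// `[index]`, with the index kept as raw text
    Subscript(String),
    /// `.name()`, a method call without arguments
    Method(String),
}

/// A root expression followed by a flat list of steps.
#[derive(Debug, Clone, PartialEq)]
struct AccessChain {
    root: String,
    steps: Vec<AccessStep>,
}

impl fmt::Display for AccessChain {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.root)?;
        // A single loop renders every step; no recursion is involved.
        for step in &self.steps {
            match step {
                AccessStep::Dot(name) => write!(f, ".{name}")?,
                AccessStep::Subscript(index) => write!(f, "[{index}]")?,
                AccessStep::Method(name) => write!(f, ".{name}()")?,
            }
        }
        Ok(())
    }
}

/// Convenience constructor used in the examples.
fn chain(root: &str, steps: Vec<AccessStep>) -> AccessChain {
    AccessChain { root: root.to_string(), steps }
}
```

With this shape, `a[1].b.c[3]`, `a.b.c`, and `func1().func2().func3()` differ only in the sequence of steps, which is the "same thing, different combinations and orderings" observation from the list above.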

Nice, yeah, I think the `CompoundExpr` example for representing the different variants makes a lot of sense!

Thanks. I'll follow this design in #1551. Let's close this PR 👍