- 
                Notifications
    You must be signed in to change notification settings 
- Fork 664
Add support for PostgreSQL/Redshift geometric operators #1723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
        
          
                src/ast/operator.rs
              
                Outdated
          
        
      | /// `+` Addition, also Translation (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' + point '(2.0,0)'` | ||
| Plus, | ||
| /// Minus, e.g. `a - b` | ||
| /// `-` Subtraction, Translation (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' - point '(2.0,0)'` | ||
| Minus, | ||
| /// Multiply, e.g. `a * b` | ||
| /// `*` Multiplication, also Scaling (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' * point '(2.0,0)'` | ||
| Multiply, | ||
| /// Divide, e.g. `a / b` | ||
| /// `/` Division, also Scaling (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(2,2))' / point '(2.0,0)'` | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we skip this diff? I'm thinking since these operators are fundamental and apply to various scenarios, no need to mention specific dialect semantics given neither the parser nor AST handles them specially as a result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/ast/operator.rs
              
                Outdated
          
        
      | /// `#` Number of points in path or polygon (PostgreSQL/Redshift geometric operator), e.g. `# path '((1,0),(0,1),(-1,0))'` | ||
| NumOfPoints, | ||
| /// `@-@` Length or circumference (PostgreSQL/Redshift geometric operator), e.g. `@-@ path '((0,0),(1,0))'` | ||
| LenOfCircumference, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the doc says length or circumference, is the Of a typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a typo, fixed
        
          
                src/ast/operator.rs
              
                Outdated
          
        
      | PGAbs, | ||
| /// Unary logical not operator: e.g. `! false` (Hive-specific) | ||
| BangNot, | ||
| /// `#` Number of points in path or polygon (PostgreSQL/Redshift geometric operator), e.g. `# path '((1,0),(0,1),(-1,0))'` | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we include a link to the docs where the operators are described? that would be helpful for folks that come across it and need to find more context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/ast/operator.rs
              
                Outdated
          
        
      | // `#` PostgreSQL Bitwise XOR, also Intersection (PostgreSQL/Redshift geometric operator), e.g. `box '((1,-1),(-1,1))' # box '((1,1),(-2,-2))'` | ||
| PGBitwiseXor, | ||
| /// Bitwise shift left, e.g. `a << b` (PostgreSQL-specific) | ||
| /// `<<` PostgreSQL Bitwise Shift Left, also Left of? (PostgreSQL/Redshift geometric operator), e.g. circle '((0,0),1)' << circle '((5,0),1)' | ||
| PGBitwiseShiftLeft, | ||
| /// Bitwise shift right, e.g. `a >> b` (PostgreSQL-specific) | ||
| /// `>>` PostgreSQL Bitwise Shift Right, Right of? (PostgreSQL/Redshift geometric operator), e.g. circle '((5,0),1)' >> circle '((0,0),1)' | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually same with these, I think just mentioning the operator suffices, without going into semantics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/ast/operator.rs
              
                Outdated
          
        
      | /// <https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#overlaps-predicate> | ||
| Overlaps, | ||
| /// `##` Point of closest proximity (PostgreSQL/Redshift geometric operator), e.g. `point '(0,0)' ## lseg '((2,0),(0,2))'` | ||
| PointOfClosestProximity, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm actually thinking about it, I feel like the naming of the operators might become problematic, e.g. if another dialect supports ## operator then calling it PointOfClosestProximity would be confusing vs if the operator was called DoubleHash dialect agnostic. Would it make sense to name the operators in such a manner? i.e. by their characters or if theres a well known description that isnt describing the pg behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/parser/mod.rs
              
                Outdated
          
        
      | if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) | ||
| && GEOMETRIC_TYPES.contains(&w.keyword) | ||
| { | ||
| self.parse_geometric_type(w.keyword) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this be its own branch within parse_expr_prefix_by_reserved_word instead?
It looks like we should be able to do something like
kw if self.dialect.supports_pg_geometric_types() && PG_GEOMETRIC_TYPES.contains(kw) => {
    self.parse_geometric_expr(kw)
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/parser/mod.rs
              
                Outdated
          
        
      | let token_with_span = self.next_token(); // Get the full TokenWithSpan | ||
| let value = match token_with_span.token { | ||
| // Extract the Token field | ||
| Token::SingleQuotedString(s) => Value::SingleQuotedString(s), | ||
| _ => return self.expected("SingleQuotedString", token_with_span), // Pass full TokenWithSpan | ||
| }; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| let token_with_span = self.next_token(); // Get the full TokenWithSpan | |
| let value = match token_with_span.token { | |
| // Extract the Token field | |
| Token::SingleQuotedString(s) => Value::SingleQuotedString(s), | |
| _ => return self.expected("SingleQuotedString", token_with_span), // Pass full TokenWithSpan | |
| }; | |
| let token_with_span = self.next_token(); | |
| let value = match token_with_span.token { | |
| Token::SingleQuotedString(s) => Value::SingleQuotedString(s), | |
| _ => return self.expected("SingleQuotedString", token_with_span), | |
| }; | 
thinking we can drop the comments in this case since the code is as descriptive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/parser/mod.rs
              
                Outdated
          
        
      | | tok @ Token::AtAt | ||
| | tok @ Token::QuestionMarkDash | ||
| | tok @ Token::QuestionPipe | ||
| if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) => | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we replace these with a dialect method? the dialect_of has been deprecated, we can introduce a method like supports_pg_geometric_types() or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/test_utils.rs
              
                Outdated
          
        
      | pub fn only_psql_redshift() -> TestedDialects { | ||
| TestedDialects { | ||
| dialects: vec![ | ||
| Box::new(PostgreSqlDialect {}), | ||
| Box::new(RedshiftSqlDialect {}), | ||
| ], | ||
| options: None, | ||
| recursion_limit: None, | ||
| } | ||
| } | ||
|  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by switching to the dialect method, it would be possible to reuse all_dialects_where in place of this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the code to use all_dialects_where as to your suggestion
        
          
                tests/sqlparser_postgres.rs
              
                Outdated
          
        
      | //"&<", // (is strictly to the left of) // Same as OverlapsToLeft, causes duplicate failure | ||
| //"&>", // (is strictly to the right of) // Same as OverlapsToRight, causes duplicate failure | ||
| "|=|", // distance between A and B trajectories at their closest point of approach | ||
| "<<#>>", // n-D distance between A and B bounding boxes | ||
| "|>>", // A's bounding box is strictly above B's. | ||
| "~=", // bounding box is the same | ||
| //"|>>", // A's bounding box is strictly above B's. // Same as IsStrictlyAbove, causes duplicate failure | ||
| //"~=", // bounding box is the same // Same as SameAs, causes duplicate failure | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I followed why these are commented out, what do we mean by causes duplicate failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commented-out operators are already defined under different names in the code,
so they can no longer be considered custom operators.
If we include these again under their original symbols, it causes a duplicate match failure.
so I have now completely removed them.
b297ccb    to
    01f6646      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @benrsatori! Left some comments, the changes look good to me overall
| fn supports_array_typedef_size(&self) -> bool { | ||
| false | ||
| } | ||
| /// Returns true if the dialect supports geometric types. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a link to the postgres docs here (and maybe an example sql)? it would help folks with context on what we mean by geometric types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/parser/mod.rs
              
                Outdated
          
        
      | Token::QuestionAnd => Some(BinaryOperator::QuestionAnd), | ||
| Token::QuestionPipe => Some(BinaryOperator::QuestionPipe), | ||
| Token::CustomBinaryOperator(s) => Some(BinaryOperator::Custom(s.clone())), | ||
| Token::DoubleSharp if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) => { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for these clauses as well could we use the introduced self.dialect.supports_geometric_types()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/tokenizer.rs
              
                Outdated
          
        
      | _ => self.start_binop(chars, "||", Token::StringConcat), | ||
| } | ||
| } | ||
| Some('&') if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) => { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly for these clauses it would be nice to use the self.dialect.supports_geometric_types() method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| Parenthesis(Box<DataType>), | ||
| } | ||
|  | ||
| #[derive(Debug, Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)] | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a short description to this struct with a link to the docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
        
          
                src/ast/operator.rs
              
                Outdated
          
        
      | BangNot, | ||
| /// `#` Number of points in path or polygon (PostgreSQL/Redshift geometric operator) | ||
| /// see <https://www.postgresql.org/docs/9.5/functions-geometry.html> | ||
| NumOfPoints, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do similar to the binary operator where we name them according to their symbols to be dialect agnostic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| /// | ||
| /// [bigquery]: https://cloud.google.com/bigquery/docs/user-defined-functions#templated-sql-udf-parameters | ||
| AnyType, | ||
| /// geometric type | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we include a link to the postgres docs here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
9c62ffc    to
    fdc90cf      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @benrsatori!
cc @alamb
| @benrsatori could you take a look at the CI failures? | 
| Hi @iffyio | 
| 🚀 | 
Add support for PostgreSQL and Redshift geometric operators.