Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This is a follow-on to #1273.
There are a number of places where it would be convenient to pass SQL strings as expressions. For example, it would be nice to do
df.select(
"a",
"a - b",
col("c"),
)
This should intuitively be understood as column a, followed by col("a") - col("b"), followed by column c.
Describe the solution you'd like
Using the SQL parsing available on the DataFrame, make the functions listed under "Additional context" accept SQL strings. We must be very careful not to break existing behavior, in particular cases where users have a column name that is not parseable as SQL.
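One possible way to protect those existing column names is to check the schema before attempting any SQL parsing. The sketch below is only an illustration of that ordering, not the library's implementation: `resolve_arg`, `to_column`, and `parse_sql` are hypothetical names, and the two callables stand in for whatever the DataFrame would actually use to build a column reference and to parse a SQL expression.

```python
from typing import Any, Callable, Iterable


def resolve_arg(
    arg: Any,
    column_names: Iterable[str],
    to_column: Callable[[str], Any],
    parse_sql: Callable[[str], Any],
) -> Any:
    """Turn one argument into an expression (hypothetical helper).

    Order of checks:
    1. Non-string arguments are assumed to already be expressions.
    2. A string that exactly matches an existing column name is treated
       as that column, even if it would not parse as SQL.
    3. Any other string is handed to the SQL expression parser.
    """
    if not isinstance(arg, str):
        return arg
    if arg in set(column_names):
        return to_column(arg)
    return parse_sql(arg)


if __name__ == "__main__":
    # Stand-ins for real expression objects, for illustration only.
    columns = ["a", "b", "c", "my col"]
    as_column = lambda name: ("column", name)
    as_sql = lambda text: ("sql", text)

    for arg in ["a", "a - b", "my col"]:
        print(arg, "->", resolve_arg(arg, columns, as_column, as_sql))
```

With this ordering, a column literally named "my col" keeps working as before, while "a - b" only reaches the SQL parser because no column has that exact name.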
Describe alternatives you've considered
Status quo
Additional context
DataFrame functions to update:
- select
- remove select_exprs
- with_column
- with_columns
- aggregate
- repartition_by_hash
We do not want to apply this treatment to joins because there is no easy way to know which DataFrame to perform the SQL parsing against.