Hi,
I recently ran into an issue while using dplyr::semi_join with ClickHouse. The SQL that dbplyr generates by default uses a correlated subquery (a subquery that references columns of the outer query), and this doesn't seem to be supported in ClickHouse (or am I wrong?). However, I noticed that ClickHouse does support LEFT SEMI JOIN, so I wrote the following function to address this:
#' @export
#' @importFrom dbplyr sql_query_semi_join
sql_query_semi_join.ClickhouseConnection <- function(con, x, y, anti, by, where, vars, ..., lvl = 0) {
  # Render both inputs as aliased subqueries
  x <- dbplyr:::dbplyr_sql_subquery(con, x, name = by$x_as, lvl = lvl)
  y <- dbplyr:::dbplyr_sql_subquery(con, y, name = by$y_as, lvl = lvl)
  # Build the join conditions for the ON clause
  on <- dbplyr:::sql_join_tbls(con, by)
  # `anti` is a single logical, so use if/else; ifelse() would drop the sql class
  JOIN <- if (anti) dplyr::sql("ANTI LEFT JOIN") else dplyr::sql("SEMI LEFT JOIN")
  # Wrap with SELECT since callers assume a valid query is returned
  clauses <- list(
    dbplyr:::sql_clause_select(con, vars),
    dbplyr:::sql_clause_from(x),
    dbplyr:::sql_clause(JOIN, y),
    dbplyr:::sql_clause("ON", on, sep = " AND", parens = TRUE, lvl = 1)
  )
  dbplyr:::sql_format_clauses(clauses, lvl, con)
}
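For reference, the default translation can be inspected without a ClickHouse server. This is only a minimal sketch: it assumes dbplyr's `simulate_dbi()` and `lazy_frame()` helpers, and the tables `orders` and `customers` are made up for illustration. Run it in a fresh session, before the method above is defined, to see the query that dbplyr produces by default:

```r
library(dplyr)
library(dbplyr)

# Simulated connection carrying the ClickHouse connection class;
# nothing is sent to a real server
con <- simulate_dbi("ClickhouseConnection")

# Hypothetical tables, used only to render SQL
orders    <- lazy_frame(id = 1L, customer_id = 1L, con = con, .name = "orders")
customers <- lazy_frame(customer_id = 1L, con = con, .name = "customers")

# Build the semi join lazily and print the SQL dbplyr would generate
q <- semi_join(orders, customers, by = "customer_id")
show_query(q)
```

The same call with `anti_join()` shows the query that the `anti = TRUE` branch is meant to replace.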
I'm aware that the function above relies on several internal dbplyr functions (accessed via `:::`), and I'm not sure whether that is acceptable. Could someone give me some direction on how to refine this function for a potential PR?
Thank you in advance.