-
Notifications
You must be signed in to change notification settings - Fork 3
Closed
Description
I think I found another edge case. Below are two contrived examples for cases where I first want to extract some data/variable from dt1 and merge it into dt2. Then I want to do something with that variable in dt2 (e.g., define another variable) and merge the result back to the original main table dt1.
using DataFrames
using TidierData
import TidierDB as DB
df1 = DataFrame(id = [string('A' + i ÷ 26, 'A' + i % 26) for i in 0:9],
value0 = [i % 2 == 0 ? "aa" : "bb" for i in 1:10],
value1 = repeat(1:5, 2))
df2 = DataFrame(id = df1.id, value2 = rand(10))
db = DB.connect(DB.duckdb())
DB.copy_to(db, df1, "t1");
DB.copy_to(db, df2, "t2");
dt1 = @chain DB.dt(db, "t1") begin
DB.@group_by(value0)
DB.@summarize(id = first(id), value1 = sum(value1))
end
dt2 = @chain DB.dt(db, "t2") begin
DB.@inner_join(DB.t(dt1), id == id)
DB.@distinct()
end
df_merged = @chain DB.dt(db, "t1") begin
DB.@select(id, value0)
DB.@inner_join(dt2, id == id)
@aside DB.@show_query _
DB.@collect()
end
WITH j1j1cte_1 AS (
SELECT value0, first(id) AS id, SUM(value1) AS value1
FROM t1
GROUP BY value0),
j1cte_1 AS (
SELECT DISTINCT value2, value0, value1
FROM t2
INNER JOIN j1cte_1 ON t2.id = j1cte_1.id),
j1cte_2 AS (
SELECT *
FROM j1cte_1),
cte_1 AS (
SELECT COALESCE(t1.id, j1cte_2.id) AS id, t1.value0, j1cte_2.value2, j1cte_2.value0, j1cte_2.value1
FROM t1
INNER JOIN j1cte_2 ON t1.id = j1cte_2.id)
SELECT *
FROM cte_1
ERROR: Binder Error: There is a WITH item named "j1cte_1", but it cannot be referenced from this part of the query.
Stacktrace:
....df1 = DataFrame(id = [string('A' + i ÷ 26, 'A' + i % 26) for i in 0:9],
value0 = [i % 2 == 0 ? "aa" : "bb" for i in 1:10],
value1 = repeat(1:5, 2))
df2 = DataFrame(id = df1.id, value2 = rand(10))
db = DB.connect(DB.duckdb())
DB.copy_to(db, df1, "t1");
DB.copy_to(db, df2, "t2");
dt1 = @chain DB.dt(db, "t1") begin
DB.@distinct()
end
dt2 = @chain DB.dt(db, "t2") begin
DB.@inner_join(DB.t(dt1), id)
end
df_merged = @chain DB.dt(db, "t1") begin
DB.@inner_join(DB.t(dt2), id == id)
@aside DB.@show_query _
DB.@collect()
end
WITH j1j1cte_1 AS (
SELECT DISTINCT id, value0, value1
FROM t1),
j1j1cte_2 AS (
SELECT *
FROM j1cte_1),
j1cte_1 AS (
SELECT COALESCE(t2.id, j1cte_2.id) AS id, t2.value2, j1cte_2.value0, j1cte_2.value1
FROM j1cte_0)
SELECT COALESCE(t1.id, j1cte_1.id) AS id, t1.value0, t1.value1, j1cte_1.value2, j1cte_1.value0, j1cte_1.value1
FROM t1
INNER JOIN j1cte_1 ON t1.id = j1cte_1.id
ERROR: Catalog Error: Table with name j1cte_0 does not exist!
Did you mean "sqlite_master"?
LINE 1: ....id) AS id, t2.value2, j1cte_2.value0, j1cte_2.value1 FROM j1cte_0) SELECT COALESCE(t1.id, j1cte_1.id) AS id, t1.value0...
^
Stacktrace:
....Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels