Skip to content

Edge case when working with dates #126

@rgreminger

Description

@rgreminger

Hi, it's been a while, but I think I may have found another edge case working with dates.

Maybe I'm missing something, but I would expect the following to work:

using DataFrames
using Dates
using TidierData
import TidierDB as DB

df1 = DataFrame(id = [string('A' + i ÷ 26, 'A' + i % 26) for i in 0:9], 
                          groups = [i % 2 == 0 ? "aa" : "bb" for i in 1:10], 
                          date = [DateTime(2023, 1, i) for i in 1:10],
                          value = repeat(1:5, 2), 
                          percent = 0.1:0.1:1.0);

db = DB.connect(DB.duckdb())

DB.copy_to(db, df1, "t1");

df = @chain DB.db_table(db, "t1") begin
    
    DB.@group_by(groups)
    DB.@summarize(
        day_diff= day(maximum(date) - minimum(date)),
    )

    DB.@filter(day_diff < 8 )
    @aside DB.@show_query _
    DB.@collect()
end

WITH cte_1 AS (
SELECT groups,  EXTRACT(DAY
        FROM MAX(date) - MIN(date))   AS day_diff
        GROUP BY groups)  
SELECT *
        FROM cte_1 
        WHERE day_diff < 8

ERROR: DuckDB.QueryException("Binder Error: Referenced column \"groups\" not found in FROM clause!\n\nLINE 1: ...(DAY FROM MAX(date) - MIN(date))   AS day_diff GROUP BY groups)  SELECT * FROM cte_1  WHERE day_diff < 8\n                                                                   ^")

The query without the @filter works though:

df = @chain DB.db_table(db, "t1") begin
    
    DB.@group_by(groups)
    DB.@summarize(
        day_diff= day(maximum(date) - minimum(date)),
    )

    # DB.@filter(day_diff < 8 )
    @aside DB.@show_query _
    DB.@collect()
end

SELECT groups,  EXTRACT(DAY
        FROM MAX(date) - MIN(date))   AS day_diff
        FROM t1
        GROUP BY groups

For now, I'm splitting it up like this, which works:

df = @chain DB.db_table(db, "t1") begin
    
    DB.@group_by(groups)
    DB.@summarize(
        day_diff= maximum(date) - minimum(date),
    )

    DB.@mutate(day_diff = day(day_diff))

    DB.@filter(day_diff < 8 ) 

    @aside DB.@show_query _
    DB.@collect()
end

WITH cte_1 AS (
SELECT groups, MAX(date) - MIN(date) AS day_diff
        FROM t1
        GROUP BY groups),
cte_2 AS (
SELECT  groups,  EXTRACT(DAY
        FROM day_diff)   AS day_diff
        FROM cte_1 ),
cte_3 AS (
SELECT *
        FROM cte_2
        WHERE day_diff < 8)  
SELECT *
        FROM cte_3

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions