Join with NA can alter the type of the column #4367

@tdeenes

Consider the following scenario:

  • create a data.table, having an integer and a logical column with NAs ('x' and 'y', respectively)
  • case 1: join on list(x = NA) (note: originally x is integer, but i.x is now logical)
  • case 2: join on list(y = NA_integer_) (note: originally y is logical, but i.y is now integer)
library(data.table)
d <- data.table(x = c(NA_integer_, 2L), y = c(TRUE, NA))
str(d)
# Classes ‘data.table’ and 'data.frame':	2 obs. of  2 variables:
# $ x: int  NA 2
# $ y: logi  TRUE NA

dx <- d[list(x = NA), on = "x"]
dy <- d[list(y = NA_integer_), on = "y"]

One would expect the column types in 'dx' and 'dy' to be the same as those of the respective columns in the original data.table. Indeed, this is what we get up to data.table 1.12.2:

typeof(dx$x)
# [1] "integer"
typeof(dy$y)
# [1] "logical"

However, from 1.12.4 on (tested also with the most recent devel version), this is what we get:

typeof(dx$x)
# [1] "logical"
typeof(dy$y)
# [1] "integer"
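
As a possible stopgap (not something proposed in this report), the join values can be coerced to the type of the matching column before the join, so that both sides already agree. The sketch below assumes a hypothetical helper, coerce_like(), which is not part of data.table, and it only handles plain atomic columns (factors, dates and other classed columns would need extra care).

```r
library(data.table)

d <- data.table(x = c(NA_integer_, 2L), y = c(TRUE, NA))

# Hypothetical helper (not a data.table function): convert every column of
# the join list `i` to the storage type of the same-named column in `x`.
coerce_like <- function(i, x) {
  for (col in names(i)) {
    i[[col]] <- as.vector(i[[col]], mode = typeof(x[[col]]))
  }
  as.data.table(i)
}

# Build the join tables outside of `[` so that the types can be inspected.
ix <- coerce_like(list(x = NA), d)           # logical NA becomes NA_integer_
iy <- coerce_like(list(y = NA_integer_), d)  # integer NA becomes logical NA

dx <- d[ix, on = "x"]
dy <- d[iy, on = "y"]

# Compare the resulting column types with the original ones.
sapply(d, typeof)
sapply(dx, typeof)
sapply(dy, typeof)
```

Because 'ix' and 'iy' carry the same types as the columns they are matched against, the join should no longer have a reason to change the column type, whichever side the result column is taken from.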

Labels: High, bug, joins, regression
