Open
Labels: High, joins, performance, regression
Hi, I was testing
data.table_1.10.4-3 + R version 3.4.0 (2017-04-21)
vs
data.table_1.12.2 + R version 3.6.1 (2019-07-05)
and noticed that the join operation is almost 2 times slower with the newer version of data.table (or R?).
I think it mostly depends on the version of data.table.
rm(list=ls())
library(data.table)
library(tictoc)
aa = data.table(a = seq(1,100), b = rep(0, 100))
bb = data.table(a = seq(1,100), b = rep(1, 100))
#aa[, a1 := as.character(a)]
#bb[, a1 := as.character(a)]
setindex(aa, a)
setindex(bb, a)
tic()
for(i in c(1:1000)) {
# aa[bb, b := i.b, on=.(a, a1)] # test1
aa[bb, b := i.b, on=.(a)] # test2
}
toc()
# test1
# 3.6.1: 5.5 sec with index
# 3.6.1: 6.87 sec without index
# 3.4.0: 3.02 sec (no index)
# test2
# 3.6.1: 3.82 sec with index
# 3.6.1: 4.44 sec without index
# 3.4.0: 2.48 sec (no index)
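For anyone who wants to reproduce this without the tictoc dependency, here is a minimal sketch of the test2 case using base R's `system.time()`. The helper name `time_join` and its arguments are hypothetical (they are not part of the report above), and the data is rebuilt on every call so the indexed and non-indexed runs start from the same state. It also prints the R and data.table versions so timings from different setups can be compared.

```r
library(data.table)

# Hypothetical helper (not from the original report): time `n` repeated
# update joins on column `a`, optionally with a secondary index set first.
time_join <- function(n = 1000, use_index = TRUE) {
  aa <- data.table(a = seq(1, 100), b = rep(0, 100))
  bb <- data.table(a = seq(1, 100), b = rep(1, 100))
  if (use_index) {
    setindex(aa, a)
    setindex(bb, a)
  }
  # Same update join as test2 above, repeated n times
  elapsed <- system.time(
    for (i in seq_len(n)) aa[bb, b := i.b, on = .(a)]
  )[["elapsed"]]
  list(elapsed = elapsed, result = aa)
}

# Report the environment alongside the timings
cat("R:", R.version.string, "\n")
cat("data.table:", as.character(packageVersion("data.table")), "\n")

with_idx <- time_join(use_index = TRUE)
no_idx <- time_join(use_index = FALSE)
cat("with index:   ", with_idx$elapsed, "sec\n")
cat("without index:", no_idx$elapsed, "sec\n")
```

Absolute numbers will differ by machine, so only the ratio between the two installations is meaningful.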