revdep bbotk fails to create a column by group after unserializing a data.table #7498

@aitap

From their tests:

library(bbotk)
library(data.table)

search_space = domain = ps(
  x1 = p_dbl(-5, 10),
  x2 = p_dbl(0, 15)
)

fun = function(xdt, instances) {
  data.table(y = branin(xdt[["x1"]], xdt[["x2"]], noise = as.numeric(instances)))
}

objective = ObjectiveRFunDt$new(fun = fun, domain = domain)

instance = OptimInstanceBatchSingleCrit$new(
  objective = objective,
  search_space = search_space,
  terminator = trm("evals", n_evals = 96))

optimizer = opt("irace", instances = rnorm(10, mean = 0, sd = 0.1))

optimizer$optimize(instance)
Error in `[.data.table`(log, , `:=`("step", rleid("instance")), by = "iteration") :
  Internal error in dogroups: Trying to add new column by reference but tl is full; setalloccol should have run first at R level before getting to this point. (omitted)
Enter a frame number, or 0 to exit 

 1: optimizer$optimize(instance)
 2: .__OptimizerBatch__optimize(self = self, private = private, super = super, inst = inst)
 3: optimize_batch_default(inst, self)
 4: tryCatch({
    get_private(optimizer)$.optimize(instance)
}, terminated_error = function(cond) {
})
 5: tryCatchList(expr, classes, parentenv, handlers)
 6: tryCatchOne(expr, names, parentenv, handlers[[1]])
 7: doTryCatch(return(expr), name, parentenv, handler)
 8: get_private(optimizer)$.optimize(instance)
 9: .__OptimizerBatchIrace__.optimize(self = self, private = private, super = super, inst = inst)
10: log[, `:=`("step", rleid("instance")), by = "iteration"]
11: `[.data.table`(log, , `:=`("step", rleid("instance")), by = "iteration")

Browse[1]> str(log)
Classes ‘data.table’ and 'data.frame':  96 obs. of  5 variables:
 $ iteration    : int  1 1 1 1 1 1 1 1 1 1 ...
 $ instance     : int  1 1 1 1 1 2 2 2 2 2 ...
 $ configuration: int  1 2 3 4 5 1 2 3 4 5 ...
 $ cost         : num  3.52 71.51 61.13 94.09 22.35 ...
 $ time         : num  NA NA NA NA NA NA NA NA NA NA ...
 - attr(*, ".internal.selfref")=<pointer: (nil)> 
Browse[1]> .Internal(inspect(log, 0))
@55995985eba8 19 VECSXP g0c4 [OBJ,REF(4),gp=0x20,ATT] (len=5, tl=0, gr)
ATTRIB:
  @55994fef2410 02 LISTSXP g0c0 [REF(1)] 
Browse[1]> .Internal(inspect(attr(log, '.internal.selfref'), 1))
@55994fef2528 22 EXTPTRSXP g0c0 [REF(65535)] <(nil)>
PROTECTED:
  @55994fef2560 22 EXTPTRSXP g0c0 [REF(2)] <(nil)>
TAG:
  @55995985f2a8 16 STRSXP g0c4 [REF(1),gp=0x20] (len=5, tl=0, gr)
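
The `tl=0` and `<(nil)>` seen above can also be checked from user-level code. This is a minimal sketch, assuming only the exported `truelength()` helper; the names `dt` and `rt` are hypothetical and `mtcars` stands in for the irace log:

```r
library(data.table)

dt <- as.data.table(mtcars)              # freshly created: columns are over-allocated
rt <- unserialize(serialize(dt, NULL))   # round trip, as with the table in the report

truelength(dt)                  # allocated column slots of the original table
truelength(rt)                  # allocated column slots after the round trip
attr(rt, ".internal.selfref")   # the self-reference pointer, nil in the report above
```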

The code is trying to add a new column by group to a freshly unserialized data.table. This used to work (the table was replaced with an over-allocated shallow copy in the caller's environment) but now fails. A minimal example fails too, although with a different error message than the internal error above:

as.data.table(mtcars) |> serialize(NULL) |> unserialize() -> x
x[, foo := mean(mpg), by = "cyl"]
# Error in `[.data.table`(x, , `:=`(foo, mean(mpg)), by = "cyl") : 
#   This data.table has either been loaded from disk (e.g. using readRDS()/load()) or constructed manually (e.g. using structure()). Please run setDT() or setalloccol() on it first (to pre-allocate space for new columns) before assigning by reference to it.
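
For reference, the step that the error message asks for looks roughly like the following. This is only a sketch of the caller-side workaround, not a fix for the regression itself; it assumes nothing beyond the exported `setalloccol()` (or `setDT()`) named in the message:

```r
library(data.table)

# Same round trip as in the minimal example above.
x <- unserialize(serialize(as.data.table(mtcars), NULL))

# Re-allocate spare column slots, as the error message suggests.
# setDT(x) is the alternative named in the same message.
setalloccol(x)

# Adding a column by group by reference now has room to work with.
x[, foo := mean(mpg), by = "cyl"]
```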

This is similar to #7488, but in a different code path.

Labels: revdep (Reverse dependencies)
