DT[,foo:=bar]: earlier check for selfrefok (#7502)
* DT[,foo:=bar]: earlier check for selfrefok
The check needs to be there, not below, to detect non-selfrefok tables
in by-group operations.
Use static analysis to detect common forms of column deletion: foo :=
NULL and .(bar, baz) := .(NULL, NULL). Static analysis is doomed to miss
things like frob := if (runif(1) < .5) 42 else NULL, but hopefully it
covers the needs of our reverse dependencies.
* add second setalloccol check
---------
Co-authored-by: Benjamin Schwendinger <[email protected]>
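The static analysis described in the commit message can be sketched in base R. This is only an illustration of the idea, not the code added by the commit: `deletes_columns` is a hypothetical helper name, and it handles just the two forms named above (`foo := NULL` and `.(bar, baz) := .(NULL, NULL)`, plus the assumed `list(...)` spelling of the latter).

```r
# Hypothetical helper, not data.table's internal code.
# Returns TRUE when a quoted j-expression is a `:=` call whose right-hand
# side is literally NULL, or a .()/list() call whose arguments are all NULL.
deletes_columns <- function(jsub) {
  # Only the infix form `lhs := rhs` (a call of length 3) is inspected.
  if (!is.call(jsub) || !identical(jsub[[1L]], as.name(":="))) return(FALSE)
  if (length(jsub) != 3L) return(FALSE)
  rhs <- jsub[[3L]]
  if (is.null(rhs)) return(TRUE)                     # foo := NULL
  is_list_call <- is.call(rhs) &&
    (identical(rhs[[1L]], as.name(".")) || identical(rhs[[1L]], as.name("list")))
  if (is_list_call) {
    args <- as.list(rhs)[-1L]                        # drop the function name
    return(length(args) > 0L && all(vapply(args, is.null, logical(1L))))
  }
  FALSE                                              # anything else is not recognised
}

print(deletes_columns(quote(foo := NULL)))
print(deletes_columns(quote(.(bar, baz) := .(NULL, NULL))))
# The right-hand side is only known at run time, so a syntactic check misses it.
print(deletes_columns(quote(frob := if (runif(1) < .5) 42 else NULL)))
```

The last call shows the limitation the commit message admits: the deletion depends on a value computed at run time, which inspecting the expression alone cannot see.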
   # don't pass verbose to selfrefok here -- only activated when
-  # ok=-1 which will trigger setalloccol with verbose after
-  # the jval = eval(jsub, ...)
+  # ok=-1 which will trigger setalloccol with verbose in the next branch
+  # if a change in the number of columns is suspected
   if (ok==0L) # ok==0 so no warning when loaded from disk (-1) [-1 considered TRUE by R]
     if (is.data.table(x)) warningf("A shallow copy of this data.table was taken so that := can add or remove %d columns by reference. At an earlier point, this data.table was copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. It's also not unusual for data.table-agnostic packages to produce tables affected by this issue. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.", length(newnames))
 }
+# ok <- selfrefok above called without verbose -- only activated when
+# ok=-1 which will trigger setalloccol with verbose in the next
+# branch, which again calls _selfrefok and returns the message then
+# !is.data.table for DF |> DT(,:=) tests 2212.16-19 (#5113) where a shallow copy is routine for data.frame
[...]
+    (truelength(x) < ncol(x)+length(newnames)) # not enough space for new columns
+  )
+) {
+  DT = x # in case getOption contains "ncol(DT)" as it used to. TODO: warn and then remove
+  n = length(newnames) + eval(getOption("datatable.alloccol")) # TODO: warn about expressions and then drop the eval()
+  # i.e. reallocate at the size as if the new columns were added followed by setalloccol().
+  name = substitute(x)
+  if (is.name(name) && ok && verbose) { # && NAMED(x)>0 (TO DO) # ok here includes -1 (loaded from disk)
+    catf("Growing vector of column pointers from truelength %d to %d. A shallow copy has been taken, see ?setalloccol. Only a potential issue if two variables point to the same data (we can't yet detect that well) and if not you can safely ignore this. To avoid this message you could setalloccol() first, deep copy first using copy(), wrap with suppressWarnings() or increase the 'datatable.alloccol' option.\n", truelength(x), n)
+    # #1729 -- copying to the wrong environment here can cause some confusion
+    if (ok==-1L) catf("Note that the shallow copy will assign to the environment from which := was called. That means for example that if := was called within a function, the original table may be unaffected.\n")
+
+    # Verbosity should not issue warnings, so cat rather than warning.
+    # TO DO: Add option 'datatable.pedantic' to turn on warnings like this.
+
+    # TO DO ... comments moved up from C ...
+    # Note that the NAMED(dt)>1 doesn't work because .Call
+    # always sets to 2 (see R-ints), it seems. Work around
+    # may be possible but not yet working. When the NAMED test works, we can drop allocwarn argument too
+    # because that's just passed in as FALSE from [<- where we know `*tmp*` isn't really NAMED=2.
+    # Note also that this growing will happen for missing columns assigned NULL, too. But so rare, we
+    # don't mind.
+  }
+  setalloccol(x, n, verbose=verbose) # always assigns to calling scope; i.e. this scope
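For context on the warning in this hunk, here is a minimal sketch of one way a table can end up without a valid self-reference, assuming data.table is installed. It builds the object with structure(), which the warning text names as a trigger. Whether the warning fires, and its exact wording, may differ between data.table versions, so the sketch only captures and prints whatever is raised. The variable names are illustrative.

```r
library(data.table)

# Built by hand with structure(), so the object never went through
# data.table's own constructors (one trigger listed in the warning text).
x <- structure(list(a = 1:3),
               class = c("data.table", "data.frame"),
               row.names = c(NA_integer_, -3L))

# Capture any warning raised while adding a column with `:=`.
# NA_character_ is returned when no warning was raised.
msg <- tryCatch({
  x[, b := a * 2L]
  NA_character_
}, warning = function(w) conditionMessage(w))
print(msg)

# The set* route recommended by the message: setnames() renames in place,
# so a later `:=` works on a table data.table itself created and modified.
DT <- data.table(a = 1:3)
setnames(DT, "a", "id")
DT[, b := id * 2L]
print(names(DT))
```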
-  if (is.name(name) && ok && verbose) { # && NAMED(x)>0 (TO DO) # ok here includes -1 (loaded from disk)
-    catf("Growing vector of column pointers from truelength %d to %d. A shallow copy has been taken, see ?setalloccol. Only a potential issue if two variables point to the same data (we can't yet detect that well) and if not you can safely ignore this. To avoid this message you could setalloccol() first, deep copy first using copy(), wrap with suppressWarnings() or increase the 'datatable.alloccol' option.\n", truelength(x), n)
-    # #1729 -- copying to the wrong environment here can cause some confusion
-    if (ok==-1L) catf("Note that the shallow copy will assign to the environment from which := was called. That means for example that if := was called within a function, the original table may be unaffected.\n")
-
-    # Verbosity should not issue warnings, so cat rather than warning.
-    # TO DO: Add option 'datatable.pedantic' to turn on warnings like this.
-
-    # TO DO ... comments moved up from C ...
-    # Note that the NAMED(dt)>1 doesn't work because .Call
-    # always sets to 2 (see R-ints), it seems. Work around
-    # may be possible but not yet working. When the NAMED test works, we can drop allocwarn argument too
-    # because that's just passed in as FALSE from [<- where we know `*tmp*` isn't really NAMED=2.
-    # Note also that this growing will happen for missing columns assigned NULL, too. But so rare, we
-    # don't mind.
-  }
-  setalloccol(x, n, verbose=verbose) # always assigns to calling scope; i.e. this scope
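The moved block grows the column-pointer vector when `truelength(x) < ncol(x)+length(newnames)`. The same arithmetic can be tried from user code with the exported `truelength()` and `setalloccol()`. This is a rough sketch assuming data.table is installed and the `datatable.alloccol` option holds a plain number. `newnames` and `needs_growth` are illustrative names, not taken from the package.

```r
library(data.table)

DT <- data.table(a = 1:3)
newnames <- c("b", "c")            # columns we intend to add by reference

# Spare column-pointer slots currently available beyond the existing columns.
spare <- truelength(DT) - ncol(DT)

# Same comparison as in the diff: is there room for the new columns?
needs_growth <- truelength(DT) < ncol(DT) + length(newnames)
print(c(spare = spare, needs_growth = needs_growth))

# Size used by the diff when it does reallocate: the new columns plus the
# usual over-allocation (the diff wraps the option in eval(); here it is
# assumed to be a plain number).
n <- length(newnames) + getOption("datatable.alloccol")
print(n)

# Over-allocating up front keeps later `:=` calls from having to grow the
# pointer vector; 2048L is an arbitrary illustrative value.
setalloccol(DT, 2048L)
print(truelength(DT))
```

With enough spare slots, adding `b` and `c` by reference should not need the shallow copy that the verbose message describes.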