Detect the correspondence between column names and indices after evaluating `j`, not before, to account for possible modifications to the table.
Co-authored-by: aitap <[email protected]>
Co-authored-by: Marco Colombo <[email protected]>
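
The ordering issue described in the commit message can be illustrated with a small base-R analogy. This is only a sketch, not data.table's internal code: the environment `tbl`, its `cols` list, and the function `j` are hypothetical stand-ins for a table that is modified by reference while `j` is evaluated.

```r
# Hypothetical stand-in for a table modified by reference:
# the columns live in a list held by an environment.
tbl <- new.env()
tbl$cols <- list(x = 1:3, a = 4:6)

# Hypothetical j expression: drops column "x" as a side effect, then returns a value.
j <- function() {
  tbl$cols$x <- NULL
  1L
}

# Matching the target name before evaluating j records a position that j invalidates.
idx_before <- match("a", names(tbl$cols))

val <- j()

# Matching after evaluating j reflects the table as it now is.
idx_after <- match("a", names(tbl$cols))

print(c(before = idx_before, after = idx_after))
```

Here the position computed before the evaluation (2) no longer points at column `a` once `j` has removed `x`, while the position computed afterwards (1) does.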
NEWS.md (2 additions & 0 deletions)
@@ -550,6 +550,8 @@ rowwiseDT(
  22. `fread()` could fail to read Mac CSV files (with `\r` line endings) if the file contained any `\n` character, such as a final `\r\n`. This was fixed by detecting the predominant line ending in a sample of the file, [#4186](https://github.com/Rdatatable/data.table/issues/4186). Thanks to @MPagel for the report and @ben-schwen for the fix.

+ 23. By reference assignments (':=') with functions that modified the data.table by reference e.g. (`foo=function(DT){modify(DT);return(1L)}`, `DT[,a:=foo(DT)]`) returned a malformed data.table due to the modification of the targeted named column index ("a") during the j expression evaluation [#6768](https://github.com/Rdatatable/data.table/issues/6768). Thanks @AntonNM for the report and fix.
- # Adding new column(s). TO DO: move after the first eval in case the jsub has an error.
+ # Adding new column(s). Allocation for columns and recalculation of target cols moved after the jval = eval(jsub)
+ # in case of error or by-reference modifications to the DT
  newnames = setdiff(lhs, names_x)
  m[is.na(m)] = ncol(x)+seq_along(newnames)
  cols = as.integer(m)
  # don't pass verbose to selfrefok here -- only activated when
- # ok=-1 which will trigger setalloccol with verbose in the next
- # branch, which again calls _selfrefok and returns the message then
+ # ok=-1 which will trigger setalloccol with verbose after
+ # the jval = eval(jsub, ...)
  if ((ok<-selfrefok(x, verbose=FALSE))==0L) # ok==0 so no warning when loaded from disk (-1) [-1 considered TRUE by R]
    if (is.data.table(x)) warningf("A shallow copy of this data.table was taken so that := can add or remove %d columns by reference. At an earlier point, this data.table was copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. It's also not unusual for data.table-agnostic packages to produce tables affected by this issue. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.", length(newnames))
- # !is.data.table for DF |> DT(,:=) tests 2212.16-19 (#5113) where a shallow copy is routine for data.frame
- if ((ok<1L) || (truelength(x) < ncol(x)+length(newnames))) {
-   DT = x # in case getOption contains "ncol(DT)" as it used to. TODO: warn and then remove
-   n = length(newnames) + eval(getOption("datatable.alloccol")) # TODO: warn about expressions and then drop the eval()
-   # i.e. reallocate at the size as if the new columns were added followed by setalloccol().
-   name = substitute(x)
-   if (is.name(name) && ok && verbose) { # && NAMED(x)>0 (TO DO) # ok here includes -1 (loaded from disk)
-     catf("Growing vector of column pointers from truelength %d to %d. A shallow copy has been taken, see ?setalloccol. Only a potential issue if two variables point to the same data (we can't yet detect that well) and if not you can safely ignore this. To avoid this message you could setalloccol() first, deep copy first using copy(), wrap with suppressWarnings() or increase the 'datatable.alloccol' option.\n", truelength(x), n)
-     # #1729 -- copying to the wrong environment here can cause some confusion
-     if (ok==-1L) catf("Note that the shallow copy will assign to the environment from which := was called. That means for example that if := was called within a function, the original table may be unaffected.\n")
-
-     # Verbosity should not issue warnings, so cat rather than warning.
-     # TO DO: Add option 'datatable.pedantic' to turn on warnings like this.
-
-     # TO DO ... comments moved up from C ...
-     # Note that the NAMED(dt)>1 doesn't work because .Call
-     # always sets to 2 (see R-ints), it seems. Work around
-     # may be possible but not yet working. When the NAMED test works, we can drop allocwarn argument too
-     # because that's just passed in as FALSE from [<- where we know `*tmp*` isn't really NAMED=2.
-     # Note also that this growing will happen for missing columns assigned NULL, too. But so rare, we
-     # don't mind.
-   }
-   setalloccol(x, n, verbose=verbose) # always assigns to calling scope; i.e. this scope

+ # Re-matches characters names in the lhs after jval to account for jsub's that modify the columns of the data.table (#6768)
+ # Replaces numerical lhs with respective names_x
+ if(is.character(lhs)){
+   m = chmatch(lhs, names_x)
+   if(!anyNA(m)) {
+     # updates by reference to existing columns
+     cols = as.integer(m)
+     newnames = NULL
+   } else {
+     # Adding new column(s).
+     newnames = setdiff(lhs, names_x)
+     m[is.na(m)] = ncol(x) + seq_along(newnames)
+     cols = as.integer(m)
+     # ok <- selfrefok above called without verbose -- only activated when
+     # ok=-1 which will trigger setalloccol with verbose in the next
+     # branch, which again calls _selfrefok and returns the message then
+     # !is.data.table for DF |> DT(,:=) tests 2212.16-19 (#5113) where a shallow copy is routine for data.frame
+     if ((ok<1L) || (truelength(x) < ncol(x)+length(newnames))) {
+       DT = x # in case getOption contains "ncol(DT)" as it used to. TODO: warn and then remove
+       n = length(newnames) + eval(getOption("datatable.alloccol")) # TODO: warn about expressions and then drop the eval()
+       # i.e. reallocate at the size as if the new columns were added followed by setalloccol().
+       name = substitute(x)
+       if (is.name(name) && ok && verbose) { # && NAMED(x)>0 (TO DO) # ok here includes -1 (loaded from disk)
+         catf("Growing vector of column pointers from truelength %d to %d. A shallow copy has been taken, see ?setalloccol. Only a potential issue if two variables point to the same data (we can't yet detect that well) and if not you can safely ignore this. To avoid this message you could setalloccol() first, deep copy first using copy(), wrap with suppressWarnings() or increase the 'datatable.alloccol' option.\n", truelength(x), n)
+         # #1729 -- copying to the wrong environment here can cause some confusion
+         if (ok==-1L) catf("Note that the shallow copy will assign to the environment from which := was called. That means for example that if := was called within a function, the original table may be unaffected.\n")
+
+         # Verbosity should not issue warnings, so cat rather than warning.
+         # TO DO: Add option 'datatable.pedantic' to turn on warnings like this.
+
+         # TO DO ... comments moved up from C ...
+         # Note that the NAMED(dt)>1 doesn't work because .Call
+         # always sets to 2 (see R-ints), it seems. Work around
+         # may be possible but not yet working. When the NAMED test works, we can drop allocwarn argument too
+         # because that's just passed in as FALSE from [<- where we know `*tmp*` isn't really NAMED=2.
+         # Note also that this growing will happen for missing columns assigned NULL, too. But so rare, we
+         # don't mind.
+       }
+       setalloccol(x, n, verbose=verbose) # always assigns to calling scope; i.e. this scope