Commit e7b7e47

Subset Copy Documentation (#6182)
* quick doc fix
* wording
* wording
* review suggestions
1 parent ff900d1 commit e7b7e47

2 files changed: +3 -1 lines changed
man/assign.Rd

Lines changed: 1 addition & 1 deletion
@@ -85,7 +85,7 @@ Unlike \code{<-} for \code{data.frame}, the (potentially large) LHS is not coerc
 Since \code{[.data.table} incurs overhead to check the existence and type of arguments (for example), \code{set()} provides direct (but less flexible) assignment by reference with low overhead, appropriate for use inside a \code{for} loop. See examples. \code{:=} is more powerful and flexible than \code{set()} because \code{:=} is intended to be combined with \code{i} and \code{by} in single queries on large datasets.
 }
 \note{
-\code{DT[a > 4, b := c]} is different from \code{DT[a > 4][, b := c]}. The first expression updates (or adds) column \code{b} with the value \code{c} on those rows where \code{a > 4} evaluates to \code{TRUE}. \code{X} is updated \emph{by reference}, therefore no assignment needed.
+\code{DT[a > 4, b := c]} is different from \code{DT[a > 4][, b := c]}. The first expression updates (or adds) column \code{b} with the value \code{c} on those rows where \code{a > 4} evaluates to \code{TRUE}. \code{X} is updated \emph{by reference}, therefore no assignment needed. Note that this does not apply when `i` is missing, i.e. \code{DT[]}.
 
 The second expression on the other hand updates a \emph{new} \code{data.table} that's returned by the subset operation. Since the subsetted data.table is ephemeral (it is not assigned to a symbol), the result would be lost; unless the result is assigned, for example, as follows: \code{ans <- DT[a > 4][, b := c]}.
 }
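
The distinction drawn in the note above can be sketched in R roughly as follows. This is a minimal illustration, not part of the commit: the toy tables `DT` and `DT2` and their values are hypothetical, and only `data.table()` and `:=` from the data.table package are assumed.

```r
library(data.table)

# Hypothetical toy data; column names follow the note (a, b, c).
DT <- data.table(a = 1:6, c = 11:16)

# Sub-assignment by reference: only rows where a > 4 get a value in b,
# and DT itself is modified, so no assignment is needed.
DT[a > 4, b := c]

# Chained form: DT2[a > 4] returns a new data.table, and := updates that
# temporary object rather than DT2.
DT2 <- data.table(a = 1:6, c = 11:16)
DT2[a > 4][, b := c]
has_b <- "b" %in% names(DT2)   # column b never reaches DT2

# To keep the chained result, assign it to a symbol.
ans <- DT2[a > 4][, b := c]
```

After running this, `DT` carries the new column `b` (with `NA` on the rows that did not match), while `DT2` is unchanged and the subset with `b` lives only in `ans`.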

vignettes/datatable-reference-semantics.Rmd

Lines changed: 2 additions & 0 deletions
@@ -78,6 +78,8 @@ A *shallow* copy is just a copy of the vector of column pointers (corresponding
 
 A *deep* copy on the other hand copies the entire data to another location in memory.
 
+When subsetting a *data.table* using `i` (e.g., `DT[1:10]`), a *deep* copy is made. However, when `i` is not provided or equals `TRUE`, a *shallow* copy is made.
+
 #
 With *data.table's* `:=` operator, absolutely no copies are made in *both* (1) and (2), irrespective of R version you are using. This is because `:=` operator updates *data.table* columns *in-place* (by reference).
 
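
One way to observe the copy behaviour described in the added vignette sentence is to compare column addresses. The sketch below is an illustration under assumptions rather than part of the commit: the toy table is hypothetical, and it relies on data.table's `address()` helper. The result of the `DT[TRUE]` comparison may depend on the installed data.table version, so it is printed rather than asserted.

```r
library(data.table)

DT <- data.table(x = 1:5)   # hypothetical toy table

# Subsetting with i: the result owns freshly allocated column vectors.
sub <- DT[1:3]
deep <- address(sub$x) != address(DT$x)

# Updating the subset by reference therefore leaves DT untouched.
sub[, x := x * 10L]
unchanged <- identical(DT$x, 1:5)

# i equal to TRUE: per the vignette text, only the column pointers are copied.
# Comparing column addresses lets you check this on your own installation.
same_cols <- address(DT[TRUE]$x) == address(DT$x)
print(c(deep = deep, unchanged = unchanged, same_cols = same_cols))
```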