vignettes/datatable-joins.Rmd
Lines changed: 24 additions & 86 deletions
@@ -700,109 +700,47 @@ Products[!"popcorn",
700
700
701
701
### 7.2. Updating by reference
702
702
703
-
The `:=` operator in `data.table` is used for updating or adding columns by reference. This means it modifies the original `data.table` without creating a copy, which is very memory-efficient, especially for large datasets. When used inside a `data.table`, `:=` allows you to **add new columns** or **modify existing ones** as part of your query.
703
+
Use `:=` to modify columns **by reference** (no copy) during joins. General syntax: `x[i, on=, (cols) := val]`.
704
704
705
-
Let's update our `Products` table with the latest price from `ProductPriceHistory`:
705
+
- Simple One-to-One Update
706
+
Update `Products` with prices from `ProductPriceHistory`:
706
707
707
-
```{r Simple_One_to_One_Update}
708
-
Products[ProductPriceHistory, on = .(id = product_id), price := i.price]
708
+
```{r}
709
+
Products[ProductPriceHistory,
710
+
on = .(id = product_id),
711
+
price := i.price]
709
712
```
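As a minimal, self-contained sketch of the update join above: the tables `products` and `new_prices` and their values are made up for illustration and are not the vignette's data. It assumes each `id` matches at most one row of `new_prices`, and uses `data.table::address()` only to show that the table is not copied.

```r
library(data.table)

# Hypothetical stand-ins for Products and ProductPriceHistory
products <- data.table(
  id    = 1:3,
  name  = c("banana", "carrots", "popcorn"),
  price = c(0.50, 0.80, 2.00)
)
new_prices <- data.table(
  product_id = c(1L, 3L),
  price      = c(0.65, 2.25)
)

# Remember where the table lives in memory before the update
addr_before <- address(products)

# Update join: for rows of `products` whose id matches product_id,
# overwrite price with the value from `i` (here new_prices)
products[new_prices,
         on = .(id = product_id),
         price := i.price]

# Rows without a match (id 2) keep their old price
print(products)

# TRUE: the same object was modified in place
same_object <- identical(address(products), addr_before)
print(same_object)
```

Because `:=` works by reference, there is no need to assign the result back to `products`.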
710
-
- The `price` column in `Products` is updated using the `price` column from `ProductPriceHistory`.
711
-
- The `on = .(id = product_id)` ensures that updates happen based on matching IDs.
712
-
- This method modifies `Products` in place, avoiding unnecessary copies.
713
-
714
-
Grouped Updates with `.EACHI`
715
-
716
-
If we need to get the latest price and date (instead of all matches), we can use grouped updates efficiently:
713
+
- `i.price` refers to the `price` column from `i` (`ProductPriceHistory`).
- A simple join (`on`) updates rows based on matching IDs without considering grouping or ordering.
742
-
- Grouped updates allow operations like selecting the "latest" record within each group using `.EACHI`.
743
-
744
-
**Right Join**
745
-
To update the right table by reference without copying (similar to SQL right join workflows), use `.SD` and `.SDcols`. This approach avoids modifying the left table directly while dynamically selecting columns.
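One way the `.SD` / `.SDcols` approach described here could look, as a hedged sketch rather than the vignette's exact code. The tables `prods` and `price_hist` and their contents are hypothetical. The outer `.SD` (all columns of `price_hist`) is used as `i` in a lookup on `prods`, and the inner `.SD` with `.SDcols` returns only the selected product columns.

```r
library(data.table)

# Hypothetical stand-ins for Products and ProductPriceHistory
prods <- data.table(
  id       = 1:3,
  name     = c("banana", "carrots", "popcorn"),
  category = c("fruit", "vegetable", "snack")
)
price_hist <- data.table(
  product_id = c(1L, 2L, 2L, 4L),
  price      = c(0.60, 0.80, 0.90, 1.50)
)

# All product columns except the join key
product_cols <- setdiff(names(prods), "id")

# For each row of price_hist, look up the matching product and add its
# columns by reference; prods itself is left untouched
price_hist[, (product_cols) :=
             prods[.SD, on = .(id = product_id), .SD, .SDcols = product_cols]]

# product_id 4 has no match in prods, so its new columns are NA
print(price_hist)
```

The lookup returns one row per row of `price_hist` here because every `product_id` matches at most one `id`; with duplicate ids in `prods` the lengths would no longer line up.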
730
+
- Efficient Right Join Update
731
+
Add product details to `ProductPriceHistory` without copying:
746
732
747
733
```{r}
748
-
# Get all columns from Products except the ID column
749
-
product_cols <- setdiff(names(Products), "id")
750
-
751
-
# Update ProductPriceHistory with product details from Products