-Most of the time after a join is complete we need to make some additional transformations. To make so we have the following alternatives:
+Most of the time after joining we need to make some additional transformations. To do so we have the following alternatives:

-- Chaining a new instruction by adding a pair of brakes`[]`.
+- Chaining a new instruction by adding a pair of brackets`[]`.
 - Passing a list with the columns that we want to keep or create to the `j` argument.

 Our recommendation is to use the second alternative if possible, as it is **faster** and uses **less memory** than the first one.
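
The two alternatives can be compared with a minimal sketch. The tables `x_dt` and `i_dt` and their columns are hypothetical, made up for illustration, and are not the vignette's data. Both forms should return the same result; the second one does not build the full joined table before transforming it, which is presumably why it is recommended.

```r
library(data.table)

# Hypothetical tables for illustration only
x_dt = data.table(id = 1:3, price = c(2, 4, 6))
i_dt = data.table(id = c(1L, 3L), count = c(10L, 5L))

# Alternative 1: join first, then chain a second [] to transform the result
chained = x_dt[i_dt, on = "id"][, .(id, total_value = price * count)]

# Alternative 2: pass the list of columns to j inside the join call itself
direct = x_dt[i_dt, on = "id", j = .(id, total_value = price * count)]

# Compare the two results
all.equal(chained, direct)
```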

-
 ##### Managing shared column names with the `j` argument

 The `j` argument offers convenient ways to manage joins between tables that **share the same names for several columns**. By default, all columns are taken from the `x` table, but we can also use the `x.` prefix to make the source explicit and the `i.` prefix to refer to any column from the table passed in the `i` argument of the `x` table.
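
As a minimal sketch of the two prefixes, assume two hypothetical tables, `old_prices` and `new_prices`, that share the non-key column name `price`. These names are illustrative only and do not come from the vignette.

```r
library(data.table)

# Hypothetical tables sharing the column name `price`
old_prices = data.table(id = 1:3, price = c(10, 20, 30))
new_prices = data.table(id = c(2L, 3L), price = c(21, 31))

# old_prices plays the role of `x`, new_prices the role of `i`
price_comparison = old_prices[
  new_prices,
  on = "id",
  j = .(
    id,
    old_price = x.price, # explicit: column taken from the x table
    new_price = i.price  # column taken from the table passed as i
  )
]

price_comparison
```

Renaming the columns inside `j` keeps both sources in the result without relying on automatically generated names.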

-Going back to the little supermarket, after updating the `ProductReceived` table with the `Products` table, it seems convenient apply the following changes:
+Going back to the little supermarket, after updating the `ProductReceived` table with the `Products` table, suppose we want to apply the following changes:

--Changing the columns names from `id` to `product_id` and from `i.id` to `received_id`.
--Adding the `total_value`.
+- Change the column names from `id` to `product_id` and from `i.id` to `received_id`.
+- Add the `total_value` column.

 ```{r}
 Products[
@@ -238,9 +237,9 @@ Products[

 ##### Summarizing with `on` in `data.table`

-We can also use this alternative to return aggregated results based columns present in the `x` table.
+We can also use this alternative to return aggregated results based on the columns present in the `x` table.

-For example, we might interested in how much money we expend buying products each date regardless the products.
+For example, we might be interested in how much money we spend buying each product across days.

 ```{r}
 dt1 = ProductReceived[
@@ -250,7 +249,7 @@ dt1 = ProductReceived[
   j = .(total_value_received = sum(price * count))
 ]

-
+# alternative using multiple [] queries
 dt2 = ProductReceived[
   Products,
   on = c("product_id" = "id"),
@@ -263,7 +262,7 @@ identical(dt1, dt2)

 #### 3.1.4. Joining based on several columns

-So far we have just joined `data.table` base on 1 column, but it's important to know that the package can join tables matching several columns.
+So far we have just joined `data.table`s based on 1 column, but it's important to know that the package can join tables matching several columns.

 To illustrate this, let's assume that we want to add the `tax_prop` from `NewTax` to **update** the `Products` table.

@@ -275,7 +274,7 @@ NewTax[Products, on = c("unit", "type")]

 Use this method if you need to combine columns from 2 tables based on one or more references but ***keeping only rows matched in both tables***.

-To perform this operation we just need to add `nomatch = NULL`or `nomatch = 0`to any of the prior join operations to return the same results.
+To perform this operation we just need to add `nomatch = NULL` to any of the prior join operations to return the same results.
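
The effect of `nomatch = NULL` can be sketched with two small hypothetical tables. The names `products` and `received` and their contents are assumptions for illustration, not the vignette's data.

```r
library(data.table)

# Hypothetical tables for illustration only
products = data.table(id = 1:3, name = c("banana", "carrot", "popcorn"))
received = data.table(product_id = c(1L, 1L, 4L), count = c(100L, 50L, 20L))

# Default behaviour: every row of `received` is kept,
# and rows without a match in `products` get NA in `name`
right_join = products[received, on = c(id = "product_id")]

# nomatch = NULL drops the rows of `received` with no match in `products`
inner_join = products[received, on = c(id = "product_id"), nomatch = NULL]

nrow(right_join)
nrow(inner_join)
```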

 ```{r}
 # First Table
@@ -296,7 +295,7 @@ Despite both tables having the same information, there are some relevant differe
 - The `id` column in the first table has the same information as the `product_id` in the second table.
 - The `i.id` column in the first table has the same information as the `id` in the second table.

-### 3.3. Not join
+### 3.3. Anti-join

 This method **keeps only the rows that don't match with any row of a second table**.
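
A minimal sketch of an anti-join, assuming two hypothetical tables, `products` and `received`, invented for illustration. In `data.table`, negating the `i` argument with `!` returns the rows of `x` that have no match in `i`.

```r
library(data.table)

# Hypothetical tables for illustration only
products = data.table(id = 1:4, name = c("banana", "carrot", "popcorn", "soda"))
received = data.table(product_id = c(1L, 3L))

# Keep only the rows of `products` whose id never appears in `received`
never_received = products[!received, on = c(id = "product_id")]

never_received
```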