Merged
28 commits
43b8bcb
doc updated reagrding non equi join
venom1204 Feb 10, 2025
87dcae9
corrected lintr
venom1204 Feb 10, 2025
29e768a
correction
venom1204 Feb 11, 2025
b29de8e
bacticks corrected
venom1204 Feb 14, 2025
dd3b556
correction
venom1204 Feb 14, 2025
4e07ad9
corrected
venom1204 Feb 14, 2025
8fad112
Merge branch 'master' into issue6626
venom1204 Feb 18, 2025
d68b802
Update vignettes/datatable-joins.Rmd
venom1204 Feb 18, 2025
a28a1c8
Merge branch 'master' into issue6626
venom1204 Feb 18, 2025
1245e5e
Merge branch 'master' into issue6626
venom1204 Feb 19, 2025
b556d13
done
venom1204 Feb 19, 2025
8e3ffbe
added explicitly distinguishing
venom1204 Feb 19, 2025
06286b9
Merge branch 'master' into issue6626
venom1204 Feb 20, 2025
6d026bd
removed the temporary part
venom1204 Feb 20, 2025
ed53aca
Update vignettes/datatable-joins.Rmd
venom1204 Feb 25, 2025
a98f799
Update vignettes/datatable-joins.Rmd
venom1204 Feb 25, 2025
da0931d
Update vignettes/datatable-joins.Rmd
venom1204 Feb 25, 2025
008cadf
Update vignettes/datatable-joins.Rmd
venom1204 Feb 25, 2025
66da346
Merge branch 'master' into issue6626
venom1204 Feb 25, 2025
a6b99b9
updated section
venom1204 Feb 25, 2025
f44211e
more minor style changes
MichaelChirico Feb 25, 2025
fe575ae
revise wording, use consistent capitalizatoin
MichaelChirico Feb 25, 2025
f7ad5e0
revise exposition
MichaelChirico Feb 25, 2025
dcf4178
rm redundant word
MichaelChirico Feb 25, 2025
6357b81
Update vignettes/datatable-joins.Rmd
venom1204 Feb 25, 2025
41f8c94
Update vignettes/datatable-joins.Rmd
venom1204 Feb 25, 2025
ff7505f
corrected
venom1204 Feb 25, 2025
c0f86bf
Suggest an explicit workaround
MichaelChirico Feb 25, 2025
36 changes: 36 additions & 0 deletions vignettes/datatable-joins.Rmd
@@ -569,6 +569,42 @@ ProductReceivedProd2[ProductSalesProd2,
nomatch = NULL]
```

### 4.1 Output column names in non-equi joins

When performing non-equi joins (`<`, `>`, `<=`, `>=`), column names are assigned as follows:

- The left operand (`x` column) determines the column name in the result.
- The right operand (`i` column) contributes values but does not retain its original name.
- By default, `data.table` does not retain the `i` column used in the join condition unless explicitly requested.

In non-equi joins, the left side of the operator (e.g., `x_int` in `x_int >= i_int`) must be a column from `x`, while the right side (e.g., `i_int`) must be a column from `i`.
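
As a minimal sketch of how this operand rule plays out: a condition you might naturally write with the `i` column first, such as `i_int > x_int`, can be expressed by flipping the operator so that the `x` column stays on the left. The tables `x_flip` and `i_flip` are hypothetical names introduced only for this illustration.

```{r non_equi_join_operand_order}
library(data.table)

# Illustrative tables (hypothetical names, not used elsewhere in the vignette)
x_flip <- data.table(x_int = 2:4, lower = letters[1:3])
i_flip <- data.table(i_int = c(2L, 4L, 5L), UPPER = LETTERS[1:3])

# "i_int > x_int" is written with the x column on the left: x_int < i_int
flipped <- x_flip[i_flip, on = .(x_int < i_int)]
flipped
```

The first row of `i_flip` has no smaller `x_int` to match, so it appears once with `NA` in `lower`.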

Non-equi joins do not currently support arbitrary expressions (but see [#1639](https://github.com/Rdatatable/data.table/issues/1639)). For example, `on = .(x_int >= i_int)` is valid, but `on = .(x_int >= i_int + 1L)` is not. To perform such a non-equi join, first add the expression as a new column, e.g. `i[, i_int_plus_one := i_int + 1L]`, then use `on = .(x_int >= i_int_plus_one)`.
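
The workaround can be sketched end to end as follows. This is only an illustration under stated assumptions: `x_demo` and `i_demo` are hypothetical tables, and the helper column name `i_int_plus_one` is simply the one suggested above.

```{r non_equi_join_expression_workaround}
library(data.table)

# Illustrative tables (hypothetical names, not used elsewhere in the vignette)
x_demo <- data.table(x_int = 2:4, lower = letters[1:3])
i_demo <- data.table(i_int = c(2L, 4L, 5L), UPPER = LETTERS[1:3])

# Step 1: materialize the expression as a regular column of i
i_demo[, i_int_plus_one := i_int + 1L]

# Step 2: join on the new column instead of the expression
shifted <- x_demo[i_demo, on = .(x_int >= i_int_plus_one)]
shifted
```

Because `:=` modifies `i_demo` by reference, the helper column stays in the table after the join; it can be dropped afterwards with `i_demo[, i_int_plus_one := NULL]` if it is no longer needed.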

```{r non_equi_join_example}
x <- data.table(x_int = 2:4, lower = letters[1:3])
i <- data.table(i_int = c(2L, 4L, 5L), UPPER = LETTERS[1:3])
x[i, on = .(x_int >= i_int)]
```

Key takeaways:

- The name of the output column (`x_int`) comes from `x`, but its values come from `i_int` in `i`.
- The last row has `NA` in `lower` because no rows in `x` match the last row in `i` (`UPPER == "C"`).
- The first row in `i` (`UPPER == "A"`) matches multiple rows in `x`, so it appears once per matching row.
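
These points can also be checked programmatically. The sketch below rebuilds the same data under hypothetical names (`x_chk`, `i_chk`) so that it runs on its own, then inspects the column names and values of the join result.

```{r non_equi_join_check_names}
library(data.table)

# Same data as the example above, under hypothetical names
x_chk <- data.table(x_int = 2:4, lower = letters[1:3])
i_chk <- data.table(i_int = c(2L, 4L, 5L), UPPER = LETTERS[1:3])

res <- x_chk[i_chk, on = .(x_int >= i_int)]

# The join column keeps x's name; there is no i_int column in the result
names(res)

# The values under x_int are i's values, repeated once per matching row of x
res$x_int

# Rows of i without a match in x have NA in the columns that come from x
res[is.na(lower)]
```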

If you want to keep the `i_int` column from `i`, you need to explicitly select it in the result:

```{r retain_i_column}
x[i, on = .(x_int >= i_int), .(i_int = i.i_int, x_int = x.x_int, lower, UPPER)]
```

Using prefixes (`x.` and `i.`) is not strictly necessary in this case since the names are unambiguous, but using them ensures the output clearly distinguishes `i_int` (from `i`) and `x_int` (from `x`).

If you want to exclude unmatched rows (an _inner join_), use `nomatch = NULL`:

```{r retain_i_column_inner_join}
x[i, on = .(x_int >= i_int), .(i_int = i.i_int, x_int = x.x_int, lower, UPPER), nomatch = NULL]
```

## 5. Rolling join
