Commit 29e768a

committed
correction
1 parent 87dcae9 commit 29e768a

1 file changed

+5
-19
lines changed

vignettes/datatable-joins.Rmd

Lines changed: 5 additions & 19 deletions
@@ -576,24 +576,18 @@ When performing non-equi joins (<, >, <=, >=), it's important to understand how
576576
- The left operand (`x` column) determines the column name in the result.
577577
- The right operand (`i` column) contributes values but does not retain its original name.
578578
- By default, `data.table` does not retain the `i` column used in the join condition unless explicitly requested.
579-
`In non-equi joins, the resulting column inherits the name from the left (x) table but contains values from the right (i) table.`
580579

581-
This can cause confusion when `x` and `i` have different column names.
580+
In non-equi joins, the left side of the operator (e.g., `A` in `A >= B`) must be a column from `x`,
581+
and the right side (e.g., `B`) must be a column from `i`. Non-equi join does not support arbitrary expressions. For example, `on = .(x_col >= i_col)` is valid, but `on = .(x_col >= i_col + 1)` is not.
582582

583-
**Important**: Non-equi join conditions must use column names from `x` and `i`, *not arbitrary expressions*.
584-
For example, `on = .(x_col >= i_col)` is valid, but `on = .(x_col >= i_col + 1)` is not.
585-
586-
In non-equi joins, the left side of the operator (e.g., `A` in `A >= B`) *must be a column from `x`*,
587-
and the right side (e.g., `B`) *must be a column from `i`*.
588-
589-
To use expressions, create temporary columns first (see example below).
583+
Arbitrary comparisons can be accomplished by creating temporary columns first. For example:
590584

591585
```{r}
592586
x <- data.table(A = 1:5, value_x = letters[1:5])
593587
i <- data.table(B = c(2, 4, 5), value_i = LETTERS[1:3])
594588
x[i, on = .(A >= B)]
595589
```
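
The example above joins on plain columns only, so here is a minimal sketch of the temporary-column approach for a condition such as `A >= B + 1`. The helper column name `B_plus_1`, the result names `res` and `x_A`, and the use of `nomatch = NULL` to drop unmatched rows of `i` are choices made for this illustration, not part of the commit.

```r
library(data.table)

x <- data.table(A = 1:5, value_x = letters[1:5])
i <- data.table(B = c(2, 4, 5), value_i = LETTERS[1:3])

# `on = .(A >= B + 1)` is not accepted, so materialise the expression first.
# B_plus_1 is a hypothetical helper column used only for this sketch.
i[, B_plus_1 := B + 1]

# Join on the helper column. The i. and x. prefixes pick the original
# values from each table; nomatch = NULL drops rows of i with no match in x.
res <- x[i, on = .(A >= B_plus_1),
         .(B = i.B, x_A = x.A, value_x, value_i),
         nomatch = NULL]

# Remove the helper column so i is left as it started.
i[, B_plus_1 := NULL]

print(res)
```

Adding the helper column to `i` by reference and then deleting it avoids copying the table. If `i` must not be modified even temporarily, the same join could be run on `copy(i)` instead.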
596-
In data.table, when using a non-equi join condition (>=, <, etc.), the column from x is retained in the result, while the column from i is not retained unless explicitly selected.
590+
```In data.table, when using a non-equi join condition (>=, <, etc.), the column from x is retained in the result, while the column from i is not retained unless explicitly selected.```
597591

598592
Expected Output
599593
A value_x value_i
@@ -604,15 +598,7 @@ Expected Output
604598

605599
If multiple rows in x satisfy the join condition with a single row in i, those rows will be duplicated in the result.
606600

607-
Notice that A appears in the result, but B from i is missing.
608-
**Note for SQL Users**: Unlike SQL, `data.table` non-equi joins:
609-
- Do not retain the `i` column used in the join condition unless explicitly selected.
610-
- Use the `x` column name in the result (e.g., `A` instead of `B` in `A >= B`).
611-
612-
This is because B was only used for filtering and is not retained unless explicitly selected.
613-
However, columns from i that are not used in the join condition (e.g., value_i) are automatically included in the output by default, since data.table keeps all non-matching columns from i.
614-
615-
`If you want to keep the B column from i, you need to explicitly select it in the result:`
601+
If you want to keep the B column from i, you need to explicitly select it in the result:
616602

617603
```{r}
618604
x[i, on = .(A >= B), .(B, A, value_x, value_i)]
