Commit 1999bbd

nitish jha authored and committed
updating vignette
1 parent ff900d1 commit 1999bbd

1 file changed, +44 -0 lines changed

vignettes/datatable-intro.Rmd

Lines changed: 44 additions & 0 deletions
@@ -250,6 +250,50 @@ The function `length()` requires an input argument. We just need to compute the
This type of operation occurs quite frequently, especially while grouping (as we will see in the next section), to the point where `data.table` provides a *special symbol* `.N` for it.

### g) Handle non-existing elements in `i`

#### -- What happens when querying for non-existing elements?

When `i` refers to elements that do not exist in a `data.table`, the result depends on how the subset is written.

```r
dt <- data.table(x = letters[1:3], y = LETTERS[1:3])
setkeyv(dt, "x")
```

* **Key-based subsetting: `dt["d"]`**

This performs a right join on the key column `x`. Every element of `i` is kept, so the result has one row with `x` equal to `d` and `NA` in `y`, the column for which no match was found.

```r
dt["d"]
# Returns:
#    x    y
# 1: d <NA>
```

* **Logical subsetting: `dt[x == "d"]`**

This performs a standard subset operation. No row satisfies the condition, so it returns an empty `data.table`.

```r
dt[x == "d"]
# Returns:
# Empty data.table (0 rows and 2 cols): x,y
```

* **Exact match using `nomatch=NULL`**

To return only the elements of `i` that match, with no `NA` rows for non-existing elements, use `nomatch=NULL`:

```r
dt["d", nomatch=NULL]
# Returns:
# Empty data.table (0 rows and 2 cols): x,y
```

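The difference between the two key-based forms is easier to see when `i` mixes an existing key with a non-existing one. The following is a minimal sketch that rebuilds the same `dt`; the lookup vector `keys` and the result names are hypothetical, introduced only for illustration.

```r
library(data.table)

dt <- data.table(x = letters[1:3], y = LETTERS[1:3])
setkeyv(dt, "x")

# Hypothetical lookup vector: "a" exists in the key column, "d" does not
keys <- c("a", "d")

# Default behavior: one row per element of `keys`,
# with NA in `y` for the unmatched key "d"
res_default <- dt[keys]

# nomatch = NULL: unmatched keys are dropped, so only the "a" row remains
res_matched <- dt[keys, nomatch = NULL]

nrow(res_default)  # 2
nrow(res_matched)  # 1
```

With the default, the result always has as many rows as there are elements in `i`, which is convenient when the output must line up with the lookup vector. With `nomatch=NULL`, the result contains only the rows that were actually found.
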
Knowing how each method treats non-existing elements helps avoid unexpected `NA` rows or empty results.

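If the goal is to find out which requested keys are absent, rather than to retrieve rows, one possible approach is to ask for the matching row numbers with `which = TRUE`. This is a sketch under the same assumptions as the example `dt` above; `keys`, `idx`, and `missing_keys` are hypothetical names.

```r
library(data.table)

dt <- data.table(x = letters[1:3], y = LETTERS[1:3])
setkeyv(dt, "x")

# Hypothetical lookup vector: "a" exists in the key column, "d" does not
keys <- c("a", "d")

# which = TRUE returns the row numbers of `dt` matched by each key
# instead of the rows themselves; an unmatched key yields NA
idx <- dt[keys, which = TRUE]
idx            # 1 NA

# Keys without a match in `dt`
missing_keys <- keys[is.na(idx)]
missing_keys   # "d"
```
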
#### Special symbol `.N`: {#special-N}

`.N` is a special built-in variable that holds the number of observations _in the current group_. It is particularly useful when combined with `by` as we'll see in the next section. In the absence of group by operations, it simply returns the number of rows in the subset.
