Commit 77f26c4

added changes suggested by ben-schwen
1 parent fbb7c86 commit 77f26c4

1 file changed

+10
-4
lines changed

vignettes/datatable-fread-and-fwrite.Rmd

Lines changed: 10 additions & 4 deletions
@@ -49,16 +49,19 @@ library(data.table)
4949
fread("grep -v HEADER example_data.txt")
5050
```
5151

52-
The `-v` option makes `grep` return all lines except those containing the string 'HEADER'. Given the number of high quality engineers that have looked at the command tool grep over the years, it is most likely that it is as fast as you can get, as well as being correct, convenient, well documented online, easy to learn and search for solutions for specific tasks. If you need to perform more complex string filtering (e.g., matching strings at the beginning or end of lines), the `grep` syntax is very powerful. Learning its syntax is a transferable skill for other languages and environments.
53-
52+
The `-v` option makes `grep` return all lines except those containing the string 'HEADER'.
53+
54+
> "Given the number of high quality engineers that have looked at the command tool grep over the years, it is most likely that it is as fast as you can get, as well as being correct, convenient, well documented online, easy to learn and search for solutions for specific tasks. If you need to perform more complex string filtering (e.g., matching strings at the beginning or end of lines), the grep syntax is very powerful. Learning its syntax is a transferable skill for other languages and environments."
55+
>
56+
> — Matt Dowle
57+
5458
Look at this [example](https://stackoverflow.com/questions/36256706/fread-together-with-grepl/36270543#36270543) for more detail.
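
As a rough illustration of the anchored filtering mentioned in the quote, the sketch below drops only lines that *begin* with `HEADER` by passing a `grep` command to `fread()` through its `cmd` argument. The file contents and the names `tmp_file` and `filtered` are made up for this example, and it assumes `grep` is available on the system path.

```{r}
library(data.table)

# Hypothetical input: two data rows interleaved with lines starting with "HEADER".
tmp_file = tempfile(fileext = ".txt")
writeLines(c(
  "HEADER generated by some tool",
  "id,fruit",
  "1,apple",
  "HEADER repeated mid-file",
  "2,banana"
), tmp_file)

# '^HEADER' anchors the match at the start of the line; -v inverts the match,
# so only lines that do not start with HEADER reach fread().
# shQuote() protects the pattern and the path from the shell.
filtered = fread(cmd = paste("grep -v", shQuote("^HEADER"), shQuote(tmp_file)))
print(filtered)
```

Anchoring the pattern with `^` keeps data rows that merely contain the word `HEADER` somewhere other than the start of the line, which a plain `grep -v HEADER` would discard.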
5559

5660
On Windows we recommend [Cygwin](https://www.cygwin.com/) (run one .exe to install) which includes the command line tools such as grep. In March 2016, Microsoft [announced](https://www.hanselman.com/blog/developers-can-run-bash-shell-and-usermode-ubuntu-linux-binaries-on-windows-10) they will include these tools in Windows 10 natively. On Linux and macOS, these tools have always been included in the operating system. You can find many examples and tutorials about command line tools online. We recommend [Data Science at the Command Line](https://www.oreilly.com/library/view/data-science-at/9781491947845/).
5761

5862
#### 1.1.1 Reading directly from a text string
5963

60-
`fread()` can read data directly from a character string in R using the text argument. This is particularly handy for creating reproducible examples, testing code snippets, or working with data generated programmatically within your R session. Each line in the string should be separated by a newline character `(
61-
)`.
64+
`fread()` can read data directly from a character string in R using the text argument. This is particularly handy for creating reproducible examples, testing code snippets, or working with data generated programmatically within your R session. Each line in the string should be separated by a newline character `\n`.
6265

6366
```{r}
6467
my_data_string = "colA,colB,colC\n1,apple,TRUE\n2,banana,FALSE\n3,orange,TRUE"
@@ -83,6 +86,9 @@ In many cases, `fread()` can automatically detect and decompress files with comm
8386
- `.gz` (gzip): Supported and works out of the box.
8487
- `.xz` (xz): Supported and works out of the box.
8588
- `.zip` (ZIP archives, single file): Supported—`fread()` will read the first file in the archive if only one file is present.
89+
- `.tar` (tar archives, single file): Supported—`fread()` will read the first file in the archive if only one file is present.
90+
91+
> Note: If there are multiple files in the archive, `fread()` will fail with an error.
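
One possible workaround for multi-file archives, sketched here with base R's `tar()` and `untar()`, is to list the archive members, extract only the file you need, and pass that file to `fread()`. The archive, the file names, and the variables (`demo_dir`, `tar_path`, `members`, `second`) are hypothetical and built on the fly purely for illustration.

```{r}
library(data.table)

# Build a small two-file tar archive in a temporary directory (illustrative data).
demo_dir = tempfile("archive_demo")
dir.create(demo_dir)
fwrite(data.table(id = 1:3, value = c("a", "b", "c")), file.path(demo_dir, "first.csv"))
fwrite(data.table(id = 4:6, value = c("d", "e", "f")), file.path(demo_dir, "second.csv"))

old_wd = setwd(demo_dir)
tar("demo.tar", files = c("first.csv", "second.csv"))
setwd(old_wd)
tar_path = file.path(demo_dir, "demo.tar")

# Inspect the archive without extracting anything.
members = untar(tar_path, list = TRUE)
print(members)

# Extract only the member of interest, then read it as a regular file.
out_dir = tempfile("extracted")
dir.create(out_dir)
untar(tar_path, files = "second.csv", exdir = out_dir)
second = fread(file.path(out_dir, "second.csv"))
print(second)
```

The same pattern should carry over to ZIP archives by swapping in `unzip(..., list = TRUE)` and `unzip(..., files = ..., exdir = ...)`.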
8692
8793
### 1.2 Automatic separator and skip detection
8894
