Commit e1475cc

datatable-fread-and-fwrite.Rmd #7253 EN
reporting corrections of FR review
1 parent 7e37f3e commit e1475cc

1 file changed: +10 -6 lines changed

vignettes/datatable-fread-and-fwrite.Rmd

Lines changed: 10 additions & 6 deletions
@@ -71,7 +71,8 @@ print(dt_from_text)
 
 #### 1.1.2 Reading from URLs
 
-`fread()` can read data directly from web URLs by passing the URL as a character string to its `file` argument. This allows you to download and read data from the internet in one step.
+`fread()` can read data directly from web URLs by passing the URL as a character string to its `file` argument.
+This allows you to download and read data from the internet in one step.
 
 ```{r}
 # dt = fread("https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv")
@@ -86,7 +87,7 @@ In many cases, `fread()` can automatically detect and decompress files with comm
 - `.gz` / `.bz2` (gzip / bzip2): Supported and works out of the box.
 - `.zip` / `.tar` (ZIP / tar archives, single file): Supported—`fread()` will read the first file in the archive if only one file is present.
 
-> Note: If there are multiple files in the archive, `fread()` will fail with an error.
+**Note**: If there are multiple files in the archive, `fread()` will fail with an error.
 
 ### 1.2 Automatic separator and skip detection
 
@@ -112,7 +113,7 @@ By default (`skip="auto"`), `fread` will automatically skip blank lines and comm
 
 ### 1.3 High-Quality Automatic Column Type Detection
 
-Many real-world datasets contain columns that are initially blank, zero-filled, or appear numeric but later contain characters. To handle such inconsistencies, `fread()` in `data.table` employs a robust column type detection strategy.
+Many real-world datasets contain columns that are initially blank, zero-filled, or appear numeric but later contain characters. To handle such inconsistencies, `fread()` employs a robust column type detection strategy.
 
 Since v1.10.5, `fread()` samples rows by reading blocks of contiguous rows from multiple equally spaced points across the file, including the start, middle, and end. The total number of rows sampled is chosen dynamically based on the file size and structure, and is typically around 10,000, but can be smaller or slightly larger. This wide sampling helps detect type changes that occur later in the data (e.g., `001` to `0A0` or blanks becoming populated).
 
@@ -142,7 +143,9 @@ All detection logic and any rereads are detailed when `verbose=TRUE` is enabled.
 
 ### 1.4 Early Error Detection at End-of-File
 
-Because the large sample explicitly includes the very end of the file, critical issues—such as an inconsistent number of columns, a malformed footer, or an opening quote without a matching closing quote—can be detected and reported almost instantly. This early error detection avoids the unnecessary overhead of processing the entire file or allocating excessive memory, only to encounter a failure at the final step. It ensures faster feedback and more efficient resource usage, especially when working with large datasets.
+Because the large sample explicitly includes the very end of the file, critical issues—such as an inconsistent number of columns, a malformed footer, or an opening quote without a matching closing quote—can be detected and reported almost instantly.
+This early error detection avoids the unnecessary overhead of processing the entire file or allocating excessive memory, only to encounter a failure at the final step.
+It ensures faster feedback and more efficient resource usage, especially when working with large datasets.
 
 ### 1.5 `integer64` Support
 
@@ -195,7 +198,7 @@ Use `skip="string"` in `fread` to search for a line containing a substring (typi
 
 Supported Scenarios:
 - Unescaped quotes inside quoted fields
-e.g., `"This "quote" is invalid, but fread works anyway"` — supported as long as column count remains consistent.
+e.g., `"This "quote" is invalid, but fread works anyway"` — supported as long as column count remains consistent :
 
 ```{r}
 data.table::fread(text='x,y\n"This "quote" is invalid, but fread works anyway",1')
@@ -218,7 +221,8 @@ From v1.10.6, `fread` resolves ambiguities more reliably across the entire file
 
 ## 2. fwrite()
 
-`fwrite()` is the fast file writer companion to `fread()`. It’s designed for speed, sensible defaults, and ease of use, mirroring many of the conveniences found in `fread`.
+`fwrite()` is the fast file writer companion to `fread()`.
+It’s designed for speed, sensible defaults, and ease of use, mirroring many of the conveniences found in `fread`.
 
 ### 2.1 Intelligent and Minimalist Quoting (quote="auto")
 
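Note on the section 1.3 hunk: the type detection it describes can be illustrated with a small sketch. This is not part of the commit. It assumes `data.table` is installed, and the column names `id` and `code` and the row count are made up for illustration. A column that looks numeric for 20,000 rows and only turns non-numeric on the final row should still come back as character, whether `fread()` catches the change in its sample or through a reread.

```r
library(data.table)

# Hypothetical data: 20,000 rows where `code` looks numeric ("001"),
# followed by one final row where it contains a letter ("0A0").
lines <- c(
  "id,code",
  sprintf("%d,001", 1:20000),
  "20001,0A0"
)

# `text` accepts the input itself as a character vector of lines.
dt <- fread(text = lines)

# The late non-numeric value makes the whole column character,
# so the leading zeros in the earlier rows are kept as text.
print(class(dt$code))
print(tail(dt, 2))
```

Passing `verbose = TRUE` to the same call should print the detection details mentioned in the vignette, which is a convenient way to see how the column type was chosen.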