35 commits
1fb7294
added logic to handle select argumnt using .shallow()
Mukulyadav2004 Aug 4, 2025
ee3e85f
add conditional passing
Mukulyadav2004 Aug 4, 2025
01d8bbe
assign x when selct is used
Mukulyadav2004 Aug 4, 2025
9837fd4
Merge branch 'master' into issue#4177
Mukulyadav2004 Aug 4, 2025
97cb32b
added coverage tests
Mukulyadav2004 Aug 4, 2025
4f9928a
typo
Mukulyadav2004 Aug 4, 2025
0eb227f
merge master
Mukulyadav2004 Aug 4, 2025
8424176
right number
Mukulyadav2004 Aug 4, 2025
d0956f3
manual entry + news
Mukulyadav2004 Aug 5, 2025
bb8320d
moved select logic to address matrix inputs also
Mukulyadav2004 Aug 5, 2025
8e63991
for matrix handling
Mukulyadav2004 Aug 5, 2025
d237f57
add tests for other classes too
Mukulyadav2004 Aug 5, 2025
3a718c0
remove trail whit space
Mukulyadav2004 Aug 5, 2025
23214d1
is.null
Mukulyadav2004 Aug 5, 2025
2744a89
add tests
Mukulyadav2004 Aug 5, 2025
66d3744
trailing space
Mukulyadav2004 Aug 5, 2025
7f7a061
Merge branch 'issue#4177' of https://github.com/Rdatatable/data.table…
Mukulyadav2004 Aug 8, 2025
1887699
merge master branch
Mukulyadav2004 Aug 8, 2025
d5c933b
added atime performance test
Mukulyadav2004 Aug 8, 2025
b0e7f82
add condition to when to use select
Mukulyadav2004 Aug 8, 2025
8fe7002
added select parameter to docs also
Mukulyadav2004 Aug 14, 2025
2a0549f
Merge branch 'master' into issue#4177
tdhock Aug 15, 2025
ec25903
setup and branching only in expr and added cases
Mukulyadav2004 Aug 15, 2025
e54bd71
document select arg
Aug 15, 2025
6047fa3
Merge branch 'issue#4177' of https://github.com/rdatatable/data.table…
Aug 15, 2025
c2c5d89
bypass name res for num select, avoid no-op shallow, and cols in pla…
Mukulyadav2004 Aug 15, 2025
926bb0e
better writing style
Mukulyadav2004 Aug 15, 2025
cbffb5b
seperate both cases
Mukulyadav2004 Aug 15, 2025
ea91282
use only is.numeric(x)
Mukulyadav2004 Aug 15, 2025
28705de
better manual docs
Mukulyadav2004 Aug 15, 2025
94a9b59
closed parenthisis
Mukulyadav2004 Aug 20, 2025
377620e
improved test stmt
Mukulyadav2004 Aug 20, 2025
d834811
imprvd tst stmt
Mukulyadav2004 Aug 20, 2025
08f6fee
use select parm instead of x in is.numeric
Mukulyadav2004 Aug 22, 2025
b072be5
fwrite call to use data.table subsetting
Mukulyadav2004 Aug 23, 2025
2 changes: 2 additions & 0 deletions NEWS.md
@@ -69,6 +69,8 @@

15. New function `isoyear()` has been implemented as a complement to `isoweek()`, returning the ISO 8601 year corresponding to a given date, [#7154](https://github.com/Rdatatable/data.table/issues/7154). Thanks to @ben-schwen and @MichaelChirico for the suggestion and @venom1204 for the implementation.

16. `fwrite()` gains a `select` argument to write only the specified columns without first creating a temporary subset, [#4177](https://github.com/Rdatatable/data.table/issues/4177). For `data.table` input, the selection is made with `.shallow()`, which creates a shallow copy without duplicating the column data. Thanks to @artidataio for the feature request, @ColeMiller1 for suggesting the implementation, and @Mukulyadav2004 for the implementation.

### BUG FIXES

1. `fread()` no longer warns on certain systems on R 4.5.0+ where the file owner can't be resolved, [#6918](https://github.com/Rdatatable/data.table/issues/6918). Thanks @ProfFancyPants for the report and PR.
14 changes: 13 additions & 1 deletion R/fwrite.R
@@ -13,7 +13,8 @@
yaml = FALSE,
bom = FALSE,
verbose=getOption("datatable.verbose", FALSE),
encoding = "") {
encoding = "",
select = NULL) {
na = as.character(na[1L]) # fix for #1725
if (length(encoding) != 1L || !encoding %chin% c("", "UTF-8", "native")) {
stopf("Argument 'encoding' must be '', 'UTF-8' or 'native'.")
@@ -26,6 +27,7 @@
buffMB = as.integer(buffMB)
nThread = as.integer(nThread)
compressLevel = as.integer(compressLevel)

Check warning on line 30 in R/fwrite.R (GitHub Actions / lint-r): col 1, [trailing_whitespace_linter] Remove trailing whitespace.
# write.csv default is 'double' so fwrite follows suit. write.table's default is 'escape'
# validate arguments
if (is.matrix(x)) { # coerce to data.table if input object is matrix
@@ -38,6 +40,16 @@
x = as.data.table(x)
}
}
# Handle select argument using .shallow()
if (!is.null(select)) {
cols = colnamesInt(x, select)

Check warning on line 45 in R/fwrite.R (GitHub Actions / lint-r): col 34, [trailing_whitespace_linter] Remove trailing whitespace.
if (is.data.table(x)) {
x = .shallow(x, cols)
} else {
x = x[select]
}
}

Check warning on line 52 in R/fwrite.R (GitHub Actions / lint-r): col 1, [trailing_whitespace_linter] Remove trailing whitespace.
stopifnot(
is.list(x),
identical(quote,"auto") || isTRUEorFALSE(quote),
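
Note on the `select` handling in the hunk above: `colnamesInt()` is internal to data.table, so its exact behaviour is not reproduced here. The base-R sketch below only illustrates the kind of name-or-number resolution that step performs. The helper name `resolve_select()` and its error messages are assumptions made for illustration, not the package's implementation.

```r
# Hypothetical helper (NOT data.table's internal colnamesInt()):
# maps a `select` vector of column names or positions to integer positions.
resolve_select = function(x, select) {
  if (is.character(select)) {
    idx = match(select, names(x))            # NA for names not present in x
    if (anyNA(idx)) {
      stop("select contains names not found in x: ",
           paste(select[is.na(idx)], collapse = ", "))
    }
    return(idx)
  }
  if (is.numeric(select)) {
    idx = as.integer(select)                 # note: silently truncates 2.5 to 2
    if (anyNA(idx) || any(idx < 1L | idx > length(x))) {
      stop("select contains column numbers outside 1..", length(x))
    }
    return(idx)
  }
  stop("select must be a character or numeric vector")
}

df = data.frame(a = 1:2, b = 3:4, c = 5:6)
resolve_select(df, c("c", "a"))  # 3 1
resolve_select(df, 2)            # 2
```

Resolving to integer positions once means the later subsetting step can treat names and numbers the same way, whatever the class of `x`.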
7 changes: 7 additions & 0 deletions inst/tests/tests.Rraw
@@ -21602,3 +21602,10 @@ if (getRversion() >= "4.0.0") { # rely on stopifnot(named = ...) for correct mes
test(2337.3, is.null(fwrite(data.table(c(0.1, 0.2)), dec=",", sep="\t")))
test(2337.4, is.null(fwrite(data.table(a=numeric(), b=numeric()), dec=",", sep=",")))
test(2337.5, is.null(fwrite(data.table(a=numeric()), dec=",", sep=",")))

# test for select parameter #4177
DT = data.table(a=1:2, b=3:4)
Review comment from a Member:
Please add tests for other classes, e.g.:

df = as.data.frame(DT)
l = as.list(DT)
m = as.matrix(DT)

f = tempfile()
fwrite(DT, f, select = "a")
test(2338.1, names(fread(f)), "a")
unlink(f)
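
The reviewer's request to cover other input classes could be exercised along the following lines. Because `select` is only available in builds that include this PR, the sketch uses a hypothetical wrapper, `fwrite_select()`, that emulates the proposed behaviour on top of the released `fwrite()` API. The wrapper name and its body are assumptions for illustration and are not the code under review.

```r
library(data.table)

# Hypothetical wrapper (not part of data.table): emulates the proposed
# `select` argument by subsetting columns before calling fwrite().
fwrite_select = function(x, file, select = NULL, ...) {
  if (is.matrix(x)) x = as.data.table(x)   # fwrite() also coerces matrices
  if (!is.null(select)) {
    x = if (is.data.table(x)) x[, select, with = FALSE] else x[select]
  }
  fwrite(x, file, ...)
}

DT = data.table(a = 1:2, b = 3:4)
inputs = list(DT = DT, df = as.data.frame(DT), l = as.list(DT), m = as.matrix(DT))

# Check selection by name and by position for every input class.
for (nm in names(inputs)) {
  f = tempfile()
  fwrite_select(inputs[[nm]], f, select = "a")
  stopifnot(identical(names(fread(f)), "a"))
  fwrite_select(inputs[[nm]], f, select = 2L)
  stopifnot(identical(names(fread(f)), "b"))
  unlink(f)
}
```

In the package's own test file the same cases would be written as numbered `test()` calls against `fwrite(..., select = )` directly.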
1 change: 1 addition & 0 deletions man/fwrite.Rd
@@ -62,6 +62,7 @@ fwrite(x, file = "", append = FALSE, quote = "auto",
\item{bom}{If \code{TRUE} a BOM (Byte Order Mark) sequence (EF BB BF) is added at the beginning of the file; format 'UTF-8 with BOM'.}
\item{verbose}{Be chatty and report timings?}
\item{encoding}{ The encoding of the strings written to the CSV file. Default is \code{""}, which means writing raw bytes without considering the encoding. Other possible options are \code{"UTF-8"} and \code{"native"}. }
\item{select}{Vector of column names or column numbers specifying which columns to write. When \code{NULL} (default), all columns are written. Selecting columns here avoids creating a temporary subset of \code{x} before writing.}
}
\details{
\code{fwrite} began as a community contribution with \href{https://github.com/Rdatatable/data.table/pull/1613}{pull request #1613} by Otto Seiskari. This gave Matt Dowle the impetus to specialize the numeric formatting and to parallelize: \url{https://h2o.ai/blog/2016/fast-csv-writing-for-r/}. Final items were tracked in \href{https://github.com/Rdatatable/data.table/issues/1664}{issue #1664} such as automatic quoting, \code{bit64::integer64} support, decimal/scientific formatting exactly matching \code{write.csv} between 2.225074e-308 and 1.797693e+308 to 15 significant figures, \code{row.names}, dates (between 0000-03-01 and 9999-12-31), times and \code{sep2} for \code{list} columns where each cell can itself be a vector.