Commit 7b75771

fwrite: pre-encode strings and factor levels
Previously, fwrite() deferred encoding of strings to getString and getCategString in fwriteR.c, which are called from OpenMP threads. Calling translateChar or translateCharUTF8 to encode a string allocates memory unless the string is already in the desired encoding, and allocating is unsafe on a non-main thread. Pre-encoding strings and factor levels in R before the call avoids this.

Fixes: #6883
1 parent 2cb0316 commit 7b75771

2 files changed: +13 -0 lines changed

R/fwrite.R

Lines changed: 9 additions & 0 deletions
@@ -111,6 +111,15 @@ fwrite = function(x, file="", append=FALSE, quote="auto",
   }
   # nocov end
   file = enc2native(file) # CfwriteR cannot handle UTF-8 if that is not the native encoding, see #3078.
+  # pre-encode any strings or factor levels to avoid translateChar trying to allocate from OpenMP threads
+  if (encoding %chin% c("UTF-8", "native")) {
+    enc = switch(encoding, "UTF-8" = enc2utf8, "native" = enc2native)
+    x = lapply(x, function(x) {
+      if (is.character(x)) x = enc(x)
+      if (is.factor(x)) levels(x) = enc(levels(x))
+      x
+    })
+  }
   .Call(CfwriteR, x, file, sep, sep2, eol, na, dec, quote, qmethod=="escape", append,
         row.names, col.names, logical01, scipen, dateTimeAs, buffMB, nThread,
         showProgress, is_gzip, compressLevel, bom, yaml, verbose, encoding)
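
The pre-encoding step in this hunk can be tried in isolation. The following is a minimal sketch in base R, assuming a hypothetical helper name `pre_encode` (not a data.table function) and a plain list standing in for the `x` passed to `fwrite()`. It mirrors the logic of the added lines rather than reproducing the package's code path.

```r
# Hypothetical standalone helper mirroring the pre-encoding step in the diff.
# `pre_encode` is an illustrative name, not part of data.table.
pre_encode = function(x, encoding = c("UTF-8", "native")) {
  encoding = match.arg(encoding)
  enc = switch(encoding, "UTF-8" = enc2utf8, "native" = enc2native)
  lapply(x, function(col) {
    if (is.character(col)) col = enc(col)              # re-encode character vectors
    if (is.factor(col)) levels(col) = enc(levels(col)) # re-encode factor levels only
    col
  })
}

# A latin1-marked string holding the single byte 0xf8 (U+00F8 in latin1).
# rawToChar() keeps the construction independent of the session locale.
s = rawToChar(as.raw(0xf8))
Encoding(s) = "latin1"

cols = list(chr = c(s, "abc"), fct = factor(c(s, "abc")), num = 1:2)
out = pre_encode(cols, "UTF-8")

Encoding(out$chr)             # the non-ASCII element is now marked UTF-8
Encoding(levels(out$fct))     # likewise for the matching factor level
identical(out$num, cols$num)  # non-string columns pass through unchanged
```

As in the diff, `lapply()` returns a plain list, so attributes on the input container are not carried over in this sketch.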

inst/tests/tests.Rraw

Lines changed: 4 additions & 0 deletions
@@ -21113,3 +21113,7 @@ test(2309.09, as.data.table(df, keep.rownames=TRUE), data.table(rn = c("a","b"),
 as.data.frame.no.reset = function(x) x
 DF = structure(list(a = 1:2), class = c("data.frame", "no.reset"), row.names = c(NA, -2L))
 test(2310.01, as.data.table(DF), data.table(a=1:2))
+
+# avoid translateChar*() in OpenMP threads, #6883
+DF = list(rep(iconv(strrep("\uf8", 100), to = "latin1"), 100000))
+test(2311, fwrite(DF, nullfile(), encoding = "UTF-8"), NULL)
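
To see why this input exercises the fix, a small base-R sketch (no data.table required) can inspect a single element similar to the test string. It is marked latin1, so writing it as UTF-8 needs a real translation rather than a pass-through. Building the string with `rawToChar()` is a substitution for the `iconv()` call in the test, chosen here so that the sketch does not depend on the session locale.

```r
# Build a latin1-marked string of 100 copies of the byte 0xf8 (U+00F8),
# similar to one element of the test input.
s = rawToChar(rep(as.raw(0xf8), 100))
Encoding(s) = "latin1"

Encoding(s)            # "latin1"
length(charToRaw(s))   # 100: one byte per character in latin1

# Converting to UTF-8 produces a new, longer string.
u = enc2utf8(s)
Encoding(u)            # "UTF-8"
length(charToRaw(u))   # 200: U+00F8 takes two bytes in UTF-8

# Once converted, a second conversion leaves the bytes unchanged.
identical(charToRaw(enc2utf8(u)), charToRaw(u))
```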
