Commit 45bf1d4

Merge branch 'master' into fix_fwrite_length
2 parents 4e91a21 + d6a9fe7 commit 45bf1d4

20 files changed: +116 -9 lines changed

.ci/atime/tests.R

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 # A list of performance tests.
 #
+# See documentation in https://github.com/Rdatatable/data.table/wiki/Performance-testing for best practices.
+#
 # Each entry in this list corresponds to a performance test and contains a sublist with three mandatory arguments:
 # - N: A numeric sequence of data sizes to vary.
 # - setup: An expression evaluated for every data size before measuring time/memory.

CODEOWNERS

Lines changed: 3 additions & 0 deletions
@@ -60,3 +60,6 @@
 # performance testing
 /.ci/atime/tests.R @tdhock @Anirban166
 /.github/workflows/performance-tests.yaml @Anirban166
+
+# docs
+/man/openmp-utils.Rd @Anirban166

NEWS.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@
 
 1. Using `print.data.table()` with character truncation using `datatable.prettyprint.char` no longer errors with `NA` entries, [#6441](https://github.com/Rdatatable/data.table/issues/6441). Thanks to @r2evans for the bug report, and @joshhwuu for the fix.
 
+2. `fwrite()` respects `dec=','` for timestamp columns (`POSIXct` or `nanotime`) with sub-second accuracy, [#6446](https://github.com/Rdatatable/data.table/issues/6446). Thanks @kav2k for pointing out the inconsistency and @MichaelChirico for the PR.
+
 ## NOTES
 
 1. Tests run again when some Suggests packages are missing, [#6411](https://github.com/Rdatatable/data.table/issues/6411). Thanks @aadler for the note and @MichaelChirico for the fix.

inst/tests/other.Rraw

Lines changed: 5 additions & 0 deletions
@@ -761,3 +761,8 @@ if (loaded[["dplyr"]]) {
   DT = data.table(a = 1, b = 2, c = '1,2,3,4', d = 4)
   test(30, DT[, c := strsplit(c, ',', fixed = TRUE) %>% lapply(as.integer) %>% as.list]$c, list(1:4)) # nolint: pipe_call_linter. Mimicking MRE as filed.
 }
+
+if (loaded[["nanotime"]]) {
+  # respect dec=',' for nanotime, related to #6446, corresponding to tests 2281.*
+  test(31, fwrite(data.table(as.nanotime(.POSIXct(0))), dec=',', sep=';'), output="1970-01-01T00:00:00,000000000Z")
+}

inst/tests/tests.Rraw

Lines changed: 4 additions & 0 deletions
@@ -19064,3 +19064,7 @@ test(2280.1, internal_error("broken"), error="Internal error.*broken")
 test(2280.2, internal_error("broken %d%s", 1, "2"), error="Internal error.*broken 12")
 foo = function(...) internal_error("broken")
 test(2280.3, foo(), error="Internal error in foo: broken")
+
+# fwrite respects dec=',' for sub-second timestamps, #6446
+test(2281.1, fwrite(data.table(a=.POSIXct(0.001)), dec=',', sep=';'), output="1970-01-01T00:00:00,001Z")
+test(2281.2, fwrite(data.table(a=.POSIXct(0.0001)), dec=',', sep=';'), output="1970-01-01T00:00:00,000100Z")
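
Reviewer note (illustrative, not part of the commit): the two expected outputs above use a comma as the decimal mark and print three fractional digits when milliseconds suffice, six otherwise. A minimal C sketch of that formatting rule is below. It is not data.table's writer: `format_timestamp` is a hypothetical helper, it takes the sub-second part as an integer number of microseconds (an assumption made to sidestep floating-point rounding), it assumes whole seconds print with no fractional part, and it does not cover the nine-digit `nanotime` case.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not data.table's implementation.
   Writes epoch_sec (+ micros microseconds) as ISO 8601 UTC into out, using
   `dec` as the decimal mark. Prints 0, 3 or 6 fractional digits depending on
   how much precision is needed. Returns 0 on success, -1 on failure. */
static int format_timestamp(int64_t epoch_sec, int32_t micros, char dec,
                            char *out, size_t outlen) {
  assert(micros >= 0 && micros < 1000000);
  time_t t = (time_t)epoch_sec;
  const struct tm *tm = gmtime(&t);
  if (tm == NULL) return -1;

  char base[32];
  if (strftime(base, sizeof base, "%Y-%m-%dT%H:%M:%S", tm) == 0) return -1;

  char frac[16] = "";  /* decimal mark plus up to 6 digits, or empty */
  if (micros % 1000 != 0)
    snprintf(frac, sizeof frac, "%c%06d", dec, (int)micros);          /* microseconds */
  else if (micros != 0)
    snprintf(frac, sizeof frac, "%c%03d", dec, (int)(micros / 1000)); /* milliseconds */

  /* +2 accounts for the trailing 'Z' and the terminating NUL */
  if (strlen(base) + strlen(frac) + 2 > outlen) return -1;
  snprintf(out, outlen, "%s%sZ", base, frac);
  return 0;
}
```

Calling `format_timestamp(0, 1000, ',', buf, sizeof buf)` fills `buf` with the string expected by test 2281.1, and passing `100` microseconds gives the string expected by test 2281.2.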

man/openmp-utils.Rd

Lines changed: 5 additions & 4 deletions
@@ -41,17 +41,18 @@
 \item\file{cj.c} - \code{\link{CJ}()}
 \item\file{coalesce.c} - \code{\link{fcoalesce}()}
 \item\file{fifelse.c} - \code{\link{fifelse}()}
-\item\file{fread.c} - \code{\link{fread}()}
+\item\file{fread.c}, \file{freadR.c} - \code{\link{fread}(). Parallelized across row-based chunks of the file.}
 \item\file{forder.c}, \file{fsort.c}, and \file{reorder.c} - \code{\link{forder}()} and related
 \item\file{froll.c}, \file{frolladaptive.c}, and \file{frollR.c} - \code{\link{froll}()} and family
-\item\file{fwrite.c} - \code{\link{fwrite}()}
-\item\file{gsumm.c} - GForce in various places, see \link{GForce}
+\item\file{fwrite.c} - \code{\link{fwrite}(). Parallelized across rows.}
+\item\file{gsumm.c} - GForce in various places, see \link{GForce}. Parallelized across groups.
 \item\file{nafill.c} - \code{\link{nafill}()}
 \item\file{subset.c} - Used in \code{\link[=data.table]{[.data.table}} subsetting
 \item\file{types.c} - Internal testing usage
 }
+
+We endeavor to keep this list up to date, but note that the canonical reference here is the source code itself.
 }
 \examples{
 getDTthreads(verbose=TRUE)
 }
-\keyword{ data }

src/between.c

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 #include "data.table.h"
 
+/*
+  OpenMP is used here to parallelize:
+    - The loops that check if each element of the vector provided is between
+      the specified lower and upper bounds, for INTSXP and REALSXP types
+    - The checking and handling of undefined values (such as NaNs)
+*/
 SEXP between(SEXP x, SEXP lower, SEXP upper, SEXP incbounds, SEXP NAboundsArg, SEXP checkArg) {
   int nprotect = 0;
   R_len_t nx = length(x), nl = length(lower), nu = length(upper);
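
Reviewer note (illustrative, not part of the commit): the new comment describes an element-wise bounds check with special handling for NaN. A minimal sketch of such a loop for doubles is below. It is not the code in `src/between.c`: `between_double` and `NA_RESULT` are hypothetical names, scalar bounds are assumed, and the choice of `INT_MIN` for a missing result only mirrors how R encodes a missing logical.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

#define NA_RESULT INT_MIN  /* hypothetical stand-in for a missing logical */

/* Sets ans[i] to 1 if x[i] lies within [lower, upper] (or strictly inside
   when incbounds is false), 0 if not, and NA_RESULT when x[i] is NaN.
   Each iteration touches only index i, so iterations can run on any thread. */
static void between_double(const double *x, int n, double lower, double upper,
                           bool incbounds, int *ans) {
  assert(x != NULL && ans != NULL && n >= 0);
#ifdef _OPENMP
  #pragma omp parallel for
#endif
  for (int i = 0; i < n; ++i) {
    if (isnan(x[i])) { ans[i] = NA_RESULT; continue; }
    ans[i] = incbounds ? (lower <= x[i] && x[i] <= upper)
                       : (lower <  x[i] && x[i] <  upper);
  }
}
```

The `#ifdef _OPENMP` guard lets the same source compile as a plain serial loop when OpenMP is not enabled.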

src/cj.c

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 #include "data.table.h"
 
+/*
+  OpenMP is used here to parallelize:
+    - The element assignment in vectors
+    - The memory copying operations (blockwise replication of data using memcpy)
+    - The creation of all combinations of the input vectors over the cross-product space
+*/
 SEXP cj(SEXP base_list) {
   int ncol = LENGTH(base_list);
   SEXP out = PROTECT(allocVector(VECSXP, ncol));
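
Reviewer note (illustrative, not part of the commit): the comment mentions element assignment followed by blockwise replication with `memcpy`. One way to picture that for a single integer column of a cross join is sketched below. It is an assumption about the general technique, not a copy of `src/cj.c`: `cj_fill_int`, `each` and `times` are hypothetical names, where `each` is how often every value repeats consecutively and `times` is how often the whole pattern repeats.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills out (length n * each * times) with one column of a cross product.
   Step 1 writes the first block of n * each values by direct assignment.
   Step 2 copies that block into the remaining (times - 1) positions. */
static void cj_fill_int(const int *src, int n, int each, int times, int *out) {
  assert(src != NULL && out != NULL && n > 0 && each > 0 && times > 0);
  const size_t block = (size_t)n * (size_t)each;

#ifdef _OPENMP
  #pragma omp parallel for
#endif
  for (int i = 0; i < n; ++i) {
    for (int k = 0; k < each; ++k) {
      out[(size_t)i * each + k] = src[i];  /* disjoint range per i */
    }
  }

#ifdef _OPENMP
  #pragma omp parallel for
#endif
  for (int r = 1; r < times; ++r) {
    /* every copy reads block 0 and writes its own block r */
    memcpy(out + (size_t)r * block, out, block * sizeof(int));
  }
}
```

For inputs `a = {1, 2}` and `b = {10, 20, 30}`, column `a` would use `each = 3, times = 1` and column `b` would use `each = 1, times = 2`, giving the six rows of the cross product with the first column varying slowest.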

src/coalesce.c

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 #include "data.table.h"
 
+/*
+  OpenMP is used here to parallelize:
+    - The operation that iterates over the rows to coalesce the data
+    - The replacement of NAs with non-NA values from subsequent vectors
+    - The conditional checks within parallelized loops
+*/
 SEXP coalesce(SEXP x, SEXP inplaceArg) {
   if (TYPEOF(x)!=VECSXP) internal_error(__func__, "input is list(...) at R level"); // # nocov
   if (!IS_TRUE_OR_FALSE(inplaceArg)) internal_error(__func__, "argument 'inplaceArg' must be TRUE or FALSE"); // # nocov
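
Reviewer note (illustrative, not part of the commit): the row-wise coalescing the comment describes can be sketched for integer vectors as below. This is a simplified stand-in, not the code in `src/coalesce.c`: `coalesce_int` and `NA_INT` are hypothetical names, all vectors are assumed to have the same length, and the result is written in place. The `INT_MIN` sentinel matches how R encodes `NA_integer_`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NA_INT INT_MIN  /* R stores NA_integer_ as INT_MIN; macro name is hypothetical */

/* For each row i, if first[i] is missing, replace it with the first
   non-missing value found at row i among rest[0..nrest-1].
   Rows are independent, so the outer loop is the parallel one. */
static void coalesce_int(int *first, const int *const *rest, int nrest, int nrow) {
  assert(first != NULL && nrest >= 0 && nrow >= 0);
#ifdef _OPENMP
  #pragma omp parallel for
#endif
  for (int i = 0; i < nrow; ++i) {
    if (first[i] != NA_INT) continue;  /* already has a value */
    for (int j = 0; j < nrest; ++j) {
      if (rest[j][i] != NA_INT) { first[i] = rest[j][i]; break; }
    }
  }
}
```

The `break` sits in the inner serial loop, which is allowed; only the outer loop is handed to OpenMP.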

src/fifelse.c

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 #include "data.table.h"
 
+/*
+  OpenMP is being used here to parallelize loops that perform conditional
+  checks along with assignment operations over the elements of the
+  supplied logical vector based on the condition (test) and values
+  provided for the remaining arguments (yes, no, and na).
+*/
 SEXP fifelseR(SEXP l, SEXP a, SEXP b, SEXP na) {
   if (!isLogical(l)) {
     error(_("Argument 'test' must be logical."));
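
Reviewer note (illustrative, not part of the commit): the comment describes a per-element choice between `yes`, `no` and `na` driven by a logical vector. A minimal integer-only sketch is below. It is not the code in `src/fifelse.c`: `fifelse_int` and `NA_LGL` are hypothetical names, `yes` and `no` are assumed to be full-length vectors, and `na` is assumed to be a scalar. The `INT_MIN` sentinel matches how R encodes a missing logical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NA_LGL INT_MIN  /* R stores NA (logical) as INT_MIN; macro name is hypothetical */

/* ans[i] = na      when test[i] is missing,
            yes[i]  when test[i] is true (non-zero),
            no[i]   otherwise.
   No iteration depends on another, so the loop is split across threads. */
static void fifelse_int(const int *test, int n, const int *yes, const int *no,
                        int na, int *ans) {
  assert(test != NULL && yes != NULL && no != NULL && ans != NULL && n >= 0);
#ifdef _OPENMP
  #pragma omp parallel for
#endif
  for (int i = 0; i < n; ++i) {
    ans[i] = (test[i] == NA_LGL) ? na : (test[i] ? yes[i] : no[i]);
  }
}
```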
