Commit 9ec6175

Merge branch 'frollapply2025' into froll-n0
2 parents 385f294 + ea19cf5 commit 9ec6175

2 files changed: +4 -6 lines changed

NEWS.md

Lines changed: 1 addition & 1 deletion
@@ -190,7 +190,7 @@
 #119: -0.28964772 0.6116575
 #120: -0.40598313 0.6112854
 ```
-- uses multiple CPU threads; evaluation of UDF is inherently slow so this can be a great help.
+- uses multiple CPU threads (on a decent OS); evaluation of UDF is inherently slow so this can be a great help.
 ```r
 x = rnorm(1e5)
 n = 500

man/frollapply.Rd

Lines changed: 3 additions & 5 deletions
@@ -234,7 +234,7 @@ system.time(for (i in 1:1e4) x[["v1"]])
 \item No repeated allocation of a rolling window subset.\cr
 Object (type of \code{X} and size of \code{N}) is allocated once (for each CPU thread), and then for each iteration this object is being re-used by copying expected subset of data into it. This means we still have to subset data on each iteration, but we only copy data into pre-allocated window object, instead of allocating in each iteration. Allocation is carrying much bigger overhead than copy. The faster the \code{FUN} evaluates the more relative speedup we are getting, because allocation of a subset does not depend on how fast or slow \code{FUN} evaluates. See \emph{caveats} section for possible edge cases caused by this optimization.
 \item Parallel evaluation of \code{FUN} calls.\cr
-Until now (October 2022) all the multithreaded code in data.table was using \emph{OpenMP}. It can be used only in C language and it has very low overhead. Unfortunately it could not be applied in \code{frollapply} because to evaluate UDF from C code one has to call R's C api that is not thread safe (can be run only from single threaded C code). Therefore \code{frollapply} uses \code{\link[parallel]{parallel-package}} to provide parallelism on R language level. It uses \emph{fork} parallelism, which has low overhead as well, unless results of computation are big in size. \emph{Fork} is not available on Windows OS. See \emph{caveats} section for limitations caused by using this optimization.
+Until now (September 2025) all the multithreaded code in data.table was using \emph{OpenMP}. It can be used only in C language and it has very low overhead. Unfortunately it could not be applied in \code{frollapply} because to evaluate UDF from C code one has to call R's C api that is not thread safe (can be run only from single threaded C code). Therefore \code{frollapply} uses \code{\link[parallel]{parallel-package}} to provide parallelism on R language level. It uses \emph{fork} parallelism, which has low overhead as well (unless results of computation are big in size which is not an issue for rolling statistics). \emph{Fork} is not available on Windows OS. See \emph{caveats} section for limitations caused by using this optimization.
 }
 }
 \examples{
@@ -257,10 +257,8 @@ flow = function(x) {
 v2 = x[[2L]]
 (v1[2L] - v1[1L] * (1+v2[2L])) / v1[1L]
 }
-x[,
-"flow" := frollapply(.(Sepal.Length, Sepal.Width), 2L, flow, by.column=FALSE),
-by = Species
-][]
+x[, "flow" := frollapply(.(Sepal.Length, Sepal.Width), 2L, flow, by.column=FALSE),
+by = Species][]
 
 ## rolling regression: by.column=FALSE
 f = function(x) coef(lm(v2 ~ v1, data=x))
