Commit ff8a23c

jangorecki and mcol authored
Apply wording suggestions from @mcol code review
Co-authored-by: Marco Colombo <[email protected]>
1 parent a3c6c10 commit ff8a23c

2 files changed: +5 -5 lines changed

NEWS.md

Lines changed: 1 addition & 1 deletion
@@ -246,7 +246,7 @@
 #9: 2025-09-22 9 8 9.0
 ```

-19. New rolling functions: `frollmin`, `frollprod` and `frollmedian`, have been implemented, towards [#2778](https://github.com/Rdatatable/data.table/issues/2778). Thanks to @jangorecki for implementation. Implementation of rolling median is based on a novel algorithm "sort-median" described by [@suomela](https://github.com/suomela) in 2014 in his paper [Median Filtering is Equivalent to Sorting](https://arxiv.org/abs/1406.1717). "sort-median" scales very well, not only for size of input vector but also for size of rolling window.
+19. New rolling functions: `frollmin`, `frollprod` and `frollmedian`, have been implemented, towards [#2778](https://github.com/Rdatatable/data.table/issues/2778). Thanks to @jangorecki for implementation. Implementation of rolling median is based on a novel algorithm "sort-median" described by [@suomela](https://github.com/suomela) in his 2014 paper [Median Filtering is Equivalent to Sorting](https://arxiv.org/abs/1406.1717). "sort-median" scales very well, not only for size of input vector but also for size of rolling window.
 ```r
 rollmedian = function(x, n) {
 ans = rep(NA_real_, nx<-length(x))

man/froll.Rd

Lines changed: 4 additions & 4 deletions
@@ -86,21 +86,21 @@
 \item \code{has.nf=FALSE} uses faster implementation that does not support non-finite values. Then depending on the rolling function it will either:
 \itemize{
 \item (\emph{mean, sum, prod}) detect non-finite, re-run non-finite aware.
-\item (\emph{max, min, median}) does not detect non-finites and may silently give incorrect answer.
+\item (\emph{max, min, median}) does not detect non-finites and may silently produce an incorrect answer.
 }
 In general \code{has.nf=FALSE && any(!is.finite(x))} should be considered as undefined behavior. Therefore \code{has.nf=FALSE} should be used with care.
 }
 }
 \section{Implementation}{
-Each rolling function has 4 different implementations. First factor that decides which implementation is being used is \code{adaptive} argument, see setion below for details. Then for each of those two algorithms (adaptive \code{TRUE} or \code{FALSE}) there are usually two implementations depending on the \code{algo} argument.
+Each rolling function has 4 different implementations. First factor that decides which implementation is used is the \code{adaptive} argument (either \code{TRUE} or \code{FALSE}), see section below for details. Then for each of those two algorithms there are usually two implementations depending on the \code{algo} argument.
 \itemize{
 \item \code{algo="fast"} uses \emph{"online"}, single pass, algorithm.
 \itemize{
-\item \emph{max} and \emph{min} rolling function will not do only a single pass but, on average, \code{length(x)/n} nested loops will be computed. The bigger the window the bigger advantage over algo \emph{exact} which computes \code{length(x)} nested loops. Note that \emph{exact} uses multiple CPUs so for a small window size and many CPUs it is possible it will be actually faster than \emph{fast} but in those cases elapsed timings will likely be far below a single second.
+\item \emph{max} and \emph{min} rolling function will not do only a single pass but, on average, they will compute \code{length(x)/n} nested loops. The larger the window, the greater the advantage over the \emph{exact} algorithm, which computes \code{length(x)} nested loops. Note that \emph{exact} uses multiple CPUs so for a small window sizes and many CPUs it may actually be faster than \emph{fast}. However, in such cases the elapsed timings will likely be far below a single second.
 \item \emph{median} will use a novel algorithm described by \emph{Jukka Suomela} in his paper \emph{Median Filtering is Equivalent to Sorting (2014)}. See references section for the link. Implementation here is extended to support arbitrary length of input and an even window size. Despite extensive validation of results this function should be considered experimental. When missing values are detected it will fall back to slower \code{algo="exact"} implementation.
 \item Not all functions have \emph{fast} implementation available. As of now adaptive \emph{max}, adaptive \emph{min} and adaptive \emph{median} do not have \emph{fast} implementation, therefore it will automatically fall back to \emph{exact} implementation. \code{datatable.verbose} option can be used to check that.
 }
-\item \code{algo="exact"} will make rolling functions to use a more computationally-intensive algorithm. For each observation from input vector it will compute a function on a rolling window from scratch (complexity \eqn{O(n^2)}).
+\item \code{algo="exact"} will make the rolling functions use a more computationally-intensive algorithm. For each observation in the input vector it will compute a function on a rolling window from scratch (complexity \eqn{O(n^2)}).
 \itemize{
 \item Depeneding on the function, this algorithm may suffers less from floating point rounding error (the same consideration applies to base \code{\link[base]{mean}}).
 \item In case of \emph{mean} (and possibly other functions in future), it will additionally make extra pass to perform floating point error correction. Error corrections might not be truly exact on some platforms (like Windows) when using multiple threads.
