|
45 | 45 | } |
46 | 46 | \arguments{ |
47 | 47 | \item{x}{ Integer, numeric or logical vector, coerced to numeric, on which sliding window calculates an aggregate function. It supports vectorized input, then it needs to be a \code{data.table}, \code{data.frame} or a \code{list}, in which case a rolling function is applied to each column/vector. } |
48 | | - \item{n}{ Integer, non-negative, rolling window size. This is the \emph{total} number of included values in aggregate function. In case of an adaptive rolling function window size has to be provided as a vector for each indivdual value of \code{x}. It supports vectorized input, then it needs to be a vector, or in case of an adaptive rolling a \code{list} of vectors. } |
49 | | - \item{fill}{ Numeric; value to pad by. Defaults to \code{NA}. } |
50 | | - \item{algo}{ Character, default \code{"fast"}. When set to \code{"exact"}, a slower (but more accurate) algorithm is used. It suffers less from floating point rounding errors by performing an extra pass, and carefully handles all non-finite values. It will use multiple cores where available. See Details for more information. } |
| 48 | + \item{n}{ Integer, non-negative, non-NA, rolling window size. This is the \emph{total} number of included values in aggregate function. In case of an adaptive rolling function, the window size has to be provided as a vector for each individual value of \code{x}. It supports vectorized input, then it needs to be a vector, or in case of an adaptive rolling a \code{list} of vectors. } |
| 49 | + \item{fill}{ Numeric; value to pad by for an incomplete window iteration. Defaults to \code{NA}. When \code{partial=TRUE} this argument is ignored. } |
| 50 | + \item{algo}{ Character, default \code{"fast"}. When set to \code{"exact"}, a slower (in some cases more accurate) algorithm is used. It will use multiple cores where available. See Details for more information. } |
51 | 51 | \item{align}{ Character, specifying the "alignment" of the rolling window, defaulting to \code{"right"}. \code{"right"} covers preceding rows (the window \emph{ends} on the current value); \code{"left"} covers following rows (the window \emph{starts} on the current value); \code{"center"} is halfway in between (the window is \emph{centered} on the current value, biased towards \code{"left"} when \code{n} is even). } |
52 | | - \item{na.rm}{ Logical, default \code{FALSE}. Should missing values be removed when calculating window? } |
| 52 | + \item{na.rm}{ Logical, default \code{FALSE}. Should missing values be removed when calculating aggregate function on a window? } |
53 | 53 | \item{has.nf}{ Logical. If it is known whether \code{x} contains non-finite values (\code{NA}, \code{NaN}, \code{Inf}, \code{-Inf}), then setting this to \code{TRUE} or \code{FALSE} may speed up computation. Defaults to \code{NA}. See \emph{has.nf argument} section below for details. } |
54 | 54 | \item{adaptive}{ Logical, default \code{FALSE}. Should the rolling function be calculated adaptively? See \emph{Adaptive rolling functions} section below for details. } |
55 | | - \item{partial}{ Logical, default \code{FALSE}. Should the rolling window size(s) provided in \code{n} be computed also for leading incomplete running window. See \emph{\code{partial} argument} section below for details. } |
| 55 | + \item{partial}{ Logical, default \code{FALSE}. Should the rolling window size(s) provided in \code{n} be computed also for leading incomplete running window? See \emph{\code{partial} argument} section below for details. } |
56 | 56 | \item{give.names}{ Logical, default \code{FALSE}. When \code{TRUE}, names are automatically generated corresponding to names of \code{x} and names of \code{n}. If answer is an atomic vector, then the argument is ignored, see examples. } |
57 | 57 | \item{hasNA}{ Logical. Deprecated, use \code{has.nf} argument instead. } |
58 | 58 | } |
59 | 59 | \details{ |
60 | | - \code{froll*} functions accept vector, list, \code{data.frame} or \code{data.table}. Functions operate on a single vector; when passing a non-atomic input, then function is applied column-by-column, not to the complete set of columns at once. |
| 60 | + \code{froll*} functions accept vector, list, \code{data.frame} or \code{data.table}. Functions operate on a single vector; when passing a non-atomic input, then the function is applied column-by-column, not to the complete set of columns at once. |
61 | 61 |
|
62 | 62 | Argument \code{n} allows multiple values to apply rolling function on multiple window sizes. If \code{adaptive=TRUE}, then \code{n} can be a list to specify multiple window sizes for adaptive rolling computation. See \emph{Adaptive rolling functions} section below for details. |
63 | 63 |
|
64 | | - When multiple columns and/or multiple window widths are provided, then computations run in parallel. The exception is for \code{algo="exact"}, which runs in parallel even for single column and single window width. By default, data.table uses only half of available CPUs, see \code{\link{setDTthreads}} for details on how to tune CPU usage. |
| 64 | + When multiple columns or multiple window widths are provided, then they are run in parallel. The exception is for \code{algo="exact"} or \code{adaptive=TRUE}, which runs in parallel even for single column and single window width. By default, data.table uses only half of available CPUs, see \code{\link{setDTthreads}} for details on how to tune CPU usage. |
65 | 65 |
|
66 | | - Adaptive rolling functions are a special case where each |
67 | | - observation has its own corresponding rolling window width. Due to the logic |
68 | | - of adaptive rolling functions, the following restrictions apply: |
69 | | - \itemize{ |
70 | | - \item \code{align} only \code{"right"}. |
71 | | - \item if list of vectors is passed to \code{x}, then all |
72 | | - vectors within it must have equal length. |
73 | | - } |
74 | | - |
75 | | - When multiple columns or multiple windows width are provided, then they |
76 | | - are run in parallel. The exception is for \code{algo="exact"}, which runs in |
77 | | - parallel already. |
78 | | - |
79 | | - Setting \code{options(datatable.verbose=TRUE)} will display various |
80 | | - information about how rolling function processed. It will not print |
81 | | - information in real-time but only at the end of the processing. |
| 66 | + Setting \code{options(datatable.verbose=TRUE)} will display various information about how the rolling function was processed. It will not print information in real-time but only at the end of the processing. |
82 | 67 | } |
83 | 68 | \value{ |
84 | | - For a non \emph{vectorized} input (\code{x} is not a list, and \code{n} specify single rolling window) a \code{vector} is returned, for convenience. Thus, rolling functions can be used conveniently within \code{data.table} syntax. For a \emph{vectorized} input a list is returned. |
| 69 | + For a non \emph{vectorized} input (\code{x} is not a list, and \code{n} specifies a single rolling window) a \code{vector} is returned, for convenience. Thus, rolling functions can be used conveniently within \code{data.table} syntax. For a \emph{vectorized} input a list is returned. |
85 | 70 | } |
86 | 71 | \note{ |
87 | | - Be aware that rolling functions operate on the physical order of input. If the intent is to roll values in a vector by a logical window, for example an hour, or a day, then one has to ensure that there are no gaps in the input, or use adaptive rolling function to handle gaps, for which we provide helper function \code{\link{frolladapt}} to generate adaptive window size. |
| 72 | + Be aware that rolling functions operate on the physical order of input. If the intent is to roll values in a vector by a logical window, for example an hour, or a day, then one has to ensure that there are no gaps in the input, or use an adaptive rolling function to handle gaps, for which we provide helper function \code{\link{frolladapt}} to generate adaptive window size. |
88 | 73 | } |
89 | 74 | \section{\code{has.nf} argument}{ |
90 | 75 | \code{has.nf} can be used to speed up processing in cases when it is known if \code{x} contains (or not) non-finite values (\code{NA}, \code{NaN}, \code{Inf}, \code{-Inf}). |
91 | 76 | \itemize{ |
92 | | - \item Default \code{has.nf=NA} uses faster implementation that does not support non-finite values, but when non-finite values are detected it will re-run non-finite supported implementation. |
| 77 | + \item Default \code{has.nf=NA} uses faster implementation that does not support non-finite values, but when non-finite values are detected it will re-run non-finite aware implementation. |
93 | 78 | \item \code{has.nf=TRUE} uses non-finite aware implementation straightaway. |
94 | 79 | \item \code{has.nf=FALSE} uses faster implementation that does not support non-finite values. Then depending on the rolling function it will either: |
95 | 80 | \itemize{ |
96 | 81 | \item (\emph{mean, sum, prod, var, sd}) detect non-finite, re-run non-finite aware. |
97 | 82 | \item (\emph{max, min, median}) does not detect non-finites and may silently produce an incorrect answer. |
98 | 83 | } |
99 | | - In general \code{has.nf=FALSE && any(!is.finite(x))} should be considered as undefined behavior. Therefore \code{has.nf=FALSE} should be used with care. |
100 | 84 | } |
| 85 | + In general \code{has.nf=FALSE && any(!is.finite(x))} should be considered undefined behavior. Therefore \code{has.nf=FALSE} should be used with care. |
101 | 86 | } |
102 | 87 | \section{Implementation}{ |
103 | 88 | Most of the rolling functions have 4 different implementations. First factor that decides which implementation is used is the \code{adaptive} argument (either \code{TRUE} or \code{FALSE}), see section below for details. Then for each of those two algorithms there are usually two implementations depending on the \code{algo} argument. |
|
111 | 96 | } |
112 | 97 | \item \code{algo="exact"} will make the rolling functions use a more computationally-intensive algorithm. For each observation in the input vector it will compute a function on a rolling window from scratch (complexity \eqn{O(n^2)}). |
113 | 98 | \itemize{ |
114 | | - \item Depeneding on the function, this algorithm may suffers less from floating point rounding error (the same consideration applies to base \code{\link[base]{mean}}). |
| 99 | + \item Depending on the function, this algorithm may suffer less from floating point rounding error (the same consideration applies to base \code{\link[base]{mean}}). |
115 | 100 | \item In case of \emph{mean}, it will additionally make an extra pass to perform floating point error correction. Error corrections might not be truly exact on some platforms (like Windows) when using multiple threads. |
116 | 101 | } |
117 | 102 | } |
118 | 103 | } |
119 | 104 | \section{Adaptive rolling functions}{ |
120 | | - Adaptive rolling functions are a special case where each observation has its own corresponding rolling window width. Therefore, values passed to \code{n} argument must be series corresponding to observations in \code{x}. If multiple windows are meant to be computed, then a list of integer vectors is expected; each list element must be an integer vector of window size corresponding to observations in \code{x}; see Examples. Due to the logic or implementation of adaptive rolling functions, the following restrictions apply |
| 105 | + Adaptive rolling functions are a special case where each observation has its own corresponding rolling window width. Therefore, values passed to \code{n} argument must be series corresponding to observations in \code{x}. If multiple windows are meant to be computed, then a list of integer vectors is expected; each list element must be an integer vector of window size corresponding to observations in \code{x}; see Examples. Due to the logic or implementation of adaptive rolling functions, the following restrictions apply: |
121 | 106 | \itemize{ |
122 | 107 | \item \code{align} does not support \code{"center"}. |
123 | | - \item if list of vectors is passed to \code{x}, then all vectors within it must have equal length due to the fact that length of adaptive window widths must match the length of vectors in \code{x}. |
| 108 | + \item if a list of vectors is passed to \code{x}, then all vectors within it must have equal length due to the fact that length of adaptive window widths must match the length of vectors in \code{x}. |
124 | 109 | } |
125 | 110 | } |
126 | 111 | \section{\code{partial} argument}{ |
|
131 | 116 | \section{\code{zoo} package users notice}{ |
132 | 117 | Users coming from \code{zoo}, the most popular package for rolling functions, might expect the following differences in the \code{data.table} implementation: |
133 | 118 | \itemize{ |
134 | | - \item rolling function will always return result of the same length |
135 | | - as input. |
| 119 | + \item rolling function will always return result of the same length as input. |
136 | 120 | \item \code{fill} defaults to \code{NA}. |
137 | | - \item \code{fill} accepts only constant values. It does not support |
138 | | - for \emph{na.locf} or other functions. |
| 121 | + \item \code{fill} accepts only constant values. It does not support \emph{na.locf} or other functions. |
139 | 122 | \item \code{align} defaults to \code{"right"}. |
140 | | - \item \code{na.rm} is respected, and other functions are not needed |
141 | | - when input contains \code{NA}. |
| 123 | + \item \code{na.rm} is respected, and other functions are not needed when input contains \code{NA}. |
142 | 124 | \item integers and logical are always coerced to numeric. |
143 | | - \item when \code{adaptive=FALSE} (default), then \code{n} must be a |
144 | | - numeric vector. List is not accepted. |
145 | | - \item when \code{adaptive=TRUE}, then \code{n} must be vector of |
146 | | - length equal to \code{nrow(x)}, or list of such vectors. |
| 125 | + \item when \code{adaptive=FALSE} (default), then \code{n} must be a numeric vector. List is not accepted. |
| 126 | + \item when \code{adaptive=TRUE}, then \code{n} must be a vector of length equal to \code{nrow(x)}, or a list of such vectors. |
147 | 127 | } |
148 | 128 | } |
149 | 129 | \examples{ |
@@ -190,7 +170,7 @@ frollsum(list(x=1:5, y=5:1), c(tiny=2, big=4), give.names=TRUE) |
190 | 170 | frollmax(c(1,2,NA,4,5), 2) |
191 | 171 | frollmax(c(1,2,NA,4,5), 2, has.nf=FALSE) |
192 | 172 |
|
193 | | -# use verobse=TRUE for extra insight |
| 173 | +# use verbose=TRUE for extra insight |
194 | 174 | .op = options(datatable.verbose = TRUE) |
195 | 175 | frollsd(c(1:5,NA,7:8), 4) |
196 | 176 | options(.op) |
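
The window semantics described in the argument list and in the \emph{Adaptive rolling functions} section can be sketched in base R. This is only an illustrative reference, not data.table's implementation: `roll_mean_ref` and `roll_mean_adaptive_ref` are hypothetical helpers that recompute every window from scratch, assume `align="right"`, and assume every window width is at least 1.

```r
# Hypothetical helper (not part of data.table): naive right-aligned rolling
# mean. Each complete window is recomputed from scratch; positions with an
# incomplete window are padded with `fill`.
roll_mean_ref = function(x, n, fill = NA_real_) {
  x = as.numeric(x)                      # integer/logical input coerced to numeric
  out = rep(fill, length(x))
  for (i in seq_along(x)) {
    if (i >= n) out[i] = mean(x[(i - n + 1L):i])
  }
  out
}

# Hypothetical adaptive variant: `n` holds one window width per observation.
roll_mean_adaptive_ref = function(x, n, fill = NA_real_) {
  stopifnot(length(n) == length(x))      # widths must match observations in x
  x = as.numeric(x)
  out = rep(fill, length(x))
  for (i in seq_along(x)) {
    if (i >= n[i]) out[i] = mean(x[(i - n[i] + 1L):i])
  }
  out
}

x = c(1, 2, 3, 4, 5)
roll_mean_ref(x, 3)                          # NA NA 2 3 4
roll_mean_adaptive_ref(x, c(1, 2, 2, 3, 1))  # 1.0 1.5 2.5 3.0 5.0

# Optional cross-check against data.table, only if it is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  stopifnot(isTRUE(all.equal(data.table::frollmean(x, 3), roll_mean_ref(x, 3))))
}
```

The output has the same length as the input, and the leading positions hold `fill` until a full window is available, which matches the behaviour listed for \code{zoo} users above.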
|