Commit 7a6920e

Updated the fread part and rm whitespaces added from newlines
1 parent 2005487 commit 7a6920e

1 file changed: 33 additions, 33 deletions

man/openmp-utils.Rd
@@ -38,60 +38,60 @@

\itemize{
\item\file{between.c} - \code{\link{between}()}

OpenMP is used here to parallelize:

\itemize{
\item The loops that check if each element of the vector provided is between the specified \code{lower} and \code{upper} bounds, for integer (\code{INTSXP}) and real (\code{REALSXP}) types
\item The checking and handling of undefined values (such as NaNs)
}

Since this function is used to find rows where a column's value falls within a specific range, it benefits more from parallelization when the input data consists of a large number of rows (a generic sketch of such an element-wise parallel loop follows this list).

\item\file{cj.c} - \code{\link{CJ}()}

OpenMP is used here to parallelize:

\itemize{
\item The element assignment in vectors
\item The memory copying operations (blockwise replication of data using \code{memcpy})
\item The creation of all combinations of the input vectors over the cross-product space
}

Given that the number of combinations grows rapidly (as the product of the input vector lengths) as more columns are added, better speedup can be expected when dealing with a large number of columns (see the replication sketch after this list).

\item\file{coalesce.c} - \code{\link{fcoalesce}()}

OpenMP is used here to parallelize:

\itemize{
\item The operation that iterates over the rows to coalesce the data (which can be of type integer, real, or complex)
\item The replacement of NAs with non-NA values from subsequent vectors
\item The conditional checks within parallelized loops
}

Significant speedup can be expected with a larger number of columns here, given that this function operates efficiently across multiple columns to find non-NA values.

\item\file{fifelse.c} - \code{\link{fifelse}()}

For logical, integer, and real types, OpenMP is used here to parallelize the loops that perform conditional checks and assignments over the elements of the supplied logical vector (the condition, \code{test}), using the values provided for the remaining arguments (\code{yes}, \code{no}, and \code{na}).

Better speedup can be expected with a larger number of columns here as well, given that this function operates column-wise with independent vector operations.

\item\file{fread.c} - \code{\link{fread}()}

OpenMP is used here to:

\itemize{
+\item Parallelize the reading of data in chunks
\item Avoid race conditions or concurrent writes to the output \code{data.table} by using atomic operations on the string data
\item Manage synchronized updates to the progress bar and serialize the output to the console
}

-There are no explicit pragmas for parallelizing loops, and instead the use of OpenMP here is mainly in controlling access to shared resources (with the use of critical sections, for instance) in a multi-threaded environment.

This function is highly optimized for reading and processing data with both large numbers of rows and columns, but the efficiency is more pronounced across rows (see the chunked, ordered-output sketch after this list).

\item\file{forder.c}, \file{fsort.c}, and \file{reorder.c} - \code{\link{forder}()} and related

OpenMP is used here to parallelize multiple operations that come together to sort a \code{data.table} using the radix sort algorithm. These include:

\itemize{
\item The counting of unique values and recursively sorting subsets of data across different threads (specific to \file{forder.c})
\item The process of finding the range and distribution of data for efficient grouping and sorting (applies to both \file{forder.c} and \file{fsort.c})
@@ -100,42 +100,42 @@
}

Better speedups can be expected when the input data contains a large number of rows, as the sorting complexity increases with more rows.

\item\file{froll.c}, \file{frolladaptive.c}, and \file{frollR.c} - \code{\link{froll}()} and family

OpenMP is used here to parallelize the loops that compute the rolling means (\code{frollmean}) and sums (\code{frollsum}) over a sliding window for each position in the input vector.

These functions benefit more in terms of speedup when the data has a large number of columns, primarily due to the cache-friendly memory access pattern of processing each column sequentially in memory while computing the rolling statistic (see the rolling-mean sketch after this list).

\item\file{fwrite.c} - \code{\link{fwrite}()}

OpenMP is used here primarily to parallelize the process of writing rows to the output file, but error handling and compression (if enabled) are also managed within the parallel region. Special attention is paid to thread safety and synchronization, especially in the ordered sections where output to the file and handling of errors are serialized to maintain the correct sequence of rows.

Similar to \code{\link{fread}()}, this function is highly efficient in processing data in parallel with large numbers of both rows and columns, but it shows more notable speedups with an increased number of rows.

\item\file{gsumm.c} - GForce in various places, see \link{GForce}

Functions with GForce optimization are internally parallelized to speed up grouped summaries over a large \code{data.table}. OpenMP is used here to parallelize the operations involved in calculating group-wise statistics like sum, mean, and median (which implies faster computation of \code{sd}, \code{var}, and \code{prod} as well).

These optimized grouping operations benefit more in terms of speedup if the input data contains a large number of rows, since aggregating data across groups requires scanning every row (see the grouped-sum sketch after this list).

\item\file{nafill.c} - \code{\link{nafill}()}

OpenMP is used here to parallelize the loop that fills missing values over the columns of the input data. This includes handling different data types (double, integer, and integer64) and applying the designated filling method (constant, last observation carried forward, or next observation carried backward) to each column in parallel.

Given its optimization for column-wise operations, better speedups can be expected when the input data consists of a large number of columns (see the fill sketch after this list).

\item\file{subset.c} - Used in \code{\link[=data.table]{[.data.table}} subsetting

OpenMP is used here to parallelize the loops that perform the subsetting of vectors, with conditional checks and filtering of data.

Since subset operations are usually row-dependent, better speedups can be expected when dealing with a large number of rows. However, this also depends on whether the computations are focused on rows or columns (as dictated by the subsetting criteria).

\item\file{types.c} - Internal testing usage

This caters to internal tests only (it does not affect any user-facing operations or functions). OpenMP is used here to test a message-printing function inside a nested loop that is collapsed into a single loop over the combined iteration space using \code{collapse(2)}, with dynamic scheduling specified to balance the workload among the threads (see the collapsed-loop sketch after this list).
}
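
To make the element-wise pattern concrete (the loops in \file{between.c}, \file{fifelse.c}, \file{coalesce.c}, and \file{subset.c} all follow this shape), here is a minimal, generic C/OpenMP sketch. The function name \code{range_check} and the use of NaN to stand in for NA are illustrative assumptions; the actual \pkg{data.table} code operates on R's internal vector types and NA encodings.

\preformatted{
#include <stdbool.h>
#include <math.h>
#include <omp.h>

/* Illustrative sketch only, not data.table source: check in parallel
   whether each element lies between lower and upper, treating NaN as a
   stand-in for NA (data.table itself propagates NA here). */
void range_check(const double *x, int n, double lower, double upper,
                 bool *out)
{
  #pragma omp parallel for schedule(static)
  for (int i = 0; i < n; i++) {
    if (isnan(x[i])) {
      out[i] = false;   /* undefined values handled explicitly */
    } else {
      out[i] = (x[i] >= lower) && (x[i] <= upper);
    }
  }
}
}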
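
The blockwise replication in \file{cj.c} can be pictured with the following generic sketch; \code{replicate_column} is a hypothetical name, and the real \code{CJ()} additionally handles multiple input types and derives each column's repetition counts from the cross-product dimensions.

\preformatted{
#include <string.h>
#include <omp.h>

/* Illustrative sketch only: fill one output column of the cross product by
   copying the source vector end to end nrep times, one block per loop
   iteration, with the blocks distributed across threads. */
void replicate_column(const int *src, int nsrc, int *dest, int nrep)
{
  #pragma omp parallel for schedule(static)
  for (int r = 0; r < nrep; r++) {
    memcpy(dest + (size_t)r * nsrc, src, (size_t)nsrc * sizeof(int));
  }
}
}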
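
The chunked processing with serialized access to shared resources described for \file{fread.c} and \file{fwrite.c} boils down to the pattern sketched below. This is a generic illustration under an assumed name (\code{process_chunks}); the real code also manages buffers, error states, and compression.

\preformatted{
#include <stdio.h>
#include <omp.h>

/* Illustrative sketch only: threads handle independent chunks, a critical
   section synchronizes a shared progress counter, and an ordered region
   serializes writes so chunks reach the output in their original order. */
void process_chunks(int nchunk, FILE *out)
{
  int done = 0;                     /* shared progress counter */
  #pragma omp parallel for ordered schedule(dynamic)
  for (int c = 0; c < nchunk; c++) {
    /* ... parse or format chunk c into a thread-local buffer ... */
    #pragma omp critical
    { done++; }                     /* synchronized progress update */
    #pragma omp ordered
    { fputs("chunk written in order; ", out); }  /* serialized output */
  }
}
}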
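
For \file{froll.c}, the parallel structure of the rolling computation looks roughly like the sketch below (hypothetical \code{roll_mean}); the real \code{frollmean} uses a faster incremental algorithm with proper NA handling, but the windows for different output positions are independent in the same way.

\preformatted{
#include <omp.h>

/* Illustrative sketch only: mean over a sliding window of k observations,
   computed independently for each output position and therefore in
   parallel. frollmean itself is more refined (incremental updates, NA
   handling, adaptive windows), but the independence is the same. */
void roll_mean(const double *x, int n, int k, double *out)
{
  #pragma omp parallel for schedule(static)
  for (int i = k - 1; i < n; i++) {
    double s = 0.0;
    for (int j = i - k + 1; j <= i; j++) s += x[j];
    out[i] = s / k;
  }
}
}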
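
The grouped statistics computed by GForce (\file{gsumm.c}) parallelize naturally over groups, as in the sketch below. It assumes, purely for illustration, that rows belonging to one group are stored contiguously, which is the situation after the data has been ordered by the grouping columns.

\preformatted{
#include <omp.h>

/* Illustrative sketch only: per-group sums where group g occupies rows
   start[g] .. start[g] + len[g] - 1; groups touch disjoint rows, so the
   threads need no synchronization. */
void group_sum(const double *x, const int *start, const int *len,
               int ngroup, double *out)
{
  #pragma omp parallel for schedule(dynamic)
  for (int g = 0; g < ngroup; g++) {
    double s = 0.0;
    for (int j = 0; j < len[g]; j++) s += x[start[g] + j];
    out[g] = s;
  }
}
}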
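
The column-parallel fill in \file{nafill.c} corresponds to the shape sketched below (hypothetical \code{nafill_locf}, with NaN standing in for NA): the fill within a column is inherently sequential, so the parallelism comes from processing columns concurrently.

\preformatted{
#include <math.h>
#include <omp.h>

/* Illustrative sketch only: last observation carried forward, applied to
   each column independently; columns run in parallel, rows in order. */
void nafill_locf(double **cols, int ncol, int nrow)
{
  #pragma omp parallel for schedule(dynamic)
  for (int j = 0; j < ncol; j++) {
    double last = NAN;                 /* nothing to carry forward yet */
    for (int i = 0; i < nrow; i++) {
      if (isnan(cols[j][i])) cols[j][i] = last;
      else last = cols[j][i];
    }
  }
}
}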
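
Finally, the internal test in \file{types.c} exercises a collapsed nested loop with dynamic scheduling, which in generic form looks like the sketch below (the function name and the per-iteration work are placeholders).

\preformatted{
#include <stdio.h>
#include <omp.h>

/* Illustrative sketch only: collapse(2) merges the two loops into one
   iteration space of n * m iterations, and schedule(dynamic) hands out
   iterations to threads as they become free, balancing the workload. */
void test_collapsed_loop(int n, int m)
{
  #pragma omp parallel for collapse(2) schedule(dynamic)
  for (int i = 0; i < n; i++) {
    for (int j = 0; j < m; j++) {
      fputs(".", stdout);   /* the real test prints a message here */
    }
  }
}
}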

In general, across all the aforementioned use cases, better speedup can be expected when dealing with large datasets.

Having such data when using \code{\link{fread}()} or \code{\link{fwrite}()} (which show significant speedups for larger file sizes) also means that while one part of the data is being read from or written to disk (I/O operations), another part can be simultaneously processed using multiple cores (parallel computations). This overlap reduces the total time taken for the read or write operation, as the system can perform computations during otherwise idle I/O time.
