-# TODO(michaelchirico): Enforce these and re-activate them one-by-one.
-implicit_integer_linter=Inf,
-infix_spaces_linter=Inf,
-undesirable_function_linter=Inf
-)),
-exclusion_for_dir("vignettes", list(
-quotes_linter=Inf,
-sample_int_linter=Inf
-# strings_as_factors_linter = Inf
-# system_time_linter = Inf
-)),
-exclusion_for_dir("inst/tests", list(
-library_call_linter=Inf,
-numeric_leading_zero_linter=Inf,
-undesirable_operator_linter=Inf, # For ':::', possibly we could be more careful to only exclude ':::'.
-# TODO(michaelchirico): Enforce these and re-activate them one-by-one.
-comparison_negation_linter=Inf,
-condition_call_linter=Inf,
-duplicate_argument_linter=Inf,
-equals_na_linter=Inf,
-missing_argument_linter=Inf,
-paste_linter=Inf,
-rep_len_linter=Inf,
-sample_int_linter=Inf,
-seq_linter=Inf,
-unnecessary_lambda_linter=Inf
-))
+exclusions=list(
+`../tests`=list(
+quotes_linter=Inf,
+# TODO(michaelchirico): Enforce these and re-activate them one-by-one.
+implicit_integer_linter=Inf,
+infix_spaces_linter=Inf,
+undesirable_function_linter=Inf
+),
+`../vignettes*`=list(
+# assignment_linter = Inf,
+implicit_integer_linter=Inf,
+quotes_linter=Inf,
+sample_int_linter=Inf
+# strings_as_factors_linter = Inf
+# system_time_linter = Inf
+),
+`../inst/tests`=list(
+library_call_linter=Inf,
+numeric_leading_zero_linter=Inf,
+undesirable_operator_linter=Inf, # For ':::', possibly we could be more careful to only exclude ':::'.
+# TODO(michaelchirico): Enforce these and re-activate them one-by-one.
+comparison_negation_linter=Inf,
+condition_call_linter=Inf,
+duplicate_argument_linter=Inf,
+equals_na_linter=Inf,
+missing_argument_linter=Inf,
+paste_linter=Inf,
+rep_len_linter=Inf,
+sample_int_linter=Inf,
+seq_linter=Inf,
+unnecessary_lambda_linter=Inf
+),
+`../inst/tests/froll.Rraw`=list(
+dt_test_literal_linter=Inf # TODO(michaelchirico): Fix these once #5898, #5692, #5682, #5576, #5575, #5441 are merged.
 )
-}),
-list(`../inst/tests/froll.Rraw`=list(dt_test_literal_linter=Inf)) # TODO(michaelchirico): Fix these once #5898, #5692, #5682, #5576, #5575, #5441 are merged.
.ci/README.md (44 additions & 3 deletions)
@@ -1,6 +1,6 @@
 # data.table continuous integration and deployment
 
-On each Pull Request opened in GitHub we run GitHub Actions test jobs to provide prompt feedback about the status of PR. Our main CI pipeline runs on GitLab CI nightly. GitLab repository automatically mirrors our GitHub repository and runs pipeline on `master` branch every night. It tests more environments and different configurations. It publish variety of artifacts.
+On each Pull Request opened in GitHub we run GitHub Actions test jobs to provide prompt feedback about the status of PR. Our more thorough main CI pipeline runs nightly on GitLab CI. In addition to branches pushed directly, the GitLab repository automatically mirrors our GitHub repository and runs pipeline on the `master` branch every night. It tests more environments and different configurations. It publishes a variety of artifacts such as our [homepage](https://rdatatable.gitlab.io/data.table/) and [CRAN-like website for dev version](https://rdatatable.gitlab.io/data.table/web/packages/data.table/index.html), including windows binaries for the dev version.
 
 ## Environments
 
@@ -12,11 +12,16 @@ Test jobs:
 - `test-lin-rel-cran` - `--as-cran` on Linux, strict test for final status of `R CMD check`.
 - `test-lin-dev-gcc-strict-cran` - `--as-cran` on Linux, `r-devel` built with `-enable-strict-barrier --disable-long-double`, test for compilation warnings, test for new NOTEs/WARNINGs from `R CMD check`.
 - `test-lin-dev-clang-cran` - same as `gcc-strict` job but R built with `clang` and no `--enable-strict-barrier --disable-long-double` flags.
-- `test-lin-310-cran` - R 3.1.0 on Linux, stated R dependency version.
+- `test-lin-ancient-cran` - Stated R dependency version (currently 3.4.0) on Linux.
+- `test-lin-dev-clang-san` - `r-devel` on Linux built with `clang -fsanitize=address,undefined` (including LeakSanitizer), test for sanitizer output in tests and examples.
+- `test-lin-dev-gcc-san` - `r-devel` on Linux built with `gcc -fsanitize=address,undefined` (including LeakSanitizer), test for sanitizer output in tests and examples.
 - `test-win-rel` - `r-release` on Windows.
 - `test-win-dev` - `r-devel` on Windows.
 - `test-win-old` - `r-oldrel` on Windows.
-- `test-mac-rel` - macOS build not yet available, see [#3326](https://github.com/Rdatatable/data.table/issues/3326) for status
+- `test-mac-rel` - `r-release` on macOS.
+- `test-mac-old` - `r-oldrel` on macOS.
+
+The CI steps for the tests are [required](https://github.com/Rdatatable/data.table/blob/55eb0f160b169398d51f138131c14a66c86e5dc9/.ci/publish.R#L162-L168) to be named according to the pattern `test-(lin|win|mac)-<R version>[-<suffix>]*`, where `<R version>` is `rel`, `dev`, `old`, `ancient`, or three digits comprising an R version (e.g. `362` corresponding to R-3.6.2).
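Note on the naming rule added above: the pattern can be sanity-checked locally with a regular expression. The sketch below is only an approximation written for illustration. The helper name `is_valid_test_job` and the exact regular expression are assumptions (in particular, a suffix is assumed to be any run of non-hyphen characters); the authoritative check is the linked code in `.ci/publish.R`.

```r
# Hypothetical helper, not part of the repository: approximates the documented
# pattern test-(lin|win|mac)-<R version>[-<suffix>]*
# Assumption: each suffix is one or more characters other than "-".
job_name_pattern <- "^test-(lin|win|mac)-(rel|dev|old|ancient|[0-9]{3})(-[^-]+)*$"

is_valid_test_job <- function(job_names) {
  # grepl() is vectorised, so a whole vector of job names can be checked at once
  grepl(job_name_pattern, job_names)
}

jobs <- c(
  "test-lin-rel-cran",            # OS, R version, one suffix
  "test-lin-dev-gcc-strict-cran", # several suffixes
  "test-mac-old",                 # no suffix
  "test-lin-362-cran",            # three-digit R version
  "build",                        # not a test job
  "test-bsd-rel"                  # OS not in the pattern
)
print(data.frame(job = jobs, valid = is_valid_test_job(jobs)))
```

Under these assumptions the first four names are accepted and the last two are rejected.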
 
 Tests jobs are allowed to fail, summary and logs of test jobs are later published at _CRAN-like checks_ page, see artifacts below.
 
@@ -44,3 +49,39 @@ Base R implemented helper script, [originally proposed to base R](https://svn.r-
 ### [`publish.R`](./publish.R)
 
 Base R implemented helper script to orchestrate generation of most artifacts and to arrange them nicely. It is being used only in [_integration_ stage in GitLab CI pipeline](./../.gitlab-ci.yml).
+
+### [`lint.R`](./lint.R)
+
+Base R runner for the manual (non-`lintr`) lint checks to be run from GitHub Actions during the code quality check. The command line arguments are as follows:
+1. Path to the directory containing files defining the linters. A linter is a function that accepts one argument (typically the path to the file) and signals an error if it fails the lint check.
+2. Path to the directory containing files to check.
+3. A regular expression matching the files to check.
+
+One of the files in the linter directory may define the `.preprocess` function, which must accept one file path and return a value that other linter functions will understand. The function may also return `NULL` to indicate that the file must be skipped.
+
+Example command lines:
+
+```sh
+Rscript .ci/lint.R .ci/linters/c src '[.][ch]$'
+Rscript .ci/lint.R .ci/linters/po po '[.]po$'
+Rscript .ci/lint.R .ci/linters/md . '[.]R?md$'
+```
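Note on the linter contract described above: a minimal sketch of what a file in a linter directory could look like. Only the `.preprocess` name and the contract (one argument in, error on failure, `NULL` to skip) come from the README text. The linter `trailing_whitespace_linter`, the shape of the preprocessed value, and the small driver at the end are hypothetical and are not taken from the repository.

```r
# Hypothetical preprocessing step: turn a file path into a value the linters
# below understand. Returning NULL signals that the file should be skipped.
.preprocess <- function(path) {
  lines <- readLines(path, warn = FALSE)
  if (!length(lines)) return(NULL) # nothing to check in an empty file
  list(path = path, lines = lines)
}

# Hypothetical linter: accepts the preprocessed value and signals an error
# if any line ends in spaces or tabs.
trailing_whitespace_linter <- function(file) {
  bad <- grep("[ \t]+$", file$lines)
  if (length(bad)) {
    stop(sprintf("%s: trailing whitespace on line(s) %s", file$path, toString(bad)))
  }
  invisible(TRUE)
}

# Hand-rolled driver for this sketch only; the real runner is .ci/lint.R.
tmp <- tempfile(fileext = ".md")
writeLines(c("# Title", "clean line", "dirty line  "), tmp)
result <- tryCatch(
  trailing_whitespace_linter(.preprocess(tmp)),
  error = function(e) conditionMessage(e)
)
print(result) # the error message names line 3 of the temporary file
```

Signalling failure with `stop()` keeps each linter independent of the runner, which only needs to catch the condition and report it.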
+
+## GitLab Open Source Program
+
+We are currently part of the [GitLab for Open Source Program](https://about.gitlab.com/solutions/open-source/). This gives us 50,000 compute minutes per month for our GitLab CI. Our license needs to be renewed yearly (around July) and is currently managed by @ben-schwen.
+
+## Updating CI pipeline
+
+Basic CI checks are also run on every push to the GitLab repository. This can **and should** be used for PRs changing the CI pipeline before merging them to master.
+
+```shell
+# fetch changes from remote (GitHub) and push them to GitLab
.ci/atime/tests.R (72 additions & 3 deletions)
@@ -1,3 +1,5 @@
+pval.thresh<-0.001 # to reduce false positives.
+
 # Test case adapted from https://github.com/Rdatatable/data.table/issues/6105#issue-2268691745 which is where the issue was reported.
 # https://github.com/Rdatatable/data.table/pull/6107 fixed performance across 3 ways to specify a column as Date, and we test each individually.
 extra.args.6107<- c(
@@ -13,12 +15,38 @@ for (extra.arg in extra.args.6107){
 tmp_csv= tempfile()
 fwrite(DT, tmp_csv)
 },
+FasterIO="60a01fa65191c44d7997de1843e9a1dfe5be9f72", # First commit of the PR (https://github.com/Rdatatable/data.table/pull/6925/commits) that reduced time usage
 Slow="e9087ce9860bac77c51467b19e92cf4b72ca78c7", # Parent of the merge commit (https://github.com/Rdatatable/data.table/commit/a77e8c22e44e904835d7b34b047df2eff069d1f2) of the PR (https://github.com/Rdatatable/data.table/pull/6107) that fixes the issue
 Fast="a77e8c22e44e904835d7b34b047df2eff069d1f2") # Merge commit of the PR (https://github.com/Rdatatable/data.table/pull/6107) that fixes the issue
[...]
+"forderv(retGrp=%s) improved in #4386", retGrp_chr
+)]] <-list(
+setup= quote({
+dt<- data.table(group= rep(1:2, l=N))
+}),
+expr= substitute({
+old.opt<- options(datatable.forder.auto.index=TRUE) # required for test, un-documented, comments in forder.c say it is for debugging only.
+data.table:::forderv(dt, "group", retGrp=RETGRP)
+options(old.opt) # so the option does not affect other tests.
+}, list(RETGRP=eval(str2lang(retGrp_chr)))),
+## From ?bench::mark, "Each expression will always run at least twice,
+## once to measure the memory allocation and store results
+## and one or more times to measure timing."
+## So for atime(times=10) that means 11 times total.
+## First time for memory allocation measurement,
+## (also sets the index of dt in this example),
+## then 10 more times for time measurement.
+## Timings should be constant if the cached index is used (Fast),
+## and (log-)linear if the index is re-computed (Slow).
+Slow="b1b1832b0d2d4032b46477d9fe6efb15006664f4", # Parent of the first commit (https://github.com/Rdatatable/data.table/commit/b0efcf59442a7d086c6df17fa6a45c81b082322e) in the PR (https://github.com/Rdatatable/data.table/pull/4386/commits) where the performance was improved.
+Fast="ffe431fbc1fe2d52ed9499f78e7e16eae4d71a93" # Last commit of the PR (https://github.com/Rdatatable/data.table/pull/4386/commits) where the performance was improved.
+)
+
 # A list of performance tests.
 #
 # See documentation in https://github.com/Rdatatable/data.table/wiki/Performance-testing for best practices.
[...]
+# Test case adapted from https://github.com/Rdatatable/data.table/pull/7022#discussion_r2107900643
+"fread disk overhead improved in #6925"=atime::atime_test(
+N=2^seq(0, 20), # smaller N because we are doing multiple fread calls.
+setup= {
+fwrite(iris[1], iris.csv<- tempfile())
+},
+expr= replicate(N, data.table::fread(iris.csv)),
+Fast="60a01fa65191c44d7997de1843e9a1dfe5be9f72", # First commit of the PR (https://github.com/Rdatatable/data.table/pull/6925/commits) that reduced time usage
+Slow="e25ea80b793165094cea87d946d2bab5628f70a6" # Parent of the first commit (https://github.com/Rdatatable/data.table/commit/60a01fa65191c44d7997de1843e9a1dfe5be9f72)
+),
+
 # Performance regression discussed in https://github.com/Rdatatable/data.table/issues/4311
 # Test case adapted from https://github.com/Rdatatable/data.table/pull/4440#issuecomment-632842980 which is the fix PR.
 "shallow regression fixed in #4440"=atime::atime_test(
[...]
 Slow="fd24a3105953f7785ea7414678ed8e04524e6955", # Parent of the merge commit (https://github.com/Rdatatable/data.table/commit/ed72e398df76a0fcfd134a4ad92356690e4210ea) of the PR (https://github.com/Rdatatable/data.table/pull/5054) that fixes the issue
-Fast="ed72e398df76a0fcfd134a4ad92356690e4210ea"), # Merge commit of the PR (https://github.com/Rdatatable/data.table/pull/5054) that fixes the issue
+Fast="ed72e398df76a0fcfd134a4ad92356690e4210ea"), # Merge commit of the PR (https://github.com/Rdatatable/data.table/pull/5054) that fixes the issue # Test case created directly using the atime code below (not adapted from any other benchmark), based on the issue/fix PR https://github.com/Rdatatable/data.table/pull/5054#issue-930603663 "melt should be more efficient when there are missing input columns."
+
+# Test case created from @tdhock's comment https://github.com/Rdatatable/data.table/pull/6393#issuecomment-2327396833, in turn adapted from @philippechataignon's comment https://github.com/Rdatatable/data.table/pull/6393#issuecomment-2326714012
[...]
+Before="f339aa64c426a9cd7cf2fcb13d91fc4ed353cd31", # Parent of the first commit https://github.com/Rdatatable/data.table/commit/fcc10d73a20837d0f1ad3278ee9168473afa5ff1 in the PR https://github.com/Rdatatable/data.table/pull/6393/commits with major change to fwrite with gzip.
+PR="3630413ae493a5a61b06c50e80d166924d2ef89a"), # Close-to-last merge commit in the PR.
+
+# Test case created directly using the atime code below (not adapted from any other benchmark), based on the PR, Removes unnecessary data.table call from as.data.table.array https://github.com/Rdatatable/data.table/pull/7010
+"as.data.table.array improved in #7010"=atime::atime_test(
[...]
+Slow="73d79edf8ff8c55163e90631072192301056e336", # Parent of the first commit in the PR (https://github.com/Rdatatable/data.table/commit/8397dc3c993b61a07a81c786ca68c22bc589befc)
+Fast="8397dc3c993b61a07a81c786ca68c22bc589befc"), # Commit in the PR (https://github.com/Rdatatable/data.table/pull/7019/commits) that removes inefficiency