Rework run_cpp_tests() to avoid testthat internals #2315
Conversation
```r
    )
  },
  error = function(e) {
    catch_error <- TRUE
```
R/test-compiled-code.R (Outdated)

```r
context_start(context_name)

tests <- xml2::xml_find_all(context, "./Section")
```
All of the xml2 stuff in this file remains untouched. It's just the setup and reporting of pass/fail that has changed.
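For reference, that traversal looks roughly like this (a sketch assembled only from fragments quoted elsewhere in this thread; `context` is assumed to be an xml2 node for one C++ context):

```r
# Sketch only; the element and attribute names are the ones quoted in this PR.
tests <- xml2::xml_find_all(context, "./Section")
for (test in tests) {
  test_name <- xml2::xml_attr(test, "name")                 # C++ test_that() description
  result <- xml2::xml_find_first(test, "./OverallResults")  # per-test totals
  successes <- as.integer(xml2::xml_attr(result, "successes"))
}
```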
R/test-compiled-code.R (Outdated)

```r
for (test in tests) {
  test_description <- xml2::xml_attr(test, "name")

  test_that(test_description, {
```
"For each test in a context, set up an R level test_that() block"
You should consider if you should use test_code() here instead, since it's a little lower level.
I tried that at first, see 99dde78.

The problem was that we need to record the test_description somewhere for each test, and test_code() expects that you have already set that up somehow.

I had come up with a new with_description()-style helper that calls test_code() inside it, but that's literally what test_that() does with extra steps, so I simplified it back to test_that().
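For the record, the discarded helper would have looked something like this sketch (hypothetical; `local_description_set()` and the `test_code()` signature are assumed from the diff later in this thread and may not match the internals exactly):

```r
# Hypothetical, never-merged helper: set the active description, then call
# the lower-level test_code(). test_that() already does both of these, which
# is why this was simplified back to test_that().
with_description <- function(description, code, env = parent.frame()) {
  local_description_set(description)  # assumed internal helper
  test_code(description, substitute(code), env = env)  # assumed signature
}
```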
```r
# There is no `fail()` equivalent for an error.
# We could use `stop()`, but we want to pass through a `srcref`.
expectation(
  type = "error",
  message = exception_text,
  srcref = exception_srcref
)
```
Open to better ideas here. I'm also not entirely sure how to trigger this case.
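For comparison, a sketch of the `stop()` route the comment rules out; the point is that a plain error condition has no slot for the C++ `srcref`:

```r
# Rejected alternative: stop() signals a plain error condition, so the
# srcref pointing into src/<file>.cpp would be lost and the error would be
# attributed to the R code instead of the C++ test.
stop(exception_text, call. = FALSE)
```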
This was a useful Claude experiment for me because even though `the$test_expectations <- the$test_expectations + 1L` was very much the wrong solution for this, it very quickly helped me isolate what the actual problem was (that would have taken me an hour or more by myself).
And I also used Claude for the first draft of the pass() / fail() solution, which was pretty helpful even though it also wasn't great yet.
But overall it gave me momentum on a bug that I definitely would have just given up on otherwise, because testthat is just too unfamiliar to me.
I feel like there's an unexplored or unnamed usage of Claude: "just find the bug and explain it to me; I'll be the one to work on the solution." That's still quite useful in saving a bunch of upfront time.
Here's my whole Claude conversation
Details
╭─── Claude Code v2.1.14 ──────────────────────────────────────────────────────╮
│ │ Tips for getting started │
│ Welcome back! │ Run /init to create a CLAUDE.md file wit… │
│ │ ───────────────────────────────────────── │
│ ▐▛███▜▌ │ Recent activity │
│ ▝▜█████▛▘ │ No recent activity │
│ ▘▘ ▝▝ │ │
│ │ │
│ Opus 4.5 · API Usage Billing │ │
│ ~/files/r/packages/cpp11 │ │
╰──────────────────────────────────────────────────────────────────────────────╯
Welcome to Opus 4.5
❯ There seems to be a bug between cpp11 and testthat where a single test is
always reported as skipped.
If you run the cpp11test tests as outlined in CLAUDE.md, you should see a
single test that is skipped.
I believe this is a bug in testthat, possibly in
`testthat::run_cpp_tests`.
I have checked out a branch of testthat in
`/Users/davis/files/r/packages/testthat` for you to modify. You may work
in that directory, but not in any other one.
See if you can figure out what is wrong in testthat that causes a single
test to always be skipped. You can check your changes in testthat by
running the following:
```r
Rscript -e "
devtools::install('/Users/davis/files/r/packages/testthat')
devtools::clean_dll('./cpp11test')
devtools::test('./cpp11test')
"
```
⏺ I'll investigate this issue. Let me start by running the cpp11test tests to see the skipped test, and then explore the testthat code.
⏺ Bash(Rscript -e " timeout: 5m 0s
devtools::clean_dll('./cpp11test')…)
⎿ ℹ Testing cpp11test
SKIP: --------------------------------------------
Reason: empty test
… +2 lines (ctrl+o to expand)
⏺ I can see there's 1 test skipped with reason "empty test". Let me look at the testthat code, specifically run_cpp_tests.
⏺ Search(pattern: "run_cpp_tests", path: "~/files/r/packages/testthat")
⎿ Found 6 files (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 440 lines
⏺ Let me search for "empty test" in testthat to find where this skip message comes from.
⏺ Search(pattern: "empty test", path: "~/files/r/packages/testthat",
output_mode: "content")
⎿ Found 33 lines (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/skip.R)
⎿ Read 330 lines
⏺ Let me look at what context_start does and how tests are reported.
⏺ Search(pattern: "context_start", path: "~/files/r/packages/testthat")
⎿ Found 8 files (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/context.R)
⎿ Read 56 lines
⏺ Let me search for where skip_empty is called.
⏺ Search(pattern: "skip_empty", path: "~/files/r/packages/testthat",
output_mode: "content")
⎿ Found 5 lines (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/test-that.R)
⎿ Read 260 lines
⏺ Let me look at the test-cpp.R template file to understand how run_cpp_tests is called.
⏺ Read(~/files/r/packages/testthat/inst/resources/test-cpp.R)
⎿ Read 2 lines
⏺ Read(cpp11test/tests/testthat/test-cpp.R)
⎿ Read 2 lines
⏺ Let me look at how test files are run in testthat.
⏺ Search(pattern: "def test_file", path: "~/files/r/packages/testthat")
⎿ Found 0 files
⏺ Search(pattern: "test_file.*<-.*function", path:
"~/files/r/packages/testthat", glob: "*.R")
⎿ Found 2 files (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/test-files.R)
⎿ Read 476 lines
⏺ Let me look at the source_file function.
⏺ Search(pattern: "^source_file.*<-.*function", path:
"~/files/r/packages/testthat/R")
⎿ Found 1 file (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/source.R)
⎿ Read 165 lines
⏺ I found the issue. When source_file runs with wrap = TRUE (the default), it wraps the entire file content in test_code(). At the end of test_code(), it checks if any expectations were recorded by comparing the$test_expectations before and after. If no new expectations were recorded, it calls skip_empty().
The problem is that run_cpp_tests reports tests directly to the reporter via get_reporter()$add_result() without incrementing the$test_expectations. So even though 962 C++ tests passed, testthat thinks the file had no expectations and reports a skip.
Let me verify this theory by checking how the$test_expectations is accessed.
⏺ Search(pattern: "the\$test_expectations", path:
"~/files/r/packages/testthat/R", output_mode: "content")
⎿ Found 8 lines (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/testthat-package.R)
⎿ Read 41 lines
⏺ Now I understand the issue completely. Let me add the fix to run_cpp_tests. When it processes test results, it needs to increment the$test_expectations so that the wrapping test_code() knows that tests were actually run.
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 120 lines
⏺ Now I'll add the fix. The issue is that run_cpp_tests reports test results directly to the reporter without incrementing the$test_expectations. I need to increment this counter for each result added.
⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Added 1 line
      96    get_reporter()$start_test(context = context_name, test = test_name)
      97
      98    for (i in seq_len(successes)) {
      99 +    the$test_expectations <- the$test_expectations + 1L
     100      exp <- new_expectation("success", "")
     101      exp$test <- test_name
     102      get_reporter()$add_result(
⏺ Now let me add the same fix for failures and exceptions.
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 65 lines
⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Added 2 lines
     130        c(line, line, 1, 1)
     131      )
     132
     133 +    the$test_expectations <- the$test_expectations + 1L
     134      exp <- new_expectation("failure", org_text, srcref = failure_srcref)
     135      exp$test <- test_name
     136
    ...
     142
     143    exceptions <- xml2::xml_find_all(test, "./Exception")
     144    for (exception in exceptions) {
     145 +    the$test_expectations <- the$test_expectations + 1L
     146      exception_text <- xml2::xml_text(exception, trim = TRUE)
     147      filename <- xml2::xml_attr(exception, "filename")
     148      line <- xml2::xml_attr(exception, "line")
⏺ Let me also check the error handling case at the beginning of run_cpp_tests.
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 25 lines
⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Added 1 line
62 catch_error <- TRUE
63 reporter <- get_reporter()
64
65 + the$test_expectations <- the$test_expectations + 1L
66 context_start("Catch")
67 reporter$start_test(context = "Catch", test = "Catch")
68 reporter$add_result(
⏺ Now let me test the fix by reinstalling testthat and running the cpp11test tests.
⏺ Bash(Rscript -e " timeout: 5m
devtools::install('/Users/davis/files/r/packages/testthat 0s
')…)
⎿ rlang (1df507ece... -> 0fd422994...) [GitHub]
pkgload (6cc3650a0... -> [{"sha":"...) [GitHub]
Downloading GitHub repo r-lib/rlang@HEAD
… +166 lines (ctrl+o to expand)
⏺ The fix works. The output now shows:
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 962 ]
The skip count went from 1 to 0.
Root cause: When a test file is sourced, source_file() wraps the entire file content in test_code() (with wrap = TRUE). At the end of test_code(), it checks if any expectations were recorded by comparing the$test_expectations before and after execution. If no new expectations were recorded, it calls skip_empty().
The problem was that run_cpp_tests() reports test results directly to the reporter via get_reporter()$add_result() without incrementing the$test_expectations. So even though 962 C++ tests passed, testthat thought the file had no expectations and reported an "empty test" skip.
Fix: I added the$test_expectations <- the$test_expectations + 1L before each add_result() call in run_cpp_tests():
- Line 65: In the error handler when catching fails
- Line 99: For each success
- Line 133: For each failure
- Line 146: For each exception
✻ Baked for 3m 33s
❯ /export
⎿ Conversation exported to:
2026-01-21-there-seems-to-be-a-bug-between-cpp11-and-testthat.txt
❯ /cost
⎿ Total cost: $1.08
Total duration (API): 2m 50s
Total duration (wall): 18h 26m 50s
Total code changes: 4 lines added, 0 lines removed
Usage by model:
claude-haiku: 4.8k input, 153 output, 0 cache read, 0 cache write
($0.0056)
claude-opus-4-5: 210 input, 7.5k output, 1.0m cache read, 60.2k cache
write ($1.08)
❯ That seems like the right idea, but it seems like the real issue is that
`run_cpp_tests()` is very low level and is not emitting `pass()` or
`fail()` after processing each test, which means that the
`handle_expectation` handler in `test_code()` is not being emitted, and
that is what normally bumps `the$test_expectations`. Is there a way to
transition `run_cpp_tests()` to use `pass()` and `fail()` and friends
instead?
⏺ You're right, that's a cleaner approach. Let me look at how pass() and fail() work.
⏺ Search(pattern: "^(pass|fail)\s*<-\s*function", path:
"~/files/r/packages/testthat/R")
⎿ Found 1 file (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/expect-that.R)
⎿ Read 92 lines
⏺ Search(pattern: "^expectation\s*<-\s*function", path:
"~/files/r/packages/testthat/R")
⎿ Found 1 file (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/expectation.R)
⎿ Read 252 lines
⏺ You're right. Let me trace through how succeed() and fail() work to see if we can use them properly.
The issue is that fail() calls exp_signal() which calls stop(exp) for failures - this would interrupt processing after the first failure. For C++ tests, we want to report all failures.
However, we could:
1. Use succeed() for successes (which signals without throwing)
2. For failures/errors, signal the expectation as a condition without throwing
Let me create a cleaner implementation:
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 444 lines
⏺ Now let me rewrite run_cpp_tests to use proper expectation signaling. The key insight is:
1. succeed() signals a success condition - this works
2. For failures, we can't use fail() (it throws and stops processing). Instead, we signal the failure as a condition without throwing
3. We need to set up the$description properly so test_description() returns the right test name
⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Added 78 lines, removed 76 lines
      59        )
      60      },
      61      error = function(e) {
      62 -      catch_error <- TRUE
      63 -      reporter <- get_reporter()
      62 +      catch_error <<- TRUE
      63
      64 -      the$test_expectations <- the$test_expectations + 1L
      64        context_start("Catch")
      65 -      reporter$start_test(context = "Catch", test = "Catch")
      66 -      reporter$add_result(
      67 -        context = "Catch",
      68 -        test = "Catch",
      69 -        result = new_expectation("failure", e$message)
      70 -      )
      71 -      reporter$end_test(context = "Catch", test = "Catch")
      65 +      process_cpp_test_section_error("Catch", e$message)
      66      }
      67    )
      68
    ...
      89
      90    tests <- xml2::xml_find_all(context, "./Section")
      91    for (test in tests) {
      92 -    test_name <- xml2::xml_attr(test, "name")
      92 +    process_cpp_test_section(test, context_name)
      93 +  }
      94 +}
      95 +
      96
      97 -    result <- xml2::xml_find_first(test, "./OverallResults")
      98 -    successes <- as.integer(xml2::xml_attr(result, "successes"))
      97 +# Helper to process a single C++ test section
      98 +# Using a separate function ensures proper scoping for local_description_set()
      99 +process_cpp_test_section <- function(test, context_name) {
     100 +  test_name <- xml2::xml_attr(test, "name")
     101
     102 -    get_reporter()$start_test(context = context_name, test = test_name)
     102 +  # Set up the description so test_description() returns the correct test name
     103 +  # when expectations are signaled and handled
     104
     105 -    for (i in seq_len(successes)) {
     106 -      the$test_expectations <- the$test_expectations + 1L
     107 -      exp <- new_expectation("success", "")
     108 -      exp$test <- test_name
     109 -      get_reporter()$add_result(
     110 -        context = context_name,
     111 -        test = test_name,
     112 -        result = exp
     113 -      )
     114 -    }
     105 +  local_description_set(test_name)
     106
     107 -    failures <- xml2::xml_find_all(test, "./Expression")
     108 -    for (failure in failures) {
     109 -      org <- xml2::xml_find_first(failure, "Original")
     110 -      org_text <- xml2::xml_text(org, trim = TRUE)
     107 +  result <- xml2::xml_find_first(test, "./OverallResults")
     108 +  successes <- as.integer(xml2::xml_attr(result, "successes"))
     109
     110 -      filename <- xml2::xml_attr(failure, "filename")
     111 -      type <- xml2::xml_attr(failure, "type")
     110 +  get_reporter()$start_test(context = context_name, test = test_name)
     111
     112 -      type_msg <- switch(
     113 -        type,
     114 -        "CATCH_CHECK_FALSE" = "isn't false.",
     115 -        "CATCH_CHECK_THROWS" = "did not throw an exception.",
     116 -        "CATCH_CHECK_THROWS_AS" = "threw an exception with unexpected type.",
     117 -        "isn't true."
     118 -      )
     112 +  for (i in seq_len(successes)) {
     113 +    succeed("")
     114 +  }
     115
     116 -      org_text <- paste(org_text, type_msg)
     116 +  failures <- xml2::xml_find_all(test, "./Expression")
     117 +  for (failure in failures) {
     118 +    org <- xml2::xml_find_first(failure, "Original")
     119 +    org_text <- xml2::xml_text(org, trim = TRUE)
     120
     121 -      line <- xml2::xml_attr(failure, "line")
     122 -      failure_srcref <- srcref(
     123 -        srcfile(file.path("src", filename)),
     124 -        c(line, line, 1, 1)
     125 -      )
     121 +    filename <- xml2::xml_attr(failure, "filename")
     122 +    type <- xml2::xml_attr(failure, "type")
     123
     124 -      the$test_expectations <- the$test_expectations + 1L
     125 -      exp <- new_expectation("failure", org_text, srcref = failure_srcref)
     126 -      exp$test <- test_name
     124 +    type_msg <- switch(
     125 +      type,
     126 +      "CATCH_CHECK_FALSE" = "isn't false.",
     127 +      "CATCH_CHECK_THROWS" = "did not throw an exception.",
     128 +      "CATCH_CHECK_THROWS_AS" = "threw an exception with unexpected type.",
     129 +      "isn't true."
     130 +    )
     131
     132 -      get_reporter()$add_result(
     133 -        context = context_name,
     134 -        test = test_name,
     135 -        result = exp
     136 -      )
     137 -    }
     132 +    org_text <- paste(org_text, type_msg)
     133
     134 -    exceptions <- xml2::xml_find_all(test, "./Exception")
     135 -    for (exception in exceptions) {
     136 -      the$test_expectations <- the$test_expectations + 1L
     137 -      exception_text <- xml2::xml_text(exception, trim = TRUE)
     138 -      filename <- xml2::xml_attr(exception, "filename")
     139 -      line <- xml2::xml_attr(exception, "line")
     134 +    line <- xml2::xml_attr(failure, "line")
     135 +    failure_srcref <- srcref(
     136 +      srcfile(file.path("src", filename)),
     137 +      c(line, line, 1, 1)
     138 +    )
     139
     140 -      exception_srcref <- srcref(
     141 -        srcfile(file.path("src", filename)),
     142 -        c(line, line, 1, 1)
     143 -      )
     140 +    # Signal the failure as a condition (not an error) so we can report
     141 +    # multiple failures without stopping. The expectation handler will
     142 +    # catch this and properly register it.
     143 +    exp <- new_expectation("failure", org_text, srcref = failure_srcref)
     144 +    signal_cpp_expectation(exp)
     145 +  }
     146
     147 -      exp <- new_expectation(
     148 -        "error",
     149 -        exception_text,
     150 -        srcref = exception_srcref
     151 -      )
     152 -      exp$test <- test_name
     147 +  exceptions <- xml2::xml_find_all(test, "./Exception")
     148 +  for (exception in exceptions) {
     149 +    exception_text <- xml2::xml_text(exception, trim = TRUE)
     150 +    filename <- xml2::xml_attr(exception, "filename")
     151 +    line <- xml2::xml_attr(exception, "line")
     152
     153 -      get_reporter()$add_result(
     154 -        context = context_name,
     155 -        test = test_name,
     156 -        result = exp
     157 -      )
     158 -    }
     153 +    exception_srcref <- srcref(
     154 +      srcfile(file.path("src", filename)),
     155 +      c(line, line, 1, 1)
     156 +    )
     157
     158 -    get_reporter()$end_test(context = context_name, test = test_name)
     159 -  }
     158 +    exp <- new_expectation("error", exception_text, srcref = exception_srcref)
     159 +    signal_cpp_expectation(exp)
     160    }
     161 +
     162 +  get_reporter()$end_test(context = context_name, test = test_name)
     163  }
     164
     165 +# Helper to process an error that occurred during C++ test execution
     166 +process_cpp_test_section_error <- function(test_name, message) {
     167 +  local_description_set(test_name)
     168 +
     169 +  get_reporter()$start_test(context = "Catch", test = test_name)
     170 +
     171 +  exp <- new_expectation("failure", message)
     172 +  signal_cpp_expectation(exp)
     173 +
     174 +  get_reporter()$end_test(context = "Catch", test = test_name)
     175 +}
     176 +
     177 +# Signal an expectation condition without throwing for failures/errors
     178 +# This allows C++ tests to report multiple failures without stopping
     179 +signal_cpp_expectation <- function(exp) {
     180 +  withRestarts(
     181 +    signalCondition(exp),
     182 +    muffle_expectation = function(e) NULL
     183 +  )
     184 +}
     185 +
     186  #' Use Catch for C++ unit testing
     187  #'
     188  #' Add the necessary infrastructure to enable C++ unit testing
⏺ Now let me test this cleaner implementation.
⏺ Bash(Rscript -e " timeout: 5m
devtools::install('/Users/davis/files/r/packages/testthat 0s
')…)
⎿ pkgload (6d29dc983... -> [{"sha":"...) [GitHub]
Downloading GitHub repo r-lib/pkgload@
… +105 lines (ctrl+o to expand)
⏺ The cleaner implementation works. Here's a summary of the changes:
Key changes:
1. succeed("") for successes - this signals an expectation condition that gets caught by handle_expectation in test_code(), which properly increments the$test_expectations and registers the result
2. signal_cpp_expectation() helper - for failures and errors, we can't use fail() because it throws (which would stop processing after the first failure). Instead, this helper signals the expectation as a condition using signalCondition() rather than stop(). This allows multiple failures to be reported.
3. local_description_set(test_name) in process_cpp_test_section() - ensures test_description() returns the correct test name when expectations are handled by register_expectation()
4. Factored out helper functions - process_cpp_test_section() and process_cpp_test_section_error() provide proper scoping for local_description_set() and make the code cleaner
The flow now is:
1. succeed("") / signal_cpp_expectation() signal condition
2. handle_expectation in test_code() catches it
3. Handler increments the$test_expectations and calls register_expectation()
4. register_expectation() uses test_description() (which reads the$description we set) and adds to reporter
✻ Crunched for 4m 14s
My two prompts were
❯ There seems to be a bug between cpp11 and testthat where a single test is
always reported as skipped.
If you run the cpp11test tests as outlined in CLAUDE.md, you should see a
single test that is skipped.
I believe this is a bug in testthat, possibly in
`testthat::run_cpp_tests`.
I have checked out a branch of testthat in
`/Users/davis/files/r/packages/testthat` for you to modify. You may work
in that directory, but not in any other one.
See if you can figure out what is wrong in testthat that causes a single
test to always be skipped. You can check your changes in testthat by
running the following:
```r
Rscript -e "
devtools::install('/Users/davis/files/r/packages/testthat')
devtools::clean_dll('./cpp11test')
devtools::test('./cpp11test')
"
```
❯ That seems like the right idea, but it seems like the real issue is that
`run_cpp_tests()` is very low level
and is not emitting `pass()` or `fail()` after processing each test,
which means that the `handle_expectation`
handler in `test_code()` is not being emitted, and that is what
normally bumps `the$test_expectations`. Is
there a way to transition `run_cpp_tests()` to use `pass()` and
`fail()` and friends instead?
…escription()` correct so we don't report "code run outside of test_that"
Since we are now inside a `test_that()`, which has its own local `test_code()`, where each of those has its own `tryCatch()`
Force-pushed from 9a5d51f to 18d248c
R/test-compiled-code.R (Outdated)

```r
test_description <- xml2::xml_attr(test, "name")

test_that(test_description, {
  result <- xml2::xml_find_first(test, "./OverallResults")
```
Probably out of scope for this PR, but it would probably be better to do `result <- xml2::xml_find_first(test, "./OverallResults|./Expression|./Exception")` in order to preserve the original ordering of successes, failures, and errors (sketched below).
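If that suggestion were pursued, the shape might be something like this (a sketch; `fail_from_expression()` and `error_from_exception()` are hypothetical helpers):

```r
# xml_find_all() returns matches in document order, so a single pass over
# the union would interleave failures and errors as Catch emitted them.
nodes <- xml2::xml_find_all(test, "./OverallResults|./Expression|./Exception")
for (node in nodes) {
  switch(
    xml2::xml_name(node),
    Expression = fail_from_expression(node),  # hypothetical helper
    Exception = error_from_exception(node),   # hypothetical helper
    OverallResults = NULL                     # only totals, no positions
  )
}
```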
I've determined that I don't think that will actually help that much.

OverallResults reports something like

```xml
<OverallResults successes="5" failures="1" expectedFailures="0"/>
```

from which we extract 5 to call pass() 5 times, but we don't know where those successes fall relative to the 1 failure.

A single test_that block on the cpp11 side might look like

```cpp
test_that("data_frame works") {
  auto getExportedValue = cpp11::package("base")["getExportedValue"];
  auto mtcars = getExportedValue("datasets", "mtcars");
  cpp11::data_frame mtcars_df(mtcars);

  expect_true(mtcars_df.nrow() == 31);
  expect_true(mtcars_df.ncol() == 11);

  cpp11::strings names(mtcars_df.names());
  expect_true(names[0] == "mpg");
  expect_true(names[7] == "vs");

  auto iris = getExportedValue("datasets", "iris");
  cpp11::data_frame iris_df(iris);
  expect_true(iris_df.nrow() == 150);
  expect_true(iris_df.ncol() == 5);
}
```

but Catch just tells us this:

```
{xml_node}
<Section name="data_frame works" filename="test-data_frame.cpp" line="9">
[1] <Expression success="false" type="CATCH_CHECK" filename="test-data_frame.cpp" line="14">\n <Original>\n mtcars_df.nrow() == 31\n </Original>\n <Expanded>\n 32 == 31\n </Ex ...
[2] <OverallResults successes="5" failures="1" expectedFailures="0"/>
```

i.e. we see 5 successes and 1 failure, and we can get info about that failure from the Expression, but the successes aren't listed interleaved with that failure, so we can't call pass() and fail() in the right order.
I'm also not sure it matters much. Whether we call pass() and fail() in the right order doesn't seem that important, right? Because pass() never emits anything to the screen.
The one case that might be kinda important is emitting fail() vs expectation("error") in the right order, but that feels rare enough that I'm ok with the current setup.
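Concretely, the totals are all the XML hands us; only failures and errors get nodes of their own (the extraction below matches the diff earlier in this thread):

```r
# Successes are just a count on OverallResults, with no nodes of their own,
# so their positions relative to the failures are unrecoverable.
result <- xml2::xml_find_first(test, "./OverallResults")
successes <- as.integer(xml2::xml_attr(result, "successes"))
failures <- xml2::xml_find_all(test, "./Expression")  # these do have nodes
```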
Ok, makes sense.
```r
for (test in context$tests) {
  test_that(test$name, {
    for (i in seq_len(test$n_successes)) {
      pass()
```
This now makes it very clear why order doesn't matter.


Currently there is an issue where if you have a test file that only calls `testthat::run_cpp_tests()` and has no other tests in it, then you'll get "skipped 1 empty test" every time. You can force this to appear in testthat itself by commenting out all the other tests in the test file that calls `run_cpp_tests()` for testthat.

The main issue is that `run_cpp_tests()` is super low level, and is trying to fully handle the recording of each test result to the reporter. For example, here is how we reported successes:

testthat/R/test-compiled-code.R Lines 96 to 106 in 34d9ab9

I was expecting to see something like `pass()` in here, and the `get_reporter()$start_test` call is also weird to me. It's very low level. It doesn't actually signal any `expectation()` conditions, so we never hit any `handle_*()` paths like this one:

testthat/R/test-that.R Lines 139 to 143 in 34d9ab9

Note this line

This never gets run with `run_cpp_tests()`, so this eventually ends up calling `skip_empty()` here (sketched at the bottom of this description):

testthat/R/test-that.R Lines 194 to 200 in 34d9ab9

And that's why we are seeing a single skipped test. To testthat it doesn't look like we ran any!

I tried diagnosing this with Claude, and it got me this far ^, but its solution was "hey, you should just add `the$test_expectations <- the$test_expectations + 1L` all over the place in `run_cpp_tests()`". I was not satisfied with that solution, as it would be nice if only `test_code()` was in charge of touching that global.

Instead, I went through a few rounds of refactoring (starting with Claude, but I eventually took over) to recognize that we can avoid pretty much all testthat internals in `run_cpp_tests()`. Instead we now:

- call `context_start()` (this is a testthat internal function, but is needed here)
- for each C++ `test_that()` macro in that context, set up an R level `test_that()` block
  - the `description` comes over from the C++ `test_that()` macro
  - the `expr` is a series of `pass()`, `fail()`, or `expectation("error")` calls, depending on the test results for that test

This avoids pretty much all of the testthat internals of having to mess with a reporter directly! We just go through `test_code()` directly, so it gets to handle `the$test_expectations` and hit all the `handle_*()` functions.

If you just read `run_cpp_tests()` from top to bottom now, I think you'll probably just nod along like "yeah, that makes sense", compared to the internal manipulation that was occurring before.