Rework run_cpp_tests() to avoid testthat internals #2315
Conversation
```r
    )
  },
  error = function(e) {
    catch_error <- TRUE
```
R/test-compiled-code.R (Outdated)

```r
context_start(context_name)

tests <- xml2::xml_find_all(context, "./Section")
```
All of the xml2 stuff in this file remains untouched. It's just the setup and reporting of pass/fail that has changed.
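For reference, that traversal looks roughly like this (a sketch assembled only from fragments quoted elsewhere in this thread; `context` is assumed to be an xml2 node for one C++ context):

```r
# Sketch only; the element and attribute names are the ones quoted in this PR.
tests <- xml2::xml_find_all(context, "./Section")
for (test in tests) {
  test_name <- xml2::xml_attr(test, "name")                 # C++ test_that() description
  result <- xml2::xml_find_first(test, "./OverallResults")  # per-test totals
  successes <- as.integer(xml2::xml_attr(result, "successes"))
}
```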
R/test-compiled-code.R (Outdated)

```r
for (test in tests) {
  test_description <- xml2::xml_attr(test, "name")

  test_that(test_description, {
```
"For each test in a context, set up an R level test_that() block"
You should consider if you should use test_code() here instead, since it's a little lower level.
I tried that at first, see 99dde78.

The problem was that we need to record the test_description somewhere for each test, and test_code() expects that you have already set that up somehow.

I had come up with a new with_description()-style helper that calls test_code() inside it, but that's literally what test_that() does with extra steps, so I simplified it back to test_that().
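For the record, the discarded helper would have looked something like this sketch (hypothetical; `local_description_set()` and the `test_code()` signature are assumed from the diff later in this thread and may not match the internals exactly):

```r
# Hypothetical, never-merged helper: set the active description, then call
# the lower-level test_code(). test_that() already does both of these, which
# is why this was simplified back to test_that().
with_description <- function(description, code, env = parent.frame()) {
  local_description_set(description)  # assumed internal helper
  test_code(description, substitute(code), env = env)  # assumed signature
}
```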
```r
# There is no `fail()` equivalent for an error.
# We could use `stop()`, but we want to pass through a `srcref`.
expectation(
  type = "error",
  message = exception_text,
  srcref = exception_srcref
)
```
Open to better ideas here. I'm also not entirely sure how to trigger this case.
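For comparison, a sketch of the `stop()` route the comment rules out; the point is that a plain error condition has no slot for the C++ `srcref`:

```r
# Rejected alternative: stop() signals a plain error condition, so the
# srcref pointing into src/<file>.cpp would be lost and the error would be
# attributed to the R code instead of the C++ test.
stop(exception_text, call. = FALSE)
```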
This was a useful Claude experiment for me because even though `the$test_expectations <- the$test_expectations + 1L` was very much the wrong solution for this, it very quickly helped me isolate what the actual problem was (that would have taken me an hour or more by myself).
And I also used Claude for the first draft of the pass() / fail() solution, which was pretty helpful even though it also wasn't great yet.
But overall it gave me momentum on a bug that I definitely would have just given up on otherwise, because testthat is just too unfamiliar to me.
I feel like there's an unexplored or unnamed usage of Claude: "just find the bug and explain it to me; I'll be the one to work on the solution." That's still quite useful in saving a bunch of upfront time.
Here's my whole Claude conversation
Details
╭─── Claude Code v2.1.14 ──────────────────────────────────────────────────────╮
│ │ Tips for getting started │
│ Welcome back! │ Run /init to create a CLAUDE.md file wit… │
│ │ ───────────────────────────────────────── │
│ ▐▛███▜▌ │ Recent activity │
│ ▝▜█████▛▘ │ No recent activity │
│ ▘▘ ▝▝ │ │
│ │ │
│ Opus 4.5 · API Usage Billing │ │
│ ~/files/r/packages/cpp11 │ │
╰──────────────────────────────────────────────────────────────────────────────╯
Welcome to Opus 4.5
❯ There seems to be a bug between cpp11 and testthat where a single test is
always reported as skipped.
If you run the cpp11test tests as outlined in CLAUDE.md, you should see a
single test that is skipped.
I believe this is a bug in testthat, possibly in
`testthat::run_cpp_tests`.
I have checked out a branch of testthat in
`/Users/davis/files/r/packages/testthat` for you to modify. You may work
in that directory, but not in any other one.
See if you can figure out what is wrong in testthat that causes a single
test to always be skipped. You can check your changes in testthat by
running the following:
```r
Rscript -e "
devtools::install('/Users/davis/files/r/packages/testthat')
devtools::clean_dll('./cpp11test')
devtools::test('./cpp11test')
"
```
⏺ I'll investigate this issue. Let me start by running the cpp11test tests to see the skipped test, and then explore the testthat code.
⏺ Bash(Rscript -e " timeout: 5m 0s
devtools::clean_dll('./cpp11test')…)
⎿ ℹ Testing cpp11test
SKIP: --------------------------------------------
Reason: empty test
… +2 lines (ctrl+o to expand)
⏺ I can see there's 1 test skipped with reason "empty test". Let me look at the testthat code, specifically run_cpp_tests.
⏺ Search(pattern: "run_cpp_tests", path: "~/files/r/packages/testthat")
⎿ Found 6 files (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 440 lines
⏺ Let me search for "empty test" in testthat to find where this skip message comes from.
⏺ Search(pattern: "empty test", path: "~/files/r/packages/testthat",
output_mode: "content")
⎿ Found 33 lines (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/skip.R)
⎿ Read 330 lines
⏺ Let me look at what context_start does and how tests are reported.
⏺ Search(pattern: "context_start", path: "~/files/r/packages/testthat")
⎿ Found 8 files (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/context.R)
⎿ Read 56 lines
⏺ Let me search for where skip_empty is called.
⏺ Search(pattern: "skip_empty", path: "~/files/r/packages/testthat",
output_mode: "content")
⎿ Found 5 lines (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/test-that.R)
⎿ Read 260 lines
⏺ Let me look at the test-cpp.R template file to understand how run_cpp_tests is called.
⏺ Read(~/files/r/packages/testthat/inst/resources/test-cpp.R)
⎿ Read 2 lines
⏺ Read(cpp11test/tests/testthat/test-cpp.R)
⎿ Read 2 lines
⏺ Let me look at how test files are run in testthat.
⏺ Search(pattern: "def test_file", path: "~/files/r/packages/testthat")
⎿ Found 0 files
⏺ Search(pattern: "test_file.*<-.*function", path:
"~/files/r/packages/testthat", glob: "*.R")
⎿ Found 2 files (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/test-files.R)
⎿ Read 476 lines
⏺ Let me look at the source_file function.
⏺ Search(pattern: "^source_file.*<-.*function", path:
"~/files/r/packages/testthat/R")
⎿ Found 1 file (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/source.R)
⎿ Read 165 lines
⏺ I found the issue. When source_file runs with wrap = TRUE (the default), it wraps the entire file content in test_code(). At the end of test_code(), it checks if any expectations were recorded by comparing the$test_expectations before and after. If no new expectations were recorded, it calls skip_empty().
The problem is that run_cpp_tests reports tests directly to the reporter via get_reporter()$add_result() without incrementing the$test_expectations. So even though 962 C++ tests passed, testthat thinks the file had no expectations and reports a skip.
Let me verify this theory by checking how the$test_expectations is accessed.
⏺ Search(pattern: "the\$test_expectations", path:
"~/files/r/packages/testthat/R", output_mode: "content")
⎿ Found 8 lines (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/testthat-package.R)
⎿ Read 41 lines
⏺ Now I understand the issue completely. Let me add the fix to run_cpp_tests. When it processes test results, it needs to increment the$test_expectations so that the wrapping test_code() knows that tests were actually run.
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 120 lines
⏺ Now I'll add the fix. The issue is that run_cpp_tests reports test results directly to the reporter without incrementing the$test_expectations. I need to increment this counter for each result added.
⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Added 1 line
      96    get_reporter()$start_test(context = context_name, test = test_name)
      97
      98    for (i in seq_len(successes)) {
      99 +    the$test_expectations <- the$test_expectations + 1L
     100      exp <- new_expectation("success", "")
     101      exp$test <- test_name
     102      get_reporter()$add_result(
⏺ Now let me add the same fix for failures and exceptions.
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 65 lines
⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Added 2 lines
     130        c(line, line, 1, 1)
     131      )
     132
     133 +    the$test_expectations <- the$test_expectations + 1L
     134      exp <- new_expectation("failure", org_text, srcref = failure_srcref)
     135      exp$test <- test_name
     136
    ...
     142
     143    exceptions <- xml2::xml_find_all(test, "./Exception")
     144    for (exception in exceptions) {
     145 +    the$test_expectations <- the$test_expectations + 1L
     146      exception_text <- xml2::xml_text(exception, trim = TRUE)
     147      filename <- xml2::xml_attr(exception, "filename")
     148      line <- xml2::xml_attr(exception, "line")
⏺ Let me also check the error handling case at the beginning of run_cpp_tests.
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 25 lines
⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Added 1 line
62 catch_error <- TRUE
63 reporter <- get_reporter()
64
65 + the$test_expectations <- the$test_expectations + 1L
66 context_start("Catch")
67 reporter$start_test(context = "Catch", test = "Catch")
68 reporter$add_result(
⏺ Now let me test the fix by reinstalling testthat and running the cpp11test tests.
⏺ Bash(Rscript -e " timeout: 5m
devtools::install('/Users/davis/files/r/packages/testthat 0s
')…)
⎿ rlang (1df507ece... -> 0fd422994...) [GitHub]
pkgload (6cc3650a0... -> [{"sha":"...) [GitHub]
Downloading GitHub repo r-lib/rlang@HEAD
… +166 lines (ctrl+o to expand)
⏺ The fix works. The output now shows:
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 962 ]
The skip count went from 1 to 0.
Root cause: When a test file is sourced, source_file() wraps the entire file content in test_code() (with wrap = TRUE). At the end of test_code(), it checks if any expectations were recorded by comparing the$test_expectations before and after execution. If no new expectations were recorded, it calls skip_empty().
The problem was that run_cpp_tests() reports test results directly to the reporter via get_reporter()$add_result() without incrementing the$test_expectations. So even though 962 C++ tests passed, testthat thought the file had no expectations and reported an "empty test" skip.
Fix: I added the$test_expectations <- the$test_expectations + 1L before each add_result() call in run_cpp_tests():
- Line 65: In the error handler when catching fails
- Line 99: For each success
- Line 133: For each failure
- Line 146: For each exception
✻ Baked for 3m 33s
❯ /export
⎿ Conversation exported to:
2026-01-21-there-seems-to-be-a-bug-between-cpp11-and-testthat.txt
❯ /cost
⎿ Total cost: $1.08
Total duration (API): 2m 50s
Total duration (wall): 18h 26m 50s
Total code changes: 4 lines added, 0 lines removed
Usage by model:
claude-haiku: 4.8k input, 153 output, 0 cache read, 0 cache write
($0.0056)
claude-opus-4-5: 210 input, 7.5k output, 1.0m cache read, 60.2k cache
write ($1.08)
❯ That seems like the right idea, but it seems like the real issue is that
`run_cpp_tests()` is very low level and is not emitting `pass()` or
`fail()` after processing each test, which means that the
`handle_expectation` handler in `test_code()` is not being emitted, and
that is what normally bumps `the$test_expectations`. Is there a way to
transition `run_cpp_tests()` to use `pass()` and `fail()` and friends
instead?
⏺ You're right, that's a cleaner approach. Let me look at how pass() and fail() work.
⏺ Search(pattern: "^(pass|fail)\s*<-\s*function", path:
"~/files/r/packages/testthat/R")
⎿ Found 1 file (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/expect-that.R)
⎿ Read 92 lines
⏺ Search(pattern: "^expectation\s*<-\s*function", path:
"~/files/r/packages/testthat/R")
⎿ Found 1 file (ctrl+o to expand)
⏺ Read(~/files/r/packages/testthat/R/expectation.R)
⎿ Read 252 lines
⏺ You're right. Let me trace through how succeed() and fail() work to see if we can use them properly.
The issue is that fail() calls exp_signal() which calls stop(exp) for failures - this would interrupt processing after the first failure. For C++ tests, we want to report all failures.
However, we could:
1. Use succeed() for successes (which signals without throwing)
2. For failures/errors, signal the expectation as a condition without throwing
Let me create a cleaner implementation:
⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Read 444 lines
⏺ Now let me rewrite run_cpp_tests to use proper expectation signaling. The key insight is:
1. succeed() signals a success condition - this works
2. For failures, we can't use fail() (it throws and stops processing). Instead, we signal the failure as a condition without throwing
3. We need to set up the$description properly so test_description() returns the right test name
⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
⎿ Added 78 lines, removed 76 lines
      59        )
      60      },
      61      error = function(e) {
      62 -      catch_error <- TRUE
      63 -      reporter <- get_reporter()
      62 +      catch_error <<- TRUE
      63
      64 -      the$test_expectations <- the$test_expectations + 1L
      64        context_start("Catch")
      65 -      reporter$start_test(context = "Catch", test = "Catch")
      66 -      reporter$add_result(
      67 -        context = "Catch",
      68 -        test = "Catch",
      69 -        result = new_expectation("failure", e$message)
      70 -      )
      71 -      reporter$end_test(context = "Catch", test = "Catch")
      65 +      process_cpp_test_section_error("Catch", e$message)
      66      }
      67    )
      68
    ...
      89
      90    tests <- xml2::xml_find_all(context, "./Section")
      91    for (test in tests) {
      92 -    test_name <- xml2::xml_attr(test, "name")
      92 +    process_cpp_test_section(test, context_name)
      93 +  }
      94 +}
      95 +
      96
      97 -    result <- xml2::xml_find_first(test, "./OverallResults")
      98 -    successes <- as.integer(xml2::xml_attr(result, "successes"))
      97 +# Helper to process a single C++ test section
      98 +# Using a separate function ensures proper scoping for local_description_set()
      99 +process_cpp_test_section <- function(test, context_name) {
     100 +  test_name <- xml2::xml_attr(test, "name")
     101
     102 -    get_reporter()$start_test(context = context_name, test = test_name)
     102 +  # Set up the description so test_description() returns the correct test name
     103 +  # when expectations are signaled and handled
     104
     105 -    for (i in seq_len(successes)) {
     106 -      the$test_expectations <- the$test_expectations + 1L
     107 -      exp <- new_expectation("success", "")
     108 -      exp$test <- test_name
     109 -      get_reporter()$add_result(
     110 -        context = context_name,
     111 -        test = test_name,
     112 -        result = exp
     113 -      )
     114 -    }
     105 +  local_description_set(test_name)
     106
     107 -    failures <- xml2::xml_find_all(test, "./Expression")
     108 -    for (failure in failures) {
     109 -      org <- xml2::xml_find_first(failure, "Original")
     110 -      org_text <- xml2::xml_text(org, trim = TRUE)
     107 +  result <- xml2::xml_find_first(test, "./OverallResults")
     108 +  successes <- as.integer(xml2::xml_attr(result, "successes"))
     109
     110 -      filename <- xml2::xml_attr(failure, "filename")
     111 -      type <- xml2::xml_attr(failure, "type")
     110 +  get_reporter()$start_test(context = context_name, test = test_name)
     111
     112 -      type_msg <- switch(
     113 -        type,
     114 -        "CATCH_CHECK_FALSE" = "isn't false.",
     115 -        "CATCH_CHECK_THROWS" = "did not throw an exception.",
     116 -        "CATCH_CHECK_THROWS_AS" = "threw an exception with unexpected type.",
     117 -        "isn't true."
     118 -      )
     112 +  for (i in seq_len(successes)) {
     113 +    succeed("")
     114 +  }
     115
     116 -      org_text <- paste(org_text, type_msg)
     116 +  failures <- xml2::xml_find_all(test, "./Expression")
     117 +  for (failure in failures) {
     118 +    org <- xml2::xml_find_first(failure, "Original")
     119 +    org_text <- xml2::xml_text(org, trim = TRUE)
     120
     121 -      line <- xml2::xml_attr(failure, "line")
     122 -      failure_srcref <- srcref(
     123 -        srcfile(file.path("src", filename)),
     124 -        c(line, line, 1, 1)
     125 -      )
     121 +    filename <- xml2::xml_attr(failure, "filename")
     122 +    type <- xml2::xml_attr(failure, "type")
     123
     124 -      the$test_expectations <- the$test_expectations + 1L
     125 -      exp <- new_expectation("failure", org_text, srcref = failure_srcref)
     126 -      exp$test <- test_name
     124 +    type_msg <- switch(
     125 +      type,
     126 +      "CATCH_CHECK_FALSE" = "isn't false.",
     127 +      "CATCH_CHECK_THROWS" = "did not throw an exception.",
     128 +      "CATCH_CHECK_THROWS_AS" = "threw an exception with unexpected type.",
     129 +      "isn't true."
     130 +    )
     131
     132 -      get_reporter()$add_result(
     133 -        context = context_name,
     134 -        test = test_name,
     135 -        result = exp
     136 -      )
     137 -    }
     132 +    org_text <- paste(org_text, type_msg)
     133
     134 -    exceptions <- xml2::xml_find_all(test, "./Exception")
     135 -    for (exception in exceptions) {
     136 -      the$test_expectations <- the$test_expectations + 1L
     137 -      exception_text <- xml2::xml_text(exception, trim = TRUE)
     138 -      filename <- xml2::xml_attr(exception, "filename")
     139 -      line <- xml2::xml_attr(exception, "line")
     134 +    line <- xml2::xml_attr(failure, "line")
     135 +    failure_srcref <- srcref(
     136 +      srcfile(file.path("src", filename)),
     137 +      c(line, line, 1, 1)
     138 +    )
     139
     140 -      exception_srcref <- srcref(
     141 -        srcfile(file.path("src", filename)),
     142 -        c(line, line, 1, 1)
     143 -      )
     140 +    # Signal the failure as a condition (not an error) so we can report
     141 +    # multiple failures without stopping. The expectation handler will
     142 +    # catch this and properly register it.
     143 +    exp <- new_expectation("failure", org_text, srcref = failure_srcref)
     144 +    signal_cpp_expectation(exp)
     145 +  }
     146
     147 -      exp <- new_expectation(
     148 -        "error",
     149 -        exception_text,
     150 -        srcref = exception_srcref
     151 -      )
     152 -      exp$test <- test_name
     147 +  exceptions <- xml2::xml_find_all(test, "./Exception")
     148 +  for (exception in exceptions) {
     149 +    exception_text <- xml2::xml_text(exception, trim = TRUE)
     150 +    filename <- xml2::xml_attr(exception, "filename")
     151 +    line <- xml2::xml_attr(exception, "line")
     152
     153 -      get_reporter()$add_result(
     154 -        context = context_name,
     155 -        test = test_name,
     156 -        result = exp
     157 -      )
     158 -    }
     153 +    exception_srcref <- srcref(
     154 +      srcfile(file.path("src", filename)),
     155 +      c(line, line, 1, 1)
     156 +    )
     157
     158 -    get_reporter()$end_test(context = context_name, test = test_name)
     159 -  }
     158 +    exp <- new_expectation("error", exception_text, srcref = exception_srcref)
     159 +    signal_cpp_expectation(exp)
     160    }
     161 +
     162 +  get_reporter()$end_test(context = context_name, test = test_name)
     163  }
     164
     165 +# Helper to process an error that occurred during C++ test execution
     166 +process_cpp_test_section_error <- function(test_name, message) {
     167 +  local_description_set(test_name)
     168 +
     169 +  get_reporter()$start_test(context = "Catch", test = test_name)
     170 +
     171 +  exp <- new_expectation("failure", message)
     172 +  signal_cpp_expectation(exp)
     173 +
     174 +  get_reporter()$end_test(context = "Catch", test = test_name)
     175 +}
     176 +
     177 +# Signal an expectation condition without throwing for failures/errors
     178 +# This allows C++ tests to report multiple failures without stopping
     179 +signal_cpp_expectation <- function(exp) {
     180 +  withRestarts(
     181 +    signalCondition(exp),
     182 +    muffle_expectation = function(e) NULL
     183 +  )
     184 +}
     185 +
     186  #' Use Catch for C++ unit testing
     187  #'
     188  #' Add the necessary infrastructure to enable C++ unit testing
⏺ Now let me test this cleaner implementation.
⏺ Bash(Rscript -e " timeout: 5m
devtools::install('/Users/davis/files/r/packages/testthat 0s
')…)
⎿ pkgload (6d29dc983... -> [{"sha":"...) [GitHub]
Downloading GitHub repo r-lib/pkgload@
… +105 lines (ctrl+o to expand)
⏺ The cleaner implementation works. Here's a summary of the changes:
Key changes:
1. succeed("") for successes - this signals an expectation condition that gets caught by handle_expectation in test_code(), which properly increments the$test_expectations and registers the result
2. signal_cpp_expectation() helper - for failures and errors, we can't use fail() because it throws (which would stop processing after the first failure). Instead, this helper signals the expectation as a condition using signalCondition() rather than stop(). This allows multiple failures to be reported.
3. local_description_set(test_name) in process_cpp_test_section() - ensures test_description() returns the correct test name when expectations are handled by register_expectation()
4. Factored out helper functions - process_cpp_test_section() and process_cpp_test_section_error() provide proper scoping for local_description_set() and make the code cleaner
The flow now is:
1. succeed("") / signal_cpp_expectation() signal condition
2. handle_expectation in test_code() catches it
3. Handler increments the$test_expectations and calls register_expectation()
4. register_expectation() uses test_description() (which reads the$description we set) and adds to reporter
✻ Crunched for 4m 14s
My two prompts were
❯ There seems to be a bug between cpp11 and testthat where a single test is
always reported as skipped.
If you run the cpp11test tests as outlined in CLAUDE.md, you should see a
single test that is skipped.
I believe this is a bug in testthat, possibly in
`testthat::run_cpp_tests`.
I have checked out a branch of testthat in
`/Users/davis/files/r/packages/testthat` for you to modify. You may work
in that directory, but not in any other one.
See if you can figure out what is wrong in testthat that causes a single
test to always be skipped. You can check your changes in testthat by
running the following:
```r
Rscript -e "
devtools::install('/Users/davis/files/r/packages/testthat')
devtools::clean_dll('./cpp11test')
devtools::test('./cpp11test')
"
```
❯ That seems like the right idea, but it seems like the real issue is that
`run_cpp_tests()` is very low level
and is not emitting `pass()` or `fail()` after processing each test,
which means that the `handle_expectation`
handler in `test_code()` is not being emitted, and that is what
normally bumps `the$test_expectations`. Is
there a way to transition `run_cpp_tests()` to use `pass()` and
`fail()` and friends instead?
…escription()` correct so we don't report "code run outside of test_that"
Since we are now inside a `test_that()`, which has its own local `test_code()`, where each of those has its own `tryCatch()`
Force-pushed from 9a5d51f to 18d248c
R/test-compiled-code.R (Outdated)

```r
test_description <- xml2::xml_attr(test, "name")

test_that(test_description, {
  result <- xml2::xml_find_first(test, "./OverallResults")
```
Probably out of scope for this PR, but it would probably be better to do `result <- xml2::xml_find_first(test, "./OverallResults|./Expression|./Exception")` in order to preserve the original ordering of successes, failures, and errors (sketched below).
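If that suggestion were pursued, the shape might be something like this (a sketch; `fail_from_expression()` and `error_from_exception()` are hypothetical helpers):

```r
# xml_find_all() returns matches in document order, so a single pass over
# the union would interleave failures and errors as Catch emitted them.
nodes <- xml2::xml_find_all(test, "./OverallResults|./Expression|./Exception")
for (node in nodes) {
  switch(
    xml2::xml_name(node),
    Expression = fail_from_expression(node),  # hypothetical helper
    Exception = error_from_exception(node),   # hypothetical helper
    OverallResults = NULL                     # only totals, no positions
  )
}
```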
I've determined that I don't think that will actually help that much.

OverallResults reports something like

```xml
<OverallResults successes="5" failures="1" expectedFailures="0"/>
```

from which we extract 5 to call pass() 5 times, but we don't know where those successes fall relative to the 1 failure.

A single test_that block on the cpp11 side might look like

```cpp
test_that("data_frame works") {
  auto getExportedValue = cpp11::package("base")["getExportedValue"];
  auto mtcars = getExportedValue("datasets", "mtcars");
  cpp11::data_frame mtcars_df(mtcars);

  expect_true(mtcars_df.nrow() == 31);
  expect_true(mtcars_df.ncol() == 11);

  cpp11::strings names(mtcars_df.names());
  expect_true(names[0] == "mpg");
  expect_true(names[7] == "vs");

  auto iris = getExportedValue("datasets", "iris");
  cpp11::data_frame iris_df(iris);
  expect_true(iris_df.nrow() == 150);
  expect_true(iris_df.ncol() == 5);
}
```

but Catch just tells us this:

```
{xml_node}
<Section name="data_frame works" filename="test-data_frame.cpp" line="9">
[1] <Expression success="false" type="CATCH_CHECK" filename="test-data_frame.cpp" line="14">\n <Original>\n mtcars_df.nrow() == 31\n </Original>\n <Expanded>\n 32 == 31\n </Ex ...
[2] <OverallResults successes="5" failures="1" expectedFailures="0"/>
```

i.e. we see 5 successes and 1 failure, and we can get info about that failure from the Expression, but the successes aren't listed interleaved with that failure, so we can't call pass() and fail() in the right order.
I'm also not sure it matters much. Whether we call pass() and fail() in the right order doesn't seem that important, right? Because pass() never emits anything to the screen.
The one case that might be kinda important is emitting fail() vs expectation("error") in the right order, but that feels rare enough that I'm ok with the current setup.
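Concretely, the totals are all the XML hands us; only failures and errors get nodes of their own (the extraction below matches the diff earlier in this thread):

```r
# Successes are just a count on OverallResults, with no nodes of their own,
# so their positions relative to the failures are unrecoverable.
result <- xml2::xml_find_first(test, "./OverallResults")
successes <- as.integer(xml2::xml_attr(result, "successes"))
failures <- xml2::xml_find_all(test, "./Expression")  # these do have nodes
```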
Ok, makes sense.
```r
for (test in context$tests) {
  test_that(test$name, {
    for (i in seq_len(test$n_successes)) {
      pass()
```
This now makes it very clear why order doesn't matter.


Currently there is an issue where if you have a test file that only calls `testthat::run_cpp_tests()` and has no other tests in it, then you'll get "skipped 1 empty test" every time. You can force this to appear in testthat itself by commenting out all the other tests in the test file that calls `run_cpp_tests()` for testthat.

The main issue is that `run_cpp_tests()` is super low level, and is trying to fully handle the recording of each test result to the reporter. For example, here is how we reported successes:

testthat/R/test-compiled-code.R Lines 96 to 106 in 34d9ab9

I was expecting to see something like `pass()` in here, and the `get_reporter()$start_test` call is also weird to me. It's very low level. It doesn't actually signal any `expectation()` conditions, so we never hit any `handle_*()` paths like this one:

testthat/R/test-that.R Lines 139 to 143 in 34d9ab9

Note this line

This never gets run with `run_cpp_tests()`, so this eventually ends up calling `skip_empty()` here (sketched at the bottom of this description):

testthat/R/test-that.R Lines 194 to 200 in 34d9ab9

And that's why we are seeing a single skipped test. To testthat it doesn't look like we ran any!

I tried diagnosing this with Claude, and it got me this far ^, but its solution was "hey, you should just add `the$test_expectations <- the$test_expectations + 1L` all over the place in `run_cpp_tests()`". I was not satisfied with that solution, as it would be nice if only `test_code()` was in charge of touching that global.

Instead, I went through a few rounds of refactoring (starting with Claude, but I eventually took over) to recognize that we can avoid pretty much all testthat internals in `run_cpp_tests()`. Instead we now:

- call `context_start()` (this is a testthat internal function, but is needed here)
- for each C++ `test_that()` macro in that context, set up an R level `test_that()` block
  - the `description` comes over from the C++ `test_that()` macro
  - the `expr` is a series of `pass()`, `fail()`, or `expectation("error")` calls, depending on the test results for that test

This avoids pretty much all of the testthat internals of having to mess with a reporter directly! We just go through `test_code()` directly, so it gets to handle `the$test_expectations` and hit all the `handle_*()` functions.

If you just read `run_cpp_tests()` from top to bottom now, I think you'll probably just nod along like "yeah, that makes sense", compared to the internal manipulation that was occurring before.