Skip to content

Conversation

@DavisVaughan
Copy link
Member

Currently there is an issue where if you have a test file that only calls testthat::run_cpp_tests() and has no other tests in it, then you'll get "skipped 1 empty test" every time

You can force this to appear in testthat itself by commenting out all the other tests in the test file that calls run_cpp_tests() for testthat

Screenshot 2026-01-21 at 12 08 29 PM

The main issue is that run_cpp_tests() is super low level, and is trying to fully handle the recording of each test result to the reporter. For example, here is how we reported successes:

get_reporter()$start_test(context = context_name, test = test_name)
for (i in seq_len(successes)) {
exp <- new_expectation("success", "")
exp$test <- test_name
get_reporter()$add_result(
context = context_name,
test = test_name,
result = exp
)
}

I was expecting to see something like pass() in here, and the get_reporter()$start_test call is also weird to me. It's very low level.

It doesn't actually signal any expectations() conditions, so we never hit any handle_*() paths like this one

testthat/R/test-that.R

Lines 139 to 143 in 34d9ab9

handle_expectation <- function(e) {
the$test_expectations <- the$test_expectations + 1L
register_expectation(e, 11)
invokeRestart("muffle_expectation")
}

Note this line

the$test_expectations <- the$test_expectations + 1L

This never gets run with run_cpp_tests(), so this eventually ends up calling skip_empty() here

testthat/R/test-that.R

Lines 194 to 200 in 34d9ab9

new_expectations <- the$test_expectations > starting_expectations
if (snapshot_skipped) {
skip("On CRAN")
} else if (!new_expectations && skip_on_empty) {
skip_empty()
}
},

And that's why we are seeing a single skipped test. To testthat it doesn't look like we ran any!


I tried diagnosing this with Claude, and it got me this far ^, but its solution was "hey you should just add the$test_expectations <- the$test_expectations + 1L all over the place in run_cpp_tests()"

I was not satisfied with that solution, as it would be nice if only test_code() was in charge of touching that global.

Instead, I went through a few rounds of refactoring (starting with Claude, but I eventually took over) to recognize that we can avoid pretty much all testthat internals in run_cpp_tests(), instead we now:

  • Set up a per C++ file context with context_start() (this is a testthat internal function, but is needed here)
  • For each individual test_that() macro in that context, set up an R level test_that() block
    • The description comes over from the C++ test_that() macro
    • The expr is a series of pass(), fail() or expectation("error") calls, depending on the test results for that test

This avoids pretty much all of the testthat internals of having to mess with a reporter directly! We just go through test_code() directly, so it gets to handle the$test_expectations and hit all the handle_*() functions.

If you just read run_cpp_tests() from top to bottom now, I think you'll probably just nod along like "yea that makes sense", compared to the internal manipulation that was occurring before.

)
},
error = function(e) {
catch_error <- TRUE
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the <- here? That was meant to be <<-. We were meant to early exit when we detect this "something is horribly wrong" case.

Before:

Image

After:

Image


context_start(context_name)

tests <- xml2::xml_find_all(context, "./Section")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the xml2 stuff in this file remains untouched. It's just the setup and reporting of pass/fail that has changed.

Comment on lines 85 to 88
for (test in tests) {
test_description <- xml2::xml_attr(test, "name")

test_that(test_description, {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"For each test in a context, set up an R level test_that() block"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should consider if you should use test_code() here instead, since it's a little lower level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that at first, see 99dde78

The problem was that we need to record the test_description somewhere for each test, and test_code() expects that you have already set that up somehow.

I had come up with some solution of a new with_description() style helper that calls test_code() inside it, but that's literally exactly what test_that() does with extra steps, so I simplified it back to test_that()

Comment on lines 133 to 139
# There is no `fail()` equivalent for an error.
# We could use `stop()`, but we want to pass through a `srcref`.
expectation(
type = "error",
message = exception_text,
srcref = exception_srcref
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open to better ideas here. I'm also not entirely sure how to trigger this case.

Copy link
Member Author

@DavisVaughan DavisVaughan Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a useful claude experiment to me because even though

the$test_expectations <- the$test_expectations + 1L

was very much the wrong solution for this, that very quickly helped me isolate what the actual problem was (that would have taken me an hour by myself, or more, probably)


And I also used Claude for the first draft of the pass() / fail() solution, which was pretty helpful even though it also wasn't great yet.

But overall it gave me momentum on a bug that I definitely would have just given up on otherwise, because testthat is just too unfamiliar to me.

I feel like there's this unexplored or unnamed usage of Claude of "just find the bug and explain it to me, I'll be the one to work on the solution". Which is still quite useful in saving a bunch of upfront time.

Here's my whole Claude conversation

Details
╭─── Claude Code v2.1.14 ──────────────────────────────────────────────────────╮
│                                  │ Tips for getting started                  │
│           Welcome back!          │ Run /init to create a CLAUDE.md file wit… │
│                                  │ ───────────────────────────────────────── │
│              ▐▛███▜▌             │ Recent activity                           │
│             ▝▜█████▛▘            │ No recent activity                        │
│               ▘▘ ▝▝              │                                           │
│                                  │                                           │
│   Opus 4.5 · API Usage Billing   │                                           │
│     ~/files/r/packages/cpp11     │                                           │
╰──────────────────────────────────────────────────────────────────────────────╯

  Welcome to Opus 4.5

❯ There seems to be a bug between cpp11 and testthat where a single test is 
  always reported as skipped.                                               
                                                                            
  If you run the cpp11test tests as outlined in CLAUDE.md, you should see a 
  single test that is skipped.                                              
                                                                            
  I believe this is a bug in testthat, possibly in                          
  `testthat::run_cpp_tests`.                                                
                                                                            
  I have checked out a branch of testthat in                                
  `/Users/davis/files/r/packages/testthat` for you to modify. You may work  
  in that directory, but not in any other one.                              
                                                                            
  See if you can figure out what is wrong in testthat that causes a single  
  test to always be skipped. You can check your changes in testthat by      
  running the following:                                                    
                                                                            
  ```r                                                                      
  Rscript -e "                                                              
  devtools::install('/Users/davis/files/r/packages/testthat')               
  devtools::clean_dll('./cpp11test')                                        
  devtools::test('./cpp11test')                                             
  "                                                                         
  ```                                                                       

⏺ I'll investigate this issue. Let me start by running the cpp11test tests to see the skipped test, and then explore the testthat code.

⏺ Bash(Rscript -e "                          timeout: 5m 0s
        devtools::clean_dll('./cpp11test')…)
  ⎿  ℹ Testing cpp11test                                
     SKIP:  --------------------------------------------
     Reason: empty test
     … +2 lines (ctrl+o to expand)

⏺ I can see there's 1 test skipped with reason "empty test". Let me look at the testthat code, specifically run_cpp_tests.

⏺ Search(pattern: "run_cpp_tests", path: "~/files/r/packages/testthat")
  ⎿  Found 6 files (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Read 440 lines

⏺ Let me search for "empty test" in testthat to find where this skip message comes from.

⏺ Search(pattern: "empty test", path: "~/files/r/packages/testthat",
        output_mode: "content")
  ⎿  Found 33 lines (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/skip.R)
  ⎿  Read 330 lines

⏺ Let me look at what context_start does and how tests are reported.

⏺ Search(pattern: "context_start", path: "~/files/r/packages/testthat")
  ⎿  Found 8 files (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/context.R)
  ⎿  Read 56 lines

⏺ Let me search for where skip_empty is called.

⏺ Search(pattern: "skip_empty", path: "~/files/r/packages/testthat",
        output_mode: "content")
  ⎿  Found 5 lines (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/test-that.R)
  ⎿  Read 260 lines

⏺ Let me look at the test-cpp.R template file to understand how run_cpp_tests is called.

⏺ Read(~/files/r/packages/testthat/inst/resources/test-cpp.R)
  ⎿  Read 2 lines

⏺ Read(cpp11test/tests/testthat/test-cpp.R)
  ⎿  Read 2 lines

⏺ Let me look at how test files are run in testthat.

⏺ Search(pattern: "def test_file", path: "~/files/r/packages/testthat")
  ⎿  Found 0 files 

⏺ Search(pattern: "test_file.*<-.*function", path:
        "~/files/r/packages/testthat", glob: "*.R")
  ⎿  Found 2 files (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/test-files.R)
  ⎿  Read 476 lines

⏺ Let me look at the source_file function.

⏺ Search(pattern: "^source_file.*<-.*function", path:
        "~/files/r/packages/testthat/R")
  ⎿  Found 1 file (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/source.R)
  ⎿  Read 165 lines

⏺ I found the issue. When source_file runs with wrap = TRUE (the default), it wraps the entire file content in test_code(). At the end of test_code(), it checks if any expectations were recorded by comparing the$test_expectations before and after. If no new expectations were recorded, it calls skip_empty().

  The problem is that run_cpp_tests reports tests directly to the reporter via get_reporter()$add_result() without incrementing the$test_expectations. So even though 962 C++ tests passed, testthat thinks the file had no expectations and reports a skip.

  Let me verify this theory by checking how the$test_expectations is accessed.

⏺ Search(pattern: "the\$test_expectations", path:
        "~/files/r/packages/testthat/R", output_mode: "content")
  ⎿  Found 8 lines (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/testthat-package.R)
  ⎿  Read 41 lines

⏺ Now I understand the issue completely. Let me add the fix to run_cpp_tests. When it processes test results, it needs to increment the$test_expectations so that the wrapping test_code() knows that tests were actually run.

⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Read 120 lines

⏺ Now I'll add the fix. The issue is that run_cpp_tests reports test results directly to the reporter without incrementing the$test_expectations. I need to increment this counter for each result added.

⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Added 1 line                                                        
       96        get_reporter()$start_test(context = context_name, test  
           = test_name)                                                  
       97                                                                
       98        for (i in seq_len(successes)) {                         
       99 +        the$test_expectations <- the$test_expectations + 1L   
      100          exp <- new_expectation("success", "")                 
      101          exp$test <- test_name                                 
      102          get_reporter()$add_result(                            

⏺ Now let me add the same fix for failures and exceptions.

⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Read 65 lines

⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Added 2 lines                                                       
      130            c(line, line, 1, 1)                                 
      131          )                                                     
      132                                                                
      133 +        the$test_expectations <- the$test_expectations + 1L   
      134          exp <- new_expectation("failure", org_text, srcref =  
           failure_srcref)                                               
      135          exp$test <- test_name                                 
      136                                                                
     ...
      142                                                                
      143        exceptions <- xml2::xml_find_all(test, "./Exception")   
      144        for (exception in exceptions) {                         
      145 +        the$test_expectations <- the$test_expectations + 1L   
      146          exception_text <- xml2::xml_text(exception, trim =    
           TRUE)                                                         
      147          filename <- xml2::xml_attr(exception, "filename")     
      148          line <- xml2::xml_attr(exception, "line")             

⏺ Let me also check the error handling case at the beginning of run_cpp_tests.

⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Read 25 lines

⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Added 1 line                                                        
      62        catch_error <- TRUE                                      
      63        reporter <- get_reporter()                               
      64                                                                 
      65 +      the$test_expectations <- the$test_expectations + 1L      
      66        context_start("Catch")                                   
      67        reporter$start_test(context = "Catch", test = "Catch")   
      68        reporter$add_result(                                     

⏺ Now let me test the fix by reinstalling testthat and running the cpp11test tests.

⏺ Bash(Rscript -e "                                                timeout: 5m 
        devtools::install('/Users/davis/files/r/packages/testthat  0s
      ')…)
  ⎿  rlang   (1df507ece... -> 0fd422994...) [GitHub]
     pkgload (6cc3650a0... -> [{"sha":"...) [GitHub]
     Downloading GitHub repo r-lib/rlang@HEAD
     … +166 lines (ctrl+o to expand)

⏺ The fix works. The output now shows:

  [ FAIL 0 | WARN 0 | SKIP 0 | PASS 962 ]

  The skip count went from 1 to 0.

  Root cause: When a test file is sourced, source_file() wraps the entire file content in test_code() (with wrap = TRUE). At the end of test_code(), it checks if any expectations were recorded by comparing the$test_expectations before and after execution. If no new expectations were recorded, it calls skip_empty().

  The problem was that run_cpp_tests() reports test results directly to the reporter via get_reporter()$add_result() without incrementing the$test_expectations. So even though 962 C++ tests passed, testthat thought the file had no expectations and reported an "empty test" skip.

  Fix: I added the$test_expectations <- the$test_expectations + 1L before each add_result() call in run_cpp_tests():
  - Line 65: In the error handler when catching fails
  - Line 99: For each success
  - Line 133: For each failure
  - Line 146: For each exception

✻ Baked for 3m 33s

❯ /export 
  ⎿  Conversation exported to:
     2026-01-21-there-seems-to-be-a-bug-between-cpp11-and-testthat.txt

❯ /cost 
  ⎿  Total cost:            $1.08
     Total duration (API):  2m 50s
     Total duration (wall): 18h 26m 50s
     Total code changes:    4 lines added, 0 lines removed
     Usage by model:
             claude-haiku:  4.8k input, 153 output, 0 cache read, 0 cache write 
     ($0.0056)
          claude-opus-4-5:  210 input, 7.5k output, 1.0m cache read, 60.2k cache 
     write ($1.08)


❯ That seems like the right idea, but it seems like the real issue is that  
  `run_cpp_tests()` is very low level and is not emitting `pass()` or       
  `fail()` after processing each test, which means that the                 
  `handle_expectation` handler in `test_code()` is not being emitted, and   
  that is what normally bumps `the$test_expectations`. Is there a way to    
  transition `run_cpp_tests()` to use `pass()` and `fail()` and friends     
  instead?                                                                  

❯ That seems like the right idea, but it seems like the real issue is that  
  `run_cpp_tests()` is very low level                                       
  and is not emitting `pass()` or `fail()` after processing each test,      
  which means that the `handle_expectation`                                 
  handler in `test_code()` is not being emitted, and that is what           
  normally bumps `the$test_expectations`. Is                                
  there a way to transition `run_cpp_tests()` to use `pass()` and           
  `fail()` and friends instead?                                             

⏺ You're right, that's a cleaner approach. Let me look at how pass() and fail() work.

⏺ Search(pattern: "^(pass|fail)\s*<-\s*function", path:
        "~/files/r/packages/testthat/R")
  ⎿  Found 1 file (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/expect-that.R)
  ⎿  Read 92 lines

⏺ Search(pattern: "^expectation\s*<-\s*function", path:
        "~/files/r/packages/testthat/R")
  ⎿  Found 1 file (ctrl+o to expand)

⏺ Read(~/files/r/packages/testthat/R/expectation.R)
  ⎿  Read 252 lines

⏺ You're right. Let me trace through how succeed() and fail() work to see if we can use them properly.

  The issue is that fail() calls exp_signal() which calls stop(exp) for failures - this would interrupt processing after the first failure. For C++ tests, we want to report all failures.

  However, we could:
  1. Use succeed() for successes (which signals without throwing)
  2. For failures/errors, signal the expectation as a condition without throwing

  Let me create a cleaner implementation:

⏺ Read(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Read 444 lines

⏺ Now let me rewrite run_cpp_tests to use proper expectation signaling. The key insight is:

  1. succeed() signals a success condition - this works
  2. For failures, we can't use fail() (it throws and stops processing). Instead, we signal the failure as a condition without throwing
  3. We need to set up the$description properly so test_description() returns the right test name

⏺ Update(~/files/r/packages/testthat/R/test-compiled-code.R)
  ⎿  Added 78 lines, removed 76 lines                                    
      59        )                                                        
      60      },                                                         
      61      error = function(e) {                                      
      62 -      catch_error <- TRUE                                      
      63 -      reporter <- get_reporter()                               
      62 +      catch_error <<- TRUE                                     
      63                                                                 
      64 -      the$test_expectations <- the$test_expectations + 1L      
      64        context_start("Catch")                                   
      65 -      reporter$start_test(context = "Catch", test = "Catch")   
      66 -      reporter$add_result(                                     
      67 -        context = "Catch",                                     
      68 -        test = "Catch",                                        
      69 -        result = new_expectation("failure", e$message)         
      70 -      )                                                        
      71 -      reporter$end_test(context = "Catch", test = "Catch")     
      65 +      process_cpp_test_section_error("Catch", e$message)       
      66      }                                                          
      67    )                                                            
      68                                                                 
     ...
       89                                                                
       90      tests <- xml2::xml_find_all(context, "./Section")         
       91      for (test in tests) {                                     
       92 -      test_name <- xml2::xml_attr(test, "name")               
       92 +      process_cpp_test_section(test, context_name)            
       93 +    }                                                         
       94 +  }                                                           
       95 +}                                                             
       96                                                                
       97 -      result <- xml2::xml_find_first(test,                    
          -"./OverallResults")                                           
       98 -      successes <- as.integer(xml2::xml_attr(result,          
          -"successes"))                                                 
       97 +# Helper to process a single C++ test section                 
       98 +# Using a separate function ensures proper scoping for        
          +local_description_set()                                       
       99 +process_cpp_test_section <- function(test, context_name) {    
      100 +  test_name <- xml2::xml_attr(test, "name")                   
      101                                                                
      102 -      get_reporter()$start_test(context = context_name, test  
          -= test_name)                                                  
      102 +  # Set up the description so test_description() returns the  
          +correct test name                                             
      103 +  # when expectations are signaled and handled                
      104                                                                
      105 -      for (i in seq_len(successes)) {                         
      106 -        the$test_expectations <- the$test_expectations + 1L   
      107 -        exp <- new_expectation("success", "")                 
      108 -        exp$test <- test_name                                 
      109 -        get_reporter()$add_result(                            
      110 -          context = context_name,                             
      111 -          test = test_name,                                   
      112 -          result = exp                                        
      113 -        )                                                     
      114 -      }                                                       
      105 +  local_description_set(test_name)                            
      106                                                                
      107 -      failures <- xml2::xml_find_all(test, "./Expression")    
      108 -      for (failure in failures) {                             
      109 -        org <- xml2::xml_find_first(failure, "Original")      
      110 -        org_text <- xml2::xml_text(org, trim = TRUE)          
      107 +  result <- xml2::xml_find_first(test, "./OverallResults")    
      108 +  successes <- as.integer(xml2::xml_attr(result,              
          +"successes"))                                                 
      109                                                                
      110 -        filename <- xml2::xml_attr(failure, "filename")       
      111 -        type <- xml2::xml_attr(failure, "type")               
      110 +  get_reporter()$start_test(context = context_name, test =    
          +test_name)                                                    
      111                                                                
      112 -        type_msg <- switch(                                   
      113 -          type,                                               
      114 -          "CATCH_CHECK_FALSE" = "isn't false.",               
      115 -          "CATCH_CHECK_THROWS" = "did not throw an            
          -exception.",                                                  
      116 -          "CATCH_CHECK_THROWS_AS" = "threw an exception with  
          -unexpected type.",                                            
      117 -          "isn't true."                                       
      118 -        )                                                     
      112 +  for (i in seq_len(successes)) {                             
      113 +    succeed("")                                               
      114 +  }                                                           
      115                                                                
      116 -        org_text <- paste(org_text, type_msg)                 
      116 +  failures <- xml2::xml_find_all(test, "./Expression")        
      117 +  for (failure in failures) {                                 
      118 +    org <- xml2::xml_find_first(failure, "Original")          
      119 +    org_text <- xml2::xml_text(org, trim = TRUE)              
      120                                                                
      121 -        line <- xml2::xml_attr(failure, "line")               
      122 -        failure_srcref <- srcref(                             
      123 -          srcfile(file.path("src", filename)),                
      124 -          c(line, line, 1, 1)                                 
      125 -        )                                                     
      121 +    filename <- xml2::xml_attr(failure, "filename")           
      122 +    type <- xml2::xml_attr(failure, "type")                   
      123                                                                
      124 -        the$test_expectations <- the$test_expectations + 1L   
      125 -        exp <- new_expectation("failure", org_text, srcref =  
          -failure_srcref)                                               
      126 -        exp$test <- test_name                                 
      124 +    type_msg <- switch(                                       
      125 +      type,                                                   
      126 +      "CATCH_CHECK_FALSE" = "isn't false.",                   
      127 +      "CATCH_CHECK_THROWS" = "did not throw an exception.",   
      128 +      "CATCH_CHECK_THROWS_AS" = "threw an exception with      
          +unexpected type.",                                            
      129 +      "isn't true."                                           
      130 +    )                                                         
      131                                                                
      132 -        get_reporter()$add_result(                            
      133 -          context = context_name,                             
      134 -          test = test_name,                                   
      135 -          result = exp                                        
      136 -        )                                                     
      137 -      }                                                       
      132 +    org_text <- paste(org_text, type_msg)                     
      133                                                                
      134 -      exceptions <- xml2::xml_find_all(test, "./Exception")   
      135 -      for (exception in exceptions) {                         
      136 -        the$test_expectations <- the$test_expectations + 1L   
      137 -        exception_text <- xml2::xml_text(exception, trim =    
          -TRUE)                                                         
      138 -        filename <- xml2::xml_attr(exception, "filename")     
      139 -        line <- xml2::xml_attr(exception, "line")             
      134 +    line <- xml2::xml_attr(failure, "line")                   
      135 +    failure_srcref <- srcref(                                 
      136 +      srcfile(file.path("src", filename)),                    
      137 +      c(line, line, 1, 1)                                     
      138 +    )                                                         
      139                                                                
      140 -        exception_srcref <- srcref(                           
      141 -          srcfile(file.path("src", filename)),                
      142 -          c(line, line, 1, 1)                                 
      143 -        )                                                     
      140 +    # Signal the failure as a condition (not an error) so we  
          +can report                                                    
      141 +    # multiple failures without stopping. The expectation     
          +handler will                                                  
      142 +    # catch this and properly register it.                    
      143 +    exp <- new_expectation("failure", org_text, srcref =      
          +failure_srcref)                                               
      144 +    signal_cpp_expectation(exp)                               
      145 +  }                                                           
      146                                                                
      147 -        exp <- new_expectation(                               
      148 -          "error",                                            
      149 -          exception_text,                                     
      150 -          srcref = exception_srcref                           
      151 -        )                                                     
      152 -        exp$test <- test_name                                 
      147 +  exceptions <- xml2::xml_find_all(test, "./Exception")       
      148 +  for (exception in exceptions) {                             
      149 +    exception_text <- xml2::xml_text(exception, trim = TRUE)  
      150 +    filename <- xml2::xml_attr(exception, "filename")         
      151 +    line <- xml2::xml_attr(exception, "line")                 
      152                                                                
      153 -        get_reporter()$add_result(                            
      154 -          context = context_name,                             
      155 -          test = test_name,                                   
      156 -          result = exp                                        
      157 -        )                                                     
      158 -      }                                                       
      153 +    exception_srcref <- srcref(                               
      154 +      srcfile(file.path("src", filename)),                    
      155 +      c(line, line, 1, 1)                                     
      156 +    )                                                         
      157                                                                
      158 -      get_reporter()$end_test(context = context_name, test =  
          -test_name)                                                    
      159 -    }                                                         
      158 +    exp <- new_expectation("error", exception_text, srcref =  
          +exception_srcref)                                             
      159 +    signal_cpp_expectation(exp)                               
      160    }                                                           
      161 +                                                              
      162 +  get_reporter()$end_test(context = context_name, test =      
          +test_name)                                                    
      163  }                                                             
      164                                                                
      165 +# Helper to process an error that occurred during C++ test    
          +execution                                                     
      166 +process_cpp_test_section_error <- function(test_name,         
          +message) {                                                    
      167 +  local_description_set(test_name)                            
      168 +                                                              
      169 +  get_reporter()$start_test(context = "Catch", test =         
          +test_name)                                                    
      170 +                                                              
      171 +  exp <- new_expectation("failure", message)                  
      172 +  signal_cpp_expectation(exp)                                 
      173 +                                                              
      174 +  get_reporter()$end_test(context = "Catch", test =           
          +test_name)                                                    
      175 +}                                                             
      176 +                                                              
      177 +# Signal an expectation condition without throwing for        
          +failures/errors                                               
      178 +# This allows C++ tests to report multiple failures without   
          +stopping                                                      
      179 +signal_cpp_expectation <- function(exp) {                     
      180 +  withRestarts(                                               
      181 +    signalCondition(exp),                                     
      182 +    muffle_expectation = function(e) NULL                     
      183 +  )                                                           
      184 +}                                                             
      185 +                                                              
      186  #' Use Catch for C++ unit testing                             
      187  #'                                                            
      188  #' Add the necessary infrastructure to enable C++ unit        
           testing                                                       

⏺ Now let me test this cleaner implementation.

⏺ Bash(Rscript -e "                                                timeout: 5m 
        devtools::install('/Users/davis/files/r/packages/testthat  0s
      ')…)
  ⎿  pkgload (6d29dc983... -> [{"sha":"...) [GitHub]
     Downloading GitHub repo r-lib/pkgload@
     … +105 lines (ctrl+o to expand)

⏺ The cleaner implementation works. Here's a summary of the changes:

  Key changes:

  1. succeed("") for successes - this signals an expectation condition that gets caught by handle_expectation in test_code(), which properly increments the$test_expectations and registers the result
  2. signal_cpp_expectation() helper - for failures and errors, we can't use fail() because it throws (which would stop processing after the first failure). Instead, this helper signals the expectation as a condition using signalCondition() rather than stop(). This allows multiple failures to be reported.
  3. local_description_set(test_name) in process_cpp_test_section() - ensures test_description() returns the correct test name when expectations are handled by register_expectation()
  4. Factored out helper functions - process_cpp_test_section() and process_cpp_test_section_error() provide proper scoping for local_description_set() and make the code cleaner

  The flow now is:
  1. succeed("") / signal_cpp_expectation() signal condition
  2. handle_expectation in test_code() catches it
  3. Handler increments the$test_expectations and calls register_expectation()
  4. register_expectation() uses test_description() (which reads the$description we set) and adds to reporter

✻ Crunched for 4m 14s

My two prompts were

❯ There seems to be a bug between cpp11 and testthat where a single test is 
  always reported as skipped.                                               
                                                                            
  If you run the cpp11test tests as outlined in CLAUDE.md, you should see a 
  single test that is skipped.                                              
                                                                            
  I believe this is a bug in testthat, possibly in                          
  `testthat::run_cpp_tests`.                                                
                                                                            
  I have checked out a branch of testthat in                                
  `/Users/davis/files/r/packages/testthat` for you to modify. You may work  
  in that directory, but not in any other one.                              
                                                                            
  See if you can figure out what is wrong in testthat that causes a single  
  test to always be skipped. You can check your changes in testthat by      
  running the following:                                                    
                                                                            
  ```r                                                                      
  Rscript -e "                                                              
  devtools::install('/Users/davis/files/r/packages/testthat')               
  devtools::clean_dll('./cpp11test')                                        
  devtools::test('./cpp11test')                                             
  "                                                                         
  ```   

❯ That seems like the right idea, but it seems like the real issue is that  
  `run_cpp_tests()` is very low level                                       
  and is not emitting `pass()` or `fail()` after processing each test,      
  which means that the `handle_expectation`                                 
  handler in `test_code()` is not being emitted, and that is what           
  normally bumps `the$test_expectations`. Is                                
  there a way to transition `run_cpp_tests()` to use `pass()` and           
  `fail()` and friends instead?       

@DavisVaughan DavisVaughan force-pushed the fix/testthat-cpp-skip branch from 9a5d51f to 18d248c Compare January 21, 2026 17:47
Comment on lines 85 to 88
for (test in tests) {
test_description <- xml2::xml_attr(test, "name")

test_that(test_description, {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should consider if you should use test_code() here instead, since it's a little lower level.

test_description <- xml2::xml_attr(test, "name")

test_that(test_description, {
result <- xml2::xml_find_first(test, "./OverallResults")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably out of scope for this PR, but it would probably be better to do:

result <- xml2::xml_find_first(test, "./OverallResults|./Expression|./Exception")

in order preserve the original ordering of successes, failures, and errors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've determined that I don't think that will actually help that much

OverallResults reports something like

<OverallResults successes="5" failures="1" expectedFailures="0">

from which we extract 5 to call pass() 5 times

but we don't know where the successes are in relation to the 1 failure

a single test_that block on the cpp11 side might look like

  test_that("data_frame works") {
    auto getExportedValue = cpp11::package("base")["getExportedValue"];
    auto mtcars = getExportedValue("datasets", "mtcars");
    cpp11::data_frame mtcars_df(mtcars);

    expect_true(mtcars_df.nrow() == 31);
    expect_true(mtcars_df.ncol() == 11);

    cpp11::strings names(mtcars_df.names());
    expect_true(names[0] == "mpg");
    expect_true(names[7] == "vs");

    auto iris = getExportedValue("datasets", "iris");
    cpp11::data_frame iris_df(iris);

    expect_true(iris_df.nrow() == 150);
    expect_true(iris_df.ncol() == 5);
  }

but Catch just tells us this

{xml_node}
<Section name="data_frame works" filename="test-data_frame.cpp" line="9">
[1] <Expression success="false" type="CATCH_CHECK" filename="test-data_frame.cpp" line="14">\n  <Original>\n            mtcars_df.nrow() == 31\n          </Original>\n  <Expanded>\n            32 == 31\n          </Ex ...
[2] <OverallResults successes="5" failures="1" expectedFailures="0"/>

i.e. we see 5 successes, 1 failure, and we can get info about that failure from the Expression, but it doesn't list the successes interweaved with that failure, so we can't call pass() and fail() in the right order.


I'm also not sure it matters much. Whether we call pass() and fail() in the right order doesn't seem that important, right? Because pass() never emits anything to the screen.

The one case that might be kinda important is emitting fail() vs expectation("error") in the right order, but that feels rare enough that I'm ok with the current setup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense.

@DavisVaughan
Copy link
Member Author

@hadley b8c1942 refactors this into a series of self contained parsers so that we can handle all the xml2 bits up front with parse_catch_contexts() and get one big R object back to process. It feels easier to follow now.

@DavisVaughan DavisVaughan requested a review from hadley January 21, 2026 21:26
for (test in context$tests) {
test_that(test$name, {
for (i in seq_len(test$n_successes)) {
pass()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now makes it very clear why order doesn't matter.

@DavisVaughan DavisVaughan changed the title Rework run_cpp_tests() to avoid testthat internals! Rework run_cpp_tests() to avoid testthat internals Jan 22, 2026
@DavisVaughan DavisVaughan merged commit dd772a0 into r-lib:main Jan 22, 2026
17 checks passed
@DavisVaughan DavisVaughan deleted the fix/testthat-cpp-skip branch January 22, 2026 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants