This vignette is a quick reference guide for testing challenging functions. It's organized by problem type rather than technique, so you can quickly skim the whole vignette, spot the problem you're facing, then learn more about useful tools for solving it. In it you'll learn how to overcome the following challenges:
* Functions that depend on options and environment variables.
* Random number generators.
* Tests that can't be run in some environments.
* Testing web APIs.
* User interaction.
* User-facing text.
* Repeated code.
## Options and environment variables
If your function depends on options or environment variables, first try refactoring the function to make the [inputs explicit](https://design.tidyverse.org/inputs-explicit.html). If that's not possible, then you can use functions like `withr::local_options()` or `withr::local_envvar()` to temporarily change options and environment values within a test. Learn more in `vignette("test-fixtures")`.
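
For example, here's a sketch contrasting the two approaches (the function and option names here are hypothetical):

```{r}
#| label: options-explicit
# Implicit: behaviour silently depends on a global option
show_progress_implicit <- function() {
  isTRUE(getOption("mypkg.verbose"))
}

# Explicit: the option is still the default, but callers (and tests) can
# override it directly
show_progress <- function(verbose = getOption("mypkg.verbose", FALSE)) {
  isTRUE(verbose)
}

test_that("show_progress() respects the verbose option", {
  # The implicit version forces us to change global state, so use
  # withr::local_options() to make sure it's restored when the test ends
  withr::local_options(mypkg.verbose = TRUE)
  expect_true(show_progress_implicit())

  # The explicit version can be tested without touching global state at all
  expect_true(show_progress(verbose = TRUE))
  expect_false(show_progress(verbose = FALSE))
})
```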
## Random numbers
What happens if you want to test a function that relies on randomness in some way? If you're writing a random number generator, you probably want to generate large quantities of random numbers and apply some statistical test. But what if your function just happens to use a little bit of pre-existing randomness? How do you make your tests repeatable and reproducible?

Under the hood, random number generators generate different numbers each time you call them because they update a special `.Random.seed` variable stored in the global environment. You can temporarily set this seed to a known value to make your random numbers reproducible with `withr::local_seed()`, making random numbers a special case of test fixtures (`vignette("test-fixtures")`).

Here's a simple example showing how you might test the basic operation of a function that rolls a die:
```{r}
#| label: random-local-seed

dice <- function() {
  sample(6, 1)
}

test_that("dice returns different numbers", {
  # Fixing the seed makes the sequence of rolls reproducible
  withr::local_seed(1014)

  rolls <- replicate(20, dice())
  expect_true(all(rolls %in% 1:6))
  expect_true(length(unique(rolls)) > 1)
})
```
Alternatively, you might want to mock the function (`vignette("mocking")`) to eliminate randomness.

When should you set the seed and when should you use mocking? As a general rule of thumb, set the seed when you want to test the actual random behavior, and use mocking when you want to test the logic that uses the random results.
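
For example, here's a sketch of the mocking approach (the `roll_two()` helper is hypothetical):

```{r}
#| label: random-mocking
roll_two <- function() {
  dice() + dice()
}

test_that("roll_two() sums two dice rolls", {
  # Replace dice() with a deterministic stub so only the summing logic is tested
  local_mocked_bindings(dice = function() 6)
  expect_equal(roll_two(), 12)
})
```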
## Some tests can't be run in some circumstances
You can skip a test without it passing or failing if it's not possible to run it in the current environment (e.g., it's OS dependent, it only works interactively, or it shouldn't be tested on CRAN). Learn more in `vignette("skipping")`.
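
For example, a test can declare all of its requirements up front (the specific skips here are illustrative; use whichever apply to your package):

```{r}
#| eval: false
test_that("can download and parse the remote dataset", {
  skip_on_cran()                      # relies on network access, so don't run on CRAN
  skip_if_offline()                   # requires a working internet connection
  skip_if_not_installed("jsonlite")   # optional dependency used only by this test
  skip_on_os("windows")               # hypothetical: suppose this only works on unix-alikes

  # ... the rest of the test only runs when none of the skips trigger ...
})
```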
## HTTP requests
If you're trying to test functions that rely on HTTP requests, we recommend using {vcr} or {httptest2}. These packages both allow you to interactively record HTTP responses, and then later replay them in tests. This is a specialised type of mocking, which allows your tests to run without the internet and isolates your tests from failures in the underlying service.

If your package is going to CRAN, you **must** either use one of these packages or use `skip_on_cran()` for internet-facing tests. Otherwise you are at high risk of failing `R CMD check` whenever the API you are wrapping is temporarily down. This sort of failure causes extra work for the CRAN maintainers and extra hassle for you.
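
For example, here's a sketch using {vcr} (the `gh_user()` function and the fields it returns are hypothetical):

```{r}
#| eval: false
test_that("gh_user() returns the requested user", {
  # The first run makes a real request and records the response to a cassette
  # file; later runs replay the recording, so no internet access is needed
  vcr::use_cassette("gh-user", {
    user <- gh_user("hadley")
  })
  expect_equal(user$login, "hadley")
})
```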
## User interaction
If you're testing a function that relies on user feedback from `readline()`, `utils::menu()`, `utils::askYesNo()`, or similar, you can use mocking (`vignette("mocking")`) to have those functions return fixed values within the test. For example, imagine that you have the following function that asks the user if they want to continue:
```{r}
#| label: continue

continue <- function(prompt) {
  # Keep asking until the user gives an unambiguous answer
  repeat {
    answer <- readline(paste0(prompt, " (y/n): "))
    if (answer %in% c("y", "n")) {
      return(answer == "y")
    }
    cat("Please enter y or n.\n")
  }
}

test_that("user must respond y or n", {
  # Mock readline() (see vignette("mocking")) so the test never waits for
  # real input: the first scripted answer is invalid, so continue() should
  # keep prompting until it gets the "y"
  answers <- c("maybe", "y")
  local_mocked_bindings(
    readline = function(prompt) {
      answer <- answers[[1]]
      answers <<- answers[-1]
      answer
    }
  )
  expect_true(continue("Save changes?"))
})
```
If you were testing the behaviour of some function that uses `continue()`, you might choose to mock `continue()` itself rather than `readline()`. For example, the function below requires user confirmation before overwriting an existing file. In order to focus our tests on the behaviour of just this function, we mock `continue()` to return either `TRUE` or `FALSE` without any user messaging.
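
Here's a sketch of how that might look (the `save_text()` function itself is hypothetical):

```{r}
#| label: mock-continue
save_text <- function(path, lines) {
  # Ask for confirmation before clobbering an existing file
  if (file.exists(path) && !continue(paste0("Overwrite ", path, "?"))) {
    return(invisible(FALSE))
  }
  writeLines(lines, path)
  invisible(TRUE)
}

test_that("save_text() respects the user's answer", {
  path <- withr::local_tempfile()
  writeLines("old contents", path)

  # Pretend the user said no: the file should be left alone
  local_mocked_bindings(continue = function(prompt) FALSE)
  expect_false(save_text(path, "new contents"))
  expect_equal(readLines(path), "old contents")

  # Pretend the user said yes: the file should be overwritten
  local_mocked_bindings(continue = function(prompt) TRUE)
  expect_true(save_text(path, "new contents"))
  expect_equal(readLines(path), "new contents")
})
```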
## User-facing text
Errors, warnings, and other user-facing text should be tested to ensure they're consistent and actionable. Obviously, you can't test this 100% automatically, but you can use snapshots (`vignette("snapshotting")`) to ensure that user-facing messages are clearly shown in PRs and easily reviewed by another human.
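
For example, you can snapshot an error message so that any change to its wording shows up as a diff a human can review (the `add_numbers()` function here is a hypothetical sketch):

```{r}
#| label: snapshot-error
add_numbers <- function(x, y) {
  if (!is.numeric(x) || !is.numeric(y)) {
    stop("`x` and `y` must both be numeric.", call. = FALSE)
  }
  x + y
}

test_that("add_numbers() gives an informative error for bad input", {
  # The error message is recorded in a snapshot file and re-reviewed whenever
  # it changes
  expect_snapshot(add_numbers(1, "a"), error = TRUE)
})
```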
## Repeated code
If you find yourself repeating the same set of expectations again and again across your test suite, it may be a sign that you should design your own expectation. Learn how in `vignette("custom-expectations")`.

This vignette shows you how to write your own expectations. Custom expectations let you extend testthat to meet your own specialised testing needs, creating new `expect_` functions that work exactly the same way as the built-ins. They're particularly useful for producing expectations tailored to domain-specific data structures, combining multiple checks into a single expectation, or giving more actionable feedback when an expectation fails. You can use them within your package by putting them in a helper file, or share them with others by exporting them from your package.

In this vignette, you'll learn about the three-part structure of expectations, how to test your custom expectations, and see a few examples. If you write a lot of expectations, you'll also learn how to reduce repeated code across them.
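
To make that structure concrete, here's a sketch of a simplified `expect_length()` (testthat's real version does a little more, but the shape is the same):

```{r}
#| label: expect-length
expect_length <- function(object, n) {
  # 1. Capture the object and a label for it
  act <- quasi_label(rlang::enquo(object), arg = "object")

  # 2. Check each way the expectation could be violated, returning a failure early
  act_n <- length(act$val)
  if (act_n != n) {
    msg <- paste0(act$lab, " has length ", act_n, ", not length ", n, ".")
    return(fail(msg))
  }

  # 3. Otherwise pass, returning the value
  pass(act$val)
}
```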
The first step in any expectation is to use `quasi_label()` to capture a "labeled value", i.e., a list that contains both the value (`$val`) for testing and a label (`$lab`) used to make failure messages as informative as possible. This is a pattern that exists for fairly esoteric reasons; you don't need to understand it, just copy and paste it.

Next you need to check each way that `object` could violate the expectation. In this case, there's only one check, but in more complicated cases there can be multiple checks. It's usually easiest to check for violations one by one, returning `fail()` early for each; this makes it easier to write informative failure messages that first describe what was expected and then what was actually seen.

Note that you need to use `return(fail())` here. If you don't, your expectation might end up failing multiple times, or both failing and succeeding. You won't see these problems when interactively testing your expectation, but forgetting to `return()` can lead to incorrect failure and pass counts in typical usage. In the next section, you'll learn how to test your expectation to avoid this issue.

Finally, if the object is as expected, call `pass()` with `act$val`. This is good practice because expectation functions are called primarily for their side-effects (triggering a failure), and returning the value allows expectations to be piped together:
```{r}
#| label: piping

test_that("mtcars is an 11 column data frame", {
  mtcars |>
    expect_type("list") |>
    expect_s3_class("data.frame") |>
    expect_length(11)
})
```
### Testing your expectations
Once you've written your expectation, you need to test it: expectations are functions that can have bugs, just like any other function, and it's really important that they generate actionable failure messages. Luckily testthat comes with three expectations designed specifically to test expectations:
* `expect_success()` checks that your expectation emits exactly one success and zero failures.
* `expect_failure()` checks that your expectation emits exactly one failure and zero successes.
* `expect_snapshot_failure()` captures the failure message in a snapshot, making it easier to review whether it's useful.

The first two expectations are particularly important because they ensure that your expectation always reports either a single success or a single failure. If it doesn't, the end user is going to get confusing results in their test suite reports.
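
For example, here's how you might test the simplified `expect_length()` from above:

```{r}
#| label: test-expect-length
test_that("expect_length() reports a single success or failure", {
  x <- 1:10
  expect_success(expect_length(x, 10))
  expect_failure(expect_length(x, 11))
})
```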
The following sections show you a few more variations, loosely based on existing testthat expectations. These expectations were picked to show how you can generate actionable failures in slightly more complex situations.
### `expect_vector_length()`
Let's make `expect_length()` a bit more strict by also checking that the input is a vector. R is a bit unusual in that it gives a length to pretty much every object, and you can imagine not wanting code like the following to succeed, because it's likely that the user passed the wrong object to the test.
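
For example, functions have a length too, so the check below quietly passes even though `mean` is not the sort of object the test author had in mind. After the demonstration is a sketch of a stricter `expect_vector_length()`:

```{r}
#| label: expect-vector-length
# length() is defined for almost everything, so this succeeds even though
# `mean` is a function, not a vector
expect_length(mean, 1)

# A sketch of a stricter expectation that checks for a vector first
expect_vector_length <- function(object, n) {
  act <- quasi_label(rlang::enquo(object), arg = "object")

  # Failure mode 1: not a vector at all
  if (!rlang::is_vector(act$val)) {
    msg <- paste0("Expected ", act$lab, " to be a vector, not a ", class(act$val)[[1]], ".")
    return(fail(msg))
  }

  # Failure mode 2: a vector of the wrong length
  act_n <- length(act$val)
  if (act_n != n) {
    msg <- paste0(act$lab, " has length ", act_n, ", not length ", n, ".")
    return(fail(msg))
  }

  pass(act$val)
}

test_that("expect_vector_length() catches non-vectors", {
  expect_failure(expect_vector_length(mean, 1))
  expect_success(expect_vector_length(1:10, 10))
})
```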
Or imagine you're checking to see if an object inherits from an S3 class. R has a lot of different OO systems, and you want your failure messages to be as informative as possible, so before checking that the class matches, you probably want to check that the object is from the correct OO family.
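
Here's a sketch of how you might organize it (the exact failure messages are illustrative):

```{r}
#| label: expect-s3-class
expect_s3_class <- function(object, class) {
  # `class` controls the whole check, so validate it eagerly and throw an
  # error (not a failure) if it isn't a string
  if (!rlang::is_string(class)) {
    rlang::abort("`class` must be a single string.")
  }

  act <- quasi_label(rlang::enquo(object), arg = "object")

  # Failure mode 1: not an object at all
  if (!is.object(act$val)) {
    return(fail(paste0("Expected ", act$lab, " to be an S3 object.")))
  }
  # Failure mode 2: an object, but from the S4 system
  if (isS4(act$val)) {
    return(fail(paste0("Expected ", act$lab, " to be an S3 object, but it is an S4 object.")))
  }
  # Failure mode 3: an S3 object that doesn't inherit from `class`
  if (!inherits(act$val, class)) {
    msg <- paste0(
      "Expected ", act$lab, " to inherit from <", class, ">, ",
      "but it inherits from <", paste(class(act$val), collapse = "/"), ">."
    )
    return(fail(msg))
  }

  pass(act$val)
}

# A hypothetical object used to exercise the checks below
x1 <- factor(c("a", "b", "c"))

test_that("expect_s3_class() distinguishes the failure modes", {
  expect_failure(expect_s3_class(1:10, "factor"))   # not an object
  expect_failure(expect_s3_class(x1, "data.frame")) # wrong class
  expect_success(expect_s3_class(x1, "factor"))
})
```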
Note the variety of failure messages. We always print what was expected and, where possible, also display what was actually received:

* When `object` isn't an object, we can only say what we expected.
* When `object` is an S4 object, we can report that.
* When `inherits()` is `FALSE`, we provide the actual class, since that's most informative.

The general principle is to tailor the failure message to what you know about the input, so that the user can act on it.

Also note that I check that the `class` argument is a string and throw an error if it isn't. This is not a test failure; the user is calling the function incorrectly. In general, you should check the type of all arguments that affect the operation and throw an error if they're not what you expect.
```{r}
#| error: true
expect_s3_class(x1, 1)
```
### An optional class
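
One possible refinement is to make `class` optional, so that omitting it simply asserts that `object` is an S3 object of some kind. Here's a sketch of that design (the details are illustrative):

```{r}
#| label: optional-class
expect_s3_class2 <- function(object, class = NULL) {
  if (!is.null(class) && !rlang::is_string(class)) {
    rlang::abort("`class` must be a single string or NULL.")
  }

  act <- quasi_label(rlang::enquo(object), arg = "object")

  if (!is.object(act$val)) {
    return(fail(paste0("Expected ", act$lab, " to be an S3 object.")))
  }
  if (isS4(act$val)) {
    return(fail(paste0("Expected ", act$lab, " to be an S3 object, but it is an S4 object.")))
  }
  # Only check the specific class when one was supplied
  if (!is.null(class) && !inherits(act$val, class)) {
    msg <- paste0(
      "Expected ", act$lab, " to inherit from <", class, ">, ",
      "but it inherits from <", paste(class(act$val), collapse = "/"), ">."
    )
    return(fail(msg))
  }

  pass(act$val)
}

test_that("the class argument is optional", {
  expect_success(expect_s3_class2(x1))
  expect_failure(expect_s3_class2(1:10))
})
```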
## Repeated code
As you write more expectations, you might discover repeated code that you want to extract into a helper. Unfortunately, creating 100% correct helper functions is not straightforward in testthat because `fail()` captures the calling environment in order to give useful tracebacks, and testthat's own expectations don't expose this as an argument. Fortunately, getting this right is not critical (you'll just get a slightly suboptimal traceback in the case of failure), so we don't recommend bothering in most cases. We document it here, however, because it's important to get it right in testthat itself.

The key challenge is that `fail()` captures a `trace_env` which should be the execution environment of the expectation. This usually works, because the default value of `trace_env` is `caller_env()`. But when you introduce a helper, you'll need to explicitly pass it along:
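
```{r}
#| label: trace-env
# A sketch of the pattern (the helper and its name are illustrative): the
# wrapper passes its own execution environment along, and the helper forwards
# it to fail()
expect_vector_length <- function(object, n) {
  act <- quasi_label(rlang::enquo(object), arg = "object")
  expect_vector_length_(act, n, trace_env = environment())
}

expect_vector_length_ <- function(act, n, trace_env = rlang::caller_env()) {
  if (!rlang::is_vector(act$val)) {
    msg <- paste0("Expected ", act$lab, " to be a vector, not a ", class(act$val)[[1]], ".")
    return(fail(msg, trace_env = trace_env))
  }
  act_n <- length(act$val)
  if (act_n != n) {
    msg <- paste0(act$lab, " has length ", act_n, ", not length ", n, ".")
    return(fail(msg, trace_env = trace_env))
  }
  pass(act$val)
}
```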
A few recommendations:
* The helper shouldn't be user facing, so we give it a `_` suffix to make that clear.
* It's typically easiest for a helper to take the labeled value produced by `quasi_label()`.
* Your helper should usually call both `fail()` and `pass()` and be returned from the wrapping expectation.

Again, you're probably not writing so many expectations that it makes sense for you to go to this effort, but it is important for testthat itself to get right.