docs/ci-release.md (95 additions, 0 deletions)
@@ -228,4 +228,99 @@ ex: Branch is named `develop` and the PR is numbered `113`
- `stacks-core:2.1.0.0.0`
- `stacks-core:latest`

## Mutation Testing

When a new Pull Request (PR) is submitted, this feature evaluates the quality of the tests added or modified in the PR.
It checks the new and altered functions through mutation testing.
Mutation testing involves making small changes (mutations) to the code to check if the tests can detect these changes.
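
As a minimal sketch of the idea, consider the hypothetical function below (`can_afford` and its tests are made up for illustration and are not part of this repository). A tool such as cargo-mutants may replace the body of a `bool` function with `true`; the mutant is written out by hand here so that both versions can be compared.

```rust
/// Returns true when `balance` covers `amount`.
/// Hypothetical function, used only to illustrate a mutation.
pub fn can_afford(balance: u64, amount: u64) -> bool {
    balance >= amount
}

/// Hand-written stand-in for a generated mutant: the body is replaced with `true`.
pub fn can_afford_mutant(_balance: u64, _amount: u64) -> bool {
    true
}

#[cfg(test)]
mod tests {
    use super::*;

    // Weak test: it passes for the original and for the mutant alike,
    // so on its own it would leave the mutant "missed".
    #[test]
    fn affords_smaller_amount() {
        assert!(can_afford(10, 5));
    }

    // Stronger test: it fails as soon as the body always returns `true`,
    // so the mutant would be "caught".
    #[test]
    fn rejects_larger_amount() {
        assert!(!can_afford(4, 5));
    }
}
```

With only the first test in place, the mutant survives; adding the second test is what lets the test suite detect the change.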

The mutations are run with or without a [GitHub Actions matrix](https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs).
The matrix is used when there is a large number of mutations to run ([see the documentation for the specific cases](https://github.com/stacks-network/actions/blob/main/stacks-core/mutation-testing/check-packages-and-shards/README.md#outputs)).
We use a matrix strategy with shards to enable parallel execution in GitHub Actions.
This approach allows multiple jobs to run concurrently across several runners.
The total workload is divided across all shards, so the overall duration of a workflow is approximately the total mutant-testing time divided by the number of shards, plus the initial build and test time.
This is particularly advantageous for large packages that have significant build and test times.
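
The approximation above can be sketched as a small helper. This is only a rough model under stated assumptions: `estimate_minutes` is a hypothetical function, every mutant is assumed to take the same time, and the numbers in the usage note are illustrative rather than measured.

```rust
/// Rough wall-clock estimate, in minutes, for a sharded mutation run.
///
/// Assumes every mutant takes `per_mutant` minutes and that each shard
/// first pays the same `build_and_test` cost. Hypothetical helper, not
/// part of the workflow.
pub fn estimate_minutes(build_and_test: u32, mutants: u32, per_mutant: u32, shards: u32) -> u32 {
    assert!(shards > 0, "at least one shard is required");
    // Each shard handles roughly mutants / shards mutants, rounded up.
    let mutants_per_shard = (mutants + shards - 1) / shards;
    build_and_test + mutants_per_shard * per_mutant
}
```

For example, with an illustrative 20-minute build, 24 mutants at 10 minutes each, and 4 shards, `estimate_minutes(20, 24, 10, 4)` gives 80 minutes, compared with 260 minutes on a single runner.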

Because the time mutation testing takes depends directly on the tests being run, some packages, such as `stackslib` and `stacks-node`, are slower (due to the number of tests or the time they take to run).
Their mutations are run separately from the others, with one or more parallel jobs, depending on the number of mutations found.

Once all the jobs have finished testing mutants, the last job collects all the tested mutations from the previous jobs, combines them, and outputs them to the `Summary` section of the workflow, at the bottom of the page.
There, you can find all mutants grouped by category, with links to the function they tested and a short description of how to fix the issue.
The PR should only be approved/merged after all the mutants tested are in the `Caught` category.
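
The approval rule can be expressed as a simple check. The sketch below is not the workflow's implementation: the `Outcome` enum and `ready_to_merge` function are hypothetical names used only to spell out the rule.

```rust
/// Outcome categories reported for tested mutants.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Caught,
    Missed,
    Unviable,
    Timeout,
}

/// Returns true only when every tested mutant was caught.
/// An empty slice also returns true, since there is nothing left to fix.
pub fn ready_to_merge(outcomes: &[Outcome]) -> bool {
    outcomes.iter().all(|outcome| *outcome == Outcome::Caught)
}
```

A single mutant outside the `Caught` category is enough for the check to return `false`.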

### Time required to run the workflow based on mutant outcomes and package size

- Small packages typically completed in under 30 minutes, aided by the use of shards.
- Large packages like `stackslib` and `stacks-node` initially required about 20-25 minutes for build and test processes.
- Each "missed" and "caught" mutant took approximately 15 minutes. Using shards, this meant about 50-55 minutes for processing around 32 mutants (10-16 functions modified). Every additional 8 mutants added another 15 minutes to the runtime.
- "Unviable" mutants, which arise in functions whose returned struct type lacks a `Default` implementation, took less than a minute each.
- "Timeout" mutants typically required more time. However, the affected functions should be marked to be skipped (by adding a skip attribute above the function), since a timeout indicates that the tests cannot proceed with the mutated values, unlike with the original implementation.

Each tested mutant falls into one of the following outcome categories:

- caught: A test failed with this mutant applied.
  This is a good sign about test coverage.

- missed: No test failed with this mutation applied, which seems to indicate a gap in test coverage.
  Or, it may be that the mutant is indistinguishable from the correct code.
  In any case, you may wish to add a better test.

- unviable: The attempted mutation doesn't compile.
  This is inconclusive about test coverage, since the function's return structure may not implement `Default::default()` (one of the mutations applied), which causes the compilation to fail.
  We recommend adding a `Default` implementation for the return structures of these functions; mark the function to be skipped only as a last resort.

- timeout: The mutation caused the test suite to run for a long time, until it was eventually killed.
  You might want to investigate the cause and only mark the function to be skipped if necessary.
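
To illustrate the recommendation for unviable mutants, here is a minimal sketch in which the return type derives `Default`. `FeeEstimate` and `estimate_fee` are hypothetical names and are not taken from the codebase.

```rust
/// Hypothetical return type, used only for illustration.
/// Deriving `Default` lets a `Default::default()` mutant compile.
#[derive(Debug, Default, PartialEq)]
pub struct FeeEstimate {
    pub rate: u64,
    pub confirmed: bool,
}

/// Hypothetical function under test.
pub fn estimate_fee(size_bytes: u64) -> FeeEstimate {
    FeeEstimate {
        rate: size_bytes * 2,
        confirmed: true,
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Fails if the body is replaced with `Default::default()`,
    // so that mutant would be "caught" rather than "unviable".
    #[test]
    fn fee_scales_with_size() {
        assert_eq!(estimate_fee(100), FeeEstimate { rate: 200, confirmed: true });
    }
}
```

Once the type implements `Default`, the mutant compiles and is judged by the tests instead of being reported as unviable.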
### Skipping Mutations

Some functions may be inherently hard to cover with tests, for example if:

- Generated mutants cause tests to hang.
- You've chosen to test the functionality by human inspection or some higher-level integration tests.
- The function has side effects or performance characteristics that are hard to test.
- You've decided that the function is not important to test.

To mark functions as skipped, so they are not mutated:

- Add a Cargo dependency on the [mutants](https://crates.io/crates/mutants) crate, version `0.0.3` or later (this must be a regular `dependency`, not a `dev-dependency`, because the annotation will be on non-test code) and mark functions with `#[mutants::skip]`, or
- Avoid adding a regular dependency by using the slightly longer `#[cfg_attr(test, mutants::skip)]`, which applies the attribute only in test builds.

### Example

```rust
use std::time::{Duration, Instant};

/// Returns true if the program should stop
#[cfg_attr(test, mutants::skip)] // Returning false would cause a hang
fn should_stop() -> bool {
    true
}

pub fn controlled_loop() {
    let start = Instant::now();
    for i in 0.. {
        println!("{}", i);
        if should_stop() {
            break;
        }
        // Safety net: give up after five minutes instead of looping forever.
        if start.elapsed() > Duration::from_secs(60 * 5) {
            panic!("timed out");
        }
    }
}
```