Add fuzzer for distribution parameters #53
Merged
benjamin-lieser merged 1 commit into rust-random:master on Mar 19, 2026
The new fuzzer script covers all distributions that accept parameters. While the framework is limited in the kinds of code it can explore effectively, the script may help catch crashes and hangs in distribution creation or sampling, which should either be fixed or turned into explicitly returned errors, as well as invalid outputs.
benjamin-lieser (Member) approved these changes on Mar 16, 2026 and left a comment:
Thanks, that looks really nice to have. What problems we consider worth addressing is a different question.
CHANGELOG.md entry
Summary
This adds code to fuzz the distribution samplers and identify parameter inputs and random samples on which they behave unexpectedly (panic, spin in an infinite loop, or produce output outside the expected range).
Motivation
Random samplers are sometimes used with parameters derived from external data, random processes, or unstable formulas (like 1/x), which can produce extreme or otherwise unusual values. I think it would make the rand_distr library more useful if it clearly indicated which distribution parameters are valid and, on those parameters, always produced output that is at least contained in the expected range.
(Perfect sampling on unusual extreme inputs, while nice to have, is both hard to achieve and hard to define; even just sampling from the [0,1) interval of floating point values can be done in many different ways.)
Also, ensuring the distribution samplers pass fuzzing (even if this involves fixing events that would only happen with probability < 2^{-256}) will in turn make it easier for users to use fuzzing to find issues in parts of their code that interact with rand_distr.
Details
This creates one large fuzzer script, using https://github.com/rust-fuzz/cargo-fuzz as backend, to handle all parameter types. While this makes finding some issue in the library easy (just run the fuzzer script and wait), there are enough easily reachable and hard to fix issues at the moment that testing a specific distribution requires modifying the script to return early on other types.
The fuzzer code is unfortunately defined separately from the code that implements the samplers. I'd prefer a system that defines fuzz targets as functions inside each file instead of keeping them separate, but I am not aware of any simple and usable way to do this.
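The single-script structure described above can be sketched roughly as follows. This is a std-only illustration and not the script added in this PR: the `Target`, `Outcome`, `fuzz_one`, and `encode` names, the byte layout, and the two toy samplers are all hypothetical stand-ins for the real rand_distr distributions. It shows one selector byte choosing the distribution, an `only` filter playing the role of the "return early on other types" edit, and the range check on the output.

```rust
/// Hypothetical tag for the distribution a fuzz input exercises.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Target {
    Uniform,
    Exp,
}

/// What happened for one fuzz input.
#[derive(Debug, PartialEq)]
enum Outcome {
    Skipped,    // input too short, unknown selector, or filtered out by `only`
    Rejected,   // parameters refused with an explicit error: acceptable
    InRange,    // sample landed in the expected range
    OutOfRange, // sample outside the expected range (this includes NaN)
}

/// Read 8 bytes starting at `offset`, or None if the input is too short.
fn read_8(data: &[u8], offset: usize) -> Option<[u8; 8]> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(data.get(offset..offset + 8)?);
    Some(buf)
}

/// Assumed layout: byte 0 = selector, then two f64 parameters, then 8 random bytes.
fn fuzz_one(data: &[u8], only: Option<Target>) -> Outcome {
    let target = match data.first() {
        Some(&0) => Target::Uniform,
        Some(&1) => Target::Exp,
        _ => return Outcome::Skipped,
    };
    // Early return used to focus a run on a single distribution.
    if only.map_or(false, |t| t != target) {
        return Outcome::Skipped;
    }
    let (a, b, bits) = match (read_8(data, 1), read_8(data, 9), read_8(data, 17)) {
        (Some(a), Some(b), Some(r)) => (
            f64::from_le_bytes(a),
            f64::from_le_bytes(b),
            u64::from_le_bytes(r),
        ),
        _ => return Outcome::Skipped,
    };
    // Map 53 random bits to a float in [0, 1).
    let u = (bits >> 11) as f64 / (1u64 << 53) as f64;
    let in_range = match target {
        Target::Uniform => {
            // Toy uniform on [a, b): naive affine map, no extra safeguards.
            if !(a.is_finite() && b.is_finite() && a < b) {
                return Outcome::Rejected;
            }
            let x = u * (b - a) + a;
            a <= x && x < b
        }
        Target::Exp => {
            // Toy exponential with rate `a`, sampled by inversion.
            if !(a.is_finite() && a > 0.0) {
                return Outcome::Rejected;
            }
            let x = -(1.0 - u).ln() / a;
            x >= 0.0
        }
    };
    if in_range { Outcome::InRange } else { Outcome::OutOfRange }
}

/// Build an input in the assumed layout (handy for seeds and regression cases).
fn encode(selector: u8, a: f64, b: f64, bits: u64) -> Vec<u8> {
    let mut v = vec![selector];
    v.extend_from_slice(&a.to_le_bytes());
    v.extend_from_slice(&b.to_le_bytes());
    v.extend_from_slice(&bits.to_le_bytes());
    v
}
```

In an actual cargo-fuzz target, a function like this would presumably be called from the `fuzz_target!` macro of libfuzzer-sys, with an assertion that the result is not `OutOfRange`, so that the fuzzer reports the offending input as a crash.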
Questions:
How strictly should this check that outputs are in the expected range (and not NaN)? Some relatively central distributions (like Normal) may produce NaN if the input parameters are NaN. Also, the rand crate's Uniform distribution on floating point values outputs value0_1 * self.scale + self.low, which may be affected by floating point rounding and fall 1 ulp outside the target interval, or overflow. There are ways to fix these, but I don't know whether they are worth the runtime cost of checking for NaN inputs and adjusting the sampling logic.
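To make the rounding and overflow concern concrete, here is a minimal sketch of the bare affine expression. The `affine_sample` and `in_half_open_range` functions are hypothetical helpers, not rand's implementation: the scale is assumed to be computed naively as high - low, and the unit sample is passed in directly, so any adjustment rand applies when constructing Uniform is not modelled.

```rust
/// Hypothetical stand-in for the expression `value0_1 * scale + low`,
/// with the scale computed naively as `high - low`.
fn affine_sample(value0_1: f64, low: f64, high: f64) -> f64 {
    let scale = high - low;
    value0_1 * scale + low
}

/// Range check a fuzzer could apply to the output; NaN fails both comparisons.
fn in_half_open_range(x: f64, low: f64, high: f64) -> bool {
    low <= x && x < high
}

/// Largest f64 strictly below 1.0, i.e. 1 - 2^-53.
fn largest_below_one() -> f64 {
    1.0 - f64::EPSILON / 2.0
}
```

With `low = 1.0` and `high = 2.0`, `affine_sample(largest_below_one(), 1.0, 2.0)` computes 2 - 2^-53, which lies exactly halfway between two representable values and rounds to 2.0, so the result equals the excluded upper bound. With `low = -f64::MAX` and `high = f64::MAX`, the scale overflows to infinity and `affine_sample(0.0, -f64::MAX, f64::MAX)` is NaN, since 0 times infinity is NaN.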