who2: Towards a faster expression builder #351
langston-barrett wants to merge 41 commits into master
Here are some benchmarking results on a subset of |
I added some datastructures inspired by What4's […]. This fully halves the average time, decides two additional problems, and only very slightly increases the median time (which is dominated by very small problems).
When working on `w4smt2`, I noticed that `ExprBuilder` added considerable overhead to SMT queries, and the overhead grew with larger queries. Profiling revealed that the culprit was What4's normalizing datastructures such as `WeightedSum`, which have `O(n log n)` worst-case construction time.

Who2 provides an alternative expression builder that performs local rewrites and tracks abstract domains, but does not include such heavyweight datastructures. It strives to keep construction approximately linear.

Benchmarks show that it actually slightly speeds up SMT solvers (!!) when used as a preprocessing pass (via `w2smt2`). Even if further data doesn't support that surprising conclusion, it would certainly be worthwhile to consider benchmarking if Who2 speeds up our actual workloads from Crucible, SAW, and GREASE.

Since the correctness of Who2 is crucial to any tools that might be built on it (and validity of benchmarking results), I have put special care into fairly extensive testing.

- I have run `who2` via `w2smt2` on many SMT-Lib sample problems and verified its agreement with SMT solvers and What4.
- Each rewrite has a corresponding unit test, and that statement is itself tested. Each rewrite test is itself validated against Z3.
- I created a Hedgehog generator for a type `ExprBuilderAPI`. This is like the existing "template" tests. I created a `TestBuilder` that uses the same `SymExpr` type as Who2's `Builder`, but without rewriting and with all-top abstract domains. There is a Hedgehog test stating that these builders produce equisatisfiable predicates. I did some manual mutation testing (changing some rewrites) and confirmed that it found the issues I introduced.
- There are property-based tests for laws of typeclass `instance`s in the code.
Possible extensions:

- Improve abstract domains
- Support other SMT-Lib logics (integers, reals)
- What4-like interface to solvers (currently `w2smt2` just serializes to file)
- Add a simple `WeightedSum`-like datastructure that has a `Seq` of non-constant terms, together with a single constant term
- Generalize `BloomSeq` to a map-like datastructure, use for a more comprehensive `WeightedSum`-like datastructure
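
The simple `WeightedSum`-like datastructure mentioned in the extensions could look roughly like the following. This is only a sketch under stated assumptions: the names `Sum`, `addConst`, `addTerm`, `add`, and `eval` are hypothetical and not taken from Who2, and coefficients are plain `Integer`s rather than bitvectors. The point it illustrates is that constants fold into a single slot while non-constant terms are appended to a `Seq` without sorting or merging, so construction avoids the `O(n log n)` normalization cost.

```haskell
import Data.Foldable (toList)
import Data.Sequence (Seq, (|>), (><))
import qualified Data.Sequence as Seq

-- | Hypothetical sum: coefficient/term pairs in insertion order,
-- plus one folded constant. Not Who2's actual representation.
data Sum t = Sum
  { sumTerms :: Seq (Integer, t)
  , sumConst :: Integer
  } deriving (Eq, Show)

-- | A sum consisting of a constant only.
constant :: Integer -> Sum t
constant c = Sum Seq.empty c

-- | A sum consisting of a single term with coefficient 1.
term :: t -> Sum t
term t = Sum (Seq.singleton (1, t)) 0

-- | Adding a constant touches only the constant slot.
addConst :: Integer -> Sum t -> Sum t
addConst c s = s { sumConst = sumConst s + c }

-- | Appending a term is a snoc; duplicates are deliberately not merged.
addTerm :: Integer -> t -> Sum t -> Sum t
addTerm k t s = s { sumTerms = sumTerms s |> (k, t) }

-- | Adding two sums concatenates the term sequences and adds the constants.
add :: Sum t -> Sum t -> Sum t
add (Sum ts c) (Sum us d) = Sum (ts >< us) (c + d)

-- | Reference semantics, useful for testing the constructors.
eval :: (t -> Integer) -> Sum t -> Integer
eval env (Sum ts c) = c + sum [k * env t | (k, t) <- toList ts]
```

The trade-off is that `x + x` stays as two terms instead of becoming `2 * x`. Recovering that kind of normalization cheaply is presumably what the map-like generalization of `BloomSeq` is for.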
Remove an indirect call (through `bAllocator`) on a fast path
Benchmarking shows this does not make much of a difference either way, and it's good for memory usage.
Motivation
When working on `w4smt2`, I noticed that `ExprBuilder` added considerable overhead to SMT queries, and the overhead grew with larger queries. Profiling revealed that the culprit was What4's normalizing datastructures such as `WeightedSum`, which have `O(n log n)` worst-case construction time. I became concerned that our higher-level tools might be paying an excessive performance penalty by going through What4.
Approach
Who2 provides an alternative expression builder that performs local rewrites and tracks abstract domains, but uses only approximately linear time by default.
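
To make "local rewrites plus abstract domains" concrete, here is a minimal sketch. It is an assumption-laden toy, not Who2's code: the types `Range`, `Node`, `Expr` and the functions `lit`, `var`, `bvAnd`, `bvUlt` are hypothetical, and the domain is just an unsigned interval over 8-bit values. Each smart constructor inspects only its immediate arguments, so every node costs constant time and building a term stays linear in its size.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- | Hypothetical abstract domain: an inclusive unsigned range.
data Range = Range { rLo :: Word8, rHi :: Word8 } deriving (Eq, Show)

data Node = Lit Word8 | Var String | BvAnd Expr Expr deriving (Eq, Show)

-- | Every expression caches the abstract value of its result.
data Expr = Expr { node :: Node, range :: Range } deriving (Eq, Show)

lit :: Word8 -> Expr
lit w = Expr (Lit w) (Range w w)

-- | A fresh variable gets the top element of the domain.
var :: String -> Expr
var x = Expr (Var x) (Range minBound maxBound)

-- | Bitwise and, with a few local rewrites on the immediate children.
bvAnd :: Expr -> Expr -> Expr
bvAnd a b = case (node a, node b) of
  (Lit x, Lit y) -> lit (x .&. y)   -- constant folding
  (Lit 0, _)     -> lit 0           -- zero annihilates
  (_, Lit 0)     -> lit 0
  (Lit 255, _)   -> b               -- all-ones is the identity
  (_, Lit 255)   -> a
  _              ->
    -- Unsigned and never exceeds either operand.
    Expr (BvAnd a b) (Range 0 (min (rHi (range a)) (rHi (range b))))

-- | Unsigned less-than. 'Just' means the ranges alone decide the
-- answer; 'Nothing' means a solver would still be needed.
bvUlt :: Expr -> Expr -> Maybe Bool
bvUlt a b
  | rHi (range a) <  rLo (range b) = Just True
  | rLo (range a) >= rHi (range b) = Just False
  | otherwise                      = Nothing
```

For example, `bvUlt (bvAnd (var "x") (lit 15)) (lit 16)` evaluates to `Just True` from the ranges alone, which is the kind of query that can be answered without consulting a solver.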
Many of Who2's performance-oriented features (e.g., hash-consing) can be controlled via Cabal flags, allowing consumers to try out different features depending on their use-case. When disabled, they result in clearly dead branches that are pruned by the compiler.
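
One common way to get this effect in Haskell is to turn a Cabal flag into a CPP macro and expose it as a top-level `Bool` constant. The sketch below assumes a hypothetical macro `WHO2_HASH_CONSING` (for instance set with `cpp-options: -DWHO2_HASH_CONSING` under an `if flag(...)` stanza) and a toy `Store` type; neither name comes from Who2. Because `hashConsingEnabled` is a known constant, an optimizing GHC build can typically drop the unused branch.

```haskell
{-# LANGUAGE CPP #-}

import qualified Data.Map.Strict as Map

-- | Compile-time switch. The macro name is a hypothetical placeholder.
hashConsingEnabled :: Bool
#ifdef WHO2_HASH_CONSING
hashConsingEnabled = True
#else
hashConsingEnabled = False
#endif

-- | Toy term store: a fresh-id counter plus an optional sharing table.
data Store = Store
  { nextId :: Int
  , table  :: Map.Map String Int
  }

emptyStore :: Store
emptyStore = Store 0 Map.empty

-- | Intern a term key. With hash-consing on, equal keys share an id;
-- with it off, the table is never consulted or updated.
intern :: String -> Store -> (Int, Store)
intern key s
  | hashConsingEnabled, Just i <- Map.lookup key (table s) = (i, s)
  | otherwise =
      let i = nextId s
          table' = if hashConsingEnabled
                     then Map.insert key i (table s)
                     else table s
      in (i, Store (i + 1) table')
```

Keeping the switch as an ordinary `Bool` rather than wrapping whole functions in `#ifdef` means both branches are always type-checked, whichever flag setting is built.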
Results
Benchmarks show that Who2 offers extremely low overhead but still manages to solve a substantial number of problems without consulting a solver. It would certainly be worthwhile to benchmark whether Who2 speeds up our actual workloads from Crucible, SAW, and GREASE.
Testing
Since the correctness of Who2 is crucial to any tools that might be built on it (and validity of benchmarking results), I have put special care into fairly extensive testing.
- I have run `who2` via `w2smt2` on many SMT-Lib sample problems and verified its agreement with SMT solvers and What4.
- Each rewrite has a corresponding unit test (see `-- test:` comments in the source). Each rewrite test is itself validated against Z3.
- […] `:prove` command establishing its validity; if Cryptol is available, these are proved in the test suite.
- […] `-- test-smt:` comments in the source).
- I created a Hedgehog generator for a type `ExprBuilderAPI`. This is like the existing "template" tests.
- `Builder` reduces expressions without variables to literals.
- `Builder` never builds empty datastructures (e.g., semiring sums).
- `ExprBuilderAPI`, as interpreted by `Builder`, can generate every constructor of `Builder`'s `Expr`. This ensures good coverage in the generator.
- I created a `TestBuilder` that uses the same `SymExpr` type as Who2's `Builder`, but without rewriting and with all-top abstract domains. There is a Hedgehog test stating that these builders produce equisatisfiable predicates. I did some manual mutation testing (changing some rewrites) and confirmed that it found the issues I introduced.
- There are property-based tests for laws of typeclass `instance`s in the code.

There are, in fact, more test modules than source modules.
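
The idea of comparing a rewriting builder against a non-rewriting one can be sketched without any test library. The code below is a simplified stand-in, not the PR's test suite: it swaps Hedgehog's random generation for exhaustive enumeration of small programs, checks full semantic agreement (which implies equisatisfiability), and uses hypothetical names (`Prog`, `Builder`, `plain`, `rewriting`, `allAgree`) over a tiny Boolean language.

```haskell
import Control.Monad (replicateM)

-- | Expressions produced by a builder.
data Expr = T | F | V Int | Not Expr | And Expr Expr
  deriving (Eq, Show)

-- | Builder-API programs, independent of any particular builder.
data Prog = PT | PF | PV Int | PNot Prog | PAnd Prog Prog

-- | A builder is just an interpretation of the API operations.
data Builder = Builder
  { bNot :: Expr -> Expr
  , bAnd :: Expr -> Expr -> Expr
  }

-- | Reference builder: no rewriting at all.
plain :: Builder
plain = Builder Not And

-- | Builder under test: a few local rewrites.
rewriting :: Builder
rewriting = Builder rNot rAnd
  where
    rNot T       = F
    rNot F       = T
    rNot (Not e) = e
    rNot e       = Not e
    rAnd F _ = F
    rAnd _ F = F
    rAnd T e = e
    rAnd e T = e
    rAnd a b = And a b

build :: Builder -> Prog -> Expr
build _ PT         = T
build _ PF         = F
build _ (PV i)     = V i
build b (PNot p)   = bNot b (build b p)
build b (PAnd p q) = bAnd b (build b p) (build b q)

eval :: [Bool] -> Expr -> Bool
eval _   T         = True
eval _   F         = False
eval env (V i)     = env !! i
eval env (Not e)   = not (eval env e)
eval env (And a b) = eval env a && eval env b

-- | All programs over two variables up to the given nesting depth.
progs :: Int -> [Prog]
progs 0 = [PT, PF, PV 0, PV 1]
progs n = ps ++ map PNot ps ++ [PAnd p q | p <- ps, q <- ps]
  where ps = progs (n - 1)

-- | Both builders must agree under every assignment of the variables.
agree :: Prog -> Bool
agree p = and
  [ eval env (build plain p) == eval env (build rewriting p)
  | env <- replicateM 2 [False, True] ]

allAgree :: Bool
allAgree = all agree (progs 2)
```

Mutation testing in this setting means deliberately breaking a rule, for example changing `rAnd T e = e` to `rAnd T _ = T`, and confirming that `allAgree` becomes `False`.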
Documentation
See especially the Haddocks on:
- `Who2.Builder`
- `Who2.Config`
- `Who2.Expr.App`
Status and future work
Only `QF_BV` operations are implemented at the moment.

TODO (before merging):
- Represent `or` as negated `and` a la What4
- […] `PolarizedBloomSeq`)
- […] `Seq` supporting appending
- Generalize `BloomSeq` to a map-like datastructure, use for a more comprehensive `WeightedSum`-like datastructure
- […] `TemplateHaskell` instances)

Possible extensions:
- What4-like interface to solvers (currently `w2smt2` just serializes to file)

I did use Claude Code in building who2.