[ZIPT Benchmark] Z3 c3 branch — 2026-03-20 #9054
Closed
This discussion has been marked as outdated by Qf S Benchmark. A newer discussion is available at Discussion #9057.
Date: 2026-03-20
Branch: c3
Benchmark set: QF_S (50 randomly selected files from tests/QF_S.tar.zst, total pool: 22,172 files)
Timeout: 10 seconds per benchmark (-T:10 for Z3 nseq; -T:5 for seq+trace; -t:10000 for ZIPT)
Build: Debug (CMAKE_BUILD_TYPE=Debug, Z3 4.17.0)
Summary
Soundness disagreements (any two solvers return conflicting sat/unsat): 0
Headline: nseq is ~10× faster than seq overall (3.67 s vs 36.93 s total wall-clock time) and solves 2 more instances (45 vs 43 definitive answers). No soundness bugs were found between any pair of solvers.
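The disagreement criterion above (two solvers returning conflicting sat/unsat answers on the same file) can be sketched as follows. This is a minimal illustration, not the workflow's actual script: the function name `soundness_disagreements`, the verdict strings other than sat/unsat, and the example rows are assumptions. Only the two wall-clock totals come from this report.

```python
from itertools import combinations

DEFINITIVE = {"sat", "unsat"}

def soundness_disagreements(results):
    """Return (file, solver_a, solver_b) triples whose definitive verdicts conflict.

    `results` maps a file name to {solver_name: verdict}. Verdicts other than
    sat/unsat (timeout, unknown, bug, ...) are ignored, since they cannot conflict.
    """
    conflicts = []
    for name, verdicts in results.items():
        definitive = {s: v for s, v in verdicts.items() if v in DEFINITIVE}
        for (a, va), (b, vb) in combinations(sorted(definitive.items()), 2):
            if va != vb:
                conflicts.append((name, a, b))
    return conflicts

# Hypothetical rows, for illustration only.
example = {
    "a.smt2": {"seq": "sat", "nseq": "sat", "zipt": "sat"},
    "b.smt2": {"seq": "timeout", "nseq": "unsat", "zipt": "unsat"},
    "c.smt2": {"seq": "sat", "nseq": "unknown", "zipt": "bug"},
}
print(soundness_disagreements(example))

# Overall speedup from the totals reported above (seq total / nseq total).
speedup = 36.93 / 3.67
print(f"{speedup:.2f}x")
```

A timeout or unknown on one side never counts as a disagreement here, which matches the "where they both return a definitive answer" wording used in this report.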
Notable Issues
Soundness Disagreements (Critical)
None found. All solvers agree on every instance where they both return a definitive answer.
Crashes / Bugs
All 4 ZIPT bug verdicts are unsupported-feature errors, not crashes. ZIPT outputs:
During the timed benchmark run nseq returned unknown in 35 ms without hitting the assertion. seq correctly solves this instance as sat in 845 ms. ZIPT also returns sat in 207 ms. This indicates a bug in nseq's model-construction path for the type of constraints in query3188.smt2.
Slow Benchmarks (> 8s)
No benchmark exceeded 8 s for any solver (seq's worst was 5.009 s, which is its -T:5 hard cap; nseq's worst was 0.916 s).
Cases Where nseq Solved But seq Did Not
Three instances where nseq (and ZIPT) found a definitive answer but seq hit its timeout:
- dining-cryptographers_sat_non_incre_equiv_init_0_7.smt2
- 03_track_76.smt2
- slog_stranger_3309_sink.smt2
Case Where seq Solved But nseq Did Not
One instance where seq returned sat but nseq returned unknown:
- query3188.smt2
As noted above, nseq has a latent assertion violation in its model-construction path for this instance.
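As a quick consistency check, the two "solved by only one solver" lists can be reconciled with the headline counts (45 vs 43 definitive answers). The sketch below uses only file names and counts stated in this report. It assumes that the two lists are exhaustive and that a seq timeout counts as "no definitive answer".

```python
# File names and counts below are taken from this report.
nseq_only = {
    "dining-cryptographers_sat_non_incre_equiv_init_0_7.smt2",
    "03_track_76.smt2",
    "slog_stranger_3309_sink.smt2",
}
seq_only = {"query3188.smt2"}
nseq_definitive = 45
seq_definitive = 43

# Instances answered definitively by both solvers, derived from each side.
solved_by_both_from_nseq = nseq_definitive - len(nseq_only)
solved_by_both_from_seq = seq_definitive - len(seq_only)

# Net advantage of nseq in definitive answers.
net_gain = len(nseq_only) - len(seq_only)

print(solved_by_both_from_nseq, solved_by_both_from_seq, net_gain)
```

Both derivations give the same number of commonly solved instances, and the net gain equals the 45 − 43 difference in the headline, so the lists and the summary agree under these assumptions.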
Trace Analysis: seq-fast / nseq-slow Hypotheses
No seq-fast / nseq-slow cases were observed in this run.
The trend is the opposite: nseq is uniformly faster than seq on every instance where both solvers produce the same verdict. The overall 10× speedup in nseq is consistent and covers both sat and unsat instances across all benchmark families tested (automatark-lu, woorpje-lu, slog, sygus-qgen, rna-sat, pcp-string).
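The "uniformly faster" claim can be checked mechanically from per-file rows. The following is a minimal sketch under stated assumptions: the row layout, the function name `nseq_not_faster`, and all example values are hypothetical, and "same verdict" is restricted here to definitive sat/unsat answers.

```python
DEFINITIVE = ("sat", "unsat")

def nseq_not_faster(rows):
    """List files where both solvers agree on sat/unsat but nseq was not faster.

    Each row is (file, seq_verdict, seq_seconds, nseq_verdict, nseq_seconds).
    An empty result means nseq was strictly faster on every agreed instance.
    """
    slow = []
    for name, seq_verdict, seq_s, nseq_verdict, nseq_s in rows:
        same = seq_verdict == nseq_verdict and seq_verdict in DEFINITIVE
        if same and nseq_s >= seq_s:
            slow.append(name)
    return slow

# Hypothetical rows, for illustration only.
rows = [
    ("a.smt2", "sat", 0.80, "sat", 0.04),
    ("b.smt2", "unsat", 0.40, "unsat", 0.05),
    ("c.smt2", "timeout", 5.00, "unsat", 0.90),  # skipped: verdicts differ
]
print(nseq_not_faster(rows))
```

Any file name returned by this check would be a candidate seq-fast / nseq-slow case worth a trace comparison.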
The seq trace for dining-cryptographers_sat_non_incre_equiv_init_0_7.smt2 (which seq could not solve within 5 s) shows the solver spending most of its time in:
- mk_eq_core equality normalisations (thousands of X == varout entries)
- enque_axiom for seq.unit characters
- propagate_in_re for complex union/concatenation regexes
- add_length, seq.length_limit[1:varout]
The seq solver appears to expand the regex structure into character-level axioms rather than reasoning about it holistically. nseq likely uses a more direct Parikh/automata-based argument to refute the constraints without character-level case splitting.
Per-File Results
All 50 benchmarks:
- instance11200.smt2
- instance05074.smt2
- instance08489.smt2
- instance15382.smt2
- dining-cryptographers_sat_non_incre_equiv_init_0_7.smt2
- benchmark_0438.smt2
- instance06903.smt2
- instance05630.smt2
- instance01549.smt2
- instance15219.smt2
- instance13864.smt2
- instance04058.smt2
- instance10667.smt2
- instance01181.smt2
- instance12719.smt2
- instance04793.smt2
- instance06019.smt2
- instance15262.smt2
- instance02487.smt2
- benchmark_0013.smt2
- instance06027.smt2
- instance11858.smt2
- instance06416.smt2
- instance08037.smt2
- instance01652.smt2
- benchmark_0190.smt2
- instance08668.smt2
- instance12920.smt2
- instance01206.smt2
- instance15469.smt2
- query3188.smt2
- instance12921.smt2
- instance04285.smt2
- instance02474.smt2
- instance02273.smt2
- slog_stranger_4971_sink.smt2
- 03_track_76.smt2
- instance01832.smt2
- instance00006.smt2
- pcp_instance_418.smt2
- instance01431.smt2
- slog_stranger_3309_sink.smt2
- instance06168.smt2
- instance09484.smt2
- 03_track_110.smt2
- instance10830.smt2
- instance01565.smt2
- instance10351.smt2
- instance04659.smt2
- instance13803.smt2
Generated automatically by the ZIPT Benchmark workflow on the c3 branch.