[ZIPT Benchmark] Z3 c3 branch — 2026-03-14 #8985
Closed
Replies: 1 comment
This discussion has been marked as outdated by Qf S Benchmark. A newer discussion is available at Discussion #9031.
Date: 2026-03-14
Branch: c3 (27f5541b0bd518ae3477df3b31ea9b001537ebc7)
Z3 Version: 4.17.0 - 64 bit
Benchmark set: QF_S (50 randomly selected files from tests/QF_S.tar.zst, total pool: 22,172 files)
Timeout: 10 seconds per benchmark (-T:10)

Summary
(Table comparing the seq solver and the nseq solver.)
Soundness disagreements (seq says sat, nseq says unsat or vice versa): 0 ✅
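The setup described above (a random sample of 50 files, each run with a 10-second limit) could be reproduced along the following lines. This is only a sketch, not the workflow's actual script: the helper names are hypothetical, and the argument that selects the seq or nseq solver is left to the caller because the report does not state it.

```python
import random
import subprocess

TIMEOUT_S = 10  # matches the -T:10 setting quoted in the report


def sample_benchmarks(files, k=50, seed=None):
    """Pick k distinct files uniformly at random from the pool."""
    rng = random.Random(seed)
    return rng.sample(sorted(files), k)


def build_command(z3_path, smt2_file, solver_args=()):
    """Build the z3 command line.

    solver_args is a placeholder for whatever option selects the string
    solver; its exact spelling is an assumption and not taken from the report.
    """
    return [str(z3_path), f"-T:{TIMEOUT_S}", *solver_args, str(smt2_file)]


def parse_verdict(stdout):
    """Return the first 'sat', 'unsat' or 'unknown' line of z3's output."""
    for line in stdout.splitlines():
        word = line.strip()
        if word in ("sat", "unsat", "unknown"):
            return word
    # Timeouts and errors print no verdict; fold them into 'unknown'.
    return "unknown"


def run_one(z3_path, smt2_file, solver_args=()):
    """Run z3 on one benchmark and return its verdict."""
    proc = subprocess.run(
        build_command(z3_path, smt2_file, solver_args),
        capture_output=True,
        text=True,
    )
    return parse_verdict(proc.stdout)
```

Passing a fixed seed to sample_benchmarks makes the sample repeatable, which helps when comparing runs across dates.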
Key Findings
- seq is significantly ahead of nseq on this sample: seq solved 44/50 benchmarks (sat+unsat) vs. 13/50 for nseq. The nseq solver timed out (returned unknown after ~10s) on 32 benchmarks that seq solved quickly.
- nseq is fast when it works: on the 13 benchmarks where nseq found an answer, its average time was well under 1s. The issue is coverage, not speed when solving.
- seq slow cases: slog_stranger_5307_sink.smt2 took 8.717s for seq; instance13753.smt2 and instance14506.smt2 both timed out for seq (10s).
- Families in the sample: 20230329-automatark-lu (automata-based), plus a few 2019-Jiang/slog (string log analysis), 20250403-rna (RNA folding), and 20250403-pcp-string (Post Correspondence Problem).

Notable Issues
Soundness Disagreements (Critical)
None found. 🎉
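A soundness disagreement, as defined in the summary, means one solver answers sat and the other unsat on the same file; an unknown on either side does not count. A minimal sketch of that check, together with the "seq solved but nseq did not" tally, might look like this. The function names and the shape of the results mapping are assumptions made for illustration only.

```python
DEFINITE = ("sat", "unsat")


def is_soundness_disagreement(seq_verdict, nseq_verdict):
    """True only when both solvers give a definite answer and they differ."""
    return (
        seq_verdict in DEFINITE
        and nseq_verdict in DEFINITE
        and seq_verdict != nseq_verdict
    )


def compare(results):
    """Summarise a mapping of file name -> (seq_verdict, nseq_verdict)."""
    disagreements = sorted(
        name
        for name, (seq_v, nseq_v) in results.items()
        if is_soundness_disagreement(seq_v, nseq_v)
    )
    # Coverage gap: seq gave a definite answer, nseq did not.
    seq_only = sorted(
        name
        for name, (seq_v, nseq_v) in results.items()
        if seq_v in DEFINITE and nseq_v not in DEFINITE
    )
    return {
        "disagreements": disagreements,
        "seq_only": seq_only,
        "seq_solved": sum(seq_v in DEFINITE for seq_v, _ in results.values()),
        "nseq_solved": sum(nseq_v in DEFINITE for _, nseq_v in results.values()),
    }
```

Treating unknown as "no answer" rather than as a disagreement keeps timeouts out of the critical category, so they are counted as coverage gaps instead.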
Crashes / Bugs
None. 🎉
Slow Benchmarks (> 8s for seq)
- slog_stranger_5307_sink.smt2: 8.717s
- instance13753.smt2: timeout (10s)
- instance14506.smt2: timeout (10s)

nseq timeouts: benchmarks seq solved but nseq did not (32 files)
These are benchmarks where seq returned sat or unsat quickly but nseq returned unknown after 10s, indicating incomplete coverage in the new solver: instance00826, instance04922, instance01403, instance08173, instance10859, instance07383, instance06304, instance04240, instance08947, instance08216, instance05830, instance08022, instance04707, instance11370, instance04767, instance04033, instance12274, instance13512, instance09705, instance14182, instance04229, instance00500, instance04918, slog_stranger_1844_sink, instance00239, instance11871, instance01916, instance02721, pcp_instance_443, unsolved_pcp_instance_63, instance07197, slog_stranger_5307_sink

Per-File Results
Files in the per-file table (50 rows):
instance00826.smt2, instance04922.smt2, instance02637.smt2, instance02212.smt2, instance07104.smt2, instance01403.smt2, instance15166.smt2, slog_stranger_5307_sink.smt2, slog_stranger_1444_sink.smt2, instance08173.smt2, benchmark_0339.smt2, instance13753.smt2, instance10859.smt2, instance07383.smt2, pcp_instance_443.smt2, instance06304.smt2, instance08250.smt2, instance07262.smt2, instance07197.smt2, instance04240.smt2, instance08947.smt2, instance08216.smt2, instance05830.smt2, instance08022.smt2, instance04707.smt2, instance11370.smt2, slog_stranger_1251_sink.smt2, instance04767.smt2, instance15230.smt2, instance11218.smt2, instance00285.smt2, instance14506.smt2, instance04033.smt2, instance12274.smt2, instance09203.smt2, instance03342.smt2, instance13512.smt2, instance09705.smt2, slog_stranger_386_sink.smt2, instance14182.smt2, instance04229.smt2, instance00500.smt2, instance04918.smt2, slog_stranger_1844_sink.smt2, benchmark_0059.smt2, instance00239.smt2, instance11871.smt2, unsolved_pcp_instance_63.smt2, instance01916.smt2, instance02721.smt2

Generated automatically by the QF_S Benchmark workflow on the c3 branch.