Commit 1cc5e96
V3.97: hygiene release (#507)
* Incrementing SEMVER to v3.96.1
* Fix UBSan: guard negative exponent overflow in areal conversion
When exponent is a large negative (e.g. -72), the shift `1ull << -exponent`
exceeds 63 bits, causing undefined behavior. Add `exponent > -64` guard
so both positive and negative extremes fall through to the safe ipow() path.
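
The guard described above can be sketched as follows. This is a minimal illustration, not the library's areal conversion code: `pow2_guarded` is a hypothetical name, and `std::ldexp` stands in for the `ipow()` fallback mentioned in the message.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: returns 2^exponent as a double.
// Shifting a 64-bit value by 64 or more bits is undefined behavior,
// so the shift paths are only taken when the shift count is in [0, 63].
inline double pow2_guarded(int exponent) {
    if (exponent >= 0 && exponent < 64) {
        return static_cast<double>(1ull << exponent);
    }
    if (exponent < 0 && exponent > -64) {
        return 1.0 / static_cast<double>(1ull << -exponent);
    }
    // Both extremes fall through to a path that does not shift
    // (assumption: std::ldexp is used here in place of ipow()).
    return std::ldexp(1.0, exponent);
}
```

With this shape, an exponent such as -72 never reaches a shift expression.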
* weird difference between double and duble
* systems and position paper roadmaps
* Add papers/ artifact tree with three mixed-precision solver case studies
Create self-contained papers/ directory for systems and position paper
artifacts that can be zipped and shared with reviewers:
- papers/systems-paper/iterative_refinement.cpp: Carson & Higham
three-precision LU-IR across IEEE, posit, cfloat, dd, cross-family
- papers/systems-paper/conjugate_gradient.cpp: CG for SPD systems with
single-precision and two-precision (low preconditioner) configurations
- papers/systems-paper/idrs.cpp: IDR(s) for non-symmetric systems with
shadow space dimension sweep and number system comparison
Move paper docs from docs/papers/ to papers/docs/ for co-location.
Add UNIVERSAL_BUILD_PAPERS CMake option (wired into BUILD_ALL cascade).
* Add changelog and session doc for paper artifact tree and solver studies
* Add LaTeX scaffolding for arXiv systems paper
Plain article class (12pt) with full section structure, TODO placeholders,
29 BibTeX references (14 from JOSS + 15 new), and Makefile for local builds.
* Add changelog and session doc for LaTeX paper scaffolding
* arxiv paper first draft
* arxiv systems paper draft v3
* Adding a mixed-precision Attention head with KV cache test case for the ArXiv paper
* Complete posit2 arithmetic, conversion, logic, and assignment test suites
Add all four arithmetic operations (sub, mul, div plus existing add) via
blocktriple pipeline, port conversion/assignment/logic regression tests
from original posit, and fix three bugs discovered during testing:
- convert_ieee754() extractBits too small: nbits+4 lost IEEE sticky bits
causing false midpoint ties; now uses max(numeric_limits<Real>::digits, nbits+4)
- Integer assignment via blocktriple had hidden-bit off-by-one in round();
rerouted through convert_ieee754(static_cast<double>(rhs))
- Literal comparison operators accessed private _block member; replaced
with delegation to posit-posit comparison operators
* Fix posit2 clang test failures: regime value() and setbits() UB
positRegime::value() used manual division (1.0l / uint64_t(1) << -e2)
for negative exponents, which produced wrong results under clang due
to a codegen issue in this template context. Replace with std::ldexp()
matching the original posit implementation.
posit::setbits(uint64_t) used an uninitialized blockbinary temporary
leaving upper MSU bits as garbage. Replace with _block.setbits(value)
which properly masks the MSU.
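
A small sketch of the two ideas in this fix, assuming C++17. `regime_scale` and `mask_to_nbits` are hypothetical names for illustration and are not the posit2 members themselves.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for a regime scale: 2^e2 for either sign of e2.
// std::ldexp avoids both the manual division and the shift.
inline long double regime_scale(int e2) {
    return std::ldexp(1.0l, e2);
}

// Hypothetical sketch of masking a raw 64-bit value down to nbits so that
// no stale upper bits survive in the most significant storage unit.
template<unsigned nbits>
constexpr std::uint64_t mask_to_nbits(std::uint64_t value) {
    static_assert(nbits >= 1 && nbits <= 64, "nbits must be in [1, 64]");
    if constexpr (nbits == 64) {
        return value;  // nothing to mask, and a shift by 64 would be UB
    } else {
        return value & ((std::uint64_t(1) << nbits) - 1);
    }
}
```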
* Guard posit2 long double operators with LONG_DOUBLE_SUPPORT for MSVC
On MSVC, LONG_DOUBLE_SUPPORT is 0 but the long double comparison
operators were not guarded, causing ambiguous conversion errors since
the posit(long double) constructor was correctly excluded. Add
matching #if LONG_DOUBLE_SUPPORT guards to friend declarations,
operator implementations, and test code.
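
The guarding pattern can be illustrated with a toy type. Everything below is an assumption for illustration: `toy_number` is hypothetical, and the fallback definition of `LONG_DOUBLE_SUPPORT` only exists so the sketch compiles on its own (the library sets the macro in its own configuration).

```cpp
// Assumption: fallback definition so this sketch is self-contained.
#ifndef LONG_DOUBLE_SUPPORT
#  if defined(_MSC_VER)
#    define LONG_DOUBLE_SUPPORT 0
#  else
#    define LONG_DOUBLE_SUPPORT 1
#  endif
#endif

// Hypothetical number type, not the posit2 class.
class toy_number {
public:
    toy_number(double v) : _v(v) {}
#if LONG_DOUBLE_SUPPORT
    toy_number(long double v) : _v(static_cast<double>(v)) {}
#endif

    friend bool operator==(const toy_number& lhs, double rhs) { return lhs._v == rhs; }
#if LONG_DOUBLE_SUPPORT
    // The friend declaration carries the same guard as the constructor.
    friend bool operator==(const toy_number& lhs, long double rhs);
#endif

private:
    double _v;
};

#if LONG_DOUBLE_SUPPORT
// The operator implementation carries the guard as well.
inline bool operator==(const toy_number& lhs, long double rhs) {
    return static_cast<long double>(lhs._v) == rhs;
}
#endif
```

Keeping the constructor, the friend declaration, and the implementation under one and the same guard means the long double overload set is either fully present or fully absent.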
* Implementation of Unum 2.0 (for v3.96) (#505)
* Skeletal implementation of Unum 2.0
* Fix point multiplication bug and add support for reverse interval
* Added pow() and abs() + some optimizations
* Refactor according to Codacy suggestions
* Added op table/matrix for efficient operations
* Fix unum2 includes
* Fix operation matrix bug in unum2_impl.hpp
* unum2_fwd.hpp, static test and improvements
* Replace bitset::_Find_first() with a manual bit search for compatibility reasons
* Make unum2 friend class a little more specific on class lattice
* Make lattice parameters public to bypass MSVC build failures
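
`std::bitset::_Find_first()` is a libstdc++ extension, so it is not available on other standard libraries. A portable replacement might look like the sketch below; `find_first_set` is a hypothetical name and is not taken from the unum2 sources.

```cpp
#include <bitset>
#include <cstddef>

// Returns the index of the lowest set bit, or N when no bit is set.
// Returning N for "not found" mirrors the convention of the libstdc++ extension.
template<std::size_t N>
std::size_t find_first_set(const std::bitset<N>& bits) {
    for (std::size_t i = 0; i < N; ++i) {
        if (bits.test(i)) return i;
    }
    return N;
}
```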
* incrementing SEMVER to v3.97
* Fix posit<8,2> fast specialization: replace broken float_assign with proven convert_to_bb
The hand-rolled float_assign truncated toward zero instead of rounding
to nearest, causing 25K+ failures across arithmetic and conversion tests.
Replaced with the battle-tested convert_to_bb path used by posit<16,1>,
posit<16,2>, and posit<32,2>. Also fixed reciprocal() NaR handling and
enabled regression testing (MANUAL_TESTING 0).
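
The difference between truncating toward zero and rounding to nearest can be shown on plain integers. This is a generic sketch that assumes ties-to-even rounding and C++14; it is neither `float_assign` nor `convert_to_bb`, and both function names are hypothetical.

```cpp
#include <cstdint>

// Drop the low `shift` bits by truncating toward zero (requires shift < 64).
constexpr std::uint64_t drop_bits_truncate(std::uint64_t value, unsigned shift) {
    return value >> shift;
}

// Drop the low `shift` bits, rounding to nearest with ties to even
// (requires shift < 64).
constexpr std::uint64_t drop_bits_round_nearest_even(std::uint64_t value, unsigned shift) {
    if (shift == 0) return value;
    const std::uint64_t kept = value >> shift;
    const std::uint64_t half = std::uint64_t(1) << (shift - 1);
    const std::uint64_t remainder = value & ((std::uint64_t(1) << shift) - 1);
    if (remainder > half) return kept + 1;            // above the midpoint: round up
    if (remainder == half) return kept + (kept & 1);  // tie: round to the even neighbor
    return kept;                                      // below the midpoint: round down
}
```

For the input 11 (binary 1011) with two bits dropped, truncation yields 2 while rounding to nearest yields 3.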
* Fix posit<32,2> RISC-V failure: replace long double float_assign with double
The root cause of the 100% arithmetic failure rate on RISC-V was
float_assign(long double) using std::numeric_limits<long double>::digits
to size the conversion bitblock. On x86, long double is 80-bit (dfbits=63);
on RISC-V, long double is 128-bit quad (dfbits=112), causing convert_to_bb
to instantiate with a different bitblock size that produces wrong results.
Since double's 52 fraction bits exceed posit<32,2>'s maximum 28 fraction
bits, double is more than sufficient. This matches the proven pattern
already used by posit<16,1>, posit<16,2>, and posit<8,2>.
Also fixed reciprocal() NaR handling (same fix as posit<8,2>).
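
The platform dependence comes from `std::numeric_limits<long double>::digits`, which differs between targets, while the value for `double` is the same on the platforms discussed here. A minimal sketch, with hypothetical helper names that are not part of the library:

```cpp
#include <limits>

// Fraction bits of a native floating-point type, computed as digits - 1.
template<typename Real>
constexpr int fraction_bits() {
    return std::numeric_limits<Real>::digits - 1;
}

// double is IEEE-754 binary64 on the platforms discussed, so sizing on it is stable.
static_assert(fraction_bits<double>() == 52, "IEEE-754 binary64 expected");

// Hypothetical check: double is a sufficient conversion source when it holds
// at least as many fraction bits as the target needs.
template<int target_fbits>
constexpr bool double_is_sufficient() {
    return fraction_bits<double>() >= target_fbits;
}
```

Sizing a conversion buffer from `fraction_bits<long double>()` would give 63 on x86 and 112 on a target with quad long double, which is the divergence the fix removes.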
* Add RISC-V 64 cross-compilation CI job with QEMU emulation
Adds a new CI matrix entry that cross-compiles for RISC-V 64 using
g++-riscv64-linux-gnu and runs tests via qemu-riscv64-static. This
catches architecture-specific issues like the long double dfbits
divergence fixed in d0ef1c2.
* Fix RISC-V CI: set QEMU_LD_PREFIX for dynamic linker resolution
The qemu-user-static package registers a binfmt_misc handler that
intercepts RISC-V binaries but runs them without the -L sysroot flag,
causing "Could not open '/lib/ld-linux-riscv64-lp64d.so.1'" errors.
Setting QEMU_LD_PREFIX ensures QEMU finds the RISC-V sysroot regardless
of whether it is invoked via CMAKE_CROSSCOMPILING_EMULATOR or binfmt_misc.
* Fix cfloat sNaN test failures on RISC-V and add POWER CI job
RISC-V, ARM, and POWER architectures quiet sNaN (signaling NaN) on any
FP register contact, so cfloat's sNaN encoding cannot survive a
round-trip through native float/double on these platforms.
- Add UNIVERSAL_SNAN_ROUND_TRIPS_NATIVE_FP macro to architecture.hpp
documenting sNaN behaviour per architecture (only defined on x86)
- Guard sNaN round-trip tests in cfloat_test_suite.hpp and
assignment.cpp: skip sNaN-specific assertions on non-x86 platforms
- Fix gcc_long_double.hpp to_binary/to_triple/color_print for POWER's
128-bit IEEE quad long double (112 fraction bits, no x86 bit63 field)
- Add POWER ppc64le cross-compilation CI job with QEMU emulation
- Add cmake/toolchains/ppc64le-linux-gnu.cmake toolchain file
Tested locally: 389/389 pass on RISC-V, 389/389 pass on POWER,
7/7 cfloat tests pass on native x86.
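
A sketch of how such a guard could be used in test code. The macro name comes from the message above, but the preprocessor condition that defines it here is an assumption, as are the helper names; the actual definition lives in architecture.hpp.

```cpp
#include <cstdint>

// Assumption: define the macro only on x86 targets, as the message describes.
#if !defined(UNIVERSAL_SNAN_ROUND_TRIPS_NATIVE_FP) && \
    (defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86))
#define UNIVERSAL_SNAN_ROUND_TRIPS_NATIVE_FP 1
#endif

// Classify an IEEE-754 binary32 bit pattern using integer operations only,
// so the value never touches an FP register. Assumes the common convention
// where the most significant fraction bit set means "quiet".
inline bool is_signaling_nan_bits(std::uint32_t bits) {
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;
    const std::uint32_t fraction = bits & 0x7FFFFFu;
    const bool quiet_bit = (fraction & 0x400000u) != 0;
    return exponent == 0xFFu && fraction != 0 && !quiet_bit;
}

// Hypothetical helper: tests run sNaN-specific assertions only when this is true.
inline bool should_check_snan_round_trip() {
#ifdef UNIVERSAL_SNAN_ROUND_TRIPS_NATIVE_FP
    return true;
#else
    return false;
#endif
}
```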
* Fix broken cmake install after include directory reorganization (#503)
The include tree was moved from include/universal/ to include/sw/universal/
but the install rules were never updated, causing "file INSTALL cannot find"
errors. Fix the install source path, BUILD_INTERFACE, and include_install_dir
to match the new layout.
---------
Signed-off-by: Theodore Omtzigt <theo@stillwater-sc.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: SD Asif Hossein <101280084+singul4ri7y@users.noreply.github.com>
1 parent 6daf46f commit 1cc5e96
File tree
11 files changed (+247, -134)
- .github/workflows
- cmake/toolchains
- include/sw/universal
  - native/nonconstexpr
  - number/posit/specialized
  - utility
  - verification
- static
  - cfloat/conversion
  - posit/specialized
Lines changed: 61 additions & 2 deletions
Lines changed: 19 additions & 8 deletions