
V3.104: dfloat (IEEE 754-2008 decimal) and hfloat (IBM System/360 hex FP)#518

Merged
Ravenwater merged 10 commits into main from v3.104
Feb 27, 2026
Conversation

Contributor

@Ravenwater Ravenwater commented Feb 26, 2026

Summary

  • dfloat: Complete IEEE 754-2008 decimal floating-point with both BID and DPD encodings, combination field encode/decode, arithmetic (+,-,*,/), DPD codec (all 1000 round-trips verified), math library, standard aliases (decimal32/64/128)
  • hfloat: IBM System/360 hexadecimal floating-point — base-16 exponent, no NaN/inf/subnormals, truncation rounding, wobbling precision, standard aliases (short/long/extended)
  • 17 regression tests covering conversion, logic, arithmetic, and standard formats — all pass with both gcc and clang, zero warnings

Test plan

  • Build and run all 17 tests with gcc
  • Build and run all 17 tests with clang
  • Verify trivially constructible/copyable for both types
  • Verify DPD codec exhaustive 1000-value round-trip
  • Verify BID/DPD encoding agreement on values and arithmetic
  • CI passes on push

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added decimal floating-point (dfloat) with BID/DPD encodings (including decimal128) and IBM hexadecimal floating-point (hfloat) short/long/extended aliases; broad math, classification, formatting, and manipulators support.
  • Documentation

    • New implementation plans, session notes and design docs comparing encodings, behaviors, and test results.
  • Tests

    • Extensive regression suites covering API, arithmetic, conversion, comparison, and codec verification.
  • Chores

    • CI workflows now run on main pushes; build option added to enable hexadecimal float tests.

Ravenwater and others added 5 commits February 26, 2026 13:29
The merge commit on main is a new state that was never tested — PR CI only
validates the branch, not the merged result. Push-to-main builds verify that
the merge commit is green.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…em/360 hex FP)

dfloat: complete IEEE 754-2008 decimal floating-point with BID and DPD encodings,
combination field encode/decode, arithmetic operations, math library, and standard
aliases (decimal32/64/128). DPD codec with exhaustive 1000-value verification.

hfloat: IBM System/360 hexadecimal floating-point with base-16 exponent, no NaN/inf,
truncation rounding, wobbling precision. Standard aliases (short/long/extended).

Both types pass trivially constructible checks and compile with gcc and clang.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dfloat tests: assignment/conversion, comparison operators, addition,
subtraction, multiplication, division, and decimal32 standard format
verification (BID/DPD agreement).

hfloat tests: assignment/conversion, comparison operators, addition,
subtraction, multiplication, division, and short precision format
verification (field widths, no NaN/inf, wobbling precision, truncation).

All 17 tests pass with both gcc and clang.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot commented Feb 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0c777d8 and 61eb2da.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • docs/plans/decimal-hexadecimal-float-implementation.md
  • docs/plans/decimal128-support-for-dfloat.md
  • docs/sessions/2026-02-26_decimal128_support.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.md

📝 Walkthrough

Adds two new numeric systems: dfloat (IEEE 754-2008 decimal with BID/DPD encodings) and hfloat (IBM System/360 hexadecimal float), encoding-aware APIs and math libs, many manipulators/traits/numeric_limits specializations, extensive regression tests, CMake integration (new HFLOAT option, version bump), and CI push-trigger changes.

Changes

Cohort / File(s) Summary
dfloat core & API
include/sw/universal/number/dfloat/dfloat_fwd.hpp, include/sw/universal/number/dfloat/dfloat.hpp, include/sw/universal/number/dfloat/dfloat_impl.hpp
Introduces DecimalEncoding (BID/DPD), changes dfloat signature to dfloat<ndigits,es,Encoding,BlockType>, adds encoding-aware packing/unpacking, wide-significand support for decimal128, full arithmetic, comparisons, IO, and many internal helpers.
dfloat codec & utilities
include/sw/universal/number/dfloat/dpd_codec.hpp, include/sw/universal/number/dfloat/attributes.hpp, include/sw/universal/number/dfloat/manipulators.hpp
Adds constexpr DPD encode/decode utilities, trailing-significand helpers, and encoding-aware attributes/manipulators (type_tag, components, color_print, ranges).
dfloat math, limits & traits
include/sw/universal/number/dfloat/math/*, include/sw/universal/number/dfloat/mathlib.hpp, include/sw/universal/number/dfloat/numeric_limits.hpp, include/sw/universal/traits/dfloat_traits.hpp
Adds ~15 math wrapper headers (exp/log/trig/hyperbolic/hypot/pow/sqrt/next/minmax/...), mathlib aggregator, std::numeric_limits specialization for encoding-aware dfloat, and dfloat trait helpers.
dfloat tests & drivers
static/float/dfloat/...
Adds many regression/test executables (arithmetic, conversion, logic, dpd_codec, standard/decimal32, decimal128) and updates test drivers to exercise new encodings and decimal128 where supported.
hfloat core & API
include/sw/universal/number/hfloat/hfloat_fwd.hpp, include/sw/universal/number/hfloat/hfloat.hpp, include/sw/universal/number/hfloat/hfloat_impl.hpp
Adds new hfloat<ndigits,es,BlockType> type implementing IBM S/360 hex-float semantics (truncation, saturation, no NaN/Inf), full arithmetic, pack/unpack, conversions, IO, and public aliases (hfloat_short/long/extended).
hfloat support & manipulators
include/sw/universal/number/hfloat/exceptions.hpp, include/sw/universal/number/hfloat/attributes.hpp, include/sw/universal/number/hfloat/manipulators.hpp
Adds hfloat-specific exceptions, range/scale helpers, and manipulators (type_tag, components, color_print).
hfloat math, limits & traits
include/sw/universal/number/hfloat/math/*, include/sw/universal/number/hfloat/mathlib.hpp, include/sw/universal/traits/hfloat_traits.hpp, include/sw/universal/number/hfloat/numeric_limits.hpp
Adds ~15 math wrapper headers, copysign, std::numeric_limits specialization for hfloat, and hfloat trait helpers.
hfloat tests & drivers
static/float/hfloat/..., static/float/hfloat/CMakeLists.txt
Adds many hfloat test executables (API, arithmetic, conversion, logic, standard/short) and CMake integration to build them when HFLOATS option enabled.
Build & CI
.github/workflows/cmake.yml, .github/workflows/coverage.yml, CMakeLists.txt
Enables CI push triggers on main (removed branches-ignore / added push event), adds UNIVERSAL_BUILD_NUMBER_HFLOATS CMake option gated by UNIVERSAL_BUILD_NUMBER_STATICS, adds static/float/hfloat subdirectory conditional, and bumps UNIVERSAL_VERSION_MINOR to 104.
Docs & Changelog
CHANGELOG.md, docs/sessions/2026-02-26_dfloat_hfloat_implementation.md, docs/plans/*
Adds detailed changelog entries, implementation/session docs and planning docs for decimal128/dfloat/hfloat design, tradeoffs, and test plans.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes


Poem

🐰 I nibble bits of BID and hop on hex delight,
I pack declets, chase exponents through the night,
DPD and IBM dance in my burrowed heap,
Headers sprout math and tests wake from sleep,
A rabbit cheers—new floats take flight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 16.80%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title 'V3.104: dfloat (IEEE 754-2008 decimal) and hfloat (IBM System/360 hex FP)' accurately and specifically summarizes the main changes: two new numeric types with clear identifiers for their standards.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (22)
static/float/dfloat/logic/logic.cpp (1)

11-22: Minor: REGRESSION_LEVEL_3 and REGRESSION_LEVEL_4 are defined but unused.

The regression level macros are defined but not currently used in the test logic. Consider either adding higher-level tests in the future or removing unused definitions to avoid confusion.

Also, reportTestCases is always true and could be simplified, though keeping it allows easy toggling during development.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/logic/logic.cpp` around lines 11 - 22, Remove or document
unused regression macros and simplify the always-true flag: either delete the
unused macros REGRESSION_LEVEL_3 and REGRESSION_LEVEL_4 from the top of
logic.cpp to avoid confusion, or add use-cases/tests gated by those macros in
the test logic (e.g., inside main where REGRESSION_LEVEL_1/2 are used) so they
are meaningful; likewise, if you don't need runtime toggling, simplify the
boolean reportTestCases by replacing it with a constant or removing it and any
conditional branches that rely on it, ensuring references to reportTestCases,
REGRESSION_LEVEL_3, and REGRESSION_LEVEL_4 are updated accordingly.
include/sw/universal/number/hfloat/math/functions/minmax.hpp (1)

11-21: Consider using native comparison to preserve precision.

Converting to double for comparison works but can lose precision for hfloat_extended (120-bit). Since hfloat already has comparison operators, a direct comparison would be more precise and slightly more efficient:

♻️ Proposed refactor using native comparison
 template<unsigned ndigits, unsigned es, typename bt>
 hfloat<ndigits, es, bt>
 min(hfloat<ndigits, es, bt> x, hfloat<ndigits, es, bt> y) {
-	return hfloat<ndigits, es, bt>(std::min(double(x), double(y)));
+	return (x < y) ? x : y;
 }

 template<unsigned ndigits, unsigned es, typename bt>
 hfloat<ndigits, es, bt>
 max(hfloat<ndigits, es, bt> x, hfloat<ndigits, es, bt> y) {
-	return hfloat<ndigits, es, bt>(std::max(double(x), double(y)));
+	return (x > y) ? x : y;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/math/functions/minmax.hpp` around lines 11
- 21, The min/max functions for hfloat (template<unsigned ndigits, unsigned es,
typename bt> min(...) and max(...)) currently convert operands to double then
compute std::min/std::max, risking precision loss; change them to use hfloat's
native comparison operators instead and return the proper hfloat operand (e.g.,
return (x < y) ? x : y for min and (x > y) ? x : y for max) so no double
conversion occurs and precision is preserved.
include/sw/universal/number/dfloat/dfloat_fwd.hpp (1)

24-26: Minor: Inconsistent parameter passing between abs and fabs.

The abs function takes its parameter by const&, which is appropriate for potentially large types. However, the fabs declaration on line 34 takes by value. Verify this asymmetry is intentional (fabs traditionally takes by value in some implementations).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_fwd.hpp` around lines 24 - 26, The
fabs declaration in dfloat_fwd.hpp is inconsistent with abs: fabs currently
takes its parameter by value while abs takes a const reference; update the fabs
prototype (fabs(const dfloat<ndigits, es, Encoding, BlockType>&)) to take a
const reference to match abs for large-type efficiency and avoid unnecessary
copies, or if the by-value signature was intentional, add a short comment near
the fabs declaration explaining why it must accept the dfloat by value;
reference the functions abs(...) and fabs(...) in dfloat_fwd.hpp when making the
change.
include/sw/universal/number/dfloat/math/functions/minmax.hpp (1)

11-21: Consider comparing dfloat values directly to preserve precision of the returned value.

The current implementation converts both arguments to double, applies std::min/std::max, and reconstructs the result. For high-precision configurations like decimal128 (34 digits), this round-trip through double (~15-17 digits) loses precision unnecessarily. Since dfloat implements comparison operators, returning one of the original arguments preserves the full precision:

♻️ Proposed refactor
 template<unsigned ndigits, unsigned es, DecimalEncoding Encoding, typename bt>
 dfloat<ndigits, es, Encoding, bt>
 min(dfloat<ndigits, es, Encoding, bt> x, dfloat<ndigits, es, Encoding, bt> y) {
-	return dfloat<ndigits, es, Encoding, bt>(std::min(double(x), double(y)));
+	return (x < y) ? x : y;
 }

 template<unsigned ndigits, unsigned es, DecimalEncoding Encoding, typename bt>
 dfloat<ndigits, es, Encoding, bt>
 max(dfloat<ndigits, es, Encoding, bt> x, dfloat<ndigits, es, Encoding, bt> y) {
-	return dfloat<ndigits, es, Encoding, bt>(std::max(double(x), double(y)));
+	return (x < y) ? y : x;
 }

This pattern is already used in ereal (adaptive-precision type) for the same precision preservation rationale.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/math/functions/minmax.hpp` around lines 11
- 21, The min/max implementations convert dfloat to double and lose precision;
change template functions min(...) and max(...) to compare the dfloat arguments
directly using their comparison operators and return the original x or y (e.g.,
if (y < x) return y; else return x for min, and if (x < y) return y; else return
x for max) so the returned value preserves full dfloat precision. Ensure you use
the existing dfloat comparison semantics (handle equality by returning one
operand consistently) and replace the std::min/std::max(double(...)) round-trip
with direct returns of the input objects.
include/sw/universal/number/dfloat/dpd_codec.hpp (2)

294-303: Document custom encoding for partial digit groups.

The remainder digit encoding (1-2 digits) uses a custom packed BCD format rather than standard DPD declets. While the implementation is consistent between encode and decode, adding a brief comment would clarify this is intentional for configurations where (ndigits-1) % 3 != 0.

📝 Suggested documentation
 	// Handle remaining 1 or 2 digits
+	// Note: Partial groups use simple packed BCD, not DPD encoding
 	if (remaining_digits == 2) {
 		unsigned d1 = static_cast<unsigned>((trailing / 10) % 10);
 		unsigned d0 = static_cast<unsigned>(trailing % 10);
-		// 2 digits encoded in 7 bits
+		// 2 BCD digits packed: d1 in bits [7:4], d0 in bits [3:0]
 		result |= (static_cast<uint64_t>((d1 << 4) | d0) << shift);
 	}
 	else if (remaining_digits == 1) {
-		// 1 digit encoded in 4 bits
+		// 1 BCD digit in bits [3:0]
 		result |= (static_cast<uint64_t>(trailing & 0xF) << shift);
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dpd_codec.hpp` around lines 294 - 303, Add
a brief comment above the partial-group encoding branch in dpd_codec.hpp to
document that when remaining_digits is 1 or 2 the implementation deliberately
uses a custom packed BCD format (not standard DPD declets) for the trailing
value; reference the variables remaining_digits, trailing, result and shift and
state that this packing (4 bits for one digit, 7 bits for two digits) is
intentional and mirrored by the decoder for configurations where (ndigits-1) % 3
!= 0 to ensure clarity for future readers.

32-104: Dead code: encode_declet function always returns 0.

This function contains abandoned implementation attempts with multiple commented-out approaches and ultimately returns 0 (line 103). Since the actual encoding is handled by dpd_encode_3digits below, this function appears to be vestigial development code.

♻️ Suggested fix: Remove dead code
 namespace dpd_detail {
 
-// Encode 3 BCD digits (d0, d1, d2) into a 10-bit DPD declet
-// d0 = hundreds digit, d1 = tens digit, d2 = units digit
-// All digits must be 0-9
-static constexpr uint16_t encode_declet(unsigned d0, unsigned d1, unsigned d2) {
-	// ... (entire function body)
-	return result; // This function is not actually used; we use the table below
-}
-
 // Generate the full 1000-entry encode table and 1024-entry decode table
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dpd_codec.hpp` around lines 32 - 104, The
encode_declet function is dead code (always returns 0) and duplicates
dpd_encode_3digits; remove the vestigial implementation to avoid confusion.
Delete the static constexpr uint16_t encode_declet(...) definition (and any
prototype) and, if any code referenced it, replace those calls with
dpd_encode_3digits(...) (or add a trivial inline wrapper that forwards to
dpd_encode_3digits) and run a build to ensure no references remain. Ensure
tests/uses of encode_declet are updated to use dpd_encode_3digits so behavior is
unchanged.
include/sw/universal/number/dfloat/dfloat_impl.hpp (1)

1006-1009: Note: Comparison via double may lose precision for large decimal significands.

The operator< comparison delegates to double(lhs) < double(rhs) which can lose precision for values with more than ~15 significant digits. This is acceptable for decimal32 (7 digits) but may cause incorrect ordering for larger formats.

Consider implementing direct decimal comparison for precision-critical use cases with decimal64/128.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 1006 - 1009,
The operator< implementation currently casts operands via double(lhs) <
double(rhs), which can lose precision for larger decimal formats; update
operator< in dfloat_impl.hpp (the operator< overload) to perform a direct
decimal comparison instead of IEEE double conversion: normalize both dfloat
values to a common exponent (or compare signs, exponents, and integer
significands after adjusting for exponent differences), compare signs first,
then compare adjusted significands using integer arithmetic (big integer or
128-bit integer) to avoid precision loss for decimal64/128 formats; keep the
double fallback only for non-critical or performance paths and ensure edge-cases
(NaN, infinities, zero with differing signs) are handled consistently with
existing semantics.
static/float/dfloat/conversion/assignment.cpp (1)

47-56: Floating-point round-trip test may have false negatives.

The comparison back != v at line 51 compares doubles directly. If the double-to-decimal32 conversion rounds to the format's 7-digit working precision, values that need more digits than decimal32 provides may round-trip to a slightly different double, causing test failures that don't indicate a bug.

💡 Consider using values exactly representable in decimal32
 	{
 		using Decimal32 = dfloat<7, 6, DecimalEncoding::BID, uint32_t>;
-		double values[] = { 0.0, 1.0, -1.0, 0.5, -0.5, 0.25, 0.125, 42.0, -42.0, 1e6, -1e6 };
+		// Use values exactly representable in decimal (terminate in decimal)
+		double values[] = { 0.0, 1.0, -1.0, 0.5, -0.5, 0.25, 0.2, 42.0, -42.0, 1e6, -1e6 };
 		for (double v : values) {

Note: 0.125 = 1/8 is exact in both binary and decimal (every finite binary fraction has a finite decimal expansion). 0.2 is exactly representable in decimal but not in binary; both directions of that asymmetry could show interesting behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/conversion/assignment.cpp` around lines 47 - 56, The
round-trip test in assignment.cpp uses a strict equality check (back != v) on
doubles which yields false negatives for values not exactly representable in
Decimal32; update the test in the loop that constructs Decimal32 a(v) and
computes double back to either (a) compare with a tolerance/ULP check (implement
a nearlyEqual function and replace back != v with !nearlyEqual(back, v)) or (b)
restrict the values[] array to values known to be exactly representable in
Decimal32 (e.g., integers and scaled powers of ten within Decimal32’s
precision); reference the Decimal32 constructor and the back variable in your
change.
static/float/dfloat/standard/dpd_codec.cpp (1)

114-125: Test 5 doesn't verify round-trip correctness.

The DPD dfloat round-trip test constructs values and prints them but doesn't assert correctness. Consider adding verification that the converted double matches the input.

💚 Suggested verification
 	// Test 5: DPD dfloat round-trip through value construction
 	std::cout << "+---------    DPD dfloat value round-trip\n";
 	{
 		using DpdFloat = dfloat<7, 6, DecimalEncoding::DPD, uint32_t>;
 		DpdFloat a(42.0);
 		double d = double(a);
 		std::cout << "  DPD dfloat<7,6>(42.0) = " << d << " : " << to_binary(a) << '\n';
+		if (d != 42.0) {
+			std::cerr << "FAIL: DPD dfloat round-trip for 42.0: got " << d << '\n';
+			++nrOfFailedTestCases;
+		}
 
 		DpdFloat b(0.1);
 		double db = double(b);
 		std::cout << "  DPD dfloat<7,6>(0.1) = " << db << " : " << to_binary(b) << '\n';
+		if (db != 0.1) {
+			std::cerr << "FAIL: DPD dfloat round-trip for 0.1: got " << db << '\n';
+			++nrOfFailedTestCases;
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/standard/dpd_codec.cpp` around lines 114 - 125, The test
prints DPD dfloat round-trip values but does not verify correctness; add
assertions after converting DpdFloat instances to double (e.g., after double(a)
and double(b)) to check the round-trip equals the original inputs. For DpdFloat
a(42.0) assert that d == 42.0 (exact compare) and for DpdFloat b(0.1) assert
that db equals 0.1 using an appropriate tolerance (e.g., fabs(db - 0.1) <
epsilon) to account for floating-point rounding; place these checks immediately
after the conversions for variables a, b (and keep to_binary output for debug
only).
include/sw/universal/number/hfloat/math/functions/sqrt.hpp (1)

12-21: Consider adding inline specifier.

For consistency with other math function templates and to explicitly mark the function for potential multiple-definition scenarios, consider adding inline.

Suggested fix
 template<unsigned ndigits, unsigned es, typename bt>
-hfloat<ndigits, es, bt> sqrt(const hfloat<ndigits, es, bt>& a) {
+inline hfloat<ndigits, es, bt> sqrt(const hfloat<ndigits, es, bt>& a) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/math/functions/sqrt.hpp` around lines 12 -
21, Add the inline specifier to the templated sqrt function to avoid
multiple-definition/linkage issues in header-only code: update the function
declaration for template<unsigned ndigits, unsigned es, typename bt>
hfloat<ndigits, es, bt> sqrt(const hfloat<ndigits, es, bt>& a) to be inline,
keeping the body logic for negative check, zero return, and std::sqrt conversion
unchanged (refer to the sqrt function template and the hfloat<ndigits, es, bt>
type).
include/sw/universal/number/hfloat/math/functions/truncate.hpp (1)

29-33: Minor documentation inaccuracy for ceil.

The comment says "smallest integral value that is greater than x", but std::ceil returns the "smallest integral value not less than x". For example, ceil(3.0) returns 3.0, not a value greater than 3.0.

Suggested fix
-// Round x upward, returning the smallest integral value that is greater than x
+// Round x upward, returning the smallest integral value that is not less than x
 template<unsigned ndigits, unsigned es, typename bt>
 hfloat<ndigits, es, bt> ceil(hfloat<ndigits, es, bt> x) {
 	return hfloat<ndigits, es, bt>(std::ceil(double(x)));
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/math/functions/truncate.hpp` around lines
29 - 33, The comment for the ceil function is incorrect: update the
documentation for template<unsigned ndigits, unsigned es, typename bt>
hfloat<ndigits, es, bt> ceil(hfloat<ndigits, es, bt> x) to state that it returns
the "smallest integral value not less than x" (i.e., rounds x upward but returns
x itself when x is already integral) instead of saying "greater than x"; no code
changes required.
include/sw/universal/number/dfloat/math/functions/classify.hpp (1)

13-16: Minor inconsistency: fpclassify lacks inline specifier.

All other functions in this file (isfinite, isinf, isnan, iszero) have inline specifiers, but fpclassify does not. For consistency and to avoid potential ODR violations when included in multiple translation units, consider adding inline.

Suggested fix
 template<unsigned ndigits, unsigned es, DecimalEncoding Encoding, typename bt>
-int fpclassify(const dfloat<ndigits, es, Encoding, bt>& a) {
+inline int fpclassify(const dfloat<ndigits, es, Encoding, bt>& a) {
 	return std::fpclassify(double(a));
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/math/functions/classify.hpp` around lines
13 - 16, Add the missing inline specifier to the template function fpclassify so
it matches the other functions; update the declaration/definition of
fpclassify(const dfloat<ndigits, es, Encoding, bt>& a) to be inline and keep the
existing body that calls std::fpclassify(double(a)); this ensures consistency
with isfinite/isinf/isnan/iszero and avoids ODR issues for the template
function.
static/float/dfloat/arithmetic/subtraction.cpp (1)

39-48: Consider potential precision issues with direct double comparison.

The comparison d != tc.expected (Line 43) uses direct floating-point equality. While this works for the current test cases (which use integer values and simple fractions), it may fail for edge cases with precision loss. For more robust testing, consider using an epsilon-based comparison or ensuring test values remain exactly representable.

This is acceptable for the current test scope but worth noting for future expansion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/arithmetic/subtraction.cpp` around lines 39 - 48, The
test compares the double-converted result (variable d from Decimal32 result) to
tc.expected using direct equality, which is fragile for floating-point; change
the check in the loop that iterates over cases to use an epsilon-based
comparison (e.g., fabs(d - tc.expected) > eps or a combined relative+absolute
tolerance) instead of d != tc.expected, and keep the same failure/reporting
logic that increments nrOfFailedTestCases and prints when reportTestCases is
true; pick a small eps (for example 1e-12) appropriate for Decimal32 precision.
static/float/hfloat/api/api.cpp (1)

74-86: Convert expectation-only prints into failing assertions.

This block logs expected behavior but does not fail the test for isnan/isinf and qnan mapping mismatches, which can hide regressions.

Suggested refactor
 		Real a(1.0f);
-		std::cout << "isnan(1.0)  : " << a.isnan() << " (should be 0)\n";
-		std::cout << "isinf(1.0)  : " << a.isinf() << " (should be 0)\n";
+		std::cout << "isnan(1.0)  : " << a.isnan() << " (should be 0)\n";
+		std::cout << "isinf(1.0)  : " << a.isinf() << " (should be 0)\n";
+		if (a.isnan() || a.isinf()) {
+			std::cerr << "FAIL: finite value reported as NaN/inf\n";
+			++nrOfFailedTestCases;
+		}
 
 		// NaN request maps to zero
 		Real qnan(SpecificValue::qnan);
 		std::cout << "qnan maps to: " << qnan << " (should be 0)\n";
+		if (!qnan.iszero()) {
+			std::cerr << "FAIL: SpecificValue::qnan should map to zero\n";
+			++nrOfFailedTestCases;
+		}

As per coding guidelines, "Each number system regression suite should provide exhaustive validation of assignment, conversion, arithmetic, logic, exceptions, number traits, and special cases".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/hfloat/api/api.cpp` around lines 74 - 86, Replace the
informative prints with failing checks that increment nrOfFailedTestCases when
expectations are not met: call Real a(1.0f) and assert !a.isnan() and !a.isinf()
(increment nrOfFailedTestCases on failure), construct Real
qnan(SpecificValue::qnan) and assert qnan == Real(0) (increment on mismatch),
and assert Real(SpecificValue::infpos) == Real(SpecificValue::maxpos) (increment
nrOfFailedTestCases when false); use the existing symbols isnan(), isinf(),
qnan, SpecificValue::qnan, SpecificValue::infpos, mp, pinf and
nrOfFailedTestCases so the test fails instead of only printing when expectations
are violated.
static/float/dfloat/arithmetic/addition.cpp (1)

11-14: Unused macro definitions.

The REGRESSION_LEVEL_* macros are defined but never referenced in the test file. Consider removing them or implementing regression level gating for the test clusters if that's the intended use.

🧹 Remove unused macros
-#define REGRESSION_LEVEL_1 1
-#define REGRESSION_LEVEL_2 1
-#define REGRESSION_LEVEL_3 0
-#define REGRESSION_LEVEL_4 0
-
 int main()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/arithmetic/addition.cpp` around lines 11 - 14, The four
unused macros REGRESSION_LEVEL_1, REGRESSION_LEVEL_2, REGRESSION_LEVEL_3, and
REGRESSION_LEVEL_4 are defined but never used; either remove these macro
definitions from addition.cpp or actually gate the test clusters using them
(e.g., wrap the relevant test blocks with `#if/`#ifdef checks against
REGRESSION_LEVEL_* or add conditional logic that references REGRESSION_LEVEL_*
in the test setup), ensuring you keep consistent naming (REGRESSION_LEVEL_1..4)
so any future regression-level toggles will compile and behave as intended.
static/float/hfloat/arithmetic/division.cpp (1)

85-96: Division-by-zero test doesn't assert expected behavior.

This test prints a note when 1/0 doesn't equal maxpos, but doesn't increment nrOfFailedTestCases. If maxpos saturation is the documented behavior for hfloat division-by-zero, consider making this an actual assertion. If multiple behaviors are acceptable, consider adding a comment explaining why the test intentionally allows them.

🔧 Suggested fix to make the test explicit
 	// Test 4: Division by zero saturates (no inf in hfloat)
 	std::cout << "+---------    Division by zero behavior\n";
 	{
 		HfloatShort one(1), zero(0);
 		HfloatShort result = one / zero;
 		// hfloat has no inf; dividing by zero should saturate to maxpos
 		HfloatShort mp(SpecificValue::maxpos);
 		if (double(result) != double(mp)) {
-			// accept either maxpos saturation or some defined behavior
-			std::cout << "  Note: 1/0 = " << double(result) << " (maxpos = " << double(mp) << ")\n";
+			++nrOfFailedTestCases;
+			if (reportTestCases) std::cerr << "FAIL: 1/0 = " << double(result) 
+			                               << " (expected maxpos = " << double(mp) << ")\n";
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/hfloat/arithmetic/division.cpp` around lines 85 - 96, The test
prints a note when HfloatShort result of one/zero doesn't equal the saturated
value mp but does not mark the test as failed; update the division-by-zero check
in the block that constructs HfloatShort one, zero, result and mp
(SpecificValue::maxpos) so that either it asserts the expected saturation
(increment nrOfFailedTestCases or call the existing test-fail helper) when
double(result) != double(mp) to make the behavior explicit, or if multiple
behaviors are acceptable, replace the print with a clear comment documenting
allowed outcomes and leave nrOfFailedTestCases unchanged; locate symbols
HfloatShort, result, mp, and nrOfFailedTestCases to implement the change.
include/sw/universal/number/hfloat/hfloat_fwd.hpp (2)

8-16: Missing <string> include for std::string usage.

The parse function declaration uses std::string but the header only includes <cstdint>. Forward declaration headers are typically included early in the include chain, so they should be self-contained for the types they reference.

Proposed fix
 #pragma once
 // hfloat_fwd.hpp : forward declarations of the IBM System/360 hexadecimal floating-point environment
 //
 // Copyright (C) 2017 Stillwater Supercomputing, Inc.
 // SPDX-License-Identifier: MIT
 //
 // This file is part of the universal numbers project, which is released under an MIT Open Source license.
 #include <cstdint>
+#include <string>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/hfloat_fwd.hpp` around lines 8 - 16, The
header declares parse(const std::string& number, hfloat<ndigits, es,
BlockType>&) but doesn't include <string>, so add #include <string> at the top
of this header to make it self-contained; update the include list near the
existing #include <cstdint> and ensure no other symbols are missing for the
forward declaration of template class hfloat and the parse function.

18-28: Inconsistent parameter passing style between abs and fabs.

abs takes a const hfloat& (by reference) while fabs takes hfloat (by value). For consistency and to match the implementation pattern used elsewhere in the library, consider using the same parameter style for both.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/hfloat_fwd.hpp` around lines 18 - 28, The
declarations for the hfloat helpers are inconsistent: abs and sqrt accept a
const hfloat<ndigits, es, BlockType>& but fabs takes hfloat<ndigits, es,
BlockType> by value; change the fabs declaration to accept a const
hfloat<ndigits, es, BlockType>& to match abs/sqrt, and update any corresponding
fabs definitions/implementations to use the same const reference signature so
declaration and implementation remain consistent (look for the template<unsigned
ndigits, unsigned es, typename BlockType> hfloat<ndigits, es, BlockType>
fabs(...) symbols to update).
include/sw/universal/number/hfloat/attributes.hpp (2)

30-41: Consider more descriptive variable names in dynamic_range().

The single-letter variable names b, c, d, e obscure the purpose. Using maxneg, minneg, minpos, maxpos would improve readability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/attributes.hpp` around lines 30 - 41, The
dynamic_range function uses opaque single-letter locals (b,c,d,e) which hurt
readability; rename them to descriptive identifiers (e.g., maxneg, minneg,
minpos, maxpos) throughout the template<unsigned ndigits, unsigned es, typename
bt> std::string dynamic_range(const hfloat<ndigits, es, bt>& a) implementation,
updating their construction (currently hfloat<ndigits, es, bt>
b(SpecificValue::maxneg) etc.) and all subsequent uses in stream output and
to_binary calls so the semantics remain identical but the code is clearer.

43-59: Parameter b is unused in scale helper functions.

In minpos_scale, maxpos_scale, and max_negative_scale, the parameter b is copied to c but never actually used—only the template parameters are needed. The functions could be simplified to not require an instance.

Example simplification for minpos_scale
 template<unsigned ndigits, unsigned es, typename bt>
-int minpos_scale(const hfloat<ndigits, es, bt>& b) {
-	hfloat<ndigits, es, bt> c(b);
-	return c.minpos().scale();
+int minpos_scale(const hfloat<ndigits, es, bt>& = {}) {
+	hfloat<ndigits, es, bt> v;
+	return v.minpos().scale();
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/attributes.hpp` around lines 43 - 59, The
three helpers minpos_scale, maxpos_scale, and max_negative_scale currently
accept a parameter b that is copied but unused; change each function to drop the
unused parameter and compute the scale from a default-constructed hfloat of the
template type instead. Specifically, update the signatures of minpos_scale,
maxpos_scale, and max_negative_scale to take no arguments and return
hfloat<ndigits, es, bt>().minpos().scale(), hfloat<...>().maxpos().scale(), and
hfloat<...>().maxneg().scale() respectively (referencing the functions minpos(),
maxpos(), maxneg() and the template type hfloat<ndigits, es, bt> to locate the
code).
include/sw/universal/number/dfloat/math/functions/truncate.hpp (1)

11-33: Precision loss when dfloat exceeds double's range/precision.

These truncation functions convert to double before applying std:: functions. For decimal128 configurations (34 digits), values may exceed double's precision (~15-17 decimal digits), leading to incorrect results for large or high-precision values.

This is acceptable for an initial implementation but consider adding a comment noting this limitation, or implementing native decimal truncation for higher-precision configurations in the future.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/math/functions/truncate.hpp` around lines
11 - 33, The trunc/round/floor/ceil functions (templates trunc, round, floor,
ceil) convert dfloat to double which loses precision for high-precision configs
(e.g., decimal128); update these functions to include a clear comment above each
(or a single shared comment block) that documents the precision/range limitation
when converting to double and add a TODO note to implement native decimal
truncation/rounding for ndigits > double precision (e.g., decimal128) in the
future so callers are aware of the current limitation.
include/sw/universal/number/hfloat/hfloat_impl.hpp (1)

707-712: parse function is unimplemented.

The parse function always returns false, causing operator>> to always fail with an error message. This makes input streaming non-functional.

Would you like me to generate an implementation for hex string parsing, or should I open an issue to track this task?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/hfloat_impl.hpp` around lines 707 - 712,
The parse template function for hfloat (parse<unsigned ndigits, unsigned es,
typename BlockType>) is currently a stub that returns false, breaking
operator>>; implement full hex-floating-point parsing inside parse: accept
optional sign, "0x" prefix, hex significand with optional '.', parse the hex
digits into the hfloat's significand bits (handle rounding/truncation to
ndigits), parse the binary exponent after 'p'/'P' and adjust the unbiased
exponent for the hfloat exponent field (es), and handle special cases (zero,
subnormal/denormal, overflow->infinity, underflow->zero, and NaN/infinite
literal forms); on successful conversion set the provided hfloat value and
return true, otherwise return false. Use existing hfloat internals
(constructors/setters or bit-field accessors in this header) and keep behavior
consistent with operator>> expectations so streaming succeeds when parse returns
true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/sessions/2026-02-26_dfloat_hfloat_implementation.md`:
- Around line 106-109: The fenced code block showing the test output ("100%
tests passed, 0 tests failed out of 17 (gcc)" / "100% tests passed, 0 tests
failed out of 17 (clang)") must include a language identifier for proper
markdown rendering; update the block delimiter from ``` to ```text (or
```console) so the snippet becomes a fenced code block with a language specifier
and passes linting.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp`:
- Around line 78-80: pow10_64 silently returns 0 for n >= 20 which can cause
subtle bugs for large decimal configurations (e.g., decimal128/34 digits);
update the function (or surrounding compile-time checks) to prevent misuse by
adding a static_assert or explicit guard that fails when a configuration
requires pow10_64 beyond the _pow10_table range, or provide an alternative
implementation using wider types (e.g., __uint128_t) for larger exponents;
reference the function name pow10_64 and the table _pow10_table when locating
the code and ensure the check triggers at compile time where possible
(static_assert) or at initialization if not.
- Around line 267-276: The exponent-alignment loops multiplying aligned_lhs /
aligned_rhs by 10 for each unit of shift can overflow the uint64_t accumulator;
update the alignment logic in the function that sets result_exp, lhs_exp,
rhs_exp, shift, aligned_lhs and aligned_rhs to guard against overflow by either
(a) clamping shift to a safe maximum before scaling, (b) performing the multiply
in a wider type (use __uint128_t when available for the intermediate multiply
and then check if result fits back into uint64_t), or (c) short-circuiting when
the magnitude difference is so large that the smaller term is effectively zero;
implement one of these strategies and add a clear check around the for-loops
that multiplies by 10 to avoid UB from overflow.

In `@include/sw/universal/number/dfloat/manipulators.hpp`:
- Around line 68-84: components() uses unpack() which returns the significand as
uint64_t, truncating values for ndigits > 19; fix by changing unpack() (and any
callers like components()) to return a big integer or string capable of holding
up to ndigits decimal digits (e.g., use boost::multiprecision::cpp_int or
std::string) and adjust the unpack signature and implementation to compute and
return the full significand without relying on pow10_64; update components(const
dfloat<...>&) to accept and print that big-integer/string significand instead of
uint64_t so decimal128 and other high-precision dfloat instances are represented
correctly.

In `@include/sw/universal/number/hfloat/hfloat_impl.hpp`:
- Around line 718-723: The equality operator operator== for hfloat<ndigits, es,
BlockType> must not convert to double (lossy for extended formats like
hfloat_extended); replace the double-based comparison with a direct
block-by-block comparison of the underlying storage (compare the _block
array/vector contents and any sign/exp/fraction metadata) after handling the
iszero() fast path; to access protected/private storage either expose a getter
for the block storage or add a friend declaration for the free function
operator== so it can iterate and compare each BlockType element exactly,
ensuring identical bit patterns (and treating NaNs/zeros per existing
semantics).
- Line 137: The operator= overload hfloat& operator=(char rhs) uses
convert_unsigned which misinterprets negative char values on platforms where
char is signed; change this to handle char's signedness explicitly by using
std::is_signed<char> to pick the right conversion: call
convert_signed(static_cast<int64_t>(rhs)) when char is signed, otherwise call
convert_unsigned(static_cast<uint64_t>(rhs)); reference the operator= for char
and the helper functions convert_signed/convert_unsigned to implement the
conditional at compile time.
- Around line 182-189: The code in hfloat_impl.hpp that computes alignment uses
`shift` to right-shift `aligned_rhs`/`aligned_lhs` (affecting variables
aligned_rhs, aligned_lhs, result_exp, lhs_exp, rhs_exp) but does not guard
against shifts >= word width, causing UB when `shift * 4` >= 64; clamp the shift
amount to a safe maximum (e.g., if (shift*4) >= 64 then treat as full loss of
precision and either set the smaller operand to zero or return the dominant
operand’s value) and only perform the right-shift when the computed shift in
bits is < 64, otherwise set aligned_* to 0 (or short-circuit to return
lhs_exp/rhs_exp as appropriate) to avoid undefined behavior.

In `@include/sw/universal/number/hfloat/manipulators.hpp`:
- Around line 68-82: The current components(...) calls unpack(...) which writes
the full fraction into a uint64_t causing UB for formats with >64 fraction bits
(e.g., hfloat_extended); update the fix by either modifying unpack in
hfloat_impl.hpp to provide a safe multi-block fraction representation (e.g.,
return/accept an array or vector of 64-bit words) or add a new method (e.g.,
unpack_large or unpack_multiword) that components can call for ndigits*4 > 64,
and use that multiword value when composing hex digits; alternatively, if you
prefer to reject unsupported sizes, add an assertion in unpack that fbits <= 64
and in components check ndigits and fail/ASSERT with a clear message referencing
hfloat<ndigits, es, bt>::unpack and hfloat_extended so callers know extended
precision isn't supported by the single-word unpack.

In `@include/sw/universal/number/hfloat/math/functions/sqrt.hpp`:
- Around line 14-18: The non-throwing branch in sqrt (check a.isneg() under
HFLOAT_THROW_ARITHMETIC_EXCEPTION) only logs the error but continues execution;
add an early return after the std::cerr line to avoid taking std::sqrt on a
negative value. Update the non-throwing path in sqrt.hpp so that when a.isneg()
is true you log the error and then return a sentinel (for example return a; or
return hfloat(0);) — reference the sqrt function and the a.isneg() check and
mirror the throwing branch behavior but with a safe early return.

In `@static/float/hfloat/arithmetic/addition.cpp`:
- Around line 93-105: The test block that checks truncation of division (uses
HfloatShort a, b, third = a / HfloatShort(3), and d3) is misplaced in
addition.cpp; remove this division-specific test or move it into the division
test file (division.cpp) where other division tests live; alternatively replace
it with an addition-specific truncation test that exercises precision loss when
adding a very small HfloatShort to a very large one (keep the same reporting
pattern using nrOfFailedTestCases and reportTestCases). Ensure you update or
remove the variables a, b, third, and the division expression so the file only
contains addition-related checks.

---

Nitpick comments:
In `@include/sw/universal/number/dfloat/dfloat_fwd.hpp`:
- Around line 24-26: The fabs declaration in dfloat_fwd.hpp is inconsistent with
abs: fabs currently takes its parameter by value while abs takes a const
reference; update the fabs prototype (fabs(const dfloat<ndigits, es, Encoding,
BlockType>&)) to take a const reference to match abs for large-type efficiency
and avoid unnecessary copies, or if the by-value signature was intentional, add
a short comment near the fabs declaration explaining why it must accept the
dfloat by value; reference the functions abs(...) and fabs(...) in
dfloat_fwd.hpp when making the change.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp`:
- Around line 1006-1009: The operator< implementation currently casts operands
via double(lhs) < double(rhs), which can lose precision for larger decimal
formats; update operator< in dfloat_impl.hpp (the operator< overload) to perform
a direct decimal comparison instead of IEEE double conversion: normalize both
dfloat values to a common exponent (or compare signs, exponents, and integer
significands after adjusting for exponent differences), compare signs first,
then compare adjusted significands using integer arithmetic (big integer or
128-bit integer) to avoid precision loss for decimal64/128 formats; keep the
double fallback only for non-critical or performance paths and ensure edge-cases
(NaN, infinities, zero with differing signs) are handled consistently with
existing semantics.

In `@include/sw/universal/number/dfloat/dpd_codec.hpp`:
- Around line 294-303: Add a brief comment above the partial-group encoding
branch in dpd_codec.hpp to document that when remaining_digits is 1 or 2 the
implementation deliberately uses a custom packed BCD format (not standard DPD
declets) for the trailing value; reference the variables remaining_digits,
trailing, result and shift and state that this packing (4 bits for one digit, 7
bits for two digits) is intentional and mirrored by the decoder for
configurations where (ndigits-1) % 3 != 0 to ensure clarity for future readers.
- Around line 32-104: The encode_declet function is dead code (always returns 0)
and duplicates dpd_encode_3digits; remove the vestigial implementation to avoid
confusion. Delete the static constexpr uint16_t encode_declet(...) definition
(and any prototype) and, if any code referenced it, replace those calls with
dpd_encode_3digits(...) (or add a trivial inline wrapper that forwards to
dpd_encode_3digits) and run a build to ensure no references remain. Ensure
tests/uses of encode_declet are updated to use dpd_encode_3digits so behavior is
unchanged.

In `@include/sw/universal/number/dfloat/math/functions/classify.hpp`:
- Around line 13-16: Add the missing inline specifier to the template function
fpclassify so it matches the other functions; update the declaration/definition
of fpclassify(const dfloat<ndigits, es, Encoding, bt>& a) to be inline and keep
the existing body that calls std::fpclassify(double(a)); this ensures
consistency with isfinite/isinf/isnan/iszero and avoids ODR issues for the
template function.

In `@include/sw/universal/number/dfloat/math/functions/minmax.hpp`:
- Around line 11-21: The min/max implementations convert dfloat to double and
lose precision; change template functions min(...) and max(...) to compare the
dfloat arguments directly using their comparison operators and return the
original x or y (e.g., if (y < x) return y; else return x for min, and if (x <
y) return y; else return x for max) so the returned value preserves full dfloat
precision. Ensure you use the existing dfloat comparison semantics (handle
equality by returning one operand consistently) and replace the
std::min/std::max(double(...)) round-trip with direct returns of the input
objects.

In `@include/sw/universal/number/dfloat/math/functions/truncate.hpp`:
- Around line 11-33: The trunc/round/floor/ceil functions (templates trunc,
round, floor, ceil) convert dfloat to double which loses precision for
high-precision configs (e.g., decimal128); update these functions to include a
clear comment above each (or a single shared comment block) that documents the
precision/range limitation when converting to double and add a TODO note to
implement native decimal truncation/rounding for ndigits > double precision
(e.g., decimal128) in the future so callers are aware of the current limitation.

In `@include/sw/universal/number/hfloat/attributes.hpp`:
- Around line 30-41: The dynamic_range function uses opaque single-letter locals
(b,c,d,e) which hurt readability; rename them to descriptive identifiers (e.g.,
maxneg, minneg, minpos, maxpos) throughout the template<unsigned ndigits,
unsigned es, typename bt> std::string dynamic_range(const hfloat<ndigits, es,
bt>& a) implementation, updating their construction (currently hfloat<ndigits,
es, bt> b(SpecificValue::maxneg) etc.) and all subsequent uses in stream output
and to_binary calls so the semantics remain identical but the code is clearer.
- Around line 43-59: The three helpers minpos_scale, maxpos_scale, and
max_negative_scale currently accept a parameter b that is copied but unused;
change each function to drop the unused parameter and compute the scale from a
default-constructed hfloat of the template type instead. Specifically, update
the signatures of minpos_scale, maxpos_scale, and max_negative_scale to take no
arguments and return hfloat<ndigits, es, bt>().minpos().scale(),
hfloat<...>().maxpos().scale(), and hfloat<...>().maxneg().scale() respectively
(referencing the functions minpos(), maxpos(), maxneg() and the template type
hfloat<ndigits, es, bt> to locate the code).

In `@include/sw/universal/number/hfloat/hfloat_fwd.hpp`:
- Around line 8-16: The header declares parse(const std::string& number,
hfloat<ndigits, es, BlockType>&) but doesn't include <string>, so add #include
<string> at the top of this header to make it self-contained; update the include
list near the existing #include <cstdint> and ensure no other symbols are
missing for the forward declaration of template class hfloat and the parse
function.
- Around line 18-28: The declarations for the hfloat helpers are inconsistent:
abs and sqrt accept a const hfloat<ndigits, es, BlockType>& but fabs takes
hfloat<ndigits, es, BlockType> by value; change the fabs declaration to accept a
const hfloat<ndigits, es, BlockType>& to match abs/sqrt, and update any
corresponding fabs definitions/implementations to use the same const reference
signature so declaration and implementation remain consistent (look for the
template<unsigned ndigits, unsigned es, typename BlockType> hfloat<ndigits, es,
BlockType> fabs(...) symbols to update).

In `@include/sw/universal/number/hfloat/hfloat_impl.hpp`:
- Around line 707-712: The parse template function for hfloat (parse<unsigned
ndigits, unsigned es, typename BlockType>) is currently a stub that returns
false, breaking operator>>; implement full hex-floating-point parsing inside
parse: accept optional sign, "0x" prefix, hex significand with optional '.',
parse the hex digits into the hfloat's significand bits (handle
rounding/truncation to ndigits), parse the binary exponent after 'p'/'P' and
adjust the unbiased exponent for the hfloat exponent field (es), and handle
special cases (zero, subnormal/denormal, overflow->infinity, underflow->zero,
and NaN/infinite literal forms); on successful conversion set the provided
hfloat value and return true, otherwise return false. Use existing hfloat
internals (constructors/setters or bit-field accessors in this header) and keep
behavior consistent with operator>> expectations so streaming succeeds when
parse returns true.

In `@include/sw/universal/number/hfloat/math/functions/minmax.hpp`:
- Around line 11-21: The min/max functions for hfloat (template<unsigned
ndigits, unsigned es, typename bt> min(...) and max(...)) currently convert
operands to double then compute std::min/std::max, risking precision loss;
change them to use hfloat's native comparison operators instead and return the
proper hfloat operand (e.g., return (x < y) ? x : y for min and (x > y) ? x : y
for max) so no double conversion occurs and precision is preserved.

In `@include/sw/universal/number/hfloat/math/functions/sqrt.hpp`:
- Around line 12-21: Add the inline specifier to the templated sqrt function to
avoid multiple-definition/linkage issues in header-only code: update the
function declaration for template<unsigned ndigits, unsigned es, typename bt>
hfloat<ndigits, es, bt> sqrt(const hfloat<ndigits, es, bt>& a) to be inline,
keeping the body logic for negative check, zero return, and std::sqrt conversion
unchanged (refer to the sqrt function template and the hfloat<ndigits, es, bt>
type).

In `@include/sw/universal/number/hfloat/math/functions/truncate.hpp`:
- Around line 29-33: The comment for the ceil function is incorrect: update the
documentation for template<unsigned ndigits, unsigned es, typename bt>
hfloat<ndigits, es, bt> ceil(hfloat<ndigits, es, bt> x) to state that it returns
the "smallest integral value not less than x" (i.e., rounds x upward but returns
x itself when x is already integral) instead of saying "greater than x"; no code
changes required.

In `@static/float/dfloat/arithmetic/addition.cpp`:
- Around line 11-14: The four unused macros REGRESSION_LEVEL_1,
REGRESSION_LEVEL_2, REGRESSION_LEVEL_3, and REGRESSION_LEVEL_4 are defined but
never used; either remove these macro definitions from addition.cpp or actually
gate the test clusters using them (e.g., wrap the relevant test blocks with
#if/#ifdef checks against REGRESSION_LEVEL_* or add conditional logic that
references REGRESSION_LEVEL_* in the test setup), ensuring you keep consistent
naming (REGRESSION_LEVEL_1..4) so any future regression-level toggles will
compile and behave as intended.

In `@static/float/dfloat/arithmetic/subtraction.cpp`:
- Around line 39-48: The test compares the double-converted result (variable d
from Decimal32 result) to tc.expected using direct equality, which is fragile
for floating-point; change the check in the loop that iterates over cases to use
an epsilon-based comparison (e.g., fabs(d - tc.expected) > eps or a combined
relative+absolute tolerance) instead of d != tc.expected, and keep the same
failure/reporting logic that increments nrOfFailedTestCases and prints when
reportTestCases is true; pick a small eps (for example 1e-12) appropriate for
Decimal32 precision.

In `@static/float/dfloat/conversion/assignment.cpp`:
- Around line 47-56: The round-trip test in assignment.cpp uses a strict
equality check (back != v) on doubles which yields false negatives for values
not exactly representable in Decimal32; update the test in the loop that
constructs Decimal32 a(v) and computes double back to either (a) compare with a
tolerance/ULP check (implement a nearlyEqual function and replace back != v with
!nearlyEqual(back, v)) or (b) restrict the values[] array to values known to be
exactly representable in Decimal32 (e.g., integers and scaled powers of ten
within Decimal32’s precision); reference the Decimal32 constructor and the back
variable in your change.

In `@static/float/dfloat/logic/logic.cpp`:
- Around line 11-22: Remove or document unused regression macros and simplify
the always-true flag: either delete the unused macros REGRESSION_LEVEL_3 and
REGRESSION_LEVEL_4 from the top of logic.cpp to avoid confusion, or add
use-cases/tests gated by those macros in the test logic (e.g., inside main where
REGRESSION_LEVEL_1/2 are used) so they are meaningful; likewise, if you don't
need runtime toggling, simplify the boolean reportTestCases by replacing it with
a constant or removing it and any conditional branches that rely on it, ensuring
references to reportTestCases, REGRESSION_LEVEL_3, and REGRESSION_LEVEL_4 are
updated accordingly.

In `@static/float/dfloat/standard/dpd_codec.cpp`:
- Around line 114-125: The test prints DPD dfloat round-trip values but does not
verify correctness; add assertions after converting DpdFloat instances to double
(e.g., after double(a) and double(b)) to check the round-trip equals the
original inputs. For DpdFloat a(42.0) assert that d == 42.0 (exact compare) and
for DpdFloat b(0.1) assert that db equals 0.1 using an appropriate tolerance
(e.g., fabs(db - 0.1) < epsilon) to account for floating-point rounding; place
these checks immediately after the conversions for variables a, b (and keep
to_binary output for debug only).

In `@static/float/hfloat/api/api.cpp`:
- Around line 74-86: Replace the informative prints with failing checks that
increment nrOfFailedTestCases when expectations are not met: call Real a(1.0f)
and assert !a.isnan() and !a.isinf() (increment nrOfFailedTestCases on failure),
construct Real qnan(SpecificValue::qnan) and assert qnan == Real(0) (increment
on mismatch), and assert Real(SpecificValue::infpos) ==
Real(SpecificValue::maxpos) (increment nrOfFailedTestCases when false); use the
existing symbols isnan(), isinf(), qnan, SpecificValue::qnan,
SpecificValue::infpos, mp, pinf and nrOfFailedTestCases so the test fails
instead of only printing when expectations are violated.

In `@static/float/hfloat/arithmetic/division.cpp`:
- Around line 85-96: The test prints a note when HfloatShort result of one/zero
doesn't equal the saturated value mp but does not mark the test as failed;
update the division-by-zero check in the block that constructs HfloatShort one,
zero, result and mp (SpecificValue::maxpos) so that either it asserts the
expected saturation (increment nrOfFailedTestCases or call the existing
test-fail helper) when double(result) != double(mp) to make the behavior
explicit, or if multiple behaviors are acceptable, replace the print with a
clear comment documenting allowed outcomes and leave nrOfFailedTestCases
unchanged; locate symbols HfloatShort, result, mp, and nrOfFailedTestCases to
implement the change.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between facc7a4 and fbc46a8.

📒 Files selected for processing (67)
  • .github/workflows/cmake.yml
  • .github/workflows/coverage.yml
  • CHANGELOG.md
  • CMakeLists.txt
  • docs/sessions/2026-02-26_dfloat_hfloat_implementation.md
  • include/sw/universal/number/dfloat/attributes.hpp
  • include/sw/universal/number/dfloat/dfloat.hpp
  • include/sw/universal/number/dfloat/dfloat_fwd.hpp
  • include/sw/universal/number/dfloat/dfloat_impl.hpp
  • include/sw/universal/number/dfloat/dpd_codec.hpp
  • include/sw/universal/number/dfloat/manipulators.hpp
  • include/sw/universal/number/dfloat/math/functions/classify.hpp
  • include/sw/universal/number/dfloat/math/functions/error_and_gamma.hpp
  • include/sw/universal/number/dfloat/math/functions/exponent.hpp
  • include/sw/universal/number/dfloat/math/functions/fractional.hpp
  • include/sw/universal/number/dfloat/math/functions/hyperbolic.hpp
  • include/sw/universal/number/dfloat/math/functions/hypot.hpp
  • include/sw/universal/number/dfloat/math/functions/logarithm.hpp
  • include/sw/universal/number/dfloat/math/functions/minmax.hpp
  • include/sw/universal/number/dfloat/math/functions/next.hpp
  • include/sw/universal/number/dfloat/math/functions/pow.hpp
  • include/sw/universal/number/dfloat/math/functions/sqrt.hpp
  • include/sw/universal/number/dfloat/math/functions/trigonometry.hpp
  • include/sw/universal/number/dfloat/math/functions/truncate.hpp
  • include/sw/universal/number/dfloat/mathlib.hpp
  • include/sw/universal/number/dfloat/numeric_limits.hpp
  • include/sw/universal/number/hfloat/attributes.hpp
  • include/sw/universal/number/hfloat/exceptions.hpp
  • include/sw/universal/number/hfloat/hfloat.hpp
  • include/sw/universal/number/hfloat/hfloat_fwd.hpp
  • include/sw/universal/number/hfloat/hfloat_impl.hpp
  • include/sw/universal/number/hfloat/manipulators.hpp
  • include/sw/universal/number/hfloat/math/functions/classify.hpp
  • include/sw/universal/number/hfloat/math/functions/error_and_gamma.hpp
  • include/sw/universal/number/hfloat/math/functions/exponent.hpp
  • include/sw/universal/number/hfloat/math/functions/fractional.hpp
  • include/sw/universal/number/hfloat/math/functions/hyperbolic.hpp
  • include/sw/universal/number/hfloat/math/functions/hypot.hpp
  • include/sw/universal/number/hfloat/math/functions/logarithm.hpp
  • include/sw/universal/number/hfloat/math/functions/minmax.hpp
  • include/sw/universal/number/hfloat/math/functions/next.hpp
  • include/sw/universal/number/hfloat/math/functions/pow.hpp
  • include/sw/universal/number/hfloat/math/functions/sqrt.hpp
  • include/sw/universal/number/hfloat/math/functions/trigonometry.hpp
  • include/sw/universal/number/hfloat/math/functions/truncate.hpp
  • include/sw/universal/number/hfloat/mathlib.hpp
  • include/sw/universal/number/hfloat/numeric_limits.hpp
  • include/sw/universal/traits/dfloat_traits.hpp
  • include/sw/universal/traits/hfloat_traits.hpp
  • static/float/dfloat/api/api.cpp
  • static/float/dfloat/arithmetic/addition.cpp
  • static/float/dfloat/arithmetic/division.cpp
  • static/float/dfloat/arithmetic/multiplication.cpp
  • static/float/dfloat/arithmetic/subtraction.cpp
  • static/float/dfloat/conversion/assignment.cpp
  • static/float/dfloat/logic/logic.cpp
  • static/float/dfloat/standard/decimal32.cpp
  • static/float/dfloat/standard/dpd_codec.cpp
  • static/float/hfloat/CMakeLists.txt
  • static/float/hfloat/api/api.cpp
  • static/float/hfloat/arithmetic/addition.cpp
  • static/float/hfloat/arithmetic/division.cpp
  • static/float/hfloat/arithmetic/multiplication.cpp
  • static/float/hfloat/arithmetic/subtraction.cpp
  • static/float/hfloat/conversion/assignment.cpp
  • static/float/hfloat/logic/logic.cpp
  • static/float/hfloat/standard/short.cpp
💤 Files with no reviewable changes (1)
  • .github/workflows/cmake.yml

Comment on lines +106 to +109
```
100% tests passed, 0 tests failed out of 17 (gcc)
100% tests passed, 0 tests failed out of 17 (clang)
```

⚠️ Potential issue | 🟡 Minor

Add language specification to fenced code block.

The code block should have a language identifier for proper rendering and to satisfy markdown linting.

📝 Proposed fix
-```
+```text
 100% tests passed, 0 tests failed out of 17 (gcc)
 100% tests passed, 0 tests failed out of 17 (clang)
</details>

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 106-106: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/sessions/2026-02-26_dfloat_hfloat_implementation.md` around lines 106 -
109, The fenced code block showing the test output ("100% tests passed, 0 tests
failed out of 17 (gcc)" / "100% tests passed, 0 tests failed out of 17 (clang)")
must include a language identifier for proper markdown rendering; update the
block delimiter from ``` to ```text (or ```console) so the snippet becomes a
fenced code block with a language specifier and passes linting.

hfloat& operator=(int rhs) noexcept { return convert_signed(rhs); }
hfloat& operator=(long rhs) noexcept { return convert_signed(rhs); }
hfloat& operator=(long long rhs) noexcept { return convert_signed(rhs); }
hfloat& operator=(char rhs) noexcept { return convert_unsigned(rhs); }

⚠️ Potential issue | 🟡 Minor

char signedness is implementation-defined; using convert_unsigned may produce incorrect results.

On platforms where char is signed (common on x86), negative char values will be implicitly converted to large positive uint64_t values in convert_unsigned, producing incorrect results.

Proposed fix: handle char signedness explicitly
-	hfloat& operator=(char rhs)               noexcept { return convert_unsigned(rhs); }
+	hfloat& operator=(char rhs)               noexcept { 
+		// char signedness is implementation-defined; use signed conversion for safety
+		return convert_signed(static_cast<signed char>(rhs)); 
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/hfloat_impl.hpp` at line 137, The
operator= overload hfloat& operator=(char rhs) uses convert_unsigned which
misinterprets negative char values on platforms where char is signed; change
this to handle char's signedness explicitly by using std::is_signed<char> to
pick the right conversion: call convert_signed(static_cast<int64_t>(rhs)) when
char is signed, otherwise call convert_unsigned(static_cast<uint64_t>(rhs));
reference the operator= for char and the helper functions
convert_signed/convert_unsigned to implement the conditional at compile time.
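To make the suggestion concrete, here is a minimal, self-contained sketch of compile-time dispatch on `char` signedness. The converter stand-ins are illustrative only, not the library's actual `convert_signed`/`convert_unsigned` members.

```cpp
#include <cstdint>
#include <type_traits>

// Illustrative stand-ins for the library's converters (not the real members).
static int64_t convert_signed(int64_t v)    { return v; }
static int64_t convert_unsigned(uint64_t v) { return static_cast<int64_t>(v); }

// Dispatch on the platform's char signedness at compile time, so negative
// chars never funnel through the unsigned path on signed-char platforms.
int64_t assign_char(char c) {
    if constexpr (std::is_signed_v<char>) {
        return convert_signed(static_cast<int64_t>(c));
    } else {
        return convert_unsigned(static_cast<uint64_t>(c));
    }
}
```

With this dispatch, `assign_char` preserves the numeric value of `c` on both signed-char and unsigned-char ABIs; only the interpretation of out-of-ASCII-range bytes differs, exactly as the core language dictates.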

Comment on lines +182 to +189
if (shift >= 0) {
result_exp = lhs_exp;
aligned_rhs >>= (static_cast<unsigned>(shift) * 4u);
}
else {
result_exp = rhs_exp;
aligned_lhs >>= (static_cast<unsigned>(-shift) * 4u);
}

⚠️ Potential issue | 🟠 Major

Undefined behavior when exponent difference is large.

Right-shifting a uint64_t by 64 or more bits is undefined behavior in C++. With es=7, exponent differences can reach 127, resulting in shifts of up to 508 bits (shift * 4). This will cause UB and incorrect results.

Proposed fix: clamp shift or return dominant operand
 		if (shift >= 0) {
 			result_exp = lhs_exp;
+			if (static_cast<unsigned>(shift) * 4u >= 64u) {
+				// rhs is negligible compared to lhs
+				return *this;
+			}
 			aligned_rhs >>= (static_cast<unsigned>(shift) * 4u);
 		}
 		else {
 			result_exp = rhs_exp;
+			if (static_cast<unsigned>(-shift) * 4u >= 64u) {
+				// lhs is negligible compared to rhs
+				*this = rhs;
+				return *this;
+			}
 			aligned_lhs >>= (static_cast<unsigned>(-shift) * 4u);
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/hfloat_impl.hpp` around lines 182 - 189,
The code in hfloat_impl.hpp that computes alignment uses `shift` to right-shift
`aligned_rhs`/`aligned_lhs` (affecting variables aligned_rhs, aligned_lhs,
result_exp, lhs_exp, rhs_exp) but does not guard against shifts >= word width,
causing UB when `shift * 4` >= 64; clamp the shift amount to a safe maximum
(e.g., if (shift*4) >= 64 then treat as full loss of precision and either set
the smaller operand to zero or return the dominant operand’s value) and only
perform the right-shift when the computed shift in bits is < 64, otherwise set
aligned_* to 0 (or short-circuit to return lhs_exp/rhs_exp as appropriate) to
avoid undefined behavior.
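The clamp can be demonstrated in isolation; `align_right` below is a hypothetical helper, not library code. A shift of 64 or more bits on a `uint64_t` is UB, so any such alignment shift is treated as total cancellation of the smaller operand.

```cpp
#include <cstdint>

// Hypothetical alignment helper: shift a hex-digit fraction right by
// `hex_digits` base-16 positions, guarding the UB case (shift >= 64 bits).
uint64_t align_right(uint64_t frac, unsigned hex_digits) {
    unsigned bits = hex_digits * 4u;
    if (bits >= 64u) return 0u;  // smaller operand cannot contribute
    return frac >> bits;
}
```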

Comment on lines +718 to +723
template<unsigned ndigits, unsigned es, typename BlockType>
inline bool operator==(const hfloat<ndigits, es, BlockType>& lhs, const hfloat<ndigits, es, BlockType>& rhs) {
if (lhs.iszero() && rhs.iszero()) return true;
// compare through double
return double(lhs) == double(rhs);
}

⚠️ Potential issue | 🟡 Minor

Equality comparison via double loses precision for extended formats.

For hfloat_extended with 112 fraction bits, converting to double (53-bit mantissa) truncates precision. Two distinct hfloat values may incorrectly compare as equal.

Consider implementing direct block-by-block comparison for accurate results:

Proposed fix: direct comparison
 template<unsigned ndigits, unsigned es, typename BlockType>
 inline bool operator==(const hfloat<ndigits, es, BlockType>& lhs, const hfloat<ndigits, es, BlockType>& rhs) {
 	if (lhs.iszero() && rhs.iszero()) return true;
-	// compare through double
-	return double(lhs) == double(rhs);
+	// direct block comparison
+	for (unsigned i = 0; i < hfloat<ndigits, es, BlockType>::nrBlocks; ++i) {
+		if (lhs._block[i] != rhs._block[i]) return false;
+	}
+	return true;
 }

Note: This would require making _block accessible or adding a friend declaration for the free function to access protected members.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/hfloat_impl.hpp` around lines 718 - 723,
The equality operator operator== for hfloat<ndigits, es, BlockType> must not
convert to double (lossy for extended formats like hfloat_extended); replace the
double-based comparison with a direct block-by-block comparison of the
underlying storage (compare the _block array/vector contents and any
sign/exp/fraction metadata) after handling the iszero() fast path; to access
protected/private storage either expose a getter for the block storage or add a
friend declaration for the free function operator== so it can iterate and
compare each BlockType element exactly, ensuring identical bit patterns (and
treating NaNs/zeros per existing semantics).
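The precision loss is easy to show in isolation: any two 64-bit values that differ only below double's 53-bit mantissa collapse to the same double.

```cpp
#include <cstdint>

// Returns true when two distinct 64-bit values compare equal after a round
// trip through double -- the failure mode for fractions wider than 53 bits.
bool collapses_in_double(uint64_t a, uint64_t b) {
    return a != b && static_cast<double>(a) == static_cast<double>(b);
}
```

Near 2^60, consecutive doubles are 256 apart, so for example `(1ull << 60) + 1` and `(1ull << 60) + 2` are distinct integers that convert to the identical double.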

Comment on lines +68 to +82
// components: show unpacked sign, exponent, fraction in hex
template<unsigned ndigits, unsigned es, typename bt>
std::string components(const hfloat<ndigits, es, bt>& number) {
std::stringstream s;
bool sign; int exp; uint64_t frac;
number.unpack(sign, exp, frac);
s << (sign ? "(-" : "(+") << "0x";
// show hex digits
for (int i = static_cast<int>(ndigits) - 1; i >= 0; --i) {
unsigned hex_digit = (frac >> (i * 4)) & 0xF;
s << "0123456789ABCDEF"[hex_digit];
}
s << " * 16^" << exp << ')';
return s.str();
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find hfloat class definition and unpack method
fd -e hpp -e h "hfloat" include/sw/universal/number | head -20

Repository: stillwater-sc/universal

Length of output: 213


🏁 Script executed:

#!/bin/bash
# Search for unpack method in hfloat header files
rg "void unpack\(bool" --type hpp include/sw/universal/number/hfloat/ -A 5

Repository: stillwater-sc/universal

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Look for unpack implementations with different signatures
rg "unpack\(" --type hpp include/sw/universal/number/hfloat/ -B 2 -A 3

Repository: stillwater-sc/universal

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Search for unpack method without type flag
rg "void unpack\(bool" include/sw/universal/number/hfloat/ -A 5

Repository: stillwater-sc/universal

Length of output: 581


🏁 Script executed:

#!/bin/bash
# Read hfloat_impl.hpp to find unpack implementation
wc -l include/sw/universal/number/hfloat/hfloat_impl.hpp

Repository: stillwater-sc/universal

Length of output: 121


🏁 Script executed:

#!/bin/bash
# Find and read the full unpack() implementation
rg "void unpack\(bool" include/sw/universal/number/hfloat/hfloat_impl.hpp -A 30

Repository: stillwater-sc/universal

Length of output: 918


🏁 Script executed:

#!/bin/bash
# Check hfloat template parameters and configurations
rg "hfloat<.*>" include/sw/universal/number/hfloat/ --max-count=20

Repository: stillwater-sc/universal

Length of output: 16295


Critical: unpack() causes undefined behavior for extended hfloat (112+ fraction bits).

The unpack() method in hfloat_impl.hpp attempts to extract all fbits bits into a uint64_t fraction. For hfloat_extended with 112 fraction bits, the loop executes 112 iterations, but bit shifts beyond 63 (i.e., 1ull << i where i ≥ 64) are undefined behavior in C++.

This corrupts the fraction output for extended format in components() and any code using unpack() with sizes larger than 64 bits.

Either:

  1. Use a wider type (or multi-block representation) that can hold the full fraction
  2. Split unpack() into overloads or add a separate method for extended-precision unpacking
  3. Document and reject extended format in unpack() with an assertion
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/manipulators.hpp` around lines 68 - 82,
The current components(...) calls unpack(...) which writes the full fraction
into a uint64_t causing UB for formats with >64 fraction bits (e.g.,
hfloat_extended); update the fix by either modifying unpack in hfloat_impl.hpp
to provide a safe multi-block fraction representation (e.g., return/accept an
array or vector of 64-bit words) or add a new method (e.g., unpack_large or
unpack_multiword) that components can call for ndigits*4 > 64, and use that
multiword value when composing hex digits; alternatively, if you prefer to
reject unsupported sizes, add an assertion in unpack that fbits <= 64 and in
components check ndigits and fail/ASSERT with a clear message referencing
hfloat<ndigits, es, bt>::unpack and hfloat_extended so callers know extended
precision isn't supported by the single-word unpack.
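A multi-word extractor along the lines suggested above can be sketched as follows; the storage layout (least-significant 64-bit word first) is an assumption for illustration, not the library's actual block layout.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical multi-word digit extractor: fraction stored least-significant
// word first, 16 hex digits per 64-bit word, so no single shift ever
// reaches 64 bits regardless of how wide the fraction is.
char hex_digit(const std::vector<uint64_t>& words, unsigned i) {
    unsigned word   = i / 16u;
    unsigned nibble = (i % 16u) * 4u;
    return "0123456789ABCDEF"[(words[word] >> nibble) & 0xFull];
}
```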

Comment on lines +14 to +18
#if HFLOAT_THROW_ARITHMETIC_EXCEPTION
if (a.isneg()) throw hfloat_negative_sqrt_arg();
#else
if (a.isneg()) std::cerr << "hfloat argument to sqrt is negative: " << a << std::endl;
#endif

⚠️ Potential issue | 🟡 Minor

Missing early return for negative input in non-throwing path.

When HFLOAT_THROW_ARITHMETIC_EXCEPTION is not defined and a.isneg() is true, the code prints an error but continues execution. std::sqrt of a negative double returns NaN, and since IBM System/360 HFP has no NaN representation, the stored result is meaningless. The dfloat implementation has the same issue.

Consider returning a sentinel value (e.g., zero, or `a` unchanged) after logging the error to ensure well-defined behavior.

Suggested fix
 #if HFLOAT_THROW_ARITHMETIC_EXCEPTION
 	if (a.isneg()) throw hfloat_negative_sqrt_arg();
 #else
-	if (a.isneg()) std::cerr << "hfloat argument to sqrt is negative: " << a << std::endl;
+	if (a.isneg()) {
+		std::cerr << "hfloat argument to sqrt is negative: " << a << std::endl;
+		return hfloat<ndigits, es, bt>{};  // return zero as fallback
+	}
 #endif
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/hfloat/math/functions/sqrt.hpp` around lines 14 -
18, The non-throwing branch in sqrt (check a.isneg() under
HFLOAT_THROW_ARITHMETIC_EXCEPTION) only logs the error but continues execution;
add an early return after the std::cerr line to avoid taking std::sqrt on a
negative value. Update the non-throwing path in sqrt.hpp so that when a.isneg()
is true you log the error and then return a sentinel (for example return a; or
return hfloat(0);) — reference the sqrt function and the a.isneg() check and
mirror the throwing branch behavior but with a safe early return.
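The non-throwing behavior can be sketched on plain doubles; `checked_sqrt` is illustrative, not the library's function.

```cpp
#include <cmath>

// Guard before calling std::sqrt: a negative argument would yield NaN,
// which a no-NaN format like System/360 hex FP cannot store, so fall back
// to a zero sentinel (logging omitted for brevity).
double checked_sqrt(double a) {
    if (a < 0.0) return 0.0;  // sentinel fallback per the review
    return std::sqrt(a);
}
```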

Comment on lines +93 to +105
// Test 4: Truncation property -- adding small to large loses the small value
std::cout << "+--------- Truncation rounding behavior\n";
{
// In hfloat, the result is truncated (never rounds up)
HfloatShort a(1.0), b(1.0);
HfloatShort third = a / HfloatShort(3);
double d3 = double(third);
// truncation: 1/3 should be <= exact 1/3
if (d3 > (1.0 / 3.0) + 1e-15) {
++nrOfFailedTestCases;
if (reportTestCases) std::cerr << "FAIL: hfloat 1/3 = " << d3 << " exceeds exact 1/3\n";
}
}

⚠️ Potential issue | 🟡 Minor

Test 4 belongs in division.cpp, not addition.cpp.

This test validates truncation behavior of division (a / HfloatShort(3)), not addition. It appears to be a copy-paste artifact from the division test suite. Consider removing this test from the addition validation file or replacing it with an addition-specific truncation test (e.g., adding values where precision loss occurs).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/hfloat/arithmetic/addition.cpp` around lines 93 - 105, The test
block that checks truncation of division (uses HfloatShort a, b, third = a /
HfloatShort(3), and d3) is misplaced in addition.cpp; remove this
division-specific test or move it into the division test file (division.cpp)
where other division tests live; alternatively replace it with an
addition-specific truncation test that exercises precision loss when adding a
very small HfloatShort to a very large one (keep the same reporting pattern
using nrOfFailedTestCases and reportTestCases). Ensure you update or remove the
variables a, b, third, and the division expression so the file only contains
addition-related checks.

@Ravenwater Ravenwater self-assigned this Feb 26, 2026
Ravenwater and others added 2 commits February 26, 2026 19:00
… > 19

pow10_64() silently returned 0 for n >= 20, causing wrong results for
decimal128 (34 digits). Added static_assert limiting ndigits to 19
(uint64_t significand range) and removed the silent fallback. Commented
out decimal128 aliases until __uint128_t significand support is added.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
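The 19-digit bound follows from 10^19 < 2^64 < 10^20. A minimal sketch (not the library's pow10_64) makes the limit visible:

```cpp
#include <cstdint>

// Exact powers of ten in a uint64_t exist only for n <= 19;
// 10^20 would exceed 2^64 - 1 (about 1.8e19) and wrap.
constexpr uint64_t pow10_u64(unsigned n) {
    uint64_t p = 1;
    for (unsigned i = 0; i < n; ++i) p *= 10u;
    return p;
}
```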
The exponent-alignment loops in operator+= multiply by 10 per shift unit,
which can overflow uint64_t when ndigits > 9 and shift approaches ndigits.
Short-circuit when magnitude difference >= ndigits (smaller operand cannot
contribute), and use __uint128_t for the scaled values and addition to
prevent overflow for decimal64 (ndigits=16).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
include/sw/universal/number/dfloat/dfloat_impl.hpp (1)

225-225: ⚠️ Potential issue | 🟡 Minor

Platform-dependent behavior for char assignment.

char signedness is implementation-defined. On platforms where char is signed (common), negative values will wrap when passed to convert_unsigned, producing large positive results instead of the expected negative value.

Consider explicitly handling signed char:

🛠️ Suggested fix
-	dfloat& operator=(char rhs)               noexcept { return convert_unsigned(rhs); }
+	dfloat& operator=(char rhs)               noexcept { return convert_signed(static_cast<signed char>(rhs)); }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` at line 225,
operator=(char rhs) currently forwards a possibly-signed implementation-defined
char to convert_unsigned, causing negative chars to be treated as large unsigned
values; change the assignment to detect char signedness and forward the
properly-cast value to the appropriate converter: if std::is_signed<char>::value
call convert_signed(static_cast<signed char>(rhs)), otherwise call
convert_unsigned(static_cast<unsigned char>(rhs)); update the operator=(char)
implementation to perform this branching and use convert_signed/convert_unsigned
to preserve the intended negative semantics on platforms where char is signed.
🧹 Nitpick comments (1)
include/sw/universal/number/dfloat/dfloat_impl.hpp (1)

1057-1070: Note: Comparison via double may lose precision for decimal64.

The operator< implementation converts to double for comparison (line 1069), which has ~15.9 digits of precision. For decimal64 (16 digits), edge cases may compare incorrectly. Consider implementing native decimal comparison for full precision in a future enhancement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 1057 - 1070,
operator< currently falls back to converting dfloat to double (double(lhs) <
double(rhs)), which can lose precision for decimal64; replace the double-based
comparison in operator< for dfloat<ndigits, es, Encoding, BlockType> with a
native decimal comparison: after the existing NaN/inf/sign checks, normalize
both operands (align exponents), compare adjusted significands (or integer
representations) taking sign into account, and use exponent as tie-breaker;
implement this logic inside operator< (and reuse any existing helpers for
normalization, exponent(), significand()/mantissa(), or add small helpers) so
full ndigits precision is preserved instead of casting to double.
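One way to sketch the native comparison, for normalized positive operands only (illustrative; real code must first handle sign, zero, NaN, and infinity, and align significands of differing digit counts):

```cpp
#include <cstdint>

// Compare two values sig * 10^exp. Assumes both significands are normalized
// to the same digit count, so a larger exponent always means a larger value
// and equal exponents fall through to the significand comparison.
bool dec_less(uint64_t lsig, int lexp, uint64_t rsig, int rexp) {
    if (lexp != rexp) return lexp < rexp;
    return lsig < rsig;
}
```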
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@include/sw/universal/number/dfloat/dfloat_impl.hpp`:
- Around line 383-388: The double-based fallback in operator*= (in
dfloat_impl.hpp) loses precision for types with ndigits > 15 (e.g., decimal64);
add a compile-time guard so the fallback is not used for high-precision configs:
insert a static_assert that ndigits <= 15 (or an equivalent constexpr check)
where the `#else` double fallback is selected, with a message like "double
fallback insufficient for ndigits > 15; provide a proper implementation or
enable long double fallback"; alternatively, conditionally use long double only
when its precision suffices, otherwise fail the build so the incorrect
double-based multiplication cannot silently compile.
- Around line 325-344: The fallback path that multiplies
aligned_lhs_u/aligned_rhs_u can overflow when __SIZEOF_INT128__ is not available
for ndigits > 9; add a compile-time guard that prevents instantiation in that
case (e.g., a static_assert tied to the template parameter ndigits) so types
like decimal64 (ndigits=16) fail to compile when the fallback is used;
specifically, add a static_assert near the fallback branch (before using
aligned_lhs_u, aligned_rhs_u, result_sig_i) that triggers when
!defined(__SIZEOF_INT128__) && ndigits > 9, with a clear message referencing
decimal64/ndigits.
- Around line 889-896: In convert_signed, avoid negating v directly (UB for
INT64_MIN); instead compute the absolute value into a uint64_t without signed
overflow: when negative, compute abs_v as static_cast<uint64_t>(-(v + 1)) + 1,
otherwise cast v directly; update convert_signed to use this abs_v path so
INT64_MIN is handled safely (refer to function convert_signed in
dfloat_impl.hpp).
- Around line 428-432: The division fallback uses double conversion and loses
precision for configurations with NDIGITS > 15; update the operator/=(...)
implementation in dfloat_impl.hpp (the dfloat::operator/= overload where the
snippet uses "double d = double(*this) / double(rhs); *this = d;") to guard this
fallback at compile time: add a static_assert or preprocessor check referencing
the NDIGITS (or template parameter controlling precision) to only allow the
double-based fallback when NDIGITS <= 15, and otherwise force the caller to
use/implement the full-precision division path (or fail the build) so no silent
precision loss occurs.

---

Outside diff comments:
In `@include/sw/universal/number/dfloat/dfloat_impl.hpp`:
- Line 225: operator=(char rhs) currently forwards a possibly-signed
implementation-defined char to convert_unsigned, causing negative chars to be
treated as large unsigned values; change the assignment to detect char
signedness and forward the properly-cast value to the appropriate converter: if
std::is_signed<char>::value call convert_signed(static_cast<signed char>(rhs)),
otherwise call convert_unsigned(static_cast<unsigned char>(rhs)); update the
operator=(char) implementation to perform this branching and use
convert_signed/convert_unsigned to preserve the intended negative semantics on
platforms where char is signed.

---

Nitpick comments:
In `@include/sw/universal/number/dfloat/dfloat_impl.hpp`:
- Around line 1057-1070: operator< currently falls back to converting dfloat to
double (double(lhs) < double(rhs)), which can lose precision for decimal64;
replace the double-based comparison in operator< for dfloat<ndigits, es,
Encoding, BlockType> with a native decimal comparison: after the existing
NaN/inf/sign checks, normalize both operands (align exponents), compare adjusted
significands (or integer representations) taking sign into account, and use
exponent as tie-breaker; implement this logic inside operator< (and reuse any
existing helpers for normalization, exponent(), significand()/mantissa(), or add
small helpers) so full ndigits precision is preserved instead of casting to
double.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fbc46a8 and 16f04c5.

📒 Files selected for processing (2)
  • include/sw/universal/number/dfloat/dfloat.hpp
  • include/sw/universal/number/dfloat/dfloat_impl.hpp

Comment on lines +383 to +388
#else
// fallback: delegate through double for smaller configs
double d = double(*this) * double(rhs);
*this = d;
return *this;
#endif

⚠️ Potential issue | 🟠 Major

Multiplication fallback loses precision for ndigits > 15.

The double-based fallback has ~15.9 decimal digits of precision, insufficient for decimal64 (16 digits). This can silently produce incorrect results.

🛡️ Suggested compile-time guard
 #else
 		// fallback: delegate through double for smaller configs
+		static_assert(ndigits <= 15,
+			"dfloat multiplication requires __uint128_t for ndigits > 15");
 		double d = double(*this) * double(rhs);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 383 - 388,
The double-based fallback in operator*= (in dfloat_impl.hpp) loses precision for
types with ndigits > 15 (e.g., decimal64); add a compile-time guard so the
fallback is not used for high-precision configs: insert a static_assert that
ndigits <= 15 (or an equivalent constexpr check) where the `#else` double fallback
is selected, with a message like "double fallback insufficient for ndigits > 15;
provide a proper implementation or enable long double fallback"; alternatively,
conditionally use long double only when its precision suffices, otherwise fail
the build so the incorrect double-based multiplication cannot silently compile.

Comment on lines +428 to +432
#else
double d = double(*this) / double(rhs);
*this = d;
return *this;
#endif

⚠️ Potential issue | 🟠 Major

Division fallback has same precision limitation.

Same issue as multiplication—double fallback loses precision for configurations with ndigits > 15.

🛡️ Suggested compile-time guard
 #else
+		static_assert(ndigits <= 15,
+			"dfloat division requires __uint128_t for ndigits > 15");
 		double d = double(*this) / double(rhs);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 428 - 432,
The division fallback uses double conversion and loses precision for
configurations with NDIGITS > 15; update the operator/=(...) implementation in
dfloat_impl.hpp (the dfloat::operator/= overload where the snippet uses "double
d = double(*this) / double(rhs); *this = d;") to guard this fallback at compile
time: add a static_assert or preprocessor check referencing the NDIGITS (or
template parameter controlling precision) to only allow the double-based
fallback when NDIGITS <= 15, and otherwise force the caller to use/implement the
full-precision division path (or fail the build) so no silent precision loss
occurs.

Comment on lines +889 to +896
dfloat& convert_signed(int64_t v) noexcept {
if (0 == v) {
setzero();
return *this;
}
else {
// convert
bool negative = (v < 0);
uint64_t abs_v = static_cast<uint64_t>(negative ? -v : v);


⚠️ Potential issue | 🔴 Critical

Undefined behavior when v == INT64_MIN.

Negating INT64_MIN causes signed integer overflow (UB) because -INT64_MIN cannot be represented in int64_t. This can occur when assigning LLONG_MIN to a dfloat.

🐛 Suggested fix to avoid UB
 	dfloat& convert_signed(int64_t v) noexcept {
 		if (0 == v) {
 			setzero();
 			return *this;
 		}
 		bool negative = (v < 0);
-		uint64_t abs_v = static_cast<uint64_t>(negative ? -v : v);
+		// Avoid UB when v == INT64_MIN by casting before negation
+		uint64_t abs_v = negative
+			? static_cast<uint64_t>(-(v + 1)) + 1u
+			: static_cast<uint64_t>(v);

Alternative using <cstdlib>:

uint64_t abs_v = static_cast<uint64_t>(std::abs(v));

Note: std::abs(INT64_MIN) is also UB in C++, so the manual approach is safer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 889 - 896,
In convert_signed, avoid negating v directly (UB for INT64_MIN); instead compute
the absolute value into a uint64_t without signed overflow: when negative,
compute abs_v as static_cast<uint64_t>(-(v + 1)) + 1, otherwise cast v directly;
update convert_signed to use this abs_v path so INT64_MIN is handled safely
(refer to function convert_signed in dfloat_impl.hpp).
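The suggested identity can be verified standalone: `-(v + 1)` never overflows `int64_t`, so `INT64_MIN` maps cleanly to 2^63.

```cpp
#include <cstdint>

// Overflow-free absolute value into uint64_t. For v < 0, v + 1 is
// representable, its negation is representable, and the +1 happens in
// unsigned arithmetic -- so even INT64_MIN is handled without UB.
uint64_t safe_abs(int64_t v) {
    return (v < 0) ? static_cast<uint64_t>(-(v + 1)) + 1u
                   : static_cast<uint64_t>(v);
}
```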

Introduce a conditional significand_t type alias (uint64_t for ndigits <= 19,
__uint128_t for ndigits <= 38) to support IEEE 754-2008 decimal128 (34 digits).
Update all significand-handling code (unpack, pack, normalize, arithmetic,
comparisons, DPD codec, manipulators) to use significand_t. Decimal128 addition
uses mixed scale-up/scale-down, multiplication uses split schoolbook with
17-digit halves, and division uses iterative long division.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
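The conditional alias described above can be sketched with std::conditional_t, assuming __uint128_t is available (gcc and clang provide it on 64-bit targets; it is not standard C++):

```cpp
#include <cstdint>
#include <type_traits>

// Width-conditional significand type: 19 decimal digits fit in uint64_t,
// up to 38 fit in 128 bits (decimal128 needs 34).
template<unsigned ndigits>
using significand_t =
    std::conditional_t<(ndigits <= 19), std::uint64_t, __uint128_t>;

static_assert(std::is_same_v<significand_t<16>, std::uint64_t>);
static_assert(std::is_same_v<significand_t<34>, __uint128_t>);
```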
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (3)
include/sw/universal/number/dfloat/dfloat_impl.hpp (3)

524-528: ⚠️ Potential issue | 🟠 Major

Multiplication fallback loses precision for ndigits > 15 without compile-time guard.

The double-based fallback has ~15.9 decimal digits of precision, which is insufficient for decimal64 (16 digits). This can silently produce incorrect results on platforms without __SIZEOF_INT128__.

🛡️ Suggested compile-time guard
 #else
+		static_assert(ndigits <= 15,
+			"dfloat multiplication requires __uint128_t for ndigits > 15");
 		double d = double(*this) * double(rhs);
 		*this = d;
 		return *this;
 #endif
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 524 - 528,
The double-based multiplication fallback in the dfloat_impl.hpp multiplication
operator (the block that does "double d = double(*this) * double(rhs); *this =
d; return *this;") can lose precision for decimal types with ndigits > 15 on
platforms lacking __SIZEOF_INT128__; add a compile-time guard to prevent silent
loss: wrap the fallback in a conditional `#if` that only permits the double path
when __SIZEOF_INT128__ is defined or when ndigits <= 15, and otherwise emit a
static_assert or `#error` directing the build to use the higher-precision integer
implementation (or to require int128). Ensure the check references the ndigits
constant and __SIZEOF_INT128__ to fail fast at compile time rather than
producing incorrect runtime results.

672-676: ⚠️ Potential issue | 🟠 Major

Division fallback has same precision limitation without compile-time guard.

Same issue as multiplication—double fallback loses precision for configurations with ndigits > 15.

🛡️ Suggested compile-time guard
 `#else`
+		static_assert(ndigits <= 15,
+			"dfloat division requires __uint128_t for ndigits > 15");
 		double d = double(*this) / double(rhs);
 		*this = d;
 		return *this;
 `#endif`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 672 - 676,
The division fallback casts to double and loses precision for configurations
with ndigits > 15; add a compile-time guard around the double fallback in the
operator/= implementation in dfloat_impl.hpp so it is only used when the decimal
precision fits in double. Use a static_assert (or enable_if) that checks ndigits
<= std::numeric_limits<double>::digits10 (or an equivalent compile-time
constant) before performing the double cast fallback, and emit a clear
static_assert message directing maintainers to provide a high-precision
implementation when ndigits is larger.

1284-1301: ⚠️ Potential issue | 🔴 Critical

Undefined behavior when v == INT64_MIN is still present.

Line 1290 computes -v when v is negative, but negating INT64_MIN causes signed integer overflow which is undefined behavior in C++. This was flagged in a previous review but remains unaddressed.

🐛 Suggested fix to avoid UB
 dfloat& convert_signed(int64_t v) noexcept {
     if (0 == v) {
         setzero();
         return *this;
     }
     bool negative = (v < 0);
-    uint64_t abs_v = static_cast<uint64_t>(negative ? -v : v);
+    // Avoid UB when v == INT64_MIN by casting before negation
+    uint64_t abs_v = negative
+        ? static_cast<uint64_t>(-(v + 1)) + 1u
+        : static_cast<uint64_t>(v);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 1284 - 1301,
The code in convert_signed negates v directly which is undefined for v ==
INT64_MIN; change the absolute-value computation to avoid negating INT64_MIN —
compute uint64_t abs_v by branching on negative and, when negative, use a safe
transform like negating (v + 1) then adding 1 (or otherwise converting via
unsigned math without signed overflow) so you never perform -INT64_MIN; update
the computation that currently reads "uint64_t abs_v =
static_cast<uint64_t>(negative ? -v : v);" in convert_signed before the
trailing-zero loop and keep the rest (normalize_and_pack call) unchanged.
🧹 Nitpick comments (4)
include/sw/universal/number/dfloat/dfloat_impl.hpp (1)

809-813: String parsing not implemented—consider adding a static_assert or documentation.

The assign() method silently returns zero for any input string. This could lead to subtle bugs if users expect string construction to work.

Would you like me to help implement decimal string parsing, or should I open an issue to track this?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 809 - 813,
The assign(const std::string& txt) implementation currently silently clears and
returns zero; replace this with a clear explicit failure and documentation:
update dfloat::assign(const std::string&) to throw a std::invalid_argument (or
std::runtime_error) with a clear message like "dfloat::assign(string) not
implemented" and add a comment above the method documenting that decimal string
parsing is unimplemented and should not be relied upon (or open an issue link).
This makes failures explicit until you either implement full decimal string
parsing for class dfloat or add a proper parser implementation in the same
method.
static/float/dfloat/standard/decimal128.cpp (3)

244-275: Add NaN/Inf comparison edge cases in the comparison block.

This section only validates finite comparisons; adding NaN/Inf operator checks will close important IEEE-style special-case coverage.

🔧 Proposed additions
 	{
 		decimal128_bid a(10), b(20), c(10);
+		decimal128_bid pinf(SpecificValue::infpos);
+		decimal128_bid ninf(SpecificValue::infneg);
+		decimal128_bid qnan(SpecificValue::qnan);
@@
 		if (!(neg5 < a)) {
 			++nrOfFailedTestCases;
 			if (reportTestCases) std::cerr << "FAIL: -5 < 10\n";
 		}
+		if (!(ninf < pinf)) {
+			++nrOfFailedTestCases;
+			if (reportTestCases) std::cerr << "FAIL: -inf < +inf\n";
+		}
+		if (qnan == qnan) {
+			++nrOfFailedTestCases;
+			if (reportTestCases) std::cerr << "FAIL: NaN == NaN must be false\n";
+		}
 	}

Based on learnings, "Each number system regression suite should provide exhaustive validation of assignment, conversion, arithmetic, logic, exceptions, number traits, and special cases".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/standard/decimal128.cpp` around lines 244 - 275, Add
tests for NaN and Infinity to the comparison block: construct special values
with decimal128_bid (e.g., NaN via conversion from "NaN", positive and negative
Infinity via "Inf"/"-Inf" or the library's special-value constructors), then
assert IEEE-like behavior for operators on those symbols — NaN comparisons (nan
== any) must be false, (nan != any) true, and ordered comparisons with NaN
should be false; Inf comparisons should rank +/-Inf above/below finite values
(e.g., inf > 20, -inf < -10) and equality for identical infinities should hold.
Use the same test scaffolding (nrOfFailedTestCases and reportTestCases) and
variables like decimal128_bid nan, inf, negInf to mirror existing patterns in
the comparison tests.

1-1: Test file placement appears outside the expected regression taxonomy.

This test is under static/float/dfloat/standard/, while the project guideline calls for organizing regression tests under api, conversion, logic, arithmetic, math, and complex. Consider moving it to the closest category (or documenting an explicit exception for standard-format tests).

As per coding guidelines, "Organize regression tests by number system type with subdirectories for api, conversion, logic, arithmetic, math, and complex operations".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/standard/decimal128.cpp` at line 1, The test file
decimal128.cpp (verifying dfloat<34, 12> against IEEE 754-2008 decimal128) is
placed under static/float/dfloat/standard but should be organized under the
regression taxonomy; move decimal128.cpp into the appropriate number-system
subdirectory (e.g., static/float/dfloat/<number-system>/arithmetic or conversion
depending on the test intent) or create a documented exception if it must remain
in standard, and update any test runner or include paths that reference
decimal128.cpp (search for references to decimal128.cpp and the class/template
dfloat<34, 12> in test manifests or CMake/test configuration to ensure the new
path is used).

170-215: Compare BID and DPD directly in decimal domain instead of converting to double.

The current conversions to double can mask decimal-domain mismatches when both encodings round to the same binary value. Rather than using the components() function (which returns a display string), use the operator== for direct dfloat-to-dfloat comparison, which properly unpacks and compares the sign, exponent, and significand in the decimal domain.

🔧 Proposed refactor (direct decimal-domain comparison)
-			double dbid = double(bid);
-			double ddpd = double(dpd);
-			if (dbid != ddpd) {
+			if (bid != dpd) {
 				++nrOfFailedTestCases;
-				if (reportTestCases) std::cerr << "FAIL: BID(" << v << ") = " << dbid
-				                               << " but DPD(" << v << ") = " << ddpd << '\n';
+				if (reportTestCases) std::cerr << "FAIL: BID(" << v << ") and DPD(" << v
+				                               << ") encode to different values\n";
 			}
@@
-			double bid_sum = double(ba + bb);
-			double dpd_sum = double(da + db);
-			if (bid_sum != dpd_sum) {
+			if ((ba + bb) != (da + db)) {
 				++nrOfFailedTestCases;
 				if (reportTestCases) std::cerr << "FAIL: BID " << tc.a << " + " << tc.b
-				                               << " = " << bid_sum << " but DPD = " << dpd_sum << '\n';
+				                               << " gives different results in BID vs DPD\n";
 			}
@@
-			double bid_prod = double(ba * bb);
-			double dpd_prod = double(da * db);
-			if (bid_prod != dpd_prod) {
+			if ((ba * bb) != (da * db)) {
 				++nrOfFailedTestCases;
 				if (reportTestCases) std::cerr << "FAIL: BID " << tc.a << " * " << tc.b
-				                               << " = " << bid_prod << " but DPD = " << dpd_prod << '\n';
+				                               << " gives different results in BID vs DPD\n";
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/standard/decimal128.cpp` around lines 170 - 215, The
tests compare decimal128_bid and decimal128_dpd by converting to double (e.g.,
double(bid), double(dpd)), which can hide decimal-domain differences; change the
comparisons to compare decimal values directly using the decimal types'
equality/inequality operators instead of converting to double or using
components(). Specifically, in the BID vs DPD value loop and the arithmetic
checks replace the double conversions and comparisons with direct comparisons of
decimal128_bid/decimal128_dpd results (use operator== / operator!= on
decimal128_bid and decimal128_dpd and compare the results of ba+bb vs da+db and
ba*bb vs da*db using those operators) so sign/exponent/significand are checked
in the decimal domain.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 16f04c5 and 0c777d8.

📒 Files selected for processing (4)
  • include/sw/universal/number/dfloat/dfloat.hpp
  • include/sw/universal/number/dfloat/dfloat_impl.hpp
  • include/sw/universal/number/dfloat/manipulators.hpp
  • static/float/dfloat/standard/decimal128.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • include/sw/universal/number/dfloat/dfloat.hpp
  • include/sw/universal/number/dfloat/manipulators.hpp

Comment on lines +1177 to +1197
significand_t dpd_decode_trailing_wide(unsigned msd) const noexcept {
significand_t result = 0;
significand_t multiplier = 1;
unsigned remaining = ndigits - 1;
unsigned bit_offset = 0;

while (remaining >= 3) {
// Read 10-bit declet from bit_offset
uint16_t declet = 0;
for (unsigned b = 0; b < 10; ++b) {
if (getbit(bit_offset + b)) declet |= static_cast<uint16_t>(1u << b);
}
unsigned value = dpd_decode(declet);
result += static_cast<significand_t>(value) * multiplier;
multiplier *= 1000;
bit_offset += 10;
remaining -= 3;
}

return static_cast<significand_t>(msd) * pow10_s(ndigits - 1) + result;
}
Contributor

⚠️ Potential issue | 🟡 Minor

DPD wide decoding has same remainder digit issue.

Symmetric to the encoding issue—remainder 1-2 digits are not decoded when (ndigits-1) % 3 != 0.

🔧 Suggested fix to handle remainder digits in decode
 		while (remaining >= 3) {
 			// Read 10-bit declet from bit_offset
 			uint16_t declet = 0;
 			for (unsigned b = 0; b < 10; ++b) {
 				if (getbit(bit_offset + b)) declet |= static_cast<uint16_t>(1u << b);
 			}
 			unsigned value = dpd_decode(declet);
 			result += static_cast<significand_t>(value) * multiplier;
 			multiplier *= 1000;
 			bit_offset += 10;
 			remaining -= 3;
 		}

+		// Handle remainder 1 or 2 digits
+		if (remaining == 2) {
+			uint8_t val = 0;
+			for (unsigned b = 0; b < 7; ++b) {
+				if (getbit(bit_offset + b)) val |= (1u << b);
+			}
+			result += static_cast<significand_t>(val) * multiplier;
+		}
+		else if (remaining == 1) {
+			uint8_t val = 0;
+			for (unsigned b = 0; b < 4; ++b) {
+				if (getbit(bit_offset + b)) val |= (1u << b);
+			}
+			result += static_cast<significand_t>(val) * multiplier;
+		}
+
 		return static_cast<significand_t>(msd) * pow10_s(ndigits - 1) + result;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 1177 - 1197,
dpd_decode_trailing_wide currently only decodes full 3-digit declets and drops
remainder digits when (ndigits-1) % 3 != 0; fix dpd_decode_trailing_wide to
decode the leftover 1 or 2 digits after the while loop by reading each remaining
digit as a 4-bit BCD (use getbit at bit_offset + b to assemble each 4-bit
digit), multiply each decoded digit by the appropriate multiplier and add into
result, then return msd * pow10_s(ndigits - 1) + result; reference symbols:
dpd_decode_trailing_wide, msd, ndigits, bit_offset, getbit, dpd_decode, pow10_s,
multiplier, result.

Comment on lines +1200 to +1216
void dpd_encode_trailing_wide(significand_t significand) noexcept {
significand_t msd_factor = pow10_s(ndigits - 1);
significand_t trailing_val = significand % msd_factor;
unsigned remaining = ndigits - 1;
unsigned bit_offset = 0;

while (remaining >= 3) {
unsigned group = static_cast<unsigned>(trailing_val % 1000);
trailing_val /= 1000;
uint16_t declet = dpd_encode(group);
for (unsigned b = 0; b < 10; ++b) {
setbit(bit_offset + b, (declet >> b) & 1);
}
bit_offset += 10;
remaining -= 3;
}
}
Contributor

⚠️ Potential issue | 🟡 Minor

DPD wide encoding doesn't handle remainder digits when (ndigits-1) % 3 != 0.

The loop only processes complete 3-digit groups, but dpd_trailing_bits() allocates 4 bits for 1 remaining digit or 7 bits for 2 remaining digits. For non-standard configurations where (ndigits-1) % 3 != 0, the trailing 1-2 digits would be lost.

Standard formats (decimal32/64/128) happen to have divisible trailing digit counts, but this should be guarded or implemented for completeness.

🔧 Suggested fix to handle remainder digits
 		while (remaining >= 3) {
 			unsigned group = static_cast<unsigned>(trailing_val % 1000);
 			trailing_val /= 1000;
 			uint16_t declet = dpd_encode(group);
 			for (unsigned b = 0; b < 10; ++b) {
 				setbit(bit_offset + b, (declet >> b) & 1);
 			}
 			bit_offset += 10;
 			remaining -= 3;
 		}
+		// Handle remainder 1 or 2 digits (4 or 7 bits respectively)
+		if (remaining == 2) {
+			unsigned val = static_cast<unsigned>(trailing_val % 100);
+			for (unsigned b = 0; b < 7; ++b) {
+				setbit(bit_offset + b, (val >> b) & 1);
+			}
+		}
+		else if (remaining == 1) {
+			unsigned val = static_cast<unsigned>(trailing_val % 10);
+			for (unsigned b = 0; b < 4; ++b) {
+				setbit(bit_offset + b, (val >> b) & 1);
+			}
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 1200 - 1216,
dpd_encode_trailing_wide currently only encodes full 3-digit groups and drops
any leftover 1–2 digits; update dpd_encode_trailing_wide (and use ndigits,
trailing_val, bit_offset, setbit and dpd_encode) to handle the remainder after
the while loop: compute remaining %= 3 (or check remaining == 1/2), extract the
final 1 or 2-digit group from trailing_val (use pow10_s or modulus), call
dpd_encode on that group, and write only the lower 4 bits for a single digit or
lower 7 bits for two digits with setbit, then increment bit_offset by 4 or 7
respectively so the function matches dpd_trailing_bits allocation and preserves
those digits.

Comment on lines +1492 to +1503
if (le < re) {
int diff = re - le;
if (diff < static_cast<int>(ndigits)) {
for (int i = 0; i < diff; ++i) rsig *= 10;
}
}
else if (re < le) {
int diff = le - re;
if (diff < static_cast<int>(ndigits)) {
for (int i = 0; i < diff; ++i) lsig *= 10;
}
}
Contributor

⚠️ Potential issue | 🟠 Major

Potential significand overflow in comparison alignment for large ndigits.

When aligning significands for comparison, multiplying by 10^diff can overflow. For decimal128 (ndigits=34), a significand near 10^34 multiplied by 10^33 would yield 10^67, which exceeds __uint128_t capacity (~3.4×10³⁸).

Consider clamping the alignment or using a different comparison strategy that doesn't require full alignment.

🛡️ Suggested fix using safe comparison
 	// same overall scale: compare significands at same exponent
 	// Align to same exponent by adjusting significands
+	// Guard against overflow: limit scaling to safe amount
+	constexpr int safe_scale = (ndigits <= 19) ? 18 : 4; // conservative limits
 	if (le < re) {
 		int diff = re - le;
-		if (diff < static_cast<int>(ndigits)) {
+		if (diff <= safe_scale) {
 			for (int i = 0; i < diff; ++i) rsig *= 10;
 		}
+		else {
+			// Large diff with same scale means rsig is effectively larger
+			return ls; // negative: larger magnitude means smaller; positive: opposite
+		}
 	}
 	else if (re < le) {
 		int diff = le - re;
-		if (diff < static_cast<int>(ndigits)) {
+		if (diff <= safe_scale) {
 			for (int i = 0; i < diff; ++i) lsig *= 10;
 		}
+		else {
+			return !ls;
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/sw/universal/number/dfloat/dfloat_impl.hpp` around lines 1492 - 1503,
The current alignment loop in dfloat_impl.hpp that multiplies rsig/lsig by
10^diff (using rsig, lsig, ndigits, le, re) can overflow for large ndigits;
change the logic to avoid actual large multiplications: if diff >=
static_cast<int>(ndigits) then determine the ordering by exponents (the value
with larger exponent is greater unless the other's significand is zero) and
return that comparison result immediately; otherwise continue with safe
multiplication (or use a checked-multiply loop that breaks on overflow) for diff
< ndigits. Ensure you reference and use rsig, lsig, le, re, and ndigits in the
updated branch so no multiplication by huge powers of ten is performed.

Comment on lines +112 to +119
decimal128_bid ten(10);
decimal128_bid tenth(0.1);
decimal128_bid product = ten * tenth;
decimal128_bid one(1);
if (!(product == one)) {
++nrOfFailedTestCases;
if (reportTestCases) std::cerr << "FAIL: 10 * 0.1 != 1.0, got " << product << '\n';
}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

(CodeRabbit ran a series of verification scripts against stillwater-sc/universal: inspecting the test at lines 112-119 of decimal128.cpp, the dfloat constructors and operator/ overloads in dfloat_impl.hpp, the unimplemented string assign(), and construction patterns elsewhere in the test file. Script outputs are collapsed in this view.)


Use decimal-native construction for the exactness check.

Line 113 constructs tenth from binary literal 0.1 via the double constructor. Since IEEE 754 cannot exactly represent 0.1 in binary, this converts an approximation rather than testing pure decimal arithmetic. The string constructor is unimplemented (marked TODO), so use division to achieve exact decimal 0.1:

🔧 Proposed fix
-		decimal128_bid tenth(0.1);
+		decimal128_bid tenth = decimal128_bid(1) / decimal128_bid(10);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
decimal128_bid ten(10);
decimal128_bid tenth(0.1);
decimal128_bid product = ten * tenth;
decimal128_bid one(1);
if (!(product == one)) {
++nrOfFailedTestCases;
if (reportTestCases) std::cerr << "FAIL: 10 * 0.1 != 1.0, got " << product << '\n';
}
decimal128_bid ten(10);
decimal128_bid tenth = decimal128_bid(1) / decimal128_bid(10);
decimal128_bid product = ten * tenth;
decimal128_bid one(1);
if (!(product == one)) {
++nrOfFailedTestCases;
if (reportTestCases) std::cerr << "FAIL: 10 * 0.1 != 1.0, got " << product << '\n';
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/float/dfloat/standard/decimal128.cpp` around lines 112 - 119, The test
currently constructs tenth using the binary double 0.1 which is inexact; instead
build an exact decimal 0.1 using decimal-native division. Replace the line that
defines tenth (decimal128_bid tenth(0.1)) with a decimal division, e.g.
decimal128_bid tenth = decimal128_bid(1) / decimal128_bid(10) (or
decimal128_bid(1) / ten), so tenth, product (ten * tenth) and the equality check
against one use exact decimal values.

@Ravenwater Ravenwater merged commit 10368fb into main Feb 27, 2026
28 of 29 checks passed
@Ravenwater Ravenwater deleted the v3.104 branch February 27, 2026 02:07

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant