Skip to content

Improve tests about UB/IB #309

@stefano-zanotti-88

Description

@stefano-zanotti-88

Like you did for NaNs, we should move the following tests too -- check them against a known string, not against system printf. They are UB or IB.

The major problem in the tests is that int is not guaranteed to be 32-bit, it might be 16-bit.
For primitive integral types, see the nice table here:
https://en.cppreference.com/w/cpp/language/types
Note that we need explicit checks; turning all of the literals into long by using the l suffix is not correct, since long can be larger than int, so the argument extraction with va_arg would fail.

Since checks for 16-bit ints are never used, do you actually want to support them, or is NPF intended for 32-bit (embedded or not)? Do you know if you have users on 8-bit platforms? That seems likely, given the tinyness goals of NPF.

Here they follow.

UB;

#if NANOPRINTF_USE_FIELD_WIDTH_FORMAT_SPECIFIERS == 1
require_conform("%", "%-%");
#endif // NANOPRINTF_USE_FIELD_WIDTH_FORMAT_SPECIFIERS
require_conform("%", "% %");
require_conform("%", "%+%");
#if NANOPRINTF_USE_ALT_FORM_FLAG == 1
require_conform("%", "%#%");
#endif
// require_conform(" %", "%10%"); clang adds width, gcc doesn't
// require_conform("% ", "%-10%"); clang adds -width, gcc doesn't
// require_conform(" %", "%10.10%"); clang adds width + precision.
// require_conform("%012%"); Undefined

For these, only '\0' is UB, the rest can stay:

SUBCASE("char") {
// every char
for (int i = CHAR_MIN; i < CHAR_MAX; ++i) {
char output[2] = {(char)i, 0};
require_conform(output, "%c", i);
}
require_conform("A", "%+c", 'A');
require_conform("", "%+c", 0);

Also note that printing '\0' produces a 0-length string, but npf returns 1.

"+c" is UB:

require_conform(" A", "%+4c", 'A');

This is commented, but it's good to have (in another file; and now we do pad with 0s):

// require_conform("abc", "%010s", "abc"); // undefined

This should be guarded by if(UINT_MAX == 0xFFFFFFFFu); better if also with a case for 16-bit int

require_conform("4294967295", "%u", UINT_MAX);

Guarded by sizeof(short) < 4 (though I don't know of any platform where this fails -- and the same is true of many many other places: int might be more than 32-bits, and long more than 64, but that's unlikely to happen soon -- and of course all of this is also predicated on CHAR_BIT being 8, but that's another thing that would break much more than these tests)

require_conform("13", "%hu", (1 << 21u) + 13u); // "short" mod clips

This should be guarded by if(UINT_MAX == 0xFFFFFFFFu)

#if ULONG_MAX > UINT_MAX
require_conform("4294967296", "%lu", (unsigned long)UINT_MAX + 1u);
#endif

UB:

require_conform(" 1", "%+4u", 1); // undefined but usually skips +

Guarded by if(sizeof(int) == 4), or individual checks on the two values

require_conform("-2147483648", "%i", INT_MIN);

require_conform("2147483647", "%i", INT_MAX);

Guarded:

require_conform("37777777777", "%o", UINT_MAX);
require_conform("17", "%ho", (1 << 29u) + 15u);

Guarded by if(sizeof(int) == 4) or similar

#if ULONG_MAX > UINT_MAX
require_conform("40000000000", "%lo", (unsigned long)UINT_MAX + 1u);
#endif

'+' is UB on unsigned specifiers

require_conform("0", "%+o", 0);
require_conform("1", "%+o", 1);
require_conform(" 1", "%+4o", 1);

Guarded:

require_conform("12345678", "%x", 0x12345678);
require_conform("ffffffff", "%x", UINT_MAX);

Guarded:

require_conform("90ABCDEF", "%X", 0x90ABCDEF);
require_conform("FFFFFFFF", "%X", UINT_MAX);

UB:

require_conform("0", "%+x", 0);
require_conform("1", "%+x", 1);

Guarded:

require_conform("7b", "%hx", (1 << 26u) + 123u);
#if ULONG_MAX > UINT_MAX
require_conform("100000000", "%lx", (unsigned long)UINT_MAX + 1u);
#endif

These, hadn't they already been moved out?

#if NANOPRINTF_USE_FLOAT_FORMAT_SPECIFIERS == 1
SUBCASE("float NaN") {
std::string const lowercase_nan = []{
char buf[32];
REQUIRE(npf_snprintf(buf, sizeof(buf), "%f", (double)NAN) == 3);
return std::string{buf};
}();
// doctest can't do || inside REQUIRE
if (lowercase_nan != "nan") {
REQUIRE(lowercase_nan == "-nan");
} else {
REQUIRE(lowercase_nan == "nan");
}
std::string const uppercase_nan = []{
char buf[32];
REQUIRE(npf_snprintf(buf, sizeof(buf), "%F", (double)NAN) == 3);
return std::string{buf};
}();
// doctest can't do || inside REQUIRE
if (uppercase_nan != "NAN") {
REQUIRE(uppercase_nan == "-NAN");
} else {
REQUIRE(uppercase_nan == "NAN");
}
}

IB: 'inf' or 'infinity':

require_conform("inf", "%f", (double)INFINITY);
require_conform("-inf", "%f", -(double)INFINITY);
#if NANOPRINTF_USE_FIELD_WIDTH_FORMAT_SPECIFIERS == 1
require_conform(" inf", "%4f", (double)INFINITY);
#endif // NANOPRINTF_USE_FIELD_WIDTH_FORMAT_SPECIFIERS
require_conform("inf", "%.100f", (double)INFINITY);
require_conform("inf", "%.10f", (double)INFINITY);
require_conform("inf", "%.10e", (double)INFINITY);
require_conform("inf", "%.10g", (double)INFINITY);
require_conform("inf", "%.10a", (double)INFINITY);
require_conform("INF", "%F", (double)INFINITY);

Yet other UB/IB tests:
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L130
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L132-L136
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L160-L171
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L296
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L298-L302
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L314
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L316-L320
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L334
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L336-L340
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L358
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L360-L364
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L377
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L379-L383
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L404
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L406-L410
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L422
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L424-L428
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L439
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L441-L445
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L457
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L462-L466
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L475
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L480-L484
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L496
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L498-L502
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L563
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L566
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L568-L571
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L573-L574
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L762
https://github.com/charlesnicholson/nanoprintf-paland-conformance/blob/25420456ecc3ced5c7cdd692a03889773b0c03a4/paland.cc#L764

Finally, for this:

npf_pprintf(+[](int, void*) {}, nullptr, "%u%s%n", 0, "abcd", &writeback);
REQUIRE(writeback == 5);

I'd say a more thorough test would be helpful (repeated for all possible %n sizes):

int writeback  = -1;
printf("%s%n%i", "abcd...", &writeback , 1234); // check output "abcd...1234", with a string at least 128 long (except for "%hhn"), check writeback == 128

This has an additional argument after %n, to check that the advancement through va_arg is correct.
Also, we should probably check that the write back does not overflow; but how to do that? Maybe:

int writeback [3] = {-1, -1, -1};
printf("%s%n%i", "abcd", &writeback[1], 1234);
check writeback [0] == -1, writeback[2] == -1

Though the compiler might decide to remove those checks, since they can only fail on UB.
So, this should do:

struct {
  volatile int guard1;
  int writeback;
  volatile int guard2;
} s = {-1, -1, -1};
printf("%s%n%i", "abcd", &s.writeback, 1234);
check s.guard1 == -1, s.guard2 == -1

volatile prevents removing the checks; the struct forbids the reordering or disappearance of the variables.

A similar check-that-va_arg-advancement-is-correct should be done for %i %b, %x -- that is, at least once per each different code path that performs argument extraction.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions