Skip to content

Conversation

@stefano-zanotti-88
Copy link
Contributor

This optimizes argument fetching/writing for redundant 'h' and 'l/ll/j/z/t'.

When sizeof(short) == sizeof(int), we can have the enum value NPF_FMT_SPEC_LEN_MOD_SHORT be an alias for NPF_FMT_SPEC_LEN_MOD_NONE.
Similarly for the long ones (aliased to NONE or LONG).
We can then conditionally remove some of the NPF_EXTRACT and NPF_WRITEBACK cases.
This is done in this PR.

We can also remove LONG if it aliases NONE, with some caveats.
'l' is needed for %c/%s, and especially for the floats. But in both cases they are the same as NONE -- that's because we don't support them, for %c and %s; and because the standard says so, for floats. So, for all specifiers, LONG can correctly alias NONE.
This is also done in this PR.
If %lc and %ls will ever be supported, this will need revisiting.

Since 'hh' is the size of char, and from what I understand (short) must be at least 2*sizeof(char), 'hh' can't possibly alias any other of the types involved here.

This also addresses the loosely related #311.

What size changes do you see on your reports, for this branch?
Running nm locally, I get this strange-looking results:
Without type removal (before):

0000000000000010 t npf_bufputc_nop
0000000000000020 t npf_bufputc
0000000000000020 T npf_pprintf
0000000000000020 t npf_vpprintf.cold
0000000000000080 T npf_vsnprintf
0000000000000090 T npf_snprintf
00000000000021b0 T npf_vpprintf
0000000000002380 t npf_utoa_rev

With type removal (after):

0000000000000010 t npf_bufputc_nop
0000000000000020 t npf_bufputc
0000000000000020 T npf_pprintf
0000000000000070 t npf_utoa_rev
0000000000000090 T npf_snprintf
00000000000000b0 T npf_vsnprintf
0000000000002200 T npf_vpprintf

The size of npf_utoa_rev before is particularly strange; that's a very simple function.

Also mind that I've changed the order of the NPF_FMT_SPEC_LEN_MOD_* enums. I don't think this should have any impact at all

@stefano-zanotti-88
Copy link
Contributor Author

To elaborate a bit:
NPF_LEN_MOD_LONG_IS_ALIASED set to 1 (aliased to NONE) is always correct for floats. We can't always do that because it might not be true for integers.
Anyway, we already ignore it for floats, with code like:

if (fs.length_modifier == NPF_FMT_SPEC_LEN_MOD_LONG_DOUBLE) {
  val = (double)va_arg(args, long double);
} else {
  val = va_arg(args, double);
}

@stefano-zanotti-88
Copy link
Contributor Author

Other considerations about va_arg-extraction.
This:

typedef ptrdiff_t npf_ssize_t;

is used here:
NPF_EXTRACT(LARGE_SIZET, npf_ssize_t, npf_ssize_t);

However, can't we just extract the parameter as size_t, rather than ssize_t? The two types are guaranteed to be the same width, and we assigned the value to npf_int_t val anyway, without even a cast (I mean no cast to npf_int_t). Is it actually UB to access a value with va_arg using the wrong signedness?
The C23 standard says (§7.16.1.1):

The va_arg macro expands to an expression that has the specified type
[…]
The parameter type shall be an object type
name. If type is not compatible with the type of the actual next argument (as promoted according to
the default argument promotions), the behavior is undefined, except for the following cases:
— both types are pointers to qualified or unqualified versions of compatible types;
— one type is compatible with a signed integer type, the other type is compatible with the
corresponding unsigned integer type, and the value is representable in both types;
— one type is pointer to qualified or unqualified void and the other is a pointer to a qualified or
unqualified character type;
— or, the type of the next argument is nullptr_t and type is a pointer type that has the same
representation and alignment requirements as a pointer to a character type.296)

So, though signed and unsigned are incompatible (that's from other parts of the standard), va_arg has a specific exception for it; but only if the value is representable in both types (which mostly means that the sign bit is 0, and which might or might not be less of a problem now that 2's complement is the only acceptable representation for signed integers, though this is recent -- maybe C23 and C++14).
This seems like something that is formally UB but that no sane implementation would ever break.

In the current PR, however, the signedness is not affected. Only the actual type is "aliased" (but the size is correct). However, those types are not strictly "compatible".

Other libraries seem to incur in this UB without (?) problems:
STB (https://github.com/nothings/stb/blob/f0569113c93ad095470c54bf34a17b36646bbbb5/stb_sprintf.h#L1061)
eyalroz (https://github.com/eyalroz/printf/blob/013db1e345cbb166a7eb758da2b4edba93ad2458/src/printf/printf.c#L1394)
glibc (https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/vfprintf.c;h=fc370e8cbc4e9652a2ed377b1c6f2324f15b1bf9;hb=3321010338384ecdc6633a8b032bb0ed6aa9b19a)

@charlesnicholson
Copy link
Owner

This is very interesting; I'm surprised arm-gcc can't figure this out on its own since the underlying types are identical. Or, maybe it does? Looking at your metrics:

your "before"

0000000000000010 t npf_bufputc_nop
0000000000000020 t npf_bufputc
0000000000000020 T npf_pprintf
0000000000000020 t npf_vpprintf.cold
0000000000000080 T npf_vsnprintf
0000000000000090 T npf_snprintf
00000000000021b0 T npf_vpprintf
0000000000002380 t npf_utoa_rev

and your "after"

0000000000000010 t npf_bufputc_nop
0000000000000020 t npf_bufputc
0000000000000020 T npf_pprintf
0000000000000070 t npf_utoa_rev
0000000000000090 T npf_snprintf
00000000000000b0 T npf_vsnprintf
0000000000002200 T npf_vpprintf

Ignoring the strange npf_utoa_rev thing, which I don't understand yet, and directly comparing npf_vpprintf, we have:

Before: 0x20 (npf_vpprintf.cold) + 0x21b0 (npf_vpprintf) = 0x21d0
After: 0x2200

At least from this data, it looks like this change increases the footprint by 0x30 bytes?

@stefano-zanotti-88
Copy link
Contributor Author

I don't know whether the compiler can be that clever, since the simplifications I'm doing are (as I understand it) technically UB. Also, my numbers were taken with -O3 rather than -Os, something I noticed only afterwards.
And, I would trust your own size reports much more than mine, given the minor issues I've encountered with it.

@stefano-zanotti-88
Copy link
Contributor Author

What is your feeling about this potential added UB? Also, do you understand the same as I do, or do you think it's actually legal to alias types that way?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants