Binary and Hex values should 'just work' for signed values. #717

tannergooding · 2017-07-03T21:28:44Z

tannergooding
Jul 3, 2017
Collaborator

Currently int x = 0xFFFF0000 is invalid as 0xFFFF0000 is treated as System.UInt32. The same goes for 0b11111111111111110000000000000000.

In order to get the above to work (without casting), one has to define int x = -65536, int x = -0x10000, or int x = -0b10000000000000000. With casting, one would define int x = unchecked((int)0xFFFF0000) or int x = unchecked((int)0b11111111111111110000000000000000). None of these methods are as readable as just defining the raw value.

I propose that binary and hex values should 'just work'. That is, provided the number of bits/hex-digits defined matches the number of bits/hex-digits in the target type, it should be useable without conversion.

byte/sbyte would allow up-to 8 binary or 2 hex-digits.
short/ushort would allow up-to 16 binary or 4 hex-digits
int/uint would allow up-to 32 binary or 8 hex-digits
long/ulong would allow up-to 64 binary or 16 hex-digits

jnm2 · 2017-07-03T21:36:28Z

jnm2
Jul 3, 2017
Collaborator

Seems like ambiguities could arise, especially if a literal is copied and pasted. I really like (int)0xFFFF0000 because it's impossible to misunderstand the intention.

The fix I'd rather see is for (int)0xFFFF0000 to always be an unchecked literal.

0 replies

MgSam · 2017-07-04T14:25:47Z

MgSam
Jul 4, 2017

I don't understand the problem here. If you are treating the number as signed, then it seems to me showing the sign is the most clear thing you could do. If you want to show the full bit pattern for some reason you are not actually treating it as a signed number, so a cast to force you to state your intent doesn't seem like much of a penalty.

I'd argue changing of this behavior would lead to more bugs with virtually no benefit.

0 replies

tannergooding · 2017-07-05T16:37:15Z

tannergooding
Jul 5, 2017
Collaborator Author

@jnm2 do you have an explicit sample of where an ambiguity could arise (it might be too early, but I can't currently think of any scenarios where 0xFFFF0000 and (int)0xFFFF0000 would produce different results).

The only places I can really think of right now are var, where it would likely continue defaulting to uint for back-compat.

0 replies

tannergooding · 2017-07-05T16:39:07Z

tannergooding
Jul 5, 2017
Collaborator Author

That being said, I think explicitly allowing (int)hex-literal or (int)binary-literal to always be unchecked would be good to.

0 replies

tannergooding · 2017-07-05T16:47:49Z

tannergooding
Jul 5, 2017
Collaborator Author

@MgSam, hex and binary values carry and display the sign in their representation, while decimal values do that by displaying the sign (+/-).

Traditionally (or at least in all code I've ever looked at), you display -65536 as:

Binary: 0b11111111111111110000000000000000
Hex: 0xFFFF0000
Decimal: -65536

However, you can also write it as:

Binary: -0b10000000000000000
Hex: -0x10000
Decimal: -65536

It's probably worth noting that C/C++ currently supports what I am proposing:

int b = 0b11111111111111110000000000000000;     // just works
int x = 0xFFFF0000;                             // just works
int n = -65536;                                 // just works

auto ab = 0b11111111111111110000000000000000;   // unsigned int
auto ax = 0xFFFF0000;                           // unsigned int
auto an = -65536;                               // int

auto sb = -0b10000000000000000;                 // int
auto sx = -0x10000;                             // int
auto sn = -65536;                               // int

0 replies

MgSam · 2017-07-05T17:40:00Z

MgSam
Jul 5, 2017

Yes, it's an accurate representation of the bits, but its confusing for someone reading the code. If it's supposed to be a negative number, then why not put in a negative sign? Eliding it only increases the chances that someone is confused and thinks a) it really should be an unsigned number or b) misreads the number.

C# is intentionally designed to be a higher-level language than C/C++, thus disallowing many things that those languages allow that could lead to bugs.

0 replies

tannergooding · 2017-07-05T18:18:25Z

tannergooding
Jul 5, 2017
Collaborator Author

I'm not so sure its confusing for someone reading in the code. Today in C# code, I see unchecked((int)0xFFFF0000) more than I have seen -0x10000 (which I only saw due to invoking a new IDE code-fix).

I'm also fairly certain that is why the comment on just allowing (int)0xFFFF0000 rather than requiring unchecked((int)0xFFFF0000) is being upvoted.

0 replies

jnm2 · 2017-07-05T20:31:12Z

jnm2
Jul 5, 2017
Collaborator

@tannergooding

do you have an explicit sample of where an ambiguity could arise (it might be too early, but I can't currently think of any scenarios where 0xFFFF0000 and (int)0xFFFF0000 would produce different results).

Yep, say you're looking at this line:

Foo(0xFFFF0000);

And you want to use the exact same value in Bar, so you write the next line:

Foo(0xFFFF0000);
Bar(0xFFFF0000);

Imagine that each method prints its argument. The output is this:

-65536
4294901760

^{Given
void Foo(int value) => Console.WriteLine(value);
void Bar(long value) => Console.WriteLine(value);}

I'd much rather the compiler kept its current complaint and required you to write Bar((int)0xFFFF0000);– with the rule that literals cast to a constant-able type are always unchecked.

0 replies

tannergooding · 2017-07-05T20:36:21Z

tannergooding
Jul 5, 2017
Collaborator Author

You have the same thing today with:

void Foo(uint value) => Console.WriteLine(value);
void Bar(long value) => Console.WriteLine(value);

I'm not sure that (int)0xFFFF0000 makes it any more readable, other than explicitly calling out that you are passing in the value as an int, which you could do if desired.

I do understand the reason people would want to keep it explicit though. If this makes it to LDM and they decide to not just allow int x = 0xFFFF0000 (like C/C++ allows), your proposal of allowing int x = (int)0xFFFF0000 would probably be the next best thing.

0 replies

jnm2 · 2017-07-05T20:39:44Z

jnm2
Jul 5, 2017
Collaborator

I'm not sure that (int)0xFFFF0000 makes it any more readable

It does because it signals that the number is interpreted as a negative, unlike existing literals with no sign.

0 replies

jnm2 · 2017-07-05T20:42:39Z

jnm2
Jul 5, 2017
Collaborator

Also, I think my proposal of making literal casts to const-able types unchecked has wider value, and we can always come back to this if it turns out that we really want to go even further.

0 replies

tannergooding · 2017-07-05T20:48:36Z

tannergooding
Jul 5, 2017
Collaborator Author

Do you have a proposal up on that yet? If not, you should 😄

0 replies

jnm2 · 2017-07-05T20:50:31Z

jnm2
Jul 5, 2017
Collaborator

@tannergooding Current status = trying to keep my head above water with work for the next month (or few?)... but I'd be motivated to write it up if it kept you from going ahead with this 😆

0 replies

tannergooding · 2017-07-05T20:56:13Z

tannergooding
Jul 5, 2017
Collaborator Author

I'm going to keep this open regardless (because I would prefer this -- it makes interop code so much easier to write), was mostly just wondering if you had a proposal so that LDM will potentially look at it eventually 😄

0 replies

yaakov-h · 2017-07-06T08:46:41Z

yaakov-h
Jul 6, 2017

It does because it signals that the number is interpreted as a negative, unlike existing literals with no sign.

Well no, it just signals that the number is an int and gets whatever sign is appropriate. If I accidentally drop a digit and it becomes (int)0xFFF0000 that doesn't become a negative (I think?).

I agree that in ambiguous cases a cast would be useful, but I also think that explicit cases such as int a = 0xFFFFFFFF or enum X : int { Value = 0xFFFFFFFF } should be valid without casts or unchecked.

These would make interop code - particularly Win32 interop code - much nicer.

0 replies

jnm2 · 2017-07-06T13:13:47Z

jnm2
Jul 6, 2017
Collaborator

@yaakov-h

Well no, it just signals that the number is an int and gets whatever sign is appropriate.

Exactly. But given the context, it'll be unusual to use it unless the number is negative.

@lachbaer Can I pawn that off on you? 😁

0 replies

tannergooding · 2017-07-12T21:59:35Z

tannergooding
Jul 12, 2017
Collaborator Author

@gafter, I was told (by @jaredpar) that you would be a good person to tag here on possible overload resolution consequences.

Under my current proposal, a hex or binary literal will be treated as uint for var and will continue defaulting to uint for overload resolution. However, if explicitly assigned to an int, no casting will be required. It will also work for int overloads if there is no uint which qualifies as a better match. (corresponding rules for sbyte/byte, short/ushort, and long/ulong). I imagine uint would also continue favoring ulong over int (and same for implicit upcasting on the other types).

0 replies

gafter · 2017-07-13T15:49:25Z

gafter
Jul 13, 2017

@tannergooding This is the sort of thing I'd be concerned about

public void M(int i, string o) ...
public void M(uint u, object s) ...
...
M(0xffffffff, null); // used to call M(uint u, object s)

Only the second method is applicable in C# 7. With the new conversion from 0xffffffff to int, both methods are applicable.

The tiebreaker you suggest would presumably be inserted into the overload resolution algorithm somewhere to prevent this from turning into an ambiguous call (because both methods are applicable). Perhaps this would be added to the "better conversion from expression" section.

The problem is that in this example there is a "better conversion from expression" for the second argument that prefers the first overload. If the new rule makes the first argument expression a "better conversion from expression" to the second overload, then this example changes from one that used to work into one for which there is an ambiguity. That would be a breaking change, which is what we hope to avoid.

0 replies

ghost · 2018-02-05T00:00:00Z

ghost
Feb 5, 2018

I don't see any confusion. If var is used, the number can be assumed to be the shortest positve type, otherwise the programmer should clear his intentions.

0 replies

ghost · 2018-02-05T00:18:31Z

ghost
Feb 5, 2018

This ssue is related to the same area:
https://github.com/dotnet/corefx/issues/26826

0 replies

gafter · 2018-02-05T00:21:02Z

gafter
Feb 5, 2018

I don't see any confusion.

The confusion is that the existing program (which contains no vars) in my previous comment would change its behavior. It would cease to compile, now being technically ambiguous, preventing the successful compilation of any previously-compiling program that contained something like it in source.

0 replies

ghost · 2018-02-05T00:27:14Z

ghost
Feb 5, 2018

At least give it a new symbole: 0nb1111_1111.

0 replies

jnm2 · 2018-02-05T01:08:50Z

jnm2
Feb 5, 2018
Collaborator

@MohammadHamdyGhanem I still say (int)0b1111_1111 is the clearer and more idiomatic option. All we need is for unchecked to be considered unnecessary for such literal casts.

0 replies

gafter · 2018-02-05T01:16:29Z

gafter
Feb 5, 2018

@jnm2 The fact that constant expressions are evaluated in a checked context (with compile-time reporting of errors) was a carefully considered part of the language design. It isn't clear to me how to relax that without undermining some valuable compile-time checking.

0 replies

jnm2 · 2018-02-05T01:21:57Z

jnm2
Feb 5, 2018
Collaborator

@gafter A new rule that if a literal (not just any constant) of an unsigned integer type is being explicitly cast to the signed version, unchecked is not required.

0 replies

gafter · 2018-02-05T01:34:13Z

gafter
Feb 5, 2018

@jnm2 That would not help with (short)0xffff as the type of the literal is int.

0 replies

jnm2 · 2018-02-05T01:37:17Z

jnm2
Feb 5, 2018
Collaborator

@gafter I'm actually mildly happy with that outcome. @tannergooding?

0 replies

yaakov-h · 2018-02-05T07:09:46Z

yaakov-h
Feb 5, 2018

@gafter did you mean 0x?

0 replies

gafter · 2018-02-05T07:21:14Z

gafter
Feb 5, 2018

@yaakov-h Yes.

0 replies

mellamokb · 2023-08-26T04:03:36Z

mellamokb
Aug 26, 2023

Hmm, this discussion seems very stale but it's the only reference I could find.

My use case is I'm trying to understand the binary representation of a decimal value from a hex editor. So I copied out the hex values directly. But I had to research about unchecked to get it to work and this seems unwieldy.

var lo = unchecked((int)0xE8000000);
var md = unchecked((int)0x9FD0803C);
var hi = unchecked((int)0x033B2E3C);
var d2 = new decimal(lo, md, hi, false, 0)

Because the raw hex values are as-is I didn't want to do the 2's complement, so at a glance I could verify they match. It would have been much simpler figuring this out if I could simply have cast them to int.

0 replies

Binary and Hex values should 'just work' for signed values. #717

Uh oh!

tannergooding Jul 3, 2017 Collaborator

Replies: 31 comments

Uh oh!

jnm2 Jul 3, 2017 Collaborator

Uh oh!

Uh oh!

tannergooding Jul 5, 2017 Collaborator Author

Uh oh!

tannergooding Jul 5, 2017 Collaborator Author

Uh oh!

tannergooding Jul 5, 2017 Collaborator Author

Uh oh!

Uh oh!

tannergooding Jul 5, 2017 Collaborator Author

Uh oh!

jnm2 Jul 5, 2017 Collaborator

Uh oh!

tannergooding Jul 5, 2017 Collaborator Author

Uh oh!

jnm2 Jul 5, 2017 Collaborator

Uh oh!

jnm2 Jul 5, 2017 Collaborator

Uh oh!

tannergooding Jul 5, 2017 Collaborator Author

Uh oh!

jnm2 Jul 5, 2017 Collaborator

Uh oh!

tannergooding Jul 5, 2017 Collaborator Author

Uh oh!

Uh oh!

jnm2 Jul 6, 2017 Collaborator

Uh oh!

tannergooding Jul 12, 2017 Collaborator Author

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jnm2 Feb 5, 2018 Collaborator

Uh oh!

Uh oh!

jnm2 Feb 5, 2018 Collaborator

Uh oh!

Uh oh!

jnm2 Feb 5, 2018 Collaborator

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tannergooding
Jul 3, 2017
Collaborator

jnm2
Jul 3, 2017
Collaborator

tannergooding
Jul 5, 2017
Collaborator Author

tannergooding
Jul 5, 2017
Collaborator Author

tannergooding
Jul 5, 2017
Collaborator Author

tannergooding
Jul 5, 2017
Collaborator Author

jnm2
Jul 5, 2017
Collaborator

tannergooding
Jul 5, 2017
Collaborator Author

jnm2
Jul 5, 2017
Collaborator

jnm2
Jul 5, 2017
Collaborator

tannergooding
Jul 5, 2017
Collaborator Author

jnm2
Jul 5, 2017
Collaborator

tannergooding
Jul 5, 2017
Collaborator Author

jnm2
Jul 6, 2017
Collaborator

tannergooding
Jul 12, 2017
Collaborator Author

jnm2
Feb 5, 2018
Collaborator

jnm2
Feb 5, 2018
Collaborator

jnm2
Feb 5, 2018
Collaborator