Proposal: Copy on Write for Structs #644

salmankhann · 2017-05-31T07:12:25Z

salmankhann
May 31, 2017

Copy on Write for value semantics, as implemented in Swift, provides a very good immutable model for value types: the compiler copies a value type automatically when any of its value type properties are modified.

Is there a reason why a similar model cannot also be implemented in C#?

DavidArno · 2017-05-31T10:26:38Z

DavidArno
May 31, 2017

Is there a reason why a similar model cannot also be implemented in C#?

Because it's slower than mutable values and the "speed heads" don't like that.

However, there are discussions around "withers" or "with expressions" that could be used to achieve this. See #162 and #77 for some details.

salmankhann · 2017-05-31T12:57:31Z

salmankhann
May 31, 2017
Author

Hmm, but isn't the point of value types that they are immutable by default? Because it's the value that matters, not the identity.

For example,

int a = 1;
int b = a;

a = 2;
// b is still 1

Structs should follow value semantics and not reference semantics like classes.

Now I understand the point about pleasing the "speed heads" and maintaining backward compatibility. A keyword could be introduced while defining a struct so that the value semantics can be enforced by the compiler. Don't you think?

HaloFour · 2017-05-31T13:03:41Z

HaloFour
May 31, 2017

@salmankhann

They are mutable by default, but when you assign a struct you are making a copy:

Point p1 = new Point(2, 2);
Point p2 = p1;
p1.X = 4; // mutates p1 in place
Debug.Assert(p2.X == 2);

The only way to take a reference to a struct is explicitly via the ref keyword:

Point p1 = new Point(2, 2);
ref Point p2 = ref p1;
p1.X = 4;
Debug.Assert(p2.X == 4);
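The two snippets above can be put side by side in one small program. This is only a sketch: the `Point` struct and the `CopyVersusRef` class are hypothetical stand-ins written for illustration, and the ref local assumes C# 7.0 or later.

```csharp
using System;

// Hypothetical mutable struct standing in for the Point used above.
public struct Point
{
    public int X;
    public int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class CopyVersusRef
{
    // Plain assignment copies the struct, so p2 does not see the later write to p1.
    public static int CopyThenMutate()
    {
        Point p1 = new Point(2, 2);
        Point p2 = p1;   // independent copy
        p1.X = 4;        // mutates p1 in place
        return p2.X;
    }

    // A ref local aliases the same storage, so p2 does see the write to p1.
    public static int AliasThenMutate()
    {
        Point p1 = new Point(2, 2);
        ref Point p2 = ref p1;   // alias, not a copy
        p1.X = 4;
        return p2.X;
    }

    public static void Main()
    {
        Console.WriteLine(CopyThenMutate());   // prints 2
        Console.WriteLine(AliasThenMutate());  // prints 4
    }
}
```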

DavidArno · 2017-05-31T13:05:56Z

DavidArno
May 31, 2017

@salmankhann,

I 100% agree with you, but folk with way more influence over the design of C# wave their arms around and mutter nonsense about how they are sort of immutable really as copies are made on assignment and return. Load of old tosh in my view, but they get to make the language decisions, so we are stuck with mutable structs.

HaloFour · 2017-05-31T13:09:44Z

HaloFour
May 31, 2017

@DavidArno

The claim isn't that they are magically immutable; the claim is that their copy semantics make the mutability isolated and thus "safe". Everything beyond that is ideology, on both sides. C# doesn't descend, syntactically or philosophically, from functional languages.

salmankhann · 2017-05-31T13:18:33Z

salmankhann
May 31, 2017
Author

@HaloFour that's perfect. I seem to have been misinformed by a Stack Overflow post about this. Thanks for clarifying.

But what is the motivation behind not implementing them as copy on write? Doesn't this help improve performance under the hood?

HaloFour · 2017-05-31T13:44:04Z

HaloFour
May 31, 2017

@salmankhann

But what is the motivation behind not implementing them as copy on write? Doesn't this help improve performance under the hood?

I don't know much about how value types are implemented in Swift, so I can't really compare directly. In C# a value type is just a block of memory reserved on the stack. In order to employ CoW you'd be required to take a pointer to that memory and pass that around, relying on the consumers of that pointer to "do the right thing" by not writing to it directly. You'd also still have to copy the pointer value around, which is likely 64 bits wide.

A quick Google of CoW in Swift seems to illustrate that it's a convention accomplished by taking a struct, which has copy semantics, and making it contain a class, which has reference semantics. So you'd incur both the memory allocation plus the copy plus more memory allocations for CoW. Honestly it seems like the worst of both worlds from a performance point of view. But for larger data types like arrays or dictionaries I guess it makes sense. In C# those are simply reference types, so they're faster in C# but they aren't immutable/CoW.
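To make the "struct that contains a class" convention concrete, here is a minimal C# sketch. `CowIntBuffer` and `CowDemo` are hypothetical names invented for this illustration. Note one deliberate simplification: this version clones on every write. Swift wrappers typically check whether the buffer is uniquely referenced first and skip the clone if so; managed .NET objects have no reference count to consult, so that check is not reproduced here.

```csharp
using System;

// Hypothetical wrapper: a struct (copy semantics) holding an array (reference semantics).
public struct CowIntBuffer
{
    private int[] _items;

    public CowIntBuffer(int[] items)
    {
        // Take a private copy so the caller's array cannot alias our storage.
        _items = (int[])items.Clone();
    }

    public int this[int index] => _items[index];

    // Clone the (possibly shared) array before writing, then repoint
    // only this struct instance at the clone.
    public void Set(int index, int value)
    {
        int[] copy = (int[])_items.Clone();
        copy[index] = value;
        _items = copy;
    }
}

public static class CowDemo
{
    public static (int first, int second) Run()
    {
        var a = new CowIntBuffer(new[] { 1, 2, 3 });
        var b = a;       // copies one reference; the array itself is shared
        b.Set(0, 42);    // the array is cloned here, so a is unaffected
        return (a[0], b[0]);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"{result.first} {result.second}");  // prints "1 42"
    }
}
```

The costs mentioned above are visible in the sketch: one heap allocation for the array, a pointer-sized copy each time the struct is assigned, and another allocation whenever a write happens.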

DavidArno · 2017-05-31T14:03:45Z

DavidArno
May 31, 2017

@HaloFour,

Everything beyond that is ideology

The "C# should stay true to its mutable, OO, roots" line is ideology too.

HaloFour · 2017-05-31T14:08:13Z

HaloFour
May 31, 2017

@DavidArno

The "C# should stay true to its mutable, OO, roots" line is ideology too.

Which is why I included, "on both sides."

C# is what it is; the behavior of the existing syntax is extremely unlikely to change. Differing behavior of new features can certainly be considered, but consistency is still an important factor. If tuples weren't value types they very likely would have been immutable.

Your disagreement with the philosophy and direction of C# is well noted, but if you're expecting the language to start moving wholly in a different direction you'll probably just find yourself constantly disappointed. C# is never going to become F#.

DavidArno · 2017-05-31T14:14:26Z

DavidArno
May 31, 2017

... you'll probably just find yourself constantly disappointed ...

How right you are! 😄

jnm2 · 2017-05-31T18:02:37Z

jnm2
May 31, 2017
Collaborator

@DavidArno I'm not really sure how to respond to the things you bring up because they're not really where we are anymore. Bashing the team and saying you're disappointed is... I mean, even you are assuming it's a dead end to the conversation.

CyrusNajmabadi · 2017-05-31T18:51:11Z

CyrusNajmabadi
May 31, 2017
Collaborator

and mutter nonsense about how they are sort of immutable really as copies are made on assignment and return.

That's not the argument that has been made. The argument that has been made is that the approach we've provided gives a good balance of speed as well as safety. We care a ton about immutability and safety (just see how the Roslyn compiler and APIs are designed). But we're not zealots about it. For example, Roslyn implements many immutable facades that are actually heavily mutable under the covers. Indeed, it's how we can produce that immutable facade so efficiently.

In practice, we've found that the major benefit of immutability is actually not that the contents can't change at all, but that you can share safely. So, with tuples, if I pass you my tuple, you can't affect the tuple I have. Indeed, I can pass myself this tuple and change one without having to worry about changing another.
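The sharing guarantee described here can be checked with a few lines. This is a sketch only; `TupleSharing` and its method names are hypothetical, and it assumes C# 7.0 tuple syntax.

```csharp
using System;

public static class TupleSharing
{
    // The callee receives its own copy of the tuple, so this write is local to it.
    private static void Mutate((int x, int y) t)
    {
        t.x = 99;
    }

    // Passing a tuple to someone else cannot change the caller's tuple.
    public static int CallerValueAfterCall()
    {
        var mine = (x: 1, y: 2);
        Mutate(mine);
        return mine.x;
    }

    // Copying a tuple and changing the copy leaves the original alone.
    public static int OriginalAfterCopyMutation()
    {
        var first = (x: 1, y: 2);
        var second = first;
        second.x = 5;
        return first.x;
    }

    public static void Main()
    {
        Console.WriteLine(CallerValueAfterCall());       // prints 1
        Console.WriteLine(OriginalAfterCopyMutation());  // prints 1
    }
}
```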

If we could have had identical performance with completely immutable tuples, we would have made tuples completely immutable. But, in actuality, the performance is quite different, and once you get to even a few elements, the perf hit can become significant due to all the copying that happens. We wanted people to be able to adopt tuples without putting friction points like this into place. We've seen firsthand how even small perf issues can completely kill a feature from being used in a high-perf scenario. For example, Roslyn itself has to avoid things like Linq/Lambdas in many core codepaths due to their overhead. We'd like to avoid that sort of thing when possible, and it made little sense to go for ideological purity that wouldn't actually help anyone out, and would actually end up hurting this feature.

CyrusNajmabadi · 2017-05-31T18:58:45Z

CyrusNajmabadi
May 31, 2017
Collaborator

If tuples weren't value types they very likely would have been immutable.

Precisely. I laid out four options we considered and the pros/cons of all of them. As was demonstrated, there was no perfect answer. There were only tradeoffs. And the moment you have tradeoffs you're going to end up with something that isn't perfect for all scenarios. This is how development almost always works, and I generally expect that people can understand this. What I don't understand is that some people seem to take the position that the decision was purely wrong, and only one decision was correct (even though that decision had downsides of its own). This is not the first time this has happened; we've had several language changes that also had different sets of pros/cons.

I'm not expecting people to come to the same balance as us as to what value those pros/cons come out to. But I am a little disappointed that there are some that don't seem to want to even acknowledge that there are potentially other considerations that factor into these decisions, and that continually harp that we are "mutter[ing] nonsense".

The idea of immutable tuples was not only considered, it was also tested. We saw the actual perf problems and we saw that it didn't actually dramatically improve the programming experience. We used this actual information to drive a design that we felt had higher value in the end. If you want to treat that as 'nonsense' so be it. But this is how we've always designed and how we're going to continue to design for the foreseeable future.

orthoxerox · 2017-06-01T07:33:23Z

orthoxerox
Jun 1, 2017

@DavidArno the principal ideology is "C# should not directly harm one programming paradigm by preferring another". So if you make tuples immutable, this directly hinders procedural code and low-level code written in C#. If you make them mutable, this doesn't directly hinder pure functional code, you simply don't mutate them yourself and are slightly annoyed when you have to deal with libraries that pass around refs to mutable tuples.

salmankhann · 2017-06-01T13:11:25Z

salmankhann
Jun 1, 2017
Author

@HaloFour

Honestly it seems like the worst of both worlds from a performance point of view

This seems unlikely, considering that a copy-on-write feature on value types would theoretically always be faster than copying all the time. Currently in C#, passing a struct as a function argument always creates a copy, which is unnecessary when the struct is only read and never modified. I take the view that Copy on Write from Swift will be much better in terms of performance since it gives you a good balance between not making unnecessary copies everywhere & safety in mutating objects across scopes and thread boundaries. Remember, the entire point of using value types is that the task of ensuring safety falls on the compiler rather than on the developer. Having copy on write enforces that and provides a performance gain. And the good thing is that consumers will not need to change anything in their existing code base since all the changes are under the hood.

I also have another question here. In Swift, all collections are also value types, but it does not seem to be the case in C#. At least an array of value types should be a value type, right? If so, it enhances safety, since these arrays of value types can be passed around safely across scopes and threads. Are there plans to implement this? Or better still, is there an array or list implementation in the C# world purely based on value semantics and not reference semantics?

I'd like to add that I'm not a Swift advocate. I work with both languages and I appreciate all the effort!

HaloFour · 2017-06-01T13:23:38Z

HaloFour
Jun 1, 2017

@salmankhann

This seems unlikely, considering that a copy-on-write feature on value types would theoretically always be faster than copying all the time.

You're always copying something anyway, whether that be the value or a pointer to that value. This is true in every programming language, including Swift. Passing a copy of a struct that is 64 bits wide is no more expensive than passing a pointer to that same struct.

I take the view that Copy on Write from Swift will be much better in terms of performance since it gives you a good balance between not making unnecessary copies everywhere & safety in mutating objects across scopes and thread boundaries.

Everything I've read about CoW in Swift is that it's not a language feature but rather a convention established by writing a value type that internally contains a reference type. That value type wrapper must be written specifically to manage allocation/manipulation so as to provide thread safety. The creation and manipulation of that type is more expensive than a regular reference type. So is passing it around, since you have to copy the value type (which is at least a pointer wide since it internally contains a reference). But it's cheaper than copying a giant blob of an array or other collection.

I also have another question here. In Swift, all collections are also value types, but it does not seem to be the case in C#. At least an array of value types should be a value type, right?

Almost all collections in the CLR are reference types. There are exceptions, such as ImmutableArray<T>, which has the same CoW semantics as arrays in Swift. Normal arrays are reference types as they are allocated at runtime, with the exception of blocks of memory allocated via stackalloc.
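As a rough illustration of the `ImmutableArray<T>` point, assuming the `System.Collections.Immutable` library is referenced: the type is a struct wrapping an array, so assigning it copies only the wrapper, and `SetItem` returns a new `ImmutableArray<T>` instead of writing into the shared array. `ImmutableArrayDemo` is a hypothetical name used only for this sketch.

```csharp
using System;
using System.Collections.Immutable;

public static class ImmutableArrayDemo
{
    public static (bool isValueType, int original, int updated) Run()
    {
        ImmutableArray<int> a = ImmutableArray.Create(1, 2, 3);
        ImmutableArray<int> b = a;                 // copies the struct wrapper only
        ImmutableArray<int> c = b.SetItem(0, 42);  // new array; a and b are untouched

        return (typeof(ImmutableArray<int>).IsValueType, a[0], c[0]);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.isValueType);  // prints True
        Console.WriteLine(result.original);     // prints 1
        Console.WriteLine(result.updated);      // prints 42
    }
}
```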

alrz · 2017-06-01T14:08:15Z

alrz
Jun 1, 2017

@DavidArno You already can't mutate tuple fields if the tuple itself is readonly.

readonly (int x, int y) field;
// ..
field.x = 1; // ERROR

Same would apply to "readonly locals and parameters".

DavidArno · 2017-06-01T14:40:10Z

DavidArno
Jun 1, 2017

@alrz,

Wow, I didn't know structs behaved that way. I had been thinking earlier that immutability in tuples could be achieved once we get let:

let t = (x:1, y:2);
t.x = 2; // ERROR

but dismissed that as wishful thinking as I didn't know about this read-only behaviour of structs and so assumed t.x = 2; would be valid. As you say, read-only locals would indeed fix the mutable structs issue.

Thanks, that's cheered me up no end. I can leave here on a positive.

alrz · 2017-06-01T17:53:31Z

alrz
Jun 1, 2017

Wow, I didn't know structs behaved that way.

Seems like you unknowingly never accidentally mutate anything so that'd be a non-issue on your part.

HaloFour · 2017-06-01T18:27:02Z

HaloFour
Jun 1, 2017

For readonly fields the compiler does prevent attempts to write to fields of that struct as well as call property setter accessor methods, even if the latter doesn't attempt to mutate the struct. The compiler allows normal method calls, including those that mutate, but to prevent them from attempting to overwrite the field the compiler automatically copies the field into a local. Since the compiler has no way of knowing which methods mutate it has to copy every time, including for property getter accessors since nothing stops them from self-mutation.
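The defensive copy described here is easy to observe. In the sketch below, `Counter`, `Holder`, and `DefensiveCopyDemo` are hypothetical types made up for the illustration; the mutating call on the readonly field compiles, but it runs against a temporary copy, so the field itself never changes.

```csharp
using System;

// Hypothetical mutable struct with a self-mutating method.
public struct Counter
{
    public int Value;

    public void Increment()
    {
        Value++;
    }
}

public class Holder
{
    public readonly Counter ReadOnlyCounter;  // left at default(Counter) on purpose
    public Counter MutableCounter;
}

public static class DefensiveCopyDemo
{
    public static (int readOnly, int mutable) Run()
    {
        var holder = new Holder();

        // The compiler copies the readonly field into a temporary and calls
        // Increment on that copy, so the field keeps its original value.
        holder.ReadOnlyCounter.Increment();

        // A non-readonly field is mutated in place.
        holder.MutableCounter.Increment();

        // holder.ReadOnlyCounter.Value = 5;  // would not compile: the field is readonly

        return (holder.ReadOnlyCounter.Value, holder.MutableCounter.Value);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"{result.readOnly} {result.mutable}");  // prints "0 1"
    }
}
```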

alrz · 2017-06-01T18:32:36Z

alrz
Jun 1, 2017

but to prevent them from attempting to overwrite the field the compiler automatically copies the field into a local. Since the compiler has no way of knowing which methods mutate it has to copy every time, including for property getter accessors since nothing stops them from self-mutation.

I think #421 is related here. Though that might be overkill at this point (see the discussion).

CyrusNajmabadi · 2017-06-01T19:05:00Z

CyrusNajmabadi
Jun 1, 2017
Collaborator

Since the compiler has no way of knowing which methods mutate it has to copy every time, including for property getter accessors since nothing stops them from self-mutation.

Note: we optimize this away for well known types we know to be immutable. No point in copying if we know the methods won't actually mutate.

I'm curious whether we do that for tuples (I doubt it), but paging @jcouv to check.

@jcouv If you call ".ToString, Equals, GetHashCode" on a readonly tuple, do we optimize away the copy, knowing that all the standard impls are non-mutating? Or do we conservatively assume the impl could mutate and we make a copy no matter what?

Thanks!

CyrusNajmabadi · 2017-06-01T19:05:57Z

CyrusNajmabadi
Jun 1, 2017
Collaborator

I take the view that Copy on Write from Swift will be much better in terms of performance since it gives you a good balance between not making unnecessary copies everywhere & safety in mutating objects across scopes and thread boundaries

If there is an allocation, then right out of the gate that's going to be a problem for some domains. I like the Swift approach, but it has its own set of tradeoffs that it is balancing.

MadsTorgersen · 2017-06-02T01:07:02Z

MadsTorgersen
Jun 2, 2017
Collaborator

Tuples were designed with the expectation that most would be short-lived (returned from a method then immediately deconstructed or otherwise consumed) or stored in the heap as part of other objects (e.g. Task<T> or KeyValuePair<K, V>).

In those scenarios there's relatively more "creating" and less "moving". Structs favor that tradeoff: essentially free to create (no heap allocation), a bit more expensive to move (more to copy).
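A small sketch of the short-lived pattern described above: a tuple is created as a return value and deconstructed straight away, so it is created once and barely moved. `MinMax.Find` is a hypothetical helper written for this example, assuming C# 7.0 tuple and deconstruction syntax.

```csharp
using System;

public static class MinMax
{
    // Returns both results in one tuple; no heap allocation is needed for the tuple itself.
    public static (int min, int max) Find(int[] values)
    {
        if (values == null || values.Length == 0)
        {
            throw new ArgumentException("values must be non-empty", nameof(values));
        }

        int min = values[0];
        int max = values[0];
        foreach (int v in values)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }
        return (min, max);
    }

    public static void Main()
    {
        // The tuple is deconstructed immediately and never stored.
        var (lowest, highest) = Find(new[] { 3, 1, 4, 1, 5 });
        Console.WriteLine($"{lowest} {highest}");  // prints "1 5"
    }
}
```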

Replies: 24 comments