Proposal: Copy on Write for Structs #644
Replies: 24 comments
-
Because it's slower than mutable values and the "speed heads" don't like that. However, there are discussions around "withers" or "with expressions" that could be used to achieve this. See #162 and #77 for some details.
-
Hmm, but isn't the point of value types that they are immutable by default? It's the value that matters, not the identity. For example, int a = 1; a = 2; structs should follow value semantics and not reference semantics like classes. Now I understand the point about pleasing the "speed heads" and maintaining backward compatibility. A keyword could be introduced for struct definitions so that the compiler enforces value semantics. Don't you think?
-
They are mutable by default, but when you assign a struct you are making a copy:
Point p1 = new Point(2, 2);
Point p2 = p1;
p1.X = 4; // mutates p1 in place
Debug.Assert(p2.X == 2);
The only way to take a reference to a struct is explicitly via the ref keyword:
Point p1 = new Point(2, 2);
ref Point p2 = ref p1;
p1.X = 4;
Debug.Assert(p2.X == 4);
-
I 100% agree with you, but folk with way more influence over the design of C# wave their arms around and mutter nonsense about how they are sort of immutable really as copies are made on assignment and return. Load of old tosh in my view, but they get to make the language decisions, so we are stuck with mutable structs.
-
The claim isn't that they are magically immutable; the claim is that their copy semantics make the mutability isolated and thus "safe". Everything beyond that is ideology, on both sides. C# doesn't descend, syntactically or philosophically, from functional languages.
-
@HaloFour that's perfect, I seem to have been misinformed by a Stack Overflow post about this. Thanks for clarifying. But what is the motivation behind not implementing them as copy on write? Doesn't this help improve performance behind the scenes?
-
I don't know much about how value types are implemented in Swift so I can't really compare directly. In C# a value type local is just a block of memory reserved on the stack. In order to employ CoW you'd be required to take a pointer to that memory and pass that around, relying on the consumers of that pointer to "do the right thing" by not writing to it directly. You'd still also have to copy the pointer value around, which is likely 64 bits wide.
A quick Google of CoW in Swift seems to illustrate that it's a convention accomplished by taking a struct, which has copy semantics, and making it contain a class, which has reference semantics. So you'd incur both the memory allocation plus the copy plus more memory allocations for CoW. Honestly it seems like the worst of both worlds from a performance point of view. But for larger data types like arrays or dictionaries I guess it makes sense. In C# those are simply reference types, so they're faster in C# but they aren't immutable/CoW.
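To make the "struct wrapping a reference type" convention concrete, here is a minimal C# sketch. CowIntList is a hypothetical type invented for illustration, not a BCL or Roslyn API. Swift can skip the copy when the storage is uniquely referenced (via isKnownUniquelyReferenced); C# has no hook for struct copies and no such check, so this sketch assumes the crudest policy and clones the storage on every write.

```csharp
using System;

// Hypothetical type for illustration only; not part of any library.
public struct CowIntList
{
    // Reference-type storage. Copying the struct copies only this reference.
    private int[] _items;

    public CowIntList(params int[] items) => _items = (int[])items.Clone();

    public int Count => _items?.Length ?? 0;

    public int this[int index] => _items[index];

    // Assumption: with no uniqueness check available, every write clones
    // the storage first, so other struct copies never observe the change.
    public void Set(int index, int value)
    {
        var copy = (int[])_items.Clone();
        copy[index] = value;
        _items = copy;
    }
}

public static class CowDemo
{
    public static void Main()
    {
        var a = new CowIntList(1, 2, 3);
        var b = a;      // cheap: copies one reference, not the array
        b.Set(0, 42);   // allocates a new array for b only

        Console.WriteLine(a[0]); // 1
        Console.WriteLine(b[0]); // 42
    }
}
```

Note the costs the comment describes: one heap allocation for the storage, a pointer-sized copy whenever the struct is passed around, and a further allocation on each write.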
-
The "C# should stay true to its mutable, OO, roots" line is ideology too.
-
Which is why I included, "on both sides." C# is what it is, the behavior of the existing syntax is extremely unlikely to change. Differing behavior of new features can certainly be considered but consistency is still an important factor. If tuples weren't value types they very likely would have been immutable. Your disagreement with the philosophy and direction of C# is well noted, but if you're expecting the language to start moving wholly in a different direction you'll probably just find yourself constantly disappointed. C# is never going to become F#.
-
How right you are! 😄
-
@DavidArno I'm not really sure how to respond to the things you bring up because they're not really where we are anymore. Bashing the team and saying you're disappointed is... I mean, even you are assuming it's a dead end to the conversation.
-
That's not the argument that has been made. The argument that has been made is that the approach we've provided gives a good balance of speed as well as safety. We care a ton about immutability and safety (just see how the Roslyn compiler and APIs are designed). But we're not zealots about it. For example, Roslyn implements many immutable facades that are actually heavily mutable under the covers. Indeed, it's how we can produce that immutable facade so efficiently.
In practice, we've found that the major benefit of immutability is actually not that the contents can't change at all, but that you can share safely. So, with tuples, if I pass you my tuple, you can't affect the tuple I have. Indeed, I can pass myself this tuple and change one without having to worry about changing another.
If we could have had identical performance with completely immutable tuples, we would have made tuples completely immutable. But, in actuality, the performance is quite different, and even once you get to a few elements, the perf hit can become significant due to all the copying that happens. We wanted people to be able to adopt tuples without putting friction points like this into place.
We've seen first hand how even small perf issues can completely kill a feature from being used in a high-perf scenario. For example, Roslyn itself has to avoid things like LINQ and lambdas in many core codepaths due to their overhead. We'd like to avoid that sort of thing when possible, and it made little sense to go for ideological purity that wouldn't actually help anyone out, and would actually end up hurting this feature.
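The "share safely" point can be sketched in a few lines of C#. The names here (TupleSharingDemo, Bump) are hypothetical and chosen only for illustration. The sketch relies on tuples being System.ValueTuple structs that are copied on assignment and when passed by value.

```csharp
using System;

public static class TupleSharingDemo
{
    // Hypothetical helper: it receives its own copy of the tuple,
    // so the mutation below is invisible to the caller.
    public static (int x, int y) Bump((int x, int y) t)
    {
        t.x += 10;
        return t;
    }

    public static void Main()
    {
        var mine = (x: 1, y: 2);

        var yours = Bump(mine); // passed by value: a copy is made
        var copy = mine;        // assignment also copies
        copy.y = 99;

        Console.WriteLine(mine);  // (1, 2)
        Console.WriteLine(yours); // (11, 2)
        Console.WriteLine(copy);  // (1, 99)
    }
}
```

The fields are mutable, yet neither the callee nor the second local can change the caller's value, which is the isolation being argued for.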
-
Precisely. I laid out four options we considered and the pros/cons of all of them. As was demonstrated, there was no perfect answer. There were only tradeoffs. And the moment you have tradeoffs you're going to end up with something that isn't perfect for all scenarios.
What I don't understand is that this is how development almost always works, and I generally expect that people can understand this. But some people seem to take the position that the decision was purely wrong, and only one decision was correct (even though that decision had downsides of its own). This is not the first time this has happened; we've had several language changes that also had different sets of pros/cons. I'm not expecting people to come to the same balance as us as to what value those pros/cons come out to. But I am a little disappointed that there are some who don't seem to want to even acknowledge that there are potentially other considerations that factor into these decisions, and who continually harp that we are "mutter[ing] nonsense".
The idea of immutable tuples was not only considered, it was also tested. We saw the actual perf problems and we saw that it didn't actually dramatically improve the programming experience. We used this actual information to drive a design that we felt had higher value in the end. If you want to treat that as 'nonsense' so be it. But this is how we've always designed and how we're going to continue to design for the foreseeable future.
-
@DavidArno the principal ideology is "C# should not directly harm one programming paradigm by preferring another". So if you make tuples immutable, this directly hinders procedural code and low-level code written in C#. If you make them mutable, this doesn't directly hinder pure functional code, you simply don't mutate them yourself and are slightly annoyed when you have to deal with libraries that pass around refs to mutable tuples.
-
This seems to be difficult, considering that a copy-on-write feature for value types would theoretically always be faster than copying every time. Currently in C#, passing a struct as a function argument creates a copy, which is unnecessary when the struct is only read and never modified. I take the view that Copy on Write as in Swift would be much better in terms of performance, since it gives you a good balance between not making unnecessary copies everywhere and safety when mutating objects across scopes and thread boundaries. Remember, the entire point of using value types is that the task of ensuring safety falls on the compiler rather than on the developer. Having copy on write enforces that and provides a performance gain. And the good thing is that consumers would not need to change anything in their existing code base, since all the changes are under the hood.
I also have another question here. In Swift, all collections are also value types, but that does not seem to be the case in C#. At least an array of value types should be a value type, right? If so, it enhances safety, since these arrays of value types can be passed around safely across scopes and threads. Are there plans to implement this? Or better still, is there an array or list implementation in the C# world based purely on value semantics and not reference semantics?
I'd like to add that I'm not a Swift advocate. I work with both languages and I appreciate all the effort!
-
You're always copying something anyway, whether that be the value or a pointer to that value. This is true in every programming language, including Swift. Passing a copy of a struct that is 64 bits wide is no more expensive than passing a pointer to that same struct.
Everything I've read about CoW in Swift says that it's not a language feature but rather a convention established by writing a value type that internally contains a reference type. That value type wrapper must be written specifically to manage allocation and manipulation so as to provide thread safety. Creating and manipulating that type is more expensive than doing so for a regular reference type. So is passing it around, since you have to copy the value type (which is at least a pointer wide, since it internally contains a reference). But it's cheaper than copying a giant blob of an array or other collection.
Almost all collections in the CLR are reference types. There are exceptions, such as |
-
@DavidArno You already can't mutate tuple fields if the tuple itself is readonly.
readonly (int x, int y) field;
// ..
field.x = 1; // ERROR
Same would apply to "readonly locals and parameters".
-
Wow, I didn't know structs behaved that way. I had been thinking earlier that immutability in tuples could be achieved once we get
let t = (x:1, y:2);
t.x = 2; // ERROR
but dismissed that as wishful thinking as I didn't know about this read-only behaviour of structs and so assumed otherwise. Thanks, that's cheered me up no end. I can leave here on a positive.
-
Seems like you unknowingly never accidentally mutate anything so that'd be a non-issue on your part.
-
For readonly fields the compiler does prevent attempts to write to fields of that struct, as well as calls to property setters, even if the setter doesn't attempt to mutate the struct. The compiler allows normal method calls, including those that mutate, but to prevent them from overwriting the field the compiler automatically copies the field into a local first. Since the compiler has no way of knowing which methods mutate, it has to copy every time, including for property getters, since nothing stops them from self-mutation.
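The defensive copy described above can be observed directly. The following is a small sketch using hypothetical types (Counter, Holder) invented for illustration: the same mutating method is called on a readonly field and on an ordinary field of the same struct type.

```csharp
using System;

// Hypothetical mutable struct for illustration.
public struct Counter
{
    public int Value;
    public void Increment() => Value++;
}

public class Holder
{
    public readonly Counter ReadOnlyCounter;
    public Counter MutableCounter;

    public void Run()
    {
        // Outside a constructor, the compiler copies the readonly field
        // into a temporary and calls Increment on that temporary.
        ReadOnlyCounter.Increment();

        // The ordinary field is mutated in place.
        MutableCounter.Increment();

        // ReadOnlyCounter.Value = 5; // would not compile: the field is readonly
    }
}

public static class DefensiveCopyDemo
{
    public static void Main()
    {
        var h = new Holder();
        h.Run();
        Console.WriteLine(h.ReadOnlyCounter.Value); // 0
        Console.WriteLine(h.MutableCounter.Value);  // 1
    }
}
```

The call on the readonly field compiles and runs, but its effect is lost with the temporary, which is why the readonly field still reads 0.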
-
I think #421 is related here. Though that might be overkill at this point (see the discussion).
-
Note: we optimize this away for well-known types we know to be immutable. No point in copying if we know the methods won't actually mutate. I'm curious if we do that for tuples (I doubt it), but paging @jcouv to check. @jcouv If you call ToString, Equals, or GetHashCode on a readonly tuple, do we optimize away the copy, knowing that all the standard impls are non-mutating? Or do we conservatively assume the impl could mutate and make a copy no matter what? Thanks!
-
If there is an allocation then, right out of the gate, that's going to be a problem for some domains. I like the Swift approach, but it has its own set of tradeoffs that it is balancing.
-
Tuples were designed with the expectation that most would be short-lived (returned from a method then immediately deconstructed or otherwise consumed) or stored in the heap as part of other objects (e.g. In those scenarios there's relatively more "creating" and less "moving". Structs favor that tradeoff: essentially free to create (no heap allocation), a bit more expensive to move (more to copy). |
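As a rough illustration of the "returned from a method then immediately deconstructed" pattern, here is a short C# sketch. MinMax is a hypothetical helper, not taken from the discussion. The returned tuple is a struct, so creating it needs no heap allocation of its own.

```csharp
using System;

public static class TupleLifetimeDemo
{
    // Hypothetical helper: returns two results as a value tuple.
    // Assumes a non-empty input array.
    public static (int min, int max) MinMax(int[] values)
    {
        int min = values[0], max = values[0];
        foreach (var v in values)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }
        return (min, max);
    }

    public static void Main()
    {
        // The tuple is deconstructed immediately and never stored or passed on.
        var (lo, hi) = MinMax(new[] { 3, 9, 1, 7 });
        Console.WriteLine($"{lo}..{hi}"); // 1..9
    }
}
```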
-
Copy on Write for value semantics, as implemented in Swift, provides a very good immutable model for value types: the compiler copies a value type automatically when any of its value type properties is modified.
Is there a reason why a similar model cannot also be implemented in C#?