Skip to content

Fix for methods returning null collections#18

Merged
gabikliot merged 4 commits intodotnet:masterfrom
jkonecki:fix_for_methods_returning_null_collections
Jan 26, 2015
Merged

Fix for methods returning null collections#18
gabikliot merged 4 commits intodotnet:masterfrom
jkonecki:fix_for_methods_returning_null_collections

Conversation

@jkonecki
Copy link
Copy Markdown
Contributor

I believe returning an empty collection should be preferred over returning a null - this helps avoiding NullRerefenceExceptions and allows a straightforward client loops.

I've turned a simple comment to method documentation <summary> tag in ClientDirectory class (where returning null seems to be the required behaviour) in order for it to pop up in intellisense.

@jesuspsalas
Copy link
Copy Markdown

Hi Jakub/All

We should be careful on the null collections change as we should prioritize performance against clarity. “The faster code is the one that do not execute” as you probably know ☺

I know .net will manage quite fast to create an empty object that will be collected later by GC and looking to this as a single operation is not that much.

At millions of requests scale, this matters a lot.

Return a null and checking for null is faster.

There are other considerations, like if we are overriding built-in methods that .net already
Show an standard behavior to them (we must stick to this), if the method can be use or not in a fluent
style (we need to return empty collections then) , etc…

Maybe to create companion methods TryXXXX that returns bool value if the TryXXXX has been success and the data is ok (like int.TryParse)

I’m asking to put piece of the PR on hold until we reach a consensus on how to handle all this.

Regards
Jesus Salas
VS Anywhere

From: Jakub Konecki [mailto:notifications@github.com]
Sent: domingo, 25 de enero de 2015 6:10
To: dotnet/orleans
Subject: [orleans] Fix for methods returning null collections (#18)

I believe returning an empty collection should be preferred over returning a null - this helps avoiding NullRerefenceExceptions and allows a straightforward client loops.

I've turned a simple comment to method documentation

tag in ClientDirectory class (where returning null seems to be the required behaviour) in order for it to pop up in intellisense.


You can view, comment on, or merge this pull request online at:

#18

Commit Summary

File Changes

Patch Links:


Reply to this email directly or view it on GitHubhttps://github.com//pull/18.

@jkonecki
Copy link
Copy Markdown
Contributor Author

I fully agree with you, Jesus.
I do not know how ofter those methods are called - I actually couldn't find any usages of some of them...

My preference is for the code to be safe but I wouldn't mind if the PR is rejected if there are genuine arguments for returning nulls.
Maybe we should start with just documenting the methods that they may return null?

@sergeybykov
Copy link
Copy Markdown
Contributor

The general reason to return nulls instead of empty collections/dictionaries indeed has been performance concerns - to avoid unnecessary allocations and GC pressure. However, in these particular cases I don't think we should have perf concerns as the ConsistentRing and Directory methods are only invoked upon a cluster change, so we could make them safe, I think. Some of the SetExtensions methods are use by the test code that we haven't ported to Github yet. So I don't think there is a perf concern their at the moment either.

I'll wait for Gabi to weigh in.

@jesuspsalas
Copy link
Copy Markdown

Hi ,

As long it is not in the critical path and probably being better for the cluster management classes to be the safer as possible, sounds like a good thing.

Jakub, I understand is safer to not return null, and I’m happy with this. Just knowing the implications to create/GC objects in a critical path (related to performance)
sometimes provide safety is not the best... (in any case is wrong!)

I didn’t check what classes were impacted, anyways, I don’t have yet the full Orleans architecture in my mind and can tell really what areas were impacted.

Let’s wait for Gabriel.

Jesus
VS Anywhere

From: Sergey Bykov [mailto:notifications@github.com]
Sent: domingo, 25 de enero de 2015 12:49
To: dotnet/orleans
Cc: Jesus Salas
Subject: Re: [orleans] Fix for methods returning null collections (#18)

The general reason to return nulls instead of empty collections/dictionaries indeed has been performance concerns - to avoid unnecessary allocations and GC pressure. However, in these particular cases I don't think we should have perf concerns as the ConsistentRing and Directory methods are only invoked upon a cluster change, so we could make them safe, I think. Some of the SetExtensions methods are use by the test code that we haven't ported to Github yet. So I don't think there is a perf concern their at the moment either.

I'll wait for Gabi to weigh in.


Reply to this email directly or view it on GitHubhttps://github.com//pull/18#issuecomment-71392660.

@gabikliot
Copy link
Copy Markdown
Contributor

In general, I totally agree with Jesus on the importance of performance. And Jakub's point about cleaner APIs is also valid.
A couple of extra general points to consider:

  • We don't have to always allocate a new empty collection in such cases. We can instead have preallocated static readonly empty list (or empty dictionary) and return it instead. Like String.Empty, or BinaryTokenStreamReader.emptySegment.
  • I think Jakub's point is important mainly for public APIs. But for internal methods, I think it is OK to leave it up to the implementer of this internal component to decide what's best for him - perf/readability/everyone has his style.

Specifically, in those cases:

  1. ActivationDirectory.FindTargets is on a hot critical path for stateless workers. So we can either use null or pre-allocated empty readonly list (ReadOnlyCollection that implements IList).
  2. The code in ConsistentRing, LocalGrainDirectory and GrainDirectoryHandoffManager is not on a critical path. But it is also mostly internal. Can either leave as is or return pre-allocated readonly empty list.
  3. with set extension we need to be careful, as those are very general public APIs, so they need to really be intuitive and correct. I wonder what happens if you call List.ToList() on an empty list? On the first look you expect it to throw NullRef, but ToList is an extension method on IEnumerable and those usually work on null IEnumerable . Just tried, ToList() on null throws. So seems like our HashSet ToSet(this IEnumerable values) should also throw and not return null? I don't know. Need to think about it. Maybe we will just remove those public extension methods.

I will think more about it and pull this request (with appropriate modifications) tomorrow.

@jesuspsalas
Copy link
Copy Markdown

Wouldn’t use a static readonly instance cause problems when use in fluent way or when trying to manipulate the collection?

I mean, I like this approach and I used it on some architectures like when I wanted to have a default initialized class and use it as
static ‘identity’ instance to use for comparisons with other instances… (limiting its use as read-only operations)

However in a case where I do a ToList() and then the code, by some criteria (i.e. count == 0 ) choose to manipulates the collection (i.e. adding elements) I’ll receive an exception
anyway as collection is ReadOnly what is a semantic change in object in the middle of the flow (‘if data available then not readonly’ to ‘if no data available then readonly’)

In the other hand if we do not use readonly there is a chance code will mutate the ‘what has become a singleton dummy instance’ to a new state causing
unspecified behavior in subsequent flows where an empty collection should be returned because of null…

I believe that if we want to follow this approach will require a bit more of research, probably the best as Gabriel suggest would be to mimic the
natural .Net behavior as it is what I’d like to see if I were using the public API, or by following Occam suggestions ( ;) ) keep the null what is probably the simpler.

All this talking from the broad architecture point of view and not about the specifics in the affected classes that maybe the idea fits if data is immutable, I’m still learning Orleans..

Also, I concur public APIs are the most important.

As they are exposed to developers they must be as perfect as possible, the internal layers should be as practical as possible and we should stock to some established common pattern.

Jesus
VS Anywhere

From: Gabriel Kliot [mailto:notifications@github.com]
Sent: domingo, 25 de enero de 2015 21:22
To: dotnet/orleans
Cc: Jesus Salas
Subject: Re: [orleans] Fix for methods returning null collections (#18)

In general, I totally agree with Jesus on the importance of performance. And Jakub's point about cleaner APIs is also valid.
A couple of extra general points to consider:

  • We don't have to always allocate a new empty collection in such cases. We can instead have preallocated static readonly empty list (or empty dictionary) and return it instead. Like String.Empty, or BinaryTokenStreamReader.emptySegment.
  • I think Jakub's point is important mainly for public APIs. But for internal methods, I think it is OK to leave it up to the implementer of this internal component to decide what's best for him - perf/readability/everyone has his style.

Specifically, in those cases:

  1. ActivationDirectory.FindTargets is on a hot critical path for stateless workers. So we can either use null or pre-allocated empty readonly list (ReadOnlyCollection that implements IList).
  2. The code in ConsistentRing, LocalGrainDirectory and GrainDirectoryHandoffManager is not on a critical path. But it is also mostly internal. Can either leave as is or return pre-allocated readonly empty list.
  3. with set extension we need to be careful, as those are very general public APIs, so they need to really be intuitive and correct. I wonder what happens if you call List.ToList() on an empty list? On the first look you expect it to throw NullRef, but ToList is an extension method on IEnumerable and those usually work on null IEnumerable . Just tried, ToList() on null throws. So seems like our HashSet ToSet(this IEnumerable values) should also throw and not return null? I don't know. Need to think about it. Maybe we will just remove those public extension methods.

I will think more about it and pull this request (with appropriate modifications) tomorrow.


Reply to this email directly or view it on GitHubhttps://github.com//pull/18#issuecomment-71415943.

@gabikliot
Copy link
Copy Markdown
Contributor

Of course, I meant only to use preallocated static readonly only for cases that do not manipulate the collection. I all the examples in this pull request, and in most places in the runtime, we do not manipulate the returned collection. We hardly ever manipulate the returned collection. If we need to do that, we refactor the common logic between caller and callee into some shared component. This is part of the discipline of making sharing between components explicit, which is a good practice for multi threaded code.

@jkonecki
Copy link
Copy Markdown
Contributor Author

Another option could be changing the return type from List<T> to IEnumerable<T> and use Enumerable<T>.Empty as a return value.

@gabikliot
Copy link
Copy Markdown
Contributor

Yes, agree.

@jkonecki
Copy link
Copy Markdown
Contributor Author

OK. Let me see what kind of impact that would make and I'll update PR.

@gabikliot
Copy link
Copy Markdown
Contributor

Jakub, for this specific set of changes I would not do much. As I wrote, all except Set extensions are internal methods. So we can even leave them as is. Set extensions are the only problem, but since we don't use them anyway, we can just remove those methods.

@jkonecki
Copy link
Copy Markdown
Contributor Author

All right. I'll just get rid of Set extension methods and leave the rest as
is.

On Mon, 26 Jan 2015 18:37 Gabriel Kliot notifications@github.com wrote:

Jakub, for this specific set of changes I would not do much. As I wrote,
all except Set extensions are internal methods. So we can even leave them
as is. Set extensions are the only problem, but since we don't use them
anyway, we can just remove those methods.


Reply to this email directly or view it on GitHub
#18 (comment).

@jkonecki
Copy link
Copy Markdown
Contributor Author

I reverted the changes and left the Set extension methods intact as they are actually used.
Public methods ConsistentRingProvider.GetMySucessors(int n = 1) and ConsistentRingProvider.GetMyPredecessors(int n = 1) are not used and I believe could be removed along with internal FindPredecessors(SiloAddress silo, int count) and FindSuccessors(SiloAddress silo, int count).

@gabikliot
Copy link
Copy Markdown
Contributor

Ok, Jakub. I will work on merging that pull request. Thanks.

gabikliot added a commit that referenced this pull request Jan 26, 2015
…ollections

Fix for methods returning null collections
@gabikliot gabikliot merged commit 5777927 into dotnet:master Jan 26, 2015
@jkonecki jkonecki deleted the fix_for_methods_returning_null_collections branch January 28, 2015 07:18
@ifle ifle mentioned this pull request Jun 12, 2018
@github-actions github-actions bot locked and limited conversation to collaborators Dec 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants