Nullable reference types: Mads's current thoughts #863
Replies: 24 comments
-
So all the language officially has is the null annotation '?' and the tracking of that in the type system, and whatever rules that implies? Then theoretically, an analyzer can use that information (plus its own analysis) to determine if any violations exist?
-
That would offer the flexibility in the analysis to "dial up" the aggressiveness of the feature. It would be easier to ship a continually evolving analyzer with the compiler and IDE than to try to bake such a thing into the language spec. I do hope that the end-developer experience isn't any different, that a simple project- or solution-level opt-in would enable analysis that would ship by default.
-
This sounds quite sensible. I've been thinking of non-null ref types as a flow analyser for a while, but splitting it into various analysers would also allow users to:

For me personally it would probably still be quite dramatic, but I believe the approach described above would allow me to slowly migrate my large codebase from Code Contracts to C# 8 one assembly at a time, in two passes (1. enable non-null refs top-down, 2. remove contracts bottom-up).
-
So since you say that the flow analysis will be like analyzers, does that also include overriding behavior in specific contexts by use of
-
I'd like to point out that R# uses a concept of "optimistic vs pessimistic" opt-in, with different levels of warnings produced for implicitly nullable variables -- that could help with the adoption story.
-
Implementing most of this via built-in analysers, combined with being able to "dial up" the level of aggressiveness, seems a really good idea. One request though please: when I "turn it up to 11", I'd like it to also warn on the use of the dammit operator. The use case normally cited for this feature is to assist in adopting it with existing code. For it to be a genuine temporary quick fix, there really needs to be a warning level at which it therefore warns on using dammit. For those other use cases where, in very rare cases, it's really needed, then some sort of
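A sketch of what such a strict warning level might flag, using the proposal's `!` ("dammit") operator; the method names here are invented for illustration:

```csharp
class Adoption
{
    // Hypothetical legacy API that has not been annotated yet.
    static string? FindTitle(int id) => id == 0 ? null : "Title";

    static int TitleLength(int id)
    {
        // '!' silences the nullability warning as a temporary quick fix.
        // At the strictest level, an analyzer would warn on this very
        // line, so the escape hatch can't quietly become permanent.
        return FindTitle(id)!.Length;
    }
}
```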
-
To reiterate here, analyzers external to the compiler could also have the advantage of having configuration around helper methods that could confirm the non-nullability of a reference. For example,
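As a hedged sketch of that idea (the `Guard.NotNull` helper below is invented for illustration, not part of the proposal):

```csharp
using System;

static class Guard
{
    // Throws on null, returns the value otherwise. A configurable
    // analyzer could treat a successful call as proof of non-nullness.
    public static T NotNull<T>(T? value) where T : class
        => value ?? throw new ArgumentNullException(nameof(value));
}

class Usage
{
    static void M(string? input)
    {
        // After this call, a suitably configured analyzer would
        // consider the result (and 'input') not-null.
        string s = Guard.NotNull(input);
        Console.WriteLine(s.Length);
    }
}
```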
-
The more this topic is discussed, with forward-compat being such a key piece, the less I think using

I think the C# team is too focused here on what is technically possible vs the business case for what people will actually do. Writing analyzers has been available for years at this point, and they're practically unused outside of MS. Dialing analyzers up and down is not something the vast majority of devs want to be thinking about. Frankly, it's not something most devs will do. They'll just go with whatever the default is - which I'm assuming is still to have them turned off, to avoid tons of warnings on codebases being upgraded. I don't think this solution meets the goal of easing adoption. Instead, having all reference types nullable by default and decorating them with

Pros:

Cons:

But then again, is that first con really a con? We're doing so much work to not break legacy codebases anyway that I think you're looking at a lot of work to adopt non-nullness in existing code in either scenario.
-
It wasn't completely clear from my write-up, but I do propose that the warnings that stem from use of nullable reference types are still mandated by the language:

```csharp
int M(string? ns) => ns.Length; // built-in language warning
```

The warnings arising from creation of null values of non-nullable reference types are the ones I propose moved out of the language and made optional, in an analyzer-like way:

```csharp
string s = null;        // caught by analyzer
var sa = new string[10]; // caught by analyzer
```

I've added a sentence to the write-up to clarify.
-
@mattwar: See my comment above. The language itself would still be responsible for the portion of the warnings that arise from use of nullable reference types. It's the warnings that "make unannotated reference types mean non-null" that are moved out.

@HaloFour: Agreed that it's easier to take special annotations into account from an analyzer than a language feature. You mention functions that check for null, and other interesting cases are

@DavidArno: I don't think the primary scenario for

That said, I agree that you may want to limit or weed out egregious use of them, e.g. with an analyzer.
-
I am trying to make sure I understand:

Specifically that last line, it doesn't matter that the nullability analysis can determine that
-
First of all, I'm cautiously optimistic, as it sounds like this may be able to meet the needs of a varied audience. It may be valuable to implement this analysis in preview as a warning with a separate ID and a default severity of None, allowing for experimentation, since the underlying analysis is likely to be present.

💡 By treating static null checks conceptually as analyzers, we gain substantial flexibility to modify the set of warnings reported in the future to meet the needs of different audiences. This flexibility is impossible to retain for anything dictated by the core rules of the language.
-
This is potentially problematic as a built-in language warning, because suppressing the warnings fundamentally relies on flow analysis. Flow analysis is one of the primary areas where we may wish to introduce changes in the future, and the ability to make those changes would be impaired if it impacted built-in warnings. As a simple example, one frequently-encountered case when dealing with flow analysis for null values is the handling of

While it likely makes sense to enable one or more (conceptual) analyzers, the only ones that seem suited to inclusion as part of the language are signature-related analyzers that do not require flow analysis of any kind for accuracy (e.g. warn if a method uses a non-nullable parameter type when overriding/implementing a method that has a nullable type for that parameter).

¹ Speaking from experience, failure to account for unit test frameworks will cause substantial pain in test suites for code involving nullable reference types.
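To make the footnote about test frameworks concrete, here is a minimal sketch, assuming a stand-in assertion helper rather than any particular framework's API:

```csharp
using System;

static class TestAssert
{
    // Stand-in for a unit test framework's null assertion.
    public static void NotNull(object? value)
    {
        if (value == null) throw new InvalidOperationException("Expected non-null");
    }
}

class Tests
{
    static void TitleIsPopulated(string? title)
    {
        TestAssert.NotNull(title);
        // The assertion guarantees non-null at runtime, but flow analysis
        // that knows nothing about TestAssert.NotNull still considers
        // 'title' maybe-null here, so this dereference warns.
        Console.WriteLine(title.Length);
    }
}
```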
-
Splitting this feature into separate analyzers is not a good idea. There are some tools and applications that use compilers programmatically, and there may be a need to get these warnings straight out of the compiler. A better idea is to bake these warnings into the language, and create a great UI tool to manage warnings, as the current switch for all warnings is just too limited. New UI tools should allow creating profiles, where a profile defines which warnings are visible and/or interpreted as errors. This tool would allow quickly activating/deactivating particular profiles, and this way define a 'level of help'. There may be a fixed default profile that will be used for the light-bulb options 'do not show that warning in this project' and 'do not show that warning in any project'.
-
Tools that use the Roslyn APIs programmatically can also load in Analysers and get the full set of warnings. What you call "profiles" sounds like "rulesets" to me, which already exist today.
-
@vbcodec Having a good developer experience surrounding analyzer-produced diagnostics sounds like a great overall request/goal. Tools and applications should not be at a disadvantage simply because a development team is interested in adopting one or more analyzers in an effort to improve code quality, regardless of whether these analyzers are created by the Roslyn team or some third party.
-
This also applies to warning waves. I think, among other reasons, efficiency (or accessible APIs, for that matter) might be important for a warning (either null-related or otherwise) to reside in the compiler. For example, unused usings are reported by the compiler since it has already walked all the nodes.
-
What kind of code fixes will this null-analyzer provide? Without fixes, analyzers may become unusable.
-
This doesn't sound like a good idea. I get that null reference exceptions are bad, but what is much worse is finding out that your program is not working correctly by someone telling you the results are incorrect. Forcing people to adopt this will just result in people using the default value instead of null, and then it's much harder to catch the error, since you won't get a compile-time error or a runtime exception. Also, the ? symbol already has a meaning in C#: it makes structs nullable. If you use it for reference types it will look the same but produce different results. And what about the default operator: will I get a new instance of the object, or will it return a nullable type? What constructor will be used if I get an object?
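For context, a minimal sketch of the two different meanings of `?` at play here:

```csharp
int?    maybeCount = null; // Nullable<int>: a distinct runtime type wrapping the struct
string? maybeName  = null; // an annotation only: still an ordinary string reference at runtime

// 'default' is unchanged by the proposal: for a reference type it still
// yields null and runs no constructor. The proposed static null checks
// would simply flag this assignment to a non-nullable type.
string s = default;
```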
-
I think you have confused this issue with #167.
-
I'm sure this has been discussed, but I'm not finding anything on it. What is the meaning of

```csharp
public static T ProveNotNull<T>(T? value) where T : struct
{
    if (!value.HasValue) throw new ArgumentNullException();
    return value.GetValueOrDefault();
}
```

And it will only accept nullable value types.

```csharp
public static T ProveNotNull<T>(T value)
{
    if (value == null) throw new ArgumentNullException();
    return value;
}
```

And it will accept any type, with the jitter cutting out the null branch for non-nullable value types.

```csharp
public static T ProveNotNull<T>(T? value)
{
    if (value == null) throw new ArgumentNullException();
    return value;
}
```

It seems at first glance to be a way to do a check and get the known-not-null at the end, but how does this interact with the first piece of code above? Would the first piece be allowed alongside this, or would they clash as methods differing only in their constraints do? If the first piece wasn't there, and we passed a nullable value type, would it cast to the underlying type in the non-null branch?
-
That code won't compile because
-
@HaloFour

```csharp
public void MehWhatever<T>(T value)
{
    T notNull = ProveNotNull(value);
    DoSomethingOrOther(notNull);
}
```

And have it work whether
-
@HaloFour this actually does compile:

```csharp
public static T ProveNotNull<T>(T? value)
{
    if (value == null) throw new ArgumentNullException();
    return value;
}
```

(exactly the same as it would if the ? was not there)

You merely cannot call it and change the type of the underlying variable:

```csharp
int? i = null;
Console.Write(ProveNotNull<int>(i)); // error CS1503: Argument 1: cannot convert from 'int?' to 'int?'
```

(with a quite lovely error I think 😈)

You can call it like this:

```csharp
int? i = null;
int j = ProveNotNull(i).Value; // T/return is Nullable<int> and the test throws
```
-
Nullable reference types: Mads's current thoughts
Over the past month we've been driving hard towards a shared understanding of the Nullable Reference Types feature. I believe we have mapped out the design choices and tradeoffs thoroughly. This is reflected in the updated strawman proposal, which also delineates an initial public prototype that we can share and learn from.
The strawman punts on the issue of what knobs we should eventually employ to facilitate opt-in or opt-out of various degrees of checking. It lists the options we've discussed, but for the prototype essentially just has everything on by default, which is sensible, given that we aim to try out the full feature set and get feedback on it.
However, since finalizing the strawman, I've been thinking further about how we should eventually ship this feature. I am sharing my thoughts - my current opinion (which, as you all know, is quite swayable) - below, knowing full well that use of the prototype will further inform these choices, and possibly invalidate my reasoning. In a sense I should just shut up and wait, but having gathered my thoughts on the topic, I decided it wouldn't hurt to share them.
General model
As an overview, I think there are two main feature areas, with quite different characteristics.
Nullable reference types themselves are a highly language-involved feature. The feature introduces new syntax and metadata, and relies on intricate flow analysis; it is deeply dependent on compiler participation. On the other hand it is not much of a breaking change - a few scenarios, probably representative of actual bugs, would yield warnings on existing code. A "legacy mode" switch could temporarily silence those if you really do need current code to compile right now. Also, there is not much controversy about "how aggressive" it should be (with one exception around dotted names, which I'll get to).
Static null checks are a set of mostly local checks to prevent null values of non-nullable types from being created. Most are triggered by simple information right in the vicinity of specific expression forms. All are quite breaking. They range roughly on a scale from lenient (converting `null` to `string`) to stringent (flagging `new string[]` because elements are null), and different people would likely want different "aggressiveness settings", maybe even at different times during development.

The point (controversial, I know) I want to get to is that, while Nullable Reference Types feels like a feature that really must be part of the language, Static Null Checks feel a lot more like analyzers: they do not contribute to language syntax, types or semantics, are local in nature, warn on specific things, and can and would be individually turned on and off, even in a new or fully updated code base. I am not saying we should necessarily do them as analyzers (though I do want to explore that), but we should think of them as analyzers, maybe some sort of "compiler built-in" ones, and the UI for tinkering with them in tools should not be significantly different from other analyzers. They are outside the language.
There are practical and philosophical reasons for this split. Practically, the split avoids a lot of the breaking-change discussion, and leverages an existing paradigm - analyzers - for regulating opt-in and aggressiveness. Philosophically, the core role of the language is to offer you expressiveness: the new ability to state your intent around nullness, whereas limiting legal use of the existing language is the province of tools and analyzers, where each individual can pick and choose.
Note that it is the presence of nullable reference types in the language that makes it reasonable to employ analyzers against reference types that are not-nullable. Without the expressiveness to "opt out of non-nullness" with a `?`, it would be draconian to forbid nulls.

Null-checking analyzers
We can gather up the set of checks to prevent null values of not-nullable types, and make individually toggled analyzers, or put them on a severity scale, or whichever model we find best. We have considerable design freedom here, because we are outside of the language.
I'll express the checks in terms of "non-defaultable types". Non-defaultable types are themselves a potentially adjustable concept.
The analyzers will flag the following code (a sketch follows below):

- `null` literals and target-typed `default` expressions converted to non-defaultable types
- `default(T)` expressions, where `T` is a non-defaultable type
- `new T[]`, where `T` is a non-defaultable type

(That first one is less local in nature, and may be a good reason why this is best done directly by the compiler. An external analyzer might be on the expensive side, but I'm just speculating.)
(Honestly, a lot of people would probably appreciate an analyzer that makes you explicitly initialize every field on every constructor path, not just the non-defaultable ones.)
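A sketch of the constructs the checks above would flag, assuming `string` is treated as non-defaultable:

```csharp
string a = null;             // null literal converted to a non-defaultable type
string b = default;          // target-typed default expression
string c = default(string);  // default(T) where T is non-defaultable
var arr = new string[10];    // new T[] where T is non-defaultable: elements start out null
```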
Nullable reference types
The warnings that flow from use of nullable reference types (by dereferencing or assigning to non-nullable reference types) are generally uncontroversial, and should be mandated by the language.
By the current design (which I like) there are three ways that existing pre-C# 8.0 code can implicitly already have nullable reference types, without any `?`s occurring in the code:

- `null` literals can now infer nullable reference types

All three of those can lead to implicitly nullable references being dereferenced or assigned to non-nullable types. If the compiler can see that the null-state of such references is "not-null" then all is fine. However, in unguarded cases, new warnings may occur. Those would all be easy to address, by:
- `!` and taking responsibility for the fallout

I believe that these breaks are not ones that you'll want to suppress in the long term, or individually. They could be temporarily addressed by a single "legacy mode" switch on the compiler, for as long as someone wants to "just compile" and not address what are likely bugs in their code.
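For instance, a minimal sketch (with `List<T>.Find` standing in for any API whose result is implicitly nullable):

```csharp
using System.Collections.Generic;

class Fixes
{
    static void M(List<string> names)
    {
        string? match = names.Find(n => n.Length > 3); // implicitly nullable result

        string s1 = match ?? "(none)"; // address the warning by handling the null
        string s2 = match!;            // or apply '!' and take responsibility for the fallout
    }
}
```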
(We could even consider having the legacy mode turn off nullable reference types completely. That way, you don't end up living in a "half-state" where there are nullable reference types manifest in your code, while others are being suppressed by the legacy mode. I am not yet sure about the pros and cons here; just an idea.)
Dotted chains
One potential long-term lever of aggressiveness does pertain specifically to nullable reference types. That is the question of whether the flow analysis should track null-state for "dotted chains" or just for parameters and local variables.
The point of the flow analysis is to embrace existing code, and existing code will be full of null checks on dotted chains. The inconvenience of not tracking them would be significant, to the point of hindering adoption. However, tracking them leads to lowered confidence in the analysis, because they are so much more likely to change their value "behind the scenes" than a parameter or local variable.
This feels like a choice that is more akin to the null-checking analyzers above. We should be more lenient in the language, track the dotted chains, and give them a pass when they are thought to be "non-null".
Then, if people are concerned about the lenience of that, it is a relatively local check to recognize the situation and just warn anyway. A job for an analyzer, or a compiler-built-in quasi-analyzer: "If the expression is a dotted chain, and its type is a nullable reference type, and it is being converted to a non-nullable reference type, and its null-state is "non-null", then warn anyway!"
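A sketch of the two positions on a dotted chain:

```csharp
class Person { public string? Name; }

class DottedChains
{
    static void M(Person person)
    {
        if (person.Name != null)
        {
            // Lenient (language) behavior: the dotted chain person.Name
            // is tracked as not-null after the check, so no warning here.
            int len = person.Name.Length;
        }
        // A stricter quasi-analyzer could warn on that dereference anyway,
        // since a field or property may change "behind the scenes" between
        // the check and the use.
    }
}
```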
What am I not getting?
Per reference opt-in
With what I described, there is no way to select per library reference whether nullability in a library should be seen or not. Once you're in, you're in. If this turns out to be important, nothing above precludes it. But my guess is that this granularity is not necessary, and not worth the complexity it would bring.
Library opt-in
Also, there is no way for a library to express that it "intends" its non-nullable reference types to reject nulls. That whole feature set is pushed out of the language itself.
If it turns out to be important, there could still be an attribute that the non-null analyzers (or quasi-analyzers) heed, so that signatures that never intended for `string` to mean "non-nullable `string`" would not be warned on. But with no language support, such "non-non-null types" would not be tracked through the type system, in type inference and so on.

I think this is fine, and nicely simplifying.
Language notion of "not-null"
This pushes the notion of "not-null" out of the language. There is no anointed, language-spec'ed behavior around preventing nulls. All that the language offers is a nullable notion to "evacuate to", so that analyzers can strike at nulls with abandon.
I guess you can see this as a shortcoming. But I think that the tension between convenience and strictness is too hard to balance once and for all to everyone's satisfaction in the language. It's almost like a code style issue, where different individuals and organizations will want to have a certain strictness profile, and are willing to take the inconvenience caused by it because they chose it themselves.
What am I getting?
As a result, it will be undramatic to "move to" the nullable world. You can quickly inhabit the nullable side of things in the language, while dialing in the non-null harshness to where you want it.