#370 Nullable or NotNull by default

jodastephen Sat 11 Oct 2008

As I follow through my arguments, I've wondered if I may have been wrong on one point. Perhaps nullable should be the default in Fan!

I wouldn't have expected to have ever thought that, but considering the general flexibility in the language, and my lack of desire to tie developers down to boilerplate code or an overly restrictive compiler, it suddenly starts to make more sense that way round. (i.e. if the compiler has tight null checking rules then it should be not-null-by-default, but if the compiler has light null checking rules then it should be nullable-by-default.)

The implications would be that the syntax would change - Int would mean nullable, as in Java, while a new syntax such as Int! would mean not-null.

This would mean that developers would have to take action to make variables not-null, i.e. they have to specifically choose the tradeoff of extra safety vs more boilerplate code.

This way round fits well with interactions with the Java APIs - as they don't define null status. It also allows a more natural migration from Java/C# to Fan (and Fan is more of an evolution language).

I'm still debating over how it would affect type inference, as I'm unsure how easy it would be to pick whether the local variable can hold null or not. (The natural default is that it can). Perhaps obj! := value means that obj is non-null.

Then there is the primitives question. Three possibilities. The first is to have types of int and float which are convenience for Int! and Float!. The second is to do analysis in the compiler to promote Int to Int! if possible. The third is to do nothing and force the developer to write Int! if they want extra performance.

Just to be clear, I haven't yet completely changed my mind to nullable-by-default, but I'm starting to seriously consider it. I'd like to hear other views.

helium Sat 11 Oct 2008

I prefer mandatory as the default, i.e. you have to explicitly declare that something is optional. In my experience that works perfectly.

And if I have to choose, I'd prefer implicit coercion from T? to T (=> light null checking rules) over optional as default any time.

brian Sat 11 Oct 2008

Actually I think nullable by default is going to work out well. Since local variables always use type inference you only need to think about it for your method and field signatures. It is definitely a bit of a mental exercise when writing your APIs, but I think it is a really good exercise. I'm finding it really useful to go thru my APIs and formally mark the nullable types.

I agree with you Stephen that even without a lot of static type checking, having APIs formally marked as nullable is actually really useful. I happen to think that basic things like checking use of null literals and failing fast are going to be extremely useful even if we never reach full static provability.

I've updated the sys APIs with nullable, and the stats are:

          total  nullable  percent
slots:     1048       259      24%
fields:      60         9      15%
methods:    988       250      25%
returns:    988       142      14%

So roughly only a quarter of sys slots use nullable types. Only 14% of methods return a nullable type. So the common case is definitely non-nullable. I've also had a tendency to design APIs which overload to return null - for example I've used a pattern where an optional boolean can be used to return null instead of throwing an exception (fromStr, findType, etc).
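A minimal Python sketch of that "boolean flag returns null instead of throwing" pattern. The function name `parse_int` and the parameter name `checked` are assumptions made for illustration; they are not the actual Fan signatures of fromStr or findType.

```python
from typing import Optional


def parse_int(text: str, checked: bool = True) -> Optional[int]:
    """Hypothetical stand-in for a fromStr-style API.

    With checked=True a bad input raises; with checked=False the caller
    opts in to a nullable result and gets None instead.
    """
    try:
        return int(text)
    except ValueError:
        if checked:
            raise
        return None
```

Only callers that pass `checked=False` ever see a null result, which is one reason a return type like this ends up marked nullable even though the common call path never returns null.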

I think other APIs are going to use nullable even less. Last night I did the inet API and the number of slots with nullable was only 6%.

JohnDG Sat 11 Oct 2008

And if I have to choose, I'd prefer implicit coercion from T? to T (=> light null checking rules) over optional as default any time.

They are actually equivalent, because for "light null checking rules", the presence or absence of a ? on a type declaration bears no relation to whether or not the value actually contains null.

katox Sat 11 Oct 2008

I've lost track of what "by default" means here. If it is only a question of writing ? or !, I'd be for writing ? because non-null is definitely more common (the previously mentioned academic evidence and brian's statistics confirm that). Anything else?

I'll try to hack the compiler a bit to play with a nullable/non-nullable Fan codebase. I think that strict null checking would actually reduce the boilerplate and increase the readability (not the opposite). However, I'm not sure what that would mean for dynamic/reflection code, but we'll see.

Brian: What is the current status? I'm having a bit of trouble trying to bootstrap the compiler right now...

JohnDG Sat 11 Oct 2008

I've lost track of what "by default" means here. If it is only a question of writing ? or !, I'd be for writing ? because non-null is definitely more common (the previously mentioned academic evidence and brian's statistics confirm that). Anything else?

You are right. It is very uncommon to want a null value. So if nullability were the default, then if correctly coded, the majority of variable declarations would use !, which is inefficient and adds to visual clutter.

Therefore, it makes sense for non-nullability to be the default, and for ? to denote a nullable instance.

My point is that in the presence of light checking, it does not matter whether you annotate instances or not, because how they are annotated has no relation to whether or not they contain null.

helium Sat 11 Oct 2008

My point is that in the presence of light checking, it does not matter whether you annotate instances or not, because how they are annotated has no relation to whether or not they contain null.

That's not true. There are still runtime checks. That is what Brian means by failing fast:

Obj? foo = null

Obj bar = foo   // assignment of an optional value to a non-nullable variable => here a NPE will be thrown

doSomethingWith(bar)  // in other languages it might crash somewhere in there

As soon as you assign an optional value to a non-nullable variable, or pass it to a function expecting a non-null object, it will throw.

The line is effectively translated to something like this:

Obj bar = foo ?: throw NPE

A non-nullable variable can never contain null. The question is only whether you want compile-time checks or runtime checks to ensure this. And somehow I'm pretty sure Fan will get runtime checks instead of compile-time checks.
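A rough Python sketch of that fail-fast coercion, assuming the check runs at the point where a nullable value is assigned to a non-nullable slot. `NullCoercionError` and `to_non_null` are hypothetical names for illustration, not anything from Fan.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


class NullCoercionError(Exception):
    """Hypothetical error raised when null reaches a non-nullable slot."""


def to_non_null(value: Optional[T]) -> T:
    # Models the check inserted at `Obj bar = foo`: fail right here,
    # not somewhere deep inside the code that later uses the value.
    if value is None:
        raise NullCoercionError("null assigned to a non-nullable variable")
    return value


def do_something_with(bar: object) -> str:
    # Never sees None as long as callers go through to_non_null first.
    return str(bar).upper()
```

With this shape, `do_something_with(to_non_null(foo))` raises at the coercion when `foo` is null, so the failure points at the assignment rather than at whatever line later dereferences the value.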

JohnDG Sat 11 Oct 2008

That's not true. There are still runtime checks. That is what Brian means by failing fast:

That means every assignment of a nullable to a non-nullable will incur the cost of a runtime check. I wonder what the performance implications will be in mixed code.

A non-nullable variable can never contain null. The question is only whether you want compile-time checks or runtime checks to ensure this.

My point is that it does not help reduce NPEs. It just shifts them to a different place in the code (albeit a more helpful place, from a diagnostic perspective).

helium Sat 11 Oct 2008

Additionally, it helps remove unnecessary tests like

Foo method(Bar argument)
{
   if (argument == null)
      throw new InvalidArgumentException("argument must not be null")
   ...
}

or whatever.
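As a sketch of how such hand-written guards could move out of method bodies, here is a small Python decorator that checks arguments at the call boundary based on the annotations. It assumes `Optional[...]` plays the role of Fan's `T?`; `null_checked` and `describe` are hypothetical names, and raising `TypeError` is an arbitrary choice.

```python
import functools
import inspect
from typing import Optional, Union, get_args, get_origin, get_type_hints


def _allows_none(hint) -> bool:
    """True if the annotation is NoneType or a Union that includes it."""
    if hint is type(None):
        return True
    return get_origin(hint) is Union and type(None) in get_args(hint)


def null_checked(func):
    """Reject None for every parameter not annotated as Optional."""
    hints = get_type_hints(func)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            if value is None and name in hints and not _allows_none(hints[name]):
                raise TypeError(f"{name} must not be null")
        return func(*args, **kwargs)

    return wrapper


@null_checked
def describe(name: str, nickname: Optional[str] = None) -> str:
    # No manual `if name is None` guard needed in the body.
    return name.upper() if nickname is None else f"{name.upper()} ({nickname})"
```

The body of `describe` can assume `name` is present, while `nickname` still has to be handled because its annotation admits null.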

brian Sat 11 Oct 2008

Brian: What is the current status? I'm having a bit of trouble trying to bootstrap the compiler right now...

You won't be able to work off the Mercurial tip - it requires the painful process of bootstrapping thru each changeset. Sometime in the next few days I should have all the APIs annotated and the basic infrastructure in place and I'll post a build which can compile the tip.

it does not matter whether you annotate instances or not, because how they are annotated has no relation to whether or not they contain null.

As Helium pointed out, this is not true. A non-nullable is guaranteed to not be null. You cannot coerce a nullable to a non-nullable without a runtime check. The only issue at debate is whether the coercion in source code is implicit like auto-casting or explicit (such as with ?: or an if statement).

So since most variables are actually non-nullable, we can avoid a lot of the defensive null checking that occurs in idiomatic Java code. In fact I lean towards making things like the following illegal for non-nullable variables:

x?.method
x ?: y
if (x == null) ...

And somehow I'm pretty sure Fan will get runtime checks instead of compiletime checks.

We'll definitely have 100% coverage on runtime checks - although we can't complete those checks until we finish off our ctor/with-block debate. And we'll definitely have basic compile time checks (like you can't use null literal with a non-nullable type). So the debated issue is auto-coercion to non-nullable or forcing the developer to handle it with a conditional expression/statement.

JohnDG Sun 12 Oct 2008

The only issue at debate is whether the coercion in source code is implicit like auto-casting or explicit (such as with ?: or an if statement).

That's not the only issue. The "implicit" coercion does not generally lead to any reduction in NPEs (they just occur in a different place during runtime), while use of conditional operators eliminates NPEs. So that non-nullable variables cannot hold null still does not establish any clear benefits of annotation in the presence of "weak nullable rules".

like you can't use null literal with a non-nullable type

This is only marginally helpful. e.g.:

Obj? myNull = null
Obj notNull = myNull

is perfectly legal. Most NPEs occur as a result of infection from other nulls.

So the debated issue is auto-coercion to non-nullable or forcing the developer to handle it with a conditional expression/statement.

This makes it sound like the debate is between "letting the compiler do the work" and "forcing the developer to do the work". In point of fact, the debate is between "praying that nullable instances don't contain null" and "not letting the developer infect non-null instances with values that will be null sometimes".

They have completely different effects on the resultant code base. In the one case, you force the developer to more rigorously specify the behavior of the code. In the other case, you allow holes in the specification and hope things don't explode for common runs.

The debate is exactly the same as for static typing versus dynamic typing.
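The two positions can be contrasted in a short Python sketch. It assumes a lookup that may return null; `PORTS`, `find_port`, `port_implicit`, and `port_explicit` are all hypothetical names invented for the illustration.

```python
from typing import Optional

# Hypothetical lookup table, only here to make the example concrete.
PORTS = {"http": 80, "https": 443}


def find_port(name: str) -> Optional[int]:
    """Nullable result: None when the service is unknown."""
    return PORTS.get(name)


def port_implicit(name: str) -> int:
    # "Weak" rules: the coercion is accepted in source and a runtime
    # check fails here if the value turns out to be null.
    port = find_port(name)
    if port is None:
        raise ValueError(f"null coerced to non-nullable for {name!r}")
    return port


def port_explicit(name: str, fallback: int = 0) -> int:
    # "Strict" rules: the developer must say what happens on null,
    # the equivalent of `findPort(name) ?: fallback`.
    port = find_port(name)
    return fallback if port is None else port
```

In the implicit version the null case still exists and surfaces as a runtime error at the coercion. In the explicit version the null case is specified in the code, so there is no error path left for it.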

although we can't complete those checks until we finish off our ctor/with-block debate

Ah, so like 5 years from now. :-)

katox Sun 12 Oct 2008

JohnDG: There could be a reduction of NPEs even with the current approach. Once close to 3/4 of your APIs are non-null, you are safe inside them. Having a clear separation of null/non-null directly in the APIs (not just in the docs) is also beneficial. But I agree completely that we shouldn't stop here unless there is a damn good reason.

Side note: most NPEs are actually caused by bad code, as you said, disregarding some cases. Even in the Fan compiler there were a few, and that is quite conscientious code with a lot of thinking invested. And all of those would probably be detected automatically...

katox Sun 12 Oct 2008

In fact I lean towards making things like the following illegal for non-nullable variables

x?.method
x ?: y
if (x == null) ...

That would just make prototyping and "trying out" harder. I personally hate defensive code, but in this case it does exactly what it is supposed to do. Do you also plan to make

x := 5
if (x < 4) echo(x)

illegal? I'm all for pointing all of that out as warnings, but making them compile errors sounds as irritating as declaring a precise set of checked exceptions in Java signatures. How many times did you have to comment out an exception in the method signature after you had commented out a block of code in the method just to try something? ;)

brian Sun 12 Oct 2008

So that non-nullable variables cannot hold null still does not establish any clear benefits of annotation in the presence of "weak nullable rules".

John, I don't see this as so black and white - to me it is a spectrum of gray and we just have to find where Fan sits. No type system is perfect - it is merely finding the right trade-off which provides good static analysis without burdening developers. You seem to be saying that basic things like checking null literals are useless. Well I implemented those checks today and I think this is going to be a pretty huge deal compared with how idiotic Java code is structured. Something as simple as the null literal check does indeed catch a ton of things - literally hundreds of places where my manual annotations were incorrect. In fact several dozen of my tests which were checking null conditions went away because the offending code did not even compile anymore.

x := 5
if (x < 4) echo(x)

Why would that be illegal? In fact once we implement value types we can optimize that comparison to a single Java opcode (versus all the null checking we perform today).

helium Mon 13 Oct 2008

Why would that be illegal?

I can only guess, but perhaps because x cannot be less than 4 in this case as it's 5? But this fact isn't reflected by the type system (and that would be hard with mutable state) so it obviously has nothing to do with the null debate.

katox Mon 13 Oct 2008

Why would that be illegal? In fact once we implement value types we can optimize that comparison to a single Java opcode (versus all the null checking we perform today).

That's the point. It is the same case as if (x == null) ... and the rest. It can be optimized out (and a warning about unreachable code, an always-false condition etc. could be printed). What is so special about "overdone" code on non-nulls that the compiler would handle it differently?

katox Mon 13 Oct 2008

I can only guess, but perhaps because x cannot be less than 4 in this case as it's 5?

No, if you don't share state between threads (and Fan doesn't, correct me if I'm wrong) the compiler could prove it quite easily. For instance gcc would optimize it out completely. It is clearly dead code, much like Str x := "hello"; if (x == null) .... I don't see a difference other than the information used to prove it (type system vs. constants).

brian Mon 13 Oct 2008

I don't see a difference other than the information used to prove it (type system vs. constants).

The Fan compiler could optimize it (and maybe will someday). But remember we also have HotSpot under us doing its own optimization - so in general we don't need to obsess about optimizations (although compiler level never hurts).

I don't think it is the same thing at all because it is legal by the type system. If for example x was declared to be a positive number by the type system and we were comparing against something provably incorrect, then yes we should make it illegal.

I'm not sure if making null checks against non-nullable types illegal is the correct course. I'm going to prototype it and see how it works.
