|
| 1 | +--- |
| 2 | +layout: blog |
| 3 | +post-type: blog |
| 4 | +by: Martin Odersky |
| 5 | +title: Scaling DOT to Scala - Soundness |
| 6 | +disqus: true |
| 7 | +--- |
| 8 | + |
| 9 | +In my [last |
| 10 | +blog post](http://www.scala-lang.org/blog/2016/02/03/essence-of-scala.html) |
| 11 | +I introduced DOT, a minimal calculus that underlies much of Scala. |
| 12 | +DOT is much more than an academic exercise, because it gives us |
| 13 | +guidelines on how to design a sound type system for full Scala. |
| 14 | + |
| 15 | +## Recap: The Problem of Bad Bounds |
| 16 | + |
| 17 | +As was argued in the previous blog post, the danger a path-dependent type |
| 18 | +system like Scala's faces is inconsistent bounds or aliases. For |
| 19 | +instance, you might have a type alias |
| 20 | + |
| 21 | + type T = String |
| 22 | + |
| 23 | +in scope in some part of the program, but in another part the same |
| 24 | +type member `T` is known as |
| 25 | + |
| 26 | + type T = Int |
| 27 | + |
| 28 | +If you connect the two parts, you end up allowing assigning a `String` |
| 29 | +to an `Int` and vice versa, which is unsound - it will crash at |
| 30 | +runtime with a `ClassCastException`. The problem is that there |
| 31 | +is no obvious, practical, compile time analysis for DOT or |
| 32 | +Scala that ensures that all types have good bounds. Types can contain |
| 33 | +abstract type members with bounds that can be refined elsewhere and |
| 34 | +several independent refinements might lead together to a bad bound |
| 35 | +problem. Barring a whole program analysis there is no specific |
| 36 | +point in the program where we can figure this out straightforwardly. |
| 37 | + |
| 38 | +In DOT, the problem is resolved by insisting that every path prefix `p` |
| 39 | +of a type `p.T` is at runtime a concrete value. That way, we only have |
| 40 | +to check for good bounds when objects are _created_ with `new`, and |
| 41 | +that check is easy: When objects are created, we know their class and |
| 42 | +we can insist that all nested types in that class are aliases or |
| 43 | +have consistent bounds. So far so good. |
| 44 | + |
| 45 | +## Loopholes Caused by Scaling Up |
| 46 | + |
| 47 | +But if we want to scale up the DOT result for full Scala, several |
| 48 | +loopholes open up. These come all down to the fact that the prefix of |
| 49 | +a type selection might _not_ be a value that's constructed with a |
| 50 | +`new` at run time. The loopholes can be classified into three |
| 51 | +categories: |
| 52 | + |
| 53 | + 1. The prefix value might be lazy, and never instantiated to anything, as in: |
| 54 | + |
| 55 | + lazy val p: S = p |
| 56 | + ... p.T ... |
| 57 | + |
| 58 | + Note that trying to access the lazy value `p` would result in an infinite loop. But using `p` in a type does not force its evaluation, so we might never evaluate `p`. Since `p` is not initialized with a `new`, bad bounds for `T` would go undetected. |
| 59 | + |
| 60 | + 2. The prefix value might be initialized to `null`, as in |
| 61 | + |
| 62 | + val p: S = null |
| 63 | + ... p.T ... |
| 64 | + |
| 65 | + The problem here is similar to the first one. `p` is not initialized |
| 66 | + with a `new` so we know nothing about the bounds of `T`. |
| 67 | + |
| 68 | + 3. The prefix might be a type `T` in a type projection `T # A`, where `T` |
| 69 | + is not associated with a runtime value. |
| 70 | + |
| 71 | +We can in fact construct soundness issues in all of these cases. Look |
| 72 | +at the discussion for issues [#50](https://github.com/lampepfl/dotty/issues/50) |
| 73 | +and [#1050](https://github.com/lampepfl/dotty/issues/1050) in the |
| 74 | +[dotty](https://github.com/lampepfl/dotty/issues/1050) repository |
| 75 | +on GitHub. All issues work fundamentally in the same way: Construct a type `S` |
| 76 | +which has a type member `T` with bad bounds, say |
| 77 | + |
| 78 | + Any <: T <: Nothing |
| 79 | + |
| 80 | +Then, use the left subtyping to turn an expression of type `Any` into |
| 81 | +an expression of type `T` and use the right subtyping to turn that |
| 82 | +expression into an expression of type `Nothing`: |
| 83 | + |
| 84 | + def f(x: Any): p.T = x |
| 85 | + def g(x: p.T): Nothing = x |
| 86 | + |
| 87 | +Taken together, `g(f(x))` will convert every expression into an |
| 88 | +expression of type `Nothing`. Since `Nothing` is a subtype of every |
| 89 | +other type, this means you can convert an arbitrary expression to have |
| 90 | +any type you choose. Such a feat is an impossible promise, of |
| 91 | +course. The promise is usually broken at run-time by failing with a |
| 92 | +`ClassCastException`. |
| 93 | + |
| 94 | +## Plugging the Loopholes |
| 95 | + |
| 96 | +To get back to soundness we need to plug the loopholes. Some of the |
| 97 | +necessary measures are taken in pull request [#1051](https://github.com/lampepfl/dotty/issues/1051). |
| 98 | +That pull request |
| 99 | + |
| 100 | + - tightens the rules for overrides of lazy values: lazy values |
| 101 | + cannot override or implement non-lazy values, |
| 102 | + - tightens the rules which lazy values can appear in paths: they |
| 103 | + must be final and must have concrete types with known consistent bounds, |
| 104 | + - allows type projections `T # A` only if `T` is a concrete type |
| 105 | + with known consistent bounds. |
| 106 | + |
| 107 | +It looks like this is sufficient to plug soundness problems (1) and |
| 108 | +(3). To plug (2), we need to make the type system track nullability in |
| 109 | +more detail than we do it now. Nullability tracking is a nice feature |
| 110 | +in its own right, but now we have an added incentive for implementing |
| 111 | +it: it would help to ensure type soundness. |
| 112 | + |
| 113 | +There's one sub-case of nullability checking which is much harder to do |
| 114 | +than the others. An object reference `x.f` might be `null` at run time |
| 115 | +because the field `f` is not yet initialized. This can lead to a |
| 116 | +soundness problem, but in a more roundabout way than the other issues |
| 117 | +we have identified. In fact, Scala guarantees that in a program that |
| 118 | +runs to completion without aborting, every field will eventually be |
| 119 | +initialized, so every non-null field will have good bounds. Therefore, |
| 120 | +the only way an initialized field `f` could cause a soundness problem |
| 121 | +is if the program in question would never get to initialize `f`, |
| 122 | +either because it goes into an infinite loop or because it aborts with |
| 123 | +an exception or `System.exit` call before reaching the initialization |
| 124 | +point of `f`. It's a valid question whether type soundness guarantees |
| 125 | +should extend to this class of "strange" programs. We might want to |
| 126 | +draw the line here and resort to runtime checks or exclude "strange" |
| 127 | +programs from any soundness guarantees we can give. The research community |
| 128 | +has coined the term [soundiness](http://soundiness.org/) for |
| 129 | +this kind of approach and has [advocated](http://cacm.acm.org/magazines/2015/2/182650-in-defense-of-soundiness/fulltext) for it. |
| 130 | + |
| 131 | +The necessary restrictions on type projection `T # A` are problematic |
| 132 | +because they invalidate some idioms in type-level programming. For |
| 133 | +instance, the cute trick of making Scala's type system Turing complete |
| 134 | +by having it [simulate SK |
| 135 | +combinators](https://michid.wordpress.com/2010/01/29/scala-type-level-encoding-of-the-ski-calculus/) |
| 136 | +would not longer work since that one relies on unrestricted type |
| 137 | +projections. The same holds for some of the encodings of type-level |
| 138 | +arithmetic. |
| 139 | + |
| 140 | +To ease the transition, we will continue for a while to allow unrestricted type |
| 141 | +projections under a flag, even though they are potentially |
| 142 | +unsound. In the current dotty compiler, that flag is a language import |
| 143 | +`-language:Scala2`, but it could be something different for other |
| 144 | +compilers, e.g. `-unsafe`. Maybe we can find rules that are less |
| 145 | +restrictive than the ones we have now, and are still sound. But one |
| 146 | +aspect should be non-negotiable: Any fundamental deviations from the |
| 147 | +principles laid down by DOT needs to be proven mechanically correct |
| 148 | +just like DOT was. We have achieved a lot with the DOT proofs, so we |
| 149 | +should make sure not to back-slide. And if the experience of the past |
| 150 | +10 years has taught us one thing, it is that the meta theory of type |
| 151 | +systems has many more surprises in store than one might think. That's |
| 152 | +why mechanical proofs are essential. |
| 153 | + |
| 154 | + |
| 155 | + |
| 156 | + |
| 157 | + |
| 158 | + |
| 159 | + |
0 commit comments