Skip to content

Commit 121728e

Browse files
committed
Merge pull request scala#413 from odersky/blog-scaling-dot
Blog: scaling dot
2 parents cf317f5 + dcd05b4 commit 121728e

File tree

1 file changed

+159
-0
lines changed

1 file changed

+159
-0
lines changed
Lines changed: 159 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,159 @@
1+
---
2+
layout: blog
3+
post-type: blog
4+
by: Martin Odersky
5+
title: Scaling DOT to Scala - Soundness
6+
disqus: true
7+
---
8+
9+
In my [last
10+
blog post](http://www.scala-lang.org/blog/2016/02/03/essence-of-scala.html)
11+
I introduced DOT, a minimal calculus that underlies much of Scala.
12+
DOT is much more than an academic exercise, because it gives us
13+
guidelines on how to design a sound type system for full Scala.
14+
15+
## Recap: The Problem of Bad Bounds
16+
17+
As was argued in the previous blog post, the danger a path-dependent type
18+
system like Scala's faces is inconsistent bounds or aliases. For
19+
instance, you might have a type alias
20+
21+
type T = String
22+
23+
in scope in some part of the program, but in another part the same
24+
type member `T` is known as
25+
26+
type T = Int
27+
28+
If you connect the two parts, you end up allowing assigning a `String`
29+
to an `Int` and vice versa, which is unsound - it will crash at
30+
runtime with a `ClassCastException`. The problem is that there
31+
is no obvious, practical, compile time analysis for DOT or
32+
Scala that ensures that all types have good bounds. Types can contain
33+
abstract type members with bounds that can be refined elsewhere and
34+
several independent refinements might lead together to a bad bound
35+
problem. Barring a whole program analysis there is no specific
36+
point in the program where we can figure this out straightforwardly.
37+
38+
In DOT, the problem is resolved by insisting that every path prefix `p`
39+
of a type `p.T` is at runtime a concrete value. That way, we only have
40+
to check for good bounds when objects are _created_ with `new`, and
41+
that check is easy: When objects are created, we know their class and
42+
we can insist that all nested types in that class are aliases or
43+
have consistent bounds. So far so good.
44+
45+
## Loopholes Caused by Scaling Up
46+
47+
But if we want to scale up the DOT result for full Scala, several
48+
loopholes open up. These come all down to the fact that the prefix of
49+
a type selection might _not_ be a value that's constructed with a
50+
`new` at run time. The loopholes can be classified into three
51+
categories:
52+
53+
1. The prefix value might be lazy, and never instantiated to anything, as in:
54+
55+
lazy val p: S = p
56+
... p.T ...
57+
58+
Note that trying to access the lazy value `p` would result in an infinite loop. But using `p` in a type does not force its evaluation, so we might never evaluate `p`. Since `p` is not initialized with a `new`, bad bounds for `T` would go undetected.
59+
60+
2. The prefix value might be initialized to `null`, as in
61+
62+
val p: S = null
63+
... p.T ...
64+
65+
The problem here is similar to the first one. `p` is not initialized
66+
with a `new` so we know nothing about the bounds of `T`.
67+
68+
3. The prefix might be a type `T` in a type projection `T # A`, where `T`
69+
is not associated with a runtime value.
70+
71+
We can in fact construct soundness issues in all of these cases. Look
72+
at the discussion for issues [#50](https://github.com/lampepfl/dotty/issues/50)
73+
and [#1050](https://github.com/lampepfl/dotty/issues/1050) in the
74+
[dotty](https://github.com/lampepfl/dotty/issues/1050) repository
75+
on GitHub. All issues work fundamentally in the same way: Construct a type `S`
76+
which has a type member `T` with bad bounds, say
77+
78+
Any <: T <: Nothing
79+
80+
Then, use the left subtyping to turn an expression of type `Any` into
81+
an expression of type `T` and use the right subtyping to turn that
82+
expression into an expression of type `Nothing`:
83+
84+
def f(x: Any): p.T = x
85+
def g(x: p.T): Nothing = x
86+
87+
Taken together, `g(f(x))` will convert every expression into an
88+
expression of type `Nothing`. Since `Nothing` is a subtype of every
89+
other type, this means you can convert an arbitrary expression to have
90+
any type you choose. Such a feat is an impossible promise, of
91+
course. The promise is usually broken at run-time by failing with a
92+
`ClassCastException`.
93+
94+
## Plugging the Loopholes
95+
96+
To get back to soundness we need to plug the loopholes. Some of the
97+
necessary measures are taken in pull request [#1051](https://github.com/lampepfl/dotty/issues/1051).
98+
That pull request
99+
100+
- tightens the rules for overrides of lazy values: lazy values
101+
cannot override or implement non-lazy values,
102+
- tightens the rules which lazy values can appear in paths: they
103+
must be final and must have concrete types with known consistent bounds,
104+
- allows type projections `T # A` only if `T` is a concrete type
105+
with known consistent bounds.
106+
107+
It looks like this is sufficient to plug soundness problems (1) and
108+
(3). To plug (2), we need to make the type system track nullability in
109+
more detail than we do it now. Nullability tracking is a nice feature
110+
in its own right, but now we have an added incentive for implementing
111+
it: it would help to ensure type soundness.
112+
113+
There's one sub-case of nullability checking which is much harder to do
114+
than the others. An object reference `x.f` might be `null` at run time
115+
because the field `f` is not yet initialized. This can lead to a
116+
soundness problem, but in a more roundabout way than the other issues
117+
we have identified. In fact, Scala guarantees that in a program that
118+
runs to completion without aborting, every field will eventually be
119+
initialized, so every non-null field will have good bounds. Therefore,
120+
the only way an initialized field `f` could cause a soundness problem
121+
is if the program in question would never get to initialize `f`,
122+
either because it goes into an infinite loop or because it aborts with
123+
an exception or `System.exit` call before reaching the initialization
124+
point of `f`. It's a valid question whether type soundness guarantees
125+
should extend to this class of "strange" programs. We might want to
126+
draw the line here and resort to runtime checks or exclude "strange"
127+
programs from any soundness guarantees we can give. The research community
128+
has coined the term [soundiness](http://soundiness.org/) for
129+
this kind of approach and has [advocated](http://cacm.acm.org/magazines/2015/2/182650-in-defense-of-soundiness/fulltext) for it.
130+
131+
The necessary restrictions on type projection `T # A` are problematic
132+
because they invalidate some idioms in type-level programming. For
133+
instance, the cute trick of making Scala's type system Turing complete
134+
by having it [simulate SK
135+
combinators](https://michid.wordpress.com/2010/01/29/scala-type-level-encoding-of-the-ski-calculus/)
136+
would not longer work since that one relies on unrestricted type
137+
projections. The same holds for some of the encodings of type-level
138+
arithmetic.
139+
140+
To ease the transition, we will continue for a while to allow unrestricted type
141+
projections under a flag, even though they are potentially
142+
unsound. In the current dotty compiler, that flag is a language import
143+
`-language:Scala2`, but it could be something different for other
144+
compilers, e.g. `-unsafe`. Maybe we can find rules that are less
145+
restrictive than the ones we have now, and are still sound. But one
146+
aspect should be non-negotiable: Any fundamental deviations from the
147+
principles laid down by DOT needs to be proven mechanically correct
148+
just like DOT was. We have achieved a lot with the DOT proofs, so we
149+
should make sure not to back-slide. And if the experience of the past
150+
10 years has taught us one thing, it is that the meta theory of type
151+
systems has many more surprises in store than one might think. That's
152+
why mechanical proofs are essential.
153+
154+
155+
156+
157+
158+
159+

0 commit comments

Comments
 (0)