Skip to content

Commit 4e9020d

Browse files
committed
New blog post
1 parent 2865f05 commit 4e9020d

File tree

1 file changed

+143
-0
lines changed

1 file changed

+143
-0
lines changed
Lines changed: 143 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,143 @@
1+
---
2+
layout: blog
3+
post-type: blog
4+
by: Martin Odersky
5+
title: Scaling DOT to Scala: Soundness
6+
---
7+
8+
In my [last
9+
blog](http://www.scala-lang.org/blog/2016/02/03/essence-of-scala.html)
10+
I introduced DOT, a minimal calculus that underlies much of Scala.
11+
DOT is much more than an academic exercise, because it gives us
12+
guidelines how to design a sound type system for full Scala. As was
13+
argued in the previous blog post, the danger a path-dependent type
14+
system like Scala's faces is inconsistent bounds or aliases. For
15+
instance, you might have a type alias
16+
17+
type T = String
18+
19+
in scope in some part of the program, but in another part the same
20+
type member `T` is known as
21+
22+
type T = Int
23+
24+
If you connect the two parts, you end up allowing assigning a `String`
25+
to an `Int` and vice versa, which is unsound - it will crash at
26+
runtime with a `ClassCastException`. The problem is that there
27+
is no obvious, practical, compile time analysis for DOT or
28+
Scala that ensures that all types have good bounds. Types can contain
29+
abstract type members with bounds that can be refined elsewhere and
30+
several independent refinements might lead together to a bad bound
31+
problem. Barring a whole program analysis there is no specific
32+
point in the program where we can figure this out straightforwardly.
33+
34+
In DOT, the problem is resolved by insisting that every path prefix `p`
35+
of a type `p.T` is at runtime a concrete value. That way, we only have
36+
to check for good bounds when objects are _created_ with `new`, and
37+
that check is easy: When objects are created, we know their class and
38+
we can insist that all nested types in that class are aliases or
39+
have consistent bounds. So far so good.
40+
41+
But if we want to scale up the DOT result for full Scala, several
42+
loopholes open up. These come all down to the fact that the prefix of
43+
a type selection might _not_ be a value that's constructed with a
44+
`new` at run time. The loopholes can be classified into three
45+
categories:
46+
47+
1. The prefix value might be lazy, and never instantiated to anything, as in:
48+
49+
lazy val p: S = p
50+
... p.T ...
51+
52+
Note that trying to access the lazy value `p` would result in an infinite loop. But using `p` in a type does not force its evaluation, so we might never evaluate `p`. Since `p` is not initialized
53+
with a `new`, bad bounds for `T` would go undetected.
54+
55+
2. The prefix value might be initialized to `null`, as in
56+
57+
val p: S = null
58+
... p.T ...
59+
60+
The problem here is similar to the first one. `p` is not initialized
61+
with a `new` so we know nothing about the bounds of `T`.
62+
63+
3. The prefix might be a type `T` in a type projection `T # A`, where `T`
64+
is not associated with a runtime value.
65+
66+
We can in fact construct soundness issues in all of these cases. Look
67+
at the discussion for issues [#50](https://github.com/lampepfl/dotty/issues/50)
68+
and [#1050](https://github.com/lampepfl/dotty/issues/1050) in the
69+
[dotty](https://github.com/lampepfl/dotty/issues/1050) repository
70+
on GitHub. All issues work fundamentally in the same way: Construct a type `S`
71+
which has a type member `T` with bad bounds, say
72+
73+
Any <: T <: Nothing
74+
75+
Then, use the left subtyping to turn an expression of type `Any` into
76+
an expression of type `T` and use the right subtyping to turn that
77+
expression into an expression of type `Nothing`:
78+
79+
def f(x: Any): p.T = x
80+
def g(x: p.T): Nothing = x
81+
82+
Taken together, `g(f(x))` will convert every expression into an
83+
expression of type `Nothing`. Since `Nothing` is a subtype of every
84+
other type, this means you can convert an arbitrary expression to have
85+
any type you choose. Such a feat is an impossible promise, of
86+
course. The promise is usually broken at run-time by failing with a
87+
`ClassCastException`.
88+
89+
To get back to soundness we need to plug the loopholes. Some of the
90+
necessary measures are taken in pull request [#1051](https://github.com/lampepfl/dotty/issues/1051).
91+
That pull request
92+
93+
- tightens the rules for overrides of lazy values: lazy values
94+
cannot override or implement non-lazy values,
95+
- tightens the rules which lazy values can appear in paths: they
96+
must be final and must have concrete types with known consistent bounds,
97+
- allows type projections `T # A` only if `T` is a concrete type
98+
with known consistent bounds.
99+
100+
It looks like this is sufficient to plug soundness problems (1) and
101+
(3). To plug (2), we need to make the type system track nullability in
102+
more detail than we do it now. Nullability tracking is a nice feature
103+
in its own right, but now we have an added incentive for implementing
104+
it: it would help to ensure type soundness.
105+
106+
There's one sub-case of nullability checking which is much harder to do
107+
than the others. An object reference `x.f` might be `null` at run time
108+
because the field `f` is not yet initialized. This can lead to a
109+
soundness problem, but in a more roundabout way than the other issues
110+
we have identified. In fact, Scala guarantees that in a program that
111+
runs to completion without aborting, every field will eventually be
112+
initialized, so every non-null field will have good bounds. Therefore,
113+
the only way an initialized field `f` could cause a soundness problem
114+
is if the program in question would never get to initialize `f`,
115+
either because it goes into an infinite loop or because it aborts with
116+
an exception or `System.exit` call before reaching the initialization
117+
point of `f`. It's a valid question whether type soundness guarantees
118+
should extend to this class of "strange" programs. We might want to
119+
draw the line here and resort to runtime checks or exclude "strange"
120+
programs from any soundness guarantees we can give.
121+
122+
The necessary restrictions on type projection `T # A` are problematic
123+
because they invalidate a common idiom in type-level programming. To
124+
ease the transition, we will continue to allow unrestricted type
125+
projections for a while under a flag, even though they are
126+
potentially unsound. In the current dotty compiler, that flag is a
127+
language import `-language:Scala2`, but it could be something
128+
different for other compilers, e.g. `-unsafe`. Maybe we can find
129+
rules that are less restrictive than the ones we have now, and are
130+
still sound. But one aspect should be non-negotiable: Any fundamental
131+
deviations from the principles laid down by DOT needs to be proven
132+
mechanically correct just like DOT was. We have achieved a lot with
133+
the DOT proofs, so we should make sure not to back-slide. And if the
134+
experience of the past 10 years has taught us one thing, it is that
135+
the meta theory of type systems has many more surprises in store than
136+
one might think. That's why mechanical proofs are essential.
137+
138+
139+
140+
141+
142+
143+

0 commit comments

Comments
 (0)