docs/_docs/internals/specialized-traits.md
Lines changed: 108 additions & 4 deletions
@@ -1,6 +1,7 @@
-# Specialized Traits
+# Specialized Traits and Classes
Specialization is one of the few remaining desirable features from Scala 2 that are as yet missing in Scala 3. We could try to port the Scala 2 scheme, which would be non-trivial since the implementation is quite complex. But that scheme is problematic enough to suggest that we also look for alternatives. A possible alternative is described here. It is meant to complement the [proposal on inline traits](https://github.com/lampepfl/dotty/issues/15532). That proposal also contains a more detailed critique of Scala 2 specialization.
The parts in that proposal that mention specialization should be ignored; they are superseded by the proposal here.

The main problem of Scala 2 specialization is code bloat. We have to proactively generate up to 11 copies of functions and classes when they have a specialized type parameter, and this grows exponentially with the number of such type parameters. Miniboxing tries to reduce the number under the exponent from ~10 to 3 or 4, but it has problems dealing with arrays.
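
To make the exponential growth concrete, here is a minimal arithmetic sketch. The helper `variantCount` is a hypothetical name introduced only for this illustration; it assumes that every specialized type parameter independently picks one of a fixed number of representations, which is a simplification of what the Scala 2 compiler actually does.

```scala
// Hypothetical helper (not part of any compiler API): the number of code copies
// needed when each of `typeParams` specialized type parameters can independently
// take one of `choicesPerParam` representations.
def variantCount(choicesPerParam: Int, typeParams: Int): BigInt =
  BigInt(choicesPerParam).pow(typeParams)

// Roughly 10 representations per parameter (full specialization)
// versus a base of 3 (miniboxing-style), for two type parameters.
val fullTwoParams = variantCount(10, 2)
val miniTwoParams = variantCount(3, 2)
```

Under these assumptions, a class with two specialized type parameters needs on the order of a hundred copies with a base of 10, but fewer than ten with a base of 3.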
-Note that these traits repeat the parent types of their corresponding inline traits, for instance `ArrayIterator$sp$Int` extends `ArrayIterator[Int]` as well as its parent `Iterator[Int]`. After erasure, the definition of
+Note that these traits repeat the parent types of their corresponding inline traits. For instance, `ArrayIterator$sp$Int` extends `ArrayIterator[Int]` as well as its parent `Iterator[Int]`. After erasure, the definition of
@@ -262,7 +263,7 @@ This method is generated by Scala 2's function specialization which is also adop
The example shows that indeed all code is properly specialized with no need for box or unbox operations.

-## Conclusion
+## Evaluation

The described scheme is surprisingly simple. All the heavy lifting is done by inline traits. Adding specialization on top requires little more than arranging for a cache of specialized instances.
There's precedent for this in Kotlin where the majority of higher-order collection methods are declared inline, in this case in order to allow specialization for suspendability. So the restriction does not look like a blocker.
-Some flexibility could be gained if we allowed method overloading between specialized inline methods and normal methods with matching type signatures. For instance, the `Vector` implementation above seriously restricts `map` by requiring that its `B` type parameter is also `Specialized`. Thus `map` cannot be used to map a specialized collection to another collection if the result element type is not ground. But we could alleviate the problem by allowing a second, overloaded `map` operation like this:

## Going Further: Improve Existing Class Hierarchies

We have shown that we can formulate an alternative version of a collection-like class hierarchy that is fully specialized. But can we retrofit this idea even to existing collections? The direct approach would clearly not work, since an existing collection like `Vector[T]` can be created from anywhere, whereas a specialized collection can be created only in a monomorphic context where we know the type instance of `T`. So specialized collections come with a tax in expressiveness which pays for their superior performance.

But it turns out we can gain a lot of flexibility with three additional tweaks to the language and compiler.

### 1. Adapt Overloading to Specialization

More flexibility could be gained if we allowed method overloading between specialized inline methods and normal methods with matching type signatures. For instance, the `Vector` implementation above seriously restricts `map` by requiring that its `B` type parameter is also `Specialized`. Thus `map` cannot be used to map a specialized collection to another collection if the result element type is not statically known. But we could alleviate the problem by allowing a second, overloaded `map` operation like this:
@@ -286,6 +297,99 @@ The second implementation of `map` will return an unspecialized vector if
the new element type is not statically known. If overloads like this were allowed, they could be resolved by picking the specialized inline version if
a `Specialized` instance can be synthesized for the actual type argument, and picking the unspecialized version otherwise.

We can do even better if we allow some additions to the existing collections. In that case, we can add definitions like the inline `map` above to the original collections.
That means that whenever we have a collection `xs` with a type such as `Vector[A]` and a function `f` with a statically known result type `B`, `xs.map(f)` returns a specialized collection. So we can get specialized collections out of normal collections as long as the element type of the created collection is statically known.

This can be generalized. In particular, all `apply` methods of `Vector` should be split into methods taking specialized types and unrestricted methods. For instance:
The same holds for all collection methods such as `map` that return a new collection of a different element type.
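
The resolution rule described in this step (pick the specialized version when a `Specialized` instance can be synthesized, and the unspecialized version otherwise) can be imitated in current Scala 3 with prioritized givens. The following is only a rough analogy, not the proposed overloading mechanism: `Specialized` here is a hand-written marker type class, and `MapVariant`, `LowPriorityMapVariant`, `chosenMap`, and `viaGeneric` are hypothetical names introduced for this sketch.

```scala
// Hypothetical stand-in for the proposal's `Specialized` type class.
// In the proposal, instances would be synthesized by the compiler.
trait Specialized[T]
object Specialized:
  given Specialized[Int] = new Specialized[Int] {}
  given Specialized[Double] = new Specialized[Double] {}

// Records which `map` variant would be selected for result element type B.
trait MapVariant[B]:
  def name: String

trait LowPriorityMapVariant:
  // Fallback that always applies, but loses against givens in the companion object.
  given generic[B]: MapVariant[B] = new MapVariant[B] { def name = "generic" }

object MapVariant extends LowPriorityMapVariant:
  // Preferred whenever a `Specialized[B]` instance is statically available.
  given specialized[B](using Specialized[B]): MapVariant[B] =
    new MapVariant[B] { def name = "specialized" }

def chosenMap[B](using v: MapVariant[B]): String = v.name

// Inside a generic method the element type is not statically known,
// so only the fallback can be selected.
def viaGeneric[T]: String = chosenMap[T]
```

With these definitions, `chosenMap[Int]` selects the specialized variant and `chosenMap[String]` the generic one, while `viaGeneric[Int]` still selects the generic one because `T` is abstract at the point where the given is resolved. This mirrors the tax in expressiveness discussed above: specialization needs a monomorphic context.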

### 2. Automate the Boilerplate with `specializedBy`

The described scheme would entail some amount of code duplication. We could automate this with a new annotation that is put on a class and states that the class has a specialized variant. Example:
```scala
@specializedBy[faster.Vector] class Vector[+T] ...
```
If a class carries such an annotation, the specialized inline functions described above could be added automatically.

### 3. Optimize Use Sites by Path Splitting

One remaining problem is that specialization is a compile-time operation. Without putting in additional work, we cannot immediately exploit the situation where a runtime type is a specialized collection but the static type is unspecialized. For instance, consider this use of `Vector`:

```scala
def sumElems(xs: Vector[Int]): Int =
  var i = 0
  var sum = 0
  while i < xs.length do
    sum += xs(i)
    i += 1
  sum
```
Here, the problem is that, even though we know that `xs` is a `Vector` of `Int`, we cannot deduce that it has been specialized to a `faster.Vector[Int]`. Therefore, `xs(i)` goes through the `apply` method of `Vector`. If the runtime class of `xs` is indeed specialized, this would box the `Int` element to `Object` in a bridge method and unbox it again to `Int` at the call site. This could lose a lot of performance unless the JVM manages to optimize the box/unbox pair away (so far, experience shows that the JVM is not very good at this). The performance could be even worse than working with an unspecialized `Vector`, where elements are held in boxed form so they don't have to be boxed each time they are accessed.

Of course, we can narrow the type of `sumElems` to
```scala
def sumElems(xs: faster.Vector[Int]): Int
```
but that would make it less generally usable. Another alternative is to optimize `sumElems` by path splitting. We could detect at runtime whether `xs` is a `faster.Vector` and optimize the code if it is. For instance, like this:
```scala
def sumElems(xs: Vector[Int]): Int =
  val faster: faster.Vector[Int] | Null = xs match
    case xs: faster.Vector[_] => xs
    case _ => null // not a specialized vector: use the generic path below
  var i = 0
  var sum = 0
  while i < xs.length do
    sum += (if faster != null then faster(i) else xs(i))
    i += 1
  sum
```
That would avoid the boxing at the cost of a type test in the computation of `faster` and a null test in the call of `apply`. The single type test would be amortized over possibly many calls in the loop. We could do even better by generating a bit more code, splitting the whole loop:
```scala
def sumElems(xs: Vector[Int]): Int =
  val faster: faster.Vector[Int] | Null = xs match
    case xs: faster.Vector[_] => xs
    case _ => null // not a specialized vector: take the generic loop
  var i = 0
  var sum = 0
  if faster != null then
    // Specialized path: `apply` returns an unboxed Int
    while i < xs.length do
      sum += faster(i)
      i += 1
  else
    // Generic path: elements go through the unspecialized `apply`
    while i < xs.length do
      sum += xs(i)
      i += 1
  sum
```
The example has shown that it is possible to have code over possibly specialized collections that is both general and high performance. But it does require a lot of hand-written boilerplate.

The boilerplate could be generated automatically by an optimization phase in the compiler. Essentially, when compiling methods that take parameters whose type is a class that's annotated with `specializedBy`, we can do the path splitting automatically in an optimization step. The optimization would first analyze the body of the method to decide which path splitting strategy to use.

I believe the three steps I have outlined could overcome most of the performance penalties imposed by existing unspecialized class hierarchies like collections, making their performance comparable to languages that use global monomorphization.
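
The loop-splitting variant from step 3 can be tried out today with ordinary classes. The sketch below is an illustration under stated assumptions, not the proposed compiler optimization: `Vec`, `BoxedVec`, and `IntVec` are hypothetical stand-ins for `Vector`, an unspecialized implementation, and a specialized `faster.Vector[Int]`, respectively.

```scala
// Hypothetical stand-in for the unspecialized collection interface.
trait Vec[T]:
  def length: Int
  def apply(i: Int): T

// Unspecialized implementation: elements are stored in boxed form.
final class BoxedVec[T](elems: IndexedSeq[T]) extends Vec[T]:
  def length: Int = elems.length
  def apply(i: Int): T = elems(i)

// Stand-in for a specialized instance: elements live in a primitive array.
// Calls through `Vec[Int]` reach this `apply` via a boxing bridge method;
// calls through the static type `IntVec` return an unboxed Int directly.
final class IntVec(elems: Array[Int]) extends Vec[Int]:
  def length: Int = elems.length
  def apply(i: Int): Int = elems(i)

def sumElems(xs: Vec[Int]): Int =
  var i = 0
  var sum = 0
  xs match
    case fast: IntVec =>
      // Specialized path: a single type test, then an unboxed loop.
      while i < fast.length do
        sum += fast(i)
        i += 1
    case _ =>
      // Generic path: works for any Vec[Int].
      while i < xs.length do
        sum += xs(i)
        i += 1
  sum
```

Using a pattern match rather than a nullable local avoids the null test entirely; the type test is still paid only once per call of `sumElems`.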

## Going Further: Hand-written Specializations

Additional improvements could be gained if we allowed the programmer to pick their own implementations for specialized class instances. For example,
The implementation in `IntHashMap` could exploit the fact that the key type `K` is known to be `Int` to pick a more performant algorithm, for instance.

It would be great if we could use `IntHashMap` each time a specialized `HashMap` such as `HashMap$sp$Int$String` is referred to or created. In other words, `IntHashMap` should act as a drop-in replacement for `HashMap$sp$Int$String` that is selected automatically. A detailed proposal for this is left for future work.
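
As an illustration of what "a more performant algorithm" might look like when the key type is known to be `Int`, here is a minimal sketch of an open-addressing map that keeps keys in a primitive array. It is an assumption made for this example, not the `IntHashMap` from the proposal: the class shape, the method names `get` and `put`, the mixing constant, and the growth policy are all hypothetical choices.

```scala
// Hypothetical sketch: an Int-keyed map using linear probing over primitive arrays.
// Keys are never boxed and no `hashCode` call is needed.
final class IntHashMap[V](initialCapacity: Int = 16):
  require(initialCapacity > 0 && (initialCapacity & (initialCapacity - 1)) == 0,
    "capacity must be a power of two")

  private var keys = new Array[Int](initialCapacity)
  private var values = new Array[Any](initialCapacity)
  private var used = new Array[Boolean](initialCapacity)
  private var count = 0

  def size: Int = count

  // Index of `key` if present, otherwise the first free slot on its probe path.
  // Terminates because the table is kept at most half full.
  private def slot(key: Int): Int =
    val mask = keys.length - 1
    var i = (key * 0x9E3779B9) & mask // cheap multiplicative mixing
    while used(i) && keys(i) != key do i = (i + 1) & mask
    i

  def get(key: Int): Option[V] =
    val i = slot(key)
    if used(i) then Some(values(i).asInstanceOf[V]) else None

  def put(key: Int, value: V): Unit =
    if (count + 1) * 2 > keys.length then grow()
    val i = slot(key)
    if !used(i) then
      used(i) = true
      keys(i) = key
      count += 1
    values(i) = value

  // Doubles the table size and reinserts all existing entries.
  private def grow(): Unit =
    val (oldKeys, oldValues, oldUsed) = (keys, values, used)
    keys = new Array[Int](oldKeys.length * 2)
    values = new Array[Any](oldKeys.length * 2)
    used = new Array[Boolean](oldKeys.length * 2)
    count = 0
    for i <- oldKeys.indices if oldUsed(i) do
      put(oldKeys(i), oldValues(i).asInstanceOf[V])
```

A generic `HashMap[K, V]` cannot use this layout because it has to store keys as objects and hash them through `hashCode`; that difference is the kind of gain a hand-written specialization could offer.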