Commit ff1a3a1

SIP-62 - For comprehension improvements
1 parent 8a15186 commit ff1a3a1

File tree

1 file changed

+375
-0
lines changed

content/better-fors.md

Lines changed: 375 additions & 0 deletions
@@ -0,0 +1,375 @@
1+
---
2+
layout: sip
3+
permalink: /sips/:title.html
4+
stage: design
5+
status: submitted
6+
title: SIP-62 - For comprehension improvements
7+
---
8+
9+
**By: Kacper Korban (VirtusLab)**
10+
11+
## History
12+
13+
| Date | Version |
14+
|---------------|--------------------|
15+
| June 6th 2023 | Initial Draft |
16+
| Feb 15th 2024 | Reviewed Version |
17+
18+
## Summary
19+
20+
`for`-comprehensions in Scala 3 are more usable than in Scala 2, but some pain points remain, relating both to the usability of `for`-comprehensions and to the simplicity of their desugaring.
21+
22+
This SIP tries to address some of those problems by changing the specification of `for`-comprehensions. From the user's perspective, the biggest change is allowing aliases at the start of a `for`-comprehension, e.g.
23+
24+
```
25+
for {
26+
x = 1
27+
y <- Some(2)
28+
} yield x + y
29+
```
30+
31+
## Motivation
32+
33+
There are some clear pain points related to Scala 3's `for`-comprehensions, and they can be divided into two categories:
34+
35+
1. User-facing and code simplicity problems
36+
37+
Specifically, for the following example written in a Haskell-style do-comprehension
38+
39+
```haskell
40+
do
41+
a = largeExpr(arg)
42+
b <- doSth(a)
43+
combineM(a, b)
44+
```
45+
in Scala we would have to write
46+
47+
```scala
48+
val a = largeExpr(arg)
49+
for
50+
b <- doSth(a)
51+
x <- combineM(a, b)
52+
yield x
53+
```
54+
55+
This complicates the code, even in this simple example.
56+
2. The simplicity of desugared code
57+
58+
The second pain point is that the desugared code of `for`-comprehensions can often be surprisingly complicated.
59+
60+
e.g.
61+
```scala
62+
for
63+
a <- doSth(arg)
64+
b = a
65+
yield a + b
66+
```
67+
68+
Intuition would suggest that the desugared code will be of the form
69+
70+
```scala
71+
doSth(arg).map { a =>
72+
val b = a
73+
a + b
74+
}
75+
```
76+
77+
But because of the possibility of an `if` guard being immediately after the pure alias, the desugared code is of the form
78+
79+
```scala
80+
doSth(arg).map { a =>
81+
val b = a
82+
(a, b)
83+
}.map { case (a, b) =>
84+
a + b
85+
}
86+
```
87+
88+
These redundant assignments and additional function calls not only add runtime overhead but can also block other optimizations from being performed.
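
The extra call can be made visible with a small sketch. `Traced` below is a hypothetical container invented for this illustration (it is not part of the proposal or of any library); it only counts how often `map` is invoked. Both desugared shapes are written out by hand, so the snippet does not depend on which desugaring a given compiler version applies.

```scala
// Hypothetical container (an assumption for illustration only):
// it wraps a value and counts how many times `map` has been called.
final class Traced[A](val value: A, val mapCalls: Int = 0):
  def map[B](f: A => B): Traced[B] = Traced(f(value), mapCalls + 1)

// Shape of the current desugaring: the alias travels in a tuple,
// which costs a tuple allocation and a second `map` call.
val tupled: Traced[Int] =
  Traced(20)
    .map { a =>
      val b = a
      (a, b)
    }
    .map { case (a, b) => a + b }

// Shape of the intuitive desugaring: a single `map` call.
val direct: Traced[Int] =
  Traced(20).map { a =>
    val b = a
    a + b
  }
```

Both values hold the same result; they differ only in the number of recorded `map` calls.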
89+
90+
## Proposed solution
91+
92+
This SIP suggests the following changes to `for` comprehensions:
93+
94+
1. Allow `for` comprehensions to start with pure aliases
95+
96+
e.g.
97+
```scala
98+
for
99+
a = 1
100+
b <- Some(2)
101+
c <- doSth(a)
102+
yield b + c
103+
```
104+
2. Simpler conditional desugaring of pure aliases. i.e. whenever a series of pure aliases is not immediately followed by an `if`, use a simpler way of desugaring.
105+
106+
e.g.
107+
```scala
108+
for
109+
a <- doSth(arg)
110+
b = a
111+
yield a + b
112+
```
113+
114+
will be desugared to
115+
116+
```scala
117+
doSth(arg).map { a =>
118+
val b = a
119+
a + b
120+
}
121+
```
122+
123+
but
124+
125+
```scala
126+
for
127+
a <- doSth(arg)
128+
b = a
129+
if b > 1
130+
yield a + b
131+
```
132+
133+
will be desugared to
134+
135+
```scala
136+
doSth(arg).map { a =>
137+
val b = a
138+
(a, b)
139+
}.withFilter { case (a, b) =>
140+
b > 1
141+
}.map { case (a, b) =>
142+
a + b
143+
}
144+
```
145+
146+
3. Avoiding redundant `map` calls if the yielded value is the same as the last bound value.
147+
148+
e.g.
149+
```scala
150+
for
151+
a <- List(1, 2, 3)
152+
yield a
153+
```
154+
155+
will just be desugared to
156+
157+
```scala
158+
List(1, 2, 3)
159+
```
160+
161+
### Detailed description
162+
163+
#### Ad 1. Allow `for` comprehensions to start with pure aliases
164+
165+
Allowing `for` comprehensions to start with pure aliases is a straightforward change.
166+
167+
The Enumerators syntax will be changed from:
168+
169+
```
170+
Enumerators ::= Generator {semi Enumerator | Guard}
171+
```
172+
173+
to
174+
175+
```
176+
Enumerators ::= {Pattern1 `=' Expr semi} Generator {semi Enumerator | Guard}
177+
```
178+
179+
This allows zero or more aliases before the first generator.
180+
181+
As far as desugaring is concerned, a `for`-comprehension starting with pure aliases generates a block containing those aliases as `val` declarations, followed by the rest of the desugared `for` as the block's result expression. The exception is when the leading aliases are immediately followed by a guard: in that case the desugaring should result in an error.
182+
183+
A new desugaring rule will be added:
184+
185+
```scala
186+
For any N:
187+
for (P_1 = E_1; ... P_N = E_N; ...)
188+
==>
189+
{
190+
val x_1 @ P_1 = E_1
191+
...
192+
val x_N @ P_N = E_N
193+
for (...)
194+
}
195+
```
196+
197+
e.g.
198+
199+
```scala
200+
for
201+
a = 1
202+
b <- Some(2)
203+
c <- doSth(a)
204+
yield b + c
205+
```
206+
207+
will desugar to
208+
209+
```scala
210+
{
211+
val a = 1
212+
for
213+
b <- Some(2)
214+
c <- doSth(a)
215+
yield b + c
216+
}
217+
```
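
The block form on the right-hand side of this rule is already valid Scala 3, so it can be tried out directly. The following is a minimal sketch; `doSth` is a hypothetical stand-in (the SIP does not define it), chosen here to return an `Option`.

```scala
// Hypothetical stand-in for the `doSth` used in the examples above.
def doSth(a: Int): Option[Int] = Some(a * 10)

// Hand-written version of the desugared block: the leading alias
// becomes a `val`, and the remaining enumerators stay in the `for`.
val result: Option[Int] =
  val a = 1
  for
    b <- Some(2)
    c <- doSth(a)
  yield b + c
```

Because the alias is an ordinary `val` in an enclosing block, it is evaluated once, before the first generator runs.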
218+
219+
#### Ad 2. Simpler conditional desugaring of pure aliases. i.e. whenever a series of pure aliases is not immediately followed by an `if`, use a simpler way of desugaring.
220+
221+
Currently, for consistency, all pure aliases are desugared as if they were followed by an `if` condition, which makes the desugaring more complicated than expected.
222+
223+
e.g.
224+
225+
The following code:
226+
227+
```scala
228+
for
229+
a <- doSth(arg)
230+
b = a
231+
yield a + b
232+
```
233+
234+
is currently desugared to:
235+
236+
```scala
237+
doSth(arg).map { a =>
238+
val b = a
239+
(a, b)
240+
}.map { case (a, b) =>
241+
a + b
242+
}
243+
```
244+
245+
The proposed change is to introduce a simpler desugaring for common cases, when aliases aren't followed by a guard, and keep the old desugaring method for the other cases.
246+
247+
New desugaring rules will be introduced for the simple desugaring.
248+
249+
```scala
250+
For any N:
251+
for (P <- G; P_1 = E_1; ... P_N = E_N; ...)
252+
==>
253+
G.flatMap (P => for (P_1 = E_1; ... P_N = E_N; ...))
254+
255+
And:
256+
257+
for () yield E ==> E
258+
259+
(Where empty for-comprehensions are excluded by the parser)
260+
```
261+
262+
It delegates the desugaring of aliases to the rule newly introduced by the previous improvement, i.e.
263+
264+
```scala
265+
For any N:
266+
for (P_1 = E_1; ... P_N = E_N; ...)
267+
==>
268+
{
269+
val x_1 @ P_1 = E_1
270+
...
271+
val x_N @ P_N = E_N
272+
for (...)
273+
}
274+
```
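
To see how the two rules compose, here is a hand-applied sketch on a comprehension with a second generator. The concrete values are arbitrary choices for this illustration. The generator rule produces a `flatMap` whose body is a `for` starting with an alias, and the alias rule then turns that alias into a `val` inside the lambda.

```scala
// The comprehension as a user would write it.
val sugared: Option[Int] =
  for
    a <- Some(3)
    b = a * 2
    c <- Some(b + 1)
  yield a + c

// The same computation after applying the proposed rules by hand:
// no tuple is built, and the alias is a plain `val` in the lambda body.
val byHand: Option[Int] =
  Some(3).flatMap { a =>
    val b = a * 2
    Some(b + 1).map(c => a + c)
  }
```

For `Option`, both forms produce the same value whichever desugaring the compiler uses for `sugared`; only the intermediate allocations and calls differ.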
275+
276+
One other rule also has to be changed, so that the current desugaring method, of passing all the aliases in a tuple with the result, will only be used when desugaring a generator, followed by some aliases, followed by a guard.
277+
278+
```scala
279+
For any N:
280+
for (P <- G; P_1 = E_1; ... P_N = E_N; if E; ...)
281+
==>
282+
for (TupleN(P, P_1, ... P_N) <-
283+
for (x @ P <- G) yield {
284+
val x_1 @ P_1 = E_1
285+
...
286+
val x_N @ P_N = E_N
287+
TupleN(x, x_1, ..., x_N)
288+
}; if E; ...)
289+
```
290+
291+
These changes will make the desugaring work in the following way:
292+
293+
```scala
294+
for
295+
a <- doSth(arg)
296+
b = a
297+
yield a + b
298+
```
299+
300+
will be desugared to
301+
302+
```scala
303+
doSth(arg).map { a =>
304+
val b = a
305+
a + b
306+
}
307+
```
308+
309+
but
310+
311+
```scala
312+
for
313+
a <- doSth(arg)
314+
b = a
315+
if b > 1
316+
yield a + b
317+
```
318+
319+
will be desugared to
320+
321+
```scala
322+
doSth(arg).map { a =>
323+
val b = a
324+
(a, b)
325+
}.withFilter { case (a, b) =>
326+
b > 1
327+
}.map { case (a, b) =>
328+
a + b
329+
}
330+
```
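
The tuple is kept in the guarded case because `withFilter` receives only the element of the collection, so the alias has to be carried inside that element to be visible to the predicate. A minimal sketch comparing the source form with the hand-written tupled form follows; `doSth` is again a hypothetical stand-in that returns an `Option`.

```scala
// Hypothetical stand-in for `doSth`.
def doSth(arg: Int): Option[Int] = Some(arg)

// Generator, alias, guard: already legal in current Scala 3.
def viaFor(arg: Int): Option[Int] =
  for
    a <- doSth(arg)
    b = a
    if b > 1
  yield a + b

// Hand-written tupled desugaring: `b` reaches the guard via the tuple.
def viaTuple(arg: Int): Option[Int] =
  doSth(arg)
    .map { a =>
      val b = a
      (a, b)
    }
    .withFilter { case (_, b) => b > 1 }
    .map { case (a, b) => a + b }
```

Both functions agree on inputs that pass the guard and on inputs that fail it.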
331+
332+
#### Ad 3. Avoiding redundant `map` calls if the yielded value is the same as the last bound value.
333+
334+
This change is strictly an optimization. It allows the compiler to drop the final `map` call if the yielded value is the same as the last bound pattern. The pattern can be either a single variable binding or a tuple.
335+
336+
One desugaring rule has to be modified for this purpose.
337+
338+
```scala
339+
for (P <- G) yield P ==> G
340+
If P is a variable or a tuple of variables and G is not a withFilter.
341+
342+
for (P <- G) yield E ==> G.map (P => E)
343+
Otherwise
344+
```
345+
346+
e.g.
347+
```scala
348+
for
349+
a <- List(1, 2, 3)
350+
yield a
351+
```
352+
353+
will just be desugared to
354+
355+
```scala
356+
List(1, 2, 3)
357+
```
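
The side condition "G is not a withFilter" matters because `withFilter` does not return the collection itself. The sketch below uses the standard library's `scala.collection.WithFilter` to illustrate this for `List`: if the final `map` were dropped after a guard, the comprehension would evaluate to the intermediate `WithFilter` value instead of a `List`.

```scala
import scala.collection.WithFilter

val xs: List[Int] = List(1, 2, 3)

// What the generator becomes when a guard is attached to it:
// a lazy filtering wrapper, not a List.
val filtered: WithFilter[Int, List] = xs.withFilter(_ > 1)

// The trailing identity `map` is what turns it back into a List,
// so it cannot be removed in this case.
val kept: List[Int] = filtered.map(a => a)
```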
358+
359+
### Compatibility
360+
361+
This change is binary and TASTy compatible since `for`-comprehensions are desugared in the Typer. Thus, both class and TASTy files only ever contain the desugared versions of programs.
362+
363+
While this change is forward source compatible, it is not backward compatible, as it accepts more syntax.
364+
365+
### Other concerns
366+
367+
As far as I know, there are no widely used Scala 3 libraries that depend on the desugaring specification of `for`-comprehensions.
368+
369+
## Links
370+
371+
1. Scala contributors discussion thread (pre-SIP): https://contributors.scala-lang.org/t/pre-sip-improve-for-comprehensions-functionality/3509/51
372+
2. GitHub issue discussion about for desugaring: https://github.com/lampepfl/dotty/issues/2573
373+
3. Scala 2 implementation of some of the improvements: https://github.com/oleg-py/better-monadic-for
374+
4. Implementation of one of the simplifications: https://github.com/lampepfl/dotty/pull/16703
375+
5. WIP implementation branch: https://github.com/dotty-staging/dotty/tree/improved-fors
