SIP-70 - Flexible Varargs #105
lihaoyi wants to merge 23 commits into scala:main from lihaoyi:flexible-varargs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,350 @@ | ||
--- | ||
layout: sip | ||
permalink: /sips/:title.html | ||
stage: implementation | ||
status: under-review | ||
title: SIP-XX - Flexible Varargs | ||
--- | ||
|
||
**By: Li Haoyi** | ||
|
||
## History | ||
|
||
| Date | Version | | ||
|---------------|--------------------| | ||
| Feb 28th 2025 | Initial Draft | | ||
|
||
## Summary | ||
|
||
This SIP proposes an extension of the Scala Vararg unpacking syntax `*`. Currently, | ||
vararg unpacking can only be used for a single `Seq`, both in expressions and in patterns: | ||
|
||
```scala | ||
def sum(x: Int*) = x.sum | ||
|
||
val numbers = Seq(1, 2, 3) | ||
|
||
val total = sum(numbers*) // 6 | ||
|
||
numbers match { | ||
case Seq(numbers2*) => ??? | ||
} | ||
``` | ||
|
||
We propose to extend this to allow mixing of `*` unpackings and raw values in expressions and patterns: | ||
|
||
|
||
```scala | ||
val numbers = Seq(1, 2, 3) | ||
|
||
val total = sum(0, numbers*, 4) // 10 | ||
|
||
numbers match { | ||
case Seq(1, numbers2*, 3) => println(numbers2) // Seq(2) | ||
} | ||
``` | ||
|
||
Allow multiple `*`s to be unpacked into the same varargs: | ||
|
||
```scala | ||
val numbers1 = Seq(1, 2, 3) | ||
val numbers2 = Seq(4, 5, 6) | ||
|
||
val total = sum(numbers1*, numbers2*) // 21 | ||
``` | ||
|
||
And allow `Option`s to be unpacked, as it is very common to have some of the values | ||
you want to pass to a varargs be optional: | ||
|
||
```scala | ||
val number1: Int = 1 | ||
val number2: Option[Int] = Some(2) | ||
val number3: Int = 3 | ||
|
||
val total = sum(number1, number2*, number3) // 6 | ||
``` | ||
|
||
## Motivation | ||
|
||
Vararg unpacking with `*` is convenient but very limited. In particular, this proposal | ||
streamlines two scenarios that were previously awkward. | ||
|
||
### Passing Multiple Things to Varargs | ||
|
||
The first scenario that this proposal streamlines is when you want to combine a bunch of | ||
different variables into a single varargs unpacking, e.g. two single values and two sequences. | ||
This can be done as shown below, but is terribly ugly: lots of extra parens, constructing | ||
inline `Seq`s with `++`: | ||
|
||
```scala | ||
val numbers1 = Seq(1, 2, 3) | ||
val numbers2 = Seq(4, 5, 6) | ||
|
||
// BEFORE | ||
val total = sum((Seq(0) ++ numbers1 ++ numbers2 ++ Seq(7))*) // 28 | ||
|
||
val total = sum((0 +: numbers1 ++ numbers2 :+ 7)*) | ||
// (console):1: left- and right-associative operators with same precedence may not be mixed | ||
|
||
// AFTER | ||
val total = sum(0, numbers1*, numbers2*, 7) // 28 | ||
``` | ||
|
||
As you can see, the version using inline `Seq()`s and `++` is super verbose, and the "obvious" | ||
cleanup using `+:` and `:+` doesn't actually work due to weird associativity problems. | ||
With this proposal, you can write what you mean and have it just work. | ||
|
||
### Constructing Sequences | ||
|
||
The second scenario that this streamlines is constructing `Seq`s and other collections. | ||
|
||
For example, a common scenario is constructing a collection from values: | ||
|
||
|
||
```scala | ||
val foo = 1 | ||
val bar = 2 | ||
|
||
val coll = Seq(foo, bar) | ||
// Seq(1, 2) | ||
``` | ||
|
||
This works great even as the collection grows: | ||
|
||
```scala | ||
val foo = 1 | ||
val bar = 2 | ||
val qux = 3 | ||
val baz = 4 | ||
|
||
val coll = Seq(foo, bar, qux, baz) | ||
// Seq(1, 2, 3, 4) | ||
``` | ||
|
||
This looks fine, until one of the values is a sequence or optional: | ||
|
||
```scala | ||
val foo = 1 | ||
val bar = Seq(2) | ||
val qux = 3 | ||
val baz = Some(4) | ||
|
||
val coll = Seq(foo) ++ bar ++ Seq(qux) ++ baz | ||
// Seq(1, 2, 3, 4) | ||
``` | ||
|
||
Even worse, if the first value is optional, it needs to be explicitly turned into a `Seq`, | ||
otherwise you get an inferred type of `Iterable`, which is unexpected: | ||
|
||
```scala | ||
val foo = Some(1) | ||
val bar = Seq(2) | ||
val qux = 3 | ||
val baz = 4 | ||
|
||
val coll = foo.toSeq ++ bar ++ Seq(qux, baz) | ||
val coll = Seq() ++ foo ++ bar ++ Seq(qux, baz) // alternative syntax | ||
// Seq(1, 2, 3, 4) | ||
``` | ||
|
||
As you can see, we end up having to refactor our code significantly for what is | ||
logically a very similar operation: constructing a sequence from values. Depending | ||
on what those values are, the shape of the code varies a lot, and you are liable to | ||
get surprising inferred types if you forget to call `toSeq` on the first entry (sometimes!): | ||
|
||
```scala | ||
val coll = Seq(foo, bar, qux, baz) | ||
val coll = Seq(foo) ++ bar ++ Seq(qux) ++ baz | ||
val coll = foo.toSeq ++ bar ++ Seq(qux, baz) | ||
val coll = Seq() ++ foo ++ bar ++ Seq(qux, baz) | ||
``` | ||
|
||
With this proposal, all three scenarios would look almost the same, reflecting | ||
the fact that they are really doing the same thing, and you are no longer prone to weird | ||
type inference issues depending on the type of the left-most value: | ||
|
||
|
||
```scala | ||
val coll = Seq(foo, bar, qux, baz) | ||
val coll = Seq(foo, bar*, qux, baz*) | ||
val coll = Seq(foo*, bar*, qux, baz) | ||
``` | ||
|
||
## Implementation | ||
|
||
The proposed implementation is to basically desugar the multiple `*`s into the manual | ||
`Seq`-construction code you would have written without it: | ||
|
||
```scala | ||
// User Code | ||
val total = sum(0, numbers1*, numbers2*, 4) // 25 | ||
|
||
// Desugaring | ||
val total = sum((IArray(0) ++ numbers1 ++ numbers2 ++ IArray(4))*) // 25 | ||
``` | ||
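The same style of desugaring also covers the `Option` case from the Summary. The following is a minimal sketch that should compile in current Scala 3 without this proposal. It uses `Vector` instead of `IArray` purely to keep the example simple, which is an assumption of this sketch and not part of the proposal; the names `maybeNumber` and `totalWithOption` are hypothetical.

```scala
def sum(x: Int*): Int = x.sum

val numbers1 = Seq(1, 2, 3)
val numbers2 = Seq(4, 5, 6)
val maybeNumber: Option[Int] = Some(2)

// Hand-written equivalent of the proposed `sum(0, numbers1*, numbers2*, 4)`
val total = sum((Vector(0) ++ numbers1 ++ numbers2 ++ Vector(4))*)

// Hand-written equivalent of the proposed `sum(1, maybeNumber*, 3)`.
// `++` accepts any `IterableOnce`, so an `Option` contributes zero or one element.
val totalWithOption = sum((Vector(1) ++ maybeNumber ++ Vector(3))*)
```

Because `Option` is already an `IterableOnce`, no separate code path is needed for it in a desugaring of this shape.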
|
||
The implementation for patterns could be something like | ||
|
||
```scala | ||
// User Code | ||
numbers match { | ||
case Seq(1, numbers2*, 3) => println(numbers2) // Seq(2) | ||
} | ||
|
||
// Desugaring Helper | ||
class VarargsMatchHelper(beforeCount: Int, afterCount: Int) { | ||
def unapply[T](values: Seq[T]): Option[(Seq[T], Seq[T], Seq[T])] = { | ||
    Option.when(values.length >= beforeCount + afterCount) { | ||
( | ||
values.take(beforeCount), | ||
values.drop(beforeCount).dropRight(afterCount), | ||
values.takeRight(afterCount) | ||
) | ||
} | ||
} | ||
} | ||
|
||
// Desugared Code | ||
val VarargsMatcher = new VarargsMatchHelper(1, 1) | ||
numbers match { | ||
case VarargsMatcher(Seq(1), numbers2, Seq(3)) => println(numbers2) // Seq(2) | ||
} | ||
``` | ||
|
||
## Limitations | ||
|
||
|
||
### Single `*` in pattern matching | ||
|
||
One major limitation is that while expressions support unpacking multiple `*`s in a varargs, | ||
pattern matching can only support a single `*`. That is because a varargs pattern with | ||
multiple `*` sub-patterns may have multiple possible ways of assigning the individual | ||
elements to each sub-pattern, and depending on the sub-patterns not all such assignments | ||
may be valid. Thus there is no way to implement it efficiently in general, as it would | ||
require an expensive (`O(2^n)`) backtracking search to try and find a valid assignment | ||
of the elements that satisfies all sub-patterns. | ||
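To make the ambiguity concrete, consider a hypothetical two-`*` pattern such as `Seq(before*, 0, after*)`. The sketch below (the helper `splitsAround` is invented for illustration and is not part of the proposal) enumerates every way such a pattern could bind its two sub-sequences:

```scala
// Hypothetical helper: every way to match `Seq(before*, pivot, after*)`
def splitsAround[T](values: Seq[T], pivot: T): Seq[(Seq[T], Seq[T])] =
  values.indices.collect {
    case i if values(i) == pivot => (values.take(i), values.drop(i + 1))
  }

val candidates = splitsAround(Seq(1, 0, 2, 0, 3), 0)
// Two valid assignments:
//   before = Seq(1),       after = Seq(2, 0, 3)
//   before = Seq(1, 0, 2), after = Seq(3)
```

With nested sub-patterns on either side, a matcher would have to try each candidate in turn, which is the backtracking search that the single-`*` restriction avoids.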
|
||
### No Specific Performance Optimizations | ||
|
||
As proposed, the desugaring just relies on `IArray()` and `++` to construct the final | ||
sequence that will be passed to varargs. We don't want to use `Seq` because it returns | ||
a `List`, and `List#++` is very inefficient. But we do not do any further optimizations, | ||
and anyone who hits performance issues with flexible vararg unpacking can always rewrite | ||
it themselves by manually constructing a `mutable.ArrayBuffer` or `mutable.ArrayDeque`. | ||
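As a rough illustration of such a manual rewrite, here is a minimal sketch using `mutable.ArrayBuffer`. The function name `sumManually` is hypothetical, and whether this is actually faster than the `++`-based desugaring would need to be measured for a given workload.

```scala
import scala.collection.mutable

def sum(x: Int*): Int = x.sum

// Hypothetical hand-written replacement for `sum(0, numbers1*, numbers2*, 7)`
def sumManually(numbers1: Seq[Int], numbers2: Seq[Int]): Int = {
  val buffer = mutable.ArrayBuffer.empty[Int]
  buffer += 0
  buffer ++= numbers1
  buffer ++= numbers2
  buffer += 7
  // Varargs splicing expects an immutable `Seq`, so convert the buffer first
  sum(buffer.toSeq*)
}
```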
|
||
### Not supporting unpacking arbitrary `Iterable`s | ||
|
||
Given we propose to allow unpacking `Seq` and `Option`, it's worth asking if we should | ||
support unpacking `Iterable` or `IterableOnce`. This proposal avoids those types for now, | ||
but if anyone wants to unpack them it's trivial to call `.toSeq`. For `Option`, we think | ||
that the use case of calling a vararg method with optional values is frequent enough that | ||
it deserves special support, whereas calling a vararg method with a `Set` or a `Map` | ||
doesn't happen enough to be worth special casing. | ||
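For reference, the `.toSeq` workaround for other collections already works today. A small sketch, with `numberSet` and `numberIterator` as hypothetical example values:

```scala
def sum(x: Int*): Int = x.sum

val numberSet: Set[Int] = Set(1, 2, 3)
def numberIterator: Iterator[Int] = Iterator(4, 5, 6)

// Any `Iterable` or `IterableOnce` can be spliced after an explicit `.toSeq`
val setTotal = sum(numberSet.toSeq*)
val iteratorTotal = sum(numberIterator.toSeq*)
```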
|
||
## Alternatives | ||
|
||
Apart from the manual workflows described above, one alternative is to use implicit conversions | ||
to a target type, i.e. the "magnet pattern". For example: | ||
|
||
```scala | ||
case class Summable(value: Seq[Int]) | ||
implicit def seqSummable(value: Seq[Int]): Summable = Summable(value) | ||
implicit def singleSummable(value: Int): Summable = Summable(Seq(value)) | ||
implicit def optionSummable(value: Option[Int]): Summable = Summable(value.toSeq) | ||
|
||
// The varargs parameter takes the target type, so each argument is converted implicitly | ||
def sum(x: Summable*) = x.flatMap(_.value).sum | ||
|
||
val numbers1 = Seq(1, 2, 3) | ||
val numbers2 = Seq(4, 5, 6) | ||
|
||
val total = sum(0, numbers1, numbers2, 7) // 28 | ||
``` | ||
|
||
We can see this done repeatedly in the wild: | ||
|
||
* OS-Lib's `os.call` and `os.spawn` methods, which use | ||
`os.Shellable` as the target type, with [several implicit conversions](https://github.com/com-lihaoyi/os-lib/blob/ff52a8bc4873d9c01e085cc18780845ecea0f8a2/os/src/Model.scala#L217-L237) | ||
both for normal constructors as well as for `Seq[T]` and `Option[T]` cases | ||
|
||
* SBT's `settings` lists, which uses `SettingsDefinition` as the target type, with | ||
[two implicit conversions](https://github.com/sbt/sbt/blob/1d16ca95106a11ad4ef0e3c5a1637c17189600da/internal/util-collection/src/main/scala/sbt/internal/util/Settings.scala#L691-L695) | ||
for single and sequence entries | ||
|
||
* Scalatags' HTML templates, which use `Frag` as the target type and provide | ||
[an implicit conversion](https://github.com/com-lihaoyi/scalatags/blob/762ab37d0addc614bfd65bbeabeb5f123caf4395/scalatags/src/scalatags/Text.scala#L59-L63) from any `Seq[T]` with an implicit `T => Frag` | ||
|
||
This approach works, but relies on you controlling the | ||
target type, and adds considerable boilerplate defining implicit conversions for | ||
every such target type. It thus can sometimes be found in libraries where that | ||
overhead can be amortized over many callsites, but it is not a general | ||
replacement for the more flexible `*` proposed in this document. Note that while the | ||
"manual" approach of doing `foo ++ Seq(bar) ++ qux ++ Seq(baz)` could be applied to | ||
any of the three use cases above, all three libraries found it painful enough that | ||
adding implicit conversions was worthwhile. | ||
|
||
## Prior Art | ||
|
||
### Python | ||
|
||
Python's `*` syntax works identically to this proposal. In Python, you can mix | ||
single values with one or more `*` unpackings when calling a function: | ||
|
||
```python | ||
>>> a = [1, 2, 3] | ||
>>> b = [4, 5, 6] | ||
|
||
>>> print(*a, 0, *b) | ||
1 2 3 0 4 5 6 | ||
``` | ||
|
||
Python's [PEP634: Structural Pattern Matching](https://peps.python.org/pep-0634) | ||
has the same limitation of only allowing one `*` unpacking in its | ||
[Sequence Patterns](https://peps.python.org/pep-0634/#sequence-patterns), with | ||
an arbitrary number of non-`*` patterns on the left and right, and follows the | ||
[same pattern matching strategy](https://docs.python.org/3/reference/compound_stmts.html#sequence-patterns) | ||
that I sketched above. | ||
|
||
### Javascript | ||
Javascript's expression `...` syntax works identically to this proposal. In Javascript, you can mix | ||
single values with one or more `...` unpackings when calling a function: | ||
|
||
```javascript | ||
a = [1, 2, 3] | ||
b = [4, 5, 6] | ||
|
||
console.log(...a, 0, ...b) | ||
// 1 2 3 0 4 5 6 | ||
``` | ||
|
||
Javascript has a stricter limitation when destructuring an array, as it only allows | ||
single values to the _left_ of the `...rest` pattern: | ||
|
||
```javascript | ||
[a, b, ...rest] = [10, 20, 30, 40, 50]; | ||
// a = 10 | ||
// b = 20 | ||
// rest = [30, 40, 50] | ||
|
||
[a, b, ...rest, c] = [10, 20, 30, 40, 50]; | ||
// Uncaught SyntaxError: Rest element must be last element | ||
``` | ||
|
||
### C# | ||
|
||
C# also has structural pattern matching, and supports | ||
[List Patterns](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/patterns#list-patterns), | ||
which allows a single `..` "slice" to be placed anywhere in the list: | ||
|
||
```csharp | ||
Console.WriteLine(new[] { 1, 2, 3, 4, 5 } is [> 0, > 0, ..]); // True | ||
Console.WriteLine(new[] { 1, 1 } is [_, _, ..]); // True | ||
Console.WriteLine(new[] { 0, 1, 2, 3, 4 } is [> 0, > 0, ..]); // False | ||
Console.WriteLine(new[] { 1 } is [1, 2, ..]); // False | ||
|
||
Console.WriteLine(new[] { 1, 2, 3, 4 } is [.., > 0, > 0]); // True | ||
Console.WriteLine(new[] { 2, 4 } is [.., > 0, 2, 4]); // False | ||
Console.WriteLine(new[] { 2, 4 } is [.., 2, 4]); // True | ||
|
||
Console.WriteLine(new[] { 1, 2, 3, 4 } is [>= 0, .., 2 or 4]); // True | ||
Console.WriteLine(new[] { 1, 0, 0, 1 } is [1, 0, .., 0, 1]); // True | ||
Console.WriteLine(new[] { 1, 0, 1 } is [1, 0, .., 0, 1]); // False | ||
``` |