|
| 1 | +--- |
| 2 | +layout: doc-page |
| 3 | +title: "Capture Checking Basics" |
| 4 | +nightlyOf: https://docs.scala-lang.org/scala3/reference/experimental/capture-checking/basics.html |
| 5 | +--- |
| 6 | + |
| 7 | +## Introduction |
| 8 | + |
| 9 | +Capture checking can be enabled by the language import |
| 10 | +```scala |
| 11 | +import language.experimental.captureChecking |
| 12 | +``` |
| 13 | +At present, capture checking is still highly experimental and unstable, and it evolves quickly. |
| 14 | +Before trying it out, make sure you have the latest version of Scala. |
| 15 | + |
| 16 | +To get an idea what capture checking can do, let's start with a small example: |
| 17 | +```scala |
| 18 | +def usingLogFile[T](op: FileOutputStream => T): T = |
| 19 | + val logFile = FileOutputStream("log") |
| 20 | + val result = op(logFile) |
| 21 | + logFile.close() |
| 22 | + result |
| 23 | +``` |
| 24 | +The `usingLogFile` method invokes a given operation with a fresh log file as parameter. Once the operation has ended, the log file is closed and the |
| 25 | +operation's result is returned. This is a typical _try-with-resources_ pattern, similar to many other such patterns which are often supported by special language constructs in other languages. |
| 26 | + |
| 27 | +The problem is that `usingLogFile`'s implementation is not entirely safe. One can |
| 28 | +undermine it by passing an operation that performs the logging at some later point |
| 29 | +after it has terminated. For instance: |
| 30 | +```scala |
| 31 | +val later = usingLogFile { file => () => file.write(0) } |
| 32 | +later() // crash |
| 33 | +``` |
| 34 | +When `later` is executed it tries to write to a file that is already closed, which |
| 35 | +results in an uncaught `IOException`. |
| 36 | + |
| 37 | +Capture checking gives us the mechanism to prevent such errors _statically_. To |
| 38 | +prevent unsafe usages of `usingLogFile`, we can declare it like this: |
| 39 | +```scala |
| 40 | +def usingLogFile[T](op: FileOutputStream^ => T): T = |
| 41 | + // same body as before |
| 42 | +``` |
| 43 | +The only thing that's changed is that the `FileOutputStream` parameter of `op` is now |
| 44 | +followed by `^`. We'll see that this turns the parameter into a _capability_ whose lifetime is tracked. |
| 45 | + |
| 46 | +If we now try to define the problematic value `later`, we get a static error: |
| 47 | +``` |
| 48 | + | val later = usingLogFile { f => () => f.write(0) } |
| 49 | + | ^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 50 | + | Found: (f: java.io.FileOutputStream^'s1) ->'s2 () ->{f} Unit |
| 51 | + | Required: java.io.FileOutputStream^ => () ->'s3 Unit |
| 52 | + | |
| 53 | + | Note that capability f cannot be included in outer capture set 's3. |
| 54 | +``` |
| 55 | +In this case, it was easy to see that the `logFile` capability escapes in the closure passed to `usingLogFile`. But capture checking also works for more complex cases. |
| 56 | +For instance, capture checking is able to distinguish between the following safe code: |
| 57 | +```scala |
| 58 | +val xs = usingLogFile { f => |
| 59 | + List(1, 2, 3).map { x => f.write(x); x * x } |
| 60 | +} |
| 61 | +``` |
| 62 | +and the following unsafe one: |
| 63 | +```scala |
| 64 | +val xs = usingLogFile { f => |
| 65 | + LzyList(1, 2, 3).map { x => f.write(x); x * x } |
| 66 | +} |
| 67 | +``` |
| 68 | +An error would be issued in the second case, but not the first one (this assumes a capture-aware |
| 69 | +formulation `LzyList` of lazily evaluated lists, which we will present later in this page). |
| 70 | + |
| 71 | +It turns out that capture checking has very broad applications. Besides the various |
| 72 | +try-with-resources patterns, it can also be a key part to the solutions of many other long standing problems in programming languages. Among them: |
| 73 | + |
| 74 | + - How to have a simple and flexible system for checked exceptions. We show later |
| 75 | + how capture checking enables a clean and fully safe system for checked exceptions in Scala. |
| 76 | + - How to address the problem of effect polymorphism in general. |
| 77 | + - How to solve the "what color is your function?" problem of mixing synchronous |
| 78 | + and asynchronous computations. |
| 79 | + - How to do region-based allocation, safely. |
| 80 | + - How to reason about capabilities associated with memory locations. |
| 81 | + |
| 82 | +The following sections explain in detail how capture checking works in Scala 3. |
| 83 | + |
| 84 | + |
| 85 | +## Capabilities and Capturing Types |
| 86 | + |
| 87 | +The capture checker extension introduces a new kind of types and it enforces some rules for working with these types. |
| 88 | + |
| 89 | + |
| 90 | +Capture checking is done in terms of _capturing types_ of the form |
| 91 | +`T^{c₁, ..., cᵢ}`. Here `T` is a type, and `{c₁, ..., cᵢ}` is a _capture set_ consisting of references to capabilities `c₁, ..., cᵢ`. |
| 92 | + |
| 93 | +An _object capability_ is syntactically a method- or class-parameter, a local variable, or the `this` of an enclosing class. The type of a capability |
| 94 | +must be a capturing type with a non-empty capture set. We also say that |
| 95 | +variables that are capabilities are _tracked_. |
| 96 | + |
| 97 | +In a sense, every |
| 98 | +capability gets its authority from some other, more sweeping capability which it captures. The recursion stops with a _universal capability_, written `cap`, from which all other capabilities are ultimately derived. |
| 99 | +If `T` is a type, then `T^` is a shorthand for `T^{cap}`, meaning `T` can capture arbitrary capabilities. |
| 100 | + |
| 101 | +Here is an example: |
| 102 | +```scala |
| 103 | +class FileSystem |
| 104 | + |
| 105 | +class Logger(fs: FileSystem^): |
| 106 | + def log(s: String): Unit = ... // Write to a log file, using `fs` |
| 107 | + |
| 108 | +def test(fs: FileSystem^) = |
| 109 | + val l: Logger^{fs} = Logger(fs) |
| 110 | + l.log("hello world!") |
| 111 | + val xs: LzyList[Int]^{l} = |
| 112 | + LzyList.from(1) |
| 113 | + .map { i => |
| 114 | + l.log(s"computing elem # $i") |
| 115 | + i * i |
| 116 | + } |
| 117 | + xs |
| 118 | +``` |
| 119 | +Here, the `test` method takes a `FileSystem` as a parameter. `fs` is a capability since its type has a non-empty capture set. The capability is passed to the `Logger` constructor |
| 120 | +and retained as a field in class `Logger`. Hence, the local variable `l` has type |
| 121 | +`Logger^{fs}`: it is a `Logger` which retains the `fs` capability. |
| 122 | + |
| 123 | +The second variable defined in `test` is `xs`, a lazy list that is obtained from |
| 124 | +`LzyList.from(1)` by logging and mapping consecutive numbers. Since the list is lazy, |
| 125 | +it needs to retain the reference to the logger `l` for its computations. Hence, the |
| 126 | +type of the list is `LzyList[Int]^{l}`. On the other hand, since `xs` only logs but does |
| 127 | +not do other file operations, it retains the `fs` capability only indirectly. That's why |
| 128 | +`fs` does not show up in the capture set of `xs`. |
| 129 | + |
| 130 | +Capturing types come with a subtype relation where types with "smaller" capture sets are subtypes of types with larger sets (the _subcapturing_ relation is defined in more detail below). If a type `T` does not have a capture set, it is called _pure_, and is a subtype of |
| 131 | +any capturing type that adds a capture set to `T`. |
| 132 | + |
| 133 | +## Function Types |
| 134 | + |
| 135 | +The usual function type `A => B` now stands for a function that can capture arbitrary capabilities. We call such functions |
| 136 | +_impure_. By contrast, the new single arrow function type `A -> B` stands for a function that cannot capture any capabilities, or otherwise said, is _pure_. |
| 137 | +One can add a capture set after the arrow of an otherwise pure function. |
| 138 | +For instance, `A ->{c, d} B` would be a function that can capture capabilities `c` and `d`, but no others. |
| 139 | +This type is a shorthand for `(A -> B)^{c, d}`, i.e. the function type `A -> B` with possible captures `{c, d}`. |
| 140 | + |
| 141 | +The impure function type `A => B` is treated as an alias for `A ->{cap} B`. That is, impure functions are functions that can capture anything. |
| 142 | + |
| 143 | +A capture annotation `^` binds more strongly than a function arrow. So |
| 144 | +`A -> B^{c}` is read as `A -> (B^{c})` and `A -> B^` is read as `A -> (B^)`. |
| 145 | + |
| 146 | +Analogous conventions apply to context function types. `A ?=> B` is an impure context function, with `A ?-> B` as its pure complement. |
| 147 | + |
| 148 | +**Note 1:** The identifiers `->` and `?->` are now treated as soft keywords when used as infix type operators. They are |
| 149 | +still available as regular identifiers for terms. For instance, the mapping syntax `Map("x" -> 1, "y" -> 2)` is still supported since it only applies to terms. |
| 150 | + |
| 151 | +**Note 2:** The distinctions between pure vs impure function types do not apply to methods. In fact, since methods are not values they never capture anything directly. References to |
| 152 | +capabilities in a method are instead counted in the capture set of the enclosing object. |
| 153 | + |
| 154 | +## By-Name Parameter Types |
| 155 | + |
| 156 | +A convention analogous to function types also extends to by-name parameters. In |
| 157 | +```scala |
| 158 | +def f(x: => Int): Int |
| 159 | +``` |
| 160 | +the actual argument can refer to arbitrary capabilities. So the following would be OK: |
| 161 | +```scala |
| 162 | +f(if p(y) then throw Ex() else 1) |
| 163 | +``` |
| 164 | +On the other hand, if `f` was defined like this |
| 165 | +```scala |
| 166 | +def f(x: -> Int): Int |
| 167 | +``` |
| 168 | +the actual argument to `f` could not refer to any capabilities, so the call above would be rejected. |
| 169 | +One can also allow specific capabilities like this: |
| 170 | +```scala |
| 171 | +def f(x: ->{c} Int): Int |
| 172 | +``` |
| 173 | +Here, the actual argument to `f` is allowed to use the `c` capability but no others. |
| 174 | + |
| 175 | +## Subtyping and Subcapturing |
| 176 | + |
| 177 | +Capturing influences subtyping. As usual we write `T₁ <: T₂` to express that the type |
| 178 | +`T₁` is a subtype of the type `T₂`, or equivalently, that `T₁` conforms to `T₂`. An |
| 179 | +analogous _subcapturing_ relation applies to capture sets. If `C₁` and `C₂` are capture sets, we write `C₁ <: C₂` to express that `C₁` _is covered by_ `C₂`, or, swapping the operands, that `C₂` _covers_ `C₁`. |
| 180 | + |
| 181 | +Subtyping extends as follows to capturing types: |
| 182 | + |
| 183 | + - Pure types are subtypes of capturing types. That is, `T <: T ^ C`, for any type `T`, capturing set `C`. |
| 184 | + - For capturing types, smaller capturing sets produce subtypes: `T₁ ^ C₁ <: T₂ ^ C₂` if |
| 185 | + `C₁ <: C₂` and `T₁ <: T₂`. |
| 186 | + |
| 187 | +A subcapturing relation `C₁ <: C₂` holds if `C₂` _accounts for_ every element `c` in `C₁`. This means one of the following three conditions must be true: |
| 188 | + |
| 189 | + - `c ∈ C₂`, |
| 190 | + - `c` refers to a parameter of some class `Cls` and `C₂` contains `Cls.this`, |
| 191 | + - `c`'s type has capturing set `C` and `C₂` accounts for every element of `C` (that is, `C <: C₂`). |
| 192 | + |
| 193 | + |
| 194 | +**Example 1.** Given |
| 195 | +```scala |
| 196 | +fs: FileSystem^ |
| 197 | +ct: CanThrow[Exception]^ |
| 198 | +l : Logger^{fs} |
| 199 | +``` |
| 200 | +we have |
| 201 | +``` |
| 202 | +{l} <: {fs} <: {cap} |
| 203 | +{fs} <: {fs, ct} <: {cap} |
| 204 | +{ct} <: {fs, ct} <: {cap} |
| 205 | +``` |
| 206 | +The set consisting of the root capability `{cap}` covers every other capture set. This is |
| 207 | +a consequence of the fact that, ultimately, every capability is created from `cap`. |
| 208 | + |
| 209 | +**Example 2.** Consider again the FileSystem/Logger example from before. `LzyList[Int]` is a proper subtype of `LzyList[Int]^{l}`. So if the `test` method in that example |
| 210 | +was declared with a result type `LzyList[Int]`, we'd get a type error. Here is the error message: |
| 211 | +``` |
| 212 | +11 |def test(using fs: FileSystem^): LzyList[Int] = { |
| 213 | + | ^ |
| 214 | + | Found: LzyList[Int]^{fs} |
| 215 | + | Required: LzyList[Int] |
| 216 | +``` |
| 217 | +Why does it say `LzyList[Int]^{fs}` and not `LzyList[Int]^{l}`, which is, after all, the type of the returned value `xs`? The reason is that `l` is a local variable in the body of `test`, so it cannot be referred to in a type outside that body. What happens instead is that the type is _widened_ to the smallest supertype that does not mention `l`. Since `l` has capture set `fs`, we have that `{fs}` covers `{l}`, and `{fs}` is acceptable in a result type of `test`, so `{fs}` is the result of that widening. |
| 218 | +This widening is called _avoidance_; it is not specific to capture checking but applies to all variable references in Scala types. |
| 219 | + |
| 220 | +## Capability Classes |
| 221 | + |
| 222 | +Classes like `CanThrow` or `FileSystem` have the property that their values are always intended to be capabilities. We can make this intention explicit and save boilerplate by letting these classes extend the `SharedCapability` class defined in object `cap`. |
| 223 | + |
| 224 | +The capture set of type extending `SharedCapability` is always `{cap}`. This means we could equivalently express the `FileSystem` and `Logger` classes as follows: |
| 225 | +```scala |
| 226 | +import caps.SharedCapability |
| 227 | + |
| 228 | +class FileSystem extends SharedCapability |
| 229 | + |
| 230 | +class Logger(using FileSystem): |
| 231 | + def log(s: String): Unit = ??? |
| 232 | + |
| 233 | +def test(using fs: FileSystem) = |
| 234 | + val l: Logger^{fs} = Logger() |
| 235 | + ... |
| 236 | +``` |
| 237 | +In this version, `FileSystem` is a capability class, which means that the `{cap}` capture set is implied on the parameters of `Logger` and `test`. |
| 238 | + |
| 239 | +Another, unrelated change in the version of the last example here is that the `FileSystem` capability is now passed as an implicit parameter. It is quite natural to model capabilities with implicit parameters since it greatly reduces the wiring overhead once multiple capabilities are in play. |
| 240 | + |
| 241 | +## Escape Checking |
| 242 | + |
| 243 | +Capabilities follow the usual scoping discipline, which means that capture sets |
| 244 | +can contain only capabilities that are visible at the point where the set is defined. |
| 245 | + |
| 246 | +We now reconstruct how this principle produced the error in the introductory example, where |
| 247 | +`usingLogFile` was declared like this: |
| 248 | +```scala |
| 249 | +def usingLogFile[T](op: FileOutputStream^ => T): T = ... |
| 250 | +``` |
| 251 | +The error message was: |
| 252 | +``` |
| 253 | + | val later = usingLogFile { f => () => f.write(0) } |
| 254 | + | ^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 255 | + | Found: (f: java.io.FileOutputStream^'s1) ->'s2 () ->{f} Unit |
| 256 | + | Required: java.io.FileOutputStream^ => () ->'s3 Unit |
| 257 | + | |
| 258 | + | Note that capability f cannot be included in outer capture set 's3. |
| 259 | +``` |
| 260 | +This error message was produced by the following logic: |
| 261 | + |
| 262 | + - The `f` parameter has type `FileOutputStream^`, which makes it a capability. |
| 263 | + - Therefore, the type of the expression `() => f.write(0)` is `() ->{f} Unit`. |
| 264 | + - This makes the type of the whole closure passed to `usingLogFile` the dependent function type |
| 265 | + `(f: FileOutputStream^'s1) ->'s2 () ->{f} Unit`, |
| 266 | + for some as yet uncomputed capture sets `'s1` and `'s2`. |
| 267 | + - The expected type of the closure is a simple, parametric, impure function type `FileOutputStream^ => T`, |
| 268 | + for some instantiation of the type variable `T`. |
| 269 | + - Matching with the found type, `T` must have the shape `() ->'s3 Unit`, for |
| 270 | + some capture set `'s3` defined at the level of value `later`. |
| 271 | + - That capture set cannot include the capability `f` since `f` is locally bound. |
| 272 | + This causes the error. |
| 273 | + |
| 274 | +An analogous restriction applies to the type of a mutable variable. |
| 275 | +Another way one could try to undermine capture checking would be to |
| 276 | +assign a closure with a local capability to a global variable. Maybe |
| 277 | +like this: |
| 278 | +```scala |
| 279 | +var loophole: () => Unit = () => () |
| 280 | +usingLogFile { f => |
| 281 | + loophole = () => f.write(0) |
| 282 | +} |
| 283 | +loophole() |
| 284 | +``` |
| 285 | +But this will not compile either, since the capture set of the mutable variable `loophole` cannot refer to variable `f`, which is not visible |
| 286 | +where `loophole` is defined. |
| 287 | + |
| 288 | +Looking at object graphs, we observe a monotonicity property: The capture set of an object `x` covers the capture sets of all objects reachable through `x`. This property is reflected in the type system by the following _monotonicity rule_: |
| 289 | + |
| 290 | + - In a class `C` with a field `f`, the capture set `{this}` covers the capture set `{this.f}` as well as the capture set of any application of `this.f` to pure arguments. |
0 commit comments