Add GADT documentation to the manual #1096
Merged
e429217 adding GADT documentation and tutorial (Josef-Thorne-A)
37b3fcb code formatting fixed in GADT man page (Josef-Thorne-A)
fc343b1 Merge pull request #1 from rescript-lang/master (Josef-Thorne-A)
3d3934f Making changes according to feedback (Josef-Thorne-A)
d2cc815 Merge branch 'master' of github.com:Josef-Thorne-A/rescript-lang.org (Josef-Thorne-A)
05ac308 Formatting fix (Josef-Thorne-A)
b98d798 Fixing remaining feedback (Josef-Thorne-A)
294 changes: 294 additions & 0 deletions
pages/docs/manual/v12.0.0/generalized-algebraic-data-types.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,294 @@ | ||
--- | ||
title: "Generalized Algebraic Data Types" | ||
description: "Generalized Algebraic Data Types in ReScript" | ||
canonical: "/docs/manual/v12.0.0/generalized-algebraic-data-types" | ||
--- | ||
|
||
# Generalized Algebraic Data Types | ||
|
||
Generalized Algebraic Data Types (GADTs) are an advanced feature of ReScript's type system. "Generalized" is something of a misnomer: what GADTs actually let you do is make your variants more type-specific. Using a GADT, you can give the individual constructors of a variant _different_ types. | ||
|
||
For a quick overview of the use cases, reach for GADTs when: | ||
|
||
1. You need to distinguish between different members of a variant at the type level. | ||
2. You want to "hide" type information in a type-safe way, without resorting to casts. | ||
3. You need a function to return a different type depending on its input. | ||
|
||
GADTs are usually overkill, but when you need them, you need them! Understanding them from first principles is difficult, so they are best explained through some motivating examples. | ||
|
||
## Distinguishing Constructors (Subtyping) | ||
|
||
Consider a simple variant type that represents the current timezone of a date value. It covers both daylight saving time and standard time: | ||
|
||
```res example | ||
type timezone = | ||
| EST // standard time | ||
| EDT // daylight time | ||
| CST // standard time | ||
| CDT // daylight time | ||
// etc... | ||
``` | ||
|
||
Using this variant type, we will end up having functions like this: | ||
|
||
```res example | ||
let convert_to_daylight = tz => { | ||
switch tz { | ||
| EST => EDT | ||
| CST => CDT | ||
| EDT | CDT /* or, _ */ => failwith("Invalid timezone provided!") | ||
} | ||
} | ||
``` | ||
|
||
This function is only valid for a subset of our variant type's constructors, but we can't express that in a type-safe way using regular variants. We have to enforce it at runtime, and moreover the compiler can't help us ensure that we fail only in the invalid cases. We are back to dynamically checking validity, as we would in a language without static typing. If you work with a large variant type long enough, you will frequently find yourself writing repetitive catchall `switch` expressions like the one above, for little actual benefit. The compiler should be able to help us here. | ||
|
||
Let's see if we can get the compiler to help us using normal variants. We could define another variant type to distinguish the two kinds of timezone: | ||
|
||
```res example | ||
type daylight_or_standard = | ||
| Daylight(timezone) | ||
| Standard(timezone) | ||
``` | ||
|
||
This has a lot of problems. For one, it's cumbersome and redundant. We would now have to pattern-match twice whenever we deal with a timezone that's wrapped up here. The compiler will force us to check whether we are dealing with daylight or standard time, but notice that there's nothing stopping us from providing invalid timezones to these constructors: | ||
|
||
```res example | ||
let invalid_tz1 = Daylight(EST) | ||
let invalid_tz2 = Standard(EDT) | ||
``` | ||
|
||
Consequently, we still have to write our redundant catchall cases. We could define daylight saving time and standard time as two _separate_ types, and unify those in our `daylight_or_standard` variant. That could be a passable solution, but what we would really like to do is implement some kind of _subtyping_ relationship. We have two _kinds_ of timezone. This is where GADTs are handy: | ||
|
||
```res example | ||
type standard | ||
type daylight | ||
type rec timezone<_> = | ||
| EST: timezone<standard> | ||
| EDT: timezone<daylight> | ||
| CST: timezone<standard> | ||
| CDT: timezone<daylight> | ||
``` | ||
|
||
We define our type with a type parameter. We manually annotate each constructor, providing it with the correct type parameter indicating whether it is standard or daylight. Each constructor is a `timezone`, | ||
but we've added another level of specificity using a type parameter. Constructors are now understood to be `standard` or `daylight` at the _type_ level. Now we can fix our function like this: | ||
|
||
```res example | ||
let convert_to_daylight = tz => { | ||
switch tz { | ||
| EST => EDT | ||
| CST => CDT | ||
} | ||
} | ||
``` | ||
|
||
The compiler can infer correctly that this function should only take `timezone<standard>` and only output | ||
`timezone<daylight>`. We don't need to add any redundant catchall cases and the compiler will even error if | ||
we try to return a standard timezone from this function. This raises a problem, though: sometimes | ||
we still want to match on all cases of the variant, and a naive attempt at this will not pass the type checker: | ||
|
||
```res example | ||
let convert_to_daylight = tz => { | ||
switch tz { | ||
| EST => EDT | ||
| CST => CDT | ||
| CDT => CDT | ||
| EDT => EDT | ||
} | ||
} | ||
``` | ||
|
||
The compiler will complain that `daylight` and `standard` are incompatible. To fix this, we need an explicit annotation that tells the compiler to accept both: | ||
|
||
```res example | ||
let convert_to_daylight : type a. timezone<a> => timezone<daylight> = // ... | ||
``` | ||
|
||
`type a.` here defines a _locally abstract type_, which tells the compiler that the type parameter `a` is some specific type, but that we don't care which one. The cost of the extra specificity and safety that GADTs give us is that the compiler cannot help us as much with type inference. | ||
|
||
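As a hedged illustration of the annotation, the elided definition might be completed as in the sketch below. The four-constructor `timezone` type is repeated so the snippet stands on its own, and the function body is an assumption based on the earlier examples rather than the only possible implementation.

```res
type standard
type daylight

type rec timezone<_> =
  | EST: timezone<standard>
  | EDT: timezone<daylight>
  | CST: timezone<standard>
  | CDT: timezone<daylight>

// `type a.` lets each branch refine `a` separately: it is `standard` in the
// first two branches and `daylight` in the last two.
let convert_to_daylight: type a. timezone<a> => timezone<daylight> = tz =>
  switch tz {
  | EST => EDT
  | CST => CDT
  | EDT => EDT
  | CDT => CDT
  }
```

With this signature the function accepts any `timezone`, while its result is still guaranteed to be a `timezone<daylight>`.
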
## Varying Return Type | ||
|
||
Sometimes, a function should have a different return type based on what you give it, and GADTs let us do this in a type-safe way. We can implement a generic `add` function that works on both `int` and `float`: | ||
|
||
```res example | ||
type rec number<_> = Int(int): number<int> | Float(float): number<float> | ||
let add: | ||
type a. (number<a>, number<a>) => a = | ||
(a, b) => | ||
switch (a, b) { | ||
| (Int(a), Int(b)) => a + b | ||
| (Float(a), Float(b)) => a +. b | ||
} | ||
let foo = add(Int(1), Int(2)) | ||
let bar = add(Int(1), Float(2.0)) // the compiler will complain here | ||
``` | ||
|
||
How does this work? The key is the function signature of `add`. The `number` GADT acts as a _type witness_. We have told the compiler that the type parameter of `number` is the same as the type we return: both are `a`. So if we provide a `number<int>`, `a` equals `int`, and the function therefore returns an `int`. | ||
|
||
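The same witness idea also works with a single argument. The sketch below uses a hypothetical `unwrap` helper (not part of the original example) whose return type follows the type parameter of the `number` it receives.

```res
type rec number<_> = Int(int): number<int> | Float(float): number<float>

// `unwrap` is a hypothetical helper for illustration only.
// Matching on the constructor tells the compiler what `a` is in each branch.
let unwrap: type a. number<a> => a = n =>
  switch n {
  | Int(i) => i
  | Float(f) => f
  }

let i: int = unwrap(Int(1))
let f: float = unwrap(Float(2.5))
```

The annotations on `i` and `f` are optional; they are written out only to show which type the compiler picks for each call.
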
We can also use this to avoid returning `option` unnecessarily. This example is adapted from Real World OCaml, chapter 9. We create an array-searching function that can be configured to either raise an exception, return an `option`, or return a default value, depending on the behavior we want. | ||
|
||
```res example | ||
module IfNotFound = { | ||
type rec t<_, _> = | ||
| Raise: t<'a, 'a> | ||
| ReturnNone: t<'a, option<'a>> | ||
| DefaultTo('a): t<'a, 'a> | ||
} | ||
let flexible_find: | ||
type a b. (~f: a => bool, array<a>, IfNotFound.t<a, b>) => b = | ||
(~f, arr, ifNotFound) => { | ||
open IfNotFound | ||
switch Array.find(arr, f) { | ||
| None => | ||
switch ifNotFound { | ||
| Raise => failwith("No matching item found") | ||
| ReturnNone => None | ||
| DefaultTo(x) => x | ||
} | ||
| Some(x) => | ||
switch ifNotFound { | ||
| ReturnNone => Some(x) | ||
| Raise => x | ||
| DefaultTo(_) => x | ||
} | ||
} | ||
} | ||
``` | ||
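
To see how the result type follows the strategy argument, here is a self-contained sketch of how such a function might be called. The definitions are repeated, with the `switch` reorganized to match on the strategy first; the sample array and the binding names `found`, `missing`, and `fallback` are made up for illustration.

```res
module IfNotFound = {
  type rec t<_, _> =
    | Raise: t<'a, 'a>
    | ReturnNone: t<'a, option<'a>>
    | DefaultTo('a): t<'a, 'a>
}

let flexible_find: type a b. (~f: a => bool, array<a>, IfNotFound.t<a, b>) => b =
  (~f, arr, ifNotFound) => {
    open IfNotFound
    switch ifNotFound {
    | Raise =>
      switch Array.find(arr, f) {
      | Some(x) => x
      | None => failwith("No matching item found")
      }
    // Here `b` is `option<a>`, so the search result can be returned directly.
    | ReturnNone => Array.find(arr, f)
    | DefaultTo(default) =>
      switch Array.find(arr, f) {
      | Some(x) => x
      | None => default
      }
    }
  }

let numbers = [1, 2, 3]

// With `ReturnNone`, the result is an `option<int>`.
let found: option<int> = flexible_find(~f=x => x > 1, numbers, IfNotFound.ReturnNone)
let missing: option<int> = flexible_find(~f=x => x > 10, numbers, IfNotFound.ReturnNone)

// With `DefaultTo`, the result is a plain `int`.
let fallback: int = flexible_find(~f=x => x > 10, numbers, IfNotFound.DefaultTo(0))
```
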
|
||
## Hiding and Recovering Type Information Dynamically | ||
|
||
In a very advanced case that combines many of the above techniques, we can use GADTs to selectively hide and recover type information. This helps us create more generic types. | ||
The example below defines a `num` type similar to the one in our addition example above, but it lets us use `int` and `float` arrays | ||
interchangeably, hiding the implementation type rather than exposing it. This is similar to a regular variant, except that each constructor holds a pair: a `num_ty` and another value. | ||
`num_ty` serves as a type witness, making it | ||
possible to recover, at runtime, the type information that was hidden. Matching on `num_ty` "reveals" the type of the other value in the pair. We can use this to write a generic sum function over arrays of numbers: | ||
|
||
```res example | ||
type rec num_ty<'a> = | ||
| Int: num_ty<int> | ||
| Float: num_ty<float> | ||
and num = Num(num_ty<'a>, 'a): num | ||
and num_array = Narray(num_ty<'a>, array<'a>): num_array | ||
let add_int = (x, y) => x + y | ||
let add_float = (x, y) => x +. y | ||
let sum = (Narray(witness, array)) => { | ||
switch witness { | ||
| Int => Num(Int, array->Array.reduce(0, add_int)) | ||
| Float => Num(Float, array->Array.reduce(0., add_float)) | ||
} | ||
} | ||
``` | ||
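
Once a value is wrapped in `Num`, the only way to use it is to match on the witness again. The sketch below shows this with a hypothetical `num_to_string` helper (not part of the original example); the types are repeated so that it stands on its own, and it assumes the standard library's `Int.toString` and `Float.toString`.

```res
type rec num_ty<'a> =
  | Int: num_ty<int>
  | Float: num_ty<float>
and num = Num(num_ty<'a>, 'a): num

// Hypothetical helper: matching on the witness reveals the hidden type,
// so each branch can use the conversion function for that type.
let num_to_string = (Num(witness, value)) =>
  switch witness {
  | Int => Int.toString(value)
  | Float => Float.toString(value)
  }

let a = num_to_string(Num(Int, 42))
let b = num_to_string(Num(Float, 1.5))
```

Both calls pass a plain `num`; the caller never needs to say whether the payload is an `int` or a `float`.
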
|
||
## A Practical Example: Writing Bindings | ||
|
||
JavaScript libraries that are highly polymorphic or use inheritance can benefit hugely from GADTs, but GADTs can be useful for bindings in other cases too. The following examples write bindings to a simplified version | ||
of Node's `Stream` API. | ||
|
||
This API has a method for binding event handlers, `on`, which takes an event and a callback. The callback accepts different parameters | ||
depending on which event we are binding to. A naive implementation might look similar to this, defining a | ||
separate function for each stream event to wrap the unsafe version of `on`: | ||
|
||
```res example | ||
module Stream = { | ||
type t | ||
@new @module("node:stream") external make: unit => t = "Stream" | ||
@send external on: (t, string, 'a) => unit = "on" | ||
let onEnd = (stream, callback: unit => unit) => stream->on("end", callback) | ||
let onData = (stream, callback: 'a => 'b) => stream->on("data", callback) | ||
// etc. ... | ||
} | ||
``` | ||
|
||
Not only is this quite tedious to write, and quite ugly, but we gain very little from it. The function wrappers even add performance overhead, so we are losing on almost all fronts. If we define subtypes of | ||
Stream like `Readable` or `Writable`, which have all sorts of special interactions with the callback that jeopardize our type-safety, we are going to be in even deeper trouble. | ||
|
||
Instead, we can use the same GADT technique that let us vary the return type to vary the input type. | ||
Not only are we able to now just use a single method, but the compiler will guarantee we are always using the correct callback type for the given event. We simply define an event GADT which specifies | ||
the type signature of the callback and pass this instead of a plain string. | ||
|
||
Additionally, we use some type parameters to represent the different types of Streams. | ||
|
||
This example is complex, but it enforces many useful rules. An event can never be used | ||
with the wrong callback, nor with the wrong kind of stream. The compiler will complain, for example, if we try to use a `Pipe` event with anything other than a `writable` stream. | ||
|
||
The real magic happens in the signature of `on`. Read it carefully, then look at the examples and try to | ||
follow how the type variables get filled in. If you need to, write out on paper what each type variable is equal | ||
to, and it will soon become clear. | ||
|
||
```res example | ||
module Stream = { | ||
type t<'a> | ||
type writable | ||
type readable | ||
type buffer = {buffer: ArrayBuffer.t} | ||
@unboxed | ||
type chunk = | ||
| Str(string) | ||
// Node actually uses its own Buffer type, but for this tutorial we just use the stdlib's ArrayBuffer type. | ||
| Buf(buffer) | ||
type rec event<_, _> = | ||
// "as" here is setting the runtime representation of our constructor | ||
| @as("pipe") Pipe: event<writable, t<readable> => unit> | ||
| @as("end") End: event<'inputStream, option<chunk> => unit> | ||
| @as("data") Data: event<readable, chunk => unit> | ||
@new @module("node:stream") external make: unit => t<'a> = "Stream" | ||
@send | ||
external on: (t<'inputStream>, event<'inputStream, 'callback>, 'callback) => unit = "on" | ||
} | ||
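// The module above defines a single `make`, so the stream kind is chosen by annotation. | ||
let writer: Stream.t<Stream.writable> = Stream.make() | ||
let reader: Stream.t<Stream.readable> = Stream.make() | ||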
// Types will be correctly inferred for each callback, based on the event parameter provided | ||
writer->Stream.on(Pipe, r => { | ||
Js.log("Piping has started") | ||
r->Stream.on(Data, chunk => | ||
switch chunk { | ||
| Stream.Str(s) => Js.log(s) | ||
| Stream.Buf(buffer) => Js.log(buffer) | ||
} | ||
) | ||
}) | ||
writer->Stream.on(End, _ => Js.log("End reached")) | ||
``` | ||
|
||
This example covers only a tiny, imaginary subset of Node's Stream API, but it shows a realistic case | ||
where GADTs are all but indispensable. | ||
|
||
## Conclusion | ||
|
||
While GADTs can make your types more expressive and give you more safety, with great power comes great | ||
responsibility. Code that uses GADTs can sometimes be too clever for its own good. The type errors you | ||
encounter will be more difficult to understand, and the compiler sometimes requires extra help to properly | ||
type your code. | ||
|
||
However, there are definite situations where GADTs are the _right_ decision | ||
and will _simplify_ your code and help you avoid bugs, even rendering some bugs impossible. The `Stream` example above is a good case where the "simpler" alternative of using regular variants or even strings | ||
would lead to a much more complex and error-prone interface. | ||
|
||
Ordinary variants are therefore not necessarily _simple_, nor are GADTs necessarily _complex_. | ||
The choice is rather which tool is the right one for the job. When your logic is complex, the highly expressive nature of GADTs can make it simpler to capture that logic. | ||
When your logic is simple, it's best to reach for a simpler tool and avoid the cognitive overhead. | ||
The only way to get good at identifying which the situation calls for is to try out | ||