Improved contextual type inference errors with bidirectional type checking #4661
Replies: 2 comments
-
Very cool! Hayleigh shared this paper on bidirectional type checking here: https://www.cl.cam.ac.uk/~nk480/bidir.pdf This would be more or less a full rewrite of the analyser, which seems like a very daunting bit of work. I like the improved case expression message. I'd be interested in what impact this could have on the |
Beta Was this translation helpful? Give feedback.
-
This would be incredible. Count me in as the biggest supporter of this idea. As Louis says, this would be a substantial rewrite. While special-casing situations isn't perfect, I think having temporary logic for the worst offenders would be a big improvement over the status quo. Are there any incremental changes that can be put in place for syntax like |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Background
During a discussion on Discord, we noticed two places where type error messages could potentially be improved (#4639, #4640). While looking at the compiler code trying to get an idea how I would implement this, I thought that it would be best instead to change how contextual information is handled during the entire type inference process, not just special-casing these 2 situations.
Right now, expressions are type-checked bottom-up; the untyped ast is traversed in a depth-first order, and nodes are recursively typed. The high-level algorithm for this looks like this:
as an example, when typechecking a function call, the arguments are typed first, before they are unified with their expected argument types. From this, the return type is inferred and returned to the called. The caller then (after the call has finished type-checking) unfies the result with its expected type. If those do not match, the entire call expression is marked as a type error.
Proposal
Instead, I propose passing known type information about an expression down to the infer function for tha expression; Most expreessions are composed of more nested expressions of different shapes. Instead of first producing a typed child epxression which then gets unified with that result, the individual inference functions could directly unify their result with all known information, providing more targeted type error messages regardless of code structure.
The new high-level overview of the algorithm would look like this:
Current state
The compiler does something similar already in exactly one situation:
When inferring anonymous functions, the surrounding type context is passed down the tree to allow anonymous functions to access record and tuple fields without needing annotations. This proposal is basically to do something similar for all other expressions, too.
Examples
Here I've tried to simulate in my head what this might look like; Of course I haven't run this code, so maybe it doesn't actually work that way! I expect people to be quite opinionated here. I think a good result of this might be to try it in certain situations (for example case expressions) first and slowly expand that system over time if the resulting errors are indeed better.
1.
let
bindingscurrently, the compiler first fully type-checks the expression on the right-hand side, and then tries to unify it with the type annotation on the left. If it doesn't match, the entire right-hand side is marked as a type error.
Current Compiler Error
New Compiler Error
What's happening here? Let's first desugar the pipe to regular function calls. We can also directly add the types of functions defined in other modules here, so I will also do that.
Definitely too detailed explanation of the inference steps
In the new algorithm, we pass type information down, so we immediately know what type
result.unwrap
needs to return, and therefore whata
needs to be equal to:The algorithm works recursively: we can now know that the
list.try_map
call has to returnResult(List(Int), b)
, and we can pass this information down. Again, the when inferring the call, we know that this type has to unify with the return type oflist.try_map
and has to matchResult(List(d), e)
. From this,d
is inferred to beInt
ande
is inferred to beb
:we can immediately infer the type of the constant list as
List(String)
. arguments are still unified from left-to-right, so nothing special happens here.c
is equal toString
. Now when inferring the anonymous function, we already know that the try_map argument has to be of typefn(String) -> Result(Int, b)
. Again, Instead of inferring the child expression and unifying, we pass that information down:str
has to unify withString
, and the function body (and return value) has to unify withResult(Int, b)
.float.parse
is a function of typefn(String) -> Result(Float, Nil)
, so before even type-checking the arguments, the inference process would stop here, since the result types cannot be unified: The constraints we gathered up to this point are conflicting.2. generic values
Current compiler error
New compiler error
I do not know what to do with the hint in that case.
case expressions
In a case expression, if all branches match except for one, only that specific branch (and even the specific expression inside that branch) can be annotated with the error.
Current Error Message
New Error Message
use
ExpressionsIn use expressions, the return type of the function can be used to add constraints to the return type of the middleware function, in turn constraining the callback function type. Passing down these constraints means that
use
expressions:Current Error Message
New Error Message
Beta Was this translation helpful? Give feedback.
All reactions