Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
169 changes: 118 additions & 51 deletions resources/type-system/inference.md
Original file line number Diff line number Diff line change
Expand Up @@ -1134,25 +1134,58 @@ succintly, the syntax of Dart is extended to allow the following forms:
any loss of functionality, provided they are not trying to construct a proof
of soundness._

In addition, the following forms are added to allow constructor invocations,
dynamic method invocations, function object invocations, instance method
invocations, and static/toplevel method invocations to be distinguished:
In addition, forms are added to allow constructor invocations, dynamic method
invocations, function object invocations, instance method invocations, and
static/toplevel method invocations to be distinguished. In these forms, `n_i:
m_i` (where `i` is an integer) is used as a convenient meta-syntax to refer to
an invocation argument `m_i` (an elaborated expression), possibly preceded by a
name selector `n_i:` (where `n_i` is a string). In this document, we use the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String seems misleading here, maybe just say parameter name and leave it abstract?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

string `∅` to represent a name selector which is absent (meaning the
corresponding `m_i` is a positional argument rather than a named argument).

The new invocation forms are as follows:

- `@DYNAMIC_INVOKE(m_0.id<T_1, T_2, ...>(n_1: m_1, n_2: m_2, ...))`

_This covers invocations of user-definable binary and unary operators, method
invocations, and implicit uses of the `call` method, where the target has
static type `dynamic` or `Function`. For example, if `d` has static type
`dynamic`, the following expressions will be elaborated using
`@DYNAMIC_INVOKE`: `d.f()`, `-d`, `d + 0`, and `d()`._

- `@FUNCTION_INVOKE(m_0.call<T_1, T_2, ...>(n_1: m_1, n_2: m_2, ...))`

_This covers invocations of expressions whose type is a function type (but not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the division between FUNCTION_INVOKE and STATIC_INVOKE. At a minimum, I think you need to say something here like "expressions which do not resolve to top level or static methods" because otherwise it is ambiguous what to do with f(3) where f is a top level method (because f is an expression whose type is a function type, and it is also a top level method). I'm also confused about something like x.call(). I guess maybe the screwy Dart grammar doesn't treat the x.call here as an expression and therefore we can say that it is unambiguously covered by INSTANCE_INVOKE instead of being ambiguous? Is that right? So (x.call)() is a FUNCTION_INVOKE, but x.call() is an INSTANCE_INVOKE?

`Function`). For example, if `l` has type `List<void Function()>`, then
`l.first()` will be elaborated using `@FUNCTION_INVOKE`._

- `@INSTANCE_INVOKE(m_0.id<T_1, T_2, ...>(n_1: m_1, n_2: m_2, ...))`

_This covers invocations of user-definable binary and unary operators, method
invocations, and implicit uses of the `call` method, where the static type of
the target is the interface type of a class, mixin, enum, or extension
type. For example, if `i` has static type `int`, and the static type of `x` is
a class containing a `call` method, then the following expressions will be
elaborated using `@INSTANCE_INVOKE`: `i.abs()`, `-i`, `i + 1`, and `x()`._

- `@STATIC_INVOKE(f<T_1, T_2, ...>(n_1: m_1, n_2: m_2, ...))`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the syntactic category of f? Is it only an identifier? Or prefixed identifier perhaps? Probably not an expression?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is a situation where my choice of notation really fails to capture what’s in my head.

My intention is for @STATIC_INVOKE to be the spec counterpart of the kernel’s StaticInvocation class (https://github.com/dart-lang/sdk/blob/f1febd2cf3bf754ae641eb3fda8ee885b8bda5e5/pkg/kernel/lib/ast.dart#L6531), with f corresponding to StaticInvocation.targetReference, which has type Reference. A kernel Reference is an indirection mechanism allowing different parts of the kernel representation of a program to refer to one another unambiguously. In practice it’s serialized as a sequence of strings separated by ::, the first of which is a URI (e.g. dart:core::int::tryParse), but that’s an unimportant implementation detail. What’s important is that StaticInvocation.targetReference is some way of referring to unambiguously to a top level function or static method defined in a library somewhere in the user’s program.

Putting on my spec writer’s hat, I guess if we want to be really formal about it, we could say that f is either a pair (URI, name) referring to a top level function named name in the library located at URI, or a triple (URI, name1, name2), where name1 is the name of a class, mixin, enum, or extension, and name2 is the name of a static method inside the thing named by name1. But I don’t really want to be this formal because it will make my examples really unwieldy (e.g. I want to say @STATIC_INVOKE(print(s)) rather than @STATIC_INVOKE(('dart:core', 'print')(s))).

Also, even (URI, name1, name2) isn’t good enough to uniquely identify a static method inside an unnamed extension, since a single library could have multiple unnamed extensions containing static methods with the same name. The kernel representation deals with this by giving every unnamed extension a synthetic name, but that feels like an implementation detail of the kernel that doesn’t belong in the spec.

What I really want to say is: “gentle reader, please think of f using whatever representation suits you for unambiguously referring to a static method or top level function that exists somewhere in the program being type inferred. Maybe that’s a combination of a URI and some strings, with some synthetic naming magic to disambiguate methods in unnamed extensions. Maybe it’s a unique integer identifier you assign to each static method or top level function in the user’s program. For my purposes in this document, I’m going to try to stick to examples where it’s clear from context what I’m referring to, like @STATIC_INVOKE(print(s)) (which clearly refers to the top level function print in dart:core or @STATIC_INVOKE(int.tryParse(s)) (which clearly refers to the static method tryParse in the class int in dart:core).”

I’m not really sure how to express that idea in the sort of formal style that we typically use for specifications. Any suggestions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you want to move to the semantic layer here? In the examples above, you just use the elaborated expression as the target. Why not something similar here? That is, you are starting with an expression (or maybe prefixed identifier?) f, why not just continue to use that in the meta-syntax?


_This covers invocations of static methods and top level methods. For example,
the following expressions will be elaborated using `@STATIC_INVOKE`:
`print(s)` and `int.tryParse(s)`._

In each of these forms, each `m_i` represents an elaborated expression, each
`T_i` represents a type, and each `n_i` represents an optional argument name
identifier. When present, `id` represents an identifier or operator name, and
`f` represents a static method or top level function.

The semantics of each of these forms is to evaluate the `m_i` in sequence, then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically I think "evaluating the lookup of id in m_i" is interposed in here. This is required for at least the DYNAMIC_INVOKE, since you don't know whether you are invoking a method or calling a function typed getter. Whether it is required for any of the others depends a bit on the answer to my previous question about what semantic forms are intended to be covered here. That is, do you intend

class A {
  int Function(int) f = throw "yeehaw"; 
  void test() {
      this.f(3);
  }
}

to be covered by one of the above, or do you intend to add something else to cover this later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I’ve tried to clarify this as much as I can without formally specifying the exact semantics of each form.

As to your example with this.f(3), I intend to eventually represent this as something like @FUNCTION_INVOKE(@INSTANCE_GET(m_0.id).call(3)). In this section, I’ve tried to hint at this in the non-normative text I’ve added under @FUNCTION_INVOKE.

The full details are in the “Method invocation inference” section, under the bullet “Otherwise, if U_0 has an accessible instance getter named id, then:”, but that text is a little wishy washy because I haven’t introduced the notion of @INSTANCE_GET yet as of this PR.

perform the appropriate kind of method or function call.
perform the appropriate kind of method or function call. _The precise runtime
semantics are not specified here, however it should be noted that in the case of
`@DYNAMIC_INVOKE`, the name `id` may not be found in the target's runtime type
information at all (in which case `noSuchMethod` will be invoked), or `id` may
resolve to a getter (in which case the getter will be invoked, and then a
dynamic invocation of `call` will be attempted)._

## Additional properties satisfied by elaborated expressions

Expand Down Expand Up @@ -1263,8 +1296,8 @@ non-generic function or method, or when type arguments need to be inferred._
The output of argument part inference is a sequence of elaborated expressions
`{m_1, m_2, ...}` (of the same length as the sequence of input expressions), a
sequence of zero or more elaborated type arguments `{U_1, U_2, ...}`, and a
result type known as `R`. _(The sequence of optional names is unchanged by
argument part inference.)_
result type `R`. _(The sequence of optional names is unchanged by argument part
inference.)_

To emphasize the relationship between argument part inference and the syntax of
Dart source code, the inputs and outputs of argument part inference are
Expand All @@ -1281,8 +1314,8 @@ The procedure for argument part inference is as follows:
- Let `{X_1, X_2, ...}` be the list of formal type parameters in `F`, or an
empty list if `F` is `∅`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From reading the below, I think (but am not positive) that it would be very easy to just split out the case where F is not present into a single section, instead of interspersing it throughout. If this is relatively easy to do (and I think it should be, because it basically boils down to "do inference on the arguments in the empty context") I think it will help a fair bit in making this more readable, because I'm getting lost in the nested series of "if we are in this case do X" below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can definitely see the appeal of that. If you don’t mind, I’d like to try to do it in a follow-up PR after landing this one, because it might involve moving a lot of text around and duplicating some things, so I want to make sure I’ve fully addressed all your other code review comments before I do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack.


- Let `{P_1, P_2, ...}` be the list of formal parameter types corresponding to
`{e_1, e_2, ...}`, determined as follows:
- Let `{P_1, P_2, ...}` be the list of formal parameter type schemas
corresponding to `{e_1, e_2, ...}`, determined as follows:

- If `F` is `∅`, then let each `P_i` be `_`.

Expand Down Expand Up @@ -1330,17 +1363,17 @@ The procedure for argument part inference is as follows:
- Initialize `{U_1, U_2, ...}` to a list of type schemas, of the same length
as `{X_1, X_2, ...}`, each of which is `_`.

- Using subtype constraint generation, attempt to match `R_F` as a subtype
of `K` with respect to the list of type variables `{X_1, X_2, ...}`.
- Check whether `R_F` is a subtype match for `K` with respect to the list of
type variables `{X_1, X_2, ...}`.

- If this succeeds, then accumulate the resulting list of type constraints
into `C`. Then, let `{V_1, V_2, ...}` be the constraint solution for the
set of type variables `{X_1, X_2, ...}` with respect to the constaint
set `C`, with partial solution `{U_1, U_2, ...}`. Replace `{U_1, U_2,
...}` with `{V_1, V_2, ...}`. _This step is known as "downwards
inference", since it has the effect of passing type information __down__
the syntax tree from the context of the invocation to the context of the
arguments._
set of type variables `{X_1, X_2, ...}` with respect to the constraint
set `C`, with partial solution `{U_1, U_2, ...}`. Let the new values of
`{U_1, U_2, ...}` be `{V_1, V_2, ...}`. _This step is known as
"downwards inference", since it has the effect of passing type
information __down__ the syntax tree from the context of the invocation
to the context of the arguments._

- Otherwise, leave `C` and `{U_1, U_2, ...}` unchanged.

Expand Down Expand Up @@ -1369,18 +1402,27 @@ The procedure for argument part inference is as follows:
has the effect that all arguments that are not function expressions are type
inferred first, in the order in which they appear in source code, followed
by all arguments that __are__ function expressions, possibly in a staged
fashion._
fashion. The reason for traversing the non-function expressions in stage
zero before the function expressions is to allow any type promotions that
occur in the non-function expressions to carry over into the function
expressions, regardless of the order of the arguments. For example, `f(int?
i) => g(() { print(i+1); }, i!);` is accepted by type inference, because the
null check `i!` is guaranteed to execute before the function expression `()
{ print(i+1); }` can ever be invoked. See
https://github.com/dart-lang/language/blob/main/accepted/2.18/horizontal-inference/feature-specification.md#why-do-we-defer-all-function-literals
for more information._

- If `F` has at least one formal type parameter, and there are zero type
arguments `T` (_in other words, generic type inference is being performed_),
then:

- For each `e_i` in stage _k_:
- For each `e_i` in stage _k_, in the order in which expression inference
was performed:

- Let `S_i` be the static type of `m_i_preliminary`.

- Using subtype constraint generation, attempt to match `S_i` as a subtype
of `K_i` with respect to the list of type variables `{X_1, X_2, ...}`.
- Check whether `S_i` is a subtype match for `P_i` with respect to the
list of type variables `{X_1, X_2, ...}`.

- If this succeeds, then accumulate the resulting list of type
constraints into `C`.
Expand All @@ -1391,21 +1433,21 @@ The procedure for argument part inference is as follows:

- If _k_ is not the last stage in the argument partitioning, then let
`{V_1, V_2, ...}` be the constraint solution for the set of type
variables `{X_1, X_2, ...}` with respect to the constaint set `C`, with
partial solution `{U_1, U_2, ...}`. Replace `{U_1, U_2, ...}` with
`{V_1, V_2, ...}`. _This step is known as "horizontal inference", since
it has the effect of passing type information __across__ the syntax tree
from the static type of some arguments to the context of other
arguments._
variables `{X_1, X_2, ...}` with respect to the constraint set `C`, with
partial solution `{U_1, U_2, ...}`. Let the new values of `{U_1, U_2,
...}` be `{V_1, V_2, ...}`. _This step is known as "horizontal
inference", since it has the effect of passing type information
__across__ the syntax tree from the static type of some arguments to the
context of other arguments._

- Otherwise (_k_ __is__ the last stage in the argument partitioning), let
`{V_1, V_2, ...}` be the _grounded_ constraint solution for the set of
type variables `{X_1, X_2, ...}` with respect to the constaint set `C`,
with partial solution `{U_1, U_2, ...}`. Replace `{U_1, U_2, ...}` with
`{V_1, V_2, ...}`. _This step is known as "upwards inference", since it
has the effect of passing type information __up__ the syntax tree from
the static type of arguments to the inferred type arguments of the
invocation._
type variables `{X_1, X_2, ...}` with respect to the constraint set `C`,
with partial solution `{U_1, U_2, ...}`. Let the new values of `{U_1,
U_2, ...}` be `{V_1, V_2, ...}`. _This step is known as "upwards
inference", since it has the effect of passing type information __up__
the syntax tree from the static type of arguments to the inferred type
arguments of the invocation._

- _Note that at this point, all entries in `{U_1, U_2, ...}` have been obtained
either from a _grounded_ constraint solution or from explicit types in the
Expand Down Expand Up @@ -1465,7 +1507,10 @@ _Note that if `F` is `∅`, then the resulting graph has no edges._
_So, for example, if the invocation in question is this:_

```dart
f((t, u) { ... } /* A */, () { ... } /* B */, (v) { ... } /* C */, (u) { ... } /* D */)
f((t, u) { ... }, // A
() { ... }, // B
(v) { ... }, // C
(u) { ... }) // D
```

_And the target of the invocation is declared like this:_
Expand Down Expand Up @@ -1537,6 +1582,16 @@ name identifiers `{n_1, n_2, ...}`, a sequence of zero or more type arguments
`{T_1, T_2, ...}`, and a type schema `K`. As with [argument part
inference](#Argument-part-inference), the symbol `∅` denotes a missing name.

_Method invocation inference is used for the following constructs:_

- _Explicit method invocations (e.g. `[].add(x)`), in which case `id` is the name of the method (`add` in this example)._

- _Call invocations (e.g. `f()`, where `f` is a local variable), in which case
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just local variables though, right? Any expression, top level method etc could be in this form?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct.

`id` is `call`._

- _Invocations of user-definable operators (e.g. `1 + 2`), in which case `id` is
the name of the operator._

The output of method invocation inference is an elaborated expression `m`, with
static type `T`, where `m` and `T` are determined as follows:

Expand All @@ -1550,9 +1605,9 @@ static type `T`, where `m` and `T` are determined as follows:
then:

- Invoke [argument part inference](#Argument-part-inference) on `<T_1, T_2,
...>(n_1: e_1, n_2: e_2, ...)`, using a target function type of `∅` and a
type schema `K`. Denote the resulting elaborated expressions by `{m_1, m_2,
...}`.
...>(n_1: e_1, n_2: e_2, ...)`, using `∅` as the target function type and
`K` as the context. Denote the resulting elaborated expressions by `{m_1,
m_2, ...}`.

- Let `m` be `@DYNAMIC_INVOKE(m_0.id<U_1, U_2, ...>(n_1: m_1, n_2: m_2,
...))`.
Expand Down Expand Up @@ -1583,10 +1638,10 @@ static type `T`, where `m` and `T` are determined as follows:
then:

- Invoke [argument part inference](#Argument-part-inference) on `<T_1, T_2,
...>(n_1: e_1, n_2: e_2, ...)`, using a target function type of `U_0` and a
type schema `K`. Denote the resulting elaborated expressions by `{m_1, m_2,
...}`, the resulting elaborated type arguments by `{U_1, U_2, ...}`, and the
result type by `R`.
...>(n_1: e_1, n_2: e_2, ...)`, using `U_0` as the target function type and
`K` as the context. Denote the resulting elaborated expressions by `{m_1,
m_2, ...}`, the resulting elaborated type arguments by `{U_1, U_2, ...}`,
and the result type by `R`.

- Let `m` be `@FUNCTION_INVOKE(m_0.call<U_1, U_2, ...>(n_1: m_1, n_2: m_2,
...))`, and let `T` be `R`. _Soundness follows from the fact that the static
Expand All @@ -1598,10 +1653,10 @@ static type `T`, where `m` and `T` are determined as follows:
- Let `F` be the return type of `U_0`'s accessible instance getter named `id`.

- Invoke [argument part inference](#Argument-part-inference) on `<T_1, T_2,
...>(n_1: e_1, n_2: e_2, ...)`, using a target function type of `F` and a
type schema `K`. Denote the resulting elaborated expressions by `{m_1, m_2,
...}`, the resulting elaborated type arguments by `{U_1, U_2, ...}`, and the
result type by `R`.
...>(n_1: e_1, n_2: e_2, ...)`, using `F` as the target function type and
`K` as the context. Denote the resulting elaborated expressions by `{m_1,
m_2, ...}`, the resulting elaborated type arguments by `{U_1, U_2, ...}`,
and the result type by `R`.

- Let `m` be `@FUNCTION_INVOKE(m_0.id.call<U_1, U_2, ...>(n_1: m_1, n_2: m_2,
...))`, and let `T` be `R`. _Soundness follows from the fact that the static
Expand All @@ -1614,10 +1669,18 @@ static type `T`, where `m` and `T` are determined as follows:
- Let `F` be the type of `U_0`'s accessible instance method named `id`.

- Invoke [argument part inference](#Argument-part-inference) on `<T_1, T_2,
...>(n_1: e_1, n_2: e_2, ...)`, using a target function type of `F` and a
type schema `K`. Denote the resulting elaborated expressions by `{m_1, m_2,
...}`, the resulting elaborated type arguments by `{U_1, U_2, ...}`, and the
result type by `R`.
...>(n_1: e_1, n_2: e_2, ...)`, using `F` as the target function type and
`K` as the context. Denote the resulting elaborated expressions by `{m_1,
m_2, ...}`, the resulting elaborated type arguments by `{U_1, U_2, ...}`,
and the result type by `R`.

_TODO(paulberry): handle the possibility that `F` is not a function type (it
might even be `dynamic`). Note that the analyzer and CFE currently behave
slightly differently if `F` is an interface type that declares a getter
called `call`: the analyzer rejects this (`call` must be a method), but the
CFE accepts it, recursively desugaring `.call` invocations to arbitrary
depth. See https://github.com/dart-lang/language/issues/3482 for more
context._

- Let `m` be `@INSTANCE_INVOKE(m_0.id<U_1, U_2, ...>(n_1: m_1, n_2: m_2,
...))`, and let `T` be `R`. _Soundness follows from the fact that the static
Expand Down Expand Up @@ -1702,11 +1765,15 @@ _TODO(paulberry): specify this._
### Implicit this invocation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be missing something, but this seems wrong. If I follow the algorithm here, I think it implies that the following is an error:

int Function(int) f = (x) => x;
void test() {
 f(3);
}

because:

  • Neither of the previous two cases apply to f(3)
  • I think f(3) parses so as to match this case
  • The selector chain here does not have access to this.

Am I missing something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re right. As a band-aid, I’ve changed the text “cannot be resolved to the name of a local variable” to “cannot be resolved using local scope resolution rules” to resolve this problem, but this is not a good solution. In particular, it doesn’t handle the possibility that the identifier can be resolved using local scope resolution rules, and it resolves to an instance method or getter in the current class.

I’ve been working ahead a bit while this PR was under review, and I’ve got some ideas for how to address this, so I’m going to defer further work on this issue to a TODO comment.


If the selector chain consists of _&lt;identifier&gt; &lt;argumentPart&gt;_, and
_&lt;identifier&gt;_ __cannot__ be resolved to the name of a local variable,
_&lt;identifier&gt;_ __cannot__ be resolved using local scope resolution rules,
then the result of selector chain type inference in context `K` is the
elaborated expression `m`, with static type `T`, where `m` and `T` are
determined as follows:

_TODO(paulberry): also handle the situation where &lt;identifier&gt; __can__ be
resolved using local scope resolution rules, but it resolves to a member of the
current class, mixin, enum, extension, or extension type._

- Let `m_0` be `this`, with static type `T_0`, where `T_0` is the interface type
of the immediately enclosing class, enum, mixin, or extension type, or the "on"
type of the immediately enclosing extension. _The runtime behavior of `this` is
Expand Down Expand Up @@ -1853,8 +1920,8 @@ _TODO(paulberry): specify this._
### Implicit this method tearoff with type arguments

If the selector chain consists of _&lt;identifier&gt; &lt;typeArguments&gt;_,
and _&lt;identifier&gt;_ __cannot__ be resolved to the name of a local variable,
then the result of selector chain type inference in context `K` is the
and _&lt;identifier&gt;_ __cannot__ be resolved using local scope resolution
rules, then the result of selector chain type inference in context `K` is the
elaborated expression `m`, with static type `T`, where `m` and `T` are
determined as follows:

Expand Down