How should array initializers be constrained? #6160

@geoffromer

Summary of issue:

Consider the following four array initializations:

fn MakeX() -> X;
let a1: Core.Array(X, 3) = (MakeX(), MakeX(), MakeX());

fn MakeXs() -> (var X, var X, var X);
let a2: Core.Array(X, 3) = MakeXs();

impl form(val A) as Core.AsPrimitive(X) where .ResultForm = form(val X);
impl form(val B) as Core.AsPrimitive(X) where .ResultForm = form(ref X);
impl form(val C) as Core.AsPrimitive(X) where .ResultForm = form(var X);
let mixed_tuple: (A, B, C) = ...;
let a3: Core.Array(X, 3) = mixed_tuple as (X, X, X);

let x_tuple: (X, X, X) = ...;
let a4: Core.Array(X, 3) = x_tuple;

Which of them should Core.Array support? If a user-defined array type takes the place of Core.Array, which combinations should it have the option of supporting?

To narrow the question slightly: I believe that if we disallow any one of them, we will effectively have to disallow the ones below it. So the question becomes: where in that code fragment is the first invalid initialization, if any? Specifically, which answers to that question does the core language allow, and which answer does Core.Array actually choose?

Details:

I believe we have consensus that a1 is allowed, and is the preferred way to initialize an array from a list of element values, but as @zygoloid articulated in the 2025-06-26 open discussion, there are at least two different mental models of that initialization, which lead to different answers for the others:

  • In one model, (MakeX(), MakeX(), MakeX()) is a literal representation of a tuple, i.e. a value of type (X, X, X), and so a1 is being initialized from a tuple. From that point of view, a2, a3, and a4 are also initialized from tuple values, so it would be surprising if they were not allowed.
  • In the other model, (MakeX(), MakeX(), MakeX()) doesn't necessarily represent a tuple, but rather an abstract sequence of three X values (much like a C++ braced initializer list). It can be used to initialize tuples, arrays, or other types, but isn't innately tied to any of them. From this point of view, the initializers of a2, a3, and a4 are not this kind of "sequence literal", but rather tuple expressions, so it is arguably surprising if they are allowed.
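The C++ analogy in the second model can be made concrete. The following is only a sketch in C++, not Carbon, and `X`, `MakeX`, `ArrayFromList`, `TupleFromList`, and `kTupleConvertsToArray` are hypothetical names chosen for illustration. It shows a braced list initializing both an array and a tuple, while a tuple value is not implicitly convertible to an array:

```cpp
#include <array>
#include <tuple>
#include <type_traits>

// Hypothetical stand-ins for the issue's X and MakeX().
struct X {};
inline X MakeX() { return X{}; }

// A braced list has no type of its own, so the same spelling can
// initialize an array...
inline std::array<X, 3> ArrayFromList() { return {MakeX(), MakeX(), MakeX()}; }

// ...or a tuple.
inline std::tuple<X, X, X> TupleFromList() { return {MakeX(), MakeX(), MakeX()}; }

// A tuple value, by contrast, does not implicitly convert to an array,
// which is the C++ counterpart of rejecting a2, a3, and a4.
inline constexpr bool kTupleConvertsToArray =
    std::is_convertible_v<std::tuple<X, X, X>, std::array<X, 3>>;
static_assert(!kTupleConvertsToArray);
```

This mirrors only the second mental model; under the first model, the Carbon initializer is itself a tuple, so no such split between "list" and "tuple value" exists.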

If we want to allow a1 while disallowing a2, we would have to make tuple literals have a separate type from tuple values, or else enforce this restriction outside the type system altogether. That seems likely to be quite disruptive, but I haven't explored it in depth because so far there hasn't seemed to be an appetite for it, even though it's arguably most consistent with the second mental model.

Expression forms, as proposed in #5545 and #5389, give us a mechanism for disallowing a4 while allowing a1 and a2: x_tuple has a primitive form, but (MakeX(), MakeX(), MakeX()) and MakeXs() have tuple forms, and Core.Array can choose to define an implicit conversion from the latter but not the former. However, it will be difficult if not impossible to explain this restriction (in documentation or in compiler diagnostics) without discussing forms. That could undermine our efforts to ensure that beginning Carbon programmers don't need to be aware of forms (following the principle of progressive disclosure).

The situation with a3 is more complicated, because the conversion from (A, B, C) to (X, X, X) would most naturally have the form (form(val X), form(ref X), form(var X)), which is a tuple form and would therefore be allowed if a2 is allowed. However, it seems very surprising to allow a3 but disallow a4, because that would mean a4 is rejected for having an initializer type that isn't different enough from the binding type.

To address that problem, #5545 currently proposes that if the source of a tuple-to-tuple conversion has a primitive form, the result is converted to an initializing primitive form. Then, as with any other expression, it may be implicitly form-converted to satisfy the form expectations of the point of use. The resulting chain of form conversions may in some cases be less efficient than one that wasn't constrained to pass through an initializing primitive form. For example, in let t: (P, Q) = ...; let r: R = t as (R, R);, if form(val P) and form(val Q) are both convertible to form(val R), the intermediate conversion to an initializing primitive form causes an unnecessary temporary materialization.

Note that without that fix, a3 is valid if and only if a2 is valid, and with that fix, a3 is valid if and only if a4 is valid. I don't see a practical way to delegate this decision to the array library.

In summary:

  • Under "Expression form basics" (#5545) as currently written, the array library can choose between the following options:
    • Allow all.
    • Allow a1 and a2, disallow a3 and a4 (Core.Array chooses this).
    • Disallow all.
  • We could make the design somewhat simpler and more efficient by dropping the conversion to an initializing primitive form, which would give the library the following options instead:
    • Allow all.
    • Allow a1, a2, and a3, disallow a4.
    • Disallow all.
  • Other combinations have not been explored in depth, but don't look promising to me.
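The two option lists above can be derived mechanically from the constraints stated in this issue. The sketch below is a C++ model, not Carbon, and every name in it (`Choice`, `TupleConversionRule`, `IsConsistent`, `LibraryOptions`) is hypothetical. It assumes three constraints: disallowing one initialization forces disallowing the ones after it; a1 and a2 cannot be separated (since that would need a distinct type for tuple literals); and a3 is tied either to a4 (with the #5545 fix) or to a2 (without it).

```cpp
#include <array>
#include <vector>

// allowed[i] records whether initialization a(i+1) is accepted.
using Choice = std::array<bool, 4>;

// Hypothetical names for the two designs compared in the summary.
enum class TupleConversionRule {
  kViaInitializingPrimitiveForm,  // #5545 as currently written
  kKeepTupleForm,                 // the simpler alternative
};

inline bool IsConsistent(const Choice& allowed, TupleConversionRule rule) {
  // Disallowing one initialization forces disallowing the ones below it.
  for (int i = 0; i + 1 < 4; ++i) {
    if (!allowed[i] && allowed[i + 1]) return false;
  }
  // Assumption: a1 and a2 stand or fall together, because separating them
  // would need a distinct type for tuple literals.
  if (allowed[0] != allowed[1]) return false;
  // With the fix, a3 is valid iff a4 is; without it, a3 is valid iff a2 is.
  if (rule == TupleConversionRule::kViaInitializingPrimitiveForm) {
    return allowed[2] == allowed[3];
  }
  return allowed[2] == allowed[1];
}

// Returns each consistent choice as a bitmask (bit i set means a(i+1) allowed).
inline std::vector<int> LibraryOptions(TupleConversionRule rule) {
  std::vector<int> options;
  for (int mask = 0; mask < 16; ++mask) {
    Choice allowed{};
    for (int i = 0; i < 4; ++i) allowed[i] = ((mask >> i) & 1) != 0;
    if (IsConsistent(allowed, rule)) options.push_back(mask);
  }
  return options;
}
```

Under these assumptions, the first rule yields the masks 0, 3, and 15 (disallow all; allow a1 and a2; allow all), and the second yields 0, 7, and 15 (disallow all; allow a1, a2, and a3; allow all), matching the two lists in the summary.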

Labels: leads question (a question for the leads team)
