[RFC]: supporting valueOf in zero-dimensional ndarrays for scalar-like behavior #863

@kgryte

Description

This RFC proposes supporting valueOf() for primitive coercion of zero-dimensional ndarrays. Adding primitive type coercion would allow zero-dimensional ndarrays to exhibit scalar-like behavior in common unary and binary operations, such as addition, subtraction, etc.

A common design principle among ndarray APIs is (and will be) to return ndarrays, even when a scalar might be expected (e.g., when computing the sum over a one-dimensional ndarray). This follows principles set forth in the Data APIs Standard, as consistently returning ndarrays is more conducive to whole-graph optimization, retaining dtype information, and ensuring that data can remain on device (e.g., GPU/TPU), thus avoiding unnecessary device synchronization.

Without valueOf() support, zero-dimensional ndarrays exhibit the following behavior:

In [1]: var x = ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' );

In [2]: +x
Out[2]: NaN

In [3]: 3 + x
Out[3]: "3ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )"

In [4]: x + 3
Out[4]: "ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )3"

In [5]: x + x
Out[5]: "ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )"

In [6]: Number( x )
Out[6]: NaN

In [7]: new Date( x )
Out[7]: Invalid Date

In [8]: 3.14 == x
Out[8]: false

In [9]: typeof x
Out[9]: 'object'

In [10]: x + 'foo'
Out[10]: "ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )foo"

In [11]: 'foo' + x
Out[11]: "foondarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )"
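
The results above follow from JavaScript's primitive coercion rules: the inherited valueOf() returns the object itself, so coercion falls back to toString(). The following is a minimal sketch of that mechanism, assuming a plain object as a hypothetical stand-in for a zero-dimensional ndarray (it is not the actual ndarray implementation):

```javascript
// Hypothetical stand-in for a zero-dimensional ndarray WITHOUT a custom valueOf():
var x = {
	'value': 3.14,
	'toString': function toString() {
		return 'ndarray( \'generic\', [ ' + this.value + ' ], [], [ 0 ], 0, \'row-major\' )';
	}
};

// The inherited Object.prototype.valueOf() returns the object itself, which is not a primitive:
console.log( x.valueOf() === x );
// => true

// Coercion therefore falls back to toString(), and `+` performs string concatenation:
console.log( 3 + x );
// => "3ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )"

// Numeric conversion attempts to parse that string, which is not a valid numeric literal:
console.log( +x );
// => NaN
```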

With valueOf() support, this RFC proposes the following behavior:

In [15]: x.valueOf = function() { return x.get(); };

In [16]: +x
Out[16]: 3.14

In [17]: 3 + x
Out[17]: 6.140000000000001

In [18]: x + 3
Out[18]: 6.140000000000001

In [19]: x + x
Out[19]: 6.28

In [20]: Number( x )
Out[20]: 3.14

In [21]: new Date( x )
Out[21]: 1970-01-01T00:00:00.003Z

In [22]: 3.14 == x
Out[22]: true

In [23]: x == 3.14
Out[23]: true

In [24]: x === 3.14
Out[24]: false

In [25]: 3.14 === x
Out[25]: false

In [26]: typeof x
Out[26]: 'object'

In [27]: x + 'foo'
Out[27]: '3.14foo'

In [28]: 'foo' + x
Out[28]: 'foo3.14'

In [29]: String( x )
Out[29]: "ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )"
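
The difference between x + 'foo' and String( x ) in the session above comes down to which coercion "hint" an operation uses. Operators such as + and < consult valueOf() first, while String() consults toString() first. A rough sketch, again assuming a plain object as a hypothetical stand-in for a zero-dimensional ndarray with the proposed valueOf():

```javascript
// Hypothetical stand-in for a zero-dimensional ndarray WITH the proposed valueOf():
var x = {
	'value': 3.14,
	'get': function get() {
		return this.value;
	},
	'valueOf': function valueOf() {
		return this.get();
	},
	'toString': function toString() {
		return 'ndarray( \'generic\', [ ' + this.value + ' ], [], [ 0 ], 0, \'row-major\' )';
	}
};

// Binary `+` and relational operators consult valueOf() first:
console.log( x + 'foo' );
// => '3.14foo'
console.log( x < 4 );
// => true

// The Date constructor also coerces via valueOf(), then truncates to integer milliseconds:
console.log( new Date( x ).getTime() );
// => 3

// String() and template literals consult toString() first:
console.log( String( x ) );
// => "ndarray( 'generic', [ 3.14 ], [], [ 0 ], 0, 'row-major' )"
console.log( `${x}` === String( x ) );
// => true
```

Note that template literals follow the String() path, so `${x}` and x + '' would produce different strings under this proposal.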

Making this change will allow zero-dimensional ndarrays to (mostly) behave like their scalar equivalents. The exceptions are as follows:

  • zero-dimensional ndarrays are mutable, while number primitives are immutable. This could lead to some surprises (i.e., action-at-a-distance) if a zero-dimensional ndarray is shared across contexts, as number primitives are passed by value, while ndarray objects are passed by reference. In general, this could be addressed by exercising good hygiene and always converting a value which may be a zero-dimensional ndarray to a number primitive whenever that value may be shared across contexts.

  • typeof: a zero-dimensional ndarray will still have an object type. Hence, typeof 3.14 !== typeof x. This is potentially problematic for functions which perform explicit type checking of input arguments (e.g., if ( typeof x === 'number' ) {...}). Most (possibly all) "base" special mathematical functions assume numeric input and eschew explicit typeof <number> checks, so passing a zero-dimensional ndarray to low-level math functions should just work. How likely users are to mix high-level ndarray APIs with low-level math APIs remains uncertain. If we wanted to be overly cautious, we could recommend always applying +x or calling x.valueOf() before passing a result which may be a zero-dimensional ndarray to a lower-level math function (or any other function explicitly expecting a numeric value).

  • toString(): a zero-dimensional ndarray will still serialize to an ndarray creation string. This ensures consistency with non-zero-dimensional ndarrays and makes sense from the standpoint that a zero-dimensional ndarray should reconstitute as an ndarray.

  • toJSON(): similar logic/arguments as toString().

  • == and ===: for the most part, equality matches primitive number behavior. The one exception is for zero-dimensional ndarrays representing NaN.

    In [31]: x.set( NaN )
    
    In [32]: +x
    Out[32]: NaN
    
    In [33]: x === x
    Out[33]: true
    
    In [34]: x !== x
    Out[34]: false
    
    In [35]: x != x
    Out[35]: false
    
    In [36]: x == x
    Out[36]: true

    As can be observed above, the standard check for NaN fails, as x === x compares references, not values. The same holds for loose equality. In this case, one needs to perform explicit numeric coercion:

    In [37]: +x === +x
    Out[37]: false
    
    In [38]: +x !== +x
    Out[38]: true

    Accordingly, this is a potential footgun, which can be resolved in one of two ways: (a) we can add logic to, e.g., @stdlib/math/base/assert/is-nan to perform numeric type conversion before comparison as shown in the previous example and always ensure that we use the package to check for NaN or (b) punt the responsibility to userland to perform type coercion.

    Personally, I'm in favor of (b), as I'm not convinced that mixing abstraction levels (e.g., generic high level APIs with low level "base" APIs) is/will be common and all high level APIs which do operate on ndarray objects should have logic for appropriately handling zero-dimensional ndarrays (e.g., unwrapping the 0-d value before invoking a base API).
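
The NaN and mutability caveats above can be sketched as follows. Scalar is a hypothetical stand-in for a zero-dimensional ndarray, and isnanCoerced is a hypothetical illustration of what option (a) might look like; neither is an existing stdlib API:

```javascript
// Hypothetical stand-in for a zero-dimensional ndarray with the proposed valueOf():
function Scalar( value ) {
	this._value = value;
}
Scalar.prototype.get = function get() {
	return this._value;
};
Scalar.prototype.set = function set( v ) {
	this._value = v;
};
Scalar.prototype.valueOf = function valueOf() {
	return this.get();
};

// Self-comparison NaN check without coercion:
function isnan( v ) {
	return ( v !== v );
}

// Hypothetical variant for option (a), which coerces to a number before comparing:
function isnanCoerced( v ) {
	var n = +v;
	return ( n !== n );
}

var x = new Scalar( NaN );

// Comparing an object to itself compares references, so the check misses NaN:
console.log( isnan( x ) );
// => false

// Option (a): the check coerces internally:
console.log( isnanCoerced( x ) );
// => true

// Option (b): the caller coerces before invoking the check:
console.log( isnan( +x ) );
// => true

// Mutability: coercing to a primitive takes a snapshot which later writes cannot affect:
var y = new Scalar( 3.14 );
var snapshot = +y;
y.set( 1.0 );
console.log( snapshot );
// => 3.14
console.log( +y );
// => 1
```

One trade-off of option (a) is that coercion changes results for other non-number inputs as well. For example, isnanCoerced( 'foo' ) returns true, whereas the self-comparison check returns false.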

Conclusion

In short, this proposal should strike a reasonable balance between allowing zero-dimensional ndarrays to be scalar-like in most cases and retaining ndarray behavior when operating on values across various ndarray APIs. There are subtle differences which do require vigilance in ensuring that one is explicit in terms of value type expectations, which is especially important in ambiguous contexts where a value may only be "scalar-like".

Related Issues

None.

Questions

  • Are there other potential footguns that I've missed in the above?
  • Are we okay with the subtle differences between a zero-dimensional ndarray exhibiting scalar-like behavior and its number primitive equivalent?

Other

  • For non-zero-dimensional ndarrays, valueOf() will continue to return this.
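
A minimal sketch of how a dimension-dependent valueOf() might look, assuming a hypothetical MiniArray class with a simplified buffer and shape (this is not the stdlib ndarray constructor, and the actual implementation may differ):

```javascript
// Hypothetical minimal array-like class; NOT the stdlib ndarray constructor:
function MiniArray( buffer, shape ) {
	this._buffer = buffer;
	this._shape = shape;
}
MiniArray.prototype.get = function get() {
	// For brevity, only zero-dimensional access is sketched:
	return this._buffer[ 0 ];
};
MiniArray.prototype.valueOf = function valueOf() {
	// Zero-dimensional: unwrap to the underlying value...
	if ( this._shape.length === 0 ) {
		return this.get();
	}
	// Otherwise: return the instance, matching the default object behavior:
	return this;
};
MiniArray.prototype.toString = function toString() {
	return 'MiniArray( [ ' + this._buffer.join( ', ' ) + ' ], [ ' + this._shape.join( ', ' ) + ' ] )';
};

var s = new MiniArray( [ 3.14 ], [] );
var v = new MiniArray( [ 1, 2, 3 ], [ 3 ] );

// Zero-dimensional instances behave like numbers in arithmetic:
console.log( s + 3 );
// => 6.140000000000001

// Non-zero-dimensional instances return themselves, so coercion falls back to toString():
console.log( v.valueOf() === v );
// => true
console.log( v + 3 );
// => 'MiniArray( [ 1, 2, 3 ], [ 3 ] )3'
```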

Checklist

  • I have read and understood the Code of Conduct.
  • Searched for existing issues and pull requests.
  • The issue name begins with RFC:.
