| 1 | +--- |
| 2 | +Feature Name: rescript-integers |
| 3 | +Start Date: 2024-11-19 |
| 4 | +RFC PR: (leave this empty) |
| 5 | +ReScript Issue: (leave this empty) |
| 6 | +--- |
| 7 | + |
| 8 | +## Summary |
| 9 | + |
| 10 | +A semantic definition of ReScript's `int` type and its integer primitives. |
| 11 | + |
| 12 | +## Motivation |
| 13 | + |
| 14 | +ReScript has three numeric primitive types, `int`, `float` and `bigint`. |
| 15 | + |
| 16 | +The semantics of `float` and `bigint` match their JavaScript counterparts exactly, but `int` is unique to ReScript and was inherited from OCaml's `int` type. |
| 17 | + |
| 18 | +`int` represents 32-bit signed integers. It is unusual for a language to offer int32 as its only fixed-width integer type; this is mostly for historical reasons, and its behavior is not always obvious because it differs from JavaScript's. |
| 19 | + |
| 20 | +This RFC describes its semantics and chosen trade-offs as precisely as possible. |
| 21 | + |
| 22 | +## Definition |
| 23 | + |
| 24 | +TBD |
| 25 | + |
| 26 | +```res |
| 27 | +type int |
| 28 | + |
| 29 | +let n = 100 |
| 30 | +``` |
| 31 | + |
| 32 | +Integer literals outside the range of `int` may result in a compile-time error with a message such as `"Integer literal exceeds the range of representable integers of type int."` |
| 33 | + |
| 34 | +## Primitives |
| 35 | + |
| 36 | +Let `max_value` be $2^{31}-1$ and `min_value` be $-2^{31}$. |
| 37 | + |
| 38 | +### `fromNumber(x: number)` |
| 39 | + |
| 40 | +1. If `x` is JavaScript's `Infinity`, return `max_value`. |
| 41 | +2. If `x` is JavaScript's `-Infinity`, return `min_value`. |
| 42 | +3. Let `int32` be [`ToInt32`]`(x)`, return `int32`. |
| 43 | + |
| 44 | +Steps 1 and 2 are intended to reduce confusion when converting from an infinite value (e.g. https://github.com/rescript-lang/rescript/issues/6737). However, they can be omitted when `x` is known not to be `Infinity` or `-Infinity`. |
| 45 | + |
| 46 | +The [`ToInt32`] behavior follows the definition in ECMA-262 as is. In practice, the ReScript compiler uses `bitwiseOR(number, 0)`, which appears in the output as `number | 0`. This removes all special values defined in IEEE 754, so an `int` never contains the following values: |
| 47 | + |
| 48 | +- `NaN` |
| 49 | +- `Infinity` and `-Infinity` |
| 50 | +- `-0` |
| 51 | + |
| 52 | +`fromNumber(x)` must be idempotent. |
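
The steps above can be sketched in JavaScript as follows. This is a minimal illustration that assumes the two infinity checks are kept; the constant names `MAX_VALUE` and `MIN_VALUE` are stand-ins for `max_value` and `min_value`, and the compiler's actual output may differ.

```js
// Illustrative names for max_value and min_value (not part of the RFC).
const MAX_VALUE = 2 ** 31 - 1;
const MIN_VALUE = -(2 ** 31);

function fromNumber(x) {
  if (x === Infinity) return MAX_VALUE;  // step 1
  if (x === -Infinity) return MIN_VALUE; // step 2
  return x | 0;                          // step 3: ToInt32 via bitwise OR
}
```

Under this sketch `NaN` and `-0` both map to `0`, fractional inputs are truncated toward zero, and applying the function twice gives the same result as applying it once.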
| 53 | + |
| 54 | +### `add(x: int, y: int)` |
| 55 | + |
| 56 | +1. Let `number` be the mathematical value of $x + y$. |
| 57 | +2. Let `int32` be `fromNumber(number)`, return `int32`. |
| 58 | + |
| 59 | +### `subtract(x, y)` |
| 60 | + |
| 61 | +1. Let `number` be the mathematical value of $x - y$. |
| 62 | +2. Let `int32` be `fromNumber(number)`, return `int32`. |
| 63 | + |
| 64 | +### `multiply(x, y)` |
| 65 | + |
| 66 | +1. Let `number` be the mathematical value of $x * y$. |
| 67 | +2. Let `int32` be `fromNumber(number)`, return `int32`. |
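
The `add`, `subtract` and `multiply` primitives above can be sketched in JavaScript as follows. For `add` and `subtract` the intermediate result of two int32 operands is always exactly representable as a double, so `| 0` is enough. For `multiply` the exact product can exceed $2^{53}$, so this sketch assumes `Math.imul`, which returns the low 32 bits of the exact product. Whether the compiler emits exactly this form is not stated in this RFC.

```js
function add(x, y) {
  return (x + y) | 0; // exact sum, then wrap to int32
}

function subtract(x, y) {
  return (x - y) | 0; // exact difference, then wrap to int32
}

function multiply(x, y) {
  // (x * y) | 0 would round large products before wrapping;
  // Math.imul wraps the exact product instead.
  return Math.imul(x, y);
}
```

For example, `multiply(2147483647, 2147483647)` gives `1` here, whereas `(2147483647 * 2147483647) | 0` gives `0` because the double product is rounded before truncation.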
| 68 | + |
| 69 | +### `exponentiate(x, y)` |
| 70 | + |
| 71 | +1. Let `number` be the mathematical value of $x ^ y$. |
| 72 | +2. Let `int32` be `fromNumber(number)`, return `int32`. |
| 73 | + |
| 74 | +`exponentiate(x, y)` must match the result of applying `multiply` by `x` repeatedly, `y` times, starting from `1`. |
| 75 | + |
| 76 | +```js |
| 77 | +function exponentiate(x, y) { |
| 78 | +  let int32 = 1; |
| 79 | +  for (let i = 0; i < y; i++) { |
| 80 | +    int32 = Math.imul(int32, x); // wrap to int32 at every step so no precision is lost |
| 81 | +  } |
| 82 | +  return int32 | 0; |
| 83 | +} |
| 84 | +``` |
| 85 | + |
| 86 | +### `divide(x, y)` |
| 87 | + |
| 88 | +1. If `y` equals `0`, raise `Divide_by_zero`. |
| 89 | +2. Let `number` be the mathematical value of $x / y$. |
| 90 | +3. Let `int32` be `fromNumber(number)`, return `int32`. |
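
A possible JavaScript sketch of `divide`. The `DivideByZero` class is a hypothetical stand-in for the `Divide_by_zero` exception, whose runtime representation is not specified here. Because `y` is non-zero and both operands are int32, the quotient is always finite, so this sketch skips the infinity checks of `fromNumber`.

```js
// Hypothetical stand-in for the Divide_by_zero exception.
class DivideByZero extends Error {}

function divide(x, y) {
  if (y === 0) throw new DivideByZero("Divide_by_zero"); // step 1
  return (x / y) | 0; // steps 2 and 3: quotient truncated toward zero
}
```

With this sketch `divide(7, 2)` gives `3`, `divide(-7, 2)` gives `-3`, and `divide(-2147483648, -1)` wraps around to `-2147483648`.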
| 91 | + |
| 92 | +### `remainder(x, y)` |
| 93 | + |
| 94 | +1. If `y` equals `0`, raise `Divide_by_zero`. |
| 95 | + |
| 96 | +### `abs(x)` |
| 97 | + |
| 98 | +1. If `x` is `min_value`, raise `Overflow_value`. |
| 99 | + |
| 100 | +## API consideration |
| 101 | + |
| 102 | +## Questions |
| 103 | + |
| 104 | +### Why do we even use `int`? |
| 105 | + |
| 106 | +The use of `int` is primarily for backward compatibility — not with OCaml, but with all existing ReScript codebases. |
| 107 | + |
| 108 | +Additionally, using `int` benefits JavaScript programs, since major JavaScript engines handle integers differently from other numbers. |
| 109 | + |
| 110 | +Depending on the implementation, integer values (especially 32-bit integers) may have a distinct memory representation compared to floating-point numbers. For example, V8 (the JavaScript engine in Chromium) employs an internal element kind called "SMI" (Small integers). This provides an efficient memory representation for signed 32-bit integers and enhances runtime performance by avoiding heap allocation. |
| 111 | + |
| 112 | +At compile time, the compiler ensures that certain operations are restricted to using only `int` types. This increases the likelihood of utilizing the optimized execution paths for SMIs and reduces the potential for runtime de-optimization caused by element kind transitions. |
| 113 | + |
| 114 | +### Why do we truncate values instead of bounds-checking? |
| 115 | + |
| 116 | +This is also for backward compatibility. Bounds checking and failing early might give a faster feedback loop, but we don't want to break programs that (perhaps accidentally) worked before. |
| 117 | + |
| 118 | +In addition, `number | 0` is the most concise output form we can use consistently. Emitting any other runtime check everywhere would significantly bloat the output. |
| 119 | + |
| 120 | +## Future possibilities |
| 121 | + |
| 122 | +Guaranteeing the use of int32 types may offer additional advantages in the future when targeting WebAssembly or alternative native backends. |
| 123 | + |
| 124 | +[`ToInt32`]: https://262.ecma-international.org/#sec-toint32 |