draft RFC for int semantics

cometkim · cometkim · commit 4bd7186820ac · 2024-11-19T01:35:34.000+09:00
diff --git a/text/0000-int.md b/text/0000-int.md
@@ -0,0 +1,124 @@
+---
+Feature Name: rescript-integers
+Start Date: 2024-11-19
+RFC PR: (leave this empty)
+ReScript Issue: (leave this empty)
+---
+
+## Summary
+
+Semantic definition of ReScript's `int` type and its integer primitives.
+
+## Motivation
+
+ReScript has three numeric primitive types, `int`, `float` and `bigint`.
+
+The semantics of `float` and `bigint` match JavaScript's exactly, but `int` is unique to ReScript and originally came from OCaml's `int` type.
+
+`int` stands for 32-bit signed integers. It is unusual for a language to offer only int32 and no other integer precision. This is mostly for historical reasons, and the type's behavior is not always obvious because it differs from JavaScript's number behavior.
+
+This RFC describes its semantics and chosen trade-offs as precisely as possible.
+
+## Definition
+
+TBD
+
+```res
+type int
+
+let n = 100
+```
+
+Integer literals outside the range of `int` may result in compile-time errors with messages such as `"Integer literal exceeds the range of representable integers of type int."`
+
+## Primitives
+
+Let `max_value` be $2^{31}-1$ and `min_value` be $-2^{31}$.
+
+### `fromNumber(x: number)`
+
+1. If `x` is JavaScript's `Infinity`, return `max_value`.
+2. If `x` is JavaScript's `-Infinity`, return `min_value`.
+3. Let `int32` be [`ToInt32`]`(x)`, return `int32`.
+
+Steps 1 and 2 are intended to reduce confusion when converting from an infinite value (e.g. https://github.com/rescript-lang/rescript/issues/6737). However, they can be omitted if it is obvious that `x` is not `Infinity` or `-Infinity`.
+
+The [`ToInt32`] behavior follows the definition in ECMA-262 as is. In practice, the ReScript compiler uses `bitwiseOR(number, 0)`, which appears in the output as `number | 0`. This removes all special values defined in IEEE-754, so an `int` never contains the following values:
+
+- `NaN`
+- `Infinity` and `-Infinity`
+- `-0`
+
+`fromNumber(x)` must be idempotent.
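+
+A minimal JavaScript sketch of the steps above. The constant names `MAX_VALUE` and `MIN_VALUE` are illustrative only, and actual compiler output may inline `x | 0` without the infinity checks:
+
+```js
+const MAX_VALUE = 2147483647; // 2^31 - 1
+const MIN_VALUE = -2147483648; // -2^31
+
+function fromNumber(x) {
+  // Steps 1 and 2: clamp infinities instead of letting ToInt32 map them to 0.
+  if (x === Infinity) return MAX_VALUE;
+  if (x === -Infinity) return MIN_VALUE;
+  // Step 3: ToInt32, written the way it appears in the compiler output.
+  return x | 0;
+}
+```
+
+With this sketch, `fromNumber(NaN)` and `fromNumber(-0)` both give `0`, and applying `fromNumber` to its own result returns the same value.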
+
+### `add(x: int, y: int)`
+
+1. Let `number` be the mathematical result of $x + y$.
+2. Let `int32` be `fromNumber(number)`, return `int32`.
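+
+As a sketch, assuming the `number | 0` output form and omitting the infinity checks (the sum of two `int` values is always finite), overflow wraps around:
+
+```js
+function add(x, y) {
+  // The sum of two int32 values is exactly representable as a double,
+  // so truncating with `| 0` yields the wrapped int32 result.
+  return (x + y) | 0;
+}
+
+console.log(add(2147483647, 1)); // -2147483648
+```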
+
+### `subtract(x, y)`
+
+1. Let `number` be the mathematical result of $x - y$.
+2. Let `int32` be `fromNumber(number)`, return `int32`.
+
+### `multiply(x, y)`
+
+1. Let `number` be the mathematical result of $x * y$.
+2. Let `int32` be `fromNumber(number)`, return `int32`.
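+
+One caveat when sketching this in JavaScript: the product of two `int` values can exceed $2^{53}$, where a double no longer holds it exactly, so `(x * y) | 0` does not always equal the mathematical definition above. The sketch below assumes `Math.imul` as the implementation and compares it with an exact `BigInt` computation. The helper name `multiplyExact` is hypothetical and exists only for the comparison:
+
+```js
+function multiply(x, y) {
+  // Math.imul performs an exact 32-bit integer multiplication.
+  return Math.imul(x, y);
+}
+
+// Reference: exact product, then wrapped to a signed 32-bit integer.
+function multiplyExact(x, y) {
+  return Number(BigInt.asIntN(32, BigInt(x) * BigInt(y)));
+}
+
+const max = 2147483647;
+console.log(multiply(max, max) === multiplyExact(max, max)); // true
+```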
+
+### `exponentiate(x, y)`
+
+1. Let `number` be the mathematical result of $x ^ y$.
+2. Let `int32` be `fromNumber(number)`, return `int32`.
+
+`exponentiate(x, y)` must match the result of `multiply` accumulated `y` times.
+
+```js
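+function exponentiate(x, y) {
+  // Accumulate `multiply` y times, truncating to int32 at every step.
+  let int32 = 1;
+  for (let i = 0; i < y; i++) {
+    // Math.imul multiplies exactly in 32 bits; a plain `*=` can lose
+    // precision once the intermediate value exceeds 2^53.
+    int32 = Math.imul(int32, x);
+  }
+  return int32;
+}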
+```
+
+### `divide(x, y)`
+
+1. If `y` equals `0`, raise `Divide_by_zero`.
+2. Let `number` be the mathematical result of $x / y$.
+3. Let `int32` be `fromNumber(number)`, return `int32`.
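+
+A sketch of these steps in JavaScript. `Divide_by_zero` is represented here by a plain `Error` as a stand-in, since this section does not specify how the exception is represented at runtime:
+
+```js
+function divide(x, y) {
+  if (y === 0) {
+    // Stand-in for raising `Divide_by_zero`.
+    throw new Error("Divide_by_zero");
+  }
+  // ToInt32 discards the fractional part, so the quotient truncates toward zero.
+  return (x / y) | 0;
+}
+
+console.log(divide(7, 2)); // 3
+console.log(divide(-7, 2)); // -3
+console.log(divide(-2147483648, -1)); // -2147483648
+```
+
+The last call shows the one overflowing case: the mathematical result $2^{31}$ is out of range and wraps to `min_value`.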
+
+### `remainder(x, y)`
+
+1. If `y` equals `0`, raise `Divide_by_zero`.
+
+### `abs(x)`
+
+1. If `x` is `min_value`, raise `Overflow_value`.
+
+## API consideration
+
+## Questions
+
+### Why do we even use `int`?
+
+The use of `int` is primarily for backward compatibility — not with OCaml, but with all existing ReScript codebases.
+
+Additionally, using `int` is beneficial for JavaScript programs since major JavaScript engines treat integers differently from other numbers.
+
+Depending on the implementation, integer values (especially 32-bit integers) may have a distinct memory representation compared to floating-point numbers. For example, V8 (the JavaScript engine in Chromium) employs an internal element kind called "SMI" (Small integers). This provides an efficient memory representation for signed 32-bit integers and enhances runtime performance by avoiding heap allocation.
+
+At compile time, the compiler ensures that certain operations are restricted to using only `int` types. This increases the likelihood of utilizing the optimized execution paths for SMIs and reduces the potential for runtime de-optimization caused by element kind transitions.
+
+### Why do we truncate values instead of bounds-checking?
+
+It is also for backward compatibility. Bounds-checking and failing early may be more useful for a fast feedback loop, but we don't want to break any programs that (accidentally) worked before.
+
+The `number | 0` form is also the most concise output we can use consistently. Introducing any other runtime code everywhere would lead to significant code bloat in the output.
+
+## Future possibilities
+
+Guaranteeing the use of int32 types may offer additional advantages in the future when targeting WebAssembly or alternative native backends.
+
+[`ToInt32`]: https://262.ecma-international.org/#sec-toint32