val() becomes a different 64-bit integer than the input size_t #25859

@LTLA

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 4.0.20 (6913738ec5371a88c4af5a80db0ab42bad3de681)
clang version 22.0.0git (https:/github.com/llvm/llvm-project d163988dd2833f28fbca8c144265108d25ae7bd2)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /home/luna/Software/emsdk/emsdk/upstream/bin

Failing command line in full:

I observed some interesting behavior where emscripten::val() is converted into a BigInt that differs in value from the std::size_t that was used to construct it. I've boiled it down to the following minimal example:

#include <emscripten/bind.h>
#include <cstddef>
#include <iostream>
#include <string>

emscripten::val get_max_size(emscripten::val x, int n) {
    std::size_t maxed = 0;
    for (int i = 0; i < n; ++i) {
        auto current = x[i].template as<std::string>();
        std::cout << "'" << current << "'" << std::endl;
        if (current.size() > maxed) {
            maxed = current.size();
        }
    }
    return emscripten::val(maxed);
}

EMSCRIPTEN_BINDINGS(get_max_size) {
   emscripten::function("get_max_size", &get_max_size);
}

Compiled with CMake 3.27.1:

em++ --std=c++17 --bind -O3 -sMEMORY64 -sEXPORT_ES6 -sMODULARIZE=1 -sEXPORT_NAME=load test.cpp -o bar.js

And run with Node v24.11.1:

import load from "./bar.js";
const module = await load();
console.log(module.get_max_size(["abcd"], 1));
console.log(module.get_max_size(["abcde"], 1));

This yields the following output, where the reported maximum size for the second call is not the expected 5n:

node test.js 
## 'abcd'
## 4n
## 'abcde'
## 433791696901n
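
As a quick check on the unexpected value, the sketch below splits it into 32-bit halves using plain BigInt arithmetic in Node. The variable names are illustrative only. The low half is the expected 5, and the high half is 101, which happens to be the character code of 'e', the last character of the input string. This is only an observation about the number and may be a coincidence; it is not a confirmed diagnosis.

```javascript
// Value printed by the second call in the output above.
const reported = 433791696901n;

// Split the 64-bit value into its low and high 32-bit halves.
const low = BigInt.asUintN(32, reported); // lower 32 bits
const high = reported >> 32n;             // upper 32 bits

console.log(reported.toString(16));                 // "6500000005"
console.log(low);                                   // 5n
console.log(high);                                  // 101n
console.log(String.fromCharCode(Number(high)));     // "e"
```

If the upper half really is leftover data, that would be consistent with the bug vanishing when the value is printed or cast to int64_t first, but confirming this would require inspecting the generated wasm.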

Whatever's going on, this is a very specific scenario, as the bug disappears if:

  • -O3 is not used.
  • -sMEMORY64 is not used (in which case the returned values are regular Numbers instead).
  • A std::size_t is returned directly by get_max_size() instead of wrapping it in an emscripten::val().
  • No looping across the array is performed within get_max_size().
  • No strings are involved within get_max_size().
  • The value of maxed is printed before constructing the val.
  • The value of maxed is cast to an int64_t before constructing the val.
  • UBSAN is enabled.

Run on Ubuntu 22.04.5 LTS on an Intel i7.
