
Conversation


@juj juj commented Oct 9, 2025

Implement an automatic test-skipping facility in the browser test harness, so that users do not need to manually maintain skip lists of features (or, in the present state, have to maintain fewer of them).

Paves the way toward the test harness working for users out of the box, improving the first-time user experience, rather than having a large bulk of tests fail for users by default.


@sbc100 sbc100 left a comment

Very nice!

I do worry a little bit about any mechanism that automatically skips tests though.

Having to spell out EMTEST_LACKS_GRAPHICS_HARDWARE explicitly in the CI configuration means that we don't need to trust the auto-disable feature.

This is why we have EMTEST_SKIP_NODE_CANARY explicitly rather than trying to automatically figure out at runtime if we should skip them.

I think I've seen cases where we were mistakenly always skipping certain tests due to a bug in the auto-skip magic, and it went unnoticed.

Imagine this scenario where we have an auto-skip mechanism:

1. I set up a CI bot with the idea that it will run all the tests under node.
2. I install node on the bot.
3. I run the tests and see that they all pass.
4. Someone adds some new tests that depend on a more recent version of node.
5. An auto-skip mechanism means I never actually run them and the bot keeps passing.

I would much rather have the bot start failing at (4) so I can explicitly choose to either:

  1. Update node.
  2. Set some environment variable such as SKIP_NODE24 or SKIP_MEMORY64 to opt out explicitly.

Another example is that we don't automatically skip d8 tests when d8 is not found. We expect folks to explicitly opt out with EMTEST_SKIP_V8.

Maybe you are not proposing that we go that far, but JIC I thought I would mention the rationale for the status quo.

# On Nightly and Beta, e.g. 145.0a1, treat it as still meaning version 144,
# since it is a pre-release version
if any(c in milestone for c in ("a", "b")):
  version -= 1
Collaborator


Can we not use firefox --version for this?

Collaborator Author


firefox --version seems to be a Linux thing. It doesn't exist on Windows.
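
For reference, here is the adjustment from the snippet above as a minimal, self-contained sketch; the effective_firefox_major name and the milestone string parsing are illustrative assumptions, not the PR's actual code:

def effective_firefox_major(milestone):
  # Hypothetical helper illustrating the snippet above: derive the effective
  # major version from a Gecko milestone string such as '144.0' or '145.0a1'.
  version = int(milestone.split('.')[0])
  # On Nightly and Beta, e.g. 145.0a1, treat it as still meaning version 144,
  # since it is a pre-release version.
  if any(c in milestone for c in ('a', 'b')):
    version -= 1
  return version

assert effective_firefox_major('145.0a1') == 144
assert effective_firefox_major('144.0') == 144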

@juj

juj commented Oct 9, 2025

There are multiple use cases that motivate this work:

  1. Whenever we have people come in to raise a bug that something is not working for them, the most common reply from the team is to say "do the tests work for you?" to get to a baseline understanding of what could be at fault.

Then if the user sets EMTEST_BROWSER=my_browser and runs test/runner browser, then depending on their browser, they could get some hundreds of failures in the browser suite. Most of these failures will just be random JavaScript syntax errors that give the reader no clue that the failures are due to a missing feature in the browser, leading the user to think that the project is poorly developed/maintained/tested/whatever.

Our EMTEST_SKIP_* and EMTEST_LACKS_* variables are not discoverable by users, and even if they were, there is no hint to help the user figure out which of these random JS syntax errors would be addressed by which env vars, unless they try applying any and all of them for their particular browser, or familiarize themselves with the gory internals of the harness and the details of the failing tests. That takes a lot of effort.

  2. When we get new contributors raising PRs, as is the case now at fix: added missing options to EmscriptenAudioWorkletNodeCreateOptions closes #23982 #25497 (comment) for example, the answer to give in practice (if we want to be sure they get a green run) is "ehh.. it's complicated", whereas it should just be as simple as "Point EMTEST_BROWSER to your browser and run test/runner browser".

  3. When you are an Emscripten developer raising a bug report in a browser vendor's bug database, like for example in https://bugzilla.mozilla.org/show_bug.cgi?id=1992558#c2 , it is risky to state "to repro, run test/runner browser", since that will likely produce that barrage of tests failing with JS syntax errors. This makes it more difficult to report complex harness-related issues to upstream browser vendors. (I have two harness-related bugs I would like to report to upstream Safari, but it is hard to make a quality repro case here that wouldn't give them red herrings.)

  4. When you are a downstream user of Emscripten who is shipping code on a particular set of hardware * OS * browser * browser_min_version combinations, and you need to ensure that all of these combinations will actually work like you advertise to your customers, you'll need to set up a CI to verify that they do. But that leads you to have to discover all the intricate details of every single test, to manually manage the skips for every combination. That takes a lot of work.

And when you do all that work, there is currently no model for contributing that information back to the project, so that others running the harness could benefit from it out of the box. Other people running the suite will need to independently do that same work. Until now, that is, with this PR.

1. I set up a CI bot with the idea that it will run all the tests under node.
2. I install node on the bot.
3. I run the tests and see that they all pass.
4. Someone adds some new tests that depend on a more recent version of node.
5. An auto-skip mechanism means I never actually run them and the bot keeps passing.

Yeah, I agree this kind of accident could happen. Here I would stress that all existing functionality will still be green, i.e. the Emscripten project will not have regressed (at least as far as test coverage is concerned): if there is a new feature that requires a newer Node, then all the tests running against the older Node will still be OK, verifying that the project runs successfully on that older Node. So nothing has regressed(?) - maybe just the newly added test coverage is not being incorporated correctly.

It is definitely a burden on the project that it also needs to keep up with new Node versions. But the risk of missing a new feature long-term due to the "auto-magic" seems low; it is not as if the Emscripten project could be oblivious to the latest feature developments in its domain (like Wasm exnref exceptions), since the Emscripten devs are also contributing to the development of those feature specifications.

Maybe we can add a disable feature to these auto-detection flags. For example:

  • EMTEST_LACKS_OFFSCREEN_CANVAS=1 : force-skip all OffscreenCanvas tests
  • EMTEST_LACKS_OFFSCREEN_CANVAS not present: auto-detect the browser
  • EMTEST_LACKS_OFFSCREEN_CANVAS=0: force-don't-skip OffscreenCanvas tests

That way the Emscripten CI can opt out of the auto-disables and still be able to run all the tests?
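
A minimal sketch of how such a tri-state flag could be resolved in the harness; resolve_lacks_flag and the auto-detection callback are illustrative names, not the PR's actual implementation:

import os

def resolve_lacks_flag(env_var, browser_lacks_feature):
  # env_var: e.g. 'EMTEST_LACKS_OFFSCREEN_CANVAS'
  # browser_lacks_feature: hypothetical callable that inspects the configured
  # EMTEST_BROWSER and returns True if it lacks the feature.
  value = os.getenv(env_var)
  if value is None:
    return browser_lacks_feature()  # not present: auto-detect the browser
  return value != '0'               # '0' force-runs, anything else force-skips

A test would then feed the resolved value into the existing unittest.skipIf()-style decoration, so setting EMTEST_LACKS_*=0 in CI forces every test to run.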

@juj

juj commented Oct 9, 2025

I updated the PR so that setting EMTEST_LACKS_*=0 force-runs tests. This way we could have the CircleCI config run all the tests, so there wouldn't be any accidents with tests getting skipped?

@sbc100

sbc100 commented Oct 9, 2025

How about if the version checking was all behind EMTEST_AUTOSKIP=1, which you could enable in your CI?

Specifically, I think I would want to turn off EMTEST_AUTOSKIP in the main CI and instead be explicit.

Funnily enough, we just had an example of where skipping tests caused us to lose coverage. In this case we were explicitly skipping using EMTEST_SKIP_WASM64 and EMTEST_SKIP_NODE_CANARY, but all of the wasm64 tests in test_other.py were not being run until #25531 landed. I suppose this is unrelated really, because we were explicitly skipping, but it's the kind of problem I'd like to minimize if possible.

@juj

juj commented Oct 10, 2025

How about if the version checking was all behind EMTEST_AUTOSKIP=1, which you could enable in your CI?

Having a single EMTEST_AUTOSKIP= variable instead of multiple EMTEST_LACKS_*=0 variables sounds OK. However, could we flip the default direction of the flag, so that if it is not present, then the autoskips are enabled? This sounds like a more friendly default for all contributors and new users.

Only for this specific use case of CircleCI would we need to disable the autoskip, so maybe we can set the CircleCI config.yml files to have an explicit EMTEST_AUTOSKIP=0 flag? Would that work well?

@sbc100

sbc100 commented Oct 10, 2025

Only for this specific use case of CircleCI would we need to disable the autoskip, so maybe we can set the CircleCI config.yml files to have an explicit EMTEST_AUTOSKIP=0 flag? Would that work well?

Is it actually more friendly though?

Imagine I am a new contributor and I run all the tests, and they all appear to pass. Then, later, in review, a bunch of tests fail because, for example, the v8 tests were automatically skipped locally since I didn't install v8.

The test framework is basically saying "I noticed you don't have v8 installed, so I conveniently skipped all the v8 tests for you".

I would much rather the test framework say something like: "I noticed you don't have v8 installed so all tests will fail, please either install v8 or set EMTEST_AUTOSKIP=1 to have these tests automatically skipped".

Having to explicitly set EMTEST_AUTOSKIP is a pretty low bar.

I suppose I could see that autoskipping based on particular browser versions might be different...?

@juj

juj commented Oct 16, 2025

Is it actually more friendly though?

I think it is more friendly.

It is better for first-time users to get

[0%] test_pthread_growth_growable_arraybuffers (test_browser.browser.test_pthread_growth_growable_arraybuffers)
... skipped 'This test requires a browser that supports growable ArrayBuffers'
[50%] test_pthread_growth_mainthread_growable_arraybuffers (test_browser.browser.test_pthread_growth_mainthread_growable_arraybuffers)
... skipped 'This test requires a browser that supports growable ArrayBuffers'

out of the box, instead of

[0%] test_pthread_growth_mainthread_growable_arraybuffers (test_browser.browser.test_pthread_growth_mainthread_growable_arraybuffers)
... FAIL
[50%] test_pthread_growth_growable_arraybuffers (test_browser.browser.test_pthread_growth_growable_arraybuffers)
... FAIL

======================================================================
FAIL: test_pthread_growth_growable_arraybuffers (test_browser.browser.test_pthread_growth_growable_arraybuffers)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\emsdk\python\3.13.3_64bit\Lib\unittest\case.py", line 58, in testPartExecutor
    yield
  File "C:\emsdk\python\3.13.3_64bit\Lib\unittest\case.py", line 651, in run
    self._callTestMethod(testMethod)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "C:\emsdk\python\3.13.3_64bit\Lib\unittest\case.py", line 606, in _callTestMethod
    if method() is not None:
       ~~~~~~^^
  File "C:\emsdk\emscripten\main\test\common.py", line 1000, in resulting_test
    return func(self, *args)
  File "C:\emsdk\emscripten\main\test\common.py", line 265, in decorated
    return func(self, *args, **kwargs)
  File "C:\emsdk\emscripten\main\test\common.py", line 265, in decorated
    return func(self, *args, **kwargs)
  File "C:\emsdk\emscripten\main\test\test_browser.py", line 4797, in test_pthread_growth
    self.btest_exit('pthread/test_pthread_memory_growth.c', cflags=['-pthread', '-sALLOW_MEMORY_GROWTH', '-sINITIAL_MEMORY=32MB', '-sMAXIMUM_MEMORY=256MB'] + cflags)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\emsdk\emscripten\main\test\common.py", line 2830, in btest_exit
    return self.btest(filename, *args, **kwargs)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\emsdk\emscripten\main\test\common.py", line 2859, in btest
    self.run_browser(outfile, expected=['/report_result?' + e for e in expected], timeout=timeout)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\emsdk\emscripten\main\test\common.py", line 2783, in run_browser
    raise e
  File "C:\emsdk\emscripten\main\test\common.py", line 2774, in run_browser
    self.assertContained(expected, output)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "C:\emsdk\emscripten\main\test\common.py", line 1774, in assertContained
    self.fail("Expected to find '%s' in '%s', diff:\n\n%s\n%s" % (
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      limit_size(values[0]), limit_size(string), limit_size(diff),
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      additional_info,
      ^^^^^^^^^^^^^^^^
    ))
    ^^
  File "C:\emsdk\python\3.13.3_64bit\Lib\unittest\case.py", line 732, in fail
    raise self.failureException(msg)
AssertionError: Expected to find '/report_result?exit:0
' in '/report_result?exception:wasmMemory.toResizableBuffer is not a function / updateMemoryViews@http://localhost:8889/test.js:845:22
initMemory@http://localhost:8889/test.js:888:3
@http://localhost:8889/test.js:2857:3
', diff:

--- expected
+++ actual
@@ -1 +1,4 @@
-/report_result?exit:0
+/report_result?exception:wasmMemory.toResizableBuffer is not a function / updateMemoryViews@http://localhost:8889/test.js:845:22
+initMemory@http://localhost:8889/test.js:888:3
+@http://localhost:8889/test.js:2857:3
+



======================================================================
FAIL: test_pthread_growth_mainthread_growable_arraybuffers (test_browser.browser.test_pthread_growth_mainthread_growable_arraybuffers)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\emsdk\python\3.13.3_64bit\Lib\unittest\case.py", line 58, in testPartExecutor
    yield
  File "C:\emsdk\python\3.13.3_64bit\Lib\unittest\case.py", line 651, in run
    self._callTestMethod(testMethod)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "C:\emsdk\python\3.13.3_64bit\Lib\unittest\case.py", line 606, in _callTestMethod
    if method() is not None:
       ~~~~~~^^
  File "C:\emsdk\emscripten\main\test\common.py", line 1000, in resulting_test
    return func(self, *args)
  File "C:\emsdk\emscripten\main\test\common.py", line 265, in decorated
    return func(self, *args, **kwargs)
  File "C:\emsdk\emscripten\main\test\common.py", line 265, in decorated
    return func(self, *args, **kwargs)
  File "C:\emsdk\emscripten\main\test\test_browser.py", line 4780, in test_pthread_growth_mainthread
    self.btest_exit('pthread/test_pthread_memory_growth_mainthread.c', cflags=['-pthread', '-sALLOW_MEMORY_GROWTH', '-sINITIAL_MEMORY=32MB', '-sMAXIMUM_MEMORY=256MB'] + cflags)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\emsdk\emscripten\main\test\common.py", line 2830, in btest_exit
    return self.btest(filename, *args, **kwargs)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\emsdk\emscripten\main\test\common.py", line 2859, in btest
    self.run_browser(outfile, expected=['/report_result?' + e for e in expected], timeout=timeout)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\emsdk\emscripten\main\test\common.py", line 2783, in run_browser
    raise e
  File "C:\emsdk\emscripten\main\test\common.py", line 2774, in run_browser
    self.assertContained(expected, output)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "C:\emsdk\emscripten\main\test\common.py", line 1774, in assertContained
    self.fail("Expected to find '%s' in '%s', diff:\n\n%s\n%s" % (
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      limit_size(values[0]), limit_size(string), limit_size(diff),
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      additional_info,
      ^^^^^^^^^^^^^^^^
    ))
    ^^
  File "C:\emsdk\python\3.13.3_64bit\Lib\unittest\case.py", line 732, in fail
    raise self.failureException(msg)
AssertionError: Expected to find '/report_result?exit:0
' in '/report_result?exception:wasmMemory.toResizableBuffer is not a function / updateMemoryViews@http://localhost:8888/test.js:845:22
initMemory@http://localhost:8888/test.js:888:3
@http://localhost:8888/test.js:2857:3
', diff:

--- expected
+++ actual
@@ -1 +1,4 @@
-/report_result?exit:0
+/report_result?exception:wasmMemory.toResizableBuffer is not a function / updateMemoryViews@http://localhost:8888/test.js:845:22
+initMemory@http://localhost:8888/test.js:888:3
+@http://localhost:8888/test.js:2857:3
+

that has no hint towards the browser version being the problem (unless we start regex-matching individual error messages or backwards-compat checking all features before using them).

The above example was even simplified to a nicely printed version, only running the growable ArrayBuffer tests. In the real world it is a worse result: anyone running test/runner core0 or test/runner browser will get a barrage of failures like that, with different syntax errors in different failures.

Imagine I am a new contributor and I run all the tests, and they all appear to pass. Then, later, in review, a bunch of tests fail because, for example, the v8 tests were automatically skipped locally since I didn't install v8.

This does not seem like a problem at all? When the tests fail on CI, they will then get to read which tests failed, and they will likely try to run those tests locally, and the local test run then gives them helpful info:

test_pthread_growth_growable_arraybuffers (test_browser.browser.test_pthread_growth_growable_arraybuffers)
... skipped 'This test requires a browser that supports growable ArrayBuffers'

and then they can have that moment "oh, I need to test my feature against browser/node Y instead. Let me install that." There is hand-holding information all the way to take the developer forward.

Note also that the set of developers who just report bugs or only use Emscripten is larger than the set of people who actually propose PRs to Emscripten. So it would be beneficial for test/runner core0 and test/runner browser to work out of the box for the largest group of developers, to the extent possible, without needing to go through extra setup.

@sbc100

sbc100 commented Oct 16, 2025

OK, I see that a nice error would be better by default.

How about we do the same thing we do for node canary and error with something like:

"ERROR: Test requires resizable ArrayBuffer but your browser does not support this. Run with EMTEST_AUTOSKIP=1 to skip this test automatically."

That matches the existing behaviour for things like node canary, has a nice message, but is also not skipping things by default, so hopefully it can make everyone happy.

If folks complain about having to set EMTEST_AUTOSKIP all the time we can consider changing the default then.
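
A rough sketch of that behaviour, assuming a hypothetical require_browser_feature helper inside the test harness (names and message wording are illustrative):

import os

def require_browser_feature(self, feature, supported):
  # Fail loudly by default, matching the node canary convention described
  # above; only skip when the user explicitly opts in via EMTEST_AUTOSKIP=1.
  if supported:
    return
  msg = ('Test requires %s but your browser does not support this. '
         'Run with EMTEST_AUTOSKIP=1 to skip this test automatically.' % feature)
  if os.getenv('EMTEST_AUTOSKIP') == '1':
    self.skipTest(msg)
  self.fail(msg)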

@sbc100

sbc100 commented Oct 16, 2025

What I want to avoid here is folks getting the false impression that "the entire browser test suite passes" because they don't see any failures, when in fact a large chunk of tests was auto-skipped.

@juj

juj commented Oct 16, 2025

How about we [...] error with

Sure. Though here I realize that the current unittest.skipIf() architecture does not seem to allow erroring a test, so I'll fall back to your original plan for now.

@sbc100

sbc100 commented Oct 16, 2025

How about we [...] error with

Sure. Though here I realize that the current unittest.skipIf() architecture does not seem to allow erroring a test, so I'll fall back to your original plan for now.

We can probably make our own helper/wrapper like skipExecIf.
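
For example, a sketch of what such a wrapper could look like; skipExecIf here is only the suggested name from the comment above, not an existing helper in test/common.py:

import functools
import os

def skipExecIf(condition, reason):
  # Unlike unittest.skipIf, `condition` is a callable evaluated when the test
  # actually runs, so the decision can depend on the configured browser, and
  # the test can fail with an actionable message instead of silently skipping.
  def decorator(func):
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
      if condition():
        if os.getenv('EMTEST_AUTOSKIP') == '1':
          self.skipTest(reason)
        self.fail(reason + ' (set EMTEST_AUTOSKIP=1 to skip this test automatically)')
      return func(self, *args, **kwargs)
    return wrapper
  return decorator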

juj added 6 commits October 18, 2025 00:08
…t harness, so that users do not need to manually maintain skip lists of features. Paves the way toward test harness working for users out of the box.
…MTEST_LACKS_x=0 can be used to force-don't-skip tests.
@juj juj force-pushed the test_harness_feature_skip_checks branch from 5a1ab53 to 1c5aa15 on October 17, 2025 21:08