83 changes: 83 additions & 0 deletions rfcs/test262_integration.md
@@ -0,0 +1,83 @@
# RFC 229: Integrating Test262

## Summary

This RFC integrates the Test262 (ECMAScript) test suite into WPT by vendoring it directly under `third_party/test262`. Execution will be on by default.

## Motivation

* **Browser-based Testing:** Run Test262 in real browsers (stable and experimental), not just JS shells. This provides a more complete conformance signal for how ECMAScript behaves in the full web platform environment.
* **Unified Infrastructure:** Use the existing WPT infrastructure (`wpt`, CI) and `wpt.fyi` for test execution and results analysis.
* **Operational Efficiency:** Remove the need for vendors to maintain separate infrastructure for running Test262 and correlating its results with WPT.
* **Interop:** Allow ECMAScript features to be proposed and selected for inclusion in cross-browser Interop efforts.

## Detailed Design

### 1. Source Management

The Test262 test suite will be vendored directly into the WPT repository under `third_party/test262/`. This establishes a clear convention for third-party test suites. Other suites, such as WebAssembly (`wasm/core/`), could be moved under `third_party/` in the future for consistency.

* A `vendored.toml` file will be located at `third_party/test262/vendored.toml`. This file will contain the source repository URL and the specific commit SHA of the vendored revision, acting as a source of truth. The format will be:
    ```toml
    [test262]
    source = "https://github.com/tc39/test262"
    rev = "<sha1>"
    ```
* A dedicated CI job will keep this vendored copy up-to-date. By having a separate configuration file for each vendored suite, automated update PRs for different suites will not conflict with each other. The CI job will periodically:
    1. Fetch the latest commit SHA from the official `tc39/test262` repository.
    2. Copy the `test/` and `harness/` directories into `third_party/test262`.
    3. Update the `rev` in `vendored.toml`.
    4. Open a PR with all the above changes.
* The update PR will include a smoketest to ensure Test262 integration is not broken. This small, dedicated test suite in `infrastructure/test262/` will verify key integration points (e.g., YAML frontmatter parsing for `negative` tests and `flags`, `assert.js` harness injection). Expected results will be recorded in corresponding `.ini` files within `infrastructure/metadata/test262/`, following the existing convention for infrastructure tests (for example, the expectations for `infrastructure/testharness/lone-surrogates.html` live in `infrastructure/metadata/infrastructure/testharness/lone-surrogates.html.ini`). This smoketest is the *only* blocking Test262-related check in the PR; full test runs happen in the scheduled nightly runs, which populate wpt.fyi.
* This approach ensures WPT consumers have no novel requirements beyond selecting the test type.
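
Step 3 of the update job (rewriting `rev` in `vendored.toml`) could look roughly like the following. This is a minimal sketch, not the prototype's code: the function name `update_rev` is hypothetical, and it assumes the file keeps the single-line `rev = "..."` form shown above and that revisions are 40-character lowercase hex SHA-1s.

```python
import re

# Assumption: revisions are full 40-character lowercase hex SHA-1 hashes.
SHA1_RE = re.compile(r"[0-9a-f]{40}")
# Matches a single `rev = "..."` line without consuming the trailing newline.
REV_LINE_RE = re.compile(r'^rev[ \t]*=[ \t]*"[^"]*"[ \t]*$', re.MULTILINE)


def update_rev(toml_text, new_rev):
    """Return toml_text with its single `rev` entry set to new_rev.

    Hypothetical helper: raises ValueError if new_rev is not a SHA-1
    or if the text does not contain exactly one `rev` line.
    """
    if not SHA1_RE.fullmatch(new_rev):
        raise ValueError("expected a 40-character lowercase hex SHA-1")
    updated, count = REV_LINE_RE.subn('rev = "%s"' % new_rev, toml_text)
    if count != 1:
        raise ValueError("expected exactly one rev line, found %d" % count)
    return updated
```

Editing only the `rev` line keeps the automated PR's diff to `vendored.toml` down to one line; a real job might prefer a proper TOML library instead of a regular expression.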

### 2. Test Integration

WPT's manifest generation will be extended to support Test262.

* **Manifest Generation:** `wpt manifest` will, by default, recognize `third_party/test262`, traverse its `test` directory, and add discovered tests to `MANIFEST.json`.
* **Metadata Parsing:** A new Test262-specific parser will read YAML frontmatter from `.js` files to extract metadata (e.g., ES features, negative expectations, execution flags). Unlike `wasm/core` tests, which are pre-processed, Test262 metadata will be used by `wptserve` handlers at runtime.
* **Harness and Server:** Specialized Test262 harness files and `wptserve` handlers will use parsed metadata to construct the execution environment. Test modes will use distinct URL conventions based on file extensions (e.g., `path/to/test.js` served as `path/to/test.test262.html` or `path/to/test.test262-module.html`).
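
To make the metadata-to-URL step concrete, here is a minimal sketch of reading `flags` from a test's frontmatter and picking the wrapper URL. Test262 frontmatter sits between `/*---` and `---*/`; everything else here is an assumption for illustration: the helper names are hypothetical, only the inline `flags: [a, b]` form is handled (a real parser would use a YAML library), and mapping the `module` flag to the `.test262-module.html` suffix is a guess at how the two URL forms above are chosen.

```python
import re

# Test262 frontmatter is a YAML block delimited by /*--- and ---*/.
FRONTMATTER_RE = re.compile(r"/\*---(.*?)---\*/", re.DOTALL)
# Simplification: only the inline list form `flags: [a, b]` is recognized.
FLAGS_RE = re.compile(r"^flags:[ \t]*\[(.*?)\][ \t]*$", re.MULTILINE)


def parse_flags(source):
    """Return the list of flags declared in a test's frontmatter."""
    frontmatter = FRONTMATTER_RE.search(source)
    if frontmatter is None:
        return []
    flags = FLAGS_RE.search(frontmatter.group(1))
    if flags is None:
        return []
    return [item.strip() for item in flags.group(1).split(",") if item.strip()]


def wrapper_url(js_path, flags):
    """Map a .js source path to its wrapper URL (assumed mapping)."""
    stem = js_path[:-3] if js_path.endswith(".js") else js_path
    suffix = ".test262-module.html" if "module" in flags else ".test262.html"
    return stem + suffix
```

For example, a file containing `flags: [module]` at `path/to/test.js` would map to `path/to/test.test262-module.html` under these assumptions.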

Contributor

I don't understand why we need test262 as flags in the test URL. I think we can wire up the handlers so they only apply in the relevant test262 directories, and use a simpler mapping from source file to test URL.

Contributor Author

That's a very good point about simplifying the URL mapping.

My main reason for using the explicit .test262.html extension in the current proof-of-concept was to handle the smoke tests. Since they live in infrastructure/test262/, a global rule based on the file extension was the most straightforward way to ensure they were picked up by the correct wptserve handler.

If we switched to a handler that only activates for paths inside third_party/test262/, it wouldn't find the smoke tests. We could explicitly add a second check for infrastructure/test262/, but hardcoding special-case paths like that into the server logic feels a bit brittle.

Another alternative could be to make the handler apply to any path that contains a test262 subdirectory. That would cover both cases neatly.

Let me know your thoughts.


### 3. Execution Control

Running Test262 is **on by default**.

* **Default Behavior:** `wpt run` will include Test262 tests by default, consistent with all other test types.
* **Excluding Tests:** Users who wish to exclude Test262 (or any other type) can do so by specifying the types they *do* want to run (e.g., `wpt run --test-types=testharness,reftest`). A dedicated `--exclude-test-types` flag may be added in the future.
* **CI Integration:** CI systems will run Test262 tests by default. No special flags are needed to enable them.
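
The selection rule described above (everything runs unless the user lists the types they want) can be sketched as follows. This is illustrative only and is not `wpt run`'s actual code: the function name is hypothetical, and `"test262"` as a test type name is an assumption.

```python
def select_test_types(available, include=None):
    """Return the test types to run, in their original order.

    With no explicit include list, every available type runs, which is
    what makes Test262 "on by default". Hypothetical helper.
    """
    if include is None:
        return list(available)
    unknown = [name for name in include if name not in available]
    if unknown:
        raise ValueError("unknown test types: %s" % ", ".join(unknown))
    return [name for name in available if name in include]


# Assumed type names, for illustration only.
ALL_TYPES = ["testharness", "reftest", "test262"]
default_selection = select_test_types(ALL_TYPES)
without_test262 = select_test_types(ALL_TYPES, include=["testharness", "reftest"])
```

Here `default_selection` contains all three types, while `without_test262` mirrors the `--test-types=testharness,reftest` example.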

## Alternatives Considered

### Storing Test262 Metadata in the Manifest

Storing parsed YAML metadata from Test262 files directly within `MANIFEST.json` was considered. This is the standard WPT architecture for other dynamic test types (e.g., Workers, Window tests) to optimize runtime performance.

The current dynamic approach requires two disk I/O operations per test: one for the `wptserve` handler to parse YAML frontmatter, and a second for the browser to fetch the `.js` content. Storing metadata in `MANIFEST.json` would eliminate the first disk I/O and YAML parsing, effectively halving disk I/O operations per test during runtime.

However, this optimization is deferred for initial implementation simplicity. Metadata will be parsed dynamically by `wptserve` handlers at runtime.

Member

FWIW, I think it's a good thing to defer work until the tests are running when possible. Less time spent building the manifest makes ./wpt run feel faster as it starts running tests faster. (But total execution time should be the goal I think, and I'm sure it's sometimes faster to do the work once up front.)


### Pre-generating HTML Wrappers

Pre-generating static `.js.html` files for each Test262 test (as done for some other vendored suites like WebAssembly) was considered but rejected due to:

* **Complexity of Multiple Contexts:** Test262 tests execute in various contexts (e.g., module, strict). Pre-generating wrappers for all combinations would lead to a combinatorial explosion of static files, adding potentially far more than 50,000 new files and creating a significant management burden.

Dynamic HTML wrapper generation at serve-time, driven by Test262's metadata, offers a more flexible and efficient solution without duplicating repository files.
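
A serve-time wrapper generator might look roughly like this. It is a sketch under stated assumptions rather than the prototype's handler: `build_wrapper` and the `HARNESS_ROOT` URL prefix are hypothetical, and strict-mode handling is left out. It follows Test262's interpretation rules in loading `assert.js` and `sta.js` for every test except those flagged `raw`, and `doneprintHandle.js` for `async` tests.

```python
from html import escape

# Assumed URL prefix for the vendored harness directory.
HARNESS_ROOT = "/third_party/test262/harness/"
# Harness files Test262 expects for every non-raw test.
DEFAULT_INCLUDES = ("assert.js", "sta.js")


def build_wrapper(test_url, flags=(), includes=()):
    """Return an HTML document that loads the harness files, then the test."""
    lines = ["<!DOCTYPE html>", '<meta charset="utf-8">']
    if "raw" in flags:
        harness = ()  # raw tests run without any harness files
    else:
        harness = DEFAULT_INCLUDES
        if "async" in flags:
            harness += ("doneprintHandle.js",)
        harness += tuple(includes)
    for name in harness:
        src = escape(HARNESS_ROOT + name, quote=True)
        lines.append('<script src="%s"></script>' % src)
    script_type = ' type="module"' if "module" in flags else ""
    lines.append(
        '<script%s src="%s"></script>' % (script_type, escape(test_url, quote=True))
    )
    return "\n".join(lines) + "\n"
```

Because the wrapper is computed from the flags on each request, one `.js` file can be served in several contexts without any generated files being committed to the repository.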

### Using a dedicated `--test262` flag

This was considered to provide a clear separation from WPT's internal test types. However, using the existing `--test-types` flag is more consistent with how other test types are selected and is the preferred approach.

## Implementation Considerations & Risks

* **Manifest Generation Performance:** Including the 50,000+ Test262 tests in the manifest by default will increase the time required to run `wpt manifest`. This performance cost is accepted to ensure a simpler user experience, where a single, default manifest contains all tests.
* **Repository Size:** Vendoring Test262 will add roughly 50,000 files to the WPT repository. This is a significant but manageable increase.

Member

Maybe note that vendors that don't want it can exclude this directory using their existing mechanisms to exclude parts of WPT?

This would add the requirement that WPT has to work even with the directory deleted, should we test that in CI somehow?

Contributor

Is there anyone who is known to not want these tests and can't simply ignore them? I feel like we're worrying a lot about people potentially being annoyed by the size and/or runtime increases, but without any specific examples of people who are actually worried by those things.

Contributor Author

Not sure. But I just updated the RFC to have it run by default.

* **CI Performance:** Full Test262 runs are lengthy. They will be scheduled during the existing nightly builds, not on every pull request.
* **Upstream Changes:** Updating vendored Test262 tests requires care to ensure changes to the test harness or metadata format are handled correctly. The CI job's update PRs will include a smoketest for basic integration issues.
* **Unit Tests for New Code:** New Python code for Test262 integration (e.g., manifest parsing, serving logic) will have dedicated unit tests, separate from integration smoketests.

## Proof of Concept

A prototype Test262 integration exists in:
[https://github.com/web-platform-tests/wpt/pull/55997](https://github.com/web-platform-tests/wpt/pull/55997)