| 1 | +# RFC 131: Disable wpt-chrome-dev-stability runs |
| 2 | + |
| 3 | +## Summary |
| 4 | + |
| 5 | +The wpt-chrome-dev-stability job on WPT fails frequently, which hurts |
| 6 | +developer productivity. The job should be made non-blocking while |
| 7 | +still recording newly introduced flakiness, which could be useful in |
| 8 | +further investigations. |
| 9 | + |
| 10 | +## Details |
| 11 | + |
| 12 | +Stability jobs are intended to help us ensure that new and modified |
| 13 | +tests aren't flaky. In particular, they are designed to ensure that the |
| 14 | +test author is aware when they have written a test that is flaky in a |
| 15 | +different browser. This is because authors writing a test in a |
| 16 | +specific browser engine might be unaware that they are depending on |
| 17 | +details of that implementation that make the test unreliable in other |
| 18 | +engines. Leaving this to other developers to fix is expensive, because |
| 19 | +they have to spend time becoming familiar with details of the test |
| 20 | +which are already understood by the original author. In addition, when |
| 21 | +there are many unstable web-platform-tests, it undermines confidence |
| 22 | +in the overall suite and, where such instability is automatically |
| 23 | +handled in CI systems (e.g. by marking a test as [PASS, FAIL] in |
| 24 | +metadata), reduces the ability of the tests to catch regressions. |
| 25 | + |
| 26 | +Currently, stability checks for the corresponding browser are skipped |
| 27 | +when [merging export PRs from their own |
| 28 | +repositories](https://github.com/web-platform-tests/wpt/issues/29737) |
| 29 | +(i.e. Chromium exports don't run Chrome stability checks, and |
| 30 | +likewise for Firefox). This is because we assume that any real |
| 31 | +stability issue is likely to have been handled in the source |
| 32 | +repository. It's common for such changesets to include code changes |
| 33 | +to the browser itself that fix intermittents, and in the source |
| 34 | +repository the tests will run together with the corresponding browser |
| 35 | +changes. However, on GitHub we are using the latest development |
| 36 | +release of the browser, which is unlikely to contain code fixes at |
| 37 | +the time of export. This makes the stability checks on these PRs |
| 38 | +highly prone to misleading failures. |
| 39 | + |
| 40 | +The main problem with the stability checks as currently implemented is |
| 41 | +that although the test author is in the best place to understand the |
| 42 | +behaviour of the test, they may not understand how to reproduce the |
| 43 | +failure in another browser, or may be unable to fix the problem |
| 44 | +(e.g. if it's a browser bug rather than a test bug). |
| 45 | + |
| 46 | +Faced with these tradeoffs, the Chromium developers believe the |
| 47 | +project is better served by allowing tests which are unstable in |
| 48 | +Chrome to land in web-platform-tests and using the tooling they have |
| 49 | +available in the Chromium CI system to investigate the |
| 50 | +flakiness. Therefore the job will be marked as non-blocking, so we are |
| 51 | +able to see where it fails and assess whether the correct tradeoff has |
| 52 | +been made. |
| 53 | + |
| 54 | +Marking the job as non-blocking will require changes to the Taskcluster |
| 55 | +configuration so that the job does not block the sink task. |
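
The effect of the non-blocking change on the sink task can be sketched as follows. This is a minimal illustration, not the actual WPT sink task code: the names `TaskResult`, `NON_BLOCKING`, and `sink_status` are hypothetical, and only the job name `wpt-chrome-dev-stability` comes from this RFC. The idea is that a failing non-blocking job is still recorded, but it no longer decides the overall result.

```python
from dataclasses import dataclass

# Hypothetical set of job names whose failures must not fail the sink task.
NON_BLOCKING = frozenset({"wpt-chrome-dev-stability"})


@dataclass(frozen=True)
class TaskResult:
    name: str     # job name as it appears in CI
    passed: bool  # True if the job completed successfully


def sink_status(results, non_blocking=NON_BLOCKING):
    """Return (ok, blocking_failures, recorded_failures).

    ok is False only when a blocking job failed. Failures of non-blocking
    jobs are still returned so they can be logged for later investigation.
    """
    blocking_failures = [
        r.name for r in results if not r.passed and r.name not in non_blocking
    ]
    recorded_failures = [
        r.name for r in results if not r.passed and r.name in non_blocking
    ]
    return (not blocking_failures, blocking_failures, recorded_failures)
```

For example, calling `sink_status` with a failed `wpt-chrome-dev-stability` result and a passing result for some other job (say a hypothetical `lint` job) yields an overall pass, with the stability failure listed in the recorded failures rather than the blocking ones.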
| 56 | + |
| 57 | +Disabling the job entirely would also reduce the CI load. This can be |
| 58 | +considered once the impact of the change is well understood. |
| 59 | + |
| 60 | +## Risks |
| 61 | + |
| 62 | +New intermittent failures specific to Chromium could be introduced |
| 63 | +upstream and only noticed when a WPT import to Chromium CI |
| 64 | +happens. Chromium developers believe the tradeoff is worthwhile, as |
| 65 | +the Chromium project has robust tooling to investigate and handle such |
| 66 | +cases. |