
How To Reduce Flaky Tests in Vaadin 8 TestBench (Selenium)

This repo uses Vaadin 8.30.x, TestBench 5.4.x and Selenium 4. That stack is perfectly capable of stable end‑to‑end tests, but you only get reliability if your tests consistently wait for Vaadin and the DOM instead of racing the UI.

Below is a practical checklist (with tiny snippets) that’s tailored to the patterns already used in the vaadincreate-ui ITs.


1) Start from a deterministic browser environment

Flakes often come from environments that are only “almost the same”: small differences in rendering or timing between runs.

Do this:

  • Use a fixed viewport size for every run.
  • Keep headless flags consistent across CI and local.
  • Avoid dynamic scaling (device pixel ratio changes, responsive breakpoints).

This repo already does the important part in the base test: fixed Chrome --window-size=1280,900 and stable screenshot settings.
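
A minimal sketch of that kind of setup, assuming a Chrome-based base test (the exact flags live in the repo's base class; --force-device-scale-factor is an extra, optional hardening):

ChromeOptions options = new ChromeOptions();
options.addArguments("--headless=new");                // same mode locally and in CI
options.addArguments("--window-size=1280,900");        // fixed viewport every run
options.addArguments("--force-device-scale-factor=1"); // pin device pixel ratio
setDriver(TestBench.createDriver(new ChromeDriver(options)));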

Extra tip (visual tests): if a view has CSS transitions/animations, wait for a state (class present, overlay gone, chart finished) instead of sleeping. See step 5.


2) Prefer stable locators: IDs over structure

Vaadin UIs re-render frequently; DOM structure and indexes change.

Do this:

  • Give critical elements stable IDs in the UI (buttons, fields, dialogs, grid, overlays).
  • Use TestBench element APIs with .id("...").
  • Avoid CSS selectors that depend on layout/ordering.

Avoid this: selecting by “the second suggestion” or “first row” unless it’s truly deterministic.

Minimal example: avoid index-based ComboBox selection

Instead of:

combo.getPopupSuggestionElements().get(1).click();

Prefer a stable value:

combo.selectByText("Available");

If the list is dynamic, combine it with waiting (step 4/5).
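
For completeness, the ID pattern has two halves: set the ID in the UI code, then query by type and ID in the test ("save-button" is an illustrative id, not necessarily one from this repo):

// In the Vaadin 8 UI code: give the component a stable DOM id.
Button saveButton = new Button("Save");
saveButton.setId("save-button");

// In the IT: look it up by type and id instead of by structure.
$(ButtonElement.class).id("save-button").click();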


3) Know when to wait: TestBench auto-waits (and when it doesn’t)

Vaadin 8 is server-driven. Many UI actions cause RPC calls + re-render. If you click/type and immediately assert, you’re racing the framework.

The important nuance: most TestBench element interactions already include waiting. Methods like click(), setValue(...), many element lookups, and other TestBench APIs typically trigger an internal waitForVaadin() around the interaction.

So: it’s good to know testBench().waitForVaadin(), but in normal TestBench-style tests it’s rarely needed as an extra call.

What waitForVaadin() actually waits for

testBench().waitForVaadin() waits for the Vaadin request cycle (pending client↔server communication initiated by the UI interaction).

Critical exception: Push / async updates

If your UI updates asynchronously (e.g., @Push, background threads pushing UI changes, server events), waitForVaadin() does not reliably wait for those pushed updates to appear.

In those cases, you must use explicit Selenium/TestBench waits for the resulting DOM/state:

waitUntil(d -> $(LabelElement.class).id("status").getText().contains("Ready"));

Minimal pattern (preferred)

$(ButtonElement.class).id("save-button").click(); // usually auto-waits

waitForElementPresent(By.className("v-Notification"));

When to add an explicit waitForVaadin()

Add it when you mix in operations that bypass TestBench’s auto-wait, for example:

  • Raw WebDriver: driver.findElement(...).click()
  • Selenium Actions
  • executeScript(...) that triggers Vaadin client actions

Pattern:

driver.findElement(By.id("save-button")).click();
testBench().waitForVaadin();
waitForElementPresent(By.className("v-Notification"));
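
If this mixed pattern recurs across tests, a small base-class helper (a hypothetical name, in the spirit of the helpers suggested in the summary) keeps it in one place:

// Raw WebDriver click + explicit synchronization with the Vaadin request cycle.
protected void clickAndWaitForVaadin(By by) {
    getDriver().findElement(by).click();
    testBench().waitForVaadin();
}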

4) Treat re-renders as normal: handle StaleElementReferenceException

In Vaadin, components can be replaced in the DOM between “find” and “read”. The ITs already implement a good pattern: search for a matching element inside waitUntil(...), and retry on stale references.

Minimal reusable helper pattern

// `expected` holds the notification caption the test is waiting for.
NotificationElement notification = waitUntil(driver -> {
    for (NotificationElement n : $(NotificationElement.class).all()) {
        try {
            if (expected.equals(n.getCaption())) {
                return n;
            }
        } catch (org.openqa.selenium.StaleElementReferenceException ignored) {
            return null; // element was re-rendered mid-read; force a retry
        }
    }
    return null; // not found yet; waitUntil polls the lambda again
});

Do this:

  • If an element is frequently stale, re-query it inside waitUntil.
  • Keep the lambda “pure”: no side effects; just return the element when ready.

5) Replace sleeps with state-based waits (especially for animations)

A fixed sleep (Thread.sleep, or a wrapper like wait(Duration)) can reduce flakes locally but will still fail on slower CI or after a minor UI change.

Use sleeps only as a last resort, and only when you can point to a known fixed-duration animation.

Better: wait for “stable” conditions

Charts example (instead of waiting 1 second):

ChartElement chart = $(ChartElement.class).id("price-chart");
waitUntil(d -> chart.getDataLabels().size() == 3);

Dialog example:

$(ButtonElement.class).id("delete-button").click();
waitForElementVisible(By.id("confirm-dialog"));

Form open/close example:

$(ButtonElement.class).id("new-product").click();
waitUntil(d -> $(CssLayoutElement.class).id("book-form")
        .getClassNames().contains("bookform-wrapper-visible"));

6) Make tests independent: no shared mutable fixtures

A huge source of flakiness is test order dependence. If one test deletes or modifies shared data (users/categories/books) and another test expects the original fixtures, you’ll get “works on my machine” failures when running a subset, retrying failed tests, or running in parallel.

Do this:

  • Each test should create its own data and clean it up.
  • If you must use fixtures, reset the database between tests or test classes.
  • Never delete a shared fixture user like User4 unless you recreate it.

Minimal pattern: use unique names

String username = "Testuser-" + System.currentTimeMillis();
userField.setValue(username);

…and then delete that same record at the end of the test (or in @After).
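
A sketch of both halves, assuming JUnit 4 and a hypothetical deleteUserIfExists helper in the test class:

private String username;

@Test
public void createsUser() {
    username = "Testuser-" + System.currentTimeMillis();
    // ... create the user through the UI ...
}

@After
public void cleanup() {
    if (username != null) {
        deleteUserIfExists(username); // hypothetical helper: removes only this test's record
    }
}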


7) Close overlays/notifications so they don’t block future clicks

Overlays are classic “hidden flake” causes: they can intercept clicks, block focus, or change layout.

Do this:

  • When you assert a notification, close it.
  • When you open a popup/menu, close it before continuing.

This repo already closes notifications in the book tests—keep using that pattern everywhere.
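
A minimal assert-then-close sketch (TestBench's NotificationElement has a close() helper that dismisses the overlay):

NotificationElement notification = $(NotificationElement.class).first();
assertTrue(notification.getCaption().contains("saved")); // smallest stable assertion
notification.close(); // dismiss so the overlay can't swallow the next click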


8) Avoid asserting more than you need

Assertions that are too strict are fragile:

  • Exact localized notification strings (punctuation, whitespace)
  • Full aria-label strings that include dynamic values
  • Exact chart labels when underlying data can change

Do this:

  • Assert the smallest stable piece of behavior.
  • Prefer contains(...) when the exact string is not the real requirement.

Example:

assertTrue(notification.getText().contains("poistettu"));

9) Screenshot tests: stabilize the UI before compareScreen

The base test already sets:

  • Parameters.setMaxScreenshotRetries(3)
  • Parameters.setScreenshotRetryDelay(1000)
  • a tolerance for minor diffs

To reduce flakes further:

  • Ensure the UI is in a stable state (no open tooltips, menus, caret blinking).
  • For animated components (charts, sliding panels), wait for a stable condition (step 5).
  • Keep dynamic content (timestamps, random IDs) out of screenshot regions.
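
Putting it together, a hedged sketch (the "book-grid" id and the "inventory-view" reference name are illustrative; compareScreen throws IOException, so declare it on the test method):

// Stabilize first: rows rendered, no leftover notification overlay.
waitUntil(d -> $(GridElement.class).id("book-grid").getRowCount() > 0);
assertTrue($(NotificationElement.class).all().isEmpty());

// Then compare; TestBench retries up to the configured max before failing.
assertTrue("Inventory view differs from reference",
        testBench().compareScreen("inventory-view"));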

10) A quick “anti-flake” template for new ITs

Use this as a mental checklist:

  1. open() and waitForAppLoaded() (already in base)
  2. Login; wait for an app root element
  3. For every action that triggers server work: click/type (TestBench auto-waits) → wait for the DOM/state you need
  4. Prefer stable IDs
  5. Handle re-render staleness by re-querying inside waitUntil
  6. Create unique data; clean up
  7. Close overlays/notifications
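
As a concrete sketch of the checklist (ids, field names, and captions are illustrative, not the repo's actual ones):

@Test
public void newProduct_save_showsNotification() throws Exception {
    open();                                         // 1) base test waits for app load
    waitForElementPresent(By.id("inventory-view")); // 2) app root is there

    $(ButtonElement.class).id("new-product").click();        // 3+4) auto-waits, stable id
    String name = "Testbook-" + System.currentTimeMillis();  // 6) unique data
    $(TextFieldElement.class).id("product-name").setValue(name);
    $(ButtonElement.class).id("save-button").click();

    NotificationElement n = waitUntil(d -> {                 // 3+5) state wait, stale-safe
        for (NotificationElement e : $(NotificationElement.class).all()) {
            try {
                if (e.getCaption().contains(name)) {
                    return e;
                }
            } catch (StaleElementReferenceException stale) {
                return null; // force retry
            }
        }
        return null;
    });
    n.close();                                      // 7) close the overlay
    // ... delete the created product here or in @After (6) ...
}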

Summary

Stable Vaadin 8 TestBench tests are mostly about synchronization and isolation:

  • Synchronization: rely on TestBench auto-waits, and use explicit element/state waits (especially for animations and Push/async updates).
  • Isolation: no shared mutable fixtures; create your own data and clean up.
  • Robustness: stable IDs, avoid index-based selectors, retry around staleness.

A good follow-up is to add a small set of helper methods to the base test (e.g., clickAndWaitForVaadin, waitForNotificationCaption, waitForChartLabels) to standardize these patterns across all ITs.
