Add support for execution sharding #2600

agologan · 2025-07-27T18:24:42Z

🤔 What's changed?

Adds a new --shard flag (for example, --shard 1/3) to allow execution sharding, similar to https://playwright.dev/docs/test-sharding
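
To make the shape of the flag value concrete, here is a minimal sketch of how a value such as 1/3 might be parsed and validated. `parseShard` and `ShardSpec` are hypothetical names used only for illustration and are not taken from this PR; the 1-based index is an assumption that follows the Playwright convention linked above.

```typescript
// Hypothetical helper (not the PR's code): parses a shard value such as "1/3".
interface ShardSpec {
  index: number // 1-based shard number (assumed, as in Playwright)
  total: number // total number of shards
}

function parseShard(value: string): ShardSpec {
  const match = /^(\d+)\/(\d+)$/.exec(value.trim())
  if (!match) {
    throw new Error(`Invalid shard "${value}", expected the form INDEX/TOTAL`)
  }
  const index = Number(match[1])
  const total = Number(match[2])
  // Reject values such as 0/3 or 4/3, which select no valid shard.
  if (total < 1 || index < 1 || index > total) {
    throw new Error(`Invalid shard "${value}", INDEX must be between 1 and TOTAL`)
  }
  return { index, total }
}
```

With this sketch, `parseShard('1/3')` yields `{ index: 1, total: 3 }`, while `parseShard('4/3')` and `parseShard('abc')` throw.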

⚡️ What's your motivation?

🏷️ What kind of change is this?

📖 Documentation (improvements without changing code)
⚡ New feature (non-breaking change which adds new behaviour)

♻️ Anything particular you want feedback on?

The feature is implemented as a plugin on the pickles:filter event and runs between the filter and order plugins.
As such, the defined order is not affected by sharding, and random order only shuffles the tests within a shard.

Alternatively, a new pickles:shard event could be introduced, executed after pickles:order, which would allow a global random seed. That would require a documentation change to warn users that they should use the same seed across shards.

The option was added to ISourcesCoordinates because it acts more like a filtering option for a specific instance than like --parallel, and the plugin is loaded only for runCucumber.

📋 Checklist:

I agree to respect and uphold the Cucumber Community Code of Conduct
I've changed the behaviour of the code
- I have added/updated tests to cover my changes.
My change requires a change to the documentation.
- I have updated the documentation accordingly.
Users should know about my change
- I have added an entry to the "Unreleased" section of the CHANGELOG, linking to this pull request.


coveralls · 2025-07-27T18:28:10Z

coverage: 97.792% (+0.01%) from 97.78%
when pulling 92ba787 on agologan:main
into cf6ab33 on cucumber:main.


davidjgoss

Many thanks for contributing this, it's a great quality PR.

> The feature is implemented as a plugin on the pickles:filter event and runs between the filter and order plugins.
> As such, the defined order is not affected by sharding, and random order only shuffles the tests within a shard.
>
> Alternatively, a new pickles:shard event could be introduced, executed after pickles:order, which would allow a global random seed. That would require a documentation change to warn users that they should use the same seed across shards.

I think what you've done is the best balance - no sense in adding another event just for this.

Tallyb · 2025-08-22T12:29:58Z

From looking at the code, I could not tell how the pickles are sorted. If the order is effectively random, depending on how the files are read from the file system, this will lead to errors. You must sort all pickles in a unique way. As a side note, since you use modulo, sorting by file size and then alphabetically will give better performance than sorting alphabetically only.

davidjgoss · 2025-08-22T19:24:08Z

@Tallyb can you elaborate on what errors you are foreseeing? Are you able to reproduce them in a sample project?

This PR doesn't deal with any sorting - that is handled in a later phase, after filtering.

Tallyb · 2025-08-23T09:52:03Z

Let's take a simple example:
Assume we have tests A, B, C, D, E, F, G and we run them across 3 shards.
Each shard runs on a different machine, so it is completely unaware of what is happening on the other machines.

Machine 1: tests are read in the order A, B, C, D, E, F, G. Tests taken: A, D, G (positions 1, 4, 7).
Machine 2: tests are read in the order B, A, C, D, E, F, G. Tests taken: A, E (positions 2, 5).
Machine 3: tests are read in the order B, C, A, D, E, F, G. Tests taken: A, F (positions 3, 6).
As you can see, A is tested 3 times, while B and C are not tested at all. To make sure everything is tested, you must guarantee that the tests are read in the same order on every machine.
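
The scenario above can be reproduced with a small model of modulo sharding. This is only a sketch: `takeShard` is a hypothetical function written for this illustration, not the PR's implementation, and it assumes a 1-based shard index.

```typescript
// Hypothetical model of modulo sharding; names are illustrative only.
function takeShard<T>(items: T[], index: number, total: number): T[] {
  // Keep every item whose 0-based position falls on this (1-based) shard.
  return items.filter((_, position) => position % total === index - 1)
}

// The read order on each of the three machines, as in the example.
const readOrders: string[][] = [
  ['A', 'B', 'C', 'D', 'E', 'F', 'G'], // machine 1
  ['B', 'A', 'C', 'D', 'E', 'F', 'G'], // machine 2
  ['B', 'C', 'A', 'D', 'E', 'F', 'G'], // machine 3
]

// Machine i runs shard i of 3 against its own read order.
const taken = readOrders.map((order, i) => takeShard(order, i + 1, 3))
// taken is [['A', 'D', 'G'], ['A', 'E'], ['A', 'F']]

// Tests that no machine picked up.
const allTaken = ([] as string[]).concat(...taken)
const missing = readOrders[0].filter((t) => allTaken.indexOf(t) === -1)
// missing is ['B', 'C']
```

If all three entries in `readOrders` were identical, the three shards would partition the list and `missing` would be empty.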

agologan · 2025-08-25T08:39:36Z

The code in all instances does 3 things, in this order:

1. discovers paths
2. filters the paths
3. applies ordering (unless it is the default, defined)

You can see this in load_sources.ts or run_cucumber.ts.

Sharding is applied after discovery but before further ordering.

So discovery is responsible for the defined order, which can be seen in paths.ts.

Getting back to your example, all 3 machines run the exact same code to discover the paths, which, as the docs put it, "roughly means alphabetical order of file path followed by sequential order within each file".

So all 3 machines will discover the tests in an order that roughly resembles A, B, C, D, E, F, G, and that order is the same across them.

Then all 3 machines will apply the same filtering logic, which would leave them
with the same sequence, to which sharding is applied.

Picking every nth test on each shard then gives a uniform distribution of all tests.

As a last step, if a random order is used, the tests left on each shard are reshuffled.
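
The sequence described above can be sketched as follows. All names here (`discover`, `filter`, `shard`, `runOnMachine`, the `Test` type, and the `@wip` tag) are hypothetical stand-ins for illustration and do not reflect the actual cucumber-js internals; the ordering step is left out because it happens after sharding and does not change which tests a shard receives.

```typescript
type Test = { path: string; tags: string[] }

// Stand-in for discovery: the same inputs always yield the same order.
function discover(found: Test[]): Test[] {
  return [...found].sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
}

// Stand-in for filtering, here by excluding a tag.
function filter(tests: Test[], excludeTag: string): Test[] {
  return tests.filter((t) => t.tags.indexOf(excludeTag) === -1)
}

// Every total-th test, starting at the 1-based shard index.
function shard(tests: Test[], index: number, total: number): Test[] {
  return tests.filter((_, position) => position % total === index - 1)
}

function runOnMachine(found: Test[], index: number, total: number): string[] {
  return shard(filter(discover(found), '@wip'), index, total).map((t) => t.path)
}

const files: Test[] = ['a', 'b', 'c', 'd', 'e', 'f', 'g'].map((n) => ({
  path: `features/${n}.feature`,
  tags: n === 'c' ? ['@wip'] : [],
}))

// Even if a machine lists the files in another order, discovery normalises it.
const shards = [1, 2, 3].map((i) =>
  runOnMachine(i % 2 === 0 ? files : [...files].reverse(), i, 3)
)
// shards[0] is ['features/a.feature', 'features/e.feature']
// shards[1] is ['features/b.feature', 'features/f.feature']
// shards[2] is ['features/d.feature', 'features/g.feature']
```

Each of the six tests that survive filtering lands on exactly one shard, which is the property the discussion relies on.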

davidjgoss · 2025-08-25T08:44:37Z

Thanks for laying that out @agologan.

As you say, the initial order of sources and pickles is deterministic given the same configuration, so I don't believe we have a problem here.

Tallyb · 2025-08-25T09:00:00Z

Ok, as long as it is deterministic, there is no issue.
You can use my tip for improving performance by sorting by size (assuming size ~= execution time).
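
To illustrate the tip, here is a small sketch comparing shard loads when tests are dealt out by position in alphabetical order versus in size order. It is not part of this PR: `shardLoads`, the `Sized` type, and the sizes are made up for illustration, size is used as a proxy for execution time as assumed above, and the descending direction is an arbitrary choice. A better balance is not guaranteed for every input.

```typescript
type Sized = { path: string; size: number }

// Sum of sizes per shard when tests are dealt out by position modulo total.
function shardLoads(tests: Sized[], total: number): number[] {
  const loads = new Array<number>(total).fill(0)
  tests.forEach((t, position) => {
    loads[position % total] += t.size
  })
  return loads
}

// Made-up sizes: two large files and four small ones, in alphabetical order.
const alphabetical: Sized[] = [
  { path: 'a.feature', size: 9 },
  { path: 'b.feature', size: 1 },
  { path: 'c.feature', size: 1 },
  { path: 'd.feature', size: 9 },
  { path: 'e.feature', size: 1 },
  { path: 'f.feature', size: 1 },
]

// Size first (largest first), then alphabetical as a tie-breaker.
const bySize = [...alphabetical].sort(
  (x, y) => y.size - x.size || (x.path < y.path ? -1 : 1)
)

const loadsAlphabetical = shardLoads(alphabetical, 3) // [18, 2, 2]
const loadsBySize = shardLoads(bySize, 3) // [10, 10, 2]
```

In this made-up case the slowest shard drops from 18 to 10 units, because the two large files no longer land on the same shard.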

agologan force-pushed the main branch from b613ccb to 4604477 · July 27, 2025 18:27

agologan force-pushed the main branch from 4604477 to 4d0f0a0 · July 27, 2025 18:31

Add support for execution sharding

64169d3

agologan force-pushed the main branch from 4d0f0a0 to 64169d3 · July 27, 2025 18:37

davidjgoss added 5 commits August 16, 2025 09:29

alphabetise cli options

e165a90

make new field optional on api type

26eb8d3

(to avoid this being a breaking change for api consumers)

add unhappy scenario and option validation

5b88936

update feature description

8f6ae43

attribution

9b15471

davidjgoss approved these changes Aug 16, 2025

davidjgoss added 2 commits August 16, 2025 09:43

Update CHANGELOG.md

b2dfc43

Merge branch 'main' into main

92ba787

davidjgoss merged commit c79fe2d into cucumber:main Aug 16, 2025
8 checks passed
