
Conversation

@chfast (Member) commented Jan 29, 2025

🗒️ Description

Add a worst-case test that runs a block packed with as many MOD instructions as possible, with arguments of the parametrized bit size.
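For illustration, here is a rough standalone sketch of how a block full of MOD instructions with operands of a parametrized bit size could be assembled as raw bytecode (a hypothetical snippet, not the actual EEST test code; the operand values are an assumed choice):

```python
# Hypothetical sketch: repeat "PUSH32 y, PUSH32 x, MOD, POP" so a block can be
# packed with MOD instructions whose operands have `mod_bits` significant bits.
PUSH32, MOD, POP = 0x7F, 0x06, 0x50

def mod_worst_case_code(mod_bits: int, repetitions: int) -> bytes:
    x = (1 << mod_bits) - 1        # dividend with `mod_bits` significant bits
    y = (1 << (mod_bits - 1)) | 1  # odd divisor of roughly the same size
    push = lambda value: bytes([PUSH32]) + value.to_bytes(32, "big")
    unit = push(y) + push(x) + bytes([MOD, POP])  # computes x % y, discards it
    return unit * repetitions

code = mod_worst_case_code(mod_bits=191, repetitions=1000)
```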

🔗 Related Issues

#1571

✅ Checklist

  • All: Set appropriate labels for the changes.
  • All: Considered squashing commits to improve commit history.
  • All: Added an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • Tests: All converted JSON/YML tests from ethereum/tests have been added to converted-ethereum-tests.txt.
  • Tests: A PR with removal of converted JSON/YML tests from ethereum/tests has been opened.
  • Tests: Included the type and version of evm t8n tool used to locally execute test cases: e.g., ref with commit hash or geth 1.13.1-stable-3f40e65.
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.

@marioevz (Member) left a comment

I think we should rather create a marker, e.g.:

@pytest.mark.benchmark

And then you can place these tests in tests/frontier/opcodes/ but still select them with uv run fill -m benchmark.
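A minimal sketch of what such a marked test could look like (illustrative only; a real EEST test would use the framework's fixtures rather than this placeholder body, and the marker would also need to be registered in the pytest configuration):

```python
import pytest

@pytest.mark.benchmark  # selectable with `-m benchmark`, e.g. `uv run fill -m benchmark`
def test_worst_mod_sketch():
    # Placeholder body: a real test would build a block packed with MOD
    # instructions via EEST's state/blockchain test fixtures.
    ...
```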

How do you see the clients consuming these tests for benchmarks? Is there a command in evmone to run benchmark state tests?

@chfast (Member, Author) commented Jan 30, 2025

Sorry, I haven't written up a proper description of this yet, but here are my thoughts about benchmarks in EEST.

  1. Benchmarks are formally regular tests, and the existing types (state tests, blockchain tests, even EOF tests and transaction tests) are already very useful. They should be produced with fill and consumed in the normal way.
  2. We should still separate them from the main tests, maybe by using a marker like @pytest.mark.benchmark and/or a separate directory.
  3. They should not contribute to the test coverage. Therefore, they should not be part of the fixtures release, or they should be physically separated from it.
  4. We want to generate them for a single fork, likely a recent one. It is an open question whether there should be a default fork for all benchmarks and how to upgrade that fork. But in general, we are mostly interested in EVM performance near the tip of the chain, not in historical execution.
  5. That is also why sorting them by fork names is not a good idea. Potential categories are something like: real use cases (e.g. SHA1), micro-benchmarks, worst cases, etc. But I don't think we need to decide on categories right now.
  6. I imagine usage to be like this: if this is the regular test execution:
    evm test benchmark.json
    then benchmarking is:
    evm test --bench benchmark.json
    The difference is just that with --bench the test is executed in a loop and the execution time is reported.
  7. Because of this loop execution we may need to put some additional requirements on the benchmarks. E.g. a benchmark should not modify the state; if it does modify the state, the state diff should be the same in every iteration; and the test itself should execute for at least 1 ms.

@marioevz (Member) commented:

@chfast I like the idea, and basically everything you mention is somewhat easily doable in EEST.

One question, for (6), is there a pass/fail condition? Or is this something that is left to the client test consumer's discretion?

@chfast (Member, Author) commented Feb 3, 2025

> One question, for (6), is there a pass/fail condition? Or is this something that is left to the client test consumer's discretion?

In the example, evm test benchmark.json is a regular test execution: no change in the pass/fail conditions.

The additional --bench will likely execute the benchmark in a loop and report timing statistics. What I try to do in my current tooling is to execute it once and check the assertions, and only then switch to loop execution. In general, you want to make sure the benchmarks are executed correctly; otherwise there is no point in benchmarking an incorrect implementation (or an incorrect test). In practice the exact method may depend on the time of a single execution. It is indeed up to the client teams to decide, but also up to us to audit these decisions.
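As a minimal sketch of this validate-then-loop flow (assuming only a callable that executes the fixture and raises on a pass/fail mismatch; this is not evmone's actual --bench implementation):

```python
import time
from typing import Callable

def run_benchmark(execute_fixture: Callable[[], None], iterations: int = 100) -> float:
    """Run the fixture once as a regular pass/fail test, then time repeated runs."""
    execute_fixture()  # single validated run; raises if the test fails
    start = time.perf_counter()
    for _ in range(iterations):  # timed loop execution, as with --bench
        execute_fixture()
    return (time.perf_counter() - start) / iterations  # mean seconds per run
```

The loop only stays meaningful if the requirements from point (7) above hold, e.g. the state diff is identical in every iteration.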

@chfast chfast changed the title new(tests): add worst-case benchmark for MOD feat(tests): add worst-case benchmark for MOD May 20, 2025
@chfast chfast added the scope:tests, type:feat and feature:zkevm labels May 20, 2025
@chfast (Member, Author) commented May 20, 2025

Adapted to the zkevm task.

@chfast chfast requested review from jsign and marioevz May 20, 2025 20:45
@chfast chfast marked this pull request as ready for review May 20, 2025 20:49
@jsign (Collaborator) left a comment

Left some comments for your consideration, but overall LGTM.

Cycles:

tests/zkevm/test_worst_compute.py::test_worst_mod[fork_Cancun-blockchain_test-mod_bits_63]-1    3918422193
tests/zkevm/test_worst_compute.py::test_worst_mod[fork_Cancun-blockchain_test-mod_bits_255]-1   6321807633
tests/zkevm/test_worst_compute.py::test_worst_mod[fork_Cancun-blockchain_test-mod_bits_127]-1   6559030064
tests/zkevm/test_worst_compute.py::test_worst_mod[fork_Cancun-blockchain_test-mod_bits_191]-1   8734621043

@chfast chfast force-pushed the bench/mod branch 2 times, most recently from ea4c288 to 426b758 Compare May 21, 2025 08:55
@chfast (Member, Author) commented May 21, 2025

Also added an SMOD variant.

@marioevz (Member) left a comment

LGTM, thanks!

@marioevz marioevz merged commit c3e5fb6 into ethereum:main May 21, 2025
14 checks passed
@marioevz marioevz deleted the bench/mod branch May 21, 2025 18:39
kclowes pushed a commit to kclowes/execution-spec-tests that referenced this pull request Oct 20, 2025