Release v5.0.0 · ethereum/execution-spec-tests

🇯🇵 Summary

EEST Fujisan is our first full release for Osaka, the first full release since Pectra!

In addition to the latest Osaka-specific test cases, it includes re-filled GeneralStateTests from ethereum/tests (now fully maintained within EEST under tests/static) for Osaka, adhering to the transaction gas limit cap from EIP-7825. Further framework changes include new simulators, test formats, and test types.

⚔️ Future Weld with EELS

EEST will merge with EELS during Q4 2025, after which EEST becomes read-only for external contributors.

What does this mean?

All EEST code moves to the EELS repository.
New EEST framework location: execution-specs/src/ethereum_spec_tests/.
New EEST tests location: execution-specs/tests/eest/.
Future PRs go to EELS instead of EEST.

Important Notes

All PRs for tests and framework changes should still be directed at EEST until further notice.
There will be a brief freeze on EEST contributions during Q4 ("The Switch"), after which contributors can continue as before, but in EELS.
Test releases will continue from EEST as normal before, during, and after this transition.

More information will be communicated accordingly through the normal communication channels.

❗Current Status Quo

Test Fixtures Overview

fixtures_static.tar.gz has been deprecated.
fixtures_stable.tar.gz & fixtures_develop.tar.gz now both contain re-filled static tests, GeneralStateTests from ethereum/tests, filled from Cancun.
fixtures_stable.tar.gz contains tests filled for forks until Prague.
fixtures_develop.tar.gz contains tests filled for forks until Osaka.
fixtures_benchmark.tar.gz contains benchmark tests filled only for Prague.

EL Client Test Requirements

Prague Coverage (Mainnet):

Run all state_tests and blockchain_tests from fixtures_stable.tar.gz.
Run only BlockchainTests from the latest ethereum/tests release, filled until Prague.

Fusaka Coverage (Including Mainnet):

Run all state_tests and blockchain_tests from fixtures_develop.tar.gz.
Run only BlockchainTests from the latest ethereum/tests release, filled until Prague.

Note: If you require GeneralStateTests from ethereum/tests filled for forks before Cancun then you must get these from the latest ethereum/tests release.
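The requirements above can be summarised as a small lookup table. This is only a sketch for client teams scripting their own test runs: the dictionary keys and the `artifacts_for` helper are illustrative names, not EEST identifiers.

```python
# Which artifacts to run for each coverage target, as listed above.
# Keys and helper names are illustrative; only the tarball and format
# names come from the release notes.
COVERAGE_ARTIFACTS = {
    "prague": {
        "eest_tarball": "fixtures_stable.tar.gz",
        "eest_formats": ("state_tests", "blockchain_tests"),
        "ethereum_tests": "BlockchainTests (filled until Prague)",
    },
    "fusaka": {
        "eest_tarball": "fixtures_develop.tar.gz",
        "eest_formats": ("state_tests", "blockchain_tests"),
        "ethereum_tests": "BlockchainTests (filled until Prague)",
    },
}


def artifacts_for(coverage: str) -> dict:
    """Return the artifacts to run for a coverage target ("prague" or "fusaka")."""
    key = coverage.strip().lower()
    if key not in COVERAGE_ARTIFACTS:
        raise ValueError(f"unknown coverage target: {coverage!r}")
    return COVERAGE_ARTIFACTS[key]


print(artifacts_for("Fusaka")["eest_tarball"])
```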

Benchmark Tests

For the most up-to-date benchmark tests, use fixtures_benchmark.tar.gz.

Note: Benchmark tests for Osaka (compliant with EIP-7825 transaction gas limit cap) will be added in a future feature release.

Test Fixture Formats

This release includes 2 new test formats designed primarily for Hive simulators:

blockchain_tests_engine_x: An optimized version of blockchain_tests_engine where multiple tests share the same genesis state, allowing multiple tests to run on a single client instance within Hive's consume-engine. The standard format requires a fresh client startup for each test. Due to its combined genesis state, this is also the primary format used by the Nethermind team for benchmarking.
blockchain_tests_sync: A new format adjacent to the existing blockchain_tests_engine format. Used specifically for the upcoming consume-sync simulator, which delivers engine payloads from test fixtures to the client under test, then syncs a separate client to it. This test fixture is only marked to be filled for the EIP-7934 block RLP limit tests in Osaka.
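To illustrate the idea behind a shared genesis state, here is a minimal sketch that merges the pre-state of several tests into one combined allocation per fork. It is not EEST's implementation: the input shape (`id`, `fork`, `pre`) and the function name are assumptions made for this example.

```python
from collections import defaultdict


def build_shared_genesis(tests: list[dict]) -> dict[str, dict]:
    """Merge per-test pre-state into one combined allocation per fork.

    Each test is a dict with hypothetical keys:
      "id":   test identifier
      "fork": fork name the test is filled for
      "pre":  mapping of address -> account fields
    """
    groups: dict[str, dict] = defaultdict(lambda: {"alloc": {}, "tests": []})
    for test in tests:
        group = groups[test["fork"]]
        for address, account in test["pre"].items():
            existing = group["alloc"].get(address)
            # Two tests may only share a genesis if they agree on every account.
            if existing is not None and existing != account:
                raise ValueError(
                    f"conflicting pre-state for {address} in {test['id']}"
                )
            group["alloc"][address] = account
        group["tests"].append(test["id"])
    return dict(groups)


shared = build_shared_genesis(
    [
        {"id": "test_a", "fork": "Osaka", "pre": {"0x01": {"balance": 1}}},
        {"id": "test_b", "fork": "Osaka", "pre": {"0x02": {"balance": 2}}},
    ]
)
print(shared["Osaka"]["tests"])
```

With one combined allocation per group, a client only needs to be started once per group rather than once per test, which is the saving the format aims for.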

Tooling & Simulators

Improved tooling and new Hive simulators are additionally included in this release:

execute remote: this command now supports optional Engine RPC endpoints (--engine-endpoint) with JWT authentication #2070.
- This allows manual control over block creation and transaction inclusion for more deterministic test execution on live networks. Previously, execute remote could only submit transactions and rely on the network's automatic block production, but now it can actively drive chain progression by creating blocks on-demand via the Engine API.
consume sync: Adjacent to consume-engine, designed to work with the blockchain_tests_sync format for testing client sync scenarios.
execute blobs: A new Hive-specific simulator that uses the EEST execute pytest plugin. Sends blob transactions to the client under test and verifies its engine_getBlobsVX endpoint. Requires tests to be written with a new Python test format, blob_transaction_test. Primarily used to test PeerDAS from the EL perspective.
execute eth config: A command used to test the eth_config endpoint from EIP-7910. Can be run remotely or within Hive.

Filling For Stateless Clients

A witness-filler extension is included in this release, allowing tests to be filled with an executionWitness for each fixture #2066. It calls an external executable written in Rust, which must be installed before the --witness flag can be used with fill. The current approach is below:

cargo install --git https://github.com/kevaundray/reth.git --branch jsign-witness-filler witness-filler
uv run fill ... --output=fixtures-witness --witness --clean

Note: The witness-filler executable is not maintained by EEST so we cannot help with any issues.

💥 Breaking Changes

Important changes for EEST superusers

EEST now requires uv>=0.7.0 (#1904). If your version of uv is too old, update it before running EEST commands.
When filling fixtures, transition forks are now included within their respective "to" fork, so --fork Osaka will now include PragueToOsakaAtTime15k. Previously, transition forks would only be included when filling with the --from Prague --until Osaka flags.
Python 3.10 support was removed in this release (#1808).
EEST no longer allows usage of Yul code in Python tests. From now on, please make use of our opcode wrapper. Yul code is now only allowed in the "static tests" located in ./tests/static/ (these are test cases defined by JSON and YAML files instead of Python test functions that were originally maintained in ethereum/tests).
To fill the static tests (which are not filled by default), ensure that solc is located in your PATH.
The output behavior of fill has changed (#1608):
- Before: fill wrote fixtures into the directory specified by the --output flag (default: fixtures). This could have many unintended consequences, including unexpected errors if old or invalid fixtures existed in the directory (for details see #1030).
- Now: fill will exit without filling any tests if the specified directory exists and is not empty. This may be overridden by adding the --clean flag, which will first remove the specified directory.
Writing debugging information to the EVM "dump directory" by default has been disabled. To obtain debug output, the --evm-dump-dir flag must now be explicitly set. As a consequence, the now redundant --skip-evm-dump option was removed (#1874). This undoes functionality originally introduced in #999 and #1150.
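The new output-directory behavior of fill can be pictured with the following sketch. It mirrors the rule described above (refuse a non-empty directory unless --clean is given) but is not the actual EEST code; the function name and error message are assumptions.

```python
import shutil
from pathlib import Path


def prepare_output_dir(output: Path, clean: bool = False) -> Path:
    """Refuse to reuse a non-empty output directory unless `clean` is set."""
    if output.is_dir() and any(output.iterdir()):
        if not clean:
            # Corresponds to fill exiting without filling any tests.
            raise SystemExit(
                f"Output directory '{output}' is not empty; "
                "use --clean to remove it first."
            )
        # Corresponds to --clean: remove the directory before filling.
        shutil.rmtree(output)
    output.mkdir(parents=True, exist_ok=True)
    return output
```

In scripts that re-run fill against the same directory, this means either passing --clean on every invocation or choosing a fresh --output path each time.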

Feature `zkevm` updated to `benchmark`

Due to the crossover between zkevm and benchmark tests, all instances of the former have been replaced with the latter nomenclature. Repository PR labels and titles are additionally updated to reflect this change.

This update renames the zkevm feature release to benchmark and further expands the latter for 1M, 10M, 30M, 45M, 60M, 90M, and 120M block gas limits in fixtures_benchmark.tar.gz.

To select a test for a given gas limit, the IDs of the tests have been expanded to contain benchmark-gas-value_XM, where X can be any of the aforementioned values.

The benchmark release now also includes the BlockchainEngineX format, which combines most of the tests into a minimal number of genesis files. For more info see Blockchain Engine X Tests in the EEST documentation.

Users can select any of the artifacts depending on their benchmarking or testing needs for their provers.

🔑 Other Key Changes

🛠️ Framework

🔀 Refactoring

🔀 Move TransactionType enum from test file to proper module location in ethereum_test_types.transaction_types for better code organization and reusability.
✨ Opcode classes now validate keyword arguments and raise ValueError with clear error messages.
🔀 This PR removes the solc requirement to fill Python test cases. Regular test contributors no longer need to concern themselves with solc and, as such, the solc-select dependency has been removed. The remaining tests that used Yul have been ported to the EEST opcode wrapper mini-lang and the use of Yul in Python tests is no longer supported. Maintainers only: To fill the "static" JSON and YAML tests (./tests/static/) locally, solc (ideally v0.8.24) must be available in your PATH.
🔀 Updated default block gas limit from 36M to 45M to match mainnet environment.
🔀 Refactor fork logic to include transition forks within their "to" fork (#2051).
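The transition-fork rule can be pictured with a small sketch: selecting a fork also selects every transition fork that ends in it. The lookup table and function below are illustrative and do not reflect how EEST models forks internally.

```python
# Transition fork -> (from fork, to fork). An illustrative table, not EEST's fork model.
TRANSITION_FORKS = {
    "CancunToPragueAtTime15k": ("Cancun", "Prague"),
    "PragueToOsakaAtTime15k": ("Prague", "Osaka"),
}


def forks_to_fill(fork: str) -> list[str]:
    """Return the fork itself plus all transition forks whose "to" fork is `fork`."""
    selected = [fork]
    selected += [
        name for name, (_, to_fork) in TRANSITION_FORKS.items() if to_fork == fork
    ]
    return selected


print(forks_to_fill("Osaka"))
```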

`fill`

✨ Add the ported_from test marker to track Python test cases that were converted from static fillers in ethereum/tests repository (#1590).
✨ Add a new pytest plugin, ported_tests, that lists the static fillers and PRs from ported_from markers for use in the coverage Github Workflow (#1634).
✨ Enable two-phase filling of fixtures with pre-allocation groups and add a BlockchainEngineXFixture format (#1706, #1760).
✨ Add --generate-all-formats flag to enable generation of all fixture formats including BlockchainEngineXFixture in a single command; enable --generate-all-formats automatically for tarball output, --output=fixtures.tar.gz, #1855.
🔀 Refactor: Encapsulate fill's fixture output options (--output, --flat-output, --single-fixture-per-file) into a FixtureOutput class (#1471,#1612).
✨ Don't warn about a "high Transaction gas_limit" for zkevm tests (#1598).
🐞 fill no longer writes generated fixtures into an existing, non-empty output directory; it must now be empty or --clean must be used to delete it first (#1608).
🐞 zkevm-marked tests have been moved out of the tests-deployed tox environment into their own separate workflow, tests-deployed-zkevm, and are filled by evmone-t8n (#1617).
✨ Field postStateHash is now added to all blockchain_test and blockchain_test_engine tests that use exclude_full_post_state_in_output in place of postState. Fixes evmone-blockchaintest test consumption and indirectly fixes coverage runs for these tests (#1667).
🔀 Changed INVALID_DEPOSIT_EVENT_LAYOUT to a BlockException instead of a TransactionException (#1773).
🔀 Disabled writing debugging information to the EVM "dump directory" to improve performance. To obtain debug output, the --evm-dump-dir flag must now be explicitly set. As a consequence, the now redundant --skip-evm-dump option was removed (#1874).
✨ Generate unique addresses with Python for compatible static tests, instead of using hard-coded addresses from legacy static test fillers (#1781).
✨ Added support for the --benchmark-gas-values flag in the fill command, allowing a single genesis file to be used across different gas limit settings when generating fixtures. (#1895).
✨ Static tests can now specify a maximum fork up to which they should be filled (#1977).
✨ Static tests can now be filled in every format using --generate-all-formats (#2006).
💥 Flag --flat-output has been removed due to having been unneeded for an extended period of time (#2018).
✨ Add support for BlockchainEngineSyncFixture format for tests marked with pytest.mark.verify_sync to enable client synchronization testing via consume sync command (#2007).
✨ Framework is updated to include BPO (EIP-7892) fork markers to enable the filling of BPO tests (#2050).
✨ Generate and include execution witness data in blockchain fixtures if --witness is specified (#2066).

`consume`

✨ Add --extract-to parameter to consume cache command for direct fixture extraction to specified directory, replacing the need for separate download scripts. (#1861).
🐞 Fix consume cache --cache-folder parameter being ignored, now properly caches fixtures in the specified directory instead of always using the default system cache location.
🐞 Fix the consume_direct.sh script generated by consume in the --evm-dump-dir directory by quoting test IDs (#1987).
🔀 consume now automatically avoids GitHub API calls when using direct release URLs (better for CI environments), while release specifiers like stable@latest continue to use the API for version resolution (#1788).
🔀 Refactor consume simulator architecture to use explicit pytest plugin structure with forward-looking architecture (#1801).
🔀 Add exponential retry logic to initial fcu within consume engine (#1815).
✨ Add consume sync command to test client synchronization capabilities by having one client sync from another via Engine API and P2P networking (#2007).
💥 Removed the consume hive command, this was a convenience command that ran consume rlp and consume engine in one pytest session; the individual commands should now be used instead (#2008).
✨ Update the hive ruleset to include BPO (EIP-7892) forks (#2050).
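The exponential retry mentioned for the initial fcu in consume engine follows a common pattern, sketched below in generic form. The attempt count, delays, and function name are assumptions for this example and are not the values used by consume engine.

```python
import time


def retry_with_backoff(call, attempts=5, initial_delay=0.5, factor=2.0, sleep=time.sleep):
    """Call `call()` until it succeeds, multiplying the wait by `factor` after each failure."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            sleep(delay)
            delay *= factor
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the requested delays instead of actually waiting.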

`execute`

✨ Added new Blob class which can use the ckzg library to generate valid blobs at runtime (#1614).
✨ Added blob_transaction_test execute test spec, which allows tests to send blob transactions to a running client and verify its engine_getBlobsVX endpoint behavior (#1644).
✨ Added execute eth-config command to test the eth_config RPC endpoint of a client, and includes configurations by default for Mainnet, Sepolia, Holesky, and Hoodi (#1863).
✨ Command execute remote now allows specification of an Engine API endpoint to drive the chain via --engine-endpoint and either --engine-jwt-secret or --engine-jwt-secret-file. This mode is useful when there's no consensus client connected to the execution client: execute then automatically requests blocks via the Engine API when it sends transactions (#2070).
✨ Added --address-stubs flag to the execute command, which accepts a JSON-formatted string, JSON file, or YAML file containing a label-to-address mapping of specific pre-deployed contracts that already exist in the network where the tests are executed (#2073).
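A label-to-address mapping of this kind might be loaded as in the sketch below. It handles only the JSON string and JSON file cases using the standard library; the function name, the validation rules, and the example label are assumptions, and the exact schema accepted by --address-stubs may differ.

```python
import json
import re
from pathlib import Path

_ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def load_address_stubs(value: str) -> dict[str, str]:
    """Parse a label -> address mapping from an inline JSON string or a JSON file path."""
    text = value.strip()
    if not text.startswith("{"):
        # Not inline JSON: treat the value as a path to a JSON file.
        text = Path(text).read_text()
    stubs = json.loads(text)
    if not isinstance(stubs, dict):
        raise ValueError("address stubs must be a JSON object of label -> address")
    for label, address in stubs.items():
        if not isinstance(address, str) or not _ADDRESS_RE.fullmatch(address):
            raise ValueError(f"invalid address for label {label!r}: {address!r}")
    return stubs


# Hypothetical label and address, for illustration only.
print(load_address_stubs('{"MY_CONTRACT": "0x' + "ab" * 20 + '"}'))
```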

📋 Misc

✨ Add pypy3.11 support (#1854).
🔀 Use only relative imports in tests/ directory (#1848).
🔀 Convert absolute imports to relative imports in src/ directory for better code maintainability (#1907).
🔀 Misc. doc updates, including a navigation footer (#1846).
🔀 Remove Python 3.10 support (#1808).
🔀 Modernize codebase with Python 3.11 language features (#1812).
✨ Add changelog formatting validation to CI to ensure consistent punctuation in bullet points #1691.
✨ Added the EIP checklist template that serves as a reference to achieve better coverage when implementing tests for new EIPs (#1327).
✨ Added Post-Mortems of Missed Test Scenarios to the documentation, which serves as a reference list of all cases that were missed during the test implementation phase of a new EIP and includes the steps taken to prevent similar test cases from being missed in the future (#1327).
✨ Add documentation "Running Tests" that explains the different methods available to run EEST tests and reference guides for running consume and hive (#1172).
✨ Added a new eest sub-command, eest info, to easily print a cloned EEST repository's version and the versions of relevant tools, e.g., python, uv (#1621).
✨ Add CONTRIBUTING.md for execution-spec-tests and improve coding standards documentation (#1604).
✨ Add CLAUDE.md to help working in @ethereum/execution-spec-tests with Claude Code (#1749).
✨ Use codespell instead of pyspelling to spell-check python and markdown sources (#1715).
🔀 Updated from pytest 7 to pytest 8, benefits include improved type hinting and hook typing, stricter mark handling, and clearer error messages for plugin and metadata development (#1433).
🐞 Fix bug in ported-from plugin and coverage script that made PRs fail with modified tests that contained no ported tests (#1661).
🔀 Refactor the click-based CLI interface used for pytest-based commands (fill, execute, consume) to make them more extensible (#1654).
🔀 Split src/ethereum_test_types/types.py into several files to improve code organization (#1665).
✨ Added extract_config command to extract genesis files used to launch clients in hive (#1740).
✨ Added automatic checklist generation for every EIP inside of the tests folder. The checklist is appended to each EIP in the documentation in the "Test Case Reference" section (#1679, #1718).
🔀 Add macOS hive development mode workaround to the docs #1786.
🔀 Refactor and clean up of exceptions including EOF exceptions within client specific mappers #1803.
🔀 Rename tests/zkevm/ to tests/benchmark/ and replace the zkevm pytest mark with benchmark #1804.
🔀 Add fixture comparison check to optimize coverage workflow in CI (#1833).
🔀 Move TransactionType enum from test file to proper module location in ethereum_test_types.transaction_types for better code organization and reusability (#1763).
✨ Opcode classes now validate keyword arguments and raise ValueError with clear error messages (#1739, #1856).
✨ All commands (fill, consume, execute) now work without having to clone the repository, e.g. uv run --with git+https://github.com/ethereum/execution-spec-tests.git consume now works from any folder (#1863).
🔀 Move Prague to stable and Osaka to develop (#1573).
✨ Add a pytest.mark.with_all_typed_transactions marker that creates default typed transactions for each tx_type supported by the current fork (#1890).
✨ Add basic support for Amsterdam fork in order to begin testing Glamsterdam (#2069).
✨ EIP-7928: Add initial framework support for Block Level Access Lists (BAL) testing for Amsterdam (#2067).

🧪 Test Cases

✨ EIP-7951: Add additional test cases for modular comparison and initcode context (#2023, #2068).
🔀 Refactored BLOBHASH opcode context tests to use the pre_alloc plugin in order to avoid contract and EOA address collisions (#1637).
🔀 Refactored SELFDESTRUCT opcode collision tests to use the pre_alloc plugin in order to avoid contract and EOA address collisions (#1643).
✨ EIP-7594: Sanity test cases to send blob transactions and verify engine_getBlobsVX using the execute command (#1644,#1884).
🔀 Refactored EIP-145 static tests into python (#1683).
✨ EIP-7823, EIP-7883: Add test cases for ModExp precompile gas-cost updates and input limits on Osaka (#1579, #1729, #1881).
✨ EIP-7825: Add test cases for the transaction gas limit of 2^24 gas (#1711, #1882).
✨ EIP-7951: add test cases for P256VERIFY precompile to support secp256r1 curve #1670.
✨ Introduce blockchain tests for benchmark to cover the scenario of pure ether transfers #1742.
✨ EIP-7934: Add test cases for the block RLP max limit of 10MiB (#1730).
✨ EIP-7939: Add count leading zeros (CLZ) opcode tests for Osaka (#1733).
✨ EIP-7934: Add additional test cases for block RLP max limit with all typed transactions and for log-creating transactions (#1890).
✨ EIP-7825: Pre-Osaka tests have been updated to either (1) dynamically adapt to the transaction gas limit cap, or (2) reduce overall gas consumption to fit the new limit (#1924, #1928, #1980).
✨ EIP-7918: Blob base fee bounded by execution cost test cases (initial), includes some adjustments to EIP-4844 tests (#1685).
🔀 Adds the max blob transaction limit to the tests including updates to EIP-4844 for Osaka (#1884).
🐞 Fix issues when filling block rlp size limit tests with --generate-pre-alloc-groups (#1989).
✨ EIP-7928: Add test cases for Block Level Access Lists (BAL) to Amsterdam (#2067).
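Two of the Osaka rules exercised by these test cases are simple enough to state as reference arithmetic: the EIP-7825 transaction gas limit cap of 2^24 and the EIP-7939 CLZ opcode on 256-bit words. The sketch below is a plain-Python illustration and not code from the test suite; the function names are chosen for this example.

```python
# EIP-7825: a transaction's gas limit may not exceed 2^24.
TX_GAS_LIMIT_CAP = 2**24  # 16,777,216 gas


def exceeds_tx_gas_limit_cap(gas_limit: int) -> bool:
    """True if a transaction gas limit is above the EIP-7825 cap."""
    return gas_limit > TX_GAS_LIMIT_CAP


def clz(value: int) -> int:
    """Count leading zero bits of a 256-bit word (EIP-7939); clz(0) is 256."""
    if not 0 <= value < 2**256:
        raise ValueError("value must fit in 256 bits")
    return 256 - value.bit_length()


print(TX_GAS_LIMIT_CAP, exceeds_tx_gas_limit_cap(30_000_000), clz(1))
```

The 30,000,000 gas example shows why pre-Osaka tests had to be adapted: a transaction gas limit that was previously acceptable under a larger block gas limit is now above the cap.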

New Contributors

@evgenyzdanovich made their first contribution in #1717
@Gabriel-Trintinalia made their first contribution in #1944
@souradeep-das made their first contribution in #1784
@bshastry made their first contribution in #1993
@fgimenez made their first contribution in #2016

Full Changelog: v4.5.0...v5.0.0
