ethereum · marioevz · Jul 31, 2025
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
@@ -78,6 +78,7 @@ Users can select any of the artifacts depending on their testing needs for their
 - 🔀 Disabled writing debugging information to the EVM "dump directory" to improve performance. To obtain debug output, the `--evm-dump-dir` flag must now be explicitly set. As a consequence, the now redundant `--skip-evm-dump` option was removed ([#1874](https://github.com/ethereum/execution-spec-tests/pull/1874)).
 - ✨ Generate unique addresses with Python for compatible static tests, instead of using hard-coded addresses from legacy static test fillers ([#1781](https://github.com/ethereum/execution-spec-tests/pull/1781)).
 - ✨ Added support for the `--benchmark-gas-values` flag in the `fill` command, allowing a single genesis file to be used across different gas limit settings when generating fixtures. ([#1895](https://github.com/ethereum/execution-spec-tests/pull/1895)).
+- ✨ Added the `--optimize-gas` flag, which binary searches for the minimum gas limit of a transaction in a test that still yields the same test result ([#1979](https://github.com/ethereum/execution-spec-tests/pull/1979)).
 - ✨ Static tests can now specify a maximum fork where they should be filled for ([#1977](https://github.com/ethereum/execution-spec-tests/pull/1977)).
 
 #### `consume`

diff --git a/docs/navigation.md b/docs/navigation.md
@@ -20,6 +20,7 @@
       * [Code Standards](writing_tests/code_standards.md)
       * [Exception Tests](writing_tests/exception_tests.md)
       * [Using and Extending Fork Methods](writing_tests/fork_methods.md)
+      * [Gas Optimization](writing_tests/gas_optimization.md)
       * [Referencing an EIP Spec Version](writing_tests/reference_specification.md)
       * [EIP Checklist Generation](writing_tests/eip_checklist.md)
       * [Testing Checklist Templates](writing_tests/checklist_templates/index.md)

diff --git a/docs/writing_tests/gas_optimization.md b/docs/writing_tests/gas_optimization.md
@@ -0,0 +1,93 @@
+# Gas Optimization
+
+The `--optimize-gas` feature helps find the minimum gas limit required for transactions to execute correctly while maintaining the same execution trace and post-state. This is useful for creating more efficient test cases and understanding the exact gas requirements of specific operations.
+
+## Basic Usage
+
+Enable gas optimization for all tests:
+
+```bash
+uv run fill --optimize-gas
+```
+
+## Output Configuration
+
+Specify a custom output file for gas optimization results:
+
+```bash
+uv run fill --optimize-gas --optimize-gas-output=my_gas_results.json path/to/some/test/to/optimize
+```
+
+## Post-Processing Mode
+
+Enable post-processing to handle opcodes that push the current gas onto the stack (such as the `GAS` opcode):
+
+```bash
+uv run fill --optimize-gas --optimize-gas-post-processing
+```
+
+## How It Works
+
+The gas optimization algorithm uses a binary search approach:
+
+1. **Initial Validation**: First tries reducing the gas limit by 1 to check whether even a minimal change affects the execution trace
+2. **Binary Search**: Uses binary search between 0 and the original gas limit to find the minimum viable gas limit
+3. **Verification**: For each candidate gas limit, it verifies:
+   - Execution traces are equivalent (with optional post-processing)
+   - Post-state allocation matches the expected result
+   - Transaction validation passes
+   - Account states remain consistent
+4. **Result**: Outputs the minimum gas limit that still produces correct execution
+
+## Output Format
+
+The optimization results are saved to a JSON file (default: `optimize-gas-output.json`) containing:
+
+- Test identifiers as keys of the JSON object
+- The optimized gas limit as each value, or `null` if the optimization failed
+
+## Use Cases
+
+- **Test Efficiency**: Create tests with minimal gas requirements
+- **Gas Analysis**: Understand exact gas costs for specific operations
+- **Regression Testing**: Ensure gas optimizations don't break test correctness
+- **Performance Testing**: Benchmark gas usage across different scenarios
+
+## Limitations
+
+- Only works with state tests (not blockchain tests)
+- Requires trace collection to be enabled
+- May significantly increase test execution time due to multiple trial runs
+- Some tests may not be optimizable if they require the exact original gas limit
+
+## Integration with Test Writing
+
+When writing tests, you can use gas optimization to:
+
+1. **Optimize Existing Tests**: Run `--optimize-gas` on your test suite to find more efficient gas limits
+2. **Validate Gas Requirements**: Ensure your tests use the minimum necessary gas
+3. **Create Efficient Test Cases**: Use the optimized gas limits in your test specifications
+4. **Benchmark Changes**: Compare gas usage before and after modifications
+
+## Example Workflow
+
+```bash
+# 1. Write your test
+# 2. Run with gas optimization
+uv run fill --optimize-gas --optimize-gas-output=optimization_results.json
+
+# 3. Review the results
+cat optimization_results.json
+
+# 4. Update your test with optimized gas limits if desired
+# 5. Re-run to verify correctness
+uv run fill
+```
+
+## Best Practices
+
+### Leave a Buffer for Future Forks
+
+When using the optimized gas limits in your tests, add a small buffer (typically 5-10%) above the exact value output by the gas optimization. This accounts for gas cost changes in future Ethereum forks that might increase the gas requirements of the same operations.
+
+For example, if the optimization outputs a gas limit of 100,000, consider using 105,000 or 110,000 in your test specification to ensure compatibility with future protocol changes.
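
Note on `docs/writing_tests/gas_optimization.md`: the search outlined under "How It Works" can be sketched as follows. This is a minimal illustration, not the code that `fill` runs. `find_min_gas_limit` and `same_result` are hypothetical names; `same_result(gas)` stands in for re-running the test at a candidate gas limit and comparing traces and post-state. The sketch assumes the outcome is monotonic in the gas limit (if a limit works, every higher limit up to the original also works), and treating a failed first check as "keep the original limit" is likewise an assumption.

```python
from typing import Callable


def find_min_gas_limit(original: int, same_result: Callable[[int], bool]) -> int:
    """Return the smallest gas limit in [0, original] for which same_result holds."""
    # Initial validation (assumed handling): if removing a single unit of gas
    # already changes the outcome, keep the original limit.
    if original == 0 or not same_result(original - 1):
        return original

    # Invariant: same_result(high) is True, and the answer lies in [low, high].
    low, high = 0, original - 1
    while low < high:
        mid = (low + high) // 2
        if same_result(mid):
            high = mid  # mid still works, so look for something lower
        else:
            low = mid + 1  # mid is too low, so the minimum is above it
    return high


# Toy predicate: pretend the transaction needs at least 21_000 gas.
print(find_min_gas_limit(100_000, lambda gas: gas >= 21_000))  # prints 21000
```

Each probe corresponds to one extra run of the test, so the number of re-runs grows with the logarithm of the original gas limit, which is consistent with the longer fill times listed under "Limitations".
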
diff --git a/docs/writing_tests/index.md b/docs/writing_tests/index.md
@@ -25,6 +25,7 @@ For help deciding which test format to select, see [Types of Tests](./types_of_t
 - [Adding a New Test](./adding_a_new_test.md) - Step-by-step guide to adding new tests
 - [Writing a New Test](./writing_a_new_test.md) - Detailed guide on writing different test types
 - [Using and Extending Fork Methods](./fork_methods.md) - How to use fork methods to write fork-adaptive tests
+- [Gas Optimization](./gas_optimization.md) - Optimize gas limits in your tests for efficiency and compatibility with future forks.
 - [Porting tests](./porting_legacy_tests.md): A guide to porting @ethereum/tests to EEST.
 
 Please check that your code adheres to the repo's coding standards and read the other pages in this section for more background and an explanation of how to implement state transition and blockchain tests.
diff --git a/pyproject.toml b/pyproject.toml
@@ -107,6 +107,7 @@ fillerconvert = "cli.fillerconvert.fillerconvert:main"
 groupstats = "cli.show_pre_alloc_group_stats:main"
 extract_config = "cli.extract_config:extract_config"
 compare_fixtures = "cli.compare_fixtures:main"
+modify_static_test_gas_limits = "cli.modify_static_test_gas_limits:main"
 
 [tool.setuptools.packages.find]
 where = ["src"]

diff --git a/src/cli/modify_static_test_gas_limits.py b/src/cli/modify_static_test_gas_limits.py
@@ -0,0 +1,225 @@
+"""
+Command to scan the static tests and overwrite their gas limits with the new optimized values
+given in the input file.
+"""
+
+import json
+import re
+from pathlib import Path
+from typing import Dict, List, Set
+
+import click
+import yaml
+
+from ethereum_test_base_types import EthereumTestRootModel, HexNumber, ZeroPaddedHexNumber
+from ethereum_test_specs import StateStaticTest
+from pytest_plugins.filler.static_filler import NoIntResolver
+
+
+class GasLimitDict(EthereumTestRootModel):
+    """Formatted JSON file with new gas limits in each test."""
+
+    root: Dict[str, int | None]
+
+    def unique_files(self) -> Set[Path]:
+        """Return the set of unique test files."""
+        files = set()
+        for test in self.root:
+            filename, _ = test.split("::")
+            files.add(Path(filename))
+        return files
+
+    def get_tests_by_file_path(self, file: Path | str) -> Set[str]:
+        """Return the set of all tests that belong to a given file path."""
+        tests = set()
+        for test in self.root:
+            current_file, _ = test.split("::")
+            if current_file == str(file):
+                tests.add(test)
+        return tests
+
+
+class StaticTestFile(EthereumTestRootModel):
+    """A static test file."""
+
+    root: Dict[str, StateStaticTest]
+
+
+def _check_fixtures(*, input_path: Path, max_gas_limit: int | None, dry_run: bool, verbose: bool):
+    """Rewrite the gas limits of the static test files listed in the input JSON file."""
+    # Load the test dictionary from the input JSON file
+    test_dict = GasLimitDict.model_validate_json(input_path.read_text())
+
+    # Iterate through each unique test file that needs modification
+    for test_file in test_dict.unique_files():
+        tests = test_dict.get_tests_by_file_path(test_file)
+        test_file_contents = test_file.read_text()
+
+        # Parse the test file based on its format (YAML or JSON)
+        if test_file.suffix in (".yml", ".yaml"):
+            loaded_yaml = yaml.load(test_file_contents, Loader=NoIntResolver)
+            try:
+                parsed_test_file = StaticTestFile.model_validate(loaded_yaml)
+            except Exception as e:
+                raise Exception(
+                    f"Unable to parse file {test_file}: {json.dumps(loaded_yaml, indent=2)}"
+                ) from e
+        else:
+            parsed_test_file = StaticTestFile.model_validate_json(test_file_contents)
+
+        # Validate that the file contains exactly one test
+        assert len(parsed_test_file.root) == 1, (
+            f"File {test_file} does not contain exactly one test."
+        )
+        _, parsed_test = parsed_test_file.root.popitem()
+
+        # Skip files whose transaction does not have exactly one gas limit value
+        if len(parsed_test.transaction.gas_limit) != 1:
+            if dry_run or verbose:
+                print(
+                    f"Test file {test_file} does not contain exactly one gas limit value, "
+                    "skipping."
+                )
+            continue
+
+        # Get the current gas limit and check if modification is needed
+        current_gas_limit = int(parsed_test.transaction.gas_limit[0])
+        if max_gas_limit is not None and current_gas_limit <= max_gas_limit:
+            # Nothing to do, finished
+            for test in tests:
+                test_dict.root.pop(test)
+            continue
+
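+        # Collect the gas values for this test file, skipping the whole file if any test
+        # has no optimized value (otherwise `max` could run on an empty or partial list)
+        gas_values: List[int] = []
+        cannot_update = False
+        for gas_value in [test_dict.root[test] for test in tests]:
+            if gas_value is None:
+                cannot_update = True
+                break
+            gas_values.append(gas_value)
+        if cannot_update:
+            if dry_run or verbose:
+                print(
+                    f"Test file {test_file} contains at least one test that cannot "
+                    "be updated, skipping."
+                )
+            continue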
+
+        # Calculate the new gas limit: the next multiple of 100,000 strictly above the exact
+        # value (an exact multiple is also bumped by a full 100,000)
+        new_gas_limit = max(gas_values)
+        modified_new_gas_limit = ((new_gas_limit // 100000) + 1) * 100000
+        if verbose:
+            print(
+                f"Changing exact new gas limit ({new_gas_limit}) to "
+                f"rounded ({modified_new_gas_limit})"
+            )
+        new_gas_limit = modified_new_gas_limit
+
+        # Check if the new gas limit exceeds the maximum allowed
+        if max_gas_limit is not None and new_gas_limit > max_gas_limit:
+            if dry_run or verbose:
+                print(f"New gas limit ({new_gas_limit}) exceeds max ({max_gas_limit})")
+            continue
+
+        if dry_run or verbose:
+            print(f"Test file {test_file} requires modification ({new_gas_limit})")
+
+        # Find the appropriate pattern to replace the current gas limit
+        potential_types = [int, HexNumber, ZeroPaddedHexNumber]
+        substitute_pattern = None
+        substitute_string = None
+
+        attempted_patterns = []
+
+        for current_type in potential_types:
+            potential_substitute_pattern = rf"\b{current_type(current_gas_limit)}\b"
+            potential_substitute_string = f"{current_type(new_gas_limit)}"
+            if (
+                re.search(
+                    potential_substitute_pattern, test_file_contents, flags=re.RegexFlag.MULTILINE
+                )
+                is not None
+            ):
+                substitute_pattern = potential_substitute_pattern
+                substitute_string = potential_substitute_string
+                break
+
+            attempted_patterns.append(potential_substitute_pattern)
+
+        # Validate that a replacement pattern was found
+        assert substitute_pattern is not None, (
+            f"Current gas limit ({attempted_patterns}) not found in {test_file}"
+        )
+        assert substitute_string is not None
+
+        # Perform the replacement in the test file content
+        new_test_file_contents = re.sub(substitute_pattern, substitute_string, test_file_contents)
+
+        assert test_file_contents != new_test_file_contents, "Could not modify test file"
+
+        # Skip writing changes if this is a dry run
+        if dry_run:
+            continue
+
+        # Write the modified content back to the test file
+        test_file.write_text(new_test_file_contents)
+        for test in tests:
+            test_dict.root.pop(test)
+
+    if dry_run:
+        return
+
+    # Write changes to the input file
+    input_path.write_text(test_dict.model_dump_json(indent=2))
+
+
+MAX_GAS_LIMIT = 16_777_216
+
+
+@click.command()
+@click.option(
+    "--input",
+    "-i",
+    "input_str",
+    type=click.Path(exists=True, file_okay=True, dir_okay=False, readable=True),
+    required=True,
+    help="The input JSON file listing the new gas limits for the static test files.",
+)
+@click.option(
+    "--max-gas-limit",
+    default=MAX_GAS_LIMIT,
+    expose_value=True,
+    help="Gas limit that triggers a test modification, and also the maximum value that a test "
+    "should have after modification.",
+)
+@click.option(
+    "--dry-run",
+    "-d",
+    "dry_run",
+    is_flag=True,
+    default=False,
+    expose_value=True,
+    help="Don't modify any files, simply print operations to be performed.",
+)
+@click.option(
+    "--verbose",
+    "-v",
+    "verbose",
+    is_flag=True,
+    default=False,
+    expose_value=True,
+    help="Print extra information.",
+)
+def main(input_str: str, max_gas_limit: int, dry_run: bool, verbose: bool):
+    """Overwrite the gas limits of static tests with the optimized values in the input file."""
+    input_path = Path(input_str)
+    if not dry_run:
+        # Always dry-run first before actually modifying
+        _check_fixtures(
+            input_path=input_path,
+            max_gas_limit=max_gas_limit,
+            dry_run=True,
+            verbose=False,
+        )
+    _check_fixtures(
+        input_path=input_path,
+        max_gas_limit=max_gas_limit,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
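
Note on `src/cli/modify_static_test_gas_limits.py`: the rounding and in-place replacement steps can be tried in isolation with the sketch below. It is a simplified illustration under stated assumptions, not the module itself. `round_up_gas_limit` and `replace_gas_limit` are hypothetical helper names, and `str` and `hex` stand in for the `int`, `HexNumber` and `ZeroPaddedHexNumber` renderings, whose exact string forms are not shown in this diff.

```python
import re


def round_up_gas_limit(gas_limit: int, step: int = 100_000) -> int:
    """Mirror the script's rounding: move to the next multiple of step above gas_limit."""
    return ((gas_limit // step) + 1) * step


def replace_gas_limit(contents: str, current: int, new: int) -> str:
    """Replace `current` with `new` using the first textual form found in `contents`."""
    # Assumption: decimal and plain hex are the only renderings tried here.
    for render in (str, hex):
        pattern = rf"\b{render(current)}\b"
        if re.search(pattern, contents) is not None:
            # re.sub rewrites every match, not only the gas limit field.
            return re.sub(pattern, render(new), contents)
    raise ValueError(f"gas limit {current} not found")


exact = 123_456
rounded = round_up_gas_limit(exact)
print(rounded)  # prints 200000
print(replace_gas_limit("gasLimit: [400000]", 400_000, rounded))  # prints gasLimit: [200000]
```

Two properties are worth checking in review. An exact multiple such as 100,000 is rounded to 200,000, not left unchanged. And because the substitution is purely textual, a file that repeats the same number outside the gas limit field would have that occurrence rewritten as well.
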
diff --git a/src/ethereum_clis/__init__.py b/src/ethereum_clis/__init__.py
@@ -13,6 +13,7 @@
 from .types import (
     BlockExceptionWithMessage,
     Result,
+    Traces,
     TransactionExceptionWithMessage,
     TransitionToolOutput,
 )
@@ -35,6 +36,7 @@
     "NethtestFixtureConsumer",
     "NimbusTransitionTool",
     "Result",
+    "Traces",
     "TransactionExceptionWithMessage",
     "TransitionTool",
     "TransitionToolOutput",