feat(tests): enhance eip7883 test coverage #1929

LouisTsai-Csie · 2025-07-21T06:28:36Z

🗒️ Description

EIP-7883: ModExp Gas Cost Increase

This EIP changes the gas cost, so it falls under this category in the checklist. While this category is meant specifically for gas cost changes, I’ve added a broader set of scenarios, similar to what would be done for a new precompile, since the existing tests from Byzantium are incomplete.

Call contexts

CALL / DELEGATECALL / STATICCALL / CALLCODE
Transaction Entry-point
Initcode call: IMO this is unnecessary. We still need a *CALL opcode to trigger the precompile from CREATE/CREATE2 initcode, so it is exactly the same scenario as the call-context tests.
Precompile as Set-code Delegated Address: this is covered in the eip-7702 tests, not in the eip-7883 tests.

Inputs

precompile/test/inputs/valid
precompile/test/inputs/valid/boundary
precompile/test/inputs/valid/crypto
precompile/test/inputs/all_zeros
precompile/test/inputs/max_values
precompile/test/inputs/invalid
precompile/test/inputs/invalid/crypto
precompile/test/inputs/invalid/corrupted

Value Transfer

Minimum Fee Precompile
No-Fee Precompile: Not needed, as ModExp is not a no-fee precompile.

Out-of-bounds checks

precompile/test/out_of_bounds/max
precompile/test/out_of_bounds/max_plus_one

Input Lengths

Zero-length Input
Static Required Input Length: The input length is dynamic
Dynamic Required Input Length

Gas usage

Constant Gas Cost: We do not need this one, as the gas cost is dynamic
Variable Gas Cost: See analysis below
Excessive Gas: Already have this in benchmark test

Fork transition

Pre-fork Block Call
Cold/Warm Precompile Address State: We do not need this one as the Modexp is not a new precompile.

🔗 Related Issues or PRs

Issue #1791, #1790, #1971

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx --with=tox-uv tox -e lint,typecheck,spellcheck,markdownlint
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered adding an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

LouisTsai-Csie · 2025-07-22T14:45:27Z

tests/osaka/eip7883_modexp_gas_increase/conftest.py

+            code += Op.RETURNDATACOPY(0, i * 32, 32)
+            code += Op.SSTORE(
+                call_contract_post_storage.store_next(modexp_expected[i * 32 : (i + 1) * 32]),
+                Op.MLOAD(0),


I tried to use a hash-based method (see the comment) here for simplicity but failed, so I simply store all the data in storage for comparison. I still prefer the hash-based method, as it reduces the number of SSTORE operations.

I tried to do this today but had issues with guido-4 and geth-fail vectors. I then realised we have a bug in the chunking method for cases where len(modexp_expected) // 32 = 0. For guido-4 this equates to 8 // 32 = 0. So then we get for i in range(0), meaning we don't store the result for guido-4.

We should try and get the hash method working nonetheless!

I’ve updated the implementation to a hash-based method, but some issues have come up.

From EIP-198:

Consumes floor(mult_complexity(max(length_of_MODULUS, length_of_BASE)) * max(ADJUSTED_EXPONENT_LENGTH, 1) / GQUADDIVISOR) gas, and if there is enough gas, returns (BASE**EXPONENT) % MODULUS as a byte array with the same length as the modulus.

Given a ModExp input where (base, exponent, modulus) yields the value 0x01 and the modulus length is 4, what should the output be? Should it be 0x00000001 (left-padded) or 0x01000000 (right-padded)?

Referring to Mario’s vector update (check vector.json change in this PR), I initially thought it should be the latter.

However, after reviewing the relevant EIPs, such as eip-198, eip-2565, eip-7883, and eip-7823, and implementing the hashing comparison, I now think the correct format might be the former (left-padded).

The previous test cases did not fail because, as you mentioned, for some cases the length was smaller than 32, so len(modexp_expected) // 32 evaluated to 0, resulting in no loop execution. Even if the calculated result was non-zero, we should be checking the range from 0 to len(modexp_expected) // 32 + 1. Otherwise, we miss trailing bytes.
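To make the chunking issue concrete, here is a minimal pure-Python sketch (it does not use the test framework's bytecode builder, and `chunk_count_floor`, `chunk_count_ceil` and `split_chunks` are hypothetical names). It shows that floor division yields zero chunks for an 8-byte result, while ceiling division covers the trailing bytes.

```python
WORD = 32  # EVM word size in bytes


def chunk_count_floor(length: int) -> int:
    """Chunk count as in the original loop: drops any trailing partial word."""
    return length // WORD


def chunk_count_ceil(length: int) -> int:
    """Ceiling division: a trailing partial word counts as one more chunk."""
    return (length + WORD - 1) // WORD


def split_chunks(data: bytes) -> list[bytes]:
    """Split data into 32-byte chunks; the last chunk may be shorter."""
    return [data[i * WORD:(i + 1) * WORD] for i in range(chunk_count_ceil(len(data)))]


expected = bytes.fromhex("0000000000000001")  # 8-byte output, as for guido-4-even
print(chunk_count_floor(len(expected)))  # 0
print(chunk_count_ceil(len(expected)))   # 1
print(split_chunks(expected))
```

Note that `len // 32 + 1` also covers the tail, but it adds an empty chunk whenever the length is an exact multiple of 32; ceiling division avoids that.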

Short summary: the legacy approach never fully validated the output format. After switching to the updated verification method, I think these vectors might have an issue:

geth-fail-length

Input: 000000000000000000000000000000000000000000000000000000000000000000 Output: 000000000000000000000000000000000000000000000000000000000000000001

base_len = 0 → base = 0

exp_len = 34 → exponent = 34 zero bytes (value 0)

mod_len = 33 → modulus = 0x...0002 (value 2)
Per EIP-198, exponent = 0 and modulus > 1 → result is 1, left-padded to modulus length.
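As a sanity check on the padding direction, here is a minimal sketch that assumes the EIP-198 wording quoted above (the result is returned as a big-endian byte array with the same length as the modulus) and that a zero modulus yields zero. `modexp_output` is a hypothetical helper, not part of the test framework.

```python
def modexp_output(base: int, exponent: int, modulus: int, mod_len: int) -> bytes:
    """Return (base ** exponent) % modulus as a big-endian array of mod_len bytes."""
    # Assumption: a zero modulus yields an all-zero output of the requested length.
    result = 0 if modulus == 0 else pow(base, exponent, modulus)
    # Big-endian encoding puts the padding zeros on the left.
    return result.to_bytes(mod_len, "big")


# geth-fail-length: empty base (0), zero exponent, modulus 2, mod_len 33.
out = modexp_output(0, 0, 2, 33)
print(out.hex())

# A result of 1 with a 4-byte modulus is left-padded.
print(modexp_output(3, 0, 5, 4).hex())  # 00000001
```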

guido-4-even

Input: 0001000000000000 Output: 0000000000000001

The result might need to be left-padded, not right-padded.

I updated the implementation and then ran the consume engine command on the server; it now works fine.

Amazing catch! I struggled to find the issue last night. Could you please update the vectors?
It would be good to run these against the clients!

So we never properly tested the expectation of these cases, from what I can see:

marius-1-even - 12 bytes expected

guido-1-even - 16 bytes expected

guido-2-even - 16 bytes expected

guido-4-even - 8 bytes expected

marcin-1-exp-heavy - 8 bytes expected

marcin-2-exp-heavy - 16 bytes expected

marcin-3-exp-heavy - 24 bytes expected

Only the call, the gas calculation and the return length!

I updated the implementation and then ran the consume engine command on the server; it now works fine.

Nice! Thanks for taking care of this

I've already updated the test vectors. My current understanding is also that we never really tested the actual return data! For geth-fail-length this is critical, as the provided output is itself incorrect.

LouisTsai-Csie · 2025-07-22T14:48:45Z

tests/osaka/eip7883_modexp_gas_increase/test_modexp_thresholds_transition.py

+@pytest.mark.parametrize(
+    "modexp_input,modexp_expected,gas_old,gas_new",
+    [
+        pytest.param(Spec.modexp_input, Spec.modexp_expected, 200, 500),  # Should be 1200


This test case should cost 1200 gas after Osaka is activated, but it only costs 500. Still looking into the root cause here.

This one is "marcin-1-balanced"; you can find it in the vectors.json file.

I spent the entire afternoon debugging this issue without understanding what went wrong, so I decided to write down every step I took: traces, opcode sequences, the LLM conversation history, and all the tricks I tried.
It turned into a lengthy and tedious comment. But after finishing it, I realized the root cause was simply that I forgot to pass the calldata to the transaction. So, it’s fixed now.

LouisTsai-Csie · 2025-07-22T14:50:08Z

tests/osaka/eip7883_modexp_gas_increase/test_modexp_thresholds_transition.py

+
+    senders = [pre.fund_eoa() for _ in range(3)]
+    contracts = [pre.deploy_contract(code) for _ in range(3)]
+    timestamps = [14_999, 15_000, 15_001]


To be honest, I do not know why these timestamp values are 14_999 to 15_001. I used this trick because it is how Spencer tested the CLZ opcode.

All transition forks that are defined in our tests have a transition time hard-coded to 15k seconds.
So on 14,999 we should still see Prague rules, and Osaka rules in the rest.
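The timestamp choice can be sketched as follows. This is a minimal illustration of the rule described above, not the framework's implementation; `TRANSITION_TIMESTAMP` and `active_fork` are hypothetical names.

```python
TRANSITION_TIMESTAMP = 15_000  # transition time used by the transition forks, per the comment above


def active_fork(timestamp: int) -> str:
    """Return the fork whose rules apply to a block with the given timestamp."""
    return "Prague" if timestamp < TRANSITION_TIMESTAMP else "Osaka"


timestamps = [14_999, 15_000, 15_001]
print([active_fork(t) for t in timestamps])  # ['Prague', 'Osaka', 'Osaka']
```

The three timestamps therefore give one block on the old gas rules and two blocks on the new ones, straddling the boundary on both sides.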

tests/osaka/eip7883_modexp_gas_increase/spec.py

tests/osaka/eip7883_modexp_gas_increase/conftest.py

LouisTsai-Csie · 2025-07-23T08:51:11Z

tests/osaka/eip7883_modexp_gas_increase/test_modexp_thresholds.py

+            exponent="FF" * (Spec.MAX_LENGTH_BYTES + 1),
+            modulus="FF" * (Spec.MAX_LENGTH_BYTES + 1),
+            case_id="all-too-long",
+        ),


For these test cases, the input exceeds the boundary for the ModExp precompile. However, it’s unclear whether the failure is due to exceeding the transaction gas limit or violating the input boundary constraints.

We should double check this and do the actual calculation of the gas required for this operation. It might even be higher than the tx gas limit cap introduced in Osaka, in which case the test might be unnecessary.

I tested with different input values and evaluated the gas cost. Based on the results, I removed some of the test cases.

If the exponent length exceeds the limit, it’s highly unlikely that the base or modulus could also exceed it. In such cases, the total gas cost will exceed the transaction gas cap.

For combinations like (base, exponent), (exponent, modulus), or (base, exponent, modulus) that exceed the limit, the gas cost is greater than 500 million.

However, I did not evaluate this using fuzzing or formal verification, so there might be some corner case that supports such a combination. I am wondering how I can prove this property.
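One way to approach this without fuzzing is a monotonicity argument. The sketch below assumes the EIP-7883 pseudocode as I read it: a multiplication complexity of $2\lceil \max(\ell_b, \ell_m)/8 \rceil^2$ for lengths above 32 bytes, and an iteration count of at least $16(\ell_e - 32) - 1$ when $\ell_e > 32$. Both terms are non-decreasing in the lengths, so the smallest over-limit lengths (1025 bytes) give a lower bound.

```latex
\begin{aligned}
\ell_e \ge 1025 &\;\Rightarrow\; \text{iterations} \ge 16\,(1025 - 32) - 1 = 15887,\\
\max(\ell_b, \ell_m) \ge 1025 &\;\Rightarrow\; \text{complexity} \ge 2\left\lceil \tfrac{1025}{8} \right\rceil^{2} = 2 \cdot 129^{2} = 33282,\\
\text{gas} &\;\ge\; 33282 \cdot 15887 = 528{,}751{,}134.
\end{aligned}
```

Under these assumptions, any input in which the exponent and at least one of base or modulus exceed 1024 bytes costs more than 500 million gas regardless of the exponent's value, which matches the measurements above. An over-long exponent combined with a base and modulus of at most 32 bytes gives a much smaller bound (16 · 15887 = 254,192), so that case is not ruled out by gas alone.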

LouisTsai-Csie · 2025-07-31T07:13:41Z

tests/osaka/eip7883_modexp_gas_increase/test_modexp_thresholds.py

+    # Test case coverage table:
+    # ┌─────┬──────┬─────┬──────┬───────┬─────────┬─────────────────────────────────────────────┐
+    # │ ID  │ Comp │ Rel │ Iter │ Clamp │   Gas   │ Description                                 │
+    # ├─────┼──────┼─────┼──────┼───────┼─────────┼─────────────────────────────────────────────┤
+    # │ Z0  │  -   │  -  │  -   │  -    │   500   │ Zero case - empty inputs                    │
+    # │ S0  │  S   │  =  │  A   │ True  │   500   │ Small, equal, zero exponent, clamped        │
+    # │ S1  │  S   │  =  │  B   │ True  │   500   │ Small, equal, small exp, clamped            │
+    # │ S2  │  S   │  =  │  B   │ False │  4080   │ Small, equal, large exp, unclamped          │
+    # │ S3  │  S   │  =  │  C   │ False │  2032   │ Small, equal, large exp+zero low256         │
+    # │ S4  │  S   │  =  │  D   │ False │  2048   │ Small, equal, large exp+non-zero low256     │
+    # │ S5  │  S   │  >  │  A   │ True  │   500   │ Small, base>mod, zero exp, clamped          │
+    # │ S6  │  S   │  <  │  B   │ True  │   500   │ Small, base<mod, small exp, clamped         │
+    # │ L0  │  L   │  =  │  A   │ True  │   500   │ Large, equal, zero exp, clamped             │
+    # │ L1  │  L   │  =  │  B   │ False │ 12750   │ Large, equal, large exp, unclamped          │
+    # │ L2  │  L   │  =  │  C   │ False │  6350   │ Large, equal, large exp+zero low256         │
+    # │ L3  │  L   │  =  │  D   │ False │  6400   │ Large, equal, large exp+non-zero low256     │
+    # │ L4  │  L   │  >  │  B   │ True  │   500   │ Large, base>mod, small exp, clamped         │
+    # │ L5  │  L   │  <  │  C   │ False │  9144   │ Large, base<mod, large exp+zero low256      │
+    # └─────┴──────┴─────┴──────┴───────┴─────────┴─────────────────────────────────────────────┘


To verify the table, you can check this script, compare it to eip-7883, and use it to generate the test cases as well as the labels: https://gist.github.com/1c8fd82ac1e75e9cfd1c79e5a2f5fbe6.git

I was inspired by this paper: https://arxiv.org/pdf/2504.12034
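For readers without access to the script, here is a minimal sketch of the gas calculation, assuming the EIP-7883 pseudocode as I read it (minimum gas 500, complexity 16 up to 32 bytes, multiplier 16 for exponents longer than 32 bytes). The function names and the example inputs are illustrative assumptions chosen to reproduce the Gas column; they are not necessarily the parameters used in the PR.

```python
import math


def multiplication_complexity(base_len: int, mod_len: int) -> int:
    """16 for operands up to 32 bytes, otherwise 2 * words**2 (8-byte words)."""
    max_len = max(base_len, mod_len)
    words = math.ceil(max_len / 8)
    return 16 if max_len <= 32 else 2 * words**2


def iteration_count(exp_len: int, exponent: int) -> int:
    """Iteration count; only the low 256 bits matter for long exponents."""
    if exp_len <= 32:
        count = exponent.bit_length() - 1 if exponent else 0
    else:
        low256 = exponent & (2**256 - 1)
        count = 16 * (exp_len - 32) + (low256.bit_length() - 1)
    return max(count, 1)


def modexp_gas(base_len: int, mod_len: int, exp_len: int, exponent: int) -> int:
    """Total gas, clamped from below at 500."""
    cost = multiplication_complexity(base_len, mod_len) * iteration_count(exp_len, exponent)
    return max(500, cost)


# Assumed inputs that reproduce some rows of the table.
print(modexp_gas(0, 0, 0, 0))                    # Z0: 500
print(modexp_gas(8, 8, 32, 2**256 - 1))          # S2: 4080
print(modexp_gas(8, 8, 40, 1 << 256))            # S3: 2032
print(modexp_gas(8, 8, 40, (1 << 256) | 1))      # S4: 2048
print(modexp_gas(40, 40, 32, 2**256 - 1))        # L1: 12750
print(modexp_gas(40, 48, 40, 1 << 256))          # L5: 9144
```

The Clamp column corresponds to the cases where the `max(500, ...)` floor is what determines the result.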

LouisTsai-Csie · 2025-08-04T08:06:16Z

tests/byzantium/eip198_modexp_precompile/helpers.py

@@ -73,6 +73,8 @@ def from_bytes(cls, input_data: Bytes | str) -> "ModExpInput":

        modulus = input_data[current_index : current_index + modulus_length]

+        modulus = modulus.ljust(min(1024, modulus_length), b"\x00")


This change is necessary but requires additional refactoring.

For case 4 in modexpFiller.json, the test description is:

4 - Would also parse as a base of 3, an exponent of 65535, and a modulus of 2**255. It attempts to read 32 bytes for the modulus starting from 0x80, but since there’s no further data, it right-pads the result with 31 zero bytes.

Although the operation modulus = input_data[current_index : current_index + modulus_length] is technically valid, Python silently truncates a slice that runs past the end of the data instead of padding it, so we need to right-pad the result with \x00 ourselves, which is what ljust does.

Example: https://onecompiler.com/python/43smfe6dm

Regarding the 1024-byte length restriction: in case 2, the exponent length is 2**256 - 1, which can cause current_index + modulus_length to overflow, so I added this condition to prevent it. It aligns with the upper bound in EIP-7823, although no such restriction exists in EIP-198.
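A quick check of how slicing and `ljust` behave here, as a minimal sketch. `read_field` is a hypothetical stand-in for the parsing step in `from_bytes`, and the single-byte input mimics case 4, where only the first modulus byte (0x80) is present.

```python
def read_field(data: bytes, start: int, length: int, cap: int = 1024) -> bytes:
    """Read `length` bytes at `start`, zero-filling on the right if data runs out."""
    chunk = data[start:start + length]  # slicing past the end truncates, it does not pad
    # ljust appends zero bytes on the right; the cap keeps huge declared lengths bounded.
    return chunk.ljust(min(cap, length), b"\x00")


data = bytes.fromhex("80")
print(len(data[0:32]))  # 1

modulus = read_field(data, 0, 32)
print(len(modulus))                               # 32
print(int.from_bytes(modulus, "big") == 2**255)   # True
```

With a declared length of 2**256 - 1 and no data, the same helper returns 1024 zero bytes rather than attempting to build an enormous buffer.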

An idea for refactoring ModexpInput: we should consider passing raw bytes to the test instead of the ModexpInput data type, which would be much more flexible for unusual testing scenarios.

LouisTsai-Csie · 2025-08-04T08:27:31Z

I didn’t create a separate PR for issue #1971, as it depends on the infrastructure introduced in this PR.

There are 37 test cases in the legacy modexpFiller.json test. Most of them have been ported, but the following cases are still pending and require further analysis:

Case 2: Would parse a base length of 0, a modulus length of 32, and an exponent length of 2**256 - 1, where the base is empty, the modulus is 2**256 - 2 and the exponent is (2**256 - 3) * 256**(2**256 - 33) (yes, that's a really big number). It would then immediately fail, as it's not possible to provide enough gas to make that computation.
Case 28: base length 4TiB
Case 29: exp length 4TiB; returns 0 because mod is zero
Case 30: base and mod have zero-length. exp's length is 2^255. Since mod is zero, the result should be zero.
Case 36: the input found on 10 Oct. 2017 that overflows the gas calculation
Case 37: input found in July 2022, overflows the gas calculation

Based on their descriptions, cases 2, 28, 29 and 30 will fail because eip7823 introduces an upper bound for each field; we already have similar tests in test_modexp_invalid_inputs.
For cases 36 and 37, I am checking whether they are still valid after Fusaka.

spencer-tb

Thanks for working on this. Just some small comments for now.

Could you double check my understanding for my comment here:
#1929 (comment)

tests/osaka/eip7883_modexp_gas_increase/conftest.py

spencer-tb · 2025-08-11T12:28:52Z

tests/osaka/eip7883_modexp_gas_increase/vector/legacy.json

@@ -0,0 +1,162 @@
+[


Nice! I had no idea we had these vectors

This is from here, could you help check if there is anything missing?

tests/osaka/eip7883_modexp_gas_increase/conftest.py

spencer-tb · 2025-08-11T22:16:55Z

tests/osaka/eip7883_modexp_gas_increase/conftest.py

+            code += Op.RETURNDATACOPY(0, i * 32, 32)
+            code += Op.SSTORE(
+                call_contract_post_storage.store_next(modexp_expected[i * 32 : (i + 1) * 32]),
+                Op.MLOAD(0),


I tried to do this today but had issues with guido-4 and geth-fail vectors. I then realised we have a bug in the chunking method for cases where len(modexp_expected) // 32 = 0. For guido-4 this equates to 8 // 32 = 0. So then we get for i in range(0), meaning we don't store the result for guido-4.

We should try and get the hash method working nonetheless!

tests/osaka/eip7883_modexp_gas_increase/spec.py

LouisTsai-Csie changed the title ~~refactor(eip7883): update vector input structure~~ feat(tests): enhance eip7883 test coverage Jul 21, 2025

LouisTsai-Csie force-pushed the enhance-eip7823-coverage branch from 016bf91 to 4d57f68 Compare July 21, 2025 07:45

LouisTsai-Csie self-assigned this Jul 21, 2025

LouisTsai-Csie commented Jul 22, 2025


tests/osaka/eip7883_modexp_gas_increase/spec.py

LouisTsai-Csie commented Jul 22, 2025


tests/osaka/eip7883_modexp_gas_increase/conftest.py

LouisTsai-Csie added fork:osaka Osaka hardfork type:test Type: Add/refactor fw unit tests; no fw or el client test case changes labels Jul 22, 2025

LouisTsai-Csie marked this pull request as ready for review July 22, 2025 15:09

LouisTsai-Csie force-pushed the enhance-eip7823-coverage branch from 256b98c to ed33e8d Compare July 23, 2025 06:48

LouisTsai-Csie commented Jul 23, 2025


LouisTsai-Csie force-pushed the enhance-eip7823-coverage branch 3 times, most recently from 5c7c640 to 87c8cac Compare July 30, 2025 04:03

marioevz requested a review from spencer-tb July 31, 2025 00:12

LouisTsai-Csie force-pushed the enhance-eip7823-coverage branch from fce1649 to 13ab385 Compare July 31, 2025 03:40

LouisTsai-Csie commented Jul 31, 2025


LouisTsai-Csie requested a review from marioevz July 31, 2025 07:50

LouisTsai-Csie mentioned this pull request Jul 31, 2025

tracker(fork): osaka fork mega meta issue #1790

Open

30 tasks

LouisTsai-Csie force-pushed the enhance-eip7823-coverage branch from 3d0391f to 47ba51b Compare August 4, 2025 07:44

LouisTsai-Csie commented Aug 4, 2025


LouisTsai-Csie force-pushed the enhance-eip7823-coverage branch from 47ba51b to 55d8fec Compare August 4, 2025 08:38

LouisTsai-Csie mentioned this pull request Aug 6, 2025

test(osaka): add edge case test vectors for EIP-7883 MODEXP gas calculation #1993

Merged

LouisTsai-Csie force-pushed the enhance-eip7823-coverage branch 2 times, most recently from efb5a00 to 6e7819f Compare August 11, 2025 09:52

spencer-tb reviewed Aug 11, 2025


refactor(eip7883): update vector input structure

7a7ffae

LouisTsai-Csie added 15 commits August 12, 2025 14:24

feat: add eip7702, gas usage and extra edge cases

8ee0e3d

feat: add fork transition test

f6508ff

test: add extra invalid cases

933b0d1

refactor(tests): Improve fixture and test descriptions for clarity

a475888

refactor(tests): add helper for invalid case

70b1dc9

feat: add invalud boundary test cases

c455262

chore: update boundary input case

81b6bfa

refactor(tests): simplify boundary modexp test cases

6cda28a

fix(tests): update fork transition test

5691c15

feat(test): add gas formula egde cases

f72066e

test: add extra casefor modexp invalid input

0518821

tests: port legacy modexp test

95c01a9

refactor: update vector and data types

b93500c

refactor: remove valid case from invalid scenario

757501f

refactor: update result comparison method and test case

ec39341

LouisTsai-Csie force-pushed the enhance-eip7823-coverage branch from 6e7819f to ec39341 Compare August 12, 2025 07:48

LouisTsai-Csie requested a review from spencer-tb August 12, 2025 08:28

refactor: update comment

c6b83d4
