
Commit 675f1a7

CPerezz and gballet authored
feat(tests): multi opcode bloatnet ext cases (#2186)
* Add BloatNet tests
  Signed-off-by: Guillaume Ballet <[email protected]>
* try building the contract
  Signed-off-by: Guillaume Ballet <[email protected]>
* fix: SSTORE 0 -> 1 match all values in the state
  Signed-off-by: Guillaume Ballet <[email protected]>
* add the tx for 0 -> 1 and 1 -> 2
  Signed-off-by: Guillaume Ballet <[email protected]>
* fix: linter issues
  Signed-off-by: Guillaume Ballet <[email protected]>
* remove more whitespaces
  Signed-off-by: Guillaume Ballet <[email protected]>
  remove leftover single whitespace :|
* fix formatting
* move to benchmarks
  Signed-off-by: Guillaume Ballet <[email protected]>
* fix linter value
* use the gas limit from the environment
* parameterize the written value in SSTORE
* fix linter issues
* update CHANGELOG.md
* fix format
* simplify syntax
* fix: start with an empty contract storage
* more fixes, but the result is still incorrect
* fix: finally fix the tests
* linter fix
* add SLOAD tests
* test(benchmark): implement CREATE2 addressing for bloatnet tests
  - Add CREATE2 deterministic address calculation to overcome 24KB bytecode limit
  - Fix While loop condition to properly iterate through contracts
  - Account for memory expansion costs in gas calculations
  - Add safety margins (50k gas reserve, 98% utilization) for stability
  - Tests now scale to any gas limit without bytecode constraints
  - Achieve 98% gas utilization with 10M and 20M gas limits
* refactor(benchmark): optimize gas calculations in bloatnet tests
  - Remove gas reserve and 98% utilization logic for contract calculations
  - Directly calculate the number of contracts based on available gas
  - Introduce precise expected gas usage calculations for better accuracy
  - Ensure tests scale effectively without unnecessary constraints
* refactor(benchmark): bloatnet tests with unique bytecode for I/O optimization
  - Update tests to generate unique bytecode for each contract, maximizing I/O reads during benchmarks.
  - Clarify comments regarding bytecode generation and its impact on gas costs.
  - Ensure CREATE2 addresses are calculated consistently using a base bytecode template.
  - Improve test descriptions to reflect the changes in contract deployment strategy.
* refactor(benchmark): replace custom CREATE2 address calculation with utility function
  - Remove the custom `calculate_create2_address` function in favor of the `compute_create2_address` utility.
  - Update tests to utilize the new utility for consistent CREATE2 address calculations.
  - Simplify code by eliminating unnecessary complexity in address calculation logic.
  - Ensure that the CREATE2 prefix is directly set to 0xFF in the memory operation for clarity.
* CREATE2 factory approach working
* Version with EIP-7997 model working
* refactor(benchmark): improve contract deployment script with interactive selection and bytecode generation
  - Introduced interactive contract type selection for deploying contracts in the bloatnet benchmark.
  - Added support for multiple contract types: max_size_24kb, sload_heavy, storage_heavy, and custom.
  - Refactored bytecode generation functions to improve clarity and maintainability.
  - Updated README to reflect changes in deployment process and contract types.
  - Ensured proper handling of factory deployment and transaction receipt checks.
* delete: remove obsolete test_create2.py script
  This was committed unintentionally
* refactor(benchmark): optimize gas calculations for BALANCE + EXTCODECOPY pattern
  - Updated the README to reflect the optimized gas cost for the BALANCE + EXTCODECOPY pattern, reducing it from ~5,007 to ~2,710 gas per contract.
  - Modified the test_bloatnet_balance_extcodecopy function to read only 1 byte from the end of the bytecode, minimizing gas costs while maximizing contract targeting.
  - Adjusted calculations for the number of contracts needed based on the new cost per contract, ensuring accurate benchmarks.
* refactor(benchmark): support non-fixed max_codesize
* chore: Remove all 24kB "hardcoded" refs
* fix: pre-commit lint hooks
* push updated deploy_create2_factory refactored with EEST as dep
* refactor(benchmark): enhance CREATE2 factory deployment and testing
  - Updated the deploy_create2_factory_refactored.py script to improve the deployment of a CREATE2 factory with an initcode template, allowing for dynamic contract address generation.
  - Modified test_bloatnet.py to support on-the-fly CREATE2 address generation, optimizing gas costs and improving test accuracy.
  - Adjusted gas cost calculations in the README to reflect the new deployment approach, ensuring accurate benchmarks for BloatNet tests.
* remove: old_deploy_factory script
* chore: address PR review fixes
* fix(benchmark): correct import path for ethereum_test_vm
* chore(benchmark): update according to review comments
  Also, renamed the test file to include only multi-opcode tests there and have a cleaner directory for future test inclusions.
* refactor(benchmark): remove hardcoded parameters storing inside factory stub
  - Fixed offset at which we COPYCODE
  - Removed hardcoded values and added comments for clarity on factory storage layout and contract generation.
* chore: update pyproject.toml configuration
* refactor: rename test_mutiopcode.py to test_muti_opcode.py for consistency
* fix: correct import sorting in test_muti_opcode.py to fix CI lint error
* fix(benchmark): rename test file to fix typo
  Rename test_muti_opcode.py to test_multi_opcode.py to fix filename typo
* fix(benchmark): update BloatNet tests to use factory's getConfig() method
  Replace direct storage access with STATICCALL to factory's getConfig() method in both test_bloatnet_balance_extcodesize and test_bloatnet_balance_extcodecopy.
  Changes:
  - Use STATICCALL to retrieve configuration from factory instead of SLOAD
  - Add proper error handling for failed configuration calls
  - Remove gas-limiting calculations, allowing tests to run until gas exhaustion
  - Store configuration data in memory positions 96 and 128 for cleaner access
  This makes the tests more robust and better aligned with the factory's public interface, avoiding direct storage access assumptions.
* refactor(benchmark): enhance BloatNet test documentation and gas cost calculations
* revert: restore pyproject.toml to match main branch
  Remove all changes to pyproject.toml to align with upstream main branch. This ensures CI compatibility and prevents configuration conflicts.
* fix(benchmark): resolve W505 doc line length issues in test_multi_opcode.py
  Fixed all documentation and comment lines exceeding 79 characters to comply with lint requirements.
* refactor(benchmark): simplify STATICCALL usage in BloatNet tests.
* feat(benchmark): add gas exhaustion validation using expected_receipt
  Implement solution to address reviewer's concern about test validation by using EEST's expected_receipt feature to validate that benchmarks consume all gas.
  Changes:
  - Add TransactionReceipt import
  - Add expected_receipt to both test transactions validating gas_used equals gas_limit
  - Remove skip_gas_used_validation flag as validation is now explicit
  This ensures tests can distinguish between:
  - Early failure from invalid jump (~50K gas) indicating setup issues
  - Full gas exhaustion (all gas consumed) indicating successful benchmark run
  The invalid jump remains as a fail-fast mechanism for STATICCALL failures, while expected_receipt validates the benchmark actually executed.
* fix(benchmark): restore skip_gas_used_validation flag
  Re-add skip_gas_used_validation=True to both blockchain_test calls as it was accidentally removed. This flag is still needed alongside the expected_receipt validation.
* refactor(benchmark): improve readability using kwargs syntax for opcodes
  Apply reviewer suggestions to use more readable kwargs syntax for memory and stack operations throughout both test functions.
  Changes:
  - Use Op.MLOAD(offset) instead of Op.PUSH1(offset) + Op.MLOAD
  - Use Op.MSTORE(offset, value) for cleaner memory writes
  - Use Op.SHA3(offset, length) for hash operations
  - Use Op.POP(Op.BALANCE) and Op.POP(Op.EXTCODESIZE) for cleaner stack ops
  - Combine increment operations into single Op.MSTORE(32, Op.ADD(Op.MLOAD(32), 1))
  This makes the bytecode generation more concise and easier to understand.
* fix(benchmark): shorten comment lines to meet doc length limit
* fix(benchmark): correct MSTORE operation to store init_code_hash properly
* fix(benchmark): address review comments - remove redundant validation and fix ADD syntax

---------

Signed-off-by: Guillaume Ballet <[email protected]>
Co-authored-by: Guillaume Ballet <[email protected]>
1 parent 31a9089 commit 675f1a7
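
As a rough illustration of the per-contract gas budgeting that the benchmark tests in this commit perform, the sketch below recomputes the BALANCE + EXTCODESIZE estimate with plain integers. All constant values are assumptions made for illustration: nominal post-Berlin opcode costs, a 21,000 gas intrinsic cost for a transaction with empty calldata, and DUP1, MLOAD and MSTORE charged at 3 gas each, as the inline comments in the tests state. The tests themselves read these values from `fork.gas_costs()`, so the numbers here are indicative only.

```python
# Assumed nominal gas constants (not read from any fork definition).
G_BASE = 2
G_VERY_LOW = 3
G_KECCAK_256 = 30
G_KECCAK_256_WORD = 6
G_COLD_ACCOUNT_ACCESS = 2600
G_WARM_ACCOUNT_ACCESS = 100

INTRINSIC_GAS = 21_000   # assumed: plain transaction, empty calldata
CLEANUP_RESERVE = 1_000  # same reserve the tests subtract
LOOP_OVERHEAD = 10       # same flat While-loop allowance the tests use


def cost_per_contract() -> int:
    """Estimated gas for one loop iteration (one contract touched)."""
    return (
        G_KECCAK_256 + G_KECCAK_256_WORD * 3  # SHA3 over 85 bytes = 3 words
        + G_COLD_ACCOUNT_ACCESS + G_BASE      # cold BALANCE, then POP
        + G_WARM_ACCOUNT_ACCESS + G_BASE      # warm EXTCODESIZE, then POP
        + G_VERY_LOW                          # DUP1 of the address
        + G_VERY_LOW * 4                      # PUSH operations
        + G_VERY_LOW * 3                      # MLOAD, ADD, MSTORE for the salt
        + LOOP_OVERHEAD
    )


def contracts_needed(gas_limit: int) -> int:
    """How many contracts a transaction with this gas limit can touch."""
    available = gas_limit - INTRINSIC_GAS - CLEANUP_RESERVE
    return available // cost_per_contract()


if __name__ == "__main__":
    print(cost_per_contract(), contracts_needed(30_000_000))
```

Under these assumptions one iteration costs 2,786 gas, so a 30M gas budget covers 10,760 contract accesses; the cold account access dominates every other term.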

3 files changed: +318 -1 lines changed

docs/CHANGELOG.md

Lines changed: 2 additions & 1 deletion
@@ -236,7 +236,8 @@ Users can select any of the artifacts depending on their benchmarking or testing

 ### 🧪 Test Cases

-- [EIP-7951](https://eips.ethereum.org/EIPS/eip-7951): Add additional test cases for modular comparison and initcode context ([#2023](https://github.com/ethereum/execution-spec-tests/pull/2023), & [#2068](https://github.com/ethereum/execution-spec-tests/pull/2068)).
+- [BloatNet](https://bloatnet.info)/Multidimensional Metering: Add benchmarks to be used as part of the BloatNet project and also for Multidimensional Metering.
+- [EIP-7951](https://eips.ethereum.org/EIPS/eip-7951): Add additional test cases for modular comparison.
 - 🔀 Refactored `BLOBHASH` opcode context tests to use the `pre_alloc` plugin in order to avoid contract and EOA address collisions ([#1637](https://github.com/ethereum/execution-spec-tests/pull/1637)).
 - 🔀 Refactored `SELFDESTRUCT` opcode collision tests to use the `pre_alloc` plugin in order to avoid contract and EOA address collisions ([#1643](https://github.com/ethereum/execution-spec-tests/pull/1643)).
 - ✨ EIP-7594: Sanity test cases to send blob transactions and verify `engine_getBlobsVX` using the `execute` command ([#1644](https://github.com/ethereum/execution-spec-tests/pull/1644),[#1884](https://github.com/ethereum/execution-spec-tests/pull/1884)).
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+"""BloatNet benchmark tests for Ethereum execution spec tests."""
Lines changed: 315 additions & 0 deletions
@@ -0,0 +1,315 @@
+"""
+abstract: BloatNet bench cases extracted from https://hackmd.io/9icZeLN7R0Sk5mIjKlZAHQ.
+
+    The idea of all these tests is to stress client implementations to find out
+    where the limits of processing are, focusing specifically on state-related
+    operations.
+"""
+
+import pytest
+
+from ethereum_test_forks import Fork
+from ethereum_test_tools import (
+    Account,
+    Alloc,
+    Block,
+    BlockchainTestFiller,
+    Transaction,
+    While,
+)
+from ethereum_test_vm import Bytecode
+from ethereum_test_vm import Opcodes as Op
+
+REFERENCE_SPEC_GIT_PATH = "DUMMY/bloatnet.md"
+REFERENCE_SPEC_VERSION = "1.0"
+
+
+# BLOATNET ARCHITECTURE:
+#
+#   [Initcode Contract]        [Factory Contract]              [24KB Contracts]
+#         (9.5KB)                    (116B)                     (N x 24KB each)
+#           │                          │                              │
+#           │  EXTCODECOPY             │   CREATE2(salt++)            │
+#           └──────────────►           ├──────────────────►     Contract_0
+#                                      ├──────────────────►     Contract_1
+#                                      ├──────────────────►     Contract_2
+#                                      └──────────────────►     Contract_N
+#
+#   [Attack Contract] ──STATICCALL──►  [Factory.getConfig()]
+#           │                           returns: (N, hash)
+#           └─► Loop(i=0 to N):
+#                 1. Generate CREATE2 addr: keccak256(0xFF|factory|i|hash)[12:]
+#                 2. BALANCE(addr)    → 2600 gas (cold access)
+#                 3. EXTCODESIZE(addr) → 100 gas (warm access)
+#
+# HOW IT WORKS:
+# 1. Factory uses EXTCODECOPY to load initcode, avoiding PC-relative jumps
+# 2. Each CREATE2 deployment produces unique 24KB bytecode (via ADDRESS)
+# 3. All contracts share same initcode hash for deterministic addresses
+# 4. Attack rapidly accesses all contracts, stressing client's state handling
+
+
+@pytest.mark.valid_from("Prague")
+def test_bloatnet_balance_extcodesize(
+    blockchain_test: BlockchainTestFiller,
+    pre: Alloc,
+    fork: Fork,
+    gas_benchmark_value: int,
+):
+    """
+    BloatNet test using BALANCE + EXTCODESIZE with "on-the-fly" CREATE2
+    address generation.
+
+    This test:
+    1. Assumes contracts are already deployed via the factory (salt 0 to N-1)
+    2. Generates CREATE2 addresses dynamically during execution
+    3. Calls BALANCE (cold) then EXTCODESIZE (warm) on each
+    4. Maximizes cache eviction by accessing many contracts
+    """
+    gas_costs = fork.gas_costs()
+
+    # Calculate gas costs
+    intrinsic_gas = fork.transaction_intrinsic_cost_calculator()(calldata=b"")
+
+    # Cost per contract access with CREATE2 address generation
+    cost_per_contract = (
+        gas_costs.G_KECCAK_256  # SHA3 static cost for address generation (30)
+        + gas_costs.G_KECCAK_256_WORD * 3  # SHA3 dynamic cost (85 bytes = 3 words * 6)
+        + gas_costs.G_COLD_ACCOUNT_ACCESS  # Cold BALANCE (2600)
+        + gas_costs.G_BASE  # POP balance (2)
+        + gas_costs.G_WARM_ACCOUNT_ACCESS  # Warm EXTCODESIZE (100)
+        + gas_costs.G_BASE  # POP code size (2)
+        + gas_costs.G_VERY_LOW  # DUP1 before BALANCE (3)
+        + gas_costs.G_VERY_LOW * 4  # PUSH1 operations (4 * 3)
+        + gas_costs.G_VERY_LOW  # MLOAD for salt (3)
+        + gas_costs.G_VERY_LOW  # ADD for increment (3)
+        + gas_costs.G_VERY_LOW  # MSTORE salt back (3)
+        + 10  # While loop overhead
+    )
90+
# Calculate how many contracts to access based on available gas
91+
available_gas = gas_benchmark_value - intrinsic_gas - 1000 # Reserve for cleanup
92+
contracts_needed = int(available_gas // cost_per_contract)
93+
94+
# Deploy factory using stub contract - NO HARDCODED VALUES
95+
# The stub "bloatnet_factory" must be provided via --address-stubs flag
96+
# The factory at that address MUST have:
97+
# - Slot 0: Number of deployed contracts
98+
# - Slot 1: Init code hash for CREATE2 address calculation
99+
factory_address = pre.deploy_contract(
100+
code=Bytecode(), # Required parameter, but will be ignored for stubs
101+
stub="bloatnet_factory",
102+
)
103+
104+
# Log test requirements - deployed count read from factory storage
105+
print(
106+
f"Test needs {contracts_needed} contracts for "
107+
f"{gas_benchmark_value / 1_000_000:.1f}M gas. "
108+
f"Factory storage will be checked during execution."
109+
)
110+
111+
# Build attack contract that reads config from factory and performs attack
112+
attack_code = (
113+
# Call getConfig() on factory to get num_deployed and init_code_hash
114+
Op.STATICCALL(
115+
gas=Op.GAS,
116+
address=factory_address,
117+
args_offset=0,
118+
args_size=0,
119+
ret_offset=96,
120+
ret_size=64,
121+
)
122+
# Check if call succeeded
123+
+ Op.ISZERO
124+
+ Op.PUSH2(0x1000) # Jump to error handler if failed (far jump)
125+
+ Op.JUMPI
126+
# Load results from memory
127+
# Memory[96:128] = num_deployed_contracts
128+
# Memory[128:160] = init_code_hash
129+
+ Op.MLOAD(96) # Load num_deployed_contracts
130+
+ Op.MLOAD(128) # Load init_code_hash
131+
# Setup memory for CREATE2 address generation
132+
# Memory layout at 0: 0xFF + factory_addr(20) + salt(32) + hash(32)
133+
+ Op.MSTORE(0, factory_address) # Store factory address at memory position 0
134+
+ Op.MSTORE8(11, 0xFF) # Store 0xFF prefix at position (32 - 20 - 1)
135+
+ Op.MSTORE(32, 0) # Store salt at position 32
136+
# Stack now has: [num_contracts, init_code_hash]
137+
+ Op.PUSH1(64) # Push memory position
138+
+ Op.MSTORE # Store init_code_hash at memory[64]
139+
# Stack now has: [num_contracts]
140+
# Main attack loop - iterate through all deployed contracts
141+
+ While(
142+
body=(
143+
# Generate CREATE2 addr: keccak256(0xFF+factory+salt+hash)
144+
Op.SHA3(11, 85) # Generate CREATE2 address from memory[11:96]
145+
# The address is now on the stack
146+
+ Op.DUP1 # Duplicate for EXTCODESIZE
147+
+ Op.POP(Op.BALANCE) # Cold access
148+
+ Op.POP(Op.EXTCODESIZE) # Warm access
149+
# Increment salt for next iteration
150+
+ Op.MSTORE(32, Op.ADD(Op.MLOAD(32), 1)) # Increment and store salt
151+
),
152+
# Continue while we haven't reached the limit
153+
condition=Op.DUP1 + Op.PUSH1(1) + Op.SWAP1 + Op.SUB + Op.DUP1 + Op.ISZERO + Op.ISZERO,
154+
)
155+
+ Op.POP # Clean up counter
156+
)
157+
158+
# Deploy attack contract
159+
attack_address = pre.deploy_contract(code=attack_code)
160+
161+
# Run the attack
162+
attack_tx = Transaction(
163+
to=attack_address,
164+
gas_limit=gas_benchmark_value,
165+
sender=pre.fund_eoa(),
166+
)
167+
168+
# Post-state: just verify attack contract exists
169+
post = {
170+
attack_address: Account(storage={}),
171+
}
172+
173+
blockchain_test(
174+
pre=pre,
175+
blocks=[Block(txs=[attack_tx])],
176+
post=post,
177+
)
178+
179+
+@pytest.mark.valid_from("Prague")
+def test_bloatnet_balance_extcodecopy(
+    blockchain_test: BlockchainTestFiller,
+    pre: Alloc,
+    fork: Fork,
+    gas_benchmark_value: int,
+):
+    """
+    BloatNet test using BALANCE + EXTCODECOPY with on-the-fly CREATE2
+    address generation.
+
+    This test forces actual bytecode reads from disk by:
+    1. Assuming contracts are already deployed via the factory
+    2. Generating CREATE2 addresses dynamically during execution
+    3. Using BALANCE (cold) to warm the account
+    4. Using EXTCODECOPY (warm) to read 1 byte from the END of the bytecode
+    """
+    gas_costs = fork.gas_costs()
+    max_contract_size = fork.max_code_size()
+
+    # Calculate costs
+    intrinsic_gas = fork.transaction_intrinsic_cost_calculator()(calldata=b"")
+
+    # Cost per contract with EXTCODECOPY and CREATE2 address generation
+    cost_per_contract = (
+        gas_costs.G_KECCAK_256  # SHA3 static cost for address generation (30)
+        + gas_costs.G_KECCAK_256_WORD * 3  # SHA3 dynamic cost (85 bytes = 3 words * 6)
+        + gas_costs.G_COLD_ACCOUNT_ACCESS  # Cold BALANCE (2600)
+        + gas_costs.G_BASE  # POP balance (2)
+        + gas_costs.G_WARM_ACCOUNT_ACCESS  # Warm EXTCODECOPY base (100)
+        + gas_costs.G_COPY * 1  # Copy cost for 1 byte (3)
+        + gas_costs.G_VERY_LOW * 2  # DUP1 before BALANCE, DUP4 for address (6)
+        + gas_costs.G_VERY_LOW * 8  # PUSH operations (8 * 3 = 24)
+        + gas_costs.G_VERY_LOW * 2  # MLOAD for salt twice (6)
+        + gas_costs.G_VERY_LOW * 2  # ADD operations (6)
+        + gas_costs.G_VERY_LOW  # MSTORE salt back (3)
+        + gas_costs.G_BASE  # POP after EXTCODECOPY (2)
+        + 10  # While loop overhead
+    )
+
+    # Calculate how many contracts to access
+    available_gas = gas_benchmark_value - intrinsic_gas - 1000
+    contracts_needed = int(available_gas // cost_per_contract)
+
+    # Deploy factory using stub contract - NO HARDCODED VALUES
+    # The stub "bloatnet_factory" must be provided via --address-stubs flag
+    # The factory at that address MUST have:
+    # - Slot 0: Number of deployed contracts
+    # - Slot 1: Init code hash for CREATE2 address calculation
+    factory_address = pre.deploy_contract(
+        code=Bytecode(),  # Required parameter, but will be ignored for stubs
+        stub="bloatnet_factory",
+    )
+
+    # Log test requirements - deployed count read from factory storage
+    print(
+        f"Test needs {contracts_needed} contracts for "
+        f"{gas_benchmark_value / 1_000_000:.1f}M gas. "
+        f"Factory storage will be checked during execution."
+    )
+
+    # Build attack contract that reads config from factory and performs attack
+    attack_code = (
+        # Call getConfig() on factory to get num_deployed and init_code_hash
+        Op.STATICCALL(
+            gas=Op.GAS,
+            address=factory_address,
+            args_offset=0,
+            args_size=0,
+            ret_offset=96,
+            ret_size=64,
+        )
+        # Check if call succeeded
+        + Op.ISZERO
+        + Op.PUSH2(0x1000)  # Jump to error handler if failed (far jump)
+        + Op.JUMPI
+        # Load results from memory
+        # Memory[96:128] = num_deployed_contracts
+        # Memory[128:160] = init_code_hash
+        + Op.MLOAD(96)  # Load num_deployed_contracts
+        + Op.MLOAD(128)  # Load init_code_hash
+        # Setup memory for CREATE2 address generation
+        # Memory layout at 0: 0xFF + factory_addr(20) + salt(32) + hash(32)
+        + Op.MSTORE(0, factory_address)  # Store factory address at memory position 0
+        + Op.MSTORE8(11, 0xFF)  # Store 0xFF prefix at position (32 - 20 - 1)
+        + Op.MSTORE(32, 0)  # Store salt at position 32
+        # Stack now has: [num_contracts, init_code_hash]
+        + Op.PUSH1(64)  # Push memory position
+        + Op.MSTORE  # Store init_code_hash at memory[64]
+        # Stack now has: [num_contracts]
+        # Main attack loop - iterate through all deployed contracts
+        + While(
+            body=(
+                # Generate CREATE2 address
+                Op.SHA3(11, 85)  # Generate CREATE2 address from memory[11:96]
+                # The address is now on the stack
+                + Op.DUP1  # Duplicate for later operations
+                + Op.POP(Op.BALANCE)  # Cold access
+                # EXTCODECOPY(addr, mem_offset, last_byte_offset, 1)
+                # Read the LAST byte to force full contract load
+                + Op.PUSH1(1)  # size (1 byte)
+                + Op.PUSH2(max_contract_size - 1)  # code offset (last byte)
+                # Use salt as memory offset to avoid overlap
+                + Op.ADD(Op.MLOAD(32), 96)  # Add base memory offset for unique position
+                + Op.DUP4  # address (duplicated earlier)
+                + Op.EXTCODECOPY
+                + Op.POP  # Clean up address
+                # Increment salt for next iteration
+                + Op.MSTORE(32, Op.ADD(Op.MLOAD(32), 1))  # Increment and store salt
+            ),
+            # Continue while counter > 0
+            condition=Op.DUP1 + Op.PUSH1(1) + Op.SWAP1 + Op.SUB + Op.DUP1 + Op.ISZERO + Op.ISZERO,
+        )
+        + Op.POP  # Clean up counter
+    )
+
+    # Deploy attack contract
+    attack_address = pre.deploy_contract(code=attack_code)
+
+    # Run the attack
+    attack_tx = Transaction(
+        to=attack_address,
+        gas_limit=gas_benchmark_value,
+        sender=pre.fund_eoa(),
+    )
+
+    # Post-state
+    post = {
+        attack_address: Account(storage={}),
+    }
+
+    blockchain_test(
+        pre=pre,
+        blocks=[Block(txs=[attack_tx])],
+        post=post,
+    )
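
To make the memory layout behind `Op.SHA3(11, 85)` easier to follow, here is a minimal off-chain sketch of the 85-byte CREATE2 preimage that both attack contracts assemble. The helper names `create2_preimage` and `create2_address` are hypothetical and not part of EEST. Python's standard library has no Keccak-256 (`hashlib.sha3_256` uses different padding), so the hash function is passed in as a parameter and must be supplied by the caller.

```python
from typing import Callable


def create2_preimage(factory: bytes, salt: int, init_code_hash: bytes) -> bytes:
    """Rebuild memory[11:96] the way the attack contract lays it out."""
    if len(factory) != 20 or len(init_code_hash) != 32:
        raise ValueError("expected a 20-byte address and a 32-byte hash")
    memory = bytearray(96)
    # MSTORE(0, factory_address): the address is right-aligned in word 0,
    # so it occupies memory[12:32].
    memory[0:32] = factory.rjust(32, b"\x00")
    # MSTORE8(11, 0xFF): the prefix sits at 32 - 20 - 1 = 11.
    memory[11] = 0xFF
    # MSTORE(32, salt): the loop counter doubles as the CREATE2 salt.
    memory[32:64] = salt.to_bytes(32, "big")
    # MSTORE(64, init_code_hash)
    memory[64:96] = init_code_hash
    # SHA3(11, 85) hashes exactly this slice: 1 + 20 + 32 + 32 bytes.
    return bytes(memory[11:96])


def create2_address(preimage: bytes, keccak256: Callable[[bytes], bytes]) -> bytes:
    """Keep the low 20 bytes of the hash, i.e. keccak256(preimage)[12:]."""
    return keccak256(preimage)[12:]


if __name__ == "__main__":
    demo = create2_preimage(b"\x11" * 20, 5, b"\x22" * 32)
    print(len(demo), demo[:1].hex())
```

Only the salt word changes between iterations, which is why the loop body rewrites `memory[32:64]` and re-hashes the same 85 bytes.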
