Commit 54b46ea

feat(benchmark): add SLOAD/SSTORE benchmark test with multi-contract support (#2256)
* feat(benchmark): add SLOAD benchmark test with multi-contract support

  Add test_sload_empty_erc20_balanceof to benchmark SLOAD operations on non-existing storage slots using ERC20 balanceOf() queries. The idea of this benchmark is to exploit, within a single contract or a series of N contracts, calls to non-existing addresses. In this way, we force clients to resolve as many tree branches as possible.

* feat(benchmark): add SSTORE benchmark test using ERC20 approve

  Add test_sstore_erc20_approve that benchmarks SSTORE operations by calling approve(spender, amount) on pre-deployed ERC20 contracts. Follows the same pattern as the SLOAD benchmark:
  - Auto-discovers ERC20 contracts from stubs
  - Splits gas budget evenly across all discovered contracts
  - Uses counter as both spender address and amount
  - Forces SSTOREs to allowance mapping storage slots

  The test measures client performance when writing to many storage slots across multiple contracts, stressing state-handling write operations.

* fix(benchmark): correct SSTORE benchmark gas calculation

  Fixed gas calculation for test_sstore_erc20_approve to ensure accurate gas usage prediction and prevent transaction reverts.

  Key fixes:
  - Added memory expansion cost (15 gas per contract)
  - Corrected G_LOW gas values in comments (5 gas, not 3)
  - Separated per-contract overhead from per-iteration costs
  - Improved cost calculation clarity with detailed opcode breakdown

  Gas calculation (10M gas, 3 contracts):
  - Intrinsic: 21,000
  - Overhead per contract: 38
  - Cost per iteration: 20,226
  - Calls per contract: 164
  - Expected gas used: 9,972,306 (99.72% utilization)

* feat(benchmark): add mixed SLOAD/SSTORE benchmark with configurable ratios

  Add test_mixed_sload_sstore to test_multi_opcode.py that combines SLOAD and SSTORE operations with parameterized gas distribution ratios (50-50, 70-30, 90-10).

  The test stresses clients with mixed read/write workloads by:
  - Dividing gas budget evenly across all discovered ERC20 contract stubs
  - Splitting each contract's allocation by the specified percentage ratio
  - Executing balanceOf (cold SLOAD on empty slots) for the SLOAD portion
  - Executing approve (SSTORE to new allowance slots) for the SSTORE portion

  Verified gas calculations for 10M gas budget with 3 contracts (50-50 ratio):
  - SLOAD operations: ~2,312 gas/iteration → 719 calls per contract
  - SSTORE operations: ~20,226 gas/iteration → 82 calls per contract
  - Total operations: 2,403 state operations (2,157 SLOADs + 246 SSTOREs)
  - Gas usage: 9.98M / 10M (16K buffer, no out-of-gas errors)

  This benchmark enables testing different read/write ratios to identify client performance characteristics under varying state operation mixes.

* refactor(benchmark): optimize SLOAD/SSTORE benchmarks per review feedback

  Address review comments by optimizing loop efficiency:

  1. Move function selector MSTORE outside loops (Comment #2)
     - BALANCEOF_SELECTOR and APPROVE_SELECTOR now stored once per contract
     - Saves 3 gas (G_VERY_LOW) per iteration
     - Total savings: ~6,471 gas for 50-50 ratio with 10M budget and 3 contracts
  2. Remove unused return data from CALL operations (Comment #1)
     - Changed ret_offset=96/128, ret_size=32 to ret_offset=0, ret_size=0
     - Eliminates unnecessary memory expansion
     - Minor gas savings, cleaner implementation

  Skipped Comment #3 (use Op.GAS for addresses):
  - Would lose determinism (GAS varies per iteration)
  - Adds complexity for minimal benefit
  - Counter still needed for loop control

  Changes applied to:
  - test_sload_empty_erc20_balanceof
  - test_sstore_erc20_approve
  - test_mixed_sload_sstore (both SLOAD and SSTORE loops)

* refactor(benchmark): simplify SLOAD benchmark memory layout and fix calldata encoding

  - Move selector MSTORE outside for-loop (saves gas per contract)
  - Use single counter at MEM[32] instead of duplicate at MEM[0] and MEM[64]
  - Fix calldata encoding by using args_offset=28 for correct ABI format
  - Selector now properly positioned at start of calldata

* refactor(benchmark): simplify SSTORE benchmark memory layout and fix calldata encoding

  - Move selector MSTORE outside for-loop (saves gas per contract)
  - Use single counter at MEM[32] instead of duplicate at MEM[0]
  - Fix calldata encoding by using args_offset=28 for correct ABI format
  - Selector now properly positioned at start of calldata

* refactor(benchmark): simplify mixed SLOAD/SSTORE memory layout and fix calldata encoding

  - Move selectors MSTORE outside for-loop (saves gas per contract)
  - Use separate memory regions for balanceOf and approve to avoid conflicts
  - Fix calldata encoding by using correct args_offset for proper ABI format
  - Selectors now properly positioned at start of calldata

* refactor(benchmark): simplify mixed test to reuse memory layout consistently

  - Reuse MEM[0] for both selectors (sequential operations, no conflict)
  - Reuse MEM[32] for both counters (balanceOf then approve)
  - Reuse MEM[64] and MEM[96] for parameters
  - Consistent args_offset=28 for both operations (was 28 and 128)
  - Matches single-opcode test pattern for easier understanding
  - Reduces memory footprint from 196 bytes to 96 bytes

* feat(benchmark): add parametrized contract count and stub filtering to single-opcode tests

  - Add parametrization for num_contracts [1, 5, 10, 20, 100]
  - Implement stub prefix filtering based on test function name
  - Add validation to error if insufficient matching stubs
  - Add SSTORE benchmark architecture documentation
  - Create README with setup instructions and stubs.json format

* fix(benchmark): add type annotations to test functions

* fix(benchmark): add AddressStubs type annotation to address_stubs parameter

* feat(benchmark): add parametrized contract count, stub filtering, and correct gas calculations

  - Add num_contracts parametrization [1, 5, 10, 20, 100] to multi-opcode test
  - Implement stub prefix filtering for all benchmarks
  - Fix gas cost calculations to account for COLD/WARM account access
  - CALL operations: first call to each contract is COLD (2600), subsequent are WARM (100)
  - SSTORE operations: add cold storage access cost (2100) for zero-to-non-zero writes
  - Update gas calculation formulas to solve for calls per contract correctly
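
The figures quoted in the gas-calculation notes above can be reproduced with plain integer arithmetic. The sketch below is illustrative only: the helper names are hypothetical and not part of the commit, and the per-iteration costs (20,226 and roughly 2,312 gas) are taken as given from the commit message rather than re-derived from opcode costs. The last helper mirrors the cold/warm idea from the final fix (first CALL to a contract cold at 2600, later ones warm at 100), again as an assumption about how the formula is solved.

```python
def calls_per_contract(budget: int, intrinsic: int, num_contracts: int,
                       overhead: int, per_iteration: int) -> int:
    """Whole loop iterations that fit in one contract's share of the budget."""
    share = (budget - intrinsic) // num_contracts
    return (share - overhead) // per_iteration


def expected_gas(intrinsic: int, num_contracts: int, overhead: int,
                 per_iteration: int, calls: int) -> int:
    """Total gas used when every contract runs `calls` iterations."""
    return intrinsic + num_contracts * (overhead + calls * per_iteration)


def calls_with_cold_first(gas: int, warm_cost: int,
                          cold: int = 2600, warm: int = 100) -> int:
    """Solve N * warm_cost + (cold - warm) <= gas for the largest integer N."""
    return (gas - (cold - warm)) // warm_cost


BUDGET, INTRINSIC, CONTRACTS = 10_000_000, 21_000, 3

# SSTORE-only benchmark: overhead 38 per contract, 20,226 gas per iteration.
sstore_calls = calls_per_contract(BUDGET, INTRINSIC, CONTRACTS, 38, 20_226)
sstore_used = expected_gas(INTRINSIC, CONTRACTS, 38, 20_226, sstore_calls)

# Mixed benchmark, 50-50 split of each contract's share.
share = (BUDGET - INTRINSIC) // CONTRACTS
mixed_sload_calls = (share * 50 // 100) // 2_312
mixed_sstore_calls = (share * 50 // 100) // 20_226
mixed_total_ops = CONTRACTS * (mixed_sload_calls + mixed_sstore_calls)

print(sstore_calls, sstore_used, f"{sstore_used / BUDGET:.2%}")
print(mixed_sload_calls, mixed_sstore_calls, mixed_total_ops)
```

Flooring at each division is what leaves the small unused buffer mentioned in the message: the loop count is always rounded down so the transaction cannot run out of gas.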
1 parent ecd3462 commit 54b46ea


3 files changed: +753 -0 lines changed

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
# BloatNet Single-Opcode Benchmarks

This directory contains benchmarks for testing single EVM opcodes (SLOAD, SSTORE) under state-heavy conditions using pre-deployed contracts.

## Test Setup

### Prerequisites

1. Pre-deployed ERC20 contracts on the target network
2. A JSON file containing contract addresses (stubs)

### Address Stubs Format

Create a JSON file (`stubs.json`) mapping test-specific stub names to deployed contract addresses:

```json
{
  "test_sload_empty_erc20_balanceof_USDT": "0x1234567890123456789012345678901234567890",
  "test_sload_empty_erc20_balanceof_USDC": "0x2345678901234567890123456789012345678901",
  "test_sload_empty_erc20_balanceof_DAI": "0x3456789012345678901234567890123456789012",
  "test_sload_empty_erc20_balanceof_WETH": "0x4567890123456789012345678901234567890123",
  "test_sload_empty_erc20_balanceof_WBTC": "0x5678901234567890123456789012345678901234",

  "test_sstore_erc20_approve_USDT": "0x1234567890123456789012345678901234567890",
  "test_sstore_erc20_approve_USDC": "0x2345678901234567890123456789012345678901",
  "test_sstore_erc20_approve_DAI": "0x3456789012345678901234567890123456789012",
  "test_sstore_erc20_approve_WETH": "0x4567890123456789012345678901234567890123",
  "test_sstore_erc20_approve_WBTC": "0x5678901234567890123456789012345678901234"
}
```

**Naming Convention:**
- Stub names MUST start with the test function name
- Format: `{test_function_name}_{identifier}`
- Example: `test_sload_empty_erc20_balanceof_USDT`
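
A stubs file can be sanity-checked against this naming convention before running the benchmarks. The following is a minimal sketch under stated assumptions: `check_stubs` is a hypothetical helper that is not part of the repository, and it assumes addresses are written as `0x` followed by 40 hex digits, as in the example above.

```python
import json
import re

# Assumption: stub addresses are 20-byte hex strings with a 0x prefix.
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def check_stubs(stubs: dict, test_names: list) -> list:
    """Return human-readable problems; an empty list means the file looks fine."""
    problems = []
    for name, address in stubs.items():
        # Convention: {test_function_name}_{identifier}
        if not any(name.startswith(test + "_") for test in test_names):
            problems.append(f"{name}: does not start with a known test function name")
        if not ADDRESS_RE.fullmatch(address):
            problems.append(f"{name}: {address!r} is not a 20-byte hex address")
    return problems


TEST_NAMES = ["test_sload_empty_erc20_balanceof", "test_sstore_erc20_approve"]

good = json.loads("""{
  "test_sload_empty_erc20_balanceof_USDT": "0x1234567890123456789012345678901234567890",
  "test_sstore_erc20_approve_USDT": "0x1234567890123456789012345678901234567890"
}""")
bad = {"usdt": "0x1234"}

print(check_stubs(good, TEST_NAMES))
print(check_stubs(bad, TEST_NAMES))
```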

### Running the Tests

#### Execute Mode (Against Live Network)

```bash
# Run with specific number of contracts (e.g., only the 5-contract variant)
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run execute \
  --address-stubs /path/to/stubs.json \
  --fork=Prague \
  tests/benchmark/stateful/bloatnet/test_single_opcode.py::test_sload_empty_erc20_balanceof \
  -k "[5]" \
  -v

# Run all parametrized variants (1, 5, 10, 20, 100 contracts)
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run execute \
  --address-stubs /path/to/stubs.json \
  --fork=Prague \
  tests/benchmark/stateful/bloatnet/test_single_opcode.py \
  -v
```

## Test Parametrization

Both tests are parametrized with `num_contracts = [1, 5, 10, 20, 100]`, generating 5 test variants each:

- **1 contract**: Baseline single-contract performance
- **5 contracts**: Small-scale multi-contract scenario
- **10 contracts**: Medium-scale multi-contract scenario
- **20 contracts**: Large-scale multi-contract scenario
- **100 contracts**: Very large-scale multi-contract stress test

### How Stub Filtering Works

1. Test extracts its function name (e.g., `test_sload_empty_erc20_balanceof`)
2. Filters stubs starting with that name from `stubs.json`
3. Selects the **first N** matching stubs based on `num_contracts` parameter
4. Errors if insufficient matching stubs found
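
The four steps above can be sketched in plain Python. This is an illustration rather than the test's actual code: `select_stubs` is a hypothetical helper, it works on an ordinary dict instead of the framework's `AddressStubs` object, and it raises `ValueError` where the real tests call `pytest.fail`.

```python
def select_stubs(stubs: dict, node_name: str, num_contracts: int) -> list:
    """Pick the first `num_contracts` stub names that match the test function."""
    # Step 1: strip the parametrization suffix, e.g. "test_x[5]" -> "test_x".
    test_name = node_name.split("[")[0]
    # Step 2: keep stubs whose name starts with the test function name.
    matching = [name for name in stubs if name.startswith(test_name)]
    # Step 4: refuse to run with too few contracts.
    if len(matching) < num_contracts:
        raise ValueError(
            f"Not enough matching stubs for test '{test_name}'. "
            f"Required: {num_contracts}, Found: {len(matching)}."
        )
    # Step 3: take the first N, in file order.
    return matching[:num_contracts]


stubs = {
    "test_sload_empty_erc20_balanceof_USDT": "0x1234567890123456789012345678901234567890",
    "test_sload_empty_erc20_balanceof_USDC": "0x2345678901234567890123456789012345678901",
    "test_sstore_erc20_approve_USDT": "0x1234567890123456789012345678901234567890",
}

selected = select_stubs(stubs, "test_sload_empty_erc20_balanceof[2]", 2)
print(selected)
```

Because selection is by position, the order of entries in `stubs.json` decides which contracts each `num_contracts` variant exercises.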

tests/benchmark/stateful/bloatnet/test_multi_opcode.py

Lines changed: 255 additions & 0 deletions
@@ -19,6 +19,7 @@
)
from ethereum_test_vm import Bytecode
from ethereum_test_vm import Opcodes as Op
from pytest_plugins.execute.pre_alloc import AddressStubs

REFERENCE_SPEC_GIT_PATH = "DUMMY/bloatnet.md"
REFERENCE_SPEC_VERSION = "1.0"
@@ -465,3 +466,257 @@ def test_bloatnet_balance_extcodehash(
        blocks=[Block(txs=[attack_tx])],
        post=post,
    )


# ERC20 function selectors
BALANCEOF_SELECTOR = 0x70A08231  # balanceOf(address)
APPROVE_SELECTOR = 0x095EA7B3  # approve(address,uint256)


@pytest.mark.valid_from("Prague")
@pytest.mark.parametrize("num_contracts", [1, 5, 10, 20, 100])
@pytest.mark.parametrize(
    "sload_percent,sstore_percent",
    [
        pytest.param(50, 50, id="50-50"),
        pytest.param(70, 30, id="70-30"),
        pytest.param(90, 10, id="90-10"),
    ],
)
def test_mixed_sload_sstore(
    blockchain_test: BlockchainTestFiller,
    pre: Alloc,
    fork: Fork,
    gas_benchmark_value: int,
    address_stubs: AddressStubs,
    num_contracts: int,
    sload_percent: int,
    sstore_percent: int,
    request: pytest.FixtureRequest,
) -> None:
    """
    BloatNet mixed SLOAD/SSTORE benchmark with configurable operation ratios.

    This test:
    1. Filters stubs matching test name prefix
       (e.g., test_mixed_sload_sstore_*)
    2. Uses first N contracts based on num_contracts parameter
    3. Divides gas budget evenly across all selected contracts
    4. For each contract, divides gas into SLOAD and SSTORE portions by
       percentage
    5. Executes balanceOf (SLOAD) and approve (SSTORE) calls per the ratio
    6. Stresses clients with combined read/write operations on large
       contracts
    """
    # Extract test function name for stub filtering
    test_name = request.node.name.split("[")[0]  # Remove parametrization suffix

    # Filter stubs that match the test name prefix
    matching_stubs = [
        stub_name for stub_name in address_stubs.root.keys() if stub_name.startswith(test_name)
    ]

    # Validate we have enough stubs
    if len(matching_stubs) < num_contracts:
        pytest.fail(
            f"Not enough matching stubs for test '{test_name}'. "
            f"Required: {num_contracts}, Found: {len(matching_stubs)}. "
            f"Matching stubs: {matching_stubs}"
        )

    # Select first N stubs
    selected_stubs = matching_stubs[:num_contracts]
    gas_costs = fork.gas_costs()

    # Calculate gas costs
    intrinsic_gas = fork.transaction_intrinsic_cost_calculator()(calldata=b"")

    # Fixed overhead for SLOAD loop
    sload_loop_overhead = (
        # Attack contract loop overhead
        gas_costs.G_VERY_LOW * 2  # MLOAD counter (3*2)
        + gas_costs.G_VERY_LOW * 2  # MSTORE selector (3*2)
        + gas_costs.G_VERY_LOW * 3  # MLOAD + MSTORE address (3*3)
        + gas_costs.G_BASE  # POP (2)
        + gas_costs.G_BASE * 3  # SUB + MLOAD + MSTORE for counter decrement (2*3)
        + gas_costs.G_BASE * 2  # ISZERO * 2 for loop condition (2*2)
        + gas_costs.G_MID  # JUMPI (8)
    )

    # ERC20 balanceOf internal gas
    sload_erc20_internal = (
        gas_costs.G_VERY_LOW  # PUSH4 selector (3)
        + gas_costs.G_BASE  # EQ selector match (2)
        + gas_costs.G_MID  # JUMPI to function (8)
        + gas_costs.G_JUMPDEST  # JUMPDEST at function start (1)
        + gas_costs.G_VERY_LOW * 2  # CALLDATALOAD arg (3*2)
        + gas_costs.G_KECCAK_256  # keccak256 static (30)
        + gas_costs.G_KECCAK_256_WORD * 2  # keccak256 dynamic for 64 bytes (2*6)
        + gas_costs.G_COLD_SLOAD  # Cold SLOAD - always cold for random addresses (2100)
        + gas_costs.G_VERY_LOW * 3  # MSTORE result + RETURN setup (3*3)
    )

    # Fixed overhead for SSTORE loop
    sstore_loop_overhead = (
        # Attack contract loop body operations
        gas_costs.G_VERY_LOW  # MSTORE selector at memory[32] (3)
        + gas_costs.G_LOW  # MLOAD counter (5)
        + gas_costs.G_VERY_LOW  # MSTORE spender at memory[64] (3)
        + gas_costs.G_BASE  # POP call result (2)
        # Counter decrement
        + gas_costs.G_LOW  # MLOAD counter (5)
        + gas_costs.G_VERY_LOW  # PUSH1 1 (3)
        + gas_costs.G_VERY_LOW  # SUB (3)
        + gas_costs.G_VERY_LOW  # MSTORE counter back (3)
        # While loop condition check
        + gas_costs.G_LOW  # MLOAD counter (5)
        + gas_costs.G_BASE  # ISZERO (2)
        + gas_costs.G_BASE  # ISZERO (2)
        + gas_costs.G_MID  # JUMPI back to loop start (8)
    )

    # ERC20 approve internal gas
    # Cold SSTORE: 22100 = 20000 base + 2100 cold access
    sstore_erc20_internal = (
        gas_costs.G_VERY_LOW  # PUSH4 selector (3)
        + gas_costs.G_BASE  # EQ selector match (2)
        + gas_costs.G_MID  # JUMPI to function (8)
        + gas_costs.G_JUMPDEST  # JUMPDEST at function start (1)
        + gas_costs.G_VERY_LOW  # CALLDATALOAD spender (3)
        + gas_costs.G_VERY_LOW  # CALLDATALOAD amount (3)
        + gas_costs.G_KECCAK_256  # keccak256 static (30)
        + gas_costs.G_KECCAK_256_WORD * 2  # keccak256 dynamic for 64 bytes (12)
        + gas_costs.G_COLD_SLOAD  # Cold SLOAD for allowance check (2100)
        + gas_costs.G_STORAGE_SET  # SSTORE base cost (20000)
        + gas_costs.G_COLD_SLOAD  # Additional cold storage access (2100)
        + gas_costs.G_VERY_LOW  # PUSH1 1 for return value (3)
        + gas_costs.G_VERY_LOW  # MSTORE return value (3)
        + gas_costs.G_VERY_LOW  # PUSH1 32 for return size (3)
        + gas_costs.G_VERY_LOW  # PUSH1 0 for return offset (3)
    )

    # Calculate gas budget per contract
    available_gas = gas_benchmark_value - intrinsic_gas
    gas_per_contract = available_gas // num_contracts

    # For each contract, split gas by percentage
    sload_gas_per_contract = (gas_per_contract * sload_percent) // 100
    sstore_gas_per_contract = (gas_per_contract * sstore_percent) // 100

    # Account for cold/warm transitions in CALL costs
    # First SLOAD call is COLD (2600), rest are WARM (100)
    sload_warm_cost = sload_loop_overhead + gas_costs.G_WARM_ACCOUNT_ACCESS + sload_erc20_internal
    cold_warm_diff = gas_costs.G_COLD_ACCOUNT_ACCESS - gas_costs.G_WARM_ACCOUNT_ACCESS
    sload_calls_per_contract = int((sload_gas_per_contract - cold_warm_diff) // sload_warm_cost)

    # First SSTORE call is COLD (2600), rest are WARM (100)
    sstore_warm_cost = (
        sstore_loop_overhead + gas_costs.G_WARM_ACCOUNT_ACCESS + sstore_erc20_internal
    )
    sstore_calls_per_contract = int((sstore_gas_per_contract - cold_warm_diff) // sstore_warm_cost)

    # Deploy selected ERC20 contracts using stubs
    erc20_addresses = []
    for stub_name in selected_stubs:
        addr = pre.deploy_contract(
            code=Bytecode(),
            stub=stub_name,
        )
        erc20_addresses.append(addr)

    # Log test requirements
    print(
        f"Total gas budget: {gas_benchmark_value / 1_000_000:.1f}M gas. "
        f"~{gas_per_contract / 1_000_000:.1f}M gas per contract "
        f"({sload_percent}% SLOAD, {sstore_percent}% SSTORE). "
        f"Per contract: {sload_calls_per_contract} balanceOf calls, "
        f"{sstore_calls_per_contract} approve calls."
    )

    # Build attack code that loops through each contract
    attack_code: Bytecode = (
        Op.JUMPDEST  # Entry point
        + Op.MSTORE(offset=0, value=BALANCEOF_SELECTOR)  # Store selector once for all contracts
    )

    for erc20_address in erc20_addresses:
        # For each contract, execute SLOAD operations (balanceOf)
        attack_code += (
            # Initialize counter in memory[32] = number of balanceOf calls
            Op.MSTORE(offset=32, value=sload_calls_per_contract)
            # Loop for balanceOf calls
            + While(
                condition=Op.MLOAD(32) + Op.ISZERO + Op.ISZERO,
                body=(
                    # Call balanceOf(address) on ERC20 contract
                    # args_offset=28 reads: selector from MEM[28:32] + address
                    # from MEM[32:64]
                    Op.CALL(
                        address=erc20_address,
                        value=0,
                        args_offset=28,
                        args_size=36,
                        ret_offset=0,
                        ret_size=0,
                    )
                    + Op.POP  # Discard CALL success status
                    # Decrement counter
                    + Op.MSTORE(offset=32, value=Op.SUB(Op.MLOAD(32), 1))
                ),
            )
        )

        # For each contract, execute SSTORE operations (approve)
        # Reuse the same memory layout as balanceOf
        attack_code += (
            # Store approve selector at memory[0] (reusing same slot)
            Op.MSTORE(offset=0, value=APPROVE_SELECTOR)
            # Initialize counter in memory[32] = number of approve calls
            # (reusing same slot)
            + Op.MSTORE(offset=32, value=sstore_calls_per_contract)
            # Loop for approve calls
            + While(
                condition=Op.MLOAD(32) + Op.ISZERO + Op.ISZERO,
                body=(
                    # Store spender at memory[64] (counter as spender/amount)
                    Op.MSTORE(offset=64, value=Op.MLOAD(32))
                    # Call approve(spender, amount) on ERC20 contract
                    # args_offset=28 reads: selector from MEM[28:32] +
                    # spender from MEM[32:64] + amount from MEM[64:96]
                    # Note: counter at MEM[32:64] is reused as spender,
                    # and value at MEM[64:96] serves as the amount
                    + Op.CALL(
                        address=erc20_address,
                        value=0,
                        args_offset=28,
                        args_size=68,
                        ret_offset=0,
                        ret_size=0,
                    )
                    + Op.POP  # Discard CALL success status
                    # Decrement counter
                    + Op.MSTORE(offset=32, value=Op.SUB(Op.MLOAD(32), 1))
                ),
            )
        )

    # Deploy attack contract
    attack_address = pre.deploy_contract(code=attack_code)

    # Run the attack
    attack_tx = Transaction(
        to=attack_address,
        gas_limit=gas_benchmark_value,
        sender=pre.fund_eoa(),
    )

    # Post-state
    post = {
        attack_address: Account(storage={}),
    }

    blockchain_test(
        pre=pre,
        blocks=[Block(txs=[attack_tx])],
        post=post,
    )
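
The `args_offset=28` choice in the CALLs above follows from how MSTORE writes a 32-byte big-endian word: a 4-byte selector stored at offset 0 ends up in bytes 28 to 31, immediately before the word at offset 32. The sketch below emulates that layout with a plain `bytearray`. It is only an illustration (the `mstore` helper is a hypothetical stand-in for the opcode, not framework code), and the counter value 719 is an arbitrary example.

```python
BALANCEOF_SELECTOR = 0x70A08231  # balanceOf(address)
APPROVE_SELECTOR = 0x095EA7B3  # approve(address,uint256)


def mstore(memory: bytearray, offset: int, value: int) -> None:
    """Emulate EVM MSTORE: write `value` as a 32-byte big-endian word."""
    end = offset + 32
    if len(memory) < end:
        memory.extend(b"\x00" * (end - len(memory)))
    memory[offset:end] = value.to_bytes(32, "big")


counter = 719  # arbitrary loop counter, doubles as address/spender/amount

# balanceOf: selector word at MEM[0], counter at MEM[32]; read 36 bytes from 28.
memory = bytearray()
mstore(memory, 0, BALANCEOF_SELECTOR)
mstore(memory, 32, counter)
balanceof_calldata = bytes(memory[28:28 + 36])

# approve: overwrite the selector, copy the counter to MEM[64]; read 68 bytes from 28.
mstore(memory, 0, APPROVE_SELECTOR)
mstore(memory, 64, counter)
approve_calldata = bytes(memory[28:28 + 68])

print(balanceof_calldata.hex())
print(approve_calldata.hex())
```

Both slices start with the 4-byte selector followed by 32-byte ABI words, which is the calldata shape an ERC20 function dispatcher expects.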
