Closed
Changes from 27 commits
35 commits
a1f2153
Add BloatNet tests
gballet Aug 14, 2025
02d65b4
try building the contract
gballet Aug 14, 2025
e721cc6
fix: SSTORE 0 -> 1 match all values in the state
gballet Aug 14, 2025
d1cad25
add the tx for 0 -> 1 and 1 -> 2
gballet Aug 14, 2025
16f6d30
fix: linter issues
gballet Aug 14, 2025
374e08a
remove more whitespaces
gballet Aug 14, 2025
333c876
fix formatting
gballet Aug 15, 2025
79a95b8
move to benchmarks
gballet Aug 21, 2025
8131e98
fix linter value
gballet Aug 22, 2025
5f805fd
use the gas limit from the environment
gballet Aug 22, 2025
090a400
parameterize the written value in SSTORE
gballet Aug 26, 2025
cd02a02
fix linter issues
gballet Aug 26, 2025
1f3c381
update CHANGELOG.md
gballet Aug 26, 2025
f6def7e
fix format
gballet Aug 26, 2025
7e20a50
simplify syntax
gballet Aug 26, 2025
c24ad35
fix: start with an empty contract storage
gballet Aug 26, 2025
fc27e53
more fixes, but the result is still incorrect
gballet Aug 26, 2025
7d87262
fix: finally fix the tests
gballet Aug 26, 2025
8556014
linter fix
gballet Aug 27, 2025
326915e
add SLOAD tests
gballet Aug 27, 2025
1f8e62a
test(benchmark): implement CREATE2 addressing for bloatnet tests
CPerezz Aug 29, 2025
8babb13
refactor(benchmark): optimize gas calculations in bloatnet tests
CPerezz Sep 11, 2025
e70132b
refactor(benchmark): bloatnet tests with unique bytecode for I/O opt…
CPerezz Sep 11, 2025
0e889d7
refactor(benchmark): replace custom CREATE2 address calculation with …
CPerezz Sep 11, 2025
e4583b6
CREATE2 factory approach working
CPerezz Sep 17, 2025
06f9a63
Version with EIP-7997 model working
CPerezz Sep 17, 2025
49c1343
refactor(benchmark): imrpove contract deployment script with interact…
CPerezz Sep 17, 2025
2875cf4
delete: remove obsolete test_create2.py script
CPerezz Sep 18, 2025
b634ca3
refactor(benchmark): optimize gas calculations for BALANCE + EXTCODEC…
CPerezz Sep 18, 2025
774c56c
refactor(benchmark): support non-fixed max_codesize
CPerezz Sep 18, 2025
6e6863a
chore: Remove all 24kB "hardcoded" refs
CPerezz Sep 18, 2025
f2cd5f9
fix: pre-commit lint hooks
CPerezz Sep 18, 2025
cf2c7c6
push updated deploy_create2_factory refactored with EEST as dep
CPerezz Sep 18, 2025
a862f76
refactor(benchmark): enhance CREATE2 factory deployment and testing
CPerezz Sep 19, 2025
55396fb
remove: old_deploy_factory script
CPerezz Sep 19, 2025
1 change: 1 addition & 0 deletions docs/CHANGELOG.md
@@ -131,6 +131,7 @@ Users can select any of the artifacts depending on their testing needs for their

### 🧪 Test Cases

- ✨ [BloatNet](https://bloatnet.info)/Multidimensional Metering: Add benchmarks to be used as part of the BloatNet project and also for Multidimensional Metering.
- ✨ [EIP-7951](https://eips.ethereum.org/EIPS/eip-7951): Add additional test cases for modular comparison.
- 🔀 Refactored `BLOBHASH` opcode context tests to use the `pre_alloc` plugin in order to avoid contract and EOA address collisions ([#1637](https://github.com/ethereum/execution-spec-tests/pull/1637)).
- 🔀 Refactored `SELFDESTRUCT` opcode collision tests to use the `pre_alloc` plugin in order to avoid contract and EOA address collisions ([#1643](https://github.com/ethereum/execution-spec-tests/pull/1643)).
1 change: 1 addition & 0 deletions pyproject.toml
@@ -146,6 +146,7 @@ exclude = [
'^fixtures/',
'^logs/',
'^site/',
'^tests/benchmark/bloatnet/deploy_.*\.py$',
]
plugins = ["pydantic.mypy"]

91 changes: 91 additions & 0 deletions scripts/test_create2.py
@@ -0,0 +1,91 @@
#!/usr/bin/env python3
"""Test CREATE2 factory deployment."""

from web3 import Web3
from eth_utils import keccak

# Connect to Geth
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))
if not w3.is_connected():
    print("Failed to connect to Geth")
    exit(1)

test_account = w3.eth.accounts[0]
print(f"Using test account: {test_account}")
print(f"Balance: {w3.eth.get_balance(test_account) / 10**18:.4f} ETH")

# Simple CREATE2 factory that returns the deployed address
factory_bytecode = (
Review comment (Member):
I believe this file got accidentally committed but will still leave some comments.

Nit: this is the factory initcode as this bytecode will deploy the CREATE2 factory contract in case of a create transaction.

    # Constructor: return runtime code
    "60" + "16"  # PUSH1 0x16 (runtime size = 22)
    "80"  # DUP1
    "60" + "0b"  # PUSH1 0x0B (runtime offset = constructor length, 11 bytes)
    "60" + "00"  # PUSH1 0x00 (memory dest)
    "39"  # CODECOPY
    "60" + "00"  # PUSH1 0x00 (return offset)
    "f3"  # RETURN

    # Runtime: minimal CREATE2
    "36"  # CALLDATASIZE
    "60" + "00"  # PUSH1 0x00
    "60" + "00"  # PUSH1 0x00
    "37"  # CALLDATACOPY (copy all calldata to memory)

    "60" + "00"  # PUSH1 0x00 (salt - using 0 for simplicity)
Review comment (Member):
A fixed salt means that identical calldata sent to this contract will always be deployed at the same address. It is thus not possible to create "copies" from this contract that run the same initcode at a different address.

Review comment (Contributor):
That's a good point! Anyways, removing this file as you pointed out. So we can ignore.
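
To illustrate the salt point above, here is a minimal sketch of the CREATE2 address derivation from EIP-1014. The helper names `create2_address` and `salt_for_index` are hypothetical and not part of this PR. The hash function is injected so the sketch has no third-party dependency; on a real chain it must be Keccak-256 (for example `eth_utils.keccak`, which the script already imports).

```python
from typing import Callable


def create2_address(
    deployer: bytes,
    salt: bytes,
    init_code_hash: bytes,
    keccak256: Callable[[bytes], bytes],
) -> bytes:
    """Return the 20-byte CREATE2 address as defined in EIP-1014.

    address = keccak256(0xff ++ deployer ++ salt ++ keccak256(init_code))[12:]
    """
    if len(deployer) != 20 or len(salt) != 32 or len(init_code_hash) != 32:
        raise ValueError("expected 20-byte deployer, 32-byte salt, 32-byte hash")
    preimage = b"\xff" + deployer + salt + init_code_hash
    return keccak256(preimage)[12:]


def salt_for_index(index: int) -> bytes:
    """Hypothetical salt scheme: the deployment counter as a 32-byte big-endian word."""
    return index.to_bytes(32, "big")
```

With a constant salt, the only varying input is the initcode hash, so the same initcode always maps to one address. Varying the salt per deployment (as in `salt_for_index`) is one way to obtain many copies of the same initcode at distinct addresses.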

    "36"  # CALLDATASIZE (size of init code)
    "60" + "00"  # PUSH1 0x00 (offset in memory)
    "60" + "00"  # PUSH1 0x00 (value)
    "f5"  # CREATE2

    "60" + "00"  # PUSH1 0x00
    "52"  # MSTORE (store address at 0)
    "60" + "20"  # PUSH1 0x20
    "60" + "00"  # PUSH1 0x00
Review comment (Member):
PUSH1 0 could also be written as PUSH0 (0x5f) - this is cheaper but not supported on chains which do not have PUSH0. See also https://eips.ethereum.org/EIPS/eip-7997

Review comment (Contributor):
Bloatnet focuses precisely on Ethereum's mainnet. So for now I think we're fine using PUSH0.
We can always modify if needed.

    "f3"  # RETURN (return address)
)

# Deploy factory
factory_tx = w3.eth.send_transaction({
    'from': test_account,
    'data': '0x' + factory_bytecode,
    'gas': 3000000
})

factory_receipt = w3.eth.wait_for_transaction_receipt(factory_tx)
if factory_receipt.status != 1:
    print("Failed to deploy factory")
    exit(1)

factory_address = factory_receipt.contractAddress
print(f"\nFactory deployed at: {factory_address}")

# Create simple contract bytecode (just returns 42)
simple_bytecode = "602a60005260206000f3" # PUSH1 42, PUSH1 0, MSTORE, PUSH1 32, PUSH1 0, RETURN

# Deploy using factory
print("\nDeploying contract via CREATE2...")
deploy_tx = w3.eth.send_transaction({
    'from': test_account,
    'to': factory_address,
    'data': '0x' + simple_bytecode,
    'gas': 1000000
})

deploy_receipt = w3.eth.wait_for_transaction_receipt(deploy_tx)
print(f"Transaction status: {deploy_receipt.status}")

# Get return value (the deployed address)
result = w3.eth.call({
    'to': factory_address,
    'data': '0x' + simple_bytecode
})

if result:
    deployed_addr = '0x' + result[-20:].hex()
    print(f"Contract deployed at: {deployed_addr}")

    # Verify by checking code
    code = w3.eth.get_code(deployed_addr)
    print(f"Deployed code length: {len(code)} bytes")
Review comment (Member):
This should store as code: 0x00..0042 (32 bytes)

else:
    print("No return value from factory")
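
The constructor in this script hardcodes the runtime size and offset as `PUSH1` operands, which is easy to get wrong by hand. As a hedged sketch (the helper `build_initcode` is hypothetical, not part of the PR), both values can be derived from the byte strings themselves, assuming the same 11-byte constructor layout and a runtime shorter than 256 bytes:

```python
# Runtime code transcribed from the script: copy calldata to memory,
# CREATE2 with salt 0, then return the 32-byte word holding the new address.
RUNTIME_HEX = (
    "366000600037"      # CALLDATASIZE, PUSH1 0, PUSH1 0, CALLDATACOPY
    "60003660006000f5"  # PUSH1 0 (salt), CALLDATASIZE, PUSH1 0, PUSH1 0, CREATE2
    "60005260206000f3"  # PUSH1 0, MSTORE, PUSH1 0x20, PUSH1 0, RETURN
)

CONSTRUCTOR_LEN = 11  # length in bytes of the constructor emitted below


def build_initcode(runtime_hex: str) -> str:
    """Prefix runtime code with a constructor that returns it.

    Constructor layout:
    PUSH1 size, DUP1, PUSH1 offset, PUSH1 0, CODECOPY, PUSH1 0, RETURN
    """
    runtime = bytes.fromhex(runtime_hex)
    if len(runtime) > 0xFF:
        raise ValueError("runtime too large for a PUSH1 size operand")
    constructor = bytes([
        0x60, len(runtime),     # PUSH1 <runtime size>
        0x80,                   # DUP1
        0x60, CONSTRUCTOR_LEN,  # PUSH1 <runtime offset> (runtime follows the constructor)
        0x60, 0x00,             # PUSH1 0 (memory destination)
        0x39,                   # CODECOPY
        0x60, 0x00,             # PUSH1 0 (return offset)
        0xF3,                   # RETURN
    ])
    assert len(constructor) == CONSTRUCTOR_LEN
    return (constructor + runtime).hex()
```

For the runtime above, which is 22 bytes long, this yields a size operand of `0x16` and an offset operand of `0x0b`.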
133 changes: 133 additions & 0 deletions tests/benchmark/bloatnet/README.md
@@ -0,0 +1,133 @@
# BloatNet Benchmark Tests setup guide

## Overview

The BloatNet benchmarks work as follows:
1. They usually require a previously deployed (and usually quite large) state that the benchmarks
will interact with.
2. The deployment helper scripts deploy the bytecode required by the specific tests.
3. The outputs of the deployment scripts are hardcoded into the codebase so that the benchmarks can interact with them.

## Gas Cost Constants

### BALANCE + EXTCODESIZE Pattern
**Gas per contract: 2,707**
- `PUSH20` (address): 3 gas
- `BALANCE` (cold access): 2,600 gas
- `POP`: 2 gas
- `EXTCODESIZE` (warm): 100 gas
- `POP`: 2 gas
Review comment (Member):
This test is basically two scenarios in one. It tests both BALANCE (which marks the account warm) and EXTCODESIZE.

Note that accounts in the state's Merkle Patricia Trie are stored as:

[nonce, balance, storageRoot, codeHash]

Thus reading the balance from the MPT "just" requires reading the account. EXTCODESIZE, however, means we have to query codeHash and then look up all the code from the DB in order to determine its size (this assumes that the client has not optimized this in some way, for instance via an extra database such as a codeHash => codeSize lookup, which would skip reading all the code first).

So, I believe we need scenarios for BALANCE/EXTCODESIZE.
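
The difference between the two lookups can be sketched with a toy in-memory model. This is an illustration only: `ToyState` and its fields are hypothetical, real clients differ, and SHA-256 stands in for Keccak-256 to keep the sketch within the standard library.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Account:
    """Account record as stored in the state trie."""
    nonce: int
    balance: int
    storage_root: bytes
    code_hash: bytes


@dataclass
class ToyState:
    accounts: dict = field(default_factory=dict)    # address -> Account
    code_db: dict = field(default_factory=dict)     # code_hash -> code bytes
    size_index: dict = field(default_factory=dict)  # optional code_hash -> size
    code_bytes_read: int = 0                        # bytes of code loaded so far

    def deploy(self, address: str, code: bytes, balance: int = 0) -> None:
        code_hash = hashlib.sha256(code).digest()  # stand-in for keccak256
        self.code_db[code_hash] = code
        self.size_index[code_hash] = len(code)
        self.accounts[address] = Account(1, balance, b"\x00" * 32, code_hash)

    def balance(self, address: str) -> int:
        # BALANCE: the account record alone is enough.
        return self.accounts[address].balance

    def ext_code_size(self, address: str, use_size_index: bool = False) -> int:
        # EXTCODESIZE: account record first, then either the size index
        # or the full code blob.
        code_hash = self.accounts[address].code_hash
        if use_size_index:
            return self.size_index[code_hash]
        code = self.code_db[code_hash]
        self.code_bytes_read += len(code)
        return len(code)
```

In this model `balance` never touches `code_db`, while `ext_code_size` loads the whole blob unless the optional size index is used, which is the kind of client-side optimization a combined or isolated benchmark could expose.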

For EXTCODESIZE, I think this benchmark test is what you want:

Review comment (Member):
For BALANCE (cold) this test:

Review comment (Contributor):
I'm not sure I follow you here.

If these standalone scenarios already exist as you correctly pointed out, and my PR adds the combination of them into a single test, what is actually needed further from what this PR adds?

Review comment (Contributor):
Are you claiming we need to implement something? What I want is to test the combination of the 2 together. And observe if any client has optimizations that can be applied. This is all part of the following scenarios I want to implement for bloatnet: https://hackmd.io/9icZeLN7R0Sk5mIjKlZAHQ#Opcode-State-Access-Combination-Tests

Review comment (Member):
Sorry! You are right, I was thinking of this from a different perspective (opcodes in isolation). The combined test is indeed not written.


### BALANCE + EXTCODECOPY Pattern
**Gas per contract: ~5,007**
- `PUSH20` (address): 3 gas
- `BALANCE` (cold access): 2,600 gas
- `POP`: 2 gas
- `EXTCODECOPY` setup: ~100 gas
- `EXTCODECOPY` (24KB): ~2,300 gas
Review comment (Member):
(same comment regarding BALANCE + EXTCODECOPY here, should be in different tests so we can investigate the individual behavior of these opcodes instead of mixed)

For EXTCODECOPY, we want to force clients to read all code. But since we expand memory for the copied bytes and pay for those copied bytes also, copying 1 byte instead of the 24 kb should be sufficient (this forces clients to load code from disk since we need to know what that specific byte is). Therefore we should read the final byte of the contract.

This test is relevant:

attack_call = Op.EXTCODECOPY(address=Op.SHA3(32 - 20 - 1, 85), dest_offset=96, size=1000)

I'm not sure why it copies 1000 bytes, but this should be edited to read 1 byte so we can target more accounts. (The 1000-byte copy likely stems from the original purpose of these benchmarks (zkEVM): there we want to measure the worst-case zk cycles, not the worst state attack, which is a slightly different performance perspective on worst-case scenarios.)
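
To put rough numbers on the single-byte suggestion, here is a hedged sketch of the EXTCODECOPY gas formula as specified post-Berlin (EIP-2929 access cost, 3 gas per copied word, plus memory expansion). The function names are hypothetical, and the surrounding `PUSH`/`POP` opcodes are not included.

```python
def words(size_bytes: int) -> int:
    """Number of 32-byte words needed to hold size_bytes."""
    return (size_bytes + 31) // 32


def memory_cost(size_bytes: int) -> int:
    """Total gas cost of having size_bytes of memory allocated."""
    w = words(size_bytes)
    return 3 * w + (w * w) // 512


def extcodecopy_gas(size: int, dest_offset: int = 0,
                    warm: bool = True, mem_before: int = 0) -> int:
    """Gas charged by one EXTCODECOPY of `size` bytes to `dest_offset`.

    mem_before is the number of memory bytes already allocated; expansion
    is only paid for growth beyond it.
    """
    access = 100 if warm else 2600
    copy = 3 * words(size)
    mem_after = max(mem_before, dest_offset + size) if size else mem_before
    expansion = memory_cost(mem_after) - memory_cost(mem_before)
    return access + copy + expansion


full_first = extcodecopy_gas(24576)                    # 5860: first 24 KB copy, empty memory
full_repeat = extcodecopy_gas(24576, mem_before=24576)  # 2404: memory already expanded
one_byte = extcodecopy_gas(1)                          # 106: single byte, empty memory
```

Under these assumptions a warm 24 KB copy costs about 2,400 gas once memory is already expanded, against roughly 100 gas for a single byte, so a 1-byte read leaves far more gas per block for touching additional accounts.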

Review comment (Member):
Note: in the linked tests there is also a calculation in the pre-setup phase that determines how many contracts are necessary. It uses an upper bound; in practice the number should be slightly lower, so the attack block will always read non-empty accounts.

Review comment (Contributor):
The part of reading a single byte is pure gold! Thanks so much for this trick! I did not consider it but makes all the sense!

- `POP`: 2 gas

## Required Contracts Calculation Example:

### For BALANCE + EXTCODESIZE:
| Gas Limit | Contracts Needed | Calculation |
| --------- | ---------------- | ------------------- |
| 5M | 1,838 | 5,000,000 ÷ 2,707 |
| 50M | 18,380 | 50,000,000 ÷ 2,707 |
| 150M | 55,403 | 150,000,000 ÷ 2,707 |

### For BALANCE + EXTCODECOPY:
| Gas Limit | Contracts Needed | Calculation |
| --------- | ---------------- | ------------------- |
| 5M | 998 | 5,000,000 ÷ 5,007 |
| 50M | 9,986 | 50,000,000 ÷ 5,007 |
| 150M | 29,958 | 150,000,000 ÷ 5,007 |

You can see the associated attack constants inside of the tests in `bloatnet/test_bloatnet.py`
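
The per-contract costs and the "Calculation" column can be reproduced with a short sketch. The names below are hypothetical and are not the constants used in `test_bloatnet.py`; the `overhead` parameter is an assumption for gas that is not itemized in this README (for example intrinsic transaction cost).

```python
# Per-opcode gas, as listed in the "Gas Cost Constants" section.
BALANCE_EXTCODESIZE = [
    ("PUSH20", 3),
    ("BALANCE (cold)", 2600),
    ("POP", 2),
    ("EXTCODESIZE (warm)", 100),
    ("POP", 2),
]
BALANCE_EXTCODECOPY = [
    ("PUSH20", 3),
    ("BALANCE (cold)", 2600),
    ("POP", 2),
    ("EXTCODECOPY setup", 100),
    ("EXTCODECOPY (24KB)", 2300),
    ("POP", 2),
]


def gas_per_contract(pattern) -> int:
    """Sum the gas of every opcode in the access pattern."""
    return sum(gas for _, gas in pattern)


def contracts_needed(gas_limit: int, per_contract: int, overhead: int = 0) -> int:
    """Whole number of contracts that fit in gas_limit after subtracting overhead."""
    return (gas_limit - overhead) // per_contract


for limit in (5_000_000, 50_000_000, 150_000_000):
    size_count = contracts_needed(limit, gas_per_contract(BALANCE_EXTCODESIZE))
    copy_count = contracts_needed(limit, gas_per_contract(BALANCE_EXTCODECOPY))
    print(f"{limit:>11,}  EXTCODESIZE: {size_count:>6,}  EXTCODECOPY: {copy_count:>6,}")
```

Plain floor division reproduces the EXTCODECOPY table exactly (998, 9,986 and 29,958). For EXTCODESIZE it gives 1,847, 18,470 and 55,411, slightly above the table's figures; the lower values in the table presumably leave headroom for overhead that is not itemized here, which is an assumption rather than something this README states.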

## Quick Start: 150M Gas Attack

### 1. Deploy CREATE2 Factory (skip this step if you prefer to use an already-deployed one)

```bash
# One-time setup - deploy the CREATE2 factory
python3 tests/benchmark/bloatnet/deploy_create2_factory.py

# Output will show:
# Factory deployed at: 0x... <-- Save this address
```

### 2. Deploy Contracts

The deployment script is interactive and will guide you through selecting the appropriate contract type for your benchmark.

#### Contract Types Available

1. **max_size_24kb**: 24KB contracts filled with unique bytecode (EXTCODE_ type of tests)
2. **sload_heavy**: Contracts optimized for SLOAD benchmarking
3. **storage_heavy**: Contracts with pre-initialized storage
4. **custom**: Custom bytecode (for future extensions)

#### Calculate Contracts Needed

Before running the deployment, calculate the number of contracts needed:
- For 150M gas BALANCE+EXTCODESIZE: 55,403 contracts
- For 150M gas BALANCE+EXTCODECOPY: 29,958 contracts

_Deploy enough contracts to cover the max gas you plan to use in your tests/benchmarks._

#### Running the Deployment

```bash
# Run the interactive deployment script
python3 tests/benchmark/bloatnet/deploy_bloatnet_simple.py \
--num-contracts 55403 \
--factory-address 0x... # Use factory address from step 1
```

#### Deployment Output

After successful deployment, the script will:

1. Display the configuration needed for tests:
   ```python
   === Configuration for max_size_24kb tests ===
   CONTRACT_TYPE = "max_size_24kb"
   FACTORY_ADDRESS = Address("0x...")
   INIT_CODE_HASH = bytes.fromhex("...")
   NUM_DEPLOYED_CONTRACTS = 55403
   ```

2. Save the configuration to a file:
   ```
   Configuration saved to: bloatnet_config_max_size_24kb.txt
   ```

This file contains all the values needed to update your test configuration.

### 3. Update Test Configuration

Edit `tests/benchmark/bloatnet/test_bloatnet.py` and update:

```python
FACTORY_ADDRESS = Address("0x...") # From deployment output
INIT_CODE_HASH = bytes.fromhex("...") # From deployment output
NUM_DEPLOYED_CONTRACTS = 55403 # Actual deployed count
```

### 4. Run Benchmark Tests

```bash
# Run with specific gas values (in millions)
uv run fill --fork=Prague --gas-benchmark-values=150 \
tests/benchmark/bloatnet/test_bloatnet.py --clean

# With EVM traces for analysis
uv run fill --fork=Prague --gas-benchmark-values=150 \
--evm-dump-dir=traces/ --traces \
tests/benchmark/bloatnet/test_bloatnet.py

# Multiple gas values
uv run fill --fork=Prague --gas-benchmark-values=5,50,150 \
tests/benchmark/bloatnet/test_bloatnet.py
```