198 changes: 129 additions & 69 deletions tests/benchmark/compute/instruction/test_system.py
@@ -18,6 +18,7 @@
import pytest
from execution_testing import (
Account,
Address,
Alloc,
BenchmarkTestFiller,
Block,
@@ -67,6 +68,10 @@ def test_xcall(
max_contract_size = fork.max_code_size()

gas_costs = fork.gas_costs()
tx_gas_limit_cap = fork.transaction_gas_limit_cap()
assert tx_gas_limit_cap is not None, (
"This benchmark requires a tx gas limit cap"
)
Comment on lines +71 to +74
Collaborator Author

Could we mark this test for Osaka forward only? (Or maybe that already happens, since AFAIK in all the benchmark releases only Prague was filled, and I don't think we want backwards performance runs. For zkVM it also makes sense that previous forks aren't interesting for this attack.)

I think we could make it work backwards, but it would blow up the logic below regarding tx preparation for setup and attack. This test is already quite complex, so I'm not sure it is worth it.

Collaborator

Suggested change
- tx_gas_limit_cap = fork.transaction_gas_limit_cap()
- assert tx_gas_limit_cap is not None, (
- "This benchmark requires a tx gas limit cap"
- )
+ tx_gas_limit_cap = fork.transaction_gas_limit_cap() or gas_benchmark_value

I believe this should be sufficient to support both pre-Osaka and post-Osaka scenarios. If fork.transaction_gas_limit_cap() returns None (e.g., in Prague), it will fall back to gas_benchmark_value, and the transaction preparation logic will result in a single transaction:

full_txs = attack_gas_limit // tx_gas_limit_cap   # becomes 1
remainder = attack_gas_limit % tx_gas_limit_cap   # becomes 0
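
A minimal sketch of the fallback arithmetic in this suggestion, for illustration only. The helper name `split_attack_gas` is hypothetical, and the single-transaction result assumes `attack_gas_limit` equals `gas_benchmark_value`, which the visible diff does not confirm. The cap of 2**24 is the value defined by EIP-7825.

```python
from typing import Optional, Tuple


def split_attack_gas(
    attack_gas_limit: int,
    tx_gas_limit_cap: Optional[int],
    gas_benchmark_value: int,
) -> Tuple[int, int]:
    """Return (full_txs, remainder) using the `or` fallback (hypothetical helper)."""
    # `None or x` evaluates to x, so forks without a cap use the benchmark value.
    cap = tx_gas_limit_cap or gas_benchmark_value
    return attack_gas_limit // cap, attack_gas_limit % cap


# No cap (e.g. Prague): the whole budget fits in one transaction.
print(split_attack_gas(60_000_000, None, 60_000_000))   # (1, 0)
# Cap of 2**24 gas per transaction: three full txs plus a smaller remainder tx.
print(split_attack_gas(60_000_000, 2**24, 60_000_000))  # (3, 9668352)
```

Note that `or` also falls back when the cap is `0`, not only when it is `None`; that should not matter here, but it is worth keeping in mind.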


# Calculate the absolute minimum gas costs to deploy the contract This does
# not take into account setting up the actual memory (using KECCAK256 and
@@ -90,7 +95,7 @@ def test_xcall(
+ gas_costs.G_COLD_ACCOUNT_ACCESS # Opcode cost
+ 30 # ~Gluing opcodes
)
# Calculate the number of contracts to be targeted
# Calculate an upper bound of the number of contracts to be targeted
num_contracts = (
# Base available gas = GAS_LIMIT - intrinsic - (out of loop MSTOREs)
attack_gas_limit - intrinsic_gas_cost_calc() - gas_costs.G_VERY_LOW * 4
@@ -108,6 +113,128 @@ def test_xcall(
"during the setup phase of this test."
)

initcode, factory_address, factory_caller_address = (
_deploy_max_contract_factory(pre, fork)
)
Comment on lines +116 to +118
Collaborator Author

I extracted the deployment of the factory contract into a separate function, partly to avoid such a big method here, which hurts readability, but mostly because that code is duplicated in test_extcode_ops, which I'll fix next, so I can reuse the function and we can delete code.

Collaborator

That's nice. If you want to share the helper function, you could place it under tests/benchmark/compute/helpers.py


# Deploy num_contracts via multiple txs (each capped by tx gas limit).
Collaborator Author

In the setup phase, we can't create all the required bytecodes in a single tx because this is disallowed by Osaka (unfortunately for us here).

This means we have to split the creation into multiple transactions. This will have an unavoidable hit on filling performance, which is the reason we used the previous strategy in the first place.

For 60M it takes ~2 min on my machine, which maybe isn't that bad, but I'm still not sure we can work around this given the Osaka limitations.

Collaborator

In my personal view, ~2 minutes isn’t too bad for 60M. When I tried this refactoring earlier, the setup + attack phases took more than 20 minutes (I don’t recall the exact number), which made it very hard to check the result. The recent fill optimizations have already made a huge difference.

If this does become a bottleneck again, we can profile the fill process and identify more optimization opportunities!
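
To give a rough sense of how many setup transactions the split implies, here is a small sketch of the batching estimate. The function name `deployment_plan` and the `num_contracts=1_000` input are illustrative assumptions. The constants are the EIP-7825 transaction gas limit cap (2**24), the 200 gas per byte code deposit cost, and the EIP-170 code size limit (24 KiB); the test itself reads these from `fork`.

```python
import math
from typing import Tuple

TX_GAS_LIMIT_CAP = 2**24      # EIP-7825 per-transaction gas cap
G_CODE_DEPOSIT_BYTE = 200     # gas charged per deployed code byte
MAX_CODE_SIZE = 24 * 1024     # EIP-170 maximum contract size in bytes


def deployment_plan(
    num_contracts: int,
    tx_gas_limit_cap: int = TX_GAS_LIMIT_CAP,
    g_code_deposit_byte: int = G_CODE_DEPOSIT_BYTE,
    max_code_size: int = MAX_CODE_SIZE,
) -> Tuple[int, int]:
    """Return (contracts_per_tx, num_txs), counting only the code deposit cost."""
    contracts_per_tx = tx_gas_limit_cap // (g_code_deposit_byte * max_code_size)
    if contracts_per_tx == 0:
        raise ValueError("a single max-size contract does not fit in one tx")
    return contracts_per_tx, math.ceil(num_contracts / contracts_per_tx)


# Illustrative contract count, not a value taken from the benchmark.
print(deployment_plan(1_000))  # (3, 334)
```

Because every transaction requests a full batch, a plan like this can deploy slightly more contracts than `num_contracts` (here 3 * 334 = 1002), which is harmless for an upper-bound target.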

with TestPhaseManager.setup():
# Rough estimate (rounded down) of contracts per tx based on dominant
# cost factor only. E.g., 17M gas limit + 24KiB contracts = ~3 per tx.
# The goal is to involve the minimum amount of gas pricing to avoid
# complexity and potential brittleness.
# If this estimation becomes incorrect in the future (i.e. the tx gas
# limit cap or the cost per byte changes), the post-state check will
# detect it and the estimate can be adjusted with a more complex formula.
num_contracts_per_tx = tx_gas_limit_cap // (
gas_costs.G_CODE_DEPOSIT_BYTE * max_contract_size
)
attack_txs = math.ceil(num_contracts / num_contracts_per_tx)

contracts_deployment_txs = []
for _ in range(attack_txs):
contracts_deployment_txs.append(
Transaction(
to=factory_caller_address,
gas_limit=tx_gas_limit_cap,
gas_price=10**6,
Collaborator

Suggested change
- gas_price=10**6,

This should be removed. When using execute remote targeting a live network, it will automatically fetch the current network gas price.

data=Hash(num_contracts_per_tx),
sender=pre.fund_eoa(),
)
)

post = {}
for i in range(num_contracts):
deployed_contract_address = compute_create2_address(
address=factory_address,
salt=i,
initcode=initcode,
)
post[deployed_contract_address] = Account(nonce=1)

attack_call = Bytecode()
if opcode == Op.EXTCODECOPY:
attack_call = Op.EXTCODECOPY(
address=Op.SHA3(32 - 20 - 1, 85), dest_offset=96, size=1000
)
else:
# For the rest of the opcodes, we can use the same generic attack call
# since all only minimally need the `address` of the target.
attack_call = Op.POP(opcode(address=Op.SHA3(32 - 20 - 1, 85)))
Comment on lines +146 to +163
Collaborator Author

All this code is the same as before -- just that the diff is a bit unfortunate.

Something I noticed is that this has an opcode == Op.EXTCODECOPY branch, which shouldn't be relevant for this method (look at the test parametrization). Originally, when I created this benchmark, the same test name was used for CALL-like opcodes and also EXTCODESIZE... but I think somebody split them afterwards and maybe this was left behind?

I'll leave this as is for now, and when I fix test_extcode_ops I'll try to reuse the logic, since I'm pretty sure it was duplicated there when it was split.

Collaborator

I didn’t notice this when splitting the tests based on the folder structure. It would be good to remove it once this refactor is complete.

attack_code = (
# Setup memory for later CREATE2 address generation loop.
# 0xFF+[Address(20bytes)]+[seed(32bytes)]+[initcode keccak(32bytes)]
Op.MSTORE(0, factory_address)
+ Op.MSTORE8(32 - 20 - 1, 0xFF)
+ Op.MSTORE(32, Op.CALLDATALOAD(0))
+ Op.MSTORE(64, initcode.keccak256())
# Main loop
+ While(
body=attack_call + Op.MSTORE(32, Op.ADD(Op.MLOAD(32), 1)),
)
)
Comment on lines +164 to +175
Collaborator Author

This is the same attack code as before but with a twist. The previous version started with seed=0 and simply ran the loop until it ran out of gas, since we could do it all in a single tx.

Now we have to split the attack into multiple txs, so I allow reading the seed from CALLDATALOAD(0) so we can create the attack txs in a way that makes sense.
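
For readers less familiar with the memory layout, the following sketch models the four setup stores and the `SHA3(32 - 20 - 1, 85)` slice in plain Python. It only builds the 85-byte CREATE2 preimage (`0xFF ++ address ++ salt ++ keccak256(initcode)`, per EIP-1014) and does not hash it. The function name and the placeholder address and hash values are assumptions for illustration.

```python
def create2_preimage(factory_address: bytes, seed: int, initcode_hash: bytes) -> bytes:
    """Model the attack contract's memory setup and return the hashed slice."""
    assert len(factory_address) == 20 and len(initcode_hash) == 32
    memory = bytearray(96)
    # MSTORE(0, factory_address): a 32-byte word, address right-aligned (bytes 12..31).
    memory[0:32] = factory_address.rjust(32, b"\x00")
    # MSTORE8(32 - 20 - 1, 0xFF): the 0xFF prefix sits right before the address.
    memory[32 - 20 - 1] = 0xFF
    # MSTORE(32, CALLDATALOAD(0)): the starting seed (salt) comes from calldata.
    memory[32:64] = seed.to_bytes(32, "big")
    # MSTORE(64, initcode.keccak256()): hash of the initcode.
    memory[64:96] = initcode_hash
    # SHA3(32 - 20 - 1, 85) hashes the 85 bytes starting at offset 11.
    start = 32 - 20 - 1
    return bytes(memory[start:start + 85])


# Placeholder inputs, not real addresses or hashes.
address = bytes(range(1, 21))
code_hash = b"\xab" * 32
preimage = create2_preimage(address, seed=7, initcode_hash=code_hash)
print(len(preimage))  # 85
```

The loop body then only has to increment the word at offset 32 to move to the next contract, which is why passing a different starting seed per transaction is enough to resume where the previous one stopped.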


if len(attack_code) > max_contract_size:
# TODO: A workaround could be to split the opcode code into multiple
# contracts and call them in sequence.
raise ValueError(
f"Code size {len(attack_code)} exceeds maximum "
f"code size {max_contract_size}"
)
Comment on lines +177 to +183
Collaborator

Suggested change
- if len(attack_code) > max_contract_size:
- # TODO: A workaround could be to split the opcode code into multiple
- # contracts and call them in sequence.
- raise ValueError(
- f"Code size {len(attack_code)} exceeds maximum "
- f"code size {max_contract_size}"
- )
+ # TODO: A workaround could be to split the opcode code into multiple
+ # contracts and call them in sequence.

This check could be removed (we could leave the TODO as it is now), since both fill and execute modes already enforce a contract size limit:

Fill

max_code_size = self._fork.max_code_size()
assert len(code_bytes) <= max_code_size, (
f"code too large: {len(code_bytes)} > {max_code_size}"
)

Execute

assert len(code) <= MAX_BYTECODE_SIZE, (
f"code too large: {len(code)} > {MAX_BYTECODE_SIZE}"
)

attack_address = pre.deploy_contract(code=attack_code)
Comment on lines +177 to +184
Collaborator Author

Unchanged code.


with TestPhaseManager.execution():
full_txs = attack_gas_limit // tx_gas_limit_cap
remainder = attack_gas_limit % tx_gas_limit_cap

num_targeted_contracts_per_full_tx = (
# Base available gas:
# TX_GAS_LIMIT - intrinsic - (out of loop MSTOREs)
tx_gas_limit_cap
- intrinsic_gas_cost_calc()
- gas_costs.G_VERY_LOW * 4
) // loop_cost
contract_start_index = 0
opcode_txs = []
for _ in range(full_txs):
opcode_txs.append(
Transaction(
to=attack_address,
gas_limit=tx_gas_limit_cap,
gas_price=10**9,
Collaborator

Suggested change
- gas_price=10**9,

data=Hash(contract_start_index),
sender=pre.fund_eoa(),
)
)
contract_start_index += num_targeted_contracts_per_full_tx
if remainder > 0:
opcode_txs.append(
Transaction(
to=attack_address,
gas_limit=remainder,
gas_price=10**9,
Collaborator

Suggested change
- gas_price=10**9,

data=Hash(contract_start_index),
sender=pre.fund_eoa(),
)
)
Comment on lines +187 to +219
Collaborator Author

TL;DR: this creates the attack txs to execute in the same block, such that none surpasses the transaction gas limit cap. In contract_start_index we keep the offset that we pass as calldata (read via CALLDATALOAD) to the attack contract, as I explained in my previous comment.

Member

Wanted to mention this (I did not check the other logic): this looks very similar to the split_transactions logic of the BenchmarkTestFiller:

remaining_gas if i == num_splits - 1 else gas_limit_cap

However, split_transactions would just split the tx (without calldata) into multiple txs, which we cannot do here due to the test logic.
@LouisTsai-Csie could we maybe think of adding cases to the split txs logic here? I feel that inserting calldata is something we will often see in other benchmark tests, for instance as a "pointer" to where a tx should pick up after the previous tx has exited. We could also do this in the contract code itself, but that means spending gas on this logic, which we want to spend on the target (the scenario/opcode we are benching).

Collaborator

Yes, this is a very common pattern in stateful scenarios as well. I need to think about it a bit more, maybe a new code generator type could help.
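
As a rough illustration of the pattern discussed in this thread, the sketch below plans capped transactions together with the start offset each one would carry as calldata. The name `plan_attack_txs`, the `fixed_overhead` parameter (standing in for intrinsic gas plus the out-of-loop MSTOREs), and the toy numbers are assumptions; this is not the BenchmarkTestFiller API.

```python
from typing import List, Tuple


def plan_attack_txs(
    attack_gas_limit: int,
    tx_gas_limit_cap: int,
    fixed_overhead: int,
    loop_cost: int,
) -> List[Tuple[int, int]]:
    """Return (gas_limit, start_index) pairs, one per planned transaction."""
    full_txs, remainder = divmod(attack_gas_limit, tx_gas_limit_cap)
    # Loop iterations (targeted contracts) that fit in one full-size tx.
    per_full_tx = (tx_gas_limit_cap - fixed_overhead) // loop_cost
    plan = []
    start_index = 0
    for _ in range(full_txs):
        plan.append((tx_gas_limit_cap, start_index))
        start_index += per_full_tx
    if remainder > 0:
        # The last tx gets the leftover gas and resumes at the next offset.
        plan.append((remainder, start_index))
    return plan


# Toy numbers chosen for readability, not real gas costs.
print(plan_attack_txs(2_500, 1_000, fixed_overhead=100, loop_cost=90))
# [(1000, 0), (1000, 10), (500, 20)]
```

Carrying the offset in calldata keeps the bookkeeping out of the contract, so the gas inside each transaction is spent on the opcode under test.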


blockchain_test(
pre=pre,
post=post,
blocks=[
Block(txs=contracts_deployment_txs),
Block(txs=opcode_txs),
],
exclude_full_post_state_in_output=True,
)


def _deploy_max_contract_factory(
Collaborator Author

The way the contract factory was created remains unchanged.

Note that the factory_code already used storage slot 0 as the internal seed. This is quite nice, since it means that when we now call the factory multiple times, it keeps creating the next bytecode coherently, continuing from where the previous tx stopped.
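
The effect of keeping the seed in storage can be modeled with a few lines of Python. This is a simplified stand-in, assuming the factory uses the slot 0 value as the CREATE2 salt and increments it after each deployment; the class `FactoryModel` is hypothetical and does not reproduce the actual factory bytecode.

```python
from typing import List


class FactoryModel:
    """Toy model of a factory whose salt counter persists across calls."""

    def __init__(self) -> None:
        self.slot0 = 0  # stands in for storage slot 0 (the internal seed)

    def deploy(self, count: int) -> List[int]:
        """Simulate one tx asking for `count` contracts; return the salts used."""
        salts = list(range(self.slot0, self.slot0 + count))
        self.slot0 += count  # state persists, so the next tx continues from here
        return salts


factory = FactoryModel()
print(factory.deploy(3))  # [0, 1, 2]
print(factory.deploy(3))  # [3, 4, 5]
```

Under this assumption the salts across all setup transactions form one contiguous range starting at 0, which matches the post-state check that iterates `salt=i` over `range(num_contracts)`.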

pre: Alloc,
fork: Fork,
) -> tuple[Bytecode, Address, Address]:
max_contract_size = fork.max_code_size()

# The initcode will take its address as a starting point to the input to
# the keccak hash function. It will reuse the output of the hash function
# in a loop to create a large amount of seemingly random code, until it
@@ -177,74 +304,7 @@ def test_xcall(
)
factory_caller_address = pre.deploy_contract(code=factory_caller_code)

with TestPhaseManager.setup():
contracts_deployment_tx = Transaction(
to=factory_caller_address,
gas_limit=env.gas_limit,
gas_price=10**6,
data=Hash(num_contracts),
sender=pre.fund_eoa(),
)

post = {}
deployed_contract_addresses = []
for i in range(num_contracts):
deployed_contract_address = compute_create2_address(
address=factory_address,
salt=i,
initcode=initcode,
)
post[deployed_contract_address] = Account(nonce=1)
deployed_contract_addresses.append(deployed_contract_address)

attack_call = Bytecode()
if opcode == Op.EXTCODECOPY:
attack_call = Op.EXTCODECOPY(
address=Op.SHA3(32 - 20 - 1, 85), dest_offset=96, size=1000
)
else:
# For the rest of the opcodes, we can use the same generic attack call
# since all only minimally need the `address` of the target.
attack_call = Op.POP(opcode(address=Op.SHA3(32 - 20 - 1, 85)))
attack_code = (
# Setup memory for later CREATE2 address generation loop.
# 0xFF+[Address(20bytes)]+[seed(32bytes)]+[initcode keccak(32bytes)]
Op.MSTORE(0, factory_address)
+ Op.MSTORE8(32 - 20 - 1, 0xFF)
+ Op.MSTORE(32, 0)
+ Op.MSTORE(64, initcode.keccak256())
# Main loop
+ While(
body=attack_call + Op.MSTORE(32, Op.ADD(Op.MLOAD(32), 1)),
)
)

if len(attack_code) > max_contract_size:
# TODO: A workaround could be to split the opcode code into multiple
# contracts and call them in sequence.
raise ValueError(
f"Code size {len(attack_code)} exceeds maximum "
f"code size {max_contract_size}"
)
opcode_address = pre.deploy_contract(code=attack_code)

with TestPhaseManager.execution():
opcode_tx = Transaction(
to=opcode_address,
gas_limit=attack_gas_limit,
gas_price=10**9,
sender=pre.fund_eoa(),
)

blockchain_test(
pre=pre,
post=post,
blocks=[
Block(txs=[contracts_deployment_tx]),
Block(txs=[opcode_tx]),
],
exclude_full_post_state_in_output=True,
)
return initcode, factory_address, factory_caller_address


@pytest.mark.parametrize(