-
Notifications
You must be signed in to change notification settings - Fork 392
feat(benchmarks): fix bytecode attack for CALL-like opcodes to work in Osaka #1850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: forks/osaka
Are you sure you want to change the base?
Conversation
…or contract deployment
…t transactions with proper offseting to avoid overlap
b757b8e to
3aa28b0
Compare
| initcode, factory_address, factory_caller_address = ( | ||
| _deploy_max_contract_factory(pre, fork) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extracted the deployment of the factory contract to a separate function, mainly to avoid such a big method here which hurts readability, but mostly because I think for the upcoming test_extcode_ops I'll fix, that code is duplicated so I can re-use the function and we can delete code.
| _deploy_max_contract_factory(pre, fork) | ||
| ) | ||
|
|
||
| # Deploy num_contracts via multiple txs (each capped by tx gas limit). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the setup phase, we can't create all the required bytecodes in a single tx because this is disallowed by Osaka (unfortuantely for us here).
This means we have to split the creation in multiple transactions. This will have an unavoidable hit in filling performance, since we actually used the previous strategy for that reason.
For 60M it takes ~2m in my machine, which maybe isn't as bad -- but still I'm not sure we can workaround this with Osaka limitations.
| # Rough estimate (rounded down) of contracts per tx based on dominant | ||
| # cost factor only. E.g., 17M gas limit + 24KiB contracts = ~3 per tx. | ||
| # The goal is to involve the minimum amount of gas pricing to avoid | ||
| # complexity and potential brittleness. | ||
| # If this estimation is incorrect in the future (i.e. tx gas limit cap) | ||
| # is increased or cost per byte, the post-state check will detect it | ||
| # and can be adjusted with a more complex formula. | ||
| num_contracts_per_tx = fork.transaction_gas_limit_cap() // ( | ||
| gas_costs.G_CODE_DEPOSIT_BYTE * max_contract_size | ||
| ) | ||
| attack_txs = math.ceil(num_contracts / num_contracts_per_tx) | ||
|
|
||
| contracts_deployment_txs = [] | ||
| for _ in range(attack_txs): | ||
| contracts_deployment_txs.append( | ||
| Transaction( | ||
| to=factory_caller_address, | ||
| gas_limit=fork.transaction_gas_limit_cap(), | ||
| gas_price=10**6, | ||
| data=Hash(num_contracts_per_tx), | ||
| sender=pre.fund_eoa(), | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I preferred to go heavy in comments for future maintainability.
But basically we do a rough estimation of how many 24KiB contracts we can create per tx, and then do that. Note I do ceil in attack_txs instead of trying to do precise creation, just because it doesn't hurt plus it avoids complexity in this code. Since this is the setup phase and the tx is gas capped, I think it is okay.
The factory contract has a new CALLDATA parameter which is how many contracts to create. Although we might only be interested in this being always the max, I think it is better to leave it generic so we can try reusing it as much as possible. It just changes the contract creation loop count to whatever CALLDATALOAD says.
| post = {} | ||
| for i in range(num_contracts): | ||
| deployed_contract_address = compute_create2_address( | ||
| address=factory_address, | ||
| salt=i, | ||
| initcode=initcode, | ||
| ) | ||
| post[deployed_contract_address] = Account(nonce=1) | ||
|
|
||
| attack_call = Bytecode() | ||
| if opcode == Op.EXTCODECOPY: | ||
| attack_call = Op.EXTCODECOPY( | ||
| address=Op.SHA3(32 - 20 - 1, 85), dest_offset=96, size=1000 | ||
| ) | ||
| else: | ||
| # For the rest of the opcodes, we can use the same generic attack call | ||
| # since all only minimally need the `address` of the target. | ||
| attack_call = Op.POP(opcode(address=Op.SHA3(32 - 20 - 1, 85))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this code is the same as before -- just that the diff is a bit unfortunate.
Something I noticed is that this has this opcode == Op.EXTCODECOPY which shouldn't be relevant for this method (look at test parametrization). Originally when I created this benchmark, the same test name was used for CALL-like opcodes and also EXTCODESIZE... but I think somebody split them after and maybe this was left here?
I'll leave this as is for now, and when I fix test_extcode_ops I'll try to re-use the logic since I'm pretty sure it was repeated there when it was splitted.
| attack_code = ( | ||
| # Setup memory for later CREATE2 address generation loop. | ||
| # 0xFF+[Address(20bytes)]+[seed(32bytes)]+[initcode keccak(32bytes)] | ||
| Op.MSTORE(0, factory_address) | ||
| + Op.MSTORE8(32 - 20 - 1, 0xFF) | ||
| + Op.MSTORE(32, Op.CALLDATALOAD(0)) | ||
| + Op.MSTORE(64, initcode.keccak256()) | ||
| # Main loop | ||
| + While( | ||
| body=attack_call + Op.MSTORE(32, Op.ADD(Op.MLOAD(32), 1)), | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same attack code as before but with a twist. The previous version started with seed=0 and simply ran the loop until it ran out of gas, since we could do all in a singe tx.
Now we have to split the attack in multiple txs, thus I allow reading the seed from CALLDATALOAD(0) so we can create the attack txs in a way that makes sense.
| if len(attack_code) > max_contract_size: | ||
| # TODO: A workaround could be to split the opcode code into multiple | ||
| # contracts and call them in sequence. | ||
| raise ValueError( | ||
| f"Code size {len(attack_code)} exceeds maximum " | ||
| f"code size {max_contract_size}" | ||
| ) | ||
| attack_address = pre.deploy_contract(code=attack_code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unchanged code.
| ) | ||
|
|
||
|
|
||
| def _deploy_max_contract_factory( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way the contract factory was created remains unchanged.
Note that the factory_code already used storage slot 0 as the internal seed. This was quite nice since it means that when we call the factory mutliple times now, it keeps creating the next bytecode in a coherent way from the previous tx.
| tx_gas_limit_cap = fork.transaction_gas_limit_cap() | ||
| assert tx_gas_limit_cap is not None, ( | ||
| "This benchmark requires a tx gas limit cap" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we mark this test for Osaka forward only? (Or maybe that already happens since AFAIK in all the benchmark releases only Prague was filled, since I'm not interested we want backwards performance runs. For zkVM also makes sense that previous forks aren't interesting for this attack).
I think we could make it work backwards, it would just blow up a the logic below reg tx preparations for setup and attack. This test is already quite complex so not sure it is worth it.
| full_txs = attack_gas_limit // tx_gas_limit_cap | ||
| remainder = attack_gas_limit % tx_gas_limit_cap | ||
|
|
||
| num_targeted_contracts_per_full_tx = ( | ||
| # Base available gas: | ||
| # TX_GAS_LIMIT - intrinsic - (out of loop MSTOREs) | ||
| tx_gas_limit_cap | ||
| - intrinsic_gas_cost_calc() | ||
| - gas_costs.G_VERY_LOW * 4 | ||
| ) // loop_cost | ||
| contract_start_index = 0 | ||
| opcode_txs = [] | ||
| for _ in range(full_txs): | ||
| opcode_txs.append( | ||
| Transaction( | ||
| to=attack_address, | ||
| gas_limit=tx_gas_limit_cap, | ||
| gas_price=10**9, | ||
| data=Hash(contract_start_index), | ||
| sender=pre.fund_eoa(), | ||
| ) | ||
| ) | ||
| contract_start_index += num_targeted_contracts_per_full_tx | ||
| if remainder > 0: | ||
| opcode_txs.append( | ||
| Transaction( | ||
| to=attack_address, | ||
| gas_limit=remainder, | ||
| gas_price=10**9, | ||
| data=Hash(contract_start_index), | ||
| sender=pre.fund_eoa(), | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL;DR: creating the attack txs to execute in the same block, such that none surpas the transaction gas limit cap. In contract_start_index we keep the offset we pass as CALLDATALOAD for the attack contract that I explained in my previous comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to mention this (did not check other logic): this looks almost similar to the split_transactions logic (
| remaining_gas if i == num_splits - 1 else gas_limit_cap |
BenchmarkTestFiller, however - that split txs would just split the tx (without calldata) up in multiple. Which we cannot do here due to the test logic.@LouisTsai-Csie we could maybe think of adding cases to the split txs logic here? I feel that inserting calldata is something we will see in other benchmark tests often (for instance as a "pointer" to where a certain tx should pick up where the other tx has exited - we could also do this in the contract code itself, but this means spending gas on this logic, which we want to spend on the target (the scenario/opcode we are benching))
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## forks/osaka #1850 +/- ##
================================================
+ Coverage 53.45% 87.31% +33.86%
================================================
Files 743 541 -202
Lines 44076 32832 -11244
Branches 3891 3015 -876
================================================
+ Hits 23559 28668 +5109
+ Misses 20306 3557 -16749
- Partials 211 607 +396
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🗒️ Description
This PR fixes the
test_xcallbenchmarks to work with Osaka.test_xcallhas a CALL-like variant of the bytecode attack exploiting unchunkified code, highly relevant for zkVM worst-cases.I think the solution should feel quite natural — I’ll add some review comments.
I checked this is doing what we want with two independent checks:
For 1. and
--block-gas-limit 30M, my Geth reports accessing ~260MiB of bytecodes were the upper limit is ~280MiB (30M/2600 — but this isn’t accounting for glue opcodes, intrinsic costs of splitting in txs, etc). So this looks fine.For 2. and
--block-gas-limit 30M, I also confirmed it really blows up the guest program witness, which is prob the best ultimate confirmation this is doing what we wanted. i.e: the json-encoded files have ~524MiB size which makes sense since bytes are hex encoded (with other things).🔗 Related Issues or PRs
#1695
✅ Checklist
toxchecks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:uvx tox -e statictype(scope):.