feat(test): test opcode programs in different scenarios by winsvega · Pull Request #808 · ethereum/execution-spec-tests

winsvega · 2024-09-16T10:28:23Z

🗒️ Description

Conversion of the "opcode diff places" tests by ori.
A test defines a series of test scenarios that are run against each parametrized opcode sequence.
We then check whether the sequence worked as expected in the given scenario.

I think this is a powerful way to template-test any new opcode.
We already have predefined scenarios, so we only add one more parameter for what we want to test, and it is covered in all the cases automatically.

Cases can be like:
callcode->staticcall-> [opcode]
create2-> [opcode]
[opcode] -> revert

Check it out: we can define opcode programs and scenarios. The test then puts each opcode program into each scenario and verifies that its result is the expected one (the result may be complex, depending on context and fork).

The idea so far is to have a template test to which we can easily add opcode programs, and they will be run in all the crazy combinations,
like call, delegatecall, suicide, revert and so on.
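
A minimal sketch of the templating idea described above, assuming nothing about the real test code: `Scenario`, `Program`, and `case_ids` are hypothetical names used only for illustration. It shows how every program is paired with every scenario, so adding one program adds one case per existing scenario.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario: a chain of operations wrapped around the program."""

    name: str
    call_chain: tuple


@dataclass(frozen=True)
class Program:
    """Hypothetical opcode program under test (bytecode omitted in this sketch)."""

    name: str


# Illustrative scenarios, mirroring the example cases listed in the description.
SCENARIOS = [
    Scenario("callcode_staticcall", ("CALLCODE", "STATICCALL")),
    Scenario("create2_initcode", ("CREATE2",)),
    Scenario("program_then_revert", ("REVERT",)),
]

PROGRAMS = [Program("add_1_1"), Program("selfbalance")]


def case_ids(programs, scenarios):
    """Return one case id per (program, scenario) pair."""
    return [f"{p.name}-{s.name}" for p, s in product(programs, scenarios)]


ids = case_ids(PROGRAMS, SCENARIOS)
print(len(ids), ids[0])
```

Appending a third entry to `PROGRAMS` would produce three more cases without touching the scenarios.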

This is still WIP.

🔗 Related Issues

#184
ethereum/tests#1490

✅ Checklist

All: Set appropriate labels for the changes.
All: Considered squashing commits to improve commit history.
All: Added an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
Tests: All converted JSON/YML tests from ethereum/tests have been added to converted-ethereum-tests.txt.
Tests: A PR with removal of converted JSON/YML tests from ethereum/tests have been opened.
Tests: Included the type and version of evm t8n tool used to locally execute test cases: e.g., ref with commit hash or geth 1.13.1-stable-3f40e65.
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.

chfast

Can you give it any better name? The "opcode diff places" has no meaning.

winsvega · 2024-09-17T10:57:50Z

It translates as "opcode in different (logical) places".

marioevz

A few suggestions after a quick review. I haven't checked files in the ./scenarios/scenarios/ folder, but will do on the re-review.

tests/frontier/scenarios/test_scenarios.py

tests/frontier/scenarios/common.py

tests/frontier/scenarios/programs/all_frontier_opcodes.py

winsvega · 2025-02-17T11:02:03Z

Since it is unreasonable to have all scenarios in one transaction (millions of gas and a huge debug trace),
I now put each scenario into its own block, making it a blockchain test.
(It is impossible to generate one state test per scenario, because pytest collects fixtures at collection time and the scenarios depend on the fork dynamically.)

So here is what it does:
for each user-created byte sequence (program),
it puts the program into the user-created scenarios (like call -> callcode -> program, or create -> init code program, or call -> program, suicide)
and verifies the expected result.
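
A rough sketch of the one-scenario-per-block layout, under stated assumptions: the fork list is a simplified subset, and `scenarios_for` and `blocks_for` are hypothetical helpers, not functions from this PR. It illustrates why the number of scenarios, and therefore blocks, depends on the fork: the call opcodes available differ per fork.

```python
from itertools import product

# Simplified fork order (an illustrative subset of the real fork list).
FORKS = ["Frontier", "Homestead", "Byzantium"]

# Fork in which each call-type opcode became available.
INTRODUCED_IN = {
    "CALL": "Frontier",
    "CALLCODE": "Frontier",
    "DELEGATECALL": "Homestead",
    "STATICCALL": "Byzantium",
}


def available_calls(fork):
    """Return the call opcodes that exist at the given fork."""
    idx = FORKS.index(fork)
    return [op for op, since in INTRODUCED_IN.items() if FORKS.index(since) <= idx]


def scenarios_for(fork):
    """Build call chains of length one and two from the available opcodes."""
    calls = available_calls(fork)
    single = [(c,) for c in calls]
    double = list(product(calls, repeat=2))
    return single + double


def blocks_for(program_name, fork):
    """Assign each scenario to its own block, numbered from 1."""
    return [
        {"number": n, "scenario": " -> ".join(chain + (program_name,))}
        for n, chain in enumerate(scenarios_for(fork), start=1)
    ]


for fork in FORKS:
    print(fork, len(blocks_for("selfbalance", fork)))
```

Because the scenario list is computed from the fork, the block count is only known once the fork is known, which is the constraint mentioned above.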

So if we test something like selfbalance == actual selfbalance, we create the bytecode
Op.MSTORE(0, Op.SELFBALANCE) + Op.RETURN(0, 32),
and the result

ProgramResult(result=ScenarioExpectOpcode.SELFBALANCE, from_fork=Istanbul),

The scenarios then make sure that the return value of this bytecode equals the selfbalance the scenario expects (which is calculated in the scenario code) in each scenario situation, such as delegatecall -> program or delegatecall -> call -> program.
This way we automatically test the selfbalance opcode in all call combinations
and inside create/create2 init code (these are basic examples of possible scenarios).
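
To illustrate how a placeholder result could be resolved per scenario, here is a toy model. It is a sketch only: `Placeholder`, `ExpectedResult`, `executing_account`, and `resolve` are hypothetical names, and the balances are made up. It relies on one EVM fact: SELFBALANCE reads the balance of the account whose context the code executes in, and DELEGATECALL/CALLCODE keep the caller's context while CALL switches to the callee.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Union


class Placeholder(Enum):
    """Hypothetical marker for a value that only the scenario can compute."""

    SELFBALANCE = auto()


@dataclass(frozen=True)
class ExpectedResult:
    """Either a literal value or a placeholder resolved by the scenario."""

    value: Union[int, Placeholder]


# Made-up balances for the three accounts in this toy model.
BALANCES = {"root": 10, "middle": 20, "program": 30}


def executing_account(chain):
    """Return the account whose context the program runs in for a call chain."""
    path = ["root"] + ["middle"] * (len(chain) - 1) + ["program"]
    context = path[0]
    for opcode, target in zip(chain, path[1:]):
        if opcode == "CALL":
            context = target  # CALL switches to the callee's context
        elif opcode in ("DELEGATECALL", "CALLCODE"):
            pass  # the caller's context is kept
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return context


def resolve(expected, chain):
    """Turn an expected result into a concrete value for one scenario."""
    if expected.value is Placeholder.SELFBALANCE:
        return BALANCES[executing_account(chain)]
    return expected.value


selfbalance = ExpectedResult(Placeholder.SELFBALANCE)
print(resolve(selfbalance, ("DELEGATECALL",)))
print(resolve(selfbalance, ("DELEGATECALL", "CALL")))
```

A literal expectation such as `ExpectedResult(2)` for add(1,1) resolves to the same value in every scenario, while the placeholder changes with the call chain.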

Then we can add a scenario for EOF code, and it will verify the program value inside EOF code
automatically for each program.

This was one of the first ethereum tests: a simple verification that bytecode like add(1,1) really returns 2 in all contexts.

One coverage line is lost because it is no longer a state test; besides, that line refers to the pre-state check in the test.

Also, the original tests were a little faulty: some cases didn't do what they were intended to do.

marioevz

I think there are a lot of anti-patterns used here; each program could be a separate Python test file to make them simpler and more atomic.

The problem with putting so many verifications into a single test is that it's going to be difficult for client developers to debug if they find issues with one of these tests.

tests/frontier/scenarios/common.py

tests/frontier/scenarios/test_scenarios.py

winsvega · 2025-04-29T08:27:13Z

The 3 missing coverage lines are the Prague EVM init and the state test loader's check for a pre.account storage entry with value x:0.

rebase

marioevz

I've rebased and deleted the static test files that this PR supersedes.

I also disabled this test in execute because it requires an accurate value of Op.GASPRICE, which is unpredictable in devnets (we need a way to pass this to the tests somehow, perhaps via a fixture).

marioevz · 2025-05-12T22:04:47Z

Verified the coverage files:

case EVMC_PRAGUE is running the case in Prague, which the coverage script does not fill for.
if (is_zero(value)) is not covered because the new tests do not pre-set the storage of the contracts it deploys.

winsvega · 2025-05-13T06:15:46Z

> I've rebased and deleted the static test files that this PR supersedes.
>
> I also disabled this test in execute because it requires an accurate value of Op.GASPRICE, which is unpredictable in devnets (we need a way to pass this to the tests somehow, perhaps via a fixture).

I am aiming to make it execute-compatible.
I have a gashash contract to solve that.
If gasprice is unpredictable, I call the gashash contract with the value of Op.GASPRICE, which produces a unique gas used, so that others running the test can agree on the gas used value without the tests knowing which value it is.

* test scenarios
* format with ruff
* address some comments
* fix invalid opcodes rebase
* generate 1 test block per scenario per program
* fix style
* fix all_opcodes test
* refactor with test program class
* fix(tests): Delete updated static tests
* fix(tests): skip on execute due to `Op.GASPRICE`

---------

Co-authored-by: Mario Vega <marioevz@gmail.com>

winsvega marked this pull request as draft September 16, 2024 10:28

winsvega force-pushed the dailytest branch from 1761a66 to dc4f466 Compare September 17, 2024 09:50

chfast reviewed Sep 17, 2024

winsvega closed this Sep 20, 2024

winsvega force-pushed the dailytest branch from dc4f466 to 2f2d356 Compare September 20, 2024 10:50

winsvega reopened this Sep 20, 2024

winsvega force-pushed the dailytest branch from b64a725 to 11adfb6 Compare September 20, 2024 12:06

winsvega changed the title ~~feat(test): opcode diff places test~~ feat(test): test opcode programs in different scenarios Sep 20, 2024

winsvega force-pushed the dailytest branch from 11adfb6 to 2729f06 Compare September 20, 2024 12:10

winsvega requested a review from marioevz September 20, 2024 13:37

marioevz reviewed Sep 20, 2024

winsvega force-pushed the dailytest branch 2 times, most recently from b1af590 to ff2eeff Compare September 24, 2024 13:13

winsvega force-pushed the dailytest branch 2 times, most recently from b28f2ef to f7de36c Compare October 17, 2024 10:02

winsvega force-pushed the dailytest branch from f7de36c to e2cf609 Compare October 28, 2024 11:24

winsvega added the scope:pytest and type:feat labels Oct 28, 2024

winsvega force-pushed the dailytest branch 11 times, most recently from 5d3dd84 to 940cb3e Compare October 31, 2024 14:12

winsvega force-pushed the dailytest branch from cb02788 to 9df25cb Compare February 14, 2025 12:14

winsvega requested a review from marioevz February 17, 2025 19:32

winsvega force-pushed the dailytest branch from 29be2f4 to e732d8d Compare February 25, 2025 12:44

marioevz requested changes Mar 25, 2025

winsvega force-pushed the dailytest branch 3 times, most recently from 56069b2 to 2e849b3 Compare April 29, 2025 08:25

winsvega requested a review from marioevz April 29, 2025 08:27

winsvega mentioned this pull request May 12, 2025

fix(static_tests): adjusted gasLimit to only allow gasLimit of up to 30mil as enforced in eip-7825 #1587

Closed

8 tasks

winsvega and others added 10 commits May 12, 2025 21:37

test scenarios

3d99f00

format with ruff

1f94169

address some comments

26976ed

fix invalid opcodes

4d521b2

rebase

generate 1 test block per scenario per program

e4d0be3

fix style

420fb3d

fix all_opcodes test

e1f1579

refactor with test program class

fe126f6

fix(tests): Delete updated static tests

74ed310

fix(tests): skip on execute due to Op.GASPRICE

835199e

marioevz force-pushed the dailytest branch from 2e849b3 to 835199e Compare May 12, 2025 21:43

marioevz approved these changes May 12, 2025

marioevz merged commit 0a11037 into main May 12, 2025
21 of 22 checks passed

marioevz deleted the dailytest branch May 12, 2025 22:05

marioevz mentioned this pull request Sep 30, 2025

Define standard template scenarios for expected test outcomes (for use in test case descriptions) #184

Closed
