enhance(test-benchmark): use config file for fixed opcode count scenarios #1790
base: forks/osaka
Conversation
LouisTsai-Csie
left a comment
Thanks a lot for this! I left some suggestions, but I’m happy to discuss further. I’ll share this with Kamil to confirm it aligns with their needs.
```python
if has_repricing:
    if fixed_opcode_counts:
        opcode_counts = [
            int(x.strip()) for x in fixed_opcode_counts.split(",")
...
opcode_counts_to_use = None
```
This contains a few logic issues, but they will be resolved in my upcoming PR.
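For reference, a minimal sketch of the comma-separated parsing the excerpt above performs, hardened against whitespace and empty entries — the helper name is illustrative, not the PR's actual code:

```python
# Illustrative helper (not the PR's final logic): parse a
# comma-separated string of opcode counts, skipping empty entries.
def parse_fixed_opcode_counts(raw: str | None) -> list[int] | None:
    if not raw:
        return None
    return [int(part.strip()) for part in raw.split(",") if part.strip()]

print(parse_fixed_opcode_counts("1, 100, 1000"))  # [1, 100, 1000]
print(parse_fixed_opcode_counts(None))            # None
```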
Force-pushed 7ca99dc to 7c68d19
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```
@@           Coverage Diff            @@
##          forks/osaka    #1790   +/-  ##
============================================
  Coverage      87.31%    87.31%
============================================
  Files            541       541
  Lines          32832     32832
  Branches        3015      3015
============================================
  Hits           28668     28668
  Misses          3557      3557
  Partials         607       607
```
LouisTsai-Csie
left a comment
Added some suggestions for the parser! Thanks
tests/benchmark/configs/parser.py (Outdated)
```bash
uv run python parser.py --check     # Check for new/missing entries (CI)
uv run python parser.py --dry-run   # Show config without changes
uv run python parser.py --update    # Update fixed_opcode_counts.py
```
I'm not sure, but this doesn't seem to work for me. I still need to specify the full path, like:

```bash
uv run tests/benchmark/configs/parser.py --check
```

I also wonder whether we should place this script under `packages/testing/src/execution_testing/cli/`, since most related scripts, such as `extract_config` and `diff_opcode_count`, are located there. wdyt?
tests/benchmark/configs/parser.py (Outdated)
```python
parser.add_argument(
    "--no-filter",
    action="store_true",
    help="Include all benchmark tests, not just those with code_generator",
)
```
tests/benchmark/configs/parser.py (Outdated)
```python
if not node.name.startswith("test_"):
    self.generic_visit(node)
    return

# Check if function has benchmark_test parameter
if not self._has_benchmark_test_param(node):
    self.generic_visit(node)
    return

# Optional: Filter for code generator usage
if self.filter_code_gen and not self._uses_code_generator(node):
    self.generic_visit(node)
    return
```
I'm not sure whether we need `self.generic_visit(node)` in each of the early-return branches. My understanding is that once a condition matches, it will recursively visit the nested function anyway. This seems relevant only for cases like:

```python
def test_function1(...):
    def test_function2(...):
        ...
```

If so, maybe we could remove the logic here:
```diff
 if not node.name.startswith("test_"):
-    self.generic_visit(node)
     return

 # Check if function has benchmark_test parameter
 if not self._has_benchmark_test_param(node):
-    self.generic_visit(node)
     return

 # Optional: Filter for code generator usage
 if self.filter_code_gen and not self._uses_code_generator(node):
-    self.generic_visit(node)
     return
```
tests/benchmark/configs/parser.py (Outdated)
| # Check if "opcode" or "op" is in parameter names | ||
| if "opcode" not in param_str and param_str not in ("op", "op,"): | ||
| continue |
I would suggest only using `opcode` as the keyword, instead of three combinations. Here we need detailed documentation!

This would require us to refactor the parametrization for the following tests: `test_arithmetic.py`: `test_mod(op)`, `test_mod_arithmetic(op)`,
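A minimal sketch of what exact-name matching could look like, assuming `param_str` is the parametrize argument-name string; the helper name is illustrative, not the PR's code:

```python
# Illustrative helper: split the parametrize name string and compare
# exact names, so "opcode_args" no longer matches a substring check.
def has_opcode_param(param_str: str) -> bool:
    names = [name.strip() for name in param_str.split(",")]
    return "opcode" in names

assert has_opcode_param("opcode")
assert has_opcode_param("opcode,args")
assert not has_opcode_param("opcode_args")  # a substring check would match this
```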
tests/benchmark/configs/parser.py (Outdated)
```python
# Case 1: Op.ADD
if isinstance(node, ast.Attribute):
    return node.attr

# Case 2: pytest.param(Op.ADD, ...) or pytest.param((Op.ADD, x), ...)
if isinstance(node, ast.Call):
    if len(node.args) > 0:
        first_arg = node.args[0]
        if isinstance(first_arg, ast.Attribute):
            return first_arg.attr
        elif isinstance(first_arg, ast.Tuple) and first_arg.elts:
            first_elem = first_arg.elts[0]
            if isinstance(first_elem, ast.Attribute):
                return first_elem.attr

# Case 3: Tuple like (Op.ADD, args)
if isinstance(node, ast.Tuple) and node.elts:
    first_elem = node.elts[0]
    if isinstance(first_elem, ast.Attribute):
        return first_elem.attr
```
More explanation for the logic here! Suggestion:

Case 1: Direct opcode reference
Example: `@pytest.mark.parametrize("opcode", [Op.ADD, Op.MUL])`
Result: `ADD`, `MUL`

Case 2a: Pytest param with tuple
Example: `@pytest.mark.parametrize("opcode,args", [pytest.param((Op.ADD, 123))])`
Result: `ADD`

Case 2b: Plain tuples
Example: `@pytest.mark.parametrize("opcode,args", [(Op.ADD, 123), (Op.MUL, 456)])`
Result: `ADD`, `MUL`

Also, here we assume the parametrize name is always `"opcode"`; this should be documented somewhere. IIUC, this would not be recognized:

`@pytest.mark.parametrize("opcode_args,opcode", [pytest.param((123, Op.ADD))])`
tests/benchmark/configs/parser.py (Outdated)
```python
categories: dict[str, list[str]] = {
    "ACCOUNT QUERY OPERATIONS": [],
    "ARITHMETIC OPERATIONS": [],
    "BITWISE OPERATIONS": [],
    "BLOCK CONTEXT OPERATIONS": [],
    "CALL CONTEXT OPERATIONS": [],
    "COMPARISON OPERATIONS": [],
    "CONTROL FLOW OPERATIONS": [],
    "HASHING OPERATIONS": [],
    "LOGGING OPERATIONS": [],
    "MEMORY OPERATIONS": [],
    "STACK OPERATIONS": [],
    "STORAGE OPERATIONS": [],
    "SYSTEM OPERATIONS": [],
    "TRANSACTION CONTEXT OPERATIONS": [],
    "OTHER": [],
}
```
The current keyword-based solution requires manually updating the mapping, but I notice a pattern here:

```python
"ACCOUNT QUERY OPERATIONS": [
    "selfbalance",
    "codesize",
    "ext_account",
    "BALANCE",
    "EXTCODE",
],
```

The key (`ACCOUNT QUERY OPERATIONS`) could be derived from the file name: `test_account_query_operations.py`. So when we loop over the benchmark tests, we could also record each test's parent file (in the `scan_benchmark_tests` function).

And then update the `categorize_patterns` function like this:

```python
...
categories: dict[str, list[str]] = {}

for pattern in config.keys():
    if pattern not in pattern_sources:
        # Fallback for patterns without source info
        category = "OTHER"
    else:
        source_file = pattern_sources[pattern]
        # Extract test file name: test_arithmetic.py -> arithmetic
        file_name = source_file.stem  # Gets filename without extension
        if file_name.startswith("test_"):
            category_name = file_name[5:]  # Remove "test_" prefix
            # arithmetic -> ARITHMETIC OPERATIONS
            category = (
                f"{category_name.upper().replace('_', ' ')} OPERATIONS"
            )
        else:
            category = "OTHER"

    if category not in categories:
        categories[category] = []
    categories[category].append(pattern)

# Sort patterns within each category and sort categories alphabetically
return {k: sorted(v) for k, v in sorted(categories.items())}
```
Based on this approach, we could remove category_keywords
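A worked example of the stem-based derivation (path illustrative):

```python
from pathlib import Path

# test_arithmetic.py -> "test_arithmetic" -> "ARITHMETIC OPERATIONS"
stem = Path("tests/benchmark/test_arithmetic.py").stem
if stem.startswith("test_"):
    # len("test_") == 5
    print(f"{stem[5:].upper().replace('_', ' ')} OPERATIONS")  # ARITHMETIC OPERATIONS
```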
packages/testing/src/execution_testing/cli/pytest_commands/plugins/shared/benchmarking.py (resolved)
Force-pushed 5506d25 to 14b6b64
Force-pushed 14b6b64 to 697a7d0
🗒️ Description
This PR adds a CLI tool, `benchmark_parser`, to automatically scan benchmark tests and generate a configuration file, `fixed_opcode_counts.json`, for the `--fixed-opcode-count` feature from #1747.

Key Changes

- `uv run benchmark_parser` scans `tests/benchmark/` for tests with the `@pytest.mark.repricing` marker and `@pytest.mark.parametrize` decorators.
- Generates `fixed_opcode_counts.json` at the repo root with the opcode counts mapping.
- Adds a `--check` mode for CI validation: `uv run benchmark_parser --check`.
- Aligns `test_arithmetic.py` with `fixed_opcode_counts.json` for consistency.

Usage

Run `uv run benchmark_parser`. Example `fixed_opcode_counts.json`:

```json
{
  "scenario_configs": {
    "test_codecopy.*": [1],
    ...
  }
}
```

Fill works correctly, and I tested `execute remote` on Hoodi with a 1K opcode count set; the latest txs: https://hoodi.etherscan.io/address/0x83fd666bfb2b345f932c3e4e04b6d85e5ed3568d
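For illustration, a hedged sketch of how a consumer might resolve a test id against `scenario_configs` — assuming the keys are regex patterns matched against test names, as the `test_codecopy.*` key suggests; the helper is hypothetical, not the PR's API:

```python
import json
import re

with open("fixed_opcode_counts.json") as f:
    config = json.load(f)

def counts_for(test_id: str) -> list[int] | None:
    # Hypothetical lookup: first matching pattern wins.
    for pattern, counts in config["scenario_configs"].items():
        if re.match(pattern, test_id):
            return counts
    return None

print(counts_for("test_codecopy[fork_Osaka]"))  # e.g. [1]
```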
Future Items

- Use `--fixed-opcode-count` after generating the config file.
- Verify `--fixed-opcode-count` with `debug_traceTransaction` using `execute hive`.

🔗 Related Issues or PRs
#1747
✅ Checklist
- Ran `tox` checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks: `uvx tox -e static`.
- PR title adheres to the repo standard and starts with `type(scope):`.