-
Notifications
You must be signed in to change notification settings - Fork 0
feat(trace): neotrace - PR for testing and self-reviews #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 4 commits
498c3c4
9ca6797
df1b62d
b153b26
4158ee7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -93,6 +93,16 @@ def execute_test(test_case: TestCase, dumper: Dumper): | |
| for name, kind, data in test_case.case_fn(): | ||
| if kind == "meta": | ||
| meta[name] = data | ||
| elif kind == "pydantic": | ||
| # new data type for spec traces | ||
| outputs += [ | ||
| # dump trace data for yaml serialization | ||
| ("trace", "data", data.model_dump(mode="json", exclude_none=True)), | ||
| ] + [ | ||
| (name, "ssz", value) | ||
| # ssz artifacts are already serialized and will be compressed by the dumper | ||
| for name, value in data.artifacts.items() | ||
| ] | ||
|
||
| else: | ||
| method = getattr(dumper, f"dump_{kind}", None) | ||
| if method is None: | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,133 @@ | ||||||
| ## Spec trace framework | ||||||
|
|
||||||
coderabbitai[bot] marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||
| This is an implementation of #4603 a new testing framework for the Ethereum | ||||||
| consensus spec tests, based on tracing spec method calls and recording them in a | ||||||
| structured trace file. | ||||||
|
|
||||||
| The basic idea is make tests more simple and linear and hide the minutiae of | ||||||
| dumping data into the test harness (`@spec_trace` decorator) and automate | ||||||
| everything that doesn't have to be manual. | ||||||
|
|
||||||
| Spec is being wrapped into a transparent proxy object and all method calls are | ||||||
| being tracked including any state mutations before and after. Final state is | ||||||
| recorded and all relevant artifacts including all states are being saved into | ||||||
| SSZ artifacts (hash-addressed to avoid duplication). | ||||||
|
||||||
|
|
||||||
| ### Usage and example test | ||||||
|
|
||||||
| ```python | ||||||
| from tests.infra.trace import spec_trace | ||||||
|
|
||||||
|
|
||||||
| @with_all_phases | ||||||
| @spec_state_test # keep these like before | ||||||
| @spec_trace # this is the thing that makes the magic happen | ||||||
| def test_linear_sanity_slots_222( | ||||||
| spec, state | ||||||
| ): # spec and state can be positional but the name matters | ||||||
| # just use spec methods, they are traced automagically, and state is dumped | ||||||
| spec.process_slot(state) | ||||||
| ``` | ||||||
|
|
||||||
| this is for example purposes put into | ||||||
|
||||||
| `tests/core/pyspec/eth2spec/test/gloas/sanity/test_slots_2.py` and can be run | ||||||
| with something like | ||||||
|
|
||||||
| ``` | ||||||
| make reftests fork=gloas runner=sanity k=linear_sanity_slots_222 verbose=true | ||||||
| ``` | ||||||
coderabbitai[bot] marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||
|
|
||||||
| that produces a trace in | ||||||
| `../consensus-spec-tests/tests/minimal/gloas/sanity/slots_2/pyspec_tests/linear_sanity_slots_222/trace.yaml` | ||||||
|
|
||||||
| ### Spec trace file example | ||||||
|
|
||||||
| ```yaml | ||||||
| default_fork: gloas | ||||||
| trace: | ||||||
| - {op: load_state, state_root: | ||||||
| 95d19311d30804985b06c40cc437bdfbb126209ad9ea8253ba33e0ff0af74c40.ssz_snappy} | ||||||
| - op: spec_call | ||||||
| method: process_slot | ||||||
| input: {state: | ||||||
| 95d19311d30804985b06c40cc437bdfbb126209ad9ea8253ba33e0ff0af74c40.ssz_snappy} | ||||||
| - {op: assert_state, state_root: | ||||||
| 41f562b491baaa9fdd981973c8aef64bb7c663c4b07f35141c16afc9e11184c1.ssz_snappy} | ||||||
| ``` | ||||||
|
|
||||||
| In this example, `process_slot` does not return anything but we can see the | ||||||
| initial state and the final state being dumped automatically and they are | ||||||
| different. In the other more complex example test (omitted here for brevity) we | ||||||
| can examine how complex inputs and outputs being dumped and how out-of-band | ||||||
| state mutations are being tracked with assert and load steps. | ||||||
|
|
||||||
| A non-normative example of a little more complex inputs and outputs: | ||||||
|
|
||||||
| ```yaml | ||||||
| - op: spec_call | ||||||
| method: get_current_epoch | ||||||
| input: {state: | ||||||
| 0740b3ecc6fb1bdc20c4f2d792da51dc7aaaa506e445ee7ba7ef1dd7ed900443.ssz_snappy} | ||||||
| assert_output: 2 | ||||||
| - op: spec_call | ||||||
| method: get_seed | ||||||
| input: | ||||||
| state: | ||||||
| 0740b3ecc6fb1bdc20c4f2d792da51dc7aaaa506e445ee7ba7ef1dd7ed900443.ssz_snappy | ||||||
| epoch: 2 | ||||||
| domain_type: '0x00000000' | ||||||
| assert_output: '0x79edc6cbb9ffac34477afe87d8569b7afab320241ce69d70c6d0c0a1839379df' | ||||||
| ``` | ||||||
|
|
||||||
| simple primitive types here (e.g. int or bytes, not containers) are serialized | ||||||
| directly into yaml (even in cases where they subclass SSZ `View` and can | ||||||
| technically be dumped as ssz artifacts), bytes are dumped as 0x-prefixed hex | ||||||
| strings which seems appropriate. SSZ artifacts are always referred to by root | ||||||
| hash-based filename so that there's no need to maintain any mappings or | ||||||
| filename-generating logic. | ||||||
|
|
||||||
| ### Implementation details | ||||||
|
|
||||||
| wrapt is used to wrap spec methods and record their calls, parameters and | ||||||
| results. A decorator is used to set things up. Some simple pydantic models are | ||||||
| used for the trace file structure and some sanitation/formatting. | ||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A minor wording suggestion: 'sanitization' is the more common term in software engineering for cleaning or filtering data, whereas 'sanitation' typically refers to public health. Using 'sanitization' would be more precise here.
Suggested change
|
||||||
|
|
||||||
| From a consumer standpoint (i.e. test runner) new tests using this decorator | ||||||
| behave differently and are being detected by a new data type (a pydantic model | ||||||
| instance). Some logic was added to `execute_test` in | ||||||
| `tests/core/pyspec/eth2spec/gen_helpers/gen_base/gen_runner.py` to catch that | ||||||
| new case and apply new serialization method. | ||||||
|
|
||||||
| The way `wrapt.ObjectProxy` operates is that it allows you to create a proxy | ||||||
| object for e.g. consensus spec module and override things on it without | ||||||
| affecting any of the underlying logic (unlike monkey-patching). In our case here | ||||||
| we override all lowercase methods in the spec object by wrapping them in a | ||||||
| `wrapt` decorator with tracer function. Whenever a state is detected in any of | ||||||
| the method calls it gets automatically tracked and then it's checked again after | ||||||
| each method call to check for mutations. Everything is saved in a pydantic model | ||||||
| object for further dumping using existing reftest tooling. | ||||||
|
|
||||||
| ### TODO | ||||||
|
|
||||||
| This is still being cooked. | ||||||
|
|
||||||
| Integration with test runner is done by yielding pydantic model as a new data | ||||||
| type and very simple change in the runner. It would be more natural to just | ||||||
| return the value but that would require supporting non-generator functions as | ||||||
| tests which would require much more code than seems reasonable here. | ||||||
|
|
||||||
| I tried my best to separate core logic from the boilerplate needed but it could | ||||||
| be improved upon. | ||||||
|
|
||||||
| Some cleanup and polishing is still required. Typing could be improved. | ||||||
|
|
||||||
| ### Credits | ||||||
|
|
||||||
| Thanks to Leo for the initial idea and guidance, and to all the reviewers who | ||||||
| helped refine this. | ||||||
|
|
||||||
| Thanks to Cristobal for the first prototype of this framework, it's not used | ||||||
| here but I reviewed 4724 and got some inspiration from that. | ||||||
|
|
||||||
| Thanks to IG organizers, mentors, sponsors and fellow builders for making this | ||||||
| possible! | ||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| from .decorator import spec_trace # noqa: F401 |
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,53 @@ | ||||||||||||||
| import functools | ||||||||||||||
| import inspect | ||||||||||||||
| from collections.abc import Callable, Generator | ||||||||||||||
| from typing import Any | ||||||||||||||
|
|
||||||||||||||
| from .traced_spec import RecordingSpec | ||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||
| def spec_trace(fn: Callable) -> Callable: | ||||||||||||||
| """ | ||||||||||||||
| Decorator to wrap a pyspec test and record execution traces. | ||||||||||||||
| Usage: | ||||||||||||||
| @with_all_phases # or other decorators | ||||||||||||||
| @spec_state_test # still needed as before | ||||||||||||||
| @spec_trace # new decorator to record trace | ||||||||||||||
| def test_my_feature(spec, ...): | ||||||||||||||
| ... | ||||||||||||||
| """ | ||||||||||||||
|
|
||||||||||||||
| @functools.wraps(fn) | ||||||||||||||
| def wrapper(*args: Any, **kwargs: Any) -> Generator: | ||||||||||||||
| # 1. Bind arguments to find 'spec' and fixtures | ||||||||||||||
| try: | ||||||||||||||
| bound_args = inspect.signature(fn).bind(*args, **kwargs) | ||||||||||||||
| bound_args.apply_defaults() | ||||||||||||||
| except TypeError as e: | ||||||||||||||
| raise ValueError( | ||||||||||||||
| f"Failed to bind arguments for test function '{fn.__name__}': {e}" | ||||||||||||||
| ) from e | ||||||||||||||
|
||||||||||||||
| raise ValueError( | |
| f"Failed to bind arguments for test function '{fn.__name__}': {e}" | |
| ) from e | |
| raise TypeError( | |
| f"Failed to bind arguments for test function '{fn.__name__}': {e}" | |
| ) from e |
IvanAnishchuk marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,106 @@ | ||
| default_fork: gloas | ||
| trace: | ||
| - {op: load_state, state_root: | ||
| 95d19311d30804985b06c40cc437bdfbb126209ad9ea8253ba33e0ff0af74c40.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_slots | ||
| input: {state: | ||
| 95d19311d30804985b06c40cc437bdfbb126209ad9ea8253ba33e0ff0af74c40.ssz_snappy, | ||
| slot: 8} | ||
| - op: spec_call | ||
| method: process_slots | ||
| input: {state: | ||
| 5eda5f14e032cc1821dd272f3fcbdf031f1bb7ea0a56eda0b247e9eabb09dd53.ssz_snappy, | ||
| slot: 16} | ||
| - {op: assert_state, state_root: | ||
| 94921c903a7696c239dc749d38a1963dfc062fb08885340f39279a5346bdea62.ssz_snappy} | ||
| - {op: load_state, state_root: | ||
| f766adc187b97a82f93e93d282198353b3aea83a74a91d94c992535cb1a9bb28.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_slots | ||
| input: {state: | ||
| f766adc187b97a82f93e93d282198353b3aea83a74a91d94c992535cb1a9bb28.ssz_snappy, | ||
| slot: 23} | ||
| - op: spec_call | ||
| method: process_slot | ||
| input: {state: | ||
| d0c2247f19af245cf9822678aa3ab52cc076e784f23b1146372cf41433a0a395.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_justification_and_finalization | ||
| input: {state: | ||
| 6e2a5be5877d15d253fa89ebc2143d2ee6e5c0a4777aea8ea6a197e60185c7f9.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_inactivity_updates | ||
| input: {state: | ||
| 6e2a5be5877d15d253fa89ebc2143d2ee6e5c0a4777aea8ea6a197e60185c7f9.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_rewards_and_penalties | ||
| input: {state: | ||
| 6e2a5be5877d15d253fa89ebc2143d2ee6e5c0a4777aea8ea6a197e60185c7f9.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_registry_updates | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_slashings | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_eth1_data_reset | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_pending_deposits | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_pending_consolidations | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_effective_balance_updates | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_slashings_reset | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_randao_mixes_reset | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_historical_summaries_update | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_participation_flag_updates | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_sync_committee_updates | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_proposer_lookahead | ||
| input: {state: | ||
| a7f798e2212b92d0e2185444ecb5785c60e2548f2e95cb4d5518ed26e8dc017f.ssz_snappy} | ||
| - op: spec_call | ||
| method: process_builder_pending_payments | ||
| input: {state: | ||
| 216dcc57820542ed089ebc0a7cf482272af284e4b750f72c5b217154d4caf425.ssz_snappy} | ||
| - op: spec_call | ||
| method: get_current_epoch | ||
| input: {state: | ||
| 0740b3ecc6fb1bdc20c4f2d792da51dc7aaaa506e445ee7ba7ef1dd7ed900443.ssz_snappy} | ||
| assert_output: 2 | ||
| - op: spec_call | ||
| method: get_seed | ||
| input: | ||
| state: | ||
| 0740b3ecc6fb1bdc20c4f2d792da51dc7aaaa506e445ee7ba7ef1dd7ed900443.ssz_snappy | ||
| epoch: 2 | ||
| domain_type: '0x00000000' | ||
| assert_output: '0x79edc6cbb9ffac34477afe87d8569b7afab320241ce69d70c6d0c0a1839379df' | ||
| - {op: assert_state, state_root: | ||
| 0740b3ecc6fb1bdc20c4f2d792da51dc7aaaa506e445ee7ba7ef1dd7ed900443.ssz_snappy} |
Uh oh!
There was an error while loading. Please reload this page.