
feat(tests): Initial implementation for generating client test vectors#76

Merged
tcoratger merged 18 commits into leanEthereum:main from fselmo:feat/test-vectors on Oct 28, 2025

Conversation


@fselmo fselmo commented Oct 21, 2025

🗒️ Description

I have ported over some ideas from execution-spec-tests for vector filling. It may not be perfectly cleaned up yet, but I wanted to put this out so we can start discussing the vector format from the implementers' side and think about how you want to consume these.

If you're not familiar with execution-spec-tests, I will try to summarize some of the high-level ideas behind the approach here. Coincidentally, execution-spec-tests is in the middle of merging into execution-specs as a monorepo this week. I've been waiting a bit to see the final state of the monorepo organization so we can mimic it here, as I think it's a good idea to keep things as familiar as possible for developers.

Workspace with Test Package

The approach I took here is the same as for the execution repos. I have set up leanSpec as a uv workspace with a packages/ directory where the tests framework lives. For now, packages/ contains only this testing framework, which is a separate package altogether, though it shares the same workspace. It has its own setup with a pyproject.toml, etc., while the main linting, spellcheck, and related checks are configured at the root level of the repository (leanSpec/pyproject.toml). All the tox commands have also been updated to run these checks across the entire workspace, which keeps things quite maintainable.

Filling Test Vectors

I defined a basic set of fixtures for now. Ideally we would find the fewest entry points from which the consensus layer can be tested, treat those as the public API entry points, and write all tests against them as test fixtures / formats. For comparison, we currently have blockchain_test, state_test, and benchmark_test in execution-specs. If we could settle on a similarly small subset of test fixtures for consensus, that would be ideal from an integration standpoint. If that's not possible, we can create as many fixture types as we need, but this is a chance to redesign the approach we are taking, so we should think on it. An added benefit of a minimal subset is that tests become more "integration"-style: we're not only checking logic in one place but from the entry point across all relevant logic.

The test fixtures I defined for now with some influence from consensus-specs vector designs:

  • state_transition_test: Test via the STF as the entry point
  • fork_choice_test: Tests that use fork choice as the test logic
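As a purely illustrative sketch (every field name below is hypothetical; the actual fixture schema is exactly what this PR opens up for discussion), a `state_transition_test` vector could carry a full pre-state, the blocks to apply via the STF, and the expected post-state values a consumer checks against:

```python
# Hypothetical vector layout for a state_transition_test; the real
# schema is defined by the testing package, not by this sketch.
vector = {
    "pre": {"slot": 0, "validator_count": 4},   # full pre-state
    "blocks": [{"slot": 1}, {"slot": 2}],       # blocks fed through the STF
    "post": {"slot": 2},                        # expected post-state subset
}

def matches_expected_post(result_state: dict, expected_post: dict) -> bool:
    """A consumer replays the blocks through its own STF, then checks that
    every key the vector's "post" declares matches its resulting state."""
    return all(result_state.get(k) == v for k, v in expected_post.items())

# A client whose STF ends at slot 2 passes this (hypothetical) vector.
print(matches_expected_post({"slot": 2, "validator_count": 4}, vector["post"]))
```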

We have a new command:

uv run fill --clean --fork=devnet
  • fill: the command that fills test vectors from the pytest tests written inside tests/spec_tests
  • --clean: requires you to explicitly opt in to overwriting the current fixtures/, so you don't accidentally remove already-filled tests you want to keep
  • --fork: specifies the fork to fill for; right now only devnet makes sense
  • --output: fills fixtures to a directory other than fixtures/, so you can compare two different sets if you want, or fill them somewhere else entirely
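Putting those flags together, a couple of invocations might look like this (the alternate output directory name is just an illustrative example, not a convention from this PR):

```shell
# Fill all devnet vectors into the default fixtures/ directory,
# explicitly opting in to overwriting what is already there:
uv run fill --clean --fork=devnet

# Fill into a separate directory (hypothetical name) so two sets
# of vectors can be compared side by side:
uv run fill --fork=devnet --output=fixtures-new
```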

Please treat this as a discussion starter; it is not by any means an attempt to finalize the design for the test vectors.


Note: This commit (6c892e4) makes some assumptions that change behavior in the specs. I leaned into some changes that allow the fork choice to move the chain so I could properly test some things; the fork_choice_test was not working without them.

✅ Checklist

  • Ran tox checks to avoid unnecessary CI fails:
    uvx --with=tox-uv tox
  • Considered adding appropriate tests for the changes.
  • Considered updating the online docs in the ./docs/ directory.


@tcoratger tcoratger left a comment


Maybe we should put the description from this PR (or something similar) into CLAUDE.md so that our Claude agents know how to deal with the current way of doing tests?

@@ -0,0 +1,85 @@
"""Fork choice head selection tests for the devnet fork."""
Collaborator


I love this way of doing tests here. It's much simpler than before, and in the future we'll be able to drastically simplify how tests are handled in the repo by using this approach everywhere.

Contributor


yes looks pretty cool

@fselmo fselmo force-pushed the feat/test-vectors branch from c1f3942 to 92134ce on October 21, 2025 16:46
fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 21, 2025

unnawut commented Oct 22, 2025

Looks really good to me!

Just one question for now. I see that the post state looks like a diff of the pre-state (because of slot), but also not quite, because it has validator_count, which is not part of the LeanState.

I think ideally we probably want the full data structure as the post state as well? cc: @syjn99 @g11tech

(screenshot: the filled vector's post state showing slot and validator_count)


fselmo commented Oct 22, 2025

Just one question for now. I see that the post state looks like a diff of the pre-state (because of slot), but also not quite, because it has validator_count, which is not part of the LeanState.

Thanks for the feedback. This is not by any means a final draft; I'd love for client teams to chime in on the design, since I took a very general approach here. How it currently works: in the test, we define the things we expect to be true for that test. For example, here I only declared that, for this particular test, I want to make absolutely sure this is in the post state. The framework then validates the output from the spec, makes sure the spec has these expected values, and passes these expected values of interest into the vector.

Of course, this test doesn't really care about anything important and just serves as an example. What we could do is validate the defined values and still always pass the entire post object from the spec through to the vector... but this could get expensive / slow to run. You can certainly also write a test with all the values you expect in the post state, and all of them would then be passed through in the current setup.

We should definitely define what's best for implementers / testing as well as consumers here 👍🏼


fselmo commented Oct 22, 2025

but also not quite, because it has validator_count, which is not part of the LeanState.

And yes, I agree this perhaps isn't ideal currently, because it looks into state.config.num_validators, though that's probably fine if we even care about this value. I set some things up as examples, but perhaps we should define a StateExpectation with all the things we care to check, and document it quite well, so that we can look into the post state and validate it, and so that clients / consumers can do the same in their implementations in the least awkward way. We need to define all of the things we actually care about in the expectation. For now this is just an example to illustrate how we can make these checks.
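A minimal sketch of what such a StateExpectation could look like, assuming plain dicts for states (the name is borrowed from the comment above, but the real design, field names, and validation flow are all still open):

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StateExpectation:
    """Hypothetical sketch: only the fields a test declares it cares about."""

    fields: dict[str, Any] = field(default_factory=dict)

    def validate(self, post_state: dict[str, Any]) -> dict[str, Any]:
        # Check each declared field against the post state the spec
        # produced; anything not declared is simply not checked.
        for key, expected in self.fields.items():
            actual = post_state.get(key)
            if actual != expected:
                raise AssertionError(f"{key}: expected {expected!r}, got {actual!r}")
        # Only the validated subset is written into the vector.
        return dict(self.fields)


# The spec's post state may contain many fields...
spec_post = {"slot": 5, "validator_count": 4, "justified_slot": 3}

# ...but the vector records only what the test declared it cares about.
expectation = StateExpectation(fields={"slot": 5, "validator_count": 4})
print(expectation.validate(spec_post))
```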

fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 22, 2025
@fselmo fselmo force-pushed the feat/test-vectors branch from 476ed0e to b2f113f on October 22, 2025 20:20
@tcoratger

Of course, this test is not really caring about anything important and is just to serve as an example. What we could do is validate the defined values and still always pass through the entire post object to the vector from the spec...

Mhhhh, I would say that the approach you currently have is better, no? Because if we have a lot of tests and the full post state for each of them, it can become very cumbersome, right?

What we want here is to test only a specific portion of the state in each test. If we want, we can also have a couple of full integration tests where we test everything for specific cases, but that seems overkill for most small unit tests dedicated to testing only one thing, right?

fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 24, 2025
@fselmo fselmo force-pushed the feat/test-vectors branch from b2f113f to d3a9abf on October 24, 2025 17:46

fselmo commented Oct 24, 2025

rebased to fix conflicts 👀


fselmo commented Oct 24, 2025

We are in the middle of merging execution-spec-tests into execution-specs, much the same way as here. We are finalizing that process between today and Monday, and I'd like to hold off here just a bit longer in case we can take any other tips from it for setting up the test tooling in leanSpec. I'll mark this as ready for review for now, because any remaining changes should just be cosmetic / organizational, and we can discuss the vector generation here in the meantime.

@fselmo fselmo marked this pull request as ready for review October 24, 2025 18:02
@fselmo fselmo force-pushed the feat/test-vectors branch 2 times, most recently from 90e7ed9 to 25d3c52 on October 24, 2025 22:33

fselmo commented Oct 24, 2025

What we want here is to test only a specific portion of the state in each test. If we want, we can also have a couple of full integration tests where we test everything for specific cases, but that seems overkill for most small unit tests dedicated to testing only one thing, right?

@tcoratger yes, this is how it's set up now: we can define any number of things we care about for the test, and they will be in the vector. The other approach is that what we define in the test is verified while filling (so we verify the spec is returning the expected values), but we then fill the post state with everything the spec returns, so the full post state is always present (a hybrid of what exists here).

I think we should decide together what is best here and this can always be tweaked and changed after this PR as well.

Please take a look at this commit in particular. I tweaked a few things, with consensus-specs as inspiration, just so I could actually drive the chain and fill multiple blocks for the test examples; it wasn't progressing without this. Whether we want to revert or tweak these changes while still in devnet, I'm not certain, as that's beyond my scope 😅. I believe these are the only spec changes I made here, so they're worth particular attention. The other spec-side changes are serialization-only, so we can serialize appropriately in the vectors; please take a look at those as well.

@tcoratger

@fselmo In my opinion, the way this PR treats things is exactly what we want right now.

  • Personally I prefer the approach of filling in just a specific portion of the post state (as it is now), but let's see what client teams say.
  • Since we can always fine-tune what we did here, and this PR settles a solid base, it's absolutely not a problem for me to merge and then fine-tune details later if people complain.
  • Yes, I checked the small spec changes and they look good to me.
  • I would love to merge this early next week so that we can move forward on this topic and I can start replacing the old tests step by step.
  • Also, as you mention there are potential learnings coming from the execution test move, it's worth waiting for that before merging. Once we have it, plus good feedback from client teams, we're ready to merge.


unnawut commented Oct 25, 2025

@fselmo We just merged another big PR hence the merge conflict. Could you help rebase?

@unnawut unnawut added the framework Scope: Changes to the testing framework label Oct 25, 2025
@fselmo fselmo force-pushed the feat/test-vectors branch from 51dedef to 65d52ed on October 26, 2025 21:48
fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 26, 2025

fselmo commented Oct 26, 2025

@fselmo We just merged another big PR hence the merge conflict. Could you help rebase?

Done. Cleaned up the implementation a bit too.


unnawut commented Oct 27, 2025

I'm getting this error when running uv sync:

➜  leanSpec git:(feat/test-vectors) uv sync --all-extras
error: Workspace member `.../leanSpec/packages/tests` is missing a `pyproject.toml` (matches: `packages/*`)


fselmo commented Oct 27, 2025

@unnawut I renamed some things, so your old .venv might be holding on to stale information. Try removing the .venv and re-syncing:

rm -rf .venv
uv sync --all-extras

^ always from the leanSpec/ root directory.


unnawut commented Oct 27, 2025

@unnawut I renamed some things, so your old .venv might be holding on to stale information. Try removing the .venv and re-syncing:

rm -rf .venv
uv sync --all-extras

^ always from the leanSpec/ root directory.

Thanks for the suggestion! I got the same error after removing the .venv folder, but a fresh clone worked.


fselmo commented Oct 27, 2025

Thanks for the suggestion! Got the same error after removing .venv folder but a fresh clone worked.

fwiw I know what happened here.

If anyone tested the changes before the renaming, you likely had a packages/tests folder that was renamed to packages/testing (to align with recent changes to execution-specs, see here). The problem is that when you re-sync the project, the new directory is created at packages/testing, but git doesn't remove old directories, so you also had a now-empty packages/tests that was still being picked up as a package via the packages/* path in the workspace. Just removing that empty dir (packages/tests) would have fixed the issue.
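For anyone hitting this, a small sketch of how such stale workspace members could be detected (the helper name and directory layout are made up for illustration; uv itself simply globs packages/* and expects a pyproject.toml in each match):

```python
import tempfile
from pathlib import Path


def stale_workspace_members(packages_dir: Path) -> list[Path]:
    """Find subdirectories matched by `packages/*` that lack a pyproject.toml."""
    return [
        p
        for p in packages_dir.iterdir()
        if p.is_dir() and not (p / "pyproject.toml").exists()
    ]


# Demo with a temporary layout mirroring the situation above:
# packages/testing is a valid member, packages/tests is a stale leftover.
with tempfile.TemporaryDirectory() as tmp:
    pkgs = Path(tmp) / "packages"
    (pkgs / "testing").mkdir(parents=True)
    (pkgs / "testing" / "pyproject.toml").write_text("[project]\nname = 'x'\n")
    (pkgs / "tests").mkdir()  # empty dir left behind by git
    stale = stale_workspace_members(pkgs)
    print([p.name for p in stale])
```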

- forkchoice was not advancing the chain with blocks; made some minor
  changes compared to consensus-specs that may not be correct but
  help show how to write forkchoice tests for vector generation.
- Rename the package to ``lean-ethereum-testing``
- Rename `uv` workspace package to ``packages/testing``
- Refactor the package into a generalized framework component and
  ``consensus_testing`` module that focuses on testing the consensus
  protocol. This opens up the possibility of adding execution-specific
  testing in the future.
- Use `tests/consensus/{fork}` for consensus tests that generate vectors.
- Use a ``--layer`` flag for ``fill`` that defaults to ``consensus`` so
  we don't even have to think about this separation for now.
@fselmo fselmo force-pushed the feat/test-vectors branch 4 times, most recently from b41e983 to 3ad914f on October 28, 2025 00:19

fselmo commented Oct 28, 2025

Rebased again. I also added a CI run that fills all tests with Python 3.14. Right now it covers just the example tests I added to illustrate the basic flow, but as we tweak this and add actual tests, it will likely be a good sanity check that nothing is breaking.

- Reconcile differences with ``main`` after rebase.
- Clean up the implementation of the test fixtures + remove clunky
  builder for now.
@fselmo fselmo force-pushed the feat/test-vectors branch from 3ad914f to 81090ef on October 28, 2025 01:04

@syjn99 syjn99 left a comment


Looked this over during the weekend; thanks for your hard work!


@unnawut unnawut left a comment


🚀🚀🚀


@jihoonsong jihoonsong left a comment


Generally looks good. I have minor comments and questions.

fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 28, 2025
- Remove comment that snuck into spec docstring
- Remove comment that snuck into spec docstring
- Remove ``Union`` for ``BlockSpec | Block`` for fixtures. Instead,
  use ``@field_serializer`` to serialize the filled block as the ``Block``
  in the fixture.
- Add a TODO to configure appropriate default values for validator
  number and pubkeys for fixtures.
@fselmo fselmo force-pushed the feat/test-vectors branch from a839dff to 457a1da on October 28, 2025 22:55
@tcoratger tcoratger merged commit b091ef7 into leanEthereum:main Oct 28, 2025
10 checks passed
@fselmo fselmo deleted the feat/test-vectors branch October 29, 2025 05:02
# Initialize Store from anchor
store = Store.get_forkchoice_store(
    state=self.anchor_state,
    anchor_block=self.anchor_block,
)
Contributor


Actually, we don't need the block here, because the state corresponds one-to-one with the block, unlike beacon, where the state could have been rolled forward to the checkpoint because of missing slots towards the end.


@g11tech g11tech left a comment


post merge review, looks amazing ❤️
