feat(tests): Initial implementation for generating client test vectors by fselmo · Pull Request #76 · leanEthereum/leanSpec

fselmo · 2025-10-21T02:17:45Z

🗒️ Description

I have ported over some ideas from execution-spec-tests for vector filling. It may not be perfectly cleaned up yet but I wanted to put this out so we can start discussing the format for the vectors from implementers' sides and think about how you want to consume these.

If you're not familiar with execution-spec-tests, I will try to summarize some of the high-level ideas behind the approach here. Coincidentally, execution-spec-tests is in the middle of merging into execution-specs as a monorepo this week. I've been waiting a bit to see the final state of the monorepo organization so we can mimic this here as I think it's a good idea to try to keep things as familiar as possible for developers.

Workspace with Test Package

The approach I took here is the same one we used for the execution repos. I have set up leanSpec as a uv workspace with a packages/ directory where the test framework lives. For now, packages/ contains only this testing framework, which is a separate package altogether, though it shares the same workspace. It has its own setup with pyproject.toml, etc., while linting, spellcheck, and similar checks are configured at the root level of the repository (leanSpec/pyproject.toml). All the tox commands have also been updated to run these checks across the entire workspace, which keeps things quite maintainable.

Filling Test Vectors

I defined a basic set of fixtures for now. I think it would be quite nice to find the fewest entry points that we can test the consensus layer from, as the public API entry points, and write all tests from these as test fixtures / formats. What I mean is, we currently have blockchain_test, state_test, benchmark_test in execution specs. If we could somehow get some small subset of test fixtures for consensus, this could be ideal from an integration standpoint. If this is not possible, however, we can create as many fixture types as we need to but this is a chance to redesign the approach we are taking so we should think on this. If a minimal subset can be used, an added benefit is that we can have a more "integration" type of test where we're not only checking logic in one place but from the entry point and across all relevant logic.

The test fixtures I defined for now with some influence from consensus-specs vector designs:

state_transition_test: Test via the STF as the entry point
fork_choice_test: Tests that use fork choice as the test logic

We have a new command:

uv run fill --clean --fork=devnet

fill: This is the command to fill test vectors from pytest tests that are written inside of tests/spec_tests
--clean: This flag forces you to specify that you want to write over the current fixtures/ so that you don't accidentally remove already-filled tests you want to keep
--fork: Specifies the relevant fork to fill for, right now only devnet makes sense
--output: Can be specified to fill fixtures to an output directory that isn't fixtures/, so you can compare two different sets if you want or fill them somewhere else entirely.
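
As a rough illustration of the overwrite guard that --clean provides, here is a minimal sketch. It is not the actual fill implementation: the function name `prepare_output_dir` and its exact behavior are assumptions made only to show the idea of refusing to touch already-filled fixtures unless asked to.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_output_dir(output: Path, clean: bool) -> Path:
    """Hypothetical guard: refuse to reuse a non-empty directory unless clean is set."""
    if output.exists() and any(output.iterdir()):
        if not clean:
            raise FileExistsError(
                f"{output} already contains fixtures; pass --clean to overwrite"
            )
        # Only wipe existing fixtures when the caller explicitly asked for it.
        shutil.rmtree(output)
    output.mkdir(parents=True, exist_ok=True)
    return output


with tempfile.TemporaryDirectory() as tmp:
    fixtures = Path(tmp) / "fixtures"
    prepare_output_dir(fixtures, clean=False)  # directory is new, so it is created
    (fixtures / "old_vector.json").write_text("{}")
    try:
        prepare_output_dir(fixtures, clean=False)
        refused = False
    except FileExistsError:
        refused = True  # existing fixtures are protected without the flag
    prepare_output_dir(fixtures, clean=True)
    remaining = list(fixtures.iterdir())  # old vector is gone after a clean fill
```

Filling into a separate directory (the role --output plays) sidesteps the guard entirely, since a fresh directory is empty.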

Please use this as a discussion as this is not by any means attempting to finalize the design for the test vectors.

Note: This commit (6c892e4) makes some assumptions that change behavior in the specs. I leaned into a few changes that allow the fork choice to move the chain, since the fork_choice_test was not working without them and I could not properly test some things otherwise.

✅ Checklist

Ran tox checks to avoid unnecessary CI fails:
```
uvx --with=tox-uv tox
```
Considered adding appropriate tests for the changes.
Considered updating the online docs in the ./docs/ directory.

tcoratger

Maybe we should put in the CLAUDE.md the description of the PR you made or something similar so that our Claude agents know how to deal with the current way of doing tests?

packages/tests/src/lean_spec_tests/test_fixtures/fork_choice.py

tcoratger · 2025-10-21T10:58:24Z

tests/consensus/devnet/fc_examples/test_head_selection.py

@@ -0,0 +1,85 @@
+"""Fork choice head selection tests for the devnet fork."""


I love this way of writing tests. It is much simpler than before, and in the future we will be able to drastically simplify how tests are handled in the repo by doing things this way everywhere.

yes looks pretty cool

unnawut · 2025-10-22T14:16:18Z

Looks really good to me!

Just one question for now. I see that the post state seems like a diff of the pre-state because of slot, but also not because it has validator_count which is not part of the LeanState.

I think ideally we probably want the full data structure as the post state as well? cc: @syjn99 @g11tech

fselmo · 2025-10-22T19:36:02Z

> Just one question for now. I see that the post state seems like a diff of the pre-state because of slot, but also not because it has validator_count which is not part of the LeanState.

Thanks for the feedback. Yeah, this is by no means a final draft. I'd love for client teams to chime in on the design, as I took a very general approach here. The way this currently works is that each test defines the things we expect to be true for that test. For example, here I only specified the value I want to make absolutely sure is in the post state. Filling then validates the output from the spec against these expected values and passes only those values of interest through to the vector.

Of course, this test is not really caring about anything important and is just to serve as an example. What we could do is validate the defined values and still always pass through the entire post object to the vector from the spec... but this could possibly get expensive / slow to run? You can certainly also write a test with all values you expect to be in the post state and all the values would then be passed through in the current case.

We should definitely define what's best for implementers / testing as well as consumers here 👍🏼

fselmo · 2025-10-22T19:41:06Z

> but also not because it has validator_count which is not part of the LeanState.

And yes I agree this perhaps is not ideal currently because this looks into the state.config.num_validators, though that's probably fine if we even care about this value. I set some things up as example but perhaps we should define a StateExpectation with all the things we would care to check and document this quite well so that we can then look into the post state and validate, but also make sure clients / consumers can do the same in their implementations for these tests in the least awkward way. We need to define all of the things we actually care about in the expectation. For now this is just an example to illustrate how we can make these checks.
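
A minimal sketch of what such a StateExpectation could look like, assuming a toy state object. `ToyConfig`, `ToyState`, and the `validate` method are hypothetical stand-ins, not the real LeanState or the framework's API. Only fields that a test sets are checked, and only those are returned for the vector.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class ToyConfig:
    """Stand-in for the state's config; not the real spec type."""
    num_validators: int


@dataclass
class ToyState:
    """Stand-in for the post state returned by the spec."""
    slot: int
    config: ToyConfig


@dataclass
class StateExpectation:
    """Hypothetical expectation: None means 'this test does not care'."""
    slot: Optional[int] = None
    validator_count: Optional[int] = None

    def validate(self, state: ToyState) -> dict[str, Any]:
        # Map expectation names onto where the values live in the state.
        actual = {
            "slot": state.slot,
            "validator_count": state.config.num_validators,
        }
        checked: dict[str, Any] = {}
        for f in fields(self):
            expected = getattr(self, f.name)
            if expected is None:
                continue  # field not set by the test, so it is not checked
            if actual[f.name] != expected:
                raise AssertionError(
                    f"{f.name}: expected {expected!r}, spec returned {actual[f.name]!r}"
                )
            checked[f.name] = expected
        return checked  # only the values of interest reach the vector


post = ToyState(slot=1, config=ToyConfig(num_validators=4))
vector_post = StateExpectation(slot=1).validate(post)
```

Keeping the name-to-location mapping in one documented place is what would let clients mirror the same checks without guessing where a value such as validator_count comes from.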

tcoratger · 2025-10-24T10:50:57Z

> Of course, this test is not really caring about anything important and is just to serve as an example. What we could do is validate the defined values and still always pass through the entire post object to the vector from the spec...

Mhhhh, I would say that the approach you currently have is better, no? If we have a lot of tests and the full post state for each of them, it can become very cumbersome, right?

What we want here is to test only a specific portion of the state for each test and if we want we can have also a couple of full integration tests where we test everything for specific case but this seems to be overkill for most of small unit tests dedicated to testing only one thing right?

fselmo · 2025-10-24T17:47:22Z

rebased to fix conflicts 👀

fselmo · 2025-10-24T18:02:33Z

We are in the middle of merging the execution-spec-tests within execution-specs much the same way as here. We are finalizing this process between today and Monday and I'd like to hold off here just this bit longer in case we can take any other tips from this process to set up the test tooling in leanSpec. I'll mark this as ready for review for now because I think these will just be cosmetic / organizational changes if any and we can discuss the vector generation here in the meantime.

fselmo · 2025-10-24T22:51:33Z

> What we want here is to test only a specific portion of the state for each test and if we want we can have also a couple of full integration tests where we test everything for specific case but this seems to be overkill for most of small unit tests dedicated to testing only one thing right?

@tcoratger yes this is how it is set up now and we can define any number of things we care about for the test and it will be in the vector. The other approach is that what we define in the test we verify while filling (so we verify the spec is returning the expected values) but we can proceed to fill the post-state with everything the spec returns and so this post state is always there in full (some hybrid version of what exists here).
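
The two options can be contrasted in a small sketch using plain dictionaries. The function `fill_post_state`, the `full` switch, and the field names are illustrative assumptions, not part of the framework.

```python
from typing import Any


def fill_post_state(
    spec_post: dict[str, Any], expected: dict[str, Any], full: bool
) -> dict[str, Any]:
    """Verify expected values against the spec output, then choose what to emit."""
    for key, want in expected.items():
        if key not in spec_post or spec_post[key] != want:
            raise AssertionError(
                f"{key}: expected {want!r}, spec returned {spec_post.get(key)!r}"
            )
    # full=False: current behavior, only the checked values go into the vector.
    # full=True: hybrid behavior, everything the spec returned goes into the vector.
    return dict(spec_post) if full else dict(expected)


# Illustrative values only; these are not real spec outputs.
spec_post = {"slot": 2, "validator_count": 4, "latest_justified_slot": 0}
expected = {"slot": 2}

partial_vector = fill_post_state(spec_post, expected, full=False)
hybrid_vector = fill_post_state(spec_post, expected, full=True)
```

In both modes the verification step is identical; the only difference is how much of the post state the consumer sees in the filled vector.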

I think we should decide together what is best here and this can always be tweaked and changed after this PR as well.

Please take a look at this commit in particular as I tweaked a few things with consensus specs as inspiration just so I could actually drive the chain and fill multiple blocks for the test examples as it wasn't progressing without this. If we want to revert these changes or tweak these while still in devnet or not I am not certain as this is beyond my scope 😅. I believe these are the only spec changes I made here so worth giving that particular attention. The other changes in the specs side were serialization-only changes so we can serialize appropriately in the vectors so please take a look as well.

tcoratger · 2025-10-25T10:34:51Z

@fselmo In my opinion, the current way in which this PR treats things is perfect for what we want right now.

Personally I prefer the approach of having just a specific portion of the post state filled in (as it is for now), but let us see what client teams say.
As we can always fine-tune what we did here and this PR lays a solid foundation, for me it is absolutely not a problem to merge and then fine-tune some details later if people complain.
Yes, I checked the small spec changes and they look good to me.
I would love to merge this early next week so that we can move forward on this topic and I can start replacing the old tests step by step.
Also, as you mention that there are potential learnings coming from the execution test move, it is good to wait until then before merging. Once we have that and good feedback from client teams, we are ready to merge.

unnawut · 2025-10-25T11:03:39Z

@fselmo We just merged another big PR hence the merge conflict. Could you help rebase?

fselmo · 2025-10-26T21:51:37Z

> @fselmo We just merged another big PR hence the merge conflict. Could you help rebase?

Done. Cleaned up the implementation a bit too.

unnawut · 2025-10-27T03:29:07Z

I'm getting this error when running uv sync:

➜  leanSpec git:(feat/test-vectors) uv sync --all-extras
error: Workspace member `.../leanSpec/packages/tests` is missing a `pyproject.toml` (matches: `packages/*`)

fselmo · 2025-10-27T03:47:11Z

@unnawut I renamed some things so your old .venv might be holding on to old information maybe. Try removing the .venv and re-syncing:

rm -rf .venv
uv sync --all-extras

^ from leanSpec/ main directory always.

unnawut · 2025-10-27T03:57:45Z

> @unnawut I renamed some things so your old .venv might be holding on to old information maybe. Try removing the .venv and re-syncing:
> rm -rf .venv
> uv sync --all-extras
> ^ from leanSpec/ main directory always.

Thanks for the suggestion! Got the same error after removing .venv folder but a fresh clone worked.

tests/consensus/devnet/stf_examples/test_blocks.py

fselmo · 2025-10-27T18:04:14Z

> Thanks for the suggestion! Got the same error after removing .venv folder but a fresh clone worked.

fwiw I know what happened here.

If anyone tested the changes before the renaming, you likely had a packages/tests folder that was later renamed to packages/testing (to align with recent changes to execution-specs). When you re-sync the project, the new directory is created at packages/testing, but git doesn't remove old directories, so you were also left with an empty packages/tests, which was still picked up as a workspace member via the packages/* path. Just removing that empty directory (packages/tests) would have fixed the issue.
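
A small sketch of how such leftovers could be spotted, assuming the workspace members are matched by the packages/* glob as in the error message. The helper `stale_workspace_members` is hypothetical and not part of uv or this repository.

```python
import tempfile
from pathlib import Path


def stale_workspace_members(repo_root: Path, pattern: str = "packages/*") -> list[Path]:
    """Return directories matching the glob that lack a pyproject.toml."""
    return sorted(
        p
        for p in repo_root.glob(pattern)
        if p.is_dir() and not (p / "pyproject.toml").is_file()
    )


# Recreate the situation from this thread in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "packages" / "testing").mkdir(parents=True)
    (root / "packages" / "testing" / "pyproject.toml").write_text("[project]\n")
    (root / "packages" / "tests").mkdir()  # leftover empty directory after the rename
    stale = [p.name for p in stale_workspace_members(root)]
```

Any directory it reports is one that the glob would treat as a workspace member even though it has no package definition, which matches the error reported above.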

- forkchoice was not advancing chain with blocks, made some minor changes as compared to consensus-specs that may not be correct but help show how to write forkchoice tests for vector generation.

- Rename the package to ``lean-ethereum-testing``
- Rename `uv` workspace package to ``packages/testing``
- Refactor the package into a generalized framework component and ``consensus_testing`` module that focuses on testing the consensus protocol. This opens up the possibility of adding execution-specific testing in the future.
- Use `tests/consensus/{fork}` as consensus tests that generate vectors.
- Uses a ``--layer`` flag for ``fill`` that defaults to ``consensus`` so we don't even have to think about this separation for now.

fselmo · 2025-10-28T00:21:18Z

Rebased again. I also added a CI run that fills all tests with python 3.14. Right now it's just the example tests I added to illustrate the basic flow but as we tweak this and add actual tests, this will likely be a good sanity check that nothing is breaking.

- Reconcile differences with ``main`` after rebase.
- Clean up the implementation of the test fixtures + remove clunky builder for now.

syjn99

Looked over the weekend, thanks for your hard work!

unnawut

🚀🚀🚀

jihoonsong

Generally looks good. I have minor comments and questions.

CONTRIBUTING.md

packages/testing/src/consensus_testing/genesis.py

packages/testing/src/consensus_testing/test_types/step_types.py

src/lean_spec/subspecs/forkchoice/store.py

- Remove comment that snuck into spec docstring

- Remove comment that snuck into spec docstring
- Remove ``Union`` for ``BlockSpec | Block`` for fixtures. Instead, use ``@field_serializer`` to serialize the filled block as the ``Block`` in the fixture.
- Add a TODO to configure appropriate default values for validator number and pubkeys for fixtures.

g11tech · 2025-10-29T09:42:16Z

packages/testing/src/consensus_testing/test_fixtures/fork_choice.py

+        # Initialize Store from anchor
+        store = Store.get_forkchoice_store(
+            state=self.anchor_state,
+            anchor_block=self.anchor_block,


Actually we don't need the block, because the state here corresponds one-to-one with the block, unlike beacon, where the state could have been rolled forward to the checkpoint because of missing slots towards the end.

g11tech

post merge review, looks amazing ❤️

shariqnaiyer mentioned this pull request Oct 21, 2025

Integrate leanSpec spec test vectors for tests/consensus/devnet/fork choice for testing the Lean Client ReamLabs/ream#867

Closed

fselmo force-pushed the feat/test-vectors branch from 0bc0f7f to c1f3942 October 21, 2025 06:10

tcoratger reviewed Oct 21, 2025

fselmo force-pushed the feat/test-vectors branch from c1f3942 to 92134ce October 21, 2025 16:46

fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 21, 2025

fix: cleanups from comments on PR leanEthereum#76

628f927

fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 22, 2025

fix: cleanups from comments on PR leanEthereum#76

c81dfb7

fselmo force-pushed the feat/test-vectors branch from 476ed0e to b2f113f October 22, 2025 20:20

fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 24, 2025

fix: cleanups from comments on PR leanEthereum#76

97f6040

fselmo force-pushed the feat/test-vectors branch from b2f113f to d3a9abf October 24, 2025 17:46

fselmo marked this pull request as ready for review October 24, 2025 18:02

fselmo force-pushed the feat/test-vectors branch 2 times, most recently from 90e7ed9 to 25d3c52 October 24, 2025 22:33

unnawut added the framework Scope: Changes to the testing framework label Oct 25, 2025

fselmo force-pushed the feat/test-vectors branch from 51dedef to 65d52ed October 26, 2025 21:48

fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 26, 2025

fix: cleanups from comments on PR leanEthereum#76

0e7f1a3

unnawut reviewed Oct 27, 2025

tests/consensus/devnet/stf_examples/test_blocks.py

fselmo added 7 commits October 27, 2025 15:29

refactor: some fork choice logic to help advance chain with blocks

e1a078e

- forkchoice was not advancing chain with blocks, made some minor changes as compared to consensus-specs that may not be correct but help show how to write forkchoice tests for vector generation.

fix: cleanups from comments on PR leanEthereum#76

b8e41d6

Add missing STF test post checks + confirm state validation works

899a9a3

chore: Update CLAUDE.md; make more succinct to reduce context

8c75a9b

chore: Add official Python 3.14 support to lean-spec-tests

13f232a

chore: Update CLAUDE.md based on last commit's refactor

5209b20

fselmo force-pushed the feat/test-vectors branch 4 times, most recently from b41e983 to 3ad914f October 28, 2025 00:19

fselmo added 2 commits October 27, 2025 19:04

fix: reconcile differences with main after rebase + cleanup impl

fd8e8ef

- Reconcile differences with ``main`` after rebase.
- Clean up the implementation of the test fixtures + remove clunky builder for now.

chore: add simple fill-devnet-tests run to CI for sanity checks

81090ef

fselmo force-pushed the feat/test-vectors branch from 3ad914f to 81090ef October 28, 2025 01:04

syjn99 approved these changes Oct 28, 2025

unnawut approved these changes Oct 28, 2025

jihoonsong reviewed Oct 28, 2025

fselmo added a commit to fselmo/leanSpec that referenced this pull request Oct 28, 2025

fix: updates from comments on leanEthereum#76

a839dff

- Remove comment that snuck into spec docstring

tcoratger approved these changes Oct 28, 2025

fselmo force-pushed the feat/test-vectors branch from a839dff to 457a1da October 28, 2025 22:55

tcoratger merged commit b091ef7 into leanEthereum:main Oct 28, 2025
10 checks passed

tcoratger mentioned this pull request Oct 28, 2025

tests: set appropriate default for validators in test fixtures #93

Closed

fselmo deleted the feat/test-vectors branch October 29, 2025 05:02

This was referenced Oct 29, 2025

Develop spec test framework leanEthereum/pm#32

Closed

Implement test vectors for devnet0 and devnet1 leanEthereum/pm#42

Open

g11tech reviewed Oct 29, 2025

fselmo mentioned this pull request Oct 29, 2025

fix(specs,testing): Fix serializaiton to use camelCase; refactor model impls #96

Merged

1 task
