Skip to content

Conversation

richardkiss
Copy link
Contributor

Two purposes:

  • remove the dependency of chia.full_node from chia.consensus
  • prep for modifying full node code to separate out consensus concerns from block archive storage and "coin explorer" functionality

Towards RocksDB brain transplant.

@richardkiss richardkiss requested a review from a team as a code owner August 12, 2025 23:49
@richardkiss richardkiss marked this pull request as draft August 12, 2025 23:49
@richardkiss richardkiss requested a review from Copilot August 12, 2025 23:49
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR creates a new ConsensusStore class that combines CoinStore, BlockStore, and BlockHeightMap into a unified interface to reduce dependencies between chia.full_node and chia.consensus modules. This refactoring prepares for future RocksDB integration by consolidating consensus-related storage operations.

Key changes:

  • Introduces ConsensusStore and ConsensusStoreProtocol to wrap existing storage components
  • Updates Blockchain constructor to accept a single ConsensusStore instead of separate stores
  • Modifies all test files and utilities to use the new consolidated store pattern

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
chia/full_node/full_node.py Updates to create and use ConsensusStore instead of passing individual stores
chia/consensus/consensus_store_protocol.py New protocol defining the unified interface for consensus storage operations
chia/consensus/consensus_store.py Implementation of ConsensusStore that delegates to underlying stores
chia/consensus/blockchain.py Major refactoring to use ConsensusStore instead of individual store references
chia/_tests/util/blockchain.py Test utility updated to use new store pattern
chia/_tests/core/test_db_validation.py Test updated to create and use ConsensusStore
chia/_tests/core/test_db_conversion.py Test updated to create and use ConsensusStore
chia/_tests/core/full_node/stores/test_coin_store.py Test updated to create and use ConsensusStore
chia/_tests/core/full_node/stores/test_block_store.py Test updated to create and use ConsensusStore
chia/_tests/core/full_node/ram_db.py Test utility updated to use new store pattern
benchmarks/block_ref.py Benchmark updated to use new store pattern

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@richardkiss richardkiss force-pushed the consensus_store branch 2 times, most recently from 39decfe to 22f5bcb Compare August 13, 2025 22:10
@richardkiss richardkiss added the Changed Required label for PR that categorizes merge commit message as "Changed" for changelog label Aug 13, 2025
@richardkiss richardkiss force-pushed the consensus_store branch 4 times, most recently from 8bf99bf to 7bac71d Compare August 13, 2025 22:43


@dataclasses.dataclass
class ConsensusStoreSQLite3(ConsensusStoreProtocol):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally don't inherit protocols when implementing them. here's our existing pattern for expressing that a given protocol is intended to be implemented and for making an explicit mypy check against it that provides a clear explanation of deviations in more cases than other options do.

@final
@dataclasses.dataclass
class FullNode:
if TYPE_CHECKING:
from chia.rpc.rpc_server import RpcServiceProtocol
_protocol_check: ClassVar[RpcServiceProtocol] = cast("FullNode", None)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big fan of this pattern, although I see the value of having the error be closer to the source. I asked ChatGPT and it came up with

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from my_protocols import RpcServiceProtocol
    from typing import assert_type  # or use reveal_type

    def _assert_protocol(x: RpcServiceProtocol) -> None: ...
    _assert_protocol(FullNode())  # mypy will check this

I like this a little better because at least it doesn't pollute the class, and it can be put at the end of the file. So I may try that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does that provide the clear output that my suggestion does? i didn't pick it because i liked the form. i picked it because of all the options it was the most useful (and i haven't gotten back to trying to get my PR improving mypy output done).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about this, we can revisit this pattern but let's keep this pr granular instead of having it start introducing replacement hinting practices to be debated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help me understand what's wrong with the way it's done here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is different than the widely used and long standing existing pattern. it locates the expression of fulfilling the protocol much further from the class line.

why does it need to be different than all the other places we do this? and why do you want to hold this pr up on that discussion? we both agree that all options here are ugly anyways.

@richardkiss richardkiss force-pushed the consensus_store branch 3 times, most recently from 3923c13 to 99a05dc Compare August 25, 2025 23:24
Copy link
Contributor

@arvidn arvidn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea of bundling all storage classes into a single class makes sense. If we're hoping to provide a reasonable interface for transactions across different back-ends, we would need something to abstract it. To make it look like a single transaction.

However, I'm not quite following the vision for how to get to that combined transaction interface. The DBWrapper would have to be part of ConsensusStore too, wouldn't it?

from chia.util.errors import Err


async def check_block_store_invariant(bc: Blockchain):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function checks the invariant of the sqlite database. I don't think it makes sense to make it generic to check the invariant of any block store. The right hand, generic, version looks a bit odd. get_block_heights_in_main_chain() sounds like it would return range(0, peak.height).

That's not really what this invariant check is ensuring. I think you should leave this function as-is, but only operate on the sqlite block store. The future RocksDB block store will most likely have different invariants to check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a very specific invariant, one that should probably be true for any block store that supports in_main_chain (and I'm actually quite skeptical about the usefulness of keeping and constantly updating this flag, since it's a bit redundant with the in-RAM BlockHeightMap). I'm fine returning this test to its previous state and removing the rather silly API I created.

The test will have to check that bc.block_store actually exists (and when it's eventually replaced with BlockArchiveStore, have dig down to get the implementation details).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is now much closer to its original form.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and I'm actually quite skeptical about the usefulness of keeping and constantly updating this flag, since it's a bit redundant with the in-RAM BlockHeightMap)

The in RAM map is not sustainable and will need to be removed eventually

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think the api to ConsensusStore seems unreasonably large, with a lot of redundancy. It could be an interesting future project to simplify the api; we could, at the same time, remove the annoying non-async methods that BlockHeightMap currently provides, and that caching moved to either FullNode or ConsensusStore.

) -> None:
await self.block_store.persist_sub_epoch_challenge_segments(ses_block_hash, segments)

async def rollback_to_block(self, block_index: int) -> dict[bytes32, CoinRecord]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this name becomes easy to misunderstand now. It could be thought to reset the peak down to the specified height.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've always preferred the name .rewind_to_height which seems much less ambiguous. I'd be happy to switch to this name, although the .rollback_to_block name is used in so many places that changing it everywhere would be a fairly large PR.

Copy link
Contributor Author

@richardkiss richardkiss Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not so bad.

% git grep -l 'rollback_to_block'
CHANGELOG.md
chia/_tests/blockchain/test_blockchain.py
chia/_tests/core/full_node/stores/test_coin_store.py
chia/_tests/util/spend_sim.py
chia/_tests/wallet/cat_wallet/test_cat_wallet.py
chia/_tests/wallet/did_wallet/test_did.py
chia/_tests/wallet/test_nft_store.py
chia/_tests/wallet/test_transaction_store.py
chia/_tests/wallet/test_wallet_coin_store.py
chia/consensus/blockchain.py
chia/consensus/consensus_store_protocol.py
chia/data_layer/dl_wallet_store.py
chia/full_node/coin_store.py
chia/full_node/coin_store_protocol.py
chia/full_node/consensus_store_sqlite3.py
chia/simulator/full_node_simulator.py
chia/wallet/trading/trade_store.py
chia/wallet/wallet_coin_store.py
chia/wallet/wallet_interested_store.py
chia/wallet/wallet_nft_store.py
chia/wallet/wallet_retry_store.py
chia/wallet/wallet_state_manager.py
chia/wallet/wallet_transaction_store.py

% git grep 'rollback_to_block' | wc
      40     207    3952

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a problem with "rollback" (but I agree that "rewind" sounds better). My point is that this function is now on the ConsensusStore, so it looks like it would affect the whole consensus store, both the blockchain and the coin set. But it only affects the coin set. I think that's misleading

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm a bit confused. To my mind, consensus is coin set and a "peak" block (since coin records refer to block indices which are meaningless without a "peak" context) and nothing else (except for potential and annoying generator backrefs). What specifically does this rollback/rewind function fail to roll back?

Copy link
Contributor

@arvidn arvidn Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my mind, consensus is coin set and a "peak" block

The ConsensusStore class also includes the blocks forming the chain from genesis to peak. I would think those would also be part of "consensus".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The in_main_chain flag has always bothered me a bit because it's not an immutable property of a block: it's a property of the view of the "best chain" that we have found so far (which can be summarized by the "peak block"). So, given a peak block, we transitively get a list of blocks in that peak block's [main] chain. The BlockHeightMap, or whatever eventually replaces it, seems like a great place to store this information about which blocks are in the main chain since it is much more clearly scoped to a particular peak/chain.

Copy link
Contributor Author

@richardkiss richardkiss Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess your concern is the block store and the coin store each have a view of the peak block that can get out of sync. This is something we would definitely have to reconcile at ConsensusStore init time if peak block context is stored separately in each DB. We might be able to modify some of this stuff with lower risk due to the opt-in alpha test period of the next DB version. But this particular PR should change nothing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think that's fair to say. But I think this concern is exacerbated by having this ConsensusStore interface which seems like it promises to maintain this invariant.

Copy link
Contributor

@arvidn arvidn Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intention is not to maintain this invariant, between the various stores, the methods should probably be named to indicate which store they operate on. This method is called rollback_to_block(), there's nothing suggesting that it's rolling back the coin store. Likewise, to roll back the blockstore, the method is called rollback(), which says nothing about that it only rolls back the block store.

I think it's dangerously confusing to use these names. I can see two solutions:

  1. unify rollback() into a single method that ensures that all underlying stores are always in sync.
  2. Be a bit more explicit about not maintaining any invariants and name all methods with a suffix of the store they operate on. e.g. rollback_blockstore() and rollback_coinstore(). Another example would be new_peak_coinstore(), add_block_blockstore(). The both "add" a new block, but neither affect the other store.

included_reward_coins: Collection[Coin],
tx_additions: Collection[tuple[bytes32, Coin, bool]],
tx_removals: list[bytes32],
) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function could also be interpreted to update the peak, but it doesn't

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is also referring to the fact that it only affects the coin store, not the blockchain. It's misleading since it's a function on ConsensusStore and it doesn't say anything about only applying to the coin set.

I think you may have an opportunity to actually raise the abstraction level by making these functions apply to all stores, to make it more clear that they are kept in sync and updated in lock-step.


class ConsensusStoreProtocol(Protocol):
"""
Read-only protocol for the consensus store.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you removing the concept of a read transaction? (i.e. what we call .reader() today). If so, are you confident that's safe? I believe there are places where we expect to see a consistent snapshot of the database across calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the ConsensusStoreWriter access to a .writer function, so we could come up with ConsensusStoreReader and .reader to support read transactions with consistency guarantees if it turns out to be necessary. For consensus, we might be able to get away without it, but I'm not sure yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can assume that it's safe to remove read-consistency until proven otherwise. If you're removing it, I think the burden of proof is on you to show that it's safe.

But maybe this class could be renamed ConsensusStoreReadProtocol and the ConsensusStoreProtocol just contain;

class ConsensusStoreProtocol(Protocol):
    def writer(self) -> AsyncContextManager[ConsensusStoreWriteProtocol]: ...
    def reader(self) -> AsyncContextManager[ConsensusStoreReadProtocol]: ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I've changed any read operations that are currently in a read transaction.


return records_to_add, state_summary

async def _perform_db_operations_for_peak(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper isn't strictly necessary, but by creating the writer async context manager above, it preserves existing indentation and makes the changes much easier to review.

_peak_height: Optional[uint32]
# All blocks in peak path are guaranteed to be included, can include orphan blocks
__block_records: dict[bytes32, BlockRecord]
_block_records: dict[bytes32, BlockRecord]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason to use double-underscore here, and it does have a specific meaning that we don't actually want to invoke here. https://docs.python.org/3/tutorial/classes.html#private-variables

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion. I used double underscores to make the fields more private, to make it harder to write tests with layer violations, which in turn I would suffer from in future refactors. I'm not a python expert, so I'm not sure what other considerations there are that makes double underscores undesireable.

_block_records: dict[bytes32, BlockRecord]
# all hashes of blocks in block_record by height, used for garbage collection
__heights_in_cache: dict[uint32, set[bytes32]]
_heights_in_cache: dict[uint32, set[bytes32]]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_heights_in_cache: dict[uint32, set[bytes32]]
# maps block height (of the current heaviest chain) to block hash and sub
# epoch summaries
__height_map: BlockHeightMap
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be that originally we wanted this BlockHeightMap to be a singleton. But I think it could just cause more problems than it solves.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote BlockHeightMap. It was never intended to be a singleton (is there anything suggesting it was?).
Its functionality existing as members of the Blockchain class before, I just encapsulated them into their own class, with their own invariants and added unit tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .__ double underscore makes it a class variable, which makes it a singleton, so I thought maybe that was the original intent. But yeah, I also considered it might just be a misunderstanding/copying the (unnecessary?) patterns that already existed in Blockchain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kyle correctly points out it's not a singleton... it's just an instance variable that's name-mangled at every use in a way that's class specific. This makes it even less plausibly useful here; and was probably just introduced by mistaken since I don't think this class has a subclass. Maybe it did at one point, but I'd be surprised.

@richardkiss richardkiss marked this pull request as ready for review September 11, 2025 20:27
@richardkiss richardkiss force-pushed the consensus_store branch 2 times, most recently from deb87cc to 053344a Compare September 22, 2025 20:54
try:
# Always add the block to the database
async with self.block_store.transaction():
async with self.consensus_store.writer() as writer:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually invokes .block_store.transaction() under the covers.

and the new chain, or returns None if there was no update to the heaviest chain.
"""

async with self.consensus_store.writer() as writer:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a real change, where the writes that have been moved to _perform_db_operations_for_peak are now in a single transaction context, whereas before they were not. This is likely a different semantic. I don't see an obvious way how it could be worse though; it just limits the failure cases to a smaller subset (we can't get just some of the writes, we either get all or none, which was also possible prior to adding this line).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changed Required label for PR that categorizes merge commit message as "Changed" for changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants