Skip to content

Conversation

Jiloc
Copy link
Contributor

@Jiloc Jiloc commented Oct 6, 2025

Description

This PR introduces an overhaul to the smart contract consensus testing suite, replacing manual test setups with a declarative framework. The goal is to make it significantly easier to write robust, multi-epoch, and multi-Clarity-version consensus tests.

We focus on ensuring we don't break consensus for state transitions that already happened, can happen now, or will happen in the future.

  • Past Events: For consensus-breaking changes in past epochs (e.g., 2.0, 3.1):

    • If a specific scenario never actually occurred on mainnet, we don't need to test it, as it can no longer happen.
    • If a scenario did happen, Genesis Sync is the mechanism responsible for capturing the real-world historical state. Our test suite does not need to replicate it.
  • Current and Future Events:

    • Contract Calls: Contracts deployed in old epochs (e.g., 2.0+) can still be called today (in 3.2+) and in the future. Therefore, the strategy here is to deploy test contracts across all epochs (2.0+) using every Clarity version that was valid at the time, but execute the function calls only from the current epoch (3.2) onwards.
    • Everything Else (Deploys without calls, Transfers, etc.): For new transactions that don't interact with historical state, we only need to test them from the current (3.2) and future epochs.
  • New Macros:

    • contract_call_consensus_test!: Allows defining a full contract deployment and function call test scenario that automatically implements the strategy described above.
    • contract_deploy_consensus_test!: A specialized version for testing only the contract deployment phase, focusing on current and future epochs.

Applicable issues

Additional info (benefits, drawbacks, caveats)

Checklist

  • Test coverage for new or modified code paths
  • Changelog is updated
  • Required documentation changes (e.g., docs/rpc/openapi.yaml and rpc-endpoints.md for v2 endpoints, event-dispatcher.md for new events)
  • New clarity functions have corresponding PR in clarity-benchmarking repo
  • New integration test(s) added to bitcoin-tests.yml

StacksEpochId::Epoch32
}

pub const ALL_GTE_30: &'static [StacksEpochId] = &[
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea of adding this here is that, even if requires manual updating when we add a new epoch, most of the functions in this file will also need to be updated. Expecially the ones above this, fn latest(). I believe it will be difficult to miss it!

tempfile = "3.3"
proptest = { version = "1.6.0", default-features = false, features = ["std"] }
insta = { version = "1.37.0", features = ["ron"] }
pinny = { git = "https://github.com/BitcoinL2-Labs/pinny-rs.git", rev = "54ba9d533a7b84525a5e65a3eae1a3ae76b9ea49" } #v0.0.2
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added pinny to simplify discovery of consesus tests. The main reason was to easiliy identify the ones relying on insta to easily filter them. But after some discussions we realized that this is not currently needed. As all of our consensus tests are currently within the consensus file. I can remove it

@Jiloc Jiloc self-assigned this Oct 7, 2025
@Jiloc Jiloc marked this pull request as ready for review October 7, 2025 16:53
@Jiloc Jiloc requested review from a team as code owners October 7, 2025 16:53
@Jiloc Jiloc marked this pull request as draft October 7, 2025 16:54
Copy link

codecov bot commented Oct 7, 2025

Codecov Report

❌ Patch coverage is 47.36842% with 40 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.51%. Comparing base (33d7113) to head (e5e2a24).
⚠️ Report is 18 commits behind head on develop.

Files with missing lines Patch % Lines
stackslib/src/chainstate/tests/consensus.rs 47.36% 40 Missing ⚠️

❌ Your project check has failed because the head coverage (61.51%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

❗ There is a different number of reports uploaded between BASE (33d7113) and HEAD (e5e2a24). Click for more details.

HEAD has 74 uploads less than BASE
Flag BASE (33d7113) HEAD (e5e2a24)
126 52
Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #6562       +/-   ##
============================================
- Coverage    74.70%   61.51%   -13.20%     
============================================
  Files          568      568               
  Lines       347131   347050       -81     
============================================
- Hits        259328   213474    -45854     
- Misses       87803   133576    +45773     
Files with missing lines Coverage Δ
stacks-common/src/types/mod.rs 74.70% <ø> (-6.18%) ⬇️
stackslib/src/chainstate/tests/consensus.rs 89.61% <47.36%> (-2.95%) ⬇️

... and 347 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33d7113...e5e2a24. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment on lines 309 to 327
insta::allow_duplicates! {
for res in contract_deploy_results {
insta::assert_ron_snapshot!("contract_deploys", res, {
// redact contract name, as it changes per epoch and clarity version
r#".transactions[].tx[0].name"# => contract_name,
r#".transactions[].tx[0].code_body"# => "[contract code]",
// redact contract code, as it may take too much space needlessly
r#".transactions[].tx[1]"# => "[clarity version]",
});
}
}
insta::allow_duplicates! {
for res in contract_call_results {
insta::assert_ron_snapshot!("contract_calls", res, {
// redact contract name, as it changes per epoch and clarity version
r#".transactions[].tx.*.name"# => contract_name,
});
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn’t work yet, becasuse the MARF is included in the same struct and will always change between blocks. But I think it would be really helpful for a few reasons:

  • It could provide an additional (optional) guarantee that executing the same transaction across different epochs produces the same result — while still allowing certain transactions to behave differently between epochs when needed.

  • When a new epoch is added, the tests would automatically run for that epoch as well. Tests that produce the same result as before would be marked as “passed” automatically, while only those that yield a different output would be marked as failed or “to be updated.”

  • It will help prevent most snapshot become too big and difficult to review.

In order to make it work, we will have to decide if we want to move the list of MARF hashes into a separate snapshot instead of in the same data struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in 8a6ec15

pub struct ExpectedTransactionOutput {
/// The transaction that was executed.
/// `None` for bitcoin transactions.
pub tx: Option<TransactionPayload>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when reviewing the snapshot to check if everything was correct for the matrix of contract deploy/contract call, I found it difficult to understand the mapping between the receipt and the original transaction. So I thought of adding it here. Maybe the TransactionPayload is too much, and would be enough to just save the transaction type?

@Jiloc Jiloc changed the title feat: add consensus_test macro feat: add macros for contract consensus tests Oct 10, 2025
@Jiloc Jiloc modified the milestones: 3.2.0.0.2, 3.3.0.0.0 Oct 10, 2025
@Jiloc Jiloc added the aac Avoiding Accidental Consensus label Oct 10, 2025
@Jiloc Jiloc added the aac-testing Avoiding Accidental Consensus Testing Specific Task label Oct 10, 2025
@Jiloc Jiloc moved this to Status: 💻 In Progress in Stacks Core Eng Oct 10, 2025
@Jiloc Jiloc moved this from Status: 💻 In Progress to Status: In Review in Stacks Core Eng Oct 10, 2025
@Jiloc Jiloc marked this pull request as ready for review October 10, 2025 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

aac Avoiding Accidental Consensus aac-testing Avoiding Accidental Consensus Testing Specific Task

Projects

Status: Status: In Review

Development

Successfully merging this pull request may close these issues.

AAC: add automatic permutations of the clarity/epoch versions for a test AAC: Add clarity version support to the Test harness

1 participant