Skip to content

Conversation

@LagginTimes
Copy link
Contributor

@LagginTimes LagginTimes commented Feb 26, 2025

Fixes #1740

Description

Work for this began as part of #1811, based on this comment.

Emitter::mempool now returns a MempoolEvent which provides the data for tracking evicted_at:

pub struct MempoolEvent {
    /// Unemitted transactions or transactions with ancestors that are unseen by the receiver.
    ///
    /// To understand the second condition, consider a receiver which filters transactions based on
    /// whether it alters the UTXO set of tracked script pubkeys. If an emitted mempool transaction
    /// spends a tracked UTXO which is confirmed at height `h`, but the receiver has only seen up to
    /// block of height `h-1`, we want to re-emit this transaction until the receiver has seen the
    /// block at height `h`.
    pub new_txs: Vec<(Transaction, u64)>,

    /// [`Txid`]s of all transactions that have been evicted from mempool.
    pub evicted_txids: HashSet<Txid>,

    /// The latest timestamp of when a transaction entered the mempool.
    ///
    /// This is useful for setting the timestamp for evicted transactions.
    pub latest_update_time: u64,
}

Changelog notice

  • Change Emitter::mempool to return MempoolEvents which contain mempool-eviction data.
  • Change Emitter::client to have more relaxed generic bounds. C: Deref, C::Target: RpcApi are the new bounds.
  • Add conversion impls for CanonicalTx to Txid/Arc<Transaction>.
  • Add ChainPosition::is_unconfirmed method.

Checklists

All Submissions:

  • I've signed all my commits
  • I followed the contribution guidelines
  • I ran cargo fmt and cargo clippy before committing

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature

Bugfixes:

  • This pull request breaks the existing API
  • I've added tests to reproduce the issue which are now passing
  • I'm linking the issue being fixed by this PR

@evanlinjin
Copy link
Member

Instead of rebasing, it's probably best to git switch -c <new-branch> from #1839 and cherry-pick on the new commits.

@LagginTimes LagginTimes marked this pull request as draft February 28, 2025 20:24
@LagginTimes LagginTimes force-pushed the mempool_evicted branch 3 times, most recently from f0b807b to 96053d3 Compare March 3, 2025 20:40
@notmandatory notmandatory moved this to In Progress in BDK Chain Mar 4, 2025
@notmandatory notmandatory added new feature New feature or request api A breaking API change labels Mar 4, 2025
@LagginTimes LagginTimes force-pushed the mempool_evicted branch 2 times, most recently from 7b25e4a to 16b63c4 Compare March 10, 2025 20:17
@notmandatory notmandatory moved this from In Progress to Needs Review in BDK Chain Mar 11, 2025
@notmandatory notmandatory marked this pull request as ready for review March 11, 2025 17:14
evanlinjin added a commit that referenced this pull request Mar 15, 2025
0a20724 feat(examples): Update example crates to use `expected_spk_txids` (志宇)
1f8fc17 feat(core)!: Remove redundant `SyncRequest` methods (志宇)
f42d5a8 feat(esplora): Handle spks with expected txids (志宇)
3ab4994 feat(electrum): Handle spks with expected txids (志宇)
64e4100 feat(chain): Add `TxGraph` methods that handle expected spk txids (志宇)
b38569f feat(core): Add expected txids to `SyncRequest` spks (Wei Chen)
a2dfcb9 feat(chain)!: Change `TxGraph` to understand evicted-at & last-evicted (志宇)
ae0336b feat(core): Add `TxUpdate::evicted_ats` (志宇)

Pull request description:

  Partially Fixes #1740.
  Replaces #1765.
  Replaces #1811.

  ### Description

  This PR allows the receiving structures (`bdk_chain`, `bdk_wallet`) to detect and evict incoming transactions that are double spent (cancelled).

  We add a new field to `TxUpdate` (`TxUpdate::evicted_ats`), which in turn, updates the `last_evicted` timestamps that are tracked/persisted by `TxGraph`. This is similar to how `TxUpdate::seen_ats` update the `last_seen` timestamp in `TxGraph`. Transactions with a `last_evicted` timestamp higher than their `last_seen` timestamp are considered evicted.

  `SpkWithExpectedTxids` is introduced in `SpkClient` to track expected `Txid`s for each `spk`. During a sync, if any `Txid`s from `SpkWithExpectedTxids` are not in the current history of an `spk` obtained from the chain source, those `Txid`s are considered missing. Support for `SpkWithExpectedTxids` has been added to both `bdk_electrum` and `bdk_esplora` chain source crates.

  The canonicalization algorithm is updated to disregard transactions with a `last_evicted` timestamp greater than or equal to their `last_seen` timestamp, except in cases where transitivity rules apply.

  ### Notes to the reviewers

  This PR does not fix #1740 for block-by-block chain source (such as `bdk_bitcoind_rpc`). This work is done in a separate PR (#1857).

  ### Changelog notice

  * Add `TxUpdate::evicted_ats` which tracks transactions that have been replaced and are no longer present in mempool.
  * Change `TxGraph` to track `last_evicted` timestamps. This is when a transaction is last marked as missing from the mempool.
  * The canonicalization algorithm now disregards transactions with a `last_evicted` timestamp greater than or equal to it's `last_seen` timestamp, except when a canonical descendant exists due to rules of transitivity.
  * Add `SpkWithExpectedTxids` in `spk_client` which keeps track of expected `Txid`s for each `spk`.
  * Change `bdk_electrum` and `bdk_esplora` to understand `SpkWithExpectedTxids`.
  * Add `SyncRequestBuilder::expected_txids_of_spk` method which adds an association between `txid`s and `spk`s.

  ### Checklists

  #### All Submissions:

  * [x] I've signed all my commits
  * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md)
  * [x] I ran `cargo fmt` and `cargo clippy` before committing

  #### New Features:

  * [x] I've added tests for the new feature
  * [x] I've added docs for the new feature

  #### Bugfixes:

  * [x] This pull request breaks the existing API
  * [x] I've added tests to reproduce the issue which are now passing
  * [x] I'm linking the issue being fixed by this PR

ACKs for top commit:
  LLFourn:
    ACK 0a20724

Tree-SHA512: 29ef964e4597aa9f4cf02e9997b0d17cb91ec2f0f1187b0e9ade3709636b873c2a7cbe803facbc4686a3050a2abeb3e9cc40f9308f8cded9c9353734dcc5755b
@evanlinjin evanlinjin marked this pull request as draft March 25, 2025 09:18
@LagginTimes LagginTimes force-pushed the mempool_evicted branch 4 times, most recently from 8b055d2 to ca37ae2 Compare April 1, 2025 14:09
@LagginTimes LagginTimes force-pushed the mempool_evicted branch 2 times, most recently from 9e00579 to b94a12a Compare April 7, 2025 09:43
Copy link
Member

@evanlinjin evanlinjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation is outdated, but I understand this is WIP.

@LagginTimes LagginTimes marked this pull request as ready for review April 22, 2025 12:40
@notmandatory notmandatory requested a review from evanlinjin April 22, 2025 14:06
@LagginTimes LagginTimes requested a review from evanlinjin April 24, 2025 11:38
Copy link
Member

@evanlinjin evanlinjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't finished reviewing yet. Some thoughts:

  1. We probably need something like insert_relevant_evicted_ats(&mut self, evicted_txs: impl IntoIterator<Item = (Txid, u64)>) on TxGraph for convenience.

  2. The old best-effort-avoid-reemitting-mempool-txs logic might actually be replaceable by caching Arc<Transaction> in expected_mempool_txids.

    prior_mempool_txs: HashMap<Txid, Arc<Transaction>>,

    The idea is we always emit the whole mempool but it is still memory-efficient.

For 2., I guess we can tackle that in a separate PR.

Copy link
Member

@evanlinjin evanlinjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic looks solid.

There’s room for improvement in documentation clarity across tests, inline code comments, and public doc comments:

  • Code comments: Focus on the why behind the logic, rather than describing what the code is doing—since that’s usually self-evident from the code itself.
  • Tests: For larger tests, consider adding a summary doc comment at the top. This helps readers quickly understand what the test is verifying without having to trace through each line and reconstruct the intent.
  • Public doc comments: Prioritize what’s important for the caller to know—avoid going into implementation details unless they directly impact usage.

Copy link
Member

@evanlinjin evanlinjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a final request.

Copy link
Member

@evanlinjin evanlinjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the incremental reviews — it's been eye-opening for me as well how tricky the Emitter is to reason about. Let's aim to simplify it in follow-up PRs.

One thing I’ve come to realize: the next_header method is inconsistent with the eviction-detection logic we're introducing via next_block and mempool. I suggest removing next_header entirely — everything it does can already be achieved through next_block.

Copy link
Member

@evanlinjin evanlinjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self-ACK e1c53d7

@ValuedMammal
Copy link
Collaborator

@LagginTimes Can you rebase on master and also squash 6e1178c into the previous commit?

ValuedMammal and others added 4 commits May 1, 2025 14:41
* `From<CanonicalTx> for Txid`
* `From<CanonicalTx> for Arc<Transaction>`

Also added a convenience method `ChainPosition::is_unconfirmed`.

These are intended to simplify the calls needed to populate the
`expected_mempool_txids` field of `Emitter::new`.
* Change signature of `Emitter::new` so that `expected_mempool_txids`
  can be more easily constructed from `TxGraph` methods.
* Change generic bounds of `C` within `Emitter<C>` to be `C: DeRef,
  C::Target: RpcApi`. This allows the caller to have `Arc<Client>` as
  `C` and does not force to caller to hold a lifetimed reference.
Copy link
Collaborator

@ValuedMammal ValuedMammal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 8513d83

@ValuedMammal ValuedMammal merged commit add65bc into bitcoindevkit:master May 1, 2025
17 checks passed
@github-project-automation github-project-automation bot moved this from Needs Review to Done in BDK Chain May 1, 2025
@ValuedMammal
Copy link
Collaborator

Thanks @LagginTimes

I didn't mean to become author on 28ef7c9 😅

@ValuedMammal ValuedMammal mentioned this pull request May 1, 2025
39 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api A breaking API change new feature New feature or request

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Undetected double-spent

4 participants