BIP 181, 182, 183: BIPs for Utreexo #1923

kcalvinalvin · 2025-08-10T06:56:50Z

These are the 3 BIPs that describe Utreexo, a consensus-compatible (non-soft fork) way to send and verify transactions without storing the full UTXO set.

The 3 BIPs are for:

- The specification of the Utreexo accumulator.
- The specification of Bitcoin block and tx validation using the Utreexo accumulator.
- The peer-to-peer networking changes required to enable Utreexo nodes.

Mailing list post: https://groups.google.com/g/bitcoindev/c/W1lxBraKG_E

jmoik

some typos

utreexo-p2p-bip.md

jonatack

Thank you for proposing these drafts. They already look quite complete with respect to the editorial requirements (BIPs 2 and 3). I've done a cursory first pass. No immediate conceptual feedback. A few editorial comments follow; feel free to ignore them during conceptual review until they are applicable.

utreexo-p2p-bip.md

utreexo-validation-bip.md

utreexo-accumulator-bip.md


petertodd · 2025-08-12T15:52:53Z

You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.

1BitcoinBoWP1FZ4xwTNkq6XksKidmgYYw · 2025-08-12T18:29:31Z

I strongly recommend replacing SHA-256 with SHAKE256 (from the SHA-3 standard) for the following reasons:

1. Security Advantages

🔒 Provides built-in protection against length-extension attacks
📏 Offers flexible output lengths (supports 128-bit and 256-bit security levels)
⚙️ Based on Keccak sponge construction (NIST FIPS 202 standard)
🌐 Aligns with post-quantum cryptography standards

2. Comparative Analysis: SHA-256 vs SHAKE256

| Characteristic | SHA-256 | SHAKE256 |
| --- | --- | --- |
| Algorithm Family | SHA-2 | SHA-3 (Keccak) |
| Output Flexibility | Fixed 256-bit | Arbitrary length |
| Security Properties | Vulnerable to length-extension | Resistant to length-extension |
| Internal Structure | Merkle-Damgård | Sponge function |
| Standardization | NIST FIPS 180-4 | NIST FIPS 202 |

3. Functional Example

Input: Bitcoin

SHAKE256 (512-bit output):
6beb0661ba1fa7289bf359fbb81550bd9641cf5abc62a14d466c421c8a86e528e027632ec0e7ceb994650566f3c8258af2240333b6d0e9186766fd2c1ebb763a

SHAKE256 (256-bit output):
6beb0661ba1fa7289bf359fbb81550bd9641cf5abc62a14d466c421c8a86e528

4. Implementation Benefits

✅ Maintains 256-bit output compatibility where needed
✅ Future-proofs against emerging cryptographic vulnerabilities
✅ Reduces potential attack vectors through improved design
✅ Supports Bitcoin's security evolution while maintaining performance

5. Technical Reference

For detailed cryptographic differences:
Cryptographic Comparison: SHA-2 vs SHA-3

kcalvinalvin · 2025-08-18T11:06:29Z

> You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.

Sure, we can update the accumulator BIP with benchmarks for SHA512/256 vs SHA256.

But could you link to the aforementioned justifications for the other parts of the Bitcoin protocol that use SHA512?

kcalvinalvin · 2025-08-18T11:10:24Z

> I strongly recommend replacing SHA-256 with SHAKE256 (from the SHA-3 standard) for the following reasons:

SHAKE256 is not used in Bitcoin, and introducing a new hash function would add a trust assumption. We do not want to do this.

jonatack · 2025-08-18T14:35:55Z

Some friendly moderation to keep the discussion focused on technical review -- thanks.

kcalvinalvin · 2025-08-18T14:46:13Z

The reliance of Bitcoin on SHA-2—a legacy hash function designed by the National Security Agency (NSA)—introduces non-trivial security risks, particularly when considering the often-dismissed threat posed by quantum adversaries.

SHA256 and SHA512 are quantum resistant.

Migrating to SHAKE256 (a variant of SHA-3) would represent a meaningful improvement, though such a change merely delays the inevitable: Bitcoin must eventually transition to a quantum-resistant cryptographic framework. When this occurs—and it will, regardless of opposition—SHA-2, along with ECDSA private keys, public keys, and signatures, will become obsolete.
See: Length extension attack (Bitcoin is vulnerable because it's using SHA-256)

Ok but this has nothing to do with this BIP.

murchandamus · 2025-08-18T22:15:07Z

@1BitcoinBoWP1FZ4xwTNkq6XksKidmgYYw, please cut out the LLM generated comments. If any of us were interested in seeing an LLM’s prediction of what might be said about a topic, we could prompt one ourselves.

petertodd · 2025-08-18T22:18:29Z

On Mon, Aug 18, 2025 at 04:06:51AM -0700, Calvin Kim wrote:
> kcalvinalvin left a comment (bitcoin/bips#1923)
>
> > You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.
>
> Sure we can update the accumulator BIP with benchmarks for SHA512/256 vs SHA256.
>
> But could you link to the aforementioned justifications for the other parts of the Bitcoin protocol that use SHA512?

No part of the Bitcoin consensus protocol uses SHA512.

kcalvinalvin · 2025-08-19T06:17:17Z

> On Mon, Aug 18, 2025 at 04:06:51AM -0700, Calvin Kim wrote:
> > kcalvinalvin left a comment (bitcoin/bips#1923)
> >
> > > You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.
> >
> > Sure we can update the accumulator BIP with benchmarks for SHA512/256 vs SHA256.
> >
> > But could you link to the aforementioned justifications for the other parts of the Bitcoin protocol that use SHA512?
>
> No part of the Bitcoin consensus protocol uses SHA512.

Ok but you've stated in your previous comment "You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol". Would be very helpful to see what type of justifications the other protocols have made.

Second, I don't think it matters if SHA512 wasn't used in the Bitcoin consensus protocol. SHA512 is used in BIP32 and the argument that SHA512 is safe for generating private keys but not safe for Bitcoin consensus isn't sound.

I think our original justification (better performance with SHA512/256) mentioned in the BIP is sound. Happy to provide the benchmarks, they're being worked on at the moment.

lucad70 · 2025-08-21T19:13:46Z

utreexo-validation-bip.md

+| Name              | Type                     | Description                               |
+| ----------------- | ------------------------ | ----------------------------------------- |
+| Utreexo_Tag_V1    | 64 byte array            | The version tag to be prepended to the leafhash. |
+| Utreexo_Tag_V1    | 64 byte array            | The version tag to be prepended to the leafhash. |


For clarification, is the Utreexo_Tag_V1 really used twice in preimage to the hash?

My guess would be that this duplication is unintended.

Suggested change

| Name | Type | Description |
| ----------------- | ------------------------ | ----------------------------------------- |
| Utreexo_Tag_V1 | 64 byte array | The version tag to be prepended to the leafhash. |
| Utreexo_Tag_V1 | 64 byte array | The version tag to be prepended to the leafhash. |

| Name | Type | Description |
| ----------------- | ------------------------ | ----------------------------------------- |
| Utreexo_Tag_V1 | 64 byte array | The version tag to be prepended to the leafhash. |

Oh no the duplication is intended.

Since we use SHA512/256 as the hash function, each chunk is 128 bytes. Since the version tag is only 64 bytes, we need two of them.
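To illustrate the block-size argument above, here is a minimal sketch. It assumes the 64-byte tag is SHA-512 of the string "UtreexoV1"; the validation BIP remains authoritative for the exact tag definition, and the variable names are only illustrative.

```python
import hashlib

# Assumption: the 64-byte tag is SHA-512("UtreexoV1").
# Check the validation BIP for the exact definition.
UTREEXO_TAG_V1 = hashlib.sha512(b"UtreexoV1").digest()

# SHA-512, and therefore SHA-512/256, consumes input in 128-byte blocks.
BLOCK_SIZE = hashlib.sha512().block_size

# Prepending the tag twice fills exactly one block before any leaf data.
leaf_preimage_prefix = UTREEXO_TAG_V1 + UTREEXO_TAG_V1

print(len(UTREEXO_TAG_V1), BLOCK_SIZE, len(leaf_preimage_prefix))
```

With the prefix filling one whole block, an implementation could in principle precompute the hash state after that block and reuse it for every leaf.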

petertodd · 2025-08-24T13:48:55Z

> > On Mon, Aug 18, 2025 at 04:06:51AM -0700, Calvin Kim wrote:
> > > kcalvinalvin left a comment (bitcoin/bips#1923)
> > >
> > > > You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.
> > >
> > > Sure we can update the accumulator BIP with benchmarks for SHA512/256 vs SHA256.
> > >
> > > But could you link to the aforementioned justifications for the other parts of the Bitcoin protocol that use SHA512?
> >
> > No part of the Bitcoin consensus protocol uses SHA512.
>
> Ok but you've stated in your previous comment "You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol". Would be very helpful to see what type of justifications the other protocols have made.
>
> Second, I don't think it matters if SHA512 wasn't used in the Bitcoin consensus protocol. SHA512 is used in BIP32 and the argument that SHA512 is safe for generating private keys but not safe for Bitcoin consensus isn't sound.
>
> I think our original justification (better performance with SHA512/256) mentioned in the BIP is sound. Happy to provide the benchmarks, they're being worked on at the moment.

The question is 1) why are we adding a new dependency to consensus implementations, and 2) is this actually a performance increase, given that dedicated SHA256 hardware is becoming common?

Length-extension attacks are not relevant for this use-case as we are only committing to public data.

murchandamus

I had a look at most of the Accumulator Specification for the first helping. Looks very good already. I only reviewed the function definitions up to root_position, then skimmed the rest, before reading on from Rationale.

murchandamus · 2025-08-25T19:02:51Z

utreexo-accumulator-bip.md

+         Davidson Souza <[email protected]>
+Comments-URI: TBD
+Status: Draft
+Type: Specification


Nit: BIP 2 is still active, so this should be "Standards Track" for the time being.

murchandamus · 2025-08-25T19:04:27Z

utreexo-accumulator-bip.md

+
+## Abstract
+
+This BIP describes the Utreexo accumulator and it's operations. It lays down how to update the


Suggested change

This BIP describes the Utreexo accumulator and it's operations. It lays down how to update the

This BIP describes the Utreexo accumulator and its operations. It lays down how to update the

murchandamus · 2025-08-25T20:36:36Z

utreexo-accumulator-bip.md

+To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(log_2(N))$,
+where N is the number of elements ever added to the set, while still keeping proof sizes small and verification efficient.


In case this doesn’t get discussed later, it might be interesting to compare how O(log₂(N)) for all transaction outputs ever created compares to the current UTXO set size.

Technically the current Utreexo design is O(log2(N)) of all txos since the forest doesn't shrink on a deletion. We just move the leaf up so it has the same effect as shrinking the forest.

murchandamus · 2025-08-25T20:44:51Z

utreexo-accumulator-bip.md

+
+The following utility functions are required for performing accumulator operations:
+
+**parent_hash(left, right):** Returns the hash of the concatenation of two child hashes (`left` and `right`).


Does this ambiguity regarding the depth of the leaf in the tree not introduce similar weaknesses as the original Merkle tree construction? Why would we float up leaf-hashes rather than create a tagged hash at each level?

Is this fully mitigated due to the number of leaves being known?

> Does this ambiguity regarding the depth of the leaf in the tree not introduce similar weaknesses as the original Merkle tree construction?

Not quite sure which weakness you're referring to here. Is it CVE-2012-2459 (one from calculating the Bitcoin block header commitment)? Since we don't duplicate hashes, it's not vulnerable to that particular attack.

> Why would we float up leaf-hashes rather than create a tagged hash at each level?

Since we float up the leaf hashes, we can save on the proofs being sent over for the sibling later on.

On a tree like so, proof for 01 is 00, 09, 13.

14
|---------------\
12              13
|-------\       |-------\
08      09      10      11
|---\   |---\   |---\   |---\
00  01  02  03  04  05  06  07

If we delete 00, then 01 moves up to 08. The proof for 01 is now 09 and 13. The proof got shorter.

14
|---------------\
12              13
|-------\       |-------\
01      09      10      11
|---\   |---\   |---\   |---\
        02  03  04  05  06  07
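The shortening of the proof can be checked with a small sketch. It reuses the `parent` formula from the accumulator BIP; `single_tree_proof` is a hypothetical helper written for this comment (it is not the BIP's `proof_positions`) and only handles a single perfect tree.

```python
from typing import List


def parent(position: int, total_rows: int) -> int:
    # Same formula as parent() in the accumulator BIP.
    return (position >> 1) | (1 << total_rows)


def single_tree_proof(position: int, total_rows: int) -> List[int]:
    """Hypothetical helper: sibling positions on the path to the root.

    Only valid for one perfect tree with 2**total_rows leaves.
    """
    root = (1 << (total_rows + 1)) - 2
    proof = []
    while position != root:
        proof.append(position ^ 1)  # the sibling differs in the lowest bit
        position = parent(position, total_rows)
    return proof


# Leaf 01 at its original position 1 in the 8-leaf tree.
print(single_tree_proof(1, 3))  # [0, 9, 13]
# After 00 is deleted, 01 sits at position 8, one row higher.
print(single_tree_proof(8, 3))  # [9, 13]
```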

That's a good point, and we can add a bit about this potential issue in the BIP.

Leaves can move up, so it is no longer only leaves being hashed with leaves; leaf/internal-node pairs occur often. An attack would be to grind through transactions to get two leaf hashes that together could look like a leaf data preimage for a rogue UTXO.

The reason this isn't a problem is that in all cases when a node is verifying a UTXO proof, the full UTXO data is known and a keyed hash is used (see "UTXO Hash Preimages" section of the validation BIP) to get from the UTXO data to the leaf. The first 128 bytes input to the hash function are the tags (the hash of "UtreexoV1"). Since this tag is only used for UTXO data, and not in internal accumulator hashes, this should prevent any internal hashes from being interpreted as UTXO data.

murchandamus · 2025-08-25T20:49:04Z

utreexo-accumulator-bip.md

+    return sha512_256(left + right)
+```
+
+**treerows(numleaves):** Returns the minimum number of bits required to represent `numleaves - 1`. This corresponds to the height of the largest tree in the forest. Returns `0` if `numleaves` is `0`.


The numleaves - 1 throws me off here. It’s not obvious to me, why the function would be defined that way rather than the "minimum number of bits required to represent numleaves"? Perhaps a bit more context would help?

Ah it's because we wanted treerows to return the index of the largest tree not the length.
For the below tree, numleaves = 4 but we want treerows to return 2 not 3.

row 2: 06
       |-------\
row 1: 04      05
       |---\   |---\
row 0: 00  01  02  03

If we just took the minimum number of bits to represent numleaves = 4, we'd get 3. So to account for this, we take the minimum number of bits needed to represent numleaves-1. This off-by-one happens when numleaves is a power of two.

@adiabat did talk about wanting to make treerows return the length and not the index a while back so last chance to speak up? :)

I've added the explanation in the bip as well.
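For reference, a minimal sketch of `treerows` as described above, using Python's `int.bit_length()`. This follows the prose definition in the BIP draft and is not copied from the reference implementation.

```python
def treerows(numleaves: int) -> int:
    # Minimum number of bits needed to represent numleaves - 1,
    # with 0 returned for an empty accumulator.
    if numleaves == 0:
        return 0
    return (numleaves - 1).bit_length()


# With 4 leaves the top row has index 2, while 4 itself needs 3 bits.
print(treerows(4), (4).bit_length())  # 2 3
```

The off-by-one only shows up at powers of two: for `numleaves = 8`, `treerows` returns 3, whereas `(8).bit_length()` is 4.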

murchandamus · 2025-08-25T21:05:57Z

utreexo-accumulator-bip.md

+**parent(position, total_rows):** Returns the parent position of the given `position` in an accumulator with `total_rows` tree rows.
+
+Implementation:
+
+```python
+def parent(position: int, total_rows: int) -> int:
+    return (position >> 1) | (1 << total_rows)
+```


I could have used a little more explanation why this returns the parent, but staring at it for a bit, it seems to me that a fully filled tree with 2^n leaves would have 2^n - 1 inner nodes, meaning that all leaves start with a zero in the first position and all inner nodes start with a one.

E.g. for four leaves, the leaves are 000, 001, 010, and 011, and the inner nodes would be 100, 101, 110.

For 000 and 001, shifting to the right gives 00 and setting the top bit makes the parent 100. For 010 and 011, it works out to be 101. For 100 and 101, it works out to 110.

Gotcha, cool.
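The four-leaf walkthrough above can be reproduced directly with the `parent` function quoted from the BIP; the loop and the `TOTAL_ROWS` name are only for illustration.

```python
def parent(position: int, total_rows: int) -> int:
    # Drop the lowest bit, then set the bit at index total_rows.
    return (position >> 1) | (1 << total_rows)


TOTAL_ROWS = 2  # four leaves: rows 0, 1 and 2

# Positions 0-3 are leaves, 4-5 are inner nodes, 6 is the root.
for pos in range(6):
    print(f"{pos:03b} -> {parent(pos, TOTAL_ROWS):03b}")
```

This prints `000 -> 100`, `001 -> 100`, `010 -> 101`, `011 -> 101`, `100 -> 110` and `101 -> 110`, matching the example.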

murchandamus · 2025-08-25T21:16:41Z

utreexo-accumulator-bip.md

+substantial. In RSA-based designs, creating a proof for any given UTXO at arbitrary times can be computationally
+intensive, especially as the number of UTXOs grows.
+
+Utreexo's design is driven by the need for Bridge Nodes: nodes that maintain backward compatibility with existing


New jargon is usually italicized on introduction, perhaps consider:

Suggested change

Utreexo's design is driven by the need for Bridge Nodes: nodes that maintain backward compatibility with existing

Utreexo's design is driven by the need for *bridge nodes*: nodes that maintain backward compatibility with existing

murchandamus

This time I took a look at the "Validation Layer" BIP. Also looks very good already. I noticed that there is no Rationale section, and the title seemed a little less informative than it could be.

murchandamus · 2025-08-27T16:14:21Z

utreexo-validation-bip.md

+```
+BIP: TBD
+Layer: Peer Services
+Title: Utreexo - Validation Layer


The title feels a bit odd to me. It could be a bit more descriptive, I was thinking "Utreexo - Transaction and block validation" or smth?

murchandamus · 2025-08-27T16:16:50Z

utreexo-validation-bip.md

+         Davidson Souza <[email protected]>
+Comments-URI: TBD
+Status: Draft
+Type: Specification


Nit: Until BIP 3 activates, this should be Standards Track.

murchandamus · 2025-08-27T16:32:26Z

utreexo-validation-bip.md

+
+This BIP defines the rules for validating blocks and transactions using the
+Utreexo accumulator. It is important to note that this BIP does not define the
+Utreexo accumulator itself, for that see BIP-????. This document is only concerned with


Maybe for the time being:

Suggested change

Utreexo accumulator itself, for that see BIP-????. This document is only concerned with

Utreexo accumulator itself, for that see [BIP Utreexo Accumulator](utreexo-accumulator-bip.md). This document is only concerned with

murchandamus · 2025-08-27T16:35:58Z

utreexo-validation-bip.md

+### Node Hashes
+
+During a node's normal operation, it will need to compute the leaf hash for UTXOs
+being added or removed from the accumulator. The leaf hash is a 32 byte hash that


Suggested change

being added or removed from the accumulator. The leaf hash is a 32 byte hash that

being added or removed from the accumulator. The leaf hash is a 32-byte hash that

murchandamus · 2025-08-27T16:36:25Z

utreexo-validation-bip.md

+
+#### UTXO Hash Preimages
+
+Individual UTXOs are represented as 32 byte hashes in the Utreexo accumulator. To obtain this


Suggested change

Individual UTXOs are represented as 32 byte hashes in the Utreexo accumulator. To obtain this

Individual UTXOs are represented as 32-byte hashes in the Utreexo accumulator. To obtain this

murchandamus · 2025-08-27T18:39:17Z

utreexo-validation-bip.md

+do not have outputs that overwrites an existing UTXO.
+
+`BIP-0034` was a rule where the block height was included in the script signature
+of the coinbase transaction. One of the reason for the change was to make


As far as I can tell, the rest of BIP 34 explains the activation mechanism of BIP 34, so I would claim that this is the main reason.

Suggested change

of the coinbase transaction. One of the reason for the change was to make

of the coinbase transaction. The main reason for the change was to make

murchandamus · 2025-08-27T18:41:25Z

utreexo-validation-bip.md

+random bytes that could be interpreted as block heights. The lowest block
+heights are: 209,921, 490,897, and 1,983,702.


Suggested change

random bytes that could be interpreted as block heights. The lowest block
heights are: 209,921, 490,897, and 1,983,702.

random bytes that could be interpreted as block heights. The lowest implicated block
heights are: 209,921, 490,897, and 1,983,702.

murchandamus · 2025-08-27T18:47:35Z

utreexo-validation-bip.md

+that will probably never actually happen, however.
+
+Block 1,983,702 is the first block that Utreexo nodes would be in danger of a
+consensus failure due to the inability to perform the BIP-0030 checks. However,


Suggested change

consensus failure due to the inability to perform the BIP-0030 checks. However,

consensus failure due to the inability to perform the BIP-0030 checks, if someone were to reuse the coinbase transaction from block 164,384. However,

murchandamus · 2025-08-27T18:48:53Z

utreexo-validation-bip.md

+
+### Historical BIP-0030 violations
+
+There were two UTXOs that were overwritten due to this consensus rule are:


Not due to this rule, but rather before it was introduced:

Suggested change

There were two UTXOs that were overwritten due to this consensus rule are:

There were two UTXOs that were overwritten by repeated transactions:

murchandamus · 2025-08-27T18:51:27Z

utreexo-validation-bip.md

+accumulator. To be consensus compatible with clients that do have the historical
+violations, the leaves representing these two UTXOs in the Utreexo accumulator
+are hardcoded as unspendable.


If I’m understanding this right:

Suggested change

accumulator. To be consensus compatible with clients that do have the historical
violations, the leaves representing these two UTXOs in the Utreexo accumulator
are hardcoded as unspendable.

accumulator. To be consensus compatible with clients that retain only the second
occurrences of these outputs, the leaves representing the corresponding first UTXOs in the Utreexo accumulator
are hardcoded as unspendable.

murchandamus

I read the whole P2P BIP, although I went over the new messages section a bit more quickly. There are some sections that felt a bit confusing to me, perhaps you could try to take a look at whether you can clarify those for the less initiated. Overall, this seems close to complete, although I noticed that it is missing a Rationale section.

murchandamus · 2025-08-28T21:10:57Z

utreexo-p2p-bip.md

+Utreexo nodes require the inclusion proof to fully validate blocks and transactions.
+Each block has a corresponding inclusion proof with it and this inclusion proof for blocks up to height 906,937 requires an additional 631.85GB, which is roughly 40GB less than the size of the block data.
+Each transaction also has a corresponding inclusion proof with it and for normal transaction relay, the proof is roughly 3 times the size of the transaction.
+It's still reasonable for a single node to download this extra data but little caching goes a long way in reducing the amount of data that one has to download.


little caching ↦ almost no caching
a little caching ↦ some caching

I think you mean the latter:

Suggested change

It's still reasonable for a single node to download this extra data but little caching goes a long way in reducing the amount of data that one has to download.

It's still reasonable for a single node to download this extra data but a little caching goes a long way in reducing the amount of data that one has to download.

murchandamus · 2025-08-28T21:16:07Z

utreexo-p2p-bip.md

+CSNs have the goal of minimizing data storage and download while performing block validation.
+Archive and bridge nodes store more data and provide this data to CSNs.
+
+Bridge nodes are nodes that can add inclusion proofs to mempool transactions, support the same set of messages as CSNs, and should in fact be indistinguishable from CSNs on the network.


It’s not clear to me how "bridge nodes should in fact be indistinguishable from CSNs on the network". By whom are they indistinguishable? In what regard are they indistinguishable? Shouldn’t they, e.g., be frequently the first peer to notify about new transactions appearing in the mempool and blocks having been found as they act as the translation layer and therefore the initial source of data for the Utreexo-portion of the node network?

> It’s not clear to me how "bridge nodes should in fact be indistinguishable from CSNs on the network". By whom are they indistinguishable? In what regard are they indistinguishable?

They're indistinguishable as we don't explicitly specify which nodes are bridges. The sentence was an attempt at clarifying a common misconception that a CSN must connect to bridge nodes.

> Shouldn’t they, e.g., be frequently the first peer to notify about new transactions appearing in the mempool and blocks having been found as they act as the translation layer and therefore the initial source of data for the Utreexo-portion of the node network?

Yes, this is true. They usually should be the first to notify Utreexo peers about new txs and blocks.

Maybe "indistinguishable" is too strong -- it would be great if nobody could tell, but if there are a small number of bridge nodes and a large number of CSNs it might be traceable.

The main thing is bridge nodes don't announce themselves as such; they just pass proofs and transactions around just like CSNs. If you're a CSN connected directly to a bridge node, you might see a lot of INVs and proofs originate from that node, and they might be a bridge, but they might just be a well connected CSN.

It's similar to trying to prevent people from tracing new transactions to originating nodes, though probably in one sense harder (bridge nodes keep being bridge nodes all the time vs only getting one shot with a wallet broadcasting) but also lower stakes (determining that a node is a bridge doesn't hurt privacy or network strength that much).

murchandamus · 2025-08-28T21:19:21Z

utreexo-p2p-bip.md

+Archive and bridge nodes store more data and provide this data to CSNs.
+
+Bridge nodes are nodes that can add inclusion proofs to mempool transactions, support the same set of messages as CSNs, and should in fact be indistinguishable from CSNs on the network.
+Archive nodes are able to serve the blocks and the inclusion proofs. However, they are not able to generate the inclusion proofs as they do not keep the full UTXO set.


Does "Bridge node" refer to the aspect of whether the node has the UTXO set, and does "archive node" refer to having the full set of data? I.e., are these different dimensions? Would you run an "archive bridge node" if you want to offer all services?

Edit: Oh, never mind, you answer that right below.

murchandamus · 2025-08-28T21:23:00Z

utreexo-p2p-bip.md

+
+### Pre-P2P: Bridge Building
+
+When introducing Utreexo into an existing network, there are 2 thing needed before CSNs can operate.


Suggested change

When introducing Utreexo into an existing network, there are 2 thing needed before CSNs can operate.

When introducing Utreexo into an existing network, there are two things needed before CSNs can operate.

murchandamus · 2025-08-28T21:23:18Z

utreexo-p2p-bip.md

+### Pre-P2P: Bridge Building
+
+When introducing Utreexo into an existing network, there are 2 thing needed before CSNs can operate.
+First, archive nodes need to build proofs for old blocks to serve during the initial-block download (IBD).


Suggested change

First, archive nodes need to build proofs for old blocks to serve during the initial-block download (IBD).

First, archive nodes need to build proofs for old blocks to serve during the initial block download (IBD).

murchandamus · 2025-08-28T22:22:26Z

utreexo-p2p-bip.md

+With these merkle tree positions for the UTXOs referenced in the inputs, we can calculate the needed positions of the merkle hashes to them.
+These positions are then sent over in the `getdata` message as an another inventory vector.
+
+![Utreexo TX relay multiple Utreexo proof hash vectors](bip-utreexo-p2p/utreexo-tx-relay-multiple-proofhash-vectors.png)


Scrolling up and down through this document, it’s sometimes difficult to tell whether a paragraph belongs to the image before or after the paragraph. Since Markdown does not allow captions on images, it could for example help if either the images included the caption, or if the text were structured in some way that makes it clearer.

murchandamus · 2025-08-28T22:27:06Z

utreexo-p2p-bip.md

+![Legacy Block Propagation](bip-utreexo-p2p/legacy-block-propagation.png)
+
+Legacy block propagation without Compact Blocks comprises of three steps:


Consistency: Previously you were referring to non-Utreexo nodes as "current nodes", now it’s "legacy". Please use one term to refer to the same concept across the entire document.

murchandamus · 2025-08-28T22:37:23Z

utreexo-p2p-bip.md

+For some script types (e.g. `ScriptHash`, `PubkeyHash`, `WitnessScriptHash`, `WitnessPubkeyHash`) the actual locking condition is not in the scriptPubkey, but a hash of it.
+The script which is evaluated is provided as an element of the scriptSig or witness data.
+
+Therefore, we can safely just omit the locking script hash from the UTXO data and reconstruct it from the witness or scriptSig.


Suggested change

Therefore, we can safely just omit the locking script hash from the UTXO data and reconstruct it from the witness or scriptSig.

Therefore, we can safely omit the locking script hash from the UTXO data and reconstruct it from the witness or scriptSig.
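As a rough illustration of the reconstruction idea, the sketch below rebuilds a P2WSH scriptPubKey from a witness script, following the BIP 141 layout (`OP_0`, a 32-byte push, then the SHA-256 of the witness script). The function name and the toy script are hypothetical; the P2P BIP defines the actual compact leaf data encoding.

```python
import hashlib


def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    # BIP 141 P2WSH: OP_0 (0x00), push 32 bytes (0x20), SHA-256(witness script).
    return bytes([0x00, 0x20]) + hashlib.sha256(witness_script).digest()


# Toy witness script consisting of a single OP_TRUE (0x51).
witness_script = bytes([0x51])
script_pubkey = p2wsh_script_pubkey(witness_script)

print(len(script_pubkey))  # 34
```

Because the spending input already carries the witness script, a receiver can recompute these 34 bytes instead of downloading them.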

murchandamus · 2025-08-28T22:40:45Z

utreexo-p2p-bip.md

+|--------------|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| block height | uint32              | The time-to-live value of a leaf in the Utreexo merkle forest. The value is determined by the amount of leaves that were added to the accumulator since its creation                                                                                           |
+| length       | varint              | The length of the TTLs                                                                                                                                                                                                                                         |
+| TTLs         | vector of TTL infos | The TTL Info for the UTXOs that are added to the Utreexo merkle forest in blockchain ordering. See [Utreexo - Validation Layer](./utreexo-validation-bip.md#Excluded UTXOs from the accumulator) for the UTXOs that are not added to the Utreexo merkle forest |


Suggested change

| TTLs | vector of TTL infos | The TTL Info for the UTXOs that are added to the Utreexo merkle forest in blockchain ordering. See [Utreexo - Validation Layer](./utreexo-validation-bip.md#Excluded UTXOs from the accumulator) for the UTXOs that are not added to the Utreexo merkle forest |

| TTLs | vector of TTL infos | The TTL Info for the UTXOs that are added to the Utreexo merkle forest in blockchain ordering. See [Utreexo - Validation Layer](./utreexo-validation-bip.md#excluded-utxos-from-the-accumulator) for the UTXOs that are not added to the Utreexo merkle forest |

murchandamus · 2025-08-28T22:42:57Z

utreexo-p2p-bip.md

+
+Since there's one corresponding leaf data per target location, it's trivial to generate a bitmap for the leafdatas.
+
+Using the [proof_positions](./utreexo-accumulator-bip.md#Utility Functions) function, it's possible to generate the positions of the needed proof hashes for a given set of targets.


Suggested change

Using the [proof_positions](./utreexo-accumulator-bip.md#Utility Functions) function, it's possible to generate the positions of the needed proof hashes for a given set of targets.

Using the [proof_positions](./utreexo-accumulator-bip.md#utility-functions) function, it's possible to generate the positions of the needed proof hashes for a given set of targets.

jonatack · 2025-08-29T20:00:32Z

After discussion amongst the editors, we've assigned 181-183 for these 3 BIP drafts.

@murchandamus suggested 181 Accumulator / 182 Validation / 183 P2P (I agree) while leaving it up to you.

murchandamus · 2025-08-29T23:33:21Z

Whenever you get around to it, please add the numbers to the Preambles, set the "Created" header to 2025-08-29 (it holds the date a BIP got numbered), and add the table entries to the README.mediawiki.

kcalvinalvin · 2025-08-30T03:53:12Z

> After discussion amongst the editors, we've assigned 181-183 for these 3 BIP drafts.
>
> @murchandamus suggested 181 Accumulator / 182 Validation / 183 P2P (I agree) while leaving it up to you.

> Whenever you get around to it, please add the numbers to the Preambles, set the "Created" header to 2025-08-29 (it holds the date a BIP got numbered), and add the table entries to the README.mediawiki.

Currently going through all the reviews and writing up the rationale for validation and p2p. Will address these as well.

luisschwab · 2025-09-02T16:11:40Z

utreexo-p2p-bip.md

+#### Compact leaf data
+
+For a CSN to learn the data associated with a UTXO, it must ask for a peer that has it.
+To authenticate this data, it is committed into the accumulator, and therefore cannot be changed by peer.


Suggested change

To authenticate this data, it is committed into the accumulator, and therefore cannot be changed by peer.

To authenticate this data, it is committed into the accumulator, and therefore cannot be changed by the peer.

luisschwab · 2025-09-02T16:14:45Z

utreexo-p2p-bip.md

+
+## Abstract
+
+Utreexo creates a compact representation of the UTXO set that only takes a couple of kilobytes.


> At one extreme of this gradient, nodes minimize storage and memory requirements, keeping only the roots of the hash trees, which never exceed a kilobyte.

The Utreexo paper mentions that the upper limit of the accumulator size is a single KB. What changed?

It's essentially still a kilobyte, but since we can support leaf counts up to the maximum of uint64, we can have 64 roots, which is 64 * 32 = 2048 bytes. So 2 KB max.
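
The bound above can be checked with a few lines of arithmetic. This is only a sketch: it assumes one root per set bit of the leaf count (one perfect tree per bit) and 32-byte root hashes, and the function names are hypothetical rather than taken from the BIPs.

```python
HASH_SIZE = 32  # bytes per root hash (assumed 32-byte hashes)


def num_roots(numleaves: int) -> int:
    """Number of roots in the forest: one perfect tree per set bit of numleaves."""
    return bin(numleaves).count("1")


def max_accumulator_bytes(counter_bits: int = 64) -> int:
    """Worst case: every bit of the leaf counter is set, so there is one root per bit."""
    return counter_bits * HASH_SIZE


print(num_roots(2**64 - 1))     # all 64 bits set
print(max_accumulator_bytes())  # 64 * 32
```

In the worst case (a leaf count of 2^64 - 1) this gives 64 roots and 2048 bytes. Most leaf counts have far fewer set bits, so the typical size is smaller.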

kcalvinalvin

All of the review comments are addressed and the rationales for BIPs 182 and 183 were added.

BIP-0183 was also edited in the following ways:

1: Added captions to the images.
2: Gave the images transparent backgrounds and changed the colors so they can be read in dark mode.
3: Changed the layout of the images and the paragraphs to be more legible.

utreexo-p2p-bip.md

kcalvinalvin · 2025-08-29T08:53:36Z

utreexo-accumulator-bip.md

+To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(log_2(N))$,
+where N is the number of elements ever added to the set, while still keeping proof sizes small and verification efficient.


Technically the current Utreexo design is O(log2(N)) in the number of all TXOs ever added, since the forest doesn't shrink on a deletion. We just move the remaining sibling leaf up, so it has the same effect as shrinking the forest.

kcalvinalvin · 2025-08-29T08:58:51Z

utreexo-validation-bip.md

+| Name              | Type                     | Description                               |
+| ----------------- | ------------------------ | ----------------------------------------- |
+| Utreexo_Tag_V1    | 64 byte array            | The version tag to be prepended to the leafhash. |
+| Utreexo_Tag_V1    | 64 byte array            | The version tag to be prepended to the leafhash. |


Oh no, the duplication is intended.

Since we use SHA512/256 as the hash function, each chunk it processes is 128 bytes. Since the version tag is only 64 bytes, we need two of them to fill one chunk.
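
A quick way to see the size argument is sketched below. SHA-512/256 uses the SHA-512 compression function, so it shares SHA-512's block size. The tag value here is a hypothetical stand-in (any 64-byte value works for the size check); it is not the actual `Utreexo_Tag_V1`.

```python
import hashlib

# SHA-512/256 shares SHA-512's compression function and therefore its block size.
BLOCK_SIZE = hashlib.sha512().block_size

# Hypothetical 64-byte stand-in for the version tag, NOT the real Utreexo_Tag_V1.
example_tag = hashlib.sha512(b"hypothetical tag preimage").digest()

# Prepending the tag twice fills exactly one block before the leaf data starts.
prefix = example_tag + example_tag

print(BLOCK_SIZE, len(example_tag), len(prefix))
```

With the prefix aligned to one full block, an implementation could in principle precompute the hash state after the prefix and reuse it for every leaf; whether a given implementation does so is not stated in this thread.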

kcalvinalvin · 2025-08-29T09:07:08Z

utreexo-accumulator-bip.md

+
+The following utility functions are required for performing accumulator operations:
+
+**parent_hash(left, right):** Returns the hash of the concatenation of two child hashes (`left` and `right`).


> Does this ambiguity regarding the depth of the leaf in the tree not introduce similar weaknesses as the original Merkle tree construction?

Not quite sure which weakness you're referring to here. Is it CVE-2012-2459 (the one from calculating the Bitcoin block header commitment)? Since we don't duplicate hashes, it's not vulnerable to that particular attack.

> Why would we float up leaf-hashes rather than create a tagged hash at each level?

Since we float up the leaf hashes, we can save on the proof data that has to be sent for the sibling later on.

On a tree like so, proof for 01 is 00, 09, 13.

    14
    |---------------\
    12              13
    |-------\       |-------\
    08      09      10      11
    |---\   |---\   |---\   |---\
    00  01  02  03  04  05  06  07

If we delete 00, then 01 moves up to 08. The proof for 01 is now 09 and 13. The proof got shorter.

    14
    |---------------\
    12              13
    |-------\       |-------\
    01      09      10      11
    |---\   |---\   |---\   |---\
            02  03  04  05  06  07
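
The two proofs above can be reproduced with a short sketch. It assumes a single perfect tree with 8 leaves and the position numbering used in the diagrams (row 0 is 00-07, row 1 is 08-11, and so on). The helper names are hypothetical and this is not the BIP's `proof_positions` function, which handles multiple targets.

```python
def root_position(total_rows: int) -> int:
    """Position of the root of a perfect tree with the given number of rows above the leaves."""
    return (2 << total_rows) - 2


def parent(pos: int, total_rows: int) -> int:
    """Position of the parent node under the numbering used in the diagrams."""
    return (pos >> 1) | (1 << total_rows)


def proof_for(pos: int, total_rows: int) -> list[int]:
    """Sibling positions needed to hash from pos up to the root."""
    proof = []
    root = root_position(total_rows)
    while pos != root:
        proof.append(pos ^ 1)  # siblings differ only in the lowest bit
        pos = parent(pos, total_rows)
    return proof


print(proof_for(1, 3))  # leaf 01 at its original position
print(proof_for(8, 3))  # leaf 01 after it has moved up to position 08
```

The first call yields positions 00, 09, 13 and the second yields 09, 13, matching the two trees: moving the leaf up one row removes one sibling hash from its proof.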

kcalvinalvin · 2025-08-29T09:26:23Z

utreexo-accumulator-bip.md

+    return sha512_256(left + right)
+```
+
+**treerows(numleaves):** Returns the minimum number of bits required to represent `numleaves - 1`. This corresponds to the height of the largest tree in the forest. Returns `0` if `numleaves` is `0`.


Ah, it's because we wanted treerows to return the index of the top row of the largest tree, not the number of rows.
For the tree below, numleaves = 4 but we want treerows to return 2, not 3.

    row 2: 06
           |-------\
    row 1: 04      05
           |---\   |---\
    row 0: 00  01  02  03

If we just took the minimum number of bits to represent numleaves = 4, we'd get 3. So to account for this, we take the minimum number of bits needed to represent numleaves-1. This off-by-one happens when numleaves is a power of two.

@adiabat did talk about wanting to make treerows return the length and not the index a while back so last chance to speak up? :)

I've added the explanation in the BIP as well.
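
The off-by-one described above can be seen in a minimal sketch of `treerows`, written directly from the definition quoted in the diff (minimum number of bits needed to represent `numleaves - 1`, and `0` for `numleaves == 0`). It uses Python's `int.bit_length` and is not the BIP's reference code.

```python
def treerows(numleaves: int) -> int:
    """Minimum number of bits needed to represent numleaves - 1 (0 if numleaves is 0)."""
    if numleaves == 0:
        return 0
    return (numleaves - 1).bit_length()


# For numleaves = 4, the bit length of 4 itself is 3, but the top row index is 2.
print((4).bit_length(), treerows(4))
```

For `numleaves = 4` this prints `3 2`: the first value is what taking the bits of `numleaves` directly would give, the second is the intended row index.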

kcalvinalvin · 2025-08-29T10:08:22Z

utreexo-validation-bip.md

+proofs. Each of the positions in (1) refer to the UTXO hash preimage in the same
+index.


> For some reason I had thought that the accumulator proof was a Merkle branch, but now reading this, it makes me think that the proofs are built-up from the leaf preimages. Which of the two is correct, and could you perhaps check whether some more clarification should be added here to make it unambiguous?

You are right: there are the Merkle branches themselves, and the leaf preimages are entirely separate data apart from those.

I'll read it over again and make clarifications where needed.

kcalvinalvin · 2025-08-29T11:31:31Z

utreexo-p2p-bip.md

+CSNs have the goal of minimizing data storage and download while performing block validation.
+Archive and bridge nodes store more data and provide this data to CSNs.
+
+Bridge nodes are nodes that can add inclusion proofs to mempool transactions, support the same set of messages as CSNs, and should in fact be indistinguishable from CSNs on the network.


> It’s not clear to me how "bridge nodes should in fact be indistinguishable from CSNs on the network". By whom are they indistinguishable? In what regard are they indistinguishable?

They're indistinguishable as we don't explicitly specify which nodes are bridges. The sentence was an attempt at clarifying a common misconception that a CSN must connect to bridge nodes.

> Shouldn’t they, e.g., be frequently the first peer to notify about new transactions appearing in the mempool and blocks having been found as they act as the translation layer and therefore the initial source of data for the Utreexo-portion of the node network?

Yes, this is true. They usually should be the first to notify Utreexo peers about new txs and blocks.

kcalvinalvin · 2025-08-29T11:40:37Z

utreexo-p2p-bip.md

+The node will have the block and the TTLs for the outputs of the given block which it can then use to cache parts of the inclusion proof and only request the needed parts of an inclusion proof for future blocks.
+
+We note that it is feasible for a node to receive incorrect TTL values from malicious nodes and this can negatively impact the bandwidth savings.
+Nodes can mitigate this by not downloading TTL values too far into the future or by checking if the `TTL` message received was included in the accumulator hard-coded into the binary.


Oh, I should clarify this.

Since nothing commits to the TTL messages, a peer can just lie about the values in the message. To protect against this, the node should either:

1: not download TTLs too far into the future, since the damage done by bad values will be greater.
2: rely on the pre-committed (aka "hard coded into the binary") TTL accumulator in the node software. The TTL accumulator has the TTLs for each of the accumulated blocks. With this accumulator, the node can check whether a received TTL is valid by checking for its existence in the TTL accumulator.

kcalvinalvin · 2025-09-07T12:55:34Z

utreexo-p2p-bip.md

+
+## Abstract
+
+Utreexo creates a compact representation of the UTXO set that only takes a couple of kilobytes.


It's essentially still a kilobyte, but since we can support leaf counts up to the maximum of uint64, we can have 64 roots, which is 64 * 32 = 2048 bytes. So 2 KB max.

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch 2 times, most recently from 9b3eafb to a94f643 Compare August 10, 2025 07:09

jonatack added the New BIP label Aug 10, 2025

jmoik reviewed Aug 11, 2025

jonatack reviewed Aug 11, 2025

luisschwab reviewed Aug 11, 2025

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch 2 times, most recently from cb2993c to d1d0342 Compare August 12, 2025 06:23

luisschwab mentioned this pull request Aug 14, 2025

Socratic Seminar 44 (August 2025) Bitcoin-Grove/miamibitdevs.org#23

Closed

lucad70 mentioned this pull request Aug 14, 2025

Agosto 2025 ClubeBitcoinUnB/bitdevs.bsb.br#25

Closed

bitcoin deleted a comment from 1BitcoinBoWP1FZ4xwTNkq6XksKidmgYYw Aug 18, 2025

bitcoin deleted a comment from kcalvinalvin Aug 18, 2025

This comment was marked as off-topic.

bitcoin deleted a comment from kcalvinalvin Aug 18, 2025

bitcoin deleted a comment from 1BitcoinBoWP1FZ4xwTNkq6XksKidmgYYw Aug 18, 2025

This comment was marked as abuse.

This comment was marked as off-topic.

lucad70 reviewed Aug 21, 2025

diogo-ck mentioned this pull request Aug 23, 2025

Tópicos 2025/08 curitibabitdevs/curitibabitdevs.org#24

Closed

murchandamus reviewed Aug 25, 2025

murchandamus reviewed Aug 27, 2025

murchandamus reviewed Aug 28, 2025

murchandamus added the Needs number assignment label Aug 28, 2025

jonatack removed the Needs number assignment label Aug 29, 2025

TheMhv mentioned this pull request Sep 2, 2025

Tópicos para o Bitdevs de Setembro plebemineira/bhbitdevs.org#23

Open

luisschwab reviewed Sep 2, 2025

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch 3 times, most recently from d67f429 to 253c739 Compare September 7, 2025 12:20

BIP181: Add the Utreexo accumulator BIP

d89952d

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch from 253c739 to 091afe1 Compare September 7, 2025 12:29

BIP182: Add the Utreexo validation BIP

4aa26f3

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch from 091afe1 to 260f2c9 Compare September 7, 2025 12:43

kcalvinalvin added 2 commits September 7, 2025 21:52

BIP183: Add the Utreexo P2P BIP

68da366

Update README table to include BIPs: 181, 182, 183

bd1e242

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch from 260f2c9 to bd1e242 Compare September 7, 2025 12:52

kcalvinalvin changed the title ~~BIP draft: BIPs for Utreexo~~ BIP 181, 182, 183: BIPs for Utreexo Sep 7, 2025

kcalvinalvin commented Sep 7, 2025


		## Abstract

		This BIP describes the Utreexo accumulator and it's operations. It lays down how to update the

		To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(log_2(N))$,
		where N is the number of elements ever added to the set, while still keeping proof sizes small and verification efficient.


		The following utility functions are required for performing accumulator operations:

		parent_hash(left, right): Returns the hash of the concatenation of two child hashes (`left` and `right`).

	Utreexo's design is driven by the need for Bridge Nodes: nodes that maintain backward compatibility with existing
	Utreexo's design is driven by the need for bridge nodes: nodes that maintain backward compatibility with existing

	Utreexo accumulator itself, for that see BIP-????. This document is only concerned with
	Utreexo accumulator itself, for that see [BIP Utreexo Accumulator](utreexo-accumulator-bip.md). This document is only concerned with

	being added or removed from the accumulator. The leaf hash is a 32 byte hash that
	being added or removed from the accumulator. The leaf hash is a 32-byte hash that


		#### UTXO Hash Preimages

		Individual UTXOs are represented as 32 byte hashes in the Utreexo accumulator. To obtain this

	Individual UTXOs are represented as 32 byte hashes in the Utreexo accumulator. To obtain this
	Individual UTXOs are represented as 32-byte hashes in the Utreexo accumulator. To obtain this

	of the coinbase transaction. One of the reason for the change was to make
	of the coinbase transaction. The main reason for the change was to make

		random bytes that could be interpreted as block heights. The lowest block
		heights are: 209,921, 490,897, and 1,983,702.

	consensus failure due to the inability to perform the BIP-0030 checks. However,
	consensus failure due to the inability to perform the BIP-0030 checks, if someone were to reuse coinbase transaction from block 164,384 . However,


		### Historical BIP-0030 violations

		There were two UTXOs that were overwritten due to this consensus rule are:

	It's still reasonable for a single node to download this extra data but little caching goes a long way in reducing the amount of data that one has to download.
	It's still reasonable for a single node to download this extra data but a little caching goes a long way in reducing the amount of data that one has to download.


		### Pre-P2P: Bridge Building

		When introducing Utreexo into an existing network, there are 2 thing needed before CSNs can operate.

	First, archive nodes need to build proofs for old blocks to serve during the initial-block download (IBD).
	First, archive nodes need to build proofs for old blocks to serve during the initial block download (IBD).

		![Legacy Block Propagation](bip-utreexo-p2p/legacy-block-propagation.png)

		Legacy block propagation without Compact Blocks comprises of three steps:

	Therefore, we can safely just omit the locking script hash from the UTXO data and reconstruct it from the witness or scriptSig.
	Therefore, we can safely omit the locking script hash from the UTXO data and reconstruct it from the witness or scriptSig.

	\| TTLs \| vector of TTL infos \| The TTL Info for the UTXOs that are added to the Utreexo merkle forest in blockchain ordering. See [Utreexo - Validation Layer](./utreexo-validation-bip.md#Excluded UTXOs from the accumulator) for the UTXOs that are not added to the Utreexo merkle forest \|
	\| TTLs \| vector of TTL infos \| The TTL Info for the UTXOs that are added to the Utreexo merkle forest in blockchain ordering. See [Utreexo - Validation Layer](./utreexo-validation-bip.md#excluded-utxos-from-the-accumulator) for the UTXOs that are not added to the Utreexo merkle forest \|


		Since there's one corresponding leaf data per target location, it's trivial to generate a bitmap for the leafdatas.

		Using the [proof_positions](./utreexo-accumulator-bip.md#Utility Functions) function, it's possible to generate the positions of the needed proof hashes for a given set of targets.

	To authenticate this data, it is committed into the accumulator, and therefore cannot be changed by peer.
	To authenticate this data, it is committed into the accumulator, and therefore cannot be changed by the peer.

