Delete nonreduced fuzz inputs#263

Merged
dergoegge merged 4 commits into bitcoin-core:main from maflcko:main on Mar 13, 2026

@maflcko
Contributor

maflcko commented Mar 12, 2026

As per the usual process, this deletes nonreduced fuzz inputs to avoid wasted CI resources and timeouts when CI runs on large and presumably irrelevant inputs.

Normally, deletion of non-reduced fuzz inputs should happen after feature-freeze on the master Bitcoin Core branch, but before branch-off, so that the latest release branch retains mostly valid fuzz inputs.

Previous: #239

To "reproduce"

Install a fresh VM, as explained in the bash script's doc, and run it:

apt update && apt install curl -y
curl -fLO https://raw.githubusercontent.com/bitcoin-core/qa-assets/b30853a993f2fdc2edb8608b4544164d05312428/delete_nonreduced_fuzz_inputs.sh
bash delete_nonreduced_fuzz_inputs.sh

To "test"

  • Keep an eye on coverage stats, to ensure coverage doesn't drop
  • Re-run the script, to ensure it is "reproducible" to some extent
  • Anything else you think is important to test or review

CI

CI should pass, except for a lint failure, which is expected on any change that deletes fuzz inputs, such as this pull request.

@maflcko
Contributor Author

maflcko commented Mar 12, 2026

Storage device usage (du -sh ./fuzz_corpora/)

6.8G -> 2.7G

Determinism

  • ~196k fuzz input files deleted:
git diff origin/main a39e31a912bb4d097e6af375ae15c629563bd8c9 --stat | tail -1
 196312 files changed, 592128 deletions(-)
  • Cross-diff with a second run of the script: 3k fuzz input files
git diff --no-renames --stat HEAD a39e31a912bb4d097e6af375ae15c629563bd8c9 | tail -1 
 3837 files changed, 438 insertions(+), 3295 deletions(-)
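
The figures above can be put in relative terms with a little arithmetic. The following is only an illustrative sketch: the helper name `pct` is hypothetical and not part of the qa-assets scripts, and the inputs are the numbers quoted in this comment.

```shell
# pct PART WHOLE: print PART as a percentage of WHOLE, with one decimal place.
# Hypothetical helper for illustration only.
pct() {
  awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", 100 * part / whole }'
}

# Share of disk usage removed: (6.8G - 2.7G) = 4.1G out of 6.8G.
pct 4.1 6.8

# Files that differ between two runs of the script, relative to the
# number of files deleted in this pull request.
pct 3837 196312
```

Read this way, about 60% of the corpus size is removed, and the files that differ between two runs amount to roughly 2% of the deleted files.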

Coverage

main: https://drahtbot.space/host_reports/DrahtBot/reports/coverage_fuzz/monotree/b97abdcdf1396f2c/687c9922722cfd01/fuzz.coverage/index.html

this pull: https://drahtbot.space/host_reports/DrahtBot/reports/coverage_fuzz/monotree/b97abdcdf1396f2c/a39e31a912bb4d09/fuzz.coverage/index.html

@maflcko
Contributor Author

maflcko commented Mar 12, 2026

Looks like a few lines of coverage went away. Possibly due to bitcoin/bitcoin#29018 :(

Still, I guess this should be good to go.

@sipa
Contributor

sipa commented Mar 12, 2026

To see the number of files deleted per test:

$ git diff --raw --stat upstream/main | fgrep 'fuzz_corpora/' | fgrep '00000000000 D' | cut -d / -f 2 | sort | uniq -c | sort -g
     14 span
     17 uint256_deserialize
     20 uint160_deserialize
     21 blockheader_deserialize
     22 crypto_aes256
     23 inv_deserialize
     23 threadpool
     24 fee_rate_deserialize
     24 secp256k1_ec_seckey_import_export_der
     25 out_point_deserialize
     27 snapshotmetadata_deserialize
     29 flatfile
     34 protocol
     36 fee_rate
     39 crypto_common
     41 crypto_poly1305
     47 base64_encode_decode
     47 wallet_fees
     48 block_header
     48 kitchen_sink
     49 block_file_info_deserialize
     49 locale
     49 muhash
     51 script_deserialize
     52 flat_file_pos_deserialize
     53 base32_encode_decode
     53 bech32_roundtrip
     54 blocktransactionsrequest_deserialize
     54 float
     57 bloomfilter_deserialize
     59 random
     60 key_io
     60 parse_iso8601
     61 key_origin_info_deserialize
     61 tx_out
     62 diskblockindex_deserialize
     62 secp256k1_ecdsa_signature_parse_der_lax
     66 blocklocator_deserialize
     67 crypto_hkdf_hmac_sha256_l32
     69 http_request
     69 parse_hd_keypath
     69 pub_key_deserialize
     70 chain
     71 key
     71 tx_in_deserialize
     79 fees
     80 ellswift_roundtrip
     80 messageheader_deserialize
     81 block_filter_deserialize
     82 socks5
     88 bip324_ecdh
     88 merkle_block_deserialize
     90 script_parsing
     92 crypto_poly1305_split
     93 clusterlin_components
    100 partial_merkle_tree_deserialize
    102 feefrac
    103 clusterlin_chunking
    104 bech32_random_decode
    106 clusterlin_postlinearize
    106 tx_in
    107 checkqueue
    107 crypto_aes256cbc
    108 base58_encode_decode
    108 base58check_encode_decode
    108 feefrac_div_fallback
    108 rolling_bloom_filter
    109 chacha20_split_crypt
    110 clusterlin_postlinearize_moved_leaf
    112 difference_formatter
    118 feefrac_mul_div
    122 crypto_fschacha20
    124 script_descriptor_cache
    125 txoutcompressor_deserialize
    125 wallet_bdb_parser
    128 chacha20_split_keystream
    129 crypto_chacha20
    129 scriptnum_ops
    130 build_and_compare_feerate_diagram
    134 crypto_aeadchacha20poly1305
    135 clusterlin_fix_linearization
    135 num3072_mul
    135 overflow
    141 coin_grinder_is_optimal
    142 clusterlin_ancestor_finder
    142 crypto_fschacha20poly1305
    143 coins_deserialize
    149 bnb_finds_min_waste
    155 timeoffsets
    156 addr_info_deserialize
    157 crypto_diff_fuzz_chacha20
    165 pcp_request_port_map
    167 integer
    169 clusterlin_make_connected
    169 crypto
    172 clusterlin_linearization_chunking
    173 parse_numbers
    177 clusterlin_simple_finder
    178 addition_overflow
    190 natpmp_request_port_map
    192 clusterlin_depgraph_serialization
    195 num3072_inv
    199 netaddr_deserialize
    200 service_deserialize
    207 block_deserialize
    211 multiplication_overflow
    212 txundo_deserialize
    213 versionbits
    214 bip324_cipher_roundtrip
    215 cuckoocache
    228 netaddress
    232 message
    234 clusterlin_merge
    235 pow_transition
    243 blocktransactions_deserialize
    247 coinselection_srd
    249 net_permissions
    258 block_index_tree
    261 script_ops
    266 prefilled_transaction_deserialize
    275 golomb_rice
    275 p2p_transport_serialization
    276 netbase_dns_lookup
    279 p2p_transport_bidirectional
    281 blockmerkleroot
    284 blockfilter
    287 block_header_and_short_txids_deserialize
    288 parse_script
    289 prevector
    290 merkle
    293 clusterlin_simple_linearize
    306 address_deserialize
    310 hex
    319 pow
    337 coincontrol
    342 buffered_file
    347 asmap_direct
    354 decode_tx
    358 minisketch
    363 asmap
    363 p2p_transport_bidirectional_v1v2
    364 merkleblock
    367 blockundo_deserialize
    372 torcontrol
    385 policy_estimator_io
    394 autofile
    394 str_printf
    398 script_sigcache
    417 primitives_transaction
    417 txrequest
    420 vecdeque
    434 headers_sync_state
    436 clusterlin_depgraph_sim
    447 utxo_snapshot
    462 coinselection_bnb
    470 clusterlin_search_finder
    470 p2p_transport_bidirectional_v2
    490 coinselection_knapsack
    506 coinscache_sim
    514 script_interpreter
    537 txorphan
    554 block
    556 load_external_block_file
    574 sighash_cache
    589 node_eviction
    592 i2p
    595 clusterlin_postlinearize_tree
    605 block_index
    618 txorphan_protected
    636 system
    644 utxo_snapshot_invalid
    650 coin_grinder
    650 crypter
    652 mini_miner_selection
    659 p2p_headers_presync
    666 psbt_output_deserialize
    690 bloom_filter
    750 rbf
    811 pool_resource
    845 wallet_create_transaction
    851 script
    862 bitdeque
    872 mini_miner
    872 txorphanage_sim
    942 script_format
    951 utxo_total_supply
   1005 transaction
   1021 partially_downloaded_block
   1048 string
   1090 package_rbf
   1167 policy_estimator
   1169 addrman_serdeser
   1173 clusterlin_sfl
   1205 txdownloadman
   1266 validation_load_mempool
   1268 txdownloadman_impl
   1283 signet
   1304 clusterlin_linearize
   1392 data_stream_addr_man
   1568 net
   1610 psbt_input_deserialize
   1668 tx_pool_standard
   1677 local_address
   1745 p2p_handshake
   1807 eval_script
   1819 bitset
   1827 miniscript_script
   1835 banman
   2001 addrman
   2025 psbt_base64_decode
   2120 tx_package_eval
   2125 miniscript_smart
   2174 ephemeral_package_eval
   2234 miniscript_stable
   2319 miniscript_string
   2590 partially_signed_transaction_deserialize
   2899 signature_checker
   3045 coins_view_overlay
   3280 parse_univalue
   3461 descriptor_parse
   3501 script_flags
   4255 mocked_descriptor_parse
   4407 txgraph
   4431 connman
   4436 script_sign
   5191 process_message
   6096 process_messages
   6682 psbt
   7401 tx_pool
   8444 coins_view
   9599 coins_view_db
  10740 scriptpubkeyman
  13530 rpc
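
A possible variant of the command above, sketched under the assumption that the remote is named `upstream` as in the original: `git diff` can select deleted files itself via `--diff-filter=D` and print bare paths with `--name-only`, so the per-target count no longer depends on matching the all-zero object ID in the `--raw` output. The function name `count_per_target` is hypothetical.

```shell
# count_per_target: read file paths (one per line) on stdin and print the
# number of paths per fuzz target, i.e. per directory below fuzz_corpora/,
# sorted by ascending count.
count_per_target() {
  grep '^fuzz_corpora/' | cut -d / -f 2 | sort | uniq -c | sort -n
}

# Assumed usage inside a qa-assets checkout with an "upstream" remote:
#   git diff --name-only --diff-filter=D upstream/main -- fuzz_corpora/ | count_per_target
```

Keeping the counting step separate from the `git diff` call means the same function can be fed any list of paths, for example the output of a second run.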

@sipa
Contributor

sipa commented Mar 12, 2026

Would it be useful to keep relatively new files around, e.g. not delete any files that were added to qa-assets less than 2 years ago, because those might be relevant to old maintained branches even if they are not relevant to the latest release anymore?
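
To make the idea concrete, here is a minimal sketch of such an age check. It is not part of the existing deletion script; the helper names `is_recent` and `added_at` are hypothetical, and the 730-day cutoff in the usage comment is only an approximation of "2 years".

```shell
# is_recent ADDED NOW DAYS: succeed if the Unix timestamp ADDED lies less
# than DAYS days before the Unix timestamp NOW.
is_recent() {
  [ $(( $2 - $1 )) -lt $(( $3 * 86400 )) ]
}

# added_at PATH: committer timestamp of the most recent commit that added
# PATH (git log lists newest commits first, so the first line is taken).
added_at() {
  git log --diff-filter=A --format=%ct -- "$1" | head -n 1
}

# Assumed usage inside a qa-assets checkout, for a candidate file "$f":
#   now=$(date +%s)
#   if is_recent "$(added_at "$f")" "$now" 730; then echo "keep $f"; fi
```

Calling `git log` once per file would likely be slow for a corpus of this size, so a real implementation would probably collect the timestamps in a single pass over the history instead.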

@maflcko
Copy link
Copy Markdown
Contributor Author

maflcko commented Mar 12, 2026

> Would it be useful to keep relatively new files around, e.g. not delete any files that were added to qa-assets less than 2 years ago, because those might be relevant to old maintained branches even if they are not relevant to the latest release anymore?

Yeah, I guess that is possible. Though, my preference would probably be to just create a branch of qa-assets (and use that in the previous release)

Otherwise, it will be difficult or inefficient to delete any nonreduced fuzz inputs at all, because:

@maflcko
Contributor Author

maflcko commented Mar 12, 2026

A more precise cross-diff shows that the larger diffs happen in the targets that are more non-deterministic:

# ( cd fuzz_corpora/ && for f in ./* ; do  count=$(git diff --no-renames --stat HEAD a39e31a912bb4d097e6af375ae15c629563bd8c9 $f | wc -l) && echo $count $f; done ) | sort -h 
0 ./addition_overflow
0 ./asmap_direct
0 ./base32_encode_decode
0 ./base64_encode_decode
0 ./bech32_roundtrip
0 ./bip324_cipher_roundtrip
0 ./bip324_ecdh
0 ./block_file_info_deserialize
0 ./block_header
0 ./block_header_and_short_txids_deserialize
0 ./blockheader_deserialize
0 ./blocklocator_deserialize
0 ./blocktransactions_deserialize
0 ./blocktransactionsrequest_deserialize
0 ./bloomfilter_deserialize
0 ./chacha20_split_crypt
0 ./chain
0 ./clusterlin_chunking
0 ./clusterlin_components
0 ./clusterlin_depgraph_serialization
0 ./clusterlin_depgraph_sim
0 ./clusterlin_postlinearize
0 ./clusterlin_postlinearize_tree
0 ./crypto_aeadchacha20poly1305
0 ./crypto_aes256
0 ./crypto_common
0 ./crypto_fschacha20
0 ./crypto_fschacha20poly1305
0 ./crypto_hkdf_hmac_sha256_l32
0 ./crypto_poly1305
0 ./crypto_poly1305_split
0 ./difference_formatter
0 ./diskblockindex_deserialize
0 ./ellswift_roundtrip
0 ./fee_rate
0 ./fee_rate_deserialize
0 ./feefrac
0 ./feefrac_div_fallback
0 ./feefrac_mul_div
0 ./flat_file_pos_deserialize
0 ./flatfile
0 ./float
0 ./inv_deserialize
0 ./kitchen_sink
0 ./messageheader_deserialize
0 ./muhash
0 ./multiplication_overflow
0 ./netaddr_deserialize
0 ./netbase_dns_lookup
0 ./num3072_inv
0 ./num3072_mul
0 ./out_point_deserialize
0 ./parse_iso8601
0 ./policy_estimator_io
0 ./pow_transition
0 ./prefilled_transaction_deserialize
0 ./pub_key_deserialize
0 ./script_deserialize
0 ./script_parsing
0 ./secp256k1_ec_seckey_import_export_der
0 ./snapshotmetadata_deserialize
0 ./span
0 ./tx_in
0 ./tx_out
0 ./txoutcompressor_deserialize
0 ./uint160_deserialize
0 ./uint256_deserialize
2 ./base58check_encode_decode
2 ./bnb_finds_min_waste
2 ./checkqueue
2 ./clusterlin_linearize
2 ./clusterlin_simple_finder
2 ./clusterlin_simple_linearize
2 ./coin_grinder
2 ./coins_deserialize
2 ./key_io
2 ./key_origin_info_deserialize
2 ./locale
2 ./merkle
2 ./merkle_block_deserialize
2 ./merkleblock
2 ./miniscript_string
2 ./natpmp_request_port_map
2 ./overflow
2 ./parse_hd_keypath
2 ./partial_merkle_tree_deserialize
2 ./policy_estimator
2 ./script_descriptor_cache
2 ./service_deserialize
2 ./sighash_cache
2 ./socks5
2 ./tx_in_deserialize
2 ./txundo_deserialize
3 ./addr_info_deserialize
3 ./block_filter_deserialize
3 ./blockfilter
3 ./build_and_compare_feerate_diagram
3 ./chacha20_split_keystream
3 ./clusterlin_sfl
3 ./coin_grinder_is_optimal
3 ./coinselection_bnb
3 ./integer
3 ./netaddress
3 ./protocol
3 ./script_interpreter
3 ./script_sigcache
3 ./timeoffsets
3 ./transaction
3 ./txorphan_protected
3 ./txrequest
3 ./wallet_bdb_parser
4 ./asmap
4 ./base58_encode_decode
4 ./blockundo_deserialize
4 ./coins_view
4 ./crypto_aes256cbc
4 ./fees
4 ./hex
4 ./http_request
4 ./message
4 ./random
4 ./secp256k1_ecdsa_signature_parse_der_lax
5 ./address_deserialize
5 ./block_deserialize
5 ./buffered_file
5 ./minisketch
5 ./net
5 ./pow
5 ./scriptnum_ops
6 ./autofile
6 ./bech32_random_decode
6 ./blockmerkleroot
6 ./clusterlin_postlinearize_moved_leaf
6 ./decode_tx
6 ./local_address
6 ./node_eviction
6 ./script_ops
6 ./string
6 ./txorphan
6 ./txorphanage_sim
7 ./coinselection_knapsack
7 ./crypter
7 ./crypto
7 ./cuckoocache
7 ./i2p
7 ./parse_script
7 ./prevector
8 ./bloom_filter
8 ./net_permissions
8 ./psbt_output_deserialize
8 ./script
9 ./block
9 ./script_flags
9 ./script_format
9 ./str_printf
9 ./torcontrol
10 ./coinscache_sim
10 ./miniscript_script
10 ./miniscript_stable
11 ./clusterlin_make_connected
11 ./coins_view_overlay
11 ./parse_numbers
11 ./pcp_request_port_map
11 ./signature_checker
11 ./system
12 ./crypto_diff_fuzz_chacha20
13 ./coinselection_srd
13 ./golomb_rice
13 ./key
13 ./primitives_transaction
14 ./bitset
14 ./miniscript_smart
14 ./p2p_transport_serialization
14 ./partially_signed_transaction_deserialize
16 ./rolling_bloom_filter
17 ./crypto_chacha20
19 ./coincontrol
19 ./p2p_transport_bidirectional_v1v2
19 ./psbt_base64_decode
20 ./bitdeque
20 ./threadpool
21 ./banman
21 ./block_index_tree
21 ./eval_script
21 ./psbt_input_deserialize
21 ./signet
23 ./data_stream_addr_man
24 ./addrman
25 ./p2p_handshake
26 ./versionbits
30 ./p2p_transport_bidirectional_v2
30 ./process_messages
31 ./txdownloadman_impl
32 ./descriptor_parse
32 ./mocked_descriptor_parse
33 ./package_rbf
34 ./addrman_serdeser
34 ./parse_univalue
36 ./psbt
37 ./txdownloadman
38 ./vecdeque
39 ./wallet_fees
41 ./rbf
43 ./validation_load_mempool
45 ./coins_view_db
48 ./mini_miner
48 ./partially_downloaded_block
48 ./script_sign
50 ./headers_sync_state
51 ./pool_resource
51 ./tx_pool
67 ./tx_pool_standard
73 ./load_external_block_file
78 ./tx_package_eval
100 ./txgraph
109 ./block_index
112 ./ephemeral_package_eval
122 ./p2p_transport_bidirectional
151 ./wallet_create_transaction
154 ./utxo_snapshot
156 ./scriptpubkeyman
157 ./p2p_headers_presync
165 ./connman
168 ./process_message
213 ./utxo_total_supply
240 ./rpc
324 ./utxo_snapshot_invalid
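
A note on reading these numbers: `git diff --stat` prints one line per changed file plus a trailing summary line, and prints nothing when there is no difference. Piped through `wc -l`, a target with N differing files therefore shows up as N+1, which is consistent with no target in the list having a count of 1. The sketch below converts the listing to file counts; the function name `stat_lines_to_files` is hypothetical.

```shell
# stat_lines_to_files: read "<count> <target>" lines as printed above and
# subtract the summary line that `git diff --stat` appends whenever at
# least one file differs.
stat_lines_to_files() {
  awk '{ n = ($1 > 0) ? $1 - 1 : 0; print n, $2 }'
}

# Alternative that counts files directly (assumed to run in fuzz_corpora/,
# for a target directory "$f"):
#   git diff --no-renames --name-only HEAD a39e31a912bb4d097e6af375ae15c629563bd8c9 -- "$f" | wc -l
```

The ordering of the targets, and hence the conclusion about which ones are less deterministic, is unaffected by this off-by-one.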

@sipa
Contributor

sipa commented Mar 12, 2026

> Yeah, I guess that is possible. Though, my preference would probably be to just create a branch of qa-assets (and use that in the previous release)

Yeah, that's a possibility too. Alternatively, the delete_nonreduced script could run merging with multiple releases/branches.

It was just an idea I wanted to throw out there, but to be clear it's not an objection to this PR or even a suggestion to address right now.

@maflcko
Contributor Author

maflcko commented Mar 12, 2026

> Alternatively, the delete_nonreduced script could run merging with multiple releases/branches.

That'd probably be ideal. Just creating a branch, or just merging against master will not supply new fuzz inputs for older releases, whereas merging against the previous branches will do that, while also deleting nonreduced inputs that are "unused" on all branches.
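
One way to read "unused on all branches" is as plain set logic: an input is deleted only if no branch's merge kept it. The sketch below assumes that each branch's merge run has already produced a text file listing the inputs it kept, one path per line. How those lists would be produced is not specified in this thread, and the function names `union_kept` and `deletable` are hypothetical.

```shell
# union_kept FILE...: each FILE lists the inputs that one branch's merge
# kept. Prints every input kept by at least one branch, sorted.
union_kept() {
  sort -u "$@"
}

# deletable ALL FILE...: ALL lists every input currently in the corpus.
# Prints the inputs from ALL that no branch kept.
deletable() {
  all=$1
  shift
  tmp="${TMPDIR:-/tmp}/all_inputs.$$"
  sort -u "$all" > "$tmp"
  # comm -23 keeps the lines that appear only in the first (sorted) input.
  union_kept "$@" | comm -23 "$tmp" -
  rm -f "$tmp"
}

# Assumed usage, with hypothetical list files:
#   deletable all_inputs.txt kept_master.txt kept_previous_release.txt
```

With this reading, adding a branch can only shrink the set of deleted inputs, never grow it.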

@maflcko
Contributor Author

maflcko commented Mar 13, 2026

rfm?

@dergoegge
Member

Are we concerned about the valgrind failure here?

https://github.com/bitcoin-core/qa-assets/actions/runs/22990374949/job/66749878719?pr=263

Run wallet_create_transaction with args ['valgrind', '--quiet', '--error-exitcode=1', '/home/runner/work/_temp/build/bin/fuzz', PosixPath('/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/wallet_create_transaction')]
==6501== Conditional jump or move depends on uninitialised value(s)
==6501==    at 0x781E44: wallet::SelectCoins(wallet::CWallet const&, wallet::CoinsResult&, wallet::CoinsResult const&, long const&, wallet::CCoinControl const&, wallet::CoinSelectionParams const&) (spend.cpp:843)
==6501==    by 0x78767C: wallet::CreateTransactionInternal(wallet::CWallet&, std::vector<wallet::CRecipient, std::allocator<wallet::CRecipient> > const&, std::optional<unsigned int>, wallet::CCoinControl const&, bool) (spend.cpp:1212)
==6501==    by 0x78AA7A: wallet::CreateTransaction(wallet::CWallet&, std::vector<wallet::CRecipient, std::allocator<wallet::CRecipient> > const&, std::optional<unsigned int>, wallet::CCoinControl const&, bool) (spend.cpp:1457)
==6501==    by 0x581BE3: wallet::(anonymous namespace)::wallet_create_transaction_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>) (spend.cpp:101)
==6501==    by 0x58B81F: operator() (std_function.h:591)
==6501==    by 0x58B81F: test_one_input(std::span<unsigned char const, 18446744073709551615ul>) (fuzz.cpp:88)
==6501==    by 0x2DF8CF: main (fuzz.cpp:264)
==6501== 

==6501== Conditional jump or move depends on uninitialised value(s)
==6501==    at 0x781E44: wallet::SelectCoins(wallet::CWallet const&, wallet::CoinsResult&, wallet::CoinsResult const&, long const&, wallet::CCoinControl const&, wallet::CoinSelectionParams const&) (spend.cpp:843)
==6501==    by 0x78767C: wallet::CreateTransactionInternal(wallet::CWallet&, std::vector<wallet::CRecipient, std::allocator<wallet::CRecipient> > const&, std::optional<unsigned int>, wallet::CCoinControl const&, bool) (spend.cpp:1212)
==6501==    by 0x78AA7A: wallet::CreateTransaction(wallet::CWallet&, std::vector<wallet::CRecipient, std::allocator<wallet::CRecipient> > const&, std::optional<unsigned int>, wallet::CCoinControl const&, bool) (spend.cpp:1457)
==6501==    by 0x581BE3: wallet::(anonymous namespace)::wallet_create_transaction_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>) (spend.cpp:101)
==6501==    by 0x58B81F: operator() (std_function.h:591)
==6501==    by 0x58B81F: test_one_input(std::span<unsigned char const, 18446744073709551615ul>) (fuzz.cpp:88)
==6501==    by 0x2DF8CF: main (fuzz.cpp:264)
==6501== 

@maflcko
Contributor Author

maflcko commented Mar 13, 2026

I think this is just a known upstream valgrind bug. A possible temporary workaround is in bitcoin/bitcoin#34589

@dergoegge dergoegge merged commit bdc226d into bitcoin-core:main Mar 13, 2026
2 of 7 checks passed

3 participants