@aaupov
Contributor

@aaupov aaupov commented Jun 21, 2025

If some basic blocks are not present in BAT, blocks in BBHashMap and
YamlBF.Blocks may not come in the same order. Decouple the iteration
during initialization.
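
To illustrate the ordering problem described above, here is a minimal sketch. The containers `bb_hash_map` and `yaml_blocks` are hypothetical stand-ins, not BOLT's actual types, and the missing index 2 is an assumed scenario chosen for illustration. It contrasts pairing entries by position with looking each block up by its own index.

```python
# Hypothetical stand-ins for the two containers; these are not BOLT's types.
# BAT-side map: block index -> hash. Index 2 is absent on purpose (assumed case).
bb_hash_map = {0: 0x1600E2282BF90000, 3: 0xD3EE0F58CD7D0036}

# Profile-side blocks, in profile order.
yaml_blocks = [{"bid": 0}, {"bid": 2}, {"bid": 3}]


def assign_lockstep(hash_map, blocks):
    """Coupled iteration: pairs the i-th map entry with the i-th profile block."""
    return [(blk["bid"], h) for h, blk in zip(hash_map.values(), blocks)]


def assign_decoupled(hash_map, blocks):
    """Decoupled iteration: looks each block up by its own index."""
    return [(blk["bid"], hash_map.get(blk["bid"])) for blk in blocks]


if __name__ == "__main__":
    for bid, h in assign_lockstep(bb_hash_map, yaml_blocks):
        print("lockstep ", bid, hex(h))
    for bid, h in assign_decoupled(bb_hash_map, yaml_blocks):
        print("decoupled", bid, hex(h) if h is not None else None)
```

In the lockstep variant, block 2 silently receives the hash that belongs to block 3, and block 3 gets none. The decoupled variant leaves block 2 without a hash and assigns block 3 its own.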

Test Plan: TBD

Created using spr 1.3.4
@Jinjie-Huang
Contributor

Jinjie-Huang commented Nov 27, 2025

Thanks for the fix! I tried this patch, and it resolves the duplicate BAT issue in the scenario I provided earlier. However, the assertion failure can still occur in other functions whose BAT records contain basic blocks with an explicit hash of 0.

So, in addition to the original duplication issue, there appear to be scenarios where the BAT records explicitly contain a hash of 0 (it seems the block was optimized and split by the first-round BOLT), which also causes infer-stale-profile to crash.

Detail:

3764015: 00000000697eb608   296 FUNC    LOCAL  HIDDEN    33 Curl_client_write

The BAT entry dump:

Function Address: 0x697eb608, hash: 0x8072e2e71adc5d29
BB mappings:
0x0 -> 0x0 hash: 0x1600e2282bf90000
0x1f -> 0x1f (branch)
0x25 -> 0x2c hash: 0xa267a1a00f48002c
0x29 -> 0x30 (branch)
0x2b -> 0x36 hash: 0xd3ee0f58cd7d0036
0x39 -> 0x44 (branch)
0x3f -> 0x12e hash: 0x6628e1281741012e
0x42 -> 0x131 (branch)
0x48 -> 0x137 hash: 0x7873e70bcd7d0137
0x54 -> 0x143 (branch)
0x5a -> 0x145 hash: 0x25bfdf28cd7d0145
0x63 -> 0x14e (branch)
0x69 -> 0x173 hash: 0xeebfa8387cde0173
0x71 -> 0x17b hash: 0x5d1c7a620f48017b
0x75 -> 0x17f (branch)
0x7b -> 0x181 hash: 0xcd08e23152370181
0x7d -> 0x1a5 hash: 0xf039de9e77a901a5
0x90 -> 0x1b8 (branch)
0x93 -> 0x1bb hash: 0x346cae04938201bb
0xa7 -> 0x1cf (branch)
0xa9 -> 0x1d1 hash: 0x93f050f5206901d1
0xb1 -> 0x1d9 (branch)
0xc9 -> 0x1f1 (branch)
0xc9 -> 0x1f1 (branch)
0xc9 -> 0x1f1 (branch)
0xc9 -> 0x1f1 (branch)
0xd2 -> 0x1f1 hash: 0x0
0xd7 -> 0x1f1 hash: 0x0
0xdc -> 0x1f1 hash: 0x0
0xe4 -> 0x1fc (branch)
0xf0 -> 0x208 (branch)
0xf6 -> 0x20a hash: 0x22d9ba21375f020a
0xf9 -> 0x20d (branch)
0xff -> 0x213 hash: 0x6341d97973e60213
0x105 -> 0x219 (branch)
0x107 -> 0x21b hash: 0xceb8d75eb8d0021b
0x10f -> 0x223 (branch)
0x115 -> 0x279 hash: 0xa6f1cee952370279
0x117 -> 0x2de hash: 0xcf229d604802de
0x127 -> 0x2ee (branch)
NumBlocks: 57
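
A dump like the one above can be scanned mechanically for zero-hash entries. The following is a rough sketch that assumes the textual line format shown in this comment (`<output offset> -> <input offset> hash: <value>`); the helper name `zero_hash_entries` is made up for illustration and is not part of any BOLT tool.

```python
import re

# Matches mapping lines that carry a hash; "(branch)" lines do not match.
HASH_LINE = re.compile(r"(0x[0-9a-f]+) -> (0x[0-9a-f]+) hash: (0x[0-9a-f]+)")

# Excerpt copied from the dump above.
DUMP = """\
0xb1 -> 0x1d9 (branch)
0xc9 -> 0x1f1 (branch)
0xd2 -> 0x1f1 hash: 0x0
0xd7 -> 0x1f1 hash: 0x0
0xdc -> 0x1f1 hash: 0x0
0xe4 -> 0x1fc (branch)
0xf6 -> 0x20a hash: 0x22d9ba21375f020a
"""


def zero_hash_entries(dump):
    """Return (left offset, right offset) pairs whose recorded hash is 0."""
    found = []
    for line in dump.splitlines():
        match = HASH_LINE.search(line)
        if match and int(match.group(3), 16) == 0:
            found.append((int(match.group(1), 16), int(match.group(2), 16)))
    return found


if __name__ == "__main__":
    for left, right in zero_hash_entries(DUMP):
        print(f"{left:#x} -> {right:#x} has a zero hash")
```

On this excerpt the scan reports the three entries at 0xd2, 0xd7, and 0xdc, all of which map to 0x1f1.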

And the corresponding profile.yaml:

  - name:            'Curl_client_write/sendf.c/1'
    fid:             2418135
    hash:            0x8072E2E71ADC5D29
    exec:            15
    nblocks:         57
    blocks:
      - bid:             0
        insns:           0
        exec:            15
        calls:           [ { off: 0x0, fid: 2418137, cnt: 8 }, { off: 0x0, fid: 2420854, cnt: 2 } ]
        succ:            [ { bid: 2, cnt: 15 }, { bid: 0, cnt: 8 } ]
      - bid:             2
        insns:           0
        hash:            0xA267A1A00F48002C
        succ:            [ { bid: 3, cnt: 12 }, { bid: 27, cnt: 3 } ]
      - bid:             3
        insns:           0
        hash:            0xD3EE0F58CD7D0036
        succ:            [ { bid: 27, cnt: 12 } ]
      - bid:             27
        insns:           0
        hash:            0x6628E1281741012E
        succ:            [ { bid: 28, cnt: 15 } ]
      - bid:             28
        insns:           0
        hash:            0x7873E70BCD7D0137
        succ:            [ { bid: 29, cnt: 15 } ]
      - bid:             29
        insns:           0
        hash:            0x25BFDF28CD7D0145
        succ:            [ { bid: 33, cnt: 12 }, { bid: 30, cnt: 3 } ]
      - bid:             30
        insns:           0
        hash:            0xEEBF761952370150
        succ:            [ { bid: 34, cnt: 3 } ]
      - bid:             33
        insns:           0
        hash:            0xEEBFA8387CDE0173
        succ:            [ { bid: 34, cnt: 12 } ]
      - bid:             34
        insns:           0
        hash:            0x5D1C7A620F48017B
        succ:            [ { bid: 35, cnt: 12 }, { bid: 36, cnt: 3 } ]
      - bid:             35
        insns:           0
        hash:            0xCD08E23152370181
        succ:            [ { bid: 39, cnt: 15 } ]
      - bid:             36
        insns:           0
        hash:            0x64EE72E5B8D00185
        succ:            [ { bid: 37, cnt: 3 } ]
      - bid:             37
        insns:           0
        hash:            0x1CC01C2A375F0192
        succ:            [ { bid: 35, cnt: 3 } ]
      - bid:             39
        insns:           0
        hash:            0xF039DE9E77A901A5
        succ:            [ { bid: 40, cnt: 15 } ]
      - bid:             40
        insns:           0
        hash:            0x346CAE04938201BB
        succ:            [ { bid: 41, cnt: 12 }, { bid: 43, cnt: 3 } ]
      - bid:             41
        insns:           0
        hash:            0x93F050F5206901D1
        calls:           [ { off: 0x8, fid: 2418136, cnt: 12 } ]
        succ:            [ { bid: 0, cnt: 2 } ]
      - bid:             43
        insns:           0
        hash:            0x6341D97973E60213
        succ:            [ { bid: 44, cnt: 3 } ]
      - bid:             44
        insns:           0
        hash:            0xCEB8D75EB8D0021B
        succ:            [ { bid: 47, cnt: 3 } ]
      - bid:             47
        insns:           0
        hash:            0xA6F1CEE952370279
        succ:            [ { bid: 56, cnt: 3 } ]
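
One way to compare the two listings is to collect the block hashes from each and report the profile hashes that the BAT text does not contain. The sketch below works on short excerpts of the text as posted here. It uses a plain regular expression rather than a YAML parser, so it would also pick up the function-level `hash:` field if that line were included. The helper names are illustrative assumptions.

```python
import re

HASH_FIELD = re.compile(r"hash:\s*(0x[0-9A-Fa-f]+)")

# Excerpt of the BAT dump posted above.
BAT_EXCERPT = """\
0x5a -> 0x145 hash: 0x25bfdf28cd7d0145
0x63 -> 0x14e (branch)
0x69 -> 0x173 hash: 0xeebfa8387cde0173
"""

# Excerpt of the profile posted above (blocks 29, 30, 33).
YAML_EXCERPT = """\
      - bid:             29
        hash:            0x25BFDF28CD7D0145
      - bid:             30
        hash:            0xEEBF761952370150
      - bid:             33
        hash:            0xEEBFA8387CDE0173
"""


def hashes(text):
    """Extract every 'hash: 0x...' value as an integer, in order of appearance."""
    return [int(value, 16) for value in HASH_FIELD.findall(text)]


def missing_from_bat(yaml_text, bat_text):
    """Profile hashes that do not appear anywhere in the BAT text."""
    known = set(hashes(bat_text))
    return [h for h in hashes(yaml_text) if h not in known]


if __name__ == "__main__":
    for h in missing_from_bat(YAML_EXCERPT, BAT_EXCERPT):
        print(f"profile hash {h:#x} not found in BAT excerpt")
```

For these excerpts, the hash of block 30 is the only one without a counterpart in the BAT text.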

@Jinjie-Huang
Contributor

Perhaps another solution would be to figure out how to assign a valid hash to these blocks. (I suspect they might be generated by BOLT rather than being part of the original binary, which would explain why they have no hash?)

@aaupov
Contributor Author

aaupov commented Nov 27, 2025

Thank you for a detailed report. Can you please share the repro steps to get BAT entries with zero hash? I've tried building and optimizing libcurl but can't get BOLT to output zero hash. We really should never emit zero hashes.

@Jinjie-Huang
Contributor

I also seem unable to reproduce this issue with curl in isolation, and the binary from the scenario where it is reproducible is too large to be shared.

However, I've added some debug prints in both BoltAddressTranslation::saveMetadata() and BoltAddressTranslation::write(). From the output, I can see that after optimization, BOLT does indeed insert 3 BBs into Curl_client_write; they all correspond to the original offset 0x1f1, and their hashes are 0:

InputOffset: 0x1f1, Index: 42, Hash: 0x0
InputOffset: 0x1f1, Index: 43, Hash: 0x0
InputOffset: 0x1f1, Index: 44, Hash: 0x0

This seems to confirm that certain optimizations are inserting new BBs (not from the original binary) with a hash value of 0. While these zero-hash BBs might still be meaningful for BAT's purposes (specifically for regular .fdata), for the YAML profile, the lack of a hash value breaks its ability to perform infer-stale-profile.
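
The debug output quoted in this comment can be grouped by input offset to show how many zero-hash blocks share one original location. This is only a sketch: the line format is taken from the ad hoc debug prints above, not from any stable BOLT output, and the function name is hypothetical.

```python
import re
from collections import defaultdict

DEBUG_LINE = re.compile(
    r"InputOffset: (0x[0-9a-f]+), Index: (\d+), Hash: (0x[0-9a-f]+)"
)

# Debug output copied from the comment above.
DEBUG_OUTPUT = """\
InputOffset: 0x1f1, Index: 42, Hash: 0x0
InputOffset: 0x1f1, Index: 43, Hash: 0x0
InputOffset: 0x1f1, Index: 44, Hash: 0x0
"""


def zero_hash_indices_by_offset(text):
    """Map each input offset to the block indices recorded there with hash 0."""
    groups = defaultdict(list)
    for offset, index, block_hash in DEBUG_LINE.findall(text):
        if int(block_hash, 16) == 0:
            groups[int(offset, 16)].append(int(index))
    return dict(groups)


if __name__ == "__main__":
    for offset, indices in zero_hash_indices_by_offset(DEBUG_OUTPUT).items():
        print(f"input offset {offset:#x}: zero-hash block indices {indices}")
```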
