@guggero guggero commented Jun 4, 2025

This fixes an edge case in the pathfinding logic of lnd (reported on Slack):

  1. There are two channels with different assets:
     2025-06-03 12:23:32.439 [DBG] HSWC: ShortChannelID=3607099:1470:0: aux traffic shaper reported available bandwidth: 53340397469 mSAT (<-- asset1)
     2025-06-03 12:23:32.437 [DBG] HSWC: ShortChannelID=4148366:42:0: aux traffic shaper reported available bandwidth: 41166477370 mSAT (<-- asset2)
  2. When trying to send a payment using asset2, both channels report bandwidth as shown above, because the check added in this PR did not exist yet.
  3. Because the channel with the wrong asset reported a higher available bandwidth, lnd skipped the second channel with the log message:
     2025-06-03 12:23:32.296 [DBG] CRTR: Skipped edge 4561176653273366528: not max bandwidth, bandwidth=41166477370 mSAT, maxBandwidth=53340397469 mSAT
  4. The HTLC is sent over the wrong channel, then rejected at a later stage when we actually check compatibility.
  5. The pathfinding logic doesn't know what's going on and retries over and over again, causing high CPU usage.

This PR breaks this cycle by not reporting any bandwidth for the first channel, since from the quote we can figure out the channel isn't actually compatible.
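
As a rough illustration of the idea (not the actual code in tapchannel/aux_traffic_shaper.go), the sketch below models a channel that reports zero bandwidth when its asset does not match the asset of the quote. The `channel` type and the `reportedBandwidth` and `pickChannel` helpers are hypothetical names invented for this example, and asset compatibility is reduced to a plain string comparison.

```go
package main

import "fmt"

// channel is a hypothetical, simplified stand-in for an asset channel.
type channel struct {
	scid      string
	assetID   string
	bandwidth uint64 // available bandwidth in mSAT
}

// reportedBandwidth returns the bandwidth a channel reports for a payment
// that uses quoteAssetID. An incompatible channel reports zero so that it
// can never win the "max bandwidth" comparison.
func reportedBandwidth(c channel, quoteAssetID string) uint64 {
	if c.assetID != quoteAssetID {
		return 0
	}
	return c.bandwidth
}

// pickChannel mimics a max-bandwidth selection over parallel channels,
// ignoring channels that report no bandwidth.
func pickChannel(chans []channel, quoteAssetID string) (channel, bool) {
	var (
		best   channel
		bestBW uint64
		found  bool
	)
	for _, c := range chans {
		bw := reportedBandwidth(c, quoteAssetID)
		if bw == 0 {
			continue
		}
		if !found || bw > bestBW {
			best, bestBW, found = c, bw, true
		}
	}
	return best, found
}

func main() {
	// Values taken from the log lines in the description above.
	chans := []channel{
		{scid: "3607099:1470:0", assetID: "asset1", bandwidth: 53340397469},
		{scid: "4148366:42:0", assetID: "asset2", bandwidth: 41166477370},
	}
	if c, ok := pickChannel(chans, "asset2"); ok {
		fmt.Println("selected", c.scid)
	}
}
```

With the compatibility check in place, the asset1 channel drops out of the comparison and the asset2 channel is selected even though its bandwidth is lower.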

@guggero guggero requested review from GeorgeTsagk and ffranr June 4, 2025 14:19
@guggero guggero added the payment-channel, RFQ, tap-channels and P0 labels Jun 4, 2025
coveralls commented Jun 4, 2025

Pull Request Test Coverage Report for Build 15448702081

Details

  • 0 of 30 (0.0%) changed or added relevant lines in 1 file are covered.
  • 46 unchanged lines in 7 files lost coverage.
  • Overall coverage decreased (-0.003%) to 37.296%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| tapchannel/aux_traffic_shaper.go | 0 | 30 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| address/address.go | 2 | 69.55% |
| address/mock.go | 2 | 97.39% |
| asset/group_key.go | 2 | 57.89% |
| commitment/tap.go | 4 | 71.82% |
| tapchannel/aux_leaf_signer.go | 5 | 43.08% |
| asset/mock.go | 6 | 64.86% |
| asset/asset.go | 25 | 45.56% |

Totals
Change from base Build 15424987781: -0.003%
Covered Lines: 27234
Relevant Lines: 73021

💛 - Coveralls

guggero added a commit to lightninglabs/lightning-terminal that referenced this pull request Jun 4, 2025
Adds a test case to validate the fix in
lightninglabs/taproot-assets#1583, by adding a test that:
 - Creates two asset channels between Alice and Bob
 - Creates a BTC channel between Bob and Charlie
 - The two asset channels each have a different asset in them
 - The balance of the pences channel is decreased (lower bandwidth)
 - An RFQ payment is attempted, with pences as the payment asset

guggero commented Jun 4, 2025

I've verified this by adding a commit to lightninglabs/lightning-terminal#1082, which currently makes the tests fail.
Once this PR is merged and the go.mod in the litd PR is updated, the tests will succeed again.

Comment on lines 267 to 269
// ExtractHexDump extracts the hex bytes from a hex dump string, which is
// typically formatted with an offset, hex bytes, and ASCII representation.
func ExtractHexDump(input string) ([]byte, error) {
Contributor

hex bytes?

// ExtractHexDump parses a hex dump string (typically formatted with offsets,
// hex-encoded byte values, and ASCII representation) and returns the decoded bytes.

Contributor

Also, how were these .hexdump files generated? Maybe we should write that down somewhere. A README.md in testdata?

Contributor Author

I mentioned in the commit message that these are helpers to decode hex dumps from a trace log of lnd/litd. But I guess it's useful to have that in the Godoc comment too.

Comment on lines 236 to 240
	case specifier.HasGroupPubKey():
		targetGroupKey := specifier.UnwrapGroupKeyToPtr()
		if targetGroupKey == nil {
			return false
		}
Contributor

If the specifier has a group pub key but we then end up with targetGroupKey == nil, that's an error. I don't think it necessarily means that the channel is incompatible with the specifier, because the asset ID might still match.

Maybe use specifier.UnwrapGroupKeyOrErr here and return an error?

Contributor Author

We check with specifier.HasGroupPubKey() above, so the nil case really should never happen. It's just defensive. I don't think it makes sense to return an error just for that.

To debug issues with channels, it's super helpful if we can decode the
long hexdumps of the funding, commitment and HTLC blobs.
This commit adds simple unit tests that decode those blobs and output
them as JSON.
@guggero guggero force-pushed the debug-trace-log branch from 94c5c9a to 55a06f0 Compare June 4, 2025 15:58
	// If the specifier has a group key, then we must have a group
	// key set on the OpenChannel.
	return lfn.MapOptionZ(
		o.GroupKey.ValOpt(),
Member

I'm not sure this is correct. If the specifier has a group key, we could still use the channel even if it was funded with a single asset ID, as long as that asset ID is part of the group.

Contributor Author

But the specifier here is what was used by the user to create the quote. And if the quote was created for a group key, then the channel needs to have a group defined as well.
If the user creates a quote for a single asset ID in a grouped channel that only has that asset ID, then the specifier here will have the asset ID.

Contributor Author

Hmm, but I guess just re-using the AssetMatchesSpecifier function does make sense... Will try it.
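
To make the two positions in this thread concrete, the sketch below shows a per-asset match against a quote specifier. It is only an approximation under stated assumptions: the `specifier` and `channelAsset` types and the `assetMatches` and `channelCompatible` helpers are hypothetical, keys and IDs are modeled as plain strings, and the real AssetMatchesSpecifier function may behave differently.

```go
package main

import "fmt"

// specifier is a hypothetical, simplified quote specifier: it carries a
// group key, an asset ID, or neither.
type specifier struct {
	assetID  *string
	groupKey *string
}

// channelAsset is a hypothetical description of one asset in a channel.
type channelAsset struct {
	assetID  string
	groupKey string // empty if the asset is not part of a group
}

// assetMatches reports whether a single channel asset satisfies the
// specifier. A group key on the specifier takes precedence over an asset ID.
func assetMatches(s specifier, a channelAsset) bool {
	switch {
	case s.groupKey != nil:
		return a.groupKey != "" && a.groupKey == *s.groupKey
	case s.assetID != nil:
		return a.assetID == *s.assetID
	default:
		return false
	}
}

// channelCompatible requires every asset in the channel to match, so one
// mismatching asset rules the channel out for this quote.
func channelCompatible(s specifier, assets []channelAsset) bool {
	if len(assets) == 0 {
		return false
	}
	for _, a := range assets {
		if !assetMatches(s, a) {
			return false
		}
	}
	return true
}

func main() {
	group := "group1"
	quote := specifier{groupKey: &group}

	// A channel funded with a single asset ID that belongs to the group.
	single := []channelAsset{{assetID: "id1", groupKey: "group1"}}
	fmt.Println(channelCompatible(quote, single))
}
```

Matching per asset rather than on a channel-level group key is what lets a channel funded with a single grouped asset ID satisfy a group-key quote in this model.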

@guggero guggero requested review from GeorgeTsagk and removed request for GeorgeTsagk June 4, 2025 17:14
During pathfinding, when an HTLC doesn't have an asset ID or group key
encoded, we can only find out if a channel is compatible after looking
at the specifier of the quote.
We add that check to make sure pathfinding doesn't give false positives
for channels that then can't be used because they're not compatible.
@guggero guggero force-pushed the debug-trace-log branch from 55a06f0 to c0a3f16 Compare June 4, 2025 17:25
@guggero guggero requested a review from GeorgeTsagk June 4, 2025 17:26
Member

@GeorgeTsagk GeorgeTsagk left a comment

LGTM, ty for the fix! 💯


// One of the asset IDs in the channel does not match the quote,
// we don't want to route this HTLC over this channel.
if !match {
Member

👍

@GeorgeTsagk GeorgeTsagk added this pull request to the merge queue Jun 5, 2025
Merged via the queue into main with commit 3355287 Jun 5, 2025
18 checks passed
@guggero guggero deleted the debug-trace-log branch June 5, 2025 09:53
guggero added a commit to lightninglabs/lightning-terminal that referenced this pull request Jun 5, 2025
Adds a test case to validate the fix in
lightninglabs/taproot-assets#1583, by adding a test that:
 - Creates two asset channels between Alice and Bob
 - Creates a BTC channel between Bob and Charlie
 - The two asset channels each have a different asset in them
 - The balance of the pences channel is decreased (lower bandwidth)
 - An RFQ payment is attempted, with pences as the payment asset