From 6c61f9dd720f7a28dd3e124357cc62f7efe706c1 Mon Sep 17 00:00:00 2001 From: Benjamin Hindman Date: Wed, 30 Apr 2025 14:33:52 -0500 Subject: [PATCH 1/6] Path queries v1 use node ids Add proposal high-level wip Proposal summary v1 wip update bolts remove from index edits --- 07-routing-gossip.md | 52 +++++++++++++- 09-features.md | 3 +- proposals/path-queries.md | 141 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 194 insertions(+), 2 deletions(-) create mode 100644 proposals/path-queries.md diff --git a/07-routing-gossip.md b/07-routing-gossip.md index e306b6788..d6bfebc12 100644 --- a/07-routing-gossip.md +++ b/07-routing-gossip.md @@ -835,6 +835,56 @@ determine if replies are done: simply check if `first_blocknum` plus The addition of timestamp and checksum fields allow a peer to omit querying for redundant updates. +### The `path_query` and `path_reply` Messages + +1. type: 266 (`path_query`) +2. data: + * [`chain_hash`:`chain_hash`] + * [`point`:`source_node_id`] + * [`point`:`destination_node_id`] + * [`u64`:`amount_msat`] + + + +1. type: 267 (`path_reply`) +2. data: + * [`chain_hash`:`chain_hash`] + * [`point`:`source_node_id`] + * [`point`:`destination_node_id`] + * [`u64`:`amount_msat`] + * [`u16`:`path_len`] + * [`path_len*short_channel_id`:`path`] + +1. type: 268 (`reject_path_query`) +2. data: + * [`u16`:`reason_len`] + * [`reason_len*byte`:`reason`] + +#### Rationale + +One path per message allows a node to respond asynchronously. `path_reply` includes the queried fields to disambiguate multiple `path_query`s. + +#### Requirements + +The origin node sending `path_query`: + - MUST set `source_node_id` to the public key of the source node. + - MUST set `destination_node_id` to the public key of the destination node. + - MUST set `amount_msat`. + +The receiving node: + - if it does not support the `option_path_query` feature: + - MUST ignore the message. 
+ - if the node chooses to respond: + - MAY send multiple `path_reply` messages: + - MUST set `source_node_id`, `destination_node_id`, and `amount_msat` + to match the values from the original `path_query` message. + - MUST include a list of `short_channel_id`s that form a path between nodes connected + to `source_node_id` and `destination_node_id`. + - MUST set `path_len` to the number of channels in the path. + - if the node chooses not to respond: + - MAY ignore the message. + - MAY send a `reject_path_query` with `reason`. + ### The `gossip_timestamp_filter` Message 1. type: 265 (`gossip_timestamp_filter`) @@ -1114,4 +1164,4 @@ above. ![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png "License CC-BY")
-This work is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/). +This work is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/). \ No newline at end of file diff --git a/09-features.md b/09-features.md index 570b69f64..f560891b9 100644 --- a/09-features.md +++ b/09-features.md @@ -52,6 +52,7 @@ The Context column decodes as follows: | 46/47 | `option_scid_alias` | Supply channel aliases for routing | IN | | [BOLT #2][bolt02-channel-ready] | | 48/49 | `option_payment_metadata` | Payment metadata in tlv record | 9 | | [BOLT #11](11-payment-encoding.md#tagged-fields) | | 50/51 | `option_zeroconf` | Understands zeroconf channel types | IN | `option_scid_alias` | [BOLT #2][bolt02-channel-ready] | +| 152/153 | `option_path_query` | Node supports path query messages | IN | | [BOLT #7](07-routing-gossip.md#path-query-messages) | | 60/61 | `option_simple_close` | Simplified closing negotiation | IN | `option_shutdown_anysegwit` | [BOLT #2][bolt02-simple-close] | ## Requirements @@ -107,4 +108,4 @@ This work is licensed under a [Creative Commons Attribution 4.0 International Li [bolt07-query]: 07-routing-gossip.md#query-messages [bolt04-mpp]: 04-onion-routing.md#basic-multi-part-payments [bolt04-route-blinding]: 04-onion-routing.md#route-blinding -[ml-sighash-single-harmful]: https://lists.linuxfoundation.org/pipermail/lightning-dev/2020-September/002796.html +[ml-sighash-single-harmful]: https://lists.linuxfoundation.org/pipermail/lightning-dev/2020-September/002796.html \ No newline at end of file diff --git a/proposals/path-queries.md b/proposals/path-queries.md new file mode 100644 index 000000000..940b66ea0 --- /dev/null +++ b/proposals/path-queries.md @@ -0,0 +1,141 @@ + +# Path Queries + +### Introduction + +To route a payment on the Lightning Network, a sender must find a path to the destination using channels which contain 
sufficient liquidity and meet certain routing rules (e.g fees). The routing information that's shared within the protocol today is insufficient to reliably find a feasible path, which results in nodes spamming other nodes with failing payments. The purpose of path queries is to enable payment senders to find feasible paths with minimal knowledge of the graph, and to allow routing nodes to respond with dynamic policy. This is done by sharing path information at the peer-level, which can be scaled to a growing network while preserving channel balance privacy and payment anonymity. + +### The problem + +#### 1. Liquidity uncertainty + +Liquidity is state that is managed between two nodes and the unpredictability of this value for any given node is referred as *liquidity uncertainty*. Without prior knowledge, the uncertainty of unowned channels on the network is only bounded by the capacity of that channel and the aggregate uncertainty grows with the number of channels on the network. Currently, feasible paths are found by trial-and-error, whereby liquidity uncertainty is reduced by sequential payment attempts, but this approach has a host of issues: + +1. Pathfinding calculations account for reliability by favoring shorter paths and higher capacity channels. Not only does this effect the payment sender who's more likely to pay extra in fees for reliable liquidity, it's also a centralizing force on the network. + +2. Low payment success rates implies a large set of potential routes. When the final route is unknown, routing fees (and other payment details) are more difficult to predict. + +3. Failed payments are a burden to routing nodes in the failing sub-path in the form of locked liquidity, HTLC slots, and wasted computational resources. + +Furthermore, trial-and-error is a slow method to discover a feasible path because HTLCs need to be set up and torn down at each hop. 
HTLCs require multiple rounds of communication between peers, which is wasted time when the payment fails. Furthermore, this process must be executed serially to avoid the delivery of multiple successful payments, which fundamentally limits the rate at which path discovery can occur. To improve the performance of real payments, nodes may choose to 'probe' the network to reduce liquidity uncertainty. Due to it's dynamic nature, liquidity is always regressing to a state of uncertainty, which means nodes must continously monitor the network. As the network grows, this means an exponential growth of failing payments. + +#### 2. Graph dependence + +To route a payment, the sender needs the latest updates to the graph. This requirement is a burden to the sender, who needs to constantly sync the channel graph, and to routers, who must limit their channel updates. Inputs to routing policy, including liquidity, onchain fees, and even exogenous factors, are extremely dynamic, which means their policy should also be dynamic. Rate limits to `channel_updates` causes advertised policy to differ from desired policy, which reduces control of routing resources (e.g liquidity). + +* * * + +### Proposal + +This proposal includes the following optional messages which allows nodes to cooperatively construct a path: + +1. `path_query` + +- source_node_id + +- destination_node_id + +- amount + +2. `path_reply` + +- path + +3. `reject_path_query` + +- reason + +Upon receiving a `path_query`, a node can choose how it wants to respond, including rejecting or ignoring it. The `path_reply` message helps the requester - either a source or router - deliver a potential payment because it leverages routing information at the queried node. This solves the liquidity uncertainty problem at the queried hop because a node knows it's own balances and can respond accordingly. Compared to payment onions, queries are lightweight and can be made concurrently. 
A routing node can respond with any routing policy (e.g. fees, expiry, etc.) it desires, unconstrained by global rate limits. + +## Putting into practice + +The proposal outlines a basic set of messages, and it is up to each node to choose its own request and response strategies, including *who* it wants to talk to (any subset of nodes), *what* it wants to respond to (e.g. minimum amounts), and any rate limits (number of requests and replies/paths). While there are innumerable strategies that may evolve, let's walk through a simple example where all nodes adopt a PEER_ONLY strategy (i.e. nodes interact only with their channel peers): + +Payment from S -> R +``` + +-------+ +-------+ + +----------| A |------| B |----------+ + | +-------+ +-------+ | + | | | ++-------+ | +-----+ +| S | ---------------+ | R | ++-------+ | +-----+ + | | | + | +-------+ +-------+ | + +----------| C |------| D |----------+ + +-------+ +-------+ +``` + +Before attempting the payment, the sender (S) may choose to query any subset of its channel peers. Alice (A) advertises the lowest routing fees but, since this is a larger payment, the sender decides to make concurrent queries to both Alice and Carol: + +- Alice receives a `path_query` message requesting a path from herself (A) to the receiver (R). She sees she does not have the outbound liquidity to Bob (B) to complete the payment, so she responds with a `reject_path_query` with a reason indicating a temporary failure to find a route. +- Carol (C) receives a `path_query` requesting a path from herself to the receiver (R). She has sufficient outbound liquidity through Dave (D), but before responding to the sender, she decides to query Dave: + - Dave receives a query from Carol for a path from himself (D) to the receiver. Similar to Alice, Dave responds that he has no route available. +- Upon discovering insufficient liquidity from D -> R, Carol splits the sender amount and concurrently queries Bob (B) and Dave (D) with their respective splits.
+ - Dave receives a new query requesting a path from himself (D) to the receiver, but of a lesser amount. This he does have the liquidity for! Since Dave knows he can route the requested payment, he resonds to Carol with the given path and routing details. + - Bob (B) receives a new query requesting a path from himself (B) to the receiver for his split amount. Similar to Dave, he knows he can route the payment, so resonds to Carol with his routing details. +- Upon receiving the path details from Bob and Dave, Carol can now confidently assemble a MPP from herself to the receiver. She constructs the MPP, adds her own routing details and sends a `path_reply` to the sender. +- Upon receiving the `path_reply` from Carol, Alice attempts the payment and on her first attempt, the payment succeeded. + +As you can see, path queries enable nodes to use queries to concurrently probe the network for feasible paths. When chained together, these queries swarm to the destination, reducing liquididity uncertainty at an exponentiall faster rate than failed payments. Each hop knows it's channel balance and can therefore reduce the liquidity uncertainty at their channel in the path. After receiving a `path_reply` a node can prepend itself to the path and back-propogate it to the source. + +While the small example above illustrates the process, it is important to consider the *rate* at which liquidity uncertainty is reduced; trial-and-error may work for a small network like this, but does not scale well with a growing number of nodes. + +### Implementing Trampoline via queries + +Path queries can be used to accomplish everything trampoline routing proposes, but with reduced complexity at the protocol level. The [trampoline proposal](https://github.com/lightning/bolts/blob/trampoline-routing/proposals/trampoline.md#introduction) states, "The main goal of trampoline routing is to reduce the amount of gossip that constrained nodes need to sync". 
While trampoline requires constrained devices to "keep a small part of the network and leverage trampoline nodes to route payments," path queries do not need any knowledge of the graph to find a route, but only a connection to a supporting peer(s). + +Trampoline pursues this goal while also preserving anonymity for the sender and receiver. This is done by including multiple hops in the trampoline route. By using onion messages (see [Queries via Onions](#queries-via-onions)) to anonymously query a remote node, a source node can attain a comparable anonymity set with only a single trampoline-like hop using the following process: + +1. Select a node with path query support as a 'trampoline' hop (T). +2. Query channel peer (P) for a path to T. +3. Query T for a path from T to final destination (D). +4. Send payment using aggregate route +``` +S -> P -> ... -> T -> ... -> D +``` + +Note that while the pathfinding process is similar to trampoline in that it leverages routing information of other nodes, the final route does not use a special "trampoline route", but rather a regular onion route. This has several advantages: + +- Simplicity of the protocol +- Trampoline hops do not know previous trampolines +- Single-hop trampolines produce a shorter route (more reliable, lower expected fees) + +## Expanding the Protocol + +#### Queries via Onions + +The proposal as described above only supports messages using a *direct* connection between any two peers. This is sufficient for queries between channel peers because it reveals no new information about the source of a payment. However, this reduces anonymity when querying nodes elsewhere on the network, such as querying a trampoline-like hop as described above. To improve anonymity, onions could be used to carry these messages. However, without knowing the query node, responding nodes are vulnerable to spam, and consequentally, potential DoS attacks. 
This is a similar attack vector to channel jamming, but rather than using onions to consume payment resources (liquidity, HTLCs), a query consumes computational resources instead. To prevent this, nodes can implement their own mitigations, such as a small payment as part of the query. + +#### Adding fields + +The messages defined in this proposal are intentionally bare. Optional message fields can be added to enhance a node's capabilites and to reduce the number of messages between peers. Some examples: + +- `path_query` + - `maximum_fee`, `cltv_expiration` - reduce response messages by providing upfront filters + - `expiration`. A querying node can define the window of time they're interested in a given path (e.g 1min, 1hr, 1day, always) and get notified with updates. +- `path_reply` + - `confidence` (ranged interval) - a score to indicate the expected likelihood of payment delivery; the higher the routing node's confidence, the more a path suggestion behaves like a *quote* for delivery. This may be used by a querying node to weigh the value of a responding nodes offered paths, especially if there's a cost for `path_reply`s, such as onion queries as described above. Unlike forwarding endorsements (e.g [HTLC endorsement](https://github.com/lightning/bolts/pull/1071)), this value would be back-propogated in the `path_reply` messages, so does not leak information about the origin. + +## Potential Concerns +#### Privacy Implications + +Naturally, any time information is shared, there is a privacy implication. A `path_query` reveals a downstream node - either a hop or the destination - to the prospective routing node. When iterated upon, each node in the path becomes aware of the *queried* destination. Meanwhile, the selected channels in a `path_reply` may reveal some information about channel balances. 
As such, let's consider channel balance privacy and sender/receiver anonymity: + +*Privacy of channel balances* + +Path queries differ from trial-and-error (including probing) in the manner in which liquidity uncertainty is reduced. Trial-and-error informs the payment *sender* about liquidity *ranges* (i.e. lower and upper bounds) for channels on an attempted path, while a `path_reply` only provides a set of channels that meet liquidity requirements. For example, in our PEER_ONLY strategy described above, the sender (S) gained no information about liquidity on the network other than what was sufficient for the final route. While probing still remains an unsolved problem, path queries enable better information control as nodes can choose *who* they talk to and *how much* information they want to reveal. + +Generally speaking, the more channels a node has, the more difficult it is to infer liquidity based on an offered path. Large routing nodes with many channels may be more liberal in their responses than smaller nodes delivering less frequent payments. + +*Sender Anonymity* + +While a single query does not tell a routing node about the source of a payment, the number of queries a routing node receives and whom they come from may reduce the anonymity set of the *query* origin. Depending on the nature of the payment, the sender may choose its own path construction process, including adding trampoline-like hops or opting out of queries altogether. + +*Receiver Anonymity* + +While the receiver does not have a choice in the sender's routing process, they do get to choose the final sub-path via route blinding. Using path queries, a receiver can construct more reliable paths to itself; the longer the path, the more anonymity from the sender and its gang of routing nodes. The receiver may also choose to construct the blinded path using trampoline-like hops to prevent routing nodes from inferring full paths.
+ +#### Denial-of-service risks + +Nodes may choose their own response strategies, including filtering requests (e.g minimum amount) and setting rate limits. In order to enforce rate limits, a node either needs to know the source of the query or needs to enforce some cost on anonymous queries. From 487b7e3ae6c2adea000e4eefdf7de3832d9223ce Mon Sep 17 00:00:00 2001 From: Benjamin Hindman Date: Tue, 27 May 2025 15:47:01 -0500 Subject: [PATCH 2/6] Updated proposal summary & feature flags (66/67) --- 09-features.md | 2 +- proposals/path-queries.md | 47 +++++++++++++++++---------------------- 2 files changed, 21 insertions(+), 28 deletions(-) diff --git a/09-features.md b/09-features.md index f560891b9..99611ac1c 100644 --- a/09-features.md +++ b/09-features.md @@ -52,8 +52,8 @@ The Context column decodes as follows: | 46/47 | `option_scid_alias` | Supply channel aliases for routing | IN | | [BOLT #2][bolt02-channel-ready] | | 48/49 | `option_payment_metadata` | Payment metadata in tlv record | 9 | | [BOLT #11](11-payment-encoding.md#tagged-fields) | | 50/51 | `option_zeroconf` | Understands zeroconf channel types | IN | `option_scid_alias` | [BOLT #2][bolt02-channel-ready] | -| 152/153 | `option_path_query` | Node supports path query messages | IN | | [BOLT #7](07-routing-gossip.md#path-query-messages) | | 60/61 | `option_simple_close` | Simplified closing negotiation | IN | `option_shutdown_anysegwit` | [BOLT #2][bolt02-simple-close] | +| 66/67 | `option_path_query` | Node supports `path_query` messages | IN | | [BOLT #7](07-routing-gossip.md#path-query-messages) | ## Requirements diff --git a/proposals/path-queries.md b/proposals/path-queries.md index 940b66ea0..38e3cdbd4 100644 --- a/proposals/path-queries.md +++ b/proposals/path-queries.md @@ -3,25 +3,30 @@ ### Introduction -To route a payment on the Lightning Network, a sender must find a path to the destination using channels which contain sufficient liquidity and meet certain routing rules (e.g fees). 
The routing information that's shared within the protocol today is insufficient to reliably find a feasible path, which results in nodes spamming other nodes with failing payments. The purpose of path queries is to enable payment senders to find feasible paths with minimal knowledge of the graph, and to allow routing nodes to respond with dynamic policy. This is done by sharing path information at the peer-level, which can be scaled to a growing network while preserving channel balance privacy and payment anonymity. +To route a payment on the Lightning Network, a sender must find a path to the destination using channels which contain sufficient liquidity and meet certain routing rules (e.g fees). The routing information that's shared within the protocol today is insufficient to determine a feasible path, which results in various forms of failing payments. The purpose of path queries is to obtain routing information using queries, rather than relying on the responses of payment attempts. This gives payment senders an effective way to find feasible paths with minimal knowledge of the graph and gives routing nodes the opportunity to respond with dynamic policy. By selectively sharing routing information between peers, payment reliability can be scaled to a growing network while preserving channel balance privacy and payment anonymity. -### The problem +### The problem space #### 1. Liquidity uncertainty -Liquidity is state that is managed between two nodes and the unpredictability of this value for any given node is referred as *liquidity uncertainty*. Without prior knowledge, the uncertainty of unowned channels on the network is only bounded by the capacity of that channel and the aggregate uncertainty grows with the number of channels on the network. 
Currently, feasible paths are found by trial-and-error, whereby liquidity uncertainty is reduced by sequential payment attempts, but this approach has a host of issues: +For a payment to succeeed, a feasible path needs to be discovered, which requires sufficient liquidity in each channel. Liquidity is state that is managed between two nodes and the unpredictability of this value for a given node is referred to as *liquidity uncertainty*. Without any prior knowledge (i.e high uncertainty), the probability a path is feasible declines with the size of the payment and the number of channels used. Today, feasible paths are found by a process of trial-and-error, whereby liquidity uncertainty is reduced using the results of previous payment attempts, but this approach has a host of issues: -1. Pathfinding calculations account for reliability by favoring shorter paths and higher capacity channels. Not only does this effect the payment sender who's more likely to pay extra in fees for reliable liquidity, it's also a centralizing force on the network. +1. Pathfinding calculations attempt to increase probabilities by favoring shorter paths and higher capacity channels. Not only does this effect the payment sender who's more likely to pay extra in fees for reliable liquidity, it's also a centralizing force on the network. -2. Low payment success rates implies a large set of potential routes. When the final route is unknown, routing fees (and other payment details) are more difficult to predict. +2. Lower payment success probabilities implies a larger set of potential routes. When the final route is unknown, routing fees (and other payment details) are more difficult to predict. 3. Failed payments are a burden to routing nodes in the failing sub-path in the form of locked liquidity, HTLC slots, and wasted computational resources. -Furthermore, trial-and-error is a slow method to discover a feasible path because HTLCs need to be set up and torn down at each hop. 
HTLCs require multiple rounds of communication between peers, which is wasted time when the payment fails. Furthermore, this process must be executed serially to avoid the delivery of multiple successful payments, which fundamentally limits the rate at which path discovery can occur. To improve the performance of real payments, nodes may choose to 'probe' the network to reduce liquidity uncertainty. Due to it's dynamic nature, liquidity is always regressing to a state of uncertainty, which means nodes must continously monitor the network. As the network grows, this means an exponential growth of failing payments. +Furthermore, trial-and-error is a slow discovery process because: + +1. HTLCs need to be set up and torn down at each hop. HTLCs require multiple rounds of communication between peers, which is wasted time when the payment fails. +2. It must be executed serially to avoid the delivery of multiple successful payments + +To improve performance for real payments, nodes may choose to 'probe' the channels of routing nodes to reduce liquidity uncertainty. Due to it's dynamic nature, liquidity is always regressing to a state of uncertainty, which means nodes must actively monitor the network. As the number of routing channels grows, one can expect an exponential growth of failing payments. #### 2. Graph dependence -To route a payment, the sender needs the latest updates to the graph. This requirement is a burden to the sender, who needs to constantly sync the channel graph, and to routers, who must limit their channel updates. Inputs to routing policy, including liquidity, onchain fees, and even exogenous factors, are extremely dynamic, which means their policy should also be dynamic. Rate limits to `channel_updates` causes advertised policy to differ from desired policy, which reduces control of routing resources (e.g liquidity). +To route a payment, the sender needs the latest updates to the graph. 
This requirement is a burden to the sender, who needs to constantly sync the channel graph, and to routers, who must limit their `channel_update` messages. Inputs to routing policy, including liquidity, onchain fees, and external factors, are highly dynamic, which means their policy should also be dynamic. Rate limits to `channel_update` messages causes advertised policy to differ from desired policy, which reduces control of routing resources (e.g liquidity). * * * @@ -77,35 +82,23 @@ Before attempting the payment, the sender (S) may choose to query any subset of - Upon receiving the path details from Bob and Dave, Carol can now confidently assemble a MPP from herself to the receiver. She constructs the MPP, adds her own routing details and sends a `path_reply` to the sender. - Upon receiving the `path_reply` from Carol, Alice attempts the payment and on her first attempt, the payment succeeded. -As you can see, path queries enable nodes to use queries to concurrently probe the network for feasible paths. When chained together, these queries swarm to the destination, reducing liquididity uncertainty at an exponentiall faster rate than failed payments. Each hop knows it's channel balance and can therefore reduce the liquidity uncertainty at their channel in the path. After receiving a `path_reply` a node can prepend itself to the path and back-propogate it to the source. - -While the small example above illustrates the process, it is important to consider the *rate* at which liquidity uncertainty is reduced; trial-and-error may work for a small network like this, but does not scale well with a growing number of nodes. +As you can see, concurrent `path_query` messages spread amongst prospective routing nodes until a feasible path is discovered. After receiving a `path_reply` a node can prepend itself to the path and either back-propagate it to the source or attempt the payment. 
Each hop knows its channel balances and can therefore reduce the liquidity uncertainty for its respective channels. -### Implementing Trampoline via queries +While the small example above illustrates the process, it is important to consider the *rate* at which liquidity uncertainty is reduced; trial-and-error may work for a small network like this, but does not scale to a growing number of nodes. -Path queries can be used to accomplish everything trampoline routing proposes, but with reduced complexity at the protocol level. The [trampoline proposal](https://github.com/lightning/bolts/blob/trampoline-routing/proposals/trampoline.md#introduction) states, "The main goal of trampoline routing is to reduce the amount of gossip that constrained nodes need to sync". While trampoline requires constrained devices to "keep a small part of the network and leverage trampoline nodes to route payments," path queries do not need any knowledge of the graph to find a route, but only a connection to a supporting peer(s). +### Comparisons to Trampoline -Trampoline pursues this goal while also preserving anonymity for the sender and receiver. This is done by including multiple hops in the trampoline route. By using onion messages (see [Queries via Onions](#queries-via-onions)) to anonymously query a remote node, a source node can attain a comparable anonymity set with only a single trampoline-like hop using the following process: - -1. Select a node with path query support as a 'trampoline' hop (T). -2. Query channel peer (P) for a path to T. -3. Query T for a path from T to final destination (D). -4. Send payment using aggregate route -``` -S -> P -> ... -> T -> ... -> D -``` +One of the benefits of path queries is the reduced dependence on gossip data to route payments.
The [trampoline proposal](https://github.com/lightning/bolts/blob/trampoline-routing/proposals/trampoline.md#introduction) states a similar goal: "The main goal of trampoline routing is to reduce the amount of gossip that constrained nodes need to sync." This is done by using a trampoline onion, which enables dynamic routing between well-informed trampoline routing nodes. By using path queries, the payment sender can instead construct the entire route by querying well-informed - likely the same - routing nodes. -Note that while the pathfinding process is similar to trampoline in that it leverages routing information of other nodes, the final route does not use a special "trampoline route", but rather a regular onion route. This has several advantages: +Each approach has its own set of trade-offs. A detailed comparison is beyond the scope of this proposal, but at a high level, a regular onion gives more control to the payment sender over the final route and is indistinguishable to the receiver. However, when a failure occurs, the error is returned to the source and a new path needs to be attempted. By comparison, errors in a trampoline sub-path are returned to the previous trampoline hop and can be retried from there. -- Simplicity of the protocol -- Trampoline hops do not know previous trampolines -- Single-hop trampolines produce a shorter route (more reliable, lower expected fees) +Generally speaking, trampoline can be expected to handle failures more efficiently, while path queries give the payment sender more control of the completed payment. ## Expanding the Protocol #### Queries via Onions -The proposal as described above only supports messages using a *direct* connection between any two peers. This is sufficient for queries between channel peers because it reveals no new information about the source of a payment. However, this reduces anonymity when querying nodes elsewhere on the network, such as querying a trampoline-like hop as described above.
To improve anonymity, onions could be used to carry these messages. However, without knowing the query node, responding nodes are vulnerable to spam, and consequentally, potential DoS attacks. This is a similar attack vector to channel jamming, but rather than using onions to consume payment resources (liquidity, HTLCs), a query consumes computational resources instead. To prevent this, nodes can implement their own mitigations, such as a small payment as part of the query. +The proposal as described above only supports messages using a *direct* connection between any two peers. This is sufficient for queries between channel peers because it reveals no new information about the source of a payment. However, this reduces anonymity when querying nodes elsewhere on the network, such as querying a trampoline-like hop as described above. To improve anonymity, onions could be used to carry these messages. However, without knowing the query source, responding nodes are vulnerable to spam and, consequently, to potential DoS attacks. This is a similar attack vector to channel jamming, but rather than using onions to consume payment resources (liquidity, HTLCs), a query consumes computational resources instead. To prevent this, nodes can implement their own mitigations, such as a small payment as part of an anonymous query. #### Adding fields @@ -113,7 +106,7 @@ The messages defined in this proposal are intentionally bare. Optional message f - `path_query` - `maximum_fee`, `cltv_expiration` - reduce response messages by providing upfront filters - - `expiration`. A querying node can define the window of time they're interested in a given path (e.g 1min, 1hr, 1day, always) and get notified with updates. + - `expiration` - A querying node can define the window of time they're interested in a given path (e.g. 1min, 1hr, 1day, always) and get notified with updates.
- `path_reply` - `confidence` (ranged interval) - a score to indicate the expected likelihood of payment delivery; the higher the routing node's confidence, the more a path suggestion behaves like a *quote* for delivery. This may be used by a querying node to weigh the value of a responding nodes offered paths, especially if there's a cost for `path_reply`s, such as onion queries as described above. Unlike forwarding endorsements (e.g [HTLC endorsement](https://github.com/lightning/bolts/pull/1071)), this value would be back-propogated in the `path_reply` messages, so does not leak information about the origin. From 3de26a3a114f2d00a294dc8ba2da6661acc354f8 Mon Sep 17 00:00:00 2001 From: Benjamin Hindman Date: Tue, 27 May 2025 23:03:13 -0500 Subject: [PATCH 3/6] Rewrite comparisons to trampoine --- proposals/path-queries.md | 37 ++++++++++++++++++++++--------------- 1 file changed, 22 insertions(+), 15 deletions(-) diff --git a/proposals/path-queries.md b/proposals/path-queries.md index 38e3cdbd4..7c4f6684b 100644 --- a/proposals/path-queries.md +++ b/proposals/path-queries.md @@ -9,7 +9,7 @@ To route a payment on the Lightning Network, a sender must find a path to the de #### 1. Liquidity uncertainty -For a payment to succeeed, a feasible path needs to be discovered, which requires sufficient liquidity in each channel. Liquidity is state that is managed between two nodes and the unpredictability of this value for a given node is referred to as *liquidity uncertainty*. Without any prior knowledge (i.e high uncertainty), the probability a path is feasible declines with the size of the payment and the number of channels used. Today, feasible paths are found by a process of trial-and-error, whereby liquidity uncertainty is reduced using the results of previous payment attempts, but this approach has a host of issues: +For a payment to succeed, a feasible path needs to be discovered, which requires sufficient liquidity in each channel. 
Liquidity is state that is managed between two nodes and the unpredictability of this value for a given node is referred to as *liquidity uncertainty*. Without any prior knowledge (i.e high uncertainty), the probability a path is feasible declines with the size of the payment and the number of channels used. Today, feasible paths are found by a process of trial-and-error, whereby liquidity uncertainty is reduced using the results of previous payment attempts, but this approach has a host of issues: 1. Pathfinding calculations attempt to increase probabilities by favoring shorter paths and higher capacity channels. Not only does this effect the payment sender who's more likely to pay extra in fees for reliable liquidity, it's also a centralizing force on the network. @@ -22,7 +22,7 @@ Furthermore, trial-and-error is a slow discovery process because: 1. HTLCs need to be set up and torn down at each hop. HTLCs require multiple rounds of communication between peers, which is wasted time when the payment fails. 2. It must be executed serially to avoid the delivery of multiple successful payments -To improve performance for real payments, nodes may choose to 'probe' the channels of routing nodes to reduce liquidity uncertainty. Due to it's dynamic nature, liquidity is always regressing to a state of uncertainty, which means nodes must actively monitor the network. As the number of routing channels grows, one can expect an exponential growth of failing payments. +To improve performance for real payments, nodes may choose to 'probe' the channels of routing nodes to reduce liquidity uncertainty. Due to its dynamic nature, liquidity is always regressing to a state of uncertainty, which means nodes must actively monitor the network. As the number of routing channels grows, one can expect an exponential growth of failing payments. #### 2. 
Graph dependence @@ -54,7 +54,7 @@ Upon receiving a `path_query`, a node can choose how it wants to respond, includ ## Putting into practice -The proposal outlines a basic set of messages and it is for the node to choose their own request & response strategies, including *who* they want to talk to (any subset of nodes), *what* they want to respond to (e.g minimum amounts) and any rate limits (number of requests and replies/paths). While there are innumerable strategies that may evolve, let's walk-through a simple example where where all nodes adopt a PEER_ONLY strategy (i.e nodes interact only with their channel peers): +The proposal outlines a basic set of messages, and it is up to the node to choose their own request & response strategies, including *who* they want to talk to (any subset of nodes), *what* they want to respond to (e.g minimum amounts) and any rate limits (number of requests and replies/paths). While there are innumerable strategies that may evolve, let's walk-through a simple example where all nodes adopt a PEER_ONLY strategy. Under this strategy, nodes only send `path_query` and `path_reply` messages to their direct channel peers. Payment from S -> R ``` @@ -71,34 +71,41 @@ Payment from S -> R +-------+ +-------+ ``` -Before attempting the payment, the sender (S) may choose to query any subset of it's channel peers. Alice (A) advertises the lowest routing fees but, since this is a larger payment, the sender decides to make concurrent queries to both Alice and Carol: +Before attempting the payment, the sender (S) may choose to query any subset of it's channel peers. Since this is a larger payment, the sender decides to make concurrent queries to both Alice and Carol: -- Alice receives a `path_query` message requesting a path from herself (A) to the receiver (R). She sees she does not have the outbound liquidity to Bob (B) to complete payment, so responds with a `reject_path_query` with a reason indicating a temporary failure to find a route. 
+- Alice receives a `path_query` message requesting a path from herself (A) to the receiver (R). She sees she does not have the outbound liquidity to Bob (B) to complete payment, so either responds with a `reject_path_query` with a reason indicating a temporary failure to find a route, or waits for liquidity to become available to respond with a `path_reply`. - Carol (C) receives a `path_query` requesting a path from herself to the receiver (R). She has sufficient outbound liquidity through Dave (D), but before responding to the sender, she decides to query Dave: - Dave receives a query from Carol for a path from himself (D) to the receiver. Similar to Alice, Dave responds that he has no route available - Upon discovering insufficient liquidity from D -> R, Carol splits the sender amount and concurrently queries Bob (B) and Dave (D) with their respective splits. - - Dave receives a new query requesting a path from himself (D) to the receiver, but of a lesser amount. This he does have the liquidity for! Since Dave knows he can route the requested payment, he resonds to Carol with the given path and routing details. - - Bob (B) receives a new query requesting a path from himself (B) to the receiver for his split amount. Similar to Dave, he knows he can route the payment, so resonds to Carol with his routing details. + - Dave receives a new query requesting a path from himself (D) to the receiver, but of a lesser amount. This he has the liquidity for! Since Dave knows he can route the requested payment, he responds to Carol with the given path and routing details. + - Bob (B) receives a new query requesting a path from himself (B) to the receiver for his split amount. Similar to Dave, he knows he can route the payment, so responds to Carol with his routing details. - Upon receiving the path details from Bob and Dave, Carol can now confidently assemble a MPP from herself to the receiver. 
She constructs the MPP, adds her own routing details and sends a `path_reply` to the sender. -- Upon receiving the `path_reply` from Carol, Alice attempts the payment and on her first attempt, the payment succeeded. +- Upon receiving the `path_reply` from Carol, the sender attempts the payment and on the first attempt, the payment succeeded. -As you can see, concurrent `path_query` messages spread amongst prospective routing nodes until a feasible path is discovered. After receiving a `path_reply` a node can prepend itself to the path and either back-propagate it to the source or attempt the payment. Each hop knows it's channel balances and can therefore reduce the liquidity uncertainty for it's respective channels. +As you can see, `path_query` messages concurrently spread amongst prospective routing nodes until a feasible path is discovered. After receiving a `path_reply` a node can prepend itself to the path and either back-propagate it to the source or attempt the payment. Each node knows it's channel balances and can therefore reduce the liquidity uncertainty for it's respective channels. While the small example above illustrates the process, it is important to consider the *rate* at which liquidity uncertainty is reduced; trial-and-error may work for a small network like this, but does not scale to a growing number of nodes. ### Comparisons to Trampoline -One of the benefits of path queries is the reduced dependence on gossip data to route payments. The [trampoline proposal](https://github.com/lightning/bolts/blob/trampoline-routing/proposals/trampoline.md#introduction) states a similar goal: "The main goal of trampoline routing is to reduce the amount of gossip that constrained nodes need to sync." This is done by using a trampoline onion which enables dynamic routing between well-informed trampoline routing nodes. By using path queries, the payment sender can instead construct the entire route by querying well-informed - likely the same - routing nodes. 
+By querying one or more remote nodes (see [Anonymous queries via Onions](#anonymous-queries-via-onions)), a source node can construct a route similar to that used by trampoline, as demonstrated by the following example: -Each approach has their own set of trade-offs. A detailed comparison is beyond the scope of this proposal, but at a high-level, a regular onion gives more control to the payment sender over the final route and is indistinguishable to the receiver. However, when a failure occurs, the error is returned to the source and a new path needs to be attempted. By comparison, errors in a trampoline sub-path are returned to the previous trampoline hop and can be retried from there. +1. Select a remote node with path query support as a 'trampoline' hop (T2). +2. Query channel peer (T1) for a path to T2. +3. Query T2 for a path from T2 to final destination (D). +4. Send payment using aggregate route: S -> T1 -> ... -> T2 -> ... -> D -Generally speaking, trampoline can be expected to handle failures more efficiently, while path queries give the payment sender more control of the completed payment. +Note that while the pathfinding process is similar to trampoline in that it leverages the pathfinding ability of other nodes, the final route is determined by the sender and a regular onion is used. +Each approach has its own set of trade-offs. A regular payment onion gives the payment sender more control over routing decisions, including what route(s) to use and how to handle errors. When using a trampoline onion, many routing decisions are outsourced. For example, errors are returned to the previous trampoline hop and can be retried from that point, which may improve the payment delivery time, but may also produce a sub-optimal route (e.g more fees) from the sender's perspective. +Routing nodes have an economic incentive to support both features in order to maximize routing fees. 
More importantly, trampoline nodes can also employ queries to find feasible sub-paths, thereby reducing their own dependence on a fully synced and actively probed graph. Graph maintenance is a cost that disproportionately affects nodes with smaller infrastructure and lower payment volume. Reducing these costs allows smaller nodes to be more competitive, and therefore, increases the expected distribution of routing. ## Expanding the Protocol -#### Queries via Onions +#### Anonymous queries via Onions -The proposal as described above only supports messages using a *direct* connection between any two peers. This is sufficient for queries between channel peers because it reveals no new information about the source of a payment. However, this reduces anonymity when querying nodes elsewhere on the network, such as querying a trampoline-like hop as described above. To improve anonymity, onions could be used to carry these messages. However, without knowing the query source, responding nodes are vulnerable to spam, and consequentally, potential DoS attacks. This is a similar attack vector to channel jamming, but rather than using onions to consume payment resources (liquidity, HTLCs), a query consumes computational resources instead. To prevent this, nodes can implement their own mitigations, such as a small payment as part of an anonymous query. +The proposal as described above only supports messages using a *direct* connection between any two peers. This is sufficient for queries between channel peers because it reveals no new information about the source of a payment. However, this reduces anonymity when querying remote nodes, such as the example described in [Trampoline](#comparisons-to-trampoline) above. To improve anonymity, onions could be used to carry these messages. However, without knowing the query source, responding nodes are vulnerable to spam, and consequently, potential DoS attacks. 
This is a similar attack vector to channel jamming, but rather than using onions to consume routing resources (liquidity, HTLCs), query onions consume computational resources instead. To defend against spam, nodes may potentially require a small payment for anonymous recommendations. #### Adding fields @@ -127,7 +134,7 @@ While a single query does not tell a routing node about the source of a payment, *Receiver Anonymity* -While the receiver does not have a choice in the sender's routing process, they do get to choose the final sub-path via route blinding. Using path queries, a receiver can construct more reliable paths to itself; the longer the path, the more anonymity from the sender and it's gang of routing nodes. The receiver may also choose to construct the blinded path using trampoline-like hops to prevent routing nodes from inferring full paths. +While the receiver does not have a choice in the sender's routing process, they do get to choose the final sub-path via route blinding. Using path queries, a receiver can construct more reliable paths to itself; the longer the path, the more anonymity from the sender and its set of routing nodes. The receiver may also choose to construct the blinded path using trampoline-like hops to prevent routing nodes from inferring full paths. 
#### Denial-of-service risks From 243b0334f0d2ad7fa7fe55a8ee52b8d305caa70b Mon Sep 17 00:00:00 2001 From: Benjamin Hindman Date: Tue, 3 Jun 2025 15:10:53 -0500 Subject: [PATCH 4/6] Proposal v2 --- 07-routing-gossip.md | 22 +++++------ 09-features.md | 2 +- proposals/path-queries.md | 79 ++++++++++++++++++++++++--------------- 3 files changed, 59 insertions(+), 44 deletions(-) diff --git a/07-routing-gossip.md b/07-routing-gossip.md index d6bfebc12..1d0c12d89 100644 --- a/07-routing-gossip.md +++ b/07-routing-gossip.md @@ -835,55 +835,53 @@ determine if replies are done: simply check if `first_blocknum` plus The addition of timestamp and checksum fields allow a peer to omit querying for redundant updates. -### The `path_query` and `path_reply` Messages +### The `query_path` and `reply_path` Messages -1. type: 266 (`path_query`) +1. type: 266 (`query_path`) 2. data: - * [`chain_hash`:`chain_hash`] * [`point`:`source_node_id`] * [`point`:`destination_node_id`] * [`u16`:`amount_msat`] -1. type: 267 (`path_reply`) +1. type: 267 (`reply_path`) 2. data: - * [`chain_hash`:`chain_hash`] * [`point`:`source_node_id`] * [`point`:`destination_node_id`] * [`u16`:`amount_msat`] * [`u16`:`path_len`] * [`path_len*short_channel_id`:`path`] -1. type: 268 (`reject_path_query`) +1. type: 268 (`reject_query_path`) 2. data: * [`u16`:`reason_len`] * [`reason_len*byte`:`reason`] #### Rationale -One path per message allows a node to respond asynchronously. `path_reply` includes the queried fields to disambiguate multiple `path_query`s. +One path per message allows a node to respond asynchronously. `reply_path` includes the queried fields to disambiguate multiple `query_path`s. ### Requirements -The origin node sending `path_query`: +The origin node sending `query_path`: - MUST set `source_node_id` to the public key of the source node. - MUST set `destination_node_id` to the public key of the destination node. - MUST set `amount_msat`. 
The receiving node: - - if it does not support the `option_path_query` feature: + - if it does not support the `option_query_path` feature: - MUST ignore the message. - if the node chooses to respond: - - MAY send multiple `path_reply` messages: + - MAY send multiple `reply_path` messages: - MUST set `source_node_id`, `destination_node_id`, and `amount_msat` - to match the values from the original `path_query` message. + to match the values from the original `query_path` message. - MUST include a list of `short_channel_id`s that form a path between nodes connected to `source_node_id` and `destination_node_id`. - MUST set `path_len` to the number of channels in the path. - if the node chooses not to respond: - MAY ignore the message. - - MAY send a `reject_path_query` with `reason`. + - MAY send a `reject_query_path` with `reason`. ### The `gossip_timestamp_filter` Message diff --git a/09-features.md b/09-features.md index 99611ac1c..312fc41ca 100644 --- a/09-features.md +++ b/09-features.md @@ -53,7 +53,7 @@ The Context column decodes as follows: | 48/49 | `option_payment_metadata` | Payment metadata in tlv record | 9 | | [BOLT #11](11-payment-encoding.md#tagged-fields) | | 50/51 | `option_zeroconf` | Understands zeroconf channel types | IN | `option_scid_alias` | [BOLT #2][bolt02-channel-ready] | | 60/61 | `option_simple_close` | Simplified closing negotiation | IN | `option_shutdown_anysegwit` | [BOLT #2][bolt02-simple-close] | -| 66/67 | `option_path_query` | Node supports `path_query` messages | IN | | [BOLT #7](07-routing-gossip.md#path-query-messages) | +| 66/67 | `option_path_queries` | Node supports `query_path` messages | IN | | [BOLT #7](07-routing-gossip.md#path-query-messages) | ## Requirements diff --git a/proposals/path-queries.md b/proposals/path-queries.md index 7c4f6684b..4c096a46a 100644 --- a/proposals/path-queries.md +++ b/proposals/path-queries.md @@ -1,40 +1,57 @@ # Path Queries -### Introduction +## Introduction -To route a payment on the 
Lightning Network, a sender must find a path to the destination using channels which contain sufficient liquidity and meet certain routing rules (e.g fees). The routing information that's shared within the protocol today is insufficient to determine a feasible path, which results in various forms of failing payments. The purpose of path queries is to obtain routing information using queries, rather than relying on the responses of payment attempts. This gives payment senders an effective way to find feasible paths with minimal knowledge of the graph and gives routing nodes the opportunity to respond with dynamic policy. By selectively sharing routing information between peers, payment reliability can be scaled to a growing network while preserving channel balance privacy and payment anonymity. +To route a payment on the Lightning Network, a sender must find a path to the destination using channels which contain sufficient liquidity and meet certain routing rules (e.g fees). The current gossip scheme is insufficient to reliably determine a feasible path and inflexible for routing nodes. The purpose of path queries is to reduce informational requirements during pathfinding and to allow routers to respond with dynamic policy. By selectively sharing routing information between peers, payment reliability can be scaled to a growing network while preserving channel balance privacy and payment anonymity. -### The problem space +## The problem: Graph Dependence -#### 1. Liquidity uncertainty +While finding a feasible path, source-based routing requires information about the network graph. In the context of Lightning, this information typically comes from two sources: gossip messages and the responses of previous payment attempts. Both sources have scaling limitations, which ultimately favor larger routing nodes and routing centralization. -For a payment to succeed, a feasible path needs to be discovered, which requires sufficient liquidity in each channel. 
Liquidity is state that is managed between two nodes and the unpredictability of this value for a given node is referred to as *liquidity uncertainty*. Without any prior knowledge (i.e high uncertainty), the probability a path is feasible declines with the size of the payment and the number of channels used. Today, feasible paths are found by a process of trial-and-error, whereby liquidity uncertainty is reduced using the results of previous payment attempts, but this approach has a host of issues: +### Scaling limitations of gossip -1. Pathfinding calculations attempt to increase probabilities by favoring shorter paths and higher capacity channels. Not only does this effect the payment sender who's more likely to pay extra in fees for reliable liquidity, it's also a centralizing force on the network. +1. Gossip data propagates the topology of the graph (i.e `node_announcement`, `channel_announcement`) and the advertised routing policies (`channel_update`). -2. Lower payment success probabilities implies a larger set of potential routes. When the final route is unknown, routing fees (and other payment details) are more difficult to predict. +The gossip protocol is characterized by its ability to quickly and reliably deliver a *limited* amount of information across a large distributed system via propagation. However, because these messages are delivered to every node (potentially multiple times), its primary limitation is that it does not scale with a growing *quantity* of data; the more data that's shared, the more network and computational resources are required by each node to process those messages. As a result, gossip is well-suited for the Bitcoin network where transaction throughput is limited by design, but is less suited for the Lightning network where global constraints are alleviated and payments are localized. +Therefore, when using the gossip protocol in a distributed network, it's important to consider the quantity of data. 
+ + + +Pathfinding requires that a node synchronize the latest gossip messages from the network. +This means gossip messages need to be constrained, or else the cost to run a fully-synced node increases. + +The growth of nodes and channels is theoretically unbounded, which means gossip messages and the infrastructure (e.g bandwidth, computational, storage) needed to process these messages are also unbounded. Growing infrastructure means running a fully-synced Lightning node becomes increasingly expensive, which can price out smaller nodes. -3. Failed payments are a burden to routing nodes in the failing sub-path in the form of locked liquidity, HTLC slots, and wasted computational resources. +More importantly, routing nodes must limit their `channel_update` messages, and therefore, their routing policies. Inputs to routing policy, including liquidity, onchain fees, and external factors, are highly dynamic, which means their policy should respond dynamically. Rate limits to `channel_update` messages cause advertised policy to differ from desired policy, which reduces control of routing resources (e.g liquidity). + +### Finding Liquidity + +For a payment to succeed, a route requires sufficient liquidity in every channel of the path. From a data perspective, liquidity is state that is managed between two nodes. During pathfinding, the unpredictability of a channel's liquidity is referred to as *liquidity uncertainty*. Without any prior knowledge (i.e high uncertainty), the probability a path is feasible declines with the size of the payment and the number of channels used. Today, feasible paths are found by a process of trial-and-error, whereby liquidity ranges are temporarily narrowed using the results of previous payment attempts, but this approach has a host of issues: + +1. When liquidity uncertainty is high, pathfinding calculations increase payment success probabilities by favoring shorter paths and higher capacity channels. 
Not only does this affect the payment sender, who is more likely to pay extra in fees for reliable liquidity, it's also a centralizing force on the network. + +2. Lower payment success probabilities imply a larger set of potential routes. When the final route is unknown, routing fees (and other payment details) are more difficult to predict. -Furthermore, trial-and-error is a slow discovery process because: +3. Trial-and-error is a slow discovery process because: -1. HTLCs need to be set up and torn down at each hop. HTLCs require multiple rounds of communication between peers, which is wasted time when the payment fails. -2. It must be executed serially to avoid the delivery of multiple successful payments +a. HTLCs need to be set up and torn down at each hop, where each HTLC requires three round-trips between peers. +b. Payments must be attempted serially to avoid the delivery of multiple successful payments. -To improve performance for real payments, nodes may choose to 'probe' the channels of routing nodes to reduce liquidity uncertainty. Due to its dynamic nature, liquidity is always regressing to a state of uncertainty, which means nodes must actively monitor the network. As the number of routing channels grows, one can expect an exponential growth of failing payments. +To improve performance for real payments, nodes may choose to 'probe' the channels of routing nodes using fake payments. However, liquidity is often highly dynamic and is always regressing to a state of uncertainty. To be effective, nodes must actively monitor the network. -#### 2. Graph dependence +4. Failed payments (both real and fake) are a burden to routing nodes in the failing sub-path in the form of locked liquidity, HTLC slots, and wasted system resources. -To route a payment, the sender needs the latest updates to the graph. This requirement is a burden to the sender, who needs to constantly sync the channel graph, and to routers, who must limit their `channel_update` messages. 
Inputs to routing policy, including liquidity, onchain fees, and external factors, are highly dynamic, which means their policy should also be dynamic. Rate limits to `channel_update` messages causes advertised policy to differ from desired policy, which reduces control of routing resources (e.g liquidity). +Furthermore, each of these problems represents a scaling constraint, as more nodes need to search for liquidity amongst a larger set of channels. * * * -### Proposal +## Proposal -This proposal includes the following optional messages which allows nodes to cooperatively construct a path: +The goal of path queries is to reduce a node's dependence on the graph during pathfinding by leveraging the routing information of other nodes. Specifically, this feature includes the following optional messages which allow nodes to cooperatively construct a path: -1. `path_query` +1. `query_path` - source_node_id @@ -42,19 +59,19 @@ This proposal includes the following optional messages which allows nodes to coo - amount -2. `path_reply` +2. `reply_path` - path -3. `reject_path_query` +3. `reject_query_path` - reason -Upon receiving a `path_query`, a node can choose how it wants to respond, including rejecting or ignoring it. The `path_reply` message helps the requester - either a source or router - deliver a potential payment because it leverages routing information at the queried node. This solves the liquidity uncertainty problem at the queried hop because a node knows it's own balances and can respond accordingly. Compared to payment onions, queries are lightweight and can be made concurrently. A routing node can respond with any routing policy (e.g fees, expiry, etc) it desires, unconstrained by global rate limits. +Upon receiving a `query_path` message, a node can choose how it wants to respond, including rejecting or ignoring it. 
The `reply_path` message helps the requester - either a source or router - deliver a potential payment because it leverages routing information at the queried node. This resolves the liquidity uncertainty problem at the queried hop because a node knows its own channel balances and can respond accordingly. A routing node can respond with any routing policy (e.g fees, expiry, etc) it desires, unconstrained by rate limits to gossip. Compared to payments, queries are lightweight and can be made concurrently. ## Putting into practice -The proposal outlines a basic set of messages, and it is up to the node to choose their own request & response strategies, including *who* they want to talk to (any subset of nodes), *what* they want to respond to (e.g minimum amounts) and any rate limits (number of requests and replies/paths). While there are innumerable strategies that may evolve, let's walk-through a simple example where all nodes adopt a PEER_ONLY strategy. Under this strategy, nodes only send `path_query` and `path_reply` messages to their direct channel peers. +The proposal outlines a basic set of messages, and it is up to each node to choose its own request & response strategies, including *who* it wants to talk to (any subset of nodes), *what* it wants to respond to (e.g minimum amounts) and any rate limits (number of requests and replies/paths). While there are innumerable strategies that may evolve, let's walk through a simple example where all nodes adopt a PEER_ONLY strategy. Under this strategy, nodes only send `query_path` and `reply_path` messages to their direct channel peers. Payment from S -> R ``` @@ -73,16 +90,16 @@ Payment from S -> R Before attempting the payment, the sender (S) may choose to query any subset of it's channel peers. Since this is a larger payment, the sender decides to make concurrent queries to both Alice and Carol: -- Alice receives a `path_query` message requesting a path from herself (A) to the receiver (R). 
She sees she does not have the outbound liquidity to Bob (B) to complete payment, so either responds with a `reject_path_query` with a reason indicating a temporary failure to find a route, or waits for liquidity to become available to respond with a `path_reply`. -- Carol (C) receives a `path_query` requesting a path from herself to the receiver (R). She has sufficient outbound liquidity through Dave (D), but before responding to the sender, she decides to query Dave: +- Alice receives a `query_path` message requesting a path from herself (A) to the receiver (R). She sees she does not have the outbound liquidity to Bob (B) to complete payment, so either responds with a `reject_query_path` with a reason indicating a temporary failure to find a route, or waits for liquidity to become available to respond with a `reply_path`. +- Carol (C) receives a `query_path` requesting a path from herself to the receiver (R). She has sufficient outbound liquidity through Dave (D), but before responding to the sender, she decides to query Dave: - Dave receives a query from Carol for a path from himself (D) to the receiver. Similar to Alice, Dave responds that he has no route available - Upon discovering insufficient liquidity from D -> R, Carol splits the sender amount and concurrently queries Bob (B) and Dave (D) with their respective splits. - Dave receives a new query requesting a path from himself (D) to the receiver, but of a lesser amount. This he has the liquidity for! Since Dave knows he can route the requested payment, he responds to Carol with the given path and routing details. - Bob (B) receives a new query requesting a path from himself (B) to the receiver for his split amount. Similar to Dave, he knows he can route the payment, so responds to Carol with his routing details. -- Upon receiving the path details from Bob and Dave, Carol can now confidently assemble a MPP from herself to the receiver. 
She constructs the MPP, adds her own routing details and sends a `path_reply` to the sender. -- Upon receiving the `path_reply` from Carol, the sender attempts the payment and on the first attempt, the payment succeeded. +- Upon receiving the path details from Bob and Dave, Carol can now confidently assemble a MPP from herself to the receiver. She constructs the MPP, adds her own routing details and sends a `reply_path` to the sender. +- Upon receiving the `reply_path` from Carol, the sender attempts the payment and on the first attempt, the payment succeeded. -As you can see, `path_query` messages concurrently spread amongst prospective routing nodes until a feasible path is discovered. After receiving a `path_reply` a node can prepend itself to the path and either back-propagate it to the source or attempt the payment. Each node knows it's channel balances and can therefore reduce the liquidity uncertainty for it's respective channels. +As you can see, `query_path` messages concurrently spread amongst prospective routing nodes until a feasible path is discovered. After receiving a `reply_path` a node can prepend itself to the path and either back-propagate it to the source or attempt the payment. Each node knows it's channel balances and can therefore reduce the liquidity uncertainty for it's respective channels. While the small example above illustrates the process, it is important to consider the *rate* at which liquidity uncertainty is reduced; trial-and-error may work for a small network like this, but does not scale to a growing number of nodes. @@ -111,20 +128,20 @@ The proposal as described above only supports messages using a *direct* connecti The messages defined in this proposal are intentionally bare. Optional message fields can be added to enhance a node's capabilites and to reduce the number of messages between peers. 
Some examples: -- `path_query` +- `query_path` - `maximum_fee`, `cltv_expiration` - reduce response messages by providing upfront filters - `expiration` - A querying node can define the window of time they're interested in a given path (e.g 1min, 1hr, 1day, always) and get notified with updates. -- `path_reply` - - `confidence` (ranged interval) - a score to indicate the expected likelihood of payment delivery; the higher the routing node's confidence, the more a path suggestion behaves like a *quote* for delivery. This may be used by a querying node to weigh the value of a responding nodes offered paths, especially if there's a cost for `path_reply`s, such as onion queries as described above. Unlike forwarding endorsements (e.g [HTLC endorsement](https://github.com/lightning/bolts/pull/1071)), this value would be back-propogated in the `path_reply` messages, so does not leak information about the origin. +- `reply_path` + - `confidence` (ranged interval) - a score to indicate the expected likelihood of payment delivery; the higher the routing node's confidence, the more a path suggestion behaves like a *quote* for delivery. This may be used by a querying node to weigh the value of a responding nodes offered paths, especially if there's a cost for `reply_path`s, such as onion queries as described above. Unlike forwarding endorsements (e.g [HTLC endorsement](https://github.com/lightning/bolts/pull/1071)), this value would be back-propogated in the `reply_path` messages, so does not leak information about the origin. ## Potential Concerns #### Privacy Implications -Naturally, any time information is shared, there is a privacy implication. A `path_query` reveals a downstream node - either a hop or the destination - to the prospective routing node. When iterated upon, each node in the path becomes aware of the *queried* destination. Meanwhile, the selected channels in a `path_reply` may reveal some information about channel balances. 
As so, let's consider channel balance privacy and sender/receiver anonymity: +Naturally, any time information is shared, there is a privacy implication. A `query_path` reveals a downstream node - either a hop or the destination - to the prospective routing node. When iterated upon, each node in the path becomes aware of the *queried* destination. Meanwhile, the selected channels in a `reply_path` may reveal some information about channel balances. As so, let's consider channel balance privacy and sender/receiver anonymity: *Privacy of channel balances* -Path queries differ from trial-and-error (including probing) in the manner that liquidity uncertainty is reduced. Trial-and-error informs the payment *sender* about liquidity *ranges* (i.e lower and upper bound) for channels on an attempted path, while a `path_reply` only provides a set channels that meet liquidity requirements. For example, in our PEER_ONLY strategy described above, the sender (S) gained no information about liquidity on the network other than what was sufficient for the final route. While probing still remains an unsolved problem, path queries enable better information control as nodes can choose *who* they talk to and *how much* information they want to reveal. +Path queries differ from trial-and-error (including probing) in the manner that liquidity uncertainty is reduced. Trial-and-error informs the payment *sender* about liquidity *ranges* (i.e lower and upper bound) for channels on an attempted path, while a `reply_path` only provides a set channels that meet liquidity requirements. For example, in our PEER_ONLY strategy described above, the sender (S) gained no information about liquidity on the network other than what was sufficient for the final route. While probing still remains an unsolved problem, path queries enable better information control as nodes can choose *who* they talk to and *how much* information they want to reveal. 
Generally speaking, the more channels a node has, the more difficult it is to infer liquidity based on an offered path. Large routing nodes with many channels may be more liberal in their responses than smaller nodes delivering less frequent payments. From 51f17d2540577b359a2e068c036f8e44adaa96a1 Mon Sep 17 00:00:00 2001 From: Benjamin Hindman Date: Wed, 4 Jun 2025 10:10:57 -0500 Subject: [PATCH 5/6] Updated proposal --- 07-routing-gossip.md | 28 +++++++++-------- proposals/path-queries.md | 66 +++++++++++++++------------------------ 2 files changed, 41 insertions(+), 53 deletions(-) diff --git a/07-routing-gossip.md b/07-routing-gossip.md index 1d0c12d89..a798595ca 100644 --- a/07-routing-gossip.md +++ b/07-routing-gossip.md @@ -841,47 +841,49 @@ The addition of timestamp and checksum fields allow a peer to omit querying for 2. data: * [`point`:`source_node_id`] * [`point`:`destination_node_id`] - * [`u16`:`amount_msat`] - - + * [`u64`:`amount_msat`] 1. type: 267 (`reply_path`) 2. data: * [`point`:`source_node_id`] * [`point`:`destination_node_id`] - * [`u16`:`amount_msat`] + * [`u64`:`amount_msat`] * [`u16`:`path_len`] * [`path_len*short_channel_id`:`path`] + * [`u16`:`routing_info_len`] + * [`routing_info_len*byte`:`routing_info`] 1. type: 268 (`reject_query_path`) 2. data: + * [`point`:`source_node_id`] + * [`point`:`destination_node_id`] + * [`u64`:`amount_msat`] * [`u16`:`reason_len`] * [`reason_len*byte`:`reason`] #### Rationale -One path per message allows a node to respond asynchronously. `reply_path` includes the queried fields to disambiguate multiple `query_path`s. +The `reply_path` and `reject_query_path` message includes the queried fields to disambiguate multiple `query_path`s. One path per message allows a node to respond asynchronously. ### Requirements The origin node sending `query_path`: - MUST set `source_node_id` to the public key of the source node. - MUST set `destination_node_id` to the public key of the destination node. 
- - MUST set `amount_msat`. + - MUST set `amount_msat` to the payment amount in millisatoshis. -The receiving node: +The node receiving a `query_path`: - if it does not support the `option_query_path` feature: - MUST ignore the message. - - if the node chooses to respond: - - MAY send multiple `reply_path` messages: + - if it does support the `option_query_path` feature: + - MAY ignore the message. + - MAY send a `reject_query_path`. + - MAY respond with a `reply_path`: - MUST set `source_node_id`, `destination_node_id`, and `amount_msat` to match the values from the original `query_path` message. - - MUST include a list of `short_channel_id`s that form a path between nodes connected + - SHOULD include a list of `short_channel_id`s that form a path between nodes connected to `source_node_id` and `destination_node_id`. - MUST set `path_len` to the number of channels in the path. - - if the node chooses not to respond: - - MAY ignore the message. - - MAY send a `reject_query_path` with `reason`. ### The `gossip_timestamp_filter` Message diff --git a/proposals/path-queries.md b/proposals/path-queries.md index 4c096a46a..980f1c658 100644 --- a/proposals/path-queries.md +++ b/proposals/path-queries.md @@ -3,40 +3,27 @@ ## Introduction -To route a payment on the Lightning Network, a sender must find a path to the destination using channels which contain sufficient liquidity and meet certain routing rules (e.g fees). The current gossip scheme is insufficient to reliably determine a feasible path and inflexible for routing nodes. The purpose of path queries is to reduce informational requirements during pathfinding and to allow routers to respond with dynamic policy. By selectively sharing routing information between peers, payment reliability can be scaled to a growing network while preserving channel balance privacy and payment anonymity. 
+To route a payment on the Lightning Network, a sender must find a path to the destination using channels which contain sufficient liquidity and meet certain routing rules (e.g fees). The current gossip scheme is insufficient to reliably determine a feasible path and inflexible for routing nodes. The purpose of path queries is to reduce the informational requirements during pathfinding and to allow routers to respond with dynamic policy. By selectively sharing routing information between peers, payment reliability can be scaled to a growing network while preserving channel balance privacy and payment anonymity. ## The problem: Graph Dependence -While finding a feasible path, source-based routing requires information about the network graph. In the context of Lightning, this information typically comes from two sources: gossip messages and the responses of previous payment attempts. Both sources have scaling limitations, which ultimately favor larger routing nodes and routing centralization. +While finding a feasible path, source-based routing requires information about the network graph. For Lightning, this information typically comes from two sources: gossip messages and the responses of previous payment attempts. Both sources have severe limitations, which ultimately favor larger routing nodes and contribute to routing centralization. -### Scaling limitations of gossip +### Limitations of gossip -1. Gossip data propogates the topology of the graph (i.e `node_announcement`, `channel_announcement`) and the advertised routing policies `channel_update`. - -The gossip protocol is characterized for it's ability to quickly and reliably deliver a *limited* amount of information across a large distributed system via propagation. 
However, because these messages are delivered to every node (potentially multiple times), it's primary limitation is that it does not scale with a growing *quantity* of data; the more data that's shared, the more network and computational resources are required by each node to process those messages. As a result, gossip is well-suited for the Bitcoin network where transaction throughput is limited by design, but is less suited for the Lightning network where global constraints are alleviated and payments are localized. - -Therefore, when using the gossip protocol in a distributed network, it's important to consider the quantity of data. - - - -Pathfinding requires that a node synchronize the latest gossip messages from the network. -This means gossip messages need to be constrained, or else the cost to run a fully-synced node increases. - -The growth of nodes and channels is theoretically unbounded, which means gossip messages and the infrastructure (e.g bandwidth, computational, storage) needed to processes these messages is also unbounded. Growing infrastructure means running a fully-synced lightning node becomes increasingly expensive, which can price-out smaller nodes. - -More importantly, routing nodes must limit their `channel_update` messages, and therefore, their routing policies. Inputs to routing policy, including liquidity, onchain fees, and external factors, are highly dynamic, which means their policy should respond dynamically. Rate limits to `channel_update` messages causes advertised policy to differ from desired policy, which reduces control of routing resources (e.g liquidity). +The gossip protocol is characterized by it's ability to quickly and reliably deliver a *finite* amount of information across a large distributed system via propagation. 
However, because these messages are delivered to every node (potentially multiple times), node performance suffers with a growing *quantity* of data; the more data that's shared, the more network and computational resources are required by each node to process those messages. Therefore, the size and frequency of gossip messages must be constrained. Gossip is well-suited for the Bitcoin network because nodes seek *consistency* among each other, which is done by limiting transaction throughput. Data on the Lightning Network, however, is more dynamic and routing a payment does not require consistency among all nodes, which means message limits and standardness rules are unnecessarily restrictive. For example, routing policy (cltv delta, fees, etc) has highly dynamic inputs, including available liquidity, HTLC slots, onchain fees, and even external factors. Existing limits to `channel_update` messages prohibit routing policy from accurately reflecting desired policy. ### Finding Liquidity -For a payment to succeeed, a route requires sufficient liquidity in every channel of the path. From a data perspective, liquidity is state that is managed between two nodes. During pathfinding, the unpredictability of a channel's liquidity is referred to as *liquidity uncertainty*. Without any prior knowledge (i.e high uncertainty), the probability a path is feasible declines with the size of the payment and the number of channels used. Today, feasible paths are found by a process of trial-and-error, whereby liquidity ranges are temporarily narrowed using the results of previous payment attempts, but this approach has a host of issues: +For a payment to succeed, a route requires sufficient liquidity in every channel of the path. From a data perspective, liquidity is state that is managed between two nodes. During pathfinding, the unpredictability of a channel's liquidity is referred to as *liquidity uncertainty*. 
Without any prior knowledge (i.e high uncertainty), the probability a path is feasible declines with the size of the payment and the number of channels used. Today, feasible paths are found by a process of trial-and-error, whereby liquidity ranges are temporarily narrowed using the results of previous payment attempts, but this approach has a host of issues: -1. When liquidity uncertainty is high, pathfinding calculations increase payment success probabilities by favoring shorter paths and higher capacity channels. Not only does this effect the payment sender who's more likely to pay extra in fees for reliable liquidity, it's also a centralizing force on the network. +1. When liquidity uncertainty is high, pathfinding calculations increase payment success probabilities by favoring shorter paths and higher capacity channels. Not only does this affect the payment sender who's more likely to pay extra in fees for reliable liquidity, it's also a centralizing force on the network. -2. Lower payment success probabilities implies a larger set of potential routes. When the final route is unknown, routing fees (and other payment details) are more difficult to predict. +2. Lower payment success probabilities imply a larger set of potential routes. When the final route is uncertain, routing fees (and other payment details) are more difficult to predict. 3. Trial-and-error is a slow discovery process because: -a. HTLCs need to be set up and torn down at each hop, where each HTLC requires three round-trips between peers. +a. HTLCs require multiple message exchanges between peers, with each HTLC addition and removal requiring commitment/revocation cycles. b. Payments must be attempted serially to avoid the delivery of multiple successful payments. To improve performance for real payments, nodes may choose to 'probe' the channels of routing nodes using fake payments. However, liquidity is often highly dynamic and is always regressing to a state of uncertainty.
To be effective, nodes must actively monitor the network. @@ -45,8 +32,6 @@ To improve performance for real payments, nodes may choose to 'probe' the channe Furthermore, each of these problems represent a scaling constraint, as more nodes need to search for liquidity amongst a larger set of channels. -* * * - ## Proposal The goal of path queries is to reduce a node's dependence on the graph during pathfinding by leveraging the routing information of other nodes. Specifically, this feature includes the following optional messages which allows nodes to cooperatively construct a path: @@ -67,11 +52,11 @@ The goal of path queries is to reduce a node's dependence on the graph during pa - reason -Upon receiving a `query_path` message, a node can choose how it wants to respond, including rejecting or ignoring it. The `reply_path` message helps the requester - either a source or router - deliver a potential payment because it leverages routing information at the queried node. This resolves the liquidity uncertainty problem at the queried hop because a node knows it's own channel balances and can respond accordingly. A routing node can respond with any routing policy (e.g fees, expiry, etc) it desires, unconstrained by rate limits to gossip. Compared to payments, queries are lightweight and can be made concurrently. +Upon receiving a `query_path` message, a node can choose how it wants to respond, including rejecting or ignoring it. The `reply_path` message helps the requester - either a source or router - deliver a potential payment because it leverages routing information at the queried node. This resolves the liquidity uncertainty problem at the queried hop because a node knows its own channel balances and can respond accordingly. A routing node can respond with any routing policy (e.g fees, expiry, etc) it desires, unconstrained by limits to gossip. Compared to payments, queries are lightweight and can be made concurrently. 
## Putting into practice -The proposal outlines a basic set of messages, and it is up to the node to choose their own request & response strategies, including *who* they want to talk to (any subset of nodes), *what* they want to respond to (e.g minimum amounts) and any rate limits (number of requests and replies/paths). While there are innumerable strategies that may evolve, let's walk-through a simple example where all nodes adopt a PEER_ONLY strategy. Under this strategy, nodes only send `query_path` and `reply_path` messages to their direct channel peers. +The proposal outlines a basic set of messages, and it is up to each node to choose its own request & response strategies, including *who* they want to talk to (any subset of nodes), *what* they want to respond to (e.g minimum amounts) and any rate limits (number of requests and replies/paths). While there are innumerable strategies that may evolve, let's walk-through a simple example where all nodes adopt a PEER_ONLY strategy. Under this strategy, nodes only send relevant messages to their direct channel peers. Payment from S -> R ``` @@ -99,11 +84,11 @@ Before attempting the payment, the sender (S) may choose to query any subset of - Upon receiving the path details from Bob and Dave, Carol can now confidently assemble a MPP from herself to the receiver. She constructs the MPP, adds her own routing details and sends a `reply_path` to the sender. - Upon receiving the `reply_path` from Carol, the sender attempts the payment and on the first attempt, the payment succeeded. -As you can see, `query_path` messages concurrently spread amongst prospective routing nodes until a feasible path is discovered. After receiving a `reply_path` a node can prepend itself to the path and either back-propagate it to the source or attempt the payment. Each node knows it's channel balances and can therefore reduce the liquidity uncertainty for it's respective channels. 
+As you can see, `query_path` messages concurrently spread amongst prospective routing nodes until a feasible path is discovered. After receiving a `reply_path`, a node can prepend itself to the path and either back-propagate it to the source or attempt the payment. Each node knows its channel balances and can therefore reduce the liquidity uncertainty for its respective channels. While the small example above illustrates the process, it is important to consider the *rate* at which liquidity uncertainty is reduced; trial-and-error may work for a small network like this, but does not scale to a growing number of nodes. -### Comparisons to Trampoline +## Comparisons to Trampoline By querying one or more remote nodes (see [Anonymous queries via Onions](#anonymous-queries-via-onions)), a source node can construct a route similar to that used by trampoline, as demonstrated by the following example: @@ -114,28 +99,29 @@ By querying one or more remote nodes (see [Anonymous queries via Onions](#anonym Note that while the pathfinding process is similar to trampoline in that it leverages the pathfinding ability of other nodes, the final route is determined by the sender and a regular onion is used. -Each approach has it's own set of trade-offs. A regular payment onion gives the payment sender more control over routing decisions, including what route(s) to use and how to handle errors. When using a trampoline onion, many routing decisions are outsourced. For example, errors are returned to the previous trampoline hop and can be retried from that point, which may improve the payment delivery time, but may also produce a sub-optimal route (e.g more fees) from the sender's perspective. +Each approach has its own set of trade-offs. By learning the entire route, path queries give the payment sender more control over routing decisions, including the final route and how to handle errors. While using trampoline, many routing decisions are outsourced.
For example, errors are returned to the previous trampoline hop and can be retried from that point. This may improve the payment delivery time, but may also produce a sub-optimal route (e.g more fees) from the sender's perspective. For a user with multiple LSPs, path queries give the user the ability to 'shop' for the best route. -Routing nodes have an economic incentive to support both features in order to maximize routing fees. More importantly, trampoline nodes can also employ queries to find feasible sub-paths, thereby reducing their own dependence on a fully synced and actively probed graph. Graph maintenance is a cost that disproportionately effects nodes with smaller infrastructure and lower payment volume. Reducing these costs allows smaller nodes to be more competitive, and therefore, increases the expected distribution of routing. +Routing nodes have an economic incentive to support both features in order to maximize routing fees. More importantly, trampoline nodes can employ queries for themselves to find feasible sub-paths, thereby reducing their own dependence on a fully synced and actively probed graph. Graph maintenance is a cost that disproportionately affects nodes with smaller infrastructure and lower payment volume. Reducing these costs allows smaller nodes to be more competitive, which increases the expected distribution of routing. ## Expanding the Protocol -#### Anonymous queries via Onions +### Anonymous queries via Onions -The proposal as described above only supports messages using a *direct* connection between any two peers. This is sufficient for queries between channel peers because it reveals no new information about the source of a payment. However, this reduces anonymity when querying remote nodes, such as the example described in [Trampoline](#comparisons-to-trampoline) above. To improve anonymity, onions could be used to carry these messages.
However, without knowing the query source, responding nodes are vulnerable to spam, and consequentally, potential DoS attacks. This is a similar attack vector to channel jamming, but rather than using onions to consume routing resources (liquidity, HTLCs), query onions consume computational resources instead. To defend against spam, nodes may potentially require a small payment for anonymous recommendations. +The proposal as described above only supports messages using a *direct* connection between any two peers. This is sufficient for queries between channel peers because it reveals no new information about the source of a payment. However, this reduces anonymity when querying remote nodes, such as the example described in [Trampoline](#comparisons-to-trampoline) above. To improve anonymity, onions could be used to carry these messages. However, without knowing the query source, responding nodes are vulnerable to spam and, consequently, potential DoS attacks. This is a similar attack vector to channel jamming, but rather than using onions to consume channel resources (liquidity, HTLCs), query onions consume a node's computational resources instead. To defend against spam, nodes can potentially require a small payment for anonymous recommendations. -#### Adding fields +### Adding message fields -The messages defined in this proposal are intentionally bare. Optional message fields can be added to enhance a node's capabilites and to reduce the number of messages between peers. +The messages defined in this proposal are intentionally minimal to communicate the core concept while avoiding additional complexity. That said, optional message fields can be added to enhance a node's capabilities and to reduce the number of messages between peers.
Some examples: -- `query_path` +- `query_path` fields: - `maximum_fee`, `cltv_expiration` - reduce response messages by providing upfront filters - - `expiration` - A querying node can define the window of time they're interested in a given path (e.g 1min, 1hr, 1day, always) and get notified with updates. -- `reply_path` + - `query_expiration` - A querying node can define the window of time they're interested in a given path (e.g 1min, 1hr, 1day, always) and get notified with updates. +- `reply_path` fields: - `confidence` (ranged interval) - a score to indicate the expected likelihood of payment delivery; the higher the routing node's confidence, the more a path suggestion behaves like a *quote* for delivery. This may be used by a querying node to weigh the value of a responding nodes offered paths, especially if there's a cost for `reply_path`s, such as onion queries as described above. Unlike forwarding endorsements (e.g [HTLC endorsement](https://github.com/lightning/bolts/pull/1071)), this value would be back-propogated in the `reply_path` messages, so does not leak information about the origin. ## Potential Concerns -#### Privacy Implications + +### Privacy Implications Naturally, any time information is shared, there is a privacy implication. A `query_path` reveals a downstream node - either a hop or the destination - to the prospective routing node. When iterated upon, each node in the path becomes aware of the *queried* destination. Meanwhile, the selected channels in a `reply_path` may reveal some information about channel balances. As so, let's consider channel balance privacy and sender/receiver anonymity: @@ -147,12 +133,12 @@ Generally speaking, the more channels a node has, the more difficult it is to in *Sender Anonymity* -While a single query does not tell a routing node about the source of a payment, the number of queries a routing node receives and whom they come from may reduce the anonymity set of the *query* origin. 
Depending on the nature of the payment, the sender may choose it's own path construction process, including adding trampoline-like hops or opting out of queries altogether. +While a single query does not tell a routing node about the source of a payment, the number of queries a routing node receives and whom they come from may reduce the anonymity set of the *query* origin. Depending on the nature of the payment, the sender may choose its own path construction process, including adding trampoline-like hops or opting out of queries altogether. *Receiver Anonymity* -While the receiver does not have a choice in the sender's routing process, they do get to choose the final sub-path via route blinding. Using path queries, a receiver can construct more reliable paths to itself; the longer the path, the more anonymity from the sender and it's set of routing nodes. The receiver may also choose to construct the blinded path using trampoline-like hops to prevent routing nodes from inferring full paths. +While the receiver does not have a choice in the sender's routing process, they do get to choose the final sub-path via route blinding. Using path queries, a receiver can construct more reliable paths to itself; the longer the path, the more anonymity from the sender and its set of routing nodes. The receiver may also choose to construct the blinded path using trampoline-like hops to prevent routing nodes from inferring full paths. -#### Denial-of-service risks +### Denial-of-service risks Nodes may choose their own response strategies, including filtering requests (e.g minimum amount) and setting rate limits. In order to enforce rate limits, a node either needs to know the source of the query or needs to enforce some cost on anonymous queries. 
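As an illustration of the response strategies mentioned above, here is a minimal sketch of how a node might combine a minimum-amount filter with a per-peer rate limit for queries whose source is known. Nothing here is part of the proposal: the names `QueryPolicy`, `TokenBucket`, and `decide`, the token-bucket approach, the default limits, and the choice to silently ignore over-limit queries are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class TokenBucket:
    """Per-peer bucket that refills `rate` tokens per second up to `capacity`."""
    capacity: float
    rate: float
    tokens: float
    updated: float

    def take(self, now: float) -> bool:
        # Refill for the elapsed time, capped at capacity, then spend one token.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


@dataclass
class QueryPolicy:
    """Hypothetical response strategy for incoming `query_path` messages."""
    min_amount_msat: int
    burst: float = 5.0            # assumed: queries a peer may send back-to-back
    refill_per_sec: float = 1.0   # assumed: sustained queries per second per peer
    clock: Callable[[], float] = time.monotonic
    buckets: Dict[str, TokenBucket] = field(default_factory=dict)

    def decide(self, peer_id: str, amount_msat: int) -> str:
        """Return "reply", "reject", or "ignore" for a query from a known peer."""
        now = self.clock()
        bucket = self.buckets.get(peer_id)
        if bucket is None:
            bucket = TokenBucket(self.burst, self.refill_per_sec, self.burst, now)
            self.buckets[peer_id] = bucket
        if not bucket.take(now):
            return "ignore"   # over the rate limit: stay silent
        if amount_msat < self.min_amount_msat:
            return "reject"   # would map to a `reject_query_path` with a reason
        return "reply"        # proceed to pathfinding and a `reply_path`
```

The rate limit is checked before the amount filter so that rejected queries also consume a token; a node could just as reasonably order these the other way. Anonymous (onion) queries have no `peer_id` to key on, which is why the text above suggests a cost-based defence for them instead.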
From d5815ab13c0bd8e0c8966c43fecdebe947b4333c Mon Sep 17 00:00:00 2001 From: Benjamin Hindman Date: Wed, 4 Jun 2025 10:21:24 -0500 Subject: [PATCH 6/6] formatting --- proposals/path-queries.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/proposals/path-queries.md b/proposals/path-queries.md index 980f1c658..53fbb0bc0 100644 --- a/proposals/path-queries.md +++ b/proposals/path-queries.md @@ -11,7 +11,7 @@ While finding a feasible path, source-based routing requires information about t ### Limitations of gossip -The gossip protocol is characterized by it's ability to quickly and reliably deliver a *finite* amount of information across a large distributed system via propagation. However, because these messages are delivered to every node (potentially multiple times), node performance suffers with a growing *quantity* of data; the more data that's shared, the more network and computational resources are required by each node to process those messages. Therefore, the size and frequency of gossip messages must be constrained. Gossip is well-suited for the Bitcoin network because nodes seek *consistency* among each other, which is done by limiting transaction throughput. Data on the Lightning Network, however, is more dynamic and routing a payment does not require consistency among all nodes, which means message limits and standardness rules are unnecessarily restrictive. For example, routing policy (cltv delta, fees, etc) has highly dynamic inputs, including available liquidity, HTLC slots, onchain fees, and even external factors. Existing limits to `channel_update` messages prohibit routing policy from accurately reflecting desired policy. +The gossip protocol is characterized by its ability to quickly and reliably deliver a *finite* amount of information across a large distributed system via propagation.
However, because these messages are delivered to every node (potentially multiple times), node performance suffers with a growing *quantity* of data; the more data that's shared, the more network and computational resources are required by each node to process those messages. Therefore, the size and frequency of gossip messages must be constrained. Gossip is well-suited for the Bitcoin network because nodes seek *consistency* among each other, which is done by limiting transaction throughput. Data on the Lightning Network, however, is more dynamic and routing a payment does not require consistency among all nodes, which means message limits and standardness rules are unnecessarily restrictive. For example, routing policy (cltv delta, fees, etc) has highly dynamic inputs, including available liquidity, HTLC slots, onchain fees, and even external factors. Existing limits to `channel_update` messages prohibit nodes from accurately reflecting their desired policy. ### Finding Liquidity @@ -23,10 +23,11 @@ For a payment to succeed, a route requires sufficient liquidity in every channel 3. Trial-and-error is a slow discovery process because: -a. HTLCs require multiple message exchanges between peers, with each HTLC addition and removal requiring commitment/revocation cycles. -b. Payments must be attempted serially to avoid the delivery of multiple successful payments. + a. HTLCs require multiple message exchanges between peers, with each HTLC addition and removal requiring commitment/revocation cycles. -To improve performance for real payments, nodes may choose to 'probe' the channels of routing nodes using fake payments. However, liquidity is often highly dynamic and is always regressing to a state of uncertainty. To be effective, nodes must actively monitor the network. + b. Payments must be attempted serially to avoid the delivery of multiple successful payments. 
+ + To improve performance for real payments, nodes may choose to 'probe' the channels of routing nodes using fake payments. However, liquidity is often highly dynamic and is always regressing to a state of uncertainty. To be effective, nodes must actively monitor the network. 4. Failed payments (both real and fake) are a burden to routing nodes in the failing sub-path in the form of locked liquidity, HTLC slots, and wasted system resources.