
Commit c2c94e0

Merge pull request #889 from lightninglabs/docs-lnd
Update lnd documentation
2 parents 83a2b5d + 1ad1a7a commit c2c94e0

File tree

3 files changed: +333 additions, 0 deletions

docs/lnd/gossip_rate_limiting.md

Lines changed: 260 additions & 0 deletions
# Gossip Rate Limiting Configuration Guide

When running a Lightning node, one of the most critical yet often overlooked
aspects is properly configuring the gossip rate limiting system. This guide will
help you understand how LND manages outbound gossip traffic and how to tune
these settings for your specific needs.

## Understanding Gossip Rate Limiting

At its core, LND uses a token bucket algorithm to control how much bandwidth it
dedicates to sending gossip messages to other nodes. Think of it as a bucket
that fills with tokens at a steady rate. Each time your node sends a gossip
message, it consumes tokens equal to the message size. If the bucket runs dry,
messages must wait until enough tokens accumulate.
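To make the mechanics concrete, here is a minimal sketch of a token bucket in
Go using the `golang.org/x/time/rate` package. The names and values are
illustrative only (they mirror the defaults discussed later in this guide), not
LND's internal implementation.

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/time/rate"
)

func main() {
	// Illustrative values mirroring the defaults discussed below:
	// the bucket refills at 100 KB/s and holds at most 200 KB.
	const msgRateBytes = 102_400
	const msgBurstBytes = 204_800

	limiter := rate.NewLimiter(rate.Limit(msgRateBytes), msgBurstBytes)

	msg := make([]byte, 1_200) // a hypothetical gossip message

	// Block until enough tokens have accumulated to cover the message size.
	// WaitN fails immediately if the message is larger than the burst size,
	// since that many tokens can never accumulate.
	if err := limiter.WaitN(context.Background(), len(msg)); err != nil {
		fmt.Println("cannot send:", err)
		return
	}
	fmt.Printf("sent %d bytes\n", len(msg))
}
```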
This system serves an important purpose: it prevents any single peer, or group
of peers, from overwhelming your node's network resources. Without rate
limiting, a misbehaving peer could request your entire channel graph repeatedly,
consuming all your bandwidth and preventing normal operation.

## Core Configuration Options

The gossip rate limiting system has several configuration options that work
together to control your node's behavior.

### Setting the Sustained Rate: gossip.msg-rate-bytes

The most fundamental setting is `gossip.msg-rate-bytes`, which determines how
many bytes per second your node will allocate to outbound gossip messages. This
rate is shared across all connected peers, not per-peer.

The default value of 102,400 bytes per second (100 KB/s) works well for most
nodes, but you may need to adjust it based on your situation. Setting this value
too low can cause serious problems. When the rate limit is exhausted, peers
waiting to synchronize must queue up, potentially waiting minutes between
messages. Values below 50 KB/s can make initial synchronization fail entirely,
as peers time out before receiving the data they need.
### Managing Burst Capacity: gossip.msg-burst-bytes

The burst capacity, configured via `gossip.msg-burst-bytes`, determines the
initial capacity of your token bucket. This value must be greater than
`gossip.msg-rate-bytes` for the rate limiter to function properly. The burst
capacity represents the maximum number of bytes that can be sent immediately
when the bucket is full.

The default of 204,800 bytes (200 KB) is set to be double the default rate
(100 KB/s), providing a good balance. This ensures that when the rate limiter
starts or after a period of inactivity, you can send up to 200 KB worth of
messages immediately before rate limiting kicks in. Any single message larger
than this value can never be sent, regardless of how long you wait.

### Controlling Concurrent Operations: gossip.filter-concurrency

When peers apply gossip filters to request specific channel updates, these
operations can consume significant resources. The `gossip.filter-concurrency`
setting limits how many of these operations can run simultaneously. The default
value of 5 provides a reasonable balance between resource usage and
responsiveness.

Large routing nodes handling many simultaneous peer connections might benefit
from increasing this value to 10 or 15, while resource-constrained nodes should
keep it at the default or even reduce it slightly.
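In effect, the setting behaves like a counting semaphore: at most that many
filter operations run at once and the rest wait their turn. The sketch below
shows the generic Go pattern; it is not LND's actual gossiper code, and the
peer count and sleep are stand-ins for real work.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	const filterConcurrency = 5 // mirrors gossip.filter-concurrency

	sem := make(chan struct{}, filterConcurrency) // counting semaphore
	var wg sync.WaitGroup

	for peer := 1; peer <= 20; peer++ {
		wg.Add(1)
		go func(peer int) {
			defer wg.Done()

			sem <- struct{}{}        // acquire a slot (blocks when all are busy)
			defer func() { <-sem }() // release the slot when done

			// Stand-in for applying a peer's gossip filter.
			time.Sleep(50 * time.Millisecond)
			fmt.Printf("applied gossip filter for peer %d\n", peer)
		}(peer)
	}
	wg.Wait()
}
```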
### Understanding Connection Limits: num-restricted-slots

The `num-restricted-slots` configuration deserves special attention because it
directly affects your gossip bandwidth requirements. This setting limits inbound
connections, but not in the way you might expect.

LND maintains a three-tier system for peer connections. Peers you've ever had
channels with enjoy "protected" status and can always connect. Peers currently
opening channels with you have "temporary" status. Everyone else—new peers
without channels—must compete for the limited "restricted" slots.

When a new peer without channels connects inbound, they consume one restricted
slot. If all slots are full, additional peers are turned away. However, as soon
as a restricted peer begins opening a channel, they're upgraded to temporary
status, freeing their slot. This creates breathing room for large nodes to form
new channel relationships without constantly rejecting connections.
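As a rough illustration of that admission logic, the sketch below models the
three tiers and the restricted-slot counter. The types and functions are
hypothetical, meant only to show how protected and temporary peers bypass the
slot limit while new peers consume it.

```go
package main

import "fmt"

// peerAccess models the three tiers described above. The names are
// illustrative, not LND's actual types.
type peerAccess int

const (
	accessRestricted peerAccess = iota // new peer, no channel history
	accessTemporary                    // currently opening a channel with us
	accessProtected                    // has (or had) channels with us
)

type slotGate struct {
	maxRestricted  int // mirrors num-restricted-slots
	usedRestricted int
}

// admit reports whether an inbound peer may connect. Only restricted
// peers count against the slot limit.
func (g *slotGate) admit(a peerAccess) bool {
	if a == accessProtected || a == accessTemporary {
		return true
	}
	if g.usedRestricted >= g.maxRestricted {
		return false // all restricted slots are in use
	}
	g.usedRestricted++
	return true
}

// upgrade frees a restricted slot once a peer starts opening a channel.
func (g *slotGate) upgrade() {
	if g.usedRestricted > 0 {
		g.usedRestricted--
	}
}

func main() {
	gate := &slotGate{maxRestricted: 2}
	fmt.Println(gate.admit(accessRestricted)) // true, slot 1
	fmt.Println(gate.admit(accessRestricted)) // true, slot 2
	fmt.Println(gate.admit(accessRestricted)) // false, slots full
	gate.upgrade()                            // a peer starts opening a channel
	fmt.Println(gate.admit(accessRestricted)) // true again
	fmt.Println(gate.admit(accessProtected))  // always true
}
```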
The relationship between restricted slots and rate limiting is straightforward:
more allowed connections mean more peers requesting data, requiring more
bandwidth. A reasonable rule of thumb is to allocate at least 1 KB/s of rate
limit per restricted slot.

## Calculating Appropriate Values

To set these values correctly, you need to understand your node's position in
the network and its typical workload. The fundamental question is: how much
gossip traffic does your node actually need to handle?

Start by considering how many peers typically connect to your node. A hobbyist
node might have 10-20 connections, while a well-connected routing node could
easily exceed 100. Each peer generates gossip traffic when syncing channel
updates, announcing new channels, or requesting historical data.

The calculation itself is straightforward. Take your average message size
(approximately 210 bytes for gossip messages), multiply by your peer count and
expected message frequency, then apply a safety factor for traffic spikes. Since
each channel generates approximately 842 bytes of bandwidth (including both
channel announcements and updates), you can also calculate based on your
channel count. Here's the formula:

```
rate = avg_msg_size × peer_count × msgs_per_second × safety_factor
```
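As a sanity check, here is a small Go helper that evaluates this formula. The
function is just the arithmetic, not anything LND exposes, and the numbers fed
into it are the small-node figures from the first example that follows.

```go
package main

import "fmt"

// gossipRate returns a suggested gossip.msg-rate-bytes value in bytes/s.
func gossipRate(avgMsgSize, peerCount int, msgsPerSecond, safetyFactor float64) float64 {
	return float64(avgMsgSize) * float64(peerCount) * msgsPerSecond * safetyFactor
}

func main() {
	// Small node: 15 peers, ~10 msgs/peer/s, 210-byte average, 1.5x headroom.
	rate := gossipRate(210, 15, 10, 1.5)
	fmt.Printf("%.0f bytes/s (~%.0f KB/s)\n", rate, rate/1000)
	// Prints 47250 bytes/s (~47 KB/s), matching the small-node estimate below.
}
```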
Let's walk through some real-world examples to make this concrete.

For a small node with 15 peers, you might see 10 messages per peer per second
during normal operation. With an average message size of 210 bytes and a safety
factor of 1.5, you'd need about 47 KB/s. Rounding up to 50 KB/s provides
comfortable headroom.

A medium-sized node with 75 peers faces different challenges. These nodes often
relay more traffic and handle more frequent updates. With 15 messages per peer
per second, the base calculation comes to about 237 KB/s before the safety
factor. Setting the limit to 250 KB/s ensures smooth operation without waste.

Large routing nodes require the most careful consideration. With 150 or more
peers and high message frequency, bandwidth requirements can exceed 1 MB/s.
These nodes form the backbone of the Lightning Network and need generous
allocations to serve their peers effectively.

Remember that the relationship between restricted slots and rate limiting is
direct: each additional slot potentially adds another peer requesting data. Plan
for at least 1 KB/s per restricted slot to maintain healthy synchronization.
## Network Size and Geography

The Lightning Network's growth directly impacts your gossip bandwidth needs.
With over 80,000 public channels at the time of writing, each generating
multiple updates daily, the volume of gossip traffic continues to increase. A
channel update occurs whenever a node adjusts its fees, changes its routing
policy, or goes offline temporarily. During volatile market conditions or fee
market adjustments, update frequency can spike dramatically.

Geographic distribution adds another layer of complexity. If your node connects
to peers across continents, the inherent network latency affects how quickly you
can exchange messages. However, this primarily impacts initial connection
establishment rather than ongoing rate limiting.

## Troubleshooting Common Issues

When rate limiting isn't configured properly, the symptoms are often subtle at
first but can cascade into serious problems.

The most common issue is slow initial synchronization. New peers attempting to
download your channel graph experience long delays between messages. You'll see
entries in your logs like "rate limiting gossip replies, responding in 30s" or
even longer delays. This happens because the rate limiter has exhausted its
tokens and must wait for a refill. The solution is straightforward: increase
your msg-rate-bytes setting.

Peer disconnections present a more serious problem. When peers wait too long for
gossip responses, they may time out and disconnect. This creates a vicious cycle
where peers repeatedly connect, attempt to sync, time out, and reconnect. Look
for "peer timeout" errors in your logs. If you see these, you need to increase
your rate limit.

Sometimes you'll notice unusually high CPU usage from your LND process. This
often indicates that many goroutines are blocked waiting for rate limiter
tokens. The rate limiter must constantly calculate delays and manage waiting
goroutines. Increasing the rate limit reduces this contention and lowers CPU
usage.
To debug these issues, focus on your LND logs rather than high-level commands.
Search for "rate limiting" messages to understand how often delays occur and how
long they last. Look for patterns in peer disconnections that might correlate
with rate limiting delays. The specific commands that matter are:

```bash
# View peer connections and sync state
lncli listpeers | grep -A5 "sync_type"

# Check recent rate limiting events
grep "rate limiting" ~/.lnd/logs/bitcoin/mainnet/lnd.log | tail -20
```

Pay attention to log entries showing "Timestamp range queue full" if you've
implemented the queue-based approach—this indicates your system is shedding load
due to overwhelming demand.
## Best Practices for Configuration

Experience has shown that starting with conservative (higher) rate limits and
reducing them if needed works better than starting too low and debugging
problems. It's much easier to notice excess bandwidth usage than to diagnose
subtle synchronization failures.

Monitor your node's actual bandwidth usage and sync times after making changes.
Most operating systems provide tools to track network usage per process. When
adjusting settings, make gradual changes of 25-50% rather than dramatic shifts.
This helps you understand the impact of each change and find the sweet spot for
your setup.

Keep your burst size at least double the largest message size you expect to
send. While the default 200 KB is usually sufficient, monitor your logs for any
"message too large" errors that would indicate a need to increase this value.

As your node grows and attracts more peers, revisit these settings periodically.
What works for 50 peers may cause problems with 150 peers. Regular review
prevents gradual degradation as conditions change.
## Configuration Examples

For most users running a personal node, conservative settings provide reliable
operation without excessive resource usage:

```
[Application Options]
gossip.msg-rate-bytes=204800
gossip.msg-burst-bytes=409600
gossip.filter-concurrency=5
num-restricted-slots=100
```

Well-connected nodes that route payments regularly need more generous
allocations:

```
[Application Options]
gossip.msg-rate-bytes=524288
gossip.msg-burst-bytes=1048576
gossip.filter-concurrency=10
num-restricted-slots=200
```

Large routing nodes at the heart of the network require the most resources:

```
[Application Options]
gossip.msg-rate-bytes=1048576
gossip.msg-burst-bytes=2097152
gossip.filter-concurrency=15
num-restricted-slots=300
```
238+
239+
## Critical Warning About Low Values
240+
241+
Setting `gossip.msg-rate-bytes` below 50 KB/s creates serious operational
242+
problems that may not be immediately obvious. Initial synchronization, which
243+
typically transfers 10-20 MB of channel graph data, can take hours or fail
244+
entirely. Peers appear to connect but remain stuck in a synchronization loop,
245+
never completing their initial download.
246+
247+
Your channel graph remains perpetually outdated, causing routing failures as you
248+
attempt to use channels that have closed or changed their fee policies. The
249+
gossip subsystem appears to work, but operates so slowly that it cannot keep
250+
pace with network changes.
251+
252+
During normal operation, a well-connected node processes hundreds of channel
253+
updates per minute. Each update is small, but they add up quickly. Factor in
254+
occasional bursts during network-wide fee adjustments or major routing node
255+
policy changes, and you need substantial headroom above the theoretical minimum.
256+
257+
The absolute minimum viable configuration requires at least enough bandwidth to
258+
complete initial sync in under an hour and process ongoing updates without
259+
falling behind. This translates to no less than 50 KB/s for even the smallest
260+
nodes.
Lines changed: 65 additions & 0 deletions
# Release Notes
- [Bug Fixes](#bug-fixes)
- [New Features](#new-features)
  - [Functional Enhancements](#functional-enhancements)
  - [RPC Additions](#rpc-additions)
  - [lncli Additions](#lncli-additions)
- [Improvements](#improvements)
  - [Functional Updates](#functional-updates)
  - [RPC Updates](#rpc-updates)
  - [lncli Updates](#lncli-updates)
  - [Breaking Changes](#breaking-changes)
  - [Performance Improvements](#performance-improvements)
  - [Deprecations](#deprecations)
- [Technical and Architectural Updates](#technical-and-architectural-updates)
  - [BOLT Spec Updates](#bolt-spec-updates)
  - [Testing](#testing)
  - [Database](#database)
  - [Code Health](#code-health)
  - [Tooling and Documentation](#tooling-and-documentation)

# Bug Fixes

- [Fixed](https://github.com/lightningnetwork/lnd/pull/10097) a deadlock that
  could occur when multiple goroutines attempted to send gossip filter backlog
  messages simultaneously. The fix ensures only a single goroutine processes the
  backlog at any given time using an atomic flag.
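For context, the single-flight pattern that note describes typically looks like
the sketch below: an atomic compare-and-swap lets one goroutine enter the
critical section while the others return immediately. This is a generic
illustration of the pattern, not the exact code from the PR.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type backlogSender struct {
	sending atomic.Bool // true while one goroutine owns the backlog
}

// trySendBacklog processes the filter backlog unless another goroutine is
// already doing so, in which case it returns immediately.
func (b *backlogSender) trySendBacklog() {
	if !b.sending.CompareAndSwap(false, true) {
		return // someone else is already sending the backlog
	}
	defer b.sending.Store(false)

	fmt.Println("sending gossip filter backlog")
}

func main() {
	var b backlogSender
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			b.trySendBacklog()
		}()
	}
	wg.Wait()
}
```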
# New Features

## Functional Enhancements

## RPC Additions

## lncli Additions

# Improvements

## Functional Updates

## RPC Updates

## lncli Updates

## Code Health

## Breaking Changes

## Performance Improvements

## Deprecations

# Technical and Architectural Updates

## BOLT Spec Updates

## Testing

## Database

## Code Health

## Tooling and Documentation

# Contributors (Alphabetical Order)
* Olaoluwa Osuntokun

docs/lnd/release-notes/release-notes-0.20.0.md

Lines changed: 8 additions & 0 deletions
@@ -29,6 +29,10 @@
- Fixed [shutdown deadlock](https://github.com/lightningnetwork/lnd/pull/10042)
  when we fail starting up LND before we start up the chanbackup sub-server.

- Fixed BOLT-11 invoice parsing behavior: [now errors](
  https://github.com/lightningnetwork/lnd/pull/9993) are returned when receiving
  empty route hints or a non-UTF-8-encoded description.

- [Fixed](https://github.com/lightningnetwork/lnd/pull/10027) an issue where
  known TLV fields were incorrectly encoded into the `ExtraData` field of
  messages in the dynamic commitment set.

@@ -45,6 +49,10 @@

## Functional Enhancements

* RPCs `walletrpc.EstimateFee` and `walletrpc.FundPsbt` now
  [allow](https://github.com/lightningnetwork/lnd/pull/10087)
  `conf_target=1`. Previously they required `conf_target >= 2`.

## RPC Additions
* When querying [`ForwardingEvents`](https://github.com/lightningnetwork/lnd/pull/9813)
  logs, the response now includes the incoming and outgoing htlc indices of the payment
