Skip to content

Commit b402148

Browse files
committed
💡 feat: add NAT monitoring and hole punching logging for improved connection diagnostics
1 parent 1632d65 commit b402148

File tree

4 files changed

+330
-1
lines changed

4 files changed

+330
-1
lines changed

docs/NAT-Monitoring-Logs.md

Lines changed: 206 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,206 @@
1+
# NAT Traversal and Hole Punching Monitoring
2+
3+
## Overview
4+
5+
Skypier VPN now includes comprehensive logging for NAT traversal, AutoNAT detection, and hole punching events. This helps diagnose connection issues and understand how peers are connecting to your node.
6+
7+
## New Log Components
8+
9+
Two new log components have been added:
10+
11+
- `[NAT]` - AutoNAT status and reachability monitoring
12+
- `[HOLEPUNCH]` - Connection establishment methods and hole punching monitoring
13+
14+
## Log Examples
15+
16+
### Node Startup
17+
18+
When your node starts, you'll see:
19+
20+
```
21+
[INF] [NAT ] AutoNAT monitoring started
22+
[INF] [HOLEPUNCH] Connection monitoring started
23+
```
24+
25+
### If Your Node is Publicly Reachable
26+
27+
After ~5 seconds, you'll see one of these:
28+
29+
```
30+
[INF] [NAT ] ✓ Node appears PUBLICLY reachable (has public addresses)
31+
[DBG] [NAT ] Public address detected: /ip4/203.0.113.45/udp/4002/quic-v1
32+
```
33+
34+
### If Your Node is Behind NAT
35+
36+
```
37+
[WRN] [NAT ] ⚠ Node appears behind NAT/firewall (only private addresses)
38+
[INF] [NAT ] Hole punching or relay will be used for incoming connections
39+
```
40+
41+
### Connection Methods
42+
43+
When peers connect, you'll see different messages based on the connection method:
44+
45+
#### Direct QUIC Connection (Outbound)
46+
47+
```
48+
[INF] [HOLEPUNCH] ✓ DIRECT outbound QUIC connection to peer 16Uiu2HAm...
49+
[DBG] [HOLEPUNCH] └─ Local: /ip4/192.168.1.100/udp/4002/quic-v1
50+
[DBG] [HOLEPUNCH] └─ Remote: /ip4/203.0.113.45/udp/4002/quic-v1
51+
```
52+
53+
#### Direct QUIC Connection (Inbound - Hole Punch Success!)
54+
55+
```
56+
[INF] [HOLEPUNCH] ✓ DIRECT inbound QUIC connection from peer 16Uiu2HAm...
57+
[DBG] [HOLEPUNCH] └─ Peer successfully connected to us (hole punch or public)
58+
[DBG] [HOLEPUNCH] └─ Local: /ip4/192.168.1.100/udp/4002/quic-v1
59+
[DBG] [HOLEPUNCH] └─ Remote: /ip4/198.51.100.23/udp/54321/quic-v1
60+
```
61+
62+
This indicates successful hole punching or that your node is publicly reachable!
63+
64+
#### Relayed Connection (Hole Punch Failed)
65+
66+
```
67+
[INF] [HOLEPUNCH] 🔄 RELAYED connection established with peer 16Uiu2HAm...
68+
[DBG] [HOLEPUNCH] └─ via circuit relay: /ip4/147.75.77.187/tcp/4001/p2p/QmNnoo.../p2p-circuit
69+
```
70+
71+
This means direct connection failed and traffic is going through a relay server.
72+
73+
#### TCP Connection
74+
75+
```
76+
[INF] [HOLEPUNCH] ✓ DIRECT outbound TCP connection to peer 16Uiu2HAm...
77+
[DBG] [HOLEPUNCH] └─ Local: /ip4/192.168.1.100/tcp/4001
78+
[DBG] [HOLEPUNCH] └─ Remote: /ip4/203.0.113.45/tcp/4001
79+
```
80+
81+
#### Resource-Limited Connection
82+
83+
```
84+
[INF] [HOLEPUNCH] ✓ DIRECT outbound QUIC connection to peer 16Uiu2HAm...
85+
[WRN] [HOLEPUNCH] └─ ⚠ Connection is RATE LIMITED by resource manager
86+
```
87+
88+
This means the connection is being throttled due to resource constraints.
89+
90+
### Peer Disconnection
91+
92+
```
93+
[DBG] [HOLEPUNCH] Peer disconnected: 16Uiu2HAm...
94+
```
95+
96+
## Interpreting the Logs
97+
98+
### What Success Looks Like
99+
100+
**Best Case: Public Node**
101+
```
102+
[INF] [NAT ] ✓ Node appears PUBLICLY reachable
103+
[INF] [HOLEPUNCH] ✓ DIRECT inbound QUIC connection from peer 16Uiu2H...
104+
```
105+
106+
**Good Case: NAT with Working Hole Punching**
107+
```
108+
[WRN] [NAT ] ⚠ Node appears behind NAT/firewall
109+
[INF] [HOLEPUNCH] ✓ DIRECT inbound QUIC connection from peer 16Uiu2H...
110+
[DBG] [HOLEPUNCH] └─ Peer successfully connected to us (hole punch or public)
111+
```
112+
113+
### What Problems Look Like
114+
115+
**All Connections are Relayed**
116+
```
117+
[WRN] [NAT ] ⚠ Node appears behind NAT/firewall
118+
[INF] [HOLEPUNCH] 🔄 RELAYED connection established with peer 16Uiu2H...
119+
[INF] [HOLEPUNCH] 🔄 RELAYED connection established with peer 16Uiu2H...
120+
```
121+
122+
This indicates:
123+
- Hole punching is not working
124+
- You're behind a strict/symmetric NAT
125+
- All traffic is routed through relay servers (slower)
126+
127+
**Solutions:**
128+
1. Enable UPnP on your router
129+
2. Set up port forwarding (UDP port 4002)
130+
3. Check firewall settings
131+
132+
## Log Levels
133+
134+
Use different log levels to control verbosity:
135+
136+
- `LogLevel: "error"` - Only errors
137+
- `LogLevel: "warn"` - Errors and warnings (NAT warnings, relay usage)
138+
- `LogLevel: "info"` - Normal operation (connection established messages) **[Recommended]**
139+
- `LogLevel: "debug"` - Detailed information (addresses, periodic checks)
140+
141+
Set in `/etc/skypier/config.json`:
142+
143+
```json
144+
{
145+
"LogLevel": "info"
146+
}
147+
```
148+
149+
## Monitoring in Real-Time
150+
151+
To see only NAT and connection logs:
152+
153+
```bash
154+
sudo /usr/local/bin/skypier-vpn | grep -E '\[NAT|HOLEPUNCH\]'
155+
```
156+
157+
Or to see all connection-related logs:
158+
159+
```bash
160+
sudo /usr/local/bin/skypier-vpn | grep -E '\[NAT|HOLEPUNCH|WATCHER\]'
161+
```
162+
163+
## Troubleshooting NAT Issues
164+
165+
### Check Current Status
166+
167+
1. Start your node
168+
2. Wait 5-10 seconds
169+
3. Look for the NAT status message
170+
171+
### If You See "Behind NAT" but Want Direct Connections
172+
173+
1. **Enable UPnP on your router** (easiest)
174+
2. **Set up port forwarding:**
175+
- Forward UDP port 4002 to your node's internal IP
176+
- Some nodes may also use TCP port 4001
177+
3. **Check your node's addresses:**
178+
```bash
179+
# Look for public IP addresses in the node startup logs
180+
grep "Listening on addresses" /var/log/skypier.log
181+
```
182+
183+
### If All Connections are Relayed
184+
185+
This means hole punching isn't working. Check:
186+
187+
1. Your NAT type (use online NAT type detection tools)
188+
2. Whether UPnP is enabled on your router
189+
3. Firewall rules blocking UDP traffic
190+
4. If you're behind multiple layers of NAT (e.g., ISP + home router)
191+
192+
Symmetric NAT is the most restrictive and often requires relay connections.
193+
194+
## Performance Impact
195+
196+
**Direct Connections (Best):**
197+
- Lowest latency
198+
- Highest throughput
199+
- No relay overhead
200+
201+
**Relayed Connections (Fallback):**
202+
- Higher latency (+50-200ms)
203+
- Lower throughput (bandwidth shared with relay)
204+
- Depends on relay server performance
205+
206+
The logs will help you identify which peers are using relays so you can optimize your network setup.

pkg/utils/logger.go

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,8 @@ const (
6161
ComponentNegotiate Component = "NEGOTIATE"
6262
ComponentCrypto Component = "CRYPTO"
6363
ComponentWatcher Component = "WATCHER"
64+
ComponentNAT Component = "NAT"
65+
ComponentHolePunch Component = "HOLEPUNCH"
6466
)
6567

6668
// componentColors maps components to their display colors
@@ -77,6 +79,8 @@ var componentColors = map[Component]string{
7779
ComponentNegotiate: ColorBlue,
7880
ComponentCrypto: ColorGray,
7981
ComponentWatcher: ColorBoldYellow,
82+
ComponentNAT: ColorBoldMagenta,
83+
ComponentHolePunch: ColorBoldGreen,
8084
}
8185

8286
// levelColors maps log levels to their display colors
@@ -329,4 +333,6 @@ var (
329333
NegotiateLog = NewLogger(ComponentNegotiate)
330334
CryptoLog = NewLogger(ComponentCrypto)
331335
WatcherLog = NewLogger(ComponentWatcher)
336+
NATLog = NewLogger(ComponentNAT)
337+
HolePunchLog = NewLogger(ComponentHolePunch)
332338
)

pkg/vpn/p2p.go

Lines changed: 117 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@ import (
1111
"github.com/libp2p/go-libp2p"
1212
dht "github.com/libp2p/go-libp2p-kad-dht"
1313
"github.com/libp2p/go-libp2p/core/crypto"
14+
"github.com/libp2p/go-libp2p/core/event"
1415
"github.com/libp2p/go-libp2p/core/host"
1516
"github.com/libp2p/go-libp2p/core/network"
1617
peerstore "github.com/libp2p/go-libp2p/core/peer"
@@ -109,6 +110,8 @@ func StartNode(innerConfig utils.InnerConfig, pk crypto.PrivKey, tcpPort string,
109110
libp2p.Transport(quic.NewTransport),
110111
libp2p.Transport(tcp.NewTCPTransport),
111112
libp2p.NATPortMap(),
113+
libp2p.EnableAutoNATv2(),
114+
libp2p.EnableHolePunching(),
112115
libp2p.Routing(func(h host.Host) (routing.PeerRouting, error) {
113116
// Configure DHT with options to limit connection usage
114117
idht, err = dht.New(ctx, h,
@@ -167,9 +170,52 @@ func StartNode(innerConfig utils.InnerConfig, pk crypto.PrivKey, tcpPort string,
167170
// Refresh the routing table to help discover peers
168171
// idht.RefreshRoutingTable()
169172

173+
// Start monitoring NAT status and hole punching events
174+
go monitorNATStatus(ctx, node)
175+
170176
return node, idht, err
171177
}
172178

179+
// logVPNConnectionType logs the connection method when a VPN stream is established
180+
func logVPNConnectionType(node host.Host, conn network.Conn) {
181+
remoteAddr := conn.RemoteMultiaddr().String()
182+
localAddr := conn.LocalMultiaddr().String()
183+
peerID := conn.RemotePeer().ShortString()
184+
direction := conn.Stat().Direction
185+
186+
// Determine connection type
187+
if strings.Contains(remoteAddr, "/p2p-circuit/") {
188+
utils.HolePunchLog.Info("🔄 VPN over RELAYED connection with peer %s", peerID)
189+
utils.HolePunchLog.Debug(" └─ via circuit relay: %s", remoteAddr)
190+
utils.HolePunchLog.Warn(" └─ ⚠ Performance may be degraded (using relay)")
191+
} else if strings.Contains(remoteAddr, "/quic") {
192+
if direction == network.DirInbound {
193+
utils.HolePunchLog.Success("✓ VPN over DIRECT inbound QUIC from peer %s", peerID)
194+
utils.HolePunchLog.Debug(" └─ Peer connected to us (hole punch success or we're public)")
195+
} else {
196+
utils.HolePunchLog.Success("✓ VPN over DIRECT outbound QUIC to peer %s", peerID)
197+
}
198+
utils.HolePunchLog.Debug(" └─ Local: %s", localAddr)
199+
utils.HolePunchLog.Debug(" └─ Remote: %s", remoteAddr)
200+
} else if strings.Contains(remoteAddr, "/tcp") {
201+
if direction == network.DirInbound {
202+
utils.HolePunchLog.Success("✓ VPN over DIRECT inbound TCP from peer %s", peerID)
203+
} else {
204+
utils.HolePunchLog.Success("✓ VPN over DIRECT outbound TCP to peer %s", peerID)
205+
}
206+
utils.HolePunchLog.Debug(" └─ Local: %s", localAddr)
207+
utils.HolePunchLog.Debug(" └─ Remote: %s", remoteAddr)
208+
} else {
209+
utils.HolePunchLog.Info("VPN connection established with peer %s", peerID)
210+
utils.HolePunchLog.Debug(" └─ Type: %s", remoteAddr)
211+
}
212+
213+
// Check if connection is limited by resource manager
214+
if conn.Stat().Limited {
215+
utils.HolePunchLog.Warn(" └─ ⚠ VPN connection is RATE LIMITED by resource manager")
216+
}
217+
}
218+
173219
func shouldCloseStream(err error) bool {
174220
if err == nil {
175221
return false
@@ -224,6 +270,9 @@ func makeStreamHandler(node host.Host) network.StreamHandler {
224270
streamLog.Info("Starting stream handler for peer: %s", peerID)
225271
streamLog.Debug("Node status: %v (true=server, false=client)", utils.IS_NODE_HOST)
226272

273+
// Log VPN connection type (NAT/hole punching info)
274+
logVPNConnectionType(node, s.Conn())
275+
227276
// CRITICAL: Protect this VPN connection from being pruned by the connection manager
228277
// This prevents the connection manager from closing VPN streams when trimming connections
229278
streamLog.Info("🛡️ Protecting VPN connection for peer %s", peerID)
@@ -379,3 +428,71 @@ func makeStreamHandler(node host.Host) network.StreamHandler {
379428
}()
380429
} // end of stream handler function
381430
} // end of makeStreamHandler
431+
432+
// monitorNATStatus monitors and logs AutoNAT reachability status
433+
func monitorNATStatus(ctx context.Context, node host.Host) {
434+
// Subscribe to AutoNAT reachability events
435+
sub, err := node.EventBus().Subscribe(new(event.EvtPeerIdentificationCompleted))
436+
if err != nil {
437+
utils.NATLog.Error("Failed to subscribe to AutoNAT events: %v", err)
438+
return
439+
}
440+
defer sub.Close()
441+
442+
utils.NATLog.Info("AutoNAT monitoring started")
443+
444+
// Check initial reachability status periodically
445+
ticker := time.NewTicker(30 * time.Second)
446+
defer ticker.Stop()
447+
448+
checkReachability := func() {
449+
conns := node.Network().Conns()
450+
if len(conns) == 0 {
451+
utils.NATLog.Debug("No connections yet, reachability unknown")
452+
return
453+
}
454+
455+
// Check if we have any public addresses
456+
hasPublicAddr := false
457+
for _, addr := range node.Addrs() {
458+
addrStr := addr.String()
459+
// Check if address is not a private IP
460+
if !strings.Contains(addrStr, "127.0.0.1") &&
461+
!strings.Contains(addrStr, "192.168.") &&
462+
!strings.Contains(addrStr, "10.") &&
463+
!strings.Contains(addrStr, "172.16.") &&
464+
!strings.Contains(addrStr, "172.17.") &&
465+
!strings.Contains(addrStr, "172.18.") &&
466+
!strings.Contains(addrStr, "172.19.") &&
467+
!strings.Contains(addrStr, "172.2") &&
468+
!strings.Contains(addrStr, "172.30.") &&
469+
!strings.Contains(addrStr, "172.31.") {
470+
hasPublicAddr = true
471+
utils.NATLog.Debug("Public address detected: %s", addrStr)
472+
}
473+
}
474+
475+
if hasPublicAddr {
476+
utils.NATLog.Success("✓ Node appears PUBLICLY reachable (has public addresses)")
477+
} else {
478+
utils.NATLog.Warn("⚠ Node appears behind NAT/firewall (only private addresses)")
479+
utils.NATLog.Info("Hole punching or relay will be used for incoming connections")
480+
}
481+
}
482+
483+
// Initial check after 5 seconds
484+
time.Sleep(5 * time.Second)
485+
checkReachability()
486+
487+
for {
488+
select {
489+
case <-ctx.Done():
490+
return
491+
case <-ticker.C:
492+
// Periodic reachability check
493+
checkReachability()
494+
case evt := <-sub.Out():
495+
_ = evt // Event received, will trigger reachability check on next ticker
496+
}
497+
}
498+
}

0 commit comments

Comments
 (0)