Feature/netflow udp proxy #8909

Open
fdurand wants to merge 13 commits into devel from feature/netflow_udp_proxy
fdurand commented Feb 4, 2026

Description

Fail-over reverse proxy that forwards sFlow/NetFlow/IPFIX traffic to one of the backend fingerbank-collector instances.

Impacts

Cluster

Delete branch after merge

YES

Checklist

  • Document the feature
  • Add OpenAPI specification
  • Add unit tests
  • Add acceptance tests (TestLink)

NEWS file entries

New Features

  • UDP fail-over process that forwards sFlow/NetFlow/IPFIX traffic to one of the fingerbank-collector instances (cluster)

Copilot AI left a comment

Pull request overview

Adds a new pfudpproxy service intended to provide UDP fail-over forwarding for NetFlow/sFlow (and per PR text, IPFIX) traffic to a healthy fingerbank-collector backend in cluster deployments, replacing the prior keepalived/LVS-based approach.

Changes:

  • Introduces a new Go-based UDP proxy (pfudpproxy) with backend health checking and failover selection.
  • Integrates the new service into packaging, systemd, PacketFence service management, firewall rules, and the admin UI.
  • Removes the keepalived-generated LVS UDP load-balancing configuration for NetFlow/sFlow ports.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 11 comments.

File | Description
rpm/packetfence.spec | Packages the new systemd unit and pfudpproxy binary in RPM builds.
lib/pf/services/manager/pfudpproxy.pm | Adds PacketFence service manager wrapper for pfudpproxy (cluster-managed).
lib/pf/services/manager/keepalived.pm | Removes LVS UDP config generation; notes pfudpproxy now handles NetFlow/sFlow.
lib/pf/iptables.pm | Registers a new iptables service rule generator for pfudpproxy.
lib/pf/cmd/pf/service.pm | Exposes pfudpproxy in the pfcmd service help listing.
html/pfappserver/.../_components/index.js | Adds a UI toggle component export for pfudpproxy.
html/pfappserver/.../_components/TheForm.vue | Adds the pfudpproxy toggle to the Services configuration UI.
go/cmd/pfudpproxy/config.go | Loads VIP/backends from pfconfig and defines default ports/healthcheck defaults.
go/cmd/pfudpproxy/healthcheck.go | Implements HTTPS health checks against backends.
go/cmd/pfudpproxy/loadbalancer.go | Implements “first healthy backend” failover selection and backend updates.
go/cmd/pfudpproxy/main.go | Wires config loading, health checker, proxy startup, systemd notify/watchdog, refresh loop.
go/cmd/pfudpproxy/proxy.go | Implements UDP listening and forwarding to the selected backend.
debian/rules | Installs/enables the new packetfence-pfudpproxy unit for Debian packaging.
debian/packetfence-pfudpproxy.service | Debian service file link to the shared systemd unit.
config.mk | Adds pfudpproxy to the Go binaries built/installed by the build system.
conf/systemd/packetfence-pfudpproxy.service | New systemd unit definition for pfudpproxy.
conf/pf.conf.defaults | Adds default service toggle and binary path for pfudpproxy.
conf/documentation.conf | Adds documentation entries for services.pfudpproxy and services.pfudpproxy_binary.

Comment on lines 179 to 196
// Create destination address using backend's management IP and same port
dstAddr := fmt.Sprintf("%s:%d", backend.ManagementIP, port)
udpDstAddr, err := net.ResolveUDPAddr("udp", dstAddr)
if err != nil {
	log.LoggerWContext(ctx).Error(fmt.Sprintf("Failed to resolve destination address %s: %s",
		dstAddr, err.Error()))
	return
}

// Create a new UDP connection for forwarding
// Using a new connection each time since NetFlow/sFlow are fire-and-forget
conn, err := net.DialUDP("udp", nil, udpDstAddr)
if err != nil {
	log.LoggerWContext(ctx).Error(fmt.Sprintf("Failed to connect to backend %s: %s",
		dstAddr, err.Error()))
	return
}
defer conn.Close()

Copilot AI Feb 16, 2026

forwardPacket resolves the destination address and creates a new UDP socket (DialUDP) for every packet. For NetFlow/sFlow traffic this can become a major bottleneck and can exhaust ephemeral ports/file descriptors under load; reuse a socket per backend+port (or use a shared UDPConn with WriteToUDP) and cache the resolved UDPAddr.
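
One way to read the suggestion above is a small cache that keeps a single connected UDP socket per destination. The sketch below is only an illustration: `connCache`, `get`, and `closeAll` are hypothetical names that do not exist in the PR, and invalidation when the backend list changes is left out.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// connCache keeps one connected UDP socket per destination "ip:port".
// Hypothetical helper for illustration; it is not part of the PR.
type connCache struct {
	mu    sync.Mutex
	conns map[string]*net.UDPConn
}

func newConnCache() *connCache {
	return &connCache{conns: make(map[string]*net.UDPConn)}
}

// get returns the cached socket for dst, resolving and dialing only on first use.
func (c *connCache) get(dst string) (*net.UDPConn, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if conn, ok := c.conns[dst]; ok {
		return conn, nil
	}
	addr, err := net.ResolveUDPAddr("udp", dst)
	if err != nil {
		return nil, fmt.Errorf("resolve %s: %w", dst, err)
	}
	conn, err := net.DialUDP("udp", nil, addr)
	if err != nil {
		return nil, fmt.Errorf("dial %s: %w", dst, err)
	}
	c.conns[dst] = conn
	return conn, nil
}

// closeAll closes every cached socket, for example on shutdown.
func (c *connCache) closeAll() {
	c.mu.Lock()
	defer c.mu.Unlock()
	for dst, conn := range c.conns {
		conn.Close()
		delete(c.conns, dst)
	}
}

func main() {
	cache := newConnCache()
	defer cache.closeAll()

	conn, err := cache.get("127.0.0.1:2055")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Fire-and-forget write; a delivery problem may only surface on a later write.
	if _, err := conn.Write([]byte("flow-datagram")); err != nil {
		fmt.Println("write failed:", err)
	}
}
```

With a cache like this, the per-packet path would reduce to one map lookup and one `Write`, and the number of open sockets would be bounded by backends times ports.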

# Port 2055 - NetFlow (UDP)
util_safe_push( "-i $tint -p udp -m udp --dport 2055 --jump ACCEPT", $chains->{'filter'}{'INPUT'} );
# Port 6343 - sFlow (UDP)
util_safe_push( "-i $tint -p udp -m udp --dport 6343 --jump ACCEPT", $chains->{'filter'}{'INPUT'} );

Copilot AI Feb 16, 2026

iptables_pfudpproxy_rules opens UDP 2055/6343, but does not open UDP 4739 (IPFIX) even though fingerbank collector rules and the PR description mention IPFIX. If pfudpproxy is taking over this traffic, add the missing port (or clarify that IPFIX isn't supported).

Suggested change
util_safe_push( "-i $tint -p udp -m udp --dport 6343 --jump ACCEPT", $chains->{'filter'}{'INPUT'} );
util_safe_push( "-i $tint -p udp -m udp --dport 6343 --jump ACCEPT", $chains->{'filter'}{'INPUT'} );
# Port 4739 - IPFIX (UDP)
util_safe_push( "-i $tint -p udp -m udp --dport 4739 --jump ACCEPT", $chains->{'filter'}{'INPUT'} );

Comment on lines +67 to +76
healthChecker := NewHealthChecker(config, lb)
go healthChecker.Start(ctx)

// Create and start UDP proxy
proxy := NewUDPProxy(config, lb)
go proxy.Start(ctx)

// Notify systemd we're ready
daemon.SdNotify(false, "READY=1")


Copilot AI Feb 16, 2026

Systemd is notified READY=1 immediately after starting healthChecker/proxy goroutines, but there’s no signal that any UDP listener actually bound successfully. This can mark the unit as ready while it isn't forwarding; consider making proxy.Start return an error/status (or a readiness channel) and only call SdNotify(READY=1) once at least one listener is active.
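
A minimal sketch of the "bind first, then report ready" ordering, assuming the listeners can be opened synchronously before any forwarding goroutine starts. `bindAll` and `boundPort` are hypothetical helpers, and the PR's `proxy.Start` has a different shape.

```go
package main

import (
	"fmt"
	"net"
)

// bindAll opens one UDP listener per port and returns the sockets that bound
// together with the errors for those that did not.
// Hypothetical helper; not the PR's proxy.Start.
func bindAll(host string, ports []int) ([]*net.UDPConn, []error) {
	var conns []*net.UDPConn
	var errs []error
	for _, port := range ports {
		addr := &net.UDPAddr{IP: net.ParseIP(host), Port: port}
		conn, err := net.ListenUDP("udp", addr)
		if err != nil {
			errs = append(errs, fmt.Errorf("bind %s:%d: %w", host, port, err))
			continue
		}
		conns = append(conns, conn)
	}
	return conns, errs
}

// boundPort reports the local port a listener actually bound to.
func boundPort(conn *net.UDPConn) int {
	return conn.LocalAddr().(*net.UDPAddr).Port
}

func main() {
	// Port 0 lets the kernel pick a free port so the sketch runs unprivileged.
	conns, errs := bindAll("127.0.0.1", []int{0, 0})
	for _, err := range errs {
		fmt.Println("warning:", err)
	}
	if len(conns) == 0 {
		fmt.Println("no listener bound; readiness is not signalled")
		return
	}
	defer func() {
		for _, c := range conns {
			c.Close()
		}
	}()
	// Only at this point would daemon.SdNotify(false, "READY=1") be called.
	fmt.Printf("%d listener(s) bound, first on port %d\n", len(conns), boundPort(conns[0]))
}
```

Whether a partial bind should count as ready (one port up, one failed) is a policy choice for the author; the sketch treats "at least one listener" as sufficient, as the comment above suggests.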

defer lb.mu.RUnlock()

backends := make([]*Backend, len(lb.backends))
copy(backends, lb.backends)

Copilot AI Feb 16, 2026

GetAllBackends returns a new slice but the elements are still pointers to the shared backend structs. Callers can easily read/mutate fields without the load balancer lock, leading to races (as in HealthChecker); consider returning an immutable snapshot (copy structs, not pointers) or adding accessor methods so all reads happen under the mutex.

Suggested change
copy(backends, lb.backends)
for i, backend := range lb.backends {
	if backend == nil {
		continue
	}
	bCopy := *backend
	backends[i] = &bCopy
}

Comment on lines 54 to 57
func LoadConfig(ctx context.Context) (*ProxyConfig, error) {
	config := &ProxyConfig{
		Ports:           []int{PortNetFlow, PortSFlow},
		HealthCheckPort: getHealthCheckPort(ctx),

Copilot AI Feb 16, 2026

The proxy is advertised as handling NetFlow/sFlow/IPFIX (per PR description/NEWS), but the default Ports list only includes 2055 and 6343. If IPFIX support is intended, add 4739 (and/or make ports configurable); otherwise align the feature description/docs to only mention NetFlow/sFlow.
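
If the ports were made configurable, the parsing could look roughly like the sketch below. It assumes a comma-separated setting; `parsePorts` and the `PortIPFIX` constant are hypothetical and not defined in the PR as reviewed, and the port values 2055, 6343, and 4739 are the ones named in this review.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const (
	PortNetFlow = 2055
	PortSFlow   = 6343
	PortIPFIX   = 4739 // assumption: this constant is not in the PR as reviewed
)

// parsePorts turns a setting such as "2055, 6343" into a port list.
// An empty setting falls back to the three default ports.
func parsePorts(raw string) ([]int, error) {
	if strings.TrimSpace(raw) == "" {
		return []int{PortNetFlow, PortSFlow, PortIPFIX}, nil
	}
	var ports []int
	for _, field := range strings.Split(raw, ",") {
		p, err := strconv.Atoi(strings.TrimSpace(field))
		if err != nil {
			return nil, fmt.Errorf("invalid port %q: %w", field, err)
		}
		if p < 1 || p > 65535 {
			return nil, fmt.Errorf("port %d out of range", p)
		}
		ports = append(ports, p)
	}
	return ports, nil
}

func main() {
	for _, raw := range []string{"", "2055, 4739", "2055,abc"} {
		ports, err := parsePorts(raw)
		if err != nil {
			fmt.Printf("%q: error: %v\n", raw, err)
			continue
		}
		fmt.Printf("%q: %v\n", raw, ports)
	}
}
```

The iptables rules would need to follow the same list, otherwise a configured port could be listened on but still blocked by the firewall.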

Comment on lines 16 to 36
// NewLoadBalancer creates a new load balancer with the given backends.
// Backends are ordered by their position in the slice (first = highest priority).
func NewLoadBalancer(backends []*Backend) *LoadBalancer {
	return &LoadBalancer{
		backends: backends,
	}
}

// GetPrimary returns the first healthy backend (failover mode).
// Returns nil if no healthy backend is available.
func (lb *LoadBalancer) GetPrimary() *Backend {
	lb.mu.RLock()
	defer lb.mu.RUnlock()

	for _, backend := range lb.backends {
		if backend.Healthy {
			return backend
		}
	}
	return nil
}

Copilot AI Feb 16, 2026

No unit tests are introduced for the new failover behavior (load balancer selection, backend update preservation, and health check status transitions). Other Go commands in this repo have *_test.go coverage (e.g. go/cmd/pfdhcp), so adding focused tests here would help prevent regressions.
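
As a starting point, the selection rule could be covered by a small table of cases like the sketch below. The `Backend` and `LoadBalancer` types here are minimal stand-ins that copy only the fields visible in the quoted diff; `primaryCase`, `primaryCases`, and `checkPrimary` are hypothetical. In the repository the same table would more naturally live in a `*_test.go` file and use the `testing` package.

```go
package main

import (
	"fmt"
	"sync"
)

// Backend and LoadBalancer are stand-ins mirroring the fields shown in the diff.
type Backend struct {
	ManagementIP string
	Healthy      bool
}

type LoadBalancer struct {
	mu       sync.RWMutex
	backends []*Backend
}

// GetPrimary returns the first healthy backend, or nil if there is none.
func (lb *LoadBalancer) GetPrimary() *Backend {
	lb.mu.RLock()
	defer lb.mu.RUnlock()
	for _, backend := range lb.backends {
		if backend.Healthy {
			return backend
		}
	}
	return nil
}

// primaryCase is one table row: backend health flags in priority order and
// the index of the backend expected to be selected (-1 means nil).
type primaryCase struct {
	name    string
	healthy []bool
	want    int
}

var primaryCases = []primaryCase{
	{"first healthy wins", []bool{true, true}, 0},
	{"skips unhealthy first", []bool{false, true}, 1},
	{"none healthy", []bool{false, false}, -1},
	{"no backends", nil, -1},
}

// checkPrimary runs one case and returns an error describing any mismatch.
func checkPrimary(c primaryCase) error {
	lb := &LoadBalancer{}
	for i, h := range c.healthy {
		ip := fmt.Sprintf("10.0.0.%d", i+1)
		lb.backends = append(lb.backends, &Backend{ManagementIP: ip, Healthy: h})
	}
	got := lb.GetPrimary()
	switch {
	case c.want == -1 && got != nil:
		return fmt.Errorf("%s: want nil, got %s", c.name, got.ManagementIP)
	case c.want >= 0 && got != lb.backends[c.want]:
		return fmt.Errorf("%s: wrong backend selected", c.name)
	}
	return nil
}

func main() {
	for _, c := range primaryCases {
		if err := checkPrimary(c); err != nil {
			fmt.Println("FAIL", err)
			continue
		}
		fmt.Println("ok  ", c.name)
	}
}
```

Backend update preservation and health status transitions would need their own cases against the real types, which are not shown in this review.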

Comment on lines +24 to +31
// Create HTTP client with TLS config that skips certificate verification
// (fingerbank-collector uses self-signed certificates)
transport := &http.Transport{
	TLSClientConfig: &tls.Config{
		InsecureSkipVerify: true,
	},
	TLSHandshakeTimeout: 5 * time.Second,
}

Copilot AI Feb 16, 2026

Health checks unconditionally set InsecureSkipVerify: true. Even if fingerbank-collector commonly uses self-signed certs, this removes TLS authenticity entirely; consider wiring this to an explicit config toggle (or trusting a configured CA bundle) so deployments that can validate certs are not forced into insecure mode.
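
A rough sketch of what such a toggle could look like, assuming two new settings: an optional PEM CA bundle and an explicit opt-in flag for skipping verification. `healthTLSConfig` and both parameters are hypothetical; nothing in the PR defines them.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// healthTLSConfig builds the TLS settings for backend health checks.
// caPEM and insecure are hypothetical configuration inputs:
//   - a non-empty caPEM is trusted as the only root and verification stays on;
//   - otherwise verification is skipped only when insecure is explicitly true,
//     and the system roots are used when it is false.
func healthTLSConfig(caPEM []byte, insecure bool) (*tls.Config, error) {
	cfg := &tls.Config{MinVersion: tls.VersionTLS12}
	if len(caPEM) > 0 {
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(caPEM) {
			return nil, errors.New("no certificates found in CA bundle")
		}
		cfg.RootCAs = pool
		return cfg, nil
	}
	cfg.InsecureSkipVerify = insecure
	return cfg, nil
}

func main() {
	cfg, err := healthTLSConfig(nil, false)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("skip verification:", cfg.InsecureSkipVerify)

	if _, err := healthTLSConfig([]byte("not a certificate"), false); err != nil {
		fmt.Println("bad bundle rejected:", err)
	}
}
```

Defaulting the flag to true would keep today's behaviour for self-signed deployments while letting others turn verification on; which default is right is a decision for the maintainers.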

fdurand force-pushed the feature/netflow_udp_proxy branch from eafc10f to 8aa8066 on February 16, 2026 20:24
Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.


Comment on lines 91 to 97
// Close all listeners
for _, listener := range p.listeners {
	if listener != nil {
		listener.Close()
	}
}


Copilot AI Feb 17, 2026

Stop iterates over p.listeners without holding p.mu, while listenAndForward appends to p.listeners under p.mu. This can cause a data race/panic under the race detector and potentially miss closing a listener. Use the same mutex to protect reads/writes to p.listeners (e.g., copy the slice under lock before iterating).
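
The "copy under lock, close outside it" pattern mentioned above could look like the following sketch. The `UDPProxy` type here is a stripped-down stand-in with only the two fields the comment refers to; the `listen` method and the return value of `Stop` are hypothetical additions for illustration.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// UDPProxy is a stand-in holding only the fields discussed in the review.
type UDPProxy struct {
	mu        sync.Mutex
	listeners []*net.UDPConn
}

// listen binds a loopback UDP port and records the listener under the lock.
// Hypothetical helper standing in for the append done in listenAndForward.
func (p *UDPProxy) listen(port int) error {
	conn, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: port})
	if err != nil {
		return err
	}
	p.mu.Lock()
	p.listeners = append(p.listeners, conn)
	p.mu.Unlock()
	return nil
}

// Stop takes ownership of the listener slice while holding the lock, then
// closes the sockets after releasing it. It returns how many were closed.
func (p *UDPProxy) Stop() int {
	p.mu.Lock()
	listeners := p.listeners
	p.listeners = nil
	p.mu.Unlock()

	closed := 0
	for _, listener := range listeners {
		if listener != nil {
			listener.Close()
			closed++
		}
	}
	return closed
}

func main() {
	p := &UDPProxy{}
	for i := 0; i < 2; i++ {
		// Port 0 lets the kernel pick a free port.
		if err := p.listen(0); err != nil {
			fmt.Println("listen failed:", err)
		}
	}
	fmt.Println("closed on first Stop:", p.Stop())
	fmt.Println("closed on second Stop:", p.Stop())
}
```

Clearing the slice under the lock also makes a second `Stop` call harmless, since there is nothing left to close.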

Member

Add the lock

Copilot AI left a comment

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.


Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
