Skip to content

Conversation

@myaut
Copy link

@myaut myaut commented Sep 24, 2025

Idea of per-peer-group policy is described in docs/sources/policy.md. Basically:

  • Instead of sending update to an individual *peer and computing policy for each of it, GoBGP will now compute policy and other path changes for a receiver, which can be either a *peer, or a peer-group (thus computing changes once even if there are 10 peers in a peer-group)
  • Similarly, when receiving update, per-peer-group policy is checked (stored internally as pg:PGNAME) instead of global which allows to skip neighbor set check.
  • Default Config/State values in BGP configs now computed both for oc.Peer and oc.PeerGroup table code now relies on PeerInfo structure that contains all relevant fields needed to change path when sending it to a peer.

Per-peer-group policy for various types of servers (route-reflector, route-server) are checked by TestPeerStarTopology test.

The last change revealed important problem: code which was checking presence of Local AS in received routes was checking for AS configured in Global config section even if LocalAS override was supplied for a peer group.
Per https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13761-39.html idea of LocalAS was migratory period when two AS belong to a same router, thus both AS numbers should be checked. So now both checks are performed, but for a new check (LocalAS override) path is imported while warning is issued which is subject to change in future releases.

Various invariants of LocalAS config are now checked by TestBgpServerLocalASOverride test.

In our tests we reduced propagaging time of 1.2M routes to ~100 peers in route reflector setup from 7 minutes to 2 minutes.

@myaut myaut force-pushed the per-peer-group-pols branch from 8129693 to 9eb31d0 Compare September 24, 2025 16:16
Sergey Klyaus added 2 commits September 24, 2025 21:05
Idea of per-peer-group policy is described in docs/sources/policy.md. Basically:
- Instead of sending update to an individual *peer and computing policy for each
of it, GoBGP will now compute policy and other path changes for a receiver, which
can be either a *peer, or a peer-group (thus computing changes once even
if there are 10 peers in a peer-group)
- Similarly, when receiving update, per-peer-group policy is checked (stored internally
as pg:PGNAME) instead of global which allows to skip neighbor set check.
- Default Config/State values in BGP configs now computed both for oc.Peer and oc.PeerGroup
table code now relies on PeerInfo structure that contains all relevant fields
needed to change path when sending it to a peer.

Per-peer-group policy for various types of servers (route-reflector, route-server)
are checked by TestPeerStarTopology test.

The last change revealed important problem: code which was checking presence of Local AS
in received routes was checking for AS configured in Global config section even if LocalAS
override was supplied for a peer group.
Per https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13761-39.html
idea of LocalAS was migratory period when two AS belong to a same router, thus both
AS numbers should be checked. So now both checks are performed, but for a new check (LocalAS override)
path is imported while warning is issued which is subject to change in future releases.

Various invariants of LocalAS config are now checked by TestBgpServerLocalASOverride test.

In our tests we reduced propagaging time of 1.2M routes to ~100 peers in route reflector
setup from 7 minutes to 2 minutes.
@myaut myaut force-pushed the per-peer-group-pols branch from 9eb31d0 to 30af8e4 Compare September 24, 2025 19:07
@myaut
Copy link
Author

myaut commented Sep 24, 2025

goroutine 1483 [runnable]:
github.com/osrg/gobgp/v4/pkg/server.drainChannel[...](...)
	/home/runner/work/gobgp/gobgp/pkg/server/util.go:27
github.com/osrg/gobgp/v4/pkg/server.cleanInfiniteChannel(...)
	/home/runner/work/gobgp/gobgp/pkg/server/util.go:38
github.com/osrg/gobgp/v4/pkg/server.(*fsmHandler).loop(0xc00066e4e0, {0x15b6610, 0xc0001bc280}, 0xc000702010)
	/home/runner/work/gobgp/gobgp/pkg/server/fsm.go:2091 +0x1085
created by github.com/osrg/gobgp/v4/pkg/server.newFSMHandler in goroutine 1610
	/home/runner/work/gobgp/gobgp/pkg/server/fsm.go:392 +0x387

This is kinda strange test failure since drainChannel is written in a manner it can't stuck. I belive I've seen similar in my manual runs with some race tracking function going wild, so look like a possible runtime issue. Be aware.

@myaut
Copy link
Author

myaut commented Oct 3, 2025

@fujita have you had the chance to look at it? I know PR is large, but conflicts are growing too
I may try to split it into several, like make receiver interface first, but it didn't work well first time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant