
Conversation

@wrideout-arista (Contributor) commented Jan 8, 2026

Description of PR

Converge cEOSLab peer containers via the use of VRFs and VLANs

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
  • Skipped for non-supported platforms
  • Test case improvement

Approach

Converging the total number of peer switches into the fewest possible
number of cEOSLab containers reduces the overall resources required to
run large numbers of peers. The basic premises behind convergence are
as follows:

- cEOSLab peers in docker containers may be converged into a smaller
  number of host peers.
- The SONiC-facing configuration of each BGP peer may be kept separate,
  in terms of routing and bridging, via the use of VRFs.
- The PTF-facing configuration of each BGP peer may be separated within
  each VRF via VLAN tagging, enabling the use of a single backplane
  interface on each host cEOSLab container.
- Each VRF includes a number of interfaces, each facing either the
  SONiC DUT or the backplane.
- Changes should be as transparent to the SONiC DUT as possible.
At the time of testbed setup, the ansible topology file for the testbed
is modified to include new metadata specific to multi-vrf configuration,
and the VMs list is trimmed to only include those containers which will
host multiple BGP peerings, separated by VRF. The new metadata includes
mappings between host containers and VRFs, backplane VLAN mappings, and
BGP session parameters.
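
For illustration only, that metadata might look roughly like the following in the topology file; every key name here is hypothetical, since the actual schema is not shown in this description:

```yaml
# Hypothetical sketch of the multi-VRF metadata; real key names may differ.
converged_peers:
  VM0100:                     # host cEOSLab container
    vrfs:
      ARISTA01T0:
        backplane_vlan: 2000  # VLAN toward the PTF backplane interface
        bgp:
          asn: 64001
          peer_asn: 65100
      ARISTA02T0:
        backplane_vlan: 2001
        bgp:
          asn: 64002
          peer_asn: 65100
```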

VLAN tag 2000 is used as the starting value for all VLANs between the
test infrastructure PTF container interfaces and cEOSLab device
interfaces.
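
As a minimal sketch of the allocation (only the base value 2000 comes from this description; the function itself is hypothetical):

```python
BACKPLANE_VLAN_BASE = 2000  # starting VLAN tag between PTF and cEOSLab interfaces

def backplane_vlan_tag(peer_index: int) -> int:
    """Give each converged peer its own backplane VLAN tag, counting up from 2000."""
    return BACKPLANE_VLAN_BASE + peer_index
```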

The IP and IPv6 addresses used to connect the cEOSLab peer and the
infrastructure PTF container are generated so that the backplane
connections are unambiguous, easy to recognize, and easy to implement.
In general, backplane L3 addresses used by the cEOSLab peer end in even
numbers, and those used by the PTF container end in odd numbers. All
addresses generated for use in backplane connections start at the value
100 (0x64) in the least-significant octet or hextet (depending on the
address family). The address assignments are mapped and stored in the
new multi-vrf metadata in the ansible topology file.
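
A minimal sketch of an allocation scheme consistent with this description, assuming hypothetical backplane prefixes and a per-peer pairing index:

```python
import ipaddress

# Hypothetical backplane prefixes; the actual subnets are not shown here.
BACKPLANE_V4 = ipaddress.IPv4Network("10.200.0.0/24")
BACKPLANE_V6 = ipaddress.IPv6Network("fc00:200::/64")

def backplane_pair(index: int):
    """Return ((ceos_v4, ptf_v4), (ceos_v6, ptf_v6)) for backplane pair `index`.

    The least-significant octet/hextet starts at 100 (0x64); the cEOSLab
    side takes the even value and the PTF side the following odd value.
    """
    ceos_v4 = BACKPLANE_V4.network_address + (100 + 2 * index)
    ceos_v6 = BACKPLANE_V6.network_address + (0x64 + 2 * index)
    return (ceos_v4, ceos_v4 + 1), (ceos_v6, ceos_v6 + 1)
```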

Multiple BGP features, such as local-as and next-hop-peer, are used in
order to aid in the resolution of routes. This is necessary to keep the
SONiC DUT as multi-vrf-agnostic as possible.
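
For illustration, a hedged sketch of how those knobs could appear in a cEOSLab BGP configuration; the VRF name, addresses, and AS numbers are invented, and the actual rendered config may differ:

```
router bgp 64600
   vrf ARISTA01T0
      neighbor 10.0.0.1 remote-as 65100
      neighbor 10.0.0.1 local-as 64001 no-prepend replace-as
      neighbor 10.0.0.1 next-hop-peer
```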

Enabling multi-VRF mode:

Multi-VRF mode may be enabled by setting the attribute `use_converged_peers: true` in the testbed definition found in sonic-mgmt/ansible/testbed.yaml. This file is read by the TestbedProcessing.py script, which sets global variables indicating to other ansible tasks and libraries that the testbed is to be started in multi-VRF mode.

In addition, the value of `max_fp_num` must be adjusted so that each cEOSLab docker container has enough front-panel interfaces to carry all of the new BGP sessions in each VRF. This can be done dynamically; however, for full-scale topologies the maximum supported by cEOSLab, 127, must be used. A hedged sketch of the relevant knobs follows (exact placement within the testbed definition may differ from the real schema):
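
```yaml
# sonic-mgmt/ansible/testbed.yaml (illustrative snippet)
- conf-name: my-multi-vrf-testbed
  topo: t1-isolated-d448u15-lag
  use_converged_peers: true   # start the testbed in multi-VRF mode
  max_fp_num: 127             # cEOSLab maximum; required for full-scale topologies
```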

Known limitations:

- cEOSLab instances do not allow for the creation of interfaces with
  interface-IDs greater than 127, when interfaces are laid out
  unidimensionally.
- The use of multiple VRFs has not been tested in conjunction with
  asynchronous ansible tasks.

Introduce infrastructure changes required to converge multiple BGP peers
into a minimum number of cEOSLab hosts, via the use of VLANs and VRFs.

Signed-off-by: Will Rideout <[email protected]>
@mssonicbld (Collaborator)

/azp run

@wrideout-arista marked this pull request as draft January 8, 2026 14:51
@github-actions bot requested review from r12f and sdszhang January 8, 2026 14:51
@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@wrideout-arista mentioned this pull request Jan 8, 2026
@wrideout-arista (Contributor, Author)

> Hi, @wrideout-arista , while deploying, we met such an issue:
>
> ```
> TASK [vm_set : Bind topology t1-isolated-d448u15-lag to VMs. base vm = VM77200] ****************************************************************************
> Tuesday 13 January 2026 01:25:56 +0000 (0:00:00.095) 0:04:33.602 *******
> fatal: [STR4-ACS-SERV-77]: FAILED! => {"changed": false, "msg": "Wrong vlans parameter for hostname ARISTA01T0, vm VM77200. Too many vlans. Maximum is 4"}
> ```
>
> It seems that the parameter max_fp_num is using the default value. Can you check if any changes are missing in this PR?

@yutongzhang-microsoft for the full topo you will need to adjust the maxFpNum as set in the testbed.yaml file to 127. Apologies for not mentioning this earlier; I will update the instructions above.

@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

In order for multi-vrf to support the redeploy-topo CLI command, the
VLAN interfaces created in the ptf container must be cleaned up in the
topo removal phase.  In addition, when creating the VLAN interfaces in
the ptf container during the topo add phase, check for the existence of
the VLAN interface first.  If it already exists, then clear all IP
addresses associated with the interface and skip interface creation.
Otherwise, create the VLAN interface as normal.

Existence checking must be done because the topo removal phase does not
stop the redeployment of the topo if it fails, so we may otherwise end
up adding the topo with containers and config in an indeterminate state.

Signed-off-by: Will Rideout <[email protected]>
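
A hedged sketch of that add-phase logic, written as ansible tasks; the task names and variables are illustrative, not the PR's actual tasks:

```yaml
# Illustrative only; the real tasks and variable names may differ.
- name: Check whether the backplane VLAN interface already exists
  command: ip link show {{ vlan_iface }}
  register: vlan_iface_check
  failed_when: false

- name: Flush stale IP addresses left behind by a failed topo removal
  command: ip addr flush dev {{ vlan_iface }}
  when: vlan_iface_check.rc == 0

- name: Create the VLAN interface as normal when it does not exist
  command: ip link add link {{ parent_iface }} name {{ vlan_iface }} type vlan id {{ vlan_id }}
  when: vlan_iface_check.rc != 0
```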
@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Fix the fetching of the intf-offset when running ipv6 bgp scale tests on
multi-vrf testbeds.  The offset is now a member of a dictionary inside
the multi-vrf intf_mapping metadata.

Signed-off-by: Will Rideout <[email protected]>
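
Roughly, the lookup changed shape along these lines; only the name intf_mapping comes from the commit message, and the surrounding keys are assumptions:

```python
# Hypothetical sketch of the intf-offset lookup on multi-vrf testbeds.
intf_mapping = multivrf_metadata["intf_mapping"]   # assumed container

# Before: the mapping value was the offset itself.
# offset = intf_mapping[vm_name]

# After: the offset is a member of a per-VM dictionary.
offset = intf_mapping[vm_name]["offset"]           # assumed key name
```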
@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

In bgp/test_bgp_allow_list.py tests, pass vrf information when getting
bgp route information from a peer if running on a multi-vrf testbed.
Otherwise, use the vrf "default".

Signed-off-by: Will Rideout <[email protected]>
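
A hedged sketch of the pattern (the helper and its arguments are hypothetical; only the fallback to "default" is from the commit message):

```python
# Hypothetical sketch of a vrf-aware route lookup on a cEOSLab peer.
def get_peer_routes(nbrhost, prefix, vrf=None):
    vrf = vrf if vrf is not None else "default"  # non-multi-vrf testbeds
    # EosHost-style wrapper around Arista's eos_command ansible module.
    return nbrhost.eos_command(
        commands=[f"show ip bgp {prefix} vrf {vrf} | json"]
    )
```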
@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

When running bgp traffic-shift tests on multi-vrf testbeds, fetch the
current vrf (peer) from nbrhosts metadata, and use it to pass the vrf to
bgp show commands.

This was verified to fix traffic-shift tests which were failing because
they were unable to verify routes on the cEOSLab peers.

Signed-off-by: Will Rideout <[email protected]>
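
Along these hypothetical lines (the metadata key names may differ in the real change):

```python
# Hypothetical sketch: take the per-peer VRF from nbrhosts metadata and
# pass it through to the bgp show commands.
peer = nbrhosts[peer_name]
vrf = peer.get("vrf", "default")                     # assumed key name
peer["host"].eos_command(commands=[f"show ip bgp summary vrf {vrf}"])
```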
@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@wrideout-arista marked this pull request as ready for review January 23, 2026 00:46
When running qos testing for dscp on multi-vrf testbeds, extract the vm
offset from the multi-vrf metadata instead of the shortened vm list in
the test topology.

This was verified to fix KeyErrors on multi-vrf testbeds thrown in
qos/test_qos_dscp_mapping.py.

Signed-off-by: Will Rideout <[email protected]>
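
A hedged sketch of the described change; the VMs/vm_offset layout follows the usual topology files, while the multi-vrf keys are assumptions:

```python
# Hypothetical sketch of the vm-offset lookup in qos/test_qos_dscp_mapping.py.
topo = tbinfo["topo"]["properties"]["topology"]

if use_converged_peers:
    # The trimmed VMs list no longer holds every peer, so read the offset
    # from the multi-vrf metadata instead.
    vm_offset = multivrf_metadata["intf_mapping"][peer_name]["offset"]  # assumed keys
else:
    vm_offset = topo["VMs"][peer_name]["vm_offset"]
```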
@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
