Commit eba0a3d

Merge pull request #1657 from lolyu/subnet_decap_HLD
[SubnetDecap] Add subnet decap HLD
2 parents ff26eeb + 33357ab commit eba0a3d

4 files changed: +222 -0 lines changed

doc/decap/subnet_decap_HLD.md

Lines changed: 222 additions & 0 deletions
@@ -0,0 +1,222 @@
# Subnet Decapsulation with SONiC

## High Level Design Document

## Table of Content

- [Subnet Decapsulation with SONiC](#subnet-decapsulation-with-sonic)
  - [High Level Design Document](#high-level-design-document)
  - [Table of Content](#table-of-content)
  - [1 Revision](#1-revision)
  - [2 Scope](#2-scope)
  - [3 Definitions/Abbreviations](#3-definitionsabbreviations)
  - [4 Overview](#4-overview)
  - [5 Requirement](#5-requirement)
    - [5.1 Functional Requirements](#51-functional-requirements)
    - [5.2 Scalability Requirements](#52-scalability-requirements)
  - [6 Module Design](#6-module-design)
    - [6.1 Tunnel Specification](#61-tunnel-specification)
    - [6.2 DB Schema](#62-db-schema)
      - [6.2.1 CONFIG\_DB](#621-config_db)
      - [6.2.2 APPL\_DB](#622-appl_db)
      - [6.2.3 STATE\_DB](#623-state_db)
    - [6.3 Orchestration Agent](#63-orchestration-agent)
    - [6.4 VLAN Subnet Decap](#64-vlan-subnet-decap)
      - [6.4.1 VLAN Subnet Decap Rule Generation](#641-vlan-subnet-decap-rule-generation)
      - [6.4.2 Dual-ToR Considerations](#642-dual-tor-considerations)
      - [6.4.3 Netscan VLAN Subnet Probing](#643-netscan-vlan-subnet-probing)
    - [6.5 Subnet Decap Configuration Workflow](#65-subnet-decap-configuration-workflow)
    - [6.6 CLI](#66-cli)
  - [7 Warm Reboot Support](#7-warm-reboot-support)
  - [8 Test Plan](#8-test-plan)

## 1 Revision

| Rev | Date | Author | Change Description |
| :---: | :--------: | :-----------: | ------------------ |
| 0.1 | 03/30/2024 | Longxiang Lyu | Initial version |

## 2 Scope

This document describes the subnet decapsulation feature on T0 SONiC that allows Netscan to probe VLAN subnet IP addresses.

## 3 Definitions/Abbreviations

| Term | Meaning |
| ------- | --------------------------------------------- |
| VLAN | virtual local area network |
| DIP | destination IP |
| SIP | source IP |
| Netscan | Azure service to detect network path failures |

## 4 Overview

In Azure, Netscan probes network paths and devices by sending IPinIP traffic. In each IPinIP packet crafted by the Netscan sender, the outer DIP is the Loopback address of the destination device and the inner DIP is the IP address of the Netscan sender. When the destination device receives the IPinIP packet, it decapsulates the packet and routes the inner packet back to the Netscan sender. By checking whether the inner packets are received, the Netscan sender can detect link or device issues along the probe path.
Today, Netscan uses this IP-decap based probing to detect route blackholes in the Azure network. The limitation is that Netscan can only probe networking switches; it cannot detect route blackholes for host nodes, in particular for VLAN subnet IPs. Because host nodes don't have native IP-decap functionality, it is more appropriate to implement the functionality on T0 SONiC, which already supports IPinIP decapsulation: T0 SONiC responds to the Netscan probes on behalf of the host nodes by decapsulating Netscan IPinIP probe packets whose DIP is any VLAN subnet IP.
In this design, subnet decap is introduced to give SONiC the capability to generate decap rules for the VLAN subnet, so that IPinIP packets from Netscan whose DIP is a VLAN subnet IP can be decapsulated and forwarded back to the Netscan sender. This lets Netscan detect possible route blackholes to those destinations.

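
The probe round trip described above can be modelled in a few lines. This is only a toy sketch: the class and function names are hypothetical, the addresses are made up, and the inner source IP is an assumption (the overview only specifies the outer and inner DIP).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPHeader:
    src: str
    dst: str


@dataclass(frozen=True)
class IPinIPProbe:
    outer: IPHeader
    inner: IPHeader


def craft_probe(sender_ip: str, target_ip: str) -> IPinIPProbe:
    """Build a probe: outer DIP is the probed address, inner DIP is the sender."""
    return IPinIPProbe(
        outer=IPHeader(src=sender_ip, dst=target_ip),
        # Assumption: the inner SIP is not specified by the design; the sender's
        # own address is used here purely for illustration.
        inner=IPHeader(src=sender_ip, dst=sender_ip),
    )


def decapsulate(probe: IPinIPProbe) -> IPHeader:
    """Strip the outer header; the inner packet is what gets routed next."""
    return probe.inner


probe = craft_probe(sender_ip="20.20.20.5", target_ip="10.10.10.7")
inner = decapsulate(probe)
print(f"inner packet is routed to {inner.dst}")
```

With subnet decap, `target_ip` may be a VLAN subnet IP rather than only the device Loopback address.
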
## 5 Requirement

### 5.1 Functional Requirements

High level requirements:

- T0 SONiC shall allow Netscan to probe with IPinIP packets with DIP as any local VLAN subnet IP by adding IP decap rules for the VLAN subnet.

### 5.2 Scalability Requirements

| Component | Expected Value |
| ------------------ | -------------- |
| Tunnel | N/A |
| Tunnel Decap Terms | N/A |

## 6 Module Design

To support Netscan probing over the VLAN subnet, SONiC needs to generate decapsulation rules that decapsulate IPinIP packets whose DIP is any IP address in the VLAN subnet. The decapsulation rules are generated from the configured VLAN subnet.
In this design, we propose the subnet decap feature, whose workflow enables SONiC to add/remove those decapsulation rules based on the VLAN subnets configured on the T0 SONiC.

### 6.1 Tunnel Specification

The tunnels in this design will be generated with the following attributes:

| Attribute | Value | Note |
| --------------- | --------------------------------- | -------------------------------- |
| name | IPINIP_SUBNET or IPINIP_V6_SUBNET | One IPv4 tunnel, one IPv6 tunnel |
| tunnel type | IPinIP | |
| decap ECN mode | copy_from_outer or standard | |
| decap TTL mode | pipe | |
| decap DSCP mode | uniform | |

The decapsulation termination entry will be created with the following attributes:

| Attribute | Value | Note |
| --------------- | ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| term entry type | MP2MP | multi-point to multi-point |
| dest IP | the VLAN subnet | |
| dest IP mask | the VLAN subnet mask | |
| source IP | Netscan privately-owned subnet | IPinIP packets whose source IP is in this subnet can be safely assumed to be Netscan traffic rather than customer traffic |
| source IP mask | Netscan privately-owned subnet mask | |

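
To make the MP2MP match concrete, the following sketch checks whether the outer header of an IPinIP packet falls inside both the source and destination prefixes of a termination entry. It only illustrates the matching semantics; the real match is done in hardware, the function name is hypothetical, and the prefixes are sample values.

```python
import ipaddress


def matches_decap_term(outer_src: str, outer_dst: str,
                       term_src_prefix: str, term_dst_prefix: str) -> bool:
    """Return True if the outer SIP/DIP both fall inside the term's prefixes."""
    src = ipaddress.ip_address(outer_src)
    dst = ipaddress.ip_address(outer_dst)
    src_net = ipaddress.ip_network(term_src_prefix)
    dst_net = ipaddress.ip_network(term_dst_prefix)
    # An IPv4 address never matches an IPv6 prefix, and vice versa.
    if src.version != src_net.version or dst.version != dst_net.version:
        return False
    return src in src_net and dst in dst_net


# Sample term: dest IP = VLAN subnet, source IP = Netscan-owned subnet.
TERM_SRC = "20.20.20.0/24"
TERM_DST = "10.10.10.0/24"

print(matches_decap_term("20.20.20.5", "10.10.10.7", TERM_SRC, TERM_DST))
print(matches_decap_term("30.0.0.1", "10.10.10.7", TERM_SRC, TERM_DST))
```

The second call uses a source IP outside the Netscan subnet, so the packet would not be decapsulated by this entry.
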
### 6.2 DB Schema

#### 6.2.1 CONFIG_DB

```
### SUBNET_DECAP
; Stores subnet based decapsulation configurations
key       = SUBNET_DECAP|config_name
status    = "enable"/"disable"      ; status of subnet based decapsulation
src_ip    = source IP prefix        ; source IP prefix used for tunnel
src_ip_v6 = source IPv6 prefix      ; source IPv6 prefix used for tunnel_v6
vlan      = list of enabled VLANs   ; comma separated list of VLANs to enable
                                    ; subnet decap, if status is enable and this
                                    ; list is empty, subnet decap will apply to
                                    ; all VLANs
```

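
As an illustration of the schema above, the sketch below validates a sample `SUBNET_DECAP` entry and resolves which VLANs it applies to. The config name `AZURE`, the prefixes, the VLAN names, and both helper functions are hypothetical; treating `src_ip` and `src_ip_v6` as optional is also an assumption, since the schema does not say whether they are mandatory.

```python
import ipaddress

# Hypothetical entry; key and values are illustrative only.
subnet_decap = {
    "SUBNET_DECAP|AZURE": {
        "status": "enable",
        "src_ip": "20.20.20.0/24",
        "src_ip_v6": "fc01::/64",
        "vlan": "Vlan1000,Vlan2000",
    }
}


def validate_subnet_decap(fields: dict) -> list:
    """Return a list of human-readable schema violations (empty if none)."""
    errors = []
    if fields.get("status") not in ("enable", "disable"):
        errors.append("status must be 'enable' or 'disable'")
    for name, version in (("src_ip", 4), ("src_ip_v6", 6)):
        value = fields.get(name)
        if value is None:
            continue  # assumption: the prefix fields are optional
        try:
            net = ipaddress.ip_network(value, strict=False)
        except ValueError:
            errors.append(f"{name} is not a valid IP prefix")
            continue
        if net.version != version:
            errors.append(f"{name} must be an IPv{version} prefix")
    return errors


def enabled_vlans(fields: dict, all_vlans: list) -> list:
    """Resolve the VLAN list: an empty list with status 'enable' means all VLANs."""
    if fields.get("status") != "enable":
        return []
    listed = [v.strip() for v in fields.get("vlan", "").split(",") if v.strip()]
    return listed if listed else list(all_vlans)


entry = subnet_decap["SUBNET_DECAP|AZURE"]
print(validate_subnet_decap(entry))
print(enabled_vlans(entry, ["Vlan1000", "Vlan2000", "Vlan3000"]))
```
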
#### 6.2.2 APPL_DB

```
### TUNNEL_DECAP_TABLE
; Stores a list of decap tunnels
key            = TUNNEL_DECAP_TABLE:tunnel_name     ; tunnel name as key
tunnel_type    = "IPINIP"                           ; tunnel type
dscp_mode      = "uniform"/"pipe"
ecn_mode       = "copy_from_outer"/"standard"
ttl_mode       = "uniform"/"pipe"
encap_ecn_mode = "standard"

### TUNNEL_DECAP_TERM_TABLE
; Stores a list of decap terms.
key         = TUNNEL_DECAP_TERM_TABLE:tunnel_name:dst_ip    ; tunnel name:dst IP prefix as key
term_type   = "P2P"/"P2MP"/"MP2MP"                          ; tunnel decap term type
src_ip      = source IP prefix                              ; for decap terms of subnet decap, the
                                                            ; source IP is omitted
subnet_type = "vlan"/"vip"                                  ; the subnet type of the dst IP prefix, present
                                                            ; if this is a subnet decap term
```

#### 6.2.3 STATE_DB

```
### TUNNEL_DECAP_TABLE
; Stores a list of created decap tunnels
key            = TUNNEL_DECAP_TABLE:tunnel_name     ; tunnel name as key
tunnel_type    = "IPINIP"                           ; tunnel type
dscp_mode      = "uniform"/"pipe"
ecn_mode       = "copy_from_outer"/"standard"
ttl_mode       = "uniform"/"pipe"
encap_ecn_mode = "standard"

### TUNNEL_DECAP_TERM_TABLE
; Stores a list of created decap terms.
key         = TUNNEL_DECAP_TERM_TABLE:tunnel_name:dst_ip    ; tunnel name:dst IP prefix as key
term_type   = "P2P"/"P2MP"/"MP2MP"                          ; tunnel decap term type
src_ip      = source IP prefix
subnet_type = "vlan"/"vip"                                  ; the subnet type of the dst IP prefix, present
                                                            ; if this is a subnet decap term
```

### 6.3 Orchestration Agent

The following orchestration agents shall be modified:
TunnelDecapOrch:

- TunnelDecapOrch shall subscribe to TUNNEL_DECAP_TABLE and create/remove the decap tunnels.

- TunnelDecapOrch shall subscribe to TUNNEL_DECAP_TERM_TABLE and create/remove the decap term entries.

- TunnelDecapOrch shall subscribe to SUBNET_DECAP and handle the tunnel decapsulation termination entry source IP changes.

### 6.4 VLAN Subnet Decap

#### 6.4.1 VLAN Subnet Decap Rule Generation

![VLAN decap rule gen](./image/vlan_decap_rule_gen_workflow.png)

The VLAN subnet decap workflow is shown in the figure above. If VLAN subnet decap is enabled, the extra tunnels and decap rules are templated out and pushed to APPL_DB by the swssconfig service. TunnelDecapOrch subscribes to both TUNNEL_DECAP_TABLE and TUNNEL_DECAP_TERM_TABLE and processes the requests to program the decap rules to SYNCD accordingly.

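
The rule generation step can be sketched as a mapping from configured VLAN interface prefixes to `TUNNEL_DECAP_TERM_TABLE` entries. This is not the actual template used by swssconfig; the function name and the sample prefixes are hypothetical, and the entries follow the APPL_DB schema in section 6.2.2, where the source IP is omitted for subnet decap terms.

```python
import ipaddress


def build_decap_terms(vlan_interface_prefixes: list) -> dict:
    """Map VLAN interface prefixes to APPL_DB-style decap term entries."""
    entries = {}
    for prefix in vlan_interface_prefixes:
        # Normalise "192.168.0.1/21" to the subnet "192.168.0.0/21".
        subnet = ipaddress.ip_interface(prefix).network
        tunnel = "IPINIP_SUBNET" if subnet.version == 4 else "IPINIP_V6_SUBNET"
        key = f"TUNNEL_DECAP_TERM_TABLE:{tunnel}:{subnet}"
        entries[key] = {"term_type": "MP2MP", "subnet_type": "vlan"}
    return entries


for key, fields in build_decap_terms(["192.168.0.1/21", "fc02:1000::1/64"]).items():
    print(key, fields)
```
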
#### 6.4.2 Dual-ToR Considerations

For Dual-ToR, both ToRs are configured with the same VLAN setup, so both have the same decap rules. Because downstream traffic from the T1s is ECMPed to either ToR, IPinIP packets from Netscan destined to VLAN subnet IPs can be received, decapsulated, and forwarded back to the Netscan sender by either ToR.

#### 6.4.3 Netscan VLAN Subnet Probing

![VLAN subnet probing](./image/vlan_subnet_probing.png)

### 6.5 Subnet Decap Configuration Workflow

TunnelDecapOrch subscribes to the SUBNET_DECAP table and reacts to subnet decap configuration changes. Currently, only changes to the source IP prefix and source IPv6 prefix are supported; TunnelDecapOrch updates the source IP of the decapsulation termination entries according to the configuration change.

The following picture describes the workflow:

![subnet decap config update](./image/subnet_decap_config_update.png)

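
A rough model of this update path is sketched below, assuming the orchestration agent keeps one source prefix per subnet decap term. The class, its methods, and the in-memory store are hypothetical stand-ins for TunnelDecapOrch state, not its actual implementation.

```python
class DecapTermStore:
    """Toy in-memory stand-in for the subnet decap terms an orch agent tracks."""

    # Which SUBNET_DECAP field feeds which tunnel's term entries.
    SRC_FIELD_BY_TUNNEL = {
        "IPINIP_SUBNET": "src_ip",
        "IPINIP_V6_SUBNET": "src_ip_v6",
    }

    def __init__(self):
        # (tunnel_name, dst_prefix) -> src_prefix
        self.terms = {}

    def add_term(self, tunnel: str, dst_prefix: str, src_prefix: str) -> None:
        self.terms[(tunnel, dst_prefix)] = src_prefix

    def on_subnet_decap_update(self, fields: dict) -> list:
        """Apply new source prefixes; return the terms that were changed."""
        updated = []
        for (tunnel, dst_prefix), old_src in list(self.terms.items()):
            field = self.SRC_FIELD_BY_TUNNEL.get(tunnel)
            new_src = fields.get(field) if field else None
            if new_src is not None and new_src != old_src:
                self.terms[(tunnel, dst_prefix)] = new_src
                updated.append((tunnel, dst_prefix))
        return updated


store = DecapTermStore()
store.add_term("IPINIP_SUBNET", "10.10.10.0/24", "20.20.20.0/24")
store.add_term("IPINIP_V6_SUBNET", "fc02:1000::/64", "fc01::/64")
print(store.on_subnet_decap_update({"src_ip": "30.30.30.0/24"}))
```

Only the IPv4 term is touched in this example, because the update carries no new `src_ip_v6` value.
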
### 6.6 CLI

* `show tunnel brief`: lists out the tunnels created.

```
# show tunnel
Tunnel Name       Type    Dscp Mode    ECN Mode         TTL Mode
----------------  ------  -----------  ---------------  ----------
IPINIP_TUNNEL     IPINIP  uniform      copy_from_outer  pipe
IPINIP_V6_TUNNEL  IPINIP  uniform      copy_from_outer  pipe
IPINIP_SUBNET     IPINIP  uniform      copy_from_outer  pipe
IPINIP_V6_SUBNET  IPINIP  uniform      copy_from_outer  pipe
```

* `show tunnel decap`: lists out the tunnel decap terms created.

```
Dst IP         Src IP         Tunnel Name    Decap Term Type
-------------  -------------  -------------  -----------------
192.168.0.1    N/A            IPINIP_TUNNEL  P2MP
10.10.10.0/24  20.20.20.0/24  IPINIP_SUBNET  MP2MP
```

## 7 Warm Reboot Support

Currently, SONiC doesn't load `ipinip.json` after warm-reboot. Because this design introduces two new subnet decap tunnels (`IPINIP_SUBNET` and `IPINIP_V6_SUBNET`), `swssconfig.sh` shall be enhanced to write only those two extra tunnel entries from `ipinip.json` to APPL_DB TUNNEL_DECAP_TABLE after warm-reboot, without making duplicate writes for the existing tunnels.

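
The filtering step can be sketched as follows. `swssconfig.sh` is a shell script, so this Python version only illustrates the selection logic; it assumes `ipinip.json` is a JSON list of objects that each hold one `TABLE:key` field plus an `OP` field, and the function name and sample content are hypothetical.

```python
import json

# The two tunnel entries introduced by this design.
SUBNET_TUNNEL_KEYS = {
    "TUNNEL_DECAP_TABLE:IPINIP_SUBNET",
    "TUNNEL_DECAP_TABLE:IPINIP_V6_SUBNET",
}


def filter_subnet_tunnels(ipinip_json_text: str) -> list:
    """Keep only the subnet decap tunnel entries from an ipinip.json payload."""
    kept = []
    for entry in json.loads(ipinip_json_text):
        # Assumption: every key other than "OP" is a "TABLE:key" identifier.
        if any(key in SUBNET_TUNNEL_KEYS for key in entry if key != "OP"):
            kept.append(entry)
    return kept


# Illustrative payload, not the real template output.
sample = json.dumps([
    {"TUNNEL_DECAP_TABLE:IPINIP_TUNNEL": {"tunnel_type": "IPINIP"}, "OP": "SET"},
    {"TUNNEL_DECAP_TABLE:IPINIP_SUBNET": {"tunnel_type": "IPINIP"}, "OP": "SET"},
    {"TUNNEL_DECAP_TABLE:IPINIP_V6_SUBNET": {"tunnel_type": "IPINIP"}, "OP": "SET"},
])

for entry in filter_subnet_tunnels(sample):
    print(entry)
```

Entries for the existing tunnels are dropped, so they would not be rewritten after warm-reboot.
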
## 8 Test Plan

The test plan will be added later based on the requirement.
