
Commit 30e5588

Merge pull request #887 from iqiyi/devel
merge v1.9.4 to master
2 parents b524e36 + 2298b0c commit 30e5588


56 files changed, +4575 -4537 lines changed

conf/dpvs.bond.conf.sample

Lines changed: 2 additions & 1 deletion
@@ -15,7 +15,8 @@ global_defs {
 log_level WARNING
 ! log_file /var/log/dpvs.log
 ! log_async_mode off
-! pdump off
+! kni on
+! pdump off
 }
 
 ! netif config

conf/dpvs.conf.items

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ global_defs {
 <init> log_async_mode off <off, on|off>
 <init> log_async_pool_size 16383 <16383, 1023-unlimited>
 <init> pdump off <off, on|off>
+<init> kni on <on, on|off>
 }
 
 ! netif config

conf/dpvs.conf.sample

Lines changed: 2 additions & 1 deletion
@@ -15,7 +15,8 @@ global_defs {
 log_level WARNING
 ! log_file /var/log/dpvs.log
 ! log_async_mode on
-! pdump off
+! kni on
+! pdump off
 }
 
 ! netif config

conf/dpvs.conf.single-bond.sample

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ global_defs {
 log_level WARNING
 ! log_file /var/log/dpvs.log
 ! log_async_mode on
+! kni on
 }
 
 ! netif config

conf/dpvs.conf.single-nic.sample

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ global_defs {
 log_level WARNING
 ! log_file /var/log/dpvs.log
 ! log_async_mode on
+! kni on
 }
 
 ! netif config

doc/tutorial.md

Lines changed: 107 additions & 1 deletion
@@ -22,6 +22,7 @@ DPVS Tutorial
 * [UDP Option of Address (UOA)](#uoa)
 * [Launch DPVS in Virtual Machine (Ubuntu)](#Ubuntu16.04)
 * [Traffic Control(TC)](#tc)
+* [Multiple Instances](#multi-instance)
 * [Debug DPVS](#debug)
 - [Debug with Log](#debug-with-log)
 - [Packet Capture and Tcpdump](#packet-capture)
@@ -1191,6 +1192,111 @@ worker_defs {
 
 Please refer to doc [tc.md](tc.md).
 
+<a id='multi-instance'/>
+
+# Multiple Instances
+
+Generally, DPVS runs as a network process on a physical server, which is usually equipped with dozens of CPUs and ample memory. DPVS is CPU- and memory-efficient, so these resources are usually far from fully used on a typical server. We may therefore want to run multiple independent DPVS instances on one server to make the most of it. A DPVS instance may use 1 to 4 NIC ports, depending on whether the ports are bonded and whether the network topology is two-arm or one-arm. Extra NICs are needed to run multiple DPVS instances because each NIC port must be managed by only one DPVS instance. The details of running multiple DPVS instances are described below.
+
+#### CPU Isolation
+
+The CPUs used by DPVS always run in a busy loop. If a CPU is assigned to two DPVS instances simultaneously, both instances suffer dramatic processing delays. Different instances must therefore run on different CPUs, which is achieved by the steps below.
+
+- Start DPVS with the EAL option `-l CORELIST`, `--lcores COREMAP`, or `-c COREMASK` to specify the CPUs on which the instance runs.
+- Configure the corresponding CPUs in the DPVS config file (config key: worker_defs/worker */cpu_id).
+
+On NUMA-aware platforms, it is recommended to select CPUs and NIC ports on the same NUMA node. Performance degrades if the NIC ports and CPUs of a DPVS instance are on different NUMA nodes.
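The requirement that instances run on disjoint CPUs can be checked mechanically before launch. The sketch below is a hypothetical helper, not part of DPVS or DPDK: it compares two hexadecimal core masks of the kind passed to the EAL option `-c COREMASK` and reports whether they share a CPU. It assumes the masks fit in 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hex core mask such as "0x1ff", as given to
 * the EAL option -c COREMASK. Assumes at most 64 CPUs. */
static uint64_t example_parse_coremask(const char *mask)
{
    assert(mask != NULL);
    /* Base 16 accepts an optional "0x" prefix. */
    return (uint64_t)strtoull(mask, NULL, 16);
}

/* Returns true when the two instances would share at least one CPU. */
static bool example_coremasks_overlap(const char *mask_a, const char *mask_b)
{
    return (example_parse_coremask(mask_a) & example_parse_coremask(mask_b)) != 0;
}
```

For instance, cores 0-8 correspond to mask `0x1ff` and cores 12-20 to `0x1ff000`; the helper reports no overlap for that pair.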
+
+#### Memory Isolation
+
+DPVS uses hugepage memory. The hugepage memory of different DPVS instances can be isolated by using different memory mapping files. The DPDK EAL option `--file-prefix` specifies the name prefix of the memory mapping files, so multiple DPVS instances can run simultaneously if each is given a unique prefix with this option.
+
+#### Process Isolation
+
+* DPVS Process Isolation
+
+Every DPVS instance must have a unique PID file, config file, and IPC socket file, which are specified by the following DPVS options, respectively.
+
+-p, --pid-file FILE
+-c, --conf FILE
+-x, --ipc-file FILE
+
+For example,
+
+```sh
+./bin/dpvs -c /etc/dpvs1.conf -p /var/run/dpvs1.pid -x /var/run/dpvs1.ipc -- --file-prefix=dpvs1 -a 0000:4b:00.0 -a 0000:4b:00.1 -l 0-8 --main-lcore 0
+```
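One way to keep the per-instance files unique is to derive all of them from a single instance number. The sketch below is illustrative only: the struct and function names are hypothetical, and the path patterns simply follow the naming used in the example command above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the per-instance files named in the text. */
struct example_instance_files {
    char conf[64];        /* -c, --conf FILE */
    char pid[64];         /* -p, --pid-file FILE */
    char ipc[64];         /* -x, --ipc-file FILE */
    char file_prefix[32]; /* EAL --file-prefix */
};

/* Fill in paths for instance `id`, following the dpvs1/dpvs2 naming pattern. */
static void example_instance_files_init(struct example_instance_files *f,
                                        unsigned int id)
{
    assert(f != NULL);
    snprintf(f->conf, sizeof(f->conf), "/etc/dpvs%u.conf", id);
    snprintf(f->pid, sizeof(f->pid), "/var/run/dpvs%u.pid", id);
    snprintf(f->ipc, sizeof(f->ipc), "/var/run/dpvs%u.ipc", id);
    snprintf(f->file_prefix, sizeof(f->file_prefix), "dpvs%u", id);
}

/* True when two instances share none of their files or prefixes. */
static bool example_instance_files_distinct(const struct example_instance_files *a,
                                            const struct example_instance_files *b)
{
    return strcmp(a->conf, b->conf) != 0 &&
           strcmp(a->pid, b->pid) != 0 &&
           strcmp(a->ipc, b->ipc) != 0 &&
           strcmp(a->file_prefix, b->file_prefix) != 0;
}
```

Deriving every path from one number makes it hard to reuse a PID or IPC file between instances by accident.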
+
+* Keepalived Process Isolation
+
+One DPVS instance corresponds to one keepalived instance, and vice versa. Similarly, different keepalived instances must have unique config files and PID files. Note that, depending on the configuration, keepalived for DPVS may consist of 3 daemon processes, i.e., the main process, the health check subprocess, and the vrrp subprocess. The config files and PID files for different keepalived instances can be specified by the following options.
+
+-f, --use-file=FILE
+-p, --pid=FILE
+-c, --checkers_pid=FILE
+-r, --vrrp_pid=FILE
+
+For example,
+
+```sh
+./bin/keepalived -D -f etc/keepalived/keepalived1.conf --pid=/var/run/keepalived1.pid --vrrp_pid=/var/run/vrrp1.pid --checkers_pid=/var/run/checkers1.pid
+```
+
+#### Talk to different DPVS instances with dpip/ipvsadm
+
+`dpip` and `ipvsadm` are the utility tools used to configure DPVS. By default, they work on a server running a single DPVS instance without any extra settings. On a server running multiple DPVS instances, however, the environment variable `DPVS_IPC_FILE` should be set to the IPC socket file of the DPVS instance that ipvsadm/dpip is to talk to. Refer to the previous part "DPVS Process Isolation" for how to specify different IPC socket files for multiple DPVS instances. For example,
+
+```sh
+DPVS_IPC_FILE=/var/run/dpvs1.ipc ipvsadm -ln
+# or equivalently,
+export DPVS_IPC_FILE=/var/run/dpvs1.ipc
+ipvsadm -ln
+```
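A client tool could choose its IPC socket along the following lines. This is a minimal sketch of the lookup described above, not the actual dpip/ipvsadm code: the function name is hypothetical, and the fallback path is supplied by the caller because the default socket path is not stated in this text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prefer the DPVS_IPC_FILE environment variable and
 * fall back to a caller-supplied default when it is unset or empty. */
static const char *example_resolve_ipc_file(const char *fallback)
{
    const char *path;

    assert(fallback != NULL);
    path = getenv("DPVS_IPC_FILE");
    if (path != NULL && strlen(path) > 0)
        return path;
    return fallback;
}
```

With `DPVS_IPC_FILE=/var/run/dpvs1.ipc` in the environment, the helper returns that path; otherwise it returns whatever default the caller passed in.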
+
+#### NIC Ports, KNI and Routes
+
+The multiple DPVS instances running on a server are independent. That is, DPVS adopts the deployment model [Running Multiple Independent DPDK Applications](https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#running-multiple-independent-dpdk-applications), which requires that the instances share no NIC ports. We can use the EAL options "-a, --allow" or "-b, --block" to allow or block NIC ports for a DPVS instance. However, the Linux KNI kernel module supports only one DPVS instance in a given network namespace (refer to [kernel/linux/kni/kni_misc.c](https://github.com/DPDK/dpdk/tree/main/kernel/linux/kni)). DPVS provides two solutions to this problem.
+
+* Solution 1: Disable KNI on all DPVS instances except the first one. A global config item `kni` has been added to DPVS for this purpose.
+
+```
+# dpvs.conf
+global_defs {
+...
+<init> kni on <default on, on|off>
+...
+}
+```
+
+* Solution 2: Run DPVS instances in different network namespaces. This also resolves the route conflicts among the KNI network ports of different DPVS instances. A typical procedure to run a DPVS instance in a network namespace is shown below.
+
+Firstly, create a new network namespace, "dpvsns" for example.
+
+```sh
+/usr/sbin/ip netns add dpvsns
+```
+Secondly, move the NIC ports for this DPVS instance to the newly created network namespace.
+
+```sh
+/usr/sbin/ip link set eth1 netns dpvsns
+/usr/sbin/ip link set eth2 netns dpvsns
+/usr/sbin/ip link set eth3 netns dpvsns
+```
+Lastly, start DPVS and all its related processes (such as keepalived and the routing daemon) in the network namespace.
+
+```sh
+/usr/sbin/ip netns exec dpvsns ./bin/dpvs -c /etc/dpvs2.conf -p /var/run/dpvs2.pid -x /var/run/dpvs2.ipc -- --file-prefix=dpvs2 -a 0000:cb:00.1 -a 0000:ca:00.0 -a 0000:ca:00.1 -l 12-20 --main-lcore 12
+/usr/sbin/ip netns exec dpvsns ./bin/keepalived -D --pid=/var/run/keepalived2.pid --vrrp_pid=/var/run/vrrp2.pid --checkers_pid=/var/run/checkers2.pid -f etc/keepalived/keepalived2.conf
+/usr/sbin/ip netns exec dpvsns /usr/sbin/bird -f -c /etc/bird2.conf -s /var/run/bird2/bird.ctl
+...
+```
+
+For better performance, we can enable the multiple kthread mode when multiple DPVS instances are deployed on a server. In this mode, each KNI port is processed by a dedicated kthread rather than a shared kthread.
+
+```sh
+insmod rte_kni.ko kthread_mode=multiple carrier=on
+```
+
 <a id='debug'/>
 
 # Debug DPVS
@@ -1361,7 +1467,7 @@ $
 
 ### dpdk-pdump
 
-The `dpdk-pdump` runs as a DPDK secondary process and is capable of enabling packet capture on dpdk ports. DPVS works as the primary process for dpdk-pdump, which shoud enable the packet capture framework by setting `global_defs/pdump` to be `on` in `/etc/dpvs.conf` when DPVS starts up.
+The `dpdk-pdump` runs as a DPDK secondary process and is capable of enabling packet capture on dpdk ports. DPVS works as the primary process for dpdk-pdump, which should enable the packet capture framework by setting `global_defs/pdump` to be `on` in `/etc/dpvs.conf` when DPVS starts up.
 
 Refer to [dpdk-pdump doc](https://doc.dpdk.org/guides/tools/pdump.html) for its usage. DPVS extends dpdk-pdump with a [DPDK patch](../patch/dpdk-stable-18.11.2/0005-enable-pdump-and-change-dpdk-pdump-tool-for-dpvs.patch) to add some packet filtering features. Run `dpdk-pdump -- --help` to find all supported pdump params.
 

include/conf/blklst.h

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ struct dp_vs_blklst_entry {
     union inet_addr addr;
 };
 
-struct dp_vs_blklst_conf {
+typedef struct dp_vs_blklst_conf {
     /* identify service */
     int af;
     uint8_t proto;
@@ -39,7 +39,7 @@ struct dp_vs_blklst_conf {
 
     /* for set */
     union inet_addr blklst;
-};
+} dpvs_blklst_t;
 
 struct dp_vs_blklst_conf_array {
     int naddr;

include/conf/dest.h

Lines changed: 34 additions & 48 deletions
@@ -19,6 +19,7 @@
 #define __DPVS_DEST_CONF_H__
 
 #include "conf/service.h"
+#include "conf/match.h"
 #include "conf/conn.h"
 
 /*
@@ -40,72 +41,57 @@ enum {
     DPVS_DEST_F_OVERLOAD = 0x1<<1,
 };
 
-struct dp_vs_dest_conf {
+typedef struct dp_vs_dest_compat {
     /* destination server address */
     int af;
-    union inet_addr addr;
     uint16_t port;
+    uint16_t proto;
+    uint32_t weight; /* destination weight */
+    union inet_addr addr;
+
+    unsigned conn_flags; /* connection flags */
 
     enum dpvs_fwd_mode fwdmode;
     /* real server options */
-    unsigned conn_flags; /* connection flags */
-    int weight; /* destination weight */
 
     /* thresholds for active connections */
     uint32_t max_conn; /* upper threshold */
     uint32_t min_conn; /* lower threshold */
-};
 
-struct dp_vs_dest_entry {
-    int af;
-    union inet_addr addr; /* destination address */
-    uint16_t port;
-    unsigned conn_flags; /* connection flags */
-    int weight; /* destination weight */
-
-    uint32_t max_conn; /* upper threshold */
-    uint32_t min_conn; /* lower threshold */
-
-    uint32_t actconns; /* active connections */
-    uint32_t inactconns; /* inactive connections */
-    uint32_t persistconns; /* persistent connections */
+    uint32_t actconns; /* active connections */
+    uint32_t inactconns; /* inactive connections */
+    uint32_t persistconns; /* persistent connections */
 
     /* statistics */
     struct dp_vs_stats stats;
-};
-
-struct dp_vs_get_dests {
-    /* which service: user fills in these */
-    int af;
-    uint16_t proto;
-    union inet_addr addr; /* virtual address */
-    uint16_t port;
-    uint32_t fwmark; /* firwall mark of service */
+#ifdef _HAVE_IPVS_TUN_TYPE_
+    int tun_type;
+    int tun_port;
+#ifdef _HAVE_IPVS_TUN_CSUM_
+    int tun_flags;
+#endif
+#endif
+} dpvs_dest_compat_t;
+
+typedef struct dp_vs_dest_table {
+    int af;
+    uint16_t proto;
+    uint16_t port;
+    uint32_t fwmark;
+    union inet_addr addr;
 
-    /* number of real servers */
-    unsigned int num_dests;
+    unsigned int num_dests;
 
-    lcoreid_t cid;
+    struct dp_vs_match match;
 
-    char srange[256];
-    char drange[256];
-    char iifname[IFNAMSIZ];
-    char oifname[IFNAMSIZ];
+    lcoreid_t cid;
+    lcoreid_t index;
 
-    /* the real servers */
-    struct dp_vs_dest_entry entrytable[0];
-};
+    dpvs_dest_compat_t entrytable[0];
+} dpvs_dest_table_t;
 
-struct dp_vs_dest_user {
-    int af;
-    union inet_addr addr;
-    uint16_t port;
-
-    unsigned conn_flags;
-    int weight;
-
-    uint32_t max_conn;
-    uint32_t min_conn;
-};
+#define dp_vs_get_dests dp_vs_dest_table
+#define dp_vs_dest_entry dp_vs_dest_compat
+#define dp_vs_dest_conf dp_vs_dest_compat
 
 #endif /* __DPVS_DEST_CONF_H__ */
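
The new header combines two patterns: a table struct that ends in a variable-length `entrytable`, and macros that keep the old struct tags compiling. The sketch below shows both with simplified stand-in types. The `example_*` names are hypothetical and only a few fields are kept. It uses a C99 flexible array member (`[]`) where the header uses the zero-length form `[0]`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for dpvs_dest_compat_t; most fields omitted. */
typedef struct example_dest_compat {
    uint16_t port;
    uint32_t weight;
} example_dest_compat_t;

/* Simplified stand-in for dpvs_dest_table_t: header plus trailing entries. */
typedef struct example_dest_table {
    unsigned int num_dests;
    example_dest_compat_t entrytable[]; /* C99 flexible array member */
} example_dest_table_t;

/* Old struct tag mapped onto the new one, in the style of the header's
 * "#define dp_vs_get_dests dp_vs_dest_table". */
#define example_get_dests example_dest_table

/* Allocate a zeroed table with room for num_dests trailing entries. */
static example_dest_table_t *example_dest_table_alloc(unsigned int num_dests)
{
    size_t size = sizeof(example_dest_table_t)
                + (size_t)num_dests * sizeof(example_dest_compat_t);
    example_dest_table_t *table = calloc(1, size);

    if (table != NULL)
        table->num_dests = num_dests;
    return table;
}
```

Because of the macro, code written against `struct example_get_dests` refers to the same type as `example_dest_table_t`, so existing callers keep compiling after the rename.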

include/conf/inet.h

Lines changed: 1 addition & 0 deletions
@@ -159,6 +159,7 @@ static inline int inet_addr_range_parse(const char *param,
         port1 = port2 = NULL;
     }
 
+    *af = 0;
     memset(range, 0, sizeof(*range));
 
     if (strlen(ip1) && inet_pton(AF_INET6, ip1, &range->min_addr.in6) > 0) {
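
The added `*af = 0;` gives the output parameter a defined value before any parsing branch runs. The sketch below is a reduced, hypothetical parser rather than the real `inet_addr_range_parse`. It detects only the address family of one string, and its return convention (0 on success, -1 otherwise) is an assumption.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical reduced parser: reports the address family of `ip` in *af. */
static int example_parse_family(const char *ip, int *af)
{
    unsigned char buf[sizeof(struct in6_addr)];

    assert(ip != NULL && af != NULL);

    /* Reset the out-parameter first, so the caller never reads a stale or
     * uninitialized value when no branch below matches. */
    *af = 0;

    if (strlen(ip) && inet_pton(AF_INET6, ip, buf) > 0)
        *af = AF_INET6;
    else if (strlen(ip) && inet_pton(AF_INET, ip, buf) > 0)
        *af = AF_INET;

    return *af != 0 ? 0 : -1;
}
```

Without the reset, a caller that passes an uninitialized `int af` and an empty address string would read an indeterminate value afterwards.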

include/conf/laddr.h

Lines changed: 6 additions & 6 deletions
@@ -24,6 +24,7 @@
 
 #include "inet.h"
 #include "net/if.h"
+#include "conf/match.h"
 #include "conf/sockopts.h"
 
 struct dp_vs_laddr_entry {
@@ -33,18 +34,17 @@ struct dp_vs_laddr_entry {
     uint32_t nconns;
 };
 
-struct dp_vs_laddr_conf {
+typedef struct dp_vs_laddr_conf {
     /* identify service */
     int af_s;
     uint8_t proto;
     union inet_addr vaddr;
     uint16_t vport;
     uint32_t fwmark;
-    char srange[256];
-    char drange[256];
-    char iifname[IFNAMSIZ];
-    char oifname[IFNAMSIZ];
+
+    struct dp_vs_match match;
     lcoreid_t cid;
+    lcoreid_t index;
 
     /* for set */
     int af_l;
@@ -54,6 +54,6 @@ struct dp_vs_laddr_conf {
     /* for get */
     int nladdrs;
     struct dp_vs_laddr_entry laddrs[0];
-};
+} dpvs_laddr_table_t;
 
 #endif /* __DPVS_LADDR_CONF_H__ */
