Currently, pods on an advertised UDN are able to talk to
any service in the default network in local gateway mode.
The expected behaviour is that advertised UDN pods should
only be able to reach the kapi and dns services in the
default network; all other default network services
should not be reachable.
The solution adopted is to change all the relevant flows
on br-ex that were using the masIP subnet for services
isolation to use the pod subnets instead.
Caveat: if a user advertises overlapping podSubnets,
things will fall apart; the assumption is that users
can't do that.
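The caveat above can be checked mechanically. The following is a minimal sketch, not part of this change; the function name and the input shape (network name mapped to a list of pod subnets) are assumptions made for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlapping_subnets(subnets_by_network):
    """Return pairs of network names whose pod subnets overlap.

    subnets_by_network is a hypothetical mapping such as
    {"blue": ["103.103.0.0/16"]}.
    """
    parsed = {
        name: [ipaddress.ip_network(s) for s in subnets]
        for name, subnets in subnets_by_network.items()
    }
    conflicts = []
    for (a, nets_a), (b, nets_b) in combinations(parsed.items(), 2):
        # Only compare subnets of the same IP family.
        if any(
            x.version == y.version and x.overlaps(y)
            for x in nets_a
            for y in nets_b
        ):
            conflicts.append((a, b))
    return conflicts
```

With per-subnet flows on br-ex, two advertised UDNs whose subnets overlap would match the same nw_src, which is why the assumption matters.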
Flow details (post fix; I don't want to call out the
current behaviour in detail to avoid confusion. In short, the
current behaviour was using the default network flows for
UDN pods, which was wrong):
Onward packet
1) cookie=0xdeff105, duration=1472.742s, table=0, n_packets=9, n_bytes=666,
priority=550,ip,in_port=LOCAL,nw_src=103.103.0.0/16,nw_dst=10.96.0.0/16 actions=ct(commit,table=2,zone=64001)
this flow mirrors the one for UDNs that has the masSubnet as the src; here the src is the pod subnet
2) cookie=0xdeff105, duration=1472.742s, table=2, n_packets=9, n_bytes=666,
priority=200,ip,nw_src=103.103.0.0/16 actions=mod_dl_dst:02:42:ac:12:00:03,output:3
this flow also mirrors the one for UDNs that has the masIP as the src; here the src is the pod subnet
So, in the above way, any traffic from any advertised UDN towards the clusterIP
range will always be sent into the patch port of that respective UDN.
This guarantees isolation from services belonging to other UDNs,
since that given UDN won't have any LBs for those services.
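As an illustration of how the two onward flows could be derived per advertised UDN, here is a rough sketch that renders them as strings in ovs-ofctl add-flow style. The helper name and its parameters are hypothetical; the priorities, cookie, and ct zone are copied from the dump above, and the real implementation in the code base may build flows differently.

```python
def onward_flows(pod_subnet, service_cidr, dst_mac, patch_port,
                 ct_zone=64001, cookie="0xdeff105"):
    """Build the table 0 and table 2 onward flows for one advertised UDN."""
    # Table 0: traffic from the host (LOCAL) with a UDN pod source going to
    # the service CIDR is committed to conntrack and sent to table 2.
    table0 = (
        f"cookie={cookie},table=0,priority=550,ip,in_port=LOCAL,"
        f"nw_src={pod_subnet},nw_dst={service_cidr},"
        f"actions=ct(commit,table=2,zone={ct_zone})"
    )
    # Table 2: anything sourced from the UDN pod subnet is steered to that
    # UDN's patch port.
    table2 = (
        f"cookie={cookie},table=2,priority=200,ip,nw_src={pod_subnet},"
        f"actions=mod_dl_dst:{dst_mac},output:{patch_port}"
    )
    return [table0, table2]


flows = onward_flows("103.103.0.0/16", "10.96.0.0/16", "02:42:ac:12:00:03", 3)
```

Keying both flows on the pod subnet is what ties a packet to its own UDN patch port.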
For KAPI/DNS services specifically, we add:
cookie=0xdeff105, duration=2319.685s, table=2, n_packets=496, n_bytes=67111, priority=300,
ip,nw_dst=10.96.0.1 actions=mod_dl_dst:02:42:ac:12:00:03,output:"patch-breth0_ov"
cookie=0xdeff105, duration=2319.685s, table=2, n_packets=0, n_bytes=0, priority=300,
ip,nw_dst=10.96.0.10 actions=mod_dl_dst:02:42:ac:12:00:03,output:"patch-breth0_ov"
these two generic flows work for both advertised and non-advertised networks
and send the packet into the default network patch port
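To see why the priority=300 flows win over the priority=200 per-UDN flow, here is a simplified model of the table 2 lookup. It is only a sketch: the tuple layout and function name are made up for illustration, and real OVS matching involves more fields than a single src or dst prefix.

```python
import ipaddress

# (priority, matched field, CIDR, output port), taken from the flows above.
TABLE2 = [
    (300, "dst", "10.96.0.1/32", "patch-breth0_ov"),
    (300, "dst", "10.96.0.10/32", "patch-breth0_ov"),
    (200, "src", "103.103.0.0/16", "3"),
]


def table2_output(src, dst, flows=TABLE2):
    """Return the output port of the highest-priority matching flow, or None."""
    pkt = {"src": ipaddress.ip_address(src), "dst": ipaddress.ip_address(dst)}
    for _prio, field, cidr, port in sorted(flows, key=lambda f: -f[0]):
        if pkt[field] in ipaddress.ip_network(cidr):
            return port
    return None
```

In this model, a UDN pod at 103.103.0.5 reaches 10.96.0.1 through the default network patch port, while traffic to any other clusterIP such as 10.96.164.25 goes to port 3, the UDN patch port in the dump above.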
Return packet
3) There is no need to handle return packets correctly unless they are for
kapi and/or dns
for UDNs, I see this flow:
cookie=0xdeff105, duration=684.087s, table=0, n_packets=0, n_bytes=0,
idle_age=684, priority=500,ip,in_port=2,nw_src=10.96.0.0/16,nw_dst=169.254.0.0/17 actions=ct(table=3,zone=64001,nat)
cookie=0xdeff105, duration=684.050s, table=0, n_packets=0, n_bytes=0,
idle_age=684, priority=500,ip,in_port=4,nw_src=10.96.0.0/16,nw_dst=169.254.0.0/17 actions=ct(table=3,zone=64001,nat)
but for BGP I add these two flows:
cookie=0xdeff105, duration=264.196s, table=0, n_packets=0, n_bytes=0, priority=490,
in_port="patch-breth0_ov",nw_src=10.96.0.10,actions=ct(table=3,zone=64001,nat)
cookie=0xdeff105, duration=264.196s, table=0, n_packets=0, n_bytes=0, priority=490,
in_port="patch-breth0_ov",nw_src=10.96.0.1,actions=ct(table=3,zone=64001,nat)
these are generic enough to match all BGP networks
There is scope for scale improvement by combining the onward packet flows
at tables 0 and 2 into one entity for onward traffic, but maybe that could also be done for UDNs?
There is now isolation established between UDN pods and the default network
because, except for kapi and dns, all other traffic matching clusterIP is sent
to the respective UDN patch ports, where it's black-holed (in L3 it goes
back and forth from breth0 to GR and back, where it's dropped using the 105 flow).
However, in L2 we do see the expected bad behaviour where the packet loops from
management port -> breth0 -> GR -> management port -> breth0 and so on,
which is a never-ending loop
If we decide to ban the circular looping and optimize the flow
using a proper drop rule:
inport=LOCAL, srcIP=UDNSubnet, dstIP=serviceSubnet -> 1 per advertised UDN “DROP”
that can be considered a future improvement here
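A rough sketch of what that future improvement might look like, one drop flow per advertised UDN. This is not implemented by this change. The helper name is hypothetical and the priority is deliberately left as a required parameter because it has not been decided; it would presumably need to sit above the priority=550 table 0 flow, and kapi/dns traffic would need to be exempted separately, which is not shown here.

```python
def proposed_drop_flows(udn_subnets, service_cidr, priority,
                        cookie="0xdeff105"):
    """Build one hypothetical table 0 drop flow per advertised UDN subnet."""
    return [
        f"cookie={cookie},table=0,priority={priority},ip,in_port=LOCAL,"
        f"nw_src={subnet},nw_dst={service_cidr},actions=drop"
        for subnet in udn_subnets
    ]


# 104.104.0.0/16 is a made-up second UDN subnet, used only for illustration.
drops = proposed_drop_flows(["103.103.0.0/16", "104.104.0.0/16"],
                            "10.96.0.0/16", priority=560)
```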
dump:
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
20:13:18.881036 33dfd1008b804_3 P ifindex 15 0a:58:67:67:00:05 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 64, id 25988, offset 0, flags [DF], proto TCP (6), length 60)
103.103.0.5.36363 > 10.96.164.25.80: Flags [S], cksum 0x1614 (incorrect -> 0x5b5d), seq 2541324161, win 65280, options [mss 1360,sackOK,TS val 984084514 ecr 0,nop,wscale 7], length 0
20:13:18.881626 ovn-k8s-mp1 In ifindex 8 0a:58:64:41:00:03 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 63, id 25988, offset 0, flags [DF], proto TCP (6), length 60)
103.103.0.5.36363 > 10.96.164.25.80: Flags [S], cksum 0x5b5d (correct), seq 2541324161, win 65280, options [mss 1360,sackOK,TS val 984084514 ecr 0,nop,wscale 7], length 0
20:13:18.881651 breth0 Out ifindex 4 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 62, id 25988, offset 0, flags [DF], proto TCP (6), length 60)
103.103.0.5.36363 > 10.96.164.25.80: Flags [S], cksum 0x5b5d (correct), seq 2541324161, win 65280, options [mss 1360,sackOK,TS val 984084514 ecr 0,nop,wscale 7], length 0
20:13:18.882283 ovn-k8s-mp1 In ifindex 8 0a:58:64:41:00:03 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 61, id 25988, offset 0, flags [DF], proto TCP (6), length 60)
103.103.0.5.36363 > 10.96.164.25.80: Flags [S], cksum 0x5b5d (correct), seq 2541324161, win 65280, options [mss 1360,sackOK,TS val 984084514 ecr 0,nop,wscale 7], length 0
20:13:18.882298 breth0 Out ifindex 4 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 60, id 25988, offset 0, flags [DF], proto TCP (6), length 60)
103.103.0.5.36363 > 10.96.164.25.80: Flags [S], cksum 0x5b5d (correct), seq 2541324161, win 65280, options [mss 1360,sackOK,TS val 984084514 ecr 0,nop,wscale 7], length 0
20:13:18.882402 ovn-k8s-mp1 In ifindex 8 0a:58:64:41:00:03 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 59, id 25988, offset 0, flags [DF], proto TCP (6), length 60)
103.103.0.5.36363 > 10.96.164.25.80: Flags [S], cksum 0x5b5d (correct), seq 2541324161, win 65280, options [mss 1360,sackOK,TS val 984084514 ecr 0,nop,wscale 7], length 0
20:13:18.882414 breth0 Out ifindex 4 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 58, id 25988, offset 0, flags [DF], proto TCP (6), length 60)
103.103.0.5.36363 > 10.96.164.25.80: Flags [S], cksum 0x5b5d (correct), seq 2541324161, win 65280, options [mss 1360,sackOK,TS val 984084514 ecr 0,nop,wscale 7], length
$ oc rsh -n blue blue3
/ # curl --local-port 36363 10.96.164.25:80
curl: (7) Failed to connect to 10.96.164.25 port 80 after 3 ms: Host is unreachable
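The loop is visible in the capture as the same SYN (id 25988) showing up repeatedly with the TTL dropping by one on each hop. As a small, hedged aid for spotting this pattern in a saved capture, the sketch below extracts the TTL values from verbose tcpdump text; the function names are illustrative and the heuristic is intentionally crude.

```python
import re


def ttl_sequence(tcpdump_text):
    """Extract TTL values, in order, from verbose tcpdump output."""
    return [int(m) for m in re.findall(r"\bttl (\d+)", tcpdump_text)]


def looks_like_loop(ttls):
    """Heuristic: more than two sightings, each one TTL lower than the last."""
    return len(ttls) > 2 and all(a - b == 1 for a, b in zip(ttls, ttls[1:]))


sample = """
33dfd1008b804_3 P ... (tos 0x0, ttl 64, id 25988, ...
ovn-k8s-mp1 In ... (tos 0x0, ttl 63, id 25988, ...
breth0 Out ... (tos 0x0, ttl 62, id 25988, ...
ovn-k8s-mp1 In ... (tos 0x0, ttl 61, id 25988, ...
"""
print(looks_like_loop(ttl_sequence(sample)))
```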
Signed-off-by: Surya Seetharaman <[email protected]>