Commit 5385cde

Merge pull request kubernetes#2446 from shekhar-rajak/kep_new_template_sig_network
Migrating from old kep to new template: sig-network
2 parents f71048a + 13a4bd1 commit 5385cde

9 files changed, 92 additions and 100 deletions
keps/sig-network/0030-nodelocal-dns-cache.md

Lines changed: 7 additions & 7 deletions
@@ -48,8 +48,8 @@ This proposal aims to improve DNS performance by running a dns caching agent on

 ## Motivation

-* With the current DNS architecture, it is possible that pods with the highest DNS QPS have to reach out to a different node, if there is no local kube-dns instance.
-Having a local cache will help improve the latency in such scenarios.
+* With the current DNS architecture, it is possible that pods with the highest DNS QPS have to reach out to a different node, if there is no local kube-dns instance.
+Having a local cache will help improve the latency in such scenarios.

 * Skipping iptables DNAT and connection tracking will help reduce [conntrack races](https://github.com/kubernetes/kubernetes/issues/56903) and avoid UDP DNS entries filling up conntrack table.

@@ -69,7 +69,7 @@ Having a local cache will help improve the latency in such scenarios.
 * [https://github.com/kubernetes/kubernetes/issues/45363](https://github.com/kubernetes/kubernetes/issues/45363)


-This shows that there is interest in the wider Kubernetes community for a solution similar to the proposal here.
+This shows that there is interest in the wider Kubernetes community for a solution similar to the proposal here.


 ### Goals
@@ -83,7 +83,7 @@ Being able to run a dns caching agent as a Daemonset and get pods to use the loc

 ## Proposal

-A nodeLocal dns cache runs on all cluster nodes. This is managed as an add-on, runs as a Daemonset. All pods using clusterDNS will now talk to the nodeLocal cache, which will query kube-dns in case of cache misses in cluster's configured DNS suffix and for all reverse lookups(in-addr.arpa and ip6.arpa). User-configured stubDomains will be passed on to this local agent.
+A nodeLocal dns cache runs on all cluster nodes. This is managed as an add-on, runs as a Daemonset. All pods using clusterDNS will now talk to the nodeLocal cache, which will query kube-dns in case of cache misses in cluster's configured DNS suffix and for all reverse lookups(in-addr.arpa and ip6.arpa). User-configured stubDomains will be passed on to this local agent.
 The node's resolv.conf will be used by this local agent for all other cache misses. One benefit of doing the non-cluster lookups on the nodes from which they are happening, rather than the kube-dns instances, is better use of per-node DNS resources in cloud. For instance, in a 10-node cluster with 3 kube-dns instances, the 3 nodes running kube-dns will end up resolving all external hostnames and can exhaust QPS quota. Spreading the queries across the 10 nodes will help alleviate this.

 #### Daemonset and Listen Interface for caching agent
@@ -169,9 +169,9 @@ CoreDNS will be the local cache agent in the first release, after considering th

 It is possible to run any program as caching agent by modifying the daemonset and configmap spec. Publishing an image with Unbound DNS can be added as a follow up.

-Based on the prototype/test results, these are the recommended defaults:
+Based on the prototype/test results, these are the recommended defaults:
 CPU request: 50m
-Memory Limit : 25m
+Memory Limit : 25m

 CPU request can be dropped to a smaller value if QPS needs are lower.

@@ -216,7 +216,7 @@ This feature will be launched with Alpha support in the first release. Master ve

 ## Drawbacks [optional]

-Additional resource consumption for the Daemonset might not be necessary for clusters with low DNS QPS needs.
+Additional resource consumption for the Daemonset might not be necessary for clusters with low DNS QPS needs.


 ## Alternatives [optional]

keps/sig-network/0031-20181017-kube-proxy-services-optional.md renamed to keps/sig-network/2447-Make-kube-proxy-service-abstraction-optional/README.md

Lines changed: 0 additions & 19 deletions
@@ -1,22 +1,3 @@
----
-title: Make kube-proxy service abstraction optional
-authors:
-- "@bradhoekstra"
-owning-sig: sig-network
-participating-sigs:
-reviewers:
-- "@freehan"
-approvers:
-- "@thockin"
-editor: "@bradhoekstra"
-creation-date: 2018-10-17
-last-updated: 2018-11-12
-status: provisional
-see-also:
-replaces:
-superseded-by:
----
-
 # Make kube-proxy service abstraction optional

 ## Table of Contents
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+title: Make kube-proxy service abstraction optional
+kep-number: 2447
+authors:
+- "@bradhoekstra"
+owning-sig: sig-network
+participating-sigs:
+reviewers:
+- "@freehan"
+approvers:
+- "@thockin"
+editor: "@bradhoekstra"
+creation-date: 2018-10-17
+last-updated: 2018-11-12
+status: implemented
+see-also:
+replaces:
+superseded-by:

keps/sig-network/20190324-remove-kube-proxy-autocleanup.md renamed to keps/sig-network/2448-Remove-kube-proxy-automatic-clean-up-logic/README.md

Lines changed: 0 additions & 20 deletions
@@ -1,23 +1,3 @@
----
-title: Remove kube-proxy's automatic clean up logic
-authors:
-- "@vllry"
-owning-sig: sig-network
-participating-sigs:
-reviewers:
-- "@andrewsykim"
-approvers:
-- "@bowei"
-- "@thockin"
-editor: TBD
-creation-date: 2018-03-24
-last-updated: 2018-04-02
-status: implementable
-see-also:
-replaces:
-superseded-by:
----
-
 # Remove kube-proxy's automatic clean up logic

 ## Table of Contents
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+title: Remove kube-proxy's automatic clean up logic
+kep-number: 2448
+authors:
+- "@vllry"
+owning-sig: sig-network
+participating-sigs:
+reviewers:
+- "@andrewsykim"
+approvers:
+- "@bowei"
+- "@thockin"
+editor: TBD
+creation-date: 2018-03-24
+last-updated: 2018-04-02
+status: implemented
+see-also:
+replaces:
+superseded-by:

keps/sig-network/20190920-external-dns.md renamed to keps/sig-network/2449-move-externalDNS-out-of-kubernetes-incubator/README.md

Lines changed: 0 additions & 8 deletions
@@ -1,11 +1,3 @@
----
-title: Move ExternalDNS out of Kubernetes incubator
-authors:
-- "@njuettner"
-owning-sig: sig-network
-status: implemented
----
-
 # Move ExternalDNS out of Kubernetes incubator

 <!-- toc -->
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+title: Move ExternalDNS out of Kubernetes incubator
+kep-number: 2449
+authors:
+- "@njuettner"
+owning-sig: sig-network
+status: implemented

keps/sig-network/20191104-iptables-no-cluster-cidr.md renamed to keps/sig-network/2450-Remove-knowledge-of-pod-cluster-CIDR-from-iptables-rules/README.md

Lines changed: 19 additions & 46 deletions
@@ -1,30 +1,3 @@
----
-title: Remove knowledge of pod cluster CIDR from iptables rules
-authors:
-- "@satyasm"
-owning-sig: sig-network
-participating-sigs:
-reviewers:
-- "@thockin"
-- "@caseydavenport"
-- "@mikespreitzer"
-- "@aojea"
-- "@fasaxc"
-- "@squeed"
-- "@bowei"
-- "@dcbw"
-- "@darwinship"
-approvers:
-- "@thockin"
-editor: TBD
-creation-date: 2019-11-04
-last-updated: 2019-11-27
-status: implementable
-see-also:
-replaces:
-superseded-by:
----
-
 # Removing Knowledge of pod cluster CIDR from iptables rules

 ## Table of Contents
@@ -67,9 +40,9 @@ This enhancement proposes ways to achieve similar goals without tracking the pod

 ## Motivation

-The idea that makes kubernetes networking model unique and powerful is the concept of each pod having its own IP,
-with all the pod IPs being natively routable within the cluster. The service chains in iptable rules depend on this
-capability by assuming that they can treat all the endpoints of a cluster as being equivalent and load balance service
+The idea that makes kubernetes networking model unique and powerful is the concept of each pod having its own IP,
+with all the pod IPs being natively routable within the cluster. The service chains in iptable rules depend on this
+capability by assuming that they can treat all the endpoints of a cluster as being equivalent and load balance service
 traffic across all the endpoints, by just translating destination to the pod IP address.

 While this is powerful, it also means pod IP addresses are in many cases the constraining resource for cluster creation
@@ -82,7 +55,7 @@ Some examples of use cases:
 * Expanding a cluster with more disjoint ranges after initial creation.

 Not having to depend on the cluster pod CIDR for routing service traffic would effectively de-couple pod IP management
-and allocation strategies from service management and routing. Which in turn would mean that it would be far cheaper
+and allocation strategies from service management and routing. Which in turn would mean that it would be far cheaper
 to evolve the IP allocation schemes while sharing the same service implementation, thus significantly lowering the bar
 for adoption of alternate schemes.

@@ -102,14 +75,14 @@ CIDR for routing cluster traffic.

 ## Proposal

-As stated above, the goal is to re-implement the functionality called out in the summary, but in a
-way that does not depend on a pod cluster CIDR. The essence of the proposal is that for the
-first two cases in iptables implementation and first case in ipvs, we can replace the `-s proxier.clusterCIDR` with
+As stated above, the goal is to re-implement the functionality called out in the summary, but in a
+way that does not depend on a pod cluster CIDR. The essence of the proposal is that for the
+first two cases in iptables implementation and first case in ipvs, we can replace the `-s proxier.clusterCIDR` with
 some notion of node local pod traffic.

-The core logic in these cases is “how to determine” cluster originated traffic from non-cluster originated ones.
-The proposal is that tracking pod traffic generated from within the node is sufficient to determine cluster originated
-traffic. For the first two use cases in iptables and first use case in ipvs, we provide alternatives to using
+The core logic in these cases is “how to determine” cluster originated traffic from non-cluster originated ones.
+The proposal is that tracking pod traffic generated from within the node is sufficient to determine cluster originated
+traffic. For the first two use cases in iptables and first use case in ipvs, we provide alternatives to using
 proxier.clusterCIDR in one of the following ways to determine cluster originated traffic

 1. `-s node.podCIDR` (where node podCIDR is used for allocating pod IPs within the node)
@@ -119,7 +92,7 @@ proxier.clusterCIDR in one of the following ways to determine cluster originated

 Note the above are equivalent definitions, when considering only pod traffic originating from within the node.

-Given that this kep only addresses usage of the cluster CIDR (for pods), and that pods with hostNetwork are not
+Given that this kep only addresses usage of the cluster CIDR (for pods), and that pods with hostNetwork are not
 impacted by this, the assumption is that hostNetwork pod behavior will continue to work as is.

 For the last use case, note above, in iptables and ipvs, the proposal is to drop the reference to the cluster CIDR.
@@ -147,14 +120,14 @@ the node IP so that we can send traffic to any pod within the cluster.

 One key insight when thinking about this data path though is the fact that the iptable rules run
 at _every_ node boundary. So when a pod sends a traffic to a service IP, it gets translated to
-one of the pod IPs _before_ it leaves the node at the node boundary. So it's highly unlikely to
+one of the pod IPs _before_ it leaves the node at the node boundary. So it's highly unlikely to
 receive traffic at a node, whose destination is the service cluster IP, that is initiated by pods
 within the cluster, but not scheduled within that node.

-Going by the above reasoning, if we receive traffic destined to a service whose source is not within the node
-generated pod traffic, we can say with very high confidence that the traffic originated from outside the cluster.
+Going by the above reasoning, if we receive traffic destined to a service whose source is not within the node
+generated pod traffic, we can say with very high confidence that the traffic originated from outside the cluster.
 So we can rewrite the rule in terms of just the pod identity within the node (node CIDR, interface prefix or bridge).
-This would be the simplest change with respect to re-writing the rule without any assumptions on how pod
+This would be the simplest change with respect to re-writing the rule without any assumptions on how pod
 networking is setup.

 ### iptables - redirecting pod traffic to external loadbalancer VIP to cluster IP
@@ -219,7 +192,7 @@ if len(proxier.clusterCIDR) != 0 {
 }
 ```

-The interesting part of this rule that it already matches conntrack state to "RELATED,ESTABLISHED",
+The interesting part of this rule that it already matches conntrack state to "RELATED,ESTABLISHED",
 which means that it does not apply to the initial packet, but after the connection has been setup and accepted.

 In this case, dropping the `-d proxier.clusterCIDR` rule should have minimal impact on it behavior.
@@ -358,15 +331,15 @@ on _every_ and _all_ nodes that makes up the kubernetes cluster. This would not
 ## Alternatives [optional]

 ### Multiple cluster CIDR rules
-One alternative to consider is to explicitly track a list of cluster CIDRs in the ip table rules. If we
+One alternative to consider is to explicitly track a list of cluster CIDRs in the ip table rules. If we
 want to do this, we might want to consider making the cluster CIDR a first class resource, which we want to avoid.

 Instead in most cases, where the interface prefix is mostly fixed or we are using the `node.spec.podCIDR` attribute,
-changes to the cluster CIDR does not need any change to the kube-proxy arguments or a restart, which we believe
+changes to the cluster CIDR does not need any change to the kube-proxy arguments or a restart, which we believe
 is of benefit when managing clusters.

 ### ip-masq-agent like behavior
-The other alternative is to have kube-proxy never track it and instead use something like
+The other alternative is to have kube-proxy never track it and instead use something like
 [ip-masq-agent](https://kubernetes.io/docs/tasks/administer-cluster/ip-masq-agent/) to track what we masquerade
 or not. In this case, it assumes more knowledge from the users, but it does provide for a single place to update
 these cidrs using existing tooling.
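
Reviewer note: as a rough illustration of the `-s node.podCIDR` option described in the proposal text above, the Go sketch below classifies a source address as node-local pod traffic by testing it against the node's pod CIDR. This is not kube-proxy code. The function name `isNodeLocalPodTraffic` and the example addresses are hypothetical, and only the CIDR-based variant is shown (not the interface-prefix or bridge variants).

```go
package main

import (
	"fmt"
	"net/netip"
)

// isNodeLocalPodTraffic is a hypothetical helper: it reports whether src lies
// inside the node's pod CIDR, which is the set of packets an iptables
// "-s <node podCIDR>" match would select.
func isNodeLocalPodTraffic(podCIDR, src string) (bool, error) {
	prefix, err := netip.ParsePrefix(podCIDR)
	if err != nil {
		return false, fmt.Errorf("parse pod CIDR %q: %w", podCIDR, err)
	}
	addr, err := netip.ParseAddr(src)
	if err != nil {
		return false, fmt.Errorf("parse source address %q: %w", src, err)
	}
	return prefix.Contains(addr), nil
}

func main() {
	// Assumed example values: one node's pod CIDR and three packet sources.
	const nodePodCIDR = "10.8.1.0/24"
	for _, src := range []string{"10.8.1.17", "10.8.2.5", "203.0.113.9"} {
		local, err := isNodeLocalPodTraffic(nodePodCIDR, src)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-12s node-local pod traffic: %v\n", src, local)
	}
}
```

Under the KEP's reasoning, a source outside the node's own pod range that still targets a service IP is treated as originating outside the cluster, because traffic from pods on other nodes is translated to a pod IP before it leaves its node.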
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+title: Remove knowledge of pod cluster CIDR from iptables rules
+kep-number: 2450
+authors:
+- "@satyasm"
+owning-sig: sig-network
+participating-sigs:
+reviewers:
+- "@thockin"
+- "@caseydavenport"
+- "@mikespreitzer"
+- "@aojea"
+- "@fasaxc"
+- "@squeed"
+- "@bowei"
+- "@dcbw"
+- "@darwinship"
+approvers:
+- "@thockin"
+editor: TBD
+creation-date: 2019-11-04
+last-updated: 2019-11-27
+status: implemented
+see-also:
+replaces:
+superseded-by:
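
Reviewer note: the migration pattern in this commit (lifting the `---`-delimited front matter out of each KEP Markdown file into a separate metadata file and adding a `kep-number` field after `title`) could be sketched roughly as below. This is an illustration under stated assumptions, not tooling from the enhancements repository: `splitFrontMatter` and `withKEPNumber` are hypothetical helpers, and the metadata is handled as plain text lines rather than parsed YAML.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontMatter separates a leading "---"-delimited block from the
// Markdown body. ok is false when no complete front matter block is found.
func splitFrontMatter(doc string) (meta, body string, ok bool) {
	const delim = "---\n"
	if !strings.HasPrefix(doc, delim) {
		return "", doc, false
	}
	rest := doc[len(delim):]
	end := strings.Index(rest, "\n"+delim)
	if end < 0 {
		return "", doc, false
	}
	meta = rest[:end+1] // keep the newline that ends the last metadata line
	body = strings.TrimLeft(rest[end+1+len(delim):], "\n")
	return meta, body, true
}

// withKEPNumber inserts a "kep-number" line right after the first "title:"
// line, or at the top when the metadata has no title.
func withKEPNumber(meta string, number int) string {
	numberLine := fmt.Sprintf("kep-number: %d", number)
	lines := strings.Split(strings.TrimRight(meta, "\n"), "\n")
	out := make([]string, 0, len(lines)+1)
	inserted := false
	for _, line := range lines {
		out = append(out, line)
		if !inserted && strings.HasPrefix(line, "title:") {
			out = append(out, numberLine)
			inserted = true
		}
	}
	if !inserted {
		out = append([]string{numberLine}, out...)
	}
	return strings.Join(out, "\n") + "\n"
}

func main() {
	// Sample input modelled on the ExternalDNS KEP in this commit.
	old := "---\n" +
		"title: Move ExternalDNS out of Kubernetes incubator\n" +
		"authors:\n" +
		"- \"@njuettner\"\n" +
		"owning-sig: sig-network\n" +
		"status: implemented\n" +
		"---\n\n" +
		"# Move ExternalDNS out of Kubernetes incubator\n"

	meta, body, ok := splitFrontMatter(old)
	if !ok {
		fmt.Println("no front matter found")
		return
	}
	fmt.Print("metadata file:\n", withKEPNumber(meta, 2449))
	fmt.Print("README body:\n", body)
}
```

Working on text lines keeps the sketch free of third-party YAML dependencies, at the cost of not validating the metadata. Status changes such as `implementable` to `implemented` are separate manual edits in this commit and are not modelled here.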
