Commit 319cec0

Initial FQDN Selector NPEP with User stories
1 parent bf98cec commit 319cec0

1 file changed

+140
-0
lines changed

npep/npep-133.md

@@ -0,0 +1,140 @@
# NPEP-133: FQDN Selector for Egress Traffic

* Issue: [#133](https://github.com/kubernetes-sigs/network-policy-api/issues/133)
* Status: Provisional

## TLDR

This enhancement proposes adding a new optional selector to specify egress peers
using [Fully Qualified Domain Names](https://www.wikipedia.org/wiki/Fully_qualified_domain_name)
(FQDNs).

## Goals

* Provide a selector to specify egress peers using a Fully Qualified Domain Name
  (for example `kubernetes.io`).
* Support basic wildcard matching capabilities when specifying FQDNs (for example `*.cloud-provider.io`).
* Currently only `ALLOW` type rules are proposed.
  * Correctly enforcing DENY rules based on FQDN selectors is difficult as there
    is no guarantee a Network Policy plugin is aware of all IPs backing a FQDN
    policy.
* Currently only AdminNetworkPolicy is the intended scope for this proposal.
  * Since Kubernetes NetworkPolicy does not have a FQDN selector, adding this
    capability to BaselineAdminNetworkPolicy could result in writing baseline rules that can't be replicated by an overriding NetworkPolicy.
    For example, if BANP allows traffic to `example.io`, but the namespace admin
    installs a Kubernetes Network Policy, the namespace admin has no way to
    replicate the `example.io` selector using just Kubernetes Network Policies.

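
The reasoning behind proposing only `ALLOW` rules can be illustrated with a small sketch. It assumes a hypothetical plugin that learns FQDN-to-IP mappings only from the DNS answers it happens to observe. The class `ObservedDNSCache` and the functions `allow_egress` and `deny_egress` are illustrative names, not part of this proposal or of any existing implementation.

```python
import time
from collections import defaultdict


class ObservedDNSCache:
    """Hypothetical cache of IPs learned from observed DNS answers."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # fqdn -> {ip: expiry timestamp}
        self._records = defaultdict(dict)

    @staticmethod
    def _normalize(fqdn):
        # Compare names case-insensitively and ignore a trailing root dot.
        return fqdn.lower().rstrip(".")

    def observe(self, fqdn, ip, ttl_seconds):
        """Record one address answer seen for fqdn."""
        self._records[self._normalize(fqdn)][ip] = self._clock() + ttl_seconds

    def known_ips(self, fqdn):
        """Return the unexpired IPs currently known for fqdn."""
        now = self._clock()
        entries = self._records.get(self._normalize(fqdn), {})
        return {ip for ip, expiry in entries.items() if expiry > now}


def allow_egress(cache, allowed_fqdns, dst_ip):
    """ALLOW semantics: permit only if dst_ip is a known IP of an allowed name."""
    return any(dst_ip in cache.known_ips(name) for name in allowed_fqdns)


def deny_egress(cache, denied_fqdns, dst_ip):
    """DENY semantics: block only if dst_ip is a known IP of a denied name."""
    return any(dst_ip in cache.known_ips(name) for name in denied_fqdns)


cache = ObservedDNSCache()
cache.observe("example.io", "192.0.2.10", ttl_seconds=300)

# An IP the plugin never observed is simply not allowed (fails closed).
print(allow_egress(cache, ["example.io"], "192.0.2.11"))
# The same unobserved IP is not blocked by a DENY rule (fails open),
# even if it does in fact back example.io.
print(deny_egress(cache, ["example.io"], "192.0.2.11"))
```

Under these assumptions, incomplete knowledge of the backing IPs makes an `ALLOW` rule too strict at worst, whereas it lets a `DENY` rule be silently bypassed.
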
## Non-Goals

* This enhancement does not include a FQDN selector for allowing ingress
  traffic.
* This enhancement does not include any L7 matching or filtering capabilities,
  like matching HTTP traffic or URL paths.
  * This selector should not control what DNS records are resolvable from a
    particular workload.
* This enhancement does not provide a mechanism for selecting in-cluster
  endpoints using FQDNs. This is explicitly disallowed by the spec.
  * To select Pods, Nodes, or the API Server, AdminNetworkPolicy has other more
    specific selectors.
* This enhancement does not specify the details of how traffic is routed to the
  specified destination. For example, it does not prescribe details around NAT
  or egress gateways.
* This enhancement does not require any mechanism for securing DNS resolution
  (e.g. DNSSEC or DNS-over-TLS). Unsecured DNS requests are expected to be
  sufficient for looking up FQDNs.

## Introduction

FQDN-based egress controls are a common enterprise security practice.
Administrators often prefer to write security policies using DNS names such as
“www.kubernetes.io” instead of capturing all the IP addresses the DNS name might
resolve to. Keeping up with changing IP addresses is a maintenance burden and
hampers the readability of the network policies.

## User Stories

* As a cluster admin, I want to allow all Pods in the cluster to send traffic to
  an external service specified by a well-known domain name. For example, all
  Pods must be able to talk to `my-service.com`.

* As a cluster admin, I want to allow Pods in the "monitoring" namespace to be
  able to send traffic to a logs-sink, hosted at `logs-storage.com`.

* As a cluster admin, I want to allow all Pods in the cluster to send traffic to
  any of the managed services provided by my Cloud Provider. Since the cloud
  provider has a well-known parent domain, I want to allow Pods to send traffic
  to all sub-domains using a wildcard selector such as `*.my-cloud-provider.com`.

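
As an illustration of the wildcard user story, the following is a minimal sketch of how a name could be checked against a selector. The proposal does not yet define wildcard semantics, so this sketch assumes that a leading `*.` matches one or more DNS labels and never the parent domain itself. The function name `matches_fqdn` is hypothetical.

```python
def matches_fqdn(pattern, name):
    """Return True if name matches pattern.

    Assumed semantics (not defined by this proposal): a leading "*." matches
    one or more DNS labels, and never the parent domain itself. Comparison is
    case-insensitive and ignores a trailing root dot.
    """
    pattern = pattern.lower().rstrip(".")
    name = name.lower().rstrip(".")
    if pattern.startswith("*."):
        # Keep the leading dot so "evilmy-cloud-provider.com" cannot match.
        suffix = pattern[1:]
        return name.endswith(suffix) and len(name) > len(suffix)
    return name == pattern


print(matches_fqdn("*.my-cloud-provider.com", "db.my-cloud-provider.com"))  # True
print(matches_fqdn("*.my-cloud-provider.com", "my-cloud-provider.com"))     # False
print(matches_fqdn("my-service.com", "My-Service.com."))                    # True
```

Whether `*` should span multiple labels, or match the parent domain, is a design decision left open here. A stricter single-label rule would only need an extra check on the matched prefix.
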
### Future User Stories

These are some user stories we want to keep in mind but that cannot currently be
implemented due to limitations of the existing Network Policy API. The design
goal in this case is to ensure we do not make these unimplementable down the line.

* As a cluster admin, I want to block all cluster egress traffic by default, and
  require namespace admins to create NetworkPolicies explicitly allowing egress
  to the domains they need to talk to.

  The cluster admin would use a `BaselineAdminNetworkPolicy` object to switch
  the default disposition of the cluster. Namespace admins would then use
  a FQDN selector in the Kubernetes `NetworkPolicy` objects to allow
  `my-service.com`.

## API

TODO

## Alternatives

### IP Block Selector

IP blocks are an important tool for specifying Network Policies. However, they
do not address all user needs and have a few shortcomings when compared to FQDN
selectors:

* IP-based selectors can become verbose if a single logical service has numerous
  IPs backing it.
* IP-based selectors pose an ongoing maintenance burden for administrators, who
  need to be aware of changing IPs.
* IP-based selectors can result in policies that are difficult to read and
  audit.

### L7 Policy

Another alternative is to provide a true L7 selector, similar to the policies
provided by Service Mesh providers. While L7 selectors can offer more
expressiveness, they often come with trade-offs that are not suitable for all users:

* L7 selectors necessarily support a select set of protocols. Customers may be
  using a custom protocol for application-level communication, but still want
  the ability to specify endpoints using DNS.
* L7 selectors often require proxies to perform deep packet inspection and
  enforce the policies. These proxies can introduce undesirable latency in
  the datapath of applications.

116+
117+
## References
118+
119+
* [NPEP #126](https://github.com/kubernetes-sigs/network-policy-api/issues/126):
120+
Egress Control in ANP
121+
122+
### Implementations
123+
124+
* [Antrea](https://antrea.io/docs/main/docs/antrea-network-policy/#fqdn-based-filtering)
125+
* [Calico](https://docs.tigera.io/calico-enterprise/latest/network-policy/domain-based-policy)
126+
* [Cilium](https://docs.cilium.io/en/latest/security/policy/language/#dns-based)
127+
* [Open Shift](https://docs.openshift.com/container-platform/latest/networking/openshift_sdn/configuring-egress-firewall.html)
128+
129+
The following is a best-effort breakdown of capabilities of different
130+
NetworkPolicy providers, as of 2023-09-25. This information may be out-of-date,
131+
or inaccurate.
132+
|                | Antrea                         | Calico       | Cilium       | OpenShift <br/> (current) | OpenShift <br/> (future) |
| -------------- | ------------------------------ | ------------ | ------------ | ------------------------- | ------------------------ |
| Implementation | DNS Snooping <br/> + Async DNS | DNS Snooping | DNS Snooping | Async DNS                 | DNS Snooping             |
| Wildcards      |                                | ✔            |              |                           |                          |
| Egress Rules   |                                | ✔            |              |                           |                          |
| Ingress Rules  |                                | ❌           |              |                           |                          |
| Allow Rules    |                                | ✔            |              |                           |                          |
| Deny Rules     |                                | ❌(?)        |              |                           | ❌(?)                    |
