Commit 5e0ca8a

Initial FQDN Selector NPEP with User stories
1 parent bf98cec commit 5e0ca8a

1 file changed

+135
-0
lines changed

npep/npep-133.md

@@ -0,0 +1,135 @@
# NPEP-133: FQDN Selector for Egress Traffic

* Issue:
  [#133](https://github.com/kubernetes-sigs/network-policy-api/issues/133)
* Status: Provisional

## TLDR

This enhancement proposes adding a new optional selector to specify egress peers
using Fully Qualified Domain Names (FQDNs).

## Goals

* Provide a selector to specify egress peers using a Fully Qualified Domain Name
  (for example `kubernetes.io`).
* Support a restricted set of regex matching capabilities when specifying FQDNs.
* Limit the scope of this proposal to AdminNetworkPolicy for now.
  * Since Kubernetes NetworkPolicy does not have an FQDN selector, adding this
    capability to BaselineAdminNetworkPolicy can result in unintended behavior.
    For example, if BANP allows traffic to `example.io`, but the namespace admin
    installs a Kubernetes NetworkPolicy, the namespace admin has no way to
    replicate the `example.io` selector using just Kubernetes NetworkPolicies.

## Non-Goals

* This enhancement does not include an FQDN selector for allowing ingress
  traffic.
* This enhancement does not include any L7 matching or filtering capabilities,
  like matching HTTP traffic or URL paths.
* This selector should not control what DNS records are resolvable from a
  particular workload.
* This enhancement does not provide a mechanism for selecting in-cluster
  endpoints using FQDNs. This is explicitly disallowed by the spec.
  * To select Pods, Nodes, or the API server, AdminNetworkPolicy already has
    first-party selectors with better UX.
* This enhancement does not specify the details of how traffic is routed to the
  specified destination. For example, it does not prescribe details around NAT
  or egress gateways.
* This enhancement does not require any mechanism for securing DNS resolution
  (e.g. DNSSEC or DNS-over-TLS). Unsecured DNS requests are expected to be
  sufficient for looking up FQDNs.

## Introduction

FQDN-based egress controls are a common enterprise security practice.
Administrators often prefer to write security policies using DNS names such as
`www.kubernetes.io` instead of capturing all the IP addresses the DNS name might
resolve to. Keeping up with changing IP addresses is a maintenance burden, and
long lists of IP addresses hamper the readability of network policies.

## User Stories

* As a cluster admin, I want to allow all Pods in the cluster to send traffic to
  an external service specified by a well-known domain name. For example, all
  Pods must be able to talk to `my-service.com`.

* As a cluster admin, I want to allow Pods in the "monitoring" namespace to be
  able to send traffic to a logs sink hosted at `logs-storage.com`.

* As a cluster admin, I want to allow all Pods in the cluster to send traffic to
  any of the managed services provided by my cloud provider. Since the cloud
  provider has a well-known parent domain, I want to allow Pods to send traffic
  to all subdomains using a wildcard selector: `*.my-cloud-provider.com`.

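
The wildcard story above leaves the exact matching rules open, since the API section of this NPEP is still TODO. The following is only a minimal sketch of one possible interpretation. The function names `normalize` and `fqdn_matches` are hypothetical, and the semantics are assumptions: matching is case-insensitive, a trailing root dot is ignored, and a leading `*.` matches one or more extra labels but not the bare parent domain.

```python
def normalize(name: str) -> str:
    """Lower-case a domain name and drop a trailing root dot."""
    return name.strip().lower().rstrip(".")


def fqdn_matches(pattern: str, name: str) -> bool:
    """Return True if `name` is selected by `pattern`.

    Assumed semantics (not defined by this NPEP):
      * a pattern without a leading `*.` must equal the name exactly;
      * a leading `*.` matches one or more extra leading labels,
        so it does not match the bare parent domain.
    """
    pattern, name = normalize(pattern), normalize(name)
    if not pattern.startswith("*."):
        return pattern == name
    # Keep the dot so that "evil-my-cloud-provider.com" cannot match.
    suffix = pattern[1:]
    return name.endswith(suffix) and len(name) > len(suffix)
```

Under these assumptions, `*.my-cloud-provider.com` selects `db.my-cloud-provider.com` but neither `my-cloud-provider.com` nor `evil-my-cloud-provider.com`. Whether the final API should also match the parent domain is a design choice this sketch does not settle.
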
### Future User Stories

These are some user stories we want to keep in mind, but that cannot be
implemented currently due to limitations of the existing Network Policy API. The
design goal in this case is to ensure we do not make these unimplementable down
the line.

* As a cluster admin, I want to block all cluster egress traffic by default, and
  require namespace admins to create NetworkPolicies explicitly allowing egress
  to the domains they need to talk to.

  The cluster admin would use a `BaselineAdminNetworkPolicy` object to switch
  the default disposition of the cluster. Namespace admins would then use
  an FQDN selector in the Kubernetes `NetworkPolicy` objects to allow
  `my-service.com`.

## API

TODO

## Alternatives

### IP Block Selector

IP blocks are an important tool for specifying Network Policies. However, they
do not address all user needs and have a few shortcomings when compared to FQDN
selectors:

* IP-based selectors can become verbose if a single logical service has numerous
  IPs backing it.
* IP-based selectors pose an ongoing maintenance burden for administrators, who
  need to be aware of changing IPs.
* IP-based selectors can result in policies that are difficult to read and
  audit.

### L7 Policy

Another alternative is to provide a true L7 selector, similar to the policies
provided by Service Mesh providers. While L7 selectors can offer more
expressiveness, they often come with trade-offs that are not suitable for all
users:

* L7 selectors necessarily support a select set of protocols. Customers may be
  using a custom protocol for application-level communication, but still want
  the ability to specify endpoints using DNS.
* L7 selectors often require proxies to perform deep packet inspection and
  enforce the policies. These proxies can introduce undesirable latency in
  the datapath of applications.

## References

* [NPEP #126](https://github.com/kubernetes-sigs/network-policy-api/issues/126):
  Egress Control in ANP

### Implementations

* [Antrea](https://antrea.io/docs/main/docs/antrea-network-policy/#fqdn-based-filtering)
* [Calico](https://docs.tigera.io/calico-enterprise/latest/network-policy/domain-based-policy)
* [Cilium](https://docs.cilium.io/en/latest/security/policy/language/#dns-based)
* [OpenShift](https://docs.openshift.com/container-platform/latest/networking/openshift_sdn/configuring-egress-firewall.html)

The following is a best-effort breakdown of capabilities of different
NetworkPolicy providers, as of 2023-09-25. This information may be out of date
or inaccurate.

|                | Antrea                         | Calico       | Cilium       | OpenShift <br/> (current) | OpenShift <br/> (future) |
| -------------- | ------------------------------ | ------------ | ------------ | ------------------------- | ------------------------ |
| Implementation | DNS Snooping <br/> + Async DNS | DNS Snooping | DNS Snooping | Async DNS                 | DNS Snooping             |
| Wildcards      |                                | ✔            |              |                           |                          |
| Egress Rules   |                                | ✔            |              |                           |                          |
| Ingress Rules  |                                | ❌           |              |                           |                          |
| Allow Rules    |                                | ✔            |              |                           |                          |
| Deny Rules     |                                | ❌(?)        |              |                           | ❌(?)                    |
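
The table uses the term "DNS Snooping" without defining it. One common reading, assumed here, is that the dataplane observes DNS answers and allows traffic to the returned IPs only until their TTL expires. The sketch below illustrates that idea only. The class `FQDNCache` and its methods are hypothetical and do not describe how any of the listed implementations actually work.

```python
import ipaddress
import time
from collections import defaultdict


class FQDNCache:
    """Maps domain names to the IPs seen in DNS answers, honoring TTLs."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        # name -> {ip address -> expiry timestamp}
        self._records = defaultdict(dict)

    @staticmethod
    def _key(name):
        # Case-insensitive, trailing root dot ignored (an assumption).
        return name.lower().rstrip(".")

    def observe(self, name, ip, ttl):
        """Record one A/AAAA answer observed for `name`, valid for `ttl` seconds."""
        addr = ipaddress.ip_address(ip)
        self._records[self._key(name)][addr] = self._clock() + ttl

    def allowed(self, name, ip):
        """True if `ip` is an unexpired address previously seen for `name`."""
        addr = ipaddress.ip_address(ip)
        expiry = self._records.get(self._key(name), {}).get(addr)
        return expiry is not None and self._clock() < expiry
```

A real dataplane would also have to handle CNAME chains, cache eviction, and connections that outlive the TTL, all of which this sketch ignores.
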
