|
1 | 1 | = Service exposition
|
2 |
| -:k8s-service: https://kubernetes.io/docs/concepts/services-networking/service/ |
3 |
| -:k8s-service-types: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types |
4 |
| -:description: Explore Stackable's service exposition options: ClusterIP for internal access, NodePort for unstable external access, and LoadBalancer for stable external access. |
5 |
| - |
| 2 | +:listener-operator: xref:listener-operator:index.adoc |
| 3 | +:secret-operator: xref:secret-operator:index.adoc |
| 4 | +:listenerclass: xref:listener-operator:listenerclass.adoc |
| 5 | +:description: Explore how Stackable uses listener-operator to expose Services. |
6 | 6 |
|
7 | 7 | Data products expose interfaces to the outside world.
|
8 | 8 | These interfaces (whether UIs or APIs) can be accessed by other products or by end users.
|
9 |
| -Other products accessing the interfaces can run inside or outside of the same Kubernetes cluster. |
| 9 | +Clients accessing the interfaces can run inside or outside of the same Kubernetes cluster. |
10 | 10 | For example, xref:zookeeper:index.adoc[Apache ZooKeeper] is a dependency for other products, and it usually needs to be accessible only from within Kubernetes, while xref:superset:index.adoc[Apache Superset] is a data analysis product for end users and therefore needs to be accessible from outside the Kubernetes cluster.
|
11 | 11 | Users connecting to Superset can be restricted within the local company network, or they can connect over the internet depending on the company security policies and demands.
|
12 | 12 | This page gives an overview of the different options for service exposition, when to choose which option, and how these options are configured.
|
13 | 13 |
|
14 |
| -== Service exposition options |
15 |
| - |
16 |
| -The Stackable Data Platform supports three {k8s-service-types}[types of Kubernetes Service] for exposing data product endpoints: |
| 14 | +== Motivation |
17 | 15 |
|
18 |
| -* ClusterIP |
19 |
| -* NodePort |
20 |
| -* LoadBalancer |
| 16 | +Service exposition is complicated enough that Stackable has built its own operator for it: {listener-operator}[]. |
| 17 | +The following section explains why we wrote such an operator instead of just using plain Kubernetes Services. |
21 | 18 |
|
22 |
| -All custom resources for data products provide a resource field named `spec.clusterConfig.listenerClass` which determines how the product can be accessed. |
23 |
| -There are three ListenerClasses, named after the goal for which they are used (more on this in the <<when-to-choose-which-option, next section>>): |
| 19 | +=== Tools advertising their address |
24 | 20 |
|
25 |
| -* `cluster-internal` => Use ClusterIP (default) |
26 |
| -* `external-unstable` => Use NodePort |
27 |
| -* `external-stable` => Use LoadBalancer |
| 21 | +Some tools need to know how they are externally reachable. |
| 22 | +This is important, for example, for HDFS, where the namenode keeps track of which datanode serves which block, and for Kafka, which advertises broker addresses to clients during bootstrapping. |
| 23 | +An HDFS client asks the namenode "I want to read block 42, who is serving that?", and the namenode responds with "block 42 is served by <ip or hostname of some datanode>". |
| 24 | +For that to work, the datanode needs to know its external address on startup and report it to the namenode. |
| 25 | +(And yes, we needed to patch Hadoop source-code for that ;)) |
28 | 26 |
|
29 |
| -The `cluster-internal` class exposes the interface of a product by using a ClusterIP Service. |
30 |
| -This service is only reachable from within the Kubernetes cluster. |
31 |
| -This setting is the most secure and was chosen as the default for that reason. |
| 27 | +The {listener-operator}[listener-operator] runs as a CSI driver (just like the {secret-operator}[secret-operator]) and places files inside the CSI volume that tell the tool how it is reachable. |
32 | 28 |
|
33 |
| -NOTE: Not all operators support all classes. |
34 |
| -Consult the operator specific documentation to find out about the supported service types. |
| 29 | +=== Integration with {secret-operator}[secret-operator] |
35 | 30 |
|
36 |
| -[#when-to-choose-which-option] |
37 |
| -== When to choose which option |
| 31 | +If a tool is secured using TLS or Kerberos, it not only needs to be reachable via the determined address, it also needs a TLS certificate or keytab issued for that address. |
| 32 | +{secret-operator}[secret-operator] integrates with {listener-operator}[listener-operator], so that the platform takes care of provisioning certificates with the correct addresses (in the form of SAN entries). |
38 | 33 |
|
39 |
| -There are three options, one for internal traffic and two for external access, where internal and external refer to the Kubernetes cluster. |
40 |
| -Internal means inside of the Kubernetes cluster, and external means access from outside of it. |
| 34 | +== {listenerclass}[ListenerClasses] |
41 | 35 |
|
42 |
| -=== Internal |
| 36 | +A {listenerclass}[] describes how a product should be exposed. |
| 37 | +Please read {listenerclass}[its documentation] before continuing on this page. |
43 | 38 |
|
44 |
| -`cluster-internal` is the default class and the Service behind it is only reachable from within Kubernetes. |
45 |
| -This is useful for middleware products such as xref:zookeeper:index.adoc[Apache ZooKeeper], xref:hive:index.adoc[Apache Hive metastore], or an xref:kafka:index.adoc[Apache Kafka] cluster used for internal data flow. |
46 |
| -Products using this ListenerClass are not accessible from outside Kubernetes. |
| 39 | +As a quick reminder, the platform ships with three default {listenerclass}[ListenerClasses]: |
47 | 40 |
|
48 |
| -=== External |
| 41 | +`cluster-internal`:: Used for listeners that are only accessible internally from the cluster. For example: communication between ZooKeeper nodes. |
| 42 | +`external-unstable`:: Used for listeners that are accessible from outside the cluster, but which do not require a stable address. For example: individual Kafka brokers. |
| 43 | +`external-stable`:: Used for listeners that are accessible from outside the cluster, and do require a stable address. For example: Kafka bootstrap. |
49 | 44 |
|
50 |
| -External access is needed when a product needs to be accessed from _outside_ of Kubernetes. |
51 |
| -This is necessary for all end user products such as xref:superset:index.adoc[Apache Superset]. |
52 |
| -Some tools can expose APIs for data ingestion like xref:kafka:index.adoc[Apache Kafka] or xref:nifi:index.adoc[Apache NiFi]. |
53 |
| -If data needs to be ingested from outside of the cluster, one of the external listener classes should be chosen. |
| 45 | +Keep in mind that you are not restricted to this list; you can also configure your own custom {listenerclass}[ListenerClasses]. |
54 | 46 |
|
55 |
| -When to use `stable` and when to use `unstable`? |
56 |
| -The `external-unstable` setting exposes a product interface via a Kubernetes NodePort. |
57 |
| -In this case the service's IP address and port can change if Kubernetes needs to restart or reschedule the Pod to another node. |
| 47 | +== Configuring the ListenerClass for a stacklet |
58 | 48 |
|
59 |
| -The `external-stable` class uses a LoadBalancer. |
60 |
| -The LoadBalancer is running at a fixed address and is therefore `stable`. |
61 |
| -Managed Kubernetes services in the cloud usually offer a LoadBalancer, but for an on premise cluster you have to configure a LoadBalancer yourself. |
62 |
| -For a production setup, it is recommended to use a LoadBalancer and the `external-stable` ListenerClass. |
| 49 | +We integrated {listener-operator}[listener-operator] into most of our products; currently, only xref:opa:index.adoc[] and xref:spark-k8s:index.adoc[] do not use it. |
63 | 50 |
|
64 |
| -== Outlook |
| 51 | +Most of the tools configure the {listenerclass}[] at the role level as follows: |
65 | 52 |
|
66 |
| -For most of the Stackable operators, these listener classes are hardcoded to expose certain Service types and do not offer any additional configuration. |
67 |
| -However, some operators support specifying custom xref:listener-operator:listenerclass.adoc[ListenerClass]es with more granular configuration options, via the xref:listener-operator:index.adoc[listener-operator]. |
68 |
| -In a future release, all Stackable operators are planned to be migrated over to this system. |
| 53 | +[source,yaml] |
| 54 | +---- |
| 55 | +spec: |
| 56 | +  my-role: |
| 57 | +    roleConfig: |
| 58 | +      listenerClass: external-unstable |
| 59 | +---- |
69 | 60 |
|
70 |
| -For more information on what is supported by any individual operator, please see that operator's documentation. |
| 61 | +Every operator has a documentation section called "Service exposition with ListenerClasses", which may provide details for the specific tool. |
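
The added text mentions that custom ListenerClasses can be configured. A minimal sketch of what such a resource might look like follows. It assumes the `listeners.stackable.tech/v1alpha1` API version and the `spec.serviceType` field as described in the listener-operator documentation; the name `my-custom-class` is hypothetical, and the exact fields should be checked against the operator version in use.

[source,yaml]
----
# Sketch only: a custom ListenerClass (the name is hypothetical)
apiVersion: listeners.stackable.tech/v1alpha1
kind: ListenerClass
metadata:
  name: my-custom-class
spec:
  # Assumed field: the Kubernetes Service type used for listeners of this class
  serviceType: LoadBalancer
----

A role could then presumably reference this class by name through the same `roleConfig.listenerClass` field shown in the diff, for example `listenerClass: my-custom-class`.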