| 1 | | -# oci-kubernetes-monitoring |
| 1 | +# Monitoring Solution for Kubernetes |
| 2 | + |
| 3 | +## About |
| 4 | + |
| 5 | +This solution provides end-to-end monitoring for Oracle Container Engine for Kubernetes (OKE) and other Kubernetes clusters, |
| 6 | +using Logging Analytics, Monitoring, and other Oracle Cloud Infrastructure (OCI) services. |
| 7 | + |
| 8 | +## Logs |
| 9 | + |
| 10 | +This solution collects various logs from a Kubernetes cluster into OCI Logging Analytics out of the box and offers rich analytics on top of them. |
| 11 | +Users may customise the log collection by modifying the out-of-the-box configuration that it provides. |
| 12 | + |
| 13 | +### Kubernetes System/Service Logs |
| 14 | + |
| 15 | +OKE and Kubernetes come with built-in services, each with its own responsibilities. They run on one or more nodes in the cluster, either as Deployments or as DaemonSets. |
| 16 | + |
| 17 | +The following service logs are collected out of the box: |
| 18 | +- Kube Proxy |
| 19 | +- Kube Flannel |
| 20 | +- Kubelet |
| 21 | +- CoreDNS |
| 22 | +- CSI Node Driver |
| 23 | +- DNS Autoscaler |
| 24 | +- Cluster Autoscaler |
| 25 | +- Proxymux Client |
| 26 | + |
| 27 | +### Linux System Logs |
| 28 | + |
| 29 | +The following Linux system logs are collected out of the box: |
| 30 | +- Syslog |
| 31 | +- Secure logs |
| 32 | +- Cron logs |
| 33 | +- Mail logs |
| 34 | +- Audit logs |
| 35 | +- Ksplice Uptrack logs |
| 36 | +- Yum logs |
| 37 | + |
| 38 | +### Control Plane Logs |
| 39 | + |
| 40 | +The following are the Control Plane components in OKE/Kubernetes: |
| 41 | +- Kube API Server |
| 42 | +- Kube Scheduler |
| 43 | +- Kube Controller Manager |
| 44 | +- Cloud Controller Manager |
| 45 | +- etcd |
| 46 | + |
| 47 | +At present, control plane logs are not part of the out-of-the-box collection, because these logs are not exposed to OKE customers. |
| 48 | +Out-of-the-box collection of these logs will be available soon for generic Kubernetes clusters, and for OKE once OKE makes the logs available to end users. |
| 49 | + |
| 50 | +### Application Pod/Container Logs |
| 51 | +Logs from application pods that write to STDOUT/STDERR are typically available under /var/log/containers/. |
| 52 | +Applications that use custom log handlers (for example log4j) may route their logs differently, but the logs are generally still available on the node (through a volume). |
| 53 | + |
| 54 | +## Kubernetes Objects |
| 55 | + |
| 56 | +"Kubernetes objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster. Specifically, they can describe: |
| 57 | +- What containerized applications are running (and on which nodes) |
| 58 | +- The resources available to those applications |
| 59 | +- The policies around how those applications behave, such as restart policies, upgrades, and fault-tolerance" |
| 60 | + |
| 61 | +*Reference* : [Kubernetes Objects](https://kubernetes.io/docs/concepts/overview/working-with-objects/kubernetes-objects/) |
| 62 | + |
| 63 | +The following objects are supported at present: |
| 64 | +- Nodes |
| 65 | +- Namespaces |
| 66 | +- Pods |
| 67 | +- DaemonSets |
| 68 | +- Deployments |
| 69 | +- ReplicaSets |
| 70 | +- Events |
| 71 | + |
| 72 | +## Installation Instructions |
| 73 | + |
| 74 | +### Pre-requisites |
| 75 | + |
| 76 | +- The Logging Analytics service must be enabled in the given OCI region before trying out this solution. Refer to [Logging Analytics Quick Start](https://docs.oracle.com/en-us/iaas/logging-analytics/doc/quick-start.html) for details. |
| 77 | +- Create one or more Logging Analytics Log Groups if you have not done so already. Refer to [Create Log Group](https://docs.oracle.com/en-us/iaas/logging-analytics/doc/create-logging-analytics-resources.html#GUID-D1758CFB-861F-420D-B12F-34D1CC5E3E0E). |
| 78 | + |
| 79 | +### Docker Image |
| 80 | + |
| 81 | +We are in the process of building a Docker image based on Oracle Linux 8 that includes Fluentd, the OCI Logging Analytics Output Plugin and all the required dependencies. |
| 82 | +All the dependencies will be built from source and installed into the image. This image will soon be available either to use as is, or as a base image for creating a custom image. |
| 83 | +At present, for testing purposes, follow the steps below to build an image that uses the official Fluentd Docker image (based on Debian) as its base image. |
| 84 | +- Download all the files from [this dir](/logan/docker-images/v1.0/debian/) into a local machine having access to internet. |
| 85 | +- Run the following command to build the docker image. |
| 86 | + - *docker build -t fluentd_oci_la -f Dockerfile .* |
| 87 | +- The Docker image built in the previous step can be pushed to Docker Hub, OCI Container Registry (OCIR) or a local Docker registry, depending on your requirements. |
| 88 | + - [How to push the image to Docker Hub](https://docs.docker.com/docker-hub/repos/#pushing-a-docker-container-image-to-docker-hub) |
| 89 | + - [How to push the image to OCIR](https://www.oracle.com/webfolder/technetwork/tutorials/obe/oci/registry/index.html). |
| 90 | + - [How to push the image to Local Registry](https://www.oracle.com/webfolder/technetwork/tutorials/obe/oci/registry/index.html). |
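As a rough sketch of the tag-and-push step, the snippet below composes a fully qualified image reference and pushes it. The helper names, registry host, repository path and tag are illustrative assumptions and are not defined by this repository; only the local image name `fluentd_oci_la` comes from the build command above.

```shell
#!/bin/sh
# Hypothetical helper: build "<registry>/<repository>:<tag>" from its parts.
# $1 = registry host, $2 = repository path, $3 = tag
image_ref() {
    printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: tag the locally built image and push it.
# Assumes `docker login` has already been run against the target registry.
push_image() {
    ref=$(image_ref "$1" "$2" "$3")
    docker tag fluentd_oci_la "$ref" && docker push "$ref"
}
```

It could be invoked as `push_image <registry-host> <repository-path> <tag>`; the resulting reference is the value to use later for the image URL in the deployment manifests.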
| 91 | + |
| 92 | +### Deploying Kubernetes resources using Kubectl |
| 93 | + |
| 94 | +#### Pre-requisites |
| 95 | + |
| 96 | +- A machine with kubectl installed and configured to point to your Kubernetes environment. |
| 97 | + |
| 98 | +#### To enable Logs collection |
| 99 | + |
| 100 | +Download all the yaml files from [this dir](/logan/kubernetes-resources/logs-collection/). |
| 101 | +These yaml files need to be applied using kubectl to create the resources that enable log collection into Logging Analytics through a Fluentd-based DaemonSet. |
| 102 | + |
| 103 | +##### configmap-docker.yaml | configmap-cri.yaml |
| 104 | + |
| 105 | +- This file contains the out-of-the-box Fluentd configuration to collect Kubernetes System/Service Logs, Linux System Logs and Application Pod/Container Logs. |
| 106 | +- Some log locations may differ for Kubernetes clusters other than OKE and EKS, and may need to be modified accordingly. |
| 107 | +A comprehensive out-of-the-box configuration covering typical Kubernetes clusters will be available soon. |
| 108 | +- Use configmap-docker.yaml for Kubernetes clusters based on the Docker runtime (e.g., OKE < 1.20) and configmap-cri.yaml for Kubernetes clusters based on CRI-O. |
| 109 | +- Inline comments in the file describe each of the source/filter/match blocks, for easy reference when changing the configuration. |
| 110 | +- Refer to [this](https://docs.oracle.com/en/learn/oci_logging_analytics_fluentd/) to learn about each of the Logging Analytics Fluentd Output plugin configuration parameters. |
| 111 | +- *Note*: A generic source with a time-only parser is configured out of the box to collect all application pod logs from /var/log/containers/. |
| 112 | + It is recommended to define a LogSource/LogParser in Logging Analytics for a given log type and then modify the configuration accordingly. |
| 113 | + When adding a configuration (Source, Filter section) for a new container log, also exclude its log path from the generic log collection |
| 114 | + by adding the path to the *exclude_path* field in the *in_tail_containerlogs* source block. This avoids collecting the same logs twice. |
| 115 | + |
| 116 | +##### fluentd-daemonset.yaml |
| 117 | + |
| 118 | +- This file has all the resources needed to run the Fluentd Docker image as a DaemonSet. |
| 119 | +- Inline comments in the file describe each of the fields/sections. |
| 120 | +- Make sure to replace the placeholder fields with actual values before deploying. |
| 121 | +- At minimum, <IMAGE_URL>, <OCI_LOGGING_ANALYTICS_LOG_GROUP_ID> and <OCI_TENANCY_NAMESPACE> need to be updated. |
| 122 | +- It is recommended to also update <KUBERNETES_CLUSTER_OCID> and <KUBERNETES_CLUSTER_NAME>, to tag all the logs with the corresponding Kubernetes cluster in Logging Analytics. |
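One possible way to fill in the mandatory placeholders without editing the file by hand is a small `sed` filter, sketched below. The helper name `fill_placeholders` is hypothetical, and the sketch assumes the placeholders appear in the manifest exactly as written in the list above and that the replacement values contain neither `|` nor `&`.

```shell
#!/bin/sh
# Hypothetical helper: read a manifest on stdin and replace the mandatory
# <PLACEHOLDER> tokens with the values of the same-named shell variables.
# ${VAR:?msg} aborts with an error if a variable is unset or empty, so a
# placeholder is never silently replaced with nothing.
# Assumption: values contain no "|" (used as the sed delimiter) and no "&".
fill_placeholders() {
    sed \
        -e "s|<IMAGE_URL>|${IMAGE_URL:?set IMAGE_URL}|g" \
        -e "s|<OCI_LOGGING_ANALYTICS_LOG_GROUP_ID>|${OCI_LOGGING_ANALYTICS_LOG_GROUP_ID:?set OCI_LOGGING_ANALYTICS_LOG_GROUP_ID}|g" \
        -e "s|<OCI_TENANCY_NAMESPACE>|${OCI_TENANCY_NAMESPACE:?set OCI_TENANCY_NAMESPACE}|g"
}
```

For example, after setting the three variables, `fill_placeholders < fluentd-daemonset.yaml > fluentd-daemonset.filled.yaml` writes a filled-in copy (the output file name is arbitrary) and leaves the downloaded file untouched.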
| 123 | + |
| 124 | +##### secrets.yaml (Optional) |
| 125 | + |
| 126 | +- At present, InstancePrincipal and OCI Config File (UserPrincipal) based Auth/AuthZ are supported for Fluentd to talk to OCI Logging Analytics APIs. |
| 127 | +- We recommend using InstancePrincipal based AuthZ for OKE and for all clusters running on OCI VMs; it is the default auth type configured. |
| 128 | +- Applying this file is not required when using InstancePrincipal based auth type. |
| 129 | +- If you do apply this file, first fill out the fields under its config section with appropriate values. |
| 130 | + |
| 131 | +##### Commands Reference |
| 132 | + |
| 133 | +Apply the yaml files in this order: configmap-docker.yaml (or configmap-cri.yaml), secrets.yaml (not required for the default auth type), then fluentd-daemonset.yaml. |
| 134 | + |
| 135 | +``` |
| 136 | +$ kubectl apply -f configmap-docker.yaml |
| 137 | +configmap/oci-la-fluentd-logs-configmap created |
| 138 | + |
| 139 | +$ kubectl apply -f secrets.yaml |
| 140 | +secret/oci-la-credentials-secret created |
| 141 | + |
| 142 | +$ kubectl apply -f fluentd-daemonset.yaml |
| 143 | +serviceaccount/oci-la-fluentd-serviceaccount created |
| 144 | +clusterrole.rbac.authorization.k8s.io/oci-la-fluentd-logs-clusterrole created |
| 145 | +clusterrolebinding.rbac.authorization.k8s.io/oci-la-fluentd-logs-clusterrolebinding created |
| 146 | +daemonset.apps/oci-la-fluentd-daemonset created |
| 147 | +``` |
| 148 | + |
| 149 | +After modifying the configmap or secrets, use the following command to restart the DaemonSet so that Fluentd picks up the changes. |
| 150 | + |
| 151 | +``` |
| 152 | +kubectl rollout restart daemonset oci-la-fluentd-daemonset -n=kubectl |
| 153 | +``` |
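To confirm that the restart actually completed, the restart can be followed by `kubectl rollout status`, as in the sketch below. The helper name `restart_and_wait` and the 180-second timeout are assumptions made for illustration; pass whichever namespace the resources were deployed to.

```shell
#!/bin/sh
# Hypothetical helper: restart a workload and block until the rollout finishes.
# $1 = kind (daemonset or deployment), $2 = resource name, $3 = namespace
restart_and_wait() {
    kubectl rollout restart "$1" "$2" -n "$3" &&
        kubectl rollout status "$1/$2" -n "$3" --timeout=180s
}
```

A call such as `restart_and_wait daemonset oci-la-fluentd-daemonset <namespace>` returns a non-zero exit status if the rollout does not finish within the timeout.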
| 154 | + |
| 155 | +#### To enable Kubernetes Objects collection |
| 156 | + |
| 157 | +Download all the yaml files from [this dir](/logan/kubernetes-resources/objects-collection/). |
| 158 | +These yaml files need to be applied using kubectl to create the resources that enable Kubernetes Objects collection into Logging Analytics. |
| 159 | + |
| 160 | +##### configMap-objects.yaml |
| 161 | + |
| 162 | +- This file contains the necessary out of the box fluentd configuration to collect Kubernetes Objects. |
| 163 | +- Refer to [this](https://docs.oracle.com/en/learn/oci_logging_analytics_fluentd/) to learn about each of the Logging Analytics Fluentd Output plugin configuration parameters. |
| 164 | + |
| 165 | +##### fluentd-deployment.yaml |
| 166 | + |
| 167 | +Refer to [this](#fluentd-daemonset.yaml) section. |
| 168 | + |
| 169 | +##### secrets.yaml |
| 170 | + |
| 171 | +Refer to [this](#secrets.yaml) section. |
| 172 | + |
| 173 | +##### Commands Reference |
| 174 | + |
| 175 | +Apply the yaml files in the sequence of configmap-objects.yaml, secrets.yaml (not required for default auth type) and fluentd-deployment.yaml. |
| 176 | + |
| 177 | +``` |
| 178 | +$ kubectl apply -f configmap-objects.yaml |
| 179 | +configmap/oci-la-fluentd-objects-configmap configured |
| 180 | + |
| 181 | +$ kubectl apply -f fluentd-deployment.yaml |
| 182 | +serviceaccount/oci-la-fluentd-serviceaccount unchanged |
| 183 | +clusterrole.rbac.authorization.k8s.io/oci-la-fluentd-objects-clusterrole created |
| 184 | +clusterrolebinding.rbac.authorization.k8s.io/oci-la-fluentd-objects-clusterrolebinding created |
| 185 | +deployment.apps/oci-la-fluentd-deployment created |
| 186 | +``` |
| 187 | + |
| 188 | +After modifying the configmap or secrets, use the following command to restart the Deployment so that Fluentd picks up the changes. |
| 189 | + |
| 190 | +``` |
| 191 | +kubectl rollout restart deployment oci-la-fluentd-deployment -n=kubectl |
| 192 | +``` |
| 193 | + |
| 194 | +### Deploying Kubernetes resources using Helm |
| 195 | + |
| 196 | +Coming soon ... |
| 197 | + |