Commit fede9a7

Merge pull request #1 from santhoshkvuda/master
2 parents 1049c8f + c99a342 commit fede9a7

11 files changed

+1822
-1
lines changed

README.md

Lines changed: 216 additions & 1 deletion
@@ -1 +1,216 @@
1-
# oci-kubernetes-monitoring
1+
# Monitoring Solution for Kubernetes
2+
3+
## About
4+
5+
This project provides an end-to-end monitoring solution for Oracle Container Engine for Kubernetes (OKE) and other Kubernetes clusters, using Logging Analytics, Monitoring and other Oracle Cloud Infrastructure (OCI) services.
6+
7+
## Logs
8+
9+
This solution collects various logs from a Kubernetes cluster into OCI Logging Analytics and offers rich analytics on top of the collected logs. Users may customise the log collection by modifying the out-of-the-box configuration that it provides.
10+
11+
### Kubernetes System/Service Logs
12+
13+
OKE and Kubernetes come with built-in services, each with different responsibilities, that run on one or more nodes in the cluster as either Deployments or DaemonSets.
14+
15+
The following service logs are configured to be collected out of the box:
16+
- Kube Proxy
17+
- Kube Flannel
18+
- Kubelet
19+
- CoreDNS
20+
- CSI Node Driver
21+
- DNS Autoscaler
22+
- Cluster Autoscaler
23+
- Proxymux Client
24+
25+
### Linux System Logs
26+
27+
The following Linux system logs are configured to be collected out of the box:
28+
- Syslog
29+
- Secure logs
30+
- Cron logs
31+
- Mail logs
32+
- Audit logs
33+
- Ksplice Uptrack logs
34+
- Yum logs
35+
36+
### Control Plane Logs
37+
38+
The following are the Control Plane components in OKE/Kubernetes:
39+
- Kube API Server
40+
- Kube Scheduler
41+
- Kube Controller Manager
42+
- Cloud Controller Manager
43+
- etcd
44+
45+
At present, control plane logs are not covered as part of out of the box collection, as these logs are not exposed to OKE customers.
46+
The out of the box collection for these logs will be available soon for generic Kubernetes clusters and for OKE (when OKE makes these logs accessible to end users).
47+
48+
### Application Pod/Container Logs
49+
All the logs from application pods writing STDOUT/STDERR are typically available under /var/log/containers/.
50+
Applications that use custom log handlers (say log4j or similar) may route their logs differently, but in general those logs would still be available on the node (through a volume).
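
As an illustration of what the files under /var/log/containers/ look like, the sketch below splits a file name into its parts. It assumes the conventional kubelet naming scheme `<pod>_<namespace>_<container>-<container_id>.log`, which this README does not state; the function name and the sample file name are hypothetical.

```shell
# Split a /var/log/containers/ file name into pod, namespace, container and id.
# Assumption: names follow <pod>_<namespace>_<container>-<container_id>.log
parse_container_log_name() {
  name=${1##*/}          # drop the directory part
  name=${name%.log}      # drop the .log suffix
  pod=${name%%_*}        # text before the first underscore
  rest=${name#*_}
  namespace=${rest%%_*}  # text between the first and second underscore
  rest=${rest#*_}
  container=${rest%-*}   # everything before the last hyphen
  id=${rest##*-}         # text after the last hyphen
  printf 'pod=%s namespace=%s container=%s id=%s\n' \
    "$pod" "$namespace" "$container" "$id"
}

# Hypothetical file name, for illustration only.
parse_container_log_name /var/log/containers/web-7d9f_default_nginx-0123abcd.log
```

Knowing the pod, namespace and container encoded in the path can help when writing a dedicated source block for one application's logs.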
51+
52+
## Kubernetes Objects
53+
54+
"Kubernetes objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster. Specifically, they can describe:
55+
- What containerized applications are running (and on which nodes)
56+
- The resources available to those applications
57+
- The policies around how those applications behave, such as restart policies, upgrades, and fault-tolerance"
58+
59+
*Reference* : [Kubernetes Objects](https://kubernetes.io/docs/concepts/overview/working-with-objects/kubernetes-objects/)
60+
61+
The following objects are supported at present:
62+
- Nodes
63+
- Namespaces
64+
- Pods
65+
- DaemonSets
66+
- Deployments
67+
- ReplicaSets
68+
- Events
69+
70+
## Installation Instructions
71+
72+
### Pre-requisites
73+
74+
- The Logging Analytics service must be enabled in the given OCI region before trying out this solution. Refer to [Logging Analytics Quick Start](https://docs.oracle.com/en-us/iaas/logging-analytics/doc/quick-start.html) for details.
75+
- Create one or more Logging Analytics log groups if you have not done so already. Refer to [Create Log Group](https://docs.oracle.com/en-us/iaas/logging-analytics/doc/create-logging-analytics-resources.html#GUID-D1758CFB-861F-420D-B12F-34D1CC5E3E0E).
76+
- Enable access to the log group(s) so that logs can be uploaded from the Kubernetes environment:
77+
- For InstancePrincipal based AuthZ (recommended for OKE and Kubernetes clusters running on OCI):
78+
- Create a dynamic group that includes the relevant OCI instances. Refer to [this](https://docs.oracle.com/en-us/iaas/Content/Identity/Tasks/managingdynamicgroups.htm) for details about managing dynamic groups.
79+
- Add an IAM policy such as:
80+
```
81+
Allow dynamic-group <dynamic_group_name> to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment <Logging Analytics LogGroup's compartment_name>
82+
```
83+
- For Config file based (user principal) AuthZ:
84+
- Add an IAM policy such as:
85+
```
86+
Allow group <user_group_name> to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment <Logging Analytics LogGroup's compartment_name>
87+
```
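
The two policy statements differ only in the principal type (`dynamic-group` versus `group`). A minimal sketch of filling in the template from the shell is given here; the function name and the sample group and compartment names are hypothetical, and the output still has to be added to an IAM policy through the usual OCI tooling.

```shell
# Print an IAM policy statement granting log upload access.
#   $1: principal type, "dynamic-group" (InstancePrincipal) or "group" (user principal)
#   $2: dynamic group or user group name
#   $3: compartment name of the Logging Analytics log group
render_upload_policy() {
  case $1 in
    dynamic-group|group) ;;
    *) echo "principal type must be dynamic-group or group" >&2; return 1 ;;
  esac
  printf 'Allow %s %s to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment %s\n' \
    "$1" "$2" "$3"
}

# Hypothetical names, for illustration only.
render_upload_policy dynamic-group oke-nodes-dg logging-compartment
```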
88+
89+
### Docker Image
90+
91+
We are in the process of building a Docker image based on Oracle Linux 8 that includes Fluentd, the OCI Logging Analytics Output Plugin and all the required dependencies.
92+
All the dependencies will be built from source and installed into the image. This image will soon be available to use as-is as a pre-built image, or as a base image for creating a custom image.
93+
At present, for testing purposes, follow the steps below to build an image that uses the official Fluentd Docker image (based on Debian) as its base image.
94+
- Download all the files from [this dir](/logan/docker-images/v1.0/debian/) onto a local machine with internet access.
95+
- Run the following command to build the docker image.
96+
- *docker build -t fluentd_oci_la -f Dockerfile .*
97+
- The Docker image built in the above step can be pushed to Docker Hub, to OCI Container Registry (OCIR) or to a local Docker registry, depending on the requirements.
98+
- [How to push the image to Docker Hub](https://docs.docker.com/docker-hub/repos/#pushing-a-docker-container-image-to-docker-hub)
99+
- [How to push the image to OCIR](https://www.oracle.com/webfolder/technetwork/tutorials/obe/oci/registry/index.html).
100+
- [How to push the image to Local Registry](https://docs.docker.com/registry/deploying/).
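
For the OCIR case, the push typically comes down to tagging the local image with the registry path and pushing that tag. The sketch below only prints the commands. It assumes the OCIR naming scheme `<region-key>.ocir.io/<tenancy-namespace>/<repo>:<tag>` and that `docker login` against the registry has already been done; the function name and sample values are hypothetical, so check the linked OCIR guide for the exact path to use.

```shell
# Print the docker commands for pushing the locally built image to OCIR.
# Assumption: OCIR paths look like <region-key>.ocir.io/<tenancy-namespace>/<repo>:<tag>
#   $1: region key, $2: tenancy namespace, $3: repository name, $4: tag
print_ocir_push_commands() {
  target="$1.ocir.io/$2/$3:$4"
  # fluentd_oci_la is the local tag used by the docker build step in this README.
  printf 'docker tag fluentd_oci_la %s\n' "$target"
  printf 'docker push %s\n' "$target"
}

# Hypothetical values, for illustration only.
print_ocir_push_commands iad mytenancy fluentd_oci_la v1.0
```

The resulting image path is the value that later goes into the `<IMAGE_URL>` placeholder of the deployment manifests.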
101+
102+
### Deploying Kubernetes resources using Kubectl
103+
104+
#### Pre-requisites
105+
106+
- A machine with kubectl installed and set up to point to your Kubernetes environment.
107+
108+
#### To enable Logs collection
109+
110+
Download all the yaml files from [this dir](/logan/kubernetes-resources/logs-collection/).
111+
These yaml files need to be applied using kubectl to create the resources that enable log collection into Logging Analytics through a Fluentd based DaemonSet.
112+
113+
##### configmap-docker.yaml | configmap-cri.yaml
114+
115+
- This file contains the necessary out of the box fluentd configuration to collect Kubernetes System/Service Logs, Linux System Logs and Application Pod/Container Logs.
116+
- Some log locations may differ for Kubernetes clusters other than OKE and EKS, and the configuration may need to be modified accordingly.
117+
- Use configmap-docker.yaml for Kubernetes clusters based on the Docker runtime (e.g., OKE < 1.20) and configmap-cri.yaml for Kubernetes clusters based on CRI-O.
118+
- Inline comments are available in the file for each of the source/filter/match blocks for easy reference for making any changes to the configuration.
119+
- Refer to [this](https://docs.oracle.com/en/learn/oci_logging_analytics_fluentd/) to learn about each of the Logging Analytics Fluentd Output plugin configuration parameters.
120+
- *Note*: Out of the box, a generic source with a time-only parser is configured to collect all application pod logs from /var/log/containers/.
121+
It is recommended to define and use a LogSource/LogParser at Logging Analytics for a given log type and then modify the configuration accordingly.
122+
When adding a configuration (Source, Filter section) for any new container log, also exclude the log path from generic log collection,
123+
by adding the log path to the *exclude_path* field in the *in_tail_containerlogs* source block. This avoids collecting the same logs a second time through the generic log collection.
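
To decide between the two configmap files, the container runtime reported by the nodes can be checked first. The sketch below maps a runtime string to a file name; the function name is hypothetical, and it deliberately refuses to guess for runtimes this README does not mention. The runtime string itself can be read with `kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.containerRuntimeVersion}'`.

```shell
# Map a node's containerRuntimeVersion (e.g. "docker://19.3.11") to the
# configmap file named in this README.
configmap_for_runtime() {
  case $1 in
    docker://*) echo configmap-docker.yaml ;;
    cri-o://*)  echo configmap-cri.yaml ;;
    *)
      # Other runtimes are not covered by the README; check the log format manually.
      echo "unrecognised runtime: $1" >&2
      return 1
      ;;
  esac
}

# Sample runtime string, for illustration only.
configmap_for_runtime "docker://19.3.11"
```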
124+
125+
##### fluentd-daemonset.yaml
126+
127+
- This file has all the resources required to deploy and run the Fluentd Docker image as a DaemonSet.
128+
- Inline comments are available in the file describing each of the fields/sections.
129+
- Make sure to replace the placeholder fields with actual values before deploying.
130+
- At minimum, <IMAGE_URL>, <OCI_LOGGING_ANALYTICS_LOG_GROUP_ID> and <OCI_TENANCY_NAMESPACE> need to be updated.
131+
- It is recommended to update <KUBERNETES_CLUSTER_OCID> and <KUBERNETES_CLUSTER_NAME> too, so that all processed logs are tagged with the corresponding Kubernetes cluster in Logging Analytics.
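
A quick way to catch a forgotten placeholder before deploying is to search the manifest for any remaining `<UPPER_CASE>` token. This is a minimal sketch; the function name is hypothetical, and the pattern assumes placeholders are written exactly in the `<NAME_WITH_UNDERSCORES>` style used above.

```shell
# Report any <UPPER_CASE> placeholder still present in a manifest.
# Prints the matching lines; returns 0 when the file is clean, 1 otherwise.
check_placeholders() {
  if grep -nE '<[A-Z_]+>' "$1"; then
    echo "unreplaced placeholders found in $1" >&2
    return 1
  fi
  return 0
}

# Example use (not run here):
#   check_placeholders fluentd-daemonset.yaml && kubectl apply -f fluentd-daemonset.yaml
```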
132+
133+
##### secrets.yaml (Optional)
134+
135+
- At present, InstancePrincipal and OCI Config File (UserPrincipal) based Auth/AuthZ are supported for Fluentd to talk to OCI Logging Analytics APIs.
136+
- We recommend using InstancePrincipal based AuthZ for OKE and all clusters running on OCI VMs; it is the default auth type configured.
137+
- Applying this file is not required when using InstancePrincipal based auth type.
138+
- When config file based AuthZ is used, modify this file and fill in the entries under the config section with appropriate values.
139+
140+
##### Commands Reference
141+
142+
Apply the yaml files in this order: configmap-docker.yaml (or configmap-cri.yaml), secrets.yaml (not required for the default auth type) and fluentd-daemonset.yaml.
143+
144+
```
145+
$ kubectl apply -f configmap-docker.yaml
146+
configmap/oci-la-fluentd-logs-configmap created
147+
148+
$ kubectl apply -f secrets.yaml
149+
secret/oci-la-credentials-secret created
150+
151+
$ kubectl apply -f fluentd-daemonset.yaml
152+
serviceaccount/oci-la-fluentd-serviceaccount created
153+
clusterrole.rbac.authorization.k8s.io/oci-la-fluentd-logs-clusterrole created
154+
clusterrolebinding.rbac.authorization.k8s.io/oci-la-fluentd-logs-clusterrolebinding created
155+
daemonset.apps/oci-la-fluentd-daemonset created
156+
```
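
The apply order above can be wrapped in a small helper that skips secrets.yaml for the default auth type. This is a sketch only: the function name and the `KUBECTL` variable are hypothetical, and by default it prints the commands instead of running them (set `KUBECTL=kubectl` to execute).

```shell
# Apply the log-collection manifests in the documented order.
#   $1: configmap file (configmap-docker.yaml or configmap-cri.yaml)
#   $2: auth type; secrets.yaml is applied only when this is not "InstancePrincipal"
# KUBECTL defaults to "echo kubectl" (dry run); set KUBECTL=kubectl to execute.
apply_logs_collection() {
  kubectl_cmd=${KUBECTL:-echo kubectl}
  # $kubectl_cmd is left unquoted on purpose so "echo kubectl" splits into two words.
  $kubectl_cmd apply -f "$1" || return 1
  if [ "$2" != "InstancePrincipal" ]; then
    $kubectl_cmd apply -f secrets.yaml || return 1
  fi
  $kubectl_cmd apply -f fluentd-daemonset.yaml
}

# Dry run for the default auth type: prints two kubectl commands.
apply_logs_collection configmap-docker.yaml InstancePrincipal
```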
157+
158+
Use the following command to restart the DaemonSet after modifying the configmap or secrets, so that Fluentd picks up the changes.
159+
160+
```
161+
kubectl rollout restart daemonset oci-la-fluentd-daemonset -n=kube-system
162+
```
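
After a restart it can be useful to wait until the new pods are ready. The sketch below pairs the restart with `kubectl rollout status`; the function name, the `KUBECTL` variable and the 120 second timeout are hypothetical choices, and the helper prints the commands by default instead of running them.

```shell
# Restart a workload in kube-system and wait for the rollout to finish.
#   $1: resource kind (daemonset or deployment), $2: resource name
# KUBECTL defaults to "echo kubectl" (dry run); set KUBECTL=kubectl to execute.
restart_and_wait() {
  kubectl_cmd=${KUBECTL:-echo kubectl}
  $kubectl_cmd rollout restart "$1" "$2" -n kube-system || return 1
  $kubectl_cmd rollout status "$1/$2" -n kube-system --timeout=120s
}

# Dry run: prints the restart and status commands for the logs DaemonSet.
restart_and_wait daemonset oci-la-fluentd-daemonset
```

The same helper works for the objects collection by passing `deployment oci-la-fluentd-deployment`.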
163+
164+
#### To enable Kubernetes Objects collection
165+
166+
Download all the yaml files from [this dir](/logan/kubernetes-resources/objects-collection/).
167+
These yaml files need to be applied using kubectl to create the resources that enable Kubernetes Objects collection into Logging Analytics.
168+
169+
##### configMap-objects.yaml
170+
171+
- This file contains the necessary out of the box fluentd configuration to collect Kubernetes Objects.
172+
- Refer to [this](https://docs.oracle.com/en/learn/oci_logging_analytics_fluentd/) to learn about each of the Logging Analytics Fluentd Output plugin configuration parameters.
173+
174+
##### fluentd-deployment.yaml
175+
176+
Refer to [this](#fluentd-daemonsetyaml) section.
177+
178+
##### secrets.yaml (Optional)
179+
180+
Refer to [this](#secretsyaml-optional) section.
181+
182+
##### Commands Reference
183+
184+
Apply the yaml files in this order: configmap-objects.yaml, secrets.yaml (not required for the default auth type) and fluentd-deployment.yaml.
185+
186+
```
187+
$ kubectl apply -f configmap-objects.yaml
188+
configmap/oci-la-fluentd-objects-configmap configured
189+
190+
$ kubectl apply -f fluentd-deployment.yaml
191+
serviceaccount/oci-la-fluentd-serviceaccount unchanged
192+
clusterrole.rbac.authorization.k8s.io/oci-la-fluentd-objects-clusterrole created
193+
clusterrolebinding.rbac.authorization.k8s.io/oci-la-fluentd-objects-clusterrolebinding created
194+
deployment.apps/oci-la-fluentd-deployment created
195+
```
196+
197+
Use the following command to restart the Deployment after modifying the configmap or secrets, so that Fluentd picks up the changes.
198+
199+
```
200+
kubectl rollout restart deployment oci-la-fluentd-deployment -n=kube-system
201+
```
202+
203+
### Deploying Kubernetes resources using Helm
204+
205+
Coming soon ...
206+
207+
208+
209+
210+
211+
212+
213+
214+
215+
216+
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
### Referred version from fluentd-kubernetes-daemonset GitHub repo ###
2+
3+
FROM fluent/fluentd:v1.14.1-debian-1.0
4+
5+
USER root
6+
WORKDIR /home/fluent
7+
ENV PATH=/fluentd/vendor/bundle/ruby/2.7.0/bin:$PATH
8+
ENV GEM_PATH=/fluentd/vendor/bundle/ruby/2.7.0
9+
ENV GEM_HOME=/fluentd/vendor/bundle/ruby/2.7.0
10+
# skip runtime bundler installation
11+
ENV FLUENTD_DISABLE_BUNDLER_INJECTION=1
12+
13+
COPY Gemfile* /fluentd/
14+
RUN buildDeps="sudo make gcc g++ libc-dev libffi-dev" \
15+
runtimeDeps="" \
16+
&& apt-get update \
17+
&& apt-get upgrade -y \
18+
&& apt-get install \
19+
-y --no-install-recommends \
20+
$buildDeps $runtimeDeps net-tools \
21+
&& gem install bundler --version 2.1.4 \
22+
&& bundle config silence_root_warning true \
23+
&& bundle config --local path /fluentd/vendor/bundle \
24+
&& bundle install --gemfile=/fluentd/Gemfile \
25+
&& SUDO_FORCE_REMOVE=yes \
26+
apt-get purge -y --auto-remove \
27+
-o APT::AutoRemove::RecommendsImportant=false \
28+
$buildDeps \
29+
&& rm -rf /var/lib/apt/lists/* \
30+
&& gem sources --clear-all \
31+
&& rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem
32+
33+
RUN touch /fluentd/etc/disable.conf
34+
35+
COPY entrypoint.sh /fluentd/entrypoint.sh
36+
37+
# Environment variables
38+
ENV FLUENTD_CONF="/fluentd/etc/fluent.conf"
39+
40+
# Give execution permission to entrypoint.sh
41+
RUN ["chmod", "+x", "/fluentd/entrypoint.sh"]
42+
43+
# Overwrite ENTRYPOINT to run fluentd as root for /var/log / /var/lib
44+
ENTRYPOINT ["tini", "--", "/fluentd/entrypoint.sh"]
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
source "https://rubygems.org"
2+
3+
gem "fluentd", "1.14.1"
4+
gem "fluent-plugin-oci-logging-analytics", "2.0.0"
5+
gem "fluent-plugin-concat", "~> 2.5.0"
6+
gem "fluent-plugin-rewrite-tag-filter", "~> 2.4.0"
7+
gem "fluent-plugin-parser-cri", "~> 0.1.1"
8+
gem "fluent-plugin-kubernetes_metadata_filter", "2.9.1"
9+
gem "fluent-plugin-kubernetes-objects", "~> 1.1.7"
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
#!/usr/bin/env sh
2+
3+
exec fluentd -c "${FLUENTD_CONF}" -p /fluentd/plugins --gemfile /fluentd/Gemfile ${FLUENTD_OPT}
