| 1 | +# C3 OKE - Monitoring with OCI Log Analytics |
| 2 | + |
| 3 | +## Introduction |
| 4 | + |
| 5 | +This page details an all-OCI solution for monitoring Kubernetes clusters |
| 6 | +running on an Oracle Compute Cloud@Customer (C3) rack. |
| 7 | + |
| 8 | +## Overview |
| 9 | + |
| 10 | +C3 includes an OKE-compatible Kubernetes-as-a-Service that allows the |
| 11 | +easy provisioning of Kubernetes clusters. The C3 currently contains no |
| 12 | +utilities to help the user manage multiple clusters on the rack, which |
| 13 | +currently supports up to 20 clusters. While it is possible to use widely |
| 14 | +available solutions for cluster management such as Rancher or Lens, some |
| 15 | +customers are looking for an \"all Oracle\" solution. Fortunately, OCI |
| 16 | +Log Analytics provides a [Kubernetes |
| 17 | +solution](https://docs.oracle.com/en-us/iaas/logging-analytics/doc/kubernetes-solution.html) |
| 18 | +that allows customers to: |
| 19 | + |
| 20 | +>... to monitor and generate insights into your Kubernetes deployed in |
| 21 | +OCI, third party public clouds, private clouds, or on-premises including |
| 22 | +managed Kubernetes deployments. |
| 23 | + |
| 24 | +This solution will allow a user to monitor C3 clusters from their OCI |
| 25 | +tenancy across multiple C3 racks. There are 5 provided dashboards which |
| 26 | +work with the solution\'s log and metric collection to visualise cluster |
| 27 | +status. |
| 28 | + |
| 29 | + |
| 30 | + |
| 31 | +The OCI Log Analytics solution for Kubernetes provides a push-button |
| 32 | +solution for registering OKE clusters running in OCI. For other clusters, |
| 33 | +such as non-OKE ones or those running on C3, there is an alternative |
| 34 | +registration method using the [Helm](https://helm.sh/) package manager. |
| 35 | +The Helm packages and installation instructions are published by Oracle |
| 36 | +in the |
| 37 | +[oci-kubernetes-monitoring](https://github.com/oracle-quickstart/oci-kubernetes-monitoring/tree/main?tab=readme-ov-file) |
| 38 | +GitHub repo. Once the prerequisite OCI resources have been set up, it\'s |
| 39 | +possible to deploy the solution with a single helm install command |
| 40 | +together with the customisations required per cluster. The solution will |
| 41 | +deploy 3 elements to the target cluster: |
| 42 | + |
| 43 | +1. An OCI Management Agent running in a pod. By default there is only a |
| 44 | +   single instance of the management agent per cluster; additional |
| 45 | +   replicas can be specified. |
| 46 | + |
| 47 | +    1. The agent collects Kubernetes-specific metrics in the |
| 48 | +       *mgmtagent_kubernetes_metrics* namespace. |
| 49 | + |
| 50 | +2. Log collectors running on each node. These collectors use the OCI |
| 51 | +   Log Analytics log ingestion API to send OS, Kubernetes and container |
| 52 | +   logs to Log Analytics for later analysis. |
| 53 | + |
| 54 | +3. An OCI management discovery task running as a Kubernetes cron job. |
| 55 | +   This detects the required entities, e.g. clusters and nodes, and |
| 56 | +   keeps them in sync with changes to the cluster. |
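
Once the chart is installed, the three elements listed above can be checked from a workstation with `kubectl`. The following is a minimal sketch rather than part of the published instructions: the function name `check_monitoring_components` is hypothetical, and the default namespace `oci-onm` is an assumption that should be confirmed against the chart values used for your release.

```shell
# Hypothetical helper: list the workloads that the monitoring solution deploys.
# The default namespace "oci-onm" is an assumption; pass your own if it differs.
check_monitoring_components() {
  namespace="${1:-oci-onm}"
  # Pods include the management agent, daemonsets the per-node log collectors,
  # and cronjobs the discovery task.
  kubectl get pods,daemonsets,cronjobs --namespace "$namespace"
}
```

Call it as `check_monitoring_components` for the assumed default, or `check_monitoring_components my-namespace` if the release was installed elsewhere.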
| 57 | + |
| 58 | +## Summary of Prerequisites |
| 59 | + |
| 60 | +There are some OCI prerequisites required to install the solution that |
| 61 | +are listed in the [Pre-requisite |
| 62 | +section](https://github.com/oracle-quickstart/oci-kubernetes-monitoring/tree/main?tab=readme-ov-file#pre-requisites) |
| 63 | +in GitHub. These include: |
| 64 | + |
| 65 | +- Initial onboarding of the OCI Log Analytics service in the parent |
| 66 | + tenancy. |
| 67 | + |
| 68 | +- A Log Analytics Log Group. Note that the OCI Logging service also |
| 69 | + has a resource named Log Group! |
| 70 | + |
| 71 | +- An [OCI Management Agent install |
| 72 | + key](https://docs.oracle.com/en-us/iaas/management-agents/doc/management-agents-administration-tasks.html#GUID-C841426A-2C32-4630-97B6-DF11F05D5712) |
| 73 | + must be created. |
| 74 | + |
| 75 | +- The user groups and policies required. |
| 76 | + |
| 77 | +    - Note that a dynamic group is required for all the OCI management |
| 78 | +      agents that the solution registers in the OCI tenancy. A policy |
| 79 | +      that allows this dynamic group to use the OCI metrics service is also required. |
| 80 | + |
| 81 | +    - A dynamic group for the OKE instances on the C3 will **not** |
| 82 | +      allow workload identity propagation as it does for OKE instances in OCI. |
| 83 | + |
| 84 | +    - The solution deployed on C3 should use authentication based on a |
| 85 | +      user principal with the OCI config file deployed within the |
| 86 | +      pods. In the helm values override file the |
| 87 | +      [*authtype* should be set to |
| 88 | +      *config*](https://github.com/oracle-quickstart/oci-kubernetes-monitoring/blob/main/docs/FAQ.md#how-to-use-configfile-based-authz-user-principal-instead-of-default-authz-instance-principal-). |
| 89 | +      A policy that allows the user group to upload to the Log |
| 90 | +      Analytics service is also required. |
| 91 | + |
| 92 | +- The solution requires images to be pulled from external repositories |
| 93 | +  so, if required, [a proxy should be |
| 94 | +  configured](https://docs.oracle.com/en-us/iaas/compute-cloud-at-customer/topics/oke/configuring-a-proxy.htm) |
| 95 | +  on **all** of the cluster\'s nodes (worker and master). This also |
| 96 | + implies that the master nodes **must** be configured with an [SSH |
| 97 | + public |
| 98 | + key](https://docs.oracle.com/en-us/iaas/compute-cloud-at-customer/topics/oke/creating-a-kubernetes-cluster.htm#:~:text=Your%20public%20SSH%20key.). |
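
The dynamic group and the two policies described above can be drafted before they are created in the OCI Console. The sketch below only prints candidate statements for review. The function name and all argument values are hypothetical, and the statement wording is an assumption based on general OCI policy syntax, so compare it with the pre-requisites section of the GitHub repo before use.

```shell
# Hypothetical helper: print draft IAM statements for review.
# Nothing is created in OCI; all argument values are placeholders.
print_monitoring_policies() {
  agent_dynamic_group="$1"   # dynamic group for the management agents
  user_group="$2"            # group of the user whose API key is deployed in the pods
  compartment_name="$3"      # compartment name used in the policy statements
  compartment_ocid="$4"      # OCID of the compartment where the agents register
  cat <<EOF
# Dynamic group matching rule (assumed form)
ALL {resource.type='managementagent', resource.compartment.id='${compartment_ocid}'}
# Policy statements (assumed form)
Allow dynamic-group ${agent_dynamic_group} to use metrics in compartment ${compartment_name}
Allow group ${user_group} to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment ${compartment_name}
EOF
}
```

For example, `print_monitoring_policies agents-dg k8s-users monitoring ocid1.compartment.oc1..example` prints one matching rule and two policy statements that can be pasted into the Console after checking them.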
| 99 | + |
| 100 | +## Example Custom Values File |
| 101 | + |
| 102 | +Helm allows default values to be overridden using a yaml file. This |
| 103 | +example can be customised for cluster and tenancy specifics. |
| 104 | + |
| 105 | +### override-values.yaml ### |
| 106 | + |
| 107 | +```yaml |
| 108 | +global: |
| 109 | +  # OCID for OKE cluster or a unique ID for other Kubernetes clusters. Use the OCID from C3. |
| 110 | +  kubernetesClusterID: ocid1.cluster.rackserno.c3region.77777777777777777777777777777777777 |
| 111 | +  # Provide a unique name for the cluster. This helps to uniquely |
| 112 | +  # identify the logs and metrics data in OCI Logging Analytics |
| 113 | +  # and OCI Monitoring respectively. |
| 114 | +  kubernetesClusterName: auniqueclustername |
| 115 | + |
| 116 | +oci-onm-logan: |
| 117 | +  # Go to OCI Logging Analytics Administration, click Service Details, |
| 118 | +  # and note the namespace value. |
| 119 | +  ociLANamespace: namespace |
| 120 | + |
| 121 | +  privileged: true |
| 122 | + |
| 123 | +  # OCI Logging Analytics Log Group OCID |
| 124 | +  ociLALogGroupID: ocid1.loganalyticsloggroup.oc1.uk-london-1.77777777777777777777777777777777777777 |
| 125 | + |
| 126 | +  # On C3 use config file based authentication |
| 127 | +  authtype: config |
| 128 | + |
| 129 | +  # OCI API key based authentication details. Required when authtype is set to config. |
| 130 | +  oci: |
| 131 | + |
| 132 | +    # Path to the OCI API config file. Ensure this matches the path in the config file below. |
| 133 | +    path: /var/opt/.oci |
| 134 | +    # Config file name |
| 135 | +    file: config |
| 136 | + |
| 137 | +    configFiles: |
| 138 | +      config: |- |
| 139 | +        # Replace each of the below fields with actual values. |
| 140 | + |
| 141 | +        [DEFAULT] |
| 142 | +        user=ocid1.user.oc1..auserinocitenancy |
| 143 | +        fingerprint=AP:IK:EY:FI:NG:ER:PR:IN:T |
| 144 | +        key_file=/var/opt/.oci/private.pem |
| 145 | +        tenancy=ocid1.tenancy.oc1..parenttenancy |
| 146 | +        region=an-ociregion-1 |
| 147 | +      private.pem: |- |
| 148 | +        -----BEGIN RSA PRIVATE KEY----- |
| 149 | +        UsersPEMprivatekey= |
| 150 | +        -----END RSA PRIVATE KEY----- |
| 151 | +

| 152 | +oci-onm-mgmt-agent: |
| 153 | +  mgmtagent: |
| 154 | +    # Provide the base64 encoded content of the Management Agent Install |
| 155 | +    # Key file. Copy file as per |
| 156 | +    # https://docs.oracle.com/en-us/iaas/management-agents/doc/management-agents-administration-tasks.html#GUID-3101FB2F-D774-42CA-A461-A850F0A4087C |
| 157 | +    # and base64 encode. |
| 158 | +    installKeyFileContent: base64encodedinstallkey= |
| 159 | +``` |
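
The `installKeyFileContent` value in the file above is the base64 encoding of the downloaded install key file. A minimal sketch of producing it on a single line follows. The function name `encode_install_key` and the key file name in the usage note are hypothetical.

```shell
# Hypothetical helper: base64-encode the downloaded install key file on one line.
encode_install_key() {
  key_file="$1"
  # tr removes the line wrapping that some base64 implementations add,
  # so the result can be pasted as a single YAML scalar.
  base64 < "$key_file" | tr -d '\n'
}
```

Running `encode_install_key mgmt_agent_install.key` prints the encoded string, which replaces the `base64encodedinstallkey=` placeholder.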
| 160 | +
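
With the override file in place, the solution can be installed with Helm. The wrapper below is a sketch under stated assumptions, not the published procedure: the function name is hypothetical, and the repository URL, the repository alias `oci-onm` and the chart reference `oci-onm/oci-onm` are recalled from the project README and should be verified there. Only standard Helm subcommands and the `--values` flag are used.

```shell
# Hypothetical wrapper around the Helm commands; every default is an assumption
# to be checked against the oci-kubernetes-monitoring README.
install_monitoring() {
  release="${1:-oci-kubernetes-monitoring}"   # release name, any unique name works
  values_file="${2:-override-values.yaml}"    # the customised values file
  repo_url="${REPO_URL:-https://oracle-quickstart.github.io/oci-kubernetes-monitoring}"
  # Register the chart repository, refresh the index, then install the chart
  # with the per-cluster overrides.
  helm repo add oci-onm "$repo_url" &&
    helm repo update &&
    helm install "$release" oci-onm/oci-onm --values "$values_file"
}
```

For example, `install_monitoring c3-cluster-1 override-values.yaml` installs one release for one cluster. Each cluster needs its own values file because the cluster OCID and name differ.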
|
| 161 | +## Limitations and Alternatives |
| 162 | +
|
| 163 | +This solution is based on OCI\'s Log Analytics and as such is a |
| 164 | +\"read-only\" solution, i.e. Kubernetes resources cannot be modified. |
| 165 | +Drilling into the details brings up the underlying log entries rather |
| 166 | +than details such as the k8s resource spec. |
| 167 | +There are associated log storage costs beyond 10 GB, see |
| 168 | +<https://www.oracle.com/uk/manageability/pricing/#logging-analytics>. It is possible to set up an archive policy and/or a purge policy to control costs. |
| 169 | +
|
| 170 | +There are associated monitoring costs for metrics, see |
| 171 | +<https://www.oracle.com/uk/manageability/pricing/#monitoring>. |
| 172 | +
|
| 173 | +As the OKE clusters on C3 are CNCF conformant, most other k8s |
| 174 | +monitoring solutions should work, e.g. Rancher or Lens. In some cases it |
| 175 | +may not be desirable to give k8s cluster users access to the OCI |
| 176 | +Console and services like Log Analytics, so a management utility running |
| 177 | +on the C3, like Rancher, or running on the user\'s workstation, like Lens, |
| 178 | +may be a better choice. |
| 179 | +
|
| 180 | +# License |
| 181 | + |
| 182 | +Copyright (c) 2024 Oracle and/or its affiliates. |
| 183 | + |
| 184 | +Licensed under the Universal Permissive License (UPL), Version 1.0. |
| 185 | + |
| 186 | +See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/folder-structure/LICENSE) for more details. |
| 187 | +
|