73 changes: 61 additions & 12 deletions charts/logan/README.md

Large diffs are not rendered by default.

6 changes: 5 additions & 1 deletion charts/logan/templates/discovery-cronjob.yaml
@@ -25,6 +25,10 @@ spec:
spec:
restartPolicy: {{ .Values.k8sDiscovery.objects.restartPolicy }}
serviceAccountName: {{ $serviceAccount }}
{{- if .Values.tolerations }}
tolerations:
{{- toYaml .Values.tolerations | nindent 10 }}
{{- end }}
{{- if .Values.image.imagePullSecrets }}
imagePullSecrets:
- name: {{ .Values.image.imagePullSecrets }}
@@ -176,4 +180,4 @@ spec:
sources:
- secret:
name: {{ $resourceNamePrefix }}-oci-config
{{- end }}
{{- end }}
3 changes: 3 additions & 0 deletions charts/logan/templates/fluentd-daemonset.yaml
@@ -35,6 +35,9 @@ spec:
effect: NoSchedule
- key: node-role.kubernetes.io/control-plane
effect: NoSchedule
{{- if .Values.tolerations }}
{{- toYaml .Values.tolerations | nindent 6 }}
{{- end }}
{{- if $imagePullSecrets }}
imagePullSecrets:
- name: {{ .Values.image.imagePullSecrets }}
5 changes: 4 additions & 1 deletion charts/logan/templates/tcpconnect-daemonset.yaml
@@ -31,6 +31,9 @@ spec:
effect: NoSchedule
- key: node-role.kubernetes.io/control-plane
effect: NoSchedule
{{- if .Values.tolerations }}
{{- toYaml .Values.tolerations | nindent 6 }}
{{- end }}
Review comment (Collaborator):

We need a tier-based approach to applying tolerations, because toleration requirements differ between clients.
For example, the discovery client can run on any host, but the logan fluentd clients must run on every worker node.

There should be default tolerations defined at global scope, plus additional tolerations for specific clients (mgmt_agent, fluentd, discovery, tcpconnect, etc.).

Helm should compute the final tolerations for each client and configure the templates accordingly.

You can refer to #93 to see how we accept the same property in multiple sections and decide on the final value at run time. For timezone we use a priority-based approach, but for tolerations we will need a consolidation-based approach.

Let's also move the current hard-coded tolerations to values.yaml (as default values) so that we remain backward compatible.

- key: node-role.kubernetes.io/master
  effect: NoSchedule
- key: node-role.kubernetes.io/control-plane
  effect: NoSchedule
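
One possible reading of this suggestion, as a minimal sketch and not the chart's actual implementation: keep the currently hard-coded tolerations as global defaults in values.yaml, let each client add its own list, and have a named template concatenate the two. The `global.tolerations` and `fluentd.tolerations` keys and the `logan.tolerations` helper name are hypothetical; none of them exist in the chart as submitted.

```yaml
# values.yaml (hypothetical keys, for illustration only)
global:
  # Defaults for every client; same entries the templates hard-code today.
  tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
    - key: node-role.kubernetes.io/control-plane
      effect: NoSchedule
fluentd:
  # Client-specific additions; appended to the global list, not replacing it.
  tolerations: []
```

```yaml
{{/* templates/_helpers.tpl: hypothetical helper that consolidates both tiers */}}
{{- define "logan.tolerations" -}}
{{- $global := .root.Values.global.tolerations | default (list) -}}
{{- $client := .client | default (list) -}}
{{- $all := concat $global $client -}}
{{- if $all -}}
tolerations:
{{- toYaml $all | nindent 2 }}
{{- end -}}
{{- end -}}

# templates/fluentd-daemonset.yaml: emit the block only when the merged list is non-empty
      {{- with include "logan.tolerations" (dict "root" . "client" .Values.fluentd.tolerations) }}
      {{- . | nindent 6 }}
      {{- end }}
```

The sketch appends lists rather than picking one by priority, which is the consolidation behaviour asked for above. It does not remove duplicate entries, and it assumes `.Values.global` is always defined.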

Review comment (Collaborator):

@alcampag, can you please address these comments and resubmit?

{{- if $imagePullSecrets }}
imagePullSecrets:
- name: {{ .Values.image.imagePullSecrets }}
@@ -72,4 +75,4 @@ spec:
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
{{- end }}
{{- end }}
27 changes: 27 additions & 0 deletions charts/logan/values.schema.json
@@ -68,6 +68,33 @@
},
"ociLAClusterEntityID": {
"type": "string"
},
"tolerations": {
"type": "array",
"items": {
"type": "object",
"properties": {
"key": {
"type": "string"
},
"operator": {
"type": "string",
"enum": ["Equal", "Exists"]
},
"value": {
"type": "string"
},
"effect": {
"type": "string",
"enum": ["NoSchedule", "PreferNoSchedule", "NoExecute"]
},
"tolerationSeconds": {
"type": "integer"
}
},
"additionalProperties": false
},
"default": []
}
}
}
9 changes: 9 additions & 0 deletions charts/logan/values.yaml
@@ -139,6 +139,15 @@ resourceOverrides:
cpu: 100m
memory: 250Mi

# -- Custom tolerations to apply to all pods in the chart.
# Default: [] (no additional tolerations)
# Example:
# tolerations:
# - key: "example-taint"
# operator: "Exists"
# effect: "NoSchedule"
tolerations: []

# -- @param extraVolumes Extra volumes.
# Example:
# - name: tmpDir
50 changes: 26 additions & 24 deletions charts/mgmt-agent/README.md
@@ -1,56 +1,58 @@
# oci-onm-mgmt-agent

![Version: 3.0.0](https://img.shields.io/badge/Version-3.0.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.16.0](https://img.shields.io/badge/AppVersion-1.16.0-informational?style=flat-square)
![Version: 3.0.5](https://img.shields.io/badge/Version-3.0.5-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.16.0](https://img.shields.io/badge/AppVersion-1.16.0-informational?style=flat-square)

A Helm chart for collecting Kubernetes Metrics using OCI Management Agent into OCI Monitoring.

## Requirements

| Repository | Name | Version |
|------------|------|---------|
| file://../common | oci-onm-common | 3.0.0 |
| file://../common | oci-onm-common | 3.1.0 |

## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| deployMetricServer | bool | `true` | By default, metric server will be deployed and used by Management Agent to collect metrics. You can set this to false if you already have metric server installed on your cluster |
| deployment.cleanupEpochTime | string | `nil` | |
| deployment.daemonSet.hostPath | string | `nil` | |
| deployment.daemonSet.overrideOwnership | bool | `true` | |
| deployment.daemonSetDeployment | bool | `false` | |
| deployment.resource.limit.cpuCore | string | `"500m"` | |
| deployment.resource.limit.memory | string | `"1Gi"` | |
| deployment.resource.request.cpuCore | string | `"200m"` | |
| deployment.resource.request.memory | string | `"500Mi"` | |
| deployment.resource.request.storage | string | `"2Gi"` | |
| deployment.security.fsGroup | int | `2000` | |
| deployment.security.runAsGroup | int | `2000` | |
| deployment.security.runAsUser | int | `1000` | |
| deployment.storageClass | string | `nil` | |
Review comment (Collaborator):

Let's not update unrelated properties in this transaction.

| global.namespace | string | `"oci-onm"` | Kubernetes Namespace in which the resources to be created. Set oci-kubernetes-monitoring-common:createNamespace set to true, if the namespace doesn't exist. |
| global.resourceNamePrefix | string | `"oci-onm"` | Prefix to be attached to resources created through this chart. Not all resources may have this prefix. |
| kubernetesCluster.compartmentId | string | `nil` | OCI Compartment Id to push Kubernetes Monitoring metrics. If not specified default is same as Agent compartment |
| kubernetesCluster.enableAutomaticPrometheusDetection | bool | `false` | |
| kubernetesCluster.monitoringNamespace | string | `nil` | OCI namespace to push Kubernetes Monitoring metrics. The namespace should match the pattern '^[a-z][a-z0-9_]*[a-z0-9]$'. By default metrics will be pushed to 'mgmtagent_kubernetes_metrics' |
| kubernetesCluster.name | string | `nil` | Kubernetes cluster name |
| kubernetesCluster.namespace | string | `"*"` | Kubernetes cluster namespace(s) to monitor. This can be a comma-separated list of namespaces or '*' to monitor all the namespaces |
| kubernetesCluster.monitoringNamespace | string | `nil` | OCI namespace to push Kubernetes Monitoring metrics. The namespace should match the pattern '^[a-z][a-z0-9_]*[a-z0-9]$'. By default metrics will be pushed to 'mgmtagent_kubernetes_metrics' |
| kubernetesCluster.overrideAllowMetricsAPIServer | string | `nil` | Provide the specific list of comma separated metric names for agent computed metrics to be collected. |
| kubernetesCluster.overrideAllowMetricsCluster | string | `nil` | Provide the specific list of comma separated metric names for agent computed metrics to be collected |
| kubernetesCluster.overrideAllowMetricsKubelet | string | `nil` | Provide the specific list of comma separated metric names for Kubelet (/api/v1/nodes/<node_name>/proxy/metrics) metrics to be collected |
| kubernetesCluster.overrideAllowMetricsNode | string | `nil` | Provide the specific list of comma separated metric names for Node (/api/v1/nodes/<node_name>/proxy/metrics/resource, /api/v1/nodes/<node_name>/proxy/metrics/cadvisor) metrics to be collected |
| kubernetesCluster.enableAutomaticPrometheusDetection | bool | `false` | Setting this to true will enable automatic PrometheusEmitter metrics collection from eligible pods |
| kubernetesCluster.overrideAllowMetricsAPIServer | string | `nil` | Provide the specific list of comma separated metric names for API server (/metrics) metrics to be collected. |
| kubernetesCluster.overrideAllowMetricsCluster | string | `nil` | Provide the specific list of comma separated metric names for agent computed metrics to be collected. |
| kubernetesCluster.overrideAllowMetricsKubelet | string | `nil` | Provide the specific list of comma separated metric names for Kubelet (/api/v1/nodes/<node_name>/proxy/metrics) metrics to be collected. |
| kubernetesCluster.overrideAllowMetricsNode | string | `nil` | Provide the specific list of comma separated metric names for Node (/api/v1/nodes/<node_name>/proxy/metrics/resource, /api/v1/nodes/<node_name>/proxy/metrics/cadvisor) metrics to be collected. |
| mgmtagent.extraEnv[0].name | string | `"DISABLE_JRE_DEFAULT_SECURITY_PROPERTIES_FILE"` | |
| mgmtagent.extraEnv[0].value | string | `"false"` | |
| mgmtagent.image.secret | string | `nil` | Image secrets to use for pulling container image (base64 encoded content of ~/.docker/config.json file) |
| mgmtagent.image.url | string | `nil` | Replace this value with actual docker image URL for Management Agent |
| mgmtagent.installKey | string | `"resources/input.rsp"` | Copy the downloaded Management Agent Install Key file under root helm directory as resources/input.rsp |
| mgmtagent.installKeyFileContent | string | `nil` | Provide the base64 encoded content of the Management Agent Install Key file (e.g. `cat input.rsp \| base64 -w 0`) |
| mgmtagent.extraEnv | string | `nil` | Please specify additional environment variables in name:value pairs |
| mgmtagent.installKeyFileContent | string | `nil` | Provide the base64 encoded content of the Management Agent Install Key file (e.g. cat input.rsp | base64 -w 0) |
| namespace | string | `"{{ .Values.global.namespace }}"` | Kubernetes namespace to create and install this helm chart in |
| oci-onm-common.createNamespace | bool | `true` | If createNamespace is set to true, it tries to create the namespace defined in 'namespace' variable. |
| oci-onm-common.createServiceAccount | bool | `true` | By default, a cluster role, cluster role binding and serviceaccount will be created for the monitoring pods to be able to (readonly) access various objects within the cluster, to support collection of various telemetry data. You may set this to false and provide your own serviceaccount (in the parent chart(s)) which has the necessary cluster role(s) binded to it. Refer, README for the cluster role definition and other details. |
| oci-onm-common.namespace | string | `"{{ .Values.global.namespace }}"` | Kubernetes Namespace in which the serviceaccount to be created. |
| oci-onm-common.resourceNamePrefix | string | `"{{ .Values.global.resourceNamePrefix }}"` | Prefix to be attached to resources created through this chart. Not all resources may have this prefix. |
| oci-onm-common.serviceAccount | string | `"{{ .Values.global.resourceNamePrefix }}"` | Name of the Kubernetes ServiceAccount |
| serviceAccount | string | `"{{ .Values.global.resourceNamePrefix }}"` | Name of the Kubernetes ServiceAccount |
| deployment.security.runAsUser | integer | `1000` | Processes in the Container will use the specified user ID |
| deployment.security.runAsGroup | integer | `2000` | Processes in the Container will use the specified group ID |
| deployment.security.fsGroup | integer | `2000` | Files created in the Container will use the specified group ID |
| deployment.cleanupEpochTime | integer | `nil` | Please provide the current epoch time in seconds (Eg: Executing the following command in a bash shell will provide the epoch time: "date +%s") to clean up the agent installation directory from previous deployment |
| deployment.daemonSetDeployment | bool | `false` | Setting the daemonset deployment to true, will deploy the Management Agents as a daemonset in addition to deploying the Management Agent as a statefulset. This is done to to distribute the node metrics collection to agents running on the node |
| deployment.daemonSet.hostPath | string | `nil` | The host path to store data, if Agent is deployed as DaemonSet. Management Agent Pod should have read-write access to it |
| deployment.daemonSet.overrideOwnership | bool | `true` | Override the ownership and permissions on the hostPath. The hostPath will be owned by the runAsUser and runAsGroup provided under security context and the permission as 750. </br>Note: This requires oraclelinux:8-slim image </br></br>Setting overrideOwnership to false will disable the ownership change. |
| deployment.resource.request.cpuCore | string | `200m` | Minimum CPU cores(millicore) for each agent instance |
| deployment.resource.request.memory | string | `500Mi` | Minimum memory(mebibytes) for each agent instance |
| deployment.resource.request.storage | string | `2Gi` | Minimum storage(gibibyte) for StatefulSet's PVC |
| deployment.resource.limit.cpuCore | string | `500m` | Maximum CPU cores(millicore) for each agent instance |
| deployment.resource.limit.memory | string | `1Gi` | Maximum memory(gibibyte) for each agent instance |
| deployment.storageClass | string | `nil` | The storage class for StatefulSet's PVC. If not provided then the Cluster's default storage class will be used |
| tolerations | list | `[]` | Custom tolerations to apply to all pods in the chart. Default: [] (no additional tolerations) Example: tolerations: - key: "example-taint" operator: "Exists" effect: "NoSchedule" |

----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.11.0](https://github.com/norwoodj/helm-docs/releases/v1.11.0)
Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2)
4 changes: 4 additions & 0 deletions charts/mgmt-agent/templates/mgmt-agent-daemonset.yaml
@@ -27,6 +27,10 @@ spec:
runAsGroup: {{ default 0 .Values.deployment.security.runAsGroup }}
fsGroup: {{ default 0 .Values.deployment.security.fsGroup }}
serviceAccountName: {{ include "mgmt-agent.serviceAccount" . }}
{{- if .Values.tolerations }}
tolerations:
{{- toYaml .Values.tolerations | nindent 8 }}
{{- end }}
{{- if .Values.mgmtagent.image.secret }}
imagePullSecrets:
- name: {{ include "mgmt-agent.resourceNamePrefix" . }}-mgmt-agent-container-registry-key
31 changes: 30 additions & 1 deletion charts/mgmt-agent/values.schema.json
@@ -263,6 +263,35 @@
[
"namespace"
],
"properties": {
"tolerations": {
"type": "array",
"items": {
"type": "object",
"properties": {
"key": {
"type": "string"
},
"operator": {
"type": "string",
"enum": ["Equal", "Exists"]
},
"value": {
"type": "string"
},
"effect": {
"type": "string",
"enum": ["NoSchedule", "PreferNoSchedule", "NoExecute"]
},
"tolerationSeconds": {
"type": "integer"
}
},
"additionalProperties": false
},
"default": []
}
},
"title": "Values",
"type": "object"
}
}
9 changes: 9 additions & 0 deletions charts/mgmt-agent/values.yaml
@@ -115,3 +115,12 @@ deployment:

# Provide the storage class for StatefulSet's PVC. If not provided then the Cluster's default storage class will be used.
storageClass:

# -- Custom tolerations to apply to all pods in the chart.
# Default: [] (no additional tolerations)
# Example:
# tolerations:
# - key: "example-taint"
# operator: "Exists"
# effect: "NoSchedule"
tolerations: []
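
For illustration, an override file that should pass the `tolerations` schema added in this change could look like the sketch below. The file name and taint keys are made up. In Kubernetes, `tolerationSeconds` only has an effect together with the `NoExecute` effect.

```yaml
# custom-values.yaml (hypothetical file name and taint keys)
tolerations:
  # Tolerate the "dedicated" taint regardless of its value.
  - key: "dedicated"
    operator: "Exists"
    effect: "NoSchedule"
  # Tolerate a specific key/value pair; the pod may stay on the node
  # for 300 seconds after the taint is added, then it is evicted.
  - key: "maintenance"
    operator: "Equal"
    value: "true"          # quoted: the schema requires a string
    effect: "NoExecute"
    tolerationSeconds: 300
```

Such a file would be supplied with `helm install -f custom-values.yaml` or `helm upgrade -f custom-values.yaml`. Because the schema sets `additionalProperties: false`, any key other than the five listed there would be rejected during validation.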