You are running Splunk Connect for Kubernetes (SCK) 1.4.9 and want to migrate to Splunk OpenTelemetry Collector for Kubernetes. Earlier versions of SCK are out of scope, but this guide still applies to them. You might encounter some version-specific issues (such as missing or changed configuration options), but these shouldn't prevent you from proceeding with the migration.
SCK has three components/applications:
- Application to fetch logs, metrics, and traces from a Kubernetes cluster (deployed as a DaemonSet)
- Application to fetch cluster metrics from a Kubernetes cluster (deployed as a Deployment)
- Application to fetch Kubernetes objects metadata from a Kubernetes cluster (deployed as a Deployment)
All SCK applications use Fluentd to work with logs, metrics, and objects. Fluentd has significant performance issues when used to fetch logs in a Kubernetes cluster with pods that have very high throughput.
Splunk OpenTelemetry Collector for Kubernetes provides significant performance improvements over the SCK through use of the OpenTelemetry Collector agent and native OpenTelemetry functionality for logs collection rather than Fluentd. See Performance of native OpenTelemetry logs collection to learn more about the performance characteristics of this new application.
Splunk OpenTelemetry Collector for Kubernetes has the following components and applications:
- Splunk OpenTelemetry Collector Agent (agent) to fetch logs, metrics, and traces from a Kubernetes cluster (deployed as a Kubernetes DaemonSet)
- Splunk OpenTelemetry Collector Cluster Receiver (clusterReceiver) to fetch metrics from the Kubernetes API (deployed as a 1-replica Kubernetes Deployment)
- Optional Splunk OpenTelemetry Collector Gateway (gateway) to forward data through it to reduce load on the Kubernetes API and apply additional processing (deployed as a Kubernetes Deployment)
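A hedged sketch of how these components map to chart settings in a Splunk OpenTelemetry Collector for Kubernetes values.yaml, assuming the chart's agent.enabled, clusterReceiver.enabled, and gateway.enabled toggles; defaults may vary by chart version:

```yaml
# Enable or disable each component of the chart (illustrative sketch).
agent:
  enabled: true            # DaemonSet collecting logs, metrics, and traces from each node
clusterReceiver:
  enabled: true            # 1-replica Deployment collecting metrics from the Kubernetes API
gateway:
  enabled: false           # optional Deployment that forwards and further processes data
```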
- Red Hat Universal Base Image (UBI) Docker images for our applications are no longer available, as we now use scratch images.
- AWS FireLens is not supported.
- The naming convention of the metrics used in Splunk OpenTelemetry Collector for Kubernetes follows the OpenTelemetry specification and differs from SCK. You will observe minor differences in metric names.
- Previously in SCK, you could get a large number of metrics from various Kubernetes APIs. However, in Kubernetes 1.18 and later, these API sources are disabled by default and fewer metrics are available. See Metrics Information for the previous list of metrics.
Some additional metrics that are available in Splunk OpenTelemetry Collector for Kubernetes are metrics about the OpenTelemetry Collector itself.
These are the metrics available in Splunk OpenTelemetry Collector for Kubernetes:
- container.cpu.time
- container.cpu.utilization
- container.filesystem.available
- container.filesystem.capacity
- container.filesystem.usage
- container.memory.available
- container.memory.major_page_faults
- container.memory.page_faults
- container.memory.rss
- container.memory.usage
- container.memory.working_set
- k8s.container.cpu_limit
- k8s.container.cpu_request
- k8s.container.memory_limit
- k8s.container.memory_request
- k8s.container.ready
- k8s.container.restarts
- k8s.daemonset.current_scheduled_nodes
- k8s.daemonset.desired_scheduled_nodes
- k8s.daemonset.misscheduled_nodes
- k8s.daemonset.ready_nodes
- k8s.deployment.available
- k8s.deployment.desired
- k8s.namespace.phase
- k8s.node.condition_ready
- k8s.node.cpu.time
- k8s.node.cpu.utilization
- k8s.node.filesystem.available
- k8s.node.filesystem.capacity
- k8s.node.filesystem.usage
- k8s.node.memory.available
- k8s.node.memory.major_page_faults
- k8s.node.memory.page_faults
- k8s.node.memory.rss
- k8s.node.memory.usage
- k8s.node.memory.working_set
- k8s.node.network.errors
- k8s.node.network.io
- k8s.pod.cpu.time
- k8s.pod.cpu.utilization
- k8s.pod.filesystem.available
- k8s.pod.filesystem.capacity
- k8s.pod.filesystem.usage
- k8s.pod.memory.available
- k8s.pod.memory.major_page_faults
- k8s.pod.memory.page_faults
- k8s.pod.memory.rss
- k8s.pod.memory.usage
- k8s.pod.memory.working_set
- k8s.pod.network.errors
- k8s.pod.network.io
- k8s.pod.phase
- k8s.replicaset.available
- k8s.replicaset.desired
- otelcol_exporter_queue_size
- otelcol_exporter_send_failed_log_records
- otelcol_exporter_send_failed_metric_points
- otelcol_exporter_sent_log_records
- otelcol_exporter_sent_metric_points
- otelcol_otelsvc_k8s_ip_lookup_miss
- otelcol_otelsvc_k8s_namespace_added
- otelcol_otelsvc_k8s_namespace_updated
- otelcol_otelsvc_k8s_pod_added
- otelcol_otelsvc_k8s_pod_table_size
- otelcol_otelsvc_k8s_pod_updated
- otelcol_process_cpu_seconds
- otelcol_process_memory_rss
- otelcol_process_runtime_heap_alloc_bytes
- otelcol_process_runtime_total_alloc_bytes
- otelcol_process_runtime_total_sys_memory_bytes
- otelcol_process_uptime
- otelcol_processor_accepted_log_records
- otelcol_processor_accepted_metric_points
- otelcol_processor_dropped_log_records
- otelcol_processor_dropped_metric_points
- otelcol_processor_refused_log_records
- otelcol_processor_refused_metric_points
- otelcol_receiver_accepted_metric_points
- otelcol_receiver_refused_metric_points
- otelcol_scraper_errored_metric_points
- otelcol_scraper_scraped_metric_points
- scrape_duration_seconds
- scrape_samples_post_metric_relabeling
- scrape_samples_scraped
- scrape_series_added
- system.cpu.load_average.15m
- system.cpu.load_average.1m
- system.cpu.load_average.5m
- system.cpu.time
- system.disk.io
- system.disk.io_time
- system.disk.merged
- system.disk.operation_time
- system.disk.operations
- system.disk.pending_operations
- system.disk.weighted_io_time
- system.filesystem.inodes.usage
- system.filesystem.usage
- system.memory.usage
- system.network.connections
- system.network.dropped
- system.network.errors
- system.network.io
- system.network.packets
- system.paging.faults
- system.paging.operations
- system.paging.usage
- system.processes.count
- system.processes.created
- up
The following table shows the options for migrating from SCK to Splunk OpenTelemetry Collector for Kubernetes:
| Method | Logs | Metrics | Objects |
|---|---|---|---|
| SCK | Yes | Yes | Yes |
| Splunk OpenTelemetry Collector for Kubernetes | Yes | Yes | Yes |
As shown in the table, you can acquire logs, metrics and objects using Splunk OpenTelemetry Collector for Kubernetes.
Translate the values.yaml file from SCK to an appropriate format for Splunk OpenTelemetry Collector for Kubernetes. The following are the configurations for SCK and Splunk OpenTelemetry Collector for Kubernetes:
Translating global/generic/splunk configurations from SCK to Splunk OpenTelemetry Collector for Kubernetes
You can combine the host, port, and protocol options from SCK into the splunkPlatform.endpoint option in Splunk OpenTelemetry Collector for Kubernetes.
This option uses the format "http://X.X.X.X:8088/services/collector", which is interpreted as "protocol://host:port/services/collector".
If you are using the clientCert, clientKey, and caFile options from SCK, use the corresponding clientCert, clientKey, and caFile options under splunkPlatform in Splunk OpenTelemetry Collector for Kubernetes to specify your HEC certificate chain.
If you are using the insecureSSL option from SCK, use the splunkPlatform.insecureSkipVerify option in Splunk OpenTelemetry Collector for Kubernetes to specify whether to verify the certificates on HEC.
If you are using the indexName option from SCK, use the splunkPlatform.index option in Splunk OpenTelemetry Collector for Kubernetes to specify which index you want to index data into.
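Put together, these splunkPlatform translations might look as follows in values.yaml. This is a minimal sketch: the endpoint, token, and index values are placeholders, and splunkPlatform.token (the HEC token) is an assumed chart option not covered above; see the chart documentation for the expected format of the certificate options.

```yaml
splunkPlatform:
  # protocol://host:port/services/collector, combining SCK's protocol, host, and port
  endpoint: "https://splunk.example.com:8088/services/collector"
  token: "00000000-0000-0000-0000-000000000000"   # HEC token (assumed chart option)
  index: "k8s_logs"              # was indexName in SCK
  insecureSkipVerify: false      # was insecureSSL in SCK
  clientCert: ""                 # same option names as SCK
  clientKey: ""
  caFile: ""
```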
Translating custom configurations from SCK to Splunk OpenTelemetry Collector for Kubernetes for logs
You can find all the configuration options used to upgrade in the values.yaml files linked above for both SCK and Splunk OpenTelemetry Collector for Kubernetes.
If you configured SCK to explicitly set any of the following, you can do the same with Splunk OpenTelemetry Collector for Kubernetes (see the sketch after this list):
- Set the logsCollection.containers.containerRuntime option to a value from CRI-O, containerd, or Docker, depending on your runtime.
- Set the path option to the location of your logs on your nodes.
- If you are using the exclude_path option from SCK, you can use the logsCollection.containers.excludePaths option in Splunk OpenTelemetry Collector for Kubernetes to exclude any logs you want.
- If you are using the logs option to define multiline configs in SCK, you can use the logsCollection.containers.multilineConfigs option in Splunk OpenTelemetry Collector for Kubernetes to concatenate multiline logs.
- If you are using the checkpointFile option in SCK to define a custom location for checkpointing your logs, you can do so with the logsCollection.checkpointPath option in Splunk OpenTelemetry Collector for Kubernetes.
- If you are using the log option to ingest any other logs from your nodes, you can do so with the logsCollection.extraFileLogs option in Splunk OpenTelemetry Collector for Kubernetes.
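A minimal sketch of how several of these logsCollection options might fit together in values.yaml. The checkpoint path, runtime, exclude path, pod name, and regex are illustrative placeholders, and the exact sub-fields may vary slightly by chart version:

```yaml
logsCollection:
  checkpointPath: /var/addon/splunk/otel_pos   # was checkpointFile in SCK
  containers:
    containerRuntime: containerd               # cri-o, containerd, or docker
    excludePaths:                              # was exclude_path in SCK
      - /var/log/pods/kube-system_*/*/*.log
    multilineConfigs:                          # was the multiline settings under logs in SCK
      - namespaceName:
          value: default
        podName:
          value: my-java-app                   # hypothetical pod name
        firstEntryRegex: ^[^\s].*
```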
If you are using the clusterName option to set a cluster name metadata field in your logs in SCK, you can also use the clusterName option in Splunk OpenTelemetry Collector for Kubernetes.
If you are using the podSecurityContext option to set a pod security policy, you can use the agent.securityContext option in Splunk OpenTelemetry Collector for Kubernetes.
If you are using the customMetadata option to set a custom metadata field in your logs in SCK, you can use the extraAttributes.custom option in Splunk OpenTelemetry Collector for Kubernetes.
If you are using the customMetadataAnnotations option to set custom annotation fields in your logs in SCK, you can use the extraAttributes.fromAnnotations option in Splunk OpenTelemetry Collector for Kubernetes.
If you are using any pod scheduling operations such as nodeSelector, affinity, and tolerations in SCK, you can use the nodeSelector, affinity and tolerations options in Splunk OpenTelemetry Collector for Kubernetes.
If you are using the serviceAccount option to use your own service accounts in SCK, you can use the serviceAccount option in Splunk OpenTelemetry Collector for Kubernetes.
If you are using the secret option to use your own secrets in SCK, you can use the secret option in Splunk OpenTelemetry Collector for Kubernetes.
If you are using custom Docker images, tags, pull secrets, or pull policy in SCK, you can achieve the same using the image option in Splunk OpenTelemetry Collector for Kubernetes.
If you are using the resources option in SCK to limit or increase CPU and memory usage, you can do the same using the agent.resources option in Splunk OpenTelemetry Collector for Kubernetes.
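A hedged sketch combining several of the agent-level translations above in values.yaml. The cluster name, custom field, annotation key, selector, image, and resource values are illustrative, and the exact sub-fields of extraAttributes may vary by chart version:

```yaml
clusterName: my-cluster              # same option name as SCK
extraAttributes:
  custom:                            # was customMetadata in SCK
    - name: environment
      value: production
  fromAnnotations:                   # was customMetadataAnnotations in SCK
    - key: splunk.com/customField
      tag_name: customField
      from: pod
nodeSelector:                        # pod scheduling options keep their names
  kubernetes.io/os: linux
tolerations: []
serviceAccount:
  create: true                       # was serviceAccount in SCK
secret:
  create: true                       # was secret in SCK
image:
  otelcol:                           # custom image, tag, and pull settings live under image
    repository: quay.io/signalfx/splunk-otel-collector
    pullPolicy: IfNotPresent
agent:
  securityContext: {}                # was podSecurityContext in SCK
  resources:                         # was resources in SCK
    limits:
      cpu: 200m
      memory: 500Mi
```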
For tailing files other than container or journald logs (for example, kube audit logs), configure logsCollection.extraFileLogs using this filelog receiver configuration.
[SCK values.yaml snippet]
logs:
  kube-audit:
    from:
      file:
        path: /var/log/kube-apiserver-audit.log
    timestampExtraction:
      format: "%Y-%m-%dT%H:%M:%SZ"
    sourcetype: kube:apiserver-audit

[Splunk OpenTelemetry Collector for Kubernetes values.yaml snippet]
logsCollection:
  extraFileLogs:
    filelog/kube-audit:
      include: [/var/log/kube-apiserver-audit.log]
      start_at: beginning
      include_file_path: true
      include_file_name: false
      storage: file_storage
      resource:
        host.name: 'EXPR(env("K8S_NODE_NAME"))'
        com.splunk.sourcetype: kube:apiserver-audit
        com.splunk.source: /var/log/kube-apiserver-audit.log

Use the kube-audit keyword to continue reading from the translated checkpoint data.
Translating custom configurations from SCK to Splunk OpenTelemetry Collector for Kubernetes for objects
For collecting Kubernetes objects, configure clusterReceiver.k8sObjects using the k8sobjects receiver configurations.
If you are using the following values for objects in splunk-kubernetes-objects:
[SCK values.yaml snippet]
splunk-kubernetes-objects:
  objects:
    core:
      v1:
        - name: pods
          namespace: default
          mode: pull
          interval: 60m
        - name: events
          mode: watch
    apps:
      v1:
        - name: daemon_sets
          labelSelector: environment=production

Equivalent configuration for Splunk OpenTelemetry Collector for Kubernetes:
clusterReceiver:
  k8sObjects:
    - name: pods
      namespaces: [default]
      mode: pull
      interval: 60m
    - name: events
      mode: watch
    - name: daemonsets
      label_selector: environment=production

The k8sobjects receiver pulls objects every 60m by default, while SCK pulls every 15m. If you want to keep the same interval, define the interval option accordingly. Another important change in k8sObjects is that you no longer need to specify the resource group and version; they are detected automatically.
Translating custom configurations from SCK to Splunk OpenTelemetry Collector for Kubernetes for metrics
If you are using the resources option in SCK to limit or increase CPU and memory usage, you can do the same using the resources option in the clusterReceiver section in Splunk OpenTelemetry Collector for Kubernetes.
If you are using any pod scheduling operations such as nodeSelector, affinity, and tolerations in SCK, you can use the clusterReceiver.nodeSelector, clusterReceiver.affinity, and clusterReceiver.tolerations options in Splunk OpenTelemetry Collector for Kubernetes.
If you are using the podSecurityContext option to set a pod security policy, you can use the clusterReceiver.securityContext option in Splunk OpenTelemetry Collector for Kubernetes.
If you are using the customMetadataAnnotations option to set custom annotation fields in your logs in SCK, you can use the extraAttributes.fromAnnotations option in Splunk OpenTelemetry Collector for Kubernetes.
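A minimal sketch of the clusterReceiver translations above in values.yaml; the resource numbers and selector values are illustrative placeholders:

```yaml
clusterReceiver:
  resources:               # was resources in SCK
    limits:
      cpu: 200m
      memory: 500Mi
  nodeSelector:
    kubernetes.io/os: linux
  affinity: {}
  tolerations: []
  securityContext: {}      # was podSecurityContext in SCK
```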
To delete the SCK deployment, find the name of the deployment using the helm ls command.
- If you want to delete only logs and metrics, update the values.yaml file used to deploy SCK to disable logs and metrics, then run the following command:
  helm upgrade local-k8s -f your-values-file.yaml splunk/splunk-connect-for-kubernetes
- If you want to delete only logs, update the values.yaml file used to deploy SCK to disable logs, then run the following command:
  helm upgrade local-k8s -f your-values-file.yaml splunk/splunk-connect-for-kubernetes
- If you want to delete your entire SCK deployment (logs, objects, and metrics), run the following command:
  helm delete local-k8s
- See How to install for the instructions.
- Check the logs index to see if you are receiving logs from your Kubernetes cluster:
  index="Your logs index"
- Check the metrics index to see if you are receiving metrics from your Kubernetes cluster:
  | mcatalog values(metric_name) WHERE index="Your metrics index"
By default, Splunk Connect for Kubernetes reads container logs from /var/log/containers/*, while Splunk OpenTelemetry Collector for Kubernetes reads them from /var/log/pods/*.
This change is reflected in the source field of the extracted logs.
Both Splunk Connect for Kubernetes and Splunk OpenTelemetry Collector for Kubernetes define the sourcetype for container logs as kube:container:<container_name> by default. However, Splunk Connect for Kubernetes explicitly defines the sourcetype of Kubernetes core components as kube:<container_name>. These sourcetypes are defined here.
The sourcetype configuration can be changed by adding a logsCollection.containers.extraOperators configuration, as in the sketch below.
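For instance, here is a hedged sketch of an extraOperators entry that overrides the sourcetype for container logs. The add operator and the fixed value shown are an assumed illustration of one way to do this, not the only supported approach:

```yaml
logsCollection:
  containers:
    extraOperators:
      # Hypothetical example: set a fixed sourcetype on every container log record.
      # For dynamic values, an EXPR() expression could be used instead, similar to
      # the 'EXPR(env("K8S_NODE_NAME"))' pattern shown earlier in this guide.
      - type: add
        field: resource["com.splunk.sourcetype"]
        value: kube:my-custom-sourcetype
```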
Splunk OpenTelemetry Collector for Kubernetes follows the OpenTelemetry naming convention for extracted fields. The table below presents the differences between field names extracted by Splunk OpenTelemetry Collector for Kubernetes and Splunk Connect for Kubernetes.
| Splunk Connect for Kubernetes | Splunk OpenTelemetry Collector for Kubernetes |
|---|---|
| container_id | container.id |
| container_image | container.image.name and container.image.tag |
| container_name | k8s.container.name |
| cluster_name | k8s.cluster.name |
| namespace | k8s.namespace.name |
| pod | k8s.pod.name |
| pod_uid | k8s.pod.uid |
| label_app | k8s.pod.labels.app |
If you wish to continue using Splunk Connect for Kubernetes's naming convention, you can use the following configuration:
splunkPlatform:
  fieldNameConvention:
    # Boolean for renaming pod logs metadata fields to match the Splunk Connect for Kubernetes helm chart.
    renameFieldsSck: true
    # Boolean for keeping OTel convention fields after renaming them
    keepOtelConvention: false

Note that this configuration snippet only applies to log events; for metrics, users must provide their own processing.