CRI metadata is not removed in the log field #9088

@kinseii

Bug Report

Some records still have the CRI metadata (the timestamp, the stream name such as stdout, and the F tag) at the start of the log field. Note that the _p field is also missing:

{
  "_index": "fluent-bit-*****-*******-2024.07.14-000120",
  "_id": "Rn2rtZA*******************",
  "_score": 1,
  "_source": {
    "@timestamp": "2024-07-15T09:14:05.031Z",
    "log": "2024-07-15T09:14:05.031661946Z stdout F {\"@t\":\"2024-07-15T09:14:05.0307804Z\",\"@m\":\"An unhandled exception has occurred while executing the request.*********************\"}}\n",
    "kubernetes": {
      "namespace_name": "*****",
      "annotations": {
        "*****************************************": "*******"
      },
      "labels": {
        "*****************************************": "*******"
      },
      "container_name": "***********",
      "pod_id": "*****************************************",
      "container_hash": "*******************************************************",
      "pod_name": "************************************",
      "container_image": "********************************************************",
      "docker_id": "********************************************************",
      "host": "********************************************************"
    }
  },
  "fields": {
    "@timestamp": [
      "2024-07-15T09:14:05.031Z"
    ]
  }
}

Fluent-bit settings:

    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level info
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020
        Health_Check On
        storage.path /tmp/fluent-bit
        storage.sync full
        storage.checksum off
        storage.max_chunks_up 128
        storage.backlog.mem_limit 5M

    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        multiline.parser docker, cri, python, go, java
        Tag kube.*
        Skip_Long_Lines On
        Buffer_Max_Size 5MB
        Mem_Buf_Limit 500MB
        Storage.Type filesystem

    [INPUT]
        Name systemd
        Tag host.*
        Systemd_Filter _SYSTEMD_UNIT=kubelet.service
        Read_From_Tail On

    [FILTER]
        Name kubernetes
        Match kube.*
        Buffer_Size 2MB
        Merge_Log Off

    [FILTER]
        Name lua
        Match kube.*
        script /fluent-bit/scripts/common.lua
        call common

Most of the time this metadata is removed, but occasionally it is not. The surrounding records contain a similar log line, differing only by microseconds, that is processed correctly: the metadata is removed and the _p field is added. The Lua script then parses the log field further if it is a JSON string, but when the CRI metadata is left at the beginning, the field is not a JSON string and the script does not process it.
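To make the failure mode concrete, here is a minimal Python sketch of what the affected records look like to a downstream JSON parser. It is not Fluent Bit's implementation and not the contents of common.lua. The regular expression is an approximation of the CRI line layout (`<timestamp> <stream> <tag> <message>`), the function names are hypothetical, and mapping the tag to `_p` simply follows the field name mentioned in this report.

```python
import json
import re

# Approximate CRI line layout: "<timestamp> <stream> <tag> <message>".
# The tag is "F" for a full line and "P" for a partial one.
# This pattern is an illustration, not the regex Fluent Bit ships.
CRI_PREFIX = re.compile(
    r"^(?P<time>\S+) (?P<stream>stdout|stderr) (?P<tag>[FP]) (?P<log>.*)$",
    re.DOTALL,
)


def strip_cri_prefix(log):
    """Return (fields, message); fields is None when no CRI prefix is found."""
    match = CRI_PREFIX.match(log)
    if match is None:
        return None, log
    fields = {
        "time": match.group("time"),
        "stream": match.group("stream"),
        "_p": match.group("tag"),  # assumed mapping, named after the field in this report
    }
    return fields, match.group("log")


def parse_log(log):
    """Strip a leftover CRI prefix, then try to decode the rest as JSON."""
    fields, message = strip_cri_prefix(log)
    try:
        parsed = json.loads(message)
    except json.JSONDecodeError:
        parsed = None  # not JSON, leave the record alone
    return fields, parsed


if __name__ == "__main__":
    # Shortened version of the log value from the record above.
    sample = (
        '2024-07-15T09:14:05.031661946Z stdout F '
        '{"@t":"2024-07-15T09:14:05.0307804Z","@m":"An unhandled exception has occurred"}\n'
    )
    print(parse_log(sample))
```

A check like this could be used to count how many indexed records still carry the prefix, or ported to the Lua filter as a stopgap, but it only hides the symptom: it does not explain why the cri multiline parser skips some lines.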

AKS v1.28.10
OpenSearch v2.12.0
Fluent-bit v3.0.3

A similar problem is reported in #6515, but we do not want to lose multiline parsing for everything except cri.

So, can anyone tell me why this is happening and where to look?
