Log container becomes idle after several days; log forwarding stops until `strace -f` is run

## Description
We are using **AWS ECS (Fargate, platform version 1.4.0)** to transfer application logs. Logs produced by application containers are forwarded via a log container based on **Fluent Bit (aws-for-fluent-bit:2.25.1)** to multiple destinations such as **Datadog** and **AWS S3 buckets**.  

Under normal operation, log forwarding works correctly. However, after running for several days, we observed that the log container’s **CPU usage dropped drastically (appearing idle)**, and all log forwarding stopped. 

<img width="1661" height="792" alt="Image" src="https://github.com/user-attachments/assets/1200e507-5383-4662-ba36-cd21806965ce" />

For log input, we use the **`awsfirelens` log driver**, and logs are transferred via a **Unix Domain Socket** (we confirmed that the socket is open in the running container).  

To investigate, we attached `strace` to PID 1 in the log container. Immediately after we ran the following command inside the log container, its **CPU usage spiked to nearly 100%**, and over the next several minutes a large backlog of logs was flushed and delivered to destinations such as the S3 bucket. It appeared that the log container was processing data that had accumulated in the socket buffer.

`strace -p 1 -f -o strace_0902_log.txt -s 2048`

<img width="1672" height="874" alt="Image" src="https://github.com/user-attachments/assets/333c2d54-fd76-4431-a032-11c7ef2ff741" />
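
To illustrate the hypothesis above (data piling up in the socket buffer while the reader is stalled, then being drained in one burst), here is a minimal Python sketch. It is not Fluent Bit code and does not reproduce the bug. It assumes a stream-type Unix Domain Socket, and the function names `fill_until_blocked`, `drain`, and `demo` are hypothetical helpers made up for this illustration.

```python
import socket


def fill_until_blocked(writer: socket.socket, chunk: bytes) -> int:
    """Send chunks until the kernel buffer is full; return the bytes queued."""
    writer.setblocking(False)
    queued = 0
    while True:
        try:
            # send() may accept only part of the chunk; count what was taken.
            queued += writer.send(chunk)
        except BlockingIOError:
            # Buffer is full because nobody is reading the other end.
            return queued


def drain(reader: socket.socket) -> int:
    """Read everything currently queued on the socket; return the bytes read."""
    reader.setblocking(False)
    drained = 0
    while True:
        try:
            data = reader.recv(65536)
        except BlockingIOError:
            # Nothing left in the buffer.
            return drained
        if not data:
            return drained
        drained += len(data)


def demo() -> tuple[int, int]:
    # A connected pair of Unix Domain Sockets stands in for the
    # application side (writer) and the log container side (reader).
    reader, writer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        queued = fill_until_blocked(writer, b"x" * 4096)  # reader is "idle"
        drained = drain(reader)                           # reader "wakes up"
        return queued, drained
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    queued, drained = demo()
    print(f"queued while reader idle: {queued} bytes, drained afterwards: {drained} bytes")
```

In this sketch the writer stalls once the buffer fills, and nothing is lost: the reader later receives exactly what was queued. That matches the burst of delivered logs we saw after attaching `strace`, though it does not explain why the reader stopped in the first place.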

Note: Because this issue occurred in a **production environment**, Fluent Bit was not running with the `debug` or `info` log level enabled. We apologize for the lack of detailed diagnostic information.

## Expected Behavior
- The log container should **continue forwarding logs reliably**, without becoming idle after running for several days. 

---

## Actual Behavior
- After several days of uptime, the log container becomes idle (CPU near 0%), and log forwarding stops completely. 

---

## Environment
- **Service**: AWS ECS Fargate  
- **Platform version**: 1.4.0  
- **Fluent Bit image**: aws-for-fluent-bit:2.25.1  
- **Log driver**: awsfirelens (Unix Domain Socket input)  
- **Outputs**: Datadog, AWS S3  

Thank you for your support.
