System freezing on CM5 eMMC (Ubuntu 24.x, kernel 6.11/6.14) #7167

@yeseuleee

Describe the bug

Hello,
I'm experiencing a severe stability issue on Raspberry Pi CM5 (eMMC) when running Ubuntu 24.x.
After approximately 3.5 days of continuous uptime, the system becomes completely unresponsive (freezes) while ping still works.

Environment

  • Board: Raspberry Pi Compute Module 5 (eMMC model)
  • OS: Ubuntu 24.x
  • Kernel versions tested:
    • 6.11.0-1009-raspi
    • 6.14.0-1012-raspi
  • (Issue occurs on both versions)
  • Storage: CM5 onboard eMMC
  • Uptime before issue: typically 3.5 days or more

Symptoms when the issue occurs

  • SSH is unreachable
  • Ping works normally
  • Application-level network traffic becomes extremely slow or stops entirely
  • File-based DB connections fail
  • Sometimes logs are written; other times nothing is logged at all

The system requires a hard reboot to recover.

Observed logs

1. Filesystem suddenly turns read-only

In many cases, just before the freeze, I see multiple logs like:

fallocate[323]: fallocate: cannot open /swapfile: Read-only file system

Once this message appears, the freeze almost always follows shortly after.
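Because the read-only remount tends to precede the freeze, it may be worth polling for it and raising an alert off-box. The following is a minimal sketch, assuming the standard `/proc/mounts` field layout; the function names are hypothetical and not part of any existing tool.

```python
def mount_options(mounts_text: str, mount_point: str = "/") -> list[str]:
    """Return the option list of the last mount entry for mount_point."""
    options: list[str] = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same mount point shadows an earlier one.
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text: str, mount_point: str = "/") -> bool:
    """True if mount_point is listed with the 'ro' option."""
    return "ro" in mount_options(mounts_text, mount_point)


def root_is_read_only() -> bool:
    """Check the live system (Linux only)."""
    with open("/proc/mounts") as f:
        return is_read_only(f.read(), "/")
```

Calling `root_is_read_only()` from a periodic job and reporting the result over the network (not to the eMMC) would record the time of the remount even when local logging has already stopped.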

2. USB errors sometimes appear before the freeze

These USB errors often appear before the read-only filesystem message:

usb 2-1.4.2: device descriptor read/64, error -71
usb 2-1.4.2: new high-speed USB device number 37 using xhci-hcd
usb 2-1.4-port2: attempt power cycle

I have also seen occasional disconnect or current-related USB errors.
However, the USB devices use a separate, stable power supply, so I do not believe this is a power issue.
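To see whether the USB errors cluster on one port or grow in frequency before a freeze, the kernel log can be tallied per device. This is a rough sketch that only matches lines of the shape quoted above; the regular expression and function name are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches lines such as "usb 2-1.4.2: device descriptor read/64, error -71".
# On Linux, errno 71 is EPROTO (protocol error).
USB_ERROR_RE = re.compile(
    r"usb (?P<dev>[\d.-]+)(?:-port\d+)?: .*error (?P<errno>-\d+)"
)


def count_usb_errors(log_text: str) -> Counter:
    """Count USB error lines, keyed by (device path, errno)."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = USB_ERROR_RE.search(line)
        if match:
            counts[(match.group("dev"), int(match.group("errno")))] += 1
    return counts
```

Feeding it saved `dmesg` output from several freeze events would show whether `2-1.4.2` is the only affected device.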

3. Sometimes there are no logs at all

In several cases:

  • no syslog entries
  • no dmesg updates
  • system is frozen but not remounted read-only
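When storage stalls, anything logged to the eMMC is lost, which may explain the cases with no logs. One option is to stream kernel messages to another machine while ping still works. The sketch below reads `/dev/kmsg` and sends each record over UDP; it assumes the documented record format (`prefix,seq,timestamp_us,flags;message`), must run as root on systems that restrict kernel log access, and uses hypothetical function names.

```python
import os
import socket


def parse_kmsg_record(record: str) -> dict:
    """Split one /dev/kmsg record into its header fields and message text."""
    # Continuation lines (e.g. " SUBSYSTEM=usb") follow the first line; skip them.
    first_line = record.split("\n", 1)[0]
    header, _, message = first_line.partition(";")
    fields = header.split(",")
    prefix = int(fields[0])
    return {
        "level": prefix & 7,       # syslog severity, 0 (emerg) to 7 (debug)
        "facility": prefix >> 3,
        "seq": int(fields[1]),
        "timestamp_us": int(fields[2]),
        "message": message,
    }


def forward_kmsg(host: str, port: int) -> None:
    """Send every kernel log record to a UDP listener at host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    fd = os.open("/dev/kmsg", os.O_RDONLY)
    try:
        while True:
            try:
                raw = os.read(fd, 8192)  # each read returns one record
            except BrokenPipeError:
                continue  # records were overwritten before being read
            rec = parse_kmsg_record(raw.decode("utf-8", "replace"))
            line = f"{rec['timestamp_us'] / 1e6:.6f} <{rec['level']}> {rec['message']}"
            sock.sendto(line.encode(), (host, port))
    finally:
        os.close(fd)
        sock.close()
```

For example, `forward_kmsg("192.0.2.10", 5514)` would send to a collector at a placeholder address and port; both values are illustrative only.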

What I have verified

  • No CPU spikes
  • Memory usage is stable
  • Disk usage is fine
  • No network congestion
  • System temperature is within normal range
  • eMMC I/O load is not heavy

Nothing obvious seems to lead to the freeze.

Possible root cause: eMMC CQE deadlock?

While researching this, I found discussions mentioning CQE (Command Queue Engine) deadlock issues on certain Raspberry Pi eMMC configurations.

My questions:

  1. Is there a known CQE-related freeze/deadlock issue for CM5 eMMC in these kernel versions?
  2. If so, has this been addressed in a newer kernel or firmware update?
  3. Some users suggest disabling CQE. Is there an official or recommended workaround other than disabling CQE entirely?
  4. Is long-uptime instability with eMMC + CQE a known issue on CM5?
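Before comparing runs with and without CQE, it helps to confirm whether command queueing is actually active on the eMMC host. The sketch below scans saved kernel log text for the message the MMC core is believed to print when it enables the engine. The exact wording ("Command Queue Engine enabled") is an assumption and should be checked against the real boot log of the kernel in use.

```python
import re

# Assumed wording of the MMC core message; verify against an actual boot log.
CQE_ENABLED_RE = re.compile(r"(mmc\d+): Command Queue Engine enabled")


def hosts_with_cqe(dmesg_text: str) -> list[str]:
    """Return the MMC hosts (e.g. 'mmc0') that reported CQE as enabled."""
    return sorted(set(CQE_ENABLED_RE.findall(dmesg_text)))
```

An empty result does not prove CQE is off, since early boot messages may have rotated out of the ring buffer after several days of uptime.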

This system must operate 24/7, so long-term stability is critical.

If additional logs or traces are needed, I can provide them.
Thank you very much for your help. Let me know what further information I can collect to help diagnose this issue.

Steps to reproduce the behaviour

  1. Install Ubuntu 24.04 (or later) on Raspberry Pi CM5 eMMC.
  2. Use kernel versions such as 6.11.0-1009-raspi or 6.14.0-1012-raspi
    (the issue occurs on both).
  3. Run normal workloads (logging, DB file access, USB devices attached,
    light-to-moderate I/O). No heavy stress is required.
  4. Let the system run continuously for 3.5 days or longer.
  5. After ~3.5 days of uptime, the system gradually becomes unstable:
    • Network slows down severely or stalls.
    • SSH stops responding.
    • File operations start failing.
  6. Eventually, the system freezes completely while ping still replies.
  7. In some cases, "Read-only file system" messages or USB errors appear
    shortly before the freeze; in other cases, no logs are produced.

Device(s)

Raspberry Pi CM5

System

  • Raspberry Pi Compute Module 5 (eMMC 4G/8G)
  • OS: Ubuntu 24.04 LTS / 24.10 (not Raspberry Pi OS)
  • Kernel: 6.11.0-1009-raspi or 6.14.0-1012-raspi (issue present in both)
  • Firmware: N/A on Ubuntu images
  • Uptime before failure: typically ~3.5 days

This system must operate continuously (24/7), so resolving long-term stability issues is essential.

Logs

Common log line before the freeze:
fallocate: cannot open /swapfile: Read-only file system

In some cases, the kernel log shown below was also captured shortly before the system froze.

These logs show multiple kernel "hung task" events in which essential
filesystem-related tasks (jbd2, systemd-journal, application modules, sync)
remain blocked for more than 122 seconds. This suggests that the eMMC or
ext4 journaling layer has stopped responding, which is consistent with the
"Read-only file system" message observed in other freeze events.

Such behaviour suggests a possible deadlock in the block layer, EXT4 journal,
or eMMC/CQE command queue path. Once this occurs, all write operations stall
and the entire system becomes unresponsive while still answering ping.

Full logs:

----------------------------------------------------------------------
[Sat Nov 29 06:54:29 2025] INFO: task jbd2/mmcblk0p2-:258 blocked for more than 122 seconds.
[Sat Nov 29 06:54:29 2025]         Tainted: G         C E       6.14.0-1012-raspi #12-Ubuntu
[Sat Nov 29 06:54:29 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Nov 29 06:54:29 2025] task:jbd2/mmcblk0p2- state:D stack:0      pid:258      tgid:258      ppid:2         task_flags:0x240040 flags:0x00000008
[Sat Nov 29 06:54:29 2025] Call trace:
[Sat Nov 29 06:54:29 2025]  __switch_to+0xe8/0x148 (T)
[Sat Nov 29 06:54:29 2025]  __schedule+0x32c/0x990
[Sat Nov 29 06:54:29 2025]  schedule+0x3c/0x118
[Sat Nov 29 06:54:29 2025]  jbd2_journal_wait_updates+0x70/0xf0
[Sat Nov 29 06:54:29 2025]  jbd2_journal_commit_transaction+0x19c/0x16b0
[Sat Nov 29 06:54:29 2025]  kjournald2+0xc4/0x248
[Sat Nov 29 06:54:29 2025]  kthread+0x110/0x1e0
[Sat Nov 29 06:54:29 2025]  ret_from_fork+0x10/0x20
...
[Sat Nov 29 06:54:29 2025] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
----------------------------------------------------------------------
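To compare hung-task reports across freeze events, the log above can be reduced to the blocked task, its PID, the blocked duration, and the call-trace symbols. This is a minimal sketch that assumes the line format shown in the excerpt; the function and key names are hypothetical.

```python
import re

HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<task>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Call-trace frames look like "jbd2_journal_wait_updates+0x70/0xf0".
FRAME_RE = re.compile(r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_hung_tasks(log_text: str) -> list[dict]:
    """Return one dict per hung-task report, with its call-trace symbols."""
    reports: list[dict] = []
    for line in log_text.splitlines():
        hung = HUNG_TASK_RE.search(line)
        if hung:
            reports.append({
                "task": hung.group("task"),
                "pid": int(hung.group("pid")),
                "blocked_secs": int(hung.group("secs")),
                "frames": [],
            })
            continue
        frame = FRAME_RE.search(line)
        if frame and reports:
            # Attach the frame to the most recent hung-task report.
            reports[-1]["frames"].append(frame.group("sym"))
    return reports
```

Grouping the results by the first non-scheduler frame (here `jbd2_journal_wait_updates`) would show whether every freeze blocks at the same point.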

USB-related errors seen before some freezes:

  usb 2-1.4.2: device descriptor read/64, error -71
  usb 2-1.4-port2: attempt power cycle

Other occurrences:

  • Network becomes extremely slow or stops.
  • SSH becomes unavailable while ping still responds.
  • Sometimes no logs appear at all.

Additional context

No response
