Skip to content

fix: enable THRE interrupt in emulate_serial_init for snapshot restore#5730

Open
ejc3 wants to merge 3 commits intofirecracker-microvm:mainfrom
ejc3:fix-thre-after-restore
Open

fix: enable THRE interrupt in emulate_serial_init for snapshot restore#5730
ejc3 wants to merge 3 commits intofirecracker-microvm:mainfrom
ejc3:fix-thre-after-restore

Conversation

@ejc3
Copy link

@ejc3 ejc3 commented Mar 2, 2026

Problem

After snapshot restore, the serial console (ttyS0) stalls on x86_64. The guest 8250 driver stops transmitting — no output reaches Firecracker's stdout.

Root cause

emulate_serial_init() sets IER_RDA_BIT to re-enable the Received Data Available interrupt, but does not set IER_THR_EMPTY_BIT (Transmitter Holding Register Empty). vm-superio's Serial::write for IER_OFFSET only stores the value — it does not generate any interrupt. The guest 8250 driver uses interrupt-driven TX and waits for a THRE interrupt before sending each byte. After restore, that interrupt never fires, so TX stalls.

When the bug manifests

The stall only occurs when the snapshot is taken during active serial transmission — specifically when the guest's TX circular buffer has more than 16 bytes (port->fifosize):

  1. serial8250_tx_chars() sends at most 16 bytes per THRE interrupt
  2. With >16 bytes pending, UART_IER_THRI stays set in the guest driver's cached up->ier
  3. After restore, serial8250_start_tx() sees THRI already set and skips the UART_BUG_TXEN fallback
  4. vm-superio's IER only has RDA (no THRI), so no THRE interrupt fires
  5. TX stalls permanently

When the snapshot is taken at idle (empty TX buffer), serial8250_stop_tx() has already cleared THRI from the cached IER. After restore, serial8250_start_tx() re-enables THRI via the UART_BUG_TXEN fallback which polls LSR directly — so the idle case works even without this fix. Real workloads (systemd, journald, application logging) frequently have >16 bytes buffered during active output.

Fix

  1. Set both IER_RDA_BIT | IER_THR_EMPTY_BIT in emulate_serial_init() (both architectures).
  2. On x86_64, write a byte to DATA_OFFSET to trigger thr_empty_interrupt(), which signals the eventfd → KVM irqfd → IRQ 4 → guest 8250 driver resumes TX.

The DATA write is only needed on x86_64 because ARM64's MMIO-based interrupt delivery picks up the THRE-enabled IER on the first guest TX attempt.

Tests

Unit test (test_emulate_serial_init_enables_thre_interrupt): Calls emulate_serial_init() directly and verifies IER has both bits set and the interrupt eventfd was signaled.

Integration test (test_serial_output_after_snapshot_during_active_tx): Reproduces the actual stall by snapshotting during active TX with dd bs=4096 to keep >16 bytes in the circular buffer, then checking for sustained output without sending any host input (which would mask the bug via RDA piggybacking).

Without fix With fix
Initial burst 9 bytes 148,812 bytes
Sustained output 0 bytes (10s timeout) 51 bytes (0.0s)
Result FAIL PASS

@Manciukic
Copy link
Contributor

Hey @ejc3 , thanks again for reporting the issue! We'll take a deeper look in the next days. In the meanwhile, do you know what are the conditions to trigger this or how often this is happening for you? I don't think we've ever seen this issue, despite our testing.
We should also double check if restoring the entire serial state as mentioned in the comment above in the codebase is possible and would fix this issue.

@ilstam
Copy link
Contributor

ilstam commented Mar 3, 2026

On x86 the test in commit 2 appears to be passing even without commit 1 being applied.

@ejc3
Copy link
Author

ejc3 commented Mar 4, 2026

Bleh, let me look more deeply. The manifestation was trying to get serial logs flowing after restore.

@ejc3 ejc3 force-pushed the fix-thre-after-restore branch 3 times, most recently from fdbd689 to 881961d Compare March 4, 2026 17:03
@ejc3
Copy link
Author

ejc3 commented Mar 4, 2026

Updated the PR with a new integration test and a clearer description of the reproduction conditions. Addressing both comments:

@ilstam — You were right that the old unit test passed without the fix. I replaced it with two tests:

  1. A unit test that calls emulate_serial_init() directly and checks IER bits + eventfd signaling — this one fails without the fix (assert_eq!(ier & IER_THR_EMPTY_BIT, IER_THR_EMPTY_BIT)left: 0, right: 2).

  2. An integration test that reproduces the actual stall. The key insight: the bug only manifests when the snapshot is taken during active TX with >16 bytes in the circular buffer. At idle (shell prompt), serial8250_stop_tx() clears THRI from the guest driver's cached IER, and the UART_BUG_TXEN fallback in serial8250_start_tx() handles re-enabling — so idle-state tests pass without the fix. The new test uses dd bs=4096 to keep the buffer full, snapshots mid-stream, and checks for sustained output without sending any host input. Without the fix: 0 bytes after 10s. With the fix: output resumes immediately.

@Manciukic — The condition is: snapshot during active serial output where the TX circular buffer has >16 bytes. I experienced this in ejc3/fcvm when adding tests of systemd/journald logging. It didn't show up in test_serial_after_snapshot because that test snapshots at idle. Regarding full serial state restore — that would also fix this, but this targeted fix is simpler and sufficient since the only state that matters post-restore is IER (the guest driver re-establishes everything else via its cached register copies).

ejc3 added 3 commits March 4, 2026 17:47
After snapshot restore, the guest 8250 serial driver stalls on x86_64
because emulate_serial_init() only sets IER_RDA_BIT (bit 0), not
IER_THR_EMPTY_BIT (bit 1). Without THRE interrupts enabled, the
guest driver waits indefinitely for a TX completion notification
that never arrives.

On x86_64, additionally write a byte to DATA_OFFSET to trigger the
initial THRE interrupt via thr_empty_interrupt(). Writing to IER alone
only stores the value (vm-superio Serial::write for IER_OFFSET just
sets self.interrupt_enable) — it does not generate an interrupt.
The DATA write causes thr_empty_interrupt() to signal the eventfd,
which KVM injects via irqfd when the vCPU resumes.

This is not needed on aarch64 where the GIC interrupt delivery path
handles the first guest-initiated TX write correctly.

Tested: serial console test passes on x86_64 after warm start
(30 fc-agent lines + UART marker found in host log).
Add a unit test that directly validates emulate_serial_init() enables
both RDA and THRE interrupts and triggers the THRE interrupt via a DATA
write. This replaces the previous test which bypassed emulate_serial_init()
and tested vm-superio behavior directly (which passes regardless of the fix).

The test verifies:
1. Fresh serial has IER=0x00 (all interrupts disabled)
2. After emulate_serial_init(): IER has both RDA (bit 0) and THRE (bit 1) set
3. The DATA write triggers the THRE interrupt via eventfd

Without the fix in commit 1, this test fails:
  assertion failed: emulate_serial_init must enable THRE interrupt
    left: 0
   right: 2
Add test that reproduces the serial console stall after snapshot restore
when the guest is actively transmitting. The bug manifests when the TX
circular buffer has >16 bytes at snapshot time:

1. serial8250_tx_chars() can only drain 16 bytes (fifosize) per call
2. THRI stays set in the guest driver's cached IER
3. After restore, serial8250_start_tx() sees THRI already set and
   skips the UART_BUG_TXEN fallback path
4. vm-superio's IER only has RDA (no THRI), so no THRE interrupt fires
5. TX stalls permanently

The test uses dd bs=4096 to generate writes larger than the 16-byte
FIFO limit, keeping the buffer full during snapshot. Without the fix:
9 bytes initial burst, 0 bytes sustained (10s timeout). With fix:
148K+ bytes initial, sustained output immediately.

The existing test_serial_after_snapshot doesn't catch this because it
snapshots at idle (shell prompt), where serial8250_stop_tx() has already
cleared THRI from the cached IER, allowing the TXEN fallback to work.
@ejc3 ejc3 force-pushed the fix-thre-after-restore branch from 881961d to 01b0d6e Compare March 4, 2026 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants