fix: CI failures by coderbirju · Pull Request #848 · firecracker-microvm/firecracker-containerd

coderbirju · 2026-01-27T13:35:33Z

Issue #, if available:

Our CI experiences two categories of failures:

Consistently Failing Tests

tc-redirect-tap permission denied failures in go mod
- The previously pinned version is no longer available from the Go module repository
- Changed the dependency to the only available version - v0.0.0-20250516183331-34bf829e9a5c

Intermittently Failing Tests

TestJailerCPUSet_Isolated
TestOOM_Isolated
TestCreateVM_Isolated
TestStopVM_Isolated
TestEvents_Isolated - Race condition in event collection logic
- Current implementation strictly collects exactly 10 events in a specific order
- Events arrive non-deterministically, causing test failures
- Changed the test to check that the expected events are present, regardless of their order
TestPauseResume_Isolated variants - vsock connection timeouts
TestBrokenPipe-Isolated - This test simulates a broken ioPipe by removing the stdio and stderr streams and attaching another iostream to the same task. It is very flaky: sometimes the attach does not happen properly and nothing arrives on the new streams. The test needs to be revisited and refactored because this method is not deterministic. We should skip it if the failures are consistent.

Most of these failures happen because of timing delays during agent setup and cleanup. I have tried to alleviate some of them by adding timeouts and individual contexts in as many places as possible, but the results are not consistent. The only way to make the tests consistent is probably to look into how they are structured.

Recent changes include a runc update, a firecracker-go-sdk update and various other small dependency updates, including the docker image used for testing. Any of these could be a reason for the added flakiness.

Description of changes:

Added extra timeouts and cleanup functions

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Signed-off-by: Arjun Raja Yogidas <arjunry@amazon.com>

When the agent receives a Shutdown request, it may close the ttrpc connection before sending the response. This is expected behavior. The runtime should proceed to Wait() for the VM to exit rather than treating this as a failure and force-terminating.

Signed-off-by: Arjun Raja Yogidas <arjunry@amazon.com>


coderbirju requested a review from a team as a code owner January 27, 2026 13:35

coderbirju force-pushed the fix-test-failures branch from b6d5537 to 6e6d0dc Compare January 27, 2026 15:19

coderbirju changed the title ~~fix: go mod failures~~ fix: CI failures Jan 27, 2026

coderbirju force-pushed the fix-test-failures branch 25 times, most recently from f0a783f to a0c8483 Compare February 4, 2026 21:04

coderbirju force-pushed the fix-test-failures branch 2 times, most recently from 0d4d785 to de08c40 Compare February 4, 2026 22:20

coderbirju force-pushed the fix-test-failures branch 6 times, most recently from 8aa17b9 to b5f0cd4 Compare February 5, 2026 18:20

coderbirju added 3 commits February 5, 2026 19:40

fix: go mod failures

261f745

Signed-off-by: Arjun Raja Yogidas <arjunry@amazon.com>

Fix TestCreateVM_Isolated stability by creating fresh ctx/fcClient pe…

1ff34b4

…r subtest Signed-off-by: Arjun Raja Yogidas <arjunry@amazon.com>

coderbirju force-pushed the fix-test-failures branch 2 times, most recently from 23f5494 to 1ff34b4 Compare February 5, 2026 23:15

coderbirju wants to merge 3 commits into firecracker-microvm:main from coderbirju:fix-test-failures
