Flaky hanging tests after merging #54 #66

@llucax

What happened?

Pull request #54 introduced a timing issue in the tests, making them flaky on amd64 and probably consistently failing on arm64, because the arm64 CI runs on qemu, which is extremely slow.

It is probably related to the dependency bump of client-dispatch, which in turn bumps the dependency on channels, which changed how Timers work.

Some investigation has already been done, mainly by @Marenz, but it seems we'll need to spend more time on this to find the root cause.

The issue seems to be that some condition variable is used in a different event loop than the one it was created in:

    | RuntimeError: <asyncio.locks.Condition object at 0x7f3d2bac0e50 [unlocked]> is bound to a different event loop

What did you expect instead?

Tests should run normally.

Affected version(s)

No response

Affected part(s)

Unit, integration and performance tests (part:tests)

Extra information

Here is a capture of logs when this happens: https://gist.github.com/Marenz/1ace8c7c0ccf01db70ceee8f767bb6f9#file-different-eventloop-py-L188.

The error about using the wrong loop seems to always happen inside the clean-up code of select(). It might help to add some logging there, such as printing a stack trace when the select() is created and again when it is cleaned up, to see whether the two actions happen in different tests (and therefore different loops).
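The logging idea could look roughly like the sketch below. `LoopOriginTracker` is a hypothetical helper, not part of channels or of select(); it only illustrates recording the creation stack and loop, then comparing them at clean-up time.

```python
import asyncio
import traceback
from typing import List, Optional


class LoopOriginTracker:
    """Hypothetical debugging helper: remembers where and in which loop it was created."""

    def __init__(self) -> None:
        # Must be constructed while an event loop is running.
        self.created_loop = asyncio.get_running_loop()
        self.created_stack: List[str] = traceback.format_stack()

    def check(self) -> Optional[str]:
        """Return a report if the running loop differs from the creation loop."""
        current = asyncio.get_running_loop()
        if current is self.created_loop:
            return None
        return (
            f"created in loop {id(self.created_loop):#x}, "
            f"cleaned up in loop {id(current):#x}\n"
            "--- creation stack ---\n"
            + "".join(self.created_stack)
            + "--- clean-up stack ---\n"
            + "".join(traceback.format_stack())
        )


async def create() -> LoopOriginTracker:
    return LoopOriginTracker()


async def clean_up(tracker: LoopOriginTracker) -> Optional[str]:
    return tracker.check()


async def same_loop_check() -> Optional[str]:
    return LoopOriginTracker().check()


tracker = asyncio.run(create())           # created in one loop
report = asyncio.run(clean_up(tracker))   # checked in another loop
same_loop_report = asyncio.run(same_loop_check())
print(report)
```

Calling something like `check()` from the clean-up path and logging any non-empty report would show both stacks side by side, which should reveal whether creation and clean-up belong to different tests.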

Labels

part:tests (Affects the unit, integration and performance (benchmarks) tests)
priority:high (Address this as soon as possible)
resolution:duplicate (This issue or pull request already exists)
type:bug (Something isn't working)
