[WIP] Maint: attempt to mitigate MPITimeout on CI #1242

Open

asoplata wants to merge 1 commit into jonescompneurolab:master from asoplata:fix/mpi-timeout-mitigation-attempt-2

Conversation

@asoplata
Collaborator

This is a shallow attempt to see whether the small change mentioned in #774 (comment) is enough to improve the odds of our Unit Test runners passing the particularly problematic MPI test that keeps failing. The only change it makes is increasing the timeout of `parallel_backends.py::_get_data_from_child_err` from 0.01 to 0.05 seconds. This greatly widens the time window within which an `mpi_child` process must return its data during an MPI simulation, if it has any. As far as I understand it (which is only a little bit at the moment), this is the main way our MPI child processes communicate actual simulation results back to the main process.

When I did some local testing on my own computer (after reducing other timeout values elsewhere in the code), this change seemed to substantially increase the number of MPI simulations that completed successfully.

I don't know what negative impacts, if any, this change might have, but since it only increases an inter-process communication window from 10 milliseconds to 50, I expect them to be negligible.
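
To make the timeout's role concrete, here is a minimal, hypothetical sketch of the general pattern (not the actual `parallel_backends.py` implementation): the parent polls the child's stderr pipe, waiting up to the timeout on each pass for new data before concluding nothing more is coming. The names `read_child_data` and `child_stderr_fd` are illustrative only.

```python
import os
import select


def read_child_data(child_stderr_fd, timeout=0.05):
    """Poll the child's stderr file descriptor and collect any data sent.

    Each pass through the loop waits up to ``timeout`` seconds for the
    child to write something; if nothing arrives within that window, the
    loop gives up. Raising the timeout from 0.01 s to 0.05 s widens that
    per-iteration window fivefold.
    """
    chunks = []
    while True:
        # Wait up to `timeout` seconds for the pipe to become readable.
        ready, _, _ = select.select([child_stderr_fd], [], [], timeout)
        if not ready:
            # Nothing arrived within the timeout window; assume the child
            # has no (more) data and stop waiting.
            break
        data = os.read(child_stderr_fd, 4096)
        if not data:
            # EOF: the child closed its end of the pipe.
            break
        chunks.append(data)
    return b"".join(chunks)
```

Under this kind of polling scheme, a too-short timeout can cause the parent to stop listening while a slow or heavily loaded child (e.g. on a CI runner) is still serializing its results, which is consistent with the intermittent MPI failures described above.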

@asoplata
Collaborator Author

Still failed
