Interrupts may cause deadlocks with threaded runtime #393

@sberkun

Description

Currently, the threaded runtime uses mutexes to maintain exclusive access to key data structures (namely, environments). On embedded systems, Lingua Franca also supports scheduling physical actions from interrupts via lf_schedule. However, the following sequence of events may occur:

  • Core 1 acquires the mutex (i.e. LF_MUTEX_LOCK(&env->mutex))
  • Core 1 begins modifying the environment
  • An interrupt occurs on core 1. The interrupt handler calls lf_schedule, which also modifies the environment.
  • Interrupt handler finishes.
  • Core 1 finishes modifying environment
  • Core 1 releases mutex

In this case, the mutex was unhelpful: the interrupt handler runs on the core that already holds it, so the handler can corrupt the environment while it is mid-modification. The situation is even worse if lf_schedule tries to acquire the mutex (is this the case? I could not find an LF_MUTEX_LOCK line). Then the handler waits on a mutex held by the very code it interrupted, and deadlock occurs inside the interrupt handler.
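The deadlock case can be illustrated on a host machine with a minimal sketch. Everything here is an assumption for illustration: `env_lock` is a hypothetical stand-in for the environment mutex built on a C11 `atomic_flag`, and the "interrupt" is simulated by a plain function call made while the lock is held. The handler uses a non-blocking try-lock so the program can report the failure instead of hanging; a blocking lock at that point would never return.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for env->mutex: a non-recursive lock. */
static atomic_flag env_lock = ATOMIC_FLAG_INIT;

/* Try to take the lock without blocking; returns true on success. */
static bool env_try_lock(void) {
    return !atomic_flag_test_and_set(&env_lock);
}

static void env_unlock(void) {
    atomic_flag_clear(&env_lock);
}

/* Records whether the simulated handler managed to take the lock. */
static bool isr_got_lock = false;

/* Simulated interrupt handler, invoked on the "core" that holds the lock. */
static void simulated_isr(void) {
    isr_got_lock = env_try_lock();
    if (isr_got_lock) {
        env_unlock();
    }
}

/* Returns true if a blocking lock inside the handler would never return. */
bool isr_would_deadlock(void) {
    bool held = env_try_lock();   /* main-line code acquires the lock */
    assert(held);
    simulated_isr();              /* interrupt fires before the release */
    env_unlock();                 /* main-line code releases the lock */
    return !isr_got_lock;
}
```

Under these assumptions, `isr_would_deadlock()` reports that the handler could not obtain the lock: the holder cannot run again until the handler returns, so a handler that waits for the lock waits forever.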

There are a few potential solutions to this:

  • (1) Specify that mutexes should disable interrupts on embedded platforms. This is heavy-handed, but should work. The main concern is that interrupts would become disabled when any reaction is running, since reactions acquire a reactor-level mutex (here). Interrupts would therefore only be enabled when all cores are sleeping.
  • (2) Create separate platform APIs for regular mutexes and for mutexes that disable interrupts (i.e. there would be lf_mutex_init and lf_mutex_init_with_interrupts). The environments would be locked with "interrupt mutexes", and reactors would be locked with regular mutexes. On most platforms, the two APIs would be equivalent. However, this would be a third type of locking-related API (in addition to regular mutexes and critical sections), which may clutter the platform API.
  • (3) Use critical sections instead of mutexes for environments. This is very similar to (2).
  • (4) Create a separate function, lf_schedule_external, for scheduling from interrupts. Instead of modifying the environment, this function could have its own queue for external events. Then, whenever an event occurs, the scheduler would need to transfer external events from the external event queue onto the regular event queue. The advantage of this approach is that interrupts only need to be disabled while the external event queue is being modified. The main issue with this approach would be "waking up" the scheduler whenever the external event queue has something new; currently, waking up is handled by _lf_cond_timedwait, which assumes a mutex.
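As a rough sketch of option (4), the external event queue could be a small fixed-size ring buffer that is touched only with interrupts masked. All names below are hypothetical and not part of the runtime: `ext_event_t`, `ext_schedule` (playing the role of the proposed lf_schedule_external), `ext_drain`, and the `irq_disable`/`irq_enable` placeholders, which on a real target would map to the platform's interrupt-masking primitives. Here they only track nesting depth so the sketch runs on a host. The wake-up problem mentioned above is not addressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXT_QUEUE_CAPACITY 8

/* Hypothetical event record; the runtime's real event type is richer. */
typedef struct {
    int action_id;
    long long offset_ns;
} ext_event_t;

static ext_event_t ext_queue[EXT_QUEUE_CAPACITY];
static size_t ext_head = 0;   /* index of the oldest pending event */
static size_t ext_count = 0;  /* number of pending events */

/* Placeholders for platform interrupt masking (assumption): on a host
 * they only track nesting depth so the sketch can be checked. */
static int irq_disable_depth = 0;
static void irq_disable(void) { irq_disable_depth++; }
static void irq_enable(void) {
    assert(irq_disable_depth > 0);
    irq_disable_depth--;
}

/* Intended to be callable from an interrupt handler. Returns false if
 * the queue is full. Never touches the environment or its mutex. */
bool ext_schedule(int action_id, long long offset_ns) {
    irq_disable();
    bool ok = ext_count < EXT_QUEUE_CAPACITY;
    if (ok) {
        size_t tail = (ext_head + ext_count) % EXT_QUEUE_CAPACITY;
        ext_queue[tail] = (ext_event_t){action_id, offset_ns};
        ext_count++;
    }
    irq_enable();
    return ok;
}

/* Called by the scheduler: copies up to max pending events into out, in
 * arrival order, and returns how many were copied. The caller would then
 * insert them into the regular event queue under the usual mutex. */
size_t ext_drain(ext_event_t *out, size_t max) {
    size_t n = 0;
    irq_disable();
    while (ext_count > 0 && n < max) {
        out[n++] = ext_queue[ext_head];
        ext_head = (ext_head + 1) % EXT_QUEUE_CAPACITY;
        ext_count--;
    }
    irq_enable();
    return n;
}
```

In this sketch, interrupts are masked only for the few instructions that update the ring buffer, so reactions can run with interrupts enabled. A full queue is reported to the caller rather than handled, which is a design choice of this illustration only.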
