@martin-purplefish
Contributor

Summary

  • Fixed a race condition in _DefaultLoadCalc.get_load(), where the shared MovingAverage object was read without acquiring the threading lock
  • The background thread continuously modifies the moving average while get_load() reads it, so a read can observe inconsistent state
  • Changed get_load() to use the existing _get_avg() method, which acquires the lock before reading

Details

The get_load() classmethod was directly calling cls._instance._m_avg.get_avg() without lock protection, while:

  • The background thread acquires self._lock before calling add_sample()
  • The instance method _get_avg() properly acquires self._lock before calling get_avg()

This inconsistency could cause workers to report incorrect CPU load values to the LiveKit server, potentially leading to incorrect job assignment decisions.
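
As a rough illustration of the pattern described above, here is a simplified sketch. It is not the actual livekit code: the `MovingAverage` internals, the `LoadCalc` class name, and the `record_sample()` helper are assumptions made for the example. Only `_m_avg`, `_lock`, `_instance`, `_get_avg()`, `get_avg()`, `add_sample()`, and `get_load()` come from the description.

```python
import threading


class MovingAverage:
    """Assumed fixed-window moving average; not thread-safe on its own."""

    def __init__(self, window_size: int) -> None:
        self._window = [0.0] * window_size
        self._count = 0
        self._sum = 0.0

    def add_sample(self, sample: float) -> None:
        # Several fields change in sequence, so a reader that does not hold
        # the lock could observe a half-applied update.
        idx = self._count % len(self._window)
        if self._count >= len(self._window):
            self._sum -= self._window[idx]
        self._window[idx] = sample
        self._sum += sample
        self._count += 1

    def get_avg(self) -> float:
        n = min(self._count, len(self._window))
        return self._sum / n if n else 0.0


class LoadCalc:
    """Simplified stand-in for _DefaultLoadCalc (hypothetical name)."""

    _instance = None

    def __init__(self) -> None:
        self._m_avg = MovingAverage(5)
        self._lock = threading.Lock()

    def record_sample(self, sample: float) -> None:
        # Hypothetical helper: what the background thread does per sample.
        with self._lock:
            self._m_avg.add_sample(sample)

    def _get_avg(self) -> float:
        # Reader takes the same lock as the writer.
        with self._lock:
            return self._m_avg.get_avg()

    @classmethod
    def get_load(cls) -> float:
        if cls._instance is None:
            cls._instance = cls()
        # Before the fix: return cls._instance._m_avg.get_avg()  (no lock)
        return cls._instance._get_avg()
```

The point of the sketch is only that the reader and the writer go through the same lock; the real window size and sampling loop may differ.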

Test plan

  • Verified the existing _get_avg() method properly acquires the lock
  • Confirmed the fix uses the locked method instead of direct access

🤖 Generated with Claude Code

…ition

The `get_load()` classmethod was accessing the shared `MovingAverage`
object directly without acquiring the threading lock, while the
background thread continuously modifies it. This created a race
condition that could cause incorrect CPU load values to be reported.

The fix uses the existing `_get_avg()` method which properly acquires
the lock before reading the moving average.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@martin-purplefish
Contributor Author

We've been seeing strange load readings. I'm not sure this is the cause, but it seems like an obvious bug. The other way to solve this would be to have add_sample() update a cache of the load.
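
A rough sketch of that cache-based alternative, for comparison. Everything here is hypothetical (the `CachedLoadCalc` name, the `_cached_load` attribute, the deque-based window); it is not what this PR implements. It also assumes CPython, where reading a single attribute that holds a float cannot observe a partially written value.

```python
import threading
from collections import deque


class CachedLoadCalc:
    """Hypothetical variant: the writer publishes the latest average to a cache."""

    def __init__(self, window_size: int = 5) -> None:
        self._lock = threading.Lock()
        self._samples = deque(maxlen=window_size)  # oldest sample drops off automatically
        self._cached_load = 0.0

    def add_sample(self, sample: float) -> None:
        # Only the writer touches the window; it recomputes the average and
        # publishes it while still holding the lock.
        with self._lock:
            self._samples.append(sample)
            self._cached_load = sum(self._samples) / len(self._samples)

    def get_load(self) -> float:
        # Readers never touch the window, only the last published value.
        return self._cached_load
```

The trade-off is that readers skip the lock entirely but may see a value that is one sample old, which is probably acceptable for a load estimate.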

Member

@theomonnom theomonnom left a comment

thanks

@theomonnom theomonnom merged commit c02365f into livekit:main Jan 3, 2026
10 checks passed