Skip to content

Conversation

@fm3
Copy link
Member

@fm3 fm3 commented Oct 15, 2025

Worker missing notifications now

  • report the duration since the last heartbeat
  • are sent again after 1 hour

Steps to test:

(I already tested locally)

  • run yarn enable-jobs but do not start a worker
  • Adapt timings in application.conf
  • Watch backend logging for worker missing warnings.

  • Removed dev-only changes like prints and application.conf edits
  • Considered common edge cases

@fm3 fm3 self-assigned this Oct 15, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 15, 2025

📝 Walkthrough

Walkthrough

Replaces in-memory per-worker dead tracking with a TemporaryStore-backed persistent store, adds a configurable re-report interval, updates WorkerLivenessService constructor and message formatting with a heartbeat formatter, and adds a new Jobs setting in WkConf plus its value in application.conf.

Changes

Cohort / File(s) Change summary
Worker liveness persistence and reporting
app/models/job/Worker.scala
Replaces in-memory reportedAsDead Set with a TemporaryStore[ObjectId, Unit] for per-worker persistent state. reportIfLivenessChanged now queries/updates TemporaryStore and enforces a re-report interval from conf.Jobs.workerLivenessReReportInterval. Adds formatHeartbeat(worker) helper and updates dead/resurrected message formatting. Constructor WorkerLivenessService now accepts reportedAsDeadTemporaryStore: TemporaryStore[ObjectId, Unit] and conf: WkConf.
Configuration: Jobs settings
app/utils/WkConf.scala
Adds Jobs.workerLivenessReReportInterval: FiniteDuration, read from jobs.workerLivenessReReportInterval.
Runtime config value
conf/application.conf
Sets silhouette.jobs.workerLivenessReReportInterval = 1 hour.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I hop and I watch each tiny beat,
I store the pauses where paths may meet.
From burrowed cache to hourly chime,
I mark the lost, then mark the time.
A rabbit cheers: persistent and spry 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title concisely and accurately summarizes the primary change by indicating enhancements to “Worker missing” notifications, matching the core modifications in the code. It clearly conveys that notification behavior has been improved without extraneous details or vague language. This specificity allows team members to understand the main change at a glance. Therefore it meets the criteria for an effective PR title.
Description Check ✅ Passed The pull request description directly outlines the new behavior of worker missing notifications, explaining both the inclusion of heartbeat duration and the hourly re-report interval, and it provides relevant testing steps. It refers to the actual changes made in code and configuration and notes dev-only cleanup and edge case considerations. The level of detail is sufficient to understand and verify the feature without being off-topic. Thus it satisfies the criteria for a passing description check.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch better-worker-missing-notifications

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3cae1eb and 3cdc310.

📒 Files selected for processing (1)
  • app/models/job/Worker.scala (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
app/models/job/Worker.scala (4)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/storage/TemporaryStore.scala (4)
  • TemporaryStore (9-83)
  • contains (18-23)
  • insert (55-60)
  • remove (69-74)
app/utils/WkConf.scala (1)
  • Jobs (224-228)
util/src/main/scala/com/scalableminds/util/mvc/Formatter.scala (1)
  • formatDuration (30-82)
util/src/main/scala/com/scalableminds/util/time/Instant.scala (3)
  • Instant (14-45)
  • Instant (47-103)
  • since (68-68)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-smoketest-push
  • GitHub Check: backend-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@fm3 fm3 marked this pull request as ready for review October 15, 2025 08:54
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ccf1c28 and 3cae1eb.

📒 Files selected for processing (3)
  • app/models/job/Worker.scala (3 hunks)
  • app/utils/WkConf.scala (1 hunks)
  • conf/application.conf (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
app/utils/WkConf.scala (1)
util/src/main/scala/com/scalableminds/util/tools/ConfigReader.scala (1)
  • get (23-26)
app/models/job/Worker.scala (5)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/storage/TemporaryStore.scala (4)
  • TemporaryStore (9-83)
  • contains (18-23)
  • insert (55-60)
  • remove (69-74)
app/utils/WkConf.scala (1)
  • Jobs (224-228)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/slacknotification/SlackClient.scala (3)
  • warn (18-21)
  • success (24-27)
  • info (21-24)
util/src/main/scala/com/scalableminds/util/mvc/Formatter.scala (1)
  • formatDuration (30-82)
util/src/main/scala/com/scalableminds/util/time/Instant.scala (3)
  • Instant (14-45)
  • Instant (47-103)
  • since (68-68)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: backend-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants