Skip to content

Conversation

@sanderegg
Copy link
Member

@sanderegg sanderegg commented Sep 25, 2025

What do these changes do?

The dask worker is setup to either:

  • use a fix number of threads to run jobs (which is not ideal when changing the instance type where the dask-workers are running),
  • use a 0 number of threads, which means the dask-worker will compute the number of CPUs and use that number as the number of threads

In non-billable scenarii, and when running jobs that are not resource heavy, it makes sense that the number of threads is higher than the number of CPUs. A fixed amount is suboptimal as this will apply the same number on any type of machine. The multiplier is relative to the number of CPUs.

This PR introduces DASK_NTHREADS_MULTIPLIER for the dask-sidecar service which is defaulted to 1. The pendant CLUSTERS_KEEPER_DASK_NTHREADS_MULTIPLIER is also added.

Related issue/s

How to test

Dev-ops

@sanderegg sanderegg added this to the Cheops milestone Sep 25, 2025
@sanderegg sanderegg self-assigned this Sep 25, 2025
@sanderegg sanderegg added a:dask-service Any of the dask services: dask-scheduler/sidecar or worker a:clusters-keeper labels Sep 25, 2025
@codecov
Copy link

codecov bot commented Sep 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.91%. Comparing base (097758b) to head (784bfe0).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8419      +/-   ##
==========================================
+ Coverage   87.89%   87.91%   +0.01%     
==========================================
  Files        1953     1953              
  Lines       76045    76047       +2     
  Branches     1341     1341              
==========================================
+ Hits        66842    66858      +16     
+ Misses       8802     8788      -14     
  Partials      401      401              
Flag Coverage Δ
integrationtests 63.95% <0.00%> (-0.05%) ⬇️
unittests 86.59% <100.00%> (+0.02%) ⬆️
Components Coverage Δ
pkg_aws_library 93.59% <ø> (ø)
pkg_celery_library 83.58% <ø> (ø)
pkg_dask_task_models_library 79.33% <ø> (ø)
pkg_models_library 93.09% <ø> (ø)
pkg_notifications_library 85.20% <ø> (ø)
pkg_postgres_database 87.95% <ø> (ø)
pkg_service_integration 70.19% <ø> (ø)
pkg_service_library 72.56% <ø> (+0.01%) ⬆️
pkg_settings_library 90.19% <ø> (ø)
pkg_simcore_sdk 84.99% <ø> (ø)
agent 93.53% <ø> (ø)
api_server 91.96% <ø> (ø)
autoscaling 95.74% <ø> (ø)
catalog 92.36% <ø> (ø)
clusters_keeper 99.13% <100.00%> (+<0.01%) ⬆️
dask_sidecar 92.38% <ø> (ø)
datcore_adapter 97.94% <ø> (ø)
director 75.81% <ø> (ø)
director_v2 90.91% <100.00%> (+0.08%) ⬆️
dynamic_scheduler 96.30% <ø> (ø)
dynamic_sidecar 90.43% <ø> (ø)
efs_guardian 89.62% <ø> (ø)
invitations 91.44% <ø> (ø)
payments 92.62% <ø> (ø)
resource_usage_tracker 92.24% <ø> (+0.10%) ⬆️
storage 86.66% <ø> (+0.20%) ⬆️
webclient ∅ <ø> (∅)
webserver 88.00% <ø> (ø)

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 097758b...784bfe0. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sanderegg sanderegg marked this pull request as ready for review September 25, 2025 08:23
Copy link
Contributor

@wvangeit wvangeit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@mergify
Copy link
Contributor

mergify bot commented Sep 25, 2025

🧪 CI Insights

Here's what we observed from your CI run for 784bfe0.

🟢 All jobs passed!

But CI Insights is watching 👀

@sanderegg sanderegg changed the title ✨Dask sidecar: add threads multiplier ✨Dask sidecar: add threads multiplier (⚠️ Devops) Sep 25, 2025
@sanderegg sanderegg requested a review from Copilot September 25, 2025 08:38
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a configurable threads multiplier for Dask workers to optimize thread allocation based on CPU count rather than using fixed values. This allows for better resource utilization, particularly for non-billable scenarios and non-resource-heavy jobs.

  • Adds DASK_NTHREADS_MULTIPLIER environment variable with default value of 1
  • Implements thread calculation logic in the dask-sidecar boot script
  • Updates configuration files and settings across multiple services

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
services/docker-compose.yml Adds new environment variable to clusters-keeper service
services/director-v2/src/simcore_service_director_v2/modules/comp_scheduler/_scheduler_dask.py Adds enhanced error logging for task retrieval failures
services/dask-sidecar/docker/boot.sh Implements core multiplier logic with validation and calculation
services/clusters-keeper/tests/unit/conftest.py Updates test configuration with new environment variable
services/clusters-keeper/src/simcore_service_clusters_keeper/utils/clusters.py Passes multiplier to dask-sidecar environment
services/clusters-keeper/src/simcore_service_clusters_keeper/data/docker-compose.yml Configures multiplier for dask workers and increases CPU limit
services/clusters-keeper/src/simcore_service_clusters_keeper/core/settings.py Defines new setting with validation constraints
.env-devel Sets default development environment value

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Member

@pcrespov pcrespov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx

@sanderegg sanderegg force-pushed the dask-sidecar/add-threads-multiplier branch from 4704124 to 784bfe0 Compare September 25, 2025 09:04
@sanderegg sanderegg enabled auto-merge (squash) September 25, 2025 09:05
@sonarqubecloud
Copy link

@sanderegg sanderegg disabled auto-merge September 25, 2025 09:36
@sanderegg sanderegg merged commit 4235aab into ITISFoundation:master Sep 25, 2025
94 of 95 checks passed
@sanderegg sanderegg deleted the dask-sidecar/add-threads-multiplier branch September 25, 2025 09:36
@matusdrobuliak66 matusdrobuliak66 mentioned this pull request Oct 31, 2025
56 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

a:clusters-keeper a:dask-service Any of the dask services: dask-scheduler/sidecar or worker

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants