@sanderegg
Member

What do these changes do?

Hanging jobs are becoming more frequent. This change protects the platform from long-running jobs that are probably hung, and protects users from paying for them.

If a job produces no logs for 1 hour, a timeout triggers and stops the job.
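
A minimal sketch of the idea, not the code in this PR: it assumes the job's logs arrive as an async iterator of lines and bounds only the silence between two lines, not the total run time. The names `watch_logs`, `NoLogsTimeoutError`, `max_silence_s` and `hanging_job` are hypothetical and used for illustration only.

```python
import asyncio
from collections.abc import AsyncIterator, Callable


class NoLogsTimeoutError(Exception):
    """The job stayed silent (no log line) for longer than allowed."""


async def watch_logs(
    log_lines: AsyncIterator[str],
    *,
    max_silence_s: float,  # hypothetical parameter; the PR describes a 1 h limit
    on_line: Callable[[str], None],
) -> None:
    """Forward log lines; raise if the gap between two lines exceeds max_silence_s."""
    iterator = log_lines.__aiter__()
    while True:
        try:
            # The timer restarts for every line, so only silence is bounded,
            # not the total duration of the job.
            line = await asyncio.wait_for(iterator.__anext__(), timeout=max_silence_s)
        except StopAsyncIteration:
            return  # the job finished normally
        except asyncio.TimeoutError:
            raise NoLogsTimeoutError(
                f"no logs received for {max_silence_s} s, stopping job"
            ) from None
        on_line(line)


async def hanging_job() -> AsyncIterator[str]:
    """Stand-in for a real job (assumption): logs twice, then hangs."""
    yield "starting"
    yield "step 1 done"
    await asyncio.sleep(3600)
    yield "never reached"
```

A caller would catch `NoLogsTimeoutError` and then stop the job and release its resources; with a 1 hour limit this would be `max_silence_s=3600`.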

Related issue/s

How to test

Dev-ops

@sanderegg sanderegg self-assigned this Oct 23, 2025
@sanderegg sanderegg added the a:dask-service (Any of the dask services: dask-scheduler/sidecar or worker) and a:computational clusters labels Oct 23, 2025
@codecov
codecov bot commented Oct 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.39%. Comparing base (ffe52c1) to head (fe881f5).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8549      +/-   ##
==========================================
+ Coverage   87.04%   88.39%   +1.35%     
==========================================
  Files        2011     1288     -723     
  Lines       78751    54876   -23875     
  Branches     1365      187    -1178     
==========================================
- Hits        68545    48506   -20039     
+ Misses       9803     6311    -3492     
+ Partials      403       59     -344     
Flag Coverage Δ
integrationtests 60.39% <ø> (+0.03%) ⬆️
unittests 87.29% <100.00%> (+1.00%) ⬆️
Components Coverage Δ
pkg_aws_library ∅ <ø> (∅)
pkg_celery_library ∅ <ø> (∅)
pkg_dask_task_models_library 79.22% <100.00%> (+0.22%) ⬆️
pkg_models_library ∅ <ø> (∅)
pkg_notifications_library ∅ <ø> (∅)
pkg_postgres_database ∅ <ø> (∅)
pkg_service_integration ∅ <ø> (∅)
pkg_service_library ∅ <ø> (∅)
pkg_settings_library ∅ <ø> (∅)
pkg_simcore_sdk 84.89% <ø> (-0.06%) ⬇️
agent 93.10% <ø> (ø)
api_server 91.62% <ø> (ø)
autoscaling 95.83% <ø> (ø)
catalog 92.06% <ø> (ø)
clusters_keeper 99.14% <ø> (ø)
dask_sidecar ∅ <ø> (∅)
datcore_adapter 97.95% <ø> (ø)
director 75.81% <ø> (+0.08%) ⬆️
director_v2 85.32% <ø> (-0.03%) ⬇️
dynamic_scheduler ∅ <ø> (∅)
dynamic_sidecar 90.44% <ø> (ø)
efs_guardian 89.83% <ø> (ø)
invitations 90.90% <ø> (ø)
payments 92.80% <ø> (ø)
resource_usage_tracker 92.11% <ø> (-0.11%) ⬇️
storage 86.92% <ø> (+0.08%) ⬆️
webclient ∅ <ø> (∅)
webserver 87.07% <ø> (+0.03%) ⬆️

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ffe52c1...fe881f5. Read the comment docs.

@sanderegg sanderegg force-pushed the computational-backend/stop-running-job-if-no-feedback branch from e6ada92 to 69cb65c Compare October 23, 2025 06:55
@mergify
Contributor

mergify bot commented Oct 23, 2025

🧪 CI Insights

Here's what we observed from your CI run for fe881f5.

❌ Job Failures

| Pipeline | Job | Health on master | Retries |
|---|---|---|---|
| CI | integration-tests | Healthy | 0 |
| CI | unit-tests | Healthy | 0 |

✅ Passed Jobs With Interesting Signals

| Pipeline | Job | Signal | Health on master | Retries |
|---|---|---|---|---|
| CI | system-tests | Base branch is broken, but the job passed. Looks like this might be a real fix 💪 | Broken | 0 |

@sanderegg sanderegg force-pushed the computational-backend/stop-running-job-if-no-feedback branch from 69cb65c to fe881f5 Compare October 24, 2025 16:14
Labels

a:computational clusters a:dask-service Any of the dask services: dask-scheduler/sidecar or worker
