Conversation

@GitHK
Contributor

@GitHK GitHK commented Jul 21, 2025

What do these changes do?

While the asyncio.Task is still attached and tracked by a single instance of the service, its associated data is stored in Redis.

This change simplifies the following:

  • listing of all running services
  • any service can return the status of a long running task
  • sticky sessions are no longer necessary since the status is fetched via Redis
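
A minimal sketch of the idea behind the list above, not the code in this PR: the `asyncio.Task` stays on the instance that started it, while its status lives in a shared store that any instance can read. `FakeSharedStore` is an in-memory stand-in for Redis, and `ServiceInstance`, `start_task` and `get_status` are hypothetical names used only for illustration.

```python
from __future__ import annotations

import asyncio
import json


class FakeSharedStore:
    """In-memory stand-in for Redis (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    async def set(self, key: str, value: str) -> None:
        self._data[key] = value

    async def get(self, key: str) -> str | None:
        return self._data.get(key)


class ServiceInstance:
    """One replica of a service: it owns asyncio.Task objects, not their data."""

    def __init__(self, name: str, store: FakeSharedStore) -> None:
        self.name = name
        self.store = store
        self.local_tasks: dict[str, asyncio.Task] = {}

    async def _save(self, task_id: str, done: bool) -> None:
        payload = json.dumps({"owner": self.name, "done": done})
        await self.store.set(task_id, payload)

    async def start_task(self, task_id: str, coro) -> None:
        # Record the task in the shared store before running it locally.
        await self._save(task_id, done=False)

        async def _run() -> None:
            await coro
            await self._save(task_id, done=True)

        self.local_tasks[task_id] = asyncio.create_task(_run())

    async def get_status(self, task_id: str) -> dict | None:
        # Works on any replica: only the shared store is consulted.
        raw = await self.store.get(task_id)
        return None if raw is None else json.loads(raw)


async def demo() -> tuple[dict | None, dict | None]:
    store = FakeSharedStore()
    a, b = ServiceInstance("a", store), ServiceInstance("b", store)
    await a.start_task("t1", asyncio.sleep(0))
    before = await b.get_status("t1")  # asked on the replica that does NOT own the task
    await a.local_tasks["t1"]
    after = await b.get_status("t1")
    return before, after
```

Because replica `b` answers the status query without holding the task, a client does not need to be routed back to replica `a`, which is why sticky sessions stop being a requirement.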

Changes:

  • Fixed an issue with RedisClientSDK that caused it to hang when closing
  • Removed and repurposed distributed_identifiers Redis DB, which is now used for the long_running_tasks
  • long running tasks no longer persist their tasks in memory; these are now tracked in Redis
  • long running task cancellation now goes through the lrt_api module. When a task is cancelled, the intent is stored in Redis. If the instance of the service that received the request cannot handle it (it does not have the task), the task is removed by the background task that runs on every instance of the service.
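
The cancellation flow in the last item can be sketched roughly as follows. This is an illustration under stated assumptions, not the lrt_api implementation: `FakeSharedStore` stands in for Redis, and `request_cancel` and `cancel_marked_once` are hypothetical names. Any instance may record the intent; only the instance that owns the `asyncio.Task` acts on it during its background pass.

```python
import asyncio


class FakeSharedStore:
    """In-memory stand-in for Redis (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.to_cancel: set[str] = set()


class ServiceInstance:
    def __init__(self, store: FakeSharedStore) -> None:
        self.store = store
        self.local_tasks: dict[str, asyncio.Task] = {}

    def request_cancel(self, task_id: str) -> None:
        # Any replica can record the intent, even if it does not own the task.
        self.store.to_cancel.add(task_id)

    async def cancel_marked_once(self) -> None:
        # One pass of the background worker that every replica would run.
        for task_id in list(self.store.to_cancel):
            task = self.local_tasks.pop(task_id, None)
            if task is None:
                continue  # not ours: leave the intent for the owning replica
            task.cancel()
            await asyncio.gather(task, return_exceptions=True)
            self.store.to_cancel.discard(task_id)


async def demo() -> tuple[bool, bool, set[str]]:
    store = FakeSharedStore()
    owner, other = ServiceInstance(store), ServiceInstance(store)
    task = asyncio.create_task(asyncio.sleep(3600))
    owner.local_tasks["t1"] = task

    other.request_cancel("t1")        # request lands on the wrong replica
    await other.cancel_marked_once()  # nothing to do here
    still_pending = not task.done()
    await owner.cancel_marked_once()  # the owner picks up the intent
    return still_pending, task.cancelled(), store.to_cancel
```

In a real deployment the pass would run periodically on each replica; the single explicit pass here only keeps the example deterministic.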

Related issue/s

How to test

Dev-ops

  • merge this MR before merging this

@GitHK GitHK self-assigned this Jul 21, 2025
@GitHK GitHK added the a:services-library issues on packages/service-libs label Jul 21, 2025
@GitHK GitHK added this to the Engage milestone Jul 21, 2025
@codecov

codecov bot commented Jul 21, 2025

Codecov Report

❌ Patch coverage is 84.97537% with 61 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.02%. Comparing base (b952aff) to head (5780e3e).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8131      +/-   ##
==========================================
- Coverage   88.07%   88.02%   -0.05%     
==========================================
  Files        1905     1871      -34     
  Lines       73166    72543     -623     
  Branches     1280     1260      -20     
==========================================
- Hits        64444    63859     -585     
+ Misses       8343     8308      -35     
+ Partials      379      376       -3     
Flag Coverage Δ
integrationtests 64.32% <45.76%> (+0.15%) ⬆️
unittests 86.65% <84.48%> (-0.06%) ⬇️
Components Coverage Δ
pkg_aws_library 93.93% <ø> (ø)
pkg_celery_library 87.37% <100.00%> (+0.03%) ⬆️
pkg_dask_task_models_library 79.62% <ø> (ø)
pkg_models_library 93.03% <40.00%> (-0.09%) ⬇️
pkg_notifications_library 85.26% <ø> (ø)
pkg_postgres_database 88.02% <ø> (ø)
pkg_service_integration 70.19% <ø> (ø)
pkg_service_library 71.72% <84.40%> (+0.30%) ⬆️
pkg_settings_library ∅ <ø> (∅)
pkg_simcore_sdk 84.99% <ø> (-0.12%) ⬇️
agent 93.81% <ø> (ø)
api_server 93.08% <ø> (ø)
autoscaling 95.89% <100.00%> (+<0.01%) ⬆️
catalog 92.34% <ø> (ø)
clusters_keeper 99.13% <100.00%> (+<0.01%) ⬆️
dask_sidecar 92.37% <ø> (ø)
datcore_adapter 97.94% <ø> (ø)
director 76.14% <ø> (ø)
director_v2 90.95% <85.71%> (-0.14%) ⬇️
dynamic_scheduler 96.27% <ø> (ø)
dynamic_sidecar 90.12% <96.77%> (+0.04%) ⬆️
efs_guardian 89.60% <0.00%> (-0.17%) ⬇️
invitations 91.44% <ø> (ø)
payments 92.60% <ø> (ø)
resource_usage_tracker 92.50% <100.00%> (+0.05%) ⬆️
storage 86.46% <100.00%> (+0.06%) ⬆️
webclient ∅ <ø> (∅)
webserver 88.17% <100.00%> (+0.01%) ⬆️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b952aff...5780e3e. Read the comment docs.

@GitHK GitHK changed the title from "♻️ TasksManager now uses an async store for keeping track of tasks" to "♻️ TasksManager stores contents in Redis" Jul 23, 2025
@GitHK GitHK requested a review from pcrespov August 5, 2025 11:47
Member

@pcrespov pcrespov left a comment

as agreed, please do not forget to follow up on the refactoring of background tasks. thx

Collaborator

@matusdrobuliak66 matusdrobuliak66 left a comment

I didn't check yet, but to unblock you I approve.

Contributor

@YuryHrytsuk YuryHrytsuk left a comment

@GitHK GitHK requested a review from Copilot August 6, 2025 06:41
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request refactors the TasksManager to use Redis for storing task data instead of in-memory storage, enabling better distributed task management across multiple service instances.

  • Replaces in-memory task tracking with Redis-based storage for long-running tasks
  • Implements task cancellation across different service instances via Redis
  • Updates all services to properly configure Redis client SDK with setup() calls

Reviewed Changes

Copilot reviewed 114 out of 115 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| services/web/server/src/simcore_service_webserver/long_running_tasks.py | Configures long running tasks with Redis settings and namespace |
| packages/service-library/src/servicelib/long_running_tasks/task.py | Major refactor to use Redis-based storage instead of in-memory tracking |
| packages/service-library/src/servicelib/redis/_client.py | Adds explicit setup() method and improves Redis health checking |
| packages/settings-library/src/settings_library/redis.py | Renames DISTRIBUTED_IDENTIFIERS database to LONG_RUNNING_TASKS |
| services/*/tests/ | Updates test configurations to use Redis for long running tasks |

Comments suppressed due to low confidence (1)

packages/service-library/src/servicelib/long_running_tasks/task.py:441

  • The _update_progress method is called frequently during task execution but performs Redis operations on every call. Consider implementing a debouncing mechanism or batching progress updates to reduce Redis load, especially for tasks that update progress very frequently.
        except TaskNotFoundError:
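
The suggestion above could look roughly like the sketch below. It is an illustration of the suggested idea, not something this PR implements. Strictly speaking it is a throttle with a final flush (a simple form of debouncing): writes are skipped when they arrive sooner than `min_interval_s` after the previous one, and the latest skipped value is written by `flush()`. `ThrottledProgress` and the `write` callback are hypothetical; in the real code the callback would be whatever performs the Redis write.

```python
import time
from typing import Callable, Optional


class ThrottledProgress:
    """Limits how often progress values reach the (expensive) write callback."""

    def __init__(
        self,
        write: Callable[[float], None],
        min_interval_s: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._write = write
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_write: Optional[float] = None
        self._pending: Optional[float] = None

    def update(self, value: float) -> None:
        now = self._clock()
        due = (
            self._last_write is None
            or now - self._last_write >= self._min_interval_s
        )
        if due:
            self._write(value)
            self._last_write = now
            self._pending = None
        else:
            # Too soon: remember only the latest value.
            self._pending = value

    def flush(self) -> None:
        # Call once when the task finishes so the last value is not lost.
        if self._pending is not None:
            self._write(self._pending)
            self._last_write = self._clock()
            self._pending = None


def demo() -> list:
    writes: list = []
    now = [0.0]  # fake clock so the example is deterministic
    progress = ThrottledProgress(writes.append, 1.0, clock=lambda: now[0])
    for t, value in [(0.0, 0.1), (0.2, 0.2), (0.4, 0.3), (1.5, 0.4), (1.6, 0.5)]:
        now[0] = t
        progress.update(value)
    progress.flush()
    return writes
```

With the fake clock above, five updates result in three writes; the trade-off is that readers of the stored progress may see a value up to `min_interval_s` old.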

@GitHK GitHK enabled auto-merge (squash) August 6, 2025 07:55
@GitHK GitHK disabled auto-merge August 6, 2025 07:55
@GitHK
Contributor Author

GitHK commented Aug 6, 2025

@Mergifyio queue

@mergify
Contributor

mergify bot commented Aug 6, 2025

queue

🟠 Waiting for conditions to match

  • -closed [📌 queue requirement]
  • any of: [🔀 queue conditions]
    • all of: [📌 queue conditions of queue default]
      • branch-protection-review-decision = APPROVED [🛡 GitHub branch protection]
      • #approved-reviews-by >= 2 [🛡 GitHub branch protection]
      • #approved-reviews-by>=2
      • #changes-requested-reviews-by = 0 [🛡 GitHub branch protection]
      • #changes-requested-reviews-by=0
      • #review-threads-unresolved = 0 [🛡 GitHub branch protection]
      • #review-threads-unresolved=0
      • -conflict
      • -draft
      • base=master
      • label!=🤖-do-not-merge
      • label=🤖-automerge
      • any of: [🛡 GitHub branch protection]
        • check-skipped = deploy to dockerhub
        • check-neutral = deploy to dockerhub
        • check-success = deploy to dockerhub
      • any of: [🛡 GitHub branch protection]
        • check-success = system-tests
        • check-neutral = system-tests
        • check-skipped = system-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = unit-tests
        • check-neutral = unit-tests
        • check-skipped = unit-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = check OAS' are up to date
        • check-neutral = check OAS' are up to date
        • check-skipped = check OAS' are up to date
      • any of: [🛡 GitHub branch protection]
        • check-success = integration-tests
        • check-neutral = integration-tests
        • check-skipped = integration-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = build-test-images (frontend) / build-test-images
        • check-neutral = build-test-images (frontend) / build-test-images
        • check-skipped = build-test-images (frontend) / build-test-images
      • any of: [🛡 GitHub branch protection]
        • check-success = SonarCloud Code Analysis
        • check-neutral = SonarCloud Code Analysis
        • check-skipped = SonarCloud Code Analysis
  • -conflict [📌 queue requirement]
  • -draft [📌 queue requirement]
  • any of: [📌 queue -> configuration change requirements]
    • -mergify-configuration-changed
    • check-success = Configuration changed

@sonarqubecloud

sonarqubecloud bot commented Aug 6, 2025

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
1.2% Duplication on New Code

See analysis details on SonarQube Cloud

@GitHK GitHK added the 🤖-automerge marks PR as ready to be merged for Mergify label Aug 6, 2025
@mrnicegyu11 mrnicegyu11 merged commit b585dbf into ITISFoundation:master Aug 6, 2025
94 of 95 checks passed
@mrnicegyu11
Member

Force merged upon @GitHK's request

@GitHK GitHK deleted the pr-osparc-long-running-tasks-refactor-6 branch August 6, 2025 08:26
@matusdrobuliak66 matusdrobuliak66 mentioned this pull request Aug 8, 2025
88 tasks
Labels

🤖-automerge marks PR as ready to be merged for Mergify a:services-library issues on packages/service-libs

7 participants