Skip to content

Conversation

@bisgaard-itis
Copy link
Contributor

@bisgaard-itis bisgaard-itis commented Sep 18, 2025

What do these changes do?

  • This PR is motivated by a bug in the api-server. While investigating the bug I realized that the real issue was a conceptual issue in the way we pass around metadata to our celery tasks. This PR fixes the issue and also renames a lot of classes to (hopefully) make the concepts more understandable.
  • When submitting a task in celery (via our wonderful celery-library) one can pass two different types of metadata: ExecutionMetadata and OwnerMetadata. ExecutionMetadata specifies how and where a task should be executed and also additional properties associated with the behavior of the task (e.g. ephemeral). OwnerData is a very flexible type of metadata which can be stored together with the task. The owner must specify who he is in the OwnerData (via the owner field), but apart from that he can pass along other types of metadata as he pleases. Later he can filter his own tasks using a wildcard pattern.
  • A key point is that when task interfaces are exposed as an RPC interface (e.g. in storage) the user (e.g. the webserver) must be allowed to pass along his own metadata. That way the webserver can later filter tasks according to his own metadata.
  • Here are some of the reasons why I think this design is nice to have:
    • The api-server exposes an endpoint which allows a user to zip logfiles. Internally, the api-server triggers this task via the RPC interface exposed by storage. This is great, because storage is responsible for the ExecutionMetadata (e.g. where the task should be executed) and hides that complexity for the api-server. On the other hand, when an osparc user requests the status for his task via the api-server, the api-server can now query the status directly via the celery-library's task_manager. I.e. from the point of view of the api-server it is irrelevant where the task "lives" just to get the status. This works because the api-server's OwnerMetadata is passed along in the storage rpc interface without modification.
    • From the point of view of the api-server it would be nice to be able to figure out which api-server endpoint triggered the task from the task_uuid, because that way, when the user requests the result of the task, the api-server could figure out what is the response model and validate it. This is required in order to adhere to the google api design guideline for long running task APIs (https://github.com/googleapis/googleapis/blob/master/google/longrunning/operations.proto#L150). With these design changes the endpoint's operation_id can easily be added to the OwnerMetadata to achieve exactly this.
    • The fact that the OwnerMetadata requires an owner of a celery task to be specified means that owners will only see the tasks they own and no more. So this should work as a kind of security feature.
    • The filtering mechanism will allow e.g. a user of the api-server to filter his tasks according to endpoint they were triggered from.

Related issue/s

How to test

  • This is a refactoring so no new tests have been added, but old ones have been modified and extended.

Dev-ops

@bisgaard-itis bisgaard-itis self-assigned this Sep 18, 2025
@codecov
Copy link

codecov bot commented Sep 18, 2025

Codecov Report

❌ Patch coverage is 83.49515% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.91%. Comparing base (407f10e) to head (2354efd).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8388      +/-   ##
==========================================
+ Coverage   87.65%   87.91%   +0.25%     
==========================================
  Files        1832     1954     +122     
  Lines       71560    76107    +4547     
  Branches     1341     1342       +1     
==========================================
+ Hits        62726    66907    +4181     
- Misses       8433     8796     +363     
- Partials      401      404       +3     
Flag Coverage Δ
integrationtests 64.04% <100.00%> (+0.08%) ⬆️
unittests 86.58% <83.49%> (+0.32%) ⬆️
Components Coverage Δ
pkg_aws_library 93.59% <ø> (ø)
pkg_celery_library 83.41% <100.00%> (-0.17%) ⬇️
pkg_dask_task_models_library 79.33% <ø> (ø)
pkg_models_library 93.08% <ø> (-0.01%) ⬇️
pkg_notifications_library 85.20% <ø> (ø)
pkg_postgres_database 87.95% <ø> (ø)
pkg_service_integration 70.19% <ø> (ø)
pkg_service_library 72.57% <69.56%> (-0.07%) ⬇️
pkg_settings_library 90.19% <ø> (ø)
pkg_simcore_sdk 84.99% <ø> (-0.06%) ⬇️
agent 93.53% <ø> (ø)
api_server 91.94% <84.21%> (∅)
autoscaling 95.74% <ø> (ø)
catalog 92.36% <ø> (ø)
clusters_keeper 99.13% <ø> (ø)
dask_sidecar 92.38% <ø> (ø)
datcore_adapter 97.94% <ø> (ø)
director 75.81% <ø> (ø)
director_v2 90.92% <ø> (+0.04%) ⬆️
dynamic_scheduler 96.30% <ø> (ø)
dynamic_sidecar 90.43% <ø> (ø)
efs_guardian 89.62% <ø> (ø)
invitations 91.44% <ø> (ø)
payments 92.62% <ø> (ø)
resource_usage_tracker 91.87% <ø> (-0.32%) ⬇️
storage 86.78% <100.00%> (+0.24%) ⬆️
webclient ∅ <ø> (∅)
webserver 88.02% <100.00%> (+0.03%) ⬆️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 407f10e...2354efd. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mergify
Copy link
Contributor

mergify bot commented Sep 18, 2025

🧪 CI Insights

Here's what we observed from your CI run for 2354efd.

✅ Passed Jobs With Interesting Signals

Pipeline Job Signal Health on master Retries 🔍 CI Insights 📄 Logs
CI unit-tests Base branch is broken, but the job passed. Looks like this might be a real fix 💪 Broken 0 View View

Copy link
Collaborator

@matusdrobuliak66 matusdrobuliak66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@sonarqubecloud
Copy link

@bisgaard-itis bisgaard-itis enabled auto-merge (squash) September 26, 2025 09:45
@bisgaard-itis bisgaard-itis added the 🤖-automerge marks PR as ready to be merged for Mergify label Sep 26, 2025
@bisgaard-itis
Copy link
Contributor Author

@Mergifyio queue

@mergify
Copy link
Contributor

mergify bot commented Sep 26, 2025

queue

🟠 Waiting for conditions to match

  • -closed [📌 queue requirement]
  • -conflict [📌 queue requirement]
  • -draft [📌 queue requirement]
  • any of: [📌 queue -> configuration change requirements]
    • -mergify-configuration-changed
    • check-success = Configuration changed
  • any of: [🔀 queue conditions]
    • all of: [📌 queue conditions of queue default]
      • #approved-reviews-by >= 2 [🛡 GitHub branch protection]
      • #approved-reviews-by>=2
      • #changes-requested-reviews-by = 0 [🛡 GitHub branch protection]
      • #changes-requested-reviews-by=0
      • #review-threads-unresolved = 0 [🛡 GitHub branch protection]
      • #review-threads-unresolved=0
      • -conflict
      • -draft
      • base=master
      • branch-protection-review-decision = APPROVED [🛡 GitHub branch protection]
      • label!=🤖-do-not-merge
      • label=🤖-automerge
      • any of: [🛡 GitHub branch protection]
        • check-skipped = deploy to dockerhub
        • check-neutral = deploy to dockerhub
        • check-success = deploy to dockerhub
      • any of: [🛡 GitHub branch protection]
        • check-success = system-tests
        • check-neutral = system-tests
        • check-skipped = system-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = unit-tests
        • check-neutral = unit-tests
        • check-skipped = unit-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = check OAS' are up to date
        • check-neutral = check OAS' are up to date
        • check-skipped = check OAS' are up to date
      • any of: [🛡 GitHub branch protection]
        • check-success = integration-tests
        • check-neutral = integration-tests
        • check-skipped = integration-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = build-test-images (frontend) / build-test-images
        • check-neutral = build-test-images (frontend) / build-test-images
        • check-skipped = build-test-images (frontend) / build-test-images
      • any of: [🛡 GitHub branch protection]
        • check-success = SonarCloud Code Analysis
        • check-neutral = SonarCloud Code Analysis
        • check-skipped = SonarCloud Code Analysis

@bisgaard-itis bisgaard-itis merged commit dd8edee into ITISFoundation:master Sep 26, 2025
94 of 95 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🤖-automerge marks PR as ready to be merged for Mergify a:celery-library

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants