Skip to content

Conversation

@bisgaard-itis
Copy link
Contributor

@bisgaard-itis bisgaard-itis commented Jul 8, 2025

What do these changes do?

  • Ensure asyncpg is instrumented. Together with @pcrespov I discovered the other day that this library is not instrumented in the aiohttp case (webserver)
  • Clean up tracing tests
  • Add a decorator which can be used to generate opentelemetry spans for a given coroutine (e.g. a handler). That span will include a pyinstrument profile. This will make it easier to detect computational bottlenecks in the future.
  • I also enable tracing on the local deployment by default because it is very useful for debugging.
  • The changes here seems to have surfaced issues in the unit tests in storage. It seems the cause of these issues were that tracing was enabled, but there was no collector. So I have explicitly enabled tracing (adding an in-memory collector).

Related issue/s

How to test

Dev-ops

Food for thought

  • Using the decorator I profiled the get_my_profile endpoint in the webserver on my local deployment and this is (part of) the profile I got:
  _     ._   __/__   _ _  _  _ _/_   Recorded: 09:49:40  Samples:  152
 /_//_/// /_\ / //_// / //_'/ //     Duration: 1.766     CPU time: 0.180
/   _/                      v4.6.1

Program: /home/scu/.venv/bin/gunicorn simcore_service_webserver.cli:app_factory --log-level=info --bind 0.0.0.0:8080 --worker-class aiohttp.GunicornUVLoopWebWorker --workers=1 --name=webserver_wb-bisgaard-wkst-2_2025-07-08_09:47:35_7 --access-logfile=- --access-logformat=%a %t "%r" %s %b [%Dus] "%{Referer}i" "%{User-Agent}i" --worker-tmp-dir=/dev/shm

1.765 MainThread  <thread>:140197402631040
├─ 1.752 <module>  gunicorn:1
│  └─ 1.752 run  gunicorn/app/wsgiapp.py:60
│     └─ 1.752 WSGIApplication.run  gunicorn/app/base.py:201
│        └─ 1.752 WSGIApplication.run  gunicorn/app/base.py:69
│           └─ 1.752 Arbiter.run  gunicorn/arbiter.py:195
│              └─ 1.752 Arbiter.manage_workers  gunicorn/arbiter.py:564
│                 └─ 1.752 Arbiter.spawn_workers  gunicorn/arbiter.py:632
│                    └─ 1.752 Arbiter.spawn_worker  gunicorn/arbiter.py:586
│                       └─ 1.752 GunicornUVLoopWebWorker.init_process  aiohttp/worker.py:243
│                          └─ 1.752 GunicornUVLoopWebWorker.init_process  aiohttp/worker.py:50
│                             └─ 1.752 GunicornUVLoopWebWorker.init_process  gunicorn/workers/base.py:86
│                                └─ 1.752 GunicornUVLoopWebWorker.run  aiohttp/worker.py:59
│                                   ├─ 1.192 [self]  aiohttp/worker.py
│                                   ├─ 0.511 RequestHandler._handle_request  aiohttp/web_protocol.py:500
│                                   │  └─ 0.511 Application._handle  aiohttp/web_app.py:527
│                                   │     └─ 0.511 impl  aiohttp/web_middlewares.py:111
│                                   │        └─ 0.511 middleware  opentelemetry/instrumentation/aiohttp_server/__init__.py:231
│                                   │           └─ 0.511 response_trace_id_header_middleware  servicelib/aiohttp/tracing.py:171
│                                   │              └─ 0.511 factory  aiohttp_session/__init__.py:191
│                                   │                 └─ 0.511 _middleware_handler  servicelib/aiohttp/rest_middlewares.py:207
│                                   │                    └─ 0.511 _middleware_handler  servicelib/aiohttp/rest_middlewares.py:264
│                                   │                       └─ 0.511 discover_product_middleware  simcore_service_webserver/products/_web_middlewares.py:54
│                                   │                          └─ 0.511 middleware_handler  servicelib/aiohttp/monitoring.py:58
│                                   │                             └─ 0.511 profiling_middleware  servicelib/aiohttp/profiler_middleware.py:10
│                                   │                                └─ 0.511 base_long_running_error_handler  servicelib/aiohttp/long_running_tasks/_error_handlers.py:15
│                                   │                                   └─ 0.511 wrapper  servicelib/tracing.py:65
│                                   │                                      └─ 0.511 _wrapper  simcore_service_webserver/login_auth/decorators.py:45
│                                   │                                         ├─ 0.283 _wrapper  simcore_service_webserver/exception_handling/_base.py:139
│                                   │                                         │  └─ 0.283 get_my_profile  simcore_service_webserver/users/_users_rest.py:113
│                                   │                                         │     ├─ 0.268 list_user_groups_with_read_access  simcore_service_webserver/groups/_groups_service.py:39
│                                   │                                         │     │  └─ 0.268 get_all_user_groups_with_read_access  simcore_service_webserver/groups/_groups_repository.py:170
│                                   │                                         │     │     ├─ 0.267 AsyncConnection.stream  sqlalchemy/ext/asyncio/engine.py:389
│                                   │                                         │     │     │  └─ 0.267 greenlet_spawn  sqlalchemy/util/_concurrency_py3k.py:95
│                                   │                                         │     │     │     ├─ 0.266 AsyncAdapt_asyncpg_ss_cursor._prepare_and_execute  sqlalchemy/dialects/postgresql/asyncpg.py:402
│                                   │                                         │     │     │     │  └─ 0.266 AsyncAdapt_asyncpg_connection._prepare  sqlalchemy/dialects/postgresql/asyncpg.py:639
│                                   │                                         │     │     │     │     └─ 0.266 Connection.prepare  asyncpg/connection.py:533
│                                   │                                         │     │     │     │        └─ 0.266 Connection._prepare  asyncpg/connection.py:573
│                                   │                                         │     │     │     │           └─ 0.266 Connection._get_statement  asyncpg/connection.py:359
│                                   │                                         │     │     │     │              └─ 0.266 Connection._introspect_types  asyncpg/connection.py:457
│                                   │                                         │     │     │     │                 └─ 0.266 Connection.__execute  asyncpg/connection.py:1669
│                                   │                                         │     │     │     │                    └─ 0.266 Connection._do_execute  asyncpg/connection.py:1699
│                                   │                                         │     │     │     │                       └─ 0.266 [await]  asyncpg/connection.py
│                                   │                                         │     │     │     └─ 0.001 AsyncPGInstrumentor._do_cursor_execute  opentelemetry/instrumentation/asyncpg/__init__.py:204
│                                   │                                         │     │     │        └─ 0.001 _AgnosticContextManager.__enter__  opentelemetry/util/_decorator.py:54
│                                   │                                         │     │     │           └─ 0.001 Tracer.start_as_current_span  opentelemetry/sdk/trace/__init__.py:1076

Note: Handling the request takes 1.7 seconds, while the handler only takes 0.2 seconds.

@codecov
Copy link

codecov bot commented Jul 8, 2025

Codecov Report

Attention: Patch coverage is 76.08696% with 11 lines in your changes missing coverage. Please review.

Project coverage is 88.28%. Comparing base (7aec02e) to head (c04e853).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8070      +/-   ##
==========================================
- Coverage   88.28%   88.28%   -0.01%     
==========================================
  Files        1864     1864              
  Lines       71826    71869      +43     
  Branches     1261     1264       +3     
==========================================
+ Hits        63412    63449      +37     
- Misses       8050     8053       +3     
- Partials      364      367       +3     
Flag Coverage Δ
integrationtests 64.28% <100.00%> (+0.03%) ⬆️
unittests 86.89% <76.08%> (-0.01%) ⬇️
Components Coverage Δ
pkg_aws_library 93.93% <ø> (ø)
pkg_celery_library 87.15% <ø> (ø)
pkg_dask_task_models_library 79.62% <ø> (ø)
pkg_models_library 93.24% <ø> (ø)
pkg_notifications_library 85.26% <ø> (ø)
pkg_postgres_database 88.22% <ø> (ø)
pkg_service_integration 69.92% <ø> (ø)
pkg_service_library 71.25% <73.80%> (-0.08%) ⬇️
pkg_settings_library 90.39% <ø> (ø)
pkg_simcore_sdk 85.05% <ø> (-0.06%) ⬇️
agent 96.29% <ø> (ø)
api_server 92.83% <ø> (-0.03%) ⬇️
autoscaling 95.88% <ø> (ø)
catalog 92.58% <ø> (ø)
clusters_keeper 99.13% <ø> (ø)
dask_sidecar 91.79% <ø> (ø)
datcore_adapter 97.94% <ø> (ø)
director 76.77% <ø> (ø)
director_v2 91.03% <ø> (-0.02%) ⬇️
dynamic_scheduler 96.69% <ø> (ø)
dynamic_sidecar 90.09% <ø> (ø)
efs_guardian 89.65% <ø> (ø)
invitations 93.60% <ø> (ø)
payments 92.57% <ø> (ø)
resource_usage_tracker 92.52% <ø> (+0.16%) ⬆️
storage 86.89% <ø> (+0.25%) ⬆️
webclient ∅ <ø> (∅)
webserver 88.61% <100.00%> (+0.01%) ⬆️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7aec02e...c04e853. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@bisgaard-itis bisgaard-itis self-assigned this Jul 8, 2025
@bisgaard-itis bisgaard-itis added the a:services-library issues on packages/service-libs label Jul 8, 2025
@bisgaard-itis bisgaard-itis added this to the Engage milestone Jul 8, 2025
@bisgaard-itis bisgaard-itis marked this pull request as ready for review July 8, 2025 09:59
Copy link
Member

@pcrespov pcrespov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx!

@bisgaard-itis
Copy link
Contributor Author

@Mergifyio queue

@bisgaard-itis bisgaard-itis added the 🤖-automerge marks PR as ready to be merged for Mergify label Jul 9, 2025
@mergify
Copy link
Contributor

mergify bot commented Jul 9, 2025

queue

🟠 Waiting for conditions to match

  • -closed [📌 queue requirement]
  • -conflict [📌 queue requirement]
  • -draft [📌 queue requirement]
  • any of: [📌 queue -> configuration change requirements]
    • -mergify-configuration-changed
    • check-success = Configuration changed
  • any of: [🔀 queue conditions]
    • all of: [📌 queue conditions of queue default]
      • #approved-reviews-by >= 2 [🛡 GitHub branch protection]
      • #approved-reviews-by>=2
      • #changes-requested-reviews-by = 0 [🛡 GitHub branch protection]
      • #changes-requested-reviews-by=0
      • #review-threads-unresolved = 0 [🛡 GitHub branch protection]
      • #review-threads-unresolved=0
      • -conflict
      • -draft
      • base=master
      • branch-protection-review-decision = APPROVED [🛡 GitHub branch protection]
      • label!=🤖-do-not-merge
      • label=🤖-automerge
      • any of: [🛡 GitHub branch protection]
        • check-skipped = deploy to dockerhub
        • check-neutral = deploy to dockerhub
        • check-success = deploy to dockerhub
      • any of: [🛡 GitHub branch protection]
        • check-success = system-tests
        • check-neutral = system-tests
        • check-skipped = system-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = unit-tests
        • check-neutral = unit-tests
        • check-skipped = unit-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = check OAS' are up to date
        • check-neutral = check OAS' are up to date
        • check-skipped = check OAS' are up to date
      • any of: [🛡 GitHub branch protection]
        • check-success = integration-tests
        • check-neutral = integration-tests
        • check-skipped = integration-tests
      • any of: [🛡 GitHub branch protection]
        • check-success = build-test-images (frontend) / build-test-images
        • check-neutral = build-test-images (frontend) / build-test-images
        • check-skipped = build-test-images (frontend) / build-test-images

Copy link
Member

@sanderegg sanderegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

Copy link
Contributor

@YuryHrytsuk YuryHrytsuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job. Thanks!

Copy link
Member

@mrnicegyu11 mrnicegyu11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not too sure about the test changes, But I trust that you have tested the refactored tests. For the main code, looks good and thanks for the contirbution!

@sonarqubecloud
Copy link

@bisgaard-itis bisgaard-itis enabled auto-merge (squash) July 10, 2025 04:22
@bisgaard-itis bisgaard-itis merged commit 601c3ef into ITISFoundation:master Jul 10, 2025
94 of 96 checks passed
@bisgaard-itis bisgaard-itis deleted the 8065-instrument-asyncpg branch July 10, 2025 04:27
@matusdrobuliak66 matusdrobuliak66 mentioned this pull request Aug 5, 2025
88 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🤖-automerge marks PR as ready to be merged for Mergify a:services-library issues on packages/service-libs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants