@GitHK
Contributor

@GitHK GitHK commented Oct 5, 2024

What do these changes do?

I'm not sure why the GC would not remove the alive key as soon as the tab is closed or the user disconnects. The key is set to its maximum timeout.

Related issue/s

How to test

Dev-ops checklist

@GitHK GitHK self-assigned this Oct 5, 2024
@GitHK GitHK added the a:webserver webserver's codebase. Assigning the area is particularly useful for bugs label Oct 5, 2024
@GitHK GitHK added this to the MartinKippenberger milestone Oct 5, 2024
@codecov

codecov bot commented Oct 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.55%. Comparing base (cafbf96) to head (bf4f248).
Report is 688 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6492      +/-   ##
==========================================
- Coverage   84.57%   82.55%   -2.03%     
==========================================
  Files          10      602     +592     
  Lines         214    30589   +30375     
  Branches       25      265     +240     
==========================================
+ Hits          181    25253   +25072     
- Misses         23     5275    +5252     
- Partials       10       61      +51     
Flag Coverage Δ
integrationtests 64.71% <100.00%> (?)
unittests 88.01% <100.00%> (+3.43%) ⬆️
Components Coverage Δ
api ∅ <ø> (∅)
pkg_aws_library ∅ <ø> (∅)
pkg_dask_task_models_library ∅ <ø> (∅)
pkg_models_library ∅ <ø> (∅)
pkg_notifications_library ∅ <ø> (∅)
pkg_postgres_database ∅ <ø> (∅)
pkg_service_integration ∅ <ø> (∅)
pkg_service_library ∅ <ø> (∅)
pkg_settings_library ∅ <ø> (∅)
pkg_simcore_sdk 77.44% <ø> (∅)
agent ∅ <ø> (∅)
api_server ∅ <ø> (∅)
autoscaling ∅ <ø> (∅)
catalog ∅ <ø> (∅)
clusters_keeper ∅ <ø> (∅)
dask_sidecar ∅ <ø> (∅)
datcore_adapter ∅ <ø> (∅)
director ∅ <ø> (∅)
director_v2 76.28% <ø> (∅)
dynamic_scheduler ∅ <ø> (∅)
dynamic_sidecar 59.62% <ø> (∅)
efs_guardian ∅ <ø> (∅)
invitations ∅ <ø> (∅)
osparc_gateway_server 79.42% <ø> (∅)
payments ∅ <ø> (∅)
resource_usage_tracker ∅ <ø> (∅)
storage ∅ <ø> (∅)
webclient ∅ <ø> (∅)
webserver 89.43% <ø> (∅)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d38a6c3...bf4f248. Read the comment docs.

@GitHK GitHK marked this pull request as ready for review October 7, 2024 05:44
@GitHK GitHK changed the title 🐛 GC bug fixes 🐛 Fixes possible issue with GC closing projects Oct 7, 2024
@GitHK GitHK added bug buggy, it does not work as expected t:maintenance Some planned maintenance work labels Oct 7, 2024
@sonarqubecloud

sonarqubecloud bot commented Oct 7, 2024

await self._registry.set_key_alive(
    self._resource_key(), _get_service_deletion_timeout(self.app)
)
# when the tab is closed the alive key is also removed immediately,

Collaborator

From my understanding, this issue is somehow connected to session IDs (there is probably some issue with how we handle them across products and across tabs). Did you test it, and are you sure this removal will not cause other side effects? Are you able to reproduce this issue? (It's happening so often that I guess it should be reproducible.)

Contributor Author

You can reproduce this everywhere. Open a project and close the tab: its alive key will still be present in Redis. This makes no sense to me, but I might be wrong.

Member

@pcrespov pcrespov left a comment

If you are not sure, why are you changing it without verifying? What if this change makes the GC more unstable?

Member

@sanderegg sanderegg left a comment

Please test the following:

  • start a service
  • close the tab
  • wait for the GC interval
  • re-open the study in a new tab --> this should work

I have the feeling that what you did here will make this fail.
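
The steps above can be replayed against an in-memory stand-in, as a rough sketch rather than the real test. Everything here is an assumption made for illustration: `FakeRegistry`, `is_key_alive`, the resource key, the 3600 s initial TTL and the 30 s GC interval are hypothetical, and only the `set_key_alive(key, timeout)` call shape is taken from the diff. The GC is assumed to remove any service whose alive key has expired.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRegistry:
    """In-memory stand-in for the Redis-backed registry (hypothetical)."""

    now: float = 0.0  # manual clock, in seconds
    _expires_at: dict[str, float] = field(default_factory=dict)

    def set_key_alive(self, key: str, timeout: float) -> None:
        # Same call shape as in the diff: (key, timeout in seconds).
        self._expires_at[key] = self.now + timeout

    def is_key_alive(self, key: str) -> bool:
        return self._expires_at.get(key, float("-inf")) > self.now


def service_survives_reopen(ttl_on_disconnect: float, gc_interval: float) -> bool:
    """Replay: start a service, close the tab, wait one GC interval, reopen."""
    registry = FakeRegistry()
    key = "user:1:tab:a"  # hypothetical resource key
    running_services = {key}

    # 1. start a service (tab open, key alive)
    registry.set_key_alive(key, 3600)
    # 2. close the tab: the disconnect handler rewrites the TTL
    registry.set_key_alive(key, ttl_on_disconnect)
    # 3. wait for the GC interval, then let the (assumed) GC run once
    registry.now += gc_interval
    for k in list(running_services):
        if not registry.is_key_alive(k):
            running_services.discard(k)
    # 4. re-open the study in a new tab: is the service still there?
    return key in running_services


print(service_survives_reopen(ttl_on_disconnect=1, gc_interval=30))    # False
print(service_survives_reopen(ttl_on_disconnect=300, gc_interval=30))  # True
```

Under these assumptions, a TTL of 1 second on disconnect means the first GC run after the tab closes already sees the key as expired.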

await self._registry.set_key_alive(self._resource_key(), 1)

- async def remove_socket_id(self) -> None:
+ async def remove_socket_id_after_disconnection(self) -> None:

Member

I don't see why you changed the name of that function.
In effect you are moving logic naming into a deeper location.

)
# when the tab is closed the alive key is also removed immediately,
# there is no reason to keep it active
await self._registry.set_key_alive(self._resource_key(), 1)

Member

The alive key has a TTL.
I checked that this function, _get_service_deletion_timeout (which is badly named), returns the default TTL.
Did you actually check what happens when you close the tab now? It will instantly remove the service when the GC runs.

This kind of disconnection happens a lot on a train, for example... so I think you might be messing around with it here. Did you test that use case?
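
To put rough numbers on the train case, here is a small back-of-the-envelope sketch. It is not taken from the codebase: `gc_removes_service` is a hypothetical helper, and it assumes the GC runs at fixed multiples of `gc_interval` and that a reconnect refreshes the alive key.

```python
import math


def gc_removes_service(
    disconnect_at: float, reconnect_at: float, ttl: float, gc_interval: float
) -> bool:
    """Return True if a GC run finds the alive key expired before the client is back.

    Assumptions (not taken from the codebase): the GC runs at
    t = gc_interval, 2 * gc_interval, ... and a reconnect refreshes the key.
    """
    expired_at = disconnect_at + ttl
    # First GC run at or after the moment the key expires.
    first_run_after_expiry = max(1, math.ceil(expired_at / gc_interval)) * gc_interval
    return first_run_after_expiry < reconnect_at


# A 40 s connection drop (t=100 to t=140) with a GC every 30 s:
print(gc_removes_service(100, 140, ttl=1, gc_interval=30))   # True: GC at t=120 sees an expired key
print(gc_removes_service(100, 140, ttl=60, gc_interval=30))  # False: key is still alive until t=160
```

With a 1 second TTL, any drop that spans a GC run costs the user the service; a TTL longer than the typical drop acts as a grace period.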

@GitHK
Contributor Author

GitHK commented Oct 29, 2024

outdated

@GitHK GitHK closed this Oct 29, 2024

4 participants