Skip to content

Conversation

@BitTheByte
Copy link

Based on discussion at #817

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats for being patient for review here. From what I have read of the code the logic makes sense.

Do you have any thoughts about how we could test this? Have you been running this in a real world setting?

@BitTheByte
Copy link
Author

I believe #877 has implemented the same logic in this PR.

Have you been running this in a real world setting?

I've been running this PR on a 300 worker cluster for a somewhile and it worked effortlessly.

Do you have any thoughts about how we could test this?

Unfortunately, No since I'm not familiar with dask's test suites. but generally a test should launch a n of unschedulable workers and wait for the controller to downscale them to the correct number.

@jacobtomlinson
Copy link
Member

I just bumped this PR to use the latest code from main but something in here is still causing the CI to hang. It's not obvious to me what is causing it, but we will need to get to the bottom of it to be able to get this merged. @BitTheByte do you think you will have some time to try and push this over the line?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants