Pull Request Overview
This PR configures the Daisy task scheduler with improved retry and timeout settings while optimizing block processing efficiency by checking if blocks are already processed before expensive inference computation.
- Adds retry capability with 2 maximum retries for task resilience
- Sets worker timeout to 30 minutes (1800 seconds) to prevent indefinite waiting
- Optimizes processing by checking block completion status earlier in the pipeline
    fill_value = getattr(self.output_arrays[0], 'fill_value', self.dtype(0))
    if not (self.output_arrays[0][write_roi] == fill_value).all():
This check is not strictly equivalent to the original. The original code used `if self.output_arrays[0][block.write_roi].any():`, which returns early if any non-zero value exists. The new condition `if not (self.output_arrays[0][write_roi] == fill_value).all():` returns early if any value differs from `fill_value`, which assumes that `fill_value` represents the 'unprocessed' state. If `fill_value` is 0 and processed blocks contain non-zero values, the two conditions are equivalent. However, if `fill_value` is non-zero the two checks diverge, and if a processed block can consist entirely of `fill_value` it will be treated as unprocessed.
Suggested change:
    - fill_value = getattr(self.output_arrays[0], 'fill_value', self.dtype(0))
    - if not (self.output_arrays[0][write_roi] == fill_value).all():
    + # Check if block is already processed before expensive inference computation
    + if self.output_arrays[0][write_roi].any():
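To make the difference between the two checks concrete, here is a minimal sketch using plain NumPy arrays in place of `self.output_arrays[0][write_roi]`. The helper names `any_check` and `fill_check` and the sample arrays are hypothetical and only illustrate the two conditions discussed above; they are not part of the PR.

```python
import numpy as np


def any_check(block):
    """Original condition: 'processed' if any value is non-zero."""
    return bool(block.any())


def fill_check(block, fill_value):
    """New condition: 'processed' if any value differs from fill_value."""
    return not bool((block == fill_value).all())


# Hypothetical stand-ins for the data read from the output array.
untouched_zero = np.zeros((4, 4), dtype=np.uint8)           # never written, fill_value 0
written = untouched_zero.copy()
written[1, 2] = 7                                           # some real output
untouched_255 = np.full((4, 4), 255, dtype=np.uint8)        # never written, fill_value 255
processed_all_zero = np.zeros((4, 4), dtype=np.uint8)       # real output that happens to be all 0

results = {
    # fill_value == 0: both checks agree.
    "zero_fill_untouched": (any_check(untouched_zero), fill_check(untouched_zero, 0)),
    "zero_fill_written": (any_check(written), fill_check(written, 0)),
    # fill_value == 255: .any() sees non-zero data and would skip an untouched block.
    "nonzero_fill_untouched": (any_check(untouched_255), fill_check(untouched_255, 255)),
    # A processed block equal to the fill value looks unprocessed to both checks.
    "processed_equals_fill": (any_check(processed_all_zero), fill_check(processed_all_zero, 0)),
}

for name, (old, new) in results.items():
    print(f"{name}: any_check={old}, fill_check={new}")
```

With a zero fill value the two conditions agree on every case shown. They only disagree for the untouched block filled with 255, which is the situation where reverting to `.any()` would skip work that has not been done.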
    return

    # Check if block is already processed before expensive inference computation
    fill_value = getattr(self.output_arrays[0], 'fill_value', self.dtype(0))
[nitpick] The fallback to `self.dtype(0)` assumes that zero represents unprocessed blocks, but this may not always be correct. Consider using a more explicit approach to determine the unprocessed state, or document this assumption clearly.
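One possible shape for a more explicit check is sketched below. The function name `block_is_unprocessed` and its `fill_value` parameter are hypothetical and are not taken from the PR. The sketch assumes the block data can be read into a NumPy array, and it handles a NaN fill value separately because `NaN == NaN` is always false in NumPy.

```python
import numpy as np


def block_is_unprocessed(data, fill_value=0):
    """Return True if every element of `data` still equals `fill_value`.

    Assumption (documented on purpose): a block whose data is entirely
    `fill_value` has not been written yet. A processed block that happens
    to consist only of `fill_value` cannot be told apart by this test.
    """
    data = np.asarray(data)
    fill = np.asarray(fill_value)
    # NaN never compares equal to itself, so test for it explicitly.
    if fill.dtype.kind == "f" and np.isnan(fill):
        return bool(np.isnan(data).all())
    return bool((data == fill).all())


# Usage with hypothetical data:
print(block_is_unprocessed(np.zeros((2, 2))))                            # all zeros, fill 0
print(block_is_unprocessed(np.array([0, 1])))                            # contains output
print(block_is_unprocessed(np.full((2, 2), np.nan), fill_value=np.nan))  # NaN fill
```

Keeping the assumption in one named helper documents it in a single place. If the fill value cannot serve as a reliable marker, a separate record of completed blocks would be needed instead, which this sketch does not attempt.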
set number of retries to 2
change idle worker timeout from None to 1800s (30 min)
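As a rough illustration of what these two settings mean, the sketch below retries a failing block function. It is not Daisy's implementation: `run_with_retries`, `MAX_RETRIES`, and `WORKER_TIMEOUT_S` are hypothetical names, and the sketch assumes that "2 retries" means up to three attempts in total. The timeout is shown only as a constant, since enforcing it is the scheduler's job.

```python
MAX_RETRIES = 2            # from the commit: number of retries
WORKER_TIMEOUT_S = 30 * 60  # from the commit: 1800 s idle worker timeout


def run_with_retries(process_block, block, max_retries=MAX_RETRIES):
    """Call process_block(block), retrying after an exception.

    Returns (result, attempts_used). Raises RuntimeError once the first
    attempt plus `max_retries` retries have all failed.
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return process_block(block), attempt + 1
        except Exception as error:  # broad on purpose: any failure triggers a retry
            last_error = error
    raise RuntimeError(
        f"block failed after {max_retries + 1} attempts"
    ) from last_error


# Usage with a hypothetical block function that fails twice, then succeeds.
calls = {"count": 0}


def flaky_block(block):
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("transient failure")
    return "done"


print(run_with_retries(flaky_block, block=None))
```

Under that assumption, a block that fails twice still completes on the third attempt, while a block that fails three times is reported as failed.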