Is your feature request related to a problem? Please describe.
When group offloading is enabled, the offload and onload transfers cannot be streamed between steps, which makes them a major source of wasted time.
Describe the solution you'd like.
Would it be possible to add an option that forces the first and last blocks to stay on the device, so that they are never offloaded and onloaded?
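To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is plain Python and does not touch diffusers; `plan_block_offloading`, `BlockPlan`, and the `pin_first_last` flag are all hypothetical names used only for illustration, not an existing API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockPlan:
    index: int
    pinned: bool  # True: block stays on the device; False: block is offloaded/onloaded as usual


def plan_block_offloading(num_blocks: int, pin_first_last: bool = False) -> List[BlockPlan]:
    """Hypothetical planner (not part of diffusers): decide which blocks stay resident."""
    if num_blocks < 0:
        raise ValueError("num_blocks must be non-negative")
    # With the proposed option on, the first and last blocks are never moved off the device.
    pinned = {0, num_blocks - 1} if (pin_first_last and num_blocks > 0) else set()
    return [BlockPlan(index=i, pinned=i in pinned) for i in range(num_blocks)]


plan = plan_block_offloading(4, pin_first_last=True)
print([p.pinned for p in plan])  # [True, False, False, True]
```

With the flag off, every block is offloaded as it is today; with it on, only the middle blocks would go through the offload/onload path.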
@a-r-r-o-w Could you please give some help? Thanks so much.