[feature] tune block-size and grid-size for low-frame count #29

Merged
charlesbmi merged 5 commits into main from feature/low-frame-block-tune
Jul 7, 2025

@charlesbmi
Collaborator

Introduction

Currently the kernel is optimized for ensemble images (e.g. ~32-400 frames). The frames=1 case was not really considered.

Changes

Make some low-hanging changes, limited to the small frame-count case, so that it uses all the threads we would have available (i.e. at least 32).
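
To illustrate the idea, here is a minimal sketch of a launch-configuration heuristic. It is not the code in this PR: the function name `choose_launch_config`, the `(voxel_axis, frame_axis)` block layout, the `max_frames_per_block` parameter, and the use of 32 as the warp size (true for NVIDIA GPUs, assumed to apply here) are all illustrative assumptions. The sketch only shows how a block can be padded along another axis so that it still holds at least 32 threads when there are few frames.

```python
import math

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs (assumed target)


def choose_launch_config(n_voxels, n_frames, max_frames_per_block=32):
    """Return (block_dim, grid_dim), each as a (voxel_axis, frame_axis) tuple.

    Hypothetical heuristic for illustration only, not the repository's code.
    """
    if n_voxels < 1 or n_frames < 1:
        raise ValueError("n_voxels and n_frames must be positive")

    # One thread per frame along the frame axis, capped at the per-block limit.
    frames_per_block = min(n_frames, max_frames_per_block)

    # Pad the voxel axis so every block holds at least one full warp.
    voxels_per_block = math.ceil(WARP_SIZE / frames_per_block)

    block_dim = (voxels_per_block, frames_per_block)
    grid_dim = (
        math.ceil(n_voxels / voxels_per_block),
        math.ceil(n_frames / frames_per_block),
    )
    return block_dim, grid_dim
```

Under these assumptions, a single-frame call such as `choose_launch_config(1000, 1)` yields a block of 32 threads spread across voxels instead of a block with a single active thread, while a 64-frame call keeps all 32 threads on the frame axis.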

Behavior

About 4x faster for n_frames=1.

Review checklist

  • All existing tests and checks pass
  • Unit tests covering the new feature or bugfix have been added
  • The documentation has been updated if necessary

@charlesbmi charlesbmi self-assigned this Jul 7, 2025
@charlesbmi charlesbmi merged commit c71db93 into main Jul 7, 2025
3 checks passed
@charlesbmi charlesbmi deleted the feature/low-frame-block-tune branch July 7, 2025 23:46
alexrockhill pushed a commit to alexrockhill/mach that referenced this pull request Aug 18, 2025
…eurotech#29)

* Increase the number of blocks for low n_frames

* Some changes that I think work, but are a bit confusing

* Remove 64x frames because that is not part of the plan

* Factor out some helper functions

* fix lint

---------

Co-authored-by: Charles Guan <3221512+charlesincharge@users.noreply.github.com>