
Conversation

@michael-swan

Description

Adds support for polling the status of async Futures without retaining the CUDA memory of the returned arrays as a side effect.

See corresponding issue: #109
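
For illustration, a minimal sketch of the kind of non-blocking status check described above. The names here (`Future`, `pollFuture`, `waitUntilDone`) are placeholders for illustration only, not the actual API introduced by this PR:

```haskell
-- Sketch only: 'Future' stands in for the real future type in accelerate-llvm,
-- and 'pollFuture' for a hypothetical status check that does not keep the
-- result arrays (and hence their CUDA memory) alive.
import Control.Concurrent (threadDelay)

data Future a  -- placeholder for the real Future type

-- | Hypothetical: report whether the asynchronous computation has completed,
--   without holding a reference to the result arrays.
pollFuture :: Future a -> IO Bool
pollFuture _ = pure False  -- placeholder implementation

-- | Poll-and-sleep loop built on top of the status check.
waitUntilDone :: Future a -> IO ()
waitUntilDone f = do
  done <- pollFuture f
  if done then pure () else threadDelay 1000 >> waitUntilDone f
```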

Motivation and context

See corresponding issue: #109

How has this been tested?

Testing is still TODO.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@michael-swan michael-swan force-pushed the async-future-status-polling branch 6 times, most recently from 6235920 to c420a66 on September 21, 2025 at 21:09
@michael-swan
Author

@tomsmeding Hey, take a look at the above changes when you get a chance. Ultimately, I am looking for a way to cache redundant cuFFT handles without allocating more handles than necessary. Such a cache is unfortunately a necessity for thread-safe concurrent use, so I need to be able to quickly determine whether a handle is still in use on a different CUDA stream. See this other PR for context: AccelerateHS/accelerate-fft#13
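
For illustration, a rough sketch of the handle-caching scheme described above, under assumed placeholder names: `Handle` stands in for a cuFFT plan handle, `InFlight` for the operation that currently owns it, and `pollDone` for the completion check this PR is meant to make possible. This is not the PR's actual code, just the shape of the idea:

```haskell
-- Sketch, not the actual implementation: reuse a cached handle whose owning
-- operation has finished, and allocate a new one only when every cached
-- handle is still busy.
import Control.Concurrent.MVar (MVar, modifyMVar)

data Handle   = Handle    -- stand-in for a cuFFT plan handle
data InFlight = InFlight  -- stand-in for the Future/event owning a handle

data Slot = Slot
  { slotHandle :: Handle
  , slotBusy   :: Maybe InFlight  -- Nothing: free; Just op: still owned
  }

-- | Hypothetical completion check for the operation that owns a handle.
pollDone :: InFlight -> IO Bool
pollDone _ = pure True  -- placeholder

-- | Pick a handle whose owner (if any) has completed; otherwise allocate.
--   (Re-marking the chosen slot as busy is omitted for brevity.)
acquire :: MVar [Slot] -> IO Handle -> IO Handle
acquire cache allocate = modifyMVar cache $ \slots -> do
  avail <- findFree slots
  case avail of
    Just s  -> pure (slots, slotHandle s)
    Nothing -> do
      h <- allocate
      pure (Slot h Nothing : slots, h)
  where
    findFree []       = pure Nothing
    findFree (s:rest) = do
      ok <- maybe (pure True) pollDone (slotBusy s)
      if ok then pure (Just s) else findFree rest
```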

@michael-swan
Author

I am also aware that, since y'all are working on a new pipeline for Accelerate, this change likely doesn't fit that work directly, but I suspect it could be adapted for that changeset.

@ivogabe
Contributor

ivogabe commented Sep 23, 2025

Thanks for investigating this, and thanks for making this fix! This is a breaking change to the FFI interface of accelerate-llvm-*, and since we are close to a new release of accelerate (1.4), we (Tom and I) are afraid this will delay that release if we merge this now. Do you know whether these breaking changes are easily resolved in other accelerate-* support packages? Would you have time to write patches to those packages? Otherwise we should probably postpone this to a later release.

Regarding new-pipeline: I think this change is fine. In new-pipeline we do not have the FFI working yet, and we will need to redesign it (with breaking changes) anyway. For instance, the Future type is removed. It will thus not really matter for new-pipeline what we do with the current interface.

@michael-swan
Author

michael-swan commented Sep 23, 2025

That all makes sense, and I can live with these changes in my local environment for now, so that I at least get thread-safe behaviour from accelerate-fft. Ultimately, thread-safe, concurrent use of libraries like cuFFT with minimal memory retention is worth keeping in mind. The only way I can see to do this efficiently is to cache multiple redundant cuFFT handles, sized for peak concurrent usage, and to provide a means of polling the completion status of the operations that "own" a handle during their evaluation. There is an additional nuance: it is safe to use a busy handle within the same stream, but a handle that is busy on a different stream cannot be reused until the owning operation completes. cuFFT might be the only case where this is needed at present, but it may plausibly be a requirement for other libraries too.

I am open to alternative approaches to accomplishing this, but having spent some time thinking about it, it seems that accelerate-llvm-ptx needs to support checking completion status.
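
For illustration, a small sketch of the stream-safety rule described in the previous comment, again with hypothetical names (`Stream`, `HandleState`, `ownerDone` are placeholders, not cuFFT or accelerate-llvm-ptx API):

```haskell
-- Sketch of the rule: a handle busy on stream S may be reused from S itself,
-- but reuse from another stream has to wait until the owning operation
-- has completed.
data Stream = Stream Int deriving Eq   -- stand-in for a CUDA stream

data HandleState
  = Free
  | BusyOn Stream   -- owned by an operation still running on this stream

-- | Hypothetical completion poll for the operation that owns a handle.
ownerDone :: HandleState -> IO Bool
ownerDone Free       = pure True
ownerDone (BusyOn _) = pure False   -- placeholder: query the real event/future

-- | May a handle in the given state be used from the given stream right now?
canUseFrom :: Stream -> HandleState -> IO Bool
canUseFrom _ Free = pure True
canUseFrom s st@(BusyOn owner)
  | s == owner = pure True        -- same stream: work is serialised anyway
  | otherwise  = ownerDone st     -- different stream: only once the owner finished
```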

@michael-swan michael-swan force-pushed the async-future-status-polling branch 3 times, most recently from f6e2025 to ae609ae on September 24, 2025 at 21:47
@michael-swan michael-swan force-pushed the async-future-status-polling branch from ae609ae to 7a9e3de on September 25, 2025 at 00:08
