We need to document the parallel systems that GraphBLAS must be able to support. These include:
Multi-core, multi-CPU systems in a shared address space. Explicit management of a system's NUMA features is critical.
Single GPU ... basic host/device model, both with disjoint host and device memories and with Unified Shared Memory (USM)
Multi GPU ... host/device model, both with disjoint memories and with USM
Arbitrary accelerators instead of GPUs (aside: an accelerator is restricted to a fixed API, unlike a GPU, which is programmable)
Shared-nothing distributed systems with nodes composed of the above
We need a platform model that appropriately abstracts systems composed of the above. It must deal with the complexity of the various memory spaces and support arbitrary, dynamic partitions of those systems.
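To make the discussion concrete, here is a minimal sketch of what such a platform model could look like. Every name in it (`MemoryKind`, `ExecResource`, `Node`, `Platform`, `Partition`, `can_access`, `select_by_kind`, `make_example_platform`) is hypothetical and chosen only for illustration; none of it comes from the GraphBLAS specification or from any existing implementation. It assumes a platform is a list of shared-nothing nodes, that each node owns a set of memory spaces and execution resources, and that a partition is just a run-time list of resources.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not part of any GraphBLAS API.
enum class MemoryKind { HostNuma, DeviceLocal, UnifiedShared };
enum class ExecKind { CpuCores, Gpu, FixedApiAccelerator };

struct MemorySpace {
    int id;           // unique within its node
    MemoryKind kind;
};

struct ExecResource {
    int id;                      // unique within its node
    ExecKind kind;
    std::vector<int> reachable;  // ids of memory spaces this resource can address
};

// One node of a shared-nothing system: memories are never shared across nodes.
struct Node {
    std::vector<MemorySpace> memories;
    std::vector<ExecResource> resources;
};

struct Platform {
    std::vector<Node> nodes;
};

// A partition is an arbitrary set of resources that can be rebuilt at run time.
struct ResourceRef {
    std::size_t node;  // index into Platform::nodes
    int resource;      // ExecResource::id within that node
};
using Partition = std::vector<ResourceRef>;

// True if the given resource on this node can address the given memory space.
bool can_access(const Node& node, int resource_id, int memory_id) {
    for (const ExecResource& r : node.resources) {
        if (r.id != resource_id) continue;
        return std::find(r.reachable.begin(), r.reachable.end(), memory_id)
               != r.reachable.end();
    }
    return false;  // unknown resource id
}

// Build a partition containing every resource of one kind across all nodes.
Partition select_by_kind(const Platform& p, ExecKind kind) {
    Partition out;
    for (std::size_t n = 0; n < p.nodes.size(); ++n)
        for (const ExecResource& r : p.nodes[n].resources)
            if (r.kind == kind) out.push_back({n, r.id});
    return out;
}

// Example: node 0 has two NUMA domains, one GPU memory, and one USM region;
// node 1 is a plain CPU node with a single host memory.
Platform make_example_platform() {
    Node a;
    a.memories = {{0, MemoryKind::HostNuma}, {1, MemoryKind::HostNuma},
                  {2, MemoryKind::DeviceLocal}, {3, MemoryKind::UnifiedShared}};
    a.resources = {{0, ExecKind::CpuCores, {0, 1, 3}},
                   {1, ExecKind::Gpu, {2, 3}}};
    Node b;
    b.memories = {{0, MemoryKind::HostNuma}};
    b.resources = {{0, ExecKind::CpuCores, {0}}};
    Platform p;
    p.nodes = {a, b};
    return p;
}
```

In this sketch the GPU on node 0 reaches its own device memory and the USM region but not the host NUMA domains, which is the disjoint-memory case; the USM region is the only memory both the CPU and the GPU can address. A real model would also need to capture NUMA distance, transfer costs, and how partitions change while operations are in flight, none of which is attempted here.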
Finally, we need a way to deal with nonblocking GraphBLAS operations as part of a larger execution context that supports asynchronous execution. I will add a separate issue for this topic.