Hi,
First of all, thank you for your outstanding work on the rstsr crate — it’s shaping up to be one of the most promising foundational libraries for scientific computing in Rust.
While exploring the repository, I noticed that rstsr already supports multiple BLAS/LAPACK backends such as MKL, AOCL, and KML, which is impressive. I primarily work on macOS, so I was wondering whether it would be possible (or practical) for rstsr to utilize the Apple Accelerate Framework for optimized BLAS/LAPACK performance.
I understand this could be a non-trivial task and that maintainers may not have time to implement it. I’m interested in experimenting with this myself — ideally with your guidance if possible.
I’d appreciate it if you could help clarify a few points before I get started:
- What’s the recommended way to create a new FFI crate for a backend?
- Roughly how many BLAS/LAPACK routines does a backend need to cover for rstsr compatibility?
- Are threading-related routines relevant for integration?
- What kind of tests should the backend pass to ensure correctness?
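To make the last question concrete, here is a minimal sketch of the kind of check I have in mind: compare a backend's GEMM output against a naive reference within a tolerance. The names `gemm_ref` and `all_close` are hypothetical helpers of my own, not part of rstsr, and the `expected` array is a hand-computed stand-in for what an Accelerate-backed call would return.

```rust
/// Naive row-major reference for C = alpha * A * B + beta * C.
/// A is m x k, B is k x n, C is m x n. (Hypothetical helper, not an rstsr API.)
fn gemm_ref(
    m: usize,
    n: usize,
    k: usize,
    alpha: f64,
    a: &[f64],
    b: &[f64],
    beta: f64,
    c: &mut [f64],
) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}

/// True if both slices have the same length and every pair of elements
/// agrees within a mixed absolute/relative tolerance.
fn all_close(x: &[f64], y: &[f64], rtol: f64, atol: f64) -> bool {
    x.len() == y.len()
        && x.iter()
            .zip(y)
            .all(|(p, q)| (p - q).abs() <= atol + rtol * q.abs())
}

fn main() {
    let a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]; // 2 x 3
    let b = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]; // 3 x 2
    let mut c = [0.0; 4]; // 2 x 2
    gemm_ref(2, 2, 3, 1.0, &a, &b, 0.0, &mut c);

    // Stand-in for the backend result; a real test would call the backend here.
    let expected = [58.0, 64.0, 139.0, 154.0];
    assert!(all_close(&c, &expected, 1e-12, 1e-12));
}
```

I assume a real test suite would also cover transposed operands, non-unit leading dimensions, and the other scalar types, but I don't know what rstsr expects here.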
Additionally, I noticed that OpenBLAS’s cblas.h differs from Netlib’s reference version. For example, OpenBLAS adds:
- cblas_??matcopy: In-place/out-of-place matrix scaling with optional transposition
- cblas_?geadd: General matrix addition (including transposed forms)
- cblas_?gemm_batch: Batched matrix multiplication
- BFLOAT16 and INT8 extensions
- Multithreading interfaces, etc.
Does rstsr currently rely on any of these extended routines? If so, should the Accelerate backend aim to support them as well?
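If a backend turns out to lack one of these extensions, I imagine a plain-Rust fallback could stand in for it. As an illustration only, the sketch below mimics the transposed, out-of-place case of the matcopy family (B = alpha * A^T) for row-major data. The function name `omatcopy_t` is hypothetical, and whether rstsr would want such a fallback is exactly what I'm asking.

```rust
/// Out-of-place scaled transpose for row-major data:
/// `a` is rows x cols, `b` receives alpha * a^T and is cols x rows.
/// (Hypothetical fallback, not an rstsr or Accelerate API.)
fn omatcopy_t(rows: usize, cols: usize, alpha: f64, a: &[f64], b: &mut [f64]) {
    assert_eq!(a.len(), rows * cols);
    assert_eq!(b.len(), rows * cols);
    for i in 0..rows {
        for j in 0..cols {
            // Element (i, j) of A lands at (j, i) of B, scaled by alpha.
            b[j * rows + i] = alpha * a[i * cols + j];
        }
    }
}

fn main() {
    let a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]; // 2 x 3
    let mut b = [0.0; 6]; // 3 x 2
    omatcopy_t(2, 3, 2.0, &a, &mut b);
    assert_eq!(b, [2.0, 8.0, 4.0, 10.0, 6.0, 12.0]);
}
```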
To be clear, this is a personal exploration driven purely by interest, and I don’t expect to complete it fully. Even so, I’d really appreciate your insights and any advice you can share.
Best regards,
Ionizing