Commit 50bde0e

Solidify public API with a public header
1 parent dd1d056 commit 50bde0e

7 files changed

+169
-86
lines changed

README.md

Lines changed: 25 additions & 3 deletions
@@ -28,12 +28,34 @@ We note that we have an experimental `Clang.jl`-based symbol extractor that extr
2828
Because we export both the 32-bit (LP64) and 64-bit (ILP64) interfaces, if clients need header files defining the various BLAS/LAPACK functions, they must include headers defining the appropriate ABI.
2929
We provide headers broken down by interface (`LP64` vs. `ILP64`) as well as target (e.g. `x86_64-linux-gnu`), so to properly compile your code with headers provided by `libblastrampoline` you must add the appropriate `-I${prefix}/include/${interface}/${target}` flags.
3030

31-
When `libblastrampoline` loads a BLAS/LAPACK library, it will inspect it to determine whether it is a 32-bit (LP64) or 64-bit (ILP64) library, and depending on the result, it will forward from its own 32-bit/64-bit names to the names declared in the library its forwarding to. This allows automatic usage of multiple libraries with different interfaces but the same symbol names.
31+
When `libblastrampoline` loads a BLAS/LAPACK library, it will inspect it to determine whether it is a 32-bit (LP64) or 64-bit (ILP64) library, and depending on the result, it will forward from its own 32-bit/64-bit names to the names declared in the library it is forwarding to.
32+
This allows automatic usage of multiple libraries with different interfaces but the same symbol names.
3233

33-
`libblastrampoline` is also cognizant of the f2c calling convention incompatibilities introduced by some libraries such as [Apple's Accelerate](https://developer.apple.com/documentation/accelerate). It will automatically probe the library to determine its calling convention and employ a return-value conversion routine to fix the `float`/`double` return value differences. This support is only available on the `x86_64` and `i686` architectures, however these are the only systems on which the incompatibilty exists to our knowledge.
34+
`libblastrampoline` is also cognizant of the f2c calling convention incompatibilities introduced by some libraries such as [Apple's Accelerate](https://developer.apple.com/documentation/accelerate).
35+
It will automatically probe the library to determine its calling convention and employ a return-value conversion routine to fix the `float`/`double` return value differences.
36+
This support is only available on the `x86_64` and `i686` architectures; however, to our knowledge these are the only systems on which the incompatibility exists.
37+
38+
## `libblastrampoline`-specific API
39+
40+
`libblastrampoline` exports a simple configuration API including `lbt_forward()`, `lbt_get_config()`, `lbt_{set,get}_num_threads()`, and more.
41+
See the [public header file](src/libblastrampoline.h) for the most up-to-date documentation on the `libblastrampoline` API.
42+
43+
**Note**: all `lbt_*` functions should be considered thread-unsafe.
44+
Do not attempt to load two BLAS libraries on two different threads at the same time.
45+
46+
### Limitations
47+
48+
This library has the ability to work with a mixture of LP64 and ILP64 BLAS libraries, but is slightly hampered on certain platforms that do not have the capability to perform `RTLD_DEEPBIND`-style linking.
49+
As of the time of this writing, this includes FreeBSD and `musl` Linux.
50+
The impact of this is that you are unable to load an ILP64 BLAS that exports the typical LP64 names (e.g. `dgemm_`) at the same time as an actual LP64 BLAS (with any naming scheme).
51+
This is because without `RTLD_DEEPBIND`-style linking semantics, when the ILP64 BLAS tries to call one of its own functions, it will call the function exported by `libblastrampoline` itself, which will result in incorrect values and segfaults.
52+
To address this, `libblastrampoline` will detect if you attempt to do this and refuse to load a library that would cause this kind of confusion.
53+
You can always tell if your system is limited in this fashion by calling `lbt_get_config()` and checking the `build_flags` member for the `LBT_BUILDFLAGS_DEEPBINDLESS` flag.
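
As a rough illustration of that check, the following C sketch tests the flag on a config structure. The type and flag definitions are copied from the public header added in this commit so that the snippet compiles on its own; `config_is_deepbindless()` is a hypothetical helper, not part of `libblastrampoline`.

```c
#include <stddef.h>
#include <stdint.h>

/* Copied from `src/libblastrampoline.h` in this commit so the sketch stands alone.
 * Real code would `#include <libblastrampoline.h>` instead of redefining these. */
#define LBT_BUILDFLAGS_DEEPBINDLESS    0x01
#define LBT_BUILDFLAGS_F2C_CAPABLE     0x02

typedef struct {
    char * libname;
    void * handle;
    const char * suffix;
    int32_t interface;
    int32_t f2c;
} lbt_library_info_t;

typedef struct {
    lbt_library_info_t ** loaded_libs;
    uint32_t build_flags;
} lbt_config_t;

/* Hypothetical helper (an assumption for illustration): returns 1 if this build
 * lacks `RTLD_DEEPBIND`-style linking, 0 otherwise. */
int config_is_deepbindless(const lbt_config_t * config) {
    return (config->build_flags & LBT_BUILDFLAGS_DEEPBINDLESS) != 0;
}
```

With the real library linked in, the call would presumably be `config_is_deepbindless(lbt_get_config())`.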
3454

3555
### Version History
3656

37-
v1.1.0 - Added f2c autodetection for Accelerate.
57+
v1.2.0 - Added threading getter/setter API, added failure configuration API.
58+
59+
v2.0.0 - Added f2c autodetection for Accelerate, changed public API to `lbt_forward()` from `load_blas_funcs()`.
3860

3961
v1.0.0 - February 2021: Initial release with basic autodetection, LP64/ILP64 mixing and trampoline support.

src/Makefile

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ $(builddir)/libblastrampoline.$(SHLIB_EXT): $(MAIN_OBJS)
3838
install: $(builddir)/libblastrampoline.$(SHLIB_EXT)
3939
@mkdir -p $(prefix)/include/libblastrampoline
4040
-@cp -Ra $(LBT_ROOT)/include/* $(prefix)/include/libblastrampoline
41+
@cp -a $(LBT_ROOT)/src/libblastrampoline.h $(prefix)/include/
4142
@mkdir -p $(prefix)/$(binlib)
4243
@cp -a $(builddir)/libblastrampoline.$(SHLIB_EXT) $(prefix)/$(binlib)
4344
ifeq ($(OS),WINNT)

src/libblastrampoline.c

Lines changed: 1 addition & 14 deletions
@@ -11,20 +11,7 @@
1111
uint8_t deepbindless_interfaces_loaded = 0x00;
1212

1313
/*
14-
* Load the given `libname`, lookup all registered symbols within our `exported_func_names` list,
15-
* and `dlsym()` the symbol addresses to load the addresses for forwarding into that library.
16-
*
17-
* If `clear` is set to a non-zero value, all symbol addresses will be NULL'ed out before they are
18-
* looked up in `libname`. If `clear` is set to zero, symbols that do not exist in `libname` will
19-
* keep their previous value, which allows for loading a base library, then overriding some symbols
20-
* with a second shim library, integrating separate BLAS and LAPACK libraries, merging an LP64 and
21-
* ILP64 library into one, or all three use cases at the same time.
22-
*
23-
* Note that on certain platforms (currently musl linux and freebsd) you cannot load a non-suffixed
24-
* ILP64 and an LP64 BLAS at the same time. Read the note below about lacking RTLD_DEEPBIND
25-
* support in the system libc for more details.
26-
*
27-
* If `verbose` is set to a non-zero value, it will print out debugging information.
14+
* Load `libname`, clearing previous mappings if `clear` is set.
2815
*/
2916
LBT_DLLEXPORT int lbt_forward(const char * libname, int clear, int verbose) {
3017
if (verbose) {

src/libblastrampoline.h

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
1+
#include <stdint.h>
2+
3+
#ifdef __cplusplus
4+
extern "C" {
5+
#endif
6+
7+
// This is shamelessly stolen from https://github.com/JuliaLang/julia/blob/master/src/support/platform.h
8+
#if defined(__FreeBSD__)
9+
#define _OS_FREEBSD_
10+
#elif defined(__linux__)
11+
#define _OS_LINUX_
12+
#elif defined(_WIN32) || defined(_WIN64)
13+
#define _OS_WINDOWS_
14+
#elif defined(__APPLE__) && defined(__MACH__)
15+
#define _OS_DARWIN_
16+
#elif defined(__EMSCRIPTEN__)
17+
#define _OS_EMSCRIPTEN_
18+
#endif
19+
20+
// Borrow definition from `support/dtypes.h`
21+
#ifdef _OS_WINDOWS_
22+
# ifdef LIBRARY_EXPORTS
23+
# define LBT_DLLEXPORT __declspec(dllexport)
24+
# else
25+
# define LBT_DLLEXPORT __declspec(dllimport)
26+
# endif
27+
# define LBT_HIDDEN
28+
#else
29+
# if defined(LIBRARY_EXPORTS) && defined(_OS_LINUX_)
30+
# define LBT_DLLEXPORT __attribute__ ((visibility("protected")))
31+
# else
32+
# define LBT_DLLEXPORT __attribute__ ((visibility("default")))
33+
# endif
34+
# define LBT_HIDDEN __attribute__ ((visibility("hidden")))
35+
#endif
36+
37+
// The metadata stored on each loaded library
38+
typedef struct {
39+
// The library name as passed to `lbt_forward()`.
40+
// To get the absolute path to the library, use `dlpath()` or similar on `handle`.
41+
char * libname;
42+
void * handle;
43+
// The suffix used within this library as autodetected by `lbt_forward`.
44+
// Common values are `""` or `"64_"`.
45+
const char * suffix;
46+
// The interface type as autodetected by `lbt_forward`, see `LBT_INTERFACE_XXX` below
47+
int32_t interface;
48+
// The `f2c` status as autodetected by `lbt_forward`, see `LBT_F2C_XXX` below
49+
int32_t f2c;
50+
} lbt_library_info_t;
51+
52+
// Possible values for `interface` in `lbt_library_info_t`
53+
#define LBT_INTERFACE_LP64 32
54+
#define LBT_INTERFACE_ILP64 64
55+
#define LBT_INTERFACE_UNKNOWN -1
56+
57+
// Possible values for `f2c` in `lbt_library_info_t`
58+
#define LBT_F2C_PLAIN 0
59+
#define LBT_F2C_REQUIRED 1
60+
#define LBT_F2C_UNKNOWN -1
61+
62+
// The config type you get back from `lbt_get_config()`
63+
typedef struct {
64+
// The NULL-terminated list of libraries loaded via `lbt_forward()`.
65+
// This list is emptied if `clear` is set to `1` in a future `lbt_forward()` call.
66+
lbt_library_info_t ** loaded_libs;
67+
// Flags that describe this `libblastrampoline`'s build configuration.
68+
// See `LBT_BUILDFLAGS_XXX` below.
69+
uint32_t build_flags;
70+
} lbt_config_t;
71+
72+
// Possible values for `build_flags` in `lbt_config_t`
73+
#define LBT_BUILDFLAGS_DEEPBINDLESS 0x01
74+
#define LBT_BUILDFLAGS_F2C_CAPABLE 0x02
75+
76+
/*
77+
* Load the given `libname`, look up all registered symbols within our configured list of exported
78+
* symbols and `dlsym()` the symbols to load the addresses for forwarding into that library.
79+
*
80+
* If `clear` is set to a non-zero value, all symbol addresses will be reset to a pre-set value
81+
* before they are looked up in `libname`. If `clear` is set to zero, symbols that do not exist in
82+
* `libname` will keep their previous value, which allows for loading a base library, then overriding
83+
* some symbols with a second shim library, integrating separate BLAS and LAPACK libraries, merging an
84+
* LP64 and ILP64 library into one, or all three use cases at the same time. See the docstring for
85+
* `lbt_set_default_func` for how to control what `clear` sets.
86+
*
87+
* Note that on certain platforms (currently musl Linux and FreeBSD) you cannot load a non-suffixed
88+
* ILP64 and an LP64 BLAS at the same time. Read the note in the README about `RTLD_DEEPBIND`
89+
* support in the system libc for more details.
90+
*
91+
* If `verbose` is set to a non-zero value, it will print out debugging information.
92+
*/
93+
int lbt_forward(const char * libname, int clear, int verbose);
94+
95+
/*
96+
* Returns a structure describing the currently-loaded libraries as well as the build configuration
97+
* of this `libblastrampoline` instance. See the definition of `lbt_config_t` in this header file
98+
* for more details.
99+
*/
100+
const lbt_config_t * lbt_get_config();
101+
102+
/*
103+
* Returns the number of threads configured by the underlying BLAS library. In the event that
104+
* multiple libraries are loaded, returns the maximum over all returned values. The functions
105+
* it calls to determine the number of threads are configurable at runtime, see the docstring
106+
* for the `lbt_register_thread_interface()` function, although many common functions (such as
107+
* those for `OpenBLAS`, `MKL` and `BLIS`) are already registered by default.
108+
*/
109+
int32_t lbt_get_num_threads();
110+
111+
/*
112+
* Sets the number of threads in the underlying BLAS library. In the event that multiple
113+
* libraries are loaded, sets them all to the same value. The functions it calls to actually
114+
* set the number of threads are configurable at runtime, see the docstring for the
115+
* `lbt_register_thread_interface()` function, although many common functions (such as those
116+
* for `OpenBLAS`, `MKL` and `BLIS`) are already registered by default.
117+
*/
118+
void lbt_set_num_threads(int32_t num_threads);
119+
120+
/*
121+
* Register a new `get_num_threads()`/`set_num_threads()` pair. These functions are assumed to be
122+
* callable via the function prototypes `int32_t getter()` and `void setter(int32_t num_threads)`.
123+
* Note that due to register zero-extension on `x86_64` it is permissible that the setter actually
124+
* expects an `int64_t`, and the getter may return an `int64_t` as long as the value itself is not
125+
* larger than the maximum permissible `int32_t`.
126+
*
127+
* While `libblastrampoline` has built-in knowledge of some BLAS libraries' getter/setter
128+
* functions (such as those for `OpenBLAS`, `MKL` and `BLIS`) and will call them from
129+
* `lbt_{get,set}_num_threads()`, if the user loads some exotic BLAS that uses a different symbol
130+
* name for this functionality, they must register those getter/setter functions here to have them
131+
* automatically called whenever `lbt_{get,set}_num_threads()` is called.
132+
*/
133+
void lbt_register_thread_interface(const char * getter, const char * setter);
134+
135+
#ifdef __cplusplus
136+
} // extern "C"
137+
#endif
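
The header documents `loaded_libs` as a NULL-terminated list, so a client can walk it without a separate length field. A minimal sketch of that traversal follows, with the struct and macro definitions copied from the header above so it compiles on its own. `count_libs_with_interface()` is a hypothetical helper introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Copied from the public header in this commit; real code would include
 * `libblastrampoline.h` rather than repeat these definitions. */
#define LBT_INTERFACE_LP64      32
#define LBT_INTERFACE_ILP64     64
#define LBT_INTERFACE_UNKNOWN   -1

typedef struct {
    char * libname;
    void * handle;
    const char * suffix;
    int32_t interface;
    int32_t f2c;
} lbt_library_info_t;

typedef struct {
    lbt_library_info_t ** loaded_libs;
    uint32_t build_flags;
} lbt_config_t;

/* Hypothetical helper: counts the loaded libraries whose autodetected
 * interface matches `interface`, stopping at the NULL terminator. */
size_t count_libs_with_interface(const lbt_config_t * config, int32_t interface) {
    size_t count = 0;
    for (size_t idx = 0; config->loaded_libs[idx] != NULL; ++idx) {
        if (config->loaded_libs[idx]->interface == interface) {
            count++;
        }
    }
    return count;
}
```

Against the real library, `config` would come from `lbt_get_config()`; the loop shape mirrors the one visible in `src/threading.c`.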

src/libblastrampoline_internal.h

Lines changed: 4 additions & 27 deletions
@@ -4,8 +4,8 @@
44
#include <string.h>
55
#include <unistd.h>
66

7-
// Load in platform-detection macros
8-
#include "platform.h"
7+
// Load in our publicly-defined functions/types
8+
#include "libblastrampoline.h"
99

1010
#ifdef _OS_LINUX_
1111
#include <linux/limits.h>
@@ -45,40 +45,17 @@ extern const void ** exported_func64_addrs[];
4545

4646
// The config type you get back from lbt_get_config()
4747
#define MAX_TRACKED_LIBS 31
48-
typedef struct {
49-
char * libname;
50-
void * handle;
51-
const char * suffix;
52-
int32_t interface;
53-
int32_t f2c;
54-
} lbt_library_info_t;
55-
56-
#define LBT_INTERFACE_LP64 32
57-
#define LBT_INTERFACE_ILP64 64
58-
#define LBT_INTERFACE_UNKNOWN -1
59-
60-
#define LBT_F2C_PLAIN 0
61-
#define LBT_F2C_REQUIRED 1
62-
#define LBT_F2C_UNKNOWN -1
63-
64-
typedef struct {
65-
lbt_library_info_t ** loaded_libs;
66-
uint32_t build_flags;
67-
} lbt_config_t;
68-
69-
// The various "build_flags" that LBT can report back to the client
70-
#define LBT_BUILDFLAGS_DEEPBINDLESS 0x01
71-
#define LBT_BUILDFLAGS_F2C_CAPABLE 0x02
7248

7349
// Functions in `config.c`
7450
void init_config();
7551
void clear_loaded_libraries();
76-
LBT_DLLEXPORT const lbt_config_t * lbt_get_config();
7752
void record_library_load(const char * libname, void * handle, const char * suffix, int interface, int f2c);
7853

7954
// Functions in `win_utils.c`
55+
#ifdef _OS_WINDOWS_
8056
int wchar_to_utf8(const wchar_t * wstr, char *str, size_t maxlen);
8157
int utf8_to_wchar(const char * str, wchar_t * wstr, size_t maxlen);
58+
#endif
8259

8360
// Functions in `dl_utils.c`
8461
void * load_library(const char * path);

src/platform.h

Lines changed: 0 additions & 41 deletions
This file was deleted.

src/threading.c

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ LBT_DLLEXPORT int32_t lbt_get_num_threads() {
7474
/*
7575
* Sets the given number of threads for all loaded libraries.
7676
*/
77-
LBT_DLLEXPORT int32_t lbt_set_num_threads(int32_t nthreads) {
77+
LBT_DLLEXPORT void lbt_set_num_threads(int32_t nthreads) {
7878
const lbt_config_t * config = lbt_get_config();
7979
for (int lib_idx=0; config->loaded_libs[lib_idx] != NULL; ++lib_idx) {
8080
lbt_library_info_t * lib = config->loaded_libs[lib_idx];
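
The public header states that `lbt_get_num_threads()` returns the maximum over all values reported by the loaded libraries. That reduction can be sketched in isolation as below. This is not the code in `src/threading.c`: the `thread_getter_t` type, `max_num_threads()`, the two mock getters, and the choice of `0` as the result when no getter is available are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed prototype for a registered getter, following the `int32_t getter()`
 * shape described for `lbt_register_thread_interface()`. */
typedef int32_t (*thread_getter_t)(void);

/* Mock getters standing in for functions that would normally be looked up
 * in each loaded BLAS library. */
int32_t mock_getter_four(void)  { return 4; }
int32_t mock_getter_eight(void) { return 8; }

/* Hypothetical helper: calls every non-NULL getter and returns the largest
 * value seen. Returns 0 when no getter is available (an assumption). */
int32_t max_num_threads(const thread_getter_t * getters, size_t count) {
    int32_t max_threads = 0;
    for (size_t idx = 0; idx < count; ++idx) {
        if (getters[idx] == NULL) {
            continue;  /* this library exposes no known getter */
        }
        int32_t nthreads = getters[idx]();
        if (nthreads > max_threads) {
            max_threads = nthreads;
        }
    }
    return max_threads;
}
```

Taking the maximum rather than, say, the first value means a caller sizing buffers per thread gets an upper bound across every loaded library.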
