# x86-simd-sort

C++ header file library for SIMD based 16-bit, 32-bit and 64-bit data type
- sorting algorithms on x86 processors. We currently have AVX-512 and AVX2
- (32-bit and 64-bit only) based implementation of quicksort, quickselect,
- partialsort, argsort, argselect & key-value
- sort. The following API's are currently supported:
+ sorting algorithms on x86 processors. We currently have AVX-512 and AVX2 based
+ implementation of quicksort, quickselect, partialsort, argsort, argselect &
+ key-value sort. The static methods can be used by including
+ `src/x86simdsort-static-incl.h` file. Compiling them with the appropriate
+ compiler flags will choose either the AVX-512 or AVX2 versions. For AVX-512, we
+ recommend using -march=skylake-avx512 for 32-bit and 64-bit datatypes,
+ -march=icelake-client for 16-bit datatype and -march=sapphirerapids for
+ _Float16. For AVX2 just using -mavx2 will suffice. The following API's are
+ currently supported:

#### Quicksort

@@ -13,8 +18,7 @@ Equivalent to `qsort` in
`std::sort` in [C++](https://en.cppreference.com/w/cpp/algorithm/sort).

```cpp
- void avx512_qsort<T>(T* arr, size_t arrsize, bool hasnan = false, bool descending = false);
- void avx2_qsort<T>(T* arr, size_t arrsize, bool hasnan = false, bool descending = false);
+ void x86simdsortStatic::qsort<T>(T* arr, size_t arrsize, bool hasnan = false, bool descending = false);
```
Supported datatypes: `uint16_t`, `int16_t`, `_Float16`, `uint32_t`, `int32_t`,
`float`, `uint64_t`, `int64_t` and `double`. AVX2 versions currently support
@@ -30,8 +34,7 @@ Equivalent to `std::nth_element` in


```cpp
- void avx512_qselect<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false);
- void avx2_qselect<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false);
+ void x86simdsortStatic::qselect<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false);
```
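The signature above follows the `std::nth_element` contract that this section names as its equivalent. A minimal scalar sketch of that contract may help when validating results. `reference_qselect` and `is_selected` are hypothetical helpers, not part of x86-simd-sort, and the sketch assumes ascending order with no NaNs in the input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference (not part of x86-simd-sort): after the call,
// arr[k] holds the value it would have in a fully sorted array, nothing
// before index k is greater than it and nothing after index k is smaller.
template <typename T>
void reference_qselect(T *arr, std::size_t k, std::size_t arrsize)
{
    assert(k < arrsize);
    std::nth_element(arr, arr + k, arr + arrsize);
}

// Checks the selection postcondition on an array that was already processed.
template <typename T>
bool is_selected(const std::vector<T> &v, std::size_t k)
{
    for (std::size_t i = 0; i < k; ++i)
        if (v[k] < v[i]) return false;
    for (std::size_t i = k + 1; i < v.size(); ++i)
        if (v[i] < v[k]) return false;
    return true;
}
```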
Supported datatypes: `uint16_t`, `int16_t`, `_Float16`, `uint32_t`, `int32_t`,
`float`, `uint64_t`, `int64_t` and `double`. AVX2 versions currently support
@@ -46,8 +49,7 @@ Equivalent to `std::partial_sort` in


```cpp
- void avx512_partial_qsort<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false)
- void avx2_partial_qsort<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false)
+ void x86simdsortStatic::partial_qsort<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false)
```
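Since this routine is described as equivalent to `std::partial_sort`, a scalar sketch of that behavior can serve as a correctness baseline. `reference_partial_qsort` and `smallest_k` are hypothetical names introduced here for illustration, and the sketch assumes ascending order with no NaNs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference (not part of x86-simd-sort): the first k
// slots end up holding the k smallest values in ascending order; the order
// of the remaining elements is unspecified.
template <typename T>
void reference_partial_qsort(T *arr, std::size_t k, std::size_t arrsize)
{
    assert(k <= arrsize);
    std::partial_sort(arr, arr + k, arr + arrsize);
}

// Convenience wrapper: returns the k smallest values of v, sorted.
template <typename T>
std::vector<T> smallest_k(std::vector<T> v, std::size_t k)
{
    reference_partial_qsort(v.data(), k, v.size());
    v.resize(k);
    return v;
}
```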
Supported datatypes: `uint16_t`, `int16_t`, `_Float16`, `uint32_t`, `int32_t`,
`float`, `uint64_t`, `int64_t` and `double`. AVX2 versions currently support
@@ -61,8 +63,7 @@ Equivalent to `np.argsort` in
[NumPy](https://numpy.org/doc/stable/reference/generated/numpy.argsort.html).

```cpp
- void avx512_argsort<T>(T* arr, size_t *arg, size_t arrsize, bool hasnan = false, bool descending = false);
- void avx2_argsort<T>(T* arr, size_t *arg, size_t arrsize, bool hasnan = false, bool descending = false);
+ void x86simdsortStatic::argsort<T>(T* arr, size_t *arg, size_t arrsize, bool hasnan = false, bool descending = false);
```
Supported datatypes: `uint32_t`, `int32_t`, `float`, `uint64_t`, `int64_t` and
`double`.
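
To illustrate what an argsort produces, here is a minimal scalar sketch: it leaves the data untouched and fills an index array so that reading the data through it yields ascending order. `reference_argsort` is a hypothetical helper, not part of x86-simd-sort. The sketch initializes the index array itself; whether the library routine expects `arg` to be pre-filled by the caller is not stated here, so treat that detail as an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical scalar reference (not part of x86-simd-sort): fills arg with
// indices such that arr[arg[0]] <= arr[arg[1]] <= ... ; arr is not modified.
template <typename T>
void reference_argsort(const T *arr, std::size_t *arg, std::size_t arrsize)
{
    assert(arrsize == 0 || (arr != nullptr && arg != nullptr));
    std::iota(arg, arg + arrsize, std::size_t{0});
    std::sort(arg, arg + arrsize,
              [arr](std::size_t a, std::size_t b) { return arr[a] < arr[b]; });
}

// Convenience overload returning the index array.
template <typename T>
std::vector<std::size_t> reference_argsort(const std::vector<T> &v)
{
    std::vector<std::size_t> arg(v.size());
    reference_argsort(v.data(), arg.data(), v.size());
    return arg;
}
```

The relative order of indices for equal values is unspecified in this sketch, as it is for `np.argsort` with its default sorting kind.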
@@ -74,8 +75,7 @@ Equivalent to `np.argselect` in
[NumPy](https://numpy.org/doc/stable/reference/generated/numpy.argpartition.html).

```cpp
- void avx512_argselect<T>(T* arr, size_t *arg, size_t k, size_t arrsize);
- void avx2_argselect<T>(T* arr, size_t *arg, size_t k, size_t arrsize);
+ void x86simdsortStatic::argselect<T>(T* arr, size_t *arg, size_t k, size_t arrsize);
```
Supported datatypes: `uint32_t`, `int32_t`, `float`, `uint64_t`, `int64_t` and
`double`.
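
The linked NumPy page describes `numpy.argpartition`, so a scalar sketch of that behavior is a partition over indices rather than values. `reference_argselect` is a hypothetical helper, not part of x86-simd-sort; as with the argsort case, the sketch fills `arg` itself, which is an assumption about the calling convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical scalar reference (not part of x86-simd-sort): after the call,
// arg[k] is the index of the element that would sit at position k in sorted
// order, indices before k refer to values no greater than arr[arg[k]] and
// indices after k refer to values no smaller. arr is not modified.
template <typename T>
void reference_argselect(const T *arr, std::size_t *arg, std::size_t k,
                         std::size_t arrsize)
{
    assert(k < arrsize);
    std::iota(arg, arg + arrsize, std::size_t{0});
    std::nth_element(arg, arg + k, arg + arrsize,
                     [arr](std::size_t a, std::size_t b) { return arr[a] < arr[b]; });
}
```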
@@ -84,10 +84,10 @@ The algorithm resorts to scalar `std::sort` if the array contains NaNs.

#### Key-value sort
```cpp
- void avx512_qsort_kv<T1, T2>(T1* key, T2* value, size_t arrsize);
- void avx2_qsort_kv<T1, T2>(T1* key, T2* value, size_t arrsize);
+ void x86simdsortStatic::keyvalue_qsort<T1, T2>(T1* key, T2* value, size_t arrsize);
```
- Supported datatypes: `uint64_t`, `int64_t` and `double`.
+ Supported datatypes: `uint32_t`, `int32_t`, `float`, `uint64_t`, `int64_t` and
+ `double`.
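
The key-value signature takes two parallel arrays. A plausible reading, sketched below in scalar form, is that keys are sorted in ascending order and each value moves together with its key. That reading is an assumption, and `reference_keyvalue_qsort` is a hypothetical helper, not part of x86-simd-sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical scalar reference (not part of x86-simd-sort): sorts key[] in
// ascending order and applies the same permutation to value[].
template <typename T1, typename T2>
void reference_keyvalue_qsort(T1 *key, T2 *value, std::size_t arrsize)
{
    assert(arrsize == 0 || (key != nullptr && value != nullptr));

    // Sort a permutation of indices by key instead of moving pairs around.
    std::vector<std::size_t> perm(arrsize);
    std::iota(perm.begin(), perm.end(), std::size_t{0});
    std::sort(perm.begin(), perm.end(),
              [key](std::size_t a, std::size_t b) { return key[a] < key[b]; });

    // Gather keys and values through the permutation, then copy back.
    std::vector<T1> sorted_keys(arrsize);
    std::vector<T2> sorted_values(arrsize);
    for (std::size_t i = 0; i < arrsize; ++i) {
        sorted_keys[i] = key[perm[i]];
        sorted_values[i] = value[perm[i]];
    }
    std::copy(sorted_keys.begin(), sorted_keys.end(), key);
    std::copy(sorted_values.begin(), sorted_values.end(), value);
}
```

The relative order of values that share the same key is unspecified in this sketch.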

## Algorithm details

@@ -106,9 +106,7 @@ source code associated with that paper [3].
### Sample code `main.cpp`

```cpp
- #include "src/xss-common-includes.h"
- #include "src/xss-common-qsort.h"
- #include "src/avx512-32bit-qsort.hpp"
+ #include "src/x86simdsort-static-incl.h"

int main() {
    const int ARRSIZE = 1000;
@@ -120,7 +118,7 @@ int main() {
    }

    /* call avx512 quicksort */
-     avx512_qsort(arr.data(), ARRSIZE);
+     x86simdsortStatic::qsort(arr.data(), ARRSIZE);
    return 0;
}
```
@@ -129,7 +127,8 @@ int main() {
### Build using g++

```
- g++ main.cpp -mavx512f -mavx512dq -O3
+ g++ main.cpp -mavx512f -mavx512dq -O3   # for AVX-512
+ g++ main.cpp -mavx2 -O3                 # for AVX2
```
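The two commands above differ only in their instruction-set flags. GCC and Clang define `__AVX512F__`, `__AVX512DQ__` and `__AVX2__` when the corresponding flags are passed, so a small check like the following can confirm what a given build targets. Whether the library header selects its code path from exactly these macros is an assumption, and `simd_target` and `require_simd_build` are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

// Reports which instruction set the compiler flags enabled for this
// translation unit, based on the compiler's predefined macros.
inline std::string simd_target()
{
#if defined(__AVX512F__) && defined(__AVX512DQ__)
    return "AVX-512";
#elif defined(__AVX2__)
    return "AVX2";
#else
    return "scalar";
#endif
}

// Debug-build guard: aborts if neither instruction set was enabled at
// compile time (the check is compiled out when NDEBUG is defined).
inline void require_simd_build()
{
    assert(simd_target() != "scalar");
}
```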

If you are using src files directly, then it is a header file only and we do
@@ -142,7 +141,7 @@ to include and build this library with your source code.
## Build requirements

The sorting routines relies only on the C++ Standard Library and requires a
- relatively modern compiler to build (gcc 8.x and above).
+ relatively modern compiler to build (ex: gcc 8.x and above).

## Instruction set requirements