Commit c612632

Merge pull request #185 from r-devulap/readme
Update readme file for static methods
2 parents 5910e49 + 45114c9 commit c612632

1 file changed: +22 -23 lines changed

src/README.md

Lines changed: 22 additions & 23 deletions
@@ -1,10 +1,15 @@
 # x86-simd-sort
 
 C++ header file library for SIMD based 16-bit, 32-bit and 64-bit data type
-sorting algorithms on x86 processors. We currently have AVX-512 and AVX2
-(32-bit and 64-bit only) based implementation of quicksort, quickselect,
-partialsort, argsort, argselect & key-value
-sort. The following API's are currently supported:
+sorting algorithms on x86 processors. We currently have AVX-512 and AVX2 based
+implementation of quicksort, quickselect, partialsort, argsort, argselect &
+key-value sort. The static methods can be used by including
+`src/x86simdsort-static-incl.h` file. Compiling them with the appropriate
+compiler flags will choose either the AVX-512 or AVX2 versions. For AVX-512, we
+recommend using -march=skylake-avx512 for 32-bit and 64-bit datatypes,
+-march=icelake-client for 16-bit datatype and -march=sapphirerapids for
+_Float16. For AVX2 just using -mavx2 will suffice. The following API's are
+currently supported:
 
 #### Quicksort
 
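The new README text says the compiler flags decide whether the AVX-512 or AVX2 code is built. A minimal sketch of how a caller might check which family the current flags enable, assuming the feature macros that GCC and Clang predefine (`__AVX512F__`, `__AVX512DQ__`, `__AVX512VL__`, `__AVX2__`). `enabled_simd_family` is a hypothetical helper, not part of x86-simd-sort, and the library's own selection logic is not shown in this diff and may test different macros.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of x86-simd-sort): reports which SIMD family
// the current compiler flags enable, based on compiler-predefined macros.
// The three AVX-512 macros mirror the -mavx512f -mavx512dq -mavx512vl flags
// used in the README's build line; this is an assumption about what matters.
inline std::string enabled_simd_family()
{
#if defined(__AVX512F__) && defined(__AVX512DQ__) && defined(__AVX512VL__)
    return "AVX-512";
#elif defined(__AVX2__)
    return "AVX2";
#else
    return "none";
#endif
}
```

With GCC or Clang, building with `-mavx2` alone should make this return `"AVX2"`, and building with `-march=skylake-avx512` should make it return `"AVX-512"`. A result of `"none"` suggests the flags recommended above were not passed.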

@@ -13,8 +18,7 @@ Equivalent to `qsort` in
 `std::sort` in [C++](https://en.cppreference.com/w/cpp/algorithm/sort).
 
 ```cpp
-void avx512_qsort<T>(T* arr, size_t arrsize, bool hasnan = false, bool descending = false);
-void avx2_qsort<T>(T* arr, size_t arrsize, bool hasnan = false, bool descending = false);
+void x86simdsortStatic::qsort<T>(T* arr, size_t arrsize, bool hasnan = false, bool descending = false);
 ```
 Supported datatypes: `uint16_t`, `int16_t`, `_Float16`, `uint32_t`, `int32_t`,
 `float`, `uint64_t`, `int64_t` and `double`. AVX2 versions currently support
@@ -30,8 +34,7 @@ Equivalent to `std::nth_element` in
 
 
 ```cpp
-void avx512_qselect<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false);
-void avx2_qselect<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false);
+void x86simdsortStatic::qselect<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false);
 ```
 Supported datatypes: `uint16_t`, `int16_t`, `_Float16`, `uint32_t`, `int32_t`,
 `float`, `uint64_t`, `int64_t` and `double`. AVX2 versions currently support
@@ -46,8 +49,7 @@ Equivalent to `std::partial_sort` in
 
 
 ```cpp
-void avx512_partial_qsort<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false)
-void avx2_partial_qsort<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false)
+void x86simdsortStatic::partial_qsort<T>(T* arr, size_t k, size_t arrsize, bool hasnan = false, bool descending = false)
 ```
 Supported datatypes: `uint16_t`, `int16_t`, `_Float16`, `uint32_t`, `int32_t`,
 `float`, `uint64_t`, `int64_t` and `double`. AVX2 versions currently support
@@ -61,8 +63,7 @@ Equivalent to `np.argsort` in
 [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.argsort.html).
 
 ```cpp
-void avx512_argsort<T>(T* arr, size_t *arg, size_t arrsize, bool hasnan = false, bool descending = false);
-void avx2_argsort<T>(T* arr, size_t *arg, size_t arrsize, bool hasnan = false, bool descending = false);
+void x86simdsortStatic::argsort<T>(T* arr, size_t *arg, size_t arrsize, bool hasnan = false, bool descending = false);
 ```
 Supported datatypes: `uint32_t`, `int32_t`, `float`, `uint64_t`, `int64_t` and
 `double`.
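The README describes argsort as equivalent to `np.argsort`: the input array is left untouched and `arg` receives the indices that would sort it. A scalar sketch of that contract, useful as a baseline when checking results from the SIMD routine. `reference_argsort` is a hypothetical helper, not part of x86-simd-sort; it fills `arg` with `0..n-1` itself and assumes the input contains no NaNs, which may differ from what the library expects of its caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical scalar reference (not the library's implementation).
// After the call, arg is a permutation of 0..n-1 ordered so that
// arr[arg[0]], arr[arg[1]], ... is sorted. Ties land in unspecified order.
template <typename T>
void reference_argsort(const T *arr, std::size_t *arg, std::size_t n,
                       bool descending = false)
{
    assert(n == 0 || (arr != nullptr && arg != nullptr));
    std::iota(arg, arg + n, std::size_t{0}); // start from the identity permutation
    std::sort(arg, arg + n, [&](std::size_t a, std::size_t b) {
        return descending ? arr[a] > arr[b] : arr[a] < arr[b];
    });
}
```

For the input `{3.0f, 1.0f, 2.0f}` this yields the indices `{1, 2, 0}`, and `{0, 2, 1}` when `descending` is true.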
@@ -74,8 +75,7 @@ Equivalent to `np.argselect` in
 [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.argpartition.html).
 
 ```cpp
-void avx512_argselect<T>(T* arr, size_t *arg, size_t k, size_t arrsize);
-void avx2_argselect<T>(T* arr, size_t *arg, size_t k, size_t arrsize);
+void x86simdsortStatic::argselect<T>(T* arr, size_t *arg, size_t k, size_t arrsize, bool hasnan = false);
 ```
 Supported datatypes: `uint32_t`, `int32_t`, `float`, `uint64_t`, `int64_t` and
 `double`.
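The link in this hunk points at NumPy's `argpartition`, so argselect can be read as an index-space `std::nth_element`: position `k` of `arg` ends up holding the index of the k-th smallest element, with smaller-or-equal elements indexed before it and larger-or-equal ones after. A scalar sketch of that contract under the same assumptions as before: `reference_argselect` is a hypothetical helper, not part of x86-simd-sort, it initializes `arg` itself, and it assumes no NaNs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical scalar reference (not the library's implementation).
// Partitions the index array so that arr[arg[k]] is the k-th smallest value;
// indices before position k refer to values <= it, indices after to values >= it.
template <typename T>
void reference_argselect(const T *arr, std::size_t *arg, std::size_t k,
                         std::size_t n)
{
    assert(k < n && arr != nullptr && arg != nullptr);
    std::iota(arg, arg + n, std::size_t{0}); // identity permutation
    std::nth_element(arg, arg + k, arg + n,
                     [&](std::size_t a, std::size_t b) { return arr[a] < arr[b]; });
}
```

Only the element at position `k` is fixed; the order within the two partitions is unspecified, so comparisons against the SIMD routine should check the partition property rather than exact index equality.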
@@ -84,10 +84,10 @@ The algorithm resorts to scalar `std::sort` if the array contains NaNs.
 
 #### Key-value sort
 ```cpp
-void avx512_qsort_kv<T1, T2>(T1* key, T2* value, size_t arrsize);
-void avx2_qsort_kv<T1, T2>(T1* key, T2* value, size_t arrsize);
+void x86simdsortStatic::keyvalue_qsort<T1, T2>(T1* key, T2* value, size_t arrsize, bool hasnan = false, bool descending = false);
 ```
-Supported datatypes: `uint64_t`, `int64_t` and `double`.
+Supported datatypes: `uint32_t`, `int32_t`, `float`, `uint64_t`, `int64_t` and
+`double`.
 
 ## Algorithm details
 
@@ -106,9 +106,7 @@ source code associated with that paper [3].
 ### Sample code `main.cpp`
 
 ```cpp
-#include "src/xss-common-includes.h"
-#include "src/xss-common-qsort.h"
-#include "src/avx512-32bit-qsort.hpp"
+#include "src/x86simdsort-static-incl.h"
 
 int main() {
     const int ARRSIZE = 1000;
@@ -120,7 +118,7 @@ int main() {
     }
 
     /* call avx512 quicksort */
-    avx512_qsort(arr.data(), ARRSIZE);
+    x86simdsortStatic::qsort(arr.data(), ARRSIZE);
     return 0;
 }
 
@@ -129,7 +127,8 @@ int main() {
 ### Build using g++
 
 ```
-g++ main.cpp -mavx512f -mavx512dq -O3
+g++ main.cpp -mavx512f -mavx512dq -mavx512vl -O3 /* for AVX-512 */
+g++ main.cpp -mavx2 -O3 /* for AVX2 */
 ```
 
 If you are using src files directly, then it is a header file only and we do
@@ -142,7 +141,7 @@ to include and build this library with your source code.
 ## Build requirements
 
 The sorting routines relies only on the C++ Standard Library and requires a
-relatively modern compiler to build (gcc 8.x and above).
+relatively modern compiler to build (ex: gcc 8.x and above).
 
 ## Instruction set requirements
 