Commit 905d6e4

Minor changes
Author: Raghuveer Devulapalli
1 parent c310a78 commit 905d6e4

1 file changed: +25 -19 lines changed

README.md

Lines changed: 25 additions & 19 deletions
@@ -25,14 +25,14 @@ objects. `Func` needs to have the following signature:
 
 Note that the return type of the key `type_t` needs to be one of the following
 : `[float, uint32_t, int32_t, double, uint64_t, int64_t]`. `object_qsort` has a
-space complexity of `O(N)`. Specifically, it requires `arrsize*(sizeof(type_t)`
-\+ `sizeof(uint32_t))` additional space. It allocates two `std::vectors`: one
-for storing all the keys and another storing the indexes of the object array.
-For performance reasons, we support `object_qsort` only when the array size
-is less than or equal to `UINT32_MAX`. An example usage of `object_qsort`
-is provided in the [examples](#Sort-an-array-of-Points-using-object_qsort)
-section. Refer to [section](#Performance-of-object_qsort) to get a sense
-of how fast this is relative to `std::sort`.
+space complexity of `O(N)`. Specifically, it requires `arrsize *
+sizeof(type_t)` bytes to store a vector with all the keys and an additional
+`arrsize * sizeof(uint32_t)` bytes to store the indexes of the object array.
+For performance reasons, we support `object_qsort` only when the array size is
+less than or equal to `UINT32_MAX`. An example usage of `object_qsort` is
+provided in the [examples](#Sort-an-array-of-Points-using-object_qsort)
+section. Refer to [section](#Performance-of-object_qsort) to get a sense of
+how fast this is relative to `std::sort`.
 
 ## Sort an array of built-in integers and floats
 ```cpp
@@ -143,23 +143,29 @@ array. You can read details of all the implementations
 [here](https://github.com/intel/x86-simd-sort/blob/main/src/README.md).
 
 ## Performance comparison on AVX-512: `object_qsort` v/s `std::sort`
-`object_qsort` relies on key-value sort which is currently accelerated only on
-AVX-512 (we plan to add AVX2 version soon). Benchmarks added in
-[bench-objsort.hpp](./benchmarks/bench-objsort.hpp) measures performance of
-`object_qsort` relative to `std::sort` when sorting an array of `struct Point
-{double x, y, z;}` and `struct Point {float x, y, z;}` for various metrics:
+Performance of `object_qsort` can vary significantly depending on the definition
+of the custom class and we highly recommend benchmarking before using it. For
+the sake of illustration, we provide a few examples in
+[./benchmarks/bench-objsort.hpp](./benchmarks/bench-objsort.hpp) which measures
+performance of `object_qsort` relative to `std::sort` when sorting an array of
+points in the cartesian coordinates represented by the class: `struct Point
+{double x, y, z;}` and `struct Point {float x, y, z;}`. We sort these points
+based on several different metrics:
 
 + sort by coordinate `x`
 + sort by Manhattan distance (relative to origin): `abs(x) + abs(y) + abs(z)`
 + sort by Euclidean distance (relative to origin): `sqrt(x*x + y*y + z*z)`
 + sort by Chebyshev distance (relative to origin): `max(x, y, z)`
 
-The data was collected on a processor with AVX-512 and is shown in the plot
-below. For the simplest of cases where we want to sort an array of struct by
-one of its members, `object_qsort` can be up-to 5x faster for 32-bit data type
-and about 4x for 64-bit data type. It tends to do better when the metric to
-sort by gets more complicated. Sorting by Euclidean distance can be up-to 10x
-faster.
+The performance data (shown in the plot below) can be collected by building the
+benchmarks suite and running `./builddir/benchexe --benchmark_filter==*obj*`.
+The data plot shown below was collected on a processor with AVX-512 because
+`object_qsort` is currently accelerated only on AVX-512 (we plan to add the
+AVX2 version soon). For the simplest of cases where we want to sort an array of
+struct by one of its members, `object_qsort` can be up-to 5x faster for 32-bit
+data type and about 4x for 64-bit data type. It tends to do even better when
+the metric to sort by gets more complicated. Sorting by Euclidean distance can
+be up-to 10x faster.
 
 ![alt text](./misc/object_qsort-perf.jpg?raw=true)
 
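
Editor's note: the README text in this diff describes sorting objects by a key, with one vector of keys and one vector of `uint32_t` indexes. The following is a minimal scalar sketch of that idea using only the standard library. It is not the library's implementation and does not call `object_qsort`. The names `sort_by_key`, `extra_bytes`, and the `key_*` helpers are hypothetical and exist only for illustration. The Chebyshev key here takes absolute values of the coordinates, which is the usual definition of that distance; the README writes the formula as `max(x, y, z)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <type_traits>
#include <vector>

// Hypothetical Point type mirroring the double-precision struct in the README.
struct Point {
    double x, y, z;
};

// The four sort keys listed in the README (all distances relative to origin).
inline double key_x(const Point &p) { return p.x; }
inline double key_manhattan(const Point &p)
{
    return std::abs(p.x) + std::abs(p.y) + std::abs(p.z);
}
inline double key_euclidean(const Point &p)
{
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
}
inline double key_chebyshev(const Point &p)
{
    // Assumption: absolute values are used, per the standard definition.
    return std::max({std::abs(p.x), std::abs(p.y), std::abs(p.z)});
}

// Bytes needed for the key vector plus the index vector, following the
// formula in the README: arrsize * sizeof(type_t) + arrsize * sizeof(uint32_t).
inline std::size_t extra_bytes(std::size_t arrsize, std::size_t key_size)
{
    return arrsize * key_size + arrsize * sizeof(std::uint32_t);
}

// Scalar stand-in for a key-based object sort (hypothetical helper):
// compute every key once, sort 32-bit indexes by key, then reorder objects.
template <typename T, typename Func>
void sort_by_key(std::vector<T> &objs, Func key_func)
{
    if (objs.empty()) return;
    // 32-bit indexes limit the supported array size, as the README notes.
    assert(objs.size() <= UINT32_MAX);
    using KeyT = std::decay_t<decltype(key_func(objs[0]))>;

    const std::size_t n = objs.size();
    std::vector<KeyT> keys(n);
    for (std::size_t i = 0; i < n; ++i) keys[i] = key_func(objs[i]);

    std::vector<std::uint32_t> idx(n);
    std::iota(idx.begin(), idx.end(), 0u);
    std::sort(idx.begin(), idx.end(),
              [&keys](std::uint32_t a, std::uint32_t b) {
                  return keys[a] < keys[b];
              });

    // For simplicity this sketch copies into a temporary vector, which uses
    // more memory than the key and index vectors counted by extra_bytes.
    std::vector<T> sorted;
    sorted.reserve(n);
    for (std::uint32_t i : idx) sorted.push_back(objs[i]);
    objs.swap(sorted);
}
```

Because each key is computed once per object rather than once per comparison, this layout tends to pay off more as the key gets more expensive, which is consistent with the README's remark that the speedup grows for metrics such as Euclidean distance. Actual speedups depend on the object type and hardware, so benchmarking remains the only reliable guide.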
