Commit f6e2e07

Merge pull request #2 from simdjson/add_dispatching
adding a discussion of dispatching
2 parents 7696ce1 + a449e13 commit f6e2e07

3 files changed

+304
-0
lines changed

cppcon2025/cppcon_2025_slides.md

Lines changed: 199 additions & 0 deletions
@@ -725,6 +725,205 @@ The magic:
**Write once, works everywhere™**

---

# Runtime dispatching

- One function semantically
- Several implementations
- Select the best one at runtime for performance

---

# Issue: x64 processors support different instructions

A Zen 5 CPU and a Pentium 4 CPU can be quite different.

```cpp
bool has_sse2() { /* query the CPU */ }
bool has_avx2() { /* query the CPU */ }
bool has_avx512() { /* query the CPU */ }
```

These functions cannot be `consteval`: the answer depends on the CPU the program runs on, which is only known at runtime.
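
One way these queries might be implemented is with the `__builtin_cpu_supports` builtin. The sketch below assumes GCC or Clang targeting x86; on any other compiler or architecture it conservatively reports no support. Mapping `has_avx512()` to the `avx512f` feature alone is a simplification made for this example.

```cpp
// Sketch: runtime CPU feature queries, assuming GCC or Clang on x86.
// Elsewhere, every query conservatively reports "unsupported".
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define SKETCH_X86_BUILTINS 1
#else
#define SKETCH_X86_BUILTINS 0
#endif

bool has_sse2() {
#if SKETCH_X86_BUILTINS
  __builtin_cpu_init(); // harmless if the CPU was already identified
  return __builtin_cpu_supports("sse2") != 0;
#else
  return false;
#endif
}

bool has_avx2() {
#if SKETCH_X86_BUILTINS
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx2") != 0;
#else
  return false;
#endif
}

bool has_avx512() {
#if SKETCH_X86_BUILTINS
  __builtin_cpu_init();
  // Assumption: only the AVX-512 foundation subset is checked here.
  return __builtin_cpu_supports("avx512f") != 0;
#else
  return false;
#endif
}
```

The results depend on the machine that runs the program, which is why none of these can be evaluated by the compiler.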

---

<img src="images/dispatching.svg" width="50%">

---

# Example: Sum function

```cpp
using SumFunc = float (*)(const float *, size_t);
```

---

# Set up a reassignable implementation

```cpp
SumFunc &get_sum_fnc() {
  static SumFunc sum_impl = sum_init;
  return sum_impl;
}
```

We initialize the function pointer with a special initialization function, `sum_init`.

---

```cpp
float sum_init(const float *data, size_t n) {
  SumFunc &sum_impl = get_sum_fnc();
  if (has_avx2()) {
    sum_impl = sum_avx2;
  } else if (has_sse2()) {
    sum_impl = sum_sse2;
  } else {
    sum_impl = sum_generic;
  }
  return sum_impl(data, n);
}
```

On the first call, `sum_init` replaces the pointer returned by `get_sum_fnc()` with the best implementation; after that, the pointer remains constant.
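
The initializer writes to a shared function pointer, so two threads making the first call at the same time would race on that assignment. A minimal sketch of one possible remedy follows, assuming the pointer is wrapped in `std::atomic`. This is an illustration rather than something the slides prescribe: the CPU queries are hard-coded and `sum_generic` stands in for the SIMD kernels so that the example is self-contained.

```cpp
#include <atomic>
#include <cstddef>

using SumFunc = float (*)(const float *, std::size_t);

// Stand-ins for the CPU queries (hard-coded assumptions for this sketch).
bool has_avx2() { return false; }
bool has_sse2() { return false; }

float sum_generic(const float *data, std::size_t n) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) { sum += data[i]; }
  return sum;
}
// Placeholders: a real build would provide SIMD kernels here.
float sum_sse2(const float *data, std::size_t n) { return sum_generic(data, n); }
float sum_avx2(const float *data, std::size_t n) { return sum_generic(data, n); }

float sum_init(const float *data, std::size_t n);

// The active implementation; it starts out pointing at the initializer.
std::atomic<SumFunc> &get_sum_fnc() {
  static std::atomic<SumFunc> sum_impl{sum_init};
  return sum_impl;
}

float sum_init(const float *data, std::size_t n) {
  SumFunc chosen = sum_generic;
  if (has_avx2()) {
    chosen = sum_avx2;
  } else if (has_sse2()) {
    chosen = sum_sse2;
  }
  // Every thread that reaches this point stores the same value.
  get_sum_fnc().store(chosen, std::memory_order_relaxed);
  return chosen(data, n);
}

float sum(const float *data, std::size_t n) {
  return get_sum_fnc().load(std::memory_order_relaxed)(data, n);
}
```

Concurrent first calls may each run `sum_init`, but they all select the same implementation, so the repeated store is benign.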

---

# Runtime dispatching and metaprogramming

- Metaprogramming happens at compile time.
- Runtime dispatching is fundamentally at runtime.

---

# Does your string need escaping?

- In JSON, you must escape control characters, quotes, and backslashes.
- Most strings in practice do not need escaping.

```cpp
bool simple_needs_escaping(std::string_view v) {
  for (unsigned char c : v) {
    if (json_quotable_character[c]) { return true; }
  }
  return false;
}
```
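
The lookup table `json_quotable_character` is not shown on the slide. Here is a minimal sketch of how such a table might be built at compile time, assuming it flags control characters (below 0x20), the double quote, and the backslash. The helper name `make_json_quotable_table` is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Assumed contents: true for every byte that JSON requires to be escaped.
constexpr std::array<bool, 256> make_json_quotable_table() {
  std::array<bool, 256> table{};
  for (std::size_t c = 0; c < 0x20; ++c) { table[c] = true; } // control chars
  table['"'] = true;
  table['\\'] = true;
  return table;
}

// Built once by the compiler; lookups at runtime are a single array access.
constexpr auto json_quotable_character = make_json_quotable_table();

bool simple_needs_escaping(std::string_view v) {
  for (unsigned char c : v) {
    if (json_quotable_character[c]) { return true; }
  }
  return false;
}
```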

---

## SIMD

- Stands for single instruction, multiple data
- Allows us to process 16 bytes or more with one instruction
- Supported on all modern CPUs (phone, laptop)

---

# SIMD (Pentium 4 and better)

```cpp
__m128i word = _mm_loadu_si128(
    reinterpret_cast<const __m128i *>(data)); // load 16 bytes
// check for control characters (bytes <= 31):
_mm_cmpeq_epi8(_mm_subs_epu8(word, _mm_set1_epi8(31)),
               _mm_setzero_si128());
```
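
To show how the comparison above might fit into a complete check, here is a sketch that scans 16 bytes at a time for control characters, quotes, and backslashes. The function name `sse2_needs_escaping` and the scalar tail loop are assumptions made for this example. When SSE2 is not enabled at compile time, the sketch falls back to the scalar loop alone.

```cpp
#include <cstddef>
#include <string_view>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h> // SSE2 intrinsics
#define SKETCH_HAS_SSE2 1
#else
#define SKETCH_HAS_SSE2 0
#endif

// Hypothetical helper: true if v contains a byte that JSON must escape.
bool sse2_needs_escaping(std::string_view v) {
  const char *data = v.data();
  std::size_t i = 0;
#if SKETCH_HAS_SSE2
  for (; i + 16 <= v.size(); i += 16) {
    __m128i word =
        _mm_loadu_si128(reinterpret_cast<const __m128i *>(data + i));
    // Unsigned saturating subtraction yields zero exactly for bytes <= 31.
    __m128i control = _mm_cmpeq_epi8(_mm_subs_epu8(word, _mm_set1_epi8(31)),
                                     _mm_setzero_si128());
    __m128i quote = _mm_cmpeq_epi8(word, _mm_set1_epi8('"'));
    __m128i backslash = _mm_cmpeq_epi8(word, _mm_set1_epi8('\\'));
    __m128i any = _mm_or_si128(control, _mm_or_si128(quote, backslash));
    // One bit per byte; any set bit means some byte matched.
    if (_mm_movemask_epi8(any) != 0) { return true; }
  }
#endif
  // Scalar loop for the remaining bytes (or for everything without SSE2).
  for (; i < v.size(); ++i) {
    unsigned char c = static_cast<unsigned char>(data[i]);
    if (c < 0x20 || c == '"' || c == '\\') { return true; }
  }
  return false;
}
```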

---

# SIMD (AVX-512)

```cpp
__m512i word = _mm512_loadu_si512(data); // load 64 bytes
// check for control characters (bytes <= 31):
_mm512_cmple_epu8_mask(word, _mm512_set1_epi8(31));
```

---

# Runtime dispatching works poorly with quick functions

- Calling a fast function like `fast_needs_escaping` without inlining prevents useful optimizations.
- Runtime dispatching implies a function call through a pointer, which the compiler generally cannot inline.

---

# Current solution

- No runtime dispatching (*sad face*).
- All x64 processors support Pentium 4-level SIMD (SSE2). Use that in a short function.
- If the programmer builds for a specific machine (`-march=native`), it is *easy* to use fancier tricks.

---

# Compile-time string escaping

- Often the 'keys' are known at compile time.

```cpp
struct Player {
  std::string username;
  int level;
  double health;
  std::vector<std::string> inventory;
};
```

- Keys are: `username`, `level`, `health`, `inventory`.

---

# Escape at compile time

```cpp
[:expand(std::meta::nonstatic_data_members_of(...)):] {
  constexpr auto key =
      std::define_static_string(consteval_to_quoted_escaped(
          std::meta::identifier_of(dm)));
  b.append_raw(key);
  b.append(':');
  // ...
};
```
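
The helper `consteval_to_quoted_escaped` is not shown on the slide. Without reflection, the same idea can be sketched in plain `constexpr` code for a key given as a string literal. The names `QuotedKey` and `quote_and_escape` are hypothetical, and for brevity the sketch escapes only quotes and backslashes; control characters are not handled.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical result type: a fixed-capacity buffer plus the used length.
template <std::size_t Capacity>
struct QuotedKey {
  std::array<char, Capacity> buffer{};
  std::size_t length = 0;
  constexpr std::string_view view() const {
    return std::string_view(buffer.data(), length);
  }
};

// Wraps a string literal in quotes and escapes '"' and '\\'.
// Worst case: every character doubles, plus the two surrounding quotes.
template <std::size_t N>
constexpr QuotedKey<2 * N + 2> quote_and_escape(const char (&key)[N]) {
  QuotedKey<2 * N + 2> out;
  out.buffer[out.length++] = '"';
  for (std::size_t i = 0; i + 1 < N; ++i) { // skip the trailing '\0'
    if (key[i] == '"' || key[i] == '\\') { out.buffer[out.length++] = '\\'; }
    out.buffer[out.length++] = key[i];
  }
  out.buffer[out.length++] = '"';
  return out;
}

// The initializer of a constexpr variable is evaluated by the compiler.
constexpr auto username_key = quote_and_escape("username");
static_assert(username_key.view() == "\"username\"");
```

At runtime, appending `username_key.view()` to an output buffer would then be a plain copy with no escaping logic.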

---

# Otherwise tricky to do

- Outside metaprogramming, lots of values are compile-time constants.
- But processing them at compile time is not always easy or convenient.

---

# Example: `g` returns 1

```cpp
#include <type_traits> // std::is_constant_evaluated (C++20)

constexpr int convert(const char *x) {
  if (std::is_constant_evaluated()) { return 0; }
  return 1;
}

int g() {
  constexpr char key[] = "name";
  auto x = convert(key); // not a constant-evaluated context
  return x;
}
```

---

# Conclusion

cppcon2025/images/dispatching.svg

Lines changed: 45 additions & 0 deletions
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
#include <cstddef>
#include <cstdio>
#include <iostream>

// Fake CPU checks, hard-coded for the demo
bool has_sse2() { return true; }
bool has_avx2() { return false; }

using SumFunc = float (*)(const float *, size_t);

float sum_generic(const float *data, size_t n) {
  float sum = 0.0f;
  for (size_t i = 0; i < n; ++i) {
    sum += data[i];
  }
  return sum;
}

float sum_sse2(const float *data, size_t n) {
  std::printf("sum_sse2...\n");
  return 1.0f; // fake
}

float sum_avx2(const float *data, size_t n) {
  return 1.0f; // fake
}

SumFunc &get_sum_fnc();
// Initialization function for the dispatching
float sum_init(const float *data, size_t n) {
  std::cout << "Initializing the sum function...\n";
  SumFunc &sum_impl = get_sum_fnc();
  if (has_avx2()) {
    sum_impl = sum_avx2;
  } else if (has_sse2()) {
    sum_impl = sum_sse2;
  } else {
    sum_impl = sum_generic;
  }
  return sum_impl(data, n);
}

// Holds the static function pointer
SumFunc &get_sum_fnc() {
  static SumFunc sum_impl = sum_init;
  return sum_impl;
}

// Main entry point, with dispatching
float sum(const float *data, size_t n) { return get_sum_fnc()(data, n); }

int main() {
  float data[] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f};
  size_t n = sizeof(data) / sizeof(data[0]);
  float result = sum(data, n);
  std::cout << "sum : " << result << std::endl;
  float result2 = sum(data, n);
  std::cout << "sum : " << result2 << std::endl;
  return 0;
}
