-
Notifications
You must be signed in to change notification settings - Fork 44
Description
Compile time: An important and runtime wise expensive part are hashes used for minimization and value de-duplication. For states the algorithm is based on bob-jenkins hashing, for values it is some common string hashing.
Files:
https://github.com/KeyviDev/keyvi/blob/master/keyvi/include/keyvi/dictionary/fsa/internal/unpacked_state.h#L151
https://github.com/KeyviDev/keyvi/blob/master/keyvi/include/keyvi/dictionary/fsa/internal/value_store_persistence.h#L83
The idea is to try different alternatives to improve performance, state hashing should have the bigger impact.
Some links:
https://github.com/rurban/smhasher
https://github.com/Cyan4973/xxHash
https://github.com/RedSpah/xxhash_cpp/blob/master/xxhash/xxhash.hpp
https://github.com/leo-yuriev/t1ha