Skip to content

Investigate alternative hashing algorithms #73

@hendrikmuhs

Description

@hendrikmuhs

Compile time: An important and runtime wise expensive part are hashes used for minimization and value de-duplication. For states the algorithm is based on bob-jenkins hashing, for values it is some common string hashing.

Files:

https://github.com/KeyviDev/keyvi/blob/master/keyvi/include/keyvi/dictionary/fsa/internal/unpacked_state.h#L151
https://github.com/KeyviDev/keyvi/blob/master/keyvi/include/keyvi/dictionary/fsa/internal/value_store_persistence.h#L83

The idea is to try different alternatives to improve performance, state hashing should have the bigger impact.

Some links:

https://github.com/rurban/smhasher
https://github.com/Cyan4973/xxHash
https://github.com/RedSpah/xxhash_cpp/blob/master/xxhash/xxhash.hpp
https://github.com/leo-yuriev/t1ha

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions