Commit 411b04c

IMPORT: slz: use a better hash for machines with a fast multiply
The current hash involves 3 simple shifts and additions so that it can be mapped to a multiply on architectures having a fast multiply. This is indeed what the compiler does on x86_64. A large range of values was scanned to try to find better factors on machines supporting such a fast multiply, and it turned out that the new factor 0x1af42f resulted in smoother hashes that provided on average 0.4% better compression on both the Silesia corpus and an mbox file composed of very compressible emails and incompressible attachments. It's even slightly better than CRC32C while being faster on Skylake. This patch enables this factor on archs with a fast multiply.

This is slz upstream commit 82ad1e75c13245a835c1c09764c89f2f6e8e2a40.
1 parent 248bbec commit 411b04c

2 files changed, +5 -0 lines changed

include/import/slz.h

Lines changed: 2 additions & 0 deletions
@@ -38,6 +38,7 @@
 #define UNALIGNED_LE_OK
 #define UNALIGNED_FASTER
 #define USE_64BIT_QUEUE
+#define HAVE_FAST_MULT
 #elif defined(__i386__) || defined(__i486__) || defined(__i586__) || defined(__i686__)
 #define UNALIGNED_LE_OK
 //#define UNALIGNED_FASTER
@@ -47,6 +48,7 @@
 #elif defined(__ARM_ARCH_8A) || defined(__ARM_FEATURE_UNALIGNED)
 #define UNALIGNED_LE_OK
 #define UNALIGNED_FASTER
+#define HAVE_FAST_MULT
 #endif
 
 /* Log2 of the size of the hash table used for the references table. */

src/slz.c

Lines changed: 3 additions & 0 deletions
@@ -388,6 +388,9 @@ static inline uint32_t slz_hash(uint32_t a)
 // but provides a slightly smoother hash
 __asm__ volatile("crc32l %1,%0" : "+r"(a) : "r"(0));
 return a >> (32 - HASH_BITS);
+#elif defined(HAVE_FAST_MULT)
+// optimal factor for HASH_BITS=12 and HASH_BITS=13 among 48k tested: 0x1af42f
+return (a * 0x1af42f) >> (32 - HASH_BITS);
 #else
 return ((a << 19) + (a << 6) - a) >> (32 - HASH_BITS);
 #endif
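
The commit message notes that the shift-and-add fallback can be mapped to a multiply. The sketch below is not part of the patch; it only illustrates that claim. Since (1 << 19) + (1 << 6) - 1 equals 0x8003f, the fallback gives the same result as multiplying by 0x8003f modulo 2^32, so the new branch differs from it only in the choice of factor. The function names and the HASH_BITS value of 13 are assumptions made for illustration (13 is one of the two sizes named in the patch comment).

```c
#include <stdint.h>

/* Assumption for this sketch: 13 is one of the two table sizes
 * mentioned in the patch comment; the real value lives in slz.h. */
#define HASH_BITS 13

/* Fallback hash as written in the diff: three shifts and adds. */
static inline uint32_t hash_shift_add(uint32_t a)
{
	return ((a << 19) + (a << 6) - a) >> (32 - HASH_BITS);
}

/* The same fallback expressed as one multiply:
 * (1 << 19) + (1 << 6) - 1 == 0x8003f, and uint32_t arithmetic
 * wraps modulo 2^32 in both forms. */
static inline uint32_t hash_mul_old(uint32_t a)
{
	return (a * 0x8003fU) >> (32 - HASH_BITS);
}

/* Hash enabled by this patch under HAVE_FAST_MULT. */
static inline uint32_t hash_mul_new(uint32_t a)
{
	return (a * 0x1af42fU) >> (32 - HASH_BITS);
}

/* Hypothetical helper: counts inputs in [0, n) for which the
 * shift-and-add form and the multiply form disagree. */
static uint32_t count_mismatches(uint32_t n)
{
	uint32_t bad = 0;

	for (uint32_t a = 0; a < n; a++)
		if (hash_shift_add(a) != hash_mul_old(a))
			bad++;
	return bad;
}
```

Both multiply variants keep only the top HASH_BITS bits of the 32-bit product, so every result is a valid index into a table of 1 << HASH_BITS entries.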
