[Support] Import SipHash c reference implementation. #94393

Merged
merged 3 commits into from
Jun 14, 2024
Changes from 1 commit
3 changes: 3 additions & 0 deletions llvm/lib/Support/CMakeLists.txt
@@ -127,6 +127,9 @@ endif()

add_subdirectory(BLAKE3)

# Temporarily ignore SipHash.cpp before we fully integrate it into LLVMSupport.
set(LLVM_OPTIONAL_SOURCES SipHash.cpp)

add_llvm_component_library(LLVMSupport
ABIBreak.cpp
AMDGPUMetadata.cpp
126 changes: 126 additions & 0 deletions llvm/lib/Support/README.md.SipHash
@@ -0,0 +1,126 @@
# SipHash

[![License:
CC0-1.0](https://licensebuttons.net/l/zero/1.0/80x15.png)](http://creativecommons.org/publicdomain/zero/1.0/)

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)


SipHash is a family of pseudorandom functions (PRFs) optimized for speed on short messages.
This is the reference C code of SipHash: portable, simple, optimized for clarity and debugging.

SipHash was designed in 2012 by [Jean-Philippe Aumasson](https://aumasson.jp)
and [Daniel J. Bernstein](https://cr.yp.to) as a defense against [hash-flooding
DoS attacks](https://aumasson.jp/siphash/siphashdos_29c3_slides.pdf).

SipHash is:

* *Simpler and faster* on short messages than previous cryptographic
algorithms, such as MACs based on universal hashing.

* *Competitive in performance* with insecure non-cryptographic algorithms, such as [fxhash](https://github.com/cbreeden/fxhash).

* *Cryptographically secure*, with no sign of weakness despite multiple [cryptanalysis](https://eprint.iacr.org/2019/865) [projects](https://eprint.iacr.org/2019/865) by leading cryptographers.

* *Battle-tested*, with successful integration in OSs (Linux kernel, OpenBSD,
FreeBSD, FreeRTOS), languages (Perl, Python, Ruby, etc.), libraries (OpenSSL libcrypto,
Sodium, etc.) and applications (WireGuard, Redis, etc.).

As a secure pseudorandom function (a.k.a. keyed hash function), SipHash can also be used as a secure message authentication code (MAC).
But SipHash is *not a hash* in the sense of a general-purpose keyless hash function such as BLAKE3 or SHA-3.
SipHash should therefore always be used with a secret key in order to be secure.


## Variants

The default SipHash is *SipHash-2-4*: it takes a 128-bit key, does 2 compression
rounds, 4 finalization rounds, and returns a 64-bit tag.

Variants can use a different number of rounds. For example, we proposed *SipHash-4-8* as a conservative version.

The following versions are not described in the paper but were designed and analyzed to fulfill applications' needs:

* *SipHash-128* returns a 128-bit tag instead of 64-bit. Versions with a specified number of rounds are SipHash-2-4-128, SipHash-4-8-128, and so on.

* *HalfSipHash* works with 32-bit words instead of 64-bit, takes a 64-bit key,
and returns 32-bit or 64-bit tags. For example, HalfSipHash-2-4-32 has 2
compression rounds, 4 finalization rounds, and returns a 32-bit tag.


## Security

(Half)SipHash-*c*-*d* with *c* ≥ 2 and *d* ≥ 4 is expected to provide the maximum PRF
security for any function with the same key and output size.

The standard PRF security goal allows the attacker access to the output of SipHash on messages chosen adaptively by the attacker.

Security is limited by the key size (128 bits for SipHash), such that
attackers searching 2<sup>*s*</sup> keys have chance 2<sup>*s*−128</sup> of finding
the SipHash key.
Security is also limited by the output size. In particular, when
SipHash is used as a MAC, an attacker who blindly tries 2<sup>*s*</sup> tags will
succeed with probability 2<sup>*s*−*t*</sup>, where *t* is the tag's bit size.
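
As a rough worked instance of the two bounds above (the values of *s* are picked purely for illustration and are not taken from the original text):

```latex
\begin{align*}
\Pr[\text{key found after } 2^{s} \text{ guesses}] &= 2^{s-128},
  & s = 64 &\;\Rightarrow\; 2^{-64} \\
\Pr[\text{tag forged after } 2^{s} \text{ blind tries}] &= 2^{s-t},
  & s = 32,\ t = 64 &\;\Rightarrow\; 2^{-32}
\end{align*}
```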


## Research

* [Research paper](https://www.aumasson.jp/siphash/siphash.pdf) "SipHash: a fast short-input PRF" (accepted at INDOCRYPT 2012)
* [Slides](https://cr.yp.to/talks/2012.12.12/slides.pdf) of the presentation of SipHash at INDOCRYPT 2012 (Bernstein)
* [Slides](https://www.aumasson.jp/siphash/siphash_slides.pdf) of the presentation of SipHash at the DIAC workshop (Aumasson)


## Usage

Running

```sh
make
```

will build tests for

* SipHash-2-4-64
* SipHash-2-4-128
* HalfSipHash-2-4-32
* HalfSipHash-2-4-64


```sh
./test
```

verifies 64 test vectors, and

```sh
./debug
```

does the same and prints intermediate values.

The code can be adapted to implement SipHash-*c*-*d*, the version of SipHash
with *c* compression rounds and *d* finalization rounds, by defining `cROUNDS`
or `dROUNDS` when compiling. This can be done with `-D` command line arguments
to many compilers, as in the example below.

```sh
gcc -Wall --std=c99 -DcROUNDS=2 -DdROUNDS=4 siphash.c halfsiphash.c test.c -o test
```

The `makefile` also takes *c* and *d* rounds values as parameters.

```sh
make cROUNDS=2 dROUNDS=4
```

If the number of rounds is changed, the bundled test vectors, which are
computed for the default 2-4 rounds, will no longer verify.
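
A minimal sketch of how the round counts affect the output, assuming a hypothetical helper `siphash_cd` that takes *c* and *d* at run time instead of through the `cROUNDS`/`dROUNDS` macros and that covers only the 64-bit tag. The names `siphash_cd` and `paper_example` are illustrative and are not part of the reference code. The expected value is the example from the SipHash paper (key `00..0f`, message `00..0e`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 64-bit word left by b bits (0 < b < 64). */
static uint64_t rotl64(uint64_t x, unsigned b) {
  return (x << b) | (x >> (64 - b));
}

/* Load 8 bytes as a little-endian 64-bit word. */
static uint64_t load64_le(const uint8_t *p) {
  uint64_t v = 0;
  for (int i = 7; i >= 0; --i)
    v = (v << 8) | p[i];
  return v;
}

/* One SipRound over the four-word state v[0..3]. */
static void sipround(uint64_t v[4]) {
  v[0] += v[1]; v[1] = rotl64(v[1], 13); v[1] ^= v[0]; v[0] = rotl64(v[0], 32);
  v[2] += v[3]; v[3] = rotl64(v[3], 16); v[3] ^= v[2];
  v[0] += v[3]; v[3] = rotl64(v[3], 21); v[3] ^= v[0];
  v[2] += v[1]; v[1] = rotl64(v[1], 17); v[1] ^= v[2]; v[2] = rotl64(v[2], 32);
}

/* Hypothetical helper: 64-bit SipHash-c-d with rounds chosen at run time. */
uint64_t siphash_cd(const uint8_t *in, size_t inlen, const uint8_t key[16],
                    int c, int d) {
  const uint64_t k0 = load64_le(key), k1 = load64_le(key + 8);
  uint64_t v[4] = {UINT64_C(0x736f6d6570736575) ^ k0,
                   UINT64_C(0x646f72616e646f6d) ^ k1,
                   UINT64_C(0x6c7967656e657261) ^ k0,
                   UINT64_C(0x7465646279746573) ^ k1};
  const size_t full = inlen - (inlen % 8);
  uint64_t b = (uint64_t)inlen << 56; /* low byte of inlen in the top byte */

  /* Compression: absorb each full 8-byte block with c rounds. */
  for (size_t off = 0; off < full; off += 8) {
    const uint64_t m = load64_le(in + off);
    v[3] ^= m;
    for (int i = 0; i < c; ++i) sipround(v);
    v[0] ^= m;
  }

  /* Final block: remaining bytes, little-endian, plus the length byte. */
  for (size_t i = 0; i < inlen % 8; ++i)
    b |= (uint64_t)in[full + i] << (8 * i);
  v[3] ^= b;
  for (int i = 0; i < c; ++i) sipround(v);
  v[0] ^= b;

  /* Finalization: d rounds, then fold the state into one word. */
  v[2] ^= 0xff;
  for (int i = 0; i < d; ++i) sipround(v);
  return v[0] ^ v[1] ^ v[2] ^ v[3];
}

/* SipHash-c-d of the paper's example: key 00..0f, message 00..0e. */
uint64_t paper_example(int c, int d) {
  uint8_t key[16], msg[15];
  for (int i = 0; i < 16; ++i) key[i] = (uint8_t)i;
  for (int i = 0; i < 15; ++i) msg[i] = (uint8_t)i;
  return siphash_cd(msg, sizeof msg, key, c, d);
}
```

With these definitions, `paper_example(2, 4)` reproduces the paper's value `0xa129ca6149be45e5`, while `paper_example(4, 8)` gives a different tag, which is why vectors generated for one round setting do not verify under another.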

## Intellectual property

This code is copyright (c) 2014-2023 Jean-Philippe Aumasson, Daniel J.
Bernstein. It is multi-licensed under

* [CC0](./LICENCE_CC0)
* [MIT](./LICENSE_MIT)
* [Apache 2.0 with LLVM exceptions](./LICENSE_A2LLVM)

185 changes: 185 additions & 0 deletions llvm/lib/Support/SipHash.cpp
@@ -0,0 +1,185 @@
/*
SipHash reference C implementation

Copyright (c) 2012-2022 Jean-Philippe Aumasson
<[email protected]>
Copyright (c) 2012-2014 Daniel J. Bernstein <[email protected]>

To the extent possible under law, the author(s) have dedicated all copyright
and related and neighboring rights to this software to the public domain
worldwide. This software is distributed without any warranty.

You should have received a copy of the CC0 Public Domain Dedication along
with
this software. If not, see
<http://creativecommons.org/publicdomain/zero/1.0/>.
*/

#include "siphash.h"
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* default: SipHash-2-4 */
#ifndef cROUNDS
#define cROUNDS 2
#endif
#ifndef dROUNDS
#define dROUNDS 4
#endif

#define ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))

#define U32TO8_LE(p, v) \
do { \
(p)[0] = (uint8_t)((v)); \
(p)[1] = (uint8_t)((v) >> 8); \
(p)[2] = (uint8_t)((v) >> 16); \
(p)[3] = (uint8_t)((v) >> 24); \
} while (0)

#define U64TO8_LE(p, v) \
do { \
U32TO8_LE((p), (uint32_t)((v))); \
U32TO8_LE((p) + 4, (uint32_t)((v) >> 32)); \
} while (0)

#define U8TO64_LE(p) \
(((uint64_t)((p)[0])) | ((uint64_t)((p)[1]) << 8) | \
((uint64_t)((p)[2]) << 16) | ((uint64_t)((p)[3]) << 24) | \
((uint64_t)((p)[4]) << 32) | ((uint64_t)((p)[5]) << 40) | \
((uint64_t)((p)[6]) << 48) | ((uint64_t)((p)[7]) << 56))

#define SIPROUND \
do { \
v0 += v1; \
v1 = ROTL(v1, 13); \
v1 ^= v0; \
v0 = ROTL(v0, 32); \
v2 += v3; \
v3 = ROTL(v3, 16); \
v3 ^= v2; \
v0 += v3; \
v3 = ROTL(v3, 21); \
v3 ^= v0; \
v2 += v1; \
v1 = ROTL(v1, 17); \
v1 ^= v2; \
v2 = ROTL(v2, 32); \
} while (0)

#ifdef DEBUG_SIPHASH
#include <inttypes.h> /* PRIx64, used by TRACE */
#include <stdio.h>

#define TRACE \
do { \
printf("(%3zu) v0 %016" PRIx64 "\n", inlen, v0); \
printf("(%3zu) v1 %016" PRIx64 "\n", inlen, v1); \
printf("(%3zu) v2 %016" PRIx64 "\n", inlen, v2); \
printf("(%3zu) v3 %016" PRIx64 "\n", inlen, v3); \
} while (0)
#else
#define TRACE
#endif

/*
Computes a SipHash value
*in: pointer to input data (read-only)
inlen: input data length in bytes (any size_t value)
*k: pointer to the key data (read-only), must be 16 bytes
*out: pointer to output data (write-only), outlen bytes must be allocated
outlen: length of the output in bytes, must be 8 or 16
*/
int siphash(const void *in, const size_t inlen, const void *k, uint8_t *out,
const size_t outlen) {

const unsigned char *ni = (const unsigned char *)in;
const unsigned char *kk = (const unsigned char *)k;

assert((outlen == 8) || (outlen == 16));
uint64_t v0 = UINT64_C(0x736f6d6570736575);
uint64_t v1 = UINT64_C(0x646f72616e646f6d);
uint64_t v2 = UINT64_C(0x6c7967656e657261);
uint64_t v3 = UINT64_C(0x7465646279746573);
uint64_t k0 = U8TO64_LE(kk);
uint64_t k1 = U8TO64_LE(kk + 8);
uint64_t m;
int i;
const unsigned char *end = ni + inlen - (inlen % sizeof(uint64_t));
const int left = inlen & 7;
uint64_t b = ((uint64_t)inlen) << 56;
v3 ^= k1;
v2 ^= k0;
v1 ^= k1;
v0 ^= k0;

if (outlen == 16)
v1 ^= 0xee;

for (; ni != end; ni += 8) {
m = U8TO64_LE(ni);
v3 ^= m;

TRACE;
for (i = 0; i < cROUNDS; ++i)
SIPROUND;

v0 ^= m;
}

switch (left) {
case 7:
b |= ((uint64_t)ni[6]) << 48;
/* FALLTHRU */
case 6:
b |= ((uint64_t)ni[5]) << 40;
/* FALLTHRU */
case 5:
b |= ((uint64_t)ni[4]) << 32;
/* FALLTHRU */
case 4:
b |= ((uint64_t)ni[3]) << 24;
/* FALLTHRU */
case 3:
b |= ((uint64_t)ni[2]) << 16;
/* FALLTHRU */
case 2:
b |= ((uint64_t)ni[1]) << 8;
/* FALLTHRU */
case 1:
b |= ((uint64_t)ni[0]);
break;
case 0:
break;
}

v3 ^= b;

TRACE;
for (i = 0; i < cROUNDS; ++i)
SIPROUND;

v0 ^= b;

if (outlen == 16)
v2 ^= 0xee;
else
v2 ^= 0xff;

TRACE;
for (i = 0; i < dROUNDS; ++i)
SIPROUND;

b = v0 ^ v1 ^ v2 ^ v3;
U64TO8_LE(out, b);

if (outlen == 8)
return 0;

v1 ^= 0xdd;

TRACE;
for (i = 0; i < dROUNDS; ++i)
SIPROUND;

b = v0 ^ v1 ^ v2 ^ v3;
U64TO8_LE(out + 8, b);

return 0;
}
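
The `switch (left)` fall-through above packs the trailing `inlen mod 8` bytes into the final block in little-endian order, with the low byte of `inlen` in the most significant byte. A loop-based sketch of the same packing follows; the helper name `last_block` is hypothetical and does not exist in the reference code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: builds the final 64-bit block from the trailing bytes.
   tail points at the first byte after the last full 8-byte block. */
uint64_t last_block(const uint8_t *tail, size_t inlen) {
  uint64_t b = (uint64_t)inlen << 56; /* inlen mod 256 lands in byte 7 */
  const size_t left = inlen & 7;      /* number of leftover bytes, 0..7 */
  for (size_t i = 0; i < left; ++i)
    b |= (uint64_t)tail[i] << (8 * i); /* byte i goes to bits 8i..8i+7 */
  return b;
}
```

For a 15-byte message `00..0e`, the seven leftover bytes `08..0e` give the block `0x0f0e0d0c0b0a0908`, which matches the last message word in the SipHash paper's example.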