Merged
6 changes: 3 additions & 3 deletions BloomFilter.md
@@ -170,7 +170,7 @@ unsigned int32 i = (h_top_bits * z_as_64_bit) >> 32;
```

The first line extracts the most significant 32 bits from `h` and
-assignes them to a 64-bit unsigned integer. The second line is
+assigns them to a 64-bit unsigned integer. The second line is
simpler: it just sets an unsigned 64-bit value to the same value as
the 32-bit unsigned value `z`. The purpose of having both `h_top_bits`
and `z_as_64_bit` be 64-bit values is so that their product is a
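
As a rough illustration of the three lines described in this hunk, the following Python sketch mirrors the multiply-then-shift step. The function name `block_index` and the explicit masks are assumptions made for this example (Python integers are unbounded, so the masks stand in for the fixed-width types); it is not taken from the specification.

```python
def block_index(h: int, z: int) -> int:
    """Map a 64-bit hash h to an index in [0, z), where z fits in 32 bits."""
    # Keep only the most significant 32 bits of the 64-bit hash.
    h_top_bits = (h & 0xFFFFFFFFFFFFFFFF) >> 32
    # Treat z as an unsigned 32-bit value widened to 64 bits.
    z_as_64_bit = z & 0xFFFFFFFF
    # The product of two 32-bit values fits in 64 bits; its top 32 bits
    # are the resulting index.
    return (h_top_bits * z_as_64_bit) >> 32
```

With `z = 1024`, a hash whose top 32 bits are all zero maps to 0 and a hash whose top 32 bits are all ones maps to 1023, so the result stays below `z`.
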
@@ -233,14 +233,14 @@ with a seed of 0 and [following the specification version

The `check` operation in SBBFs can return `true` for an argument that
was never inserted into the SBBF. These are called "false
-positives". The "false positive probabilty" is the probability that
+positives". The "false positive probability" is the probability that
any given hash value that was never `insert`ed into the SBBF will
cause `check` to return `true` (a false positive). There is not a
simple closed-form calculation of this probability, but here is an
example:

A filter that uses 1024 blocks and has had 26,214 hash values
-`insert`ed will have a false positive probabilty of around 1.26%. Each
+`insert`ed will have a false positive probability of around 1.26%. Each
of those 1024 blocks occupies 256 bits of space, so the total space
usage is 262,144. That means that the ratio of bits of space to hash
values is 10-to-1. Adding more hash values increases the denominator
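
The arithmetic in the example from this hunk can be reproduced with a short Python sketch. The space calculation follows the text directly. The `estimate_fpp` function is only an approximation under stated assumptions that do not appear in this hunk: every insert sets one bit in each of the eight 32-bit words of a single block, and the number of values landing in a block is modeled as Poisson-distributed. All names here are hypothetical.

```python
import math

BITS_PER_BLOCK = 256   # from the text: each block occupies 256 bits
WORDS_PER_BLOCK = 8    # assumption: a block is eight 32-bit words
BITS_PER_WORD = 32


def bits_per_value(num_blocks: int, num_values: int) -> float:
    """Ratio of bits of space to inserted hash values."""
    return num_blocks * BITS_PER_BLOCK / num_values


def estimate_fpp(num_blocks: int, num_values: int, max_load: int = 512) -> float:
    """Approximate false positive probability (Poisson block-load model)."""
    lam = num_values / num_blocks   # mean number of values per block
    pmf = math.exp(-lam)            # Poisson probability of load 0
    total = 0.0
    for load in range(max_load + 1):
        # Chance that one probed bit in a word is already set after
        # `load` inserts into this block.
        word_hit = 1.0 - (1.0 - 1.0 / BITS_PER_WORD) ** load
        # A false positive needs a hit in every word of the block.
        total += pmf * word_hit ** WORDS_PER_BLOCK
        pmf *= lam / (load + 1)     # advance to the next Poisson term
    return total


print(bits_per_value(1024, 26_214))   # roughly 10 bits per value
print(estimate_fpp(1024, 26_214))     # compare with the figure quoted above
```

Because the model is an approximation, its output is meant for comparison with the quoted figure, not as a replacement for it.
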
6 changes: 3 additions & 3 deletions Compression.md
@@ -65,7 +65,7 @@ that writers refrain from creating such pages by default for better interoperabi
### LZO

A codec based on or interoperable with the
-[LZO compression library](http://www.oberhumer.com/opensource/lzo/).
+[LZO compression library](https://www.oberhumer.com/opensource/lzo/).

### BROTLI

@@ -91,11 +91,11 @@ switch to the newer, interoperable `LZ4_RAW` codec.
A codec based on the Zstandard format defined by
[RFC 8478](https://tools.ietf.org/html/rfc8478). If any ambiguity arises
when implementing this format, the implementation provided by the
-[ZStandard compression library](https://facebook.github.io/zstd/)
+[Zstandard compression library](https://facebook.github.io/zstd/)

Review comment (Contributor): I double checked that https://facebook.github.io/zstd/ refers to it as Zstandard.

is authoritative.

### LZ4_RAW

A codec based on the [LZ4 block format](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).
If any ambiguity arises when implementing this format, the implementation
-provided by the [LZ4 compression library](http://www.lz4.org/) is authoritative.
+provided by the [LZ4 compression library](https://www.lz4.org/) is authoritative.
4 changes: 2 additions & 2 deletions Encodings.md
@@ -176,7 +176,7 @@ repetition and definition levels.
Supported Types: INT32, INT64

This encoding is adapted from the Binary packing described in
-["Decoding billions of integers per second through vectorization"](http://arxiv.org/pdf/1209.2137v5.pdf)
+["Decoding billions of integers per second through vectorization"](https://arxiv.org/pdf/1209.2137v5.pdf)
by D. Lemire and L. Boytsov.

In delta encoding we make use of variable length integers for storing various
@@ -207,7 +207,7 @@ Each block contains
positive integers for bit packing)
* the bitwidth of each block is stored as a byte
* each miniblock is a list of bit packed ints according to the bit width
-stored at the begining of the block
+stored at the beginning of the block
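
To make the last bullet concrete, here is a minimal Python sketch of packing a list of non-negative integers at a fixed bit width and unpacking them again. The helper names `bit_pack` and `bit_unpack` are hypothetical, and the bit order (each value placed from the least significant bit upward, bytes emitted in little-endian order) is an assumption of this example, not something stated in this hunk.

```python
from typing import List


def bit_pack(values: List[int], bit_width: int) -> bytes:
    """Pack non-negative ints into a byte string, bit_width bits each."""
    acc = 0
    for k, v in enumerate(values):
        if v < 0 or v >> bit_width:
            raise ValueError(f"{v} does not fit in {bit_width} bits")
        # Place value k directly above the bits of the previous values.
        acc |= v << (k * bit_width)
    n_bytes = (len(values) * bit_width + 7) // 8  # round up to whole bytes
    return acc.to_bytes(n_bytes, "little")


def bit_unpack(data: bytes, bit_width: int, count: int) -> List[int]:
    """Inverse of bit_pack: read `count` values of bit_width bits each."""
    acc = int.from_bytes(data, "little")
    mask = (1 << bit_width) - 1
    return [(acc >> (k * bit_width)) & mask for k in range(count)]
```

Under these assumptions, eight 3-bit values fit in exactly three bytes, and a bit width of 0 (all values equal to zero) needs no bytes at all.
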

To encode a block, we will:
