Commit e48262b

add readme
1 parent c514e5d commit e48262b

8 files changed

+281
-79
lines changed

README.MD

Lines changed: 189 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,199 @@
1-
# VarInt: fast & memory efficient arbitrary bit length integers.
1+
<p align="center">
2+
<img src="https://raw.githubusercontent.com/1pkg/varint/master/h_gopher.png" alt="varint"/>
3+
</p>
4+
5+
# VarInt: fast & memory efficient arbitrary bit width integers in Go.
26

37
## Introduction
48

9+
The VarInt Go library provides a fast and memory efficient array type for unsigned integers of arbitrary bit width.
10+
11+
The purpose of VarInt is to provide the most memory compact way to use and store unsigned integers of custom bit width. It does so by storing all the integers adjacent to each other inside a contiguous numeric byte slice. It allocates the underlying numeric byte slice only once on creation and doesn't expect to allocate any more memory afterwards. VarInt provides all the basic arithmetic and bitwise operations. Applying any of these operations requires internal bit manipulation, which implies a certain computational overhead; VarInt thus trades CPU time for memory. The overhead grows linearly with bit length and is comparable to the overhead of big.Int operations. Unlike big.Int, however, VarInt uses the exact number of bits to store each integer, which makes it extremely memory efficient. For example, to store a slice of 100 integers of 100 bits each, big.Int requires 12400 bits, while VarInt needs exactly 10000 bits. In the same fashion, VarInt also provides an efficient way to store integers smaller than 64 bits. For example, to store a slice of 1000 integers of 2 bits each, []uint8 requires 8000 bits, while VarInt needs exactly 2000 bits. Note, however, that VarInt is nowhere near as well optimized as big.Int, and provides diminishing returns as bit length grows above a certain threshold.
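The idea of storing integers adjacent to each other can be sketched with a small bit packed array. This is only an illustration under stated assumptions, not the VarInt implementation: the `packed` type and its `newPacked`, `set` and `get` helpers are hypothetical names, the backing store is a `[]uint64`, and widths are limited to 1..64 bits to keep the code short.

```go
package main

import "fmt"

// packed is a hypothetical illustration type, not part of the varint API.
// It stores fixed width unsigned integers back to back inside a []uint64.
type packed struct {
	width int      // bits per integer, 1..64
	words []uint64 // backing storage, allocated once
}

// newPacked allocates enough 64 bit words to hold n integers of the given width.
func newPacked(width, n int) packed {
	total := width * n
	return packed{width: width, words: make([]uint64, (total+63)/64)}
}

// mask returns a value with the low width bits set.
func (p packed) mask() uint64 {
	if p.width == 64 {
		return ^uint64(0)
	}
	return uint64(1)<<uint(p.width) - 1
}

// set writes v, truncated to width bits, at index i.
func (p packed) set(i int, v uint64) {
	v &= p.mask()
	pos := i * p.width
	w, off := pos/64, uint(pos%64)
	// Clear the target bits in the first word, then write the low part of v.
	p.words[w] = p.words[w]&^(p.mask()<<off) | v<<off
	// If the value straddles a word boundary, write the rest into the next word.
	if int(off)+p.width > 64 {
		spill := 64 - off
		p.words[w+1] = p.words[w+1]&^(p.mask()>>spill) | v>>spill
	}
}

// get reads the integer stored at index i.
func (p packed) get(i int) uint64 {
	pos := i * p.width
	w, off := pos/64, uint(pos%64)
	v := p.words[w] >> off
	if int(off)+p.width > 64 {
		v |= p.words[w+1] << (64 - off)
	}
	return v & p.mask()
}

func main() {
	// 10000 integers of 25 bits each share one backing slice.
	p := newPacked(25, 10000)
	for i := 0; i < 10000; i++ {
		p.set(i, uint64(i))
	}
	fmt.Println("value at 9999:", p.get(9999))
	fmt.Println("backing bits:", len(p.words)*64)
}
```

Every read and write needs shifts and masks, and sometimes touches two words, which is the CPU for memory tradeoff described above.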
12+
13+
Currently, as a conscious decision, several operations are implemented in favour of simplicity rather than computational complexity. This includes Mul, which uses standard long multiplication instead of fast multiplication algorithms like Karatsuba multiplication, and Div, which uses standard slow division instead of fast division algorithms. The main rationale behind this choice is that VarInt is most efficient when used for small and medium size integers in the range of 1 to 5000 bits in width, so asymptotic complexity should be less significant for this library. Note that VarInt carries a small fixed overhead internally: it allocates 2 separate uint cells at the beginning of the numeric byte slice to store length and bit length. It also collocates an extra Bits variable at the end of the numeric byte slice, which is used internally as a temporary computation buffer by many operations, including Mul, Div, Mod and Sort. Currently, for simplicity and consistency, most VarInt operations apply changes in place at the provided index and require the provided Bits to have exactly the same bit length; otherwise ErrorUnequalBitLengthCardinality is returned. Currently, VarInt provides only unsigned arithmetic.
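The standard long multiplication mentioned above can be sketched on 64 bit words as follows. This is a minimal illustration, not the code VarInt uses for Mul: `mulLong` is a hypothetical name, the operands are assumed to be little-endian word slices, and the result keeps the full product instead of truncating it to a fixed bit length.

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulLong multiplies two little-endian multi-word unsigned integers using
// schoolbook long multiplication and returns len(a)+len(b) result words.
// It is a hypothetical helper for illustration only.
func mulLong(a, b []uint64) []uint64 {
	res := make([]uint64, len(a)+len(b))
	for i, x := range a {
		var carry uint64
		for j, y := range b {
			// 64x64 -> 128 bit partial product.
			hi, lo := bits.Mul64(x, y)
			// Add the existing result word and the running carry to the low half.
			var c1, c2 uint64
			lo, c1 = bits.Add64(lo, res[i+j], 0)
			lo, c2 = bits.Add64(lo, carry, 0)
			res[i+j] = lo
			// The sum cannot overflow: x*y + res + carry fits in 128 bits.
			carry = hi + c1 + c2
		}
		res[i+len(b)] = carry
	}
	return res
}

func main() {
	// 2^64 * 2^64 = 2^128, printed as little-endian words.
	fmt.Println(mulLong([]uint64{0, 1}, []uint64{0, 1}))
}
```

The two nested loops make the cost quadratic in the number of words, which is acceptable for the small and medium bit widths the paragraph above targets.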
14+
515
## Examples
616

17+
**Allocates 10000 integers VarInt 25 bits in width. Fills it with the max value.**
18+
19+
```go
20+
vint, _ := varint.NewVarInt(25, 10000)
21+
b := varint.NewBits(25, []uint{ 33554431 })
22+
for i := 0; i < 10000; i++ {
23+
_ = vint.Set(i, b)
24+
}
25+
```
26+
27+
**Allocates 10000 integers VarInt 25 bits in width. Fills it with increasing values, then rotates it right.**
28+
29+
```go
30+
vint, _ := varint.NewVarInt(25, 10000)
31+
for i := 0; i < 10000; i++ {
32+
_ = vint.Set(i, varint.NewBitsBits(25, varint.NewBitsUint(uint(i))))
33+
}
34+
b := varint.NewBits(25, nil)
35+
_ = vint.Get(0, b)
36+
for i := 1; i < 10000; i++ {
37+
_ = vint.GetSet(i, b)
38+
}
39+
_ = vint.Set(0, b)
40+
```
41+
42+
**Allocates 10000 integers VarInt 50 bits in width. Fills it with random values, then finds min and max values.**
43+
44+
```go
45+
vint, _ := varint.NewVarInt(50, 10000)
46+
rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
47+
for i := 0; i < 10000; i++ {
48+
_ = vint.Set(i, varint.NewBitsRand(50, rnd))
49+
}
50+
bmin, bmax, b := varint.NewBits(50, nil), varint.NewBits(50, nil), varint.NewBits(50, nil)
51+
_, _ = vint.Get(0, bmin), vint.Get(0, bmax)
52+
for i := 1; i < 10000; i++ {
53+
_ = vint.Get(i, b)
54+
switch {
55+
case varint.Compare(b, bmin) == -1:
56+
bmin = varint.NewBitsBits(50, b)
57+
case varint.Compare(b, bmax) == 1:
58+
bmax = varint.NewBitsBits(50, b)
59+
}
60+
}
61+
```
62+
63+
**Allocates 10000 integers VarInt 50 bits in width. Fills it from big.Int channel, then subtracts 1000 from even numbers and adds 1 to odd numbers. Finally, converts integers back to the big.Int channel.**
64+
65+
```go
66+
ch := make(chan *big.Int)
67+
vint, _ := varint.NewVarInt(50, 10000)
68+
for i := 0; i < 10000; i++ {
69+
_ = vint.Set(i, varint.NewBitsBits(50, varint.NewBitsBigInt(<-ch)))
70+
}
71+
b1000, b1 := varint.NewBitsBits(50, varint.NewBitsUint(1000)), varint.NewBitsBits(50, varint.NewBitsUint(1))
72+
for i := 0; i < 10000; i++ {
73+
if i % 2 == 0 {
74+
_ = vint.Sub(i, b1000)
75+
} else {
76+
_ = vint.Add(i, b1)
77+
}
78+
}
79+
for i := 0; i < 10000; i++ {
80+
_ = vint.Get(i, b1)
81+
ch <- b1.BigInt()
82+
}
83+
```
84+
85+
**Allocates 10000 integers VarInt 50 bits in width. Fills it with random values, then multiplies each number by 2 if its bitwise negation is even.**
86+
87+
```go
88+
vint, _ := varint.NewVarInt(50, 10000)
89+
rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
90+
for i := 0; i < 10000; i++ {
91+
_ = vint.Set(i, varint.NewBitsRand(50, rnd))
92+
}
93+
b, b2 := varint.NewBits(50, nil), varint.NewBitsBits(50, varint.NewBitsUint(2))
94+
for i := 0; i < 10000; i++ {
95+
_ = vint.Get(i, b)
96+
_ = vint.Not(i)
97+
_ = vint.Mod(i, b2)
98+
_ = vint.GetSet(i, b)
99+
if b.Empty() {
100+
_ = vint.Mul(i, b2)
101+
}
102+
}
103+
```
104+
105+
**Allocates 10000 integers VarInt 100 bits in width. Fills it with random values, then sorts it in ascending order.**
106+
107+
```go
108+
vint, _ := varint.NewVarInt(100, 10000)
109+
rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
110+
for i := 0; i < 10000; i++ {
111+
_ = vint.Set(i, varint.NewBitsRand(100, rnd))
112+
}
113+
sort.Sort(varint.Sortable(vint))
114+
```
115+
116+
**Allocates 10000 integers VarInt 100 bits in width. Fills it with random values, then flushes it to a file and reads it back.**
117+
118+
```go
119+
vint, _ := varint.NewVarInt(100, 10000)
120+
rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
121+
for i := 0; i < 10000; i++ {
122+
_ = vint.Set(i, varint.NewBitsRand(100, rnd))
123+
}
124+
f, _ := os.Create("vint.bin")
125+
defer os.Remove(f.Name())
126+
_, _ = f.ReadFrom(varint.Encode(vint))
127+
f, _ = os.Open(f.Name())
128+
_ = varint.Decode(f, vint)
129+
```
130+
7131
## Benchmarks
8132

133+
**Arithmetic Operations 100000000 integers, 4 bits width**
134+
135+
| | ns/op | B/op | allocs/op | allocs/MB |
136+
| :---------: | :---: | :--: | :-------: | :-------: |
137+
| VarInt | 103.3 | 0 | 0 | 47.69 |
138+
| Uint8 Slice | 1.555 | 0 | 0 | 95.38 |
139+
140+
**Arithmetic Operations 10000000 integers, 64 bits width**
141+
142+
| | ns/op | B/op | allocs/op | allocs/MB |
143+
| :----------: | :---: | :--: | :-------: | :-------: |
144+
| VarInt | 99.46 | 0 | 0 | 76.30 |
145+
| Uint64 Slice | 2.398 | 0 | 0 | 76.30 |
146+
147+
**Arithmetic Operations 10000000 integers, 100 bits width**
148+
149+
| | ns/op | B/op | allocs/op | allocs/MB |
150+
| :----------: | :---: | :--: | :-------: | :-------: |
151+
| VarInt | 169.4 | 0 | 0 | 119.2 |
152+
| BigInt Slice | 504.9 | 120 | 3 | 495.9 |
153+
154+
**Arithmetic Operations 100000 integers, 10000 bits width**
155+
156+
| | ns/op | B/op | allocs/op | allocs/MB |
157+
| :----------: | :------: | :--: | :-------: | :-------: |
158+
| VarInt | 78451 | 0 | 0 | 119.2 |
159+
| BigInt Slice | 657.8 .0 | 148 | 2 | 145.8 |
160+
161+
**Bitwise Operations 100000000 integers, 4 bits width**
162+
163+
| | ns/op | B/op | allocs/op | allocs/MB |
164+
| :---------: | :---: | :--: | :-------: | :-------: |
165+
| VarInt | 76.84 | 0 | 0 | 47.69 |
166+
| Uint8 Slice | 2.42 | 0 | 0 | 95.38 |
167+
168+
**Bitwise Operations 10000000 integers, 64 bits width**
169+
170+
| | ns/op | B/op | allocs/op | allocs/MB |
171+
| :----------: | :---: | :--: | :-------: | :-------: |
172+
| VarInt | 79.06 | 0 | 0 | 76.30 |
173+
| Uint64 Slice | 2.451 | 0 | 0 | 76.30 |
174+
175+
**Bitwise Operations 10000000 integers, 100 bits width**
176+
177+
| | ns/op | B/op | allocs/op | allocs/MB |
178+
| :----------: | :---: | :--: | :-------: | :-------: |
179+
| VarInt | 134.5 | 0 | 0 | 119.2 |
180+
| BigInt Slice | 186.9 | 48 | 1 | 427.2 |
181+
182+
**Bitwise Operations 100000 integers, 10000 bits width**
183+
184+
| | ns/op | B/op | allocs/op | allocs/MB |
185+
| :----------: | :---: | :--: | :-------: | :-------: |
186+
| VarInt | 5273 | 0 | 0 | 119.2 |
187+
| BigInt Slice | 172.3 | 153 | 0 | 150.3 |
188+
189+
The benchmarks above were run with the following setup; see `BenchmarkVarIntOperations` for more details.
190+
191+
```
192+
go version go1.19.3 darwin/amd64
193+
Intel(R) Core(TM) i5-1030NG7 CPU @ 1.10GHz
194+
go test -run=^$ -bench ^BenchmarkVarIntOperations$ github.com/1pkg/varint -benchtime 1000000x
195+
```
196+
9197
## Licence
10198

11199
VarInt is licensed under the MIT License.

bits.go

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@ type Bits []uint
2727
// doesn't fit into the provided bit length, it is truncated to fit into the provided bit len.
2828
// In case the provided bit len is negative number, actual bit len is calculated from the bytes slice.
2929
// In case the provided bit len is 0, empty Bits marker is returned.
30+
// See Bits type for more details.
3031
func NewBits(blen int, bytes []uint) Bits {
3132
if blen == 0 {
3233
return []uint{0}
@@ -70,18 +71,21 @@ func NewBits(blen int, bytes []uint) Bits {
7071

7172
// NewBitsUint allocates and returns new Bits instance with
7273
// deduced bit length to exactly fit the provided number.
74+
// See Bits type for more details.
7375
func NewBitsUint(n uint) Bits {
7476
return NewBits(-1, []uint{n})
7577
}
7678

7779
// NewBitsBits allocates, copies and returns new Bits instance
78-
// from the provided Bits, effectively making a deep copy of it.
79-
func NewBitsBits(bits Bits) Bits {
80-
return NewBits(bits.BitLen(), bits.Bytes())
80+
// from the provided bit len and Bits, effectively making a deep copy of it.
81+
// See Bits type for more details.
82+
func NewBitsBits(blen int, bits Bits) Bits {
83+
return NewBits(blen, bits.Bytes())
8184
}
8285

8386
// NewBitsRand allocates and returns new Bits instance filled with
8487
// random bytes from provided Rand that fits the provided bit length.
88+
// See Bits type for more details.
8589
func NewBitsRand(blen int, rnd *rand.Rand) Bits {
8690
// Calculate number of whole words plus
8791
// one word if partial mod word is needed.
@@ -97,6 +101,7 @@ func NewBitsRand(blen int, rnd *rand.Rand) Bits {
97101
// NewBitsBigInt allocates, copies and returns new Bits instance
98102
// from the provided big.Int, it deduces bit length to exactly fit
99103
// the provided number. In case nil is provided empty Bits marker is returned.
104+
// See Bits type for more details.
100105
func NewBitsBigInt(i *big.Int) Bits {
101106
if i == nil {
102107
return NewBitsUint(0)
@@ -115,6 +120,7 @@ func NewBitsBigInt(i *big.Int) Bits {
115120
// converted to 2, base values above 62 are converted to 62. Leading plus '+' signs are ignored.
116121
// Separating underscore '_' signs are allowed and also ignored. In case empty or invalid
117122
// string is provided a special nil Bits marker is returned. The implementation follows big.Int.
123+
// See Bits type for more details.
118124
func NewBitsString(s string, base int) Bits {
119125
// Fix unsupported bases to closest supported.
120126
const minb, maxb = 2, 62
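
The `NewBitsBits` change in this file makes the copy take an explicit bit length, and the `NewBits` comment states that a value that doesn't fit is truncated to the provided bit length. A rough, self-contained sketch of that copy and truncate behaviour follows. It is an assumption about the semantics, not the library's code: `copyTruncated` is a hypothetical helper, it expects a non-negative bit length, and it returns an empty slice for a zero bit length rather than the empty Bits marker that `NewBits` returns.

```go
package main

import (
	"fmt"
	"math/bits"
)

// copyTruncated is a hypothetical helper, not the varint implementation.
// It deep copies little-endian words and keeps only the low blen bits.
func copyTruncated(blen int, words []uint) []uint {
	const wsize = bits.UintSize
	n := (blen + wsize - 1) / wsize // words needed for blen bits
	out := make([]uint, n)
	copy(out, words) // copies min(n, len(words)) words, the rest stay zero
	if rem := blen % wsize; rem != 0 && n > 0 {
		// Clear the bits above blen in the top word.
		out[n-1] &= uint(1)<<uint(rem) - 1
	}
	return out
}

func main() {
	src := []uint{0x1FF}
	dst := copyTruncated(8, src)
	// dst holds only the low 8 bits, src is left untouched.
	fmt.Printf("%#x %#x\n", dst[0], src[0])
}
```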

bits_test.go

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -67,7 +67,7 @@ func TestBitsNew(t *testing.T) {
6767
test(tname, th.T, func(h h) {
6868
b := NewBits(tcase.blen, tcase.bytes)
6969
bn := NewBitsUint(tcase.n)
70-
bb := NewBitsBits(tcase.bits)
70+
bb := NewBitsBits(tcase.blen, tcase.bits)
7171
bbig := NewBitsBigInt(tcase.big)
7272
brnd := NewBitsRand(tcase.blen, rnd)
7373
bs := NewBitsString(tcase.s, tcase.base)

h_gopher.png

94.6 KB

h_test.go

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,7 @@ func bench(bname string, b *testing.B, f func(b *testing.B)) {
5050
var after runtime.MemStats
5151
runtime.ReadMemStats(&after)
5252
m := float64(after.TotalAlloc-before.TotalAlloc) / k / k
53-
b.ReportMetric(m, "M_allocated")
53+
b.ReportMetric(m, "allocs/MB")
5454
})
5555
}
5656

@@ -85,7 +85,7 @@ func (h h) NewBits2B62(b62 string) (Bits, Bits) {
8585
} else {
8686
b1, b2 = h.NewBitsB62(b62), h.NewBitsB62(rb62s)
8787
}
88-
return b1, NewBits(b1.BitLen(), b2.Bytes())
88+
return b1, NewBitsBits(b1.BitLen(), b2)
8989
}
9090

9191
func (h *h) NewVarInt(bits, length int) VarInt {

support.go

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ import (
1010
// The Bits variable is collocated on VarInt itself, so bvar
1111
// doesn't allocate any new memory. The reserved Bits variable
1212
// is appended to the end of any VarInt and used internally for many operations
13-
// as a compuatation temporary buffer, including: Mul, Div, Mod, Sort.
13+
// as a computation temporary buffer, including: Mul, Div, Mod, Sort.
1414
// bvar is standalone function by choice to make VarInt more consistent and ergonomic.
1515
func bvar(vint VarInt, empty bool) Bits {
1616
if vint == nil {

varinit.go

Lines changed: 25 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -5,39 +5,31 @@ import math_bits "math/bits"
55
// wsize const alias to system uint word size in bits.
66
const wsize = math_bits.UintSize
77

8-
// VarInt defines the memory efficient unsigned integer array type that provides
9-
// basic arithmetic and bitwise operations on arbitrary variadic bit length integers.
10-
// The purpose of VarInt to provide the memory optimal way to use unsigned custom bit len integers.
11-
// It does so by storing all the integers adjacent to each other inside
12-
// continuous numeric bytes slice. To function, it allocates the underlying
13-
// numeric bytes slice only once on creation and doesn't expect to allocate
14-
// any more new memory for any of its operations thereafter. To apply any
15-
// of its operations, some bits manipulations are required which implies
16-
// some computational overhead. Thus providing a tradeoff between CPU time and memeory.
17-
// Overhead grows lineraly proportionally to bit len and comparable with overhead
18-
// provided by big.Int type operations. Unlike big.Int however, VarInt uses exact number
19-
// of bits to store the integers inside. Which makes VarInt extremely memory efficient. For example,
20-
// to store a slice of 100 integers 100 bit each, big.Int requires 12400 bits while
21-
// VarInt needs exactly 10000 bits (excluding fixed internal overhead).
22-
// In the same way VarInt also provides the efficient way to store integers smaller than 64 bits.
23-
// For example, to store a slice of 1000 integers 2 bit each, []uin8 requires 8000 bits
24-
// while VarInt needs exactly 2000 bits (excluding fixed internal overhead).
25-
// Note however, that VarInt is no way close to be optimized as well as big.Int, and
26-
// provides diminishing returns as bit length grows above certain threshold. Currently,
27-
// in a conscious decision multiple operations implemented in favor of simplicity and
28-
// not computational complexity, this includes Mul that uses standard long multiplication
29-
// instead of fast multiplication algorithms like Karatsuba multiplication, and Div that
30-
// uses standard slow division instead of fast division algorithms. The main rationale behind
31-
// the choice is the fact that VarInt has the most efficiency when used for small and medium size integers
32-
// in the range of 1 to 5000 bits, therefore asymptotic complexity should have less of impact.
33-
// VarInt carries a small fixed overhead internaly, it allocates 2 separate uint cells at the beginning
34-
// of the numeric bytes slice to store length and bit length somewhere. It also collocates extra Bits
35-
// variable at the end of numeric bytes slice which is used internally for many operations as
36-
// a compuatation temporary buffer, including: Mul, Div, Mod, Sort. Currently, for simplicity and consistency
37-
// most VarInt operations apply changes in place on the provided index and require the provided Bits to have
38-
// exactly the same bit len, otherwise ErrorUnequalBitLengthCardinality is returned. Currently, VarInt
39-
// provides only unsigned arithmetic. It heavily uses Bits data transfer type to carry the data between
40-
// API boundaries, see Bits for more details.
8+
// VarInt provides fast and memory efficient arbitrary bit length unsigned integer array type.
9+
//
10+
// The purpose of VarInt is to provide the most memory compact way to use and store unsigned integers of custom bit width.
11+
// It does so by storing all the integers adjacent to each other inside a continuous numeric byte slice.
12+
// It allocates the underlying numeric bytes slice only once on creation and doesn't expect to allocate any more memory afterwards.
13+
// VarInt provides all the basic arithmetic and bitwise operations. To apply any of these operations, internal bits manipulations are required
14+
// which implies certain computational overhead. Thus providing a tradeoff between CPU time and memory.
15+
// The overhead grows linearly with bit length and is comparable to the overhead of big.Int operations.
16+
// Unlike big.Int however, VarInt uses exact number of bits to store the integers inside. Which makes VarInt extremely memory efficient.
17+
// For example, to store a slice of 100 integers 100 bit each, big.Int requires 12400 bits, while VarInt needs exactly 10000 bits.
18+
// In the same fashion VarInt also provides an efficient way to store integers smaller than 64 bits.
19+
// For example, to store a slice of 1000 integers 2 bit each, []uint8 requires 8000 bits, while VarInt needs exactly 2000 bits.
20+
// However, note that VarInt is nowhere near as well optimized as big.Int, and provides diminishing returns as bit length grows above a certain threshold.
21+
//
22+
// Currently, in a conscious decision multiple operations are implemented in favour of simplicity and not computational complexity,
23+
// this includes Mul that uses standard long multiplication instead of fast multiplication algorithms like Karatsuba multiplication,
24+
// and Div that uses standard slow division instead of fast division algorithms.
25+
// The main rationale behind this choice is the fact that VarInt has the most efficiency when used for small and medium size integers
26+
// in the range of 1 to 5000 bit width, therefore asymptotic complexity should be less significant for this library.
27+
// Note that VarInt carries a small fixed overhead internally, it allocates 2 separate uint cells at the beginning of the numeric bytes slice
28+
// to store length and bit length. It also collocates extra Bits variable at the end of numeric bytes slice which is used internally
29+
// for many operations as a computation temporary buffer, including: Mul, Div, Mod, Sort.
30+
// Currently, for simplicity and consistency most VarInt operations apply changes in place on the provided index and require
31+
// the provided Bits to have exactly the same bit len, otherwise ErrorUnequalBitLengthCardinality is returned.
32+
// Currently, VarInt provides only unsigned arithmetic.
4133
type VarInt []uint
4234

4335
// NewVarInt allocates and returns VarInt instance that is capable to
