Commit 596b28b

Tweaks.
1 parent 3da142c commit 596b28b

README.md

Lines changed: 42 additions & 1 deletion
@@ -1 +1,42 @@
# java-string-compressor

A low-level, ultra-fast string compactor. Up to 50% reduction.

- 4 bits -> 50% compression rate
- 5 bits -> 37.5% compression rate (about 38%)
- 6 bits -> 25% compression rate
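
The rates in this list follow from the bit width alone: storing each character in `b` bits instead of 8 needs roughly `n * b / 8` bytes. A minimal sketch of that arithmetic, assuming no header or padding beyond rounding up to a whole byte (the library's actual output layout may differ; `PackedSize` and `packedSize` are illustrative names, not part of the library):

```java
final class PackedSize {
    // Bytes needed to store `chars` symbols at `bits` bits each,
    // rounded up to a whole byte. Assumes no header or extra padding.
    static long packedSize(long chars, int bits) {
        return (chars * bits + 7) / 8;
    }

    public static void main(String[] args) {
        long n = 100_000_000L; // 100 MB of one-byte ASCII characters
        for (int bits = 4; bits <= 6; bits++) {
            long size = packedSize(n, bits);
            double saved = 100.0 * (n - size) / n;
            System.out.printf("%d bits: %d bytes (%.1f%% smaller)%n", bits, size, saved);
        }
    }
}
```

The 5-bit case is where rounding shows up: 100 characters need 62.5 bytes, so 63 bytes are allocated.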

Fast: compressing a 10 MB string takes only a few milliseconds. Check out the benchmarks.

See the test directory for usage examples and edge cases.

### 4‑bit compressor (`FourBitAsciiCompressor`)

Compression rate: 50%.
Supports at most 16 distinct characters. Default charset: `0-9`, `;`, `#`, `-`, `+`, `.`, `,`

```java
import static java.nio.charset.StandardCharsets.US_ASCII;

byte[] data = str.getBytes(US_ASCII); // Assume str holds 100 MB of ASCII text.
byte[] c = new FourBitAsciiCompressor().compress(data); // c is about 50 MB.
```
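
To see how 16 symbols fit in 4 bits: each character is replaced by its index in the charset (0 to 15), and two indices share one byte. The sketch below illustrates that idea only and is not the library's implementation. The class name `NibblePackSketch`, the ordering of the charset string, and the handling of an odd trailing character are all assumptions made for illustration.

```java
final class NibblePackSketch {
    // The 16 default symbols listed in the README; this ordering is an assumption.
    static final String CHARSET = "0123456789;#-+.,";

    // Packs two symbols per byte: first symbol in the high nibble, second in the low nibble.
    static byte[] pack(String s) {
        byte[] out = new byte[(s.length() + 1) / 2];
        for (int i = 0; i < s.length(); i++) {
            int code = CHARSET.indexOf(s.charAt(i));
            if (code < 0) {
                throw new IllegalArgumentException("Unsupported character: " + s.charAt(i));
            }
            int shift = (i % 2 == 0) ? 4 : 0;
            out[i / 2] |= (byte) (code << shift);
        }
        return out;
    }

    // Reverses pack(). The caller supplies the original length because an
    // odd-length input leaves the final low nibble unused.
    static String unpack(byte[] packed, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int b = packed[i / 2] & 0xFF;
            int code = (i % 2 == 0) ? (b >>> 4) : (b & 0x0F);
            sb.append(CHARSET.charAt(code));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "3.14;-42";
        byte[] packed = pack(input);
        System.out.println(input.length() + " chars -> " + packed.length + " bytes");
        System.out.println(unpack(packed, input.length()));
    }
}
```

Any character outside the 16-symbol set is rejected here with an exception; how the real compressor treats such input is not stated in this README.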

### 5‑bit compressor (`FiveBitAsciiCompressor`)

Compression rate: 37.5% (about 38%).
Supports at most 32 distinct characters. Default charset: `A-Z`, space, `.`, `,`, `\`, `-`, `@`

```java
import static java.nio.charset.StandardCharsets.US_ASCII;

byte[] data = str.getBytes(US_ASCII); // Assume str holds 100 MB of ASCII text.
byte[] c = new FiveBitAsciiCompressor().compress(data); // c is about 62.5 MB.
```
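
Five bits do not divide a byte evenly, so codes straddle byte boundaries. One common way to handle that is a bit accumulator, sketched below with hypothetical names (`FiveBitPackSketch`, `pack`, `unpack`), an assumed charset ordering, and an assumed most-significant-bit-first layout. This is not the library's code, and the library may arrange bits differently.

```java
final class FiveBitPackSketch {
    // The 32 default symbols listed in the README; this ordering is an assumption.
    static final String CHARSET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ .,\\-@";

    static byte[] pack(String s) {
        byte[] out = new byte[(s.length() * 5 + 7) / 8];
        int acc = 0;     // pending bits, kept in the low end of the int
        int accBits = 0; // how many pending bits acc currently holds
        int pos = 0;
        for (int i = 0; i < s.length(); i++) {
            int code = CHARSET.indexOf(s.charAt(i));
            if (code < 0) {
                throw new IllegalArgumentException("Unsupported character: " + s.charAt(i));
            }
            acc = (acc << 5) | code;
            accBits += 5;
            if (accBits >= 8) {
                accBits -= 8;
                out[pos++] = (byte) (acc >>> accBits); // emit the oldest 8 bits
                acc &= (1 << accBits) - 1;             // keep only the leftover bits
            }
        }
        if (accBits > 0) {
            out[pos] = (byte) (acc << (8 - accBits)); // left-align the final partial byte
        }
        return out;
    }

    // Reverses pack(); the original length is needed to ignore trailing padding bits.
    static String unpack(byte[] packed, int length) {
        StringBuilder sb = new StringBuilder(length);
        int acc = 0;
        int accBits = 0;
        int pos = 0;
        for (int i = 0; i < length; i++) {
            if (accBits < 5) {
                acc = (acc << 8) | (packed[pos++] & 0xFF);
                accBits += 8;
            }
            accBits -= 5;
            sb.append(CHARSET.charAt((acc >>> accBits) & 0x1F));
            acc &= (1 << accBits) - 1;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "HELLO, WORLD";
        byte[] packed = pack(input);
        System.out.println(input.length() + " chars -> " + packed.length + " bytes");
        System.out.println(unpack(packed, input.length()));
    }
}
```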

### 6‑bit compressor (`SixBitAsciiCompressor`)

Compression rate: 25%.
Supports at most 64 distinct characters. The default charset, defined in `SixBitAsciiCompressor.DEFAULT_6BIT_CHARSET`, covers `A-Z`, `0-9`, and many punctuation marks.

```java
import static java.nio.charset.StandardCharsets.US_ASCII;

byte[] data = str.getBytes(US_ASCII); // Assume str holds 100 MB of ASCII text.
byte[] c = new SixBitAsciiCompressor().compress(data); // c is about 75 MB.
```
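
Choosing among the 4-, 5-, and 6-bit compressors comes down to how many distinct characters the data contains (at most 16, 32, or 64). The helper below is a hypothetical sketch, not part of the library: it counts distinct byte values and reports the smallest width that could represent them. It does not check whether those characters belong to any compressor's default charset, which is a separate question.

```java
import java.nio.charset.StandardCharsets;

final class BitWidthChooser {
    // Returns 4, 5, or 6 for data with at most 16, 32, or 64 distinct byte values,
    // or -1 when more than 64 distinct values occur.
    static int smallestBitWidth(byte[] data) {
        boolean[] seen = new boolean[256];
        int distinct = 0;
        for (byte b : data) {
            int v = b & 0xFF;
            if (!seen[v]) {
                seen[v] = true;
                distinct++;
            }
        }
        if (distinct <= 16) return 4;
        if (distinct <= 32) return 5;
        if (distinct <= 64) return 6;
        return -1;
    }

    public static void main(String[] args) {
        byte[] numeric = "2024-01-31;+17.5".getBytes(StandardCharsets.US_ASCII);
        byte[] text = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG".getBytes(StandardCharsets.US_ASCII);
        System.out.println("numeric data: " + smallestBitWidth(numeric) + " bits");
        System.out.println("uppercase text: " + smallestBitWidth(text) + " bits");
    }
}
```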