flanglet edited this page Apr 26, 2024 · 20 revisions

How do I select specific transforms at runtime instead of using compression levels?

Pass the transform, or a chain of transforms joined with +, to the -t (or --transform=) command line option.

Example: -t TEXT or -t RLT+TEXT+UTF+LZ
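When the transform chain is chosen by a script, the -t argument can be assembled from a list of names. The sketch below is only an illustration: join_transforms is a hypothetical helper, input.txt is a placeholder file name, and the command is printed rather than executed.

```shell
# join_transforms NAME...: join transform names with '+' (hypothetical helper).
# The function body runs in a subshell so the IFS change does not leak out.
join_transforms() ( IFS='+'; printf '%s\n' "$*" )

# Build the chain used in the example above.
transforms=$(join_transforms RLT TEXT UTF LZ)

# Print the resulting command line; input.txt is a placeholder.
echo "kanzi -c -i input.txt -o input.txt.knz -t $transforms"
```

Printing the command first makes it easy to check the chain before running it on real data.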

How many threads are going to be used during (de)compression?

By default, kanzi detects the number of cores in the CPU and uses half of the cores. The maximum number of parallel jobs allowed is hard coded to 64.

Providing -j 1 on the command line makes (de)compression use one core.

Providing -j 0 on the command line makes (de)compression use all available cores.
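The default described above can be approximated in a script, for instance to log how many jobs a run is likely to use. This is a rough sketch, not kanzi's actual code: the rounding (integer division) and the lower bound of one job are assumptions, and default_jobs is a hypothetical name.

```shell
# default_jobs [CORES]: estimate the default job count (half the cores,
# capped at 64). Rounding down and the minimum of 1 are assumptions.
default_jobs() {
    cores=${1:-$(getconf _NPROCESSORS_ONLN)}
    jobs=$(( cores / 2 ))
    if [ "$jobs" -lt 1 ]; then jobs=1; fi
    if [ "$jobs" -gt 64 ]; then jobs=64; fi
    echo "$jobs"
}

# Estimate for the current machine.
default_jobs
```

Passing an explicit core count (for example default_jobs 8) is handy when planning a run for another machine.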

Can kanzi (de)compress full folders?

Yes, if the input source provided on the command line is a directory, all files under that folder are going to be processed recursively.

The files will be processed in parallel if more than one core is available.

To avoid recursion and process only the top level folder, use a dot syntax:

E.g. -i ~/myfolder/. on Linux
E.g. -i c:\users\programs\. on Windows
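Before compressing a large tree, it can help to preview which files each mode would visit. The sketch below uses find to approximate the two behaviours; preview_files is a hypothetical helper, -maxdepth is a common find extension rather than strict POSIX, and kanzi's own traversal may differ in details.

```shell
# preview_files DIR [top]: list regular files under DIR.
# With "top" as second argument, only the top level is listed,
# which approximates the trailing "/." syntax described above.
preview_files() {
    dir=$1
    if [ "${2:-}" = "top" ]; then
        find "$dir" -maxdepth 1 -type f | sort
    else
        find "$dir" -type f | sort
    fi
}
```

For example, preview_files ~/myfolder top lists what -i ~/myfolder/. is expected to cover.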

When processing a folder, can kanzi avoid processing link files or dot files?

Yes. To skip link files, add --no-link to the command line.

To skip dot files, add --no-dot-file to the command line.
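To get a feel for what a run with both options would leave out, the filtering can be imitated with find. This is only an approximation under stated assumptions: it treats "link files" as symbolic links and "dot files" as files whose name starts with a dot, and preview_filtered is a hypothetical helper. Kanzi's exact matching rules may differ.

```shell
# preview_filtered DIR: list regular files under DIR, leaving out
# symbolic links and files whose name starts with a dot.
preview_filtered() {
    # Without -L, "-type f" never matches a symbolic link.
    # "! -name '.*'" drops names that begin with a dot.
    find "$1" -type f ! -name '.*' | sort
}
```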

Does kanzi support pipes and input/output redirection?

Yes, one way to do it is to use STDIN/STDOUT as input/output on the command line:

gunzip -c /tmp/kanzi.1.gz | kanzi -c -i stdin -l 2 -o /tmp/kanzi.1.knz

kanzi -d -i /tmp/silesia.tar.knz -o stdout | tar -xf -

Or, using redirection:

kanzi -c -f -l 2 < /tmp/enwik8 > /tmp/enwik8.knz

If -i is absent from the command line, the data is assumed to come from STDIN and go to STDOUT. Another example (processing a zero-length pseudo-file):

cat /proc/stat | kanzi -c -i stdin -l 0 -o /tmp/stat.knz
kanzi -d -i /tmp/stat.knz -o stdout

Note that, during compression, kanzi stores the size of the input file (when it is available) so that the decompressor can verify the output size after decompression. The original size is also used by the decompressor to optimize internal resources. Thus, providing -i and -o is recommended over redirection.
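A quick way to gain confidence in a piped setup is a round trip: compress through redirection, decompress to stdout, and compare with the original. The sketch below reuses only the option combinations shown above; roundtrip and the scratch file name data.knz are hypothetical, and kanzi is assumed to be on the PATH.

```shell
# roundtrip FILE: compress FILE via redirection, decompress to stdout,
# and compare the result with the original. Returns 0 when they match.
roundtrip() {
    src=$1
    workdir=$(mktemp -d) || return 1
    packed=$workdir/data.knz   # hypothetical scratch file name
    # Compress with redirection, then decompress to stdout into cmp.
    kanzi -c -f -l 2 < "$src" > "$packed" &&
        kanzi -d -i "$packed" -o stdout | cmp -s - "$src"
    status=$?
    rm -rf "$workdir"
    return "$status"
}
```

A call such as roundtrip /tmp/enwik8 && echo "round trip OK" reports success only when the decompressed bytes are identical to the input.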

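Does kanzi produce a seekable stream?

Yes, it is possible to decompress only one block or a sequence of consecutive blocks by using the --from and --to options during decompression. In the second run below, --from=4 --to=10 decodes blocks 4 through 9, so the --to block itself is not included.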

kanzi -d -i /tmp/book1.knz -v 4 -f

Block 1: 34451 => 36530 [0 ms] => 65536 [0 ms]
Block 2: 33295 => 35330 [0 ms] => 65536 [0 ms]
Block 3: 33702 => 35807 [0 ms] => 65536 [0 ms]
Block 4: 33555 => 35502 [0 ms] => 65536 [0 ms]
Block 5: 34057 => 36065 [0 ms] => 65536 [0 ms]
Block 6: 33556 => 35622 [0 ms] => 65536 [0 ms]
Block 7: 33357 => 35167 [0 ms] => 65536 [0 ms]
Block 8: 33460 => 35446 [0 ms] => 65536 [0 ms]
Block 9: 33428 => 35431 [0 ms] => 65536 [0 ms]
Block 10: 33177 => 35180 [0 ms] => 65536 [0 ms]
Block 11: 33218 => 35156 [0 ms] => 65536 [0 ms]
Block 12: 24871 => 26246 [0 ms] => 47875 [0 ms]

Decompressing:     1 ms
Input size:        394176
Output size:       768771
Throughput (KB/s): 750752


kanzi -d -i /tmp/book1.knz -v 4 -f  --from=4 --to=10

Block 4: 33555 => 35502 [0 ms] => 65536 [0 ms]
Block 5: 34057 => 36065 [0 ms] => 65536 [0 ms]
Block 6: 33556 => 35622 [0 ms] => 65536 [0 ms]
Block 7: 33357 => 35167 [0 ms] => 65536 [0 ms]
Block 8: 33460 => 35446 [0 ms] => 65536 [0 ms]
Block 9: 33428 => 35431 [0 ms] => 65536 [0 ms]

Decompressing:     1 ms
Input size:        394176
Output size:       393216
Throughput (KB/s): 384000
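The output sizes in the two runs above follow from simple block arithmetic, which can be useful for predicting how much data a partial decompression will produce. The sketch below assumes 1-based block numbers and an exclusive --to, both inferred from the sample output only; range_size is a hypothetical helper.

```shell
# range_size FROM TO BLOCK_SIZE ORIGINAL_SIZE: expected number of bytes
# produced when decoding blocks FROM .. TO-1 (1-based, TO exclusive).
range_size() {
    from=$1; to=$2; block=$3; total=$4
    start=$(( (from - 1) * block ))
    end=$(( (to - 1) * block ))
    # The last block may be shorter than BLOCK_SIZE.
    if [ "$end" -gt "$total" ]; then end=$total; fi
    if [ "$start" -gt "$end" ]; then start=$end; fi
    echo $(( end - start ))
}

# book1 example from above: blocks 4 to 9 of 65536 bytes each.
range_size 4 10 65536 768771
```

With the book1 figures this prints 393216, the output size reported by the second run.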

Can I find information about a compressed file without decompressing?

Yes, combine the verbosity (-v), --from and --to options:

./kanzi -d -i /tmp/silesia.tar.knz -f -v 3 --from=1 --to=1

1 file to decompress

Verbosity: 3
Overwrite: true
Using 4 jobs
Input file name: '/tmp/silesia.tar.knz'
Output file name: '/tmp/silesia.tar.knz.bak'

Decompressing /tmp/silesia.tar.knz ...
Bitstream version: 5
Checksum: false
Block size: 4194304 bytes
Using HUFFMAN entropy codec (stage 1)
Using PACK+LZ transform (stage 2)
Original size: 211957760 bytes


Decompressing:     17 ms
Input size:        68350949
Output size:       0
Throughput (KB/s): 0
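Individual fields can be pulled out of this log with a small filter, for instance to record the original size in a script. The sketch below assumes the log keeps the "Label: value" layout shown above and is written to standard output; header_field is a hypothetical helper, and labels containing regular expression metacharacters are not handled.

```shell
# header_field LABEL: read a kanzi log on stdin and print the value that
# follows "LABEL:" (for example "Original size" or "Block size").
header_field() {
    sed -n "s/^$1: *//p"
}

# Possible usage, assuming the log goes to standard output:
#   kanzi -d -i /tmp/silesia.tar.knz -f -v 3 --from=1 --to=1 | header_field "Original size"
```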

Will the kanzi bitstream be backward compatible in future releases?

Yes, the bitstream version is part of the bitstream header and is used during decompression to ensure that streams produced by older versions can still be decompressed.
