@@ -8,6 +8,8 @@ compression and decompression methods which facilitate the applications to
 easily integrate and use them.
 AOCL-Compression supports lz4, zlib/deflate, lzma, zstd, bzip2, snappy, and lz4hc
 based compression and decompression methods along with their native APIs.
+The library offers OpenMP-based multi-threaded implementations of the lz4, zlib,
+zstd and snappy compression methods.
 It supports the dynamic dispatcher feature that executes the most optimal
 function variant implemented using Function Multi-versioning thereby offering
 a single optimized library portable across different x86 CPU architectures.
@@ -23,7 +25,7 @@ Installation
 ------------
 
 1. Download the latest stable release from the GitHub repository:<br>
-https://github.amd.com/AOCL/aocl-compression
+https://github.com/amd/aocl-compression
 2. Install CMake on the machine where the sources are to be compiled.
 3. Make any one of the compilers GCC or Clang available on the machine.
 4. Then, use the cmake based build system to compile and generate AOCL-Compression <br>
@@ -90,7 +92,11 @@ Building with Visual Studio IDE (GUI)
 Microsoft Visual Studio project is generated.
 6. Click __Open Project__.
 Microsoft Visual Studio project for the source package __is launched__.
-7. Build the entire solution or the required projects.
+7. To build the multi-threaded library with AOCL_ENABLE_THREADS, set the
+LLVM OpenMP library path in the Linker->General option and the OpenMP library name
+in the Linker->Input option under the project properties. Set /openmp as an additional
+compilation option.
+8. Build the entire solution or the required projects.
 
 Building with Visual Studio IDE (command line)
 ----------------------------------------------
@@ -112,20 +118,29 @@ AOCL_LZ4_OPT_PREFETCH_BACKWARDS | Enable LZ4 optimizations related to backw
 SNAPPY_MATCH_SKIP_OPT | Enable Snappy match skipping optimization (Disabled by default)
 LZ4_FRAME_FORMAT_SUPPORT | Enable building LZ4 with Frame format and API support (Enabled by default)
 AOCL_LZ4HC_DISABLE_PATTERN_ANALYSIS | Disable Pattern Analysis in LZ4HC for level 9 (Enabled by default)
-AOCL_ZSTD_4BYTE_LAZY2_MATCH_FINDER | Enable 4-byte comparison for finding a potential better match candidate with Lazy2 compressor (Disabled by default)
+AOCL_ZSTD_SEARCH_SKIP_OPT_DFAST_FAST| Enable ZSTD match skipping optimization, and reduce search strength/tolerance for levels 1-4 (Disabled by default)
+AOCL_ZSTD_WILDCOPY_LONG | Faster wildcopy when match lengths are long in ZSTD decompression (Disabled by default)
 AOCL_TEST_COVERAGE | Enable GTest and AOCL test bench based CTest suite (Disabled by default)
+AOCL_ENABLE_LOG_FEATURE | Enable logging through the environment variable `AOCL_ENABLE_LOG` (Disabled by default)
+CODE_COVERAGE | Enable source code coverage. Only supported on Linux with the GCC compiler (Disabled by default)
+ASAN | Enable Address Sanitizer checks. Only supported on Linux/Debug build (Disabled by default)
+VALGRIND | Enable Valgrind checks. Only supported on Linux/Debug and incompatible with ASAN=ON (Disabled by default)
 BUILD_DOC | Build documentation for this library (Disabled by default)
-ZLIB_DEFLATE_FAST_MODE_2 | Enable optimization for deflate fast using Z_FIXED strategy. Do not combine with ZLIB_DEFLATE_FAST_MODE_3 (Disabled by default)
-ZLIB_DEFLATE_FAST_MODE_3 | Enable ZLIB deflate quick strategy. Do not combine with ZLIB_DEFLATE_FAST_MODE_2 (Disabled by default)
+ZLIB_DEFLATE_FAST_MODE | Enable ZLIB deflate quick strategy (Disabled by default)
 AOCL_LZ4_MATCH_SKIP_OPT_LDS_STRAT1 | Enable LZ4 match skipping optimization strategy-1 based on a larger base step size applied for long distance search (Disabled by default)
 AOCL_LZ4_MATCH_SKIP_OPT_LDS_STRAT2 | Enable LZ4 match skipping optimization strategy-2 by aggressively setting search distance on top of strategy-1. Preferred to be used with Silesia corpus (Disabled by default)
+AOCL_LZ4_NEW_PRIME_NUMBER | Enable the usage of a new prime number for LZ4 hashing function. Preferred to be used with Silesia corpus (Disabled by default)
+AOCL_LZ4_EXTRA_HASH_TABLE_UPDATES | Enable storing of additional potential matches to improve compression ratio. Recommended for higher compressibility use cases (Disabled by default)
+AOCL_LZ4_HASH_BITS_USED | Control the number of bits used for LZ4 hashing. Allowed values are LOW (low perf gain and less CR regression) and HIGH (high perf gain and high CR regression) (Disabled by default)
 AOCL_EXCLUDE_BZIP2 | Exclude BZIP2 compression method from the library build (Disabled by default)
 AOCL_EXCLUDE_LZ4 | Exclude LZ4 compression method from the library build. LZ4HC also gets excluded (Disabled by default)
 AOCL_EXCLUDE_LZ4HC | Exclude LZ4HC compression method from the library build (Disabled by default)
 AOCL_EXCLUDE_LZMA | Exclude LZMA compression method from the library build (Disabled by default)
 AOCL_EXCLUDE_SNAPPY | Exclude SNAPPY compression method from the library build (Disabled by default)
 AOCL_EXCLUDE_ZLIB | Exclude ZLIB compression method from the library build (Disabled by default)
 AOCL_EXCLUDE_ZSTD | Exclude ZSTD compression method from the library build (Disabled by default)
+AOCL_XZ_UTILS_LZMA_API_EXPERIMENTAL | Build with xz utils lzma APIs. Experimental feature with limited API support (Disabled by default)
+AOCL_ENABLE_THREADS | Enable multi-threaded compression and decompression using SMP based OpenMP threads (Disabled by default)
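
The options in this table are CMake cache entries, so they are passed at configure time in the `-D<OPTION>=ON` form that the README itself uses for `-DAOCL_ENABLE_LOG_FEATURE=ON`. The following is a minimal sketch, assuming sources in the current directory and a build tree in `./build`; that layout and the particular option set are illustrative choices, not something the README prescribes. The command is only printed so the combination can be reviewed before it is run.

```shell
# Illustrative selection of options from the table (all are OFF by default).
opts="-DAOCL_ENABLE_THREADS=ON -DAOCL_ENABLE_LOG_FEATURE=ON -DAOCL_EXCLUDE_BZIP2=ON"

# Assumed layout: sources in the current directory, build tree in ./build.
configure_cmd="cmake -S . -B build $opts"

# Print the command for review; run it afterwards with: eval "$configure_cmd"
echo "$configure_cmd"
```

Keeping the option list in one variable makes it easier to spot combinations the table rules out, such as VALGRIND together with ASAN.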
 
 Running AOCL-Compression Test Bench On Linux
 --------------------------------------------
@@ -165,18 +180,41 @@ Here, 5 is the level and 0 is the additional parameter passed to ZSTD method.
 Here, 5 is the level and 0 is the additional parameter passed to ZSTD method.
 
 
-* To run the test bench with error/debug/trace/info logs, use the command:<br>
-`aocl_compression_bench -a -t -v <input filename>`<br>
-Here, `-v` can be passed with a number such as v<n> that can take values:
-* 1 for Error (default)
-* 2 for Info
-* 3 for Debug
-* 4 for Trace.
-
+* To run the test bench with error/debug/trace/info logs, build the library with `-DAOCL_ENABLE_LOG_FEATURE=ON` and set the environment variable `AOCL_ENABLE_LOG` to any of the following:<br>
+* `AOCL_ENABLE_LOG=ERR` for Error logs.
+* `AOCL_ENABLE_LOG=INFO` for Error, Info logs.
+* `AOCL_ENABLE_LOG=DEBUG` for Error, Info, Debug logs.
+* `AOCL_ENABLE_LOG=TRACE` for Error, Info, Debug, Trace logs.<br>
+Note: When building the library for highest performance, do not enable `AOCL_ENABLE_LOG_FEATURE`.
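
A minimal sketch of the environment-variable form described in the added lines, assuming the library was built with `-DAOCL_ENABLE_LOG_FEATURE=ON`. The input file name is hypothetical, and the `-a -t` flags are carried over from the earlier form of this command shown in the removed lines; the benchmark is only invoked when both the binary and the input file are present.

```shell
# Pick one of ERR, INFO, DEBUG or TRACE; each level also includes the ones before it.
export AOCL_ENABLE_LOG=DEBUG

# Hypothetical input file name, used only for this sketch.
input="sample.bin"

# Run the benchmark only if the binary is on PATH and the input file exists.
if command -v aocl_compression_bench >/dev/null 2>&1 && [ -f "$input" ]; then
    aocl_compression_bench -a -t "$input"
fi
```

Because the level is read from the environment, it can be changed between runs without rebuilding the library.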
+
+
+* To run the test bench for only compression or only decompression <br>
+of a given input file, use the command:<br>
+`aocl_compression_bench -rcompress <input filename>` or <br>
+`aocl_compression_bench -rdecompress -ezstd <compressed input filename>` or <br>
@@ -229,6 +267,27 @@ Following are a few sample commands that can be executed in the build directory
 To run GTest test cases for a specific method<br>
 `ctest -R <METHOD_NAME_IN_CAPITALS>`
 
+Running source code coverage using GCOV
+---------------------------------------
+
+To measure source code coverage, use the CODE_COVERAGE option while configuring the CMake build. Run CMake with the custom target option 'code-coverage' to execute tests and generate code coverage data. The code coverage reports are generated in the 'coverage/html_report' subdirectory of the build directory. Open the HTML files in a browser to view the coverage information.
+
+Following is the sample command usage to run code coverage:
+Use the VALGRIND option for the Valgrind memory check and the ASAN option for the ASAN memory check while configuring the CMake build. The VALGRIND and ASAN options cannot be enabled together.
+
+Following are the commands to execute in the 'build' directory to run memory checks.
+
+To run Valgrind memory check<br>
+`ctest -T memcheck`
+
+To run ASAN memory check<br>
+`ctest`
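
Since VALGRIND and ASAN are mutually exclusive and each pairs with a different `ctest` invocation, the two flows can be sketched side by side. The helper `memcheck_plan` below is hypothetical and only prints what to use. `-DCMAKE_BUILD_TYPE=Debug` is the standard CMake way to request a Debug build, assumed here because the option table limits both checks to Linux/Debug. Whether other options, such as AOCL_TEST_COVERAGE, are also required for `ctest` to find tests is not stated in this section.

```shell
# Hypothetical helper: print the configure flags and test command for one checker.
memcheck_plan() {
    case "$1" in
        valgrind) flags="-DCMAKE_BUILD_TYPE=Debug -DVALGRIND=ON"; test_cmd="ctest -T memcheck" ;;
        asan)     flags="-DCMAKE_BUILD_TYPE=Debug -DASAN=ON";     test_cmd="ctest" ;;
        *)        echo "choose exactly one of: valgrind, asan" >&2; return 1 ;;
    esac
    echo "configure flags: $flags"
    echo "run in build dir: $test_cmd"
}

memcheck_plan valgrind
memcheck_plan asan
```

Taking a single mode argument makes it impossible to request both checkers in one configuration.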
+
 Running Performance Benchmarking
 --------------------------------
 
@@ -253,6 +312,38 @@ Generating Documentation
 - Documents will be generated in HTML format in the folder __docs/html__ . Open the index.html file in any browser to view the documentation.
 - CMake will use the existing Doxygen if available. Else, it will prompt the user to install doxygen and try again.
 
+Enabling/disabling optimizations
+--------------------------------
+- AOCL optimizations can be disabled by setting the environment variable AOCL_DISABLE_OPT to ON.
+- Reference code paths are taken in such a scenario.
+- This needs to be set before launching the application for it to take effect.
+- If optimization is turned off via aocl_compression_desc::optOff (= 1) passed to aocl_llc_setup(), then reference code paths are taken.
+- If optimization is turned on via aocl_compression_desc::optOff (= 0) passed to aocl_llc_setup(), then AOCL_DISABLE_OPT is
+additionally checked and can override the aocl_compression_desc::optOff value.
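
The interaction between `optOff` and `AOCL_DISABLE_OPT` described in these bullets amounts to "either switch selects the reference path". The function below is a hypothetical model of that rule written for illustration; it is not library code and does not call the library.

```shell
# Hypothetical model of the decision described above; not library code.
# $1 stands for the optOff value passed to aocl_llc_setup() (0 or 1).
# AOCL_DISABLE_OPT is read from the environment, as the README describes.
code_path() {
    if [ "$1" = "1" ] || [ "${AOCL_DISABLE_OPT:-}" = "ON" ]; then
        echo reference
    else
        echo optimized
    fi
}

(unset AOCL_DISABLE_OPT; code_path 0)   # prints "optimized"
(unset AOCL_DISABLE_OPT; code_path 1)   # prints "reference"
(AOCL_DISABLE_OPT=ON; code_path 0)      # prints "reference"
```

The subshells keep each case isolated, mirroring the requirement that the variable be set before the application is launched.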
+
+Enabling specific instructions (ISA)
+------------------------------------
+- AOCL optimizations can be restricted to certain ISAs by setting the environment variable
+AOCL_ENABLE_INSTRUCTIONS. Supported values are SSE2, AVX, AVX2 and AVX512.
+- This ensures optimized code paths with ISAs above the set value are not taken. For example, if
+it is set to AVX, no AVX2 and AVX512 optimized code paths are taken.
+- This needs to be set before launching the application for it to take effect.
+- It takes precedence over the aocl_compression_desc::optLevel setting passed to aocl_llc_setup().
+- Note: When calling the aocl_llc_setup() API from multiple threads, changing the aocl_compression_desc::optOff
+and aocl_compression_desc::optLevel values between threads can lead to undefined behaviour.
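
The capping rule can be made concrete by ranking the four supported values in the order the README lists them. The two functions below are a hypothetical model for illustration only; they do not reflect how the library's dispatcher is implemented.

```shell
# Hypothetical model of the cap described above; not library code.
isa_rank() {
    case "$1" in
        SSE2)   echo 1 ;;
        AVX)    echo 2 ;;
        AVX2)   echo 3 ;;
        AVX512) echo 4 ;;
        *)      return 1 ;;
    esac
}

# Succeeds if a code path built for ISA $1 may run under the cap $2.
isa_allowed() {
    [ "$(isa_rank "$1")" -le "$(isa_rank "$2")" ]
}

cap="AVX"   # as if AOCL_ENABLE_INSTRUCTIONS=AVX had been set before launch
for isa in SSE2 AVX AVX2 AVX512; do
    if isa_allowed "$isa" "$cap"; then
        echo "$isa: allowed"
    else
        echo "$isa: skipped"
    fi
done
```

With the cap at AVX, the loop reports SSE2 and AVX as allowed and AVX2 and AVX512 as skipped, matching the example in the bullets.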
+
+Multi-threaded Compression and Decompression
+--------------------------------------------
+- Parallel compression and decompression of lz4, zlib, zstd and snappy is implemented using
+OpenMP multi-threading. A RAP (random access point) frame is introduced in AOCL-Compression
+to support parallel decompression of the compressed streams/files. Use the AOCL_ENABLE_THREADS
+config option to enable multi-threading.
+- A stream compressed with the multi-threaded AOCL-Compression library can be decompressed using any
+single-threaded standard decompressor by simply skipping the initial block of bytes containing
+the RAP frame present at the start of the stream.
+- The multi-threaded compression support is optimally tuned for AMD CPUs on Linux® OS whereas
+this support is experimental for Windows® platforms.
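
Skipping the RAP frame before handing a stream to a standard decompressor is a plain byte-offset operation. The sketch below uses invented stand-in data: the 8-byte header, its contents, and the file names are all hypothetical, and how the real RAP frame length is determined is not described in this section, so `rap_len` is assumed to be known.

```shell
# Stand-in data: a fake 8-byte "RAP frame" followed by the codec's own stream.
# Both the contents and the length are invented for this sketch.
printf 'RAPFRAMEpayload-bytes' > mt_stream.bin
rap_len=8

# tail -c +N starts output at byte N (1-based), so skipping rap_len bytes needs N = rap_len + 1.
tail -c +"$((rap_len + 1))" mt_stream.bin > st_stream.bin

cat st_stream.bin   # prints "payload-bytes"
```

The resulting file is what a single-threaded decompressor for the same method would then consume.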