API Reference

This document previews the APIs slated for release and serves as a record. All APIs mentioned here are subject to change without notice until the final 0.3 release (when the -rc* suffix is dropped).

Essentials

cuSZ assumes that all data reside in device memory and all metadata reside in host memory.
For performance reasons,
- cuSZ does not handle I/O, memory copies between memory spaces (except internally), or communication, and
- cuSZ performs a one-time initialization, which can be expensive but is amortized over repeated use.
After initialization, the compressor can be used
- for both compression and decompression, and
- for compressing other data of the same ND size.

Configuration

Two data structures configure the compressor: cuszCTX ("context") and cuszHEADER ("header"). The former describes the runtime configuration, while the latter describes the file format.
The current implementation (as of 0.3-rc2) makes "context" a superset of "header".
Compression requires only "context"; decompression requires both "context" and "header".

A "context" is built from a string-based configuration of key-value pairs (k1=v1,k2=v2,...), with pairs delimited by a comma (","). A "header" is stored and accessed during decompression.
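
To make the key-value format concrete, the following is a minimal sketch of how such a string could be split into pairs. It is not cuSZ's own parser: `parse_config` is a hypothetical helper introduced only for illustration, and it assumes that keys and values contain neither commas nor leading "=" characters.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of cuSZ): split "k1=v1,k2=v2,..." into a map.
std::map<std::string, std::string> parse_config(const std::string& config)
{
    std::map<std::string, std::string> kv;
    std::istringstream pairs(config);
    std::string pair;
    while (std::getline(pairs, pair, ',')) {  // pairs are comma-delimited
        if (pair.empty()) continue;           // tolerate empty segments such as ",,"
        const std::string::size_type eq = pair.find('=');
        if (eq == std::string::npos || eq == 0)
            throw std::invalid_argument("expected key=value, got: " + pair);
        kv[pair.substr(0, eq)] = pair.substr(eq + 1);  // a repeated key keeps the last value
    }
    return kv;
}
```

Calling `parse_config("len=3600x1800,eb=1e-4,mode=r2r")` yields three entries keyed by `len`, `eb` and `mode`; a segment without "=" raises `std::invalid_argument`.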

| key            | value options                   | description                                               | required |    default    | CLI counterpart |
| -------------- | ------------------------------- | --------------------------------------------------------- | :------: | :-----------: | --------------- |
| len            | "[X]", "[X]x[Y]", "[X]x[Y]x[Z]" | ND length; also for allocation size by default            |    •     |       -       | -l [X]x[Y]x[Z]  |
| eb             | scientific notation             | error bound                                               |    •     |       -       | -e [EB]         |
| mode           | "abs", "r2r"                    | error-bounding mode                                       |    •     |       -       | -m [MODE]       |
| alloclen       | "[X]", "[X]x[Y]", "[X]x[Y]x[Z]" | overriding allocation length                              |          | same as "len" |                 |
| radius         | power-of-two integer            | quantization code coverage (single-side)                  |          |      512      | n/a             |
| pipeline       | "auto", "binary", "radius"      | whether to use sparsity-aware (binary) path               |          |    "auto"     | n/a             |
| anchor         | "on", "ON", "off", "OFF"        | whether to use anchor point                               |          |     "off"     | n/a             |
| huffbyte       | "4", "8"                        | override to use 8-byte internal type for VLE              |          |      "4"      | n/a             |
| nondestructive | "on", "ON", "off", "OFF"        | whether to overwrite the input data                       |          |               |                 |
| failfast       | "on", "ON", "off", "OFF"        | whether to fail or allocate more memory on out-of-memory  |          |               |                 |
| densityfactor  | number greater than 1           | override outlier gatherer (spcodec) reserved space factor |          |   "4" (25%)   | n/a             |
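
As an illustration of the value formats in the table, the sketch below parses a `len` value into its dimensions and checks the power-of-two constraint on `radius`. The helpers `parse_len`, `element_count` and `is_power_of_two` are hypothetical names chosen for this example and are not cuSZ functions; the actual validation in cuSZ may differ.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of cuSZ): parse "[X]", "[X]x[Y]" or "[X]x[Y]x[Z]".
std::vector<std::size_t> parse_len(const std::string& len)
{
    std::vector<std::size_t> dims;
    std::istringstream parts(len);
    std::string part;
    while (std::getline(parts, part, 'x')) {  // dimensions are separated by 'x'
        if (part.empty() || part.find_first_not_of("0123456789") != std::string::npos)
            throw std::invalid_argument("bad dimension in len: " + len);
        dims.push_back(std::stoull(part));
    }
    if (dims.empty() || dims.size() > 3)
        throw std::invalid_argument("len must have 1 to 3 dimensions: " + len);
    return dims;
}

// Total number of elements described by the parsed dimensions.
std::size_t element_count(const std::vector<std::size_t>& dims)
{
    std::size_t n = 1;
    for (std::size_t d : dims) n *= d;
    return n;
}

// The table asks for a power-of-two radius (default 512).
bool is_power_of_two(std::size_t v) { return v != 0 && (v & (v - 1)) == 0; }
```

For `len=3600x1800`, `parse_len` returns the two dimensions 3600 and 1800, and `element_count` gives 6480000 elements; the default radius of 512 passes `is_power_of_two`.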

The cuSZ API is organized around two usage scenarios.

API Use

First, the data type must be known before a compressor can be selected; the compressor is then specified by predictor type:

using Compressor = typename Framework<Data>::XFeaturedCompressor;

where X is replaced by the desired predictor, either Lorenzo (ready) or Spline3 (in progress). Alternatively, the user can use the default compressor:

using Compressor = typename Framework<Data>::DefaultCompressor;

Two configuration structs are involved: cusz::Context at compress time and cusz::Header at decompress time. At compress time, a C-string is required to construct a cusz::Context instance.

char const* config_str = "len=3600x1800,eb=1e-4,mode=r2r";
auto ctx = new cusz::Context(config_str);

The core compression and decompression APIs are declared as

/**
 * @brief Core compression API for cuSZ, requiring that input and output are on device pointers/iterators.
 *
 * @tparam Compressor predefined Compressor type, accessible via cusz::Framework<T>::XFeaturedCompressor
 * @tparam T uncompressed data type
 * @param compressor Compressor instance
 * @param config (host) cusz::Context as configuration type
 * @param uncompressed (device) input uncompressed data
 * @param uncompressed_alloc_len (host) for checking; must be >1.03x the original data size to ensure legal memory access
 * @param compressed (device) exposed compressed array in Compressor (shallow copy); must be transferred before Compressor
 * destruction
 * @param compressed_len (host) output compressed array length
 * @param header (host) header for compressed binary description; acquired by a deep copy
 * @param stream CUDA stream
 * @param timerecord collected time information for compressor; acquired by a deep copy
 */
template <class Compressor, typename T>
void core_compress(
    Compressor *compressor, cusz::Context *config,
    T *uncompressed, size_t uncompressed_alloc_len,
    BYTE *&compressed, size_t &compressed_len, cusz::Header &header,
    cudaStream_t stream = nullptr,
    cusz::TimeRecord *timerecord = nullptr);


/**
 * @brief Core decompression API for cuSZ, requiring that input and output are on device pointers/iterators.
 *
 * @tparam Compressor predefined Compressor type, accessible via cusz::Framework<T>::XFeaturedCompressor
 * @tparam T uncompressed data type
 * @param compressor Compressor instance
 * @param config (host) cusz::Header as configuration type
 * @param compressed (device) input compressed array
 * @param compressed_len (host) input compressed length for checking
 * @param decompressed (device) output decompressed array
 * @param decompressed_alloc_len (host) for checking; must be >1.03x the original data size to ensure legal memory access
 * @param stream CUDA stream
 * @param timerecord collected time information for compressor; acquired by a deep copy
 */
template <class Compressor, typename T>
void core_decompress(
    Compressor *compressor, cusz::Header *config,
    BYTE *compressed, size_t compressed_len,
    T *decompressed, size_t decompressed_alloc_len,
    cudaStream_t stream = nullptr,
    cusz::TimeRecord *timerecord = nullptr);
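
The doc comments above ask for an allocation length greater than 1.03x the original data size. One way a caller might compute such a length is sketched below. `padded_alloc_len` is a hypothetical helper, not part of cuSZ, and it returns the smallest element count strictly above 1.03x rather than any value cuSZ itself prescribes.

```cpp
#include <cstddef>

// Hypothetical helper (not part of cuSZ): element count strictly greater than 1.03 * len.
std::size_t padded_alloc_len(std::size_t len)
{
    // floor(3 * len / 100), computed without forming 3 * len to avoid overflow
    const std::size_t three_percent = (len / 100) * 3 + (len % 100) * 3 / 100;
    // floor(x) + 1 > x, so the sum is strictly above len + 0.03 * len
    return len + three_percent + 1;
}
```

For a 3600x1800 field (6480000 elements), this gives 6674401 elements, a value that could be passed as the allocation length or supplied through the `alloclen` key.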

Future Work

cusz::Framework will be extended to support more data types and to allow selecting the Codec and SpCodec individually, for example,

using Compressor = typename cusz::Framework<T>::template CompressorTemplate<Predictor, Codec, SpCodec>;

developers: Jiannan Tian, Cody Rivera, Wenyu Gai, Dingwen Tao, Sheng Di, Franck Cappello
contributors (alphabetic): Jon Calhoun, Megan Hickman Fulp, Xin Liang, Robert Underwood, Kai Zhao
Special thanks to Dominique LaSalle (NVIDIA) for serving as mentor in the Argonne GPU Hackathon 2021!