@@ -12,11 +12,32 @@ Usage:
1212```
1313
1414* ` archives-dir ` is the directory that archives should be written to.
15- * ` input-path ` is any new-line-delimited JSON (ndjson) log file or directory containing such files.
16- * ` options ` allow you to specify things like which field should be considered as the log event's
17- timestamp (` --timestamp-key <field-path> ` ), or whether to fully parse array entries and encode
18- them into dedicated columns (` --structurize-arrays ` ).
19- * For a complete list, run ` ./clp-s c --help `
15+ * ` input-path ` is a filesystem path or URL to either:
16+ * a new-line-delimited JSON (ndjson) log file;
17+ * a KV-IR file; or
18+ * a directory containing such files.
19+ * ` options ` allow you to specify how data gets compressed into an archive. For example:
20+ * ` --single-file-archive ` specifies that single-file archives should be produced (i.e., each
21+ archive is a single file in ` archives-dir ` ).
22+ * ` --file-type <json|kv-ir> ` specifies whether the input files are encoded as ndjson or KV-IR.
23+ * ` --timestamp-key <field-path> ` specifies which field should be treated as each log event's
24+ timestamp.
25+ * ` --target-encoded-size <size> ` specifies the threshold (in bytes) at which archives are split,
26+ where ` size ` is the total size of the dictionaries and encoded messages in an archive.
27+ * This option acts as a soft limit on memory usage for compression, decompression, and search.
28+ * This option significantly affects compression ratio.
29+ * ` --structurize-arrays ` specifies that arrays should be fully parsed and array entries should be
30+ encoded into dedicated columns.
31+ * ` --auth <s3|none> ` specifies the authentication method that should be used for network requests
32+ if the input path is a URL.
33+ * When S3 authentication is enabled, we issue a GET request following the [ AWS Signature Version
34+ 4 specification] [ aws-signature-v4 ] . This request uses the environment variables
35+ ` AWS_ACCESS_KEY_ID ` , ` AWS_SECRET_ACCESS_KEY ` , and, optionally, ` AWS_SESSION_TOKEN ` if it
36+ exists.
37+ * For more information on usage with S3, see our
38+ [ dedicated guide] ( guides-using-object-storage/index ) .
39+
40+ For a complete list of options, run ` ./clp-s c --help ` .
2041
2142### Examples
2243
@@ -37,6 +58,14 @@ Specifying the timestamp-key will create a range-index for the timestamp column
3758compression ratio and search performance.
3859:::
3960
61+ ** Compress a KV-IR file stored on S3 into a single-file archive:**
62+
63+ ``` shell
64+ AWS_ACCESS_KEY_ID=' ...' AWS_SECRET_ACCESS_KEY=' ...' \
65+ ./clp-s c --single-file-archive --file-type kv-ir --auth s3 /mnt/data/archives \
66+ https://my-bucket.s3.us-east-2.amazonaws.com/kv-ir-log.clp
67+ ```
68+
4069** Set the target encoded size to 1 GiB and the compression level to 6 (3 by default)**
4170
4271``` shell
@@ -52,13 +81,14 @@ compression ratio and search performance.
5281Usage:
5382
5483``` shell
55- ./clp-s x [< options> ] < archives-dir > < output-dir>
84+ ./clp-s x [< options> ] < archives-path > < output-dir>
5685```
5786
58- * ` archives-dir ` is a directory containing archives.
87+ * ` archives-path ` is a directory containing archives, a path to an archive, or a URL pointing to a
88+ single-file archive.
5989* ` output-dir ` is the directory that decompressed logs should be written to.
60- * ` options ` allow you to specify things like a specific archive (from within ` archives-dir ` ) to
61- decompress (` --archive-id <archive-id> ` ).
90+ * ` options ` allow you to specify things like a specific archive (from within ` archives-path ` , if it
91+ is a directory) to decompress (` --archive-id <archive-id> ` ).
6292 * For a complete list, run ` ./clp-s x --help `
6393
6494### Examples
@@ -74,13 +104,14 @@ Usage:
74104Usage:
75105
76106``` shell
77- ./clp-s s [< options> ] < archives-dir > < kql-query>
107+ ./clp-s s [< options> ] < archives-path > < kql-query>
78108```
79109
80- * ` archives-dir ` is a directory containing archives.
110+ * ` archives-path ` is a directory containing archives, a path to an archive, or a URL pointing to a
111+ single-file archive.
81112* ` kql-query ` is a [ KQL] ( reference-json-search-syntax ) query.
82- * ` options ` allow you to specify things like a specific archive (from within ` archives-dir ` ) to
83- search (` --archive-id <archive-id> ` ).
113+ * ` options ` allow you to specify things like a specific archive (from within ` archives-path ` , if it
114+ is a directory) to search (` --archive-id <archive-id> ` ).
84115 * For a complete list, run ` ./clp-s s --help `
85116
86117### Examples
@@ -125,3 +156,5 @@ compressed data:**
125156 the same file.
126157* In addition, there are a few limitations, related to querying arrays, described in the search
127158 syntax [ reference] ( reference-json-search-syntax ) .
159+
160+ [ aws-signature-v4 ] : https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html
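
A reader-side sketch, not part of this change: the `--auth s3` bullet says the request is signed with `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` (plus `AWS_SESSION_TOKEN` if set), so a small wrapper can check for those variables before running the S3 example from the diff. The function name `clp_s_compress_s3` and the `CLP_S_DRY_RUN` variable are hypothetical and exist only for illustration; the `clp-s` flags are the ones documented above.

```shell
# Hypothetical wrapper (not part of CLP): verify the credentials that
# `--auth s3` relies on, then run the KV-IR compression command from the example.
clp_s_compress_s3() {
    archives_dir="$1"
    input_url="$2"

    # Both variables are required for signing; AWS_SESSION_TOKEN stays optional.
    if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "error: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set" >&2
        return 1
    fi

    # Build the command as positional parameters so arguments stay intact.
    set -- ./clp-s c --single-file-archive --file-type kv-ir --auth s3 \
        "$archives_dir" "$input_url"

    # CLP_S_DRY_RUN is an assumption of this sketch: when set to 1, print the
    # command instead of executing it.
    if [ "${CLP_S_DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With `CLP_S_DRY_RUN=1` and both credentials set, calling `clp_s_compress_s3 /mnt/data/archives <url>` prints the command it would run; without credentials it returns a non-zero status and never invokes `clp-s`.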