diff --git a/docs/src/user-guide/quick-start-cluster-setup/multi-node.md b/docs/src/user-guide/guides-multi-node.md
similarity index 93%
rename from docs/src/user-guide/quick-start-cluster-setup/multi-node.md
rename to docs/src/user-guide/guides-multi-node.md
index 705f2f97a6..96c89f2ad5 100644
--- a/docs/src/user-guide/quick-start-cluster-setup/multi-node.md
+++ b/docs/src/user-guide/guides-multi-node.md
@@ -110,8 +110,26 @@ Where `` is the name of the component in the groups above.

 ## Using CLP

-Check out the [compression](../quick-start-compression/index) and
-[search](../quick-start-search/index) guides to compress and search your logs.
+To learn how to compress and search your logs, check out the quick-start guide that corresponds to
+the flavor of CLP you're running:
+
+::::{grid} 1 1 2 2
+:gutter: 2
+
+:::{grid-item-card}
+:link: quick-start/clp-json
+Using clp-json
+^^^
+How to compress and search JSON logs.
+:::
+
+:::{grid-item-card}
+:link: quick-start/clp-text
+Using clp-text
+^^^
+How to compress and search unstructured text logs.
+:::
+::::

 ## Stopping CLP

diff --git a/docs/src/user-guide/guides-overview.md b/docs/src/user-guide/guides-overview.md
index 5e8179bf71..02faefe2a1 100644
--- a/docs/src/user-guide/guides-overview.md
+++ b/docs/src/user-guide/guides-overview.md
@@ -11,4 +11,11 @@ Using object storage
 ^^^
 Using CLP to ingest logs from object storage and store archives on object storage.
 :::
+
+:::{grid-item-card}
+:link: guides-multi-node
+Multi-node deployment
+^^^
+How to deploy CLP across multiple nodes.
+:::
 ::::
diff --git a/docs/src/user-guide/guides-using-object-storage/clp-usage.md b/docs/src/user-guide/guides-using-object-storage/clp-usage.md
index e2d552e78f..e4eec83c76 100644
--- a/docs/src/user-guide/guides-using-object-storage/clp-usage.md
+++ b/docs/src/user-guide/guides-using-object-storage/clp-usage.md
@@ -1,7 +1,7 @@
 # Using CLP with object storage

 To compress logs from S3, follow the steps in the section below. For all other operations, you
-should be able to use CLP as described in the [quick start](../quick-start-overview.md) guide.
+should be able to use CLP as described in the [clp-json quick-start guide](../quick-start/clp-json).

 ## Compressing logs from S3

diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md
index ea553ef806..0912157cf6 100644
--- a/docs/src/user-guide/guides-using-object-storage/index.md
+++ b/docs/src/user-guide/guides-using-object-storage/index.md
@@ -11,8 +11,8 @@ to use object storage for any combination of the three use cases (e.g., compress
 cache the stream files on S3, but store archives on the local filesystem).

 :::{note}
-Currently, only the [clp-json][release-choices] release supports object storage. Support for
-`clp-text` will be added in a future release.
+Currently, only [clp-json][release-choices] supports object storage. Support for `clp-text` will be
+added in a future release.
 :::

 :::{note}
@@ -22,8 +22,8 @@ will be added in a future release.

 ## Prerequisites

-1. This guide assumes you're able to configure, start, stop, and use a CLP cluster as described in
- the [quick-start guide](../quick-start-overview.md).
+1. This guide assumes you're able to configure, start, stop, and use CLP as described in the
+ [clp-json quick-start guide](../quick-start/clp-json.md).
 2. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress.
 3. An S3 bucket and key prefix where you wish to store compressed archives.
 4. An S3 bucket and key prefix where you wish to cache stream files.
@@ -136,4 +136,4 @@ clp-usage
 [aws-iam-roles]: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-role.html
 [aws-key-prefixes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html
 [aws-sts-credentials]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html
-[release-choices]: ../quick-start-cluster-setup/index.md#choosing-a-release
+[release-choices]: ../quick-start/index.md#choosing-a-flavor
diff --git a/docs/src/user-guide/index.md b/docs/src/user-guide/index.md
index 642eac4a62..a637c8a650 100644
--- a/docs/src/user-guide/index.md
+++ b/docs/src/user-guide/index.md
@@ -9,17 +9,18 @@ The sections are as follows:
 :gutter: 2

 :::{grid-item-card}
-:link: quick-start-overview
+:link: quick-start/index
 Quick start
 ^^^
-A quick start guide for setting up a CLP cluster, compressing your logs, and searching them.
+A quick-start guide for choosing a flavor of CLP, setting it up, compressing your logs, and
+searching them.
 :::

 :::{grid-item-card}
 :link: guides-overview
 Guides
 ^^^
-Guides for using CLP in a variety of use cases.
+Guides for using CLP in various use cases.
 :::

 :::{grid-item-card}
@@ -48,10 +49,9 @@ Reference docs like format specifications, etc.
 :hidden:
 :caption: Quick start

-quick-start-overview
-quick-start-cluster-setup/index
-quick-start-compression/index
-quick-start-search/index
+quick-start/index
+quick-start/clp-json
+quick-start/clp-text
 :::

 :::{toctree}
@@ -61,6 +61,7 @@ quick-start-search/index

 guides-overview
 guides-using-object-storage/index
+guides-multi-node
 :::

 :::{toctree}
diff --git a/docs/src/user-guide/quick-start-cluster-setup/index.md b/docs/src/user-guide/quick-start-cluster-setup/index.md
deleted file mode 100644
index b3042c7a9c..0000000000
--- a/docs/src/user-guide/quick-start-cluster-setup/index.md
+++ /dev/null
@@ -1,100 +0,0 @@
-# Cluster setup
-
-To set up a cluster, you'll need to:
-
-* Choose a release.
-* Choose between a single or multi-node deployment.
-* Ensure you meet the requirements for running the release.
-* Configure the release (if necessary).
-* Start CLP.
-
-## Choosing a release
-
-There are two flavours of CLP [releases][clp-releases]:
-
-* **[clp-json](#clp-json)** for compressing and searching **JSON** logs.
-* **[clp-text](#clp-text)** for compressing and searching **unstructured text** logs.
-
-:::{note}
-Both flavours contain the same binaries but are configured with different values for the
-`package.storage_engine` key in the package's config file (`etc/clp-config.yml`).
-:::
-
-You should download and extract the release that's appropriate for your logs.
-
-### clp-json
-
-`clp-json` releases are appropriate for JSON logs where each log event is an independent JSON
-object. For example:
-
-```json lines
-{
- "t": {
- "$date": "2023-03-21T23:46:37.392"
- },
- "ctx": "conn11",
- "msg": "Waiting for write concern."
-}
-{
- "t": {
- "$date": "2023-03-21T23:46:37.392"
- },
- "msg": "Set last op to system time"
-}
-```
-
-The log file above contains two log events represented by two JSON objects printed one after the
-other. Whitespace is ignored, so the log events could also appear with no newlines and indentation.
-
-### clp-text
-
-`clp-text` releases are appropriate for unstructured text logs where each log event contains a
-timestamp and may span one or more lines.
-
-:::{note}
-If your logs do not contain timestamps or CLP can't automatically parse the timestamps in your logs,
-it will treat each line as an independent log event.
-:::
-
-For example:
-
-```
-2015-03-23T15:50:17.926Z INFO container_1 Transitioned from ALLOCATED to ACQUIRED
-2015-03-23T15:50:17.927Z ERROR Scheduler: Error trying to assign container token
-java.lang.IllegalArgumentException: java.net.UnknownHostException: i-e5d112ea
- at org.apache.hadoop.security.buildTokenService(SecurityUtil.java:374)
- at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
-Caused by: java.net.UnknownHostException: i-e5d112ea
- ... 17 more
-```
-
-The log file above contains two log events, both beginning with a timestamp. The first is a single
-line while the second contains multiple lines.
-
-## Deployment options
-
-Choose one of the deployment options below:
-
-::::{grid} 1 1 2 2
-:gutter: 2
-
-:::{grid-item-card}
-:link: single-node
-Single-node deployment
-:::
-
-:::{grid-item-card}
-:link: multi-node
-Multi-node deployment
-:::
-::::
-
-:::{toctree}
-:hidden:
-:caption: Cluster setup
-
-single-node
-multi-node
-:::
-
-[clp-releases]: https://github.com/y-scope/clp/releases
diff --git a/docs/src/user-guide/quick-start-cluster-setup/single-node.md b/docs/src/user-guide/quick-start-cluster-setup/single-node.md
deleted file mode 100644
index ea234858ac..0000000000
--- a/docs/src/user-guide/quick-start-cluster-setup/single-node.md
+++ /dev/null
@@ -1,40 +0,0 @@
-# Single-node deployment
-
-A single-node deployment allows you to run CLP on a single host.
-
-## Requirements
-
-* [Docker]
- * If you're not running as root, ensure `docker` can be run
- [without superuser privileges][docker-non-root].
- * If you're using Docker Desktop, ensure version 4.34 or higher is installed, and
- [host networking is enabled][docker-desktop-host-networking].
-* Python 3.8 or higher
-
-## Starting CLP
-
-```bash
-sbin/start-clp.sh
-```
-
-:::{note}
-If CLP fails to start (e.g., due to a port conflict), try adjusting the settings in
-`etc/clp-config.yml` and then run the start command again.
-:::
-
-## Using CLP
-
-Check out the [compression](../quick-start-compression/index) and
-[search](../quick-start-search/index) guides to compress and search your logs.
-
-## Stopping CLP
-
-If you need to stop the cluster, run:
-
-```bash
-sbin/stop-clp.sh
-```
-
-[Docker]: https://docs.docker.com/engine/install/
-[docker-desktop-host-networking]: https://docs.docker.com/engine/network/drivers/host/#docker-desktop
-[docker-non-root]: https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user
diff --git a/docs/src/user-guide/quick-start-compression/index.md b/docs/src/user-guide/quick-start-compression/index.md
deleted file mode 100644
index fab8a8781c..0000000000
--- a/docs/src/user-guide/quick-start-compression/index.md
+++ /dev/null
@@ -1,33 +0,0 @@
-# Compression
-
-You can compress your logs using a script in the CLP package. Depending on the format of your logs,
-choose one of the options below.
-
-:::{caution}
-If you're using the `clp-json` release, you can only compress JSON logs. If you're using the
-`clp-text` release, you should only compress unstructured text logs (`clp-text` can compress and
-search JSON logs as if it was unstructured text, but `clp-text` cannot query individual fields).
-This limitation will be addressed in a future version of CLP.
-:::
-
-::::{grid} 1 1 2 2
-:gutter: 2
-
-:::{grid-item-card}
-:link: json
-Compressing JSON logs
-:::
-
-:::{grid-item-card}
-:link: text
-Compressing unstructured text logs
-:::
-::::
-
-:::{toctree}
-:hidden:
-:caption: Compression
-
-json
-text
-:::
diff --git a/docs/src/user-guide/quick-start-compression/json.md b/docs/src/user-guide/quick-start-compression/json.md
deleted file mode 100644
index 7784a4f509..0000000000
--- a/docs/src/user-guide/quick-start-compression/json.md
+++ /dev/null
@@ -1,38 +0,0 @@
-# Compressing JSON logs
-
-To compress JSON logs, from inside the package directory, run:
-
-```bash
-sbin/compress.sh --timestamp-key '' [ ...]
-```
-
-* `` is the field path of the kv-pair that contains the timestamp in each log event.
- * E.g., if your log events look like
- `{"timestamp": {"iso8601": "2024-01-01 00:01:02.345", ...}}`, you should enter
- `timestamp.iso8601` as the timestamp key.
-
- :::{caution}
- Log events without the specified timestamp key will _not_ have an assigned timestamp. Currently,
- these events can only be searched from the command line (when you don't specify a timestamp
- filter).
- :::
-
-* `` are paths to JSON log files or directories containing such files.
- * Each JSON log file should contain each log event as a [separate JSON object][json-log-format],
- i.e., _not_ as an array.
-
-:::{tip}
-To compress logs from object storage, see
-[Using object storage](../guides-using-object-storage/index).
-:::
-
-# Sample logs
-
-For some sample logs, check out the open-source [datasets](../resources-datasets.md).
-
-# Examining compression statistics
-
-The compression script used above will output the compression ratio of each dataset you compress, or
-you can use the UI to view overall statistics.
-
-[json-log-format]: ../quick-start-cluster-setup/index.md#clp-json
diff --git a/docs/src/user-guide/quick-start-compression/text.md b/docs/src/user-guide/quick-start-compression/text.md
deleted file mode 100644
index 29e798b9d8..0000000000
--- a/docs/src/user-guide/quick-start-compression/text.md
+++ /dev/null
@@ -1,18 +0,0 @@
-# Compressing unstructured text logs
-
-To compress unstructured text logs, from inside the package directory, run:
-
-```bash
-sbin/compress.sh [ ...]
-```
-
-`` are paths to unstructured text log files or directories containing such files.
-
-# Sample logs
-
-For some sample logs, check out the open-source [datasets](../resources-datasets.md).
-
-# Examining compression statistics
-
-The compression script used above will output the compression ratio of each dataset you compress, or
-you can use the UI to view overall statistics.
diff --git a/docs/src/user-guide/quick-start-overview.md b/docs/src/user-guide/quick-start-overview.md
deleted file mode 100644
index 9cf155a607..0000000000
--- a/docs/src/user-guide/quick-start-overview.md
+++ /dev/null
@@ -1,8 +0,0 @@
-# Overview
-
-CLP operates as a distributed system, so you'll need to set up a cluster before compressing and
-searching your logs. Follow the guides below for each step:
-
-1. [Cluster setup](quick-start-cluster-setup/index)
-2. [Compression](quick-start-compression/index)
-3. [Search](quick-start-search/index)
diff --git a/docs/src/user-guide/quick-start-search/cli-search.md b/docs/src/user-guide/quick-start-search/cli-search.md
deleted file mode 100644
index b8d9061923..0000000000
--- a/docs/src/user-guide/quick-start-search/cli-search.md
+++ /dev/null
@@ -1,23 +0,0 @@
-# Searching from the command line
-
-From inside the package, run:
-
-```
-sbin/search.sh ''
-```
-
-The format of `` depends on the format your logs: [JSON](../reference-json-search-syntax.md)
-or [unstructured text](../reference-text-search-syntax.md).
-
-To narrow your search to a specific time range:
-
-* Add `--begin-time ` to filter for log events after a certain time.
- * `` is the timestamp as milliseconds since the UNIX epoch.
-* Add `--end-time ` to filter for log events after a certain time.
-
-To perform case-insensitive searches, add the `--ignore-case` flag.
-
-:::{caution}
-To match the convention of other tools, by default, searches are case-**insensitive** in the UI and
-searches are case-**sensitive** on the command line.
-:::
diff --git a/docs/src/user-guide/quick-start-search/index.md b/docs/src/user-guide/quick-start-search/index.md
deleted file mode 100644
index 6e3ac9bc7b..0000000000
--- a/docs/src/user-guide/quick-start-search/index.md
+++ /dev/null
@@ -1,41 +0,0 @@
-# Search
-
-You can search your logs from the UI or from the command line.
-
-::::{grid} 1 1 2 2
-:gutter: 2
-
-:::{grid-item-card}
-:link: ui-search
-Searching from the UI
-:::
-
-:::{grid-item-card}
-:link: cli-search
-Searching from the command line
-:::
-::::
-
-The syntax for searching JSON logs and unstructured text logs is different.
-
-::::{grid} 1 1 2 2
-:gutter: 2
-
-:::{grid-item-card}
-:link: ../reference-json-search-syntax
-`clp-json` search syntax
-:::
-
-:::{grid-item-card}
-:link: ../reference-text-search-syntax
-`clp-text` search syntax
-:::
-::::
-
-:::{toctree}
-:hidden:
-:caption: Search
-
-ui-search
-cli-search
-:::
diff --git a/docs/src/user-guide/quick-start-search/ui-search.md b/docs/src/user-guide/quick-start-search/ui-search.md
deleted file mode 100644
index 32af5bb4b0..0000000000
--- a/docs/src/user-guide/quick-start-search/ui-search.md
+++ /dev/null
@@ -1,30 +0,0 @@
-# Searching from the UI
-
-CLP includes a web interface available at [http://localhost:4000](http://localhost:4000) by default
-(if you changed `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values).
-
-:::{image} clp-search-ui.png
-:::
-
-The image above shows the search page after running a query. The numbered circles correspond to the
-following features:
-
-1. The search box is where you can enter a query.
- * The format of a query depends on the format of your logs:
- [JSON](../reference-json-search-syntax.md) or
- [unstructured text](../reference-text-search-syntax.md).
-2. The timeline shows the number of results across the time range of your query.
- * You can click and drag to zoom into a time range or use the time range filter in (4).
-3. The table displays the search results for your query.
-4. Clicking the icon reveals additional filters for your query.
- * The time range filter allows you to specify the period of time that matching log events must be
- in.
- * The case sensitivity filter allows you to specify whether CLP should respect the case of your
- query.
-5. Clicking the icon reveals options for displaying results.
-6. The icon clears the results of the last query.
-
-:::{note}
-By default, the UI will only return 1,000 of the latest search results. To perform searches which
-return more results, use the [command line](cli-search.md).
-:::
\ No newline at end of file
diff --git a/docs/src/user-guide/quick-start/clp-json.md b/docs/src/user-guide/quick-start/clp-json.md
new file mode 100644
index 0000000000..d83971fd6e
--- /dev/null
+++ b/docs/src/user-guide/quick-start/clp-json.md
@@ -0,0 +1,176 @@
+# clp-json quick-start
+
+This page will walk you through how to start CLP and use it to compress and search JSON logs.
+
+:::{caution}
+If you're using a `clp-json` release, you can only compress and search JSON logs. This limitation
+will be addressed in a future version of CLP.
+:::
+
+---
+
+## Starting CLP
+
+To start CLP, run:
+
+```bash
+sbin/start-clp.sh
+```
+
+:::{note}
+If CLP fails to start (e.g., due to a port conflict), try adjusting the settings in
+`etc/clp-config.yml` and then run the start command again.
+:::
+
+---
+
+## Compressing JSON logs
+
+To compress some JSON logs, run:
+
+```bash
+sbin/compress.sh --timestamp-key '' [ ...]
+```
+
+* `` is the field path of the kv-pair that contains the timestamp in each log event.
+ * E.g., if your log events look like
+ `{"timestamp": {"iso8601": "2024-01-01 00:01:02.345", ...}}`, you should enter
+ `timestamp.iso8601` as the timestamp key.
+
+ :::{caution}
+ Log events without the specified timestamp key will *not* have an assigned timestamp. Currently,
+ these events can only be searched from the command line (when you don't specify a timestamp
+ filter).
+ :::
+
+* `` are paths to JSON log files or directories containing such files.
+ * Each JSON log file should contain each log event as a
+ [separate JSON object](./index.md#clp-json), i.e., *not* as an array.
+
+The compression script will output the compression ratio of each dataset you compress, or you can
+use the UI to view overall statistics.
+
+Compressed logs will be stored in the directory specified by the `archive_output.storage.directory`
+config option in `etc/clp-config.yml` (`archive_output.storage.directory` defaults to
+`var/data/archives`).
+
+:::{tip}
+To compress logs from object storage, see
+[Using object storage](../guides-using-object-storage/index).
+:::
+
+### Sample logs
+
+For some sample logs, check out the [open-source datasets](../resources-datasets).
+
+---
+
+## Searching JSON logs
+
+You can search your compressed logs from CLP's [UI](#searching-from-the-ui) or the
+[command line](#searching-from-the-command-line).
+
+In clp-json, queries are written as a set of conditions (predicates) on key-value pairs (kv-pairs).
+For example, [Figure 1](#figure-1) shows a query that matches the first log event in
+[Figure 2](#figure-2).
+
+(figure-1)=
+:::{card}
+
+```sql
+ctx: "conn11" AND msg: "*write concern*"
+```
+
++++
+**Figure 1**: An example query.
+:::
+
+(figure-2)=
+:::{card}
+
+```json lines
+{
+ "t": {
+ "$date": "2023-03-21T23:46:37.392"
+ },
+ "ctx": "conn11",
+ "msg": "Waiting for write concern."
+}
+{
+ "t": {
+ "$date": "2023-03-21T23:46:37.392"
+ },
+ "msg": "Set last op to system time"
+}
+```
+
++++
+**Figure 2**: A set of JSON log events.
+:::
+
+The query in [Figure 1](#figure-1) will match log events that contain the kv-pair `"ctx": "conn11"`
+as well as a kv-pair with key `"msg"` and a value that matches the wildcard query
+`"*write concern*"`.
+
+A complete reference for clp-json's query syntax is available on the
+[syntax reference page](../reference-json-search-syntax).
+
+### Searching from the UI
+
+To search your compressed logs from CLP's UI, open [http://localhost:4000](http://localhost:4000) in
+your browser (if you changed `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new
+values).
+
+:::{image} clp-search-ui.png
+:::
+
+The image above shows the search page after running a query. The numbered circles correspond to
+the following features:
+
+1. The search box is where you can enter your query.
+2. The timeline shows the number of results across the time range of your query.
+ * You can click and drag to zoom into a time range, or use the time range filter in (4).
+3. The table displays the search results for your query.
+4. Clicking the icon reveals additional filters for your query.
+ * The time range filter allows you to specify the period of time that matching log events must be
+ in.
+ * The case sensitivity filter allows you to specify whether CLP should respect the case of your
+ query.
+5. Clicking the icon reveals options for displaying results.
+6. The icon clears the results of the last query.
+
+:::{note}
+By default, the UI will only return 1,000 of the latest search results. To perform searches which
+return more results, use the [command line](#searching-from-the-command-line).
+:::
+
+### Searching from the command line
+
+To search your compressed logs from the command line, run:
+
+```bash
+sbin/search.sh ''
+```
+
+To narrow your search to a specific time range:
+
+* Add `--begin-time ` to filter for log events after a certain time.
+ * `` is the timestamp as milliseconds since the UNIX epoch.
+* Add `--end-time ` to filter for log events before a certain time.
+
+To perform case-insensitive searches, add the `--ignore-case` flag.
+
+:::{caution}
+To match the convention of other tools, by default, searches are case-**insensitive** in the UI and
+searches are case-**sensitive** on the command line.
+:::
+
+---
+
+## Stopping CLP
+
+If you need to stop CLP, run:
+
+```bash
+sbin/stop-clp.sh
+```
diff --git a/docs/src/user-guide/quick-start-search/clp-search-ui.png b/docs/src/user-guide/quick-start/clp-search-ui.png
similarity index 100%
rename from docs/src/user-guide/quick-start-search/clp-search-ui.png
rename to docs/src/user-guide/quick-start/clp-search-ui.png
diff --git a/docs/src/user-guide/quick-start/clp-text.md b/docs/src/user-guide/quick-start/clp-text.md
new file mode 100644
index 0000000000..cfc3469b71
--- /dev/null
+++ b/docs/src/user-guide/quick-start/clp-text.md
@@ -0,0 +1,157 @@
+# clp-text quick-start
+
+This page will walk you through how to start CLP and use it to compress and search unstructured
+text logs.
+
+:::{caution}
+If you're using a `clp-text` release, you should only compress unstructured text logs. `clp-text`
+is able to compress and search JSON logs as if they were unstructured text, but `clp-text` cannot
+query individual fields. This limitation will be addressed in a future version of CLP.
+:::
+
+---
+
+## Starting CLP
+
+To start CLP, run:
+
+```bash
+sbin/start-clp.sh
+```
+
+:::{note}
+If CLP fails to start (e.g., due to a port conflict), try adjusting the settings in
+`etc/clp-config.yml` and then run the start command again.
+:::
+
+---
+
+## Compressing unstructured text logs
+
+To compress some unstructured text logs, run:
+
+```bash
+sbin/compress.sh [ ...]
+```
+
+`` are paths to unstructured text log files or directories containing such files.
+
+The compression script will output the compression ratio of each dataset you compress, or you can
+use the UI to view overall statistics.
+
+Compressed logs will be stored in the directory specified by the `archive_output.storage.directory`
+config option in `etc/clp-config.yml` (`archive_output.storage.directory` defaults to
+`var/data/archives`).
+
+### Sample logs
+
+For some sample logs, check out the [open-source datasets](../resources-datasets).
+
+---
+
+## Searching unstructured text logs
+
+You can search your compressed logs from CLP's [UI](#searching-from-the-ui) or the
+[command line](#searching-from-the-command-line).
+
+In clp-text, queries are written as wildcard expressions. A wildcard expression is a plain text
+query where:
+
+* `*` matches zero or more characters
+* `?` matches any single character
+
+For example, consider the query in [Figure 1](#figure-1) and the logs in [Figure 2](#figure-2).
+
+(figure-1)=
+:::{card}
+
+```bash
+"INFO container_? Transitioned*ACQUIRED"
+```
+
++++
+**Figure 1**: An example query.
+:::
+
+(figure-2)=
+:::{card}
+
+```text
+2015-03-23T15:50:17.926Z INFO container_1 Transitioned from ALLOCATED to ACQUIRED
+2015-03-23T15:50:17.927Z ERROR Scheduler: Error trying to assign container token
+java.lang.IllegalArgumentException: java.net.UnknownHostException: i-e5d112ea
+ at org.apache.hadoop.security.buildTokenService(SecurityUtil.java:374)
+ at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
+Caused by: java.net.UnknownHostException: i-e5d112ea
+ ... 17 more
+```
+
++++
+**Figure 2**: A set of unstructured text log events.
+:::
+
+The query in [Figure 1](#figure-1) will match with the first log message, as the `?` will match the
+character "1", and the `*` will match the text " from ALLOCATED to ".
+
+A complete reference for clp-text's query syntax is available on the
+[syntax reference page](../reference-text-search-syntax).
+
+### Searching from the UI
+
+To search your compressed logs from CLP's UI, open [http://localhost:4000](http://localhost:4000) in
+your browser (if you changed `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new
+values).
+
+:::{image} clp-search-ui.png
+:::
+
+The image above shows the search page after running a query. The numbered circles correspond to the
+following features:
+
+1. The search box is where you can enter your query.
+2. The timeline shows the number of results across the time range of your query.
+ * You can click and drag to zoom into a time range, or use the time range filter in (4).
+3. The table displays the search results for your query.
+4. Clicking the icon reveals additional filters for your query.
+ * The time range filter allows you to specify the period of time that matching log events must be
+ in.
+ * The case sensitivity filter allows you to specify whether CLP should respect the case of your
+ query.
+5. Clicking the icon reveals options for displaying results.
+6. The icon clears the results of the last query.
+
+:::{note}
+By default, the UI will only return 1,000 of the latest search results. To perform searches which
+return more results, use the [command line](#searching-from-the-command-line).
+:::
+
+### Searching from the command line
+
+To search your compressed logs from the command line, run:
+
+```bash
+sbin/search.sh ''
+```
+
+To narrow your search to a specific time range:
+
+* Add `--begin-time ` to filter for log events after a certain time.
+ * `` is the timestamp as milliseconds since the UNIX epoch.
+* Add `--end-time ` to filter for log events before a certain time.
+
+To perform case-insensitive searches, add the `--ignore-case` flag.
+
+:::{caution}
+To match the convention of other tools, by default, searches are case-**insensitive** in the UI and
+searches are case-**sensitive** on the command line.
+:::
+
+---
+
+## Stopping CLP
+
+If you need to stop CLP, run:
+
+```bash
+sbin/stop-clp.sh
+```
diff --git a/docs/src/user-guide/quick-start/index.md b/docs/src/user-guide/quick-start/index.md
new file mode 100644
index 0000000000..029dde79d8
--- /dev/null
+++ b/docs/src/user-guide/quick-start/index.md
@@ -0,0 +1,143 @@
+# Overview
+
+This guide describes the following:
+
+* [CLP's system requirements](#system-requirements)
+* [How to choose a CLP flavor](#choosing-a-flavor)
+* [How to use CLP](#using-clp).
+
+---
+
+## System Requirements
+
+To run a CLP release, you'll need:
+
+* [Docker](#docker)
+* [Python](#python)
+
+### Docker
+
+To check whether Docker is installed on your system, run:
+
+```bash
+docker version
+```
+
+If Docker isn't installed, follow [these instructions][Docker] to install it.
+
+NOTE:
+
+* If you're not running as root, ensure Docker can be run
+ [without superuser privileges][docker-non-root].
+* If you're using Docker Desktop, ensure version 4.34 or higher is installed, and
+ [host networking is enabled][docker-desktop-host-networking].
+
+### Python
+
+To check whether Python is installed on your system, run:
+
+```bash
+python3 --version
+```
+
+CLP requires Python 3.8 or higher. If Python isn't installed, or if the version isn't high enough,
+install or upgrade it by following the instructions for your OS.
+
+---
+
+## Choosing a flavor
+
+There are two flavors of CLP:
+
+* **[clp-json](#clp-json)** for compressing and searching **JSON** logs.
+* **[clp-text](#clp-text)** for compressing and searching **unstructured text** logs.
+
+:::{note}
+Both flavors contain the same binaries but are configured with different values for the
+`package.storage_engine` key in the package's config file (`etc/clp-config.yml`).
+:::
+
+### clp-json
+
+The JSON flavor of CLP is appropriate for JSON logs, where each log event is an independent JSON
+object. For example:
+
+```json lines
+{
+ "t": {
+ "$date": "2023-03-21T23:46:37.392"
+ },
+ "ctx": "conn11",
+ "msg": "Waiting for write concern."
+}
+{
+ "t": {
+ "$date": "2023-03-21T23:46:37.392"
+ },
+ "msg": "Set last op to system time"
+}
+```
+
+The log file above contains two log events represented by two JSON objects printed one after the
+other. Whitespace is ignored, so the log events could also appear with no newlines and indentation.
+
+If you're using JSON logs, download and extract the `clp-json` release from the
+[Releases][clp-releases] page, then proceed to the [clp-json quick-start](./clp-json.md) guide.
+
+### clp-text
+
+The text flavor of CLP is appropriate for unstructured text logs, where each log event contains a
+timestamp and may span one or more lines.
+
+:::{note}
+If your logs don't contain timestamps or CLP can't automatically parse the timestamps in your logs,
+it will treat each line as an independent log event.
+:::
+
+For example:
+
+```text
+2015-03-23T15:50:17.926Z INFO container_1 Transitioned from ALLOCATED to ACQUIRED
+2015-03-23T15:50:17.927Z ERROR Scheduler: Error trying to assign container token
+java.lang.IllegalArgumentException: java.net.UnknownHostException: i-e5d112ea
+ at org.apache.hadoop.security.buildTokenService(SecurityUtil.java:374)
+ at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
+Caused by: java.net.UnknownHostException: i-e5d112ea
+ ... 17 more
+```
+
+The log file above contains two log events, both beginning with a timestamp. The first is a single
+line, while the second contains multiple lines.
+
+If you're using unstructured text logs, download and extract the `clp-text` release from the
+[Releases][clp-releases] page, then proceed to the [clp-text quick-start](./clp-text.md) guide.
+
+---
+
+## Using CLP
+
+Once you've installed CLP's requirements and downloaded a CLP release, proceed to the quick-start
+guide for your chosen flavor by clicking the corresponding link below.
+
+::::{grid} 1 1 2 2
+:gutter: 2
+
+:::{grid-item-card}
+:link: clp-json
+CLP for JSON logs
+^^^
+How to compress and search JSON logs.
+:::
+
+:::{grid-item-card}
+:link: clp-text
+CLP for unstructured text logs
+^^^
+How to compress and search unstructured text logs.
+:::
+::::
+
+[clp-releases]: https://github.com/y-scope/clp/releases
+[Docker]: https://docs.docker.com/engine/install/
+[docker-desktop-host-networking]: https://docs.docker.com/engine/network/drivers/host/#docker-desktop
+[docker-non-root]: https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user