diff --git a/docs/src/user-docs/guides-using-presto.md b/docs/src/user-docs/guides-using-presto.md
index 48ea7a00a8..87df924dca 100644
--- a/docs/src/user-docs/guides-using-presto.md
+++ b/docs/src/user-docs/guides-using-presto.md
@@ -33,27 +33,60 @@ Using Presto with CLP requires:

 1. Follow the [quick-start](quick-start/index.md) guide to download and extract the CLP package,
    but don't start the package just yet.
-2. Before starting the package, update the package's config as follows:
+2. Before starting the package, update the package's config file (`etc/clp-config.yml`) as follows:

-    * Open `etc/clp-config.yml` located within the package.
-    * Uncomment the `database` section.
-    * Change `database.host` value to a non-localhost hostname/IP.
-    * After the change, the `database` section should look something like this:
+    * Set the `package.query_engine` key to `"presto"`.

       ```yaml
-      database:
-        type: "mariadb" # "mariadb" or "mysql"
-        host: ""
-        port: 3306
-        name: "clp-db"
+      package:
+        storage_engine: "clp-s"
+        query_engine: "presto"
       ```

-      :::{note}
-      This change is necessary since the Presto containers run on a Docker network, whereas CLP's
-      database runs on the host network. So `localhost` refers to two different entities in those
-      networks. This limitation will be addressed in the future when we unify Presto and CLP's
-      deployment infrastructure.
-      :::
+    * Set the `database.host` key to a non-localhost hostname/IP.
+
+      ```yaml
+      database:
+        # type: "mariadb"
+        host: ""
+        # port: 3306
+        # name: "clp-db"
+      ```
+
+      :::{note}
+      This change is necessary because the Presto containers run on a Docker network, and CLP's
+      database runs on the host network. `localhost` will refer to a different entity in each of
+      those contexts. This limitation will be addressed in the future when we unify Presto and CLP's
+      deployment infrastructure.
+      :::
+
+    * Set the `results_cache.retention_period` key to `null` since the CLP + Presto integration
+      doesn't yet support garbage collection.
+
+      ```yaml
+      results_cache:
+        # host: "localhost"
+        # port: 27017
+        # db_name: "clp-query-results"
+        # stream_collection_name: "stream-files"
+        #
+        # # Retention period for search results, in minutes. Set to null to disable automatic deletion.
+        retention_period: null
+      ```
+
+    * Update the `presto` key with the host and port of the Presto cluster. If you follow the
+      [Setting up Presto](#setting-up-presto) section, the host is `localhost` and the port is
+      `8889`.
+
+      ```yaml
+      presto:
+        host: ""
+        port:
+      ```
+
+      :::{note}
+      Presto doesn't need to be running before you start CLP.
+      :::

 3. If you'd like to store your compressed logs on S3, follow the
    [using object storage](guides-using-object-storage/index.md) guide.
@@ -115,7 +148,7 @@ Using Presto with CLP requires:
 5. Start a Presto cluster by running:

     ```bash
-    docker compose up
+    docker compose up --detach
     ```

     * To use more than one Presto worker, you can use the `--scale` option as follows:
@@ -128,24 +161,22 @@ Using Presto with CLP requires:

 ### Stopping the Presto cluster

-To stop the Presto cluster, use CTRL + C.
+To stop the Presto cluster:
+
+```bash
+docker compose stop
+```

 To clean up the Presto cluster entirely:

 ```bash
-docker compose rm
+docker compose down
 ```

 ## Querying your logs through Presto

-To query your logs through Presto, you can use the Presto CLI:
-
-```bash
-docker compose exec presto-coordinator \
-  presto-cli \
-  --catalog clp \
-  --schema default
-```
+You can query your compressed logs in your browser from [CLP's UI](#querying-from-clps-ui), or
+from the command line using the [Presto CLI](#querying-from-the-presto-cli).

 Each dataset in CLP shows up as a table in Presto.
To show all available datasets:

@@ -179,6 +210,26 @@ contain the field `foo.bar`, you can query it using:
 SELECT foo.bar FROM default LIMIT 1;
 ```

+### Querying from CLP's UI
+
+CLP's UI should be available at [http://localhost:4000](http://localhost:4000) (if you changed
+`webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values).
+
+:::{note}
+The UI can only run one query at a time, and queries must not end with a `;`.
+:::
+
+### Querying from the Presto CLI
+
+To access the Presto CLI, navigate to the `tools/deployment/presto-clp` directory and run:
+
+```bash
+docker compose exec presto-coordinator \
+  presto-cli \
+  --catalog clp \
+  --schema default
+```
+
 ## Limitations

 The Presto CLP integration has the following limitations at present: