Open
Changes from 79 commits
Commits
90 commits
d926979
:shirt: fixed ruff checks
anlowee Oct 21, 2024
f9d9e24
:memo: updated methodology
anlowee Oct 21, 2024
3ae6fb4
:memo: canceled modification
anlowee Oct 21, 2024
3e662e0
:shirt: fixed linting issues by black
anlowee Oct 21, 2024
3b89387
:shirt: fixed spelling and line wrapping issues
anlowee Oct 23, 2024
1847341
merge
anlowee Oct 24, 2024
3d1b344
:shirt: idk what happened, some previous fixes were gone. Just readde…
anlowee Oct 24, 2024
16920ad
:construction: wip
anlowee Oct 24, 2024
629da6a
:tada: added MongoDB executor
anlowee Oct 24, 2024
a25a435
:shirt: fixed some coderabitai's suggestions
anlowee Oct 24, 2024
f56c994
:shirt: fixed coderabbitai suggestions
anlowee Oct 25, 2024
b0eb437
:shirt: fixed coderabbit suggestions
anlowee Oct 25, 2024
6ed8020
:shirt: fixed coderabbit issues
anlowee Oct 25, 2024
f013db8
:construction: rebased from xiaochong-fix-ruff-check
anlowee Oct 24, 2024
96520dd
:construction: merged
anlowee Oct 25, 2024
ce49181
:construction: wip, separting ingest and query
anlowee Oct 27, 2024
845e7de
:tada: added MongoDB executor
anlowee Oct 24, 2024
f8f0e16
:construction: wip
anlowee Oct 27, 2024
2701df2
Merge branch 'xiaochong-add-mongodb-benchmark-toolset' of https://git…
anlowee Oct 27, 2024
68484fc
:sparkles: finished mongodb executor and results, but there are still…
anlowee Oct 28, 2024
6459d1b
:memo: start working on mongodb document, but before that lets do som…
anlowee Oct 28, 2024
8a5d1b8
:memo: initially updated the methodology for mongodb
anlowee Oct 28, 2024
fe24519
:construction: wip refactoring
anlowee Oct 29, 2024
4eaa056
:bug: minor fixed
anlowee Oct 29, 2024
89243e5
:bug: added dataset path argument
anlowee Oct 29, 2024
dd54161
:bug: minor fix
anlowee Oct 29, 2024
f320ae4
:bug: minor fix
anlowee Oct 29, 2024
0e26d0a
:construction: wip
anlowee Oct 29, 2024
64af2a3
:construction: wip
anlowee Oct 29, 2024
edb5a05
:tada: added MongoDB executor
anlowee Oct 24, 2024
dd386d8
:test_tube: added assets for clickhouse
anlowee Oct 29, 2024
7d166bc
:construction: wip
anlowee Oct 29, 2024
6411cea
:construction: wip
anlowee Oct 29, 2024
049ee44
:construction: wip
anlowee Oct 29, 2024
efc0649
:construction: wip
anlowee Oct 30, 2024
598997b
:construction: wip
anlowee Oct 30, 2024
8167cf1
:test_tube: added clickhouse results
anlowee Oct 30, 2024
cf9cc18
:lipstick: changed coloar calculation algorithm to logarithmic
anlowee Oct 30, 2024
b98c897
Merge branch 'xiaochong-fix-ruff-checks' of github.com:anlowee/clp-be…
anlowee Oct 30, 2024
501ed07
:tada: added MongoDB executor
anlowee Oct 24, 2024
3aa5606
:construction: wip
anlowee Oct 27, 2024
8073f53
:sparkles: finished mongodb executor and results, but there are still…
anlowee Oct 28, 2024
b88472c
:memo: start working on mongodb document, but before that lets do som…
anlowee Oct 28, 2024
df5e59f
:memo: initially updated the methodology for mongodb
anlowee Oct 28, 2024
4c12584
:rocket: Merge branch 'xiaochong-add-mongodb-benchmark-toolset' of gi…
anlowee Nov 3, 2024
895fb27
:rocket: Merge branch 'xiaochong-add-clickhouse-benchmark-toolset' in…
anlowee Nov 3, 2024
338b448
:rocket: Merge branch 'xiaochong-add-mongodb-benchmark-toolset' into …
anlowee Nov 3, 2024
b439d42
:hammer: refactored mongodb
anlowee Nov 3, 2024
a7ca501
:construction: added clp-s
anlowee Nov 3, 2024
3e9b573
:construction: wip-refactored semi-structure elasticsearch assets
anlowee Nov 4, 2024
d5587e2
:construction: wip-refactored glt
anlowee Nov 4, 2024
292b86f
:construction: wip-elasticsearch unstructured finished
anlowee Nov 4, 2024
095b483
:construction: wip-refactored loki
anlowee Nov 4, 2024
4a88cde
:construction: wip-refactored grep
anlowee Nov 4, 2024
6a005fa
:memo: updated pydoc
anlowee Nov 4, 2024
da5116a
:lipstick: formatted
anlowee Nov 4, 2024
dd67084
:fire: removed binaries
Nov 5, 2024
afdbb94
:memo: updated docs
anlowee Nov 6, 2024
8e291ba
:rocket: Merge branch 'xiaochong-refactor-and-add-new-results' of git…
Nov 6, 2024
58cd0b2
:construction: wip-updated splunk asstes and refactered the results
Nov 6, 2024
3d88950
:memo: updated splunk docs
Nov 6, 2024
0ef5a94
:memo: updated docs
Nov 6, 2024
62f1150
:shirt: addressed a senior comment
Nov 6, 2024
3ee697d
:memo: fixed doc error
Nov 6, 2024
babe2fc
:lipstick: changed the color representing the worst to very dark red …
Nov 6, 2024
8e485ea
:construction: wip-addressing some comments from a senior
Nov 7, 2024
8fa08a1
:rocket: Merge branch 'xiaochong-refactor-and-add-new-results' of git…
Nov 7, 2024
dae42f6
:hammer: fixed naming issues
anlowee Nov 7, 2024
f90734b
:construction: updated splunk results; some linting fix
Nov 7, 2024
88dfafb
:construction: fixed splunk script bugs
Nov 7, 2024
c11f491
:lipstick: wrapped the lines in clickhouse assets
anlowee Nov 8, 2024
ff1b7c9
:lipstick: finished wrapping lines for semistructure assets
anlowee Nov 8, 2024
6ca8007
:lipstick: fixed all wrapping lines and end with newline issue
anlowee Nov 8, 2024
336dc8e
Format Markdown files with Prettier and some manual tweaks.
kirkrodrigues Nov 11, 2024
6136f93
Edit README.md and associated files.
kirkrodrigues Nov 11, 2024
08f1f75
Sentence case for headings; minor edit.
kirkrodrigues Nov 11, 2024
6c848c2
Alphabetize env vars; Add newline at end of file.
kirkrodrigues Nov 11, 2024
d040292
:memo: fixed some lint issues of markdown
Nov 11, 2024
af15244
Minor edits.
kirkrodrigues Nov 11, 2024
5cd9997
:construction: wip
Nov 11, 2024
dae3cce
:construction: wip
Nov 11, 2024
552f43f
:construction: wip
Nov 13, 2024
2cfb374
:construction: wip
Nov 14, 2024
838fbb3
:construction: wip
Nov 14, 2024
66f605e
:lipstick: finished
Nov 14, 2024
e064936
:bug: fixed semi-structured Elasticsearch methodology link
Nov 17, 2024
45d2419
:shirt: fixed some comments from a senior
Nov 21, 2024
2fb824e
:construction: fixed some commnets but some comments need further det…
Nov 25, 2024
c867307
:construction: fixed the rest of comments
Dec 4, 2024
44d7ff2
:lipstick: fixed the mermaid issue
Jan 16, 2025
109 changes: 101 additions & 8 deletions README.md
@@ -1,23 +1,116 @@
# clp-bench
clp-bench is a tool for benchmarking [CLP] as well as other log management tools. The tool itself is
a Python package, and we also provide a [web interface][ui] for viewing results.

The methodology for the benchmarks is described [here](docs/methodology.md).
**clp-bench** is a benchmarking tool designed for [CLP] and other log management systems. It
functions as a Python package and includes a [web interface][ui] for displaying benchmark results.

For a detailed description of the benchmarking methodology, see
[this document](docs/methodology.md).

## Requirements

* Docker
* Python v3.10 or higher
- Docker
- Python v3.10 or higher

# Set up
# Setup

```shell
python3 -m venv venv
. venv/bin/activate
pip install -e .
```

You can use `clp-bench --help` to see usage instructions.
To view usage instructions, run `clp-bench --help`.

# Contributing

🚧 This section is under construction.

We encourage contributions that add benchmark results for various tools to support broader community
development.

## Adding new results

To benchmark a new system, duplicate one of the directories in [assets] and update the following
files:

- **`config.yaml`**: Contains essential benchmarking configurations:

  - **`system_metric.enable`**: Toggle to enable system metric monitoring (e.g., memory usage). Set
    to `true` to activate.
  - **`system_metric.memory.ingest_polling_interval`**: Time interval (in seconds) for polling
    memory during data ingestion.
  - **`system_metric.memory.run_query_benchmark_polling_interval`**: Time interval (in seconds) for
    polling memory during query benchmarking.
  - **`container_id`**: Identifier for the benchmark container. Usually `${tool}-clp-bench`.
  - **`assets_path`**: Path to the assets directory in the container. Leave as default unless
    modifying `docker-run.sh` (described below).
  - **`datasets_path`**: Path for datasets in the container; may refer to a file, directory, or file
    pattern. clp-bench does not validate the dataset's presence.
  - **`hot_run_warm_up_times`**: Number of repetitions for query warm-up in hot-run mode before
    measuring latency. This may be automated in the future.
  - **`related_processes`**: List of command substrings (from `ps aux`) to track relevant memory
    usage.
  - **`queries`**: Array of queries for benchmarking. Ensure escape characters are carefully
    handled.

- **`docker-build.sh`**: Builds the container as per the `Dockerfile` in the same directory.
Usually, only the `container_name` variable should be adjusted to match the `container_id` in
`config.yaml`.

- **`docker-run.sh`**: Runs the container, taking the dataset path as an argument. Typically, only
the `container_name` variable needs alignment with `container_id` in `config.yaml`.

- **`Dockerfile`**: Used for building the container, ensuring installation of the required tool and
dependencies.

- **`launch-script.sh`**: Initializes and starts the tool (e.g., if it functions as a server or
service).

- **`reset-script.sh`**: Prepares a clean environment by removing previous data (e.g., dropping
tables); runs after `launch-script.sh` in `ingest` mode.

- **`measure-decompressed-size-script.sh`**: Measures the raw dataset size before ingestion.
Typically unchanged, it takes `datasets_path` from `config.yaml` and uses `du -bc` for size
calculation in bytes.

- **`ingest-script.sh`**: Handles data ingestion, with clp-bench measuring the total latency of this
script. Avoid adding extra operations.

- **`measure-compressed-size-script.sh`**: Measures the compressed data size post-ingestion, usually
via tool-specific methods.

- **`search-script.sh`**: Executes queries specified in `config.yaml`. clp-bench supports two
benchmarking modes:

  - **Hot-run mode**: Runs the queries `hot_run_warm_up_times` times to warm up the cache, then
    measures latency.
  - **Cold-run mode**: Clears the cache with `clear-cache-script.sh` before measuring latency.

- **`clear-cache-script.sh`**: Clears the tool's cache, essential for cold runs.

- **`methodology.md`**: Describes specific benchmarking setup details, including tuning and dataset
preprocessing.

- **`results.json`**: Contains benchmarking results, which are loaded and displayed in the UI:

  - **`target`**: The ID used by the frontend; it should be lowercase. IDs of the same type must be
    unique.
  - **`targetDisplayedName`**: The name to display in the column on the webpage.
  - **`displayedOrder`**: Defines the display order of results; a smaller value places the column
    further to the right.
  - **`isEnable`**: Indicates if the results should be displayed (default is `true`). If set to
    `false`, results won't appear on the webpage.
  - **`type`**: Specifies the data type (1 for Unstructured, 2 for Semi-structured).
  - **`ingestTime`**: Total end-to-end time taken to ingest all dataset data.
  - **`compressedSize`**: The size of compressed archives.
  - **`avgIngestMem`**: The average memory used during ingestion.
  - **`metrics`**: An array of query benchmarking results for each metric:

    - **`metric`**: Specifies the type (1 for Hot run, 2 for Cold run).
    - **`avgQueryMem`**: The average memory usage during query benchmarking.
    - **`queryTimes`**: An array of end-to-end query latencies, ordered to match the sequence of
      queries.
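
As a rough illustration of the `results.json` fields listed above, the following sketch builds one
results object and checks it against the constraints the list states. It assumes the file holds a
single object with these keys. The `check_result` helper, the `exampletool` target, and every
numeric value are hypothetical placeholders, not measured results and not part of clp-bench. The
list above does not state units, so none are implied here.

```python
import json

# Keys described in the list above. "isEnable" is optional because it defaults to true.
REQUIRED_KEYS = {
    "target", "targetDisplayedName", "displayedOrder", "type",
    "ingestTime", "compressedSize", "avgIngestMem", "metrics",
}
METRIC_KEYS = {"metric", "avgQueryMem", "queryTimes"}

# Hypothetical example; all values are placeholders, not benchmark results.
EXAMPLE_RESULT = {
    "target": "exampletool",
    "targetDisplayedName": "Example Tool",
    "displayedOrder": 1,
    "isEnable": True,
    "type": 2,  # 1 = unstructured, 2 = semi-structured
    "ingestTime": 0,
    "compressedSize": 0,
    "avgIngestMem": 0,
    "metrics": [
        {"metric": 1, "avgQueryMem": 0, "queryTimes": [0, 0]},  # hot run
        {"metric": 2, "avgQueryMem": 0, "queryTimes": [0, 0]},  # cold run
    ],
}


def check_result(result, num_queries):
    """Return a list of problems found in a results.json-shaped dict (hypothetical helper)."""
    problems = []
    missing = REQUIRED_KEYS - set(result)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    target = result.get("target", "")
    if target != target.lower():
        problems.append("target should be lowercase")
    if result.get("type") not in (1, 2):
        problems.append("type must be 1 (unstructured) or 2 (semi-structured)")
    for entry in result.get("metrics", []):
        if METRIC_KEYS - set(entry):
            problems.append("metrics entry is missing keys")
            continue
        if entry["metric"] not in (1, 2):
            problems.append("metric must be 1 (hot run) or 2 (cold run)")
        # queryTimes is ordered to match the queries in config.yaml, one latency per query.
        if len(entry["queryTimes"]) != num_queries:
            problems.append("queryTimes must have one latency per query")
    return problems


if __name__ == "__main__":
    print(json.dumps(EXAMPLE_RESULT, indent=2))
    print(check_result(EXAMPLE_RESULT, num_queries=2))
```

A check like this could be run before committing a new `results.json`, with `num_queries` set to
the length of the `queries` array in the matching `config.yaml`.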

[assets]: assets
[CLP]: https://github.com/y-scope/clp
[ui]: ui
[ui]: ui
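
The hot-run and cold-run modes described under `search-script.sh` can be sketched as a small timing
loop. This is a minimal sketch under stated assumptions, not the clp-bench implementation: the
function `measure_query_latencies` is hypothetical, `run_query` stands in for invoking
`search-script.sh` with one query, `clear_cache` stands in for `clear-cache-script.sh`, and warming
up each query separately is a guess (the README does not say whether warm-up is per query or per
query set).

```python
import time
from typing import Callable, List


def measure_query_latencies(
    queries: List[str],
    run_query: Callable[[str], None],
    clear_cache: Callable[[], None],
    warm_up_times: int,
    cold: bool,
) -> List[float]:
    """Return one latency in seconds per query, in the same order as `queries`."""
    latencies = []
    for query in queries:
        if cold:
            # Cold run: clear the cache so the timed run cannot reuse cached data.
            clear_cache()
        else:
            # Hot run: repeat the query (untimed) to warm up the cache first.
            for _ in range(warm_up_times):
                run_query(query)
        start = time.perf_counter()
        run_query(query)
        latencies.append(time.perf_counter() - start)
    return latencies


if __name__ == "__main__":
    # Stand-in callables; a real run would invoke the scripts inside the container.
    hot = measure_query_latencies(
        ["query-1", "query-2"],
        run_query=lambda q: None,
        clear_cache=lambda: None,
        warm_up_times=3,
        cold=False,
    )
    print(len(hot))
```

Passing the scripts in as callables keeps the timing logic independent of how the tool is launched,
which makes the loop easy to exercise with stand-ins before pointing it at a real container.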
93 changes: 0 additions & 93 deletions assets/elasticsearch-unstructured/compress.py

This file was deleted.

12 changes: 0 additions & 12 deletions assets/elasticsearch-unstructured/docker_build.sh

This file was deleted.

21 changes: 0 additions & 21 deletions assets/elasticsearch-unstructured/docker_run.sh

This file was deleted.

30 changes: 0 additions & 30 deletions assets/elasticsearch-unstructured/ela-config.yaml

This file was deleted.

11 changes: 0 additions & 11 deletions assets/elasticsearch-unstructured/poll_mem.py

This file was deleted.

45 changes: 0 additions & 45 deletions assets/elasticsearch-unstructured/query.py

This file was deleted.

12 changes: 0 additions & 12 deletions assets/elasticsearch-unstructured/start-ela.sh

This file was deleted.

3 changes: 0 additions & 3 deletions assets/elasticsearch-unstructured/stop-ela.sh

This file was deleted.
