clp-bench: Refactor and Add New Results #5
Open
anlowee
wants to merge
90
commits into
y-scope:main
from
anlowee:xiaochong-refactor-and-add-new-results
Changes from 79 commits
d926979
:shirt: fixed ruff checks
anlowee f9d9e24
:memo: updated methodology
anlowee 3ae6fb4
:memo: canceled modification
anlowee 3e662e0
:shirt: fixed linting issues by black
anlowee 3b89387
:shirt: fixed spelling and line wrapping issues
anlowee 1847341
merge
anlowee 3d1b344
:shirt: idk what happened, some previous fixes were gone. Just readde…
anlowee 16920ad
:construction: wip
anlowee 629da6a
:tada: added MongoDB executor
anlowee a25a435
:shirt: fixed some coderabitai's suggestions
anlowee f56c994
:shirt: fixed coderabbitai suggestions
anlowee b0eb437
:shirt: fixed coderabbit suggestions
anlowee 6ed8020
:shirt: fixed coderabbit issues
anlowee f013db8
:construction: rebased from xiaochong-fix-ruff-check
anlowee 96520dd
:construction: merged
anlowee ce49181
:construction: wip, separting ingest and query
anlowee 845e7de
:tada: added MongoDB executor
anlowee f8f0e16
:construction: wip
anlowee 2701df2
Merge branch 'xiaochong-add-mongodb-benchmark-toolset' of https://git…
anlowee 68484fc
:sparkles: finished mongodb executor and results, but there are still…
anlowee 6459d1b
:memo: start working on mongodb document, but before that lets do som…
anlowee 8a5d1b8
:memo: initially updated the methodology for mongodb
anlowee fe24519
:construction: wip refactoring
anlowee 4eaa056
:bug: minor fixed
anlowee 89243e5
:bug: added dataset path argument
anlowee dd54161
:bug: minor fix
anlowee f320ae4
:bug: minor fix
anlowee 0e26d0a
:construction: wip
anlowee 64af2a3
:construction: wip
anlowee edb5a05
:tada: added MongoDB executor
anlowee dd386d8
:test_tube: added assets for clickhouse
anlowee 7d166bc
:construction: wip
anlowee 6411cea
:construction: wip
anlowee 049ee44
:construction: wip
anlowee efc0649
:construction: wip
anlowee 598997b
:construction: wip
anlowee 8167cf1
:test_tube: added clickhouse results
anlowee cf9cc18
:lipstick: changed coloar calculation algorithm to logarithmic
anlowee b98c897
Merge branch 'xiaochong-fix-ruff-checks' of github.com:anlowee/clp-be…
anlowee 501ed07
:tada: added MongoDB executor
anlowee 3aa5606
:construction: wip
anlowee 8073f53
:sparkles: finished mongodb executor and results, but there are still…
anlowee b88472c
:memo: start working on mongodb document, but before that lets do som…
anlowee df5e59f
:memo: initially updated the methodology for mongodb
anlowee 4c12584
:rocket: Merge branch 'xiaochong-add-mongodb-benchmark-toolset' of gi…
anlowee 895fb27
:rocket: Merge branch 'xiaochong-add-clickhouse-benchmark-toolset' in…
anlowee 338b448
:rocket: Merge branch 'xiaochong-add-mongodb-benchmark-toolset' into …
anlowee b439d42
:hammer: refactored mongodb
anlowee a7ca501
:construction: added clp-s
anlowee 3e9b573
:construction: wip-refactored semi-structure elasticsearch assets
anlowee d5587e2
:construction: wip-refactored glt
anlowee 292b86f
:construction: wip-elasticsearch unstructured finished
anlowee 095b483
:construction: wip-refactored loki
anlowee 4a88cde
:construction: wip-refactored grep
anlowee 6a005fa
:memo: updated pydoc
anlowee da5116a
:lipstick: formatted
anlowee dd67084
:fire: removed binaries
afdbb94
:memo: updated docs
anlowee 8e291ba
:rocket: Merge branch 'xiaochong-refactor-and-add-new-results' of git…
58cd0b2
:construction: wip-updated splunk asstes and refactered the results
3d88950
:memo: updated splunk docs
0ef5a94
:memo: updated docs
62f1150
:shirt: addressed a senior comment
3ee697d
:memo: fixed doc error
babe2fc
:lipstick: changed the color representing the worst to very dark red …
8e485ea
:construction: wip-addressing some comments from a senior
8fa08a1
:rocket: Merge branch 'xiaochong-refactor-and-add-new-results' of git…
dae42f6
:hammer: fixed naming issues
anlowee f90734b
:construction: updated splunk results; some linting fix
88dfafb
:construction: fixed splunk script bugs
c11f491
:lipstick: wrapped the lines in clickhouse assets
anlowee ff1b7c9
:lipstick: finished wrapping lines for semistructure assets
anlowee 6ca8007
:lipstick: fixed all wrapping lines and end with newline issue
anlowee 336dc8e
Format Markdown files with Prettier and some manual tweaks.
kirkrodrigues 6136f93
Edit README.md and associated files.
kirkrodrigues 08f1f75
Sentence case for headings; minor edit.
kirkrodrigues 6c848c2
Alphabetize env vars; Add newline at end of file.
kirkrodrigues d040292
:memo: fixed some lint issues of markdown
af15244
Minor edits.
kirkrodrigues 5cd9997
:construction: wip
dae3cce
:construction: wip
552f43f
:construction: wip
2cfb374
:construction: wip
838fbb3
:construction: wip
66f605e
:lipstick: finished
e064936
:bug: fixed semi-structured Elasticsearch methodology link
45d2419
:shirt: fixed some comments from a senior
2fb824e
:construction: fixed some commnets but some comments need further det…
c867307
:construction: fixed the rest of comments
44d7ff2
:lipstick: fixed the mermaid issue
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,23 +1,116 @@ | ||
| # clp-bench | ||
| clp-bench is a tool for benchmarking [CLP] as well as other log management tools. The tool itself is | ||
| a Python package, and we also provide a [web interface][ui] for viewing results. | ||
|
|
||
| The methodology for the benchmarks is described [here](docs/methodology.md). | ||
| **clp-bench** is a benchmarking tool designed for [CLP] and other log management systems. It | ||
| functions as a Python package and includes a [web interface][ui] for displaying benchmark results. | ||
|
|
||
| For a detailed description of the benchmarking methodology, see | ||
| [this document](docs/methodology.md). | ||
|
||
|
|
||
| ## Requirements | ||
|
|
||
| * Docker | ||
| * Python v3.10 or higher | ||
| - Docker | ||
| - Python v3.10 or higher | ||
|
|
||
| # Set up | ||
| # Setup | ||
|
|
||
| ```shell | ||
| python3 -m venv venv | ||
| . venv/bin/activate | ||
| pip install -e . | ||
| ``` | ||
|
|
||
| You can use `clp-bench --help` to see usage instructions. | ||
| To view usage instructions, run `clp-bench --help`. | ||
|
||
|
|
||
| # Contributing | ||
|
|
||
| 🚧 This section is under construction. | ||
|
|
||
| We encourage contributions that add benchmark results for various tools to support broader community | ||
| development. | ||
|
|
||
| ## Adding new results | ||
|
|
||
| To benchmark a new system, duplicate one of the directories in [assets] and update the following | ||
|
||
| files: | ||
|
|
||
| - **`config.yaml`**: Contains essential benchmarking configurations: | ||
|
||
|
|
||
| - **`system_metric.enable`**: Toggle to enable system metric monitoring (e.g., memory usage). Set | ||
| to `true` to activate. | ||
| - **`system_metric.memory.ingest_polling_interval`**: Time interval (in seconds) for polling | ||
| memory during data ingestion. | ||
| - **`system_metric.memory.run_query_benchmark_polling_interval`**: Time interval (in seconds) for | ||
| polling memory during query benchmarking. | ||
| - **`container_id`**: Identifier for the benchmark container. Usually `${tool}-clp-bench`. | ||
| - **`assets_path`**: Path to the assets directory in the container. Leave as default unless | ||
| modifying `docker-run.sh` (described below). | ||
| - **`datasets_path`**: Path for datasets in the container; may refer to a file, directory, or file | ||
| pattern. clp-bench does not validate the dataset's presence. | ||
| - **`hot_run_warm_up_times`**: Number of repetitions for query warm-up in hot-run mode before | ||
| measuring latency. This may be automated in the future. | ||
| - **`related_processes`**: List of command substrings (from `ps aux`) to track relevant memory | ||
| usage. | ||
| - **`queries`**: Array of queries for benchmarking. Ensure escape characters are carefully | ||
| handled. | ||
|
|
||
| - **`docker-build.sh`**: Builds the container as per the `Dockerfile` in the same directory. | ||
| Usually, only the `container_name` variable should be adjusted to match the `container_id` in | ||
| `config.yaml`. | ||
|
|
||
| - **`docker-run.sh`**: Runs the container, taking the dataset path as an argument. Typically, only | ||
| the `container_name` variable needs alignment with `container_id` in `config.yaml`. | ||
|
|
||
| - **`Dockerfile`**: Used for building the container, ensuring installation of the required tool and | ||
| dependencies. | ||
|
|
||
| - **`launch-script.sh`**: Initializes and starts the tool (e.g., if it functions as a server or | ||
| service). | ||
|
|
||
| - **`reset-script.sh`**: Prepares a clean environment by removing previous data (e.g., dropping | ||
| tables); runs after `launch-script.sh` in `ingest` mode. | ||
|
|
||
| - **`measure-decompressed-size-script.sh`**: Measures the raw dataset size before ingestion. | ||
| Typically unchanged, it takes `datasets_path` from `config.yaml` and uses `du -bc` for size | ||
| calculation in bytes. | ||
|
|
||
| - **`ingest-script.sh`**: Handles data ingestion, with clp-bench measuring the total latency of this | ||
| script. Avoid adding extra operations. | ||
|
|
||
| - **`measure-compressed-size-script.sh`**: Measures the compressed data size post-ingestion, usually | ||
| via tool-specific methods. | ||
|
|
||
| - **`search-script.sh`**: Executes queries specified in `config.yaml`. clp-bench supports two | ||
| benchmarking modes: | ||
|
|
||
| - **Hot-run mode**: Runs the queries `hot_run_warm_up_times` times to warm up the cache, then | ||
| measures latency. | ||
| - **Cold-run mode**: Clears the cache with `clear-cache-script.sh` before measuring latency. | ||
|
|
||
| - **`clear-cache-script.sh`**: Clears the tool's cache, essential for cold runs. | ||
|
|
||
| - **`methodology.md`**: Describes specific benchmarking setup details, including tuning and dataset | ||
| preprocessing. | ||
|
|
||
| - **`results.json`**: Contains benchmarking results, which are loaded and displayed in the UI: | ||
|
|
||
| - **`target`**: The ID used by the frontend; it should be lowercase. IDs of the same type must be | ||
| unique. | ||
| - **`targetDisplayedName`**: The name to display in the column on the webpage. | ||
| - **`displayedOrder`**: Defines the display order of results; a smaller value places the column | ||
| further to the right. | ||
| - **`isEnable`**: Indicates if the results should be displayed (default is `true`). If set to | ||
| `false`, results won't appear on the webpage. | ||
| - **`type`**: Specifies the data type (1 for unstructured, 2 for semi-structured). | ||
| - **`ingestTime`**: Total end-to-end time taken to ingest the entire dataset. | ||
| - **`compressedSize`**: The size of compressed archives. | ||
| - **`avgIngestMem`**: The average memory used during ingestion. | ||
| - **`metrics`**: An array of query benchmarking results for each metric: | ||
|
|
||
| - **`metric`**: Specifies the type (1 for Hot run, 2 for Cold run). | ||
| - **`avgQueryMem`**: The average memory usage during query benchmarking. | ||
| - **`queryTimes`**: An array of end-to-end query latencies, ordered to match the sequence of | ||
| queries. | ||
|
|
||
| [assets]: assets | ||
| [CLP]: https://github.com/y-scope/clp | ||
| [ui]: ui | ||
| [ui]: ui | ||
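
The `results.json` fields listed in the diff above lend themselves to a quick sanity check before new results are committed. The following is a minimal Python sketch, not part of clp-bench: `check_result`, `TOP_LEVEL_FIELDS`, `METRIC_FIELDS`, and every value in `EXAMPLE` are hypothetical, the expected types are assumptions, and the rule that `queryTimes` holds one latency per query in `config.yaml` is inferred from the README's wording rather than stated by it.

```python
"""Sketch: sanity-check one results object against the fields the README lists."""

# Field names come from the README; the expected Python types are assumptions.
NUMBER = (int, float)
TOP_LEVEL_FIELDS = {
    "target": str,
    "targetDisplayedName": str,
    "displayedOrder": int,
    "type": int,
    "ingestTime": NUMBER,
    "compressedSize": NUMBER,
    "avgIngestMem": NUMBER,
    "metrics": list,
}
METRIC_FIELDS = {
    "metric": int,
    "avgQueryMem": NUMBER,
    "queryTimes": list,
}


def _check_fields(obj, spec, where):
    """Report fields from `spec` that are missing from `obj` or have an unexpected type."""
    problems = []
    for name, expected in spec.items():
        if name not in obj:
            problems.append(f"{where}: missing field {name!r}")
        elif not isinstance(obj[name], expected):
            problems.append(f"{where}: unexpected type for {name!r}")
    return problems


def check_result(entry, num_queries):
    """Return a list of problems; an empty list means none were found."""
    problems = _check_fields(entry, TOP_LEVEL_FIELDS, "entry")
    if problems:
        return problems
    if entry["target"] != entry["target"].lower():
        problems.append("entry: 'target' should be lowercase")
    if entry["type"] not in (1, 2):
        problems.append("entry: 'type' should be 1 or 2")
    # isEnable is treated as optional because the README gives it a default of true.
    if not isinstance(entry.get("isEnable", True), bool):
        problems.append("entry: 'isEnable' should be a boolean")
    for i, metric in enumerate(entry["metrics"]):
        where = f"metrics[{i}]"
        if not isinstance(metric, dict):
            problems.append(f"{where}: should be an object")
            continue
        metric_problems = _check_fields(metric, METRIC_FIELDS, where)
        problems.extend(metric_problems)
        if metric_problems:
            continue
        if metric["metric"] not in (1, 2):
            problems.append(f"{where}: 'metric' should be 1 (hot run) or 2 (cold run)")
        # Assumption: one latency per query listed in config.yaml.
        if len(metric["queryTimes"]) != num_queries:
            problems.append(f"{where}: 'queryTimes' should have {num_queries} entries")
    return problems


# Hypothetical values for illustration only; these are not measured results.
EXAMPLE = {
    "target": "mytool",
    "targetDisplayedName": "MyTool",
    "displayedOrder": 1,
    "isEnable": True,
    "type": 2,
    "ingestTime": 12.5,
    "compressedSize": 1024,
    "avgIngestMem": 2048,
    "metrics": [
        {"metric": 1, "avgQueryMem": 512, "queryTimes": [0.1, 0.2]},
        {"metric": 2, "avgQueryMem": 512, "queryTimes": [0.3, 0.4]},
    ],
}

if __name__ == "__main__":
    print(check_result(EXAMPLE, num_queries=2))
```

To check a real file, the object could be loaded with `json.load` and passed to `check_result` together with the number of queries in the matching `config.yaml`. The sketch returns a list of messages instead of raising so that every problem in a file is reported in one pass.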