
Commit b3daa8a

Merge pull request #83 from miciav/feature/loki-grafana-main
Feature/loki grafana main
2 parents 6e48a93 + 1053be4 commit b3daa8a

170 files changed, with 8257 additions and 3721 deletions

QWEN.md

Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
# Linux Benchmark Library - QWEN Context

## Project Overview

The Linux Benchmark Library (LBB) is a robust and configurable Python library for benchmarking Linux computational node performance. It provides a layered architecture for orchestrating repeatable workloads, collecting rich metrics, and producing clean outputs.

### Key Features
- **Layered Architecture**: Runner, controller, app, UI, and analytics components
- **Workload Plugins**: Extensible via entry points and user plugin directory
- **Remote Orchestration**: Uses Ansible with run journaling
- **Organized Outputs**: Results, reports, and exports per run and host
- **Multiple Execution Modes**: Local, remote, Docker, and Multipass execution

### Project Structure
```
linux-benchmark-lib/
├── lb_runner/       # Runner (collectors, local execution helpers)
├── lb_controller/   # Orchestration and journaling
├── lb_app/          # Stable API for CLI/UI integrations
├── lb_ui/           # CLI/TUI implementation
├── lb_analytics/    # Reporting and analytics
├── lb_plugins/      # Workload plugins and registry
├── lb_provisioner/  # Docker/Multipass helpers
├── lb_common/       # Shared API helpers
└── tests/           # Unit and integration tests
```

## Architecture Components

### lb_runner
- Core benchmark execution engine
- Local metric collection system
- Plugin execution framework
- System information gathering

### lb_controller
- Remote orchestration engine
- Ansible integration
- Run journaling and state management
- Lifecycle management and interrupt handling

### lb_plugins
- Workload plugin system with built-in plugins (stress_ng, fio, dd, hpl, stream, etc.)
- Plugin registry and discovery system
- Configuration models for each workload type

### lb_ui
- Command-line interface (CLI) and text user interface (TUI)
- Typer-based command structure
- Configuration management

### lb_common
- Shared utilities and configuration helpers
- Logging and observability components
- Environment variable parsing

## Building and Running

### Installation
```bash
# Create virtual environment
uv venv

# Install in different modes
uv pip install -e .                    # runner only
uv pip install -e ".[ui]"              # CLI/TUI
uv pip install -e ".[controller]"      # Ansible + analytics
uv pip install -e ".[ui,controller]"   # full CLI
uv pip install -e ".[dev]"             # test + lint tools
uv pip install -e ".[docs]"            # mkdocs
```

### Switching Dependency Sets
```bash
bash scripts/switch_mode.sh base        # Base runner only
bash scripts/switch_mode.sh controller  # Full CLI with UI
bash scripts/switch_mode.sh headless    # Controller without UI
bash scripts/switch_mode.sh dev         # Development mode
```

### Quick Start (CLI)
```bash
# Initialize configuration
lb config init -i

# Enable a plugin and run
lb plugin list --enable stress_ng
lb run --remote --run-id demo-run

# Development Docker run
LB_ENABLE_TEST_CLI=1 lb run --docker --run-id demo-docker
```

### Quick Start (Python API)
```python
from lb_controller.api import (
    BenchmarkConfig,
    BenchmarkController,
    RemoteExecutionConfig,
    RemoteHostConfig,
)

config = BenchmarkConfig(
    repetitions=2,
    remote_hosts=[
        RemoteHostConfig(name="node1", address="192.168.1.10", user="ubuntu")
    ],
    remote_execution=RemoteExecutionConfig(enabled=True),
)

controller = BenchmarkController(config)
summary = controller.run(["stress_ng"], run_id="demo-run")
print(summary.per_host_output)
```

## Key APIs

### Runner API
- `LocalRunner`: Core local benchmark execution
- `BenchmarkConfig`: Configuration for benchmark runs
- `MetricCollectorConfig`: Configuration for metric collection
- `WorkloadConfig`: Configuration for individual workloads

### Controller API
- `BenchmarkController`: Remote orchestration controller
- `RunJournal`: Run state and journaling
- `RunLifecycle`: Run phase management
- `StopCoordinator`: Interrupt and stop handling

### Plugin API
- `WorkloadPlugin`: Base class for workload plugins
- `BasePluginConfig`: Base configuration for plugins
- `PluginRegistry`: Plugin discovery and management
- Various plugin-specific configs (StressNGConfig, FIOConfig, etc.)

## Development Conventions

### Logging Policy
- Configure logging via `lb_common.api.configure_logging()` in entrypoints
- `lb_ui` configures logging automatically; `lb_runner` and `lb_controller` do not
- Keep stdout clean for `LB_EVENT` streaming when integrating custom UIs
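
The stdout rule in the logging policy can be illustrated with the standard library alone. This is a minimal sketch, not the project's implementation: the signature of `lb_common.api.configure_logging()` is not shown in this file, so the sketch uses `logging.basicConfig` instead, and both helper names and the `LB_EVENT <json>` line shape are assumptions made for illustration.

```python
import json
import logging
import sys


def configure_stderr_logging(level: int = logging.INFO, stream=None) -> None:
    """Route log records to stderr (or the given stream), never to stdout."""
    logging.basicConfig(
        level=level,
        stream=stream if stream is not None else sys.stderr,
        format="%(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers installed earlier
    )


def emit_event(payload: dict, out=None) -> str:
    """Write one event line to stdout; the 'LB_EVENT <json>' shape is hypothetical."""
    line = "LB_EVENT " + json.dumps(payload, sort_keys=True)
    print(line, file=out if out is not None else sys.stdout, flush=True)
    return line


if __name__ == "__main__":
    configure_stderr_logging()
    logging.getLogger("custom_ui").info("starting run")  # goes to stderr
    emit_event({"phase": "run", "host": "node1"})         # goes to stdout
```

Keeping the two streams separate means a parent process can parse stdout line by line without filtering out log noise.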

### Testing
- Unit tests marked with specific markers (unit_runner, unit_controller, etc.)
- Integration tests with different levels (inter_generic, inter_docker, inter_multipass, etc.)
- Slow tests marked with `slow` and `slowest` markers

### Code Quality
- Uses mypy for type checking
- Black for code formatting
- Pytest for testing
- Various linting tools (flake8, vulture, etc.)

## Available Workload Plugins

The library includes several built-in workload plugins:
- **stress_ng**: CPU, memory, I/O stress testing
- **fio**: Flexible I/O tester
- **dd**: Basic disk I/O operations
- **hpl**: High Performance Linpack
- **stream**: Memory bandwidth test
- **sysbench**: System performance benchmark
- **geekbench**: Cross-platform benchmark
- **unixbench**: Unix system benchmark
- **yabs**: Yet Another Benchmark Suite
- **phoronix_test_suite**: Phoronix test framework

## Documentation and Resources

- Documentation site: https://miciav.github.io/linux-benchmark-lib/
- API reference: https://miciav.github.io/linux-benchmark-lib/api/
- Workloads & plugins: https://miciav.github.io/linux-benchmark-lib/plugins/
- Diagrams: https://miciav.github.io/linux-benchmark-lib/diagrams/

## CLI Commands

The main CLI provides several command groups:
- `lb config`: Configuration management
- `lb plugin`: Plugin management and listing
- `lb run`: Running benchmarks
- `lb provision`: Environment provisioning
- `lb runs`: Run history and management
- `lb doctor`: System checks and diagnostics

benchmark_config.dfaas_multipass.json

Lines changed: 9 additions & 6 deletions
@@ -2,7 +2,7 @@
   "remote_hosts": [
     {
       "name": "dfaas-target",
-      "address": "192.168.2.2",
+      "address": "192.168.2.4",
       "user": "ubuntu",
       "become": true,
       "vars": {
@@ -17,12 +17,12 @@
   },
   "plugin_settings": {
     "dfaas": {
-      "k6_host": "192.168.2.3",
+      "k6_host": "192.168.2.5",
       "k6_user": "ubuntu",
       "k6_ssh_key": "/home/ubuntu/.ssh/dfaas_k6_key",
       "k6_port": 22,
-      "gateway_url": "http://192.168.2.2:31112",
-      "prometheus_url": "http://192.168.2.2:30411",
+      "gateway_url": "http://{host.address}:31112",
+      "prometheus_url": "http://{host.address}:30411",
       "functions": [
         {
           "name": "env",
@@ -37,8 +37,11 @@
   },
   "workloads": {
     "dfaas": {
-      "plugin": "dfaas",
-      "enabled": true
+      "plugin": "dfaas"
     }
+  },
+  "loki": {
+    "enabled": true,
+    "endpoint": "http://192.168.2.1:3100"
   }
 }
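
The `{host.address}` placeholder introduced for `gateway_url` and `prometheus_url` reads like a Python format field. This diff does not show how the library expands it, so the following is only a sketch under the assumption that `str.format` semantics apply; the `Host` dataclass and the `resolve_urls` helper are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for one entry of remote_hosts."""
    name: str
    address: str


def resolve_urls(settings: dict, host: Host) -> dict:
    """Return a copy of settings with {host.*} fields expanded in string values."""
    resolved = {}
    for key, value in settings.items():
        if isinstance(value, str) and "{host." in value:
            # str.format supports attribute access inside replacement fields.
            resolved[key] = value.format(host=host)
        else:
            resolved[key] = value
    return resolved


settings = {
    "k6_port": 22,
    "gateway_url": "http://{host.address}:31112",
    "prometheus_url": "http://{host.address}:30411",
}
resolved = resolve_urls(settings, Host(name="dfaas-target", address="192.168.2.4"))
print(resolved["gateway_url"])  # http://192.168.2.4:31112
```

Templating the address this way lets the same plugin settings follow the target host when its IP changes, as it does in the first hunk.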

docs/cli.md

Lines changed: 5 additions & 2 deletions
@@ -37,6 +37,8 @@ Order used by commands that need a config:
   Run analytics on an existing run.
 - `lb plugin ...`
   Inspect and manage workload plugins.
+- `lb provision loki-grafana install|remove|status [--mode local|docker] [--grafana-url URL] [--grafana-api-key KEY] [--loki-endpoint URL] [--no-configure]`
+  Install/remove Loki + Grafana and configure datasources/dashboards.
 - `lb config ...`
   Create and manage benchmark configuration files.
 - `lb doctor ...`
@@ -46,10 +48,11 @@ Order used by commands that need a config:
 
 ## Plugin management (`lb plugin ...`)
 
-- `lb plugin list [--select] [--enable NAME | --disable NAME] [-c FILE] [--set-default]`
-- `lb plugin select [-c FILE] [--set-default]`
+- `lb plugin list [--select] [--enable NAME | --disable NAME]`
+- `lb plugin select`
 
 Running `lb plugin` with no subcommand is equivalent to `lb plugin list`.
+Plugin enablement is stored in the platform config (`~/.config/lb/platform.json`); workloads live in the run config.
 
 ## Config management (`lb config ...`)
 

docs/configuration.md

Lines changed: 30 additions & 6 deletions
@@ -1,6 +1,7 @@
 ## Configuration
 
-All knobs are defined in `BenchmarkConfig` (import from `lb_runner.api`).
+Runtime configuration is split between the run config (`BenchmarkConfig`) and the
+platform config (`PlatformConfig`) from `lb_runner.api`.
 
 ```python
 from pathlib import Path
@@ -26,7 +27,6 @@ config = BenchmarkConfig(
     workloads={
         "stress_ng": WorkloadConfig(
             plugin="stress_ng",
-            enabled=True,
             options={"cpu_workers": 4, "vm_workers": 2, "vm_bytes": "2G"},
         )
     },
@@ -38,19 +38,44 @@ config = BenchmarkConfig.load(Path("benchmark_config.json"))
 
 ### Notes
 
-- `workloads` is the primary map of workload names to configuration.
+- `workloads` is the primary map of workload names to configuration; presence implies execution.
 - `plugin_settings` can hold typed Pydantic configs for plugins; it is optional.
 - `plugin_assets` is populated from the plugin registry and captures setup/teardown playbooks plus extravars.
 - `output_dir`, `report_dir`, and `data_export_dir` control where artifacts are written.
 - `remote_execution.enabled` controls whether the controller uses Ansible to run workloads.
 - `remote_execution.upgrade_pip` toggles the pip upgrade step during global setup.
 - `workloads.<name>.intensity` accepts `low`, `medium`, `high`, or `user_defined`.
 
+### Platform vs Run Config
+
+The configuration model is split into two files:
+
+- **Platform config**: `~/.config/lb/platform.json`
+  - Holds environment-level settings (e.g. Loki endpoint/labels, Grafana URL/API key, output defaults).
+  - Includes only plugin enablement flags: `"plugins": { "dfaas": true, "fio": false }`.
+  - Optional `grafana` block stores connection defaults (`url`, `api_key`, `org_id`) for provisioning.
+  - Does **not** contain workload definitions or plugin configs.
+  - Never drives execution directly.
+
+- **Run config**: passed via `-c/--config` or `benchmark_config.json`
+  - Includes `remote_hosts` and workload definitions.
+  - Only workloads present in the file are considered runnable.
+  - Workloads do not use an `enabled` flag; presence implies execution.
+  - Experiment-specific plugin options live here (e.g. DFaaS rates/functions/iterations).
+
+Behavioral rules:
+- If a workload is present in the run config but disabled in the platform config,
+  it will be skipped with a warning in the run plan.
+- Provisioning choices (multipass/docker/remote) remain CLI-driven and do not
+  live in the platform config.
+- This is a breaking change: legacy run configs with `enabled` or platform-only
+  fields must be updated to the new split.
+
 ### Plugin settings vs workloads
 
 `workloads` drives execution and can include ad-hoc `options`. `plugin_settings` is
 the typed, validated config model for a plugin. The config service will hydrate
-`plugin_settings` and backfill `workloads` when missing.
+`plugin_settings`, but does not auto-create workloads.
 
 Example:
 
@@ -63,8 +88,7 @@ Example:
   },
   "workloads": {
     "fio": {
-      "plugin": "fio",
-      "enabled": true
+      "plugin": "fio"
     }
   }
 ```
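
The skip rule from the "Platform vs Run Config" notes (a workload present in the run config but disabled in the platform config is skipped with a warning) can be sketched in a few lines. This is not the library's planner: `plan_workloads` is a hypothetical helper, the dictionaries mirror only the JSON shapes shown in the docs, and treating a plugin that is absent from the platform map as enabled is an assumption.

```python
from typing import Dict, List, Tuple


def plan_workloads(
    run_workloads: Dict[str, dict],
    platform_plugins: Dict[str, bool],
) -> Tuple[List[str], List[str]]:
    """Split run-config workloads into (runnable, skipped) names.

    Assumption: a plugin missing from the platform map counts as enabled.
    """
    runnable: List[str] = []
    skipped: List[str] = []
    for name, workload in run_workloads.items():
        plugin = workload.get("plugin", name)
        if platform_plugins.get(plugin, True):
            runnable.append(name)
        else:
            skipped.append(name)
    return runnable, skipped


# Presence in the run config implies execution; there is no `enabled` flag here.
run_config = {"workloads": {"dfaas": {"plugin": "dfaas"}, "fio": {"plugin": "fio"}}}
# Platform config carries only plugin enablement flags.
platform_config = {"plugins": {"dfaas": True, "fio": False}}

runnable, skipped = plan_workloads(run_config["workloads"], platform_config["plugins"])
for name in skipped:
    print(f"warning: workload '{name}' is disabled in the platform config; skipping")
print("runnable:", runnable)
```

With the sample data, `dfaas` stays runnable and `fio` is reported as skipped, which matches the precedence described in the behavioral rules.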
