Commit 739563a

Merge pull request #2339 from ramalama-labs/metrics

Add benchmark metrics persistence

2 parents 45f1556 + 82843f9

21 files changed: +761 −58 lines

docs/ramalama-bench.1.md (3 additions, 0 deletions)

@@ -51,6 +51,9 @@
 process to be launched inside of the container. If an environment variable is
 specified without a value, the container engine checks the host environment
 for a value and set the variable only if it is set on the host.

+#### **--format**
+Set the output format of the benchmark results. Options include json and table (default: table).
+
 #### **--help**, **-h**
 show this help message and exit

docs/ramalama-benchmarks.1.md (new file, 46 additions)

% ramalama-benchmarks 1

## NAME
ramalama\-benchmarks - view and interact with historical benchmark results

## SYNOPSIS
**ramalama benchmarks** [*options*] *command* [*args*...]

## DESCRIPTION
View and interact with historical benchmark results.
Results are stored as newline-delimited JSON (JSONL) in a `benchmarks.jsonl` file.
The storage folder is shown in `ramalama benchmarks --help` and can be
overridden via `ramalama.benchmarks.storage_folder` in `ramalama.conf`.

## OPTIONS

#### **--help**, **-h**
show this help message and exit

## COMMANDS

#### **list**
list benchmark results

## LIST OPTIONS

#### **--limit**=LIMIT
limit number of results to display

#### **--offset**=OFFSET
offset for pagination (default: 0)

#### **--format**={table,json}
output format (table or json) (default: table)

## EXAMPLES

```
ramalama benchmarks list
```

## SEE ALSO
**[ramalama(1)](ramalama.1.md)**, **[ramalama-bench(1)](ramalama-bench.1.md)**, **[ramalama.conf(5)](ramalama.conf.5.md)**

## HISTORY
Jan 2026, Originally compiled by Ian Eaves <ian@ramalama.com>

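The man page above says results are stored as newline-delimited JSON (JSONL): one JSON object per line, appended over time. A minimal sketch of reading such a file — the field names (`model`, `tokens_per_sec`) are illustrative stand-ins, not the actual record schema from this commit:

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path: Path) -> list[dict]:
    """Parse a JSONL file: one JSON object per non-blank line."""
    records = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.strip():  # tolerate blank lines
            records.append(json.loads(line))
    return records


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "benchmarks.jsonl"
    # Append two hypothetical benchmark records, one per line.
    with store.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"model": "tiny.gguf", "tokens_per_sec": 42.0}) + "\n")
        fh.write(json.dumps({"model": "tiny.gguf", "tokens_per_sec": 43.5}) + "\n")
    print(len(read_jsonl(store)))  # → 2
```

Appending rather than rewriting is what makes JSONL a good fit here: each benchmark run adds lines without touching earlier results.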
docs/ramalama.1.md (1 addition, 0 deletions)

@@ -137,6 +137,7 @@ The default can be overridden in the ramalama.conf file.
 | Command | Description |
 | ------------------------------------------------- | ---------------------------------------------------------- |
 | [ramalama-bench(1)](ramalama-bench.1.md) |benchmark specified AI Model|
+| [ramalama-benchmarks(1)](ramalama-benchmarks.1.md)|view and interact with historical benchmark results|
 | [ramalama-chat(1)](ramalama-chat.1.md) |OpenAI chat with the specified REST API URL|
 | [ramalama-containers(1)](ramalama-containers.1.md)|list all RamaLama containers|
 | [ramalama-convert(1)](ramalama-convert.1.md) |convert AI Models from local storage to OCI Image|

docs/ramalama.conf (12 additions, 0 deletions)

@@ -221,10 +221,22 @@


 [ramalama.user]
+#
 # Suppress the interactive prompt when running on macOS with a Podman VM
 # that doesn't support GPU acceleration (e.g., applehv provider).
 # When set to true, RamaLama will automatically proceed without GPU support
 # instead of asking for confirmation.
 # Can also be set via the `RAMALAMA_USER__NO_MISSING_GPU_PROMPT` environment variable.
 #
+
+[ramalama.benchmarks]
+#storage_folder = <default store>/benchmarks
+#
+# Manually specify where to save benchmark results.
+# By default, results are stored under the default model store directory
+# in benchmarks/benchmarks.jsonl.
+# Changing `ramalama.store` does not update this; set storage_folder explicitly.
+
+
+[ramalama.user]
 #no_missing_gpu_prompt = false

docs/ramalama.conf.5.md (10 additions, 0 deletions)

@@ -267,6 +267,16 @@ Configuration settings for the openai hosted provider
 **api_key**=""

 Provider-specific API key used when invoking OpenAI-hosted transports. Overrides `RAMALAMA_API_KEY` when set.
+## RAMALAMA.BENCHMARKS TABLE
+The ramalama.benchmarks table contains benchmark-related settings.
+
+`[[ramalama.benchmarks]]`
+
+**storage_folder**="<default store>/benchmarks"
+
+Manually specify where to save benchmark results.
+By default, this will be stored in the default model store directory under `benchmarks/`.
+Changing `ramalama.store` does not update this; set `ramalama.benchmarks.storage_folder` explicitly if needed.

 ## RAMALAMA.USER TABLE
 The ramalama.user table contains user preference settings.

inference-spec/engines/llama.cpp.yaml (16 additions, 1 deletion)

@@ -117,7 +117,22 @@ commands:
     inference_engine:
       name: "llama-bench"
       binary: "llama-bench"
-      options: *bench_perplexity_options
+      options:
+        - name: "--model"
+          description: "The AI model to run"
+          value: "{{ model.model_path }}"
+        - name: "-ngl"
+          description: "Number of layers to offload to the GPU if available"
+          value: "{{ 999 if args.ngl < 0 else args.ngl }}"
+        - name: "-ngld"
+          description: "Number of layers to offload to the GPU if available"
+          value: "{{ None if not args.model_draft else 999 if args.ngl < 0 else args.ngl }}"
+        - name: "--threads"
+          description: "Number of Threads to use during generation"
+          value: "{{ args.threads }}"
+        - name: "-o"
+          description: "Output format printed to stdout"
+          value: "json"
   - name: rag
     inference_engine:
       name: "rag"

ramalama/benchmarks/errors.py (new file, 12 additions)

class MissingStorageFolderError(Exception):
    def __init__(self):
        message = """
No valid benchmarks storage folder could be determined.

Set an explicit path via:
  RAMALAMA__BENCHMARKS_STORAGE_FOLDER=/absolute/path/to/benchmarks

If this seems wrong for your setup, report it at:
  https://www.github.com/containers/ramalama/issues
"""
        super().__init__(message)

ramalama/benchmarks/manager.py (new file, 50 additions)

import json
import logging
from dataclasses import asdict
from functools import cached_property
from pathlib import Path

from ramalama.benchmarks.errors import MissingStorageFolderError
from ramalama.benchmarks.schemas import BenchmarkRecord, DeviceInfoV1, get_benchmark_record
from ramalama.benchmarks.utilities import parse_jsonl
from ramalama.config import CONFIG
from ramalama.log_levels import LogLevel

logger = logging.getLogger("ramalama.benchmarks")
logger.setLevel(CONFIG.log_level or LogLevel.WARNING)

SCHEMA_VERSION = 1
BENCHMARKS_FILENAME = "benchmarks.jsonl"


class BenchmarksManager:
    def __init__(self, storage_folder: str | Path | None):
        if storage_folder is None:
            raise MissingStorageFolderError

        self.storage_folder = Path(storage_folder)
        self.storage_file = self.storage_folder / BENCHMARKS_FILENAME
        self.storage_file.parent.mkdir(parents=True, exist_ok=True)

    @cached_property
    def device_info(self) -> DeviceInfoV1:
        return DeviceInfoV1.current_device_info()

    def save(self, results: list[BenchmarkRecord] | BenchmarkRecord):
        if not isinstance(results, list):
            results = [results]

        if len(results) == 0:
            return

        with self.storage_file.open("a", encoding="utf-8") as handle:
            for record in results:
                handle.write(json.dumps(asdict(record), ensure_ascii=True))
                handle.write("\n")

    def list(self) -> list[BenchmarkRecord]:
        """List benchmark results from JSONL storage."""
        if not self.storage_file.exists():
            return []
        content = self.storage_file.read_text(encoding="utf-8")
        return [get_benchmark_record(result) for result in parse_jsonl(content)]
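The save/list round trip above depends on `BenchmarkRecord`, `get_benchmark_record`, and `parse_jsonl`, which are not shown in this diff. A self-contained sketch of the same append-then-rebuild pattern, using a hypothetical stand-in dataclass in place of the real record schema:

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


# Stand-in for the real BenchmarkRecord from ramalama.benchmarks.schemas,
# which is not part of this diff; the fields are illustrative only.
@dataclass
class FakeRecord:
    model: str
    tokens_per_sec: float


with tempfile.TemporaryDirectory() as tmp:
    storage_file = Path(tmp) / "benchmarks.jsonl"

    # save(): serialize each record with asdict() and append one JSON
    # object per line, so earlier results are never rewritten.
    records = [FakeRecord("tiny.gguf", 42.0), FakeRecord("tiny.gguf", 43.5)]
    with storage_file.open("a", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(asdict(record), ensure_ascii=True))
            handle.write("\n")

    # list(): read the file back and rebuild a record from each line
    # (the real code delegates this to parse_jsonl/get_benchmark_record).
    loaded = [
        FakeRecord(**json.loads(line))
        for line in storage_file.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    print(loaded[1].tokens_per_sec)  # → 43.5
```

Opening the file in `"a"` mode is what makes `save` safe to call repeatedly across benchmark runs.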
