Because a static browser environment cannot easily list files without a server-side index, we will use either a Vite feature or a simple manifest approach.
* **Approach:** We will use `import.meta.glob('/public/data/*.json')`, provided by Vite, to load all JSON files at build time/runtime. This makes adding data as simple as dropping a file into the folder.

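The plan also mentions a simple manifest approach as an alternative to the glob import. A minimal sketch of that idea follows, assuming a small script writes an index file that the site can fetch at runtime. The function name `build_manifest`, the file name `manifest.json`, and the `{"files": [...]}` layout are illustrative assumptions, not part of the project.

```python
import json
from pathlib import Path


def build_manifest(data_dir, manifest_name="manifest.json"):
    """Write a sorted index of the JSON files in data_dir and return the file names.

    Hypothetical helper: the manifest name and layout are assumptions.
    """
    data_path = Path(data_dir)
    # Skip the manifest itself so repeated runs produce the same list.
    files = sorted(
        path.name for path in data_path.glob("*.json") if path.name != manifest_name
    )
    manifest_path = data_path / manifest_name
    manifest_path.write_text(json.dumps({"files": files}, indent=2) + "\n")
    return files
```

Running `build_manifest("public/data")` before the build step would let the front end fetch the manifest and then request each listed file, at the cost of one extra step whenever data is added.
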
### Deployment Strategy
We will create a `.github/workflows/deploy.yml` that:
1. Triggers on push to `main`.
2. Installs dependencies (`npm ci`).
3. Builds the site (`npm run build`).
4. Deploys the `dist/` folder to the `gh-pages` branch.
README.md (+24 lines changed: 24 additions & 0 deletions)
@@ -106,6 +106,8 @@ Options:
  -m, --model TEXT        Model identifier (e.g., 'llama3:latest') [required]
  -t, --test-set TEXT     Testset name or stem (e.g., 'easy', 'testset_easy') [default: easy]
  -o, --output TEXT       Output file path for results [default: results/output.json]
  -d, --device-name TEXT  Override auto-detected device name
  --device-type TEXT      Override device type (apple, nvidia, amd, intel, unknown)
  --help                  Show this message and exit

The runner looks for `testset_<name>.json` inside `src/data/testsets/`. Passing `easy`, `testset_easy`, or the exact filename stem resolves to the same file.
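
The name resolution described above can be sketched as follows. This is an illustration only: `resolve_testset_path` is a hypothetical helper, and the runner's actual lookup code is not shown in this diff.

```python
from pathlib import Path

# Directory and prefix taken from the README text above.
TESTSET_DIR = Path("src/data/testsets")
TESTSET_PREFIX = "testset_"


def resolve_testset_path(name, testset_dir=TESTSET_DIR):
    """Map a short name or a full stem to the testset file path.

    Hypothetical helper: 'easy' and 'testset_easy' both map to
    testset_easy.json inside testset_dir.
    """
    stem = name if name.startswith(TESTSET_PREFIX) else f"{TESTSET_PREFIX}{name}"
    return Path(testset_dir) / f"{stem}.json"
```

Normalizing to the full stem first keeps the lookup to a single path, so both spellings cannot drift apart.
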
@@ -216,6 +218,13 @@ Results are exported as JSON and validated against a strict schema:
    "region": "unknown",
    "notice": null,
    "sampling_ms": 100,
    "device_name": "Apple M1 Pro Mac (MacBookPro18,3)",
    "device_type": "apple",
    "os_name": "macOS",
    "os_version": "14.2",
    "cpu_model": "Apple M1 Pro",
    "ram_gb": 16.0,
    "chip_architecture": "arm64",
    "testset_id": "testset_easy",
    "testset_name": "Easy Baseline",
    "question_id": "easy-1",
@@ -225,6 +234,21 @@ Results are exported as JSON and validated against a strict schema:
]
```

### Required Device Fields

Every result now includes mandatory device information for accurate hardware comparison:

| Field | Description | Example |
|-------|-------------|---------|
| `device_name` | Human-readable device name | "Apple M1 Pro Mac" |
| `device_type` | Category for filtering | `apple`, `nvidia`, `amd`, `intel`, `unknown` |
| `os_name` | Operating system | "macOS", "Linux", "Windows" |

Device information is **auto-detected** on startup. Use `--device-name` or `--device-type` to override if needed.

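A rough sketch of what checking these device fields could look like in plain Python is shown below. It covers only the three fields visible in the table above and the five listed device types. The helper name `find_device_field_errors` and the error messages are assumptions; the project's real validation is defined by its schema file, which this diff does not show.

```python
# Allowed categories as listed for device_type in the README.
ALLOWED_DEVICE_TYPES = {"apple", "nvidia", "amd", "intel", "unknown"}

# Assumption: only the fields visible in the table above are checked here.
REQUIRED_DEVICE_FIELDS = ("device_name", "device_type", "os_name")


def find_device_field_errors(result):
    """Return a list of problems with the device fields of one result record."""
    errors = []
    for field in REQUIRED_DEVICE_FIELDS:
        value = result.get(field)
        if value is None:
            errors.append(f"missing required field: {field}")
        elif not isinstance(value, str):
            errors.append(f"field {field} must be a string")
    device_type = result.get("device_type")
    if isinstance(device_type, str) and device_type not in ALLOWED_DEVICE_TYPES:
        errors.append(f"unsupported device_type: {device_type!r}")
    return errors
```

Returning a list rather than raising on the first problem lets a caller report every issue in a record at once.
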
Additional metadata fields (e.g., `testset_goal`, `testset_notes`, `question_task_type`, `expected_answer_description`, `max_output_tokens_hint`, `energy_relevance`) are included when present in the source testset and validated via `src/data/metrics_schema.json`.
@@ -333,6 +334,75 @@ docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

---

## Device Detection

The runner automatically detects your hardware and includes device information in all benchmark results. This enables accurate comparisons across different machines on the Energy Leaderboard.

### Auto-Detection

When you run a benchmark, the runner automatically detects:

```
[yellow]Detecting device information...[/yellow]
[green]✓[/green] Device: Apple M1 Pro Mac (MacBookPro18,3)
Type: apple
OS: macOS 14.2
CPU: Apple M1 Pro
RAM: 16.0 GB
```
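
The diff does not include the detection code itself, so the following is only a sketch of how such fields might be gathered with Python's standard library. All function names and the keyword heuristic in `classify_device_type` are assumptions. Note that `platform.processor()` can return a generic string such as `arm`, so a real detector would likely need platform-specific queries to produce names like the ones shown above.

```python
import os
import platform

# Device categories listed in the README; the rest of this block is hypothetical.
DEVICE_TYPES = ("apple", "nvidia", "amd", "intel", "unknown")
OS_NAMES = {"Darwin": "macOS", "Linux": "Linux", "Windows": "Windows"}


def classify_device_type(description):
    """Guess a device type by keyword match on a hardware description."""
    text = description.lower()
    for candidate in DEVICE_TYPES[:-1]:  # every category except "unknown"
        if candidate in text:
            return candidate
    return "unknown"


def total_ram_gb():
    """Return physical RAM in GB, or None where os.sysconf cannot report it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None  # e.g. Windows, which has no os.sysconf
    if pages <= 0 or page_size <= 0:
        return None
    return round(pages * page_size / 1024**3, 1)


def detect_device(device_name=None, device_type=None):
    """Collect device fields; explicit arguments win over detected values."""
    system = platform.system()
    os_version = platform.mac_ver()[0] if system == "Darwin" else platform.release()
    cpu_model = platform.processor() or platform.machine() or "unknown"
    return {
        "device_name": device_name or cpu_model,
        "device_type": device_type or classify_device_type(cpu_model),
        "os_name": OS_NAMES.get(system, system or "unknown"),
        "os_version": os_version,
        "cpu_model": cpu_model,
        "ram_gb": total_ram_gb(),
        "chip_architecture": platform.machine(),
    }
```

Passing `device_name` or `device_type` mirrors the `--device-name` and `--device-type` overrides: a supplied value replaces the detected one, and everything else is still filled in automatically.
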

### Detected Information

| Field | Description | Auto-Detected On |
|-------|-------------|------------------|
| `device_name` | Human-readable name | All platforms |