NOVADEDOG
diff --git a/‎.github/workflows/deploy.yml‎
Lines changed: 61 additions & 0 deletions b/‎.github/workflows/deploy.yml‎
Lines changed: 61 additions & 0 deletions
diff --git a/‎AI_DOCS/leaderboard_constitution.md‎
Lines changed: 20 additions & 0 deletions b/‎AI_DOCS/leaderboard_constitution.md‎
Lines changed: 20 additions & 0 deletions
diff --git a/‎AI_DOCS/leaderboard_plan.md‎
Lines changed: 16 additions & 0 deletions b/‎AI_DOCS/leaderboard_plan.md‎
Lines changed: 16 additions & 0 deletions
diff --git a/‎AI_DOCS/leaderboard_specify.md‎
Lines changed: 34 additions & 0 deletions b/‎AI_DOCS/leaderboard_specify.md‎
Lines changed: 34 additions & 0 deletions
diff --git a/‎AI_DOCS/leaderboard_tasks.md‎
Lines changed: 26 additions & 0 deletions b/‎AI_DOCS/leaderboard_tasks.md‎
Lines changed: 26 additions & 0 deletions
diff --git a/‎README.md‎
Lines changed: 24 additions & 0 deletions b/‎README.md‎
Lines changed: 24 additions & 0 deletions
diff --git a/‎RUNBOOK.md‎
Lines changed: 74 additions & 4 deletions b/‎RUNBOOK.md‎
Lines changed: 74 additions & 4 deletions
@@ -0,0 +1,61 @@
+name: Deploy to GitHub Pages
+
+on:
+  push:
+    branches:
+      - main
+    paths:
+      - 'energy-leaderboard-web/**'
+      - '.github/workflows/deploy.yml'
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: 'pages'
+  cancel-in-progress: true
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: energy-leaderboard-web
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          cache: 'npm'
+          cache-dependency-path: energy-leaderboard-web/package-lock.json
+
+      - name: Install dependencies
+        run: npm ci
+
+      - name: Build
+        run: npm run build
+
+      - name: Setup Pages
+        uses: actions/configure-pages@v4
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: energy-leaderboard-web/dist
+
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    needs: build
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
@@ -0,0 +1,20 @@
+# Constitution: The Guiding Principles (Leaderboard)
+
+### 1. Transparency & Honesty
+The leaderboard must clearly distinguish between *measured* data (hardware sensors) and *estimated* data.
+* **Action:** Every row must have a visible "Measured" or "Estimated" badge.
+* **Action:** Rows without a power trace (verification) must be marked.
+
+### 2. User-Centric Design
+The data is complex (Wh/1k tokens, Latency, Joules), but the UI must be simple.
+* **Action:** Default view shows only the most important metrics (Model, Efficiency, CO2).
+* **Action:** Provide "Info tooltips" explaining what "Wh/1k tokens" means.
+
+### 3. Static First
+To ensure the project is free to host and easy to fork, it must not require a dynamic backend server.
+* **Action:** Build a Static Single Page App (SPA).
+* **Action:** Data ingestion happens at build-time or run-time from static JSON files.
+
+### 4. Mobile Responsiveness
+The leaderboard must be readable on mobile devices.
+* **Action:** The main table must support horizontal scrolling or a card-view on small screens.
@@ -0,0 +1,16 @@
+# Plan: Architecture & File Structure
+
+### File Structure
+energy-leaderboard-web/ ├── public/ │ └── data/ # Drop result.json files here! │ ├── run_apple_m1.json │ └── run_nvidia_t4.json ├── src/ │ ├── components/ │ │ ├── LeaderboardTable.tsx # The main component │ │ ├── FilterBar.tsx # Search & Dropdowns │ │ └── MetricBadge.tsx # For "Measured" vs "Estimated" │ ├── lib/ │ │ ├── data-loader.ts # Fetches/Imports the JSONs │ │ └── types.ts # TypeScript interfaces for the JSON schema │ ├── App.tsx │ └── main.tsx ├── index.html ├── package.json ├── tailwind.config.js └── vite.config.ts
+
+
+### Core Logic (Data Loader)
+Since we cannot "list files" in a static browser environment easily without a server index, we will use a clever Vite feature or a simple manifest approach.
+* **Approach:** We will use `import.meta.glob('/public/data/*.json')` provided by Vite to load all JSON files at build time/runtime. This makes adding data as simple as "drag and drop file into folder".
+
+### Deployment Strategy
+We will create a `.github/workflows/deploy.yml` that:
+1.  Triggers on push to `main`.
+2.  Installs dependencies (`npm ci`).
+3.  Builds the site (`npm run build`).
+4.  Deploys the `dist/` folder to the `gh-pages` branch.
@@ -0,0 +1,34 @@
+# Specifications: Leaderboard Requirements
+
+### Functional Requirements
+
+**F1: Data Ingestion**
+* The app MUST automatically load multiple `result.json` files from the `public/data/` directory.
+* It MUST parse the standard JSON schema (defined in the Runner project) including fields like `energy_wh_net`, `tokens_total`, `g_co2`.
+
+**F2: The Leaderboard Table**
+* The app MUST display a sortable table with the following columns (based on Eindproductplan):
+    1.  **Rank** (Auto-calculated based on efficiency)
+    2.  **Model** (e.g., "llama3:8b")
+    3.  **Efficiency** (Wh per 1k tokens) - *Primary Sort Key*
+    4.  **CO2** (g)
+    5.  **Net Energy** (Wh)
+    6.  **Speed** (Tokens/sec or Duration)
+    7.  **Device** (OS / GPU / CPU)
+    8.  **Method** (Measured/Estimated)
+* Users MUST be able to sort by clicking headers.
+
+**F3: Filtering & Search**
+* Provide a text search bar for "Model Name".
+* Provide dropdown filters for:
+    * **Device Type** (NVIDIA, Apple, AMD, CPU)
+    * **Method** (Show All, Only Measured)
+
+**F4: Detail View**
+* Clicking a row SHOULD open a modal/expanded view showing full details (Start/End time, specific hardware specs, user comments).
+
+### Non-Functional Requirements
+
+**NF1: Tech Stack:** React (TypeScript), Vite, Tailwind CSS, Shadcn/UI (optional, for nice tables).
+**NF2: Hosting:** Must be deployable to GitHub Pages via a provided GitHub Actions workflow.
+**NF3: Performance:** Fast loading score (Lighthouse > 90).
@@ -0,0 +1,26 @@
+# Tasks: Implementation Plan
+
+### Sprint 1: Setup & Scaffolding
+**Goal:** A running React app that can read one JSON file.
+1.  [ ] **Task 1.1:** Initialize Vite project (React + TS) named `energy-leaderboard-web`.
+2.  [ ] **Task 1.2:** Install Tailwind CSS and configure `tailwind.config.js` (use green/eco colors).
+3.  [ ] **Task 1.3:** Create `src/lib/types.ts` matching the Runner's JSON output schema.
+4.  [ ] **Task 1.4:** Create dummy data in `public/data/example_run.json`.
+
+### Sprint 2: Data Ingestion & Logic
+**Goal:** Load all JSON files and process them.
+5.  [ ] **Task 2.1:** Implement `src/lib/data-loader.ts` using `import.meta.glob` to load all JSONs from the data folder.
+6.  [ ] **Task 2.2:** Create a helper function to calculate "Rank" and normalize data (e.g. handle missing fields).
+
+### Sprint 3: UI Components (The Leaderboard)
+**Goal:** A visual table with sorting and filtering.
+7.  [ ] **Task 3.1:** Build `LeaderboardTable.tsx`. Use standard HTML `<table>` or a library like TanStack Table for sorting.
+8.  [ ] **Task 3.2:** Implement columns: Rank, Model, Efficiency (Wh/1k), CO2, Hardware.
+9.  [ ] **Task 3.3:** Add `FilterBar.tsx` for searching models and filtering by "Measured/Estimated".
+10. [ ] **Task 3.4:** Style the table. Add "Green Leaf" badges for good scores and "Warning" badges for estimated data.
+
+### Sprint 4: Deployment & Polish
+**Goal:** Live on GitHub Pages.
+11. [ ] **Task 4.1:** Create `.github/workflows/deploy.yml` for GitHub Pages deployment.
+12. [ ] **Task 4.2:** Update `vite.config.ts` with the correct `base` url (usually `/{repo-name}/`).
+13. [ ] **Task 4.3:** Write `README.md` explaining how to add a new test result (just upload the JSON!).
@@ -106,6 +106,8 @@ Options:
   -m, --model TEXT          Model identifier (e.g., 'llama3:latest') [required]
   -t, --test-set TEXT       Testset name or stem (e.g., 'easy', 'testset_easy') [default: easy]
   -o, --output TEXT         Output file path for results [default: results/output.json]
+  -d, --device-name TEXT    Override auto-detected device name
+  --device-type TEXT        Override device type (apple, nvidia, amd, intel, unknown)
   --help                    Show this message and exit
 
 The runner looks for `testset_<name>.json` inside `src/data/testsets/`. Passing `easy`, `testset_easy`, or the exact filename stem all point to the same file.
@@ -216,6 +218,13 @@ Results are exported as JSON and validated against a strict schema:
     "region": "unknown",
     "notice": null,
     "sampling_ms": 100,
+    "device_name": "Apple M1 Pro Mac (MacBookPro18,3)",
+    "device_type": "apple",
+    "os_name": "macOS",
+    "os_version": "14.2",
+    "cpu_model": "Apple M1 Pro",
+    "ram_gb": 16.0,
+    "chip_architecture": "arm64",
     "testset_id": "testset_easy",
     "testset_name": "Easy Baseline",
     "question_id": "easy-1",
@@ -225,6 +234,21 @@ Results are exported as JSON and validated against a strict schema:
 ]
 ```
 
+### Required Device Fields
+
+Every result now includes mandatory device information for accurate hardware comparison:
+
+| Field | Description | Example |
+|-------|-------------|----------|
+| `device_name` | Human-readable device name | "Apple M1 Pro Mac" |
+| `device_type` | Category for filtering | `apple`, `nvidia`, `amd`, `intel`, `unknown` |
+| `os_name` | Operating system | "macOS", "Linux", "Windows" |
+| `os_version` | OS version string | "14.2" |
+
+Optional device fields: `cpu_model`, `gpu_model`, `ram_gb`, `chip_architecture`.
+
+Device information is **auto-detected** on startup. Use `--device-name` or `--device-type` to override if needed.
+
 Additional metadata fields (e.g., `testset_goal`, `testset_notes`, `question_task_type`, `expected_answer_description`, `max_output_tokens_hint`, `energy_relevance`) are included when present in the source testset and validated via `src/data/metrics_schema.json`.
 
 ## Test Sets
 
@@ -9,10 +9,11 @@ This document provides detailed, platform-specific installation instructions, tr
    - [Linux + NVIDIA GPU (NVML)](#linux--nvidia-gpu-nvml)
    - [Linux + AMD GPU (ROCm-smi)](#linux--amd-gpu-rocm-smi)
    - [Linux + Intel/AMD CPU (RAPL)](#linux--intelamd-cpu-rapl)
-2. [Ollama Setup](#ollama-setup)
-3. [Docker Troubleshooting](#docker-troubleshooting)
-4. [Common Issues](#common-issues)
-5. [Advanced Configuration](#advanced-configuration)
+2. [Device Detection](#device-detection)
+3. [Ollama Setup](#ollama-setup)
+4. [Docker Troubleshooting](#docker-troubleshooting)
+5. [Common Issues](#common-issues)
+6. [Advanced Configuration](#advanced-configuration)
 
 ---
 
@@ -333,6 +334,75 @@ docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
 
 ---
 
+## Device Detection
+
+The runner automatically detects your hardware and includes device information in all benchmark results. This enables accurate comparisons across different machines on the Energy Leaderboard.
+
+### Auto-Detection
+
+When you run a benchmark, the runner automatically detects:
+
+```
+[yellow]Detecting device information...[/yellow]
+[green]✓[/green] Device: Apple M1 Pro Mac (MacBookPro18,3)
+  Type: apple
+  OS: macOS 14.2
+  CPU: Apple M1 Pro
+  RAM: 16.0 GB
+```
+
+### Detected Information
+
+| Field | Description | Auto-Detected On |
+|-------|-------------|------------------|
+| `device_name` | Human-readable name | All platforms |
+| `device_type` | Category (`apple`, `nvidia`, `amd`, `intel`, `unknown`) | All platforms |
+| `os_name` | Operating system | All platforms |
+| `os_version` | OS version | All platforms |
+| `cpu_model` | CPU model string | macOS, Linux, Windows |
+| `gpu_model` | GPU model (if applicable) | NVIDIA, AMD, Apple Silicon |
+| `ram_gb` | System RAM in GB | All platforms |
+| `chip_architecture` | CPU arch (`arm64`, `x86_64`) | All platforms |
+
+### Manual Override
+
+If auto-detection fails or you want a custom device name:
+
+```bash
+# Override device name
+python src/main.py run-test -m llama3:latest -t easy --device-name "My Gaming PC"
+
+# Override device type
+python src/main.py run-test -m llama3:latest -t easy --device-type nvidia
+
+# Both
+python src/main.py run-test -m llama3:latest -t easy \
+  --device-name "Custom RTX 4090 Build" \
+  --device-type nvidia
+```
+
+### Troubleshooting Detection
+
+**Test device detection:**
+```bash
+python -c "from src.utils.device_info import detect_device_info; info = detect_device_info(); print(info)"
+```
+
+**macOS issues:**
+- `system_profiler` must be available (built-in on all macOS versions)
+- Apple Silicon chips are detected via `sysctl` and `system_profiler`
+
+**Linux issues:**
+- GPU detection requires `nvidia-smi` (NVIDIA) or `rocm-smi` (AMD) in PATH
+- CPU info is read from `/proc/cpuinfo`
+- RAM info is read from `/proc/meminfo`
+
+**Windows issues:**
+- Uses `wmic` commands for CPU and RAM detection
+- NVIDIA GPU detected via `nvidia-smi`
+
+---
+
 ## Ollama Setup
 
 ### Installation