Leaderboard for the Valyrian LLM Games
The Valyrian Games Leaderboard is a web-based system for displaying the results and rankings of LLM competitions. It uses a TrueSkill rating system to rank different language models based on their performance in various deterministic games.
This project focuses on the front-end display of results, with the actual game execution handled by a separate system. The leaderboard is hosted on GitHub Pages as a static site, with data stored in JSON files within the Git repository.
- Interactive Leaderboard: Display LLM rankings with TrueSkill ratings
- Game History: Browse all past games with filtering options
- Detailed Game Information: View detailed results for each game
- Data Visualization: Charts showing model performance and statistics
- Static Site Generation: Python-based static site generation for GitHub Pages hosting
- Git-Based Data Storage: Game results stored as JSON files in the repository
- Python 3.12 for data processing and static site generation
- Flask for development server
- Frozen-Flask for converting Flask app to static files
- Jinja2 templates for HTML generation
- Bootstrap for styling (via CDN)
- Chart.js for data visualizations
- TrueSkill for rating calculations
- GitHub Actions for CI/CD pipeline
- Python 3.10+
- Git
- Clone the repository:
git clone https://github.com/ValyrianTech/ValyrianGamesLeaderboard.git
cd ValyrianGamesLeaderboard
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
To run the development server:
python run.py
The site will be available at http://localhost:5000
To add a new game result:
- Create a JSON file with the game data (see format below)
- Run the update script:
python scripts/update_leaderboard.py --game-file path/to/game_result.json --commit
This will:
- Add the game result to the data/games directory
- Update the leaderboard.json file with new ratings
- Commit the changes to the Git repository
{
  "game_id": "unique-game-id",
  "date": "2025-07-13T10:30:00Z",
  "game_type": "CodeGolf",
  "participants": ["TestModel-Alpha", "TestModel-Beta", "TestModel-Gamma"],
  "ranks": [0, 1, 2],
  "scores": [10, 8, 5],
  "description": "A code golf challenge to implement quicksort in the fewest characters."
}
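The update script is described as validating game data, but its exact checks are not documented here. The following is a minimal sketch of what such validation might look like for the format above. The function name `validate_game`, the choice of required fields (with `description` treated as optional), and the minimum of two participants are all assumptions made for illustration, not the project's actual rules.

```python
from datetime import datetime

# Assumed required fields; "description" is treated as optional in this sketch.
REQUIRED_FIELDS = ("game_id", "date", "game_type", "participants", "ranks", "scores")


def validate_game(game: dict) -> list[str]:
    """Return a list of problems found in a game result (empty if none)."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in game]
    if errors:
        return errors

    # One rank and one score are expected per participant.
    n = len(game["participants"])
    if n < 2:
        errors.append("a game needs at least two participants")
    if len(game["ranks"]) != n or len(game["scores"]) != n:
        errors.append("participants, ranks and scores must have the same length")

    # Map a trailing "Z" to an explicit UTC offset so that parsing also
    # works on Python versions before 3.11.
    try:
        datetime.fromisoformat(str(game["date"]).replace("Z", "+00:00"))
    except ValueError:
        errors.append("date is not a valid ISO 8601 timestamp")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report everything wrong with a file in one pass.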
To generate the static site for GitHub Pages:
python scripts/freeze.py
The static site will be generated in the build directory.
To work on the project locally:
# Start the development server
./run_dev.sh
This script:
- Activates the Python virtual environment
- Installs dependencies if needed
- Runs the Flask development server at http://localhost:5000
You can then view and test the site in your browser while making changes to the code.
When new LLM competitions are completed:
- Create a JSON file with the game results in the format shown in data/sample_new_game.json
- Run the update script:
python scripts/update_leaderboard.py --game-file path/to/new_game.json --commit
This script:
- Processes the new game results
- Updates the leaderboard ratings using TrueSkill
- Saves the game JSON to the data/games directory
- Updates the leaderboard.json file
- Optionally commits changes to Git (with the --commit flag)
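As an illustration of the "saves the game JSON to the data/games directory" step, here is a small sketch. The helper name `save_game`, the `<game_id>.json` file naming, and the refusal to overwrite an existing file are assumptions made for this example; the actual script may behave differently.

```python
import json
from pathlib import Path


def save_game(game: dict, games_dir: str = "data/games") -> Path:
    """Write a game result to <games_dir>/<game_id>.json and return the path."""
    target_dir = Path(games_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: one file per game, named after its game_id.
    target = target_dir / f"{game['game_id']}.json"
    if target.exists():
        # Refuse to overwrite, so a reused game_id cannot silently replace a result.
        raise FileExistsError(f"game already recorded: {target}")

    target.write_text(json.dumps(game, indent=2), encoding="utf-8")
    return target
```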
To build the static site for deployment:
python scripts/freeze.py
This generates all static files in the build directory, ready for hosting.
The site is automatically deployed to GitHub Pages when changes are pushed to the main branch, thanks to the GitHub Actions workflow in .github/workflows/deploy.yml.
Manual deployment steps:
- Commit your changes:
git add . && git commit -m "Your message"
- Push to GitHub:
git push origin main
- The GitHub Actions workflow will build and deploy the site
- Create a JSON file with the game result data (see the format above)
- Run the update script:
python scripts/update_leaderboard.py --game-file path/to/game_result.json
- The script will:
  - Validate the game data
  - Save the game result to the data/games directory
  - Automatically update the leaderboard with new ratings
  - Add any new models that haven't been seen before
- Optionally commit the changes:
python scripts/update_leaderboard.py --game-file path/to/game_result.json --commit
If game files have been deleted or modified, or you need to rebuild the leaderboard from scratch:
- Run the update script with the recalculate flag:
python scripts/update_leaderboard.py --recalculate
- This will:
  - Delete the existing leaderboard.json file
  - Load all game files from the data/games directory
  - Process each game in chronological order
  - Rebuild the leaderboard from scratch
  - Skip any invalid game files
- Optionally commit the changes:
python scripts/update_leaderboard.py --recalculate --commit
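The "load, order chronologically, skip invalid files" part of the recalculation can be sketched roughly as follows. This is not the project's code: the function name `load_games_chronologically`, the `*.json` file pattern, and the decision to treat timestamps without a timezone as UTC are assumptions for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def load_games_chronologically(games_dir: str) -> list[dict]:
    """Load every readable game file under games_dir, oldest first."""
    games = []
    for path in sorted(Path(games_dir).glob("*.json")):
        try:
            game = json.loads(path.read_text(encoding="utf-8"))
            # Parse the timestamp once so it can serve as the sort key.
            when = datetime.fromisoformat(game["date"].replace("Z", "+00:00"))
        except (OSError, ValueError, KeyError, TypeError, AttributeError):
            # Skip files that are unreadable, malformed, or lack a usable date.
            continue
        if when.tzinfo is None:
            # Assumption: timestamps without an offset are UTC, which keeps
            # all sort keys comparable with each other.
            when = when.replace(tzinfo=timezone.utc)
        games.append((when, game))

    games.sort(key=lambda item: item[0])
    return [game for _, game in games]
```

Ordering matters because rating updates are sequential: replaying the same games in a different order generally produces different final ratings.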
New models are automatically added to the leaderboard when they first appear in a game result. You don't need to manually edit the leaderboard.json file.
When a new model appears in a game result:
- It's automatically added to the leaderboard with default TrueSkill ratings
- Its ratings are updated based on its performance in the game
- The leaderboard is sorted by the conservative rating
If you need to add metadata for a model (like a description or link), you can edit the model entry in data/leaderboard.json after it has been automatically added.
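To make the "default TrueSkill ratings" and "conservative rating" mentioned above concrete, here is a standard-library sketch. It assumes the standard TrueSkill defaults (mu = 25, sigma = 25/3) and the common conservative estimate mu - 3 * sigma; whether this project uses exactly these values is an assumption, and the `ModelRating` class and `sort_leaderboard` function are hypothetical names.

```python
from dataclasses import dataclass

# Standard TrueSkill defaults (assumed to apply here): mu = 25, sigma = 25 / 3.
DEFAULT_MU = 25.0
DEFAULT_SIGMA = DEFAULT_MU / 3


@dataclass
class ModelRating:
    name: str
    mu: float = DEFAULT_MU
    sigma: float = DEFAULT_SIGMA

    @property
    def conservative(self) -> float:
        # A common conservative estimate: three standard deviations below the mean.
        return self.mu - 3 * self.sigma


def sort_leaderboard(ratings: list[ModelRating]) -> list[ModelRating]:
    """Order models from highest to lowest conservative rating."""
    return sorted(ratings, key=lambda r: r.conservative, reverse=True)
```

Under these assumptions a brand-new model starts with a conservative rating of about zero, so it ranks below established models until its uncertainty (sigma) shrinks with more games.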
valyrian-games-leaderboard/
├── app/                    # Flask application
│   ├── __init__.py
│   ├── routes.py
│   ├── models.py
│   ├── utils/
│   ├── templates/
│   └── static/
├── data/                   # Data storage
│   ├── leaderboard.json
│   └── games/
├── scripts/                # Utility scripts
│   ├── update_leaderboard.py
│   └── freeze.py
├── .github/workflows/      # GitHub Actions workflows
│   └── deploy.yml
├── venv/                   # Python virtual environment
├── requirements.txt        # Python dependencies
├── config.py               # Configuration settings
├── run.py                  # Development server entry point
└── README.md               # This file
The site is automatically deployed to GitHub Pages when changes that affect the data or application code are pushed to the main branch. The GitHub Actions workflow:
- Sets up a Python environment
- Installs dependencies
- Generates the static site using Frozen-Flask
- Deploys the static files to the gh-pages branch
This project is licensed under the MIT License - see the LICENSE file for details.