LLM Benchmark for Throughput via Ollama (Local LLMs)
Measure how fast your local LLMs really are—with a simple, cross-platform CLI tool that tells you the tokens-per-second truth.
Prerequisite: a working Ollama installation.
Depending on your Python setup, install with either

pip install llm-benchmark

or

pipx install llm-benchmark

Then run the benchmark:

llm_benchmark run

It's tested on Python 3.9 and above.
A 7B model can be run on machines with 8GB of RAM.
A 13B model can be run on machines with 16GB of RAM.
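These figures line up with a rough back-of-envelope estimate (my own illustration, not something the tool computes): a 4-bit quantized model needs roughly half a byte per parameter for its weights, plus headroom for the KV cache and runtime overhead.

def estimated_ram_gib(params_billions, bytes_per_param=0.5, overhead=1.5):
    # Very rough estimate for a 4-bit quantized model; bytes_per_param
    # and overhead are assumptions for illustration only.
    weights_gib = params_billions * 1e9 * bytes_per_param / (1024 ** 3)
    return weights_gib * overhead

print(f"7B  -> ~{estimated_ram_gib(7):.1f} GiB")   # comfortably under 8GB
print(f"13B -> ~{estimated_ram_gib(13):.1f} GiB")  # comfortably under 16GB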
On Windows, Linux, and macOS, it first detects the amount of system RAM and then downloads the required LLM models accordingly.
When the RAM size is greater than or equal to 4GB but less than 7GB, it checks whether these models exist and pulls them implicitly:
ollama pull deepseek-r1:1.5b
ollama pull gemma:2b
ollama pull phi:2.7b
ollama pull phi3:3.8b

When the RAM size is greater than 7GB but less than 15GB, it checks whether these models exist and pulls them implicitly:
ollama pull phi3:3.8b
ollama pull gemma2:9b
ollama pull mistral:7b
ollama pull llama3.1:8b
ollama pull deepseek-r1:8b
ollama pull llava:7b

When the RAM size is greater than 15GB but less than 31GB, it checks whether these models exist and pulls them implicitly:
ollama pull gemma2:9b
ollama pull mistral:7b
ollama pull phi4:14b
ollama pull deepseek-r1:8b
ollama pull deepseek-r1:14b
ollama pull llava:7b
ollama pull llava:13b

When the RAM size is greater than 31GB, it checks whether these models exist and pulls them implicitly:
ollama pull phi4:14b
ollama pull deepseek-r1:14b
ollama pull gpt-oss:20b
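The tier selection above can be pictured with a short sketch (my own illustration, not the tool's actual code; it assumes the psutil package for reading total system memory):

import subprocess
import psutil  # assumed dependency for this sketch

# RAM tiers (GB) and model lists, taken from the sections above.
TIERS = [
    (4, 7, ["deepseek-r1:1.5b", "gemma:2b", "phi:2.7b", "phi3:3.8b"]),
    (7, 15, ["phi3:3.8b", "gemma2:9b", "mistral:7b", "llama3.1:8b",
             "deepseek-r1:8b", "llava:7b"]),
    (15, 31, ["gemma2:9b", "mistral:7b", "phi4:14b", "deepseek-r1:8b",
              "deepseek-r1:14b", "llava:7b", "llava:13b"]),
    (31, float("inf"), ["phi4:14b", "deepseek-r1:14b", "gpt-oss:20b"]),
]

def models_for_this_machine():
    ram_gb = psutil.virtual_memory().total / (1024 ** 3)
    for low, high, models in TIERS:
        if low <= ram_gb < high:  # boundary handling is approximate here
            return models
    return []

for model in models_for_this_machine():
    # "ollama pull" is effectively a no-op when the model is already present.
    subprocess.run(["ollama", "pull", model], check=True)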
For a development setup, first install Poetry manually (see https://python-poetry.org/docs/#installing-manually):

python3 -m venv .venv
. ./.venv/bin/activate
pip install -U pip setuptools
pip install poetry
poetry shell
poetry install
Verify the development install:

llm_benchmark hello jason

Example #1: standard benchmark run:

llm_benchmark run

Example #2: benchmark run without sending system information:

llm_benchmark run --no-sendinfo

Example #3: benchmark run with an explicitly given path to the ollama executable (when you have built your own developer version of ollama):
llm_benchmark run --ollamabin=~/code/ollama/ollama

To benchmark your own set of models:
- Create a custom benchmark file in the following YAML format, replacing the entries with your own benchmark models; remember to use double quotes around each model name:
file_name: "custombenchmarkmodels.yml"
version: 2.0.custom
models:
- model: "deepseek-r1:1.5b"
- model: "qwen:0.5b"- run with the flag and point to the path of custombenchmarkmodels.yml
llm_benchmark run --custombenchmark=path/to/custombenchmarkmodels.yml
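For reference, a file in this format can be read with standard PyYAML; the loader below is only a sketch of what such parsing might look like (load_custom_models is a hypothetical name, not the tool's actual code):

import yaml  # PyYAML

def load_custom_models(path):
    # Return the model names listed in a custom benchmark file
    # following the structure shown above.
    with open(path, "r", encoding="utf-8") as fh:
        data = yaml.safe_load(fh)
    return [entry["model"] for entry in data.get("models", [])]

print(load_custom_models("custombenchmarkmodels.yml"))
# ['deepseek-r1:1.5b', 'qwen:0.5b']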