A Python tool (CLI + library) that creates an index mapping SHA256 hashes of GGUF files to their HuggingFace URLs, enabling identification of local GGUF files.
Using uv (recommended):
uv sync
source .venv/bin/activate
gguf-index --help
or directly:
uv tool install gguf-index@git+https://github.com/mozilla-ai/gguf-index
gguf-index --help
Search HuggingFace for GGUF repositories and add them to the index:
# Search all GGUF repos
gguf-index search
# Search with a query
gguf-index search "llama"
# Limit the number of repos to index
gguf-index search --limit 10
Index all GGUF files from a specific HuggingFace repository:
gguf-index add TheBloke/Llama-2-7B-GGUF
Look up a file by its SHA256 hash:
gguf-index lookup abc123def456...
Compute the SHA256 of a local file and look it up in the index:
gguf-index identify /path/to/model.gguf
# Export to stdout
gguf-index export
# Export to a file
gguf-index export -o my_index.json
# Show index statistics
gguf-index stats
Using gguf-index as a Python library:
from gguf_index import GGUFIndex
# Create an index with custom storage paths
index = GGUFIndex(
    json_path="my_index.json",
    sqlite_path="my_index.db",
)
# Load existing index
index.load()
# Build index from HuggingFace search
files_indexed = index.build_from_search(query="llama", limit=10)
# Index a specific repository
index.index_repo("TheBloke/Llama-2-7B-GGUF")
# Look up a file by SHA256 (returns list of all known sources)
entries = index.lookup("abc123def456...")
for entry in entries:
    print(f"Found: {entry.repo_id}/{entry.filename}")
    print(f"Download: {entry.download_url}")
# Identify a local file (returns list of all known sources)
sha256, entries = index.identify_file("/path/to/model.gguf")
print(f"SHA256: {sha256}")
print(f"Found in {len(entries)} repository(ies)")
for entry in entries:
    print(f" - {entry.repo_id}/{entry.filename}")
# Get statistics
stats = index.stats()
print(f"Unique files: {stats['unique_files']}")
print(f"Total sources: {stats['total_sources']}")
The index is revision-aware: each unique (repo_id, revision, filename) tuple maps to exactly one SHA256 hash. This enables:
- Full history tracking: All historical versions of a file are indexed, not just the latest
- Precise URLs: Download URLs include the exact commit hash, ensuring you get the exact file
- Multiple sources per hash: The same file content (SHA256) can exist at multiple URLs across repos and revisions
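To make the "precise URLs" point concrete, the sketch below builds a revision-pinned download URL from a (repo_id, revision, filename) tuple. It assumes HuggingFace's public `resolve` URL layout. The helper name `pinned_url` is hypothetical and not part of gguf-index, and the revision and filename values are placeholders.

```python
from urllib.parse import quote


def pinned_url(repo_id: str, revision: str, filename: str) -> str:
    """Build a download URL pinned to one revision (hypothetical helper)."""
    # Assumption: files are served at /<repo_id>/resolve/<revision>/<path>.
    # quote() keeps "/" by default, so nested paths inside the repo survive.
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


# Placeholder values for illustration only.
url = pinned_url("TheBloke/Llama-2-7B-GGUF", "0123abcd", "llama-2-7b.Q4_K_M.gguf")
print(url)
```

Because the revision is a commit hash rather than a branch name such as `main`, the URL keeps pointing at the same bytes even after the repository is updated.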
{
  "TheBloke/gpt-oss-GGUF/abc123def.../gpt-oss.Q4_K_M.gguf": {
    "sha256": "...",
    "repo_id": "TheBloke/gpt-oss-GGUF",
    "revision": "abc123def456789...",
    "filename": "gpt-oss.Q4_K_M.gguf",
    "size": 4368438944,
    "indexed_at": "2026-02-13T12:00:00Z"
  }
}
Primary key: (repo_id, revision, filename). This ensures each unique URL maps to exactly one SHA256.
When looking up a file by SHA256, all matching URLs (across repos and historical revisions) are displayed.
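The relationship between the primary key and SHA256 lookups can be sketched with two dictionaries. This is an illustrative in-memory model, not gguf-index's actual implementation: `Entry`, `TinyIndex`, and `sha256_of_file` are hypothetical names, and the short hash strings are placeholders.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    repo_id: str
    revision: str
    filename: str
    sha256: str


class TinyIndex:
    def __init__(self) -> None:
        # Primary key -> entry: one SHA256 per (repo_id, revision, filename).
        self.by_key: dict[tuple[str, str, str], Entry] = {}
        # Reverse map: one SHA256 -> every location where that content was seen.
        self.by_hash: dict[str, list[Entry]] = defaultdict(list)

    def add(self, entry: Entry) -> None:
        key = (entry.repo_id, entry.revision, entry.filename)
        old = self.by_key.get(key)
        if old is not None:
            # Re-indexing the same key replaces the earlier record.
            self.by_hash[old.sha256].remove(old)
        self.by_key[key] = entry
        self.by_hash[entry.sha256].append(entry)

    def lookup(self, sha256: str) -> list[Entry]:
        # .get() avoids creating empty lists for unknown hashes.
        return list(self.by_hash.get(sha256, []))


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large models are never read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


index = TinyIndex()
index.add(Entry("org/model-GGUF", "rev1", "model.gguf", "aaa"))
index.add(Entry("org/model-GGUF", "rev2", "model.gguf", "aaa"))  # unchanged across revisions
index.add(Entry("other/mirror", "rev9", "model.gguf", "aaa"))    # same content, different repo
print(len(index.lookup("aaa")))
```

Identifying a local file then amounts to `index.lookup(sha256_of_file(path))`, which returns every known source for that content.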
GGUF Index supports two storage backends:
- JSON: Portable and shareable, stored in ~/.gguf-index/index.json
- SQLite: Fast lookups with an indexed SHA256 column, stored in ~/.gguf-index/index.db
By default, both backends are used. You can customize the storage paths or disable either backend via CLI options.
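As a rough illustration of why the SQLite backend gives fast lookups, the sketch below creates a table keyed on (repo_id, revision, filename) with a secondary index on the SHA256 column. The table, column, and index names are assumptions made for this example and may not match the schema gguf-index actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
conn.executescript(
    """
    CREATE TABLE files (
        repo_id  TEXT NOT NULL,
        revision TEXT NOT NULL,
        filename TEXT NOT NULL,
        sha256   TEXT NOT NULL,
        size     INTEGER,
        PRIMARY KEY (repo_id, revision, filename)
    );
    CREATE INDEX idx_files_sha256 ON files (sha256);
    """
)

# Placeholder rows: two locations share the same content hash.
rows = [
    ("org/model-GGUF", "rev1", "model.gguf", "aaa", 100),
    ("other/mirror", "rev9", "model.gguf", "aaa", 100),
    ("org/model-GGUF", "rev1", "other.gguf", "bbb", 200),
]
conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?, ?)", rows)

# Look up every known source for one hash.
matches = conn.execute(
    "SELECT repo_id, revision, filename FROM files WHERE sha256 = ? ORDER BY repo_id",
    ("aaa",),
).fetchall()
print(matches)

# Ask SQLite how it would run the lookup; with the index in place it
# should search idx_files_sha256 rather than scan the whole table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM files WHERE sha256 = ?", ("aaa",)
).fetchall()
```

The composite primary key rejects a second row for the same (repo_id, revision, filename), while the separate index keeps hash lookups from scanning every row.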
Apache-2.0