Copilot AI commented Sep 20, 2025

This PR adds a new llama-pull command-line tool that provides a dedicated interface for downloading AI models from HuggingFace and Docker Registry repositories.

Features

The new tool supports downloading models from two sources:

# Download from HuggingFace
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M

# Download from Docker Registry  
llama-pull -dr gemma3

Implementation

The tool leverages the existing robust download infrastructure from common/arg.cpp, ensuring:

  • Consistent caching behavior with other llama.cpp tools (see the sketch after this list)
  • Support for authentication tokens and offline mode
  • Proper error handling and progress reporting
  • Integration with the existing build system
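
As a quick sketch of the shared cache and offline behavior, using only the flags documented in this PR (the comments describe expected behavior, not captured output):

# First run downloads the model and populates the shared cache
llama-pull -dr gemma3

# With --offline, the same invocation resolves from the cache only and
# should fail fast if the model was never downloaded
llama-pull -dr gemma3 --offline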

Key Benefits

  • Dedicated download tool: Provides a focused interface for model acquisition without requiring model loading
  • Minimal implementation: Reuses existing download logic, reducing code duplication and maintenance burden
  • Consistent UX: Uses the same argument patterns and caching as other llama.cpp tools
  • Multiple sources: Supports both HuggingFace and Docker Registry in a single tool

Usage Examples

# Basic HuggingFace download
llama-pull -hf microsoft/DialoGPT-medium

# Docker with specific quantization
llama-pull -dr ai/gemma3:Q4_K_M

# Use cached models only
llama-pull -dr gemma3 --offline

# Private HuggingFace repo
llama-pull -hf private/model --hf-token $HF_TOKEN

Downloaded models are stored in the standard llama.cpp cache directory (~/.cache/llama.cpp/) and can be used immediately with other tools such as llama-cli and llama-server.
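
For example, a pull-then-serve workflow might look like the following (assuming llama-server resolves the same -hf reference through the shared argument parser, so the cached file is picked up rather than re-downloaded; cache file names are managed by the download code):

# Download the model ahead of time
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M

# Inspect the shared cache directory
ls ~/.cache/llama.cpp/

# Serve the same model; the cached copy should be reused
llama-server -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M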

Files Added

  • tools/llama-pull/llama-pull.cpp - Main implementation
  • tools/llama-pull/CMakeLists.txt - Build configuration
  • tools/llama-pull/README.md - Documentation
  • Updated tools/CMakeLists.txt to include the new tool

The implementation is minimal (~100 lines) and integrates seamlessly with the existing codebase.
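
Building should follow the usual llama.cpp CMake flow; a sketch, assuming the CMake target is named llama-pull to match the files above:

# Configure once, then build only the new tool
cmake -B build
cmake --build build --target llama-pull

# Run it from the build tree
./build/bin/llama-pull -dr gemma3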

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • huggingface.co
    • Triggering command: ./build/bin/llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M (dns block)

If you need me to access, download, or install something from one of these locations, the firewall configuration will need to allow those hosts.



Copilot AI changed the title from "[WIP] Add llama-pull binary for downloading models from HuggingFace and Docker Registry: llama-pull -hf <model> - Download models from HuggingFace, llama-pull -dr <model> - Download models from Docker Registry" to "Add llama-pull binary for downloading models from HuggingFace and Docker Registry" on Sep 20, 2025
Copilot AI requested a review from ericcurtin September 20, 2025 15:51
Copilot finished work on behalf of ericcurtin September 20, 2025 15:51