
Commit 48320a8

ilopezluna authored and doringeman committed
Added workflow to package a safetensors model (docker#100)
* Added workflow to package a safetensors model
* Build llama-converter image
* No need to install git-lfs
* Use linux/amd64 platform
* No need for intermediate image stage llama-converter
* Remove duplicated FROM
* Update README.md
* Add prepare-matrix step as in the GGUF workflow
* Fix example
* llama.cpp tag always comes from the input
* Update README.md
1 parent 395f8f5 commit 48320a8

File tree

1 file changed: +75 −5 lines changed


pkg/distribution/README.md

Lines changed: 75 additions & 5 deletions
@@ -13,6 +13,8 @@ Model Distribution is a Go library and CLI tool that allows you to package, push
 - Local model storage
 - Model metadata management
 - Command-line interface for all operations
+- GitHub workflows for automated model packaging
+- Support for both GGUF and safetensors model formats
 
 ## Usage
 
@@ -123,27 +125,31 @@ if err != nil {
 
 ### GitHub Workflows for Model Packaging and Promotion
 
-This project provides GitHub workflows to automate the process of packaging GGUF models and promoting them from staging to production environments.
+This project provides GitHub workflows to automate the process of packaging both GGUF and Safetensors models and promoting them from staging to production environments.
 
 #### Overview
 
 The model promotion process follows a two-step workflow:
-1. **Package and Push to Staging**: Use `package-gguf-model.yml` to download a GGUF model from HuggingFace and push it to the `aistaging` namespace
+1. **Package and Push to Staging**: Use either:
+   - `package-gguf-model.yml` to download a pre-built GGUF model and push it to the `aistaging` namespace
+   - `package-safetensors-model.yml` to clone a safetensors model from HuggingFace, convert it to GGUF, and push it to the `aistaging` namespace
 2. **Promote to Production**: Use `promote-model-to-production.yml` to copy the model from staging (`aistaging`) to production (`ai`) namespace
 
 #### Prerequisites
 
 The following GitHub secrets must be configured:
 - `DOCKER_USER`: DockerHub username for production namespace
 - `DOCKER_OAT`: DockerHub access token for production namespace
-- `DOCKER_USER_STAGING`: DockerHub username for staging namespace (typically `aistaging`)
+- `DOCKER_USER_STAGING`: DockerHub username for staging namespace (`aistaging`)
 - `DOCKER_OAT_STAGING`: DockerHub access token for staging namespace
 
 **Note**: The current secrets are configured to write to the `ai` production namespace. If you need to write to a different namespace, you'll need to update the `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` secrets accordingly.
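
These secrets can also be set from the command line with the GitHub CLI. A minimal sketch; the secret names come from the list above, and the values are placeholders, not real credentials:

```bash
# Minimal sketch: configuring the required repository secrets with the
# GitHub CLI. Replace the placeholder values with real DockerHub credentials.
gh secret set DOCKER_USER --body "your-dockerhub-username"
gh secret set DOCKER_OAT --body "your-dockerhub-access-token"
gh secret set DOCKER_USER_STAGING --body "aistaging"
gh secret set DOCKER_OAT_STAGING --body "your-staging-access-token"
```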
 
 #### Step 1: Package Model to Staging
 
-Use the **Package GGUF model** workflow to download a model from HuggingFace and push it to the staging environment.
+##### Option A: Package GGUF Model
+
+Use the **Package GGUF model** workflow to download a pre-built GGUF model and push it to the staging environment.
 
 **Single Model Example:**
 1. Go to Actions → Package GGUF model → Run workflow
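
The same run can be dispatched from the command line with the GitHub CLI. A hedged sketch; the input names (`hf_repository`, `repository`, `weights`, `quantization`) are assumptions inferred from the `models_json` example in this README and may differ from the actual single-model input IDs, and the values are illustrative:

```bash
# Hedged sketch: dispatching the GGUF packaging workflow via the GitHub CLI.
# Input names are assumptions inferred from the models_json example below.
gh workflow run package-gguf-model.yml \
  -f hf_repository=HuggingFaceTB/SmolLM2-135M-Instruct-GGUF \
  -f repository=smollm2 \
  -f weights=135M \
  -f quantization=Q4_K_M
```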
@@ -174,6 +180,43 @@ For packaging multiple models at once, use the `models_json` input:
 ]
 ```
 
+##### Option B: Package Safetensors Model
+
+Use the **Package Safetensors model** workflow to clone a Safetensors model from HuggingFace, convert it to GGUF format, and push it to the staging environment.
+
+**Single Model Example:**
+1. Go to Actions → Package Safetensors model → Run workflow
+2. Fill in the inputs:
+   - **HuggingFace repository**: `HuggingFaceTB/SmolLM2-135M-Instruct`
+   - **Registry repository**: `smollm2-safetensors`
+   - **Weights**: `135M`
+   - **Quantization**: `Q4_K_M` (default)
+   - **Llama.cpp tag**: `full-b5763` (default)
+   - **License URL**: `https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/apache-2.0.md`
+
+This will create: `aistaging/smollm2-safetensors:135M-Q4_K_M`
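
For orientation, the conversion stage boils down to roughly the following. This is a sketch under the assumption that the workflow's converter image wraps stock llama.cpp tooling (`convert_hf_to_gguf.py` and `llama-quantize`); the image internals are not shown in this diff:

```bash
# Rough sketch of the safetensors → GGUF conversion, assuming stock
# llama.cpp tooling. The workflow runs the equivalent inside a llama.cpp
# image pinned by the "Llama.cpp tag" input.
git clone https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct
python convert_hf_to_gguf.py SmolLM2-135M-Instruct \
  --outfile smollm2-135M-f16.gguf
llama-quantize smollm2-135M-f16.gguf smollm2-135M-Q4_K_M.gguf Q4_K_M
```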
+
+**Multi-Model Example:**
+For packaging multiple safetensors models at once, use the `models_json` input:
+```json
+[
+  {
+    "hf_repository": "microsoft/DialoGPT-medium",
+    "repository": "dialogpt",
+    "weights": "medium",
+    "quantization": "Q4_K_M",
+    "license_url": "https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/mit.md"
+  },
+  {
+    "hf_repository": "microsoft/DialoGPT-large",
+    "repository": "dialogpt",
+    "weights": "large",
+    "quantization": "Q8_0",
+    "license_url": "https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/mit.md"
+  }
+]
+```
+
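
The multi-model run can likewise be dispatched from the command line. A hedged sketch; the `models_json` input name comes from this README, while the rest of the invocation is an assumption:

```bash
# Hedged sketch: dispatching a multi-model packaging run via the GitHub CLI.
gh workflow run package-safetensors-model.yml \
  -f models_json='[{"hf_repository": "microsoft/DialoGPT-medium", "repository": "dialogpt", "weights": "medium", "quantization": "Q4_K_M", "license_url": "https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/mit.md"}]'
```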
 #### Step 2: Promote to Production
 
 Once your model is successfully packaged in staging, use the **Promote Model to Production** workflow to copy it to the production namespace.
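
For completeness, a hedged sketch of the equivalent CLI dispatch; the `image` input name is an assumption inferred from the walkthroughs below, not a confirmed workflow input ID:

```bash
# Hedged sketch: dispatching the promotion workflow via the GitHub CLI.
# The `image` input name is inferred from the walkthroughs, not confirmed.
gh workflow run promote-model-to-production.yml \
  -f image=smollm2:135M-Q4_K_M
```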
@@ -188,7 +231,9 @@ This will copy: `aistaging/smollm2:135M-Q4_K_M` → `ai/smollm2:135M-Q4_K_M`
 
 #### Complete Example Walkthrough
 
-Let's walk through packaging and promoting a Qwen3 model:
+##### Example 1: GGUF Model (Pre-built)
+
+Let's walk through packaging and promoting a pre-built GGUF model:
 
 1. **Package to Staging**:
    - Workflow: Package GGUF model
@@ -207,3 +252,28 @@ Your model is now available in production and can be pulled using:
 ```bash
 docker pull ai/smollm2:135M-Q4_K_M
 ```
+
+##### Example 2: Safetensors Model (Convert to GGUF)
+
+Let's walk through packaging and promoting a safetensors model:
+
+1. **Package to Staging**:
+   - Workflow: Package Safetensors model
+   - HuggingFace repository: `HuggingFaceTB/SmolLM2-135M-Instruct`
+   - Registry repository: `smollm2-safetensors`
+   - Weights: `135M`
+   - Quantization: `Q4_K_M`
+   - Llama.cpp tag: `full-b5763`
+   - License URL: `https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/apache-2.0.md`
+   - Result: `aistaging/smollm2-safetensors:135M-Q4_K_M`
+
+2. **Promote to Production**:
+   - Workflow: Promote Model to Production
+   - Image: `smollm2-safetensors:135M-Q4_K_M`
+   - Result: `ai/smollm2-safetensors:135M-Q4_K_M`
+
+Your converted model is now available in production and can be pulled using:
+```bash
+docker pull ai/smollm2-safetensors:135M-Q4_K_M
+```
+
