
Commit 48320a8

ilopezluna authored and doringeman committed
Added workflow to package a safetensors model (docker#100)
* Added workflow to package a safetensors model
* Build llama-converter image
* No need to install git-lfs
* Use linux/amd64 platform
* No need for intermediate image stage llama-converter
* Remove duplicated FROM
* Update README.md
* Add prepare-matrix step as in the GGUF workflow
* Fix example
* llama.cpp tag always comes from the input
* Update README.md
1 parent 395f8f5 commit 48320a8

File tree

1 file changed: +75 −5 lines changed


pkg/distribution/README.md

Lines changed: 75 additions & 5 deletions
@@ -13,6 +13,8 @@ Model Distribution is a Go library and CLI tool that allows you to package, push
 - Local model storage
 - Model metadata management
 - Command-line interface for all operations
+- GitHub workflows for automated model packaging
+- Support for both GGUF and safetensors model formats
 
 ## Usage
 
@@ -123,27 +125,31 @@ if err != nil {
 
 ### GitHub Workflows for Model Packaging and Promotion
 
-This project provides GitHub workflows to automate the process of packaging GGUF models and promoting them from staging to production environments.
+This project provides GitHub workflows to automate the process of packaging both GGUF and Safetensors models and promoting them from staging to production environments.
 
 #### Overview
 
 The model promotion process follows a two-step workflow:
-1. **Package and Push to Staging**: Use `package-gguf-model.yml` to download a GGUF model from HuggingFace and push it to the `aistaging` namespace
+1. **Package and Push to Staging**: Use either:
+   - `package-gguf-model.yml` to download a pre-built GGUF model and push it to the `aistaging` namespace
+   - `package-safetensors-model.yml` to clone a safetensors model from HuggingFace, convert it to GGUF, and push it to the `aistaging` namespace
 2. **Promote to Production**: Use `promote-model-to-production.yml` to copy the model from staging (`aistaging`) to production (`ai`) namespace
 
 #### Prerequisites
 
 The following GitHub secrets must be configured:
 - `DOCKER_USER`: DockerHub username for production namespace
 - `DOCKER_OAT`: DockerHub access token for production namespace
-- `DOCKER_USER_STAGING`: DockerHub username for staging namespace (typically `aistaging`)
+- `DOCKER_USER_STAGING`: DockerHub username for staging namespace (`aistaging`)
 - `DOCKER_OAT_STAGING`: DockerHub access token for staging namespace
 
 **Note**: The current secrets are configured to write to the `ai` production namespace. If you need to write to a different namespace, you'll need to update the `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` secrets accordingly.
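
These secrets can also be set from the command line with the GitHub CLI. A minimal sketch; the secret names come from the list above, and the values are placeholders, not real credentials:

```bash
# Minimal sketch: configuring the required repository secrets with the
# GitHub CLI. Replace the placeholder values with real DockerHub credentials.
gh secret set DOCKER_USER --body "your-dockerhub-username"
gh secret set DOCKER_OAT --body "your-dockerhub-access-token"
gh secret set DOCKER_USER_STAGING --body "aistaging"
gh secret set DOCKER_OAT_STAGING --body "your-staging-access-token"
```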
 
 #### Step 1: Package Model to Staging
 
-Use the **Package GGUF model** workflow to download a model from HuggingFace and push it to the staging environment.
+##### Option A: Package GGUF Model
+
+Use the **Package GGUF model** workflow to download a pre-built GGUF model and push it to the staging environment.
 
 **Single Model Example:**
 1. Go to Actions → Package GGUF model → Run workflow
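
The same run can be dispatched from the command line with the GitHub CLI. A hedged sketch; the input names (`hf_repository`, `repository`, `weights`, `quantization`) are assumptions inferred from the `models_json` example in this README and may differ from the actual single-model input IDs, and the values are illustrative:

```bash
# Hedged sketch: dispatching the GGUF packaging workflow via the GitHub CLI.
# Input names are assumptions inferred from the models_json example below.
gh workflow run package-gguf-model.yml \
  -f hf_repository=HuggingFaceTB/SmolLM2-135M-Instruct-GGUF \
  -f repository=smollm2 \
  -f weights=135M \
  -f quantization=Q4_K_M
```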
@@ -174,6 +180,43 @@ For packaging multiple models at once, use the `models_json` input:
 ]
 ```
 
+##### Option B: Package Safetensors Model
+
+Use the **Package Safetensors model** workflow to clone a Safetensors model from HuggingFace, convert it to GGUF format, and push it to the staging environment.
+
+**Single Model Example:**
+1. Go to Actions → Package Safetensors model → Run workflow
+2. Fill in the inputs:
+   - **HuggingFace repository**: `HuggingFaceTB/SmolLM2-135M-Instruct`
+   - **Registry repository**: `smollm2-safetensors`
+   - **Weights**: `135M`
+   - **Quantization**: `Q4_K_M` (default)
+   - **Llama.cpp tag**: `full-b5763` (default)
+   - **License URL**: `https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/apache-2.0.md`
+
+This will create: `aistaging/smollm2-safetensors:135M-Q4_K_M`
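
For orientation, the conversion stage boils down to roughly the following. This is a sketch under the assumption that the workflow's converter image wraps stock llama.cpp tooling (`convert_hf_to_gguf.py` and `llama-quantize`); the image internals are not shown in this diff:

```bash
# Rough sketch of the safetensors → GGUF conversion, assuming stock
# llama.cpp tooling. The workflow runs the equivalent inside a llama.cpp
# image pinned by the "Llama.cpp tag" input.
git clone https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct
python convert_hf_to_gguf.py SmolLM2-135M-Instruct \
  --outfile smollm2-135M-f16.gguf
llama-quantize smollm2-135M-f16.gguf smollm2-135M-Q4_K_M.gguf Q4_K_M
```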
+
+**Multi-Model Example:**
+For packaging multiple safetensors models at once, use the `models_json` input:
+```json
+[
+  {
+    "hf_repository": "microsoft/DialoGPT-medium",
+    "repository": "dialogpt",
+    "weights": "medium",
+    "quantization": "Q4_K_M",
+    "license_url": "https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/mit.md"
+  },
+  {
+    "hf_repository": "microsoft/DialoGPT-large",
+    "repository": "dialogpt",
+    "weights": "large",
+    "quantization": "Q8_0",
+    "license_url": "https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/mit.md"
+  }
+]
+```
+
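
The multi-model run can likewise be dispatched from the command line. A hedged sketch; the `models_json` input name comes from this README, while the rest of the invocation is an assumption:

```bash
# Hedged sketch: dispatching a multi-model packaging run via the GitHub CLI.
gh workflow run package-safetensors-model.yml \
  -f models_json='[{"hf_repository": "microsoft/DialoGPT-medium", "repository": "dialogpt", "weights": "medium", "quantization": "Q4_K_M", "license_url": "https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/mit.md"}]'
```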
 #### Step 2: Promote to Production
 
 Once your model is successfully packaged in staging, use the **Promote Model to Production** workflow to copy it to the production namespace.
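
For completeness, a hedged sketch of the equivalent CLI dispatch; the `image` input name is an assumption inferred from the walkthroughs below, not a confirmed workflow input ID:

```bash
# Hedged sketch: dispatching the promotion workflow via the GitHub CLI.
# The `image` input name is inferred from the walkthroughs, not confirmed.
gh workflow run promote-model-to-production.yml \
  -f image=smollm2:135M-Q4_K_M
```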
@@ -188,7 +231,9 @@ This will copy: `aistaging/smollm2:135M-Q4_K_M` → `ai/smollm2:135M-Q4_K_M`
 
 #### Complete Example Walkthrough
 
-Let's walk through packaging and promoting a Qwen3 model:
+##### Example 1: GGUF Model (Pre-built)
+
+Let's walk through packaging and promoting a pre-built GGUF model:
 
 1. **Package to Staging**:
    - Workflow: Package GGUF model
@@ -207,3 +252,28 @@ Your model is now available in production and can be pulled using:
 ```bash
 docker pull ai/smollm2:135M-Q4_K_M
 ```
+
+##### Example 2: Safetensors Model (Convert to GGUF)
+
+Let's walk through packaging and promoting a safetensors model:
+
+1. **Package to Staging**:
+   - Workflow: Package Safetensors model
+   - HuggingFace repository: `HuggingFaceTB/SmolLM2-135M-Instruct`
+   - Registry repository: `smollm2-safetensors`
+   - Weights: `135M`
+   - Quantization: `Q4_K_M`
+   - Llama.cpp tag: `full-b5763`
+   - License URL: `https://huggingface.co/datasets/choosealicense/licenses/resolve/main/markdown/apache-2.0.md`
+   - Result: `aistaging/smollm2-safetensors:135M-Q4_K_M`
+
+2. **Promote to Production**:
+   - Workflow: Promote Model to Production
+   - Image: `smollm2-safetensors:135M-Q4_K_M`
+   - Result: `ai/smollm2-safetensors:135M-Q4_K_M`
+
+Your converted model is now available in production and can be pulled using:
+```bash
+docker pull ai/smollm2-safetensors:135M-Q4_K_M
+```
+
