Commit 5a612e0

Initial version
1 parent 27f09fd commit 5a612e0

10 files changed: 178 additions, 33 deletions

.gitattributes

Lines changed: 4 additions & 0 deletions

```diff
@@ -0,0 +1,4 @@
+# This overrides the core.autocrlf setting - http://git-scm.com/docs/gitattributes
+# Set default behaviour, in case users don't have core.autocrlf set.
+# We default to Unix line endings (LF) because the exceptions from this are rare.
+* text=auto eol=lf
```
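On a Windows checkout, the effect of this attribute can be spot-checked with `git check-attr`; a small sketch (the file path is illustrative):

```PowerShell
# Ask Git which text/eol attributes apply to a given path.
git check-attr text eol -- README.md
# With the rule above this should report text: auto and eol: lf.
```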

.gitignore

Lines changed: 7 additions & 0 deletions

```diff
@@ -0,0 +1,7 @@
+#
+# .gitignore
+#
+
+*.sublime-workspace
+
+.env
```

CHANGELOG.md

Lines changed: 13 additions & 0 deletions

```diff
@@ -0,0 +1,13 @@
+# Changelog
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [1.0.0] - 2023-11-28
+
+### Added
+- Add .env configuration
+- Add Documentation
+- Add download script
+- Add quantization script
```

README.md

Lines changed: 107 additions & 2 deletions

````diff
@@ -1,2 +1,107 @@
-# windows_manage_llms
-PowerShell automation to download large language models (LLMs) from Git repositories and quantize them with llama.cpp into the GGUF format.
+# Windows Manage Large Language Models
+
+PowerShell automation to download large language models (LLMs) via Git and quantize them with llama.cpp to the `GGUF` format.
+
+Think batch quantization like https://huggingface.co/TheBloke does it, but on your local machine :wink:
+
+## Features
+
+- Easy configuration via a `.env` file
+- Automates the synchronization of Git repositories containing large files (LFS)
+- Only fetches one LFS object at a time
+- Displays a progress indicator while downloading LFS objects
+- Automates the quantization of the source models
+- Handles the intermediate files during quantization to reduce disk usage
+- Improves quantization speed by separating read from write loads
+
+## Installation
+
+### Prerequisites
+
+Use https://github.com/countzero/windows_llama.cpp to compile a specific version of the [llama.cpp](https://github.com/ggerganov/llama.cpp) project on your machine.
+
+
+### Clone the repository from GitHub
+
+Clone the repository to a nice place on your machine via:
+
+```PowerShell
+git clone git@github.com:countzero/windows_manage_large_language_models.git
+```
+
+### Create a .env file
+
+Create the following `.env` file in the project directory. Make sure to change the `LLAMA_CPP_DIRECTORY` value.
+
+```Env
+# Path to the llama.cpp project that contains the
+# convert.py script and the quantize.exe binary.
+LLAMA_CPP_DIRECTORY=C:\windows_llama.cpp\vendor\llama.cpp
+
+# Path to the Git repositories containing the models.
+SOURCE_DIRECTORY=.\source
+
+# Path to the quantized models in GGUF format.
+TARGET_DIRECTORY=.\gguf
+
+# Path to the cache directory for intermediate files.
+#
+# Hint: Ideally this should be located on a different
+# physical drive to improve the quantization speed.
+CACHE_DIRECTORY=.\cache
+
+#
+# Comma separated list of quantization types.
+#
+# Possible llama.cpp quantization types:
+#
+#   Q2_K   :  2.63G, +0.6717 ppl @ LLaMA-v1-7B
+#   Q3_K_S :  2.75G, +0.5551 ppl @ LLaMA-v1-7B
+#   Q3_K_M :  3.07G, +0.2496 ppl @ LLaMA-v1-7B
+#   Q3_K_L :  3.35G, +0.1764 ppl @ LLaMA-v1-7B
+#   Q4_0   :  3.56G, +0.2166 ppl @ LLaMA-v1-7B
+#   Q4_1   :  3.90G, +0.1585 ppl @ LLaMA-v1-7B
+#   Q4_K_S :  3.59G, +0.0992 ppl @ LLaMA-v1-7B
+#   Q4_K_M :  3.80G, +0.0532 ppl @ LLaMA-v1-7B
+#   Q5_0   :  4.33G, +0.0683 ppl @ LLaMA-v1-7B
+#   Q5_1   :  4.70G, +0.0349 ppl @ LLaMA-v1-7B
+#   Q5_K_S :  4.33G, +0.0400 ppl @ LLaMA-v1-7B
+#   Q5_K_M :  4.45G, +0.0122 ppl @ LLaMA-v1-7B
+#   Q6_K   :  5.15G, -0.0008 ppl @ LLaMA-v1-7B
+#   Q8_0   :  6.70G, +0.0004 ppl @ LLaMA-v1-7B
+#   F16    : 13.00G              @ 7B
+#   F32    : 26.00G              @ 7B
+#   COPY   : only copy tensors, no quantizing
+#
+# Hint: The sweet spot is Q4_K_M.
+#
+QUANTIZATION_TYPES=q4_K_M,q2_K
+```
+
+## Usage
+
+### Clone a model
+
+Clone a Git repository containing an LLM into the `SOURCE_DIRECTORY` without checking out any files or downloading any large files (LFS).
+
+```PowerShell
+git -C "./source" clone --no-checkout https://huggingface.co/microsoft/Orca-2-7b
+```
+
+### Download model sources
+
+Download all files across all Git repositories that are inside the `SOURCE_DIRECTORY`.
+
+```PowerShell
+./download_model_sources.ps1
+```
+
+**Hint:** This can also be used to update already existing sources from the remote repositories.
+
+### Quantize models
+
+Quantize all model weights that are inside the `SOURCE_DIRECTORY` into the `TARGET_DIRECTORY`, creating one `GGUF` file per model for each of the `QUANTIZATION_TYPES`.
+
+```PowerShell
+./quantize_weights_for_llama.cpp.ps1
+```
````
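Per the README, the quantize step is a convert-then-quantize pipeline built on llama.cpp's `convert.py` and `quantize.exe`. A hedged sketch of one iteration follows; the exact paths, the model name, and the intermediate file handling are assumptions based on the README's description, not the script's verbatim code:

```PowerShell
# Hedged sketch, not the actual script: convert one source model to an
# unquantized GGUF in the cache directory, quantize it, then drop the
# intermediate file to reduce disk usage.
$unquantized = Join-Path $env:CACHE_DIRECTORY "Orca-2-7b.model-unquantized.gguf"
$quantized   = Join-Path $env:TARGET_DIRECTORY "Orca-2-7b\model-quantized-q4_K_M.gguf"

python (Join-Path $env:LLAMA_CPP_DIRECTORY "convert.py") ".\source\Orca-2-7b" --outfile $unquantized
& (Join-Path $env:LLAMA_CPP_DIRECTORY "quantize.exe") $unquantized $quantized "q4_K_M"

# Removing the intermediate keeps only the quantized GGUF on disk.
Remove-Item $unquantized
```

Placing `$unquantized` on a different physical drive than `$quantized` is what the README means by separating read from write loads.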

cache/.gitignore

Lines changed: 3 additions & 0 deletions

```diff
@@ -0,0 +1,3 @@
+# Ignore everything in this directory except this file.
+*
+!.gitignore
```

download_model_sources.ps1

Lines changed: 12 additions & 1 deletion

```diff
@@ -1,6 +1,17 @@
 $stopwatch = [System.Diagnostics.Stopwatch]::startNew()
 
-$sourceDirectory = "R:\AI\LLM\source"
+Get-Content "./.env" | ForEach {
+
+    $name, $value = $_.split('=', 2)
+
+    if ([string]::IsNullOrWhiteSpace($name) -or $name.Contains('#')) {
+        return
+    }
+
+    Set-Content env:\$name $value
+}
+
+$sourceDirectory = $env:SOURCE_DIRECTORY
 
 $naturalSort = { [regex]::Replace($_, '\d+', { $args[0].Value.PadLeft(20) }) }
 
```
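The README claims the download script fetches one LFS object at a time, with per-file progress. One way that could look, as a hedged sketch (the commands, flags, and repository path are assumptions about the approach, not the script's verbatim body):

```PowerShell
# Hedged sketch: pull LFS objects one at a time so each file gets its
# own progress line instead of one opaque bulk transfer.
$repositoryPath = ".\source\Orca-2-7b"

# Check out the working tree while skipping the LFS smudge filter, so
# large files appear as pointer stubs instead of triggering downloads.
$env:GIT_LFS_SKIP_SMUDGE = "1"
git -C $repositoryPath checkout HEAD -- .
$env:GIT_LFS_SKIP_SMUDGE = "0"

# Then replace each pointer stub with its real content, one file at a time.
$lfsFiles = git -C $repositoryPath lfs ls-files --name-only
ForEach ($file in $lfsFiles) {
    Write-Host "Fetching $file"
    git -C $repositoryPath lfs pull --include "$file" --exclude ""
}
```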

gguf/.gitignore

Lines changed: 3 additions & 0 deletions

```diff
@@ -0,0 +1,3 @@
+# Ignore everything in this directory except this file.
+*
+!.gitignore
```

quantize_weights_for_llama.cpp.ps1

Lines changed: 18 additions & 30 deletions

```diff
@@ -1,35 +1,25 @@
 $stopwatch = [System.Diagnostics.Stopwatch]::startNew()
 
-$llamaCppDirectory = "D:\Privat\GitHub\windows_llama.cpp\vendor\llama.cpp"
-$sourceDirectory = "R:\AI\LLM\source"
-$targetDirectory = "R:\AI\LLM\gguf"
-$cacheDirectory = "E:\cache"
-
-$exclude = @()
-
-$types = @(
-    # "q2_K"
-    # "q3_K"
-    # "q3_K_L"
-    # "q3_K_M"
-    # "q3_K_S"
-    # "q4_0"
-    # "q4_1"
-    # "q4_K"
-    "q4_K_M"
-    # "q4_K_S"
-    # "q5_0"
-    # "q5_1"
-    # "q5_K"
-    # "q5_K_M"
-    # "q5_K_S"
-    # "q6_K"
-    # "q8_0"
-)
+Get-Content "./.env" | ForEach {
+
+    $name, $value = $_.split('=', 2)
+
+    if ([string]::IsNullOrWhiteSpace($name) -or $name.Contains('#')) {
+        return
+    }
+
+    Set-Content env:\$name $value
+}
+
+$llamaCppDirectory = $env:LLAMA_CPP_DIRECTORY
+$sourceDirectory = $env:SOURCE_DIRECTORY
+$targetDirectory = $env:TARGET_DIRECTORY
+$cacheDirectory = $env:CACHE_DIRECTORY
+$quantizationTypes = $env:QUANTIZATION_TYPES -split ','
 
 $naturalSort = { [regex]::Replace($_, '\d+', { $args[0].Value.PadLeft(20) }) }
 
-$repositoryDirectories = Get-ChildItem -Directory $sourceDirectory -Exclude $exclude -Name | Sort-Object $naturalSort
+$repositoryDirectories = Get-ChildItem -Directory $sourceDirectory -Name | Sort-Object $naturalSort
 
 Write-Host "Quantizing $($repositoryDirectories.Length) large language models." -ForegroundColor "Yellow"
 
@@ -46,11 +36,9 @@ ForEach ($repositoryName in $repositoryDirectories) {
 
     Write-Host "Working on ${repositoryName}..." -ForegroundColor "DarkYellow"
 
-    # We are creating the intermediate unquantized model in a dedicated cache directory
-    # so that it can be locatend on another drive to improve the quantization speed.
     $unquantizedModelPath = Join-Path -Path $cacheDirectory -ChildPath "${repositoryName}.model-unquantized.gguf"
 
-    ForEach ($type in $types) {
+    ForEach ($type in $quantizationTypes) {
 
        $quantizedModelPath = Join-Path -Path $targetDirectoryPath -ChildPath "model-quantized-${type}.gguf"
 
```
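Both scripts now share the same `.env` loader. Its behavior on individual lines can be sketched as follows; the sample values come from the README's `.env`, and the snippet mirrors the loader's split-and-skip logic rather than quoting the script:

```PowerShell
# Mirrors the loader's logic: split on the first '=' only,
# then skip blank lines and lines containing '#'.
$line = 'LLAMA_CPP_DIRECTORY=C:\windows_llama.cpp\vendor\llama.cpp'
$name, $value = $line.split('=', 2)
# $name  -> LLAMA_CPP_DIRECTORY
# $value -> C:\windows_llama.cpp\vendor\llama.cpp

$comment = '# Comma separated list of quantization types.'
$name, $value = $comment.split('=', 2)
# No '=' here, so $name holds the whole line; since it contains '#',
# the loader returns early and sets no environment variable.
```

Note that splitting with a maximum of two parts keeps values containing `=` intact, e.g. in Windows paths with query-like suffixes.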

source/.gitignore

Lines changed: 3 additions & 0 deletions

```diff
@@ -0,0 +1,3 @@
+# Ignore everything in this directory except this file.
+*
+!.gitignore
```
Lines changed: 8 additions & 0 deletions

```diff
@@ -0,0 +1,8 @@
+{
+    "folders":
+    [
+        {
+            "path": "."
+        }
+    ]
+}
```
