# Making a Llamafile Release

This document details the steps involved in making a Llamafile release.

The two primary release artifacts are the `llamafile-<version>.zip` archive and the binaries attached to the GitHub release.

## Release Process

Note: Steps 2 and 3 are only needed if you are making a new release of the ggml-cuda.so and ggml-rocm.so shared libraries. You only need to do this when you are changing the CUDA code or the APIs surrounding it. Otherwise you can reuse the shared libraries from the previous release.
| 10 | + |
1. Update the version number in `version.h`.
2. Build the ggml-cuda.so and ggml-rocm.so shared libraries on Linux. You need to do this for both Llamafile and LocalScore: for CUDA, Llamafile defaults to TINYBLAS while LocalScore defaults to cuBLAS.
   - For Llamafile, run the scripts `./llamafile/cuda.sh` and `./llamafile/rocm.sh` respectively.
   - For LocalScore, run the script `./localscore/cuda.sh`.
   - The files will be built and placed in your home directory.
3. Build the ggml-cuda.dll and ggml-rocm.dll shared libraries on Windows. You need to do this for both Llamafile and LocalScore.
   - For Llamafile, run the scripts `./llamafile/cuda.bat` and `./llamafile/rocm.bat` respectively.
   - For LocalScore, run the script `./localscore/cuda.bat`.
   - The files will be built and placed in the `build/release` directory.
4. Build the project with `make -j8`.
5. Install the built project under `/usr/local` with `sudo make install PREFIX=/usr/local`.
| 22 | + |
### Llamafile Release Zip

The easiest way to create the release directory is:

`make install PREFIX=<preferred_dir>/llamafile-<version>`

After the directory is created, bundle the built shared libraries into the following release binaries:

- `llamafile`
- `localscore`
- `whisperfile`

Note: You MUST put the shared libraries in the same directory as the binary you are modifying.

For llamafile and whisperfile, run:

```
zipalign -j0 llamafile ggml-cuda.so ggml-rocm.so ggml-cuda.dll ggml-rocm.dll
zipalign -j0 whisperfile ggml-cuda.so ggml-rocm.so ggml-cuda.dll ggml-rocm.dll
```
| 43 | + |
After doing this, delete the ggml-cuda.so and ggml-cuda.dll files from the directory, then copy the ggml-cuda.localscore.so and ggml-cuda.localscore.dll files into it, renaming them:

```
rm <path_to>/llamafile-<version>/bin/ggml-cuda.so <path_to>/llamafile-<version>/bin/ggml-cuda.dll
cp ~/ggml-cuda.localscore.so <path_to>/llamafile-<version>/bin/ggml-cuda.so
cp ~/ggml-cuda.localscore.dll <path_to>/llamafile-<version>/bin/ggml-cuda.dll
```

For localscore, you can now package it:

`zipalign -j0 localscore ggml-cuda.so ggml-rocm.so ggml-cuda.dll ggml-rocm.dll`

After you have done this for all the binaries, copy the existing PDFs (from the prior release) into the directory:

`cp <path_to>/doc/*.pdf <path_to>/llamafile-<version>/share/doc/llamafile/`
| 59 | + |
The zip is structured as follows:

```
llamafile-<version>
|-- README.md
|-- bin
|   |-- llamafile
|   |-- llamafile-bench
|   |-- llamafile-convert
|   |-- llamafile-imatrix
|   |-- llamafile-perplexity
|   |-- llamafile-quantize
|   |-- llamafile-tokenize
|   |-- llamafile-upgrade-engine
|   |-- llamafiler
|   |-- llava-quantize
|   |-- localscore
|   |-- sdfile
|   |-- whisperfile
|   `-- zipalign
`-- share
    |-- doc
    |   `-- llamafile
    |       |-- llamafile-imatrix.pdf
    |       |-- llamafile-perplexity.pdf
    |       |-- llamafile-quantize.pdf
    |       |-- llamafile.pdf
    |       |-- llamafiler.pdf
    |       |-- llava-quantize.pdf
    |       |-- whisperfile.pdf
    |       `-- zipalign.pdf
    `-- man
        `-- man1
            |-- llamafile-imatrix.1
            |-- llamafile-perplexity.1
            |-- llamafile-quantize.1
            |-- llamafile.1
            |-- llamafiler.1
            |-- llava-quantize.1
            |-- whisperfile.1
            `-- zipalign.1
```
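As a quick self-contained illustration of the layout above (the version number and temp root are placeholders, and the real directory is populated by `make install`), the skeleton can be recreated with:

```shell
# Self-contained sketch of the release layout shown above.
# The version number and temp root are illustrative placeholders.
set -eu
version=0.9.0
root=$(mktemp -d)/llamafile-$version

mkdir -p "$root/bin" "$root/share/doc/llamafile" "$root/share/man/man1"
: > "$root/README.md"   # placeholder for the real README

# Show the skeleton that make install fills with binaries, PDFs, and man pages.
find "$root" | sort
```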
| 102 | + |
Before you zip the directory, remove the shared libraries from the `bin` directory:

`rm *.so *.dll`

You can then zip the directory with the following command:

`zip -r llamafile-<version>.zip llamafile-<version>`
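Note that the bare `rm` above assumes your working directory is the release `bin` directory. A safer, self-contained sketch (using dummy files in a temp directory as a stand-in for the real release tree) removes the libraries by explicit path:

```shell
# Sketch of removing only the shared libraries; dummy files in a temp
# directory stand in for the real release bin directory.
set -eu
bindir=$(mktemp -d)   # stands in for <path_to>/llamafile-<version>/bin

# Create dummy libraries and one binary that must survive the cleanup.
: > "$bindir/ggml-cuda.so"
: > "$bindir/ggml-rocm.so"
: > "$bindir/ggml-cuda.dll"
: > "$bindir/ggml-rocm.dll"
: > "$bindir/llamafile"

# Remove the shared libraries by explicit path rather than relying on cwd.
rm "$bindir"/*.so "$bindir"/*.dll
ls "$bindir"
```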
| 110 | + |
### Llamafile Release Binaries

After you have built the zip, it is straightforward to create the release binaries.

The following binaries are part of the release:

- `llamafile`
- `llamafile-bench`
- `llamafiler`
- `sdfile`
- `localscore`
- `whisperfile`
- `zipalign`

You can use the release script to create the appropriately named binaries:

`./llamafile/release.sh -v <version> -s <source_dir> -d <dest_dir>`

Make sure to move the `llamafile-<version>.zip` file to the `<dest_dir>` as well; once you have tested everything, you are ready to release.
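The internals of `release.sh` are not shown here, but the naming it produces can be sketched as follows. The copy loop and the `-<version>` suffix are assumptions based on the binary list above, and dummy files stand in for the real build outputs:

```shell
# Hedged sketch of versioned release naming; the suffix convention is an
# assumption based on the binary list above, and the files are dummies.
set -eu
version=0.9.0     # illustrative version
src=$(mktemp -d)  # stands in for <source_dir>
dst=$(mktemp -d)  # stands in for <dest_dir>

binaries="llamafile llamafile-bench llamafiler sdfile localscore whisperfile zipalign"

# Create dummy build outputs in the source directory.
for bin in $binaries; do
  : > "$src/$bin"
done

# Copy each binary to the destination with the version appended.
for bin in $binaries; do
  cp "$src/$bin" "$dst/$bin-$version"
done

ls "$dst"
```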