Commit 277d978

Merge pull request #23 from BradHutchings/work-in-progress
Work in progress
2 parents 74293b6 + b36532c commit 277d978

2 files changed, 6 additions and 2 deletions
README.md

Lines changed: 3 additions & 1 deletion
@@ -70,6 +70,8 @@ In no particular order of importance, these are the things that bother me:
 - GPU support without a complicated kludge, and that can support all supported platform / CPU / GPU triads. Perhaps a plugin system with shared library dispatch? Invoking dev tools on Apple Metal like llamafile does is "complicated".
 - Code signing instructions. Might have to sign executables within the zip package, plus the package itself.
 - Clean up remaining build warnings, either by fixing source (i.e. Cosmo) or finding the magical compiler flags.
-- Copy the `cosmo_args` function into `server.cpp` so it could potentially be incorporated upstream in non-Cosmo builds. `common/arg2.cpp` might be a good landing spot. License in [Cosmo source code](https://github.com/jart/cosmopolitan/blob/master/tool/args/args2.c) appears to be MIT compatible with attribution.
+- Copy the `cosmo_args` function into `server.cpp` so it could potentially be incorporated upstream in non-Cosmo builds. `common/arg2.cpp` might be a good landing spot. License in [Cosmo source code](https://github.com/jart/cosmopolitan/blob/master/tool/args/args2.c) appears to be MIT compatible with attribution.
+- The args thing is cute, but it might be easier as a yaml file. Key value pairs. Flags can be keys with null values.
 - The `--ctx-size` parameter doesn't seem quite right given that new models have the training (or max) context size in their metadata. That size should be used subject to a maximum in a passed parameter. E.g. So a 128K model can run comfortably on a smaller device.
 - Write docs for a Deploying step. It should address the args file, removing the extra executable depending on platform, models, host, port. context size.
+- Make a `.gitattributes` file so we can set the default file to be displayed and keep the README.md from llama.cpp. This will help in syncing changes continually from upstream. Reference: https://git-scm.com/docs/gitattributes
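
The YAML idea in the added README item (key/value pairs, with flags as keys that have null values) could map onto the existing one-argument-per-line args format roughly as follows. This is only a sketch: the function name `mapping_to_args` and the key `example-flag` are hypothetical, and the dictionary stands in for whatever a YAML parser would return, since no parser is used here.

```python
def mapping_to_args(config):
    """Flatten key/value pairs into one-argument-per-line form.

    A key whose value is None is treated as a bare flag (assumption:
    this mirrors "flags can be keys with null values").
    """
    lines = []
    for key, value in config.items():
        lines.append(f"--{key}")
        if value is not None:
            lines.append(str(value))
    return lines


# Stand-in for a parsed YAML document; "example-flag" is a made-up flag name.
config = {"ctx-size": 8192, "threads-http": 8, "example-flag": None}
print("\n".join(mapping_to_args(config)))
```

Python dictionaries preserve insertion order, so the generated lines follow the order of the keys in the file.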

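The `--ctx-size` item in the README hunk proposes using the model's training context size, capped by a maximum passed as a parameter. A minimal sketch of that rule, assuming hypothetical names `effective_ctx_size`, `train_ctx`, and `max_ctx` (none of these come from llama.cpp):

```python
def effective_ctx_size(train_ctx, max_ctx):
    """Use the model's training context size, capped at a caller-supplied maximum."""
    return min(train_ctx, max_ctx)


# A 128K-context model on a device that can only afford an 8192-token window.
print(effective_ctx_size(128 * 1024, 8192))
# A small model whose training context is already below the cap.
print(effective_ctx_size(4096, 8192))
```

With this rule the passed value acts as an upper bound rather than a fixed size, so smaller models are not given a larger window than they were trained for.
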
docs/Configuring-ls1-Brads-Env.md

Lines changed: 3 additions & 1 deletion
@@ -91,7 +91,7 @@ A `default-args` file in the archive can specify sane default parameters. The fo
 
 We don't yet support including the model inside the zip archive (yet). That has a 4GB size limitation on Windows anyway, as `.exe` files cannot exceed 4GB. So let's use an adjacent file called `model.gguf`.
 
-We will serve on localhost, port 8080 by default for safety. The `--ctx-size` parameter is the size of the context window. This is kinda screwy to have as a set size rather than a maximum because the `.gguf` files now have the training context size in metadata. We set it to 8192 to be sensible.
+We will serve on localhost, port 8080 by default for safety. The `--ctx-size` parameter is the size of the context window. This is kinda screwy to have as a set size rather than a maximum because the `.gguf` files now have the training context size in metadata. We set it to 8192 to be sensible. The `--threads-http` parameter ensures that the browser can ask for all the image files in our default UI at once.
 ```
 cat << EOF > $DEFAULT_ARGS
 -m
@@ -102,6 +102,8 @@ model.gguf
 8080
 --ctx-size
 8192
+--threads-http
+8
 --path
 /zip/website
 ...
