Merge pull request #23 from BradHutchings/work-in-progress

BradHutchings · web-flow · commit 277d978d6f0f · 2025-04-04T11:09:33.000-07:00
Work in progress
diff --git a/README.md b/README.md
@@ -70,6 +70,8 @@ In no particular order of importance, these are the things that bother me:
 - GPU support without a complicated kludge, and that can support all supported platform / CPU / GPU triads. Perhaps a plugin system with shared library dispatch? Invoking dev tools on Apple Metal like llamafile does is "complicated".
 - Code signing instructions. Might have to sign executables within the zip package, plus the package itself.
 - Clean up remaining build warnings, either by fixing source (i.e. Cosmo) or finding the magical compiler flags.
-- Copy the `cosmo_args` function into `server.cpp` so it could potentially be incorporated upstream in non-Cosmo builds. `common/arg2.cpp` might be a good landing spot. License in [Cosmo source code](https://github.com/jart/cosmopolitan/blob/master/tool/args/args2.c) appears to be MIT compatible with attribution. 
+- Copy the `cosmo_args` function into `server.cpp` so it could potentially be incorporated upstream in non-Cosmo builds. `common/arg2.cpp` might be a good landing spot. License in [Cosmo source code](https://github.com/jart/cosmopolitan/blob/master/tool/args/args2.c) appears to be MIT compatible with attribution.
+  - The args thing is cute, but it might be easier as a yaml file. Key value pairs. Flags can be keys with null values.
 - The `--ctx-size` parameter doesn't seem quite right given that new models have the training (or max) context size in their metadata. That size should be used subject to a maximum in a passed parameter. E.g. So a 128K model can run comfortably on a smaller device.
 - Write docs for a Deploying step. It should address the args file, removing the extra executable depending on platform, models, host, port. context size.
+- Make a `.gitattributes` file so we can set the default file to be displayed and keep the README.md from llama.cpp. This will help in syncing changes continually from upstream. Reference: https://git-scm.com/docs/gitattributes
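
Review note on the `--ctx-size` item above: the README asks for the model's training context size to be used, subject to a maximum passed as a parameter. A minimal sketch of that clamping rule follows. The function name `effective_ctx` is hypothetical and is not part of llama.cpp or this repo; 131072 stands in for a 128K model and 8192 is the value used in the docs.

```shell
#!/bin/sh
# Hypothetical helper (not an existing llama.cpp function):
# pick the smaller of the model's training context and a user-supplied maximum.
# $1 = training context size from model metadata, $2 = maximum allowed context
effective_ctx() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# A 128K model on a small device capped at 8192 tokens
effective_ctx 131072 8192

# A model whose training context is already below the cap keeps its own size
effective_ctx 4096 8192
```

With this rule the parameter acts as a ceiling rather than a fixed size, which is the behavior the bullet describes.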
diff --git a/docs/Configuring-ls1-Brads-Env.md b/docs/Configuring-ls1-Brads-Env.md
@@ -91,7 +91,7 @@ A `default-args` file in the archive can specify sane default parameters. The fo
 
 We don't yet support including the model inside the zip archive (yet). That has a 4GB size limitation on Windows anyway, as `.exe` files cannot exceed 4GB. So let's use an adjacent file called `model.gguf`.
 
-We will serve on localhost, port 8080 by default for safety. The `--ctx-size` parameter is the size of the context window. This is kinda screwy to have as a set size rather than a maximum because the `.gguf` files now have the training context size in metadata. We set it to 8192 to be sensible.
+We will serve on localhost, port 8080 by default for safety. The `--ctx-size` parameter is the size of the context window. This is kinda screwy to have as a set size rather than a maximum because the `.gguf` files now have the training context size in metadata. We set it to 8192 to be sensible. The `--threads-http` parameter ensures that the browser can ask for all the image files in our default UI at once.
 ```
 cat << EOF > $DEFAULT_ARGS
 -m
@@ -102,6 +102,8 @@ model.gguf
 8080
 --ctx-size
 8192
+--threads-http
+8
 --path
 /zip/website
 ...
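
Review note: the args file in this diff puts each flag and its value on separate lines, so the new `--threads-http` / `8` pair can be checked after the file is written. The sketch below is one way to do that. The helper `arg_value` and the temporary `ARGS_FILE` are hypothetical, and the sample file contains only the flag/value pairs visible in this hunk, not the full `default-args` file.

```shell
#!/bin/sh
# Hypothetical helper: print the line that follows a given flag in an args file.
# $1 = path to args file, $2 = flag to look up
arg_value() {
    awk -v flag="$2" 'found { print; exit } $0 == flag { found = 1 }' "$1"
}

# Assumption: a scratch file standing in for $DEFAULT_ARGS, holding only
# the pairs shown in the diff above.
ARGS_FILE="$(mktemp)"
cat << 'EOF' > "$ARGS_FILE"
--ctx-size
8192
--threads-http
8
--path
/zip/website
EOF

arg_value "$ARGS_FILE" --threads-http
arg_value "$ARGS_FILE" --ctx-size
```

A flag that is absent from the file prints nothing, so an empty result can be treated as "not set".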
