-.Dd August 17 , 2024
+.Dd November 30 , 2024
 .Dt LLAMAFILER 1
 .Os Mozilla Ocho
 .Sh NAME
@@ -30,6 +30,11 @@ recommended that you run multiple instances of llamafiler behind a
 reverse proxy such as NGINX or Redbean.
 .It Fl mm Ar FNAME , Fl Fl mmproj Ar FNAME
 Path of vision model weights.
+.It Fl Fl db Ar FILE
+Specifies path of sqlite3 database.
+.Pp
+The default is
+.Pa ~/.llamafile/llamafile.sqlite3
 .It Fl l Ar HOSTPORT , Fl Fl listen Ar HOSTPORT
 Specifies the local [HOST:]PORT on which the HTTP server should listen.
 By default this is 0.0.0.0:8080 which means llamafiler will bind to port
@@ -55,8 +60,10 @@ Please note that
 has a strong influence on how many slots can be created.
 .It Fl p Ar TEXT , Fl Fl prompt Ar TEXT
 Specifies system prompt. This value is passed along to the web frontend.
-.It Fl Fl no-display-prompt Ar TEXT
+.It Fl Fl no-display-prompt
 Hide system prompt from web user interface.
+.It Fl Fl nologo
+Hide llamafile logo icon from web ui.
 .It Fl Fl url-prefix Ar URLPREFIX
 Specifies a URL prefix (subdirectory) under which the HTTP server will
 make the API accessible, e.g. /lamafiler. Useful when running llamafiler
@@ -130,6 +137,39 @@ supported by the host operating system. The default keepalive is 5.
 Size of HTTP output buffer size, in bytes. Default is 1048576.
 .It Fl Fl http-ibuf-size Ar N
 Size of HTTP input buffer size, in bytes. Default is 1048576.
+.It Fl Fl chat-template Ar NAME
+Specifies or overrides chat template for model.
+.Pp
+Normally the GGUF metadata tokenizer.chat_template will specify this
+value for instruct models. This flag may be used to either override the
+chat template, or specify one when the GGUF metadata field is absent,
+which effectively forces the web ui to enable chatbot mode.
+.Pp
+Supported chat template names are: chatml, llama2, llama3, mistral
+(alias for llama2), phi3, zephyr, monarch, gemma, gemma2 (alias for
+gemma), orion, openchat, vicuna, vicuna-orca, deepseek, command-r,
+chatglm3, chatglm4, minicpm, deepseek2, or exaone3.
+.Pp
+It is also possible to pass the jinja2 template itself to this argument.
+Since llamafiler doesn't currently support jinja2, a heuristic will be
+used to guess which of the above templates the template represents.
+.It Fl Fl completion-mode
+Forces web ui to operate in completion mode, rather than chat mode.
+Normally the web ui chooses its mode based on the GGUF metadata. Base
+models normally don't define tokenizer.chat_template whereas instruct
+models do. If it's a base model, then the web ui will automatically use
+completion mode only, without needing to specify this flag. This flag is
+useful in cases where a prompt template is defined by the gguf, but it
+is desirable for the chat interface to be disabled.
+.It Fl Fl db-startup-sql
+Specifies SQL code that should be executed whenever connecting to the
+SQLite database. The default is the following code, which enables the
+write-ahead log.
+.Bd -literal -offset indent
+PRAGMA journal_mode=WAL;
+PRAGMA synchronous=NORMAL;
+.Ed
+.El
 .Sh EXAMPLE
 Here's an example of how you might start this server:
 .Pp
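
The new --db-startup-sql text says the default SQL runs on every connection and enables the write-ahead log. The sketch below is not llamafiler code. It uses Python's standard sqlite3 module to show what running those two PRAGMA statements on a fresh connection does. The helper names (open_with_startup_sql, demo) and the temporary database path are assumptions made for illustration only.

```python
import os
import sqlite3
import tempfile

# Default startup SQL as quoted in the man page diff above.
STARTUP_SQL = """
PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;
"""


def open_with_startup_sql(path, startup_sql=STARTUP_SQL):
    """Open a SQLite database and run the startup SQL on the new connection.

    Hypothetical helper: it mimics the documented behavior of running
    the SQL "whenever connecting", not llamafiler's actual implementation.
    """
    conn = sqlite3.connect(path)
    conn.executescript(startup_sql)
    return conn


def demo():
    """Return the journal mode and synchronous level after startup SQL."""
    # WAL requires a file-backed database, so use a throwaway file
    # rather than an in-memory database.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.sqlite3")
        conn = open_with_startup_sql(path)
        try:
            journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
            synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
        finally:
            conn.close()
    return journal_mode, synchronous


if __name__ == "__main__":
    print(demo())
```

The journal mode is stored in the database file, while the synchronous level applies only to the current connection. That is one plausible reason for running the SQL on every connect rather than once at database creation.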