triton/apps/llms.rst (65 additions, 8 deletions)
@@ -15,7 +15,7 @@ instructions on how to run inference and training on the models.
 Pre-downloaded model weights
 ****************************
 
-We have downloaded following models weights:
+We have downloaded the following model weights (PyTorch model checkpoint directories):
 
 .. list-table::
    :header-rows: 1
@@ -38,7 +38,7 @@ We have downloaded following models weights:
 
    * * Llama 2
      * 7b-chat
-     * ``module load model-llama2/7b``
+     * ``module load model-llama2/7b-chat``
      * Raw weights of 7B parameter chat optimized version of `Llama 2 <https://ai.meta.com/llama/>`__.
 
    * * Llama 2
@@ -53,19 +53,49 @@ We have downloaded following models weights:
 
    * * Llama 2
      * 70b
-     * ``module load model-llama2/13b``
+     * ``module load model-llama2/70b``
      * Raw weights of 70B parameter version of `Llama 2 <https://ai.meta.com/llama/>`__.
 
    * * Llama 2
      * 70b-chat
-     * ``module load model-llama2/13b-chat``
+     * ``module load model-llama2/70b-chat``
      * Raw weights of 70B parameter chat optimized version of `Llama 2 <https://ai.meta.com/llama/>`__.
 
 Each module will set the following environment variables:
 
-- ``MODEL_ROOT`` - Folder where model weights are stored.
-
-
+- ``MODEL_ROOT`` - Folder where model weights are stored, i.e., the PyTorch model checkpoint directory.
+- ``TOKENIZER_PATH`` - File path to the ``tokenizer.model`` file.
+
+Here is an example slurm script that uses the raw weights for batch inference. For details on setting up the environment, example prompts, and Python code, please check out `this repo <>`__.
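Purely as an illustration of how the exported ``MODEL_ROOT`` and ``TOKENIZER_PATH`` variables might be consumed inside such a batch job, here is a minimal Python sketch based on Meta's reference ``llama`` package; the import, the resource settings, and the prompts are assumptions, not the exact example referenced above.

.. code-block:: python

   # batch_inference_sketch.py - illustrative only; adjust to the actual example code
   import os

   from llama import Llama  # assumes Meta's reference llama-2 package is installed

   def main():
       # The model-llama2 modules export these variables.
       ckpt_dir = os.environ["MODEL_ROOT"]            # PyTorch checkpoint directory
       tokenizer_path = os.environ["TOKENIZER_PATH"]  # path to tokenizer.model

       # Build the generator from the raw checkpoint; the sequence and batch
       # sizes below are placeholders, not tuned values.
       generator = Llama.build(
           ckpt_dir=ckpt_dir,
           tokenizer_path=tokenizer_path,
           max_seq_len=512,
           max_batch_size=4,
       )

       prompts = [
           "Explain what a batch scheduler does in one sentence.",
           "List three common uses of large language models.",
       ]

       results = generator.text_completion(
           prompts,
           max_gen_len=128,
           temperature=0.6,
           top_p=0.9,
       )

       for prompt, result in zip(prompts, results):
           print(prompt)
           print(result["generation"])
           print("-" * 40)

   if __name__ == "__main__":
       main()

Note that the reference implementation initializes ``torch.distributed``, so a script like this is normally launched with ``torchrun`` from inside the slurm job, and the larger raw checkpoints (e.g. 70b) are sharded across several GPUs.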
@@ -142,6 +190,15 @@ Each module will set the following environment variables:
 - ``MODEL_ROOT`` - Folder where model weights are stored.
 - ``MODEL_WEIGHTS`` - Path to the model weights in GGUF format.
 
+This Python code snippet is part of a 'Chat with Your PDF Documents' example that uses LangChain with model weights stored in a .gguf file. For details on setting up the environment and the Python code, please check out `this repo <>`__.
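As a rough sketch of how the GGUF weights exposed via ``MODEL_WEIGHTS`` could be wired into LangChain, something like the following might be used; the ``langchain_community`` import path, the ``llama-cpp-python`` backend, and the parameter values are assumptions rather than the exact example code.

.. code-block:: python

   # gguf_langchain_sketch.py - illustrative only
   import os

   from langchain_community.llms import LlamaCpp  # requires llama-cpp-python

   # The GGUF model modules export MODEL_WEIGHTS with the path to the .gguf file.
   llm = LlamaCpp(
       model_path=os.environ["MODEL_WEIGHTS"],
       n_ctx=2048,       # context window; placeholder value
       temperature=0.1,
       max_tokens=256,
   )

   print(llm.invoke("Summarize what a GGUF file is in one sentence."))

A full 'chat with your PDF documents' pipeline would additionally load and split the PDF, embed the chunks into a vector store, and pass the retrieved context to the model together with the user's question; see the linked repository for the complete code.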