Commit 78c19ab: readme update (1 parent: 2ae36eb)

1 file changed: 20 additions, 10 deletions

readme.md

@@ -13,13 +13,13 @@

1. Clone the repo using git:

```shell
$ git clone https://github.com/rauni-iitr/langchain_chromaDB_opensourceLLM_streamlit.git
```

2. Create a virtual environment with 'venv' or with 'conda', and activate it:

```shell
$ python3 -m venv .venv
$ source .venv/bin/activate
```

3. Now this RAG application is built using a few dependencies:
@@ -33,29 +33,39 @@

You can install all of these with pip:

```shell
$ pip install pypdf chromadb transformers sentence-transformers streamlit
```
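For background on what these dependencies do together: pypdf extracts text from PDFs, sentence-transformers turns text into embeddings, and chromadb stores them for retrieval. Before embedding, a RAG app typically splits the extracted text into overlapping chunks. A stdlib-only sketch of that splitting step (illustrative only; the chunk sizes, and how the repo actually splits text, are assumptions):

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks, as a RAG
    pipeline does before embedding documents into a vector store."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# 1200 chars with step 450 -> chunks starting at 0, 450, 900
chunks = split_into_chunks("a" * 1200, chunk_size=500, overlap=50)
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both neighbors.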
4. Installing llama-cpp-python:

* This project uses [LlamaCpp-Python](https://github.com/abetlen/llama-cpp-python) for GGUF model loading and inference (llama-cpp-python >= 0.1.83); if you are using GGML models, you need llama-cpp-python <= 0.1.76.
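To make that version cutoff concrete, here is a small helper (illustrative only, not part of the repo) that classifies which model format a given llama-cpp-python version can load, assuming the two cutoffs above:

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Turn a dotted version string like '0.1.83' into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

def supported_format(installed: str) -> str:
    """Classify which quantized-model format this llama-cpp-python version loads."""
    v = parse_version(installed)
    if v >= parse_version("0.1.83"):
        return "GGUF"
    if v <= parse_version("0.1.76"):
        return "GGML"
    return "neither"  # 0.1.77-0.1.82 falls between the two cutoffs
```

For example, `supported_format("0.1.83")` returns `"GGUF"`, so the pinned version in the commands below loads GGUF models.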
If you are going to use BLAS or Metal with [llama-cpp](https://github.com/abetlen/llama-cpp-python#installation-with-openblas--cublas--clblast--metal) for faster inference, the appropriate flags need to be set.

For NVIDIA GPU inference, use 'cuBLAS' and run the command below in your terminal:

```shell
$ CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir
```

For Apple Metal (M1/M2) inference, use 'METAL' and run:

```shell
$ CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir
```

For more info on setting the right flags for the device where your app runs, see [here](https://codesandbox.io/p/github/imotai/llama-cpp-python/main).

5. Downloading GGUF/GGML models: the model needs to be downloaded and its path given to the code in 'rag.py':

* To run the app with open-source LLMs saved locally, download a [model](https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/tree/main).<br>
* You can download any gguf file there based on your RAM specifications; 2, 3, 4 and 8 bit quantized models of [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1), developed by MistralAI, are available.<br>
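As a rough guide for matching a quantization level to your RAM, a quantized model file occupies approximately parameters × bits per weight / 8 bytes. A small sketch of that arithmetic (approximate; it ignores the KV cache and other runtime overhead):

```python
def approx_model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of a quantized model in GB:
    parameters * bits per weight / 8 bits per byte / 1e9 bytes per GB."""
    return n_params * bits_per_weight / 8 / 1e9

# A 7B-parameter model at 4-bit quantization is roughly 3.5 GB on disk,
# and at 8-bit roughly 7 GB, plus extra working memory at inference time.
size_4bit = approx_model_size_gb(7e9, 4)  # 3.5
size_8bit = approx_model_size_gb(7e9, 8)  # 7.0
```

So on an 8 GB machine a 4-bit quant of Mistral-7B is comfortable, while an 8-bit quant leaves little headroom.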
**Note:** You can download any other model, such as llama-2 or other versions of Mistral, in gguf or ggml format to run through llama-cpp. If you have access to a GPU, you can also use GPTQ models (for better LLM performance), which can be loaded with other libraries such as transformers.

### Your setup to run the llm app is ready.

To run the model:

```shell
$ streamlit run st_app.py
```
