
Commit 244e1c8 (parent 8286e29)

minor changes

File tree

1 file changed: +21 −14 lines

  • cloud-infrastructure/ai-infra-gpu/GPU/rag-langchain-vllm-mistral


cloud-infrastructure/ai-infra-gpu/GPU/rag-langchain-vllm-mistral/README.md

Lines changed: 21 additions & 14 deletions
@@ -1,17 +1,17 @@
 # Overview
 
-This repository is a variant of the Retrieval Augmented Generation (RAG) tutorial available [here](https://github.com/oracle-devrel/technology-engineering/tree/main/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files) with a local deployment of Mistral 7B Instruct v0.2 using vLLM powered by a NVIDIA A10 GPU instead of the OCI GenAI Service.
+This repository is a variant of the Retrieval Augmented Generation (RAG) tutorial available [here](https://github.com/oracle-devrel/technology-engineering/tree/main/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files). Instead of the OCI GenAI Service, it uses a local deployment of Mistral 7B Instruct v0.2 on a vLLM inference server powered by an NVIDIA A10 GPU.
 
 # Requirements
 
 * An OCI tenancy with A10 GPU quota.
 
 # Libraries
 
-* LlamaIndex: a data framework for LLM-based applications which benefit from context augmentation. [doc](https://docs.llamaindex.ai/en/stable/)
-* LangChain: a framework for developing applications powered by large language models. [doc](https://python.langchain.com/docs/get_started/introduction)
-* vLLM: a fast and easy-to-use library for LLM inference and serving. [doc](https://docs.vllm.ai/en/latest/)
-* Qdrant: a vector similarity search engine. [doc](https://qdrant.tech/documentation/)
+* **LlamaIndex**: a data framework for LLM-based applications that benefit from context augmentation.
+* **LangChain**: a framework for developing applications powered by large language models.
+* **vLLM**: a fast and easy-to-use library for LLM inference and serving.
+* **Qdrant**: a vector similarity search engine.
 
 # Mistral LLM
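Editor's note, not part of the commit: with vLLM installed on the A10 host, an OpenAI-compatible endpoint serving this model is typically launched along the following lines. The model name is the HuggingFace Hub ID from the README; the `--max-model-len` value is an illustrative assumption to keep the KV cache within the A10's 24 GB, not a flag taken from this repository.

```shell
# Sketch: serve Mistral 7B Instruct v0.2 through vLLM's OpenAI-compatible server.
# --max-model-len is a hypothetical cap chosen for a single 24 GB A10 GPU.
python -m vllm.entrypoints.openai.api_server \
    --model mistralai/Mistral-7B-Instruct-v0.2 \
    --max-model-len 16384
```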
@@ -55,14 +55,14 @@ The python script creates an all-in-one framework with local instances of the Qd
 
 ### Framework components
 
-* SitemapReader: Asynchronous sitemap reader for web. Reads pages from the web based on their sitemap.xml. Other data connectors are available (Snowflake, Twitter, Wikipedia, etc.).
-* QdrantClient: Python client for the Qdrant vector search engine.
-* SentenceTransformerEmbeddings: Sentence embeddings model (from HuggingFace). Other options include Aleph Alpha, Cohere, MistralAI, SpaCy, etc.
-* VLLM: Fast and easy-to-use LLM inference server
-* Settings: Bundle of commonly used resources used during the indexing and querying stage in a LlamaIndex pipeline/application. In this example we use global configuration.
-* QdrantVectorStore: Vector store where embeddings and docs are stored within a Qdrant collection.
-* StorageContext: Utility container for storing nodes, indices, and vectors.
-* VectorStoreIndex: Index built from the documents loaded in the Vector Store.
+* **SitemapReader**: Asynchronous sitemap reader for the web. Reads pages from the web based on their sitemap.xml. Other data connectors are available (Snowflake, Twitter, Wikipedia, etc.). In this example the sitemap.xml file is stored in an OCI bucket.
+* **QdrantClient**: Python client for the Qdrant vector search engine.
+* **SentenceTransformerEmbeddings**: Sentence embeddings model object (from HuggingFace). Other options include Aleph Alpha, Cohere, MistralAI, SpaCy, etc.
+* **VLLM**: Fast and easy-to-use LLM inference server.
+* **Settings**: Bundle of commonly used resources used during the indexing and querying stages of a LlamaIndex pipeline/application. In this example we use a global configuration.
+* **QdrantVectorStore**: Vector store where embeddings and docs are stored within a Qdrant collection.
+* **StorageContext**: Utility container for storing nodes, indices, and vectors.
+* **VectorStoreIndex**: Index built from the documents loaded in the vector store.
 
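Editor's note, not part of the commit: a sitemap.xml is just an XML list of page URLs, so the set of pages a reader like SitemapReader would crawl can be previewed with the Python standard library alone. This sketch is independent of the SitemapReader implementation, and the example document below is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by sitemap.xml files (sitemaps.org protocol).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def extract_urls(sitemap_xml: str) -> list[str]:
    """Return the page URLs listed in a sitemap.xml document."""
    root = ET.fromstring(sitemap_xml)
    # Each <url> entry holds a <loc> element with the page address.
    return [loc.text for loc in root.iter(f"{{{SITEMAP_NS}}}loc")]

# Hypothetical sitemap with two pages.
example = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs</loc></url>
</urlset>"""

print(extract_urls(example))
```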
### Remote Qdrant client

@@ -112,4 +112,11 @@ To deploy the container, refer to this [tutorial](https://github.com/oracle-devr
 
 # Notes
 
-The libraries used in this example are evolving quite fast. The python script provided here might have to be updated in a near future to avoid Warnings and Errors.
+The libraries used in this example are evolving quite fast. The Python script provided here might have to be updated in the near future to avoid warnings and errors.
+
+# Documentation
+
+* [LlamaIndex](https://docs.llamaindex.ai/en/stable/)
+* [LangChain](https://python.langchain.com/docs/get_started/introduction)
+* [vLLM](https://docs.vllm.ai/en/latest/)
+* [Qdrant](https://qdrant.tech/documentation/)
