|
# Overview

This repository is a variant of the Retrieval Augmented Generation (RAG) tutorial available [here](https://github.com/oracle-devrel/technology-engineering/tree/main/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files). Instead of the OCI GenAI Service, it uses a local deployment of Mistral 7B Instruct v0.2 on a vLLM inference server powered by an NVIDIA A10 GPU.

# Requirements

* An OCI tenancy with A10 GPU quota.

# Libraries

* **LlamaIndex**: a data framework for LLM-based applications that benefit from context augmentation.
* **LangChain**: a framework for developing applications powered by large language models.
* **vLLM**: a fast and easy-to-use library for LLM inference and serving.
* **Qdrant**: a vector similarity search engine.

# Mistral LLM

### Framework components

* **SitemapReader**: asynchronous sitemap reader for the web; reads pages based on their sitemap.xml. Other data connectors are available (Snowflake, Twitter, Wikipedia, etc.). In this example, the sitemap.xml file is stored in an OCI bucket.
* **QdrantClient**: Python client for the Qdrant vector search engine.
* **SentenceTransformerEmbeddings**: sentence embeddings model (from HuggingFace). Other options include Aleph Alpha, Cohere, MistralAI, SpaCy, etc.
* **VLLM**: fast and easy-to-use LLM inference server.
* **Settings**: bundle of commonly used resources used during the indexing and querying stages of a LlamaIndex pipeline/application. In this example we use the global configuration.
* **QdrantVectorStore**: vector store where embeddings and documents are stored within a Qdrant collection.
* **StorageContext**: utility container for storing nodes, indices, and vectors.
* **VectorStoreIndex**: index built from the documents loaded into the vector store.

### Remote Qdrant client

# Notes

The libraries used in this example are evolving quite fast. The Python script provided here might have to be updated in the near future to avoid warnings and errors.

# Documentation

* [LlamaIndex](https://docs.llamaindex.ai/en/stable/)
* [LangChain](https://python.langchain.com/docs/get_started/introduction)
* [vLLM](https://docs.vllm.ai/en/latest/)
* [Qdrant](https://qdrant.tech/documentation/)