| 1 | +# RAGPinecone Plugin for GAME SDK |
| 2 | + |
| 3 | +A Retrieval Augmented Generation (RAG) plugin using Pinecone as the vector database for the GAME SDK. |
| 4 | + |
| 5 | +## Features |
| 6 | + |
| 7 | +- Query a knowledge base for relevant context |
| 8 | +- Advanced hybrid search (vector + BM25) for better retrieval |
| 9 | +- AI-generated answers based on retrieved documents |
| 10 | +- Add documents to the knowledge base |
| 11 | +- Delete documents from the knowledge base |
| 12 | +- Chunk documents for better retrieval |
| 13 | +- Process documents from a folder automatically |
| 14 | +- Integrate with a Telegram bot for RAG-powered conversations |
| 15 | + |
| 16 | +## Installation |
| 17 | + |
| 18 | +### From Source |
| 19 | + |
| 20 | +1. Clone the repository (if you have not already) and navigate to the plugin directory: |
| 21 | +```bash |
| 22 | +cd game-python/plugins/RAGPinecone |
| 23 | +``` |
| 24 | + |
| 25 | +2. Install the plugin in development mode: |
| 26 | +```bash |
| 27 | +pip install -e . |
| 28 | +``` |
| 29 | + |
| 30 | +This installs the required dependencies and makes the plugin importable in your environment. |
| 31 | + |
| 32 | +## Setup and Configuration |
| 33 | + |
| 34 | +1. Set the following environment variables: |
| 35 | + - `PINECONE_API_KEY`: Your Pinecone API key |
| 36 | + - `OPENAI_API_KEY`: Your OpenAI API key (for embeddings) |
| 37 | + - `GAME_API_KEY`: Your GAME API key |
| 38 | + - `TELEGRAM_BOT_TOKEN`: Your Telegram bot token (if using with Telegram) |
| 39 | + |
| 40 | +2. Import and initialize the plugin to use in your agent: |
| 41 | + |
| 42 | +```python |
| 43 | +from rag_pinecone_gamesdk.rag_pinecone_plugin import RAGPineconePlugin |
| 44 | +from rag_pinecone_gamesdk.rag_pinecone_game_functions import query_knowledge_fn, add_document_fn |
| 45 | + |
| 46 | +# Initialize the plugin |
| 47 | +rag_plugin = RAGPineconePlugin( |
| 48 | + pinecone_api_key="your-pinecone-api-key", |
| 49 | + openai_api_key="your-openai-api-key", |
| 50 | + index_name="your-index-name", |
| 51 | + namespace="your-namespace" |
| 52 | +) |
| 53 | + |
| 54 | +# Add the functions to your agent's action space |
| 55 | +agent_action_space = [ |
| 56 | + query_knowledge_fn(rag_plugin), |
| 57 | + add_document_fn(rag_plugin), |
| 58 | + # ... other functions |
| 59 | +] |
| 60 | +``` |
| 61 | + |
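Since step 1 stores the keys in environment variables, the constructor arguments can be read from the environment instead of being hard-coded. The following is a minimal sketch, not part of the plugin: `load_rag_settings`, `RAG_INDEX_NAME`, and `RAG_NAMESPACE` are hypothetical names introduced here for illustration, and only the four keyword arguments shown in the example above are assumed.

```python
import os

# Variables named in the setup step that the plugin constructor needs.
REQUIRED_VARS = ("PINECONE_API_KEY", "OPENAI_API_KEY")


def load_rag_settings(env=None):
    """Build constructor keyword arguments from environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "pinecone_api_key": env["PINECONE_API_KEY"],
        "openai_api_key": env["OPENAI_API_KEY"],
        # Hypothetical optional variables; fall back to the placeholders above.
        "index_name": env.get("RAG_INDEX_NAME", "your-index-name"),
        "namespace": env.get("RAG_NAMESPACE", "your-namespace"),
    }


# Usage (assumes the plugin is installed):
# rag_plugin = RAGPineconePlugin(**load_rag_settings())
```

Failing early with a list of the missing variables tends to be easier to debug than an authentication error raised later by Pinecone or OpenAI.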
| 62 | +## Available Functions |
| 63 | + |
| 64 | +### Basic RAG Functions |
| 65 | + |
| 66 | +1. `query_knowledge(query: str, num_results: int = 3)` - Query the knowledge base for relevant context |
| 67 | +2. `add_document(content: str, metadata: dict = None)` - Add a document to the knowledge base |
| 68 | + |
| 69 | +### Advanced RAG Functions |
| 70 | + |
| 71 | +1. `advanced_query_knowledge(query: str)` - Query the knowledge base using hybrid retrieval (vector + BM25) and get an AI-generated answer |
| 72 | +2. `get_relevant_documents(query: str)` - Get relevant documents using hybrid retrieval without generating an answer |
| 73 | + |
| 74 | +Example usage of advanced functions: |
| 75 | + |
| 76 | +```python |
| 77 | +from rag_pinecone_gamesdk.search_rag import RAGSearcher |
| 78 | +from rag_pinecone_gamesdk.rag_pinecone_game_functions import advanced_query_knowledge_fn, get_relevant_documents_fn |
| 79 | + |
| 80 | +# Initialize the RAG searcher |
| 81 | +rag_searcher = RAGSearcher( |
| 82 | + pinecone_api_key="your-pinecone-api-key", |
| 83 | + openai_api_key="your-openai-api-key", |
| 84 | + index_name="your-index-name", |
| 85 | + namespace="your-namespace" |
| 86 | +) |
| 87 | + |
| 88 | +# Add the advanced functions to your agent's action space |
| 89 | +agent_action_space = [ |
| 90 | + advanced_query_knowledge_fn(rag_searcher), |
| 91 | + get_relevant_documents_fn(rag_searcher), |
| 92 | + # ... other functions |
| 93 | +] |
| 94 | +``` |
| 95 | + |
| 96 | +## Populating the Knowledge Base |
| 97 | + |
| 98 | +### Using the Documents Folder |
| 99 | + |
| 100 | +The easiest way to populate the knowledge base is to place your documents in the `Documents` folder and run the provided script: |
| 101 | + |
| 102 | +```bash |
| 103 | +cd game-python/plugins/RAGPinecone |
| 104 | +python examples/populate_knowledge_base.py |
| 105 | +``` |
| 106 | + |
| 107 | +This will process all supported files in the `Documents` folder and add them to the knowledge base. |
| 108 | + |
| 109 | +Supported file types: |
| 110 | +- `.txt` - Text files |
| 111 | +- `.pdf` - PDF documents |
| 112 | +- `.docx` - Word documents |
| 113 | +- `.doc` - Word documents |
| 114 | +- `.csv` - CSV files |
| 115 | +- `.md` - Markdown files |
| 116 | +- `.html` - HTML files |
| 117 | + |
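To check which files in a folder would be picked up before running the script, the extension list above can be applied directly. This is an illustrative sketch only: `is_supported` and `find_supported_files` are hypothetical helpers, and treating extensions as case-insensitive is an assumption about the script, not something confirmed by it.

```python
from pathlib import Path

# Extensions taken from the list of supported file types above.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".docx", ".doc", ".csv", ".md", ".html"}


def is_supported(path):
    """Return True if the file extension is in the supported list."""
    # Assumption: matching is case-insensitive (e.g. "REPORT.PDF" counts).
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def find_supported_files(folder):
    """Recursively list supported files under `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).rglob("*") if p.is_file() and is_supported(p)
    )
```

For example, `find_supported_files("Documents")` returns a sorted list of `Path` objects, which makes it easy to spot files that would be skipped.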
| 118 | +### Using the API |
| 119 | + |
| 120 | +You can also populate the knowledge base programmatically: |
| 121 | + |
| 122 | +```python |
| 123 | +from rag_pinecone_gamesdk.populate_rag import RAGPopulator |
| 124 | + |
| 125 | +# Initialize the populator |
| 126 | +populator = RAGPopulator( |
| 127 | + pinecone_api_key="your-pinecone-api-key", |
| 128 | + openai_api_key="your-openai-api-key", |
| 129 | + index_name="your-index-name", |
| 130 | + namespace="your-namespace" |
| 131 | +) |
| 132 | + |
| 133 | +# Add a document |
| 134 | +content = "Your document content here" |
| 135 | +metadata = { |
| 136 | + "title": "Document Title", |
| 137 | + "author": "Author Name", |
| 138 | + "source": "Source Name", |
| 139 | +} |
| 140 | + |
| 141 | +status, message, results = populator.add_document(content, metadata) |
| 142 | +print(f"Status: {status}") |
| 143 | +print(f"Message: {message}") |
| 144 | +print(f"Results: {results}") |
| 145 | + |
| 146 | +# Process all documents in a folder |
| 147 | +status, message, results = populator.process_documents_folder() |
| 148 | +print(f"Status: {status}") |
| 149 | +print(f"Message: {message}") |
| 150 | +print(f"Processed {results.get('total_files', 0)} files, {results.get('successful_files', 0)} successful") |
| 151 | +``` |
| 152 | + |
| 153 | +## Testing the Advanced Search |
| 154 | + |
| 155 | +You can test the advanced search functionality using the provided example script: |
| 156 | + |
| 157 | +```bash |
| 158 | +cd game-python/plugins/RAGPinecone |
| 159 | +python examples/test_advanced_search.py |
| 160 | +``` |
| 161 | + |
| 162 | +This will run a series of test queries using the advanced hybrid retrieval system. |
| 163 | + |
| 164 | +## Integration with Telegram |
| 165 | + |
| 166 | +See the `examples/test_rag_pinecone_telegram.py` file for an example of how to integrate the RAGPinecone plugin with a Telegram bot. |
| 167 | + |
| 168 | +To run the Telegram bot with advanced RAG capabilities: |
| 169 | + |
| 170 | +```bash |
| 171 | +cd game-python/plugins/RAGPinecone |
| 172 | +python examples/test_rag_pinecone_telegram.py |
| 173 | +``` |
| 174 | + |
| 175 | +## Advanced Usage |
| 176 | + |
| 177 | +### Hybrid Retrieval |
| 178 | + |
| 179 | +The advanced search functionality uses a hybrid retrieval approach that combines: |
| 180 | + |
| 181 | +1. **Vector Search**: Uses embeddings to find semantically similar documents |
| 182 | +2. **BM25 Search**: Uses keyword matching to find documents with relevant terms |
| 183 | + |
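| 184 | +Vector search captures paraphrases and related concepts, while BM25 rewards exact term matches such as names and identifiers, so the hybrid approach often retrieves better results than either method alone, especially for complex queries. |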
| 185 | + |
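One common way to merge a vector ranking and a BM25 ranking is reciprocal rank fusion (RRF). The sketch below illustrates the general technique and is not necessarily how this plugin combines its results; the function name, the document IDs, and the constant `k=60` (a conventional default for RRF) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher fused scores come first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that appear in both lists (`doc_a`, `doc_b`) accumulate score from each list, so they rise above documents found by only one retriever.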
| 186 | +### Custom Document Processing |
| 187 | + |
| 188 | +You can customize how documents are processed by extending the `RAGPopulator` class: |
| 189 | + |
| 190 | +```python |
| 191 | +from rag_pinecone_gamesdk.populate_rag import RAGPopulator |
| 192 | + |
| 193 | +class CustomRAGPopulator(RAGPopulator): |
| 194 | + def chunk_document(self, content, metadata): |
| 195 | + # Custom chunking logic |
| 196 | + # ... |
| 197 | + return chunked_docs |
| 198 | +``` |
| 199 | + |
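As one possible body for such a custom chunking method, the sketch below splits text into fixed-size character windows with overlap and copies the metadata onto each chunk. It is a standalone illustration: `split_with_overlap` is a hypothetical helper, and the return shape (a list of dicts with `content` and `metadata` keys) is an assumption, since the format `RAGPopulator.chunk_document` expects is not specified here.

```python
def split_with_overlap(content, metadata, chunk_size=500, overlap=50):
    """Split `content` into overlapping character windows.

    Returns a list of {"content": ..., "metadata": ...} dicts (assumed shape).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for index, start in enumerate(range(0, len(content), step)):
        chunks.append(
            {
                "content": content[start:start + chunk_size],
                # Copy the metadata so chunks do not share one mutable dict.
                "metadata": {**metadata, "chunk_index": index},
            }
        )
        if start + chunk_size >= len(content):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap keeps sentences that straddle a boundary retrievable from either neighboring chunk, at the cost of storing some text twice.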
| 200 | +### Custom Embedding Models |
| 201 | + |
| 202 | +You can use different embedding models by specifying the `embedding_model` parameter: |
| 203 | + |
| 204 | +```python |
| 205 | +rag_plugin = RAGPineconePlugin( |
| 206 | + embedding_model="sentence-transformers/all-mpnet-base-v2" |
| 207 | +) |
| 208 | +``` |
| 209 | + |
| 210 | +## Requirements |
| 211 | + |
| 212 | +- Python 3.9+ |
| 213 | +- Pinecone account |
| 214 | +- OpenAI API key |
| 215 | +- GAME SDK |
| 216 | +- langchain |
| 217 | +- langchain_community |
| 218 | +- langchain_pinecone |
| 219 | +- langchain_openai |