| 1 | +# Basic Introduction |
| 2 | + |
| 3 | +This file provides guidance to AI coding tools and developers when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +HugeGraph-LLM is a toolkit that connects graph databases with large language models and is |
| 8 | +part of the Apache HugeGraph AI ecosystem. It integrates HugeGraph with LLMs for building |
| 9 | +applications around three main capabilities: Knowledge Graph Construction, Graph-Enhanced RAG, |
| 10 | +and Text2Gremlin query generation. |
| 11 | + |
| 12 | +## Tech Stack |
| 13 | + |
| 14 | +- **Language**: Python 3.10+ (uv package manager required) |
| 15 | +- **Framework**: FastAPI + Gradio for web interfaces |
| 16 | +- **Graph Database**: HugeGraph Server 1.5+ |
| 17 | +- **LLM Integration**: LiteLLM (supports OpenAI, Ollama, Qianfan, etc.) |
| 18 | +- **Vector Operations**: FAISS and NumPy (support for multiple vector databases is planned) |
| 19 | +- **Code Style**: ruff and mypy (adoption in progress) |
| 20 | +- **Key Dependencies**: hugegraph-python-client |
| 21 | + |
| 22 | +## Essential Commands |
| 23 | + |
| 24 | +### Running the Application |
| 25 | +```bash |
| 26 | +# Install dependencies and create the virtual environment (assumes uv is already installed) |
| 27 | +uv sync |
| 28 | +# Activate virtual environment |
| 29 | +source .venv/bin/activate |
| 30 | +# Launch main RAG demo application |
| 31 | +python -m hugegraph_llm.demo.rag_demo.app |
| 32 | +# Custom host/port |
| 33 | +python -m hugegraph_llm.demo.rag_demo.app --host 127.0.0.1 --port 18001 |
| 34 | +``` |
| 35 | + |
| 36 | +### Testing |
| 37 | +```bash |
| 38 | +pytest src/tests/ |
| 39 | +# Or using unittest |
| 40 | +python -m unittest discover src/tests/ |
| 41 | +``` |
| 42 | +Note: Docker deployment details are omitted here. |
| 43 | + |
| 44 | +## Architecture Overview |
| 45 | + |
| 46 | +### Core Directory Structure |
| 47 | +- `src/hugegraph_llm/api/` - FastAPI endpoints (rag_api.py, admin_api.py) |
| 48 | +- `src/hugegraph_llm/demo/rag_demo/` - Main Gradio UI application |
| 49 | +- `src/hugegraph_llm/operators/` - Core processing pipelines |
| 50 | +- `src/hugegraph_llm/models/` - LLM, embedding, reranker implementations |
| 51 | +- `src/hugegraph_llm/indices/` - Vector and graph indexing |
| 52 | +- `src/hugegraph_llm/config/` - Configuration management |
| 53 | +- `src/hugegraph_llm/utils/` - Utilities, logging, decorators |
| 54 | + |
| 55 | +### Key Processing Pipelines |
| 56 | + |
| 57 | +1. **KG Construction** (`operators/kg_construction_task.py`) |
| 58 | + - Text chunking and vectorization pipeline |
| 59 | + - Schema management and validation |
| 60 | + - Information extraction using LLMs |
| 61 | + - Committing graph data to HugeGraph |
| 62 | + |
| 63 | +2. **Graph RAG** (`operators/graph_rag_task.py`) |
| 64 | + - Multiple retrieval modes (vector, graph, hybrid) |
| 65 | + - Keyword extraction and entity matching |
| 66 | + - Graph traversal and Gremlin query generation |
| 67 | + - Result merging and reranking |
| 68 | + |
| 69 | +3. **Text2Gremlin** (`operators/gremlin_generate_task.py`) |
| 70 | + - Natural language to Gremlin query conversion |
| 71 | + - Template-based and few-shot learning approaches |
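
Each pipeline above chains several steps. One common way to structure such a chain, shown here purely as a sketch and not as the project's actual operator API, is to pass a shared context dictionary through a list of callables. The step functions below are hypothetical stand-ins: they make no LLM or HugeGraph calls, and the capitalized-word heuristic only imitates keyword extraction.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Step = Callable[[Context], Context]


def chunk_text(context: Context) -> Context:
    """Split the input text into paragraph chunks on blank lines."""
    text = context["text"]
    context["chunks"] = [p.strip() for p in text.split("\n\n") if p.strip()]
    return context


def extract_keywords(context: Context) -> Context:
    """Stand-in for LLM extraction: collect capitalized words, in order, without duplicates."""
    keywords: List[str] = []
    for chunk in context["chunks"]:
        for word in chunk.split():
            cleaned = word.strip(".,;:")
            if cleaned[:1].isupper() and cleaned not in keywords:
                keywords.append(cleaned)
    context["keywords"] = keywords
    return context


def run_pipeline(steps: List[Step], context: Context) -> Context:
    """Run each step in order, handing the context from one step to the next."""
    for step in steps:
        context = step(context)
    return context
```

Calling `run_pipeline([chunk_text, extract_keywords], {"text": "..."})` returns the context with `chunks` and `keywords` filled in. Keeping every step to the same signature makes it easy to add, remove, or reorder steps.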
| 72 | + |
| 73 | +### Configuration Management |
| 74 | + |
| 75 | +- Main config: `.env` file (generated with the `config.generate` module) |
| 76 | +- Prompt config: `src/hugegraph_llm/resources/demo/config_prompt.yaml` |
| 77 | +- HugeGraph connection settings are read from environment variables |
| 78 | +- LLM providers are configured through `LiteLLM` and the `openai/ollama` clients |
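
As a rough illustration of the environment-variable approach listed above, the sketch below reads connection settings with fallbacks. The `EXAMPLE_*` variable names and the default values are hypothetical placeholders chosen for this example, not the project's actual configuration keys; the generated `.env` file is the authoritative source for those.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GraphSettings:
    """Connection settings for a graph server (illustrative only)."""

    url: str
    graph: str
    user: str


def load_graph_settings(env: Optional[Mapping[str, str]] = None) -> GraphSettings:
    """Build settings from a mapping of environment variables.

    The EXAMPLE_* names are hypothetical, not the project's real keys.
    """
    env = os.environ if env is None else env
    return GraphSettings(
        url=env.get("EXAMPLE_GRAPH_URL", "http://127.0.0.1:8080"),
        graph=env.get("EXAMPLE_GRAPH_NAME", "hugegraph"),
        user=env.get("EXAMPLE_GRAPH_USER", "admin"),
    )
```

Accepting an optional mapping instead of reading `os.environ` directly keeps the loader easy to exercise in unit tests without touching the real environment.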
| 79 | + |
| 80 | +## Development Workflow |
| 81 | + |
| 82 | +1. **Prerequisites**: Ensure HugeGraph Server is running and an LLM provider is configured |
| 83 | +2. **Environment Setup**: Use `uv` for dependency management and activate the virtual environment |
| 84 | +3. **Configuration**: Generate configs and set up the `.env` file with proper credentials |
| 85 | +4. **Development**: Use the Gradio demo for interactive testing and FastAPI for programmatic access |
| 86 | +5. **Testing**: Unit tests in `src/tests/` use the standard `unittest` framework |
| 87 | + |
| 88 | +## Important Notes |
| 89 | + |
| 90 | +- Always use the `uv` package manager instead of `pip` for dependency management |
| 91 | +- HugeGraph Server must be accessible while running the app |
| 92 | +- The system supports multiple LLM providers through the `LiteLLM` abstraction |
| 93 | +- Keep each file under 600 lines for maintainability |
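
The 600-line guideline can be checked mechanically. The helper below is a minimal sketch using only the standard library; the function names are made up for this example, and limiting the scan to `.py` files is an assumption rather than a rule stated in this repository.

```python
from pathlib import Path

MAX_LINES = 600  # files should stay below this many lines


def count_lines(path):
    """Count the lines in a text file, tolerating undecodable bytes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def find_long_files(root, max_lines=MAX_LINES):
    """Return (path, line_count) pairs for .py files with max_lines or more lines."""
    offenders = []
    for path in sorted(Path(root).rglob("*.py")):
        lines = count_lines(path)
        if lines >= max_lines:
            offenders.append((str(path), lines))
    return offenders
```

Running `find_long_files("src")` from the repository root lists the files that have reached the limit, which could be wired into a pre-commit hook or a CI step.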