# HugeGraph LLM Roadmap
The architecture diagram follows the HG-RAG Technical Architecture. For the earlier categorized structure diagram, see Legacy-HG-llm-design.
- Supports traditional "Triple + Disambiguation + LLM" approach (limited use)
- Supports "Property Graph + LLM" extraction (main choice)
- Graph Schema supports two modes:
  - `schema-free`: unspecified; all nodes and edges use generic labels (e.g., "link"/"entity") (poor results, not recommended)
  - `schema-defined`: manually constructed schema (recommended, good results)
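As a rough illustration of the `schema-defined` mode, the sketch below defines and validates a small property-graph schema. The dict layout, label names, and property names are illustrative assumptions for this document, not the project's actual schema format.

```python
# Hypothetical schema definition for the "schema-defined" mode.
# All names and the dict shape here are illustrative, not from the project.
schema = {
    "vertexlabels": [
        {"name": "person", "properties": ["name", "birth_date"], "primary_keys": ["name"]},
        {"name": "company", "properties": ["name", "founded"], "primary_keys": ["name"]},
    ],
    "edgelabels": [
        {"name": "works_at", "source_label": "person", "target_label": "company",
         "properties": ["since"]},
    ],
}

def validate_schema(s: dict) -> bool:
    """Check that every edge label references declared vertex labels."""
    vertex_names = {v["name"] for v in s["vertexlabels"]}
    return all(e["source_label"] in vertex_names and e["target_label"] in vertex_names
               for e in s["edgelabels"])

print(validate_schema(schema))  # True for the schema above
```

A predefined schema like this constrains what the LLM may extract, which is why the text reports much better results than the schema-free mode.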
- Prompts use fixed templates with manual adjustments (mainly adding examples, good results, high ROI)
- Optional NLP fusion/disambiguation for triples, disabled by default
  - Not yet available for property graphs; considering graph algorithms later (see below)
- Basic query enhancement/rewriting before tokenization (no complex rewriting plans)
- Intent recognition pending (in planning, see below)
- Step 1: Text2GQL (Gremlin): prioritizes converting user questions to graph queries, offering the best effect/cost/performance when correct (priority)
  - Supports embedded simple/complex templates for common queries (includes a default template)
  - If the query succeeds, returns `graph_context` with recall effect `1` (A-Excellent)
  - If the query fails or returns empty, proceeds to Step 2 ↓, falling back to the generalized query (K-hop neighbors)
    - If results are found, returns `graph_context` with recall effect `0` (B-Normal)
    - If no graph data is found, returns recall effect `-1` (C-Missing)
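The Step 1 to Step 2 fallback logic above can be sketched as follows. `gen_gremlin`, `run_query`, and `khop_fallback` are hypothetical stand-ins for the real LLM and graph-query components; only the control flow and recall-effect codes come from the text.

```python
from typing import Callable, Optional

def text2gql_step(question: str,
                  gen_gremlin: Callable[[str], str],
                  run_query: Callable[[str], Optional[list]],
                  khop_fallback: Callable[[str], Optional[list]]) -> dict:
    """Sketch of Steps 1-2: try Text2GQL first, then K-hop neighbors.
    Recall effect: 1 = A-Excellent, 0 = B-Normal, -1 = C-Missing."""
    gremlin = gen_gremlin(question)
    result = run_query(gremlin)
    if result:                        # Step 1: Text2GQL succeeded
        return {"graph_context": result, "recall_effect": 1}
    result = khop_fallback(question)  # Step 2: generalized K-hop query
    if result:
        return {"graph_context": result, "recall_effect": 0}
    return {"graph_context": None, "recall_effect": -1}

out = text2gql_step(
    "Who knows marko?",
    gen_gremlin=lambda q: "g.V().has('name','marko').out('knows')",
    run_query=lambda g: None,            # pretend the generated query found nothing
    khop_fallback=lambda q: ["vadas"],   # K-hop fallback found data
)
print(out["recall_effect"])  # 0
```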
- Step 2: Generalized query (K-hop neighbors, default 2 hops)
  - Graph/vector sorting weights configurable in mixed mode (default 1:1)
  - Priority sorting by "closest neighbor" (self > 1-hop > 2-hop > ...)
  - Custom vertex-label priority sorting (e.g., "IDC" > "Formula" > "Concept")
  - Compressed/deduplicated results; supports both Gremlin + REST API (In Progress)
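A minimal sketch of the Step 2 ranking rules: closest neighbor first, then a custom vertex-label priority. The label priorities reuse the example from the text; the record shape (`hops`, `label`) is an assumption.

```python
# Rank K-hop results: fewer hops first (self > 1-hop > 2-hop), then by a custom
# vertex-label priority. Lower number = higher priority; labels are the text's example.
LABEL_PRIORITY = {"IDC": 0, "Formula": 1, "Concept": 2}

def rank_neighbors(items: list) -> list:
    return sorted(items, key=lambda v: (v["hops"], LABEL_PRIORITY.get(v["label"], 99)))

hits = [
    {"id": "c1",   "label": "Concept", "hops": 1},
    {"id": "f1",   "label": "Formula", "hops": 1},
    {"id": "self", "label": "IDC",     "hops": 0},
    {"id": "i2",   "label": "IDC",     "hops": 2},
]
print([v["id"] for v in rank_neighbors(hits)])  # ['self', 'f1', 'c1', 'i2']
```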
- Step 3: Advanced options for different business customizations (optional)
  - Built-in `OLTP` algorithms: e.g., "All Paths/Custom Paths", "Shortest Path", "Jaccard Similarity", "Cycle Detection", "Common Neighbors", "Subgraph Query", etc.
  - Built-in `OLAP` algorithms: e.g., "Community Detection/Clustering", "PageRank/PPR", "Triangle Counting", "Centrality/Closeness", etc. (provides "overview" knowledge, like MS GraphRAG's Global Search)
  - Custom APIs/queries for specific scenarios:
    - e.g., adding specific algorithms, or direct intent recognition for common/complex scenarios (reduces uncertainty)
    - Combining Gremlin + RESTful API algorithms in mixed use (avoids the difficulty of writing complete Gremlin for complex scenarios)
    - Future support for intent recognition to automatically determine which algorithm to call (see below)
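HugeGraph exposes its built-in OLTP traversers over REST, which is how the "Gremlin + RESTful API" mixed use above would reach them. The helper below only builds a shortest-path traverser URL; the endpoint shape follows HugeGraph's RESTful API documentation, but verify the path and parameters against your server version.

```python
from urllib.parse import urlencode

def shortest_path_url(base: str, graph: str, source: str, target: str,
                      max_depth: int = 3) -> str:
    """Build a HugeGraph shortest-path traverser URL.
    Endpoint shape per HugeGraph's RESTful API docs; check your server version."""
    # HugeGraph expects vertex ids quoted inside the query string, e.g. source="1:marko"
    qs = urlencode({"source": f'"{source}"', "target": f'"{target}"',
                    "max_depth": max_depth})
    return f"{base}/graphs/{graph}/traversers/shortestpath?{qs}"

url = shortest_path_url("http://127.0.0.1:8080/apis", "hugegraph", "1:marko", "2:ripple")
print(url)
```

A client would GET this URL and merge the returned path vertices into `graph_context` alongside Gremlin results.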
Note:
- The above Graph Query/GraphRAG can expose a `RESTful API` for direct user calls (an SDK may be provided in the future)
- If you already have RAG/vector/ReRank services, you can integrate the `Graph_Only` data return (without LLM answers) to connect easily with existing pipelines
- Supports uploading Excel for batch retesting/effect comparison (e.g., comparing `Original LLM -> Vector Only -> Graph Only -> Vector + Graph` mixes)
  - Automated scoring to follow (In Progress)
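The batch retest idea can be sketched as a small harness that runs each question through the four recall modes and writes a side-by-side table for manual comparison. `answer_fn` and the CSV layout are illustrative; the real feature takes Excel uploads.

```python
# Run each question through several recall modes and collect answers side by side.
# The mode names come from the roadmap text; everything else is a placeholder.
import csv, io

MODES = ["Original LLM", "Vector Only", "Graph Only", "Vector + Graph"]

def batch_retest(questions, answer_fn):
    """answer_fn(question, mode) -> answer string; returns CSV text for review."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["question"] + MODES)
    for q in questions:
        writer.writerow([q] + [answer_fn(q, m) for m in MODES])
    return buf.getvalue()

csv_text = batch_retest(["What is HugeGraph?"], lambda q, m: f"[{m}] stub answer")
print(csv_text.splitlines()[0])  # question,Original LLM,Vector Only,Graph Only,Vector + Graph
```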
- Supports independent configuration of chat + embedding + rerank models
- Chat/embedding supports mainstream models such as OpenAI/Qianfan/Ollama with one-click switching; rerank supports offline/online modes
- Visualization currently relies on separate Hubble (will integrate RAG visualization later, see below)
- Adding automatic extraction of a `chunk/mirror` graph, storing original text-segment information and associating it with the business graph; see the design reference diagrams (core)
- `chunk_graph` can effectively solve the problem of recalling original references from the graph in complex questions (e.g., citations of `formulas/laws/judgments`)
  - Also builds a linked/contextual vector graph (enhances/replaces part of the vector library)
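A minimal sketch of the chunk/mirror-graph idea: each text segment becomes a chunk vertex linked to the business-graph entities extracted from it, so graph recall can trace back to the original source text. The labels (`chunk`, `mentions`) and the dict shapes are illustrative assumptions.

```python
# Build a chunk graph: one vertex per text segment, edges to mentioned entities.
# Label names and record shapes are illustrative, not the project's actual design.
def build_chunk_graph(chunks, extract_entities):
    vertices, edges = [], []
    for i, text in enumerate(chunks):
        chunk_id = f"chunk:{i}"
        vertices.append({"id": chunk_id, "label": "chunk", "text": text})
        for entity in extract_entities(text):  # link chunk -> business-graph entity
            edges.append({"label": "mentions", "source": chunk_id, "target": entity})
    return {"vertices": vertices, "edges": edges}

g = build_chunk_graph(["HugeGraph supports Gremlin."],
                      lambda t: ["HugeGraph", "Gremlin"])
print(len(g["edges"]))  # 2
```

Answering a question then only requires following `mentions` edges back from recalled entities to retrieve the verbatim source chunks (e.g., the exact wording of a formula or legal clause).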
- Semi-automatic schema generation: automatically generate candidate schema references from user-provided text, which the user can then refine
- Semi-automatic prompt generation: Automatically generate different prompt templates based on user-provided "background description"
- Graph extraction adds graph algorithm-assisted optimization (similar to the diagram below)
- Other solutions will be introduced based on actual effect/ROI, following industry and academia + focus on implementation
- Plan to integrate an `Agentic RAG`-like approach, aiming for a practical, simple CoT multi-round thinking simulation (avoiding reinventing the wheel; research reference: Agentic Framework Simple Research; we currently consider using `CrewAI` or `LlamaIndex`)
- Add "intent recognition", turning it into a Graph Agent that automatically determines the appropriate choice (i.e., automatically deciding the question's execution path ↓)
- Refactor to introduce a new pipeline framework with flexible scheduling of combinable operators (serial/parallel composition), allowing users to DIY as much as possible
- Consider supporting the MCP protocol
- `Text2GQL` localization fine-tuning + significant enhancement of effect and latency
  - Gradually replace all models with local versions, facilitating FT and reducing latency/costs (core)
  - Strategies include introducing templates + FT for optimization
  - The training-set source is key; we will use a "small amount of manual data + AST syntax-tree automatic generation" approach, roughly as shown below
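The "small manual seed + automatic generation" idea can be sketched as expanding a few hand-written (question, Gremlin) templates over slot values. A real pipeline would generate from a Gremlin grammar/AST; this toy shows only the template half, with made-up template contents.

```python
# Expand manually written (question, gremlin) template pairs over slot values to
# mass-produce Text2GQL training examples. Templates and names are placeholders.
import itertools

TEMPLATES = [
    ("Who does {p} work with?", "g.V().has('person','name','{p}').both('works_with')"),
    ("What did {p} create?",    "g.V().has('person','name','{p}').out('created')"),
]
NAMES = ["marko", "josh"]

def expand():
    pairs = []
    for (q_tpl, g_tpl), name in itertools.product(TEMPLATES, NAMES):
        pairs.append({"question": q_tpl.format(p=name),
                      "gremlin": g_tpl.format(p=name)})
    return pairs

print(len(expand()))  # 4 examples from 2 templates x 2 names
```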
- Make the entire GraphRAG query `pipeline`-based, making it easier for users to customize selections and run steps in parallel
  - Users can choose to terminate early
  - Users can flexibly choose any combination of steps
  - Provide an SDK for user calls
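A minimal sketch of such a pipeline: operators run in order over a shared context, and any operator can set an early-stop flag to skip the remaining steps. The class and operator names are illustrative, not the project's API.

```python
# Toy pipeline: composable operators with early termination, mirroring the
# "choose any step combination / terminate early" goals. Names are placeholders.
class QueryPipeline:
    def __init__(self):
        self.ops = []

    def add(self, op):
        self.ops.append(op)
        return self  # allow chaining

    def run(self, context: dict) -> dict:
        for op in self.ops:
            context = op(context)
            if context.get("early_stop"):  # an operator ended the pipeline early
                break
        return context

def template_match(ctx):
    if "shortest path" in ctx["question"]:
        ctx.update(answer="<template result>", early_stop=True)
    return ctx

def khop_query(ctx):
    ctx.setdefault("answer", "<k-hop result>")
    return ctx

pipe = QueryPipeline().add(template_match).add(khop_query)
print(pipe.run({"question": "shortest path from A to B"})["answer"])  # <template result>
```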
- (Optional) Custom support for different business choices
  - We may add a subgraph of community-detection results by default (linked with the current original graph, similar to Microsoft GraphRAG; optional, ROI not necessarily high)
  - Users (or we) can help select query templates for fixed scenarios
    - Combined with our intent recognition at priority P0 (highest); if matched successfully, the query can end early (via an `early_stop` parameter)
    - If not matched, follow the default Step 1 -> Step 2 -> Step 3...
  - Further, we will provide a `function call` mode
  - Later iterations will turn it into a `Graph Agent` form (automatically scheduling multiple graph components/algorithms)
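The planned `function call` mode can be sketched as a tool registry plus a dispatcher: intent recognition maps a question to one registered graph tool, which a future Graph Agent could schedule automatically. The registry, keyword matching, and tool names are simplified placeholders.

```python
# Toy function-call dispatch: register graph tools, pick one by keyword intent.
# A real implementation would use LLM-based intent recognition, not keywords.
TOOLS = {}

def tool(name, keywords):
    def register(fn):
        TOOLS[name] = (keywords, fn)
        return fn
    return register

@tool("shortest_path", ["shortest", "path"])
def shortest_path(q):
    return f"shortest_path({q})"

@tool("k_hop", ["neighbor", "related"])
def k_hop(q):
    return f"k_hop({q})"

def dispatch(question):
    for name, (keywords, fn) in TOOLS.items():
        if any(k in question.lower() for k in keywords):
            return fn(question)     # matched intent: call the tool, stop early
    return k_hop(question)          # default fallback: generalized query

print(dispatch("Find the shortest path between A and B").startswith("shortest_path"))  # True
```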
- Introducing a `RAGAS`-like framework for scoring; it currently works well for English and needs adjustment for Chinese (In Progress)
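To make the scoring idea concrete, here is a deliberately crude proxy: a token-overlap "faithfulness" score measuring how much of an answer is supported by the retrieved context. Real RAGAS-style evaluation uses LLM-based judgments; this toy only illustrates the metric shape.

```python
# Toy faithfulness proxy: fraction of answer tokens present in the retrieved
# context. Real RAGAS metrics are LLM-judged; this is only a shape illustration.
def token_support(answer: str, context: str) -> float:
    a = set(answer.lower().split())
    c = set(context.lower().split())
    return len(a & c) / len(a) if a else 0.0

score = token_support("HugeGraph supports Gremlin",
                      "HugeGraph supports Gremlin and REST")
print(round(score, 2))  # 1.0
```

Whitespace tokenization is also one reason such metrics need adjustment for Chinese, where word boundaries are not marked by spaces.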
- By default, gradually replace all models with local versions, facilitating FT and reducing latency/costs (core)
  - `Text2GQL` localization mentioned above
  - `embedding/tokenizer/rerank` will be gradually replaced with local models to ensure cost/performance control
  - Finally, consider providing graph-extraction models to achieve full model localization/FT (optional)
- Enhanced GraphRAG visualization: View graph/vector-related visualization results directly on the RAG platform (no need to switch to Hubble page)
- Support multi-turn dialogue/history records and other basic RAG functions (reuse industry RAG Frame design as much as possible)
- Support multi-user/tenant resource isolation concepts, including Audit Log, etc. (scheduled based on demand)
- Support more Vector options (including disk/single machine/distributed) and consider built-in vector support in Graph (adjust based on demand)
- …..
Specific requirement details are broken down in "RAG Features/Requirements Progress (Overall)", which summarizes the overall technical picture.
Good articles/materials can be added to the references section below. For urgent additions, directly @-mention the reference sub-documents~
Documentation license here.




