
HugeGraph LLM Roadmap

imbajin edited this page Mar 10, 2025 · 6 revisions

Background

The architecture diagram follows the HG-RAG Technical Architecture; for the earlier categorized structure diagram, see Legacy-HG-llm-design.

GraphRAG (Current Status):

1. Knowledge Extraction: (Core 1)

  • Supports traditional "Triple + Disambiguation + LLM" approach (limited use)
  • Supports "Property Graph + LLM" extraction (main choice)
  • Graph Schema supports two modes:
    • schema-free: Unspecified, all nodes and edges use generic labels [e.g., "link"/"entity"] (poor results, not recommended)
    • schema-defined: Manual schema construction (recommended, good results)
  • Prompts use fixed templates with manual adjustments (mainly adding examples, good results, high ROI)
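As a minimal illustration of the two schema modes, the dicts below contrast the generic labels of schema-free mode with a manually defined schema (the field names here are hypothetical, for illustration only, not the actual hugegraph-llm configuration format):

```python
# schema-free: every node/edge falls back to generic labels
# (poor results, not recommended).
schema_free = {
    "vertexlabels": [{"name": "entity"}],
    "edgelabels": [{"name": "link"}],
}

# schema-defined: labels and properties modeled up front
# (recommended, good results).
schema_defined = {
    "vertexlabels": [
        {"name": "person", "properties": ["name", "age"]},
        {"name": "company", "properties": ["name"]},
    ],
    "edgelabels": [
        {"name": "works_at", "source": "person", "target": "company"},
    ],
}
```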

2. Knowledge Fusion/Disambiguation

  • Optional NLP fusion/disambiguation for triples, disabled by default
  • Not yet available for property graphs, considering graph algorithms later (see below)

3. Intent Recognition/Query Rewriting

  • Basic query enhancement/rewriting before tokenization (no plans for complex rewriting)
  • Intent recognition pending (in planning, see below)

4. Graph Query (Core 2)

  1. Step1: Text2GQL (Gremlin) first tries to convert the user's question into a graph query, which offers the best effect/cost/performance when the generated query is correct (priority)

    1. Supports embedded simple/complex templates for common queries (includes default Template)
    2. If query successful, returns graph_context with recall effect 1 (A-Excellent)
    3. If query fails/empty, proceeds to step 2 ↓, falls back to generalized query (K-hop neighbors)
      • If results found, returns graph_context with recall effect 0 (B-Normal)
      • If no graph data found, returns recall effect -1 (C-Missing)
  2. Step2: Generalized Query (K-hop neighbors, default 2 hops)

    1. Graph/Vector sorting weight settings in mixed mode (default 1:1)
    2. Priority sorting using "closest neighbor" (self > 1-hop > 2-hop...)
    3. Custom vertex label priority sorting (e.g., "IDC" > "Formula" > "Concept")
    4. Compressed/deduplicated results, supports both Gremlin + Rest-API (In Progress)
  3. Step3: Advanced options for different business customizations (optional)

    1. Built-in OLTP algorithms: e.g., "All Paths/Custom Paths", "Shortest Path", "Jaccard Similarity", "Cycle Detection", "Common Neighbors", "Subgraph Query", etc.
    2. Built-in OLAP algorithms: e.g., "Community Detection/Clustering", "PageRank/PPR", "Triangle Counting", "Centrality/Closeness", etc. (provides "overview" knowledge, like MS-Global Search)
    3. Custom APIs/queries for specific scenarios:
      • e.g., adding specific algorithms, direct intent recognition for common/complex scenarios (reduces uncertainty)
      • Combine Gremlin + RESTful API algorithms for mixed use (avoids difficulty in writing complete Gremlin for complex scenarios)
    4. Future support for intent recognition to automatically determine which algorithm to call (see below)
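The Step 1 → Step 2 fallback above can be sketched as a small routing function. The two query callables are injected stubs for illustration, not real hugegraph-llm APIs; the return codes mirror the recall effects described above (1 = A-Excellent, 0 = B-Normal, -1 = C-Missing):

```python
def graph_query(question, text2gql, khop_query, k=2):
    """Try a precise Text2GQL query first, then fall back to k-hop neighbors."""
    context = text2gql(question)          # Step 1: precise Gremlin query
    if context:
        return context, 1                 # recall effect 1 (A-Excellent)
    context = khop_query(question, k)     # Step 2: generalized k-hop (default 2)
    if context:
        return context, 0                 # recall effect 0 (B-Normal)
    return [], -1                         # recall effect -1 (C-Missing)
```

For example, with a Text2GQL stub that fails and a k-hop stub that finds neighbors, `graph_query(q, lambda q: [], lambda q, k: ["v1", "v2"])` returns the neighbor context with recall effect 0.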

Note:

  • The above Graph Query/GraphRAG can provide RESTful API for direct user calls (considering providing an SDK in the future)
  • If you already have RAG/vector/ReRank services, you can integrate the Graph_Only data return (without LLM answers) to connect easily with existing pipelines
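A sketch of calling such a RESTful API in "graph only" mode, so an existing RAG/rerank service consumes raw graph context without an LLM answer. The endpoint path and payload field names below are assumptions for illustration, not the documented API:

```python
def build_graph_only_request(question, base_url="http://127.0.0.1:8001"):
    """Build a request dict for a graph-only GraphRAG call (hypothetical API)."""
    return {
        "url": f"{base_url}/rag",          # hypothetical endpoint path
        "json": {
            "query": question,
            "graph_only_answer": True,     # return graph_context, skip the LLM
            "vector_only_answer": False,
        },
    }

# e.g. requests.post(**build_graph_only_request("who founded X?"))
```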

5. RAG Effectiveness Retesting

  • Supports uploading an Excel file for batch retesting/effect comparison (e.g., comparing Original LLM -> Vector Only -> Graph Only -> Vector + Graph mix)
  • Automated scoring to follow (In Progress)
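The batch comparison boils down to aggregating per-question scores under each mode and ranking the modes. The helper below is a placeholder sketch (the real flow uploads an Excel file and the scoring method is still in progress):

```python
def compare_modes(results):
    """results: {mode: [score, ...]} -> {mode: mean score}, best mode first."""
    means = {mode: sum(s) / len(s) for mode, s in results.items() if s}
    return dict(sorted(means.items(), key=lambda kv: kv[1], reverse=True))

# e.g. compare_modes({"original_llm": [...], "vector_only": [...],
#                     "graph_only": [...], "vector_graph_mix": [...]})
```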

6. Other Features/Supplementary Information (Misc)

  • Supports independent configuration of chat + embedding + rerank models
  • Chat/embedding supports mainstream models such as OpenAI/Qianfan/Ollama with one-click switching; rerank supports offline/online modes
  • Visualization currently relies on separate Hubble (will integrate RAG visualization later, see below)

GraphRAG Planning (In Progress + Future)

1. Knowledge Extraction: (Core 1)

  • Adding automatic extraction of a chunk/mirror graph that stores original text-segment information and associates it with the business graph; see the design reference diagrams (core)
  • chunk_graph can effectively solve the problem of recalling the original graph references in complex questions (e.g., citations of formulas/laws/judgments)
  • Also builds a linked/contextual vector graph (enhances/replaces part of the vector library)
  • Semi-automatic schema generation: Automatically generate possible schema references based on user input (text), then modify
  • Semi-automatic prompt generation: Automatically generate different prompt templates based on user-provided "background description"

2. Knowledge Fusion/Disambiguation

  • Graph extraction adds graph-algorithm-assisted optimization (see the reference diagram)
  • Other solutions will be introduced based on actual effect/ROI, following industry and academia + focus on implementation

3. Intent Recognition/Query Rewriting (Graph Agent - Core 2)

  • Plan to integrate an Agentic-RAG-style approach, aiming for a practical, simple CoT multi-round reasoning simulation (avoiding reinventing the wheel; research reference: Agentic Framework Simple Research -> we currently consider using CrewAI or LlamaIndex): agentic-rag

  • Add "intent recognition", turning this into a Graph Agent that automatically determines the appropriate choice (in effect, automatically deciding a question's execution path ↓)

  • Refactor to introduce a new pipeline framework with flexible scheduling of combinable operators (serial/parallel combinations), letting users DIY as much as possible

  • Consider supporting MCP protocol
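The planned pipeline framework could be sketched as operators (callables over a shared context dict) combined serially or in parallel. The operator shapes and combinators below are illustrative assumptions, not the actual refactor design:

```python
from concurrent.futures import ThreadPoolExecutor

def serial(*ops):
    """Run operators one after another, threading the context through."""
    def run(ctx):
        for op in ops:
            ctx = op(ctx)
        return ctx
    return run

def parallel(*ops):
    """Run operators concurrently on copies of the context, then merge results."""
    def run(ctx):
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(lambda op: op(dict(ctx)), ops))
        merged = dict(ctx)
        for r in results:
            merged.update(r)
        return merged
    return run
```

A user-composed pipeline might then read `serial(parallel(vector_search, graph_search), rerank)`, matching the serial/parallel combination idea above.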

4. Graph Query (Core 1)

  1. Text2GQL localization fine-tuning, significantly improving effect and latency

    1. Gradually replace all models with local versions, facilitating FT, reducing latency/costs, etc. (core)
    2. Strategies include introducing templates + FT for optimization
    3. Training set sourcing is key; we will use a "small amount of manual data + AST syntax-tree automatic generation" approach to generate the training set
  2. Make the entire GraphRAG Query pipeline-based, making it easier for users to customize step selection and run steps in parallel

    1. Users can choose to terminate early
    2. Users can flexibly choose any step combination
    3. Provide SDK for user calls
  3. (Optional) Custom support for different business choices

    1. We may add a subgraph of community-detection results by default (linked with the current original graph, similar to Microsoft GraphRAG; optional, and the ROI is not necessarily high)
    2. Users/we can assist in selecting some query templates for fixed scenarios
      • Combined with our intent recognition at priority P0 (highest); if matched successfully, execution can end early (via an early_stop parameter)
      • If not matched, follow default step1 -> step2 -> step3...
    3. Further, we will provide a function call mode
    4. Later iterations will turn it into a Graph Agent form (automatically scheduling multiple graph components/algorithms)
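The P0 template match with early stop described above can be sketched as a simple router: if a fixed template matches, answer immediately; otherwise fall through to the default step1 -> step2 -> step3 chain. The matcher/runner shapes are illustrative assumptions:

```python
def route(question, templates, default_pipeline):
    """templates: list of (matcher, runner) pairs tried at P0 priority."""
    for matcher, runner in templates:
        if matcher(question):
            return runner(question)       # early stop: template matched
    return default_pipeline(question)     # fall back to step1 -> step2 -> step3
```

For example, a fixed "shortest path" template would be tried before any generalized query, so common scenarios avoid the full pipeline entirely.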

5. RAG Effectiveness Retesting

  • Introducing a RAGAS-like framework for scoring; it currently works well for English but needs adjustment for Chinese (In Progress)

6. Other Features (Misc)

  • By default, gradually replace all models with local versions, facilitating FT, reducing latency/costs, etc. (core)
    • Mentioned Text2GQL localization (see above)
    • embedding/tokenizer/rerank will gradually be replaced with local models to keep cost/performance under control
    • Finally consider providing graph extraction models, achieving full model localization/FT (optional)
  • Enhanced GraphRAG visualization: View graph/vector-related visualization results directly on the RAG platform (no need to switch to Hubble page)
  • Support multi-turn dialogue/history records and other basic RAG functions (reuse industry RAG Frame design as much as possible)
  • Support multi-user/tenant resource isolation concepts, including Audit Log, etc. (scheduled based on demand)
  • Support more Vector options (including disk/single machine/distributed) and consider built-in vector support in Graph (adjust based on demand)
  • …..

Specific details of the requirements are broken down in RAG Features/Requirements Progress (Overall); this page summarizes the overall technical picture

References:

Good articles/materials can be added to the references section below; for urgent additions, @ the relevant reference sub-documents directly~