Merged
2 changes: 2 additions & 0 deletions .gitignore
@@ -91,6 +91,8 @@ hs_err_pid*
hugegraph-server/hugegraph-dist/docker/data/

# AI-IDE prompt files (we only keep AGENTS.md; other files can be soft-linked to it when needed)
# Serena MCP memories
.serena/
# Claude Projects
CLAUDE.md
CLAUDE_*.md
1 change: 1 addition & 0 deletions .licenserc.yaml
@@ -59,6 +59,7 @@ header: # `header` section is configurations for source codes license header.
- 'LICENSE'
- 'NOTICE'
- 'DISCLAIMER'
- '.serena/**'
- '**/*.versionsBackup'
- '**/*.versionsBackup'
- '**/*.proto'
1 change: 1 addition & 0 deletions .serena/.gitignore
@@ -0,0 +1 @@
/cache
99 changes: 99 additions & 0 deletions .serena/memories/architecture_and_modules.md
@@ -0,0 +1,99 @@
# Architecture and Module Structure

## Three-Tier Architecture

### 1. Client Layer
- Gremlin/Cypher query interfaces
- REST API endpoints
- Multiple client language bindings

### 2. Server Layer (hugegraph-server)
- **REST API Layer** (hugegraph-api): GraphAPI, SchemaAPI, GremlinAPI, CypherAPI, AuthAPI
- **Graph Engine Layer** (hugegraph-core): Schema management, traversal optimization, task scheduling
- **Backend Interface**: Abstraction over storage backends

### 3. Storage Layer
- Pluggable backend implementations
- Each backend extends `hugegraph-core` abstractions
- Implements `BackendStore` interface

## Multi-Module Structure

The project consists of 7 main modules:

### 1. hugegraph-server (13 submodules)
Core graph engine, REST APIs, and backend implementations:
- `hugegraph-core` - Core graph engine and abstractions
- `hugegraph-api` - REST API implementations (includes OpenCypher in `opencypher/`)
- `hugegraph-dist` - Distribution packaging and scripts
- `hugegraph-test` - Test suites (unit, core, API, TinkerPop)
- `hugegraph-example` - Example code
- Backend implementations:
  - `hugegraph-rocksdb` (default)
  - `hugegraph-hstore` (distributed)
  - `hugegraph-hbase`
  - `hugegraph-mysql`
  - `hugegraph-postgresql`
  - `hugegraph-cassandra`
  - `hugegraph-scylladb`
  - `hugegraph-palo`

### 2. hugegraph-pd (8 submodules)
Placement Driver for distributed deployments (meta server):
- `hg-pd-core` - Core PD logic
- `hg-pd-service` - PD service implementation
- `hg-pd-client` - Client library
- `hg-pd-common` - Shared utilities
- `hg-pd-grpc` - gRPC protocol definitions (auto-generated)
- `hg-pd-cli` - Command line interface
- `hg-pd-dist` - Distribution packaging
- `hg-pd-test` - Test suite

### 3. hugegraph-store (9 submodules)
Distributed storage backend with RocksDB and Raft:
- `hg-store-core` - Core storage logic
- `hg-store-node` - Storage node implementation
- `hg-store-client` - Client library
- `hg-store-common` - Shared utilities
- `hg-store-grpc` - gRPC protocol definitions (auto-generated)
- `hg-store-rocksdb` - RocksDB integration
- `hg-store-cli` - Command line interface
- `hg-store-dist` - Distribution packaging
- `hg-store-test` - Test suite

### 4. hugegraph-commons
Shared utilities across modules:
- Locks and concurrency utilities
- Configuration management
- RPC framework components

### 5. hugegraph-struct
Data structure definitions shared between modules.
**Important**: Must be built before PD and Store modules.

### 6. install-dist
Distribution packaging and release management:
- License and NOTICE files
- Dependency management scripts
- Release documentation

### 7. hugegraph-cluster-test
Cluster integration tests for distributed deployments

## Cross-Module Dependencies

```
hugegraph-commons → (shared by all modules)
hugegraph-struct → hugegraph-pd + hugegraph-store
hugegraph-core → (extended by all backend implementations)
```

## Distributed Architecture (Optional)

For production distributed deployments:
- **hugegraph-pd**: Service discovery, partition management, metadata
- **hugegraph-store**: Distributed storage with Raft (3+ nodes)
- **hugegraph-server**: Multiple server instances (3+)
- Communication: All use gRPC with Protocol Buffers

**Status**: Distributed components (PD + Store) are in BETA
92 changes: 92 additions & 0 deletions .serena/memories/code_style_and_conventions.md
@@ -0,0 +1,92 @@
# Code Style and Conventions

## Code Style Configuration
- **Import**: Use `hugegraph-style.xml` in your IDE (IntelliJ IDEA recommended)
- **EditorConfig**: `.editorconfig` file defines style rules (validated in CI)
- **Checkstyle**: `style/checkstyle.xml` defines additional rules

## Core Style Rules (from .editorconfig)

### General
- Charset: UTF-8
- End of line: LF (Unix-style)
- Insert final newline: true
- Max line length: 100 characters (120 for XML)
- Visual guides at column 100

### Java Files
- Indent: 4 spaces (not tabs)
- Continuation indent: 8 spaces
- Wrap on typing: true
- Wrap long lines: true

### Import Organization
```
$*
|
java.**
|
javax.**
|
org.**
|
com.**
|
*
```
- Class count to use import on demand: 100
- Names count to use import on demand: 100
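
As an illustration of the layout above, a file that draws on several groups might order its imports as follows. This is a minimal sketch that uses only JDK classes so it compiles on its own; the class `ImportLayoutExample` and its method are hypothetical and not part of HugeGraph.

```java
import static java.util.Objects.requireNonNull;

import java.util.List;

import javax.crypto.spec.SecretKeySpec;

import org.w3c.dom.Node;

import com.sun.net.httpserver.HttpServer;

/**
 * Hypothetical class (not part of HugeGraph) showing the import groups:
 * static imports, java.**, javax.**, org.**, com.**.
 * Any remaining imports would follow as a final group.
 */
final class ImportLayoutExample {

    // Lists one imported type per group, in the order the groups appear above
    static List<String> importedTypes() {
        // requireNonNull is called only to exercise the static import
        return List.of(requireNonNull(List.class.getName()),
                       SecretKeySpec.class.getName(),
                       Node.class.getName(),
                       HttpServer.class.getName());
    }
}
```

Each `|` in the layout corresponds to one blank line between groups, and `$*` stands for static imports.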

### Formatting Rules
- Line comments not at first column
- Align multiline: chained methods, parameters in calls, binary operations, assignments, ternary, throws, extends, array initializers
- Wrapping: normal (wrap if necessary)
- Brace forcing:
  - if: if_multiline
  - do-while: always
  - while: if_multiline
  - for: if_multiline
- Enum constants: split_into_lines

### Blank Lines
- Max blank lines in declarations: 1
- Max blank lines in code: 1
- Blank lines between package declaration and header: 1
- Blank lines before right brace: 1
- Blank lines around class: 1
- Blank lines after class header: 1

### Documentation
- Add `<p>` tag on empty lines: true
- Do not wrap if one line: true
- Align multiline annotation parameters: true

### XML Files
- Indent: 4 spaces
- Max line length: 120
- Text wrap: off
- Space inside empty tag: true

### Maven
- Compiler source/target: Java 11
- Max compiler errors: 500
- Compiler args: `-Xlint:unchecked`
- Source encoding: UTF-8

## Lombok Usage
- Version: 1.18.30
- Scope: provided
- Optional: true

## License Headers
- All source files MUST include Apache Software License header
- Validated by apache-rat-plugin and skywalking-eyes
- Exclusions defined in pom.xml (lines 171-221)
- gRPC generated code excluded from license check

## Naming Conventions
- Package names: lowercase, dot-separated (e.g., org.apache.hugegraph)
- Class names: PascalCase
- Method names: camelCase
- Constants: UPPER_SNAKE_CASE
- Variables: camelCase
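
The naming rules above can be seen together in one small class. This is a sketch only: `VertexLabelCache`, its members, and the package named in the comment are hypothetical names chosen for illustration, not code from the repository.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Package names are lowercase and dot-separated, for example:
// package org.apache.hugegraph.example;   (hypothetical package)

/** Class names use PascalCase. */
final class VertexLabelCache {

    // Constants use UPPER_SNAKE_CASE
    private static final int MAX_CACHE_SIZE = 2;

    // Variables (fields, parameters, locals) use camelCase
    private final Map<String, Long> labelIds = new LinkedHashMap<>();

    // Method names use camelCase; returns false once the cache is full
    boolean addLabel(String labelName, long labelId) {
        if (this.labelIds.size() >= MAX_CACHE_SIZE) {
            return false;
        }
        this.labelIds.put(labelName, labelId);
        return true;
    }

    int labelCount() {
        return this.labelIds.size();
    }
}
```

The sketch also follows the 4-space indent and 100-character line limit listed under Core Style Rules.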
63 changes: 63 additions & 0 deletions .serena/memories/ecosystem_and_related_projects.md
@@ -0,0 +1,63 @@
# HugeGraph Ecosystem and Related Projects

## Core Repository (This Project)
**Repository**: apache/hugegraph (server)
**Purpose**: Core graph database engine (OLTP)

## Related Repositories

### 1. hugegraph-toolchain
**Repository**: https://github.com/apache/hugegraph-toolchain
**Components**:
- **hugegraph-loader**: Bulk data loading tool
- **hugegraph-hubble**: Web-based visualization dashboard
- **hugegraph-tools**: Command-line utilities
- **hugegraph-client**: Java client SDK

### 2. hugegraph-computer
**Repository**: https://github.com/apache/hugegraph-computer
**Purpose**: Distributed graph computing framework (OLAP)
**Features**: PageRank, Connected Components, Shortest Path, Community Detection

### 3. hugegraph-ai
**Repository**: https://github.com/apache/incubator-hugegraph-ai
**Purpose**: Graph AI, LLM, and Knowledge Graph integration
**Features**: Graph-enhanced LLM, KG construction, Graph RAG, NL to Gremlin/Cypher

### 4. hugegraph-website
**Repository**: https://github.com/apache/hugegraph-doc
**Purpose**: Official documentation and website
**URL**: https://hugegraph.apache.org/

## Integration Points

### Data Pipeline
```
Data Sources → hugegraph-loader → hugegraph-server
          ┌───────────────────┼───────────────────┐
          ↓                   ↓                   ↓
  hugegraph-hubble   hugegraph-computer     hugegraph-ai
   (Visualization)       (Analytics)           (AI/ML)
```

## External Integrations

### Big Data Platforms
- Apache Flink, Apache Spark, HDFS

### Storage Backends
- RocksDB (default), HBase, Cassandra, ScyllaDB, MySQL, PostgreSQL

### Query Languages
- Gremlin (Apache TinkerPop), Cypher (OpenCypher), REST API

## Version Compatibility
- Server: 1.7.0
- TinkerPop: 3.5.1
- Java: 11+ required

## Use Cases
- Social networks, Fraud detection, Recommendation systems
- Knowledge graphs, Network analysis, Supply chain management
- IT operations, Bioinformatics
104 changes: 104 additions & 0 deletions .serena/memories/implementation_patterns_and_guidelines.md
@@ -0,0 +1,104 @@
# Implementation Patterns and Guidelines

## Backend Development

### Backend Architecture Pattern
- All backends extend abstractions from `hugegraph-server/hugegraph-core`
- Implement the `BackendStore` interface
- Each backend is a separate Maven module under `hugegraph-server/`
- Backend selection configured in `hugegraph.properties` via `backend` property
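
The selection step can be pictured roughly as below. This is a simplified sketch, not HugeGraph's actual code: `SimpleStore`, `NamedStore`, and `BackendSelectionSketch` are hypothetical stand-ins, and the real `BackendStore` interface is not reproduced here. Only the `backend` property name and the RocksDB default come from the notes above; falling back to `rocksdb` when the property is missing is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.Supplier;

/** Hypothetical, heavily simplified stand-in for a storage abstraction. */
interface SimpleStore {
    String name();
}

/** Hypothetical store that only remembers which backend it represents. */
final class NamedStore implements SimpleStore {

    private final String name;

    NamedStore(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return this.name;
    }
}

final class BackendSelectionSketch {

    // Hypothetical registry: backend name -> factory for its store
    private static final Map<String, Supplier<SimpleStore>> PROVIDERS = new HashMap<>();

    static {
        PROVIDERS.put("rocksdb", () -> new NamedStore("rocksdb"));
        PROVIDERS.put("hstore", () -> new NamedStore("hstore"));
    }

    // Reads the `backend` property and creates the matching store;
    // defaulting to "rocksdb" when the property is absent is an assumption
    static SimpleStore open(Properties config) {
        String backend = config.getProperty("backend", "rocksdb");
        Supplier<SimpleStore> provider = PROVIDERS.get(backend);
        if (provider == null) {
            throw new IllegalArgumentException("Unknown backend: " + backend);
        }
        return provider.get();
    }
}
```

Keeping each backend behind a shared interface is what lets a configuration value, rather than a code change, decide which storage module is used.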

### Available Backends
- **RocksDB** (default, embedded): `hugegraph-rocksdb`
- **HStore** (distributed, production): `hugegraph-hstore`
- **Legacy** (≤1.5.0): MySQL, PostgreSQL, Cassandra, ScyllaDB, HBase, Palo

### Backend Testing Profiles
- `memory`: In-memory backend for fast unit tests
- `rocksdb`: RocksDB for realistic local tests
- `hbase`: HBase for distributed scenarios
- `hstore`: HStore for production-like distributed tests

## gRPC Protocol Development

### Protocol Buffer Definitions
- PD protos: `hugegraph-pd/hg-pd-grpc/src/main/proto/`
- Store protos: `hugegraph-store/hg-store-grpc/src/main/proto/`

### Code Generation
When modifying `.proto` files:
1. Run `mvn clean compile` to regenerate gRPC stubs
2. Generated Java code goes to `*/grpc/` packages
3. Output location: `target/generated-sources/protobuf/`
4. Generated files excluded from Apache RAT checks
5. All inter-service communication uses gRPC

## Authentication System

### Default State
- Authentication **disabled by default**
- Enable via `bin/enable-auth.sh` or configuration
- **Required for production deployments**

### Implementation Location
`hugegraph-server/hugegraph-api/src/main/java/org/apache/hugegraph/api/auth/`

### Multi-Level Security Model
- Users, Groups, Projects, Targets, Access control

## TinkerPop Integration

### Compliance
- Full Apache TinkerPop 3 implementation
- Custom optimization strategies
- Supports both Gremlin and OpenCypher query languages

### Query Language Support
- **Gremlin**: Native via TinkerPop integration
- **OpenCypher**: Implementation in `hugegraph-api/opencypher/`

## Testing Patterns

### Test Suite Organization
- **UnitTestSuite**: Pure unit tests, no external dependencies
- **CoreTestSuite**: Core functionality tests with backend
- **ApiTestSuite**: REST API integration tests
- **StructureStandardTest**: TinkerPop structure compliance
- **ProcessStandardTest**: TinkerPop process compliance

### Backend Selection in Tests
Use Maven profiles:
```bash
-P core-test,memory # Fast in-memory
-P core-test,rocksdb # Persistent local
-P api-test,rocksdb # API with persistent backend
```

## Distribution and Packaging

### Creating Distribution
```bash
mvn clean package -DskipTests
```
Output: `install-dist/target/hugegraph-<version>.tar.gz`

## Code Organization

### Package Structure
```
org.apache.hugegraph
├── backend/ # Backend implementations
├── api/ # REST API endpoints
├── core/ # Core graph engine
├── schema/ # Schema definitions
├── traversal/ # Traversal and query processing
├── task/ # Background tasks
├── auth/ # Authentication/authorization
└── util/ # Utilities
```

### Module Dependencies
- Commons is shared by all modules
- Struct must be built before PD and Store
- Backend modules depend on core
- Test module depends on all server modules
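
The ordering constraints above amount to a topological sort over the module graph. The sketch below is an illustration under assumptions: the dependency map is a simplified reading of the list (four modules only, with Struct assumed to depend on Commons), and `ModuleBuildOrderSketch` is a hypothetical helper, not part of the build. In practice Maven's reactor works out this order itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ModuleBuildOrderSketch {

    // Module -> modules it depends on (simplified, assumed reading of the list above)
    static Map<String, List<String>> dependencies() {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("hugegraph-pd", List.of("hugegraph-commons", "hugegraph-struct"));
        deps.put("hugegraph-store", List.of("hugegraph-commons", "hugegraph-struct"));
        deps.put("hugegraph-struct", List.of("hugegraph-commons"));
        deps.put("hugegraph-commons", List.of());
        return deps;
    }

    // Depth-first topological sort: every module appears after its dependencies
    static List<String> buildOrder(Map<String, List<String>> deps) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String module : deps.keySet()) {
            visit(module, deps, ordered, new LinkedHashSet<>());
        }
        return new ArrayList<>(ordered);
    }

    private static void visit(String module, Map<String, List<String>> deps,
                              Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(module)) {
            return;
        }
        // A module seen again while still in progress means a cycle
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Dependency cycle at " + module);
        }
        for (String dependency : deps.getOrDefault(module, List.of())) {
            visit(dependency, deps, ordered, inProgress);
        }
        inProgress.remove(module);
        ordered.add(module);
    }
}
```

With this map, `buildOrder(dependencies())` yields Commons, then Struct, then PD and Store, which matches the rule that Struct must be built before PD and Store.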