Merged
1 change: 1 addition & 0 deletions .serena/.gitignore
@@ -0,0 +1 @@
/cache
99 changes: 99 additions & 0 deletions .serena/memories/architecture_and_modules.md
@@ -0,0 +1,99 @@
# Architecture and Module Structure

## Three-Tier Architecture

### 1. Client Layer
- Gremlin/Cypher query interfaces
- REST API endpoints
- Multiple client language bindings

### 2. Server Layer (hugegraph-server)
- **REST API Layer** (hugegraph-api): GraphAPI, SchemaAPI, GremlinAPI, CypherAPI, AuthAPI
- **Graph Engine Layer** (hugegraph-core): Schema management, traversal optimization, task scheduling
- **Backend Interface**: Abstraction over storage backends

### 3. Storage Layer
- Pluggable backend implementations
- Each backend extends `hugegraph-core` abstractions
- Implements `BackendStore` interface
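
To make the pluggable-backend idea concrete, here is a minimal sketch of a registry that maps a backend name to a factory. `SimpleStore`, `InMemoryStore`, and `StoreRegistry` are hypothetical names invented for this illustration; the real `BackendStore` interface in `hugegraph-core` is not reproduced here and has a different, larger surface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, heavily simplified stand-in for a storage abstraction.
// This is NOT the real BackendStore interface from hugegraph-core.
interface SimpleStore {
    String name();

    void put(String key, String value);

    String get(String key);
}

// Toy in-memory implementation, present only to make the sketch runnable.
class InMemoryStore implements SimpleStore {
    private final Map<String, String> data = new HashMap<>();

    @Override
    public String name() {
        return "memory";
    }

    @Override
    public void put(String key, String value) {
        this.data.put(key, value);
    }

    @Override
    public String get(String key) {
        return this.data.get(key);
    }
}

// Maps a backend name to a factory, so callers depend only on the
// abstraction and never on a concrete backend class.
class StoreRegistry {
    private static final Map<String, Supplier<SimpleStore>> FACTORIES = new HashMap<>();

    static void register(String name, Supplier<SimpleStore> factory) {
        FACTORIES.put(name, factory);
    }

    static SimpleStore open(String name) {
        Supplier<SimpleStore> factory = FACTORIES.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown backend: " + name);
        }
        return factory.get();
    }
}
```

A caller would register each implementation once, for example `StoreRegistry.register("memory", InMemoryStore::new)`, and then obtain a store with `StoreRegistry.open("memory")`.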

## Multi-Module Structure

The project consists of 7 main modules:

### 1. hugegraph-server (13 submodules)
Core graph engine, REST APIs, and backend implementations:
- `hugegraph-core` - Core graph engine and abstractions
- `hugegraph-api` - REST API implementations (includes OpenCypher in `opencypher/`)
- `hugegraph-dist` - Distribution packaging and scripts
- `hugegraph-test` - Test suites (unit, core, API, TinkerPop)
- `hugegraph-example` - Example code
- Backend implementations:
  - `hugegraph-rocksdb` (default)
  - `hugegraph-hstore` (distributed)
  - `hugegraph-hbase`
  - `hugegraph-mysql`
  - `hugegraph-postgresql`
  - `hugegraph-cassandra`
  - `hugegraph-scylladb`
  - `hugegraph-palo`

### 2. hugegraph-pd (8 submodules)
Placement Driver for distributed deployments (meta server):
- `hg-pd-core` - Core PD logic
- `hg-pd-service` - PD service implementation
- `hg-pd-client` - Client library
- `hg-pd-common` - Shared utilities
- `hg-pd-grpc` - gRPC protocol definitions (auto-generated)
- `hg-pd-cli` - Command line interface
- `hg-pd-dist` - Distribution packaging
- `hg-pd-test` - Test suite

### 3. hugegraph-store (9 submodules)
Distributed storage backend with RocksDB and Raft:
- `hg-store-core` - Core storage logic
- `hg-store-node` - Storage node implementation
- `hg-store-client` - Client library
- `hg-store-common` - Shared utilities
- `hg-store-grpc` - gRPC protocol definitions (auto-generated)
- `hg-store-rocksdb` - RocksDB integration
- `hg-store-cli` - Command line interface
- `hg-store-dist` - Distribution packaging
- `hg-store-test` - Test suite

### 4. hugegraph-commons
Shared utilities across modules:
- Locks and concurrency utilities
- Configuration management
- RPC framework components

### 5. hugegraph-struct
Data structure definitions shared between modules.
**Important**: Must be built before PD and Store modules.

### 6. install-dist
Distribution packaging and release management:
- License and NOTICE files
- Dependency management scripts
- Release documentation

### 7. hugegraph-cluster-test
Cluster integration tests for distributed deployments

## Cross-Module Dependencies

```
hugegraph-commons → (shared by all modules)
hugegraph-struct → hugegraph-pd + hugegraph-store
hugegraph-core → (extended by all backend implementations)
```
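
The dependency arrows above imply a build order. The sketch below derives one with a plain topological sort (Kahn's algorithm). The `BuildOrder` class is hypothetical, and the edges encode only the relations stated in this section, not the actual Maven reactor configuration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BuildOrder {
    // deps maps each module to the modules it depends on.
    // Every dependency must itself appear as a key.
    static List<String> order(Map<String, List<String>> deps) {
        // Number of not-yet-built dependencies per module.
        Map<String, Integer> pending = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String built = ready.poll();
            result.add(built);
            // Release every module that was waiting on the one just built.
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                if (e.getValue().contains(built)
                    && pending.merge(e.getKey(), -1, Integer::sum) == 0) {
                    ready.add(e.getKey());
                }
            }
        }
        if (result.size() != deps.size()) {
            throw new IllegalStateException("Dependency cycle detected");
        }
        return result;
    }

    // Edges taken from this section only (an assumption, not the real POMs).
    static Map<String, List<String>> documentedModules() {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("hugegraph-commons", List.of());
        deps.put("hugegraph-struct", List.of("hugegraph-commons"));
        deps.put("hugegraph-pd", List.of("hugegraph-commons", "hugegraph-struct"));
        deps.put("hugegraph-store", List.of("hugegraph-commons", "hugegraph-struct"));
        deps.put("hugegraph-server", List.of("hugegraph-commons"));
        return deps;
    }
}
```

Calling `BuildOrder.order(BuildOrder.documentedModules())` yields a list in which `hugegraph-commons` comes first and `hugegraph-struct` precedes both `hugegraph-pd` and `hugegraph-store`.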

## Distributed Architecture (Optional)

For production distributed deployments:
- **hugegraph-pd**: Service discovery, partition management, metadata
- **hugegraph-store**: Distributed storage with Raft (3+ nodes)
- **hugegraph-server**: Multiple server instances (3+)
- **Communication**: All components use gRPC with Protocol Buffers
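
The "3+ nodes" figure for Raft follows from majority arithmetic: a group of `n` nodes needs `n / 2 + 1` of them to agree, so three nodes is the smallest group that survives one failure. The sketch below shows that generic calculation; the `RaftQuorum` class is a hypothetical helper and does not reflect any HugeGraph configuration or API.

```java
class RaftQuorum {
    // Smallest number of nodes that forms a strict majority of the group.
    static int majority(int nodes) {
        if (nodes < 1) {
            throw new IllegalArgumentException("nodes must be >= 1");
        }
        return nodes / 2 + 1;
    }

    // Number of nodes that may fail while a majority is still reachable.
    static int toleratedFailures(int nodes) {
        return nodes - majority(nodes);
    }
}
```

For example, `toleratedFailures(3)` is 1 and `toleratedFailures(5)` is 2, while a fourth node adds no extra tolerance over three.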

**Status**: Distributed components (PD + Store) are in BETA
92 changes: 92 additions & 0 deletions .serena/memories/code_style_and_conventions.md
@@ -0,0 +1,92 @@
# Code Style and Conventions

## Code Style Configuration
- **IDE style**: Import `hugegraph-style.xml` into your IDE (IntelliJ IDEA recommended)
- **EditorConfig**: `.editorconfig` file defines style rules (validated in CI)
- **Checkstyle**: `style/checkstyle.xml` defines additional rules

## Core Style Rules (from .editorconfig)

### General
- Charset: UTF-8
- End of line: LF (Unix-style)
- Insert final newline: true
- Max line length: 100 characters (120 for XML)
- Visual guides at column 100
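
As a rough illustration of the line-length rule, the sketch below reports lines that exceed the limit. `LineLengthCheck` is a hypothetical helper, not part of the project's CI. It assumes LF line endings as stated above, and it counts length in UTF-16 code units, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

class LineLengthCheck {
    // Limits taken from the rules above: 100 columns, 120 for XML files.
    static int limitFor(String fileName) {
        return fileName.endsWith(".xml") ? 120 : 100;
    }

    // Returns the 1-based numbers of the lines that are longer than the limit.
    static List<Integer> tooLong(String fileName, String content) {
        int limit = limitFor(fileName);
        List<Integer> result = new ArrayList<>();
        String[] lines = content.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].length() > limit) {
                result.add(i + 1);
            }
        }
        return result;
    }
}
```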

### Java Files
- Indent: 4 spaces (not tabs)
- Continuation indent: 8 spaces
- Wrap on typing: true
- Wrap long lines: true

### Import Organization
```
$*
|
java.**
|
javax.**
|
org.**
|
com.**
|
*
```
- Class count to use import on demand: 100
- Names count to use import on demand: 100
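
Read as an IntelliJ import layout, `$*` stands for static imports, `|` for a blank line between groups, and the final `*` for all remaining imports. The sketch below assigns an import statement to its group under that reading. `ImportGroups` is a hypothetical helper, and the interpretation of the layout symbols is an assumption rather than something this file states.

```java
class ImportGroups {
    // Group index under the layout: $* | java.** | javax.** | org.** | com.** | *
    static int groupOf(String importLine) {
        String line = importLine.trim();
        if (!line.startsWith("import ")) {
            throw new IllegalArgumentException("Not an import: " + importLine);
        }
        if (line.startsWith("import static ")) {
            return 0; // $* : all static imports
        }
        String name = line.substring("import ".length()).trim();
        if (name.startsWith("java.")) {
            return 1;
        }
        if (name.startsWith("javax.")) {
            return 2;
        }
        if (name.startsWith("org.")) {
            return 3;
        }
        if (name.startsWith("com.")) {
            return 4;
        }
        return 5; // * : everything else
    }
}
```

Sorting a file's imports by this index, with a blank line wherever the index changes, reproduces the layout listed above.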

### Formatting Rules
- Line comments not at first column
- Align multiline: chained methods, parameters in calls, binary operations, assignments, ternary, throws, extends, array initializers
- Wrapping: normal (wrap if necessary)
- Brace forcing:
  - if: if_multiline
  - do-while: always
  - while: if_multiline
  - for: if_multiline
- Enum constants: split_into_lines

### Blank Lines
- Max blank lines in declarations: 1
- Max blank lines in code: 1
- Blank lines between package declaration and header: 1
- Blank lines before right brace: 1
- Blank lines around class: 1
- Blank lines after class header: 1

### Documentation
- Add `<p>` tag on empty lines: true
- Do not wrap if one line: true
- Align multiline annotation parameters: true

### XML Files
- Indent: 4 spaces
- Max line length: 120
- Text wrap: off
- Space inside empty tag: true

### Maven
- Compiler source/target: Java 11
- Max compiler errors: 500
- Compiler args: `-Xlint:unchecked`
- Source encoding: UTF-8

## Lombok Usage
- Version: 1.18.30
- Scope: provided
- Optional: true

## License Headers
- All source files MUST include the Apache License header
- Validated by apache-rat-plugin and skywalking-eyes
- Exclusions are defined in pom.xml (lines 171-221)
- gRPC generated code is excluded from the license check
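
For a quick local sanity check, the sketch below looks for the opening phrase of the standard ASF source header near the top of a file. `LicenseHeaderCheck` is a hypothetical helper and the 20-line scan window is an arbitrary choice; it does not reproduce how apache-rat-plugin or skywalking-eyes actually validate headers.

```java
class LicenseHeaderCheck {
    // Opening phrase of the standard ASF source header.
    static final String MARKER = "Licensed to the Apache Software Foundation (ASF)";

    // How many leading lines to scan; an arbitrary choice for this sketch.
    static final int SCAN_LINES = 20;

    static boolean hasHeader(String source) {
        // Split off at most SCAN_LINES lines; the remainder stays in the last slot.
        String[] lines = source.split("\n", SCAN_LINES + 1);
        int limit = Math.min(lines.length, SCAN_LINES);
        for (int i = 0; i < limit; i++) {
            if (lines[i].contains(MARKER)) {
                return true;
            }
        }
        return false;
    }
}
```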

## Naming Conventions
- Package names: lowercase, dot-separated (e.g., org.apache.hugegraph)
- Class names: PascalCase
- Method names: camelCase
- Constants: UPPER_SNAKE_CASE
- Variables: camelCase
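
The conventions above can be approximated with regular expressions, as in the sketch below. `NamingRules` and its patterns are hypothetical approximations written for this illustration; they are not taken from the project's Checkstyle configuration.

```java
import java.util.regex.Pattern;

class NamingRules {
    // Lowercase, dot-separated segments, e.g. org.apache.hugegraph
    static final Pattern PACKAGE =
            Pattern.compile("[a-z][a-z0-9]*(\\.[a-z][a-z0-9]*)*");
    // PascalCase, e.g. BackendStore
    static final Pattern CLASS = Pattern.compile("[A-Z][A-Za-z0-9]*");
    // camelCase, used for both methods and variables
    static final Pattern CAMEL = Pattern.compile("[a-z][A-Za-z0-9]*");
    // UPPER_SNAKE_CASE, e.g. MAX_SIZE
    static final Pattern CONSTANT = Pattern.compile("[A-Z][A-Z0-9]*(_[A-Z0-9]+)*");

    // True when the whole name matches the given rule.
    static boolean follows(Pattern rule, String name) {
        return rule.matcher(name).matches();
    }
}
```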
63 changes: 63 additions & 0 deletions .serena/memories/ecosystem_and_related_projects.md
@@ -0,0 +1,63 @@
# HugeGraph Ecosystem and Related Projects

## Core Repository (This Project)
**Repository**: apache/hugegraph (server)
**Purpose**: Core graph database engine (OLTP)

## Related Repositories

### 1. hugegraph-toolchain
**Repository**: https://github.com/apache/hugegraph-toolchain
**Components**:
- **hugegraph-loader**: Bulk data loading tool
- **hugegraph-hubble**: Web-based visualization dashboard
- **hugegraph-tools**: Command-line utilities
- **hugegraph-client**: Java client SDK

### 2. hugegraph-computer
**Repository**: https://github.com/apache/hugegraph-computer
**Purpose**: Distributed graph computing framework (OLAP)
**Features**: PageRank, Connected Components, Shortest Path, Community Detection

### 3. hugegraph-ai
**Repository**: https://github.com/apache/incubator-hugegraph-ai
**Purpose**: Graph AI, LLM, and Knowledge Graph integration
**Features**: Graph-enhanced LLM, KG construction, Graph RAG, NL to Gremlin/Cypher

### 4. hugegraph-website
**Repository**: https://github.com/apache/hugegraph-doc
**Purpose**: Official documentation and website
**URL**: https://hugegraph.apache.org/

## Integration Points

### Data Pipeline
```
Data Sources → hugegraph-loader → hugegraph-server
                      ┌───────────────────┼───────────────────┐
                      ↓                   ↓                   ↓
              hugegraph-hubble   hugegraph-computer     hugegraph-ai
               (Visualization)       (Analytics)           (AI/ML)

## External Integrations

### Big Data Platforms
- Apache Flink, Apache Spark, HDFS

### Storage Backends
- RocksDB (default), HBase, Cassandra, ScyllaDB, MySQL, PostgreSQL

### Query Languages
- Gremlin (Apache TinkerPop), Cypher (OpenCypher), REST API

## Version Compatibility
- Server: 1.7.0
- TinkerPop: 3.5.1
- Java: 11+ required

## Use Cases
- Social networks, Fraud detection, Recommendation systems
- Knowledge graphs, Network analysis, Supply chain management
- IT operations, Bioinformatics
104 changes: 104 additions & 0 deletions .serena/memories/implementation_patterns_and_guidelines.md
@@ -0,0 +1,104 @@
# Implementation Patterns and Guidelines

## Backend Development

### Backend Architecture Pattern
- All backends extend abstractions from `hugegraph-server/hugegraph-core`
- Implement the `BackendStore` interface
- Each backend is a separate Maven module under `hugegraph-server/`
- Backend selection configured in `hugegraph.properties` via `backend` property
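
As a sketch of how a `backend` property could be read, the example below parses properties-file text with `java.util.Properties`. `BackendConfig` is a hypothetical helper, not HugeGraph's own configuration code. Falling back to `rocksdb` when the key is absent is an assumption based on RocksDB being listed as the default backend, and the value `hstore` is used only as an illustrative setting.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class BackendConfig {
    // Assumed fallback: this document lists RocksDB as the default backend.
    static final String DEFAULT_BACKEND = "rocksdb";

    // Parses properties-file text and returns the value of the `backend` key.
    static String backendOf(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return props.getProperty("backend", DEFAULT_BACKEND).trim();
    }
}
```

Real code would load `hugegraph.properties` from disk; the in-memory text keeps the sketch self-contained.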

### Available Backends
- **RocksDB** (default, embedded): `hugegraph-rocksdb`
- **HStore** (distributed, production): `hugegraph-hstore`
- **Legacy** (≤1.5.0): MySQL, PostgreSQL, Cassandra, ScyllaDB, HBase, Palo

### Backend Testing Profiles
- `memory`: In-memory backend for fast unit tests
- `rocksdb`: RocksDB for realistic local tests
- `hbase`: HBase for distributed scenarios
- `hstore`: HStore for production-like distributed tests

## gRPC Protocol Development

### Protocol Buffer Definitions
- PD protos: `hugegraph-pd/hg-pd-grpc/src/main/proto/`
- Store protos: `hugegraph-store/hg-store-grpc/src/main/proto/`

### Code Generation
After modifying `.proto` files, run `mvn clean compile` to regenerate the gRPC stubs. Keep in mind:
- Generated Java code goes to `*/grpc/` packages
- Output location: `target/generated-sources/protobuf/`
- Generated files are excluded from Apache RAT checks
- All inter-service communication uses gRPC

## Authentication System

### Default State
- Authentication **disabled by default**
- Enable via `bin/enable-auth.sh` or configuration
- **Required for production deployments**

### Implementation Location
`hugegraph-server/hugegraph-api/src/main/java/org/apache/hugegraph/api/auth/`

### Multi-Level Security Model
- Users, Groups, Projects, Targets, Access control

## TinkerPop Integration

### Compliance
- Full Apache TinkerPop 3 implementation
- Custom optimization strategies
- Supports both Gremlin and OpenCypher query languages

### Query Language Support
- **Gremlin**: Native via TinkerPop integration
- **OpenCypher**: Implementation in `hugegraph-api/opencypher/`

## Testing Patterns

### Test Suite Organization
- **UnitTestSuite**: Pure unit tests, no external dependencies
- **CoreTestSuite**: Core functionality tests with backend
- **ApiTestSuite**: REST API integration tests
- **StructureStandardTest**: TinkerPop structure compliance
- **ProcessStandardTest**: TinkerPop process compliance

### Backend Selection in Tests
Use Maven profiles:
```bash
-P core-test,memory # Fast in-memory
-P core-test,rocksdb # Persistent local
-P api-test,rocksdb # API with persistent backend
```

## Distribution and Packaging

### Creating Distribution
```bash
mvn clean package -DskipTests
```
Output: `install-dist/target/hugegraph-<version>.tar.gz`

## Code Organization

### Package Structure
```
org.apache.hugegraph
├── backend/ # Backend implementations
├── api/ # REST API endpoints
├── core/ # Core graph engine
├── schema/ # Schema definitions
├── traversal/ # Traversal and query processing
├── task/ # Background tasks
├── auth/ # Authentication/authorization
└── util/ # Utilities
```

### Module Dependencies
- Commons is shared by all modules
- Struct must be built before PD and Store
- Backend modules depend on core
- Test module depends on all server modules