Skip to content

Commit 05270f9

Browse files
decebalclaude
andcommitted
feat: v0.6.0 - Performance optimizations for 1M+ events/sec
## SIMD-Accelerated JSON Parsing - Zero-copy parsing with simd-json (2-3x faster) - SimdJsonParser, ZeroCopyJson, BatchEventParser - 824K events/sec, 112 MB/s throughput ## Lock-Free Data Structures - LockFreeEventQueue: 41M push/sec, 3.4M pop/sec - ShardedEventQueue: 1.06M events/sec (4 threads) - LockFreeMetrics: ~5-10ns per metric update ## Batch Processing Pipeline - BatchProcessor combining SIMD + lock-free queues - Configurable presets (high_throughput, low_latency) - 335K-726K events/sec ## Arena Memory Pooling - Thread-local arena pools with bumpalo - 28.5M allocations/sec - 99%+ recycle rate ## Clean Architecture Refactoring - Moved modules to proper Clean Architecture locations - Domain, Application, Infrastructure layers - Updated all imports and exports Performance: Full pipeline 726K events/sec (up from 469K) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 1b38f74 commit 05270f9

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

44 files changed

+2885
-199
lines changed

.claude/settings.local.json

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -101,7 +101,9 @@
101101
"Bash(mkdir:*)",
102102
"Bash(bun pm:*)",
103103
"Bash(cargo fmt:*)",
104-
"Bash(cargo search:*)"
104+
"Bash(cargo search:*)",
105+
"Bash(cargo test:*)",
106+
"Bash(cargo build:*)"
105107
],
106108
"deny": [],
107109
"ask": []

apps/core/Cargo.toml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
[package]
22
name = "allsource-core"
3-
version = "0.2.0"
3+
version = "0.6.0"
44
edition = "2021"
55
rust-version = "1.75"
66
authors = ["AllSource Team"]
@@ -59,6 +59,7 @@ tower-http = { version = "0.6", features = ["cors", "trace"] }
5959
# =============================================================================
6060
serde = { version = "1.0", features = ["derive"] }
6161
serde_json = "1.0"
62+
simd-json = "0.14" # SIMD-accelerated JSON parsing for high throughput
6263
toml = "0.8"
6364

6465
# =============================================================================
@@ -83,6 +84,7 @@ uuid = { version = "1.19", features = ["v4", "serde"] }
8384
# High-performance data structures (Updated: 2025-12-05)
8485
# DashMap 6.x is stable, 7.0 is RC - stay on stable
8586
# =============================================================================
87+
bumpalo = { version = "3.16", features = ["collections"] } # Arena allocator for zero-copy
8688
crossbeam = "0.8"
8789
crossbeam-queue = "0.3"
8890
dashmap = "6.1"

apps/core/README.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -99,6 +99,46 @@ allsource-core = "0.2"
9999
- **Pipeline Statistics**: Track processing metrics per pipeline
100100
- **Integrated Processing**: Events flow through pipelines during ingestion
101101

102+
### 🏗️ Clean Architecture
103+
104+
The codebase follows a strict layered architecture for maintainability and testability:
105+
106+
```
107+
┌─────────────────────────────────────────────────────────────┐
108+
│ Infrastructure Layer │
109+
│ (HTTP handlers, WebSocket, persistence, security) │
110+
│ infrastructure::web, infrastructure::persistence, │
111+
│ infrastructure::security, infrastructure::repositories │
112+
├─────────────────────────────────────────────────────────────┤
113+
│ Application Layer │
114+
│ (Use cases, services, DTOs) │
115+
│ application::use_cases, application::services, │
116+
│ application::dto │
117+
├─────────────────────────────────────────────────────────────┤
118+
│ Domain Layer │
119+
│ (Entities, value objects, repository traits) │
120+
│ domain::entities, domain::value_objects, │
121+
│ domain::repositories │
122+
└─────────────────────────────────────────────────────────────┘
123+
```
124+
125+
**Module Organization:**
126+
127+
| Layer | Path | Contents |
128+
|-------|------|----------|
129+
| Domain | `domain::entities` | Event, Tenant, Schema, Projection, AuditEvent, EventStream |
130+
| Domain | `domain::value_objects` | EntityId, TenantId, EventType, PartitionKey |
131+
| Domain | `domain::repositories` | Repository trait definitions (no implementations) |
132+
| Application | `application::use_cases` | IngestEvent, QueryEvents, ManageTenant, ManageSchema |
133+
| Application | `application::services` | AuditLogger, Pipeline, Replay, Analytics |
134+
| Application | `application::dto` | Request/Response DTOs for API boundaries |
135+
| Infrastructure | `infrastructure::web` | HTTP handlers, WebSocket, API routes |
136+
| Infrastructure | `infrastructure::persistence` | Storage, WAL, Snapshots, Compaction, Index |
137+
| Infrastructure | `infrastructure::security` | Auth, Middleware, Rate Limiting |
138+
| Infrastructure | `infrastructure::repositories` | In-memory, PostgreSQL, RocksDB implementations |
139+
140+
**Dependency Rule:** Inner layers never depend on outer layers. Domain has zero external dependencies.
141+
102142
### 🏔️ SierraDB-Inspired Production Patterns (NEW)
103143

104144
Based on battle-tested patterns from [SierraDB](https://github.com/cablehead/xs), a production-grade event store:

apps/core/docs/CHANGELOG.md

Lines changed: 106 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,80 @@ All notable changes to AllSource Core will be documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [0.6.0] - 2025-12-05
9+
10+
### Added
11+
12+
#### SIMD-Accelerated JSON Parsing (`simd_json.rs`)
13+
- Zero-copy JSON parsing using `simd-json` library
14+
- 2-3x faster parsing with SIMD instructions (AVX2, SSE4.2, NEON)
15+
- `SimdJsonParser` for high-throughput deserialization
16+
- `ZeroCopyJson` for read-only field access without full deserialization
17+
- `BatchEventParser` for efficient batch processing
18+
- `SimdJsonStats` for tracking parsing throughput and errors
19+
- **Performance: 824K events/sec, 112 MB/s throughput**
20+
21+
#### Lock-Free Data Structures (`lock_free/`)
22+
- `LockFreeEventQueue`: MPMC queue using crossbeam ArrayQueue
23+
- ~10-20ns push/pop operations
24+
- **Performance: 41M push/sec, 3.4M pop/sec**
25+
- `ShardedEventQueue`: Distributed queue for high contention scenarios
26+
- 16 shards by default for cache-line optimization
27+
- Batch push/pop operations
28+
- **Performance: 1.06M events/sec with 4 threads**
29+
- `LockFreeMetrics`: Atomic counters for zero-contention monitoring
30+
- Min/max/avg latency tracking with CAS operations
31+
- ~5-10ns per metric update
32+
33+
#### Batch Processing Pipeline (`batch_processor.rs`)
34+
- `BatchProcessor`: High-throughput ingestion combining SIMD + lock-free queues
35+
- Configurable batch sizes (1K-50K events)
36+
- Three configuration presets:
37+
- `default()`: Balanced for general use
38+
- `high_throughput()`: Maximum events/sec (50K batch, 10M queue)
39+
- `low_latency()`: Quick responses (1K batch, 100K queue)
40+
- `RawEventData`: Efficient event deserialization struct
41+
- **Performance: 335K events/sec (single), 726K events/sec (4 threads)**
42+
43+
#### Arena Memory Pooling (`arena_pool.rs`)
44+
- Thread-local arena pools using `bumpalo` allocator
45+
- `get_arena()`: Get recycled arena from pool (~10-20ns)
46+
- `PooledArena`: RAII wrapper with automatic pool return
47+
- `ScopedArena`: Convenient scoped allocation pattern
48+
- `SizedBufferPool`: Pre-allocated buffers for specific sizes
49+
- 99%+ arena recycle rate
50+
- **Performance: 28.5M allocations/sec**
51+
52+
#### Performance Validation Tests (`performance_test.rs`)
53+
- Comprehensive benchmark suite for all optimizations
54+
- Tests for SIMD parsing, queue throughput, batch processing
55+
- Concurrent performance validation
56+
- Sustained throughput testing
57+
58+
### Changed
59+
- Added `simd-json = "0.14"` dependency for SIMD acceleration
60+
- Added `bumpalo = "3.16"` dependency for arena allocation
61+
- Added `From<SimdJsonError>` impl for `AllSourceError`
62+
- Clean Architecture module organization maintained
63+
64+
### Performance
65+
- **Lock-Free Queue (push)**: 41.1M ops/sec
66+
- **Arena Allocations**: 28.5M allocs/sec
67+
- **Lock-Free Queue (pop)**: 3.4M ops/sec
68+
- **Sharded Queue (concurrent)**: 2.5M ops/sec
69+
- **SIMD JSON Parsing**: 824K events/sec
70+
- **Full Pipeline (4 threads)**: 726K events/sec
71+
- **Sustained Throughput**: 418K events/sec
72+
- **Batch Processor**: 335K events/sec
73+
74+
### Technical Details
75+
- All new modules follow Clean Architecture patterns
76+
- Full test coverage with 57+ new tests
77+
- Debug and release mode performance thresholds
78+
- Thread-safe implementations with `Send + Sync`
79+
80+
---
81+
882
## [0.5.0] - 2025-10-20
983

1084
### Added
@@ -176,18 +250,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
176250

177251
## [Unreleased]
178252

179-
### Planned for v0.6 - Performance & Optimization
180-
181-
- [ ] Zero-copy deserialization optimization
182-
- [ ] SIMD-accelerated queries
183-
- [ ] Memory-mapped Parquet files
184-
- [ ] Adaptive indexing strategies
185-
- [ ] Query result caching
186-
- [ ] Compression tuning
187-
- [ ] Batch write optimization
188-
189-
Target: 1M+ events/sec, <5μs queries
190-
191253
### Planned for v0.7 - Advanced Features
192254

193255
- [ ] Multi-tenancy support
@@ -231,14 +293,45 @@ Target: 10M+ events/sec (distributed), 99.99% availability
231293

232294
| Version | Date | Status | Highlights |
233295
|---------|------|--------|------------|
234-
| [0.5.0] | 2025-10-20 | ✅ Current | Schema registry, event replay, stream processing |
296+
| [0.6.0] | 2025-12-05 | ✅ Current | SIMD JSON, lock-free queues, batch processing, arena pools |
297+
| [0.5.0] | 2025-10-20 | ✅ Stable | Schema registry, event replay, stream processing |
235298
| [0.2.0] | 2025-01-15 | ✅ Stable | Parquet storage, WAL, snapshots, analytics |
236299
| [0.1.0] | 2024-12-01 | ✅ Stable | Core event store, indexing, projections |
237300

238301
---
239302

240303
## Upgrade Notes
241304

305+
### Upgrading from 0.5.0 to 0.6.0
306+
307+
**Breaking Changes**: None
308+
309+
**New Features**: All performance optimizations are opt-in and transparent.
310+
311+
**Dependencies**:
312+
- Added `simd-json = "0.14"` for SIMD-accelerated JSON parsing
313+
- Added `bumpalo = "3.16"` for arena memory allocation
314+
315+
**New Modules** (all in `infrastructure::persistence`):
316+
- `simd_json`: SIMD JSON parsing utilities
317+
- `lock_free::sharded_queue`: High-throughput sharded queue
318+
- `batch_processor`: Batch processing pipeline
319+
- `arena_pool`: Thread-local arena memory pools
320+
- `performance_test`: Performance validation tests
321+
322+
**Migration Steps**:
323+
1. Update dependencies: `cargo update`
324+
2. Rebuild: `cargo build --release`
325+
3. Run tests: `cargo test`
326+
4. No data migration required
327+
5. Optionally integrate new batch processing APIs for higher throughput
328+
329+
**Performance Improvements**:
330+
- JSON parsing: 2-3x faster with SIMD
331+
- Queue operations: 10-100x faster with lock-free structures
332+
- Memory allocation: 5-10x faster with arena pooling
333+
- Full pipeline: 726K events/sec (up from 469K)
334+
242335
### Upgrading from 0.2.0 to 0.5.0
243336

244337
**Breaking Changes**: None

apps/core/src/analytics.rs renamed to apps/core/src/application/services/analytics.rs

File renamed without changes.
Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,24 @@
1+
// Application services layer
2+
// Contains business logic orchestration and domain service implementations
3+
4+
pub mod analytics;
15
pub mod audit_logger;
6+
pub mod pipeline;
7+
pub mod projection;
8+
pub mod replay;
9+
pub mod schema;
10+
pub mod tenant_service;
211

12+
// Re-exports for convenience
13+
pub use analytics::AnalyticsEngine;
314
pub use audit_logger::{AuditLogger, RequestContext};
15+
pub use pipeline::{Pipeline, PipelineConfig, PipelineManager, PipelineOperator, PipelineStats};
16+
pub use projection::{
17+
EntitySnapshotProjection, EventCounterProjection, Projection, ProjectionManager,
18+
};
19+
pub use replay::{ReplayManager, ReplayProgress, StartReplayRequest, StartReplayResponse};
20+
pub use schema::{
21+
CompatibilityMode, RegisterSchemaRequest, RegisterSchemaResponse, SchemaRegistry,
22+
SchemaRegistryConfig, ValidateEventRequest, ValidateEventResponse,
23+
};
24+
pub use tenant_service::{Tenant, TenantManager, TenantQuotas, TenantUsage};

apps/core/src/pipeline.rs renamed to apps/core/src/application/services/pipeline.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
use crate::domain::entities::Event;
22
use crate::error::{AllSourceError, Result};
3-
use crate::metrics::MetricsRegistry;
3+
use crate::infrastructure::observability::metrics::MetricsRegistry;
44
use chrono::{DateTime, Duration, Utc};
55
use parking_lot::RwLock;
66
use serde::{Deserialize, Serialize};

apps/core/src/projection.rs renamed to apps/core/src/application/services/projection.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
use crate::domain::entities::Event;
22
use crate::error::Result;
3-
use crate::metrics::MetricsRegistry;
3+
use crate::infrastructure::observability::metrics::MetricsRegistry;
44
use dashmap::DashMap;
55
use serde_json::Value;
66
use std::sync::Arc;

apps/core/src/replay.rs renamed to apps/core/src/application/services/replay.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
use crate::application::dto::QueryEventsRequest;
22
use crate::domain::entities::Event;
33
use crate::error::{AllSourceError, Result};
4-
use crate::projection::Projection;
4+
use crate::application::services::projection::Projection;
55
use crate::store::EventStore;
66
use chrono::{DateTime, Utc};
77
use parking_lot::RwLock;

apps/core/src/schema.rs renamed to apps/core/src/application/services/schema.rs

File renamed without changes.

0 commit comments

Comments
 (0)