@@ -23,8 +23,8 @@ Comet is a high-performance embedded segmented log designed for edge observabili
2323├─────────────────────┤ ├──────────────────────┤
2424│ - nextEntryNumber │ │ - Index │
2525│ - pendingWrites │ │ - Data Files │
26- │ - writeBuffers │ │ - Consumer Offsets │
27- │ - fileSize │ │ - State File │
26+ │ - writeBuffers │ │ - State File │
27+ │ - fileSize │ │ │
2828└─────────────────────┘ └──────────────────────┘
2929 │ ▲
3030 │ │
@@ -93,13 +93,20 @@ Each shard represents an independent data stream partition with strict state sep
9393│ Durable State: │
9494│ - Binary searchable index │
9595│ - Memory-mapped state file │
96- │ - Consumer offsets │
9796│ - File metadata │
97+ │ │
98+ │ Consumer State (Separate): │
99+ │ - Memory-mapped offset file │
100+ │ - Lock-free consumer tracking │
98101└──────────────────────────────────────┘
99102 │ │ │
100103 ▼ ▼ ▼
101104 Data Files Index File State File
102105 (.comet) (.bin) (.state)
106+ │
107+ ▼
108+ Consumer Offsets
109+ (offsets.state)
103110```
104111
105112**Critical State Management:**
@@ -142,7 +149,8 @@ Key features:
142149
143150- **Exactly-once semantics**: Through ACK tracking
144151- **Deterministic assignment**: Consistent hashing for multi-consumer
145- - **In-memory offsets**: Fast read-after-ACK consistency
152+ - **Separate offset storage**: Consumer offsets stored independently from the writer's index
153+ - **Lock-free offset updates**: Memory-mapped storage for multi-process safety
146154- **Batch operations**: Amortizes overhead across messages
147155
148156### 4. Reader
@@ -244,14 +252,14 @@ For each shard:
244252 ├─→ Binary search index (durable entries only)
245253 ├─→ Read entries from mapped files
246254 ├─→ Decompress if needed
247- └─→ Update consumer offsets (on ACK)
255+ └─→ Update consumer offsets in mmap (on ACK)
248256```
249257
250258**Key Points:**
251259
252260- Consumers only see entries in index.CurrentEntryNumber
253261- Reader cache automatically detects stale mappings
254- - Consumer offsets are persisted with index
262+ - Consumer offsets are stored separately in memory-mapped files
255263- No unflushed/pending data is ever visible
256264
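The index lookup in the read path can be sketched as a binary search over the index nodes. This is a minimal illustration, not Comet's actual reader: `indexNode`, `findNode`, and `lookup` are hypothetical names, only the two node fields shown in the index format are modeled, and treating `CurrentEntryNumber` as an exclusive upper bound on visible entries is an assumption.

```go
package main

import "sort"

// indexNode is a hypothetical in-memory view of one binary search node.
// Only the entry number and file index are modeled here.
type indexNode struct {
	EntryNumber int64
	FileIndex   uint32
}

// findNode returns the position of the last node whose EntryNumber is
// <= target, or -1 if every node starts after target.
func findNode(nodes []indexNode, target int64) int {
	// sort.Search returns the first position whose node starts after target.
	i := sort.Search(len(nodes), func(i int) bool {
		return nodes[i].EntryNumber > target
	})
	return i - 1
}

// lookup applies the visibility rule before searching. ASSUMPTION: entries
// numbered >= currentEntryNumber are not yet durable and stay hidden.
func lookup(nodes []indexNode, currentEntryNumber, target int64) int {
	if target < 0 || target >= currentEntryNumber {
		return -1
	}
	return findNode(nodes, target)
}
```

With nodes starting at entries 0, 100, and 200, a lookup for entry 150 lands on the node that starts at 100, and the reader would then scan forward from that node's position in the mapped file.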
257265### Retention Path
@@ -271,6 +279,8 @@ Client.cleanupShard()
271279
272280## Memory-Mapped State
273281
282+ ### Writer State (comet.state)
283+
274284Each shard maintains a 1KB state file with cache-line aligned sections:
275285
276286```
@@ -362,6 +372,43 @@ Benefits:
362372- **Atomic access**: Lock-free updates
363373- **Fixed size**: Predictable memory usage
364374
375+ ### Consumer Offset Storage (offsets.state)
376+
377+ Each shard maintains a separate memory-mapped file of about 64KB (a 64-byte header plus 512 × 128-byte entries) for consumer offsets:
378+
379+ ```
380+ ┌─────────────────────────┐ Header (64 bytes)
381+ │ Version (4B) │ Format version (1)
382+ │ Magic (4B) │ 0xC0FE0FF5
383+ │ Reserved (56B) │ Future expansion
384+ ├─────────────────────────┤ Consumer Entries (512 × 128 bytes)
385+ │ Entry 0 (128B) │ First consumer group
386+ │ ├─ GroupName (48B) │ Null-terminated string
387+ │ ├─ Offset (8B) │ Current consumer offset
388+ │ ├─ LastUpdate (8B) │ Unix nano timestamp
389+ │ ├─ AckCount (8B) │ Total acknowledgments
390+ │ └─ Reserved (56B) │ Future use
391+ │ Entry 1 (128B) │ Second consumer group
392+ │ ... │
393+ │ Entry 511 (128B) │ Last consumer group
394+ └─────────────────────────┘
395+ ```
396+
397+ **Key Features:**
398+ 
399+ - **Lock-free access**: Atomic operations for multi-process safety
400+ - **512 consumer groups**: Per shard, in a hash table with linear probing
401+ - **Memory-mapped**: Changes visible immediately across processes
402+ - **Cache-line aligned**: Each entry spans exactly two 64-byte cache lines (128 bytes)
403+ - **Automatic migration**: From old file-based format to mmap format
404+
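The layout above can be checked with a little arithmetic. The sketch below mirrors the header and entry as Go structs and computes the implied file size. The struct and function names are hypothetical, and the integer types chosen for `Offset`, `LastUpdate`, and `AckCount` are assumptions; the diagram only gives their widths.

```go
package main

import "unsafe"

// Sizes taken from the layout diagram.
const (
	headerSize = 64
	entrySize  = 128
	maxGroups  = 512
)

// OffsetHeader mirrors the 64-byte file header (hypothetical name).
type OffsetHeader struct {
	Version  uint32
	Magic    uint32
	Reserved [56]byte
}

// OffsetEntry mirrors one 128-byte consumer entry (hypothetical name).
// ASSUMPTION: field types are guesses that match the documented widths.
type OffsetEntry struct {
	GroupName  [48]byte // null-terminated group name
	Offset     int64    // current consumer offset
	LastUpdate int64    // Unix nanoseconds
	AckCount   uint64   // total acknowledgments
	Reserved   [56]byte
}

// StructSizes reports the in-memory sizes of the two structs.
func StructSizes() (header, entry uintptr) {
	return unsafe.Sizeof(OffsetHeader{}), unsafe.Sizeof(OffsetEntry{})
}

// FileSize returns the total size implied by the layout: 64 + 512*128.
func FileSize() int { return headerSize + maxGroups*entrySize }

// EntryOffset returns the byte offset of slot i within the file.
func EntryOffset(i int) int { return headerSize + i*entrySize }
```

Under these assumptions the structs come out at 64 and 128 bytes, and the file is 65,600 bytes: the 512 entries alone fill 64 KiB, and the header adds another 64 bytes.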
405+ **Consumer Group Management:**
406+ 
407+ - Groups allocated using FNV-1a hash with linear probing
408+ - Empty slots detected by null GroupName[0]
409+ - Atomic slot claiming prevents race conditions
410+ - No explicit locking required for reads or writes
411+
365412## Wire Format
366413
367414Each entry follows a simple, efficient format:
@@ -403,11 +450,6 @@ Binary format for fast lookups and persistence:
403450├────────────────────────────┤
404451│ File count (4B) │
405452├────────────────────────────┤
406- │ Consumer offsets │ For each consumer:
407- │ - Group name length (1B) │ - Length of group name
408- │ - Group name (N bytes) │ - UTF-8 group name
409- │ - Offset (8B) │ - Consumer offset
410- ├────────────────────────────┤
411453│ Binary search nodes │ For each node (20B):
412454│ - Entry number (8B) │ - Entry number
413455│ - File index (4B) │ - Index into files array
@@ -511,13 +553,13 @@ For each shard:
511553
512554- All data flushed to disk
513555- Index reflects actual state
514- - Consumer offsets saved
556+ - Consumer offsets preserved in separate memory-mapped files
515557
516558**After Crash:**
517559
518560- Only synced data is recoverable
519561- Index rebuilt from actual files
520- - Consumer offsets reset to last known good state
562+ - Consumer offsets preserved independently in offsets.state files
521563- Unflushed writes are lost (by design)
522564
523565### Recovery Scenarios
@@ -564,6 +606,6 @@ This design ensures that even after catastrophic failures, Comet can recover to
564606
565607### Resource Usage
566608
567- - **Memory**: ~1KB state + configurable cache
609+ - **Memory**: ~65KB state + configurable memory-mapped file cache
568610- **Disk**: Efficient compression, automatic cleanup
569611- **CPU**: Minimal (compression optional)