Commit e360170

feat: add initial implementation of EJSON v2
Implement EJSON v2 (MongoDB Extended JSON) codec
2 parents ab4b881 + 6df3b4e commit e360170

9 files changed

+1953
-0
lines changed

src/ejson/EjsonDecoder.ts

Lines changed: 516 additions & 0 deletions
Large diffs are not rendered by default.

src/ejson/EjsonEncoder.ts

Lines changed: 593 additions & 0 deletions
Large diffs are not rendered by default.

src/ejson/README.md

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
# EJSON v2 (MongoDB Extended JSON) Codec

This directory contains an implementation of the MongoDB Extended JSON v2 codec, providing high-performance encoding and decoding of BSON types in JSON format.

## Performance Optimizations

**High-Performance Binary Encoding**: The implementation uses `Writer` and `Reader` directly to write and read raw bytes without intermediate JSON representations, following the same pattern as `JsonEncoder` and `JsonDecoder` for optimal performance.

## Features

**EjsonEncoder** - Supports both encoding modes:
- **Canonical Mode**: Preserves all type information using explicit type wrappers like `{"$numberInt": "42"}`
- **Relaxed Mode**: Uses native JSON types where possible for better readability (e.g., `42` instead of `{"$numberInt": "42"}`)

**EjsonDecoder** - Strict parsing with comprehensive validation:
- Validates exact key matches for type wrappers
- Throws descriptive errors for malformed input
- Supports both canonical and relaxed format parsing

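The difference between the two modes is easiest to see on plain numbers. The following is a minimal sketch only, not the code in `EjsonEncoder.ts`: `encodeNumber` is a hypothetical helper, and how the real encoder classifies a plain JavaScript number (Int32, Int64, or Double) is an assumption here.

```typescript
// Hypothetical helper for illustration; not part of the codec's API.
type EjsonNumber =
  | number
  | {$numberInt: string}
  | {$numberLong: string}
  | {$numberDouble: string};

const INT32_MIN = -0x80000000;
const INT32_MAX = 0x7fffffff;

function encodeNumber(value: number, canonical: boolean): EjsonNumber {
  // NaN and +/-Infinity have no JSON literal, so both modes need the wrapper.
  if (!Number.isFinite(value)) {
    const text = Number.isNaN(value) ? 'NaN' : value > 0 ? 'Infinity' : '-Infinity';
    return {$numberDouble: text};
  }
  // Relaxed mode: emit the native JSON number.
  if (!canonical) return value;
  // Canonical mode: always wrap, choosing the narrowest integer type that fits.
  if (Number.isInteger(value) && value >= INT32_MIN && value <= INT32_MAX) {
    return {$numberInt: String(value)};
  }
  // Assumption: only safe integers are treated as Int64; anything else is a Double.
  if (Number.isSafeInteger(value)) return {$numberLong: String(value)};
  return {$numberDouble: String(value)};
}

console.log(JSON.stringify(encodeNumber(42, true))); // {"$numberInt":"42"}
console.log(JSON.stringify(encodeNumber(42, false))); // 42
```

Canonical output is lossless with respect to BSON types, while relaxed output reads better but cannot distinguish an Int32 `42` from a Double `42` on the way back in.
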
## API

### Binary-First API (Recommended for Performance)
```typescript
import {EjsonEncoder, EjsonDecoder} from '@jsonjoy.com/json-pack/ejson2';
import {Writer} from '@jsonjoy.com/util/lib/buffers/Writer';

const writer = new Writer();
const encoder = new EjsonEncoder(writer, { canonical: true });
const decoder = new EjsonDecoder();

// Encode to bytes
const bytes = encoder.encode(data);

// Decode from bytes
const result = decoder.decode(bytes);
```

### String API (For Compatibility)
```typescript
import {createEjsonEncoder, createEjsonDecoder} from '@jsonjoy.com/json-pack/ejson2';

const encoder = createEjsonEncoder({ canonical: true });
const decoder = createEjsonDecoder();

// Encode to string
const jsonString = encoder.encodeToString(data);

// Decode from string
const result = decoder.decodeFromString(jsonString);
```

## Supported BSON Types

The implementation supports all BSON types as per the MongoDB specification:

- **ObjectId**: `{"$oid": "507f1f77bcf86cd799439011"}`
- **Numbers**: Int32, Int64, Double with proper canonical/relaxed handling
- **Decimal128**: `{"$numberDecimal": "123.456"}`
- **Binary & UUID**: Full base64 encoding with subtype support
- **Code & CodeWScope**: JavaScript code with optional scope
- **Dates**: ISO-8601 format (relaxed) or timestamp (canonical)
- **RegExp**: Pattern and options preservation
- **Special types**: MinKey, MaxKey, Undefined, DBPointer, Symbol, Timestamp

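As an illustration of what decoding one of these wrappers involves, the sketch below splits the 24-character `$oid` hex string into the three fields that the decoder tests in this commit check (`timestamp`, `process`, `counter`). `parseObjectIdHex` is a hypothetical standalone function, not the decoder's actual code; it assumes the 4-byte / 5-byte / 3-byte layout implied by those tests.

```typescript
// Hypothetical helper for illustration; the real decoder returns a BsonObjectId.
interface ObjectIdParts {
  timestamp: number; // first 4 bytes
  process: number; // next 5 bytes
  counter: number; // last 3 bytes
}

function parseObjectIdHex(hex: string): ObjectIdParts {
  // An ObjectId is exactly 12 bytes, i.e. 24 hexadecimal characters.
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) throw new Error('Invalid ObjectId format');
  return {
    timestamp: parseInt(hex.slice(0, 8), 16),
    // 5 bytes (40 bits) still fit exactly in a JavaScript number.
    process: parseInt(hex.slice(8, 18), 16),
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

const parts = parseObjectIdHex('507f1f77bcf86cd799439011');
console.log(parts.timestamp === 0x507f1f77); // true
console.log(parts.process === 0xbcf86cd799); // true
console.log(parts.counter === 0x439011); // true
```
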
## Examples

```typescript
import { createEjsonEncoder, createEjsonDecoder, BsonObjectId, BsonInt64 } from '@jsonjoy.com/json-pack/ejson2';

const data = {
  _id: new BsonObjectId(0x507f1f77, 0xbcf86cd799, 0x439011),
  // Largest integer a JavaScript number represents exactly.
  // The literal 9223372036854775807 would be rounded to 2^63 before it reaches BsonInt64.
  count: new BsonInt64(9007199254740991),
  created: new Date('2023-01-15T10:30:00.000Z')
};

// Canonical mode (preserves all type info)
const canonical = createEjsonEncoder({ canonical: true });
console.log(canonical.encodeToString(data));
// {"_id":{"$oid":"507f1f77bcf86cd799439011"},"count":{"$numberLong":"9007199254740991"},"created":{"$date":{"$numberLong":"1673778600000"}}}

// Relaxed mode (more readable)
const relaxed = createEjsonEncoder({ canonical: false });
console.log(relaxed.encodeToString(data));
// {"_id":{"$oid":"507f1f77bcf86cd799439011"},"count":9007199254740991,"created":{"$date":"2023-01-15T10:30:00.000Z"}}

// Decoding with validation
const decoder = createEjsonDecoder();
const decoded = decoder.decodeFromString(canonical.encodeToString(data)) as typeof data;
console.log(decoded._id instanceof BsonObjectId); // true
```

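The `created` field in the example above takes a different shape in each mode. A minimal sketch of that choice follows, assuming the codec applies the rule from the Extended JSON v2 specification that relaxed mode uses an ISO-8601 string only for years 1970 through 9999 and falls back to the `$numberLong` form otherwise. `encodeDate` is a hypothetical helper, not the encoder's actual method.

```typescript
// Hypothetical helper for illustration; not part of the codec's API.
type EjsonDate = {$date: string} | {$date: {$numberLong: string}};

function encodeDate(date: Date, canonical: boolean): EjsonDate {
  const ms = date.getTime();
  if (Number.isNaN(ms)) throw new Error('Invalid Date');
  const year = date.getUTCFullYear();
  // Relaxed mode: human-readable ISO-8601, but only inside the 1970-9999 range.
  if (!canonical && year >= 1970 && year <= 9999) {
    return {$date: date.toISOString()};
  }
  // Canonical mode, or a date outside that range: milliseconds since the epoch as a string.
  return {$date: {$numberLong: String(ms)}};
}

const created = new Date('2023-01-15T10:30:00.000Z');
console.log(JSON.stringify(encodeDate(created, true)));
// {"$date":{"$numberLong":"1673778600000"}}
console.log(JSON.stringify(encodeDate(created, false)));
// {"$date":"2023-01-15T10:30:00.000Z"}
```
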
## Implementation Details

- **High-Performance Binary Encoding**: Uses `Writer` and `Reader` directly to eliminate intermediate JSON string representations
- **Shared Value Classes**: Reuses existing BSON value classes from `src/bson/values.ts`
- **Strict Validation**: Rejects type wrappers with extra fields (e.g., `{"$oid": "...", "extra": "field"}` throws an error)
- **Round-trip Compatibility**: Ensures encoding → decoding preserves data integrity
- **Error Handling**: Comprehensive error messages for debugging
- **Specification Compliant**: Follows the MongoDB Extended JSON v2 specification exactly

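The strict-validation rule can be sketched for a single wrapper type. This is an illustration under stated assumptions, not the code in `EjsonDecoder.ts`: `decodeInt32Wrapper` is a hypothetical function that returns a plain number rather than a `BsonInt32`, and it reuses the `Invalid Int32 format` message from the tests even for the extra-field case, where the tests only require that something is thrown.

```typescript
// Hypothetical helper for illustration; the real decoder returns a BsonInt32.
function decodeInt32Wrapper(obj: Record<string, unknown>): number {
  const keys = Object.keys(obj);
  // Exact key match: the wrapper must contain "$numberInt" and nothing else.
  if (keys.length !== 1 || keys[0] !== '$numberInt') {
    throw new Error('Invalid Int32 format');
  }
  const raw = obj.$numberInt;
  // The payload must be a string holding a base-10 integer.
  if (typeof raw !== 'string' || !/^-?\d+$/.test(raw)) {
    throw new Error('Invalid Int32 format');
  }
  const value = Number(raw);
  // Reject values that do not fit in a signed 32-bit integer.
  if (value < -0x80000000 || value > 0x7fffffff) {
    throw new Error('Invalid Int32 format');
  }
  return value;
}

console.log(decodeInt32Wrapper({$numberInt: '42'})); // 42
```

Rejecting extra keys keeps an ordinary document that happens to contain a `$numberInt` field alongside other data from being silently collapsed into a number.
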
## Testing

Added 54 comprehensive tests covering:
- All BSON type encoding/decoding in both modes
- Round-trip compatibility testing
- Error handling and edge cases
- Special numeric values (Infinity, NaN)
- Date handling for different year ranges
- Malformed input validation

All existing tests continue to pass, ensuring no breaking changes.
Lines changed: 235 additions & 0 deletions
@@ -0,0 +1,235 @@
import {EjsonDecoder} from '../EjsonDecoder';
import {
  BsonBinary,
  BsonDbPointer,
  BsonDecimal128,
  BsonFloat,
  BsonInt32,
  BsonInt64,
  BsonJavascriptCode,
  BsonJavascriptCodeWithScope,
  BsonMaxKey,
  BsonMinKey,
  BsonObjectId,
  BsonSymbol,
  BsonTimestamp,
} from '../../bson/values';

describe('EjsonDecoder', () => {
  const decoder = new EjsonDecoder();

  test('decodes primitive values', () => {
    expect(decoder.decodeFromString('null')).toBe(null);
    expect(decoder.decodeFromString('true')).toBe(true);
    expect(decoder.decodeFromString('false')).toBe(false);
    expect(decoder.decodeFromString('"hello"')).toBe('hello');
    expect(decoder.decodeFromString('42')).toBe(42);
    expect(decoder.decodeFromString('3.14')).toBe(3.14);
  });

  test('decodes arrays', () => {
    expect(decoder.decodeFromString('[1, 2, 3]')).toEqual([1, 2, 3]);
    expect(decoder.decodeFromString('["a", "b"]')).toEqual(['a', 'b']);
  });

  test('decodes plain objects', () => {
    const result = decoder.decodeFromString('{"name": "John", "age": 30}');
    expect(result).toEqual({name: 'John', age: 30});
  });

  test('decodes ObjectId', () => {
    const result = decoder.decodeFromString('{"$oid": "507f1f77bcf86cd799439011"}') as BsonObjectId;
    expect(result).toBeInstanceOf(BsonObjectId);
    expect(result.timestamp).toBe(0x507f1f77);
    expect(result.process).toBe(0xbcf86cd799);
    expect(result.counter).toBe(0x439011);
  });

  test('throws on invalid ObjectId', () => {
    expect(() => decoder.decodeFromString('{"$oid": "invalid"}')).toThrow('Invalid ObjectId format');
    expect(() => decoder.decodeFromString('{"$oid": 123}')).toThrow('Invalid ObjectId format');
  });

  test('decodes Int32', () => {
    const result = decoder.decodeFromString('{"$numberInt": "42"}') as BsonInt32;
    expect(result).toBeInstanceOf(BsonInt32);
    expect(result.value).toBe(42);

    const negResult = decoder.decodeFromString('{"$numberInt": "-42"}') as BsonInt32;
    expect(negResult.value).toBe(-42);
  });

  test('throws on invalid Int32', () => {
    expect(() => decoder.decodeFromString('{"$numberInt": 42}')).toThrow('Invalid Int32 format');
    expect(() => decoder.decodeFromString('{"$numberInt": "2147483648"}')).toThrow('Invalid Int32 format');
    expect(() => decoder.decodeFromString('{"$numberInt": "invalid"}')).toThrow('Invalid Int32 format');
  });

  test('decodes Int64', () => {
    const result = decoder.decodeFromString('{"$numberLong": "9223372036854775807"}') as BsonInt64;
    expect(result).toBeInstanceOf(BsonInt64);
    // Note: this literal exceeds Number.MAX_SAFE_INTEGER, so JavaScript rounds it to 2^63.
    expect(result.value).toBe(9223372036854775807);
  });

  test('throws on invalid Int64', () => {
    expect(() => decoder.decodeFromString('{"$numberLong": 123}')).toThrow('Invalid Int64 format');
    expect(() => decoder.decodeFromString('{"$numberLong": "invalid"}')).toThrow('Invalid Int64 format');
  });

  test('decodes Double', () => {
    const result = decoder.decodeFromString('{"$numberDouble": "3.14"}') as BsonFloat;
    expect(result).toBeInstanceOf(BsonFloat);
    expect(result.value).toBe(3.14);

    const infResult = decoder.decodeFromString('{"$numberDouble": "Infinity"}') as BsonFloat;
    expect(infResult.value).toBe(Infinity);

    const negInfResult = decoder.decodeFromString('{"$numberDouble": "-Infinity"}') as BsonFloat;
    expect(negInfResult.value).toBe(-Infinity);

    const nanResult = decoder.decodeFromString('{"$numberDouble": "NaN"}') as BsonFloat;
    expect(isNaN(nanResult.value)).toBe(true);
  });

  test('throws on invalid Double', () => {
    expect(() => decoder.decodeFromString('{"$numberDouble": 3.14}')).toThrow('Invalid Double format');
    expect(() => decoder.decodeFromString('{"$numberDouble": "invalid"}')).toThrow('Invalid Double format');
  });

  test('decodes Decimal128', () => {
    const result = decoder.decodeFromString('{"$numberDecimal": "123.456"}') as BsonDecimal128;
    expect(result).toBeInstanceOf(BsonDecimal128);
    expect(result.data).toBeInstanceOf(Uint8Array);
    expect(result.data.length).toBe(16);
  });

  test('decodes Binary', () => {
    const result = decoder.decodeFromString('{"$binary": {"base64": "AQIDBA==", "subType": "00"}}') as BsonBinary;
    expect(result).toBeInstanceOf(BsonBinary);
    expect(result.subtype).toBe(0);
    expect(Array.from(result.data)).toEqual([1, 2, 3, 4]);
  });

  test('decodes UUID', () => {
    const result = decoder.decodeFromString('{"$uuid": "c8edabc3-f738-4ca3-b68d-ab92a91478a3"}') as BsonBinary;
    expect(result).toBeInstanceOf(BsonBinary);
    expect(result.subtype).toBe(4);
    expect(result.data.length).toBe(16);
  });

  test('throws on invalid UUID', () => {
    expect(() => decoder.decodeFromString('{"$uuid": "invalid-uuid"}')).toThrow('Invalid UUID format');
  });

  test('decodes Code', () => {
    const result = decoder.decodeFromString('{"$code": "function() { return 42; }"}') as BsonJavascriptCode;
    expect(result).toBeInstanceOf(BsonJavascriptCode);
    expect(result.code).toBe('function() { return 42; }');
  });

  test('decodes CodeWScope', () => {
    const result = decoder.decodeFromString(
      '{"$code": "function() { return x; }", "$scope": {"x": 42}}',
    ) as BsonJavascriptCodeWithScope;
    expect(result).toBeInstanceOf(BsonJavascriptCodeWithScope);
    expect(result.code).toBe('function() { return x; }');
    expect(result.scope).toEqual({x: 42});
  });

  test('decodes Symbol', () => {
    const result = decoder.decodeFromString('{"$symbol": "mySymbol"}') as BsonSymbol;
    expect(result).toBeInstanceOf(BsonSymbol);
    expect(result.symbol).toBe('mySymbol');
  });

  test('decodes Timestamp', () => {
    const result = decoder.decodeFromString('{"$timestamp": {"t": 1234567890, "i": 12345}}') as BsonTimestamp;
    expect(result).toBeInstanceOf(BsonTimestamp);
    expect(result.timestamp).toBe(1234567890);
    expect(result.increment).toBe(12345);
  });

  test('throws on invalid Timestamp', () => {
    expect(() => decoder.decodeFromString('{"$timestamp": {"t": -1, "i": 12345}}')).toThrow('Invalid Timestamp format');
    expect(() => decoder.decodeFromString('{"$timestamp": {"t": 123, "i": -1}}')).toThrow('Invalid Timestamp format');
  });

  test('decodes RegularExpression', () => {
    const result = decoder.decodeFromString('{"$regularExpression": {"pattern": "test", "options": "gi"}}') as RegExp;
    expect(result).toBeInstanceOf(RegExp);
    expect(result.source).toBe('test');
    expect(result.flags).toBe('gi');
  });

  test('decodes DBPointer', () => {
    const result = decoder.decodeFromString(
      '{"$dbPointer": {"$ref": "collection", "$id": {"$oid": "507f1f77bcf86cd799439011"}}}',
    ) as BsonDbPointer;
    expect(result).toBeInstanceOf(BsonDbPointer);
    expect(result.name).toBe('collection');
    expect(result.id).toBeInstanceOf(BsonObjectId);
  });

  test('decodes Date (ISO format)', () => {
    const result = decoder.decodeFromString('{"$date": "2023-01-01T00:00:00.000Z"}') as Date;
    expect(result).toBeInstanceOf(Date);
    expect(result.toISOString()).toBe('2023-01-01T00:00:00.000Z');
  });

  test('decodes Date (canonical format)', () => {
    const result = decoder.decodeFromString('{"$date": {"$numberLong": "1672531200000"}}') as Date;
    expect(result).toBeInstanceOf(Date);
    expect(result.getTime()).toBe(1672531200000);
  });

  test('throws on invalid Date', () => {
    expect(() => decoder.decodeFromString('{"$date": "invalid-date"}')).toThrow('Invalid Date format');
    expect(() => decoder.decodeFromString('{"$date": {"$numberLong": "invalid"}}')).toThrow('Invalid Date format');
  });

  test('decodes MinKey', () => {
    const result = decoder.decodeFromString('{"$minKey": 1}');
    expect(result).toBeInstanceOf(BsonMinKey);
  });

  test('decodes MaxKey', () => {
    const result = decoder.decodeFromString('{"$maxKey": 1}');
    expect(result).toBeInstanceOf(BsonMaxKey);
  });

  test('decodes undefined', () => {
    const result = decoder.decodeFromString('{"$undefined": true}');
    expect(result).toBeUndefined();
  });

  test('decodes DBRef', () => {
    const result = decoder.decodeFromString(
      '{"$ref": "collection", "$id": {"$oid": "507f1f77bcf86cd799439011"}, "$db": "database"}',
    ) as Record<string, unknown>;
    expect(result.$ref).toBe('collection');
    expect(result.$id).toBeInstanceOf(BsonObjectId);
    expect(result.$db).toBe('database');
  });

  test('decodes nested objects with Extended JSON types', () => {
    const json = '{"name": "test", "count": {"$numberInt": "42"}, "timestamp": {"$date": "2023-01-01T00:00:00.000Z"}}';
    const result = decoder.decodeFromString(json) as Record<string, unknown>;

    expect(result.name).toBe('test');
    expect(result.count).toBeInstanceOf(BsonInt32);
    expect((result.count as BsonInt32).value).toBe(42);
    expect(result.timestamp).toBeInstanceOf(Date);
  });

  test('handles objects with $ keys that are not type wrappers', () => {
    const result = decoder.decodeFromString('{"$unknown": "value", "$test": 123}') as Record<string, unknown>;
    expect(result.$unknown).toBe('value');
    expect(result.$test).toBe(123);
  });

  test('throws on malformed type wrappers', () => {
    expect(() => decoder.decodeFromString('{"$numberInt": "42", "extra": "field"}')).toThrow();
    expect(() => decoder.decodeFromString('{"$binary": "invalid"}')).toThrow();
    expect(() => decoder.decodeFromString('{"$timestamp": {"t": "invalid"}}')).toThrow();
  });
});
