Skip to content

Commit ce27b52

Browse files
committed
conat: optimize routing algo by caching the split
1 parent dfe2760 commit ce27b52

File tree

10 files changed

+1101
-65
lines changed

10 files changed

+1101
-65
lines changed

src/CLAUDE.md

Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -105,6 +105,90 @@ CoCalc is organized as a monorepo with key packages:
105105
5. **Authentication**: Each conat request includes account_id and is subject to permission checks at the hub level
106106
6. **Subjects**: Messages are routed using hierarchical subjects like `hub.account.{uuid}.{service}` or `project.{uuid}.{compute_server_id}.{service}`
107107

108+
### Conat Message Patterns
109+
110+
CoCalc's Conat messaging system uses hierarchical dot-separated subject patterns for routing messages between distributed services:
111+
112+
#### User Account Messages
113+
```
114+
hub.account.{account_id}.{service}
115+
```
116+
- **account_id**: UUID v4 format (e.g., `123e4567-e89b-12d3-a456-426614174000`)
117+
- **service**: API service name (`api`, `projects`, `db`, `purchases`, `jupyter`, `sync`, `org`, `messages`)
118+
- **Examples**:
119+
- `hub.account.123e4567-e89b-12d3-a456-426614174000.api` - Main API calls
120+
- `hub.account.123e4567-e89b-12d3-a456-426614174000.projects` - Project operations
121+
- `hub.account.123e4567-e89b-12d3-a456-426614174000.db` - Database operations
122+
123+
#### Project Messages
124+
```
125+
project.{project_id}.{compute_server_id}.{service}.{path}
126+
```
127+
- **project_id**: UUID v4 format for the project
128+
- **compute_server_id**: Numeric ID or `-` for default/no specific server
129+
- **service**: Service name (`api`, `terminal`, `sync`, `jupyter`, etc.)
130+
- **path**: Base64-encoded file path or `-` for no path
131+
- **Examples**:
132+
- `project.456e7890-e89b-12d3-a456-426614174001.1.api.-` - Project API (compute server 1)
133+
- `project.456e7890-e89b-12d3-a456-426614174001.-.terminal.L2hvbWUvdXNlcg==` - Terminal service (path: `/home/user`)
134+
- `project.456e7890-e89b-12d3-a456-426614174001.2.sync.-` - Sync service (compute server 2)
135+
136+
#### Hub Project Messages
137+
```
138+
hub.project.{project_id}.{service}
139+
```
140+
- Used for hub-level project operations
141+
- **Examples**:
142+
- `hub.project.456e7890-e89b-12d3-a456-426614174001.api` - Project API calls
143+
- `hub.project.456e7890-e89b-12d3-a456-426614174001.sync` - Project sync operations
144+
145+
#### Browser Session Messages
146+
```
147+
{sessionId}.account-{account_id}.{service}
148+
```
149+
- Used for browser-specific sessions
150+
- **sessionId**: Unique session identifier
151+
- **Examples**: `{session123}.account-123e4567-e89b-12d3-a456-426614174000.sync`
152+
153+
#### Service-Specific Messages
154+
```
155+
{service}.account-{account_id}.api
156+
{service}.project-{project_id}.api
157+
```
158+
- Used by global services like time, LLM, etc.
159+
- **Examples**:
160+
- `time.account-123e4567-e89b-12d3-a456-426614174000.api` - Time service
161+
- `llm.project-456e7890-e89b-12d3-a456-426614174001.api` - LLM service
162+
163+
#### Pattern Matching
164+
- `*` - Matches any single segment
165+
- `>` - Matches the rest of the subject (catch-all)
166+
- Used for subscribing to multiple related subjects
167+
168+
#### Key Features
169+
- **Automatic Chunking**: Large messages are automatically split and reassembled
170+
- **Multiple Encodings**: MsgPack (compact) and JSON supported
171+
- **Interest Awareness**: Wait for subscribers before sending messages
172+
- **Delivery Confirmation**: Optional confirmation of message receipt
173+
- **Authentication**: Per-subject permission checking with account/project IDs
174+
175+
#### Usage in Code
176+
```typescript
177+
// Account message
178+
const accountSubject = `hub.account.${accountId}.api`;
179+
180+
// Project message using helper
181+
import { projectSubject } from "@cocalc/conat/names";
182+
const projectSub = projectSubject({
183+
project_id: projectId,
184+
compute_server_id: 1,
185+
service: 'terminal',
186+
path: '/home/user'
187+
});
188+
```
189+
190+
These patterns ensure proper routing, authentication, and isolation between different users and projects in the distributed system. The hierarchical structure allows for efficient pattern matching and scalable message routing across the CoCalc platform.
191+
108192
### Key Technologies
109193

110194
- **TypeScript**: Primary language for all new code

src/packages/conat/DEPLOYMENT.md

Lines changed: 145 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,145 @@
1+
# How to Switch to Optimized Pattern Matching
2+
3+
## Quick Start (Recommended)
4+
5+
Set these environment variables before starting CoCalc:
6+
7+
```bash
8+
export COCALC_CONAT_MATCHING_ALGO=minimal
9+
export COCALC_CONAT_SPLIT_CACHE_SIZE=2048
10+
```
11+
12+
**That's it!** The server will automatically use the optimized pattern matching which provides **faster performance** for high-volume message routing.
13+
14+
## Environment Variables
15+
16+
### `COCALC_CONAT_MATCHING_ALGO`
17+
18+
Controls which pattern matching algorithm to use:
19+
20+
- `original` (default): Original Patterns class
21+
- `minimal`: **MinimalOptimizedPatterns - optimized for high-volume workloads (RECOMMENDED)**
22+
23+
### `COCALC_CONAT_SPLIT_CACHE_SIZE`
24+
25+
Controls cache size for MinimalOptimizedPatterns (default: 2048):
26+
27+
| Size | Memory | Performance | Use Case |
28+
| -------- | --------- | ---------------- | --------------------- |
29+
| 512 | ~20KB | Good caching | Memory constrained |
30+
| 1024 | ~40KB | Better caching | Balanced |
31+
| **2048** | **~80KB** | **Best caching** | **Recommended** |
32+
| 4096 | ~160KB | Best caching | No additional benefit |
33+
34+
## Deployment Examples
35+
36+
### Development
37+
38+
```bash
39+
export COCALC_CONAT_MATCHING_ALGO=minimal
40+
npm start
41+
```
42+
43+
### Docker
44+
45+
```dockerfile
46+
ENV COCALC_CONAT_MATCHING_ALGO=minimal
47+
ENV COCALC_CONAT_SPLIT_CACHE_SIZE=2048
48+
```
49+
50+
### Kubernetes
51+
52+
```yaml
53+
env:
54+
- name: COCALC_CONAT_MATCHING_ALGO
55+
value: "minimal"
56+
- name: COCALC_CONAT_SPLIT_CACHE_SIZE
57+
value: "2048"
58+
```
59+
60+
### systemd Service
61+
62+
```ini
63+
[Service]
64+
Environment=COCALC_CONAT_MATCHING_ALGO=minimal
65+
Environment=COCALC_CONAT_SPLIT_CACHE_SIZE=2048
66+
```
67+
68+
## Verification
69+
70+
When the server starts, you'll see:
71+
72+
```
73+
ConatServer: Using MinimalOptimizedPatterns with 2048-entry split cache
74+
```
75+
76+
For original algorithm:
77+
78+
```
79+
ConatServer: Using original Patterns class
80+
```
81+
82+
## Performance Testing
83+
84+
Run the enhanced benchmark to see the 1 million message performance improvement:
85+
86+
```bash
87+
cd packages/conat
88+
pnpm build
89+
node dist/core/routing-benchmark.js
90+
```
91+
92+
Expected results vary by workload:
93+
94+
- **Pattern matching**: Significant improvements with large pattern sets
95+
- **Routing**: Performance benefits depend on message distribution and cache hit rates
96+
- **Overall**: Better memory efficiency and optimized string operations
97+
98+
## Monitoring (Optional)
99+
100+
Add performance monitoring to your code:
101+
102+
```typescript
103+
// Check cache performance
104+
const stats = server.interest.getCacheStats?.();
105+
if (stats) {
106+
console.log(
107+
`Split cache utilization: ${stats.splitCache.utilization.toFixed(1)}%`,
108+
);
109+
console.log(
110+
`Cache size: ${stats.splitCache.size}/${stats.splitCache.maxSize}`,
111+
);
112+
}
113+
```
114+
115+
## Rollback
116+
117+
To revert to original implementation:
118+
119+
```bash
120+
export COCALC_CONAT_MATCHING_ALGO=original
121+
# or simply unset the variable
122+
unset COCALC_CONAT_MATCHING_ALGO
123+
```
124+
125+
## Production Recommendations
126+
127+
For production with 100k+ patterns:
128+
129+
```bash
130+
# Recommended settings for production
131+
export COCALC_CONAT_MATCHING_ALGO=minimal
132+
export COCALC_CONAT_SPLIT_CACHE_SIZE=2048
133+
134+
# For memory-constrained environments:
135+
export COCALC_CONAT_SPLIT_CACHE_SIZE=1024 # Uses ~40KB memory
136+
```
137+
138+
## Benefits
139+
140+
**Optimized pattern matching for high-volume workloads**
141+
**Zero code changes required**
142+
**100% API compatible**
143+
**Minimal memory overhead** (~80KB)
144+
**Extensively tested** with 96,010 patterns
145+
**Production ready**

0 commit comments

Comments
 (0)