Commit ad21ca3

Merge remote-tracking branch 'origin/docs/mcap-compression-guide' into dev
2 parents 6ea42bd + ad2778d commit ad21ca3

1 file changed

+258
-0
lines changed

docs/usage/files/compression.md

Lines changed: 258 additions & 0 deletions
@@ -0,0 +1,258 @@
1+
# MCAP File Compression Best Practices
2+
3+
This guide explains how to efficiently compress MCAP files for storage in Kleinkram while maintaining optimal performance for data access and processing.
4+
5+
## TL;DR
6+
7+
**DO**: Use MCAP's built-in chunk compression
8+
**DON'T**: Compress entire files with `.zst` or `.gz`
9+
10+
## Why MCAP Chunk Compression?
11+
12+
MCAP files support internal **chunk-level compression** that provides:
13+
14+
- **Storage savings**: typically 3-5x smaller files, depending on how compressible the message data is
15+
- **Random access**: Read specific topics without decompressing everything
16+
- **Transparent decompression**: Libraries handle it automatically
17+
- **Industry standard**: Recommended by Foxglove and ROS 2 community
18+
19+
## Recommended Workflow
20+
21+
### On the Robot
22+
23+
Record with the `fastwrite` profile for maximum performance during data collection:
24+
25+
```bash
26+
ros2 bag record --storage mcap --storage-preset-profile fastwrite \
27+
/camera/image /imu/data /lidar/scan
28+
```
29+
30+
**What this does:**
31+
- Writes uncompressed data for speed
32+
- Minimal CPU usage during recording
33+
- No CRC calculation (faster writes)
34+
- No message index (added later)
35+
36+
### Post-Processing (Offline)
37+
38+
After recording, compress the files before uploading to Kleinkram:
39+
40+
```bash
41+
# Add chunk compression and a message index. convert.yaml describes the output bag:
42+
# storage_id: mcap with storage_preset_profile: zstd_small (chunk-level zstd)
43+
ros2 bag convert -i input.mcap -o convert.yaml
44+
```
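
A conversion like this is driven by an output-options YAML file. The following is a minimal sketch, assuming a rosbag2 version whose `ros2 bag convert` output options accept `storage_preset_profile`. The helper name `write_convert_options`, the file name `convert.yaml`, and the output URI `output` are hypothetical, and newer ROS 2 distributions may expect `all_topics: true` instead of `all: true`.

```bash
# Hypothetical helper: write rosbag2 output options to the given YAML path.
write_convert_options() {
  local yaml_path="$1" output_uri="$2"
  cat > "$yaml_path" <<EOF
output_bags:
  - uri: ${output_uri}
    storage_id: mcap
    storage_preset_profile: zstd_small
    all: true
EOF
}

write_convert_options convert.yaml output

# Run the conversion only where the ROS 2 CLI and the input file exist.
if command -v ros2 >/dev/null 2>&1 && [ -f input.mcap ]; then
  ros2 bag convert -i input.mcap -o convert.yaml
fi
```

Keeping `compression_mode` out of the options file matters here: that setting applies rosbag2's own file- or message-level compression rather than MCAP chunk compression.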
45+
46+
**Alternative using MCAP CLI:**
47+
48+
```bash
49+
# Install MCAP CLI
50+
brew install mcap  # or download a prebuilt binary from the MCAP GitHub releases page
51+
52+
# Compress with zstd (recommended)
53+
mcap compress input.mcap -o output.mcap --compression zstd --chunk-size 4194304
54+
55+
# Or use LZ4 for faster decompression
56+
mcap compress input.mcap -o output.mcap --compression lz4 --chunk-size 4194304
57+
```
58+
59+
### Upload to Kleinkram
60+
61+
```bash
62+
# Upload compressed MCAP (file extension is still .mcap)
63+
klein upload output.mcap --mission "my-mission" --project "my-project"
64+
```
65+
66+
## Compression Formats
67+
68+
MCAP supports two compression algorithms:
69+
70+
### Zstd (Recommended)
71+
72+
```bash
73+
mcap compress input.mcap -o output.mcap --compression zstd
74+
```
75+
76+
**Pros:**
77+
- Best compression ratios (3-5x savings)
78+
- Good decompression speed
79+
- Industry standard
80+
81+
**Use when:** Storage cost is a priority
82+
83+
### LZ4 (Alternative)
84+
85+
```bash
86+
mcap compress input.mcap -o output.mcap --compression lz4
87+
```
88+
89+
**Pros:**
90+
- Faster decompression (~2x faster than zstd)
91+
- Still good compression (2-3x savings)
92+
93+
**Use when:** Decompression speed is critical
94+
95+
## Storage Preset Profiles
96+
97+
ROS 2 provides several preset profiles for different scenarios:
98+
99+
### `fastwrite` (Recording)
100+
Best for on-robot recording:
101+
```bash
102+
ros2 bag record --storage mcap --storage-preset-profile fastwrite <topics>
103+
```
104+
- No compression
105+
- No CRC
106+
- No message index
107+
- Minimal CPU/memory usage
108+
109+
### `zstd_fast` (Balanced)
110+
Good balance of speed and compression:
111+
```bash
112+
ros2 bag record --storage mcap --storage-preset-profile zstd_fast <topics>
113+
```
114+
- Zstd compression at its fastest level (lowest ratio)
115+
- No CRC calculation
116+
- Good throughput
117+
118+
### `zstd_small` (Maximum Compression)
119+
Best compression, slower writes:
120+
```bash
121+
ros2 bag record --storage mcap --storage-preset-profile zstd_small <topics>
122+
```
123+
- Zstd compression (highest ratio)
124+
- 4MB chunks
125+
- Maximum storage savings
126+
127+
## Chunk Size Configuration
128+
129+
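Larger chunk sizes generally provide better compression, but a reader must decompress a whole chunk to reach any message in it, so very large chunks cost more time and memory on random access: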
130+
131+
```bash
132+
# Small chunks (good for random access): 1 MiB
133+
mcap compress input.mcap -o output.mcap --compression zstd --chunk-size 1048576
134+
135+
# Medium chunks (balanced - recommended): 4 MiB
136+
mcap compress input.mcap -o output.mcap --compression zstd --chunk-size 4194304
137+
138+
# Large chunks (best compression): 8 MiB
139+
mcap compress input.mcap -o output.mcap --compression zstd --chunk-size 8388608
140+
```
141+
142+
**Recommendation:** Use 4MB chunks for a good balance of compression ratio and read performance.
143+
144+
## What NOT to Do
145+
146+
### ❌ File-Level Compression
147+
148+
**Don't compress entire MCAP files:**
149+
150+
```bash
151+
# ❌ BAD - loses random access
152+
zstd data.mcap # Creates data.mcap.zst
153+
gzip data.mcap # Creates data.mcap.gz
154+
```
155+
156+
**Why this is problematic:**
157+
- Cannot read specific topics without full decompression
158+
- Actions must decompress entire file (slow, lots of disk I/O)
159+
- Not the standard practice in the community
160+
- Kleinkram can't extract topic metadata without decompression
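
A pre-upload check can catch file-level compression by inspecting the first bytes of a file: an MCAP file starts with the magic bytes `89 4d 43 41 50 30 0d 0a`, a Zstandard frame with `28 b5 2f fd`, and a gzip stream with `1f 8b`. The sketch below is illustrative only; `detect_container` is a hypothetical helper and not part of Kleinkram or the MCAP CLI.

```bash
# Classify a file by its leading magic bytes.
detect_container() {
  local file="$1" magic
  # First 8 bytes as lowercase hex, with spaces and newlines removed.
  magic="$(head -c 8 "$file" | od -An -tx1 | tr -d ' \n')"
  case "$magic" in
    894d434150300d0a) echo "mcap" ;;        # MCAP magic
    28b52ffd*)        echo "zstd-file" ;;   # Zstandard frame
    1f8b*)            echo "gzip-file" ;;   # gzip stream
    *)                echo "unknown" ;;
  esac
}
```

A result of `mcap` only shows that the file is not wrapped in a file-level compressor. Whether its chunks are compressed still has to be checked with `mcap info`.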
161+
162+
## How Kleinkram Handles Compressed Files
163+
164+
When you upload a chunk-compressed MCAP:
165+
166+
1. **Upload**: File uploaded directly to storage (already compressed)
167+
2. **Topic Extraction**: Kleinkram's queue consumer:
168+
- Downloads the file
169+
- MCAP library automatically decompresses chunks as needed
170+
- Extracts topic metadata (names, types, message counts, frequencies)
171+
- Stores metadata in database
172+
3. **Storage**: Compressed MCAP stored as-is
173+
4. **Actions**: Your action containers:
174+
- Download compressed MCAP
175+
- MCAP libraries handle decompression transparently
176+
- Can efficiently read specific topics (random access works!)
177+
178+
## Verifying Compression
179+
180+
Check if your MCAP is compressed:
181+
182+
```bash
183+
# Using MCAP CLI
184+
mcap info data.mcap
185+
186+
# Look for:
187+
# compression: zstd (or lz4)
188+
# chunk count: > 0
189+
```
190+
191+
Example output:
192+
```
193+
library:
194+
profile:
195+
messages: 45123
196+
duration: 1m23.456s
197+
start: 2024-01-15T10:30:00Z
198+
end: 2024-01-15T10:31:23Z
199+
compression: zstd
200+
chunk count: 42
201+
```
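
In a script, the same check can be approximated by searching the `mcap info` text for a compression format. This is a rough sketch: the exact layout of `mcap info` output varies between CLI versions, so the hypothetical helper `mentions_chunk_compression` only looks for the words `zstd` or `lz4` on its standard input.

```bash
# Succeeds if the text on stdin mentions zstd or lz4 as a whole word.
mentions_chunk_compression() {
  grep -Eiqw 'zstd|lz4'
}

# Run the check only where the MCAP CLI and the file are present.
if command -v mcap >/dev/null 2>&1 && [ -f data.mcap ]; then
  if mcap info data.mcap | mentions_chunk_compression; then
    echo "data.mcap appears to be chunk-compressed"
  else
    echo "data.mcap does not appear to be chunk-compressed"
  fi
fi
```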
202+
203+
## Performance Comparison
204+
205+
Illustrative example with a 10 GB dataset (actual ratios depend on the data):
206+
207+
| Method | File Size | Upload Time | Storage Cost | Action Access |
208+
|--------|-----------|-------------|--------------|---------------|
209+
| Uncompressed | 10 GB | Slow | High (1x) | Fast ✅ |
210+
| Chunk Compressed (zstd) | 2 GB | Fast | Low (5x savings) | Fast ✅ |
211+
| File-level .zst | 2 GB | Fast | Low (5x savings) | Slow ❌ |
212+
213+
## FAQs
214+
215+
### Q: Do I need to decompress chunk-compressed MCAPs before uploading?
216+
217+
**A:** No! Upload them directly. Kleinkram's libraries handle decompression automatically.
218+
219+
### Q: Will my actions need to decompress files?
220+
221+
**A:** No! MCAP libraries (`@mcap/core`, Python `mcap`, C++ `mcap`, etc.) handle decompression transparently. Your action code doesn't change.
222+
223+
### Q: Can I still seek to specific timestamps/topics?
224+
225+
**A:** Yes! Chunk compression keeps the MCAP chunk and message indexes, so readers decompress only the chunks they need and random access keeps working.
226+
227+
### Q: What if I have `.mcap.zst` files?
228+
229+
**A:** These are file-level compressed. Consider recompressing with chunk compression:
230+
231+
```bash
232+
# Decompress
233+
zstd -d data.mcap.zst
234+
235+
# Recompress with chunk compression
236+
mcap compress data.mcap -o data-chunked.mcap --compression zstd --chunk-size 4194304
237+
```
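
When many `.mcap.zst` files need converting, it helps to generate the commands first and review them. The sketch below is a dry run that only prints commands; `recompress_plan` and the `.chunked.mcap` suffix are hypothetical choices, and it assumes `mcap compress` accepts `-o` for the output path.

```bash
# Print (not run) the commands needed to turn each *.mcap.zst argument
# into a chunk-compressed MCAP next to it.
recompress_plan() {
  local src plain
  for src in "$@"; do
    plain="${src%.zst}"   # data.mcap.zst -> data.mcap
    printf 'zstd -d --keep "%s" -o "%s"\n' "$src" "$plain"
    printf 'mcap compress "%s" -o "%s" --compression zstd\n' \
      "$plain" "${plain%.mcap}.chunked.mcap"
  done
}

recompress_plan data.mcap.zst
```

Once the printed plan looks right, its output can be piped to `bash` to execute it.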
238+
239+
### Q: Can I compress during recording?
240+
241+
**A:** Yes, but it may impact recording performance on resource-constrained robots. Use `--storage-preset-profile zstd_fast` for real-time compression with minimal overhead.
242+
243+
## Additional Resources
244+
245+
- [MCAP Specification](https://mcap.dev/spec)
246+
- [Understanding MCAP Chunk Size and Compression](https://foxglove.dev/blog/understanding-mcap-chunk-size-and-compression)
247+
- [ROS 2 rosbag2_storage_mcap Documentation](https://docs.ros.org/en/humble/p/rosbag2_storage_mcap/)
248+
- [MCAP CLI Guide](https://mcap.dev/guides/cli)
249+
250+
## Summary
251+
252+
For optimal performance with Kleinkram:
253+
254+
1. ✅ Record with `fastwrite` profile on robot
255+
2. ✅ Post-process with `mcap compress` or `ros2 bag convert`
256+
3. ✅ Upload chunk-compressed `.mcap` files
257+
4. ✅ Enjoy fast uploads, storage savings, and efficient actions
258+
5. ❌ Don't use file-level `.zst` or `.gz` compression
