| 1 | +# DawDreamer Pickle Format Documentation |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +DawDreamer supports pickling (serialization) of `RenderEngine` and all processor types using Python's `pickle` module. This allows you to save and restore complete audio processing graphs, including all processor state, parameters, automation curves, and MIDI events. |
| 6 | + |
| 7 | +## Basic Usage |
| 8 | + |
| 9 | +```python |
| 10 | +import pickle |
| 11 | +import dawdreamer as daw |
| 12 | + |
| 13 | +# Create and configure a RenderEngine |
| 14 | +engine = daw.RenderEngine(44100, 512) |
| 15 | +# ... configure processors and graph ... |
| 16 | +engine.render(5.0)  # duration in seconds |
| 17 | + |
| 18 | +# Serialize to bytes |
| 19 | +pickled_bytes = pickle.dumps(engine) |
| 20 | + |
| 21 | +# Deserialize |
| 22 | +restored_engine = pickle.loads(pickled_bytes) |
| 23 | + |
| 24 | +# Or save/load from file |
| 25 | +with open("my_session.pkl", "wb") as f: |
| 26 | + pickle.dump(engine, f) |
| 27 | + |
| 28 | +with open("my_session.pkl", "rb") as f: |
| 29 | + restored_engine = pickle.load(f) |
| 30 | +``` |
| 31 | + |
| 32 | +## Supported Processor Types |
| 33 | + |
| 34 | +`RenderEngine` and all processor types support pickling: |
| 35 | + |
| 36 | +- **RenderEngine** - Complete audio graph with all processors |
| 37 | +- **PlaybackProcessor** - Audio data preserved as numpy arrays |
| 38 | +- **FaustProcessor** - DSP code, parameters, polyphony settings, MIDI, automation |
| 39 | +- **PluginProcessor** - Plugin path, VST state blob, MIDI events |
| 40 | +- **SamplerProcessor** - Sample data, parameters, MIDI events |
| 41 | +- **OscillatorProcessor** - Frequency setting |
| 42 | +- **FilterProcessor** - Type, frequency, Q, gain |
| 43 | +- **CompressorProcessor** - Threshold, ratio, attack, release |
| 44 | +- **ReverbProcessor** - Room size, damping, wet/dry, width, freeze |
| 45 | +- **PannerProcessor** - Panning rule, pan value |
| 46 | +- **DelayProcessor** - Mode, delay time, feedback |
| 47 | +- **AddProcessor** - Gain levels for each input |
| 48 | + |
| 49 | +## What Gets Preserved |
| 50 | + |
| 51 | +### RenderEngine State |
| 52 | +- Sample rate |
| 53 | +- Buffer size |
| 54 | +- BPM (single value or automation array) |
| 55 | +- PPQN (pulses per quarter note) |
| 56 | +- Audio processing graph structure |
| 57 | +- All processor instances and their connections |
| 58 | + |
| 59 | +### Processor State (Common to All) |
| 60 | +- Unique name |
| 61 | +- Sample rate |
| 62 | +- Parameter values |
| 63 | +- Parameter automation curves (numpy arrays) |
| 64 | + |
| 65 | +### Processor-Specific State |
| 66 | + |
| 67 | +#### PlaybackProcessor |
| 68 | +- Complete audio data as numpy array (shape: [channels, samples]) |
| 69 | + |
| 70 | +#### FaustProcessor |
| 71 | +- DSP source code string |
| 72 | +- Faust library paths |
| 73 | +- Compiled state (automatically recompiled on restore) |
| 74 | +- Polyphony settings: `num_voices`, `group_voices`, `dynamic_voices`, `release_length` |
| 75 | +- All parameter values and automation curves |
| 76 | +- MIDI events (both beat-based and second-based) |
| 77 | + |
| 78 | +#### PluginProcessor |
| 79 | +- Plugin file path |
| 80 | +- Plugin state blob (VST/AU internal state via `getStateInformation()`) |
| 81 | +- Parameter values (restored from plugin state) |
| 82 | +- MIDI events (both beat-based and second-based) |
| 83 | + |
| 84 | +#### SamplerProcessor |
| 85 | +- Original sample data (non-upsampled) |
| 86 | +- All sampler parameters (attack, decay, sustain, release, etc.) |
| 87 | +- MIDI events (both beat-based and second-based) |
| 88 | + |
| 89 | +## MIDI Serialization Format |
| 90 | + |
| 91 | +MIDI events are serialized using a binary format for efficiency: |
| 92 | + |
| 93 | +### Binary Format (per MIDI message) |
| 94 | +``` |
| 95 | +[Sample Position: 4 bytes, big-endian] |
| 96 | +[Message Size: 2 bytes, big-endian] |
| 97 | +[Message Data: variable length] |
| 98 | +``` |
| 99 | + |
| 100 | +- **Sample Position** (int32): Sample-accurate timing position |
| 101 | +- **Message Size** (uint16): Number of bytes in MIDI message |
| 102 | +- **Message Data**: Raw MIDI bytes (typically 3 bytes for note on/off) |
| 103 | + |
| 104 | +### MIDI Buffer Types |
| 105 | +Each processor with MIDI support maintains two separate buffers: |
| 106 | +- **myMidiBufferQN**: Beat-based events (quarter notes) |
| 107 | +- **myMidiBufferSec**: Second-based events (absolute time) |
| 108 | + |
| 109 | +Both buffers are preserved independently during pickling. |
| 110 | + |
| 111 | +## Implementation Details |
| 112 | + |
| 113 | +### Placement New Pattern |
| 114 | +DawDreamer uses nanobind's placement new pattern for reconstruction: |
| 115 | + |
| 116 | +```cpp |
| 117 | +void setPickleState(nb::dict state) { |
| 118 | + // Extract parameters from state dictionary |
| 119 | + std::string name = nb::cast<std::string>(state["unique_name"]); |
| 120 | + // ... extract other parameters ... |
| 121 | + |
| 122 | + // Reconstruct object in-place using placement new |
| 123 | + new (this) ProcessorType(name, ...); |
| 124 | + |
| 125 | + // Restore additional state |
| 126 | + // ... restore parameters, MIDI, etc. ... |
| 127 | +} |
| 128 | +``` |
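| 129 | + |
| 130 | +nanobind requires this pattern: during unpickling, the `__setstate__` binding receives an instance whose C++ object has not been constructed yet, so the object must be built in place before further state is restored. |
| 131 | + |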
| 132 | +### Parameter Automation Restoration |
| 133 | +Automation curves are restored **after** the audio graph is compiled: |
| 134 | + |
| 135 | +1. During pickle: Save automation arrays with processor name and parameter name |
| 136 | +2. During unpickle: Build processor map during graph restoration |
| 137 | +3. After graph compilation: Apply saved automation to processors by name |
| 138 | + |
| 139 | +This deferred restoration is necessary because: |
| 140 | +- Processors may be reconstructed during unpickling |
| 141 | +- Graph compilation creates the final processor instances |
| 142 | +- Automation must be applied to the compiled processor instances |
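
The three steps above can be sketched with plain Python objects. This is a toy model of name-based deferred restoration, not DawDreamer's implementation; `ToyProcessor`, `save_automation`, and `restore_automation` are hypothetical names used only for illustration.

```python
class ToyProcessor:
    """Stand-in for a processor; it only tracks its name and automation curves."""

    def __init__(self, name, automation=None):
        self.name = name
        self.automation = dict(automation or {})


def save_automation(processors):
    """Step 1: record (processor name, parameter name, curve) triples."""
    return [
        (proc.name, param, list(curve))
        for proc in processors
        for param, curve in proc.automation.items()
    ]


def restore_automation(compiled_processors, saved):
    """Steps 2-3: map names to the final instances, then apply the curves."""
    by_name = {proc.name: proc for proc in compiled_processors}
    for proc_name, param, curve in saved:
        by_name[proc_name].automation[param] = curve


original = [ToyProcessor("filter", {"freq": [200.0, 400.0, 800.0]})]
saved = save_automation(original)

# Fresh instances stand in for the processors created by graph compilation.
compiled = [ToyProcessor("filter")]
restore_automation(compiled, saved)
```

Keying the saved curves by name rather than by object reference is what lets them survive the replacement of processor instances during graph compilation.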
| 143 | + |
| 144 | +### VST/AU Plugin State |
| 145 | +Plugins use JUCE's binary state format: |
| 146 | +- `getStateInformation()` creates a binary blob of plugin state |
| 147 | +- `setStateInformation()` restores plugin from binary blob |
| 148 | +- This format is plugin-specific and opaque to DawDreamer |
| 149 | +- Parameter values are automatically restored from plugin state |
| 150 | + |
| 151 | +## Limitations and Caveats |
| 152 | + |
| 153 | +### 1. Platform and Architecture |
| 154 | +- **Plugin paths must be valid** on the restore system |
| 155 | +- VST/AU plugins must be installed at the same paths |
| 156 | +- Plugin state blobs are generally portable but plugin-specific |
| 157 | +- Cross-platform compatibility depends on plugin implementation |
| 158 | + |
| 159 | +### 2. Faust Processors |
| 160 | +- DSP code is preserved and **recompiled on restore** |
| 161 | +- Faust libraries must be available at restore time |
| 162 | +- `faust_libraries_paths` should point to valid locations |
| 163 | +- Compilation may fail if Faust version differs significantly |
| 164 | + |
| 165 | +### 3. Sample Rate and Buffer Size |
| 166 | +- Sample rate is preserved and enforced |
| 167 | +- Buffer size changes are supported but may affect timing |
| 168 | +- Audio data is NOT resampled automatically |
| 169 | + |
| 170 | +### 4. File References |
| 171 | +- **PlaybackProcessor**: Audio data is embedded (can be large) |
| 172 | +- **PluginProcessor**: Only plugin path is stored (plugin must exist) |
| 173 | +- **SamplerProcessor**: Sample data is embedded |
| 174 | +- No automatic path resolution or file tracking |
| 175 | + |
| 176 | +### 5. Automation Timing |
| 177 | +- Automation arrays are sample-accurate |
| 178 | +- Changing sample rate after restore will affect timing |
| 179 | +- BPM automation is preserved as-is (no time-stretching) |
| 180 | + |
| 181 | +### 6. MIDI Timing |
| 182 | +- MIDI events use sample positions (sample-accurate) |
| 183 | +- Beat-based MIDI depends on BPM at restore time |
| 184 | +- No MIDI time-stretching or quantization is applied |
| 185 | + |
| 186 | +### 7. Real-time State |
| 187 | +- Audio processing state (buffers, delay lines) is **NOT** preserved |
| 188 | +- Processors are reset to initial state after restore |
| 189 | +- No continuation of reverb tails, delay feedback, etc. |
| 190 | + |
| 191 | +### 8. Graph Topology |
| 192 | +- Graph structure is preserved exactly |
| 193 | +- Processor order matters (serialized in graph order) |
| 194 | +- Cyclic graphs are not supported (limitation of DawDreamer, not pickle) |
| 195 | + |
| 196 | +### 9. Memory Considerations |
| 197 | +- **Large audio data can create large pickle files** |
| 198 | +- PlaybackProcessor and SamplerProcessor embed full audio |
| 199 | +- Consider external file storage for large audio datasets |
| 200 | +- No compression is applied (use `pickle.HIGHEST_PROTOCOL` and compress externally) |
| 201 | + |
| 202 | +### 10. Thread Safety |
| 203 | +- Pickling is **not thread-safe** |
| 204 | +- Do not pickle while rendering |
| 205 | +- Ensure exclusive access during serialization |
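
One way to guarantee exclusive access is to route every operation on the engine through a single lock. The sketch below is an assumption about how an application might do this, not a DawDreamer feature; `ExclusiveSession` and `fake_render` are hypothetical, and a plain dict stands in for a `RenderEngine` so the example runs without DawDreamer installed.

```python
import pickle
import threading


class ExclusiveSession:
    """Run every operation on the wrapped object under a single lock."""

    def __init__(self, engine):
        self._engine = engine
        self._lock = threading.Lock()

    def run(self, operation, *args):
        # Only one thread at a time can be inside an operation,
        # so a render and a pickle.dumps call can never overlap.
        with self._lock:
            return operation(self._engine, *args)


# A plain dict stands in for a RenderEngine (illustration only).
session = ExclusiveSession({"sample_rate": 44100, "rendered_seconds": 0.0})


def fake_render(engine, duration):
    """Pretend to render by accumulating the rendered duration."""
    engine["rendered_seconds"] += duration


worker = threading.Thread(target=session.run, args=(fake_render, 0.5))
worker.start()
snapshot = session.run(pickle.dumps)  # waits if the render holds the lock
worker.join()
```

With a real engine, the same wrapper would be used as `session.run(lambda e: e.render(5.0))` for rendering and `session.run(pickle.dumps)` for serialization.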
| 206 | + |
| 207 | +## Versioning Considerations |
| 208 | + |
| 209 | +**Current Status**: No explicit version tracking (as of v0.8.4) |
| 210 | + |
| 211 | +Future versions may add: |
| 212 | +- Version number in pickle state dictionaries |
| 213 | +- Backward compatibility checks |
| 214 | +- Migration code for format changes |
| 215 | + |
| 216 | +**Recommendation**: Store the DawDreamer version alongside pickled data: |
| 217 | +```python |
| 218 | +import dawdreamer as daw |
| 219 | +data = { |
| 220 | + 'version': daw.__version__, |
| 221 | + 'engine': pickle.dumps(engine) |
| 222 | +} |
| 223 | +``` |
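
A stored version is only useful if it is checked on load. The following sketch shows one possible check under the assumption that an exact version match is required; `wrap_session` and `unwrap_session` are hypothetical helpers, and placeholder bytes stand in for `pickle.dumps(engine)`.

```python
import pickle


def wrap_session(engine_bytes, version):
    """Bundle pickled engine bytes with the library version that produced them."""
    return pickle.dumps({"version": version, "engine": engine_bytes})


def unwrap_session(blob, current_version):
    """Return the engine bytes, refusing to continue on a version mismatch."""
    data = pickle.loads(blob)
    if data["version"] != current_version:
        raise RuntimeError(
            f"Session was saved with version {data['version']}, "
            f"but version {current_version} is installed"
        )
    return data["engine"]


# Placeholder bytes stand in for pickle.dumps(engine).
blob = wrap_session(b"engine-bytes", "0.8.4")
engine_bytes = unwrap_session(blob, "0.8.4")
```

In practice both version arguments would come from `daw.__version__`, and a looser policy (for example, matching only the major and minor components) may be more appropriate than strict equality.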
| 224 | + |
| 225 | +## Best Practices |
| 226 | + |
| 227 | +### 1. Version Tracking |
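| 228 | +Always store the DawDreamer version used to create pickles. The snippet assumes `datetime` is imported from the `datetime` module and that `SAMPLE_RATE` is a constant defined by your project: |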
| 229 | +```python |
| 230 | +metadata = { |
| 231 | + 'dawdreamer_version': daw.__version__, |
| 232 | + 'created_date': datetime.now().isoformat(), |
| 233 | + 'sample_rate': SAMPLE_RATE, |
| 234 | +} |
| 235 | +``` |
| 236 | + |
| 237 | +### 2. Validation After Restore |
| 238 | +Always verify restored state: |
| 239 | +```python |
| 240 | +restored_engine = pickle.loads(pickled_bytes) |
| 241 | +# Short test render to confirm the graph still runs |
| 242 | +restored_engine.render(0.1) |
| 243 | +# Rendered audio should be 2-D: (channels, samples) |
| 244 | +assert restored_engine.get_audio().ndim == 2 |
| 245 | +``` |
| 246 | + |
| 247 | +### 3. Plugin Path Handling |
| 248 | +Store plugin paths separately for cross-platform support: |
| 249 | +```python |
| 250 | +# Before pickle |
| 251 | +plugin_paths = { |
| 252 | + 'effect': plugin.plugin_path, |
| 253 | + 'instrument': instrument.plugin_path |
| 254 | +} |
| 255 | + |
| 256 | +# After restore, verify paths exist (requires `import os`) |
| 257 | +if not os.path.exists(plugin_paths['effect']): |
| 258 | +    raise FileNotFoundError(plugin_paths['effect'])  # or handle the missing plugin another way |
| 259 | +``` |
| 260 | + |
| 261 | +### 4. Large Audio Data |
| 262 | +For large projects, consider a hybrid approach: |
| 263 | +```python |
| 264 | +# Option 1: Separate audio storage |
| 265 | +audio_data = playback.get_audio() |
| 266 | +np.save('audio.npy', audio_data) |
| 267 | +# ... pickle without audio data ... |
| 268 | + |
| 269 | +# Option 2: Compressed pickle |
| 270 | +import gzip |
| 271 | +with gzip.open('session.pkl.gz', 'wb') as f: |
| 272 | + pickle.dump(engine, f) |
| 273 | +``` |
| 274 | + |
| 275 | +### 5. Error Handling |
| 276 | +Always wrap unpickling in try-except: |
| 277 | +```python |
| 278 | +try: |
| 279 | + engine = pickle.loads(pickled_bytes) |
| 280 | +except Exception as e: |
| 281 | + print(f"Failed to restore session: {e}") |
| 282 | + # Handle error (use backup, notify user, etc.) |
| 283 | +``` |
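
The "use backup" option mentioned in the comment above could look like the sketch below. It is only an illustration: `load_with_fallback` is a hypothetical helper, and a small dict stands in for a pickled engine. As with any use of `pickle`, only load data from sources you trust.

```python
import pickle


def load_with_fallback(primary, backup):
    """Try to unpickle `primary`; fall back to the `backup` bytes if that fails."""
    for label, blob in (("primary", primary), ("backup", backup)):
        try:
            return label, pickle.loads(blob)
        except Exception as exc:  # unpickling can raise many exception types
            print(f"Failed to restore {label} session: {exc}")
    raise RuntimeError("No session could be restored")


good = pickle.dumps({"bpm": 120})
truncated = good[:-3]  # simulate a partially written file
source, session = load_with_fallback(truncated, good)
```

Returning the label alongside the restored object lets the caller tell the user that a backup was used.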
| 284 | + |
| 285 | +## Testing |
| 286 | + |
| 287 | +The test suite (`tests/test_pickle.py`) includes: |
| 288 | +- Round-trip tests for all processor types |
| 289 | +- MIDI preservation tests (empty, normal, large buffers) |
| 290 | +- Automation curve preservation |
| 291 | +- Complex graph structures |
| 292 | +- Edge cases and error conditions |
| 293 | + |
| 294 | +Run tests: |
| 295 | +```bash |
| 296 | +pytest tests/test_pickle.py -v |
| 297 | +``` |
| 298 | + |
| 299 | +## Technical References |
| 300 | + |
| 301 | +- **Nanobind Documentation**: https://nanobind.readthedocs.io/ |
| 302 | +- **Python Pickle Protocol**: https://docs.python.org/3/library/pickle.html |
| 303 | +- **JUCE VST State**: Uses `AudioProcessor::getStateInformation()` |
| 304 | +- **MIDI Format**: Based on JUCE `MidiBuffer` iteration |
| 305 | + |
| 306 | +## Future Enhancements |
| 307 | + |
| 308 | +Potential improvements for future versions: |
| 309 | + |
| 310 | +1. **Explicit versioning** - Add version field to all pickle states |
| 311 | +2. **Compression** - Optional built-in compression for audio data |
| 312 | +3. **Incremental updates** - Patch format for parameter changes |
| 313 | +4. **Validation** - Checksum/hash verification of restored state |
| 314 | +5. **Migration** - Automatic format migration for older versions |
| 315 | +6. **Streaming** - Support for lazy loading of large audio data |
| 316 | +7. **Metadata** - Standardized metadata fields (author, date, description) |
| 317 | + |
| 318 | +## Contributing |
| 319 | + |
| 320 | +When adding new processors or extending pickle support: |
| 321 | + |
| 322 | +1. Implement `getPickleState()` returning `nb::dict` |
| 323 | +2. Implement `setPickleState(nb::dict)` using placement new |
| 324 | +3. Add comprehensive tests to `test_pickle.py` |
| 325 | +4. Update this documentation |
| 326 | +5. Consider backward compatibility impact |
| 327 | + |
| 328 | +See existing processors for implementation patterns. |