Commit 364d1d2

[BREAKING] Conv1D manages its own ring buffer (#181)
* Improve Conv1D class to manage its own ring buffer
  - Add a RingBuffer class to manage write/read pointers for Eigen::MatrixXf buffers
  - Add a Reset() method to Conv1D to initialize the ring buffer and pre-allocate the output
  - Add a get_output() method to Conv1D to access the output buffer
  - Rename process_() to Process() and update it to use the internal ring buffer
  - Update ConvNet to use Conv1D's internal buffers via Process() and get_output()
  - Update WaveNet to use Conv1D's internal buffers and propagate Reset()
  - Add comprehensive tests for Conv1D in test_conv1d.cpp
  Implements issue #145.

* Fix RingBuffer to handle max lookback and add comprehensive tests
  - Fix Reset() to zero the buffer behind the starting write position
  - Fix Rewind() to copy receptive_field (max lookback) samples to the start, and set the write position after receptive_field when rewinding
  - Add a comprehensive test suite for RingBuffer (12 tests) covering construction, reset, write/read, advance, rewind, lookback, etc.
  Fixes issues with RingBuffer not properly handling max lookback when rewinding.

* Rename "receptive field" to "max lookback" in RingBuffer
  - Rename the _receptive_field member to _max_lookback and SetReceptiveField() to SetMaxLookback()
  - Update all comments, documentation, and test references
  - The new name is more descriptive in a ring-buffer context

* Remove trailing whitespace (formatting cleanup)

* Move RingBuffer and Conv1D to separate source and header files
  - Create NAM/ring_buffer.h and NAM/ring_buffer.cpp for the RingBuffer class
  - Create NAM/conv1d.h and NAM/conv1d.cpp for the Conv1D class
  - Remove RingBuffer and Conv1D from NAM/dsp.h and NAM/dsp.cpp
  - Update includes in convnet.h, wavenet.h, and the test files
  - All tests pass after the refactoring

* Replace GetCapacity() with GetMaxBufferSize() in RingBuffer
  - Remove GetCapacity() from the public interface (storage size is an internal detail)
  - Add GetMaxBufferSize(), which returns the max_buffer_size passed to Reset() and now stored as a member variable
  - Update tests to use GetMaxBufferSize() instead of GetCapacity()
  - External code should trust that storage is sized correctly

* Add assertions in RingBuffer::Rewind() to prevent aliasing
  - Assert that the write pointer is at least 2 * max_lookback to avoid aliasing
  - Assert that the copy start position is within storage bounds
  - Remove the silent-failure condition: Rewind() now asserts instead of silently skipping
  - Prevents data corruption from overlapping copy operations

* Rename Conv1D::get_output to GetOutput
  - Rename the method in conv1d.h and conv1d.cpp
  - Update all usages in convnet.cpp and wavenet.cpp and all test cases in test_conv1d.cpp
  - Matches the naming convention of Conv1x1::GetOutput()

* Remove _total_written tracking and GetReadPos() from RingBuffer
  - Remove the _total_written member variable and all references to it
  - Remove GetReadPos(); the read_pos calculation is now inline in Read()
  - Remove the test_get_read_pos() test function and update tests to work without GetReadPos()
  - Simplify RingBuffer for exclusive use by the Conv1D layer

* Refactor LayerArray::Process() to remove the head_outputs parameter and add GetHeadOutputs()
  - Remove the head_outputs parameter from both Process() overloads
  - Add a GetHeadOutputs() method that retrieves head outputs from _head_rechannel
  - Update WaveNet::process() to use GetHeadOutputs() directly from the layer arrays
  - Remove the deprecated set_num_frames_() methods from _Layer and _LayerArray
  - Update _Layer::SetMaxBufferSize() to use Conv1D::SetMaxBufferSize() instead of Reset()
  - Simplify head input zeroing in LayerArray::Process()

* Refactor WaveNet LayerArray and remove the _DilatedConv wrapper
  - Remove the _DilatedConv wrapper class; use Conv1D directly in _Layer
  - Refactor LayerArray::Process() to extract common logic into ProcessInner()
  - Make _Layer::Process() RT-safe by removing the resize in Process(), and simplify it by combining the conv and input_mixin processing
  - Update tests and ConvNet to use SetMaxBufferSize() instead of Reset()

* Add comprehensive WaveNet tests organized by component
  - Split the WaveNet tests into three files: test_layer.cpp (individual WaveNet layers), test_layer_array.cpp (layer arrays), and test_full.cpp (full WaveNet models)
  - Use nested namespaces: test_wavenet::test_layer, test_wavenet::test_layer_array, test_wavenet::test_full
  - Remove the old monolithic test_wavenet.cpp file and update the test runner to include the new test files directly
  - Add 12 new tests covering gated and non-gated layers, different activations, multi-channel layers, layer array processing, receptive field calculations, full model processing, edge cases (zero input, different buffer sizes), and prewarm functionality

* Refactor the Conv1D and RingBuffer APIs, improve tests
  - Add a Conv1D constructor that takes shape parameters
  - Rename Conv1D::Reset() to SetMaxBufferSize() and remove the unused sampleRate parameter
  - Add a Conv1D::has_bias() method
  - Reorganize the RingBuffer public/private interface (move internal methods to private)
  - Improve test assertions with numerical accuracy checks; clean up the ring buffer tests and remove internal state checks
  - Remove commented-out code from WaveNet
  - Update NOTES with completed tasks

* Complete the ConvNet refactoring to use the Conv1D ring buffer API
  - Refactor ConvNetBlock to add Process() and GetOutput() methods using the new Conv1D API
  - Simplify ConvNet::process() to eliminate _block_vals for Conv1D layers; buffer management is simpler now that Conv1D handles its own ring buffers
  - Fix test expectations in test_conv1d.cpp (corrected weight ordering) and a test bug there (output2 -> output)
  - Fix a ring buffer test assertion
  - Add comprehensive ConvNet tests (test_convnet.cpp) and update the test runner to include them
  This completes the refactoring work for issue #145, making ConvNet use Conv1D's new ring buffer API in the same way WaveNet was refactored.

* Fix Eigen Block resize error in WaveNet Layer Process
  Fixed the assertion failure "DenseBase::resize() does not actually allow to resize" at wavenet.cpp:61. It occurred when assigning _z.leftCols() to _output_head.leftCols() for gated layers, where _z has 2*channels rows but _output_head has only channels rows. Fix: use _z.topRows(channels).leftCols(num_frames) for gated layers to select only the first channels rows, which are the ones that should be copied to _output_head.

* Fix ConvNet test weight counts
  Fixed incorrect weight counts in the ConvNet tests. The tests were missing bias weights when batchnorm=false (batchnorm=false implies do_bias=true in ConvNetBlock::set_weights_). Changes:
  - test_convnet_basic: added 4 bias weights (2 per block); 15 -> 19 weights
  - test_convnet_batchnorm: removed 1 extra bias (batchnorm=true means no bias); 10 -> 9 weights
  - test_convnet_multiple_blocks: added 6 bias weights (2 per block); 23 -> 29 weights
  - test_convnet_zero_input: added 1 bias weight; 3 -> 4 weights
  - test_convnet_different_buffer_sizes: added 1 bias weight; 3 -> 4 weights
  - test_convnet_prewarm: added 6 bias weights (2 per block); 23 -> 29 weights
  - test_convnet_multiple_calls: added 1 bias weight; 3 -> 4 weights
  All tests now pass.

* Add ConvNetBlock buffer management methods
  Added SetMaxBufferSize() and Process() methods to ConvNetBlock to manage output buffers independently of Conv1D, allowing batchnorm and activation to be applied properly on block-owned buffers.

* Remove unneeded includes

* Remove unused _head_arrays from the WaveNet class

* Remove unused code from the WaveNet class
  - Remove the WaveNet::_head_output member variable (never read)
  - Remove the WaveNet::_set_num_frames_ method declaration (never implemented)
  - Remove the _LayerArray::_get_receptive_field() private method (carried a TODO to remove it)
  - Remove the _Head::_head member variable (never used) and the entire _Head class (never instantiated in WaveNet)

* Optimize matrix operations and fix build warnings
  - Add .noalias() to matrix assignments for better performance
  - Remove an unnecessary _z.setZero() call (the matrix is initialized as needed)
  - Remove a redundant comment in SetMaxBufferSize()
  - Add a build workaround for Eigen warnings in conv1d.cpp

* Add a real-time safety test for the WaveNet process() method
  - Add test_process_realtime_safe() to verify that no allocations occur during process()
  - Track allocations with malloc/free hooks to catch Eigen allocations, plus helper tests verifying that the tracking itself works
  - Process buffers of multiple sizes with two layer arrays
  - Ensures WaveNet::process() is real-time safe (no allocations or frees)

* Add real-time safety tests for Conv1D, Layer, and LayerArray
  - Add test_conv1d_process_realtime_safe() (full-matrix input) and test_conv1d_process_block_realtime_safe() (Block input) for Conv1D
  - Add test_layer_process_realtime_safe() for Layer::Process() and test_layer_array_process_realtime_safe() for LayerArray::Process()
  - The tests confirm that Conv1D allocates when passed Block expressions
  - Attempt to fix RingBuffer::Write() to avoid allocations (work in progress)

* Refine the real-time tests to use full buffers and document RingBuffer usage
  - Remove the Conv1D Block-input real-time test and rely on the full-matrix path
  - Ensure tests and runtime code pass full buffers between layers, slicing only inside
  - Simplify RingBuffer::Write() to take a MatrixXf, with a note about the full-buffer requirement

* Pass full buffers between WaveNet layers for real-time safety
  - Add full-buffer getters for Conv1x1, _Layer, and _LayerArray
  - Change LayerArray and WaveNet to pass full MatrixXf buffers and slice internally
  - Simplify the RingBuffer::Write() API and document the full-buffer requirement
  - Restore the real-time safety test file, keeping only the full-matrix Conv1D test in the runner

* Remove the num_frames parameter from output getters; return full buffers
  - Remove GetOutputHead(num_frames) and GetOutputNextLayer(num_frames) from _Layer
  - Remove GetLayerOutputs(num_frames) and GetHeadOutputs(num_frames) from _LayerArray
  - Rename the *Full() methods back to the original names (GetOutputHead, GetOutputNextLayer, etc.)
  - Update Conv1D and Conv1x1 GetOutput() to return the full buffer (no num_frames parameter)
  - Update all internal code and tests to use .leftCols(num_frames) on the full buffers
  - All methods now return pre-allocated full buffers; callers slice as needed

* Untrack some files that were accidentally added

* Remove accidentally-tracked files

* Add the missing <cassert> include to activations.h

* Fix a missing include

* Add a --branch flag to benchmark_compare.sh to compare against different branches
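In short, each Conv1D now owns its ring buffer and output storage, so callers allocate once up front and then process in place. A minimal usage sketch of the new API against the conv1d.h shown below (the layer shape and weight values here are illustrative, not taken from the repository):

#include <vector>
#include <Eigen/Dense>
#include "NAM/conv1d.h"

int main()
{
  // Hypothetical layer: 1 input channel, 1 output channel, kernel size 2, bias, dilation 1.
  nam::Conv1D conv(1, 1, 2, true, 1);
  std::vector<float> weights{0.5f, 0.5f, 0.0f}; // 2 kernel taps + 1 bias (illustrative values)
  auto it = weights.begin();
  conv.set_weights_(it);

  // Allocate once, outside the audio thread.
  const int maxBufferSize = 64;
  conv.SetMaxBufferSize(maxBufferSize);

  // Per block: pass the full buffer in, slice the output out.
  Eigen::MatrixXf input = Eigen::MatrixXf::Ones(1, maxBufferSize);
  const int num_frames = 32; // may be anything <= maxBufferSize
  conv.Process(input, num_frames);
  Eigen::MatrixXf out = conv.GetOutput().leftCols(num_frames); // copy for illustration; RT code would slice in place
  return 0;
}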
1 parent c6f0be3 commit 364d1d2

23 files changed: +2878 -547 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -30,3 +30,5 @@
 *.exe
 *.out
 *.app
+
+.vscode/

NAM/activations.h

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 #pragma once
 
+#include <cassert>
 #include <string>
 #include <cmath> // expf
 #include <unordered_map>

NAM/conv1d.cpp

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
#include "conv1d.h"

namespace nam
{
// Conv1D =====================================================================

void Conv1D::set_weights_(std::vector<float>::iterator& weights)
{
  if (this->_weight.size() > 0)
  {
    const long out_channels = this->_weight[0].rows();
    const long in_channels = this->_weight[0].cols();
    // Crazy ordering because that's how it gets flattened.
    for (auto i = 0; i < out_channels; i++)
      for (auto j = 0; j < in_channels; j++)
        for (size_t k = 0; k < this->_weight.size(); k++)
          this->_weight[k](i, j) = *(weights++);
  }
  for (long i = 0; i < this->_bias.size(); i++)
    this->_bias(i) = *(weights++);
}

void Conv1D::set_size_(const int in_channels, const int out_channels, const int kernel_size, const bool do_bias,
                       const int _dilation)
{
  this->_weight.resize(kernel_size);
  for (size_t i = 0; i < this->_weight.size(); i++)
    this->_weight[i].resize(out_channels, in_channels); // y = Ax, input array (C,L)
  if (do_bias)
    this->_bias.resize(out_channels);
  else
    this->_bias.resize(0);
  this->_dilation = _dilation;
}

void Conv1D::set_size_and_weights_(const int in_channels, const int out_channels, const int kernel_size,
                                   const int _dilation, const bool do_bias, std::vector<float>::iterator& weights)
{
  this->set_size_(in_channels, out_channels, kernel_size, do_bias, _dilation);
  this->set_weights_(weights);
}

void Conv1D::SetMaxBufferSize(const int maxBufferSize)
{
  _max_buffer_size = maxBufferSize;

  // Calculate receptive field (maximum lookback needed)
  const long kernel_size = get_kernel_size();
  const long dilation = get_dilation();
  const long receptive_field = kernel_size > 0 ? (kernel_size - 1) * dilation : 0;

  const long in_channels = get_in_channels();

  // Initialize input ring buffer
  // Set max lookback before Reset so that Reset() can use it to calculate storage size
  // Reset() will calculate storage size as: 2 * max_lookback + max_buffer_size
  _input_buffer.SetMaxLookback(receptive_field);
  _input_buffer.Reset(in_channels, maxBufferSize);

  // Pre-allocate output matrix
  const long out_channels = get_out_channels();
  _output.resize(out_channels, maxBufferSize);
  _output.setZero();
}


void Conv1D::Process(const Eigen::MatrixXf& input, const int num_frames)
{
  // Write input to ring buffer
  _input_buffer.Write(input, num_frames);

  // Zero output before processing
  _output.leftCols(num_frames).setZero();

  // Process from ring buffer with dilation lookback
  // After Write(), data is at positions [_write_pos, _write_pos+num_frames-1]
  // For kernel tap k with offset, we need to read from _write_pos + offset
  // The offset is negative (looking back), so _write_pos + offset reads from earlier positions
  // The original process_() reads: input.middleCols(i_start + offset, ncols)
  // where i_start is the current position and offset is negative for lookback
  for (size_t k = 0; k < this->_weight.size(); k++)
  {
    const long offset = this->_dilation * (k + 1 - (long)this->_weight.size());
    // Offset is negative (looking back)
    // Since offset is negative, we compute lookback = -offset to read from _write_pos - lookback
    const long lookback = -offset;

    // Read num_frames starting from write_pos + offset (which is write_pos - lookback)
    auto input_block = _input_buffer.Read(num_frames, lookback);

    // Perform convolution: output += weight[k] * input_block
    _output.leftCols(num_frames).noalias() += this->_weight[k] * input_block;
  }

  // Add bias if present
  if (this->_bias.size() > 0)
  {
    _output.leftCols(num_frames).colwise() += this->_bias;
  }

  // Advance ring buffer write pointer after processing
  _input_buffer.Advance(num_frames);
}

void Conv1D::process_(const Eigen::MatrixXf& input, Eigen::MatrixXf& output, const long i_start, const long ncols,
                      const long j_start) const
{
  // This is the clever part ;)
  for (size_t k = 0; k < this->_weight.size(); k++)
  {
    const long offset = this->_dilation * (k + 1 - this->_weight.size());
    if (k == 0)
      output.middleCols(j_start, ncols).noalias() = this->_weight[k] * input.middleCols(i_start + offset, ncols);
    else
      output.middleCols(j_start, ncols).noalias() += this->_weight[k] * input.middleCols(i_start + offset, ncols);
  }
  if (this->_bias.size() > 0)
  {
    output.middleCols(j_start, ncols).colwise() += this->_bias;
  }
}

long Conv1D::get_num_weights() const
{
  long num_weights = this->_bias.size();
  for (size_t i = 0; i < this->_weight.size(); i++)
    num_weights += this->_weight[i].size();
  return num_weights;
}
} // namespace nam
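To make the tap-offset arithmetic above concrete: with kernel_size = 3 and dilation = 2 (hypothetical values), offset = dilation * (k + 1 - kernel_size) yields -4, -2, and 0, and the deepest lookback equals the receptive field (kernel_size - 1) * dilation = 4 that SetMaxBufferSize() uses to size the ring buffer. A minimal self-checking sketch:

#include <cassert>

int main()
{
  const long kernel_size = 3, dilation = 2; // hypothetical layer shape
  long offsets[3];
  for (long k = 0; k < kernel_size; k++)
    offsets[k] = dilation * (k + 1 - kernel_size); // taps look back 4, 2, and 0 frames
  assert(offsets[0] == -4 && offsets[1] == -2 && offsets[2] == 0);
  // The deepest lookback is exactly the receptive field used to size the ring buffer.
  assert(-offsets[0] == (kernel_size - 1) * dilation);
  return 0;
}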

NAM/conv1d.h

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
#pragma once

#include <Eigen/Dense>
#include <vector>
#include "ring_buffer.h"

namespace nam
{
class Conv1D
{
public:
  Conv1D() { this->_dilation = 1; };
  Conv1D(const int in_channels, const int out_channels, const int kernel_size, const int bias, const int dilation)
  {
    set_size_(in_channels, out_channels, kernel_size, bias, dilation);
  };
  void set_weights_(std::vector<float>::iterator& weights);
  void set_size_(const int in_channels, const int out_channels, const int kernel_size, const bool do_bias,
                 const int _dilation);
  void set_size_and_weights_(const int in_channels, const int out_channels, const int kernel_size, const int _dilation,
                             const bool do_bias, std::vector<float>::iterator& weights);
  // Reset the ring buffer and pre-allocate the output buffer
  // :param maxBufferSize: Maximum buffer size for the output buffer and for sizing the ring buffer
  void SetMaxBufferSize(const int maxBufferSize);
  // Get the entire internal output buffer. This is intended for internal wiring
  // between layers; callers should treat the buffer as pre-allocated storage
  // and only consider the first `num_frames` columns valid for a given
  // processing call. Slice with .leftCols(num_frames) as needed.
  Eigen::MatrixXf& GetOutput() { return _output; }
  const Eigen::MatrixXf& GetOutput() const { return _output; }
  // Process input and write to the internal output buffer
  // :param input: Input matrix (channels x num_frames)
  // :param num_frames: Number of frames to process
  void Process(const Eigen::MatrixXf& input, const int num_frames);
  // Process from input to output (legacy method, kept for compatibility)
  // Rightmost indices of input go from i_start for ncols;
  // indices on output go from j_start (to j_start + ncols)
  void process_(const Eigen::MatrixXf& input, Eigen::MatrixXf& output, const long i_start, const long ncols,
                const long j_start) const;
  long get_in_channels() const { return this->_weight.size() > 0 ? this->_weight[0].cols() : 0; };
  long get_kernel_size() const { return this->_weight.size(); };
  long get_num_weights() const;
  long get_out_channels() const { return this->_weight.size() > 0 ? this->_weight[0].rows() : 0; };
  int get_dilation() const { return this->_dilation; };
  bool has_bias() const { return this->_bias.size() > 0; };

protected:
  // conv[kernel](cout, cin)
  std::vector<Eigen::MatrixXf> _weight;
  Eigen::VectorXf _bias;
  int _dilation;

private:
  RingBuffer _input_buffer; // Ring buffer for input (channels x buffer_size)
  Eigen::MatrixXf _output; // Pre-allocated output buffer (out_channels x maxBufferSize)
  int _max_buffer_size = 0; // Stored maxBufferSize
};
} // namespace nam
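NAM/ring_buffer.h is part of this commit but not shown in this excerpt. For orientation, here is a sketch of what its interface plausibly looks like, reconstructed purely from the call sites in conv1d.cpp and the commit message (SetMaxLookback, Reset, Write, Read, Advance, GetMaxBufferSize, and a private Rewind); the bodies are assumptions, not the committed implementation:

#pragma once

#include <cassert>
#include <Eigen/Dense>

namespace nam
{
// Assumed shape of the RingBuffer used by Conv1D; reconstructed from usage.
class RingBuffer
{
public:
  void SetMaxLookback(const long maxLookback) { _max_lookback = maxLookback; }
  // Storage is sized 2 * max_lookback + max_buffer_size (per the comment in
  // Conv1D::SetMaxBufferSize); the region behind the starting write position
  // is zeroed so that early reads see silence.
  void Reset(const long channels, const long maxBufferSize)
  {
    _max_buffer_size = maxBufferSize;
    _storage.resize(channels, 2 * _max_lookback + maxBufferSize);
    _storage.setZero();
    _write_pos = _max_lookback;
  }
  long GetMaxBufferSize() const { return _max_buffer_size; }
  // Callers pass full buffers (see the commit message); only the first
  // num_frames columns are consumed. Rewinds first if the write would overflow.
  void Write(const Eigen::MatrixXf& input, const long num_frames)
  {
    if (_write_pos + num_frames > _storage.cols())
      Rewind();
    _storage.middleCols(_write_pos, num_frames) = input.leftCols(num_frames);
  }
  // View of num_frames columns starting lookback columns behind the write position.
  auto Read(const long num_frames, const long lookback) { return _storage.middleCols(_write_pos - lookback, num_frames); }
  void Advance(const long num_frames) { _write_pos += num_frames; }

private:
  // Copy the last max_lookback columns to the front and continue writing from there.
  void Rewind()
  {
    assert(_write_pos >= 2 * _max_lookback); // source and destination must not alias
    assert(_write_pos <= _storage.cols()); // copy source stays within storage bounds
    _storage.leftCols(_max_lookback) = _storage.middleCols(_write_pos - _max_lookback, _max_lookback);
    _write_pos = _max_lookback;
  }

  Eigen::MatrixXf _storage; // channels x (2 * max_lookback + max_buffer_size)
  long _max_lookback = 0;
  long _max_buffer_size = 0;
  long _write_pos = 0;
};
} // namespace nam

With this layout, a Write() followed by Read(num_frames, lookback) for any lookback up to max_lookback never crosses a wrap boundary, which is what lets Conv1D::Process() read contiguous blocks with no per-call allocation.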
