|
| 1 | +--- |
| 2 | +title: 1D Filtering |
| 3 | +tags: [computer science, programming] |
| 4 | +style: fill |
| 5 | +color: danger |
| 6 | +description: Filtering a chord and a profile |
| 7 | +--- |
| 8 | + |
| 9 | +# Introduction |
| 10 | + |
| 11 | +1D filtering is used in a wide variety of use cases: time series, forecasting, ECC signals, or, as focused here, filtering profiles like those scanned with laser profilometers, where causality (dependency on distant past values) is less critical than spatial contiguity/coherence in neighborhood points, robustness against data loss, extreme outliers, and preservation of central trends. |
| 12 | + |
| 13 | +In this post, we will explore various 1D filtering techniques and apply them (implemented in C++) to two different 1D signals: a C Major chord signal and a real laser-scanned profile. |
| 14 | + |
| 15 | +# Algorithms |
| 16 | + |
| 17 | +Each subsection combines theory, code, and results. A global visual/quantitative comparison is provided at the end. Later sections will explain filtered signals and how noise was added. First, we briefly describe each method and show visual results. |
| 18 | + |
| 19 | +## Exponential Filter |
| 20 | + |
| 21 | +Smooths signals by exponentially weighting past values. This filter applies exponential smoothing (aka exponential moving average), commonly used in signal processing for smoothing or creating a "moving average" with emphasis on recent values. |
| 22 | + |
| 23 | +**Equation**: |
| 24 | +$$ |
| 25 | +y[i] = \alpha \cdot x[i] + (1 - \alpha) \cdot y[i-1] |
| 26 | +$$ |
| 27 | + |
| 28 | +**Parameters**: |
| 29 | +- α (0 < α < 1): |
| 30 | + - α ≈ 1 → Fast tracking (less smoothing) |
| 31 | + - α ≈ 0 → Slow tracking (more smoothing) |
| 32 | + |
| 33 | +**Filtered Results**: |
| 34 | + |
| 35 | + |
| 36 | + |
| 37 | +## Total Variation (TV) Filter |
| 38 | + |
| 39 | +Minimizes total variation (gradients) while preserving edges. |
| 40 | + |
| 41 | +**Energy Minimization Equation**: |
| 42 | +$$ |
| 43 | +y[i] = \alpha \cdot x[i] + (1 - \alpha) \cdot y[i-1] |
| 44 | +$$ |
| 45 | + |
| 46 | +**Parameters**: |
| 47 | +- λ ≥ 0: Controls fidelity to original signal |
| 48 | +- `step_size`: Gradient descent step size |
| 49 | +- `iterations`: Number of iterations |
| 50 | + |
| 51 | +**Filtered Results**: |
| 52 | + |
| 53 | + |
| 54 | + |
| 55 | +## Convex Hull Envelope |
| 56 | + |
| 57 | +Computes upper convex hull of points (i, x[i]). |
| 58 | + |
| 59 | +**Implicit Equation**: |
| 60 | +$$ |
| 61 | +y[i] = \max \{ f[j] \ | \ (j, f[j]) \text{ lies on the convex hull} \} |
| 62 | +$$ |
| 63 | + |
| 64 | +**Parameters**: None (depends on signal geometry) |
| 65 | + |
| 66 | +**Filtered Results**: |
| 67 | + |
| 68 | + |
| 69 | + |
| 70 | +## Moving Average |
| 71 | + |
| 72 | +Averages values in symmetric windows. |
| 73 | + |
| 74 | +**Equation**: |
| 75 | +$$ |
| 76 | +y[i] = \frac{1}{N} \sum_{k=-w}^{w} x[i + k] |
| 77 | +$$ |
| 78 | +where \( N = 2w + 1 \) (window size). |
| 79 | + |
| 80 | +**Parameters**: |
| 81 | +- `window_size`: Window width (odd integer) |
| 82 | + |
| 83 | +**Filtered Results**: |
| 84 | + |
| 85 | + |
| 86 | + |
| 87 | +## Moving Maximum/Minimum |
| 88 | + |
| 89 | +Upper envelope (maximum) or lower envelope (minimum). |
| 90 | + |
| 91 | +**Equation (maximum)**: |
| 92 | +$$ |
| 93 | +y[i] = \max \{ x[i - w], \ ..., \ x[i + w] \} |
| 94 | +$$ |
| 95 | + |
| 96 | +**Parameters**: |
| 97 | +- `window_size`: Window width |
| 98 | + |
| 99 | +**Filtered Results (maximum)**: |
| 100 | + |
| 101 | + |
| 102 | + |
| 103 | +**Filtered Results (minimum)**: |
| 104 | + |
| 105 | + |
| 106 | + |
| 107 | +## Moving Median |
| 108 | + |
| 109 | +Removes outliers using local median. |
| 110 | + |
| 111 | +**Equation**: |
| 112 | +$$ |
| 113 | +y[i] = \text{median} \{ x[i - w], \ ..., \ x[i + w] \} |
| 114 | +$$ |
| 115 | + |
| 116 | +**Parameters**: |
| 117 | +- `window_size`: Window width |
| 118 | + |
| 119 | +**Filtered Results**: |
| 120 | + |
| 121 | + |
| 122 | + |
| 123 | +## L0 Gradient Minimization |
| 124 | + |
| 125 | +Promotes constant regions (zero gradients). |
| 126 | + |
| 127 | +**Energy Equation**: |
| 128 | +$$ |
| 129 | +E(u) = \|\nabla u\|_0 + \lambda \|u - f\|_2^2 |
| 130 | +$$ |
| 131 | +where \( \|\nabla u\|_0 \) counts non-zero gradients. |
| 132 | + |
| 133 | +**Parameters**: |
| 134 | +- λ ≥ 0: Smoothness control |
| 135 | +- `beta_max`: Maximum regularization parameter |
| 136 | +- `beta_rate`: Beta increase rate |
| 137 | +- `iterations`: Optimization iterations |
| 138 | + |
| 139 | +**Filtered Results**: |
| 140 | + |
| 141 | + |
| 142 | + |
| 143 | +## Hampel Filter |
| 144 | + |
| 145 | +Detects outliers using median and MAD (Median Absolute Deviation). |
| 146 | + |
| 147 | +**Decision Equation**: |
| 148 | +$$ |
| 149 | +y[i] = |
| 150 | +\begin{cases} |
| 151 | +x[i], & \text{if } |x[i] - \text{median}(W)| \leq \kappa \cdot \text{MAD}(W) \\ |
| 152 | +\text{median}(W), & \text{otherwise} |
| 153 | +\end{cases} |
| 154 | +$$ |
| 155 | + |
| 156 | +**Parameters**: |
| 157 | +- `window_size`: Window size W |
| 158 | +- `nsigma` (κ): MAD multiplier threshold (typical: 3σ ≈ 3 × 1.4826 × MAD) |
| 159 | + |
| 160 | +**Filtered Results**: |
| 161 | + |
| 162 | + |
| 163 | + |
| 164 | +## Hampel Variant Filter |
| 165 | + |
| 166 | +Optimized Hampel version approximating MAD using secondary median filter. |
| 167 | + |
| 168 | +**Decision Equation**: |
| 169 | +$$ |
| 170 | +y_t = |
| 171 | +\begin{cases} |
| 172 | +x_t, & \text{if } |x_t - m_t| \leq t \cdot S_{t_m} \\ |
| 173 | +m_t, & \text{otherwise} |
| 174 | +\end{cases} |
| 175 | +$$ |
| 176 | +where: |
| 177 | +- $$ m_t $$: Median in main window |
| 178 | +- $$ S_t $$: Approximate MAD (median-filtered deviations) |
| 179 | +- $$ t $$: Threshold (equivalent to κ in classic Hampel) |
| 180 | + |
| 181 | +**Parameters**: |
| 182 | +- `window_size`: Main window size |
| 183 | +- `nsigma`: Outlier detection threshold |
| 184 | +- `aux_window`: MAD approximation window size |
| 185 | + |
| 186 | +**Advantages**: |
| 187 | +- 30-50% faster than classic Hampel |
| 188 | +- Similar outlier robustness |
| 189 | +- Ideal for long/real-time signals |
| 190 | + |
| 191 | +**Filtered Results**: |
| 192 | + |
| 193 | + |
| 194 | + |
| 195 | +# Comparison |
| 196 | + |
| 197 | +Some methods aren't designed for denoising, making universal quality metrics less meaningful. However, we apply PSNR/MSE to core methods: Moving Average, Moving Median, Hampel, Hampel Variant, TV, L0, and Exponential. |
| 198 | + |
| 199 | +## MSE (Mean Squared Error) |
| 200 | +$$ |
| 201 | +\text{MSE} = \frac{1}{N} \sum_{i=1}^N (y[i] - \hat{y}[i])^2 |
| 202 | +$$ |
| 203 | + |
| 204 | +## PSNR (Peak Signal-to-Noise Ratio) |
| 205 | +$$ |
| 206 | +\text{PSNR} = 10 \cdot \log_{10} \left( \frac{\max(y)^2}{\text{MSE}} \right), |
| 207 | +$$ |
| 208 | + |
| 209 | +where $$ \max(y) $$ is signal peak value (e.g., 1 for normalized signals). |
| 210 | + |
| 211 | +## Noise Model |
| 212 | +Our mathematical noise model combines: |
| 213 | +- Gaussian noise |
| 214 | +- Impulse noise |
| 215 | +- Non-linear noise from laser scanning |
| 216 | + |
| 217 | +Noise and spikes are added via: |
| 218 | + |
| 219 | +```cpp |
| 220 | +std::vector<double> addNoiseAndSpikes(const std::vector<double>& clean_signal, double noise_level = 0.1, double spike_probability = 0.01, double spike_amplitude_max = 0.5) { |
| 221 | + std::vector<double> noisy_signal = clean_signal; // Copy clean signal |
| 222 | + std::default_random_engine generator; |
| 223 | + std::normal_distribution<double> noise(0.0, noise_level); |
| 224 | + std::uniform_real_distribution<double> spike_prob_dist(0.0, 1.0); |
| 225 | + std::uniform_real_distribution<double> spike_amplitude_dist(0.0, spike_amplitude_max); |
| 226 | + |
| 227 | + for (size_t i = 0; i < clean_signal.size(); ++i) { |
| 228 | + noisy_signal[i] += noise(generator); // Add Gaussian noise |
| 229 | + if (spike_prob_dist(generator) < spike_probability) { // Add spike |
| 230 | + double spike_amplitude = spike_amplitude_dist(generator); |
| 231 | + noisy_signal[i] += spike_amplitude; |
| 232 | + } |
| 233 | + } |
| 234 | + return noisy_signal; |
| 235 | +} |
| 236 | +``` |
| 237 | +
|
| 238 | +The additive noise model (apart from the non-linear noise inherent to the profile scan) is: |
| 239 | +
|
| 240 | +$$ |
| 241 | +y_i = x_i + \eta_i + s_i, |
| 242 | +$$ |
| 243 | +
|
| 244 | +where: |
| 245 | +- $$\eta_i \sim \mathcal{N}(0, \sigma^2)$$: Gaussian noise |
| 246 | +- $$s_i$$: Spike term (0 or amplitude $$A \sim \mathcal{U}(0, A_{\text{max}})$$) |
| 247 | +
|
| 248 | +## Results |
| 249 | +
|
| 250 | +We test methods on a C Major chord signal and a real laser-scanned profile. |
| 251 | +
|
| 252 | +### C Major Chord |
| 253 | +
|
| 254 | +A C Major chord combines frequencies: |
| 255 | +
|
| 256 | +| Note | Frequency (Hz) | |
| 257 | +|------|----------------| |
| 258 | +| C | 261.63 | |
| 259 | +| E | 329.62 | |
| 260 | +| G | 392.00 | |
| 261 | +
|
| 262 | +Generated in C++ as: |
| 263 | +
|
| 264 | +```cpp |
| 265 | +std::vector<double> generateChordSignal(double duration_ms, double sample_rate = 44100.0) { |
| 266 | + int size = static_cast<int>((duration_ms / 1000.0) * sample_rate); |
| 267 | + double freq_C = 261.63, freq_E = 329.63, freq_G = 392.00; |
| 268 | + std::vector<double> signal(size); |
| 269 | + |
| 270 | + for (int i = 0; i < size; ++i) { |
| 271 | + double t = i / sample_rate; |
| 272 | + signal[i] = std::sin(2*M_PI*freq_C*t) + std::sin(2*M_PI*freq_E*t) + std::sin(2*M_PI*freq_G*t); |
| 273 | + } |
| 274 | + return signal; |
| 275 | +} |
| 276 | +``` |
| 277 | + |
| 278 | +Clean vs. noisy 10ms signal: |
| 279 | + |
| 280 | + |
| 281 | + |
| 282 | +Audio Samples: |
| 283 | + |
| 284 | +- Clean: \`<audio controls><source src="../assets/blog_data/2025-05-01-1d-filtering/acorde_do_mayor.wav"></audio>\` |
| 285 | +- Noisy: \`<audio controls><source src="../assets/blog_data/2025-05-01-1d-filtering/acorde_do_mayor_ruido.wav"></audio>\` |
| 286 | +- Median-filtered: \`<audio controls><source src="../assets/blog_data/2025-05-01-1d-filtering/acorde_do_mayor_filtrado.wav"></audio>\` |
| 287 | + |
| 288 | +### Quantitative Comparison |
| 289 | + |
| 290 | +| Method | PSNR (dB) | MSE | |
| 291 | +|------------------|-----------|-------------| |
| 292 | +| Moving Average | 29.927 | 0.001017 | |
| 293 | +| Moving Median | 27.9873 | 0.001590 | |
| 294 | +| Hampel | 27.6265 | 0.001727 | |
| 295 | +| Hampel Variant | 27.8575 | 0.001638 | |
| 296 | +| TV | 27.477 | 0.001788 | |
| 297 | +| L0 | 18.6888 | 0.013525 | |
| 298 | +| Exponential | 23.2828 | 0.004696 | |
| 299 | + |
| 300 | +Surprisingly for the author, the best method (under these highly controlled conditions) is Moving Average . |
| 301 | + |
| 302 | +### Laser Profile |
| 303 | + |
| 304 | +Noisy laser profile (additional synthetic noise added): |
| 305 | + |
| 306 | + |
| 307 | + |
| 308 | +The visual filtering results are shown in their respective sections. Since we lack ground-truth data for this signal, traditional metrics like PSNR aren't applicable, but we could evaluate the filtering quality through 'unsupervised' (talking in ML argot) approaches such as Stein's risk estimators, entropy analysis, or noise stationarity measurements. A proper evaluation would need to consider both noise suppression and feature preservation characteristics specific to our profilometry application, leaving this as valuable future work requiring careful methodological development. |
| 309 | + |
| 310 | +# Conclusion |
| 311 | + |
| 312 | +Parameter tuning is critical for optimal performance. Methods with fewer parameters (e.g., Moving Average) offer practicality but may lack flexibility for highly degraded signals. TV/L0 priors favor piece-wise constant solutions, making them suitable for step-like signals (e.g., TTL pulses) but less ideal for smooth signals. The exponential filter introduces inherent delay, useful for prediction but suboptimal for denoising. A detailed case study is needed to bound each method's capabilities under varying degradation types/levels. |
| 313 | + |
| 314 | +# References |
| 315 | + |
| 316 | +- L0 Smoothing: http://www.cse.cuhk.edu.hk/leojia/projects/L0smoothing/index.html |
| 317 | + |
| 318 | +# TODO |
| 319 | + |
| 320 | +- Savitzky-Golay Filter. |
| 321 | +- L1-Trend Filtering (solve with CVXPY): https://www.cvxpy.org/examples/applications/l1_trend_filter.html. |
| 322 | +- TV-1D Filter Theory: https://github.com/benchopt/benchmark_tv_1d . |
| 323 | +- Evaluate profile filtering. |
0 commit comments