[Phase 3] Feature Engineering #4

## Phase 3: Feature Engineering

**Parent:** #1
**Depends on:** #3

### Objectives
Extract clinically meaningful features from ECG beats for classical ML models.

### Tasks

- [ ] Implement time-domain feature extraction (10+ features)
- [ ] Implement frequency-domain feature extraction (8+ features)
- [ ] Implement wavelet-based feature extraction (8+ features)
- [ ] Create unified feature extraction pipeline
- [ ] Document clinical meaning of each feature
- [ ] Validate features against published literature
- [ ] Handle edge cases (NaN, division by zero)

### Files to Create/Modify

| File | Action | Description |
|------|--------|-------------|
| `src/feature_extraction.py` | Create | Feature extraction module |
| `tests/test_features.py` | Create | Unit tests |

### Features to Extract

**Time Domain (10 features):**
- Mean, std, variance
- Skewness, kurtosis
- RMS (root mean square)
- Peak amplitude, peak-to-peak
- QRS duration estimate
- RR interval ratio
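
The last two items are not defined further in this issue, so the following is only a minimal sketch under stated assumptions. It assumes the RR interval ratio means the interval before a beat divided by the interval after it, computed from R-peak sample indices (`r_peaks` is a hypothetical input), and it uses a crude amplitude-threshold width as a stand-in for QRS duration. The function names, the `frac` threshold, and the neutral value for edge beats are all illustrative choices.

```python
import numpy as np


def rr_interval_ratio(r_peaks: np.ndarray, i: int, eps: float = 1e-10) -> float:
    """Pre-RR / post-RR for beat i (assumed definition, see lead-in)."""
    if i < 1 or i + 1 >= len(r_peaks):
        # First and last beats lack one neighbour; 1.0 is an assumed neutral value.
        return 1.0
    pre_rr = r_peaks[i] - r_peaks[i - 1]
    post_rr = r_peaks[i + 1] - r_peaks[i]
    return float(pre_rr / (post_rr + eps))


def qrs_duration_estimate(beat: np.ndarray, fs: int = 360, frac: float = 0.5) -> float:
    """Crude QRS width in seconds: contiguous span around the largest
    deflection where |beat - median| stays at or above frac * peak."""
    x = np.abs(beat - np.median(beat))
    peak_idx = int(np.argmax(x))
    thresh = frac * x[peak_idx]
    if thresh == 0:
        return 0.0  # flat beat: no deflection to measure
    left = peak_idx
    while left > 0 and x[left - 1] >= thresh:
        left -= 1
    right = peak_idx
    while right < len(x) - 1 and x[right + 1] >= thresh:
        right += 1
    return (right - left + 1) / fs
```

Because the RR ratio needs neighbouring R-peaks, it cannot be derived from a single beat window alone; a pipeline would have to pass that context in alongside the beat.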

**Frequency Domain (8 features):**
- Spectral centroid, spectral spread
- Spectral entropy
- Band powers: VLF, LF, HF
- LF/HF ratio
- Dominant frequency
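
As a rough illustration of how the spectral items above could be computed from a Welch PSD, here is a sketch. One caveat: VLF, LF and HF are conventionally heart rate variability bands (about 0.003-0.04 Hz, 0.04-0.15 Hz and 0.15-0.4 Hz) defined on an RR interval series, and a single beat at 360 Hz with `nperseg=256` has a frequency resolution of roughly 1.4 Hz, which cannot resolve them. The band edges below are therefore hypothetical placeholders passed in as a parameter, and `spectral_features` and `band_power` are illustrative names.

```python
import numpy as np
from scipy.signal import welch


def band_power(freqs, psd, lo, hi):
    """Approximate power in [lo, hi) Hz as a rectangle-rule integral of the PSD."""
    df = freqs[1] - freqs[0]  # welch returns uniformly spaced frequency bins
    mask = (freqs >= lo) & (freqs < hi)
    return float(np.sum(psd[mask]) * df)


def spectral_features(beat, fs=360, bands=None, eps=1e-10):
    if bands is None:
        # Hypothetical band edges in Hz, for illustration only.
        bands = {"low": (0.0, 5.0), "mid": (5.0, 15.0), "high": (15.0, 40.0)}
    freqs, psd = welch(beat, fs=fs, nperseg=min(256, len(beat)))
    p = psd / (np.sum(psd) + eps)  # PSD normalised to behave like a distribution
    centroid = float(np.sum(freqs * p))
    spread = float(np.sqrt(np.sum((freqs - centroid) ** 2 * p)))
    entropy = float(-np.sum(p * np.log2(p + eps)))
    features = {
        "spectral_centroid": centroid,
        "spectral_spread": spread,
        "spectral_entropy": entropy,
        "dominant_frequency": float(freqs[np.argmax(psd)]),
    }
    for name, (lo, hi) in bands.items():
        features[f"band_power_{name}"] = band_power(freqs, psd, lo, hi)
    return features
```

A ratio such as LF/HF would then be one band power divided by another, with the epsilon guard from the Technical Notes in the denominator.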

**Wavelet Features (8+ features):**
- Energy at scales 4, 8, 16, 32 (db4 wavelet)
- Approximation coefficient statistics
- Detail coefficient statistics
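
To help document what each decomposition level represents, the sketch below maps detail levels of a dyadic DWT to their approximate frequency bands. It relies on the idealised rule that detail level j covers fs/2^(j+1) to fs/2^j Hz; real wavelet filters such as db4 overlap between neighbouring bands, so treat the numbers as nominal. `dwt_detail_bands` is a hypothetical helper name.

```python
def dwt_detail_bands(fs: float, levels: int) -> dict:
    """Nominal (low, high) band in Hz for each detail level 1..levels."""
    return {j: (fs / 2 ** (j + 1), fs / 2 ** j) for j in range(1, levels + 1)}


bands = dwt_detail_bands(fs=360, levels=4)
for level, (lo, hi) in bands.items():
    print(f"cD{level}: {lo:g}-{hi:g} Hz")
# The level-4 approximation nominally covers everything below bands[4][0].
```

At 360 Hz, level 1 nominally spans 90-180 Hz and level 4 spans 11.25-22.5 Hz, which is useful context when writing the clinical interpretation of each wavelet feature.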

### Code Reference

```python
from scipy.stats import skew, kurtosis
from scipy.signal import welch
import pywt
import numpy as np

class FeatureExtractor:
    def __init__(self, fs: int = 360):
        self.fs = fs

    def time_domain_features(self, beat: np.ndarray) -> dict:
        return {
            'mean': np.mean(beat),
            'std': np.std(beat),
            'variance': np.var(beat),
            'rms': np.sqrt(np.mean(beat**2)),
            'peak': np.max(np.abs(beat)),
            'peak_to_peak': np.ptp(beat),
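            'skewness': skew(beat),
            'kurtosis': kurtosis(beat),  # excess kurtosis (Fisher), 0 for a Gaussian
            # QRS duration and RR interval ratio are not computed here;
            # the RR ratio needs neighbouring R-peaks, not just one beat.
        }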

    def frequency_domain_features(self, beat: np.ndarray) -> dict:
        freqs, psd = welch(beat, fs=self.fs, nperseg=min(256, len(beat)))
        total_power = np.sum(psd)
        spectral_centroid = np.sum(freqs * psd) / (total_power + 1e-10)
        return {
            'spectral_centroid': spectral_centroid,
            'total_power': total_power,
            # ... more features
        }

    def wavelet_features(self, beat: np.ndarray, wavelet: str = 'db4') -> dict:
        # wavedec returns [cA4, cD4, cD3, cD2, cD1]: index 0 is the
        # approximation, indices 1-4 are details from coarsest to finest.
        coeffs = pywt.wavedec(beat, wavelet, level=4)
        features = {}
        for i, c in enumerate(coeffs):
            features[f'wavelet_energy_{i}'] = np.sum(c**2)
            features[f'wavelet_std_{i}'] = np.std(c)
        return features

    def extract_all(self, beat: np.ndarray) -> np.ndarray:
        """Extract all features and return as array."""
        all_features = {}
        all_features.update(self.time_domain_features(beat))
        all_features.update(self.frequency_domain_features(beat))
        all_features.update(self.wavelet_features(beat))
        return np.array(list(all_features.values()))
```
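
Since `extract_all` returns only the values, the feature names are lost once the array is built. One possible way to keep names and values aligned, and to catch two extractors accidentally using the same key, is sketched below. `dicts_to_vector` is a hypothetical helper, not part of the reference code.

```python
import numpy as np


def dicts_to_vector(*feature_dicts):
    """Merge feature dicts in order and return (names, values)."""
    merged = {}
    for d in feature_dicts:
        overlap = merged.keys() & d.keys()
        if overlap:
            # A silent overwrite would shrink the vector without warning.
            raise ValueError(f"duplicate feature names: {sorted(overlap)}")
        merged.update(d)
    names = list(merged)  # dicts preserve insertion order
    values = np.array([merged[n] for n in names], dtype=float)
    return names, values


names, vec = dicts_to_vector({"mean": 0.0, "std": 1.0}, {"spectral_centroid": 12.5})
```

The returned `names` list can also serve as the index for the per-feature clinical documentation.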

### Definition of Done

- [ ] 30-40 features extracted per beat
- [ ] All features have valid ranges (no NaN, inf)
- [ ] Feature names documented with clinical interpretation
- [ ] Unit tests verify calculations against known values
- [ ] Feature extraction runs <10ms per beat
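
The checks above could be exercised along the following lines. This is a sketch only: `all_finite` and `ms_per_beat` are hypothetical helpers, the sine wave is a synthetic stand-in for a beat, and measured timings depend on the machine, so the 10 ms threshold is left to the caller.

```python
import time

import numpy as np


def all_finite(features) -> bool:
    """True if the feature vector contains no NaN or inf."""
    return bool(np.all(np.isfinite(features)))


def ms_per_beat(extract, beats, repeats: int = 3) -> float:
    """Best-of-N wall-clock time per beat, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for beat in beats:
            extract(beat)
        best = min(best, time.perf_counter() - start)
    return 1000.0 * best / len(beats)


# Known-value check: a sine of amplitude A sampled over whole periods
# has RMS A / sqrt(2) and peak-to-peak close to 2 * A.
t = np.arange(360) / 360.0
sine = 2.0 * np.sin(2 * np.pi * 5.0 * t)
rms = np.sqrt(np.mean(sine ** 2))
```

In a unit test, `rms` would be compared against `2.0 / np.sqrt(2)` with `np.isclose`, and `ms_per_beat` would be called with the real extractor and a sample of beats.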

### Technical Notes

**For junior developers:**
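- Kurtosis is high for impulsive, heavy-tailed signals, such as beats with sharp abnormal deflections. `scipy.stats.kurtosis` returns excess kurtosis by default (0 for a Gaussian).
- The LF/HF ratio is a heart rate variability measure computed from the RR interval series and is commonly interpreted as reflecting autonomic nervous system balance. It cannot be resolved from a single beat.
- Wavelet decomposition captures multi-scale information: fast components such as the QRS complex and slower components such as P and T waves appear at different decomposition levels.
- Always add a small epsilon (1e-10) to any denominator that can reach zero.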
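
The epsilon advice can be sketched as below, together with a final cleanup pass for values that still come out as NaN or inf (for example, higher-order statistics of a flat beat). `safe_divide` and `sanitize` are hypothetical helper names, and replacing bad values with 0.0 is an assumed policy rather than something this issue specifies.

```python
import numpy as np


def safe_divide(num: float, den: float, eps: float = 1e-10) -> float:
    """Divide with an epsilon guard; intended for non-negative denominators
    such as powers or energies, where den + eps can never be zero."""
    return float(num / (den + eps))


def sanitize(features: np.ndarray, fill: float = 0.0) -> np.ndarray:
    """Replace NaN and +/-inf with a fill value (assumed policy: 0.0)."""
    return np.nan_to_num(features, nan=fill, posinf=fill, neginf=fill)


ratio = safe_divide(1.0, 0.0)  # finite, but very large
clean = sanitize(np.array([np.nan, np.inf, 1.0]))
```

Note that a guarded ratio with a zero denominator is finite but huge, so clipping ratios to a plausible range may be worth considering before feeding them to a model.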

