Closed
Task 004: HTFAPreprocessor Implementation
Overview
Implement the HTFAPreprocessor class providing a comprehensive, configurable preprocessing pipeline for neuroimaging data. This includes brain masking, spatial smoothing, temporal detrending, standardization, and quality control steps optimized for HTFA analysis using nilearn's preprocessing capabilities.
Problem Statement
HTFA requires properly preprocessed fMRI data to produce meaningful results:
- Brain masking to remove non-brain voxels
- Spatial smoothing to improve signal-to-noise ratio
- Temporal preprocessing (detrending, standardization)
- Quality control and outlier detection
- Consistent preprocessing across subjects and sessions
Technical Requirements
Core Preprocessing Pipeline
- Brain Masking: Automatic brain extraction or custom mask application
- Spatial Smoothing: Configurable FWHM with nilearn smoothing functions
- Temporal Detrending: Linear/polynomial detrending with high-pass filtering
- Standardization: Z-score normalization of each voxel's time series within each run
- Quality Control: Motion parameter extraction and outlier detection
Configuration and Flexibility
- Sensible Defaults: Automatic parameter selection based on data characteristics
- Full Customization: Override any preprocessing step with custom parameters
- Pipeline Validation: Ensure preprocessing parameters are compatible
- Memory Efficiency: Process data in chunks to handle large datasets
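The pipeline validation requirement above could be sketched as a small parameter container that checks its own settings. This is only an illustration: `PreprocConfig`, its `validate` method, and the extra `t_r` field are hypothetical names that do not appear elsewhere in this task, and the specific checks are assumptions about what "compatible" should mean.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class PreprocConfig:
    """Hypothetical parameter container; defaults follow this task's notes."""

    mask_strategy: Optional[str] = "auto"
    smoothing_fwhm: Optional[float] = 6.0  # mm
    detrend: Union[bool, int] = True
    standardize: bool = True
    high_pass: Optional[float] = 1 / 128  # Hz
    t_r: Optional[float] = None  # repetition time in seconds (assumed field)

    def validate(self) -> None:
        """Raise ValueError on the first incompatible setting."""
        if self.mask_strategy not in ("auto", "custom", None):
            raise ValueError(f"unknown mask_strategy: {self.mask_strategy!r}")
        if self.smoothing_fwhm is not None and self.smoothing_fwhm <= 0:
            raise ValueError("smoothing_fwhm must be positive or None")
        # bool is a subclass of int, so exclude it before the order check.
        if not isinstance(self.detrend, bool) and self.detrend < 1:
            raise ValueError("polynomial detrend order must be >= 1")
        if self.high_pass is not None:
            if self.high_pass <= 0:
                raise ValueError("high_pass must be positive or None")
            if self.t_r is None:
                raise ValueError("high_pass filtering requires t_r")
            if self.high_pass >= 0.5 / self.t_r:
                raise ValueError("high_pass must be below the Nyquist frequency")
```

With these assumed rules, `PreprocConfig(t_r=2.0).validate()` passes, while a cutoff of 0.3 Hz at the same repetition time is rejected because it exceeds the 0.25 Hz Nyquist frequency.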
BIDS Compliance and Integration
- Metadata Preservation: Maintain BIDS metadata throughout processing
- Confound Integration: Handle BIDS-standard confound regressors
- Derivative Generation: Create BIDS derivatives-compliant preprocessed data
- Provenance Tracking: Record all preprocessing steps and parameters
Implementation Details
HTFAPreprocessor Class Design
class HTFAPreprocessor:
    """
    Comprehensive preprocessing pipeline for HTFA analysis.

    Parameters
    ----------
    mask_strategy : {'auto', 'custom', None}
        Brain masking approach
    smoothing_fwhm : float or None
        Spatial smoothing kernel size in mm
    detrend : bool or int
        Temporal detrending (True=linear, int=polynomial order)
    standardize : bool
        Apply temporal standardization
    high_pass : float or None
        High-pass filter cutoff in Hz
    """

Brain Masking Implementation
- Automatic Masking: Use nilearn's compute_epi_mask for EPI data, or compute_brain_mask for a template-based mask
- Custom Mask Support: Accept user-provided brain masks
- Mask Validation: Ensure mask dimensions match functional data
- Multi-subject Consistency: Option to use common mask across subjects
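The mask validation and common-mask options above could look roughly like the following, working on plain NumPy arrays. Both function names are hypothetical, and the sketch checks array shapes only; a real implementation would presumably also compare image affines.

```python
import numpy as np


def validate_mask(mask: np.ndarray, func_data: np.ndarray) -> np.ndarray:
    """Return a boolean mask after checking it against 4D functional data."""
    if func_data.ndim != 4:
        raise ValueError(f"expected 4D functional data, got {func_data.ndim}D")
    if mask.shape != func_data.shape[:3]:
        raise ValueError(
            f"mask shape {mask.shape} != volume shape {func_data.shape[:3]}"
        )
    mask = mask.astype(bool)
    if not mask.any():
        raise ValueError("mask is empty")
    return mask


def common_mask(masks: list[np.ndarray], threshold: float = 1.0) -> np.ndarray:
    """Keep voxels present in at least `threshold` fraction of subject masks.

    threshold=1.0 gives the strict intersection; lower values are more lenient.
    """
    stacked = np.stack([m.astype(bool) for m in masks])
    return stacked.mean(axis=0) >= threshold
```

A strict intersection guarantees every subject contributes data at every retained voxel, at the cost of losing voxels near the brain edge where individual masks disagree.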
Spatial Preprocessing Pipeline
- Smoothing Strategy: Gaussian kernel smoothing with configurable FWHM
- Resolution Preservation: Maintain original voxel resolution after smoothing
- Edge Handling: Proper boundary conditions for smoothing operations
- Memory Management: Process volumes sequentially for large datasets
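Because FWHM is specified in millimetres while a Gaussian filter on the voxel grid needs a standard deviation in voxels, the conversion is worth spelling out. The helper name below is hypothetical; the relation FWHM = 2·sqrt(2·ln 2)·sigma is the standard one for a Gaussian.

```python
import math


def fwhm_to_sigma_voxels(
    fwhm_mm: float, voxel_sizes_mm: tuple[float, float, float]
) -> tuple[float, ...]:
    """Convert a FWHM in mm to a per-axis Gaussian sigma in voxel units."""
    if fwhm_mm <= 0:
        raise ValueError("fwhm_mm must be positive")
    # FWHM = 2 * sqrt(2 * ln 2) * sigma for a Gaussian kernel.
    sigma_mm = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    # Dividing by the voxel size keeps the kernel isotropic in mm
    # even when the voxels themselves are anisotropic.
    return tuple(sigma_mm / size for size in voxel_sizes_mm)
```

For the 6 mm default on 3 mm isotropic voxels this gives a sigma of roughly 0.85 voxels per axis. The resulting tuple is the kind of per-axis value a routine such as `scipy.ndimage.gaussian_filter` accepts for its `sigma` argument.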
Temporal Preprocessing Components
- Detrending Options: Linear, polynomial, or custom detrending functions
- High-pass Filtering: Butterworth or FIR filters for low-frequency removal
- Standardization Methods: Z-score, robust scaling, or custom normalization
- Confound Regression: Integration with BIDS confound files
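As a rough sketch of the detrending and z-score options listed above, polynomial detrending can be done by least-squares regression on a polynomial design matrix, followed by scaling each voxel to unit variance. The function name and the handling of constant voxels are assumptions; high-pass filtering and confound regression are not covered here.

```python
import numpy as np


def detrend_and_standardize(ts: np.ndarray, order: int = 1) -> np.ndarray:
    """Remove a polynomial trend per voxel, then z-score each voxel.

    ts has shape (n_timepoints, n_voxels).
    """
    n_t = ts.shape[0]
    # Rescale time to [-1, 1] so higher polynomial orders stay well conditioned.
    t = np.linspace(-1.0, 1.0, n_t)
    design = np.vander(t, order + 1)  # columns: t**order, ..., t, 1
    beta, *_ = np.linalg.lstsq(design, ts, rcond=None)
    resid = ts - design @ beta  # zero mean, because the design has an intercept
    std = resid.std(axis=0)
    # Treat numerically constant voxels as having unit scale (assumed policy),
    # so they come out as ~0 instead of amplified rounding noise.
    tol = 1e-8 * (np.abs(ts).max() + 1.0)
    return resid / np.where(std > tol, std, 1.0)
```

Passing `order=1` corresponds to `detrend=True` in the class docstring, and larger integers correspond to the polynomial-order form of that parameter.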
Dependencies
External Dependencies
- nilearn: Core neuroimaging preprocessing functions
- nibabel: NIfTI file I/O and manipulation
- scipy: Signal processing for filtering and detrending
- sklearn: Preprocessing utilities and validation functions
Internal Dependencies
- Task 003: Input detection and BIDS parsing for data loading
- htfa.bids: BIDS integration utilities and metadata handling
- htfa.validation: Input validation framework
Success Criteria
Functional Requirements
- Complete preprocessing pipeline with all standard neuroimaging steps
- Configurable parameters with sensible defaults for HTFA analysis
- Integration with BIDS datasets and metadata preservation
- Support for both single-subject and multi-subject preprocessing
- Comprehensive quality control and outlier detection
Performance Requirements
- Preprocessing time <25% of total analysis time for typical datasets
- Memory usage scales linearly with data size (no memory leaks)
- Support for datasets with >100 subjects without performance degradation
- Chunked processing for datasets exceeding available RAM
Code Quality Requirements
- Full type hints and mypy compliance
- Comprehensive docstrings with parameter descriptions and examples
- >90% test coverage including edge cases and error scenarios
- Integration tests with real and synthetic neuroimaging data
Test Plan
Unit Tests
- Masking Functions: Test automatic and custom brain masking
- Smoothing Operations: Validate spatial smoothing with different kernels
- Temporal Processing: Test detrending, filtering, and standardization
- Parameter Validation: Ensure invalid parameters raise appropriate errors
Integration Tests
- BIDS Pipeline: Test complete preprocessing of BIDS datasets
- Memory Management: Verify efficient processing of large datasets
- Quality Control: Test outlier detection and quality metrics
- Multi-subject Consistency: Validate consistent preprocessing across subjects
Performance Tests
- Memory Profiling: Monitor memory usage during preprocessing
- Speed Benchmarks: Measure preprocessing time for various dataset sizes
- Scalability Testing: Verify performance with increasing subject counts
- Resource Utilization: Monitor CPU and memory usage patterns
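For the memory profiling and speed benchmarks above, a minimal measurement helper using only the standard library might look like this. `profile_call` is a hypothetical name, and `tracemalloc` only reports allocations it is able to trace, so the peak figure is a lower bound rather than the full process footprint.

```python
import time
import tracemalloc
from typing import Any, Callable


def profile_call(
    func: Callable[..., Any], *args: Any, **kwargs: Any
) -> tuple[Any, float, int]:
    """Run func and return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if func raises.
        tracemalloc.stop()
    return result, elapsed, peak
```

Running the same call over increasing dataset sizes and comparing the peak values is one way to check the linear-scaling requirement.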
Implementation Notes
Default Parameter Selection
# Sensible defaults for HTFA preprocessing
defaults = {
    'smoothing_fwhm': 6.0,    # 6 mm FWHM for good spatial regularization
    'detrend': True,          # Linear detrending
    'standardize': True,      # Z-score standardization
    'high_pass': 1 / 128,     # 128 s high-pass filter cutoff (1/128 Hz)
    'mask_strategy': 'auto',  # Automatic brain masking
}

Memory Management Strategy
- Lazy Loading: Load data only when needed for processing
- Chunked Processing: Process timepoints in chunks for large datasets
- Memory Monitoring: Track memory usage and warn about resource constraints
- Cleanup: Explicit memory cleanup after processing steps
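The chunked processing idea can be sketched as a generator of slices over one axis of the data. The name `iter_chunks` is hypothetical. One design point to keep in mind: chunking over timepoints suits per-volume steps such as smoothing, whereas temporal steps such as detrending need each voxel's full time series, so for those the same helper would be applied along the voxel axis instead.

```python
from typing import Iterator


def iter_chunks(n_items: int, chunk_size: int) -> Iterator[slice]:
    """Yield consecutive slices covering range(n_items) in chunk_size blocks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    for start in range(0, n_items, chunk_size):
        # The last slice is shortened so it never runs past n_items.
        yield slice(start, min(start + chunk_size, n_items))
```

Each yielded slice can index a lazily loaded array, so only one block needs to be resident in memory at a time.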
Quality Control Metrics
- Motion Assessment: Extract framewise displacement metrics
- Signal Quality: Compute temporal SNR and variance metrics
- Outlier Detection: Identify volumes with excessive motion or artifacts
- Coverage Assessment: Evaluate brain mask coverage and quality
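A minimal sketch of the motion and signal-quality metrics above follows. It assumes motion parameters arrive as six columns (three translations in mm, then three rotations in radians), converts rotations to arc length on a 50 mm sphere as in the commonly used framewise displacement definition, and uses a 0.5 mm outlier threshold purely as an illustrative default. All function names are hypothetical.

```python
import numpy as np


def framewise_displacement(motion: np.ndarray, radius_mm: float = 50.0) -> np.ndarray:
    """motion: (n_volumes, 6) = translations in mm, then rotations in radians."""
    diffs = np.abs(np.diff(motion, axis=0))
    diffs[:, 3:] *= radius_mm  # rotation -> arc length on a sphere, in mm
    fd = diffs.sum(axis=1)
    return np.concatenate([[0.0], fd])  # the first volume has no predecessor


def temporal_snr(ts: np.ndarray) -> np.ndarray:
    """Per-voxel mean / std over time; ts has shape (n_timepoints, n_voxels)."""
    mean = ts.mean(axis=0)
    std = ts.std(axis=0)
    # Voxels with zero variance get tSNR 0 instead of a division error.
    return np.divide(mean, std, out=np.zeros_like(mean, dtype=float), where=std > 0)


def flag_outliers(fd: np.ndarray, threshold_mm: float = 0.5) -> np.ndarray:
    """Boolean mask of volumes whose displacement exceeds the threshold."""
    return fd > threshold_mm
```

The column order and units of motion parameters differ between tools, so the assumed layout would need to be checked against whichever confound files are actually loaded.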
BIDS Derivatives Compliance
Output Structure
derivatives/htfa/
    sub-{subject}/
        ses-{session}/
            func/
                sub-{subject}_ses-{session}_task-{task}_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz
                sub-{subject}_ses-{session}_task-{task}_desc-confounds_timeseries.tsv
                sub-{subject}_ses-{session}_task-{task}_desc-preprocessing_params.json
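File names in the layout above could be assembled from their entities with a small helper like the one below. `derivative_path` and its parameters are hypothetical; the entity order simply mirrors the example file names in this task, and the session level is omitted when no session is given.

```python
from pathlib import Path
from typing import Optional, Union


def derivative_path(
    root: Union[str, Path],
    subject: str,
    task: str,
    session: Optional[str] = None,
    space: Optional[str] = None,
    desc: str = "preproc",
    suffix: str = "bold",
    ext: str = ".nii.gz",
) -> Path:
    """Build an output path under <root>/derivatives/htfa/ from BIDS entities."""
    entities = [f"sub-{subject}"]
    if session:
        entities.append(f"ses-{session}")
    entities.append(f"task-{task}")
    if space:
        entities.append(f"space-{space}")
    entities.append(f"desc-{desc}")
    name = "_".join(entities) + f"_{suffix}{ext}"

    folder = Path(root) / "derivatives" / "htfa" / f"sub-{subject}"
    if session:
        folder = folder / f"ses-{session}"
    return folder / "func" / name
```

Calling it with `desc="confounds"`, `suffix="timeseries"` and `ext=".tsv"` produces the confounds file name from the same entity set.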
Metadata Generation
- Processing Parameters: JSON sidecar with all preprocessing parameters
- Software Provenance: Record htfa version, nilearn version, processing date
- Quality Metrics: Include motion parameters, SNR, and outlier information
- Transformation Records: Document any spatial transformations applied
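One possible shape for the JSON sidecar described above is sketched here. The function name and the top-level key names are illustrative assumptions, not fields defined by BIDS or by this task; version strings are passed in by the caller rather than looked up.

```python
import json
from datetime import datetime, timezone


def build_sidecar(params: dict, versions: dict, qc: dict) -> str:
    """Serialize preprocessing parameters, provenance and QC to a JSON string."""
    sidecar = {
        "PreprocessingParameters": params,  # hypothetical key names throughout
        "SoftwareVersions": versions,
        "QualityMetrics": qc,
        "ProcessingDate": datetime.now(timezone.utc).isoformat(),
    }
    # sort_keys keeps the output stable, which makes diffs between runs readable.
    return json.dumps(sidecar, indent=2, sort_keys=True)
```

Writing the returned string next to the preprocessed image keeps the parameters and quality metrics attached to the data they describe.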
Deliverables
- htfa/preprocessing.py: HTFAPreprocessor class implementation
- htfa/quality_control.py: Quality assessment and outlier detection utilities
- htfa/derivatives.py: BIDS derivatives output formatting
- tests/test_preprocessing.py: Comprehensive preprocessing test suite
- tests/test_quality_control.py: Quality control validation tests
- Documentation with preprocessing parameter guide and best practices
Acceptance Criteria
Must-Have Features
- Complete HTFAPreprocessor class with all neuroimaging preprocessing steps
- Configurable pipeline with sensible defaults optimized for HTFA
- BIDS derivatives-compliant output with proper metadata
- Quality control metrics and outlier detection capabilities
- Memory-efficient processing suitable for large datasets
Quality Gates
- All preprocessing steps validated against established neuroimaging standards
- Performance benchmarks met (preprocessing <25% of total analysis time)
- Memory usage scales linearly with data size, without leaks or excessive consumption
- BIDS compliance verified with standard validation tools
- Comprehensive test coverage including edge cases and error conditions
Definition of Ready for Next Task
- Preprocessing pipeline complete and fully tested
- BIDS derivatives output properly formatted and validated
- Quality control framework operational and documented
- Ready for HTFAResults implementation and visualization components
- Integration with core algorithms (TFA/HTFA) verified and working