
Core TFA Algorithm Implementation #71

@jeremymanning

Description

Task 001: Core TFA Algorithm Implementation

Implement the core Topographic Factor Analysis (TFA) algorithm with k-means initialization, non-linear optimization, and convergence detection. This is the foundation algorithm that HTFA will build upon.

The TFA class should provide a scikit-learn compatible interface for fitting factor models to neuroimaging data, with automatic parameter initialization and robust optimization.

Acceptance Criteria

  • TFA class inherits from sklearn.base.BaseEstimator and TransformerMixin
  • Implements fit(X) method that learns factors from input data
  • Implements transform(X) method that projects data onto learned factors
  • K-means initialization for factor centroids and spatial templates
  • Non-linear optimization using scipy.optimize (L-BFGS or similar)
  • Convergence detection with configurable tolerance and max iterations
  • Validates input data dimensions and handles edge cases
  • Passes comprehensive tests on synthetic neuroimaging data in the supported input formats
  • Algorithm converges to stable solution within reasonable iterations
  • Results are deterministic given same random seed
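A minimal skeleton of the intended interface might look like the following. This is an illustrative sketch only: the class name `TFASketch`, the one-hot placeholder templates, and the least-squares temporal update are assumptions, not the final algorithm.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import KMeans


class TFASketch(BaseEstimator, TransformerMixin):
    """Sketch of the planned TFA estimator interface (not the real algorithm)."""

    def __init__(self, n_factors=10, max_iter=100, tol=1e-6,
                 random_state=None, init="k-means"):
        # BaseEstimator conventions: store params verbatim, no validation here
        self.n_factors = n_factors
        self.max_iter = max_iter
        self.tol = tol
        self.random_state = random_state
        self.init = init

    def fit(self, X, y=None):
        # X: (n_voxels, n_timepoints); cluster voxel timeseries with k-means
        km = KMeans(n_clusters=self.n_factors, n_init=10,
                    random_state=self.random_state)
        labels = km.fit_predict(X)
        # One-hot placeholder spatial templates, shape (n_factors, n_voxels)
        self.spatial_templates_ = np.eye(self.n_factors)[labels].T
        # Temporal factors via least squares: X ≈ templates.T @ factors
        self.temporal_factors_, *_ = np.linalg.lstsq(
            self.spatial_templates_.T, X, rcond=None)
        return self

    def transform(self, X):
        # Project data onto the learned spatial templates
        factors, *_ = np.linalg.lstsq(self.spatial_templates_.T, X, rcond=None)
        return factors
```

Because k-means is seeded through `random_state`, two fits with the same seed yield identical factors, which is the determinism criterion above in miniature.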

Technical Details

Core Components

  1. Initialization: K-means clustering on flattened voxel timeseries to initialize factor centroids
  2. Optimization: Alternating minimization between spatial templates and temporal factors
  3. Objective Function: Minimize reconstruction error between observed and predicted data
  4. Convergence: Monitor objective function change and parameter stability
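Components 2–4 can be sketched as one alternating-minimization cycle on a 2D (voxels × timepoints) slab, where `F` holds spatial templates, `W` holds temporal factors, and the small ridge term is an assumption added for numerical safety:

```python
import numpy as np


def reconstruction_error(X, F, W):
    """Squared Frobenius norm of the residual X - F.T @ W (the objective)."""
    return np.linalg.norm(X - F.T @ W) ** 2


def alternating_step(X, F, W, reg=1e-6):
    """One alternating-minimization cycle with a small ridge term.

    X: (n_voxels, n_timepoints) data
    F: (n_factors, n_voxels) spatial templates
    W: (n_factors, n_timepoints) temporal factors
    """
    I = reg * np.eye(F.shape[0])
    # Update temporal factors with spatial templates held fixed
    W = np.linalg.solve(F @ F.T + I, F @ X)
    # Update spatial templates with temporal factors held fixed
    F = np.linalg.solve(W @ W.T + I, W @ X.T)
    return F, W
```

Convergence detection (component 4) would then wrap this in a loop that stops once the drop in `reconstruction_error` between cycles falls below `tol` or `max_iter` is reached.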

Key Parameters

  • n_factors: Number of factors to extract (default: 10)
  • max_iter: Maximum optimization iterations (default: 100)
  • tol: Convergence tolerance (default: 1e-6)
  • random_state: Random seed for reproducibility
  • init: Initialization method ('k-means' or 'random')
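A hypothetical validation helper for these parameters (the function name and error messages are assumptions; the actual class may validate inside `fit` per sklearn convention) could look like:

```python
def validate_tfa_params(n_factors=10, max_iter=100, tol=1e-6,
                        random_state=None, init="k-means"):
    """Check the TFA parameters above and return them as a dict."""
    if not (isinstance(n_factors, int) and n_factors >= 1):
        raise ValueError(f"n_factors must be a positive integer, got {n_factors!r}")
    if max_iter < 1:
        raise ValueError(f"max_iter must be >= 1, got {max_iter!r}")
    if tol <= 0:
        raise ValueError(f"tol must be positive, got {tol!r}")
    if init not in ("k-means", "random"):
        raise ValueError(f"init must be 'k-means' or 'random', got {init!r}")
    return dict(n_factors=n_factors, max_iter=max_iter, tol=tol,
                random_state=random_state, init=init)
```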

Data Format

  • Input: 3D numpy array (subjects, voxels, timepoints) or 2D array (voxels, timepoints)
  • Output: Spatial templates (n_factors, n_voxels) and temporal factors (n_factors, n_timepoints)
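A sketch of the shape handling this implies (the helper name `as_subject_arrays` is an assumption) — normalize either accepted input shape to a per-subject list of 2D arrays:

```python
import numpy as np


def as_subject_arrays(X):
    """Normalize input to a list of (n_voxels, n_timepoints) arrays.

    Accepts a 2D (voxels, timepoints) array for a single subject or a
    3D (subjects, voxels, timepoints) array for multiple subjects.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 2:
        return [X]
    if X.ndim == 3:
        return [X[i] for i in range(X.shape[0])]
    raise ValueError(
        f"Expected 2D (voxels, timepoints) or 3D (subjects, voxels, "
        f"timepoints) input, got {X.ndim}D")
```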

Optimization Strategy

  • Use scipy.optimize.minimize with the L-BFGS-B method
  • Supply an analytic gradient (jac=True) for faster, more reliable convergence
  • Guard numerical stability with light regularization (e.g., a small ridge term)
  • Support masking for non-brain voxels
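The strategy above can be illustrated on the temporal-factor subproblem. `fit_temporal_factors` is a hypothetical helper, not the planned API; it shows the `jac=True` pattern on the quadratic objective 0.5·||X − Fᵀ W||², holding spatial templates fixed:

```python
import numpy as np
from scipy.optimize import minimize


def fit_temporal_factors(X, F, max_iter=100, tol=1e-6):
    """Fit temporal factors W via L-BFGS-B with an analytic gradient.

    X: (n_voxels, n_timepoints) data
    F: (n_factors, n_voxels) fixed spatial templates
    Returns W with shape (n_factors, n_timepoints).
    """
    K, T = F.shape[0], X.shape[1]

    def objective(w_flat):
        W = w_flat.reshape(K, T)
        R = F.T @ W - X              # residual, (n_voxels, n_timepoints)
        loss = 0.5 * np.sum(R * R)
        grad = F @ R                 # analytic d(loss)/dW
        return loss, grad.ravel()

    res = minimize(objective, np.zeros(K * T), jac=True, method="L-BFGS-B",
                   options={"maxiter": max_iter, "ftol": tol})
    return res.x.reshape(K, T)
```

Masking non-brain voxels would then amount to restricting the rows of `X` and the columns of `F` to in-mask voxels before optimizing.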

Dependencies

  • numpy for numerical computations
  • scipy.optimize for non-linear optimization
  • sklearn.base for estimator interface
  • sklearn.cluster.KMeans for initialization

Effort Estimate

Size: L (2-3 days)

Breakdown:

  • Algorithm research and design: 0.5 days
  • Core implementation: 1.5 days
  • Testing and validation: 1 day

Definition of Done

  • TFA class implemented in htfa/core/tfa.py
  • Full scikit-learn estimator interface compliance
  • Comprehensive unit tests in tests/unit/core/test_tfa.py
  • Algorithm converges on synthetic data within expected iterations
  • Code passes all linting and type checking
  • Documentation strings follow Google style guide
  • No regression in existing test suite
