
Create synthetic data degradation tools #187

@KedoKudo

Parent Epic

Part of #172 (2D Resonance Imaging)

Description

Create tools to artificially degrade clean synthetic data to test recovery methods.

Motivation

We have:

  • Clean synthetic data (ground truth known)
  • Real sparse data (ground truth unknown)

To validate recovery methods, we need to:

  1. Degrade synthetic data in controlled ways
  2. Apply recovery methods
  3. Compare to known ground truth
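The three steps above can be sketched as a small harness. This is only an illustration: `validate_recovery`, `rmse`, and the `degrade`/`recover` callables are hypothetical names, and RMSE is assumed as the comparison metric because the issue does not name one.

```python
import numpy as np


def rmse(estimate, truth):
    """Root-mean-square error between an estimate and the ground truth."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


def validate_recovery(clean, degrade, recover):
    """Degrade clean data, run a recovery method, and score both stages.

    `degrade` and `recover` are any callables mapping array -> array
    (hypothetical placeholders for the real tools).
    """
    degraded = degrade(clean)      # step 1: controlled degradation
    recovered = recover(degraded)  # step 2: recovery method under test
    # step 3: compare both stages to the known ground truth
    return {
        "degraded_rmse": rmse(degraded, clean),
        "recovered_rmse": rmse(recovered, clean),
    }
```

A recovery method is doing useful work when `recovered_rmse` is smaller than `degraded_rmse`.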

Degradation Types

1. Poisson Noise

  • Add realistic counting statistics
  • Scale by total counts parameter
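One possible reading of "scale by total counts" is to normalize the clean data so that its expected sum equals `total_counts`, then draw Poisson samples around those expected values. The sketch below assumes that interpretation and assumes the clean data is non-negative; it is written as a free function with an explicit `rng` argument for brevity.

```python
import numpy as np


def add_poisson_noise(data, total_counts, rng=None):
    """Sample Poisson counts whose expected sum is `total_counts`.

    Assumption: `data` is a non-negative intensity map; its overall scale
    is discarded and replaced by `total_counts`.
    """
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    if np.any(data < 0):
        raise ValueError("data must be non-negative")
    total = data.sum()
    if total <= 0:
        raise ValueError("data must contain some signal")
    expected = data * (total_counts / total)  # expected counts per element
    return rng.poisson(expected)              # integer counts, same shape
```

Lowering `total_counts` increases the relative noise, since the relative standard deviation of a Poisson variable with mean `m` is `1 / sqrt(m)`.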

2. Sparsification

  • Remove fraction of counts
  • Simulate flux-limited scenarios
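Removing a fraction of counts can be modeled as binomial thinning, where each recorded count is kept independently with probability `keep_fraction`. This is one possible approach, not necessarily the intended one, and it assumes the input already holds integer counts (for example, the output of a Poisson sampling step).

```python
import numpy as np


def sparsify(counts, keep_fraction, rng=None):
    """Keep each count independently with probability `keep_fraction`.

    Assumption: `counts` is an integer array of detected events.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts)
    if not np.issubdtype(counts.dtype, np.integer):
        raise TypeError("counts must be an integer array")
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in [0, 1]")
    # Binomial thinning: element-wise draw of survivors out of counts[i].
    return rng.binomial(counts, keep_fraction)
```

Thinning a Poisson-distributed count this way yields another Poisson-distributed count with the mean scaled by `keep_fraction`, so the result stays statistically consistent with a lower-flux measurement.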

3. Detector Effects

  • Dead pixels
  • Non-uniform efficiency
  • Dark current
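The three detector effects could be combined as in the sketch below. Everything here is an assumption made for illustration: the data layout `(..., ny, nx)` with pixels on the last two axes, the uniform efficiency distribution, the parameter names `min_efficiency` and `dark_rate`, and scattering dead pixels independently rather than in contiguous regions.

```python
import numpy as np


def apply_detector_effects(counts, dead_fraction, min_efficiency, dark_rate, rng=None):
    """Apply per-pixel efficiency, dark current, and dead pixels.

    Assumptions: `counts` is an integer array shaped (..., ny, nx);
    efficiency is uniform in [min_efficiency, 1]; dark current is Poisson
    with mean `dark_rate` per element.
    Returns (degraded_counts, dead_mask).
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts)
    pixel_shape = counts.shape[-2:]

    # Non-uniform efficiency: each event is detected with its pixel's probability.
    efficiency = rng.uniform(min_efficiency, 1.0, size=pixel_shape)
    detected = rng.binomial(counts, efficiency)

    # Dark current: signal-independent counts added to every element.
    dark = rng.poisson(dark_rate, size=counts.shape)

    # Dead pixels: selected pixels read zero regardless of signal.
    dead_mask = rng.random(pixel_shape) < dead_fraction
    degraded = np.where(dead_mask, 0, detected + dark)
    return degraded, dead_mask
```

Returning the mask alongside the data lets a recovery method be tested both with and without knowledge of which pixels are dead.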

4. Background Noise

  • Flat background
  • Time-dependent background
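A minimal sketch of both background types follows. It assumes the first axis of the data is the time (or time-of-flight) axis and uses an exponential decay for the time-dependent part purely as a placeholder shape; the real functional form would need to come from calibration against measured data. The parameter names are hypothetical.

```python
import numpy as np


def add_background(counts, flat_rate, decay_amplitude, decay_bins, rng=None):
    """Add Poisson-sampled flat and time-dependent background counts.

    Assumptions: axis 0 of `counts` is the time axis; the time-dependent
    term is decay_amplitude * exp(-t / decay_bins), a placeholder shape.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts)
    t = np.arange(counts.shape[0])
    rate = flat_rate + decay_amplitude * np.exp(-t / decay_bins)
    # Reshape so the per-bin rate broadcasts over all remaining axes.
    rate = rate.reshape((-1,) + (1,) * (counts.ndim - 1))
    background = rng.poisson(np.broadcast_to(rate, counts.shape))
    return counts + background
```

Setting `decay_amplitude` to zero leaves only the flat background.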

Proposed Interface

import numpy as np


class DataDegrader:
    """Degrade clean synthetic data for testing."""
    
    def __init__(self, random_seed: int | None = None):
        self.rng = np.random.default_rng(random_seed)
    
    def add_poisson_noise(
        self,
        data: np.ndarray,
        total_counts: float,
    ) -> np.ndarray:
        """Add Poisson counting statistics."""
        ...
    
    def sparsify(
        self,
        data: np.ndarray,
        keep_fraction: float,
    ) -> np.ndarray:
        """Randomly remove counts to simulate sparse data."""
        ...
    
    def add_dead_pixels(
        self,
        data: np.ndarray,
        dead_fraction: float,
    ) -> tuple[np.ndarray, np.ndarray]:  # data, mask
        """Add dead pixel regions."""
        ...
    
    def degrade_to_level(
        self,
        data: np.ndarray,
        target_level: int,  # L1-L4
    ) -> np.ndarray:
        """Degrade to specified severity level."""
        ...
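The issue does not define what L1 through L4 mean, so the sketch below only shows one way `degrade_to_level` might look up its parameters. The `DegradationPreset` type, the `get_preset` helper, and every numeric value are hypothetical placeholders, not calibrated settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DegradationPreset:
    """Parameters for one severity level (hypothetical structure)."""
    total_counts: float
    keep_fraction: float
    dead_fraction: float


# Placeholder values only: real presets would come from calibration
# against measured data. Level 1 is mildest, level 4 most severe.
PRESETS = {
    1: DegradationPreset(total_counts=1e7, keep_fraction=1.0, dead_fraction=0.0),
    2: DegradationPreset(total_counts=1e6, keep_fraction=0.5, dead_fraction=0.001),
    3: DegradationPreset(total_counts=1e5, keep_fraction=0.2, dead_fraction=0.005),
    4: DegradationPreset(total_counts=1e4, keep_fraction=0.05, dead_fraction=0.01),
}


def get_preset(target_level):
    """Return the preset for a level, rejecting anything outside 1-4."""
    try:
        return PRESETS[target_level]
    except KeyError:
        raise ValueError(f"target_level must be one of {sorted(PRESETS)}") from None
```

Keeping the presets in a plain mapping makes it straightforward to replace the placeholder numbers once calibration is done.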

Tasks

  • Implement DataDegrader class
  • Implement each degradation type
  • Calibrate degradation to match real data
  • Create degradation presets (L1-L4)
  • Add unit tests
  • Create example degraded datasets

Acceptance Criteria

  • Degraded data statistically similar to real data
  • Reproducible with random seed
  • Works with existing synthetic data
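The reproducibility criterion lends itself to a direct unit test. The sketch below uses a stand-in `noisy_copy` helper (hypothetical, not part of the proposed interface) to show the pattern: the same seed must give identical output, and a different seed should give different output.

```python
import numpy as np


def noisy_copy(data, seed):
    """Stand-in degradation: Poisson-sample `data` with a seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.poisson(data)


def test_reproducible_with_seed():
    data = np.full((16, 16), 50.0)
    # Same seed: bit-for-bit identical output.
    np.testing.assert_array_equal(noisy_copy(data, 42), noisy_copy(data, 42))
    # Different seed: output should differ.
    assert not np.array_equal(noisy_copy(data, 42), noisy_copy(data, 43))
```

The same pattern would apply to each `DataDegrader` method by constructing two instances with the same `random_seed`.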

Labels: imaging, software, v2.2
